Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Bug with unitted functions #34

Open
bnordgren opened this issue Sep 20, 2014 · 10 comments
Open

Bug with unitted functions #34

bnordgren opened this issue Sep 20, 2014 · 10 comments

Comments

@bnordgren
Copy link

Not sure exactly the jargon to express this in, but there is a strange usage issue with united functions.

Take the "add()" function described here: http://scimath.readthedocs.org/en/latest/units/unit_funcs.html

I get the following behavior:

testsm.add(UnitScalar(2,units="m"),UnitScalar(3, units="m"))
Out[69]: UnitScalar(4.999999999999999, units='1.0*m+0.0')

testsm.add(2,3)
Out[70]: 1.524

testsm.add(2_m,3_m)
Out[71]: 1.524*m+0.0

Out 69 and 70 are expected. Out[71] is not. It got the type right, but not the value. It looks like it is ignoring the provided units and assuming feet, as if the parameters were regular python types instead of special unit-carrying types.

Is there a mailing list for this project?

@bnordgren
Copy link
Author

Further testing shows that the multiplication by the dimensionless conversion factor is the culprit. However, behavior on the command line is different than behavior via a united function. Performing the "add()" function directly on the command line using UnitScalar types also does not yield the expected results (whereas it does inside a united function). I guess there are two strange things happening here:

  1. Why is a UnitScalar different than multiplying a numeric type by a unit object?
  2. Why does the same expression treat UnitScalars differently depending on whether the expression occurs on the command line or inside a unitted function?

(2_m + 3_m)
Out[82]: 5.0*m+0.0

(2_m + 3_m) * (feet/m)
Out[83]: 1.524*m+0.0

(UnitScalar(2,units="m") + UnitScalar(3,units="m")) * (feet/m)
Out[85]: UnitScalar(1.524, units='1.0*m+0.0')

@rkern
Copy link
Member

rkern commented Sep 20, 2014

The expression 2*m builds a new unit of length that is twice as long as a meter, not a length quantity of 2 meters. To get the latter, you need to explicitly build it: UnitScalar(2, units=m). The main focus of scimath.units is to help build GUI applications, so command-line convenience has not been prioritized. There are other unit-handling systems that you may find more to your taste, if that is your typical use case.

enthought-dev@enthought.com is the mailing list.

@bnordgren
Copy link
Author

Ok, with that terminology, the add() function adds one unit of length 2 m long to one unit of length 3 m long and comes up with a unit of length 1.524 m long. I must still insist that this is incorrect. The add function is behaving exactly as if I specified 2 feet and 3 feet. It took the values and clobbered the units. On a unit type.

It may well be that I don't understand what a unitted function does. I infer from the example that the innards of the function are supposed to be coded as if all the parameters are plain, non-unit-bearing python numeric types, and that the @has_units decorator ensures that the values are converted to the correct magnitude prior to use.

I added some print statements and it appears that when unitscalars are provided to the unitted function, they are indeed converted to base python objects with the correct value, but sans units. Unit objects are just left as unit objects, which violates the assumption that the equation is seeing bare values expressed in the desired units. The upshot appears to be that the final conversion factor is wrong because the value is expressed in SI base units, not the units the function was designed for.

So is there any reason the @has_units decorator could not treat a unit object the same as a UnitScalar? At the end, wrap the value in a UnitScalar with the appropriate return units, just like now.

This would mean that unitted functions may accept "units" and return a "UnitScalar", but I don't think that would ever matter, as unitted functions are not trying to create new unit types. This might even be appropriate behavior for the add/subtract operator on two unit objects.

So a unitted function should be thought of as a black box. It takes unit-bearing quantities or non-unit-bearing quantities which must have the correct value. It emits unit-bearing quantities (or not). Internally, units are not involved in the actual computation, so there is no automatic check that dimensions are correct or that the units are obeyed.

For the second matter, the disturbing thing about the behavior of unitted functions was that I could not replicate the result from an expression inside a unitted function on the command line. Given the above experimentation, I can now see that the expression on the command line I was using for debugging is different than the one inside the function. However, the translation happens invisibly using an area of the library not well documented. The second bullet appears to be intended, but nonintuitive behavior.

Should I break this into two tickets? The first a request to have @has_units treat unit objects just like unit scalars, and the second as a request to add content to the @has_units section of the documentation? For instance, it may be beneficial to include a version of the add function with no decorator, but which makes explicit the machinations provided by the decorator.

Neither the library I'm building nor scimath should be overly concerned with whether the client of the library is a GUI app, batch script, or interactive use. I'm mostly looking for good integration with numpy arrays.

@rkern
Copy link
Member

rkern commented Sep 21, 2014

Ok, with that terminology, the add() function adds one unit of length 2 m long to one unit of length 3 m long and comes up with a unit of length 1.524 m long. I must still insist that this is incorrect. The add function is behaving exactly as if I specified 2 feet and 3 feet. It took the values and clobbered the units. On a unit type.

No, the add() function is not supposed to be passed unit objects at all. It is documented to expect a number (a plain float, ndarray or UnitScalar or UnitArray). unit objects just happen to implement __add__, so they don't raise an exception when they are passed in. If you tried some other numeric operation on them, it wouldn't work. If Python were a statically typed language, you would get an error earlier. unit objects cannot be treated like UnitScalars. They are semantically different, especially when you consider the offset-bearing units, like degC and degF.

We will be happy to accept a PR documenting the details of the @has_units decorator. However, we will not be changing it to treat unit objects like UnitScalars.

Thanks!

@bnordgren
Copy link
Author

I agree units are semantically different than quantities. However, perhaps I can sidestep the volatile topic of treating units like UnitScalars by suggesting that has_units should behave as if it were passed UnitScalar(1,units=unit_arg). Perform the conversion to the desired units if possible, and error if not. This seems a minor change which preserves semantics, behaves intuitively, and greatly improves usability.

I think one thing you can do to keep other people from falling into this pitfall is to rewrite your "getting started" section. Particularly, the following code examples are actually constructing new unit types when the text is trying to show you how units are used (e.g., to make quantities/UnitScalars/UnitArrays). Even the exception thrown warns that one cannot add two quantities with incompatible units. If you agree, then perhaps this is a second documentation-related issue?

from scimath.units.length import foot, inch, meter
3 * foot
0.9144000000000001*m
foot / inch
12.0
foot / meter
0.3048
...
However, adding incompatible units raises an IncompatibleUnits exception:

from scimath.units.electromagnetism import volt
from scimath.units.mass import kilogram
1 * volt + 2 * kilogram
Traceback (most recent call last):
File "", line 1, in
File "scimath/units/unit.py", line 62, in add
raise IncompatibleUnits("add", self, other)
scimath.units.unit.IncompatibleUnits: Cannot add quanitites with units of 'm2_kg_s-3_A_*-1' and 'kg'

@rkern
Copy link
Member

rkern commented Sep 22, 2014

No, we will not make @has_units automatically convert a unit input to its corresponding UnitScalar. That would create an inconsistency: some places you could use unit objects as if they were UnitScalars and others you couldn't. The only change I would make here is to have @has_units raise a TypeError if given unit-type inputs to arguments that have a specified unit.

I do agree about the examples in the current documentation. They are misleading.

@bnordgren
Copy link
Author

TypeError it is then. Anything but successfully returning the wrong answer is acceptable. So to summarize, this issue can be considered resolved when has_units returns TypeError for the case where the user provides a unit-type to an argument which has a specified unit.

Perhaps something for a 5.0 version (not now) would be to appropriate the notation now used for creating new units (x = 3 * m) as a notation for creating new UnitScalars. Does making new units really need to be easier and more natural than making quantity instances? I don't see this as GUI vs. command line convenience, it's convenience of making new unit objects vs. convenience of making new quantities: which activity are users of your library going to do more of?

I'll put refactoring the "Getting Started" examples into a separate issue.

@rkern
Copy link
Member

rkern commented Sep 23, 2014

For our uses of this library, making new units has, in fact, been more common than making UnitScalars, especially thanks to the @has_units decorator where we put all of the unit declarations into the docstring instead of code. The multiplication syntax for making UnitScalars is great when you are typing in the value and the unit literally, which is the most common case when working with a unit system interactively. It's much less readable deep in your module code when both the value and the unit are not literally represented, e.g. x = my_value * my_unit. For that, it really helps to be explicit, x = UnitScalar(my_value, units=my_unit). When using this library to build unit-aware UIs, it is this latter case that is by far most common. We just don't make UnitScalars from literal values. They come from files or UI controls. That is why the I mention UI-focus of our uses of this library as relevant to this design decision.

Your focus is likely somewhat different. There are other (more actively developed!) unit libraries, like astropy.units, that do have the multiplication syntax for building unitted quantities. I think scimath.units's main neat trick compared to the other unit systems is probably the @has_units decorator. But this is something that can be replicated with most other units systems with just a little bit of effort and stealing our code (in a BSD-compatible way, of course). ;-)

@bnordgren
Copy link
Author

An example of my focus is here (https://github.com/firelab/met_utils/blob/master/satvp.py) Libraries which implement physical calculations. And here: (https://github.com/firelab/met_utils/blob/master/tests/test_satvp.py) unit tests for the above. Usually the physical constants and empirical parameters are literals, with units. The other bits to an equation come from outside the local context via the parameter list, and require compatible units. I noted that the has_units decorator has the advantage that all clients of my unitted function can use whatever units they want, but it doesn't help me write dimensionally correct code within the function. This may not be a showstopper, as it is at least not harder than writing code without unit support, and I can sometimes just appropriate non-unit-bearing code directly. And inside the unitted function, the ease of creating quantities appears to be somewhat moot as everything is a plain python scalar/array. This of course dawned on me slowly as I wrapped my head around the scimath way. I still don't see one method of creating quantities as more explicit/readable/informative than the other: there's just verbose and concise ways of being opaque. :)

Thanks for the link to astropy, it did not show up on a google search for python unit libraries (and a ton of dead libraries did). I'm mostly looking for tight and efficient integration with numpy ndarrays. The runner up at this point is Pint, but it does all numpy calculations twice, which outweighs the fact that it's maintained. I'll go check astropy and see if I'm motivated to jump ship. Looks like there might be a lot of dependencies. The fact that a current version is in Canopy is a plus though (and a minus for Pint).

Funny you should mention stealing code for other projects. Some of your header comments read "All Rights Reserved" with no license at all. Kind of precludes cloning the repo and making a derived product. That may deserve some attention. ;) Maybe even another ticket.

This ticket probably should still stand for "raise a TypeError...", though.

@rkern
Copy link
Member

rkern commented Sep 24, 2014

The BSD-like LICENSE.txt in the root applies to all files. The files you are looking at (where the author is Michael Aivazis) are also BSD-licensed.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants