-
-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Should unstruct_hook not unstruct? #432
Comments
Hello! Thanks for the writeup, always glad to hear from cattrs users, especially folks just coming in. To provide a little context, the I do agree cattrs might have a functionality gap. Lately I've been reframing what exactly I think cattrs is, and the definition I'm landing at is that it's a data transformation framework based on function composition. But... cattrs doesn't make it very easy to compose functions. Or at least it could make it significantly easier. Some of this should be done through new convenience APIs, and some through documentation. (I'm generally very careful about adding convenience APIs, especially when you can just use direct Python instead. I think they clog up the mental model of the API and can be an attractive nuisance.) I think the most basic and easiest form of function composition in Python is... Just wrapping functions. I think that may be part of your intuition gap; I guess you were expecting the As a thought exercise, I tried solving your example as a cattrs expert to see how I'd go about it. My first though was leveraging the fact that Continuing your scenario: unstruct = make_dict_unstructure_fn(
Person, cattrs.Converter(), _children_by_name=override(rename="children")
)
print(unstruct(alice)) # {'name': 'Alice', 'children': {'Bob': {'name': 'Bob', '_children_by_name': {}}}} But interestingly, that's actually wrong. You have a recursive class and if the hook doesn't actually get registered on the converter the nested data is incorrect. So in this case it's more of a foot gun than a cool convenience. We actually need to do the entire registration dance. But how do we wrap it? default_hook = make_dict_unstructure_fn(
Person, c, _children_by_name=override(rename="children")
)
def my_hook(val: Person) -> dict:
res = default_hook(val)
res["children"] = list(res["children"].values())
return res
c.register_unstructure_hook(Person, my_hook)
print(c.unstructure(alice)) This'll get us what we want but it's borderline too complex. Brainstorming a little bit, here's what we could do.
Then the example could be a little more general (merging the rename into the wrapper): default_hook = c.get_unstructure_hook(Person)
def my_hook(val: Person) -> dict:
res = default_hook(val)
res["children"] = list(res["_children_by_name"].values())
return res
c.register_unstructure_hook(Person, my_hook)
Unsure what the wrapping API would be like yet. Maybe it's a factory function that takes a callable and returns a callable? |
Ah, I appreciate the context! That makes total sense to me and evokes Haskell's typeclasses and Rust Drawing on this notion, your This method does treat the Here's a slightly different example to justify needing both pre- and post-processing (whereas you reasonably fused the list and rename operations). This code already works with the current cattrs! (But hits an internal API.) # To really drive home the function composition parallel, let's make a compose function.
def compose(*fns):
"""Compose functions like Haskell's `.`."""
def composed_fn(val):
for fn in reversed(fns):
val = fn(val)
return val
return composed_fn
def pre_fn(val: Person) -> Person:
# Sometimes we need to both preprocess and postprocess values for maximum control. In this case,
# we need to filter out any possessed children *before* turning to the default implementation,
# or the default hook will get stuck to the ceiling muttering in tongues.
return attrs.evolve(val,
children_by_name={k: v for k, v in val._children_by_name.items() if "Possessed" not in v.name})
# In the post_fn, we do the rename and the listification as previously. We can't do it in the
# preprocessing function since we haven't yet dictified there.
def post_fn(res: dict) -> dict:
res["children"] = list(res.pop("_children_by_name").values())
return res
c = cattrs.Converter()
default_hook = c._unstructure_func.dispatch(Person)
c.register_unstructure_hook(Person, compose(post_fn, default_hook, pre_fn))
demonic_bob = Person(name="Evil Possessed Bob")
alice.add_child(demonic_bob)
assert repr(c.unstructure(alice)) == "{'name': 'Alice', 'children': [{'name': 'Bob', 'children': []}]}" Not to get too carried away, but one could even envision including # Before, to rename a field:
renamer_hook = make_dict_unstructure_fn(
Person, c, _children_by_name=override(rename="children")
)
c.register_unstructure_hook(Person, renamer_hook)
# After:
from cattrs.functional import rename, compose
c.register_unstructure_hook(Person, compose(rename(_children_by_name="children"), c.builtin))
Then we could write my exorcism example like so: c.register_unstructure_hook(Person, compose(
lambda res: {**res, 'children': list(res.pop('_children_by_name').values())},
c.builtin,
lambda val: attrs.evolve(val, children_by_name={k: v for k, v in val._children_by_name.items() if "Possessed" not in v.name})
)) Or perhaps for clarity and convenience there could be a function specifically for doing this where the composition is implied. So finishing the thought you started, the convenience function could in fact be called c.wrap_unstructure_for(Person,
lambda res: {**res, 'children': list(res.pop('_children_by_name').values())},
c.previous,
lambda val: attrs.evolve(val, children_by_name={k: v for k, v in val._children_by_name.items() if "Possessed" not in v.name}),
) Renamed c.wrap_unstructure_for(Person,
field_transformer(children=lambda d: list(d.values())),
field_renamer(_children_by_name='children'),
c.previous,
attrs_transformer(children_by_name=lambda val, a: {k: v for k, v in a.items() if "Possessed" not in v.name}),
) This approach would make it extremely easy to assemble complex transformation pipelines using simple, well-understood functions. I find it quite expressive and, at least in my eyes, intuitive. It's well-aligned with the concept of function composition for data transformation. And there's no magic; each function we include is straightforward and easily understandable. |
I've written and documented I'm going to leave it here for now, until I see how things are progressing. I feel Python already has powerful ways of composing functions so we don't need to provide |
Description
The new
unstruct_hook
override feature in cattrs provides a way to customise unstructuring per attribute but it appears a blunt instrument: you now seem to be responsible for the totality of unstructuring from that attribute down.What I Did
Discussion: An Intuition Gap?
I'm a new user and might just be doing things sideways. All of the following might be wrong, just my quick thoughts.
It seems
unstruct_hook
is intended to be the ultimate escape hatch that enables wholesale replacement of the unstructuring behaviour for specific attributes. But I do wonder if this is usually what one wants withoverride
. If I wanted to just write my ownasdict
method forPerson
, I could already just write a function and use it directly withregister_unstructure_hook
. The reason I'm usingcattrs.gen
is to tweak the built-in behaviour rather than reimplementing it from scratch.There's also an intuition gap here. Specifically, it feels counterintuitive that customising
Person
unstructuring causes a non-unstructuredPerson
to pop out. I recognise this is in truth a misunderstanding: I think I'm customising but I'm actually replacing. Still, I'm probably not the last person who'll make the same mistake.To me it would make sense to do one of the following:
unstructure_hook
rather than replacing itObviously you could argue that just calling
c.unstructure
recursively like I do in the final example is fine and I'm just holding it wrong.The text was updated successfully, but these errors were encountered: