API: make attribute setting de-facto insert column #9033

Closed
jreback opened this issue Dec 7, 2014 · 8 comments
Labels
Closing Candidate May be closeable, needs more eyeballs Enhancement Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@jreback
Contributor

jreback commented Dec 7, 2014

xref #8994
xref #5904
xref #8572

This might be a bit controversial, but the issues raised in #8994 and #5904
point to some continued confusion w.r.t. attribute setting 'being' column setting

so if we now have

df = DataFrame({'A' : [1,2,3], 'B' : 5 })
df.C = 5

is an attribute set

it could be a column set
e.g. de-facto df['C'] = 5

If someone actually wants an attribute to 'stick' around (meaning they would have to intercept the __finalize__ methods and actually deal with them properly), then I think it is reasonable to also have them add it to the _internal_names as well (see #8572).

So basically we would try to set a column (unless it's an internal name).

(note that getattr is de-facto already equivalent to __getitem__, e.g. df.B === df['B'])
(don't mind my JS equivalence notion :)
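
For readers following along, the asymmetry described above can be sketched as below. This is a minimal illustration, assuming a reasonably recent pandas install; the variable names (`attr_made_column` and friends) are made up for this example and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": 5})

# Attribute assignment with a new name: stored as a plain Python
# attribute on the object, not as a column.
df.C = 5
attr_made_column = "C" in df.columns

# Item assignment with a new name: this is what actually adds a column.
other = pd.DataFrame({"A": [1, 2, 3], "B": 5})
other["C"] = 5
item_made_column = "C" in other.columns

# Reading an existing column works with either spelling.
reads_match = df.B.equals(df["B"])

print(attr_made_column, item_made_column, reads_match)
```

The proposal in this issue would make the first assignment behave like the second.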

@jreback jreback added API Design Compat pandas objects compatability with Numpy or Python functions labels Dec 7, 2014
@jreback jreback added this to the 0.16.0 milestone Dec 7, 2014
@jreback
Contributor Author

jreback commented Dec 7, 2014

cc @jakevdp
cc @kjordahl
cc @hugadams
@jorisvandenbossche @shoyer @cpcloud @hayd

@jreback jreback changed the title API: make attribute setting de-facto setattr API: make attribute setting de-facto insert column Dec 7, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015
@BrenBarn

I think this change should be considered very very very carefully. It is a gigantic change to the API. It means that any later additions of new methods or attributes to DataFrame may unexpectedly break tons of code. For instance, if I have code that does df.blah = 5, and we later add a method or attribute called blah, my code will now overwrite that attribute. Allowing quick attribute access for reads is very different, because even if we add a blah method, code that does df.blah will not modify the object. I think it would be wise not to rush into changes like this, which not only change behavior but also have major implications for any and all future changes to the DataFrame API.
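
The hazard described here can be sketched with a toy class. This is not pandas code: `ToyFrameV1` and `ToyFrameV2` are hypothetical stand-ins, and the `__setattr__` rule is only one assumed reading of the proposal (names the class already defines stay attributes, everything else becomes a column).

```python
class ToyFrameV1:
    """Hypothetical frame where attribute assignment inserts a column."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the column store.
        object.__setattr__(self, "columns", {})

    def __setattr__(self, name, value):
        if hasattr(type(self), name):
            # Name is already defined on the class: plain attribute set.
            object.__setattr__(self, name, value)
        else:
            # Unknown name: treat the assignment as a column insert.
            self.columns[name] = value


class ToyFrameV2(ToyFrameV1):
    """A later release that happens to add a method named `blah`."""

    def blah(self):
        return "summary"


old = ToyFrameV1()
old.blah = 5  # creates a column under the old release

new = ToyFrameV2()
new.blah = 5  # same user code now shadows the new method instead

print(old.columns, new.columns, new.blah)
```

The same line of user code silently changes meaning between releases, which is the sense in which every future method name becomes a potential breaking change.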

@shoyer
Member

shoyer commented May 21, 2015

@BrenBarn Thanks for commenting, you've convinced me. This indeed seems very dangerous.

@jreback
Contributor Author

jreback commented May 21, 2015

yeah, this was sort of pie-in-the-sky, meant to promote consistency. In theory it's a nice idea, but by auto-converting to a column you pretty much preclude future method expansion. Fundamentally there is really ONE way to create columns, namely __setitem__; attribute access is there for convenience, so it is not symmetric.

@jreback jreback modified the milestones: Someday, Next Major Release May 21, 2015
@hughesadam87

Personally, this would break my library as attribute setting is hacked in. I'm not sure if it would affect GeoPandas, since they have honest-to-goodness dataframe subclasses, and I'm using composite classes. Not that I think this should hold you back from a major release, but personally I'm not a fan of this behavior. The attribute setting syntax is pretty consistent throughout Python, and it's not really that difficult to do df['C'] = 50. If anything, maybe raise a warning when trying to set the attribute?

If you do put this through, can you do a small writeup about how to use the __finalize__ method as you mentioned?

@jreback
Contributor Author

jreback commented May 21, 2015

@hugadams you might find this interesting reading (near the bottom): http://pandas.pydata.org/pandas-docs/stable/internals.html
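
As a rough sketch of the subclassing pattern the pandas docs describe (declaring custom attributes in `_metadata` so that `__finalize__` carries them over to derived objects), something like the following should work on recent pandas versions. `TaggedFrame` and the `source` attribute are hypothetical names chosen for illustration; check the docs for your pandas version before relying on the details.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    # Hypothetical subclass. Names listed in _metadata are treated as
    # real attributes and are copied to results by __finalize__.
    _metadata = ["source"]

    @property
    def _constructor(self):
        # Make pandas operations return TaggedFrame, not DataFrame.
        return TaggedFrame


# On a plain DataFrame an ad-hoc attribute is not carried to a copy.
plain = pd.DataFrame({"A": [1, 2, 3]})
plain.source = "sensor-1"
plain_kept = hasattr(plain.copy(), "source")

# On the subclass the declared attribute is carried to the copy.
tagged = TaggedFrame({"A": [1, 2, 3]})
tagged.source = "sensor-1"
tagged_kept = getattr(tagged.copy(), "source", None)

print(plain_kept, tagged_kept)
```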

@hughesadam87

Ha wow, I need to keep up I guess!

Adam Hughes
Physics Ph.D Candidate
George Washington University

@mroeschke mroeschke added Enhancement and removed Compat pandas objects compatability with Numpy or Python functions labels Apr 10, 2020
@mroeschke mroeschke added Indexing Related to indexing on series/frames, not to indexes themselves and removed API Design labels Apr 11, 2021
@mroeschke mroeschke removed this from the Someday milestone Oct 13, 2022
@jbrockmendel
Member

closeable? NDFrame.__setattr__ now has a warning specifically about not doing this
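
A small sketch of that warning in action, assuming a recent pandas version. As far as I can tell it is a `UserWarning` and only fires when the assigned value is list-like, so a scalar assignment such as `df.C = 5` stays silent; treat both points as assumptions to verify against your installed version.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": 5})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # List-like value assigned to a name that is not an existing column.
    df.C = [7, 8, 9]

user_warnings = [w for w in caught if issubclass(w.category, UserWarning)]
column_created = "C" in df.columns

for w in user_warnings:
    print(w.message)
print(column_created)
```

The assignment still goes through as a plain attribute set; the warning only tells the user that no column was created.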

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Jan 23, 2023