API: make attribute setting de-facto insert column #9033

jreback · 2014-12-07T15:20:52Z

This might be a bit controversial, but the issues raised in #8994 and #5904
point to some continued confusion w.r.t. attribute setting 'being' column setting

so if we now have

df = DataFrame({'A' : [1,2,3], 'B' : 5 })
df.C = 5

is an attribute set

it could be a column set
e.g. de-facto df['C'] = 5

If someone actually wants an attribute to 'stick' around. (meaning they would have to intercept the __finalize__ methods and actually deal with them properly, then I think it is reasonable to also have them add to the _internal_names as well (see #8572)

So basically would try to set a column (unless its an internal name).

(note that getattr is de-facto already equivalent to __getitem__, e.g. df.B === df['B'])
(don't mind my JS equivalence notion :)

The text was updated successfully, but these errors were encountered:

jreback · 2014-12-07T15:21:50Z

cc @jakevdp
cc @kjordahl
cc @hugadams
@jorisvandenbossche @shoyer @cpcloud @hayd

BrenBarn · 2015-05-21T05:00:58Z

I think this change should be considered very very very carefully. It is a gigantic change to the API. It means that any later additions of new methods or attributes to DataFrame may unexpectedly break tons of code. For instance, if I have code that does df.blah = 5, and we later add a method or attribute called blah, my code will now overwrite that attribute. Allowing quick attribute access for reads is very different, because even if we add a blah method, code that does df.blah will not modify the object. I think it would be wise not to into changes like this that not only change behavior, but have major implications for any and all future changes to the DataFrame API.

shoyer · 2015-05-21T05:56:57Z

@BrenBarn Thanks for commenting, you've convinced me. This indeed seems very dangerous.

jreback · 2015-05-21T13:58:58Z

yeh, this was sort of pie-in-the-sky. To promote consistency. In theory its a nice idea, but by auto-converting to a column, then you pretty much preclude future method expansion. Fundamentally there is really ONE way to create columns, namely __setitem__, but for convience attribute access is not symmetric.

hughesadam87 · 2015-05-21T16:07:02Z

Personally, this would break my library as attribute setting is hacked in. I'm not sure if it would affect GeoPandas, since they have honest-to-goodness dataframe subclasses, and I'm using composite classes. Not that I think this should hold you back from a major release, but personally I'm not a fan of this behavior. The attribute setting syntax is pretty consistent throughout Python, and it's not really that difficult to do df['C'] = 50. If anything, maybe raise a warning when trying to set the attribute?

If you do put this through, can you do a small writeup about how to use the _finalize_ method as you mentioned?

jreback · 2015-05-21T16:24:01Z

@hugadams you might find this interesting reading (near the bottom): http://pandas.pydata.org/pandas-docs/stable/internals.html

hughesadam87 · 2015-05-21T16:27:57Z

Ha wow, I need to keep up I guess!

On Thu, May 21, 2015 at 12:24 PM, jreback notifications@github.com wrote:

@hugadams https://github.com/hugadams you might find this interesting
reading (near the bottom):
http://pandas.pydata.org/pandas-docs/stable/internals.html

—
Reply to this email directly or view it on GitHub
#9033 (comment).

Adam Hughes
Physics Ph.D Candidate
George Washington University

jbrockmendel · 2023-01-23T19:04:54Z

closeble? NDFrame.__setattr__ not has a warning specifically about not doing this

jreback added API Design Compat pandas objects compatability with Numpy or Python functions labels Dec 7, 2014

jreback added this to the 0.16.0 milestone Dec 7, 2014

jreback changed the title ~~API: make attribute setting de-facto setattr~~ API: make attribute setting de-facto insert column Dec 7, 2014

jreback modified the milestones: 0.16.0, Next Major Release Mar 6, 2015

jreback modified the milestones: Someday, Next Major Release May 21, 2015

jreback mentioned this issue Dec 15, 2015

Feature Request: Allow new column setting via attribute assignment #11848

Closed

mroeschke added Enhancement and removed Compat pandas objects compatability with Numpy or Python functions labels Apr 10, 2020

mroeschke added Indexing Related to indexing on series/frames, not to indexes themselves and removed API Design labels Apr 11, 2021

mroeschke removed this from the Someday milestone Oct 13, 2022

jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Jan 23, 2023

mroeschke closed this as completed Nov 21, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

API: make attribute setting de-facto insert column #9033

API: make attribute setting de-facto insert column #9033

jreback commented Dec 7, 2014

jreback commented Dec 7, 2014

BrenBarn commented May 21, 2015

shoyer commented May 21, 2015

jreback commented May 21, 2015

hughesadam87 commented May 21, 2015

jreback commented May 21, 2015

hughesadam87 commented May 21, 2015

jbrockmendel commented Jan 23, 2023

API: make attribute setting de-facto insert column #9033

API: make attribute setting de-facto insert column #9033

Comments

jreback commented Dec 7, 2014

jreback commented Dec 7, 2014

BrenBarn commented May 21, 2015

shoyer commented May 21, 2015

jreback commented May 21, 2015

hughesadam87 commented May 21, 2015

jreback commented May 21, 2015

hughesadam87 commented May 21, 2015

jbrockmendel commented Jan 23, 2023