PERF: DataFrame.from_records with DataFrame input #51353

Closed
jbrockmendel opened this issue Feb 12, 2023 · 2 comments · Fixed by #51697
Labels
Constructors (Series/DataFrame/Index/pd.array Constructors), Performance (Memory or execution speed performance)

Comments

@jbrockmendel
Member

jbrockmendel commented Feb 12, 2023

Speculative: when DataFrame.from_records is passed a DataFrame, it goes through internals.construction.to_arrays and ends up splitting the frame into individual columns. If that splitting turns out to be unnecessary, it can slow down subsequent operations. Can we handle DataFrame input inside from_records and avoid the split?

This is also the only way we get to to_arrays with a DataFrame input, so the function could be simplified if that case could be ruled out.

@jbrockmendel added the Bug, Needs Triage, Performance, and Constructors labels, then removed the Bug and Needs Triage labels (Feb 12, 2023)
@phofl
Member

phofl commented Feb 28, 2023

Can we simply deprecate DataFrame input? I might be missing something, but there seems to be no use case that can't be achieved by calling other methods, e.g. set_index, reindex and drop.
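To make the suggestion concrete, here is a minimal sketch of how the three methods named above could stand in for passing a DataFrame to from_records. The pairing of each method with the from_records keywords index, columns and exclude is an assumption for illustration, not something spelled out in this thread, and the sample frame is made up.

```python
import pandas as pd

# Hypothetical sample frame, used only for illustration.
df = pd.DataFrame(
    {"id": [1, 2, 3], "x": [0.1, 0.2, 0.3], "y": ["a", "b", "c"]}
)

# Assumed stand-in for from_records(df, index="id"):
# promote an existing column to the index.
indexed = df.set_index("id")

# Assumed stand-in for from_records(df, columns=[...]):
# select and order the columns explicitly.
selected = df.reindex(columns=["y", "x"])

# Assumed stand-in for from_records(df, exclude=[...]):
# drop the columns that are not wanted.
trimmed = df.drop(columns=["y"])
```

Each call operates on the frame directly, so none of them needs to route a DataFrame through from_records.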

@jbrockmendel
Member Author

Agreed, it seems like a really weird use case.
