-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
DEPR: concat ignoring empty objects #52532
DEPR: concat ignoring empty objects #52532
Conversation
pandas/core/dtypes/concat.py
Outdated
if len(non_empties) < len(to_concat) and not any( | ||
obj.dtype == _dtype_obj for obj in non_empties | ||
): | ||
# Check for object dtype is an imperfect proxy for checking if | ||
# the result dtype is going to change once the deprecation is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we verify this exactly? (checking what the common dtype would be with or without the empties?)
Because I think the check above will easily give false positives (unless I am misreading it): for example, for just a mixture of int types, of which one is empty, will trigger the warning? (since none of them is object dtype) While it will typically not change the resulting dtype (except if the input arrays with the largest bitwidth are all empty)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a try! Will take a look
Rereading some of the previous discussions, would you be OK with start doing this deprecation for empty objects that are not object/float dtype? (since those are used as generic "empty" dtype) |
I'm not dead-set against that, but would like to push down this road a bit further first. IIRC the object/float (in particular float) cases were mostly a sticking point in the all-nan cases more than the empty cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like something went wrong when merging in the whatsnew
also, this is kind of hard to review, it looks like you're both refactoring and introducing the deprecation warning - any chance to split out the refactoring into a precursor PR please? (or to write a few changes summarising the changes)
I think I can refactor out as a precursor, coming up shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, but I don't understand the is_na_without_isna_all
-> is_na_after_size_and_isna_all_deprecation
change - could you explain please?
def is_na_without_isna_all(self) -> bool: | ||
def is_na_after_size_and_isna_all_deprecation(self) -> bool: | ||
""" | ||
Will self.is_na be True after values.size == 0 deprecation and isna_all | ||
deprecation are enforced? | ||
""" | ||
blk = self.block | ||
if blk.dtype.kind == "V": | ||
return True | ||
if not blk._can_hold_na: | ||
return False | ||
|
||
values = blk.values | ||
if values.size == 0: | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the future behavior won't depend on values.size == 0
(note the changed method name/docstring)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure but the deprecation hasn't been enforced yet, why is this changing already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is for checking on the future behavior to see if we need to issue a warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine to me
@jorisvandenbossche any objections?
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
@jorisvandenbossche @MarcoGorelli gentle ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
I'd suggest self-merging if it's something you're confident about and have had an approval
My apologies for the slow response here, but I would like to get back to this one point (#52532 (comment)):
I am not fully sure how to interpret the "push down this road a bit further first", but so you would not necessarily be against doing this deprecation only for non-object/float dtypes? (I am also happy to do a follow-up PR for that) |
@jbrockmendel when you have a moment, could you also address this test failure (looks like this PR could have used a rebase)
|
will take a look now.
IIRC at the time of that comment there were other review comments to address (#52532 (comment)) and i wanted to see how "clean" this PR would be once those were addressed. Also as I said in that comment, my understanding of the motivation to limit to only non-object/float cases was about the all-NA case, not the empty case. What would the motivation be to special-case the deprecation here? |
This change rolls together several small changes to avoid warnings generated when running the tests. Here is a summary of the changes: * Update use of `datetime.utcnow()` deprecated in Python 3.12 * Concatenate pandas dataframes more carefully to avoid `FutureWarning` when concatenating an empty dataframe (see pandas-dev/pandas#52532). * Adjust plotter tests to avoid warnings + Use supported options rather than arbitrary options to avoid unexpected option warning + Catch expected warning when passing an untrained discriminator to `IQPlotter` * Move the setting of the axis scale earlier in the logic of `MplDrawer.format_canvas`. `format_canvas` queries the limits of each `axes` object and uses those values. When no data set on a set of axes (an odd edge case), the default values depend on the scale (because the minimum for linear is 0 but for log is 0.1), so the scale needs to be set first. * Remove transpile options from `FineDragCal` (`_transpiled_circuits` for `BaseCalibrationExperiment` warns about and ignores transpile options) * Replace `isinstance` checks with `pulse_type` checks for calibration tests checking pulse shape * Replace `qiskit.Aer` usage with `qiskit_aer` * Set tolerance on cvxpy test to larger value, close to the tolerance achieved in practice. The routine was maxing out its iterations without achieving tolerance but still producing a close enough result for the test to pass. Increasing the tolerance avoids the max iterations warning and makes the test faster. * Rename `TestLibrary` to `MutableTestLibrary`. The class name starting with `Test*` causes pytest to warn about the class looking like a class holding test cases. pytest is not the main way we run the tests but it still seems best to avoid confusion between test case classes and test helper classes. * Catch warnings about insufficient trials in `QuantumVolume` tests using small numbers of trials. * Catch user and deprecation warnings about calibration saving and loading to csv files. * Convert existing calibration saving and loading tests from json to csv, leaving basic coverage of csv saving. A bug was found with json saving, resulting in one test being marked as an expected failure. * Set data on the `ExperimentData` for `TestFramework.test_run_analysis_experiment_data_pickle_roundtrip` to avoid a warning about an empty experiment data object. Other cases of this warning were addressed in 729014b but this test case was added afterward in a6c9e53. In order to avoid warnings creeping in in the future, this change also makes any warnings emitted by qiskit-experiments code be treated as exceptions by the tests (with an exception for `ScatterTable` methods for which it is planned to remove the deprecation). Any expected warnings in tests can be handled by using `assertWarns` or `warnings.catch_warnings`. As all the changes relate to tests (except the `FineDragCal` one which is a non-functional change to avoid a warning), no release note was added. --------- Co-authored-by: Helena Zhang <Helena.Zhang@ibm.com>
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.cc @jorisvandenbossche