BUG: Rolling.apply requires that the result of each operation be numeric #56351

Closed
2 of 3 tasks
alexmerm opened this issue Dec 6, 2023 · 1 comment
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

alexmerm commented Dec 6, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
df = pd.DataFrame(range(10))
df.rolling(5).apply(lambda x: "test")

Issue Description

When running an apply operation on a rolling window of a Series (or DataFrame), if the applied function returns anything other than a float or int, pandas raises TypeError: "must be real number, not str".

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Untitled-3.ipynb Cell 3 line 1
----> 1 df.rolling(5).apply(lambda x: "test")

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:2043, in Rolling.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
   2010 @doc(
   2011     template_header,
   2012     create_section_header("Parameters"),
   (...)
   2041     kwargs: dict[str, Any] | None = None,
   2042 ):
-> 2043     return super().apply(
   2044         func,
   2045         raw=raw,
   2046         engine=engine,
   2047         engine_kwargs=engine_kwargs,
   2048         args=args,
   2049         kwargs=kwargs,
   2050     )

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:1503, in RollingAndExpandingMixin.apply(self, func, raw, engine, engine_kwargs, args, kwargs)
   1500 else:
   1501     raise ValueError("engine must be either 'numba' or 'cython'")
-> 1503 return self._apply(
   1504     apply_func,
   1505     name="apply",
   1506     numba_args=numba_args,
   1507 )

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:617, in BaseWindow._apply(self, func, name, numeric_only, numba_args, **kwargs)
    614     return result
    616 if self.method == "single":
--> 617     return self._apply_blockwise(homogeneous_func, name, numeric_only)
    618 else:
    619     return self._apply_tablewise(homogeneous_func, name, numeric_only)

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:492, in BaseWindow._apply_blockwise(self, homogeneous_func, name, numeric_only)
    488 except (TypeError, NotImplementedError) as err:
    489     raise DataError(
    490         f"Cannot aggregate non-numeric type: {arr.dtype}"
    491     ) from err
--> 492 res = homogeneous_func(arr)
    493 res_values.append(res)
    494 taker.append(i)

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:612, in BaseWindow._apply.<locals>.homogeneous_func(values)
    609     return func(x, start, end, min_periods, *numba_args)
    611 with np.errstate(all="ignore"):
--> 612     result = calc(values)
    614 return result

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:609, in BaseWindow._apply.<locals>.homogeneous_func.<locals>.calc(x)
    600 start, end = window_indexer.get_window_bounds(
    601     num_values=len(x),
    602     min_periods=min_periods,
   (...)
    605     step=self.step,
    606 )
    607 self._check_window_bounds(start, end, len(x))
--> 609 return func(x, start, end, min_periods, *numba_args)

File [PYTHON_DIR]/venv/lib/python3.11/site-packages/pandas/core/window/rolling.py:1530, in RollingAndExpandingMixin._generate_cython_apply_func.<locals>.apply_func(values, begin, end, min_periods, raw)
   1527 if not raw:
   1528     # GH 45912
   1529     values = Series(values, index=self._on, copy=False)
-> 1530 return window_func(values, begin, end, min_periods)

File aggregations.pyx:1423, in pandas._libs.window.aggregations.roll_apply()

TypeError: must be real number, not str
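
To make the contrast concrete, here is a minimal sketch (not part of the original report) that runs the same rolling window with a numeric return value and with a string return value. It assumes a pandas version that behaves like the 2.1.3 build listed under Installed Versions; the variable names `numeric` and `error` are illustrative only, and the exact error message may differ between versions.

```python
import pandas as pd

s = pd.Series(range(10))

# A function that returns a number works: the first four windows are
# incomplete (NaN), and each later result is stored as a float.
numeric = s.rolling(5).apply(lambda x: x.sum())

# A function that returns a string fails, because the result of each
# window has to be convertible to a real number.
error = None
try:
    s.rolling(5).apply(lambda x: "test")
except TypeError as exc:
    error = exc  # keep a reference; `exc` is unbound after the except block

print(numeric.tolist())
print(type(error).__name__, error)
```

The numeric case suggests the output is always treated as floating point, which is consistent with the traceback ending inside `roll_apply` in `aggregations.pyx`.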

Expected Behavior

For the above code, I would expect a Series to be returned with the values [NaN, NaN, NaN, NaN, "test", "test", "test", "test", "test", "test"].
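
Until non-numeric results are supported, one possible workaround is to iterate over the windows manually and build an object-dtype Series. The sketch below is only an illustration under stated assumptions: `rolling_apply_object` is a hypothetical helper, not a pandas API, and it assumes that `min_periods` equals the window size (the default for a fixed integer window). It relies on the fact that a `Rolling` object can be iterated to yield one window per row.

```python
import numpy as np
import pandas as pd


def rolling_apply_object(series, window, func):
    """Hypothetical helper: apply `func` to each full window of `series`.

    Incomplete leading windows yield NaN, mirroring the default
    min_periods behaviour of a fixed-size rolling window.
    """
    results = [
        func(w) if len(w) >= window else np.nan
        for w in series.rolling(window)  # iterating yields each window as a Series
    ]
    # dtype=object lets the result hold strings alongside NaN.
    return pd.Series(results, index=series.index, dtype=object)


s = pd.Series(range(10))
out = rolling_apply_object(s, 5, lambda x: "test")
print(out.tolist())
```

This runs the function in a plain Python loop, so it will be slower than the built-in `apply` on large inputs.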

Installed Versions

[PYTHON_DIR]/python3.11/site-packages/_distutils_hack/init.py:33: UserWarning:

Setuptools is replacing distutils.

INSTALLED VERSIONS

commit : 2a953cf
python : 3.11.2.final.0
python-bits : 64
OS : Darwin
OS-release : 23.1.0
Version : Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:24 PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 2.1.3
numpy : 1.26.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 65.5.0
pip : 22.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.18.1
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.2
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

@mroeschke
Member

Thanks for the report, but closing as a duplicate of #23002.
