GH61405 Expose arguments in DataFrame.query #61413

Open
wants to merge 9 commits into
base: main
from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -401,6 +401,7 @@ Other API changes
- Index set operations (like union or intersection) will now ignore the dtype of
an empty ``RangeIndex`` or empty ``Index`` with object dtype when determining
the dtype of the resulting Index (:issue:`60797`)
- :meth:`DataFrame.query` no longer accepts ``**kwargs``; its supported arguments (``parser``, ``engine``, ``local_dict``, ``global_dict``, ``resolvers``, ``level``) are now explicit keyword-only parameters (:issue:`61405`)

.. ---------------------------------------------------------------------------
.. _whatsnew_300.deprecations:
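
Reviewer note: a minimal sketch of what the whatsnew entry above means for callers. The frame and the name `threshold` are made up for illustration and are not part of the PR; `engine="python"` is pinned only so the example does not depend on numexpr being installed.

```python
import pandas as pd

# Hypothetical example data, not taken from the PR.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Every option is spelled out as a keyword argument. `@threshold` is
# resolved from the explicit local_dict rather than the caller's locals().
result = df.query(
    "a > @threshold and b < 40",
    parser="pandas",
    engine="python",
    local_dict={"threshold": 1},
)

print(result)
```

The call uses only names that appear in the new signature, so it should behave the same whether they are accepted as explicit parameters or forwarded through ``**kwargs``.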
95 changes: 86 additions & 9 deletions pandas/core/frame.py
@@ -4477,18 +4477,58 @@ def _get_item(self, item: Hashable) -> Series:

@overload
def query(
self, expr: str, *, inplace: Literal[False] = ..., **kwargs
self,
expr: str,
*,
parser: Literal["pandas", "python"] = ...,
engine: Literal["python", "numexpr"] | None = ...,
local_dict: dict[str, Any] | None = ...,
global_dict: dict[str, Any] | None = ...,
resolvers: list[Mapping] | None = ...,
level: int = ...,
inplace: Literal[False] = ...,
) -> DataFrame: ...

@overload
def query(self, expr: str, *, inplace: Literal[True], **kwargs) -> None: ...
def query(
self,
expr: str,
*,
parser: Literal["pandas", "python"] = ...,
engine: Literal["python", "numexpr"] | None = ...,
local_dict: dict[str, Any] | None = ...,
global_dict: dict[str, Any] | None = ...,
resolvers: list[Mapping] | None = ...,
level: int = ...,
inplace: Literal[True],
) -> None: ...

@overload
def query(
self, expr: str, *, inplace: bool = ..., **kwargs
self,
expr: str,
*,
parser: Literal["pandas", "python"] = ...,
engine: Literal["python", "numexpr"] | None = ...,
local_dict: dict[str, Any] | None = ...,
global_dict: dict[str, Any] | None = ...,
resolvers: list[Mapping] | None = ...,
level: int = ...,
inplace: bool = ...,
) -> DataFrame | None: ...

def query(self, expr: str, *, inplace: bool = False, **kwargs) -> DataFrame | None:
def query(
self,
expr: str,
*,
parser: Literal["pandas", "python"] = "pandas",
engine: Literal["python", "numexpr"] | None = None,
local_dict: dict[str, Any] | None = None,
global_dict: dict[str, Any] | None = None,
resolvers: list[Mapping] | None = None,
level: int = 0,
inplace: bool = False,
) -> DataFrame | None:
"""
Query the columns of a DataFrame with a boolean expression.

@@ -4507,11 +4547,41 @@ def query(self, expr: str, *, inplace: bool = False, **kwargs) -> DataFrame | No

See the documentation for :meth:`DataFrame.eval` for details on
referring to column names and variables in the query string.
parser : {'pandas', 'python'}, default 'pandas'
The parser to use to construct the syntax tree from the expression. The
default of ``'pandas'`` parses code slightly differently from standard
Python. Alternatively, you can parse an expression using the
``'python'`` parser to retain strict Python semantics. See the
:ref:`enhancing performance <enhancingperf.eval>` documentation for
more details.
engine : {'python', 'numexpr'} or None, default None
The engine used to evaluate the expression. Supported engines are:

- ``None`` : tries to use ``numexpr`` and falls back to ``python``.
- ``'numexpr'`` : evaluates pandas objects using numexpr, which can
give large speedups for complex expressions on large frames.
- ``'python'`` : performs operations as if you had ``eval``'d them in
top-level Python. This engine is generally not that useful.

More backends may be available in the future.
local_dict : dict or None, optional
A dictionary of local variables, taken from ``locals()`` by default.
global_dict : dict or None, optional
A dictionary of global variables, taken from ``globals()`` by default.
resolvers : list of dict-like or None, optional
A list of objects implementing the ``__getitem__`` special method that
you can use to inject an additional collection of namespaces to use for
variable lookup. For example, :meth:`~DataFrame.query` itself uses this
to inject the ``DataFrame.index`` and ``DataFrame.columns`` variables,
which refer to the corresponding attributes of the
:class:`~pandas.DataFrame` instance.
level : int, default 0
The number of prior stack frames to traverse and add to the current
scope. Most users will **not** need to change this parameter.
inplace : bool, default False
Whether to modify the DataFrame rather than creating a new one.
**kwargs
See the documentation for :func:`eval` for complete details
on the keyword arguments accepted by :meth:`DataFrame.query`.
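
Reviewer note: to illustrate the ``resolvers`` parameter documented above, here is a hedged sketch. The mapping ``extra_namespace`` and the name ``limit`` are hypothetical. Because ``limit`` is not a column, the bare name is expected to be looked up in the extra resolver.

```python
import pandas as pd

# Hypothetical example data, not taken from the PR.
df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Any object implementing __getitem__ can serve as a resolver; a plain
# dict is the simplest case.
extra_namespace = {"limit": 2}

# `limit` has no "@" prefix, so it is resolved through the resolvers
# rather than through local or global variables.
result = df.query("a > limit", engine="python", resolvers=[extra_namespace])

print(result["a"].tolist())
```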

Returns
-------
@@ -4624,8 +4694,15 @@ def query(self, expr: str, *, inplace: bool = False, **kwargs) -> DataFrame | No
if not isinstance(expr, str):
msg = f"expr must be a string to be evaluated, {type(expr)} given"
raise ValueError(msg)
kwargs["level"] = kwargs.pop("level", 0) + 1
kwargs["target"] = None
kwargs: Any = {
"level": level + 1,
"target": None,
"parser": parser,
"engine": engine,
"local_dict": local_dict,
"global_dict": global_dict,
"resolvers": resolvers or (),
}

res = self.eval(expr, **kwargs)
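
Reviewer note: both the old and the new body pass ``level + 1`` on to ``eval``. A plausible reading, given that ``level`` is documented as a number of stack frames, is that ``query`` itself adds one frame between the user's code and ``eval``. The stdlib-only sketch below shows that idea in isolation. ``lookup``, ``wrapper``, and ``caller`` are hypothetical stand-ins, not pandas functions, and pandas' real scope handling is more involved.

```python
import sys


def lookup(name, level=0):
    # Hypothetical stand-in for a function that reads a variable from the
    # frame `level` steps above its immediate caller.
    frame = sys._getframe(level + 1)
    try:
        return frame.f_locals[name]
    finally:
        del frame  # drop the frame reference promptly


def wrapper(name, *, level=0):
    # The wrapper occupies one stack frame of its own, so it bumps `level`
    # by one to keep the lookup anchored at the wrapper's caller.
    return lookup(name, level=level + 1)


def caller():
    secret = 42
    return wrapper("secret")


print(caller())
```

Without the ``+ 1`` in ``wrapper``, ``lookup`` would inspect the wrapper's own locals and fail to find ``secret``.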
