7.1.0

ibis-project-bot released this 16 Nov 19:51

7.1.0 (2023-11-16)

Features

  • api: add bucket method for timestamps (ca0f7bc)
  • api: add Table.sample method for sampling rows from a table (3ce2617)
  • api: allow selectors in order_by (359fd5e)
  • api: move analytic window functions to top-level (8f2ced1)
  • api: support deferred in reduction filters (349f475)
  • api: support specifying signature in udf definitions (764977e)
  • bigquery: add location parameter (d652dbb)
  • bigquery: add read_csv, read_json, read_parquet support (ff83110)
  • bigquery: support temporary tables using sessions (eab48a9)
  • clickhouse: add support for timestamp bucket (10a5916)
  • clickhouse: support Table.fillna (5633660)
  • common: better inheritance support for Slotted and FrozenSlotted (9165d41)
  • common: make Slotted and FrozenSlotted pickleable (13cbce0)
  • common: support Self annotations for Annotable (0c60146)
  • common: use patterns to filter out nodes during graph traversal (3edd8f7)
  • dask: add read_csv and read_parquet (e9260af)
  • dask: enable pyarrow conversion (2d36722)
  • dask: support Table.sample (09a7626)
  • datafusion: add case and if-else statements (851d560)
  • datafusion: add corr and covar (edc42be)
  • datafusion: add isnull and isnan operations (0076c25)
  • datafusion: add some array functions (0b96b68)
  • datafusion: add StringLength, FindInSet, ArrayStringJoin (fd03831)
  • datafusion: add TimestampFromUNIX and subtract/add operations (2bffa5a)
  • datafusion: add TimestampTruncate / fix broken extract time part functions (940ed21)
  • datafusion: support dropping schemas (cc6870c)
  • duckdb: add attach and detach methods for adding and removing databases to the current duckdb session (162b058)
  • duckdb: add ntile support (bf08a2a)
  • duckdb: add dict-like for DuckDB settings (ea2d317)
  • duckdb: add support for specific timestamp scales (3518b78)
  • duckdb: allow users to register fsspec filesystem with DuckDB (6172f07)
  • duckdb: expose option to force reinstall extension (98080d0)
  • duckdb: implement Table.sample as a TABLESAMPLE query (3a80f3a)
  • duckdb: implement partial json collection casting (aae28e9)
  • flink: add remaining operators for Flink to pass/skip the common tests (b27adc6)
  • flink: add several temporal operators (f758228)
  • flink: implement the ops.TryCast operation (752e587)
  • formats: map ibis JSON type to pyarrow strings (79b6eac)
  • impala/pyspark: implement to_pyarrow (6b33454)
  • impala: implement Table.sample (8e78dfc)
  • implement window table valued functions (a35a756)
  • improve generated column names for methods receiving intervals (c319ed3)
  • mssql: add support for timestamp bucket (1ffac11)
  • mssql: support cross-db/cross-schema table list (3e0f0fa)
  • mysql: support ntile (9a14ba3)
  • oracle: add fixes after running pre-commit (6538b70)
  • oracle: add fixes after running pre-commit (e3d14b3)
  • oracle: add support for loading Oracle RAW and BLOB types (c77eeb2)
  • oracle: change parsing of Oracle NUMBER data type (649ab86)
  • oracle: remove redundant brackets (2905484)
  • pandas: add read_csv and read_parquet (34eeca6)
  • pandas: support Table.sample (77215be)
  • polars: add support for timestamp bucket (c59518c)
  • postgres: add support for timestamp bucket (4d34afc)
  • pyspark: support Table.sample (6aa897e)
  • snowflake: support ntile (39eed1a)
  • snowflake: support cross-db/cross-schema table list (2071897)
  • snowflake: support timestamp bucketing (a95ffa9)
  • sql: implement Table.sample as a random() filter across several SQL backends (e1870ea)
  • trino: implement Table.sample as a TABLESAMPLE query (f3d044c)
  • trino: support ntile (2978d1a)
  • trino: support temporal operations (8b8e885)
  • udf: improve mypy compatibility for udf functions (65b5bb7)
  • use to_pyarrow instead of to_pandas in the interactive repr (72aa573)
  • ux: fix long links, add repr links in vscode (734bd91)
  • ux: implement recursive element conversion for nested types and json (8ddfa94)
  • ux: render url strings as links in rich table output (1c7a9b6)
  • ux: show syntax-highlighted SQL if pygments is installed (09881b0)
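
Several entries above add timestamp bucketing to the API and to individual backends. As a rough illustration of what bucketing generally means (flooring each timestamp to the start of a fixed-width interval), here is a standard-library Python sketch. The helper `bucket_timestamp` and its `origin` parameter are hypothetical names chosen for this example; this is not ibis's API or its implementation, and the exact signature and per-backend semantics should be checked in the ibis reference documentation.

```python
from datetime import datetime, timedelta


def bucket_timestamp(
    ts: datetime,
    width: timedelta,
    origin: datetime = datetime(1970, 1, 1),  # assumed alignment point
) -> datetime:
    """Floor `ts` to the start of its `width`-sized bucket, counted from `origin`."""
    if width <= timedelta(0):
        raise ValueError("width must be positive")
    # Floor division of two timedeltas yields the number of whole buckets elapsed.
    whole_buckets = (ts - origin) // width
    return origin + whole_buckets * width


stamps = [
    datetime(2023, 11, 16, 19, 51),
    datetime(2023, 11, 16, 19, 44, 59),
]
# Group both timestamps into 15-minute buckets.
buckets = [bucket_timestamp(s, timedelta(minutes=15)) for s in stamps]
```

With a 15-minute width, the first timestamp falls in the bucket starting at 19:45 and the second in the bucket starting at 19:30.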

Bug Fixes

  • bigquery: apply unnest transformation in other methods that execute SQL (2cc9d0e)
  • bigquery: avoid trying to filter separator argument to GroupConcat operation (ed3b017)
  • bigquery: ensure that the identifier is parsed according to the dialect (f5bb555)
  • bigquery: move sql code to proper argument (abb0bdd)
  • datafusion: do_connect: properly deal with config-is-actually-context (649480c)
  • datafusion: fix some temporal operations (3206dbc)
  • datatypes: correct uint upper bounds (5ca56d5)
  • datatypes: correct unsigned integer bounds (1e40d4e)
  • deps: bump pins lower bound to pickup transitive fsspec upper bound (983e23e)
  • deps: bump sqlglot lower bound (a47be79)
  • deps: pin pyspark to a working version (7eb8a19)
  • deps: update dependency datafusion to v32 (1afbe9c)
  • deps: update dependency pyarrow to v14 (bce86c4)
  • deps: update dependency sqlglot to v19 (1f3ae07)
  • duckdb: ensure proper quoting when compiling cross database/schema tables (8d7b5fa)
  • duckdb: query table list directly instead of relying on sqlalchemy (5d7822c)
  • duckdb: use connect instead of begin to avoid nesting transactions (6889543)
  • flink: cast argument to integer for reduction (5059eed)
  • flink: correct the filtered count translation (2cbca74)
  • flink: re-implement ops.ApproxCountDistinct (2e3a5a0)
  • ir: ibis.parse_sql() removes where clause (522f3a4)
  • ir: coerce integers passed to Value[dt.Floating] annotated values as dt.float64 (b8a924a)
  • ir: ensure that windowization directly wraps the reduction/analytic function (772df36)
  • mssql: support translation of ops.Neg() when projecting a field (ca49d2a)
  • oracle: change filter inside select into case when (c743fa2)
  • oracle: disable if_exists for Oracle drop view command (973133b)
  • oracle: fix fallback column type inference (fb5d56d)
  • pandas: drop __index_level_N__ cols before applying schema (b53feac)
  • patterns: Object pattern should match on positional arguments first (96c796f)
  • patterns: PatternList should keep the original pattern's type (6552639)
  • polars: bump lower bound to 0.19.8 and clean up a bunch of backcompat code (462bd17)
  • polars: various polars enhancements (5948dd6)
  • repr: add dispatch for repr of GeoSpatialBinOps (843d086)
  • snowflake: include views when listing tables for backwards compatibility (094881b)
  • snowflake: support snowflake 3.3.0 (nanoarrow) (a0f24e8)
  • sqlalchemy: ensure that limit on .sql calls works (a5e3062)
  • sqlite: handle BLOB datatype (d36ed1c)
  • sqlite: truncate week to previous week not following (6239794)
  • sql: subtract one from ntile output in string-generating backends (1d264dc)
  • support self joins on memtables (f24e355)
  • trino: enable passing the database argument when accessing tables (e7ce43e)
  • trino: ensure that a schema is not required upon connection when accessing tables with explicit schema (8bde3e0)
  • use pyarrow_hotfix where necessary (0fa1e5d)
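
The ntile fix above (1d264dc) subtracts one from the SQL output, which suggests that ibis reports zero-based tile numbers while SQL's `NTILE` is one-based. The sketch below shows that offset in plain Python. The function `sql_ntile` is a hypothetical stand-in that mimics the usual SQL distribution rule (earlier tiles receive the extra rows); it is not code from ibis or from any backend.

```python
def sql_ntile(num_rows: int, tiles: int) -> list[int]:
    """Assign one-based tile numbers to `num_rows` ordered rows, as SQL NTILE does."""
    if tiles <= 0:
        raise ValueError("tiles must be positive")
    base, extra = divmod(num_rows, tiles)
    assignments: list[int] = []
    for tile in range(1, tiles + 1):
        # The first `extra` tiles each hold one more row than the rest.
        size = base + (1 if tile <= extra else 0)
        assignments.extend([tile] * size)
    return assignments


one_based = sql_ntile(5, 2)
# Shift to the zero-based convention implied by the fix.
zero_based = [tile - 1 for tile in one_based]
```

For five rows split into two tiles, the one-based result is `[1, 1, 1, 2, 2]` and the shifted result is `[0, 0, 0, 1, 1]`.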

Documentation

  • add .nullif() example (6d405df)
  • add "similar to pandas ..." to docstrings (cd7be29)
  • add basic intro docstring to Table class (1a68f31)
  • add callout note for Table.sample (51027d9)
  • add copyright holders to license (ca97dfb)
  • add deprecation to .nullifzero docstring (8502e81)
  • add example to Value.hash() (501ae92)
  • add examples to Value.typeof() (c146381)
  • add more examples to Table.select() (735bbd0)
  • add See Also sections to some APIs (be8938f)
  • clickhouse: freeze clickhouse backend docs to avoid rate limit from upstream playground (e3a7eac)
  • contribute: fix instructions for nix environment setup (013cedd)
  • contribute: fix path to conda-lock files for contributors (ef5bdf9)
  • dedupe 6.2.0 and 7.0.0 release notes (7ce4b1a)
  • fix and improve .isin() docstring (063cfba)
  • fix dask compile docstring typo (d38d2c4)
  • fix link in Value.type() docstring (43b798c)
  • fixup link (d4c97b0)
  • flink: add backend back to support matrix df (e846e80)
  • improve .between() docstring (a086134)
  • improve .case() and .cases() docstrings (7fc89e8)
  • improve cast() and try_cast() docstrings (0b686e8)
  • improve cross-linking within reference (9e45194)
  • improve examples for Table.order_by() (9465b2a)
  • improve join() docstring (84c08c6), closes #7424
  • improve re_replace docstring (f55d0db)
  • improve Table.columns docstring (d50558b)
  • mysql: render_do_connect mssql to mysql (3c2da6c)
  • pandas: show methods from BasePandasBackend (20fd120)
  • ranking: add ranking function docstrings (750bfeb)
  • setup codespace configuration [skip ci] (5363b94)
  • style: replace Black with Ruff in guidelines (1db3047)
  • temporal: add Literal annotation to display possible units for delta method (ee94cb5)
  • trino: add details for connecting to starburst (ca9873a)
  • trino: add note about SSO configuration (457534b)
  • udfs: fix udf interlink locations (c26e48b)

Refactors

  • analysis: remove _rewrite_filter() in favor of using replacement patterns (4c0ac2e)
  • analysis: remove is_reduction() (2acc31f)
  • analysis: remove pushdown_aggregation_filters() (cf95ff7)
  • analysis: remove sub_for(), substitute(), find_toplevel_aggs() (492b296)
  • analysis: remove substitute_parents() (cd91a7e)
  • analysis: remove substitute_unbound() since it is used at a single place (6a6ad19)
  • analysis: simplify and improve pushdown_selection_filters() (2e47738)
  • analysis: vastly simplify windowize_function (998bbaa)
  • backends: move read_delta to base io handler (3d5a684)
  • bigquery: add schema kwarg to list_tables (95be62f)
  • bigquery: remove session use (60e7900)
  • bigquery: remove unused BigQueryTable object (b83e60e)
  • clean up lit usage (1bc6cee)
  • clickhouse: apply repetitive transformations as pattern replacements (e966af8)
  • clickhouse: replace lit with builtin sqlglot functions (221b630)
  • clickhouse: use a pattern for one-to-zero index conversion of ranking window functions (732c031)
  • clickhouse: use sqlglot for create_table implementation (ea0826d)
  • common: remove ibis.common.bases.Base in favor of Abstract (8ed313c)
  • datafusion: create registry of time udfs to create them only once (9ed0a89)
  • docker-compose: clean up unused exposed ports and make envar spec uniform (7ee518d)
  • duckdb: remove lit (6f77df9)
  • flink: use FILTER syntax when counting (815c12f)
  • imports: move pandas-importing object to method (103a524)
  • ir: remove ibis.expr.streaming (70df318)
  • ir: remove ops.Negatable, ops.NotAny, ops.NotAll, ops.UnresolvedNotExistsSubquery (e31e8fd)
  • ir: unify ibis.common.pattern builders and ibis.expr.deferred (652ceab)
  • make _WellKnownText not a NamedTuple (9a9e733)
  • oracle: deprecate database for schema in list_tables (c8ea79f)
  • patterns: support more flexible sequence matching (b8e463d)
  • postgres: deprecate database for schema in list_tables (d622730)
  • remove unused *args in udf functions (e22236c)
  • sql: align logic for filtered reductions (0347036)
  • temporal: remove unnecessary Temporal* classes (d3bcf73)
  • trino: support better cross-db/cross-schema table list (d2cf1c9)
  • use rewrite rules to handle fillna/dropna in sql backends (f5e06a6)

Performance

  • bigquery: use more efficient representation for memtables (697d325)