apache-iceberg

Star

Here are 78 public repositories matching this topic...

risingwavelabs / risingwave

Star

Stream processing and management platform.

rust database kafka postgresql stream-processing data-engineering materialized-view apache-iceberg

Updated Jun 5, 2025
Rust

matanolabs / matano

Star

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS

Updated Jan 8, 2025
Rust

apache / incubator-xtable

Star

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

apache-iceberg delta-lake apache-hudi

Updated Jun 4, 2025
Java

Fastest open-source tool for replicating Databases to Data Lake in Open Table Formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supporting Postgres, MongoDB and MySQL

database replication s3 parquet elt cdc data-pipeline change-data-capture apache-iceberg lakehouse

Updated Jun 5, 2025
Go

tansu-io / tansu

Star

Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake

postgres postgresql s3 parquet apache-kafka datalake apache-arrow apache-iceberg delta-lake datafusion built-with-rust

Updated Jun 3, 2025
Rust

cuebook / cuelake

Star

Use SQL to build ELT pipelines on a data lakehouse.

sql apache-spark etl pipelines data-engineering data-lake data-transfer delta data-integration upsert elt data-pipeline datalake data-ingestion spark-sql zeppelin-notebook apache-iceberg lakehouse incremental-updates

Updated May 25, 2022
JavaScript

lhbench / lhbench

Star

Lakehouse storage system benchmark

benchmark database cidr databricks apache-iceberg delta-lake lakehouse apache-hudi

Updated Feb 22, 2023
Scala

dominikhei / Local-Data-LakeHouse

Star

Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.

data-lake minio trino hive-metastore apache-iceberg lakehouse data-lakehouse

Updated Sep 2, 2023
Dockerfile

nimtable / nimtable

Star

The Control Plane for Apache Iceberg

iceberg datalake polaris apache-iceberg lakehouse s3-tables

Updated Jun 5, 2025
TypeScript

dacort / modern-data-lake-storage-layers

Star

Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

aws amazon-emr iceberg hudi apache-iceberg delta-lake apache-hudi

Updated Jul 13, 2022
Jupyter Notebook

abeltavares / real-time-data-pipeline

Star

📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems.

docker open-source aws big-data etl s3 data-visualization data-engineering minio apache-flink apache-kafka real-time-data data-pipeline trino streaming-analytics apache-superset apache-iceberg lakehouse sql-analytics

Updated Jan 18, 2025
Python

hyparam / icebird

Star

Icebird: JavaScript Iceberg Client

javascript data-engineering data-lake parquet iceberg datalake apache-iceberg hyparquet

Updated May 8, 2025
JavaScript

aws-samples / transactional-datalake-using-apache-iceberg-on-aws-glue

Star

Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with AWS Glue Streaming and DMS

apache-spark aws-athena aws-glue aws-dms apache-iceberg

Updated Feb 15, 2025
Python

guidok91 / spark-movies-etl

Star

Spark data pipeline that processes movie ratings data.

spark etl pyspark data-engineering elt data-pipeline apache-airflow uv apache-iceberg

Updated Jun 1, 2025
Python

bodo-ai / denali

Star

An open-source, community-driven REST catalog for Apache Iceberg!

go golang catalog iceberg apache-iceberg

Updated Jun 26, 2024
Go

aws-samples / monitoring-apache-iceberg-table-metadata-layer

Star

Sample code to collect Apache Iceberg metrics for table monitoring

aws apache-spark monitoring aws-lambda aws-cloudwatch data-quality aws-glue sam-cli apache-iceberg pyiceberg

Updated Aug 18, 2024
Python

aws-samples / iceberg-streaming-examples

Star

This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenarios using best practices. The code can be deployed into any Spark compatible engine like Amazon EMR Serverless or AWS Glue. A fully local developer environment is also provided.

apache-spark structured-streaming apache-iceberg

Updated Nov 14, 2024
Java

aws-samples / sample-pace-data-analytics-ml-ai

Star

DAIVI is a reference solution with IAC modules to accelerate development of Data, Analytics, AI and Visualization applications on AWS using the next generation Amazon SageMaker Unified Studio. The goal of the DAIVI solution is to provide engineers with sample infrastructure-as-code modules and application modules to build their data platforms.

terraform sagemaker apache-iceberg sagemaker-studio

Updated May 28, 2025
HCL

aws-samples / aws-glue-streaming-etl-with-apache-iceberg

Star

Streaming ETL job cases in AWS Glue to integrate Iceberg and creating an in-place updatable data lake on Amazon S3

apache-spark aws-athena aws-glue apache-iceberg aws-glue-streaming

Updated Sep 10, 2024
Python

tj--- / iceberg-demo

Star

A sample implementation of stream writes to an Iceberg table on GCS using Flink and reading it using Trino

java kafka gcs apache-flink apache-kafka flink iceberg flink-stream-processing trino apache-iceberg

Updated May 30, 2022
Java

Improve this page

Add a description, image, and links to the apache-iceberg topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the apache-iceberg topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

apache-iceberg

Here are 78 public repositories matching this topic...

risingwavelabs / risingwave

matanolabs / matano

apache / incubator-xtable

datazip-inc / olake

tansu-io / tansu

cuebook / cuelake

lhbench / lhbench

dominikhei / Local-Data-LakeHouse

nimtable / nimtable

dacort / modern-data-lake-storage-layers

abeltavares / real-time-data-pipeline

hyparam / icebird

aws-samples / transactional-datalake-using-apache-iceberg-on-aws-glue

guidok91 / spark-movies-etl

bodo-ai / denali

aws-samples / monitoring-apache-iceberg-table-metadata-layer

aws-samples / iceberg-streaming-examples

aws-samples / sample-pace-data-analytics-ml-ai

aws-samples / aws-glue-streaming-etl-with-apache-iceberg

tj--- / iceberg-demo

Improve this page

Add this topic to your repo