Iceberg sql json #88

Open: 20 commits to merge into base: main

jeremyber-aws
Contributor

Add Iceberg SQL Examples with Glue and S3 Tables Catalogs

Purpose of the change

Add two new examples demonstrating how to write streaming data to Iceberg tables using SQL operations with different catalog implementations:

  1. Using AWS Glue Data Catalog
  2. Using Amazon S3 Tables

Both examples stream stock price data to Iceberg tables using Flink SQL, giving users a reference pattern for each integration: Glue Data Catalog and S3 Tables.
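As a rough illustration of how the two catalog variants differ, the sketch below builds the Flink SQL `CREATE CATALOG` statement for each. It is not taken from this PR. The class name `IcebergCatalogDdl`, the catalog names, the bucket, and the ARN are hypothetical placeholders. The `catalog-impl` and `io-impl` values follow the Iceberg AWS and S3 Tables catalog documentation, and the examples in this PR may configure their catalogs differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Builds Flink SQL CREATE CATALOG statements for Iceberg (illustrative names only). */
final class IcebergCatalogDdl {

    /** Renders CREATE CATALOG name WITH ('key' = 'value', ...) from an ordered option map. */
    static String createCatalog(String name, Map<String, String> options) {
        String body = options.entrySet().stream()
                .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
                .collect(Collectors.joining(",\n"));
        return "CREATE CATALOG " + name + " WITH (\n" + body + "\n)";
    }

    /** Iceberg catalog backed by AWS Glue Data Catalog; the warehouse is an S3 path. */
    static String glueCatalog(String name, String warehousePath) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("type", "iceberg");
        options.put("catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog");
        options.put("io-impl", "org.apache.iceberg.aws.s3.S3FileIO");
        options.put("warehouse", warehousePath);
        return createCatalog(name, options);
    }

    /** Iceberg catalog backed by Amazon S3 Tables; the warehouse is a table bucket ARN. */
    static String s3TablesCatalog(String name, String tableBucketArn) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("type", "iceberg");
        options.put("catalog-impl", "software.amazon.s3tables.iceberg.S3TablesCatalog");
        options.put("warehouse", tableBucketArn);
        return createCatalog(name, options);
    }

    public static void main(String[] args) {
        // Placeholder values; in a Flink job each statement would be passed
        // to TableEnvironment.executeSql(...).
        System.out.println(glueCatalog("glue_catalog", "s3://example-bucket/warehouse"));
        System.out.println(s3TablesCatalog("s3_catalog",
                "arn:aws:s3tables:us-east-1:111122223333:bucket/example-table-bucket"));
    }
}
```

The only structural difference between the two statements is the catalog implementation class and what the `warehouse` option points at: an S3 path for Glue, a table bucket ARN for S3 Tables.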

Verifying this change

Local Testing

Tested both examples locally in IntelliJ:

  1. Configured flink-application-properties-dev.json with the appropriate S3 bucket/ARN
  2. Ran the main application class
  3. Verified data generation and table creation in both Glue Data Catalog and S3 Tables
  4. Confirmed that data was written by querying the tables with SQL

Example output from Glue example:

mysql> SELECT * FROM iceberg.stock_data LIMIT 5;
+-------+--------+-----------------------+
| price | ticker | eventtime             |
+-------+--------+-----------------------+
| 45.23 | AAPL   | 2025-02-07 11:00:15.0 |
| 89.67 | AMZN   | 2025-02-07 11:00:15.2 |
| 67.89 | MSFT   | 2025-02-07 11:00:15.4 |
| 34.12 | INTC   | 2025-02-07 11:00:15.6 |
| 78.90 | TBV    | 2025-02-07 11:00:15.8 |
+-------+--------+-----------------------+
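A table with the three columns shown above could be created and fed with statements along the lines of the following sketch. This is an assumption-laden illustration, not the PR's code: the class name `StockTableSql`, the column types (`DOUBLE`, `STRING`, `TIMESTAMP(3)`), the catalog name `glue_catalog`, and the source view `stock_source` are all hypothetical. Only the column names and the `iceberg.stock_data` table name come from the output above.

```java
/** Builds Flink SQL for a stock price sink table (column types are assumed, not from the PR). */
final class StockTableSql {

    /** Quotes catalog, database, and table into a fully qualified Flink SQL identifier. */
    static String qualified(String catalog, String database, String table) {
        return "`" + catalog + "`.`" + database + "`.`" + table + "`";
    }

    /** CREATE TABLE statement with the three columns seen in the sample output. */
    static String createTable(String catalog, String database, String table) {
        return "CREATE TABLE IF NOT EXISTS " + qualified(catalog, database, table) + " (\n"
                + "  price DOUBLE,\n"
                + "  ticker STRING,\n"
                + "  eventtime TIMESTAMP(3)\n"
                + ")";
    }

    /** Continuous INSERT that copies rows from a source table or view into the sink. */
    static String insertFrom(String catalog, String database, String table, String source) {
        return "INSERT INTO " + qualified(catalog, database, table)
                + " SELECT price, ticker, eventtime FROM " + source;
    }

    public static void main(String[] args) {
        // Placeholder names; in a Flink job each statement would be passed
        // to TableEnvironment.executeSql(...).
        System.out.println(createTable("glue_catalog", "iceberg", "stock_data"));
        System.out.println(insertFrom("glue_catalog", "iceberg", "stock_data", "stock_source"));
    }
}
```

Note that Iceberg's Flink sink commits data files when a checkpoint completes, so rows typically become visible to queries like the one above only after checkpointing is enabled and a checkpoint has finished.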

Significant changes

  • Completely new example
  • Updated an existing example to a newer Flink version or newer dependency versions
  • Improved an existing example
  • Modified the runtime configuration of an existing example
  • Modified the expected input or output of an existing example

New features:

  • Flink SQL operations with Iceberg
  • Integration with AWS Glue Data Catalog
  • Integration with Amazon S3 Tables

Dependencies:

  • Flink 1.19.0
  • Iceberg 1.6.1
  • Java 11
  • AWS SDK v2
  • S3 Tables Catalog 0.1.3
