Amazon S3 Tables Now Support Apache Iceberg REST Endpoint

Amazon S3 Tables has introduced a major update by embedding an Apache Iceberg REST endpoint, addressing earlier concerns about vendor lock-in and integration challenges. This enhancement simplifies interoperability with open-source query engines and strengthens AWS’s commitment to open data standards.

Key Improvements:

  • Iceberg REST Catalog Integration: Enables seamless interaction with S3 Tables using standard Iceberg APIs (see the sketch after this list).
  • Better Ecosystem Support: Facilitates integration with tools like PyIceberg, Trino, and Spark.
  • Reduced Vendor Lock-in: Aligns with open-source standards, making data portability easier.

🔗 Docs: Accessing Amazon S3 Tables from Open-Source Query Engines
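
The first bullet above is meant literally: the endpoint implements the standard Iceberg REST catalog specification, so any SigV4-capable HTTP client can call it directly. Below is a minimal Python sketch, not an official AWS sample, assuming the endpoint URL used later in this post, the s3tables signing name, and a placeholder table bucket ARN:

# Minimal sketch: call the Iceberg REST spec's GET /v1/config entry point with a SigV4-signed request.
# Endpoint URL, "s3tables" signing name, and the table bucket ARN are assumptions/placeholders.
from urllib.parse import quote

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

ENDPOINT = "https://s3-tables-iceberg-rest.amazonaws.com"
REGION = "us-east-1"
WAREHOUSE = "arn:aws:s3tables:us-east-1:123456789012:bucket/my-table-bucket"  # placeholder ARN

# /v1/config returns catalog defaults and overrides per the Iceberg REST spec
url = f"{ENDPOINT}/v1/config?warehouse={quote(WAREHOUSE, safe='')}"

request = AWSRequest(method="GET", url=url)
SigV4Auth(boto3.Session().get_credentials(), "s3tables", REGION).add_auth(request)

response = requests.get(request.url, headers=dict(request.headers))
print(response.status_code, response.json())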

You Should Know:

  1. Setting Up Iceberg REST Catalog with AWS S3
    To connect to S3 Tables via Iceberg REST, use the following Python (PyIceberg) example:
from pyiceberg.catalog.rest import RestCatalog

# Connect through the Iceberg REST endpoint; PyIceberg signs requests with SigV4.
# The signing properties below are PyIceberg's REST catalog settings for AWS-signed endpoints.
catalog = RestCatalog(
    name="s3_tables",
    uri="https://s3-tables-iceberg-rest.amazonaws.com",
    **{
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": "us-east-1",
    },
)

# list_tables() needs a namespace, so list the namespaces first.
print(catalog.list_namespaces())
print(catalog.list_tables("my_namespace"))  # replace with a namespace from the listing above

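Once the catalog connects, the same PyIceberg client can read data. A short sketch continuing from the catalog object above (namespace and table names are placeholders, and the scan requires pyarrow):

# Placeholders: substitute a namespace/table from the listings above.
table = catalog.load_table("my_namespace.my_table")
print(table.schema())

arrow_table = table.scan(limit=10).to_arrow()  # needs pyarrow installed
print(arrow_table)
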
2. Querying S3 Tables with Trino

Configure Trino to use the Iceberg REST catalog:

# Catalog properties in etc/catalog/iceberg.properties
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=https://s3-tables-iceberg-rest.amazonaws.com
# AWS authentication (SigV4 signing) is configured with version-specific
# rest-catalog security properties; see the Trino Iceberg connector docs

Then query directly:

SELECT * FROM iceberg.s3_tables.my_table;

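To issue the same query from Python instead of the Trino CLI, the trino client library can target the catalog configured above. A sketch with placeholder coordinator host, user, and table names:

import trino

# Placeholders: adjust host, user, schema, and table to your deployment.
conn = trino.dbapi.connect(
    host="trino.example.com",
    port=443,
    user="data_engineer",
    http_scheme="https",
    catalog="iceberg",
    schema="s3_tables",
)

cur = conn.cursor()
cur.execute("SELECT * FROM my_table LIMIT 10")
for row in cur.fetchall():
    print(row)
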
3. Managing S3 Tables via AWS CLI

Inspect buckets and Iceberg metadata from the command line:

aws s3api list-buckets   # List available general-purpose buckets
aws s3api get-object --bucket my-bucket --key metadata/metadata.json metadata.json   # Download Iceberg metadata to metadata.json

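Table buckets themselves have a dedicated control-plane API. A boto3 sketch, assuming the s3tables client available in recent SDK versions, that lists table buckets in a region:

import boto3

# Sketch: the "s3tables" client covers the S3 Tables control plane
# (table buckets, namespaces, tables). Field names follow the S3 Tables API reference.
s3tables = boto3.client("s3tables", region_name="us-east-1")

response = s3tables.list_table_buckets()
for bucket in response.get("tableBuckets", []):
    print(bucket["name"], bucket["arn"])
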
4. Spark Integration

Use Spark to read/write Iceberg tables in S3:

// Register the REST catalog once, e.g. in spark-defaults.conf or on the SparkSession builder:
//   spark.sql.catalog.s3_tables=org.apache.iceberg.spark.SparkCatalog
//   spark.sql.catalog.s3_tables.type=rest
//   spark.sql.catalog.s3_tables.uri=https://s3-tables-iceberg-rest.amazonaws.com

// Then read through the catalog (namespace and table names are placeholders)
val df = spark.table("s3_tables.my_namespace.my_table")
df.show()

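Writes go through the same catalog. A PySpark sketch using the DataFrameWriterV2 API, assuming the iceberg-spark-runtime package is on the classpath and reusing the placeholder names above:

from pyspark.sql import SparkSession

# Assumes the catalog settings shown above and the iceberg-spark-runtime package.
spark = (
    SparkSession.builder
    .appName("s3-tables-write")
    .config("spark.sql.catalog.s3_tables", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.s3_tables.type", "rest")
    .config("spark.sql.catalog.s3_tables.uri", "https://s3-tables-iceberg-rest.amazonaws.com")
    .getOrCreate()
)

df = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "label"])
df.writeTo("s3_tables.my_namespace.my_table").append()  # use createOrReplace() if the table does not exist yet
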
What Undercode Say:

The addition of Iceberg REST support in S3 Tables is a game-changer for data engineers. It bridges the gap between AWS services and open-source tools, reducing friction in data workflows. However, challenges remain:
– Hidden File Management: S3 Tables still obscure underlying files, complicating debugging.
– Permission Handling: Fine-grained access control requires deeper AWS IAM integration (see the policy-simulation sketch below).

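One practical way to verify fine-grained access before wiring a principal into a query engine is IAM policy simulation. A boto3 sketch with a placeholder principal ARN and assumed s3tables action names:

import boto3

# Placeholder principal ARN; the s3tables:* action names are assumptions used to illustrate the check.
iam = boto3.client("iam")

result = iam.simulate_principal_policy(
    PolicySourceArn="arn:aws:iam::123456789012:user/data_engineer",
    ActionNames=["s3tables:GetTable", "s3tables:ListTables"],
)

for evaluation in result["EvaluationResults"]:
    print(evaluation["EvalActionName"], "->", evaluation["EvalDecision"])
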
For Linux/IT practitioners, mastering these commands ensures smooth operations:

# Check AWS permissions
aws iam get-user --user-name data_engineer

# Inspect table metadata, including snapshots, via the REST API
# (the Iceberg REST spec returns snapshots in the load-table response; the request must be SigV4-signed)
curl -X GET "https://s3-tables-iceberg-rest.amazonaws.com/v1/namespaces/{namespace}/tables/{table}"

# Monitor S3 access logs
aws s3api get-bucket-logging --bucket my-data-lake

Expected Output:

A unified, open-standard approach to managing S3 Tables, fostering interoperability across data platforms. 🚀

