Amazon SageMaker's lakehouse architecture now automates the optimization of Apache Iceberg tables stored on Amazon S3 through catalog-level configuration. Data lake administrators can enable automated table optimizations, such as compaction and orphan file deletion, across all Iceberg tables with a single setting rather than configuring each table individually, which simplifies maintenance and improves performance and cost efficiency.
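To make "orphan file deletion" concrete, here is a minimal conceptual sketch in plain Python. It is not the SageMaker or AWS Glue implementation, and it calls no AWS API. The function name `find_orphan_files`, the input shapes, and the three-day `min_age` grace period are all assumptions chosen for illustration: an orphan is treated as a file that exists in storage, is not referenced by table metadata, and is old enough that it is unlikely to belong to a write still in progress.

```python
from datetime import datetime, timedelta, timezone


def find_orphan_files(stored_files, referenced_paths, now, min_age=timedelta(days=3)):
    """Return, sorted, the paths that look like orphans.

    stored_files:     dict mapping file path -> last-modified datetime (hypothetical shape)
    referenced_paths: set of paths still referenced by table metadata
    now:              the reference time used for the age check
    min_age:          assumed grace period protecting files from in-flight writes
    """
    cutoff = now - min_age
    return sorted(
        path
        for path, modified in stored_files.items()
        if path not in referenced_paths and modified < cutoff
    )


if __name__ == "__main__":
    now = datetime(2025, 1, 10, tzinfo=timezone.utc)
    stored = {
        "data/a.parquet": now - timedelta(days=10),  # still referenced
        "data/b.parquet": now - timedelta(days=10),  # unreferenced and old
        "data/c.parquet": now - timedelta(hours=1),  # unreferenced but recent
    }
    print(find_orphan_files(stored, {"data/a.parquet"}, now))
```

The grace period matters because a writer may upload data files before committing the metadata that references them; removing such files too early would break that commit.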
Amazon S3 Tables provides storage optimized for tabular data in Apache Iceberg format, but it offers only a limited set of data ingestion options. For cost-efficient event ingestion, Amazon Data Firehose is recommended over Athena or EMR Spark clusters, and several architectural implementations are available for handling web analytics events.
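As a rough sketch of the producer side of such a pipeline, the Python below batches web analytics events into calls to Firehose's `PutRecordBatch` operation, which accepts at most 500 records per call. The helper names `to_record` and `send_events`, the stream name, and the event fields are hypothetical. The client is passed in as a parameter so the batching logic can be exercised without AWS credentials; with boto3 installed, `boto3.client("firehose")` would be the object to pass. Configuring the Firehose stream to deliver into an S3 table is not shown here.

```python
import json

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_record(event):
    """Serialize one event dict into the record shape PutRecordBatch expects."""
    return {"Data": json.dumps(event, separators=(",", ":")).encode("utf-8")}


def send_events(client, stream_name, events):
    """Send events in batches of up to MAX_BATCH; return the total failed count.

    client:      any object exposing put_record_batch(DeliveryStreamName=..., Records=...)
    stream_name: name of the Firehose stream (hypothetical in this sketch)
    events:      list of JSON-serializable dicts
    """
    failed = 0
    for start in range(0, len(events), MAX_BATCH):
        batch = [to_record(e) for e in events[start:start + MAX_BATCH]]
        response = client.put_record_batch(
            DeliveryStreamName=stream_name,
            Records=batch,
        )
        # FailedPutCount reports how many records in this batch were rejected.
        failed += response.get("FailedPutCount", 0)
    return failed
```

A production producer would also retry the individual records that the response marks as failed and respect the per-call payload size limit; both are omitted to keep the sketch short.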