Icestream is an asynchronous table maintenance tool for Iceberg tables that converts equality deletes (write-optimized) into positional deletes (read-optimized). It does so efficiently by keeping an index over the table data in Cassandra and using the Spark Cassandra Connector, which supports distributed, index-backed nested-loop joins from Iceberg data in Spark to the index data in the database.
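
To make the conversion concrete, here is a minimal, single-process sketch of the idea in Python. It is not Icestream's implementation: the in-memory dictionary is a hypothetical stand-in for the Cassandra index, and the names `RowLocation` and `to_positional_deletes` are invented for illustration. It assumes the index maps each equality key to the data file and row position of every row holding that key, and it ignores details such as sequence numbers and partitioning.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class RowLocation(NamedTuple):
    """Where a row lives: the data file and the row's ordinal position in it."""
    file_path: str
    pos: int


# Hypothetical stand-in for the Cassandra index: equality key -> row locations.
Index = Dict[Tuple, List[RowLocation]]


def to_positional_deletes(
    equality_deletes: Iterable[Tuple], index: Index
) -> List[RowLocation]:
    """Resolve each equality-delete key to concrete (file_path, pos) pairs.

    Each key is looked up in the index (an indexed nested-loop join), so the
    data files themselves are never scanned. Keys with no match produce no rows.
    """
    locations = set()
    for key in equality_deletes:
        for loc in index.get(key, []):
            locations.add(loc)
    # Sort by file_path, then pos, the order positional delete files use.
    return sorted(locations)


# Toy index over two data files, keyed by a single-column id.
index: Index = {
    (1,): [RowLocation("data/a.parquet", 0)],
    (2,): [RowLocation("data/a.parquet", 1), RowLocation("data/b.parquet", 0)],
    (3,): [RowLocation("data/b.parquet", 1)],
}

# Equality deletes: "delete rows where id = 2" and "delete rows where id = 4".
positional = to_positional_deletes([(2,), (4,)], index)
for loc in positional:
    print(loc.file_path, loc.pos)
```

In this sketch the cost of the conversion scales with the number of delete keys rather than with the size of the table, which is the property an index-backed join is meant to provide.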