#68: Moving Spatial Files to Cloud Storage Doesn’t Make Them Modern

Adding a transactional metadata layer does. Object storage is cheap but dumb. Multiple writers collide. Silent data loss happens. Iceberg provides the coordination layer that turns S3 into a production spatial database.

The fundamental problem: Raw GeoParquet files on object storage have no transaction semantics. Two analysts writing the same file simultaneously? The last write wins and the other is silently lost. A failed multi-file write leaves a half-updated dataset. A schema change means rewriting every file. This is why organizations still buy expensive spatial databases (monoliths bundling storage and compute) despite cloud storage being 10x cheaper.
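To make the lost update concrete, here is a minimal sketch. The `bucket` dict, `put_object`, and `get_object` are hypothetical stand-ins for an object store, not a real S3 client. The only assumption is that a PUT replaces the whole object unconditionally.

```python
# Toy in-memory stand-in for an object store (hypothetical, not a real S3 API).
bucket = {}

def put_object(key, rows):
    """Unconditional overwrite, like a plain PUT of a whole file."""
    bucket[key] = list(rows)

def get_object(key):
    """Return a private copy of the object's rows."""
    return list(bucket.get(key, []))

put_object("parcels.parquet", ["parcel-1"])

# Both analysts read the same starting state.
analyst_a = get_object("parcels.parquet")
analyst_b = get_object("parcels.parquet")

# Each appends a row locally and writes the whole file back.
analyst_a.append("parcel-2")
analyst_b.append("parcel-3")
put_object("parcels.parquet", analyst_a)
put_object("parcels.parquet", analyst_b)  # silently replaces A's write

final_rows = get_object("parcels.parquet")
print(final_rows)  # ['parcel-1', 'parcel-3']
```

No exception, no warning. Analyst A's parcel is simply gone.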

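Iceberg decouples them. Storage is S3 (cheap, durable, massive). Compute is your choice (Spark, Trino, DuckDB, Snowflake), though read and write support varies by engine. The metadata layer (Iceberg) provides ACID commits with optimistic concurrency: every reader sees a consistent snapshot, and a writer whose commit conflicts is rejected and retried instead of silently overwriting someone else’s work.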
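The coordination idea can be sketched as an atomic compare-and-swap on a single "current snapshot" pointer. This is a toy in-memory model, not Iceberg's actual implementation or API: `ToyCatalog`, `CommitConflict`, and `append_file` are hypothetical names, and a `threading.Lock` stands in for whatever atomic swap a real catalog provides.

```python
import threading

class CommitConflict(Exception):
    """Raised when the table changed after the writer read it."""

class ToyCatalog:
    """Hypothetical catalog: one pointer to the current immutable snapshot."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the catalog's atomic swap
        self.snapshots = [()]          # snapshot 0 is an empty tuple of data files
        self.current = 0

    def read(self):
        with self._lock:
            return self.current, self.snapshots[self.current]

    def commit(self, base_id, new_files):
        """Swap the pointer only if nobody committed since base_id was read."""
        with self._lock:
            if base_id != self.current:
                raise CommitConflict(f"expected {base_id}, found {self.current}")
            self.snapshots.append(tuple(new_files))
            self.current = len(self.snapshots) - 1
            return self.current

def append_file(catalog, file_name, max_retries=5):
    """Optimistic append: re-read the latest snapshot and retry on conflict."""
    for _ in range(max_retries):
        base_id, files = catalog.read()
        try:
            return catalog.commit(base_id, files + (file_name,))
        except CommitConflict:
            continue
    raise CommitConflict("gave up after retries")

catalog = ToyCatalog()

# Two writers start from the same snapshot.
base_a, files_a = catalog.read()
base_b, files_b = catalog.read()

catalog.commit(base_a, files_a + ("a.parquet",))  # writer A wins the race
try:
    catalog.commit(base_b, files_b + ("b.parquet",))  # writer B is now stale
    conflict_detected = False
except CommitConflict:
    conflict_detected = True
    append_file(catalog, "b.parquet")  # retry on top of A's commit

_, final_files = catalog.read()
```

The stale writer gets an explicit conflict instead of a silent overwrite, and after the retry both files are in the table.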

The spatial data silo breaks. Legacy GIS locked data inside database servers (PostGIS, Oracle Spatial), where other tools could reach it only through that database. Iceberg is an open table format. Any engine that supports it can read and write the same table concurrently. Data scientists in Snowflake, analysts in Trino, engineers in Spark: all reading the same spatial table without coordination overhead.

This transforms costs. You stop paying for database compute on static reference data (parcel boundaries, zoning layers). Store as Iceberg on S3. Query with whatever engine is cheapest that day.

Reliability is built in. Time travel, schema evolution, branching, and compaction are all part of the format and its tooling. No homegrown versioning hacks. No disaster recovery nightmares.
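Time travel and schema evolution both fall out of keeping an append-only log of immutable snapshots. The sketch below is a toy model under that assumption, not Iceberg's metadata layout: `Snapshot`, `ToyTable`, the timestamps, and the file names are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: int  # epoch seconds (hypothetical values)
    schema: tuple      # column names visible in this snapshot
    files: tuple       # data files that make up this snapshot

class ToyTable:
    """Hypothetical table: an append-only history of immutable snapshots."""

    def __init__(self):
        self.history = []  # assumed to be appended in commit-time order

    def commit(self, committed_at, schema, files):
        snap = Snapshot(len(self.history), committed_at, tuple(schema), tuple(files))
        self.history.append(snap)
        return snap

    def current(self):
        return self.history[-1]

    def as_of(self, timestamp):
        """Time travel: latest snapshot committed at or before timestamp."""
        candidates = [s for s in self.history if s.committed_at <= timestamp]
        if not candidates:
            raise LookupError("no snapshot at or before that time")
        return candidates[-1]

table = ToyTable()
table.commit(100, ["parcel_id", "geometry"], ["f1.parquet"])
table.commit(200, ["parcel_id", "geometry"], ["f1.parquet", "f2.parquet"])
# Schema evolution as a metadata-only commit: new column, same data files.
table.commit(300, ["parcel_id", "geometry", "zoning_code"],
             ["f1.parquet", "f2.parquet"])

old = table.as_of(150)    # the table as it looked before the second load
latest = table.current()
```

Old snapshots stay readable because nothing is overwritten in place. Adding `zoning_code` produced a new snapshot without touching a single data file.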

The rule: Stop buying expensive spatial database compute just to store static spatial data. If your GIS can’t support concurrent writers without file locks, your architecture is obsolete. Iceberg turns a cloud bucket of geometries into a production-grade spatial warehouse.
