#67: Updating a Single Geometry Shouldn’t Require Rewriting a 500MB File

Unless your read performance depends on it. This is the core trade-off between Copy-on-Write and Merge-on-Read—deciding whether you pay compute during ingestion or during queries.

Copy-on-Write (CoW) rewrites entire files for updates. Change one geometry in a 500MB Parquet file? Iceberg reads the file, applies the change, and writes a new 500MB file. The new snapshot stops referencing the old file, which is physically removed later when old snapshots are expired. The next query reads the clean, optimized file immediately. Perfect for analytical queries. Terrible for streaming ingestion, where every update becomes massive I/O.

Merge-on-Read (MoR) writes delta and delete files. Update a geometry? Write a tiny delta file containing just that change (in Iceberg, a small data file with the new row plus a delete file that masks the old one). Delete a feature? Write a delete marker. The base file stays untouched. Query engines reconcile the base file with deltas on the fly. Ingestion is fast. Queries pay the reconciliation cost.
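To put rough numbers on the write side, here is a minimal back-of-the-envelope sketch in Python. It is not Iceberg code. The names `cow_update_cost`, `mor_update_cost`, and `write_amplification` are hypothetical, the 2KB row size and 64-byte delete marker are made-up placeholders, and it assumes a rewritten CoW file is about the same size as the original.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class WriteCost:
    strategy: str
    bytes_written: int


def cow_update_cost(base_file_bytes: int) -> WriteCost:
    """Copy-on-Write: the whole file is rewritten, however small the change.

    Assumption: the rewritten file is about the same size as the original.
    """
    return WriteCost("copy-on-write", base_file_bytes)


def mor_update_cost(changed_row_bytes: int, delete_marker_bytes: int = 64) -> WriteCost:
    """Merge-on-Read: only the changed row and a delete marker are written.

    delete_marker_bytes is an illustrative placeholder, not a measured value.
    """
    return WriteCost("merge-on-read", changed_row_bytes + delete_marker_bytes)


def write_amplification(cost: WriteCost, changed_row_bytes: int) -> float:
    """Bytes written per byte of data that actually changed."""
    return cost.bytes_written / changed_row_bytes


# One 2KB geometry changes inside a 500MB file.
cow = cow_update_cost(500 * MB)
mor = mor_update_cost(changed_row_bytes=2048)

print(cow.strategy, write_amplification(cow, 2048))
print(mor.strategy, write_amplification(mor, 2048))
```

Under these assumptions, CoW writes hundreds of thousands of bytes for every byte that changed, while MoR writes roughly what changed. The read side is where MoR pays that back.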

The hidden cost of MoR: query engines must stitch files together mid-query. If deltas accumulate without compaction, read performance collapses. A thousand small delta files can mean a thousand extra file reads per query, on top of the base file.
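The reconciliation step can be sketched as a toy reader. This is an in-memory illustration only, not how any real engine is implemented: `Delta` and `read_merged` are hypothetical names, geometries are plain WKT strings keyed by feature id, and each delta counts as one extra file the reader has to open.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Delta:
    """One small file written by an update or delete (hypothetical layout)."""
    upserts: Dict[int, str] = field(default_factory=dict)  # feature id -> new WKT
    deletes: Set[int] = field(default_factory=set)         # feature ids to drop


def read_merged(base: Dict[int, str], deltas: List[Delta]) -> Tuple[Dict[int, str], int]:
    """Apply deltas to the base rows in commit order.

    Returns the reconciled rows and the number of files opened.
    """
    merged = dict(base)   # the base file itself is never modified
    files_opened = 1      # the base file
    for delta in deltas:
        files_opened += 1  # every outstanding delta is one more read
        for feature_id in delta.deletes:
            merged.pop(feature_id, None)
        merged.update(delta.upserts)
    return merged, files_opened


base = {1: "POINT (0 0)", 2: "POINT (1 1)", 3: "POINT (2 2)"}
deltas = [
    Delta(upserts={2: "POINT (5 5)"}),  # feature 2 moved
    Delta(deletes={3}),                 # feature 3 removed
]

rows, files_opened = read_merged(base, deltas)
print(rows, files_opened)
```

The work done by `read_merged` grows with the number of deltas, and it is repeated on every query until something folds those deltas back into the base file.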

Choose based on velocity and SLAs. Static reference data (parcel boundaries updated monthly)? Use CoW. Ingestion is rare; queries are frequent. Vehicle telematics (position updates every second)? Use MoR. Write speed is critical; compaction jobs run off-peak.
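In Iceberg, this choice is made per table through the `write.delete.mode`, `write.update.mode`, and `write.merge.mode` table properties, each set to `copy-on-write` or `merge-on-read`. Here is a small sketch that builds the Spark SQL statement for either mode. The helper `write_mode_sql` and the table names `gis.parcels` and `gis.vehicle_positions` are hypothetical, and you should check your engine's documentation for which properties it honors (merge-on-read row-level deletes need format version 2 or later).

```python
# Iceberg table properties that control row-level write behavior.
ROW_LEVEL_PROPERTIES = ("write.delete.mode", "write.update.mode", "write.merge.mode")
VALID_MODES = ("copy-on-write", "merge-on-read")


def write_mode_sql(table: str, mode: str) -> str:
    """Build an ALTER TABLE statement that sets all three write modes.

    `table` is inserted as-is, so only pass trusted identifiers.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    props = ", ".join(f"'{name}'='{mode}'" for name in ROW_LEVEL_PROPERTIES)
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({props})"


# Parcel boundaries: updated monthly, queried constantly.
print(write_mode_sql("gis.parcels", "copy-on-write"))

# Vehicle telematics: updated every second.
print(write_mode_sql("gis.vehicle_positions", "merge-on-read"))
```

Setting all three properties together keeps deletes, updates, and merges on the same strategy. Mixing them per operation is possible, but it is a design decision worth making deliberately.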

Compaction bridges both. MoR tables compacted frequently behave like CoW tables—optimized data, fast reads. But you control the compaction schedule, not the ingestion frequency.
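As a toy illustration of what a compaction job does, the sketch below folds outstanding deltas into a new base and clears the delta list. Everything here is hypothetical: `compact`, `needs_compaction`, and the threshold of 100 delta files are illustrative choices, not recommendations. In Iceberg the real work is typically handled by maintenance jobs such as the `rewrite_data_files` Spark procedure.

```python
from typing import Dict, List, Set, Tuple

# A delta is (upserts, deletes): feature id -> new WKT, and ids to drop.
Delta = Tuple[Dict[int, str], Set[int]]


def needs_compaction(delta_count: int, max_deltas: int = 100) -> bool:
    """Trigger rule: compact once too many deltas are outstanding.

    max_deltas is an arbitrary placeholder; tune it against your read SLA.
    """
    return delta_count >= max_deltas


def compact(base: Dict[int, str], deltas: List[Delta]) -> Tuple[Dict[int, str], List[Delta]]:
    """Fold all deltas into a new base, in commit order.

    Returns the new base and an empty delta list; the inputs are not modified.
    """
    new_base = dict(base)
    for upserts, deletes in deltas:
        for feature_id in deletes:
            new_base.pop(feature_id, None)
        new_base.update(upserts)
    return new_base, []


base = {1: "POINT (0 0)", 2: "POINT (1 1)", 3: "POINT (2 2)"}
deltas: List[Delta] = [({2: "POINT (5 5)"}, set()), ({}, {3})]

if needs_compaction(len(deltas), max_deltas=2):
    base, deltas = compact(base, deltas)

print(base, len(deltas))
```

After compaction, a reader opens one file again. The cost moved from every query to one scheduled job, which is the control the schedule gives you.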

The rule: Use CoW when you query often and update rarely. Use MoR when you update constantly. If read times degrade on MoR, your compaction strategy is failing.
