#71: Treating GeoParquet as a Universal Database Replacement Cripples Transactional Workflows

GeoParquet is brilliant for analytics. It’s terrible for real-time applications. Using it for high-frequency writes or single-feature lookups will inflate your compute bill and destroy performance.

Columnar storage is designed for scans, not seeks. Parquet excels at “give me all elevation values across 1 million features.” It performs poorly at “give me all attributes for feature ID 12345.” A columnar layout stores each column separately, so reconstructing a single complete row means finding the right row group, then reading and decompressing a page from every column and reassembling the values. The work grows with the number of columns, and each decompressed page holds thousands of values you throw away. A row-store keeps the entire feature contiguous and can usually reach it through an index in a handful of page reads.
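To make the scan-versus-seek difference concrete, here is a minimal sketch in plain Python. It is not how Parquet is implemented; the three features, the column names, and the helper functions `scan_column` and `fetch_row` are all hypothetical. It only counts how many separate column arrays each access pattern has to touch.

```python
from typing import Any

# Three hypothetical features; names and values are illustrative only.
features = [
    {"id": 1, "name": "a", "elevation": 10.0, "geometry": "POINT(0 0)"},
    {"id": 2, "name": "b", "elevation": 20.0, "geometry": "POINT(1 1)"},
    {"id": 3, "name": "c", "elevation": 30.0, "geometry": "POINT(2 2)"},
]

# Columnar layout: one separate array per column.
column_store: dict[str, list[Any]] = {
    key: [feature[key] for feature in features] for key in features[0]
}


def scan_column(store: dict[str, list[Any]], column: str) -> tuple[list[Any], int]:
    """Analytical scan: touches exactly one column array."""
    return store[column], 1


def fetch_row(store: dict[str, list[Any]], position: int) -> tuple[dict[str, Any], int]:
    """Single-feature lookup: touches every column array once."""
    row = {name: values[position] for name, values in store.items()}
    return row, len(store)


elevations, arrays_for_scan = scan_column(column_store, "elevation")
feature, arrays_for_lookup = fetch_row(column_store, 1)

print(f"scan touched {arrays_for_scan} array, lookup touched {arrays_for_lookup}")
```

With four columns the lookup touches four arrays; with a 200-column feature table it would touch 200, while the scan still touches one.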

Write amplification from streaming kills performance. Take vehicle tracking data: each vehicle reports one GPS point every 5 seconds. You batch the points into small Parquet files, perhaps 1,000 files per day, and compaction jobs then merge them into larger ones. At fleet scale that can mean rewriting terabytes of data daily just to consolidate small files, and that costs a fortune. With a row-store (PostGIS), you insert the point and continue. No file rewrites. No compaction bill.
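The arithmetic behind that cost is easy to sketch. The 1,000 files per day comes from the scenario above; the file size, the compaction frequency, and the 50 GiB rewritten per compaction pass are assumptions picked purely for illustration, and `daily_write_amplification` is a hypothetical helper, not part of any library.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def daily_write_amplification(
    files_per_day: int,
    bytes_per_file: int,
    compactions_per_day: int,
    bytes_rewritten_per_compaction: int,
) -> tuple[int, int, float]:
    """Return (bytes ingested, bytes physically written, amplification factor)."""
    ingested = files_per_day * bytes_per_file
    rewritten = compactions_per_day * bytes_rewritten_per_compaction
    written = ingested + rewritten
    return ingested, written, written / ingested


ingested, written, factor = daily_write_amplification(
    files_per_day=1_000,                      # from the scenario above
    bytes_per_file=1 * MIB,                   # assumed small-file size
    compactions_per_day=1,                    # assumed compaction schedule
    bytes_rewritten_per_compaction=50 * GIB,  # assumed partition size
)

print(f"ingested {ingested / GIB:.2f} GiB, wrote {written / GIB:.2f} GiB, "
      f"amplification x{factor:.1f}")
```

Swap in your own numbers. The point is that the bytes written scale with the size of the data being compacted, not with the size of the data that arrived.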

The immutability trap. A Parquet file is immutable once written; a dataset grows only by adding new files. Update a feature’s attributes? Rewrite the entire file that contains it. Delete a corrupted geometry? Rewrite. High-frequency updates destroy the economics.
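A rough sketch of what an update against an immutable file costs. A JSON-lines file stands in for a Parquet file here so the example runs with the standard library alone; the record schema, file names, and both helper functions are hypothetical. The pattern is the same either way: read everything, patch one record, write a whole new file.

```python
import json
import os
import tempfile


def write_immutable(path: str, records: list[dict]) -> int:
    """Write a whole file in one pass; it is never modified afterwards."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return os.path.getsize(path)


def update_by_rewrite(src: str, dst: str, feature_id: int, changes: dict) -> int:
    """Copy-on-write update: read every record, patch one, write a new file."""
    with open(src, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh]
    patched = [{**r, **changes} if r["id"] == feature_id else r for r in records]
    return write_immutable(dst, patched)


# 10,000 hypothetical features.
records = [{"id": i, "speed_kmh": 50, "wkt": f"POINT({i} {i})"} for i in range(10_000)]

with tempfile.TemporaryDirectory() as tmp:
    v1 = os.path.join(tmp, "part-v1.jsonl")
    v2 = os.path.join(tmp, "part-v2.jsonl")
    original_bytes = write_immutable(v1, records)
    rewritten_bytes = update_by_rewrite(v1, v2, feature_id=1234, changes={"speed_kmh": 65})
    with open(v2, encoding="utf-8") as fh:
        updated = [json.loads(line) for line in fh][1234]

# Size of the one record that actually changed, including its newline.
changed_record_bytes = len(json.dumps(updated)) + 1

print(f"changed {changed_record_bytes} bytes, rewrote {rewritten_bytes} bytes")
```

One changed field, one full file rewritten. A row-store would touch the page holding that record and its write-ahead log entry.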

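Separate your architecture. A transactional row-store such as PostGIS handles live spatial queries: dashboards, vehicle tracking, web applications. DuckDB can serve low-concurrency interactive queries, but it is itself columnar and allows only one writing process at a time, so keep it off the high-frequency write path. Parquet handles batch analytics: historical trends, bulk reporting, machine learning feature engineering. Buffer streaming data in the row-store, then periodically dump it to Parquet for long-term storage.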
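The buffer-then-dump pattern can be sketched in a few lines. This is an illustration under loud assumptions: the standard library's `sqlite3` stands in for the row-store, the `gps_buffer` table and both functions are hypothetical, and the flush returns a dict of column lists where a real pipeline would hand the batch to a Parquet writer.

```python
import sqlite3

# In-memory SQLite as a stand-in for the transactional row-store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gps_buffer (vehicle_id TEXT, ts INTEGER, lon REAL, lat REAL)"
)


def insert_point(vehicle_id: str, ts: int, lon: float, lat: float) -> None:
    """Transactional path: one cheap row insert per GPS fix."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO gps_buffer VALUES (?, ?, ?, ?)", (vehicle_id, ts, lon, lat)
        )


def flush_before(cutoff_ts: int) -> dict[str, list]:
    """Batch path: drain rows older than the cutoff into one columnar batch."""
    with conn:  # select and delete in the same transaction
        rows = conn.execute(
            "SELECT vehicle_id, ts, lon, lat FROM gps_buffer "
            "WHERE ts < ? ORDER BY ts",
            (cutoff_ts,),
        ).fetchall()
        conn.execute("DELETE FROM gps_buffer WHERE ts < ?", (cutoff_ts,))
    names = ("vehicle_id", "ts", "lon", "lat")
    # Pivot rows into columns, ready to be written as a single large file.
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


# One point every 5 seconds, as in the vehicle tracking scenario.
for i in range(6):
    insert_point("truck-1", ts=i * 5, lon=-93.0 + i * 0.001, lat=45.0)

batch = flush_before(cutoff_ts=20)
remaining = conn.execute("SELECT COUNT(*) FROM gps_buffer").fetchone()[0]

print(f"flushed {len(batch['ts'])} points, {remaining} still buffered")
```

The row-store absorbs the small writes; the columnar side only ever sees one large, already-consolidated batch, so there is nothing left to compact.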

The rule: Parquet scans forests; row-stores examine trees. Never force columnar files to do a transactional database’s job. Continuous streaming writes into Parquet guarantee recurring compute costs from constant file rewrites.
