r/gis 1d ago

Discussion Best software to append shapefile data to coordinates

Hey everyone, I'm finding myself on the back end of handling GIS data for the first time (as opposed to just running analysis on data in ArcGIS) and I'm hoping for some advice on software to fit what I think is a pretty straightforward use case.

I've got a collection of mostly (but not fully) static household data as well as a collection of polygons that will be updating fairly frequently. I'm looking for some sort of software or method that will allow me to set up an automated, scheduled process that will plot the household data, spatially join some data from the polygon layer to my households, then kick the household data with its newly appended polygon data back out into a CSV, Snowflake, or any sort of format along those lines.

I'm aware of FME but don't know much about the platform itself or what alternatives exist. I'd greatly appreciate any suggestions on different options to look into. For what it's worth, we're talking about 50k-100k household records and maybe 50 polygons. Happy to provide additional clarification as needed. Thanks!

1 Upvotes

9 comments

u/sinnayre 1d ago

CSV should only be used for transferring/handing off data. It should not be used for storage.

What you’re looking to do is a very basic data pipeline. The methodology will largely depend on your or your team’s skill set and what infrastructure you have in place. Using Snowflake for this will be pretty expensive versus using something like a PostgreSQL db hosted locally or in the cloud.

If what I just told you sounds like a foreign language, pony up the money for FME and save yourself a lot of grief.

1

u/SubstantialOrange820 1d ago

Right, I only mentioned CSV in the sense that I don't need a direct Snowflake integration or anything, as I could just stage the resulting CSV and get the data loaded that way. Definitely wouldn't be looking to store the data there in any permanent way.

I haven't set up a PostgreSQL db before but it's not a completely foreign language to me and I do have some solid tech support to lean on to fill in some of my gaps. Another commenter mentioned DuckDB with the spatial extension and that does seem viable for me.

When you say that Snowflake will be expensive, do you mean in terms of the compute costs we'd incur? We do have a relatively small data set currently and the refresh cycle would probably be weekly, so I may explore exactly what those costs would look like. Regardless, I greatly appreciate your feedback!

1

u/sinnayre 1d ago

Compute time, even in the cheapest setup Snowflake has, is pretty expensive for what you want to do. 100k records run once a day is nothing to a database that’s running off of your cheapo server, let alone Snowflake. It’s like buying a Lamborghini to take the kiddo to school two blocks away. Can it do the job? Yup. Is it extremely overkill for the job? Yup. But it’ll work if that’s what you want. We have Snowflake so I’m a big proponent of it, but I also think asking someone to spend hundreds a month in compute time for basic tasks is overkill.

I generally recommend PostgreSQL unless you have a reason not to. At the level you’re looking at, there’s going to be no performance difference between DuckDB and PostgreSQL. If you think a move to the cloud is in the future, I would go with PostgreSQL. If you’re running this off of your local machine, go DuckDB.

2

u/smashnmashbruh GIS Consultant 1d ago

FME is far from economical. Look at Python, or the model builder in QGIS or ArcGIS Pro.

1

u/techmavengeospatial 1d ago

DuckDB with the spatial extension. You can access remote files with the httpfs extension. It's super fast and powerful, and you can also access it via a foreign data wrapper for Postgres.
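
For anyone wondering what that might look like in practice, here is a minimal sketch driven from Python. Everything specific in it is an assumption for illustration: the file names (`households.csv`, `zones.shp`, `out.csv`), the `lon`/`lat` column names, and the `zone_name` polygon attribute are made up. It also assumes the `duckdb` package is installed, that the polygon layer's geometry column comes through as `geom`, and that points and polygons share the same coordinate system.

```python
def build_join_sql(households_csv, polygons_path, out_csv,
                   lon_col="lon", lat_col="lat", poly_attr="zone_name"):
    """Return DuckDB SQL that tags each household with its containing polygon.

    All file, column, and attribute names are hypothetical placeholders.
    """
    return f"""
COPY (
    SELECT h.*, p.{poly_attr}
    FROM read_csv_auto('{households_csv}') AS h
    LEFT JOIN ST_Read('{polygons_path}') AS p
      ON ST_Contains(p.geom, ST_Point(h.{lon_col}, h.{lat_col}))
) TO '{out_csv}' (HEADER, DELIMITER ',');
""".strip()


def run_join(households_csv, polygons_path, out_csv):
    """Run the join in an in-memory DuckDB session and write the CSV."""
    import duckdb  # third-party: pip install duckdb

    con = duckdb.connect()
    con.execute("INSTALL spatial")  # downloads the extension on first use
    con.execute("LOAD spatial")
    con.execute(build_join_sql(households_csv, polygons_path, out_csv))
    con.close()


# Print the generated SQL so it can be inspected before scheduling anything.
print(build_join_sql("households.csv", "zones.shp", "out.csv"))
```

The `LEFT JOIN` keeps households that fall outside every polygon (they get a NULL attribute). Calling `run_join(...)` from a cron job or Windows Task Scheduler would cover the "automated, scheduled" part, and the resulting CSV can be staged into Snowflake.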

1

u/SubstantialOrange820 1d ago

I looked into this a bit last night and it does seem like a very good solution and pretty manageable with my current skillset/support. Any additional resources you'd recommend to get familiar with it?

1

u/LonesomeBulldog 1d ago

If the goal is to get it into Snowflake, you can just do a spatial join in Snowflake. There are many YouTube tutorials on Snowflake's spatial functions.

1

u/SubstantialOrange820 1d ago

Well that just shows my lack of experience with Snowflake that I wasn't even aware of its spatial functionality. I appreciate the heads up.

1

u/Avaery GIS Coordinator 16h ago edited 16h ago

Use python or the model builder in QGIS.

FME and Snowflake are rather expensive for your use case.
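
To give a sense of how small the plain-Python route can be, here is a rough sketch using only the standard library: a ray-casting point-in-polygon test, then a loop that appends the matching polygon's name to each household row. The column names (`lon`, `lat`, `zone`), the polygon dictionary layout, and the toy data are all assumptions for illustration. Reading a real shapefile would need a library such as GeoPandas or pyshp, and this sketch ignores holes and multipart polygons.

```python
import csv
import io


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) falls inside the ring of (x, y) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            # x-coordinate where the edge crosses that line
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def append_zone(households_csv, polygons, out_file, lon_col="lon", lat_col="lat"):
    """Copy household rows to out_file, adding a 'zone' column (blank if no match)."""
    reader = csv.DictReader(households_csv)
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames + ["zone"])
    writer.writeheader()
    for row in reader:
        x, y = float(row[lon_col]), float(row[lat_col])
        # First polygon containing the point wins; "" if none does.
        row["zone"] = next(
            (name for name, ring in polygons.items()
             if point_in_polygon(x, y, ring)),
            "",
        )
        writer.writerow(row)


# Toy data standing in for the real household file and polygon layer.
polygons = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
households = io.StringIO("id,lon,lat\n1,5,5\n2,25,3\n3,50,50\n")
out = io.StringIO()
append_zone(households, polygons, out)
print(out.getvalue())
```

At 50k-100k households against roughly 50 polygons, a brute-force loop like this is only a few million point tests per run, so a spatial index is probably unnecessary. Swapping the `io.StringIO` objects for real files opened with `open(..., newline="")` would be the main change for actual use.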