Introducing Geomancer: an open-source library for geospatial feature engineering
Here at Thinking Machines, we work with a lot of geospatial data: we’ve identified gaps in OpenStreetMap (OSM), provided geospatial analytics for our clients, and harnessed machine learning to estimate poverty from satellite imagery. However, we realized that we were spending too much time on repetitive feature engineering tasks. So, to operate on geospatial data at scale, we decided to automate our execution and delivery workflows.
Enter Geomancer, our open-source library for geospatial feature engineering! It combines geospatial data, such as OSM, with a data warehouse like BigQuery. We use it to create, share, and iterate on features for our downstream machine learning tasks. This tool allows us to:
- Engineer features at scale. Using Geomancer’s feature primitives, called Spells, we can easily iterate on feature creation tasks. If we want to get a point’s distance to the nearest university, or the number of supermarkets within a certain range, we can do so with a single function call.
- Compile features across multiple data warehouses. Geomancer primarily supports BigQuery since that’s the data warehouse we use most often. However, when we want to query a PostGIS server or a local SQLite database instead, we can easily do so because Geomancer is not tied to a single data warehouse. We’re looking forward to supporting more databases in the future!
- Save, reuse, and share features. Geomancer also features Spellbooks, a great way to store features from our experiments. If we find a good set of features for a certain task, we can simply compile the Spells into a Spellbook, add some metadata, and export it as a JSON file. We can then share these files with others so that they can apply them to their own datasets.
Let’s see Geomancer in action! Given a set of points, we can create a feature that gets each point’s distance to the nearest supermarket within a 10-km radius.
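To make the feature itself concrete, here is a minimal sketch in plain Python of what a distance-to-nearest feature (and its sibling, a count-within-range feature) computes. It does not use Geomancer’s API or a data warehouse: the function names, the haversine distance, and the sample coordinates are all assumptions chosen for illustration, and the real library pushes this work down to the database.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest(point, pois, within_km=10.0):
    """Distance from `point` to the closest POI within `within_km`, or None if none is in range."""
    in_range = [d for d in (haversine_km(*point, *poi) for poi in pois) if d <= within_km]
    return min(in_range) if in_range else None


def count_within(point, pois, within_km=10.0):
    """Number of POIs that lie within `within_km` of `point`."""
    return sum(1 for poi in pois if haversine_km(*point, *poi) <= within_km)


# Hypothetical sample data: (latitude, longitude) pairs, not real OSM records.
point = (14.5995, 120.9842)
supermarkets = [(14.6095, 120.9842), (14.5995, 121.0342), (15.5, 121.5)]

nearest_km = distance_to_nearest(point, supermarkets, within_km=10.0)
n_nearby = count_within(point, supermarkets, within_km=10.0)
print(nearest_km, n_nearby)
```

Returning `None` when nothing falls inside the radius is a design choice of this sketch; it keeps "no supermarket nearby" distinguishable from a distance of zero.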
Geomancer’s Core API is powered by a SQLAlchemy backend that handles the translation of a Spell into a SQL dialect. This makes the library highly extensible, allowing you to add new feature primitives and database backends for your specific use case.
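The dialect translation mentioned above is a general SQLAlchemy capability, and a small sketch can show the idea. The example below is not Geomancer code: the `pois` table and its columns are hypothetical, and PostgreSQL and SQLite stand in as targets because their dialects ship with SQLAlchemy. It builds one backend-agnostic query and compiles it for each dialect.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql, sqlite

metadata = MetaData()

# Hypothetical points-of-interest table; names are illustrative only.
pois = Table(
    "pois",
    metadata,
    Column("osm_id", Integer, primary_key=True),
    Column("fclass", String),
)

# One query expression, written once without reference to any backend...
stmt = pois.select().where(pois.c.fclass == "supermarket")

# ...compiled into two different SQL dialects.
postgres_sql = str(stmt.compile(dialect=postgresql.dialect()))
sqlite_sql = str(stmt.compile(dialect=sqlite.dialect()))

print(postgres_sql)
print(sqlite_sql)
```

The two printed statements differ in details such as the bound-parameter placeholder, while the Python expression stays the same. A library built on this layer can presumably support a new backend wherever a SQLAlchemy dialect exists for it.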
We hope that Geomancer can help you scale your geospatial feature engineering! You can get started by reading through our getting started demo and setup guide, and you can find more details in the documentation. Lastly, contributions are welcome! Simply file an issue or submit a pull request through GitHub.
Are you interested in using machine learning and geospatial data to help you and your organization make better and more informed decisions? Get in touch with us at [email protected] to learn more!