
Introducing Geomancer: an open-source library for geospatial feature engineering

April 16, 2019

Here at Thinking Machines, we work with a lot of geospatial data: we’ve identified gaps in OpenStreetMap (OSM), provided geospatial analytics for our clients, and harnessed machine learning to estimate poverty from satellite imagery. However, we realized that we were spending too much time on repetitive feature engineering tasks. So, to operate on geospatial data at scale, we decided to automate our execution and delivery workflows.

Enter Geomancer, our open-source library for geospatial feature engineering! It combines geospatial data such as OpenStreetMap (OSM) with a data warehouse like BigQuery. We use it to create, share, and iterate on features for our downstream machine learning tasks. This tool allows us to:

Let’s see Geomancer in action! Given a set of points, we can create a feature that gets the distance to the nearest supermarket within a 10-km radius:
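To make the idea concrete without relying on Geomancer's actual API, here is a minimal pure-Python sketch of what such a feature computes. Everything in it is an assumption for illustration: the function names `haversine_km` and `distance_to_nearest`, the use of the haversine great-circle formula, the made-up coordinates, and the choice to return `None` when nothing falls inside the radius. Geomancer itself pushes this work down to the data warehouse rather than looping in Python.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest(point, pois, within_km=10.0):
    """Distance (km) from `point` to the closest POI inside the radius.

    Returns None if no POI lies within `within_km` (an assumption of
    this sketch, not documented Geomancer behaviour).
    """
    lat, lon = point
    in_range = [
        d for d in (haversine_km(lat, lon, p_lat, p_lon) for p_lat, p_lon in pois)
        if d <= within_km
    ]
    return min(in_range) if in_range else None


# Illustrative coordinates only: not real supermarket locations.
supermarkets = [(14.60, 121.00), (14.70, 121.10)]
sample_points = [(14.61, 121.01), (15.50, 120.00)]

for pt in sample_points:
    print(pt, distance_to_nearest(pt, supermarkets, within_km=10.0))
```

The brute-force loop is fine for a handful of points, but it compares every point against every POI, which is exactly the kind of work that is better expressed as a spatial join in a warehouse once the data grows.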

Geomancer’s Core API is powered by a SQLAlchemy backend that handles the translation of a Spell into a SQL dialect. This makes the library highly extensible, allowing you to add new feature primitives and database backends for your specific use case.
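The idea of describing a feature once and rendering it for several backends can be sketched in a few lines. This is a toy illustration, not Geomancer's code and not SQLAlchemy: the class `NearestDistanceFeature`, the function `render_sql`, the `DIALECTS` table, and the column names `id`, `geom`, and `fclass` are all hypothetical. It builds SQL with plain string formatting, whereas SQLAlchemy compiles an expression tree and uses bound parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NearestDistanceFeature:
    """Hypothetical, backend-neutral description of one feature."""
    poi_class: str     # e.g. "supermarket"
    within_m: int      # search radius in metres
    feature_name: str  # name of the output column


# Per-dialect rendering rules: how to quote a table name, and how to
# make a geometry column measurable in metres on that backend.
DIALECTS = {
    "bigquery": {"quote": "`{}`", "geog": "{}"},
    "postgresql": {"quote": '"{}"', "geog": "{}::geography"},
}


def render_sql(feature, dialect, points_table, poi_table):
    """Render the same feature definition as SQL text for one dialect.

    String interpolation is used only to keep the sketch short; real
    code should never splice untrusted values into SQL like this.
    """
    rules = DIALECTS[dialect]  # raises KeyError for an unknown backend
    p_geog = rules["geog"].format("p.geom")
    s_geog = rules["geog"].format("s.geom")
    points = rules["quote"].format(points_table)
    pois = rules["quote"].format(poi_table)
    return (
        f"SELECT p.id, MIN(ST_Distance({p_geog}, {s_geog})) AS {feature.feature_name}\n"
        f"FROM {points} AS p\n"
        f"JOIN {pois} AS s\n"
        f"  ON ST_DWithin({p_geog}, {s_geog}, {feature.within_m})\n"
        f"WHERE s.fclass = '{feature.poi_class}'\n"
        "GROUP BY p.id"
    )


feature = NearestDistanceFeature("supermarket", 10_000, "dist_supermarket")
for name in DIALECTS:
    print(f"-- {name}")
    print(render_sql(feature, name, "points", "pois"))
```

Supporting another backend in this sketch means adding one entry to `DIALECTS`, which mirrors in miniature why separating the feature definition from its SQL rendering makes a library easier to extend. Note that the inner join drops points with no match inside the radius.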

We hope that Geomancer can help you scale your geospatial feature engineering! You can get started by reading through our getting started demo and setup guide, and you can find more details in the documentation. Lastly, contributions are welcome! Simply file an issue or submit a pull request through GitHub.

Are you interested in using machine learning and geospatial data to help you and your organization make better and more informed decisions? Get in touch with us at [email protected] to learn more!
