Overture Maps - a Fusion of Open and Commercial Data for a New Era in Mapping
When it comes to creating maps, performing spatial analysis, or building digital simulations of our real world, one thing is crucial: having geospatial data. Depending on your application, you may collect some data yourself in the field, but most likely you will enrich this data with “generic” data such as roads, buildings, or administrative boundaries. There are many data providers, which can be categorized by one criterion: open or proprietary access to their data.
Open and commercial data providers
By far the most popular open data provider is OpenStreetMap (OSM), a collaborative project that aims to create a free, editable map of the world that anyone can contribute to. Its global community consists of millions of contributors, and you, dear reader, are also welcome to map whatever comes to mind (and is not already mapped). One big advantage of OSM is the high level of detail, e.g. even the type of pedestrian traffic light button and its signal are specified. The quality of OSM data is very high because it is collected and verified by humans. However, there is also a major drawback: the map is only as good as its contributors, which makes OSM lack consistency on a global scale.
Commercial data providers often try to provide more consistency by focusing on less detail. They do this in part by sending people out into the field, but they also spend large sums of their budget on building algorithms to extract information from aerial and satellite imagery. This approach is particularly useful in remote and rapidly changing regions.
In a nutshell, OSM is good at data quality, while commercial vendors are good at data quantity, and this is where Overture Maps comes in.
What is Overture Maps?
Overture Maps is an open data project launched by the Linux Foundation in 2022. Its goal is to create a geospatial data product that is easy to use and interoperable, merging the datasets from both open and commercial data contributors. The list of Overture members is long and includes many of the industry’s biggest players, like Amazon, Meta, Microsoft, and TomTom. Now they are teaming up with open data projects to create a larger and more consistent geospatial dataset than ever before.
Merging datasets — Hierarchy and Schema
In order to merge multiple datasets, a specific hierarchy must determine which element to use in case of overlap. The key criterion here is reliability. Overture Maps considers human-collected data to be more reliable than machine-collected data.
For example, Overture Maps ranks OSM building footprint data higher than Microsoft’s building footprint data, which is detected by machine learning algorithms. If a building has both OSM and Microsoft data available, Overture uses the OSM data. However, if there is only Microsoft data, Overture uses that instead. In this way, Overture combines OSM data with Microsoft machine learning data to provide a more comprehensive picture of buildings.
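The precedence rule described above can be illustrated with a minimal Python sketch. It is not Overture's actual pipeline: the `Footprint` class, the `SOURCE_PRIORITY` ranking, and the shared `building_id` key are all assumptions made for illustration. In practice, deciding that two footprints describe the same building requires spatial matching, which is left out here.

```python
from dataclasses import dataclass

# Lower number = more trusted source. Illustrative ranking, not Overture's real configuration.
SOURCE_PRIORITY = {"osm": 0, "microsoft_ml": 1}


@dataclass(frozen=True)
class Footprint:
    building_id: str  # hypothetical key; real conflation matches geometries spatially
    source: str
    geometry: str     # placeholder for a real geometry object


def merge_footprints(candidates):
    """Keep one footprint per building, preferring the most trusted source."""
    best = {}
    for fp in candidates:
        current = best.get(fp.building_id)
        if current is None or SOURCE_PRIORITY[fp.source] < SOURCE_PRIORITY[current.source]:
            best[fp.building_id] = fp
    return best


candidates = [
    Footprint("b1", "microsoft_ml", "POLYGON(...)"),
    Footprint("b1", "osm", "POLYGON(...)"),
    Footprint("b2", "microsoft_ml", "POLYGON(...)"),
]
merged = merge_footprints(candidates)
for building_id, fp in sorted(merged.items()):
    print(building_id, fp.source)
```

Building `b1` is available from both sources, so the OSM footprint wins. Building `b2` exists only in the machine-learning dataset, so that footprint is kept.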
Another example: If there is no height for a building in OSM, Overture adds the height estimates from USGS LiDAR, a remote sensing program that uses laser light to measure distances and generate detailed, three-dimensional information about the Earth’s surface.
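Filling a missing attribute from a secondary source can be sketched in the same spirit. The dictionary layout, the `height_source` field, and the `lidar_heights` lookup are hypothetical names chosen for this example and do not reflect Overture's schema.

```python
def fill_missing_height(building, lidar_heights):
    """Return a copy of the building with a LiDAR height estimate if OSM has none."""
    if building.get("height") is not None:
        return building  # the human-collected value stays untouched
    estimate = lidar_heights.get(building["id"])
    if estimate is None:
        return building  # no estimate available, leave the gap
    # Record where the value came from so users can judge its reliability.
    return {**building, "height": estimate, "height_source": "usgs_lidar"}


lidar_heights = {"b1": 12.4, "b2": 7.9}  # hypothetical height estimates in meters
with_osm_height = fill_missing_height({"id": "b1", "height": 15.0}, lidar_heights)
without_height = fill_missing_height({"id": "b2", "height": None}, lidar_heights)
print(with_osm_height)
print(without_height)
```

Recording the origin of each filled value is a design choice of this sketch: it keeps the merged dataset transparent about which attributes were measured by people and which were estimated.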
Of course, this hierarchy is much more extensive and detailed than described here. Besides the data collection method, data reliability varies by location, feature type (road, building, boundary, …), and more.
Another important aspect of integrating multiple data sources is creating a schema (or data model) into which all features are transformed. To do this, Overture created a schema that abstracts the attributes of a feature to a higher level. Imagine a very precise dataset and a less precise one. The only way to merge them and keep them consistent is to zoom out and bring them up to a higher level. For example, buildings categorized in a source dataset as a warehouse, shed, or apartment building are now classified into a well-defined set of broad building types. Classifying data from multiple data sources into your own schema takes a lot of time and effort. Overture Maps takes care of this task and eliminates much of the processing on the user's side.
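As a rough illustration of what such a schema mapping involves, the following Python sketch maps detailed source categories onto a small set of broad classes. The class names (`industrial`, `outbuilding`, `residential`, `unknown`) are invented for this example and are not the values used in the Overture schema.

```python
# Hypothetical mapping from detailed source categories to broad classes.
BROAD_CLASS = {
    "warehouse": "industrial",
    "shed": "outbuilding",
    "apartment building": "residential",
}


def to_broad_class(source_type):
    """Normalize a source category and map it to a broad class."""
    key = source_type.strip().lower()
    # Anything the mapping does not know falls back to a catch-all class.
    return BROAD_CLASS.get(key, "unknown")


for raw in ["Warehouse", " shed ", "apartment building", "lighthouse"]:
    print(raw.strip(), "->", to_broad_class(raw))
```

A real mapping has to cover far more categories and sources, which is exactly the effort that a shared schema saves the user.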
Data access
Speaking of the user side, it is important to understand that Overture Maps sees itself as a data provider, not a service provider. This means that its focus is on creating the data, not distributing it. The data is made for developers who want to build their own custom services with it. You will not find a download button on the website, and a dataset covering the entire planet would probably overwhelm your local machine anyway. Instead, Overture provides the data with an API as cloud-optimized Parquet files, making it easy to integrate into your cloud-native solution.
Overture Maps currently provides five data layers: Admins (administrative boundaries), Base (water surfaces and land bodies), Places (real-world facilities and places of interest), Transportation (transportation infrastructure like roads, cycleways, and ferry routes), and Buildings. It is expected that more data layers will be added in the future.
Further resources and support
Overture Maps is a project in the making. It will be exciting to see how this data product evolves, who joins the project, and what is built with it.
If you want to learn more about Overture, check out their official website. Also, I highly recommend this episode of the MapScaping Podcast for a deeper dive into the topic.
Interested in working with Overture data or geospatial data in general? Mapular is a company focused on custom geospatial solutions, and we would love to hear from you.