python3 -m pip install pyquadkey2
python3
Building footprints from Global ML
I recently learned of the GlobalML dataset from Microsoft [1], which contains building footprints extracted from satellite imagery. However, accessing the data is not trivial, as only a handful of countries have specific data releases. The rest of the world, including India, is available as compressed files that can be downloaded using the quadkey for the required map tile.
The challenge I faced was figuring out which quadkey tile contained the latitude and longitude I was interested in. Luckily, I found a very useful snippet [2] that makes use of the pyquadkey2 Python package.
from pyquadkey2 import quadkey

def geo_to_qk(latitude, longitude, zoom_level):
    """Convert latitude and longitude to quadkey at specified
    integer zoom level, returning quadkey as a string
    """
    qk = str(quadkey.from_geo((latitude, longitude), zoom_level))
    return qk

# To get a 9 digit quadkey, the zoom level is 9
geo_to_qk(19.07403, 72.99571, 9)
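To see what a quadkey actually encodes, here is a minimal pure-Python sketch of the standard Bing Maps tile-system calculation. It assumes pyquadkey2 follows the same scheme; the function name latlon_to_quadkey is my own and not part of any package. The point is projected with Web Mercator, mapped to a tile column and row, and the bits of the two are interleaved into base-4 digits.

```python
import math

def latlon_to_quadkey(latitude, longitude, zoom_level):
    """Hypothetical helper: quadkey of the tile containing a point."""
    # Clamp to the range the Web Mercator projection supports
    latitude = min(max(latitude, -85.05112878), 85.05112878)
    longitude = min(max(longitude, -180.0), 180.0)

    n = 2 ** zoom_level  # number of tiles along each axis

    # Project to the unit square (x to the east, y to the south)
    x = (longitude + 180.0) / 360.0
    sin_lat = math.sin(math.radians(latitude))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)

    # Tile column and row, clamped to the valid index range
    tile_x = min(max(int(x * n), 0), n - 1)
    tile_y = min(max(int(y * n), 0), n - 1)

    # Interleave the bits of column and row, most significant bit first
    digits = []
    for i in range(zoom_level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)

print(latlon_to_quadkey(19.07403, 72.99571, 9))
```

Each extra zoom level appends one digit, so a quadkey at zoom level 9 is always nine characters long, and every tile's quadkey starts with the quadkey of its parent tile.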
The GlobalML data tiles are based on zoom level 9, so once I had the requisite quadkey (123300311), I just had to look up the file containing it in the data links CSV to get the following link: https://minedbuildings.blob.core.windows.net/global-buildings/2022-11-15/global-buildings.geojsonl/RegionName%3DIndia/quadkey%3D123300311/part-00055-2cb4a5ad-6652-48b0-a336-f09518f4c9e5.c000.csv.gz
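The lookup itself can be scripted rather than done by eye. The sketch below is only illustrative: it assumes the data links CSV has columns named Location, QuadKey and Url (check the header of the file you download), the helper name find_tile_urls is hypothetical, and the sample rows use placeholder URLs rather than real ones.

```python
import csv
import io

def find_tile_urls(links_file, quadkey, region=None):
    """Hypothetical helper: return the URLs of all rows matching a quadkey.

    Assumes columns named Location, QuadKey and Url.
    """
    urls = []
    for row in csv.DictReader(links_file):
        # Compare as strings so quadkeys with leading zeros stay intact
        if row["QuadKey"] != str(quadkey):
            continue
        if region is not None and row["Location"] != region:
            continue
        urls.append(row["Url"])
    return urls

# Stand-in for the downloaded links file; the URLs are made-up placeholders
SAMPLE = (
    "Location,QuadKey,Url\n"
    "India,123300311,https://example.com/india-123300311.csv.gz\n"
    "India,123300310,https://example.com/india-123300310.csv.gz\n"
)

print(find_tile_urls(io.StringIO(SAMPLE), "123300311", region="India"))
```

Reading the file with the csv module keeps every quadkey as text. Loading the column as a number instead would silently drop any leading zeros, so the match would fail for those tiles.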
For some unknown reason, the geojson file is saved as csv. So, once the csv.gz is downloaded and extracted, the file needs to be renamed from .csv to .geojson before being read into R with sf::st_read(). A quick visualisation shows that the tile gets footprints for the Mumbai suburban district and the wider metropolitan region to the east. A different tile will be required for the Mumbai city district.
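The extract-and-rename step can also be done in one go from Python, before handing the file to sf::st_read(). This is a minimal sketch using only the standard library; the helper name extract_as_geojson is my own, and it assumes the download has already been saved locally with its original .csv.gz name.

```python
import gzip
import shutil
from pathlib import Path

def extract_as_geojson(gz_path):
    """Hypothetical helper: decompress name.csv.gz into name.geojson."""
    gz_path = Path(gz_path)
    name = gz_path.name
    if not name.endswith(".csv.gz"):
        raise ValueError(f"expected a .csv.gz file, got {name}")

    # Swap the misleading .csv.gz suffix for .geojson
    out_path = gz_path.with_name(name[: -len(".csv.gz")] + ".geojson")

    # Stream the decompressed bytes so large tiles need not fit in memory
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path

# Example (the file name is a placeholder for the downloaded tile):
# extract_as_geojson("mumbai-tile.csv.gz")
```

The original archive is left untouched, so the step can be rerun safely if the extracted file is deleted or overwritten.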