Adding GeoDjango to an existing Django project
3rd May 2021
Work on VIAL for Vaccinate The States continues.
I talked about matching last week. I’ve been building more features to support figuring out if a newly detected location is already listed or not, with one of the most significant being the ability to search for locations within a radius of a specific point.
I’ve experimented with a PostgreSQL/Django version of the classic cos/sin/radians query for this but if you’re going to do this over a larger dataset it’s worth using a proper spatial index for it—and GeoDjango has provided tools for this since Django 1.0 in 2008!
I have to admit that outside of a few prototypes I’ve never used GeoDjango extensively myself—partly I’ve not had the right project for it, and in the past I’ve also been put off by the difficulty involved in installing all of the components.
That’s a lot easier in 2021 than it was in 2008. But VIAL is a project in-flight, so here are some notes on what it took to get GeoDjango added to an existing Django project.
Alex Vandiver has been working with me on VIAL and helped figure out quite a few of these steps.
Activating PostgreSQL
The first step was to install the PostGIS PostgreSQL extension. This can be achieved using a Django migration:
from django.contrib.postgres.operations import CreateExtension from django.db import migrations class Migration(migrations.Migration): dependencies = [ ("my_app", "0108_previous-migration"), ] operations = [ CreateExtension("postgis"), ]
Most good PostgreSQL hosting already makes this extension available—in our case we are using Google Cloud SQL which supports various extensions, including PostGIS. I use Postgres.app for my personal development environment which bundles PostGIS too.
So far, so painless!
System packages needed by GeoDjango
GeoDjango needs the GEOS, GDAL and PROJ system libraries. Alex added these to our Dockerfile (used for our production deployments) like so:
RUN apt-get update && apt-get install -y \
binutils \
gdal-bin \
libproj-dev \
&& rm -rf /var/lib/apt/lists/*
Adding a point field to a Django model
I already had a Location
model, which looked something like this:
class Location(models.Model): name = models.CharField() # ... latitude = models.DecimalField( max_digits=9, decimal_places=5 ) longitude = models.DecimalField( max_digits=9, decimal_places=5 )
I made three changes to this class: I changed the base class to this:
from django.contrib.gis.db import models as gis_models class Location(gis_models.Model): # ...
I added a point
column:
point = gis_models.PointField( blank=True, null=True, spatial_index=True )
And I set up a custom save()
method to populate that point
field with a point representing the latitude and longitude every time the object was saved:
from django.contrib.gis.geos import Point # ... def save(self, *args, **kwargs): # Point is derived from latitude/longitude if self.longitude and self.latitude: self.point = Point( float(self.longitude), float(self.latitude), srid=4326 ) else: self.point = None super().save(*args, **kwargs)
srid=4326
ensures the point is stored using WGS84—the most common coordinate system for latitude and longitude values across our planet.
Running ./manage.py makemigrations
identified the new point
Point column and created the corresponding migration for me.
Backfilling the point column with a migration
The .save()
method would populate point
for changes going forward, but I had 40,000 records that already existed which I needed to backfill. I used this migration to do that:
from django.db import migrations class Migration(migrations.Migration): dependencies = [ ("core", "0110_location_point"), ] operations = [ migrations.RunSQL( sql=""" update location set point = ST_SetSRID( ST_MakePoint( longitude, latitude ), 4326 );""", reverse_sql=migrations.RunSQL.noop, ) ]
latitude/longitude/radius queries
With the new point
column created and populated, here’s the code I wrote to support simple latitude/longitude/radius queries:
from django.contrib.gis.geos import Point from django.contrib.gis.measure import Distance def search_locations(request): qs = Location.objects.filter(soft_deleted=False) latitude = request.GET.get("latitude") longitude = request.GET.get("longitude") radius = request.GET.get("radius") if latitude and longitude and radius: # Validate latitude/longitude/radius for value in (latitude, longitude, radius): try: float(value) except ValueError: return JsonResponse( {"error": "latitude/longitude/radius should be numbers"}, status=400 ) qs = qs.filter( point__distance_lt=( Point( float(longitude), float(latitude) ), Distance(m=float(radius)), ) ) # ... return JSON for locations
In writing up these notes I realize that this isn’t actually the best way to do this, because it fails to take advantage of the spatial index on that column! I’ve filed myself an issue to switch to the spatial-index-friendly dwithin instead.
Getting CI to work
The hardest part of all of this turned out to be getting our CI suites to pass.
We run CI in two places at the moment: GitHub Actions and Google Cloud Build (as part of our continuous deployment setup).
The first error I hit was this one:
psycopg2.errors.UndefinedFile: could not open extension control file "/usr/share/postgresql/13/extension/postgis.control": No such file or directory
It turns out that’s what happens when your PostgreSQL server doesn’t have the PostGIS extension available.
Our GitHub Actions configuration started like this:
name: Run tests
on: [push]
jobs:
test:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:13
env:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_DB: vaccinate
options:
--health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5
ports:
- 5432:5432
steps:
The postgres:13
image doesn’t have PostGIS. Swapping that out for postgis/postgis:13-3.1
fixed that (using this image).
Our Cloud Build configuration included this:
# Start up a postgres for tests
- id: "start postgres"
name: "gcr.io/cloud-builders/docker"
args:
- "run"
- "-d"
- "--network=cloudbuild"
- "-e"
- "POSTGRES_HOST_AUTH_METHOD=trust"
- "--name"
- "vaccinate-db"
- "postgres"
- id: "test image"
name: "gcr.io/cloud-builders/docker"
args:
- "run"
- "-t"
- "--network=cloudbuild"
- "-e"
- "DATABASE_URL=postgres://postgres@vaccinate-db:5432/vaccinate"
- "${_IMAGE_NAME}:latest"
- "pytest"
- "-v"
I tried swapping out that last postgres
argument for postgis/postgis:13-3.1
, like I had with the GitHub Actions one... and it failed with this error instead:
django.db.utils.OperationalError: could not connect to server: Connection refused
Is the server running on host "vaccinate-db" (192.168.10.3) and accepting
TCP/IP connections on port 5432?
This one stumped me. Eventually Alex figured out the problem: the extra extension meant the PostgreSQL was taking slightly longer to start—something that was covered in our GitHub Actions configuration by the pg_isready
line. He added this step to our Cloud Build configuration:
- id: "wait for postgres"
name: "jwilder/dockerize"
args: ["dockerize", "-timeout=60s", "-wait=tcp://vaccinate-db:5432"]
It uses jwilder/dockerize to wait until the database container starts accepting connections on port 5432.
Next steps
Now that we have GeoDjango I’m excited to start exploring new capabilities for our software. One thing in particular that interests me is teaching VIAL to backfill the county for a location based on its latitude and longitude—the US Census provide a shapefile of county polygons which I use with Datasette and SpatiaLite in my simonw/us-counties-datasette project, so I’m confident it would work well using PostGIS instead.
Releases this week
-
django-sql-dashboard: 0.11a0—(22 total releases)—2021-04-26
Django app for building dashboards using raw SQL queries
TIL this week
More recent articles
- Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac - 12th November 2024
- Visualizing local election results with Datasette, Observable and MapLibre GL - 9th November 2024
- Project: VERDAD - tracking misinformation in radio broadcasts using Gemini 1.5 - 7th November 2024