Using Neo4j Spatial Procedures in legis-graph-spatial
09 Aug 2016
Neo4j 3.0 introduced the concept of user defined procedures: code written in Java (or any JVM language) that is deployed to the database and callable from Cypher. User defined procedures are an alternative to unmanaged extensions, with the key difference that they are called from Cypher rather than by extending the HTTP REST endpoints. This allows for an alternative API for Neo4j’s spatial extension - namely, exposing Neo4j Spatial’s functionality through Cypher!
If you’re not familiar with Neo4j Spatial, it’s an extension to Neo4j that implements spatial indexing, enabling import, storage, and querying of spatial data in Neo4j. After the introduction of user defined procedures in Neo4j, many procedures were added to spatial, making it much easier to use. Instead of making separate HTTP requests to the endpoints exposed by spatial or using the spatial Java API, a user can now simply call spatial procedures directly from Cypher (and make use of the official language drivers and the efficient Bolt binary protocol).
Today I got around to updating legis-graph-spatial to use these new spatial procedures, instead of the spatial REST API it was using previously. Here’s a brief overview of the update:
- Simplified congressional district boundaries
- Update to Neo4j 3.0.4 and spatial 0.20
- Geospatial querying using spatial procedures from Cypher
If you haven’t seen it before, legis-graph-spatial provides a visual way to explore US Congress members’ topics of influence by district. The user clicks on a map to see who represents that area in the House, as well as information about the committees that representative serves on and the issues over which they have influence in Congress. This is all powered by Neo4j. See this post for more info.
Legis-graph-spatial. Explore legislator topics of influence by district.
Simplified Congressional District Boundaries
Previously I was using a much higher resolution version of the congressional district boundaries than was necessary. I took this opportunity to replace the WKT boundaries for each district using this lower resolution data from Govtrack.
This Python script crawls the gis.govtrack.us API to fetch WKT format boundaries for each congressional district and saves them in a CSV file that we can later import into Neo4j easily as part of the LOAD CSV import query for legis-graph:
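The CSV-writing half of that step can be sketched roughly as follows. This is a minimal illustration and not the script referenced above: the function name `write_district_boundaries`, the column names, and the sample polygon are all made up for the example, and the code assumes the boundaries have already been fetched (it makes no requests to gis.govtrack.us).

```python
import csv
import io


def write_district_boundaries(rows, out):
    """Write (state, district, wkt) tuples as CSV rows to a file-like object.

    The column names are hypothetical; use whatever your LOAD CSV query expects.
    Returns the number of data rows written.
    """
    writer = csv.writer(out)
    writer.writerow(["state", "district", "wkt"])
    count = 0
    for state, district, wkt in rows:
        # WKT strings contain commas; csv.writer quotes such fields automatically.
        writer.writerow([state, district, wkt])
        count += 1
    return count


# Made-up sample boundary, not real district data.
sample = [("CA", 14, "POLYGON((-122.5 37.4, -122.1 37.4, -122.1 37.7, -122.5 37.4))")]
buffer = io.StringIO()
write_district_boundaries(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file instead of a `StringIO`, open it with `newline=""` so the `csv` module controls the line endings.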
Upgrading And Installing Neo4j Spatial
The server for legis-graph-spatial is running on a DigitalOcean VPS instance, so I upgraded to Neo4j 3.0.4 using the Neo4j Debian package, which installs Neo4j as a service. Once I installed Neo4j I built and installed spatial:
Instead of building spatial you can also download the spatial plugin jar file from GitHub:
Then I used the legis-graph import script with LazyWebCypher, a tool for easily running multi-statement Cypher scripts from the browser, to load legis-graph data for the 114th Congress.
Once the data is loaded we want to add the District nodes to a spatial index so we can do things like find the Congressional District that contains a certain latitude and longitude.
We’ll use a spatial procedure, spatial.addWKTLayer, to create a WKT layer:
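As a rough sketch of what that call can look like from a driver, the helper below builds the Cypher statement and its parameters. The layer name `geom` and the property name `wkt` are assumptions for illustration, as is the helper itself; the `{name}` parameter syntax matches the Neo4j 3.0 era of this post, while newer Neo4j versions use `$name` instead.

```python
def wkt_layer_statement(layer="geom", wkt_property="wkt"):
    """Return (query, params) for creating a WKT layer with Neo4j Spatial.

    Both defaults are assumed names: `layer` is the spatial layer to create and
    `wkt_property` is the node property that holds the WKT geometry string.
    """
    if not layer or not wkt_property:
        raise ValueError("layer and wkt_property must be non-empty strings")
    # Neo4j 3.0 parameter syntax; on newer versions write $layer and $property.
    query = "CALL spatial.addWKTLayer({layer}, {property})"
    return query, {"layer": layer, "property": wkt_property}


query, params = wkt_layer_statement()
print(query)
print(params)
```

The resulting pair can be handed to a driver session, for example `session.run(query, params)`.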
Then we’ll match on all District nodes and add them to the WKT layer we just created (again using a spatial procedure):
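One possible form of this step, sketched under assumptions: it calls `spatial.addNode` once per matched node, which is not necessarily the procedure used for legis-graph-spatial, and it assumes a layer named `geom` and District nodes that already carry their WKT string in the property the layer was created with. The `session` argument is any object with a `run(query, params)` method whose result offers `single()`, as driver sessions do.

```python
# Assumed layer name "geom"; {layer} is Neo4j 3.0 parameter syntax ($layer on newer versions).
ADD_DISTRICTS = (
    "MATCH (d:District) "
    "CALL spatial.addNode({layer}, d) YIELD node "
    "RETURN count(node) AS added"
)


def add_districts_to_layer(session, layer="geom"):
    """Add every District node to the given spatial layer and return how many were added."""
    record = session.run(ADD_DISTRICTS, {"layer": layer}).single()
    return record["added"]
```

Returning the count gives a quick sanity check that the number of indexed nodes matches the number of districts that were imported.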
Now that the District nodes have been added to the spatial layer, we can perform an indexed geospatial query to find the Congressional district given a latitude and longitude. And since these District nodes are part of legis-graph, it’s just a simple graph traversal to find the Legislator that represents that District and their topics of influence:
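A hedged sketch of such a lookup follows. It is illustrative only: the layer name `geom`, the `REPRESENTS` relationship type, and the helper name are assumptions rather than the exact legis-graph schema, and the traversal stops at the Legislator instead of continuing to topics of influence. It relies on `spatial.intersects` accepting a map with `latitude` and `longitude` keys as the geometry argument.

```python
# Assumed names: layer "geom", relationship type REPRESENTS.
# {layer} and {point} use Neo4j 3.0 parameter syntax ($layer, $point on newer versions).
DISTRICT_LOOKUP = (
    "CALL spatial.intersects({layer}, {point}) YIELD node AS d "
    "MATCH (d)<-[:REPRESENTS]-(l:Legislator) "
    "RETURN d, l"
)


def district_lookup_statement(latitude, longitude, layer="geom"):
    """Return (query, params) for finding the district containing a point."""
    lat = float(latitude)
    lon = float(longitude)
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    params = {"layer": layer, "point": {"latitude": lat, "longitude": lon}}
    return DISTRICT_LOOKUP, params


# Example coordinates chosen arbitrarily for the illustration.
query, params = district_lookup_statement(38.8977, -77.0365)
print(query)
print(params)
```

Validating the coordinates before the query runs keeps a swapped latitude and longitude from silently returning no district.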
Querying legis-graph using indexed geospatial query from Cypher
Previously I was making an HTTP request to the spatial endpoint to query the district and another to the Cypher transactional endpoint to query for topics of influence, but now we can combine this all into one query as seen above.
Also, we were using a jQuery AJAX POST request to the transactional Cypher HTTP endpoint to execute our Cypher query:
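For reference, the shape of that HTTP exchange can be sketched in Python rather than jQuery. The helpers below are hypothetical and send nothing over the network; they only build the JSON body that the transactional endpoint (`/db/data/transaction/commit` in Neo4j 3.0) expects and unpack the JSON it returns.

```python
import json


def build_commit_payload(statement, parameters=None):
    """Build the JSON body for a single-statement request to the transactional endpoint."""
    return json.dumps(
        {"statements": [{"statement": statement, "parameters": parameters or {}}]}
    )


def rows_from_response(body):
    """Turn a transactional endpoint response body into a list of {column: value} dicts."""
    doc = json.loads(body)
    if doc.get("errors"):
        # The endpoint reports failures in an "errors" array rather than only via status codes.
        raise RuntimeError(doc["errors"][0].get("message", "unknown error"))
    rows = []
    for result in doc.get("results", []):
        columns = result["columns"]
        for item in result["data"]:
            rows.append(dict(zip(columns, item["row"])))
    return rows


print(build_commit_payload("RETURN 1 AS n"))
```

The body is sent as a POST with a `Content-Type: application/json` header.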
Then to execute the Cypher query we saw above that does our spatial lookup and parse the data:
As you can see we are using a promise for dealing with the asynchronous behavior of a database request. Alternatively we could act on the data as a stream, by subscribing to the data event:
However, since we only expect to receive one row back, we don’t need to process the result as a stream.