Often, activist groups will have a list of supporters and they will need to know which riding those supporters live in. The easiest way to do this is to use a paid platform such as NationBuilder which has this built in. But that costs money and often groups do not have the money for that. For only marginally more work, you can do it for free using Open Refine and Open North.
Open Refine is an open source application for “working with messy data”. Originally built by Google it has been released to the public and is used by data nerds the world over. A lot of librarians, statisticians, or scientists use it but it can also be a lifesaver for activists working with data. We’re going to cover only one way to use it.
It’s extremely powerful and worth investing some time to learn. Note that using Open Refine, your data never goes to Google so you do not have to worry about Google using it in ways you do not want.
Step one: get the Open Refine application.
It’s available at openrefine.org/download.html and works on Windows, Mac or Linux. These instructions are going to cover the Windows version but they all run very similarly.
There is nothing to install on Windows. Just download the zip file, and put the files somewhere on your computer where you’ll be able to find them again.
Step two: Create a project
Open the application and you’ll be asked to open a project. You’ll likely not have any, and you have to make a new project every time you want to work with a new data set, so move to “Create a project”.
Here you pick the files you want to work with. It can accept Excel, CSV, and a large number of other data sources.
On the next screen, you will see how the file is seen by OpenRefine. Down at the bottom, you can adjust any of the parameters but it usually does a good job of getting them correct out of the box. This guide is going to assume that you are just accepting the defaults.
Put a name in the “Project name” dialog (upper right corner) and hit the “Create project” button.
Depending on the speed of your computer and the size of the file, you will move to a new screen that will display the number of rows in big print near the top and will display a sample of the data. The default is to show you 10 rows but you can up that to 50 rows if that’s helpful.
The above screen video is using sample data.
Step three: Fetch the Open North data
Now, find the column that has the postal code in it and click on the downward pointing arrow next to the column title.
You are looking for the “Edit Column” menu item, and want to select “Add column by fetching URLs” from the menu that slides out to the right.
This is where things can get complicated but I’ll give you the code so it works!
Firstly, I should say:
This operation sends your postal code data to the servers run by Open North. Open North is a Canadian non-profit which is mandated to make government data open to the public. They have an application interface, or API, which allows you to push a postal code and get a bunch of information about the layers of government in that place.
Start by filling in the “New column name” with a name. Something like “Riding” would be appropriate. And reduce the throttle delay down to 1000 milliseconds.
In the Expression box replace “value” with this:
"http://represent.opennorth.ca/postcodes/"+replace(value," ", "")
Now hit the “OK” button at the bottom of the box and it will begin work. Depending on the size of your file and the load on the Open North server, it may take a while. The counter will tell you how much longer you have until it completes.
Step four: Make the data result human readable
If you look at the new Riding column it will be hard to understand. That’s the response from the Open North database. It’s not supposed to be read by humans. We need to tell Open Refine to clean it up and focus in on the data we want.
Right click on the downward arrow at the top of the column again, but this time go to “Edit Cells” and then “Transform”. In the Expression box replace “value” with this:
value.parseJson().boundaries_centroid[3].parseJson().name
The number [3] is important. It specifies which part of the data you want. 3 is for the federal level. You can use 0 for provincial level. If you are looking for the name of the local representative rather than the riding name, you can change the part of the code from “boundaries_centroid” to “representatives_centroid”.
Once you see the desired data in the preview, it’s time to hit “Ok”.
Again depending on the size of your data set, this might take time to process though less than the previous operation.
Step five: Export your data back out again
You should now have a column in the file with the riding names! Congratulations!
Oh, you want to use this data now? That makes sense and is fairly easy to do.
Use the “Export” button in the upper right corner to export the data back out of Open Refine in any of many different file types. And you are done!