Creating a Deprivation Index for GA
07 Sep 2016
Enhance Google Analytics' default geographic data with custom information, such as wealth, deprivation or home ownership. This feature will significantly improve your ability to tailor the geographic segments that you compare and analyse.
Geographic Data Imports – Creating a Deprivation Index
For some time now, Google Analytics has offered a feature that lets us upload custom data to enhance the data Google collects. To date, our most common use of it has been importing cost data for paid channels other than AdWords.
Google have since extended this feature to include geographic data. This means we can now build our own regions, using, for example, Local Authorities or UK counties. It also enables us to group regions by common features, for example, demographics or wealth.
Today, I am going to focus on one particular use case, in which we used UK Government data on deprivation to build an index covering the top 100 or so UK cities in Google Analytics (GA).
There are two main parts to getting this all working and I’ll run through both of them below.
The Data
The first is sourcing the data and using a common key to link with the GA data. The dimensions currently available to link on are:
1. ga:cityId
2. ga:countryIsoCode
3. ga:regionId
4. ga:subContinentCode
You can find a full list of these IDs (known as criteria IDs in AdWords) here. Note that, in the UK, ga:cityId can only be matched to cities or municipalities, so make sure you remove all other target types before trying to match your data sets.
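As a rough illustration of that filtering step, here is a minimal Python sketch. The column names ("Criteria ID", "Name", "Country Code", "Target Type") are an assumption modelled on the AdWords criteria ID download, so check them against your own file. The IDs in the sample are made up.

```python
import csv
import io

# Made-up sample rows; real criteria IDs come from the AdWords download.
# Column names are an assumption and may differ in your copy of the file.
SAMPLE = """Criteria ID,Name,Country Code,Target Type
9000001,Leeds,GB,City
9000002,Harrow,GB,Municipality
9000003,Kent,GB,County
9000004,Heathrow Airport,GB,Airport
9000005,Paris,FR,City
"""

# Target types that can be matched to ga:cityId for the UK.
CITY_TARGET_TYPES = {"City", "Municipality"}


def uk_cities(rows):
    """Keep only UK rows whose target type is a city or municipality."""
    return [
        row
        for row in rows
        if row["Country Code"] == "GB" and row["Target Type"] in CITY_TARGET_TYPES
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
cities = uk_cities(rows)
city_names = [row["Name"] for row in cities]
```

With a real file you would swap `io.StringIO(SAMPLE)` for an opened CSV file; the filter itself stays the same.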
For the deprivation data, we used the English Indices of Deprivation 2010, published here by the Department for Communities and Local Government.
data.gov.uk has literally thousands of freely available data sets, so it’s really worth taking a look at what else may be of use to your clients or business.
We used standard index matching to link the data sets. However, because the data sets use different names and/or boundaries for local areas, some manual data cleansing and checking is unavoidable (we have provided our data set for download, below).
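We did this matching in a spreadsheet, but if you prefer to script it, the sketch below shows one possible approach in Python. It is not the method we used: the `normalise` and `match_names` helpers, the list of prefixes and suffixes, and the 0.85 cutoff are all illustrative assumptions. Anything left in the unmatched list still needs checking by hand.

```python
import difflib

# Illustrative affixes only; extend these after inspecting your own data.
PREFIXES = ("city of ", "london borough of ", "royal borough of ")
SUFFIXES = (" district", " borough", " council")


def normalise(name):
    """Lower-case a place name and strip common local-authority affixes."""
    name = name.lower().strip()
    for prefix in PREFIXES:
        if name.startswith(prefix):
            name = name[len(prefix):]
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name


def match_names(city_names, authority_names, cutoff=0.85):
    """Return (matched, unmatched): city -> authority, plus cities to review."""
    lookup = {normalise(authority): authority for authority in authority_names}
    matched, unmatched = {}, []
    for city in city_names:
        close = difflib.get_close_matches(
            normalise(city), list(lookup), n=1, cutoff=cutoff
        )
        if close:
            matched[city] = lookup[close[0]]
        else:
            unmatched.append(city)
    return matched, unmatched


matched, unmatched = match_names(
    ["Leeds", "Bristol", "Slough"],
    ["Leeds District", "City of Bristol", "Reading"],
)
```

A high cutoff keeps false matches rare at the cost of a longer manual review list, which seems the safer trade-off when boundaries differ between the two sources.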
So, now we have a list of criteria IDs for UK towns and cities linked to a list of local authorities with deprivation data. The next step is to rank them, place them into deciles and label them.
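The ranking and decile step can be sketched in a few lines of Python. This assumes a higher deprivation score means a more deprived area, and the "Decile 01" label format is just one hypothetical choice (zero-padded so the labels sort correctly in reports), not necessarily the labels in our data set.

```python
def assign_deciles(scores):
    """Map each area to a decile from 1 (most deprived) to 10 (least deprived).

    scores: dict of area name -> deprivation score.
    Assumption: a higher score means a more deprived area.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    total = len(ranked)
    return {
        area: (position * 10) // total + 1
        for position, area in enumerate(ranked)
    }


def label(decile):
    """Hypothetical label format, zero-padded so labels sort in order."""
    return f"Decile {decile:02d}"


# Made-up scores for twenty placeholder areas.
scores = {f"Area {i}": float(i) for i in range(20)}
deciles = assign_deciles(scores)
labels = {area: label(decile) for area, decile in deciles.items()}
```

With twenty areas, each decile receives exactly two of them; with a count that is not a multiple of ten, the groups differ in size by at most one.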
The Configuration
Once this is complete, we are ready to set up the data import in GA. I’ve outlined the steps below:
Create a custom dimension named "Deprivation"
1. Go to the admin section of GA.
2. Click "Custom Definitions" and then "Custom Dimensions".
3. Click "New Custom Dimension".
4. Name your custom dimension and change the scope to session, because all geographic data is session level.
5. Save the dimension and return to the admin screen.
Create a new data set
1. Click on "Data Import".
2. Select "New Data Set".
3. Select "Geography Data".
4. Name the data set and select the view(s) you wish to apply the new deprivation dimension to.
5. If you're using our data set, then at the next step, select "City ID" as the Key. This is the GA dimension against which we are linking our custom data.
6. Select "Deprivation" as the Imported Data.
7. Finally, click "Get Schema" and then "Done". This will save the configuration and export a template file, into which you can paste the values from the corresponding columns from our data set.
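If you would rather generate the upload file than paste values by hand, here is a minimal Python sketch. It assumes the template header uses `ga:cityId` for the key and `ga:dimensionN` for the custom dimension, where N is the index GA assigned to "Deprivation"; confirm both against the template you exported with "Get Schema". The `build_upload_csv` helper and the IDs are hypothetical.

```python
import csv
import io


def build_upload_csv(city_labels, dimension_index=1):
    """Build the upload file contents as a string.

    city_labels: dict of criteria ID -> deprivation label.
    dimension_index: index of the custom dimension in GA (assumed to be 1).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row must match the schema template exported from GA.
    writer.writerow(["ga:cityId", f"ga:dimension{dimension_index}"])
    for city_id, deprivation_label in sorted(city_labels.items()):
        writer.writerow([city_id, deprivation_label])
    return buffer.getvalue()


# Made-up criteria IDs and labels for illustration.
upload = build_upload_csv({9000002: "Decile 03", 9000001: "Decile 01"})
```

Writing `upload` to a .csv file gives you something you can upload in the next section.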
Import the data
1. Return to the "Data Import" tab but this time select "Manage uploads".
2. Click on "Upload file", choose your file and click upload.
The CSV file should then be processed within 24 hours (in practice, this is almost always complete in under an hour).
Reporting on the Data
The final step is actually making use of the data. You can either create a custom report with the deprivation index as the primary dimension, alongside whichever metrics are of interest to you, or add the deprivation index as a secondary dimension in most reports.
As previously mentioned, there are many thousands of open data sets, so you can follow this template for your own use cases. Beyond geography, you can in fact widen your GA data with any of the following:
Content Data, such as article author or category.
Product Data, such as brand or category.
Custom Data for your specific use case.
You can download our data set here.
To read this blog by Arran Gosal on the Periscopix website, please click here.