Humanitarian Exchange Language (HXL) and HXL Proxy use cases
The Humanitarian Exchange Language is a data standard to help the sharing of information in humanitarian response. More details can be found on the HXL website.
Potential applications in the sector are huge. The challenge, as with any new standard, is for it to become more widely acknowledged and adopted by humanitarians both in the field and at HQ level. Below are some examples of how we have been using HXL at the British Red Cross:
Use case 1: Nepal Survey of Surveys
|Nepal assessments dashboard — https://data.hdx.rwlabs.org/dataset/assessment-coverage-interactive-portal|
In the wake of the 25 April and 12 May Nepal earthquakes in 2015, an OCHA-led assessment unit collected a database of all assessments covering humanitarian needs. A central aim was to make it easier for responders to find the assessments most relevant to their activities.
A web-based dashboard was made using a HXLated data set (with a Google Spreadsheet as the backend). Because the data is HXLated, it can easily be sorted into different templates: columns can be moved and headings changed as needed. The only requirement is that the data set contains the HXL tags the dashboard looks for. Such dashboard templates are easily adapted for other responses, since data collectors are not tightly constrained in how they organise their data, as long as it is HXLated.
For me, as a developer working on the dashboard, using HXL saved a lot of time. Each assessment occupies a single row, with a separate column for each sector containing ‘true’ or ‘false’ to specify whether the assessment covers that sector. After a few weeks a ‘Recovery’ sector column was added. Because the new column was introduced with an appropriate HXL tag, the dashboard found it automatically and included ‘Recovery’ in the sector graph. There was no need to adjust any of the dashboard code.
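The idea of discovering sector columns by tag rather than by position or heading can be sketched as follows. This is a minimal illustration in Python, not the dashboard's actual code; the sample rows, the `#sector+...` attribute names and the `sector_counts` helper are all assumptions made for the example.

```python
import csv
import io
from collections import Counter

# Hypothetical HXLated data: a header row, then a row of HXL hashtags.
# The tags, attributes and values are assumptions for illustration only.
SAMPLE = """\
Assessment,Organisation,Shelter,Health,Recovery
#meta+title,#org,#sector+shelter,#sector+health,#sector+recovery
Rapid shelter survey,Org A,true,false,false
Health facility review,Org B,false,true,true
Recovery needs assessment,Org C,true,false,true
"""


def is_sector_tag(tag):
    """Return True when the hashtag part (before any +attribute) is #sector."""
    return tag.replace(" ", "").lower().split("+")[0] == "#sector"


def sector_counts(text):
    """Count assessments per sector, finding sector columns by their tag."""
    rows = list(csv.reader(io.StringIO(text)))
    headers, tags, data = rows[0], rows[1], rows[2:]
    # Columns are located by tag, so a newly added sector column is
    # picked up without any change to this code.
    sector_cols = [i for i, tag in enumerate(tags) if is_sector_tag(tag)]
    counts = Counter()
    for row in data:
        for i in sector_cols:
            if row[i].strip().lower() == "true":
                counts[headers[i]] += 1
    return dict(counts)


print(sector_counts(SAMPLE))
```

Adding another tagged sector column to the sample would change the output without touching `sector_counts`, which is the behaviour described above.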
Use case 2: Shelter Cluster Portal for Nepal
|Shelter Cluster Nepal Dashboard|
Our team was approached by the Shelter Cluster to help them build an interactive dashboard to better communicate their data. The first iteration was created using Nepal earthquake response data.
One of the key requirements was that it could function in a low-bandwidth environment. A problem with the current microsite approach was that it loads all the data from the Google Spreadsheet up front. It would be more efficient to be selective and load just the information needed.
Thankfully, David Megginson, an experienced Canadian software developer, has been working on an open source application called the HXL Proxy. For the Shelter Cluster dashboard, this application made it easy to filter the data according to the user’s selection, for example by particular Nepali districts.
|The HXL Proxy|
With the HXL proxy, only rows relevant to what the user is viewing are loaded from the Google Spreadsheet. This reduces the initial page load by more than half.
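The effect of this kind of row filter can be sketched locally in Python. This is only an illustration of the idea, not the HXL Proxy's own interface; the sample data, the `#adm3+name` and `#reached` tags, and the `select_rows` helper are assumptions for the example.

```python
import csv
import io

# Hypothetical HXLated data; the figures are made up for illustration.
SAMPLE = """\
District,Households reached
#adm3+name,#reached
Gorkha,120
Sindhupalchok,340
Gorkha,80
Dolakha,55
"""


def select_rows(text, tag, value):
    """Keep the header and tag rows plus the data rows matching tag == value."""
    rows = list(csv.reader(io.StringIO(text)))
    headers, tags, data = rows[0], rows[1], rows[2:]
    col = tags.index(tag)  # locate the column by its HXL tag
    kept = [row for row in data if row[col] == value]
    return [headers, tags] + kept


filtered = select_rows(SAMPLE, "#adm3+name", "Gorkha")
for row in filtered:
    print(",".join(row))
```

Only the two Gorkha rows (plus the header and tag rows) are returned, so a client asking for one district receives a fraction of the full sheet.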
The dashboard also uses damage data about each district. Two dashboards run off the same data set, so updating multiple products requires a change to only one data set. If OCHA or other coordinating organisations released similar data for future responses, only the coordinating organisation would need to maintain the data set to keep all the products up to date.
Use case 3: PCoding and shared cleaning sheets and workflows in the European Migration Crisis
|Indicator tracker used by the IFRC for the European Migrant Crisis response|
A place code (p-code) is a kind of addressing system that provides unique identifiers for locations and administrative units in a humanitarian operation. UN OCHA often (but not always) defines p-codes for a country. These standardised codes facilitate sharing between organisations. Without p-codes, collating location-based information is much more difficult, as place names are often spelt differently! P-coding is the process of assigning place codes to countries and regions.
For the European Migrant Crisis we were collecting data about the countries in which the Red Cross was actively responding. The HXL proxy offered two features that made our workflow more efficient: the first is the replace table function, and the second is the ability to join columns.
We had one replace table serving multiple workflows. If someone added a new rule, such as ‘Macedonia’ should become ‘Macedonia, FYRO’, it automatically fed into all of our workflows. This means we can reuse the same cleaning process for each new workflow.
|Replace table used in the European Migrant Crisis response|
Using the HXL proxy we can automatically PCode our data. By joining the correctly spelt names with our PCode list, all of our outputs had ISO3 codes attached. The advantage of including these processes in our work is that even if original data sets are significantly changed, a cleaned PCoded version can quickly be generated using the HXL proxy workflow. PCoding can otherwise be quite an arduous manual task!
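The two steps, cleaning names with a replace table and then joining against a code list, can be sketched in Python as below. This is a simplified local illustration rather than the HXL Proxy workflow itself; the field names, the figures, the extra ‘FYROM’ variant and both helper functions are assumptions for the example.

```python
# Replace table: maps variant spellings to the agreed spelling.
# "FYROM" is a hypothetical extra variant added for illustration.
REPLACE_TABLE = {
    "Macedonia": "Macedonia, FYRO",
    "FYROM": "Macedonia, FYRO",
}

# Code list keyed by the agreed spelling (ISO 3166-1 alpha-3 codes).
CODE_LIST = {
    "Greece": "GRC",
    "Macedonia, FYRO": "MKD",
    "Serbia": "SRB",
}

# Hypothetical field data; the figures are made up.
RAW_ROWS = [
    {"country": "Greece", "people_reached": 1200},
    {"country": "Macedonia", "people_reached": 800},
    {"country": "FYROM", "people_reached": 150},
    {"country": "Serbia", "people_reached": 600},
]


def clean_names(rows, replace_table):
    """Apply the replace table to the country column, leaving other names as-is."""
    return [
        {**row, "country": replace_table.get(row["country"], row["country"])}
        for row in rows
    ]


def join_codes(rows, code_list):
    """Attach an ISO3 code by joining on the cleaned country name."""
    return [{**row, "iso3": code_list.get(row["country"], "")} for row in rows]


coded = join_codes(clean_names(RAW_ROWS, REPLACE_TABLE), CODE_LIST)
for row in coded:
    print(row["iso3"], row["country"], row["people_reached"])
```

Because the original rows are never modified, rerunning the two steps on a changed source sheet regenerates the coded output, and a new spelling variant only needs one extra entry in the replace table.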
In this case, our base data set was a Google Spreadsheet, so those in the field could update the figures and indicators. After the initial setup, the latest data was processed automatically to create a downloadable, PCoded version, with no involvement needed from our side.
Another benefit came when we decided to change the naming of ‘Macedonia, FYRO’ to ‘Macedonia, Former Yugoslav Republic of’. Quick changes were made to the two workflow spreadsheets, and these automatically fed through to our dashboards and final data sets, with no changes required to the original data or to the dashboard visualisations.
Use case 4: Who, what, where data collection
|IFRC 3W for Nepal Earthquake Response|
This is a prototype of a data collection system which we look forward to improving and testing again in the near future. During an emergency response the IFRC acts as a coordinator for the different Red Cross/Red Crescent national societies, which means collecting information from the partner national societies involved. For the 2015 Nepal earthquakes response, a collaborative Google Spreadsheet was set up for national societies to enter information about recovery activities. There were a few problems, and on this occasion it ultimately failed to fulfil its potential as a useful tool. Incorrect information was often entered, and correct information was sometimes replaced accidentally. Contributors fed back that the template was too easy to break, because the whole single-sheet data set was editable by a vast number of users. Too much time was spent reverting and cleaning the input.
Learning from this, we have prototyped an improved method using a HXL-based process. Again we opted for Google Spreadsheets as the platform, but this time we assigned one tab to each national society. The other tabs are locked, so users can only edit their own tab and not other national societies’ data. These separate tabs are combined via the HXL proxy to provide a single spreadsheet output. The combined data set is then passed through a separate mapping table which corrects spelling mistakes in the data entry.
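Combining the per-society tabs can be pictured as appending tables whose columns are matched by HXL tag rather than by position. The Python sketch below illustrates that idea only; it is not how the HXL Proxy is implemented, and the tab contents, tags and the `append_tabs` helper are assumptions for the example.

```python
# Hypothetical tabs, one per national society. The second tab has its
# columns in a different order; the tags still identify them.
TABS = [
    {
        "tags": ["#org", "#adm3+name", "#activity"],
        "rows": [["Society A", "Gorkha", "Shelter kits"]],
    },
    {
        "tags": ["#adm3+name", "#org", "#activity"],
        "rows": [
            ["Dolakha", "Society B", "Cash grants"],
            ["Gorkha", "Society B", "Water points"],
        ],
    },
]


def append_tabs(tabs):
    """Merge tabs into one table, aligning columns by HXL tag."""
    merged_tags = []
    for tab in tabs:
        for tag in tab["tags"]:
            if tag not in merged_tags:
                merged_tags.append(tag)
    merged_rows = []
    for tab in tabs:
        for row in tab["rows"]:
            by_tag = dict(zip(tab["tags"], row))
            # Missing columns are left blank rather than shifting values.
            merged_rows.append([by_tag.get(tag, "") for tag in merged_tags])
    return merged_tags, merged_rows


tags, rows = append_tabs(TABS)
print(tags)
for row in rows:
    print(row)
```

Matching on tags means a national society can reorder the columns in its own tab without breaking the combined output.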
A separate (Google Spreadsheet) data mapping table means that the Red Cross IM Team is able to collaboratively generate the terms. If a person finds a common misspelling, it is added to the list. Inconsistent location spellings are often a thorn in the side of humanitarian IM work, so sharing this tool on an appropriate site for future use is something that could work well.
Once the data set has been cleaned, the HXL proxy is applied to add extra columns. First we p-code the districts (admin 3) and join other relevant information columns such as district priority and other Red Cross related information.
A second join is then used to p-code to admin level 4. This adds extra relevant location data on the fly. This means data enterers need not worry about entering complicated p-codes and other IM heavy columns.
The end result is a web page where the latest data set can be downloaded already cleaned and p-coded. Links to summary data sets of the latest information, such as the number of activities by district, can also be provided for use in other IM processes.
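A summary such as the number of activities by district amounts to a simple count over the cleaned data. The Python sketch below shows the idea under stated assumptions: the rows, the placeholder p-codes (‘X01’, ‘X02’ are not real codes) and the `activities_by_district` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical cleaned, p-coded rows. The p-codes are placeholders,
# not real codes, and the activities are made up.
ROWS = [
    {"pcode": "X01", "district": "Gorkha", "activity": "Shelter kits"},
    {"pcode": "X02", "district": "Dolakha", "activity": "Cash grants"},
    {"pcode": "X01", "district": "Gorkha", "activity": "Water points"},
    {"pcode": "X01", "district": "Gorkha", "activity": "Health camp"},
]


def activities_by_district(rows):
    """Count activities per district, keyed on p-code and name."""
    counts = Counter((r["pcode"], r["district"]) for r in rows)
    return [
        {"pcode": pcode, "district": district, "activities": n}
        for (pcode, district), n in sorted(counts.items())
    ]


summary = activities_by_district(ROWS)
for line in summary:
    print(line["pcode"], line["district"], line["activities"])
```

Grouping on the p-code as well as the name keeps the summary joinable with other p-coded data sets.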
With compatible HXL tags being used between organisations, the collation of data will be more efficient. A philosophy of standardised and open inter-organisational data sharing should also reduce the duplication of work in maintaining big data sets. The challenge is for this to be achieved amidst a culture of need-to-know bases and data silos.