Prototype HRC-IDI dataset - dynamic workforce data

SSC has been working with Statistics New Zealand to test the integration of the HRC dataset into the Integrated Data Infrastructure (IDI). The IDI combines data from a range of organisations (see Figure 2.5) into a powerful dataset for government and academic research purposes. It provides the insights government needs to improve social and economic outcomes for New Zealanders. Integrated data is particularly useful in helping to address complex social issues such as crime and vulnerable children.

Statistics NZ operates the IDI within a 'five safes' framework that ensures that:

  • researchers can be trusted to use data appropriately and follow procedures
  • the project has a statistical purpose and is in the public interest
  • security arrangements prevent unauthorised access to the data
  • the data itself inherently limits the risk of disclosure (e.g. all personal information is removed from the IDI)
  • the statistical results produced do not disclose any identifying information.

Integrating the HRC into the IDI enhances the HRC's usefulness in a number of ways:

  • allows employees to be followed anonymously through the Public Service and beyond, thereby creating new information on career pathways
  • provides additional information on employees not captured by the HRC (e.g. highest qualification, migration, nationality, job history)
  • makes it easier to produce workforce information for the wider State sector and the private sector that is comparable to that for the Public Service
  • allows easier access to HRC data for researchers and opens up the use of more advanced statistical modelling techniques.

Work to date has shown that it is feasible to integrate the HRC into the IDI. Although the HRC is an anonymised dataset (i.e. it does not collect names or IRD numbers), the relatively small size of Public Service departments has made it possible to integrate the data using other payroll variables. The match rates between HRC and IDI data are generally high – they were around 90% to 95% for the prototype data underlying Figures 2.1 and 2.5. The quality of the match appears very high, with testing showing a very low rate of false-positive matches. The key limitations at this stage are a lack of timeliness (the HRC-IDI results in this report are historic, rather than for 2015) and missing data for one of the larger Public Service agencies.
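The report does not describe the linking method in detail, so the following is only a minimal sketch of how a match rate of this kind could be calculated, assuming a simple exact-match approach. The field names (`department`, `birth_month`, `start_date`, `salary`), the identifiers and the sample records are all hypothetical and are not taken from the HRC or the IDI.

```python
from collections import defaultdict

# Hypothetical linking variables; the real HRC/IDI variables are not listed in the report.
KEYS = ("department", "birth_month", "start_date", "salary")


def link_records(hrc_rows, idi_rows, keys=KEYS):
    """Link each HRC row to an IDI row that agrees on every key field.

    A link is accepted only when exactly one IDI candidate exists,
    which is one simple way to limit false-positive matches.
    """
    index = defaultdict(list)
    for row in idi_rows:
        index[tuple(row[k] for k in keys)].append(row["idi_id"])

    links = {}
    for row in hrc_rows:
        candidates = index.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:
            links[row["hrc_id"]] = candidates[0]
    return links


def match_rate(links, hrc_rows):
    """Share of HRC rows that received a link."""
    return len(links) / len(hrc_rows) if hrc_rows else 0.0


# Made-up sample data for illustration only.
hrc_rows = [
    {"hrc_id": "h1", "department": "A", "birth_month": "1980-05",
     "start_date": "2010-02-01", "salary": 60000},
    {"hrc_id": "h2", "department": "A", "birth_month": "1975-11",
     "start_date": "2008-07-14", "salary": 72000},
    {"hrc_id": "h3", "department": "B", "birth_month": "1990-01",
     "start_date": "2014-03-03", "salary": 50000},
]
idi_rows = [
    {"idi_id": "i1", "department": "A", "birth_month": "1980-05",
     "start_date": "2010-02-01", "salary": 60000},
    {"idi_id": "i2", "department": "A", "birth_month": "1975-11",
     "start_date": "2008-07-14", "salary": 72000},
    # Two IDI rows share the same key values, so h3 is left unlinked.
    {"idi_id": "i3", "department": "B", "birth_month": "1990-01",
     "start_date": "2014-03-03", "salary": 50000},
    {"idi_id": "i4", "department": "B", "birth_month": "1990-01",
     "start_date": "2014-03-03", "salary": 50000},
]

links = link_records(hrc_rows, idi_rows)
rate = match_rate(links, hrc_rows)
```

In this sketch an HRC record with more than one possible IDI counterpart is left unlinked rather than guessed at, which lowers the match rate slightly but helps keep false positives down. This mirrors the trade-off between match rate and match quality noted above.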

SSC will work with Statistics NZ on the feasibility of integrating the HRC data into the IDI on an ongoing basis, addressing these limitations and looking to further develop the range of information that is produced.

See Appendix 6 for further information on the IDI.

Figure 2.5 Statistics NZ's Integrated Data Infrastructure
