Chapter 2

Big health data: Australia's big potential


2.1        Big data has the potential to create big opportunities for Australia. A recent estimate by Lateral Economics suggests that open government data could contribute up to $25 billion per annum across the economy.[1] This analysis also suggests that health-specific data held by the Australian Government could alone account for an increase of $5.9 billion per annum.[2]

2.2        Big data also creates opportunities for considerable savings to the Australian health care system. Professor Fiona Stanley, Patron and former director of the Telethon Kids Institute, told the committee that significant gains could be made with the health budget if government appropriately harnessed linked health data. Professor Stanley suggested that linked data could be used to reduce costly but ineffective clinical interventions, detect and prevent harmful health outcomes through early intervention and also alert regulators to fraud in the healthcare system.[3]

2.3        These are just some of the potential benefits Australia may obtain if the Australian Government and the States and Territories combined and fully utilised their administrative datasets.

2.4        Over the last three years, Australian Public Service agencies have been working together to promote a new approach to using and releasing datasets held by the Australian Government.[4]

2.5        On 7 December 2015 the Prime Minister, the Hon Mr Malcolm Turnbull MP, and the Minister for Industry, Innovation and Science, the Hon Mr Christopher Pyne MP, launched the National Innovation and Science Agenda.[5] One of the agenda's key planks was for government to 'lead by example in the way Government invests in and uses technology and data to deliver better quality services'.[6] This announcement coincided with the release of the Public Sector Data Management report and the Public Data Policy Statement.[7] The report and the statement are considered at paragraphs 2.49–2.55 below.

2.6        In October 2014 the committee heard from the Population Health Research Network (PHRN) about some of the challenges faced in maintaining health data linkages and in encouraging custodians of health data to be more open in releasing their data sets.[8] These and similar concerns from other witnesses prompted the committee to initiate this current examination of issues relating to big data and data linkage.[9]

2.7        This chapter considers the meaning of data linking and the new opportunities for Australia to harness the full benefits of big data and data linkage. These matters are considered in light of the existing framework and the government's recently announced data policies.

2.8        There are some key concepts that are important for this report. These include: big data, data linkage, data custodianship, unit record level data and data linkage keys.

Big data

2.9        The phrase 'big data' has been defined to mean 'high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight, decision making, and process optimization'.[10]

2.10      Examples of big health data include:

Data linking

2.11      Data linking is the bringing together of two or more data sets to create a new, richer data set.[13] By bringing together sets of data that were previously isolated, researchers, clinicians and governments can deepen their understandings of the ways people actually use the health care system. This has the potential to inform government policy making and decisions about improving service delivery.[14]
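The idea of bringing two data sets together can be illustrated with a minimal sketch. It assumes that both data sets already share a common, de-identified key. The field names (person_key, admission_reason, gp_visits_last_year) and the records themselves are hypothetical and are not drawn from any Australian data holding. Real linkage projects are considerably more involved than this.

```python
# Two hypothetical, previously isolated data sets that share a de-identified key.
hospital_admissions = [
    {"person_key": "A1", "admission_reason": "diabetes"},
    {"person_key": "B2", "admission_reason": "fracture"},
]
gp_visits = [
    {"person_key": "A1", "gp_visits_last_year": 9},
    {"person_key": "C3", "gp_visits_last_year": 2},
]


def link_datasets(left, right, key):
    """Return one combined record for each key value found in both data sets."""
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the two records into a new, richer record.
            linked.append({**row, **match})
    return linked


linked_records = link_datasets(hospital_admissions, gp_visits, "person_key")
print(linked_records)
```

In this sketch only person A1 appears in both data sets, so the linked data set holds a single record that combines the hospital and general practice information for that person.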

Data custodianship

2.12      According to the National Statistics Service, data custodians are:

...agencies responsible for managing the use, disclosure and protection of source data used in a statistical data integration project. Data custodians collect and hold information on behalf of a data provider (defined as an individual, household, business or other organisation which supplies data either for statistical or administrative purposes). The role of data custodians may also extend to producing source data, in addition to their role as a holder of datasets.[15]

2.13      For example, the Department of Health is the custodian of the Medicare Benefits Schedule data.[16]

Unit record level data

2.14      A distinction needs to be made between individual unit records and aggregated data. Aggregated data provides information about a population as a whole and no individual can be identified from that data.[17] An example of aggregated data is the Census.

2.15      This can be contrasted with unit record level data which, according to the Australian Bureau of Statistics, is:

...a file of responses to ABS surveys or censuses that have had specific identifying information about persons and organisations confidentialised. [The unit record level data files] contain very detailed information for each individual record - a record can be a person, a business, a family, household or a job for example.[18]

2.16      Researchers who wish to understand the health system, or who are interested in a particular pharmaceutical product, prefer de-identified unit level records, as Dr Merran Smith, Chief Executive of the PHRN, explains:

Aggregated data is valuable and even linked aggregated data is valuable. But it probably cannot do the sorts of things we are talking about for the health/medical research that really needs the detail.[19]

2.17      For that reason, researchers need access to de-identified unit record level data to achieve the best result.
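The distinction between aggregated data and unit record level data can be illustrated with a small sketch. The records, the field names and the medicine are all hypothetical. The sketch shows only why totals alone cannot answer a question that depends on what happened to each individual.

```python
# Hypothetical de-identified unit records: one row per person.
unit_records = [
    {"record_id": 1, "takes_medicine_x": True, "hospitalised": True},
    {"record_id": 2, "takes_medicine_x": True, "hospitalised": False},
    {"record_id": 3, "takes_medicine_x": False, "hospitalised": True},
    {"record_id": 4, "takes_medicine_x": False, "hospitalised": False},
    {"record_id": 5, "takes_medicine_x": True, "hospitalised": True},
]

# Aggregated data: totals for the population as a whole.
aggregated = {
    "people": len(unit_records),
    "taking_medicine_x": sum(r["takes_medicine_x"] for r in unit_records),
    "hospitalised": sum(r["hospitalised"] for r in unit_records),
}

# A question that needs unit records: how many people taking the medicine
# were also hospitalised? The totals above cannot answer this on their own.
taking_and_hospitalised = sum(
    r["takes_medicine_x"] and r["hospitalised"] for r in unit_records
)

print(aggregated)
print(taking_and_hospitalised)
```

The aggregated totals show that three people take the medicine and three were hospitalised, but only the unit records reveal how many people fall into both groups.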

Data linkage key

2.18      A data linkage key is a code that is constructed to replace identifying information, such as name, date of birth and address, on a linked record in order to protect the privacy of the subjects of the study. By using a linkage key, researchers can link records that belong to the same person from multiple datasets without needing to know who the person is.[20]
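One way a linkage key of this kind could be constructed is sketched below. This is an illustrative assumption only: it uses a keyed hash (HMAC-SHA256) from the Python standard library and is not the method used by any Australian data linkage unit or integrating authority. The secret value, the function names and the sample details are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret, assumed to be held only by the body performing the linkage.
SECRET = b"held-only-by-the-linkage-unit"


def normalise(value):
    """Lower-case the value and collapse whitespace so minor variations match."""
    return " ".join(value.strip().lower().split())


def linkage_key(name, date_of_birth, address, secret=SECRET):
    """Derive a code that stands in for the identifying information."""
    message = "|".join(normalise(v) for v in (name, date_of_birth, address))
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


# The same person recorded slightly differently in two datasets.
key_a = linkage_key("Jane Citizen", "1980-02-29", "1 Example St, Perth")
key_b = linkage_key("  jane  CITIZEN ", "1980-02-29", "1 example st, perth")
# A different person at the same address.
key_c = linkage_key("John Citizen", "1980-02-29", "1 Example St, Perth")

print(key_a == key_b)
print(key_a == key_c)
```

Records carrying the same key can be matched across datasets, while a researcher who sees only the key does not see the name, date of birth or address from which it was derived. An exact-match key such as this one would not tolerate misspellings or changes of address, which is one reason real linkage systems are more sophisticated.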

2.19      Additional terms used in this report may be found in the Glossary.[21]

Australia's potential

2.20      Data is an important and valuable government resource. Data linking has the capacity to maximise that resource and to create new opportunities for more complex and expanded evidence-based policy and research.[22] Professor Stanley highlighted the benefits to government of using more linked data:

...[Australia] would be second to none in the world in enabling us to evaluate all the outcomes of all [government] services that are provided. [Australia] would be able to influence and evaluate evidence based practice; we would be able to look at the epidemiological trends and risk factors of major and costly problems.[23]

2.21      In the medical sphere there are some shining examples of how data linking has improved health outcomes. For instance, data linking has helped to identify the role of folate in pregnancy in reducing neural tube defects, such as spina bifida.[24] The Northern Territory Government facilitated 'a study that reviewed the association between primary care utilisation and the number of hospital admissions for the NT remote Aboriginal population'.[25]

2.22      Linked data sets have also been used to 'estimate the prevalence of dementia in the NT Aboriginal and non-Aboriginal populations' and analyse the 'cost effectiveness of primary care in the management of diabetes'.[26]

2.23      The Commonwealth Scientific and Industrial Research Organisation (CSIRO) has used linked data to create a Patient Admission Prediction Tool (PAPT) that is helping to make hospitals more efficient.[27] The tool uses historical data from emergency departments and hospital data sets to model the number of patients that are likely to present at the emergency department and the numbers that are likely to require admission to wards. The CSIRO notes that improved access to hospital datasets held by the Australian Government would ensure that decisions could be made on the most comprehensive data available.[28]

2.24      Many witnesses argued that governments could facilitate a greater degree of health data linkage, thereby releasing significant untapped opportunities. For instance, the Council of Academic Public Health Institutions Australia (CAPHIA) noted that linking State and Australian Government datasets has:

...the potential for national, state and local comparative effectiveness, clinical trials and registry research that has thus far been largely untapped, to drive health policy, redesign, quality improvement and evidence translation in health care. Additionally, it enables...the rigorous objective evaluation of health policy for government and key policy professionals; and the ability to compare trends nationally, to identify programs that deliver value for money and to avoid wasting resources on those that are not delivering. The result is better targeted, evidence-based and more cost-effective health policy, services and interventions for the Australian community.[29]

2.25      In addition to the excellent research outlined in paragraphs 2.21–2.22, the Northern Territory submitted that the following opportunities may be possible if more Australian Government datasets were accessible:

Geographic distribution of Medicare and PBS [Pharmaceutical Benefits Scheme] funded service access mapped against state based services or health need,

Socioeconomic distribution of Medicare and PBS funded service access

Associations between utilisation of Medicare funded services and hospital and/or [Emergency Department] services...

The distribution of PBS funded items and measures of health need.

Quality and safety measures of primary care, by linking Medicare or PBS items and outcomes such as diabetic control, hospitalisation and mortality.[30]

2.26      The Australian Government also acknowledged the latent potential of data linkage. For example, Ms Alanna Foster, First Assistant Secretary of the Department of Health, told the committee:

Linked data would also enable understanding of the full extent of patients' health-service usage—that is, it would be possible to follow patients' pathways through the system and answer questions about patient populations, such as: are the high users of primary care also high users of the hospital system? If we provide better access to chronic disease management in primary care are patients less likely to present to hospital? What interactions do patients have, with their General Practitioners (GPs), when they leave hospital?

With big-data technologies linking and advanced analytic capabilities, we could, for example, use pattern mining to quickly identify adverse events that may arise from medical devices or health services, use cluster analysis to assign patients to like groups—for example, identifying groups with diabetes or cardiovascular conditions that may be amenable to policy intervention and then model the impacts of those imperfections, in terms of costs and patient outcomes. We could use pathways analysis to investigate how patients—for example, cancer patients—are moving through the health system and model the impact of policy interventions targeted at improving these pathways. These are just some of the tools that could be used when forming government decision making and the work of researchers.[31]

International standing

2.27      The Australian experience stands in stark contrast to those of other developed economies that have already liberalised their use of administrative data. In 2013 the Productivity Commission reported that:

In Denmark, Sweden, Finland and the Netherlands, linked administrative data are accessible for research purposes. Statistics Finland considers that statistics should be compiled from administrative records whenever possible — around 96 per cent of its data come from these sources. This openness promotes research — ‘microsimulation specialists pour into Nordic countries because of their liberal approach towards sharing statistics’...[32]

2.28      Meanwhile, Australian researchers, frustrated at the relative inaccessibility of Australian datasets, are choosing to use datasets from other countries. For instance, Professor Philip Clarke, Professor of Health Economics at the University of Melbourne, informed the committee:

Other countries have very good datasets. I have done work with Scandinavian registries in diabetes. They make those available... I am currently building a cardiovascular health policy model with funding from the NHMRC [National Health and Medical Research Council], but explicitly in my application I said I would be using New Zealand data, because there was no appropriate Australian data. I am able to work with researchers at the University of Auckland. There are half a million clinical records with cardiovascular patients that have had their cardiovascular risk assessed. Those have been linked to hospital records and medical records, and I am able to work with researchers almost immediately to start analysing that. I would be dreaming if I thought that could happen in Australia within the next few years.[33]

2.29      Australia is missing out on important opportunities to identify health risks for our own population because Australian Government datasets are inaccessible. This is particularly the case with pharmaceutical safety. Professor Sallie-Anne Pearson, Head of the Medicines Policy Research Unit at the Centre for Big Data Research in Health, noted that data inaccessibility has meant that medicine safety research is not commonly undertaken in Australia:

...fewer than 30 studies have examined drug safety in the last 25 years. This needs to change. Australia is actually well-placed to deeply understand our return on PBS investment, and also other health programs. The data already exists. We have information that covers our entire population.[34]

2.30      The lack of research is surprising when there are 190 000 hospitalisations caused by medications in Australia every year at a cost of $660 million to the health care system.[35]

2.31      Witnesses told the committee that Australia could safely exploit the existing PBS data for the benefit of Australians. Dr Barbara Mintzes, Senior Lecturer in Pharmacy at the University of Sydney, informed the committee of the approach of several other developed countries:

The experience to date in Canada, the US, the UK and Scandinavia makes it clear that these databases are important tools for medication safety and protection of public health.[36]

2.32      In some cases Australia has been collecting data for years, but because the data is not fully utilised its collection is rendered fruitless. As Professor Fiona Stanley identified:

My biggest anguish has been that over 30 years of setting up a birth defects registry to find the next thalidomide, another one could be happening all the time and we are unable to detect it.[37]

2.33      In 2015 the Productivity Commission attempted to articulate why Australia was falling behind other developed countries in releasing administrative data. In its Efficiency in Health research paper the Productivity Commission suggested several reasons including:

2.34      The Productivity Commission concluded that:

The potential of administrative data is not being realised in Australia, and the lost opportunities will only grow as technology continues to open up new ways to use and analyse data. Calls to release and better link administrative datasets have been made previously by the Commission and by others.[39]

Committee view

2.35      The evidence heard by the committee and received in submissions suggests that Australia has significant health data assets and medical research capabilities. The evidence also clearly demonstrates that in comparison to other countries Australia is failing to capitalise on its data potential.

Recommendation 1

2.36      The committee recommends that Australia forms partnerships with other countries engaged in data linking to ensure that Australian data access and linkage policies and regulations are developed to world's best practice.

Australian framework

2.37      As the Productivity Commission and other experts have noted, the factors that are holding Australia back are largely barriers erected by the legislative framework or its application by the public service. The blockage is not in technical expertise or infrastructure. Australia has a world leading data linkage system and many talented researchers and academics in the field.

Experience and history

2.38      Australia's modern data linkage capacity dates back to 1995. Before this time some statistics were collected but, as Emeritus Professor D'Arcy Holman, formerly a Professor of Public Health at the University of Western Australia, noted, 'what we could do with health statistics...was severely constrained by the technical infrastructure available to us'.[40]

2.39      That changed in 1995 when the Western Australian Data Linkage System (WADLS) was established.[41] The formation of the WADLS allowed population health researchers to:

over 30 pre-existing health databases on the people of WA. The links mean that the journeys of individuals through the health system can be followed anonymously over many years and thus their risk factors for major diseases, and the use and outcomes of health services can be evaluated using anonymous information.[42]

2.40      More information on the change in the use of technology and how improvements in technology are being used to protect privacy can be found in Chapter 3.

2.41      At the Australian Government level there is a restriction on who can perform the data linkage function. The Australian Government requires that only certain accredited 'integrating authorities' may link Australian Government data. More information on integrating authorities can be found in Chapter 3.

2.42      Each State and Territory either has its own data linkage unit or is associated with a data linkage unit.[43] In 2004 the Australian Government established the National Collaborative Research Infrastructure Strategy (NCRIS). Through NCRIS the government provided $20 million to establish the PHRN.[44] The PHRN is a national network that works to support collaboration between data linkage units and further Australia's linkage potential.

State / Commonwealth divide

2.43      Witnesses told the committee that Australia's federal constitution contributes to its data challenges. As Emeritus Professor Holman noted:

Australia differs from other federations, Canada for example, in that our [Australian] Government has not directed its financial support for these integral components of health care through the states, but has established itself as a separate vertical player.[45]

2.44      This State / Commonwealth divide means that the Australian Government collects primary health and aged care data whilst the States collect hospital, births, deaths and cancer information. A list of the Australian Government's major health related data holdings can be found in Appendix 4.

2.45      One of the challenges to sharing data between the Australian Government and the States and Territories has been a reticence by Australian Government departments to release data based on privacy concerns. Ms Alanna Foster, First Assistant Secretary of the Department of Health, insisted that 'due to the separate legislative requirements, it can be challenging to link these datasets while also adhering to strict privacy guidelines'.[46]

2.46      One of these privacy guidelines requires that MBS [Medicare Benefits Schedule] and PBS data cannot be linked and another requires that Australian Government data linkages must be destroyed at the conclusion of the project.[47] These two restrictions will be considered in greater detail in Chapters 3 and 4 respectively.

2.47      Despite these restrictions, Professor Clarke told the committee that 'there have been linkages but they tended to be sporadic'.[48]

2.48      However, Emeritus Professor D'Arcy Holman described the period between 2007 and 2012 in Western Australia when 'things were different'. This was because, as Emeritus Professor Holman recalled:

The two separate information systems [the Australian and Western Australian] were permitted to talk one with the other.

A short reprieve of different senior administration in the [Australian Government] led to a collaboration with the State to include the Medicare, pharmaceutical and aged care data within the WADLS system. This was the first and only instance since federation that the [Australian Government] and an Australian State agreed to integrate their data in a functional way to create a total picture of health system performance.[49]

Recent developments

2.49      In late 2015, government attitudes toward sharing data started to change. On 3 December 2015, the Department of the Prime Minister and Cabinet released the Public Sector Data Management Report.[50]

2.50      The report sets out a roadmap towards the regular and systematic release of public sector data and highlights the need to reform certain areas to enable the Australian Public Service to get the most out of Australia's data holdings.[51]  

2.51      On 7 December 2015, the Department of the Prime Minister and Cabinet released the Australian Government Public Data Policy Statement.[52] The statement declares that Australian Government entities will:

2.52      Whilst this was seen as a welcome development, it was a surprise to many non‑government witnesses who told the committee that they had not been consulted and were not aware that the government had been working on the policy statement or the data management report.[54]

2.53      When Ms Helen Owens, Assistant Secretary of the Department of the Prime Minister and Cabinet, was asked who the government consulted, she listed:

...organisations like Telstra, Google, the World Bank, the [Australian Broadcasting Corporation], [software producer] IBM, [software company] SAP. We also spoke with some research institutions—the Grattan Institute and the Crawford school at ANU. [The Department of the Prime Minister and Cabinet] then did some individual consultations with business leaders in the data space and open data space.[55]

2.54      The Office of the Australian Information Commissioner was nominally consulted in the development of both the Public Sector Data Management Report and the Public Data Policy Statement.[56] However, the government did not consult the National Health Performance Authority (NHPA), the National E-Health Transition Authority (NEHTA) or the Australian Commission on Safety and Quality in Health Care in the development of either document.[57]  

2.55      Turning the report and the statement into a reality will take commitment and perseverance, something previous governments have promised in this space but not delivered.[58] As the Productivity Commission stated in its 2012–13 Annual Report:

Realising these goals [harnessing administrative data to support research and evidence-based policy evaluation] requires political will, articulated at the highest levels, to persevere with a concerted strategy with clear timeframes based on the principle that open access to de-identified information should be a default position. Realistically, it could take 5-10 years to rollout and embed systems before the ‘holy grail’ of relatively unimpeded remote access to high quality, de-identified and linked administrative data is achievable.

While there have been announcements and initiatives in the past and more recently, the lack of sustained tangible progress means that it is important that the 5-10 year timeframe does not become a motivation for more ‘false starts’, deferrals or eventual reprioritisation and non-delivery. International practices and over thirty years of experience in Western Australia suggest that the capabilities necessary to achieve a more open data culture could be developed by all Australian governments.[59]

Committee view

2.56      The evidence presented to this committee demonstrates that Australia has the potential to create a world leading data linkage system that can both maintain data security and produce ground-breaking public health research. 

2.57      The committee recognises that linking administrative data, which is already routinely collected, has the potential to reveal new insights about the ways Australians use the healthcare system and potential ways to improve the health outcomes of all Australians. 

2.58      The opportunities Australia is squandering are not just possibilities for health improvements for future generations, but also the ability to detect causes of harm to Australians. The committee has received evidence that Australia could be using its data resources to detect harmful prescription medications both in children and in adults. Instead, Australian researchers are forced to rely on studies conducted in other countries where such drug safety studies are possible. For the benefit of the health of all Australians we can and must do better.

2.59      Improving our data linkage system involves breaking down some of the historical barriers that have resulted from our federated system of government. We have seen in sporadic intervals that such cooperation is possible and can lead to highly beneficial outcomes.

2.60      Australia has the infrastructure and the knowledge to make a national data linkage system work, but it will require legislative changes and cultural changes in the Australian Public Service. These changes could catapult Australia to become a world leader in data linkage. The nature of these challenges will be examined in greater detail in Chapters 3 and 4.

2.61      The committee welcomes the renewed focus on Australia's data assets and is encouraged by the attempt to coordinate efforts across government to make more datasets available. But the committee notes that there is still a long way to go to overcome many of the barriers currently faced by researchers and the valid community concerns regarding privacy.

2.62      The committee further notes that this is not the first time an Australian Government has promoted a more open approach to sharing data. The committee is concerned at the very limited nature of the government's consultation in developing its recent Australian Public Data Policy Statement and its Public Sector Data Management report. In compiling its most recent policies, the government obtained very limited input from key stakeholders, including those funded by the Australian Government. The failure to consult any health professionals made it manifestly clear that the use of health data was not a priority for the government. The committee is concerned by the low regard in which the government seems to hold health data and the research groups that work with it.

2.63      To ensure that the government's newly articulated approach to releasing data maximises Australia's big health data potential, while attending to valid community expectations about security and privacy around personal health data, the government must broaden its data policy engagement to include health-related academics, researchers and practitioners.
