TL;JR: Data Sharing and Release Discussion Paper

This is a collected version of my tweet thread on the Data Sharing and Release Legislative Reforms Discussion Paper that was published on 3 September 2019. You can find the original here.

Introduction and Summary

Well I guess I’m reading this discussion paper now: Data Sharing and Release Legislative Reforms Discussion Paper. May as well do a #tljr as well.

The intro is full of fluff about ‘unlocking potential’. The legislation’s aim is “to streamline and modernise data sharing, overcoming complex legislative barriers and outdated secrecy provisions.”

“The legislation will also allow government agencies to draw on expert advice to assist them to share data safely using contemporary tools and techniques.”

Yeah, they’ve responded really well when experts like @chrisculnane @VTeagueAus and people like me provide advice.

“The data world can be confusing, filled with complicated and often conflicting terminology.” so we’re going to define terms for the discussion paper.

“Public sector data is data held by the Australian government as it fulfils its various functions.” So, any and all of it.

“Data means any facts, statistics, instructions, concepts, or other information in a form capable of being communicated, analysed or processed”

Groundbreaking Meme

“Closed data also keeps the Australian public in the dark about what government does with the data it collects and holds.” Yes, and our ongoing #FOI experience shows that government is not interested in changing that.

“We think government agencies should be able to share information safely and consistently for the benefit of all Australians.”

And I think you should be able to answer the phone, yet here we are.

“The Data Sharing and Release legislation will provide legal grounds to empower the government to share public sector data for specified purposes with the right safeguards.”

Safeguards like the ones the police routinely ignored to access metadata?

All this goes to the heart of the matter: government has demonstrated a systemic inability to do any of this well, safely, or transparently. For decades. There is no credibility here. Credibility is vital for trust to exist.

At 1.3 we talk about benefits. Individually you’ll apparently be able to tell agencies stuff once, and they can pre-fill forms for you (like with myTax).

myTax is pretty good, especially when compared with the situation in the USA where the tax prep software makers have regulatory capture of the IRS.

But as for Tell Us Once: Centrelink can’t manage that inside itself. If you can’t do it alone, how does sharing inaccurate data with everyone else help us?

Errors happen. Data is *messy*. But sure, let’s add distributed systems and cache coherency issues to the challenge of government’s ability to computer. What could go wrong?

The details *matter*. Farting about with theoretical policy ideas from 50,000 feet in the air doesn’t help when agencies think maths is secondary to legislation.

“For Australia’s research sector, the legislation will provide access to data to advance knowledge and create better public policy”

Bullshit. You don’t need more data. Policy is crap because your politics is crap.

There is already plenty of good, robust data that is being systemically ignored: climate change, poverty, housing, monetary/fiscal policy, #robodebt, etc. A lack of data is not the problem.

Other benefits: reducing over-collection of data. This is the Tell Us Once idea duplicated. This is double-counting of the benefits.

“For example, if you report your circumstances to one government agency to receive a service, you would not need to provide the same information again to another government agency to receive other services.”

Similarly, if Centrelink decides you should stop receiving government services, you’ll lose access to *all of them* because the computer said no. Fuck that.

“Increasing the transparency of government operations around the use of public sector data, which improves the community’s trust in the government’s handling of data.”

You could do this now. But you don’t. You do exactly the opposite, in fact: Suppression and secrecy: how Australia’s government put a boot on journalism’s throat

“Minimising the risk of data breaches and the burden of storing duplicate datasets, by allowing government agencies to draw on datasets held by a collecting agency in a federated model, rather than holding multiple versions in each agency.”

This isn’t a terrible idea, in theory, but in practice the security challenge is greater. The theory is that, rather than having loads of copies of data around the place in agencies that suck at infosec, you only have the data in one place, and then guard it better.

But that means either a) a single massive honeypot of data like MyHealthRecord, or b) keeping the data at the existing collection point but providing increased access to it. Increasing the sharing increases the attack surface, regardless of where the data resides.

More stuff on bad public policy somehow being a result of lack of data and not mendaciousness and bastardry.

The Data Sharing Principles looks suspiciously like the Five Safes.

The Five Data Sharing Principles

“We are hoping to shift thinking about data sharing from ‘can I share?’ to ‘how can I safely share?’”

Not “should I share?” Just because you can, doesn’t mean you should.

The government is claiming strong support for an independent oversight role, hence the National Data Commissioner.

“We heard the Commissioner should be empowered to apply strong penalties to intentional or negligent misuse”

I note that the strong penalties aren’t going to be available for anyone to use, such as if a tort of privacy existed. No, you have to rely on a politically appointed Commissioner to act on your behalf.

Independent oversight has worked very well with ASIC oversight of the banks, and with the deliberate underfunding of OAIC, after all. No idea why you’d be cynical.

“There were divergent views on the balance of benefits and risks of sharing and release of public sector data.”

Researchers are dead keen on it, people interested in privacy are not.

Straight after this point is made: “University researchers have formed one of the main groups who have engaged with us.” Uh huh.

In my discussions in these meetings, there are some researchers who understand the privacy and consent issues, but they are a minority, sadly.

The researchers fervently believe they can improve public policy and programs by doing more research. Can someone point to research that supports this position? I mean, you’ve been doing research for decades, so surely there’s lots of good, systemic evidence that more data and research improves things on a public policy front, yes? Should be easy to show me this evidence. You shouldn’t need to rely on Big Data Exceptionalism and handwavey appeals to nebulous future benefits. You should have a solid track record of benefit now. So show me. Since you’re all keen on research and evidence and all. Let’s see it.

“Some are concerned the legislation could provide a blanket override of secrecy provisions without fully appreciating the need for secrecy provisions on a case by case basis.”

That’s what you said it was going to do. It’s not an opinion, you actually said that. [Edit: This is quibbling over the Data Sharing and Release legislation not providing an override of all secrecy provisions. But it is quite literally specifically designed to provide an override to most existing legislation. That is specifically the point of the legislation, and it is explicitly mentioned that this is its purpose in multiple places in this discussion paper.]

“the research sector is concerned secrecy provisions are used by the Australian Public Service to indiscriminately lock up data, restricting uses in the public interest.”

Seeing the trend here?

“We heard broad support for cooperation between the Australian Information and Privacy Commissioner and the National Data Commissioner to address the ‘grey areas’ between freedom of information, privacy and data sharing laws.”

“There were also frequent and recurring debates about de-identification and the difficulty of ensuring information is appropriately de-identified, leading some to suggest the term be retired entirely.”

Well yes. You can’t reliably de-identify unit level data.
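To make that concrete, here is a minimal sketch of why stripping names doesn't de-identify unit-level data. The records, field names, and choice of quasi-identifiers are all made up for illustration and come from no real dataset. It counts how many rows are the only one with their combination of postcode, birth year, and sex. Anyone holding another dataset with those same three fields could link those rows back to a person.

```python
from collections import Counter

# Hypothetical "de-identified" unit-level records: names removed, but
# quasi-identifiers (postcode, birth year, sex) left in. All values invented.
records = [
    {"postcode": "3000", "birth_year": 1980, "sex": "F", "condition": "asthma"},
    {"postcode": "3000", "birth_year": 1980, "sex": "F", "condition": "diabetes"},
    {"postcode": "3000", "birth_year": 1975, "sex": "M", "condition": "depression"},
    {"postcode": "3121", "birth_year": 1990, "sex": "F", "condition": "hiv"},
    {"postcode": "3121", "birth_year": 1962, "sex": "M", "condition": "cancer"},
]

# Assumption: these are the fields an attacker could also find elsewhere.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def quasi_key(record):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def unique_records(rows):
    """Return the rows whose quasi-identifier combination appears exactly once."""
    counts = Counter(quasi_key(row) for row in rows)
    return [row for row in rows if counts[quasi_key(row)] == 1]


exposed = unique_records(records)
print(f"{len(exposed)} of {len(records)} records are unique on {QUASI_IDENTIFIERS}")
```

With real unit-level data there are usually far more columns than three, and every extra column makes more rows unique.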

“There were robust discussions and debate in roundtables about consent.”

Consent of the governed is *vital* in a democracy.

There is apparently a longer discussion about consent in section 4.6.

“We heard concerns relating to Indigenous access to Indigenous data and Indigenous data sovereignty.”

Indeed. Relates to consent and the generally colonial attitudes of those designing these systems.

“Legal and privacy experts were concerned the Five-Safes were not privacy safeguards and privacy should be specifically addressed in legislation.”

Oh wow, they actually listened to that bit?

“We heard this feedback and remodeled the Five-Safes as the Data Sharing Principles.”

Uh, those are just the Five Safes renamed to remove the word ‘safe’.

Broad support for accreditation before you can participate in data sharing, though universities, states, and the private sector “felt accreditation criteria should not be overly onerous” thus missing the point entirely.

“There was strong support from privacy experts, civil society and others for the publication of Data Sharing Agreements to increase the transparency of how public sector data is used.”

What changed since the issues paper?

“data sharing for compliance and assurance purposes will not be allowed under the Data Sharing and Release legislation.”

Good. However: watch for the loopholes that allow agencies to do it anyway, such as how Centrelink does data matching without using TFNs.

“While consent is important in certain situations, the societal outcomes of fair and unbiased government policy, research and programs can outweigh the benefits of consent, provided privacy is protected.”

FFS. Translation: getting consent is too hard, so we aren’t going to try most of the time.

Privacy Back Door

“In response to concerns about overriding all secrecy provisions, the Data Sharing and Release legislation will not compel sharing. Government agencies will be responsible for deciding whether to use the legislation, only if they are satisfied data can be shared safely.”

There’s a lot going on here, so let’s unpack these two sentences.

Data Sharing and Release (henceforth DSR) won’t force agencies to share data with each other. However, my reading of this is that they will be able to use DSR as a backdoor to sharing, thus overriding other legislation, which is the very complaint we all made.

But don’t worry! Agencies will only be allowed to do this if they believe they can share data safely, and they’ve never been wrong about that before.

So basically our concerns here have been ignored, but it’s been dressed up to look like we were listened to.

This is definitely* going to increase public trust for this scheme.

“The National Data Commissioner will not be able to compel or overturn decisions to share or not to share, instead focusing on ensuring that when data is shared, it is done safely.”

Right, so they’ll have no power at all.

“although the Data Sharing and Release legislation does not compel sharing, we will be finalising a list of secrecy provisions to be exempt from the override.”

That’ll be the ASIO exemption, because the spooks *definitely* don’t want to share.

“The list of exemptions will be provided for public consultation alongside the Exposure Draft of the legislation.”

Keep an eye out for who manages to get the right meetings with the Minister.

“The National Data Commissioner will be empowered to advocate for open data, but the legislation will not provide a new legislative authorisation for open data release as we heard the current mechanisms are sufficient.”

Thus highlighting the purpose of this legislation: it’s to remove privacy constraints from government agencies so they can share data about you, but not *with* you, oh no. You’re not involved. Researchers and agencies know best, so shut up and bask in our paternalism.

A New Regulatory Framework

“There are two main classes of information that will likely be exempted from the scope of the legislation:
• information collected or held by the National Intelligence Community
• information provided under the My Health Record scheme.”

But the DSR remains a backdoor to sharing, as this flowchart demonstrates:

A flowchart of how the DSR backdoor will operate.

“The Data Sharing and Release framework will provide an alternative pathway for government agencies who want to share data. It removes the need for lengthy legal processes to establish authority,”

Because fuck the law, amirite?

So in essence, rather than do the hard policy work of looking at all the existing legislation in detail, and understanding why existing safeguards exist, then coming up with new, unified legislation, DSR just monkey-patches the system to add a backdoor.

This is public policy born of laziness and arrogance.

Sharing Data for Public Benefit

An example provided for the benefits of data sharing is school funding. Apparently the fact that private schools can build a third equestrian centre and poor kids can’t afford pencils is because government hasn’t data-linked parents’ income data.

“This enables Government to deliver a much fairer allocation of funding to schools in the future.”

Define ‘fair’.

No, seriously, the issue of what ‘fair’ means is actually really, really important especially when it comes to automated decision making.

The spectrum of what the legislation will allow on day 1. Note, for scope creep reasons, how close the ‘assurance’ section is.

A roadmap for scope creep.

For service delivery, there’s more detail on the Tell Us Once idea, which is a fine idea.

“As yet we have not finalised our position on commercial use of public sector data.” Well that’s ominous. #tljr

“Private sector will not be able to access public sector data for activities prevented under existing laws.”

I’m not that confident, actually. The conflicting law potential here is substantial, and we’ll need to see the draft legislation itself to work this out.

“A modernised risk management framework will safely unlock greater benefits of public sector data for the public good.”

A fine aspiration unsupported by any recent evidence government can actually do this.

Strengthening Safeguards

“4. STRENGTHENING SAFEGUARDS” by providing a way around them. Seems legit.

“The approach we have adopted builds on the internationally recognised Five-Safes Framework.” Ah yes, the one that isn’t designed for this situation, yes.

Here are the Data Sharing Principles.

The five data sharing principles in a bit more detail.

The Five Safes are: Safe projects, safe people, safe settings, safe data, and safe outputs.

So, when they say “based on” they mean “is a verbatim copy of”. [Ed: Not verbatim! They took out the word ‘safe’. All fixed now.]

“Legal and privacy experts were concerned the Five-Safes were not privacy safeguards and privacy should be specifically addressed in legislation. We heard this feedback and remodeled the Five-Safes as the Data Sharing Principles.”

Yet again the government has ignored experts because they told them something they didn’t want to hear, and are going to do what they want anyway, but they’ll spin it as if the consultation had any effect whatsoever.

You mendacious shitlords.

No More Consent

“We propose the Data Sharing and Release legislation not require consent for sharing of personal information.”

This bit on the interaction with the Privacy Act 1988 is weird. “Laws that authorise the government’s actions often use the ‘authorised by law’ mechanisms in the Privacy Act 1988.”

We need to see the actual legislation, because on my reading, the DSR would create the very “authorised by law” parts that would allow sharing under the Privacy Act, thus rendering the Privacy Act moot.

But this is being spun as if the DSR will somehow not water down or override the protections in the Privacy Act. I mean, apart from the fact that the protections in the Privacy Act are almost non-existent in practice, yes it bloody well will!

“Requiring consent for all data sharing will lead to biased data that delivers the wrong outcomes.”

In very specific cases, yes. There are ways to design a study that overcome these issues, however.
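A rough worked example of both halves of that, with entirely invented numbers. Two hypothetical groups consent at different rates, so a naive average over consenters is skewed towards the group that says yes more often. One well-known correction, which I've picked purely for illustration and which the paper doesn't mention, is post-stratification: reweight each group by its known share of the population. It only works if consenters look like non-consenters within each group, and this sketch simply assumes that.

```python
# Hypothetical strata: every number here is invented for illustration.
# Each group has a population size, a consent rate, and a mean outcome.
strata = {
    "group_a": {"population": 800, "consent_rate": 0.5, "mean_outcome": 10.0},
    "group_b": {"population": 200, "consent_rate": 0.1, "mean_outcome": 50.0},
}

total_population = sum(s["population"] for s in strata.values())

# Ground truth across everyone, whether they consented or not.
true_mean = (
    sum(s["population"] * s["mean_outcome"] for s in strata.values())
    / total_population
)

# Naive estimate: average over whoever consented, so the group with the
# higher consent rate dominates the result.
consenters = {name: s["population"] * s["consent_rate"] for name, s in strata.items()}
naive_mean = (
    sum(consenters[name] * strata[name]["mean_outcome"] for name in strata)
    / sum(consenters.values())
)

# Post-stratified estimate: weight each group's mean by its known population
# share. Assumption: within a group, consenters resemble non-consenters.
post_stratified_mean = sum(
    (s["population"] / total_population) * s["mean_outcome"]
    for s in strata.values()
)

print(f"true mean:            {true_mean:.2f}")
print(f"naive consenter mean: {naive_mean:.2f}")
print(f"post-stratified mean: {post_stratified_mean:.2f}")
```

The naive figure lands well below the true mean because group_b barely consents; the reweighted figure recovers it, but only under the stated assumption.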

A brief interlude…

On this point: I had a discussion with a researcher whose opinion I respect (who I am keeping anonymous until I check that they’re okay with being identified here), and my point here is not to imply that situations where gaining consent is impossible do not exist (they do, e.g. if a person is already dead). My contention is that these situations are rare. For the vast majority of research, consent is very much possible. It may be difficult or expensive, but that’s not the same thing as impossible.

Practically impossible is not the same as impossible, either. That’s a function of the constraints you operate under. An example provided by this very researcher is that, for a very large administrative dataset it might be prohibitively expensive for the researcher to obtain consent themselves. But that’s not the only way: the government agency could seek consent on the researcher’s behalf.

Far too often researchers throw up their hands at the first sign of a consent hurdle, declaring consent practically impossible to obtain when it’s really a failure of imagination on their part. This subordinates informed consent to their desire for data, which is a paternalistic, colonial approach to research.

Anyhow, back to the thread.

“If we required consent, then data would only be shared where consent was given.”

Well yes, that’s the entire point. #tljr

“Rather than take a one-size-fits-all approach, we have taken an approach similar to the European approach in the General Data Protection Regulation (GDPR), which makes consent one of six ‘lawful bases of processing.’”

Okay, but the EU already has lots of other more robust privacy protections in law that we don’t have, and the GDPR also contains a bunch of additional protections we don’t have. Equating GDPR to the Privacy Act 1988 is misleading. I believe deliberately so.

The right approach here is to require a baseline of consent, and provide an exemptions mechanism where consent is challenging for specific reasons. The better Research Ethics Committees already do this.

But instead of doing that, the government is flipping it around to say “consent is challenging in a couple of edge cases, so let’s just abandon consent altogether”.

Building Trust Through Transparency

“Data Sharing Agreements will be a requirement for all data sharing under the Data Sharing and Release legislation.” There will be a searchable public register of these things.

“Under the Notifiable Data Breaches Scheme individuals will continue to receive notification of any real or suspected data breaches involving personal information so they can take action to mitigate the risks.”

That might sound reasonable, but…

This is government administrative data. It’s collected by your interaction with government. It’s not like you can change government like you can change bank if they fuck it up.

What, exactly, are you supposed to do to “mitigate the risks” if Centrelink ‘accidentally’ overshares with the police?

“Binding guidance issued by the National Data Commissioner will be in the form of Data Codes that are legislative instruments.”

Binding… how? What happens when they’re violated (and they will be)?

Aha, section 7: When Things Go Wrong.

When Things Go Wrong

“The Data Sharing and Release legislation will provide an alternative pathway to share data, where it is currently prevented by a secrecy provision or where it is simpler than existing pathways.”

It’s a backdoor.

“If data is shared for purposes that are not authorised, or if safeguards are not applied correctly under the Data Sharing Principles, the Data Sharing and Release legislation authority will fall away and the original offences and penalties will apply.”
“We are calling this the ‘rebound approach’”.

There will be some new offences created! “unauthorised sharing, release and use of data” is one. How that will work when so much becomes tacitly authorised is a bit weird.

There will not be strict liability for breaches. Imprisonment or 60 penalty units is apparently the threshold for “too much” according to the Attorney General. I think the phrasing here is a bit odd.

“Penalties will not be strict liabilities to ensure benefits of data can be realised through a culture of responsible data sharing.”

I agree this is the right setting to promote a safety culture.

“We found including new defences could water down existing penalties in the rebound approach and could lead to negligence in applying safeguards when sharing data.”

Good.

Ah, but here’s the rub: determining breaches and imposing penalties will be a function of the National Data Commissioner. No individual right of action.

“the Office of the National Data Commissioner will apply a graduated enforcement approach that applies proportional responses that are likely to deter future non-compliance.”

The graduated response framework that the National Data Commissioner will apparently use.

“The model relies on the National Data Commissioner’s discretion and its effectiveness will be contingent on the National Data Commissioner’s willingness to escalate matters when necessary.”

and the footnote points at the Royal Commission into the banks. lol.

“The Data Sharing and Release legislation will not provide for merits review of data sharing decisions by Data Custodians” which is only the most important part of the sharing. *sigh*

So it’s very unclear what kinds of conduct the NDC will actually have oversight of. It looks to me like this is constructing a parallel bureaucracy to all the existing ones, and the enforcement will be of niche bureaucratic rule breaks rather than real privacy breaches.

“In early 2020, we will consult again on an exposure draft of the legislation”

In summary, there are some good bits in there, but lots of it ignores the feedback from experts about the dangers, because taking their advice would slow things down.

Here endeth the #tljr for today.
