Census 2016: Arguments For And Against Name/Address Retention

The Village Lawyer, c. 1621, by Pieter Brueghel the Younger

The Australian Bureau of Statistics has made a right hash of the decision to retain name and address data provided in this year’s Census for up to four years.

The ABS has repeatedly hand-waved away specific concerns raised by experts in privacy, information security, and statistical research who rely on Census data, while simultaneously claiming we have nothing to worry about. Various supporters of the ABS’ position have used similarly thin justifications, and waved away concerns and criticisms as unhinged, conspiratorial, and overblown.

What hasn’t been attempted — by the ABS or anyone else, really — is to make a clear and cogent argument for why retention of the name and address data is so important, and worth it.

So I’m going to attempt it.

I’m going to try very hard to take the side of those who want to keep names and addresses, and to argue logically from facts. I’m not trying to build a strawman that I then dismantle. I want to build a strong summary argument; the kind that I wish the ABS had made in the first place.

Why? Because the census is important, as I will cover. I am a big believer in it, and until this year was a very big fan of both the census and the ABS more generally.

Why A Census Is Important

That the Census is important and worthwhile isn’t the issue here, but let’s quickly recap some of its most important points.

Firstly, and possibly most importantly, the Census is central to a representative democracy. Representation in the House of Representatives is based on how many people live in a given area, so we need to count them periodically to keep the tally up to date.

The number of people in an area, such as a state or electorate, is also used for divvying up various common resources, like tax dollars towards things like roads, healthcare, schools, and so on. We need to know how many children are in a given area so we can make sure there are enough schools, for example.

Other demographic information is also important for making policy decisions. For example, how many people live in cities versus rural areas? What is the literacy rate for English? What is the distribution of ages in a given area? How does the proportion of Indigenous or Muslim people in the population compare to the proportion in Parliament, or in upper management of corporations?

Note how these are all aggregate statistics. They say nothing about individuals.

They also provide data at a point in time: when the census is taken. Multiple snapshots, one at each census, can show us how things change over time (trends) but again, only at an aggregate level.

That’s really, really useful all by itself, because we can track things like literacy rates, life expectancy, rates of disease, and so on. We don’t need names for any of this.

We do need addresses, at some level of granularity, or we won’t know where in Australia people are. Your specific address doesn’t matter once we’ve counted the number of people in Victoria (for example), which is why after the initial processing, the address can be discarded.

Longitudinal Studies

Unlike snapshots (or cross-sectional studies), longitudinal studies can provide more detailed information about how things change over time.

A longitudinal study looks at the same group of people at each sample time, so you can track what happened to a specific group of people, rather than a general class of people. One great example of how this is useful is in studying poverty. If the poverty rate in the 2006 Census was 10%, and at the 2011 Census it was still 10%, we don’t know much about why. It could be that the same 10% of people were poor both times, or that a different 10% of the population is poor at any one time.

A longitudinal study would help you figure out which is the case, and that should result in different policies. If it’s always the same 10% of people who are poor, you can direct your efforts to just that 10% of people, and not waste your effort on people who don’t need it (to oversimplify a bit). If 10% of the population generally are poor all the time, then helping people is a more general problem, and requires a different approach.
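To make the difference concrete, here is a minimal sketch in Python using made-up data (100 hypothetical people, nothing from any real census). Both scenarios show the same 10% poverty rate in each snapshot, and only tracking the same individuals across both waves tells them apart. The function names are my own, chosen for illustration.

```python
def poverty_rate(panel, wave):
    """Share of people below the poverty line in one census wave (0 or 1)."""
    return sum(person[wave] for person in panel) / len(panel)


def persistence(panel):
    """Of the people poor in the first wave, the share still poor in the second."""
    poor_first = [person for person in panel if person[0]]
    return sum(person[1] for person in poor_first) / len(poor_first)


# Each tuple is one person: (poor at first census, poor at second census).
# Scenario A: the same 10 people out of 100 are poor both times.
same_people = [(True, True)] * 10 + [(False, False)] * 90
# Scenario B: 10 people are poor each time, but a completely different 10.
different_people = (
    [(True, False)] * 10 + [(False, True)] * 10 + [(False, False)] * 80
)

for label, panel in [("same people", same_people),
                     ("different people", different_people)]:
    print(label, poverty_rate(panel, 0), poverty_rate(panel, 1), persistence(panel))
```

The two snapshot rates are identical in both scenarios. The persistence figure, which can only be computed when records for the same person are linked, is what separates them.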

But if we don’t keep people’s names and addresses, we have no way of knowing who filled out which set of data for any given Census, and can’t do longitudinal research. If we have your name and address, we can. That’s why in the Census we want to ask for your name and address now, and also five years ago, because then we can look up where you were last time, and match your answers to this time, and see if you’re still below the poverty line or not.

What About Privacy?

The ABS has a lot of very sensitive information about every Australian, so it’s vital that the information be kept safe. Since the very beginning, the Census and Statistics Act has made the privacy and security of your personal data very clear. It is expressly forbidden for the ABS to share personally identifiable information about anyone. There are substantial penalties for anyone at the ABS misusing data, or providing information to anyone else, including any other government department, including ASIO and the ATO.

In fact, back before World War II, a government official attempted to get personal wealth details out of the ABS, but the Statistician at the time refused. The ABS has a long history of defending the privacy of those who fill out the Census, and this speech by Dennis Trewin to the National Press Club highlights just how important the ABS considers its role.

But researchers need to be able to look at data, or they can’t do their job. They don’t get to see individual responses. They only get to see aggregate data, and the ABS employs a variety of techniques to foil attempts at unmasking individuals. Sometimes data is randomly scrambled, which doesn’t affect the aggregation too much (because the random changes largely cancel out across many records), but it hides the individual responses. Other times the data is combined with a minimum number of other answers, again to hide individuals while still providing useful summary statistics.
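Here is a rough Python sketch of those two ideas: adding random noise to individual values, and suppressing any group that is too small. It is an illustration only. The noise range, the minimum group size, the area names, and the function names are all assumptions of mine, and this is not a description of the ABS’s actual confidentiality methods.

```python
import random


def perturb(values, max_shift, rng):
    """Add small zero-mean random noise to each individual value."""
    return [v + rng.uniform(-max_shift, max_shift) for v in values]


def safe_counts(counts, minimum):
    """Replace any count smaller than `minimum` with None (suppressed)."""
    return {area: (n if n >= minimum else None) for area, n in counts.items()}


rng = random.Random(2016)  # fixed seed so the sketch is repeatable

# 10,000 made-up incomes, then a noisy copy of each one.
incomes = [rng.randint(20_000, 150_000) for _ in range(10_000)]
noisy = perturb(incomes, 5_000, rng)

true_mean = sum(incomes) / len(incomes)
noisy_mean = sum(noisy) / len(noisy)
print("difference in means:", abs(true_mean - noisy_mean))

# A group of 3 people is small enough to risk identifying someone.
print(safe_counts({"Area A": 1200, "Area B": 3}, minimum=10))
```

Each individual income can be off by up to $5,000, yet the average barely moves, because the positive and negative shifts mostly cancel out across thousands of records.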

Because the Census isn’t about individuals. It’s about all of us.

In the modern, computerised age, additional care needs to be taken. It’s far harder to make off with a million paper forms that you have to physically steal from a warehouse than it is to silently infiltrate a computer system and download them all.

The ABS employs a variety of well-tested techniques to protect the computer systems that house sensitive data. There are the usual principles of keeping systems maintained, ensuring only authorised people have access, using encryption and so on. The ABS has been independently audited by the Australian National Audit Office and found to have a very high level of internal security. But the ABS goes much further.

Not all of the data is kept in the one place, which is called the separation principle. This guards against an intruder making off with all the data in one go. If someone did get in, which is extremely unlikely but cannot be deemed impossible, this reduces the risk they can make off with all of the data. The idea is to make the task so difficult that even if someone were to partially succeed, they would be caught in the act.

Keeping names and addresses increases the risk to privacy, which is why the ABS has added even more layers of security to keep the data safe. Name and address data is kept completely separate from all the other data (the separation principle again). In order to do longitudinal studies, this name and address data needs to be linked with different sets of data. How do we do that without violating the separation principle?

The ABS uses a special technique called linkage keys. These are special fields created from names, addresses, and some other fields like date of birth or sex, to create a code that is used instead of the original data. The way the code is created makes it difficult to reverse back to get the original information (similar, but not identical to a hash function). The keys are created on the name and address data by one person, who is granted access to that sensitive data, but they are not allowed to see any other data. A different person then uses the linkage key to match individual responses on a different dataset, ensuring that no one person is given access to all of the data at any one time, but data can still be linked together.
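To show the general shape of the idea, here is a minimal Python sketch. I am using a keyed hash (HMAC-SHA256) as a stand-in, which is an assumption on my part: as noted above, the real technique is similar but not identical to a hash function. The secret, the field names, and the sample records are all hypothetical.

```python
import hashlib
import hmac


def normalise(value):
    """Lower-case and collapse whitespace so trivial differences don't matter."""
    return " ".join(value.lower().split())


def linkage_key(secret, name, address, date_of_birth, sex):
    """Turn identifying fields into a code that is hard to reverse."""
    message = "|".join(normalise(f) for f in (name, address, date_of_birth, sex))
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def make_keys(secret, identity_rows):
    """Step 1: done by someone who may see names and addresses, but nothing else."""
    return {
        row["record_id"]: linkage_key(
            secret, row["name"], row["address"], row["dob"], row["sex"]
        )
        for row in identity_rows
    }


def link(content_a, content_b):
    """Step 2: done by a different person, who sees keys and answers, never names."""
    by_key = {row["key"]: row for row in content_b}
    return [(a, by_key[a["key"]]) for a in content_a if a["key"] in by_key]


secret = b"hypothetical-project-secret"

census_identity = [
    {"record_id": 1, "name": "Jane Citizen", "address": "1 Example St",
     "dob": "1980-01-01", "sex": "F"},
    {"record_id": 2, "name": "John Smith", "address": "2 Sample Rd",
     "dob": "1975-06-30", "sex": "M"},
]
survey_identity = [
    {"record_id": "s9", "name": "jane  CITIZEN", "address": "1 example st",
     "dob": "1980-01-01", "sex": "F"},
]

census_keys = make_keys(secret, census_identity)
survey_keys = make_keys(secret, survey_identity)

# The linker only ever receives rows like these: a key plus non-identifying answers.
census_content = [
    {"key": census_keys[1], "income": 45_000},
    {"key": census_keys[2], "income": 60_000},
]
survey_content = [{"key": survey_keys["s9"], "health": "good"}]

linked = link(census_content, survey_content)
print(len(linked), "record(s) linked")
```

Exact matching after tidying up case and spacing is a big simplification. Real record linkage has to cope with typos, nicknames, and people who have moved house, which this sketch ignores.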

It’s a little complex, but those are the lengths the ABS has to go to in order to keep your data safe.

Why Retain For So Long?

Why not just create a single master linkage key, and then throw away the names and addresses? While that would simplify the linking process, and mean we didn’t need the name and address data for as long, it would create some major problems that the ABS cannot support.

Each dataset to be linked doesn’t have the same key in it. That would require every person in Australia to be issued with a single, constant identifier that is used on all datasets. The risk of having that linkage identifier exposed is just too great. The idea has come up before, most famously as the Australia Card, and it was rejected for a host of very good reasons.

Instead, each dataset keeps its own set of fields and a way of identifying people, independently of each other. There is no over-arching government way to track an individual across all the datasets held by government.

If we want to link datasets together, we need to create a special linkage key just for that purpose. In order to create that linkage, we need identifying information: names and addresses. By keeping names and addresses for four years, we can create linkage keys for up to four years where there are valid research projects that require linking.
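A short Python sketch shows why a purpose-specific key needs the original identifying details each time. Again, the keyed hash, the secrets, and the sample identity string are my own hypothetical stand-ins rather than the ABS’s actual method.

```python
import hashlib
import hmac


def project_key(project_secret, identity):
    """Derive a linkage key that is only meaningful within one project."""
    return hmac.new(project_secret, identity.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical, already-normalised identifying details for one person.
identity = "jane citizen|1 example st|1980-01-01"

key_health = project_key(b"hypothetical-secret-for-health-study", identity)
key_education = project_key(b"hypothetical-secret-for-education-study", identity)

# Same person, different projects: the two keys do not match each other,
# so neither works as a universal identifier across datasets.
print(key_health == key_education)
```

Because the key from one project is useless for another, each new linking project has to go back to the names and addresses to build its own key. Once those are destroyed, no new keys can be made.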

Why four years? We believe it’s a reasonable trade-off between the number of worthwhile research projects that normally occur and the risk of hanging onto personally identifiable data, given all of the additional steps we’ve taken to protect it.

Argument Against

That took a bit longer than I’d expected.

There are two major arguments against name and address retention. The first is security, and the second is privacy and consent.

Security

Security of the data cannot be guaranteed. It is not possible for the ABS to rule out the data being misused, stolen, or exposed.

We know that the ABS has suffered at least 14 data breaches since 2013. One ex-employee was sent to jail for misusing ABS data for insider trading.

Still, this wasn’t Census data, and the ABS has an admirable track record of protecting census data. The ABS is more likely to be able to protect this sensitive data than your mate Dave at IT-for-Less Inc., but that’s a function of practice and resources, not some kind of innate immunity to being hacked.

Success in the past is not a guarantee of success in the future. I’ve never had Hepatitis C, but that doesn’t mean I can’t get it.

And we have loads of examples of data breaches, some of very sensitive data that we would expect was well protected. I can’t imagine that the NSA really wanted Snowden to make off with all those documents, and the Office of Personnel Management held people’s fingerprints for pity’s sake!

The easiest way to protect people’s data is to not collect it in the first place. If the data is collected, it will probably be misused or exposed at some point.

We also have the ABS at first claiming they were in the very top Cyber Secure Zone when they plainly weren’t, and then quietly removing the references when people noticed. The response I got when I asked them about the claims was (in part):

“The ABS has improved its compliance and security since the ANAO audit, and continues to do so through a rolling program of security projects. The ABS has updated its online information in relation to this audit. The ABS network is subjected to regular independent security testing and audits, and been found to be highly resilient to external attacks.”

I was also alerted by someone on Twitter, and then verified myself, that the ABS emails you a server-generated password tied to your Census supplied login ID if you opt to save a partially completed Census form. In plain text. Sending plain text passwords in email is known to be bad for a host of reasons.

Information security is very hard to do well. These examples undermine the credibility of the ABS that they really are as good at information security as they claim. It’s not just the ABS either, as the ABS has contracted IBM to supply the online census system, so we need to trust IBM’s information security practices as well as those of the ABS.

The ABS is critically reliant on people’s trust in order that they provide accurate and truthful answers in the Census. Its actions thus far have unfortunately undermined that trust for a great many people.

So we’re now down to a trade-off between the value of keeping the data, and the risk and impact of it being stolen. That’s a really difficult calculus to attempt, and the ABS hasn’t really done this in any significant detail. At least, not yet.

If you are a judge, policeman, or member of the military, the impact of having your personal details exposed would be substantial.

If you are a woman with an abusive ex, a witness to a major crime, or a persecuted minority, the impact of personal details leaking could be devastating.

How do you even calculate the upside versus downside here? And how can the ABS make that decision on behalf of vulnerable people?

Which brings us to privacy and consent.

Privacy and Consent

Australians have a legal right to privacy, but it’s a bit of a fluid concept that depends on context. The right to privacy is balanced against other rights, responsibilities, and duties.

Once again, we need to make a trade-off between individual privacy and the benefits to society or other groups from a lack of individual privacy.

Here, we have a risk to our personal security by giving up our privacy, as discussed above. That implies that the benefits from a lack of privacy for individuals must be pretty substantial. The ABS has not made the case for people to give up their privacy voluntarily. It is being compelled by force of law and the threat of prosecution for a crime, as well as open-ended fines. That’s pretty heavy handed stuff given that not voting, which I would argue is a greater personal duty than participation in the Census, carries a maximum fine of $180.

There’s also the issue of informed consent. Informed consent is a vital ethical consideration for any research involving humans, as all university ethics committees know.

The ABS is proposing that individuals will be linked to other, as yet undefined datasets. It is not possible to give informed consent to undefined future research. Now, sometimes consent is not possible, but where it is, it should be attempted. The ABS here is compelling people to give their consent, up-front, to future unnamed research projects that use their private, personally identifiable information.

This is another major expansion in the purview of what a census is supposed to be about. What is supposed to be a periodical cross-sectional snapshot of the population — counting people — has turned into an undefined number of perpetual longitudinal studies carried out on every individual in Australia without their consent.

Now maybe, maybe, the benefits of this situation are worth it. What are the benefits from these longitudinal studies? How is their worth measured? I just don’t believe the ABS has made the case other than to broadly assert that there are benefits.

I’m willing to look, but the ABS doesn’t seem willing to articulate these purported benefits and clearly demonstrate that it is aware of the risks and show that the benefits outweigh them.

The Eigencast 019: DigitalOcean

The Eigencast

Ben Uretsky, CEO and Co-Founder of DigitalOcean

Justin talks to Ben Uretsky, CEO and Co-Founder of DigitalOcean.

They talk about how DigitalOcean is one of the only pure play cloud providers in the world, and how it wants to be the default provider for Software-as-a-Service companies worldwide.

They discuss containers, and how we’re about to make a bunch of mistakes with them. “If we thought that VM sprawl was bad,” Uretsky says, “now container sprawl is even worse.”

They also cover how maybe servers and storage, and maybe even network, is possibly an out-of-date way of thinking about how to build applications. Maybe service composition is the way of the future? But then, what do the services run on?

Finally they cover the world of Open Source, and what it takes to make money when you give away your code for free.

Chapters

  • 00:00:00.000 Intro
  • 00:00:15.856 Episode Intro
  • 00:02:50.933 Interview
  • 00:07:16.121 Why Add Block Storage?
  • 00:13:43.070 What Makes Your Community Special?
  • 00:17:53.211 Word Of Mouth
  • 00:20:25.081 Software Defined Control
  • 00:22:36.419 Autonomous Systems Are The Future
  • 00:25:28.662 Levels of Abstraction
  • 00:30:01.912 Open Source World
  • 00:33:25.127 Community
  • 00:36:14.179 Google Platform For All?
  • 00:38:43.855 Closing Remarks
  • 00:39:45.916 Outtakes

Sponsors

This episode of The Eigencast was sponsored by PivotNine. Research, analysis, advice.

The Australian Census 2016 Controversy

Photo by willhowells (via @Flickr)

The Australian Census for 2016, which is run on 9 August this time around, is a little different to previous censuses. This year, the Australian Bureau of Statistics (ABS) has decided that name and address information will be retained longer than ever before. Instead of being kept for the roughly 18 months it takes to process all the forms and data before being destroyed, name and address data will instead be kept for up to four years.

A bunch of people aren’t overly happy with this plan, and here’s a sample of the coverage the issue has been getting in recent days:

There’s an awful lot of misinformation and scaremongering going on, and the Australian Bureau of Statistics (ABS) isn’t helping matters with its “Just trust us, okay?” approach to public relations. Here are my thoughts and links to information I’ve found in my own attempt to understand the situation.

Background

On 11 November 2015, the ABS published a Statement of Intent to conduct a Privacy Impact Assessment on the retention of names and addresses from responses to the 2016 Census on its website. This method of announcing the proposal is worth noting for later. The general tone of the Statement is also worth noting.

Almost no one noticed the Statement and the media alert the ABS sent out. There were two media mentions (as listed in the Privacy Impact Assessment itself): one in PSNews (Independent News for the Australian Public Service), and one in iTNews. Both are pretty vague on the privacy implications, and the piece in iTNews made no mention of the fact that the ABS was seeking input from the public, or that the deadline for responses was a mere three weeks away. (Disclosure: I write pieces for iTNews from time to time, and its sister publication CRN, and am a member of the iTNews advisory board)

The Privacy Impact Assessment itself provides a bit more detail on why the ABS wants to retain names and addresses:

The ABS is now considering the retention of names and addresses from the 2016 Census as a key enabler to meet the growing stakeholder demand to provide a richer and more dynamic statistical picture of Australia through the integration of Census data with other survey and administrative data, the geospatial enablement of that data, and improvements to our household surveys. The retention of names and addresses would also reduce the cost to taxpayers and the burden on Australian households through more efficient ABS survey operations.

The ABS enjoys a generally excellent reputation for both data collection and safeguarding, and also the production of high-quality statistical products. Australians have historically had a high degree of trust in the institution, and that trust has, so far at least, been well placed. I have personally used the ABS website and statistics from it from time to time, and it’s great stuff (if you’re into this sort of thing). The ABS even has data on what people think of them (of course they do!) and you can look at it here. I particularly recommend the summary tables here.

The law safeguarding Australians’ information collected in the Census is pretty robust, with the main statutory protections being in the Census And Statistics Act 1905, the Australian Bureau of Statistics Act 1975, and the Privacy Act 1988. The trend in legislation over the years has been to generally improve individuals’ right to privacy.

The ABS received a pretty high rating from the Australian National Audit Office Cyber Attacks: Securing Agencies’ ICT Systems audit (full report here) of various agencies in 2014. It’s a couple of years old, but it’s reasonable to expect that the situation would have improved, rather than gotten worse, with the level of focus information security has received in recent years and the value of the statistical dataset that the ABS looks after.

Agency Compliance Grade (Source: ANAO Audit Report No 50)

ABS planned state by 30 June 2014 (Blue: observed state 30 Nov 2013, Grey: target state by 30 June 2014) (Source: ANAO Audit Report No. 50)

We can see that the ABS was planning to improve in its IT general controls, if not its implementation of the Top Four ISM strategies and related controls required for entering the Cyber Secure Zone.

Note that the ABS public statement on privacy here says (retrieved on 1 Aug 2016):

“The ABS took part in an Australian National Audit Office cross-agency audit in 2014 on information technology system security against cyber-attacks. The ABS was rated as being in a Cyber Secure Zone (having high-level protection from external attacks and internal breaches and disclosure of information).”

This is incorrect. As we saw in the figure included above, the ABS is in the Internally Secure zone, the requirement for which is defined as:

Reasonable level of protection from breaches and disclosures of information from internal sources—but vulnerabilities remain to attacks from external sources. (Source: ANAO Audit Report No. 50)

UPDATE

I have checked the ABS statement with ANAO, via email through an intermediary (I won’t name them unless/until they say it’s ok) and here is ANAO’s response:

Thank you for your clarification question referencing the ANAO Audit Report No.50 2013–14 Cyber Attacks: Securing Entities’ ICT Systems.

In answer to your questions:
1. The ABS was not compliant with the PSPF and ISM—and was graded as being located in the Internally Secure Zone.
2. The ANAO has not conducted a follow up audit on the ABS since 2014; therefore I cannot validate your statement in your ZDNet article that “… the bureau now claims that it’s rated in the “Cyber Secure Zone”.
3. The audit methodology, grading scheme and ranking is created by the ANAO, and is shared with ASD prior to tabling the report in Parliament. The work conducted by ASD on behalf of government entities is not assessed by the ANAO. Our evidence is based on audit fieldwork and privileged access to artefacts and enterprise systems under the Auditor-General’s Act 1997.

By way of background, the ANAO assessed seven entities in 2013-14: ABS, Customs, AFSA, ATO, DFAT, DHS and IP Australia. In summary, each of the auditees was located in the Internally Secure Zone. The entities had security controls in place to provide a reasonable level of protection from breaches and disclosures of information from internal sources. The preferred state is for entities to be located in the Cyber Secure Zone.

The selected auditees had not achieved compliance with the Protective Security Policy Framework (PSPF) and the Australian Government Information Security Manual (ISM). A copy of this report is available at: https://www.anao.gov.au/work/performance-audit/cyber-attacks-securing-agencies-ict-systems

The ANAO recently published another cyber report titled, ANAO Audit Report No.37 2015-16 Cyber Resilience. Two of the four selected entities achieved compliance—AUSTRAC and the Department of Agriculture and Water Resources. Two entities did not achieve compliance—AFP and Department of Industry, Innovation and Science.

A copy of this report is available at: https://www.anao.gov.au/work/performance-audit/cyber-resilience

The Value of Linking

It is true that having a way of directly linking data from the Census to other datasets, via name and address to determine if the information relates to the same person or not, would be very useful statistically. Without this information, it’s harder to know if changes in a given population are statistically valid. A simple example given in introductory stats classes is a paired difference test, where you survey the same group of people twice. It’s a special case of blocking.

Without name and address data, how can you tell that a given set of census data in 2011 matches the data from 2016? If a person’s income went from $45,000 to $149,000 in five years, but that result was because you linked two different people, then you get bad statistics if this is repeated across a large group. If you don’t do linkages, you can’t say as easily if there really was much of a change in incomes of two groups of people between one census and the next.

That’s a vast over-simplification, but you get the idea.
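For the statistically inclined, here is a small Python sketch of why pairing matters. The incomes are invented (200 hypothetical people who each get a raise of roughly $2,000 over five years), and the function names are mine. It computes the standard paired t statistic, and then Welch’s t statistic, which treats the two censuses as unrelated groups.

```python
import math
import random
import statistics


def paired_t(before, after):
    """t statistic for a paired difference test (same people, measured twice)."""
    diffs = [a - b for b, a in zip(before, after)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


def unpaired_t(before, after):
    """Welch's t statistic, which ignores who is who."""
    standard_error = math.sqrt(
        statistics.variance(before) / len(before)
        + statistics.variance(after) / len(after)
    )
    return (statistics.mean(after) - statistics.mean(before)) / standard_error


rng = random.Random(2011)  # fixed seed so the sketch is repeatable

# Made-up incomes: a wide spread between people, a small raise for each person.
income_2011 = [rng.uniform(30_000, 150_000) for _ in range(200)]
income_2016 = [x + rng.gauss(2_000, 1_000) for x in income_2011]

print("paired t statistic:  ", round(paired_t(income_2011, income_2016), 1))
print("unpaired t statistic:", round(unpaired_t(income_2011, income_2016), 1))
```

The average change is the same either way. With linked records the raise stands out clearly, because each person is compared with themselves. Without linkage, the huge differences between people swamp the small change over time, and the same raise looks like noise.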

Knowing trend information like this is important for a host of public policy reasons. Everything from whether changes in health policy actually work, to knowing if the situation of those in vulnerable populations (the very poor, indigenous people, the elderly with chronic disease, etc.) is getting worse or better.

You can read some examples of what the ABS is able to do with this sort of data in Australians’ journeys through life: Stories from the Australian Census Longitudinal Dataset, 2006-2011.

In short, there are plenty of perfectly good reasons for linking data from one census to another.

What’s the Big Deal?

This is where things get messy.

The creation of the Australian Census Longitudinal Dataset (ACLD) was first proposed back in 2005, and you can read the discussion paper about the issue: Discussion Paper: Enhancing the Population Census: Developing a Longitudinal View, 2006. Importantly, in the very beginning of this paper, the retention of names and addresses was specifically called out as not required.

Under this proposal the ABS would create a SLCD through combining census data over time. The proposal does not require the retention of names and addresses from the census. Names and addresses will continue to be destroyed following processing of the 2006 Census.

Name and address data would be used “during the period of census processing” but the proposal was quite specific about the purposes for which names and addresses would be used. There is also some detailed discussion of how individuals within the ABS would get access to data for linking, and the steps the ABS would take to guard against individuals being identified.

One way to bring the data together over time would be to use personal identifiers, such as name and address information, to bring together records for the same person. However, this would require the retention of name and address information from one census to the next. This raises significant privacy issues. The ABS is not proposing to retain name and address past the time needed to process each census and would not be using name and address information to bring the census data together over time. The ABS will continue its practice of destroying census forms containing names and addresses following census processing, as it has done in previous censuses. (Emphasis mine)

This is a detail that appears to have been missed in recent commentary about the decision to retain name and address information in the 2016 census. The original ACLD proposal was very well aware of the privacy and security implications of retaining names and addresses, and deliberately excluded them apart from very specific cases.

The Privacy Impact Assessment conducted to assess the 2005 proposal to create the ACLD went further, recommending that this specific name and address matching within the census processing time (which was estimated to be a maximum of 15 months, by the way) should be dropped.

Consideration should be given to abandonment of the Census Data Enhancement proposals involving name matching and of reverting to previous ABS practice of confining the use of names during Census processing periods to ABS quality studies only.

What Changed?

Rather than argue specifically about the merits of keeping names versus not, I’d like to look at what changed between 2005 and now. In 2005, the proposal was not to retain names and addresses any longer than normal processing, and the linkages were statistical aggregations for the most part, with some very specific name/address based linkages.

Now, however, the proposal is to simultaneously expand the retention of names and addresses from 18 months (not the 15 originally estimated) to 48 months and also to use them for a much larger range of linkages than what was proposed in 2005.

Why?

Unfortunately, the ABS has been rather vague and evasive about this issue. Unlike the 2005 discussion paper, which was detailed in its review of both the benefits and risks of the creation of the ACLD, the recent names and addresses proposal was, well, brief. My interpretation of the reasons for retaining names boils down to two major things.

Firstly, the ABS is under a lot of pressure to provide better quality data. Linking with more datasets that the ABS already has will allow it to provide more, higher value statistical products. Name and address data will make this much easier to do, and in some cases it might not be possible without name and/or address data.

Secondly, the ABS is simultaneously under a lot of pressure to cut costs. Easier equals cheaper.

Given these two pressures, I can understand why the ABS may want to look at keeping name and address data.

But in 2005, the ABS was well aware of the risks of retaining name and address data. Again, what has changed since then?

In that time, the information security landscape has changed substantially. Both the number and sophistication of malicious actors have increased, and while information security practices have improved, we are also far more aware of just how under-prepared many organisations are. However, the ABS is well practiced at keeping data safe, and has a long and proud history of doing so. It has, in short, been successful. So far.

However, what isn’t clear is that the ABS is better at safeguarding data than it was in 2005. If the risk has increased (which it has, as we just discussed), then the ABS’ ability to mitigate the risks must have increased substantially more. The ABS must overcome the risk inflation we’ve established is real, and must also add in the additional risk created by retaining name and address data. This data is attractive all by itself, but even more so because of its ability to link datasets, which is the very reason for keeping it. What the ABS might find useful would also be useful to whoever might steal the data.

There’s another change that has been hinted at by various commenters.

When the 2005 proposal was made, Facebook was barely a year old. Now, in 2016, Australians have had more than a decade of getting used to the idea of sharing private information with strangers through the Internet. The argument is made that people have become used to giving up their privacy by carrying mobile phones everywhere (which function very nicely as tracking devices) and posting selfies on Instagram.

Attitudes to privacy may well have shifted. Alas, that’s a set of data I don’t have at my fingertips (happy to update this blog if someone tips me off to some high quality survey data). And a belief that attitudes have shifted appears to be at least part of the reason for the ABS taking the approach it has.

Hubris Born of Success

The recent opinion piece in The Sydney Morning Herald by ABS Chief Statistician David Kalisch (Give us your name on census night, it’ll be safe) strikes a somewhat odd note to me. It feels a tad preemptively defensive, and there are some weasel words in it.

Australians have no cause for concern about any aspect of this census, and can have ongoing trust and confidence in the ABS.

This statement is quite early on in the piece. Why state this? If I didn’t have any cause for concern, I certainly do now. This is a bit “Pay no attention to that man behind the curtain” for me.

In 2016, I’ve decided to keep names and addresses for longer. This is for statistical purposes only, and will increase the value of census data.

That’s nice. But at what cost?

Note that Kalisch uses the phrase “I’ve decided”, rather than “the ABS has decided.” That choice of phrase bothers me, as it suggests that a great deal of power rests in the hands of a single person, who can simply decide things. On the plus side, since Mr Kalisch has so clearly nailed himself to this particular plank, should this decision prove to be a poor one, there is no one else to blame.

My decision followed community consultation, direct engagement with the Australian Information Commissioner and each State and Territory Privacy Commissioner, and a Privacy Impact Assessment (PIA). The ABS has transparently communicated its process and decisions every step of the way. We advertised our PIA process in the national media in November 2015 and received few responses.

This is technically true. However, the Privacy Impact Assessment doesn’t appear to be independent in the way the PIA from 2005 was. The author in the PDF is listed as Zoe Winston-Gregson, who appears to be a Graduate Development Program hire according to this document [PDF]. It would appear that the ABS performed the PIA itself. While the Office of the Australian Information Commissioner’s (OAIC) Guide to Undertaking Privacy Impact Assessments does not require a PIA to be conducted by an external party, I would be interested to know why the ABS chose not to have an external third party conduct this assessment when it saw value in an externally conducted PIA in 2005.

The 2016 PIA states:

The Privacy Impact Assessment identified a small number of potential risks to personal privacy associated with the retention of names and addresses from responses to the 2016 Census, but concluded that in each case the likelihood of these risks eventuating was ‘very low’. The Privacy Impact Assessment determined that these risks can and would be effectively mitigated by implementation of an internationally accepted practice known as functional separation and by existing ABS governance and security arrangements. Nevertheless, a small number of recommendations have been made in relation to implementation of the proposal.

Throughout the document, the emphasis is on risk likelihood rather than risk impact, and both are required for a proper understanding of risk; a low likelihood of losing $2 is very different from a low likelihood of losing your life. All of the risks are assessed as being of Very Low likelihood, but there is no corresponding scale for the impact of the risk, should it occur.
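To make the likelihood-versus-impact point concrete, here is a minimal sketch of the usual back-of-the-envelope calculation, where exposure is likelihood multiplied by impact. The numbers and risk names are invented for illustration only. They are my assumptions, not figures from the PIA or the ABS.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float  # chance of the event occurring, 0..1 (made-up value)
    impact: float      # cost if it does occur, arbitrary units (made-up value)

    @property
    def exposure(self) -> float:
        # Simple expected-loss style score: likelihood multiplied by impact.
        return self.likelihood * self.impact


# Two hypothetical risks with the SAME "very low" likelihood
# but very different impacts.
risks = [
    Risk("lose a $2 coin", likelihood=0.001, impact=2),
    Risk("names and addresses leaked", likelihood=0.001, impact=1_000_000),
]

# Rank by exposure, highest first.
for risk in sorted(risks, key=lambda r: r.exposure, reverse=True):
    print(f"{risk.name}: likelihood={risk.likelihood}, "
          f"impact={risk.impact}, exposure={risk.exposure}")
```

Both entries would be rated identically on a likelihood-only scale, yet their exposure differs by a factor of 500,000. That gap is what goes missing when impact isn't scored.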

In section 4 Privacy Risk and Mitigation, where the risks are enumerated (all five of them) there is a statement describing the consequence of the risk eventuating, but there appears to be no assessment of the magnitude of its impact. The emphasis is on the ABS’ ability to mitigate the likelihood of the risk occurring, of which it appears to be supremely confident.

Risk 4.3 “Accidental release of name and/or address data in ABS outputs or through loss of work related IT equipment and IT documentation” has a consequence listed as “Name and/or address information is publically[sic] released.” Well yes, but is that it?

What about the impact to the people whose name and address information is released? What about the impact to the reputation of the ABS, who have now spent a lot of time and energy hyping themselves as superior custodians of this name and address information?

And this is listed as a separate risk from risk 4.2 “Unauthorised non-ABS access to data stored in the ABS environment” where the consequence is “The consequences of breach of privacy depend on whether names, anonymised names, or linked data is accessed.” There’s no discussion of the consequence of more than one of these things happening at the same time.

In my view, this PIA is nowhere near as thorough as the one performed in 2005 by Pacific Privacy Consulting, i.e. Mr. Nigel Waters. It is rather light-on for an issue that has been addressed multiple times in the past with great caution and care. I do not see the same level of care and diligence here.

My Conclusion

In deciding to retain name and address data, the ABS (or David Kalisch all by himself, who knows?) believes that it is trusted by the public, and that people aren’t all that concerned about providing this information to the ABS. It is under pressure from the government to cut costs and to provide more, higher value products and services. Sound familiar?

I think the ABS, or whoever was involved in driving the retention of names and addresses, has become arrogant. They are convinced of the benefits of retention, and are dismissive of the risks. They are concerned about people’s perception of the issue but only superficially. They are not interested in carefully explaining their reasoning and all the safeguards that have been put in place and demonstrating very clearly to all who will listen why they should be trusted. The ABS has assumed they are trusted and worked from there.

The way the issue has been handled looks to me like one that was carefully stage-managed so as not to spook the horses. A series of process boxes have been checked. Focus groups were held to figure out how concerned people really were, which showed up as “well, not all that much.” That’s just a qualitative step that should have been followed by a quantitative study of some kind, of which I have yet to see evidence. The ABS published a short Statement of Intent with vague, hand-wavey statements about how the ABS would take care of everything, because they’ve always managed really well before, and put it out quietly as people were leaning into the Christmas period. People were given a mere three weeks to respond, if they were to somehow stumble across the idea that they could, and the lack of responses was used as more evidence that no one was bothered by the proposal.

Absence of evidence is not evidence of absence, as the statisticians at the ABS well know, and as the architects of this situation are now finding out, no doubt to their chagrin.

Now we have anodyne corporatised PR management of the issue which boils down to “Trust us, and don’t worry your pretty little head about any pesky details.”

I believe that the ABS is probably better equipped than most government departments to safeguard this information, but it’s a lot easier to keep a secret if you don’t know it in the first place. I remain unconvinced that the benefits of retaining name and address data outweigh the risks. I haven’t seen a good enough explanation from the ABS on this issue, and they’ve had ample time to provide one, but seem incapable of engaging with the public in a non-condescending way.

This needs to stop immediately.

Postscript

Risk 4.4 in the 2016 PIA is:

Risk: Reduction in participation levels in ABS collections due to loss of public trust
Consequence: The proposal to retain names and addresses from responses to the Census may cause public concern which results in a reduction of participation levels in ABS collections, and/or a public backlash.
Management if risk eventuates: Depending on the circumstances, the ABS will:
Respond to concern from the media, stakeholders and the public;
Conduct further consultations;
Reconsider the privacy design for the proposal, if required.

Let’s keep making noise.

The Eigencast 018: Bimodal Begone

Justin talks to Meredith Whalen, Senior Vice President, U.S. Insights and Vertical Business Units, IDC about IDC’s alternative to Gartner’s bimodal approach.

They talk about why integration is a vital piece that is missing from Gartner’s bimodal approach, and why that’s going to cause problems for organisations that try to use it. Justin refers to Simon Wardley’s PST model as an alternative.

They discuss how digital transformation is really just another form of organisational change, so many of the existing tools and techniques will work. Putting digital in front of words doesn’t fundamentally change the universe.

Justin points out that IT departments have been forced to compete with cloud, and for many this is the first time they’ve ever had to operate like a real business. Meredith shares IDC research that indicates some IT departments are beginning to get it, and their businesses are noticing the improvement.

Chapters

  • 00:00:00.000 Intro
  • 00:00:15.856 Episode Intro
  • 00:02:50.933 Interview
  • 00:04:14.666 Simon Wardley’s PST Model
  • 00:07:04.304 IT Transformation To Help Business Transformation
  • 00:07:58.407 What Even Is Digital Transformation?
  • 00:10:58.529 The Business Controls The Money
  • 00:13:08.929 Cloud Competition
  • 00:13:42.186 Are You Ready?
  • 00:15:04.301 Where Are The Marketers?
  • 00:20:25.638 Change Management
  • 00:22:25.759 IT Is Getting Better
  • 00:24:35.314 Closing Remarks
  • 00:25:38.952 Outtakes

Links

Sponsors

This episode of The Eigencast was sponsored by PivotNine. Research, analysis, advice.

 

The Eigencast 017: Nutanix .NEXT with Sudheesh Nair

Sudheesh Nair, President of Nutanix (Source: LinkedIn)

Justin talks to Sudheesh Nair, President of Nutanix, at the Nutanix .NEXT 2016 conference, held in Las Vegas, NV.

We discuss the delayed Nutanix IPO, and the Nutanix approach to financing. We talk about the market climate for startups generally in 2016, and how Nutanix may use the tight financing conditions as an opportunity to pick up complementary startups on the cheap. Nutanix PR make it clear these are all hypothetical situations.

Sudheesh explains the underlying principles driving the Nutanix product evolution, and the company culture. We discuss the new Xpress product line, and what going down-market to SMB implies for the company.

We also talk about what Enterprise Cloud actually means, and what companies are really trying to do with infrastructure.

Justin travelled to Nutanix .NEXT 2016 as a guest of Nutanix.

Chapters

We’re experimenting with chapters in this episode.

  • 00:00:00.000 Intro
  • 00:00:15.856 Episode Intro
  • 00:03:33.803 Interview
  • 00:05:57.546 Explaining the IPO Strategy
  • 00:08:23.640 Jeff Bezos Investor Letter
  • 00:13:12.502 Coming to America
  • 00:16:40.373 Storage is Central
  • 00:22:21.631 The Xpress Experiment
  • 00:24:37.085 Enterprise Cloud
  • 00:28:58.530 Flattery will get you everywhere
  • 00:29:27.601 Closing Remarks
  • 00:30:47.462 Outtakes

Links

Sponsors

This episode of The Eigencast was sponsored by PivotNine. Research, analysis, advice.