Bill C-11 (the Digital Charter Implementation Act) was introduced on November 17, 2020. It proposes the new Consumer Privacy Protection Act (“CPPA”) as a replacement for the existing Personal Information Protection and Electronic Documents Act (“PIPEDA”), the federal legislation regulating privacy in the private sector.
This is the fourth in a series of articles addressing specific issues raised by the proposed CPPA. This article discusses how the CPPA would treat de-identified information and what that would mean for businesses.
Background: de-identified information versus anonymized information
De-identified information and anonymized information are generally understood to be different things. De-identified information is information for which the risk of re-identifying the individual is significantly reduced or eliminated in the context in which it is to be used. This generally includes removing or obscuring both “direct identifiers” (i.e. attributes that alone enable unique identification of an individual) and “indirect identifiers” (i.e. attributes that, when combined with other information, enable identification of an individual). De-identified data can necessarily be “re-identified”. The method for doing so depends on the particular de-identification technique. For example, where data is de-identified by replacing identifying information with random codes (i.e. “pseudonymization”), there would generally be a separately stored key that could be applied to the codes to restore the identifying information. In other words, de-identified information can be re-identified, with varying degrees of difficulty.
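The pseudonymization example above can be illustrated with a minimal Python sketch. The function names (`pseudonymize`, `re_identify`), the record layout and the sample customers are hypothetical and chosen only for illustration; this is not a description of any particular organization's method, and a real implementation would also need to address indirect identifiers and secure storage of the key.

```python
import secrets

def pseudonymize(records, id_field):
    """Replace the identifying field with a random code.

    Returns the coded records and a key (code -> original value).
    The key is what makes re-identification possible, so it is
    assumed to be stored separately from the coded records.
    """
    key = {}    # code -> original identifier
    seen = {}   # original identifier -> code, so repeats share one code
    coded = []
    for rec in records:
        original = rec[id_field]
        if original not in seen:
            code = secrets.token_hex(8)  # random 16-character hex code
            seen[original] = code
            key[code] = original
        coded.append({**rec, id_field: seen[original]})
    return coded, key

def re_identify(coded_records, id_field, key):
    """Apply the separately stored key to restore the identifiers."""
    return [{**rec, id_field: key[rec[id_field]]} for rec in coded_records]

# Hypothetical sample data
customers = [
    {"name": "A. Tremblay", "purchase": 120},
    {"name": "B. Singh", "purchase": 75},
    {"name": "A. Tremblay", "purchase": 40},
]
coded, key = pseudonymize(customers, "name")
restored = re_identify(coded, "name", key)
```

Anyone holding only `coded` sees random codes in place of names, while anyone who also holds `key` can restore the original records. That is why data treated this way is de-identified rather than anonymized.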
Anonymized information is information that cannot be re-identified in any context. Because of that, it generally falls outside the reach of privacy laws. In order to truly anonymize information, an organization must strip personal information of a sufficient number of elements such that the individual can no longer be identified. However, if it is possible to use any reasonably available means to re-identify the individuals to whom the information refers, that data will not have been effectively anonymized but will have merely been de-identified. A failure to understand this distinction means that organizations that believe they have “anonymized” data may, in fact, be handling “de-identified” data and are therefore still subject to privacy laws.
How does PIPEDA treat de-identified information?
PIPEDA is silent on the collection, use or disclosure of de-identified information. The question of where to draw the line between identifiable and anonymous information has thus far been left to the Office of the Privacy Commissioner (“OPC”) and the courts. The OPC stated in its 2013 Information Bulletin: Personal Information that personal information that has been de-identified does not constitute anonymous information (and is thus personal information for the purposes of PIPEDA) if there is a serious possibility that someone could link the de-identified data back to an identifiable individual (see also PIPEDA Case Summary #2009-018). This statement from the OPC builds on the decision in Gordon v. Canada (Health), 2008 FC 258 (CanLII), which looked at the definition of “personal information” pursuant to an access request made under a related statute, the Access to Information Act.
The OPC has also stated that information is only “truly” anonymous (and hence not “personal”) when it can never be linked to an individual, either directly or indirectly. This threshold is higher than in some other jurisdictions, such as the UK. In the UK, the High Court in R (on the application of the Department of Health) v Information Commissioner EWHC 1430 (Admin) stated that the risk of identification must be greater than remote and reasonably likely for information to be classed as personal data under that country’s privacy law.
How do other jurisdictions treat de-identified information?
In the years since PIPEDA became law, data-driven technologies have become significant drivers for innovation, economic growth, and socially beneficial purposes in both the public and private sectors. The value of the data that powers many of these technologies can often be realized without the inclusion of personal information. Recognizing this, jurisdictions around the world have sought to address the use of de-identified data in their regulatory regimes.
The EU’s GDPR provides that pseudonymized data, while still “personal”, may be used for purposes other than those for which it was collected. In addition, data controllers may use pseudonymization to satisfy GDPR’s data security requirements, and in some circumstances controllers need not satisfy certain data subject requests related to pseudonymized data. In other words, while pseudonymized data is still caught by the GDPR, the regulation takes a more flexible approach to such data. Recital 26 of the GDPR states that the legislation is not concerned with the processing of anonymous information.
California’s CCPA excludes de-identified information from its reach entirely, provided that the controlling business implements safeguards and processes prohibiting re-identification, as well as processes to prevent the inadvertent release of de-identified data, and does not make any attempt to re-identify the information.
Under both the GDPR and the CCPA, data is considered de-identified when it is not reasonably likely that it could be used to re-identify the individual.
How would the CPPA define de-identified information?
If passed, the CPPA would not actually define “de-identified information”. Instead, it would define the process of de-identification: “to modify personal information – or create information from personal information – by using technical processes to ensure that the information does not identify an individual or could not be used in reasonably foreseeable circumstances, alone or in combination with other information, to identify an individual.”
The CPPA would thus bring de-identified information squarely within the scope of Canada’s federal private-sector privacy regime.
The inclusion of de-identified information under the CPPA would be broadly consistent with the OPC’s prior commentary that de-identified information may still be “personal.” Essentially, the CPPA would carve out “de-identified” information as a subset of “personal information” to which certain exemptions or obligations would apply. This is similar to how de-identified information is treated under the GDPR, but falls short of the total exclusion provided under the CCPA.
In addition, the CPPA would shift the line between “de-identified” and truly anonymous information. Information would be “de-identified” (and hence not anonymous) where re-identification is “reasonably foreseeable”, rather than a “serious possibility.” In theory, this would broaden the scope of what would be considered “de-identified” information (and thus regulated), and narrow the scope of what would be considered anonymous information (and thus not regulated). This would also bring Canada into line with the international regulations described above.
As currently drafted, the CPPA does not provide any further detail as to when re-identification would be “reasonably foreseeable”. Rather, the CPPA would require organizations to ensure that any technical or administrative measures applied to the information are proportionate to:
- the purpose for which the information is de-identified; and
- the sensitivity of the personal information.
At a minimum, it seems that organizations would be expected to balance the method of de-identification against the proposed use of the de-identified information, as well as the information’s sensitivity.
The inclusion of the “purpose” and “sensitivity” considerations is a bit confusing. It appears that what is intended here is that more robust de-identification measures should be used where the personal information is particularly sensitive and the purpose exposes the de-identified information to increased risk (for instance, where it is used for a public-facing purpose as opposed to being limited to internal use only). This is really a risk of harm analysis, and may be more understandable framed that way: are the measures applied proportionate to the harms that could result if the information were to be re-identified?
Organizations would not be required to use a particular technique of de-identification, as is the case under the GDPR (i.e. pseudonymization), and could seemingly use methods such as randomization (i.e. modifying data attributes such that their new value differs from their true value in a random way) or aggregation (i.e. grouping values into ranges).
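The two alternative techniques mentioned above can be illustrated with a short Python sketch. The function names, the noise range, the band width and the sample ages are all assumptions made for illustration. The CPPA does not prescribe these parameters, and whether a given choice is sufficient would depend on the purpose and sensitivity considerations discussed above.

```python
import random

def randomize(value, spread, rng):
    """Randomization: shift a value by random noise in [-spread, +spread]."""
    return value + rng.uniform(-spread, spread)

def aggregate(value, width):
    """Aggregation: replace a whole-number value with the range it falls in."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

rng = random.Random(0)   # seeded only so the sketch is repeatable
ages = [23, 37, 41, 58]  # hypothetical true values

noisy_ages = [randomize(a, 2.0, rng) for a in ages]
age_bands = [aggregate(a, 10) for a in ages]
# age_bands is ['20-29', '30-39', '40-49', '50-59']
```

In both cases the output is less precise than the input. Randomization keeps an individual-level value but makes it inexact, while aggregation keeps only the range into which the true value falls.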
What about information created from personal information?
The definition in the CPPA specifically captures “information create[d] from personal information”. This is very broad, and creates the potential for overreach, in which very little information can ever be anonymous (and therefore escape privacy laws).
For example, if a company were to take a list of mailing addresses of its customers and, from that, generate a list of sales volumes by the first three characters of the postal code alone, this list appears to be captured under the CPPA as “de-identified” information (based solely on the fact that it was created from personal information). In fact, the way the language is drafted, it is impossible for any information derived from personal information to ever be anonymous simply because it is derived from personal information. The most it will ever be is de-identified.
Under the current PIPEDA, it is likely the list would simply not be personal information, and not subject to regulation (e.g., safeguarding, general use without consent, etc.).
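To make the postal code example concrete, the following Python sketch shows how such a list might be produced. The function name `sales_by_prefix`, the field names and the sample records are hypothetical; the point is only that the output contains no names or full addresses, yet is “created from personal information” within the meaning of the proposed definition.

```python
from collections import defaultdict

def sales_by_prefix(customers):
    """Total sales by the first three characters of each postal code."""
    totals = defaultdict(float)
    for customer in customers:
        # Normalize spacing and case, then keep only the three-character prefix
        prefix = customer["postal_code"].replace(" ", "").upper()[:3]
        totals[prefix] += customer["sales"]
    return dict(totals)

# Hypothetical customer records (personal information)
customers = [
    {"name": "Customer 1", "postal_code": "M5V 2T6", "sales": 100.0},
    {"name": "Customer 2", "postal_code": "m5v 1j1", "sales": 50.0},
    {"name": "Customer 3", "postal_code": "K1A 0B1", "sales": 25.0},
]

summary = sales_by_prefix(customers)
# summary is {'M5V': 150.0, 'K1A': 25.0}
```

The resulting summary is the kind of derived data set that, on the reading described above, would remain “de-identified” information under the CPPA rather than falling outside it.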
Note, too, the disposal requirements of the CPPA, which we address in a separate article in this series. Disposal is the “permanent and irreversible deletion” of personal information. Unlike in PIPEDA, there is no provision for anonymization qualifying as disposal. As a result, absent any grandfathering of existing data sets, organizations that have relied on anonymization as a form of destruction will need to update their policies and procedures to ensure “permanent and irreversible” deletion. If grandfathering is to be permitted, then organizations may wish to anonymize critical data sets prior to the coming into force of the CPPA, as after that date, these data sets would only be de-identified information still subject to the CPPA, with no avenue to remove them from the CPPA’s purview.
What would an organization’s obligations be with respect to de-identified information?
Organizations would be prohibited from re-identifying an individual from de-identified information (alone or in combination with other information), except in order to conduct testing of the effectiveness of any security safeguards. Note that the “combination” that is contemplated here is not just with other personal information, but with any other type of information.
In addition, organizations contemplating a prospective business transaction would be required to de-identify any personal information before using or disclosing it in that context (more on business transactions below).
What would organizations be able to do with de-identified information?
Organizations would be able to de-identify an individual’s personal information without their knowledge or consent. Organizations would then be able to use or disclose such de-identified information without the knowledge or consent of the individual in the following circumstances:
- Organizations would be able to use de-identified information for their own internal research and development purposes.
- Parties to a prospective business transaction would be able to use and disclose de-identified information in order to assess and complete the transaction, provided the information would remain de-identified until they completed the transaction. This provision would essentially make the existing business transaction exemption under PIPEDA more exacting by requiring information to be de-identified. Note that there would no longer be a provision that allows actual personal information to be used or disclosed by the parties to a prospective business transaction (as there currently is in PIPEDA). The information must be de-identified.
- Organizations would be able to disclose de-identified information for a socially beneficial purpose to specific entities, including:
  - a government institution or part of a government institution;
  - a health care institution, post-secondary educational institution or public library in Canada;
  - any organization that is mandated, under a federal or provincial law or by contract with a government institution or part of a government institution, to carry out a socially beneficial purpose; or
  - any other prescribed entity.
A “socially beneficial purpose” would be defined as a purpose related to “health, the provision or improvement of public amenities or infrastructure, the protection of the environment or any other prescribed purpose.”
It is noteworthy that the government chose to provide for the expansion of what constitutes a socially beneficial purpose, as well as the possible recipients of information for such purposes, through regulation. It remains to be seen how broad the scope of socially beneficial purposes and recipients will ultimately be.
For more information about Dentons’ data expertise and how we can help, please see our Transformative Technologies and Data Strategy page and our unique Dentons Data suite of data solutions for every business, including enterprise privacy audits, privacy program reviews and implementation, and training in respect of personal information.