On November 9, 2022, Quebec’s privacy regulator, the Commission d’accès à l’information (the “CAI”), concluded an investigation under the Act Respecting Access To Documents Held By Public Bodies And The Protection Of Personal Information (the “Public Sector Act”) into a school board’s (“School Board”) use of algorithmic technology to identify grade-six students at risk of dropping out of school (the “Tool”).
The CAI’s findings and order (Enquête concernant le Centre de services scolaire du Val-des-Cerfs (anciennement Commission scolaire du Val-des-Cerfs), the “Decision”) will be of particular interest to organizations dealing with de-identified or anonymized personal information, as well as those developing or adopting technology involving predictive analytics.
Notably, the CAI held that the Tool did more than simply extract existing data: by generating predictive indicators of dropout risk, it produced new personal information, and that production amounted to a collection of personal information under the Public Sector Act.
The Tool, which was developed by a third-party vendor, worked as follows:
- The School Board provided the vendor with temporary access to a database consisting of approximately 300 types of de-identified data (e.g., admission and registration documents, report cards, school information sheets, etc.);
- The vendor used algorithmic technology to analyze the data and generate a set of predictive indicators of drop-out risk for grade-six students; and
- The vendor provided the School Board with a list consisting of student record numbers, along with the dropout risk levels and factors associated with each number, as determined by the algorithm.
In order to identify specific students at risk, the School Board would then have been required to link the list with its own records using the student record number. At the time of the investigation, it had not yet done so.
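The re-linking step described above can be illustrated with a short sketch. This is not the vendor’s actual system; the field names (`record_number`, `risk_level`, `risk_factors`, `name`) are hypothetical, chosen only to show how a de-identified risk list can be joined back to identifying records through a shared key:

```python
# Illustrative sketch (hypothetical field names): joining the vendor's
# de-identified risk list back to identifying student records using the
# student record number as the linking key.

def link_risk_to_students(risk_list, student_records):
    """Return identified entries by matching record numbers across
    the risk list and the School Board's own records."""
    records_by_number = {r["record_number"]: r for r in student_records}
    linked = []
    for entry in risk_list:
        record = records_by_number.get(entry["record_number"])
        if record is not None:
            linked.append({
                "name": record["name"],
                "record_number": entry["record_number"],
                "risk_level": entry["risk_level"],
                "risk_factors": entry["risk_factors"],
            })
    return linked

# Hypothetical sample data
risk_list = [
    {"record_number": "A123", "risk_level": "high", "risk_factors": ["absences"]},
]
student_records = [
    {"record_number": "A123", "name": "Student One"},
    {"record_number": "B456", "name": "Student Two"},
]

print(link_risk_to_students(risk_list, student_records))
```

The point of the sketch is that the join is trivial once both sides share a key: so long as that key survives de-identification, the data remains readily re-identifiable.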
1) De-identified information may still be “personal information” under the Public Sector Act
The Decision reaffirms the current consensus view of Canadian privacy regulators that de-identified information will still be considered “personal information” for the purposes of privacy law. In this case, even though the School Board had removed 80 types of data (including student names, home addresses, email addresses and student portal usernames) before granting the vendor access to its database, the CAI still considered the resulting de-identified data to be personal information under the Public Sector Act. In doing so, the CAI relied on section 73 of the Public Sector Act (as amended by Bill 64), which provides that information concerning a natural person is anonymized if it “irreversibly no longer allows the person to be identified directly or indirectly.” Accordingly, the CAI took the position that the data at issue (which could be re-identified by linking it back to the original data set) remained personal information. Unfortunately, the CAI did not take the opportunity to weigh in as to whether the data would have been considered “de-identified” for the purposes of sections 65.1 and 12 of the Public and Private Sector Acts, respectively, and thus whether it could have been used without consent for the purposes narrowly articulated at those sections.
Organizations contemplating the use of de-identified information (particularly for secondary purposes) will need to remain cognizant of the fact that where de-identified information may be re-identified, it will likely be subject to privacy legislation. Organizations will need to closely scrutinize their processes for de-identification and/or anonymization, and should be prepared to provide sufficient notice of the purposes for which de-identified information may be used.
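The distinction between de-identification and anonymization drawn in the Decision can be sketched in a few lines. The field names below are hypothetical, loosely modeled on the identifiers the School Board removed; the sketch simply shows why stripping direct identifiers, while retaining a linking key, produces de-identified rather than anonymized data:

```python
# Illustrative sketch (hypothetical field names): removing direct
# identifiers de-identifies a record, but the retained record number
# still permits re-identification -- so, on the CAI's reasoning, the
# result is not "anonymized" and remains personal information.

DIRECT_IDENTIFIERS = {"name", "home_address", "email", "portal_username"}

def de_identify(record):
    """Strip direct identifiers while keeping all other fields,
    including the record number used as a linking key."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

# Hypothetical sample record
record = {
    "record_number": "A123",
    "name": "Student One",
    "email": "s1@example.org",
    "grade_average": 72,
}

print(de_identify(record))
```

Because `record_number` survives, anyone holding the original data set can reverse the process, which is precisely why the CAI treated such data as falling short of the “irreversibly” standard in section 73.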
2) Using artificial intelligence to generate indicators based on de-identified data constitutes a collection of personal information
The School Board took the position that, in using the Tool to generate the predictive indicators, it was merely enhancing data already in its custody. The CAI disagreed. Rather, the CAI drew a distinction between a system that would simply extract or categorize raw data, and the Tool, which had the ability to produce information that, for a human being, would have been very complex to obtain and would have required comparative analysis of a very large amount of data and extensive statistical knowledge. It concluded that the Tool thus produced new personal information – the predictive indicators of dropout risk – and determined that this production of new personal information was a collection of personal information under the Public Sector Act.
The CAI’s determination that the production of predictive indicators constitutes a “collection” of personal information is a novel one. Organizations may be surprised to learn that, by using artificial intelligence systems to generate insights, they may be “collecting” information about the individuals concerned under privacy law. If so, the generation of such indicators (or similar insights) would subject organizations to the provisions applicable to the collection (and not merely the use) of personal information. In this case, that would mean that the School Board should have informed individuals of the collection, conducted a privacy impact assessment and adopted security measures proportionate to the sensitivity of the information generated by the Tool. Mere “uses”, by contrast, would not attract the same obligations.
The use of predictive analytics in this context also raises issues with respect to the accuracy of personal information. Organizations are required to maintain accurate personal information about individuals. If, by generating insights about an individual, an organization is indeed collecting a new type of personal information about that individual, then it will also have a corresponding obligation to ensure that such information is accurate. The question then becomes: how can organizations ensure that information generated via algorithmic processes is accurate? What does it mean for information that is inherently prospective – essentially an educated guess about an individual’s future behaviour – to be “accurate”?
It is clear that the rapid and widespread use of artificial intelligence systems, particularly for the types of purposes described in the Decision, is here to stay. Less clear is how Canadian privacy regulators will treat such uses with respect to organizations’ obligations under privacy law. Regardless, there is little doubt that these technologies will continue to attract significant scrutiny moving forward.
This provision comes into effect on September 22, 2023, along with its identical counterpart at s. 12 of the Act Respecting The Protection Of Personal Information In The Private Sector.
The authors would like to thank Rachel Kuchma, an articling student in Dentons’ Ottawa office, who helped write this article.