On April 29, 2024, the European Center for Digital Rights, better known as noyb and co-founded by Austrian lawyer and privacy activist Max Schrems, filed a formal complaint against OpenAI, the company behind the popular chatbot ChatGPT.
The complaint raises concerns about the chatbot's handling of personal data, focusing on two main issues: the output of inaccurate personal data in chat responses and the failure to respond fully to data access and erasure requests, both allegedly in violation of the General Data Protection Regulation (GDPR).
Specifically, ChatGPT was found to generate an incorrect date of birth when asked about a data subject. When the affected data subject submitted an access request, OpenAI reportedly provided only account information, omitting any details from the chat data or from the training data of its large language models (LLMs). Further, in response to an erasure request, noyb claims that OpenAI stated that it was unable to isolate and correct the false data without impacting other data. The complaint can be found here.
GDPR Implications
The GDPR aims to protect personal data within the EU and applies to AI technologies that process such information. Key articles highlighted in the noyb complaint include:
- Article 5 para. 1 lit. d, which requires personal data to be accurate and, where necessary, kept up to date; inaccuracies must be erased or rectified without delay.
- Article 15, which grants individuals the right to access their personal data.
The complaint argues that by generating incorrect data, such as false birthdates, ChatGPT fails to comply with the GDPR's accuracy requirements. It also notes the significant difficulty of correcting incorrect data or preventing its spread (which we explained in a previous article). The complaint further deems OpenAI's response to the access request non-compliant.
Personality Rights
Incorrect information about individuals also implicates personality rights, which protect an individual's control over his or her own identity.
In a paper by Sandra Wachter et al., the authors cite the German Federal Court of Justice's 2013 ruling on Google's autocomplete feature as a relevant precedent for cases in which false information about an individual is disseminated. In that case, the Court ruled that autocomplete suggestions linking a person's name to negatively charged words infringed on his personality rights. Although the feature was designed to reflect popular search patterns rather than factual accuracy, the Court emphasized that autocomplete suggestions must not be reputationally damaging. Google was found liable because it controls the suggestion process, even though it was not required to filter information in advance, only to correct known inaccuracies.
This case may be relevant to the challenge of aligning LLMs and AI systems with truth obligations. If autocomplete suggestions can mislead users about someone's identity and thus create liability for their providers, the consequences should be even clearer for LLM providers, since platforms like ChatGPT also create expectations among users, who generally tend to trust the accuracy of the output even after being presented with disclaimers.
The Steep Path for AI Providers
While a mistake about a date of birth may not sound like a big deal to some, these instances of “careless speech” impact individuals’ data protection rights and pose risks to science, education, society, and even democracy (after all, misinformation about Barack Obama’s birthplace dogged his 2008 presidential campaign).
Ongoing scrutiny under the GDPR poses a critical challenge not only for OpenAI but also for all providers of AI technology, who have so far prioritized engagement and scale over accuracy and compliance. Transparency around AI training data could clarify the causes of inaccuracies and reduce errors. It may also indirectly improve the legality of training practices from both a privacy and an intellectual property perspective.
As the AI Act emerges as the shiny new regulatory development, it is the GDPR and the data protection authorities that currently provide the essential tools for protecting individuals’ privacy from tech companies. There is an urgent need for AI systems to be designed from the outset to comply with data protection law, a requirement that will shape how AI technologies handle personal data responsibly. The resolution of this complaint could set an important precedent for the future of privacy in the AI era.