A Canadian privacy investigation has found that OpenAI collected personal data without a proper legal basis when training ChatGPT. For UK small businesses already using, or considering, cloud AI tools, the case raises a pointed question: if a global AI giant can fall short of data protection standards overseas, how confident are you that your customer data stays compliant under UK GDPR?

What the Canadian investigation found

Canada's privacy watchdog concluded that OpenAI failed to obtain valid consent for scraping personal information used to train its models. While OpenAI disputes some findings, the investigation highlights a broader pattern: large AI labs operate across borders with data practices designed for scale, not for the specific consent frameworks that smaller markets demand. For UK firms, this is not a distant regulatory story. The Information Commissioner's Office has already signalled that AI training data is within the scope of UK GDPR, and any business feeding customer data into a cloud AI tool remains responsible for having a lawful basis for that processing.

GDPR is not just a big-firm problem

Small businesses often assume GDPR enforcement targets multinationals. In reality, the ICO has fined SMEs for unlawful data sharing, inadequate privacy notices, and failing to complete Data Protection Impact Assessments before deploying new technology. When you upload client emails, invoices, or employee records to a cloud AI service, you are processing personal data through a third party. If that third party's training pipeline scrapes, retains, or reproduces your data, your firm, as the data controller, risks being the one held accountable. The Canadian case shows that even well-resourced AI providers can have gaps that regulators spot.

Practical steps for UK SMEs using AI

Start with a simple data map: identify what personal data enters AI tools, where it is processed, and whether the provider trains models on your inputs. Check your Data Processing Agreement, because many cloud AI services reserve broad rights to use interactions for product improvement. Consider whether the business benefit justifies the risk, and document your rationale as a DPIA-style record. For sensitive sectors such as legal, financial, and healthcare, this is often where cloud AI stops making sense and local, managed alternatives become the clearer compliance path.
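
A data map does not need special software. As a rough illustration only, the three questions above (what personal data goes in, where it is processed, and whether the provider trains on it) plus the Data Processing Agreement check can be recorded in a few lines of Python. The tool names, field names, and review rules below are hypothetical placeholders, not a legal test or an ICO-endorsed format; a flag here means "ask more questions", not "unlawful".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AIToolRecord:
    """One row of a hypothetical data map: a single AI tool and what goes into it."""
    tool: str
    personal_data: List[str]          # e.g. "client emails", "invoices"
    processing_location: str          # where the provider processes the data
    trains_on_inputs: Optional[bool]  # None means not yet confirmed with the provider
    dpa_checked: bool                 # has the Data Processing Agreement been read?


def review_reasons(record: AIToolRecord) -> List[str]:
    """Return the open questions for one tool (illustrative rules only)."""
    reasons: List[str] = []
    if not record.personal_data:
        # No personal data enters the tool, so nothing to flag in this sketch.
        return reasons
    if record.trains_on_inputs is None:
        reasons.append("training use of inputs not confirmed")
    elif record.trains_on_inputs:
        reasons.append("provider trains on inputs")
    if record.processing_location != "UK":
        reasons.append("processed outside the UK")
    if not record.dpa_checked:
        reasons.append("Data Processing Agreement not checked")
    return reasons


# Hypothetical entries; replace with the tools your business actually uses.
data_map = [
    AIToolRecord("ExampleCloudAssistant", ["client emails", "invoices"], "US", None, False),
    AIToolRecord("ExampleLocalModel", ["employee records"], "UK", False, True),
]

for record in data_map:
    reasons = review_reasons(record)
    status = "; ".join(reasons) if reasons else "no open questions"
    print(f"{record.tool}: {status}")
```

Keeping the list alongside your DPIA-style record gives you one place to update when a provider changes its terms.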

Keeping data on UK premises

The simplest way to remove cross-border uncertainty is to remove the border. On-premise AI keeps your data within your own infrastructure, governed by your existing policies, with no third-party training loops. You retain full audit trails, control retention periods, and avoid the surprise policy changes that accompany global cloud services. For many UK SMEs, this is not about rejecting AI; it is about using it in a way that privacy regulators, insurers, and clients can all understand.