An oft-asked question: “Is the data I put into a Large Language Model (LLM) safe?” Or, alternatively: “If I put commercial or confidential information into an LLM, what is the chance that it will be leaked?” There is strong motivation for putting information into LLMs, because LLMs are calculators for text: they are intensely useful for summarizing, extracting data, rewriting, highlighting changes, and responding. This worries CIOs and IT departments everywhere because, in the absence of guidelines and training, people are putting information into LLMs simply because they are so useful.
The thing is, whenever information is stored, there is a chance it will be leaked. Emails can be forwarded; computers or SharePoint sites can be hacked. The likelihood depends on the security of the system. Large established companies, such as Microsoft, Google, and Apple, tend to be more secure, because they have more resources and more to lose if they fail. Information in an LLM is similar: it is held on a computer, and people try to keep it safe.
One difference with LLMs is motivation. The information you give an LLM would be useful for training it. Just as your Google queries help Google become a better search engine, and your Facebook posts help ad targeting, your LLM queries could help improve the LLM, and early LLMs did use submitted information to train their models. This has mostly changed: almost all the leading providers now allow users to prevent their data from being used for training.
How each platform handles your data
Here are the details for each of the major models, with links to disable training where possible:

ChatGPT personal or free – Training enabled, can disable. For details, see policy.
ChatGPT Teams or Enterprise – Training disabled by default. For details, see policy.
Gemini – Training enabled, can disable (but lose access to history). For details, see policy.
Claude.ai (Anthropic) – Training disabled by default (unless feedback is submitted). For details, see policy.
DeepSeek – Training enabled, cannot be disabled.
Grok.ai (X/Twitter) – Training enabled, can disable (via the app).
Microsoft Copilot – Training disabled. For details, see policy.