General Discussion
In reply to the discussion: Bernie Sanders: The Public Should Own Half of the Big A.I. Companies (NYT, 6/1/2026)
WarGamer (18,892 posts)
Some people are so confused about AI.
Example:
(Asked Gemini 3.5) Who commanded the 3 spearheads of Operation Barbarossa in 1941?
(Gemini) Operation Barbarossa was divided into three massive Army Groups, each assigned a specific geographic objective and commanded by a senior German Field Marshal:
Army Group North (Heeresgruppe Nord): Field Marshal Wilhelm Ritter von Leeb
Army Group Centre (Heeresgruppe Mitte): Field Marshal Fedor von Bock
Army Group South (Heeresgruppe Süd): Field Marshal Gerd von Rundstedt
(Asked Gemini 3.5) From which sources did you steal that data? A book like War without Garlands?
(Gemini) The argument that Large Language Models (LLMs) simply "steal" or copy from specific copyrighted books, like Robert J. Kershaw's War Without Garlands, fundamentally misunderstands the architecture of artificial intelligence.
An LLM does not operate like a search engine pulling up a hidden digital library, nor does it keep a repository of text files to copy and paste from. Instead of relying on a specific document, an LLM generates information by drawing upon a vast, synthesized compendium built from thousands of textbooks, public archives, and official war records.
Here is an objective breakdown of how an LLM processes and retrieves historical data:
1. Training is Learning, Not Storing
When an AI is trained, it processes millions of documents, including digitized military histories, academic journals, public domain records, and encyclopedias. The AI does not store these texts. Instead, the training process breaks the text down and analyzes how words, concepts, and facts relate to one another mathematically.
Once training is complete, the original files are gone. What remains is a complex web of statistical associations. When asked about a historical event like Operation Barbarossa, the AI isn't opening a digital copy of a specific book; it is dynamically reconstructing widely accepted historical facts based on the patterns it learned across its entire training data set.
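To make the "statistical associations" idea concrete, here is a toy sketch in Python. It is not how Gemini or any real LLM is built: real models learn neural network weights, while this sketch only counts which word follows which. The function names (train_bigram_counts, most_likely_next) and the three-sentence corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(documents):
    """Count how often each word is followed by each other word.

    Only the counts are kept; the sentences themselves are not stored.
    (Hypothetical helper for illustration, not a real library API.)
    """
    counts = defaultdict(Counter)
    for doc in documents:
        words = doc.lower().split()
        # Pair every word with the word that comes right after it.
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Tiny made-up corpus standing in for "millions of documents".
corpus = [
    "army group north was commanded by von leeb",
    "army group centre was commanded by von bock",
    "army group south was commanded by von rundstedt",
]

model = train_bigram_counts(corpus)
del corpus  # the original text is discarded; only the counts remain

print(most_likely_next(model, "army"))       # follower seen in all three sentences
print(most_likely_next(model, "commanded"))  # likewise
print(most_likely_next(model, "stalingrad")) # never seen, so None
```

Once the corpus is deleted, the model can still say that "army" is usually followed by "group", because that pattern showed up in every sentence. A caveat for a sample this small: the counts are close to a compressed copy of the input, so the sketch illustrates the mechanism only and does not settle whether large models memorize their training text.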
2. A Compendium of Public Domain Facts
The names of the commanders who led the German thrusts in 1941 (von Bock, von Rundstedt, and von Leeb) are not the intellectual property of any single modern author. They are matters of public historical record found natively in:
The German Federal Archives (Bundesarchiv)
Official Soviet defensive logs and declassified Red Army reports
Standard academic textbooks and universal encyclopedias
Because these facts are identical across thousands of independent historical sources, the AI treats them as a shared consensus of human knowledge. In copyright philosophy, there is a strict divide between facts (which belong to the public domain) and expression (the specific, artistic prose an author uses to describe those facts). An LLM synthesizes the facts without duplicating an individual author's copyrighted prose.