You’d think an AI trained on much of the internet would know how to spell its own parent company’s name. But here we are: Google’s most advanced language models can’t reliably spell “Google” or, frankly, much of anything. This isn’t a one-off bug; it’s a direct consequence of how these models work, and it reveals a deep limitation that matters far beyond embarrassing demos.
The culprit is tokenization. Large language models don’t see words as sequences of letters. Instead, they break text into tokens: common subword chunks such as “Go” and “ogle” (the exact split depends on the tokenizer), or even smaller fragments. Each token reaches the model as an ID number, not as the letters inside it. When asked to spell, the model isn’t working letter by letter; it’s trying to generate a sequence of tokens that happens to form the word. Because tokens don’t map cleanly to letters, the model can drop, double, or swap letters. It might write “Gogle” or “Gooogle” without registering the mistake.
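To make that concrete, here is a minimal sketch of a subword tokenizer. Everything in it is an assumption made for illustration: the vocabulary `TOY_VOCAB` is invented, and greedy longest-match is a simplification of what production tokenizers do. It is not the tokenizer behind any Google model.

```python
# Hypothetical toy vocabulary, invented for this example.
# Real vocabularies hold tens of thousands of entries learned from data.
TOY_VOCAB = {"Go": 0, "ogle": 1, "G": 2, "o": 3, "g": 4, "l": 5, "e": 6}


def tokenize(text, vocab=TOY_VOCAB):
    """Split text into vocabulary pieces, always taking the longest match."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining candidate first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens


def encode(text, vocab=TOY_VOCAB):
    """Map text to the integer IDs that a model would actually receive."""
    return [vocab[t] for t in tokenize(text, vocab)]


print(tokenize("Google"))  # ['Go', 'ogle']
print(encode("Google"))    # [0, 1]
print(tokenize("Gogle"))   # ['Go', 'g', 'l', 'e']
print(encode("Gogle"))     # [0, 4, 5, 6]
```

In this toy setup the correct spelling is two IDs and the misspelling is four, and nothing in either ID sequence says which letters it contains. Spelling a word means recalling the letters hidden behind each ID.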
This isn’t just a trivia problem. If an AI can’t reliably spell, that undermines trust in every text it produces. In legal documents, code comments, or brand names, accuracy isn’t optional. The industry’s response has been to add spell-check layers, but that’s a band-aid on a broken finger.
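For a sense of what such a layer involves, here is a minimal sketch using Python’s standard `difflib` module. The function name `flag_near_misses`, the list of protected terms, and the 0.8 similarity cutoff are all assumptions for illustration, not a description of any vendor’s pipeline.

```python
import difflib
import re


def flag_near_misses(text, protected_terms, cutoff=0.8):
    """Return (word, intended_term) pairs for words that closely resemble,
    but do not exactly match, a protected term such as a brand name."""
    flagged = []
    for word in re.findall(r"[A-Za-z]+", text):
        # Best candidate among the protected terms, if any clears the cutoff.
        match = difflib.get_close_matches(word, protected_terms, n=1, cutoff=cutoff)
        if match and match[0] != word:
            flagged.append((word, match[0]))
    return flagged


draft = "Search with Gogle or Gooogle, not Google."
print(flag_near_misses(draft, ["Google"]))
# [('Gogle', 'Google'), ('Gooogle', 'Google')]
```

A check like this only catches near misses of terms someone thought to list in advance, and it runs after the text is generated. The model underneath is unchanged, which is why it counts as a patch and not a fix.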
The irony is delicious: Google’s own AI can’t spell “Google.” But behind the joke is a serious lesson. We’ve built systems that are powerful yet brittle, and we’re only starting to understand where they break. Next time your AI writes “teh” instead of “the,” remember: it never learned the alphabet. It’s just guessing tokens.
Source: TechCrunch AI