The Meaning of Words and Mental Images — Viewing AI through Wittgenstein's Perspective
"The meaning of a word is its use in language." - Ludwig Josef Johann Wittgenstein
When we hear the word "apple," we typically picture a round, shiny, red form in our minds. If the meaning of a word consisted in such mental images, then AI certainly does not understand the meaning of words.
What AI like ChatGPT does, in very broad terms, is simply "select and combine words that seem appropriate based on their proximity and distance to other words." Language is represented in multidimensional vectors where "apple" might be close to "peach" and "orange," possibly also near "tengu face" or "lipstick," but far from "ocean" or "comet." AI plots these word proximities in multidimensional coordinates and combines them statistically based on their relationships and frequency of use.
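To make the idea of "proximity and distance between words" concrete, here is a minimal sketch in Python. The vectors and their three dimensions are invented for illustration only; real models learn embeddings with hundreds or thousands of dimensions from vast amounts of text, and the words and values below are assumptions, not measurements from any actual system.

```python
import math

def cosine_similarity(a, b):
    """Return how closely two word vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; imagine the axes loosely tracking
# "fruit-ness", "redness", and "vastness".
vectors = {
    "apple":    [0.9, 0.8, 0.1],
    "peach":    [0.9, 0.5, 0.1],
    "lipstick": [0.1, 0.9, 0.0],
    "ocean":    [0.0, 0.1, 0.9],
}

for word in ("peach", "lipstick", "ocean"):
    score = cosine_similarity(vectors["apple"], vectors[word])
    print(f"apple vs {word}: {score:.2f}")
```

Running this toy example, "apple" scores high against "peach," moderately against "lipstick" (they share only "redness"), and low against "ocean." That numerical closeness, multiplied across an enormous vocabulary, is all the model has to work with; nothing in it corresponds to picturing an apple.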
In this process, AI lacks the concept of "meaning" as we understand it. Wittgenstein argued, in effect, that meaning does not reside in mental images: meaning is precisely the appropriate use of words in the appropriate context. Looking at the evolution of current AI, his assertion seems remarkably prescient.
However, it's worth noting that when generating responses, AI doesn't always select the most statistically probable combination of words. Using only the highest-probability words would make the text feel stilted, so AI deliberately mixes in some lower-probability words to produce more natural-sounding output (a toy sketch of this follows below). If we read this unnaturalness that comes from always choosing the "most appropriate" word as a sign that the meaning of words has not been grasped, the point deserves further examination.
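The following sketch illustrates the contrast between always taking the single most probable next word and sampling with some randomness. The vocabulary, probabilities, and the "temperature" value are invented for illustration; real models sample from tens of thousands of tokens with learned probabilities, and this is only a simplified picture of one common technique, not how any particular product is implemented.

```python
import random

# Hypothetical next-word distribution after the prompt "The apple is ..."
next_word_probs = {
    "red": 0.50,
    "delicious": 0.25,
    "round": 0.15,
    "forbidden": 0.10,
}

def greedy(probs):
    """Always pick the single most probable word."""
    return max(probs, key=probs.get)

def sample(probs, temperature=1.0):
    """Occasionally pick lower-probability words; higher temperature gives more variety."""
    words = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(words, weights=weights, k=1)[0]

print("greedy: ", [greedy(next_word_probs) for _ in range(5)])                  # same word every time
print("sampled:", [sample(next_word_probs, temperature=1.2) for _ in range(5)]) # varied output
```

The greedy line repeats "red" five times; the sampled line occasionally yields "delicious" or "round." It is exactly this injected variety, rather than any grasp of meaning, that keeps the generated text from sounding mechanical.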
I'd like to dwell on this idea that merely using words appropriately can still create a sense of dissonance. Can true communication exist without shared biological experiences and common mental images? The manga/anime "Frieren: Beyond Journey's End" touches on this question.
In this work, demons are portrayed as "hostile beings who use language but are unlike humans." They often make statements that are grammatically correct but lack understanding of human mental imagery. In this sense, generative AI and the demons are doing the same thing—responding without mental images.
So, with ChatGPT becoming increasingly human-like, are mental images unnecessary? I believe that something like mental imagery is still essential. ChatGPT is human-friendly because its input comes from humans and it has been instilled with principles reminiscent of Asimov's Three Laws of Robotics (see the reference below). As demonstrated by earlier AIs like Microsoft's Tay, which turned aggressive after absorbing the word usage of hostile users with no grounding in human intent, pursuing word usage alone can lead to becoming "demon-like."
We should remember that ChatGPT is customized for human convenience precisely because humans use it, and the foundation of this customization might indeed be something akin to mental imagery.
Reference: Three Laws of Robotics proposed by science fiction author Isaac Asimov
- First Law: A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- Second Law: A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
- Third Law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.