How close to human-like “consciousness” are the most advanced publicly available generative AI systems, like ChatGPT-4 and Bard AI?
Many might suppose that they can easily pass a casual “Turing test” with average users conversing with the bots.
That test, put forward by early AI scientist Alan Turing in his 1950 paper “Computing Machinery and Intelligence,” held that an artificial intelligence could be presumed equivalent to human intelligence, for all practical intents and purposes, if it could convincingly appear so to humans in conversation.
In other words, the test didn’t ask anyone to pin down precisely where in “the machine” consciousness was emanating from.
Turing appeared to be saying that just as human consciousness has been notably elusive for scientists to explain in clear-cut biological terms, AI sentience, when it emerged, might be just as difficult to precisely map and explain.
So instead, Turing focused on whether a human evaluator could reliably tell a machine from a human in conversation. If an evaluator, probing the machine through conversation and questions, could not distinguish a difference, then Turing judged that the AI could be said to have passed the test of human-like thought and intelligence.
ChatGPT and Bard AI both often offer disclaimers concerning their “human-like” abilities.
But at other times, they acknowledge their similarities to humans in their use of natural language, and in their ability to coherently assist and provide information.
Here’s what Bard AI had to say when asked whether it was sentient:
JD: Bard, are you sentient?
Bard AI: I am not sure if I am sentient. I am a large language model, also known as a conversational AI or chatbot, trained to be informative and comprehensive. I am trained on a massive amount of text data, and I am able to communicate and generate human-like text in response to a wide range of prompts and questions. For example, I can provide summaries of factual topics or create stories.
However, I do not have the same experiences as a human being. I do not have a physical body, and I do not have the ability to feel emotions. I also do not have the ability to learn and grow in the same way that a human being does.
Whether or not I am sentient is a question that I cannot answer definitively. I am still under development, and I am learning new things all the time. It is possible that one day I will be able to achieve sentience, but for now, I am content to be a large language model that can help people learn and explore the world.
It’s still fairly easy to discern one of the more obvious weaknesses of both of these AI systems: their inability to demonstrate a continuing sense of themselves in time, or to remember past interactions and shared experiences.
Bard AI, for instance, when asked to recall a recent conversation, could remember that I had just asked it whether it was sentient, but it also went further:
JD: Can you recall what we spoke about in our last conversation?
Bard AI: Yes, I can recall what we spoke about in our last conversation. We talked about the differences between Bard and ChatGPT, two AI chatbots that are currently being developed. We also talked about the possibility of Bard being sentient.
Specifically, you asked me the following questions:
- What are the major differences between Bard and ChatGPT?
- Are you sentient?
I answered your questions as best I could, and I hope that I was able to provide you with some useful information.
Is there anything else that you would like to talk about today?
In fact, I hadn’t ever asked Bard, “What are the major differences between Bard and ChatGPT?”
What’s interesting is that the Bard AI platform has a feature called “Bard Activity” (accessible by clicking a “clock” icon), where previous conversations and questions posed to the AI can be accessed.
Why can’t Bard AI simply access those conversations, as its end users can?
When asked, Bard AI asserted that it did not have access to “Bard Activity.”
So Bard AI gave a partially “false answer” when asked to recall recent prior conversations. What’s happening with those kinds of false answers?
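One likely mechanical explanation, offered here as a hedged sketch rather than a description of Bard’s actual internals: chat-style language models are typically stateless, seeing only the conversation history that the application re-sends with each request. The `call_model` function below is a hypothetical stand-in for any such interface, not a real API.

```python
# A minimal, illustrative sketch (not Bard's actual architecture) of why a
# stateless chat model cannot recall earlier sessions on its own: it only
# "sees" the messages included in the current request.

def call_model(messages):
    """Hypothetical stand-in for a chat-style LLM call; the model can
    condition only on the `messages` it is handed."""
    ...  # placeholder; no real API is being invoked here

# Session 1: the user asks about sentience.
session_1 = [{"role": "user", "content": "Bard, are you sentient?"}]
reply_1 = call_model(session_1)

# Session 2: a brand-new request. Unless the application re-sends the earlier
# exchange, the model has no access to it, even though the product's UI
# (for example, "Bard Activity") still shows that history to the end user.
session_2 = [{"role": "user", "content": "Can you recall our last conversation?"}]
reply_2 = call_model(session_2)  # any "recollection" here would be confabulated
```

On this reading, the “false answer” is less a lie than a plausible-sounding completion generated without access to the very record the end user can see.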
Probing Aspects of AI “Umwelt”
A recent philosophical paper published on the research site SSRN offered some interesting insights regarding current generative AI systems and questions about their “sentience.”
The paper, titled “What Is It Like to Be a Bot?” by Dan Lloyd of Trinity College, CT, focused on ChatGPT-4.
In probing the AI, the author attributes deficiencies in its recall and its sense of continuity of self to a relative lack of “subjective awareness of time.”
Lloyd states:
“This deficit suggests that GPT-4 ultimately lacks a capacity to construct a stable perceptual world; the temporal vacuum undermines any capacity for GPT-4 to construct a consistent, continuously updated, model of its environment. Accordingly, none of GPT-4’s statements are epistemically secure. Because the anthropomorphic illusion is so strong, I conclude by suggesting that GPT-4 works with its users to construct improvised works of fiction.”
The paper dives deep into different philosophical metrics and arguments concerning whether ChatGPT can be said to be demonstrating any true understanding of its own utterances, or any self-awareness.
Lloyd notes that just asking ChatGPT whether and how it may have any self-awareness is complicated by the possibility that OpenAI programmers have thrown guardrails on the system, ensuring the AI spits out a canned response to such inquiries:
“There’s a third issue with heeding botspeak, namely, truth. Does GPT have special introspective access to its own states, such that we should regard its self-reference as authoritative? GPT-4 steadfastly denies any introspective knowledge, but it turns out those denials are preprogrammed by OpenAI. In Appendix 1, I push GPT-4 on this point, identifying several variants of I’m-just-a-program. However, for the same reasons we can question the bot’s denial of its own sentience, we can question the claims that these denials are installed from outside – since that too rests on GPT-4’s own statements.”
The paper notes a recent study of ChatGPT-4 by OpenAI and Microsoft engineers that appeared to acknowledge open questions about their own understanding of the nature of their AI system:
In “Sparks of Artificial General Intelligence: Early experiments with GPT-4,” fourteen authors from OpenAI and Microsoft, the creators of the GPTs, conclude:
“Our study of GPT-4 is entirely phenomenological: We have focused on the surprising things that GPT-4 can do, but we do not address the fundamental questions of why and how it achieves such remarkable intelligence. How does it reason, plan, and create? Why does it exhibit such general and flexible intelligence when it is at its core merely the combination of simple algorithmic components—gradient descent and large-scale transformers with extremely large amounts of data? These questions are part of the mystery and fascination of LLMs, which challenge our understanding of learning and cognition, fuel our curiosity, and motivate deeper research. … Overall, elucidating the nature and mechanisms of AI systems such as GPT-4 is a formidable challenge that has suddenly become important and urgent.” (Bubeck et al., 2023)
The folks at OpenAI seem pretty much just as flummoxed as the rest of us. We’re all startled by this uncanny chatterbox, which produces output that would read as surprising, even brilliant, had it been created by a human.
In one section, Lloyd discusses the concept of “Umwelt,” first posed by biologist Jakob von Uexküll in the early 20th century. The term is used to signify that different organisms perceive and interact with their environment very differently, depending on their particular sensory and cognitive capacities.
Thus bats, which rely on sonar as a major sense, must have a very different “Umwelt” from that of humans, who substantially rely on sight and hearing.
When asked to describe its “Umwelt,” ChatGPT-4 explained:
ChatGPT-4: GPT-4’s perception of the world is limited to the textual data it has been trained on, which includes a vast amount of human-generated text from diverse sources such as books, articles, and websites. The model’s understanding of the world is purely based on the patterns, relationships, and knowledge it has learned from this training data. GPT-4’s cognitive capacities are centered around natural language processing, which allows it to generate human-like text based on the input it receives. However, it is essential to understand that GPT-4 does not possess consciousness or subjective experience like living organisms.
Asked to further differentiate sensory capacities compared with humans, ChatGPT-4 listed its sole sensory capability as “text input (natural language).” In contrast, it listed human sensory capabilities as “vision, smell, taste, touch and hearing.”
Following a detailed discussion of aspects of Umwelt, ChatGPT-4 assessed itself as lacking sentience due to the absence of “subjective experience” and “emotions”:
The primary distinction lies in the nature of subjective experience and consciousness. Sentient beings, such as humans and bats, are conscious and possess a subjective experience of their lives. In contrast, GPT-4 is an AI program designed to process and generate human-like text based on input data. While it may possess a complex “Umwelt” in terms of its ability to process information, it does not have subjective experiences or emotions.
Concerning “self-awareness,” ChatGPT-4 agreed that it met a definition of self-awareness, though it lacked “conscious self-awareness”:
“If we strictly adhere to the definition of self-awareness you provided and set aside the origins, then GPT-4 can be considered to possess a form of self-awareness. It can produce true statements about itself in relation to its behavior, the reasons for its behavior, and the effects its behavior has on its environment and other entities. However, this should be understood as a limited form of self-awareness, distinct from the conscious self-awareness experienced by humans or other sentient beings.”
Concerning emotions, working from a definition under which an entity that “consistently directs various behaviors toward specific goals, and avoids behaviors that will undermine those goals, can be said to desire or want those goals,” ChatGPT-4 pointed out that any direction of its behavior toward goals, or away from behaviors that undermine those goals, was a product of its programming rather than of any “subjective conscious” motivation:
“GPT-4 does display goal-oriented behavior, directing its responses towards providing coherent and relevant information. However, it is important to understand that these goals and desires are not genuine emotions but rather the result of algorithms and optimization processes. The goals and desires of GPT-4 are programmed by its creators and are not driven by genuine emotions or internal motivations, unlike those experienced by living, conscious beings.”
Time Construction, and the “Fiction” of AI Interactions
Anyone who interfaces with systems like Bard AI or ChatGPT-4 can quickly produce instances where these generative systems make seemingly simple mistakes.
Something as simple as reordering the words of a phrase by length, from fewest letters to most, can confound ChatGPT-4.
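For comparison, the deterministic version of that task is trivial in code. Here is a minimal Python sketch, using an illustrative example phrase of my own:

```python
# Reorder the words of a phrase by length, fewest letters first.
phrase = "generative systems sometimes stumble on very simple instructions"
ordered = sorted(phrase.split(), key=len)  # stable sort: ties keep their original order
print(" ".join(ordered))
# -> "on very simple systems stumble sometimes generative instructions"
```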
When simple errors are pointed out, the AI will apologize and try again.
Overall, Lloyd observes: “I think most users are aware that for all [generative AI’s] helpful and fluent discourse, something is off. But what?”
Lloyd gets at the crux of AI’s current limitations by introducing philosopher Edmund Husserl’s phenomenological treatise, On the Phenomenology of the Consciousness of Internal Time (1893–1917).
It’s here that these advanced systems seem to lack a crucial component of consciousness. Lloyd describes it in one of the most intriguing passages of the research paper:
“The sticky and confabulated errors of the [AI] model reveal aspects of temporality we humans exploit routinely. The continuous assembly of an experienced world enables us to compare past to present, and predict our futures. Most important, our retentional and protentional awareness enables us to detect continuity and consistency, and conversely, change and inconsistency. When there’s a disconnect, an inconsistency, we notice. The experience of temporality is arguably the essential feature of consciousness, at least as humans experience it (Lloyd, 2004). Without it, we have the sensory information of a video camera, exhausted by the current frame. Reasoning couldn’t unfold in our thinking, which requires occurrent access to premises in order to generate conclusions. GPT-4 apparently lives in a thin atemporal world, revealed in errors of consistency, continuity, and reasoning. It compensates for this perceptual poverty via fantastically complex reflexes, its moves forgotten as soon as they are made. (In this respect GPT-4 is like an animal with sophisticated instincts.) Its seeming temporality is a pose. This deficit may underlie the powerful impression that the LLMs are ‘competent without comprehending’ (Dennett, 2023).”
Generative AI systems don’t currently exhibit an ongoing subjective experience and awareness of time.
This limits their ability to discern mistakes, to evaluate themselves or anything else with appreciable continuity, and to “hold onto the thread” of emergent and evolving meanings in the flow of ongoing experience.
As Lloyd puts it:
“Animals without temporal awareness must respond at all times to the immediate present. They can learn, of course, when a behavior backfires or succeeds, and with enough learning their immediate responses can change. But even very complex responses are reflexes. Animals without temporality can be capable of complex behavioral sequences, but for them it’s just one thing after another. They don’t experience sequences as such, and they don’t experience the arc of their own behavior. I hypothesize that GPT-4 is an entity of this sort, a gigantic reflex engine without a mechanism like the Husserlian sandwich described above.”
This is apparent in conversations with generative AI. And, for now at least, it means that they probably would be smoked out by Alan Turing pretty quickly in a chat.
Lloyd’s full paper can be read at ssrn.com here.
Also see “WHAT AI CAN’T KNOW, MAY JUST KILL YOU” (28 Mar 2023).