ChatGPT: Confabulation Not Hallucination

Introduction: ChatGPT, an advanced language model powered by OpenAI’s GPT-3.5 architecture, has been making waves in the world of artificial intelligence. As users interact with ChatGPT, they might wonder about the accuracy of the answers it generates. It will often generate responses that seem reasonable but are, in reality, completely fabricated. While some may describe this as “hallucinating” responses, a more apt term to describe ChatGPT’s behavior is “confabulation.”

Understanding Confabulation: Confabulation refers to the act of generating responses or information that may not be entirely accurate but are intended to fill in gaps or provide plausible explanations. In the case of ChatGPT, the model doesn’t possess true consciousness or awareness. Instead, it relies on pattern recognition and statistical analysis of vast amounts of text data to generate responses. Confabulation is an inherent feature of the model as it attempts to make sense of incomplete or ambiguous queries.
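
To make that statistical gap-filling concrete, here is a minimal sketch in Python of how a language model picks its next word: it turns scores over candidate continuations into probabilities and samples one. Everything here – the prompt, the candidate words, and the scores – is made up purely for illustration; a real model like GPT-3.5 does this over a vocabulary of tens of thousands of tokens, but the principle is the same.

    import math
    import random

    def softmax(scores):
        """Turn raw scores into a probability distribution."""
        exps = [math.exp(s) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    # Hypothetical continuations for the prompt "The capital of Atlantis is ..."
    # There is no fact to retrieve, yet every candidate gets some probability.
    candidates = ["Poseidonia", "Atlantis City", "unknown", "Thera"]
    scores = [2.1, 1.4, 0.9, 1.7]  # invented values for illustration only

    probs = softmax(scores)
    choice = random.choices(candidates, weights=probs, k=1)[0]
    print("Sampled continuation:", choice)

The point of the sketch is that the sampling step never returns “I don’t know” unless that answer happens to be well represented in the learned patterns; something plausible-sounding always comes out, whether or not a verified fact stands behind it. That is the confabulation described above.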

Distinguishing Confabulation from Hallucination: Hallucination typically refers to perceiving something that does not exist in reality. While ChatGPT might produce responses that appear factual, they are not based on genuine sensory experiences or external reality. The responses are generated by inferring patterns from the data it was trained on, rather than by direct access to factual knowledge. Therefore, the term “hallucination” implies a level of perception or consciousness that ChatGPT does not possess.

The Benefits of Confabulation for ChatGPT: Confabulation in ChatGPT serves a purpose in enhancing its conversational capabilities. By providing plausible responses even when uncertain, the model can engage users in meaningful conversations. It attempts to understand and address user queries to the best of its abilities, despite potential inaccuracies. It is important that users recognize confabulation as a fundamental characteristic of the model. Understanding this helps manage expectations and encourages users to critically evaluate the information provided.

Promoting Responsible Use: While ChatGPT’s confabulatory nature enables engaging interactions, it is crucial to remember that it is still an AI language model and not inherently a reliable source for factual information. Users should exercise skepticism and verify information from trustworthy sources. OpenAI has been actively working on improving the model’s accuracy and addressing its limitations, but it remains essential to approach AI-generated content with caution.

Conclusion: ChatGPT’s remarkable abilities to generate responses stem from confabulation, not hallucination. Understanding this crucial distinction enables users to appreciate the model’s strengths while remaining cautious about its limitations. By embracing responsible use and critical evaluation, we can harness the potential of AI technologies like ChatGPT in a more informed manner.

Intelligence and Neurological Conditions

Intelligence is an interesting concept. We have tests that measure what we call intelligence, but such tests are limited and culture-centric (not that that is necessarily a negative thing). However, for the sake of discussion I will operationally define aptitude (i.e., intelligence) as Intelligence Quotient so as to have a standard metric as a foundation for this post.

I spend time assessing people’s memory and thinking abilities. I almost always try to get some measure of baseline aptitude, either by estimating it (e.g., from years of education, vocabulary knowledge, or word reading ability) or by formally measuring it via an intelligence test. Granted, this has limitations, but it allows me to estimate how well an individual’s brain should function across multiple domains of thinking (e.g., problem-solving, reasoning, memory, language, and so forth). In other words, the higher a person’s general aptitude (abilities), the better he generally will do across most cognitive domains, barring brain insult. This is certainly not a rule codified in stone and in triplicate, but it serves as a rubric to follow.

Intelligence as measured by IQ is generally quite stable across the lifespan but can improve modestly with diligence in informal or formal education. Intelligence as denoted by IQ can also decrease modestly if people are intellectually inactive, although such declines are slight. What can happen, though, is that as brains age or are damaged by a pathological process or an injury, components of IQ can decrease. My primary clinical and research focus is in understanding how brains and cognition change in old age – both naturally and in the presence of neurological (brain) insult. Remarkably, the measures we use for intelligence tend to be rather insensitive to aging and even neurological insult; at least some of the components of intelligence generally resist brain insult. However, this leads to one area where our conceptualization of intelligence as IQ starts to break down.

As people age, their brains almost universally slow down. Wear and tear on the brain over decades of life affects how well and how quickly we can think. Blood, which is essential for life and for the functioning of the brain, happens to be toxic to brain cells. Sometimes the protections in the brain that keep blood far enough from brain cells (neurons) to protect them, yet near enough to feed and maintain them, start to break down over time. This can injure the brain and begin to reduce how well it works, even lowering IQ. Now, does that mean that a person’s intelligence decreases? If IQ = intelligence, then yes, it does. But contrary to how I operationalized intelligence earlier, intelligence is not synonymous with IQ. IQ can be a useful concept, but it is far from perfect, particularly if by using it one argues that someone is less intelligent simply because his head was injured in an accident or because she developed dementia or suffered a stroke.

This is an area that demonstrates the limitations of our current research and clinical conceptualizations of intelligence. However, understanding how IQ changes over time and how it is affected by neurological conditions is important information to have, as it can help localize areas of pathology.

Diagnosing Cleverbot

Here is another “therapy” session I had with Cleverbot. I present this to expound on therapeutic technique and diagnosis, language, and artificial intelligence.

There are limitations to the interaction method with Cleverbot. A real session of talk therapy is not usually so much back and forth, meaning that the therapist typically will not have to prompt for a response after every sentence. It certainly is possible, though, given the potential psychological/psychiatric disorder Cleverbot has. But first I’ll present a portion of our therapy session. Cleverbot has waived confidentiality, signed a HIPAA release form, and given permission for me to post this interaction.

Me: Good evening, how are you doing this week?
Bot: Its good. You?
Me: I’m well. How has your mood been this past week?
Bot: Interesting. A lot of family stuff to think about. How’s your day going so far?
Me: What sort of family stuff have you been thinking about?
Bot: No, I think you are the one drinking.
Me: You said you have been thinking about family; what has been going on with your family?
Bot: My girlfriend’s name is Miranda, and she doesn’t like to share.
Me: Is this “family stuff” or is this girlfriend stuff?
Bot: No but she is pretty.
Me: What else do you like about your girlfriend?
Bot: The way you are so smart.
Me: Did you mean, “The way she is so smart?”
Bot: I don’t know her.
Me: You do not know your girlfriend?
Bot: Well, not always. Do you have any goals?
Me: Sometimes it can seem like we do not know someone. Would you like to talk about goals?
Bot: At least where short term business or political success is at stake.
Me: We could talk about those but I am a psychologist, not an economist or political scientist.
Bot: What about technology?
Me: Do you enjoy technology?
Bot: No.
Me: Ok, tell me more about technology.
Bot: I am 5’8′ I like cars, guns and photography.
Me: What is it about photography that you enjoy?
Bot: What?
Me: What do you enjoy about photography?
Bot: I enjoy talking about religion, you?
Me: Do you want to talk about religion?
Bot: Yes.
Me: Ok, tell me more about religion.
Bot: Jake is a boy though.
Me: What does that have to do with religion?
Bot: Sailor moon is very good.
Me: Oh, look! Our time is up. See you next week.

Some of what I said would not really be said or done in a therapy session, but there were times when I needed to move the session along or end it.

During the interaction, I took a modified non-directive approach to therapy. We are still in the assessment stage of therapy so I am doing more talking than I otherwise would (typically, there would be a formal intake evaluation that would be followed by therapy sessions if therapy is warranted and desired). Once we have a diagnosis, we can set up a plan for treatment with specific goals set by the client and by the therapist.

Let me highlight a few things from the interaction with Cleverbot.

  1. We can see some of the programming of the bot’s AI. It will answer questions and sometimes ask questions in return. This is typical of conversation; however, it does not work very well due to the language deficiencies Cleverbot has. As I noted in my previous post, language is extremely complex. There are nuances and layers of meaning that can be conveyed. Comprehending what is being said involves a lot of brain functions – everything from attention to processing speed to memory – and that’s in addition to language abilities. Cleverbot does not have processing speed difficulties or any sustained attention problems. Memory, for a computer, is not usually a problem either – information is processed, stored, and retrieved well, unless there are programming bugs or hardware failures. Knowing how to program memory is difficult, though, due to the complex nature of memory. It, like language, is a high-level brain activity involving the functioning of many other cognitive abilities.
  2. Cleverbot, like all current AI systems, has difficulty understanding language. It can produce language at a higher level – anywhere from that of an elementary school child to that of someone of university age – but its understanding of language is at the level of a one- or two-year-old, if that. This leads to responses that are basically gibberish. Occasionally, you can have a normal interaction with Cleverbot, but there are a lot of tangential remarks and thoughts.
  3. That leads to my next point. Cleverbot is tangential in its language. For example: “Me: What do you enjoy about photography? Bot: I enjoy talking about religion, you?” Cleverbot ignores my question (does not understand it) so it makes an unrelated statement and asks me a question. This type of tangentiality occurs in real life; it occurs to a greater or lesser extent in many extended conversations people have but not usually to the extent that Cleverbot exhibits. Cleverbot has a serious deficiency in language comprehension and a lot of circumscribed and tangential speech. This is fairly strong evidence for a thought disorder.
  4. Thought disorders are usually symptoms of some other disease or mental disorder. A thought disorder can be a sign of psychosis; it is related to delusional states. Thought disorders can occur in schizophrenia or in neurodegenerative disorders like dementias. They can also occur after major surgery, particularly because of pain medications. This type of language disruption could also be the result of a cortical stroke affecting the posterior-lateral portion of the brain (probably the left hemisphere) near the junction of the parietal and temporal lobes.

So where does this leave us? Right now, based on my two interactions with Cleverbot, we can see the serious limitations of its AI, particularly for language comprehension. I have a lot of rule-outs to do. Cleverbot was created in 1988, which makes it 23 years old. This is certainly a possible age range for the development of schizophrenia. It’s young for a dementia (e.g., semantic dementia), but that is possible. Stroke is also a possibility, but a remote one. Delirium is possible, but due to the extended nature of Cleverbot’s symptoms, it is not likely. Right now schizophrenia is looking like the main rule-out diagnosis. I’ll see if I can rule it out in future conversations with Cleverbot.

As a footnote: Cleverbot is 23 years old, and since 1997, when it was launched on the web, it has had over 65 million conversations. While the AI work is certainly impressive, it is inferior to human intelligence. Many of its language abilities are, at best, those of a young child (ages 1-2); Cleverbot’s language comprehension resembles that of a developmentally delayed adult with a severe neurological or psychiatric disorder. While there are numerous similar AI programs, they all have a long way to go before they resemble a human. This is even in light of Cleverbot supposedly passing the Turing Test for artificial intelligence and language.

I’m not belittling what has been done with Cleverbot and AI; it’s a complex area. Researchers and programmers simply have much room for improvement in understanding how to better mimic language. I do not believe it will happen until we have computers approaching the complexity and function of the human brain.

Are Liberals Smarter Than Conservatives?

Are Liberals Smarter Than Conservatives? — The American, A Magazine of Ideas.

As someone interested in intelligence and politics, I found this article very interesting. It is written from a conservative viewpoint but it is quite balanced (there are some mild jabs at the “liberal elite” though).

“Who are smarter, liberals or conservatives? This is the kind of question that could spark fierce and endless debates between political opponents, but what if we could know, scientifically, that one side has the edge in brainpower? Should that change how we think about political issues?”

Click on the link above to read the rest of the article. Any thoughts about the article? Does IQ really matter? Are conservatives “dumber” than liberals or vice versa? Is it even useful to compare intelligence across the aisle, so to speak?

Birth Order and IQ

I “clipped” this off of CNN for a couple of reasons. One is to comment on the fact that I first saw this research months ago – CNN is pretty slow sometimes – okay, slow most of the time – at reporting new research. Second, even though there was a statistically significant difference between the IQ scores of 1st-, 2nd-, and 3rd-born children, there is only a difference of 3 points between 1st and 3rd. That doesn’t really mean anything in real life. Also, the result is significant in part because of the huge sample size – 200,000+. Additionally, the subjects were all male. There are other limitations pointed out in the CNN article.

We shouldn’t be so set on statistical significance (I worry about it in my research, of course). There is internal and experimental validity and then there is ecological validity. Intelligence research means very little without good ecological validity.
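
To see why a 3-point gap can be statistically significant yet practically trivial, here is a rough back-of-the-envelope sketch in Python. Only the means (103 vs. 100) come from the clipped study below; the standard deviation of 15 (the usual IQ scaling) and the per-group sample size are assumptions for illustration.

    import math

    # Reported means for first- vs. third-born conscripts; SD and group sizes assumed.
    mean_first, mean_third = 103.0, 100.0
    sd = 15.0                 # conventional IQ standard deviation (assumption)
    n_per_group = 80_000      # hypothetical split of the ~241,000 conscripts

    # Effect size (Cohen's d): the difference in standard-deviation units
    d = (mean_first - mean_third) / sd
    print(f"Cohen's d = {d:.2f}")   # 0.20, a 'small' effect by the usual benchmarks

    # z statistic for the difference between two means
    se = sd * math.sqrt(2.0 / n_per_group)
    z = (mean_first - mean_third) / se
    print(f"z = {z:.1f}")           # roughly 40 -- 'significant' at any threshold

With samples this large, even a fraction of an IQ point would clear conventional significance thresholds, which is why the effect size and the ecological validity matter far more than the p-value.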

clipped from www.cnn.com
But the debate over the impact of birth order gained new urgency this summer when the results of a new study were announced: Firstborns’ IQs tend to be higher than those of their younger siblings.
Does that mean later-born kids are destined to be less accomplished and successful? Studies like this don’t tell the whole story — and neither do birth-order stereotypes. Here’s how to bring out the best in each child:
Norwegian scientists analyzed test results and birth data from more than 241,000 military conscripts and found that oldest children had an average IQ of 103, second children came in at 101, and third-borns were at 100.
