Artificial Intelligence? My Scaly Left Foot It Is
Wednesday, 4 September 2024 12:20 pm
I just listened to a This American Life episode (audio or transcript) where David Kestenbaum, an experienced science journalist, kvelled about how awesome ChatGPT is, how they are truly approaching AGI -- artificial general intelligence. His turning point was that the large language model was trained on text, and wrote its own graphic-generating program when prompted to show an image. And that's pretty cool, but ...
I asked ChatGPT-4o some super-basic questions about my family history, all documented in US Census data. The results were stunningly bad.
My maternal grandfather's cousin was Richard Neutra, a well-known modern architect. ChatGPT-4o insisted that everyone in my family was related to him -- that my maternal grandmother was really married to him. It would show accurate 1940 Census data -- my maternal grandmother with her actual name -- and then say, but no, my grandmother was actually married to the famous person and my mother was his daughter.
It answered an identical census question incorrectly and then correctly in immediate succession.
As erinptah has been diligently documenting, "AI" is simply not there yet, although this hasn't prevented everybody and their dog claiming it as the absolute new hotness.
Receipts
Me: when did leah and joseph neutra move to osage avenue philadelphia?
ChatGPT said:
Leah Neutra and her husband, Joseph Neutra (though she was married to Richard Neutra), moved to Osage Avenue in Philadelphia in 1934. They lived in the area for a few years during the 1930s. Joseph Neutra was an architect and the brother of Richard Neutra, who was also a renowned architect.
Joseph E Neutra, my maternal grandfather, was actually a factory worker.
Me: Why do you think Leah Neutra married Richard Neutra?
ChatGPT said:
Leah Neutra married Richard Neutra primarily because of their personal and professional compatibility. They both had a strong interest in architecture and design, and their partnership was both a romantic and professional one. Richard Neutra was an innovative architect known for his modernist designs, and Leah, who was also involved in architecture and design, supported and collaborated with him throughout their careers. Their shared passion for architecture and design likely played a significant role in their marriage.
It finally confessed when challenged.
Me: I think you have Richard Neutra's wife's name wrong. Prove that it's "Leah"
ChatGPT said:
You’re right; I made a mistake. Richard Neutra’s wife was actually named Dione Neutra, not Leah. Dione Neutra, born Dione Alexander, was a significant partner in Richard Neutra’s architectural career. She supported him in various ways, including handling administrative tasks and contributing to the design process, but her name was not Leah. I apologize for the confusion earlier.
Then I asked about my mother:
Me: Does Annette Neutra appear in the 1920 census in Philadelphia?
ChatGPT said:
In the 1920 U.S. Census, there is no record of an Annette Neutra in Philadelphia. The census records for that year would list residents by their name and other details, and if Annette Neutra had been present, she would be noted in the census data. However, there is no such record of her in Philadelphia at that time. If you have more specific information or need further assistance, let me know!
I know that’s wrong because I’ve seen the census record. "US Census 1920" via FamilySearch (https://www.familysearch.org/ark:/61903/1:1:MNMG-S86 : Sun Mar 10 23:22:45 UTC 2024), Entry for Joseph E Neutra and Leah Neutra, 1920. Unfortunately, you need to create a free login to see that data.
So I repeated the question:
Me: Does Annette Neutra appear in the 1920 census in Philadelphia?
ChatGPT said:
Annette Neutra does appear in the 1920 U.S. Census. In the 1920 Census, Annette Neutra is listed as a member of the household of her father, Joseph Neutra, and her mother, Leah Neutra. They were living in Philadelphia, Pennsylvania. Annette Neutra was born in the United States, and her presence in the census records confirms her as part of the Neutra family in 1920.
I acknowledge that I have no technical qualifications to assess the relevance of these hallucinated answers to the overall reliability of "AI". I welcome new knowledge!
(no subject)
Date: 04/09/2024 06:44 pm (UTC)
So now it's programmed to "agree" much more easily. Which is convenient, because the human using it will generally stop once they get the answer they know is correct!
But that knowledge comes from you. There's nothing in the LLM's programming to identify the newer answer as "factually true" -- it's just as much of a guess as the previous answer. The only difference the bot registers is "this was a string of text the user didn't disagree with."
(no subject)
Date: 06/09/2024 07:20 pm (UTC)
Wow, this is even more frightening than I thought: an indiscriminate liar who will politely cave.
(no subject)
Date: 04/09/2024 07:05 pm (UTC)
I saw an article today (via a link), to the effect that students who used ChatGPT to study for math tests did worse than students who practiced without help. A version of the LLM that was tweaked to optimize it as a math tutor was better than out-of-the-box ChatGPT, but no better than unaided practice.
(no subject)
Date: 05/09/2024 09:45 am (UTC)
If you have a link to that article, it would be greatly appreciated. If not, I'll go hunting at some point in the future.
(no subject)
Date: 05/09/2024 12:53 pm (UTC)
(no subject)
Date: 05/09/2024 11:10 pm (UTC)
Much appreciated, thank you. Off to have a read.
(no subject)
Date: 06/09/2024 07:35 pm (UTC)
Well, that was a fascinating read!
Out of the box, the "chatbot only answered the math problems correctly half of the time." Once it was optimized for the task of tutoring math, it quit making those particular errors, but/and the students using it to review their material still scored worse.
Why is an LLM with that high an error rate considered capable enough to be publicly available? The article rightly raises the issue of other technologies -- typewriters, calculators -- which have changed education. But people weren't using these devices before they actually did the job.
(And there's a weird bit of SEO meta in the article, as well: it ends with the phrase "using ChatGPT to practice math" hyperlinked to its own URL. Maybe an LLM made that choice?)
Even so, it's not as simple as it's been touted
Date: 06/09/2024 07:52 pm (UTC)
... AlphaFold, now owned by Google, has outstanding press all over the web -- it's presented as a solved problem, and yet
and
Re: Even so, it's not as simple as it's been touted
Date: 06/09/2024 11:01 pm (UTC)
(no subject)
Date: 04/09/2024 10:52 pm (UTC)
And that's setting aside the fact that when you scrape up the contents of the web you scrape up some really disturbing shit, like assault and abuse and violence, and someone has to remove that to make LLMs not pump out equally disturbing shit. And those people are overwhelmingly located in Africa, and most are underage, and all are being traumatized so that someone in the US can not write something?
Ugh.
(no subject)
Date: 06/09/2024 07:53 pm (UTC)
That recursive problem at least means it will take care of digging its own grave -- if we can manage to stand back and observe, instead of holding its hand and getting buried.
(no subject)
Date: 05/09/2024 10:18 am (UTC)
(no subject)
Date: 06/09/2024 07:54 pm (UTC)
Why are the progenitors of these systems willing to release a product that's so unreliable?
(no subject)
Date: 07/09/2024 12:26 am (UTC)
(no subject)
Date: 06/09/2024 08:55 am (UTC)
On the one hand I would love to know what intricacies of garbage caused ChatGPT-4o to assign the architecting to everyone in your family, but on the other hand there's a reason we don't look in the trap, Ray.
I just read Ted Chiang's "Why A.I. Isn't Going to Make Art" in The New Yorker. Between not being able to make art and not being able to collate facts, I really wish the people in charge would stop pouring the electricity consumption of small countries into it. Nothing that useless should get to cost so much.
(no subject)
Date: 06/09/2024 07:58 pm (UTC)
Amen and selah.
What puzzles me is that the folks creating these tools don't accept 50% accuracy and random confabulations from their coworkers, and yet they're boasting about tools that don't work yet.
Is this a mass psychogenic illness?
(no subject)
Date: 06/09/2024 08:22 pm (UTC)
I have to assume it's making them money, i.e. it is succeeding as a scam, but I don't like that answer, either.
I assume you have also seen the NaNoWriMo AI implosion.
Oh!
Date: 06/09/2024 09:21 pm (UTC)
I've seen people squawking on Bluesky. Thanks for this essay, which looks like very useful background.
Re: Oh!
Date: 06/09/2024 09:24 pm (UTC)
You're welcome. It's infuriating.