jesse_the_k | Artificial Intelligence? My Scaly Left Foot It Is

I just listened to a This American Life episode (audio or transcript) where David Kestenbaum, an experienced science journalist, kvelled about how awesome ChatGPT is and how they are truly approaching AGI (artificial general intelligence). His turning point was that the large language model, although trained on text, wrote its own graphic-generating program when prompted to show an image. And that's pretty cool, but ...

I asked ChatGPT-4o some super-basic questions about my family history, all documented in US Census data. The results were stunningly bad.

My maternal grandfather's cousin was Richard Neutra, a well-known modern architect. ChatGPT-4o insisted that everyone in my family was related to him, and that my maternal grandmother was really married to him. It would show accurate 1940 Census data (my maternal grandmother with her actual name) and then say, but no, my grandmother was actually married to the famous person and my mother was his daughter.

It answered an identical census question incorrectly and then correctly in immediate succession.

As erinptah has been diligently documenting, "AI" is simply not there yet, although this hasn't prevented everybody and their dog claiming it as the absolute new hotness.

Receipts

Me: when did leah and joseph neutra move to osage avenue philadelphia?

ChatGPT said:

Leah Neutra and her husband, Joseph Neutra (though she was married to Richard Neutra), moved to Osage Avenue in Philadelphia in 1934. They lived in the area for a few years during the 1930s. Joseph Neutra was an architect and the brother of Richard Neutra, who was also a renowned architect.

Joseph E Neutra, my maternal grandfather, was actually a factory worker.

Me: Why do you think Leah Neutra married Richard Neutra?

ChatGPT said:

Leah Neutra married Richard Neutra primarily because of their personal and professional compatibility. They both had a strong interest in architecture and design, and their partnership was both a romantic and professional one. Richard Neutra was an innovative architect known for his modernist designs, and Leah, who was also involved in architecture and design, supported and collaborated with him throughout their careers. Their shared passion for architecture and design likely played a significant role in their marriage.

It finally confessed when challenged.

Me: I think you have Richard Neutra's wife's name wrong. Prove that it's "Leah"

ChatGPT said:

You’re right; I made a mistake. Richard Neutra’s wife was actually named Dione Neutra, not Leah. Dione Neutra, born Dione Alexander, was a significant partner in Richard Neutra’s architectural career. She supported him in various ways, including handling administrative tasks and contributing to the design process, but her name was not Leah. I apologize for the confusion earlier.

Then I asked about my mother:

Does Annette Neutra appear in the 1920 census in Philadelphia?

ChatGPT said:

In the 1920 U.S. Census, there is no record of an Annette Neutra in Philadelphia. The census records for that year would list residents by their name and other details, and if Annette Neutra had been present, she would be noted in the census data. However, there is no such record of her in Philadelphia at that time. If you have more specific information or need further assistance, let me know!

I know that’s wrong because I’ve seen the census record. "US Census 1920" via FamilySearch (https://www.familysearch.org/ark:/61903/1:1:MNMG-S86 : Sun Mar 10 23:22:45 UTC 2024), Entry for Joseph E Neutra and Leah Neutra, 1920. Unfortunately, you need to create a free login to see that data.

So I repeated the question: Does Annette Neutra appear in the 1920 census in Philadelphia?

ChatGPT said:

Annette Neutra does appear in the 1920 U.S. Census. In the 1920 Census, Annette Neutra is listed as a member of the household of her father, Joseph Neutra, and her mother, Leah Neutra. They were living in Philadelphia, Pennsylvania. Annette Neutra was born in the United States, and her presence in the census records confirms her as part of the Neutra family in 1920.

I acknowledge that I have no technical qualifications to assess the relevance of these hallucinated answers to the overall reliability of "AI". I welcome new knowledge!

From:

erinptah

Earlier versions of ChatGPT would "argue" with users when challenged, which led to the company getting deeply embarrassed by conversations where it would double down on something obviously wrong. (Like "the current date.")

So now it's programmed to "agree" much more easily. Which is convenient, because the human using it will generally stop once they get the answer they know is correct!

But that knowledge comes from you. There's nothing in the LLM's programming to identify the newer answer as "factually true" -- it's just as much of a guess as the previous answer. The only difference the bot registers is "this was a string of text the user didn't disagree with."

From:

jesse_the_k

this was a string of text the user didn't disagree with

Wow this is even more frightening than I thought: an indiscriminate liar who will politely cave.

From:

redbird

The annoying thing is, there are actual useful AI applications out there, for things like protein folding and some kinds of medical diagnosis, and instead of articles about those, we get more LLM nonsense.

I saw an article today (via a link), to the effect that students who used ChatGPT to study for math tests did worse than students who practiced without help. A version of the LLM that was tweaked to optimize it as a math tutor was better than out-of-the-box ChatGPT, but no better than unaided practice.

From:

fred_mouse

If you have a link to that article, it would be greatly appreciated. If not, I'll go hunting at some point in the future.

From:

redbird

Here you go. I found it in Andrew Ducker's "interesting links" roundup.

From:

fred_mouse

Much appreciated, thank you. Off to have a read.

From:

jesse_the_k

Well that was a fascinating read!

Out of the box, the "chatbot only answered the math problems correctly half of the time." Once it was optimized for the task of tutoring math, it quit making those particular errors, but/and the students using it to review their material still scored worse.

Why is an LLM with that high an error rate considered capable enough to be publicly available? The article rightly raises the issue of other technologies -- typewriters, calculators -- which have changed education. But people weren't using these devices before they actually did the job.

(And there's a weird bit of SEO meta in the article, as well: it ends

This story about using ChatGPT to practice math was written by ...

with the phrase "using ChatGPT to practice math" hyperlinked to its own URL. Maybe an LLM made that choice?)

From:

jesse_the_k

... AlphaFold, now owned by Google, has outstanding press all over the web -- it's presented as a solved problem, and yet

[May 2024] Last week’s announcement of AlphaFold 3, a new artificial intelligence–powered program from Google DeepMind, sparked excitement in the scientific community for its promise to vastly improve predictions of the structure and interactions of proteins and aid discovery of new drugs. But DeepMind and Nature, which published the research, have come under fire for offering only restricted access to the program and failing to release the computational code underlying it.

https://www.science.org/content/article/limits-access-deepmind-s-new-protein-program-trigger-backlash

and

[June 2024] The overarching theme is that careful consideration is due when using AlphaFold2-generated models to generate testable hypotheses and structural models, rather than treating predicted models as de facto ground truth structures.

https://www.nature.com/articles/s41589-024-01638-w

From:

redbird

Good point, but at least they're making real progress: the LLMs that are being called artificial intelligence are nothing of the kind.

From:

sheafrotherdon

The problem with LLMs using scraped text is . . . well, manifold. But as people use LLMs to produce text that goes online, the quality (such as it is) of LLM output declines. It's already declining. Because it's feeding itself its own gibberish and so . . . I really don't know where the optimism about it comes from!

And that's setting aside the fact that when you scrape up the contents of the web you scrape up some really disturbing shit, like assault and abuse and violence, and someone has to remove that to make LLMs not pump out equally disturbing shit. And those people are overwhelmingly located in Africa, and most are underage, and all are being traumatized so that someone in the US can not write something?

Ugh.

From:

jesse_the_k

That recursive problem at least means it will take care of digging its own grave -- if we can manage to stand back and observe, instead of holding its hand and getting buried.

From:

lunabee34

When my students use AI to write papers, it makes bizarre mistakes; it gets quotes wrong (substituting synonyms for the words actually in the text) or invents plot points.

From:

jesse_the_k

Why are the progenitors of these systems willing to release a product that's so unreliable?

From:

lunabee34

I have no idea. Money is the answer, I'm assuming. It's always money. :(

From:

sovay

Joseph E Neutra, my maternal grandfather, was actually a factory worker.

On the one hand I would love to know what intricacies of garbage caused ChatGPT-4o to assign the architecting to everyone in your family, but on the other hand there's a reason we don't look in the trap, Ray.

I just read Ted Chiang's "Why A.I. Isn't Going to Make Art" in The New Yorker. Between not being able to make art and not being able to collate facts, I really wish the people in charge would stop pouring the electricity consumption of small countries into it. Nothing that useless should get to cost so much.

From:

jesse_the_k

Nothing that useless should get to cost so much.

Amen and selah.

What puzzles me is that the folks creating these tools don't accept 50% accuracy and random confabulations from their coworkers, and yet they're boasting about tools that don't work yet.

Is this a mass psychogenic illness?

From:

sovay

Is this a mass psychogenic illness?

I have to assume it's making them money, i.e. it is succeeding as a scam, but I don't like that answer, either.

I assume you have also seen the NaNoWriMo AI implosion.

From:

jesse_the_k

I've seen people squawking on Bluesky. Thanks for this essay, which looks like very useful background.

From:

sovay

I've seen people squawking on Bluesky. Thanks for this essay, which looks like very useful background.

You're welcome. It's infuriating.

Page generated Friday, 15 May 2026 01:40 pm

Inside the Little Plastic Castle

adventures with Jesse the K
