jesse_the_k | boost: Adam Engst Learns Seven Agentic Web Browsers Can't Count

jesse_the_k

from someone who's a realist-for-now yet also wants to believe.

Adam Engst on Can Agentic Web Browsers Count?

tl;dr No, given a readily available data set on a webpage, they can't.

The sweetest and scariest part was his sympathy for Copilot's very anxious inner monologue as it tried to come up with answers while working to a deadline that nobody had created.

When it comes to system prompts, the anxious tone of Copilot’s internal responses suggests a “ship now, apologize later, if you’re caught” system prompt that, if reflected in a real-world workplace, would be problematic. Obviously, AIs don’t have feelings that can be hurt and won’t complain to HR, but such a culture tends to encourage people to cut corners and make poor decisions that compromise quality and customer service. If Copilot is any indication, the same is true for AIs.

From:

merrileemakes

This was an interesting article, thanks for sharing.

I didn't know you could now see that internal monologue of some AIs. It helps to combat the black box problem, but opens up a whole bunch of new ones. Poor Copilot.

From:

jesse_the_k

Glad to share!

Could it be a conscious choice to make Copilot that uncertain and unable to speak up for itself? Is it a plan to prevent union activity among artificial "intelligences"? Are the LLM engineers so accustomed to those unreasonable time pressures on their own work that they've recreated those attitudes in their progeny?

From:

jadelennox

I am absolutely floored by:

The only problem was that when I checked the Google Sheet against the reported numbers, the results for the Strides of March meet were wrong (196 instead of 173). So I trimmed the data in the spreadsheet to just that meet and used BBEdit to compare it against the actual confirmation list, which is when I discovered that ChatGPT Atlas had gone full cuckoo with the Strides of March registration list. It had replaced numerous people with completely fabricated names and ages and increased the total number of registrants by four.

Since the spreadsheet contents for both the January Jicker and February Flash Dash meets were completely correct, I would guess this is another case of the chatbot losing its context window due to too much data. Had ChatGPT Atlas not hallucinated data in the spreadsheet, it would have received an A.

So: "it turned out that when I double checked it, it was utterly full of incorrect made-up nonsense, and it's mostly by chance that the results were close to right. But if that hadn't happened, it would have been perfect. Ah, well, B."

Edited Date: 15/11/2025 02:54 am (UTC)
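
For anyone who wants to repeat the kind of check described in the quoted passage (comparing an extracted registrant list against the confirmation list), here is a minimal Python sketch. Engst used BBEdit for his comparison; the function name and the sample names below are hypothetical placeholders, not data from the article.

```python
from collections import Counter


def compare_registrants(extracted, confirmed):
    """Return (fabricated, missing) entries between two name lists.

    Names are normalized (trimmed, lowercased) and compared as multisets,
    so a name that appears twice in one list and once in the other is
    still reported.
    """
    extracted_counts = Counter(name.strip().lower() for name in extracted)
    confirmed_counts = Counter(name.strip().lower() for name in confirmed)

    # Entries present in the extracted list but not in the confirmed list.
    fabricated = sorted((extracted_counts - confirmed_counts).elements())
    # Entries present in the confirmed list but absent from the extraction.
    missing = sorted((confirmed_counts - extracted_counts).elements())
    return fabricated, missing


# Hypothetical sample data, for illustration only.
confirmed = ["Ada Example", "Ben Sample", "Cy Placeholder"]
extracted = ["Ada Example", "Ben Sample", "Dee Invented", "Eve Madeup"]

fabricated, missing = compare_registrants(extracted, confirmed)
print("Not on the confirmation list:", fabricated)
print("Dropped from the extraction:", missing)
```

A matching row count would not be enough on its own, since replaced names leave the total unchanged; comparing the entries themselves catches both substitutions and additions.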

From:

yourlibrarian

And this also assumes that whoever is working with or checking the AI's work actually knows anything about the task or issue. Which, if current students are any indication, they won't.

From:

jesse_the_k

Seems to be going around: a willingness to accept 88% correct as good enough.

I was very surprised to see a standard that squishy from Engst, whose Apple-oriented tech support site predates the web.

From:

davidgillon

When Google searches started adding AI summaries at the top, the first one I saw said something on the lines of "the 2024 budget is $183m, by 2026 this will have increased to $163m".

From:

jesse_the_k

I simply don't understand how someone with a high school diploma, much less a college degree, can accept consistent, significant numerical errors as "good enough."

Page generated Saturday, 4 April 2026 09:58 pm

Inside the Little Plastic Castle

adventures with Jesse the K
