
Which AI Large Language Model are you using?

In the way that a student becomes a teacher, and a trainee becomes a boss, so LLMs (through your asking questions and providing them with information) will eventually be capable of the same.
LLMs aren't intelligent; they're 'just' very good at repeating information back in a different format.

In information theory terms they don't add anything to the system; they reduce information.
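(For the formally minded, that claim is the data processing inequality: if source, model and output form a chain X → Y → Z, where the output Z depends on the source X only through what the model was given, then I(X; Z) ≤ I(X; Y). Processing can reshuffle information; it can never add any.)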

They're fantastic tools, but they really don't deserve the term 'intelligence'. If you keep feeding them the output of LLMs then they progressively get less and less useful. Like a digital version of 'Idiocracy'.

Or in other words, they're closer to HAL than Skynet or Data.
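You can watch that degradation in miniature with a toy sketch, purely illustrative (numbers rather than language, and nothing like real LLM training): fit a Gaussian to some data, sample from the fit, refit on the samples, and repeat. Each refit narrows the distribution slightly in expectation, so successive 'generations' get blander:

```python
# Toy illustration of models trained on their own output.
# Fit a Gaussian, sample from the fit, refit on the samples, repeat.
# With n samples the fitted variance shrinks by a factor of (1 - 1/n)
# in expectation each generation, so the distribution slowly collapses.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=50)   # the original "human" data

for generation in range(20):
    mu, sigma = data.mean(), data.std()          # "train" on current data
    print(f"gen {generation:2d}: mean={mu:+.3f}, std={sigma:.3f}")
    data = rng.normal(mu, sigma, size=50)        # next gen sees only model output
```

Not how LLMs work under the hood, of course, but it's the same feedback loop.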
 
> In the way that a student becomes a teacher, and a trainee becomes a boss, so LLMs (through your asking questions and providing them with information) will eventually be capable of the same.
Yes, if facts could be crowdsourced. But we know they can't.

It tends to believe the things it reads.
 
> LLMs aren't intelligent; they're 'just' very good at repeating information back in a different format.
>
> In information theory terms they don't add anything to the system; they reduce information.
>
> They're fantastic tools, but they really don't deserve the term 'intelligence'. If you keep feeding them the output of LLMs then they progressively get less and less useful. Like a digital version of 'Idiocracy'.
>
> Or in other words, they're closer to HAL than Skynet.

GIGO.

Which is why I was careful to specify the user "providing them with information", rather than letting them fetch it themselves (one of my staff made that mistake).

I didn't claim LLMs to be "intelligent" but, because they neither tire nor forget nor are hampered by emotion, they will outperform humans far more often than humans will outperform them.
 
> Yes, if facts could be crowdsourced. But we know they can't.
>
> It tends to believe the things it reads.


"Write me a 5000 word essay on insulating a typical UK house".

"Write me a 5000 word essay on insulating a typical UK house, using XXXXX and XXXX as the reference material".
 
> GIGO.
>
> Which is why I was careful to specify the user "providing them with information", rather than letting them fetch it themselves (one of my staff made that mistake).
>
> I didn't claim LLMs to be "intelligent" but, because they neither tire nor forget nor are hampered by emotion, they will outperform humans far more often than humans will outperform them.
That was partly a response to your post, and partly general musings.

Not yet: their output tends to be average and generic rather than best in class. They're good at doing the simple stuff insanely fast, but they're weak at doing the high-end stuff. A line I use a lot is that you should treat them as a super-keen intern.

LLMs are fast but, on quality, they underperform a human trained in the topic. It's other AI tools, like cancer-scan screening models, that outperform humans.

Where 'OK' and fast is acceptable, though, they do win.
 
"Write me a 5000 word essay on insulating a typical UK house".

"Write me a 5000 word essay on insulating a typical UK house, using XXXXX and XXXX as the reference material".
This is a good example of where an LLM will do acceptably: a generic task with lots of source material for it to lean on. The source material hasn't changed dramatically, so it's internally consistent, and there are a limited number of approaches and standards.

They'll churn out a distinctly average report for you in no time.
 
> That was partly a response to your post, and partly general musings.
>
> Not yet: their output tends to be average and generic rather than best in class. They're good at doing the simple stuff insanely fast, but they're weak at doing the high-end stuff. A line I use a lot is that you should treat them as a super-keen intern.
>
> LLMs are fast but, on quality, they underperform a human trained in the topic. It's other AI tools, like cancer-scan screening models, that outperform humans.
>
> Where 'OK' and fast is acceptable, though, they do win.


I think you've, perhaps inadvertently, made my point: the boundaries you set determine who (or what) "wins".

Set it up for humans to outperform, and they should.
And vice versa.
 
> I think you've, perhaps inadvertently, made my point: the boundaries you set determine who (or what) "wins".
>
> Set it up for humans to outperform, and they should.
> And vice versa.
I just spent two hours with a company exploring the limits of LLMs, so please, tell me more. :)

In terms of written output, they're great at summarising or providing a draft, but users consistently rate the results as not good enough. Even if you know how to get one to emulate your tone of voice, the output frequently still needs work, and they do make basic mistakes.

In the house-insulation example, they'll often include things like an attachment method for fibre batts when talking about fixing PIR. It's why you should always be fact-checking your output.
 
> I just spent two hours with a company exploring the limits of LLMs, so please, tell me more. :)

I'll have to look up the reference, but I recall a study where the groups were solely AI, solely humans, and AI plus humans, where the humans were allowed to use their "experience", "gut instinct" etc. to make the final decision.

The AI-only results were better than those of both other groups.

> It's why you should always be fact-checking your output.

I've posted this before: we may be at a sweet spot in that we have enough background knowledge to be able to proofread the AI output.
I gave the example where my deputy, for sh!ts and giggles, used AI to write a report for him.

It was beautifully worded, well-structured, and eminently plausible.
Full of cobblers, but an otherwise wholly believable piece. Only because he knew the subject matter did he know this.

Roll forward a few years (or use it injudiciously), and Gawd knows the veracity of the output.
And without proper oversight, it will be self-referential too...
 
> I'll have to look up the reference, but I recall a study where the groups were solely AI, solely humans, and AI plus humans, where the humans were allowed to use their "experience", "gut instinct" etc. to make the final decision.
There are examples of AI models outperforming even experts at some kinds of analysis. But those are mostly not general-purpose AIs; they're specialist models trained in a set area.
 
> There are examples of AI models outperforming even experts at some kinds of analysis. But those are mostly not general-purpose AIs; they're specialist models trained in a set area.

Which is what I have been saying all along: with the right instruction and within the right parameters, AI roolz (y)
 
And it's what I've been saying: generally, they aren't LLMs.

Horses for courses.

Anyone using an off-the-shelf, free-to-use LLM shouldn't expect it to be anything other than average.
And this shouldn't be used as a valid argument for denigrating them either.

You don't get the bus and then complain that the engine note isn't Ferrari-like.
 