
In January, the UK government announced a suite of AI tools called “Humphrey” for use by civil service officials.
Humphrey will contain several products. One, called Consult, analyses responses to government consultations and distils the information into interactive dashboards. Other tools in the suite will help civil servants take meeting minutes and search internal documents.
The suite was named in homage to the Machiavellian permanent secretary Humphrey Appleby from the classic sitcom Yes Minister. While the tongue-in-cheek naming has raised some eyebrows, it may be more appropriate than it first appears.
A powerful aid to understanding how LLMs function is outlined in an article for the academic journal Ethics and Information Technology titled ‘ChatGPT is Bullshit’. The authors compare the output of LLMs to, well, bullshit – hereafter referred to as BS.
To appreciate the comparison, we must think of BS in the sense defined by the moral philosopher Harry Frankfurt, author of On Bullshit. Frankfurt defines BS as statements intended to persuade without regard for truth. A “BS artist” says whatever they think will get the job done, indifferent to whether or not it is true. It’s a trait that is hardly rare in public life.
‘ChatGPT is Bullshit’ claims that LLMs have precisely the same characteristics as BS artists.
LLMs sometimes produce so-called hallucinations, in which the LLM makes claims that are not true. That is because LLMs base their outputs on statistical patterns in their training data – they generate what is likely, not what is accurate.
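As a loose illustration (and emphatically not how a production model is implemented), the toy Python sketch below answers a question by picking the statistically most common continuation in its “training data”. The corpus and the question are invented for the example; the point is only that truth never enters the calculation.

```python
from collections import Counter

# A toy corpus standing in for training data. The counts are invented
# for illustration; a real model learns from billions of documents,
# but the principle is the same.
corpus = [
    "the capital of australia is sydney",    # a popular misconception
    "the capital of australia is sydney",
    "the capital of australia is canberra",  # the true answer, seen less often
]

# Count which word follows the prompt in the "training data".
prompt = "the capital of australia is"
continuations = Counter(
    sentence[len(prompt):].strip()
    for sentence in corpus
    if sentence.startswith(prompt)
)

# The "answer" is simply the statistically likeliest continuation.
answer, count = continuations.most_common(1)[0]
print(answer)  # -> "sydney"
```

The toy model confidently returns the popular answer because it appears more often, not because it is right – exactly the behaviour we label a hallucination when it misleads us.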
The key insight to remember is that LLMs are always making things up. It’s what we are asking them to do. We have coded them to tell us BS.
Much of the time, the “statistical fictions” of LLMs seem correct to us, simply because they coincide with reality as we understand it.
But human users are able to apply a test – “is this correct?” – that the LLM is unable to perform.
When the LLM’s statements don’t align with truth or reality, we may be puzzled or amused and brand the falsehood a hallucination. But from the LLM’s perspective it’s all the same – there is no distinction between a hallucination and a non-hallucination.
Some marketing departments have promised that LLMs are a panacea for every problem. That’s evidently not true.
But they are not useless, either. Instead, LLMs have an unfamiliar kind of utility: they are masters of BS. From this perspective, the appropriate principle for designing tasks for an LLM becomes: which tasks would you be comfortable delegating to an excellent BS artist?
Back to Humphrey
The tools included in Humphrey are targeted at some of the tasks that LLMs do well: summarising the history of parliamentary debates, policy and laws.
Should we have any concerns about these tasks being performed with a certain amount of BS? What is the potential harm, for example, if a summary of laws includes a plausible-looking citation that turns out to be entirely fictional?
Anyone who has dealt with LLMs for serious work has likely encountered such questions.
In general, the risk of BS is manageable in two situations. Firstly, if the human user can immediately and easily detect BS, perhaps by being familiar enough with the domain of interest to rule out any made-up information.
And, secondly, if the use of LLMs is limited so that any generated text is, effectively, only a presentation of information that has already been vetted with high confidence. We would then be limiting the creative opportunities for the BS artist.
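For the curious, here is one hedged sketch of what such a constrained pipeline might look like in Python. Everything in it is invented for illustration – the call_llm() stub, the function names and the crude word-overlap guard – and none of it describes Humphrey or any real API.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; it returns the source text
    verbatim so the sketch runs end to end."""
    return prompt.split("\n\n", 1)[1]

def present_vetted(source: str) -> str:
    """Ask the LLM to restate already-vetted text, nothing more."""
    prompt = (
        "Restate the following text clearly. Use only information in the "
        "text; add nothing.\n\n" + source
    )
    draft = call_llm(prompt)
    # Crude guard: reject any draft that introduces words absent from the
    # vetted source. A real system would need far stronger checks.
    if not set(draft.lower().split()) <= set(source.lower().split()):
        raise ValueError("Draft contains unvetted content - needs human review")
    return draft

print(present_vetted("Consultation closed on 1 March. 4,200 responses were received."))
```

The design choice is the point: the model is handed only vetted text and asked to restate it, so the creative opportunities for the BS artist are deliberately minimal.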
Given the capabilities already demonstrated by LLMs, this restrictive approach can seem underwhelming.
However, a conservative approach is advantageous because no one needs to lose sleep over what kind of BS an LLM might produce.
More ambitious uses of LLMs, where they’re given more leeway, might be tempting, but this would risk undermining trust – something scarcely given to public servants as it is.
In order for Humphrey or any other AI tool like it to be a success, it’ll be important for users and decision-makers to have a clear understanding of how LLMs work and what their limitations are. This will help to avoid costly mistakes or erosion of trust.
It may be beneficial for those using Humphrey to pose the question: is this a task I would be comfortable delegating to an excellent BS artist?
