Artificial Intelligence is NOT, But Will RAG Augment Your LLM?

29 Nov 2023


First, it’s not intelligence. It has no grounded understanding or lived experience; it is not thinking in a humanly recognisable sense. AI does not understand and apply experience to make reasoned judgments, or form opinions, about matters it has not encountered.


Second, it’s not artificial, at least not anymore. It was originally very rule- and program-based, but its strength now is derived from appropriating the work of human creators - the programmers and the visual artists, musicians, scientists and other writers whose works are scraped from the internet and repurposed.


Third, at least up to now, most of the AI noise is around text-focused generative AI tools like ChatGPT and other Large Language Models (LLMs); see here for a clear 3-minute explanation from the University of Sydney. These programs are, in simple terms, an iPhone’s predictive text function on steroids. They still have significant limitations flowing from the way in which they are constructed and operate, which is to take an initial input and then serially predict the next possible word based on statistical relationships within vast volumes of “training data” scraped from the internet. Complex mathematical models represent words as vectors and learn relationships between them in neural networks, so that an input will generate an output.
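To make that concrete, here is a minimal sketch of the “predictive text on steroids” idea. It uses a toy word-counting (bigram) model in Python rather than a real neural network, and the tiny corpus is purely illustrative; real LLMs learn far richer statistics from vastly more data, but the serial next-word prediction loop has the same shape.

    # Toy "predictive text": count which word follows which in a tiny
    # corpus, then repeatedly emit the most frequent successor.
    from collections import Counter, defaultdict

    corpus = ("the doctor said he would review the scan and "
              "the doctor said he would call the patient").split()

    # next_counts[w] is a Counter of the words seen immediately after w.
    next_counts = defaultdict(Counter)
    for current, following in zip(corpus, corpus[1:]):
        next_counts[current][following] += 1

    def generate(prompt: str, length: int = 6) -> str:
        words = prompt.split()
        for _ in range(length):
            candidates = next_counts.get(words[-1])
            if not candidates:  # no statistics for this word, so stop
                break
            words.append(candidates.most_common(1)[0][0])
        return " ".join(words)

    print(generate("the doctor"))
    # -> "the doctor said he would review the doctor": plausible-looking,
    #    but produced by word statistics alone, with no understanding.

Note that the output is assembled word by word from frequency counts alone; nothing in the program knows what a doctor or a scan is.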


That gives rise to at least three problematic implications beyond the above:


•   The models perpetuate biases inherent in the training data. If, for example, “Doctor” is an input, the model will reproduce male pronouns alongside it in the output if that association dominated the training data (see the sketch after this list).

•   The models hallucinate. They are not (as yet) looking up facts in a dictionary or encyclopedia. They are merely producing plausible output based on statistical word association, so they may well be wrong.

•   Up to now, the programs have generally been static, limited to their training period and data. So, if the world changes, the model won’t know. Updating is costly in both time and money.
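
On the first point, the same counting approach shows how skewed training text yields skewed output. This sketch is illustrative only; the deliberately biased mini-corpus stands in for internet-scale training data.

    # Illustrative only: count which pronoun appears in the same sentence
    # as each occupation in a deliberately skewed mini-corpus.
    from collections import Counter, defaultdict

    sentences = [
        "the doctor said he would call",
        "the doctor said he was busy",
        "the nurse said she would call",
    ]

    pronouns = {"he", "she"}
    cooccur = defaultdict(Counter)
    for sentence in sentences:
        words = set(sentence.split())
        for role in ("doctor", "nurse"):
            if role in words:
                cooccur[role].update(words & pronouns)

    print(cooccur["doctor"])  # Counter({'he': 2})
    print(cooccur["nurse"])   # Counter({'she': 1})
    # A model trained on such text will simply reproduce the skew.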

Moreover, those implications are exacerbated by the nature of the training data, given the vast range and variable quality of the information and commentary available on the internet.


Retrieval-Augmented Generation (RAG) and other potential solutions are being developed to make LLM answers more reliable by instructing the model to retrieve relevant information from content stores that can be kept up to date (a minimal sketch follows). But LLMs do not know what is real and what is not. So don’t give up your day job just yet.
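For the technically curious, the RAG pattern can be sketched in a few lines. Everything here is an assumption for illustration: call_llm is a hypothetical stand-in for a real model API, and the word-overlap retrieval below is a crude substitute for the vector-embedding search production systems use; the shape of the pipeline is the point.

    # Minimal RAG sketch: score documents in an updatable store by word
    # overlap with the question, then hand the best match to the model
    # as context. call_llm is a hypothetical placeholder, not a real API.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError("stand-in for a hosted LLM call")

    # Unlike the model's frozen training data, this store can be updated.
    document_store = [
        "The Law Society of New South Wales published an AI guide in 2023.",
        "Retrieval-Augmented Generation grounds answers in retrieved text.",
    ]

    def retrieve(question: str, docs: list[str]) -> str:
        # Crude relevance score: number of shared lowercase words.
        q_words = set(question.lower().split())
        return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

    def answer(question: str) -> str:
        context = retrieve(question, document_store)
        prompt = (f"Answer using only this context:\n{context}\n\n"
                  f"Question: {question}")
        return call_llm(prompt)

Even so, the generation step is still statistical prediction, which is why retrieval narrows, but does not eliminate, the risk of confident error.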


These are just general observations, but for the legally inclined, see a guide just published by The Law Society of New South Wales on the responsible use of AI by solicitors.

The content of this article is intended to provide a general guide to the subject matter. Specific advice should be sought about your specific circumstances.