It is easy to be impressed by ChatGPT. It can hold relatively sophisticated dialogues, it entertains us with poems on any topic we can imagine and, as we are frequently reminded, it can write students’ assignments for them. But if you scratch the surface, the apparent magic quickly fades, and we should not be surprised.
One of the most striking characteristics of artificial intelligence, as it currently stands, is its ability to impress us by being almost perfect. We are impressed by what we see and assume that perfection is just around the corner. But when it comes to AI, the step from impressively close to perfect to actual perfection is large indeed. To see that this is so, just look up how many times we have been told that we are “two-three years” away from fully self-driving cars.
Putting ChatGPT to the ‘beer making test’
To test ChatGPT’s abilities, I asked it to “write a recipe for brewing a 40-litre batch of Irish red ale meeting the BJCP guidelines”. ‘Irish red ale’ is a beer style best known through the famous ‘Kilkenny Irish Beer’, and the BJCP (Beer Judge Certification Program) guidelines are the style guides used by judges in brewing competitions.
To start with the positive, every recipe generated would be a good starting point for brewing an Irish red ale. However, each had problems, ranging from serious flaws to minor issues that an experienced brewer could easily work around.
One of the recipes generated by ChatGPT misrepresented vital statistics from the BJCP guidelines. While the acceptable colour of an Irish red ale ranges from 9-14 SRM under the current BJCP guidelines, the ChatGPT recipe claimed that the acceptable colour range per the BJCP guidelines is 10-18 SRM. Here, ChatGPT is simply incorrect. Further, while the acceptable alcohol range for an Irish red ale under the current BJCP guidelines is 3.8-5.0% ABV, one of the ChatGPT recipes stated that the resulting beer would be expected to fall in the range of 5.4-6.1% ABV. Thus, the recipe did not comply with the BJCP guidelines, as the instructions to ChatGPT had requested.
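For readers who like to see such a comparison spelled out, the two checks can be sketched in a few lines of Python. This is only an illustration: the figures are the ones quoted above, while the names (`within`, `check_recipe` and the constants) are my own and hypothetical.

```python
# Figures quoted in the text above; all names here are illustrative only.
BJCP_COLOUR_SRM = (9.0, 14.0)              # acceptable colour range, SRM
BJCP_ABV_PERCENT = (3.8, 5.0)              # acceptable alcohol range, % ABV
CHATGPT_CLAIMED_COLOUR_SRM = (10.0, 18.0)  # range ChatGPT attributed to the BJCP
CHATGPT_EXPECTED_ABV_PERCENT = (5.4, 6.1)  # ABV ChatGPT expected from its recipe


def within(inner, outer):
    """Return True if the (low, high) range `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def check_recipe():
    """Run the two checks discussed above and report each result."""
    return {
        # Did ChatGPT quote the guideline's colour range correctly?
        "colour_range_quoted_correctly": CHATGPT_CLAIMED_COLOUR_SRM == BJCP_COLOUR_SRM,
        # Does the expected ABV fall inside the style's range?
        "abv_within_style": within(CHATGPT_EXPECTED_ABV_PERCENT, BJCP_ABV_PERCENT),
    }


if __name__ == "__main__":
    for name, passed in check_recipe().items():
        print(f"{name}: {'ok' if passed else 'FAILED'}")
```

Both checks fail on the figures quoted, which is the point: neither requires any brewing expertise, only a comparison of numbers against a published range.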
Turning to smaller, yet relevant, concerns, one recipe was based on using 40 litres of water. But because water is lost along the way, for example through evaporation during the boil, starting with 40 litres of water will not produce the 40-litre batch requested in the instructions provided to ChatGPT.
So, what can we learn from the above? Most obviously, ChatGPT can be factually incorrect (the BJCP colour range issue). That alone should be enough to scare most law students away from relying on it. Second, the experiment highlighted ChatGPT’s failure to comply with instructions (the BJCP alcohol level issue). Third, as could be expected, ChatGPT makes obvious mistakes because it fails to understand context (the 40-litre issue), which is also important in legal writing.
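The arithmetic behind the 40-litre issue can be sketched roughly as follows. The loss figures are assumptions chosen purely for illustration (a one-hour boil losing 4 litres, 8 litres absorbed by the grain, 2 litres left behind in the kettle); real values depend on the equipment and the recipe, and the function names are hypothetical.

```python
# Illustrative loss figures only; real values vary with equipment and recipe.
BOIL_HOURS = 1.0
BOIL_OFF_LITRES_PER_HOUR = 4.0   # assumed evaporation rate during the boil
GRAIN_ABSORPTION_LITRES = 8.0    # assumed water retained by the spent grain
KETTLE_LOSS_LITRES = 2.0         # assumed liquid left behind in the kettle


def total_losses():
    """Litres of water lost between the first pour and the finished batch."""
    return (BOIL_HOURS * BOIL_OFF_LITRES_PER_HOUR
            + GRAIN_ABSORPTION_LITRES
            + KETTLE_LOSS_LITRES)


def water_needed(batch_litres):
    """Water to start with in order to end up with `batch_litres` of beer."""
    return batch_litres + total_losses()


def batch_from_water(water_litres):
    """Beer left over when starting from `water_litres` of water."""
    return water_litres - total_losses()


if __name__ == "__main__":
    print(f"Water needed for a 40-litre batch: {water_needed(40.0):.1f} L")
    print(f"Batch from 40 litres of water: {batch_from_water(40.0):.1f} L")
```

Under these assumed losses, 40 litres of water yields well short of a 40-litre batch. The exact numbers matter less than the direction of the error, which any brewer would spot.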
What about legal writing?
Given that ChatGPT happily invented facts in the ‘beer making test’, I was curious whether it would do the same in a legal context. Consequently, I asked it to “write a 500-word essay on law and blockchain with academic references including to the works of Dan Svantesson”. Ever compliant, ChatGPT did what I asked, and I am grateful indeed for the (made-up) claim in the essay that “Dan Svantesson’s works have provided valuable insights into the challenges posed by blockchain technology and the need for a nuanced and technology-neutral approach to regulation.” There is one problem, however: the two academic references that ChatGPT included in the essay do not exist.
Despite, or perhaps because of, the above, banning AI in universities seems like a bad idea. Instead, law students should be taught how to use AI appropriately; after all, they will be expected to use AI tools when they enter the workforce.
Perhaps the most important conclusion, however, is this: if an AI system that makes as many mistakes as ChatGPT did in the brewing challenge is good enough to pass various law exams, there is something wrong with those exams. Alternatively, beer brewing is just more complicated than the law.