Chat GPT 3.5 conversations and evaluations

Does the current generation of AI Chatbots live up to the hype?

RNfinity | 08-02-2023

Introduction

Chat GPT 3.5 has taken the world by storm. In two months it has acquired 100 million users, and its parent company OpenAI can boast a market value of 29 billion dollars. It has been everywhere in the media, from being interviewed on Channel 4 News to fulfilling an assignment set by author and psychologist Jordan Peterson, impressing him a hell of a lot in the process; he declared it to be as revolutionary as the Gutenberg press. Elon Musk warned that strong AI is here. With further iterations and offerings from other companies around the corner, forums are chattering with talk of an impending AI revolution, potentially marginalising the corpus of human achievement as well as taking away our jobs.

Of course, like many millions of other people, I couldn’t help poking around to see what might be under the hood. These are some of my initial thoughts, with screenshots.

 

Conversational Skill

Chat GPT 3.5 can build on a conversation and modify its responses according to your suggestions. It has a complete mastery of the dictionary and thesaurus, knows every synonym and antonym, and has an appreciation of rhyme. It has achieved a monumental score on verbal IQ tests and has excellent grammar. It can stylise language in the manner of a Shakespearean sonnet or the Bible, or slang it out like Eminem. You can see our blog for some musings on IQ testing.

Chat GPT poetry

I feel that this is a bit of a party piece that the developers went to town on, but an entertaining one at that, much like the effort Apple made to offer different fonts in the development of word processors. The stylisation of the language can be quite beguiling, and can perhaps make us overlook the content, but is this valuable or just a bit of light amusement, like Amazon Alexa’s fart noises?

I repeated Jordan Peterson’s task of writing a 13th rule for Beyond Order in the style of the Bible and the Tao Te Ching, as well as a few additional tasks below: selling 10 Downing Street on the Rightmove website and writing a resignation letter for a CEO under different circumstances.


10 Downing Street for sale

Resignation letter for a bitter CEO


I was wondering how much on-the-hoof decision making was taking place versus the use of pre-rendered templates. I would venture that it has a vast array of pre-rendered model answers or templates for common communication tasks from all walks of life (job application and resignation letters, advertising blurbs, press releases), each with its own decision tree, as well as a huge number of questions and answers which can be delivered in a nuanced way.

It has been suggested that it is a good essay writer for academic assignments, as well as for blog writing for SEO, though its text should not be used directly on a website as it may be penalised by Google, which can detect most AI-generated content. It is certainly excellent for generating writing plans.

 

Problem Solving

To test its problem-solving ability, I set it a few deliberately awkward tasks, likely novel to Chat GPT, which I didn’t expect it to complete.

I asked it to complete a sequence of numbers based on increasing times on the 12-hour clock, to correct an equation by adding mathematical operators, to relate a sequence of words which were anagrams of fruits, and to find the pattern in a series of words (the middle letter advanced through the alphabet in sequence). Perhaps these were unfair tasks, but it got every question wrong. I have to say I felt quite relieved. Its abilities here are in keeping with its modest SAT score, a test whose questions combine verbal comprehension with logic and calculation.

I attempted to coax it towards the correct answers, but it wouldn’t really budge, admit that it was incorrect or adjust its approach; it just dug in, presenting one clearly wrong answer after another based on the same hypothesis. It would rather speak an untruth than admit defeat, so I would say that it did not demonstrate any obvious ability to learn. But then again, why should it? Can I or anyone else be trusted to utter the truth, or to act in its interest as an educator? On this display it seems to have a limited ability to learn in an unsupervised way, but it may be that we are training an unseen AI rather than the one that is open to the public. On balance, I would think that the public evaluation is more of a market research exercise than a development drive, to see where the demand lies, and that most of its training is highly supervised.
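For anyone who wants to try similar puzzles, two of the task types above (the fruit anagrams and the missing operators) can be checked mechanically. What follows is only a minimal sketch in Python: the fruit list, the scrambled word and the example numbers are illustrative assumptions, not the prompts actually put to Chat GPT, and the function names are made up for this example.

```python
from itertools import product

# Hypothetical fruit list; the original prompts are not reproduced here.
FRUITS = ["peach", "mango", "grape", "apple", "lemon"]


def unscramble_fruit(word, fruits=FRUITS):
    """Return the fruit that uses exactly the letters of `word`, or None."""
    key = sorted(word.lower())
    for fruit in fruits:
        if sorted(fruit) == key:
            return fruit
    return None


def fill_operators(numbers, target, operators="+-*"):
    """Place one operator between each pair of consecutive numbers and
    return every expression that evaluates to `target`."""
    solutions = []
    for ops in product(operators, repeat=len(numbers) - 1):
        expr = str(numbers[0])
        for op, n in zip(ops, numbers[1:]):
            expr += f" {op} {n}"
        # eval is tolerable here because expr contains only integers
        # and the operators listed above.
        if eval(expr) == target:
            solutions.append(expr)
    return solutions


print(unscramble_fruit("cheap"))       # peach
print(fill_operators([2, 3, 4], 14))   # ['2 + 3 * 4']
```

Both checks are brute force: a handful of comparisons or a few dozen candidate expressions. That is part of what made the wrong answers surprising.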

So how useful is it? It doesn’t beat the internet, for sure. The information it provides can be obtained rapidly from a search engine, where you have a choice in the value you ascribe to various sources, whereas Chat GPT only presents a single curated version of the ‘truth’.

I will continue to use it whilst it is free. If the free trial ends, I would only subscribe if there were some specific, valuable tasks that it could perform better or more cheaply than I or a collaborator could, but I am not sure what those might be at the moment.

 

Verdict

Humanity has nothing to fear from this generation of AI (if AI is something to be feared). I would describe it as a natural evolution in the presentation of digital information to human minds, and it has little or no intelligence. I believe it does not perform well in unfamiliar situations, and there is a degree of hyperbole in the media regarding its current abilities. I believe that websites will need to update their anti-robot measures as AI becomes stronger but, based on Chat GPT 3.5, the current generation of AI will have a modest effect on the job market and does not possess any higher-order thinking. It will not be making films, music, computer applications or any scientific discoveries. Still, this is progress: Chat GPT 3.5 is impressive, mainly due to the breadth of its scope, and will prove useful to many people. A broad AI will not outperform an AI designed for a specific task on that task, e.g. chess or route navigation. As AI becomes more powerful, we will need to ask ourselves what the benefit of generalised AIs is over specialised AIs.

 

Chat GPT 3.5 doesn’t catfish me into thinking that it’s a human.