Recently released research reveals a major challenge in the development of artificial intelligence: ChatGPT has become worse at performing some basic mathematical operations.
Researchers at Stanford University and the University of California, Berkeley said the deterioration is an example of a phenomenon known to AI developers as drift.
What is happening?
Attempts to improve one part of a highly complex AI model can make other parts of the model perform worse.
So far, researchers have tested two versions of ChatGPT: version 3.5, which is available for free online, and the premium subscription version 4.0. The results are not promising.
They gave the chatbot a basic task: recognize whether a certain number is a prime number.
Is 17,077 a prime number? Is 17,947 prime? Unless you have an unusual head for numbers, you can't work it out in your head, but it is very easy for a computer to evaluate.
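To see how simple the task is for a machine, here is a minimal primality check in Python using trial division. This is an illustration only; the function name `is_prime` is our own, and the researchers' actual test harness is not described in the article.

```python
def is_prime(n: int) -> bool:
    """Check primality by trial division up to sqrt(n).

    Plenty fast for five-digit numbers like those in the study.
    """
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    # Only odd divisors need checking once 2 is ruled out.
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

# The two numbers mentioned above:
print(is_prime(17077))  # 17,077 is prime
print(is_prime(17947))  # 17,947 = 131 x 137, so it is not
```

A few lines of code settle in microseconds what a person cannot do mentally, which is what makes the chatbot's declining accuracy on this task so striking.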
To track performance, the researchers fed ChatGPT 1,000 different numbers. In March, the premium GPT-4 correctly identified whether the number was prime or not for 84% of them, a fairly modest performance for a computer.
By June, its success rate had dropped to 51%. Across eight different tasks, GPT-4 performed worse on six.
GPT-3.5 improved somewhat, but it remains worse than its more capable sibling in most cases.