OpenAI announced GPT-4 on Tuesday, the latest iteration of the technology behind the world-famous chatbot that has captured the imagination of the internet since its launch last November.
MetaNews took to social media to uncover what users have been doing with the upgraded tech, and to find out what the bot’s biggest wins and losses are so far.
— OpenAI (@OpenAI) March 14, 2023
Since the launch of GPT-4 users have been keen to share their victories with the chatbot, and the wins are stacking up.
One of the big headlines since the launch of GPT-4 is the bot's uncanny ability to pass standardized exams with little difficulty. The Bar Exam, which prospective lawyers must sit in order to practice law, is among those the bot can now pass with flying colors (90th percentile). Other exams include the LSAT (88th percentile) and the GRE Quantitative section (80th percentile).
Here are a few more of the big wins for GPT-4.
From doodle to website
In one demonstration of its abilities, GPT-4 transformed a hand-drawn sketch into a functional website. The website is certainly very basic, but it’s a solid proof of concept.
I just watched GPT-4 turn a hand-drawn sketch into a functional website.
This is insane. pic.twitter.com/P5nSjrk7Wn
— Rowan Cheung (@rowancheung) March 14, 2023
£5,000 and 2 weeks saved
One canny user relayed how they leveraged GPT-4 to write the code for five microservices for a new product. According to the user, a "very good" developer had quoted £5,000 and a two-week timeline for the job. Using GPT-4, the user completed it in a mere three hours.
Identify security holes in smart contracts
Another application for GPT-4 is identifying security holes in Ethereum smart contracts, which, when exploited, can result in the theft and loss of significant sums of money.
Conor Grogan, a director at Coinbase, demonstrated the ability from his Twitter account on Tuesday.
“I dumped a live Ethereum contract into GPT-4,” said Grogan. “In an instant, it highlighted a number of security vulnerabilities and pointed out surface areas where the contract could be exploited. It then verified a specific way I could exploit the contract.”
I believe that AI will ultimately help make smart contracts safer and easier to build, two of the biggest impediments to mass adoption.
— Conor (@jconorgrogan) March 14, 2023
One of the biggest losses for ChatGPT came directly from its own social media. The bot predicted 20 jobs it could potentially replace in the near future, with roles ranging from Data Entry Clerk to Recruiter and Copywriter.
— ChatGPT (@ChatGPT_0penAI) March 16, 2023
Not so fast, GPT-4.
While the powers of GPT-4 may be impressive, the bot still has a considerable way to go before it can replace the work of a skilled human being. Case in point: CNET. When the tech publication recently replaced human writing staff with its own copywriting AI, the articles it churned out were nothing short of disastrous. Admittedly, that bot wasn't ChatGPT, but the episode illustrates how quickly things can unravel when you leave a chatbot to do human work with little oversight.
As for the notion that GPT-4 could replace a "Data Entry Clerk" or "Recruiter," this strains credulity to absolute breaking point. No, GPT-4, nobody is falling for this.
Here are some other examples of GPT-4 fails reported by social media users.
GPT-4 is bored of your terrible questions
One of the expected advantages of using a bot to write your code is that, unlike a hired software engineer, the bot will never tire, slow down, or get bored. At least, that's the hope.
A user reported that when they asked GPT-4 for "lengthy segments of code," the AI appeared "to get bored" and simply stopped the task halfway through. Observing this behavior, the user glibly remarked, "This thing is getting more human-like by the day…"
MetaNews suggests tasking GPT-4 with more interesting projects or paying it more.
The victory of failure
Tried the below logic puzzle on GPT-4 without additional prompt eng. GPT-3.5 used to spectacularly fail on this puzzle with endless hallucinations while GPT-4 fails only less spectacularly.
Still long way to go for achieving robust reasoning abilities but it’s a progress.
— Shital Shah (@sytelus) March 14, 2023
Yes, "fails only less spectacularly" may be the faintest of faint praise, but it is still progress. Perhaps this one should be called a 'ruined victory.'
Small GPT-4 finding: ChatGPT-4 can sort integers where N=20, often fails when N=21, and almost always fails when N=22. Someone please tell me what this means.
— Adam (@traditionalboi) March 16, 2023
It means you need to go back to doing integers in your head.
The loss that thought it was a win
Such has been the rush to identify significant use cases for GPT-4 that not everyone has stopped to consider whether their win was really a win at all.
This phenomenon was epitomized by one overexcited user who explained how he used Visual ChatGPT to feed the chatbot a picture of a fridge filled with fruits, cheeses, meats, eggs, and other staple ingredients. He then commanded it to deliver five recipes based on the ingredients it identified, all in just 60 seconds.
The user then confidently shared GPT-4’s output with what he described as five “pretty decent food recipes.” Those recipes were fruit salad, cheese omelette, ham and cheese sandwich, fruit smoothie, and cheese and fruit platter.
The savage internet was quick to point out, however, that most of those suggestions are barely any kind of recipe at all, let alone a decent recipe. Worse still, three of the so-called recipes are simply variations of putting fruit on a plate or in some other receptacle.
In fairness to GPT-4, Chef was not among the 20 jobs it predicted it could replace.
Still, if all this talk of food has worked up an appetite, please feel free to try out GPT-4’s “pretty decent” recipe for the “ham and cheese sandwich.”