OpenAI’s GPT-4 demonstrates “human-level performance” on professional and academic benchmarks

Published 15/03/2023, 04:01 pm

OpenAI, the company behind ChatGPT, says its GPT-4 model, a large multimodal model, exhibits “human-level performance” on several professional and academic benchmarks.

GPT-4 outperformed its predecessor, GPT-3.5, by a significant margin: it scored in around the top 10% of test takers on a simulated bar exam, while GPT-3.5 scored in around the bottom 10%.

GPT-4 is currently available to ChatGPT Plus subscribers, and OpenAI plans to offer its capabilities through ChatGPT and its commercial API via a wait-listed release.

Announcing GPT-4, a large multimodal model, with our best-ever results on capabilities and alignment: https://t.co/TwLFssyALF pic.twitter.com/lYWwPjZbSg
— OpenAI (@OpenAI) March 14, 2023

Aced simulated exams

Addressing the capabilities of the new model, OpenAI said: “In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle.

“The difference comes out when the complexity of the task reaches a sufficient threshold - GPT-4 is more reliable, creative and able to handle much more nuanced instructions than GPT-3.5.

“To understand the difference between the two models, we tested on a variety of benchmarks, including simulating exams that were originally designed for humans.

“We proceeded by using the most recent publicly-available tests (in the case of the Olympiads and AP free response questions) or by purchasing 2022–2023 editions of practice exams.

“We did no specific training for these exams.

“A minority of the problems in the exams were seen by the model during training, but we believe the results to be representative”.

Exam results.

What’s more

OpenAI also evaluated GPT-4 on traditional benchmarks designed for machine learning models.

Encouragingly, GPT-4 also significantly outperformed existing large language models, as well as most state-of-the-art (SOTA) models, which may include benchmark-specific crafting or additional training protocols.

Apart from text, GPT-4 can also accept visual inputs; however, its output is always text.

Specifically, it generates text outputs (natural language, code, etc.) given inputs consisting of interspersed text and images.
