Tech Bytes: Former OpenAI employee levels copyright accusations, claims OpenAI is “destroying” the internet

Published 25/10/2024, 02:30 pm

As Artificial Intelligence (AI) transforms the digital landscape, reshaping it with generative AI models and increasingly complex algorithms, it has also brought a little-known market to the forefront of the public consciousness – data.

AI models, especially large language models such as those behind OpenAI’s ChatGPT, require massive amounts of data to train.

That data doesn’t come cheap: the data analytics market was valued at US$41.05 billion in 2022 and is expected to grow at an eye-watering compound annual growth rate of 27.3% to reach US$279.21 billion by 2030.

Now, a former OpenAI researcher, Suchir Balaji, has raised the alarm over the company’s data collection practices, claiming OpenAI is “destroying” the internet and infringing copyright.

Gen AI researcher speaks out

Balaji was employed at OpenAI from 2020 until August this year. His LinkedIn page states he worked on post-training for ChatGPT, reasoning algorithms, pre-training for ChatGPT and reinforcement learning for the web version of ChatGPT.

He was part of the team organising and leveraging the huge reams of data the company used to build its GenAI bot.

After ChatGPT’s release to market in 2022, he began to consider the implications of what OpenAI was doing.

In August this year, he chose to leave the company because of ethical concerns with the way the AI pioneer was collecting and using data.

“If you believe what I believe, you have to just leave the company,” he said during a recent series of interviews with The New York Times.

Is GenAI destroying the internet?

Recently, Balaji published a post on his own website explaining the damage that GenAI models from OpenAI and its peers are already doing to the internet.

Programming communities in particular are suffering, with many open-source platforms losing participants at staggering rates as individuals turn to AI, rather than their peers, to answer questions.

Balaji is a published AI researcher – he has three papers on various elements of AI models, with more than 8,000 citations.

In the post, titled “When does generative AI qualify for fair use?”, he argues that GenAI is not truly transformative in the sense fair use law looks for, since it simply alters the form and structure of content.

Balaji also argues GenAI content threatens to replace the very market it feeds from – should GenAI replace elements of content creation altogether, it will very quickly lose the ability to train on new data.

A lack of good data causes a lot of problems in LLMs, including what researchers call ‘hallucinations’: essentially, the model begins to make things up and churn out nonsense.

Balaji argues none of the fair use factors clearly favour ChatGPT, or any other GenAI model for that matter, especially in light of the economic harm such models could cause.

“This is not a sustainable model for the internet ecosystem as a whole,” he told The New York Times.

Issue before the courts

OpenAI has categorically disagreed with Balaji’s assessment.

“We build our AI models using publicly available data, in a manner protected by fair use and related principles, and supported by longstanding and widely accepted legal precedents,” the company said in a statement.

“We view this principle as fair to creators, necessary for innovators, and critical for US competitiveness.”

The truth of that will be revealed in court – The New York Times has sued OpenAI and Microsoft (NASDAQ:MSFT) for copyright infringement, and it's not the only one to go into bat against the GenAI company.

“Defendants seek to free-ride on The Times’s massive investment in its journalism,” the complaint says, accusing OpenAI and Microsoft of “using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.”

As of April this year, eight newspapers, the Center for Investigative Reporting and a slew of YouTube creators, actors and authors are all actively suing OpenAI for copyright infringement.

Intellectual property lawyer Bradley J. Hulbert told The New York Times that intellectual property laws are woefully out of date, and that the issue is yet to be decided definitively in court.

“Given that AI is evolving so quickly,” he said, “it is time for Congress to step in.”
