NY Times copyright suit wants OpenAI to delete all GPT instances

geese_feces [comrade/them, love/loves]@hexbear.net · 2 years ago

makotech222 [he/him]@hexbear.net · 2 years ago

thank god i hope this kills all this ai shit

bdonvr · 2 years ago

At best it will be a slight setback. I think the cat's just outta the bag now.

ziggurter [he/him, comrade/them]@hexbear.net · edit-2 2 years ago

There’s really no cat. We’ve been using algorithms to do stuff for a very long time (thousands of years), and there’s literally no intelligence behind what people are calling “artificial intelligence”. It’s just another algorithm. This is just another increment in automation like all the rest (plow, printing press, loom, assembly line, computer, etc.), except the marketing is making it sound even more fundamental, when in reality it’s really less impressive (i.e. a spell-checking feature added to a word-processing program!).

Will capitalists still use the term “artificial intelligence” to try to justify whatever BS they’re pulling—against other capitalists in the market, but especially against workers? Of course. Just like they’re likely to keep using the term “sharing” to bypass labor protections and other regulation having to do with taxis, hotels, etc.

Anyway, we really don’t have a horse in this race. Whether the capitalists wanting to preserve “intellectual property” win and Napsters and Pirate Bays keep getting taken down, or the SPAM engine capitalists win and everything we try to do gets flooded with so much barely camouflaged marketing junk that we can’t sort through it all. Heads they win/tails we lose. Or whatever boring dumbassery winds up getting settled on in the middle to maximally both preserve and enhance our exploitation, which is the most likely result.

flan [they/them]@hexbear.net · 2 years ago

The cat, in this case, isn’t necessarily an actual artificial intelligence but is instead a cursed abuse of linear algebra smashed into a shape beyond human comprehension using an unimaginable amount of data and computational power.

duderium [he/him]@hexbear.net · 2 years ago

Critical support to the world’s worst newspaper 🫡

RyanGosling [none/use name]@hexbear.net · 2 years ago

Removed by mod

CrushKillDestroySwag@hexbear.net · 2 years ago

A real “let them fight” moment, I’ll laugh at whoever loses but I hope that this and other copyright suits strangle the plagiarism machine to death.

stigsbandit34z@hexbear.net · 2 years ago

The only good outcome is this ending in the abolishment of copyright

That’ll never happen, so this is just another chapter in our boring dystopia. Will probably end with the most boring agreement or some sort of slow burn that will end up making life worse for everyone. AI bullshit is here to stay imo

keepcarrot [she/her]@hexbear.net · 2 years ago

Lathe: Copyright remains, but big companies are allowed to use AI to squeeze labour regardless.

FuckyWucky [none/use name]@hexbear.net · edit-2 2 years ago

NYT is totally going to win this

Honestly I hope NYT goes bankrupt, they have always given me annoyingly smug elitist vibes. Where NY Post does transphobia overtly, NYT does transphobia ‘respectfully’

In April 2020, the New York Times ran a story with a strong headline about the situation of press freedom in India: “Under Modi, India’s Press Is Not So Free Anymore.” In that story, the reporters showed how Modi met with owners of the major media houses in March 2020 to tell them to publish “inspiring and positive stories.”

When the case against NewsClick appeared to go cold, the New York Times – in August 2023 – published an enormously speculative and disparaging article against the foundations that provided some of NewsClick’s funds. The day after the story appeared, high officials of the Indian government went on a rampage against NewsClick, using the story as “evidence” of a crime. The New York Times had been warned previously that this kind of story would be used by the Indian government to suppress press freedom.

article

Pluto [he/him, he/him]@hexbear.net · 2 years ago

Honestly?

Good news.

Though it didn’t come from the right people.

nothx [he/him]@hexbear.net · 2 years ago

Is this just posturing before the Times has a large round of layoffs and gets an OpenAI subscription?

Cynicism aside, I would love if this actually hurts AI.

hummingspark [none/use name]@hexbear.net · 2 years ago

consider this capitalist rivalry. OpenAI is basically undermining the writing sector. To cut costs, NYT would eventually have to adopt this, but it cheapens the content produced. it is basically mass-produced stuff that will eventually just be really dull. Sports Illustrated just recently fired their CEO for having already used AI-generated stuff.

ziggurter [he/him, comrade/them]@hexbear.net · 2 years ago

Buzzfeed tried to replace their news department with ChatGPT and failed. They shut down the division instead. Even though Buzzfeed was already creating nothing original and just publishing a mashup of shit from other sources. Still too complicated for the dumbass plagiarism algorithms, which are basically incapable of producing anything that humans find interesting for more than a few seconds and a couple of “oohs and ahs”. LOL.

MaxOS [he/him]@hexbear.net · 2 years ago

Didn’t expect NYT to be the vanguard of the butlerian jihad

Assian_Candor [comrade/them]@hexbear.net · 2 years ago

Critical support

kot [they/them]@hexbear.net · 2 years ago

FunkyStuff [he/him]@hexbear.net · 2 years ago

How do we realistically feel this is gonna play out long term? I think there’s no shot the old guard of the ruling class that wants to prevent AI slop from ruining everything wins over the very real incentive to embrace the AI gold rush. Feels like that ship sailed once AI generated articles kept getting lots of clicks even when they’re devoid of content. This is very much vibes based though, I hope someone here can illuminate some other factors in their cost benefit analysis.

flan [they/them]@hexbear.net · 2 years ago

The cat’s already out of the bag. I would be extremely surprised if the NYT gets what they want instead of a “win” where OpenAI pinky promises to stop using NYT content and pays $30 million in damages.

FunkyStuff [he/him]@hexbear.net · 2 years ago

That’s what I see as the most likely outcome, yeah, but isn’t there gonna be a point where other corporations step in because cheap AI slop is a genuine threat to their bottom line?

drhead [he/him]@hexbear.net · 2 years ago

In this case, NYT most likely is actually just looking for a cut of the money. Their claims in this are too absurd to actually hold up under scrutiny, nobody is using ChatGPT to bypass NYT’s paywall on whatever years old content they actually have in their training data, people are using browser extensions for that. I would also want to know who is the target of the claims that ChatGPT hallucinating about NYT articles is damaging NYT’s reputation.

One of the more significant things that could happen is that OpenAI could be forced to disclose details of their training data as part of discovery, which they really will not want to do. It would then be pretty easy to gauge exactly how overfit ChatGPT is. (GPT-4 reportedly has 1.1 trillion parameters; depending on what precision they run it at, that would be around a terabyte or more in size. I think 3.5 is closer to 350B. If the dataset has less entropy than the model parameters, it is effectively guaranteed to start spitting out exact copies of training data.) It would also be very useful info for OpenAI’s competitors, so OpenAI will try to get the suit dismissed or settle before then. Deleting their dataset like NYT is demanding is absolutely not going to happen, since at most NYT has standing to make OpenAI delete NYT articles from the training dataset. Finetuning the model to not comply with NYT-related requests would also be enough to get the model to no longer infringe on NYT’s copyrights.
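For anyone who wants to check the size estimate in the paragraph above, here is a rough sketch of the arithmetic. The parameter counts are the unverified figures from this comment, not anything OpenAI has published, and the precision options are just common choices picked for illustration.

```python
# Back-of-the-envelope weight size: parameters x bytes per parameter.
# Parameter counts below are the unverified figures quoted in the comment,
# not official numbers.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def model_size_tb(num_params: float, precision: str) -> float:
    """Approximate weight storage in terabytes (1 TB = 1e12 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e12


if __name__ == "__main__":
    guesses = [("GPT-4 (rumored)", 1.1e12), ("GPT-3.5 (guess)", 350e9)]
    for name, params in guesses:
        for precision in BYTES_PER_PARAM:
            size = model_size_tb(params, precision)
            print(f"{name} @ {precision}: {size:.2f} TB")
```

So "around a terabyte or more" lines up with 8-bit weights at the rumored 1.1T count, and the number doubles at 16-bit. This only covers the weights themselves, not the training data or its entropy.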

They might also be angling for government regulation with a lawsuit making bold claims that they expect to catch headlines and shape public opinion but don’t completely expect to stick in court, since that’s a recurring pattern in a lot of lawsuits against AI firms, like the Stable Diffusion lawsuit which contained absolute bangers like the claim that it stores compressed images just like JPEG compression and that the text-prompt interface “creates a layer of magical misdirection that makes it harder for users to coax out obvious copies of training images” (this is actually in the announcement for that lawsuit, I’m not making this shit up. It’s really not surprising that most of that suit got thrown out).

There’s no real endgame for them where they get anything further than a cut. AI companies can still train on copyright-free or licensed data and over time will get similar results, so there’s not really anything that can be done to stop that in general. Copyright-reliant industries can certainly secure themselves a better position within that, though, where they might be able to gain either a steady income from licensing fees or exclusive use of their content for models under their control.

WithoutFurtherBelay@hexbear.net · 2 years ago

Their claims aren’t that absurd; their articles likely were all used for training data. You could make an argument that that is copyright violation anyways.

I don’t AGREE with copyright but I don’t think the concept is absurd, especially when you’ve already established that legally protecting information behind paywalls is allowed (also stupid).

drhead [he/him]@hexbear.net · 2 years ago

Using it for training data is one thing, but that’s not all that’s being claimed. Merely using it isn’t enough for it to be infringement because fair use can be a defense, and quite likely a viable one if it wasn’t spitting articles out verbatim. People already do use copyrighted data from news sites verbatim for making new products that do something different, like search engines, or for other things that are of significant public interest, like archival. People also do republish articles from news sites, with or without attribution. So for the basic case of copyright infringement by training, NYT has to show that what ChatGPT is doing is more akin to that than it is akin to what a search engine does in order to get something that sticks in court.

They are, among other things, effectively asking for compensation as if people were using ChatGPT as an alternative to buying a NYT subscription, which is just the type of clown shit that only a lawyer could come up with. At the same time, they are also asking for compensation for defamation when it fails to reproduce an article and makes shit up instead. If this case keeps going, those claims are going to end up getting dismissed like a lot of the claims in the Andersen v. Midjourney/Stability AI/DeviantArt case did. The lawyers involved know this; they’re probably expecting the infringement for training to stick and consider the others to be bonuses that would be nice to have. Probably also a door-in-the-face technique.

A settlement is probably more likely still, because at the end of the day OpenAI would much rather avoid going through what this case will require of them during discovery, and the most significant claim NYT has against them is literally demonstrating a failure mode of the model, which OpenAI will want to fix whether or not there’s copyright issues involved (maybe by not embodying the “STACK MORE LAYERS” meme so much next time). After that’s fixed, the rest of what NYT has against them will be much more difficult to argue in court.

daisy@hexbear.net · 2 years ago

How do we realistically feel this is gonna play out long term?

Which side has the most money to hire lawyers and bribe politicians?

FunkyStuff [he/him]@hexbear.net · 2 years ago

That’s a good question, right? You’d think that the established media tycoons like Murdoch would have the kind of pull to have killed this baby in the womb, but they didn’t. Is that because they’re confident they can adapt to it?

daisy@hexbear.net · 2 years ago

The more I think about this, the more I wonder if it’s all an elaborate play by the media companies to get the tech companies to buy them out. The tech companies have ridiculously huge cash reserves, and media companies’ stocks aren’t nearly as valuable as people think. For example, the New York Times has a market cap of $8 billion USD, and made a profit of $90 million USD in their July/August/September 2023 quarter. Apple made $23 billion USD in profit in that same quarter, has a market cap of $3 trillion USD, and has cash reserves that would make Scrooge McDuck envious.

Imagine if all these legal fights over AI scraping are the media industry’s way to say to the tech companies “Hey, the data we have the rights to is incredibly valuable to your AI work. We could tie you up in court for years, setting you well behind your competitors. Wanna make a bid?”

FunkyStuff [he/him]@hexbear.net · 2 years ago

That’s totally valid, but what about the Disneys, the Universals, and the Sonys? Not all media companies are made equal, and there’s a lot of inertia behind those giants despite the falling rate of profit.

drhead [he/him]@hexbear.net · 2 years ago

Have you SEEN what Disney has been making lately? They’d gladly pivot to AI slop the second it matches their declining quality standards.

daisy@hexbear.net · 2 years ago

Of course it’s just an idea. It’s probably also a plan that would appeal more to print media companies that have doubts about long-term profitability and stand to lose a lot from text-generation AIs.

WithoutFurtherBelay@hexbear.net · 2 years ago

You can say what you want about AI art, I think it’s inherently designed to reproduce societal attitudes in its current form, but it’s arguably still an art form despite that.

But AI writing is basically irredeemable. Only case I can think of for it to have any purpose is in helping disabled people communicate or express themselves. Other than that, it’s literally just a magic lying machine (I don’t care if they’ve decreased the amount of lies, that just makes it more unexpected when it does lie)

NYT sucks though, fuck em

viva_la_juche [they/them, any]@hexbear.net · 2 years ago

We could get so lucky

Red_Left_Hand [none/use name]@hexbear.net · 2 years ago

deleted by creator
