this post was submitted on 06 Sep 2024
1725 points (90.1% liked)
Technology
Another good question is why AIs do not mindlessly regurgitate source material. The reason is that they have access to so much copyrighted material. If they were trained on only one book, they would constantly regurgitate material from that one book. Because they are trained on many (millions of) books, they are able to get creative. So OpenAI’s argument really boils down to: “we are not breaking copyright law, because we have used enough copyrighted material to avoid directly infringing on copyright”.
Eeeh, I still think diving into the technical weeds is the wrong way to approach it. Their argument is that training isn't a copyright violation, not that sufficient training dilutes the violation.
Even if trained only on one source, it's quite unlikely that it would generate copyright infringing output. It would be vastly less intelligible, likely to the point of overtly garbled words and sentences lacking much in the way of grammar.
Whether what they're doing is technically infringement, or how it works, is entirely separate from a discussion of whether it should be infringement or permitted.