submitted 1 year ago* (last edited 1 year ago) by Rinna@lemm.ee to c/asklemmy@lemmy.ml
[-] XEAL@lemm.ee 0 points 1 year ago* (last edited 1 year ago)

LLMs just automate, and do faster, certain things that a person could do on their own if they invested far more time and effort. If a human being takes people's work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?

I agree with the content enshitification, but I disagree about the coherency.

Usually, implementations like the ChatGPT web app will generate different outputs for the same prompt. You can also ask it to tweak a previous output: make it shorter, more concise, exclude parts, etc. And if you're making API calls through a script, you can tweak parameters like temperature, top P, presence penalty, or frequency penalty, which affect things like the coherence, randomness, or repetitiveness of the output.
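
To give a rough idea of what those knobs do, here is a toy sketch in Python. It is not how any particular API implements them internally: the function names, the three-word vocabulary and the logit values are all made up for illustration, and the penalty formula is an assumption based on the simple additive form these penalties are usually described with.

```python
import math
import random

def apply_penalties(logits, counts, presence_penalty=0.0, frequency_penalty=0.0):
    """Lower the score of tokens that already appeared in the output (assumed additive form)."""
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        seen = 1 if count > 0 else 0
        adjusted[token] = logit - frequency_penalty * count - presence_penalty * seen
    return adjusted

def softmax_with_temperature(logits, temperature=1.0):
    """Turn scores into probabilities; temperature must be > 0. Lower values sharpen the distribution."""
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

def top_p_filter(probs, top_p=1.0):
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}

def sample_next_token(logits, counts=None, temperature=1.0, top_p=1.0,
                      presence_penalty=0.0, frequency_penalty=0.0, rng=random):
    """Apply penalties, temperature and top-p in turn, then draw one token at random."""
    adjusted = apply_penalties(logits, counts or {}, presence_penalty, frequency_penalty)
    probs = top_p_filter(softmax_with_temperature(adjusted, temperature), top_p)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Hypothetical scores for the next word; a real model produces these over its whole vocabulary.
toy_logits = {"cat": 2.0, "dog": 1.0, "fish": 0.1}

if __name__ == "__main__":
    for temperature in (0.2, 1.0, 2.0):
        print(temperature, softmax_with_temperature(toy_logits, temperature))
    print(sample_next_token(toy_logits, temperature=0.8, top_p=0.9))
```

Running it shows the general pattern: a low temperature concentrates almost all the probability on the top word, a high one spreads it out, a small top P cuts off the unlikely words entirely, and the penalties push down words that were already used.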

There's also fine-tuning and the use of embeddings, which can help tailor a model to one's specific needs and expectations, but I haven't gotten to try it yet.

[-] alcoholicorn@hexbear.net 10 points 1 year ago

> I disagree about the coherency.

Coherency requires relating symbolic meanings. AI just uses statistical analysis.

Consider if you were locked in the national library of Thailand. You don't speak Siamese, and any pictures or bilingual dictionaries have been removed.

Given a thousand years, you could look at the patterns and produce text similar to what someone who writes Siamese would write, but there's still no coherency because you cannot connect the meaning behind any of the words.

That doesn't necessarily mean your outputs are useless, though. Someone who does read Siamese can have you generate outputs until you print out something they can infer a coherent thought from, but you're fundamentally unable to be trained to do that yourself.

> If a human being takes people's work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?

We're getting into ethics territory. IP is a social construct and we live under capitalism, so our model for determining what is and isn't theft should be selected by what supports artists and consumers against capitalists.

[-] boboblaw@hexbear.net 3 points 1 year ago

Ah, the Siamese Room argument.

[-] TheActualDevil@sffa.community 3 points 1 year ago

> If a human being takes people’s work and pieces it together in a way that resembles other works without using any LLM/AI or automation tool, is the final result content theft too?

Yes, obviously. Artists and writers can learn from others and can be inspired by others' works, but they can't use parts of those works. That is content theft. Imitating a style is fine, but you have to create something new. LLMs cannot create, only steal.

[-] XEAL@lemm.ee -1 points 1 year ago* (last edited 1 year ago)

If, for example, I ask an LLM to produce a short story with a completely unique and random prompt that doesn't resemble any known existing story in its training data (or in the entire world, if you like), is the generated output of the LLM also stolen?

[-] TheActualDevil@sffa.community 1 points 1 year ago

I think what you're proposing isn't something they can do. Are you saying "What if I asked it to create a short story whose pieces don't resemble any pieces of known stories?" or are you saying "What if I asked it to create a short story whose whole doesn't resemble any known stories?"

The first one can't happen. The second? Yes, it's stealing.

Where is it getting this story? LLMs don't have creativity. They don't understand story structure. They pull sentences and paragraphs from work in their training data. If the generated output contains work that others have made, that's called plagiarism. If it doesn't, then your hypothetical isn't realistic. LLMs can't create original works. That's the whole point. They pull pieces of the training data and rearrange them. It would be like if I were writing a college paper and, instead of writing anything myself, I just pulled 100 different sources, copied a sentence or two from each, and structured them as my paper. That's 100% plagiarism.

this post was submitted on 13 Sep 2023
266 points (98.5% liked)
