One of the arguments made for Reddit’s API changes is that Reddit is now the go-to place for LLM training data (e.g. for ChatGPT).

https://www.reddit.com/r/reddit/comments/145bram/addressing_the_community_about_changes_to_our_api/jnk9izp/?context=3

I haven’t seen a whole lot of discussion around this and would like to hear people’s opinions. Are you concerned about your posts being used for LLM training? Do you not care? Do you prefer that your comments are available to train open source LLMs?

(I will post my personal opinion in a comment so it can be up/down voted separately)

  • OptimusPrime@lemmynsfw.com
    1 year ago

    Well, these AIs are already being trained on public figures, and there isn’t much those people can do about it unless whoever is impersonating them livestreams with the AI, which could let them identify who is behind it. How would anyone even figure out that there’s an LLM out there that speaks just like them? It’s similar to fine-tuning AIs on artists’ work to create art that mimics their style. It can be frustrating, but there isn’t much anyone can do short of installing surveillance software on every computer. In summary, I don’t mind, because I won’t even find out.