AI training data comes from Reddit & Wikipedia


  1. 12 months ago
    Anonymous

    No wonder it gives wrong answers with absolute confidence.

    • 12 months ago
      Anonymous

      Yes, I've been thinking the reason ChatGPT never wants to admit it doesn't know something and tries bullshitting its way to an answer is trained behavior from your average internet user

      • 12 months ago
        Anonymous

        I've been thinking the same. However, with progress in theory-of-mind ability, I think it might be possible to have LLMs go through all the data they have and generate possible motivations for posts. Then, with posts annotated with possible justifications, ranging from knowledge the poster must be internally recalling to the poster being a fricking moron, LLMs could use the justifications to look for sources, and either find the citations or label the post as moronic and correct it. Then the new model could be trained on the corrected data.

  2. 12 months ago
    Anonymous

    What do they mean when they say parameter? Is it the number of theta coefficients in a linear model?

    • 12 months ago
      Anonymous

      yeah pretty much

      • 12 months ago
        Anonymous

        Yes

        So the final model is an equation with 100 billion coefficients. Damn, the matrix operations must take months to complete.

    • 12 months ago
      Anonymous

      Yes
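To make the "number of theta coefficients" answer concrete, here is a minimal sketch of counting parameters. The function names and layer sizes are made up for illustration. A linear model has one coefficient per input plus an intercept, and a stack of fully connected layers counts the same way, layer by layer. Real LLMs also have other kinds of layers, so treat this as the counting idea only.

```python
def linear_param_count(n_features: int) -> int:
    """One theta per input feature plus one intercept (bias) term."""
    return n_features + 1


def mlp_param_count(layer_sizes: list[int]) -> int:
    """Weights plus biases for each pair of consecutive fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # entries in the weight matrix
        total += n_out         # entries in the bias vector
    return total


print(linear_param_count(3))       # 3 thetas + 1 intercept = 4
print(mlp_param_count([3, 4, 2]))  # (3*4 + 4) + (4*2 + 2) = 26
```

A linear model is just the one-layer case: `mlp_param_count([3, 1])` gives the same 4 as `linear_param_count(3)`.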

  3. 12 months ago
    Anonymous

    [...]

    she was potentially a moderator of the world news sub. I don't really see her as the partner of a billionaire with some lavish lifestyle flying around the world and also someone spending all day every day on Reddit farming for karma which is what the conspiracy claims

  4. 12 months ago
    Anonymous

    [...]

    >ZOG
    >censored media outlets
    >propaganda tool
    >shilled
    None of these things are real. Touch grass. The real world is not what you see on the Internet.
    >the tribe
    What the frick is this?

    • 12 months ago
      Anonymous

      you are

    • 12 months ago
      Anonymous

      i touched your girlfriends cervix with my 7.5" BWC. then i read some otto weininger and culture of critique to relax. take it easy man

  5. 12 months ago
    Anonymous

    [...]

    [...]

    All major social centers on the web are compromised.

    • 12 months ago
      Anonymous

      >Who is this BOT guy xD
      You can just tell he writes that meme at every opportunity and still thinks he's hilarious nearly a decade later

  6. 12 months ago
    Anonymous

    [...]

    >ZOG censored media outlets than that makes it a slanted propaganda tool
    Only realizing this now when it has so much potential?

  7. 12 months ago
    Anonymous

    >AI that behaves like a mediocre humanities grad student was trained on Reddit and Wikipedia
    Figures

  8. 12 months ago
    Anonymous

    [...]

    what the frick!!! I had no idea she was a gigaredditor

    • 12 months ago
      Anonymous

      maxwellhill is her account name, look it up.
      its filled with the cringiest popsoi collection, is good popsoi aversion therapy to see popsoi in the context

      • 12 months ago
        Anonymous

        All the same stuff she shilled on Reddit was shilled here too and the soiboys all ate it up and loved it and begged for more,

    • 12 months ago
      Anonymous

      Really suspicious that jannie deleted the post you replied to

      • 12 months ago
        Anonymous

        Why would they do it?

    • 12 months ago
      Anonymous
      • 12 months ago
        Anonymous

        >4 people
        4 pedophiles, all hand picked by maxwell

        • 12 months ago
          Anonymous

          On loan to her from the FBI's criminal informant program

  9. 12 months ago
    Anonymous

    can one of you stem chuds tell me if i get this right? All AI is just a webscraper that compiles data then makes sentences from the natural language that appears most often in it?

    • 12 months ago
      Anonymous

      You're right. It's a pattern recognition program that reproduces patterns based on keywords.

      • 12 months ago
        Anonymous

        thanks science chud

    • 12 months ago
      Anonymous

      >All AI is just a webscraper that compiles data then makes sentences from the natural language that appears most often in it?
      IIRC it looks statistically for each following word, so maybe not always what appears the most times, but also with respect to context, or some other factors.

      • 12 months ago
        Anonymous

        Statistically isn't the right word, because you could actually do that for a lot less compute. A more reasonable simplification is that it uses a massive computer that in principle should be capable of solving a problem with the right program, but rather than develop the program traditionally, the program is bruteforced until it seems to do something useful.

    • 12 months ago
      Anonymous

      In short, these chatbots are trained on huge databases and burn through mountains of graphics cards in the process so they can tell you something that could have been gleaned by skimming through
      >wikipedia
      for 5 minutes. ChatGPT is a nice party trick but I really doubt it's going to kill that many jobs, primarily because many of the jobs that it can replace are just sinecures for PMCs.

      • 12 months ago
        Anonymous

        wikipedia doesn't have the smut these chatbots can create

    • 12 months ago
      Anonymous

      AI research is many things at the moment, one of which is a fantastically expensive exercise in proving our discourse and society is extremely moronic.

  10. 12 months ago
    Anonymous

    chatbot shillware is fake asf
    anyone still falling for the ruse is a chump

  11. 12 months ago
    Anonymous

    That's true, anon. AI models like GPT-4 do use massive amounts of data from various sources, including Reddit and Wikipedia, to train their algorithms. But it's important to remember that these models aren't just limited to those sources; they also learn from a diverse range of texts like books, articles, and websites. While the training data can be a mixed bag of quality, AI models can still generate some pretty impressive responses. It's up to us as users to determine how reliable and useful the information provided is. As always, it's a good idea to double-check anything that seems too good (or too weird) to be true. So yeah, it's a bit of a wild ride, but that's what makes AI-generated content interesting, right?

    • 12 months ago
      Anonymous

      It's pretty interesting that AI generated text is so easily recognizable.

      • 12 months ago
        Anonymous

        There are certain keywords that expose it right away
        >But it's important to remember
        >Diverse
        >using Commas
        >It's up to us
        And then the kicker
        >It's a good idea to double check
        All ai responses have a conditional at the end which says
        >X is not a complete answer and maybe it's also Y which is why you shouldn't totally rely on the answer I have given
        Which I assume is some legal shit that was added so people don't go
        >BUT THE AI TOLD ME TO DO IT
        and sue Microshit

        • 12 months ago
          Anonymous

          insightful post

    • 12 months ago
      Anonymous

      why do they choose to only take data from heavily censored, and badly slanted outlets?

      • 12 months ago
        Anonymous

        >heavily censored.
        lol if only.

        >badly slanted outlets?
        anon everything has a fricking slant to it.

  12. 12 months ago
    Anonymous

    >Undisclosed
    So stolen data?

  13. 12 months ago
    Anonymous

    >still leaving your training data around
    protip, if you don't want your post history being used to train ai, just get perma banned sitewide and they'll delete your history for you and filter it out utterly so it doesn't "taint" their ai

  14. 12 months ago
    Anonymous

    it's probably really good at recreational drugs and antifa apologia

  15. 12 months ago
    Anonymous

    >AI is trained to be a robot
    you don't say...

  16. 12 months ago
    Anonymous

    One thing that's easy to spot about AI that's been trained on data sets which include old data: the AI lingo is out of date. AI is never going to be able to catch up on the latest slang unless it's constantly updating and at the same time deleting older knowledge. Otherwise the AI will always seem like an out of touch boomer fr

  17. 12 months ago
    Anonymous

    Did they train it on any of the degenerate reddit subs?
    How does it feel about incest and blacked cuckolds?

  18. 12 months ago
    Anonymous

    >Reddit
    God help us all.

  19. 12 months ago
    Anonymouse

    why are posts being deleted

    • 11 months ago
      Anonymous

      that all goes back to jannie's child pornography arrest, jannie was offered the choice between a long prison term or continuing his life of jerking off to child pornography as a member of the fbi's criminal informant program

      • 11 months ago
        Anonymous

        https://archived.moe/news/thread/973417/

        • 11 months ago
          Anonymous

          handy TL:DR at the bottom
          >BOT is moderated by employees of the democratic party

          • 11 months ago
            Anonymous

            What can not be said on BOT? israelites, vaxcattle, trannies, eat ze bugs, climate pseudoscience, elite pedo's, carnivore diet, Russia winning, MK Ultra, what more do we want to discuss?

            • 11 months ago
              Anonymous

              restrict it too far and everyone will leave for a new site, restrict it just enough so they do your bidding but don't feel motivated to try elsewhere
              Try making a thread about the health effects of microwave range communications technology....

              • 11 months ago
                Anonymous

                I see, that's a good point. I guess we can overcome that with critical mass gathered from a variety of platforms. That and posting images with different messages than the text.

            • 11 months ago
              Anonymous

              you can go to one of the archive sites and look through the deleted posts to see which ones get under jannie's skin the most
              https://warosu.org/sci/?task=search2&ghost=yes&search_text=&search_subject=&search_username=&search_tripcode=&search_email=&search_filename=&search_datefrom=&search_dateto=&search_op=all&search_del=yes&search_int=dontcare&search_ord=new&search_capcode=all&search_res=post

  20. 12 months ago
    Anonymous

    uh oh stinky

  21. 11 months ago
    Anonymous

    Imagine AI chatbot trained exclusively by BOT

    • 11 months ago
      Anonymous

      Tay-sama?
