It's over, OpenAI has just been killed by gzip.

  1. 10 months ago
    Anonymous

    is gzip sentient?

    • 10 months ago
      Anonymous

      yes

      • 10 months ago
        Anonymous

        https://i.imgur.com/vmEM4Tj.jpg

        It's over, OpenAI has just been killed by gzip.

        my sides have left orbit

      • 10 months ago
        Anonymous

        https://i.imgur.com/bmbWkQK.png

        What does this all mean for us tech nerds who aren't programmers?

        • 10 months ago
          Anonymous

          >tech nerds who aren't programmers?
          That sounds extremely gay. You should go outside.

        • 10 months ago
          Anonymous

          It means sexbots are coming. Buy the lube now.

      • 10 months ago
        Anonymous

        https://i.imgur.com/bmbWkQK.png

        He won.

    • 10 months ago
      Anonymous

      I've always honestly suspected a greater risk of rampancy from tar.

      • 10 months ago
        Anonymous

        Just you wait. In 500 years, gzipping and feathering will be known as a barbaric punishment of the past.

  2. 10 months ago
    Anonymous

    Note that this is classification only, no generation.

    • 10 months ago
      Anonymous

      Why can't gzip be reversed to generate, provided we put in a sampling algorithm?
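One hedged reading of this question: instead of literally reversing gzip, score each candidate continuation by how many extra compressed bytes it costs after the example text and the prompt, then pick the cheapest one. The sketch below is a toy that does a greedy pick rather than real sampling. The names `extra_bytes` and `pick_continuation` and the example data are made up for illustration and are not from the thread or the paper.

```python
import gzip


def extra_bytes(context: str, candidate: str) -> int:
    """Extra compressed bytes needed to append candidate to context."""
    base = len(gzip.compress(context.encode("utf-8")))
    extended = len(gzip.compress((context + candidate).encode("utf-8")))
    return extended - base


def pick_continuation(examples, prompt, candidates):
    """Greedily pick the candidate that is cheapest to compress after examples + prompt."""
    context = " ".join(examples) + " " + prompt
    return min(candidates, key=lambda c: extra_bytes(context, c))


# Hypothetical toy data, for illustration only.
examples = ["the cat sat on the mat."] * 20
candidates = ["mat. the cat sat", "zqxj vwkp! 4815"]
print(pick_continuation(examples, "the cat sat on the ", candidates))
```

A continuation that repeats patterns already present in the examples compresses almost for free, so it wins. Turning the byte costs into weights for random sampling would be one possible next step, again only as an assumption about what the poster has in mind.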

  3. 10 months ago
    Anonymous

    Quick rundown? How does a compression software kill an AI company?

    • 10 months ago
      Anonymous

      14 lines of python outperforms 350 million parameter models.

      It's only a matter of time until a better compression algorithm allows the script to outperform the billion parameter llamas
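For a rough idea of what a script like that looks like, here is a minimal sketch of a gzip plus k-nearest-neighbour classifier. It is not the paper's exact code: the function names, the default `k=1`, and the two training sentences are assumptions made up for illustration.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings."""
    cx, cy = clen(x), clen(y)
    cxy = clen(x + " " + y)  # compressed length of the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: str, train, k: int = 1) -> str:
    """Return the majority label among the k training texts nearest to query.

    train is a list of (text, label) pairs.
    """
    nearest = sorted(train, key=lambda pair: ncd(query, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data, made up for illustration only.
train = [
    ("the striker scored a late goal and the home team won the football match", "sports"),
    ("the new compiler release speeds up builds and fixes a linker bug", "tech"),
]
print(classify("the striker scored a goal and the team won the football match", train))
```

There are no trained parameters here: the only "model" is the compressor plus the labelled examples, which is why the script can be so short.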

  4. 10 months ago
    Anonymous

    >llamas are just a noisy dictionary hack

  5. 10 months ago
    Anonymous

    Sometimes I wish I had a programmer brain so I could know exactly what the heck I'm looking at.

    • 10 months ago
      Anonymous

      it's a very simple idea: two things that look the same when compressed by an encoding algorithm must also look the same when uncompressed.

      this algorithm takes two things, x1 and x2, encodes them using gzip, calculates a distance, then uses that distance to compare x1 to anything else

      • 10 months ago
        Anonymous

        >: two things that look the same when compressed by an encoding algorithm must also look the same when uncompressed.
        This is completely wrong. Using Huffman encoding on the two strings "aaaaaaaa" and "mmmmmmmm" would yield the exact same encoded number 0b00000000. The distance is 0 despite the distance between the original strings being very big. You know nothing of what you're talking about.

        • 10 months ago
          Anonymous

          so how does it work then

          • 10 months ago
            Anonymous

            https://aclanthology.org/2023.findings-acl.426.pdf

            • 10 months ago
              Anonymous

              >https://aclanthology.org/2023.findings-acl.426.pdf
              So they're comparing how many new bytes are needed for the query string given an example string.
              It's sort of a Hamming distance of the embeddings, but faster to calculate?

              • 10 months ago
                Anonymous

                You could probably do a search to find a continuation given a set of example strings and a prompt string. Then this could be used for inference.

              • 10 months ago
                Anonymous

                https://en.wikipedia.org/wiki/Normalized_compression_distance
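For reference, the distance on that page is usually written as below. Here $C(\cdot)$ is the compressed length in bytes (gzip in this thread's case) and $xy$ is the concatenation of the two strings; the symbol names are just the common convention.

```latex
\mathrm{NCD}(x, y) = \frac{C(xy) - \min\{C(x),\, C(y)\}}{\max\{C(x),\, C(y)\}}
```

Because the formula depends on $C(xy)$ rather than on comparing the encoded bits directly, two strings with identical compressed lengths, such as "aaaaaaaa" and "mmmmmmmm", do not automatically get distance 0.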

      • 10 months ago
        Anonymous

        hmm.. I wonder if lossy compression would work. Doubt a hash like sha256 would work

        • 10 months ago
          Anonymous

          Hashing algorithms have an avalanche effect: a small change (a single bit flipped) in the input results in a big change in the output. Calculating the distance between hash sums is pointless because it has almost no relation to the distance between the hashed content.
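The point about hashes can be checked with a short sketch: flip one bit of the input and count how many bits of the SHA-256 digest change. The helper name `bit_difference` and the two sample strings are assumptions chosen for illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count differing bits (Hamming distance) between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# The inputs differ in exactly one bit: '.' is 0x2E and '/' is 0x2F.
text_a = b"It's over, OpenAI has just been killed by gzip."
text_b = b"It's over, OpenAI has just been killed by gzip/"

digest_a = hashlib.sha256(text_a).digest()
digest_b = hashlib.sha256(text_b).digest()

print("input bits changed: ", bit_difference(text_a, text_b))
print("digest bits changed:", bit_difference(digest_a, digest_b), "of 256")
```

For a well-behaved hash, roughly half of the 256 digest bits should flip, so the digest distance says essentially nothing about how similar the inputs were.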

  6. 10 months ago
    Anonymous
  7. 10 months ago
    Anonymous

    is text classification something like being able to feed it a stack overflow post and it recognizing that the post is discussing something to do with programming?

  8. 10 months ago
    Anonymous

    How do I use this breaking edge news to make money though?

    • 10 months ago
      Anonymous

      Install AI on your computer and tell it to invent immortality, allow donations on your website. BAM, set for life and praised for saving humanity.

    • 10 months ago
      Anonymous

      Scrape twitter and generate e-trade API calls via sentiment analysis from the example python code.
