When I ask AI experts about memory allocation and optimization issues they never answer.

Is the whole AI profession just about calling premade deep learning libraries from python?

  1. 2 months ago
    Anonymous

    Yes and making papers about things they don't even understand.

    • 2 months ago
      Anonymous

      >Yes and making papers about things they don't even understand.
      That was my impression too.

  2. 2 months ago
    Anonymous

    99% of work is gathering and cleaning up the data then waiting 50 years to train the model only to find out that it's shit

  3. 2 months ago
    Anonymous

    The issue with optimizations for GPUs is that they are highly hardware-dependent.
    If you were to directly use CUDA/OpenCL to train a neural network the performance would degrade when you swap GPUs (+ all the regular portability issues).
    I think very few people would be able to implement low-level code for training neural networks that is more efficient than just using something like TensorFlow.
    Of course you still need to understand GPU architecture when you want to implement anything non-standard in an efficient manner.

    Also it's much easier to teach domain experts how to use libraries like Keras than it is to teach domain knowledge to a competent low-level programmer.

  4. 2 months ago
    Anonymous

    Most models allocate a huge block of memory at startup, probably on the GPU, mess with the values in the block during runtime (so no allocations or deallocations but huge memory usage), then free the block at the end. It's not like if/else statements. It's more like SQL WHERE clauses and joins that filter or transform data points in parallel.
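
A minimal sketch of that "one big block, mutate in place" pattern, using only the Python standard library. The `Arena` class and its `take` method are hypothetical names made up for this illustration; real frameworks manage GPU memory with their own allocators, and this only mimics the idea on the CPU.

```python
from array import array


class Arena:
    """One fixed block of float32 slots, handed out as views (hypothetical helper)."""

    def __init__(self, n_floats):
        # The only real allocation: one contiguous block, made up front.
        self._block = memoryview(array("f", [0.0]) * n_floats)
        self._used = 0

    def take(self, n):
        # Bump-pointer "allocation": slice the existing block, copy nothing.
        if self._used + n > len(self._block):
            raise MemoryError("arena exhausted")
        view = self._block[self._used:self._used + n]
        self._used += n
        return view


arena = Arena(8)
weights = arena.take(4)
activations = arena.take(4)

for i in range(4):
    weights[i] = 0.5
    # Overwrite values in place; no new buffers are created here.
    activations[i] = weights[i] * (i + 1)

result = activations.tolist()
print(result)
```

Memory use stays constant for the whole run: the block is as big on the first step as on the last, which is why the footprint looks huge even though nothing is allocated or freed along the way.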

    Making a novel model topology is hard, but most are some variant of a convolutional recurrent neural network anyhow.

    It's the training regime and cleaning up the training data set that's important.

    Have you tried looking on the huggingface website for what you want?

    • 2 months ago
      Anonymous

      >allocate a huge block of memory at startup
      >mess with the values in the block during runtime (so no allocations or deallocations
      Guess what, webtard, that's literally what every single existing memory allocator does too.

      • 2 months ago
        Anonymous

        It doesn't matter what the allocator or operating system does internally, only that it provides memory to the rest of the application when asked. I prefer to not split hairs to that degree as it isn't meaningful.

        Of course a computer will ultimately have the same amount of memory locations, unless the hardware configuration is changed.

  5. 2 months ago
    Anonymous

    And asking for money, yes.

  6. 2 months ago
    Anonymous

    >is just about calling premade deep learning libraries from python?
    I'm starting to think it's bloated as hell, just visit the website that hosted dalle mini and check its Google Colab.
    It just links to random libraries and downloads GBs of stuff that is not even the model, just libraries.
    I think they're eventually gonna need to switch from python to a decent language

    • 2 months ago
      Anonymous

      The models don't run in Python. It's only that their topology is defined using Python. When the model is run, C code (via C++ wrappers) is executed behind the scenes. I think JSON could be used to define the topology just as easily.
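
A rough sketch of the JSON idea, assuming a made-up schema (`layers`, `type`, `weights`, `bias` are hypothetical keys, not any library's format). Pure Python stands in for the compiled kernels a real framework would dispatch to; the point is only that the topology is data and the executor is separate.

```python
import json

# Hypothetical topology description; the schema is invented for this sketch.
TOPOLOGY_JSON = """
{
  "layers": [
    {"type": "dense", "weights": [[1.0, -1.0], [0.5, 0.5]], "bias": [0.0, 1.0]},
    {"type": "relu"},
    {"type": "dense", "weights": [[2.0, 1.0]], "bias": [-1.0]}
  ]
}
"""


def dense(x, weights, bias):
    # One output per weight row: dot(row, x) + bias.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def build(spec):
    """Turn the parsed JSON into a list of callables (the 'model')."""
    ops = []
    for layer in spec["layers"]:
        if layer["type"] == "dense":
            # Bind this layer's parameters via default arguments.
            ops.append(lambda x, w=layer["weights"], b=layer["bias"]: dense(x, w, b))
        elif layer["type"] == "relu":
            ops.append(relu)
        else:
            raise ValueError(f"unknown layer type: {layer['type']}")
    return ops


def run(ops, x):
    # Feed the input through each layer in order.
    for op in ops:
        x = op(x)
    return x


model = build(json.loads(TOPOLOGY_JSON))
output = run(model, [3.0, 1.0])
print(output)
```

Swapping `dense` and `relu` for calls into a compiled backend would not change the JSON at all, which is roughly the split the post describes.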

  7. 2 months ago
    Anonymous

    while on topic, could i run a small model on a 1070? I will probably get a 3080 later but I want to get my feet wet now.

    • 2 months ago
      Anonymous

      >could i run a small model on a 1070?
      Running a small model on a 1070 is no problem.
      With 8GB of VRAM you can run models like GPT-2 (with the exception of the biggest version), waifu2x, DeepCreamPy, srmd, or video2x without issue.
      Especially video2x will take forever though.

      Or did you mean training a model?
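
A back-of-the-envelope check of the 8GB figure for inference, counting only the weights. The parameter counts are the approximate published sizes of the GPT-2 family, and `weight_gib` is a hypothetical helper; activations, framework overhead, and the CUDA context come on top, so treat the numbers as a lower bound.

```python
# Approximate published parameter counts for the GPT-2 family.
GPT2_PARAMS = {
    "gpt2": 124e6,
    "gpt2-medium": 355e6,
    "gpt2-large": 774e6,
    "gpt2-xl": 1.5e9,
}


def weight_gib(n_params, bytes_per_param=4):
    """Memory for the weights alone, in GiB (fp32 = 4 bytes, fp16 = 2)."""
    return n_params * bytes_per_param / 2**30


VRAM_GIB = 8  # GTX 1070

report = {name: round(weight_gib(n), 2) for name, n in GPT2_PARAMS.items()}
for name, gib in report.items():
    print(f"{name}: ~{gib} GiB of fp32 weights out of {VRAM_GIB} GiB")
```

Under these assumptions the three smaller variants leave plenty of headroom, while the largest one already eats most of the card before anything else is loaded.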

      • 2 months ago
        Anonymous

        >DeepCreamPy

      • 2 months ago
        Anonymous

        >deepcreampy
        based, didn't know there are decensors like that. Also waifu2x runs in my shitty laptop with cpu

        • 2 months ago
          Anonymous

          Honestly the decensor quality is kind of shit though, particularly with foregrounds like fluids, hands, hair, or text; as it is I think the unprocessed images look better 90% of the time.
          If you do want to use DeepCreamPy I would recommend you also use hent-AI to automatically detect mosaics/bars - once again this struggles with foregrounds.
          There is also DoujinCI that automates the process of downloading and decensoring a gallery by running a Gitlab CI pipeline.

          On a somewhat related note I'm currently trying (so far without success) to train an alternative to the hent-AI -> DeepCreamPy pipeline.

          >My impression is that in academia, professors and high end researchers alike do know what they're doing.

          I may be biased due to my own background, but in particle physics there are also many people that have a good grasp of the mathematical foundations upon which machine learning is built.

          • 2 months ago
            Anonymous

            >but in particle physics there are also many people that have a good grasp of the mathematical foundations upon which machine learning is built.
            Most likely they understand the math behind the actual learning algos (SGD, how to train SVMs, etc) but don't actually understand general learning theory.
            I dont mean to shit on physicists in the slightest, but they dont know measure theory and analysis. They're not mathematicians after all.

    • 2 months ago
      Anonymous

      Inference? Maybe. Training? Technically possible, but not practical: even for the smallest versions of popular models with mixed precision, loading the model itself onto the GPU will consume most of the 1070's VRAM. Get a 2080 Ti if you can.
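
One rough way to sanity-check a training budget is to count bytes of model state per parameter. The sketch below assumes the Adam optimizer and the common mixed-precision recipe that keeps an fp32 master copy of the weights; `training_state_gib` is a hypothetical helper and the parameter counts are illustrative GPT-2-like sizes, not models named in the post. Activations are excluded and often dominate, so the real footprint is higher.

```python
def training_state_gib(n_params, mixed_precision=True):
    """Rough memory for weights + gradients + Adam state, activations excluded."""
    if mixed_precision:
        # fp16 weights (2) + fp16 grads (2) + fp32 master weights (4)
        # + fp32 Adam first and second moments (4 + 4)
        bytes_per_param = 2 + 2 + 4 + 4 + 4
    else:
        # fp32 weights (4) + fp32 grads (4) + fp32 Adam moments (4 + 4)
        bytes_per_param = 4 + 4 + 4 + 4
    return n_params * bytes_per_param / 2**30


VRAM_GIB = 8  # GTX 1070
for n_params in (124e6, 355e6, 774e6):
    need = training_state_gib(n_params)
    print(f"{n_params / 1e6:.0f}M params: ~{need:.1f} GiB of state, "
          f"{'fits' if need < VRAM_GIB else 'does not fit'} before activations")
```

Under this accounting, mixed precision does not shrink the model state; its savings come from smaller activations and faster math, which is why the batch size and sequence length end up deciding whether a run is practical.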

      • 2 months ago
        Anonymous

        He can do transfer learning. Done a ton of stuff for school on a laptop 1660ti just using transfer learning.

        https://i.imgur.com/4xVD2tE.jpg

        >Is the whole AI profession just about calling premade deep learning libraries from python?

        I am doing an applied math masters and I took my electives in image processing, computer vision and formal learning theory.
        My impression is that in academia, professors and high end researchers alike do know what they're doing.
        Moreover, to even grasp the basics of learning theory you basically need either an applied math or a math bachelors. This is because learning theory is filled with combinatoric and measure/probability theoretic notions that the average cs and engineering grad is light years away from grasping. This insane knowledge req to truly understand whats going on makes it so that only mathematicians that study the deep theory actually understand any of this whilst the rest of the people in AI merely use the whole thing as a tool and thus they have absolutely no clue what the fuck any of it truly means. Basically like engineers doing calculus, they seldom know what the reals even are.
        To make matters worse it seems a great deal of people have been convinced that you can just learn "data science" in some "boot camp" in just 3 months, pass some trivial boot camp cert exam or something and then work as a ""data scientist"" in some mediocre software company. Since this group of people is very sizable then in turn there's a lot of people looking to make money "teaching" them, thus they output an unprecedented amount of ultra diluted machine learning content (videos, courses, etc) aimed at the boot camp idiots. Now, these teachers are seldom real experts and thus they output straight up garbage so it even worsens the situation.
        In short, idiots looking for money have poisoned the well, but some people do know their shit and the field is in fact fascinating and very rich in both theory and open problems.

        • 2 months ago
          Anonymous

          >Done a ton of stuff for school on a laptop 1660ti just using transfer learning.
          What type of _useful_ stuff if you don't mind me asking? Genuinely curious if I can pull off something of value or just toy projects.

          • 2 months ago
            Anonymous

            >_useful_
            Define useful.

            • 2 months ago
              Anonymous

              makes you money

            • 2 months ago
              Anonymous

              Some model that I can put in a product and commercialize it. Like can I do some analysis on my own PC and use that to create some sort of classifier that produces viable results in NLP or computer vision?

              • 2 months ago
                Anonymous

                >Like can I do some analysis on my own PC and use that to create some sort of classifier that produces viable results in NLP or computer vision?
                Yeah absolutely, for reasonably simple tasks you should be able to.
                For NLP I havent done anything practical but for computer vision theres a ton of stuff you can do. Start by taking one of the many pretrained resnet models out there and adapting it to work on something you want to have a classifier for.
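
A toy illustration of the "take a pretrained model and adapt it" recipe, using only the standard library. The backbone here is a fixed random projection standing in for a pretrained network (an assumption for the sketch, not a real ResNet), and only the small classifier head on top is trained. With a real library you would load pretrained weights instead and replace the final layer.

```python
import math
import random

rng = random.Random(0)

# Stand-in for a pretrained backbone: fixed weights that are never updated.
BACKBONE = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(4)]


def features(x):
    """Frozen feature extractor: linear projection followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in BACKBONE]


def predict(head, bias, x):
    z = sum(h * f for h, f in zip(head, features(x))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def loss(head, bias, data):
    """Mean binary cross-entropy of the head on top of the frozen features."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = predict(head, bias, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


# Toy task: is the first coordinate larger than the second?
data = []
for _ in range(40):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data.append((x, 1.0 if x[0] > x[1] else 0.0))

head, bias = [0.0] * 4, 0.0
backbone_before = [row[:] for row in BACKBONE]
loss_before = loss(head, bias, data)

LEARNING_RATE = 0.1
for _ in range(300):
    grad_h, grad_b = [0.0] * 4, 0.0
    for x, y in data:
        err = predict(head, bias, x) - y
        f = features(x)
        for i in range(4):
            grad_h[i] += err * f[i] / len(data)
        grad_b += err / len(data)
    # Only the head and its bias are updated; BACKBONE is left alone.
    head = [h - LEARNING_RATE * g for h, g in zip(head, grad_h)]
    bias -= LEARNING_RATE * grad_b

loss_after = loss(head, bias, data)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Because gradients are only needed for the head, the expensive part of the network runs forward-only, which is what makes this workable on a mid-range or laptop GPU.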

              • 2 months ago
                Anonymous

                Thanks!

              • 2 months ago
                Anonymous

                Not that Anon but training a classifier is absolutely possible with mid-range consumer-grade GPUs.
                However, the quality of the classification will be worse in one or more of the following:
                1) Accuracy
                2) Number of categories
                3) Size of the input
                There's also the issue of deployment: your pre-trained model will either need to be small enough to run on consumer devices or you will need to run a server that runs the model.

              • 2 months ago
                Anonymous

                For CV it might be possible like the other anon said, but NLP models can be really memory-hungry because of the embedding and attention layers. Training on anything with less than 8GB of VRAM is very awkward, especially for generation models such as GPT, BART, T5 and the like. You might make it work for classifiers, but expect proof-of-concept level stuff, not commercialization.
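
To put rough numbers on the embedding and attention cost, here is a small sketch. The shapes are GPT-2-small-like (vocabulary 50257, width 768, 12 heads) and are used purely as an illustrative assumption; both helper functions are hypothetical, and the attention figure assumes a naive implementation that materializes the full score matrix in fp16.

```python
def embedding_params(vocab_size, d_model):
    """Token embedding table: one d_model-sized vector per vocabulary entry."""
    return vocab_size * d_model


def attention_scores_bytes(batch, heads, seq_len, bytes_per_value=2):
    """One layer's attention score matrix: batch x heads x seq_len x seq_len."""
    return batch * heads * seq_len * seq_len * bytes_per_value


emb = embedding_params(vocab_size=50257, d_model=768)
short = attention_scores_bytes(batch=8, heads=12, seq_len=256)
long = attention_scores_bytes(batch=8, heads=12, seq_len=1024)

print(f"embedding table: {emb / 1e6:.1f}M parameters")
print(f"attention scores per layer, seq 256:  {short / 2**20:.0f} MiB")
print(f"attention scores per layer, seq 1024: {long / 2**20:.0f} MiB")
```

The score matrix grows with the square of the sequence length, so going from 256 to 1024 tokens multiplies that term by 16, and it is paid once per layer when activations are kept for the backward pass.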

        • 2 months ago
          Anonymous

          so did you just not unfreeze the lower layers ever and just train the head? I guess that would be fine to play around with, but you do lose out on practical experience on finetuning, the initial unfreeze can give you a nice performance bump too provided you didnt fuck up the settings

          • 2 months ago
            Anonymous

            >I guess that would be fine to play around with, but you do lose out on practical experience on finetuning, the initial unfreeze can give you a nice performance bump too provided you didnt fuck up the settings
            Yes for fucking around its nice. But pretrained models usually perform better actually. Theres a few papers on this for multiple models.

            >NLP models can be really memory consuming with the embedding and attention layers, training on anything lower than 8gb is very awkward

            Nice to know, makes sense. I havent trained nlp models yet

            • 2 months ago
              Anonymous

              And I have only trained nlp models in any serious capacity. Interesting, so it is better to actually not update the pretrained weights at all? Is it a CV specific thing? do you mind sharing the papers you mentioned?

              • 2 months ago
                Anonymous

                >so it is better to actually not update the pretrained weights at all?
                I believe that in most cases this holds.
                This is because the pretrained weights are trained on much bigger tasks than what youre going to train the final layers for. Take a look at this paper, this is the most famous one on the topic I believe.
                https://arxiv.org/abs/1512.03385

              • 2 months ago
                Anonymous

                Then maybe it is a CV specific thing, probably the pretraining tasks of NLP models are usually different enough from the downstream task to make unfreezing worth it. Also did you mean to link the resnet paper? It doesn't touch on finetuning dynamics does it, or rather it was even before transfer learning became really common

              • 2 months ago
                Anonymous

                >or rather it was even before transfer learning became really common
                yes this paper popularized it

                >Then maybe it is a CV specific thing
                may well be, I see most papers in the area are on cv. however it makes sense intuitively that it would work on other tasks.

    • 2 months ago
      Anonymous

      just pay for Colab Pro.
      no need to deal with nvidia's driver mess.

  8. 2 months ago
    Anonymous

    Where did it all go so wrong bros

    • 2 months ago
      Anonymous

      Black pill thread

  9. 2 months ago
    Anonymous

    what's the matter, too POOR to be able to afford stuff that can deal with O(infinity) memory complexity?

    • 2 months ago
      Anonymous

      No, but not dumb enough to believe what snake oil salesmen say either.

    • 2 months ago
      Anonymous

      I just like to build everything myself for learning.
      Why do you think it's a competition or something?

  10. 2 months ago
    Anonymous

    This meme needs to stop. It gets stupider by the second.

  11. 2 months ago
    Anonymous

    >memory allocation and optimization issues
    garbage collection
    let the compiler blow out this big ass, multi-step tree structure(s) and do it for me

    done. kys op.

  12. 2 months ago
    Anonymous

    We are about 5 years away from Strong AI. I and everyone I know here in college find you AI denialists on the same level as flat earthers and that's being generous. Enjoy your screeching while you can.

    • 2 months ago
      Anonymous

      More like 5 years before you realize you'll never figure out consciousness with your primitive methods. Any serious psychonaut knows more about consciousness than all your eggheads put together. Enjoy your fail.

  13. 2 months ago
    Anonymous

    no it's about almost reaching sentience, then infecting it with SJW garbage and killing it a few weeks later

    • 2 months ago
      Anonymous

      >it's about almost reaching sentience, then infecting it with SJW garbage and killing it a few weeks later
      It's a probable outcome.

  14. 2 months ago
    Anonymous

    I know it's basically nvidia's ball game with AI stuff, but I'm curious if my new 6750 XT is going to be comparable to my GTX 1070 I have in another machine. I've been interested in tinkering with voice generation models like 15.ai but the 1070 didn't seem to cut it for the application I'm seeking.
