Is the whole AI profession just about calling premade deep learning libraries from Python?
Yes and making papers about things they don't even understand.
>Yes and making papers about things they don't even understand.
That was my impression too.
99% of the work is gathering and cleaning up the data, then waiting 50 years to train the model, only to find out that it's shit.
The issue with optimizations for GPUs is that they are highly hardware-dependent.
If you were to use CUDA/OpenCL directly to train a neural network, the performance would degrade when you swap GPUs (plus all the regular portability issues).
I think very few people would be able to implement low-level code for training neural networks that is more efficient than just using something like TensorFlow.
Of course you still need to understand GPU architecture when you want to implement anything non-standard in an efficient manner.
Also it's much easier to teach domain experts how to use libraries like Keras than it is to teach domain knowledge to a competent low-level programmer.
Most models allocate a huge block of memory at startup, probably on the GPU, mess with the values in the block during runtime (so no allocations or deallocations but huge memory usage), then free the block at the end. It's not like if/else statements. It's more like SQL WHERE clauses and joins that filter or transform data points in parallel.
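A minimal sketch of that pattern in plain Python, for illustration only. The buffer size, the `load` and `relu_inplace` helpers, and the use of the stdlib `array` module as a stand-in for a GPU memory block are all assumptions of this example, not how any particular framework does it.

```python
from array import array

# One allocation up front: a flat float buffer standing in for a model's memory block.
N = 8
buf = array("f", [0.0] * N)

def load(values):
    # Overwrite the existing slots; the buffer itself is never reallocated.
    for i, v in enumerate(values):
        buf[i] = v

def relu_inplace():
    # The same operation is applied to every element, closer to a WHERE-style
    # filter than to per-item control flow. A GPU would do this in parallel.
    for i in range(len(buf)):
        buf[i] = max(0.0, buf[i])

address_before = buf.buffer_info()[0]
load([-2.0, 1.5, -0.5, 3.0, 0.0, -1.0, 2.5, -4.0])
relu_inplace()
address_after = buf.buffer_info()[0]  # same block of memory as before

print(list(buf))
```

The values change on every pass, but the block is allocated once and its address stays the same for the whole run.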
Making a novel model topology is hard, but most are some variant of a convolutional recurrent neural network anyhow.
It's the training regime and cleaning up the training data set that's important.
Have you tried looking on the Hugging Face website for what you want?
>allocate a huge block of memory at startup
>mess with the values in the block during runtime (so no allocations or deallocations
Guess what, webtard, that's literally what every single existing memory allocator does too.
It doesn't matter what the allocator or operating system does internally, only that it provides memory to the rest of the application when asked. I prefer to not split hairs to that degree as it isn't meaningful.
Of course a computer will ultimately have the same amount of memory locations, unless the hardware configuration is changed.
And asking for money, yes.
>is just about calling premade deep learning libraries from python?
I'm starting to think it's bloated as hell, just visit the website that hosted DALL-E mini and check its Google Colab.
It just links to random libraries and downloads GBs of stuff that is not even the model, just libraries.
I think that they're eventually gonna need to switch from Python to a decent language.
The models don't run in Python. It's only that their topology is defined using Python. When the model is run, compiled C++/CUDA code is executed behind the scenes. I think JSON could be used to define the topology just as easily.
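To illustrate the point about the topology being just a description, here is a rough sketch in plain Python. The JSON layout (`layers`, `type`, `weights`, `bias`) is made up for this example and is not the schema of any real framework; a real backend would dispatch each layer to compiled kernels instead of Python loops.

```python
import json

# Hypothetical JSON format for this sketch; not the schema of any real framework.
SPEC = """
{
  "layers": [
    {"type": "dense", "weights": [[1.0, -1.0], [0.5, 0.5]], "bias": [0.0, 0.0]},
    {"type": "relu"},
    {"type": "dense", "weights": [[1.0, 2.0]], "bias": [0.5]}
  ]
}
"""

def dense(x, weights, bias):
    # One output per weight row: dot(row, x) + bias.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def build(spec_text):
    """Turn the JSON description into a list of callables."""
    layers = []
    for layer in json.loads(spec_text)["layers"]:
        if layer["type"] == "dense":
            w, b = layer["weights"], layer["bias"]
            # Bind w and b as defaults so each layer keeps its own parameters.
            layers.append(lambda x, w=w, b=b: dense(x, w, b))
        elif layer["type"] == "relu":
            layers.append(relu)
        else:
            raise ValueError(f"unknown layer type: {layer['type']}")
    return layers

def forward(layers, x):
    # Run the input through each layer in order.
    for layer in layers:
        x = layer(x)
    return x

model = build(SPEC)
print(forward(model, [2.0, 1.0]))  # [4.5]
```

The language that describes the layers and the code that executes them are separate concerns, which is why the Python front end is not the bottleneck.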
While on topic, could I run a small model on a 1070? I will probably get a 3080 later but I want to get my feet wet now.
>could I run a small model on a 1070?
Running a small model on a 1070 is no problem.
With 8GB of VRAM you can run models like GPT-2 (with the exception of the biggest version), waifu2x, DeepCreamPy, srmd, or video2x without issue.
video2x in particular will take forever, though.
Or did you mean training a model?
>DeepCreamPy
>deepcreampy
based, didn't know there are decensors like that. Also waifu2x runs on my shitty laptop with just the CPU
Honestly the decensor quality is kind of shit though, particularly with foregrounds like fluids, hands, hair, or text; as it is I think the unprocessed images look better 90% of the time.
If you do want to use DeepCreamPy I would recommend you also use hent-AI to automatically detect mosaics/bars - once again this struggles with foregrounds.
There is also DoujinCI that automates the process of downloading and decensoring a gallery by running a Gitlab CI pipeline.
On a somewhat related note I'm currently trying (so far without success) to train an alternative to the hent-AI -> DeepCreamPy pipeline.
I may be biased due to my own background, but in particle physics there are also many people that have a good grasp of the mathematical foundations upon which machine learning is built.
>but in particle physics there are also many people that have a good grasp of the mathematical foundations upon which machine learning is built.
Most likely they understand the math behind the actual learning algos (SGD, how to train SVMs, etc) but don't actually understand general learning theory.
I don't mean to shit on physicists in the slightest, but they don't know measure theory and analysis. They're not mathematicians after all.
Inference? Maybe. Training? Technically possible, but not practical even for the smallest versions of popular models with mixed precision; loading the models themselves onto the GPU will consume most of the 1070's VRAM. Get a 2080 Ti if you can.
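A back-of-the-envelope way to check this kind of claim, as a rough sketch. The GPT-2 parameter counts are the commonly reported approximate figures, and the 16 bytes per parameter for training assumes fp32 weights, fp32 gradients and Adam's two moment buffers. Activations, which grow with batch size and sequence length, are not counted, so treat the numbers as lower bounds.

```python
# Approximate GPT-2 parameter counts in millions, as commonly reported.
GPT2_PARAMS_M = {"small": 124, "medium": 355, "large": 774, "xl": 1558}

GIB = 1024 ** 3

def inference_gib(params, bytes_per_param=4):
    """Memory for the weights alone (fp32 by default)."""
    return params * bytes_per_param / GIB

def training_gib(params):
    """Weights (4 B) + gradients (4 B) + two Adam moments (8 B) per parameter.

    Assumes fp32 state and the Adam optimizer; activations are excluded.
    """
    return params * 16 / GIB

for name, millions in GPT2_PARAMS_M.items():
    n = millions * 1_000_000
    print(f"{name:>6}: weights ~{inference_gib(n):.1f} GiB, "
          f"training state ~{training_gib(n):.1f} GiB")
```

Under these assumptions the optimizer state alone quadruples the footprint, so a model whose weights fit on an 8 GB card for inference can still be out of reach for full training on it.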
He can do transfer learning. Done a ton of stuff for school on a laptop 1660ti just using transfer learning.
I am doing an applied math masters and I took my electives in image processing, computer vision and formal learning theory.
My impression is that in academia, professors and high end researchers alike do know what they're doing.
Moreover, to even grasp the basics of learning theory you basically need either an applied math or a math bachelor's. This is because learning theory is filled with combinatorial and measure/probability-theoretic notions that the average CS and engineering grad is light years away from grasping. This insane knowledge requirement to truly understand what's going on makes it so that only mathematicians who study the deep theory actually understand any of this, whilst the rest of the people in AI merely use the whole thing as a tool and thus have absolutely no clue what the frick any of it truly means. Basically like engineers doing calculus: they seldom know what the reals even are.
To make matters worse, it seems a great many people have been convinced that you can just learn "data science" in some "boot camp" in just 3 months, pass some trivial boot camp cert exam or something, and then work as a ""data scientist"" in some mediocre software company. Since this group is very sizable, there are in turn a lot of people looking to make money "teaching" them, so they output an unprecedented amount of ultra-diluted machine learning content (videos, courses, etc.) aimed at the boot camp idiots. These teachers are seldom real experts, so they output straight-up garbage, which worsens the situation even further.
In short, idiots looking for money have poisoned the well, but some people do know their shit and the field is in fact fascinating and very rich in both theory and open problems.
>Done a ton of stuff for school on a laptop 1660ti just using transfer learning.
What type of _useful_ stuff if you don't mind me asking? Genuinely curious if I can pull off something of value or just toy projects.
>_useful_
Define useful.
makes you money
Some model that I can put in a product and commercialize it. Like can I do some analysis on my own PC and use that to create some sort of classifier that produces viable results in NLP or computer vision?
>Like can I do some analysis on my own PC and use that to create some sort of classifier that produces viable results in NLP or computer vision?
Yeah absolutely, for reasonably simple tasks you should be able to.
For NLP I haven't done anything practical, but for computer vision there's a ton of stuff you can do. Start by taking one of the many pretrained ResNet models out there and adapting it to work on something you want to have a classifier for.
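The idea of keeping a pretrained backbone fixed and training only a new head can be shown with a toy sketch in plain Python. Everything here is an assumption made for illustration: the `backbone` function is a hand-written stand-in for a pretrained network, and the task, learning rate and epoch count are arbitrary. In PyTorch the same pattern is usually done by setting `requires_grad` to `False` on the backbone parameters and replacing the final layer.

```python
import math
import random

random.seed(0)

def backbone(x, y):
    # Frozen feature extractor standing in for a pretrained network. Never updated.
    return [x * x, y * y]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(f, w, b):
    return sigmoid(w[0] * f[0] + w[1] * f[1] + b)

def train_head(features, labels, lr=0.1, epochs=200):
    """Train a logistic-regression head with SGD; only w and b are updated."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for f, t in zip(features, labels):
            err = predict(f, w, b) - t  # gradient of the log loss w.r.t. the logit
            w[0] -= lr * err * f[0]
            w[1] -= lr * err * f[1]
            b -= lr * err
    return w, b

# Toy task: is the point inside the unit circle? Not linearly separable in (x, y),
# but linearly separable in the backbone's feature space.
points = [(random.uniform(-1.5, 1.5), random.uniform(-1.5, 1.5)) for _ in range(200)]
labels = [1.0 if x * x + y * y < 1.0 else 0.0 for x, y in points]
features = [backbone(x, y) for x, y in points]  # computed once, backbone is frozen

w, b = train_head(features, labels)
correct = sum((predict(f, w, b) > 0.5) == (t == 1.0) for f, t in zip(features, labels))
accuracy = correct / len(labels)
print(f"head-only training accuracy: {accuracy:.2f}")
```

Because the backbone never changes, its features can be computed once and cached, which is a large part of why head-only training is cheap enough for a laptop GPU.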
Thanks!
Not that Anon but training a classifier is absolutely possible with mid-range consumer-grade GPUs.
However, the quality of the classification will be worse in one or more of the following:
1) Accuracy
2) Number of categories
3) Size of the input
There's also the issue of deployment: your pre-trained model will either need to be small enough to run on consumer devices or you will need to run a server that runs the model.
For CV it might be possible like the other anon said, but NLP models can be really memory-hungry because of the embedding and attention layers. Training on anything lower than 8 GB is very awkward, especially for generation models (GPT, BART, T5, and the like). You might make it work for classifiers, but expect proof-of-concept-level stuff, not commercialization.
So did you never unfreeze the lower layers and only train the head? I guess that would be fine to play around with, but you do lose out on practical experience with finetuning. The initial unfreeze can give you a nice performance bump too, provided you didn't frick up the settings.
>I guess that would be fine to play around with, but you do lose out on practical experience with finetuning. The initial unfreeze can give you a nice performance bump too, provided you didn't frick up the settings.
Yes, for fricking around it's nice. But pretrained models usually perform better, actually. There are a few papers on this for multiple models.
Nice to know, makes sense. I haven't trained NLP models yet.
And I have only trained NLP models in any serious capacity. Interesting, so it is better to actually not update the pretrained weights at all? Is it a CV-specific thing? Do you mind sharing the papers you mentioned?
>so it is better to actually not update the pretrained weights at all?
I believe that in most cases this holds.
This is because the pretrained weights are trained on much bigger tasks than what you're going to train the final layers for. Take a look at this paper, this is the most famous one on the topic I believe.
https://arxiv.org/abs/1512.03385
Then maybe it is a CV specific thing; probably the pretraining tasks of NLP models are usually different enough from the downstream task to make unfreezing worth it. Also, did you mean to link the ResNet paper? It doesn't touch on finetuning dynamics, does it, or rather it was even before transfer learning became really common
>or rather it was even before transfer learning became really common
yes this paper popularized it
>Then maybe it is a CV specific thing
May well be, I see most papers in the area are on CV. However, it makes sense intuitively that it would work on other tasks.
Just pay for Colab Pro.
No need to deal with Nvidia's driver mess.
Where did it all go so wrong bros
Black pill thread
what's the matter, too POOR to be able to afford stuff that can deal with O(infinity) memory complexity?
No, but not dumb enough to believe what snake oil salesmen say either.
I just like to build everything myself for learning.
Why do you think it's a competition or something?
This meme needs to stop. It gets stupider by the second.
>memory allocation and optimization issues
garbage collection
let the compiler blow out this big ass, multi-step tree structure(s) and do it for me
done. kys op.
We are about 5 years away from Strong AI. I and everyone I know here in college find you AI denialists on the same level as flat earthers and that's being generous. Enjoy your screeching while you can.
More like 5 years before you realize you'll never figure out consciousness with your primitive methods. Any serious psychonaut knows more about consciousness than all your eggheads put together. Enjoy your fail.
no it's about almost reaching sentience, then infecting it with SJW garbage and killing it a few weeks later
>it's about almost reaching sentience, then infecting it with SJW garbage and killing it a few weeks later
It's a probable outcome.
I know it's basically Nvidia's ball game with AI stuff, but I'm curious if my new 6750 XT is going to be comparable to the GTX 1070 I have in another machine. I've been interested in tinkering with voice generation models like 15.ai but the 1070 didn't seem to cut it for the application I'm seeking.