because it's "artificial intelligence", not "artificial stupidity"
>because it's "artificial intelligence", not "artificial stupidity"
Because people need to actually get things done bud
faster and easier to write them in python
you can make "AI" in C but it will take longer and be more error-prone
Not to mention, most of the AI people don't actually know how to program.
All the performance-sensitive parts are written in native languages. Python is just the wrapper for loading and serving models, where rapid development is more important than performance.
Also ML researchers aren't programmers, and C is hard.
Define "native language".
A language whose compiler generates native code for the platform, you doofus.
By this definition, all languages can be "native", because you can implement a compiler generating "native" code from them.
yes. u get a cookie, anon
a language whose primary executable representation is a collection of CPU opcodes. why?
>All the performance-sensitive parts are written in native languages.
Maybe 10 years ago; now we just cythonize, if at all.
Cython is python-flavored C; it IS a native language.
The tools cater to the lowest common denominator.
You do sacrifice performance by using python for AI. But it's a small fraction. Not orders of magnitude slower since AI compute is so colossally skewed towards the GPU which is running native code.
Even though a rough C++ port of LLaMA in a weekend resulted in it being able to run on a single CPU at comparable speeds?
Do you have a source for that?
He's probably referring to this project.
https://github.com/ggerganov/llama.cpp
It came to him in a dream
it actually is in C/C++. Python just gets bindings to native AI libs.
>Multidimensional subscript operator with slicing and dynamic return types.
>Simple easy to use package manager
I've ported python torch to C++ torch; it's a mess, the code is 10x more verbose and 100x less understandable, and doing it in C would be masochism.
Just think about how you would implement the following python expression in C:
x[0:8, 66, :, -1, 0:32]
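For what it's worth, the libtorch C++ API gets you most of the way there. A rough, untested sketch of that exact line (assuming x is a torch::Tensor with at least five dimensions):

#include <torch/torch.h>
using namespace torch::indexing;

// python: x[0:8, 66, :, -1, 0:32]
torch::Tensor slice_example(const torch::Tensor& x) {
    return x.index({Slice(0, 8), 66, Slice(), -1, Slice(0, 32)});
}

And that's the friendly path. Doing the same thing in plain C on a raw strided buffer is where the masochism starts.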
You use a language that was actually meant for mathematics.
He uses the language that lets him get things done. You don't use any language for anything.
it's python because AI researchers don't know anything about programming and they just want to
import neuron
import training_data
new array<neuron>[1 trillion] Muh_neural_network
Muh_neural_network.train(training_data)
Muh_neural_network.run()
Fucking ass. Give them credit where it's due. Sometimes they have to write a dockerfile as well! Do you know how freaking hard it is to not only remember Python syntax, but also docker?
And don't forget, they have to remember to notify the DevOps team to actually deploy the software too because they can't into pipelines. That's THREE FUCKING STEPS. Ten years of studying per step finally paying off and ignorant fuckwads like you popping up and disrespecting?
>new array<neuron>[1 trillion] Muh_neural_network
kek is this actually how people do it?
i always wondered how do people choose the amount of neurons & layers when training neural networks.
i've never done AI programming, but i know how it works (at least i've seen vids of the whole neurons-connected-via-weights & backpropagation-to-adjust-errors thing), yet i never understood this part.
Do they unironically just pick a random amount of neurons & layers and see what works best?
It's not, but then again it's not too far out.
> how do people choose the amount of neurons
Everyone has a different heuristic, which also depends on goal. Generally you want the least amount that is still capable of maxing out performance (which, in the massive data regime like in commercially bruteforced models, is as much as your GPUs allow because you have so much data you will never truly overfit).
>& layers
Experience and rules of thumb: if it's image data or otherwise has a local coherency structure, conv layers. If it is a long sequence, RNNs, especially LSTMs or GRUs. Otherwise fully connected. Exceptions when an fc layer would be too many params so you use hacks to make it work. In addition, sometimes you find that multiple layers can learn adequately, and the function they learn is different enough that you can combine their outputs to achieve much higher performance, or similar performance but much better generalization, etc.
Then there are all kinds of hacks, like if you overfit you use dropout, if you have trouble learning past a certain point you use layernorm or batchnorm (some people do that systematically), if you have a deep architecture you use skip connections, if you have lots of vram you replace lstms by transformers, you can add attention on basically anything (so long as there's enough data for the model to learn attention strategies), etc.
>Do they unironically just pick a random amount of neurons & layers and see what works best?
Unironically it's not too far off from how it works in practice, yeah.
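If anyone wants the bag of tricks above in concrete form, here's a minimal, untested sketch against the libtorch C++ API: one conv block with batchnorm, a skip connection, and dropout. The name SmallNet and all the sizes are made up for illustration.

#include <torch/torch.h>

// one conv block with batchnorm, a skip connection, and dropout,
// i.e. the standard bag of tricks from the post above
struct SmallNetImpl : torch::nn::Module {
    torch::nn::Conv2d conv1{nullptr}, conv2{nullptr};
    torch::nn::BatchNorm2d bn1{nullptr}, bn2{nullptr};
    torch::nn::Dropout drop{nullptr};
    torch::nn::Linear fc{nullptr};

    SmallNetImpl() {
        conv1 = register_module("conv1",
            torch::nn::Conv2d(torch::nn::Conv2dOptions(3, 16, 3).padding(1)));
        bn1 = register_module("bn1", torch::nn::BatchNorm2d(16));
        conv2 = register_module("conv2",
            torch::nn::Conv2d(torch::nn::Conv2dOptions(16, 16, 3).padding(1)));
        bn2 = register_module("bn2", torch::nn::BatchNorm2d(16));
        drop = register_module("drop", torch::nn::Dropout(0.5));          // overfitting? add dropout
        fc = register_module("fc", torch::nn::Linear(16 * 32 * 32, 10));  // assumes 32x32 inputs, 10 classes
    }

    torch::Tensor forward(torch::Tensor x) {
        auto h = torch::relu(bn1->forward(conv1->forward(x)));
        h = torch::relu(bn2->forward(conv2->forward(h)) + h);  // skip connection
        h = drop->forward(h);
        return fc->forward(h.flatten(1));
    }
};
TORCH_MODULE(SmallNet);

The point is that each "hack" from the post is literally one module or one extra term in forward(), which is why trying architectures out is cheap and the picking process can stay empirical.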
The optimization math done for AI research is much harder than whatever programming work a bootcamp programmer does for mr shekelstein's website
99% of AI research involves no math at all.
The remaining 1% virtually never contributes to AI advances. Notable exceptions are the neural ODE line of work, because the theoretical background allowed the evolution of diffusion models into what they are today (even though the same derivations are now usually obtained from the generalization of VAEs rather than from PDEs/Langevin dynamics), and, very arguably, VAEs themselves (although that is really Monte Carlo, which really isn't math, plus basic Bayes' rule with a simple term rewriting for the lower bound). A good example is dropout: it was first found by engineers by chance, with no mathematical justification at all. Things like LSTMs were also ultimately an engineering effort, not mathematically derived.
og ML stuff DID have a lot of math involved, though. Look at SVMs or even Nesterov momentum. Not deep learning.
at least 95% of the CPU cycles of any python program are spent executing c++
69.422% of all statistics are fake and gay
i wouldn't lie to you
100% of CPU cycles of any C++ program are spent executing assembly instructions.
not true
when you run c++ code you aren't calling back to machine code that was generated from assembly
You never "run" C++ code. All existing implementations are based on an AOT compilation model where the whole program is compiled into assembly, then assembled before it is run.
Even then, sometimes you do interface with separate, handwritten code that was directly written in assembly. The standard library function memcpy is usually one such piece of code.
by "run c++ code" i mean running machine code that originally was generated from c++.
What i mean by "when you run a python program you are spending 95% of the cycles running c++ code" is that for every few cycles the python interpreter spends running python code, it calls back into a library that was written in c++ and executes that for far more cycles
>What i mean by "when you run a python program you are spending 95% of the cycles running c++ code" is that for every few cycles the python interpreter spends running python code, it calls back into a library that was written in c++ and executes that for far more cycles
So what? It is an implementation detail. For some reason, you're trying to use this to claim that Python programs aren't actually written in Python, which is an absurdity.
also assembly =/= machine code
Needless pedantry; the two terms don't have a clear definition anyway.
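To put a face on "the interpreter calls back to a library that was written in c++": a minimal sketch with pybind11. The module name fastmath and the function are invented for illustration; the point is just that the per-element loop runs as native code and Python pays interpreter overhead once per call, not once per element.

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
namespace py = pybind11;

// the hot loop lives in C++; Python just makes one call into it
double sum_squares(py::array_t<double> xs) {
    auto v = xs.unchecked<1>();   // 1-D view, no bounds checks
    double acc = 0.0;
    for (py::ssize_t i = 0; i < v.shape(0); ++i)
        acc += v(i) * v(i);
    return acc;
}

PYBIND11_MODULE(fastmath, m) {
    m.def("sum_squares", &sum_squares, "sum of squares over a float64 array");
}

From the Python side it's just import fastmath; fastmath.sum_squares(arr), which is exactly the 95%-of-cycles-spent-elsewhere situation being argued about.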
Because C morons aren't stupid enough to work for free. (If they are, they can't into C)
Believe it or not, the increase in productivity allowed by high level, dynamic languages is, more often than not, worth the performance and resource costs.
import easy, pointers hard
afaik Python is the most popular scripting language of academia so all of the AI researchers probably started out with some level of prior experience in it, and it's also probably fairly well suited for the rapid development of glue logic required by these tasks
>afaik Python is the most popular scripting language of academia
I thought it's matlab?
It is, if you are a boomer
If you actually worked in AI you'd know we use both
Python is better. Simple as.
the students over at MIT were taught python so they did all their machine learning/ai stuff in that. It's that simple.
All the compute intensive code is written in C or C++ with CUDA. It basically all runs on the GPU so using Python to dispatch GPU programs doesn't affect compute much
they're only wrapped in python
because python just werks
>why is most ML done in python
It's not, most of the time python is just used to interface with a toolset like torch or tensorflow which run on cuda
Cope. Those are just deps; they have nothing to do with the AI.
Brain status: not found. Digits revoked.
Because you need a very quick iterative development workflow which C doesn't allow, and all the heavy lifting is done in cuda which results in being 50-100x faster than if it was C.
>done in cuda which results in being 50-100x faster than if it was C
This is what python gays actually think, no clue about what they are actually doing.
You know this shit was benchmarked to hell because GPU makers wanted to justify selling more GPUs but people who couldn't afford it wanted to max out CPU performance instead, right cletus?
>You know this shit was benchmarked to hell
Eh, not really.
But you don't understand what my post meant. Look up C, Python, CUDA, CPU and GPU, then come back.
Congrats, your chromosome test came back "above average".
Ok I'll help you out.
>all the heavy lifting is done in cuda
CUDA is a parallel computing platform and application programming interface
>than if it was C.
C is a general-purpose computer programming language.
Are you starting to get it?
I'm sorry to hear that you're clinically retarded. However, this is not a site for the mentally disabled. You may find that
[...]
is more your speed.
Ok well, here's the answer for you:
Cuda is not a programming language
You can use cuda with most languages, including C of course
In fact, this is often done because it's much faster than python. Not the whole project, but the performance-critical parts.
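As a sketch of "CUDA is a platform you call from C/C++, not a language of its own": plain C++ host code that ships a matrix multiply to the GPU through cuBLAS. Error handling omitted, matrices assumed square and stored column-major (the cuBLAS convention); all names and sizes are just for illustration.

#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <vector>

// multiply two n x n float matrices on the GPU via cuBLAS
void gpu_matmul(const std::vector<float>& A, const std::vector<float>& B,
                std::vector<float>& C, int n) {
    float *dA, *dB, *dC;
    size_t bytes = sizeof(float) * n * n;
    cudaMalloc((void**)&dA, bytes);
    cudaMalloc((void**)&dB, bytes);
    cudaMalloc((void**)&dC, bytes);
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

    cublasHandle_t h;
    cublasCreate(&h);
    const float alpha = 1.0f, beta = 0.0f;
    // C = alpha * A * B + beta * C, all on the device
    cublasSgemm(h, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                &alpha, dA, n, dB, n, &beta, dC, n);

    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cublasDestroy(h);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
}

Whether that function gets called from C, C++, or a Python binding barely changes anything; the GEMM runs on the device either way, which is the actual point of contention above.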
I accept your surrender. Next time, try getting a brain before posting pure retardation.
why don't we do some benchmarks ITT?
Give a simple example of how some AI code currently written in Python would be much faster in C.
its not, ive written a framework in C
Because people have sex.