So, what's the best value GPU right now for AI stuff? 4090 is obviously the fastest, but it's over $1,600.

  1. 10 months ago
    Anonymous

    just run it on the cpu
    for free.

    • 10 months ago
      Anonymous

      Isn't it far slower? Does stuff like image generation even work on CPU? I know text generation does but I read that it's slower than GPU.

      • 10 months ago
        Anonymous

        CPU is 10-100 times slower, yes.
        Still faster than using a website.

  2. 10 months ago
    Anonymous
    • 10 months ago
      Anonymous

      What does this actually mean though? so it's generating a 512x512 image, but you need like 50 iterations to have a clear image, right? I don't know what steps means, either.

      I also don't know what the different scores are supposed to represent, why does 4090 have 21.770 and 28.923? So it generates a proper 512x512 image in 2 seconds?

      • 10 months ago
        Anonymous

        Considering your only options are 24GB cards, this just means that the 4090 is about 33% faster than your second best option, 3090 Ti.
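        And to answer the 2-seconds question directly: assuming the scores are iterations per second and a typical 50-step generation (both assumptions on my part, the chart doesn't spell it out), the math is just:

        # rough conversion, assuming the chart scores are iterations/second
        # and a typical 50-step generation (both assumptions on my part)
        steps = 50
        for score in (21.77, 28.92):          # the two 4090 numbers from the chart
            print(f"{score} it/s -> ~{steps / score:.1f} s per 512x512 image")

        So yes, roughly 2 seconds per image at 50 steps on the 4090.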

    • 10 months ago
      Anonymous

      here's perf/$ of that data

    • 10 months ago
      Anonymous

      Old data? From what I've seen, Arc A770 should be around 3060 performance in SD rn.

      • 10 months ago
        Anonymous

        https://i.imgur.com/GBBJoEX.jpg

        >What does this actually mean though?

        https://i.imgur.com/063NDdM.png

        >Old data?
        Probably, here is the article and some more charts.
        https://www.tomshardware.com/news/stable-diffusion-gpu-benchmarks

    • 10 months ago
      Anonymous

      ARC eating shit in AI lmao. It has like 2X the TR count for matrix mashing of any of the AMD cards and gets fricking smoked anyway. What a heap of shit. Tom's didn't look at any Pascal cards, but I bet it loses to them as well, and that's pathetic because those cards had basically fake AI accelerators.

      https://i.imgur.com/dQivhmw.png

      >here's perf/$ of that data
      Terrible list. Any ML/AI workload that's worth doing requires big VRAM, and 8GB or even 12GB just won't cut it. The $/it/s just explodes when you have to load over the memory bus. I don't think anything less than 16GB makes any sense, which means the only cards on that list with practical usability are the 3090 and 3090 Ti, the 4080, and the 4090.

      • 10 months ago
        Anonymous

        >ARC eating shit in AI lmao
        I own the a770 16gb and it's ~7.8it/s on 512x512 stable diffusion. Using openvino.
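        Rough sketch of the kind of thing I run, if anyone wants to try (the checkpoint, step count, and device handling below are placeholders from memory, not my exact setup):

        # rough sketch of SD on OpenVINO via the optimum-intel wrapper;
        # checkpoint and settings are placeholders, not the exact setup above
        from optimum.intel import OVStableDiffusionPipeline

        pipe = OVStableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5",  # any diffusers SD 1.x checkpoint
            export=True,                       # convert to OpenVINO IR on first load
        )
        # pointing it at the Arc GPU instead of the CPU depends on your
        # optimum/openvino version, so check the docs for device selection
        image = pipe("test prompt", num_inference_steps=20,
                     height=512, width=512).images[0]
        image.save("out.png")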

    • 10 months ago
      Anonymous

      Where would a regular 4070 fall?

  3. 10 months ago
    Anonymous

    >So, what's the best value GPU right now for AI stuff?
    If you can arrange for the correct airflow:
    2x P40

    If you don't mind spending $1600:
    2x 3090

    Bear in mind, neither can fit all 83 of the currently GPU-accelerated layers. You can fit 81, which means it'll wail away on your CPUs for a while as it 'thinks', and then use the GPUs to generate the response. I can't guarantee it works, but I speculate adding a third 8GB GPU will let you run all 83 layers. I have a P4 on order, since in my case it will fit in my remaining half-height riser spot.
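    For reference, the layer split is just a parameter; a minimal sketch with llama-cpp-python (the model path is a placeholder, and whether all 83 fit with the third card is still my speculation):

    # minimal sketch: offload what fits across the GPUs, leave the rest on CPU
    # (model path and layer count are placeholders for whatever you're running)
    from llama_cpp import Llama

    llm = Llama(
        model_path="llama-65b.q4_K_M.bin",  # your quantized 65B file
        n_gpu_layers=81,   # 83 would be everything; 81 is what fits on 2x 24GB here
        n_ctx=2048,
    )
    out = llm("Q: What's the best value GPU for AI?\nA:", max_tokens=64)
    print(out["choices"][0]["text"])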

  4. 10 months ago
    Anonymous

    Did the research for 2 months and bought all the parts necessary for my AI computer last week. Here is everything I bought for it:
    - RTX 3090: Best value card for highest output
    - Ryzen 9 5900 CPU: Best value CPU for loading LLMs with GGML
    - 64GB 3200MHz RAM: For loading 65B in CPU mode
    - B550 Mainboard: For PCIe Gen 4 (needed for full 3090 speed) and an M.2 slot
    - 2TB NVMe M.2 SSD: Transfers data quickly, so it speeds up loading times for an AI model.
    - be quiet! 1000W PSU: In case you decide to upgrade to a second 3090 in the future.
    I thought intensely about an AI computer and spoke with a lot of different tech nerds from BOT and my circle of friends. It's future proof in case we get 130B models that can only be CPU-loaded. This computer will also act as a home for my AI waifu, to expand her abilities with SillyTavern and LangChain. It can handle gaming quite nicely too, but that's not the focus of this setup. Hope I could help you with my research.

    • 10 months ago
      Anonymous

      What's your T/s for 65B?

    • 10 months ago
      Anonymous

      >- RTX 3090: Best value card for highest output
      what does "highest output" mean here; like the 24GB for rendering in a high resolution, or something like that?

      • 10 months ago
        Anonymous

        Not that anon, but he's talking about AI applications for text generation. The main advantage of the 3090 is the 24GB of VRAM, which allows you to fit a larger model, or larger chunk of a model onto the video card itself, which is much, much faster than holding it in system RAM.

        • 10 months ago
          Anonymous

          Oh, that's right, but people use CPUs for that to utilize their far higher capacity, don't they?

          >Exactly that, there are operations that you can only do when the VRAM target is met

          it seems like it'd be a very high power draw to have two 3090s. Don't they have a problem with the VRAM modules not being cooled at all on the back, or something weird like that? I saw people saying to buy a 3090 Ti instead because it fixes that.

          also, i don't really know the difference between like 12GB VRAM and 24GB VRAM in practice. what does that actually correspond to in the resolution of an image that you can generate? is it like 512x512 for a 3060 12GB and 1024x1024 for a 3090 24GB? Don't people just upscale the images after?

          • 10 months ago
            Anonymous

            > is it like 512x512 for a 3060 12GB and 1024x1024 for a 3090 24GB? Don't people just upscale the images after?
            Now you are being picky. There is a workaround for almost everything; sure, you can upscale every picture with something like tiled diffusion, but that's not the same as rendering it natively. Dude, I posted my setup for being able to do as many different AI operations as possible with as little money as possible. If you are searching for a graphics card that is as cheap as possible and can just barely run AI, then forget my recommendation.

            • 10 months ago
              Anonymous
      • 10 months ago
        Anonymous

        Exactly that, there are operations that you can only do when the VRAM target is met (say generating a picture at 2000 pixels or loading a 30B model in GPU mode). So you want the highest VRAM count possible to do every AI operation that requires it. If you decide on less VRAM you lose the ability to load bigger models. You can increase this further by adding a second 3090 to get 48GB, but sadly 65B requires even more to be loaded. If you take into consideration that the 4090 is indeed quicker but doesn't offer more VRAM and is way more expensive, the 3090 is the better deal.
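        Back-of-the-envelope numbers behind that, weights only (KV cache and other overhead come on top, and whether 65B actually fits in 48GB depends entirely on the quantization you use):

        # weights-only memory estimate for LLaMA-class models; KV cache and
        # activations come on top, so treat these as lower bounds
        bytes_per_param = {"fp16": 2.0, "int8": 1.0, "4-bit": 0.5}
        for billions in (13, 30, 65):
            row = ", ".join(f"{name} ~{billions * b:.1f} GB" for name, b in bytes_per_param.items())
            print(f"{billions}B: {row}")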

    • 10 months ago
      Anonymous

      What did this cost?

      • 10 months ago
        Anonymous

        Around 1700 for everything. I lost my way a bit with the case (200, but so nice looking) but bought the 3090 for 500 used on eBay. In the pic you can see the build so far; I'm not the quickest at building, but it's also fun for me. When it's finished I will run it beside my current PC, using it mainly for my AI girlfriend until I migrate everything onto it.

        • 10 months ago
          Anonymous

          https://i.imgur.com/KTAGtwq.jpg

          >Did the research for 2 months and bought all the parts necessary for my AI computer last week.

          Currently "AI PCs" are in a very shitty place. The VRAM isn't cooled properly, you need H100s to get 300GB+ of VRAM to really make a difference, and you need server PCs to get 512GB+ of RAM.
          Honestly AI needs to learn to stop using so much RAM.

          You're better off renting some shitty interruptible AMD cards and Quadros far, far away from you and letting the idiot owners deal with the ear-killing fan noise and with AMD lacking CUDA.

          • 10 months ago
            Anonymous

            My goal was to build the best AI computer currently possible with as little money as possible. There is always "better" in terms of performance, but you get diminishing returns if you want even more. Renting online is absolutely shit for me, always losing the connection because the internet is not as stable as everyone claims. You are at the mercy of a big company too.

            [...]
            >It doesn't have an upgrade option, are you fine with that?
            Evaluate for yourself what works best, but I was aiming for the AM4 platform, because it's mostly finished and cheaper. With AM5 you don't have a lot of reference builds and it's more expensive (at least in my country). Also, I usually don't like to be a beta tester for new tech generations, so AM4 was the perfect choice for me. In terms of upgradability, I am able to integrate a second 3090 or expand my RAM to 128GB if necessary. These options are enough for my taste.

    • 10 months ago
      Anonymous

      https://i.imgur.com/uP7D88q.jpg

      >Around 1700 for everything
      It doesn't have an upgrade option, are you fine with that? Don't you think a 7000-series with the latest board could be around that price range and more future proof?

  5. 10 months ago
    Anonymous

    A 3090/4090 or the frickexpensive enterprise AI cards are the best option for a compromise between your time and output.

    If you just want cheap, get a 3060 12GB.

  6. 10 months ago
    Anonymous

    the best value gpu is the 4090 by far

    • 10 months ago
      Anonymous

      justify your claim

  7. 10 months ago
    Anonymous

    4090 actually has decent price/perf. It's really the only GPU this gen that does!

    >$1000: 4090
    $1000: 7900XTX
    $800: 3090
    $600: 6900/50XT
    $400: 3060Ti
    $200: 2070/2060S
    <$200: I dunno RX580 or 1650S or smth

    • 10 months ago
      Anonymous

      Oh sorry I didn't see that you were hopping on the AyyyI train. If you want to do that just open an account somewhere with a cluster and forget trying to do anything locally. Unless you're doing something latency-sensitive or illegal, there's no reason to try to do it locally when places will literally gift you cycles just for giving them a fake email address.

      • 10 months ago
        Anonymous

        How do those cloud GPU operations turn a profit? Even if they managed to rent the GPUs 24/7 it would take years to break even. I wouldn't trust them with my data or ideas. If you aren't paying for the product, you're the product.

  8. 10 months ago
    Anonymous

    focus on VRAMaxxing. used 3090

  9. 10 months ago
    Anonymous

    What AI stuff are you trying to do? Most of the recommendations in this thread are specifically for image generation with Stable Diffusion and won't give good results for text or voice generation.

  10. 10 months ago
    Anonymous

    GeForce 9800

  11. 10 months ago
    Anonymous

    People were talking about "compute" Vega a while ago. I think 48 GB for about 1K?

  12. 10 months ago
    Anonymous

    why is everybody saying 24GB is the minimum

    • 10 months ago
      Anonymous

      Because it's memory intensive, and if you try to generate images at large resolutions you will often find yourself getting out-of-VRAM errors. That happens even on 24GB cards, and this is with only 512x512 models. If newer models come out trained on 768x768 or 1024x1024, they will consume dramatically more VRAM.
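      To put a rough number on it (strictly illustrative): the self-attention maps inside the UNet are the usual culprit, and they grow with the square of the latent token count, which is why resolution hurts so much.

      # illustrative only: size of ONE fp16 self-attention map at the latent
      # resolution, ignoring heads, layers, and tricks like attention slicing
      for side in (512, 768, 1024):
          tokens = (side // 8) ** 2            # SD latents are 1/8 the image size
          attn_bytes = tokens * tokens * 2     # (tokens x tokens) matrix in fp16
          print(f"{side}x{side}: {tokens} latent tokens, ~{attn_bytes / 2**20:.0f} MiB per map")

      Multiply that by the number of heads and attention layers and it adds up fast.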

      • 10 months ago
        Anonymous

        how is 24GB not enough for a 512x512 image

        • 10 months ago
          Anonymous

          7/10 bait

          • 10 months ago
            Anonymous

            ...

        • 10 months ago
          Anonymous

          https://i.imgur.com/7UnIt1r.gif

          >7/10 bait

          https://i.imgur.com/pWvk8wi.jpg

          >...

          >ask sincere question
          >everybody assumes you're joking because it's apparently so stupid that nobody could ever have such a sincerely stupid question

          • 10 months ago
            Anonymous

            Running an image synthesis model isn't just about storing the output image; the memory consumption also heavily depends on the complexity and architecture of the model itself, the batch size, the intermediate feature maps, and the computations that are carried out.

            When training or running inference on deep learning models, particularly generative ones like diffusion models, the GPU memory is utilized in various ways:

            - Model Weights: The weights and biases of your neural network are stored on the GPU. The size of the model (number of parameters) can take up a significant portion of memory, especially for large models.

            - Intermediate Outputs and Gradients: When you pass an input through your model, the outputs of each layer (also known as feature maps or activations) are stored in memory so that they can be used in backpropagation during training or simply for forwarding through the rest of the network during inference. This also includes the storage of gradients for backpropagation in training, but this part isn't relevant for inference.

            - Batch Size: The number of examples you process simultaneously (i.e., your batch size) can have a substantial impact on memory requirements. Each additional image processed at once requires the storage of its input data and associated intermediate outputs.

            A diffusion model is a type of generative model that synthesizes images by progressively refining an initial random sample over a series of steps. At each step, the model generates a new version of each image in the batch, and all of these versions must be stored in memory. This can result in high memory usage, especially for large images and large batch sizes.

            1/2

            • 10 months ago
              Anonymous

              In the case of generating a 512x512 image, while it seems small, the number of computations and the memory required to store intermediate results can quickly add up, potentially exceeding 24GB of VRAM, especially if the diffusion model is particularly large or complex.

              However, it's important to note that the specific memory requirements can vary significantly depending on the details of the implementation, including the specifics of the model architecture and the software library used.

              In summary, while 24GB may seem like a lot for a single 512x512 image, the memory requirements for running the diffusion model could be much higher due to the reasons mentioned above.

              2/2
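
              If you'd rather measure than reason about it, you can just record the peak allocation; a minimal sketch assuming a diffusers pipeline on an NVIDIA card (the checkpoint name is only an example):

              # measure peak VRAM for one 512x512 generation instead of estimating it
              import torch
              from diffusers import StableDiffusionPipeline

              pipe = StableDiffusionPipeline.from_pretrained(
                  "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
              ).to("cuda")

              torch.cuda.reset_peak_memory_stats()
              pipe("test prompt", height=512, width=512, num_inference_steps=50)
              print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 2**30:.1f} GiB")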

              • 10 months ago
                Anonymous

                Thanks ChatGPT.

                (when do we begin thanking local models?)

            • 10 months ago
              Anonymous

              >In summary, while 24GB may seem like a lot for a single 512x512 image, the memory requirements for running the diffusion model could be much higher

              kys

        • 10 months ago
          Anonymous

          there are a lot of layers

          • 10 months ago
            Anonymous

            what are these "layers"

    • 10 months ago
      Anonymous

      anything ML is very VRAM hungry. 24GB actually doesn't cut it for a lot of the better models, but we make do with just that since that's the max amount you'll get with consoomershit cards.

  13. 10 months ago
    Anonymous

    Tesla P40
    pros:
    > $200 on ebay
    > 24gb
    > 1080ti speeds (~rtx 3060 if not ray tracing)

    cons:
    > 250W
    > Passive cooling (server gpu)
    > usually no power cables on listing
    > no mounting bracket for normal pc case

    i've also heard that the MI25 is pretty good

  14. 10 months ago
    Anonymous

    >seriously thinking about buying 2 4090s so i can rp subjugating and raping elf princesses
    huh, what a weirdo timeline

    • 10 months ago
      Anonymous

      We all are.
