NVIDIA releases GPT model to be run locally on your RTX 30 and 40 series GPU card
OpenAI, AMD, Grok sisters??? our response?
Cucking the consumer GPUs so they don't compete with their thousand-dollar PRO ones while simultaneously advertising the former as AI-friendly feels like they're laughing at us.
https://danbo.org/register?with=BOT
>Google won
Google euro
>it's okay for shills to spam bot as long as they give hiro money! sign up at buyanad.com
talk about being a fricking cuck
you will never be a janny
Doesn't run offline; you need a permanent cloud connection for the account subscription, plus extensive telemetry
>Doesnt run offline
it runs offline
YouTube integration and more; the user-facing layer is half the job of making a GPT model usable
>youtube integration and more
It's like 20 lines of code using LlamaIndex. Nothing difficult, you mostly just copy-paste from the documentation.
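Something like this (a minimal sketch using the legacy download_loader API from that era; exact imports vary by llama-index version, and it assumes you have an OpenAI key or some local LLM configured as the backend):

from llama_index import VectorStoreIndex, download_loader

# Pull the community YouTube transcript loader from llama-hub
YoutubeTranscriptReader = download_loader("YoutubeTranscriptReader")

# Placeholder video URL; the loader fetches the transcript as Documents
docs = YoutubeTranscriptReader().load_data(
    ytlinks=["https://www.youtube.com/watch?v=dQw4w9WgXcQ"]
)

# Embed the transcript and answer questions against it
index = VectorStoreIndex.from_documents(docs)
print(index.as_query_engine().query("Summarize this video."))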
these posts remind me of that infamous comment on hackernews about dropbox
paraphrasing from memory but it was
>you can already do all that with rsync and a rented vps
oh my god, I actually looked for it and it's even worse.
Did 2007's hackernews users all move to BOT?
He's only wrong in that he's proposing to use FTP and not SFTP.
Not a viable solution for everyone who wants their files on the go, since it requires a bit more PC knowledge, but still a solution compared to cloud syncing
>"for a linux user, you can already build such a system yourself quite trivially by..." [non-trivial setup]
I hope Linuxhomosexuals never change. Not in a million years. Not in ten billion years. Not past the heat death of the fricking universe.
with Nvidia it's different because they literally don't intend to make money off of this (particular) app
as usual everything they do, from publishing open source papers with code to making demo apps, is for the sole purpose of building a CUDA moat and selling more GPUs
He's not wrong
>Doesnt run offline
What the frick?
>select your model
>mistral
>llama
this is just a llama UI for brainlets
And with Nvidia telemetry.
I'm downloading it right now because why not, but the models available here (LLaMA and Mistral) have been publicly available for months now.
There is already a lot of mature, open-source software available to run these models so "less botnet" is hardly an argument in favor of Chat with RTX.
at the very least we might expect better performance than with the open source solutions
On the other hand, NVIDIA is only offering int4 quantization and not any of the larger models like Yi or Mixtral that you could potentially fit in 24 GB VRAM.
Mistral is 7b and my guess is that the LLaMA model they offer is also 7b.
You can already run these models at extremely high speeds with open source software, even more performance would hardly make a difference.
My expectation is that the models themselves will be extremely moronic by comparison, but the option to search documents could be interesting.
>Mistral is 7b and my guess is that the LLaMA model they offer is also 7b.
What does 7b mean?
7 billion parameters.
It's made of matrices that, all added up, contain 7 billion numbers.
Those numbers can be quantized at different sizes. So if you use one byte per number, that's 7GB in file size. If you use 2 bytes per number, that's 14 GB.
So going by parameter count is more generalizable than file size.
Remember these things are literally just data in a bespoke encoding. If it's 14 GB, that's the upper limit on how much it "knows" (and the actual knowledge is probably lower than that, due to redundant or non-encoding parameters). Shrinking it to 7 GB makes it a little dumber, but not as bad as you might think.
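To put numbers on it (a quick back-of-the-envelope sketch; real model files add a bit of metadata overhead on top):

# Approximate file sizes for a 7B-parameter model at common widths
params = 7_000_000_000

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    size_gb = params * bits / 8 / 1e9
    print(f"{name}: {size_gb:.1f} GB")

# fp16: 14.0 GB
# int8: 7.0 GB
# int4: 3.5 GB

Which is why shipping only int4 keeps these models small enough for 8 GB cards.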
anything coming from the Freetard Corp. is probably not worth anyone's time
>trusting anything nshitia puts out to be "without botnet"
from the looks of it, that youtube thing is literally just downloading the transcript file and adding it to the prompt context, possibly feeding it through a small embedding model first to speed things up by only adding the most relevant chunks. Literally nothing you can't already do now.
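For reference, the transcript part alone is about this much code (a sketch assuming the youtube-transcript-api package with its older class-method API; the video ID is just a placeholder):

from youtube_transcript_api import YouTubeTranscriptApi

video_id = "dQw4w9WgXcQ"  # placeholder video ID
chunks = YouTubeTranscriptApi.get_transcript(video_id)
transcript = " ".join(c["text"] for c in chunks)

# Stuff the whole transcript into the prompt context
prompt = f"Context:\n{transcript}\n\nQuestion: what is this video about?"
# ...then hand `prompt` to whatever local model you already run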
>Literally nothing you can't already do now.
go back to kindergarten
Cope
You're getting excited about repackaged open source shit that has been available for months and in some cases years. But daddy Jensen is shitting it into your mouth so that makes it special right?
performance, and actually backed by paid programmers and not hobbyists who work slowly, commie
Jensen should ask for his money back on this one.
go back to lewddit
what's with the gay porno music?
just use freedomGPT
>35 GB archive
kek I really need to buy a new SSD.
My 4060 has 6GB of RAM but the website says I need 8GB. FRICK.
a 4060 has 8 gigs what are you talking about
Notebook GPU..
Wait, actually it does have 8GB. Why the frick did I think it had 6GB? Neat.
>My 4060 has 6GB of RAM
Are you moronic?
Can this shit access URLs and summarize?
Otherwise it's fricking useless. I miss being able to ask ChatGPT to summarize an article or 'search for any references to xxx in this url and tell me'
Now it's neutered and tells you it can't access external sources and never had the ability to do so (it did)
https://docs.llamaindex.ai/en/stable/examples/data_connectors/WebPageDemo.html
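Which boils down to something like this (a minimal sketch following those docs, using the legacy import path; newer llama-index versions moved the reader to llama_index.readers.web):

from llama_index import VectorStoreIndex, SimpleWebPageReader

# Fetch the page and strip it down to plain text
docs = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://docs.llamaindex.ai/"]  # any article URL
)

index = VectorStoreIndex.from_documents(docs)
print(index.as_query_engine().query("Summarize this page."))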
>Now it's neutered and tells you it cant access external sources and never had the ability to do so (it did)
it literally still does this
it fakes it
you used to be able to ask it for instance, "What source file do I need to modify in Quake 2 to alter the physics of the players movement" and it would give you the file and the code
Now it just shits out "I DONT HAVE ACCESS" blah blah
>The primary file you'll want to get your hands dirty with is pmove.c. This file is part of the game's source code that handles player movement physics, including jumping, walking, and running dynamics.
????
Is that gpt 4? could be why
>bitching about gpt3
wow i'm so surprised the free service has worse outputs than the paid product
I'm just asking, you stupid gorilla Black person. I stopped paying for GPT-4 when they kept censoring it
Tried to install it several times, and got pic related when trying to launch. Why can't anyone in the LLM space make an installer that works? This is giving me oobabooga vibes.
Is there any way to get it to run on a 2060 12GB?
>gr*dio
is this some intern's job or something?
unless they somehow increased the context size, it's just a neat little toy. Maybe you can integrate it into Word or Outlook to write better emails.
I highly doubt this is better than the local models we can already run with llama.cpp
Just a reminder you can also use
https://gpt4all.io/ and it supports CUDA ootb
Or get podman/docker and run models with an ollama container; you need to install the CUDA toolkit first, though (sketch below)
https://hub.docker.com/r/ollama/ollama
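Once the container is up, talking to it is a few lines (a sketch against Ollama's documented local REST API; the model name and prompt are just examples):

# Start the container first, roughly:
#   docker run -d --gpus=all -p 11434:11434 --name ollama ollama/ollama
#   docker exec -it ollama ollama pull mistral
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Why is the sky blue?", "stream": False},
)
print(resp.json()["response"])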
>redditBlack person tools
Or just use koboldcpp or ooba like a normal /lmg/ denizen
Platform: Windows
I'm moronic, I installed it but don't know how to run it.
And why would I care about it over any existing /lmg/ solution?
My thoughts exactly. It's like the Stable Diffusion vs DALLE-3 situation. One is better than the other, but guess what? Your local model will never get "updated"/lobotomized. The only target audience this will have are normies who are perfectly okay with interacting with an ever more locked down advertising bot. Everyone else who wants to use AI for something will rely on something they control, even if they do not create it themselves.
>My thoughts exactly. It's like the Stable Diffusion vs DALLE-3 situation. One is better than the other, but guess what? Your local model will never get "updated"/lobotomized.
Except compared to, for example, Ooba, Chat with RTX is worse not only in terms of cucking but also in plain software quality:
It's trash.
Don't worry bros amd will release a chatbot that'll work on 4gb vram nvidia cards 😉