I’m honestly amazed OP is hitting 30 GB regularly. I’d wager it’s a tall tale. It’s sort of perfect troll bait on a forum, because you end up with people sounding nuts defending web browser RAM usage against the common position that browsers are RAM hogs.
Worst of both worlds? In theory this is accurate; in practice, it isn’t. The crux of why people are fine with it, as far as I can tell, is “but these games still have cheaters” - people aren’t looking for 0 cheaters so much as < X% cheaters, keeping the odds low that any given match they’re in has a cheater.
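To put a number on the “odds of a tainted match” intuition, here’s a toy back-of-the-envelope sketch in Python (the cheater rate and lobby size are made-up illustrative values, not data from any game):

    # Toy illustration: P(a match contains at least one cheater), assuming
    # independence, a made-up cheater rate p, and a made-up lobby size n.
    def p_match_has_cheater(p: float, n: int) -> float:
        return 1 - (1 - p) ** n

    print(p_match_has_cheater(0.01, 10))   # ~0.096: 1% cheaters -> ~10% of 10-player matches tainted
    print(p_match_has_cheater(0.001, 10))  # ~0.010: 0.1% cheaters -> ~1% of matches tainted

Point being, pushing the cheater rate down by an order of magnitude pushes the per-match odds down roughly the same way, even if it never hits zero.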
There’s a bunch to explore on this, but I’m thinking this is a good entry point. NYT instead of OpenAI docs or blogs because it’s a 3rd party, and NYT was early to substantively explore this, culminating in this article.
Regardless, the engagement thing is dark and hangs over everything. The conclusion of the article made me :/ re: this (tl;dr: it surprised them, they worked to mitigate it, but business as usual wins; to wit, they declared a “code red” re: ChatGPT usage nearly directly after finally getting out an improved model they had worked hard on).
“Experts agree that the new model, GPT-5, is safer. In October, Common Sense Media and a team of psychiatrists at Stanford compared it to the 4o model it replaced. GPT-5 was better at detecting mental health issues, said Dr. Nina Vasan, the director of the Stanford lab that worked on the study. She said it gave advice targeted to a given condition, like depression or an eating disorder, rather than a generic recommendation to call a crisis hotline.
“It went a level deeper to actually give specific recommendations to the user based on the specific symptoms that they were showing,” she said. “They were just truly beautifully done.”
The only problem, Dr. Vasan said, was that the chatbot could not pick up harmful patterns over a longer conversation, with many exchanges.”
“[An] M.I.T. lab that did [an] earlier study with OpenAI also found that the new model was significantly improved during conversations mimicking mental health crises. One area where it still faltered, however, was in how it responded to feelings of addiction to chatbots.”
We are in mutual bafflement: this is just like COVID because the AI bubble will pop, causing a recession and market crash?
From what I see in other comments, if you can confidently assert “AI bubble; no one will want GPUs soon,” it makes sense, but the COVID stuff is a head-scratcher.
Yes, with the direct conclusion from that being, tl;dr: in theory OP’s explanation could mitigate RAM usage; in practice, it’s worse.
(Source: I maintain an app integrated with llama.cpp. In practice, no one likes the 1 tkn/s generation speeds you get from swapping, and honestly MoE makes the RAM situation worse, because model developers have servers, batch inference, and multiple GPUs wired together. They are more than happy to increase the resting RAM budget and use even more parameters; limiting the active experts is about inference speed from that lens, not anything else.)
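To make the “MoE doesn’t shrink resident RAM” point concrete, here’s napkin math with illustrative, Mixtral-ish numbers (assumptions, not measurements from my app):

    # Napkin math: MoE shrinks the weights touched *per token*, not the
    # weights that must stay resident. All numbers below are illustrative.
    BYTES_PER_PARAM = 0.5            # assume a 4-bit quant
    total_params    = 47e9           # all experts + shared layers
    active_params   = 13e9           # experts actually used per token

    resident_gb = total_params * BYTES_PER_PARAM / 1e9   # ~23.5 GB must stay in RAM
    touched_gb  = active_params * BYTES_PER_PARAM / 1e9  # ~6.5 GB read per token

    print(f"resident ~{resident_gb:.1f} GB, touched per token ~{touched_gb:.1f} GB")
    # Which experts fire changes per token and per layer, so paging the
    # "inactive" ~17 GB out to disk is exactly what produces ~1 tkn/s.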
This is a PR release meant to accompany the scientific work shown in the actual source / link. I don’t mean to be argumentative; it’s just that I’d have wanted back the time I spent reading it after reading the Nature version. It’s just “go read Nature” + 3 bullet points + anodyne CXO quotes.
Fair enough, I do like parent’s a bit better. “Blurting processing” feels like too strong a default setting right after seeing “I’m thinking” :) - not that any of it matters anyway; communicating _something_ gets you there. The rest is just triaging around the edges of what people will call you weird for, and if they do, they were going to anyway.
There's this curious experience of people bringing up geohot / tinygrad where you can tell they've been sold into a personality cult.
I don't mean that pejoratively, and I apologize for the bluntness. It's just that I've been dealing with his nonsense since iPhone OS 1.0 x jailbreaking, and I hate seeing people taken advantage of.
(nvidia x macs x thunderbolt has been a thing for years and years and years, well before geohot) (the tweet is a non sequitur beyond bog-standard geohot tells: the odd obsession with LoC, and we're 2 years away from Changing The Game, just like we were 2 years ago)
My deepest apologies, I can't parse this and I earnestly tried: 5 minutes of my own thinking, then 3 LLMs, then a 10-minute timer of my own thinking over the whole thing.
My guess is you're trying to communicate "tinygrad doesn't need GPU drivers," which maybe gets transmuted into "tinygrad replaces CUDA," and you think "CUDA means other GPUs can't be used for LLMs, thus nvidia has a stranglehold."
I know George has pushed this idea for years now, but you need look no further than AMD/Google making massive deals to understand how it works on the ground.
I hope he doesn't victimize you further with his rants. It's cruel of him to use people to assuage his own ego and make them look silly in public.