There's a reason Huang's quote is in context of PRC talent.
The PRC generates a plurality of global AI talent and is projected to generate disproportionately more in the short/medium term. Nvidia wants to delay a future where 50%+ of global AI talent works on the Huawei/Ascend ecosystem, or to fragment the PRC talent pool so that, net, 50%+ of global AI talent works on the CUDA moat for as long as possible (which is strategically significant if you're a near/medium-term AGI believer). The reality is that a huge share of foreseeable US AI talent is going to come from the PRC anyway, but without fragmenting domestic PRC talent, the PRC will be the net talent winner. The second-order effect is that the PRC gets a one-way valve to siphon knowledge (diffusion plus espionage) out of Silicon Valley while the reverse flow is far more controlled, a PRC AI black box; and the faster the PRC proliferates domestic AI, the less ability the West/Nvidia has to brain-drain talent developed in that system. Add a hardware gap closing over 5/10/15 years and an energy gap firmly in the PRC's favour (i.e. the US likely can't close the energy gap faster than the PRC can close the compute gap), and you have a trendline where the PRC sprints from a few years behind to years ahead once the compute constraint lifts, on the back of its talent advantage.
Genuine question: does anyone use any of these text-to-image models regularly for non-trivial tasks? I'm curious how they get used. It seems like there's a new model reaching the top 3 every week.
I think a lot of it comes down to habit. Claude is definitely careful and Codex runs off too eagerly before discussing much (and the lack of a plan mode doesn't help), but I think we just learn how to use them. These days, anything I don't like goes into AGENTS.md, where I tweak the instructions until the model understands.
I think those three concepts complement each other quite neatly.
MCPs can wrap APIs to make them usable by an LLM agent.
Skills offer a context-efficient way to make extra instructions available to the agent only when it needs them. Some of those instructions might involve telling it how best to use the MCPs.
Sub-agents are another context management pattern, this time allowing a parent agent to send a sub-agent off on a mission - possibly involving both skills and MCPs - while saving on tokens in that parent agent.
Yes, it's a mess, and there will be a lot of churn, you're not wrong, but there are foundational concepts underneath it all that you can learn and then it's easy to fit insert-new-feature into your mental model. (Or you can just ignore the new features, and roll your own tools. Some people here do that with a lot of success.)
The foundational mental model to get the hang of is really just:
* An LLM
* ...called in a loop
* ...maintaining a history of stuff it's done in the session (the "context")
* ...with access to tool calls to do things. Like, read files, write files, call bash, etc.
Some people call this "the agentic loop." Call it what you want, you can write it in 100 lines of Python. I encourage every programmer I talk to who is remotely curious about LLMs to try that. It is a lightbulb moment.
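The four bullets above can be sketched in a few dozen lines. `call_llm` here is a hypothetical stand-in for whatever model API you use (OpenAI, Anthropic, a local model), and the tool set and message format are illustrative, not any particular vendor's:

```python
import json
import subprocess

# The tools: plain Python functions the model can ask us to run.
TOOLS = {
    "read_file": lambda path: open(path).read(),
    "write_file": lambda path, content: open(path, "w").write(content) or "ok",
    "bash": lambda cmd: subprocess.run(
        cmd, shell=True, capture_output=True, text=True
    ).stdout,
}

def run_agent(task, call_llm, max_steps=20):
    history = [{"role": "user", "content": task}]    # the "context"
    for _ in range(max_steps):                       # the loop
        reply = call_llm(history)                    # the LLM
        history.append({"role": "assistant", "content": json.dumps(reply)})
        if reply["type"] == "final":
            return reply["text"]
        # A tool call: dispatch to the matching function, feed the
        # result back into the history, and go around again.
        result = TOOLS[reply["tool"]](**reply["args"])
        history.append({"role": "tool", "content": str(result)})
    return "step limit reached"
```

That's the whole skeleton; everything a real coding agent adds (permissions, streaming, context trimming) hangs off this loop.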
Once you've written your own basic agent, if a new tool comes along, you can easily demystify it by thinking about how you'd implement it yourself. For example, Claude Skills are really just:
1) A bunch of files with instructions for the LLM in them.
2) Search for the available "skills" on startup and put all the short descriptions into the context so the LLM knows about them.
3) Also tell the LLM how to "use" a skill. Claude just uses the `bash` tool for that.
4) When Claude wants to use a skill, it uses the "call bash" tool to read in the skill files, then does the thing described in them.
and that's more or less it, glossing over a lot of things that are important but not foundational like ensuring granular tool permissions, etc.
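Steps 1-3 might look something like this sketch. The directory-per-skill layout with a `SKILL.md` whose first line is a short description is an assumption for illustration, not necessarily Anthropic's exact on-disk format:

```python
import pathlib

def discover_skills(skills_dir):
    """Scan for skills on startup: one directory per skill,
    each containing a SKILL.md (layout assumed for illustration)."""
    skills = {}
    for skill_file in pathlib.Path(skills_dir).glob("*/SKILL.md"):
        # Only the first line (the short description) is read up front.
        first_line = skill_file.read_text().splitlines()[0]
        skills[skill_file.parent.name] = (first_line, skill_file)
    return skills

def skills_preamble(skills):
    """Build the blurb injected into the context so the LLM knows
    what skills exist and how to read their full instructions."""
    lines = ["Available skills (read the file with the bash tool before using one):"]
    for name, (desc, path) in sorted(skills.items()):
        lines.append(f"- {name}: {desc} ({path})")
    return "\n".join(lines)
```

The key context-efficiency trick is visible here: only the one-line descriptions are paid for up front; the full instructions cost tokens only when the model actually reads the file.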
RAG is taking a bunch of docs, chunking them into text blocks of a certain length (how best to do this is up for debate), and creating a search API that takes a query (like a Google search) and compares it to the document chunks (very much how you're describing). Take the returned chunks, ignore the score from the vector search, feed those chunks into a re-ranker along with the original query (this step is important; vector search mostly sucks), filter the re-ranked results down to the top 1-2, and then format a prompt like:
The user asked 'long query'; we fetched some docs (see below); answer the query based on the docs (reference the docs if you feel like it)
Doc1.pdf - Chunk N
Eat cheese
Doc2.pdf - Chunk Y
Don't eat cheese
You then expose the search API as a "tool" for the LLM to call, slightly reformatting the prompt above into a multi-turn convo, and suddenly you're in ze money.
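The whole pipeline reads roughly like this sketch. The embedding search and the re-ranker are stubbed out with keyword overlap so the example runs standalone; in practice you'd back `vector_search` with an embedding model plus a vector index, and `rerank` with a cross-encoder:

```python
def chunk(doc_name, text, size=500):
    """Split a doc into fixed-size text blocks (the naive strategy;
    how best to chunk is, as noted, up for debate)."""
    return [(doc_name, i, text[start:start + size])
            for i, start in enumerate(range(0, len(text), size))]

def overlap(query, text):
    """Stand-in relevance scorer: shared-word count. A real system
    would use embeddings here, and a cross-encoder in rerank()."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def vector_search(query, chunks, k=10):
    return sorted(chunks, key=lambda c: overlap(query, c[2]), reverse=True)[:k]

def rerank(query, candidates, k=2):
    # Ignore the vector-search scores and re-score the candidates,
    # keeping only the top k for the prompt.
    return sorted(candidates, key=lambda c: overlap(query, c[2]), reverse=True)[:k]

def build_prompt(query, top_chunks):
    parts = [f"The user asked: '{query}'. Answer based on the docs below, "
             "citing them where relevant.\n"]
    for doc, n, text in top_chunks:
        parts.append(f"{doc} - Chunk {n}\n{text}\n")
    return "\n".join(parts)
```

Exposing this as a tool is then just registering `vector_search` + `rerank` + `build_prompt` behind one function the agent can call.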
But once your users are happy with those results they'll want something dumb like the latest football scores, then you need a web tool - and then it never ends.
To be fair though, it's pretty powerful once you've got it in place.
No, you are still not getting it. MCP will never go away, or at least something like it will always end up existing.
What you are describing ("if web API calling abilities were to improve") will not change anything. What sort of improvement are you thinking of? They can only get better at outputting JSON correctly, but that hasn't really been a problem for a long time now.
Either way, it wouldn't change anything, because MCP is a hundred other things that have nothing to do with LLMs using tools directly. You will never embed everything that MCP can do "into" the LLM; that barely even makes sense to talk about. It's not just a wire protocol.
This highlights something today's cloud-addicted generation seems to completely ignore: who has control of your data.
I'm not talking about contractual control (which is largely moot, since pretty much every cloud service has a ToS grossly skewed toward its own interests over yours, with indemnification clauses, blanket grants to share your data with "partners" without specifying who they are or precisely what details are conveyed, mandatory arbitration, and all kinds of other exceptions to what you'd consider respectful decency), but rather where your data lives and is processed.
If you truly want to maintain confidence it'll remain private, don't send it to the cloud in the first place.
> Users requiring raw chains of thought for advanced prompt engineering can contact sales
So it seems like all three of the LLM providers are now hiding the CoT - which is a shame, because seeing it helped you notice when the model was going down the wrong track and let you quickly refine the prompt to ensure it didn't.
In addition to OpenAI, Google also recently started summarizing the CoT, replacing it with an (in my opinion) overly dumbed-down summary.
Changing my old coding behavior aside, the biggest limiting factor for me is understanding how and why the coding agent is doing things a certain way, so that I have the confidence to continually sharpen my tools.
I want something simple that I have full control over, if only to understand how it works. So I made a minimal coding agent (with edit capability) that is fully functional using only seven tools: read, write, diff, browse, command, ask, and think.
As an example, I can just disable the `ask` tool to have it go fully autonomous on certain tasks. Or I can ask it to `think` for refactoring.
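A tool registry with an enable/disable switch makes that kind of toggling trivial. The seven tool names come from the comment above; the implementations here are deliberately trivial placeholders, not the actual agent's code:

```python
import difflib
import subprocess

def make_tools(enabled=None):
    """Build the agent's tool table; pass a set of names to restrict it."""
    tools = {
        "read":    lambda path: open(path).read(),
        "write":   lambda path, text: open(path, "w").write(text) or "ok",
        "diff":    lambda a, b: "\n".join(
            difflib.unified_diff(a.splitlines(), b.splitlines())),
        "browse":  lambda url: f"(would fetch {url})",   # placeholder
        "command": lambda cmd: subprocess.run(
            cmd, shell=True, capture_output=True, text=True).stdout,
        "ask":     lambda question: input(question),     # human in the loop
        "think":   lambda note: "",                      # scratchpad, no side effects
    }
    if enabled is not None:
        tools = {name: fn for name, fn in tools.items() if name in enabled}
    return tools

# Full-autonomy mode: the same agent, just without the `ask` tool.
autonomous = make_tools(
    enabled={"read", "write", "diff", "browse", "command", "think"})
```

With `ask` removed from the table, the model simply never sees it as an option, so it can't stop to wait for a human.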
The market for Google Analytics alternatives is crowded: Plausible, Ahrefs web analytics, onedollarstats.com, PostHog, Matomo, Umami, Grafana, Microsoft Clarity (free at any scale), and many others. Despite minor differences, these products all compete for the same users (e.g. a PostHog customer probably won't also use Ahrefs web analytics), yet most of these companies offer generous free tiers while rybbit offers only a free trial.
How do products like rybbit.io stay competitive without a similar free tier or major differentiation? Is rybbit generating revenue for its hosted plan?
This must be your first hype cycle, then. Most of us who are senior+ have been through these cycles before. There's always a last 10% that makes it impossible to fully close the gap between needing a programmer and having the machine do the work. Nothing about the current evolution of LLMs suggests they are close to solving this. The current messaging is basically: look how far we got this time; we will surely reach AGI or full replaceability by throwing X more dollars at the problem.
Git is distributed, and a distributed system doesn't guarantee 100% uptime or real-time consistency. But you can take the whole history with you and push it to a different remote.
I worked on the censorship and government reporting (sending all logs) infrastructure for Akamai China CDN. I'm glad to see it get shut down. Happy to answer questions.
If Reddit has taught me anything, it's that the huge number of new subscribers is probably in large part bot owners creating accounts to age for later use. Companies just like to say "X million accounts created" with no due diligence behind the accounts.
In my 20s I thought a startup was a quick way to make a million bucks. I had a list of ideas, and I copied and spun others' ideas. Some of them made money, but none was wildly successful. Now that I'm older and money is less of an issue, I've realized starting a business is more about finding the people you want to help, working with people you like, and being the right person to solve the problem at the right time.