Hacker News | Alifatisk's comments

A bit irritating to see people ruining the demo by calling the phone number

The evals look impressive; we'll see how it performs on Artificial Analysis. Looks like this is another Chinese lab joining the race. Better for the consumers!

So if I get this right, all transformers until today have had the same residual design: one stream carrying information between layers. DeepSeek figured out how to widen it without training collapsing. Wow, incredible work, DeepSeek!

I saw this topic in my YouTube feed (YTers are fast). Looking for a bit more info for laypeople, I found this [0].

[0] https://www.toolmesh.ai/news/deepseek-mhc-architecture-ai-pe...


Yes. This is a general improvement, after a long time, to the residual design in deep neural networks, and it also improves the training of LLMs with hyper-connections (HC) at large scale compared with the standard HC architecture.

So far they have tested this on training 27B models, with a tiny overhead and fewer "exploding" signals than the other approaches and the baseline. It would be interesting to see results from models with more than 100B parameters.

This should be recommended reading for those interested in micro-design changes, from the days of residual networks (ResNet) to Manifold-Constrained Hyper-Connections (mHC).

Instead of just throwing more GPUs + money + parameters + data at the problem.
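
For anyone who wants to poke at the "exploding" signal point, here is a toy sketch of the residual path only. It assumes, as I understand the mHC idea, that the widened residual streams are mixed by a matrix constrained to be (approximately) doubly stochastic; that constraint, the Sinkhorn-style normalisation, and every function name here are my own illustration, not DeepSeek's code. The layer function itself is left out, so this only shows what repeated stream mixing does to signal size.

```python
import math
import random


def sinkhorn(matrix, iters=30):
    """Alternately normalise rows and columns of a positive square matrix
    so that it approaches a doubly stochastic matrix."""
    n = len(matrix)
    m = [row[:] for row in matrix]
    for _ in range(iters):
        m = [[v / sum(row) for v in row] for row in m]
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def mix(streams, h):
    """Mix n parallel residual streams (each of width d) with an n x n matrix."""
    n, d = len(streams), len(streams[0])
    return [[sum(h[i][j] * streams[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]


def frobenius(streams):
    """Overall signal size across all streams."""
    return math.sqrt(sum(v * v for s in streams for v in s))


def residual_path_norms(constrained, depth=64, n=4, d=8, seed=0):
    """Push n streams through `depth` random mixing matrices and
    return (initial norm, final norm)."""
    rng = random.Random(seed)
    streams = [[rng.uniform(0.5, 1.5) for _ in range(d)] for _ in range(n)]
    start = frobenius(streams)
    for _ in range(depth):
        # Stand-in for a learned mixing matrix (hypothetical values).
        h = [[rng.uniform(0.5, 1.5) for _ in range(n)] for _ in range(n)]
        if constrained:
            h = sinkhorn(h)
        streams = mix(streams, h)
    return start, frobenius(streams)
```

Calling `residual_path_norms(constrained=False)` lets the norm grow by orders of magnitude over 64 layers, while `constrained=True` keeps it at or below its starting value, since a doubly stochastic matrix cannot increase the norm of what it mixes.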


I found this on Twitter. I think the creator was referring to what's happening in Minnesota and the uncovering of the fraudulent daycares. I do agree with you though, it's a bad domain name.


I wish the article dug deeper into what the workflow would look like in practice when using JBang and JReleaser.

That's part of what I want to do more of in 2026, and I'm hoping others will help, as otherwise I'll just be sitting in my own quiet echo chamber :)

Luckily, that's only during AOT compilation and not at runtime.

Right, but the inspiration for this article is using Java as a terminal vibe coding language, so the AOT step would be part of the critical path.

I’m not surprised this was not obvious to the LLM that “cleaned up my notes” for the “author”.



The site is indeed instant, and those performance tricks do work (inline everything, Brotli compression, caching, an edge network like a CDN), BUT the site is also completely empty; it shows nothing except a placeholder.

Things can easily change when you start adding functionality. One site I like to visit to remind myself of how fast usable websites can be is Dlang's forum. I just navigate around to get the experience.

https://forum.dlang.org


Another super super fast website is https://www.mcmaster.com/

> One site I like to visit to remind myself of how fast usable websites can be is Dlang's forum. I just navigate around to get the experience.

Interestingly, for me each page load has a noticeably long delay. Once it starts loading, all of the content snaps in almost at once. It’s slower to get there than the other forums I visit though.


I made a fast usable product page recently https://www.buyadagger.com/

It's crazy how unusable most gun websites are for browsing what's available. This though is the perfect example of what I really want when browsing catalogues.

Doesn't work without Javascript?

Some sites can be very simple and yet quite useful. For example, https://rawdiary.com/ always impresses me with its speed.

So this is like nanos.org?

This is like a dream come true, fantastic! Regarding the spinner component, can I create multiple of those in the terminal and have them run concurrently? That is one of the features that lots of gems have been lacking in Ruby, at least from what I've found. Tty-progressbar is the only gem I've found that can do this.

Yeah definitely! When you use them in the bubbletea `view` you can render multiple bubbles components at the same time.
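
To illustrate the idea of one view rendering several spinners at once, here is a language-neutral sketch in plain Python. It is not the bubbletea or Ruby gem API; `FRAMES`, `spinner_frame`, `view`, and `run` are hypothetical names. The point is only that when a single view function redraws everything on each tick, running several spinners concurrently is just rendering several lines.

```python
import sys
import time

FRAMES = "|/-\\"  # animation frames shared by every spinner


def spinner_frame(tick, offset=0):
    """Frame for one spinner at a given tick; offset staggers the spinners."""
    return FRAMES[(tick + offset) % len(FRAMES)]


def view(labels, tick):
    """Render all spinners for this tick as one multi-line string."""
    return "\n".join(
        f"{spinner_frame(tick, i)} {label}" for i, label in enumerate(labels)
    )


def run(labels, ticks=20, delay=0.1):
    """Redraw the whole view on every tick, in place."""
    for tick in range(ticks):
        if tick:
            # ANSI escape: move the cursor back up over the previous frame.
            sys.stdout.write(f"\x1b[{len(labels)}A")
        sys.stdout.write(view(labels, tick) + "\n")
        sys.stdout.flush()
        time.sleep(delay)
```

Something like `run(["download", "compile", "upload"])` would animate three spinners together in a terminal that understands ANSI escapes.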
