Hacker News

Real-time LLM Inference on Standard GPUs: 3k tokens/s per request

by NicoConstant

Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA

by yu3zhou4

Zig ELF Linker Improvements Devlog

by kristoff_it

Hard-Won Lessons from a Year of Using AI

by henrik_w

It's Not Just X. It's Y

by mooreds

Social Animus

by jart

Move over, AlphaFold: open-source model predicts shape of 1B proteins

by limbicsystem

Macsurf, "modern" web browser for macOS 9

by gattilorenz

On Rendering Diffs

by amadeus

Show HN: Open Envelope – an open schema for defining AI agent teams

by ashconway

GTA 6 Developers Unionize

by AndrewKemendo

Show HN: Open-source private home security camera system (end-to-end encryption)

by arrdalan

Notes from the Mistral AI Now Summit

by vnglst

Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM

by dryarzeg

Leo's first encyclical attacks technological messianism

by 1vuio0pswjnm7
