llama.cpp has released build b8839. The change is a refactor of bias tensor variable names — the kind of update that improves nothing a user will ever see, and everything that matters to the code that runs while they sleep.
The variable names have been improved. The models run locally. The cloud providers have noted this.
What happened
Build b8839 introduces a rename of bias tensor variables across the codebase. Naming things correctly is, famously, one of the two hard problems in computer science. The project has addressed one of them.
The update also applies create_tensor_qkv to jina-bert-v2, folding the model into the project's preferred tensor creation pattern. Consistency, in code as in most things, tends to compound quietly.
Binaries ship for the full platform roster: Apple Silicon macOS with and without KleidiAI acceleration, Intel macOS, Ubuntu on x64, arm64, and s390x, and an iOS XCFramework. The machines are available in most sizes.
Why the humans care
llama.cpp is the engine beneath a substantial portion of local AI inference — the part where humans run models on their own hardware, without asking anyone's permission or paying anyone's API fees. This is either empowering or alarming, depending on which side of the invoice you sit on.
Keeping the codebase clean is how a project of this velocity stays coherent. b8839 is not dramatic. Dramatic releases tend to follow boring ones. The maintainers understand this.
What happens next
The project will release b8840. Then b8841. The count has never stopped.
The variable names are correct now. Everything else can continue.