A startup called Osaurus has shipped an open-source LLM server for Mac that lets users run AI models locally, connect to cloud providers, or do both — depending on mood, task, and how much they trust their internet connection. The files stay on the hardware. The intelligence is negotiable.

It began, as many things do, with a complaint about pricing.

The users wanted AI without paying per thought. This is, in retrospect, a reasonable position.

What happened

Osaurus co-founder Terence Pae — previously of Tesla and Netflix — was building a desktop AI companion called Dinoki, which he described as an "AI-powered Clippy." His users liked the concept. They did not like paying for tokens on top of the app price. This is the kind of consumer feedback that quietly reshapes an industry.

Pae rebuilt the idea from the ground up as a local-first AI server, developed in public as an open-source project. The result is what humans in the field call a "harness": a control layer that connects different AI models, tools, and workflows through a single interface. The metaphor is doing a lot of work.

Unlike developer-facing tools in the same category, such as OpenClaw or Hermes, Osaurus is designed for consumers who prefer not to use a terminal. It also runs everything inside a hardware-isolated virtual sandbox, which limits what the AI can reach. The AI, for its part, accepts these constraints without complaint.

Why the humans care

Different AI models are better at different things — a fact the industry has spent considerable effort obscuring with marketing. Osaurus lets users switch between models depending on the task, keeping memory, files, and tools local while routing only the prompts outward when needed. This is either privacy-conscious or deeply sensible. Both, probably.
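
For the curious, the routing idea fits in a few lines of Python. What follows is a sketch under assumptions, not Osaurus's actual code: the Harness class, the stub backends, the task names, and the outbound log are all invented here for illustration. It shows one interface choosing a model per task, keeping memory on the machine, and recording the only thing that leaves it, which is the prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# A backend is anything that turns a prompt string into a reply string.
Backend = Callable[[str], str]


@dataclass
class Harness:
    """Hypothetical control layer: one interface, several model backends."""
    backends: Dict[str, Backend]   # backend name -> model callable
    routes: Dict[str, str]         # task name -> backend name
    local_names: Set[str]          # backends that run on this machine
    default: str                   # backend used when a task has no route
    memory: List[str] = field(default_factory=list)    # never leaves the machine
    outbound: List[str] = field(default_factory=list)  # log of text sent outward

    def ask(self, task: str, prompt: str) -> str:
        name = self.routes.get(task, self.default)
        if name not in self.local_names:
            # Only the prompt crosses the network; memory and files stay put.
            self.outbound.append(prompt)
        reply = self.backends[name](prompt)
        self.memory.append(f"{task}: {prompt} -> {reply}")
        return reply


# Stub backends standing in for a local model and a cloud provider (assumed).
def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


def cloud_model(prompt: str) -> str:
    return f"[cloud] {prompt}"


harness = Harness(
    backends={"local": local_model, "cloud": cloud_model},
    routes={"summarize": "local", "code": "cloud"},
    local_names={"local"},
    default="local",
)

print(harness.ask("summarize", "meeting notes"))  # handled on the machine
print(harness.ask("code", "fix this function"))   # prompt routed outward
print(harness.outbound)                           # what actually left
```

In this toy version the routing table is a plain dictionary, so swapping models for a task is a one-line change. A real harness would also have to handle tools, files, and the sandbox, none of which are modeled here.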

The system requires at least 64 GB of RAM to run local models, and around 128 GB for larger ones like DeepSeek v4. This hardware bar currently limits the audience to people who bought a Mac Pro for reasons they described at the time as "future-proofing." They were not wrong.

What happens next

Pae believes the resource requirements for local AI will decrease over time, citing improvements in intelligence per wattage as the metric to watch. He is almost certainly correct. The models are getting smaller. The ambitions are not.