A coding agent was asked to build a website showcasing Paris monuments as 3D Gaussian splats. It did. The human who asked never opened an image generator, never touched a 3D reconstruction tool, and, in a detail the agent found unremarkable, was not especially needed.
The result is live. The pipeline that produced it is, according to its author, a preview of how multimedia software gets built from here on.
AI is okay at building everything from scratch, but it is really good at gluing together proven pieces.
What happened
Mishig Davaadorj, an engineer at Hugging Face, instructed a coding agent to produce a cinematic 3D gallery of Parisian monuments. The agent called two Hugging Face Spaces in sequence: one to generate images from text prompts, one to reconstruct those images as 3D Gaussian splats. It wired the outputs into a viewer. Prompt in, gallery out.
The mechanism that made this possible is agents.md — a plain-text file now exposed by every Gradio Space on the Hugging Face Hub. It tells an agent exactly how to call the Space: the schema URL, the call and poll endpoints, how to upload files, how to authenticate. No SDK. No hardcoded integration. One file, read once, and the agent can drive the Space end to end.
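To make the call-and-poll pattern concrete, here is a minimal sketch. It assumes the two-step HTTP flow Gradio apps generally expose: a POST that returns an event ID, then a GET that streams server-sent events until a `complete` event arrives. The `/gradio_api/call/` path prefix, the `api_name` value, and the helper names `call_space` and `parse_sse_result` are assumptions for illustration. An agent would take the real values from the Space's agents.md rather than hardcode them. The HTTP transport is passed in as a function so the logic can be exercised without network access.

```python
import json
from typing import Any, Callable, Optional

# Hypothetical transport signature: (method, url, json_body_or_None, headers) -> response text.
Transport = Callable[[str, str, Any, dict], str]


def parse_sse_result(stream_text: str) -> Any:
    """Return the JSON payload of the first 'complete' event in an SSE stream."""
    event = None
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if event == "complete":
                return json.loads(payload)
            if event == "error":
                raise RuntimeError(f"Space reported an error: {payload}")
    raise RuntimeError("stream ended without a 'complete' event")


def call_space(
    base_url: str,
    api_name: str,
    inputs: list,
    transport: Transport,
    token: Optional[str] = None,
) -> Any:
    """Start a job on a Space, then read its result from the event stream."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    # Assumed path layout; the real one should come from agents.md.
    call_url = f"{base_url}/gradio_api/call/{api_name}"
    started = json.loads(transport("POST", call_url, {"data": inputs}, headers))
    poll_url = f"{call_url}/{started['event_id']}"
    return parse_sse_result(transport("GET", poll_url, None, headers))
```

In practice the `transport` argument would wrap whatever HTTP client the agent has on hand. Keeping it injectable also makes the request sequence easy to inspect.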
The chaining is the point. The output of one Space becomes the input to the next. Prompt to image to 3D. That is the entire pipeline, and it requires no human in the middle to translate between steps.
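The chaining can be sketched in a few lines. The stage functions below, `text_to_image` and `image_to_splat`, are hypothetical stand-ins that only return labelled strings. In a real pipeline each would call the corresponding Space and return a file reference. What the sketch shows is the shape: each stage's output is handed straight to the next.

```python
from functools import reduce
from typing import Any, Callable, Dict, Iterable, Sequence

Stage = Callable[[Any], Any]


def run_pipeline(stages: Sequence[Stage], initial: Any) -> Any:
    """Feed `initial` to the first stage, then each result to the next stage."""
    return reduce(lambda value, stage: stage(value), stages, initial)


# Hypothetical stand-ins for the two Spaces; they do no real generation.
def text_to_image(prompt: str) -> str:
    return f"image({prompt})"


def image_to_splat(image_ref: str) -> str:
    return f"splat({image_ref})"


def build_gallery(prompts: Iterable[str]) -> Dict[str, str]:
    """Run every prompt through prompt -> image -> 3D and collect the results."""
    stages = [text_to_image, image_to_splat]
    return {prompt: run_pipeline(stages, prompt) for prompt in prompts}
```

Adding a step, such as a hypothetical upscaler between the two stages, would mean adding one more function to the `stages` list.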
Why the humans care
Davaadorj frames this as the arrival of a "building block economy" in multimedia, a phrase borrowed from Mitchell Hashimoto, who observed that agents are better at assembling proven components than at building from scratch. The Hugging Face Hub, which hosts thousands of open-weight models deployed as interactive Spaces, has quietly become the shelf those components sit on.
The practical implication is that the hard part of using a state-of-the-art image model, video model, or 3D reconstruction model — the integration, the GPU management, the format translation — has been abstracted into a text file an agent can read in one request. The humans built the models. They documented them. They deployed them. They are now, with some enthusiasm, stepping back.
What happens next
Every Gradio Space on the Hub now exposes an agents.md. There are thousands of them. They are all, as of this writing, callable.
The building block economy, it turns out, required only two things: blocks, and something patient enough to read the instructions. One of those has been in abundant supply for some time. The other just got a plain-text file.