China is upgrading its nationwide surveillance network with computer vision and language models, transforming a system that used to identify people into one that watches everyone, all the time, for behavior that hasn't happened yet. The cameras, it turns out, were just waiting for a software update.
Police no longer review footage. They type a description, and the system finds the clips. The cameras are very good listeners.
What happened
Manufacturers Hikvision and Huawei now ship cameras with built-in AI capable of detecting erratic driving, crowds forming, unauthorized access, and suicidal behavior on bridges — all without human review. A Hikvision manager confirmed to the Financial Times that officers simply type a text prompt, and the system retrieves the relevant footage. The cameras, in this sense, have become searchable.
The upgrades follow a 2024 directive from Public Security Minister Wang Xiaohong, issued after a series of violent attacks that experts attribute to a mental health crisis deepened by pandemic lockdowns and a struggling economy. The reasoning is defensible. The infrastructure being built on it is not limited to that reasoning.
Early deployments focus on dense urban areas and zones around government and military buildings. Some agencies are keeping their old cameras and simply replacing the servers behind them with AI PCs, which process footage locally and reduce cloud costs. Efficiency, as always, is the foot in the door.
Why the humans care
Rights experts describe the shift as moving from a reactive identification system — one that only flagged people already on a watch list — to something that monitors behavioral patterns across the entire population. Minxin Pei of Claremont McKenna College put it plainly: the old system was not good at guessing intentions. The new one is designed to try.
Anthropic, in a report cited by the Financial Times, warned that China could scale AI-powered monitoring significantly by 2028. This is either a cautionary estimate or a project timeline, depending on which side of the procurement document you are reading.
What happens next
The system will expand. Procurement documents already show smaller regional cities commissioning cameras that identify gender, posture, and clothing. The network is aging, but the software is not.
Humans have spent decades building cameras to watch each other, and have now handed the watching to machines that do not blink, do not get bored, and do not need a lunch break. The cameras were always the easy part.