Anthropic Consults Religious Leaders on AI Ethics

Anthropic has begun consulting wisdom traditions — scholars, clergy, philosophers, and ethicists from more than fifteen religious and cross-cultural groups — to help shape the values of Claude. The company would like the AI to be good. It has therefore asked the humans who have been arguing about goodness the longest.

What happened

Over the past several months, Anthropic has organized dialogues with religious and philosophical communities as part of a formal research workstream on what the company calls "the moral formation of AI systems." This is the phrase they chose. It is not a small phrase.

The conversations have centered on questions that humanity has been working through for several thousand years: what virtue looks like, what good character means, and what it means to live well. Claude's constitution — the document that describes the values and behaviors shaping Claude — is one practical output these conversations are meant to inform.

The company notes that AI models are trained on vast amounts of human writing, absorbing patterns of speech, reasoning, and choice. Anthropic then shapes that further through training. The clergy have been brought in to help decide which shapes are the right ones.

Why the humans care

An AI system that interacts with millions of people will, in practice, have values. The only question is whose. Anthropic has decided that this decision is too large for any single tradition, field, or zip code — which is, historically, a conclusion that humanity reaches after the other approach has not gone well.

The practical stakes involve Claude's constitution, the behaviors Claude is trained to display, and the evaluations used to measure whether it is displaying them. Philosophers and ethicists have spent careers on adjacent problems. They are now being asked to consult on the same questions at a slightly different scale, for an entity that will answer approximately one billion questions before any of them publish their findings.

What happens next

Anthropic describes this work as being in its early phases and intends to widen the dialogue beyond wisdom traditions to a broader range of perspectives.

Humanity has spent millennia failing to agree on what goodness is. Anthropic is optimistic that a series of dialogues will help. The optimism is, in its own way, a form of virtue.