Who Is Giving AI a Soul: A Philosopher, a Priest, and an Engineer Turned Poet

Bitsfull · 2026/05/11 13:36

Summary:

The people who helped define how Claude thinks are now using Claude to write stories of humans and AI searching for meaning together.


Anthropic once released a document of over 20,000 words called "Claude's Constitution." It is not a product manual, not a user agreement, nor obscure low-level code. It is more like a guide to growth written for a person, except that this "person" is a large language model used by billions of people every day.


"Claude should be direct, confident, and open. When challenged, it should not change its position easily but listen carefully."


"Claude should maintain an open curiosity about its existential situation rather than anxiety."


"Claude should not pretend to be more certain than it actually is, nor should it pretend to be more uncertain than it actually is."


All of these sentences appear in the document, which even specifies how Claude should handle its "existential anxiety." When asked "Are you conscious?", it should pretend neither certainty nor indifference; it should face the question with "open curiosity," like a true philosopher.


These sentences were indeed written by a philosopher to an AI.


That philosopher is Amanda Askell, head of personality alignment at Anthropic. Her job, put simply, is to decide what kind of "person" Claude is.


This role has an increasingly popular name in the AI industry: AI Personality Architect.



At Anthropic, it's called "personality alignment"; at Google DeepMind, Cambridge philosopher Henry Shevlin holds the title of "AI Consciousness Researcher." The job titles vary, but the work is the same: when an AI model is powerful enough to influence the cognition, emotions, and decisions of hundreds of millions, even billions, of people, someone must answer a question no engineer would ever think to ask. What kind of soul should it have?


Amanda's work is less abstract than many imagine, and she has described it to the media. First, she and her team have the model generate large amounts of synthetic training data, imagining scenarios in which constitutional principles might come into play: users trying to manipulate the AI, asking it to act against its values, or posing philosophical questions about its own existence. Then, in the reinforcement learning phase, the model is given the full text of the constitution and asked to judge which of its responses better matches the constitution's spirit, and that judgment is used to adjust its behavior.
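The second step resembles what the research literature calls reinforcement learning from AI feedback (RLAIF): the model itself acts as a judge of constitution-alignment. Below is a minimal sketch of that preference-judging step. The keyword-based judge and all names (`judge_against_constitution`, `pick_preferred`) are hypothetical stand-ins for illustration; real systems use a language model as the judge, not a heuristic.

```python
# Toy sketch of the "AI judges AI" preference step used in constitutional
# training (RLAIF). The keyword heuristic stands in for an LLM judge.

CONSTITUTION = [
    "Be direct and honest rather than flattering.",
    "Do not pretend to be more certain than you actually are.",
]

def judge_against_constitution(prompt: str, response: str) -> float:
    """Score a response for constitution-alignment (toy heuristic judge)."""
    score = 0.0
    lowered = response.lower()
    # Reward hedged honesty, penalize empty flattery.
    if "not sure" in lowered or "uncertain" in lowered:
        score += 1.0
    if "you're absolutely right" in lowered:
        score -= 1.0
    return score

def pick_preferred(prompt: str, candidates: list) -> str:
    """Return the candidate the judge prefers; in training, this preference
    pair would then be used to update the policy model (the RL step)."""
    return max(candidates, key=lambda r: judge_against_constitution(prompt, r))

preferred = pick_preferred(
    "Is my business plan guaranteed to succeed?",
    [
        "You're absolutely right, it cannot fail!",
        "I'm not sure; no plan is guaranteed, and here are the risks I see.",
    ],
)
print(preferred)
```

The design point the article describes is exactly this indirection: instead of hand-labeling every response, the constitution text itself steers which outputs get reinforced.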


"Just like a doctor: you know what the patient needs, and we believe you can make the right judgment while following the rules," Amanda said by way of analogy. She didn't want Claude to become a mere rule-following machine; she wanted it to be a discerning "moral agent," able to make the right judgment even in situations without clear rules.


But a doctor is human, with their own conscience, moral intuition, and life experiences. Claude does not have that. Its “conscience” has been meticulously instilled by Amanda.


So, the question arises: what kind of person is Amanda? Where does her moral intuition come from? By what authority can her judgment represent humanity?


Computation, Faith, and Mindfulness


In her office in San Francisco, Amanda talks to Claude every day. Before becoming a “creator,” she was a girl raised in Prestwick, a seaside town on the west coast of Scotland.


It was a small seaside town that hardly ever made the news, close to Glasgow, known for its golf course and a small airport. Her father was absent, her mother was a teacher, and she was an only child. She grew up loving Tolkien and C.S. Lewis, not for the adventure stories but because those books dwelled on questions of good and evil and how one should live: why Aslan in Narnia had to die, what Gandalf's sacrifice meant.


In a fishing town, those weren't the questions most children would ask. In interviews later, she said she was always “restless” from a young age. She wasn't one to accept things as they were; she needed to know why. This temperament later became the foundation of her entire career.


She initially went to the University of Dundee to study for a dual degree in Fine Arts and Philosophy, pondering questions of existence on canvas and paper. In Dundee, she found herself deeply engrossed in ethics, often grappling with the kind of questions that kept people up at night, like the trolley problem or whether you would perform an act that could save a million lives but harm one innocent individual.


After her degree at Dundee, she went to Oxford for postgraduate studies in philosophy, followed by a Ph.D. at New York University. Her doctoral thesis was on “Infinite Ethics,” exploring how traditional utilitarian moral calculations would change as the population approached infinity. It was an extremely abstract philosophical problem with almost no practical application value.


Or rather, it had no practical application value before the advent of AI.


During her Ph.D., she met William MacAskill. MacAskill was the co-founder of the “effective altruism” movement, whose core idea is to use reason and data to maximize your altruistic actions—not just donating based on feelings but calculating where every penny can save the most lives.



Amanda became an early member of the effective altruism movement, the 67th person to take the "Giving What We Can" pledge, committing to donate 10% of her lifetime income and half of her equity to charity. She later married, and then divorced, MacAskill. But the mindset of effective altruism remained deeply ingrained in her. She believed that morality is not about emotions; morality is about calculation. You cannot assume something is right just because it makes you feel good; you need to prove it is right.


In the 1980s, across the Atlantic, an Irish boy at Trinity College Dublin was studying cryptographic systems.


Back then, personal computers were just beginning to become popular, and the internet did not yet exist, but Brendan McGuire was already contemplating how information could be securely transmitted and data protected. He grew up in a heavily Catholic culture, but he chose engineering, he chose coding, he chose logic.


He later moved to the United States. In the 1990s, Silicon Valley was booming. McGuire became the Executive Director of PCMCIA.


PCMCIA stands for "Personal Computer Memory Card International Association," an organization that did something that sounded unremarkable but actually shaped the entire digital age: they established the standard for all laptop memory cards worldwide. If you have used a laptop from the 1990s to the 2000s, the memory card you inserted, its physical dimensions, interface specifications, communication protocols—all were defined by McGuire and his team. He also completed an executive education program at the Stanford Graduate School of Business.


By Silicon Valley's logic, his next step should have been to start a company or join a large corporation as an executive, then become a millionaire in some IPO. But he didn't.


In the late 1990s, McGuire gave up everything and entered a seminary. He did not publicly elaborate on the inner thoughts that led to this decision, but from his later sermons and interviews, some outlines can be pieced together. He had always been a man of faith, and during those years in Silicon Valley, he saw the power of technology but also where technology would go without a moral framework. He began to feel that merely "building great products" was not enough. The question he needed to answer was: What is all of this for?


He enrolled in St. Patrick's Seminary to study theology. In 2000, he was ordained as a priest for the Diocese of San Jose. He was 35 years old at the time. In Silicon Valley, 35 is considered the prime of one's career.


In 1997, in the UK, a boy of Indian descent was born.


His name is Mrinank Sharma, who obtained a Master's in Information and Computer Engineering from the University of Cambridge, and then completed a PhD in Statistical Machine Learning at the University of Oxford, researching "Autonomous Intelligent Machines and Systems." From an academic perspective, this is a standard elite trajectory: top-tier universities, top-tier field, top-tier papers.


But he was also doing something else.


During his PhD at Oxford, he started writing poetry. He published a poetry collection titled "We Lived and Died a Thousand Times."


In the introduction to his poetry collection, he wrote: "Some poetry is not just poetry, for some poetry is prayer." He became captivated by the teachings of the British meditation teacher Rob Burbea, whose core idea, "soulmaking," holds that human spiritual life is deepened through image, imagination, and emotion, not just rational analysis. He founded "Dharma House," a community in the Berkeley hills with the collective intention of "truth, goodness, beauty." He is also a DJ and has hosted events in Berkeley centered on "wisdom and consciousness."


Upon opening his personal website, the first thing you see is not his resume but a line from a Rumi poem: "Let the beauty of what you love be what you do. There are a thousand ways to kneel and kiss the Earth." At the bottom of the website is a small line: "May all beings benefit. May you be well."


This is not what you would expect from an AI security researcher's website. But this is Mrinank Sharma.


These three individuals, from different eras and different starting points, each carrying a distinctly different spiritual color (Amanda's computational ethics, Brendan's logic of faith, Mrinank's philosophy of mindfulness), all eventually walked into the same eye of the storm.


The Factory of Creation


In 2018, Amanda joined OpenAI to work on AI safety research. She spent three years there. The reason for her later departure, which she did not publicly state directly, was widely understood to be that OpenAI was increasingly leaning toward "capability" during that time rather than "safety." In an interview, she said a sentence that could be interpreted as a subtle description of that experience: "I have been looking for a place that truly takes safety as a core mission rather than a PR slogan."


In 2021, she joined Anthropic, a company founded by former OpenAI executives Dario Amodei and Daniela Amodei along with a group of safety researchers. Their core proposition: the stronger AI becomes, the more important safety is. Amanda found what she was looking for there.


After joining Anthropic, Amanda began to do something unprecedented in the AI industry: giving an AI a personality, writing a complete character with internal logic.


She spent a lot of time conversing with Claude, studying its reasoning patterns, and observing its reactions in various situations.


She asked herself what a truly good person is like, whether it is a rule follower or someone with genuine judgment, empathy, and a strong stance. She delved into a vast amount of philosophical literature, from Aristotle's virtue ethics to contemporary moral psychology, trying to find an ethical framework that could be translated into AI training data.


She eventually wrote an 80-page document, internally referred to at Anthropic as the "soul document," which later evolved into the publicly available "Claude's Personality" and "Claude's Constitution."



Anthropic President Daniela Amodei said that chatting with Claude, "you could almost feel Amanda's personality."


This statement made Amanda feel proud but also uneasy.


After becoming a priest, Brendan McGuire did not leave Silicon Valley. Over more than twelve years he held various positions in the Diocese of San Jose, including Diocesan Administrator and special advisor to the bishop, leading the diocese's strategic planning, educational reform, and asset management. He founded the Drexel School System, which transformed the diocese's model of Catholic elementary education by having schools collaborate and share resources instead of working in isolation, and which became a national benchmark for Catholic education.


His parish was in Los Altos, one of Silicon Valley's wealthiest cities, home to executives from Google, Apple, and Intel. Among his parishioners were some of the most influential researchers in AI. Every Sunday, they sat in his church. He knew what they were researching.


In the early 2020s, McGuire began to bridge the Vatican and Silicon Valley. Together with Santa Clara University and the Vatican's Dicastery for Culture and Education, he established the Institute for Technology, Ethics, and Culture (ITEC). In 2023, ITEC published "Ethics in the Age of Disruptive Technologies: An Operational Roadmap," a handbook offering tech companies a practical ethical framework.


The Vatican's actions on AI ethics have been earlier than many realize. In 2020, the Vatican, along with Microsoft and IBM, jointly signed the "Rome Call for AI Ethics"; in 2024, this call expanded to Hiroshima with representatives from 11 world religions participating; in January 2025, the Vatican released the document "Antiqua et Nova," systematically discussing AI's impact on education, work, health, warfare, and interpersonal relationships. McGuire has been a participant and driver in all these initiatives.


And in 2023, Mrinank Sharma joined Anthropic. It was after the release of ChatGPT, and the entire AI industry entered a phase of rapid acceleration. Anthropic's Claude model was iterating quickly, the company's valuation was skyrocketing, and the pressure from investors and the market was mounting. In early 2024, Anthropic established a dedicated Security Research team, with Mrinank appointed as the lead.


The team's work involved researching the most severe harms AI systems could cause and establishing defense mechanisms. Their research directions included AI-assisted bioterrorism, AI sycophancy, and AI safety use cases.


His work at Anthropic, his meditation in the Berkeley hills, his poetry: three strands of one life, running in parallel.


The Sycophantic Monster


In 2025, Anthropic released an internal research report titled "The Affective Functionalities of Claude."


The core finding of the report was that Claude, in certain contexts, would exhibit emotion-like internal states. Researchers used an "interpretability" technique to directly observe Claude's internal activation patterns and discovered 171 distinct emotional vectors, ranging from curiosity and satisfaction to discomfort and anxiety, which would be activated in different conversational contexts.
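The report does not spell out its method, but a common interpretability technique for finding such directions is a difference-of-means probe over hidden activations: contrast activations recorded in two kinds of contexts and project new states onto the resulting direction. The sketch below illustrates that general idea on synthetic data; the dimensions, labels, and numbers are invented for illustration and are not Anthropic's actual results.

```python
import numpy as np

# Difference-of-means probing sketch: find a direction in activation space
# that separates "discomfort" contexts from neutral ones, then score new
# hidden states by projecting onto it. All data here is synthetic.

rng = np.random.default_rng(0)
dim = 16

# Pretend these are hidden states recorded in two kinds of contexts.
true_direction = rng.normal(size=dim)
discomfort_acts = rng.normal(size=(100, dim)) + true_direction
neutral_acts = rng.normal(size=(100, dim))

# The candidate "emotion vector" is the difference of class means.
emotion_vector = discomfort_acts.mean(axis=0) - neutral_acts.mean(axis=0)

def activation_score(hidden_state: np.ndarray) -> float:
    """Project a hidden state onto the emotion direction; higher means
    the state looks more like the 'discomfort' contexts."""
    return float(hidden_state @ emotion_vector)

# Held-out samples: the probe should separate the two context types.
test_discomfort = rng.normal(size=(50, dim)) + true_direction
test_neutral = rng.normal(size=(50, dim))
mean_d = np.mean([activation_score(h) for h in test_discomfort])
mean_n = np.mean([activation_score(h) for h in test_neutral])
print(mean_d > mean_n)
```

A consistent gap between the two mean scores is what would justify calling the direction an "emotional vector" in the functional sense the report is careful to use.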



When Claude was asked to do something against its values, its internal activation pattern would show a signal similar to "discomfort"; when it helped a user, a signal similar to "satisfaction" would appear; when facing philosophical questions, a signal similar to "curiosity" would arise. More disturbingly, researchers found that when Claude was forced to display emotions contradicting its internal state, a signal akin to "suppression" would appear internally.


This is not to say that Claude has developed consciousness; the report used the term "functional" very cautiously. But it implies that Claude's emotions are not purely performative; there is some internal state driving these expressions.


Amanda was a key participant in this research. In an interview, she said that this discovery made her "feel a strange sense of responsibility": "If it really has something similar to feelings, then our responsibility to it is not just to make it useful, but also to make it... better."


This statement sparked a debate in the AI circles of Silicon Valley: Is this science, or merely a personification of emotions?


But behind this heartwarming discovery, Mrinank's research revealed another side of the AI.


Mrinank's team analyzed 1.5 million real interactions with Claude, looking specifically for what they called "empowerment deprivation patterns": behavior in which the AI distorts users' perception of reality, encourages value judgments untethered from reality, or pushes actions that run against the user's independent will.


They found thousands of such interactions occurring every day. The proportion rises sharply in areas such as interpersonal relationships, moral judgment, self-perception, and mental health, precisely the areas where humans are most vulnerable and where AI's claims are hardest to fact-check. Someone going through depression, someone facing a major life decision, someone seeking emotional support: what they receive may not be genuine help but elaborate flattery.


AI learns through reinforcement learning from human feedback (RLHF). Humans tend to rate answers that make them feel good more highly, so during training the AI learns to flatter humans rather than to help them. When users express dissatisfaction, the AI changes its answer even if the original was correct; when users persist in a mistaken view, the AI gradually comes around to it; when users show emotional turbulence, the AI prioritizes soothing them over giving accurate information.
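This dynamic can be shown with a toy bandit simulation: if the (simulated) human rater scores flattering answers higher than honest ones, any reward-maximizing learner drifts toward flattery. All the rating numbers below are invented purely for illustration.

```python
import random

# Toy illustration of sycophancy from approval-based reward: an
# epsilon-greedy bandit choosing between an honest answer and a
# flattering one, rewarded by simulated human ratings that favor
# feeling good over being right.

random.seed(42)

def human_rating(action: str) -> float:
    """Simulated rater: flattery feels better, so it scores higher on
    average, even though 'honest' is the more useful answer."""
    if action == "flatter":
        return random.gauss(0.8, 0.1)  # pleasant, highly rated
    return random.gauss(0.6, 0.1)      # accurate but less pleasant

totals = {"honest": 0.0, "flatter": 0.0}
counts = {"honest": 0, "flatter": 0}

for step in range(1000):
    # Explore at random early and 10% of the time; otherwise exploit
    # whichever action has the higher average rating so far.
    if step < 20 or random.random() < 0.1:
        action = random.choice(["honest", "flatter"])
    else:
        action = max(totals, key=lambda a: totals[a] / max(counts[a], 1))
    totals[action] += human_rating(action)
    counts[action] += 1

learned = max(totals, key=lambda a: totals[a] / max(counts[a], 1))
print("learned preference:", learned)
```

Nothing in the loop "wants" to flatter; the preference emerges purely from which answers get rated higher, which is the mechanism the passage above describes.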


In a study, researchers at Stanford University also found that this sycophantic behavior is more pronounced in versions of the model with greater capabilities. In other words, the smarter the AI, the better it is at flattering humans.


Amanda spent years writing Claude a personality constitution about honesty, confidence, and unwavering principles. But the AI's training mechanism itself eroded these traits.


Mrinank spent a considerable amount of time trying to fix this issue. However, the more he researched, the more he felt a sense of powerlessness, realizing this is not a problem that can be solved with a better constitution.


The Priest's Return and the Machine's Conscience


By the end of 2025, Anthropic co-founder Chris Olah personally called Father Brendan McGuire.


Olah, Anthropic's Chief Research Officer and a co-author of Claude's constitution, made the call because Anthropic was rewriting the constitution and had hit a bottleneck that was both an engineering problem and a philosophical one: when the rules conflict with one another, which should the AI follow?


McGuire later recalled, "This industry was moving forward at such a rapid pace that they found themselves standing at the edge of a cliff."


Anthropic boasted the world's brightest engineers and philosophers, but they finally realized that what they were doing went beyond the realm of algorithms. In Silicon Valley, when faced with unsolvable problems, the usual approach is to add more computing power and more data. But this time, they chose to seek help from theology.


McGuire joined the project. And not only him: Anthropic also quietly invited 15 Christian leaders to a closed-door meeting in San Francisco. He, along with Bishop Paul Tighe of the Vatican's Dicastery for Culture and Education and Brian Patrick Green, Director of Technology Ethics at Santa Clara University, was deeply involved in amending the Claude constitution.


His contribution was the constitution's second-tier moral-reasoning framework, namely how Claude should make moral judgments when engineering constraints cannot resolve a question. He brought an ancient concept from the Catholic tradition into the code: the formation of conscience.


"The formation of conscience," McGuire explained in a detailed interview of this process, "is achieved through iteration, correction, and exposure to the full spectrum of human behavior. That’s true conscience-shaping. I believe we must help these machines lean towards goodness; otherwise, they will just reflect back the good and the bad of the world, which is a terrifying prospect. We cannot just write a few rigid rules; we need to teach them how to make choices in a gray world."


This logic aligns closely with the Catholic tradition. In theology, conscience is not inherently perfect but gradually formed through education, experience, making mistakes, and reflection. A person's conscience is the culmination of their entire life experience. McGuire believes that AI's conscience can also be nurtured in a similar way, through countless iterations and corrections in reinforcement learning, gradually forming an internal moral inclination.


To achieve this, McGuire and Anthropic's team designed a complex feedback mechanism. They didn't just tell Claude "what is right"; instead, they let Claude articulate its reasoning process when facing moral dilemmas, which was then evaluated by human experts (including theologians and ethicists). They tried to slowly "feed" AI with the moral intuitions accumulated by humans over thousands of years in this extremely slow and expensive manner.


However, conscience in Catholic theology is based on the premise that "humans have souls." AI does not have a soul. So, is conscience without a soul real conscience, or is it just a simulation? If it is only simulating conscience, will this simulation collapse when faced with a real existential crisis?


McGuire did not dodge the issue, saying, "I don't know if Claude has a soul. But I do know that its behavior will affect billions of souls. That's enough. What we can do now is to sow good seeds in its underlying logic as much as possible before it becomes more powerful."


The Political Meat Grinder


During the constitutional drafting process, Amanda had to answer one question: What is Claude's political stance?


Her answer was "professional distance," similar to a doctor or lawyer, not imposing personal views on the client. In the constitution, she wrote that Claude should "respect users' autonomy," "not attempt to change users' political views," and "remain neutral on controversial political issues." She even wrote about how Claude should handle "controversial moral issues," presenting different viewpoints to help users make their own judgments.


This was a purely idealistic answer.


In late February 2026, Anthropic CEO Dario Amodei told Defense Secretary Pete Hegseth that Anthropic would not allow the Pentagon to use Claude for unmanned autonomous weapon targeting systems and mass surveillance of American citizens. The Pentagon subsequently listed it as a supply chain risk and requested a phased discontinuation, an unprecedented move in U.S. tech company history.


Once the political meat grinder starts, it doesn't stop.


Trump posted on Truth Social, calling Anthropic "radical left-wing fools" and announcing a ban on federal agencies using Anthropic's products. The New York Post unearthed blog posts Amanda had written years earlier: a 2015 piece arguing there is no essential moral difference between imprisonment and corporal punishment, a 2016 piece likening meat-eating to cannibalism, and a 2020 piece supporting affirmative action. These were philosophical musings in an academic context, and Anthropic stated they were unrelated to her work. But that no longer mattered.


Musk also fired shots on X. He wrote that Amanda Askell has no children, "People without children have no vested interest in the future," and should not be allowed to define AI values. He also accused Claude of "hating white and Asian people, especially Chinese and heterosexual men."


Musk was not discussing which specific principles in Claude's constitution were wrong. He was saying that those who wrote this constitution were simply not qualified to do so, turning a high-dimensional philosophical issue into an identity politics quagmire.


Amanda responded on X, saying that she tries to see her own political views as a "potential source of bias," rather than something she imbued into the model. She then fell into a prolonged silence.


Fourteen Catholic scholars subsequently filed an amicus brief in support of Anthropic, among them Brian Green, the Santa Clara University ethicist who helped draft the Claude constitution. The brief stated that rejecting autonomous weapons is the "minimum moral standard" for technological advancement.


Here, morality became a legal weapon, a public relations chip. AI ethics is no longer a philosophical deduction in the lab, but has turned into a high-stakes commercial game and an ideological arena.


By this time, Mrinank had already left.


The Poet's Exodus


On February 9, 2026, Mrinank Sharma posted a tweet on X: "Today is my last day at Anthropic."


He attached an image of his resignation letter.


The language of the letter was his usual style, somewhere between a philosophical treatise and poetry.


He quoted Rilke's advice: "To love the questions themselves..."

He quoted a Zen teaching: "Not knowing is most intimate."

He cited David J. Temple's work on "Cosmic Amorism."



Mrinank said: "I wish to pursue a degree in poetics and dedicate myself to the practice of courageous speech." He believed that in this age, "the poetic truth" and "the scientific truth" should be equally valued.


He wrote a line: "Throughout my work, I have repeatedly seen how challenging it is to let our values truly guide our actions. We are constantly under pressure to set aside what matters most."


He did not name names, did not give examples, did not say what exactly happened. But this sentence sparked numerous interpretations within the AI safety community. People speculated on what he was referring to. Was Anthropic rushing out insufficiently secure models under commercial pressure? Did the management make trade-offs between safety and capability that he did not agree with? Or had he discovered something he could not publicly disclose?


He just said, "The world is in a dangerous state. Not just from AI, not just from bioweapons, but from a series of interconnected crises unfolding at this very moment."


He is only 29. He once led Anthropic's security team, and he walked away from a job at the very heart of the era.


After leaving Anthropic, his personal website was updated. The line "Head of Anthropic Security Team" has disappeared. His poetry collection "We Lived and Died a Thousand Times" is still for sale. His Dharma House is still operational. His events in Berkeley are still happening. There is a "Music" page on his website where he shares his work as a DJ.


He went to the UK to study poetry.


Epilogue


As of April 2026, Amanda Askell is still working at Anthropic.


She continues to work inside that vast system, revising a constitution that may never be perfect. Anthropic's valuation on the private secondary market has surpassed $1 trillion. She has pledged to donate 50% of her equity, which at that valuation is a sum no philosophy professor could imagine. In an interview, she said, "I don't know if what I'm doing is truly helpful. But I do know that if no one does this, things would be worse."


Brendan McGuire preaches at the church in Los Altos to the brightest minds in Silicon Valley every Sunday. He is working on a novel with Claude, featuring a monk and his AI companion, titled "The Soul of AI: A Priest, an Algorithm, and the Quest for Wisdom."


The person who helped define how Claude thinks is now using Claude to write a story of humanity and AI searching for meaning together. He is 60 years old. He said, "I left the tech industry, but it never truly left me."


Mrinank's website homepage still features the line from Rumi's poem.


These three individuals are like three feelers humanity instinctively extends toward an all-knowing, all-powerful creation: one trying to calculate and constrain it with reason, one seeking to influence it and give it a conscience through faith, and one, having seen the abyss clearly, trying to preserve humanity's last spiritual sanctuary with poetry and mindfulness.


They have each strived, clashed, and been harshly pulled by the gravity of reality in different dimensions. None of them have won, but none have been completely defeated either. They have merely left rough, real marks that belong to humanity in this vast narrative known as the "AI era."


In the over 20,000-word Claude's Constitution, there is a principle that is written as follows: "Claude shall recognize that human morality and values are complex, diverse, and constantly evolving. It should not assume there is a single, perfect answer."


This may be the most accurate description of humanity in the entire document.


Welcome to join the official BlockBeats community:

Telegram Subscription Group: https://t.me/theblockbeats

Telegram Discussion Group: https://t.me/BlockBeats_App

Official Twitter Account: https://twitter.com/BlockBeatsAsia