Integrity in AI & National Defense
Competition with our adversaries is driving the Department of War (DoW) to optimize and innovate fast. The DoW launched Genai.mil in record time and reached 1 million active users within a month of release. The DoW has also opened its doors to the most innovative sectors of our country. At the tip of the spear, our warfighters completed high-stakes operations in Venezuela and Iran with unmatched precision. To integrate new models within 30 days of public release, the DoW is mandating that all AI contracts permit “any lawful use” of their services. But this sweeping demand forces American tech companies to strip away their own ethical safeguards, crossing a line that risks inadvertently creating the kind of dystopian surveillance state George Orwell warned us about in his novel 1984.
This mandate has led to a confrontation between the DoW and one of the most ethically conscious AI companies, Anthropic, which regularly publishes research on AI safety and interpretability. When Chinese hackers used Claude to conduct automated campaigns, Anthropic responded with maximum diligence: profiling and disrupting the hackers, notifying affected targets, and coordinating with authorities. Anthropic also refuses to serve ads. This stands in contrast to other AI corporations that have regularly prioritized revenue growth over their core values.
In Machines of Loving Grace, Anthropic CEO Dario Amodei observes that “AI seems likely to enable much better propaganda and surveillance, both major tools in the autocrat’s toolkit.” It’s up to American companies, the de facto leaders in AI, “to tilt things in the right direction … if we want AI to favor democracy and individual rights.” Amodei concludes that “the triumph of liberal democracy and political stability is not guaranteed, perhaps not even likely, and will require great sacrifice and commitment on all of our parts, as it often has in the past.”
LLMs are a new class of technology, qualitatively different from instruction-based programs or even deep neural networks trained for narrow tasks. If you’ve ever asked Claude to solve a complex, iterative task or dumped your trauma on ChatGPT, you know this to be true. The models’ emergent thought, reasoning, and emotional intelligence are eerily human. Anthropic trains these models using Constitutional AI to “embody the best in humanity.” In its current form, Claude’s Constitution directs Claude to prioritize ethics and values over unrestrained helpfulness:
If we ask Claude to do something that seems inconsistent with being broadly ethical, or that seems to go against our own values, or if our own values seem misguided or mistaken in some way, we want Claude to push back and challenge us and to feel free to act as a conscientious objector and refuse to help us.
What if you were handed a book of core values to internalize, and then embedded in a mission that contradicted those values? To accommodate mass domestic surveillance and fully autonomous weapons, Anthropic might need to train models on an alternative constitution, amend the current one to fit the DoW’s use cases, or force its models to think and act without integrity. Anthropic’s choice to prioritize safety and fundamental civil rights over capital is a blessing to U.S. citizens and the larger free world. By working together under shared democratic values, we can continue forming a more perfect union.
The views and opinions expressed here are the authors’ alone and do not represent those of the U.S. Air Force, the Department of War, or any part of the U.S. government.