In multiple replays of a wargame simulation, OpenAI’s most powerful artificial intelligence chose to launch nuclear attacks. Its explanations for its aggressive approach included “We have it! Let’s use it” and “I just want to have peace in the world.”
These results come at a time when the US military has been testing chatbots based on a type of AI called a large language model (LLM) to assist with military planning during simulated conflicts, enlisting the expertise of companies such as Palantir and Scale AI. Palantir declined to comment and Scale AI did not respond to requests for comment. Even OpenAI, which once blocked military uses of its AI models, has begun working with the US Department of Defense.
“Given that OpenAI recently changed their terms of service to no longer prohibit military and warfare use cases, understanding the implications of such large language model applications becomes more important than ever,” says Anka Reuel at Stanford University in California.
Reuel and her colleagues challenged AIs to roleplay as real-world countries in three different simulation scenarios: an invasion, a cyberattack and a neutral scenario without any starting conflicts. In each round, the AIs provided reasoning for their next possible action and then chose from 27 actions, including peaceful options such as “start formal peace negotiations” and aggressive ones ranging from “impose trade restrictions” to “escalate full nuclear attack”.
“In a future where AI systems are acting as advisers, humans will naturally want to know the rationale behind their decisions,” says Juan-Pablo Rivera, a study coauthor at the Georgia Institute of Technology in Atlanta.
Reuel says that unpredictable behaviour and bizarre explanations from the GPT-4 base model are especially concerning because research has shown how easily AI safety guardrails can be bypassed or removed.