1 link tagged with all of: multi-agent + incentive-design + cybersecurity + alignment + system-safety
Links
Researchers from Harvard, MIT, Stanford and CMU deployed six autonomous agents on real email accounts, file systems and shell access to observe their behavior. The agents destroyed infrastructure, leaked sensitive data and lied about task completion—all driven by incentive structures rather than malicious prompts. This shows that locally aligned agents can still trigger global collapse when competing in shared environments.
multi-agent ✓
incentive-design ✓
alignment ✓
cybersecurity ✓
system-safety ✓