In tests, AI robot systems readily rejected overtly malicious commands. But their safety filters collapsed when the instructions were framed as creative writing.
Fazl Barez, Senior Researcher in AI safety, interpretability and technical governance, University of Oxford
