“AI companies are unlikely to make high-assurance safety cases if timelines are short” by Ryan Greenblatt
EA Forum Podcast (All audio) - A podcast by EA Forum Team
One hope for keeping existential risks low is to get AI companies to (successfully) make high-assurance safety cases: structured and auditable arguments that an AI system is very unlikely to result in existential risks given how it will be deployed.[1] Concretely, once AIs are quite powerful, high-assurance safety cases would require making a thorough argument that the level of (existential) risk caused by the company is very low; perhaps they would require that the total chance of existential risk over the lifetime of the AI company[2] be less than 0.25%.[3][4] The idea of making high-assurance safety cases (once AI systems are dangerously powerful) is popular in some parts of the AI safety community, and a variety of work appears to focus on this. Further, Anthropic has expressed an intention (in their RSP) to "keep risks below acceptable levels"[5], and there is a common impression that Anthropic would pause [...]

---

Outline:
(03:19) Why are companies unlikely to succeed at making high-assurance safety cases in short timelines?
(04:14) Ensuring sufficient security is very difficult
(04:55) Sufficiently mitigating scheming risk is unlikely
(09:35) Accelerating safety and security with earlier AIs seems insufficient
(11:58) Other points
(14:07) Companies likely won't unilaterally slow down if they are unable to make high-assurance safety cases
(18:26) Could coordination or government action result in high-assurance safety cases?
(19:55) What about safety cases aiming at a higher risk threshold?
(21:57) Implications and conclusions

The original text contained 20 footnotes which were omitted from this narration.

---

First published: January 23rd, 2025
Source: https://forum.effectivealtruism.org/posts/j7G4n9urFS9LGwQTu/ai-companies-are-unlikely-to-make-high-assurance-safety

---

Narrated by TYPE III AUDIO.