EA - Predictable updating about AI risk by Joe Carlsmith
The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund
Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Predictable updating about AI risk, published by Joe Carlsmith on May 8, 2023 on The Effective Altruism Forum.

(Cross-posted from my website. Podcast version here, or search "Joe Carlsmith Audio" on your podcast app.)

"This present moment used to be the unimaginable future."
Stewart Brand

1. Introduction

Here’s a pattern you may have noticed. A new frontier AI, like GPT-4, gets released. People play with it. It’s better than the previous AIs, and many people are impressed. And as a result, many people who weren’t worried about existential risk from misaligned AI (hereafter: “AI risk”) get much more worried.

Now, if these people didn’t expect AI to get so much better so soon, such a pattern can make sense. And so, too, if they got other unexpected evidence for AI risk – for example, concerned experts signing letters and quitting their jobs.

But if you’re a good Bayesian, and you currently put low probability on existential catastrophe from misaligned AI (hereafter: “AI doom”), you probably shouldn’t be able to predict that this pattern will happen to you in the future. When GPT-5 comes out, for example, it probably shouldn’t be the case that your probability on doom goes up a bunch. Similarly, it probably shouldn’t be the case that if you could see, now, the sorts of AI systems we’ll have in 2030, or 2050, that you’d get a lot more worried about doom than you are now.

But I worry that we’re going to see this pattern anyway. Indeed, I’ve seen it myself. I’m working on fixing the problem. And I think we, as a collective discourse, should try to fix it, too. In particular: I think we’re in a position to predict, now, that AI is going to get a lot better in the coming years. I think we should worry, now, accordingly, without having to see these much-better AIs up close.
If we do this right, then in expectation, when we confront GPT-5 (or GPT-6, or Agent-GPT-8, or Chaos-GPT-10) in the flesh, in all the concreteness and detail and not-a-game-ness of the real world, we’ll be just as scared as we are now.

This essay is about what “doing this right” looks like. In particular: part of what happens, when you meet something in the flesh, is that it “seems more real” at a gut level. So the essay is partly a reflection on the epistemology of guts: of visceral vs. abstract; “up close” vs. “far away.” My views on this have changed over the years: and in particular, I now put less weight on my gut’s (comparatively skeptical) views about doom.

But the essay is also about grokking some basic Bayesianism about future evidence, dispelling a common misconception about it (namely: that directional updates shouldn’t be predictable in general), and pointing at some of the constraints it places on our beliefs over time, especially with respect to stuff we’re currently skeptical or dismissive about. For example, at least in theory: you should never think it >50% that your credence on something will later double; never >10% that it will later 10x, and so forth. So if you’re currently e.g. 1% or less on AI doom, you should think it’s less than 50% likely that you’ll ever be at 2%; less than 10% likely that you’ll ever be at 10%, and so on. And if your credence is very small, or if you’re acting dismissive, you should be very confident you’ll never end up worried. Are you?

I also discuss when, exactly, it’s problematic to update in predictable directions. My sense is that generally, you should expect to update in the direction of the truth as the evidence comes in; and thus, that people who think AI doom unlikely should expect to feel less worried as time goes on (such that consistently getting more worried is a red flag).
But in the case of AI risk, I think at least some non-crazy views should actually expect to get more worried over time, even while being fairly non-worried now. In particular, i...
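
The doubling and 10x bounds quoted in the introduction can be checked on a toy case. The sketch below is not from the essay. It assumes a single binary piece of future evidence, and the function names and the likelihoods 0.9 and 0.004 are illustrative assumptions. It computes the exact distribution of the posterior and checks two things: the expected posterior equals the prior, and the chance of landing at k times the prior or higher is at most 1/k. For a single later time, that second fact is what Markov's inequality gives once the first holds.

```python
def posterior_distribution(prior, p_signal_if_true, p_signal_if_false):
    """Exact distribution of the posterior after one binary signal.

    Returns a list of (probability_of_outcome, posterior_given_outcome).
    All inputs are hypothetical numbers chosen for illustration.
    """
    outcomes = []
    likelihood_pairs = (
        (p_signal_if_true, p_signal_if_false),          # signal is observed
        (1 - p_signal_if_true, 1 - p_signal_if_false),  # signal is not observed
    )
    for like_true, like_false in likelihood_pairs:
        # Total probability of this outcome under the current credence.
        p_outcome = prior * like_true + (1 - prior) * like_false
        if p_outcome > 0:
            # Bayes' rule for the credence after seeing this outcome.
            outcomes.append((p_outcome, prior * like_true / p_outcome))
    return outcomes


def expected_posterior(outcomes):
    """Probability-weighted average of the possible posteriors."""
    return sum(p * post for p, post in outcomes)


def prob_credence_at_least(outcomes, threshold):
    """Chance that the posterior ends up at or above the threshold."""
    return sum(p for p, post in outcomes if post >= threshold)


if __name__ == "__main__":
    prior = 0.01  # e.g. 1% on the hypothesis today
    outcomes = posterior_distribution(prior, 0.9, 0.004)
    print("expected posterior:", expected_posterior(outcomes))
    for k in (2, 10):
        chance = prob_credence_at_least(outcomes, k * prior)
        print(f"chance of reaching {k}x the prior: {chance:.4f} (bound {1 / k})")
```

Even with evidence chosen to be strongly diagnostic, the large jump is correspondingly rare, so the average posterior stays at the prior. The essay's stronger "ever" version concerns the whole future path of credences rather than one later time, which this one-step sketch does not cover.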
