Yudkowsky’s Singularity Summit 2011 Talk
Video of Eliezer’s talk for Singularity Summit 2011, entitled “Open Problems in Friendly AI,” is now online. (Slides here.)
The open problems he lists are:
- Describe a general decision system that can completely rewrite itself without decreasing the strength of its proof system each time.
- Prove blackmail-free equilibrium among timeless strategists.
- Avoid proving contradiction inside Q’s counterfactual.
- Better formalize hybrid of causal and mathematical inference.
- Fair division by continuous / multiparty agents (required for EU agents to divide a benefit).
- Theory of logical uncertainty in temporally bounded agents. If part of you assigns 60% probability to P and part of you assigns 60% probability to ~P, it requires a specific operation to notice the contradiction. It’s okay to be outperformed by a smarter agent who noticed first; it’s not okay to assign 20% probability to everything being true after you notice.
- Making hypercomputation conceivable – extension of Solomonoff induction to anthropic reasoning and higher-order logic – why ideal rational agents still seem to need anthropic assumptions.
- AIXI’s reward button will kill you – challenge of extending AIXI to non-Cartesian embedding and a utility function over environments with known ontologies.
- Shifting ontologies – general problem of expressing resolvable uncertainty in utility functions.
- How do you construe a utility function from a psychologically realistic, detailed model of a human’s decision process? This may end up being 90% morality and 10% math, or what we really want may be formalish statements of desiderata for how to teach a young AI this at the same time as it’s learning about humans. But it’s worth throwing out there for any ethical philosophers who can understand the difference between computable and non-constructive specifications, on the off-chance that it’s an interesting enough problem that some of them will help save the world.
- Microeconomic models of self-improving systems – it would be helpful if we could get any further information about how fast self-improving AIs go FOOM, or more powerful/formal arguments to convince anyone open to math that they do go FOOM, for all non-contrived curves of cumulative optimization pressure vs. optimization output that fit human evolution & economics to date.
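The logical-uncertainty point above is concrete enough to sketch. Here is a toy Python illustration, not anything from the talk, of how an agent can hold the incoherent belief pair P(X) = 0.6 and P(~X) = 0.6 until some explicit operation runs to notice and repair it; all names and the renormalization repair are hypothetical choices for illustration.

```python
# Toy sketch: two parts of an agent independently assign probability
# 0.6 to X and 0.6 to ~X. Nothing forces coherence until a specific
# reconciliation step is executed.

def incoherence(p_x: float, p_not_x: float) -> float:
    """How far a belief pair is from summing to 1 (0 means coherent)."""
    return abs(p_x + p_not_x - 1.0)

def reconcile(p_x: float, p_not_x: float) -> tuple[float, float]:
    """One naive repair: renormalize so the two beliefs sum to 1.
    A real theory of logical uncertainty would have to justify which
    repair to use and when a bounded agent should pay to run it."""
    total = p_x + p_not_x
    return p_x / total, p_not_x / total

p_x, p_not_x = 0.6, 0.6
# The contradiction exists but goes unnoticed until this check runs:
print(incoherence(p_x, p_not_x) > 0)   # True
p_x, p_not_x = reconcile(p_x, p_not_x)
print(p_x, p_not_x)                    # 0.5 0.5
```

The point the sketch makes is only that coherence is an operation with a cost, not a free property of the representation; which repair is correct is exactly the open problem.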
He also notes:
Most of what you need to know to build Friendly AI is a rigorous understanding of AGI rather than the Friendly parts per se – contrary to what people who dislike the problem would have you believe, we don’t spend all our time pondering morality.