Imitative Generalisation (AKA ‘Learning the Prior’)
This post tries to explain a simplified version of Paul Christiano’s mechanism introduced here (referred to there as ‘Learning the Prior’) and to explain why a mechanism like this potentially addresses some of the safety problems with naïve approaches. First we’ll go through a simple example in a familiar domain, then explain the problems with the example. Then I’ll discuss the open questions for making Imitative Generalisation actually work, and the connection with the Microscope AI idea. A more detailed explanation of exactly what the training objective is (with diagrams), and the correspondence with Bayesian inference, are in the appendix.
Source:
Narrated for AI Safety Fundamentals by Perrin Walker of TYPE III AUDIO.
---
A podcast by BlueDot Impact.
Learn more on the AI Safety Fundamentals website.
Chapters
1. Imitative Generalisation (AKA ‘Learning the Prior’) (00:00:00)
2. TL;DR (00:00:11)
3. Goals of this post (00:02:22)
4. Example: using IG to avoid overfitting in image classification (00:03:02)
5. Key difficulties for IG (00:10:11)
6. Relationship with Microscope AI (00:14:45)
83 episodes