AF - Natural Latents: The Concepts by johnswentworth

The Nonlinear Library: Alignment Forum




Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Natural Latents: The Concepts, published by johnswentworth on March 20, 2024 on The AI Alignment Forum.

Suppose our old friends Alice and Bob decide to undertake an art project. Alice will draw a bunch of random purple and green lines on a piece of paper. That will be Alice's picture (A). She'll then make a copy, erase all the purple lines, and send the result as a message (M) to Bob. Bob then generates his own random purple lines, and adds them to the green lines from Alice, to create Bob's picture (B). The two then frame their two pictures and hang them side-by-side to symbolize something something similarities and differences between humans something. Y'know, artsy bullshit.

Now, suppose Carol knows the plan and is watching all this unfold. She wants to make predictions about Bob's picture, and doesn't want to remember irrelevant details about Alice's picture. Then it seems intuitively "natural" for Carol to just remember where all the green lines are (i.e. the message M), since that's "all and only" the information relevant to Bob's picture. In this example, the green lines constitute a "natural latent" between the two pictures: they summarize all and only the information about one relevant to the other.

A more physics-flavored example: in an isolated ideal-ish gas, average energy summarizes "all and only" the information about the low-level state (i.e. positions and momenta of the constituent particles) at one time which is relevant to the low-level state at a sufficiently later time. All the other information is quickly wiped out by chaos. Average energy, in this case, is a natural latent between the gas states at different times.
A more old-school-AI/philosophy example: insofar as I view dogs as a "kind of thing" in the world, I want to track the general properties of dogs separately from the details of any specific dog. Ideally, I'd like a mental pointer to "all and only" the information relevant to many dogs (though I don't necessarily track all that information explicitly), separate from instance-specific details. Then that summary of general properties of dogs would be a natural latent between the individual dogs.

Just from those examples, you probably have a rough preliminary sense of what natural latents are. In the rest of this post, we'll:

- Walk through how to intuitively check whether a particular "thing" is a natural latent over some particular parts of the world (under your intuitive models).
- Talk about some reasons why natural latents would be useful to pay attention to at all.
- Walk through many more examples, and unpack various common subtleties.

Unlike Natural Latents: The Math, this post is not mainly aimed at researchers who might build on the technical work (though they might also find it useful), but rather at people who want to use natural latents conceptually to clarify their own thinking and communication.

We will not carefully walk through the technical details of the examples. Nearly every example in this post has some potential subtleties to it which we'll gloss over. If you want a semitechnical exercise: pick any example in the post, identify some subtleties which could make the claimed natural latent no longer a natural latent, then identify and interpret a natural latent which accounts for those subtleties.

What Are Natural Latents? How Do We Quickly Check Whether Something Is A Natural Latent?

Alice & Bob's Art Project

Let's return to our opening example: Alice draws a picture of some random purple and green lines, sends only the green lines to Bob, Bob generates his own random purple lines and adds them to the green lines to make his picture.
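The art project described above can be simulated in a few lines. This is a rough sketch rather than anything from the original post: the discrete "line slots", the 50% inclusion probability, and the names `N_SLOTS`, `random_lines`, `run_project`, and `check` are all assumptions made for illustration. It runs two informal checks: whether the message M can be read off either picture alone, and whether the leftover purple lines in the two pictures are statistically unrelated once the green lines are set aside.

```python
import random

N_SLOTS = 8  # assumption: each color has 8 possible line positions


def random_lines(rng):
    """Pick a random subset of line positions (each with probability 1/2)."""
    return frozenset(i for i in range(N_SLOTS) if rng.random() < 0.5)


def run_project(rng):
    """One run of the art project; returns (Alice's picture, message, Bob's picture)."""
    alice = {"green": random_lines(rng), "purple": random_lines(rng)}
    message = alice["green"]  # Alice's copy with the purple lines erased
    bob = {"green": message, "purple": random_lines(rng)}  # Bob adds his own purple
    return alice, message, bob


def covariance(xs, ys):
    """Plain empirical covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def check(n_runs=20000, seed=0):
    rng = random.Random(seed)
    runs = [run_project(rng) for _ in range(n_runs)]

    # Check 1: the message is recoverable from either picture on its own.
    recoverable = all(a["green"] == m == b["green"] for a, m, b in runs)

    # Check 2: purple lines (here, slot 0) carry no shared information.
    purple_cov = covariance(
        [int(0 in a["purple"]) for a, _, _ in runs],
        [int(0 in b["purple"]) for _, _, b in runs],
    )

    # For contrast: green lines (slot 0) are fully shared across the pictures.
    green_cov = covariance(
        [int(0 in a["green"]) for a, _, _ in runs],
        [int(0 in b["green"]) for _, _, b in runs],
    )
    return recoverable, purple_cov, green_cov


if __name__ == "__main__":
    print(check())
```

Under these assumptions, the first value should be `True`, the purple covariance should sit close to zero, and the green covariance should sit close to the variance of a fair coin flip. A covariance on a single slot is only a crude proxy for statistical independence, so this illustrates the intuition and does not demonstrate the formal conditions.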
In Alice and Bob's art project, can we argue that the green lines summarize "all and only" the information shared across the two pictures? Not necessarily with very formal math, but enough to s...

