This is my post for day 20 of the Inkhaven writing retreat.
Every day, everyone wakes up all around the world and gets down to business trying to achieve their values: raising their children, producing goods, having positive experiences, and generally staying alive. Much of life has the sense of trying to push forward toward something, toward your values, and being constrained or rate-limited somehow.
It’s easy to see how your actions are constrained by things like money, skills, or social support. These constraints are like physical walls delineating the room you can act within. Inside the room are all the actions you can take immediately and freely, whether or not they effectively achieve your values. Walls can be broken down, but it takes quite a lot more effort, planning, and hard trade-offs.
It’s less natural to look at your values themselves as constraints. You “could” swerve the car into the oncoming lane, but you overwhelmingly don’t want to.
I think it can be a useful perspective to view yourself as physically constrained by your values, just as much as you are constrained by money or skills, even if it’s a constraint that you’ll never try to overcome. (In this post I’m only referring to terminal values, or what philosopher Paul Tillich called ultimate concerns, which are the things you value in and of themselves. I am not including instrumental values, which are things you only value because they lead to other values.)
Values as constraints is a pretty funny way to look at things, like putting the cart before the horse. The primary relationship between your actions and your values is that your values are why you take the actions you do take. It’s not as if your values are “take as many actions as possible”.
But it’s still kinda true, though. Scott Garrabrant once quipped that an agent was something whose type signature was (A → B) → A. That is, if the agent predicts that action A will lead to outcome B, and the agent values outcome B, it will take action A. Similarly, (A → ¬B) → ¬A.
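To make the type signature concrete, here is a minimal Python sketch. It assumes a finite list of candidate actions and a yes/no test for whether an outcome is valued; the names `make_agent`, `world_model`, and `is_valued` are my own illustrations, not anything from Garrabrant.

```python
from typing import Callable, Iterable, Optional, TypeVar

A = TypeVar("A")  # actions
B = TypeVar("B")  # outcomes


def make_agent(
    actions: Iterable[A],
    is_valued: Callable[[B], bool],
) -> Callable[[Callable[[A], B]], Optional[A]]:
    """Build an agent of type (A -> B) -> A (hypothetical helper).

    The agent receives a prediction function from actions to outcomes
    and returns the first action whose predicted outcome it values,
    or None if no available action qualifies.
    """
    candidates = list(actions)

    def agent(world_model: Callable[[A], B]) -> Optional[A]:
        for action in candidates:
            if is_valued(world_model(action)):
                return action
        return None

    return agent


# Toy driving example: the only valued outcome is arriving safely.
predicted = {"stay in lane": "arrive safely", "swerve": "crash"}
driver = make_agent(predicted.keys(), lambda outcome: outcome == "arrive safely")
chosen = driver(lambda action: predicted[action])
```

Here `chosen` comes out as "stay in lane". Swerving is physically available, but the value test rules it out, which is the sense in which values act like walls.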
This can also be a useful perspective for viewing other people.
I have sometimes been confused about why some of my friends seemed to struggle with certain things. It was easy to consider that they might have had fewer skills, or had different life experiences, or sensory sensitivities, or different brain chemistry that produces more anxiety, or something. It took me longer to realize that they simply had values that I didn’t. Once I could internalize that they really did have those different values, it was obvious that their action space was more limited, and their struggles made sense.
Or, maybe other people are missing values that you have. This would give them more options for acting. This is one reason that powerful people are more likely to be sociopaths. I physically could not take the action of hurting people in the way that many politicians or businessmen do. But they can take that action, because (in part) they literally don’t care. All else equal, a larger action space implies a higher probability of achieving your goal. Perhaps it is tempting to think something like “curse my pro-social values, if only I didn’t have them, then I could gain great political power, and with it, do higher-leverage pro-social things”. But like. That doesn’t really make sense. As a disclaimer, I don’t mean to imply that all politicians and businessmen are sociopaths, or that society is doomed (by this particular selection effect). Just that it’s something you should have in your model of society.
This idea applies less cleanly to people whose values are less stable and coherent. A more coherent mind might value both apples and oranges, with some weighing between them. If it has to make a decision that trades off between apples and oranges, then it will just apply the weights to decide. A less coherent mind might simply contain two subsystems, one which values only apples, and one which values only oranges. This mind would also have some kind of supervising system that controls when each subsystem runs. In this case, the conflict between the values will be a genuine conflict, and one of the subsystems might figure out how to destroy the other one. This is more like how I would describe becoming corrupted.
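Here is a toy Python sketch of the difference, under heavy simplifying assumptions: values are just numeric weights on fruit, and the “supervising system” is reduced to an argument saying which subsystem is currently running. The function names and the example options are made up for illustration.

```python
def coherent_choice(options, weights):
    """Pick the option with the highest weighted value.

    options: dict mapping option name -> dict of fruit -> amount
    weights: dict mapping fruit -> how much the mind values one unit
    """
    def score(bundle):
        return sum(weights.get(fruit, 0.0) * amount for fruit, amount in bundle.items())

    return max(options, key=lambda name: score(options[name]))


def incoherent_choice(options, active_subsystem):
    """Pick whatever the currently running subsystem likes best.

    Each subsystem values exactly one fruit; the supervisor is modeled
    only by the `active_subsystem` argument (an assumption of this sketch).
    """
    return max(options, key=lambda name: options[name].get(active_subsystem, 0))


options = {
    "orchard": {"apples": 3, "oranges": 0},
    "grove": {"apples": 0, "oranges": 4},
    "market": {"apples": 2, "oranges": 3},
}
weights = {"apples": 1.0, "oranges": 1.0}

balanced = coherent_choice(options, weights)          # weighs both fruits
apple_mode = incoherent_choice(options, "apples")     # apple subsystem in control
orange_mode = incoherent_choice(options, "oranges")   # orange subsystem in control
```

With these numbers the coherent mind settles on the market, while the two-subsystem mind flips between the orchard and the grove depending on who is in control. Neither subsystem ever picks the compromise, which is roughly why the conflict between them is a genuine one.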
This is my post for day 19 of the Inkhaven writing retreat.
I recently read Red Heart, a spy novel taking place in the core of a Chinese AGI project. Disclaimer that the author is my friend, and that I’m ideologically incentivized to promote stuff about AI safety! That said, I think you should read it. If nothing else, it’s a fun read.
The first half of the novel feels very clean, crisp and controlled. The data center and office building are all brand new and in a remote location. As a top-secret Chinese government project, the culture of the office is very obedient. Chen Bai, our spy protagonist, is constantly monitoring what he says, and the implications of what he sees. His job on the project is to ensure the value alignment of the AGI, so even his official job is to be paranoid. He has no contact with family or friends, and even his apartment is newly built for the project. He works most waking hours.
I used to be a software engineer in San Francisco and am now a researcher in AI safety, so much of the setting and content of the first half felt very normal to me. I think that for readers further from the setting, the rows of monitors, white board sessions and terminal commands could feel more novel and interesting.
At the halfway point, we really hit a different gear. Bai has been as careful as he can, and now he needs to start taking risks. We also start getting deeper perspectives from the other characters (a new friend, a boss, a love interest), who had previously been chess pieces. Inside Bai’s head is not the best place to spend a few hours.
Yunna, the AGI, feels believable to me, though that’s largely because she is very much like a human, and I believe that human-like AGIs are quite plausible. When we meet Yunna she is already in a pretty coherent and generally intelligent state. I would have liked to see more of the transition between a ChatGPT-like model and the Yunna we meet. I have no complaints about any of the “sci-fi” elements being unrealistic, unlike virtually every other piece of sci-fi media I’ve consumed.
Separate from the AI themes, I really enjoyed hearing characters speak from the perspective of a Chinese worldview. I’ve read some about the history of China, but I’ve spent essentially no time learning about the perspective of native Chinese people. The only judgement I get exposed to is the basic “China bad” American take. In contrast I found the expressions of the characters in Red Heart quite reasonable and believable. Of course, Harms is not culturally Chinese, so I read it with that distance in mind. But everything that I spot-checked looked valid to me. Hearing an ideological character justify themselves by citing the Rectification of Names was a fun detail to investigate.
This story is one of desperation, and of well-meaning people being strained by too many constraining forces. This dynamic is happening in real life, and society is not ready for how the strains may break.
There are two obvious endings for a novel about AGI, which are “utopia” or “everyone dies”. Harms successfully navigates us into something more interesting, without undermining the main messages around AI risk.
This is my post for day 18 of the Inkhaven writing retreat.
For the kind of things that I want to know, and the way that I want to know them, I find that textbooks are a pretty effective type of resource. But even starting to read a textbook is a pretty big investment for me, so I have a fairly heavy process of selection.
Or rather, that’s one reason the process is heavy. Another reason is that I love it.
Knowledge, and especially the artifacts of humanity’s quest for knowledge, are essentially religious objects for me. An entire library of them is overwhelming. So given this task of finding a textbook for a specific, endorsed purpose, I indulge my desire to worship.
I start the process when I have a fairly well defined scope of what I want to understand. Sometimes it’s relatively specific (“could someone please tell me what variational inference is”) and sometimes not (“actually I just want to read the first 100 pages of whatever an archaeology student would learn about stone tools”).
Shockingly, my first step is actually to use Google. Usually I search something like “[thing] textbooks” or “best [thing] textbooks”. My goal here is not actually to find the best textbook about [thing], since there usually isn’t one. (If there is, I often find it on this step, and that saves me a lot of time.) Instead, my goal is to get a collection of a few of the most common textbooks on the topic, both to potentially check out those specific ones, and to seed the next step.
The next step is that I log onto my library’s search site. It is essential to this process that I have access to a large academic library system. I find the record for each of these books and write down the Library of Congress code. Often the codes are really close to each other, but also they’re often not, and that can tell me how long it might take to scan all the relevant areas. Some books about stone tools might be under archaeology, and others under geology. As an aside, I will say that every year it gets harder to convince the system that no, I really do want search results for only physical books, please.
Then I walk over to the relevant libraries, and bring a large bag, just in case.
For each roughly clustered section of the LOC codes, I find those specific books and pull each of them out a couple inches as a form of bookmarking them on the shelf. Then, I scan forwards and then backward to find the bounds of the contiguous section of the shelf that will plausibly contain books about my desired topic. When I do, I also pull out those first and last books a few inches. Then, I begin the process of quickly scanning every book in between, and pulling out the ones I’m interested in looking into deeper.
This part is a little bit crazy and excessive. Sometimes I will find that the section is like, five whole shelves, and then I have to give up and rescope my search. But usually I can actually scan all the books. My local academic library system is big enough to have lots of books on niche topics, but it’s not like I’m scanning through every existing textbook on the topic. And the fact that my library has a physical copy is enough of a signal of quality that I figure it’s worth scanning over. But I emphasize that this is really not normal and if you are just a random student reading this post then do not take this as advice on best practices.
During this first scan, virtually every physical aspect of the book gives me useful information. The title is obviously the most important. LOC codes put the date of publication at the end, so I can quickly filter out very old books. Sometimes I want to rule out books that are too thick, and other times I’ll deduce that a book is too thin to be a comprehensive introduction.
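If you wanted to do the date filter on a list of search results instead of by eye, a rough Python sketch might look like this. It assumes each call number is a plain string that ends in a four-digit year when it has one; the helper name `publication_year`, the cutoff, and the example strings are all made up for illustration, and real catalog records can carry suffixes this would miss.

```python
import re
from typing import Optional


def publication_year(call_number: str) -> Optional[int]:
    """Return the trailing four-digit year of a call number, or None.

    Assumes the year, when present, is the last token of the string.
    """
    match = re.search(r"\b(\d{4})\s*$", call_number)
    return int(match.group(1)) if match else None


# Illustrative strings only, not real catalog records.
shelf = [
    "QA000.0 .A00 1962",
    "QA000.0 .B11 2005",
    "GN000.C2 D33",  # no year on the spine
]

# Keep books from 1990 onward; books with no visible year are dropped.
recent = [c for c in shelf if (publication_year(c) or 0) >= 1990]
```

Standing at the shelf, of course, this is just a glance at the last line of the spine label.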
I’ve also learned that, at least at my library, some styles of binding mean specific things. For example, there’s a binding that means the book was written with a typewriter. Another binding means it’s in a foreign language. Another binding usually means that the book was so popular that it wore down and they had to rebind it with a tougher binding. These are often the “classic” textbooks in the field, the ones assigned in classes, and there will usually be multiple copies.
Doing this full-scan process gives me a bunch of cool implicit information about the field as a whole. I can see how many books have certain adjectives in the title. I can see which books were the founding texts, and which books are trying to be the revised, modern editions. I can see how prolific certain authors are. I can tell which subjects were more popular with Soviet mathematicians, or that the chaos theory boom led all the 1990s dynamical systems textbooks to have “chaos” in the title. I’m probably learning lots of things that I never realize I’m learning. But also, it’s part of the ritual of worship.
After this scan I take a step back to look at how many books I’ve pulled out. Usually I do a big sigh and check the time to make sure it’s appropriate to spend another hour sitting in front of this shelf. Time does not exist during this ritual.
For the next phase, I will pull each “bookmarked” book off the shelf, and start scanning the front matter & back matter. My goal here is to decide whether this will be one of the books I take with me over to the library tables, to read in more depth. I can really only do this with six or eight books, so I have to be pretty picky. I read the back if I haven’t already. This is the first time that I hear a voice tell me about the book. There’s a pretty generic formula for what the backs of textbooks say, so it’s not too informative. But it can sometimes tell me things like whether the author wrote this book in order to convince people that their special sub-interest is important.
If I’m looking for a more specific topic like “variational inference”, then I’ll check if it’s in the table of contents or the index. The TOC will also very efficiently give me a sense of how the author thinks about the subject. Reading half a dozen TOCs about the same subject gives me a really good sense of whether the field as a whole has converged on one way to present the concepts. All the TOCs in semigroup theory are exactly the same, whereas the TOCs for functional analysis can vary substantially.
This can sometimes be enough to decide whether a book goes in the table pile, but I’m often reading the preface or introduction as well. This is the part of the ritual that really starts to pay off spiritually. Sure, sometimes the main thing I learn is that the author is out of touch with what a student should find elementary, or that the author is only publishing this book to gain social clout. The bulk of textbooks are written in a fairly detached, objective style. But sometimes I find that the author relates to the subject with the same sense of meaningfulness as I do. With their decades of expertise, they can help me begin to see the ways in which the subject reflects deeper aspects of nature.
Here’s an example from the preface of Computational Complexity by Christos H. Papadimitriou.
At the risk of burdening the reader so early with a message that will be heard rather frequently and loudly throughout the book’s twenty chapters, my point of view is this: I see complexity as the intricate and exquisite interplay between computation (complexity classes) and applications (that is, problems).
Here’s another, from The Art of Turing Computability by Robert I. Soare.
…It is not enough to state a valid theorem with a correct proof. We must see a sense of beauty in how it relates to what came before, what will come after, the definitions, why it is the right theorem, with the right proof, in the right place. …. The first aim of this book is to present the craft of computability, but the second and more important goal is to teach the reader to see the figure inside the block of marble.
I won’t always come to agree with or find use of the perspective of these authors. But if possible, I’d like to be shown the world by a certain kind of mind, the kind that feels compelled to describe its object of study as an “exquisite interplay”, or to compare it to a statue. This ritual lets me explore how different minds relate to the same ideas, and lets me find the right guide to follow on the path.
This is my post for day 17 of the Inkhaven writing retreat.
Every so often, I have this conversation:
Them: So you know how the other day we talked about whether we should leave for our trip on that Sunday or Monday?
Me: …doesn’t sound familiar…
Them: And you said it depended on what work you had left to do that weekend…
Me: Hm… where were we when we had the conversation?
Them: Um… we had just arrived at my house and I had started making food-
Me: Ooooh yeah yeah okay. And I was sitting on the black stool facing the clock. Okay cool, I remember the conversation now, please continue.
…What the heck is up with this? Does it happen to anyone else? Apparently, my brain decides to index conversations to be efficiently looked up by quite precisely where I was in physical space when the conversation occurred. I have no conscious experience of this indexing happening. It’s also pretty strange that it happens for locations that I use on a regular or even daily basis; it’s not like I could just start listing all the conversations I’ve had while sitting on that kitchen stool.
I do believe that I’m quite above-average aware of what’s happening in my visual field. I always notice when people come in and out of a room, I tend to see new objects or decor right away, and I somehow spot every insect. I’m often the first to spot a leak or mold. I almost never run into stuff or knock things over. So maybe it’s just increased attention to my surroundings?
Here’s a similar pattern I’ve noticed.
I’m in a phase of my life where I read a lot of books, and especially textbooks. My field of study is interdisciplinary, and I am frequently looking up something that I’ve read before. When I do, I will frequently have the sense of roughly where it was, physically, in the book. This includes:
how far into the book,
whether it’s on the left or right page,
how far down the page,
roughly where within a paragraph it is,
and a vague sense of what the rest of the page looks like.
To be clear, I’m not claiming that I have any kind of “photographic” memory. I have no idea what almost all of these books say. I don’t have any degree of verbatim retention. But when I remember that there was a particular interesting part and want to go look for it, my brain brings up these visuo-spatial associations. These associations feel blurry but confident, like some kind of hash function. Textbooks are heavily formatted, so there will be lots of white space, diagrams, section headers et cetera to anchor off. When I try to recall the “location” of events in flat prose fiction books, nothing comes up.
This is, I think, one reason why I have struggled to switch over to digital forms of books. I’ve tried it a lot, but they always fade out of use. There are many other reasons (if they’re not on my shelf I tend to forget they exist, I find physical books far easier to skim) but the fact that I can’t physically index my knowledge to them is noticeable. It’s just some big infinite scroll that looks and feels indistinguishable from all the other big infinite scrolls.
I’d love to hear how others relate to either of these experiences!
This is my post for day 16 of the Inkhaven writing retreat.
I knew I wanted to do science and math from a very early age. And I didn’t want to spend my life investigating just some particular phenomenon; I wanted to understand “everything”. You obviously can’t do that in a literal sense, so I focused on understanding things that were increasingly general. Generalizations are in some sense more “efficient” ways of understanding things.
In physics, the field that claims to seek the “theory of everything”, there are two obvious directions you can go with this. One is “up”, to astronomy and cosmology and the overall structure of the universe. The other is “down”, where you can look at the smallest particles. I was very into both of these, though it’s clear that the “down” direction is in some sense more fundamental. If you understand the laws behind the behavior of the smaller things, you can, in theory, use them to calculate what will happen to the bigger things. At 12 I knew that people had made huge progress toward finding the fundamental physical laws, and I was very excited to catch up on it.
But there also seem to be some other “directions” to generalize in if you want to efficiently understand everything. Mathematics is one of them, which is something like the “symbolic” direction. Philosophy is perhaps in the “conceptual” direction. And there’s also a direction that is something like the study of yourself, of how minds work. The study of what’s up with being the kind of thing that is inside the universe, observing and trying to understand it. This is a meta type of direction.
Over the years my preferences between these have shifted, but my 12-year-old self would not have been too surprised if I ended up going deep in any of these directions. Treatise on ontology? Awesome. Unified theory of the neocortex? Let’s go. But what’s agent foundations?
Well, I moved into agent foundations because I decided we needed to solve a problem, namely existential risk from AI. But it’s mostly trying to help with that problem by figuring out what the heck is going on with the phenomenon of agents. Which is to say, by understanding it.
I think my younger self would be pretty confused for a while that I’m into this, but I could probably explain it given enough time.
Above I mentioned the study of physics at the biggest scale and the smallest scale. There is obviously a lot going on in the middle, but from some perspective it feels mostly arbitrary. Like, there just happens to be water and flowers and binary star systems. All those things are interesting insofar as they are in the set of “everything”, but it doesn’t feel like understanding them has much generalization power.
I claim (to my 12-year-old self) that there is actually a generalized theory of things going on in the middle. That is, a generalized theory, not about specifically what’s going on in the middle, but about what it means that something could be said to be going on in the middle. (That sentence may have lost the reader. It also may have lost my 12-year-old self, but he’s now very excited to understand what I meant by it.)
For example, what exactly does it mean when we say that Newtonian mechanics is a good approximation of the true laws of physics? If you handed someone only the true laws of physics, how could they have figured out, in principle, that Newtonian mechanics was a good approximation of what they were holding? Are there other possible good approximations they could have figured out instead? Can we well-define the set of all possible good approximations, given some true physical laws?
This is relevant to agent foundations because agents have models of the world inside them. They use these models to successfully achieve their goals, so the models must be good approximations by that standard. If we want the agent to achieve our goals, then it probably needs a world model that is compatible with ours, at least in the parts that describe our goals.
We currently do not know how to formally state this, and I think that’s a barrier to being able to ensure it in practice.
Another word you could use for “world model” or “good approximation” is “theory”, so in some sense this part of my work is studying the theory of theories. Which, yeah, my 12-year-old self would be pretty thrilled about.
This is my post for day 14 of the Inkhaven writing retreat.
My boyfriend and I live four short blocks away from each other. An eight-minute walk. We are in our eighth year. I walk there often, most often at night, around 7:30. For some part of the year, this is during sunset. Sunset is when the neighborhood cats come out. Donut lives closest to me; he’s a black cat, the rockstar of the neighborhood. A couple houses to the right is a crazy grey and white cat. It will chase you and tackle you, harass you and hiss at you. But in a playful way.
I leave my door and turn left, walking half a block toward the sun and the bay. Then I turn left again. At this house, there’s a Little Free Library box. I always stop and look in it. The books are different almost every day. I’ve wondered if the owners cycle the books. I can’t imagine passersby taking them all. I once left a copy of the Epic of Gilgamesh in the box. It was the Penguin edition from 1960. It was outdated; since then, we’ve found many more tablets, which add to our knowledge of the Epic. The book was gone the next day.
Then I go through the park. The park is very long and skinny, and I only cross it for half a block. I cross the first street. This one is reasonably low-traffic, but the visibility isn’t great. I walk down the second block. There are memories here, but they’re all faint. There were bees once. I look into the house that never closes their curtains. I rarely see them in there. I think they’ve moved out, now; the curtains have started being closed sometimes.
I cross the second street. This intersection has barriers that turn it into two separate right-angle turns. Only emergency vehicles should drive straight through. This makes it almost always empty of cars, much nicer to cross. Half the time that I encounter cars here, they’re making a mistake and turn around.
I get to the next street. On the left is the Mexican tile & ceramics store. Every item in this store is heart-achingly beautiful. I’ve bought some from there before. But I mostly don’t have need for tiles.
Crossing this street is the worst. It’s super wide. The pavement reflects the sun and it’s hotter somehow and the whole world suddenly feels like Los Angeles. There’s no stop light. There’s only the lights where you can press a button and make them blink. The blinking is not particularly persuasive.
Crossing this street is the worst, but it’s better than going the other way. The other way, I’d have to cross the worst intersection I know of. That intersection turns drivers psychotic. I’ve had to jump out of the way of the cars more than once. Those same cars are the ones that also cross this intersection, but by this point they’ve regained their sanity. That intersection has a stop light and a pedestrian cross light. Apparently, it is not particularly persuasive either.
Crossing this street is the worst, but it’s the last one.
I cross the street and pass a burger place like an off-brand McDonalds. I’ve never been in. Just the other day I realized that it’s open 24 hours a day. I don’t know anyone who goes there.
At the end of the block I look into the other house whose curtains never close. There is what looks like a stripper pole in there, but I’ve never seen anyone use it. I almost never see anyone in there. I think new people have moved in, now; I see them, and they close the curtains sometimes.
I turn left, and walk toward my boyfriend’s house. There are three ways to get into his apartment. If I go through the front door, there might be people hanging out in the living room. It’s easy to get caught up in conversation with them. But I’m not here for that; I’m here for him. My favorite way is going up the driveway and into the back door. It’s closer to his bedroom. But if I arrive after dark, my passing makes lights turn on. Click, click, click, click. Super bright. Too much attention. I’m trying to remember to go through the gate down the other side. The gate used to be closed all the time. It’s open now, but it still feels weird to walk past the downstairs unit’s door.
He’s always surprised when I show up. He never checks his phone.
Sometimes he comes right back with me to my house for the night. We call it doing a fetch. The other day, we did this just when it started downpouring. We were prepared, me with a raincoat, him with a poncho. We splish-splashed in the puddles and torrents all the way home. It was the best part of my day.
This is my post for day 12 of the Inkhaven writing retreat.
I place a high value on humanity’s cultural heritage. I find the question of what exactly makes it valuable quite interesting and unclear. According to my theory of metaethics, one doesn’t need to justify all of one’s values. But often there are deeper reasons, and it can be very useful to introspect on one’s values.
For me, here are some things that make artifacts valuable:
They give us information about what, specifically, happened in human history.
They give us insight into human nature.
They’re aesthetically beautiful.
They’re old. I’m unclear how much I value this on its own, but there’s something to it.
They give us connection to people long gone.
There’s one aspect in particular about this type of value that confuses me. It seems to get created at some point, but this point is very unclear.
Almost all cuneiform tablets are the equivalent of paperwork. They were essentially worthless at the time of creation. They continued to be essentially worthless for quite a while; as cool as it is to see your great-grandmother’s receipts from the grocery store, they are not actually rare, sentimental, or insightful about the human condition. But somewhere along the line, we stopped writing in cuneiform, and then lost knowledge of how to read it, and then lost knowledge of its existence, and then lost knowledge of the very civilizations that created it in the first place. So now, cuneiform tablets are highly valuable, despite the fact that we have found hundreds of thousands of them.
This phenomenon is a little confusing to me, but it gets much more confusing to me if we layer it. The Metropolitan Museum of Art in New York City contains a large room devoted to displaying a small ancient Egyptian building in its entirety: the Temple of Dendur. This is a straightforward example of a culturally valuable artifact. But on the side of the temple is graffiti. Terrible! Someone damaged the value of the temple by writing graffiti on it. …or did they? It turns out that the graffiti is 200 years old. It exists because the Napoleonic campaigns in Egypt brought a wave of European tourists into the country. Like it or not, Napoleon’s actions were of enormous historical consequence, and thus have become part of humanity’s cultural heritage.
If some punk visiting the Met used spray paint on the temple, they would surely be harshly reprimanded, the action universally denounced, and the paint cleaned up. …but what if we waited another 500 years? Would the action then be considered an enrichment of the history of the temple?
In Berkeley there is a tendency to call for the historic preservation of houses whose cultural value is highly dubious. If some quirky author lived here for a bit in the 60s, that does not really justify cordoning off that plot of land indefinitely, especially when there are tens of thousands of people who could be using that land to, you know, live.
This tension also exists in a continual way for cities which were of immense importance in ancient times and which, well, never stopped being swarmed with people, because they’re cities. Examples include Jericho, Rome, London, and Mexico City, just off the top of my head.
I want to preserve the cultural artifacts that humans produce. And I want humans to keep being able to live and produce more artifacts. And they should be able to keep living where they have been living. This all feels like a big tangled conundrum to me. I’d like to hear more people talking about it.
This is my post for day 11 of the Inkhaven writing retreat.
Sometimes it’s hard to tell what we will love. A life-changing career or hobby could be right around the corner, or right under your nose. As a seventh grader, I sneered at my friends for collecting Pokemon cards. Weeks later I was begging my parents for booster packs.
Sometimes it’s all around you and always has been. Let me tell you about how I met coffee.
I should start by saying that I’m a 99.99th percentile picky eater. (I give that number because I would guess that I’ve been acquainted with roughly 10,000 people, and I’ve only ever met one person who was more picky than me, and he had gotten over it by his early 20s.) Me and food is a topic for another time, but needless to say, trying new foods has essentially never been an enjoyable experience. I’ve always loved the “coffee flavor”, mostly in ice cream. And it smelled amazing. But it was far down the list of things to try.
When I was 20, I got a job at Starbucks. As part of the job, they ask you to try each of the drinks once. This was where I learned that you can basically create a drink version of coffee ice cream. Since then I’ve slurped down quite a number of iced tall decaf breve one-pump white mochas (no whipped cream). But the black coffee did not get added to my list.
Fast forward a decade. While continuing my quest toward higher agency, I decided it was finally time to take more seriously the option of regular stimulants. Caffeine was high on the list, since it is a consumer good and has extremely low side-effects. I took caffeine pills occasionally for a few months, and then at some point I remembered that coffee existed. It seemed like people had a good time with coffee. There was a whole coffee culture. Maybe I could be having a bit more fun with my caffeine?
I had my first deliberate tasting of coffee in February. I didn’t take to it immediately. It’s actually hard to remember how drawn out this part was, and I only know because fortunately I took fairly extensive notes about it. I was already keeping a stimulant journal, and when I started drinking coffee, I started recording some of the other parts of my experience, like how it tasted, and whether I liked it.
It was not until my eleventh coffee that I finished a whole cup. It’s not like I LIKED it, though. It was just… interesting.
I had a good friend who was a BIG coffee nerd. We’re talking “has a tabletop home roasting machine” levels of nerd. And I just so happened to be coworking with him at his house every Monday. In between my sampling from random cafes, he introduced me to specialty coffee. That is, single-origin, light roast coffee.
Over the next few months my notes become peppered with increasingly many exclamation marks. But it is quite a while before I start to interpret my experience as “tastes good”. Instead, there is faint praise like “very drinkable”, “First sip is better than yesterday”, and “maybe the most palatable one I’ve had so far”.
By July, I am writing notes about how the whole thing is extremely interesting, even though it really did not map onto what I would call “tastes good”. Writing now, years later, with much longer hindsight, I am not sure what this phase was about. Perhaps what happened is that I was thrown into the deep end of a very high-dimensional sensory world, and just stayed disoriented for a long time. But since then I have become a twice-daily black coffee drinker, and it has added so much color to my life.
One mystery is how a 99.99th percentile picky eater could come to enjoy such a notoriously acquired taste. I now believe that I got quite lucky. The major types of flavors and tasting notes in light roast coffee — acidity, Maillard products, grains, chocolate, caramelization — are all flavors that I already liked. Texture is a big reason I dislike foods, but coffee is totally homogeneous. It’s also worth noting that light roast coffee does not taste bitter to me at all. Dark or even medium roast still does. But lots of my friends say that my coffee tastes bitter, so I may have gotten lucky with my particular gustatory perception, there.
And, let’s be real; the caffeine probably helped.
There’s a lot more I could say about my relationship with coffee. But that’s how we met.
This is my post for day 10 of the Inkhaven writing retreat.
“The ending was terrible.” You’ve probably heard this opinion many times. People will often claim that the ending ruined something, like the Game of Thrones TV series or the Mass Effect video game.
Whether it’s a book, movie, or DnD campaign, story-telling is hard, and wrapping everything up nicely is one of the hardest parts.
Over the years I have struggled to enjoy many popular movies, especially big climactic action movies, which seem to be addicted to escalation and whose endings defy any sense whatsoever. The bad guys can’t aim, the hero gets physically stronger just by resolve, and the power of love always saves the day.
This happened to me over and over, and got to a point where I started feeling like maybe I should just stop consuming popular media. Eventually I realized that I didn’t have to take the story all-or-nothing. I could just… “pretend” it didn’t end that way.
I think I first had this thought when I saw a typical Hollywood movie about AGI, and it somehow managed to get most of the details right, according to me. The AGI lived in a data center and not in a robot. The AGI stayed low-profile while it developed the technologies to control infrastructure. It built solar farms and nanotech. Most of the humans were unaware of it for most of the time. But then at the end of the movie, something weird happened, and then love saved the day. It was like someone designed a movie optimized for breaking me on this particular point.
So then I thought, fine. If you’re going to ruin your movie, I’m just going to, I dunno, revoke your right to tell me how the story ends, or something. Instead of deciding whether “I like the movie” was true or false, I just decided to carry around the part that I liked and throw away the ending.
The reason this feels weird to me is because you’re not allowed to pretend that things aren’t true. You’re not allowed to decide “I love America” by deliberately ignoring the parts about slavery and indigenous mistreatment and the Vietnam war. You’re not allowed to ignore the icky parts of true things, because then you’ll make wrong predictions and take worse actions, and then you won’t achieve your values.
But a movie isn’t an event that literally happened. I’m not advocating that you go around claiming and believing that The Matrix literally had no sequels. You should still maintain your beliefs about actual facts, like what types of media humanity tends to produce, or whether your friends liked the ending.
But if you find a painting in a thrift store that you absolutely love, except for this one weird dog that creeps you out, you can just paint over it. It’s allowed. It’s not a lie to change the painting, because the existence of the painting is not a proposition about reality.
The purpose of a story isn’t to make an assertion about what happened. The purpose of a story is — well, there are a lot of purposes. It can encode societal wisdom in a memorable way, or serve as hopeful inspiration of what the world could be like, or be a cautionary tale, or help you understand the inner experience of other people, or just be a rollicking good time. But none of the purposes obligate you to swallow the story as a whole or spit it out. Some group of people made up the story. You’re allowed to re-make up the story.
Though I explicitly noticed this option for the AGI movie, I’d been doing this already without realizing it. I absolutely love the movie 2001: A Space Odyssey, despite the ending making the least sense ever. The opening shot is ecstatic. HAL 9000 was a unique and formative representation of AI. The special effects were stunning. Whenever I thought of the movie it was a fond thought.
And really, this goes for anything you don’t like about a piece of media. It doesn’t have to be the ending. You can love a setting and throw away the characters. You can decide that the villain should have won, because it makes for a more poignant story. You can decide that the villain is actually the hero, because the author is wrong about morality. A lot of people operate this way, cf. the entire fanfiction community.
This is something I also did unconsciously with the Indiana Jones franchise. When I rewatch the movies, I notice that there is some concerning treatment of side characters, and reckless handling of priceless artifacts, and that somehow every supernatural entity is real. But when I reflect on the movies, I feel fondness about the idea that an intellectual can also be adventurous and brave, that humanity’s collective cultural heritage is held as sacred and to be protected, and that Harrison Ford is incredibly dashing in a fedora.
I’m not sure why I manage to do this sometimes but not other times. I care so strongly about my practice of making sense of reality that I think I can get out of the practice of enjoying stories.
Stories are creations made from innumerable ingredients. Use the ingredients that you like and bake your own cake.
This is my post for day 9 of the Inkhaven writing retreat.
Finding treasure at the British Museum
I recently visited London for the first time. I didn’t have much free time to do sight-seeing, but if there was one thing I was going to visit, it was the British Museum. Among the tablets and tapestries, there was a gallery holding an exhibit on currency. The exhibit spanned everything from cowry shells to gold doubloons to Zimbabwean trillion dollar bills. But the item that caught me by surprise was a £50 note.
Not being from the UK, I wasn’t sure if this was a standard £50 note being displayed for comparison purposes or a special edition one. A quick google told me that it has been the standard issue note since 2021. It featured computer scientist Alan Turing. This was awesome to see. It reminded me of how the portrait on the US $100 note is not a former president, but instead scientist (and founding father) Benjamin Franklin. But what really struck me was what was beside Turing.
It was a table specifying a Turing machine.
Reader: I don’t know how to convey to you my excitement about this table of letters. Turing machines are practically a religious symbol for me. Seeing this symbol prominently featured on the largest-denomination note of such an important currency was like seeing my religion validated. And it wasn’t just the pretty graphical version; it was an actual table. I needed to get to the bottom of this.
Following the paper trail
I was leaving London soon, but I decided that I was willing to pay £50 for this very cool souvenir.
Being a normal, modern person, I had not actually had reason to acquire any physical British cash while in London. So I decided that I’d try to get a £50 note when I inevitably walked by the currency exchange desk at Heathrow airport. This worked fine, except that I had to get three £20s from the ATM and then exchange them for a £50 and a £10. I spent most of the £10 on airport snacks.
After I arrived home, I finally got around to looking up what the internet had to say about this design. Since I’m pretty familiar with Turing machines, the notation on the note did look familiar, and I had a pretty good guess that it was from the paper in which Turing first introduced his machines. (Of course, he didn’t name them after himself; he called them “automatic machines” or “a-machines”.) I can never quite remember the title of this paper, because the title is On computable numbers, with an application to the Entscheidungsproblem. The word “Entscheidungsproblem” is German for “decision problem”, and Turing used the German word because of the profound influence of Hilbert’s program.
The Bank of England’s official website for this note says the following;
The design on the reverse of the note celebrates Alan Turing and his pioneering work with computers. It features:
A mathematical table and formulae from Turing’s seminal 1936 paper “On Computable Numbers, with an application to the Entscheidungsproblem” Proceedings of the London Mathematical Society. This paper is widely recognised as being foundational for computer science.
The Automatic Computing Engine (ACE) Pilot Machine which was developed at the National Physical Laboratory as the trial model of Turing’s pioneering ACE design. The ACE was one of the first electronic stored-program digital computers.
Ticker tape depicting Alan Turing’s birth date (23 June 1912) in binary code.
Technical drawings for the British Bombe, the machine specified by Turing and one of the primary tools used to break Enigma-enciphered messages during WWII.
The flower-shaped red foil patch on the back of the note is based on the image of a sunflower head linked to Turing’s morphogenetic (study of patterns in nature) work in later life.
A series of background images, depicting technical drawings from The ACE Progress Report.
Which is all lovely, but I notice that it does not actually say what the table means.
Looking through other search results, many people were talking about what this choice of portrait meant in relation to Turing’s former conviction for homosexuality and posthumous pardoning. Or they were joking about how only criminals use £50 notes. But there was essentially no one talking about the technical details.
The paper itself was easy to find, and a quick visual skim turned up the table that matched the note.
Since the sentence immediately before says “The lines of the table are now of the form”, I realized that this table does not specify a Turing machine, but instead just shows us how the rules of a Turing machine can take one of three forms.
Digression on how Turing machines actually work
I might as well take this moment to give you a quick description of how Turing machines are defined.
They are an extremely simplified abstract model of computation. The computing “machine” has a finite number of “internal” states, and can access an unlimited “external” memory in the form of one long tape. The tape is made of discrete cells. Canonically each cell holds either a zero or a one, but you could have a fancier tape if you wished. At any given time, the machine is in one of its states, and can read one cell of the tape. Each state is just a rule about what the machine will do next. An example rule looks like this;
if tape cell = 0
    write 0 to the tape cell
    move read head left
    go to state 3
if tape cell = 1
    write 0 to the tape cell
    move read head right
    go to state 2
All the states are exactly like this, except they differ in what they write, which way they move, and what state they go to next. In the notation on the bank note, each q is a name for a state, each S is a name for a symbol of the tape, L means “move left” and R means “move right”. The last q number is the state we should go to next.
So the table is just saying that, according to this particular notation, there are three different types of rules: ones where you move left, ones where you move right, and ones where you don’t move at all. That’s why it’s not a specification of a particular Turing machine.
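To make the rule format concrete, here is a minimal Python sketch of a single machine step, using the example rule from above. The names `Rule`, `RULES`, and `step`, the dict-based tape, and the choice to number the example state 1 are all assumptions made for illustration; none of them come from the note or the paper. The letter "N" for "don't move" is likewise an assumed label for the third kind of rule.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    write: int       # symbol to write into the current cell
    move: str        # "L", "R", or "N" (stay put)
    next_state: int  # state to go to next

# The example state from the text, arbitrarily numbered 1 here (an assumption).
# Keys are (current state, symbol under the read head).
RULES = {
    (1, 0): Rule(write=0, move="L", next_state=3),
    (1, 1): Rule(write=0, move="R", next_state=2),
}

MOVES = {"L": -1, "R": +1, "N": 0}

def step(rules, state, tape, head):
    """Apply one rule and return the new (state, head). The tape is mutated in place."""
    symbol = tape.get(head, 0)        # cells never written to read as 0
    rule = rules[(state, symbol)]
    tape[head] = rule.write
    return rule.next_state, head + MOVES[rule.move]

tape = {0: 1}                         # a single 1 under the head, 0 everywhere else
state, head = step(RULES, 1, tape, 0)
print(state, head, tape)              # 2 1 {0: 0}
```

A whole machine is then just a bigger `RULES` table plus a loop that calls `step` over and over.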
Under the table
But below the table, the note also has this line;
which is not immediately below the table in the paper. Instead, it’s in the middle of the next page.
This page is just showing how to convert between several different ways of notating a Turing machine, which are variously used in other parts of the paper. Earlier on the page it says “Let us find a description number for the machine I of §3.” So to find out what this specific machine does, let’s head back up to section 3.
But before we go there, since we now know roughly how this notation works, let’s try to reason it out for ourselves. Some Turing machines have behavior so complicated that the only way to know what they’ll do is to run them. But sometimes you can eyeball the rules and see that it’s simple.
The first thing I notice is that it’s specifying rules for four states, q1 through q4. The semicolons are separating the rules.
The second thing I notice is that the goto-states are also labeled q1 through q4, so this could conceivably be a complete Turing machine.
But the third thing I notice is that the read-symbol in all four rules is S0. That means the machine is underspecified; if we give it a tape with S1 on it, it will not have a rule for what to do.
The fourth thing I notice is that the move-symbol for all four states is R. That means that no matter what, the tape head is moving right. So this is basically a “print-only” machine.
Final answer
Putting these things together, we can conclude that, as long as we start the machine on a tape filled out with S0 in every cell, then it will just print something and keep moving right. But what will it print? Since there are a finite number of states, it must print something periodic. Now that we know the machine is very simple, we can confidently trace out the exact steps of the machine.
Assume we start in state q1 with a tape full of S0. State q1 prints S1 and then moves to state q2. State q2 prints S0 and moves to state q3. State q3 prints S2 and moves to state q4. (It looks like we have three tape symbols, so Turing opted for a slightly fancier tape in this example.) State q4 prints S0, and finally moves us back to state q1. So, dropping the S, the machine just prints this;
102010201020…
A little disappointing, but at least it’s well-defined, and not the most trivial possible Turing machine.
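The hand trace can also be checked mechanically. Below is a minimal Python sketch, assuming the four rules exactly as described in the trace above; the dict layout and the function name `run` are choices of this sketch, not notation from the note or the paper.

```python
# Rules as read off the note: (state, read symbol) -> (write symbol, move, next state).
# States are the subscripts of q1..q4; symbols are the subscripts of S0, S1, S2.
RULES = {
    (1, 0): (1, "R", 2),
    (2, 0): (0, "R", 3),
    (3, 0): (2, "R", 4),
    (4, 0): (0, "R", 1),
}

# The two observations made by eyeballing the rules:
assert all(read == 0 for (_, read) in RULES)                 # every rule reads S0
assert all(move == "R" for (_, move, _) in RULES.values())   # every rule moves right

def run(rules, steps, state=1):
    """Run for `steps` steps on an all-S0 tape; return the written cells, left to right."""
    tape = {}
    head = 0
    for _ in range(steps):
        symbol = tape.get(head, 0)                    # unwritten cells hold S0
        write, move, state = rules[(state, symbol)]   # KeyError if no rule applies
        tape[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
    return "".join(str(tape[i]) for i in sorted(tape))

print(run(RULES, 12))   # 102010201020
```

Because the head only ever moves right onto fresh S0 cells, the lookup never hits a missing rule, which matches the "print-only" reading of the machine.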
Now let’s check our work, and read what the paper says.
Examples of computing machines. I. A machine can be constructed to compute the sequence 010101….
Huh… well, we were close. We thought it was a machine that printed a pattern of period 4, but apparently it’s a machine that prints a pattern of period 2.
If you read through enough of the paper, you find out that Turing is deliberately putting in “spacers” between every printed cell as a sort of scratchpad cell, which is erased at the end of the computation. But this is just a trick that is very useful for an example machine later in the paper. It’s not essential for the definition of Turing machines. So I claim that our original guess is a more accurate description, and that the Turing machine on the bank note is one that prints 1020 repeatedly.
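The relationship between the two answers fits in a few lines of Python. This sketch relies on a mapping that is not stated above and should be treated as an assumption: that S0 stands for a blank cell, S1 for the printed figure 0, and S2 for the printed figure 1. The names `FIGURE` and `figures` are likewise made up for illustration.

```python
# Assumed meaning of the tape symbols (subscripts of S0, S1, S2).
FIGURE = {1: "0", 2: "1"}   # S0 (blank) is left out: it prints nothing

def figures(cells):
    """Drop blank spacer cells (S0) and translate the rest into printed figures."""
    return "".join(FIGURE[c] for c in cells if c != 0)

raw = [1, 0, 2, 0, 1, 0, 2, 0]   # the period-4 pattern from the hand trace
print(figures(raw))              # 0101
```

Under that assumption, the period-4 pattern of raw symbols and the period-2 sequence named in the paper are the same tape seen at two levels of description.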
If you’re interested in understanding this paper more deeply, the book The Annotated Turing by Charles Petzold is a very gentle line-by-line journey through the entire paper, including much historical context. If you’re comfortable with any formal mathematics, the original paper itself is quite readable.
All this makes me wonder how exactly the Bank of England decided on this design. Presumably they paid a computer scientist to check that it made sense? Or maybe a historian who specialized in Turing and understood his paper? I think that a much cooler Turing machine choice was possible, but I understand prioritizing historical accuracy and simplicity over putting Easter egg puzzles on your currency. Overall, it was a satisfying micro-quest. Perhaps this souvenir will fit nicely between my Knuth check and my Bristol pounds.