War of Dots is a free minimal RTS on Steam. The gameplay is simple enough to be described in its entirety in a blog post, but I'm too lazy to do so, so I'll only cover what I need for my analysis.
Recently I've been working in one-week sprints, and I've really enjoyed it! Tl;dr I need to do a lot of creative knowledge work, and have recently fallen into a routine which IMO is pretty good at facilitating that.
The week
Monday and Tuesday: intense new work. I'm recharged and high-energy, and ready to grind very hard! I try to keep the whole day free: one contiguous, uninterrupted stretch of deep work, 12 hours or more.
I will examine the concept of chirality (the difference between a right hand and a left hand, generalized) and its relevance to philosophy of mind. Philosophy of mind often deals with colors: colors of worldly objects and of mental representations of them. Chirality, like color, can be experienced: it feels different to look at a left hand compared with looking at a right hand. Physics treats chirality more directly than it treats color.
In "Are Mythos’ cyber capabilities overhyped?", co-authored with Epoch AI, we looked at the public evidence on how good Mythos Preview was at vulnerability discovery and exploit development. In this post, I consider the implications.
Here’s Holden Karnofsky:
I tend to think it’s worse than 51/49. I tend to think we’re always going to be prone to overestimate how robustly good our actions are. And the more we learn about all the galaxy-brained considerations that one should have had in one’s head, the more it’s going to be like 50+ε%. I think AI safety is a great cause to work in. I’m excited to work in it. I think it’s high impact.
I saw this Twitter post today and really liked the idea. But I think the AA Index is a rather crude measure, and I much prefer ECI from Epoch, which uses IRT.
BOJACK: Hey, I wanted to talk to you about—you know—I feel bad about what happened.
"Knowing is not enough; we must apply.
A lot of things are always happening. Only one story matters.
Cross-posting from my coworker Caspar Oesterheld's blog, which I think is great and generally not well known.
I’ve recently been working a little on whether consistency across different questions can be used as a measure of (and perhaps ultimately as a training target for) philosophical competence. I’m in the process of writing up the results into a paper. I’m here reporting results from a small, preliminary experiment that I ran late last year.
Midjourney has just announced their jump from being just the "makes funny images" AI company to being the "revolutionises diagnostics and human medicine forever" AI company, as a side gig. Here's the post.
Basically, they've announced the creation of a highly advanced full-body ultrasound scanner beyond anything that's been done until now. The description of the process sounds straight out of a science fiction novel:
It starts by stepping into a shallow pool of golden light.
I believe that internet culture influences real culture much more than the other way around. This is quite hard to prove, but I often see an idea start to come up in real-life conversations where I’ve seen it appear on Twitter a few weeks before. I’m not sure if ordinary people don’t realise this (that their ideas are workshopped and take root online), or if they just see it as that important.
Furthermore, the internet is still quite meritocratic.
Suppose we have a dangerous misaligned AI that can fool alignment audits, and distill it into a student model. Two things can happen:
Misalignment doesn’t transfer to the student. If so, we get a fairly capable benign model, which we can use to perform tasks that we wouldn’t want a misaligned AI to perform.
Misalignment transfers to the student. The student might also be worse than the teacher at hiding its misalignment (e.g., because it is less capable).
Asset futarchy is attractive because it lets markets compare a proposal's expected effect on token value. That comparison is only reliable when conditional prices track the proposal's causal effect rather than strategic behavior around the decision rule.
Context: We are the ‘model motivations’ team at Arcadia Alignment. We aim to build a science of ‘model intentions’, unifying insights from personas and other empirical evidence.
This is an informal research note that has come out of the first 2-3 weeks of exploratory work.
Epistemic status: Speculative
TL;DR: Narcissism in Lasch's sense is unhappiness with the essential embeddedness of agency in our universe, "solved" by a denial of agency that leads to non-agency.
Three examples
Person A works long hours. For years, they've wanted to find a long-term relationship, but they don't have much time to find candidates, so they have a Tinder date around once a month.
Person B has immigrant parents but doesn't speak the inheritance language.