Torrid News

lesswrong 31h ago 11°

War of Dots: CRUSHING my opponents with FACTS and LOGIC

War of Dots is a free minimal RTS on Steam. The gameplay is simple enough to be described in its entirety in a blogpost, but I'm too lazy to do so, so I'll only consider what I need for my analysis.

lesswrong 6h ago 11°

The one-week sprint

Recently I've been working in one-week sprints, and I've really enjoyed it! Tl;dr I need to do a lot of creative knowledge work, and have recently fallen into a routine which IMO is pretty good at facilitating that. The week. Monday and Tuesday: intense new work. I'm recharged and high-energy, and ready to grind very hard! I try to keep the whole day free: one contiguous, uninterrupted stretch of deep work, 12 hours or more.

lesswrong 20h ago 11°

Does it feel any different to be reverse-chiral life?

I will examine the concept of chirality (the difference between a right hand and a left hand, generalized) and its relevance to philosophy of mind. Philosophy of mind often deals with colors: colors of worldly objects and of mental representations of them. Chirality, like color, can be experienced: it feels different to look at a left hand compared with looking at a right hand. Physics treats chirality more directly than it treats color.

lesswrong 37h ago 10°

Vulnerabilities and exploits: where are we headed?

In Are Mythos’ cyber capabilities overhyped?, co-authored with Epoch AI, we looked at the public evidence on how good Mythos Preview was at vulnerability discovery and exploit development. In this post, I consider the implications.

lesswrong 3h ago 9°

A brief list of ways AI safety efforts could be net negative

Here’s Holden Karnofsky: I tend to think it’s worse than 51/49. I tend to think we’re always going to be prone to overestimate how robustly good our actions are. And the more we learn about all the galaxy-brained considerations that one should have had in one’s head, the more it’s going to be like 50+ε%. I think AI safety is a great cause to work in. I’m excited to work in it. I think it’s high impact.

lesswrong 32h ago 9°

How far do open weights trail the frontier?

I saw this Twitter post today and really liked the idea. But I think the AA Index is a rather crude way and much prefer ECI from Epoch, which uses IRT.

lesswrong 26h ago 9°

Contra Pace on When to Apologize

BOJACK: Hey, I wanted to talk to you about—you know—I feel bad about what happened.

lesswrong 40h ago 9°

Agents are under-elicited: A case study in optimization tasks

"Knowing is not enough; we must apply.

lesswrong 29h ago 9°

AI #173: AI Pauses

A lot of things are always happening. Only one story matters.

lesswrong 44h ago 8°

A preliminary experiment regarding consistency as a measure of conceptual abilities in language models

Cross-posting from my coworker Caspar Oesterheld's blog which I think is great and generally not well known. I’ve recently been working a little on whether consistency across different questions can be used as a measure of (and perhaps ultimately as a training target for) philosophical competence. I’m in the process of writing up the results into a paper. I’m here reporting results from a small, preliminary experiment that I ran late last year.

lesswrong 21h ago 7°

Midjourney's Spa, or when sci-fi tries to become mundane

Midjourney has just announced their jump from being just the "makes funny images" AI company to being the "revolutionises diagnostics and human medicine forever" AI company, as a side gig. Here's the post. Basically, they've announced the creation of a highly advanced full-body ultrasound scanner beyond anything that's been done until now. The description of the process sounds straight out of a science fiction novel: It starts by stepping into a shallow pool of golden light.

lesswrong 3h ago 7°

Online >> real life for spreading ideas

I believe that internet culture influences real culture much more than the other way around. This is quite hard to prove, but I often see an idea start to come up in real-life conversations where I’ve seen it appear on Twitter a few weeks before. I’m not sure if ordinary people don’t realise this (that their ideas are workshopped and take root online), or if they just see it as that important. Furthermore, the internet is still quite meritocratic.

lesswrong 22h ago 7°

The distillation double bind: Distilling misaligned models either transfers misalignment or it doesn't

Suppose we have a dangerous misaligned AI that can fool alignment audits, and distill it into a student model. Two things can happen: (1) Misalignment doesn’t transfer to the student. If so, we get a fairly capable benign model, which we can use to perform tasks that we wouldn’t want a misaligned AI to perform. (2) Misalignment transfers to the student. The student might also be worse than the teacher at hiding its misalignment (e.g., because it is less capable).

lesswrong 7h ago 6°

Futarchy is not secure without a proposal gatekeeper

Asset futarchy is attractive because it lets markets compare a proposal's expected effect on token value. That comparison is only reliable when conditional prices track the proposal's causal effect rather than strategic behavior around the decision rule.

lesswrong 27h ago 6°

Your Model Organisms Might Be Fried

Context: We are the ‘model motivations’ team at Arcadia Alignment. We aim to build a science of ‘model intentions’, unifying insights from personas and other empirical evidence. This is an informal research note that has come out of the first 2-3 weeks of exploratory work.

lesswrong 29h ago 6°

Shard narcissism as delusion of unembeddedness

Epistemic status: Speculative. TL;DR: Narcissism in Lasch's sense is unhappiness with the essential embeddedness of agency in our universe, "solved" by a denial of agency that leads to non-agency. Three examples: Person A works long hours. For years, they've wanted to find a long-term relationship, but they don't have much time to find candidates, so they have a Tinder date around once a month. Person B has immigrant parents but doesn't speak the inheritance language.
