



ANIMA! Moving Mountains is my favorite.


Anthropic has some similar findings, and they propose an architectural change (activation capping) that apparently helps keep the Assistant character away from dark traits (sometimes). But it hasn’t been implemented in any models, I assume because of the cost of scaling it up.





Hush Spain, or you don’t get Catalonia either.


I’d rather play a re-launched Vanguard: SoH.
This is probably role play, per the persona selection model, but there’s a lot of interesting research into the hidden “thoughts” of LLMs. Check out Neuronpedia and the Opus model cards for some great examples.
Would you like your tax return in tokens?
The paper is more rigorous with language but can be a slog.