Tutorials

Walkthrough

Learn the basics

Activation Patching

Causal intervention

Attribution Patching

Approximate patching

Logit Lens

Decode activations

Future Lens

Probe future tokens

Function Vectors

Lambdas

Dictionary Learning

Sparse autoencoders