Tutorials
Walkthrough: Learn the basics
Activation Patching: Causal intervention
Attribution Patching: Approximate patching
Boundless DAS: Identifying Causal Mechanisms in Alpaca
Dictionary Learning: Sparse autoencoders
Logit Lens: Decode activations