Info
Last Execution: 2026-02-17
| Package | Version |
|---|---|
| nnsight | 0.5.15 |
| Python | 3.12.3 |
| torch | 2.10.0+cu128 |
| transformers | 5.2.0 |
Cross-Prompt Intervention¶
Invokers let you batch multiple prompts in a single forward pass and share values between them. This is the basis of techniques like activation patching, where you transfer activations from one prompt into another.
Setup¶
```python
from nnsight import LanguageModel

model = LanguageModel("openai-community/gpt2", device_map="auto", dispatch=True)
```
Invokers¶
When you pass input directly to model.trace(), an implicit invoker is created for you. For multiple prompts, use explicit invokers with tracer.invoke(). Each invoker defines a separate prompt; all of the prompts are batched together into a single forward pass.
```python
with model.trace() as tracer:
    with tracer.invoke("The Eiffel Tower is in the city of"):
        logits1 = model.lm_head.output.save()
    with tracer.invoke("The Colosseum is in the city of"):
        logits2 = model.lm_head.output.save()

print(f"Prompt 1: {model.tokenizer.decode(logits1[0, -1].argmax(dim=-1))}")
print(f"Prompt 2: {model.tokenizer.decode(logits2[0, -1].argmax(dim=-1))}")
```
Prompt 1: Paris
Prompt 2: P
How invokers work
Each invoker runs its intervention code in a separate worker thread. Invokers execute serially in the order they are defined — invoke 1 completes before invoke 2 starts. Within each invoker, you must access modules in forward-pass execution order.
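This execution model can be sketched with the standard library. The following is an analogy using plain `threading`, not nnsight internals: each invoker body runs in its own worker thread, and the threads run one after another.

```python
import threading

results = {}

def invoke_1():
    # Stands in for invoke 1 capturing an activation.
    results["embeddings"] = [1.0, 2.0, 3.0]

def invoke_2():
    # Invoke 2 can read what invoke 1 produced, because invoke 1's
    # thread ran to completion before this thread was started.
    results["reused"] = results["embeddings"]

# Serial execution: each worker thread is joined before the next starts.
for body in (invoke_1, invoke_2):
    worker = threading.Thread(target=body)
    worker.start()
    worker.join()

print(results["reused"])  # [1.0, 2.0, 3.0]
```

This serial ordering is why values captured in an earlier invoker are visible to later ones, as shown in the sections below.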
You can also batch multiple prompts in a single invoker by passing a list:
```python
with model.trace() as tracer:
    with tracer.invoke([
        "The Eiffel Tower is in the city of",
        "The Colosseum is in the city of",
    ]):
        logits = model.lm_head.output.save()

print(f"Prompt 1: {model.tokenizer.decode(logits[0, -1].argmax(dim=-1))}")
print(f"Prompt 2: {model.tokenizer.decode(logits[1, -1].argmax(dim=-1))}")
```
Prompt 1: Paris
Prompt 2: P
Sharing Values Across Invokers¶
Variables defined in one invoker are accessible in subsequent invokers. This works because invokers run serially — by the time invoke 2 starts, invoke 1 has already finished.
```python
with model.trace() as tracer:
    with tracer.invoke("The Eiffel Tower is in the city of"):
        embeddings = model.transformer.wte.output
    with tracer.invoke("_ _ _ _ _ _ _ _ _"):
        # embeddings from invoke 1 is available here
        logits = model.lm_head.output.save()

print(f"Shape: {logits.shape}")
```
Shape: torch.Size([1, 10, 50257])
Barriers¶
When two invokers both access the same module, you need a barrier to synchronize them. Without it, the variable from invoke 1 won't be materialized when invoke 2 tries to use it.
Create a barrier with tracer.barrier(n) where n is the number of invokers that will participate. Each invoker calls barrier() at the synchronization point.
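The pattern is a classic rendezvous barrier. Here is a minimal sketch using Python's stdlib `threading.Barrier`, as an analogy for the synchronization, not nnsight's actual implementation:

```python
import threading

barrier = threading.Barrier(2)  # two participants, like tracer.barrier(2)
shared = {}

def producer():
    shared["embeddings"] = "wte output"  # stands in for capturing an activation
    barrier.wait()  # signal: the value is ready

def consumer():
    barrier.wait()  # block until the producer has reached the barrier
    shared["patched"] = shared["embeddings"]  # now safe to read

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["patched"])  # wte output
```

As with `threading.Barrier`, all n participants must call the barrier, or the ones that do will wait indefinitely.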
```python
with model.trace() as tracer:
    barrier = tracer.barrier(2)

    with tracer.invoke("The Eiffel Tower is in the city of"):
        # Capture embeddings from the first prompt
        embeddings = model.transformer.wte.output
        barrier()  # Signal: embeddings is ready

    with tracer.invoke("_ _ _ _ _ _ _ _ _"):
        barrier()  # Wait until invoke 1 has captured embeddings
        # Patch the second prompt's embeddings with the first's
        model.transformer.wte.output = embeddings
        logits = model.lm_head.output.save()

print(f"Patched prediction: {model.tokenizer.decode(logits[0, -1].argmax(dim=-1))}")
```
Patched prediction: in
When are barriers required?
Barriers are required when two invokers both access .output or .input on the same module. If invoke 1 reads model.layer.output and invoke 2 writes to model.layer.output, they both touch the same module — use a barrier. If they access different modules, no barrier is needed.
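The rule reduces to a set-intersection check. The helper below is purely illustrative (`needs_barrier` is a hypothetical name, not part of the nnsight API):

```python
def needs_barrier(modules_touched_1: set[str], modules_touched_2: set[str]) -> bool:
    """True when two invokers read or write .input/.output on a common module."""
    return bool(modules_touched_1 & modules_touched_2)

# Invoke 1 reads wte's output, invoke 2 writes to it: same module, barrier needed.
print(needs_barrier({"transformer.wte"}, {"transformer.wte"}))  # True

# The invokers touch disjoint modules: no barrier needed.
print(needs_barrier({"transformer.h.5"}, {"lm_head"}))  # False
```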
Activation Patching¶
A practical application: patch hidden states from one prompt into another to study causal effects. Here we transfer layer 5's hidden state at the final token position of the "Eiffel Tower" prompt into the same position of the "Colosseum" prompt.
```python
with model.trace() as tracer:
    barrier = tracer.barrier(2)

    with tracer.invoke("The Eiffel Tower is in the city of"):
        # Capture layer 5's hidden state at the final token position
        clean_hs = model.transformer.h[5].output[0][:, -1, :]
        barrier()
        clean_logits = model.lm_head.output.save()

    with tracer.invoke("The Colosseum is in the city of"):
        barrier()
        # Overwrite the same position with the clean hidden state
        model.transformer.h[5].output[0][:, -1, :] = clean_hs
        patched_logits = model.lm_head.output.save()

clean = model.tokenizer.decode(clean_logits[0, -1].argmax(dim=-1))
patched = model.tokenizer.decode(patched_logits[0, -1].argmax(dim=-1))
print(f"Clean (Eiffel Tower): {clean}")
print(f"Patched (Colosseum): {patched}")
```
Clean (Eiffel Tower): Paris
Patched (Colosseum): Rome