Empty Invokers¶
An empty invoker is an invoke() call with no arguments. It creates a new worker thread that operates on the full batch from all previous input invokers, letting you access modules in any order or run code after unbounded iteration.
Setup¶
from nnsight import LanguageModel
model = LanguageModel("openai-community/gpt2", device_map="auto", dispatch=True)
Batch-Wide Operations¶
An empty invoker sees the combined batch from all prior input invokers. This is useful for operations that need the full batch at once.
with model.trace() as tracer:
    with tracer.invoke("The Eiffel Tower is in the city of"):
        pass
    with tracer.invoke("The Colosseum is in the city of"):
        pass

    # Empty invoke — operates on both prompts as a single batch
    with tracer.invoke():
        all_logits = model.lm_head.output.save()

print(f"Combined batch shape: {all_logits.shape}")
print(f"Prompt 1: {model.tokenizer.decode(all_logits[0, -1].argmax(dim=-1))}")
print(f"Prompt 2: {model.tokenizer.decode(all_logits[1, -1].argmax(dim=-1))}")
Combined batch shape: torch.Size([2, 10, 50257])
Prompt 1: Paris
Prompt 2: P
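The combined batch is an ordinary tensor, so each prompt is just a row along the batch dimension. A minimal torch-only sketch of that indexing (random logits stand in for `model.lm_head.output`; shapes chosen to match the output above, no model involved):

```python
import torch

# Stand-ins for the two single-prompt logit tensors (batch, seq, vocab).
prompt1_logits = torch.randn(1, 10, 50257)
prompt2_logits = torch.randn(1, 10, 50257)

# What the empty invoker sees: one tensor whose batch dimension stacks
# the prior invokers' prompts, in invocation order.
all_logits = torch.cat([prompt1_logits, prompt2_logits], dim=0)
print(all_logits.shape)  # torch.Size([2, 10, 50257])

# Row i is prompt i, so all_logits[i, -1] is that prompt's
# final-position logits, as used in the decode calls above.
assert torch.equal(all_logits[0, -1], prompt1_logits[0, -1])
assert torch.equal(all_logits[1, -1], prompt2_logits[0, -1])
```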
Accessing Modules Out of Order¶
Within a single invoker, you must access modules in forward-pass order. Empty invokers give you a fresh thread, letting you access any module regardless of what was accessed before.
with model.trace() as tracer:
    with tracer.invoke("The Eiffel Tower is in the city of"):
        # Must be in order: layer 0 before layer 11
        early_hs = model.transformer.h[0].output[0].save()
        late_hs = model.transformer.h[-1].output[0].save()

    # Empty invoke — new thread, can access any layer
    with tracer.invoke():
        mid_hs = model.transformer.h[5].output[0].save()

print(f"Layer 0: {early_hs.shape}")
print(f"Layer 5: {mid_hs.shape}")
print(f"Layer 11: {late_hs.shape}")
Layer 0: torch.Size([1, 10, 768])
Layer 5: torch.Size([1, 10, 768])
Layer 11: torch.Size([1, 10, 768])
You can chain multiple empty invokers to access the same module at different points:
with model.trace() as tracer:
    with tracer.invoke("The Eiffel Tower is in the city of"):
        pass

    with tracer.invoke():
        hs_layer0 = model.transformer.h[0].output[0].save()
    with tracer.invoke():
        hs_layer5 = model.transformer.h[5].output[0].save()
    with tracer.invoke():
        hs_layer11 = model.transformer.h[11].output[0].save()

print(f"Layer 0: {hs_layer0.shape}")
print(f"Layer 5: {hs_layer5.shape}")
print(f"Layer 11: {hs_layer11.shape}")
Layer 0: torch.Size([1, 10, 768])
Layer 5: torch.Size([1, 10, 768])
Layer 11: torch.Size([1, 10, 768])
Running Code After Unbounded Iteration¶
When using tracer.iter[:] (unbounded), code placed after the loop in the same invoker never runs, because the iterator only finishes when the trace itself ends. An empty invoker solves this — it runs as a separate thread after the iteration completes.
with model.generate(max_new_tokens=3) as tracer:
    with tracer.invoke("The Eiffel Tower is in the city of"):
        tokens = list().save()
        for step in tracer.iter[:]:
            tokens.append(model.lm_head.output[0, -1].argmax(dim=-1))
        # Code here would NEVER run!

    # Empty invoker — runs after generation completes
    with tracer.invoke():
        final_logits = model.lm_head.output.save()

for i, t in enumerate(tokens):
    print(f"Step {i}: {model.tokenizer.decode(t)}")
print(f"Final logits shape: {final_logits.shape}")
Step 0: Paris
Step 1: ,
Step 2: and
Final logits shape: torch.Size([1, 1, 50257])
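Each loop iteration reduces to an argmax over the vocabulary at the last position of the logits. A toy torch-only sketch of that per-step collection (random logits stand in for model.lm_head.output; no generation involved):

```python
import torch

vocab_size = 50257
tokens = []

# Stand-in for three generation steps' lm_head outputs (batch, seq, vocab):
# during generation, each step's output covers only the newly generated position.
for _ in range(3):
    step_logits = torch.randn(1, 1, vocab_size)
    # [0, -1] selects the last position of the first batch row;
    # argmax over the vocab dimension gives the predicted token id.
    tokens.append(step_logits[0, -1].argmax(dim=-1))

print(len(tokens))  # 3
```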
When to use empty invokers
- Batch-wide operations: access the combined batch from all input invokers
- Out-of-order access: access a module that already ran in a prior invoker
- Post-iteration code: run logic after an unbounded tracer.iter[:] loop
- Multiple reads of the same module: each empty invoker is a separate thread, so each can independently access any module