tracer

Cache

Cache(modules: Optional[List[Union[Envoy, str]]] = None, device: Optional[device] = device('cpu'), dtype: Optional[dtype] = None, detach: Optional[bool] = True, include_output: bool = True, include_inputs: bool = False, rename: Optional[Dict[str, str]] = None, alias: Optional[Dict[str, str]] = None)

A cache for storing and transforming tensor values during tracing.

This class provides functionality to store tensor values with optional transformations such as detaching from computation graph, moving to a specific device, or converting to a specific dtype.

PARAMETER DESCRIPTION
modules

Optional list of modules to cache, defaults to all modules

TYPE: Optional[List[Union[Envoy, str]]] DEFAULT: None

device

Optional device to move tensors to

TYPE: Optional[device] DEFAULT: device('cpu')

dtype

Optional dtype to convert tensors to

TYPE: Optional[dtype] DEFAULT: None

detach

Whether to detach tensors from computation graph

TYPE: Optional[bool] DEFAULT: True

include_output

Whether to include output in the cached activations

TYPE: bool DEFAULT: True

include_inputs

Whether to include inputs in the cached activations

TYPE: bool DEFAULT: False
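
The parameters above describe a per-tensor transformation pipeline: detach from the computation graph, then move to a device, then cast to a dtype. A minimal sketch of that ordering, using a hypothetical FakeTensor stand-in for torch.Tensor (this is not the actual implementation):

```python
class FakeTensor:
    """Minimal stand-in for torch.Tensor that records applied transformations."""

    def __init__(self):
        self.detached = False
        self.device = "cuda:0"
        self.dtype = "float32"

    def detach(self):
        self.detached = True
        return self

    def to(self, device=None, dtype=None):
        if device is not None:
            self.device = device
        if dtype is not None:
            self.dtype = dtype
        return self


def transform(value, detach=True, device="cpu", dtype=None):
    # Apply the optional transformations in the order the parameters document:
    # detach from the graph, then move to `device`, then cast to `dtype`.
    if detach:
        value = value.detach()
    if device is not None:
        value = value.to(device=device)
    if dtype is not None:
        value = value.to(dtype=dtype)
    return value


t = transform(FakeTensor(), dtype="float16")
```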

device instance-attribute

device = device

dtype instance-attribute

dtype = dtype

detach instance-attribute

detach = detach

modules instance-attribute

modules = modules

include_output instance-attribute

include_output = include_output

include_inputs instance-attribute

include_inputs = include_inputs

cache instance-attribute

cache = save()

Entry dataclass

Entry(output: Optional[Any] = None, inputs: Optional[Tuple[Tuple[Any, ...], Dict[str, Any]]] = None)
output class-attribute instance-attribute
output: Optional[Any] = None
inputs class-attribute instance-attribute
inputs: Optional[Tuple[Tuple[Any, ...], Dict[str, Any]]] = None
input property
input

Gets the first positional argument of the cached module's inputs. Returns None if no inputs were cached.
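
A minimal sketch of the Entry dataclass and its input property, assuming inputs are stored as an ((args, ...), kwargs) pair as the signature above indicates:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class Entry:
    output: Optional[Any] = None
    inputs: Optional[Tuple[Tuple[Any, ...], Dict[str, Any]]] = None

    @property
    def input(self):
        # First positional argument of the cached inputs, or None.
        if self.inputs is None:
            return None
        args, _kwargs = self.inputs
        return args[0] if args else None


entry = Entry(output="hidden_states", inputs=(("token_ids", "mask"), {}))
```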

CacheDict

CacheDict(data: Union[CacheDict, Dict[str, Entry]], path: str = '', alias: Dict[str, str] = dict(), rename: Dict[str, str] = dict(), alias_paths: Dict[str, str] = dict())

Bases: Dict

A dictionary subclass that provides convenient access to cached module activations.

This class extends the standard dictionary to provide both dictionary-style access and attribute-style access to cached activations. It supports hierarchical access to nested modules using dot notation and indexing for module lists.

Examples:

Access cached activations using dictionary keys:

>>> cache['model.transformer.h.0.attn']

Access using attribute notation:

>>> cache.model.transformer.h[0].attn

Access module outputs and inputs:

>>> cache.model.transformer.h[0].output
>>> cache.model.transformer.h[0].inputs
>>> cache.model.transformer.h[0].input  # First input argument

The class maintains an internal path that tracks the current location in the module hierarchy, allowing for intuitive navigation through nested modules.
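The path-tracking behavior described above can be sketched with a dict subclass whose failed attribute lookups descend one path segment at a time. This is an illustrative simplification; the real class also handles aliases, renames, and integer indexing into module lists:

```python
class CacheDictSketch(dict):
    """Toy version of CacheDict's attribute-style path resolution."""

    def __init__(self, data, path=""):
        super().__init__(data)
        self._path = path

    def __getattr__(self, attr):
        path = f"{self._path}.{attr}" if self._path else attr
        if path in self:
            return self[path]               # full path resolved to a cached entry
        return CacheDictSketch(self, path)  # otherwise descend one level


cache = CacheDictSketch({"model.transformer.h": "entry-for-h"})
resolved = cache.model.transformer.h
```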

output property
output

Returns the output attribute from the Cache.Entry at the current path.

inputs property
inputs

Returns the inputs attribute from the Cache.Entry at the current path.

input property
input

Returns the input property from the Cache.Entry at the current path.

keys
keys(alias: bool = False)
__getitem__
__getitem__(key)
__getattr__
__getattr__(attr: str)

add

add(provider: str, value: Any)

Add a value to the cache with optional transformations.

PARAMETER DESCRIPTION
provider

The key to store the value under

TYPE: str

value

The tensor value to store

TYPE: Any

InterleavingTracer

InterleavingTracer(fn: Callable, model: Envoy, *args, backend: Backend = None, **kwargs)

Bases: Tracer

Tracer that manages the interleaving of model execution and interventions.

This class coordinates the execution of the model's forward pass and user-defined intervention functions through the Interleaver.

PARAMETER DESCRIPTION
fn

The function to execute (typically the model's forward pass)

TYPE: Callable

model

The model envoy to intervene on

TYPE: Envoy

*args

Additional arguments to pass to the function

DEFAULT: ()

**kwargs

Additional keyword arguments to pass to the function

DEFAULT: {}

fn instance-attribute

fn = fn

model instance-attribute

model = model

mediators instance-attribute

mediators: List[Mediator] = []

batcher instance-attribute

batcher = Batcher()

iter property

iter

result property

result: Object

Get the result of the method being traced.

This property allows access to the return values produced by the method being traced.

Example

    model = LanguageModel("gpt2", device_map='auto', dispatch=True)
    with model.generate("Hello World") as tracer:
        result = tracer.result.save()
    print(result)

RETURNS DESCRIPTION
Object

The result of the method being traced

capture

capture()

Capture the code block within the 'with' statement.

compile

compile() -> Callable

Compile the captured code block into a callable function.

RETURNS DESCRIPTION
Callable

A callable function that executes the captured code block

get_frame

get_frame()

Get the frame of the tracer.

execute

execute(fn: Callable)

First executes the parent Tracer's execute method to set up the context, then creates an Interleaver to manage the interventions during model execution.

invoke

invoke(*args, **kwargs)

Create an Invoker to capture and execute an intervention function.

PARAMETER DESCRIPTION
*args

Additional arguments to pass to the intervention function

DEFAULT: ()

**kwargs

Additional keyword arguments to pass to the intervention function

DEFAULT: {}

RETURNS DESCRIPTION

An Invoker instance

stop

stop()

Raise an EarlyStopException to stop the execution of the model.
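
The control flow behind stop can be illustrated with plain Python: a sentinel exception raised mid-forward-pass and caught by the runner, so later modules never execute. The EarlyStopException class and forward function here are stand-ins for illustration, not nnsight's actual implementation:

```python
class EarlyStopException(Exception):
    """Stand-in for nnsight's early-stop sentinel."""


executed = []


def forward(layers, stop_after=None):
    # Run layers in order; tracer.stop() corresponds to raising mid-pass.
    for i, layer in enumerate(layers):
        executed.append(layer)
        if stop_after is not None and i == stop_after:
            raise EarlyStopException


try:
    forward(["embed", "h.0", "h.1", "lm_head"], stop_after=1)
except EarlyStopException:
    pass  # the runner swallows the sentinel and returns early
```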

all

all()

next

next(step: int = 1)

cache

cache(modules: Optional[List[Union[Envoy, str]]] = None, device: Optional[device] = device('cpu'), dtype: Optional[dtype] = None, detach: Optional[bool] = True, include_output: bool = True, include_inputs: bool = False) -> Union[Dict, Object]

Get or create a cache for storing intermediate values during tracing.

PARAMETER DESCRIPTION
modules

Optional list of modules to cache, defaults to all modules

TYPE: Optional[List[Union[Envoy, str]]] DEFAULT: None

device

Optional device to move tensors to, defaults to cpu

TYPE: Optional[device] DEFAULT: device('cpu')

dtype

Optional dtype to convert tensors to, defaults to None

TYPE: Optional[dtype] DEFAULT: None

detach

Whether to detach tensors from computation graph, defaults to True

TYPE: Optional[bool] DEFAULT: True

include_output

Whether to include output in the cached activations

TYPE: bool DEFAULT: True

include_inputs

Whether to include inputs in the cached activations

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
Union[Dict, Object]

A dictionary containing the cached values
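
Conceptually, cache registers hooks that record each module's output (and optionally its inputs) under the module's path. A self-contained sketch with toy modules in place of real torch modules (hypothetical names, not the actual implementation):

```python
class ToyModule:
    """Stand-in for a hooked torch module."""

    def __init__(self, name, fn):
        self.name, self.fn, self.hooks = name, fn, []

    def __call__(self, x):
        out = self.fn(x)
        for hook in self.hooks:
            hook(self, (x,), out)
        return out


def cache_modules(modules, include_output=True, include_inputs=False):
    cache = {}

    def hook(module, inputs, output):
        cache[module.name] = {
            "output": output if include_output else None,
            "inputs": inputs if include_inputs else None,
        }

    for m in modules:
        m.hooks.append(hook)
    return cache


layers = [ToyModule("h.0", lambda x: x + 1), ToyModule("h.1", lambda x: x * 2)]
cache = cache_modules(layers, include_inputs=True)
y = layers[1](layers[0](3))
```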

barrier

barrier(n_participants: int)

A synchronization primitive for coordinating multiple concurrent invocations in nnsight.

This works similarly to a threading.Barrier, but is designed for use with nnsight's model tracing and intervention system. A barrier allows you to pause execution in multiple parallel invocations until all participants have reached the barrier, at which point all are released to continue. This is useful when you want to synchronize the execution of different model runs, for example to ensure that all have reached a certain point (such as after embedding lookup) before proceeding to the next stage (such as generation or intervention).

Example usage:

with gpt2.generate(max_new_tokens=3) as tracer:
    barrier = tracer.barrier(2)

    with tracer.invoke(MSG_prompt):
        embeddings = gpt2.transformer.wte.output
        barrier()
        output1 = gpt2.generator.output.save()

    with tracer.invoke("_ _ _ _ _ _ _ _ _"):
        barrier()
        gpt2.transformer.wte.output = embeddings
        output2 = gpt2.generator.output.save()

In this example, both invocations will pause at the barrier until both have reached it, ensuring synchronization.
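
Since this works like threading.Barrier, the same two-party synchronization can be reproduced with the standard library, with threads standing in for the two invocations:

```python
import threading

barrier = threading.Barrier(2)
order = []  # list.append is atomic in CPython, so no lock is needed here


def invoke(name):
    order.append(f"{name}:before")
    barrier.wait()  # both threads pause here until both have arrived
    order.append(f"{name}:after")


threads = [threading.Thread(target=invoke, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the barrier releases both threads only after each has appended its "before" entry, every "before" precedes every "after" in order.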

__getstate__

__getstate__()

Get the state of the tracer for serialization.

__setstate__

__setstate__(state)

Set the state of the tracer for deserialization.

ScanningTracer

ScanningTracer(fn: Callable, model: Envoy, *args, backend: Backend = None, **kwargs)

Bases: InterleavingTracer

A tracer that runs the model in fake tensor mode to validate operations and inspect tensor shapes.

This tracer uses PyTorch's FakeTensorMode to run the model without actual computation, allowing for shape validation and operation checking. It populates the _fake_inputs and _fake_output attributes on each Envoy to store the shapes and types of tensors that would flow through the model during a real forward pass.

execute

execute(fn: Callable)

Execute the model in fake tensor mode.

This method:

1. Registers forward hooks on all modules to capture fake input/output
2. Runs the model in fake tensor mode to validate operations
3. Stores the fake inputs/outputs on each Envoy for later inspection

PARAMETER DESCRIPTION
fn

The function to execute (typically the model's forward pass)

TYPE: Callable

Barrier

Barrier(model: Envoy, n_participants: int)

model instance-attribute

model = model

n_participants instance-attribute

n_participants = n_participants

participants instance-attribute

participants: Set[str] = set()

__call__

__call__()
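
The attributes above suggest simple bookkeeping: each participant registers itself, and the barrier is complete once n_participants distinct names have arrived. A hypothetical sketch (the real __call__ takes no arguments and coordinates with the Interleaver; the arrive method and its return value here are invented for illustration):

```python
from typing import Set


class BarrierSketch:
    def __init__(self, n_participants: int):
        self.n_participants = n_participants
        self.participants: Set[str] = set()

    def arrive(self, name: str) -> bool:
        """Register a participant; True once all participants have arrived."""
        self.participants.add(name)
        return len(self.participants) >= self.n_participants


b = BarrierSketch(2)
first = b.arrive("invoke_0")
second = b.arrive("invoke_1")
```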