tracer¶
Cache
¶
Cache(modules: Optional[List[Union[Envoy, str]]] = None, device: Optional[device] = device('cpu'), dtype: Optional[dtype] = None, detach: Optional[bool] = True, include_output: bool = True, include_inputs: bool = False, rename: Optional[Dict[str, str]] = None, alias: Optional[Dict[str, str]] = None)
A cache for storing and transforming tensor values during tracing.
This class provides functionality to store tensor values with optional transformations such as detaching from computation graph, moving to a specific device, or converting to a specific dtype.
| PARAMETER | DESCRIPTION |
|---|---|
| `device` | Optional device to move tensors to. TYPE: `Optional[device]` DEFAULT: `device('cpu')` |
| `dtype` | Optional dtype to convert tensors to. TYPE: `Optional[dtype]` DEFAULT: `None` |
| `detach` | Whether to detach tensors from the computation graph. TYPE: `Optional[bool]` DEFAULT: `True` |
| `include_output` | Whether to include module outputs in the cached activations. TYPE: `bool` DEFAULT: `True` |
| `include_inputs` | Whether to include module inputs in the cached activations. TYPE: `bool` DEFAULT: `False` |
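As a rough illustration of the transform order described above (detach, then device, then dtype), here is a minimal, self-contained sketch. `MiniCache` and `StubTensor` are hypothetical stand-ins written for this example; they are not part of the nnsight API, and `StubTensor` merely mimics the `detach()`/`to()` calls a real `torch.Tensor` supports.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Stand-in for torch.Tensor so the sketch runs without torch (hypothetical).
@dataclass
class StubTensor:
    device: str = "cuda"
    dtype: str = "float32"
    detached: bool = False

    def detach(self) -> "StubTensor":
        return StubTensor(self.device, self.dtype, detached=True)

    def to(self, device: Optional[str] = None, dtype: Optional[str] = None) -> "StubTensor":
        return StubTensor(device or self.device, dtype or self.dtype, self.detached)

class MiniCache:
    """Sketch of a cache that applies detach -> device -> dtype when storing."""

    def __init__(self, device=None, dtype=None, detach=True):
        self.device, self.dtype, self.detach = device, dtype, detach
        self.data: Dict[str, Any] = {}

    def add(self, provider: str, value: StubTensor) -> None:
        if self.detach:
            value = value.detach()            # drop the computation graph first
        if self.device is not None:
            value = value.to(device=self.device)
        if self.dtype is not None:
            value = value.to(dtype=self.dtype)
        self.data[provider] = value

cache = MiniCache(device="cpu", dtype="float16")
cache.add("model.layer0", StubTensor())
entry = cache.data["model.layer0"]
print(entry.device, entry.dtype, entry.detached)  # cpu float16 True
```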
Entry
dataclass
¶
Entry(output: Optional[Any] = None, inputs: Optional[Tuple[Tuple[Any, ...], Dict[str, Any]]] = None)
CacheDict
¶
CacheDict(data: Union[CacheDict, Dict[str, Entry]], path: str = '', alias: Dict[str, str] = dict(), rename: Dict[str, str] = dict(), alias_paths: Dict[str, str] = dict())
Bases: Dict
A dictionary subclass that provides convenient access to cached module activations.
This class extends the standard dictionary to provide both dictionary-style access and attribute-style access to cached activations. It supports hierarchical access to nested modules using dot notation and indexing for module lists.
Examples:
Access cached activations using dictionary keys:
>>> cache["model.transformer.h.0"]
Access using attribute notation:
>>> cache.model.transformer.h[0]
Access module outputs and inputs:
>>> cache.model.transformer.h[0].output
>>> cache.model.transformer.h[0].inputs
>>> cache.model.transformer.h[0].input # First input argument
The class maintains an internal path that tracks the current location in the module hierarchy, allowing for intuitive navigation through nested modules.
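The path-tracking mechanism can be sketched in plain Python with a dict subclass whose attribute access extends a dot-separated path until it matches a stored key. `PathDict` is a hypothetical stand-in for illustration only, not the CacheDict implementation.

```python
from typing import Any, Dict

class PathDict(dict):
    """Sketch: dict with attribute-style, dot-path access (hypothetical)."""

    def __init__(self, data: Dict[str, Any], path: str = ""):
        super().__init__(data)
        self._path = path

    def _resolve(self, key: str) -> Any:
        new_path = f"{self._path}.{key}" if self._path else key
        if new_path in self:
            return dict.__getitem__(self, new_path)  # full path reached
        return PathDict(self, new_path)              # keep extending the path

    def __getattr__(self, name: str) -> Any:
        if name.startswith("_"):
            raise AttributeError(name)
        return self._resolve(name)

    def __getitem__(self, key):
        if isinstance(key, int):                     # h[0] -> "...h.0"
            return self._resolve(str(key))
        return dict.__getitem__(self, key)

cache = PathDict({"model.transformer.h.0": "entry0",
                  "model.transformer.h.1": "entry1"})
print(cache.model.transformer.h[0])    # entry0
print(cache["model.transformer.h.1"])  # entry1
```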
add
¶
Add a value to the cache with optional transformations.
| PARAMETER | DESCRIPTION |
|---|---|
| `provider` | The key to store the value under. |
| `value` | The tensor value to store. |
InterleavingTracer
¶
Bases: Tracer
Tracer that manages the interleaving of model execution and interventions.
This class coordinates the execution of the model's forward pass and user-defined intervention functions through the Interleaver.
| PARAMETER | DESCRIPTION |
|---|---|
| `fn` | The function to execute (typically the model's forward pass). |
| `model` | The model envoy to intervene on. |
| `*args` | Additional arguments to pass to the function. DEFAULT: `()` |
| `**kwargs` | Additional keyword arguments to pass to the function. DEFAULT: `{}` |
result
property
¶
result: Object
Get the result of the method being traced.
This property allows access to the return values produced by the method being traced.
Example:

    model = LanguageModel("gpt2", device_map='auto', dispatch=True)
    with model.generate("Hello World") as tracer:
        result = tracer.result.save()
    print(result)
| RETURNS | DESCRIPTION |
|---|---|
| `Object` | The result of the method being traced. |
compile
¶
Compile the captured code block into a callable function.
| RETURNS | DESCRIPTION |
|---|---|
| `Callable` | A callable function that executes the captured code block. |
execute
¶
First executes the parent Tracer's execute method to set up the context, then creates an Interleaver to manage the interventions during model execution.
invoke
¶
Create an Invoker to capture and execute an intervention function.
| PARAMETER | DESCRIPTION |
|---|---|
| `*args` | Additional arguments to pass to the intervention function. DEFAULT: `()` |
| `**kwargs` | Additional keyword arguments to pass to the intervention function. DEFAULT: `{}` |

| RETURNS | DESCRIPTION |
|---|---|
| `Invoker` | An Invoker instance. |
cache
¶
cache(modules: Optional[List[Union[Envoy, str]]] = None, device: Optional[device] = device('cpu'), dtype: Optional[dtype] = None, detach: Optional[bool] = True, include_output: bool = True, include_inputs: bool = False) -> Union[Dict, Object]
Get or create a cache for storing intermediate values during tracing.
| PARAMETER | DESCRIPTION |
|---|---|
| `modules` | Optional list of modules to cache; defaults to all modules. TYPE: `Optional[List[Union[Envoy, str]]]` DEFAULT: `None` |
| `device` | Optional device to move tensors to. TYPE: `Optional[device]` DEFAULT: `device('cpu')` |
| `dtype` | Optional dtype to convert tensors to. TYPE: `Optional[dtype]` DEFAULT: `None` |
| `detach` | Whether to detach tensors from the computation graph. TYPE: `Optional[bool]` DEFAULT: `True` |
| `include_output` | Whether to include module outputs in the cached activations. TYPE: `bool` DEFAULT: `True` |
| `include_inputs` | Whether to include module inputs in the cached activations. TYPE: `bool` DEFAULT: `False` |

| RETURNS | DESCRIPTION |
|---|---|
| `Union[Dict, Object]` | A dictionary containing the cached values. |
barrier
¶
A synchronization primitive for coordinating multiple concurrent invocations in nnsight.
This works similarly to a threading.Barrier, but is designed for use with nnsight's model tracing and intervention system. A barrier allows you to pause execution in multiple parallel invocations until all participants have reached the barrier, at which point all are released to continue. This is useful when you want to synchronize the execution of different model runs, for example to ensure that all have reached a certain point (such as after embedding lookup) before proceeding to the next stage (such as generation or intervention).
Example usage:

    with gpt2.generate(max_new_tokens=3) as tracer:
        barrier = tracer.barrier(2)
        with tracer.invoke(MSG_prompt):
            embeddings = gpt2.transformer.wte.output
            barrier()
            output1 = gpt2.generator.output.save()
        with tracer.invoke("_ _ _ _ _ _ _ _ _"):
            barrier()
            gpt2.transformer.wte.output = embeddings
            output2 = gpt2.generator.output.save()

In this example, both invocations pause at the barrier until each has reached it, ensuring synchronization.
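Since the docs note this primitive works like `threading.Barrier`, the same release-all-at-once behavior can be demonstrated with the standard library alone; the worker names and the `order` list below are illustrative, not nnsight API.

```python
import threading

# threading.Barrier analogy: all parties block at wait() until every
# participant has arrived, then all proceed together.
order = []
barrier = threading.Barrier(2)

def worker(name: str) -> None:
    order.append(f"{name}:before")
    barrier.wait()                 # blocks until both workers arrive
    order.append(f"{name}:after")

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both "before" events necessarily precede every "after" event.
print(all(e.endswith("before") for e in order[:2]))  # True
```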
ScanningTracer
¶
Bases: InterleavingTracer
A tracer that runs the model in fake tensor mode to validate operations and inspect tensor shapes.
This tracer uses PyTorch's FakeTensorMode to run the model without actual computation, allowing for shape validation and operation checking. It populates the _fake_inputs and _fake_output attributes on each Envoy to store the shapes and types of tensors that would flow through the model during a real forward pass.
execute
¶
Execute the model in fake tensor mode.
This method: 1. Registers forward hooks on all modules to capture fake input/output 2. Runs the model in fake tensor mode to validate operations 3. Stores the fake inputs/outputs on each Envoy for later inspection
| PARAMETER | DESCRIPTION |
|---|---|
| `fn` | The function to execute (typically the model's forward pass). |
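The hook-based capture in step 1 can be sketched without torch using stub modules; `StubModule` and the `captured` dict are hypothetical stand-ins that only mimic `register_forward_hook`, not the real FakeTensorMode machinery.

```python
# Stand-in for a torch module supporting forward hooks (hypothetical).
class StubModule:
    def __init__(self, name: str):
        self.name, self.hooks = name, []

    def register_forward_hook(self, fn) -> None:
        self.hooks.append(fn)

    def __call__(self, x):
        out = f"{self.name}({x})"
        for hook in self.hooks:          # hooks see (module, inputs, output)
            hook(self, (x,), out)
        return out

captured = {}
mods = [StubModule("wte"), StubModule("h0")]
for m in mods:
    m.register_forward_hook(
        lambda mod, inp, out: captured.update({mod.name: (inp, out)})
    )

# A "forward pass" through both modules records every input/output pair.
y = mods[1](mods[0]("tokens"))
print(captured["wte"])  # (('tokens',), 'wte(tokens)')
```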