Setting Values#

We often not only want to see whats happening during computation, but intervene and edit the flow of information.

In this example, we create a tensor of noise to add to the hidden states. We then add it, use the assigment = operator to update the tensors of .output[0][:] with these new noised values.

[2]:

from nnsight import LanguageModel
import torch

model = LanguageModel('openai-community/gpt2', device_map='auto')

with model.trace('The Eiffel Tower is in the city of') as tracer:

    hidden_states_pre = model.transformer.h[-1].output[0].clone().save()

    noise = (0.001**0.5)*torch.randn(hidden_states_pre.shape)

    # model.transformer.h[-1].output = (hidden_states_pre + noise, model.transformer.h[-1].output[1])
    model.transformer.h[-1].output[0][:] = hidden_states_pre + noise

    hidden_states_post = model.transformer.h[-1].output[0].save()

We can see the change in the results:

[3]:

print(hidden_states_pre)
print(hidden_states_post)

tensor([[[ 0.0505, -0.1728, -0.1690,  ..., -1.0096,  0.1280, -1.0687],
         [ 8.7495,  2.9057,  5.3024,  ..., -8.0418,  1.2964, -2.8677],
         [ 0.2960,  4.6686, -3.6642,  ...,  0.2391, -2.6064,  3.2263],
         ...,
         [ 2.1537,  6.8917,  3.8651,  ...,  0.0588, -1.9866,  5.9188],
         [-0.4460,  7.4285, -9.3065,  ...,  2.0528, -2.7946,  0.5556],
         [ 6.6286,  1.7258,  4.7969,  ...,  7.6714,  3.0682,  2.0481]]],
       device='mps:0', grad_fn=<CloneBackward0>)
tensor([[[ 0.1225, -0.1650, -0.1966,  ..., -1.0529,  0.1273, -1.0736],
         [ 8.6914,  2.8702,  5.3589,  ..., -8.0615,  1.2423, -2.8655],
         [ 0.3343,  4.6634, -3.6297,  ...,  0.2230, -2.6057,  3.1985],
         ...,
         [ 2.1307,  6.9048,  3.8257,  ...,  0.1329, -1.9910,  5.9815],
         [-0.4599,  7.4705, -9.3246,  ...,  2.0739, -2.8393,  0.5849],
         [ 6.6359,  1.7332,  4.7754,  ...,  7.6323,  2.9859,  2.0438]]],
       device='mps:0', grad_fn=<CopySlices>)