Cross-Prompt Intervention

Intervention operations work across prompts! Use two invocations within the same generation block, and operations can work between them.

In this case, we grab the token embeddings from the first prompt, “Madison square garden is located in the city of New”, and use them to replace the embeddings of the second prompt. Note that the placeholder prompt is chosen to tokenize to the same number of tokens as the first prompt, so the replacement embeddings line up position for position.

[1]:
from nnsight import LanguageModel

model = LanguageModel('openai-community/gpt2', device_map='auto')
[3]:
with model.generate(max_new_tokens=3) as tracer:

    with tracer.invoke("Madison square garden is located in the city of New") as invoker:

        # Grab the token embeddings produced for the first prompt.
        embeddings = model.transformer.wte.output
        # Save the tokens generated for the unmodified prompt.
        original = model.generator.output.save()

    with tracer.invoke("_ _ _ _ _ _ _ _ _ _") as invoker:

        # Replace the second prompt's embeddings with those of the first.
        model.transformer.wte.output = embeddings
        intervened = model.generator.output.save()

print(model.tokenizer.batch_decode(original))
print(model.tokenizer.batch_decode(intervened))
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
['Madison square garden is located in the city of New York City.']
['_ _ _ _ _ _ _ _ _ _ York City.']
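
The same cross-prompt pattern also works on activations deeper in the network. The cell below is a minimal sketch that patches the hidden states of one transformer block instead of the token embeddings; it assumes GPT-2's block output is a tuple whose first element holds the hidden states, and the choice of layer 6 is arbitrary.

[ ]:
with model.generate(max_new_tokens=3) as tracer:

    with tracer.invoke("Madison square garden is located in the city of New") as invoker:

        # Hidden states of block 6 for the first prompt (element 0 of the block's output tuple).
        hidden = model.transformer.h[6].output[0]

    with tracer.invoke("_ _ _ _ _ _ _ _ _ _") as invoker:

        # Overwrite the second prompt's hidden states at the same block, in place.
        model.transformer.h[6].output[0][:] = hidden
        patched = model.generator.output.save()

print(model.tokenizer.batch_decode(patched))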

We could also have used a pre-saved embedding tensor, as shown here:

[6]:
with model.generate("Madison square garden is located in the city of New", max_new_tokens=3) as tracer:

    embeddings = model.transformer.wte.output.save()
    original = model.generator.output.save()

print(model.tokenizer.batch_decode(original))

with model.generate("_ _ _ _ _ _ _ _ _ _", max_new_tokens=3) as tracer:

    model.transformer.wte.output = embeddings
    intervened = model.generator.output.save()

print(model.tokenizer.batch_decode(intervened))
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.
['Madison square garden is located in the city of New York City.']
['_ _ _ _ _ _ _ _ _ _ York City.']
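
Since the saved embeddings can be used as an ordinary tensor once the trace exits (as the batch_decode calls above already rely on; older nnsight releases may require accessing .value), they can also be written to disk and patched in during a later session. The cell below is a minimal sketch; the filename wte_embeddings.pt is just an illustrative choice.

[ ]:
import torch

# Persist the saved embeddings (use embeddings.value here if your nnsight
# version still returns a proxy after the trace exits).
torch.save(embeddings, "wte_embeddings.pt")

# Later, possibly in a different session: load the tensor back and patch it in.
loaded = torch.load("wte_embeddings.pt")

with model.generate("_ _ _ _ _ _ _ _ _ _", max_new_tokens=3) as tracer:

    model.transformer.wte.output = loaded
    intervened = model.generator.output.save()

print(model.tokenizer.batch_decode(intervened))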