Cross-Prompt Intervention#
Intervention operations work across prompts! Within a single generation block, you can use two invocations, and operations can flow between them.
In this example, we grab the token embeddings produced by the first prompt, “Madison square garden is located in the city of New”, and use them to replace the embeddings of the second prompt.
[1]:
from nnsight import LanguageModel
model = LanguageModel('openai-community/gpt2', device_map='auto')
[2]:
with model.generate(max_new_tokens=3) as tracer:

    with tracer.invoke("Madison square garden is located in the city of New") as invoker:

        embeddings = model.transformer.wte.output
        original = model.generator.output.save()

    with tracer.invoke("_ _ _ _ _ _ _ _ _ _") as invoker:

        model.transformer.wte.output = embeddings
        intervened = model.generator.output.save()

print(model.tokenizer.batch_decode(original.value))
print(model.tokenizer.batch_decode(intervened.value))
['Madison square garden is located in the city of New York City.']
['_ _ _ _ _ _ _ _ _ _ York City.']
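The second prompt continues with “York City.” because a model’s forward pass depends on its input tokens only through their embedding vectors: once the second prompt’s embeddings are overwritten, every downstream layer computes exactly what it computed for the first prompt. A minimal sketch of this idea, using a hypothetical embedding table and a stand-in forward function (plain Python, not nnsight):

```python
# Hypothetical embedding table and toy forward pass, purely to illustrate
# why overwriting embeddings transfers the continuation.
embed = {"New": [1.0, 0.0], "_": [0.0, 0.0]}

def toy_model(embeddings):
    # Stand-in forward pass: any deterministic function of the embeddings.
    return "York" if embeddings[-1] == [1.0, 0.0] else "?"

emb_a = [embed[t] for t in ["New"]]  # first prompt's embeddings
emb_b = [embed[t] for t in ["_"]]    # second prompt's embeddings

# The intervention: replace the second prompt's embeddings with the first's.
emb_b = emb_a

print(toy_model(emb_a))  # York
print(toy_model(emb_b))  # York — identical, because the inputs are identical
```

The real mechanism is the same: the model never sees the underscore tokens, only the embeddings substituted in their place.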
Alternatively, we could save the embeddings in one generation pass and feed the pre-saved tensor into a separate pass:
[3]:
with model.generate("Madison square garden is located in the city of New", max_new_tokens=3) as tracer:

    embeddings = model.transformer.wte.output.save()
    original = model.generator.output.save()

print(model.tokenizer.batch_decode(original.value))

with model.generate("_ _ _ _ _ _ _ _ _ _", max_new_tokens=3) as tracer:

    model.transformer.wte.output = embeddings.value
    intervened = model.generator.output.save()

print(model.tokenizer.batch_decode(intervened.value))
['Madison square garden is located in the city of New York City.']
['_ _ _ _ _ _ _ _ _ _ York City.']