LoRA for Sentiment Analysis#

📗 You can find an interactive Colab version of this tutorial here.

Low-Rank Adaptation (LoRA) is a technique for fine-tuning large language models efficiently. Rather than updating all of the model's weights, LoRA freezes them and learns two small matrices whose shared inner dimension (the rank r) is much smaller than the dimensions of the original weight matrix. Multiplying the two matrices gives a low-rank update that has the same shape as the original pre-trained weight matrix. Adding this update to the frozen weights yields the fine-tuned weights for the layers the LoRA is applied to.
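To make the savings concrete, here is a minimal sketch that compares the number of trainable parameters in a full weight update with the number in a LoRA update. The helper name `lora_param_counts` is hypothetical and not part of any library; the example sizes (8192 and 4) are chosen to match the hidden size and LoRA dimension that appear later in this tutorial.

```python
def lora_param_counts(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Return (full_update_params, lora_params) for one weight matrix."""
    full = d_in * d_out          # updating every entry of a d_in x d_out matrix
    lora = d_in * r + r * d_out  # one d_in x r matrix plus one r x d_out matrix
    return full, lora


full, lora = lora_param_counts(d_in=8192, d_out=8192, r=4)
print(f"full update: {full:,} parameters")
print(f"LoRA update: {lora:,} parameters ({full // lora}x fewer)")
```

The product of a `d_in x r` matrix and an `r x d_out` matrix still has shape `d_in x d_out`, which is why the update can be added to the original weights even though far fewer values are trained.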

TRAIN FIGURE

LoRA belongs to the Parameter-Efficient Fine-Tuning (PEFT) family because it keeps the original model weights unchanged and trains only a small number of added parameters. In this tutorial, the LoRA is attached to the last Multilayer Perceptron (MLP) layer of the model, and its two matrices are trained on a task-specific dataset while the rest of the model stays frozen.

TEST FIGURE

Setup#

Make sure you have obtained your NDIF API key and configured your workspace for remote execution.

The following packages need to be installed for this tutorial:

!pip install nnsight
!pip install pyarrow==15.0.2
!pip install datasets torch
[ ]:
try:
    import google.colab
    is_colab = True
except ImportError:
    is_colab = False

if is_colab:
    !pip install -U nnsight
    !pip install pyarrow==15.0.2
    !pip install datasets torch
[2]:
from nnsight import CONFIG

if is_colab:
    # Store your Hugging Face token and NDIF API key in Colab secrets
    from google.colab import userdata
    NDIF_API = userdata.get('NDIF_API')
    HF_TOKEN = userdata.get('HF_TOKEN')

    CONFIG.set_default_api_key(NDIF_API)

Here are the imports needed for this tutorial.

[3]:
import torch
import torch.nn as nn
import pandas as pd
from nnsight import LanguageModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer, AutoModelForCausalLM
from transformers import TrainingArguments, Trainer
from torch.utils.data import DataLoader, Subset
from datasets import load_dataset

Prepare Data#

For this tutorial we will be using the Stanford Sentiment Treebank (SST-2). It consists of sentences from movie reviews with human annotations of their sentiment. The task is to predict whether a given sentence is positive or negative. In the dataset, negative sentences are labeled 0 and positive sentences are labeled 1.

[4]:
# GLUE is a standard Natural Language Processing (NLP) benchmark made up of several tasks.
# It is used to evaluate how well language models understand and process language.
# SST-2, the sentiment analysis task used here, is one of those tasks.
dataset = load_dataset("glue", "sst2")

# 0 = neg, 1 = pos
def label_to_str(example):
    example['label'] = 'positive' if example['label'] == 1 else 'negative'
    return example

# Build (sentence, sentiment string) pairs for each split
train_data = [(example['sentence'], 'positive' if example['label'] == 1 else 'negative') for example in dataset['train']]
validation_data = [(example['sentence'], 'positive' if example['label'] == 1 else 'negative') for example in dataset['validation']]

Next, we need to tokenize our data. Tokenizing converts text into a sequence of integer token IDs, which is the numerical form a language model takes as input.

[5]:
tokenizer = AutoTokenizer.from_pretrained('openai-community/gpt2', add_prefix_space=True)
tokenizer.pad_token = tokenizer.eos_token

# Tokenizes a batch of sentences with the GPT-2 tokenizer loaded above,
# padding or truncating each sentence to exactly 10 tokens
def tokenize_function(text):
    return tokenizer(text['sentence'], padding='max_length', truncation=True, max_length=10, return_tensors='pt')

# We use .map() in order to apply the tokenization function to all the training data.
tokenized_train_dataset = dataset['train'].map(tokenize_function, batched=True, batch_size=10)
tokenized_train_dataset = tokenized_train_dataset.map(lambda x: {'input_ids': x['input_ids'], 'attention_mask': x['attention_mask'], 'labels': x['label']})
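The effect of `padding='max_length'`, `truncation=True`, and `max_length=10` can be illustrated with a small pure-Python sketch. This is a toy stand-in, not the GPT-2 tokenizer itself: the function `pad_or_truncate`, the pad ID of 0, and the token IDs below are all hypothetical, and right-side padding and truncation are assumed.

```python
def pad_or_truncate(token_ids, max_length, pad_id):
    """Mimic padding='max_length' with truncation=True for one sequence."""
    kept = token_ids[:max_length]                    # truncation: drop the tail
    n_pad = max_length - len(kept)
    input_ids = kept + [pad_id] * n_pad              # pad on the right
    attention_mask = [1] * len(kept) + [0] * n_pad   # 0 marks padding positions
    return input_ids, attention_mask


# Hypothetical token IDs, not real GPT-2 IDs
short_sentence = [11, 22, 33]
long_sentence = list(range(100, 115))

print(pad_or_truncate(short_sentence, max_length=10, pad_id=0))
print(pad_or_truncate(long_sentence, max_length=10, pad_id=0))
```

Every sequence comes out with the same length, which is what allows sentences to be stacked into a batch. The cost is that any sentence longer than `max_length` tokens loses everything past the cutoff.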

Prepare our Model#

For this tutorial we will be using the Llama 3.1 70B language model.

[6]:
# Use the LanguageModel wrapper class to load in the Llama model
model_name = "meta-llama/Meta-Llama-3.1-70B"
model = LanguageModel(model_name, device_map='auto')

This is the model architecture before the LoRA has been applied. The LoRA does not replace the last MLP layer of the model; instead, its output is added to that layer's output.

We’re going to train a very simple LoRA that, when applied, will make our model determine whether a sentence displays a positive or a negative sentiment.

[8]:
from nnsight import Envoy

# We will define a LORA class.
# The operations in its __call__ method are traced just as they would be inside a .trace context.
class LORA(nn.Module):
    def __init__(self, module: Envoy, dim: int, r: int) -> None:
        """Init.

        Args:
            module (Envoy): Which model Module we are adding the LORA to.
            dim (int): Dimension of the layer we are adding to (This could potentially be auto populated if the user scanned first so we know the shape)
            r (int): Inner dimension of the LORA
        """
        super(LORA, self).__init__()
        self.r = r
        self.module = module
        self.WA = torch.nn.Parameter(torch.randn(dim, self.r), requires_grad=True).save()
        self.WB = torch.nn.Parameter(torch.zeros(self.r, dim), requires_grad=True).save()

    # The __call__ method defines how to actually apply the LORA.
    # It uses the module's output, so it takes effect after the module's forward pass.
    def __call__(self, alpha: float = 1.0):
        """Call.

        Args:
            alpha (float, optional): How much to apply the LORA. Can be altered after training for inference. Defaults to 1.0.
        """

        # We apply WA to the first positional arg (the hidden states)
        A_x = torch.matmul(self.module.input, self.WA)
        BA_x = torch.matmul(A_x, self.WB)

        # LORA is additive: scale the low-rank update by alpha and add it to the original output
        h = self.module.output + BA_x * alpha

        # Replace the output with our new one
        # Could also have been self.module.output[:] = h, for in-place
        self.module.output = h

    def parameters(self):
        # Some way to get all the parameters.
        return [self.WA, self.WB]
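The arithmetic inside the `__call__` method can be checked on toy numbers without loading a model. The sketch below is a plain-Python illustration, not the nnsight implementation: the helpers `matmul` and `lora_output` and all of the values are hypothetical, and it assumes `alpha` scales only the low-rank term. It shows why `WB` is initialized to zeros: before any training, the LoRA leaves the module's output unchanged.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_output(x, module_out, WA, WB, alpha=1.0):
    """Return module_out + alpha * (x @ WA @ WB)."""
    delta = matmul(matmul(x, WA), WB)
    return [[o + alpha * d for o, d in zip(o_row, d_row)]
            for o_row, d_row in zip(module_out, delta)]


# Toy sizes: dim = 3, r = 2, a single token
x = [[1.0, 2.0, 3.0]]                        # module input
module_out = [[0.5, -0.5, 1.0]]              # module output without the LoRA
WA = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]    # dim x r
WB_init = [[0.0] * 3 for _ in range(2)]      # r x dim, all zeros at initialization
WB_trained = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # made-up "trained" values

print(lora_output(x, module_out, WA, WB_init))     # same values as module_out
print(lora_output(x, module_out, WA, WB_trained))  # output shifted by the update
```

Because training starts from the unmodified model, the first loss value reflects the base model's behavior, and the optimizer moves away from it only as `WB` becomes non-zero.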

LLM Fine Tuning#

[9]:
# Inner LORA dimension
lora_dim = 4

# Module to train LORA on
# Accesses the last mlp layer of the model
module = model.model.layers[-1].mlp

We can use the .scan() method to get the shape of the module without having to fully run the model.

[10]:
with model.scan(" "):
    dim = module.output.shape[-1]

print(dim)
8192
[13]:
import nnsight
# The LORA object itself isn't transmitted to the server, only its forward / call method.
# The parameters are created remotely and are never sent, only retrieved.
with model.session(remote=True) as session:

    dataset = tokenized_train_dataset

    # Smaller chunks to run faster, feel free to increase
    indices = list(range(0, 5000))
    subset = Subset(dataset, indices)


    # Create a dataloader from it.
    dataloader = DataLoader(subset, batch_size=10)

    # Create our LORA on the last mlp and apply it to the model
    lora = LORA(module, dim, lora_dim)

    # Create an optimizer. Use the parameters from LORA
    optimizer = torch.optim.AdamW(lora.parameters(), lr=3)

    # Iterate over dataloader using .iter.
    with session.iter(dataloader) as batch:

        # The sentences whose sentiment we want to predict
        prompt = batch['sentence']

        # Their labels: 0 for negative, 1 for positive
        correct_token = batch['label']


        # Run .trace with prompt
        with model.trace(prompt) as tracer:


            # Apply LORA to the intervention graph just by calling it within .trace
            # This invokes the __call__() method of the LORA class defined above
            lora()


            # Get logits
            # Logits are the unnormalized scores the model outputs for each
            # vocabulary token, before softmax turns them into probabilities.
            logits = model.lm_head.output


            # Do cross entropy on last predicted token and correct_token
            loss = torch.nn.functional.cross_entropy(logits[:, -1], correct_token)

            # Call backward
            loss.backward()


        # Call methods on optimizer. Graphs that aren't from .trace (so in this case session and iterator both have their own graph) are executed sequentially.
        # The Graph of Iterator here will be:
        # 1.) Read batch['sentence'] for prompt
        # 2.) Read batch['label'] for correct_token
        # 3.) Execute the .trace using the prompt
        # 4.) Call .step() on optimizer
        optimizer.step()
        # 5.) Call .zero_grad() in optimizer
        optimizer.zero_grad()
        # 6.) Print out the lora WA weights to show they are indeed changing
        nnsight.log(lora.WA)
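The loss used in the loop above is a cross entropy between the logits at the last token position and the label. As a minimal sketch of what that computes for a single example, the hypothetical helper `cross_entropy` below reproduces the formula in plain Python on made-up logits. It is an illustration of the formula, not the PyTorch implementation, which by default also averages the result over the batch.

```python
import math


def cross_entropy(logits, target):
    """Return -log(softmax(logits)[target]) for one example."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Made-up logits over a three-token vocabulary
logits = [2.0, 0.0, 0.0]
print(cross_entropy(logits, target=0))  # lower loss: index 0 has the highest score
print(cross_entropy(logits, target=1))  # higher loss: index 1 has a low score
```

The loss is small when the target index already has the highest logit and grows as probability mass moves elsewhere, so minimizing it pushes the last-position logits toward the label index.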


Streaming output truncated to the last 5000 lines.
INFO:nnsight_remote:ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.7734, -0.4980,  0.2109, -0.0747],
        [ 8.1875,  0.6758, -0.0505,  0.3887],
        [ 0.4805,  0.2988,  0.3906,  1.5234],
        ...,
        [-7.4062, -0.1494, -0.7969, -0.8047],
        [ 0.5742,  0.0742,  3.9531,  1.6172],
        [ 0.9062,  0.3438,  4.0938,  0.8516]], requires_grad=True)
2025-02-06 20:26:27,839 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.9023, -0.5195,  0.2051, -0.0623],
        [ 9.4375,  0.7383, -0.0260,  0.2656],
        [ 0.5547,  0.3203,  0.3828,  1.4609],
        ...,
        [-8.5000, -0.1592, -0.8086, -0.7656],
        [ 0.6562,  0.0737,  4.0625,  1.5703],
        [ 1.1172,  0.4102,  3.9219,  0.7070]], requires_grad=True)
2025-02-06 20:26:27,972 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.0156, -0.5352,  0.1973, -0.0479],
        [10.5000,  0.7812,  0.0131,  0.1387],
        [ 0.6172,  0.3359,  0.3828,  1.3828],
        ...,
        [-9.5000, -0.1592, -0.8281, -0.7109],
        [ 0.7344,  0.0625,  4.2500,  1.4766],
        [ 1.3203,  0.4551,  3.8906,  0.5391]], requires_grad=True)
2025-02-06 20:26:28,108 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1094,  -0.5430,   0.1875,  -0.0327],
        [ 11.4375,   0.8164,   0.0491,   0.0270],
        [  0.6719,   0.3457,   0.3906,   1.3047],
        ...,
        [-10.3125,  -0.1553,  -0.8477,  -0.6602],
        [  0.8008,   0.0508,   4.4062,   1.3828],
        [  1.4922,   0.4922,   3.8594,   0.3945]], requires_grad=True)
2025-02-06 20:26:28,241 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1875,  -0.5547,   0.1816,  -0.0253],
        [ 12.2500,   0.8516,   0.0703,  -0.0586],
        [  0.7148,   0.3574,   0.3848,   1.2500],
        ...,
        [-10.9375,  -0.1641,  -0.8438,  -0.6328],
        [  0.8516,   0.0500,   4.4688,   1.3438],
        [  1.6328,   0.5234,   3.8125,   0.2656]], requires_grad=True)
2025-02-06 20:26:28,376 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2500,  -0.5703,   0.1787,  -0.0251],
        [ 12.8125,   0.8906,   0.0664,  -0.1025],
        [  0.7500,   0.3730,   0.3594,   1.2344],
        ...,
        [-11.4375,  -0.1826,  -0.8281,  -0.6289],
        [  0.8906,   0.0544,   4.4688,   1.3203],
        [  1.7500,   0.5547,   3.7188,   0.1699]], requires_grad=True)
2025-02-06 20:26:28,510 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2969,  -0.5820,   0.1777,  -0.0278],
        [ 13.2500,   0.9336,   0.0457,  -0.1177],
        [  0.7734,   0.3887,   0.3223,   1.2344],
        ...,
        [-11.8125,  -0.2070,  -0.8008,  -0.6406],
        [  0.9219,   0.0618,   4.4375,   1.3125],
        [  1.8438,   0.5859,   3.6094,   0.0947]], requires_grad=True)
2025-02-06 20:26:28,647 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3359,  -0.5938,   0.1768,  -0.0303],
        [ 13.6250,   0.9688,   0.0294,  -0.1328],
        [  0.7930,   0.4004,   0.2910,   1.2266],
        ...,
        [-12.0625,  -0.2275,  -0.7773,  -0.6484],
        [  0.9492,   0.0679,   4.4062,   1.3047],
        [  1.9219,   0.6094,   3.5000,   0.0260]], requires_grad=True)
2025-02-06 20:26:28,782 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.3672e+00, -5.9766e-01,  1.7578e-01, -3.3936e-02],
        [ 1.3875e+01,  1.0078e+00,  7.5378e-03, -1.3281e-01],
        [ 8.0859e-01,  4.1016e-01,  2.6172e-01,  1.2266e+00],
        ...,
        [-1.2250e+01, -2.4902e-01, -7.4609e-01, -6.6016e-01],
        [ 9.6484e-01,  7.9102e-02,  4.3125e+00,  1.3203e+00],
        [ 1.9844e+00,  6.2891e-01,  3.3750e+00, -2.7100e-02]],
       requires_grad=True)
2025-02-06 20:26:28,916 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.3906e+00, -5.9766e-01,  1.7285e-01, -3.3691e-02],
        [ 1.4000e+01,  1.0312e+00, -1.2085e-02, -1.3184e-01],
        [ 8.1641e-01,  4.1992e-01,  2.2754e-01,  1.2344e+00],
        ...,
        [-1.2375e+01, -2.6172e-01, -7.2266e-01, -6.6016e-01],
        [ 9.8047e-01,  8.5449e-02,  4.2500e+00,  1.3203e+00],
        [ 2.0312e+00,  6.4062e-01,  3.2812e+00, -7.5684e-02]],
       requires_grad=True)
2025-02-06 20:26:29,065 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.4219e+00, -5.8203e-01,  1.6016e-01, -6.0425e-03],
        [ 1.4125e+01,  1.0469e+00, -1.8555e-02, -1.4648e-01],
        [ 8.2812e-01,  4.2383e-01,  2.1191e-01,  1.2109e+00],
        ...,
        [-1.2562e+01, -2.6172e-01, -7.1484e-01, -6.2891e-01],
        [ 9.8828e-01,  8.6914e-02,  4.1875e+00,  1.2969e+00],
        [ 2.0781e+00,  6.4062e-01,  3.2812e+00, -1.6699e-01]],
       requires_grad=True)
2025-02-06 20:26:29,195 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.4453e+00, -5.7031e-01,  1.4941e-01,  1.5320e-02],
        [ 1.4250e+01,  1.0469e+00, -7.5073e-03, -1.9141e-01],
        [ 8.3984e-01,  4.1602e-01,  2.2168e-01,  1.1250e+00],
        ...,
        [-1.2812e+01, -2.4609e-01, -7.2656e-01, -5.6250e-01],
        [ 1.0078e+00,  7.3242e-02,  4.2500e+00,  1.1875e+00],
        [ 2.1250e+00,  6.2500e-01,  3.3750e+00, -3.0273e-01]],
       requires_grad=True)
2025-02-06 20:26:29,355 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.4453e+00, -5.7422e-01,  1.4844e-01,  6.5918e-03],
        [ 1.4312e+01,  1.0312e+00,  1.4954e-02, -2.5195e-01],
        [ 8.4766e-01,  4.1016e-01,  2.2852e-01,  1.0547e+00],
        ...,
        [-1.3000e+01, -2.2949e-01, -7.3438e-01, -5.0000e-01],
        [ 1.0156e+00,  6.5430e-02,  4.2500e+00,  1.1094e+00],
        [ 2.1719e+00,  6.0938e-01,  3.4688e+00, -4.2578e-01]],
       requires_grad=True)
2025-02-06 20:26:29,501 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.4375e+00, -5.8594e-01,  1.5137e-01, -1.2695e-02],
        [ 1.4312e+01,  1.0312e+00,  1.8677e-02, -2.7344e-01],
        [ 8.4766e-01,  4.1211e-01,  2.1484e-01,  1.0391e+00],
        ...,
        [-1.3125e+01, -2.2754e-01, -7.2656e-01, -4.7266e-01],
        [ 1.0156e+00,  6.6406e-02,  4.1875e+00,  1.0781e+00],
        [ 2.2031e+00,  5.9766e-01,  3.4844e+00, -5.0781e-01]],
       requires_grad=True)
2025-02-06 20:26:29,638 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.4219e+00, -6.0547e-01,  1.5820e-01, -4.4922e-02],
        [ 1.4250e+01,  1.0391e+00,  6.9275e-03, -2.6562e-01],
        [ 8.3984e-01,  4.1797e-01,  1.8555e-01,  1.0625e+00],
        ...,
        [-1.3125e+01, -2.2852e-01, -7.1094e-01, -4.5703e-01],
        [ 1.0156e+00,  7.4219e-02,  4.0938e+00,  1.0859e+00],
        [ 2.2188e+00,  5.8984e-01,  3.4531e+00, -5.5859e-01]],
       requires_grad=True)
2025-02-06 20:26:29,779 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.3984e+00, -6.3281e-01,  1.7383e-01, -9.7168e-02],
        [ 1.4125e+01,  1.0391e+00, -1.0872e-04, -2.6367e-01],
        [ 8.3594e-01,  4.1797e-01,  1.7188e-01,  1.0469e+00],
        ...,
        [-1.3188e+01, -2.1191e-01, -7.1875e-01, -4.0430e-01],
        [ 1.0156e+00,  7.2754e-02,  4.0312e+00,  1.0547e+00],
        [ 2.2344e+00,  5.7812e-01,  3.4531e+00, -6.2109e-01]],
       requires_grad=True)
2025-02-06 20:26:29,921 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3828,  -0.6406,   0.1787,  -0.1230],
        [ 14.0625,   1.0078,   0.0303,  -0.3164],
        [  0.8359,   0.4102,   0.1807,   0.9922],
        ...,
        [-13.2500,  -0.1855,  -0.7344,  -0.3359],
        [  1.0156,   0.0664,   4.0000,   0.9961],
        [  2.2500,   0.5547,   3.5469,  -0.7148]], requires_grad=True)
2025-02-06 20:26:30,053 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3672,  -0.6484,   0.1826,  -0.1455],
        [ 14.0000,   0.9727,   0.0669,  -0.3750],
        [  0.8320,   0.3984,   0.1963,   0.9219],
        ...,
        [-13.3125,  -0.1562,  -0.7500,  -0.2656],
        [  1.0078,   0.0588,   3.9688,   0.9375],
        [  2.2656,   0.5195,   3.7031,  -0.8359]], requires_grad=True)
2025-02-06 20:26:30,189 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3438,  -0.6562,   0.1865,  -0.1689],
        [ 13.8750,   0.9453,   0.0952,  -0.4199],
        [  0.8242,   0.3926,   0.1953,   0.8867],
        ...,
        [-13.3125,  -0.1348,  -0.7578,  -0.2109],
        [  1.0000,   0.0549,   3.9219,   0.8945],
        [  2.2656,   0.4883,   3.8125,  -0.9336]], requires_grad=True)
2025-02-06 20:26:30,321 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3203,  -0.6641,   0.1914,  -0.1914],
        [ 13.6875,   0.9258,   0.1055,  -0.4414],
        [  0.8125,   0.3945,   0.1719,   0.8906],
        ...,
        [-13.2500,  -0.1152,  -0.7617,  -0.1621],
        [  0.9844,   0.0674,   3.7656,   0.9062],
        [  2.2656,   0.4609,   3.8906,  -1.0156]], requires_grad=True)
2025-02-06 20:26:30,463 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2891,  -0.6680,   0.1953,  -0.2119],
        [ 13.5000,   0.9062,   0.1177,  -0.4609],
        [  0.8008,   0.3984,   0.1396,   0.9102],
        ...,
        [-13.1875,  -0.0991,  -0.7617,  -0.1206],
        [  0.9648,   0.0767,   3.6406,   0.9102],
        [  2.2500,   0.4355,   3.9531,  -1.0781]], requires_grad=True)
2025-02-06 20:26:30,605 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2656,  -0.6562,   0.1895,  -0.2148],
        [ 13.3750,   0.8711,   0.1445,  -0.4922],
        [  0.7891,   0.3906,   0.1426,   0.8828],
        ...,
        [-13.1250,  -0.0635,  -0.7812,  -0.0583],
        [  0.9531,   0.0664,   3.6250,   0.8633],
        [  2.2344,   0.4043,   4.0312,  -1.1484]], requires_grad=True)
2025-02-06 20:26:30,738 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.2422e+00, -6.3672e-01,  1.8066e-01, -2.1289e-01],
        [ 1.3250e+01,  8.1250e-01,  2.0215e-01, -5.4688e-01],
        [ 7.7734e-01,  3.7695e-01,  1.5625e-01,  8.4375e-01],
        ...,
        [-1.3062e+01, -2.3071e-02, -8.0859e-01,  5.0049e-03],
        [ 9.4141e-01,  4.9316e-02,  3.6406e+00,  8.0078e-01],
        [ 2.2188e+00,  3.7109e-01,  4.1250e+00, -1.2188e+00]],
       requires_grad=True)
2025-02-06 20:26:30,881 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.2188e+00, -6.2500e-01,  1.7773e-01, -2.1582e-01],
        [ 1.3125e+01,  7.6172e-01,  2.5000e-01, -5.9375e-01],
        [ 7.6562e-01,  3.6523e-01,  1.6309e-01,  8.1250e-01],
        ...,
        [-1.2938e+01,  8.5449e-03, -8.2422e-01,  5.7129e-02],
        [ 9.2969e-01,  3.5156e-02,  3.6406e+00,  7.4609e-01],
        [ 2.2031e+00,  3.4180e-01,  4.1875e+00, -1.2734e+00]],
       requires_grad=True)
2025-02-06 20:26:31,030 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1875,  -0.6211,   0.1777,  -0.2207],
        [ 12.9375,   0.7070,   0.2969,  -0.6328],
        [  0.7500,   0.3594,   0.1514,   0.7969],
        ...,
        [-12.8125,   0.0260,  -0.8281,   0.0952],
        [  0.9102,   0.0471,   3.5000,   0.7344],
        [  2.1719,   0.3242,   4.1875,  -1.3047]], requires_grad=True)
2025-02-06 20:26:31,166 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1484,  -0.6367,   0.1895,  -0.2334],
        [ 12.6875,   0.6992,   0.2852,  -0.6406],
        [  0.7305,   0.3633,   0.1152,   0.7969],
        ...,
        [-12.6250,   0.0251,  -0.8125,   0.1187],
        [  0.8906,   0.0615,   3.3281,   0.7266],
        [  2.1406,   0.3262,   4.0312,  -1.3125]], requires_grad=True)
2025-02-06 20:26:31,308 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1172,  -0.6406,   0.1943,  -0.2422],
        [ 12.4375,   0.6836,   0.2871,  -0.6523],
        [  0.7109,   0.3594,   0.1011,   0.7852],
        ...,
        [-12.4375,   0.0312,  -0.8047,   0.1426],
        [  0.8711,   0.0645,   3.2344,   0.7070],
        [  2.1094,   0.3223,   3.9219,  -1.3203]], requires_grad=True)
2025-02-06 20:26:31,444 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.0859,  -0.6445,   0.1982,  -0.2490],
        [ 12.1875,   0.6758,   0.2773,  -0.6562],
        [  0.6953,   0.3477,   0.1069,   0.7695],
        ...,
        [-12.1875,   0.0481,  -0.8047,   0.1660],
        [  0.8516,   0.0527,   3.2188,   0.6836],
        [  2.0625,   0.3203,   3.7969,  -1.3203]], requires_grad=True)
2025-02-06 20:26:31,587 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.0547,  -0.6250,   0.1924,  -0.2520],
        [ 11.9375,   0.6250,   0.3184,  -0.6602],
        [  0.6836,   0.3223,   0.1494,   0.7461],
        ...,
        [-11.9375,   0.0864,  -0.8281,   0.1885],
        [  0.8320,   0.0223,   3.3125,   0.6562],
        [  2.0156,   0.3008,   3.7969,  -1.3203]], requires_grad=True)
2025-02-06 20:26:31,742 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.0234e+00, -6.2891e-01,  1.9727e-01, -2.5391e-01],
        [ 1.1688e+01,  5.7031e-01,  3.6328e-01, -6.6406e-01],
        [ 6.7188e-01,  2.9688e-01,  1.9043e-01,  7.2266e-01],
        ...,
        [-1.1688e+01,  1.0791e-01, -8.3203e-01,  2.0703e-01],
        [ 8.1641e-01, -1.3275e-03,  3.3750e+00,  6.2891e-01],
        [ 1.9688e+00,  2.9102e-01,  3.7188e+00, -1.3125e+00]],
       requires_grad=True)
2025-02-06 20:26:31,886 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-9.8828e-01, -6.5234e-01,  2.1191e-01, -2.5391e-01],
        [ 1.1375e+01,  5.7812e-01,  3.3398e-01, -6.7188e-01],
        [ 6.5625e-01,  2.8516e-01,  1.9824e-01,  6.9922e-01],
        ...,
        [-1.1375e+01,  9.7656e-02, -8.0469e-01,  2.2656e-01],
        [ 7.9688e-01, -3.4332e-03,  3.3125e+00,  5.9766e-01],
        [ 1.9141e+00,  3.1641e-01,  3.4219e+00, -1.3125e+00]],
       requires_grad=True)
2025-02-06 20:26:32,030 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-9.5312e-01, -6.8750e-01,  2.3145e-01, -2.5000e-01],
        [ 1.1125e+01,  6.2109e-01,  2.5781e-01, -6.8750e-01],
        [ 6.4062e-01,  2.9102e-01,  1.6309e-01,  6.6797e-01],
        ...,
        [-1.1125e+01,  7.4707e-02, -7.6562e-01,  2.4609e-01],
        [ 7.7734e-01,  1.0437e-02,  3.1562e+00,  5.6250e-01],
        [ 1.8672e+00,  3.4570e-01,  3.0781e+00, -1.3125e+00]],
       requires_grad=True)
2025-02-06 20:26:32,164 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.9219,  -0.7109,   0.2461,  -0.2461],
        [ 10.8750,   0.6523,   0.1963,  -0.6992],
        [  0.6250,   0.2910,   0.1445,   0.6406],
        ...,
        [-10.8750,   0.0581,  -0.7305,   0.2617],
        [  0.7578,   0.0166,   3.0469,   0.5352],
        [  1.8203,   0.3711,   2.7812,  -1.3047]], requires_grad=True)
2025-02-06 20:26:32,300 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-8.9062e-01, -7.2266e-01,  2.5195e-01, -2.4512e-01],
        [ 1.0625e+01,  6.6016e-01,  1.6211e-01, -6.9922e-01],
        [ 6.0938e-01,  2.8125e-01,  1.5137e-01,  6.2500e-01],
        ...,
        [-1.0625e+01,  7.3242e-02, -7.2656e-01,  2.6172e-01],
        [ 7.3828e-01,  2.7466e-03,  3.0469e+00,  5.2734e-01],
        [ 1.7734e+00,  3.7500e-01,  2.6250e+00, -1.2812e+00]],
       requires_grad=True)
2025-02-06 20:26:32,437 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8594,  -0.7031,   0.2480,  -0.2500],
        [ 10.3750,   0.6367,   0.1660,  -0.6836],
        [  0.5898,   0.2500,   0.2041,   0.6367],
        ...,
        [-10.3750,   0.1240,  -0.7539,   0.2451],
        [  0.7148,  -0.0447,   3.2188,   0.5508],
        [  1.7266,   0.3574,   2.5938,  -1.2422]], requires_grad=True)
2025-02-06 20:26:32,573 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8242,  -0.6992,   0.2471,  -0.2500],
        [ 10.1250,   0.6289,   0.1533,  -0.6758],
        [  0.5703,   0.2393,   0.2109,   0.6211],
        ...,
        [-10.1250,   0.1504,  -0.7617,   0.2373],
        [  0.6953,  -0.0718,   3.2812,   0.5547],
        [  1.6797,   0.3496,   2.5156,  -1.2109]], requires_grad=True)
2025-02-06 20:26:32,706 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.7891, -0.6992,  0.2500, -0.2451],
        [ 9.8125,  0.6211,  0.1377, -0.6680],
        [ 0.5547,  0.2324,  0.2080,  0.5977],
        ...,
        [-9.8750,  0.1455, -0.7422,  0.2451],
        [ 0.6758, -0.0806,  3.2656,  0.5391],
        [ 1.6406,  0.3438,  2.4219, -1.1797]], requires_grad=True)
2025-02-06 20:26:32,843 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.7539, -0.7070,  0.2559, -0.2354],
        [ 9.5000,  0.6328,  0.1025, -0.6719],
        [ 0.5391,  0.2363,  0.1826,  0.5547],
        ...,
        [-9.6250,  0.1108, -0.6992,  0.2676],
        [ 0.6562, -0.0713,  3.1562,  0.5078],
        [ 1.5938,  0.3691,  2.1562, -1.1797]], requires_grad=True)
2025-02-06 20:26:32,974 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.7148, -0.6797,  0.2480, -0.2373],
        [ 9.1875,  0.5820,  0.1299, -0.6406],
        [ 0.5273,  0.2256,  0.1895,  0.5430],
        ...,
        [-9.2500,  0.1299, -0.6992,  0.2656],
        [ 0.6328, -0.1348,  3.3750,  0.5547],
        [ 1.5547,  0.3730,  2.0156, -1.1562]], requires_grad=True)
2025-02-06 20:26:33,111 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6836, -0.6406,  0.2354, -0.2422],
        [ 9.0000,  0.4551,  0.2432, -0.5625],
        [ 0.5156,  0.1885,  0.2539,  0.5742],
        ...,
        [-9.0000,  0.2148, -0.7539,  0.2285],
        [ 0.6094, -0.2100,  3.6719,  0.6133],
        [ 1.5156,  0.3320,  2.1250, -1.1016]], requires_grad=True)
2025-02-06 20:26:33,264 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6562, -0.6133,  0.2275, -0.2441],
        [ 8.8125,  0.3691,  0.3145, -0.5039],
        [ 0.5039,  0.1729,  0.2734,  0.5781],
        ...,
        [-8.8125,  0.2393, -0.7578,  0.2129],
        [ 0.5898, -0.2441,  3.7656,  0.6367],
        [ 1.4766,  0.2930,  2.2344, -1.0547]], requires_grad=True)
2025-02-06 20:26:33,409 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6172, -0.6445,  0.2441, -0.2314],
        [ 8.6250,  0.4004,  0.2598, -0.4883],
        [ 0.4922,  0.2031,  0.1963,  0.5430],
        ...,
        [-8.6250,  0.2158, -0.7227,  0.2100],
        [ 0.5742, -0.2275,  3.6250,  0.6289],
        [ 1.4297,  0.2793,  2.2031, -1.0156]], requires_grad=True)
2025-02-06 20:26:33,564 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5781, -0.7031,  0.2695, -0.2158],
        [ 8.3750,  0.4492,  0.1895, -0.4766],
        [ 0.4805,  0.2295,  0.1270,  0.5117],
        ...,
        [-8.4375,  0.1797, -0.6797,  0.2100],
        [ 0.5586, -0.1885,  3.3750,  0.6094],
        [ 1.3906,  0.2656,  2.1719, -0.9805]], requires_grad=True)
2025-02-06 20:26:33,715 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5742, -0.6875,  0.2637, -0.2051],
        [ 8.2500,  0.4414,  0.1768, -0.4629],
        [ 0.4785,  0.2295,  0.1143,  0.4863],
        ...,
        [-8.2500,  0.1631, -0.6523,  0.2100],
        [ 0.5508, -0.1816,  3.2812,  0.5938],
        [ 1.3750,  0.2119,  2.3594, -0.9375]], requires_grad=True)
2025-02-06 20:26:33,872 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5625, -0.7070,  0.2715, -0.1992],
        [ 8.1250,  0.4160,  0.1826, -0.4512],
        [ 0.4805,  0.2021,  0.1533,  0.4551],
        ...,
        [-8.1250,  0.1816, -0.6523,  0.2129],
        [ 0.5508, -0.2207,  3.3906,  0.5664],
        [ 1.3672,  0.1289,  2.6875, -0.9102]], requires_grad=True)
2025-02-06 20:26:34,009 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5547, -0.6953,  0.2695, -0.1895],
        [ 8.0625,  0.3633,  0.2139, -0.4473],
        [ 0.4844,  0.1699,  0.2021,  0.4219],
        ...,
        [-8.0625,  0.2275, -0.6719,  0.2197],
        [ 0.5586, -0.2832,  3.5781,  0.5312],
        [ 1.3594,  0.0361,  3.0781, -0.8906]], requires_grad=True)
2025-02-06 20:26:34,144 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5352, -0.7188,  0.2793, -0.1865],
        [ 7.9062,  0.3633,  0.1982, -0.4258],
        [ 0.4844,  0.1455,  0.2363,  0.3945],
        ...,
        [-7.9375,  0.2188, -0.6562,  0.2109],
        [ 0.5586, -0.2910,  3.5625,  0.5273],
        [ 1.3438, -0.0304,  3.3281, -0.8633]], requires_grad=True)
2025-02-06 20:26:34,281 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5078, -0.7539,  0.2930, -0.1865],
        [ 7.6875,  0.3906,  0.1611, -0.3945],
        [ 0.4746,  0.1562,  0.2100,  0.4023],
        ...,
        [-7.7500,  0.1846, -0.6211,  0.1943],
        [ 0.5430, -0.2520,  3.3594,  0.5547],
        [ 1.3203, -0.0742,  3.4844, -0.8242]], requires_grad=True)
2025-02-06 20:26:34,432 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4980, -0.7383,  0.2910, -0.1729],
        [ 7.5312,  0.4043,  0.1348, -0.3711],
        [ 0.4668,  0.1533,  0.2041,  0.3926],
        ...,
        [-7.6562,  0.1777, -0.6055,  0.1895],
        [ 0.5352, -0.2520,  3.2969,  0.5430],
        [ 1.3047, -0.1338,  3.6875, -0.8086]], requires_grad=True)
2025-02-06 20:26:34,586 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4961, -0.6914,  0.2793, -0.1504],
        [ 7.5000,  0.3496,  0.1602, -0.3867],
        [ 0.4648,  0.1221,  0.2393,  0.3398],
        ...,
        [-7.5938,  0.1963, -0.6016,  0.1963],
        [ 0.5352, -0.2773,  3.3281,  0.5039],
        [ 1.2891, -0.2002,  3.9219, -0.8047]], requires_grad=True)
2025-02-06 20:26:34,717 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5078, -0.5938,  0.2539, -0.1123],
        [ 7.5000,  0.2734,  0.2021, -0.4121],
        [ 0.4688,  0.0698,  0.3047,  0.2598],
        ...,
        [-7.6250,  0.2520, -0.6172,  0.2197],
        [ 0.5391, -0.3223,  3.4219,  0.4434],
        [ 1.2891, -0.2891,  4.2188, -0.8242]], requires_grad=True)
2025-02-06 20:26:34,850 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5195, -0.4844,  0.2246, -0.0737],
        [ 7.4375,  0.2285,  0.2217, -0.4238],
        [ 0.4688,  0.0337,  0.3457,  0.2012],
        ...,
        [-7.5625,  0.2793, -0.6172,  0.2305],
        [ 0.5391, -0.3574,  3.4688,  0.3906],
        [ 1.2812, -0.3633,  4.4688, -0.8359]], requires_grad=True)
2025-02-06 20:26:34,979 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5352, -0.3770,  0.1943, -0.0371],
        [ 7.3438,  0.2012,  0.2275, -0.4258],
        [ 0.4629,  0.0278,  0.3418,  0.1768],
        ...,
        [-7.3750,  0.2490, -0.5820,  0.2217],
        [ 0.5312, -0.3633,  3.4062,  0.3633],
        [ 1.2578, -0.3984,  4.5625, -0.8281]], requires_grad=True)
2025-02-06 20:26:35,136 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5352, -0.3145,  0.1777, -0.0096],
        [ 7.2500,  0.1719,  0.2363, -0.4277],
        [ 0.4531,  0.0398,  0.3105,  0.1670],
        ...,
        [-7.1562,  0.1943, -0.5391,  0.2061],
        [ 0.5195, -0.3477,  3.2812,  0.3477],
        [ 1.2344, -0.4219,  4.6250, -0.8164]], requires_grad=True)
2025-02-06 20:26:35,279 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5195, -0.3125,  0.1777,  0.0109],
        [ 7.0938,  0.1777,  0.2207, -0.4238],
        [ 0.4395,  0.0645,  0.2617,  0.1621],
        ...,
        [-6.9062,  0.1309, -0.4902,  0.1904],
        [ 0.5039, -0.3184,  3.1250,  0.3379],
        [ 1.2109, -0.4336,  4.6250, -0.8008]], requires_grad=True)
2025-02-06 20:26:35,410 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5117, -0.2812,  0.1699,  0.0269],
        [ 6.9375,  0.1641,  0.2188, -0.4180],
        [ 0.4277,  0.0703,  0.2402,  0.1621],
        ...,
        [-6.6875,  0.1050, -0.4609,  0.1729],
        [ 0.4922, -0.3105,  3.0469,  0.3340],
        [ 1.1875, -0.4551,  4.6562, -0.7812]], requires_grad=True)
2025-02-06 20:26:35,545 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4961, -0.2402,  0.1592,  0.0400],
        [ 6.7812,  0.1465,  0.2197, -0.4121],
        [ 0.4199,  0.0471,  0.2598,  0.1797],
        ...,
        [-6.5938,  0.1367, -0.4629,  0.1465],
        [ 0.4883, -0.3555,  3.1094,  0.3516],
        [ 1.1719, -0.5078,  4.8125, -0.7500]], requires_grad=True)
2025-02-06 20:26:35,681 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4785, -0.1943,  0.1465,  0.0498],
        [ 6.5938,  0.1348,  0.2178, -0.4062],
        [ 0.4141,  0.0186,  0.2871,  0.2012],
        ...,
        [-6.4688,  0.2021, -0.4824,  0.1133],
        [ 0.4844, -0.4219,  3.2344,  0.3828],
        [ 1.1562, -0.5508,  4.9062, -0.7227]], requires_grad=True)
2025-02-06 20:26:35,814 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4609, -0.1670,  0.1396,  0.0620],
        [ 6.4375,  0.1182,  0.2188, -0.3984],
        [ 0.4062,  0.0081,  0.2930,  0.2041],
        ...,
        [-6.3438,  0.2490, -0.4941,  0.0864],
        [ 0.4785, -0.4609,  3.2969,  0.3945],
        [ 1.1406, -0.5703,  4.9062, -0.7070]], requires_grad=True)
2025-02-06 20:26:35,952 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4492, -0.1484,  0.1338,  0.0732],
        [ 6.2812,  0.1216,  0.2080, -0.3984],
        [ 0.3984,  0.0156,  0.2773,  0.1836],
        ...,
        [-6.1875,  0.2520, -0.4844,  0.0786],
        [ 0.4727, -0.4629,  3.2656,  0.3750],
        [ 1.1172, -0.5625,  4.8125, -0.7109]], requires_grad=True)
2025-02-06 20:26:36,084 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4316, -0.2002,  0.1426,  0.1011],
        [ 6.1250,  0.1475,  0.1855, -0.4082],
        [ 0.3906,  0.0219,  0.2617,  0.1650],
        ...,
        [-6.0312,  0.2305, -0.4648,  0.0806],
        [ 0.4629, -0.4277,  3.1406,  0.3242],
        [ 1.0938, -0.5312,  4.6562, -0.7266]], requires_grad=True)
2025-02-06 20:26:36,216 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4023, -0.2441,  0.1504,  0.1270],
        [ 5.9688,  0.1475,  0.1768, -0.4082],
        [ 0.3867,  0.0100,  0.2676,  0.1689],
        ...,
        [-5.9062,  0.2559, -0.4648,  0.0679],
        [ 0.4551, -0.4180,  3.0781,  0.2969],
        [ 1.0625, -0.5156,  4.5625, -0.7305]], requires_grad=True)
2025-02-06 20:26:36,358 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3809, -0.2490,  0.1504,  0.1436],
        [ 5.8438,  0.1187,  0.1846, -0.3965],
        [ 0.3848, -0.0220,  0.2949,  0.1885],
        ...,
        [-5.8125,  0.3262, -0.4824,  0.0447],
        [ 0.4492, -0.4648,  3.1250,  0.3008],
        [ 1.0391, -0.5234,  4.5312, -0.7227]], requires_grad=True)
2025-02-06 20:26:36,493 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3633, -0.2480,  0.1484,  0.1572],
        [ 5.6875,  0.1021,  0.1865, -0.3867],
        [ 0.3809, -0.0518,  0.3203,  0.2061],
        ...,
        [-5.6875,  0.3926, -0.4980,  0.0237],
        [ 0.4375, -0.4980,  3.1562,  0.3008],
        [ 1.0156, -0.5234,  4.4688, -0.7148]], requires_grad=True)
2025-02-06 20:26:36,628 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3379, -0.2695,  0.1514,  0.1699],
        [ 5.5000,  0.1152,  0.1738, -0.3809],
        [ 0.3711, -0.0593,  0.3223,  0.2148],
        ...,
        [-5.5625,  0.4414, -0.5078,  0.0061],
        [ 0.4160, -0.4980,  3.1094,  0.2910],
        [ 0.9766, -0.4941,  4.3438, -0.7148]], requires_grad=True)
2025-02-06 20:26:36,760 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3281, -0.2578,  0.1475,  0.1797],
        [ 5.3125,  0.1211,  0.1660, -0.3750],
        [ 0.3594, -0.0615,  0.3203,  0.2207],
        ...,
        [-5.4062,  0.4785, -0.5117, -0.0090],
        [ 0.3926, -0.4805,  3.0156,  0.2812],
        [ 0.9414, -0.4668,  4.2188, -0.7109]], requires_grad=True)
2025-02-06 20:26:36,894 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3203, -0.2363,  0.1406,  0.1865],
        [ 5.1562,  0.0977,  0.1748, -0.3691],
        [ 0.3496, -0.0747,  0.3301,  0.2246],
        ...,
        [-5.2812,  0.5391, -0.5273, -0.0214],
        [ 0.3750, -0.4902,  2.9844,  0.2695],
        [ 0.9141, -0.4609,  4.1562, -0.7070]], requires_grad=True)
2025-02-06 20:26:37,036 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3398, -0.1260,  0.1133,  0.1982],
        [ 4.9688,  0.0815,  0.1797, -0.3613],
        [ 0.3438, -0.1016,  0.3594,  0.2227],
        ...,
        [-5.2500,  0.6172, -0.5547, -0.0295],
        [ 0.3633, -0.5156,  3.0156,  0.2539],
        [ 0.8945, -0.4785,  4.1562, -0.7070]], requires_grad=True)
2025-02-06 20:26:37,179 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3438, -0.0344,  0.0908,  0.2090],
        [ 4.6562,  0.1748,  0.1177, -0.3281],
        [ 0.3301, -0.0986,  0.3477,  0.2354],
        ...,
        [-5.0625,  0.6133, -0.5430, -0.0510],
        [ 0.3359, -0.4922,  2.9062,  0.2598],
        [ 0.8633, -0.4512,  4.0000, -0.6875]], requires_grad=True)
2025-02-06 20:26:37,313 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3516,  0.0820,  0.0620,  0.2246],
        [ 4.4062,  0.2441,  0.0713, -0.3027],
        [ 0.3164, -0.0986,  0.3418,  0.2432],
        ...,
        [-4.9375,  0.6367, -0.5469, -0.0620],
        [ 0.3105, -0.4727,  2.7969,  0.2617],
        [ 0.8359, -0.4414,  3.9062, -0.6758]], requires_grad=True)
2025-02-06 20:26:37,466 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3594,  0.1914,  0.0347,  0.2383],
        [ 4.1562,  0.3047,  0.0298, -0.2793],
        [ 0.3027, -0.1006,  0.3379,  0.2480],
        ...,
        [-4.7812,  0.6484, -0.5430, -0.0742],
        [ 0.2852, -0.4570,  2.7188,  0.2617],
        [ 0.8086, -0.4316,  3.8125, -0.6641]], requires_grad=True)
2025-02-06 20:26:37,620 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4102,  0.3848, -0.0167,  0.2695],
        [ 4.0938,  0.3047,  0.0325, -0.2734],
        [ 0.3105, -0.1445,  0.3984,  0.2148],
        ...,
        [-4.6875,  0.6719, -0.5469, -0.0811],
        [ 0.2773, -0.4785,  2.7500,  0.2363],
        [ 0.7891, -0.4297,  3.7656, -0.6562]], requires_grad=True)
2025-02-06 20:26:37,780 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4609,  0.5742, -0.0669,  0.3008],
        [ 4.0312,  0.2812,  0.0500, -0.2773],
        [ 0.3125, -0.1787,  0.4414,  0.1885],
        ...,
        [-4.6562,  0.7109, -0.5586, -0.0796],
        [ 0.2656, -0.4941,  2.7656,  0.2148],
        [ 0.7617, -0.4121,  3.6562, -0.6406]], requires_grad=True)
2025-02-06 20:26:37,908 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5156,  0.7539, -0.1157,  0.3301],
        [ 3.7656,  0.4121, -0.0422, -0.2207],
        [ 0.3027, -0.1611,  0.4102,  0.2139],
        ...,
        [-4.2812,  0.5703, -0.4727, -0.1387],
        [ 0.2246, -0.3828,  2.3750,  0.2891],
        [ 0.7227, -0.3398,  3.3281, -0.5859]], requires_grad=True)
2025-02-06 20:26:38,053 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5703,  0.9883, -0.1777,  0.3711],
        [ 3.9219,  0.4336, -0.0542, -0.1943],
        [ 0.3145, -0.2061,  0.4668,  0.1846],
        ...,
        [-3.9844,  0.5273, -0.4336, -0.1650],
        [ 0.2158, -0.3906,  2.3281,  0.2852],
        [ 0.7500, -0.4141,  3.5312, -0.6094]], requires_grad=True)
2025-02-06 20:26:38,202 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6211,  1.2031, -0.2354,  0.4082],
        [ 4.0625,  0.3965, -0.0330, -0.1855],
        [ 0.3301, -0.2871,  0.5664,  0.1279],
        ...,
        [-3.9062,  0.6250, -0.4648, -0.1533],
        [ 0.2266, -0.5117,  2.6094,  0.2188],
        [ 0.7812, -0.5234,  3.8281, -0.6406]], requires_grad=True)
2025-02-06 20:26:38,337 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6602,  1.1797, -0.2383,  0.4258],
        [ 4.3125,  0.5000, -0.0889, -0.1562],
        [ 0.3379, -0.2598,  0.5352,  0.1060],
        ...,
        [-3.7812,  0.5703, -0.4258, -0.1562],
        [ 0.2188, -0.4766,  2.4531,  0.1865],
        [ 0.7969, -0.6289,  4.1250, -0.6719]], requires_grad=True)
2025-02-06 20:26:38,470 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6914,  1.0469, -0.2188,  0.4355],
        [ 4.5312,  0.6641, -0.1787, -0.1279],
        [ 0.3418, -0.1855,  0.4512,  0.0903],
        ...,
        [-3.7344,  0.4219, -0.3496, -0.1631],
        [ 0.1934, -0.3340,  2.0312,  0.1592],
        [ 0.8203, -0.6328,  4.1250, -0.6914]], requires_grad=True)
2025-02-06 20:26:38,604 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6602,  0.8359, -0.1807,  0.4590],
        [ 5.0938,  0.7617, -0.2246, -0.0825],
        [ 0.3633, -0.1855,  0.4414,  0.1128],
        ...,
        [-3.7812,  0.3223, -0.2949, -0.1748],
        [ 0.2539, -0.3223,  1.9453,  0.2002],
        [ 0.8281, -0.6172,  4.0625, -0.7148]], requires_grad=True)
2025-02-06 20:26:38,740 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-6.3281e-01,  7.2266e-01, -1.5430e-01,  4.6680e-01],
        [ 5.6562e+00,  6.9141e-01, -2.1680e-01,  4.1504e-03],
        [ 3.8086e-01, -2.1777e-01,  4.5117e-01,  1.5918e-01],
        ...,
        [-4.0312e+00,  4.0820e-01, -2.8906e-01, -2.3438e-01],
        [ 3.2812e-01, -4.8438e-01,  2.1094e+00,  3.4375e-01],
        [ 8.3594e-01, -6.4844e-01,  4.0625e+00, -7.1094e-01]],
       requires_grad=True)
2025-02-06 20:26:38,874 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6133,  0.6680, -0.1338,  0.4609],
        [ 6.1562,  0.6445, -0.2109,  0.0742],
        [ 0.4004, -0.1807,  0.4414,  0.1357],
        ...,
        [-4.2188,  0.3926, -0.2734, -0.2559],
        [ 0.3867, -0.5703,  2.2031,  0.4238],
        [ 0.8594, -0.6328,  4.0312, -0.7266]], requires_grad=True)
2025-02-06 20:26:39,009 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5703,  0.5781, -0.1152,  0.4648],
        [ 6.5000,  0.7148, -0.2080,  0.0884],
        [ 0.4121, -0.0801,  0.4297,  0.0417],
        ...,
        [-4.4375,  0.2910, -0.2598, -0.2441],
        [ 0.4297, -0.5781,  2.2812,  0.4375],
        [ 0.8789, -0.5312,  4.0000, -0.7852]], requires_grad=True)
2025-02-06 20:26:39,144 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5312,  0.4980, -0.0986,  0.4688],
        [ 6.7500,  0.8438, -0.1953,  0.0874],
        [ 0.4199,  0.0219,  0.4219, -0.0491],
        ...,
        [-4.6250,  0.1709, -0.2500, -0.2275],
        [ 0.4609, -0.5508,  2.3594,  0.4336],
        [ 0.8945, -0.4258,  3.9688, -0.8398]], requires_grad=True)
2025-02-06 20:26:39,278 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4824,  0.4062, -0.0859,  0.4688],
        [ 6.9688,  0.9180, -0.1953,  0.0776],
        [ 0.4258,  0.0840,  0.3945, -0.1465],
        ...,
        [-4.8438,  0.1226, -0.2246, -0.1992],
        [ 0.4922, -0.5664,  2.3438,  0.4082],
        [ 0.9141, -0.3750,  3.8438, -0.9062]], requires_grad=True)
2025-02-06 20:26:39,414 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4375,  0.3223, -0.0752,  0.4668],
        [ 7.1562,  0.9570, -0.2012,  0.0583],
        [ 0.4316,  0.1040,  0.3496, -0.2773],
        ...,
        [-5.0000,  0.1309, -0.1904, -0.1514],
        [ 0.5195, -0.6094,  2.2969,  0.3535],
        [ 0.9258, -0.3477,  3.7188, -0.9727]], requires_grad=True)
2025-02-06 20:26:39,573 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3945,  0.2793, -0.0659,  0.4746],
        [ 7.3125,  0.9688, -0.2031,  0.0299],
        [ 0.4355,  0.1089,  0.3125, -0.4102],
        ...,
        [-5.1250,  0.1621, -0.1611, -0.0962],
        [ 0.5430, -0.6797,  2.2656,  0.2637],
        [ 0.9375, -0.3281,  3.5938, -1.0312]], requires_grad=True)
2025-02-06 20:26:39,706 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3496,  0.1543, -0.0454,  0.4492],
        [ 7.3750,  1.0078, -0.2158,  0.0200],
        [ 0.4355,  0.1523,  0.2480, -0.4629],
        ...,
        [-5.2500,  0.1689, -0.1299, -0.0583],
        [ 0.5664, -0.7031,  2.1719,  0.2285],
        [ 0.9453, -0.3105,  3.4688, -1.0781]], requires_grad=True)
2025-02-06 20:26:39,837 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3066,  0.0417, -0.0266,  0.4258],
        [ 7.3438,  1.0781, -0.2520,  0.0282],
        [ 0.4258,  0.2061,  0.1699, -0.4941],
        ...,
        [-5.3125,  0.1514, -0.0923, -0.0337],
        [ 0.5820, -0.7031,  2.0312,  0.2119],
        [ 0.9531, -0.2832,  3.3281, -1.1094]], requires_grad=True)
2025-02-06 20:26:39,971 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2578, -0.0427, -0.0131,  0.4043],
        [ 7.3125,  1.1797, -0.3125,  0.0420],
        [ 0.4121,  0.2500,  0.1035, -0.5195],
        ...,
        [-5.3750,  0.1279, -0.0554, -0.0129],
        [ 0.5977, -0.6797,  1.8359,  0.2031],
        [ 0.9570, -0.2656,  3.2188, -1.1406]], requires_grad=True)
2025-02-06 20:26:40,105 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-2.1680e-01, -1.0645e-01, -4.4861e-03,  3.8477e-01],
        [ 7.3438e+00,  1.2188e+00, -3.2812e-01,  6.0791e-02],
        [ 4.0039e-01,  2.7734e-01,  6.1035e-02, -5.3906e-01],
        ...,
        [-5.4688e+00,  1.3184e-01, -3.9307e-02,  1.8387e-03],
        [ 6.0547e-01, -6.6016e-01,  1.6797e+00,  1.9727e-01],
        [ 9.6484e-01, -2.6758e-01,  3.1875e+00, -1.1641e+00]],
       requires_grad=True)
2025-02-06 20:26:40,261 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-2.3926e-01, -1.0596e-01, -2.0874e-02,  3.5547e-01],
        [ 7.5000e+00,  1.2188e+00, -3.0859e-01,  8.6914e-02],
        [ 3.9844e-01,  2.9492e-01,  3.8818e-02, -5.4297e-01],
        ...,
        [-5.6875e+00,  1.7480e-01, -5.5664e-02,  5.8289e-03],
        [ 6.3281e-01, -6.7188e-01,  1.6797e+00,  2.0898e-01],
        [ 9.7266e-01, -2.7539e-01,  3.1875e+00, -1.1719e+00]],
       requires_grad=True)
2025-02-06 20:26:40,467 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-2.6172e-01, -8.3008e-02, -4.3701e-02,  3.2617e-01],
        [ 7.6250e+00,  1.1797e+00, -2.5586e-01,  1.1621e-01],
        [ 3.9844e-01,  2.9102e-01,  5.3711e-02, -5.3516e-01],
        ...,
        [-5.8438e+00,  2.2168e-01, -7.7148e-02,  7.4768e-03],
        [ 6.5625e-01, -6.9531e-01,  1.7422e+00,  2.2559e-01],
        [ 9.7266e-01, -2.9688e-01,  3.2656e+00, -1.1719e+00]],
       requires_grad=True)
2025-02-06 20:26:40,600 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3145, -0.1235, -0.0420,  0.2988],
        [ 7.7500,  1.1719, -0.2402,  0.1406],
        [ 0.4043,  0.3008,  0.0437, -0.5273],
        ...,
        [-6.0312,  0.2393, -0.0791,  0.0089],
        [ 0.6797, -0.6875,  1.6562,  0.2363],
        [ 0.9688, -0.2910,  3.2031, -1.1719]], requires_grad=True)
2025-02-06 20:26:40,736 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-3.5547e-01, -1.8848e-01, -2.6367e-02,  2.7539e-01],
        [ 7.8125e+00,  1.2031e+00, -2.6172e-01,  1.5918e-01],
        [ 4.0820e-01,  3.2031e-01,  2.4261e-03, -5.1953e-01],
        ...,
        [-6.1562e+00,  2.3926e-01, -6.6406e-02,  1.1169e-02],
        [ 6.9531e-01, -6.5234e-01,  1.4531e+00,  2.4121e-01],
        [ 9.6094e-01, -2.8711e-01,  3.1250e+00, -1.1719e+00]],
       requires_grad=True)
2025-02-06 20:26:40,869 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3809, -0.2178, -0.0262,  0.2520],
        [ 7.8438,  1.2188, -0.2773,  0.1748],
        [ 0.4102,  0.3398, -0.0422, -0.5117],
        ...,
        [-6.2500,  0.2520, -0.0703,  0.0115],
        [ 0.7070, -0.6289,  1.3281,  0.2480],
        [ 0.9531, -0.2793,  3.0156, -1.1641]], requires_grad=True)
2025-02-06 20:26:41,003 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4297, -0.2051, -0.0522,  0.2236],
        [ 7.8750,  1.2188, -0.2715,  0.1924],
        [ 0.4180,  0.3516, -0.0596, -0.4980],
        ...,
        [-6.2188,  0.2617, -0.0708,  0.0132],
        [ 0.7188, -0.6133,  1.2656,  0.2559],
        [ 0.9336, -0.2715,  2.8906, -1.1562]], requires_grad=True)
2025-02-06 20:26:41,140 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-4.6680e-01, -1.8652e-01, -7.9590e-02,  1.9727e-01],
        [ 7.8438e+00,  1.1953e+00, -2.3145e-01,  2.0996e-01],
        [ 4.1992e-01,  3.4766e-01, -2.5879e-02, -4.7656e-01],
        ...,
        [-6.0312e+00,  3.2031e-01, -1.3965e-01,  6.8054e-03],
        [ 7.1484e-01, -6.2891e-01,  1.4688e+00,  2.7344e-01],
        [ 9.1406e-01, -2.6953e-01,  2.8438e+00, -1.1484e+00]],
       requires_grad=True)
2025-02-06 20:26:41,296 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-4.9805e-01, -1.6602e-01, -1.0498e-01,  1.7383e-01],
        [ 7.7812e+00,  1.1719e+00, -2.0508e-01,  2.2461e-01],
        [ 4.1797e-01,  3.4961e-01, -1.8433e-02, -4.6094e-01],
        ...,
        [-5.8125e+00,  3.6523e-01, -1.9336e-01,  2.1820e-03],
        [ 7.1094e-01, -6.4062e-01,  1.6250e+00,  2.8711e-01],
        [ 8.9062e-01, -2.6562e-01,  2.7812e+00, -1.1406e+00]],
       requires_grad=True)
2025-02-06 20:26:41,433 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-5.1172e-01, -2.1680e-01, -9.0332e-02,  1.5723e-01],
        [ 7.6562e+00,  1.1797e+00, -2.1680e-01,  2.3242e-01],
        [ 4.1211e-01,  3.6523e-01, -5.2734e-02, -4.5117e-01],
        ...,
        [-5.5625e+00,  3.9258e-01, -2.2852e-01, -6.9809e-04],
        [ 6.9922e-01, -6.0938e-01,  1.5312e+00,  2.9102e-01],
        [ 8.6719e-01, -2.4902e-01,  2.6406e+00, -1.1250e+00]],
       requires_grad=True)
2025-02-06 20:26:41,638 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-5.0781e-01, -3.3008e-01, -4.1992e-02,  1.4062e-01],
        [ 7.5312e+00,  1.2109e+00, -2.6953e-01,  2.4023e-01],
        [ 4.0234e-01,  3.9648e-01, -1.3574e-01, -4.3945e-01],
        ...,
        [-5.3438e+00,  4.0625e-01, -2.4707e-01, -3.5858e-03],
        [ 6.7969e-01, -5.4688e-01,  1.2578e+00,  2.9688e-01],
        [ 8.3984e-01, -2.1289e-01,  2.3594e+00, -1.1094e+00]],
       requires_grad=True)
2025-02-06 20:26:41,789 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-5.0781e-01, -4.2383e-01, -1.8768e-03,  1.2598e-01],
        [ 7.4375e+00,  1.1719e+00, -2.3633e-01,  2.3633e-01],
        [ 3.9453e-01,  4.1016e-01, -1.7578e-01, -4.3164e-01],
        ...,
        [-5.1562e+00,  4.3750e-01, -2.8125e-01, -3.1891e-03],
        [ 6.6016e-01, -5.0781e-01,  1.0781e+00,  2.9492e-01],
        [ 8.2422e-01, -2.1094e-01,  2.2969e+00, -1.1016e+00]],
       requires_grad=True)
2025-02-06 20:26:41,918 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5078, -0.4668,  0.0159,  0.1206],
        [ 7.3438,  1.0781, -0.1318,  0.2109],
        [ 0.3867,  0.4102, -0.1826, -0.4355],
        ...,
        [-4.9688,  0.5156, -0.3633,  0.0141],
        [ 0.6406, -0.5078,  1.1250,  0.2656],
        [ 0.8086, -0.2373,  2.4219, -1.1016]], requires_grad=True)
2025-02-06 20:26:42,051 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-5.1172e-01, -4.8633e-01,  2.4170e-02,  1.1914e-01],
        [ 7.2500e+00,  9.5703e-01, -6.2561e-03,  1.7676e-01],
        [ 3.7891e-01,  3.9453e-01, -1.5820e-01, -4.4922e-01],
        ...,
        [-4.7500e+00,  6.0547e-01, -4.4922e-01,  3.6133e-02],
        [ 6.2109e-01, -5.2344e-01,  1.2422e+00,  2.2754e-01],
        [ 7.8906e-01, -2.8516e-01,  2.6562e+00, -1.1172e+00]],
       requires_grad=True)
2025-02-06 20:26:42,192 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5039, -0.5469,  0.0471,  0.1094],
        [ 7.1250,  0.8789,  0.0752,  0.1562],
        [ 0.3652,  0.4023, -0.1826, -0.4414],
        ...,
        [-4.4688,  0.6250, -0.4824,  0.0398],
        [ 0.5938, -0.4980,  1.1719,  0.2178],
        [ 0.7656, -0.2969,  2.7031, -1.1094]], requires_grad=True)
2025-02-06 20:26:42,324 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5039, -0.6289,  0.0771,  0.0942],
        [ 6.9375,  0.8828,  0.0767,  0.1592],
        [ 0.3516,  0.4004, -0.1904, -0.4395],
        ...,
        [-4.2188,  0.6328, -0.5039,  0.0400],
        [ 0.5664, -0.4395,  0.9766,  0.2295],
        [ 0.7344, -0.2930,  2.6875, -1.0938]], requires_grad=True)
2025-02-06 20:26:42,462 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5039, -0.6836,  0.0991,  0.0830],
        [ 6.9062,  0.7969,  0.1562,  0.1387],
        [ 0.3516,  0.3711, -0.1436, -0.4551],
        ...,
        [-4.1562,  0.6758, -0.5469,  0.0486],
        [ 0.5664, -0.4316,  0.9805,  0.2168],
        [ 0.7031, -0.3027,  2.7188, -1.0859]], requires_grad=True)
2025-02-06 20:26:42,609 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5156, -0.6680,  0.0957,  0.0815],
        [ 6.9375,  0.6719,  0.2676,  0.1104],
        [ 0.3535,  0.3340, -0.0859, -0.4727],
        ...,
        [-4.0938,  0.7266, -0.5938,  0.0586],
        [ 0.5664, -0.4434,  1.0625,  0.1953],
        [ 0.6875, -0.3438,  2.9062, -1.0859]], requires_grad=True)
2025-02-06 20:26:42,743 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5156, -0.6758,  0.1016,  0.0771],
        [ 6.8438,  0.6133,  0.3164,  0.0947],
        [ 0.3457,  0.3262, -0.0811, -0.4746],
        ...,
        [-3.9531,  0.7500, -0.6172,  0.0645],
        [ 0.5586, -0.4336,  1.0469,  0.1836],
        [ 0.6680, -0.3711,  3.0000, -1.0781]], requires_grad=True)
2025-02-06 20:26:42,875 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5039, -0.7109,  0.1172,  0.0698],
        [ 6.6875,  0.5820,  0.3359,  0.0850],
        [ 0.3359,  0.3320, -0.1040, -0.4688],
        ...,
        [-3.7812,  0.7578, -0.6289,  0.0679],
        [ 0.5469, -0.4199,  1.0234,  0.1729],
        [ 0.6328, -0.3555,  2.9062, -1.0625]], requires_grad=True)
2025-02-06 20:26:43,011 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4902, -0.7383,  0.1289,  0.0640],
        [ 6.5625,  0.5508,  0.3574,  0.0752],
        [ 0.3242,  0.3340, -0.1187, -0.4629],
        ...,
        [-3.6406,  0.7852, -0.6523,  0.0737],
        [ 0.5352, -0.4023,  0.9688,  0.1660],
        [ 0.6016, -0.3496,  2.8438, -1.0469]], requires_grad=True)
2025-02-06 20:26:43,154 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4766, -0.7969,  0.1514,  0.0557],
        [ 6.4688,  0.4375,  0.4434,  0.0571],
        [ 0.3164,  0.3027, -0.0796, -0.4668],
        ...,
        [-3.5469,  0.9023, -0.7305,  0.0884],
        [ 0.5312, -0.4707,  1.2266,  0.1406],
        [ 0.5781, -0.3496,  2.8281, -1.0312]], requires_grad=True)
2025-02-06 20:26:43,308 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4531, -0.8789,  0.1787,  0.0483],
        [ 6.3750,  0.3340,  0.5156,  0.0413],
        [ 0.3027,  0.2871, -0.0674, -0.4688],
        ...,
        [-3.4219,  0.9961, -0.7891,  0.1016],
        [ 0.5234, -0.5195,  1.4062,  0.1182],
        [ 0.5586, -0.3477,  2.7969, -1.0156]], requires_grad=True)
2025-02-06 20:26:43,445 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3945, -0.9805,  0.2148,  0.0449],
        [ 6.2812,  0.3145,  0.5273,  0.0251],
        [ 0.2891,  0.2754, -0.0586, -0.4688],
        ...,
        [-3.2344,  1.0312, -0.8203,  0.1143],
        [ 0.5039, -0.4609,  1.2578,  0.0898],
        [ 0.5234, -0.3262,  2.6875, -1.0000]], requires_grad=True)
2025-02-06 20:26:43,585 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-3.1445e-01, -1.3203e+00,  3.0664e-01,  5.5908e-02],
        [ 6.0312e+00,  4.2773e-01,  4.5117e-01, -2.7771e-03],
        [ 2.7344e-01,  3.1445e-01, -1.1719e-01, -4.8047e-01],
        ...,
        [-2.9688e+00,  9.6875e-01, -7.9688e-01,  1.3281e-01],
        [ 4.7852e-01, -3.5352e-01,  9.5703e-01,  5.2979e-02],
        [ 4.8828e-01, -2.5977e-01,  2.4219e+00, -9.8828e-01]],
       requires_grad=True)
2025-02-06 20:26:43,716 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-2.1973e-01, -1.5156e+00,  3.6719e-01,  5.8105e-02],
        [ 5.7812e+00,  3.3008e-01,  4.8633e-01,  2.3193e-03],
        [ 2.5586e-01,  2.9297e-01, -1.1035e-01, -4.6875e-01],
        ...,
        [-2.6719e+00,  1.0391e+00, -8.2812e-01,  1.3184e-01],
        [ 4.6094e-01, -3.6133e-01,  9.4141e-01,  5.3711e-02],
        [ 4.5898e-01, -3.2812e-01,  2.5469e+00, -9.4141e-01]],
       requires_grad=True)
2025-02-06 20:26:43,869 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2314, -1.5234,  0.3848,  0.0417],
        [ 5.6875,  0.1973,  0.5430,  0.0161],
        [ 0.2598,  0.2021, -0.0243, -0.4238],
        ...,
        [-2.8125,  1.2344, -0.9102,  0.1104],
        [ 0.4902, -0.5156,  1.2734,  0.1030],
        [ 0.4961, -0.5195,  3.0156, -0.8672]], requires_grad=True)
2025-02-06 20:26:44,003 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2285, -1.5625,  0.4062,  0.0289],
        [ 5.5625,  0.0898,  0.5859,  0.0269],
        [ 0.2598,  0.1309,  0.0420, -0.3848],
        ...,
        [-2.9375,  1.4141, -0.9844,  0.0903],
        [ 0.5117, -0.6406,  1.5469,  0.1445],
        [ 0.5352, -0.7344,  3.5312, -0.7930]], requires_grad=True)
2025-02-06 20:26:44,142 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2402, -1.5391,  0.4160,  0.0190],
        [ 5.3438,  0.0432,  0.6055,  0.0381],
        [ 0.2539,  0.0864,  0.0854, -0.3477],
        ...,
        [-3.0156,  1.5703, -1.0469,  0.0728],
        [ 0.5234, -0.7266,  1.7422,  0.1826],
        [ 0.5586, -0.8867,  3.9062, -0.7227]], requires_grad=True)
2025-02-06 20:26:44,273 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2539, -1.5312,  0.4258,  0.0085],
        [ 4.9062,  0.1377,  0.5664,  0.0728],
        [ 0.2373,  0.1172,  0.0737, -0.2793],
        ...,
        [-3.0312,  1.6641, -1.0859,  0.0505],
        [ 0.5234, -0.7656,  1.8594,  0.2275],
        [ 0.5781, -1.0156,  4.2188, -0.6562]], requires_grad=True)
2025-02-06 20:26:44,408 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-2.6562e-01, -1.5078e+00,  4.2969e-01,  2.0752e-03],
        [ 4.4688e+00,  2.3438e-01,  5.3125e-01,  1.0889e-01],
        [ 2.1875e-01,  1.6992e-01,  4.6631e-02, -1.9043e-01],
        ...,
        [-3.0469e+00,  1.6797e+00, -1.1094e+00,  9.2163e-03],
        [ 5.1172e-01, -7.3438e-01,  1.8672e+00,  3.1641e-01],
        [ 6.1328e-01, -1.1641e+00,  4.5625e+00, -6.2500e-01]],
       requires_grad=True)
2025-02-06 20:26:44,539 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2754, -1.4922,  0.4336, -0.0076],
        [ 4.2500,  0.3086,  0.5039,  0.1338],
        [ 0.2080,  0.2227,  0.0211, -0.0977],
        ...,
        [-3.1406,  1.6406, -1.1172, -0.0605],
        [ 0.5117, -0.6836,  1.8516,  0.4199],
        [ 0.6445, -1.3359,  4.9062, -0.6328]], requires_grad=True)
2025-02-06 20:26:44,673 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-2.5195e-01, -1.5938e+00,  4.4531e-01, -8.8379e-02],
        [ 4.1875e+00,  3.1055e-01,  4.8828e-01,  9.5703e-02],
        [ 1.9824e-01,  2.6758e-01, -1.3199e-03, -1.4526e-02],
        ...,
        [-3.6719e+00,  1.7578e+00, -1.1406e+00,  1.6968e-02],
        [ 5.4688e-01, -7.5391e-01,  1.9375e+00,  2.8906e-01],
        [ 6.8750e-01, -1.5078e+00,  5.1875e+00, -6.8750e-01]],
       requires_grad=True)
2025-02-06 20:26:44,806 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2236, -1.6797,  0.4531, -0.1582],
        [ 4.0938,  0.3027,  0.4746,  0.0554],
        [ 0.1914,  0.2852, -0.0129,  0.0088],
        ...,
        [-4.1250,  1.8906, -1.1641,  0.1172],
        [ 0.5781, -0.8516,  2.0469,  0.1016],
        [ 0.7266, -1.6641,  5.4375, -0.7500]], requires_grad=True)
2025-02-06 20:26:44,931 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.1885, -1.8047,  0.4648, -0.2383],
        [ 4.0000,  0.4805,  0.4219,  0.1426],
        [ 0.1846,  0.3340, -0.0383,  0.0889],
        ...,
        [-4.4375,  1.8672, -1.1641,  0.1245],
        [ 0.6055, -0.9062,  2.0938, -0.0284],
        [ 0.7617, -1.6719,  5.5000, -0.6719]], requires_grad=True)
2025-02-06 20:26:45,079 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.1455, -1.9375,  0.4766, -0.3164],
        [ 3.9375,  0.7500,  0.3477,  0.2754],
        [ 0.1797,  0.4082, -0.0762,  0.2051],
        ...,
        [-4.7500,  1.7734, -1.1484,  0.0986],
        [ 0.6289, -0.8672,  2.0469, -0.0537],
        [ 0.7930, -1.5938,  5.4375, -0.5352]], requires_grad=True)
2025-02-06 20:26:45,211 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.1118, -1.9453,  0.4785, -0.3613],
        [ 3.7969,  0.9531,  0.2871,  0.3809],
        [ 0.1729,  0.4297, -0.0933,  0.2637],
        ...,
        [-5.0312,  1.8125, -1.1484,  0.1172],
        [ 0.6484, -0.9141,  2.0625, -0.1367],
        [ 0.8164, -1.5781,  5.4375, -0.4473]], requires_grad=True)
2025-02-06 20:26:45,360 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.0845, -1.9062,  0.4766, -0.3926],
        [ 3.6719,  1.1016,  0.2373,  0.4648],
        [ 0.1670,  0.4238, -0.1016,  0.3008],
        ...,
        [-5.2500,  1.8906, -1.1484,  0.1426],
        [ 0.6602, -0.9805,  2.0938, -0.2207],
        [ 0.8359, -1.5547,  5.4062, -0.3691]], requires_grad=True)
2025-02-06 20:26:45,493 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.0635, -1.8359,  0.4727, -0.4199],
        [ 3.5469,  1.2422,  0.1914,  0.5391],
        [ 0.1602,  0.4180, -0.1084,  0.3320],
        ...,
        [-5.4062,  1.9297, -1.1406,  0.1631],
        [ 0.6680, -1.0391,  2.1094, -0.2949],
        [ 0.8438, -1.4922,  5.3438, -0.2930]], requires_grad=True)
2025-02-06 20:26:45,625 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.0376, -1.7969,  0.4688, -0.4414],
        [ 3.4375,  1.3984,  0.1445,  0.6016],
        [ 0.1553,  0.4668, -0.1270,  0.3594],
        ...,
        [-5.5625,  1.8672, -1.1250,  0.1797],
        [ 0.6758, -1.0469,  2.0938, -0.3574],
        [ 0.8516, -1.4219,  5.2500, -0.2266]], requires_grad=True)
2025-02-06 20:26:45,758 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.0160, -1.6875,  0.4609, -0.4629],
        [ 3.2969,  1.5234,  0.1050,  0.6562],
        [ 0.1494,  0.4961, -0.1406,  0.3848],
        ...,
        [-5.6562,  1.8359, -1.1094,  0.1914],
        [ 0.6797, -1.0859,  2.0938, -0.4062],
        [ 0.8477, -1.3828,  5.1562, -0.1641]], requires_grad=True)
2025-02-06 20:26:45,899 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0061, -1.6172,  0.4531, -0.4785],
        [ 3.1562,  1.6172,  0.0708,  0.7031],
        [ 0.1416,  0.5234, -0.1533,  0.4023],
        ...,
        [-5.6875,  1.7891, -1.0938,  0.2031],
        [ 0.6758, -1.1016,  2.0781, -0.4531],
        [ 0.8398, -1.3438,  5.0625, -0.1084]], requires_grad=True)
2025-02-06 20:26:46,044 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0182, -1.4609,  0.4434, -0.5000],
        [ 3.0625,  1.5625,  0.0498,  0.7695],
        [ 0.1377,  0.5234, -0.1602,  0.4316],
        ...,
        [-5.6875,  1.7891, -1.0781,  0.2041],
        [ 0.6719, -1.1406,  2.0781, -0.4805],
        [ 0.8359, -1.3750,  5.0000, -0.0366]], requires_grad=True)
2025-02-06 20:26:46,201 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0420, -1.3359,  0.4336, -0.5156],
        [ 2.8906,  1.5312,  0.0287,  0.8203],
        [ 0.1406,  0.5039, -0.1621,  0.4648],
        ...,
        [-5.6875,  1.7656, -1.0625,  0.2070],
        [ 0.6719, -1.1406,  2.0625, -0.5117],
        [ 0.8281, -1.3984,  4.9062,  0.0237]], requires_grad=True)
2025-02-06 20:26:46,328 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 7.4219e-02, -1.2344e+00,  4.2383e-01, -5.2734e-01],
        [ 2.7188e+00,  1.5625e+00,  5.4321e-03,  8.5547e-01],
        [ 1.4355e-01,  5.5078e-01, -1.7188e-01,  4.7070e-01],
        ...,
        [-5.6562e+00,  1.6172e+00, -1.0391e+00,  2.2363e-01],
        [ 6.6797e-01, -1.0469e+00,  2.0156e+00, -5.6250e-01],
        [ 8.2422e-01, -1.3828e+00,  4.8125e+00,  7.0312e-02]],
       requires_grad=True)
2025-02-06 20:26:46,457 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0830, -1.0156,  0.4062, -0.5547],
        [ 2.8281,  1.3984,  0.0119,  0.9336],
        [ 0.1631,  0.5156, -0.1582,  0.5352],
        ...,
        [-5.8125,  1.6250, -1.0234,  0.2021],
        [ 0.6680, -1.0156,  1.9922, -0.5781],
        [ 0.8398, -1.5078,  4.8125,  0.1729]], requires_grad=True)
2025-02-06 20:26:46,609 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0977, -0.7695,  0.3887, -0.5898],
        [ 2.8438,  1.1562,  0.0361,  1.0312],
        [ 0.1738,  0.4277, -0.1245,  0.6367],
        ...,
        [-5.7812,  1.7734, -1.0312,  0.1416],
        [ 0.6602, -1.0547,  2.0312, -0.5430],
        [ 0.8359, -1.6797,  4.8750,  0.2969]], requires_grad=True)
2025-02-06 20:26:46,762 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.1230e-01, -6.8750e-01,  3.8672e-01, -5.8984e-01],
        [ 2.7969e+00,  1.1094e+00,  3.1281e-03,  1.0625e+00],
        [ 1.7969e-01,  4.0039e-01, -1.2793e-01,  6.7969e-01],
        ...,
        [-5.6875e+00,  1.8047e+00, -1.0156e+00,  1.1230e-01],
        [ 6.4844e-01, -1.0391e+00,  1.9922e+00, -5.3906e-01],
        [ 8.3203e-01, -1.7500e+00,  4.7812e+00,  3.6523e-01]],
       requires_grad=True)
2025-02-06 20:26:46,902 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1138, -0.7461,  0.4043, -0.5781],
        [ 2.8750,  1.1875, -0.0703,  1.0703],
        [ 0.1865,  0.4082, -0.1543,  0.7070],
        ...,
        [-5.6875,  1.7656, -0.9766,  0.0942],
        [ 0.6445, -0.9531,  1.8281, -0.5547],
        [ 0.8242, -1.7266,  4.5312,  0.4082]], requires_grad=True)
2025-02-06 20:26:47,033 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1157, -0.8047,  0.4199, -0.5703],
        [ 2.9844,  1.2266, -0.1270,  1.0781],
        [ 0.1992,  0.4277, -0.1875,  0.7305],
        ...,
        [-5.8438,  1.6875, -0.9258,  0.0747],
        [ 0.6523, -0.8594,  1.6562, -0.5625],
        [ 0.8164, -1.6797,  4.2812,  0.4473]], requires_grad=True)
2025-02-06 20:26:47,165 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1191, -0.8281,  0.4277, -0.5586],
        [ 3.0469,  1.1875, -0.1387,  1.0625],
        [ 0.2080,  0.4082, -0.1826,  0.7305],
        ...,
        [-5.8750,  1.6953, -0.9102,  0.0713],
        [ 0.6523, -0.8398,  1.6562, -0.5938],
        [ 0.7969, -1.6875,  4.1562,  0.4648]], requires_grad=True)
2025-02-06 20:26:47,309 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1484, -0.9141,  0.4473, -0.5586],
        [ 3.1719,  1.1719, -0.1611,  1.0547],
        [ 0.2158,  0.3945, -0.1797,  0.7305],
        ...,
        [-5.8438,  1.6641, -0.8828,  0.0605],
        [ 0.6641, -0.8594,  1.7422, -0.6406],
        [ 0.7773, -1.6953,  4.0312,  0.4766]], requires_grad=True)
2025-02-06 20:26:47,462 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1660, -0.9609,  0.4570, -0.5508],
        [ 3.3125,  1.1953, -0.1992,  1.0625],
        [ 0.2246,  0.3730, -0.1670,  0.7188],
        ...,
        [-5.8438,  1.5938, -0.8398,  0.0361],
        [ 0.6758, -0.8438,  1.7422, -0.6562],
        [ 0.7578, -1.6797,  3.8750,  0.4941]], requires_grad=True)
2025-02-06 20:26:47,607 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1865, -0.9766,  0.4590, -0.5352],
        [ 3.3125,  1.2891, -0.2852,  1.1016],
        [ 0.2285,  0.3789, -0.1924,  0.7383],
        ...,
        [-5.8750,  1.4297, -0.7500, -0.0255],
        [ 0.6875, -0.7812,  1.5938, -0.6211],
        [ 0.7148, -1.6406,  3.6406,  0.5234]], requires_grad=True)
2025-02-06 20:26:47,742 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1875, -0.9453,  0.4473, -0.5117],
        [ 3.3281,  1.3672, -0.3574,  1.1328],
        [ 0.2344,  0.3750, -0.2012,  0.7461],
        ...,
        [-5.9375,  1.2969, -0.6758, -0.0747],
        [ 0.6992, -0.7266,  1.4844, -0.5938],
        [ 0.6992, -1.6406,  3.5938,  0.5234]], requires_grad=True)
2025-02-06 20:26:47,887 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1611, -0.7969,  0.3906, -0.4648],
        [ 3.3438,  1.3281, -0.3223,  1.1172],
        [ 0.2451,  0.3281, -0.1270,  0.7109],
        ...,
        [-6.1250,  1.2812, -0.6875, -0.0854],
        [ 0.7109, -0.7578,  1.7031, -0.6211],
        [ 0.7031, -1.7188,  3.9375,  0.4805]], requires_grad=True)
2025-02-06 20:26:48,020 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1426, -0.6719,  0.3438, -0.4219],
        [ 3.3281,  1.2969, -0.2969,  1.1016],
        [ 0.2520,  0.2832, -0.0559,  0.6797],
        ...,
        [-6.2500,  1.2812, -0.7070, -0.0952],
        [ 0.7148, -0.7812,  1.8828, -0.6445],
        [ 0.7031, -1.7891,  4.3125,  0.4414]], requires_grad=True)
2025-02-06 20:26:48,154 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1260, -0.5703,  0.3066, -0.3828],
        [ 3.3125,  1.2812, -0.2930,  1.0781],
        [ 0.2578,  0.2598, -0.0249,  0.6367],
        ...,
        [-6.3438,  1.2656, -0.7188, -0.1011],
        [ 0.7188, -0.7812,  1.9531, -0.6758],
        [ 0.7031, -1.8359,  4.5625,  0.4004]], requires_grad=True)
2025-02-06 20:26:48,288 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1396, -0.5430,  0.2969, -0.3262],
        [ 3.2969,  1.2578, -0.2871,  1.0547],
        [ 0.2559,  0.2637, -0.0508,  0.5625],
        ...,
        [-6.3438,  1.2031, -0.6875, -0.0869],
        [ 0.7031, -0.7305,  1.7969, -0.7461],
        [ 0.6992, -1.8750,  4.7812,  0.3652]], requires_grad=True)
2025-02-06 20:26:48,430 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1787, -0.6836,  0.3496, -0.2207],
        [ 3.2031,  1.3047, -0.3535,  0.9883],
        [ 0.2471,  0.3047, -0.1504,  0.4375],
        ...,
        [-6.3750,  1.0547, -0.5938, -0.0306],
        [ 0.6836, -0.5938,  1.2734, -0.9062],
        [ 0.6836, -1.8438,  4.6250,  0.2754]], requires_grad=True)
2025-02-06 20:26:48,562 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2168, -0.6992,  0.3613, -0.1602],
        [ 3.1406,  1.3047, -0.3691,  0.9531],
        [ 0.2363,  0.3203, -0.2041,  0.3516],
        ...,
        [-6.4062,  1.0234, -0.5781, -0.0293],
        [ 0.6680, -0.5625,  1.1562, -0.9492],
        [ 0.6680, -1.8438,  4.6562,  0.2285]], requires_grad=True)
2025-02-06 20:26:48,693 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2480, -0.6094,  0.3418, -0.1177],
        [ 3.0938,  1.1484, -0.2773,  0.9492],
        [ 0.2295,  0.2578, -0.1426,  0.3145],
        ...,
        [-6.4688,  1.1641, -0.6602, -0.0559],
        [ 0.6562, -0.6641,  1.4453, -0.9375],
        [ 0.6641, -2.0156,  5.3125,  0.2354]], requires_grad=True)
2025-02-06 20:26:48,822 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2598, -0.4707,  0.3125, -0.0737],
        [ 3.1406,  1.0312, -0.2061,  0.9453],
        [ 0.2236,  0.2139, -0.1011,  0.2871],
        ...,
        [-6.5000,  1.2578, -0.7148, -0.0850],
        [ 0.6523, -0.7617,  1.7188, -0.9297],
        [ 0.6680, -2.1719,  5.9062,  0.2354]], requires_grad=True)
2025-02-06 20:26:48,952 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3301, -0.4609,  0.3047, -0.0654],
        [ 3.0156,  1.0469, -0.1963,  1.0000],
        [ 0.2031,  0.2656, -0.1445,  0.3828],
        ...,
        [-6.4375,  1.2109, -0.7227, -0.1641],
        [ 0.6172, -0.6758,  1.6406, -0.7578],
        [ 0.6328, -2.2031,  6.1562,  0.3105]], requires_grad=True)
2025-02-06 20:26:49,087 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3828, -0.3926,  0.2891, -0.0427],
        [ 2.8906,  1.0391, -0.1797,  1.0312],
        [ 0.1836,  0.3086, -0.1816,  0.4609],
        ...,
        [-6.3125,  1.1562, -0.7188, -0.2363],
        [ 0.5820, -0.5938,  1.5625, -0.5977],
        [ 0.5938, -2.2188,  6.3438,  0.3828]], requires_grad=True)
2025-02-06 20:26:49,221 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4180, -0.3359,  0.2754, -0.0243],
        [ 2.9062,  0.9531, -0.1377,  1.0312],
        [ 0.1729,  0.3242, -0.1973,  0.5078],
        ...,
        [-6.3750,  1.2266, -0.7461, -0.2539],
        [ 0.5703, -0.6250,  1.6406, -0.5430],
        [ 0.6094, -2.3594,  6.7500,  0.3535]], requires_grad=True)
2025-02-06 20:26:49,355 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 4.3359e-01, -2.4316e-01,  2.5781e-01,  1.8406e-04],
        [ 3.2969e+00,  6.6797e-01, -3.3691e-02,  9.5703e-01],
        [ 1.8555e-01,  2.6953e-01, -1.6504e-01,  4.8633e-01],
        ...,
        [-6.5000e+00,  1.4531e+00, -8.1250e-01, -2.1582e-01],
        [ 5.8984e-01, -7.9297e-01,  1.9062e+00, -5.8984e-01],
        [ 7.0312e-01, -2.6406e+00,  7.3750e+00,  2.4414e-01]],
       requires_grad=True)
2025-02-06 20:26:49,513 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4531, -0.2422,  0.2500,  0.0171],
        [ 3.6719,  0.4297,  0.0535,  0.8945],
        [ 0.1982,  0.2227, -0.1367,  0.4688],
        ...,
        [-6.6250,  1.6484, -0.8672, -0.1826],
        [ 0.6055, -0.9219,  2.1094, -0.6211],
        [ 0.7969, -2.9062,  7.9375,  0.1445]], requires_grad=True)
2025-02-06 20:26:49,657 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4805, -0.3281,  0.2520,  0.0383],
        [ 3.9219,  0.2852,  0.1094,  0.8242],
        [ 0.2061,  0.1924, -0.1196,  0.4473],
        ...,
        [-6.5625,  1.7266, -0.8906, -0.1416],
        [ 0.6133, -1.0078,  2.2500, -0.6562],
        [ 0.8633, -3.0469,  8.3125,  0.0439]], requires_grad=True)
2025-02-06 20:26:49,790 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4922, -0.4883,  0.2637,  0.0679],
        [ 4.1562,  0.2051,  0.1445,  0.7539],
        [ 0.2109,  0.2285, -0.1416,  0.3867],
        ...,
        [-6.4062,  1.6719, -0.8828, -0.0840],
        [ 0.6133, -1.0312,  2.3125, -0.7070],
        [ 0.9219, -3.1250,  8.5625, -0.0574]], requires_grad=True)
2025-02-06 20:26:49,933 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.5273, -0.6719,  0.2793,  0.1025],
        [ 4.3750,  0.1035,  0.1846,  0.6953],
        [ 0.2227,  0.2344, -0.1455,  0.3535],
        ...,
        [-6.2812,  1.6406, -0.8750, -0.0364],
        [ 0.6211, -1.0703,  2.3906, -0.7383],
        [ 0.9648, -3.1562,  8.6875, -0.1621]], requires_grad=True)
2025-02-06 20:26:50,061 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.5547, -0.7305,  0.2812,  0.1147],
        [ 4.8438,  0.1035,  0.1973,  0.6250],
        [ 0.2422,  0.3262, -0.1953,  0.2598],
        ...,
        [-6.3125,  1.3359, -0.8086,  0.0742],
        [ 0.6289, -0.9922,  2.3125, -0.8242],
        [ 1.0703, -2.8906,  8.3750, -0.3730]], requires_grad=True)
2025-02-06 20:26:50,203 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.5586, -0.7266,  0.2773,  0.1157],
        [ 5.3125,  0.1328,  0.2012,  0.5508],
        [ 0.2715,  0.3301, -0.2002,  0.2344],
        ...,
        [-6.6562,  1.2266, -0.7812,  0.1270],
        [ 0.6680, -1.0625,  2.4062, -0.8125],
        [ 1.2109, -2.8125,  8.3125, -0.4805]], requires_grad=True)
2025-02-06 20:26:50,346 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.5781, -0.7461,  0.2754,  0.1211],
        [ 5.5000,  0.2188,  0.1885,  0.4688],
        [ 0.2988,  0.3164, -0.1953,  0.2236],
        ...,
        [-7.2812,  1.2500, -0.7812,  0.1416],
        [ 0.7227, -1.2109,  2.5625, -0.7539],
        [ 1.3125, -2.7500,  8.1875, -0.5742]], requires_grad=True)
2025-02-06 20:26:50,492 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.6016, -0.7852,  0.2754,  0.1289],
        [ 5.5938,  0.3301,  0.1680,  0.3867],
        [ 0.3145,  0.3242, -0.2012,  0.1992],
        ...,
        [-7.7188,  1.2422, -0.7734,  0.1602],
        [ 0.7578, -1.3125,  2.6719, -0.7070],
        [ 1.3984, -2.6719,  8.0625, -0.6562]], requires_grad=True)
2025-02-06 20:26:50,661 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.6250, -0.8789,  0.2793,  0.1406],
        [ 5.5938,  0.4473,  0.1455,  0.3105],
        [ 0.3242,  0.3809, -0.2275,  0.1562],
        ...,
        [-8.0625,  1.1641, -0.7500,  0.1846],
        [ 0.7852, -1.3594,  2.7188, -0.6797],
        [ 1.4688, -2.5625,  7.9062, -0.7344]], requires_grad=True)
2025-02-06 20:26:50,817 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.6484, -1.1328,  0.2949,  0.1641],
        [ 5.5312,  0.6875,  0.0986,  0.2227],
        [ 0.3242,  0.4805, -0.2715,  0.0981],
        ...,
        [-8.3750,  0.9844, -0.7109,  0.2178],
        [ 0.7969, -1.2891,  2.6562, -0.6836],
        [ 1.5234, -2.3750,  7.6562, -0.8164]], requires_grad=True)
2025-02-06 20:26:50,921 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.6562, -1.4688,  0.3125,  0.1924],
        [ 5.5000,  0.9766,  0.0442,  0.1367],
        [ 0.3242,  0.6016, -0.3203,  0.0354],
        ...,
        [-8.6250,  0.7930, -0.6758,  0.2490],
        [ 0.8086, -1.1875,  2.5781, -0.6953],
        [ 1.5781, -2.0938,  7.3438, -0.9062]], requires_grad=True)
2025-02-06 20:26:51,078 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.6875, -1.6875,  0.3262,  0.2109],
        [ 5.5000,  1.0781,  0.0144,  0.0854],
        [ 0.3301,  0.6328, -0.3438,  0.0126],
        ...,
        [-8.8125,  0.7500, -0.6523,  0.2578],
        [ 0.8203, -1.2031,  2.5469, -0.6680],
        [ 1.6172, -1.9375,  7.0938, -0.9609]], requires_grad=True)
2025-02-06 20:26:51,236 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 7.0703e-01, -1.6406e+00,  3.2617e-01,  2.1191e-01],
        [ 5.6250e+00,  1.0000e+00,  7.7515e-03,  6.0547e-02],
        [ 3.3594e-01,  6.0547e-01, -3.5156e-01,  9.2163e-03],
        ...,
        [-9.0000e+00,  7.6562e-01, -6.3672e-01,  2.5977e-01],
        [ 8.2422e-01, -1.3281e+00,  2.5781e+00, -6.1328e-01],
        [ 1.6484e+00, -1.8984e+00,  6.9062e+00, -9.8828e-01]],
       requires_grad=True)
2025-02-06 20:26:51,384 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 7.1484e-01, -1.5625e+00,  3.2422e-01,  2.1094e-01],
        [ 5.7188e+00,  9.4141e-01, -1.1349e-04,  3.8574e-02],
        [ 3.4180e-01,  5.7422e-01, -3.5547e-01,  6.0730e-03],
        ...,
        [-9.1875e+00,  8.2031e-01, -6.2500e-01,  2.6172e-01],
        [ 8.3203e-01, -1.4688e+00,  2.6094e+00, -5.6641e-01],
        [ 1.6797e+00, -1.8672e+00,  6.7188e+00, -1.0078e+00]],
       requires_grad=True)
2025-02-06 20:26:51,527 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.7344, -1.6172,  0.3301,  0.2002],
        [ 5.7188,  0.9297, -0.0145,  0.0237],
        [ 0.3418,  0.6055, -0.3809,  0.0244],
        ...,
        [-9.3125,  0.7500, -0.5938,  0.2480],
        [ 0.8320, -1.4844,  2.5625, -0.4980],
        [ 1.7031, -1.8359,  6.5625, -1.0234]], requires_grad=True)
2025-02-06 20:26:51,653 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.7500, -1.7188,  0.3398,  0.1826],
        [ 5.5312,  0.9727, -0.0413,  0.0204],
        [ 0.3242,  0.6914, -0.4316,  0.0718],
        ...,
        [-9.1875,  0.6055, -0.5508,  0.2217],
        [ 0.8164, -1.4297,  2.4531, -0.4121],
        [ 1.7109, -1.7656,  6.3750, -1.0234]], requires_grad=True)
2025-02-06 20:26:51,801 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.7656, -1.7969,  0.3477,  0.1670],
        [ 5.2812,  1.0781, -0.0864,  0.0366],
        [ 0.3027,  0.8047, -0.4980,  0.1406],
        ...,
        [-9.0000,  0.4492, -0.5078,  0.1914],
        [ 0.7891, -1.3438,  2.2812, -0.3125],
        [ 1.7031, -1.6484,  6.0938, -0.9961]], requires_grad=True)
2025-02-06 20:26:51,936 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.7695, -1.8125,  0.3438,  0.1650],
        [ 5.0312,  1.0547, -0.0815,  0.0118],
        [ 0.2871,  0.8672, -0.5273,  0.1699],
        ...,
        [-8.9375,  0.3906, -0.4922,  0.1885],
        [ 0.7656, -1.2891,  2.1875, -0.2432],
        [ 1.6797, -1.5781,  5.8750, -0.9883]], requires_grad=True)
2025-02-06 20:26:52,063 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 7.7344e-01, -1.7891e+00,  3.3398e-01,  1.6992e-01],
        [ 4.7812e+00,  1.0312e+00, -7.8125e-02, -8.9111e-03],
        [ 2.7734e-01,  9.0234e-01, -5.3516e-01,  1.7773e-01],
        ...,
        [-8.9375e+00,  4.4922e-01, -5.1953e-01,  2.2266e-01],
        [ 7.5391e-01, -1.3203e+00,  2.2656e+00, -2.4316e-01],
        [ 1.6719e+00, -1.5312e+00,  5.7188e+00, -9.8828e-01]],
       requires_grad=True)
2025-02-06 20:26:52,198 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.7969, -1.7266,  0.3184,  0.1797],
        [ 4.6875,  0.9141, -0.0239, -0.0522],
        [ 0.2734,  0.8945, -0.5039,  0.1602],
        ...,
        [-8.9375,  0.5312, -0.5586,  0.2598],
        [ 0.7539, -1.3750,  2.4062, -0.2598],
        [ 1.6641, -1.5156,  5.6875, -1.0000]], requires_grad=True)
2025-02-06 20:26:52,332 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 8.0859e-01, -1.6797e+00,  3.0469e-01,  1.8555e-01],
        [ 4.5938e+00,  8.3594e-01,  8.0566e-03, -8.5938e-02],
        [ 2.7148e-01,  9.0625e-01, -4.9414e-01,  1.5234e-01],
        ...,
        [-9.0000e+00,  5.6250e-01, -5.7422e-01,  2.8516e-01],
        [ 7.5781e-01, -1.4062e+00,  2.4844e+00, -2.6758e-01],
        [ 1.6641e+00, -1.5000e+00,  5.6562e+00, -1.0078e+00]],
       requires_grad=True)
2025-02-06 20:26:52,481 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 8.1641e-01, -1.5703e+00,  2.7930e-01,  1.9238e-01],
        [ 4.4688e+00,  8.2422e-01,  3.4790e-03, -1.1182e-01],
        [ 2.6758e-01,  9.4922e-01, -5.2344e-01,  1.5039e-01],
        ...,
        [-8.9375e+00,  5.4297e-01, -5.6641e-01,  3.0469e-01],
        [ 7.5391e-01, -1.3828e+00,  2.4375e+00, -2.6758e-01],
        [ 1.6562e+00, -1.3828e+00,  5.3438e+00, -1.0000e+00]],
       requires_grad=True)
2025-02-06 20:26:52,624 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.8398, -1.5547,  0.2754,  0.1973],
        [ 4.2812,  0.8555, -0.0250, -0.1338],
        [ 0.2617,  1.0156, -0.5859,  0.1494],
        ...,
        [-9.0000,  0.4414, -0.5195,  0.3184],
        [ 0.7500, -1.3047,  2.2344, -0.2637],
        [ 1.6484, -1.2266,  4.8750, -0.9883]], requires_grad=True)
2025-02-06 20:26:52,763 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.8555, -1.5391,  0.2715,  0.2012],
        [ 4.1250,  0.8555, -0.0378, -0.1533],
        [ 0.2559,  1.0703, -0.6406,  0.1494],
        ...,
        [-9.0625,  0.3789, -0.4902,  0.3301],
        [ 0.7461, -1.2422,  2.1094, -0.2617],
        [ 1.6406, -1.0859,  4.4375, -0.9766]], requires_grad=True)
2025-02-06 20:26:52,914 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.9102, -1.5547,  0.2773,  0.2041],
        [ 4.0312,  0.7305,  0.0210, -0.1758],
        [ 0.2617,  1.0469, -0.6016,  0.1387],
        ...,
        [-9.1250,  0.4316, -0.5117,  0.3457],
        [ 0.7461, -1.2969,  2.2656, -0.2715],
        [ 1.6094, -0.9492,  4.0312, -0.9609]], requires_grad=True)
2025-02-06 20:26:53,049 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.9531, -1.5312,  0.2754,  0.2061],
        [ 3.9531,  0.5664,  0.0977, -0.1953],
        [ 0.2656,  1.0078, -0.5469,  0.1299],
        ...,
        [-9.1250,  0.4961, -0.5352,  0.3574],
        [ 0.7461, -1.3828,  2.5000, -0.2793],
        [ 1.5703, -0.8555,  3.7188, -0.9492]], requires_grad=True)
2025-02-06 20:26:53,191 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.9766, -1.4609,  0.2637,  0.2041],
        [ 3.7656,  0.5391,  0.1104, -0.2227],
        [ 0.2598,  1.0312, -0.5625,  0.1040],
        ...,
        [-8.9375,  0.4297, -0.5117,  0.3789],
        [ 0.7305, -1.3906,  2.5469, -0.2988],
        [ 1.5234, -0.7344,  3.3594, -0.9414]], requires_grad=True)
2025-02-06 20:26:53,349 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0078, -1.4375,  0.2598,  0.2061],
        [ 3.5469,  0.5508,  0.1055, -0.2520],
        [ 0.2500,  1.0703, -0.5938,  0.0723],
        ...,
        [-8.6875,  0.3672, -0.4863,  0.3965],
        [ 0.7070, -1.3672,  2.5312, -0.3223],
        [ 1.4688, -0.6172,  3.0156, -0.9336]], requires_grad=True)
2025-02-06 20:26:53,484 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0312e+00, -1.5312e+00,  2.7148e-01,  2.2363e-01],
        [ 3.3281e+00,  6.4062e-01,  7.1777e-02, -2.9492e-01],
        [ 2.3535e-01,  1.1719e+00, -6.6797e-01,  1.5182e-03],
        ...,
        [-8.4375e+00,  2.3828e-01, -4.4336e-01,  4.2578e-01],
        [ 6.8359e-01, -1.2812e+00,  2.4062e+00, -3.6914e-01],
        [ 1.4062e+00, -3.9453e-01,  2.4688e+00, -9.6875e-01]],
       requires_grad=True)
2025-02-06 20:26:53,638 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0391, -1.5234,  0.2715,  0.2217],
        [ 3.2969,  0.5625,  0.0884, -0.2812],
        [ 0.2275,  1.2188, -0.7070, -0.0227],
        ...,
        [-8.1875,  0.2139, -0.4238,  0.4238],
        [ 0.6719, -1.3125,  2.4375, -0.3379],
        [ 1.3672, -0.2480,  2.0625, -0.9727]], requires_grad=True)
2025-02-06 20:26:53,774 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0469, -1.5156,  0.2715,  0.2197],
        [ 3.2500,  0.4609,  0.1113, -0.2617],
        [ 0.2207,  1.2500, -0.7383, -0.0430],
        ...,
        [-7.9375,  0.2451, -0.4160,  0.4082],
        [ 0.6641, -1.4062,  2.5469, -0.2734],
        [ 1.3359, -0.1572,  1.7578, -0.9570]], requires_grad=True)
2025-02-06 20:26:53,908 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0312, -1.3984,  0.2598,  0.2061],
        [ 3.3594,  0.2539,  0.1602, -0.2227],
        [ 0.2227,  1.2109, -0.7305, -0.0334],
        ...,
        [-7.8750,  0.3613, -0.4277,  0.3809],
        [ 0.6641, -1.5234,  2.6719, -0.2051],
        [ 1.3359, -0.1572,  1.5938, -0.9180]], requires_grad=True)
2025-02-06 20:26:54,061 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0156, -1.2578,  0.2451,  0.1934],
        [ 3.4219,  0.1367,  0.1875, -0.1895],
        [ 0.2207,  1.2031, -0.7383, -0.0272],
        ...,
        [-7.7500,  0.3887, -0.4199,  0.3574],
        [ 0.6562, -1.5625,  2.7188, -0.1475],
        [ 1.3281, -0.1157,  1.3984, -0.8828]], requires_grad=True)
2025-02-06 20:26:54,202 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0156e+00, -1.5078e+00,  2.6562e-01,  1.6016e-01],
        [ 3.2656e+00,  2.2363e-01,  1.6504e-01, -1.4258e-01],
        [ 2.0020e-01,  1.2969e+00, -7.9297e-01,  3.0518e-03],
        ...,
        [-7.5625e+00,  2.5391e-01, -3.8477e-01,  3.2227e-01],
        [ 6.4453e-01, -1.4453e+00,  2.5938e+00, -6.6895e-02],
        [ 1.3281e+00,  3.9062e-02,  1.0859e+00, -8.2812e-01]],
       requires_grad=True)
2025-02-06 20:26:54,336 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0156, -1.7188,  0.2832,  0.1309],
        [ 3.1250,  0.2930,  0.1455, -0.1025],
        [ 0.1797,  1.3984, -0.8477,  0.0381],
        ...,
        [-7.3125,  0.1001, -0.3477,  0.2852],
        [ 0.6250, -1.3125,  2.4531,  0.0112],
        [ 1.3203,  0.1787,  0.8008, -0.7812]], requires_grad=True)
2025-02-06 20:26:54,469 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.0156, -1.8438,  0.2949,  0.1143],
        [ 2.9688,  0.3594,  0.1279, -0.0659],
        [ 0.1592,  1.4531, -0.8828,  0.0471],
        ...,
        [-7.0000,  0.0840, -0.3262,  0.2832],
        [ 0.5977, -1.2812,  2.3750,  0.0325],
        [ 1.3047,  0.2461,  0.5898, -0.7578]], requires_grad=True)
2025-02-06 20:26:54,618 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.9883, -1.7578,  0.2949,  0.1348],
        [ 2.7656,  0.4336,  0.1089, -0.0273],
        [ 0.1514,  1.4453, -0.8984,  0.0135],
        ...,
        [-6.6562,  0.2715, -0.3262,  0.3379],
        [ 0.5703, -1.4531,  2.4062, -0.0811],
        [ 1.2969,  0.1914,  0.4727, -0.7930]], requires_grad=True)
2025-02-06 20:26:54,773 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.8555, -1.3906,  0.2773,  0.1836],
        [ 2.7500,  0.4160,  0.1035, -0.0084],
        [ 0.1602,  1.3672, -0.8906, -0.0493],
        ...,
        [-6.5000,  0.4902, -0.3320,  0.3945],
        [ 0.5625, -1.6406,  2.4531, -0.1973],
        [ 1.3281,  0.0747,  0.4238, -0.8398]], requires_grad=True)
2025-02-06 20:26:54,905 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.7305, -1.1172,  0.2656,  0.2236],
        [ 2.5156,  0.6523,  0.0640,  0.0238],
        [ 0.1504,  1.3906, -0.9062, -0.0898],
        ...,
        [-6.1562,  0.5000, -0.3164,  0.4316],
        [ 0.5430, -1.6875,  2.4219, -0.2812],
        [ 1.3516,  0.0121,  0.3496, -0.8750]], requires_grad=True)
2025-02-06 20:26:55,030 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.6211, -0.8789,  0.2539,  0.2578],
        [ 2.0781,  0.9570,  0.0109,  0.0435],
        [ 0.1328,  1.4922, -0.9492, -0.1406],
        ...,
        [-6.0000,  0.3711, -0.2852,  0.4688],
        [ 0.5195, -1.5781,  2.2812, -0.3750],
        [ 1.3438,  0.0247,  0.2266, -0.9102]], requires_grad=True)
2025-02-06 20:26:55,161 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.5039, -0.6055,  0.2373,  0.2773],
        [ 1.8359,  1.0859, -0.0063,  0.0942],
        [ 0.1299,  1.5078, -0.9531, -0.1396],
        ...,
        [-5.7188,  0.2451, -0.2539,  0.5039],
        [ 0.5078, -1.5391,  2.2188, -0.4219],
        [ 1.3516, -0.0148,  0.1738, -0.9180]], requires_grad=True)
2025-02-06 20:26:55,311 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3535, -0.2412,  0.2109,  0.2695],
        [ 1.9062,  1.0078,  0.0242,  0.2012],
        [ 0.1826,  1.3984, -0.8906, -0.0216],
        ...,
        [-6.1250,  0.3359, -0.2656,  0.4688],
        [ 0.5508, -1.6953,  2.3594, -0.3320],
        [ 1.4688, -0.1172,  0.2285, -0.8789]], requires_grad=True)
2025-02-06 20:26:55,443 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2334,  0.0767,  0.1875,  0.2617],
        [ 2.3125,  0.8320,  0.0757,  0.3164],
        [ 0.2451,  1.2500, -0.8086,  0.1084],
        ...,
        [-7.2188,  0.5781, -0.3047,  0.4062],
        [ 0.6055, -1.8594,  2.5156, -0.2344],
        [ 1.6094, -0.2715,  0.3535, -0.8203]], requires_grad=True)
2025-02-06 20:26:55,583 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1211,  0.3145,  0.1680,  0.2539],
        [ 2.5469,  0.8320,  0.0962,  0.4180],
        [ 0.2930,  1.1875, -0.7617,  0.2236],
        ...,
        [-8.1250,  0.7461, -0.3320,  0.3516],
        [ 0.6406, -1.9219,  2.5938, -0.1455],
        [ 1.7031, -0.3457,  0.4141, -0.7695]], requires_grad=True)
2025-02-06 20:26:55,714 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 5.0049e-03,  3.2812e-01,  1.5723e-01,  2.1973e-01],
        [ 2.6719e+00,  8.4766e-01,  1.1230e-01,  5.1172e-01],
        [ 3.3008e-01,  1.1953e+00, -7.3047e-01,  3.6523e-01],
        ...,
        [-8.8750e+00,  8.2812e-01, -3.5156e-01,  2.9102e-01],
        [ 6.7188e-01, -1.8906e+00,  2.6250e+00, -3.3936e-02],
        [ 1.7812e+00, -3.7109e-01,  4.4727e-01, -7.0703e-01]],
       requires_grad=True)
2025-02-06 20:26:55,846 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.0991,  0.3965,  0.1465,  0.2061],
        [ 2.6719,  0.9062,  0.1235,  0.6133],
        [ 0.3613,  1.2109, -0.6992,  0.5039],
        ...,
        [-9.5000,  0.8438, -0.3672,  0.2109],
        [ 0.6836, -1.7969,  2.6250,  0.1299],
        [ 1.8281, -0.3516,  0.4629, -0.6211]], requires_grad=True)
2025-02-06 20:26:55,976 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.1953,  0.4727,  0.1377,  0.2041],
        [ 2.7188,  1.0000,  0.1338,  0.7422],
        [ 0.3828,  1.2578, -0.6719,  0.7109],
        ...,
        [-9.9375,  0.7969, -0.3789,  0.0894],
        [ 0.6875, -1.6406,  2.6250,  0.3887],
        [ 1.8594, -0.3301,  0.4766, -0.5352]], requires_grad=True)
2025-02-06 20:26:56,133 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.3164,   0.5898,   0.1318,   0.2578],
        [  2.8438,   1.0547,   0.1387,   0.8047],
        [  0.4102,   1.2812,  -0.6484,   0.8242],
        ...,
        [-10.8125,   0.8320,  -0.3789,   0.1191],
        [  0.7305,  -1.5625,   2.5781,   0.3887],
        [  1.9141,  -0.3379,   0.4668,  -0.5391]], requires_grad=True)
2025-02-06 20:26:56,274 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.4531,   0.7188,   0.1270,   0.3418],
        [  3.2031,   1.0469,   0.1367,   0.7461],
        [  0.4512,   1.2734,  -0.6328,   0.7812],
        ...,
        [-12.0000,   0.9297,  -0.3711,   0.2695],
        [  0.7969,  -1.5391,   2.5000,   0.1807],
        [  1.9766,  -0.3633,   0.4473,  -0.6016]], requires_grad=True)
2025-02-06 20:26:56,406 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.5938,   0.8398,   0.1221,   0.4219],
        [  3.7031,   0.9453,   0.1445,   0.5859],
        [  0.4922,   1.2422,  -0.6133,   0.6836],
        ...,
        [-13.1250,   1.0781,  -0.3691,   0.4688],
        [  0.8555,  -1.5469,   2.4375,  -0.0845],
        [  2.0312,  -0.4004,   0.4375,  -0.6875]], requires_grad=True)
2025-02-06 20:26:56,536 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.7109,   0.9297,   0.1191,   0.4883],
        [  4.0625,   0.9219,   0.1289,   0.4707],
        [  0.5234,   1.2344,  -0.6055,   0.6172],
        ...,
        [-14.1250,   1.2031,  -0.3652,   0.6406],
        [  0.8984,  -1.5312,   2.3438,  -0.3027],
        [  2.0625,  -0.4258,   0.4160,  -0.7578]], requires_grad=True)
2025-02-06 20:26:56,664 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8125,   0.9727,   0.1226,   0.5469],
        [  4.4062,   0.9648,   0.0845,   0.3574],
        [  0.5508,   1.2500,  -0.6250,   0.5430],
        ...,
        [-14.9375,   1.2734,  -0.3496,   0.7930],
        [  0.9375,  -1.4766,   2.2031,  -0.5039],
        [  2.0938,  -0.4199,   0.3359,  -0.8242]], requires_grad=True)
2025-02-06 20:26:56,799 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-8.9062e-01,  9.9609e-01,  1.2793e-01,  6.0156e-01],
        [ 4.5938e+00,  1.1016e+00, -1.3428e-02,  1.9629e-01],
        [ 5.7031e-01,  1.3047e+00, -6.9531e-01,  4.0625e-01],
        ...,
        [-1.5562e+01,  1.2812e+00, -3.1250e-01,  9.4922e-01],
        [ 9.6484e-01, -1.3906e+00,  1.9922e+00, -7.1484e-01],
        [ 2.1094e+00, -3.8867e-01,  1.8555e-01, -9.0625e-01]],
       requires_grad=True)
2025-02-06 20:26:56,932 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-9.2578e-01,  9.6875e-01,  1.4258e-01,  6.7188e-01],
        [ 4.7500e+00,  1.2422e+00, -1.1279e-01,  2.8809e-02],
        [ 5.7812e-01,  1.3750e+00, -7.8516e-01,  2.1875e-01],
        ...,
        [-1.6000e+01,  1.2344e+00, -2.5391e-01,  1.1328e+00],
        [ 9.7656e-01, -1.2656e+00,  1.6484e+00, -1.0078e+00],
        [ 2.1094e+00, -3.4766e-01,  1.0559e-02, -9.9609e-01]],
       requires_grad=True)
2025-02-06 20:26:57,063 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.0391e+00,  1.0625e+00,  1.2207e-01,  6.1328e-01],
        [ 5.2812e+00,  1.2969e+00, -1.5527e-01, -1.3306e-02],
        [ 6.0547e-01,  1.3984e+00, -8.0859e-01,  2.1582e-01],
        ...,
        [-1.6750e+01,  1.2969e+00, -2.6172e-01,  1.1328e+00],
        [ 1.0391e+00, -1.2266e+00,  1.6016e+00, -9.9219e-01],
        [ 2.1562e+00, -3.4180e-01, -2.0630e-02, -9.8438e-01]],
       requires_grad=True)
2025-02-06 20:26:57,214 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2109,   1.2344,   0.0767,   0.4668],
        [  6.0938,   1.2656,  -0.1357,   0.0757],
        [  0.6406,   1.3906,  -0.7852,   0.3359],
        ...,
        [-17.7500,   1.4062,  -0.3008,   1.0391],
        [  1.1250,  -1.2344,   1.6953,  -0.8320],
        [  2.2344,  -0.3867,   0.1416,  -0.8398]], requires_grad=True)
2025-02-06 20:26:57,347 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3672,   1.4141,   0.0288,   0.3145],
        [  6.9375,   1.1562,  -0.0684,   0.2500],
        [  0.6797,   1.3516,  -0.7305,   0.5234],
        ...,
        [-18.7500,   1.5625,  -0.3633,   0.8867],
        [  1.2188,  -1.2891,   1.9062,  -0.5625],
        [  2.3125,  -0.4453,   0.3477,  -0.6758]], requires_grad=True)
2025-02-06 20:26:57,480 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.4922e+00,  1.5156e+00, -9.1553e-04,  2.0312e-01],
        [ 7.6250e+00,  1.0703e+00, -1.4160e-02,  3.9453e-01],
        [ 7.0703e-01,  1.3438e+00, -7.1484e-01,  6.2891e-01],
        ...,
        [-1.9625e+01,  1.6719e+00, -4.1016e-01,  7.6172e-01],
        [ 1.2891e+00, -1.3203e+00,  2.0469e+00, -3.4961e-01],
        [ 2.3750e+00, -4.9805e-01,  5.3516e-01, -5.2734e-01]],
       requires_grad=True)
2025-02-06 20:26:57,615 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.5938e+00,  1.5938e+00, -2.6001e-02,  1.0693e-01],
        [ 8.1875e+00,  1.0469e+00,  4.7302e-03,  5.0000e-01],
        [ 7.3047e-01,  1.3594e+00, -7.3047e-01,  6.9141e-01],
        ...,
        [-2.0250e+01,  1.7266e+00, -4.3359e-01,  6.6406e-01],
        [ 1.3516e+00, -1.3203e+00,  2.1250e+00, -1.7578e-01],
        [ 2.4062e+00, -5.2734e-01,  6.6016e-01, -4.0430e-01]],
       requires_grad=True)
2025-02-06 20:26:57,762 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.6797e+00,  1.6328e+00, -4.3701e-02,  2.3682e-02],
        [ 8.6250e+00,  1.0469e+00,  6.7749e-03,  5.8594e-01],
        [ 7.4219e-01,  1.4062e+00, -7.7734e-01,  7.3047e-01],
        ...,
        [-2.0750e+01,  1.7109e+00, -4.3164e-01,  5.8203e-01],
        [ 1.3984e+00, -1.2891e+00,  2.1250e+00, -3.1982e-02],
        [ 2.4219e+00, -5.4297e-01,  7.3828e-01, -2.9688e-01]],
       requires_grad=True)
2025-02-06 20:26:57,893 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.7422e+00,  1.6406e+00, -5.4199e-02, -4.9072e-02],
        [ 9.0625e+00,  1.0859e+00, -1.1475e-02,  6.6406e-01],
        [ 7.5000e-01,  1.4766e+00, -8.4375e-01,  7.6172e-01],
        ...,
        [-2.1125e+01,  1.6094e+00, -3.9844e-01,  5.1172e-01],
        [ 1.4375e+00, -1.1953e+00,  1.9922e+00,  9.6191e-02],
        [ 2.4375e+00, -5.2344e-01,  7.2656e-01, -2.0020e-01]],
       requires_grad=True)
2025-02-06 20:26:58,027 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.7891,   1.6406,  -0.0620,  -0.1128],
        [  9.3750,   1.1562,  -0.0430,   0.7344],
        [  0.7500,   1.5625,  -0.9336,   0.8008],
        ...,
        [-21.3750,   1.5156,  -0.3691,   0.4453],
        [  1.4531,  -1.0703,   1.7812,   0.2178],
        [  2.4375,  -0.4805,   0.6562,  -0.1118]], requires_grad=True)
2025-02-06 20:26:58,158 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.8359,   1.6719,  -0.0757,  -0.1631],
        [  9.6875,   1.1484,  -0.0420,   0.7734],
        [  0.7500,   1.6016,  -0.9844,   0.8125],
        ...,
        [-21.6250,   1.4609,  -0.3496,   0.3906],
        [  1.4688,  -0.9805,   1.6328,   0.3145],
        [  2.4375,  -0.4707,   0.6602,  -0.0447]], requires_grad=True)
2025-02-06 20:26:58,296 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.8750e+00,  1.7031e+00, -8.8379e-02, -2.0508e-01],
        [ 9.9375e+00,  1.1094e+00, -2.7954e-02,  8.0078e-01],
        [ 7.4609e-01,  1.6172e+00, -1.0078e+00,  8.0469e-01],
        ...,
        [-2.1750e+01,  1.4453e+00, -3.4570e-01,  3.5352e-01],
        [ 1.4766e+00, -9.2188e-01,  1.5469e+00,  3.8281e-01],
        [ 2.4375e+00, -5.0781e-01,  7.6953e-01, -9.2773e-03]],
       requires_grad=True)
2025-02-06 20:26:58,431 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.8984e+00,  1.7344e+00, -1.0254e-01, -2.3730e-01],
        [ 1.0125e+01,  1.0547e+00, -8.2397e-03,  8.1641e-01],
        [ 7.4219e-01,  1.6250e+00, -1.0234e+00,  7.9688e-01],
        ...,
        [-2.1750e+01,  1.4844e+00, -3.6133e-01,  3.4180e-01],
        [ 1.4766e+00, -9.2578e-01,  1.5625e+00,  4.0039e-01],
        [ 2.4375e+00, -5.5078e-01,  8.9062e-01,  1.4221e-02]],
       requires_grad=True)
2025-02-06 20:26:58,579 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-1.9219e+00,  1.7812e+00, -1.1865e-01, -2.5781e-01],
        [ 1.0312e+01,  9.3359e-01,  3.5156e-02,  8.0078e-01],
        [ 7.3828e-01,  1.6094e+00, -1.0234e+00,  7.6953e-01],
        ...,
        [-2.1750e+01,  1.5703e+00, -3.9062e-01,  3.4961e-01],
        [ 1.4766e+00, -9.8828e-01,  1.6875e+00,  3.6328e-01],
        [ 2.4375e+00, -6.6016e-01,  1.1406e+00, -1.0315e-02]],
       requires_grad=True)
2025-02-06 20:26:58,711 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.9375,   1.8359,  -0.1348,  -0.2715],
        [ 10.5000,   0.7812,   0.0889,   0.7656],
        [  0.7383,   1.5547,  -0.9922,   0.7031],
        ...,
        [-21.6250,   1.6641,  -0.4199,   0.3613],
        [  1.4766,  -1.0625,   1.8203,   0.3145],
        [  2.4375,  -0.7891,   1.4219,  -0.0513]], requires_grad=True)
2025-02-06 20:26:58,852 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.9375,   1.8203,  -0.1406,  -0.2949],
        [ 10.5625,   0.6797,   0.1250,   0.7422],
        [  0.7305,   1.5156,  -0.9727,   0.6602],
        ...,
        [-21.5000,   1.7188,  -0.4395,   0.3633],
        [  1.4609,  -1.1016,   1.8984,   0.2871],
        [  2.4219,  -0.8828,   1.6406,  -0.0786]], requires_grad=True)
2025-02-06 20:26:58,993 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.9219,   1.7578,  -0.1416,  -0.3203],
        [ 10.6250,   0.6055,   0.1514,   0.7266],
        [  0.7188,   1.5156,  -0.9766,   0.6445],
        ...,
        [-21.2500,   1.7266,  -0.4473,   0.3555],
        [  1.4375,  -1.0938,   1.9219,   0.2812],
        [  2.4062,  -0.9531,   1.8125,  -0.0986]], requires_grad=True)
2025-02-06 20:26:59,127 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.9062,   1.7266,  -0.1455,  -0.3359],
        [ 10.5625,   0.6133,   0.1553,   0.7266],
        [  0.6992,   1.5547,  -0.9922,   0.6602],
        ...,
        [-20.8750,   1.6719,  -0.4434,   0.3379],
        [  1.4062,  -1.0391,   1.8828,   0.2988],
        [  2.3750,  -0.9688,   1.8984,  -0.0991]], requires_grad=True)
2025-02-06 20:26:59,258 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.8750,   1.6250,  -0.1416,  -0.3613],
        [ 10.4375,   0.6641,   0.1475,   0.7383],
        [  0.6797,   1.6250,  -1.0234,   0.7031],
        ...,
        [-20.5000,   1.5859,  -0.4297,   0.3105],
        [  1.3750,  -0.9453,   1.7969,   0.3340],
        [  2.3281,  -0.9297,   1.9062,  -0.0781]], requires_grad=True)
2025-02-06 20:26:59,392 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.8438,   1.5703,  -0.1406,  -0.3750],
        [ 10.3125,   0.6680,   0.1504,   0.7344],
        [  0.6602,   1.6719,  -1.0469,   0.7305],
        ...,
        [-20.1250,   1.5312,  -0.4219,   0.2930],
        [  1.3438,  -0.8828,   1.7422,   0.3516],
        [  2.2969,  -0.9570,   2.0000,  -0.0913]], requires_grad=True)
2025-02-06 20:26:59,520 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.8125,   1.5547,  -0.1436,  -0.3789],
        [ 10.1875,   0.5859,   0.1738,   0.6992],
        [  0.6406,   1.6797,  -1.0469,   0.7188],
        ...,
        [-19.7500,   1.5391,  -0.4297,   0.2969],
        [  1.3203,  -0.8672,   1.7422,   0.3379],
        [  2.2656,  -1.0156,   2.1250,  -0.1226]], requires_grad=True)
2025-02-06 20:26:59,653 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.7812,   1.5391,  -0.1465,  -0.3809],
        [ 10.1250,   0.5078,   0.1973,   0.6680],
        [  0.6289,   1.6484,  -1.0234,   0.6797],
        ...,
        [-19.3750,   1.6016,  -0.4473,   0.3164],
        [  1.2969,  -0.8945,   1.7812,   0.2969],
        [  2.2344,  -1.1094,   2.2969,  -0.1729]], requires_grad=True)
2025-02-06 20:26:59,796 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.7500,   1.5547,  -0.1523,  -0.3750],
        [ 10.0000,   0.4160,   0.2217,   0.6328],
        [  0.6172,   1.6016,  -0.9922,   0.6328],
        ...,
        [-19.0000,   1.7031,  -0.4707,   0.3438],
        [  1.2734,  -0.9375,   1.8438,   0.2441],
        [  2.2031,  -1.2109,   2.4688,  -0.2246]], requires_grad=True)
2025-02-06 20:26:59,939 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.7188,   1.5703,  -0.1572,  -0.3672],
        [  9.8750,   0.3770,   0.2314,   0.6094],
        [  0.6016,   1.5547,  -0.9648,   0.5898],
        ...,
        [-18.6250,   1.7812,  -0.4902,   0.3672],
        [  1.2500,  -0.9727,   1.8906,   0.1992],
        [  2.1719,  -1.2891,   2.5938,  -0.2695]], requires_grad=True)
2025-02-06 20:27:00,071 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.6797,   1.5625,  -0.1592,  -0.3613],
        [  9.7500,   0.3359,   0.2412,   0.5859],
        [  0.5859,   1.5312,  -0.9492,   0.5625],
        ...,
        [-18.2500,   1.8203,  -0.5000,   0.3809],
        [  1.2188,  -0.9883,   1.9062,   0.1670],
        [  2.1406,  -1.3438,   2.6875,  -0.3047]], requires_grad=True)
2025-02-06 20:27:00,215 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.6406,   1.5000,  -0.1553,  -0.3633],
        [  9.5625,   0.3242,   0.2432,   0.5742],
        [  0.5703,   1.5234,  -0.9414,   0.5508],
        ...,
        [-17.8750,   1.8047,  -0.4980,   0.3809],
        [  1.1875,  -0.9492,   1.8750,   0.1592],
        [  2.1094,  -1.3672,   2.7344,  -0.3242]], requires_grad=True)
2025-02-06 20:27:00,356 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.5859,   1.3984,  -0.1475,  -0.3691],
        [  9.3750,   0.3125,   0.2441,   0.5586],
        [  0.5508,   1.5391,  -0.9453,   0.5547],
        ...,
        [-17.5000,   1.7656,  -0.4902,   0.3770],
        [  1.1562,  -0.8633,   1.7812,   0.1748],
        [  2.0625,  -1.3984,   2.7812,  -0.3457]], requires_grad=True)
2025-02-06 20:27:00,500 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.5391,   1.3359,  -0.1416,  -0.3691],
        [  9.1875,   0.3008,   0.2432,   0.5430],
        [  0.5312,   1.5469,  -0.9453,   0.5586],
        ...,
        [-17.1250,   1.7266,  -0.4824,   0.3711],
        [  1.1250,  -0.7852,   1.6953,   0.1875],
        [  2.0156,  -1.4141,   2.8125,  -0.3613]], requires_grad=True)
2025-02-06 20:27:00,630 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.4922,   1.3125,  -0.1406,  -0.3633],
        [  9.0000,   0.2617,   0.2490,   0.5234],
        [  0.5117,   1.5391,  -0.9375,   0.5547],
        ...,
        [-16.7500,   1.7188,  -0.4824,   0.3711],
        [  1.0938,  -0.7383,   1.6328,   0.1895],
        [  1.9688,  -1.4531,   2.8750,  -0.3828]], requires_grad=True)
2025-02-06 20:27:00,762 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.4453,   1.2969,  -0.1406,  -0.3555],
        [  8.8750,   0.1553,   0.2715,   0.4902],
        [  0.4941,   1.4922,  -0.9102,   0.5312],
        ...,
        [-16.3750,   1.7656,  -0.4941,   0.3809],
        [  1.0703,  -0.7383,   1.6328,   0.1748],
        [  1.9219,  -1.5391,   2.9844,  -0.4180]], requires_grad=True)
2025-02-06 20:27:00,893 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3906,   1.2422,  -0.1357,  -0.3535],
        [  8.7500,   0.0437,   0.2969,   0.4570],
        [  0.4844,   1.4219,  -0.8672,   0.5000],
        ...,
        [-16.1250,   1.8750,  -0.5156,   0.3984],
        [  1.0547,  -0.7930,   1.6953,   0.1445],
        [  1.8906,  -1.6406,   3.1094,  -0.4551]], requires_grad=True)
2025-02-06 20:27:01,023 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.3438,   1.1875,  -0.1318,  -0.3516],
        [  8.6250,  -0.0525,   0.3164,   0.4277],
        [  0.4785,   1.3438,  -0.8164,   0.4629],
        ...,
        [-15.8750,   2.0000,  -0.5391,   0.4160],
        [  1.0391,  -0.8555,   1.7500,   0.1143],
        [  1.8594,  -1.7266,   3.2188,  -0.4844]], requires_grad=True)
2025-02-06 20:27:01,176 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2812,   1.0625,  -0.1201,  -0.3516],
        [  8.4375,  -0.0164,   0.3008,   0.4102],
        [  0.4648,   1.3203,  -0.8008,   0.4414],
        ...,
        [-15.5000,   2.0469,  -0.5469,   0.4277],
        [  1.0078,  -0.8281,   1.7031,   0.0991],
        [  1.8203,  -1.7812,   3.2969,  -0.5078]], requires_grad=True)
2025-02-06 20:27:01,310 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.2266,   0.9688,  -0.1113,  -0.3496],
        [  8.2500,   0.0664,   0.2715,   0.3965],
        [  0.4512,   1.3047,  -0.7891,   0.4219],
        ...,
        [-15.1250,   2.0156,  -0.5391,   0.4316],
        [  0.9727,  -0.7422,   1.5859,   0.0903],
        [  1.7734,  -1.7891,   3.2969,  -0.5234]], requires_grad=True)
2025-02-06 20:27:01,463 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1797,   0.9102,  -0.1055,  -0.3477],
        [  8.0000,   0.1514,   0.2432,   0.3828],
        [  0.4336,   1.2969,  -0.7812,   0.4043],
        ...,
        [-14.6250,   1.9375,  -0.5234,   0.4355],
        [  0.9336,  -0.6328,   1.4531,   0.0820],
        [  1.7188,  -1.7734,   3.2656,  -0.5352]], requires_grad=True)
2025-02-06 20:27:01,596 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1328,   0.8828,  -0.1035,  -0.3457],
        [  7.7500,   0.1963,   0.2256,   0.3711],
        [  0.4160,   1.2812,  -0.7695,   0.3887],
        ...,
        [-14.1250,   1.9297,  -0.5195,   0.4355],
        [  0.8984,  -0.5664,   1.3594,   0.0767],
        [  1.6641,  -1.7891,   3.2812,  -0.5430]], requires_grad=True)
2025-02-06 20:27:01,721 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.1016,   0.9531,  -0.1113,  -0.3457],
        [  7.5000,   0.2168,   0.2129,   0.3594],
        [  0.4062,   1.2422,  -0.7461,   0.3770],
        ...,
        [-13.8125,   2.0469,  -0.5430,   0.4277],
        [  0.8750,  -0.6016,   1.3906,   0.0820],
        [  1.6328,  -1.8359,   3.3438,  -0.5469]], requires_grad=True)
2025-02-06 20:27:01,861 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.0703,   1.0391,  -0.1206,  -0.3457],
        [  7.2500,   0.2471,   0.1973,   0.3477],
        [  0.3984,   1.1875,  -0.7148,   0.3672],
        ...,
        [-13.5000,   2.1719,  -0.5664,   0.4180],
        [  0.8516,  -0.6523,   1.4453,   0.0903],
        [  1.6016,  -1.9219,   3.4531,  -0.5430]], requires_grad=True)
2025-02-06 20:27:02,014 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -1.0312,   1.0391,  -0.1211,  -0.3418],
        [  6.9375,   0.3477,   0.1641,   0.3301],
        [  0.3887,   1.1797,  -0.7109,   0.3457],
        ...,
        [-13.3125,   2.2812,  -0.5859,   0.4082],
        [  0.8281,  -0.6641,   1.4453,   0.0913],
        [  1.5703,  -1.9609,   3.4844,  -0.5430]], requires_grad=True)
2025-02-06 20:27:02,157 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.9688,   0.9375,  -0.1113,  -0.3281],
        [  6.4688,   0.6133,   0.0884,   0.2891],
        [  0.3711,   1.2031,  -0.7266,   0.3145],
        ...,
        [-12.8750,   2.2656,  -0.5820,   0.4121],
        [  0.8008,  -0.6289,   1.3906,   0.0796],
        [  1.5156,  -1.8750,   3.3594,  -0.5625]], requires_grad=True)
2025-02-06 20:27:02,310 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-8.8281e-01,  7.7344e-01, -9.5215e-02, -3.0859e-01],
        [ 5.9375e+00,  9.4922e-01, -4.3030e-03,  2.3535e-01],
        [ 3.4961e-01,  1.2422e+00, -7.4219e-01,  2.7539e-01],
        ...,
        [-1.2250e+01,  2.1094e+00, -5.5078e-01,  4.3555e-01],
        [ 7.5391e-01, -5.0391e-01,  1.2500e+00,  3.8818e-02],
        [ 1.4531e+00, -1.7500e+00,  3.1875e+00, -5.8984e-01]],
       requires_grad=True)
2025-02-06 20:27:02,442 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8438,   0.8359,  -0.0996,  -0.3164],
        [  5.5625,   1.1250,  -0.0581,   0.2139],
        [  0.3398,   1.2188,  -0.7305,   0.2734],
        ...,
        [-11.8125,   2.1406,  -0.5547,   0.4160],
        [  0.7227,  -0.5000,   1.2266,   0.0513],
        [  1.4062,  -1.7188,   3.1250,  -0.5781]], requires_grad=True)
2025-02-06 20:27:02,573 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8438,   0.9844,  -0.1123,  -0.3359],
        [  5.5000,   1.1562,  -0.0771,   0.2197],
        [  0.3477,   1.1328,  -0.6875,   0.3086],
        ...,
        [-11.4375,   2.1875,  -0.5625,   0.3945],
        [  0.7109,  -0.5703,   1.2734,   0.0942],
        [  1.3750,  -1.7109,   3.0938,  -0.5625]], requires_grad=True)
2025-02-06 20:27:02,700 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8398,   1.0391,  -0.1167,  -0.3438],
        [  5.5312,   1.0625,  -0.0654,   0.2500],
        [  0.3613,   1.0234,  -0.6367,   0.3574],
        ...,
        [-11.1875,   2.2969,  -0.5820,   0.3594],
        [  0.7148,  -0.6875,   1.3672,   0.1543],
        [  1.3359,  -1.6953,   3.0469,  -0.5469]], requires_grad=True)
2025-02-06 20:27:02,853 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8555,   1.1562,  -0.1270,  -0.3574],
        [  5.4688,   1.0469,  -0.0723,   0.2637],
        [  0.3652,   0.9609,  -0.6055,   0.3848],
        ...,
        [-10.9375,   2.3750,  -0.5938,   0.3281],
        [  0.7109,  -0.7578,   1.4141,   0.1953],
        [  1.3047,  -1.6953,   3.0156,  -0.5273]], requires_grad=True)
2025-02-06 20:27:02,987 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8555,   1.2188,  -0.1318,  -0.3652],
        [  5.4062,   1.0234,  -0.0767,   0.2754],
        [  0.3633,   0.9180,  -0.5820,   0.4004],
        ...,
        [-10.6250,   2.3906,  -0.5938,   0.3066],
        [  0.6953,  -0.7734,   1.4141,   0.2197],
        [  1.2656,  -1.6719,   2.9688,  -0.5117]], requires_grad=True)
2025-02-06 20:27:03,131 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ -0.8516,   1.2734,  -0.1357,  -0.3711],
        [  5.2812,   1.0234,  -0.0854,   0.2832],
        [  0.3613,   0.8750,  -0.5625,   0.4141],
        ...,
        [-10.3125,   2.4219,  -0.5938,   0.2832],
        [  0.6836,  -0.8008,   1.4297,   0.2432],
        [  1.2266,  -1.6641,   2.9219,  -0.4941]], requires_grad=True)
2025-02-06 20:27:03,264 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8320,  1.1875, -0.1270, -0.3711],
        [ 5.1562,  1.0000, -0.0894,  0.2891],
        [ 0.3535,  0.8789, -0.5625,  0.4180],
        ...,
        [-9.9375,  2.3750, -0.5820,  0.2676],
        [ 0.6641, -0.7812,  1.3906,  0.2598],
        [ 1.1797, -1.5938,  2.7969, -0.4824]], requires_grad=True)
2025-02-06 20:27:03,401 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8164,  1.0859, -0.1172, -0.3691],
        [ 5.0312,  0.9414, -0.0845,  0.2930],
        [ 0.3477,  0.8594, -0.5547,  0.4199],
        ...,
        [-9.5625,  2.3906, -0.5820,  0.2520],
        [ 0.6445, -0.7852,  1.3750,  0.2715],
        [ 1.1328, -1.5547,  2.7344, -0.4727]], requires_grad=True)
2025-02-06 20:27:03,532 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8203,  1.0859, -0.1162, -0.3633],
        [ 5.0625,  0.8555, -0.0723,  0.2949],
        [ 0.3555,  0.7695, -0.5117,  0.4062],
        ...,
        [-9.5000,  2.5938, -0.6172,  0.2500],
        [ 0.6484, -0.8945,  1.4609,  0.2656],
        [ 1.1250, -1.6250,  2.7969, -0.4746]], requires_grad=True)
2025-02-06 20:27:03,667 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8164,  0.9336, -0.1021, -0.3652],
        [ 5.0625,  0.7812, -0.0625,  0.2930],
        [ 0.3555,  0.7031, -0.4785,  0.3965],
        ...,
        [-9.4375,  2.7188, -0.6328,  0.2432],
        [ 0.6445, -0.8945,  1.4375,  0.2793],
        [ 1.1094, -1.6641,  2.8125, -0.4707]], requires_grad=True)
2025-02-06 20:27:03,802 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8281,  0.8516, -0.0942, -0.3613],
        [ 5.0938,  0.6836, -0.0469,  0.2891],
        [ 0.3672,  0.5977, -0.4277,  0.3672],
        ...,
        [-9.5000,  2.8906, -0.6562,  0.2461],
        [ 0.6484, -0.9258,  1.4453,  0.2793],
        [ 1.1094, -1.7344,  2.8750, -0.4766]], requires_grad=True)
2025-02-06 20:27:03,934 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8320,  0.7539, -0.0850, -0.3594],
        [ 5.1562,  0.5703, -0.0266,  0.2773],
        [ 0.3691,  0.5312, -0.3926,  0.3535],
        ...,
        [-9.4375,  2.9531, -0.6641,  0.2363],
        [ 0.6406, -0.8984,  1.3984,  0.3008],
        [ 1.1016, -1.7578,  2.8750, -0.4727]], requires_grad=True)
2025-02-06 20:27:04,067 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8359,  0.6406, -0.0747, -0.3594],
        [ 5.1562,  0.5703, -0.0303,  0.2910],
        [ 0.3672,  0.5039, -0.3750,  0.3613],
        ...,
        [-9.3750,  2.9688, -0.6641,  0.2207],
        [ 0.6289, -0.8555,  1.3438,  0.3242],
        [ 1.0859, -1.7656,  2.8594, -0.4648]], requires_grad=True)
2025-02-06 20:27:04,196 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8398,  0.5664, -0.0674, -0.3555],
        [ 5.1562,  0.5352, -0.0265,  0.2930],
        [ 0.3652,  0.4648, -0.3535,  0.3574],
        ...,
        [-9.3750,  3.0000, -0.6641,  0.2109],
        [ 0.6133, -0.8242,  1.3047,  0.3398],
        [ 1.0703, -1.7812,  2.8594, -0.4629]], requires_grad=True)
2025-02-06 20:27:04,354 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8398,  0.4512, -0.0569, -0.3594],
        [ 5.0625,  0.5586, -0.0352,  0.3086],
        [ 0.3633,  0.4199, -0.3301,  0.3477],
        ...,
        [-9.4375,  3.0625, -0.6719,  0.2148],
        [ 0.6016, -0.8320,  1.2891,  0.3359],
        [ 1.0547, -1.8047,  2.8750, -0.4648]], requires_grad=True)
2025-02-06 20:27:04,507 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.8281,  0.3184, -0.0449, -0.3652],
        [ 4.9688,  0.5859, -0.0444,  0.3223],
        [ 0.3594,  0.3789, -0.3086,  0.3379],
        ...,
        [-9.4375,  3.1719, -0.6875,  0.2295],
        [ 0.5938, -0.8672,  1.3047,  0.3164],
        [ 1.0391, -1.8438,  2.8750, -0.4727]], requires_grad=True)
2025-02-06 20:27:04,642 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.7812,  0.0645, -0.0225, -0.3867],
        [ 5.1250,  0.6836, -0.0684,  0.3496],
        [ 0.3691,  0.3203, -0.2773,  0.3164],
        ...,
        [-9.4375,  3.2500, -0.6992,  0.2402],
        [ 0.6094, -0.8828,  1.3047,  0.3027],
        [ 0.9922, -1.8281,  2.8281, -0.4648]], requires_grad=True)
2025-02-06 20:27:04,797 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-7.3828e-01, -1.5039e-01, -3.4027e-03, -4.0430e-01],
        [ 5.1875e+00,  8.0078e-01, -9.6680e-02,  3.8086e-01],
        [ 3.7305e-01,  3.0078e-01, -2.6562e-01,  3.1250e-01],
        ...,
        [-9.3750e+00,  3.2656e+00, -6.9922e-01,  2.4121e-01],
        [ 6.2109e-01, -8.6328e-01,  1.2734e+00,  3.0078e-01],
        [ 9.3359e-01, -1.7188e+00,  2.6875e+00, -4.3164e-01]],
       requires_grad=True)
2025-02-06 20:27:04,950 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6875, -0.5703,  0.0315, -0.4453],
        [ 5.1562,  1.0547, -0.1514,  0.4355],
        [ 0.3691,  0.3652, -0.2871,  0.3535],
        ...,
        [-9.3125,  3.2812, -0.6992,  0.2432],
        [ 0.6250, -0.7812,  1.1875,  0.3223],
        [ 0.8828, -1.5391,  2.4844, -0.3750]], requires_grad=True)
2025-02-06 20:27:05,103 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.6289, -0.9727,  0.0645, -0.4844],
        [ 5.2500,  1.0781, -0.1650,  0.4414],
        [ 0.3691,  0.3750, -0.2910,  0.3633],
        ...,
        [-9.3125,  3.4219, -0.7148,  0.2676],
        [ 0.6328, -0.8164,  1.1875,  0.2969],
        [ 0.8438, -1.4609,  2.3594, -0.3516]], requires_grad=True)
2025-02-06 20:27:05,259 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5938, -1.2734,  0.0894, -0.5117],
        [ 5.4688,  0.9258, -0.1484,  0.4199],
        [ 0.3809,  0.3066, -0.2676,  0.3379],
        ...,
        [-9.3750,  3.6562, -0.7422,  0.3086],
        [ 0.6523, -0.9297,  1.2422,  0.2461],
        [ 0.8398, -1.5078,  2.3438, -0.3594]], requires_grad=True)
2025-02-06 20:27:05,400 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5547, -1.5312,  0.1113, -0.5352],
        [ 5.6250,  0.8008, -0.1348,  0.4004],
        [ 0.3809,  0.2695, -0.2539,  0.3184],
        ...,
        [-9.3125,  3.8281, -0.7578,  0.3418],
        [ 0.6602, -1.0156,  1.2812,  0.2021],
        [ 0.8203, -1.5156,  2.3125, -0.3633]], requires_grad=True)
2025-02-06 20:27:05,536 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.5234, -1.7969,  0.1328, -0.5508],
        [ 5.7500,  0.7266, -0.1289,  0.3789],
        [ 0.3770,  0.2695, -0.2500,  0.2910],
        ...,
        [-9.1250,  3.9062, -0.7617,  0.3750],
        [ 0.6602, -1.0547,  1.2891,  0.1562],
        [ 0.7891, -1.4062,  2.1875, -0.3809]], requires_grad=True)
2025-02-06 20:27:05,670 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.4473, -2.1250,  0.1592, -0.5508],
        [ 5.6875,  0.8398, -0.1562,  0.3203],
        [ 0.3633,  0.3145, -0.2637,  0.2412],
        ...,
        [-8.8125,  3.8750, -0.7539,  0.4180],
        [ 0.6484, -1.0078,  1.2344,  0.0840],
        [ 0.7344, -1.2266,  1.9844, -0.4238]], requires_grad=True)
2025-02-06 20:27:05,805 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3789, -2.3594,  0.1777, -0.5586],
        [ 5.6875,  0.8164, -0.1533,  0.3105],
        [ 0.3516,  0.3086, -0.2559,  0.2373],
        ...,
        [-8.5625,  3.9375, -0.7578,  0.4277],
        [ 0.6406, -1.0078,  1.2188,  0.0510],
        [ 0.6836, -1.1094,  1.8516, -0.4336]], requires_grad=True)
2025-02-06 20:27:05,946 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.3047, -2.4688,  0.1855, -0.5859],
        [ 5.7812,  0.7266, -0.1328,  0.3262],
        [ 0.3418,  0.2432, -0.2188,  0.2910],
        ...,
        [-8.3750,  4.0312, -0.7734,  0.4102],
        [ 0.6211, -1.0547,  1.2500,  0.0574],
        [ 0.6523, -1.0859,  1.8281, -0.3945]], requires_grad=True)
2025-02-06 20:27:06,081 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.2617, -2.4062,  0.1738, -0.6250],
        [ 5.9375,  0.4922, -0.0728,  0.3711],
        [ 0.3340,  0.1660, -0.1738,  0.3496],
        ...,
        [-8.2500,  4.2188, -0.8125,  0.3711],
        [ 0.6055, -1.1562,  1.3516,  0.0879],
        [ 0.6250, -1.1094,  1.8750, -0.3438]], requires_grad=True)
2025-02-06 20:27:06,237 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.1836, -2.3750,  0.1680, -0.6562],
        [ 5.9062,  0.4766, -0.0806,  0.4121],
        [ 0.3145,  0.2090, -0.2080,  0.4062],
        ...,
        [-8.1875,  4.2188, -0.8086,  0.3320],
        [ 0.5781, -1.1250,  1.2734,  0.1201],
        [ 0.5703, -0.9727,  1.6562, -0.2930]], requires_grad=True)
2025-02-06 20:27:06,372 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[-0.0752, -2.3906,  0.1689, -0.6875],
        [ 5.9062,  0.5430, -0.1118,  0.4590],
        [ 0.3105,  0.2715, -0.2520,  0.4648],
        ...,
        [-8.2500,  4.1875, -0.7891,  0.2910],
        [ 0.5586, -1.0547,  1.1484,  0.1582],
        [ 0.5430, -0.7969,  1.3750, -0.2373]], requires_grad=True)
2025-02-06 20:27:06,579 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0159, -2.3750,  0.1670, -0.7109],
        [ 5.8750,  0.6133, -0.1426,  0.5000],
        [ 0.3066,  0.3340, -0.2949,  0.5195],
        ...,
        [-8.2500,  4.2188, -0.7852,  0.2695],
        [ 0.5352, -1.0391,  1.1016,  0.1699],
        [ 0.5195, -0.6445,  1.1328, -0.1885]], requires_grad=True)
2025-02-06 20:27:06,711 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0449, -2.1875,  0.1465, -0.7031],
        [ 6.0000,  0.6406, -0.1611,  0.5273],
        [ 0.3262,  0.2773, -0.2656,  0.4902],
        ...,
        [-8.4375,  4.3438, -0.8047,  0.2734],
        [ 0.5469, -1.1484,  1.2109,  0.1187],
        [ 0.4844, -0.5625,  0.9961, -0.1670]], requires_grad=True)
2025-02-06 20:27:06,842 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.0728, -2.0625,  0.1309, -0.6992],
        [ 6.0938,  0.6680, -0.1777,  0.5508],
        [ 0.3496,  0.2393, -0.2451,  0.4648],
        ...,
        [-8.6250,  4.4375, -0.8203,  0.2773],
        [ 0.5742, -1.2500,  1.3125,  0.0728],
        [ 0.4629, -0.4844,  0.8633, -0.1465]], requires_grad=True)
2025-02-06 20:27:06,991 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1113, -2.0781,  0.1235, -0.6875],
        [ 6.0938,  0.7812, -0.2041,  0.5664],
        [ 0.3594,  0.3164, -0.2559,  0.4297],
        ...,
        [-8.6250,  4.3438, -0.8164,  0.2871],
        [ 0.5859, -1.2266,  1.3438,  0.0232],
        [ 0.4375, -0.3711,  0.7148, -0.1318]], requires_grad=True)
2025-02-06 20:27:07,135 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1396, -2.0312,  0.1152, -0.6758],
        [ 6.0000,  0.9570, -0.2305,  0.5820],
        [ 0.3711,  0.3594, -0.2617,  0.3945],
        ...,
        [-8.6250,  4.1562, -0.8047,  0.2930],
        [ 0.6016, -1.1641,  1.3594, -0.0176],
        [ 0.4023, -0.2344,  0.5742, -0.1167]], requires_grad=True)
2025-02-06 20:27:07,270 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1670, -1.9922,  0.1079, -0.6641],
        [ 6.0000,  1.0234, -0.2559,  0.5820],
        [ 0.3867,  0.3633, -0.2695,  0.3516],
        ...,
        [-8.5625,  4.0000, -0.7930,  0.2988],
        [ 0.6133, -1.1562,  1.3594, -0.0659],
        [ 0.3672, -0.1348,  0.4453, -0.1074]], requires_grad=True)
2025-02-06 20:27:07,406 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1973, -1.8750,  0.1050, -0.6406],
        [ 5.9688,  1.0312, -0.2832,  0.5703],
        [ 0.4043,  0.3477, -0.2793,  0.3027],
        ...,
        [-8.5000,  3.8750, -0.7773,  0.3066],
        [ 0.6289, -1.2422,  1.3047, -0.1475],
        [ 0.3379, -0.0942,  0.3008, -0.1143]], requires_grad=True)
2025-02-06 20:27:07,541 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2334, -1.7969,  0.1001, -0.6250],
        [ 5.9062,  1.0781, -0.2988,  0.5742],
        [ 0.4141,  0.3535, -0.2812,  0.2773],
        ...,
        [-8.3750,  3.7500, -0.7617,  0.3105],
        [ 0.6367, -1.2969,  1.2578, -0.2100],
        [ 0.3008, -0.0092,  0.2197, -0.0894]], requires_grad=True)
2025-02-06 20:27:07,674 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2676, -1.6953,  0.0977, -0.6016],
        [ 5.8438,  1.1641, -0.3008,  0.6016],
        [ 0.4219,  0.3848, -0.2695,  0.2930],
        ...,
        [-8.3125,  3.5625, -0.7578,  0.2812],
        [ 0.6445, -1.2969,  1.2500, -0.2217],
        [ 0.2656,  0.1079,  0.1943, -0.0342]], requires_grad=True)
2025-02-06 20:27:07,808 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 2.9883e-01, -1.5703e+00,  9.7168e-02, -5.7422e-01],
        [ 5.7500e+00,  1.2266e+00, -3.0273e-01,  6.1719e-01],
        [ 4.2578e-01,  3.9258e-01, -2.6758e-01,  2.9883e-01],
        ...,
        [-8.1875e+00,  3.4219e+00, -7.4609e-01,  2.5977e-01],
        [ 6.4453e-01, -1.2969e+00,  1.2344e+00, -2.3145e-01],
        [ 2.3242e-01,  1.8262e-01,  1.4258e-01,  8.0566e-03]],
       requires_grad=True)
2025-02-06 20:27:07,938 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3242, -1.4453,  0.0977, -0.5547],
        [ 5.6250,  1.2344, -0.3125,  0.6328],
        [ 0.4238,  0.3691, -0.2754,  0.3066],
        ...,
        [-8.0000,  3.3125, -0.7266,  0.2373],
        [ 0.6406, -1.3281,  1.1953, -0.2344],
        [ 0.2041,  0.2598,  0.1094,  0.0437]], requires_grad=True)
2025-02-06 20:27:08,081 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3477, -1.3281,  0.0977, -0.5352],
        [ 5.5312,  1.2188, -0.3203,  0.6484],
        [ 0.4199,  0.3340, -0.2832,  0.3164],
        ...,
        [-7.7812,  3.2344, -0.7070,  0.2168],
        [ 0.6289, -1.3438,  1.1562, -0.2383],
        [ 0.1758,  0.3340,  0.0825,  0.0737]], requires_grad=True)
2025-02-06 20:27:08,231 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3652, -1.2344,  0.0977, -0.5156],
        [ 5.4062,  1.1953, -0.3262,  0.6602],
        [ 0.4141,  0.3184, -0.2891,  0.3203],
        ...,
        [-7.5625,  3.1094, -0.6914,  0.2021],
        [ 0.6094, -1.3125,  1.1250, -0.2520],
        [ 0.1338,  0.4453,  0.0608,  0.0903]], requires_grad=True)
2025-02-06 20:27:08,375 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3828, -1.1406,  0.0972, -0.4961],
        [ 5.2812,  1.2188, -0.3340,  0.6602],
        [ 0.4043,  0.3340, -0.2969,  0.3047],
        ...,
        [-7.2812,  2.9375, -0.6719,  0.1982],
        [ 0.5859, -1.2578,  1.0938, -0.2715],
        [ 0.0918,  0.5547,  0.0386,  0.1001]], requires_grad=True)
2025-02-06 20:27:08,508 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4062, -1.0234,  0.0952, -0.4844],
        [ 5.1562,  1.1094, -0.3281,  0.6953],
        [ 0.3926,  0.3281, -0.2988,  0.3047],
        ...,
        [-6.9688,  2.8906, -0.6602,  0.1631],
        [ 0.5625, -1.2969,  1.0938, -0.2344],
        [ 0.0513,  0.5664,  0.0552,  0.1475]], requires_grad=True)
2025-02-06 20:27:08,639 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4238, -0.8867,  0.0913, -0.4766],
        [ 5.0000,  0.9961, -0.3203,  0.7266],
        [ 0.3809,  0.3164, -0.2969,  0.3066],
        ...,
        [-6.6562,  2.8594, -0.6523,  0.1279],
        [ 0.5391, -1.3438,  1.1016, -0.1953],
        [ 0.0118,  0.5586,  0.0801,  0.1904]], requires_grad=True)
2025-02-06 20:27:08,773 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4453, -0.8633,  0.0952, -0.4648],
        [ 4.8125,  0.9375, -0.3203,  0.7461],
        [ 0.3691,  0.3633, -0.3164,  0.2988],
        ...,
        [-6.4062,  2.7500, -0.6289,  0.1011],
        [ 0.5117, -1.3438,  1.0703, -0.1650],
        [-0.0260,  0.6445,  0.0093,  0.2197]], requires_grad=True)
2025-02-06 20:27:08,911 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4688, -0.9219,  0.1050, -0.4551],
        [ 4.5312,  0.9648, -0.3359,  0.7656],
        [ 0.3496,  0.4180, -0.3418,  0.2930],
        ...,
        [-5.8438,  2.4844, -0.5781,  0.0728],
        [ 0.4512, -1.1797,  0.8984, -0.1279],
        [-0.0894,  0.7969, -0.1387,  0.2490]], requires_grad=True)
2025-02-06 20:27:09,048 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4922, -1.0000,  0.1162, -0.4453],
        [ 4.5625,  0.8711, -0.3223,  0.7695],
        [ 0.3379,  0.4961, -0.3770,  0.2949],
        ...,
        [-5.4375,  2.2344, -0.5352,  0.0481],
        [ 0.4102, -1.0547,  0.7734, -0.0996],
        [-0.1104,  0.8164, -0.1348,  0.2578]], requires_grad=True)
2025-02-06 20:27:09,181 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.4531, -0.9531,  0.1133, -0.4277],
        [ 4.6875,  0.6953, -0.2852,  0.7617],
        [ 0.3438,  0.5312, -0.3887,  0.2871],
        ...,
        [-5.0625,  2.0625, -0.5039,  0.0309],
        [ 0.3711, -1.0078,  0.7305, -0.0879],
        [-0.1123,  0.7891, -0.0679,  0.2578]], requires_grad=True)
2025-02-06 20:27:09,317 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.3613, -0.7930,  0.0972, -0.4062],
        [ 5.2188,  0.2715, -0.1768,  0.7344],
        [ 0.3691,  0.5000, -0.3633,  0.2695],
        ...,
        [-5.2188,  2.2188, -0.5430,  0.0332],
        [ 0.4102, -1.2188,  1.0000, -0.1089],
        [ 0.0310,  0.5781,  0.2578,  0.2412]], requires_grad=True)
2025-02-06 20:27:09,468 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2812, -0.7031,  0.0884, -0.3828],
        [ 5.6250,  0.0908, -0.1279,  0.6797],
        [ 0.3789,  0.6211, -0.4121,  0.1953],
        ...,
        [-5.2812,  2.1562, -0.5391,  0.0608],
        [ 0.4395, -1.2422,  1.0625, -0.1719],
        [ 0.1260,  0.5391,  0.3613,  0.1934]], requires_grad=True)
2025-02-06 20:27:09,602 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.2119, -0.6250,  0.0806, -0.3613],
        [ 5.9375, -0.0405, -0.0898,  0.6289],
        [ 0.3828,  0.7461, -0.4648,  0.1206],
        ...,
        [-5.3125,  2.0625, -0.5312,  0.0884],
        [ 0.4590, -1.2031,  1.0781, -0.2422],
        [ 0.2031,  0.5156,  0.4355,  0.1475]], requires_grad=True)
2025-02-06 20:27:09,738 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1523, -0.5000,  0.0713, -0.3438],
        [ 6.0625, -0.2500, -0.0461,  0.5938],
        [ 0.3848,  0.8281, -0.5039,  0.0630],
        ...,
        [-5.2188,  2.0781, -0.5312,  0.1011],
        [ 0.4727, -1.2188,  1.1094, -0.2910],
        [ 0.2695,  0.4668,  0.5156,  0.1108]], requires_grad=True)
2025-02-06 20:27:09,871 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 1.5430e-01, -3.4766e-01,  6.2500e-02, -3.3008e-01],
        [ 6.0938e+00, -4.7852e-01, -6.0730e-03,  5.6250e-01],
        [ 3.8867e-01,  7.5391e-01, -5.2344e-01,  3.7842e-02],
        ...,
        [-4.9375e+00,  2.3594e+00, -5.3906e-01,  9.7656e-02],
        [ 4.6484e-01, -1.4141e+00,  1.1641e+00, -3.1055e-01],
        [ 3.3984e-01,  2.8516e-01,  6.0938e-01,  9.3262e-02]],
       requires_grad=True)
2025-02-06 20:27:10,006 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1206, -0.2383,  0.0544, -0.3203],
        [ 6.0625, -0.6055,  0.0294,  0.5469],
        [ 0.3887,  0.7266, -0.5391,  0.0359],
        ...,
        [-4.5625,  2.5312, -0.5430,  0.0850],
        [ 0.4609, -1.5312,  1.2109, -0.3066],
        [ 0.3867,  0.1846,  0.6914,  0.0938]], requires_grad=True)
2025-02-06 20:27:10,151 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1465, -0.0923,  0.0481, -0.3008],
        [ 6.3438, -0.6133,  0.0654,  0.5625],
        [ 0.3906,  0.7070, -0.5508,  0.0403],
        ...,
        [-5.1562,  2.4531, -0.5547,  0.0134],
        [ 0.5000, -1.5156,  1.2734, -0.2344],
        [ 0.4492,  0.1465,  0.7773,  0.1182]], requires_grad=True)
2025-02-06 20:27:10,290 ed8eb309-5855-442c-9624-4c3afb47007f - LOG: Parameter containing:
tensor([[ 0.1670,  0.0396,  0.0425, -0.2812],
        [ 6.6562, -0.6523,  0.0918,  0.5664],
        [ 0.3848,  0.7109, -0.5547,  0.0608],
        ...,
        [-5.6250,  2.3750, -0.5625, -0.0498],
        [ 0.5156, -1.4453,  1.3516, -0.1436],
        [ 0.5156,  0.0776,  0.8203,  0.1230]], requires_grad=True)
2025-02-06 20:27:10,664 ed8eb309-5855-442c-9624-4c3afb47007f - COMPLETED: Your job has been completed.
Downloading result: 100%|██████████| 133k/133k [00:00<00:00, 1.63MB/s]
[14]:
print(model)
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(128256, 8192)
    (layers): ModuleList(
      (0-79): 80 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=8192, out_features=8192, bias=False)
          (k_proj): Linear(in_features=8192, out_features=1024, bias=False)
          (v_proj): Linear(in_features=8192, out_features=1024, bias=False)
          (o_proj): Linear(in_features=8192, out_features=8192, bias=False)
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=8192, out_features=28672, bias=False)
          (up_proj): Linear(in_features=8192, out_features=28672, bias=False)
          (down_proj): Linear(in_features=28672, out_features=8192, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): LlamaRMSNorm((8192,), eps=1e-05)
        (post_attention_layernorm): LlamaRMSNorm((8192,), eps=1e-05)
      )
    )
    (norm): LlamaRMSNorm((8192,), eps=1e-05)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=8192, out_features=128256, bias=False)
  (generator): Generator(
    (streamer): Streamer()
  )
)

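The parameter values logged above change from one log entry to the next, which shows that the LoRA weights are being updated. Note that print(model) still lists all 80 decoder layers with the standard LlamaMLP, so the LoRA does not show up as a change in the printed architecture; it takes effect only when lora() is called inside a tracing context.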
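
The change in the weights can be checked directly from the log. The sketch below is a minimal illustration in plain Python: it copies the first row of the logged parameter from two of the log entries above (timestamps 20:27:08,375 and 20:27:10,290) and measures how far each value moved. The variable names are hypothetical and only the first row is compared, so this is a sanity check rather than a measure of how well the LoRA was trained.

```python
# First row of the logged parameter, copied from two log entries above.
earlier = [0.3828, -1.1406, 0.0972, -0.4961]  # logged at 20:27:08,375
later = [0.1670, 0.0396, 0.0425, -0.2812]     # logged at 20:27:10,290

# Element-wise change between the two snapshots.
deltas = [b - a for a, b in zip(earlier, later)]
max_abs_change = max(abs(d) for d in deltas)

print("per-element change:", [round(d, 4) for d in deltas])
print("largest absolute change:", round(max_abs_change, 4))
```

A non-zero change confirms that the optimizer is updating the parameter between log entries.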

Now it is time to test whether our fine-tuned model can predict the sentiment of a given sentence.

[15]:
# With the LoRA applied. Expected prediction: "Negative".
with model.generate("I'm upset", remote=True) as generator:
    lora()
    out = model.lm_head.output.save()

# Take the highest-scoring token ID at each position.
token_ids = out.argmax(dim=-1)

# Count how many predicted token IDs match the positive (1) and negative (0) labels.
count_positive = (token_ids == 1).sum().item()
count_negative = (token_ids == 0).sum().item()

# Determine the overall sentiment of the sentence by majority vote.
if count_positive > count_negative:
    print("\nPrediction with LoRA: Positive\n")
else:
    print("\nPrediction with LoRA: Negative\n")

# Then without the LoRA. The model tries to complete the sentence rather than
# output a sentiment label.
with model.generate("I'm upset", remote=True) as generator:
    out = model.lm_head.output.save()

print("\nPrediction without LoRA:", model.tokenizer.decode(out.argmax(dim=-1)[0]))
2025-02-06 20:27:23,165 b81eb518-fb00-4be4-a3c5-3681783dcddb - RECEIVED: Your job has been received and is waiting approval.
2025-02-06 20:27:23,490 b81eb518-fb00-4be4-a3c5-3681783dcddb - APPROVED: Your job was approved and is waiting to be run.
2025-02-06 20:27:26,398 b81eb518-fb00-4be4-a3c5-3681783dcddb - RUNNING: Your job has started running.
2025-02-06 20:27:27,348 b81eb518-fb00-4be4-a3c5-3681783dcddb - COMPLETED: Your job has been completed.
Downloading result: 100%|██████████| 258k/258k [00:00<00:00, 1.83MB/s]

Prediction with LoRA: Negative

2025-02-06 20:27:28,121 bcbd9294-cd09-4a1a-9aa6-1cad695cd3a9 - RECEIVED: Your job has been received and is waiting approval.
2025-02-06 20:27:28,455 bcbd9294-cd09-4a1a-9aa6-1cad695cd3a9 - APPROVED: Your job was approved and is waiting to be run.
2025-02-06 20:27:28,664 bcbd9294-cd09-4a1a-9aa6-1cad695cd3a9 - RUNNING: Your job has started running.
2025-02-06 20:27:29,271 bcbd9294-cd09-4a1a-9aa6-1cad695cd3a9 - COMPLETED: Your job has been completed.
Downloading result: 100%|██████████| 258k/258k [00:00<00:00, 1.79MB/s]

Prediction without LoRA:  that
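
The decoding step used with the LoRA (argmax at each position, then a majority vote between label 1 and label 0) can be followed without running a remote job. The sketch below reproduces that logic in plain Python on made-up logits. The helper names `argmax` and `vote_sentiment` and the toy numbers are hypothetical; the label convention (1 = positive, 0 = negative) and the tie-breaking behaviour are taken from the cell above. It works on a single sequence given as a list of per-position score lists, whereas the tutorial code operates on a tensor.

```python
def argmax(row):
    """Return the index of the largest score in a list."""
    return max(range(len(row)), key=lambda i: row[i])


def vote_sentiment(logits):
    """Majority vote over the per-position argmax token IDs."""
    token_ids = [argmax(row) for row in logits]
    count_positive = sum(1 for t in token_ids if t == 1)
    count_negative = sum(1 for t in token_ids if t == 0)
    # A tie (including the case where neither label appears) falls through
    # to "Negative", mirroring the if/else in the tutorial cell.
    return "Positive" if count_positive > count_negative else "Negative"


# Toy logits: 3 positions, vocabulary of 4 tokens (illustrative values only).
toy_logits = [
    [2.0, 0.1, 0.3, 0.0],  # argmax is token 0 (negative)
    [1.5, 0.2, 0.1, 0.4],  # argmax is token 0 (negative)
    [0.1, 3.0, 0.2, 0.0],  # argmax is token 1 (positive)
]

print("Prediction:", vote_sentiment(toy_logits))
```

Because ties resolve to "Negative", an output in which neither label token appears is also reported as negative, which is worth keeping in mind when interpreting the prediction.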