When I try to train a model with the
loss.backward()
method, I get the following error:
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
So I added the parameter
retain_graph=True
but now I get a different error:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [512, 128]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
This is what my training loop looks like:
for epoch in range(initial_num_epochs):
    hidden = initial_model.init_hidden(initial_batch_size)
    for i in range(0, initial_training_input_data.size(0) - initial_batch_size, initial_batch_size):
        inputs = initial_training_input_data[i:i + initial_batch_size]
        targets = initial_training_target_data[i:i + initial_batch_size]
        outputs, hidden = initial_model(inputs, hidden)
        loss = initial_criterion(outputs, targets)
        initial_optimizer.zero_grad()
        loss.backward(retain_graph=True)
        initial_optimizer.step()
    print(f'Epoch [{epoch + 1}/{initial_num_epochs}], Loss: {loss.item():.4f}')
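From what I understand, the first error might come from reusing hidden across batches: the next loss.backward() then tries to walk back through the previous batch's graph, whose saved buffers are already freed, and with retain_graph=True the weights modified in place by optimizer.step() no longer match the versions saved in that old graph. Would detaching the hidden state between batches, as in this untested sketch, be the right fix?

for epoch in range(initial_num_epochs):
    hidden = initial_model.init_hidden(initial_batch_size)
    for i in range(0, initial_training_input_data.size(0) - initial_batch_size, initial_batch_size):
        # Cut the autograd connection to the previous batch before reusing the state
        hidden = tuple(h.detach() for h in hidden)
        inputs = initial_training_input_data[i:i + initial_batch_size]
        targets = initial_training_target_data[i:i + initial_batch_size]
        outputs, hidden = initial_model(inputs, hidden)
        loss = initial_criterion(outputs, targets)
        initial_optimizer.zero_grad()
        loss.backward()  # retain_graph should no longer be needed
        initial_optimizer.step()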
and this is the model:
class TextLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, num_layers=1):
        super(TextLSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x, h):
        embedded = self.embedding(x)
        output, h = self.lstm(embedded, h)
        output = self.fc(output[:, -1, :])
        return output, h

    def init_hidden(self, batch_size):
        return (torch.zeros(self.num_layers, batch_size, self.hidden_size),
                torch.zeros(self.num_layers, batch_size, self.hidden_size))
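For completeness, the surrounding setup looks roughly like this; the concrete values below are placeholders, not my exact hyperparameters:

import torch
import torch.nn as nn

# Placeholder values for illustration only
vocab_size = 5000
initial_batch_size = 32
initial_num_epochs = 10

initial_model = TextLSTM(input_size=vocab_size, hidden_size=128, output_size=vocab_size)
initial_criterion = nn.CrossEntropyLoss()
initial_optimizer = torch.optim.Adam(initial_model.parameters(), lr=0.001)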