PyTorch on M1 Mac: RuntimeError: Placeholder storage has not been allocated on MPS device
I'm training a model in PyTorch 1.13.0 on my M1 Mac (I've also tried the nightly build torch-1.14.0.dev20221207, to no avail) and would like to use MPS hardware acceleration. I have the following relevant code in my project to send the model and input tensors to MPS:

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu") # This always results in MPS

model.to(device)

...and in my Dataset subclass:

class MyDataset(Dataset):
    def __init__(self, df, window_size):
        self.df = df
        self.window_size = window_size
        self.data = []
        self.labels = []
        # Build sliding windows: each window of window_size rows is an input,
        # and the row immediately after it is the label.
        for i in range(len(df) - window_size):
            x = torch.tensor(df.iloc[i:i+window_size].values, dtype=torch.float, device=device)
            y = torch.tensor(df.iloc[i+window_size].values, dtype=torch.float, device=device)
            self.data.append(x)
            self.labels.append(y)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

This results in the following traceback during my first training step:

Traceback (most recent call last):
  File "lstm_model.py", line 263, in <module>
    train_losses, val_losses = train_model(model, criterion, optimizer, train_loader, val_loader, epochs=100)
  File "lstm_model.py", line 212, in train_model
    train_loss += train_step(model, criterion, optimizer, x, y)
  File "lstm_model.py", line 191, in train_step
    y_pred = model(x)
  File "miniconda3/envs/pytenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "lstm_model.py", line 182, in forward
    out, _ = self.lstm(x, (h0, c0))
  File "miniconda3/envs/pytenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "miniconda3/envs/pytenv/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 774, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
RuntimeError: Placeholder storage has not been allocated on MPS device!

I've tried creating tensors in my Dataset subclass without a device specified and then calling .to(device) on them:

x = torch.tensor(df.iloc[i:i+window_size].values, dtype=torch.float)
x = x.to(device)
y = torch.tensor(df.iloc[i+window_size].values, dtype=torch.float)
y = y.to(device)

I've also tried creating the tensors without a device specified in my Dataset subclass and sending tensors to device in both the forward method of my model and in my train_step function.
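Roughly, that train_step variant looked like this (an abbreviated sketch; the real function matches the call sites in the traceback above):

def train_step(model, criterion, optimizer, x, y):
    x, y = x.to(device), y.to(device)  # move the batch to the same device as the model
    optimizer.zero_grad()
    y_pred = model(x)
    loss = criterion(y_pred, y)
    loss.backward()
    optimizer.step()
    return loss.item()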

How can I resolve my error?

Alar answered 7/12, 2022 at 23:59

Comments (4):

By any chance, are you using TensorBoard? It happens for me with a simple CNN when I try to add it to TensorBoard; without that, it works without issues. – Epithelium

I'm not; interesting, though. – Alar

It happened to me when I tried accelerating GPT2; I think it's a bug in PyTorch. – Illume

I'm having the same problem: both the model and the training data are on the right device, but I get the same error. Have you solved it by any chance? – Bohaty

A possible issue is that you are not sending the inputs to the device inside your training loop. You need to send both the model and the inputs to the device, as you can read about in this blog post.

An example would be the following:

import torch
from tqdm import tqdm

def train(model, train_loader, device, epoch):
    model.train()

    for it, batch in tqdm(enumerate(train_loader), desc=f"Epoch {epoch}: ", total=len(train_loader)):
        # move each batch to the same device the model lives on
        batch = {'data': batch['data'].to(device), 'labels': batch['labels'].to(device)}

        # perform training
        ...

# set model and device
model = MyWonderfulModel(*args)
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model.to(device)

# call training function
train(model, train_loader, device, epoch=1)

Running such a training function on my M1 Mac works with MPS.
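If the error persists, a quick sanity check (just a debugging sketch, assuming the batch dict from the example above) is to print where the model parameters and the current batch actually live right before the forward pass:

# every line should report mps:0 once model and data are on MPS
print(next(model.parameters()).device)
print(batch['data'].device, batch['labels'].device)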

Morey answered 14/3, 2023 at 8:15

Comments (3):

Good answer. To further elaborate: any object transferred to a device stays there until you move it elsewhere, and you may need to transfer an object back to the CPU to work with it further (depending on your workflow). – Baalbeer

As a side note, MPS is "Metal Performance Shaders", hardware acceleration support for ARM Macs. More here: pytorch.org/docs/stable/notes/mps.html – Salesgirl

I guess the author moved the input to the device already. Didn't he? – Bohaty

Try changing this line:

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu") # This always results in MPS

to:

device = torch.device("mps")

Neutral answered 21/2, 2023 at 5:32

Comment (1):

As it's currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. – Nkvd
