How to resolve the error: RuntimeError: received 0 items of ancdata
I have a torch.utils.data.DataLoader, which I created with the following code.

import torch
import torchvision.transforms as transforms

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

trainset = CIFAR100WithIdx(root='.',
                           train=True,
                           download=True,
                           transform=transform_train,
                           rand_fraction=args.rand_fraction)

train_loader = torch.utils.data.DataLoader(trainset,
                                           batch_size=args.batch_size,
                                           shuffle=True,
                                           num_workers=args.workers)

But when I run the following code, I get an error.

train_loader_2 = []
for i, (inputs, target, index_dataset) in enumerate(train_loader):
    train_loader_2.append((inputs, target, index_dataset))

The error is

Traceback (most recent call last):
  File "main_superloss.py", line 460, in <module>
    main()
  File "main_superloss.py", line 456, in main
    main_worker(args)
  File "main_superloss.py", line 374, in main_worker
    train_loader, val_loader = get_train_and_val_loader(args)
  File "main_superloss.py", line 120, in get_train_and_val_loader
    for i, (inputs, target, index_dataset) in enumerate(train_loader):
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 804, in __next__
    idx, data = self._get_data()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 771, in _get_data
    success, data = self._try_get_data()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 724, in _try_get_data
    data = self.data_queue.get(timeout=timeout)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 284, in rebuild_storage_fd
    fd = df.detach()
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/C00423766/.conda/envs/dp/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata

The reason I want to collect the data in a list is that I want to reorder the samples, and not in a random way but in a particular order. How can I do that?

Mittel answered 28/3, 2022 at 5:10

I was facing a similar issue with my code. Based on some discussions (check #1, #2, #3), I used ulimit -n 2048 to increase the maximum number of file descriptors a process can have. You can read more about ulimit here.

About the issue: the discussions suggest that it has something to do with PyTorch's forked multiprocessing code, which passes tensor storage between worker processes over file descriptors and can exhaust the per-process limit.
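An alternative that avoids raising the limit at all is to switch PyTorch's multiprocessing sharing strategy from file descriptors to the file system. This is a documented PyTorch setting, not something from the discussions above; a minimal sketch:

```python
import torch.multiprocessing

# 'file_system' shares tensors through named files in shared memory instead
# of sending file descriptors over a socket, so worker traffic never
# exhausts the per-process fd limit.
torch.multiprocessing.set_sharing_strategy('file_system')
```

Call this once at startup, before any DataLoader workers are created.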

On the second part of your question, how to reorder a DataLoader, you can refer to this answer.
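As a sketch of the reordering idea, without collecting every batch into a list: you can pass a custom sampler to the DataLoader so it yields samples in exactly the order you specify. FixedOrderSampler and the toy dataset below are illustrative, not taken from the linked answer:

```python
import torch
from torch.utils.data import DataLoader, Sampler, TensorDataset

class FixedOrderSampler(Sampler):
    """Yield dataset indices in a caller-specified, fixed order."""
    def __init__(self, order):
        self.order = list(order)

    def __iter__(self):
        return iter(self.order)

    def __len__(self):
        return len(self.order)

# toy dataset with values 0..4; request the order 4, 2, 0, 1, 3
data = TensorDataset(torch.arange(5).float())
loader = DataLoader(data, batch_size=1,
                    sampler=FixedOrderSampler([4, 2, 0, 1, 3]))
seen = [int(batch[0].item()) for batch in loader]
# seen == [4, 2, 0, 1, 3]
```

The same sampler works with your CIFAR loader: compute the index order you want (e.g. by some per-sample score) and pass it instead of shuffle=True.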

Sundew answered 8/7, 2022 at 9:44
import resource

# Raise the soft limit on open file descriptors to 2048,
# keeping the existing hard limit unchanged.
rlimit = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (2048, rlimit[1]))
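One caveat with this snippet: an unprivileged process can raise its soft limit only up to the current hard limit, and setrlimit raises ValueError otherwise. A defensive variant of the same idea, capping the request:

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# never request more than the hard limit; RLIM_INFINITY means "no hard cap"
new_soft = min(2048, hard) if hard != resource.RLIM_INFINITY else 2048
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
```

Note this only affects the current process and its children, not the shell you launched it from.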
Apiculture answered 8/1, 2023 at 7:36

I was able to fix it with

sudo vim /etc/security/limits.conf
# add `* soft nofile 4096` to the end of the file (without a leading `#`)
sudo vim /etc/pam.d/common-session
# add `session required pam_limits.so` to the end of the file (without a leading `#`)
# Log out and log back in, then check with `ulimit -n`.
# NOTE: ssh and screen sessions must also be restarted, as they remember the old fd limit.
Pierette answered 28/2, 2023 at 16:54

On Ubuntu you need to do the following to solve this problem for all users:

Add the line

session required pam_limits.so 

to the common-session* files (there are multiple!):

$ sudo nano /etc/pam.d/common-session
$ sudo nano /etc/pam.d/common-session-noninteractive

Afterwards add the lines

* soft nofile 4096
* hard nofile 4096

to the limits.conf file

$ sudo nano /etc/security/limits.conf

Then after a relogin you should see

$ ulimit -a
...
open files                          (-n) 4096
...

This should fix the problem forever on your Ubuntu machine.


Overawe answered 3/8, 2023 at 9:14

© 2022 - 2024 — McMap. All rights reserved.