Access to ImageNet data download

I have already been granted access by the ImageNet website (http://www.image-net.org/download-images) to download the image data, and the page shows:

You have been granted access to the whole ImageNet database through our site. By doing so you agree to the terms of access.

Download as one tar file

The full ImageNet data is currently unavailable. Data for ILSVRC is available.

ImageNet Fall 2011 release MD5: ...

ImageNet10K from Deng et al. ECCV2010

But both of the links show "OOPS The url is not valid." when opened. (This is not a problem with my network or browser: the error page is rendered in the same style as the rest of the ImageNet site. I guess the links are outdated and the files have moved to other URLs, but the website has not been updated yet.)

I have two questions here.

(1) Where and how can I actually download the ImageNet data (along with the labels, for the classification task)?

(2) I want the data in order to validate the method in my paper. Even if I manage to download the dataset, I'm afraid it is unnecessarily big. Do I have to validate on ImageNet (since it is used in many papers anyway)? The Tiny ImageNet page on their website does not seem to be broken, but that dataset is much smaller.

Coxcombry answered 12/1, 2021 at 13:59 Comment(0)

It can be downloaded in Python using the Hugging Face datasets library:

>>> from datasets import load_dataset
>>> ds = load_dataset("imagenet-1k")
>>> train_ds = ds["train"]
>>> train_ds[0]["image"]  # a PIL Image

You may need to install it, along with Pillow, and log in to Hugging Face after accepting the terms of access:

pip install datasets Pillow
huggingface-cli login

You can find more info and links to download the files on the ImageNet page on Hugging Face: https://huggingface.co/datasets/imagenet-1k
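
Since the question also worries about the dataset being unnecessarily big, here is a minimal sketch of reading only a few examples rather than downloading everything first. It assumes the streaming=True option of load_dataset and a "validation" split on the Hub; the helper names first_examples and stream_imagenet_validation are made up for illustration, and depending on your datasets version you may need extra authentication arguments.

```python
from itertools import islice


def first_examples(dataset, n):
    """Return the first n examples of any iterable dataset without reading the rest."""
    return list(islice(dataset, n))


def stream_imagenet_validation(n=8):
    """Fetch n validation examples lazily (hypothetical helper, needs network access)."""
    # Imported here so first_examples stays usable without the library installed.
    from datasets import load_dataset

    # Assumption: streaming=True yields examples on the fly instead of
    # downloading all files up front; requires a prior `huggingface-cli login`.
    ds = load_dataset("imagenet-1k", split="validation", streaming=True)
    return first_examples(ds, n)
```

After logging in, stream_imagenet_validation(4) should return a list of four examples, each with an "image" entry as in the snippet above.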

Undress answered 6/12, 2022 at 15:54 Comment(1)
This is currently the best solution, since the Kaggle dataset is the "full" ImageNet set that's going to take up 160+ GB. ImageNet1K from HF is fairly easy to set up following these particular instructions. - Meddlesome

ImageNet Download:

Go to https://www.kaggle.com/c/imagenet-object-localization-challenge and click on the Data tab. You can download the files you want directly from that page, or use the Kaggle API to download them onto a remote computer.

There, they provide both the labels and the image data.

I don't know what is up with the ImageNet website, but the URL list links were also broken for me today. One way you can still get the data is from an alternate mirror, such as the Kaggle ImageNet download linked above. From what I have heard, the Kaggle ImageNet is equivalent to the ImageNet from their website.

I'm unsure about how to answer your second question, as I don't know enough about your project. However, ImageNet will probably work to validate your model.

Vanguard answered 14/1, 2021 at 20:20 Comment(4)
Recently ImageNet took down the Download Image URLs section, but it is archived on archive.org. - Excipient
How can we find the image URLs section? - Larger
kaggle.com/c/imagenet-object-localization-challenge downloads a folder which contains Data, Annotations and ImageSets. Can anybody give a hint on how to use these folders for training or testing, e.g. a VGG16 or ResNet34 PyTorch model? - Gumbo
The command to download the dataset is already on the page: kaggle competitions download -c imagenet-object-localization-challenge - Bifacial
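
On the question of how to use the Kaggle folders for training: a rough sketch, under assumptions not confirmed in this thread. It assumes the download includes LOC_synset_mapping.txt (one "wnid description" per line) and LOC_val_solution.csv (columns ImageId and PredictionString, where the first token of PredictionString is the WordNet ID). It also assumes training images sit in one subfolder per WordNet ID, so a folder-per-class loader such as torchvision's ImageFolder could read them directly. The function names are hypothetical.

```python
import csv
import io


def read_synsets(lines):
    """Map each WordNet ID (e.g. 'n01440764') to a class index by line order."""
    wnids = [line.split(maxsplit=1)[0] for line in lines if line.strip()]
    return {wnid: idx for idx, wnid in enumerate(wnids)}


def read_val_labels(csv_file, wnid_to_idx):
    """Map each validation image ID to the class index of its first annotation."""
    labels = {}
    for row in csv.DictReader(csv_file):
        # Assumed format: "<wnid> <xmin> <ymin> <xmax> <ymax> ..."
        wnid = row["PredictionString"].split()[0]
        labels[row["ImageId"]] = wnid_to_idx[wnid]
    return labels


# Tiny in-memory stand-ins for the two files (illustrative data only).
synset_lines = [
    "n01440764 tench, Tinca tinca\n",
    "n01443537 goldfish, Carassius auratus\n",
]
val_csv = io.StringIO(
    "ImageId,PredictionString\n"
    "ILSVRC2012_val_00000001,n01443537 10 20 30 40\n"
)
wnid_to_idx = read_synsets(synset_lines)
val_labels = read_val_labels(val_csv, wnid_to_idx)
```

With the real files you would pass open file handles instead of the in-memory samples, then use val_labels inside a custom PyTorch Dataset for the validation images.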
