Are there similar datasets to MNIST?
P

3

9

I am doing research on machine learning and want to test my algorithms on some well-known datasets. Since I am a newbie in this area, I can't find other suitable datasets apart from MNIST. I think MNIST is quite suitable for our research. Does anyone know of datasets similar to MNIST?

P.S. I know of another commonly used handwritten digit dataset, the USPS dataset. But I need a dataset with more training examples (more than 10,000, comparable to the number of training examples in MNIST), so USPS is ruled out.

Punctate answered 23/3, 2013 at 8:38 Comment(2)
This depends on what you want to do. MNIST is a great dataset that contains handwritten digits. Do you want to work on handwritten digits or something else (faces, handwritten letters, etc.)? - Mckamey
You can find an already decoded version of the MNIST dataset here: mnist-decoded.000webhostapp.com - Sikhism
C
5

The UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/) contains a wide variety of datasets, including some that, like MNIST, are suitable for classification, e.g. Skin Segmentation (http://archive.ics.uci.edu/ml/datasets/Skin+Segmentation).

I can't say which of them would be suitable without knowing what you're trying to demonstrate with your algorithm, but the datasets in the UCI archive are widely used and well known.

Cleland answered 23/3, 2013 at 8:45 Comment(0)
V
4

You can try Fashion-MNIST or Kuzushiji-MNIST, which have very similar properties to MNIST (same image size, number of classes, and training set size) but are a bit harder to classify. From Fashion-MNIST's page:

Seriously, we are talking about replacing MNIST. Here are some good reasons:

  • MNIST is too easy. Convolutional nets can achieve 99.7% on MNIST. Classic machine learning algorithms can also achieve 97% easily. Check out our side-by-side benchmark for Fashion-MNIST vs. MNIST, and read "Most pairs of MNIST digits can be distinguished pretty well by just one pixel."
  • MNIST is overused. In this April 2017 Twitter thread, Google Brain research scientist and deep learning expert Ian Goodfellow calls for people to move away from MNIST.
  • MNIST can not represent modern CV tasks, as noted in this April 2017 Twitter thread, deep learning expert/Keras author François Chollet.
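Because Fashion-MNIST and Kuzushiji-MNIST are published in the same IDX file format as MNIST, code that reads MNIST files should work on them unchanged. Below is a minimal sketch of such a reader using only the standard library. The function names `parse_idx` and `read_idx` are made up for illustration, and the sketch only handles the unsigned-byte variant of the format, which is what the image and label files use.

```python
import gzip
import struct


def parse_idx(raw: bytes):
    """Parse unsigned-byte IDX data into (shape, flat_bytes)."""
    # Header: two zero bytes, one type code, one byte for the number of dims.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(f">{ndim}I", raw[4:4 + 4 * ndim])
    data = raw[4 + 4 * ndim:]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, found {len(data)}")
    return shape, data


def read_idx(path):
    """Read an IDX file from disk, transparently handling .gz files."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as fh:
        return parse_idx(fh.read())


# Tiny synthetic "image file": 2 images of 2x2 pixels.
fake = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 2) + bytes(range(8))
shape, pixels = parse_idx(fake)
second_image = list(pixels[4:8])
```

With the real archives, something like `read_idx("train-images-idx3-ubyte.gz")` should return a shape of `(60000, 28, 28)`, assuming the file has been downloaded under that name.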
Vitriol answered 24/3, 2020 at 4:40 Comment(0)
A
0

I know this question is old, but I hope my suggestions can still be useful. I was also looking for datasets similar to MNIST and Fashion-MNIST. PyTorch (through torchvision) provides several of them with documentation: KMNIST, QMNIST, USPS, SEMEION, and SVHN, amongst others. Check the torchvision datasets documentation for the full list.
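As a rough sketch, the datasets above can be filtered by the question's requirement of more than 10,000 training examples and then loaded through torchvision. The training-set sizes in the table are figures recalled from the dataset descriptions and should be checked against the documentation. `TRAIN_SIZES`, `large_enough`, and `load_training_set` are hypothetical helper names, not part of torchvision.

```python
# Approximate training-set sizes (assumption: verify against the docs).
TRAIN_SIZES = {
    "FashionMNIST": 60_000,
    "KMNIST": 60_000,
    "QMNIST": 60_000,
    "SVHN": 73_257,
    "USPS": 7_291,
    "SEMEION": 1_593,  # shipped as a single split
}


def large_enough(min_train=10_000):
    """Names of datasets with more than min_train training examples."""
    return sorted(name for name, n in TRAIN_SIZES.items() if n > min_train)


def load_training_set(name, root="./data"):
    """Download (if needed) and return the training split of a dataset."""
    # Imported lazily so the size filter works without torchvision installed.
    from torchvision import datasets

    cls = getattr(datasets, name)
    if name == "SVHN":
        # SVHN selects its split with a string instead of a train flag.
        return cls(root, split="train", download=True)
    if name == "SEMEION":
        # SEMEION has no train/test flag.
        return cls(root, download=True)
    return cls(root, train=True, download=True)


candidates = large_enough()
```

Calling `load_training_set("KMNIST")` would then download the data into `./data` on first use. USPS and SEMEION fall below the threshold, which matches the question's remark that USPS is too small.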

Adiel answered 17/8, 2023 at 19:24 Comment(0)
