Tuesday, April 21, 2020

Several ways to use the MNIST dataset

Introduction

The MNIST handwritten digit database is one of the most famous datasets in machine learning.

Several well-known libraries provide it, but because there seem to be several ways to load it and several variants of the dataset, I got confused about whether they differ.

I guess there are others like me, so I wrote this article to sort out the confusing information.

I already wrote this article in Japanese and cited my references there, so in this article I will keep references to a minimum.

Assumption

I assume that you have already installed sklearn, tensorflow and pytorch. (For my part, I installed them with Anaconda.)

I also use Mac OS X.

Notation

There are two kinds of handwritten digit datasets that get called MNIST.

The first is the one bundled with sklearn.

The second is everything else.

The first one is made up of 8×8 pixel images.

The second is made up of 28×28 pixel images.

The data bundled with sklearn (8×8 pixels)

Where it is

The dataset bundled with sklearn is in the following directory.

/(depending on environment respectively)/lib/python3.7/site-packages/sklearn/datasets
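If you are not sure where that directory is in your own environment, a short snippet along these lines should print it. This is only a sketch, and the variable name datasets_dir is my own.

```python
import os
import sklearn.datasets

# The package's __file__ points at .../sklearn/datasets/__init__.py,
# so its parent directory is the datasets directory itself.
datasets_dir = os.path.dirname(sklearn.datasets.__file__)
print(datasets_dir)
```

The printed path is what you would pass to ls in the commands that follow.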

The following is my case. (I use Anaconda.)

$ls  /Users/hiroshi/opt/anaconda3/lib/python3.7/site-packages/sklearn/    

__check_build   dummy.py   model_selection
__init__.py   ensemble   multiclass.py
__pycache__   exceptions.py   multioutput.py
_build_utils   experimental   naive_bayes.py
_config.py   externals   neighbors
_distributor_init.py  feature_extraction  neural_network
_isotonic.cpython-37m-darwin.so feature_selection  pipeline.py
base.py    gaussian_process  preprocessing
calibration.py   impute    random_projection.py
cluster    inspection   semi_supervised
compose    isotonic.py   setup.py
conftest.py   kernel_approximation.py  svm
covariance   kernel_ridge.py   tests
cross_decomposition  linear_model   tree
datasets   manifold   utils
decomposition   metrics
discriminant_analysis.py mixture

Looking inside the datasets directory, you will find the following.

$ls /Users/hiroshi/opt/anaconda3/lib/python3.7/site-packages/sklearn/datasets

__init__.py     california_housing.py
__pycache__     covtype.py
_base.py     data
_california_housing.py    descr
_covtype.py     images
_kddcup99.py     kddcup99.py
_lfw.py      lfw.py
_olivetti_faces.py    olivetti_faces.py
_openml.py     openml.py
_rcv1.py     rcv1.py
_samples_generator.py    samples_generator.py
_species_distributions.py   setup.py
_svmlight_format_fast.cpython-37m-darwin.so species_distributions.py
_svmlight_format_io.py    svmlight_format.py
_twenty_newsgroups.py    tests
base.py      twenty_newsgroups.py

Here you can see the other datasets besides MNIST.

Going one level deeper, into the data directory, you will find the following.

$ ls /Users/hiroshi/opt/anaconda3/lib/python3.7/site-packages/sklearn/datasets/data
boston_house_prices.csv  diabetes_target.csv.gz  linnerud_exercise.csv
breast_cancer.csv  digits.csv.gz   linnerud_physiological.csv
diabetes_data.csv.gz  iris.csv   wine_data.csv

Here there are datasets such as the iris dataset and the Boston house prices dataset, which are often referred to in articles about sklearn.

How to import dataset

This is the same as on the official sklearn page.

The following steps are done after launching python from the terminal.

>>> from sklearn.datasets import load_digits
>>> import matplotlib.pyplot as plt
>>> digit=load_digits()
>>> digit.data.shape
(1797, 64)     

>>> plt.gray()
>>> digit.images[0]
array([[ 0.,  0.,  5., 13.,  9.,  1.,  0.,  0.],
       [ 0.,  0., 13., 15., 10., 15.,  5.,  0.],
       [ 0.,  3., 15.,  2.,  0., 11.,  8.,  0.],
       [ 0.,  4., 12.,  0.,  0.,  8.,  8.,  0.],
       [ 0.,  5.,  8.,  0.,  0.,  9.,  8.,  0.],
       [ 0.,  4., 11.,  0.,  1., 12.,  7.,  0.],
       [ 0.,  2., 14.,  5., 10., 12.,  0.,  0.],
       [ 0.,  0.,  6., 13., 10.,  0.,  0.,  0.]])
>>> plt.matshow(digit.images[0])
>>> plt.show()

And the following image will appear.
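As a small supplement to the session above, the sketch below shows how the data and images attributes relate to each other. It assumes only the bundled load_digits dataset; the variable names first and same are mine.

```python
from sklearn.datasets import load_digits

digits = load_digits()

# `images` keeps the 8x8 layout; `data` holds the same pixels flattened to 64 values.
print(digits.images.shape)  # (1797, 8, 8)
print(digits.data.shape)    # (1797, 64)

# Reshaping one row of `data` gives back the corresponding image.
first = digits.data[0].reshape(8, 8)
same = bool((first == digits.images[0]).all())
print(same)  # True

# `target` holds the digit label of each image.
print(digits.target[0])  # 0
```

So data is the form you would usually feed to a classifier, and images is the form that is convenient for plotting.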

Download the original dataset (28×28 pixels)

The original MNIST dataset is on this page.

But the data you can get there is binary data that you cannot use as it is.

So you need to process it before you can use it.

But as you will see, the MNIST dataset is so famous that there are a lot of tools that let you use it immediately.

Of course, there is a way to process the files yourself, but I could not keep up with it and I doubted it was worth the time to figure out, so I am not going to talk about it in detail.
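Just to give a rough idea of what that manual processing involves, here is a minimal sketch. It assumes the IDX layout described on the original MNIST page: a big-endian header of four 32-bit integers (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The function name parse_idx_images and the tiny fake file are my own, made up for illustration, and I have not run this against the real files.

```python
import struct


def parse_idx_images(raw: bytes):
    """Parse IDX image bytes into a list of images (each a list of pixel rows)."""
    # Header: magic number, number of images, rows, columns (big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"unexpected magic number: {magic}")
    pixels = raw[16:]
    size = rows * cols
    images = []
    for i in range(count):
        flat = pixels[i * size:(i + 1) * size]
        # Cut the flat byte string into `rows` rows of `cols` pixel values.
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


# A tiny synthetic stand-in for a real file: 2 images of 2x3 pixels.
fake = struct.pack(">IIII", 2051, 2, 2, 3) + bytes(range(12))
images = parse_idx_images(fake)
print(len(images))
print(images[1])
```

The downloaded files are gzip-compressed, so in practice you would first read the bytes with gzip.open(path, "rb").read() and pass the result to a function like this.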

Download via sklearn (28×28 pixels)

Searching the internet, I found the following approach in some old articles.

from sklearn.datasets import fetch_mldata

But it raises an error, because the website it is supposed to access is no longer available.

So nowadays it seems that we have to use fetch_openml, as follows.

>>> import matplotlib.pyplot as plt
>>> from sklearn.datasets import fetch_openml
>>> # as_frame=False returns NumPy arrays, so data[0] below is the first image
>>> digits = fetch_openml(name='mnist_784', version=1, as_frame=False)
>>> digits.data.shape
(70000, 784)
>>> plt.imshow(digits.data[0].reshape(28,28), cmap=plt.cm.gray_r)

>>> plt.show()
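Before using this data for training, some preparation is usually needed. The sketch below is only an assumption-laden example: as far as I know, the labels of mnist_784 come back as strings, the pixel values range from 0 to 255, and by convention the first 60000 rows are the training set. The function name prepare and the small fake arrays are mine; with the real data you would pass digits.data and digits.target.

```python
import numpy as np


def prepare(data, target, n_train=60000):
    """Scale pixels to [0, 1], convert labels to integers, split into train/test."""
    X = np.asarray(data, dtype=np.float32) / 255.0
    y = np.asarray(target).astype(np.int64)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


# Small stand-in arrays so the sketch runs without downloading anything.
fake_data = np.arange(5 * 784).reshape(5, 784) % 256
fake_target = np.array(["5", "0", "4", "1", "9"])

(train_X, train_y), (test_X, test_y) = prepare(fake_data, fake_target, n_train=3)
print(train_X.shape, test_X.shape)
print(train_y)
```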

tensorflow (28×28 pixels)

This is the way that uses the tensorflow tutorials.

>>> from tensorflow.examples.tutorials.mnist import input_data

This command is supposed to let us import MNIST, but it did not. In my case I got the following error.

It seems there are cases where the directory containing the tutorials is not installed along with tensorflow.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow.examples.tutorials'

I actually checked inside the directory. This is the result.

$ls /Users/hiroshi/opt/anaconda3/lib/python3.7/site-packages/tensorflow_core/examples/
__init__.py __pycache__ saved_model

I referred to the following pages.

First, go to the TensorFlow GitHub page, download the zip file to any location, and extract it.

You will find a directory named "tensorflow-master", and inside tensorflow-master\tensorflow\examples\ there is a directory named "tutorials".

Copy the "tutorials" directory into "/Users/hiroshi/opt/anaconda3/lib/python3.7/site-packages/tensorflow_core/examples/".

Then,

>>> import matplotlib.pyplot as plt   
>>> from tensorflow.examples.tutorials.mnist import input_data
>>> mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
>>> im = mnist.train.images[1]
>>> im = im.reshape(-1, 28)
>>> plt.imshow(im)

>>> plt.show()

and you can see an MNIST image.
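One detail in the session above is the one_hot=True argument. With it, each label comes back as a 10-element vector instead of a single digit. The idea can be sketched with NumPy alone; the labels array here is made up for illustration and is not read from MNIST.

```python
import numpy as np

# Hypothetical digit labels, standing in for real MNIST labels.
labels = np.array([5, 0, 4, 1])

# Row i of the 10x10 identity matrix is the one-hot vector for digit i.
one_hot = np.eye(10)[labels]
print(one_hot[0])  # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]

# argmax along each row recovers the original digit.
recovered = one_hot.argmax(axis=1)
print(recovered)
```

If you want plain digit labels instead, taking argmax as above is one way to convert back.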

keras (28×28 pixels)

>>> import matplotlib.pyplot as plt   
>>> import tensorflow as tf
>>> mnist = tf.keras.datasets.mnist
>>> mnist
>>> mnist_data = mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
>>> type(mnist_data[0])
<class 'tuple'>
>>> len(mnist_data[0])
2
>>> len(mnist_data[0][0])
60000
>>> len(mnist_data[0][0][1])
28
>>> mnist_data[0][0][1].shape
(28, 28)

>>> plt.imshow(mnist_data[0][0][1],cmap=plt.cm.gray_r)

>>> plt.show()

I am not going to show the image I got, but if the steps succeed, you should see it.

pytorch (28×28 pixels)

It seems that you cannot go any further unless the following command runs.

>>> from torchvision.datasets import MNIST

I got an error.

It seems that torchvision did not exist.

In my case, I installed pytorch with only the following command. That seems to be the reason.

$conda install pytorch

To install the optional packages such as torchvision as well, you have to do as follows.

$conda install pytorch torchvision -c pytorch

You will be asked to choose y or n; choose y.

After doing that (if you needed to), run the following commands.

>>> import matplotlib.pyplot as plt
>>> import torchvision.transforms as transforms
>>> from torch.utils.data import DataLoader
>>> from torchvision.datasets import MNIST
>>> mnist_data = MNIST('~/tmp/mnist', train=True, download=True, transform=transforms.ToTensor())
>>> data_loader = DataLoader(mnist_data,batch_size=4,shuffle=False)
>>> data_iter = iter(data_loader)
>>> # next(data_iter) also works on PyTorch versions where data_iter.next() is gone
>>> images, labels = next(data_iter)
>>> npimg = images[0].numpy()
>>> npimg = npimg.reshape((28, 28))
>>> plt.imshow(npimg, cmap='gray')

>>> plt.show()

The sources I referred to

The original dataset of mnist

MNIST handwritten digit database, Yann LeCun, Corinna Cortes and Chris Burges

sklearn

Tensorflow

The others

OpenML

Sunday, February 16, 2020

What happens if you do not rehair your bow

The other day I found that my bow's hair could not be tightened.

I guessed that it was because I had not had the bow rehaired for a long time.

Furthermore, part of the bow's frog seemed to be broken, so I could not adjust how tight the hair was.

We can see a crack on the frog (1)

We can see a crack on the frog (2)

In this state, my bow might break completely during a performance. As I want to avoid such a situation, I thought I should do something about it.

When I went to the repair shop, I was told that it would be too hard to repair. They explained the reason and the structure of the bow.

According to their explanation, my bow's hair has stretched so much, and the room for adjusting its tightness has been used up so completely, that I cannot tighten it any more.

Moreover, if I force the screw in such a state, the screw may break.

They say a repair would cost as much as buying a new bow. So I hesitated over whether to buy a new one or to borrow one from an acquaintance for the time being to get through the performance.

Fortunately I found an acquaintance who would lend me a bow, so I decided to use it.

But it is true that my bow is very old, so I will buy a new one after tomorrow's performance.

Saturday, February 8, 2020

I took part in an English event

I am going to talk about this week.

This week was a little busy.

First, as the project I am taking part in will finish this month, I was introduced to the next project and went to an interview with the client.

Though I cannot talk about the details, the next one seems to be related to RPA.

I hear it is popular these days.

I am not sure yet whether I will be taken on, but I am looking forward to joining the next project.

And then, on Friday.

I took part in an outside event about English.

It is the 1000 speakers conference. It was my first time.

https://1000-speakers.connpass.com/event/163088/

According to its introduction, this event is held to give people as many opportunities as possible to give presentations in English.

The event aims to eventually give 1000 speakers that opportunity.

They do not force us to present, and it is OK just to listen.

As I was a newcomer, I did not give a presentation. But overall the atmosphere was so good that I want to take part next time too, with an interesting presentation prepared.

If you want to try joining, ask me and let's join together!!

Sunday, February 2, 2020

Information from our eyes

As my current job is engineering, I should not get very tired, at least physically.

But I am always tired.

Although there are a lot of things I have to do, I do not feel very motivated.

Some may say that I do not have enough spirit. But I do not like that kind of spiritual discussion.

I am fine with using words such as "spirit" or "guts" as a very last resort. But there may be rational measures we should take before that.

I tried to consider why I am always tired.

The first reason is that my posture is bad.

The other day, I was told that my posture was so bad that I seemed to lack confidence.

Checking my posture, I found that I work with bad posture all day. And it makes my neck really stiff.

I also found that my posture is good when I play my instrument.

Although I could not picture what good posture is, working with good posture made my neck a little better.

The next reason is the jam-packed train.

Just hearing the words "jam-packed train" makes me tired. But let's dive into it.

I feel I take in a lot of information just by being on the train.

For instance, the enormous number of people in front of my face, the movement of people every time the train arrives at the next station, smells, advertisements, the landscape outside the train, the noise made by the train, and information from the mobile phone that I check almost automatically.

In addition, people pressing against me is also stressful.

Moreover, as I am an engineer, I look at a computer display all day.

A lot of stimuli invoke rambling thoughts. So I feel my attention scatters everywhere and all kinds of thoughts run through my mind.

As I thought it was too much information, I tried to close my eyes as much as possible.

Closing my eyes and restricting information from outside makes my brain neutral and, if possible, lets me turn inward.

Reacting to each stimulus one by one, consciously or unconsciously, I had felt hurried into doing a lot of things.

But closing my eyes makes me relaxed.

It might be a kind of mindfulness.

Saturday, January 25, 2020

Take it easy

This is the first post of this year.

Last year I posted some articles. But since then I have not been able to post any.

I resolved to write posts and tried to consider how I can keep writing articles.

I have so many tasks and also run another blog in Japanese, so I cannot give first priority to writing these blog posts.

As a result, taking it easy, I resolved to try to write one post a week.
(although I have already missed the first post of this year)

From the beginning of this year, I have been reviewing vocabulary and colloquial English.

There are about 800 words and 300 phrases.

As I review them every day, I have come to answer in a flash.

Repeated review is important.

That's about it.

I have made it a rule to take it easy. So see you in the next post.