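Handwritten digit recognition: image recognition

クジラ飛行机, 「ゼロからやさしくはじめるPython入門」, Chapter 7, step 04

y = f(p, x), where x is a 64-dimensional vector (an 8x8 image) and y is one of the digits 0, 1, 2, 3, ..., 9.

Using labeled (supervised) data, we learn (tune) the parameters p.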
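
To make the notation concrete, here is a minimal sketch of one possible form of f: a linear classifier that scores each of the 10 digits and returns the one with the highest score. The names `f`, `weights`, and `biases` are hypothetical, and the parameters are hand-set rather than learned. The classifiers trained later in this notebook are scikit-learn models, not this function.

```python
# Hypothetical linear form of y = f(p, x), with p = (weights, biases).
def f(p, x):
    weights, biases = p                      # 10 rows of 64 weights, 10 biases
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return scores.index(max(scores))         # digit with the highest score

# Untrained parameters: all zeros, except a bias that favors digit 3.
weights = [[0.0] * 64 for _ in range(10)]
biases = [0.0] * 10
biases[3] = 1.0
p = (weights, biases)

x = [0.0] * 64                               # a blank 8x8 image, flattened
y = f(p, x)
print(y)                                     # 3
```

Learning then means adjusting `weights` and `biases` so that f returns the correct digit for as many labeled examples as possible.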

Loading the dataset

In [1]:
from sklearn.datasets import load_digits
digits = load_digits()    # load the handwritten digits dataset and bind it to the name digits
# keys of the digits dataset
print( digits.keys() )
dict_keys(['data', 'target', 'target_names', 'images', 'DESCR'])

Dataset description

In [2]:
print(digits.DESCR)       # print the dataset description
.. _digits_dataset:

Optical recognition of handwritten digits dataset
--------------------------------------------------

**Data Set Characteristics:**

    :Number of Instances: 5620
    :Number of Attributes: 64
    :Attribute Information: 8x8 image of integer pixels in the range 0..16.
    :Missing Attribute Values: None
    :Creator: E. Alpaydin (alpaydin '@' boun.edu.tr)
    :Date: July; 1998

This is a copy of the test set of the UCI ML hand-written digits datasets
https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits

The data set contains images of hand-written digits: 10 classes where
each class refers to a digit.

Preprocessing programs made available by NIST were used to extract
normalized bitmaps of handwritten digits from a preprinted form. From a
total of 43 people, 30 contributed to the training set and different 13
to the test set. 32x32 bitmaps are divided into nonoverlapping blocks of
4x4 and the number of on pixels are counted in each block. This generates
an input matrix of 8x8 where each element is an integer in the range
0..16. This reduces dimensionality and gives invariance to small
distortions.

For info on NIST preprocessing routines, see M. D. Garris, J. L. Blue, G.
T. Candela, D. L. Dimmick, J. Geist, P. J. Grother, S. A. Janet, and C.
L. Wilson, NIST Form-Based Handprint Recognition System, NISTIR 5469,
1994.

.. topic:: References

  - C. Kaynak (1995) Methods of Combining Multiple Classifiers and Their
    Applications to Handwritten Digit Recognition, MSc Thesis, Institute of
    Graduate Studies in Science and Engineering, Bogazici University.
  - E. Alpaydin, C. Kaynak (1998) Cascading Classifiers, Kybernetika.
  - Ken Tang and Ponnuthurai N. Suganthan and Xi Yao and A. Kai Qin.
    Linear dimensionalityreduction using relevance weighted LDA. School of
    Electrical and Electronic Engineering Nanyang Technological University.
    2005.
  - Claudio Gentile. A New Approximate Maximal Margin Classification
    Algorithm. NIPS. 2000.
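
The description above says each 32x32 bitmap is divided into non-overlapping 4x4 blocks and the "on" pixels in each block are counted, giving an 8x8 matrix with values in 0..16. A minimal sketch of that counting step, in plain Python, follows. The function name `block_counts` and the input bitmap are hypothetical; this is not the NIST preprocessing code.

```python
# Downsample a square binary bitmap by counting "on" pixels per block.
def block_counts(bitmap, block=4):
    size = len(bitmap) // block
    return [[sum(bitmap[r * block + dr][c * block + dc]
                 for dr in range(block) for dc in range(block))
             for c in range(size)]
            for r in range(size)]

# Hypothetical input: a 32x32 bitmap whose left half is on.
bitmap = [[1 if col < 16 else 0 for col in range(32)] for _ in range(32)]
small = block_counts(bitmap)

print(len(small), len(small[0]))   # 8 8
print(small[0])                    # [16, 16, 16, 16, 0, 0, 0, 0]
```

A 4x4 block holds 16 pixels, which is why each element lies between 0 and 16.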

Displaying the handwritten digit data as images

In [3]:
%matplotlib inline
import matplotlib.pyplot as plt
In [4]:
# Show one handwritten digit: 8px * 8px = 64px, each pixel's intensity is a value in 0..16
print( "number of data:", len(digits.data) )
n = 500
print("data number:n=", n)
print( digits.data[n] )                 # array of size 64 (a 64-dimensional vector)
print( "target:", digits.target[n] )    # correct label
print( "images:", digits.images[n] )    # 2D array (8x8), element values in 0..16
plt.matshow(digits.images[n], cmap="gray")
plt.show()
number of data: 1797
data number:n= 500
[ 0.  0.  3. 10. 14.  3.  0.  0.  0.  8. 16. 11. 10. 13.  0.  0.  0.  7.
 14.  0.  1. 15.  2.  0.  0.  2. 16.  9. 16. 16.  1.  0.  0.  0. 12. 16.
 15. 15.  2.  0.  0.  0. 12. 10.  0.  8.  8.  0.  0.  0.  9. 12.  4.  7.
 12.  0.  0.  0.  2. 11. 16. 16.  9.  0.]
target: 8
images: [[ 0.  0.  3. 10. 14.  3.  0.  0.]
 [ 0.  8. 16. 11. 10. 13.  0.  0.]
 [ 0.  7. 14.  0.  1. 15.  2.  0.]
 [ 0.  2. 16.  9. 16. 16.  1.  0.]
 [ 0.  0. 12. 16. 15. 15.  2.  0.]
 [ 0.  0. 12. 10.  0.  8.  8.  0.]
 [ 0.  0.  9. 12.  4.  7. 12.  0.]
 [ 0.  0.  2. 11. 16. 16.  9.  0.]]
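
The two printouts contain the same 64 numbers: the vector in `digits.data[n]` is the 8x8 array in `digits.images[n]` read row by row. A small sketch of that row-major layout, using a hypothetical 8x8 array whose values are chosen only to make positions easy to see:

```python
# Row-major flattening: pixel (r, c) of the 8x8 image is element 8*r + c of the vector.
image = [[8 * r + c for c in range(8)] for r in range(8)]   # hypothetical 8x8 values
vector = [value for row in image for value in row]          # flatten to 64 elements

r, c = 3, 5
print(len(vector))                        # 64
print(vector[8 * r + c] == image[r][c])   # True
```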
In [5]:
fig, axes = plt.subplots(8, 10, figsize=(14, 14))  # 8 rows x 10 columns; figsize is in inches, not pixels
for i, ax in enumerate(axes.flat):
    ax.matshow(digits.images[i], cmap="gray")   # i-th image goes into the (i+1)-th cell of the grid
    ax.set_xticks([])    # hide the x-axis ticks
    ax.set_yticks([])    # hide the y-axis ticks
    ax.set_title(str(digits.target[i]), y=-0.3)   # show the label below the image
In [6]:
# Split the data into a training set and a test set --- (*1)
from sklearn.model_selection import train_test_split as split
x_train, x_test, y_train, y_test = split(digits.data, digits.target)
print( "len(x_train)=", len(x_train) )
len(x_train)= 1347
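
The training-set size of 1347 is consistent with `train_test_split` holding out its default 25% of the 1797 samples for testing. Because the call above passes no `random_state`, the split (and therefore the accuracies below) changes from run to run. The sketch below imitates a shuffled split in plain Python. `simple_split` and its `seed` argument are hypothetical, and it is not scikit-learn's implementation.

```python
import math
import random

# Sketch of a shuffled train/test split over sample indices.
def simple_split(n_samples, test_fraction=0.25, seed=0):
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # fixed seed -> reproducible split
    n_test = math.ceil(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]     # train indices, test indices

train_idx, test_idx = simple_split(1797)
print(len(train_idx), len(test_idx))              # 1347 450
```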
In [7]:
# Train on the data: algorithm SVC (Support Vector Classification) --- (*2)
from sklearn import svm
clf = svm.SVC(gamma = 'auto')    # create the SVC object; note that gamma = 'auto' is set explicitly
clf.fit(x_train, y_train )       # fit on the training data

# Evaluate the model --- (*3)
pred = clf.predict(x_test)    # predict on the test data
print("pred=", pred)
# Count how often an element of pred matches the corresponding element of y_test (the labels) to get the accuracy
result = list(pred == y_test).count(True) / len(y_test)
print("accuracy=" + str(result))
pred= [7 5 4 5 5 5 6 5 7 7 5 5 9 2 7 1 5 5 5 5 4 5 6 5 5 5 3 7 5 3 5 7 5 5 5 9 5
 5 5 5 5 5 5 5 3 7 5 5 5 5 5 5 5 5 5 6 5 5 1 4 3 4 5 6 5 5 5 5 0 5 5 5 0 0
 5 5 1 9 5 5 5 5 4 5 5 5 0 5 6 5 5 4 5 5 5 6 5 5 5 5 5 5 5 5 5 0 5 6 4 5 5
 5 5 5 5 5 4 5 5 5 5 5 5 5 5 1 5 2 5 5 3 4 5 5 2 5 5 6 5 0 0 5 5 6 5 9 5 5
 5 5 0 7 5 5 4 5 5 1 5 5 5 5 5 0 5 2 5 5 7 5 5 5 5 8 5 5 5 5 5 9 5 0 4 4 6
 3 5 5 3 3 1 5 5 5 5 5 0 5 6 4 5 7 2 5 5 5 5 5 5 5 9 0 9 5 1 0 4 5 9 6 3 5
 5 5 6 5 5 5 5 2 5 8 4 5 3 5 5 5 5 5 5 6 2 5 5 5 3 7 5 5 7 5 5 5 5 6 6 5 5
 5 5 5 5 5 3 0 5 7 5 5 5 5 0 1 5 4 5 7 5 8 5 5 5 5 5 5 5 1 4 5 5 2 5 5 0 5
 5 5 5 5 5 5 5 5 7 5 3 5 6 3 5 5 6 5 5 5 5 5 5 6 5 5 1 5 7 7 5 5 5 3 5 5 2
 5 0 5 5 7 5 1 7 5 5 5 0 5 5 5 3 5 3 5 5 5 5 3 5 4 4 6 0 5 5 6 2 9 5 5 0 5
 5 0 7 0 5 5 5 5 5 5 5 4 5 5 5 4 7 5 9 5 5 6 5 3 5 5 5 5 5 5 0 5 3 5 4 5 5
 4 5 6 5 5 6 6 3 5 5 5 5 5 5 1 4 5 5 5 6 2 5 4 5 6 5 1 5 6 5 5 5 2 5 5 5 5
 5 5 5 5 5 5]
accuracy=0.43555555555555553
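
An accuracy of about 0.44, with most predictions equal to 5, suggests the model is barely separating the classes. One likely cause (an interpretation, not something this notebook verifies) is the kernel width. `gamma='auto'` means 1 / n_features = 1/64, and with raw pixel values in 0..16 the RBF kernel exp(-gamma * ||a - b||^2) is then close to zero for two different images. The sketch below checks this on two 64-element vectors printed elsewhere in this notebook (sample 500 and the converted tegaki3.png image). The smaller gamma of 0.001 is an illustrative assumption, not a tuned value.

```python
import math

# Two 64-pixel vectors copied from this notebook's printed outputs (values in 0..16).
a = [0, 0, 3, 10, 14, 3, 0, 0,    0, 8, 16, 11, 10, 13, 0, 0,
     0, 7, 14, 0, 1, 15, 2, 0,    0, 2, 16, 9, 16, 16, 1, 0,
     0, 0, 12, 16, 15, 15, 2, 0,  0, 0, 12, 10, 0, 8, 8, 0,
     0, 0, 9, 12, 4, 7, 12, 0,    0, 0, 2, 11, 16, 16, 9, 0]
b = [0, 2, 7, 7, 5, 0, 0, 0,      0, 3, 8, 8, 10, 11, 0, 0,
     0, 0, 0, 0, 0, 13, 4, 0,     0, 0, 4, 10, 14, 9, 1, 0,
     0, 0, 5, 8, 10, 10, 0, 0,    0, 0, 0, 0, 0, 12, 5, 0,
     0, 7, 10, 9, 11, 12, 2, 0,   0, 2, 6, 6, 4, 0, 0, 0]

def rbf(u, v, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-gamma * sq_dist)

k_auto = rbf(a, b, 1 / 64)     # gamma='auto' for 64 features
k_small = rbf(a, b, 0.001)     # illustrative smaller gamma (assumption)
print(k_auto, k_small)
```

With gamma = 1/64 the similarity between these two images is numerically negligible, while the smaller gamma leaves a usable value. scikit-learn also offers `gamma='scale'`, which takes the variance of the features into account and may be worth trying here.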
In [8]:
# Train on the data: LinearSVC
clf2 = svm.LinearSVC()    # create the LinearSVC object
clf2.fit(x_train, y_train)    # fit on the training data
# Evaluate the model
pred = clf2.predict(x_test)
result = list(pred == y_test).count(True) / len(y_test)
print("accuracy=" + str(result))
accuracy=0.9444444444444444
C:\WPy64-3740\python-3.7.4.amd64\lib\site-packages\sklearn\svm\base.py:929: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  "the number of iterations.", ConvergenceWarning)
In [9]:
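isPredict = list(pred == y_test)    # True where the prediction matches the label
print('isPredict/n', isPredict)     # '/n' is printed literally; '\n' would give a line break
# Collect the indices of the test samples that were misclassified
predictFalseList = [n for n, ok in enumerate(isPredict) if not ok]

# The label printed below reads "indices of the data that failed to be classified"
print('識別に失敗したデータのindex\n', predictFalseList)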
isPredict/n [True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, False, True, False, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, False, True, 
True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, False, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True]
識別に失敗したデータのindex
 [4, 21, 53, 67, 102, 115, 118, 120, 129, 177, 205, 208, 244, 276, 290, 307, 315, 326, 353, 367, 370, 379, 403, 410, 436]
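
Besides listing where the classifier failed, it can help to count which digit was mistaken for which. The sketch below uses short hypothetical lists `y_true` and `y_pred` in place of `y_test` and `pred`; the values are made up for illustration and are not results from this notebook.

```python
from collections import Counter

# Hypothetical labels and predictions standing in for y_test and pred.
y_true = [8, 3, 9, 8, 1, 9, 3]
y_pred = [8, 9, 9, 1, 1, 7, 9]

# Indices where the prediction differs from the label.
wrong = [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]
# How often each (label, prediction) confusion occurs.
confusions = Counter((y_true[i], y_pred[i]) for i in wrong)

print(wrong)                        # [1, 3, 5, 6]
print(confusions.most_common(1))    # [((3, 9), 2)]
```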
In [10]:
fig, axes = plt.subplots(3, 10, figsize=(14, 6))  # 3 rows x 10 columns; figsize is in inches, not pixels
for ax, i in zip(axes.flat, predictFalseList):
    Nimg = x_test[i].reshape(8, 8)
    ax.matshow(Nimg, cmap="gray")   # one misclassified image per cell of the 3x10 grid
    ax.set_xticks([])    # hide the x-axis ticks
    ax.set_yticks([])    # hide the y-axis ticks
    # the title shows the label (correct answer) followed by the prediction
    ax.set_title(str(y_test[i]) + ', ' + str(pred[i]), y=-0.3)
In [11]:
# Save the trained model
import joblib    # sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23; import joblib directly
joblib.dump(clf2, "digits.pkl", compress=True)
Out[11]:
['digits.pkl']
In [12]:
# Load the trained model from the file
clf3 = joblib.load("digits.pkl")

# Check the accuracy of the loaded model
pred = clf3.predict(x_test)
result = list(pred == y_test).count(True) / len(y_test)
print("accuracy=" + str(result))
accuracy=0.9444444444444444
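
The loaded model reproduces the earlier accuracy because saving and loading is a serialization round trip. joblib's `dump` and `load` are pickle-style functions suited to objects holding large NumPy arrays. The sketch below shows the same round trip with the standard-library `pickle` module instead, using a hypothetical dictionary of parameters in place of a trained classifier and a hypothetical file name.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: any picklable object works.
params = {"weights": [0.5, -0.25], "bias": 1.0}
path = os.path.join(tempfile.mkdtemp(), "tiny_model.pkl")   # hypothetical file name

with open(path, "wb") as fh:
    pickle.dump(params, fh)         # save to disk
with open(path, "rb") as fh:
    restored = pickle.load(fh)      # load again; only unpickle files you trust

print(restored == params)           # True
```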

Recognizing digits from handwritten image files

In [13]:
%matplotlib inline
from PIL import Image
import matplotlib.pyplot as plt

# Specify the file name
png_file = "tegaki3.png"

# Open the image file
img = Image.open(png_file)
plt.imshow(img)
Out[13]:
<matplotlib.image.AxesImage at 0x26b6dbce808>
In [14]:
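# Resize and convert to grayscale
# thumbnail() resizes in place and keeps the aspect ratio, so the source image must be square to end up 8x8
img.thumbnail((8, 8), Image.LANCZOS) # resize; Image.LANCZOS is the resampling filter
img = img.convert("L") # convert to 8-bit grayscale
plt.imshow(img, cmap="gray")     # cmap="gray" must be specified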
Out[14]:
<matplotlib.image.AxesImage at 0x26b6b2966c8>
In [15]:
# Convert to a NumPy array
import numpy as np
print(img)
img_a = np.array(img, 'float') # image -> array; the second argument is the dtype
print(img_a)
img_a = 255 -  img_a   # invert (negative/positive) so that ink is bright, as in the dataset
print(img_a)
img_a = img_a // 16    # scale 0..255 down to 0..15, close to the dataset's 0..16 range
img_a = img_a.reshape(-1,) # flatten to one dimension

print("---converted data---")
print(img_a)
<PIL.Image.Image image mode=L size=8x8 at 0x26B6B2A1808>
[[255. 214. 133. 134. 161. 251. 255. 254.]
 [255. 207. 112. 114.  83.  77. 244. 254.]
 [255. 255. 255. 255. 255.  32. 185. 255.]
 [253. 253. 181.  93.  19.  96. 239. 253.]
 [253. 253. 169. 118.  89.  89. 254. 253.]
 [255. 255. 255. 255. 255.  48. 172. 255.]
 [255. 141.  95.  98.  75.  61. 214. 254.]
 [255. 212. 146. 154. 180. 245. 255. 254.]]
[[  0.  41. 122. 121.  94.   4.   0.   1.]
 [  0.  48. 143. 141. 172. 178.  11.   1.]
 [  0.   0.   0.   0.   0. 223.  70.   0.]
 [  2.   2.  74. 162. 236. 159.  16.   2.]
 [  2.   2.  86. 137. 166. 166.   1.   2.]
 [  0.   0.   0.   0.   0. 207.  83.   0.]
 [  0. 114. 160. 157. 180. 194.  41.   1.]
 [  0.  43. 109. 101.  75.  10.   0.   1.]]
---converted data---
[ 0.  2.  7.  7.  5.  0.  0.  0.  0.  3.  8.  8. 10. 11.  0.  0.  0.  0.
  0.  0.  0. 13.  4.  0.  0.  0.  4. 10. 14.  9.  1.  0.  0.  0.  5.  8.
 10. 10.  0.  0.  0.  0.  0.  0.  0. 12.  5.  0.  0.  7. 10.  9. 11. 12.
  2.  0.  0.  2.  6.  6.  4.  0.  0.  0.]
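
Integer division by 16 maps the inverted 0..255 values to 0..15, so the brightest possible pixel becomes 15, whereas the dataset uses 0..16. The sketch below compares that mapping with a rescaling to 0..16 on the first row of the grayscale image printed above. The rescaling is offered as a possible alternative under that assumption, not as the method used in this notebook.

```python
# First row of the grayscale image printed above (0 = black, 255 = white).
row = [255, 214, 133, 134, 161, 251, 255, 254]

inverted = [255 - v for v in row]                     # ink becomes bright
floor_div = [v // 16 for v in inverted]               # the notebook's mapping: 0..15
rescaled = [round(v * 16 / 255) for v in inverted]    # alternative mapping: 0..16

print(floor_div)    # [0, 2, 7, 7, 5, 0, 0, 0]
print(rescaled)     # [0, 3, 8, 8, 6, 0, 0, 0]
print(255 // 16, round(255 * 16 / 255))   # 15 16
```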
In [16]:
# Load the trained model and classify the image
clf = joblib.load("digits.pkl")
result = clf.predict([img_a])
print(result)     # show the result
[9]
In [17]:
%matplotlib inline
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

# Classify the digit in a given image file (supports 24-bit PNG)
def predict_num(clf, png_file):
    img = Image.open(png_file)
    img.thumbnail((8, 8), Image.LANCZOS) # resize (keeps the aspect ratio)
    img = img.convert("L") # 8-bit grayscale
    img_a = np.array(img, 'f') # image -> array
    img_a = 255 -  img_a # invert (negative/positive)
    img_a = img_a // 16 # scale 0..255 down to 0..15
    img_a = img_a.reshape(-1,) # flatten to one dimension
    r = clf.predict([img_a])
    return r[0]

# Load the trained model
clf = joblib.load("digits.pkl")

# Try it on test images
print(predict_num(clf, "tegaki5.png"))
print(predict_num(clf, "tegaki9.png"))
5
7

Reference: subplot

In [18]:
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(6,  5, sharex=True,  sharey=True)
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0, hspace=0.1)

for i in range(30):
    row = i // 5
    col = i % 5
    inverted = 16 - digits.images[i]    # invert a copy (pixel values are 0..16) so the dataset itself is not modified
    ax[row, col].imshow(inverted, cmap="gray")

plt.show()
fig.savefig("digits30.png")
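
The loop above turns a running index i into a grid position with `i // 5` and `i % 5`. The sketch below shows the same mapping with `divmod`, along with the 1-based number that `plt.subplot(nrows, ncols, number)` expects for that cell. The helper name `grid_position` is hypothetical.

```python
# Map a 0-based running index to (row, col) and to a 1-based subplot number.
def grid_position(i, ncols):
    row, col = divmod(i, ncols)              # same as (i // ncols, i % ncols)
    subplot_number = row * ncols + col + 1   # 1-based index used by plt.subplot
    return row, col, subplot_number

for i in (0, 4, 5, 29):
    print(i, grid_position(i, ncols=5))
# 0 (0, 0, 1)
# 4 (0, 4, 5)
# 5 (1, 0, 6)
# 29 (5, 4, 30)
```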