Let's look at a KNN practice example in scikit-learn.
The code below classifies the Iris dataset with a k-nearest-neighbors classifier and plots its decision boundaries.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn import neighbors, datasets

n_neighbors = 15

# import some data to play with
iris = datasets.load_iris()

# we only take the first two features. We could avoid this ugly
# slicing by using a two-dim dataset
X = iris.data[:, :2]
y = iris.target

h = .02  # step size in the mesh

# Create color maps
cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
cmap_bold = ListedColormap(['darkorange', 'c', 'darkblue'])

for weights in ['uniform', 'distance']:
    # we create an instance of Neighbours Classifier and fit the data.
    clf = neighbors.KNeighborsClassifier(n_neighbors, weights=weights)
    clf.fit(X, y)

    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.figure()
    plt.pcolormesh(xx, yy, Z, cmap=cmap_light)

    # Plot also the training points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold,
                edgecolor='k', s=20)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.title("3-Class classification (k = %i, weights = '%s')"
              % (n_neighbors, weights))

plt.show()
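The loop produces one decision-boundary plot per weighting scheme. The difference between the two schemes is easiest to see on a tiny toy problem; the sketch below (hypothetical one-dimensional data, not part of the original example) shows a case where 'uniform' and 'distance' weighting disagree: with k = 3, two far neighbors of class 0 outvote one near neighbor of class 1 under uniform voting, while distance weighting lets the near neighbor dominate.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_toy = np.array([[0.0], [5.0], [6.0]])  # one near point, two far points
y_toy = np.array([1, 0, 0])              # the near point belongs to class 1
query = np.array([[0.5]])                # query point close to the class-1 sample

for w in ['uniform', 'distance']:
    toy_clf = KNeighborsClassifier(n_neighbors=3, weights=w)
    toy_clf.fit(X_toy, y_toy)
    # 'uniform'  -> class 0 (simple majority of the 3 neighbors)
    # 'distance' -> class 1 (the nearest neighbor wins the weighted vote)
    print(w, toy_clf.predict(query))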
Returning to the scikit-learn example, here is the same code again with line-by-line comments:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn import neighbors, datasets

# set the number of neighbors k
n_neighbors = 15

# load the Iris dataset
iris = datasets.load_iris()

# the Iris dataset has four features; take only the first two of them
# so the result can be drawn as a 2-D plot
# e.g. X = iris.data[:, 2:4] would take the last two features instead
X = iris.data[:, :2]

# assign the target (class) values of the dataset
y = iris.target

# step size of the mesh grid used for plotting
h = .02

# set the color maps
cmap_light = ListedColormap(['orange', 'cyan', 'cornflowerblue'])
cmap_bold = ListedColormap(['darkorange', 'c', 'darkblue'])

# compare the KNN classifier under the two weighting schemes:
# 'distance' weights each neighbor by its distance, so closer neighbors
# get larger weights in the vote
for weights in ['uniform', 'distance']:
    # create an instance of the neighbors classifier and fit the data
    clf = neighbors.KNeighborsClassifier(n_neighbors, weights=weights)
    clf.fit(X, y)

    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.figure()
    plt.pcolormesh(xx, yy, Z, cmap=cmap_light)

    # Plot also the training points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold,
                edgecolor='k', s=20)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.title("3-Class classification (k = %i, weights = '%s')"
              % (n_neighbors, weights))

plt.show()
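Note that the example fits and plots on the full dataset without ever measuring accuracy. As a minimal sketch (using the standard sklearn.model_selection API; this evaluation step is not part of the original example), one might hold out a test set and compare the two weighting schemes numerically:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# hold out 30% of the samples for testing; use all four features here
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

for weights in ['uniform', 'distance']:
    clf = KNeighborsClassifier(n_neighbors=15, weights=weights)
    clf.fit(X_train, y_train)
    # score() returns the mean accuracy on the held-out test set
    print(weights, clf.score(X_test, y_test))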
Reference: Nearest Neighbors Classification, scikit-learn documentation, https://scikit-learn.org/stable/auto_examples/neighbors/plot_classification.html#sphx-glr-download-auto-examples-neighbors-plot-classification-py