What is KNN?
K-nearest neighbors (KNN) classifies a new data point by assigning it to the category held by the majority of its K nearest neighbors.
data:image/s3,"s3://crabby-images/3ca58/3ca58b3e20edaa45df914500c50f98692618a6a4" alt=""
To
data:image/s3,"s3://crabby-images/a1ef0/a1ef03e5ca4c112a1f3c16d7c6bf4cdb025de95d" alt=""
Steps of KNN
- Step 1.
Choose the number K of neighbors.
Let’s assume that K is 5.
data:image/s3,"s3://crabby-images/20c3d/20c3de70cc859755f3044af3e795ca87e524ab1d" alt=""
- Step 2.
Take the K nearest neighbors of the new data point according to the Euclidean distance (most commonly used), Manhattan distance, or any other distance metric.
data:image/s3,"s3://crabby-images/c7e94/c7e94308f160054e5a8fb7b9ed72397a1acd4e7d" alt=""
- Step 3.
Among these K neighbors, count the number of data points in each category.
In the example:
- Category 1: 3 neighbors
- Category 2: 2 neighbors
- Step 4.
Assign the new data point to the category with the most neighbors.
data:image/s3,"s3://crabby-images/e72aa/e72aa74edc632b0b289df24f56c0ca98041a0d73" alt=""
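The four steps above can be sketched in plain Python. This is a minimal illustration, not how scikit-learn implements it: the function name `knn_predict` and the toy points are hypothetical, chosen so that the vote comes out 3 to 2 as in the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, new_point, k=5):
    """Classify new_point by majority vote among its k nearest training points."""
    # Step 2: Euclidean distance from the new point to every training point.
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: count how many of those neighbors fall in each category.
    votes = Counter(label for _, label in nearest)
    # Step 4: the category with the most neighbors wins.
    predicted = votes.most_common(1)[0][0]
    return predicted, votes


# Hypothetical toy data (made up for illustration only).
train_points = [
    (3.0, 3.0), (2.5, 3.5), (3.5, 2.5), (1.0, 1.0),   # Category 1
    (4.0, 4.0), (4.5, 3.5), (6.0, 6.0), (7.0, 5.0),   # Category 2
]
train_labels = ["Category 1"] * 4 + ["Category 2"] * 4

# Step 1: choose K = 5.
predicted, votes = knn_predict(train_points, train_labels, (3.5, 3.5), k=5)
print(predicted, dict(votes))
```

With an odd K and two categories the vote cannot tie. With more categories or an even K, a tie-breaking rule would be needed, which this sketch does not handle beyond whatever `Counter.most_common` returns first.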
Example
Code
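from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# X_train, X_test, and y_train are assumed to come from an earlier train/test split.
# Standardize the features so that no single feature dominates the distance.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit the scaler on the training data only
X_test = sc.transform(X_test)        # reuse the training statistics on the test data
# metric='minkowski' with p=2 is the Euclidean distance; p=1 would give the Manhattan distance.
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train, y_train)
# Predict the category of each test sample.
y_pred = classifier.predict(X_test)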
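The `metric='minkowski', p=2` setting may look unrelated to the Euclidean distance mentioned in Step 2, so here is a small sketch of how the two connect. The helper `minkowski_distance` is a hypothetical name written for this illustration, not a scikit-learn function.

```python
import math


def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p between two points of equal length."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


a, b = (0.0, 0.0), (3.0, 4.0)

euclidean = minkowski_distance(a, b, p=2)  # sqrt(3^2 + 4^2)
manhattan = minkowski_distance(a, b, p=1)  # |3| + |4|

print(euclidean, manhattan)
# p=2 agrees with the standard library's Euclidean distance.
print(math.isclose(euclidean, math.dist(a, b)))
```

So choosing `p=2` gives the Euclidean distance and `p=1` gives the Manhattan distance. Other values of `p` give other members of the same family.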
Result
data:image/s3,"s3://crabby-images/c5780/c57806795265a11531eddb479ceaffd09649339c" alt=""