
Kernel Principal Component Analysis (Kernel PCA)

What is Kernel PCA?

PCA is a linear dimensionality-reduction algorithm, whereas Kernel PCA is its non-linear extension: it uses the kernel trick to perform PCA implicitly in a high-dimensional feature space, so it can capture structure that no linear projection can.

Kernel PCA is used for:

  • Noise filtering
  • Visualization
  • Feature extraction
  • Stock market predictions
  • Gene data analysis

The goals of Kernel PCA are:

  • Identify patterns in the data.
  • Detect correlations between variables.

The main steps of Kernel PCA are:

  • Standardize the data.
  • Obtain the eigenvectors and eigenvalues from the covariance matrix or correlation matrix, or perform Singular Value Decomposition (SVD).
  • Sort the eigenvalues in descending order and choose the $k$ eigenvectors that correspond to the $k$ largest eigenvalues, where $k$ is the number of dimensions of the new feature subspace.
  • Construct the projection matrix $W$ from the selected $k$ eigenvectors.
  • Transform the original data set $X$ via $W$ to obtain the $k$-dimensional feature subspace $Y$.
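The steps above can be sketched with NumPy. This shows the plain covariance-matrix route; the data, shapes, and variable names here are illustrative, not from the original post:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # toy data; any (n_samples, n_features) array works

# 1. Standardize the data.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Obtain eigenvalues and eigenvectors of the covariance matrix.
cov = np.cov(X_std, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: the covariance matrix is symmetric

# 3. Sort eigenvalues in descending order and keep the top k eigenvectors.
k = 2
order = np.argsort(eigvals)[::-1]

# 4. Construct the projection matrix W from the selected k eigenvectors.
W = eigvecs[:, order[:k]]

# 5. Transform X via W to obtain the k-dimensional feature subspace Y.
Y = X_std @ W
print(Y.shape)  # (100, 2)
```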

Kernel PCA can be expressed as follows. A mapping $\phi$ sends the data into a higher-dimensional feature space, and the kernel function computes inner products in that space without evaluating $\phi$ explicitly:

$$k(x_i, x_j) = \phi(x_i)^T \phi(x_j)$$

The principal components then follow from the eigenproblem on the $N \times N$ (centered) kernel matrix $K$ built from these values, where $N$ is the number of samples:

$$K \alpha = N \lambda \alpha$$

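A from-scratch sketch of this kernel eigenproblem, assuming an RBF kernel and illustrative data and parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # toy data for illustration
n = X.shape[0]

# RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
gamma = 1.0
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-gamma * sq_dists)

# Center the kernel matrix (equivalent to centering in feature space).
one_n = np.full((n, n), 1.0 / n)
K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

# Solve the eigenproblem on the centered kernel matrix.
eigvals, eigvecs = np.linalg.eigh(K_centered)
order = np.argsort(eigvals)[::-1]

# Projections of the training data onto the top-k components
# are the eigenvectors scaled by the square roots of their eigenvalues.
k = 2
alphas = eigvecs[:, order[:k]]
lambdas = eigvals[order[:k]]
Y = alphas * np.sqrt(lambdas)
print(Y.shape)  # (100, 2)
```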
Example



Code



from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression

# Toy non-linear dataset for illustration (the original data split is not shown).
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize the features.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Reduce to 2 components with an RBF (Gaussian) kernel.
kpca = KernelPCA(n_components=2, kernel='rbf')
X_train = kpca.fit_transform(X_train)
X_test = kpca.transform(X_test)

# Fit a classifier on the Kernel PCA features.
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)



Result







Implementation

This post is licensed under CC BY 4.0 by the author.