What is Kernel PCA?
PCA is a linear dimensionality-reduction algorithm, whereas Kernel PCA handles non-linear structure: the data are implicitly mapped into a higher-dimensional feature space through a kernel function, and ordinary PCA is performed in that space.
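As a quick illustration (a minimal sketch on scikit-learn's make_circles toy data, not part of the article's own example), two concentric circles stay tangled under linear PCA but are pulled apart by an RBF Kernel PCA projection:
# Minimal sketch: linear PCA vs. RBF Kernel PCA on concentric circles
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

X_pca = PCA(n_components=2).fit_transform(X)                                  # still two nested rings
X_kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10).fit_transform(X)   # rings pulled apart
Plotting the first kernel component already separates the two rings, which no linear projection of this data set can do.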
Kernel PCA is used for:
- Noise filtering
- Visualization
- Feature extraction
- Stock market predictions
- Gene data analysis
The goals of Kernel PCA are:
- Identify patterns in the data.
- Detect correlations between variables.
Kernel PCA carries out the same steps as standard PCA, but in the feature space induced by the kernel (a NumPy sketch of these steps follows the list):
- Standardize the data.
- Obtain the eigenvectors and eigenvalues from the covariance matrix or correlation matrix, or perform Singular Value Decomposition (SVD).
- Sort the eigenvalues in descending order and choose the $k$ eigenvectors corresponding to the $k$ largest eigenvalues, where $k$ is the number of dimensions of the new feature subspace.
- Construct the projection matrix $W$ from the selected $k$ eigenvectors.
- Transform the original data set $X$ via $W$ to obtain a $k$-dimensional feature subspace $Y$.
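The sketch below (a minimal illustration on synthetic data, not part of the original example) implements these steps for ordinary linear PCA with NumPy; Kernel PCA performs the analogous eigendecomposition on the centered kernel matrix instead of the covariance matrix.
# Minimal NumPy sketch of the steps listed above (ordinary/linear PCA)
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                       # toy data: 100 samples, 5 features

# 1. Standardize the data
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Eigendecomposition of the covariance matrix
cov = np.cov(X_std, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# 3. Sort eigenvalues in descending order and keep the k largest
k = 2
order = np.argsort(eigvals)[::-1][:k]

# 4. Build the projection matrix W from the selected eigenvectors
W = eigvecs[:, order]

# 5. Project X onto the k-dimensional subspace Y
Y = X_std @ W
print(Y.shape)   # (100, 2)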
In its standard formulation, Kernel PCA relies on the kernel trick: the feature map $\varphi$ is never computed explicitly; instead, a kernel function supplies the pairwise inner products in feature space. With the RBF kernel used in the example below, this can be expressed as:
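$$K_{ij} = k(x_i, x_j) = \varphi(x_i)^\top \varphi(x_j), \qquad k_{\mathrm{rbf}}(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right)$$
The principal components are then obtained from the eigenvectors of the centered kernel matrix $K$, where $\gamma$ is the kernel width parameter (the gamma argument of scikit-learn's KernelPCA).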
Example
Code
# Example data (assumed): load a labeled sample dataset and split it into train/test sets
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Standardize the features before applying the kernel
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Project the standardized data onto 2 components with an RBF kernel
kpca = KernelPCA(n_components=2, kernel='rbf')
X_train = kpca.fit_transform(X_train)
X_test = kpca.transform(X_test)

# Train a logistic regression classifier on the projected features
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
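As a follow-up (assuming the y_test produced by the train/test split above), the predictions can be checked with scikit-learn's standard metrics:
from sklearn.metrics import accuracy_score, confusion_matrix

print(confusion_matrix(y_test, y_pred))
print('accuracy:', accuracy_score(y_test, y_pred))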