Supervised Learning

Decision tree

A nonparametric method with a tree-like structure whose leaves represent class labels and whose branches represent conjunctions of features
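
As a minimal sketch of this structure, the toy tree below is written by hand rather than learned from data; the feature names (`outlook`, `windy`) and labels are hypothetical. Walking from the root to a leaf collects the conjunction of feature tests that leads to the class label.

```python
# Hypothetical hand-built tree: internal nodes test one feature,
# leaves are plain strings holding a class label.
tree = {
    "feature": "outlook",
    "branches": {
        "sunny": {
            "feature": "windy",
            "branches": {True: "stay in", False: "play"},
        },
        "rainy": "stay in",
    },
}


def classify(node, sample):
    """Walk from the root to a leaf.

    Returns the leaf's class label and the conjunction of
    (feature, value) tests taken along the branch.
    """
    path = ()
    while isinstance(node, dict):
        value = sample[node["feature"]]
        path += ((node["feature"], value),)
        node = node["branches"][value]
    return node, path


label, path = classify(tree, {"outlook": "sunny", "windy": False})
print(label, path)
```

Here the label `play` is reached through the conjunction `outlook = sunny AND windy = False`.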

Naive Bayes

A probabilistic method based on Bayes' theorem with the assumption of conditional independence between every pair of features given the class
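
The independence assumption means the class score factorises into a prior times one likelihood per feature. The sketch below illustrates this for categorical features; the add-one (Laplace) smoothing and the names `train_nb` and `predict_nb` are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(y, i)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values


def predict_nb(model, row):
    """Return the class maximising log P(class) + sum_i log P(feature_i | class)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, n_y in class_counts.items():
        score = math.log(n_y / total)
        for i, v in enumerate(row):
            # Add-one smoothing avoids zero probabilities (an assumption here).
            score += math.log((feature_counts[(y, i)][v] + 1) / (n_y + len(values[i])))
        if score > best_score:
            best, best_score = y, score
    return best


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)
print(predict_nb(model, ("rainy", "cool")))
```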

Support vector machine classifier

An algorithm that finds a separating hyperplane between two classes by mapping the labelled data to a high-dimensional feature space

K-Nearest Neighbor

A nonparametric method that assigns a new item to the class of the labelled items from which it has the minimum dissimilarity
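
A minimal sketch of the idea, assuming Euclidean distance as the dissimilarity measure and a majority vote among the `k` closest labelled items (both are choices made for this example):

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k labelled points closest to the query.

    train is a list of (point, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b"),
]
print(knn_predict(train, (2, 2)))
print(knn_predict(train, (7, 9)))
```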

Random Forest

An algorithm consisting of a collection of simple tree predictors, each built independently, whose outputs are combined to estimate the final outcome

Unsupervised Learning

K-means

An unsupervised learning method that sorts the data into a given number of clusters, using the average value of the data in each group as its centroid
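
The assign-then-average loop can be sketched as follows. Seeding the centroids with the first `k` points and running a fixed number of iterations are simplifying assumptions; practical implementations usually pick starting points more carefully.

```python
import math


def kmeans(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = [points[i] for i in range(k)]  # naive initialisation (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep the old centroid if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```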

K-medoids

An unsupervised learning method similar to k-means, except that the centre of each group is an existing data point rather than the average value
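
A minimal sketch of this variant, assuming Euclidean distance and a simple alternating scheme (assign points, then pick as each group's centre the member with the smallest total distance to the others). Other k-medoids procedures exist; this one is chosen only for brevity.

```python
import math


def kmedoids(points, k, iterations=10):
    """Like the k-means loop, but each centre must be an actual data point."""
    medoids = list(points[:k])  # naive initialisation (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, medoids[c]))
            clusters[nearest].append(p)
        # New medoid: the member with the smallest total distance to its cluster.
        medoids = [
            min(cluster, key=lambda m: sum(math.dist(m, q) for q in cluster))
            if cluster else medoids[i]
            for i, cluster in enumerate(clusters)
        ]
    return medoids, clusters


points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
medoids, clusters = kmedoids(points, k=2)
print(medoids)
```

Unlike an averaged centroid, every returned medoid is one of the input points.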

Hierarchical Clustering

An alternative approach that builds a hierarchy of clusters, represented as a dendrogram, without requiring a given number of clusters
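
One way to build such a hierarchy is bottom-up (agglomerative) merging. The sketch below assumes one-dimensional data and single linkage (cluster distance is the smallest gap between members); the recorded merge list is the information a dendrogram would draw.

```python
def agglomerative(points):
    """Repeatedly merge the two closest clusters until one remains.

    Returns the merges as (left cluster, right cluster, distance) tuples.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


for left, right, distance in agglomerative([1.0, 2.0, 6.0, 7.0, 20.0]):
    print(left, right, distance)
```

No cluster count is supplied up front; cutting the merge list at a chosen distance yields a flat clustering afterwards.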

DBSCAN

A density-based clustering algorithm that identifies clusters of arbitrary shape as dense regions of the data distribution
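
A minimal sketch of the density idea, assuming Euclidean distance: a point with at least `min_pts` neighbours within radius `eps` (counting itself) is a core point, clusters grow outward from core points, and points reachable from no core point are labelled noise (`-1`). The brute-force neighbour search is for illustration only.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means not yet visited

    def neighbours(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```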

Expectation-Maximization

An iterative method for approximating the maximum likelihood estimates of model parameters
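
As an illustration, the sketch below applies the iteration to a mixture of two one-dimensional Gaussians, a common textbook case chosen here as an assumption. The E-step computes each component's responsibility for each point, and the M-step re-estimates weights, means and variances from those responsibilities. The initial values and the small variance floor are also choices made for this example.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, iterations=20):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximization."""
    weights = [0.5, 0.5]
    means = [min(data), max(data)]  # crude initial guess (assumption)
    variances = [1.0, 1.0]
    log_likelihoods = []
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        ll = 0.0
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            ll += math.log(total)
            resp.append([j / total for j in joint])
        log_likelihoods.append(ll)
        # M-step: re-estimate the parameters from the weighted data.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            weights[k] = n_k / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k
            variances[k] = max(var, 1e-6)  # floor guards against collapse
    return weights, means, variances, log_likelihoods


data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
weights, means, variances, history = em_two_gaussians(data)
print(means)
```

The recorded log-likelihood should not decrease from one iteration to the next, which is a convenient sanity check on an implementation.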

Correlation

FP-Growth Algorithm

An efficient method for mining the complete set of frequent patterns using a special data structure, the frequent-pattern tree (FP-tree), which preserves all the association information
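
The sketch below covers only the construction of the tree, not the recursive mining step that follows it. Each transaction is reduced to its frequent items, sorted by descending support, and inserted into a prefix tree whose nodes carry counts. The class name `FPNode`, the alphabetical tie-breaking and the sample transactions are assumptions for this example.

```python
from collections import Counter


class FPNode:
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a frequent-pattern tree; shared prefixes share nodes."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}
    root = FPNode()
    for t in transactions:
        # Keep frequent items, most frequent first (ties broken alphabetically).
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root


transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
root = build_fp_tree(transactions, min_support=3)
print({item: child.count for item, child in root.children.items()})
```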

Apriori Algorithm

A classical data analytics algorithm for discovering potential association rules among frequent itemsets
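
A minimal sketch of the level-wise search, assuming support is measured as an absolute transaction count: frequent itemsets of size k-1 are joined into candidates of size k, and a candidate is kept only if all of its (k-1)-subsets are frequent and it meets the support threshold itself. The sample transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset to its support count."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = {c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and support(c) >= min_support}
    return frequent


transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = apriori(transactions, min_support=3)
# Confidence of the rule bread -> milk: support(bread, milk) / support(bread).
confidence = frequent[frozenset({"bread", "milk"})] / frequent[frozenset({"bread"})]
print(confidence)
```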

Dimensionality Reduction

Principal Component Analysis

An orthogonal transformation of the data into a new coordinate system in which the direction of greatest variance lies along the first coordinate
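
The first coordinate of the new system is the leading eigenvector of the data's covariance matrix. The sketch below estimates it by power iteration, a technique chosen here for brevity; it assumes the all-ones starting vector is not orthogonal to that eigenvector and that the covariance matrix is not zero.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the direction of greatest variance and the variance along it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred data.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d  # starting vector (assumption: not orthogonal to the answer)
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
    return v, variance


rows = [(2.0, 0.0), (0.0, 1.0), (-2.0, 0.0), (0.0, -1.0)]
direction, variance = first_principal_component(rows)
print(direction, variance)
```

For this sample the spread is larger along the first axis, so the estimated direction aligns with it.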

Self-organizing Map

A type of artificial neural network that produces a low-dimensional representation of the training data space

Random Matrix

An approach that reveals potential regularities in massive data through eigenvalue analysis of high-order matrices
