Table 5 Data Analytics Algorithms

From: Big data analytics in smart grids: a review

Category

Algorithm

Description

Supervised Learning

Decision tree

A non-parametric method with a tree-like model whose leaves represent class labels and whose branches represent conjunctions of features

Naive Bayes

A probabilistic method based on Bayes' theorem that assumes every pair of features is conditionally independent given the class

Support vector machine classifier

An algorithm that finds a maximum-margin separating hyperplane between two classes, typically after mapping the labelled data to a high-dimensional feature space

K-Nearest Neighbors

A non-parametric method that assigns a new item to a class according to the labelled items it is least dissimilar to

Random Forest

An algorithm consisting of a collection of simple tree predictors, each built independently, whose outputs are combined to estimate the final outcome

Unsupervised Learning

K-means

An unsupervised learning method that partitions the data into a given number of clusters, using the average value of the data in each cluster as its centroid

K-medoids

An unsupervised learning method similar to k-means that uses an existing data point, instead of the average value, as the center of each cluster

Hierarchical Clustering

An alternative approach that builds a hierarchy of clusters, represented as a dendrogram, without requiring the number of clusters to be given in advance

DBSCAN

A density-based clustering algorithm that identifies clusters of arbitrary shape as dense regions of points separated by sparser regions

Expectation-Maximization

An iterative method for approximating the maximum-likelihood estimates of model parameters, typically when the model involves unobserved (latent) variables

Correlation

FP-Growth Algorithm

An efficient method for mining the complete set of frequent patterns using a special data structure, the frequent-pattern tree (FP-tree), which preserves all the association information

Apriori Algorithm

A classical data analytics algorithm that discovers potential association rules among frequent itemsets

Dimensionality Reduction

Principal Component Analysis

An orthogonal transformation of the data into a new coordinate system in which the direction of greatest variance lies along the first coordinate

Self-organizing Map

A type of artificial neural network that produces a low-dimensional representation of the training data space

Random Matrix

An algorithm that reveals potential regularities in massive data through eigenvalue analysis of high-order matrices
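
The difference the table draws between K-means and K-medoids (the average value as the cluster center versus an existing data point) can be illustrated with a minimal sketch. The function names `mean_center` and `medoid_center` and the sample points are hypothetical, chosen only for illustration, and Euclidean distance is an assumption. This shows a single center-update step for one cluster, not a complete clustering procedure.

```python
import math
from statistics import mean


def mean_center(points):
    """K-means style center: coordinate-wise average of the cluster's points."""
    return tuple(mean(coords) for coords in zip(*points))


def medoid_center(points):
    """K-medoids style center: the existing point whose total Euclidean
    distance to all points in the cluster is smallest."""
    def total_distance(candidate):
        return sum(math.dist(candidate, p) for p in points)

    return min(points, key=total_distance)


# Hypothetical cluster with one outlying point.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 3.0), (10.0, 10.0)]

print(mean_center(cluster))    # average value; need not be a member of the data
print(medoid_center(cluster))  # always one of the points in the cluster
```

In this sketch the outlying point pulls the average away from the bulk of the data, while the medoid stays on an observed point, which is one common reason for preferring K-medoids on data with outliers.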