# Table 5. Data Analytics Algorithms

Category | Algorithm | Description |
---|---|---|
Supervised Learning | Decision Tree | A non-parametric method that builds a tree-like model whose leaves represent class labels and whose branches represent conjunctions of features |
Supervised Learning | Naive Bayes | A probabilistic method based on Bayes' theorem that assumes conditional independence between every pair of features given the class label |
Supervised Learning | Support Vector Machine Classifier | An algorithm that finds a separating hyperplane between two classes by mapping the labelled data to a high-dimensional feature space |
Supervised Learning | K-Nearest Neighbor | A non-parametric method that classifies a new item by its minimum dissimilarity to the labelled items in the different classes |
Supervised Learning | Random Forest | An ensemble of simple tree predictors, each built independently, whose outputs are combined to estimate the final outcome |
Unsupervised Learning | K-means | A clustering method that sorts the data into a given number of clusters, using the average value of the data in each group as its centroid |
Unsupervised Learning | K-medoids | A clustering method similar to k-means that uses an existing data point, rather than the average value, as the center of each group |
Unsupervised Learning | Hierarchical Clustering | An alternative approach that builds a hierarchy of clusters, shown as a dendrogram, without requiring a given number of clusters |
Unsupervised Learning | DBSCAN | A density-based clustering algorithm that identifies clusters of arbitrary shape in the data distribution |
Unsupervised Learning | Expectation-Maximization | An iterative method for approximating the maximum likelihood estimates of model parameters |
Correlation | FP-Growth Algorithm | An efficient method for mining the complete set of frequent patterns using a special data structure, the frequent-pattern tree, which preserves all the association information |
Correlation | Apriori Algorithm | A classical algorithm for discovering potential association rules among frequent items |
Dimensionality Reduction | Principal Component Analysis | An orthogonal transformation of the data into a new coordinate system in which the direction of greatest variance lies along the first coordinate |
Dimensionality Reduction | Self-organizing Map | A type of artificial neural network that produces a low-dimensional representation of the training data space |
Dimensionality Reduction | Random Matrix | An algorithm that reveals potential regularities in massive data through eigenvalue analysis of high-order matrices |
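
The K-means and K-medoids entries differ only in how the center of a group is chosen. The following is a minimal sketch of that difference for a single group, not a full clustering implementation. The function names `mean_center` and `medoid_center` and the sample points are hypothetical, and Euclidean distance is assumed as the dissimilarity measure.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def mean_center(points):
    """K-means style center: the coordinate-wise average of the group."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def medoid_center(points):
    """K-medoids style center: the existing point whose total
    distance to all points in the group is smallest."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


# Hypothetical group with one outlier at x = 20.
group = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (20.0, 0.0)]

centroid = mean_center(group)   # pulled toward the outlier
medoid = medoid_center(group)   # always one of the original points

print("centroid:", centroid)
print("medoid:", medoid)
```

In this sample the averaged center is dragged toward the outlier and does not coincide with any data point, while the medoid stays on an existing point near the bulk of the group. This is one reason K-medoids is often described as less sensitive to outliers.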