Data-mining-algorithms

Implementations of core data mining algorithms from scratch, such as clustering and association rule mining.

Primary Language: Jupyter Notebook

Data-mining

  • The core purpose of data mining is to extract useful information from a dataset and use it to identify patterns and anticipate future trends.
  • Data Mining also incorporates data cleaning, pattern prediction, statistical analysis, data conversion, machine learning, and data visualization.

Association Rule Mining

An association rule has two parts:

  • an antecedent (if) and
  • a consequent (then)

An antecedent is an item or itemset found in the data, and a consequent is an item found in combination with the antecedent. Consider this rule, for instance:

“If a customer buys bread, they are 70% likely to buy milk.”
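The percentage in a rule like this is usually called its confidence: the share of transactions containing the antecedent that also contain the consequent. A minimal sketch of how support and confidence could be computed is shown here. The function names and the toy baskets are hypothetical and invented for illustration; they are not taken from the notebooks in this repository.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(transactions, itemset):
    """Fraction of all transactions that contain `itemset`."""
    return count(transactions, itemset) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions.

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / count(transactions, antecedent)


# Hypothetical toy data, made up for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]

bread_support = support(baskets, {"bread"})                 # 4 of 5 baskets
bread_to_milk = confidence(baskets, {"bread"}, {"milk"})    # 3 of the 4 bread baskets
print(bread_support, bread_to_milk)
```

On this toy data the rule "bread, then milk" has a confidence of 0.75, since three of the four baskets containing bread also contain milk.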

Apriori algorithm. Prerequisite: frequent itemsets in a dataset, i.e. itemsets whose support meets a minimum threshold.

  • All subsets of a frequent itemset must be frequent (Apriori property).
  • If an itemset is infrequent, all its supersets will be infrequent.

I have implemented the Apriori algorithm.
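For orientation, the level-wise search that Apriori performs can be sketched as follows. This is an illustrative sketch, not the code from the notebook: the function name `apriori`, the use of an absolute count for `min_support`, and the toy baskets are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in at least
    `min_support` transactions (an absolute count, by assumption)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step (Apriori property): drop any candidate that has an
        # infrequent (k-1)-subset, without counting it.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count the surviving candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data, made up for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]

frequent = apriori(baskets, min_support=2)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With a minimum count of 2, no 3-itemset survives the prune step here: {bread, milk, eggs} is discarded because its subset {bread, eggs} is infrequent, so it is never counted against the data.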

Cluster Analysis

Clustering is the process of grouping a set of abstract objects into classes of similar objects.

  • Requirements of Clustering in Data Mining

    • Scalability
    • Ability to deal with different kinds of attributes
    • Discovery of clusters with arbitrary shape
    • High dimensionality
    • Ability to deal with noisy data
    • Interpretability

    I have implemented K-means clustering with initial seed selection and the K-medoids clustering algorithm.
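As a rough illustration of the two algorithms, here is a small pure-Python sketch. It is not the notebook code. The seed selection shown is a simple farthest-point heuristic chosen for this example (the repository's actual seeding method is not described here), the K-medoids variant is the alternating assign/update scheme rather than full PAM, and all function names and data points are hypothetical.

```python
import math


def farthest_point_seeds(points, k):
    """Assumed seeding heuristic: start from points[0], then repeatedly add
    the point farthest from its nearest already-chosen seed."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points,
                         key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds


def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points]


def kmeans(points, k, max_iter=100):
    centroids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:   # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:            # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids, labels


def kmedoids(points, k, max_iter=100):
    """Like kmeans above, but every centre must be an actual data point."""
    medoids = farthest_point_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, medoids)
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                medoids[j] = min(members,
                                 key=lambda m: sum(math.dist(m, q)
                                                   for q in members))
    return medoids, labels


# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, km_labels = kmeans(points, k=2)
medoids, kmd_labels = kmedoids(points, k=2)
print(centroids, km_labels)
print(medoids, kmd_labels)
```

The practical difference shows in the returned centres: K-means centroids are averages and need not coincide with any data point, while K-medoids centres are always members of the dataset, which makes them less sensitive to outliers.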