Genetic Frog Leaping Algorithm for Text Document Clustering

1. Overview

In this project, we developed a text document clustering optimization model using a novel genetic frog-leaping algorithm that clusters text documents based on a selected subset of features. The approach combines two metaheuristic algorithms: a genetic algorithm (GA), which performs feature selection, and a shuffled frog-leaping algorithm (SFLA), which performs the clustering. We evaluated the approach on a well-known text document dataset: the “20Newsgroup” dataset from the University of California Irvine Machine Learning Repository. Across multiple experiments, the proposed algorithm produced better clustering results on the 20Newsgroup dataset than classical K-means clustering. This improvement comes at the cost of longer computation time.
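To make the division of labor concrete, here is a minimal pure-Python sketch of the clustering half, under several assumptions. It is not the code from this repo. A "frog" is assumed to be a list of k centroids, fitness is assumed to be the sum of squared distances to the nearest centroid, and the feature mask is written by hand to stand in for what the GA would select. All function names and parameters (`apply_mask`, `leap`, `sfla_cluster`, `n_memeplexes`, and so on) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(centroids, docs):
    """Sum of squared distances from each document to its nearest centroid."""
    return sum(min(sq_dist(d, c) for c in centroids) for d in docs)

def apply_mask(docs, mask):
    """Keep only features whose mask bit is 1 (stand-in for the GA's output)."""
    return [[x for x, keep in zip(d, mask) if keep] for d in docs]

def leap(worst, target, rng):
    """Move each coordinate of `worst` a random fraction of the way to `target`.

    Centroids are paired by index, which ignores label permutations
    (a simplification made for this sketch).
    """
    return [[w + rng.random() * (t - w) for w, t in zip(wc, tc)]
            for wc, tc in zip(worst, target)]

def sfla_cluster(docs, k, n_frogs=12, n_memeplexes=3, shuffles=20,
                 local_steps=5, seed=0):
    rng = random.Random(seed)

    def new_frog():
        # A frog is a candidate solution: k centroids drawn from the documents.
        return [list(rng.choice(docs)) for _ in range(k)]

    def cost(frog):
        return sse(frog, docs)

    frogs = [new_frog() for _ in range(n_frogs)]
    history = []  # best cost at the start of each shuffle
    for _ in range(shuffles):
        frogs.sort(key=cost)  # best frog first
        global_best = frogs[0]
        history.append(cost(global_best))
        # Deal frogs into memeplexes like cards: frog i goes to memeplex i % m.
        memeplexes = [frogs[i::n_memeplexes] for i in range(n_memeplexes)]
        for plex in memeplexes:
            for _ in range(local_steps):
                plex.sort(key=cost)
                worst = plex[-1]
                # Try leaping toward the local best, then the global best.
                for target in (plex[0], global_best):
                    candidate = leap(worst, target, rng)
                    if cost(candidate) < cost(worst):
                        break
                else:
                    candidate = new_frog()  # no improvement: random restart
                plex[-1] = candidate
        # Shuffle step: merge the memeplexes back into one population.
        frogs = [f for plex in memeplexes for f in plex]
    return min(frogs, key=cost), history

# Toy data: two groups in the first two features, noise in the third.
docs = [[0, 0, 5], [0, 1, 9], [1, 0, 2], [1, 1, 7],
        [10, 10, 4], [10, 11, 8], [11, 10, 1], [11, 11, 6]]
mask = [1, 1, 0]  # hand-written here; the GA would search for this
reduced = apply_mask(docs, mask)
centroids, history = sfla_cluster(reduced, k=2)
```

In the full approach the GA would evolve a population of such masks and score them by clustering quality, which is one reason the combined method takes longer to run than plain K-means.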

2. Resource Description

For more details, please visit Paper Resource. This repo contains the following:

  • SFLA-only code.
  • GA-SFLA code.
  • GA-K-means code.
  • The scientific paper "Genetic Frog Leaping Algorithm for Text Document Clustering", which describes the project in detail.
  • The scientific presentation "Genetic Frog Leaping Algorithm for Text Document Clustering", which describes the project.

In addition, the presentation is available on SlideShare.