Mining microarray gene expression data for cancers: correlation analysis and neighborhood-based interestingness analysis of emerging patterns

Primary language: Java

mining-microarray

This is a project for a Data Mining class, completed as part of my Master's degree.

Overview

Mining microarray data: correlation analysis and neighborhood-based interestingness analysis of emerging patterns.

The given data is discretized using entropy-based discretization, and the FP-growth algorithm is used to mine minimal gene sets correlated with a class. Emerging patterns are mined for each class, and interestingness analysis is based on the total distance of a pattern.
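As a rough illustration of the discretization step, the sketch below picks a single cut point for one gene's expression values by minimizing the weighted class entropy of the two resulting bins. It is not the project's code: the class name `EntropySplit`, the method names, and the toy data are hypothetical, and it assumes two classes and a single binary split with no stopping criterion (the actual procedure is described in report.pdf).

```java
import java.util.Arrays;

public class EntropySplit {

    // Shannon entropy (base 2) of a two-class sample given its class counts.
    public static double entropy(int pos, int neg) {
        int n = pos + neg;
        if (n == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int count : new int[] {pos, neg}) {
            if (count > 0) {
                double p = (double) count / n;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Returns the cut point (midpoint between two adjacent distinct values)
    // that minimizes the weighted entropy of the two bins, or NaN if all
    // values are equal and no cut is possible.
    public static double bestCut(double[] values, boolean[] labels) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort sample indices by expression value.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        int totalPos = 0;
        for (boolean label : labels) {
            if (label) {
                totalPos++;
            }
        }
        int totalNeg = n - totalPos;

        double best = Double.NaN;
        double bestEntropy = Double.POSITIVE_INFINITY;
        int leftPos = 0;
        int leftNeg = 0;
        for (int i = 0; i < n - 1; i++) {
            if (labels[order[i]]) {
                leftPos++;
            } else {
                leftNeg++;
            }
            double cur = values[order[i]];
            double next = values[order[i + 1]];
            if (cur == next) {
                continue; // no boundary between identical values
            }
            int left = i + 1;
            double weighted = (left * entropy(leftPos, leftNeg)
                    + (n - left) * entropy(totalPos - leftPos, totalNeg - leftNeg)) / n;
            if (weighted < bestEntropy) {
                bestEntropy = weighted;
                best = (cur + next) / 2.0;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data (made up): expression levels of one gene and class labels.
        double[] expression = {1.0, 2.0, 3.0, 10.0, 11.0, 12.0};
        boolean[] tumor = {false, false, false, true, true, true};
        System.out.println(bestCut(expression, tumor)); // prints 6.5
    }
}
```

Each gene would then be replaced by an item such as "gene below cut" or "gene above cut", which is the form of input a frequent-pattern miner like FP-growth expects.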

Project Specifications

Refer to project2.pdf.

Procedure, assumptions and findings

Refer to report.pdf. (I worked on tasks 1 and 3.)

Input file - colon cancer data

A sample input file is available at cc.data.

Execution

Refer to READ ME.txt.