This course is an introduction to data mining, a relatively new field of computer science. Data mining comprises algorithms designed to find patterns in data.
Huge amounts of data are collected in science and engineering, environmental control, business and management, and government administration. In addition, we generate data every day as we walk around with our cell phones, tweet, leave reviews on Yelp, etc. Data is everywhere, and in this class we will talk about how to find patterns in data that teach us about the processes that generated it (including ourselves!).
Topics covered in this course include:
• Classification
• Regression
• Association Analysis
• Clustering
• Recommender Systems
• Mining the Web
• Data Exploration and Visualization
• Applications
The objective of this course is to present an overview of data mining techniques, including algorithms for regression, classification, association analysis, clustering, and recommender systems. Students will learn to understand and sensibly use state-of-the-art data mining algorithms and tools, evaluate their results, and apply knowledge extracted from data to diverse real-life problems. The labs will reinforce the material taught in the lectures and provide guidance for experimenting with various data mining algorithms. The assignments will give students practice implementing some popular machine learning algorithms.