CMPS 4450 Data Mining and Visualization (4)
Knowledge discovery in and visualization of large datasets, including data 
warehouses and text-based information systems. Topics covered include data 
mining concepts, information retrieval, analysis methods, storage systems, 
visualization, implementation and applications.
Prerequisite: CMPS 3120
Data structures
Algorithm analysis
Relations and sets
Graph theory
4 semester units. 3 units lecture (150 minutes), 1 unit lab (150 minutes).
Selected elective for CS
Data Mining: Concepts and Techniques, Third edition. Jiawei Han, Micheline
Kamber, and Jian Pei. Morgan Kaufmann Publishers, 2012, 
ISBN-13 978-0-12-381479-1.
Author's website (textbook errata): 
http://www.cs.uiuc.edu/~hanj/bk3
Melissa Danforth
This course covers the following ACM/IEEE CS2013 (Computer Science) 
Body of Knowledge student learning outcomes: 
CS-IM/Data Mining
CS-IS/Basic Search Strategies
CS-IS/Reasoning Under Uncertainty
CS-IS/Advanced Machine Learning
The course maps to the following performance indicators for Computer Science
(CAC/ABET):
- 3a. An ability to apply knowledge of computing and mathematics appropriate 
to the discipline.
- 3e. An understanding of professional, ethical, legal, security, and social 
issues and responsibilities.
- 3g. An ability to analyze the local and global impact of computing on 
individuals, organizations, and society.
| Week | Chapter(s) | Topics | 
|---|---|---|
| 1 | Outside information and Chapter 1 | Ethics of data mining, Introduction to data mining | 
| 2 | Chapter 2 | Data characteristics, Visualization, Similarity and dissimilarity measures | 
| 3 and 4 | Chapter 3 | Data preprocessing, reduction, and transformation | 
| 5 | Chapter 6 | Patterns, correlations and associations | 
| 6 | Chapter 6 | Rule generation, Evaluating generated association rules | 
| 7 | Chapters 7 and 8 | Overview of advanced pattern mining, Classification introduction | 
| 8 | Chapter 8 | Decision trees, Naive Bayesian classifier, Rule-based classifier | 
| 9 | Chapters 8 and 9 | Evaluating classifiers, Bayesian belief networks | 
| 10 | Chapter 9 | Support vector machines, k-nearest neighbor, Additional classifiers | 
| 11 | Chapter 10 | Clustering, k-means, k-medoids, Hierarchical clustering | 
| 12 | Chapter 10 | Density clustering, Grid-based clustering, Evaluating clusters | 
| 13 | Chapter 11 | Fuzzy clustering, Clustering with constraints | 
| 14 | Chapter 12 | Outlier detection | 
| 15 | Chapters 4 and 5 | Data warehousing and views | 
Not applicable to this course.
Melissa Danforth on 31 July 2014
Approved by CEE/CS Department on [date] 
Effective Fall 2016