CMPS 4450 Data Mining and Visualization (4)
Knowledge discovery and visualization for large datasets, including data
warehouses and text-based information systems. Topics include data mining
concepts, information retrieval, analysis methods, storage systems,
visualization, implementation, and applications.
Prerequisite: CMPS 3120
Data structures
Algorithm analysis
Relations and sets
Graph theory
4 semester units. 3 units lecture (150 minutes), 1 unit lab (150 minutes).
Selected elective for CS
Data Mining: Concepts and Techniques, Third edition. Jiawei Han, Micheline
Kamber, and Jian Pei. Morgan Kaufmann Publishers, 2012,
ISBN-13 978-0-12-381479-1.
Author's website (textbook errata):
http://www.cs.uiuc.edu/~hanj/bk3
Melissa Danforth
This course covers the following ACM/IEEE CS2013 (Computer Science)
Body of Knowledge student learning outcomes:
CS-IM/Data Mining
CS-IS/Basic Search Strategies
CS-IS/Reasoning Under Uncertainty
CS-IS/Advanced Machine Learning
The course maps to the following performance indicators for Computer Science
(CAC/ABET):
- 3a. An ability to apply knowledge of computing and mathematics appropriate
to the discipline.
- 3e. An understanding of professional, ethical, legal, security, and social
issues and responsibilities.
- 3g. An ability to analyze the local and global impact of computing on
individuals, organizations, and society.
Week | Chapter(s) | Topics |
1 | Outside information and Chapter 1 | Ethics of data mining, Introduction to data mining |
2 | Chapter 2 | Data characteristics, Visualization, Similarity and dissimilarity measures |
3 and 4 | Chapter 3 | Data preprocessing, reduction, and transformation |
5 | Chapter 6 | Patterns, correlations and associations |
6 | Chapter 6 | Rule generation, Evaluating generated association rules |
7 | Chapters 7 and 8 | Overview of advanced pattern mining, Classification introduction |
8 | Chapter 8 | Decision trees, Naive Bayesian classifier, Rule-based classifier |
9 | Chapters 8 and 9 | Evaluating classifiers, Bayesian belief networks |
10 | Chapter 9 | Support vector machines, k-nearest neighbor, Additional classifiers |
11 | Chapter 10 | Clustering, k-means, k-medoids, Hierarchical clustering |
12 | Chapter 10 | Density clustering, Grid-based clustering, Evaluating clusters |
13 | Chapter 11 | Fuzzy clustering, Clustering with constraints |
14 | Chapter 12 | Outlier detection |
15 | Chapters 4 and 5 | Data warehousing and views |
Not applicable to this course.
Melissa Danforth on 31 July 2014
Approved by CEE/CS Department on [date]
Effective Fall 2016