Lab 1 - Introduction to Ethics
Due: Friday by 3:30pm
Data mining carries with it several ethical dimensions that we will explore
in the first two labs in the class. Today's lab will focus on historical
background, particularly with respect to scientific/medical research that
involves data from humans (human subjects research).
There will be a brief lecture at the start of lab. Here are some reading
assignments that will give background on the lecture:
- National Commission for the Protection of Human Subjects
  - This commission was formed in the 1970s in response to some egregious
    breaches of ethics in biomedical research in the preceding decades.
- The Belmont Report
  - This report was the product of the National Commission for the Protection
    of Human Subjects. It is considered the foundation of ethical human subjects
    research today, even when that research is not biomedical in nature.
- National Institutes of Health FAQ on public vs private information
  - One of the key criteria for research to be classified as human subjects
    research in a research environment is that it gathers "identifiable
    private information". This FAQ answers questions about that definition.
- Considerations and Recommendations Concerning Internet Research
  - This is a report from HHS specifically addressing human subjects research
    on the Internet, including when data mining is considered human subjects
    research and when it is just processing existing public information.
After the lecture, break into groups and discuss whether each of the following
scenarios would be considered human subjects research. Each group should write
a report on its discussion.
- Alice is a medical researcher at a major public university. Alice goes to
a nationally recognized data center and downloads a public, anonymous database
to train neural networks to detect warning signs for heart disease. The
original database creator has obtained informed consent from all the people
whose data is in the database.
- Bob is a master's student in science education who is studying the causes
of dropping out of college. Bob uses Facebook to gather information about
college-aged people in order to determine if social media postings have any
predictive value in determining who graduates college and who drops out. Bob
anonymizes this data and posts it online.
- Carol is a programmer for a major retailer. Carol uses transaction data
from the retailer's loyalty program to tailor marketing campaigns to specific
customers' shopping habits.
After the groups have had sufficient time for discussion, we will get back
together and discuss the three scenarios as a class.
Each group should choose one group member to upload their report to Moodle.
Make sure every group member's name is on the report.