Data sharing concerns
One of the challenges of sharing human subjects data is the risk that your data may identify an individual, either directly or indirectly. Additionally, the information in your dataset may be legally protected or sensitive, which could lead to legal repercussions for you and/or bring harm to the individual if that information is released and linked to that individual’s identity.
Disclosure is the unauthorized release of information that may identify an individual research participant or organization. Examples of disclosive information include:
- Direct identifiers or Personally Identifiable Information (PII), such as name, address, social security number, and phone number.
- Indirect identifiers, such as zip code, birthdate, education, and race/ethnicity, that could be used in combination to uniquely identify an individual.
- Information in a dataset that can be linked with outside information, from sources such as social media, administrative data, or other public datasets, that results in identification of an individual.
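To illustrate how indirect identifiers can single out a person in combination, the following is a minimal sketch that flags records whose combination of indirect identifiers is shared by fewer than `k` records. The field names (`zip`, `birth_year`, `education`), the sample records, and the threshold `k=2` are all hypothetical. This is a rough screening aid, not a complete disclosure risk assessment, and it does not account for linkage with outside information.

```python
from collections import Counter

def risky_records(records, indirect_identifiers, k=2):
    """Return records whose combination of indirect identifiers
    is shared by fewer than k records in the dataset."""
    def key(record):
        return tuple(record[field] for field in indirect_identifiers)

    # Count how many records share each combination of values.
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < k]

# Hypothetical example data (no direct identifiers present).
records = [
    {"zip": "55455", "birth_year": 1990, "education": "BA"},
    {"zip": "55455", "birth_year": 1990, "education": "BA"},
    {"zip": "55414", "birth_year": 1985, "education": "PhD"},
]

flagged = risky_records(records, ["zip", "birth_year", "education"], k=2)
```

Here the third record would be flagged, because no other record shares its zip code, birth year, and education level. A larger `k` is a stricter test.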
Legally protected data have restrictions placed on them by law. Examples include:
- Family Educational Rights and Privacy Act (FERPA) protected educational records data, such as grades
- Health Insurance Portability and Accountability Act (HIPAA) protected medical or healthcare data
Sensitive data include any information that may cause harm, legal jeopardy, or reputational damage to the subject if disclosed. Such data may or may not be legally protected. Examples include:
- Criminal or illegal behaviors, such as drug use
- Mental health information
- Sexual behaviors
- Information about minors or other vulnerable populations
Before sharing human subjects data publicly, the dataset should have a low disclosure risk or be free of disclosive information. This involves removing both direct identifiers (see the University procedure for de-identifying health data) AND indirect identifiers that may pose a disclosure risk.
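As a rough illustration of this step, the sketch below drops direct identifiers from a record and coarsens two indirect identifiers. The column names, the ISO-style `YYYY-MM-DD` birthdate format, and the choice to truncate zip codes to three digits are assumptions made for the example. They are not the University procedure and not a compliance standard, so the appropriate level of coarsening depends on your data and on any applicable regulations.

```python
# Hypothetical column names for direct identifiers.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "phone"}

def deidentify(record):
    """Drop direct identifiers and coarsen selected indirect identifiers."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Coarsen zip code to its first three digits (illustrative choice).
    if "zip" in cleaned:
        cleaned["zip"] = cleaned["zip"][:3] + "XX"

    # Replace a full birthdate (assumed "YYYY-MM-DD") with the birth year.
    if "birthdate" in cleaned:
        cleaned["birth_year"] = cleaned.pop("birthdate")[:4]

    return cleaned

# Hypothetical example record.
example = {
    "name": "Participant A",
    "phone": "555-0100",
    "zip": "55455",
    "birthdate": "1990-04-12",
    "score": 3,
}
result = deidentify(example)
```

After a step like this, the remaining indirect identifiers should still be reviewed, since coarsened values can be disclosive in small samples.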
See our guide on how the libraries' Data Repository detects sensitive information.
If your data contain legally protected or sensitive information, or if removing identifiers limits the usefulness of your data, consider sharing through a restricted-access repository, such as the Inter-University Consortium for Political and Social Research (ICPSR).
In addition to the content of the data, the agreement made with participants under your Institutional Review Board (IRB) protocol can also limit the extent to which human subjects data can be shared.
Need more help or information?
College of Liberal Arts (CLA) LATIS Research Services
Clinical and Translational Science Institute (CTSI) Research Toolkit