Data management good practices

Why cite data?

Giving appropriate attribution to research data improves data discoverability, signals the usefulness and fitness of data, provides citable contributions to the scholarly record, and supports long-term reusability. Many journals and publishers recognize the need to cite data in articles. Here is a list of journals that have open data policies.

Elements of an effective data citation

Use this general guide both to create a citation for your own data, if needed, and to cite datasets that you used in your research.

  • Data authors. Just as with a manuscript publication, data authors should include everyone who helped to create the dataset (such as the study PI, data collector, data analyst, etc.).
  • Dataset title. Include a title that is unique and distinct from the associated manuscript title.
  • Location of the data. Name the data repository that houses the data, or indicate where the data may be discovered.
  • Bibliographic metadata. Include year, edition, volume, or version of the dataset.
  • Viewing platform, if applicable. Some datasets are available only via special interactive viewers. Indicate which viewer was used to generate the data you selected/used. 
  • Parameters selected or used, if applicable. If citing secondary data, indicate which sections of the dataset you used.
  • Date accessed. If citing secondary data, include the date you last accessed it.
  • Persistent URL. All published datasets should have a persistent URL, the most common of which is a DOI. A persistent URL is a link that remains the same over time, even when the website it connects to is updated.
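
The elements above can be assembled programmatically when you need to produce many citations in a consistent form. The following is a minimal Python sketch, not an official citation style: the `DataCitation` class, its field names, and the `format_citation` function are hypothetical names chosen for illustration, and the element order is an assumption modeled on the Dryad example in this guide. Check the result against the style required by your journal or repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements listed above."""
    authors: str                    # data authors, already formatted
    year: int                       # bibliographic metadata
    title: str                      # dataset title
    repository: str                 # location of the data
    persistent_url: str             # persistent URL, e.g. a DOI link
    version: Optional[str] = None   # edition or version, if applicable
    accessed: Optional[str] = None  # date accessed, for secondary data


def format_citation(c: DataCitation) -> str:
    """Join the elements in an author-date order (an assumed order, not a standard)."""
    parts = [f"{c.authors} ({c.year}).", f"{c.title} [Dataset]."]
    if c.version:
        parts.append(f"Version {c.version}.")
    parts.append(f"{c.repository}.")
    parts.append(c.persistent_url)
    if c.accessed:
        parts.append(f"Accessed: {c.accessed}.")
    return " ".join(parts)


example = DataCitation(
    authors="Cassidy, Kira et al.",
    year=2022,
    title="Gray wolf packs and human-caused wolf mortality",
    repository="Dryad",
    persistent_url="https://doi.org/10.5061/dryad.mkkwh713f",
)
print(format_citation(example))
```

Optional elements such as the version and the date accessed are appended only when they are supplied, so the same function can serve both your own datasets and secondary data.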

Citation examples

Dataset from a data repository: 
Cassidy, Kira et al. (2022). Gray wolf packs and human-caused wolf mortality [Dataset]. Dryad. https://doi.org/10.5061/dryad.mkkwh713f

Tables, charts, graphs, maps or figures appearing in a publication:
U.S. Fish and Wildlife Service. “Table 373. Threatened and Endangered Wildlife and Plant Species: 2009.” Statistical Abstract of the United States. Year: 2010. https://www2.census.gov/library/publications/2010/compendia/statab/129ed/tables/geo.pdf. Accessed: 09/04/2024. 

Interactive database with static URLs:
United States Bureau of the Census. “Population Change.” Dataset: 2020 Census Demographic Data Map Viewer. Parameters selected: Population Change. Date Generated: 09/04/2024. https://maps.geo.census.gov/ddmv/map.html

Interactive database without static URLs:
Bureau of Economic Analysis. "Per Capita real GDP by state (chained 2000 dollars)." Dataset: Gross Domestic Product by State. Parameters: all industry total, 2008, all states and regions. Regional Economic Accounts. Date Generated: 11/04/09.

American Chemical Society's Style for Printed Data Sets:
Rind, D. 1994. General Circulation Model Output Data Set. IGBP PAGES/World Data Center for Paleoclimatology Data Contribution Series #1994-012. NOAA/NCDC Paleoclimatology Program, Boulder, Colorado, USA.

American Chemical Society's Style for Data from a Database:
4-Bromo-2-fluorotoluene. SDBSWeb. National Institute of Advanced Industrial Science and Technology. n.d. https://sdbs.db.aist.go.jp (accessed 2019-03-17). (CAS RN: 51436-99-8).

Geoscience Information Society's Style for Data Sets:
Defosse, G.E., and M. Bertiller. 1998. NPP Grassland: Media Luna, Argentina, 1981-1983. Data set. Available on-line [http://www.daac.ornl.gov/] from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A.

Last Updated: Oct 21, 2024 4:50 PM