Data management good practices

Getting started

Finished with your research project? Make sure that you can continue to access your data into the future. Digital data requires ongoing attention and support to ensure that the files remain accessible.

When wrapping up a project, some initial questions you'll need to answer are:

  • Which files do you need to keep long-term?
  • What file formats are your data saved in?
  • Where will you store the data?
  • When and how will you back up your data?
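
One way to start answering the file format question is to take an inventory of the extensions in your project folder. The following is a minimal sketch in Python, not a complete audit tool: the function name count_extensions and the folder name my_project are placeholders chosen for illustration, and a file extension is only a hint about the actual format.

```python
from collections import Counter
from pathlib import Path


def count_extensions(project_dir):
    """Count the files under project_dir, grouped by lowercase extension."""
    counts = Counter()
    for path in Path(project_dir).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # "my_project" is a placeholder; replace it with your own folder.
    root = Path("my_project")
    if root.is_dir():
        for extension, number in count_extensions(root).most_common():
            print(f"{extension}\t{number}")
```

The resulting counts can help you spot proprietary formats that may need attention before the project is archived.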

Common long-term access issues

One of the most common short-term reasons you might be unable to open a file is that it uses a proprietary file format and you have lost access to the software or hardware needed to open it. This may happen if:

  • your subscription to a program lapses,
  • the program developers stop supporting and updating a program, or
  • you upgrade to new hardware without transferring or purchasing a program license.

Other potential reasons you may be unable to open a file are that it has become corrupted or that it is stored on obsolete storage media.
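
Corruption is easier to catch if you record a checksum for each file while it is known to be good and compare against it later. The sketch below assumes SHA-256 checksums and a simple in-memory manifest; the function names (sha256_of, build_manifest, changed_files) are illustrative, and this is not a substitute for a managed repository or backup service.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old_manifest, root):
    """List files from old_manifest that are now missing or different."""
    current = build_manifest(root)
    return sorted(
        name
        for name, checksum in old_manifest.items()
        if current.get(name) != checksum
    )
```

A typical use would be to call build_manifest when the project ends, save the result alongside your data, and run changed_files on each stored copy from time to time. Any file it lists should be restored from another copy.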

Proprietary vs. open file formats

Some file formats are the property of the individual, organization, or company that created them: the specifications behind the format are closed, and often only one specific program can fully open files in that format. For example, a .DWG file is designed to be opened in AutoCAD. Sometimes a format has been reverse engineered so that other software can interpret and open the file, although it may not render accurately.

Other file formats are openly documented and widely supported, so many programs can read and open them. Examples include MP3 and CSV.

Maintaining access to your files

  • Using open-source programs (for example, using LibreOffice to read Microsoft Office documents) can help when the original program is not available.
    • Downside: Developers may stop supporting an open-source program, and the program may not read and display the data in the file the same way the originating program did.
  • Keeping multiple copies of important documents in different storage locations can help you recover if one copy of the file degrades.
    • Downside: Keep an eye on where your files are stored, since files on removable media need to be migrated every 3-5 years.
  • Converting your files to more open formats will help ensure that you have access to some form of your data for the long-term.
    • Downside: File formats have different features and restrictions in how they are read, displayed, and manipulated by software. Keep a copy of your data in the original format, just in case.

Converting files

Keep in mind that whenever you convert file formats or use an alternative program to open the original file, you may encounter unexpected changes or limitations in how you can read and edit that data. For this reason, proceed carefully and use the suggested formats below as a guideline. It is a good idea to keep a copy of your data in its original file format. If you have specific questions about your file formats, please contact us at data@umn.edu.
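
As an illustration of converting while keeping the original, the sketch below exports one table from a database file to CSV. It assumes the source is a SQLite database, which is our choice for the example because Python can read it without extra software; the function name export_table_to_csv and any file or table names you pass in are hypothetical. The function only runs a SELECT query, so the original database file is left as it was.

```python
import csv
import sqlite3


def export_table_to_csv(db_path, table, csv_path):
    """Write every row of `table` to csv_path and return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        header = [column[0] for column in cursor.description]
        row_count = 0
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                row_count += 1
    finally:
        conn.close()
    return row_count
```

This also shows the kind of change a conversion can introduce: CSV stores everything as text, so column types are not preserved and empty database values (NULL) are written as empty fields. Comparing the returned row count with the source table is a quick first check that nothing was dropped.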

Suggested file formats

We recommend using non-proprietary, open, non-compressed file types when possible. Some suggested file types are:

  • Databases: XML, CSV
  • Geospatial: SHP, DBF, GeoTIFF, NetCDF
  • Moving images: MOV, MPEG, AVI, MXF
  • Sounds: WAVE, AIFF, MP3, MXF
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Still images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
  • Tabular data: CSV
  • Text: XML, PDF/A, HTML, ASCII, UTF-8
  • Web archive: WARC
  • Containers: TAR, GZIP, ZIP
Last Updated: Mar 29, 2024 3:03 PM