Data management best practices

Data Management Bootcamp

Wednesday, January 12, 2022 | 9:30 a.m. - noon | Online

Hosted by the University Libraries 

Free to U of M graduate students

Registration is now closed. Please keep an eye out for future events.

Keep your research organized by learning data management skills! This virtual workshop is designed for graduate students and new researchers who will be managing a research project for the first time. The workshop is split into two portions: an asynchronous portion that introduces basic data management topics, and a live online portion in which you can discuss how these topics apply to your research with a data management expert and see demonstrations of reproducible workflows. You will leave with long-term strategies to organize your project and actionable steps that you can put into practice right away.

Sponsored by the University Libraries, LATIS, the Graduate School, OVPR, and the Informatics Institute.

Questions? Contact

Data management journey breakout sessions

The live workshop will consist of breakout sessions in which you will build upon the basic data management lessons introduced in the asynchronous section of this event, and apply them to real research workflows.

We plan to tailor these sessions to where you currently are (or are looking to go) on your data management journey, from just getting started, to deep in the middle of research, to nearing the end of your degree or project.

Choose the best fit for where you are!

  • Starting your journey: This breakout session will review data management basics and give students an overview of the UMN resources and skills they need to develop good research data management practices. Students who are starting their graduate degrees or just beginning their research careers will benefit the most from this session.

  • In the woods: This breakout session will walk you through how to help collaborators, curators, and future you understand your data later in the research process. The focus will be on documentation and how to get your team on the same page. A data curator will share strategies to help make your data more FAIR (findable, accessible, interoperable, and reusable). Students who are in the middle of their academic careers, or who work in a lab or with a team, are encouraged to attend.

  • Approaching the clearing: This breakout session will cover topics to think about when approaching the end of your graduate degree or research career at this institution. Topics include planning for changes in access to storage, software, and ownership, as well as options for sharing data and materials from your thesis or research projects. Students who are in the last few years of their degrees will benefit from this discussion, as will anyone who is interested in planning ahead.

Tool-based workflow sessions

The second part of the live workshop will involve tool-based demonstrations. Presenters will walk through how to use a tool within a workflow, including how it touches on where and how files are stored or managed. 

Although no experience with any of these tools is required, these sessions are not necessarily designed to teach the tool, but rather to demonstrate ways it can be used and integrated into strong data management workflows.

These sessions will be recorded and shared on the Data Management Bootcamp Canvas site. You will need to select ONE to attend, but you will be able to access the recordings of the others afterward. Each session will have its own Zoom link, which will be shared on the Canvas site on the day of the bootcamp.

  • Using R with sensitive data (Alicia Hofelich Mohr, Ph.D.): R is a free and common tool used for data analysis (like SPSS, Stata, or SAS). This session will demonstrate how to use R in connection with UMN's Box Secure Storage to clean sensitive data and create reproducible reports for analysis. 

  • Organizing PDFs and article citations with a manager (Jody Kempf): Zotero is a free, easy-to-use tool to help you collect, organize, cite, and share research. This session will demonstrate how to set up and use Zotero to collect and organize citations, PDFs, and other files. We will then demonstrate how to use Zotero to format citations in a variety of styles, add in-text citations to Microsoft Word and Google Docs, and share citations with others using Zotero's group feature.

  • Collecting and organizing data from the web (Valerie Collins): This session features a web browser extension for scraping web data that requires no coding knowledge to run. The session will demonstrate the basic uses of this tool, discuss some considerations specific to capturing and organizing web-based data and content, and look at some workflows for incorporating other tools to manage the collected data.

  • Research workflows in Python and Jupyter (Michael Beckstrand, Ph.D.): Python is a free open-source data science tool with many applications in the social sciences and digital humanities. Jupyter provides a handy browser-based environment for writing, running, and publishing Python code. This session will walk through a reproducible research workflow for using Python/Jupyter to analyze textual data.

  • Critical appraisal of data using the CURATED framework (Shanda Hunt): CURATED is an open access framework for critically appraising data "packages" - whether the end goal is project sunsetting or public sharing (soon to be mandated by NIH). This session will teach you how to critically appraise your research data for findability, accessibility, interoperability, and reuse, with a focus on human participant data.

  • Writing Your Thesis or Dissertation with R Markdown, GitHub, and Friends (Hava Blair): The R ecosystem can be used for more than statistical analysis!  This session will demonstrate how to use RStudio, R Markdown documents, and git/GitHub version control to write your thesis or dissertation in a fully reproducible and transparent way.  We will cover set-up, handling citations, incorporating code and analysis, version control, and output options for your beautiful, reproducible documents (PDF, Word, HTML).

Last Updated: Jan 7, 2022 12:56 PM