Big data are a pervasive part of biological research; thus, the ability to manage and analyze data has become critical for students across a wide variety of biological disciplines. R, a free and open-source computing language and environment for data analysis, is now a tool of choice within the scientific community for data manipulation, data visualization, statistical analysis, and generation of high-quality figures for presentations and publications.
We will incorporate a workshop to introduce R and RStudio to trainees in the 4EC REU. Over five weekly sessions, trainees will be introduced to vectors, data entry, data frame manipulation, data presentation, writing functions, for-loops, the ‘apply’ family of functions, installing and using R packages, creating plots, and interpreting R help pages. Instruction will take place via live coding, with each student following along step by step as the instructor writes and runs code. Live-coding periods are interspersed with practice exercises and additional “challenge activities” that vary in difficulty, allowing students to choose an appropriate challenge to refine their skills.
This workshop, developed and led by UC Davis graduate students, has been successfully taught to undergraduates in the STEMinist Data Science Initiative at UC Berkeley and UC Davis, and to students in UC Davis summer research programs.