Overview: This course will cover the basic statistical knowledge necessary for a graduate student to design, execute, and analyze a basic research project. The course aims to have students focus on thinking about the biological processes that they are studying in their research and how to translate them into statistical models. The course will take a hands-on computational approach, teaching students the statistical programming language R. In addition to teaching the fundamentals of data analysis, we will emphasize several key concepts of efficient computer programming that students can use in a variety of other areas outside of data analysis.

We will emphasize the underlying principle behind modern statistical analysis: that nearly every biological system can be described with a simple series of linear or nonlinear relationships created by a data generating process, with variation in the data generated by some meaningful error generating process. Additionally, we will emphasize thinking about whole biological systems, causality, and the limits of inference that can be drawn from observational versus experimental studies.
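As a minimal sketch of this principle in R, a linear data generating process with a normal error generating process can be simulated and then recovered with `lm()`. The intercept, slope, error standard deviation, sample size, and variable names below are made-up values chosen for illustration, not course data.

```r
# Illustrative only: every parameter value here is an assumption.
set.seed(42)                      # make the simulation reproducible

n         <- 200                  # hypothetical sample size
intercept <- 2                    # hypothetical true intercept
slope     <- 0.5                  # hypothetical true slope
sigma     <- 1                    # hypothetical sd of the error

# Data generating process: a linear relationship between x and the mean of y
x  <- runif(n, min = 0, max = 10)
mu <- intercept + slope * x

# Error generating process: normal variation around that line
y <- rnorm(n, mean = mu, sd = sigma)

# Fit a linear model and compare the estimates with the values used above
fit <- lm(y ~ x)
coef(fit)
```

The estimated coefficients should land close to, but not exactly on, the values used in the simulation; the gap between them is what the error generating process contributes.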

The course will build through a series of topics. We will begin by thinking about the basics of how we sample populations and how we describe those samples. We will move on to the fundamentals of frequentist hypothesis testing as a jumping-off point for deriving inference from a sample of data. We will focus this understanding on simple linear data generating processes with a normal, or Gaussian, error generating process. We will use this framework to explore Likelihoodist and Bayesian modes of drawing inference from data and discuss when each approach might be right for a given problem. With this firm footing, we will move on to examine the analysis of manipulative experiments, complex multi-causal models, and nonlinear data generating processes with non-normal error generating processes.
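The last step of that progression, a nonlinear data generating process with a non-normal error generating process, might look like the following sketch in R. It assumes count data with an exponential relationship and Poisson error, fit with `glm()`. The coefficient values, sample size, and variable names are hypothetical and chosen only for illustration.

```r
# Illustrative only: every parameter value here is an assumption.
set.seed(7)                       # make the simulation reproducible

n  <- 300                         # hypothetical sample size
b0 <- 0.5                         # hypothetical intercept on the log scale
b1 <- 0.3                         # hypothetical slope on the log scale

# Data generating process: the mean count grows exponentially with x
x      <- runif(n, min = 0, max = 5)
lambda <- exp(b0 + b1 * x)

# Error generating process: Poisson variation around that mean
counts <- rpois(n, lambda = lambda)

# Fit a generalized linear model with a log link and Poisson error
fit_pois <- glm(counts ~ x, family = poisson(link = "log"))
coef(fit_pois)
```

The log link keeps the predicted mean positive, which a straight line fit with normal error would not guarantee for count data.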

Along the way, we will stress how to deal with modern complex data sets, emphasize efficient computation, and consider deeply the philosophical foundations that underpin modern statistical inference in biology.

Objectives:

  1. To learn how to think about your study system and research question of interest in a systematic way in order to design an efficient sampling and experimental research program.
  2. To understand how to analyze collected data to derive the most information possible about your research questions.
  3. To gain the grounding needed to effectively collaborate with statistical experts.
  4. To feel sufficiently comfortable with the basic principles of statistical analysis that you can learn and implement techniques outside of the purview of this course.

Prerequisites: I will assume a basic knowledge of algebra. Undergraduate courses in probability theory and computer science are useful, but not required.

Content and teaching approach: The course will be a mixture of lecture and hands-on data analysis lab. Students will be expected to have a computer available during the course so that they can follow examples and attempt in-class problems.

Grading: Your grade will be determined by a combination of weekly homework, a course blog, in-class quizzes, a midterm exam, and a final paper. Homework will consist of weekly problem sets and will be worth 40% of your course grade. In-class quizzes will comprise 10%. The midterm exam will be take-home and worth 20%. The final paper will be worth 30%. Additionally, there will be multiple opportunities for extra credit along the way.

Homework: All homework done using R should be turned in as an RMarkdown document (http://rmarkdown.rstudio.com/). I will conduct a short tutorial in class. I’ll provide a directory structure to make sure you write a document that I can compile and edit. Note – all slides will be written using RMarkdown, and code will be made available as an example. Please turn your homework in with the following filename format: lastname_#.Rmd

Code of Conduct and Academic Integrity: It is the expressed policy of the University that every aspect of academic life–not only formal coursework situations, but all relationships and interactions connected to the educational process–shall be conducted in an absolutely and uncompromisingly honest manner. The University presupposes that any submission of work for academic credit is the student’s own and is in compliance with University policies, including its policies on appropriate citation and plagiarism. These policies are spelled out in the Code of Student Conduct. Students are required to adhere to the Code of Student Conduct, including requirements for academic honesty, as delineated in the University of Massachusetts Boston Graduate Catalogue and relevant program student handbook(s). http://www.umb.edu/life_on_campus/policies/code You are encouraged to visit and review the UMass website on Correct Citation and Avoiding Plagiarism: http://umb.libguides.com/citations

Accommodations: The University of Massachusetts Boston is committed to providing reasonable academic accommodations for all students with disabilities. This syllabus is available in alternate format upon request. If you have a disability and feel you will need accommodations in this course, please contact the Ross Center for Disability Services, Campus Center, Upper Level, Room 211 at 617.287.7430. http://www.umb.edu/academics/vpass/disability/ After registration with the Ross Center, a student should present and discuss the accommodations with the professor. Although a student can request accommodations at any time, we recommend that students inform the professor of the need for accommodations by the end of the Drop/Add period to ensure that accommodations are available for the entirety of the course.