class: center, middle
background-image: url(images/01/Sonoma_coast.jpg)
background-position: center
background-size: cover

# .large[Introduction to Computational Data Analysis for Biology]
## .center[2023 Edition]
### .center[Jarrett Byrnes]
UMass Boston
https://biol607.github.io/

---
class: center

# Why are we here?

data:image/s3,"s3://crabby-images/4e2ed/4e2ed14102834771ed3c5d592d00f16f8d24c11e" alt=":scale 50%"

---

# Who are You?

1. Name
2. Lab
3. Brief research description
4. Why are you here?

--

.center[Write it here: https://etherpad.wikimedia.org/p/607-intro-2022]

---

# Course Goals

1. Learn how to think about your research in a systematic way to design efficient observational & experimental studies. <br><br>

--

2. Understand how to get the most bang for your buck from your data. <br><br>

--

3. Make you effective collaborators with statisticians. <br><br>

--

4. Learn how to program to expand your scientific toolkit. <br><br>

--

5. Make you comfortable enough to learn and grow beyond this class.

---

# What are we doing here?

## Course divided into blocks

--

1. Introduction to computation and reproducibility

--

2. Regression and Inference

--

3. Further Adventures in Statistical Modeling

--

4. Causal Inference and Study Design

---

# Block 1: Computation

```r
# Load the library ####
library(ggplot2)

# Load the data ####
eelgrass <- read.csv("./data/15q05EelgrassGenotypes.csv")

# Plot ####
ggplot(eelgrass, aes(y = shoots, x = treatment.genotypes)) +
  geom_point() +
  stat_smooth(method = "lm") +
  theme_classic(base_size = 17) +
  labs(x = "No. of Genotypes", y = "No. of Shoots per sq. m.")
```

--

.center[.large[.red[Coding is power!]]]

--

.center[.large[.red[Code Forces You to Be Explicit About Biology]]]

---
class:center

# Block 1: Reproducibility

data:image/s3,"s3://crabby-images/0bb45/0bb45ebdba216de58b4b8bc5ee3c3657a593996d" alt=""

---
class:center

# Furthering Open Science

data:image/s3,"s3://crabby-images/6d6f8/6d6f88d7391a923aba807ef084d482ef592acad9" alt="https://www.4open-sciences.org/component/content/article/11-news/276-four-pillars-of-open-science-open-code"

---

# Block 2: Regression

data:image/s3,"s3://crabby-images/d6d84/d6d84281b1ee57a0498c023829c5d9696f3af99d" alt=""<!-- -->

---

# Block 3: Further Adventures

data:image/s3,"s3://crabby-images/aed61/aed61f59cb8f4baada7af4aed7fec5259668c175" alt=""<!-- -->

--

.large[.center[Don't worry! It's all just a line!]]

---

# Block 3: Further Adventures

data:image/s3,"s3://crabby-images/875aa/875aa7dce0aba00a8c337dde8d65c8a369a08f6a" alt=""

--

.large[.center[Don't worry! It's all just a line!]]

---
class:center

# Block 4: Causal Inference

data:image/s3,"s3://crabby-images/aeace/aeace38cd7d8f4c1cd8f2d44763ea664949b77aa" alt=""

---

# Block 4: Study Design

data:image/s3,"s3://crabby-images/19b88/19b8842689a94e8b3b9d29108ab58ca1b2b206f7" alt=""

---

# Block 5: Inference

.center[data:image/s3,"s3://crabby-images/1beea/1beeaa2d8c43d625c5b46bdf02c2e57067057bb3" alt=":scale 50%"]

- What is the probability of a hypothesis? Or data given a hypothesis?
- Is variation explained by a driver of interest?
- How confident are you in your conclusions?
- How can we generalize from our models to the world?

---

# Lecture and Lab

- T/Th Lecture on Concepts
    - Also Paper Discussion, Shiny Apps, etc.
    - Please bring your most interactive self!
    - I will try and make it easy for folk on Zoom

- F Lab
    - Live coding!
    - I will screw up - don't take me as gospel!
    - Be generous with feedback/pace comments
    - Invite your friends!
---

# Yes, Lectures are Coded

R Markdown sometimes with Reveal.js or Xaringan

.center[<img src="images/01/lecture_code.jpg">]

http://github.com/biol607/biol607.github.io

---

# Some Old Technology

.center[
data:image/s3,"s3://crabby-images/e0f09/e0f09ef82fffa3c43eec0776e40d2d01896797ee" alt=""
]

- Green: Party on, Wayne

--

- Red: I fell off the understanding wagon

--

- Blue: Write a question/Other

---
class: center

# Readings for Class: Fieberg

data:image/s3,"s3://crabby-images/54a33/54a33777a70a29b286736732d8b6d6d99c915a24" alt=":scale 45%"

.left[Fieberg, J. 2022. Statistics for Ecologists.]

### https://fw8051statistics4ecologists.netlify.app/

---
class: center

# Help John Out! Annotate His Book!

data:image/s3,"s3://crabby-images/54a33/54a33777a70a29b286736732d8b6d6d99c915a24" alt=":scale 50%"

## https://hypothes.is/signup

---
class: center

# Readings for Class:<br>Wickham & Grolemund

data:image/s3,"s3://crabby-images/b36ac/b36ac8a2cc4e97ac5ea01486f491454a047224e9" alt=":scale 35%"

.left[Wickham, H., Çetinkaya-Rundel, M., and Grolemund, G. 2023. R for Data Science.]

https://r4ds.hadley.nz/

---
class:center

# There will be memes

data:image/s3,"s3://crabby-images/66161/66161e996a62477662782c837e280c6df12ad339" alt=":scale 50%"

--

.large[please feed my #statsmeme addiction]

---

# And Now, A Pop Quiz! (I kid! I kid!)

<br><br><center>
<div style="font-size: 2em;font-weight: bold;">http://tinyurl.com/firstPopQuiz</div>
</center><br><br>

---

# My Actual Policy on Grading

.center[
data:image/s3,"s3://crabby-images/f958d/f958d77029ed552b1b239291b4f3fc497abfb032" alt=""
]

---

# Problem Sets

- THE MOST IMPORTANT THING YOU DO
- Adapted from many sources
- Will often require R
- Complete them using Quarto/Rmarkdown
- Submit via Dropbox or Github

---

# Midterm

- Advanced problem set
- After Regression. Probably.

---

# Final Project

- Topic of your choosing
    - Your data, public data, any data!
    - Make it dissertation relevant!
    - If part of submitted manuscript, I will retroactively raise your grade

- Dates
    - Proposal Due Oct 20th
    - Presentations on Dec 15th
    - Paper due Dec 20th (but earlier fine!)

---

# Impress Yourself: Use Github

.center[data:image/s3,"s3://crabby-images/77ffd/77ffd5172aeb020be151999206f2cf49c7af2038" alt=":scale 50%"]

- This whole class is a github repo
- Having a github presence is becoming a real advantage
- So.... create a class repository!
    - folder for homework, folder for exams, folder for labs
- If you submit a link to your homework in a repo, +1 per homework!
- There will be a github tutorial outside of class hours?

---

# Live La Vida Data Science

- Check out http://www.r-bloggers.com/ and https://rweekly.org/
- Listen to podcasts like https://itunes.apple.com/us/podcast/not-so-standard-deviations/
- Start going to local R User Groups like https://www.meetup.com/Boston-useR/
- Follow data science greats on Twitter (see https://twitter.com/jebyrnes/lists/stats-r-on-twitter)
- Bring up cool things in the UMBRug slack

---

# Help your fellow students

data:image/s3,"s3://crabby-images/57f98/57f980e5fbe9dd21154c104e04ee1dbf17d6b816" alt=""

- Having a problem during homework/exam/etc?
- First, try and solve it yourself (google, stackoverflow, etc.)
- Post a REPRODUCIBLE EXAMPLE to our slack channel
- I notice if you post before I do!

---

# Become Part of the Conversation

Stats and R on Twitter: https://bit.ly/stats_r_twitter

Stats and R on Bluesky: https://bit.ly/bsky_rstats

.center[data:image/s3,"s3://crabby-images/58b24/58b24bca2d7d4b621dacf2c78617eaf36974c37c" alt=":scale 50%"]

---

# Welcome!

<br><br>

.center[.middle[data:image/s3,"s3://crabby-images/62c16/62c16618de8622d35895ba44fc13f8c69f7dc787" alt=""]]