background-image: url("data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/welcome_to_rstats_twitter.png")
background-position: 50% 0%
background-size: 60%
class: bottom

## Writing reproducible manuscripts in R

[**Shilaan Alzahawi**](http://shilaan.rbind.io) @ Stanford Graduate School of Business

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---

### Do your data sci like it's going to need an alibi

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/reproducibility_court.png" width="100%" style="display: block; margin: auto;" />

Slides at [bit.ly/shilaan-apa](https://bit.ly/shilaan-apa)

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---
background-image: url("data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/make-your-own-stats-cartoons/ex_3.png")
background-position: 90% 50%
background-size: 38%

# Introduction 🧊🔨

--

`\(~~\)` `\(~~\)` **Statistics** @ Ghent University

**Organizational Behavior** @ Stanford GSB

🔎 `\(~\)` Statistical inference & hypothesis testing

🔎 `\(~\)` Open and reproducible science

🔎 `\(~\)` Crowdsourced & big team science

---

# Outline

--

**What?**

📝 `\(~\)` Reproducible manuscripts

--

**Why?**

✅ `\(~\)` Benefits

--

**How?**

🛠 `\(~\)` Tutorial

`\(~~~~~~\)` pt. 1: An introduction to **R Markdown**

`\(~~~~~~\)` pt. 2: An introduction to **papaja**

--

`\(~~\)` Example manuscripts at [github.com/shilaan/example-manuscripts](https://github.com/shilaan/example-manuscripts)

---

# The typical workflow

When writing a scientific report, the typical workflow is to ...

--

1. Do your analyses (e.g., in `R`, `Python`, `SPSS`, `SAS`, `Matlab`, or `Stata`)

--

2. Copy-paste or otherwise save your graphs and results

--

3. Open a program (e.g., `Microsoft Word`) to communicate the results

--

4. Manually format your results and citations

--

### Discussion questions

--

What are common challenges when working in this fashion? What kind of problems could arise?

---
class: center, middle

<iframe width="1120" height="630" src="https://www.youtube.com/embed/s3JldKoA0zw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

---

<img src="data:image/png;base64,#http://swcarpentry.github.io/git-novice/fig/phd101212s.png" width="60%" style="display: block; margin: auto;" />

---

# Typical workflow challenges

--

- Time-consuming

--

- Error-prone (e.g., rounding or transcription errors)

--

- Lacks transparency; difficult to reproduce (by others **and** yourself!)

--

- Difficult to maintain and update (endless rewriting and reformatting...)

--

- Overhead costs of different computing/software environments

--

- **Anything else...?**

---
background-image: url("data:image/png;base64,#https://upload.wikimedia.org/wikipedia/en/f/ff/SuccessKid.jpg")
background-position: 50% 92%
background-size: 45%

## An alternative workflow: What?

--

- Fuse your code and writing

--

- Directly embed results in your report

--

- Automatically reflect analytic changes in your documentation

--

- Update all your results, figures, and tables automatically

--

- Automatic formatting (including citations!)

---
background-image: url("data:image/png;base64,#https://raw.githubusercontent.com/allisonhorst/stats-illustrations/master/rstats-artwork/data_cowboy.png")
background-position: 90% 40%
background-size: 50%

## An alternative workflow: Why?

--

Less...

--

⬇️ Error-prone

--

⬇️ Time-consuming

--

More...
--

⬆️ Dynamic

--

⬆️ Reproducible

--

⬆️ Transparent (for others **and** yourself)

---
background-image: url("data:image/png;base64,#https://bookdown.org/yihui/rmarkdown/images/hex-rmarkdown.png")
background-position: 50% 90%
background-size: 20%

## Our weapon of choice: RMarkdown

--

- RMarkdown is an **authoring framework for data science**, designed for reproducibility

--

- The same document holds the code and the narrative surrounding the data

--

- Results are automatically generated from the code

--

- You can use a single R Markdown file to

  ✓ save and execute code, and

  ✓ generate high-quality reports that can be shared with an audience

---

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/rmarkdown_rockstar.png" width="80%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations): **Get your code, text, and outputs in the same (reproducible) place**

---

## Introduction to RMarkdown

--

- Create dynamic analysis documents that combine code, output (incl. figures and tables), and writing

--

- Can be used to

  ✓ Reproduce your analyses

  ✓ Collaborate and share code with others

  ✓ Communicate your results with others

--

- Output formats include HTML, PDF, Word and...
  🤩 Slide shows ([bit.ly/shilaan-apa](https://shilaan-apa.netlify.app))

  🤩 Websites ([shilaan.rbind.io](http://shilaan.rbind.io))

  🤩 Blogs

  🤩 Books

  🤩 CVs

  🤩 Dashboards

  🤩 Interactive documents

  🤩 Conference posters

  🤩 Manuscripts

---
background-image: url("data:image/png;base64,#images/manuscript.png")
background-position: 50% 80%
background-size: contain

## Sneak peek: the power of RMarkdown

---
background-image: url("data:image/png;base64,#images/cites.png")
background-position: 50% 70%
background-size: contain

## Sneak peek: the power of RMarkdown

---
background-image: url("data:image/png;base64,#images/refs.png")
background-position: 50% 50%
background-size: contain

## Sneak peek: the power of RMarkdown

---

## Discussion question

#### Are there good reasons for **not** using **RMarkdown**?

--

<svg viewBox="0 0 576 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M569.517 440.013C587.975 472.007 564.806 512 527.94 512H48.054c-36.937 0-59.999-40.055-41.577-71.987L246.423 23.985c18.467-32.009 64.72-31.951 83.154 0l239.94 416.028zM288 354c-25.405 0-46 20.595-46 46s20.595 46 46 46 46-20.595 46-46-20.595-46-46-46zm-43.673-165.346l7.418 136c.347 6.364 5.609 11.346 11.982 11.346h48.546c6.373 0 11.635-4.982 11.982-11.346l7.418-136c.375-6.874-5.098-12.654-11.982-12.654h-63.383c-6.884 0-12.356 5.78-11.981 12.654z"></path></svg> `\(~~\)` Steep **learning curve**

--

<svg viewBox="0 0 576 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M569.517 440.013C587.975 472.007 564.806 512 527.94 512H48.054c-36.937 0-59.999-40.055-41.577-71.987L246.423 23.985c18.467-32.009 64.72-31.951 83.154 0l239.94 416.028zM288 354c-25.405 0-46 20.595-46 46s20.595 46 46 46 46-20.595 46-46-20.595-46-46-46zm-43.673-165.346l7.418 136c.347 6.364 5.609 11.346 11.982 11.346h48.546c6.373 0 11.635-4.982 11.982-11.346l7.418-136c.375-6.874-5.098-12.654-11.982-12.654h-63.383c-6.884 0-12.356 5.78-11.981 12.654z"></path></svg> `\(~~\)` Barriers to **collaborating** with others (requires additional tools: **Git/GitHub**)

--

<svg viewBox="0 0 576 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M569.517 440.013C587.975 472.007 564.806 512 527.94 512H48.054c-36.937 0-59.999-40.055-41.577-71.987L246.423 23.985c18.467-32.009 64.72-31.951 83.154 0l239.94 416.028zM288 354c-25.405 0-46 20.595-46 46s20.595 46 46 46 46-20.595 46-46-20.595-46-46-46zm-43.673-165.346l7.418 136c.347 6.364 5.609 11.346 11.982 11.346h48.546c6.373 0 11.635-4.982 11.982-11.346l7.418-136c.375-6.874-5.098-12.654-11.982-12.654h-63.383c-6.884 0-12.356 5.78-11.981 12.654z"></path></svg> `\(~~\)` Not the best format for **computationally expensive functions**

--

<svg viewBox="0 0 512 512" style="position:relative;display:inline-block;top:.1em;fill:#035AA6;height:1.5em;" xmlns="http://www.w3.org/2000/svg"> <path d="M256 8C119.043 8 8 119.083 8 256c0 136.997 111.043 248 248 248s248-111.003 248-248C504 119.083 392.957 8 256 8zm0 448c-110.532 0-200-89.431-200-200 0-110.495 89.472-200 200-200 110.491 0 200 89.471 200 200 0 110.53-89.431 200-200 200zm107.244-255.2c0 67.052-72.421 68.084-72.421 92.863V300c0 6.627-5.373 12-12 12h-45.647c-6.627 0-12-5.373-12-12v-8.659c0-35.745 27.1-50.034 47.579-61.516 17.561-9.845 28.324-16.541 28.324-29.579 0-17.246-21.999-28.693-39.784-28.693-23.189 0-33.894 10.977-48.942 29.969-4.057 5.12-11.46 6.071-16.666 2.124l-27.824-21.098c-5.107-3.872-6.251-11.066-2.644-16.363C184.846 131.491 214.94 112 261.794 112c49.071 0 101.45 38.304 101.45 88.8zM298 368c0 23.159-18.841 42-42 42s-42-18.841-42-42 18.841-42 42-42 42 18.841 42 42z"></path></svg> `\(~~~\)` Anything else?
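---

## A possible workaround for slow code

For the point about computationally expensive functions, one option is to keep the slow step out of the knitting process: run it once, save the result, and only load the saved result in the document. The sketch below illustrates that idea; `slow_analysis()` and the file name are hypothetical stand-ins made up for this example, and a temporary directory is used only so the snippet runs anywhere.

```r
# Hypothetical stand-in for an expensive computation (e.g., a simulation)
slow_analysis <- function(n_sims = 200) {
  set.seed(2021)
  replicate(n_sims, mean(rnorm(1000)))
}

# Illustrative location for the saved result; a real project would
# more likely use a folder inside the project directory
results_file <- file.path(tempdir(), "slow-analysis.rds")

# Recompute only when no saved result exists yet
if (!file.exists(results_file)) {
  saveRDS(slow_analysis(), results_file)
}

# A chunk in the R Markdown document then only needs to load the result
results <- readRDS(results_file)
length(results)
```

With this pattern, knitting only pays the cost of `readRDS()`. The `cache = TRUE` chunk option in knitr is another route to a similar effect.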
---
class: inverse, center, middle

# Part 1: RMarkdown

---

# Getting started with RMarkdown

- Install [`R`](https://cran.r-project.org/mirrors.html)
- Install [`RStudio`](https://www.rstudio.com/products/rstudio/download/)
- Install the `rmarkdown` package
- Install `\(\LaTeX\)` (e.g., `TinyTeX`)

```r
install.packages("rmarkdown")
install.packages("tinytex") # for generating PDF output
tinytex::install_tinytex() # install TinyTeX
```

<img src="data:image/png;base64,#https://shilaan.rbind.io/post/building-your-website-using-r-blogdown/excited.jpg" width="70%" style="display: block; margin: auto;" />

---

## Opening a new R Markdown document

- Create a new R Markdown document from the menu `File -> New File -> R Markdown`

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/new-rmarkdown.gif" width="90%" style="display: block; margin: auto;" />

---

## Notebook interface

- Allows for direct interaction with R (execute code and display results inline)
- Makes it easy to test and iterate
- Produces a reproducible document with publication-quality output

<img src="data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/07a00dd9669405f3cba06ef333db180295466252/7b153/lesson-images/how-2-chunk.png" width="90%" style="display: block; margin: auto;" />

---

## Three types of content

- YAML meta-data / frontmatter (between `---` and `---`)
- Text with Markdown formatting
- R code

<img src="data:image/png;base64,#images/rmarkdown.png" width="95%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# Metadata

---
background-image: url("data:image/png;base64,#https://cwextensions.com/images/logo-someta.png")
background-position: 92% 7%
background-size: 10%

# YAML metadata

The YAML header contains basic metadata and rendering instructions

```yaml
---
title: My R Markdown Report
author: Shilaan Alzahawi
output: pdf_document
date: "2021-11-09"
---
```

--

The date will be **dynamically
updated** every time we knit the report, with the help of the following line of code (more on **in-line code** later):

--

<img src="data:image/png;base64,#images/date.png" width="1037" style="display: block; margin: auto auto auto 0;" />

---

# Previewing an RMarkdown document

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/preview.gif" width="100%" style="display: block; margin: auto;" />

---

# Rendering a document

✓ ![knit](data:image/png;base64,#images/knit.png)

✓ Windows/Linux: `Control + Shift + K`

✓ OS X: `Command + Shift + K`

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/knitting.gif" width="85%" style="display: block; margin: auto;" />

---

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/rstats-artwork/rmarkdown_wizards.png" width="100%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations): **Become an RMarkdown knitting wizard**

---

## Output formats

![](data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/ece57b678854545e6602a23daede51ad72da2170/21cca/lesson-images/outputs-1-word.png)

---

## Output formats

![](data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/ebcf2beeb67cc21693d73b8708d5af0fa9769f57/de9a9/lesson-images/outputs-2-pdf.png)

---

## What's happening behind the scenes?

![A diagram illustrating how an R Markdown document is converted to the final output document](data:image/png;base64,#https://bookdown.org/yihui/rmarkdown-cookbook/images/workflow.png)

☞ The code within the `.Rmd` file is executed by **knitr** and the result is written to an `.md` file;

☞ The `.md` file is converted by **Pandoc** to the output format specified in the metadata

---

## What's happening behind the scenes?

Knitting an `RMarkdown` file...

--

1. Starts a new R session

   ✓ No packages or objects loaded

--

2. Sets your working directory to the location of the `RMarkdown` file

--

3. Executes all code chunks from top to bottom

--

### ⚠️ **Make sure to load all R packages you use!**

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/make-your-own-stats-cartoons/ex_4.png" width="45%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---
class: inverse, center, middle

# Code

---

## Two types of code in RMarkdown

1. A code chunk, fenced by three backticks, with `{r}` right after the opening fence
2. An inline code expression, enclosed in single backticks and starting with `r`

<img src="data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/4c3760f9341ec07761c95fb5f03e033fa73d206d/057ff/lesson-images/inline-1-heat.png" width="95%" style="display: block; margin: auto auto auto 0;" />

---

## Code chunks

--

"*Code chunks are the beating heart of our R Markdown.*" [Xie, Dervieux, Riederer 2021](https://bookdown.org/yihui/rmarkdown-cookbook/rmarkdown-anatomy.html)

--

```r
summary(Orange)
```

```
##  Tree       age         circumference
##  3:7   Min.   : 118.0   Min.   : 30.0
##  1:7   1st Qu.: 484.0   1st Qu.: 65.5
##  5:7   Median :1004.0   Median :115.0
##  2:7   Mean   : 922.1   Mean   :115.9
##  4:7   3rd Qu.:1372.0   3rd Qu.:161.5
##        Max.   :1582.0   Max.   :214.0
```

--

### Inserting a code chunk

--

✓ Windows/Linux: `Control + Alt + I`

--

✓ OS X: `Command + Option + I`

--

✓ Enclosing code with three backticks and `{r}`

--

✓ ![](data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/b8b19518e688e3ca1390e0a1588916f04908d33f/8a4dc/images/notebook-insert-chunk.png)

---

## Inserting code chunks

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/insert-rchunk.gif" width="95%" style="display: block; margin: auto;" />

---

## Chunk anatomy

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/chunk-anatomy-2.gif" width="80%" style="display: block; margin: auto;" />

---

## Naming your code chunks

It's recommended to name your chunks. This allows you to quickly navigate code, automatically name figures, and troubleshoot errors.

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/chunk-names.gif" width="80%" style="display: block; margin: auto;" />

---

## Chunk options

Control a chunk's behavior by passing additional, comma-separated arguments

--

✓ `echo = TRUE` show code and output (*default*)

--

✓ `echo = FALSE` show output only (hide code)

--

✓ `include = FALSE` run the code, but show neither code nor output

--

✓ `eval = FALSE` show code (do not run; no output)

--

✓ `warning = FALSE` hide warnings in the output

--

✓ `error = FALSE` stop knitting when a chunk throws an error (`error = TRUE` shows the error message and keeps going)

--

✓ `message = FALSE` hide messages in the output

--

```r
summary(Orange)
```

--

**Bonus question:** What chunk option did I set here?
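---

## Chunk options: setting defaults once

Options can be set per chunk in the chunk header, for example `{r orange-summary, echo = FALSE}` (the chunk name `orange-summary` is made up for illustration). When most chunks should share the same options, a common pattern is to set document-wide defaults in the first chunk with `knitr::opts_chunk$set()`. A minimal sketch, assuming the knitr package is installed; the particular defaults chosen here are only an example:

```r
# Typically placed in the first ("setup") chunk of the document.
# These defaults apply to every later chunk unless that chunk
# overrides them in its own header.
knitr::opts_chunk$set(
  echo    = FALSE, # hide code, show output
  warning = FALSE, # hide warnings
  message = FALSE  # hide messages
)

# Inspect a default that is currently in effect
knitr::opts_chunk$get("echo")
```

Individual chunks can still opt out, e.g. by adding `echo = TRUE` to their header.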
---

## Chunk options

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/chunk-options.gif" width="100%" style="display: block; margin: auto;" />

Credit for all GIFs goes to [Shannon Pileggi](https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/#rmd)

---

## Chunk execution

`Ctrl + Enter` (Windows/Linux) or `Command + Enter` (OS X) runs the current line or selection; press ![](data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/img/run-chunk.PNG) to run the whole chunk

<img src="data:image/png;base64,#https://www.pipinghotdata.com/posts/2020-09-07-introducing-the-rstudio-ide-and-r-markdown/gifs/run-chunk.gif" width="100%" style="display: block; margin: auto;" />

---

## In-line code

To insert in-line code, wrap your code in single backticks and start it with `r`. RMarkdown will always

- display the results of inline code, but not the code
- apply relevant text formatting to the results

--

**R Markdown document**

<img src="data:image/png;base64,#images/inline.png" width="1867" />

--

**Knitted HTML document**

<img src="data:image/png;base64,#images/inline-knitted.png" width="1667" />

---
class: inverse, center, middle

# Text

---

# Markdown formatting basics

![](data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/59f29676ef5e4d74685e14f801bbc10c2dbd3cef/c0688/lesson-images/markdown-1-markup.png)

---

![](data:image/png;base64,#images/syntax-becomes.png)

For more formatting options, see the [R Markdown Reference guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf?_ga=2.157796986.1542626288.1625161001-1806201684.1624641897)

---

## Tables

<img src="data:image/png;base64,#https://d33wubrfki0l68.cloudfront.net/09467251a219c3c6b2dae2bf1367e5736a9ef78c/feeea/lesson-images/tables-1-kable.png" width="90%" />

More on **APA tables** in Pt. 2!
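---

## Tables: a small sketch

As a rough illustration of turning a data frame into a formatted table, the sketch below uses `knitr::kable()` on the built-in `Orange` data set shown earlier. It assumes the knitr package is installed; the object names, column labels, and caption are made up for this example.

```r
library(knitr)

# Mean trunk circumference per tree (Orange ships with base R;
# its circumference column is measured in mm)
orange_means <- aggregate(circumference ~ Tree, data = Orange, FUN = mean)

# Build the table; in a knitted document it is rendered
# in the chosen output format
orange_table <- kable(
  orange_means,
  digits    = 1,
  col.names = c("Tree", "Mean circumference (mm)"),
  caption   = "Mean circumference per orange tree"
)

orange_table
```

Placed in a chunk with `echo = FALSE`, only the rendered table would appear in the knitted report.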
---

## R Markdown tips and tricks

--

📦 Load all R packages in the first code chunk

--

⚠️ Do not include `install.packages()` or `setwd()`

--

![](data:image/png;base64,#images/spell.png) RStudio checks your spelling!

--

⛑ `Help > Cheatsheets > R Markdown Cheat Sheet`

--

💨 `Help > Markdown Quick Reference`

--

### Resources

- [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/)
- [R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/)

---
class: inverse, center, middle

# Part 2: papaja

---
class: center
background-image: url("data:image/png;base64,#images/papaja.png")
background-position: 50% 60%
background-size: 25%

# Getting started with papaja

**papaja** = **P**reparing **APA** **j**ournal **a**rticles

Created by [Frederik Aust](https://github.com/crsh/papaja)

---
background-image: url("data:image/png;base64,#images/manuscript.png")
background-position: 50% 80%
background-size: contain

## Sneak peek: APA title page

---

## Sneak peek: APA tables

--

<img src="data:image/png;base64,#images/table.png" width="2249" />

--

<img src="data:image/png;base64,#images/table-knit.png" width="50%" style="display: block; margin: auto;" />

---

# Getting started with papaja

--

```r
# make sure you've already installed tinytex!
install.packages("devtools")
devtools::install_github("crsh/papaja@devel") # install papaja
```

--

`File > New File > R Markdown > From Template > APA article`

--

<img src="data:image/png;base64,#images/new-apa.png" width="50%" style="display: block; margin: auto;" />

---
class: center
background-image: url("data:image/png;base64,#images/cites.png")
background-position: 50% 70%
background-size: contain

# APA citations

---

## Getting started with APA citations

--

1. Download [Zotero](https://www.zotero.org)

--

2. Download the [Better BibTeX for Zotero extension](https://retorque.re/zotero-better-bibtex/)

--

3. Install citr: an RStudio Addin to Insert Markdown Citations

   ▸ citr can directly access your reference database

   ▸ citr can keep your reference file updated

--

```r
devtools::install_github("crsh/citr")
```

---

# Inserting citations

--

1. Create a reference file using a reference manager (e.g., Zotero)

--

2. Supply the reference file in the YAML front matter (between `---` and `---`)

   ![](data:image/png;base64,#images/bib.png)

--

3. Insert citations

--

▸ Insert using your citation key ![](data:image/png;base64,#images/yarkoni.png)

--

▸ Insert using `Addins > Insert citations` ![](data:image/png;base64,#images/addin.png)

---
class: center
background-image: url("data:image/png;base64,#images/insert-citation.png")
background-position: 50% 50%
background-size: 85%

---

# Inserting citations

<img src="data:image/png;base64,#images/citation-table.png" width="2445" />

---
background-image: url("data:image/png;base64,#images/refs.png")
background-position: 50% 60%
background-size: contain

# Inserting citations

- You can cite R packages, too!
- After loading all packages, run `r_refs()` to create a BibTeX file with references to all currently loaded packages

---

### Harnessing the power of meta-data, code, and text

--

**R Markdown document**

<img src="data:image/png;base64,#images/harness.png" width="2392" />

--

**Knitted APA manuscript**

<img src="data:image/png;base64,#images/harness-knit.png" width="1727" />

---
background-image: url("data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/raw/master/make-your-own-stats-cartoons/ex_1.png")
background-position: 50% 90%
background-size: 50%

## Statistical output

--

<img src="data:image/png;base64,#images/statistics.png" width="2296" />

--

<img src="data:image/png;base64,#images/statistics-knit.png" width="50%" />

---

## Another look at APA tables

--

<img src="data:image/png;base64,#images/table.png" width="2249" />

--

<img src="data:image/png;base64,#images/table-knit.png" width="50%" style="display: block; margin: auto;" />

---
background-image: url("data:image/png;base64,#images/papaja.png")
background-position: 90% 80%
background-size: 25%

# papaja tips and tricks

--

Define a keyboard shortcut for inserting citations

✂︎ `Tools > Addins > Browse Addins > citr > Keyboard Shortcuts`

--

### Helpful resources

- The [papaja manual](http://frederikaust.com/papaja_man/)
- [Papers](https://github.com/crsh/papaja#papers-written-with-papaja) written with papaja

---
class: right

<img src="data:image/png;base64,#https://github.com/allisonhorst/stats-illustrations/blob/master/rstats-artwork/r_first_then.png?raw=true" width="65%" style="display: block; margin: auto;" />

Artwork by [@allison_horst](https://github.com/allisonhorst/stats-illustrations)

---
class: center, middle

# Thank you! ❤︎

Slides created with the R package [**xaringan**](https://github.com/yihui/xaringan).

**Questions?** Reach out to me at **shilaan@stanford.edu**