5 Using Quarto

So far we have covered:

How to organise your project (RStudio projects!)
How to refer to data appropriately (file storage hygiene!)
A brief intro to what Quarto is

Now, let’s talk about using Quarto.

5.1 Overview

Teaching 10 minutes
Exercises 10 minutes

5.2 Questions

How should I start a Quarto document?
What do I put in the YAML metadata?
How do I create a code chunk?
What sort of options do I need to worry about for my code?

5.3 Objectives

Create a Quarto document, do some basic exploration

5.4 The anatomy of a Quarto document

This is a Quarto document (demo). It has three parts:

Metadata (YAML)
Text (markdown formatting)
Code (code formatting)

5.4.1 Metadata

The metadata of the document tells you how it is put together: what the title is, what date to put, and other control information. If you’re familiar with LaTeX, this is similar to the preamble, where you specify what sort of document it is, what styles to use, and the many other options.

Quarto documents use YAML (YAML Ain’t Markup Language) to provide the metadata. It looks like this.

---
title: "An example document"
author: "Nicholas Tierney"
format: html
---

It starts and ends with three dashes ---, and has fields like the following: title, author, and format.

title and author are special inputs which place the title and author information at the top of the document in large font. They are optional!

format: html tells us we want this to be an HTML formatted document - you’ll see what this looks like in a moment!

5.4.2 Text

The text is written in markdown, as we discussed in the earlier section.

It provides a simple way to mark up text:

Markdown:

- bullet list
- bullet list
- bullet list

1. numbered list
2. numbered list
3. numbered list

__bold__, **bold**, _italic_, *italic*

> quote of something profound

```r
# computer code goes in three back ticks
1 + 1
2 + 2
```

Rendered text:

bullet list
bullet list
bullet list

numbered list
numbered list
numbered list

bold, bold, italic, italic

quote of something profound

# computer code goes in three back ticks
1 + 1

[1] 2

2 + 2

[1] 4

5.4.3 Code

We refer to code in a Quarto document in two ways:

Code chunks, and
Inline code.

5.4.3.1 Code chunks

Code chunks are marked by three backticks and curly braces. We put the letter r inside the braces to denote them as “r” code chunks, but you can use “python” or “julia” instead:

```{r}
#| label: r-chunk-name
# a code chunk
```

```{python}
#| label: py-chunk-name
# a code chunk
```

```{julia}
#| label: julia-chunk-name
# a code chunk
```

This book currently focusses only on R

Quarto provides support for R, Python, Julia, and Observable, which are all very powerful and awesome languages! For now, this book focusses only on R, but I want to make sure that you know you can use Python, Julia, or Observable! I believe more languages will be supported in the future.

A backtick is a special character you might not have seen before. It is typically on the same key as the tilde (~). On USA / Australia keyboards, it is under the escape key:

image from https://commons.wikimedia.org/wiki/File:ANSI_Keyboard_Layout_Diagram_with_Form_Factor.svg

5.4.4 Chunk names

Every chunk should ideally have a name. As I’ve mentioned earlier, naming things is hard, but follow these rules and you’ll be fine:

one word that describes the action (e.g., “read”)
one word that describes the thing inside the code (e.g., “gapminder”)
separate words with “-” or “_” (e.g., read-gapminder)

5.5 Code chunk options

You can control how the code is output by changing the code chunk options. These are written at the top of the chunk, on lines starting with #|, called a “hash-pipe”: # is “hash” and | is “pipe” (although | is sometimes also called “bar” or “v-bar”).

```{r}
#| label: read-gapminder
#| eval: false
gap <- read_csv("gapminder.csv")
```

A nice feature of Quarto + RStudio is that they provide code completion when you start writing the code chunk options, and they will suggest options when you hit “tab”.

In the past, R Markdown required “TRUE” and “FALSE”, but note that Quarto always uses true or false in lowercase, and never “yes” or “no”.

The code chunk options you need to know about right now are:

eval: true/false Do you want to evaluate the code?
echo: true/false Do you want to print the code?
cache: true/false Do you want to save the output of the chunk so it doesn’t have to run next time?
include: true/false Do you want to include the code and its output in the final output document? Setting this to false means nothing from the chunk is put into the output document, but the code is still run.

You can read more about the options at the official documentation: https://quarto.org/docs/computations/execution-options.html
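As a minimal sketch of these options in use, the chunk below hides its code but keeps its output. It assumes the built-in airquality dataset; the label summarise-ozone and the object name ozone_summary are made up for illustration.

```{r}
#| label: summarise-ozone
#| echo: false
# echo: false hides this code in the rendered document,
# but the summary printed by the last line still appears in the output.
ozone_summary <- summary(airquality$Ozone)
ozone_summary
```

Swapping echo: false for include: false would still run the code, but neither the code nor the summary would appear in the rendered document.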

Converting R Markdown to Quarto

If you’ve got an R Markdown document and you want to convert its chunk headers to the #| style, you can run code like this:

knitr::convert_chunk_header(
  input = "paper.qmd",
  output = identity
)
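To see what the conversion does without touching a real file, here is a rough sketch that converts a throwaway document. It assumes a recent version of knitr (1.40 or later) is installed. The temporary file and its contents are made up for illustration, and type = "yaml" is my assumption about which style you want: it asks for option: value lines rather than the function's default style.

```r
# Assumption: knitr >= 1.40 is installed. The file below is a made-up example.
fence <- strrep("`", 3)  # three backticks, built here to avoid nesting fences
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(
  c(
    paste0(fence, "{r read-gapminder, eval=FALSE}"),
    "gap <- read_csv(\"gapminder.csv\")",
    fence
  ),
  rmd_file
)

# output = NULL returns the converted lines instead of writing a file,
# so nothing on disk is overwritten.
converted <- knitr::convert_chunk_header(
  input = rmd_file,
  output = NULL,
  type = "yaml"
)
cat(converted, sep = "\n")
```

By contrast, output = identity writes the result back over the input file, so it is worth trying the conversion on a copy first.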

5.5.1 Inline code

Sometimes you want to run the code inside a sentence. When the code is run inside the sentence, it is called running the code “inline”.

You might want to run the code inline to name the number of variables or rows in a dataset in a sentence like:

There are XXX observations in the airquality dataset, and XXX variables.

You can call code “inline” like so:

```{r}
r_heights <- c(153, 151, 156, 160, 171)
r_mean <- mean(r_heights)
```

The mean of these heights is `{r} r_mean`

Which will produce the following sentence:

The mean of these heights is 158.2

Essentially, instead of using three backticks to write multiple lines of code, you use a single backtick. You can think of this as a backtick being used inside text for a one liner, whereas creating a code fence with three backticks indicates something longer.

There are `{r} nrow(airquality)` observations in the airquality dataset, 
and `{r} ncol(airquality)` variables.

Which gives you the following sentence:

There are 153 observations in the airquality dataset, and 6 variables.

What’s great about this is that if your data changes upstream, then you don’t need to work out where you mentioned your data and change that bit of text. You just render the document and it takes care of these details.
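Inline code also works well when you compute and format a value once in a chunk, then drop the result into the sentence. A minimal sketch, assuming the built-in airquality dataset; the label summarise-temp and the object name temp_mean are made up for illustration.

```{r}
#| label: summarise-temp
# Compute and round once here, then reuse the object inline in the text
temp_mean <- round(mean(airquality$Temp), 1)
```

The mean temperature is `{r} temp_mean` degrees Fahrenheit.

Rounding inside the chunk keeps the inline call short, and the sentence stays readable in the source document.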

5.6 Creating a Quarto document

RStudio menu system
Explore the template provided by RStudio
Render a Quarto document

5.7 Working with a Quarto document

Demo: Create a Quarto document in RStudio.

Your Turn

Use the RStudio project you previously created, qmd4sci-materials, and open the “01-qmd-examples.qmd” file.
Run some brief summaries of the data in the Quarto document:
- hist(data$)
- How big is the data?
- How many countries are there?
- What was the lowest life expectancy in Australia’s history?
- How about the lowest GDP for Australia?
- Where does Australia rank in GDP in 1997?

5.8 Nick’s Quarto starter pack

I highly recommend that each document you write has a certain structure:

Sets global options in the YAML
First code chunk manages libraries
Second code chunk manages functions

For example:

---
title: example
format: 
  html:
    fig-align: center
    fig-width: 4
    fig-height: 4
    fig-format: png
execute:
  echo: false
  cache: true
---


```{r}
#| label: library
library(tidyverse)
```

```{r}
#| label: functions
# A function to scale input to 0-1
scale_01 <- function(x){
  (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE))
}
```

```{r}
#| label: read-data
gapminder <- read_csv(here::here("data", "gapminder.csv"))
```

In the YAML header, under format and execute, you set the options that you want to define globally. In this case, I’ve told Quarto:

fig-align: center. Align my figures in the center.
fig-width: 4 & fig-height: 4. Set the width and height to 4 inches.
fig-format: png. Save the images as PNG.
cache: true. Save the output results of all my chunks so that they don’t need to be run again.
echo: false. Don’t print any code.
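Global options can still be overridden in an individual chunk by setting the option there. As a small sketch, the chunk below turns echo back on for one check of the scale_01 function. The label check-scale-01 and the input vector are made up for illustration, and the function is repeated so that the chunk runs on its own.

```{r}
#| label: check-scale-01
#| echo: true
# echo: true here overrides the global echo: false for this chunk only.
# Same definition as in the functions chunk, repeated to keep this self-contained.
scale_01 <- function(x){
  (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE))
}

# The smallest value maps to 0, the largest to 1, and NA stays NA
scale_01(c(2, 4, 6, NA))
```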

In the library chunk, you put all the library calls. This helps make it clearer for anyone else who might read your work what is needed to run this document. I often go through the process of moving these library calls to the top of the document when I have a moment, or when I’m done writing. You can also look at Miles McBain’s packup package to help move these library calls to the top of a document.

In the functions chunk, you put any functions that you write in the process of writing your document. Similar to the library chunk, I write these functions as I go, as needed, and then move them to the top when I get a moment, or once I’m done. The benefit of this is that all your functions are in one spot, and you might be able to identify ways to make them work better together, or improve them separately. You might even want to move these into a new R package, and having them in one place makes it easier to see what you are doing.

In the read-data chunk, you read in any data you are going to be using in the document.

Now, this is my personal preference, and there are definitely other ways to organise things! But, I find the following benefits:

The “top part” of your document contains all the metadata / setup info, including your global options. You don’t need to set these options on every single code chunk.
It helps another person get oriented with your work - they know the settings, the packages used, and the special things that you wrote (your functions)
Remember, “another person” includes yourself in 6 months. You are always collaborating with your future self. You are always collaborating with your future self. Say it with me.

Your Turn

Update the “01-qmd-example.qmd” Quarto document you just worked on, following the structure described above.