Data Visualization in R

# Data Visualization in R
## Tidying your data and model into a nice plot
### Tiago Ventura
### University of Maryland

---

# My approach

**Always better to present your results with a graph**

And keep all (or most of) your models and descriptive tables in the appendix.

---

# Main Goals for Today

By the end of the workshop, I expect you to know how to:

- Prepare your data for easy visualization.

- Run and extract information from statistical models.

- Use ggplot2 to build an informative visualization.

---

## How to get there?

- Basics of ggplot

- Tidy Data with tidyr

- Broom for Model Quantities

- Case studies (from my own work)

---

class:inverse, center, middle

# Basics of ggplot2

---

# How ggplot works

- Data Visualization involves connecting (mapping) variables from your data to graphical representations.

- ggplot2 provides you with a language to map data to a plot.

- ggplot2 works by connecting data and visual components through a function called __aesthetic mapping__ (`aes`).

- Every graph is built layer by layer, starting with your: 1) data, 2) aesthetic mappings, 3) geometric decisions, and then 4) embellishment of the plot.
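
The four layers can be sketched on a small made-up data frame. The `toy` data and its column names are hypothetical and are not part of the workshop data; this is only a minimal illustration of the layering idea.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only
toy <- data.frame(
  week    = c(1, 2, 3, 4, 1, 2, 3, 4),
  support = c(45, 46, 48, 47, 50, 49, 47, 48),
  party   = rep(c("A", "B"), each = 4)
)

p_layers <- ggplot(data = toy,                                  # 1) data
                   aes(x = week, y = support, color = party)) + # 2) aesthetic mappings
  geom_point() +                                                # 3) geometric decisions
  labs(x = "Week", y = "Support (%)", title = "Toy example")    # 4) embellishment

p_layers
```

Each `+` adds one more layer to the same plot object, so you can build a graph one layer at a time and check the result as you go.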

---

# Summary

---
class:inverse, center, middle

# Tidy your data
---

# Starting some coding

Let's first load our packages. I am using the package [pacman](http://trinker.github.io/pacman/vignettes/Introduction_to_pacman.html) to help me manage my libraries.

```r
pacman::p_load(tidyverse, gapminder, kableExtra, tidyr, ggthemes, patchwork, broom)
```

This is a tidy dataset:

```r
knitr::kable(head(gapminder), format = 'html')
```

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> country </th>
   <th style="text-align:left;"> continent </th>
   <th style="text-align:right;"> year </th>
   <th style="text-align:right;"> lifeExp </th>
   <th style="text-align:right;"> pop </th>
   <th style="text-align:right;"> gdpPercap </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Afghanistan </td>
   <td style="text-align:left;"> Asia </td>
   <td style="text-align:right;"> 1952 </td>
   <td style="text-align:right;"> 28.801 </td>
   <td style="text-align:right;"> 8425333 </td>
   <td style="text-align:right;"> 779.4453 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Afghanistan </td>
   <td style="text-align:left;"> Asia </td>
   <td style="text-align:right;"> 1957 </td>
   <td style="text-align:right;"> 30.332 </td>
   <td style="text-align:right;"> 9240934 </td>
   <td style="text-align:right;"> 820.8530 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Afghanistan </td>
   <td style="text-align:left;"> Asia </td>
   <td style="text-align:right;"> 1962 </td>
   <td style="text-align:right;"> 31.997 </td>
   <td style="text-align:right;"> 10267083 </td>
   <td style="text-align:right;"> 853.1007 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Afghanistan </td>
   <td style="text-align:left;"> Asia </td>
   <td style="text-align:right;"> 1967 </td>
   <td style="text-align:right;"> 34.020 </td>
   <td style="text-align:right;"> 11537966 </td>
   <td style="text-align:right;"> 836.1971 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Afghanistan </td>
   <td style="text-align:left;"> Asia </td>
   <td style="text-align:right;"> 1972 </td>
   <td style="text-align:right;"> 36.088 </td>
   <td style="text-align:right;"> 13079460 </td>
   <td style="text-align:right;"> 739.9811 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Afghanistan </td>
   <td style="text-align:left;"> Asia </td>
   <td style="text-align:right;"> 1977 </td>
   <td style="text-align:right;"> 38.438 </td>
   <td style="text-align:right;"> 14880372 </td>
   <td style="text-align:right;"> 786.1134 </td>
  </tr>
</tbody>
</table>

---

# Tidy Data

There are three interrelated rules which make a [dataset tidy](https://r4ds.had.co.nz/tidy-data.html).

- Each variable must have its own column.
- Each observation must have its own row.
- Each value must have its own cell.

---

# Getting our own data

```r
data <- read_csv("https://docs.google.com/spreadsheets/d/e/2PACX-1vQ56fySJKLL18Lipu1_i3ID9JE06voJEz2EXm6JW4Vh11zmndyTwejMavuNntzIWLY0RyhA1UsVEen0/pub?gid=0&single=true&output=csv")
```

#### Is this data tidy? (You can tell just by checking the column names.)

```
##  [1] "state"                  "pollster"               "sponsor"               
##  [4] "start.date"             "end.date"               "entry.date.time..et."  
##  [7] "number.of.observations" "population"             "mode"                  
## [10] "biden"                  "trump"                  "biden_margin"          
## [13] "other"                  "undecided"              "url"                   
## [16] "include"                "notes"
```

---
# Pivoting to make the data tidy (Pivot Longer)

```r
data_long <- data %>% 
*              pivot_longer(cols=c(biden,trump),
*                           names_to="Vote_Choice",
*                           values_to="Vote") %>%
              mutate(end.date=as.Date(str_replace_all(end.date,
                              "/", "-"), format = "%m-%d-%Y"))
```

We have three inputs in `pivot_longer`:

- `cols`: the variables you want to convert from wide to long.
- `names_to`: new variable for the column names from the wide data.
- `values_to`: new variable for the values from the wide data.
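
To see the three arguments at work without downloading anything, here is a minimal sketch on a two-row data frame. The `wide` data and its values are made up for illustration; only the column names `biden` and `trump` mirror the polls data.

```r
library(tidyr)

# Hypothetical wide data: one column per candidate
wide <- data.frame(
  pollster = c("P1", "P2"),
  biden    = c(50, 48),
  trump    = c(44, 46)
)

long <- pivot_longer(wide,
                     cols      = c(biden, trump),  # columns to stack
                     names_to  = "Vote_Choice",    # old column names go here
                     values_to = "Vote")           # old cell values go here

long  # one row per pollster-candidate pair
```

Two wide columns become two long columns (`Vote_Choice`, `Vote`), and each poll now takes up two rows instead of one.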

---

## And now:

```r
data_long %>% 
  select(Vote_Choice, Vote) %>%
  slice(1:5)
```

```
## # A tibble: 5 x 2
##   Vote_Choice  Vote
##   <chr>       <dbl>
## 1 biden          45
## 2 trump          50
## 3 biden          52
## 4 trump          40
## 5 biden          47
```

### .center[With this tidy dataset, we can start our visualizations.]

---
class:inverse, center, middle

# Basics of ggplot
---

## Linking the Data to Visuals (Mapping).

Mapping is how you connect your data and variables with the visual representations of a graph. We will do this in three steps.

- **The Data Step**: Tell ggplot what your data is.

- **The Mapping Step:** Tell ggplot **what** variables -> visuals you want to see.

- **The Geom Step:** Tell ggplot **how** you want to see it (points, lines, bars, and so on).

---
# An Abstract Example of ggplot

```r
knitr::include_graphics("gg-syntax.png")
```

---

# Polls Over Time (Ugly but with the basics)

```r
ggplot(data=national_polls, # the data step 
       aes(x=end.date, y=Vote)) + # the map step
geom_point() # the geom step
```

---

### Which aesthetics can I use?

![](aes.png)

---

# Aesthetics: Color

```r
ggplot(data=national_polls, # the data step
       aes(x=end.date, y=Vote, 
           color=Vote_Choice)) + # the map step
geom_point()
```

<img src="slides_files/figure-html/unnamed-chunk-12-1.png" width="60%" />
 
---

# Aesthetics: Shape

```r
ggplot(data=national_polls, # the data step
       aes(x=end.date, y=Vote,
           color=Vote_Choice,
           shape=Vote_Choice)) + # the map step
geom_point() 
```

---

# Aesthetics: Alpha

```r
ggplot(data=national_polls, # the data step
       aes(x=end.date, y=Vote,
           color=Vote_Choice,
           shape=Vote_Choice, 
           alpha=end.date)) + # the map step
geom_point() 
```

---

# Aesthetics: Linetype

```r
ggplot(data=national_polls, # the data step
       aes(x=end.date, y=Vote)) + # the map step
geom_smooth(aes(linetype=Vote_Choice)) +
geom_point(aes(color=Vote_Choice), alpha=.2)
```

---

# Some notes

- You can use multiple aesthetics together.

- One variable for each aesthetic (that's why your data should be tidy).

- Outside of `aes()`, an aesthetic takes a single fixed value (for example, `alpha=.2`), not a variable linking data and geoms.
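
The difference between mapping and setting an aesthetic can be sketched as follows. The `toy` data frame is hypothetical and used only for illustration.

```r
library(ggplot2)

# Hypothetical toy data
toy <- data.frame(
  x     = c(1, 2, 3, 4),
  y     = c(2, 4, 3, 5),
  group = c("a", "a", "b", "b")
)

# Inside aes(): color is mapped to a variable, so ggplot builds a scale and a legend
p_mapped <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(aes(color = group))

# Outside aes(): color is set to one fixed value for every point (no legend)
p_fixed <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(color = "steelblue", alpha = 0.5)
```

A common mistake is writing `aes(color = "steelblue")`: ggplot then treats the string as a one-level variable and picks its own color for it.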

---
class:inverse, center, middle

# Geoms

---

# Smoothing the data

```r
ggplot(data=national_polls,
       aes(x=end.date, y=Vote, color=Vote_Choice, fill=Vote_Choice)) + 
geom_point(alpha=.2) +
geom_smooth()
```

---

# Density

```r
ggplot(data=national_polls,
       aes(x=end.date)) +
geom_density(fill="steelblue")
```

---

# Nice Trick: facet wrap

```r
ggplot(data=state_polls %>% filter(state%in%swing_states)) +
  geom_density(aes(x=end.date, fill=state), alpha=.3) +
  facet_wrap(~state, ncol=3)
```

---

# Bars

```r
ggplot(data=data_long,
       aes(x=end.date, fill=mode)) +
geom_bar() +
scale_fill_brewer(palette = "Set3")
```

---

# Box Plot

```r
# Another example
ggplot(state_polls %>% filter(state%in%swing_states),
       aes(x=Vote,y=fct_rev(state), fill=Vote_Choice)) +
  geom_boxplot() +
  scale_fill_manual(values=c("biden"="blue","trump"="red")) 
```

---
class:inverse, center, middle

# And many, many, many more options.

### See [here](https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf)

---
### Exercise: What do you see?

---

## Part 4: Adjust scales, labels, titles, and more.

Once you have settled on the mapping and geoms, the next step is to adjust the scales of your graph. These functions usually take the form `scale_<aesthetic>_<type>`.

- `scale_x_log10`: converts a numeric axis to a log scale.
- `scale_y_reverse`: reverses the scale.
- `scale_fill_manual`: sets your own discrete fill values.
- `scale_colour_brewer()`: changes the colour palette.
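
As a minimal sketch of the naming pattern, the example below applies two scale functions to a made-up data frame. The `toy` data and its column names are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data
toy <- data.frame(
  income = c(500, 1000, 5000, 20000, 40000, 80000),
  life   = c(45, 52, 63, 72, 78, 81),
  region = c("North", "South", "North", "South", "North", "South")
)

p_scales <- ggplot(toy, aes(x = income, y = life, color = region)) +
  geom_point(size = 3) +
  scale_x_log10() +                      # aesthetic = x, type = log10
  scale_colour_brewer(palette = "Set1",  # aesthetic = colour, type = brewer
                      name = "Region")   # name sets the legend title

p_scales
```

Every aesthetic you map has a matching family of `scale_` functions, so the same pattern works for `fill`, `shape`, `alpha`, and the others.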

<br>

*After you understand the first three steps, the last one is all about googling.*
---

# Putting all together

```r
p <- ggplot(data=national_polls,
       aes(x=end.date, y=Vote, color=Vote_Choice, 
           shape=Vote_Choice, fill=Vote_Choice)) + 
geom_point(alpha=.2) +
geom_smooth() +
scale_shape_manual(values =c(21, 23)) +
scale_fill_manual(values=c("red", "blue")) +
scale_color_manual(values=c("red", "blue"), 
                   labels=c("Biden", "Trump"), 
                   name= "Vote Choice") +
scale_x_date(date_breaks = "1 month", date_labels = "%b %d") +
guides(fill="none", shape="none") + # "none" replaces the deprecated FALSE
labs(x = "End of the Poll", y = "Results",
         title = "Polls US Presidential Election",
         subtitle = "",
         caption = "Source: The Economist")  
```

---
class: center, middle

![](slides_files/figure-html/unnamed-chunk-23-1.png)
---

# Part 5: Embellishment as a consistent workflow

- Most of the adjustments you can make to your plot go inside the `theme` function.

- When working on a paper, you should be consistent across your graphs.

- Create your own theme, and apply it to all your plots.

---

# Build Your Theme

```r
# Set up my theme  ------------------------------------------------------------
my_font <- "Palatino Linotype"
my_bkgd <- "#f5f5f2"
pal <- RColorBrewer::brewer.pal(9, "Spectral")
my_theme <- theme(text = element_text(family = my_font, color = "#22211d"),
            rect = element_rect(fill = my_bkgd),
            plot.background = element_rect(fill = my_bkgd, color = NA),
            panel.background = element_rect(fill = my_bkgd, color = NA),
            panel.border = element_rect(color="black"), 
            strip.background = element_rect(color="black", fill="gray85"), 
            legend.background = element_rect(fill = my_bkgd, color = NA),
            legend.key = element_rect(size = 6, fill = "white", colour = NA), 
            legend.key.size = unit(1, "cm"),
            legend.text = element_text(size = 10, family = my_font),
            legend.title = element_text(size=10),
            plot.title = element_text(size = 22, face = "bold", family=my_font),
            plot.subtitle = element_text(size=16, family=my_font),
            axis.title= element_text(size=14),
            axis.text = element_text(size=8, family=my_font),
            axis.title.x = element_text(hjust=1),
            strip.text = element_text(family = my_font, color = "#22211d",
                                            size = 10, face="italic"))
```

```r
# This sets up for all your plots

*theme_set(theme_bw() + my_theme)
```
---

## Or by hand for each plot

.pull-left[

```r
p +
  theme_bw() + 
  my_theme
```
]

--
.pull-right[
<img src="slides_files/figure-html/unnamed-chunk-28-1.png" width="100%" />
]
--

---

# Pre-built themes

Several R packages, as well as ggplot2 itself, come with pre-built themes. Some examples from ggplot2, `ggthemes`, and `hrbrthemes`:

- theme_minimal()

- theme_economist()

- theme_fivethirtyeight()

- theme_ipsum()

---

.pull-left[

```r
p +
  theme_minimal(base_size=12)
```

![](slides_files/figure-html/unnamed-chunk-29-1.png)
]

.pull-right[

```r
p +
  theme_fivethirtyeight() 
```

![](slides_files/figure-html/unnamed-chunk-30-1.png)
]

---
class:inverse, center, middle

# Modelling with Broom and Purrr
---

# Broom

- We have discussed how to go from your raw data to informative plots.

- From this section forward, we will use the same logic to go from your statistical model outputs to plots.

- We will use David Robinson’s `broom` package to help us out, and the tidyverse package `purrr` to run the same thing multiple times (without loops).

---

### A Simple Model

```r
# Separate the data
biden <- national_polls %>%
      filter(Vote_Choice=="biden") %>%
      mutate(first_day=min(end.date, na.rm=TRUE), 
             days=as.numeric(end.date-first_day))

# simple linear model

lm_time <- lm(Vote~ days, data=biden)
summary(lm_time)
```

```
## 
## Call:
## lm(formula = Vote ~ days, data = biden)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.2236 -1.7612  0.0033  1.7480  7.5289 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 46.671697   0.317704   146.9   <2e-16 ***
## days         0.016358   0.001558    10.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.706 on 493 degrees of freedom
## Multiple R-squared:  0.1828,	Adjusted R-squared:  0.1811 
## F-statistic: 110.2 on 1 and 493 DF,  p-value: < 2.2e-16
```

---

# Extract Quantities with Broom

- `tidy`: to extract the model's main parameters (coefficients)
- `augment`: to extract observation-level statistics (predictions)
- `glance`: to extract model-level statistics.

---

## Tidy: Extract Quantities

```r
# a data frame
results <- tidy(lm_time)
results
```

```
## # A tibble: 2 x 5
##   term        estimate std.error statistic  p.value
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)  46.7      0.318       147.  0       
## 2 days          0.0164   0.00156      10.5 2.06e-23
```

---

# Augment: Predicted Values

```r
augment(lm_time)
```

```
## # A tibble: 495 x 8
##     Vote  days .fitted .resid .std.resid    .hat .sigma   .cooksd
##    <dbl> <dbl>   <dbl>  <dbl>      <dbl>   <dbl>  <dbl>     <dbl>
##  1    52   305    51.7  0.339      0.126 0.00653   2.71 0.0000519
##  2    52   304    51.6  0.355      0.132 0.00645   2.71 0.0000564
##  3    51   304    51.6 -0.645     -0.239 0.00645   2.71 0.000185 
##  4    52   306    51.7  0.323      0.120 0.00661   2.71 0.0000476
##  5    48   304    51.6 -3.64      -1.35  0.00645   2.70 0.00593  
##  6    50   305    51.7 -1.66      -0.616 0.00653   2.71 0.00125  
##  7    53   305    51.7  1.34       0.497 0.00653   2.71 0.000810 
##  8    53   306    51.7  1.32       0.491 0.00661   2.71 0.000800 
##  9    53   305    51.7  1.34       0.497 0.00653   2.71 0.000810 
## 10    50   306    51.7 -1.68      -0.622 0.00661   2.71 0.00129  
## # … with 485 more rows
```

---
# Glance: model-level statistics.

```r
glance(lm_time)
```

```
## # A tibble: 1 x 12
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>  <dbl> <dbl> <dbl>
## 1     0.183         0.181  2.71      110. 2.06e-23     1 -1194. 2394. 2407.
## # … with 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
```

### Why is this so cool?

- Tidyverse approach

- It can be combined with a whole set of packages from tidyverse.

- Returns a clean tibble.
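
Because the output is a tibble, it can go straight into a dplyr pipeline. The sketch below uses the built-in `mtcars` data as a stand-in (not the polls data), and `fit` and `coefs` are illustrative names.

```r
library(broom)
library(dplyr)

# mtcars ships with R; used here only as a stand-in dataset
fit <- lm(mpg ~ wt + hp, data = mtcars)

coefs <- tidy(fit, conf.int = TRUE) %>%    # adds conf.low and conf.high columns
  filter(term != "(Intercept)") %>%        # keep only the slopes
  mutate(significant = p.value < 0.05)     # flag coefficients at the 5% level

coefs
```

Note that `conf.int = TRUE` computes intervals from the t distribution, so they can differ slightly from the `estimate ± 1.96 * std.error` approximation.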

---
# Example: Plot The Predicted Values

```r
# Plot
augment(lm_time, se_fit = TRUE) %>% 
  mutate(lb=.fitted - 1.96*.se.fit,ub=.fitted + 1.96*.se.fit) %>%
ggplot(data=.) +  geom_ribbon(aes(y=.fitted, ymin=lb,
                  ymax=ub, x=days), alpha=.2) +
  geom_line(aes(y=.fitted, x=days), color="blue") +
  geom_point(aes(y = Vote, x=days), alpha=.2) 
```

---

## Running Multiple Models

What if I want to run the same model for multiple subgroups? Or multiple different models?

Use `purrr` for functional programming. This is where R and the tidyverse get really beautiful.

The logic is simple. We will nest our data, run models on the subgroups, tidy the results, and unnest everything into a tidy dataset.

---

## Nest the Data

```r
# Step 1: Nest your data
nested_data <- state_polls %>%
  filter(Vote_Choice=="biden") %>%
  mutate(first_day=min(end.date,na.rm=TRUE), 
         days=as.numeric(end.date-first_day)) %>%
* group_by(state) %>%
* nest()
```

---
## Nest the Data

.pull-left[

```r
nested_data
```

```
## # A tibble: 47 x 2
## # Groups:   state [47]
##    state data                    
##    <chr> <list>                  
##  1 MT    <tibble[,18] [18 × 18]> 
##  2 ME    <tibble[,18] [19 × 18]> 
##  3 IA    <tibble[,18] [32 × 18]> 
##  4 WI    <tibble[,18] [95 × 18]> 
##  5 PA    <tibble[,18] [112 × 18]>
##  6 NC    <tibble[,18] [104 × 18]>
##  7 MI    <tibble[,18] [112 × 18]>
##  8 FL    <tibble[,18] [105 × 18]>
##  9 AZ    <tibble[,18] [88 × 18]> 
## 10 MN    <tibble[,18] [34 × 18]> 
## # … with 37 more rows
```
]

.pull-right[
- The data column is called a [list-column](https://jennybc.github.io/purrr-tutorial/ls13_list-columns.html) because it works as a list in which every element holds an entire dataset.

- With a list of datasets, we can use functional programming in `purrr` to run the same models for each dataset. 
]
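
To get a feel for a list-column, here is a minimal sketch that nests the built-in `mtcars` data (a stand-in, not the polls data) and inspects the result. The name `nested` is illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)

# One row per value of cyl; the remaining columns are packed into `data`
nested <- mtcars %>%
  group_by(cyl) %>%
  nest()

nested$data[[1]]              # the full data frame stored in the first cell
map_int(nested$data, nrow)    # apply a function to every element: rows per group
```

`map()` and its typed variants such as `map_int()` walk over the list-column one dataset at a time, which is what makes it possible to fit one model per group.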

---

## Run the Models

```r
nested_data <- nested_data %>%
*               mutate(model=map(data, ~ lm(Vote~days, .x)))

nested_data
```

```
## # A tibble: 47 x 3
## # Groups:   state [47]
##    state data                     model 
##    <chr> <list>                   <list>
##  1 MT    <tibble[,18] [18 × 18]>  <lm>  
##  2 ME    <tibble[,18] [19 × 18]>  <lm>  
##  3 IA    <tibble[,18] [32 × 18]>  <lm>  
##  4 WI    <tibble[,18] [95 × 18]>  <lm>  
##  5 PA    <tibble[,18] [112 × 18]> <lm>  
##  6 NC    <tibble[,18] [104 × 18]> <lm>  
##  7 MI    <tibble[,18] [112 × 18]> <lm>  
##  8 FL    <tibble[,18] [105 × 18]> <lm>  
##  9 AZ    <tibble[,18] [88 × 18]>  <lm>  
## 10 MN    <tibble[,18] [34 × 18]>  <lm>  
## # … with 37 more rows
```

---
## Unnest (All back to normal)

```r
nested_data <- nested_data %>% 
                     mutate(results=map(model, tidy)) %>%
*                    unnest(results)

nested_data
```

```
## # A tibble: 94 x 8
## # Groups:   state [47]
##    state data             model  term     estimate std.error statistic   p.value
##    <chr> <list>           <list> <chr>       <dbl>     <dbl>     <dbl>     <dbl>
##  1 MT    <tibble[,18] [1… <lm>   (Interc… 35.5       1.58       22.5   1.59e- 13
##  2 MT    <tibble[,18] [1… <lm>   days      0.0369    0.00725     5.09  1.09e-  4
##  3 ME    <tibble[,18] [1… <lm>   (Interc… 50.6       2.73       18.5   1.06e- 12
##  4 ME    <tibble[,18] [1… <lm>   days      0.00517   0.0127      0.407 6.89e-  1
##  5 IA    <tibble[,18] [3… <lm>   (Interc… 42.6       1.53       27.8   5.68e- 23
##  6 IA    <tibble[,18] [3… <lm>   days      0.0172    0.00690     2.50  1.81e-  2
##  7 WI    <tibble[,18] [9… <lm>   (Interc… 44.8       0.603      74.2   6.76e- 84
##  8 WI    <tibble[,18] [9… <lm>   days      0.0259    0.00294     8.83  6.60e- 14
##  9 PA    <tibble[,18] [1… <lm>   (Interc… 46.4       0.486      95.4   7.18e-107
## 10 PA    <tibble[,18] [1… <lm>   days      0.0172    0.00224     7.68  7.20e- 12
## # … with 84 more rows
```

---

## Outputs from Unnest

```r
# first, remove the intercept
to_plot <- nested_data %>%
              filter(term!="(Intercept)") %>%
              mutate(ub=estimate+1.96*std.error, 
                     lb=estimate-1.96*std.error)  %>%
              drop_na()
# graph
ggplot(to_plot, aes(x=fct_rev(state),y=estimate, ymin=lb, ymax=ub)) +
  geom_pointrange(shape=21, fill="blue", color="black", alpha=.8) +
  geom_hline(yintercept = 0, linetype="dashed", color="gray") +
  coord_flip() +
  theme_minimal() +
  labs(x = "Linear Time Trend by State",  y= "Biden Support in the Polls")  
```

---

![](slides_files/figure-html/unnamed-chunk-41-1.png)

---
class:inverse, center, middle

# Case Study:

## Partisanship, Covid and Risk Perceptions in Brazil. 
---

# An Example of my Workflow

- To conclude our workshop, I will show you the code from my recent paper (co-authored with Ernesto Calvo), forthcoming in *Latin American Politics and Society*.

- The paper is about partisanship and risk perceptions of COVID-19. I will focus on the descriptive analysis and the simple regression models we use to show partisan differences in risk perceptions in Brazil.

- The paper and replication files can be found [here](https://github.com/TiagoVentura/Calvo_Ventura_LAPS_2021).

- **Our Goal**: A model of partisanship on three different outcomes. 
--

---

## Step 1: Tidy Your Data

```r
load("CV_data.Rdata")
library(tidyverse)
library(tidyr)

# Untidy
d %>% select(covid_job, covid_health, covid_government)
```

```
## # A tibble: 2,362 x 3
##    covid_job         covid_health      covid_government      
##    <fct>             <fct>             <fct>                 
##  1 Very unlikely     Somewhat unlikely Somewhat Unappropriate
##  2 Very unlikely     Somewhat Likely   Somewhat Appropriate  
##  3 Very Likely       Very Likely       Very Appropriate      
##  4 Very Likely       Very Likely       Somewhat Unappropriate
##  5 Somewhat Likely   Somewhat Likely   Somewhat Appropriate  
##  6 Somewhat Likely   Somewhat unlikely Somewhat Appropriate  
##  7 Very unlikely     Somewhat Likely   Very Appropriate      
##  8 Somewhat unlikely Somewhat Likely   Very Appropriate      
##  9 Very unlikely     Somewhat Likely   Somewhat Unappropriate
## 10 Very Likely       Somewhat unlikely Somewhat Unappropriate
## # … with 2,352 more rows
```

---

## Make it tidy

```r
d_pivot <- d %>% 
            pivot_longer(cols=c(covid_job, covid_health, 
                                covid_government), 
                         names_to="covid", 
                         values_to="covid_values") 
```

---

## What do I have now?

```
## # A tibble: 7,086 x 2
##    covid            covid_values          
##    <chr>            <fct>                 
##  1 covid_job        Very unlikely         
##  2 covid_health     Somewhat unlikely     
##  3 covid_government Somewhat Unappropriate
##  4 covid_job        Very unlikely         
##  5 covid_health     Somewhat Likely       
##  6 covid_government Somewhat Appropriate  
##  7 covid_job        Very Likely           
##  8 covid_health     Very Likely           
##  9 covid_government Very Appropriate      
## 10 covid_job        Very Likely           
## # … with 7,076 more rows
```

---

## Nest and Models

```r
data_nested <- d_pivot %>%
                group_by(covid) %>%
                nest() %>%
                mutate(model=map(data, ~ 
*                             lm(as.numeric(covid_values) ~
*                                runoff_haddad +
*                                runoff_bolsonaro +
*                                income + gender + work +
*                                as.numeric(education) + age , data=.x)),
                       res=map(model,tidy)) %>%
                unnest(res) %>%
                mutate(lb=estimate - 1.96*std.error, 
                       up= estimate + 1.96*std.error)
```

**Everything we need is here: `group_by`, `nest`, model, `unnest`.**

---

# Next (Important Steps)

- Fix the labels.

- **Get your labels correct before plotting**.

- By correct I mean: names and order.

```r
to_plot <- data_nested %>% 
              filter(str_detect(term, "runoff")) %>%
              mutate(labels_iv=fct_recode(term, "Haddad Voters"="runoff_haddadOn", 
                                                "Bolsonaro Voters"="runoff_bolsonaroOn")) %>%
              mutate(outcome= ifelse(covid=="covid_job", 
                         "How likely is it that you \n could lose your job? ",
                         ifelse(covid=="covid_health", 
                                "How likely will your health \n be affected by COVID-19?", 
                                "Has the government response \n been appropriate ?"))) 
```
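
Fixing names and fixing order are two separate operations in `forcats`. The sketch below uses a short made-up `term` vector (hypothetical, standing in for the `term` column of the model output) to show both.

```r
library(forcats)

# Hypothetical vector of raw coefficient names
term <- c("runoff_bolsonaroOn", "runoff_haddadOn", "runoff_haddadOn")

# Names: recode raw coefficient names into readable labels
labels_iv <- fct_recode(term,
                        "Haddad Voters"    = "runoff_haddadOn",
                        "Bolsonaro Voters" = "runoff_bolsonaroOn")

# Order: put the levels in the order you want them drawn
labels_iv <- fct_relevel(labels_iv, "Haddad Voters", "Bolsonaro Voters")

levels(labels_iv)
```

ggplot draws discrete axes, legends, and facets in the order of the factor levels, so setting the order here saves adjusting it inside the plot code.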

---
# Final Plot

```r
#pick my colors
pal <- RColorBrewer::brewer.pal(9, "Spectral")

#graph
ggplot(to_plot, aes(y=estimate, x=labels_iv, 
                      ymin=lb, ymax=up, color=labels_iv)) +
  geom_pointrange(shape=21, fill="white", size=2) +
  labs(x="", y="Point Estimates", 
       title = "\nPartisanship, Risk Perceptions and Government Responses to Covid in Brazil", 
       subtitle = "Regression Estimates with Controls by Income, Gender, Age, Education, and Occupation.", 
       caption ="Note: Baseline are Independent Voters") +
  geom_hline(yintercept = 0, linetype="dashed", color="darkred") + 
  scale_color_manual(values=c("Bolsonaro Voters"=pal[9], "Haddad Voters"=pal[1]),
                     name="Who would you vote for?") +
  facet_wrap(~outcome)  + 
  theme_bw() +
  theme(strip.text = element_text(size=7),
        axis.text.x = element_blank()) 
```

---

![](slides_files/figure-html/unnamed-chunk-48-1.png)