Transcript for video titled "LeaRn R with NDACAN, Week 2, 'Tidyverse' Functions, October 18, 2024". [VOICEOVER] National Data Archive on Child Abuse and Neglect [ONSCREEN slide content 1] WELCOME TO NDACAN MONTHLY OFFICE HOURS! National Data Archive on Child Abuse and Neglect DUKE UNIVERSITY, CORNELL UNIVERSITY, & UNIVERSITY OF CALIFORNIA: SAN FRANCISCO The session will begin at 11am EST 11:00 - 11:30am - LeaRn with NDACAN 11:30 - 12:00pm - Office hours breakout sessions Please submit LeaRn questions to the Q&A box This session is being recorded. See ZOOM Help Center for connection issues: https://support.zoom.us/hc/en-us [Paige Logan Prater] Hi everybody, my name is Paige Logan Prater. I am the Graduate Research Associate here at NDACAN. NDACAN stands for the National Data Archive on Child Abuse and Neglect. We do this Monthly Office Hours series every year, and this year is a special offering that the team thought of, which is to provide some R training for folks that are interested. So even though this space is held by NDACAN, you do not need to have particular familiarity with the Archive or the data sets, but we will be using those data sets in the exercises, or data that looks like the Archive data. And we're really excited for everyone to be joining us. Just a few housekeeping items before we jump in: we will do the R training portion for the first 30 minutes. Frank will take us through that, and then at about half past the hour we will go back to our conventional Office Hours breakout sessions. For folks that are interested in sticking around, we really welcome you to; we will have breakout sessions with different folks on the NDACAN team where you can ask questions about R, or about stats and research design in general. We also have folks that are very familiar with the Archive data sets who can answer any questions, so that will be at about 11:30. If you have questions throughout the R training, please put them in the Q&A box and we will get to as many as possible in the order that they come in. This session is being recorded, and all of the sessions, slides, and other materials are available on the NDACAN website; I will paste the link in the chat once Frank gets started. I think that's really it. For Zoom help, feel free to reach out to Andres, he's here. I don't want to delay us any more, so Frank, I'll kick it over to you. [Frank Edwards] Great, thank you Paige. So yeah, welcome everybody. [ONSCREEN slide content 2] LEARN WITH NDACAN, Presented by Frank Edwards [Frank Edwards] So today we're going to cover the Tidyverse. We'll be introducing the Tidyverse suite of functions. I adore the Tidyverse. I learned base R when I was a graduate student at the University of Washington about 10 years ago, but the Tidyverse has really revolutionized the way that I work with R and has become the industry standard for working with R, because it streamlines so many of the routine data manipulation tasks that we need to do when we're working with complex data sets. So yeah, let's dive in.
[ONSCREEN slide content 3] MATERIALS FOR THIS COURSE Course Box folder (https://cornell.box.com/v/LeaRn-with-R-NDACAN-2024-2025) contains Data (will be released as used in the lessons) Census state-level data, 2015-2019 AFCARS state-aggregate data, 2015-2019 AFCARS (FAKE) individual-level data, 2016-2019 NYTD (FAKE) individual-level data, 2017 Cohort Documentation/codebooks for the provided datasets Slides used in each week’s lesson Exercises that correspond to each week’s lesson An .R file that will have example, usable R code for each lesson – will be updated and appended with code from each lesson [Frank Edwards] All of the data is available in the Box folder, in the data folder, and the slides, PDFs, and homework are also in the week 2 folder now. We have a .R file available for you to follow along with if you like, and we also have some homework questions for you to practice working with the sample data we've provided. Just a reminder that the AFCARS and NYTD data that we've provided for you are fake, that is, they are simulated. They are deidentified and simulated, so please do not use those for any real-world analysis. You can obviously apply for the raw data through the Archive, but don't use any of the data that we've provided for you here in any actual analyses; these are just for practice. But all of the data that we've provided can be requested through the Archive or, when it comes to Census data, are easily available through IPUMS at the University of Minnesota. IPUMS is a great way to interface with data from the U.S. Census and the American Community Survey, which is what we'll be using today. And at any point, if something's unclear or you have a question, the Q&A box is fine, but I'm also okay with you just unmuting yourself and asking me a question if it's something urgent. I'm trying to monitor the chat as we go, but I will almost certainly miss things. [Paige Logan Prater] I'll keep an eye on it for you Frank, no worries. [Frank Edwards] Thank you Paige, just interrupt me if something comes up. [Paige Logan Prater] Okay, will do. [ONSCREEN slide content 4] WEEK 2: “TIDYVERSE” FUNCTIONS, October 18, 2024 [Frank Edwards] Yeah, I don't mind being interrupted. All right, let's go. So Tidyverse, it's great, and Tidyverse is actually a suite of packages that was designed by Hadley Wickham and his crew, and for those of us who are using RStudio, which should be just about all of us working in R, Hadley and that team are also deeply involved in developing RStudio, so the logic that animates the Tidyverse is also the logic that animates RStudio. And you'll find that embracing the Tidyverse way of doing things will make your R trajectory much easier. The two core elements of it that I think are most important are the set of functions we get access to through the tidyr and dplyr packages, which convert R to work with a syntax that's much more similar to SQL, which is routinely used in large-scale databases, than the base R way of manipulating data objects. So we get a suite of tools that, for those of us who know how to work with SQL, will be very familiar, and for those of us who don't, they still are going to streamline your workflow and make it easier to interface with your large data sets, and then if you do need to pivot over to working with external databases, you'll find that your task is much easier.
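[ILLUSTRATIVE CODE EXAMPLE, not shown in the session] As a rough sketch of the contrast Frank is describing, using a hypothetical data frame df with columns state and pop (not part of the course materials), the same subset looks like this in base R and with the dplyr verbs covered later in this session:
library(dplyr)  # or library(tidyverse)
# base R: subset rows and columns with bracket indexing
big_states <- df[df$pop > 1000000, c("state", "pop")]
# dplyr: the same subset with filter() and select()
big_states <- select(filter(df, pop > 1000000), state, pop)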
The other big thing we get access to is ggplot2, and ggplot2 is a workhorse for data visualization; it is among the most powerful tools currently available to visualize data. It's stellar, and it's one of the main reasons for learning R at this point. Let's dive into Tidyverse. [ONSCREEN slide content 5] DATA USED IN THIS WEEK’S EXAMPLE CODE Census aggregate data from 2015-2019 (census_2015_2019.csv) Population counts by state, year, sex, race, and ethnicity Publicly available from CDC Wonder: https://wonder.cdc.gov/single-race-population.html AFCARS aggregate data from 2015-2019 (afcars_aggreg_suppressed.csv) Counts by state, year, sex, race/ethnicity of children in foster care; number of children removed due to physical or sexual abuse, or neglect; the number of children who entered or exited foster care in that year Can order full data from NDACAN: https://www.ndacan.acf.hhs.gov/datasets/request-dataset.cfm [Frank Edwards] The data we're going to use in this week's examples are the Census 2015 to 19 file, which has population counts by state, year, sex, race, and ethnicity, and those are from CDC WONDER. Those CDC WONDER data, the SEER county population data, we've aggregated up to states, and those data are derived from the U.S. Census small area population estimates. They're really great and they pair very nicely with our NDACAN data, because they'll be able to give you age-, race-, and sex-specific population estimates when we want to compute rates for child welfare system events. We also have the AFCARS aggregate data from 2015 to 19 in afcars_aggreg_suppressed.csv, and again, don't use these for any actual analyses; these are just for practice. But these are counts of CPS events, foster care in this case, by state, year, sex, and race/ethnicity of children. And then we have counts by maltreatment type and entries and exits from foster care. And again, if you want to do actual data work with the AFCARS, please do request the data from NDACAN rather than using the data that we've provided you here today. You won't get reliable results using the data we've provided you here today. So don't use it for real analysis, just for practice. [ONSCREEN slide content 6] TIDYVERSE [Frank Edwards] All right. [ONSCREEN slide content 7] WHAT IS THE TIDYVERSE “tidyverse” is a special R package that contains a collection of R packages designed to streamline data exploration and analysis Core tidyverse installation includes packages: ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, forcats The set of packages share common syntax, and data formatting and types, like tibbles (i.e., data frames) [Frank Edwards] So what's in the Tidyverse? Tidyverse is really a suite of packages that all work along the same logic, and the basic logic is tidy data. Tidy data is data that is long rather than wide, where all variables are in columns rather than rows. So a lot of us who work in Stata or some other suite might be used to seeing data in a wide format, that is, data where we have actual information, that is, a variable, contained in a column name. So you might imagine a longitudinal data set like the NLSY or something like that, where we might have repeated observations for individuals in separate columns. That is, let's say we have our Census data as a population count: we might have 50 rows, one for each state, and then each column might be the population for that state in a particular year.
So we might have a column for 2000, 2001, 2002, 2003, and so on; we might have 50 rows and like 50 columns. In Tidyverse, what we're going to do instead is we're going to have three columns. We're going to have state, we're going to have year, and we're going to have population, right? Our primary principle when we're working with tidy data is that all variables must be represented within a column rather than spread across rows or column names. We typically don't want variable information to be contained in the name of the column. That's the animating philosophy, and the reading we've assigned will cover that philosophy in more detail, but if you start to work with your data in long rather than wide format, it'll make your Tidyverse adventure far easier. [ONSCREEN slide content 8] ADDITIONAL TIDYVERSE RESOURCES For more details and background, and additional tidyverse packages see: https://www.tidyverse.org/ A good text reference is Chapter 5 in R for Data Science, available here: https://r4ds.had.co.nz/ [Frank Edwards] So again, Tidyverse.org has great resources, and chapter five in "R for Data Science" also has great resources. The "R for Data Science" book was written by the author of many of the Tidyverse packages, Hadley Wickham, and it will be a great primer for how to get started in R. If you haven't encountered that book before, I highly recommend it. It's free and I use it to teach my graduate students all the time. [ONSCREEN slide content 9] Pictures of Tidyverse package hex stickers. [Frank Edwards] Okay, so again, Tidyverse has a whole suite of packages here, and one thing that I love in Tidyverse is how easy they make working with some of the functions. I'm going to pivot off of PowerPoint for one moment and show you a cool thing about Tidyverse. When we're in RStudio, you can see these are all the suite's packages; they love their little hex stickers. But here's one thing we can do very quickly when we're working with Tidyverse packages, if we're in RStudio, where I have our homework for the week up here and I have that solution set available: if you go to Help and you go to Cheat Sheets, you're going to see a lot of our Tidyverse packages have cheat sheets that are readily available for us to work with. So if we go to the cheat sheet for dplyr, we'll get this great overview of how to work with the dplyr package and manipulate different kinds of variables. If we go to Help, Cheat Sheets, ggplot2, we'll get the ggplot2 cheat sheet that will tell you most of what you need to know to work with ggplot. I really love working with those cheat sheets, so take a look at those as you run into problems, which you inevitably will. [ONSCREEN slide content 10] THE PIPING OPERATOR The biggest difference with “tidyverse” programming is using the special syntax called “pipe” operators to make code cleaner, more intuitive, and consolidated %>% NOTE: Newest versions of tidyverse can use the new pipe syntax |> and you may see this in documentation. They are functionally the same! [Frank Edwards] The piping operator: the big difference between Tidyverse and base R is that we're going to use something called a pipe, and you can read this little "percent greater than percent" operator as effectively meaning "and then". Right, so what it's going to do is it's going to take an object, and then we're going to put the pipe at the end of the object, and we should read that as: take this object, and then do the next thing.
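[ILLUSTRATIVE CODE EXAMPLE, not shown in the session] A minimal sketch of a single pipe, again using a hypothetical data frame df with columns state and pop (not part of the course materials):
library(dplyr)  # or library(tidyverse)
# read as: take df, and then keep only the state and pop columns
df %>%
  select(state, pop)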
And the cool thing is then we can put another pipe at the end of that command, so we can say and then do this, and then do this, and then do this. So we can string together a battery of commands rather than having to retype the names of the objects over and over and over again, which gets very tedious in base R. In the "R for Data Science" book you'll see this new version of the pipe, I forget the name of that operator, the vertical line and the greater-than; that's exactly the same as the percent greater than percent, they work equivalently, so feel free to use whichever pipe you prefer. And when you're in RStudio, if you type control shift M, or command shift M on a Mac, it will generate a pipe operator for you so you don't need to type it over and over again. But the pipe is going to be routinely used. [ONSCREEN slide content 11] USING THE PIPING OPERATOR The operator %>% is used to sequentially apply functions to a data set Avoids having to overwrite or save many intermediate steps as usually happens with analogous base R constructions Compares to using $ in base R to reference variables DATA %>% select(var1, var2, var3) %>% mutate(var4 = var1+var2/var3) %>% rename(variable1 = var1, variable2 = var2, outcome = var3) [Frank Edwards] So we use it to sequentially apply functions to a data set. And again, it makes our life really easy. So we can do things like data pipe select. And what select is going to do is it's going to return only the columns that we name. So if I have in data columns named var1 through var10, data pipe select var1, var2, var3 is going to return an object that only contains the columns var1, var2, var3, so it's going to subset the data by column. That's what select does. Mutate is our new verb for creating variables. So mutate will create a column that has the same length as the rest of the data frame, and in Tidyverse you might see these called tibbles; those are just a special Tidyverse version of data frames, but they are interchangeable with data frames for most intents and purposes. But we can read these first two lines as: take data, and then select var1, var2, var3, and then mutate var4 equal to var1 plus var2 divided by var3, and then rename, and what we're going to do with rename is we're going to convert var1 into variable1, var2 into variable2, and var3 into outcome. So this is a common kind of Tidyverse command, and these are all using functions from dplyr. You don't need to worry so much about which package in Tidyverse these come from, because we're going to work with it as a whole, but if you're looking at the cheat sheets, these are dplyr functions: select, mutate, and rename. [ONSCREEN slide content 12] MOST OFTEN USED TIDYVERSE FUNCTIONS select(): select variables to keep/drop or to reorder variables mutate(): creates new variables that are functions of existing variables filter(): filter/subset data using logical operators over variables summarize(): summarizes data to 1 row per grouping variable with specified function (e.g. mean, sum, number of unique values) arrange(): reorder data based on alphanumeric order of listed variables group_by(): grouping variables that functions can then be applied over (for example, grouping by race to get means or totals by race) rename(): rename variable names [Frank Edwards] So these are the most commonly used ones here, and I think my Zoom got a little messed up here, because there's more. Yeah, we have some stuff that kind of got cut off the slides.
My apologies, they're on the PDF, but I'll read them off to you. Select selects variables to keep or drop, or to reorder variables, so if we use select we can subset the data by column. Mutate creates new variables that can be functions of existing variables, or can be constants, or can be anything you can imagine. Filter filters the data using logical operators. So what filter is going to do is it's going to subset the data by row. So let's say we had our Census data; the data that we're working with today has a column for sex, and we could use filter to look exclusively at the female population or exclusively at the male population. That is, we could filter the rows using a particular value of a variable. Summarize is almost always paired with its friend group_by. So group_by down here is a grouping variable that functions can be applied over. What we will routinely do is we'll group_by some categorical variable, say race ethnicity, and then we'll summarize to compute values for each category. And when we put summarize into our function, we're actually going to reduce down our dimensions. So if I have a 50 row data set and I compute a summarize, that is, maybe I'm computing a mean with summarize, I'll return a single row. But if I group by race and I have five race categories, I'll return five rows with a mean for each race. So group_by and summarize are a pair that will go together all the time. [ONSCREEN slide content 13] ADVANCED DATA MANIPULATION FUNCTIONS full_join(), left_join(), right_join(), anti_join(): join/merge data pivot_wider(), pivot_longer(): go from long to wider format, or vice versa mutate_at(), mutate_if(): apply same function to multiple variables at same time summarize_at(), summarize_if(): summarize multiple variables at same time over the same function and group [Frank Edwards] We also have a lot of other functions available to do things like joins. So this is where a lot of the SQL syntax will be familiar to those who have seen SQL before. Routinely we're going to have tables that share common variables. For example, we might have in our AFCARS data a variable for state and in our Census data a variable for state, and we'd like to know how to pull those two tables together, how to pull those two data frames together. And we have a whole group of join functions that will allow us to do that. They all do slightly different things. I recommend trying to stick with left join while you're learning. What left join will do is it will take the first data frame object you give it and the second data frame object you give it, and stick the second one onto the first one by their common variables. Right join, full join, and anti join can be useful, but while you're learning it's a good idea to stick with just left for now, because I think we tend to think about left-to-right assignment, so that'll make your life a little easier. Pivot wider and pivot longer are really useful for when we need to convert between wide format data and long format data. So pivot wider can turn long format data into wide format data, and pivot longer can turn wide format data into long format. So we can go back and forth between long and wide. That is, we can spread our data over columns or we can gather it into rows.
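[ILLUSTRATIVE CODE EXAMPLE, not shown in the session] A minimal sketch of the reshaping Frank describes, using a hypothetical wide table pop_wide with one row per state and one population column per year (not part of the course materials):
library(tidyverse)
# wide to long: one row per state-year, with year and pop as columns
pop_long <- pop_wide %>%
  pivot_longer(cols = -state, names_to = "year", values_to = "pop")
# long back to wide: one population column per year
pop_wide_again <- pop_long %>%
  pivot_wider(names_from = year, values_from = pop)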
And pivot wider and pivot longer, yeah, those are incredibly helpful functions, a little tricky to work with, but anytime you get stuck on any of these functions your best friend is just to go to your R console and type a question mark with the name of the function to pull up the R help file. So ?pivot_wider will show you how to work with pivot_wider, and it'll give you some demos for how the function works. Mutate at and mutate if are great when we need to apply a similar transformation across a number of variables at the exact same time. So what we could do is we could mutate at var1 and var2 and then multiply both of those columns by two. Right, or do whatever operation we wanted to do to them. So if we have the same set of transformations that we need to apply to a lot of variables at the same time, mutate at and mutate if are really great, and summarize at and summarize if have a similar set of applications. So that's it for our slides. I know I'm not supposed to do live coding here, but I think I want to show you a couple of these operations that we will work through on the homework. [ONSCREEN DEMONSTRATION CODE] The following is typed into the R script area: #install.packages("tidyverse") library(tidyverse) census<-read_csv("census_2015_2019.csv") afcars<-read_csv("afcars_aggreg_suppressed.csv") The following is typed into the R console: head(census) head(afcars) A census table of data appears with columns cy, stfips, state, st, sex, race6, hisp, pop. An afcars table of data appears with columns fy, state, sex, raceethn, numchild, physabuse, sexabuse. The following is typed into the R script area: I'm going to aggregate foster care entries by state and then join it to Census data and compute entry rates per 1,000 population. afcars_agg<- afcars %>% The following is typed into the R console: glimpse(afcars) A table of data appears in the R console with row names fy, state, sex, raceethn, numchild, physabuse, sexabuse, neglect, entered, exited. The following is typed into the R script area: group_by(fy, state) %>% summarize(entered = sum(entered, na.rm=T)) The following is typed into the R console: afcars_agg A table of data appears in the R console with columns fy, state, entered. The following is typed into the R console: nrow(afcars_agg) The number 260 appears. The following is typed into the R console: nrow(afcars) The number 4100 appears. The following is typed into the R console: census A census table of data appears with columns cy, stfips, state, st, sex, race6, hisp, pop. The following is typed into the R script area: census_agg<-census %>% group_by(cy, stfips, st) %>% summarize(pop = sum(pop, na.rm=T)) The following is typed into the R console: census_agg A table of data appears in the R console with columns cy, stfips, st, pop. The following is typed into the R console: afcars_agg A table of data appears in the R console with columns fy, state, entered. The following is typed into the R script area: afcars_agg<-afcars_agg %>% rename(year = fy) census_agg<-census_agg %>% rename(year = cy, state = stfips) afcars_pop<-afcars_agg %>% left_join(census_agg) The following is typed into the R console: afcars_pop A five-column table of data appears in the R console with columns year, state, entered, st, pop.
The following is typed into the R script area: afcars_pop<-afcars_pop %>% mutate(entered_rate = entered / pop * 1000) The following is typed into the R console: afcars_pop A table of data appears in the R console with columns year, state, entered, st, pop, entered_rate. The following is typed into the R script area: ggplot(afcars_pop, aes(x = year, y = entered_rate, group = st)) + geom_line() A plot of lines appears in the Plots area. The following is typed into the R script area added to the previous code line: + facet_wrap(~st) A group of 50 plots, one line plot per state, appears in the Plots area. [Frank Edwards] So here's what we'll do. I'm going to share here, and this file that I'm working on is week2solutions.rmd. This is posted in our shared directory, and I'm using R Markdown. Maybe some of you have not seen this before, and I'll talk you through the basics of R Markdown, but it's a really handy way to produce scientific reports or fancy-looking reports using R. So the way we could start a new markdown file is go to New File, R Markdown, and it's going to give you a demo file that will compile for you. Once we have this, we click the Knit button; it's going to prompt us to save it, and that's fine. But if we click Knit, it's going to actually compile this into a really nice report that will embed our images and our code for us. So for week 2 solutions, I thought this would be a useful way for us to work through the homework. I'm going to load Tidyverse in, and we always start by installing it if we haven't. I've commented it out here, but the way we install packages in R is with install.packages() and then quotations around the name of the package. We only need to do that once per installation of R, so this is not a piece of code that you need to run over and over again; you just need to run it the first time, and then we'll be able to library Tidyverse in. Now, some of us might have used read.csv in the past, but the readr suite comes with Tidyverse, and it has parallel functions that have underscores instead of dots, and they parse csv files and other kinds of files a little bit more efficiently than do the read-dot functions. So they'll do things like not convert strings into factors, which is a pet peeve of a lot of R users. So our Census data loaded in just fine. I want to show you the joining capabilities here, because I think that's something that's a little tricky. Let me make sure I have the data file that we need. So let me get the AFCARS data downloaded really quickly; my apologies for not having this set up. Okay, there it is. Brilliant. And I can see the file names here; I'm using an R project, so I really recommend doing this if you're in RStudio: create a project for each directory that you're working in, because sometimes you have to set your working directory and that gets a bit tedious; if we use projects we don't have to do that. It'll set our working directory to wherever we put the project file. So I'm going to read in afcars_aggreg_suppressed.csv, all right. So let's first take a look at these two objects over here. I'm going to use the broom icon to clear up my console a little bit, and I like to use the head function to look at my data. So these are the two tables that we have. We have Census and AFCARS, and what I would like to do is compute the total number of, let's say, foster care entries by state by year and then compute those as per capita rates for each state.
So let's do that. First I'm going to write out some plain English that'll appear in my markdown document: I'm going to aggregate foster care entries by state and then join it to Census data and compute entry rates per 1,000 population. Sound good? Sounds good. So the way we do that: we'll take AFCARS, and I'm going to call this afcars_agg, I'm going to create a new object from it, so I'll take AFCARS and I'm going to pipe a group_by, and here I know I want to compute state and year level foster care entry rates. Right now our data implicitly has groupings by sex and race ethnicity, right, so we want to collapse those a little bit. And I want to make sure I'm looking at AFCARS correctly; I can't see a couple of columns with head, so I'm going to use the glimpse function, which will show me all of the columns and the first few values of the data. It's a really nice way to just see the data at a glance. So I'm going to group by fy and state. Now what this is going to do is it's going to effectively change the structure of the data: now we're going to flatten out sex and race ethnicity. So I'm going to summarize entered equals sum of entered. But I need to add a special argument here: I can see that I have some missing values, and I don't want those to make the whole sum come back missing, so I'm going to make sure to include the na.rm equals true argument for now. It's not always the best way to handle missing values, but it'll work for what we need to do today. And now we can see that we have a three column data set that has many fewer rows. So afcars_agg has 260 rows and the original AFCARS data set had 4100 rows, so we've compressed those sex and race groups down to one row of entries for each state and year. Now that's great, that's exactly where we want it to be, but we're not ready to join yet. Right, we need to do the same thing for the Census. So let's do census_agg is Census pipe group_by; we're going to do a similar thing here, cy and stfips, and then let's also include st so that we have our descriptive label for the state as well. The stfips is the FIPS code for the state, and we can see that that's going to match the state code we have in AFCARS, so we need that, and we need cy for year. But we're also going to bring in st, the state abbreviation, onto our AFCARS file to make it a little easier to work with, and we'll summarize pop equals sum of pop, and I don't know if there are missing values here, but I'll just add the na.rm in case. And then from there we'll have census_agg that looks like that, and we have afcars_agg that looks like that, and those look pretty similar. So I think I'm almost ready to join. My problem is that the names differ: I have cy and fy, and I have state and stfips, so we need to do one more thing. We need to rename those columns to harmonize them. So we'll take afcars_agg, type rename, and we'll say year equals cy, and we'll say stfips equals, I'm sorry, I mixed up my Census columns: for AFCARS it's year equals fy, and we'll keep state as it is. And then we'll do census_agg is census_agg pipe rename, year equals cy, and I need to rename my state column here, so let's do state equals stfips, and that'll match what we have in AFCARS. Cool, okay, then I can make my join. So let's do AFCARS join, or we'll just call it afcars_pop, and so what we'll do is we'll take AFCARS, and we're going to left join census_agg. I'm sorry, we want to do afcars_agg. All right, so now I have a join.
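[ILLUSTRATIVE CODE EXAMPLE, not shown in the session] By default, left_join() matches on every column name the two tables share, which is why the renaming step mattered here; as a sketch, the same join can also be written with the keys spelled out explicitly, which many people prefer for clarity:
library(tidyverse)
# join the aggregated Census populations onto the aggregated AFCARS entries
afcars_pop <- afcars_agg %>%
  left_join(census_agg, by = c("year", "state"))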
I have my state-specific entries, and in my join I also pulled in my two-letter state abbreviation as well; that'll make visuals a little easier to work with. So I have now the foster care entries and the population. Next let's compute our last piece, which was foster care entries per 1,000 persons. And for that we'll do a mutate. We'll do mutate, and I'll say entered_rate equals entered divided by pop times 1,000. So now I have an entry rate, and just to put the cherry on top, let's do a quick ggplot where I'll take afcars_pop, aes, x equals year, y equals entered_rate, and then we also want to do group equals st, plus geom_line. I know I haven't shown you ggplot, we'll get to that later, but this is now a quick and not particularly attractive line plot that shows each of the state trajectories for entry rate. We could get a little fancier with this, and we could do a facet_wrap by st, and that's going to produce 50 plots that show us the time series for entry rates for each state. So I'm going to stop there. But. [Paige Logan Prater] Thank you Frank. [Frank Edwards] You are welcome. Thanks everybody for bearing with me as we, you know, did the thing. [Paige Logan Prater] I don't see any questions in the Q&A box. I did see one from Melanie, not necessarily about coding, but Melanie said that they don't see the Help at the top for the cheat sheets that you mentioned. I'm not sure. [Frank Edwards] Okay, yeah, Melanie, okay, Help at the top, okay, so let's check RStudio again. So here we are in RStudio. So Melanie, if you're working in RStudio you should see a contextual menu for Help up here. Right, and so within Help I can go down to Cheat Sheets and then pull up my ggplot cheat sheet. Yep, so yeah, I hope that works for you, but if it's not there then there's probably a problem with your RStudio installation. No other questions? [Paige Logan Prater] None that I'm seeing, but that's great, because we need to transition over to the breakout sessions. [Voiceover] The National Data Archive on Child Abuse and Neglect is a joint project of Duke University, Cornell University, University of California, San Francisco, and Mathematica. Funding for NDACAN is provided by the Children's Bureau, an Office of the Administration for Children and Families. [Music]