[voiceover]
National Data Archive on Child Abuse and Neglect.

[Erin McCauley]
Okay everyone, good afternoon! Thank you so much for being here today and welcome to the fourth session of the 2021 NDACAN Summer Training Series. So this session is being recorded; we turn all of the summer training series into a webinar series which will be posted on our website at the end of the summer. So as I said this is the NDACAN Summer Training Series. It's hosted by the National Data Archive on Child Abuse and Neglect, and NDACAN is hosted at Cornell University and Duke University. If you've been with us for a few years you know that we recently expanded our affiliations to include Duke so our staff is split between Cornell and Duke. The theme for this summer's training series is “Data Strategies for the Study of Child Welfare”. In addition to being hosted by NDACAN this is also done with support from the Children's Bureau, with whom we have the contract for our data archive. So here is a quick overview of the summer. We started with an introduction to NDACAN, a brief overview of our data sets and the different services and supports that we provide. Then we talked about the survey-based data cluster, and then we moved to a brief overview of the administrative data cluster and then a walk-through of how to link them. Today we'll be having presentations about the VCIS data and special populations. And this is a dataset we can use to expand our panel of child welfare involvement. And then we're going to round out the summer with two workshops. First we'll have a multilevel modeling workshop and then a latent class analysis workshop and these will be really focused on the actual analysis of data. So now I'm going to pass the presentation off to our first presenter who is our postdoc in the Data Archive. Alex, take it away!

[Alex Roehrkasse]
Hi everyone, I'm Alex and as Erin said I'm a postdoc at the Archive. I do a number of things at the Archive but one of the main things I do is track down historical data, some of which has been kind of lost to history, and try and clean it up and release it through the Archive. One of the things we've been doing at the Archive is trying to recover and release new sources of historical data and what we're going to be talking about today is the Voluntary Cooperative Information System or VCIS. And it's really our first successful effort to do this. So the data we're talking about today were released through the Archive just a couple months ago and so all the data we're discussing are now publicly available and indeed they are kind of hot off the presses. So I'll give an overview of the VCIS and then Frank will step in to kind of give a more in-depth discussion about how we might use a data source like this to study a special population like American Indian and Alaska Native children. So yeah like I said this is just my overview of the data. One of the easiest ways to think about the VCIS is as a predecessor to the AFCARS. The AFCARS is of course the Adoption and Foster Care Analysis and Reporting System and it's our current sort of flagship source for information on foster care. The AFCARS also includes information on adoption of course, as did the VCIS. The VCIS adoption data are pretty spotty and so we haven't released the adoption data and I'm not sure we're going to, so we're just going to be talking about foster care data today. The AFCARS begins in 1995 and runs through the present but before the AFCARS the VCIS was funded by the federal government but administered by a private organization as a survey to the various states.
But the survey was itself voluntary and so, unlike the AFCARS, the completeness of the data varies from year to year and unfortunately sometimes the states seem to interpret some of the survey questions a little bit differently, so we'll talk a little bit about reliability issues that are present in the VCIS that you might not be used to seeing in the AFCARS. The VCIS was used in a number of really important studies on the foster care system in the United States that were published in the late 1990s and early 2000s but unfortunately the data themselves were kind of lost. The organization that collected the data didn't maintain the file and so we had to recover them from an academic researcher who had published using these data and had maintained them in his own private files, and so we got spreadsheets from him and then spent the better part of a couple years cleaning up the data and producing documentation to help users use it responsibly. And so if you're interested in using the VCIS data you would go to our website, you would find it released as a dataset and you would find accompanying it all kinds of documentation that would help you use it. So the VCIS and other similar products have clear use value for researchers who are explicitly interested in you know historical trends, historical periods, long-term change over time. But I think historical data like the VCIS also offer sort of important opportunities for people who are really only more interested in the present. On the one hand they give us really important context for understanding where we are right now, so often we'll say things like 'foster care rates were very high in Minnesota in 2018', but it raises all kinds of questions: compared to what? High compared to 2017 the year prior? Compared to 2010? Compared to 2000 or 1995, which is often a historical comparison we'll make because that's when the AFCARS data begin? Sometimes our answers actually change when we look further back in time.
Another important reason to use historical data though is for causal analyses. Very often we're interested in estimating the causal effect of placement into foster care or some of the causal determinants of placement into foster care itself. Whenever we're doing causal analysis though, we're usually looking for some source of exogenous variation, some sort of policy shock or some sort of source of exogenous variation in a different system that will allow us to kind of use natural experiments to make valid causal estimates. And even though we might be more interested in the present, we might not always have sources of exogenous variation available to us. So one of the things that historical data does is it allows us to expand the set of policy interventions, social and demographic trends that we might leverage in different ways to generate causal estimates, whether we're interested in the causes of contact with the foster care system or the causal impact of foster care system contact on children. So now let's start to speak a little more concretely about what's in the VCIS. So the VCIS data are pretty rich but they are different from the AFCARS in important ways and they do have important limitations. So the AFCARS are fundamentally a micro data product. So in the AFCARS, the current system, states report to the Children's Bureau records on individual children. And so if you were to download the AFCARS data from NDACAN you would get a very large data set with one row for each child. The VCIS are aggregate data so each observation is a count of children. The things that the VCIS counts are children in foster care, children entering foster care, or children exiting foster care in any given state in any given year. So the VCIS was first fielded in 1982 and it runs through 1995, and each state was surveyed but not every state responded in any given year.
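To make the distinction between microdata and aggregate data concrete, here is a minimal sketch in Python. The records, field names, and values are hypothetical and are not actual AFCARS or VCIS variables; the point is only that a child-level file can be collapsed into state-by-year counts, while an aggregate file arrives already collapsed.

```python
from collections import Counter

# Hypothetical child-level records in the style of a microdata file:
# one row per child. Field names are illustrative, not real variable names.
child_records = [
    {"state": "MN", "year": 1995, "sex": "male", "age_group": "0-4"},
    {"state": "MN", "year": 1995, "sex": "female", "age_group": "0-4"},
    {"state": "MN", "year": 1995, "sex": "male", "age_group": "5-9"},
    {"state": "GA", "year": 1995, "sex": "female", "age_group": "5-9"},
]


def aggregate_counts(records, by):
    """Collapse child-level rows into (state, year, category) counts."""
    return Counter((r["state"], r["year"], r[by]) for r in records)


# Aggregate-style tables: one count per state, year, and category,
# tabulated along a single dimension at a time.
counts_by_sex = aggregate_counts(child_records, "sex")
counts_by_age = aggregate_counts(child_records, "age_group")

for key, count in sorted(counts_by_sex.items()):
    print(key, count)
```

Going from child-level rows to counts is straightforward, but the reverse is not possible: once the data exist only as counts, the individual records cannot be recovered.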
And so I'll talk a little bit later about how many states we have covered in any given year. Each of these counts is then broken down in a number of different ways. So each of these counts of children in foster care, entering, exiting foster care, are broken down by sex or by age group or by race and ethnicity. You'll notice I emphasized the “or” so the cross tabulations here are somewhat limited. We could for example observe the number of boys entering foster care in Minnesota in 1982 or the number of 0 to 4-year-old children entering foster care in Minnesota in 1982, but not the number of 0 to 4-year-old boys entering foster care in Minnesota in 1982. So the tabulations are sort of one-dimensional. The tabulations by sex, age group, and race and ethnicity are pretty reliable because states by and large understood those things to mean the same thing. There are what may be more interesting cross tabulations, depending on your interests, but which are a little less reliable. So there is information on reasons that children entered or exited foster care, where they were living while they were in care, or how much time they spent while they were in care. Unfortunately though because the VCIS was voluntary, states sometimes appear to have interpreted some of these categories differently and so we've done our best to clean up the data so that those categorical responses are reliable, but in other cases as I'll talk about we've had to suppress some responses. There are a number of things that the VCIS doesn't include which users will have to figure out solutions for as they are engaging with the data. The main thing, and this is true of most other data products that NDACAN releases, is that there's no population data accompanying the VCIS. And so like I was saying we get counts of children entering care or in care or exiting care but the meaning of those counts isn't always clear on their own.
Usually we want to construct rates, which is to say, the number of children entering care per 1000 children or the number of children in care per 1000 children. And the reason we want to do this is the underlying population, the number of children in a state in a given year who are at risk of interacting with the foster care system, is changing, particularly when we're doing historical analysis over long periods of time. And so generally when we're using these kinds of data we want to merge them to a source of population data to construct a denominator for a population rate. Things get a little trickier here when you're doing historical analysis because the availability of census data particularly for certain ethno-racial groups starts to become more limited. But fortunately the ethno-racial categories in AFCARS are broadly comparable to pretty standard sources of population data like the SEER data, that's S E E R, from the National Cancer Institute, and so users can merge the VCIS to sources of population data like the SEER data to construct whatever rates they might be interested in. The VCIS and the AFCARS are broadly comparable but they're not precisely identical in the way they measure certain things. So for this reason whenever we're combining the VCIS and AFCARS, which I really encourage users to do (I think it's what makes the VCIS really useful and powerful), whenever we are doing those sort of merges across sources we do want to be careful about making specific kinds of comparisons. So for example the AFCARS includes children who have run away from foster care. The VCIS was supposed to but some states seem not to have included runaway children. And so if you were comparing say the VCIS data in 1994 in a given state, let's just say Georgia, to the AFCARS data in 1996 in Georgia you would want to do a little bit of special investigation to see how reliable those sources were in that specific case.
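The rate construction described above can be sketched in a few lines of Python. All counts and population figures below are hypothetical, chosen only to show the arithmetic; in practice the denominators would come from an external population source such as the SEER data.

```python
# Hypothetical counts of children in foster care: (state, year) -> count.
foster_counts = {
    ("MN", 1982): 5200,
    ("MN", 1983): 5000,
    ("GA", 1982): 7800,
}

# Hypothetical child population denominators from an external source.
child_population = {
    ("MN", 1982): 1_300_000,
    ("MN", 1983): 1_250_000,
    # ("GA", 1982) is deliberately missing to show an unmatched merge.
}


def rates_per_1000(counts, population):
    """Merge counts to denominators and return children per 1,000 children.

    Keys without a usable denominator get None rather than a guessed value.
    """
    rates = {}
    for key, count in counts.items():
        denom = population.get(key)
        rates[key] = None if not denom else 1000 * count / denom
    return rates


rates = rates_per_1000(foster_counts, child_population)
for key in sorted(rates):
    print(key, rates[key])
```

In this made-up example the Minnesota count falls from 5,200 to 5,000, but the rate is 4.0 per 1,000 in both years because the child population also shrank. That is the reason to work with rates rather than raw counts over long periods.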
NDACAN has done a lot of analysis to compare the reliability of the VCIS to the AFCARS and the general take-home is that there isn't a lot of bias across the two sources. So in some states the AFCARS data is a little higher, in other states the AFCARS data is a little lower; on average they seem to even out and there isn't much bias. So there's noise but not too much bias. The important thing to note here is that specific comparisons within a state across years or across specific states are going to require a little extra caution but you can use these data in the aggregate to make responsible statements about change over long periods of time. So I've already started to hint that different states in different years, the way they define certain categories, the way they're measuring certain populations may be changing. The way NDACAN has decided to deal with this is to release two files as part of the VCIS data release. So if you go to the NDACAN website and you request these data, you download them, you're going to get two files: what we're calling the clean and the raw file. The raw file takes the spreadsheets we inherited and if there are like obvious coding errors or like a column got moved over we fix those things, but we leave the data more or less as we believe it was reported from states to the collecting agency. The clean file then does a lot more work to reconcile what we believe to be issues of reliability where, for example, let's say a state in one year just doesn't report any Hispanic children or, a more likely case, a state doesn't report any Asian-American or Pacific Islander children in its foster care population, but it's a large state and we have strong reason to suspect there are such children in the foster care system. That leads us to believe that those children have been relegated to the “Other” race category.
The problem is, that “Other” race category then isn't comparable to other years or other states that do separately identify Asian-American or Pacific Islander children. And so we've done our best to reallocate counts where that's possible, but in other cases we've had to suppress counts so that users aren't making sort of fallacious comparisons. And so when users download the VCIS data they should be using the clean file for pretty much any imaginable purpose. I don't know how to say this strongly enough. It's more than that we strongly encourage I guess we insist that users use the clean file for any analysis they're doing and refer to the raw file only on a case-by-case basis to better understand why we might have made a change or suppressed a value. Finally, the VCIS doesn't include complete data. So there were many states that just didn't report any data in a given year, states that didn't report data on certain populations or certain variables in a given year, and then of course there are values that NDACAN itself has suppressed. And so as a result there's I would say a moderate amount of missing data in the VCIS. Now that's fine, we can still do quite a lot with it but we do need a responsible approach to dealing with the missing data. So this is particularly important if we're ever going to be collapsing state-level data or sort of counting up states to generate national estimates because for example if we're missing data from California in a given year it will look like cases plummeted all of a sudden. So we need to figure out ways to make evidence-based guesses about what values might have been had they been observed. 
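As a minimal sketch of why missing state-years matter for national totals, the Python below compares a naive sum with a sum after simple linear interpolation. The counts are hypothetical, and linear interpolation is only one of several possible approaches, chosen here because it is the easiest to show; it is not presented as the recommended method for these data.

```python
def interpolate_gaps(series):
    """Linearly interpolate interior None values in a {year: count} dict.

    Gaps at the start or end of the series are left as None, because there
    is no observed value on both sides to interpolate between.
    """
    years = sorted(series)
    observed = [y for y in years if series[y] is not None]
    filled = dict(series)
    for y in years:
        if series[y] is not None:
            continue
        before = [o for o in observed if o < y]
        after = [o for o in observed if o > y]
        if before and after:
            y0, y1 = before[-1], after[0]
            weight = (y - y0) / (y1 - y0)
            filled[y] = series[y0] + weight * (series[y1] - series[y0])
    return filled


# Hypothetical state-level counts; one large state is missing in 1989.
state_counts = {
    "CA": {1988: 60000, 1989: None, 1990: 70000},
    "MN": {1988: 5000, 1989: 5200, 1990: 5400},
}

# Naive national total treats the missing state as zero.
naive_total_1989 = sum(s[1989] or 0 for s in state_counts.values())

# Total after filling the gap from neighboring years.
filled = {state: interpolate_gaps(s) for state, s in state_counts.items()}
filled_total_1989 = sum(s[1989] for s in filled.values())

print(naive_total_1989, filled_total_1989)
```

With these made-up numbers the naive 1989 total is 5,200, while the interpolated total is 70,200. The apparent collapse in the naive series comes entirely from the missing report.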
And so whether you're going to be interpolating data or using multiple imputation strategies or using you know full information maximum likelihood or Bayesian approaches to missing data, whatever you're experienced with, whatever is well suited to your case is fine, but I'd strongly encourage users of the VCIS, as with other data products that have missing data, to think carefully about how to deal with those missing data responsibly. Okay all that said we really can combine these data sources and it yields some pretty large and pretty powerful data sets. So this figure here is actually from an appendix from an article that should be coming out I think actually in the next couple weeks in Demography that is going to kind of introduce the VCIS data and demonstrate some of the things you can do with it. What you see here is we have a panel for each sort of ethno-racial group but the upper left is all children. On the X-axis, the horizontal axis, we have historical time and you'll see the axis goes all the way from 1961 up to I think 2018 here. And then the vertical axis counts the number of states for which we observe a count of children. I think these are counting children in foster care. The green X's kind of over on the right are for AFCARS data and what you see is that for the 21st century we have essentially complete data on all ethno-racial groups. In the late 1990s when the AFCARS was just getting going there were a number of states that didn't report any data and so we don't have complete data from the AFCARS for those years. The red squares correspond to the VCIS data and what you can see is that if you look at the top left panel you can see that it reports counts of all children in foster care for almost every state in any given year. So we have a pretty good portrait of the total population in foster care from the VCIS.
As for specific ethno-racial groups it gets a little spotty and obviously we lose a lot of detail over time. The brown crosses and blue circles here report data from two other sources that the Archive is currently developing: the NCSS is the National Center for Social Statistics and the CBSS corresponds to the Children's Bureau Statistical Series and so these are even older sources of historical data on children in foster care. There are no ethno-racial breakdowns in those sources but what you can see is that we can get counts of children in foster care by state all the way back to 1961. And so this allows us to make some pretty interesting comparisons, get some pretty interesting context on the long-term trends in children in foster care in the United States. And this figure starts to illustrate how we might use some of these data to analyze racial and ethnic inequality in system contact. So I'll walk us through the figure slowly just to be clear about what we're actually looking at. So what I've done here is take the counts of children from each ethno-racial group in each state in each year and divide it by the population of children in that ethno-racial group in that state in that year overall. So I calculated a rate of children in foster care. Then I've taken the foster care rate for each group and divided it by the foster care rate for all children, that is, children of all races and ethnicities. And what that yields is a rate ratio comparing the foster care rate for that group to the foster care rate for all children. What that means is that if the rate ratio is greater than one, children from that ethno-racial group are overrepresented in the foster care system. If that rate ratio is less than one, children from that ethno-racial group are underrepresented in the foster care system. So here I'm just showing you data from the VCIS and the AFCARS combined, so what you see on the horizontal axis ranges from 1982 to 2018.
And then on the vertical axis you have the rate ratio so one means that the children for that group are equally represented, again above that line means overrepresented, below the line means they're underrepresented. And immediately we can learn a few things. We see perhaps unsurprisingly for example in California that white children are slightly underrepresented in the foster care system more or less consistently so over the period of analysis. We see that American Indian children in California are moderately overrepresented in the foster care system and black and African-American children in California are pretty severely overrepresented. One of the things we learn from combining the VCIS with the AFCARS is that black children's overrepresentation in foster care system peaked in the early 1990s and has been more or less consistently decreasing across most states since the early 1990s. We also see though that very high levels of overrepresentation of American Indian children in foster care, particularly in the upper Northwest are not just striking for their extreme levels but they're higher than they've been in the last 40 years. There are some points of caution though. Obviously there are some missing data so there's a big gap here from Minnesota. If we were doing any sort of statistical analysis we would want again to figure out some way to deal with those missing data. One of the things you see looking at the panel for Georgia is that the estimates for certain groups, American Indian children, Asian Pacific Islander children, are kind of all over the place. That's because those estimates are based on very small numbers of children. And so it's hard to give them strong interpretation. It's clear that those children are not overrepresented in the foster care population but how much they're underrepresented it's a little hard to say because they're bouncing around quite a bit. 
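The rate ratio described above, and the small-numbers instability seen in the Georgia panel, can be sketched as follows. All counts and populations are hypothetical and are used only to show the calculation.

```python
def rate_ratio(group_count, group_pop, total_count, total_pop):
    """Foster care rate for one group divided by the rate for all children.

    A value above 1 means the group is overrepresented relative to all
    children; a value below 1 means it is underrepresented.
    """
    group_rate = group_count / group_pop
    overall_rate = total_count / total_pop
    return group_rate / overall_rate


# Hypothetical state-year: 6,000 children in care out of 1,200,000 children,
# of whom 300 are from a group with 20,000 children in the population.
overrepresented = rate_ratio(300, 20_000, 6_000, 1_200_000)

# Hypothetical small group: a change of two children doubles the ratio,
# which is why estimates built on small counts bounce around.
small_year_1 = rate_ratio(2, 4_000, 6_000, 1_200_000)
small_year_2 = rate_ratio(4, 4_000, 6_000, 1_200_000)

print(round(overrepresented, 2), round(small_year_1, 2), round(small_year_2, 2))
```

In the first case the group's rate (15 per 1,000) is three times the overall rate (5 per 1,000). In the small-group case the ratio moves from 0.1 to 0.2 on a change of two children, so the group is clearly underrepresented in both years but the exact degree is hard to pin down.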
The last word of caution I would give to people is that there is qualitative variation in what it means for a child from a particular ethno-racial group to be in foster care in a particular state in a particular year. So for example when we're trying to give interpretation to these really extreme rates among American Indian children in Minnesota, we would want to think carefully about how that system works for that population in that place. I think that kind of segues into some of the things Frank's going to talk about so I'll stop there and I'll hand it over to Frank. Thanks for listening and I'm really excited for your questions.

[Frank Edwards]
Alright thanks Alex, I'm always fascinated to learn more about these data. So what I'm going to describe now is a work in progress that will be using the VCIS data to think about change over time in exposure of American Indian and Alaska Native children to foster care, particularly thinking about changes in the wake of the passage of the Indian Child Welfare Act. So one thing about the VCIS: I'm really excited about its release because it helps us think about how the child welfare system has changed, and its relationship to Native populations and the Native nations, since the passage of the Indian Child Welfare Act. So the ICWA passed in 1978 in a context where American Indian and Alaska Native children were being removed from their families and placed into foster and adoptive homes at extraordinary rates. This was part of the long legacy of family separation that took the form of boarding schools initially, and we've been learning a lot more about the brutality of a lot of those institutions unfortunately in recent days, and then many of those kind of similar efforts to develop the foster and adoption system. The Indian Child Welfare Act was passed in 1978 with the explicit intent of ending a long-standing crisis of the separation of Native children from their families in the United States and it guaranteed sovereignty of tribal nations over children who were eligible for membership in the tribe, right? So it was very squarely directed at granting control over child welfare-involved Native children to tribal governments and at ensuring that Native children were placed with members of the family, members of their tribal communities or, when that was not available, other American Indian or Alaska Native families.
So in the mid-1970s the Association on American Indian Affairs conducted research based on surveys of foster care providers, adoption agencies, and state agencies, in addition to surveying the various field offices of the Bureau of Indian Affairs, and estimated that about one in four Native children had been in foster care in the mid-1970s. We estimate, along with Chris Wildeman in a recent paper, that cumulative risk for Native children today is about one in eight. So we estimate there's been about a 50% decline in the national cumulative risk comparing those 1970s numbers to numbers today. But what the VCIS data are going to let us do is take a much closer look at what happened to state foster care trajectories for Native children following the passage of the ICWA. And so my colleague Theresa Rocha Beardall and I have written one paper that's forthcoming in the Columbia Journal of Race and Law that compares the reported numbers from the Association on American Indian Affairs in the mid-1970s to 2019 estimates of foster care entry risk for Native children at the state level, to think about how state-level foster care populations have shifted across two time points. But of course it would be really fascinating to think about how those shifted over time, beginning in the period that was very close to the passage of ICWA. So with the VCIS data we're going to be able to look at the state level at how American Indian and Alaska Native foster care caseloads moved from the 1980s and then to link it with the AFCARS panel through the current date, so we'll be able to trace really carefully how those caseloads have moved over time.
And that has a number of really fascinating implications. Descriptively, it will be really interesting to think about the short- and long-term kind of trajectories of states following the passage of ICWA, but we also know that many states implemented the federal provisions of ICWA in different ways and had various levels of compliance with the law, and also a number of states took the additional step of passing their own state-level Indian Child Welfare Acts that provided additional protections. So, as Alex was mentioning earlier, you know we have a kind of series of comparisons that these historical data enable us to make that we simply wouldn't have been able to without these data. However, this does present challenges and many of these are really similar to the kinds of things that Alex presented before, but in particular when we're thinking about American Indian and Alaska Native populations there are some unique challenges that I'm actively thinking about and I think other users of the data should be thinking about. So now we have 1982 through 2019 available whereas before for foster care we were only able to look at 2000 to 2019, so obviously you know basically doubling the length of the time series has pretty tremendous value as a researcher for all kinds of analyses, but for measuring exposure over time population denominators are absolutely central, right? We want to measure not only change in the child-welfare-system-involved population, but that's going to be a function in part of changes in the underlying child population for different subgroups. And a key concern here is that the methods for enumerating race have shifted quite a lot over the period that we're looking at. So Alex mentioned using, and I posted some details about, the SEER time series county-level population data and those are really really useful data but they do have some limits.
So they use a bridged-race method that the Census has developed to make the time series comparable despite shifts in the Census tabulation methodology. But when we think about American Indians and Alaska Natives in particular there have been some really dramatic changes in that population over time that researchers need to pay careful attention to. So the proportion of the population that has identified as American Indian and Alaska Native has grown dramatically since 1970. If we were to take the Census-identified population that was American Indian and Alaska Native in the 1970 Census and compare it to the 1990 Census, we see dramatically more growth in the size of that population than is explainable due to demography alone. That is, the growth in that population is not explainable as a function of migration, fertility, or mortality. Shifting cultural identifications and shifting enumeration methods in the Census are responsible for this. So we have an artifact of cultural changes in the United States. We also have an artifact of the data collection methodology. So the Census allowed respondents to self-identify their race and ethnicity following the 1970 Census and then beginning in 2000 allowed respondents to identify membership in multiple racial or ethnic categories. Prior to the 1970s, and in the 1970 Census, Census enumerators themselves would infer the race or ethnicity of a respondent. And so American Indian and Alaska Native racial identification was something that was almost certainly underreported, particularly in a context where we were in a moment in which assimilation was the national strategy for addressing American Indian and Alaska Native tribes, particularly through the child welfare system.
That is, there were incentives for Native people to not identify themselves as Native and there were large-scale government practices that were encouraging people to assimilate into white cultural practices and also relocated people away from traditional homelands or other kind of relocated homelands, and so we have some real instability and change in what this category means over time. So for example if we use the SEER population data which I mentioned before, we can compare that population total in 2010 for what the SEER counts as the U.S. American Indian population to what the Census estimates as the U.S. American Indian population. And we're going to get dramatically different numbers. The Census will report many more Native people: if we count a Native person as someone who identifies as American Indian alone or in combination with another group, the number identifying as Native in the Census is dramatically higher than what we'll see in the SEER, which uses that bridged-race methodology. So we need to think really carefully about what our reference group is and what our denominator is going to be when we use some of the subgroups, particularly when we use American Indians and think about comparing race data over time, because those population numbers have moved around a lot and we in a lot of cases are comparing fundamentally different populations at different moments in time based on how the data was collected and how the identification of race and ethnicity worked in our culture at that time. So I think what all this is to say is that these are incredibly powerful data but data are products of historical environments and we have to think really carefully about the context that we're using them in so that we make accurate comparisons over time.
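A small worked example shows how much the choice of denominator can matter. The numbers below are entirely hypothetical and do not come from the SEER or the Census; the labels `bridged_race` and `alone_or_in_combination` are illustrative names for the two kinds of population definition discussed above.

```python
# Hypothetical count of Native children in foster care in one state-year.
foster_count = 1000

# Hypothetical child population under two different definitions of the
# same group. Real values would come from the chosen population source.
denominators = {
    "bridged_race": 50_000,
    "alone_or_in_combination": 80_000,
}

# Rate per 1,000 children under each definition.
rates = {
    name: 1000 * foster_count / population
    for name, population in denominators.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 1,000")
```

The same count of 1,000 children gives 20.0 per 1,000 under the narrower definition and 12.5 per 1,000 under the broader one. When comparing rates over time or across sources, the definition behind the denominator needs to be held constant or the comparison mixes a real change with a change in measurement.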
Yeah again I'm incredibly excited about the work that Alex and others at NDACAN have put into this data because I think it's going to be a huge contribution to what we're able to learn about particularly the trajectories of states over time.

[Clayton Covington]
Thank you Alex and Frank for those excellent presentations I'm going to be taking over for Erin on the Q&A side so I'll read out the questions so the first question is it seems like this person is looking for some clarification from you Alex about what you meant when you were speaking about causal analyses earlier, wondering if you're instead talking about correlational rates so simple correlations which in the ANOVA over the repeated measured test and then to follow up that question also asking can you clarify what you meant when you said exogenous variable?

[Alex Roehrkasse]
Absolutely, thanks for that question. Yeah so I can't speak much to the ANOVA question, I don't have a lot of experience or expertise there, but let me answer the question by way of an example. Let's say that we're interested in the impact of child poverty on children's likelihood of being placed into foster care. And let's say we're fundamentally interested in kind of what that relationship looks like today. To generate a valid causal estimate of the impact of child poverty on placement into foster care, we need to use some sort of research design that's going to control for the fact that those things are both determined by potentially a large number of unobserved factors. So one thing we might do is search for the source of what I called exogenous variation that affects rates of child poverty. And so when I say exogenous variation I mean variation across states or change over time within a state that results from some factor that we think isn't kind of caught up in the messy causal net that also involves foster care. So one of the ways we often do that is by leveraging change in policy. So for example there were really important changes to welfare in the 1990s that differed by state. As a result we have changes to child poverty state by state in the 80s and 90s that result from changes in these policies and therefore are sort of exogenous in some meaningful sense. By using data over time, the VCIS enables us to do more longitudinal research or research over time; we're able to leverage some of these policy changes to generate causal estimates of the impact of child poverty on foster care placement. Those things are harder to do if you're just analyzing one year of data or even two or three years of data and they are especially hard to do when you don't have sources of this exogenous variation that I'm talking about. So we don't get a big policy shock in any given year.
Sometimes we kind of have to rely on big historical changes to generate some of the sources of variation we need to generate causal estimates. So both by just creating longer time series and including other historical periods where there were major policy changes going on, we kind of increase our opportunities to do causal research. Hopefully that answers the question.

[Clayton Covington]
Yeah, thank you for that, Alex. I'm wondering, Frank, would you be willing to weigh in, having used these data at NDACAN for a while, about the utility of NDACAN data for causal inference?

[Frank Edwards]
Sure. So I think Alex's expertise on causal inference is a little bit stronger than mine, and I tend to be a little more skeptical about econometric causal inference, but that's fine. I mean, obviously a longer time period gives us more data to work with, as Alex mentioned. You know, we can think about appropriate comparisons across places to develop a plausible difference-in-differences design, and we have a much larger window of policy to look at here. We have 40 years rather than 20, so we've really dramatically expanded what we can look at. In the context of the study that I'm thinking about and described, we're going to be looking at shifts in American Indian and Alaska Native contact with the child welfare system at the state level over time, and we can get a really detailed look at that. And with clever design, if we had two very similar states that implemented slightly different versions of ICWA compliance, we might be able to think about designing a study to estimate effects of ICWA on child welfare caseloads. Again, I'm not an econometrician, so I have a hard time thinking in too much detail about identification strategies using these data, but there's certainly a lot that could be done here. There's a response to that, though, that I think I can speak to: does exogenous here mean something similar to confounding? No, it doesn't. So a confounding variable could be on the causal pathway between an outcome and a predictor, right? But in this case an exogenous variable is going to drive variation in our outcome variable in a way that's uncorrelated with the predictor that we're interested in. So it's outside of the causal system. Exogenous means it's external to, it's outside of, the causal system that we're investigating here.
So we might have some natural experiments, like things that could drive differences in child welfare case populations that are unrelated to whatever policy system we might be interested in, like maybe a mass migration of children due to a hurricane, right? We could think about Hurricane Katrina as potentially being an exogenous source of variation in Texas's child welfare system, as we see a large number of children enter the system following a natural disaster or something like that. Alex, would you like to maybe clarify a little more on that?

[Alex Roehrkasse]
No, I think it's hard to speak except in concrete examples sometimes, because it's going to depend a little bit on what cause you're interested in and what outcome you're interested in. So the appropriate research design here depends more specifically on what the question is, and then how to find the right sources of exogenous variation kind of follows from that. I think Frank's description of exogenous variation is spot on. I'll just say that we're always available to talk through research designs using these data, so if you ever have more specific questions about how to answer a question that you're interested in, you should reach out to us.
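
[Editor's illustration]
As a rough illustration of the cross-state, policy-change designs discussed in the answers above, here is a minimal two-state, two-period difference-in-differences sketch. It is only an assumption-laden toy, not the presenters' method: the state labels, the rates, and the function name `diff_in_diff` are all hypothetical, and the numbers are made up rather than drawn from the VCIS or AFCARS.

```python
# Hypothetical foster care entry rates per 1,000 children.
# These values are invented for illustration; they are not VCIS data.
rates = {
    ("reform_state", "before"): 4.0,
    ("reform_state", "after"): 5.5,
    ("comparison_state", "before"): 3.8,
    ("comparison_state", "after"): 4.3,
}


def diff_in_diff(rates, treated, control):
    """Change in the treated state minus change in the comparison state."""
    treated_change = rates[(treated, "after")] - rates[(treated, "before")]
    control_change = rates[(control, "after")] - rates[(control, "before")]
    return treated_change - control_change


estimate = diff_in_diff(rates, "reform_state", "comparison_state")
print(f"Difference-in-differences estimate: {estimate:.2f} per 1,000")
```

The estimate is only interpretable as a policy effect if the two states would have followed parallel trends without the policy change, which is exactly the kind of assumption a longer time series helps to check.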

[Clayton Covington]
So the next question: if the denominator of a population increases but the numerator of children removed remains stable, then is that really a decrease, or is the original ratio incorrect? A second question asks, or are there people who are not in geographic areas that have higher risk of removal now identifying as Native American? And the third question asks, or maybe the number hasn't greatly changed and it's just an identification issue, specifically on the topic of Native American identification?

[Frank Edwards]
This is precisely why understanding the data is so important, because we could have movement in a denominator that appears to show changes in the rate of exposure to a phenomenon over time that could be driven by population shifts, or could be driven by artifacts of data collection, or could be driven by cultural changes. And I think it's up to us as researchers to do the careful work to make sure that what we're providing is comparable over time, right? And so all of the things that you mentioned in there could occur in a naïve comparison of rates from time period one to time period two, in which the data collection practices have varied and the cultural meanings attached to identification with a particular group have also shifted, or the practices of identification with a group have shifted. So yeah, it's really difficult to say what's driving change in that context, and I think that underlines the importance of being very careful with your data work when you're using historical data.
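
[Editor's illustration]
The arithmetic behind the question can be sketched in a few lines. This is a toy example under stated assumptions: the counts and the function name `rate_per_1000` are hypothetical and are not taken from the VCIS or any population estimate. It shows only that a stable numerator with a growing denominator produces a falling rate, whatever the reason for the denominator's growth.

```python
def rate_per_1000(children_removed, child_population):
    """Removals per 1,000 children in the population."""
    return 1000 * children_removed / child_population


# Hypothetical counts: removals stay flat while the identified population grows.
removed = 500
population_t1 = 100_000
population_t2 = 125_000

rate_t1 = rate_per_1000(removed, population_t1)
rate_t2 = rate_per_1000(removed, population_t2)
percent_change = 100 * (rate_t2 - rate_t1) / rate_t1

print(f"Period 1: {rate_t1:.1f} per 1,000")
print(f"Period 2: {rate_t2:.1f} per 1,000")
print(f"Change in rate: {percent_change:.1f}%")
```

The calculation cannot say whether the population grew because of migration, changes in data collection, or changes in who identifies with the group; that has to come from understanding how each year's denominator was constructed.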

[Clayton Covington]
Alright, thank you for that, Frank. The next question asks: for entries and exits, does the data contain a number of reentries or just first entries? And the second question asks: is the total children in care as of the end of the year?

[Alex Roehrkasse]
Yeah, I can take that. Those are great questions. The VCIS doesn't include any information about reentries, and so we don't know whether an observed entry is a first entry or a reentry. The point-in-time estimates of children in care are taken at the end of the fiscal year. The end of the fiscal year is usually, I think, July 30 for most states, but states do differ in whether they reported their point-in-time estimates in terms of the federal fiscal year, the state fiscal year, which is sometimes different, or the end of the calendar year. There is information documenting, to a limited extent, which date any given state used, and so if you have reason to suspect that there's meaningful calendar fluctuation and you want to drill down on when exactly these point-in-time estimates were made, there is metadata available to do that for many if not most years of the VCIS data.

[Erin McCauley]
Hopefully you guys can hear me now. Sorry about my loss of audio earlier, but I also just wanted to take a moment to highlight next week's topic while folks are coming up with any additional questions. Next week we're going to have that workshop on multilevel modeling. Frank will be back and will be joined by Sarah Sernaker, who's our statistician and who gave the presentation last week about linking data. So we hope to see you then.

[Frank Edwards]
Thanks, everybody.

[voiceover]
The National Data Archive on Child Abuse and Neglect is a collaboration between Cornell University and Duke University. Funding for NDACAN is provided by the Children's Bureau, an office of the Administration for Children and Families.