Here’s the latest guest post from Yale economist and law professor Ian Ayres. Here are Ayres’s past posts and here is a recent discussion of standardized tests.
A recent article in the Times trumpeted the results of a report that had just been released by the Educational Testing Service (E.T.S.).
The E.T.S. researchers used four variables that are beyond the control of schools: the percentage of children living with one parent; the percentage of eighth graders absent from school at least three times a month; the percentage of children age 5 or younger whose parents read to them daily; and the percentage of eighth graders who watch five or more hours of TV a day. Using just those four variables, the researchers were able to predict each state’s results on the federal eighth-grade reading test with impressive accuracy.
“Together, these four factors account for about two-thirds of the large differences among states,” the report said. In other words, the states that had the lowest test scores tended to be those that had the highest percentages of children from single-parent families, eighth graders watching lots of TV and eighth graders absent a lot, and the lowest percentages of young children being read to regularly, regardless of what was going on in their schools.
The article fairly portrays the text of the study, which concludes:
In statistical terms, these four factors account for two-thirds of the differences in the actual scores (r squared = .66). That is a very strong association. (emphasis added).
The last sentence is odd. Normally, I’d look at the statistical significance of the individual factors if I were going to judge the strength of the association. The report’s phrasing suggests a strong association between the reading score outcome and all four of the underlying factors. But what you would not learn unless you dug into the appendix is that only 3 of the 4 factors were statistically significant.
It turns out that the impact of the “percentage of children under age 18 in a state who live with one parent” (labeled in the table as “onepar”) is neither large nor statistically different from zero. A one-standard-deviation increase in the percentage of single-parent kids reduces the predicted reading score by only about half a point (while a one-standard-deviation increase in heavy TV watchers reduces the predicted reading score by 3.3 points).
Moreover, by traditional standards this marginal effect is not statistically significant. The estimated negative impact of single-parent families may simply be a byproduct of chance (the t-value indicates that the estimated negative coefficient of -0.0656 is only about four-tenths of a standard error away from zero, so we can’t reject in this data the possibility that the true impact of one-parent families on reading test scores is positive).
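As a rough check on that arithmetic: a t-value is the coefficient divided by its standard error. The sketch below backs out the standard error implied by the figures quoted above and builds an approximate 95 percent interval around the coefficient. Two inputs are assumptions rather than numbers from the appendix: the t-value of -0.4 is the rounded “about four-tenths” from the text, and the critical value of 2.014 assumes 45 degrees of freedom (50 states minus four slopes and an intercept).

```python
# Figures quoted in the post; the t-value is rounded ("about four-tenths").
coef = -0.0656   # estimated coefficient on "onepar"
t_value = -0.4   # assumption: rounded from the text, not the exact appendix value

# t = coef / standard_error, so the implied standard error is coef / t.
std_error = coef / t_value

# Approximate two-sided 95% critical value for 45 degrees of freedom
# (assumption: 50 states, 4 slopes, 1 intercept).
t_crit = 2.014

lower = coef - t_crit * std_error
upper = coef + t_crit * std_error

print(f"implied standard error: {std_error:.3f}")
print(f"approx. 95% interval: ({lower:.3f}, {upper:.3f})")
print("interval includes positive values:", upper > 0)
```

Under these assumptions the interval runs from roughly -0.40 to +0.26, so it comfortably straddles zero, which is the sense in which a positive true effect cannot be ruled out.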
When I reran the same regression but dropped the “onepar” variable, the adjusted r-squared increased slightly. (You can download an Excel file with the full results and data here). That’s right: a three-factor regression does an even better job at explaining the reading score data.
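To see why dropping a variable can raise adjusted r-squared even though plain r-squared can only fall, here is a minimal sketch using the standard adjustment formula. It uses the rounded r-squared of .66 quoted from the report and the 50 state observations; the three-factor r-squared itself is not given in the post, so the code computes only the break-even value it would need to exceed. The function and variable names are illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


n_states = 50
r2_four = 0.66  # rounded value quoted from the report

adj_four = adjusted_r2(r2_four, n_states, k=4)

# Smallest three-factor R-squared whose adjusted value equals the
# four-factor adjusted value (found by inverting the formula with k=3).
break_even_r2 = 1.0 - (1.0 - adj_four) * (n_states - 3 - 1) / (n_states - 1)

print(f"four-factor adjusted R^2: {adj_four:.4f}")
print(f"three-factor R^2 needed to match it: {break_even_r2:.4f}")
```

With these inputs the four-factor adjusted r-squared is about 0.630, so a three-factor fit needs a raw r-squared above roughly 0.652 to come out ahead. A standard result is that dropping a single variable raises adjusted r-squared exactly when the absolute value of its t-statistic is below 1, which is consistent with a t-value of about 0.4 for “onepar.”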
We shouldn’t put very much weight on this regression. Instead of analyzing data on individual students, the report focused on aggregate state data, and averaging suppresses a great deal of the real variation of interest. The four-factor regression rests on only 50 state data points. There may be evidence in other studies that children of one-parent families have poorer educational outcomes, but there is not a strong association between the two variables in this particular regression data.

So if I came from a one-parent family, I would understand this article?
I could also find an r square that is very high if I wanted to see the effect of ice cream sales on crime rates. I am amazed that such a report came out. I am guessing that if people don’t really know anything about statistics you could fool them into thinking that jumping in front of moving trains can make them healthy.
#2, I think it’s equally unfair to assume that they are definitely not causal. Of course, correlation doesn’t imply causation, but I still think the original report (assuming you take out the 4th variable, which doesn’t appear to hold water given this post) is worthwhile and does tell you something. That “something” is that those three variables should be tested to see if they actually do CAUSE the outcome. Thus, the benefit of the report is that we may be moving in the right direction.
Pablito, I am interested in your train related health plan and wish to subscribe to your newsletter.