The Trickiest Tripwires in Analytics

1. Are my conclusions BS?

I admire this. The author found an interesting statistical anomaly and got some attention for it. Then she discovered that the anomaly was entirely due to an error, and she published a retraction of the use of Benford’s law as a fraud detector.

2. How do I figure out causation?

Does teen pregnancy cause poverty or vice versa? Here’s Tim Taylor:

In an ideal experiment, one might want a research design in which a random sample of teenagers becomes pregnant and gives birth, and then you could track the outcomes. Of course, randomized pregnancy is an impractical research design! But here are four approaches used by clever economists to disentangle this question of cause and effect.

A within-family approach. Look at life outcomes for sisters who give birth at different ages. The result of this kind of study is “once background characteristics are controlled for, the differences are quite modest. Furthermore, even these modest differences likely overstate the costs of teen childbearing, since the sister who gives birth as a teen is likely to be ‘negatively’ selected compared to her sister who does not.”
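To make the within-family idea concrete, here is a minimal Python sketch. The records, the income variable, and the function names are all made up for illustration; none of them come from the studies Taylor describes. It contrasts a naive comparison across all women with a comparison that looks only at differences between sisters in the same family.

```python
from statistics import mean

# Hypothetical records: (family_id, gave_birth_as_teen, adult_income_in_thousands)
sisters = [
    (1, True, 22), (1, False, 24),
    (2, True, 30), (2, False, 31),
    (3, True, 18), (3, False, 21),
    (4, False, 55), (4, False, 52),  # families with no teen birth
    (5, False, 60), (5, False, 58),
]

def naive_gap(rows):
    """Mean income of teen mothers minus mean income of everyone else."""
    teen = [income for _, is_teen, income in rows if is_teen]
    other = [income for _, is_teen, income in rows if not is_teen]
    return mean(teen) - mean(other)

def within_family_gap(rows):
    """Average sister-vs-sister difference, using only families with both kinds of sister."""
    families = {}
    for family_id, is_teen, income in rows:
        families.setdefault(family_id, {True: [], False: []})[is_teen].append(income)
    diffs = [
        mean(group[True]) - mean(group[False])
        for group in families.values()
        if group[True] and group[False]
    ]
    return mean(diffs)

print(round(naive_gap(sisters), 2))   # large gap, mostly driven by family background
print(within_family_gap(sisters))     # much smaller gap between sisters
```

In this toy data the naive gap is large because the families without teen births are much better off to begin with. Comparing sisters removes that shared family background, which is the point of the design.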

Miscarriages. Of those teens who become pregnant, some will suffer miscarriages. Compare women who are similar in measured characteristics of family background, but some of whom gave birth as teenagers while others had a miscarriage. It turns out that their life outcomes look quite similar: that is, giving birth as a teenager doesn’t appear to cause any additional decline in later life outcomes.

Age at first menstruation. Girls who menstruate earlier are at greater risk of becoming pregnant as teenagers. One can use a statistical approach to look at two groups of women who are similar in measured characteristics of family background, but where one group has a higher pregnancy rate because they began their menstrual cycle earlier. However, the life outcomes for these groups look quite similar: that is, a random chance of being more likely to give birth as a teenager (because of an earlier age of first menstruation) doesn’t appear to cause any additional decline in later life outcomes.
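The passage does not name the statistical approach. One common way to formalize this kind of comparison is an instrumental-variable (Wald) estimate, and the Python sketch below assumes that reading. The data, the income variable, and the function name are hypothetical. The estimate divides the outcome gap between the early and late groups by the gap in their teen birth rates.

```python
from statistics import mean

# Hypothetical rows: (early_menarche, gave_birth_as_teen, adult_income_in_thousands)
women = [
    (1, 1, 20), (1, 1, 30), (1, 0, 30), (1, 0, 40),  # earlier first menstruation
    (0, 1, 22), (0, 0, 30), (0, 0, 30), (0, 0, 40),  # later first menstruation
]

def wald_estimate(rows):
    """Outcome gap between instrument groups, scaled by the gap in teen birth rates."""
    early = [row for row in rows if row[0] == 1]
    late = [row for row in rows if row[0] == 0]
    outcome_gap = mean([y for _, _, y in early]) - mean([y for _, _, y in late])
    birth_rate_gap = mean([d for _, d, _ in early]) - mean([d for _, d, _ in late])
    if birth_rate_gap == 0:
        raise ValueError("instrument does not shift the teen birth rate")
    return outcome_gap / birth_rate_gap

print(wald_estimate(women))  # -2.0
```

Here the early group has a teen birth rate 25 percentage points higher but an average income only 0.5 lower, so the implied effect of a teen birth is small. The sketch takes it for granted that age at first menstruation affects later outcomes only through teen childbearing, which is an assumption of the method rather than something the data can confirm.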

Propensity scores. Look at girls within a certain school, so that they live in more-or-less the same neighborhood. Using the available data, develop a “propensity score” that measures how likely a girl is to give birth as a teenager. Then compare the life outcomes for girls with similar propensity scores, some of whom gave birth and some of whom did not. There doesn’t seem to be a difference in life outcomes, again suggesting that giving birth as a teenager doesn’t much alter other life outcomes.
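As a rough illustration of the propensity-score comparison, here is a minimal Python sketch. Everything in it is hypothetical: the records, the single background variable, and the function names. It also takes a shortcut, using the observed share of teen births in each background cell as the propensity score, where a real study would typically fit a model on many variables. Girls are then grouped by score, and outcomes are compared only within groups that contain both teen mothers and non-mothers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for one school: (background, gave_birth_as_teen, adult_income)
girls = [
    ("low", True, 20), ("low", True, 22), ("low", False, 21), ("low", False, 21),
    ("mid", True, 30), ("mid", False, 31), ("mid", False, 30), ("mid", False, 32),
    ("high", False, 50), ("high", False, 52), ("high", False, 51), ("high", False, 51),
]

def propensity_by_cell(rows):
    """Estimate P(teen birth | background) as the observed share in each cell."""
    cells = defaultdict(list)
    for background, is_teen, _ in rows:
        cells[background].append(is_teen)
    return {background: sum(flags) / len(flags) for background, flags in cells.items()}

def stratified_gap(rows):
    """Income gap within propensity-score strata, weighted by the number of teen mothers."""
    score = propensity_by_cell(rows)
    strata = defaultdict(lambda: {True: [], False: []})
    for background, is_teen, income in rows:
        strata[round(score[background], 2)][is_teen].append(income)
    weighted_sum, total_weight = 0.0, 0
    for groups in strata.values():
        if groups[True] and groups[False]:  # need both kinds of girl to compare
            gap = mean(groups[True]) - mean(groups[False])
            weighted_sum += gap * len(groups[True])
            total_weight += len(groups[True])
    return weighted_sum / total_weight

print(propensity_by_cell(girls))
print(round(stratified_gap(girls), 3))
```

In this toy data the "high" group has no teen births, so it contributes nothing to the comparison. The remaining gap, about a third of an income unit, is far smaller than the raw difference between teen mothers and everyone else.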

