R in the news

I’m sure everyone and his brother will be posting a link to this article from Steve McNally’s blog on Forbes. The post contains a lot of interesting links, as do the comments.

Calorie labels don’t work

Many public health professionals believe that if people only knew how many calories were in their Big Macs, they’d order fewer of them. This has led to laws requiring restaurants to post caloric and other information about their menu items. But a study of New York City’s law shows that while people are aware of the information, it doesn’t cause them to eat less. In fact, according to an article in today’s New York Times, the study showed that:

...people had, in fact, ordered slightly more calories than the typical customer had before the labeling law went into effect, in July 2008.

What the article doesn’t speculate on is why this might be the case. I think people are less likely to purchase that extra Whopper when they don’t know how many calories are in it. When they find out how many calories the sandwich really has, they probably figure, “ah well, that isn’t too many, and besides I can make up for it by ordering a diet Coke.”

So what will the health police try next?

Smoking and early death

In the Eight Americas study, researcher Christopher Murray (Harvard School of Public Health) and colleagues present mortality data for various ethnic groups in the United States. The data on median age at death by US county were rather shocking, given how widely the figure varies from county to county.

The figure at left is for white males only; unfortunately the paper does not contain a figure with aggregate data for both sexes and all ethnic groups. Suffice it to say that charts for women and other ethnic groups look very similar, giving new meaning to the term “Red States”. Since I live right in the middle of that crimson gash, I wondered what causes people here to die so young, compared to their fellow citizens in other areas of the country.

Does the chart below explain it? I used Google Charts to produce this figure. It shows the rate of cigarette smoking by adults, with purple being the lowest and red the highest (I haven’t figured out yet how to put a legend on a Google Chart, so please bear with me).

Of course, cigarette smoking may not be the only cause of early death in the South. But the correspondence between the two figures is striking.

Note, just by looking at the lower graph, you would expect to see a lower age of death in Nevada than in surrounding states (the stench of cigarettes in the casinos was unbearable to me the one and only time I went there). And sure enough, you look at the upper figure and there it is.

Coin Flips Are Not Random

At least not according to this article in the Washington Post. Here is the academic article (Dynamical Bias in the Coin Toss) by Persi Diaconis, Susan Holmes and Richard Montgomery that the Washington Post is reporting on.

Another rant against BMI

Slate weighs in on the controversy.

Averages and integrals

As any student of calculus knows, the integral of a function over a given range is closely related to the average value of the function over that range. One very useful average in studies of energy use is the daily average temperature. Building energy use is often a strong function of outdoor air temperature because about half of the energy required is for heating and air conditioning. Consequently, daily energy use in a building usually correlates well with daily average temperature.

Weather stations report temperature at regular intervals, and the best way to find the daily average is of course to take the average of all readings over a 24-hour period (assuming there are no gaps in the data). In historical studies, however, sometimes the only information available is the daily high and low. But believe it or not, averaging the daily high and low temperatures gives a pretty good estimate of the daily average temperature.

Students in calculus classes are taught various methods of approximating an integral. The simplest is the rectangular rule, which in the case of daily temperature is equivalent to taking the average of all of the readings over the day (the integral would then be equal to the average multiplied by the total length of the interval, which is 24 hours).

Averaging the maximum and minimum of the function over the range gives about the same answer as the rectangular rule, but to my knowledge this method of approximating an integral is never taught. It would be interesting to know how “smooth” a curve has to be for this to hold. It’s trivially true for a straight line, for example. What about second order curves?
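To get a feel for when the high/low shortcut holds, here is a minimal Python sketch. The two temperature curves are invented for illustration (they are not weather data), and the function names are my own. It compares the plain average of 24 hourly readings with the average of the daily high and low.

```python
import math


def hourly_readings(curve):
    """Sample a temperature curve (a function of hour-of-day) once per hour."""
    return [curve(hour) for hour in range(24)]


def mean_of_readings(readings):
    """Rectangular rule divided by the interval length: the plain average."""
    return sum(readings) / len(readings)


def midrange(readings):
    """Average of the daily high and the daily low."""
    return (max(readings) + min(readings)) / 2


def symmetric_day(hour):
    """Hypothetical day: a pure sinusoid peaking at 3 PM."""
    phase = 2 * math.pi * (hour - 9) / 24
    return 15 + 8 * math.sin(phase)


def skewed_day(hour):
    """Hypothetical day: same sinusoid plus a second harmonic that
    flattens the afternoon peak and deepens the early-morning trough."""
    phase = 2 * math.pi * (hour - 9) / 24
    return 15 + 8 * math.sin(phase) + 2 * math.cos(2 * phase)


for name, curve in [("symmetric", symmetric_day), ("skewed", skewed_day)]:
    readings = hourly_readings(curve)
    print(name, round(mean_of_readings(readings), 2), round(midrange(readings), 2))
```

On these made-up curves the symmetric day gives 15.0 by both methods, while the skewed day gives a mean of 15.0 against a high/low average of 13.0. That suggests the shortcut leans on the curve being roughly symmetric about its midpoint, not on smoothness alone, though two toy curves are hardly a proof.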

Another interesting method of finding daily average temperature is to measure the temperature at 5:04:18 AM and 6:55:42 PM, and average the two readings. What is special about those two times? The answer is left as an exercise for the reader. Hint: it’s closely related to another method of approximating an integral, one that is occasionally still taught in classes on numerical analysis.
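Spoiler, for anyone who would rather check than derive: the two times are consistent with the sample points of two-point Gauss-Legendre quadrature stretched over a 24-hour day. The sketch below assumes that is the method the hint has in mind; the helper names are hypothetical.

```python
import math


def gauss_two_point_times(start_hour=0.0, end_hour=24.0):
    """Two-point Gauss-Legendre nodes (+/- 1/sqrt(3) on [-1, 1]),
    rescaled to the interval [start_hour, end_hour]."""
    midpoint = (start_hour + end_hour) / 2
    half_width = (end_hour - start_hour) / 2
    offset = half_width / math.sqrt(3)
    return midpoint - offset, midpoint + offset


def as_clock(hours):
    """Format a fractional hour-of-day as HH:MM:SS, rounded to the second."""
    total_seconds = round(hours * 3600)
    h, remainder = divmod(total_seconds, 3600)
    m, s = divmod(remainder, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


early, late = gauss_two_point_times()
print(as_clock(early), as_clock(late))  # 05:04:18 18:55:42
```

The two-point rule weights both readings equally, so the estimate of the daily average is simply the mean of the two temperatures. It is exact whenever the temperature curve over the day is a polynomial of degree three or less.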

Another study shows it’s better to be a bit overweight

In 1993-1994, demographic data were collected on a sample of 11,326 Canadians over the age of 25. The Cox proportional hazards model was used to determine the effect of BMI on mortality over a twelve-year period. It was found that people who were overweight but not obese (BMI of 25 to 29.9) had lower mortality than people of so-called normal weight (BMI of 18.5 to 24.9).

According to an article in the New York Times:

“Overweight may not be the problem we thought it was,” said Dr. David H. Feeny, a senior investigator at Kaiser Permanente Center for Health Research in Portland, Ore., and one of the authors of the study. “Overweight was protective.”

The abstract is available on-line.