A study investigated whether (caffeinated) coffee consumption is related to the risk of depression in women [Lucas et al., Archives of Internal Medicine 171.17 (2011), 1571].

Data were collected on 50739 women free of depression symptoms at the start of the study in the year 1996, and these women were followed through 2006.

For the duration of the study, the researchers used surveys to record how much coffee the women consumed and whether they experienced depression.

- Download the following dataset: coffee-depression.
- State the null and alternative hypotheses for this study, and identify which statistical test is appropriate.
- Calculate the frequency sums (marginal totals) for each category, and verify that the overall total matches the sample size (50739).
- Calculate the expected frequencies.
- Calculate the test statistic.
- Calculate the number of degrees of freedom.
- Compare the test statistic with the appropriate critical value(s). Can you reject the null hypothesis? Interpret the result in the context of the study.
- A New York Times article cautions against using the results of this study to make public health recommendations: official public guidelines on coffee intake should wait “until studies with methodologies better able to determine causality are conducted”. Do you agree? Why?
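
The computational steps above (category totals, expected frequencies, test statistic, degrees of freedom) can be sketched in a few lines of Python. This is a minimal sketch that assumes the data form a two-way contingency table and that a chi-square test of independence is the intended test. The counts in `observed` are hypothetical placeholders, not the study data, and the function name `chi_square_independence` is introduced here for illustration only. Replace the placeholder table with the counts from the downloaded dataset.

```python
# Hypothetical placeholder counts, NOT the study data.
# Rows: depression (yes, no); columns: coffee-consumption categories.
observed = [
    [20, 30, 50],
    [80, 70, 50],
]


def chi_square_independence(table):
    """Return totals, expected counts, chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: (row total * column total) / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Chi-square statistic: sum over all cells of (observed - expected)^2 / expected.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (number of rows - 1) * (number of columns - 1).
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return row_totals, col_totals, n, expected, statistic, dof


row_totals, col_totals, n, expected, statistic, dof = chi_square_independence(observed)
print("row totals:", row_totals)
print("column totals:", col_totals)
print("grand total:", n)
print("chi-square statistic:", round(statistic, 3))
print("degrees of freedom:", dof)
```

The statistic is then compared with the critical value of the chi-square distribution for the computed degrees of freedom at the chosen significance level, taken from a table or, if SciPy is available, from `scipy.stats.chi2.ppf`.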