Day 22: Friday May 29, 2020 (n=14)
As of this morning, you have completed seven re-interviews using the hypothesis-generating questionnaire.
You decide to enter your questionnaires into a database and then export the data into Microsoft Excel for analysis. You need to analyze the data to determine the frequency of exposures for cases and whether there are any exposures that are reported at higher than expected frequencies, compared to baseline population data.
Population-based studies such as Foodbook and the CDC FoodNet Population Survey provide investigators with baseline data on the reported frequency of specific food exposures for non-ill individuals over a 7-day period. These data can be used as a point of comparison for questionnaire data to identify food exposures that are reported by outbreak cases more commonly than would be expected.
For more information on expected food frequencies, refer to the background reading below.
This is a three-part exercise.
Examine the initial food exposure data in Table 1 (“Original Data”) of Module 2 – Exercise 2.
Identify the most common food exposures among the cases – look at the proportion of cases that answered “Yes”, as well as those that answered “Yes” or “Probably”.
Calculate the proportion of cases reporting each food item: (Yes + Probably) / (Yes + Probably + No). Note that ‘Don’t Know’ responses are not included in the food frequency analysis. In practice, investigators look at all exposures no matter how many cases report them, because not all cases will necessarily have been asked about all exposures. In addition, some foods (e.g., sprouts, nuts, seeds, flour) may be used as an ingredient in other foods or as a garnish, which makes them more difficult for cases to recall. Even a small proportion of cases reporting an uncommon food, or a food that may be used as an ingredient, can provide an important clue for investigators during hypothesis generation. Investigators will also look closely at the exposures that ‘flag’ as being reported more frequently than expected based on the population data (at p<0.05), as well as exposures that are reported frequently but do not flag.
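For those working outside Excel, the proportion calculation in this step could be sketched in Python as follows. This is a minimal illustration only: the function name and the response counts are hypothetical and are not taken from Table 1.

```python
def exposure_proportion(yes, probably, no):
    """Return (Yes + Probably) / (Yes + Probably + No).

    'Don't Know' responses are left out of both the numerator and the
    denominator, as described in the exercise. Returns None when no
    case gave a Yes, Probably, or No answer for the item.
    """
    denominator = yes + probably + no
    if denominator == 0:
        return None
    return (yes + probably) / denominator


# Hypothetical counts for illustration only (not the Table 1 data).
example_counts = {
    "food item A": {"yes": 5, "probably": 1, "no": 2, "dont_know": 1},
    "food item B": {"yes": 1, "probably": 0, "no": 7, "dont_know": 1},
}

for food, counts in example_counts.items():
    proportion = exposure_proportion(counts["yes"], counts["probably"], counts["no"])
    print(f"{food}: {proportion:.0%} of cases with a known response")
```

Because the denominator excludes ‘Don’t Know’ responses, it can differ from one food item to the next, so it is worth recording alongside each proportion.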
Compare the food frequencies to reference food frequencies in Table 2 (“Food Frequency Reference”) of Module 2 – Exercise 2.
How does the proportion of cases reporting each exposure compare to expected levels (e.g., proportions obtained from population-based food survey data)?
Items that were not available on the food survey include:
- Chia seeds
- Flax seeds
- Other seeds
- Non-dairy milk
- Other nuts
There are many limitations to using expected food frequencies, such as not accounting for:
- Seasonality (e.g., consumption of cherries is higher in the summer; however, the expected levels are based on the average across the whole year, and therefore do not take into account variations by month),
- Differences in consumption between men and women, among age groups, or across geographic locations, and
- Various ethnic/religious/cultural diets
Question 2-8: How might you utilize Foodbook data to address these limitations (for example, if cases are located just in Atlantic provinces, and are primarily under 18)?
Another challenge arises when cases have similar food consumption habits. If cases are eating many of the same items, it becomes difficult to determine a suspect source. For example, this particular case demographic appears to consume largely plant-based diets. Thus, they all likely consume a lot of fresh produce of all types, from berries to leafy greens and other vegetables.
One way to assess whether the proportion of cases reporting an exposure is significantly higher than the expected proportion is to calculate the binomial probability.
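As a rough sketch of what such a calculation involves, the Python snippet below computes the probability of observing at least k exposed cases out of n cases with a known response, if each case had the expected (population) probability p of reporting the exposure. The function name and the example numbers are assumptions for illustration; they are not the values in Table 3, and the exercise workbook may set up the calculation differently.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    k: number of cases reporting the exposure (Yes + Probably)
    n: number of cases with a known response (Yes + Probably + No)
    p: expected proportion from the population survey, between 0 and 1
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 6 of 7 cases report a food that 20% of the
# general population is expected to report over the same period.
probability = binomial_upper_tail(6, 7, 0.20)
flagged = probability < 0.05  # threshold used in this exercise
print(f"P(X >= 6) = {probability:.4f}; flags at p<0.05: {flagged}")
```

A small probability means the observed number of exposed cases would be unlikely if cases ate the food at the same rate as the general population, which is why such items 'flag' for closer review.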
Examine Table 3 (“Binomial Probability”) in Module 2 – Exercise 2. Which food items are reported at a significantly higher frequency than expected?