Hypothesis generation


Generating hypotheses is an important, but often challenging, step in an outbreak investigation. When generating hypotheses, it is best to keep an open mind and to cast a wide net. A good starting point is to identify exposures that have previously been associated with the pathogen under investigation. This can be done by:

  1. searching an outbreak database such as Outbreak Summaries, the Marler-Clark database, or the CDC Foodborne Outbreak Online database (see Tools for links to these databases)
  2. reviewing the published literature using a search engine such as PubMed or Google Scholar.

If the case definition for the illnesses under investigation includes laboratory information in the form of whole genome sequencing (WGS) results, consider investigating where and when the sequence has been seen before. Provincial and federal public health laboratories maintain WGS databases that can contain valuable information for outbreak investigation purposes. PulseNet Canada can provide information about how common or rare the serotype or sequence is nationally, where and when it was last seen, and whether it has been detected in any food samples in the past. PulseNet Canada can also check the United States’ PulseNet WGS databases for matches. FoodNet Canada can provide information about whether the sequence has previously been seen in farm or retail samples from its sentinel sites.

While it is important to gather such historical information, the most effective way to generate a high-quality hypothesis is to identify common exposures amongst cases. This can be achieved by interviewing cases using a hypothesis generating questionnaire and analysing exposures. 

Questionnaires and interviewing

Hypothesis generating questionnaires

Hypothesis generating questionnaires (or shotgun questionnaires) are intended to obtain detailed information on a person’s exposures in the days leading up to their illness. They are typically quite long and ask about many exposures such as travel history, contact with animals, restaurants, events attended, and a comprehensive food history. The time period of interest varies between pathogens because the exposure period is set equal to the maximum incubation period of the pathogen.
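
As a rough illustration of how the exposure period follows from the incubation period, the sketch below computes the date range a questionnaire would ask about for one case. The function name `exposure_window` and the 7-day value are assumptions made for this example only; the actual maximum incubation period must be looked up for the pathogen under investigation.

```python
from datetime import date, timedelta


def exposure_window(onset: date, max_incubation_days: int) -> tuple[date, date]:
    """Return (start, end) of the exposure period for one case.

    The window runs from `max_incubation_days` before symptom onset
    up to the onset date itself.
    """
    if max_incubation_days < 0:
        raise ValueError("max_incubation_days must be non-negative")
    return onset - timedelta(days=max_incubation_days), onset


# Illustrative value only: replace 7 with the pathogen's maximum incubation period.
start, end = exposure_window(date(2024, 6, 15), max_incubation_days=7)
print(f"Ask about exposures from {start} to {end}")
```

Computing the window per case, rather than using one fixed calendar range, keeps the questions anchored to each person's own onset date.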

When designing a questionnaire, it is important to ensure that the questions are gathering the intended information. Questions should be concise, informal, and specific. Before interviewing cases, questionnaires should be tested to ensure clarity and identify any potential errors.

Read more – Questionnaire Design

Case interviewing

Once the questionnaire is developed and piloted, it should be administered to cases in a consistent and unbiased manner. Case interviews can be conducted by one or multiple interviewers. A centralized approach allows a single interviewer to standardize interviews, detect patterns, and probe for items of interest. However, a multiple-interviewer approach is more time-efficient and allows for multiple perspectives when it comes time to identify the source.

Although case interviewing is an important outbreak investigation tool, it is not without its challenges. By the time the outbreak team is ready to conduct the interview, it could be weeks to months after the onset of symptoms. It is difficult for people to recall what they ate over a month ago. Sometimes cases might need to be interviewed multiple times as the hypothesis is developed and refined.

Read more – Case interviews

Exposure analysis

Once the interviews are complete, the data can be entered into a database or line list. The frequency of exposures for the cases is then obtained (e.g., % of cases that consumed each food item).
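
The frequency step can be sketched in a few lines of Python. This is a minimal example, not the toolkit's own database logic: the line list structure (one dict per case), the function name `exposure_frequencies`, and the choice to exclude unknown answers from the denominator are all assumptions, and the case data are made up.

```python
from collections import Counter


def exposure_frequencies(line_list):
    """Return {exposure: (n_exposed, n_answered, percent)} for a line list.

    `line_list` is a list of dicts, one per case, mapping exposure names to
    True (exposed), False (not exposed) or None (unknown / not answered).
    Unknown answers are excluded from the denominator (an assumption).
    """
    exposed = Counter()
    answered = Counter()
    for case in line_list:
        for exposure, value in case.items():
            if value is None:
                continue
            answered[exposure] += 1
            if value:
                exposed[exposure] += 1
    return {
        name: (exposed[name], answered[name], 100.0 * exposed[name] / answered[name])
        for name in answered
    }


# Made-up responses from four hypothetical cases.
cases = [
    {"chicken": True, "cherries": True, "sprouts": None},
    {"chicken": True, "cherries": False, "sprouts": True},
    {"chicken": False, "cherries": True, "sprouts": True},
    {"chicken": True, "cherries": True, "sprouts": False},
]
for name, (k, n, pct) in sorted(exposure_frequencies(cases).items()):
    print(f"{name}: {k}/{n} ({pct:.0f}%)")
```

Reporting the numerator and denominator alongside the percentage makes it clear how many cases actually answered each question.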

It is tempting to conclude that the most commonly consumed food items are the most likely suspects, but it is possible that these foods are commonly consumed amongst the general population as well. What is needed is a baseline proportion against which to compare the exposure frequencies. Reference population studies, such as the CDC Food Atlas, the Nesbitt Waterloo study and Foodbook (see Tools), can be used for this purpose. These studies provide investigators with the expected food frequencies based on 7-day food histories from thousands of respondents. These data can be used as a point of comparison for questionnaire data to identify exposures, such as food items, with higher than expected frequencies. Statistical tests (e.g., binomial probability tests) can then be used to assess whether the proportion of cases exposed is significantly different from the proportion of “controls” (i.e., people included in the population studies) exposed (see Tools).
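
One common form of the binomial probability test asks: if cases ate a food at the same rate as the reference population, how likely is it that this many or more of them would report it? The sketch below computes that upper-tail probability with the Python standard library. It is an illustration under stated assumptions, not a description of how the toolkit's Excel tool is implemented; the function name `binomial_tail` and the example numbers are hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    n: number of cases who answered the question
    k: number of those cases who reported the exposure
    p: expected proportion exposed in the reference population
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 8 of 10 cases report a food that 20% of the
# reference population reports eating.
p_value = binomial_tail(8, 10, 0.20)
print(f"P(8 or more of 10 exposed | expected 20%) = {p_value:.6f}")
```

A small probability suggests the exposure is reported more often than expected and may merit follow-up. Because a questionnaire can cover hundreds of items, some will fall below any threshold by chance, so the result is best treated as a screen rather than as proof.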

There are many limitations to using expected food frequencies, such as some studies not accounting for:

  • Seasonality (e.g., consumption of cherries is higher in the summer, but the expected levels are the same year-round)
  • Differences in consumption between sexes and between adults and children
  • Geographic location
  • Differences among ethnic, religious, and cultural groups

Further, since specific questions differ among surveys, it is often difficult to find the most appropriate comparison group. For example, the CDC Atlas of Exposures differentiates between hamburgers eaten at home or outside the home, while questionnaires used in investigations typically do not. Such differences in food definitions can make it challenging to determine which reference variable is the most appropriate to use as an “expected” level.

It is important to keep in mind that some foods with high expected consumption levels (e.g., chicken) may not flag statistically, but could still be potential sources. Further, there are other common exposures amongst cases that can carry important clues about the source of the outbreak. Cases that report common restaurants, events, or grocery stores can be considered sub-clusters. These sub-clusters should be investigated thoroughly by obtaining menus, receipts, or shopper card information if possible.
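
Finding candidate sub-clusters amounts to grouping cases by the venues they report. The sketch below shows one simple way to do this; the function name `find_subclusters`, the input format, and the venue names are assumptions for illustration. Names are matched only after trimming whitespace and lower-casing, so spelling variants or different branches of a chain would still need manual review.

```python
from collections import defaultdict


def find_subclusters(venue_reports, min_cases=2):
    """Group case IDs by reported venue and keep venues shared by several cases.

    `venue_reports` is an iterable of (case_id, venue_name) pairs.
    Venue names are compared after trimming whitespace and lower-casing.
    """
    by_venue = defaultdict(set)
    for case_id, venue in venue_reports:
        by_venue[venue.strip().lower()].add(case_id)
    return {
        venue: sorted(ids)
        for venue, ids in by_venue.items()
        if len(ids) >= min_cases
    }


# Hypothetical reports: (case ID, venue named in the interview).
reports = [
    ("case-01", "Restaurant A"),
    ("case-02", "restaurant a "),
    ("case-03", "Grocery Store B"),
    ("case-04", "Restaurant A"),
    ("case-02", "Grocery Store B"),
]
print(find_subclusters(reports))
```

Each venue returned is a starting point for requesting menus, receipts, or shopper card records for the listed cases.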


Tools


Toolkit binomial probability calculation tool for food exposures

  • This Microsoft Excel document allows users to enter outbreak case food exposure numbers for 300 food items and automatically calculates binomial probabilities using two reference populations and flags exposures of interest for follow-up (Reference populations: CDC Population Survey Atlas of Exposures, 2006-2007 and Waterloo Region, Ontario Food Consumption Survey, November 2005 to March 2006).

Toolkit Outbreak Summaries overview

  • This PDF document provides an overview of the Outbreak Summaries application, its key features and benefits, and an example of how it can be used during an outbreak investigation.

CDC National Outbreak Reporting System (NORS) Dashboard

  • The NORS dashboard allows users to view and download data on disease outbreaks reported to CDC. Data can be filtered by type of outbreak, year, state, etiology (genus only), setting, food/ingredient, water exposure, and water type.

Food Consumption Patterns in the Waterloo Region

  • This food frequency study by Nesbitt et al. was conducted in Waterloo, Ontario in 2005-2006. The study collected 7-day food consumption data from 2,332 Canadians.

CDC Food Atlas 2006-2007

  • This study by CDC was conducted in 10 U.S. states in 2006-2007. The study asked 17,000 respondents about their exposure to a comprehensive list of foods as well as animal exposure.

FoodNet Canada Reports and Publications

  • FoodNet Canada reports and publications provide information on the areas of greatest risk to human health to help direct food safety actions and programming as well as public health interventions, and to evaluate their effectiveness.

CDC FoodNet Reports

  • The Foodborne Diseases Active Surveillance Network (FoodNet) Annual Reports are summaries of information collected through active surveillance of nine pathogens.

Marler Clark Foodborne Illness Outbreak Database

  • This database provides summaries of food and water related outbreaks caused by various enteric pathogens dating back to 1984.

FDA Foodborne Illness-Causing Organisms Cheat Sheet

  • A quick summary chart on foodborne illnesses, organisms involved, symptom onset times, signs and symptoms to expect, and food sources.

CFIA: Canada’s 10 Least Wanted Foodborne Pathogens

  • This infographic prepared by the CFIA includes information on symptoms, onset time, transmission, potential sources, and preventative measures for ten foodborne pathogens.

Foodbook: Canadian Food Exposure Study to Strengthen Outbreak Response

  • Foodbook is a population-based telephone survey that was conducted in all Canadian provinces and territories. It provides essential data on food, animal and water exposure which is used by the Agency, as well as other federal, provincial, and territorial (F/P/T) partners to understand, respond to, control and prevent enteric illness in Canada.

Toolkit outbreak response database*

  • This Microsoft Access Database follows the layout and structure of the PHAC enteric hypothesis generating questionnaires. Users are able to enter data, export select fields to a Microsoft Excel line list, generate automatic food and risk exposure summary tables and run custom queries. 

    *Due to the Government of Canada’s Standard on Web Accessibility, this tool cannot be posted, but it is available upon request. Please contact us at info@outbreaktools.ca to request a copy. Please let us know if you need support or an accessible format.
