Questionnaire design


Questionnaires are systematic ways of gathering information needed in an outbreak investigation, usually through interviewing cases. Questionnaires play a key role in hypothesis generation, and they can be used to collect data for both descriptive and analytic studies. Questionnaires should be designed along with a data management plan.

There are different types of questionnaires that are commonly used in outbreak investigations:

  • Hypothesis generating questionnaires capture as many exposures as possible (i.e., cast a wide net). Hypothesis generating questionnaires target only cases, favouring open-ended and exploratory questions while examining all possible risk factors.
  • Focused questionnaires ask questions about specific exposures of interest and are often derived from a pattern of risk factors reported through the hypothesis generating questionnaire.
  • Hypothesis testing questionnaires focus on specific exposures. Hypothesis testing questionnaires are standardized and compare cases and controls using targeted questions that examine only suspected risk factors and use relatively few open-ended questions. 


Designing questionnaires

Purpose and Audience:

  • Identify the purpose of the questionnaire.
  • Consider who will be completing the questionnaire (e.g., an interviewer, the respondent, or the respondent's parent or caregiver). What is practical given time, logistical and resource constraints?

Design and layout:

  • Keep the questionnaire as short, tidy, relevant, and logical as possible to increase response rate. For self-administered questionnaires, survey length, font size, white space and organization are particularly important.
  • Order the questions from easy to difficult, from general to particular, and from factual to abstract. Start with questions relevant to the main subject, and avoid opening with personal or demographic questions.

Questionnaire components:

Questionnaire components include an introduction, information identifying the source (study subject or a proxy), demographics, clinical details, exposure or risk factor information and a conclusion.

  • Identifying information collected in a hypothesis generating questionnaire by a local health authority may include name, identifying numbers and/or contact information. Identifying information allows for subject identification, updating questionnaires as more information becomes available, linking questionnaires to other records, mapping, and preventing duplicate data entry of records.
  • Demographic information describes the cases in terms of age, date of birth, sex, occupation and place of employment/school. This information can help clarify the population at risk of becoming ill and needs to be evaluated to determine if it affects the relationship between exposure and disease.
  • Clinical information describes the health status of cases; variables may include date and time of illness onset, signs and symptoms (e.g., first signs, subsequent signs, severity, duration), laboratory tests, hospitalization status and outcome (e.g., recovery, death). This information allows characterization of the illness, charting of its time course, as well as decisions about who has the outcome of interest.
  • Exposure or risk factor information is used to test the hypotheses under investigation and is probably the major focus of the questionnaire. Risk factors may relate directly to food (food and beverage consumption inside and outside the home) and may also include food handling and cooking practices, attending or working in certain institutions or facilities, water exposures (drinking and recreational), recent travel, contact with animals or being on a farm, and contact with ill people or children in daycare. Data collection will differ depending on whether the pathogen is known or unknown. For example, if the pathogen is known, the investigation can focus on commonly associated foods and other potential risk factors during the exposure period, which is determined by the relevant incubation period.
  • A notes section captures interviewer comments, ideas and questions.
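
When the pathogen is known, the exposure period to ask about can be worked back from each case's onset date. The following is a minimal sketch of that arithmetic, not a prescribed method: the function name `exposure_window` and the incubation values are illustrative assumptions, and real incubation ranges should come from a reference for the pathogen under investigation.

```python
from datetime import date, timedelta


def exposure_window(onset: date, min_incubation_days: int,
                    max_incubation_days: int) -> tuple[date, date]:
    """Return (earliest, latest) likely exposure dates for one case.

    The earliest date uses the longest incubation period and the
    latest date uses the shortest one.
    """
    if not 0 <= min_incubation_days <= max_incubation_days:
        raise ValueError("need 0 <= min_incubation_days <= max_incubation_days")
    earliest = onset - timedelta(days=max_incubation_days)
    latest = onset - timedelta(days=min_incubation_days)
    return earliest, latest


# Illustrative values only: onset on 10 June 2010, incubation of 1 to 3 days.
start, end = exposure_window(date(2010, 6, 10), 1, 3)
print(start.isoformat(), end.isoformat())
```

The resulting dates can then be written directly into the exposure questions so that every respondent is asked about the same, clearly bounded period relative to their own onset.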

Testing the questionnaire:

Pilot the questionnaire on colleagues and, more importantly, lay people. Refine the questionnaire accordingly if questions do not yield the information they are supposed to, if wording is unclear, if questions can be interpreted in more than one way, if closed-ended questions have insufficient options/categories and/or if skip patterns are not followed correctly.
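
One of the pilot checks above, whether skip patterns are followed correctly, can be partly automated once pilot responses are entered. The sketch below assumes a hypothetical layout in which each record is a dictionary of field names to answers and each skip rule names a gate question, the answer that unlocks the follow-ups, and the follow-up fields; none of these names come from a specific tool.

```python
def skip_violations(record, skip_rules):
    """Return follow-up fields that were answered even though the
    gate question's answer says they should have been skipped."""
    violations = []
    for gate, (required_answer, follow_ups) in skip_rules.items():
        if record.get(gate) != required_answer:
            for field in follow_ups:
                # Any non-empty answer here means the skip was not followed.
                if record.get(field) not in (None, ""):
                    violations.append(field)
    return violations


# Hypothetical rule: ask about lettuce only if the respondent ate a burger.
rules = {"ate_burger": ("yes", ["lettuce_on_burger"])}
pilot_record = {"ate_burger": "no", "lettuce_on_burger": "yes"}
print(skip_violations(pilot_record, rules))  # ['lettuce_on_burger']
```

A check like this only flags answers that should not be there; unclear wording and ambiguous questions still need to be found by talking to the pilot respondents.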



Types of questions

Question types include fill-in-the-blank, open-ended, and closed-ended. The type of question used depends upon what sort of information is desired and influences the analysis used.

  • Indicate units for fill-in-the-blank questions (e.g., Time of illness onset: ______ am/pm, Departure date (DD-MM-YYYY): ___-___-______, Return date (DD-MM-YYYY): ___-___-______).
  • Closed-ended questions have limited response options (e.g., checkboxes). These questions are precise and uniform, can prompt/improve recall, and are easier to code and analyze than open-ended questions, but offer no opportunity to add extra information.
  • Open-ended questions allow for unlimited responses in the respondent’s own words (e.g., List the fruits that you have eaten in the past 7 days.). Responses from open-ended questions can be categorized and used later in closed-ended questions as data collection tools evolve during an investigation (e.g., Have you eaten any of the following items in the past 7 days: Mangoes? Yes/No/Don’t know. Strawberries? Yes/No/Don’t know. Melons? Yes/No/Don’t know.).
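
The step from open-ended answers to closed-ended items can be sketched as a simple tally. This is an illustration under stated assumptions, not a recommended workflow: it assumes the open-ended answers were entered as comma-separated text, and the function name `common_items` and the `min_mentions` threshold are hypothetical.

```python
from collections import Counter


def common_items(open_responses, min_mentions=2):
    """Return items mentioned by at least `min_mentions` respondents,
    most frequently mentioned first (ties in alphabetical order)."""
    counts = Counter()
    for response in open_responses:
        # Count each item once per respondent, ignoring case and stray spaces.
        items = {part.strip().lower() for part in response.split(",") if part.strip()}
        counts.update(items)
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, mentions in ranked if mentions >= min_mentions]


# Made-up answers to "List the fruits that you have eaten in the past 7 days."
responses = [
    "mangoes, strawberries",
    "Strawberries, melons",
    "mangoes, strawberries, grapes",
]
for item in common_items(responses):
    print(f"{item.capitalize()}? Yes/No/Don't know")
```

In practice free-text answers need manual review before tallying (spelling variants, brand names, mixed dishes), so a count like this is a starting point for choosing closed-ended items rather than a replacement for reading the responses.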

Creating specific questions:

  • Address one subject per question. For example, asking “Did you eat a burger?” and “Was there lettuce on the burger?” is preferable to asking “Did you eat a burger with lettuce?”.
  • Make questions as short as possible.
  • Avoid leading questions, which can bias responses.
  • Use simple and informal language. For example, “Did you have muscle aches?” is better than “Did you have myalgia?”.
  • Define terms to improve reliability. For example, “Are you experiencing diarrhea? We consider diarrhea to be 3 or more loose bowel movements in a 24-hour period” is more reliable than “Are you experiencing diarrhea?” because the definition lets all respondents answer against the same standard rather than their own idea of what diarrhea is.
  • Provide appropriate options to increase validity. For example, when asking about sources of drinking water in the home, it is more valid to provide the options “municipal tap water”, “well water”, “surface water” and “commercially bottled water” than just “tap water” and “bottled water”.
  • Be as specific as possible to avoid ambiguity in the analysis. For example, “Have you been examined by a physician for these symptoms in the past seven days” is more specific than “Have you been examined by a physician in the past seven days”. With respect to food consumption, you want an accurate account of food actually eaten, not what is usually preferred. However, food preferences may be the only option when the illness occurred months ago or the incubation period is long (e.g., Listeria: 30 days).
  • Use specific date/time references to improve the validity of the results and increase recall. For example, “Did you swim in a public pool between Wednesday, June 2 and Wednesday, June 9, 2010?” is better than “Have you been swimming in a public pool recently?”.
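
Date-specific wording is easy to get wrong by hand, for example by pairing a date with the wrong weekday. A minimal sketch of generating the wording from the dates themselves is shown below; the function name `dated_question` and the example dates are illustrative assumptions, and the weekday and month names assume Python's default (English) locale.

```python
from datetime import date


def dated_question(activity: str, start: date, end: date) -> str:
    """Build a question that names the exact period, with weekdays
    derived from the dates so the two cannot disagree."""
    if end < start:
        raise ValueError("end date must not be before start date")

    def label(d: date) -> str:
        # e.g. "Monday, June 7" (day number without a leading zero)
        return f"{d.strftime('%A, %B')} {d.day}"

    return f"Did you {activity} between {label(start)} and {label(end)}, {end.year}?"


print(dated_question("swim in a public pool", date(2010, 6, 7), date(2010, 6, 14)))
```

This sketch puts the year only after the end date, so it suits periods that fall within a single calendar year; a period spanning two years would need the year on both dates.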



Toolkit enteric questionnaire repository

  • This repository contains questionnaires for various pathogens, to be used at various stages of an outbreak investigation, including hypothesis generation, refinement, and testing. 



Gregg, M.B. (ed.). 2002. Field Epidemiology, 2nd Edition. Oxford University Press, Oxford, England.

World Health Organization. 2008. Foodborne disease outbreaks: guidelines for investigation and control. WHO Press, World Health Organization, Geneva, Switzerland.
