

Qualitative analysis of open-ended questions

Most of the data collected through open-ended questions are qualitative, that is, in nonnumeric form. To analyze and make sense of these data, one has to conduct Qualitative Data Analysis (QDA).

QDA involves a range of processes and procedures that aim to provide an explanation, understanding and interpretation of the collected data.

Two of the most popular approaches to analyzing answers to open-ended questions are content analysis and thematic analysis. The first employs a more systematic and mechanical process and is usually used to classify and quantify data. The second employs a more flexible and reflective process and is usually used to capture the richness and depth of qualitative data. An overview of each approach follows.



Content analysis

This approach involves a rigorous and systematic process of coding data and identifying themes or patterns, with an emphasis on the reliability and replicability of observations and subsequent interpretations. Content analysis is particularly useful when the purpose is to classify, summarize, quantify and tabulate qualitative data.

Generally, the structuring process of content analysis follows three main steps:


1. Identification of the categories of analysis and development of the coding system

This involves an interactive process of determining the appropriate unit or level of analysis (this could be the whole answer, sentences, or words) and identifying the recurrent categories that give meaning to the data. The purpose is to develop a coding system that will enable the conversion of the data into meaningful and specific units of information (codes or categories). The development of the coding system can be data-driven or theory-driven.

In a data-driven approach, the categories (codes) are selected based on a detailed analysis of all data. This approach is particularly suited when there is little knowledge about the topics and themes that may come up in the answers or when the goal is to make an in-depth exploration of the data.

In a theory-driven approach, the categories (codes) selected are predetermined by an existing theory. Thus, in this approach it is not strictly necessary to go through all data in order to select the categories, making it less time-consuming than the data-driven approach. The theory-driven approach is particularly suited when there is already knowledge and a conceptual organization of the themes that should be analyzed in the answers or when the goal is to test a theory.

In both approaches, the end product of this step should be a checklist or coding instrument that identifies all the relevant categories, provides a clear definition and concrete examples from the data for each category, and is accompanied by rigorous instructions on how to code the data with the instrument.
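As an illustration only, a coding instrument of this kind could be kept in a small data structure. The sketch below assumes Python; the `Category` class, the two category names and the example answers are all hypothetical and do not come from any particular study. It also checks that every category has a definition and at least one example, as the step requires.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One entry of a coding system: a code with its definition and examples."""
    code: str
    definition: str
    examples: list = field(default_factory=list)
    rule: str = ""  # instruction on when to apply (or not apply) the code


# Hypothetical categories for an imaginary survey question about a website.
CODEBOOK = [
    Category(
        code="NAVIGATION",
        definition="Answer refers to finding one's way around the site.",
        examples=["I could not find the contact page."],
        rule="Apply only when the respondent mentions menus, links or search.",
    ),
    Category(
        code="CONTENT",
        definition="Answer refers to the quality or relevance of the information.",
        examples=["The articles were too technical."],
        rule="Do not apply to comments about layout or visual design.",
    ),
]


def validate(codebook):
    """Return a list of problems: duplicate codes, missing definitions or examples."""
    problems = []
    seen = set()
    for cat in codebook:
        if cat.code in seen:
            problems.append(f"duplicate code: {cat.code}")
        seen.add(cat.code)
        if not cat.definition.strip():
            problems.append(f"{cat.code}: missing definition")
        if not cat.examples:
            problems.append(f"{cat.code}: missing example")
    return problems


print(validate(CODEBOOK))  # an empty list means every category is documented
```

Keeping the definition, examples and coding rule next to each code is one way to make sure every judge works from the same instructions; a shared document or spreadsheet would serve the same purpose.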


2. Coding of the data into the categories of analysis

This step involves organizing and coding all data in a way that ensures reliability and meaningfulness: the previously defined categories (codes) are used to classify the content. It therefore requires following an explicit set of recording instructions that state the rules for coding the data into categories. Recording should involve more than one judge so that the coding of each unit of content can be examined for reliability, and sources of disagreement can be identified and corrected. The reliability of the coding system can then be evaluated by computing coefficients of agreement between two or more judges or coders (e.g., Cohen's kappa or Scott's pi).
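To make the idea of an agreement coefficient concrete, the sketch below computes Cohen's kappa for two judges, assuming each judge assigns exactly one category to each unit. Kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The function name, the judges' codes and the category labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one category per unit."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same, non-empty set of units")
    n = len(coder_a)
    # Observed agreement: share of units given the same category by both coders.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement, from each coder's own category proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes assigned by two judges to the same ten answers.
judge_1 = ["NAV", "NAV", "CONTENT", "CONTENT", "NAV",
           "OTHER", "CONTENT", "NAV", "OTHER", "CONTENT"]
judge_2 = ["NAV", "CONTENT", "CONTENT", "CONTENT", "NAV",
           "OTHER", "CONTENT", "NAV", "NAV", "CONTENT"]

print(round(cohens_kappa(judge_1, judge_2), 3))
```

In this made-up example the judges agree on 8 of 10 answers (p_o = 0.80) while chance agreement is p_e = 0.38, so kappa is about 0.68. The units on which the judges disagree are the ones to discuss when refining the coding instructions.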


3. Analysis and interpretation

Once all data are organized and coded, qualitative analysis (e.g., of content and of relationships between categories) and quantitative analysis (e.g., statistical analysis of the prevalence of different categories) can be performed, followed by an interpretation of the results.
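A very simple form of the quantitative analysis mentioned here is a tabulation of how often each category occurs. The sketch below assumes that each answer can carry one or more codes; the coded answers and category names are invented for illustration.

```python
from collections import Counter

# Hypothetical coded data: each answer may carry one or more category codes.
coded_answers = [
    {"NAVIGATION"},
    {"CONTENT", "NAVIGATION"},
    {"CONTENT"},
    {"DESIGN"},
    {"CONTENT", "DESIGN"},
]


def prevalence(coded):
    """Return {category: (count, share of answers mentioning it)}."""
    counts = Counter(code for codes in coded for code in codes)
    total = len(coded)
    return {code: (n, n / total) for code, n in counts.most_common()}


# Print one row per category: code, count, percentage of answers.
for code, (n, share) in prevalence(coded_answers).items():
    print(f"{code:<12}{n:>3}{share:>8.0%}")
```

Because an answer may receive several codes, the shares are computed against the number of answers and need not sum to 100%.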


For more information about this approach, the following sources are recommended:

Krippendorff, K. H. (2012) Content Analysis: An Introduction to Its Methodology, 3rd Ed. Thousand Oaks: SAGE Publications.

Schreier, M. (2012) Qualitative Content Analysis in Practice. London: SAGE Publications.



Thematic analysis

This approach is the most commonly used in qualitative analysis because it is simple, flexible and less time-consuming than other approaches. It can be used with many kinds of qualitative data and with many goals in mind. For that reason, thematic analysis is often, implicitly or explicitly, part of other approaches to data analysis, including grounded theory, narrative analysis and interpretative phenomenological analysis (IPA). Researchers often use thematic analysis as a first step to look for broad patterns in their data and then, if necessary, conduct a more fine-grained analysis using alternative approaches.

The main goal when using this approach is to provide a description and understanding of answers. It helps researchers move their analysis from a broad reading of the data towards discovering patterns and developing themes.

The first thing to consider when using thematic analysis is how the themes will be identified. This can be done deductively or inductively.

In deductive thematic analysis, a structure or predetermined framework is used to analyze data. Essentially, the researcher imposes their own structure or theories on the data and then uses these to analyze it. This approach is particularly useful when one has specific research questions that already identify the main themes or categories that will be used to group the data and to look for similarities and differences. Given that this approach is relatively quick and easy to perform, it is also particularly useful when time and resources are limited. However, by using a predetermined thematic framework one loses flexibility of analysis, which can bias and limit the interpretation of the data.

In inductive thematic analysis, little or no predetermined theory, structure or framework is used to analyze data; instead the actual data itself is used to derive the structure of analysis. In this approach the themes are strongly linked to the data since they emerge from it. This approach is comprehensive and therefore time-consuming and is particularly useful when little or nothing is known about the event or topic under study.

Usually, inductive thematic analysis involves 6 phases: familiarization with data; generation of initial codes; searching for themes among codes; reviewing themes; defining and naming themes; and producing the final report.


| Phase | Description of the process | Result |
| --- | --- | --- |
| Familiarization with the data | Read and re-read data in order to become familiar with what the data entails, paying specific attention to patterns that occur and noting down initial ideas/patterns. | Preliminary "start" codes and detailed notes. |
| Generation of initial codes | Generate the initial codes by identifying where and how patterns occur. This happens through data reduction where the researcher collapses data into labels in order to create categories for more efficient analysis. Data compilation is also completed here. This involves the researcher making inferences about what the codes mean. | Comprehensive codes of how data answers research question(s). |
| Searching for themes | Collate codes into themes that accurately depict the data. It is important in developing themes that the researcher describes exactly what the themes mean, what they include and exclude. | List of candidate themes for further analysis. |
| Reviewing themes | Check if the themes make sense and account for all the coded extracts and the entire data set. If the analysis seems incomplete, the researcher needs to go back and find what is missing. Generate a thematic “map” of the analysis. | Coherent recognition of how themes are patterned to tell an accurate story about the data. |
| Defining and naming themes | Generate clear definitions and names for each theme. Describe which aspects of data are being captured in each theme, and what is interesting about the themes. | A comprehensive analysis of what the themes contribute to understanding the data. |
| Producing the final report | Decide which themes make meaningful contributions to understanding what is going on within the data. Researchers should also conduct verification of the data to check if their description is an accurate representation. | Description of the findings. |

Adapted from Braun and Clarke (2006)
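The bookkeeping behind the "searching for themes" and "reviewing themes" phases can be sketched in a few lines. The example below is only an assumption about how one might organize the material, not part of Braun and Clarke's method: the extracts, codes and theme names are hypothetical, and deciding which codes belong together remains the researcher's interpretive work.

```python
# Hypothetical initial codes attached to answer extracts (output of phase 2).
coded_extracts = [
    ("I never find the search box", "search hard to find"),
    ("Menus change on every page", "inconsistent menus"),
    ("Too much jargon in the articles", "jargon"),
    ("Articles are out of date", "outdated information"),
    ("The font is tiny", "small text"),
]

# Candidate themes proposed by the researcher (phase 3): theme -> codes.
candidate_themes = {
    "Getting around the site": {"search hard to find", "inconsistent menus"},
    "Trust in the content": {"jargon", "outdated information"},
}


def collate(extracts, themes):
    """Group extracts under themes; also return codes not covered by any theme."""
    code_to_theme = {code: theme for theme, codes in themes.items() for code in codes}
    grouped = {theme: [] for theme in themes}
    uncovered = set()
    for text, code in extracts:
        theme = code_to_theme.get(code)
        if theme is None:
            uncovered.add(code)  # flag for the review phase
        else:
            grouped[theme].append(text)
    return grouped, uncovered


grouped, uncovered = collate(coded_extracts, candidate_themes)
print(grouped)
print("Codes to revisit during review:", uncovered)
```

Here the code "small text" is not covered by any candidate theme, which is the kind of gap the review phase is meant to surface: the researcher can add a theme, widen an existing one, or decide that the code is not relevant.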


For more information about this approach, the following sources are recommended:

Braun, V. and Clarke, V. (2006) Using thematic analysis in psychology. Qualitative Research in Psychology 3 (2): 77-101.

Boyatzis, R.E. (1998) Transforming qualitative information: Thematic analysis and code development. Thousand Oaks: SAGE Publications.



Computer Assisted Qualitative Data AnalysiS (CAQDAS)

When performing qualitative analysis, one should consider using a computer to enhance the analytic process. Numerous programs designed for qualitative data can speed up the analysis, make it easier for researchers to experiment with different codes and test different hypotheses about relationships, and facilitate the diagramming of emerging theories and the preparation of research reports. The steps involved in computer-assisted qualitative data analysis parallel those used traditionally to analyze text such as notes, documents, or transcripts: preparation, coding, analysis, and reporting. However, CAQDAS can only support the intellectual processes of the researcher; it cannot substitute for the researcher in understanding and interpreting the data.

For more information about CAQDAS, the following source is recommended:


