Types of Data
We'll talk about data in lots of places in The Knowledge Base, but here I just want to make a fundamental distinction between two types of data: qualitative and quantitative. The way we typically define them, we call data 'quantitative' if it is in numerical form and 'qualitative' if it is not. Notice that qualitative data could be much more than just words or text. Photographs, videos, sound recordings, and so on can all be considered qualitative data.
Personally, while I find the distinction between qualitative and quantitative data to have some utility, I think most people draw too hard a distinction, and that can lead to all sorts of confusion. In some areas of social research, the qualitative-quantitative distinction has led to protracted arguments with the proponents of each arguing the superiority of their kind of data over the other. The quantitative types argue that their data is 'hard', 'rigorous', 'credible', and 'scientific'. The qualitative proponents counter that their data is 'sensitive', 'nuanced', 'detailed', and 'contextual'.
For many of us in social research, this kind of polarized debate has become less than productive. It also obscures the fact that qualitative and quantitative data are intimately related to each other. All quantitative data is based upon qualitative judgments, and all qualitative data can be described and manipulated numerically. For instance, think about a very common quantitative measure in social research: a self-esteem scale. The researchers who develop such instruments have to make countless judgments in constructing them: how to define self-esteem; how to distinguish it from other related concepts; how to word potential scale items; how to make sure the items will be understandable to the intended respondents; what kinds of contexts the scale can be used in; what kinds of cultural and language constraints might be present; and on and on. The researcher who decides to use such a scale in their study has to make another set of judgments: how well does the scale measure the intended concept; how reliable or consistent is it; how appropriate is it for the research context and intended respondents; and on and on. Believe it or not, even the respondents make many judgments when filling out such a scale: what is meant by various terms and phrases; why is the researcher giving this scale to them; how much energy and effort do they want to expend to complete it; and so on. Even the consumers and readers of the research will make lots of judgments about the self-esteem measure and its appropriateness in that research context. What may look like a simple, straightforward, cut-and-dried quantitative measure is actually based on lots of qualitative judgments made by lots of different people.
On the other hand, all qualitative information can be easily converted into
quantitative form, and there are many times when doing so would add considerable value to your
research. The simplest way to do this is to divide the qualitative information into units
and number them! I know that sounds trivial, but even that simple nominal enumeration can
enable you to organize and process qualitative information more efficiently. Perhaps more
to the point, we might take text information (say, excerpts from transcripts) and sort
these excerpts into piles of similar statements. Even a grouping task as simple as this
yields results that we can describe quantitatively. For instance,
if we had ten statements and we grouped these into five piles (as shown in the figure),
we could describe the piles using a 10 x 10 table of 0's and 1's. If two statements were placed
together in the same pile, we would put a 1 in their row-column juncture. If two
statements were placed in different piles, we would use a 0. The resulting matrix
or table describes the grouping of the ten statements in terms of their similarity. Even
though the data in this example consists of qualitative statements (one per card), the
result of our simple qualitative procedure (grouping similar excerpts into the same piles)
is quantitative in nature. "So what?" you ask. Once we have
the data in numerical form, we can manipulate it numerically. For instance, we could
have five different judges sort the 10 excerpts and obtain a 0-1 matrix like this for each
judge. Then we could average the five matrices into a single one that shows the
proportion of judges who grouped each pair together. This proportion could be
considered an estimate of the similarity (across independent judges) of the excerpts.
While this might not seem too exciting or useful, it is exactly this kind of procedure
that I use as an integral part of the process of developing 'concept maps' of ideas for
groups of people (something that is useful!).
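The piling and averaging steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in any particular concept mapping software: the function name `sort_to_matrix`, the use of NumPy, and the five judges' sorts are all made up for the example. It also assumes that each statement counts as being "in the same pile" as itself, so the diagonal of each matrix is filled with 1's.

```python
import numpy as np

def sort_to_matrix(piles, n_statements):
    """Turn one judge's piles into an n x n table of 0's and 1's.

    A cell holds 1 if the row and column statements were placed in the
    same pile, and 0 otherwise. Assumption: a statement is treated as
    sharing a pile with itself, so the diagonal is all 1's.
    """
    matrix = np.zeros((n_statements, n_statements), dtype=int)
    for pile in piles:
        for i in pile:
            for j in pile:
                matrix[i, j] = 1
    return matrix

# Hypothetical data: five judges each sort ten statements (numbered 0-9)
# into five piles. These sorts are invented purely for illustration.
judge_sorts = [
    [[0, 4], [1, 2, 7], [3], [5, 8], [6, 9]],
    [[0, 4, 3], [1, 2], [7], [5, 8, 6], [9]],
    [[0, 4], [1, 2, 7], [3, 9], [5, 8], [6]],
    [[0], [4, 3], [1, 2, 7], [5, 8], [6, 9]],
    [[0, 4], [1, 2], [7, 3], [5, 8], [6, 9]],
]

# One 10 x 10 matrix of 0's and 1's per judge.
matrices = [sort_to_matrix(piles, 10) for piles in judge_sorts]

# Averaging across judges gives, for each pair of statements, the
# proportion of judges who placed that pair in the same pile.
similarity = np.mean(matrices, axis=0)

print(similarity[0, 4])  # statements 0 and 4
print(similarity[5, 8])  # statements 5 and 8
```

In this made-up data, four of the five judges put statements 0 and 4 together, so that cell of the averaged matrix holds 0.8, while all five judges paired statements 5 and 8, giving 1.0. The averaged matrix is symmetric, so only the cells above (or below) the diagonal carry distinct information.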
Copyright ©2006, William M.K. Trochim, All Rights Reserved
Last Revised: 10/20/2006