Ross Woods, 2022-24
Thematic coding is a common method of analyzing documentary data, usually transcripts of interviews or focus groups with open-ended questions. As a qualitative methodology, it gives researchers a way to interpret and analyze data.
Check the purpose of the research:
Thematic coding is derived from a research approach called grounded theory. In essence, this is a method of using a comprehensive set of examples to identify patterns, from which the researcher can create a theory. The theory is justified by the range of real examples.
Thematic coding is really just a systematic way of analyzing data to reach a conclusion. It has become increasingly popular in recent years, partly because it looks like a set of steps. However, the later stages are less procedural and require more thought. It is not actually a set of steps, but is more like a set of phases that can overlap. For example, you can start transcribing and analyzing data as soon as it is collected.
Thematic coding has several advantages. First, the researcher simply has to follow the method. Second, it gives a way to systematically analyze lots of data, such as when writing a longer thesis or a dissertation. Third, the researcher can use it in stages, which gives an opportunity to adapt the method as needed and perhaps hold more interviews. Fourth, it is easier if the researcher uses voice-to-text software to transcribe interviews.
It also has several disadvantages. Although it is quite flexible, it probably doesn’t allow much scope for innovation. If you do not use transcription software, it is very time-consuming to transcribe interviews by hand, or quite expensive if you have to pay someone else to do it for you.
You should already be keeping a written diary of your methodology, including what you did, why you did it, your methods, and your observations. Your description is essential to your accountability. In principle, it must be detailed enough to enable someone else to follow your method. (In the Grounded Theory literature, keeping the diary is often called keeping a memo or memoing.)
Write your notes in full sentences so that you can understand them even after you've forgotten the actual situation in which you wrote them. (A list of unexplained topics is not helpful.)
Add records of your reflections to your diary. You can start to informally analyze data as soon as you start collecting it. You should take note of anything relevant to your research question, for example:
In your diary, you should also write down the reasons why you interpreted the data the way you did. The description will probably be quite simple at first, but any later changes or elaborations will be significant because they indicate a better interpretation of the data.
💡 If you are writing a dissertation ...
You can start formulating deductive (a priori) themes very early in the whole process, even before you collect any data. It is quite permissible to derive a set of themes from your literature review or your statement of the research question. However, it would be a mistake to use deductive themes exclusively because other unexpected but significant themes might emerge in the data later on.
The other alternative is to use inductive codes and themes, that is, those that emerge in the data during your analysis. This has the advantage of including those that you could not anticipate in your original plan. You can even modify your system of inductive codes and themes during data-gathering and analysis in order to get themes that better represent your data.
Some of your ongoing analysis might affect your data gathering. Qualitative research is often iterative, and this method allows you to improve your data collection and analysis as you progress:
You have enough data when any more would not improve, strengthen, or add to your conclusions. This point is called “data saturation.” If you use interviews, data saturation is usually reached with fewer than 20 interviews. One of these methods will help you decide when to stop collecting data:
In some kinds of research, such as ethnography, it is usually possible to keep collecting more and more data. In these cases, the criterion is your research question. In other cases, you can consider stopping when you have obtained data from everybody in your sample.
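If you code your interviews as they come in, one informal way to watch for saturation is to count how many new codes each interview adds. The following Python sketch illustrates that idea under stated assumptions: the code labels are invented, and the rule "no new codes in the last three interviews" is an assumed rule of thumb, not a standard from the literature. It does not replace your own judgment about whether more data would add to your conclusions.

```python
def new_codes_per_interview(codes_by_interview):
    """Count, for each interview in order, the codes not seen in any earlier interview."""
    seen = set()
    counts = []
    for codes in codes_by_interview:
        fresh = set(codes) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def looks_saturated(counts, window=3):
    """Assumed rule of thumb: the last `window` interviews added no new codes."""
    return len(counts) >= window and all(n == 0 for n in counts[-window:])


# Invented code labels for six interviews, in the order they were held.
codes_by_interview = [
    ["cost", "time"],
    ["time", "support"],
    ["cost", "isolation"],
    ["support"],
    ["time", "cost"],
    ["isolation"],
]

counts = new_codes_per_interview(codes_by_interview)
print(counts)                   # [2, 1, 1, 0, 0, 0]
print(looks_saturated(counts))  # True
```

A falling count is only a signal. A single late interview that raises a new and significant code is a reason to keep going.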
You can start transcription as soon as you have collected data. Transcribe it word-for-word into documents, although you might be able to exclude anything clearly irrelevant to your research purpose. Warning: Some things that look irrelevant at first might appear more relevant later on when you understand the data better.
Most researchers prefer to use voice-to-text transcription software or external services to do transcriptions. A few, however, prefer to do it manually because it brings them very close to the data, even though it is horrendously time-consuming.
Start reading and re-reading all your data while it is still coming in, and make diary notes of any other questions arising. (If you transcribe manually, this will come very easily.)
When you become very familiar with your data, it might look like very little even when you have enough. Don’t worry.
You might have started collecting quotations, but now you can treat it as an extra stage. Using direct quotations from respondents in your final report has two particular benefits:
Coding is a way of simplifying and breaking down a large amount of real data into smaller, more useful pieces.
Codes have the following advantages:
If possible, start coding as soon as you have transcriptions and are familiar with the texts, while you are still collecting data. Mark all parts of the text that are relevant to your research question with a color-code or symbol. These might be “recurring patterns, terms, or visual elements.” (Naeem et al. p. 2.) On each part of the text that you marked, put a brief label of a single word or a short phrase that says what is going on. These labels are your codes.
Coding is itself part of analysis, because you are sorting raw data into structured meaning.
When you have finished coding, you will have a patchwork of the meanings of everything in your data that is relevant to your topic, and it will help you to develop theory. It is also simpler and briefer than the full text of raw data.
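If you keep your coding in a file rather than on paper, one way to picture the result is as a list of marked passages, each carrying its label. The following is a minimal Python sketch, not a feature of any particular coding package. The `CodedSegment` type, the interview identifiers, the excerpts, and the code labels are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedSegment:
    """One marked passage of a transcript and the code (label) put on it."""
    interview_id: str  # hypothetical identifier for the transcript
    text: str          # the marked passage, quoted word-for-word
    code: str          # a single word or short phrase saying what is going on


# Invented examples of marked passages; real ones come from your transcripts.
segments = [
    CodedSegment("int01", "I never knew who to ask for help.", "lack of support"),
    CodedSegment("int01", "My supervisor was always too busy.", "lack of support"),
    CodedSegment("int02", "I kept going because my family believed in me.", "family encouragement"),
]


def passages_by_code(coded):
    """Collect the marked passages under each code."""
    index = defaultdict(list)
    for segment in coded:
        index[segment.code].append(segment.text)
    return dict(index)


index = passages_by_code(segments)
for code, passages in index.items():
    print(f"{code}: {len(passages)} passage(s)")
```

Keeping the exact words next to each code also makes it easy to pull direct quotations for the final report.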
💡 It is good practice to have someone else check your coding; it will help prevent or minimize personal bias in interpreting data.
💡 The simplest way is to color-code documents by hand, usually in a word processor, but paper might be easier for some people. Although time-consuming, hand-coding is still a good option because you get a better idea of what is going on in the data as you work through the details. Otherwise, you can use software, like Zotero, which is free and online; some institutions use it as their standard method. Just check that it will do what you want for your particular research project.
⚠ Some mistakes are easy to make if you make incorrect assumptions about your respondents:
Group related codes together and represent them with a theme, that is, an overarching idea that represents what is happening. Themes are a higher level of abstraction.
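Grouping can be pictured as a simple mapping from each theme to the codes under it. The Python sketch below is illustrative only: the theme names and code labels are invented, and the helper `theme_of` is hypothetical. It lists codes that have been applied in the transcripts but not yet placed under any theme, which is a useful check that the themes cover the data.

```python
# Hypothetical mapping from each theme to the codes grouped under it.
themes = {
    "institutional barriers": ["lack of support", "unclear requirements"],
    "personal resources": ["family encouragement", "time management"],
}

# Invented codes as applied across the transcripts, one entry per marked passage.
applied_codes = [
    "lack of support", "time management", "lack of support",
    "family encouragement", "unclear requirements", "funding worries",
]


def theme_of(code, theme_map):
    """Return the theme a code belongs to, or None if it has not been grouped yet."""
    for theme, codes in theme_map.items():
        if code in codes:
            return theme
    return None


# Codes that still need a home: either an existing theme or a new one.
ungrouped = sorted({c for c in applied_codes if theme_of(c, themes) is None})
print(ungrouped)  # ['funding worries']
```

An ungrouped code is not an error. It may point to a theme that you did not anticipate.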
Do your themes accurately represent the theoretical ideas in your data and codes?
When you have created a system of themes, compare different occurrences and look for patterns in the data. By this stage, you should be able to see patterns; the sooner you spot the patterns and confirm them, the faster you make progress. You will find that you read the transcripts again and again, and become very familiar with them.
What are the relationships between codes and themes? You can use diagrams or models to represent the relationships among these concepts. (Naeem et al. p. 4.) Can you accurately define those relationships and demonstrate them from your data?
If your questions addressed your research question and purpose, you will find an answer in the data, even if it is not the answer you expected.
This approach is primarily expository:
This approach is best for making sense (i.e. creating a theory) of an unusual or counterintuitive phenomenon. However, don't treat it as a rigid set of steps that will meet all your conceptualization needs:**
How many themes?
There is no rule about specific numbers of themes. The principle is that you need enough to represent the data accurately and to help you reach sound conclusions. If the number of codes hinders and confuses your analysis, you should ask whether the number of them is the cause of the difficulty.
The data saturation level indicates that between ten and twenty themes is probably enough if your interviews are well-focussed on your topic. If a smaller number of themes accurately represents both the data and the research problem, then you might not need more.
If you have a large number of themes, some will probably have very few occurrences and they will not tend to be helpful. However, a large number of themes is not a bad thing in some circumstances. First, the research might have a problem of diversity. For example, the phenomena in your research problem might have a wide variety of causes, manifestations, or symptoms. Second, a small number of occurrences (the outliers) might be significant, and you cannot presume that the bulk of data is always the best data. For example, a treatment might be quite safe for 98% of subjects, but a 2% death rate might be unacceptably high.
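A simple tally can show which themes have very few occurrences, so that you can look at them on purpose instead of dropping them by default. The Python sketch below uses invented theme labels and counts, and the 5% cut-off is an assumption chosen only for the example. It flags rare themes for a closer reading and does not delete anything.

```python
from collections import Counter

# Invented theme labels, one entry per coded passage.
theme_per_passage = (
    ["workload"] * 12 + ["supervision"] * 9 + ["finances"] * 7 + ["health crisis"] * 1
)

counts = Counter(theme_per_passage)
total = sum(counts.values())

RARE_SHARE = 0.05  # assumed cut-off for "very few occurrences"; choose your own

for theme, n in counts.most_common():
    share = n / total
    note = "  <- rare: read these passages before deciding" if share < RARE_SHARE else ""
    print(f"{theme:15s} {n:3d} {share:6.1%}{note}")

# Themes to review by hand; a rare theme may still be the significant one.
rare_themes = [t for t, n in counts.items() if n / total < RARE_SHARE]
```

Whether a rare theme matters is a judgment about the research problem, not about the count.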
How can I code qualitative data from my interviews so that I work smarter, not harder?***
Organizing large amounts of data is possible with a computer, but it might not be the best way for everybody. Besides, if you make a mistake with a computer you might not notice it or might not be able to reverse it. Advice so far:
__________
* A mathematical proof of data saturation is unlikely because qualitative data is not appropriate for a mathematical proof.
** Ross Woods, 2020, '24, derived from Strauss and Corbin, 1990, pp. 99-107.
*** With thanks to Richard Scott Baskas, Rainee Bryant, Lynda Davis.
Muhammad Naeem, Wilson Ozuem, Kerry Howell, and Silvia Ranfagni. 2023. A Step-by-Step Process of Thematic Analysis to Develop a Conceptual Model in Qualitative Research. International Journal of Qualitative Methods 22:1–18. DOI: 10.1177/16094069231205789
Ross Woods, 2020, '24. Toolkit of research methods.
Anselm Strauss and Juliet Corbin. 1990. Basics of Qualitative Research: Grounded Theory Procedures and Techniques (Newbury Park, Ca.: Sage Publications).
Tom Granoff offered another way of looking at it. If the literature is fairly mature and you already have quite a good idea of what most of the top ten responses will be, give interviewees a checklist of all those responses and ask them to endorse those that are relevant to them. Then follow it with an open-ended question like, “Please comment on the responses that are most important to you.” This approach has several advantages:
This is better than surveys with open-ended questions where the majority of respondents gave either no answer or an answer of less than five words, which is basically useless.
With thanks to Tom Granoff. He thinks he didn't invent it, but doesn't know its origins. It might have been the use of symptom checklists as a way to do quick clinical assessments, and he adapted it to dissertation work.