A 6th response option (“I don’t know”) was added to a 5-point Likert scale. Is the data lost?

I need a little bit of help salvaging the data from a questionnaire.

One of my colleagues administered a questionnaire, but inadvertently, instead of using the original 5-point Likert scale (strongly disagree to strongly agree), he inserted a 6th answer into the scale. And, to make matters worse, the 6th response option is … “I don’t know”.

The problem is the large proportion of respondents who, at one point or another, chose “I don’t know”. If they were a reasonably small percentage, I’d have just excluded them from the dataset.
However, the core of the research rests on a conceptual model, and excluding so many records would create a problem for the model.

Could someone point me in the right direction here? Are there any ‘good practices’, or can I do anything to use (transform, convert, etc.) those “I don’t know” responses?

Also, if I do any manipulation of the data in question (i.e., if I convert the “I don’t know” responses by substitution, imputation, etc.), what kind of ‘disclaimer’, ‘warning’, or annotation should I use?

I know it is a long shot, but I confess that, besides salvaging the responses, I am also curious what the agreed practice is (if there is one) in these types of cases.

PS: I know it sounds childish, but no, the ‘colleague’ isn’t me 🙂


Why try to force a calibration onto something that is not true? As Maarten said, this is not a loss of data but a gain of information. If the magic pill you are looking for existed, it would have to rest on assumptions about your population, for example, that respondents lean toward one particular label even when they say “I don’t know”.

I totally understand your frustration, but the proper way to approach the problem is to modify the model to suit your needs based on the data as it truly exists, and not the other way around (modifying the data).

Source: Link, Question Author: streamline, Answer Author: Silverfish
