Year: 2021 | Volume: 33 | Issue: 3 | Page: 260-262
The validation of questionnaires
Praveen K Nirmalan
Department of Research, AMMA Healthcare Research Gurukul, Kochi, Kerala, India
Date of Submission: 14-Jan-2021
Date of Decision: 17-Jan-2021
Date of Acceptance: 17-Jan-2021
Date of Web Publication: 08-Dec-2021
Dr. Praveen K Nirmalan
AMMA Healthcare Research Gurukul, Kochi, Kerala
Source of Support: None, Conflict of Interest: None
We have previously described the design and development of a questionnaire. It is important to establish the validity of a questionnaire before it is administered to a population. A series of interrelated tests is required to determine the validity of a questionnaire. Questionnaires pass through an iterative process that includes the development of items, testing of validity, revision of items, revision of the conceptual basis of the test, retesting, and repetition of the process until the questionnaire is finalized. The process of validation continues for as long as the questionnaire is in use. In this manuscript, we briefly describe the conceptual basis of the different methods that are used to establish the validity of questionnaires and to decide on item retention and exclusion.
Keywords: Internal consistency, item response theory, questionnaire, validation
How to cite this article:
Nirmalan PK. The validation of questionnaires. Kerala J Ophthalmol 2021;33:260-2
Introduction
Validity refers to the interpretations of measurements and is fundamental to the development and interpretation of tests. A questionnaire (or scale or test) has a specific purpose. The interpretations of the scores or results of the questionnaire must relate to that purpose. Validity provides a quantitative description of the support that evidence and theory provide for the interpretation of test scores.
A scale or questionnaire generally follows one of three models. The most common is a quantitative model that uses the level or degree of the target construct to differentiate individuals. The other models are class models, which categorize individuals into qualitatively different groups, and the more complex dynamic models. Structural validity implies that the internal structure of the questionnaire parallels the external structure of the target condition and reflects the underlying variance. Structural validity includes empirical assessments based on nontest parameters, tests for internal consistency focused on inter-item and item-total measures, and item response theory (IRT) assessments based on latent traits. Empirical assessments are useful at the stage of questionnaire development, although these days they are more often used in external validation. Empirical assessments include the administration of the item pool to both a clinical and a community sample. The differences in mean item scores between the two samples are then used in conjunction with other parameters to determine inclusion or exclusion of an item.
Internal consistency is widely used in the development of a questionnaire. Item-total correlations are generally used to determine whether an item should be retained or excluded and are suitable when developing a single scale. However, an exploratory factor analysis (EFA) is the preferable option if the questionnaire has hierarchical or multiple constructs. The EFA helps to identify the underlying dimensions (unidimensional or multidimensional) that are subsequently used for scale construction. The initial step is a principal factor analysis or a principal component analysis. Items that have factor loadings below 0.35–0.40 on the target factor and items that have similar or stronger loadings on other factors are eliminated from the item pool. Confirmatory factor analysis is then used to further explore the structural validity of the questionnaire. The loadings of items in the factor analysis are used to explore the redundancy or similarity of items and the correlation of the items with the theoretical basis of the questionnaire and with other items in the questionnaire.
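The item-total step described above can be sketched in a few lines. This is a minimal illustration and not a procedure taken from the cited sources: it assumes the "corrected" form of the item-total correlation (each item is correlated with the sum of the remaining items), and the function names and the response data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def corrected_item_total(responses):
    """Correlate each item with the sum of the *other* items.

    responses: list of respondents, each a list of item scores.
    Returns one correlation per item.
    """
    n_items = len(responses[0])
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total without item j
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical data: 6 respondents x 4 items on a 1-5 response scale.
# Items 1-3 move together; item 4 does not follow the same pattern.
responses = [
    [1, 2, 1, 4],
    [2, 2, 3, 1],
    [3, 3, 3, 5],
    [4, 4, 5, 2],
    [5, 4, 4, 3],
    [5, 5, 5, 3],
]
item_total = corrected_item_total(responses)
for number, r in enumerate(item_total, start=1):
    print(f"item {number}: corrected item-total r = {r:.2f}")
```

In this made-up sample, the fourth item correlates weakly (in fact negatively) with the rest of the scale and would be a candidate for exclusion, subject to the other criteria discussed in the text.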
IRT is another method that can be used after EFA to assess structural integrity, especially in short-form questionnaires. IRT presumes that each item response reflects an underlying construct. IRT also presumes that an item characteristic curve can describe the item–trait relationship as a monotonically increasing function. IRT tries to identify the specific items that provide maximum information for each individual based on that individual's level of the underlying dimension. IRT considers an item optimal for a respondent who has a 50% probability of responding correctly to it. IRT is also used to estimate item difficulty and item discrimination. The ability to estimate an individual's trait level without administering a fixed set of items is a major advantage of IRT. This allows the development of computerized adaptive tests (CATs) that can be scaled up in difficulty based on the underlying capability of the person and that use a subset of items that are maximally informative for each person. CATs are efficient and can provide comparable trait-level information using fewer items than a conventional questionnaire.
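The relationship between the item characteristic curve, the 50% response probability, and item information can be illustrated with a short sketch. The text does not name a specific IRT model; the two-parameter logistic (2PL) model is assumed here purely for illustration, and the parameter values (discrimination a, difficulty b) are hypothetical.

```python
from math import exp

def icc_2pl(theta, a, b):
    """Probability of a correct (or endorsed) response at trait level theta.

    Assumes a two-parameter logistic model:
    a = item discrimination, b = item difficulty.
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Information the item provides at theta under the 2PL model."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item parameters.
a, b = 1.5, 0.5

# Trait levels from -3.0 to 3.0 in steps of 0.5.
thetas = [-3 + 0.5 * i for i in range(13)]
probabilities = [icc_2pl(t, a, b) for t in thetas]

# The trait level (on this grid) at which the item is most informative.
best_theta = max(thetas, key=lambda t: item_information(t, a, b))

print(f"P(correct) at theta = b: {icc_2pl(b, a, b):.2f}")
print(f"most informative at theta = {best_theta}")
```

Under these assumptions the curve rises monotonically with the trait level, and the item is most informative for respondents whose trait level equals the item difficulty, which is exactly where the response probability is 50%.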
Evaluation of the psychometric properties
The development of the questionnaire may lead to a revision of the theoretical concept underlying the questionnaire, just as individual items are revised. An initial step is to examine the response distributions of individual items. Items that have a highly skewed distribution (floor and ceiling effects) or questions where most respondents provide a similar response must be considered for elimination. These unbalanced questions provide little information, correlate weakly with other items, and lead to highly unstable correlational results.
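A simple screen of response distributions might look like the sketch below. The text does not specify how skewed a distribution must be before an item is considered for elimination, so the 80% cutoff, the function names, and the data are all hypothetical.

```python
from collections import Counter

def response_shares(item_scores, scale_min, scale_max):
    """Share of responses at the floor, at the ceiling, and in the modal category."""
    counts = Counter(item_scores)
    n = len(item_scores)
    return {
        "floor": counts[scale_min] / n,
        "ceiling": counts[scale_max] / n,
        "modal": max(counts.values()) / n,
    }

def flag_unbalanced(item_scores, scale_min, scale_max, cutoff=0.8):
    """Flag an item when one response option attracts most respondents.

    The cutoff of 0.8 is an illustrative assumption, not a published threshold.
    """
    return response_shares(item_scores, scale_min, scale_max)["modal"] >= cutoff

# Hypothetical responses from 10 people on a 1-5 scale.
balanced_item = [1, 2, 3, 4, 5, 3, 2, 4, 3, 3]
ceiling_item = [5, 5, 5, 5, 5, 5, 5, 5, 4, 5]

for name, scores in [("balanced", balanced_item), ("ceiling", ceiling_item)]:
    shares = response_shares(scores, 1, 5)
    print(name, shares, "flagged:", flag_unbalanced(scores, 1, 5))
```

A flagged item is only a candidate for elimination; the decision should also take into account the item's content and its relation to the target construct.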
The next step is to determine the items to eliminate or retain in the questionnaire. Each item in the questionnaire should measure only one thing. However, we must distinguish between internal consistency and unidimensionality. Internal consistency describes the intercorrelations between the items of a scale. Cronbach's coefficient alpha is commonly used to determine internal consistency, with a value of ≥0.80 suggesting acceptable internal consistency. Unidimensionality indicates whether the items in a scale assess a single underlying factor and is, therefore, a better measure of the validity of a questionnaire. The often-used Cronbach's coefficient alpha is of limited use in the determination of unidimensionality. Cronbach's coefficient alpha is also a function of the scale length and the average inter-item correlation (AIC) and can therefore be an imperfect measure of internal consistency. A few highly correlated items, many moderately correlated items, and various combinations of scale length and AIC can all yield a high alpha and therefore a misleading measure of internal consistency. Cronbach's alpha is of little use as an index of internal consistency if the questionnaire has more than 40 items.
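The dependence of alpha on scale length and the AIC can be made concrete with the standardized form of the coefficient, alpha = k·r / (1 + (k − 1)·r), where k is the number of items and r is the AIC. The sketch below uses this standardized form as a simplifying assumption (it treats all items as having equal variance); the scale lengths and correlations are hypothetical.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized coefficient alpha from scale length and the AIC.

    alpha = k * r / (1 + (k - 1) * r)
    """
    return n_items * mean_inter_item_r / (1 + (n_items - 1) * mean_inter_item_r)

# A short scale with strongly related items...
short_scale = standardized_alpha(n_items=5, mean_inter_item_r=0.50)

# ...and a long scale with weakly related items.
long_scale = standardized_alpha(n_items=40, mean_inter_item_r=0.10)

print(f"5 items,  AIC 0.50: alpha = {short_scale:.2f}")
print(f"40 items, AIC 0.10: alpha = {long_scale:.2f}")
```

Both hypothetical scales clear the 0.80 mark even though the items of the 40-item scale are only weakly related to one another, which is why alpha alone says little about internal consistency in long questionnaires.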
An AIC that falls between 0.15 and 0.50 is considered a better measure than Cronbach's coefficient alpha. However, the AIC alone cannot establish the unidimensionality of the questionnaire. An acceptable AIC can be obtained by averaging many higher coefficients with many lower ones. It is, therefore, necessary to examine the range and distribution of these correlations and not to focus only on the AIC. Thus, both the AIC and the majority of the inter-item correlations should range between 0.15 and 0.50 to ensure unidimensionality.
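The point that an acceptable AIC can hide a poor distribution of correlations is easy to demonstrate. The sketch below summarizes a hypothetical inter-item correlation matrix for four items that form two separate pairs; the matrix and the function name are illustrative assumptions, and the 0.15–0.50 bounds are the ones given in the text.

```python
def summarize_inter_item(corr, low=0.15, high=0.50):
    """Summarize the off-diagonal entries of a symmetric correlation matrix.

    Returns the AIC, the smallest and largest inter-item correlation,
    and the share of correlations that fall inside [low, high].
    """
    n = len(corr)
    pairs = [corr[i][j] for i in range(n) for j in range(i + 1, n)]
    aic = sum(pairs) / len(pairs)
    share_in_range = sum(low <= r <= high for r in pairs) / len(pairs)
    return aic, min(pairs), max(pairs), share_in_range

# Hypothetical matrix: items 1-2 and items 3-4 correlate strongly within
# each pair (0.70) but hardly at all across pairs (0.05).
corr = [
    [1.00, 0.70, 0.05, 0.05],
    [0.70, 1.00, 0.05, 0.05],
    [0.05, 0.05, 1.00, 0.70],
    [0.05, 0.05, 0.70, 1.00],
]
aic, lowest, highest, share_in_range = summarize_inter_item(corr)
print(f"AIC = {aic:.2f}, range = {lowest:.2f} to {highest:.2f}, "
      f"share within 0.15-0.50 = {share_in_range:.0%}")
```

Here the AIC falls inside the recommended band, yet none of the individual correlations do, a pattern that points to two dimensions and not one.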
The process of external validation of a questionnaire continues for as long as that questionnaire is in use. Clear conceptualization of the theory, development of specific and relevant item pools, and assessment of convergent and discriminant validity during scale development help to clarify what the questionnaire measures and what it does not.
Convergent validity is examined by assessing the relationships between indicators of the same construct. Discriminant validity examines the relation of a measure with indicators of other constructs and looks to establish that highly correlated constructs within hierarchical models are empirically distinct from one another. Discriminant validity is established by showing that convergent correlations are significantly higher than discriminant coefficients. Criterion validity is shown through the significant relation of a test with theoretically relevant nontest outcomes (e.g., clinical diagnoses and arrest records). Incremental validity demonstrates that the measure adds significantly to the prediction of a criterion over and above what can be predicted by other sources of data.
Conclusion
The design, development, and validation of a questionnaire involve several stages of testing and revision to establish validity. Changes to the structure of the questionnaire, such as the addition or deletion of questions, rewording of questions, and changes in response scales, affect the validity of the questionnaire and the interpretation of results. Questionnaires are developed for specific contexts in specific populations. It is important to retest the validity of a questionnaire when it is applied in a new context or a different population or when a translated version of the questionnaire is used.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Clark LA, Watson D. Constructing validity: New developments in creating objective measuring instruments. Psychol Assess 2019;31:1412-27.
Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychol Bull 1955;52:281-302.
American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing (AERA, APA, & NCME). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 2014.
Loevinger J. Objective tests as instruments of psychological theory. Psychol Rep 1957;3:635-94.
Haslam N, Holland E, Kuppens P. Categories versus dimensions in personality and psychopathology: A quantitative review of taxometric research. Psychol Med 2012;42:903-20.
Markon KE, Chmielewski M, Miller CJ. The reliability and validity of discrete and continuous measures of psychopathology: A quantitative review. Psychol Bull 2011;137:856-79.
Messick S. Standards of validity and the validity of standards in performance assessment. Educ Meas Issues Pract 1995;14:5-8.
Meehl PE. The dynamics of “structured” personality tests. J Clin Psychol 1945;1:296-303.
Clark LA, Watson DB. Constructing validity: Basic issues in objective scale development. Psychol Assess 1995;7:309-19.
Simms LJ, Watson D. The construct validation approach to personality scale construction. In: Robins RW, Fraley RC, Krueger RF, editors. Handbook of Research Methods in Personality Psychology. New York: Guilford Press; 2007. p. 240-58.
Comrey AL. Factor-analytic methods of scale development in personality and clinical psychology. J Consult Clin Psychol 1988;56:754-61.
Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ. Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods 1999;4:272-99.
Russell DW. In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personal Soc Psychol Bull 2002;28:1629-46.
Watson D. Objective tests as instruments of psychological theory and research. In: Cooper H, editor. Handbook of Research Methods in Psychology. Foundations, Planning, Measures, and Psychometrics. Vol. 1. Washington, DC: American Psychological Association; 2012. p. 349-69.
Reise SP, Ainsworth AT, Haviland MG. Item response theory: Fundamentals, applications, and promise in psychological research. Curr Dir Psychol Sci 2005;14:95-101.
Reise SP, Waller NG. Item response theory and clinical measurement. Ann Rev Clin Psychol 2009;5:27-48.
Rudick MM, Yam WH, Simms LJ. Comparing countdown- and IRT-based approaches to computerized adaptive personality testing. Psychol Assess 2013;25:769-79.
Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297-334.
Nunnally JC. Psychometric Theory. 2nd ed. New York: McGraw-Hill; 1978.
Streiner DL. Starting at the beginning: An introduction to coefficient alpha and internal consistency. J Pers Assess 2003;80:99-103.
Briggs SR, Cheek JM. The role of factor analysis in the development and evaluation of personality scales. J Pers 1986;54:106-48.
Cortina JM. What is coefficient alpha? An examination of theory and applications. J Appl Psychol 1993;78:98-104.
Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull 1959;56:81-105.