Traditional scores versus IRT estimates on forced-choice tests based on a dominance model

  1. Pedro M. Hontangas 1
  2. Iwin Leenen 2
  3. Jimmy de la Torre 3
  4. Vicente Ponsoda 4
  5. Daniel Morillo 4
  6. Francisco J. Abad 4
  1. 1 Universitat de València, Valencia, Spain (ROR: https://ror.org/043nxc105)
  2. 2 Universidad Nacional Autónoma de México, Mexico City, Mexico (ROR: https://ror.org/01tmp8f25)
  3. 3 Rutgers, The State University of New Jersey, USA
  4. 4 Universidad Autónoma de Madrid, Madrid, Spain (ROR: https://ror.org/01cby8j38)

Journal: Psicothema

ISSN: 0214-9915

Year of publication: 2016

Volume: 28

Issue: 1

Pages: 76-82

Type: Article

Abstract

Background: Forced-choice tests (FCTs) were proposed to minimize the response biases associated with Likert-format items. It remains unclear whether scores based on traditional methods for scoring FCTs are appropriate for between-subjects comparisons. Recently, Hontangas et al. (2015) explored the extent to which traditional scoring of FCTs relates to true scores and IRT estimates. The authors found certain conditions under which traditional scores (TS) can be used with FCTs when the underlying IRT model was an unfolding model. In this study, we examine to what extent those results are preserved when the underlying process follows a dominance model. Method: The independent variables analyzed in a simulation study are: forced-choice format, number of blocks, discrimination of items, polarity of items, variability of intra-block difficulty, range of difficulty, and correlation between dimensions. Results: A similar pattern of results was observed for both models; however, the correlations between TS and true thetas are higher, and TS and IRT estimates are less discrepant, when a dominance model is involved. Conclusions: A dominance model produces a linear relationship between TS and true scores, and subjects with extreme thetas are better measured.
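
To make the simulated response process concrete, the sketch below (illustrative only, not the authors' code) implements a dominance variant under stated assumptions: each block pairs two items measuring different dimensions, the probability of preferring one item over the other follows the MUPP-type formulation of Stark, Chernyshenko, and Drasgow (2005) with 2PL response functions, and TS are obtained by awarding a point to the dimension of the preferred item. The numeric settings (number of persons and blocks, parameter ranges, latent correlation) are arbitrary placeholders for the design factors listed above, and the two-dimension, all-positively-keyed setup is a deliberate simplification.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL item response function (a dominance model)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def p_prefer_first(t1, t2, a1, b1, a2, b2):
    """Probability of preferring item 1 over item 2 within a block
    (MUPP-type formulation with 2PL response functions)."""
    p1, p2 = p_2pl(t1, a1, b1), p_2pl(t2, a2, b2)
    return p1 * (1 - p2) / (p1 * (1 - p2) + p2 * (1 - p1))

rng = np.random.default_rng(seed=1)
n_persons, n_blocks, rho = 1000, 30, 0.25   # placeholder design values

# True thetas on two correlated dimensions
theta = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n_persons)

# Item parameters: each block pairs one item per dimension, all positively keyed
a = rng.uniform(0.5, 2.0, size=(n_blocks, 2))   # discriminations
b = rng.uniform(-2.0, 2.0, size=(n_blocks, 2))  # difficulties

ts = np.zeros((n_persons, 2))  # traditional scores per dimension
for j in range(n_blocks):
    p = p_prefer_first(theta[:, 0], theta[:, 1],
                       a[j, 0], b[j, 0], a[j, 1], b[j, 1])
    first = rng.random(n_persons) < p
    ts[:, 0] += first    # point to dimension 1 when its item is preferred
    ts[:, 1] += ~first   # point to dimension 2 otherwise

# Correlation between TS and true thetas, the core comparison in the Results
for d in range(2):
    r = np.corrcoef(ts[:, d], theta[:, d])[0, 1]
    print(f"dimension {d + 1}: corr(TS, true theta) = {r:.3f}")
```

Note that with only two dimensions the two TS sum to the number of blocks, so they are fully ipsative; this dependency is precisely the problem with traditional FCT scoring that motivates IRT estimation (Clemans, 1966; Brown & Maydeu-Olivares, 2013).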

Bibliographic References

  • Baron, H. (1996). Strengths and limitations of ipsative measurement. Journal of Occupational and Organizational Psychology, 69, 49-56.
  • Bartram, D. (2007). Increasing validity with forced-choice criterion measurement formats. International Journal of Selection and Assessment, 15, 263-272.
  • Birnbaum, A. (1968). Some latent trait models. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
  • Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431-444.
  • Brown, A., & Bartram, D. (2009). Doing less but getting more: Improving forced-choice measures with IRT. Paper presented at the 24th annual conference of the Society for Industrial and Organizational Psychology, New Orleans, LA.
  • Brown, A., & Maydeu-Olivares, A. (2010). Issues that should not be overlooked in the dominance versus ideal point controversy. Industrial and Organizational Psychology, 3(4), 489-493.
  • Brown, A., & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71, 460-502.
  • Brown, A., & Maydeu-Olivares, A. (2013). How IRT can solve problems of ipsative data in forced-choice questionnaires. Psychological Methods, 18, 36-52.
  • Carvalho, F., Filho, A., Pessotto, F., & Bortolotti, S. (2014). Application of the unfolding model to the aggression dimension of the Dimensional Clinical Personality Inventory (IDCP). Revista Colombiana de Psicología, 23, 339-349.
  • Chernyshenko, O. S., Stark, S., Drasgow, F., & Roberts, B. W. (2007). Constructing personality scales under the assumptions of an ideal point response process: Toward increasing the flexibility of personality measures. Psychological Assessment, 19(1), 88-106.
  • Chernyshenko, O. S., Stark, S., Chan, K. Y., Drasgow, F., & Williams, B. (2001). Fitting item response theory models to two personality inventories: Issues and insights. Multivariate Behavioral Research, 36(4), 523-562.
  • Cheung, M. W.-L., & Chan, W. (2002). Reducing uniform response bias with ipsative measurement in multiple-group confirmatory factor analysis. Structural Equation Modeling, 9, 55-77.
  • Christiansen, N. D., Burns, G. N., & Montgomery, G. E. (2005). Reconsidering forced-choice item formats for applicant personality assessment. Human Performance, 18, 267-307.
  • Clemans, W. V. (1966). An analytical and empirical examination of some properties of ipsative measures (Psychometrika Monograph No. 14). Richmond, VA: Psychometric Society.
  • Closs, S. J. (1996). On the factoring and interpretation of ipsative data. Journal of Occupational and Organizational Psychology, 69, 41-47.
  • de la Torre, J., Ponsoda, V., Leenen, I., & Hontangas, P. (2011). Some extensions of the multi-unidimensional pairwise preference model. Paper presented at the 26th annual meeting of the Society for Industrial and Organizational Psychology, Chicago, IL.
  • Drasgow, F. L., Chernyshenko, O. S., & Stark, S. (2010). 75 years after Likert: Thurstone was right! Industrial and Organizational Psychology, 3, 465-476.
  • Harris, D. (1989). Comparison of 1-, 2-, and 3-parameter IRT models. Educational Measurement: Issues and Practice, Spring, 34-41.
  • Heggestad, E. D., Morrison, M., Reeve, C. L., & McCloy, R. A. (2006). Forced-choice assessments of personality for selection: Evaluating issues of normative assessment and faking resistance. Journal of Applied Psychology, 91, 9-24.
  • Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative measures. Psychological Bulletin, 74, 167-184.
  • Hirsh, J. B., & Peterson, J. B. (2008). Predicting creativity and academic success with a “Fake-Proof” measure of the Big Five. Journal of Research in Personality, 42, 1323-1333.
  • Hontangas, P. M., de la Torre, J., Ponsoda, V., Leenen, I., Morillo, D., & Abad, F. J. (2015). Comparing traditional and IRT scoring of forced-choice tests. Applied Psychological Measurement. Advance online publication. doi:10.1177/0146621615585851
  • Huang, J., & Mead, A. D. (2014). Effect of personality item writing on psychometric properties of ideal-point and Likert scales. Psychological Assessment, 26, 1162-1172.
  • Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The impact of faking on employment tests: Does forced choice offer a solution? Human Performance, 13, 371-388.
  • Johnson, C. E., Wood, R., & Blinkhorn, S. F. (1988). Spuriouser and spuriouser: The use of ipsative personality tests. Journal of Occupational Psychology, 61, 153-162.
  • Kim, J. K., & Nicewander, A. (1993). Ability estimation for conventional tests. Psychometrika, 58, 587-599.
  • Liao, C., & Mead, A. D. (2009, April). Fit of ideal-point and dominance IRT models to simulated data. Paper presented at the 24th annual meeting of the Society for Industrial and Organizational Psychology, New Orleans, LA.
  • Matthews, G., & Oddy, K. (1997). Ipsative and normative scales in adjectival measurement of personality: Problems of bias and discrepancy. International Journal of Selection and Assessment, 5, 169-182.
  • McCloy, R. A., Heggestad, E. D., & Reeve, C. L. (2005). A silk purse from the sow’s ear: Retrieving normative information from multidimensional forced-choice items. Organizational Research Methods, 8, 222-248.
  • Meade, A. W. (2004). Psychometric problems and issues involved with creating and using Ipsative measures for selection. Journal of Occupational and Organizational Psychology, 77, 531-552.
  • Morillo, D., Leenen, I., Abad, F. J., Hontangas, P. M., de la Torre, J., & Ponsoda, V. (2015, submitted). A dominance variant under the multi-unidimensional pairwise-preference framework: Model formulation and Markov chain Monte Carlo estimation. Applied Psychological Measurement.
  • Oswald, F. L., & Schell, K. L. (2010). Developing and scaling personality measures: Thurstone was right - but so far, Likert was not wrong. Industrial and Organizational Psychology: Perspectives on Science and Practice, 3, 481-484.
  • Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000). A general item response model for unfolding unidimensional polytomous responses. Applied Psychological Measurement, 24, 3-32.
  • Saville, P., & Willson, E. (1991). The reliability and validity of normative and ipsative approaches in the measurement of personality. Journal of Occupational Psychology, 64, 219-238.
  • Scherbaum, C. A., Finlinson, S., Barden, K., & Tamanini, K. (2006). Applications of item response theory to measurement issues in leadership research. The Leadership Quarterly, 17, 366-386.
  • Stark, S., Chernyshenko, O. S., & Drasgow, F. (2005). An IRT approach to constructing and scoring pairwise preference items involving stimuli on different dimensions: The multi-unidimensional pairwise-preference model. Applied Psychological Measurement, 29, 184-203.
  • Stark, S., Chernyshenko, O. S., Drasgow, F., & Williams, B. A. (2006). Examining assumptions about item responding in personality assessment: Should ideal point methods be considered for scale development and scoring? Journal of Applied Psychology, 91(1), 25-39.
  • Tay, L., Ali, U. S., Drasgow, F., & Williams, B. (2011). Fitting IRT models to dichotomous and polytomous data: Assessing the relative model-data fit of ideal point and dominance models. Applied Psychological Measurement, 35, 280-295.
  • Tay, L., Drasgow, F., Rounds, J., & Williams, B. (2009). Fitting measurement models to vocational interest data: Are dominance models ideal? Journal of Applied Psychology, 94, 1287-1304.
  • van Eijnatten, F. M., van der Ark, L. A., & Holloway, S. S. (2015). Ipsative measurement and the analysis of organizational values: An alternative approach for data analysis. Quality and Quantity: International Journal of Methodology, 49, 559-579.
  • Weekers, M. A., & Meijer, R. R. (2008). Scaling response processes on personality items using unfolding and dominance models: An illustration with a Dutch personality inventory. European Journal of Psychological Assessment, 24, 65-77.