A Comparison of Traditional and Non-traditional True-False Measures in a Business Task

Lori S. Kopp; Dr. Richard Perlow

doi:10.47742/ijbssr.v7n4p1

Authors

Lori S. Kopp, University of Lethbridge
Dr. Richard Perlow, MacEwan University

Keywords:

Business simulation, measurement, scoring formulas, evaluation methods, confidence testing

Abstract

Beliefs regarding the usefulness of true-false tests are mixed. Many of these opinions stem from research comparing true-false test performance to that on traditional paper-and-pencil tests. However, little is known about how true-false test scores relate to performance measures requiring knowledge application, or whether different scoring algorithms vary in their ability to predict such performance. To address these gaps, we examined the relationships between traditional and modified true-false scoring methods and outcomes on a business simulation designed to assess complex knowledge application. Our results showed that posttest true-false scores were associated with simulation performance, with the gap between high and low scorers widening over time. Scoring formats that incorporated confidence ratings demonstrated higher reliability and predictive power, but were not substantially more correlated with performance than traditional methods. These findings suggest that true-false tests can serve as effective measures of performance on complex tasks.

Author Biographies

Lori S. Kopp, University of Lethbridge

Lori S. Kopp is an Associate Professor in the Dhillon School of Business at the University of Lethbridge.

Dr. Richard Perlow, MacEwan University

Dr. Richard Perlow has over two decades of experience in higher education. Before joining MacEwan as the Dean of the School of Business, Dr. Perlow worked at the University of Lethbridge in the Dhillon School of Business where he taught courses in human resource management and organizational behaviour. From 2006 to 2015 he served as the School’s associate dean. Dr. Perlow has also held appointments at the University of Manitoba, Clemson University, and Auburn University.
