Psychometric Tests: A Revolution in Measuring Human Behaviour

Can we really measure behaviour?

Human beings are deeply complex. We display a multitude of different behaviours and emotions and might react to the same set of circumstances in completely different ways. We have different approaches to the opportunities and difficulties life brings and differing perceptions of what is important to us. Have you ever wondered why a colleague might have a calm, relaxed, confident temperament under normal working conditions, but can become impatient and abrupt – or introverted – under high pressure? And what about friends who thrive on high-octane pursuits like diving, trekking across deserts or skydiving, whilst others would not dream of taking such risks, preferring to keep their feet on the ground in less extreme conditions? We all have different talents and skills and some of us are able to reach the very top of our game. Think elite athletes, scientists, business leaders and entrepreneurs.

Philosophers and psychologists have been debating why this is the case for centuries. In the late 1800s, there was a breakthrough that revolutionised our ability to learn about human behaviour and apply this knowledge to different settings. Charles Darwin, in his seminal work “On the Origin of Species”, theorised that traits could be passed down through family bloodlines. This also assumed that traits differed between individuals. Darwin’s cousin, Francis Galton, drew on this theory to focus on the differences in human “mental capacity”, or intelligence, and in so doing developed the first method of mathematically measuring human behaviour, coining the word “psychometrics” and paving the way for others to develop his ideas further.

The psychometric test as we know it today is used to test ongoing theories of individual differences, the most widely researched being intelligence and personality; to investigate associations between personal characteristics and other factors; and, importantly for organisational settings, to apply this knowledge to solve human-centred business problems.

What is a Psychometric Test?

The aim of a psychometric test is to build a representation of an individual’s aptitude or personality in a particular context. It may require the candidate to undertake short, timed tasks such as picture completion, letter-number sequencing or interpretation of visual puzzles. More commonly in personality tests, it asks the individual to self-report the extent to which they agree with a particular statement, or how often they exhibit certain behaviours, selecting an answer on a scale running, for example, from “strongly disagree” to “strongly agree”. One might think that individual judgements on personal attributes would not give a true picture of someone’s personality but, in fact, evidence shows people are reliable judges of their own personality when compared with judgements from others.

A brief history

It’s worth touching on the origins of the psychometric test. Before the 20th century, there was intense debate as to whether the study of human behaviour could be a reputable science. How do we measure something so intangible? Both Galton’s breakthrough and Charles Spearman’s development of factor analysis, which enabled mathematical calculation of how variables (for instance a set of traits such as “talkative”, “gregarious” and “outgoing”) were related to each other, were revolutionary. They led to an understanding that behaviour can be inferred by asking individuals the right set of questions and systematically evidencing the outcomes mathematically.

Intelligence

There is no question of the value that Galton contributed to Psychology as the forefather of psychometrics and individual differences. However, aside from his mathematical discoveries, his version of an intelligence test was deemed to be quite crude. Galton’s theories were developed further by Alfred Binet and Theodore Simon in 1905 to identify learning disabilities in children, and a later version, the Stanford-Binet test, adapted by Lewis Terman, is still used today to assess developmental difficulties. The test was able to benchmark expected intelligence scores for different ages. These tests were developed further during the World Wars, as part of the recruitment process, to determine whether soldiers would develop shell shock, known today as PTSD. During World War II, psychometric tests were used to establish which soldiers were eligible for officer ranks. In terms of intelligence, the test most used today was developed by David Wechsler in the 1950s. The Wechsler Adult Intelligence Scale (WAIS) and the Wechsler Intelligence Scale for Children (WISC) combine scores from four intelligence categories (verbal comprehension, perceptual reasoning, working memory and processing speed) into an overall score which is compared against the norm for a particular age group.

The development of cognitive ability measures has enabled investigation into which work outcomes are predicted by intelligence. This matters when we look at evidence such as Price’s Law (1963), which states that the distribution of productivity across a workforce is uneven: roughly half of the output comes from a number of people equal to the square root of the workforce. In simple terms, in a workforce of 100, 10 people will produce the same volume of work as the other 90. When we overlay financial consequences, the effects are quite staggering, i.e. the more productive your workforce, the more profitable your business will be. It is worth knowing, then, that one of the best-known associations with strong cognitive ability is high job performance. In fact, individual differences in cognitive ability have been shown to account for approximately 25 per cent of the variability in performance across various organisations.
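
As a rough illustration of the arithmetic behind these two figures, the sketch below assumes Price’s Law in its usual form (about half of the output comes from the square root of the number of people) and assumes the 25 per cent figure is a squared correlation, which would correspond to a correlation of about 0.5 between cognitive ability and performance. The function names are hypothetical and chosen for this example only.

```python
import math


def price_law_core(workforce_size: int) -> int:
    """People who, under Price's Law, produce roughly half of the output.

    Assumption: the law is taken in its common form, core = sqrt(N).
    """
    return round(math.sqrt(workforce_size))


def variance_explained(r: float) -> float:
    """Share of variance in one variable accounted for by another.

    For a simple linear association this is the squared correlation.
    """
    return r ** 2


# How the "productive core" shrinks as a share of a growing workforce.
for n in (25, 100, 10_000):
    core = price_law_core(n)
    print(f"workforce={n}, core={core}, share={core / n:.0%}")

# A correlation of 0.5 (an assumed, illustrative value) explains 25% of variance.
print(f"variance explained at r=0.5: {variance_explained(0.5):.0%}")
```

Under these assumptions the “10 per cent” figure holds for a workforce of 100; in larger workforces the productive core makes up a smaller share of the total.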

Emotional Intelligence

Despite the popularity and apparent integrity of the WAIS and WISC, there have been challenges to pooling multiple abilities under one “general intelligence” umbrella. Emotional Intelligence (EI) has been proposed as a discrete construct, separate from general intelligence. It can be defined as “the ability to perceive, understand and manage the emotions in the self and others”. Given that EI is the zeitgeist of the moment, it may come as a surprise that there is still sparse evidence about how to measure it reliably. Despite this, there are two better-known EI psychometric tests. Mayer, Salovey and Caruso’s MSCEIT takes an “ability” view (we choose when, or if, to apply an “ability”) and requires individuals to respond to questions representing four dimensions of EI, such as “which mood is helpful in this situation” or “how well would these actions preserve mood”. Petrides and Furnham’s TEIQue takes a trait view, asking participants to respond on a scale about the extent to which a number of statements are true, such as “I usually find it difficult to regulate my emotions” or “I can deal effectively with people”. Various industries have developed related tests for use in work settings, particularly to assess EI traits in leaders and managers. From a scientific perspective, many would argue these should be applied with caution!

Personality

Gordon Allport defined personality as “a dynamic organisation, inside the person, of psychophysical systems that create the person’s characteristic patterns of behaviour, thoughts and feelings”. How’s that for a definition! Suffice to say, it’s a complex construct to understand and measure, and there have been multiple theories about what personality is and where it originates. The dominant view is the trait approach, which assumes human beings display personality “traits”: an individual’s enduring internal dispositions to “think, feel, and behave in certain ways” (McCrae & Costa, 2005). They reveal themselves in our behaviour depending on the situation. For example, an individual may be more confident expressing their views at a work social event than in a team meeting, or presenting in front of strangers rather than their peers. From a work perspective, these assessments are used across industries to reduce a large pool of applicants to those most suitable for the position, to find the best fit of an applicant to a particular role and to predict other behaviour, such as how engaged employees are at work. The approach that has been most researched is the Big Five Model.

The Big Five Model of personality

Influences on trait personality development stem from Galton’s “Lexical Hypothesis”, the idea that the basis of personality traits is encoded in language. Allport and Odbert (1936) responded by using the dictionary to extract all personality-related words (18,000 in total!). Raymond Cattell later drew on the study of intelligence to simplify Allport and Odbert’s work, using factor analysis to determine which “personality” words were related. He came up with 12 dimensions of personality. In parallel, Hans Eysenck determined there were three, drawing on his work using personality tests to recruit soldiers in World War II. Further analysis, challenge and refinement were carried out by others. Tupes and Christal undertook US Air Force studies between 1954 and 1961, evidencing the generalisability of five representative factors across disparate groups of people. More recently, McCrae and Costa have contributed to the general consensus (acknowledging criticism by some) on the reliability and validity of the Big Five Factor model, in which an individual’s profile captures their position on each of the dimensions of Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism (OCEAN). This model can be used to predict a plethora of work-related outcomes, such as which personality traits are better suited to certain jobs, team performance, burnout, work engagement and leadership behaviour.

Choosing the right psychometrics

Critical to the integrity of psychological testing are two parameters: reliability and validity. Reliability concerns whether test scores are stable over time and internally consistent. Given we expect personality traits to endure through our lives, a test taken under the same conditions by a given individual should produce approximately the same results at different time points. Additionally, within the Big Five model, there will be multiple statements for each personality dimension (a scale). We can mathematically check that all these statements are testing the same dimension.
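
The two checks described above can be sketched in a few lines. This is a minimal illustration, not a description of any particular test publisher’s method: it assumes stability over time is summarised with a Pearson correlation between two sittings (test-retest reliability) and that internal consistency is summarised with Cronbach’s alpha, one commonly used statistic. The response data are invented for the example.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(responses):
    """Internal consistency of a scale.

    responses: one row per respondent, one column per statement (item).
    """
    k = len(responses[0])                      # number of items in the scale
    items = list(zip(*responses))              # transpose to one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical answers from five people to four statements on one scale
# (1 = strongly disagree, 5 = strongly agree).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

# Hypothetical scale totals for the same five people at two time points.
time_1 = [sum(row) for row in responses]
time_2 = [17, 10, 13, 18, 7]

alpha = cronbach_alpha(responses)
retest = pearson_r(time_1, time_2)
print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Test-retest correlation: {retest:.2f}")
```

Values close to 1 on both statistics suggest the statements hang together as one scale and that scores are stable between sittings. What counts as “high enough” depends on the purpose of the test.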

A valid test is one that measures what it sets out to measure, for instance Extraversion in a personality context. It is also critical that the measures used have “predictive” validity, i.e. we must know that measures such as personality are associated with job outcomes (e.g. job performance, “fit” of an individual to their work environment, organisational commitment) in order to make accurate decisions about employees, for instance as part of a recruitment process.

Over the past century there have been challenges to tests that have been, and still are, widely used but do not meet one or both of the reliability and validity criteria. One example is Hermann Rorschach’s ink blot test. Influenced by Sigmund Freud, a famous Austrian doctor who focussed on accessing patients’ unconscious mind to relieve them of their anxiety, Rorschach hand painted a set of ten ink blot pictures which he used with patients to assess personality and determine psychological disorders. The test requires patients to describe what they see in the pictures, and these answers are interpreted by experts. From a scientific perspective, there is little evidence to support its reliability or validity. It has been criticised for being unreliable, as different assessors interpret patients’ readings of the ten pictures in different ways. Its validity is problematic, as there is little evidence to suggest it genuinely supports the diagnosis of psychological disorders. There is some evidence to suggest that, as an aid to help patients open up about personal issues during therapy, the pictures have some merit.

Another test, likely the most used psychometric test in work contexts, is the Myers-Briggs Type Indicator (MBTI), which is influenced by Carl Jung’s theories on typology. It shares some of the constructs of the Big Five Model but assumes human beings fit into a “type”. Unlike the trait model, where individuals sit somewhere along a dimension for each of the five constructs, a type assumes we fit into one of two categories; for example, an individual is either an extrovert or an introvert. There are four of these dichotomies, and a candidate’s MBTI profile is a combination of one category from each of the four. Arguably this reduces the complexity of human behaviour, and research on the test has shown poor consistency, in that the same person can end up with a different profile only weeks after their original test. Practitioners have disputed these issues and promote the valuable discussions that arise following assessments.
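
One way to see why a type-based profile can change between sittings is to sketch what happens when a continuous score is cut into two categories. The example below is purely illustrative and is not the MBTI’s actual scoring procedure: it assumes each of the four dichotomies is represented by a hypothetical continuous preference score, with a cutoff at zero deciding the letter. The scores are invented.

```python
# The four MBTI dichotomies; a profile takes one letter from each pair.
DICHOTOMIES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def to_type(scores, cutoff=0.0):
    """Collapse four continuous scores into a four-letter type.

    Assumption: a score at or above the cutoff maps to the first letter
    of the pair, otherwise to the second. This is a hypothetical scheme.
    """
    return "".join(
        first if score >= cutoff else second
        for score, (first, second) in zip(scores, DICHOTOMIES)
    )


# Number of possible profiles: two options on each of four dichotomies.
n_types = 2 ** len(DICHOTOMIES)

# Hypothetical scores for one person on two occasions. The first and last
# scores sit close to the cutoff, so a small shift changes the letter.
sitting_1 = [0.3, -2.1, 1.8, -0.2]
sitting_2 = [-0.4, -2.0, 1.7, 0.1]

print(n_types)
print(to_type(sitting_1), to_type(sitting_2))
```

In this sketch no score moves by more than 0.7, yet two of the four letters change. A trait model would report the same small shifts as small differences along each dimension rather than as a different profile.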

Fairness in psychometric testing

Luckily, we are past the dark days of intelligence testing being used to justify discrimination against ethnic minorities. Proponents of the Eugenics movement, which sought to promote the heritability of “desirable” traits and included some of the very people who developed psychometric testing, such as Francis Galton and Lewis Terman, used fraudulent and biased “science” to paint those who tested low in intelligence as incompetent. This ignored any association with social inequalities such as low wages and poor living conditions. For a time, the positive intentions of psychologists like Binet, who aimed to use these tests to identify children needing additional support, were overshadowed, and many became suspicious of psychometrics and Psychology in general. It was works such as Stephen Jay Gould’s The Mismeasure of Man that brought integrity back to psychological testing. Nowadays there are laws to prevent racial or gender discrimination during employment screening in the US, the UK and other developed countries. The laws protect against practices which favour recruiting a white male majority.

Looking to the future

Psychometric tests are here to stay, particularly in a data-driven world in an age of constant technological advancement.

Certainly one of the factors shaping employee recruitment is the need for organisations to appeal to the values of any prospective hire or current employee. The knowledge that happy employees lead to better performance, less absence and reduced attrition is gradually shaping organisational strategy. In a global competitive landscape, high calibre employees are like gold dust and the culture of their current or prospective team needs to fit their beliefs, principles and goals. Accordingly, there has been a proliferation of psychometric tests developed to measure outcomes such as work engagement, thriving at work, organisational commitment, person-environment fit and organisational citizenship. Current thinking is that our characteristics are not fixed: our personality traits can be activated in certain situations and at different times in our lives. This dynamic interplay between personality affecting work and work affecting personality demonstrates that work can have a profound impact on our quality of life.

An organisation that uses robust psychometric tests, seeks to understand candidates’ motivations and values and strives to pursue an appropriate team culture to both drive performance and support employee potential has a winning combination for success.

At My People Group, this is what we are passionate about helping our clients achieve. Drawing on over 20 years of experience nurturing high performance in sport and business, and on sound Psychology research, our Recruitment and Culture software products are grounded in reliable, validated and appropriate psychometrics. They give our clients critical insight into how to recruit candidates that fit their prospective team culture, increasing the likelihood they will stay and thrive once they join, and an understanding of the blockers to, and opportunities for, improved team performance. We use psychometric approaches grounded in science to help clients select, optimise and retain their people so that the organisation can flourish. Our psychometric approach to measuring teams and culture is unique, and it enables you to build teams and help them succeed using data rather than just opinion.

If you would like to learn more about how we apply science to our software and how we support our clients to build great teams, get in touch. We’d love to speak to you.

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1–26.
Barrick, M. R., Stewart, G. L., Neubert, M. J., & Mount, M. K. (1998). Relating member ability and personality to work-team processes and team effectiveness. Journal of Applied Psychology, 83, 377–391.
Bartram, D. (2005, November). The changing face of testing. The Psychologist, 666–668.
Birkeland, S., Manson, T., & Kisamore, J. (2006). A meta-analytic investigation of job applicant faking on personality measures. International Journal of Selection and Assessment, 14, 317–335.
Boake, C. (2002). From the Binet–Simon to the Wechsler–Bellevue: Tracing the History of Intelligence Testing. Journal of Clinical and Experimental Neuropsychology, 24(3), 383–405. https://doi.org/10.1076/jcen.24.3.383.981
Cohen, A. S. (2016, February 19). Harvard’s Eugenics Era. Harvard Magazine. https://www.harvardmagazine.com/2016/03/harvards-eugenics-era.
Cook, M., & Cripps, B. (2005). Psychological Assessment in the Workplace. Wiley.
Deary, I. J., Whalley, L. J., Lemmon, H., Crawford, J. R., & Starr, J. M. (2000). The Stability of Individual Differences in Mental Ability from Childhood to Old Age: Follow-up of the 1932 Scottish Mental Survey. Intelligence, 28(1), 49–55.
Frothingham, M. B. (2021, November 2). Rorschach Inkblot Test: Definition, History & Interpretation. Simply Psychology.
Galton, F. (1879). Psychometric Experiments. Brain.
Goldberg, L. R. (1993). The Structure of Phenotypic Personality Traits. American Psychologist, 48(1), 26–34.
Gould, S. J. (1996). The Mismeasure of Man. W. W. Norton & Company.
Hiermeier, U. M., & Verity, S. J. (2022). ‘Race’ and racism in intelligence testing. Clinical Psychology Forum.
Higgins, D. M., Peterson, J. B., Lee, A. G. M., & Pihl, R. O. (2007). Prefrontal cognitive ability, intelligence, Big Five personality, and the prediction of advanced academic and workplace performance. Journal of Personality and Social Psychology, 93, 298–319.
Hirsh, J. B. (2009, September). Choosing the right tools to find the right people. The Psychologist, 730–735.
Hogan, J., & Holland, B. (2003). Using theory to evaluate personality and job-performance relations: A socioanalytic perspective. Journal of Applied Psychology, 88(1), 100–112. https://doi.org/10.1037/0021-9010.88.1.100
Johnson, J. A. (2000). Predicting observers’ ratings of The Big Five from the CPI, HPI and NEO-PI-R: A comparative validity study. European Journal of Personality, 14, 1–19.
Lake, C. J., Carlson, J., Rose, A., & Chlevin-Thiele, C. (2019). Trust in name brand assessments: The case of the Myers-Briggs Type Indicator. The Psychologist-Manager Journal, 22(2), 91–107. https://doi.org/10.1037/mgr0000086
Langelaan, S., Bakker, A. B., van Doornen, L. J. P., & Schaufeli, W. B. (2006). Burnout and work engagement: Do individual differences make a difference? Personality and Individual Differences, 40(3), 521–532. https://doi.org/10.1016/j.paid.2005.07.009
Laurent, J., Swerdlik, M., & Ryburn, M. (1992). Review of validity research on the Stanford-Binet Intelligence Scale: Fourth Edition. Psychological Assessment, 4(1), 102–112. https://doi.org/10.1037/1040-3590.4.1.102
Mayer, J. D., Salovey, P., & Caruso, D. R. (2008). Emotional intelligence: New ability or eclectic traits? American Psychologist, 63(6), 503–517.
McCrae, R. R., & Costa, P. T. (2005). Personality in Adulthood: A Five-factor Theory Perspective (2nd ed.). Guilford Press.
McDowall, A., & Rojon, C. (2016, January). The enigma of testing. The Psychologist.
Petrides, K. V., Furnham, A., & Mavroveli, S. (n.d.). Trait emotional intelligence: Moving forward in the field of EI. In G. Matthews, M. Zeidner, & R. D. Roberts (Eds.), The science of emotional intelligence: Knowns and unknowns (pp. 151–166). Oxford University Press.
Pittenger, D. J. (2005). Cautionary Comments Regarding the Myers-Briggs Type Indicator. Consulting Psychology Journal: Practice & Research, 57(3), 210–221.
Roid, G. H., & Pomplun, M. (2012). The Stanford-Binet Intelligence Scales. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (5th ed., pp. 249–268). The Guilford Press.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology. Psychological Bulletin, 124, 262–274.
Stein, R., & Swan, A. B. (2019). Evaluating the validity of Myers-Briggs Type Indicator theory: A teaching tool and window into intuitive psychology. Social and Personality Psychology Compass, 13(3), e12441. https://doi.org/10.1111/spc3.12441
Unknown. (n.d.). Psychometrics. The Open University. Retrieved July 11, 2022, from https://www2.open.ac.uk/openlearn/CHIPs/index.html#/node/37/commentary