
Behind the Wheel


Tests
Some people see tests as the “silver bullet” of the selection process. There is a widespread belief that an appropriate test or group of tests exists, or can be developed, that will give companies an error-free selection methodology. There is some justification for this belief. Research on tests or, more appropriately, psychometric instruments has consistently shown that they can be a highly reliable and valid method for selecting employees. They are also standardized and systematic, they target specific aspects of candidates’ knowledge and abilities that have been identified as important to job performance, and their results can be calibrated against a standard or benchmark. To deliver these benefits, however, they have to be used appropriately.
“Mass Transit” Selection
A number of companies use a “mass” process as the initial step in selecting bus operators. Generally speaking, this process involves inviting individuals who have submitted résumés or application forms, and who fulfill the minimum job requirements, to a meeting at a centrally located facility, usually a hotel. At this meeting, which can last several hours, candidates are provided with background information on the company and a realistic job preview (from a current operator, who “tells it like it is”). In addition, background data for each candidate are reviewed to ensure they are complete. In some cases, candidates are also given tests (usually cognitive tests) to complete. Finally, candidates are told about subsequent steps in the process and are given anticipated timeframes for when each of these steps will occur.
7.1 Test Types
To understand how to use tests appropriately, it is useful to divide them into two main categories:
(a) Those that measure what a candidate can do
(b) Those that measure what the candidate will most likely do or prefer to do
Ability tests (cognitive, spatial, motor, perceptual, etc.) and performance tests (in-service/work sampling) fall into the first category; personality (conscientiousness, emotional stability, agreeableness, etc.), interest and integrity instruments fall into the second. Strictly speaking, the instruments in the second group are questionnaires rather than tests, because there are no right or wrong answers to their questions. They simply provide an inventory of the candidate’s personality, interests or integrity along certain dimensions. Ability instruments are tests because each question has a right or wrong answer.
7.2 Standards and Requirements
This understanding of the nature of tests is very useful when we consider the National Occupational Standards. The Standards describe the tasks bus operators need to be able to perform in order to accomplish their work. For each of these tasks there are required knowledge and abilities. Possessing this knowledge and these abilities is a bona fide occupational requirement of the job and, therefore, we should use knowledge and ability tests to determine whether candidates have them. In other words, we are talking about tests that determine what the candidates know and what they are able to do, i.e. tests of the first type.
7.3 Standard Ability Tests
There is a plethora of standard ability tests on the market. The dilemma, when considering the use of one or more of these tests, is to determine whether candidate results on the test are a valid predictor of whether the employee has the ability or abilities required by the Standards. For example, will an employee who scores well on a cognitive test, such as reading ability, have the ability to read route and road maps (standard 1.03.06)? One would presume so, but we will not know for certain until we validate that reading test results are a good predictor of route and map reading ability.[20] Given that this kind of validation is time-consuming and costly, the question is whether standardized tests should be used at all. The answer is a qualified yes. Standardized tests are relatively easy to administer, are inexpensive, and can be used to confirm information collected in the Background Review stage of the process, e.g. reading and numeric tests can be used to confirm education and training.
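One common way to express the validation step described above is a validity coefficient: the correlation between candidates' test scores and a later measure of the ability the test is supposed to predict. The following is a minimal sketch only. The function name `pearson_r`, the variable names and all of the scores are hypothetical and are not taken from any study cited in this guide; a real validation study would need a much larger sample and professional design.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per candidate, in the same order in both lists.
reading_scores = [55, 62, 70, 48, 81, 66]   # standardized reading test results
map_task_scores = [6, 7, 8, 5, 9, 7]        # later scores on a map-reading task

validity = pearson_r(reading_scores, map_task_scores)
print(f"validity coefficient r = {validity:.2f}")
```

A coefficient near zero would suggest the reading test says little about map-reading ability, while a value closer to 1 would support using it as a predictor.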
7.4 Standard Personality, Interest and Integrity Questionnaires
Personality, interest and integrity questionnaires present a more complicated set of considerations. First, as is shown in Exhibit 7.1, these questionnaires are of dubious reliability and validity. Second, it is much more difficult to tie their items to the Standards. Third, they are self-reported and, therefore, their results are open to question. Finally, there is the question of whether candidates, managers and unions will accept them in terms of “face validity” and fairness. Where a personality or interest questionnaire may have some value is as an aid to interviewing, both reference interviewing and candidate interviewing.[21] This use is discussed in the section on Interviewing.
7.5 In-Service or Work-Sampling Performance Tests
With these kinds of tests, candidates are actually required to perform a sample of the tasks that make up the job. Because the candidate is carrying out the actual task, say reading a route or road map, there is content validity to the test. The important consideration in using in-service or work-sampling tests is to base them on a bona fide requirement or, in the case of bus operators, the National Occupational Standards. As has been pointed out in the previous discussion on Background Reviews, the Standards are used to identify the key knowledge and abilities. A test is then developed that requires the candidates to demonstrate they have the knowledge or abilities. For example, if the key ability is to read a route or road map then the test would have the candidates read a sample route or road map and they would be scored on how accurately and how quickly they are able to carry out this task. Candidate results can then be compared with standards set for acceptable performance.
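The scoring described above, accuracy and speed compared with a standard for acceptable performance, can be sketched as follows. This is an illustration under assumptions: the class `WorkSampleResult`, the function `meets_standard` and the thresholds (80% accuracy, 300 seconds) are hypothetical and do not come from the National Occupational Standards.

```python
from dataclasses import dataclass


@dataclass
class WorkSampleResult:
    correct_items: int    # e.g. stops located correctly on the sample route map
    total_items: int      # number of items in the work sample
    seconds_taken: float  # time the candidate needed to finish


def meets_standard(result, min_accuracy=0.8, max_seconds=300.0):
    """Return True if the candidate is both accurate enough and fast enough.

    The default thresholds are placeholders; an employer would set its own.
    """
    if result.total_items <= 0:
        raise ValueError("total_items must be positive")
    accuracy = result.correct_items / result.total_items
    return accuracy >= min_accuracy and result.seconds_taken <= max_seconds


# Hypothetical candidate: 9 of 10 items correct in 4 minutes.
candidate = WorkSampleResult(correct_items=9, total_items=10, seconds_taken=240)
print(meets_standard(candidate))
```

Keeping accuracy and time as separate criteria, rather than blending them into one number, makes it easier to tell a candidate why a result fell short.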
Transparency and Opacity
An issue related to the use of personality, interest and integrity questionnaires is transparency versus opacity. Some questionnaires are made up of questions closely related to the nature of the work and are, therefore, highly “transparent.” The dilemma with these types of inventories is that the candidate (particularly if he/she is “test wise”) can “fake good” and answer the questions in a “socially desirable” way, thus invalidating the results. Other questionnaires are made up of questions that tend to be unrelated to the nature of the work and are, therefore, highly “opaque.” The problem with these types of questionnaires is that the candidate, not seeing any relationship between the questions and the job, may not take the questionnaire seriously (see the note on Candidate Likes and Dislikes following section 4.5).
The benefits of in-service or work-sampling tests are that they have a high degree of acceptability and face validity, and that they represent an actual sample of the candidate performing the work, as opposed to a sign, as in the case of ability tests, that the candidate might be able to do the work. A further benefit is that the test gives candidates a realistic job preview (RJP) and, therefore, a better opportunity to determine whether the work fits their interests.[22]
In-service and work-sampling tests are not without their drawbacks. They are generally costly to develop and, in many cases, may require special equipment. For example, a work-sampling test for ability 2.02.03 (identify unusual noises and vehicle behaviour occurrences) would require a vehicle.
7.6 Words of Wisdom
There are other considerations regarding the use of tests beyond those we have discussed so far:
(a) Testing should not be done out of frustration with other selection methods. Testing should be seen as part of the whole recruitment process and be integrated with and supportive of the other selection methods of Background Review and Interviews.
Practice Makes Perfect
Familiarity with a test helps to eliminate unwanted candidate errors due to anxiety or nervousness. To improve test validity, candidates need to be informed beforehand that they will be asked to complete one or more tests as part of the selection process. Experts in the field also recommend that candidates be given several sample/practice questions to complete before the actual test. Practice questions are particularly important for ability tests. Research has shown that individuals perform better, sometimes significantly so, on a test they have completed before. As the use of tests has become more pervasive, some candidates have become “test wise” and perform much better than less experienced individuals on both ability tests and personality, interest and integrity questionnaires.
(b) There are thousands of tests available on the market, but very few have been validated for use in selecting bus operators.[23] Caution must therefore be used in deciding whether they should form part of the selection process. The Mental Measurements Yearbook provides reviews of the tests it lists and should be consulted before a decision is made on any one test. A short list of selected suppliers is contained in Exhibit 7.4 along with some recommended resources on the subject. In Exhibit 7.5 we have identified three commercially available bus operator-specific selection instruments, developed in the United States, that have validation studies associated with them. Our inclusion of these three should not be seen as an endorsement of any one of them.
(c) There should not be over-reliance on testing and test scores when selecting candidates for hiring. Tests produce, in most cases, a numerical score, and there is a natural tendency to be swayed by that score because of its apparent precision.
(d) The establishment of standards for benchmarking test results is very important. If the standards are set too high, good candidates may not make it to the next step in the process. Furthermore, the standard may have an upper as well as a lower limit. For example, an extremely high score on a cognitive test could signal that the candidate might become bored and dissatisfied with the job.
The Truth About Video Tests
Recently, researchers have been questioning the validity of what they call “nonsimulation-based measures of cross-functional skills” (i.e. teamwork, customer service, decision-making). These measures include videos in which a scene is played out and candidates select, from a multiple-choice list of answers, how they would deal with the situation. The argument raised by these industrial psychologists is that the relationship between responses on such instruments and actual performance has not been established through research, even though there are legitimate, empirically based claims of predictive validity for the instruments.
(e) Beware of what Levin and Rosse call the “Black Magic” tests.[24] These instruments are built upon anecdotal evidence, faulty theory and even, in some cases, superstition. They range from handwriting analysis (graphology) to polygraph testing. Research has shown they are not valid predictors of job performance.
(f) Test administrators have to be well trained to ensure that the same conditions prevail for all candidates, that the test is safe, and that they do not influence the outcome by the way they administer the test.
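Item (d) above describes a benchmark that can have an upper as well as a lower limit. A minimal sketch of such a score band follows. The function name `screen_by_band`, the labels it returns and the limits used in the example are hypothetical; appropriate limits would have to be set and validated by the employer.

```python
def screen_by_band(score, lower, upper=None):
    """Classify a test score against a benchmark band.

    lower: minimum acceptable score.
    upper: optional maximum; None means the benchmark has no upper limit.
    """
    if upper is not None and lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if score < lower:
        return "below band"
    if upper is not None and score > upper:
        # A very high score is flagged for review rather than rejected outright.
        return "above band: review"
    return "within band"


# Hypothetical cognitive-test benchmark with limits of 40 and 85.
for score in (35, 60, 92):
    print(score, screen_by_band(score, lower=40, upper=85))
```

Flagging scores above the band for review, instead of rejecting them automatically, is consistent with item (c): the score informs the decision but does not make it.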
[20] In an OC Transpo study, the company was administering the General Aptitude Test Battery, a series of 12 tests that measured a number of factors including general learning ability, verbal aptitude and numerical aptitude. In their validation study, the consultants found that, although the tests might be valid for predicting training success, they did not predict future operator performance.
[21] In an OC Transpo study, the Canadian Occupational Interest Inventory, developed for the Canadian Employment and Immigration Commission in the early 1980s, was found to be a useful questionnaire for predicting job performance, employee commitment and job satisfaction.
[22] It is interesting to note that researchers report that, despite these benefits, in-service and work-sampling tests are not widely used in Canada, the United States or the United Kingdom.
[23] Other than the OC Transpo validation study mentioned previously, the only other study we were able to find was one conducted in the United States in the early 1970s. In this study a battery of 11 tests was evaluated, and only 3 of the 11 were found to be valid predictors of performance.
[24] Levin, R., Rosse, J. 1997. High Impact Hiring. Jossey-Bass, San Francisco. Page 197.