Selection is the choosing of people, once you have identified them. It often involves tools ranging from complex assessment centers to simple pen-and-paper tests.
Selection tools
* Application form: A structured request for specific information.
* Assessment centers: Multiple tests and exercises through the day.
* Biodata: Correlating biographical data with job performance.
* Interview: The classic selection interview...
* Intelligence tests: Many ways of measuring intelligence.
* Psychometric tests: Attainment, aptitude and intelligence tests.
* Personality tests: Finding aspects of personality.
* Resumé / Curriculum Vitae: A personal history.
* Work sample: Getting them to do a sample of 'real' work.
Selection articles
* Accuracy and faking: Candidate and recruiter issues.
* Faking and deviance: Faking as 'bad' or 'good' depends on your viewpoint.
* Personality and job success: Personality factors that predict job success.
* Reducing faking in tests: Design psychometric tests to minimize faking.
* Reliability: Stability and consistency give trust in the test.
* The selection spiral: When people select only those less able than themselves.
* Validity: Results measure what the assessment was intended to measure.
Description
An application form is a structured form developed by the company offering the job, which the candidate completes. This may be in addition to a résumé, or as a replacement. Résumés may, when application forms are required, be 'optional'.
Supplemental application forms
A supplemental application form does not replace a résumé. It seeks additional information in specific areas that are shaped by the key criteria you are seeking. For example, you might ask about experience of working in teams, or even ask them to write a short piece about their views of the dynamics of your marketplace.
Replacement application forms
A form that replaces a résumé must gather all information that is required, including contact information, job history, personal statements and education, as well as any specialized job-related information you require.
Development
Development of a supplemental application form starts with a good job analysis, from which the key criteria to be sought are extracted and sections of the form designed so that data may be reliably gathered.
The form should be designed with clear instructions so that the candidate is in no doubt about what is required in each field. This may include check boxes of various forms for basic facts (e.g. male/female, or a checklist of computer application skills), single-line fields for short items such as the candidate's name, and larger boxes for free-format descriptions, such as descriptions of their responsibilities in various jobs.
If the form is to be on paper, then standard graphic design principles should be used, such as clear use of space, fitting coherent sets of information on a single side of paper, etc. If the form is for use on the web, then web design principles should be used, such as coping with resizing of windows, clear 'submit' button, etc.
Discussion
Application forms are very popular, being used by 93% of UK firms (Shackleton and Newell, 1991), and have found increasing popularity with the web (Park, 1999; Reed, 2002), where online completion of forms eases data capture and ensures standardization.
The application form is the recruiter's rebuttal to the résumé, providing them with an initial selection tool that can be used to create a short-list based on what is required by the job rather than what the applicant chooses to tell.
Application forms are finite, and long forms are likely to put off some candidates (although this may have the benefit of deterring casual applicants).
Application forms are used a great deal on the web and facilitate automated filtering, where a job may have many applicants and individual sections may be scanned for specific key words (such as qualifications or experience).
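As a rough illustration, such filtering can be as simple as counting key words per scanned section. The sketch below is a minimal, hypothetical Python example; the field names, key words and pass threshold are invented for illustration, not taken from any real system.

```python
# Minimal sketch of automated application-form filtering.
# Field names, keywords and the pass threshold are illustrative
# assumptions, not a real vendor's API.

REQUIRED_KEYWORDS = {
    "qualifications": {"degree", "certified", "chartered"},
    "experience": {"project management", "team lead"},
}

def keyword_score(application: dict) -> int:
    """Count how many required keywords appear in each scanned section."""
    score = 0
    for section, keywords in REQUIRED_KEYWORDS.items():
        text = application.get(section, "").lower()
        score += sum(1 for kw in keywords if kw in text)
    return score

def shortlist(applications, threshold=2):
    """Keep applications whose sections mention enough key terms."""
    return [a for a in applications if keyword_score(a) >= threshold]
```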
Application forms, like CVs, are self-reports and hence are open to impression management and other forms of faking, so ‘factual’ information should be treated with care.
A sharp candidate will take good note of the application form, as it often hints at (or even shouts about) the key criteria that the recruiting company is seeking.
The design of application forms must take account of legal constraints. If there is any legal challenge to the application process, the motivation for any item in the form could be questioned. Both the language and content of application forms thus need to be carefully screened for bias and sensitivity. For example, if ethnic background is asked about, there must be a legitimate reason for this (and the wording must also be ‘politically correct’).
______________________________________________________________________
Assessment Centers:
Description
The Assessment Center is an approach to selection whereby a battery of tests and exercises is administered to a person or a group of people across a number of hours (usually within a single day).
Assessment centers are particularly useful where:
* Required skills are complex and cannot easily be assessed with interview or simple tests.
* Required skills include significant interpersonal elements (e.g. management roles).
* Multiple candidates are available and it is acceptable for them to interact with one another.
Individual exercises
Individual exercises provide information on how the person works by themselves. The classic exercise is the in-tray, of which there are many variants, but which share a common theme: give the person a large, unstructured pile of work and then see how they go about doing it.
Individual exercises (and especially the in-tray) are very common and correlate with cognitive ability. Other variants include planning exercises (here are some problems; how will you address them?) and case analysis (here is a scenario; what is wrong and how would you fix it?).
One-to-one exercises
In one-to-one exercises, the candidate interacts in various ways with another person, being observed (as with other exercises) by the assessor(s). They are often used to assess listening, communication and interpersonal skills, as well as other job-related knowledge and skills.
In role-play exercises, the person takes on a role (possibly the job being applied for) and interacts with someone who is acting (possibly one of the assessors) in a defined scenario. This may range from dealing with a disaffected employee to putting a persuasive argument to conducting a fact-finding interview.
Other exercises may have elements of role-play but are in more 'normal' positions, such as making a presentation or doing an interview (interesting reversal!).
Group exercises
Group exercises test how people interact in a group, for example showing in practice the Belbin Team Roles that they take.
Leaderless group discussions (often of a group of candidates) start with everyone in a relatively equal position (although this may be affected by factors such as the shape of the table).
A typical variant is to assign roles to each candidate and give them a brief of which the others are unaware. These groups can be used to assess such skills as negotiation, persuasion, teamwork, planning and organization, decision-making and leadership.
Another variant is simply to give the group a topic to discuss (though this has less face validity).
Business simulations may be used, sometimes with computers being used to add information and determine the outcomes of decisions. These often work in 'turns' consisting of data given to the group, followed by a discussion and a decision which is entered into the computer to give the results for the next round.
Relevant topics increase face validity. Studies (Bass, 1954) have shown high inter-rater reliability (.82) and test-retest results (.72).
Self-assessment exercises
A neat trick is to ask candidates to assess themselves, for example by asking them to rate themselves after each exercise. There is usually a high correlation between candidate and assessor ratings (indicating honesty).
Ways of improving these exercises include:
* Increasing the length of the assessment form to include behavioral dimensions based on selection competencies.
* Changing instructions to promote a more realistic appraisal by applicants of their skills.
* Implying that candidates will be held accountable if a discrepancy is found between their ratings and the assessors'.
Those with low self-assessment accuracy are likely to find behavioral modification and adaptation difficult (perhaps as they have low emotional intelligence).
Development
Developing assessment centers involves much test development, although much can be selected 'off the shelf'. A key area of preparation is with assessors, on whose judgment candidates will be rejected and selected.
Identify criteria
Identify the criteria by which you will assess the candidates. Derive these from a sound job analysis.
Keep the number of criteria low -- fewer than six is good -- in order to help assessors remember and focus. This also helps simplify the final judgment process.
Develop exercises
Make exercises as realistic as possible. This will help both candidates and assessors and will give a good idea what the candidate is like in real situations.
Design the exercises around the criteria so they can be identified, rather than finding a nice exercise and then seeing if you can spot any useful criteria. Allow both for confirmation and for disconfirmation of criteria.
Include clear guidelines for candidates so they can get 'into' the exercises as easily as possible. You should be assessing them on the exercise, not on their memory.
Include guidelines also for role-players, assessors and those who will set up the exercises (e.g. what parts to include in exercise packs, how to set them up ready for use, etc.).
Triangulate results across multiple exercises so each exercise supports the others, showing different facets of the person and their behavior against the criteria.
Select assessors
Select assessors based on their ability to make effective judgments. Gender is not important, but age and rank are.
There are two approaches to selecting assessors. You can use a small pool of assessors who become better at the job, or you can use many people to help diffuse acceptance of the candidates and the selection method.
Do use assessors who are aware of organizational norms and values (this militates against using external assessors), but do also include specialists, e.g. organizational psychologists (who may well be external, unless you are in a large company).
Develop tools for assessors
Asking assessors to make personal judgments is likely to result in bias. Tools can be developed to help them score candidates accurately and consistently.
Include behavioral checklists (lists of behaviors that display the criteria) and behavioral coding that uses prepared data-gathering sheets (this standardizes data between gatherers).
Traditional assessment has a process of observe, record, classify, evaluate. Schema-based assessment instead provides examples of poor, average and good behavior (there is no separation of observation and evaluation).
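As an illustration of how a behavioral checklist might be represented for tick-sheet scoring, here is a minimal Python sketch; the criteria and behaviors are invented examples, and a real center would derive them from the job analysis.

```python
# Sketch of a behavioral checklist for assessors. The criteria and
# example behaviors are invented for illustration only; a real center
# would derive them from job analysis.

CHECKLIST = {
    "persuasion": [
        "summarised others' positions before responding",
        "offered evidence to support proposals",
    ],
    "planning": [
        "set out a timetable for the task",
        "allocated work across the group",
    ],
}

def score_candidate(observed: set[str]) -> dict[str, int]:
    """Count observed behaviors per criterion from a tick-sheet."""
    return {
        criterion: sum(1 for b in behaviors if b in observed)
        for criterion, behaviors in CHECKLIST.items()
    }
```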
Prepare assessors and others
Ensure the people who will be assessing, role-playing, etc. are ready beforehand. The assessment center should not be a learning exercise for assessors.
Two days of training are better than one. Include theory of social information processing, interpersonal judgment, social cognition and decision-making theory.
Make assessors responsible for giving feedback to candidates and accountable to organization for their decisions. This encourages them to be careful with their assessments.
Run the assessment center
If you have planned everything well, it will go well. Things to remember include:
* Directions to the center sent well beforehand, including by road, rail and air.
* Welcome for candidates, with refreshments and waiting area between exercises.
* Capturing feedback from assessors immediately after sessions.
* A focus with assessors on criteria.
* Swift and smooth correction of assessors who are not using criteria.
* A timetable for everyone that runs on time.
* Lunch! Coffee breaks!
* Thanks to everyone involved.
* Finishing the exercises in time for the assessors to do the final scoring/discussion session.
Follow-up
After the center, follow up with candidates and assessors as appropriate. A good practice is to give helpful feedback to candidates who are unsuccessful so they can understand their strengths and weaknesses.
Discussion
Assessment centers have grown hugely in popularity. In 1973 only about 7% of companies were using them. By the mid-1980s this had grown to 20%, and by the end of the 1990s it had leapt again to 65%.
Assessment centers allow assessment of potential skill and so are good when seeking new recruits. They allow a wide range of criteria to be assessed, including group activity and aggregations of higher-level managerial competences.
Assessment centers are not cheap to put on and require multiple assessors who must be available. Organizational psychologists can be of particular value to assess and identify the subtler aspects of behavior.
Origins
The assessment center was originated by AT&T, who included the following nine components:
1. Business game
2. Leaderless group discussion
3. In-tray exercise
4. Two-hour interview
5. Projective test
6. Personality test
7. ‘Q-sort’
8. Intelligence tests
9. Autobiographical essay and questionnaire
Validity
Reliability and validity are difficult to establish, as there are so many parts and so much variation. A 1966 study showed high validity in identifying middle managers. There is a lower adverse impact on individuals than with separate tests (e.g. psychometrics).
Criticisms
The outcomes of assessment centers are based on the judgments of the assessors and hence on the quality of those judgments. Not only are judgments subject to human bias, but they are also affected by the group psychology effects of assessors interacting.
Assessors often deviate from marking schemes, frequently collapsing multiple criteria into a generic ‘performance’ criterion. This is often due to overburdening assessors with more than 4-5 criteria (so use fewer). More attention is often given to direct observation than to other data (e.g. psychometric tests). Assessors even use their own private criteria – especially organizational fit.
_______________________________________________________________________
Biodata:
Description
Biodata methods collect biographical information about a person that has been shown to correlate with good job performance. The correlations can be quite strange: all you need to know is that if a person has a certain item in their history, they are more likely to be good at the target job.
For example, in World War 2, the US Air Force discovered that men who had built and flown model aircraft when they were boys were more likely to make good fighter pilots. In 1952, Mosel's detailed study of department store sales staff found that the most successful people were widowed, female, 35-54 years old, between 4 foot 11 inches and 5 foot 2 inches, weighed at least 160 pounds, lived in a boarding house, had dependants, had a high-school education, had at least five years sales experience but had been in the previous post for less than five years with no time off for illness.
More recent and understandable events can also be used, for example working on government projects might correlate with effective use of project management methodologies. Biodata can include aspects of personal information, childhood, education, employment, external experiences, skills, socioeconomic status, social activities, hobbies and personal traits and characteristics.
Biodata is collected using a written form, structured to discover the key information that is required. The target people complete the form and hand it in, where it is studied for the key characteristics being sought.
Development
Define 'performance'
First start off by defining the performance that you are seeking. This should be in a form that will help you with the next step, for example using standard descriptions of 'leadership' or 'technical ability'.
Find high performers
Find a significant population from which you can extract sufficient numbers of high performers in the job you are analyzing. Thus a large company may study international managers who have proven successful at managing virtual teams, whilst an army may seek individual soldiers who have shown exemplary battlefield bravery (perhaps via those who have been awarded medals).
Collect biographical data
Design a structured and repeatable data collection method to extract as much biographical information as you can handle. This may include investigation of childhood events, education, jobs performed and self-reported significant 'life events'.
Approaches such as Critical Incident Technique, Interview and Structured questionnaires may be used to collect information. Each method used will typically expose different information, allowing different facets of the person to be examined.
Correlate performance and biographical data
This step is largely statistical, as the biographical data is coded and correlated with job performance.
Scoring is done by keying at the item-response level:
* Empirical approach: look at the proportion of variance accounted for between the item and the outcome criteria.
* Rational approach: job experts devise weightings based on theoretical a priori links.
Draft a questionnaire for each hypothesized scale and apply it to a large sample (around 450 people). Each question is tested and the results are factor analyzed to find (desirable) clustering.
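A minimal sketch of the empirical keying step, assuming binary (ticked/not ticked) biodata items and a numeric performance rating; the retention thresholds are illustrative only:

```python
# Sketch of empirical (criterion) keying for biodata items, assuming
# binary items and a numeric performance criterion. Uses a
# point-biserial correlation per item; thresholds are illustrative.

import numpy as np
from scipy.stats import pointbiserialr

def key_items(items: np.ndarray, performance: np.ndarray, min_r=0.2):
    """items: (people x items) 0/1 matrix; performance: ratings.

    Returns indices of items whose response correlates with the
    outcome criterion strongly enough to keep."""
    keep = []
    for j in range(items.shape[1]):
        r, p = pointbiserialr(items[:, j], performance)
        if abs(r) >= min_r and p < 0.05:
            keep.append(j)
    return keep
```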
Design biodata form
When collecting biodata information, it can be a good idea to hide the key characteristics you are seeking within other irrelevant information (which also may be found in your researches). Where possible, collect hard, verifiable items (such as examinations passed). Where this is not possible, the potential for faking must be taken into account.
Biodata may be collected in scripted format:
Please describe a time when you were faced with a difficult customer. What was said? What did you do? What was the result? (Please answer in no more than 200 words.)
It is very often, however, collected in more structured formats that allow for rigorous analysis:
Which of the following have you done in the past five years? Please tick all that apply:
[ ] Climbed a high mountain
[ ] Been on holiday to at least three different countries
[ ] Run a marathon
[ ] Raced a car
Test and use
Finally, try it out, for example on the people who supplied the data in the first place.
Discussion
Biodata is little known but is widely used (the first recorded use was in Chicago in 1894). It has nothing to do with biological aspects of the person, but has much to do with their biography (biodata is short for 'biographical data', not 'biological data'). Done well it has a high level of reliability and validity.
Unlike many other tools, biodata has a solely empirical root. There is no psychological theory behind it. It simply finds what works and does not question why. The founding hypothesis is that the statistics hold true, and that correlations found in some people will also be found in other people.
Developing biodata tests is very time-consuming and hence costly. This goes some way to explain its use in particular areas, for example the military, where there is a large population available for study and a consistent need in terms of job performance.
Biodata is different from personality tests in that it has a broader content and is more specific. It can also be re-scored for different roles and can be done both by candidate and also by someone they know.
Once developed, it is then quick to process lots of applicants, particularly if multiple-choice tick-boxes are used.
Biodata has face validity and is clearly fair, as it is the same for all. It is also easier to monitor for discrimination.
Unless demonstrated to be job-dependent, items about race, gender, marital status, number of dependents, birth order and spousal occupation are likely to be illegal. Other issues of concern include accuracy, faking, invasion of privacy and adverse impact on perceptions of the applicant.
Faking may be minimized when the correlations are not clear (as in the WW2 pilot example). This can be improved further by hiding the real question amongst other less critical items.
Over time, a standard biodata test could lead to less diverse recruiting. It is also constrained by time and context. It can become dated as jobs change, which may reduce the case for using it, as it takes a lot of effort to set up in the first place. The same test may not be useful across the world – a problem for global companies.
___________________________________________________________________
Selection Interview:
Description
Interviews are conversations whereby a candidate interacts with one or more people who assess the candidate and, in a selection interview, decide on whether this person should be offered a job.
Such interviews typically last 15 to 60 minutes although they can be shorter or longer.
Interview structure
The structure of an interview is based on the degree of control exerted by the interviewer over what questions are asked and what information is sought. Where there are specific informational needs, a more structured approach may be used.
* Unstructured interviews are unplanned, non-directed, uncontrolled, unformatted and flexible, with bilateral communication. They require skills in questioning and probing.
* Semi-structured interviews are pre-scheduled and directed but flexible; major topic areas are controlled and there is a focused flow.
* Structured interviews are pre-planned, interviewer directed, standardised, pre-formatted and inflexible. They have a full structure and use highly-designed, closed questions. They assume a consistent format will get consistent responses.
Interview types
There are four common types of selection interview:
* Situational interviews use situation-specific questions based on the job and look at hypothetical performance. They are conducted by specialists: psychologists or trained people.
* Job-related interviews ask about past behaviour on the job. They are typically conducted by HR or managers.
* Psychological interviews assess personality traits. They are conducted by work/organizational psychologists.
* Competency interviews widen psychological interviews to include competencies such as interpersonal skills, leadership and other identified key competencies.
Interviews may involve a varying number of people. One-to-one interviews are common, although there are benefits for using multiple interviewers, for example where one person asks the questions and the other observes in a more detached 'third person' position. This objective position can look for body language and other subtleties which the person questioning may miss.
Behavioral interviews
Behavioral interviews assume that past behaviour is likely to predict future behaviour, and the more evidence there is of a previous pattern then the more likely it is that it will be repeated in the future.
The interview is developed using job analysis and the Critical Incident Technique applied to effective and ineffective performance, through which a range of ‘performance dimensions’ is devised; these indicate critical categories and provide the basis for detailed question themes.
The interviewer then pays particular attention to past behaviours in critical categories and also probes for motivations behind behaviours.
Variations include Behavioral Patterned Description Interviews (BPDIs), Behavioral Events Interviews and Criterion Referenced Interviews.
Situational interview
Situational interviews are based on Latham's goal-setting theory, which assumes that intent precedes action. There is a focus on the future (as opposed to the past focus of behavioural methods).
The situational interview is developed by using critical incidents to devise a rating scale of behaviours. The costs can be high, at around $1000 per question.
The interviewer asks what the person would do in theoretical situations and assesses their response against the criteria and rating scale.
A new variant of situational interviews is the ‘multimodal’ form, focusing on self-presentation, vocabulary assessment, biographical questions and other situational aspects.
Situational interviews tend to reduce the chance of discrimination, as they offer all candidates the same scenarios and evaluate them against the same criteria. Candidates also prefer them, as they seem fairer, although they still limit candidates' control over proceedings.
Development
Develop questions and scoring
The first stage of interview development is, if it is not already available, a job analysis of the position, in order to identify key knowledge, experience and competencies. This may be done using sophisticated methods such as the Critical Incident Technique.
For situational interviews, particular scenarios are identified from which situational questions may be derived. For behavioral interviews, the attitudes and behaviors required in the job are uncovered.
For a structured approach, questions and scoring can be derived from empirical study that may include in-depth analysis of incumbents and interview of high performers. Typically 10-15 traits emerge, from which around 120 questions may be developed around broad situational and behavioral aspects.
Questions should be tested on good/bad performers and around half discarded to ensure a consistently high quality of questions. A scoring guide may then be developed.
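A minimal sketch of this winnowing step, assuming each draft question has already been scored for known good and bad performers on a common scale; the discrimination threshold is an assumption for illustration:

```python
# Sketch of testing draft questions on known good and bad performers
# and keeping only those that discriminate. The threshold is an
# illustrative assumption.

import numpy as np

def discrimination_index(scores_good, scores_bad):
    """Mean score difference between high and low performers."""
    return np.mean(scores_good) - np.mean(scores_bad)

def winnow(questions, good, bad, threshold=0.5):
    """good/bad: (people x questions) score matrices on a common scale.

    Keep questions where high performers clearly outscore low ones."""
    return [q for j, q in enumerate(questions)
            if discrimination_index(good[:, j], bad[:, j]) >= threshold]
```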
Train interviewers
Interviewers should receive some training rather than be plucked from management and other ranks and placed in front of an unsuspecting candidate. Allport (1937) identified key attributes of interviewers:
* Breadth of experience and a diverse background.
* Above-average intelligence, self-insight and understanding of others.
* Emotional stability, good adjustment and good social skills.
* Ability to be detached.
* Similarity in some way to the interviewee.
* Expertise in role.
An important part of training is developing objective rating skills, rather than letting the interviewer give way to the sizeable human bias and opinion that they may hold.
They also need to be seen to be fair (candidates watch for this) and must, of course, comply with laws and regulations around gender, ethnicity, age and so on. A worst-case interviewer can lay the company open to a damaging lawsuit (both financially and reputationally).
Discussion
The interview is an extremely common selection method and has a high predictive validity for job performance (Robertson and Smith, 2001), tapping many factors relevant to the job, including cognitive ability (Huffcutt et al., 1996), oral skills (Campion et al., 1988), social skills (Searle, 2003) and person-organisation fit (e.g. Harris, 1999).
Objectivist psychometric perspective
In the wonderfully-named objectivist psychometric perspective (i.e. the traditional viewpoint), there is a focus on structure, reliability and validity, based on an assumption that the interview is an objective and accurate means of assessment in which interviewees are passive participants providing information to rational interviewers who are skilled at acquiring and interpreting it.
Criteria used include cognitive ability, job knowledge or tacit knowledge (eg. through situational interviewing), social skills (e.g. extraversion, emotional stability, openness) and person-organisation fit (though this is fraught with difficulties).
There are dangers in this approach, for example in the use of the interviewer’s perception of the organisation, and what it needs. The interviewer typically seeks organisation fit by comparing against a simplistic prototype and may well assess personal qualities over required skills. 'Fit' may also happen at the social level, with the interviewer looking for personal fit with the candidate.
Social-interactionist perspective
In the social-interactionist perspective (i.e. modern viewpoint), there is a focus on social factors and the dynamic process within the interview. The interview is seen as a subjective, complex and unique event where both parties act as active participant-observers.
The perception of the interviewer is important within this perspective, and research has shown some interesting factors. Where the interviewer has not paid due attention to candidate qualifications, the candidate often draws back, becoming more reticent and talking less about themselves (perhaps punishing the interviewer for the slight). Candidates' expectations depend on how much they like the interviewer as well as the job. The candidate may well feel unfairly treated if not given enough attention or opportunity for dialogue, and may develop negative expectations if the interviewer talks more than they do.
Faking
Faking is less easy in interviews than CVs or Application Forms, as non-verbal signals may be detected. Nevertheless, interviews are so common that some interviewees acquire significant expertise in the ‘interview technique’, managing impressions and having ready answers for common questions.
Impression Management (IM)
Faking also may also appear through dress, words and body language, where impression-management seeks to make a person appear more than they normally are. When the interviewer sees the candidate as over-dressed (or provocatively under-dressed!), with excessive make-up or otherwise trying too hard to impress, they may suspect them of concealment and mark them down accordingly.
Impression management includes ingratiating behaviour (agreeing, complimenting, offering favours), self-promotion (to boost apparent competency range) and even anger and intimidation (to show fearlessness).
In studies, women showed more openness and older, more experienced people maintained more eye contact, projected a more positive image and asked more questions. Older people reduced the number of entitlement statements used and increased self-enhancement and self-promotion statements. Where there was more role ambiguity, impression management increased.
Reliability and validity
Interviews are generally reliable, with criterion values of .51 common, rising to .63 when used with psychometric tests. Validity varies by style: situational (.60), job-related (.39) and psychological (.29).
Situational interviewing is relatively simplistic and is predominantly used in low-complexity jobs. Behavioral interviewing brings in a wider range of behaviours from inside and outside work, allows more thorough probing of motivations and is preferred for higher-level jobs.
Bias and misjudgement
There are many factors which can bias interviewers, including gender, race, age, appearance, attitude, non-verbal behaviour, physical setting and job market factors (Arvey and Campion, 1982), bias towards positive information, and even primacy, recency and contrast effects in the ordering of candidates (Asch, 1946; Miller and Campbell, 1959; Anderson and Shackleton, 1993). These factors may be reduced by training, but often not eliminated.
Within interviews, it is important that fair play is perceived. This includes, for example, ensuring that all candidates have a comparable experience, even if the interviewer concludes early on that one is not suitable.
Interviewers are subject to normal human biases, for example they tend to be biased towards ‘people like me’. Positive information is weighted more than negative data (which takes more time to process). The order of candidates can also cause bias (primacy and recency effects).
The halo effect happens when one good aspect of a candidate makes them look good in other areas as well. The reverse is also true: the horns effect occurs where a negative perception is generalized to other aspects of the person. A typical horns effect is where the person is overweight and this is generalized into greed, lack of control, lack of social ability, etc.
The generalization continues: candidates who are nervous at interview may be assumed to be always nervous, whilst the confident may be assumed to be skilful in other areas too.
______________________________________________________________________
Intelligence testing:
Description
There are three schools of thought about intelligence and consequently how it may be tested.
Uni-factor models
Uni-factor models have a single dimension, defining 'intelligence' as a single thing that is fixed, unchangeable and can be measured. They usually have a socio-biological basis, where intelligence is defined as a combination of genetic and social factors.
Spearman (1904) did a factor analysis of ability, identifying ‘g’ as general ‘psychophysiological’ intelligence.
Uni-factor models allow for racial and national differences in intelligence. In one sense, this may be seen as racism or xenophobia. On the other hand, it also calls into question the notion of 'intelligence' as a single thing -- usually the defining race/nationality/gender uses itself as the standard, and any difference with others is very likely to show the others as inferior.
Multi-factor models
Multi-factor models identify intelligence as a combination of distinct abilities. They tend to place emphasis on the role of the environment in learning and see intelligence as dynamic and situated, rather than a fixed ability in all circumstances.
Thurstone (1938) identified nine primary factors of intelligence:
Words, Spatial, Perceptual, Verbal, Numerical, Memory, Reasoning, Deduction and Induction.
J. P. Guilford identified a cube (5 x 6 x 6) of abilities.
Multi-hierarchical models
Multi-hierarchical models measure the application of intelligence, not just cognitive ability. They try to provide more organization than multi-factor models, which can seem unwieldy (e.g. Guilford’s 180 cubelets).
Horn and Cattell (1966) defined five second-order factors, notably including fluid and crystallized intelligence:
* Fluid intelligence: non-verbal, general reasoning. [tacit?]
* Crystallized intelligence: application of verbal or conceptual knowledge. [explicit?]
* Visualisation
* Retrieval
* Cognitive speed
Vernon’s model (1950s) describes intelligence as an equation:
General intelligence
= Verbal ability (verbal + numerical)
+ Spatial/Mechanical ability (Spatial + Mechanical)
Carroll’s model (1993) is more complex:
General intelligence
= Fluid intelligence (inductive reasoning + sequential reasoning)
+ Crystallised (lexical knowledge + foreign language aptitude)
+ Visual perception (visual imagery + perceptual integration)
Gardner’s model (1983) of seven intelligences is favoured by educationalists:
1. Linguistic
2. Musical
3. Logical-mathematical
4. Spatial
5. Bodily-kinaesthetic
6. Intrapersonal
7. Interpersonal
Furnham (1992) defined five factors that predict occupational behaviour:
* Ability (skill in the basic job)
* Motivation
* Personality traits
* Demographic factors
* Intelligence
Discussion
General intelligence tools are not used that often, as the preference is usually to find more distinct abilities. The main areas of testing (related to Guilford) are verbal, numerical, spatial, dexterity and sensory. Cognitive ability tests are good predictors of initial job performance, but this declines over time.
It is important to examine the test manual, to ensure you use the tool correctly (many do not do this well).
Testing people with disabilities is a problem as there is a lack of information available around this area.
There is a strong relationship between job performance and general intelligence. Tests are good at assessing this – interviews don’t add that much. Tests are also more objective.
Many tests seek maximum ability and are developed in a clinical environment. Ackerman et al (1989) noted that typical ability is more relevant in job environments. In related early research, Terman (1934) identified chronic (long-term) vs. acute (short-term) intelligence.
The classic IQ test has been widely criticized as being biased towards white, middle-class Western adult males.
Culture-free testing is important for global organizations. A well-known test is Raven’s Progressive Matrices (1965), which spans ability levels from children and people with learning disabilities up to high-level fluid intelligence. Abstract reasoning is measured via a pattern (‘matrix’) with a part missing. It is confusing to follow and the norm table is difficult to use. There is little evidence that it removes bias towards majority groups.
Some researchers say spatial ability is not as good a measure as verbal ability. Differences may (or may not) be caused by geographic and organisational contexts. The Big Five test is claimed to be culture-free (McCrae and Costa), but this is also doubted.
________________________________________________________
Psychometric tests:
Description
Psychometric tests are used, as the name suggests, to measure some psychological aspect of the person. Most commonly in selection, this includes personality, ability and motivation.
Occupational tests are used to measure maximum intellectual performance in terms of attainment, ability and aptitude, or typical behavior in terms of motivation and temperament.
Typical tests identify the direction of interests and can be used to suggest types of jobs associated with these areas.
Attainment tests
Attainment tests are used to assess the level of achievement in a particular area, such as in high school examinations.
Aptitude tests
Aptitude tests assess potential in some target area, seeking to discover possible future capability. This is as opposed to ability tests, which seek current capability. They can be used to measure specific aptitudes or collective traits (eg. technical, verbal, numerical).
Intelligence tests
General intelligence tests include cognitive studies that focus on information processing and organization of knowledge.
These tests are often made up of batteries of sub-tests that each test a narrow range, such as arithmetic reasoning, verbal intelligence, etc.
Intelligence tests may measure two factors:
* Fluid ability: applying reasoning skills to novel situations (decreases with age).
* Crystallized ability: using culturally specific component (increases with age).
Intelligence is not normally distributed. At the bottom end, scores are tightly grouped, suggesting strong general factor. At the high end, scores show more independence between sub-tests, indicating specific intelligences.
Development
The overall approach to developing psychometric tests is to generate a large number of sample items, give them to a set of people and then keep only those that differentiate.
Maximum-performance questions are selected based on target-related factors. Questions here are based on right-wrong difference.
Typical-performance questions are selected based on personality, mood, attitude, temperament. Questions here are based on identifying differences in selected factors.
There are five methods of construction, as below.
Criterion-keyed
Criterion-keyed tests focus on an external domain or criterion. Thus for an interest inventory, the criteria are interests related to a specific occupational group.
They could also be used during organizational change to identify those who lack flexibility.
Example: MMPI
This method is criticized as having an atheoretical basis, with selection of items based on empirical data about their ability to differentiate. It addresses similarities and differences, not why these are so. The domain of the test may be limited: for example, ‘mania’ in the MMPI has only one criterion scale. The more specific a measure, the more limited its generalizability. There can also be problems when it is moved from one context to another, especially across cultures.
Factor analytic
This identifies items that load onto one factor and not onto another. It has the advantage that scores always have the same meaning.
Development of the test seeks strong correlation between the item and factor.
Example: Cattell’s 16PF. Cattell listed all the personality traits he could find and gave tests to heterogeneous groups of adults. He then used factor analysis to develop a theory of structure and relationships (not for data reduction). This has since been correlated with 50 different occupations.
A key factor in doing this is the size of the sample group. The larger the group, the lower the standard error.
Item Response Theory has been devised to help test-developers assess the nature of differences.
Item analytic
This is a very simple method which correlates each item with the overall test. It is useful for eliminating unsatisfactory items prior to using factor analysis. This is useful in developing longer tests by eliminating weaker items.
There is a need to be careful here:
* Domain definition: e.g. avoid investigating trust by asking the person if they are trustworthy.
* Bloated specifics: repeated coverage of the same item leads to apparently high reliability.
* Transportability: these tests are often based on social and other domain-specific values.
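A minimal sketch of the item-analytic step described above, using the corrected item-total correlation (each item correlated with the total of the remaining items); the cut-off for a 'weak' item is an illustrative assumption:

```python
# Sketch of the item-analytic method: correlate each item with the
# total of the remaining items ("corrected" item-total correlation)
# and flag weak items for removal before factor analysis.

import numpy as np

def item_total_correlations(responses: np.ndarray) -> np.ndarray:
    """responses: (people x items) matrix of item scores."""
    n_items = responses.shape[1]
    r = np.empty(n_items)
    for j in range(n_items):
        rest = responses.sum(axis=1) - responses[:, j]  # exclude item j
        r[j] = np.corrcoef(responses[:, j], rest)[0, 1]
    return r

def weak_items(responses: np.ndarray, min_r=0.3) -> np.ndarray:
    """Indices of items that correlate poorly with the rest of the test."""
    return np.where(item_total_correlations(responses) < min_r)[0]
```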
Thurstone scales
These are widely used, particularly in assessing attitude. They identify statements concerning attitude, then assess relevance of these with a panel of experts.
Items are chosen based on the standard deviation of the ratings given by experts (i.e. those they mostly agree on).
Results tend to be heavily value-laden, hence transportability is an issue.
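A minimal sketch of this selection-by-agreement step, assuming an 11-point expert judgment scale; the agreement cut-off is an illustrative assumption:

```python
# Sketch of Thurstone-style item selection: experts rate each draft
# attitude statement on an 11-point scale; keep statements the panel
# agrees on (low standard deviation), using the mean as the scale value.

import numpy as np

def select_statements(ratings: np.ndarray, max_sd=1.5):
    """ratings: (experts x statements) matrix of 1-11 judgments.

    Returns (indices of kept statements, their scale values)."""
    sds = ratings.std(axis=0)
    keep = np.where(sds <= max_sd)[0]
    return keep, ratings.mean(axis=0)[keep]
```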
Guttman scales
These are less widely used.
Items in this method are sorted in terms of difficulty or intensity.
Problems include achieving good graduation, and every item must correlate with the total score, which needs many items and large samples.
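One common check on such a scale is the coefficient of reproducibility: how closely real response patterns match the ideal 'staircase' pattern implied by the item ordering. A minimal sketch, assuming binary items already ordered from least to most intense:

```python
# Sketch of a Guttman-scale reproducibility check, assuming binary
# items ordered from least to most difficult/intense.

import numpy as np

def reproducibility(responses: np.ndarray) -> float:
    """responses: (people x items) 0/1 matrix, items ordered by intensity."""
    n, k = responses.shape
    errors = 0
    for row in responses:
        total = int(row.sum())
        # A perfect Guttman pattern endorses exactly the `total`
        # easiest items; count deviations from that ideal pattern.
        ideal = np.array([1] * total + [0] * (k - total))
        errors += int(np.sum(row != ideal))
    return 1 - errors / (n * k)
```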
Discussion
Bright people will tend to do well on many different types of test as such tests have a high correlation with intelligence. This reduces the value of the test in differentiating individuals.
Factors affecting test experience
Factors affecting test experience include:
* Test: pre-test information, type of test, language, instructions, structure, medium, timescales
* Person: experience, confidence, emotion, motivation, memory, culture
* Environment: Light, heat, humidity, noise, distractions, test administrator
* Computers: affect both developers and test-takers.
* Time: affects stress, ability to complete, alertness (time of day). The test itself may also age, esp. when ‘semantically laden’.
* Test-taker:
o Alpha ability: improves as a result of the test, which teaches the taker things.
o Beta ability: improves test management ability (e.g. managing time, reading the full question).
* Attention: to test taker (Hawthorne effect).
Criticisms and hazards
Criticisms, hazards and potential problems with psychometric test include:
* Inadequate definition of concept to be measured.
* Bias (undesirable) in differentiation (desirable) between test takers. Eg. gender bias.
* Poor application of tools, eg. inadequate job analysis, wrong usage of tools.
* Words defined differently by developers (eg. extravert, innovator), causing confusion.
* Misinterpretation of results by users.
* Not reading the test manual properly (which tells how and where it is to be used).
______________________________________________________________________
Personality test:
Description
Personality tests seek to identify - guess what - aspects of a person's personality that are correlated in some way with job performance.
The main approaches, each with its focus, its concern and example instruments, are:
* Psychodynamic (focus: internal; concern: the unconscious mind). Clinical background with a strong Freud/Jung influence; attention to dysfunction and neuroticism. Examples: psychoanalysis, Jungian Type Inventory (e.g. MBTI).
* Biological (focus: internal; concern: heredity and learning). Criticised for defining personality with too few factors. Example: Eysenck Personality Questionnaire (EPQ).
* Behavioral (focus: external; concern: habits and reinforcement). Learning through conditioning and shaping of behavior, with a focus on scientific proof; misses cognition. Examples: behavioral assessment, behavioral interviews.
* Phenomenological and humanistic (focus: internal). Maslow, Kelley, Lewin; influenced by subjectivism and individualism. Example: FIRO-B.
* Social-cognitive (focus: internal and external; concern: context and cognition). Bandura, Walters; includes social and cognitive psychology.
* Trait (focus: internal; concern: values, behavior and their relationship with performance). Based on clusters, factor analysis and predictability. Examples: 16PF, OPQ, IPT.
Personality tests are often administered as self-completed sets of questions about preferences and behaviors, each of which contributes towards a score or position along a number of personality dimensions, e.g. a mark on a scale between two poles:

Extravert ----X--------|------------- Introvert
Any given score may be correlated with a particular job. For example, jobs that require significant interaction with people may have a correlation of extraversion with job success.
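As a minimal sketch of how such a dimension score might be computed from a self-report questionnaire; the items, keying and the 1-5 agreement scale are invented for illustration:

```python
# Sketch of scoring a self-report questionnaire along one personality
# dimension (extravert vs. introvert). Items, keying and scale are
# invented for illustration.

ITEMS = {
    "I enjoy meeting new people": +1,   # extravert-keyed
    "I prefer working alone": -1,       # introvert-keyed (reverse-scored)
    "Large parties energise me": +1,
}

def extraversion_score(answers: dict[str, int]) -> float:
    """answers: item -> agreement from 1 (disagree) to 5 (agree).

    Returns a score where positive leans extravert, negative introvert."""
    score = 0
    for item, key in ITEMS.items():
        a = answers.get(item, 3)        # 3 = neutral if unanswered
        score += key * (a - 3)          # centre the 1-5 scale on 0
    return score / len(ITEMS)
```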
Development
Personality tests are hard to validate and so are developed over a long period of hypothesis, test and observation.
There are several actions a test developer can use to minimize faking:
* Give instructions with warning.
* Include a social desirability (lie) scale, e.g. the MMPI.
* Ipsative questions (forced choice and no middle option)
* Conceal purpose of questionnaire (eg. biodata that correlates non-obvious biographical data with performance predictors).
* Say ‘don’t think too hard’.
* Promise (and give) feedback to the test-taker.
Discussion
Personality tests are very commonly used, although often from a viewpoint that (incorrectly) perceives them as very strong predictors of behavior. Personality is a complex concept, and whilst personality tests can give useful indicators, the world is not divided up into 16 (or fewer!) types of people who are unable to see or act outside of their personality profiles.
Stability
There is often a belief that personality is fixed and does not change. In practice there are three types of instability:
* Temporal: Personality can change over time; for example, in the Jungian Type Inventory there is a tendency either to polarize at one end of a spectrum or to recognize a need for flexibility and tend towards the middle.
* Contextual: People act very differently in different situations (e.g. home and work).
* Internal: Personality assessments are often based on self reports, where people often answer questions based on an idealized self or what they believe is needed.
Predictive validity
Much research shows the value of personality tools and their links to job needs; for example, the best pilots have emotional stability and extraversion. Many people will self-select jobs based on their perceived personality fit.
Bandwidth can be an issue, where the breadth of coverage by an instrument is insufficient. However, many factors become unwieldy, whilst too few are criticised as simplistic (e.g. the 16PF vs. Big Five arguments).
The jangle fallacy occurs where the same trait name is used by two or more questionnaires. This can be confusing.
The most predictive personality factors for job performance are conscientiousness and general intelligence (but what is ‘job performance’?). Sub-factors of ‘conscientiousness’ in studies also varied (competence, order, dutifulness, etc.). Combined traits are finding favour, such as conscientiousness plus agreeableness. Extraversion is important in some situations, such as sales - but high agreeableness may result in lower sales, and in some settings managers do less well if they are conscientious.
Overall, though, personality tests have low predictive validity for job performance, yet they are often used for this; for example, people may be de-selected solely on test results. People have even been made redundant from jobs based on personality tests (and given a biased report to justify this).
Work is often done in teams and personality tests often do not cover this (or do so only in a limited way).
Distortion and faking
Distortion and faking can be a problem where people deliberately or subconsciously bias their self-reports (social desirability bias can have an undesired effect here). There may be a central tendency, where people take the safe middle choice. Acquiescent people tend to answer 'yes' and 'agree' more than they should.
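A minimal sketch of screening for these response styles, assuming a 1-5 agreement scale; the cut-offs are illustrative assumptions, and flagged patterns would warrant investigation rather than automatic rejection:

```python
# Sketch of screening response patterns for acquiescence (too much
# agreement) and central tendency (too many safe middle answers) on a
# 1-5 scale. Cut-offs are illustrative assumptions.

import numpy as np

def response_style_flags(answers: np.ndarray) -> dict[str, bool]:
    """answers: one person's 1-5 responses across all items."""
    agree_rate = np.mean(answers >= 4)
    middle_rate = np.mean(answers == 3)
    return {
        "acquiescent": agree_rate > 0.8,
        "central_tendency": middle_rate > 0.6,
    }
```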
Despite concerns, faking does not affect validity that much.
Faking good (Impression Management) can be useful in the target job and is itself an indicator of personality.
______________________________________________________
Resume and CV:
The résumé (USA) or CV (UK) is a personal summary of an applicant's history that may include:
* Contact details
* Summary statement about the person, characterizing them and their ambitions.
* Experience (usually the main body of the application)
* Qualifications, both academic and professional
* Hobbies and other interests
Two common forms of résumé are the functional résumé and the historic résumé. In the functional résumé, the applicant takes a list of particular skills and knowledge and gives evidence of their ability in each item. In the historic résumé, they summarize achievements for job positions in historical order (usually with the most recent job first).
Discussion
The résumé is probably the most common tool used in selection, at least in the initial selection process where it is often used as the basis for initial shortlisting (at least for external candidates --- internal selection often does not use the résumé). Care must be taken here, as when there are many candidates, it is easy to 'throw the baby out with the bathwater', filtering out good ones as well as the less desirable ones.
The résumé is a self-report and, as such, may be economical with the truth or contain exaggerations and even complete fabrications. Applicants know the importance of the résumé, which may get only a few seconds of attention before it is rejected, and so may take inordinate care over its construction. A well-crafted résumé may indicate that the person is careful and skilful. It may also be true that they paid a professional to write it for them. It is also likely that they have thought hard about what to tell you and what not to tell you. A more amateur layout may be less polished, but it may also be more naive and honest.
As the first thing that the recruiter sees about a person, and also the most common tool used in interviews, it often has a disproportionate effect.
For use in interviews, key aspects that match job criteria may be extracted and used to help probing. If the person is lying in the résumé, you may spot this during the more detailed questioning. You may also follow up with referees (although do remember that these also were selected by the candidate).
CVs are written by the candidate and are intended to show them in their best light and hence are unlikely to include negative elements. The CV may thus be taken as an indicator only, with verification of key items by other methods, such as following-up of references, questioning during interview and testing of skills by a work sample.
Without care, recruiters may easily be seduced by subtle elements of the CV that have been shown to have undue effect. For example, when comparing CVs, impression management elements such as competency statements (Bright and Hutton, 2000) have been shown to have a positive effect.
The CV itself, even when used with other methods, may have a disproportionate influence on selection (Robertson and Smith, 2001), perhaps due to its familiarity or ready availability (Tversky and Kahneman, 1974).
Deselection on racial and other grounds is illegal, including by CV. There are cases where minority activists have sent two CVs to a company, identical apart from racial identifiers, and then sued the company when they received a response only to the ‘non-racial’ CV.
The candidate is also subject to legislation here: if they lie on their CV and are appointed on these ‘facts’, then this may be grounds for later dismissal. It is even known for people to pass themselves off as doctors and lawyers, faking certificates and other documents.
______________________________________________
Work Sample:
Description
‘Work sample’ is a method of testing ability by giving the candidate a sample of typical work to do and evaluating their performance.
Work samples may appear as short questions along the lines of 'What would you do in this situation?' or as more complex scenarios to analyze. At its most naturalistic, the candidate is put into the actual job, where they may spend some time doing real work. The normal situation, however, is for the candidate to be given a role-play of a realistic situation. This creates a repeatable pattern whereby multiple candidates can be given the same test and hence more easily compared.
Job-knowledge tests
Job-knowledge tests focus on a specific dimension or content area to determine current knowledge, such as a test of knowledge about the highway code.
Knowledge tests such as this may be computerized, enabling them to be taken at any time and even in any place. This also reduces the cost of administration and can reduce security issues (such as loss of exam papers).
Proctoring is a method often used, whereby questions and sequences are regularly changed to reduce cheating by copying.
Job knowledge tests are increasingly used in professional areas such as medicine and architecture.
Hands-on performance tests
Hands-on performance tests are used to test people for physical capabilities. An example is the psychomotor test, which is characterized by manual dexterity exercises.
Situational judgment tests
In situational judgment tests, people are asked how they would act in a given situation. This may be done with multiple choice to enable automated marking. They can be used for many different jobs, for example in leadership and teaching.
These tests assess job knowledge and the ability of the candidate to apply this knowledge in specific situations (rather like situational interviewing). They can be used to assess aptitude and trainability as well as current knowledge, and can be helpful in recruiting people with no previous experience.
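As a sketch of how such automated marking might work, with an invented scenario key in which each option earns 0-2 points:

```python
# Sketch of automated marking for a multiple-choice situational
# judgment test. The scenarios, options and scoring key are invented
# for illustration.

SJT_KEY = {
    "angry_customer": {"a": 0, "b": 2, "c": 1, "d": 0},   # b = best response
    "missed_deadline": {"a": 1, "b": 0, "c": 2, "d": 0},  # c = best response
}

def mark_sjt(responses: dict[str, str]) -> int:
    """responses: scenario id -> chosen option letter."""
    return sum(SJT_KEY[s].get(choice, 0) for s, choice in responses.items())
```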
Development
Work samples, as with other selection methods, often start with a job analysis of good performers.
The job is typically broken down into key behavioral components, which are then used to create a checklist of desirable behaviors.
From this, scenarios and case studies may be developed.
Discussion
Work samples are normally used to test current skill, though they can also be used to test the ability to learn new skills. They are based on the premise of behavioral consistency: the way a person acts in a simulated situation is assumed to be the same as the way they might act on the job.
It is useful for reducing bias by assessors and is perceived to be fair and valid by both recruiters and candidates, as all candidates are treated in the same way, including the amount of time to respond (although this may reduce the chances of slow writers or reflective thinkers). It removes non-job-related cognitive factors, and is visibly related to the job in question.
Work samples have a high predictive validity of .37 to .54 and lead to less staff turnover.
Criticisms of work samples include that they are atheoretical and tied to an empiricist and Western view of the individual and work (Searle, 2003). Work samples must be carefully designed to test specific items. Problems arise where more attention is paid to face validity than to content validity, and they can also miss small but critical factors (such as color vision for engineers).
Where fairness is a concern, work samples are of particular value, as they have both higher face validity and greater fairness for non-traditional candidates (Lievens and Klimoski, 2001).
_______________________________________________________
Selection Articles:
Accuracy and Faking:
There is a conflict between the recruiter who wants to be accurate and the applicant who may be faking. Accurate information is true and without color or falsehood. It may be affected by the informant, the context, underlying information or measurement effects.
Candidate issues
Information from the candidate will be affected by their inference of the meaning of questions asked. Inference is a process that filters and distorts sense information via internal constructs such as schema and life narratives. If the meaning inferred is not exactly that which is intended, then the response cannot be accurate. Words are re-interpreted and extended with connotative meaning (Barthes, 1957). The implication for recruiters is that great care must be taken with language used in questions to avoid ambiguity and potential distortion in the inferential process.
In their desire to be selected, applicants may fake or bias responses, ranging from slight exaggeration to bare-faced lies. Recruiters can encourage applicants to be honest both by asking for honest replies and by warning of consequences of lying (Cascio, 1976).
There are known areas where faking is more likely, such as tenure and final salary. Simple facts may be followed up as appropriate with former employers to verify these.
Consistent faking is more difficult across multiple contexts. Multiple tests and multiple interviews, on different days and with different people, provide a richer information source from which variation in a candidate’s responses can be detected.
The informant may also be affected by their perception of job requirements and the culture of the target company. If they perceive a requirement for aggressive go-getting, then their responses may well be suitably biased. Recruiters may reduce this (or at least reduce variation in response between applicants) by providing accurate information about the job, the company and its culture.
Recruiters and accuracy
Recruiters also need information from inside the company to build accurate job descriptions and design appropriate recruitment instruments. Similar jobs may be quite different in different contexts (Goldstein et al, 1993) and hence need careful analysis. Managers may exaggerate the skills needed, listing a perfect prototype whilst losing the key attributes in the detail, or may bias detail to subtly influence policy or manage internal impressions. When gathering information from groups, social interactional effects may further bias information (e.g. Deutsch and Gerard, 1955). Recruiters may reduce internal bias by using such methods as critical incident analysis and task (rather than KSA) data (Morgeson and Campion, 1997).
Accuracy may also be a problem where job needs are constantly changing. What is accurate at the time of collection of data may not be true when it is used to select candidates. The time component of accuracy must thus also be considered and perhaps more general information sought where detail is subject to change.
Information may also be gathered from references, which provide additional information to questionnaires and interviews. There is inherent bias in this process, as the referees are chosen by the applicants. Referees, as with applicants and internal sources, are subject to bias, and recruiters need to take equal care in the questions they ask and the inferences they make from the information provided.
Finally, but actually first, recruiters can ensure that everyone in the recruiting cycle is well educated and knowledgeable about the real issues. Thus the use of qualified test developers, training of interviewers and coaching of managers can go a long way to ensure that recruitment is a professionally run process and that recruited candidates are the best possible people for the specific jobs and for the longer-term health of the overall organization.
______________________________________________________________
Faking and Deviance:
The question of deviance of faking behavior may start with a consideration of what is ‘deviant’. Downes and Rock (1998) indicate the multi-faceted and contextual nature of deviance that can range from moral to criminal to political.
In the common usage within recruitment, there is an assumption that deviance has a deliberate element of deception with the purpose of achieving the goal of the applicant whilst bypassing goals of recruiters. As such, it is perceived as immoral at best and criminal at worst (for example deceptions that may lead to an unqualified fantasist gaining a job as a doctor).
Different viewpoints
Deviance, however, is an individual construction, and whilst a recruiter may consider distortion as deviant, applicants may have a social norm that positions it as ‘normal’, particularly if they socialize within groups of job-hunters (such as third-year students). Where the perception is that ‘everybody is doing it’, not to fake may be perceived as dooming oneself to certain failure. Recruiters may also take this assumption and cognitively downgrade impressive applications.
Just as interviewers may be trained, so there are many ways that interviewees may be trained: practicing on web-based MBTI alternatives, reading the many self-help books (such as Bolles, 2004), being coached and so on. This again is a socialized process in which it is considered ‘good practice’ to prepare for interviews by learning how to fake. Recruiters may even collude with such activities, welcoming such preparation as an indication of the seriousness of candidates.
In any persuasive situation, arguments are likely to be more extreme in order to sway the other person (Pruitt, 1971). This is also likely within interviews, where the goal of the interviewee is to sway the interviewer. In the other direction, the interviewer may be torn between the role of company representative, attracting good applicants, whilst also taking the role of objective and dispassionate judge, seeking only the truth of the applicant’s capabilities.
Interviews as social occasions
Within the situation of the interview, the interviewer and interviewee form a temporary group and as such are likely to succumb to group behaviors such as politeness (Brown and Levinson, 1978) that constrain and distort communications. There will also be natural harmonization, where each may subtly change to be more like the other (Giles and Wiemann, 1987). Social desirability bias (Fisher, 1993) is likely to lead both to alter their behavior to be more acceptable and attractive to the other. Equity theory (Adams, 1963, 1965) also points to the likelihood that each will seek a balance of exchange. Where an interviewer is drawing much information from a candidate, this may drive them to redress the balance, perhaps by interacting more or giving way on questions of doubt.
In interpersonal deception theory, Buller and Burgoon (1994, 1996) note that deception happens in a dynamic interaction where liar and listener dance around one another, changing their thoughts in response to each other’s moves. ‘Normal’ behavior in such situations includes manipulation of information, strategic behavior control and image management. Social interactions are many and complex, and even if either party deliberately seeks to be objective, the subconscious nature of communication makes it impossible to suppress fully.
Robinson (1996) describes a ‘pyramid of truths and falsehoods’ that indicates the complex fabric of truth in social environments, ranging from the normal social interactions where truth is a negotiated variable through to the scientific pinnacle, where truth is proven and largely unchallengeable. Again, this indicates the dilemma of the conversation between interviewer and interviewee who may be sitting at different levels. This highlights how the legitimacy and desirability of faking is situated with the person and their individual context beyond the interview arena. What is deviant and unwelcome to one may be a natural social dance to the other.
Faking as desirable
In the reality of everyday company business, faking may be a normal and even necessary skill. Some jobs in particular may require the assumption of brand-aligned personas (eg. customer service staff) whilst jobs such as sales and marketing may require some form of distortion that is not unlike the impression management that a job candidate may use. In such cases, recruiters may actively provoke and seek faking in aligned assessment contexts, such as in the ability to maintain a calm exterior when under pressure and where a ‘truer’ expression would involve expression of strong emotions.
Despite these concerns, it has been found that faking has negligible effects on psychometric outcomes (e.g. Barrick and Mount, 1996) and, provided recruiters are aware of the effects, it need not be assumed that psychometric results (particularly from established and proven tests) are invalidated by faking.
____________________________________________________________
Personality and Job Success:
When you use or take personality tests, what are the factors that lead to job success?
There are several personality factors that have been correlated with general success in jobs, although it has also been shown that too much of one factor can cause problems.
Key success factors
A couple of factors have been shown to be highly correlated with success in jobs.
Intelligence
A number of studies (including Hunter and Hunter (1984) and Schmidt et al, (1992)) have shown that general intelligence ('g') often correlates well with job success. Basically, it means that intelligent people are generally good at jobs. Hire bright people and they will be able to do what you ask of them.
There may also be other related factors in this. For example you could argue that intelligent people are able to understand not only the task factors required to succeed in a role but also the social factors. Following the conjecture, if you link intelligence to education and a 'good upbringing' then values-based factors such as conscientiousness and agreeableness may also be seen as related.
Conscientiousness
Conscientiousness has also been shown by several studies as being highly correlated with job success (e.g. Barrick and Mount (1991) and Robertson and Kinder (1993)).
If a person is conscientious, then they will work hard to complete work they have committed themselves to doing. They can also be left alone without need for constant supervision.
Other success factors
Other factors have also been shown to be linked with success, although Barrick et al (2003) showed that these tend to be more relevant to some jobs than others (particularly those with more significant social elements).
Beyond (and as well as) these, you may well need to do a thorough job analysis with some kind of factor analysis that isolates both individual success factors and clusters of them for the specific openings you have.
Openness
If you are open to experience, are ready to challenge yourself and learn and welcome feedback from others, then you will not only learn far more, you will also be perceived as a pleasant person by others who will be more ready to work with you. Unsurprisingly, this tends to make people better at jobs.
Agreeableness
When paired with conscientiousness, a person who is easy to get on with becomes even more successful in many jobs. Particularly if the work requires working with other people, a person who is disagreeable is not likely to gain good cooperation.
Sometimes agreeableness is not required in large quantities. For example a salesperson who is too nice to customers may not win the tough bargaining deals.
Extraversion
If you are outgoing and get your energy from being with other people, you will probably do better in many jobs than others, especially (of course) those that require you to work actively with others, whether in managerial or team environments.
_________________________________________________________
Reducing faking in tests:
Whilst faking is a perennial issue, designers of selection tests have a number of tools and techniques available that can be used to counteract or at least detect this.
Initial instructions
At the start of a test, the instructions given to candidates can include warnings of the consequences of detected faking and request honesty. Instructions may also ask candidates to answer quickly, with their first thoughts rather than pondering. Holden et al (2001, p160) indicate that lying takes time. This is also supported by Ekman’s (1985) general study of lying.
Trick questions
It is also possible to include ‘trick’ questions, where a faking response is easily identified and hence raises suspicions about all other responses. For example in an assessment of a given set of skills, a multiple-choice question may have no right answer. Earlier assertions may later be probed in more detail.
Multiple sources
Instruments that use self-reporting may give false readings when they are used by candidates who have insufficient self-insight to answer fully. If information is collected from multiple sources then this problem may be reduced, for example through the use of ‘assessment centres’ where multiple methods and assessors give a range of data and viewpoints which can be cross-checked.
Safe answer
Test takers who use the ‘central response tendency’ and opt for ‘safe’ central options may be identified by asking different questions for which a consistent response would include high and low responses.
Where individuals have a high need for approval, they may tend towards positive ‘agree’ and ‘yes’ responses. This may be countered and detected by reversing some questions (reversing also breaks up habituating patterns of similar responses). This tendency towards seeking approval may also be detected by including a ‘social desirability’ scale within the questions so that it can be isolated.
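As an illustration, here is a minimal Python sketch (not from the source) of reverse-keying and a crude acquiescence check; the 1-5 scale and the 'agree' threshold are assumptions made for the example.

  # Minimal sketch: reverse-keying and a crude acquiescence check.
  # Assumes a 1-5 Likert scale; the threshold is illustrative only.

  def reverse_score(response: int, lo: int = 1, hi: int = 5) -> int:
      # Reverse-key an item so 'agree' and 'disagree' swap ends of the scale.
      return hi + lo - response

  def acquiescence_rate(responses: list[int], agree_from: int = 4) -> float:
      # Proportion of 'agree'-end answers; a high rate may signal yea-saying.
      return sum(r >= agree_from for r in responses) / len(responses)

  answers = [5, 4, 5, 5, 4, 5]                # hypothetical candidate responses
  print(reverse_score(5))                     # -> 1
  print(acquiescence_rate(answers))           # -> 1.0, worth a closer look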
Multiple questions
Assessing the same attribute with multiple questions can also show whether the candidate is averaging across questions (‘I’ve been a bit negative, I think I shall be positive for a while now.’), although obvious care needs to be taken to ensure that similar questions are interpreted in the intended way. Analysis of sequential patterns of positive and negative responses across the test may also identify uncertainty or deliberate averaging.
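A rough sketch (an assumption-laden illustration, not a method from the source): near-perfect alternation around the scale midpoint across same-attribute items can flag deliberate averaging.

  # Minimal sketch: count flips between above- and below-midpoint answers on
  # items measuring the same attribute; the midpoint value is an assumption.

  def alternation_rate(responses: list[int], midpoint: float = 3.0) -> float:
      signs = [r > midpoint for r in responses if r != midpoint]
      if len(signs) < 2:
          return 0.0
      flips = sum(a != b for a, b in zip(signs, signs[1:]))
      return flips / (len(signs) - 1)

  print(alternation_rate([5, 1, 4, 2, 5, 1]))  # -> 1.0: suspiciously regular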
Ipsative questions
Normative items ask the candidate to rate their level of agreement with statements, and can give a good measure of psychological characteristics (Kline, 1993). However the question of faking has led to an ipsative approach being used in many contexts, where the test-taker is forced to make a choice from a fixed number of options. Ipsative questions either offer choice between items from very different areas (one question I recall from such a test is ‘Which do you prefer, a poem or a gun?’), or a polar choice from the same scale, which may have a yes/no response.
However, as Johnson et al (1988) have pointed out, ipsative forced-choice approaches are highly problematic. The very notion that you can ‘force’ someone to choose denudes them of free will, and there are very real problems of respondents either second-guessing or making a random choice from a set of items amongst which they have no clear preference. Martin et al (1995) have shown that test takers with a good insight into job needs can provide realistic faked responses. Ipsative methods still persist, in particular where sound alternatives are not available; for example, the Zuckerman, Eysenck and Eysenck (1978) scale of sensation-seeking is still in use, despite the report by Ridgeway and Russell (1980) of unacceptably low reliabilities for its various sub-scales.
Question opacity
Faking may also be reduced by use of item opacity, where the respondent does not know ‘right or wrong’ answers. For example use of Biodata approaches, where traits and historic activities have been correlated with requirements of the job in question, can offer very opaque questions (such as the WW2 discovery of the correlation between childhood flying of model aeroplanes and good pilots).
Including the candidate
Including the candidate in the assessment process can also help to reduce faking, socialising them into providing honest responses. This may be implemented, for example, in assessment centres, where they may be involved in discussions about psychometric outcomes.
_______________________________________________________________________
Reliability:
Definition
If a test is unreliable, then although the results for one use may actually be valid, for another they may be invalid. Reliability is thus a measure of how much you can trust the results of a test.
Tests often have high reliability – but at the expense of validity. In other words, you can get the same result, time after time, but it does not tell you what you really want to know.
Stability
Stability is a measure of the repeatability of a test over time, that it gives the same results whenever it is used (within defined constraints, of course).
Test-retest reliability is the repeatability of a test over time: the same person should get the same results on different occasions, and it must be measured to assure the stability of a test. Stability, in this case, is the variation in scores between administrations. Problems with this include:
* Carry-over effect: people remembering answers from last time.
* Practice effect: repeated taking of test improves score (typical with classic IQ tests).
* Attrition: People not being present for re-tests.
There is an assumption with stability that what is being measured does not change. Variation should be due to the test, not to any other factor. Sadly, this is not always true.
Consistency
Consistency is a measure of reliability through similarity within the test, with individual questions giving predictable answers every time.
Consistency can be measured with split-half testing and the Kuder-Richardson test.
Split-half testing
Split-half testing measures consistency by:
* Dividing the test into two (usually a mid-point, odd/even numbers, random or other method)
* Administering them as separate tests.
* Comparing the results from each half.
A problem with this is that the resultant tests are shorter and can hence lose reliability. Split-half is thus better with tests that are rather long in the first place.
Use Spearman-Brown’s formula to correct problems of shortness, enabling correlation as if each part were full length:
r = 2r_hh / (1 + r_hh)
(where r_hh is the correlation between the two halves)
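As a worked sketch with hypothetical data (the odd/even split and the scores are illustrative assumptions, not from the source), the correction can be computed directly:

  # Minimal sketch: split-half reliability with the Spearman-Brown correction.
  import numpy as np

  # scores[i, j] = score of person i on item j (hypothetical data)
  scores = np.array([[1, 0, 1, 1, 0, 1],
                     [1, 1, 1, 0, 1, 1],
                     [0, 0, 1, 0, 0, 1],
                     [1, 1, 0, 1, 1, 1],
                     [0, 1, 0, 0, 1, 0]])

  half_a = scores[:, 0::2].sum(axis=1)       # odd-numbered items
  half_b = scores[:, 1::2].sum(axis=1)       # even-numbered items

  r_hh = np.corrcoef(half_a, half_b)[0, 1]   # correlation between the halves
  r = 2 * r_hh / (1 + r_hh)                  # reliability as if full length
  print(round(r, 3))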
Kuder-Richardson reliability or coefficient alpha
The Kuder-Richardson reliability or coefficient alpha is relatively simple to compute, being based on a single administration of the test. It assesses the inter-item consistency of a test by looking at two error measures:
* Adequacy of content sampling
* Heterogeneity of domain being sampled
It assumes reliable tests contain more variance and are thus more discriminating. Higher heterogeneity leads to lower inter-item consistency. For dichotomous right/wrong items this is the Kuder-Richardson formula; for non-dichotomous items, coefficient alpha is used:
R_kk = (k / (k – 1)) × (1 – Σσ²i / σ²t)
(where R_kk is the alpha coefficient of the test, k is the number of items, σ²i is the variance of item i and σ²t is the variance of total test scores)
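A minimal sketch of the same calculation in code, with hypothetical data:

  # Minimal sketch: coefficient alpha for a people-by-items score matrix.
  import numpy as np

  def coefficient_alpha(scores: np.ndarray) -> float:
      k = scores.shape[1]                          # number of items
      item_var = scores.var(axis=0, ddof=1).sum()  # sum of item variances
      total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
      return (k / (k - 1)) * (1 - item_var / total_var)

  scores = np.array([[3, 4, 3, 5],                 # hypothetical responses
                     [2, 2, 3, 2],
                     [4, 5, 4, 5],
                     [3, 3, 2, 3]])
  print(round(coefficient_alpha(scores), 3))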
Equivalence of results (parallel form)
Parallel-form reliability seeks equivalence between two versions of the same test, comparing results from each version (like split-half testing). It is better than test-retest as both versions can be taken the same day (reducing variation).
There is a danger of tests with high internal validity having limited coverage (and hence lower final validity).
Bloated specifics are where similar questions lead to apparent significance. This can be bad when unintended, but can be used to create deliberate variations.
Parallel versions are useful in such situations as with graduates who may do the same test several times.
An adverse effect occurs where different groups score differently (potential racial or other bias). This may require different versions of the same test, e.g. the MBTI for different countries.
Discussion
There are a number of procedural aspects that affect test reliability, including:
* Test conditions
* Inconsistent administrative practices
* Variation in test marking
* Application of an inappropriate norm group
* Internal state of test-taker (tired, etc.)
* Experience level of test-taker (e.g. if they have taken the test before).
_________________________________________________________________
The selection spiral:
There is a significant danger in selection and promotion that a company can spiral downwards into incompetence and failure.
The power of talent
The talent and motivation of people in any given job can make a huge difference to the achievements that are gained. It is not uncommon to reckon that there can be a ten-to-one ratio in the performance of two different people doing the same job.
This has led to a significant focus on 'talent' and the categorization of A-, B- and C-players, where A-players are the high-achieving stars, the B-players are the solid, good-enough middle team and the C-players are the very limited bottom-end.
The selection trap
Selection is arguably the most important process in organizations. If you recruit or promote the wrong person, you get to live with the relative incompetence and never know the potential that has been lost.
The biggest danger in organizations is where managers select people who are less capable than themselves. The selection trap occurs where managers have an ego need to feel superior to their charges: if they interview someone who seems to be better than them, they feel threatened by that person and are less likely to employ them.
There are several fears involved in this. First, managers have a legitimate concern that they should be able to manage their subordinates; if they employ someone who challenges their direction too often, the situation may seem unmanageable. This sits alongside the less legitimate desire to look good, and the fear that a superior subordinate might make the manager look bad or even try to take their job.
A-players are particularly susceptible to this trap, as they often have bigger egos and may be driven narcissists who work hard to seek the recognition they need. Anyone else who takes glory from them is thus to be feared, rejected or attacked.
The spiral
When managers appoint people less able than themselves, the result is a downward spiral in the talent that the company recruits. The gradient of this slope depends on how widespread the behavior is and on the gap that each manager needs between him/herself and the new appointee.
In particular, if managers feel threatened by people who might take their job, then they will filter out applicants who show signs of ability in management and leadership. This fear of replacement may also drive a general opposition to development of their employees, leading to a leadership vacuum that eventually results in a company that lacks direction and inspiration, and which is eventually overtaken (and maybe taken over) by fitter competitors.
____________________________________________________________________
Validity:
When designing and using tests and other methods of assessing people, it is important that the test and its use is valid.
Definition
Validity has been described as 'the agreement between a test score or measure and the quality it is believed to measure' (Kaplan and Saccuzzo, 2001). In other words, it measures the gap between what a test actually measures and what it is intended to measure.
This gap can be caused by two particular circumstances:
(a) the design of the test is insufficient for the intended purpose, and
(b) the test is used in a context or fashion which was not intended in the design.
Types of validity
Face validity
Face validity means that the test appears to be valid. This is assessed using common-sense rules, for example that a mathematical test should include some numerical elements.
A test can appear to be invalid but actually be perfectly valid, for example where correlations between apparently unrelated items and the desired criteria have been found: successful pilots in WW2, for instance, were very often found to have had an active childhood interest in flying model planes.
A test that does not have face validity may be rejected by test-takers (if they have that option) and also people who are choosing the test to use from amongst a set of options.
Content validity
A test has content validity if it sufficiently covers the area that it is intended to cover. This is particularly important in ability or attainment tests that validate skills or knowledge in a particular domain.
Content under-representation occurs when important areas are missed. Construct-irrelevant variation occurs when irrelevant factors contaminate the test.
Construct validity
Underlying many tests is a construct or theory that is being assessed. For example, there are a number of constructs for describing intelligence (spatial ability, verbal reasoning, etc.) which the test will individually assess.
Constructs can be about causes, about effects and the cause-effect relationship.
If the construct is not valid then the test on which it is based will not be valid. For example, there have been historical constructs that intelligence is based on the size and shape of the skull.
Criterion-related validity
Criterion-related validity is like construct validity, but now relates the test to some external criterion, such as particular aspects of the job.
There are dangers with the external criterion being selected based on its convenience rather than being a full representation of the job. For example an air traffic control test may use a limited set of scenarios.
Concurrent validity is measured by comparing two tests done at the same time, for example a written test and a hands-on exercise that seek to assess the same criterion. This can be used to limit criterion errors.
Predictive validity, in contrast, compares success in the test with actual success in the future job. The test is then adjusted over time to improve its validity.
The validity coefficient
The validity coefficient is calculated as a correlation between the two items being compared, very typically success in the test as compared with success in the job.
A validity coefficient of 0.6 and above is considered high; few selection tests reach this level, which suggests that very few tests give strong indications of job performance.
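As an illustration with hypothetical numbers (not from the source), the coefficient is simply the correlation between test scores and a criterion measure such as later job performance:

  # Minimal sketch: validity coefficient as a test-criterion correlation.
  import numpy as np

  test_scores = np.array([52, 61, 48, 70, 66, 55, 73, 59])       # selection test
  job_perf = np.array([3.1, 3.8, 2.9, 4.2, 3.9, 3.0, 4.5, 3.4])  # later ratings

  validity = np.corrcoef(test_scores, job_perf)[0, 1]
  print(round(validity, 2))    # 0.6+ would count as high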
_______________________________________________________________________
Application Form:
Discussion
Application forms are finite, and long forms are likely to put off some candidates (although this may be of benefit to put off the casual applicant).
Application forms are used a great deal on the web and facilitate automated filtering, where a job may have many applicants and individual sections may be scanned for specific key words (such as qualifications or experience).
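As a simple illustration of such filtering (the keyword list and field text are hypothetical assumptions, not from the source):

  # Minimal sketch: scanning an application-form field for key words.
  REQUIRED = {"degree", "project management", "python"}

  def found_keywords(field_text: str) -> set[str]:
      text = field_text.lower()
      return {kw for kw in REQUIRED if kw in text}

  answer = "BSc degree; five years of project management experience."
  print(sorted(found_keywords(answer)))   # -> ['degree', 'project management']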
Application forms, like CVs, are self-reports and hence are open to impression management and other forms of faking, so ‘factual’ information should be treated with care.
A sharp candidate will take good note of the application form, as it often hints at (or even shouts about) the key criteria that the recruiting company is seeking.
Design of application forms must take care about legal constraints. If there is any legal challenge to the application process, the motivation for any item in the form could be challenged. Both the language and content of application forms thus need to be carefully screened for bias and sensitivity. For example, if ethnic background is being questioned, then there must be a legitimate reason for this (and the wording must also be ‘politically correct’).
_______________________________________________________________________
Assessment Centers:
Description
The assessment center is an approach to selection whereby a battery of tests and exercises is administered to a person or a group of people across a number of hours (usually within a single day).
Assessment centers are particularly useful where:
* Required skills are complex and cannot easily be assessed with interview or simple tests.
* Required skills include significant interpersonal elements (e.g. management roles).
* Multiple candidates are available and it is acceptable for them to interact with one another.
Individual exercises
Individual exercises provide information on how the person works by themselves. The classic exercise is the in-tray, of which there are many variants, but which share a common theme of giving the person a large, unstructured pile of work and then seeing how they go about doing it.
Individual exercises (and especially the in-tray) are very common and correlate with cognitive ability. Other variants include planning exercises (here are some problems, how will you address them?) and case analysis (here is a scenario: what is wrong? how would you fix it?).
One-to-one exercises
In one-to-one exercises, the candidate interacts in various ways with another person, being observed (as with other exercises) by the assessor(s). They are often used to assess listening, communication and interpersonal skills, as well as other job-related knowledge and skills.
In role-play exercises, the person takes on a role (possibly the job being applied for) and interacts with someone who is acting (possibly one of the assessors) in a defined scenario. This may range from dealing with a disaffected employee to putting a persuasive argument to conducting a fact-finding interview.
Other exercises may have elements of role-play but are in more 'normal' situations, such as making a presentation or conducting an interview (an interesting reversal!).
Group exercises
Group exercises test how people interact in a group, for example showing in practice the Belbin Team Roles that they take.
Leaderless group discussions (often of a group of candidates) start with everyone in a relatively equal position (although this may be affected by factors such as the shape of the table).
A typical variant is to assign roles to each candidate and give them a brief of which the others are unaware. These groups can be used to assess such skills as negotiation, persuasion, teamwork, planning and organization, decision-making and leadership.
Another variant is simply to give the group a topic to discuss (though this has less face validity).
Business simulations may be used, sometimes with computers being used to add information and determine outcomes of decisions. These often work with 'turns' that are made of data given to the group, followed by a discussion and decision which is entered into the computer to give the results for the next round.
Relevant topics increase face validity. Studies (Bass, 1954) have shown high inter-rater reliability (.82) and test-retest results (.72).
Self-assessment exercises
A neat trick is to ask candidates to assess themselves, for example by asking them to rate themselves after each exercise. There is usually a high correlation between candidate and assessor ratings (indicating honesty).
Ways of improving these exercises include:
* Increasing the length of the assessment form to include behavioral dimensions based on selection competencies.
* Changing instructions to promote a more realistic appraisal by applicants of their skills.
* Implying that candidates will be held accountable if a discrepancy is found between their ratings and the assessors'.
Those with low self-assessment accuracy are likely to find behavioral modification and adaptation difficult (perhaps as they have low emotional intelligence).
Development
Developing assessment centers involves much test development, although much can be selected 'off the shelf'. A key area of preparation is with assessors, on whose judgment candidates will be rejected and selected.
Identify criteria
Identify the criteria by which you will assess the candidates. Derive these from a sound job analysis.
Keep the number of criteria low -- fewer than six is good -- in order to help assessors remember and focus. This also helps simplify the final judgment process.
Develop exercises
Make exercises as realistic as possible. This will help both candidates and assessors and will give a good idea what the candidate is like in real situations.
Design the exercises around the criteria so they can be identified, rather than finding a nice exercise and seeing if you can spot any useful criteria. Allow both for confirmation and for disconfirmation of criteria.
Include clear guidelines for players so they can get 'into' the exercises as easily as possible. You should be assessing them on the exercise, not on their memory.
Include guidelines also for role-players, assessors and also for those who will set up the exercises (eg. what parts to include in exercise packs, how to set them up ready for use, etc.).
Triangulate results across multiple exercises so each exercise supports the others, showing different facets of the person and their behavior against the criteria.
Select assessors
Select assessors based on their ability to make effective judgments. Gender is not important, but age and rank are.
There are two approaches to selecting assessors. You can use a small pool of assessors who become better at the job, or you can use many people to help diffuse acceptance of the candidates and the selection method.
Do use assessors who are aware of organizational norms and values (this militates against using external assessors), but do also include specialists, e.g. organizational psychologists (who may well be external, unless you are in a large company).
Develop tools for assessors
Asking assessors to make personal judgments is likely to result in bias. Tools can be developed to help them score candidates accurately and consistently.
Include behavioral checklists (lists of behaviors that display criteria) and behavioral coding that uses prepared data-gathering sheets (this standardizes data between gatherers).
Traditional assessment has a process of observe, record, classify, evaluate. Schema-based assessment has examples of poor, average and good behavior (there is no separation of evaluation and observation).
Prepare assessors and others
Ensure the people who will be assessing, role-playing, etc. are ready beforehand. The assessment center should not be a learning exercise for assessors.
Two days of training are better than one. Include theory of social information processing, interpersonal judgment, social cognition and decision-making theory.
Make assessors responsible for giving feedback to candidates and accountable to organization for their decisions. This encourages them to be careful with their assessments.
Run the assessment center
If you have planned everything well, it will go well. Things to remember include:
* Directions to the center sent well beforehand, including by road, rail and air.
* Welcome for candidates, with refreshments and waiting area between exercises.
* Capturing feedback from assessors immediately after sessions.
* A focus with assessors on criteria.
* Swift and smooth correction of assessors who are not using criteria.
* A timetable for everyone that runs on time.
* Lunch! Coffee breaks!
* Thanks to everyone involved.
* Finishing the exercises in time for the assessors to do the final scoring/discussion session.
Follow-up
After the center, follow up with candidates and assessors as appropriate. A good practice is to give helpful feedback to candidates who are unsuccessful so they can understand their strengths and weaknesses.
Discussion
Assessment centers have grown hugely in popularity. In 1973 only about 7% of companies were using them. By the mid-1980s, this had grown to 20%, and by the end of the 1990s it had leapt again to 65%.
Assessment centers allow assessment of potential skill and so are good when seeking new recruits. They allow a wide range of criteria to be assessed, including group activity and aggregations of higher-level, managerial competences.
Assessment centers are not cheap to put on and require multiple assessors who must be available. Organizational psychologists can be of particular value to assess and identify the subtler aspects of behavior.
Origins
The assessment center was originated by AT&T, who included the following nine components:
1. Business game
2. Leaderless group discussion
3. In-tray exercise
4. Two-hour interview
5. Projective test
6. Personality test
7. ‘Q-sort’
8. Intelligence tests
9. Autobiographical essay and questionnaire
Validity
Reliability and validity are difficult to establish, as there are so many parts and so much variation. A 1966 study showed high validity in identifying middle managers. There is a lower adverse effect on individuals than with separate tests (e.g. psychometrics).
Criticisms
The outcomes of assessment centers are based on the judgments of the assessors and hence on the quality of those judgments. Not only are judgments subject to human bias, but they are also affected by the group psychology effects of assessors interacting.
Assessors often deviate from marking schemes, often collapsing multiple criteria into a generic ‘performance’ criterion. This is often due to overburdening of assessors with more than 4-5 criteria (so use fewer). More attention is often given to direct observation than to other data (e.g. psychometric tests). Assessors even use their own private criteria – especially organizational fit.
_______________________________________________________________________
Biodata:
Description
Biodata methods collect biographical information about a person that has been proven to correlate with good job performance. The correlations can be quite strange: all you need to know is that if a person has a certain item in their history, they are more likely to be good at the target job.
For example, in World War 2, the US Air Force discovered that men who had built and flown model aircraft when they were boys were more likely to make good fighter pilots. In 1952, Mosel's detailed study of department store sales staff found that the most successful people were widowed, female, 35-54 years old, between 4 foot 11 inches and 5 foot 2 inches, weighed at least 160 pounds, lived in a boarding house, had dependants, had a high-school education, had at least five years sales experience but had been in the previous post for less than five years with no time off for illness.
More recent and understandable events can also be used, for example working on government projects might correlate with effective use of project management methodologies. Biodata can include aspects of personal information, childhood, education, employment, external experiences, skills, socioeconomic status, social activities, hobbies and personal traits and characteristics.
Biodata is collected using a written form, structured to discover the key information that is required. The target people complete the form and hand it in, after which it is studied for the key characteristics being sought.
Development
Define 'performance'
Start by defining the performance that you are seeking. This should be in a form that will help you with the next step, for example using standard descriptions of 'leadership' or 'technical ability'.
Find high performers
Find a significant population from which you can extract sufficient numbers of high performers in the job you are analyzing. Thus a large company may study international managers who have proven successful at managing virtual teams, whilst an army may seek individual soldiers who have shown exemplary battlefield bravery (perhaps via those who have been awarded medals).
Collect biographical data
Design a structured and repeatable data collection method to extract as much biographical information as you can handle. This may include investigation of childhood events, education, jobs performed and self-reported significant 'life events'.
Approaches such as Critical Incident Technique, Interview and Structured questionnaires may be used to collect information. Each method used will typically expose different information, allowing different facets of the person to be examined.
Correlate performance and biographical data
This step is largely statistical, as the biographical data is coded and correlated with job performance.
Scoring is done with ‘item response-level keying’:
* Empirical approach: look at proportion of variance accounted for between item and outcome criteria.
* Rational approach: job experts devise weightings based on theoretical a priori links.
Draft a questionnaire for each hypothesized scale and apply it to a large sample (around 450). Each question is tested and the results are factor analyzed to find (desirable) clustering.
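A minimal sketch of the empirical approach for a single yes/no item, with hypothetical data (not from the source):

  # Minimal sketch: empirical keying of one biodata item against a criterion.
  import numpy as np

  item = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])   # e.g. 'built model aircraft'
  performance = np.array([4.2, 2.9, 3.8, 4.0, 3.1, 4.4, 2.7, 3.0, 3.9, 4.1])

  r = np.corrcoef(item, performance)[0, 1]
  print(round(r, 2), round(r ** 2, 2))    # correlation, variance accounted for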
Design biodata form
When collecting biodata information, it can be a good idea to hide the key characteristics you are seeking within other irrelevant information (which also may be found in your researches). Where possible, collect hard, verifiable items (such as examinations passed). Where this is not possible, the potential for faking must be taken into account.
Biodata may be collected in scripted format:
Please describe a time when you were faced with a difficult customer. What was said? What did you do? What was the result? (Please answer in no more than 200 words.)
It is very often, however, collected in more structured formats that allow for rigorous analysis:
Which of the following have you done in the past five years? Please tick all that apply:
[ ] Climbed a high mountain
[ ] Been on holiday to at least three different countries
[ ] Run a marathon
[ ] Raced a car
Test and use
Finally, try it out, for example on the people who supplied the data in the first place.
Discussion
Biodata is little known but is widely used (the first recorded use was in Chicago in 1894). It has nothing to do with biological aspects of the person, but has much to do with their biography (biodata is short for 'biographical data', not 'biological data'). Done well it has a high level of reliability and validity.
Unlike many other tools, biodata has a solely empirical root. There is no psychological theory behind it. It simply finds what works and does not question why. The founding hypothesis is that statistics holds true, and that correlations found in some people will also be found in other people.
Developing biodata tests is very time-consuming and hence costly. This goes some way to explain its use in particular areas, for example the military, where there is a large population available for study and a consistent need in terms of job performance.
Biodata differs from personality tests in that it has broader content and is more job-specific. It can also be re-scored for different roles and can be completed both by the candidate and by someone who knows them.
Once developed, it is then quick to process lots of applicants, particularly if multiple-choice tick-boxes are used.
Biodata has face validity and is clearly fair as it is same for all. It is also easier to monitor for discrimination.
Unless demonstrated to be job-dependent, items about race, gender, marital status, number of dependents, birth order and spousal occupation are likely to be illegal. Other issues of concern include accuracy, faking, invasion of privacy and adverse impact on perceptions of the applicant.
Faking may be minimized when the correlations are not clear (as in the WW2 pilot example). This can be improved further by hiding the real question amongst other less critical items.
Over time, a standard biodata test could lead to less diverse recruiting. It is also constrained by time and context: it can become dated as jobs change, which may reduce the case for using it, as it takes a lot of effort to set up in the first place. The same test may not be useful across the world, which is a problem for global companies.
___________________________________________________________________
Selection Interview:
Description
Interviews are conversations whereby a candidate interacts with one or more people who assess the candidate and, in a selection interview, decide on whether this person should be offered a job.
Such interviews typically last 15 to 60 minutes although they can be shorter or longer.
Interview structure
The structure of an interview is the degree of control exerted by the interviewer over what questions are asked and what information is sought. When there are specific informational needs, a more structured approach may be used.
* Unstructured interviews are unplanned, non-directed, uncontrolled, unformatted and flexible, with bilateral communication. They require skills in questioning and probing.
* Semi-structured interviews are pre-scheduled, directed but flexible, major topic areas are controlled and there is a focused flow.
* Structured interviews are pre-planned, interviewer directed, standardised, pre-formatted and inflexible. They have a full structure and use highly-designed, closed questions. They assume a consistent format will get consistent responses.
Interview types
There are four common types of selection interview:
* Situational interviews use situation-specific questions based on the job and look at hypothetical performance. They are conducted by specialists: psychologists or trained people.
* Job-related interviews ask about past behaviour on the job. They are typically conducted by HR or managers.
* Psychological interviews assess personality traits. They are conducted by work/organizational psychologists.
* Competency interviews widen psychological interviews to include competencies such as interpersonal skills, leadership and other identified key competencies.
Interviews may involve a varying number of people. One-to-one interviews are common, although there are benefits for using multiple interviewers, for example where one person asks the questions and the other observes in a more detached 'third person' position. This objective position can look for body language and other subtleties which the person questioning may miss.
Behavioral interviews
Behavioral interviews assume that past behaviour is likely to predict future behaviour, and the more evidence there is of a previous pattern then the more likely it is that it will be repeated in the future.
The interview is developed using Job Analysis and the Critical Incident Technique applied to effective and ineffective performance, through which a range of ‘performance dimensions’ is devised that indicates critical categories and provides the basis for detailed question themes.
The interviewer then pays particular attention to past behaviours in critical categories and also probes for motivations behind behaviours.
Variations include Behavioral Patterned Description Interviews (BPDIs), Behavioral Events Interviews and Criterion Referenced Interviews.
Situational interview
Situational interviews are based on Latham's Goal-Setting Theory, which assumes that intent precedes action. The focus is on the future (versus the past focus of behavioural methods).
The situational interview is developed by the use of Critical Incidents to devise a rating scale of behaviours. The costs can be high, at around $1000 per question.
The interviewer asks what the person would do in theoretical situations and assesses their response against the criteria and rating scale.
A new variant of situational interviews is the ‘multimodal’ form, focusing on self-presentation, vocabulary assessment, biographical questions and other situational aspects.
Situational interviews tend to reduce the chance of discrimination as they offer all candidates the same scenarios and evaluate them against the same criteria. Candidates also prefer them, as they seem fairer, though they still limit candidates' control over proceedings.
Development
Develop questions and scoring
The first stage of interview development is, if it is not already available, a job analysis of the position, in order to identify key knowledge, experiences and competencies. This may be done using sophisticated methods such as the Critical Incident Technique.
For situational interviews, particular scenarios are identified from which situational questions may be derived. For behavioral interviews, the attitudes and behaviors required in the job are uncovered.
For a structured approach, questions and scoring can be derived from empirical study that may include in-depth analysis of incumbents and interviews with high performers. Typically 10-15 traits emerge, from which around 120 questions may be developed around broad situational and behavioral aspects.
Questions should be tested on good/bad performers and around half discarded to ensure a consistently high quality of questions. A scoring guide may then be developed.
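As an illustrative sketch (hypothetical scores and an assumed cut-off, not the source's procedure) of screening draft questions by how well they separate known good and bad performers:

  # Minimal sketch: keep draft questions that discriminate between known
  # good and bad performers; the 1.0 cut-off is an illustrative assumption.
  import numpy as np

  good = np.array([[4, 3, 5], [5, 4, 4], [4, 4, 5]])   # rows: candidates
  bad = np.array([[2, 3, 2], [1, 4, 2], [2, 3, 1]])    # cols: draft questions

  discrimination = good.mean(axis=0) - bad.mean(axis=0)
  keep = discrimination > 1.0                          # discard the rest
  print(discrimination.round(2), keep)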
Train interviewers
Interviewers should receive some training rather than being plucked from management and other ranks and placed in front of an unsuspecting candidate. Allport (1937) identified key attributes of interviewers:
* Breadth of experience and a diverse background.
* Above average intelligence, with self-insight and understanding of others.
* Emotionally stable, well-adjusted, with good social skills.
* Ability to be detached.
* Similar in some way to interviewee.
* Expertise in role.
Important in training is developing objective rating skills, rather than letting the interviewer give way to the sizeable human bias and opinion that they may hold.
They also need to be seen to be fair (candidates watch for this) and must, of course, comply with laws and regulations around gender, ethnicity, age and so on. A worst-case interviewer can lay the company open to a damaging lawsuit (both financially and reputationally).
Discussion
The interview is an extremely common selection method and has a high predictive validity for job performance (Robertson and Smith, 2001). It can tap many job-relevant factors, including cognitive ability (Huffcutt et al., 1996), oral skills (Campion et al, 1988), social skills (Searle, 2003) and person-organisation fit (e.g. Harris, 1999).
Objectivist psychometric perspective
In the wonderfully-named objectivist psychometric perspective (i.e. the traditional viewpoint), there is a focus on structure, reliability and validity, based on an assumption that the interview is an objective and accurate means of assessment in which interviewees are passive participants providing information to rational interviewers who are skilled at acquiring and interpreting information.
Criteria used include cognitive ability, job knowledge or tacit knowledge (eg. through situational interviewing), social skills (e.g. extraversion, emotional stability, openness) and person-organisation fit (though this is fraught with difficulties).
There are dangers in this approach, for example in the use of the interviewer’s perception of the organisation, and what it needs. The interviewer typically seeks organisation fit by comparing against a simplistic prototype and may well assess personal qualities over required skills. 'Fit' may also happen at the social level, with the interviewer looking for personal fit with the candidate.
Social-interactionist perspective
In the social-interactionist perspective (i.e. the modern viewpoint), there is a focus on social factors and the dynamic process within the interview. The interview is seen as a subjective, complex and unique event in which both parties act as active participant-observers.
The perception of the interviewer is important within this perspective, and research has shown some interesting factors. Where the interviewer has not paid due attention to candidate qualifications, the candidate often draws back, becoming more reticent and talking less about themselves (perhaps punishing the interviewer for the slight). Candidate expectations depend on how much they like the interviewer as well as the job. The candidate may well feel unfairly treated if not given enough attention or opportunity for dialogue, and may develop negative expectations if the interviewer talks more than they do.
Faking
Faking is less easy in interviews than CVs or Application Forms, as non-verbal signals may be detected. Nevertheless, interviews are so common that some interviewees acquire significant expertise in the ‘interview technique’, managing impressions and having ready answers for common questions.
Impression Management (IM)
Faking may also appear through dress, words and body language, where impression management seeks to make a person appear more than they normally are. When the interviewer sees the candidate as over-dressed (or provocatively under-dressed!), with excessive make-up or otherwise trying too hard to impress, they may suspect them of concealment and mark them down accordingly.
Impression management includes ingratiating behaviour (agreeing, complimenting, offering favours), self-promotion (to boost apparent competency range) and also anger and intimidation (to show fearlessness).
In studies, women showed more openness and older, more experienced people maintained more eye contact, projected a more positive image and asked more questions. Older people reduced the number of entitlement statements used and increased self-enhancement and self-promotion statements. Where there was more role ambiguity, impression management increased.
Reliability and validity
Interviews are generally reliable, with criterion values of .51 common, rising to .63 when used with psychometric tests. Validity varies by style: situational (.60), job-related (.39) and psychological (.29).
Situational interviewing is relatively simplistic and is predominantly used in low-complexity jobs. Behavioral interviewing brings in a wider range of behaviours from inside and outside work, allows more thorough probing of motivations and is preferred for higher-level jobs.
Bias and misjudgement
There are many factors which can bias interviewers, including gender, race, age, appearance, attitude, non-verbal behaviour, physical setting and job market factors (Arvey and Campion, 1982), bias towards positive information, and even primacy, recency and contrast effects in the ordering of candidates (Asch, 1946; Miller and Campbell, 1959; Anderson and Shackleton, 1993). These factors may be reduced by training, but often not eliminated.
Within interviews, it is important that fair play is perceived. This includes, for example, that all candidates should have a comparable experience, even if the interviewer concludes early on that one is not suitable.
Interviewers are subject to normal human biases, for example they tend to be biased towards ‘people like me’. Positive information is weighted more than negative data (which takes more time to process). The order of candidates can also cause bias (primacy and recency effects).
The halo effect happens when one good aspect of a candidate makes them look good in other areas as well. The reverse is also true: the horns effect occurs where a negative perception is generalized to other aspects of the person. A typical horns effect is where the person is overweight and this is generalized into greed, lack of control, lack of social ability, etc.
The generalization continues: candidates who are nervous at interview may be assumed to be always nervous, whilst the confident may be assumed to be skilful in other areas too.
______________________________________________________________________
Intelligence testing:
Description
There are three schools of thought about intelligence and consequently how it may be tested.
Uni-factor models
Uni-factor models have a single dimension, defining 'intelligence' as a single thing that is fixed, unchangeable and can be measured. They usually have a socio-biological basis, where intelligence is defined as a combination of genetic and social factors.
Spearman (1904) did a factor analysis of ability, identifying ‘g’ as general ‘psychophysiological’ intelligence.
Uni-factor models allow for racial and national differences in intelligence. In one sense, this may be seen as racism or xenophobia. On the other hand, it also calls into question the notion of 'intelligence' as a single thing: the defining race, nationality or gender usually measures itself as the standard, so any difference is very likely to show the others as inferior.
Multi-factor models
Multi-factor models identify intelligence as a combination of distinct abilities. They tend to place emphasis on the role of the environment in learning and see intelligence as dynamic and situated, rather than a fixed ability in all circumstances.
Thurstone (1938) identified nine primary factors of intelligence:
* Words
* Spatial
* Perceptual
* Verbal
* Numerical
* Memory
* Reasoning
* Deduction
* Induction
J. P. Guilford identified a cube (5 x 6 x 6) of abilities.
Multi-hierarchical models
Multi-hierarchical models measure the application of intelligence, not just cognitive ability. They try to provide more organization than multi-factor models, which can seem unwieldy (e.g. Guilford's 180 cubelets).
Horn and Cattell (1966) defined five second-order factors, notably including fluid and crystallized intelligence:
* Fluid intelligence: non-verbal, general reasoning. [tacit?]
* Crystallized intelligence: application of verbal or conceptual knowledge. [explicit?]
* Visualisation
* Retrieval
* Cognitive speed
Vernon’s model (1950s) describes intelligence as an equation:
General intelligence
= Verbal ability (verbal + numerical)
+ Spatial/Mechanical ability (Spatial + Mechanical)
Carroll’s model (1993) is more complex:
General intelligence
= Fluid intelligence (inductive reasoning + sequential reasoning)
+ Crystallised (lexical knowledge + foreign language aptitude)
+ Visual perception (visual imagery + perceptual integration)
Gardner’s model (1983) of seven intelligences is favoured by educationalists:
1. Linguistic
2. Musical
3. Logical-mathematic
4. Spatial
5. Bodily-kinaesthetic
6. Intrapersonal
7. Interpersonal
Furnham (1992) defined five factors that predict occupational behaviour:
* Ability (skill in the basic job)
* Motivation
* Personality traits
* Demographic factors
* Intelligence
Discussion
General intelligence tools are not used that often, as the preference is usually to measure more distinct abilities. The main areas of testing (related to Guilford) are verbal, numerical, spatial, dexterity and sensory. Cognitive ability tests are good predictors of initial job performance, but their predictive power declines over time.
It is important to examine the test manual, to ensure you use the tool correctly (many do not do this well).
Testing people with disabilities is a problem as there is a lack of information available around this area.
There is a strong relationship between job performance and general intelligence. Tests are good at assessing this – interviews don’t add that much. Tests are also more objective.
Many tests seek maximum ability and are developed in a clinical environment. Ackerman et al (1989) noted that typical ability is more relevant in job environments. In related early research, Terman (1934) identified chronic (long-term) vs. acute (short-term) intelligence.
The classic IQ test has been widely criticized as being biased towards white, middle-class Western adult males.
Culture-free testing is important for global organizations. A well-known test is Raven's 'Progressive Matrices' (1965), which has three levels of difficulty, from versions suitable for children and people with learning disabilities up to one measuring fluid intelligence. Abstract reasoning is measured via a pattern (the 'matrix') with a part missing. Criticisms are that it is confusing to follow, that the norm tables are difficult to use, and that there is little evidence it removes bias towards majority groups.
Some researchers argue that spatial-ability measures do not predict as well as verbal-ability measures. Differences may (or may not) be caused by geographic and organisational contexts. The Big Five test is claimed to be culture-free (McCrae and Costa), but this too is doubted.
________________________________________________________
Psychometric tests:
Description
Psychometric tests are used, as the name suggests, to measure some psychological aspect of the person. Most commonly in selection, this includes personality, ability and motivation.
Occupational tests are used to measure maximum intellectual performance in terms of attainment, ability and aptitude, or typical behavior in terms of motivation and temperament.
Typical tests identify the direction of interests and can be used to suggest types of jobs associated with these areas.
Attainment tests
Attainment tests are used to assess the level of achievement in a particular area, such as in high school examinations.
Aptitude tests
Aptitude tests assess potential in some target area, seeking to discover possible future capability. This is as opposed to ability tests, which seek current capability. They can be used to measure specific aptitudes or collective traits (eg. technical, verbal, numerical).
Intelligence tests
General intelligence tests include cognitive studies that focus on information processing and organization of knowledge.
These tests are often made up of batteries of sub-tests that each test a narrow range, such as arithmetic reasoning, verbal intelligence, etc.
Intelligence tests may measure two factors:
* Fluid ability: applying reasoning skills to novel situations (decreases with age).
* Crystallized ability: using culturally specific component (increases with age).
Intelligence is not normally distributed. At the bottom end, scores are tightly grouped, suggesting a strong general factor. At the high end, scores show more independence between sub-tests, indicating specific intelligences.
Development
The overall approach to developing psychometric tests is to generate a large number of sample items, give them to a set of people and then keep only those that differentiate.
Maximum-performance questions are selected on target-related factors; questions here have right and wrong answers.
Typical-performance questions are selected to reflect personality, mood, attitude and temperament; questions here identify differences in the selected factors.
There are five methods of construction, as below.
Criterion-keyed
Criterion-keyed tests focus on an external domain or criterion. Thus for an interest inventory, the criteria are the interests associated with a specific occupational group.
They could also be used during organizational change to identify those who lack flexibility.
Example: MMPI
This method is criticized as having an atheoretical basis, with items selected purely on their empirical ability to differentiate. It addresses similarities and differences, not why these are so. The domain of the test may be limited: for example, 'mania' in the MMPI has only one criterion scale. The more specific a measure, the more limited its generalizability. There can also be problems when a test is moved from one context to another, especially across cultures.
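As a rough illustration of criterion-keying, here is a minimal sketch in Python with invented data: items are retained purely on their empirical ability to separate a criterion group (say, members of a target occupation) from everyone else. The data, the item count and the 0.2 cut-off are all assumptions for demonstration.

```python
# Minimal sketch of criterion-keyed item selection (invented data).
# Items are kept purely on their empirical ability to separate the
# criterion group from everyone else, with no theory involved.
import numpy as np

rng = np.random.default_rng(42)
n = 200
in_occupation = rng.integers(0, 2, size=n)   # 1 = criterion group member
responses = rng.normal(size=(n, 10))         # 10 candidate items
responses[:, :3] += in_occupation[:, None]   # items 0-2 actually differentiate

for i in range(responses.shape[1]):
    # Point-biserial correlation of the item with group membership
    r = np.corrcoef(responses[:, i], in_occupation)[0, 1]
    print(f"item {i}: r = {r:+.2f} -> {'keep' if abs(r) > 0.2 else 'drop'}")
```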
Factor analytic
This identifies items that load onto one factor and not onto another. It has the advantage that scores always have the same meaning.
Development of the test seeks strong correlation between the item and factor.
Example: Cattell's 16PF. Cattell listed all the personality traits he could find and gave tests to heterogeneous groups of adults. He then used factor analysis to develop a theory of structure and relationships (not just for data reduction). The 16PF has since been correlated with 50 different occupations.
A key factor in doing this is the size of the sample group: the larger the group, the lower the standard error.
Item Response Theory has been devised to help test-developers assess the nature of differences.
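To make the sample-size point concrete, here is a small sketch using a common large-sample approximation for the standard error of a correlation; the correlation value and group sizes are illustrative.

```python
# Standard error of a correlation estimate shrinks as the sample grows,
# using the common approximation SE ~ (1 - r^2) / sqrt(n - 1).
import math

r = 0.40  # hypothetical item-factor correlation
for n in (30, 100, 300, 1000):
    se = (1 - r ** 2) / math.sqrt(n - 1)
    print(f"n = {n:4d}: SE ~ {se:.3f}")
```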
Item analytic
This is a very simple method which correlates each item with the overall test score. It is useful for eliminating unsatisfactory items prior to factor analysis, and helps in developing longer tests by removing weaker items.
There is a need to be careful here:
* Domain definition: e.g. avoid investigating trust by simply asking the person whether they are trustworthy.
* Bloated specifics: repeated coverage of the same item leads to apparently high reliability.
* Transportability: these tests are often based on social and other domain-specific values.
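A minimal sketch of the item-analytic step itself, on simulated responses: each item is correlated with the total of the remaining items (the 'corrected' item-total correlation) and weak items are flagged for removal. The 0.3 cut-off is a common rule of thumb, not a fixed standard.

```python
# Corrected item-total correlation: correlate each item with the total of
# the *other* items, so an item cannot inflate its own correlation.
import numpy as np

rng = np.random.default_rng(7)
trait = rng.normal(size=300)                      # simulated trait levels
items = trait[:, None] + rng.normal(scale=1.5, size=(300, 8))

total = items.sum(axis=1)
for i in range(items.shape[1]):
    rest = total - items[:, i]                    # total excluding item i
    r = np.corrcoef(items[:, i], rest)[0, 1]
    flag = "  (candidate for removal)" if r < 0.3 else ""
    print(f"item {i}: item-total r = {r:.2f}{flag}")
```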
Thurstone scales
These are widely used, particularly in assessing attitude. They identify statements concerning attitude, then assess relevance of these with a panel of experts.
Items are chosen on the standard deviation of the ratings given by the experts (i.e. those they mostly agree on).
Results tend to be heavily value-laden, so transportability is an issue.
Guttman scales
These are less widely used.
Items in this method are sorted in terms of difficulty or intensity.
Problems include achieving good gradation: every item must correlate with the total score, which needs many items and large samples.
Discussion
Bright people will tend to do well on many different types of test as such tests have a high correlation with intelligence. This reduces the value of the test in differentiating individuals.
Factors affecting test experience
Factors affecting test experience include:
* Test: pre-test information, type of test, language, instructions, structure, medium, timescales
* Person: experience, confidence, emotion, motivation, memory, culture
* Environment: Light, heat, humidity, noise, distractions, test administrator
* Computers: affect both developers and test-takers.
* Time: affects stress, ability to complete, alertness (time of day). The test itself may also age, esp. when ‘semantically laden’.
* Test-taker:
o Alpha ability: improves as a result of the test, which teaches them things.
o Beta ability: improves management ability (eg. managing time, rtfq).
* Attention: to test taker (Hawthorne effect).
Criticisms and hazards
Criticisms, hazards and potential problems with psychometric tests include:
* Inadequate definition of concept to be measured.
* Bias (undesirable) in differentiation (desirable) between test takers. Eg. gender bias.
* Poor application of tools, eg. inadequate job analysis, wrong usage of tools.
* Words defined differently by developers (eg. extravert, innovator), causing confusion.
* Misinterpretation of results by users.
* Not reading the test manual properly (which tells how/where it is to be used).
______________________________________________________________________
Personality test:
Description
Personality tests seek to identify - guess what - aspects of a person's personality that are correlated in some way with job performance.
Personality theories fall into several schools, each with its own focus, central concern and example tests:
* Psychodynamic: internal focus; concerned with the unconscious mind. Clinical background, strong Freud/Jung influence, attention to dysfunction and neuroticism. Examples: psychoanalysis, Jungian Type Inventory (e.g. MBTI).
* Biological: internal focus; concerned with heredity and learning. Criticised as defining personality with too few factors. Example: Eysenck Personality Questionnaire (EPQ).
* Behavioral: external focus; concerned with habits and reinforcement. Learning through conditioning and shaping of behavior; focus on scientific proof; misses cognition. Examples: behavioral assessment, behavioral interviews.
* Phenomenological and humanistic: internal focus. Maslow, Kelley, Lewin; influenced by subjectivism and individualism. Example: FIRO-B.
* Social-cognitive: internal and external focus; concerned with context and cognition. Bandura, Walters; includes social and cognitive psychology.
* Trait: internal focus; concerned with values, behavior and their relationship with performance. Based on clusters, factor analysis and predictability. Examples: 16PF, OPQ, IPT.
Personality tests are often administered as self-completed sets of questions about preferences and behaviors each of which contributes towards a score or position along a number of personality dimensions, e.g.:
Extravert ----X------------------------------ Introvert
Any given score may be correlated with a particular job. For example, jobs that require significant interaction with people may show a correlation between extraversion and job success.
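As an illustration of how such a dimension score might be computed from self-report items, here is a minimal sketch; the item texts, the reverse-keying and the 1-5 scale are invented for demonstration, not taken from any particular instrument.

```python
# Scoring one personality dimension from Likert items (1 = strongly
# disagree ... 5 = strongly agree). Reverse-keyed items are flipped so
# that a high score always means 'more extravert'. All items invented.
items = [
    ("I enjoy meeting new people", False),   # False = normally keyed
    ("I prefer working alone", True),        # True  = reverse-keyed
    ("Parties energize me", False),
    ("I find small talk draining", True),
]
answers = [5, 2, 4, 2]

score = sum((6 - a) if reverse else a
            for (_, reverse), a in zip(items, answers))
print(f"Extraversion score: {score} / {5 * len(items)}")
```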
Development
Personality tests are hard to validate and so are developed over a long period of hypothesis, test and observation.
There are several actions a test developer can use to minimize faking:
* Give instructions with warning.
* Include social desirability (lie) scale. Eg. MMPI.
* Ipsative questions (forced choice and no middle option)
* Conceal purpose of questionnaire (eg. biodata that correlates non-obvious biographical data with performance predictors).
* Say ‘don’t think too hard’.
* Promise feedback to the test-taker (and give it).
Discussion
Personality tests are very commonly used, although often from a viewpoint that (incorrectly) perceives them as very strong predictors of behavior. Personality is a complex concept, and whilst personality tests can give useful indicators, the world is not divided up into 16 (or fewer!) types of people who are unable to see or act outside of their personality profiles.
Stability
There is often a belief that personality is fixed and does not change. In practice there are three types of instability:
* Temporal: Personality can change over time; for example, in the Jungian Type Inventory there is a tendency either to polarize towards one end of a spectrum or to recognize a need for flexibility and tend towards the middle.
* Contextual: People act very differently in different situations (e.g. home and work).
* Internal: Personality assessments are often based on self reports, where people often answer questions based on an idealized self or what they believe is needed.
Predictive validity
Much research shows the value of personality tools and their links to job needs; for example, the best pilots show emotional stability and extraversion. Many people will also self-select jobs based on their perception of their own personality.
Bandwidth can be an issue, where the breadth of coverage by an instrument is insufficient. However, many factors become unwieldy, whilst too few are criticised as simplistic (e.g. the 16PF vs. Big Five arguments).
The jangle fallacy occurs where the same trait name is used by two or more questionnaires to mean different things. This can be confusing.
The personality factors most predictive of job performance are conscientiousness and general intelligence (though what counts as 'job performance'?). Sub-factors of 'conscientiousness' in studies also varied (competence, order, dutifulness, etc.). Combined traits are finding favour, such as conscientiousness plus agreeableness. Extraversion is important in some situations, such as sales, but high agreeableness may result in lower sales, and in some settings managers do less well if they are conscientious.
Overall, though, personality tests have low predictive validity for job performance, yet they are often used for this; for example, people may be de-selected solely on test results. People have even been made redundant from jobs based on personality tests (with a biased report given to justify this).
Work is often done in teams and personality tests often do not cover this (or do so only in a limited way).
Distortion and faking
Distortion and faking can be a problem where people deliberately or subconsciously bias their self-reports (where social desirability bias can have an undesired effect). There may be a central tendency, where people take the safe middle choice. Acquiescent people tend to answer 'yes' and 'agree' more than they should.
Despite concerns, faking does not affect validity that much.
Faking good (Impression Management) can be useful in the target job and is itself an indicator of personality.
______________________________________________________
Resume and CV:
The résumé (USA) or CV (UK) is a personal summary of an applicant's history that may include:
* Contact details
* Summary statement about the person, characterizing them and their ambitions.
* Experience (usually the main body of the application)
* Qualifications, both academic and professional
* Hobbies and other interests
Two common forms of résumé are the functional résumé and the historic résumé. In the functional résumé, the applicant takes a list of particular skills and knowledge and gives evidence of their ability in each item. In the historic résumé, they summarize achievements for job positions in historical order (usually with the most recent job first).
Discussion
The résumé is probably the most common tool used in selection, at least in the initial selection process where it is often used as the basis for initial shortlisting (at least for external candidates --- internal selection often does not use the résumé). Care must be taken here, as when there are many candidates, it is easy to 'throw the baby out with the bathwater', filtering out good ones as well as the less desirable ones.
The résumé is a self-report and, as such, may be economical with the truth or contain exaggerations and even complete fabrications. Applicants know the importance of their résumé, which may get only a few seconds of attention before it is rejected, and so may take inordinate care over its construction. A well-crafted résumé may indicate that the person is careful and skilful. It may also mean that they paid a professional to write it for them. It is also likely that they have thought hard about what to tell you and what not to tell you. A more amateur layout may be less polished, but it may also be more naive and honest.
As the first thing that the recruiter sees about a person and also the most common tool used in interviews, it often has a disproportionate effect.
For use in interviews, key aspects that match job criteria may be extracted and used to help probing. If the person is lying in the résumé, you may spot this during the more detailed questioning. You may also follow up with referees (although do remember that these also were selected by the candidate).
CVs are written by the candidate and are intended to show them in their best light and hence are unlikely to include negative elements. The CV may thus be taken as an indicator only, with verification of key items by other methods, such as following-up of references, questioning during interview and testing of skills by a work sample.
Without care, recruiters may easily be seduced by subtle elements of the CV that have been shown to have undue effect. For example, when comparing CVs, impression-management elements such as competency statements (Bright and Hutton, 2000) have been shown to have a positive effect.
The CV itself, even when used with other methods, may have a disproportionate influence on selection (Robertson and Smith, 2001), perhaps due to its familiarity or ready availability (Tversky and Kahneman, 1974).
Deselection on racial and other grounds is illegal, including by CV. There are cases where minority activists have sent two CVs to a company, identical apart from indications of minority status, and then sued the company when only the CV without those indications received a response.
The candidate is also subject to legislation here: if they lie on their CV and are appointed on these ‘facts’, then this may be grounds for later dismissal. It is even known for people to pass themselves off as doctors and lawyers, faking certificates and other documents.
______________________________________________
Work Sample:
Description
‘Work sample’ is a method of testing ability by giving the candidate a sample of typical work to do and evaluating their performance.
Work samples may appear as short questions along the lines of 'What would you do in this situation?' or as more complex scenarios to analyze. At its most naturalistic, the candidate is put into the actual job, where they may spend some time doing real work. The normal approach, however, is to give the person a role-play in which they act out a realistic situation. This creates a repeatable pattern whereby multiple candidates can be given the same test and hence more easily compared.
Job-knowledge tests
Job-knowledge tests focus on a specific dimension or content area to determine current knowledge, such as a test of knowledge of the highway code.
Knowledge tests such as this may be computerized, enabling them to be taken at any time and even in any place. This also reduces the cost of administration and can reduce security issues (such as loss of exam papers).
Proctoring (supervised administration) is often used, and questions and sequences may be regularly changed, to reduce copying and cheating.
Job knowledge tests are increasingly used in professional areas such as medicine and architecture.
Hands-on performance tests
Hands-on performance tests are used to test people for physical capabilities. An example is the psychomotor test, characterized by manual dexterity exercises.
Situational judgment tests
In situational judgment tests, people are asked how they would act in a given situation. This may be done with multiple-choice questions to enable automated marking. The method can be used for many different jobs, for example in leadership and teaching.
These tests assess job knowledge and the ability of the candidate to apply this knowledge in specific situations (rather like situational interviewing). They can be used to assess aptitude and trainability as well as current knowledge, and can be helpful in recruiting people with no previous experience.
Development
Work samples, as with other selection methods, often start with a job analysis of good performers.
The job is typically broken down into key behavioral components, which are then used to create a checklist of desirable behaviors.
From this, scenarios and case studies may be developed.
Discussion
Work samples are normally used to test current skill, though they can also be used to test the ability to learn new skills. They are based on the premise of behavioral consistency: the way a person acts in a simulated situation is assumed to be the same as the way they would act on the job.
The method is useful for reducing assessor bias and is perceived as fair and valid by both recruiters and candidates, as all candidates are treated in the same way, including the amount of time to respond (although this may reduce the chances of slow writers or reflective thinkers). It removes non-job-related cognitive factors and is visibly related to the job in question.
Work samples have a high predictive validity of .37 to .54 and lead to less staff turnover.
Criticisms of work samples include that they are atheoretical and tied to an empiricist, Western view of the individual and work (Searle, 2003). Work samples must be carefully designed to test specific items. Problems arise where more attention is paid to face validity than content validity, and they can miss small but critical factors (such as color vision for engineers).
In any concern for fairness, work samples are of particular value as they have both higher face validity and greater fairness for non-traditional candidates (Lievens and Klimoski, 2001).
_______________________________________________________
Selection Articles:
Accuracy and Faking:
There is a conflict between the recruiter who wants to be accurate and the applicant who may be faking. Accurate information is true and without color or falsehood. It may be affected by the informant, the context, underlying information or measurement effects.
Candidate issues
Information from the candidate will be affected by their inference of the meaning of questions asked. Inference is a process that filters and distorts sense information via internal constructs such as schema and life narratives. If the meaning inferred is not exactly that which is intended, then the response cannot be accurate. Words are re-interpreted and extended with connotative meaning (Barthes, 1957). The implication for recruiters is that great care must be taken with language used in questions to avoid ambiguity and potential distortion in the inferential process.
In their desire to be selected, applicants may fake or bias responses, ranging from slight exaggeration to bare-faced lies. Recruiters can encourage applicants to be honest both by asking for honest replies and by warning of consequences of lying (Cascio, 1976).
There are known areas where faking is more likely, such as tenure and final salary. Simple facts may be followed up as appropriate with former employers to verify these.
Consistent faking is more difficult across multiple contexts. Multiple tests and multiple interviews, on different days and with different people, provide a richer information source for detecting variation in a candidate's responses.
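A sketch of this cross-checking idea with hypothetical scores: where the same trait has been measured by several instruments on different days, an unusually large spread in one candidate's scores can be flagged for follow-up. The 1.0 threshold is arbitrary.

```python
# Flagging inconsistent trait scores across multiple assessments.
import statistics

candidate_scores = {
    "conscientiousness": [7.1, 6.8, 7.0],   # three separate measurements
    "extraversion": [4.0, 8.5, 5.1],        # suspiciously variable
}
for trait, scores in candidate_scores.items():
    spread = statistics.stdev(scores)
    flag = "  <-- follow up" if spread > 1.0 else ""
    print(f"{trait}: spread = {spread:.2f}{flag}")
```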
The informant may also be affected by their perception of job requirements and the culture of the target company. If they perceive a requirement for aggressive go-getting, then their responses may well be suitably biased. Recruiters may reduce this (or at least reduce variation in response between applicants) by providing accurate information about the job, the company and its culture.
Recruiters and accuracy
Recruiters also need information from inside the company to build accurate job descriptions and design appropriate recruitment instruments. Similar jobs may be quite different in different contexts (Goldstein et al, 1993) and hence need careful analysis. Managers may exaggerate the skills needed, listing a perfect prototype whilst losing the key attributes in the detail, or may bias detail to subtly influence policy or manage internal impressions. When gathering information from groups, social interactional effects may further bias information (e.g. Deutsch and Gerard, 1955). Recruiters may reduce internal bias by using such methods as critical incident analysis and task (rather than KSA) data (Morgeson and Campion, 1997).
Accuracy may also be a problem where job needs are constantly changing. What is accurate at the time of collection of data may not be true when it is used to select candidates. The time component of accuracy must thus also be considered and perhaps more general information sought where detail is subject to change.
Information may also be gathered from references, which provides additional information to questionnaires and interviews. There is inherent bias in this process, as the referees are chosen by the applicants. Referees, as with applicants and internal sources are subject to bias and recruiters need to take equal care in the questions they ask and the inferences they make from the information provided.
Finally, but actually first, recruiters can ensure that everyone in the recruiting cycle is well educated and knowledgeable about the real issues. Thus the use of qualified test developers, training of interviewers and coaching of managers can go a long way to ensure that recruitment is a professionally run process and that recruited candidates are the best possible people for the specific jobs and for the longer-term health of the overall organization.
______________________________________________________________
Faking and Deviance:
The question of deviance of faking behavior may start with a consideration of what is ‘deviant’. Downes and Rock (1998) indicate the multi-faceted and contextual nature of deviance that can range from moral to criminal to political.
In the common usage within recruitment, there is an assumption that deviance has a deliberate element of deception with the purpose of achieving the goal of the applicant whilst bypassing goals of recruiters. As such, it is perceived as immoral at best and criminal at worst (for example deceptions that may lead to an unqualified fantasist gaining a job as a doctor).
Different viewpoints
Deviance, however, is an individual construction, and whilst a recruiter may consider distortion as deviant, applicants may have a social norm that positions it as ‘normal’, particularly if they socialize within groups of job-hunters (such as third-year students). Where the perception is that ‘everybody is doing it’, not to fake may be perceived as dooming oneself to certain failure. Recruiters may also take this assumption and cognitively downgrade impressive applications.
Just as interviewers may be trained, so also are there many ways that interviewees may be trained: practicing on web-based MBTI alternatives, reading the many self-help books (such as Bolles, 2004), being coached, and so on. This again is a socialized process in which it is considered ‘good practice’ to prepare for interviews by learning how to fake. Recruiters may even collude with such activities, welcoming such preparation as an indication of the seriousness of candidates.
In any persuasive situation, arguments are likely to be more extreme in order to sway the other person (Pruitt, 1971). This is also likely within interviews, where the goal of the interviewee is to sway the interviewer. In the other direction, the interviewer may be torn between the role of company representative, attracting good applicants, and the role of objective and dispassionate judge, seeking only the truth of the applicant's capabilities.
Interviews as social occasions
Within the situation of the interview, the interviewer and interviewee form a temporary group and as such are likely to succumb to group behaviors such as politeness (Brown and Levinson, 1978) that constrain and distort communications. There will also be natural harmonization, where each may subtly change to be more like the other (Giles and Wiemann, 1987). Social desirability bias (Fisher, 1993) is likely to lead both to alter their behavior to be more acceptable and attractive to the other. Equity theory (Adams, 1963, 1965) also points to the likelihood of seeking a balance of exchange. Where an interviewer is drawing much information from a candidate, this may drive them to seek to redress the balance, perhaps by interacting more or giving way on questions of doubt.
In interpersonal deception theory, Buller and Burgoon (1994, 1996) note that deception happens in a dynamic interaction where liar and listener dance around one another, changing their thoughts in response to each other’s moves. ‘Normal’ behavior in such situations includes manipulation of information, strategic behavior control and image management. Social interactions are complex and many, and even if either party is deliberately seeking to be objective, the subconscious nature of communication makes it impossible to fully suppress.
Robinson (1996) describes a ‘pyramid of truths and falsehoods’ that indicates the complex fabric of truth in social environments, ranging from the normal social interactions where truth is a negotiated variable through to the scientific pinnacle, where truth is proven and largely unchallengeable. Again, this indicates the dilemma of the conversation between interviewer and interviewee who may be sitting at different levels. This highlights how the legitimacy and desirability of faking is situated with the person and their individual context beyond the interview arena. What is deviant and unwelcome to one may be a natural social dance to the other.
Faking as desirable
In the reality of everyday company business, faking may be a normal and even necessary skill. Some jobs in particular may require the assumption of brand-aligned personas (eg. customer service staff) whilst jobs such as sales and marketing may require some form of distortion that is not unlike the impression management that a job candidate may use. In such cases, recruiters may actively provoke and seek faking in aligned assessment contexts, such as in the ability to maintain a calm exterior when under pressure and where a ‘truer’ expression would involve expression of strong emotions.
Despite these concerns, it has been found that faking has negligible effects on psychometric outcomes (e.g. Barrick and Mount, 1996) and, provided recruiters are aware of the effects, it need not be assumed that psychometric results (particularly from established and proven tests) are invalidated by faking.
____________________________________________________________
Personality and Job Success:
When you use or take personality tests, what are the factors that lead to job success?
There are several personality factors that have been correlated with general success in jobs, although it has also been shown that too much of one factor can cause problems.
Key success factors
A couple of factors have been shown to be highly correlated with success in jobs.
Intelligence
A number of studies (including Hunter and Hunter (1984) and Schmidt et al, (1992)) have shown that general intelligence ('g') often correlates well with job success. Basically, it means that intelligent people are generally good at jobs. Hire bright people and they will be able to do what you ask of them.
There may also be other related factors in this. For example, you could argue that intelligent people are able to understand not only the task factors required to succeed in a role but also the social factors. Following this conjecture, if you link intelligence to education and a 'good upbringing', then values-based factors such as conscientiousness and agreeableness may also be seen as related.
Conscientiousness
Conscientiousness has also been shown by several studies as being highly correlated with job success (e.g. Barrick and Mount (1991) and Robertson and Kinder (1993)).
If a person is conscientious, then they will work hard to complete work they have committed themselves to doing. They can also be left alone without need for constant supervision.
Other success factors
Other factors have also been shown to be linked with success, although Barrick et al (2003) showed that these tend to be related to some jobs more than others (particularly those with more significant social elements).
Beyond (and as well as) these, you may well need to do a thorough job analysis with some kind of factor analysis that isolates both individual and clusters of success factors for specific openings that you have.
Openness
If you are open to experience, are ready to challenge yourself and learn and welcome feedback from others, then you will not only learn far more, you will also be perceived as a pleasant person by others who will be more ready to work with you. Unsurprisingly, this tends to make people better at jobs.
Agreeableness
When paired with conscientiousness, a person who is easy to get on with becomes even more successful in many jobs. Particularly if the work requires working with other people, a person who is disagreeable is not likely to gain good cooperation.
Sometimes agreeableness is not required in large quantities. For example a salesperson who is too nice to customers may not win the tough bargaining deals.
Extraversion
If you are outgoing and get your energy from being with other people, you will probably do better in many jobs than others, especially (of course) those that require you to work actively with others, whether in managerial or team environments.
_________________________________________________________
Reducing faking in tests:
Whilst faking is a perennial issue, designers of selection tests have a number of tools and techniques available that can be used to counteract or at least detect this.
Initial instructions
At the start of a test, the instructions given to candidates can include warnings of the consequences of detected faking and request honesty. Instructions may also ask candidates to answer quickly, with their first thoughts rather than pondering. Holden et al (2001, p160) indicate that lying takes time. This is also supported by Ekman’s (1985) general study of lying.
Trick questions
It is also possible to include ‘trick’ questions, where a faking response is easily identified and hence raises suspicions about all other responses. For example in an assessment of a given set of skills, a multiple-choice question may have no right answer. Earlier assertions may later be probed in more detail.
Multiple sources
Instruments that use self-reporting may give false readings when they are used by candidates who have insufficient self-insight to answer fully. If information is collected from multiple sources then this problem may be reduced, for example through the use of ‘assessment centres’, where multiple methods and assessors give a range of data and viewpoints that can be cross-checked.
Safe answer
Test takers who use the ‘central response tendency’ and opt for ‘safe’ central options may be identified by asking different questions for which a consistent response would include high and low responses.
Where individuals have a high need for approval, they may tend towards positive ‘agree’ and ‘yes’ responses. This may be countered and detected by reversing some questions (reversing also breaks up habituating patterns of similar responses). This tendency towards seeking approval may also be detected by including a ‘social desirability’ scale within the questions to enable isolation of this.
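A minimal sketch of how a scorer might flag these response styles, assuming a 1-5 Likert test with some reverse-keyed items; the data and thresholds are illustrative only.

```python
# Detecting central tendency and acquiescence in Likert responses.
answers = [3, 4, 3, 3, 3, 3, 4, 3, 3, 3]    # 1-5 scale
reversed_items = {1, 6}                      # indices of reverse-keyed items

# Central response tendency: a high proportion of safe mid-scale answers.
mid_share = sum(1 for a in answers if a == 3) / len(answers)
if mid_share > 0.6:
    print(f"{mid_share:.0%} mid-scale answers: possible central tendency")

# Acquiescence: agreeing both with items and with their reversals.
agree_normal = sum(a >= 4 for i, a in enumerate(answers)
                   if i not in reversed_items)
agree_reversed = sum(answers[i] >= 4 for i in reversed_items)
if agree_normal and agree_reversed:
    print("Agreement with both normal and reversed items: possible acquiescence")
```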
Multiple questions
Assessing the same attribute with multiple questions can also show whether the candidate is averaging across questions (‘I’ve been a bit negative, I think I shall be positive for a while now.’), although obvious care needs to be taken to ensure that similar questions are interpreted in the intended way. Analysis of sequential patterns of positive and negative responses across responses may also identify uncertainty or deliberate averaging.
Ipsative questions
Normative items ask the candidate to rate their level of agreement with statements, and can give a good measure of psychological characteristics (Kline, 1993). However the question of faking has led to an ipsative approach being used in many contexts, where the test-taker is forced to make a choice from a fixed number of options. Ipsative questions either offer choice between items from very different areas (one question I recall from such a test is ‘Which do you prefer, a poem or a gun?’), or a polar choice from the same scale, which may have a yes/no response.
However, as Johnson et al (1988) have pointed out, ipsative forced-choice approaches are highly problematic. The very notion that you can ‘force’ someone to choose is questionable, and there are very real problems of respondents second-guessing or making random choices among items for which they have no clear preference. Martin et al (1995) have shown that test takers with a good insight into job needs can provide realistic faked responses. Ipsative methods still persist, in particular where sound alternatives are not available; for example, the Zuckerman, Eysenck and Eysenck (1978) scale of sensation-seeking is still in use, despite the report by Ridgeway and Russell (1980) of unacceptably low reliabilities for its various sub-scales.
Question opacity
Faking may also be reduced by use of item opacity, where the respondent does not know ‘right or wrong’ answers. For example use of Biodata approaches, where traits and historic activities have been correlated with requirements of the job in question, can offer very opaque questions (such as the WW2 discovery of the correlation between childhood flying of model aeroplanes and good pilots).
Including the candidate
Including the candidate in the assessment process can also help to reduce faking, socialising them into providing honest responses. This may be implemented, for example, in assessment centres, where they may be involved in discussions about psychometric outcomes.
_______________________________________________________________________
Reliability:
Definition
If a test is unreliable, then although the results for one use may actually be valid, for another they may be invalid. Reliability is thus a measure of how much you can trust the results of a test.
Tests often have high reliability – but at the expense of validity. In other words, you can get the same result, time after time, but it does not tell you what you really want to know.
Stability
Stability is a measure of the repeatability of a test over time, that it gives the same results whenever it is used (within defined constraints, of course).
Test-retest reliability is the repeatability of a test over time: giving the same test to the same person later should produce the same result, and this needs to be checked to assure the stability of a test. Stability, in this case, is measured as the variation in scores between administrations. Problems with this include:
* Carry-over effect: people remembering answers from last time.
* Practice effect: repeated taking of test improves score (typical with classic IQ tests).
* Attrition: People not being present for re-tests.
There is an assumption with stability that what is being measured does not change. Variation should be due to the test, not to any other factor. Sadly, this is not always true.
Consistency
Consistency is a measure of reliability through similarity within the test, with individual questions giving predictable answers every time.
Consistency can be measured with split-half testing and the Kuder-Richardson test.
Split-half testing
Split-half testing measures consistency by:
* Dividing the test into two (usually at a mid-point, by odd/even numbers, randomly or by some other method).
* Administering the halves as separate tests.
* Comparing the results from each half.
A problem with this is that the resultant tests are shorter and can hence lose reliability. Split-half is thus better with tests that are rather long in the first place.
Use the Spearman-Brown formula to correct for this shortness, estimating the correlation as if each half were a full-length test:
r = 2rhh / (1 + rhh)
(where rhh is the correlation between the two halves)
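A small worked sketch of split-half reliability with this correction, using simulated scores and an odd/even split:

```python
# Split-half reliability with Spearman-Brown correction (simulated data).
import numpy as np

rng = np.random.default_rng(1)
trait = rng.normal(size=200)
items = trait[:, None] + rng.normal(size=(200, 20))   # 20-item test

half_a = items[:, 0::2].sum(axis=1)                   # odd-numbered items
half_b = items[:, 1::2].sum(axis=1)                   # even-numbered items
r_hh = np.corrcoef(half_a, half_b)[0, 1]

r_full = 2 * r_hh / (1 + r_hh)                        # Spearman-Brown
print(f"half-test r = {r_hh:.2f}, corrected full-length r = {r_full:.2f}")
```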
Kuder-Richardson reliability or coefficient alpha
The Kuder-Richardson reliability or coefficient alpha is relatively simple to compute, being based on a single administration of the test. It assesses the inter-item consistency of a test by looking at two error measures:
* Adequacy of content sampling
* Heterogeneity of domain being sampled
It assumes that reliable tests contain more variance and are thus more discriminating. Higher heterogeneity of the domain leads to lower inter-item consistency. For dichotomous (right/wrong) items this is the Kuder-Richardson formula; coefficient alpha generalizes it to non-dichotomous items:
rkk = (k / (k – 1)) × (1 – Σσ²i / σ²t)
(where rkk is the alpha coefficient of the test, k is the number of items, σ²i is the variance of item i and σ²t is the variance of total test scores)
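A small worked sketch of this formula on a simulated person-by-item score matrix:

```python
# Coefficient alpha from a person x item score matrix (simulated data).
import numpy as np

rng = np.random.default_rng(2)
trait = rng.normal(size=200)
items = trait[:, None] + rng.normal(scale=1.2, size=(200, 10))

k = items.shape[1]
sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
total_var = items.sum(axis=1).var(ddof=1)        # variance of total scores

alpha = (k / (k - 1)) * (1 - sum_item_var / total_var)
print(f"coefficient alpha = {alpha:.2f}")
```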
Equivalence of results (parallel form)
Parallel-form reliability seeks reliability through equivalence between two versions of the same test, comparing the results from each version (rather like split-half). It is better than test-retest in that both versions can be administered on the same day (reducing variation).
There is a danger of tests with high internal validity having limited coverage (and hence lower final validity).
Bloated specifics are where similar questions lead to apparent significance. This can be bad when unintended, but can be used to create deliberate variations.
Parallel versions are useful in such situations as with graduates who may do the same test several times.
An adverse effect occurs where different groups score differently (potential racial, etc. bias). This may require different versions of the same test – eg. MBTI for different countries.
Discussion
There are a number of procedural aspects that affect test reliability, including:
* Test conditions
* Inconsistent administrative practices
* Variation in test marking
* Application of an inappropriate norm group
* Internal state of test-taker (tired, etc.)
* Experience level of test-taker (e.g. if they have taken the test before).
_________________________________________________________________
The selection spiral:
There is a significant danger in selection and promotion that a company can spiral downwards into incompetence and failure.
The power of talent
The talent and motivation of the people in any given job can make a huge difference to what is achieved. It is not uncommon to reckon on a ten-to-one ratio in the performance of two different people doing the same job.
This has led to a significant focus on 'talent' and the categorization of A-, B- and C-players, where A-players are the high-achieving stars, the B-players are the solid, good-enough middle team and the C-players are the very limited bottom-end.
The selection trap
Selection is arguably the most important process in organizations. If you recruit or promote the wrong person, you get to live with the relative incompetence and never know the potential that has been lost.
The biggest danger in organizations is where managers select people who are less capable than themselves. The selection trap occurs where managers have an ego need to feel superior to their charges: if they interview someone who seems to be better than them, they feel threatened and are less likely to employ that person.
There are several fears involved in this. First, managers have a legitimate concern that they should be able to manage their subordinates: if they employ someone who challenges their directives too often, the situation may seem unmanageable. This sits alongside the less legitimate desire to look good, and the fear that a superior subordinate might make you look bad or even try to take your job.
A-players are particularly susceptible to this trap, as they often have bigger egos and may be driven narcissists who work hard to seek the recognition they need. Anyone else who takes glory from them is thus to be feared, rejected or attacked.
The spiral
When managers appoint people less able than themselves, the talent that the company recruits spirals steadily downward. The gradient of this slope depends on how widespread the behavior is and on the gap each manager needs between themselves and the new appointee.
In particular, if managers feel threatened by people who might take their job, then they will filter out applicants who show signs of ability in management and leadership. This fear of replacement may also drive a general opposition to development of their employees, leading to a leadership vacuum that eventually results in a company that lacks direction and inspiration, and which is eventually overtaken (and maybe taken over) by fitter competitors.
____________________________________________________________________
Validity:
When designing and using tests and other methods of assessing people, it is important that the test and its use is valid.
Definition
Validity has been described as 'the agreement between a test score or measure and the quality it is believed to measure' (Kaplan and Saccuzzo, 2001). In other words, it measures the gap between what a test actually measures and what it is intended to measure.
This gap can be caused by two particular circumstances:
(a) the design of the test is insufficient for the intended purpose, and
(b) the test is used in a context or fashion which was not intended in the design.
Types of validity
Face validity
Face validity means that the test appears to be valid. This is assessed using common-sense rules, for example that a mathematical test should include some numerical elements.
A test can appear to be invalid yet actually be perfectly valid, for example where correlations have been found between apparently unrelated items and the desired criterion: successful pilots in WW2, for instance, were very often found to have had an active childhood interest in flying model planes.
A test that does not have face validity may be rejected by test-takers (if they have that option) and also people who are choosing the test to use from amongst a set of options.
Content validity
A test has content validity if it sufficiently covers the area that it is intended to cover. This is particularly important in ability or attainment tests that validate skills or knowledge in a particular domain.
Content under-representation occurs when important areas are missed. Construct-irrelevant variation occurs when irrelevant factors contaminate the test.
Construct validity
Underlying many tests is a construct or theory that is being assessed. For example, there are a number of constructs for describing intelligence (spatial ability, verbal reasoning, etc.) which the test will individually assess.
Constructs can be about causes, about effects and the cause-effect relationship.
If the construct is not valid then the test on which it is based will not be valid. For example, there have been historical constructs that intelligence is based on the size and shape of the skull.
Criterion-related validity
Criterion-related validity is like construct validity, but now relates the test to some external criterion, such as particular aspects of the job.
There are dangers with the external criterion being selected based on its convenience rather than being a full representation of the job. For example an air traffic control test may use a limited set of scenarios.
Concurrent validity is measured by comparing two tests done at the same time, for example a written test and a hands-on exercise that seek to assess the same criterion. This can be used to limit criterion errors.
Predictive validity, in contrast, compares success in the test with actual success in the future job. The test is then adjusted over time to improve its validity.
The validity coefficient
The validity coefficient is calculated as a correlation between the two items being compared, very typically success in the test as compared with success in the job.
A validity coefficient of 0.6 and above is considered high, which suggests that very few tests give strong indications of job performance.
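As a worked sketch, the validity coefficient is simply the correlation between test scores and a later criterion measure; the figures below are invented.

```python
# Validity coefficient: correlation of test scores with job performance.
import numpy as np

test_scores = np.array([62, 75, 48, 88, 70, 55, 91, 66])
job_ratings = np.array([3.1, 3.8, 2.5, 4.4, 3.3, 3.0, 4.6, 3.2])

validity = np.corrcoef(test_scores, job_ratings)[0, 1]
print(f"validity coefficient = {validity:.2f}")   # 0.6+ counts as high here
```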