Careless responding is one of the most fundamental challenges of survey research. We need our respondents to respond honestly and with effort, but when they don’t, we need to be able to detect and remove them from our datasets. A few years ago, Meade and Craig published an article in Psychological Methods exploring a significant number of techniques for doing exactly this, ultimately recommending a combination of three detection techniques for rigorous data cleaning, which, let’s face it, is a necessary step when analyzing any internet survey. These techniques are even-odd consistency, maximum longstring, and Mahalanobis D:
- The even-odd consistency index involves calculating subscale means for each measure on your survey, split by even and odd items. For example, the mean of items 1, 3, 5, and 7 becomes one subscale score, and the mean of items 2, 4, 6, and 8 becomes the other. Next, you pair each measure’s odd subscale score with its even subscale score, correlate the odd scores with the even scores across all of the measures in your survey (within each respondent), and then apply the Spearman-Brown prophecy formula to correct the correlation for having been computed from half-length scales.
- Maximum LongString is the largest value of LongString across all scales on your survey, where LongString is the number of identical responses in a row. Meade and Craig suggested that LongString is most useful when the items are presented in a randomized order.
- Mahalanobis D is calculated from the regression of scale means onto all of the item scores that inform them. In a sense, you are checking whether each person’s item responses correspond consistently with the scale means those responses produce, relative to everyone else. Some conceptualizations of this index instead regress participant number onto scores, which accomplishes essentially the same thing.
In all three cases, the next step is to create a histogram of the values and look for outliers.
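To make the even-odd computation concrete, here is a minimal sketch in Python with numpy (the function name and data layout are my own, not from Meade and Craig): it computes odd-half and even-half means per scale, correlates them within each respondent across scales, and applies the Spearman-Brown correction.

```python
import numpy as np

def even_odd_index(responses, scales):
    """Per-respondent even-odd consistency index (illustrative helper).

    responses: 2-D array, one row per respondent, one column per item.
    scales: list of per-scale column-index lists, items in administered order.
    """
    indices = []
    for person in responses:
        # Odd items (1st, 3rd, ...) vs. even items (2nd, 4th, ...) per scale
        odd_means = [person[cols][0::2].mean() for cols in scales]
        even_means = [person[cols][1::2].mean() for cols in scales]
        r = np.corrcoef(odd_means, even_means)[0, 1]
        # Spearman-Brown correction for the halved scale length
        # (undefined at r = -1; flag such respondents separately)
        indices.append((2 * r) / (1 + r))
    return np.array(indices)
```

A consistent respondent’s odd and even halves track each other, yielding an index near 1; a careless respondent’s halves diverge, yielding a much lower (often negative) value.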
Calculating Careless Responding Indices
Of these three, Mahalanobis D is the most easily calculated, because saving Mahalanobis D values is a core feature in regression toolkits. It is done easily in SPSS, SAS, R, etc.
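If you want to see the mechanics outside a stats package, here is a sketch in Python with numpy. Note that the function name is my own, and this uses the distance-from-centroid formulation, a close cousin of the regression-based versions described above: it measures how far each respondent’s item responses sit from the sample’s multivariate center, accounting for the covariance among items.

```python
import numpy as np

def mahalanobis_d(responses):
    """Squared Mahalanobis distance of each respondent from the centroid.

    responses: 2-D array, one row per respondent, one column per item.
    """
    centered = responses - responses.mean(axis=0)
    cov = np.cov(responses, rowvar=False)
    # Pseudo-inverse guards against a singular covariance matrix
    inv_cov = np.linalg.pinv(cov)
    # d2_i = x_i' * inv_cov * x_i for each centered row x_i
    return np.einsum('ij,jk,ik->i', centered, inv_cov, centered)
```

A respondent whose answer pattern cuts against the correlational structure of the items (e.g., answering high on one item and low on another that everyone else answers similarly) gets a large distance, even if no single answer looks extreme on its own.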
The second, the even-odd consistency index, is a bit harder but still fundamentally not too tricky; you just need to really understand how your statistical software works. Each step, individually, is simple: calculate scale means, calculate a correlation, apply a formula.
The third, Max LongString, is the most intuitively understandable but also, often unexpectedly, the most difficult to calculate. I imagine that the non-technically-inclined generally count by hand – “this person has a maximum of 5 identical answers in a row, the next person has 3…”
An SPSS macro already exists to do this, although it’s not terribly intuitive. You need to manually change pieces of the code in order to customize the function to your own data.
Given that, I decided to port the SPSS macro into Excel and make it a little easier to use.
An Excel Macro to Calculate LongString
Function LongString(cells As Range)
    Dim cell As Range
    Dim run As Integer
    Dim firstrow As Boolean
    Dim maxrun As Integer
    Dim lastvalue As String

    firstrow = True
    run = 1
    maxrun = 1

    ' Walk the range, tracking the longest run of identical consecutive values
    For Each cell In cells
        If firstrow = True Then
            firstrow = False
            lastvalue = cell.Value
        Else
            If cell.Value = lastvalue Then
                run = run + 1
                maxrun = Application.Max(run, maxrun)
            Else
                run = 1
            End If
            lastvalue = cell.Value
        End If
    Next cell

    LongString = maxrun
End Function
To Use This Code Yourself
- With Excel open, press Alt+F11 to open the VBA Editor.
- Copy/paste the code block above into the VBA Editor.
- Close the VBA Editor (return to Excel).
- In an empty cell, simply type =LONGSTRING() and put the cell range of your scale’s survey items inside. For example, if your first scale was between B2 and G2, you’d use =LONGSTRING(B2:G2)
- Repeat this for each scale you’ve used. For example, if you measured five personality dimensions, you’d have five longstrings calculated.
- Finally, in a new cell use the =MAX() function to determine the largest of that set. For example, if you put your five LongStrings in H2 to L2, you’d use =MAX(H2:L2)
That’s it! Importantly, the cells need to be in Excel in the order they were administered. If you used randomly ordered items, this adds a layer of complexity, because you’ll need to recreate the original item order for each participant before you can apply LongString. That takes a bit of Excel magic, but if you need to do this, I recommend reading up on =INDIRECT() and =OFFSET(), which will help you get that original order back, assuming you saved the item order somewhere.
Once you have Max LongString calculated for each participant, create a histogram of those values to see if any strange outliers appear. If you see clear outliers (i.e., a cluster of very high Max LongString values off by itself, away from the main distribution), congratulations, because it’s obvious which cases you should drop. If you don’t see clear outliers, it’s probably safer not to rely on LongString for this analysis.
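If you work outside Excel, the VBA logic above ports directly. Here is the same algorithm as a Python sketch (function names are mine):

```python
def longstring(responses):
    """Longest run of identical consecutive responses.

    responses: a sequence of item responses in administered order.
    A direct port of the VBA LongString function.
    """
    max_run = run = 1
    for prev, curr in zip(responses, responses[1:]):
        run = run + 1 if curr == prev else 1
        max_run = max(max_run, run)
    return max_run

def max_longstring(scale_responses):
    """Max LongString across a list of per-scale response sequences."""
    return max(longstring(scale) for scale in scale_responses)
```

For example, longstring([2, 2, 3, 3, 3, 3, 1]) returns 4, and max_longstring applied to each participant’s scales gives the value you would histogram.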
So, what’s the difference between industrial and organizational psychology?
The difference these days is quite fuzzy, but it used to be much clearer. Let me tell you a little story.
In the old and ancient times for the field of psychology – which of course means the end of the 19th and first half of the 20th century – there was only one field: industrial psychology. It did not always formally have this name (e.g., people calling themselves “industrial psychologists” were often found in “counseling psychology” or “applied psychology” organizations), but it was what we now think of as historical “industrial psychology.” Industrial psychology was for the most part (although not entirely) focused on improving production in manufacturing and other manual labor sorts of jobs, as well as improving soldier performance on the battlefield (which at that time was also often manual labor).
In manufacturing, managers noticed that employees seemed to work harder sometimes and less hard other times, and they were not sure why. A bunch of researchers with names you’ll recognize if you have studied I/O – Hugo Munsterberg, Walter Dill Scott, James Cattell, and Edward Titchener in particular – promoted the idea that the fledgling field of psychology might be able to shed some light on this. They would of course believe this as they were all students of Wilhelm Wundt, the grandfather of modern psychology.
The growth of industrial psychology was also heavily influenced by and contributed to a movement in the early 1900s called Taylorism, reflecting the viewpoint of Frederick Taylor, a mechanical engineer by training who was inspired by Munsterberg and others. His view was that the American worker was slow, stupid, and unwilling to do any work except by force or threat. However, he also viewed science as the only way to fix the problem he perceived. The popularity of Taylorism (sometimes called “scientific management”) in the US and around the world (Stalin reportedly loved the idea) paved the way for our field to grow, for better or worse.
As a result, industrial psychology at that time had a lot of overlap with what we now call “human factors psychology.” Studies were often conducted like the famous ones by Elton Mayo at the Hawthorne plant of Western Electric, where key elements of the environment – such as lighting – were varied systematically and the effects on worker behavior observed using the scientific method. In fact, if you poke into the history of specific I/O graduate programs, you’ll often find a split between I/O and HF somewhere in their past. The goal of many studies of that era could be described as trying to trick the worker into working harder. The interesting thing about such techniques: they do work… at least to a certain degree.
In addition to trying to change performance while people were at work, other industrial psychologists became interested in hiring. Specifically, many believed that if they could design the “perfect test,” they could find the absolute most productive workers for these businesses. These tests were typically intended to be assessments of intelligence, early versions of what we now conceptualize as latent “general cognitive ability.” One of the earliest and most well-known examples of these efforts was the pair of tests known as the Army Alpha and Army Beta, which the US Army gave to more than a million soldiers in World War I to assess their readiness to become soldiers, place them into specific military positions, and also – the first hint of a later shift – identify high-potential leaders. These tests are early ancestors of a current test that is still maintained and studied by I/O psychologists: the ASVAB.
As industrial psychology grew, so did the feeling that our field was missing something. The Hawthorne studies I referenced earlier are often credited as being the trigger point for this, but Hawthorne better serves as an example of this shift than as its cause. As early as the 1930s, people became aware that industrial psychology’s focus on predicting and improving performance often ignored other aspects of the worker, specifically those involving people’s feelings. Motivation, satisfaction, how people get along with others – these topics were not of much concern among industrial psychologists, and a number of studies, including those at Hawthorne, increased interest in the application of psychology to the broader workplace. They also wondered if performance could be increased further by looking beyond hiring and worker manipulation – perhaps there is more we could do?
Thus, in 1937, the first organization devoted to I/O was created: Section D of the American Association for Applied Psychology, Industrial and Business. The AAAP merged into the American Psychological Association in 1945, rebirthing our field as APA Division 14: Industrial and Business Psychology. The shift from “Business” to “Organization” reflected changing priorities over several decades. Dissatisfaction with the explicit ties to Business (and not, for example, the military, government, etc.) resulted in the division being renamed simply “Industrial Psychology” in 1962. With the shift away from an industrial economy in the 1960s, dissatisfaction with the term “Industrial” led to the name we have today as of 1973: Industrial and Organizational Psychology.
So the short version of this answer is that the distinction between industrial and organizational psychology these days is not a particularly strong one. It is instead based on historical shifts in priorities among the founding and early members of the professional organizations in our field. If I had to split them, I’d say people on the industrial side tend to focus more on things like employee selection, training and development, performance assessment and appraisal, and legal issues associated with all of those. People on the organizational side tend to focus more on things like motivation, teamwork, and leadership. But even with that distinction, people on both sides tend to borrow liberally from the other.
There was also a historical association of industrial psychology with more rigorous experimentation and statistics, largely because the focus on hiring could only be improved with those methods. The topics common to org psych were much broader with much more unexplored territory for a lot longer. But that has changed too – there aren’t many org psych papers published anymore without multilevel or structural equation modeling, as contributions on both the I and O sides have become smaller and more incremental than in the past. The old days of I/O were practically a Wild West! You could essentially just go into an organization, change something systematically, write it up, and you’d have added to knowledge. These days, it’s a lot harder.
Behind the scenes of all these shifts in theory and stance was also a huge ongoing battle with the American Psychological Association over where our field should fit as a professional organization (did you ever think it strange that SIOP incorporated as a non-profit while still a part of APA?), a problem that continues to this day. But that’s a different story!
“The Difference Between Industrial and Organizational Psychology” originally appeared in an answer I wrote on Quora.
Just a few days ago, the new and very promising open-access I/O psychology journal Personnel Assessment and Decisions released its second issue. And it’s full of interesting work, just as I thought it would be. This issue, in fact, is so full of interesting papers that I’ve decided to review/report on a few of them. The first of these, a paper by Nolan, Carter and Dalal, seeks to understand a very common problem in practice: despite overwhelmingly positive and convincing research supporting various I/O psychology practices, hiring managers often resist actually implementing them. A commonly discussed concern from these managers is that they will be recognized less for their work, or perhaps even be replaced, as a result of new technologies being introduced into the hiring process. But this evidence, to date, has been largely anecdotal. Does this really happen? And do managers really resist I/O practices the way many I/Os intuitively believe they do? The short answer suggested by this paper: yes and yes!
In a pair of studies, Nolan and colleagues explore each of these questions specifically in the context of structured interviews. In this case, “structure” is the technology that threatens these hiring managers. But do managers that use structured interviews really get less credit? And are practitioner fears about this really associated with reduced intentions to adopt new practices, despite their evident value?
In the first study, 468 MTurk workers across 35 occupations were sampled. Each was randomly assigned to one cell of a 2 x 2 x 3 design crossing interview structure (high structure or none), decision structure (mechanical or intuitive combination), and outcome (successful, unsuccessful, or unknown). Each participant then read a standard introduction:
Imagine yourself in the following situation…The human resource (HR) manager at your company just hired a new employee to fill an open position. Please read the description of how this decision was made and answer the questions that follow.
After that prompt, the descriptions varied by condition (one of 12 descriptions followed) but were consistent by variable. Specifically, “high interview structure” always indicated the same block of text, regardless of the other conditions. This was done to carefully standardize the experience. Afterward, participants were asked a) if they believed the hiring manager had control over and was the cause of the hiring decision and b) if the decision was consistent (a sample question for this scale: “Using this approach, the same candidate would always be hired regardless of the person who was making the hiring decision.”). I/Os familiar with applicant reactions theory may recognize this as a perceived procedural justice rule.
So what happened? First, outcome didn’t matter much for either dependent variable. Regardless of the actual decision made, outcome and all of its interactions accounted for only 1.9% and 3.3% of the total variance in the two DVs. Of course, in some areas of I/O, 3.3% of the variance is enough to justify something as extreme as theory revision, but in the reactions context, this is a pitifully small effect. So the researchers did not consider it further.
Second, the interview structure and decision structure manipulations created huge main effects. 14% of causality/control’s variance was explained by each manipulation. The total model accounted for 27%, which is a huge effect! For stability, the effect was smaller but still present – 9% and 7% for each manipulation respectively, and 17% for the full model. People perceived managers as having less influence on the process as a result of either type of structure, and because the interactions did not add much predictive power to the model, these effects were essentially independent.
One issue with this study is that these are “paper people.” Decisions about and reactions to paper people can be good indicators of the same situations when involving real people, but there’s an inferential leap required. So if you don’t believe people would react the same way to paper people as real people, then perhaps the results of this study are not generalizable. My suspicion is that the use of paper people may strengthen the effect. So real-world effects are probably a little smaller than this – but at 27% of the variance explained (roughly equivalent to r = .52), there’s a long way down before the effect would disappear. So I’m pretty confident it exists, at the least.
Ok – so people really do seem to judge hiring managers negatively for adopting interview structure. But does that influence hiring manager behavior? Do they really fear being replaced?
In the second study, MTurk was used again, but this time anyone who had no experience with hiring was screened out of the final sample. This resulted in 150 people with such experience, 70% of whom were currently in a position involving supervision of direct reports. Thus, people with current or former hiring responsibilities participated in this second part. People who could be realistically replaced by I/O technology.
The design was a bit different. Because outcome type had shown no effects, it was dropped from the research design, leaving interview structure crossed with decision structure (a 2 x 2, only four conditions). Perceptions of causality/control and consistency were assessed again. But additionally, perceived threat of unemployment by technology was examined (“Consistently using this approach to make hiring decisions would lessen others’ beliefs about the value I provide to my employing organization.”), as well as intentions to use (“I would use this approach to make the hiring decision.”).
The short version: this time, it was personal. The survey essentially asked each hiring manager: if I used these techniques, what would happen? Could I be replaced?
As you might expect in a move to more real-world processes, the effects were a bit smaller this time, but still quite large: 22% of causality/control explained, and 9% of consistency. You can see the effect on causality in this figure.
Nolan et al., 2016, Figure 3: Perceptions of causality/control by condition.
Interestingly, the consistency effect turned into an interaction: apparently having both kinds of structure is worse than just one.
Nolan and colleagues also tested a path model confirming what we all expected:
- Managers who believe others will judge them negatively for using new technologies are less likely to use those technologies.
- This effect occurs indirectly via perceived threat of unemployment by technology.
In summary, some people responsible for hiring fear being replaced by technology, and this decreases their willingness to adopt those technologies. This explains cases I/Os often hear about implementing a new practice in an organization only to discover that nothing has changed because the managers never actually changed anything. In an era where robots are replacing many entry level jobs, this is a legitimate concern!
The key, I think, is to design systems for practice that take advantage of human input. There are many things that algorithms and robots can’t do well (yet) – like making ratings for structured interviews! Emphasizing this and ensuring it is understood up and down the chain of command could reduce this effect.
So for now, we know that being replaced by our products is something managers worry about. This suggests that simply selling a structured interview to a client and leaving it at that is probably not the best approach. Meet with managers, meet with their supervisors, and explain why people are still critical to the process. Only with that can you have some confidence that your structured interview will actually be used!
Always remember the human component to organizations, especially when adding new technology to a process that didn’t have it before. People make the place!
Footnotes:
- Nolan, K. P., Carter, N. T., & Dalal, D. K. (2016). Threat of technological unemployment: Are hiring managers discounted for using standardized employee selection practices? Personnel Assessment and Decisions, 2(1).