Recommendation letters are among the most face-valid predictors of academic and job performance; it is certainly intuitive that someone who knows a candidate well should be able to provide an honest and objective assessment of that person’s capabilities. But despite their ubiquity, little research is available on the actual validity of recommendation letters in predicting academic and job performance. They look like they predict performance; but do they really?
There is certainly reason to be concerned. In the small research literature available on recommendation letters, the results don’t look good. Selection of writers is biased; we usually don’t ask people who hate us to write letters for us. Writers themselves are then biased; many won’t agree to write a recommendation letter if the only letter they could write would be a weak one. And among those who do write letters, the personality of the letter-writer may play a larger role in the content than the ability level of the person being recommended. So given all that, are letters still worth considering?
In a recent issue of the International Journal of Selection and Assessment, Kuncel, Kochevar, and Ones examine the predictive value of recommendation letters for college and graduate school admissions, both as raw relationships with various outcomes of interest and as incremental validity beyond standardized test scores and GPA. The short answer: letters do weakly predict outcomes, but generally add little beyond test scores and GPA. For graduate students, the one outcome for which letters do add some incremental predictive value is degree attainment (which the researchers argue is a more motivation-oriented outcome than test scores or GPA capture) – but even then, not by much.
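To make “incremental validity” concrete: the question is whether adding a letter-quality rating to a regression model that already contains test scores and GPA improves prediction at all. Here is a minimal Python sketch with made-up data – the effect sizes and sample are mine for illustration, not the paper’s – comparing the R² of the two nested models:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Made-up applicant data; effect sizes are illustrative, not the study's
gpa = rng.normal(0, 1, n)                     # standardized GPA
test = rng.normal(0, 1, n)                    # standardized test score
letter = 0.3 * gpa + rng.normal(0, 1, n)      # letter rating, weakly ability-loaded
outcome = 0.4 * gpa + 0.3 * test + 0.1 * letter + rng.normal(0, 1, n)

def r_squared(predictors, y):
    """R^2 from an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_base = r_squared([gpa, test], outcome)
r2_full = r_squared([gpa, test, letter], outcome)
print(f"R^2, GPA + test scores:  {r2_base:.3f}")
print(f"R^2, adding letters:     {r2_full:.3f}")
print(f"Incremental validity:    {r2_full - r2_base:.3f}")
```

The small gap between the two R² values is the entire case for (or against) the letters; this is essentially the comparison the meta-analysis summarizes.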
Kuncel and colleagues came to this conclusion by conducting a meta-analysis of the existing literature on recommendation letters, which unfortunately is not terribly extensive. The largest number of studies appearing in any particular analysis was 16; most analyses summarized only 5 or 6 studies. Thus the confidence intervals surrounding their estimates are likely quite wide, leaving considerable uncertainty around the precise values. That doesn’t necessarily threaten the validity of their conclusions – these are certainly the best estimates of recommendation letter validity available right now – but it does highlight the somewhat desperate need for more research in this area.
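To see why so few studies leave so much uncertainty, here is a rough sketch using hypothetical correlations and sample sizes (not the paper’s data) and simple fixed-effect aggregation via the standard Fisher r-to-z transformation. The 95% confidence interval shrinks only as the pooled sample grows:

```python
import numpy as np

def meta_mean_r(correlations, sample_sizes):
    """Fixed-effect meta-analytic mean correlation and 95% CI,
    using the standard Fisher r-to-z transformation."""
    z = np.arctanh(np.asarray(correlations, dtype=float))
    w = np.asarray(sample_sizes, dtype=float) - 3      # inverse-variance weights
    z_bar = np.sum(w * z) / np.sum(w)
    se = 1.0 / np.sqrt(np.sum(w))
    lo, hi = np.tanh(z_bar - 1.96 * se), np.tanh(z_bar + 1.96 * se)
    return np.tanh(z_bar), lo, hi

# Hypothetical studies, all n = 100: compare k = 5 against k = 16
rs = [0.10, 0.25, 0.15, 0.30, 0.20]
r5, lo5, hi5 = meta_mean_r(rs, [100] * 5)
r16, lo16, hi16 = meta_mean_r(rs * 3 + [0.20], [100] * 16)
print(f"k = 5:  mean r = {r5:.2f}, 95% CI [{lo5:.2f}, {hi5:.2f}]")
print(f"k = 16: mean r = {r16:.2f}, 95% CI [{lo16:.2f}, {hi16:.2f}]")
```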
Another caveat to these findings: the studies included in any meta-analysis must have reported enough information to compute correlation estimates for the relationships of interest. In this case, that means the included studies needed to have quantified recommendation letter quality. I suspect many people reading recommendation letters instead interpret them holistically – for example, reading the entire letter and forming a general judgment about how strong it was. That holistic judgment is probably then combined with other holistic judgments to make an actual selection decision. Given what we know about statistical versus holistic combination (i.e., there is basically no good reason to use holistic combination), whatever incremental value recommendation letters provide may be lost in such very human, very flawed judgments.
So the conclusion? At the very least, it doesn’t look like using recommendation letters hurts the validity of selection. If you want to use such letters, you will likely get the most impact by coming up with a reasonable numerical scale (e.g., 1 to 10), assigning each letter you receive a value on that scale to indicate how strong the endorsement is, and averaging those ratings for each applicant. Then combine that average statistically with the other components of your selection system (e.g., GPA and standardized test scores), as in the sketch below.
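Here is a minimal sketch of what that mechanical combination might look like. The applicants, the 1-to-10 ratings, and the component weights are all hypothetical assumptions on my part, not recommendations from the paper; in practice the weights should come from local validity evidence:

```python
import numpy as np

def zscore(x):
    """Put components on a common scale before weighting."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Hypothetical applicants: GPA, test score, and per-letter ratings (1-10)
gpa = [3.9, 3.4, 3.7, 3.1]
test = [720, 650, 690, 600]
letter_ratings = [[8, 9], [6, 7], [9, 7, 8], [5, 6]]
letters = [np.mean(r) for r in letter_ratings]  # average across each applicant's letters

# Illustrative weights; derive real ones from validity evidence
composite = 0.4 * zscore(gpa) + 0.4 * zscore(test) + 0.2 * zscore(letters)

for i in np.argsort(composite)[::-1]:           # best applicant first
    print(f"Applicant {i + 1}: composite = {composite[i]:+.2f}")
```

The z-scoring step is the important design choice here: without it, a 1-to-10 letter rating would be swamped by test scores in the hundreds, and the weights would be meaningless.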