
Where to Place Demographics on Your Surveys

2011 December 7
by Richard N. Landers

One of the questions faced by survey designers is presentation order.  Does it matter if I put the demographics first?  Should I put the cognitive items up front because they require more attention?  If I put 500 personality items in a row, will anyone actually complete this thing?  Some recent research in the Journal of Business and Psychology reveals that placing demographic items at the beginning of a survey increases the response rate to those items in comparison to placing them at the end.  More importantly, it did not affect scores on the three noncognitive measures that came afterward: in this case, leadership, conflict resolution, and culture and goals measures.

To investigate this, Teclaw, Price and Osatuke1 conducted a large survey (roughly N = 75,000) on behalf of the Veterans Health Administration.  Respondents were randomly assigned to one of three surveys.  Of those randomly assigned to the third survey, participants received one of seven scales, three of which were those listed above, resulting in a sample size for this study of N = 4508.  Respondents completing each of these three scales were in turn randomly assigned to complete demographic items at either the beginning or the end of their survey.  The authors compared response rates, considering both skipped items and “Don’t Know” responses to be a lack of response, and included all respondents who opened the survey regardless of how many questions were actually completed.

Response rates were indeed different.  On the first of the three focal surveys, the response rate to demographics placed at the beginning of the survey was around 97%, while the response rate to demographics placed at the end was around 87%.  While this isn’t a huge difference, if demographics are involved in your primary research questions (and they often are), placing them first may be a good idea.

What’s especially interesting about this is that conventional wisdom is to place demographic items at the end.  The argument that I have most often heard is that priming your survey respondents with their demographic characteristics (e.g. race) will lead them to respond differently than they otherwise would have.  This is especially salient in the context of race-based stereotype threat, the tendency for minority group performance on cognitive measures to decrease as a result of anxiety associated with confirming negative stereotypes about intelligence.  So what should we do?

There are two important facts about this study that limit its applicability.  First, all measures investigated were noncognitive, i.e. survey items.  Stereotype threat typically applies in contexts where there is a “right answer,” for example, knowledge tests or intelligence tests.  So the placement of demographic items on such measures may still be important.  Second, the study did not control for cognitive fatigue – survey length was confounded with experimental condition.  Did response rates drop because the demographic items appeared at the end, or simply because respondents had already answered many, many items and were bored, tired, or short on time?  Would the effect still hold with a 20-item survey?  A 50-item survey?  We don’t really know.

If you’re giving a noncognitive voluntary survey, you are probably interested in the demographics specifically and want to ensure they are answered at a higher rate than any other items.  For now, it appears to be safe to put demographic items up front if that is your goal.  Whether your survey is 20 items or 200 items, moving the demographic items to the front is a low-cost change.  But if your survey has cognitively loaded items, I’d still recommend against it.

  1. Teclaw, R., Price, M., & Osatuke, K. (2011). Demographic question placement: Effect on item response rates and means of a Veterans Health Administration survey. Journal of Business and Psychology. DOI: 10.1007/s10869-011-9249-y

New Research Links Social Media Marketing and Purchase Intentions

2011 November 22
by Richard N. Landers

One of the biggest challenges associated with this newfangled social media is demonstrating monetary return on investment (ROI).  A properly run social media campaign can be very expensive, as it takes a lot of time to properly engage an audience.  Up to this point, there has been little to link social media to ROI other than practitioners’ intuitive sense that “of course it must have value!”  Fortunately, a new research study soon to be published in the Journal of Business Research finally ties social media marketing to some more tangible outcomes: customers with better perceptions of a brand’s social media marketing are more likely to purchase that brand.

In their study, Kim and Ko1 first developed a list of luxury fashion brands to be the focus of the study.  They did this by asking a team of fifteen graduate students to list three brands “that came to mind when thinking of luxury.”  The list ultimately consisted of Louis Vuitton, Gucci, Burberry and Dolce & Gabbana, and from this list, Louis Vuitton was chosen as the focus of the study.  This made sense because the next phase of the study would be conducted in Korea, where Louis Vuitton was a high-profile brand with a strong social media presence.

Next, researchers staked out malls in luxury shopping districts in Seoul, where they intercepted shoppers and provided them with a survey asking about their perceptions of Louis Vuitton’s social media presence and of the Louis Vuitton brand.  Participants were shown a picture of Louis Vuitton’s Facebook page and Twitter feed before responding to the questionnaires.  400 surveys were collected, 362 of which contained complete data, enabling researchers to examine how beliefs about social media were related to brand perceptions.

Using structural equation modeling, the authors demonstrated that three mediators linked social media marketing perceptions to purchase intentions and customer equity.  If you aren’t familiar with mediation, the basic idea is that A affects B only through its effect on C.  For example, eating candy does not itself make one happy – instead, it’s because candy is delicious that it makes one happy.  Candy creates deliciousness, and deliciousness creates happiness.  We would conclude from this that the relationship between candy and happiness is mediated by deliciousness.
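If you’d like to see mediation mechanically, here is a minimal sketch in Python using simulated data and the product-of-coefficients approach – my own toy illustration of the concept, not the structural equation model the authors actually fit:

    # Toy mediation example: candy -> deliciousness -> happiness.
    # The indirect effect is estimated as the product of the A->C and
    # C->B regression coefficients (a*b).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 1000
    candy = rng.normal(size=n)                    # A: candy eaten
    delicious = 0.6 * candy + rng.normal(size=n)  # C: deliciousness, driven by A
    happy = 0.5 * delicious + rng.normal(size=n)  # B: happiness, driven only by C

    a = sm.OLS(delicious, sm.add_constant(candy)).fit().params[1]  # path A -> C
    full = sm.OLS(happy, sm.add_constant(np.column_stack([candy, delicious]))).fit()
    b = full.params[2]       # path C -> B, controlling for A
    direct = full.params[1]  # direct A -> B effect; near zero under full mediation

    print(f"indirect effect (a*b) = {a * b:.3f}, direct effect = {direct:.3f}")

With data generated this way, the indirect effect lands near 0.3 (0.6 × 0.5) and the direct effect near zero – the signature of full mediation.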

Here are the three mediators tested:

  • Value Equity.  This is a customer’s assessment of how “worth it” the product is.  If it’s priced well for a good product, you have very high value equity.  If it’s priced badly for a poor product, you have very low value equity.
  • Relationship Equity.  This is a customer’s assessment of how loyal they are to a brand.
  • Brand Equity.  This is a customer’s assessment of the added value of the brand beyond the product itself; for example, a customer with high brand equity would perceive a Louis Vuitton product to be of greater value than an identical product from another brand.

Researchers looked at two outcomes:

  • Purchase Intentions.  Hopefully this one’s pretty obvious.  However, the time frame was not clear from the article – intentions within the next month?  Six months?  A year?
  • Customer Equity.  This is a customer’s assessment of their expected lifetime value with a brand, a combination of expected total purchases, purchase frequency, purchase volume, expected purchases over other brands, and a few other features.

The researchers found a relationship between perceptions of social media marketing and all three mediators, with the strongest relationships to brand equity and relationship equity.  But the relationships between mediators and outcomes were more complex: only value equity and brand equity predicted purchase intentions, while only brand equity predicted customer equity.

Overall, we can conclude that customer perceptions of social media marketing are linked to purchase intentions in part through their effect on value equity.  Or in other words, the better a customer perceives your social media marketing effort, the more likely they are to think your products give them more for their money, and in turn the more likely they are to actually purchase something from you.

There are certainly some limitations.  First, the study is limited to the variables chosen; there may be other mediators not examined that could affect outcomes.  Second, actual behavioral outcomes (e.g. purchases) were not measured; we are relying on self-reports of purchase intentions.  Third, and most importantly, as a survey-based study, it supports no causal conclusions.  So we cannot safely say “if you increase your social media marketing efforts, more people will intend to purchase your products.”  That is left to future research.  But even with these limitations, this is the first study to explicitly tie social media efforts to measurable, cash-related outcomes.

  1. Kim, A., & Ko, E. (2011). Do social media marketing activities enhance customer equity? An empirical study of luxury fashion brand. Journal of Business Research. DOI: 10.1016/j.jbusres.2011.10.014

Computing Intraclass Correlations (ICC) as Estimates of Interrater Reliability in SPSS

2011 November 16
by Richard N. Landers

If you think my writing about statistics is clear below, consider my student-centered, practical and concise Step-by-Step Introduction to Statistics for Business for your undergraduate classes, available now from SAGE.  Social scientists of all sorts will appreciate the ordinary, approachable language and practical value – each chapter starts with a young small business owner facing a problem solvable with statistics, and that problem is solved by the end of the chapter with the statistical kung-fu gained along the way.


This article has been published in the Winnower. You can cite it as:
Landers, R.N. (2015). Computing intraclass correlations (ICC) as estimates of interrater reliability in SPSS. The Winnower 2:e143518.81744. DOI: 10.15200/winn.143518.81744

You can also download the published version as a PDF by clicking here.


Recently, a colleague of mine asked for some advice on how to compute interrater reliability for a coding task, and I discovered that there aren’t many resources online written in an easy-to-understand format – most either 1) go in depth about formulas and computation or 2) go in depth about SPSS without giving many specific reasons for why you’d make several important decisions.  The primary resource available is a 1979 paper by Shrout and Fleiss1, which is quite dense.  So I am taking a stab at providing a comprehensive but easier-to-understand resource.

Reliability, generally, is the proportion of “real” information about a construct of interest captured by your measurement of it.  For example, if someone reported that the reliability of their measure was .8, you could conclude that 80% of the variability in the scores captured by that measure represented the construct, and 20% represented random variation.  The more consistent your measurement, the higher its reliability will be.
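In classical test theory terms (standard psychometrics, not anything specific to this post), that proportion is just a ratio of variances:

    reliability = true-score variance / observed variance
                = var(true) / (var(true) + var(error))

A reliability of .8 simply says the error slice of the observed variance is 20%.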

In the social sciences, we often have research participants complete surveys, in which case you don’t need ICCs – you would more typically use coefficient alpha.  But when research participants provide something about themselves from which you need to extract data, your measurement becomes whatever you get from that extraction.  For example, in one of my lab’s current studies, we are collecting copies of Facebook profiles from research participants, after which a team of lab assistants looks them over and makes ratings based upon their content.  This process is called coding.  Because the research assistants are creating the data, their ratings are my scale – not the original profiles.  That means they 1) make mistakes and 2) vary in their ability to make those ratings.  An estimate of interrater reliability tells me what proportion of their ratings is “real”, i.e. represents an underlying construct (or potentially a combination of constructs – there is no way to know from reliability alone; all you can conclude is that you are measuring something consistently).

An intraclass correlation (ICC) can be a useful estimate of inter-rater reliability on quantitative data because it is highly flexible.  A Pearson correlation can be a valid estimator of interrater reliability, but only when you have meaningful pairings between two and only two raters.  What if you have more?  What if your raters differ by ratee?  This is where ICC comes in (note that if you have qualitative data, e.g. categorical data or ranks, you would not use ICC).

Unfortunately, this flexibility makes ICC a little more complicated than many estimators of reliability.  While you can often just throw items into SPSS to compute a coefficient alpha on a scale measure, there are several additional questions one must ask when computing an ICC, and one restriction. The restriction is straightforward: you must have the same number of ratings for every case rated. The questions are more complicated, and their answers are based upon how you identified your raters, and what you ultimately want to do with your reliability estimate. Here are the first two questions:

  1. Do you have consistent raters for all ratees?  For example, do the exact same 8 raters make ratings on every ratee?
  2. Do you have a sample or population of raters?

If your answer to Question 1 is no, you need ICC(1).  In SPSS, this is called “One-Way Random.”  In coding tasks, this is uncommon, since you can typically control the number of raters fairly carefully.  It is most useful with massively large coding tasks.  For example, if you had 2000 cases to rate and wanted 2 ratings per case, you might assign your 10 research assistants to make 400 ratings each, counterbalanced so that a random two raters make ratings on each case.  It’s called “One-Way Random” because 1) it makes no effort to disentangle the effects of the rater and ratee (i.e. one effect) and 2) it assumes these ratings are randomly drawn from a larger population (i.e. a random effects model).  ICC(1) will always be the smallest of the ICCs.

If your answer to Question 1 is yes and your answer to Question 2 is “sample”, you need ICC(2).  In SPSS, this is called “Two-Way Random.”  Unlike ICC(1), this ICC assumes that the variance of the raters is only adding noise to the estimate of the ratees, and that mean rater error = 0.  Or in other words, while a particular rater might rate Ratee 1 high and Ratee 2 low, it should all even out across many raters.  Like ICC(1), it assumes a random effects model for raters, but it explicitly models this effect – you can sort of think of it like “controlling for rater effects” when producing an estimate of reliability.  If you have the same raters for each case, this is generally the model to go with.  This will always be larger than ICC(1) and is represented in SPSS as “Two-Way Random” because 1) it models both an effect of rater and of ratee (i.e. two effects) and 2) assumes both are drawn randomly from larger populations (i.e. a random effects model).

If your answer to Question 1 is yes and your answer to Question 2 is “population”, you need ICC(3).  In SPSS, this is called “Two-Way Mixed.”  This ICC makes the same assumptions as ICC(2), but instead of treating rater effects as random, it treats them as fixed.  This means that the raters in your task are the only raters anyone would be interested in.  This is uncommon in coding, because theoretically your research assistants are only a few of an unlimited number of people that could make these ratings.  This means ICC(3) will also always be larger than ICC(1) and typically larger than ICC(2), and is represented in SPSS as “Two-Way Mixed” because 1) it models both an effect of rater and of ratee (i.e. two effects) and 2) assumes a random effect of ratee but a fixed effect of rater (i.e. a mixed effects model).
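For the formula-inclined, the Shrout and Fleiss paper cited below gives the single-rater version of each model in terms of ANOVA mean squares, where BMS is the between-ratee mean square, WMS the within-ratee mean square, JMS the between-rater mean square, EMS the residual mean square, k the number of raters per ratee, and n the number of ratees:

    ICC(1,1) = (BMS − WMS) / (BMS + (k − 1)·WMS)
    ICC(2,1) = (BMS − EMS) / (BMS + (k − 1)·EMS + k·(JMS − EMS)/n)
    ICC(3,1) = (BMS − EMS) / (BMS + (k − 1)·EMS)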

After you’ve determined which kind of ICC you need, there is a second decision to be made: are you interested in the reliability of a single rater, or of their mean?  If you’re coding for research, you’re probably going to use the mean rating.  If you’re coding to determine how accurate a single person would be if they made the ratings on their own, you’re interested in the reliability of a single rater.  For example, in our Facebook study, we want to know both.  First, we might ask “what is the reliability of our ratings?”  Second, we might ask “if one person were to make these judgments from a Facebook profile, how accurate would that person be?”  We add “,k” to the ICC rating when looking at means, or “,1” when looking at the reliability of single raters.  For example, if you computed an ICC(2) with 8 raters, you’d be computing ICC(2,8).  If you computed that same ICC(2) but were interested in a single rater, you’d be computing ICC(2,1).  Even for ICC(#,1), a large number of raters produces a narrower confidence interval around your reliability estimate than a small number, which is why you’d still want as many raters as possible.
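The single-rater and mean versions are linked by the standard Spearman-Brown formula (textbook psychometrics, not something in the SPSS output itself), so one is always computable from the other:

    ICC(#,k) = k·ICC(#,1) / (1 + (k − 1)·ICC(#,1))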

After you’ve determined which specificity you need, the third decision is to figure out whether you need a measure of absolute agreement or consistency.  If you’ve studied correlation, you’re probably already familiar with this concept: if two variables are perfectly consistent, they don’t necessarily agree.  For example, consider Variable 1 with values 1, 2, 3 and Variable 2 with values 7, 8, 9.  Even though these scores are very different, the correlation between them is 1 – so they are highly consistent but don’t agree.  If using a mean [ICC(#,k)], consistency is typically fine, especially for coding tasks, as mean differences between raters won’t affect subsequent analyses on that data.  But if you are interested in determining the reliability of a single individual, you probably want to know how closely that person’s score tracks the real value, which calls for absolute agreement.
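To put numbers on that toy example: treating those two variables as two raters and plugging them into the formulas above, the consistency version [ICC(3,1)] comes out to exactly 1.00, while the absolute-agreement version [ICC(2,1)] comes out to roughly .05.  That is my own arithmetic rather than an example from the paper, so treat the exact value as illustrative of the gap.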

Once you know what kind of ICC you want, it’s pretty easy in SPSS.  First, create a dataset with columns representing raters (e.g. if you had 8 raters, you’d have 8 columns) and rows representing cases.  You’ll need a complete dataset for each variable you are interested in.  So if you wanted to assess the reliability of 8 raters on 50 cases across 10 variables being rated, you’d have 10 datasets containing 8 columns and 50 rows (400 data points per dataset, 4000 in total).

A special note for those of you using surveys: if you’re interested in the inter-rater reliability of a scale mean, compute ICC on that scale mean – not the individual items. For example, if you have a 10-item unidimensional scale, calculate the scale mean for each of your rater/target combinations first (i.e. one mean score per rater per ratee), and then use that scale mean as the target of your computation of ICC. Don’t worry about the inter-rater reliability of the individual items unless you are doing so as part of a scale development process, i.e. you are assessing scale reliability in a pilot sample in order to cut some items from your final scale, which you will later cross-validate in a second sample.

In each dataset, you then need to open the Analyze menu, select Scale, and click on Reliability Analysis.  Move all of your rater variables to the right for analysis. Click Statistics and check Intraclass correlation coefficient at the bottom.  Specify your model (One-Way Random, Two-Way Random, or Two-Way Mixed) and type (Consistency or Absolute Agreement).  Click Continue and OK.  You should end up with something like this:

Results of a Two-Way Random Consistency ICC Calculation in SPSS

In this example, I computed an ICC(2) with 4 raters across 20 ratees.  You can find ICC(2,1) in the first line, Single Measures: ICC(2,1) = .169.  The second line, Average Measures, contains ICC(2,k), which in this case is ICC(2,4): ICC(2,4) = .449.  Therefore, 44.9% of the variance in the mean of these raters is “real”.
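As a sanity check, the Spearman-Brown formula from earlier reproduces the Average Measures value from the Single Measures one: ICC(2,4) = 4(.169) / (1 + 3(.169)) = .676 / 1.507 ≈ .449.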

So here’s the summary of this whole process:

  1. Decide which category of ICC you need.
    1. Determine if you have consistent raters across all ratees (e.g. always 3 raters, and always the same 3 raters).  If not, use ICC(1), which is “One-Way Random” in SPSS.
    2. Determine if you have a population of raters.  If yes, use ICC(3), which is “Two-Way Mixed” in SPSS.
    3. If you didn’t use ICC(1) or ICC(3), you need ICC(2), which assumes a sample of raters, and is “Two-Way Random” in SPSS.
  2. Determine which value you will ultimately use.
    1. If a single individual, you want ICC(#,1), which is “Single Measures” in SPSS.
    2. If the mean, you want ICC(#,k), which is “Average Measures” in SPSS.
  3. Determine which set of values you ultimately want the reliability for.
    1. If you want to use the subsequent values for other analyses, you probably want to assess consistency.
    2. If you want to know the reliability of individual scores, you probably want to assess absolute agreement.
  4. Run the analysis in SPSS.
    1. Analyze>Scale>Reliability Analysis.
    2. Select Statistics.
    3. Check “Intraclass correlation coefficient”.
    4. Make choices as you decided above.
    5. Click Continue.
    6. Click OK.
    7. Interpret output.
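If you’d like to double-check SPSS’s numbers (or skip SPSS entirely), here is a minimal sketch in Python.  It relies on the third-party pingouin package and made-up ratings, so treat the data and variable names as assumptions rather than part of this tutorial; pingouin reports all six Shrout and Fleiss ICCs at once:

    # Cross-check of the SPSS procedure above using the third-party
    # pingouin package and fabricated example ratings.
    import pandas as pd
    import pingouin as pg

    # Wide format, as described above: one column per rater, one row per ratee
    wide = pd.DataFrame({
        "rater1": [3, 4, 2, 5, 3, 4],
        "rater2": [2, 4, 2, 4, 3, 3],
        "rater3": [3, 5, 2, 5, 4, 4],
        "rater4": [2, 4, 1, 4, 3, 4],
    })
    wide["ratee"] = wide.index

    # pingouin expects long format: one row per (ratee, rater) pair
    long_df = wide.melt(id_vars="ratee", var_name="rater", value_name="rating")

    # Returns ICC(1,1), ICC(2,1), ICC(3,1) and their average-measures
    # counterparts ICC(1,k), ICC(2,k), ICC(3,k), with confidence intervals
    print(pg.intraclass_corr(data=long_df, targets="ratee",
                             raters="rater", ratings="rating"))

Note that pingouin follows the Shrout and Fleiss numbering, so its ICC2 row should correspond to the Two-Way Random, Absolute Agreement, Single Measures value in SPSS – worth verifying against your own SPSS output before relying on it.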
  1. Shrout, P., & Fleiss, J. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428. DOI: 10.1037/0033-2909.86.2.420