Recently, Old Dominion University embarked on an initiative to improve the teaching of disciplinary writing in courses university-wide. This is part of ODU’s Quality Enhancement Plan, a broader effort to improve undergraduate instruction in general. It’s an extensive program, involving additional instructor training and internal grant competitions, among other efforts.
Writing quality is one of the best indicators of deep understanding of subject matter, and feedback on that writing is among the most valuable content an instructor can provide. Unfortunately, large class sizes have shifted responsibility for grading writing from the faculty teaching courses to the graduate teaching assistants supporting them. Or more plainly: when you have 150 students in a single class, there’s simply no way to provide detailed writing feedback yourself several times in a semester, on top of your other duties as an instructor, without working substantial overtime.
With that in the background, I was pleasantly surprised to discover a new paper by Doe, Gingerich, and Richards1 in the most recent issue of Teaching of Psychology on training graduate teaching assistants to evaluate writing. In their study, they compared 12 GTA graders, who completed a 5-week course on the “theory and best practices” of grading, with 8 professional graders at two time points, about 3 months apart. For this study, the professional graders were considered the “gold standard” for grading quality.
Overall, the researchers found that GTAs were more lenient than professional graders at both time points. Both groups provided higher grades at Time 2 than at Time 1. This is most likely due to student learning over the course of the semester, although it might instead reflect differences in assignment difficulty – the researchers did not describe any attempt to account for this. Professional graders assessed papers blind to time point, ruling out a purely temporal effect.
The researchers also found a significant interaction between GTA grader identity and time, indicating that graders changed at different rates over time. GTAs were not homogeneous, which suggests important individual-difference moderators (perhaps some GTAs become harsher while others become more lenient?). The researchers also assessed comment quality, finding that the GTA raters’ comment quality increased over time, but with differences across dimensions of comment quality (e.g., discussion of strengths and weaknesses, rhetorical concerns, positive feedback). The relative magnitudes of these increases were not examined, although they could be computed from the provided table.
One major issue with this study is that the reliabilities of the comment quality outcomes are quite low. Correlations between raters ranged from .69 to .92, but both raters assessed only 10% of the papers. Using these correlations as estimates of inter-rater reliability assumes that both raters rate every paper and that the mean rating is used. Since most papers were assessed by only one rater, these reliabilities are overestimates and should have been corrected down using the Spearman-Brown prophecy formula. Having said that, the effect of low reliability is that observed relationships are attenuated – that is, they are smaller in the dataset than they should be. So had this been done correctly, the researchers’ results would have been even stronger. The effect of time (and of grader) may be much larger than indicated here.
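To make that correction concrete, here is a minimal sketch in Python (the function names are mine; the .69 and .92 endpoints are the reported range, and the two-rater assumption follows the argument above):

```python
def spearman_brown(r, k):
    """Spearman-Brown prophecy: predicted reliability of the mean of k
    parallel raters, given single-rater reliability r."""
    return (k * r) / (1 + (k - 1) * r)


def single_rater_reliability(r_mean, k=2):
    """Inverse Spearman-Brown: the single-rater reliability implied when a
    reported coefficient reflects the mean of k raters (assumed: k = 2)."""
    return r_mean / (k - (k - 1) * r_mean)


# Stepping the reported range down from a two-rater mean to a single rater:
for r in (0.69, 0.92):
    print(f"reported {r:.2f} -> single rater {single_rater_reliability(r):.2f}")
# reported 0.69 -> single rater 0.53
# reported 0.92 -> single rater 0.85
```

Even the high end of the reported range drops to about .85 for a single rater, and the low end falls to about .53, below the conventional .70 benchmark – which is why the attenuation issue matters.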
Overall, the researchers conclude that the assumption of GTA quality when grading writing assignments is misplaced. Even with training and practice, GTA performance did not reach that of professional graders. On the bright side, training did help. The researchers conclude that continuous training is necessary to ensure high-quality grading – a one-time training, or I suspect more commonly no training at all, is insufficient.
One thing that was not assessed by the researchers was a comparison of GTA comment quality and professional comment quality versus professor comment quality. But perhaps that would hit a bit too close to home!
- Doe, S. R., Gingerich, K. J., & Richards, T. L. (2013). An evaluation of grading and instructional feedback skills of graduate teaching assistants in introductory psychology. Teaching of Psychology, 40(4), 274-280. DOI: 10.1177/0098628313501039
Over the last two days, I attended the OpenVA conference and its Minding the Future preconference. The purpose of these events was to bring together faculty in Virginia to talk about open and digital learning resources. Speakers in the culminating session today described a vision of the future where content creators (faculty, instructional designers, etc.) create educational resources and release them on the web for all the world to use.
That idea feels great to me. Emotionally, I really enjoy the idea of an open and free-wheeling exchange of academic resources. Have a great way to teach a particular concept? Put it online! Share it with faculty of the world! If it’s good, it might be adopted, updated, remixed to make it even better. I think most folks would agree that greater inter-faculty communication and sharing is a good thing.
And yet, a proliferation of open and digital learning resources has not occurred. Thousands of lecture videos from enthusiastic faculty haven’t been posted online. Huge collections of free, innovative interactive Flash and HTML5 apps have not appeared. Massive repositories of activities for each topic in many courses don’t exist. Even the massive open online courses available today are open only in terms of enrollment, not in provided resources. In my field, the number of resources in our biggest teaching repository, which has been around for a few years now, is maybe around 40, many of which are just links to pre-existing YouTube videos. Folks are thrilled to use others’ open resources but reluctant to share their own. So what’s happening? Perhaps it is more useful to think about why people aren’t producing such resources rather than why they are.
Let’s start simple. How many faculty have posted all of their lecture materials to YouTube? All of their slides to Slideshare? I’m guessing that, as a proportion of the faculty who have created such resources, the number is extremely low. I use this example specifically because this would be the easiest content to share – content that has already been created for a course and could be shared with just a few clicks. Although I believe in the value of open educational resources, I don’t post my own either.
Why not? Loss of ownership. This is not in the sense that I want to make money off of my teaching techniques. Instead, I worry that if I were to post such resources, how would they be used? What would stop a University of Phoenix from charging students to take a class using content that I have provided for free? What would stop a University of Walmart from taking my content on unfair treatment of employees in the workplace, replacing a few slides, using it to train its employees on why Walmart is the most fair and equitable employer around, and claiming the ideas were originally mine?
It is this unknown that terrifies me about fully open educational resources. Although I feel comfortable and enthusiastic about sharing resources that I have carefully prepared, vetted, double-checked, and delivered unto the world, I feel much less comfortable sharing everything because I have zero control over what happens to that content once it leaves my computer.
And before you shout “Luddite,” I’ll point out that I’m a digital native and a Millennial, at the front of a wave of incoming Millennial faculty. I wrote my first computer program when I was five years old. This is not necessarily an issue of age or tech savviness. It is deeper than that, and it will not be solved by faculty retirements alone.
I imagine (and hope) there’s a solution to such concerns, but these are the sorts of issues that must be addressed before open resources can become the powerful educational movement that so many desire them to be. If you have a solution, I invite you to share it!
There are two major approaches to data collection with respect to time. Typically, we collect cross-sectional data. This type of data is collected at a single point in time. For example, we might ask someone to complete a single survey. Atypically, we collect longitudinal data. This type of data is collected at multiple points in time, and changes over time are examined. For example, we might collect behavioral data about students today, next week, and the week after that.
The relative advantages and disadvantages of these two approaches can be illustrated by the following similes. Cross-sectional data is like a photograph, whereas longitudinal data is like a video. You might be able to get all the information you need out of a photograph; it does, after all, provide a snapshot of whatever was going on at a particular point in time. But when holding a photograph, you don’t really know what happened before or after it was taken. You must generally assume that whatever you’re interested in was the same over time, and that is not always a safe assumption. Although video solves the timeline problem, it introduces new problems – it’s more complicated to collect (hold that camera still!) and you’ll need to watch the whole thing to figure out what’s going on.
The same is true of cross-sectional versus longitudinal data. Longitudinal data is exceptionally difficult to collect because we usually rely on the kindness of volunteers to complete our studies. When you ask someone to come back to your lab six times, they’re substantially less likely to show up at Time 6 than at Time 1. So that has led researchers to investigate alternate techniques for collecting this type of data. One such approach is to provide mobile phones with pre-installed data collection apps to research participants for long-term use, but little research is available describing how well such an approach should work.
An article in Social Science Computer Review by van Heerden and colleagues1 sheds some light on this issue. The path from phone purchase to actual data collection in a South African sample is fascinating:
- 1000 phones were purchased.
- 996 phones were functional and distributed to research participants over the course of a year.
- One month after distribution ended, 734 of the 996 distributed phones were found to still be accessible. Of the 262 missing:
- 25 had their SIM card changed (making the phone impossible to track and changing its phone number)
- 32 had deleted the researcher’s data collection software
- 205 were reported as lost, stolen, broken, or given away
- Of the 734 phones, 435 phones were selected at random to be sent two surveys, two weeks apart, via text message.
- Of the 435 phones text messaged, messages were successfully delivered to 288 for Survey 1 and 271 for Survey 2. Of the failures:
- 147 were undeliverable for Survey 1
- 21 were successfully received for Survey 1 but not for Survey 2
- 4 were successfully received for Survey 2 but not for Survey 1
- Of the 288 successful text message deliveries for Survey 1, 105 completed Survey 1
- Of the 271 successful text message deliveries for Survey 2, 84 completed Survey 2
Thus, of the original 996 distributed phones, only 8.4% (84) resulted in complete data for Survey 2. Even considering only successful deliveries, completion rates were just 36.5% for Survey 1 (105 of 288) and 31.0% for Survey 2 (84 of 271). That’s pretty depressing.
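For concreteness, the attrition arithmetic can be reproduced from the counts above (a quick Python sketch; the labels are mine):

```python
# Attrition funnel, with counts taken from the list above.
funnel = {
    "distributed": 996,
    "accessible_after_one_month": 734,
    "sampled_for_surveys": 435,
    "survey1_delivered": 288,
    "survey2_delivered": 271,
    "survey1_completed": 105,
    "survey2_completed": 84,
}

# Completion rates relative to distributed phones and to successful deliveries.
print(f"Survey 2 / distributed: {funnel['survey2_completed'] / funnel['distributed']:.1%}")        # 8.4%
print(f"Survey 1 / delivered:   {funnel['survey1_completed'] / funnel['survey1_delivered']:.1%}")  # 36.5%
print(f"Survey 2 / delivered:   {funnel['survey2_completed'] / funnel['survey2_delivered']:.1%}")  # 31.0%
```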
The researchers actually present this as positive evidence for the use of mobile devices in this manner, considering that meta-analytic evidence suggests a mean expected response rate of 34.0% for web surveys (higher for mail surveys). But considering the extreme expense involved in this approach, I am not convinced. The authors suggest that the next step would be to probe the use of research participants’ own mobile phones, which is certainly a good idea, but I wonder why they didn’t do this in the first place – perhaps they did not expect their population to already own mobile phones.
- van Heerden, A. C., Norris, S. A., Tollman, S. M., Stein, A. D., & Richter, L. M. (2013). Field lessons from the delivery of questionnaires to young adults using mobile phones. Social Science Computer Review, 1-8. DOI: 10.1177/0894439313504537