Learn Web Scraping/Data Science at SIOP, APA, and IPAC Workshops

2017 February 22

Image by Arpit Agrawal

Are you a psychologist interested in learning some new techniques to leverage data science in your academic research or in your consulting practices? Web scraping may be the answer you need.

Last year, I published the first in what will likely be a series of articles focused on teaching psychologists techniques from data science. Specifically, I introduced the concept of web scraping, which involves the systematic, algorithmic curation of unstructured online data, usually from social media, and its conversion into an analyzable dataset. I furthermore provided a step-by-step tutorial explaining how to use the free programming language Python and its free package scrapy to do just that.
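To make "conversion into an analyzable dataset" concrete, here is a minimal Python sketch. It is not the code from the scrapy tutorial or from the workshops. It uses only the standard library, and the HTML fragment, the class names (`post`, `author`, `text`), and the function names are all hypothetical, standing in for a page you would normally download first.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical fragment standing in for a downloaded social media page.
SAMPLE_HTML = """
<div class="post"><span class="author">alice</span><p class="text">Loving my new job!</p></div>
<div class="post"><span class="author">bob</span><p class="text">Interview went well.</p></div>
"""


class PostParser(HTMLParser):
    """Collect one {'author', 'text'} row per <div class="post">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # field whose text we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "post":
            # A new post starts: open a fresh, empty row.
            self.rows.append({"author": "", "text": ""})
        elif cls in ("author", "text") and self.rows:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside an author or text element.
        if self._field:
            self.rows[-1][self._field] += data

    def handle_endtag(self, tag):
        if tag in ("span", "p"):
            self._field = None


def scrape_posts(html):
    """Turn unstructured HTML into a list of row dictionaries."""
    parser = PostParser()
    parser.feed(html)
    parser.close()
    return [{key: value.strip() for key, value in row.items()} for row in parser.rows]


def to_csv(rows):
    """Serialize the rows as CSV text, ready for R, SPSS, or a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["author", "text"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = scrape_posts(SAMPLE_HTML)
print(to_csv(rows))
```

Real pages are messier than this fragment, which is why dedicated packages such as scrapy (Python) or rvest (R) are usually the better choice. The underlying step is the same, though: locate the repeated elements and emit one row per element.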

This year, I’ll be presenting three workshops on web scraping at three different venues, although these presentations will use R rather than Python. Each presentation is somewhat different in focus and learning objectives, so feel free to attend all three!

  1. April 28, 2017: Automated conversion of social media into data: Demonstration and tutorial (3 hours)
    Part of the Friday seminar series at the 2017 Annual Conference of the Society for Industrial and Organizational Psychology (SIOP) in Orlando, FL.
  2. July 17, 2017: Web scraping and machine learning for employee recruitment and selection: A hands-on introduction (3.5 hours)
    A pre-conference workshop for the International Personnel Assessment Council (IPAC) annual conference in Birmingham, AL.
  3. August 3-6 (TBD), 2017: How to create a dataset from Twitter or Facebook: Theory and demonstration (1.8 hours)
    A skill-building session for the American Psychological Association (APA) annual conference in Washington, DC.

All three presentations will start with an explanation of data source theories, which are the key theoretical consideration affecting external validity when you try to identify high-quality sources of online information for research.

Additionally, the SIOP presentation will focus on instruction in R and rvest, mimicking the online tutorial I provided but with some extra information and a lot of hands-on examples.

The IPAC presentation will focus on the practicalities of web scraping, including discussion of the tradeoffs among various data sources when using web scraping for employee selection and recruitment, demonstration of both easy-to-use commercial scraping packages and the manual, R-based approach, and interactive discussion of use cases.

The APA presentation will be a hands-on walkthrough of accessing the Facebook and Twitter APIs, which lets you collect data with far less programming than you need when a site has no API!

With any of the three, you should be able to leave the workshop and curate a new internet-sourced dataset immediately!

I believe all three provide CE credit, but I’ll update this when I know for sure! See you in Orlando, Birmingham and Washington!

One Response
  1. Andrew
    February 22, 2017

    Makes me wish I was going to IPAC this year!
