Lucy Training: Pulling Twitter Data with Python (3-Session Workshop Sequence)

Presenter: Yang Xu

Twitter is a social media platform where people share news, attitudes, and immediate reactions to social events. Twitter offers researchers an API for collecting data. The API lets users define their own conditions for data retrieval, such as keywords, date range, whether a tweet is a retweet, and whether it replies to or mentions a certain user.

This 3-session workshop introduces a series of hands-on projects on how to collect data from Twitter, parse tweets, and apply basic NLP (Natural Language Processing) analysis.

Participants will learn:

  1. How to set up code, pull data through the Twitter API, and save data locally. (Feb. 9)
  2. How to wrangle and clean the raw data. (Feb. 16)
  3. How to apply NLP models, such as sentiment analysis. (Feb. 23)

Because each session builds on the previous ones, registering for all three sessions is recommended.

The workshop assumes working knowledge of Python and is open to advanced undergraduate students, graduate students, faculty, and staff. 

This workshop will be offered in person in Hesburgh Library and is limited to 15 participants. (Note: depending on COVID-19 cases and university policy, this workshop may be delivered virtually over Zoom.)

Register Now.  Registration closes one day before each session.

Workshop 1: Extracting Data with Twitter API
Wednesday, February 9, 2022, 3:30-5pm in Hesburgh Library Classroom 246

This session will examine pulling data from Twitter’s API, including:

  • Timeline from certain accounts
  • The tweets a certain user liked
  • Tweets containing a certain hashtag or multiple hashtags
  • Searching tweets by a set of conditions (time permitting)
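
As a rough preview of what searching by conditions can look like, the sketch below assembles a search query string and request URL without sending anything. It is an illustration only and not necessarily the approach used in the workshop: the helper names build_query and build_search_url are hypothetical, and the endpoint and operators (#hashtag, from:, to:, @user, -is:retweet, start_time, end_time) are assumed to follow Twitter API v2 search. An actual request would also need a bearer token, which is not shown.

```python
from urllib.parse import urlencode

# Assumption: the Twitter API v2 recent-search endpoint.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_query(hashtags=(), from_user=None, to_user=None,
                mention=None, include_retweets=True):
    """Combine retrieval conditions into one query string (hypothetical helper)."""
    parts = []
    if hashtags:
        tags = " OR ".join("#" + tag.lstrip("#") for tag in hashtags)
        # Parenthesize when several hashtags are ORed together.
        parts.append(f"({tags})" if len(hashtags) > 1 else tags)
    if from_user:
        parts.append(f"from:{from_user}")   # tweets posted by this account
    if to_user:
        parts.append(f"to:{to_user}")       # replies to this account
    if mention:
        parts.append(f"@{mention}")         # tweets mentioning this account
    if not include_retweets:
        parts.append("-is:retweet")         # exclude retweets
    return " ".join(parts)


def build_search_url(query, start_time=None, end_time=None, max_results=10):
    """Build the request URL; the date range goes in separate parameters."""
    params = {"query": query, "max_results": max_results}
    if start_time:
        params["start_time"] = start_time   # ISO 8601 timestamp string
    if end_time:
        params["end_time"] = end_time
    return SEARCH_URL + "?" + urlencode(params)


query = build_query(hashtags=["python", "nlp"], include_retweets=False)
url = build_search_url(query, start_time="2022-02-01T00:00:00Z")
print(query)
print(url)
```

Keeping query construction separate from the network call makes it easy to check the conditions before spending any API quota.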
     

Workshop 2: Parsing Twitter Data
Wednesday, February 16, 2022, 3:30-5pm in Hesburgh Library Classroom 125

This session will look at parsing raw data extracted from the Twitter API, including:

  • Working with JSON files
  • Working with different data types (e.g. string, dict, list)
  • Using conditions and loops in programming
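
To give a flavor of these topics, here is a minimal sketch that loads a JSON string, loops over the tweets, and uses a condition to skip retweets. The sample data is invented for illustration, and its field names ("data", "text", "public_metrics", "like_count") are an assumption loosely modeled on API responses; the function name parse_tweets is hypothetical.

```python
import json

# Invented sample payload; real API responses carry more fields.
raw = """
{"data": [
  {"id": "1", "text": "Learning #Python at the library",
   "public_metrics": {"like_count": 3}},
  {"id": "2", "text": "RT @someone: Great workshop!",
   "public_metrics": {"like_count": 0}},
  {"id": "3", "text": "Parsing JSON is fun #python #nlp",
   "public_metrics": {"like_count": 7}}
]}
"""


def parse_tweets(raw_json):
    """Turn a raw JSON string into a list of flat dicts, skipping retweets."""
    payload = json.loads(raw_json)            # str -> dict
    rows = []
    for tweet in payload.get("data", []):     # list of dicts
        text = tweet["text"]
        if text.startswith("RT @"):           # condition: drop retweets
            continue
        hashtags = [word.lstrip("#").lower()
                    for word in text.split() if word.startswith("#")]
        rows.append({
            "id": tweet["id"],
            "text": text,
            "hashtags": hashtags,
            "likes": tweet.get("public_metrics", {}).get("like_count", 0),
        })
    return rows


rows = parse_tweets(raw)
for row in rows:
    print(row["id"], row["hashtags"], row["likes"])
```

Using .get() with a default keeps the loop from failing when a tweet lacks an optional field.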
     

Workshop 3: Wrangling and Basic NLP with Twitter Data
Wednesday, February 23, 2022, 3:30-5pm in Hesburgh Library Classroom 246

This session will demonstrate wrangling and basic NLP techniques, including:

  • Creating new data columns
  • Descriptive statistics
  • Sentiment analysis
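
The sketch below illustrates the three ideas on a tiny invented dataset: it adds new fields to each tweet, computes simple descriptive statistics, and assigns a sentiment label. The word lists and the sentiment_score function are a toy stand-in written for this example, not the model used in the workshop; real analyses typically rely on an established library such as NLTK's VADER.

```python
from statistics import mean

# Toy lexicon, invented for illustration only.
POSITIVE = {"great", "love", "fun", "good"}
NEGATIVE = {"bad", "hate", "boring", "awful"}


def sentiment_score(text):
    """Count positive words minus negative words (toy scorer)."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_label(score):
    """Map a numeric score to a coarse label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample tweets.
tweets = [
    {"text": "I love this workshop, great examples!"},
    {"text": "The API rate limits are awful."},
    {"text": "Collecting tweets about the library."},
]

# Create new "columns" on each record.
for tweet in tweets:
    tweet["word_count"] = len(tweet["text"].split())
    tweet["sentiment"] = sentiment_score(tweet["text"])
    tweet["label"] = sentiment_label(tweet["sentiment"])

# Descriptive statistics over the new columns.
summary = {
    "n": len(tweets),
    "mean_word_count": mean(t["word_count"] for t in tweets),
    "mean_sentiment": mean(t["sentiment"] for t in tweets),
}
print(summary)
```

The same steps carry over to a DataFrame, where each dictionary key becomes a column.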
     

More details: https://github.com/Lucy-Family-Institute/CSSR-Workshop-Twitter