Lucy Training Workshop: Introduction to Web Scraping with Python

Presenter: Yang Xu

Explore the World of Web Scraping

Web scraping enables you to efficiently fetch and extract data from web pages, opening doors to information that might otherwise remain hidden or difficult to collect.

This workshop provides an introduction to web scraping using Python. Through a hands-on project, participants will learn how to use the Beautiful Soup Python package to scrape, parse and extract specific data from a webpage.

While the workshop is designed with beginners in mind, it also offers valuable learning opportunities for experienced Python users through a separate self-paced web-crawling project using Scrapy (a widely used Python web scraping framework).

What You’ll Learn

  1. Web scraping fundamentals.
  2. The art of HTML parsing and extracting data using Beautiful Soup.
  3. Precise data retrieval using regex.
  4. Efficient data extraction using Scrapy (advanced).
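
As a small taste of item 3, here is a minimal sketch of regex-based extraction. It is not workshop material: the HTML fragment, the "price" class name, and the extract_prices helper are all hypothetical, and it uses only Python's standard-library re module rather than Beautiful Soup or Scrapy.

```python
import re

# Hypothetical HTML fragment, made up for illustration only.
HTML = """
<ul>
  <li class="book"><span class="title">Dune</span> <span class="price">$9.99</span></li>
  <li class="book"><span class="title">Emma</span> <span class="price">$12.50</span></li>
</ul>
"""

# Capture the numeric part of a dollar amount inside a span whose class is "price".
PRICE_PATTERN = re.compile(r'<span class="price">\$(\d+\.\d{2})</span>')


def extract_prices(html):
    """Return every price matched in the HTML string, converted to float."""
    return [float(amount) for amount in PRICE_PATTERN.findall(html)]


if __name__ == "__main__":
    print(extract_prices(HTML))
```

A pattern like this only works when the markup is very regular; for real pages, a parser such as Beautiful Soup is usually the more robust first step, with regex applied to the text it returns.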

Who Should Attend

This workshop is open to Notre Dame students, faculty, and researchers aiming to enhance their data skills. Whether you’re a Python novice or an experienced programmer, there’s something for everyone.

Prior Knowledge

A basic understanding of Python will help you make the most of the workshop. However, if you have little to no Python experience, the instructor will cover the basics at the beginning of the workshop to ensure everyone can participate.

When and Where

This workshop will be offered in person in 246 Hesburgh Library on February 15, 2024, from 2:30 to 4:00 pm.

This workshop requires a minimum of 10 registrants. If fewer than 10 people have registered by the registration deadline (February 14), the workshop will be cancelled and registrants will be notified.

Register Now!