
Love Data Week: Where in the World Is My Data Set? Web Scraping for Curious People

Ever feel like the internet is one giant treasure map and your next dataset is hiding just out of sight? Grab your digital compass—this workshop will show you how to uncover data gems tucked inside everyday websites, demystifying web scraping tools and tackling static pages together. You’ll learn how to spot promising data sources, peel back the layers of HTML using R, and transform online chaos into clean, structured information you can actually use. Whether you’re chasing global trends or just curious about what’s possible, you’ll walk away with practical scraping skills—and the confidence to find the data the internet didn’t intend to hand you on a platter. Adventure awaits… bring your curiosity. 

By the end of this workshop, participants will be able to:   

  • Identify where and how data can be acquired from the web and evaluate whether a website is appropriate and technically feasible to scrape.  
  • Understand the basics of using R to scrape data from the web, and apply responsible and reproducible data acquisition practices when scraping.

To follow along during the live workshop, participants should have R and RStudio installed prior to the workshop. Instructions for downloading R and RStudio and installing packages are available.  

This is an online event open to all; please register to receive the link. Presentation materials and a recording will be shared with registrants after the workshop. If you have accommodation requests or questions, contact libevents@uwaterloo.ca.

Please note: This is the third of four workshops for Love Data Week. You do not need to attend all four; please register for each workshop separately.
 

Registration has closed.

Related LibGuide: Love Data Week Workshop Series

Date:
Wednesday, February 11, 2026
Time:
1:00pm - 2:00pm
Time Zone:
Eastern Time - US & Canada
Online:
This is an online event. Event URL will be sent via registration email.
Audience:
  Community Members, Faculty / Instructors, Graduate Students, Researchers, Undergraduate Students, Waterloo staff
Categories:
  Workshop  

Instructors

David Awosoga is a PhD candidate in statistics, supervised by Dr. Samuel WK Wong. David’s research portfolio includes projects across sports, economics and social network analysis. His current work is in credit assignment and sequential decision-making under uncertainty. David has technical experience in open-source package development, data engineering, machine learning and statistical consulting. His research interests are sports analytics, spatiotemporal data analysis and Bayesian modelling, and his pedagogical interests are in curriculum development, reproducibility, and teaching proficiency in DevOps tools and code workflows.

Victor Michael Malinowski is a PhD candidate in biostatistics, supervised by Dr. Glen McGee and Dr. Leilei Zeng. Victor’s current research involves developing statistical methodology for causal inference in life history processes under informative observation mechanisms. During his time as a master’s student, he investigated causal inference under interference with applications in agronomy. Victor also worked for eight months as a methodology intern in the Modern Statistical Methods and Data Science Branch at Statistics Canada, where he contributed to the General Social Survey and the Census of Agriculture.

Event Support 

Amy Lim is the research programs and services coordinator at the University of Waterloo Libraries. She helps coordinate workshops alongside our librarians. If you have questions about this event, please contact her at libevents@uwaterloo.ca.

Facilitator(s)

David Awosoga

Instructor

PhD candidate, Statistics

Victor Michael Malinowski

Instructor

PhD candidate, Biostatistics

Amy Lim

Event Coordinator

Research Programs and Services Coordinator