Legal and Ethical Issues in Webscraping


The application programming interfaces (APIs) of major social media platforms such as Twitter, Facebook, and Reddit are being closed or restricted, potentially cutting off access to a significant data source about online social interaction. Does this mean the end of research using social media data in the social sciences and humanities? Not necessarily! Researchers have persevered, continuing to acquire data from social media platforms using web scraping. However, such research can be controversial, and researchers using web scraping have occasionally faced backlash from both social media platforms and the public. There is, therefore, a need for researchers to understand how web scraping can be conducted legally and ethically in order to continue studying social media in a post-API era.

As a first step, this course introduces students to the legal and ethical issues involved with web scraping. First, we discuss key issues, including terms of service (ToS) violations, copyright, privacy, notice and consent, public vs. private data, and data sharing. Next, we discuss controversial cases of research using web scraping, the techniques they used, and why they were controversial. Finally, we invite discussion about what legal and ethical web scraping might look like, and how students can apply these lessons to their own projects.

This is a beginner-level event, and no previous knowledge of the method is required.    

The training will take place via Teams. You will receive a link to join ahead of the training.

If you’re new to this training event format, or to CDCS training events in general, read more on what to expect from CDCS training. That page also gives details of our cancellation and no-show policy, which applies to this event.

This training is connected to the Scraping Websites with R training taking place on the 9th and 12th of October.



