Legal and Ethical Issues in Webscraping

14 Oct 2024, 14:00 – 16:00

In Person

The application programming interfaces (APIs) of major social media platforms such as Twitter, Facebook, and Reddit are being closed or restricted, potentially cutting off access to a significant data source about online social interaction. Does this mean the end of research using social media data in the social sciences and humanities? Not necessarily! Researchers have persevered, continuing to acquire data from social media platforms using web scraping. However, such research can be controversial, and researchers using web scraping have occasionally faced backlash from both social media platforms and the public. There is, therefore, a need for researchers to understand how web scraping can be conducted legally and ethically in order to continue studying social media in a post-API era. As a first step, this course introduces students to the legal and ethical issues involved with web scraping.

First, we discuss key issues, including terms of service (ToS) violations, copyright, privacy, notice and consent, public vs. private data, and data sharing. Next, we examine controversial cases of research using web scraping, the techniques the researchers used, and why the work proved controversial. Finally, we invite discussion about what legal and ethical web scraping might look like, and how students can apply these lessons to their own projects.

This is a beginner-level event, and no previous knowledge of the method is required.

Those who have registered to take part will receive an email with full details on how to get ready for this workshop.

This workshop will be taught by Jessica Witte.

If you’re new to this training event format, or to CDCS training events in general, read more on what to expect from CDCS training, where you will also find details of our cancellation and no-show policy, which applies to this event.

This training is connected to the Scraping Websites with R training taking place on the 21st of October.

Room 4.35, Edinburgh Futures Institute

This room is on Level 4, in the North East side of the building.

When you enter via the Level 2 East entrance on Middle Meadow Walk, the room will be on Level 4, straight ahead.

When you enter via the Level 2 North entrance on Lauriston Place, underneath the clock tower, the room will be on Level 4, to your left.

When you enter via the Level 0 South entrance on Porters Walk (opposite Tribe Yoga), the room will be on Level 4, to your right.