Silent Disco: Understanding and Creating Word Embeddings


 

Online 

Our ‘Silent Disco’ workshops are based on tutorials from the Programming Historian website. This training event follows the tutorial ‘Understanding and Creating Word Embeddings’.

Word embeddings allow you to analyse how different terms are used in a collection of texts by capturing information about their contextual usage. Through a primarily theoretical lens, this session will teach you how to prepare a corpus and train a word embedding model. You will explore how word vectors work, how to interpret them, and how to answer humanities research questions using them. 
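As a rough illustration of what "capturing contextual usage" means, the sketch below builds simple count-based word vectors from a toy corpus and compares them with cosine similarity. This is a deliberately simplified stand-in, not the tutorial's method: trained embedding models learn dense vectors rather than raw counts. The corpus and the function names `cooccurrence_vectors` and `cosine` are invented for this example.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration: each sentence is a list of tokens.
corpus = [
    "the king rules the land".split(),
    "the queen rules the land".split(),
    "the farmer ploughs the field".split(),
]

def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words that appear within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = cooccurrence_vectors(corpus)
sim_king_queen = cosine(vectors["king"], vectors["queen"])
sim_king_farmer = cosine(vectors["king"], vectors["farmer"])
print(f"king/queen:  {sim_king_queen:.3f}")
print(f"king/farmer: {sim_king_farmer:.3f}")
```

In this toy corpus, "king" and "queen" occur in identical contexts, so their vectors are more similar to each other than either is to "farmer". Real corpora are far larger and noisier, which is why corpus preparation matters.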

The workshop will take place via Microsoft Teams in a ‘Silent Disco’ format. Participants will have the flexibility to choose whether to follow the Python or R script and work through the tutorial at their own pace. The facilitators will be available via Teams Chat to reply to any questions that arise during the workshop, and to help with installation, troubleshooting, or other issues. 

To attend this course, you will have to join the associated Microsoft Teams group. The link to join the group will be sent to attendees before the course start date, so please make sure to join the group in advance.

 

This Silent Disco will be facilitated by Somya Iqbal and Aybuke Atalay.

 

After taking part in this event, you may decide that you need some further help in applying what you have learnt to your research. If so, you can book a Data Surgery meeting with one of our training fellows. 

More details about Data Surgeries. 

Those who have registered to take part will receive an email with full details on how to get ready for this course. 

If you’re new to this training event format, or to CDCS training events in general, read more on what to expect from CDCS training. Here you will also find details of our cancellation and no-show policy, which applies to this event. 

 

Level  

This workshop requires the following prior knowledge and setup:

  • Basic familiarity with Python and Jupyter Notebooks in Google Colab, or basic familiarity with R/RMarkdown in RStudio

  • A local IDE setup on your PC (IDLE, Spyder, or Jupyter Notebooks)

 

Learning Outcomes 

  • What word embedding models and word vectors are, and what kinds of questions we can answer with them

  • How to create and interrogate word vectors using Python/R

  • What to consider when putting together the corpus you want to analyse using word vectors

  • The limitations of word vectors as a methodology for answering common research questions

 

Skills  

  • Application of word vectors on text data

  • An ability to interpret word vectors from model outputs

  • An ability to extend the embedding concepts to your own work, drawing on the packages and learning resources used in the session

 


 

 
