Modelling Unstructured Data with BERT

 

In person 

In this course, we will cover the basics of topic modelling and how to use Python to build, evaluate, and analyse BERT topic models. 

Topic modelling is a powerful tool for uncovering latent semantic structures in large collections of text, providing insight into their underlying themes and trends. The course begins with different approaches to data collection and ingestion, then covers techniques for preparing data for analysis, including cleaning and pre-processing. We will then build and evaluate BERT topic models in Python and look at advanced techniques for improving their results. Finally, we will focus on analysing and interpreting the output, including visualising the results in Python.

We will also discuss real-world applications of topic modelling.
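
As a rough illustration of the cleaning and pre-processing step mentioned above, the sketch below lightly tidies raw text before it is passed to a topic model. It is not taken from the course materials: the helper names `clean_document` and `prepare_corpus` and the `min_words` threshold are hypothetical, and the right amount of cleaning will depend on your data.

```python
import html
import re

URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"<[^>]+>")
SPACE_RE = re.compile(r"\s+")


def clean_document(text: str) -> str:
    """Lightly clean one raw document (hypothetical helper)."""
    text = TAG_RE.sub(" ", text)   # drop HTML tags
    text = html.unescape(text)     # decode entities such as &amp;
    text = URL_RE.sub(" ", text)   # drop URLs
    # Collapse runs of whitespace and trim the ends.
    return SPACE_RE.sub(" ", text).strip()


def prepare_corpus(raw_docs, min_words=3):
    """Clean every document and drop those shorter than min_words words.

    min_words is an assumed threshold, chosen only for illustration.
    """
    cleaned = (clean_document(doc) for doc in raw_docs)
    return [doc for doc in cleaned if len(doc.split()) >= min_words]
```

For example, `prepare_corpus(["<p>hi</p>", "one two three four"])` keeps only the second document, because the first has fewer than three words after cleaning.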

  

This course will be taught by Aybuke Atalay and Joy Lan.

  

After taking part in this event, you may decide that you need some further help in applying what you have learnt to your research. If so, you can book a Data Surgery meeting with one of our training fellows. 

More details about Data Surgeries. 

Those who have registered to take part will receive an email with full details on how to get ready for this course. 

If you’re new to this training event format, or to CDCS training events in general, read more on what to expect from CDCS training. Here you will also find details of our cancellation and no-show policy, which applies to this event. 

  

Level  

This workshop requires the following prior knowledge:

  • Familiarity with working with Python through notebooks/Google Colab
  • Familiarity with handling and wrangling data and applying functions
  • Familiarity with NLP and text analysis with Python 

  

Learning Outcomes 

  • Understand the basics of topic modelling
  • Familiarise yourself with the steps needed to set up unstructured datasets to perform topic modelling
  • Explore what topic modelling can and cannot do when applied to real-world data
  • Familiarise yourself with Python packages and functions (BERTopic) for topic modelling, and understand how different approaches can generate very different results
  • Interpret the results of the different analyses (BERTopic)

 

Skills  

By attending this course, you will develop the following skills:

  • Analyse text using BERT topic models in Python
  • Train and tune BERT topic models to improve their results
  • Interpret BERT topic model outputs 
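
To give a sense of what the skills above look like in practice, here is a minimal sketch using the BERTopic package. It is an illustration under stated assumptions, not the course's own code: the function names `fit_topic_model` and `top_words_per_topic` are hypothetical, and the `min_topic_size` value is an arbitrary choice.

```python
def fit_topic_model(docs, min_topic_size=10):
    """Fit a BERTopic model on a list of strings (hypothetical helper).

    Returns the fitted model and the topic id assigned to each document.
    """
    # Imported here so this file still loads when bertopic is not installed.
    from bertopic import BERTopic

    topic_model = BERTopic(language="english", min_topic_size=min_topic_size)
    topics, _probs = topic_model.fit_transform(docs)
    return topic_model, topics


def top_words_per_topic(topic_model, n_words=5):
    """Map each topic id to its highest-weighted words, skipping outliers."""
    info = topic_model.get_topic_info()  # table with a "Topic" column
    summary = {}
    for topic_id in info["Topic"]:
        if topic_id == -1:  # BERTopic uses -1 for documents left unassigned
            continue
        words = topic_model.get_topic(topic_id)  # list of (word, score) pairs
        summary[int(topic_id)] = [word for word, _score in words[:n_words]]
    return summary
```

Running `fit_topic_model` requires the `bertopic` package, which fetches a pre-trained embedding model on first use, and it generally needs a corpus of at least a few hundred documents to produce meaningful topics. A fitted model also offers plotting methods such as `visualize_topics()` for exploring the results.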

 


Room 4.35, Edinburgh Futures Institute

This room is on Level 4, in the North East side of the building.

When you enter via the level 2 East entrance on Middle Meadow Walk, the room will be on the 4th floor straight ahead.

When you enter via the level 2 North entrance on Lauriston Place underneath the clock tower, the room will be on the 4th floor to your left.

When you enter via the level 0 South entrance on Porters Walk (opposite Tribe Yoga), the room will be on the 4th floor to your right.
