JSALT 2019 Montréal: Dive into Deep Learning for Natural Language Processing
Time: Friday, June 14, 2019
Location: École de technologie supérieure in Montréal, Canada
Presenters: Leonard Lausen, Haibin Lin
Abstract
Deep learning has rapidly emerged as the most prevalent approach for training predictive models for large-scale machine learning problems. Advances in neural networks also push the limits of available hardware, requiring specialized frameworks optimized for GPUs and distributed cloud-based training. Moreover, especially in natural language processing (NLP), models contain a variety of moving parts: character-based encoders, pre-trained word embeddings, long short-term memory (LSTM) cells, transformer layers, and beam search for decoding sequential outputs, among others.
This introductory, hands-on tutorial walks you through the fundamentals of machine learning and deep learning with a focus on NLP. We start with a crash course on deep learning with Gluon, covering data, automatic differentiation, and various model architectures such as convolutional, recurrent, and attentional neural networks. Then, we dive into how context-free and contextual representations help various NLP domains. Throughout the tutorial, we begin with the basic classification problem and show how it can be restructured to solve various NLP problems such as sentiment analysis, question answering, and natural language generation.
Materials for the tutorial can be found in the JSALT19-GluonNLP repository.
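To give a flavor of the basic classification setup mentioned above, here is a minimal sketch of sentiment analysis with context-free word embeddings. It is written in plain Python rather than Gluon so that it runs without extra dependencies, and it is not taken from the tutorial notebooks. The embedding table, weights, and the names `EMBEDDINGS`, `WEIGHTS`, and `classify` are all hypothetical, hand-picked values for illustration; in practice the embeddings would be pre-trained and the classifier weights learned.

```python
import math

# Hypothetical 2-dimensional word embeddings (real ones are pre-trained or learned).
EMBEDDINGS = {
    "great": [1.0, 0.2],
    "good": [0.8, 0.1],
    "bad": [-0.9, 0.3],
    "awful": [-1.0, 0.1],
    "movie": [0.0, 0.5],
}

# Hypothetical linear classifier: one weight row per class.
LABELS = ["negative", "positive"]
WEIGHTS = [[-2.0, 0.0], [2.0, 0.0]]
BIASES = [0.0, 0.0]


def average_embedding(tokens):
    """Average the embeddings of all known tokens (a context-free representation)."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def softmax(scores):
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sentence):
    """Return the predicted label and the class probabilities for a sentence."""
    features = average_embedding(sentence.lower().split())
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(WEIGHTS, BIASES)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


if __name__ == "__main__":
    for text in ["great movie", "awful movie"]:
        label, probs = classify(text)
        print(text, "->", label, [round(p, 3) for p in probs])
```

The same pattern (represent the input as a vector, score each class, normalize with softmax) carries over when the averaging step is replaced by a convolutional, recurrent, or attention-based encoder.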
Have a question? You can reach us by emailing mxnet-science-info at amazon.com.
Agenda
Time | Title | Notebooks |
---|---|---|
8:30-9:00 | Continental Breakfast | |
9:00-9:45 | Introduction and Setup | NDArray, Autograd |
9:45-10:30 | Neural Networks 101 | MLP |
10:30-10:45 | Break | |
10:45-11:15 | Machine Learning Basics | Underfit & Overfit |
11:15-11:45 | Context-free Representations for Language | Word Embedding, Sentiment Analysis with Embedding |
11:45-12:15 | Convolutional Neural Networks | Sentiment Analysis with CNN |
12:15-13:15 | Lunch Break | |
13:15-14:00 | Recurrent Neural Networks | Sentiment Analysis with RNN |
14:00-14:45 | Attention Mechanism and Transformer | Sentiment Analysis with Attention |
14:45-15:00 | Coffee Break | |
15:00-16:15 | Contextual Representations for Language | Question Answering with BERT |
16:15-17:00 | Sequence Sampling | Sequence Generation with GPT-2 |