Teaching

Materials for courses I taught.

Teaching Assistant of Deep Learning: Models and Optimization

ENSAE Paris, Palaiseau, France
Teacher: Marco Cuturi

Introductory course on deep learning in practice, with practical sessions in PyTorch.

Program

  • Elementary blocks from signal processing and statistics: spatial and temporal convolutions, activation functions, compositions
  • Automatic differentiation: gradients, Jacobians
  • Review of a few well-known networks for vision applications: AlexNet, ResNet, …
  • Stochastic optimization of parameters for non-convex problems (RMSprop, Adam, etc.)
  • Theory: convex models for simple two-layer perceptrons; network structure optimization
  • Recurrent networks and the vanishing gradient problem, LSTM, memory and attention mechanisms
  • Deep networks in action: GANs and VAEs
  • Applications to structured data: graph neural networks
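As an illustration of the stochastic-optimization topic above, here is a minimal sketch of the Adam update rule in plain Python (a toy scalar objective; the function name and hyperparameters are chosen for the example, not taken from the course material):

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m, v: running first- and second-moment estimates; t: step count (1-based).
    """
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)  # bias correction for the first moment
    v_hat = v / (1 - b2 ** t)  # bias correction for the second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.1)
```

After 2000 steps the parameter has converged close to the minimizer at 0; the bias-correction terms matter mostly in the first few iterations, when the moment estimates are still dominated by their zero initialization.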

Winter 2023

Teaching Assistant of NLP

ENSAE Paris, Palaiseau, France
Teacher: Pierre Colombo

Introductory course on NLP, with practical sessions in PyTorch and Hugging Face.

Program

Courses:

  • The Basics of NLP: This session introduces what NLP is, why it is challenging, and how to approach an NLP problem.
  • Representing text as vectors: This lecture covers representation-learning techniques for Natural Language Processing.
  • Deep Learning Methods for NLP: This lecture presents deep learning techniques used in NLP, covering design and training principles. We present the multi-layer perceptron (MLP), recurrent architectures (RNN and LSTM), and the Transformer architecture.
  • Language Modeling: This lecture introduces language modeling.
  • Sequence Labeling & Classification: Sequence labeling and classification with deep learning models such as RNNs and Transformers.
  • Sequence Generation: Sequence Generation using Encoder-Decoders.
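As a toy illustration of the language-modeling lecture, a count-based bigram model can be sketched in a few lines of plain Python (the corpus is made up for the example):

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()

# Count bigrams, and how often each word occurs as a left context.
bigrams = Counter(zip(corpus, corpus[1:]))
contexts = Counter(corpus[:-1])

def bigram_prob(w_prev, w):
    """Maximum-likelihood estimate P(w | w_prev) = count(w_prev, w) / count(w_prev)."""
    if contexts[w_prev] == 0:
        return 0.0
    return bigrams[(w_prev, w)] / contexts[w_prev]

p = bigram_prob("the", "cat")  # "the" is followed by "cat" in 2 of its 3 occurrences as a context
```

Real language models replace these raw counts with smoothed estimates or neural networks, but the object being estimated, the conditional distribution of the next token, is the same.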

Labs:

  • Introduction to textual data with Python: This lab introduces basic processing operations required for any NLP experiment. After introducing preprocessing tools for data cleaning and tokenization, we compute descriptive statistics on textual data.
  • Word Embeddings and their evaluation: This lab explores representation-learning techniques for words and documents. It explores models such as tf-idf and Word2vec and develops quantitative and qualitative evaluation methods.
  • Sequence Labeling and Sequence Classification with Deep Learning Models: This lab implements, trains, and evaluates sequence classification and labeling models based on recurrent neural networks and the Transformer architecture.
  • Machine Translation: This lab applies sequence-generation models (encoder-decoders) to machine translation.
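The tf-idf baseline from the word-embedding lab can be sketched in plain Python (toy documents; the standard logarithmic inverse-document-frequency weighting, no library assumed):

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "dogs and cats are pets".split(),
]

# Document frequency: in how many documents does each term appear?
df = Counter()
for doc in docs:
    df.update(set(doc))

n_docs = len(docs)

def tfidf(doc):
    """tf-idf weights for one document: tf(t) * log(N / df(t))."""
    tf = Counter(doc)
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf}

weights = tfidf(docs[0])
```

The weighting downplays terms that occur everywhere ("the") and boosts terms specific to a document ("mat"), which is the intuition the lab's quantitative evaluation builds on.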

Autumn 2022

Teaching Assistant of Statistics 1

ENSAE Paris, Palaiseau, France
Teacher: Arnak S. Dalalyan

This course presents the theoretical foundations of statistical modelling, essentially in a parametric framework. Preference is given to the inferential approach, and we deal primarily with parameter-estimation methods and their properties, particularly in terms of optimality (asymptotic or non-asymptotic). The theory of hypothesis testing is also examined.

Program

  1. General principles: The aims of statistics, the various approaches (inferential, Bayesian). Types of statistical models (parametric, semi- and non-parametric). Sampling, information carried by a sample (Fisher, Kullback), statistics (sufficient, ancillary), exponential families.
  2. Estimation: The estimation problem. Decision-theoretic approach: admissibility. Eliminating estimation bias: optimality, Cramér-Rao bound, efficiency. Asymptotic estimation: maximum likelihood, method of moments, asymptotic efficiency. Bayesian estimation: Bayes’ formula, Bayes estimator, subjective and objective approaches.
  3. Hypothesis testing: Neyman-Pearson approach (confidence region, power, level, risks). Simple tests, Neyman-Pearson lemma. Student’s t-test. Asymptotic tests (Wald, score, likelihood ratio). Goodness-of-fit tests (chi-squared, Kolmogorov-Smirnov).
  4. Resampling techniques: Bootstrap, permutation tests.
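The resampling item can be illustrated with a minimal nonparametric percentile bootstrap for the mean in plain Python (toy data; fixed seed for reproducibility):

```python
import random
import statistics

random.seed(0)

data = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2, 2.7, 2.6]

# Draw B bootstrap resamples (with replacement) and record the mean of each.
B = 2000
boot_means = sorted(
    statistics.mean(random.choices(data, k=len(data))) for _ in range(B)
)

# 95% percentile confidence interval for the mean.
ci_low = boot_means[int(0.025 * B)]
ci_high = boot_means[int(0.975 * B)]
```

The interval is built from the empirical distribution of the resampled means rather than from a parametric formula, which is precisely what makes the bootstrap useful when the sampling distribution of the estimator is hard to derive.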