NUMEV seminar #7 : « An overview of methods to handle missing values » – Julie Josse, Inria

The next NUMEV seminar will take place on Friday, February 3rd at 11am in Amphi Moreau, Campus Saint Priest.

When: 11am on Friday, February 3rd, 2023, followed by lunch

Where: Amphi Moreau, Bât. 2 (LMGC), Campus Saint Priest, 860 rue de Saint Priest, Montpellier

Open to all researchers from all disciplines. Registration is free but mandatory.

Abstract

The problem of missing values has existed since the earliest attempts to exploit data as a source of knowledge, as it is intrinsic to the process of collecting, recording, and preparing the data itself. It is all the more unavoidable now that vast amounts of data are collected from different sources: « One of the ironies of Big Data is that missing data plays an increasingly important role. » There is a vast literature on this topic, and a recent survey even identified more than 150 different implementations.

In this presentation, I will share my experience on the topic. I will start with the inferential framework and then show how missing values create additional challenges for supervised learning, as traditional machine learning algorithms cannot handle incomplete data. Finally, I will illustrate the impact of the methods developed in the causal inference field to estimate treatment effects from clinical data.

The speaker

After being a professor of statistics at Ecole Polytechnique and a visiting researcher at Google Brain, Julie Josse joined Inria in September 2020 and created the Premedical (personalized medicine by data integration and causal learning) Inria-Inserm team.

Her research focuses on handling missing data, causal inference, visualization, and nonparametric analysis of complex data structures. Her fields of application mainly include the biological sciences and public health, and she has recently focused on estimating treatment effects from observational (e.g. clinical) data as well as from randomized controlled trials (RCTs).

She works with the Traumabase group at the Paris hospital to help emergency physicians make decisions. Julie is committed to reproducible research with the statistical software R and has developed packages, including FactoMineR and missMDA, to transfer her work. She is also involved in Rforwards to broaden the participation of under-represented groups in the community.