When features go missing, Bayes comes to the rescue

In the midst of all the mess, one of the great things to grow out of the ongoing pandemic is the online conference. It has made it possible for so many more people to attend and learn from international conferences, in a way that doesn't break the bank or the work week!

In that spirit, the 2020 Global PyData conference has gone entirely online and consists of pre-recorded talks from speakers, together with Q&A sessions that are aligned to a variety of time zones. I was fortunate to have part of my recent work at Tripadvisor accepted at the conference as a 30-minute talk. Preparing and recording an online talk was a totally new experience for me, and I have learnt a lot of new presentation (and environment setup) skills in the process.

In this talk, I make the case for thinking carefully about missing values in any real-world machine learning problem, especially those that involve tree-based models. I then describe the imputation approach that we used while designing a recommendation system at Tripadvisor, and discuss how it can be thought of as a simplified Bayesian inference recipe in a probabilistic graph.
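
To give a flavour of the idea, here is a minimal sketch of Bayesian-style imputation for a single missing categorical feature. It is not the method from the talk (that is in the slides), just a generic illustration: the function name `posterior`, the feature names `city` and `price_tier`, and the toy rows are all made up for this example. It estimates a prior and a likelihood from the complete rows, then applies Bayes' rule to get a distribution over the missing value given one observed feature.

```python
from collections import Counter


def posterior(rows, observed_key, target_key, observed_value):
    """Return P(target | observed = observed_value) estimated from complete rows.

    Hypothetical helper for illustration only: a two-node graph
    (target -> observed) with probabilities estimated by counting.
    """
    # Keep only rows where both features are present.
    complete = [
        (r[observed_key], r[target_key])
        for r in rows
        if r.get(observed_key) is not None and r.get(target_key) is not None
    ]
    total = len(complete)
    target_counts = Counter(t for _, t in complete)
    joint_counts = Counter(complete)

    # Bayes' rule: posterior is proportional to prior * likelihood.
    unnormalised = {}
    for t, n_t in target_counts.items():
        prior = n_t / total                                  # P(target = t)
        likelihood = joint_counts[(observed_value, t)] / n_t  # P(observed | target = t)
        unnormalised[t] = prior * likelihood

    z = sum(unnormalised.values())
    if z == 0:
        # Observed value never seen with any target: fall back to the prior.
        return {t: n / total for t, n in target_counts.items()}
    return {t: p / z for t, p in unnormalised.items()}


# Toy data (made up); the last row has a missing price_tier.
rows = [
    {"city": "Paris", "price_tier": "high"},
    {"city": "Paris", "price_tier": "high"},
    {"city": "Paris", "price_tier": "low"},
    {"city": "Lyon", "price_tier": "low"},
    {"city": "Lyon", "price_tier": "low"},
    {"city": "Paris", "price_tier": None},
]

paris_posterior = posterior(rows, "city", "price_tier", "Paris")
imputed_value = max(paris_posterior, key=paris_posterior.get)
```

Keeping the full distribution, rather than only the most likely value, leaves room to pass the uncertainty on to a downstream model instead of committing to a single guess.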

I am really excited about the talk and look forward to the feedback from the Q&A session. In the meantime, do browse the slides from the talk!