When features go missing, Bayes comes to the rescue

In the midst of all the mess, one of the great things to grow out of the ongoing pandemic is the online conference. It has made it possible for so many more people to attend and learn from international conferences, in a way that doesn't break the bank or the work week!

In that spirit, the 2020 Global PyData conference has gone entirely online and consists of pre-recorded talks from speakers together with Q&A sessions that are aligned to a variety of time zones. I was fortunate to have part of my recent work at Tripadvisor accepted at the conference as a 30-minute talk; preparing and recording an online talk was a totally new experience for me, and I think I have learnt a lot of new presentation (and environment setup) skills in the process.

In this talk, I go over the merit of thinking deeply about missing values in any real-world machine learning problem, especially those that involve tree-based models. I go on to talk about the imputation approach that we used while designing a recommendation system at Tripadvisor, and discuss how it can be thought of as a simplified Bayesian inference recipe in a probabilistic graph.
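To give a flavour of what "imputation as Bayesian inference" can mean, here is a toy sketch. It is not the approach from the talk or anything used at Tripadvisor: the two features (device, trip_type), the rows and the function name posterior_missing are all made up for illustration. It fills in a missing categorical value using Bayes' rule, P(trip_type | device) proportional to P(device | trip_type) * P(trip_type), with both terms estimated from the complete rows.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (device, trip_type); None marks a missing trip_type.
rows = [
    ("mobile", "leisure"),
    ("mobile", "leisure"),
    ("mobile", "business"),
    ("desktop", "business"),
    ("desktop", "business"),
    ("desktop", "leisure"),
    ("mobile", None),
]


def posterior_missing(rows, observed_device):
    """Estimate P(trip_type | device = observed_device) from the complete rows."""
    complete = [(d, t) for d, t in rows if t is not None]
    prior_counts = Counter(t for _, t in complete)   # counts behind P(trip_type)
    device_counts = defaultdict(Counter)             # counts behind P(device | trip_type)
    for d, t in complete:
        device_counts[t][d] += 1

    # Bayes' rule, up to a normalising constant: P(d | t) * P(t)
    unnormalised = {
        t: (device_counts[t][observed_device] / prior_counts[t])
        * (prior_counts[t] / len(complete))
        for t in prior_counts
    }
    # Assumes the observed device appears in at least one complete row.
    total = sum(unnormalised.values())
    return {t: p / total for t, p in unnormalised.items()}


post = posterior_missing(rows, "mobile")
imputed = max(post, key=post.get)  # most probable value for the missing cell
print(post, imputed)
```

Instead of taking the single most probable value, one could also keep the whole posterior and sample from it, which preserves the uncertainty about the missing cell.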

I am really excited about the talk and the feedback I am going to get during the Q&A session - do browse the slides from the talk in the meantime!

Using the Logitech Unifying Receiver on Linux

After many years of continuous use (including WFH for the last few months), I decided to retire my old keyboard and mouse and jump on the Logitech ergonomic bandwagon with the MX Vertical mouse and the Ergo K860 keyboard. Both these devices are part of a line-up that Logitech is currently promoting for multi-tasking between several computers - they can each pair with 3 different computers, and can use a single Unifying receiver to pair to the same computer. I don't work with several devices at once, but would definitely like to not have more than the necessary number of receivers plugged into my computer; unfortunately, Logitech's unifying software only works with Windows/Mac, so I had to do some browsing to figure out a Linux alternative.

There's a nifty tool designed by Peter Wu to get Logitech devices to work with Linux: ltunify. I tried it on Ubuntu 18.04 and it works perfectly out of the box to pair both my devices to the same computer using a single Unifying receiver. In addition, you may want to use Solaar to get a nice system tray icon that displays battery levels, settings, etc. for each of your paired devices. You can pair up to six Logitech devices to a computer using one Unifying receiver.

Open source neuroscience at Boston Python


Last week, my labmate Joseph Wachutka and I got an opportunity to talk to the Boston Python User Group about our efforts in developing open-source tools in Python for electrophysiology. Boston Python is an amazing mix of Python-lovers, both seasoned and novice, who are always ready to listen, ask incisive questions and offer great feedback! It was enlightening (and very encouraging!) to talk to people interested in pushing open source - many of the questions from the audience were right on target and brought forth the very involved hardware and software challenges we have hacked around in the last few years.

Despite being a central technique in systems neuroscience, electrophysiology has been dominated by expensive proprietary technologies. Throughout our PhD work, Joe and I have tried to develop open source tools (mostly using the Raspberry Pi, Linux and Python) that help us run experiments and record, clean and analyze data. Apart from reducing the costs of performing complex electrophysiological and optogenetic studies by an order of magnitude, we hope that these efforts will make our science easier to share and reproduce (and shouldn't that be the ultimate goal for all scientists anyway?).

Here are the slides for my talk at Boston Python. Please do read about our Raspberry Pi based hardware setup and our recent SciPy paper describing our HDF5-based data management and analysis pipeline.

String completion as you move to Python 3 (OR, how curly braces became important)

I am currently in the midst of (FINALLY!!) making the shift from trusted Python 2.7 to Python 3.x (while recovering from an extended holiday thanks to a somewhat lengthy visa renewal process). Making the shift itself is pretty simple, thanks to a nifty command-line tool that ships with Python itself (at least on Linux) called, very simply, 2to3.

It's always a good idea to keep backups of legacy code (of course!). To actually write the changes back to the files, pass the -w flag (2to3 then keeps a .bak copy of each original unless you also pass -n), so the command goes something like:

2to3 -w *.py
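
To give a rough idea of what the tool rewrites, here is a small sketch of typical before/after pairs. The snippet is made up for illustration (the variable names are hypothetical); the Python 2 originals are kept as comments because they no longer parse in Python 3. Note that 2to3 leaves %-style formatting alone.

```python
counts = {"a": 1, "b": 2}

# Python 2: print "total: %d" % sum(counts.values())
print("total: %d" % sum(counts.values()))  # print becomes a function

# Python 2: for key, value in counts.iteritems():
pairs = []
for key, value in counts.items():          # iteritems() becomes items()
    pairs.append((key, value))

# Python 2: labels = map(str, xrange(3))
labels = list(map(str, range(3)))          # xrange -> range, map wrapped in list()
```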

I am in the habit of using % as a placeholder for string completion in Python 2.x. That behavior doesn't go out of use entirely in Python 3.x, but using the str.format() method is recommended instead - see more details.
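
As a quick side-by-side, here is a minimal sketch of the same string built both ways. The variable names and values are made up for illustration.

```python
name, count, ratio = "spikes", 42, 0.12345

# Old style: % placeholders filled from a tuple
old_style = "%s: %d events (%.2f)" % (name, count, ratio)

# New style: {} placeholders filled by str.format(), by position...
new_style = "{}: {:d} events ({:.2f})".format(name, count, ratio)

# ...or by keyword, which reads better in longer templates
named = "{name}: {count:d} events ({ratio:.2f})".format(
    name=name, count=count, ratio=ratio
)

print(old_style)  # spikes: 42 events (0.12)
```

All three variables end up holding the same string.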

Most of my string completion code worked out of the box in Python 3 - I only realized that the norm had changed when I had curly braces "{}" in my strings. It turns out that curly braces have a special meaning in strings passed to str.format() - they mark the placeholders, the role that the % sign plays in the older style. The format goes something like this:

In [4]: "my string {:d}".format(1)
Out[4]: 'my string 1'

However, to print literal braces, each brace has to be doubled ("{{" prints "{" and "}}" prints "}"). So, for example:

In [5]: "my string {{{:d}}}".format(1)
Out[5]: 'my string {1}'

In brief, the braces outside the number placeholder {:d} have to be encapsulated in a second (or third, depending on how you look at it) set of braces :)
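
A few more cases, as a minimal sketch of how the doubling rule plays out (the LaTeX-style template is just an illustrative example of a string that is full of braces):

```python
# Doubled braces print as literal braces; {:d} is still a placeholder.
wrapped = "my string {{{:d}}}".format(1)           # 'my string {1}'

# A template with lots of literal braces, e.g. a LaTeX fraction.
fraction = r"\frac{{{:d}}}{{{:d}}}".format(1, 2)   # '\frac{1}{2}'

# The older % style does not treat braces specially at all.
percent = "my string {%d}" % 1                     # 'my string {1}'

# Forgetting to double a literal brace makes format() raise a ValueError.
try:
    "my string {:d}}".format(1)
    unbalanced_ok = True
except ValueError:
    unbalanced_ok = False

print(wrapped, fraction, percent, unbalanced_ok)
```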

Installing the latest version of Texlive (2016 right now) on Ubuntu without a headache

Texlive is the most convenient way to get up and running with LaTeX on Linux (specifically Ubuntu, in this case) systems. A simple sudo apt-get install texlive does the trick - however, there's an important catch: Debian versions of texlive traditionally lag behind. Ubuntu 14.04 and 16.04 ship with the 2013 and 2015 versions respectively. So, when I had to look into updating texlive on my 14.04 system, I had no recourse but to install 'vanilla' texlive straight from CTAN, and of course, risk the consequences!

Turns out that installing texlive from CTAN can be a complicated mess of steps that need to be done in exactly the right order - but I came across this supremely helpful GitHub repo from Scott Kostyshak that makes everything super simple. One simply needs to clone this repository and follow the steps in the readme, and things work smoothly (I've tested this on my Ubuntu 14.04 system).

Here are the steps:

git clone https://github.com/scottkosty/install-tl-ubuntu
cd install-tl-ubuntu
sudo ./install-tl-ubuntu

NOTE: If you have the Debian version of texlive already installed (2013 in the case of Ubuntu 14.04 - to check your texlive version, run 'tex --version' at the terminal), you will need to remove it first with sudo apt-get remove --purge 'texlive*' (the pattern is quoted so that the shell does not expand it against files in the current directory).