Jesse Krijthe
http://www.jessekrijthe.com/
Recent content on Jesse Krijthe. Generated by Hugo (gohugo.io), en-us. jkrijthe@gmail.com (Jesse Krijthe). Last build: Tue, 09 May 2017 17:09:13 +0000

Eurovision Winning Probabilities
http://www.jessekrijthe.com/articles/eurovision-betting-prob/
Tue, 09 May 2017 17:09:13 +0000, jkrijthe@gmail.com (Jesse Krijthe)
<p>This week marks the <a href="https://en.wikipedia.org/wiki/Eurovision_Song_Contest_2017">62nd edition of the Eurovision Song Contest</a>: an annual event where countries from across Europe and beyond (Australia is competing as well) come together to perform three-minute pop songs.</p>
<p><a href="https://www.kaggle.com/c/Eurovision2010">Predicting the outcome</a> of the contest poses an interesting statistical problem: the rules of the competition have been relatively stable over the years, so there is some data to base future predictions on, yet there is only a single contest every year, making it easy to overtrain a model on the limited data available.</p>
<p>Perhaps the most commonly reported predictions by the media are those implied by the odds set by bookmakers. In this note, I want to explore what probabilities the bookmakers’ odds correspond to for this year’s competition, as well as how well these probabilities predicted the winner in recent years.</p>
<p>Let’s start with the most important part, this year’s probabilities:</p>
<p><img src="http://www.jessekrijthe.com/articles/eurovision-betting-prob_files/figure-html/current-odds-1.png" width="576" /></p>
<p>Italy is the clear favourite. As we’ll see below, the markets are relatively confident in this year’s favourite actually winning, compared to the previous four years.</p>
<p>A short note on the methodology: I converted the decimal odds reported by the bookmakers to probabilities and then divided these by the total probability that each bookmaker assigned to all the countries combined. This total is bigger than one, which reflects the advantage the bookmakers have over their customers; dividing by it is a crude way to correct for that advantage. The probabilities reported here correspond to the median probability across all the bookmakers I had data for.</p>
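<p>As an illustration, the conversion described above can be sketched as follows. The odds and bookmakers here are made up for the example; the actual figures were computed from real bookmaker data over the full field of countries.</p>

```python
from statistics import median

def implied_probabilities(decimal_odds):
    """Convert one bookmaker's decimal odds to probabilities that sum to one.

    1/odds gives the raw implied probability; summed over all countries this
    exceeds one (the bookmaker's margin), so we divide by that total.
    """
    raw = {country: 1.0 / odds for country, odds in decimal_odds.items()}
    total = sum(raw.values())  # bigger than one when the whole field is covered
    return {country: p / total for country, p in raw.items()}

# Hypothetical odds from two bookmakers for an (abbreviated) field
bookmakers = [
    {"Italy": 1.8, "Portugal": 4.0, "Bulgaria": 5.0, "France": 7.0},
    {"Italy": 1.9, "Portugal": 4.2, "Bulgaria": 4.8, "France": 6.5},
]

per_bookmaker = [implied_probabilities(odds) for odds in bookmakers]

# Median probability across bookmakers, per country
median_probs = {
    country: median(probs[country] for probs in per_bookmaker)
    for country in bookmakers[0]
}
```

<p>With real data one would of course include all competing countries and many more bookmakers.</p>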
<p>These are the probabilities (close to) the Monday before the competition in previous years:</p>
<p><img src="http://www.jessekrijthe.com/articles/eurovision-betting-prob_files/figure-html/previous-odds-1.png" width="350px" /><img src="http://www.jessekrijthe.com/articles/eurovision-betting-prob_files/figure-html/previous-odds-2.png" width="350px" /><img src="http://www.jessekrijthe.com/articles/eurovision-betting-prob_files/figure-html/previous-odds-3.png" width="350px" /><img src="http://www.jessekrijthe.com/articles/eurovision-betting-prob_files/figure-html/previous-odds-4.png" width="350px" /></p>
<p>In 2014 the bookmakers were clearly off, although with a sample of only four years it is hard to say how much this should count against them. In the other three years, the winner was assigned a reasonably high probability. It is interesting how skewed towards Italy the probabilities are this year: in previous years, there was usually a second country with a reasonably high probability. Whether this reflects a clear preference by the European voters, we’ll have to see during the final this Saturday.</p>
<p><em>Update after the contest:</em> Portugal won, which, given the large probability the bookmakers placed on an Italy win, does little to increase confidence in the hypothesis that these probabilities are properly calibrated.</p>
Favourite Work at ICML 2015
http://www.jessekrijthe.com/articles/icml2015/
Wed, 05 Aug 2015 13:09:13 +0000, jkrijthe@gmail.com (Jesse Krijthe)
<p>This post is just to remind myself of some of my favourite posters/presentations that I saw while attending ICML. I have undoubtedly missed a lot of interesting stuff. If you have any particular suggestions, please let me know!</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/betancourt15.pdf">The Fundamental Incompatibility of Scalable Hamiltonian Monte Carlo and Naive Data Subsampling</a><br/>
<em>Michael Betancourt</em><br/>
I liked the topic and the kind of analysis and I especially liked his clear style of presentation. Moreover, there was quite a lively discussion about whether this incompatibility is actually a problem, or whether it focussed too much on only the bias that is introduced by naive subsampling.</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/salimans15.pdf">Markov Chain Monte Carlo and Variational Inference: Bridging the Gap</a><br/>
<em>Tim Salimans, Diederik Kingma, Max Welling</em><br/>
The presentation and poster were a bit hard for me to follow but the problem seems important.</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/lopez-paz15.pdf">Towards a Learning Theory of Cause-Effect Inference</a><br/>
<em>David Lopez-Paz, Krikamol Muandet, Bernhard Schölkopf, Iliya Tolstikhin</em><br/>
Interesting use of Maximum Mean Discrepancy in a clear analysis of an important problem.</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/blundell15.pdf">Weight Uncertainty in Neural Network</a><br/>
<em>Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra</em><br/>
I have not looked into how exactly their approach is different from previous attempts at incorporating weight uncertainty, but the updates for the weight parameters seemed surprisingly simple.</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/ramaswamy15.pdf">Convex Calibrated Surrogates for Hierarchical Classification</a><br/>
<em>Harish Ramaswamy, Ambuj Tewari, Shivani Agarwal</em><br/>
I like this idea of classification calibrated losses and this seems like an interesting extension to hierarchical loss functions.</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/narasimhana15.pdf">Optimizing Non-decomposable Performance Measures: A Tale of Two Classes</a><br/>
<em>Harikrishna Narasimhan, Purushottam Kar, Prateek Jain</em><br/>
The authors consider functions of the true positive rate and true negative rate and come up with two classes of such functions and an approach to maximize them. The one class includes measures like the G-mean and the H-mean, while the other class includes the F-measure and Jaccard coefficient.</p>
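<p>To make the first class of measures concrete: the G-mean and H-mean are simple functions of the true positive and true negative rates. A minimal sketch (the rates below are made up; the F-measure is omitted since it also depends on the class proportions):</p>

```python
import math

def g_mean(tpr, tnr):
    # Geometric mean of the true positive and true negative rates
    return math.sqrt(tpr * tnr)

def h_mean(tpr, tnr):
    # Harmonic mean of the two rates
    return 2 * tpr * tnr / (tpr + tnr)

# Example: a classifier with 80% TPR and 60% TNR
print(g_mean(0.8, 0.6))  # about 0.693
print(h_mean(0.8, 0.6))  # about 0.686
```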
<p><a href="http://jmlr.org/proceedings/papers/v37/jiao15.pdf">The Kendall and Mallows Kernels for Permutations</a><br/>
<em>Yunlong Jiao & Jean-Philippe Vert</em><br/>
The authors consider the problem of learning from permutations or rankings instead of vectors of real-valued numbers. In particular, they construct PSD kernels based on Kendall’s coefficient and the Mallows kernel in order to apply kernel methods to the problem.</p>
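<p>As a sketch of the first of these: the Kendall kernel between two rankings is the fraction of concordant minus discordant pairs, i.e. Kendall’s tau, which the paper shows is a positive-definite kernel on permutations. The rankings below are made up, and ties are not handled:</p>

```python
from itertools import combinations

def kendall_kernel(x, y):
    """Kendall tau between two rankings of the same n items:
    (concordant pairs - discordant pairs) / (n choose 2)."""
    n = len(x)
    assert len(y) == n
    total = n * (n - 1) / 2
    s = 0
    for i, j in combinations(range(n), 2):
        # +1 if the pair is ordered the same way in both rankings, -1 otherwise
        s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    return s / total

print(kendall_kernel([1, 2, 3], [1, 2, 3]))  # 1.0: identical rankings
print(kendall_kernel([1, 2, 3], [3, 2, 1]))  # -1.0: fully reversed
```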
<p><a href="http://arxiv.org/pdf/1501.05427v3">Enabling scalable stochastic gradient-based inference for Gaussian processes by employing the Unbiased LInear System SolvEr (ULISSE)</a><br/>
<em>Maurizio Filippone & Raphael Engler</em><br/>
This seems to tackle the important problem of exact quantification of uncertainty in covariance parameters for Gaussian processes, with seemingly few constraints on the number or type of covariance functions.</p>
<p><a href="http://jmlr.org/proceedings/papers/v37/hugginsb15.pdf">Risk and Regret of Hierarchical Bayesian Learners</a><br/>
<em>Jonathan H. Huggins & Joshua B. Tenenbaum</em><br/>
Again, an interesting analysis of an important problem, although it will take me some more time to study the actual result.</p>