Subway usage prediction

data-science / open-data

We were given a database of about a billion subway trips in Paris. The challenge was to predict how many people would enter each station at any time of day.


We used what we called contextual modelling to build an efficient prediction algorithm. The idea is that people don’t travel without a reason: contextual factors influence their behaviour. Our predictions are therefore based not only on past behaviour but also on contextual data such as weather, graph connectivity, or events.
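
To make the idea concrete, here is a minimal sketch of combining past behaviour with one contextual signal. It is not our actual model: the records, the function names (fit_baseline, fit_event_uplift, predict) and the choice of a simple multiplicative event factor are all assumptions made for illustration. The baseline is the average number of entries per station and hour on ordinary days, and the context adjustment is the average ratio observed on days with a nearby event.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (station, hour, event_nearby, entries).
# The numbers are invented for illustration, not taken from the real dataset.
HISTORY = [
    ("Stade de France", 19, False, 400),
    ("Stade de France", 19, False, 440),
    ("Stade de France", 19, True, 1260),
    ("Châtelet", 19, False, 3000),
    ("Châtelet", 19, False, 3200),
]


def fit_baseline(history):
    """Mean entries per (station, hour) on ordinary days (no event)."""
    groups = defaultdict(list)
    for station, hour, event, entries in history:
        if not event:
            groups[(station, hour)].append(entries)
    return {key: mean(values) for key, values in groups.items()}


def fit_event_uplift(history, baseline):
    """Average ratio of observed entries to baseline on event days, per station."""
    ratios = defaultdict(list)
    for station, hour, event, entries in history:
        if event and (station, hour) in baseline:
            ratios[station].append(entries / baseline[(station, hour)])
    return {station: mean(values) for station, values in ratios.items()}


def predict(station, hour, event, baseline, uplift):
    """Baseline from past behaviour, scaled by the event factor when one applies."""
    base = baseline[(station, hour)]
    # Stations with no recorded event days fall back to a neutral factor of 1.0.
    factor = uplift.get(station, 1.0) if event else 1.0
    return base * factor


baseline = fit_baseline(HISTORY)
uplift = fit_event_uplift(HISTORY, baseline)

ordinary = predict("Stade de France", 19, False, baseline, uplift)
match_day = predict("Stade de France", 19, True, baseline, uplift)
```

With this toy data, the match-day prediction for the station is a multiple of the ordinary-day one, while a station with no recorded event days keeps its baseline. Other contextual signals such as weather or graph connectivity would need their own features; a single multiplicative factor is only the simplest way to show the principle.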


In our experience, the more context we added to the model, the more accurate its predictions became. The final version of our model, which includes events, predicts station traffic accurately even when the usual trend is disrupted by an exceptional event such as a rugby match near the Stade de France.