The World Cup of Data

24 September 2014

This post was originally published in French on the Blog Aparté: you can read it here.

The time has come to look back at the 20th football World Cup: an event full of emotions, goals and athletic achievements, but also full of data and stats to better analyse and understand each team’s performance. More than ever, media organisations brought in teams of reporters to cover the event’s data. Here is a review of some of their work by Estelle Prusker-Deneuville, a doctoral student at the University Panthéon-Assas in Paris and head of media studies at SciencesCom, who specialises in datajournalism.

With data collected and performance analysed from past matches, as well as cameras and biometric sensors worn during matches, Big Data was the 12th man on the field for the 2014 World Cup winners. The Mannschaft relied on a team of German academics who spent two years assembling a database of possible future opponents. During the competition, they used game simulation and analysis software to exploit data collected from thousands of players. Data scientists and statisticians joined the staff of the German team and greatly contributed to its success.

The German national team may have gone the furthest in using data for game strategy, but the media, through data-driven reporting, offered many tools that let audiences access information in a completely new way. Before the start of the competition, the French newspaper Le Monde published a datavisualisation on its site that allowed readers to assess the relative strength of each European team. It drew on all historical data since 1930 from four major tournaments (including the World Cup, Euro Cup and UEFA Champions Cup).

The data highlighted the strength of the Spanish team on the European stage, and predicted an outstanding performance for Spain in Brazil. Also in the run-up to the World Cup, football site L’Equipe.fr let its readers relive events that marked the 50-year history of the World Cup through ‘Mundial Memories’, an editorial trip through the World Cup archives that pays homage to legendary games through text, graphics and photos.

Last March, L’Equipe and NUMA hosted a ‘Data+Foot’ hackathon, which produced many new ways to combine and examine World Cup data. The tools from this hackathon are now available on L’Equipe’s site. One project, ‘Myth or Mytho’, takes major football myths and either confirms or debunks them using information from L’Equipe’s and Opta’s databases.

Once the competition was underway, many media outlets reported statistics and analysis, allowing their readers to experience the games through a data lens. English-language websites such as The Huffington Post and The New York Times offered interactive infographics detailing all the passes and phases of play after each game.

Notably, Le Figaro stepped up its online data reporting with ‘Foot Center’ (created with Sport24), an interactive datavisualisation that charts World Cup history in real time and tracks comments about players on social networks.

Foot Center, initially available for French Premier League games, meets user demand for predictions. It calculates and publishes, almost instantaneously during games, the probability of each team qualifying and the likely outcomes of matches. Users could also view a team’s strengths and weaknesses and follow how well ‘liked’ French players were, through a quantitative and qualitative analysis of the feelings and intentions expressed in all tweets that mentioned ‘Les Bleus’.

Foot Center highlighted the ‘social’ dimension of the World Cup: each published stat came with the hashtag #InsideMundial for sharing on social networks.

A real strength of this data tool is that it integrated an advertiser from the start. Sony Mobile and its official World Cup smartphone were omnipresent in Foot Center’s navigation, with their own visual identity and dedicated features. This is a very encouraging partnership for datajournalism and its business models.

[...] The New York Times was particularly strong at data reporting for the World Cup. ‘The Clubs That Connect The World Cup’ is an analysis of the national teams’ performances through the lens of professional clubs and the resulting imbalances.

Another NYT innovation, ‘The Upshot’, offers an algorithmic selection of the paper’s best World Cup stories. Also noteworthy is the American site fivethirtyeight.com, which sought to use data to understand the reasons for Argentina’s defeat: ‘Was Lionel Messi Tired?’

Journalistic investigations, journeys through data and stories, tools for performance analysis and prediction: everything deployed by the media and offered to fans and players during this 20th World Cup was unprecedented. [...] Datajournalism is gaining ground in the newsroom.

Estelle Prusker-Deneuville

Translated by Laure Nouraout and Sarah Toporoff
