Stats Perform Pro Forum 2022
in Football on Football, Analytics
A while back I had the chance to present a piece of hobby work titled “Pressing Times: Can data tell us when and how to navigate out of a counter press?” with my old friend Zhi Yuan at the 2022 Stats Perform Pro Forum held in London. For unfamiliar readers, the Pro Forum is a football/soccer analytics forum organised annually by the sports data and analytics company Stats Perform (football fans may be more familiar with their “previous” name Opta, thanks to the signature tweeting style of their OptaJoe account. Unmistakable.).
0 - Stefan Savić didn't commit a single foul over both legs of Atletico Madrid's quarter-final tie against Manchester City. Unexpected. pic.twitter.com/HDcHFgiyK0— OptaJoe (@OptaJoe) April 13, 2022
As excellent coverage and write ups on this year’s event are already available, this blog post will be dedicated to some ramblings about my experience as a first time participant. If you aren’t in the mood for reading all that, you can view the presentation recording via below or slides of our talk right here.
📹 Zhi Yuan Chua & @dlareg49 applied tracking data to identify counter pressing scenarios at the 2022 #ProForum, with the aim of establishing which scenarios are optimal to play through a press.
Find out their key conclusions in the full presentation ➡️ https://t.co/trx8eXIzo2 pic.twitter.com/gp9NpZChR4— Stats Perform Pro (@OptaPro) July 28, 2022
The call for proposals for the 2022 Pro Forum was put out on 20 October 2021, giving participants just over a month to brainstorm and develop their proposals by the late November deadline. A perk introduced for this year’s competition was the on-request provision of 10 matches of anonymised tracking data. After several weeks of brainstorming and experimenting with my collaborators Zhi Yuan and Ashley, we submitted our proposal just a few days short of the deadline.
To our pleasant surprise, we received notification of our proposal’s selection near the last week of January. Given the relatively short working duration of just around 2 months and our full-time work schedules, it was key for us to set reasonable expectations for what would be feasible to accomplish within this time frame. It was really useful, at least for myself, to plan and target to accomplish one task per week e.g. 1 week for gathering and formatting the raw data; 1 week for modifying the off-the-shelf pitch control model and setting up the data computation pipeline.
Data wrangling – first time?
About a month into our project we were introduced to our assigned mentor Dafydd Steele, who works at Liverpool FC as a Statistical Researcher. Apart from being a really helpful sounding board for our analysis and ideas, Dafydd was a real pleasure to chat with and I appreciated him sharing a bit about the work he does at Liverpool + exchanging general football banter and slander about the teams Zhi Yuan and I support. In addition to Dafydd, we also received feedback on our presentation from Andy Cooper (the Pro Forum liaison/organiser himself) and several members of the Stats Perform analytics team a week away from the forum.
To the event itself – my day at the forum predictably began with me awkwardly standing about, trying not to sip my coffee too quickly and glancing around to see if there were any faces or names I could recognise. Thankfully that didn’t last too long and the first person I chatted with was Mark Thompson who publishes the Get Goalside analytics newsletter (referenced twice above). Soon after, Zhi Yuan arrived and Dafydd kindly introduced several of his Liverpool colleagues to us. The forum proper kicked off with host Ben Mackriell, OptaPro head, taking us back to the state of Opta data in 2014 (when the first Pro Forum was held), and demonstrating how it’s evolved since by comparing the number of event data “qualifiers” they use to encode the same sequence of play now vs then. He then reflected on the role of data in football in 2022, advocating for “Football First Analytics” while also recognising that effective application has been and will remain a main challenge despite even with continued data improvements moving forward.
Not surprised if Liverpool players are actually live-fed these numbers
The rest of the morning session flew by with intriguing talks on the other selected presentations and a case study sharing by the Stats Perform team on the thrilling title race of the 2020-21 Ligue 1 season. Midway through I joked with Zhi Yuan about the contrast in how engaged I was in the talks here, versus a talk at the last conference we attended together several years back, which was analytics-based but on the topic of precious metal prices instead. The session concluded with a Q&A with Tranmere Rovers’ manager Micky Mellon which I thought was a neat change of pace. My biggest takeaway from that was perhaps his observation of the current playing generation’s receptiveness towards data and quantitative benchmarks for assessing their own performance and growth.
Way overperforming your xG always helps
Lunch was spent (besides eating) looking through the atrium exhibits which included the two projects that were selected to be presented as posters. One which looked at quantifying counter-attacking contributions was originally from last year’s forum which was held virtually, whereas the other showcased a novel approach to quantify the physical intensity of competitive match play. I also finally plucked up the courage – after being relentlessly bugged by a friend over the past month – to ask Dafydd if there was any chance of Mbappe coming to Liverpool in the summer – he gave it a good chuckle. Shortly after getting mic-ed up the forum resumed with the noon session which began with two virtual presentations and then finally, our turn to take the stage – I thought we did pretty ok. A tea break followed before the keynote talk by Duncan Locke from the Rugby Football Union, which refreshingly deviated from the usual theme of performance analysis/recruitment, by looking instead at data-driven case study in the context of restarting professional play following the first wave of the COVID-19 pandemic in 2020.
A key theme of Duncan’s talk was asking effective questions
The closing address by Ben followed before the real highlight of the day – free booze. With some credigree to my name badge and face now, I got around to speaking with several other attendees and Stats Perform scientists. It was also a cool coincidence to find out that two of the other presenters + their mate who works at a club now studied at Imperial at the same time as us in 2018-19. The post-event buzz then shifted to the nearby pub for more drinks and chatter about football. An interesting point brought up by Bakr (one of the other presenters) was if there could be a better way to measure technical skillfulness in a player e.g., are % successful dribbles/take-ons enough to show that someone like Ronaldinho has great touch or feet? On that thought, what measurables are we actually referring to when we describe players with relatively vague characteristics such as being “intelligent” or the having the ability to “control the tempo”? Or are such terms too subjective/open-ended to be “useful” in the first place?
After hanging around for another hour or so, it was time to bid farewell to my day out at the Pro Forum, as I boarded the circle line to head home where I was greeted with dinner featuring my ultimate guilty pleasure food here – supermarket pizza (from Waitrose even). Two days later it was time to embark on my secondary reason to come up to the UK again, a nice week-long trip to Athens and Turkey, the highlights of which were the hot air balloon ride and treks within the scenic valleys of Cappadocia. Subsequently, it was back to London for a weekend and then to my actual home in Singapore, to ponder about the next thing to work on…
xPrickled by thorny plants here was pretty high
Closing Thoughts
To sum things up, I had a terrifically memorable time throughout the whole Pro Forum process, from the initial proposal submission, to the nights and weekends spent working on the project, to the event day itself. We were definitely very fortunate to get our proposal selected, especially on our first attempt. Andy, whom I spoke more to after the event, remarked that it wasn’t necessarily the “best” individual proposals that were selected, but rather the best set of proposals that would dovetail well with one another to produce a coherent programme. His statement definitely resonated with what I observed through the day, with the unexplored questions of several projects seemingly being “offered” solutions by the approaches of others, e.g. one of the areas of further work in the counterattack project was to consider the risk-reward perspective, which related quite closely with how we assessed the risk-reward ratios of various strategies to play against a counterpress in our project. One of the judges I spoke to also mentioned that the point of this forum isn’t to showcase fully-fleshed out research ideas that may have been in the works for 1-2 years, but instead provide fresh ideas with potential the opportunity to blossom and have an audience.
With that, I’ll end this post with acknowledgements again to Andy and the Stats Perform team for the opportunity; Dafydd for his guidance and the fun conversations; Laurie Shaw and Will Thomson for their open-source implementation of William Spearman’s pitch control model; Andy Rowlinson and Anmol Durgapal for the mplsoccer library, my project mates Zhi Yuan and Ashley for devoting their free time to this and last but not least, my fiancée for food and lodging being supportive and planning our entire Turkey/Athens trip by herself. Hoping that it won’t take another year for me to update this blog next time, and that I’ll get to be a part of an analytics conference like this again too!