Combining fourteen videos into football player tracks
In our previous BallJames blog post, we described how we use detectors to localize football players within camera images. One of these detectors, a neural network, takes an image as input and outputs player detections, each containing the coordinates of a player within that image. However, the detector generally provides no correspondence between detections across time, or across the various camera views in our setup. That is, given two detections in two different images, the detector does not know whether they show the same person. We need this correspondence information to link detections together and build a unique motion trajectory for each player. This is where one of our tracking approaches comes in.
The tracking approach that we discuss today has two purposes: 1) it links detector information from different images by deciding whether the various detections originate from the same person, and 2) it uses the collected information to generate a three-dimensional motion trajectory for each person participating in a match. To do this, we rely on a mixture of neural networks and heuristics. Re-identifying a person across different images using neural networks alone is currently challenging: given two images, a network has to decide whether they show the same person, even though the viewpoints of these images can be completely different. On top of that, players on the same team wear almost identical clothing. We therefore use additional heuristics to aid the detection association process.
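To make the idea of mixing a network output with heuristics concrete, here is a minimal sketch and not the BallJames implementation. It assumes each detection carries a hypothetical appearance vector from a re-identification network, a hypothetical team label, and a box centre in pixels. The weights, the 40-pixel gate, and the greedy matching are all illustrative choices.

```python
from dataclasses import dataclass
from math import hypot, sqrt
from typing import List, Sequence, Tuple


@dataclass
class Detection:
    cx: float                   # box centre in image pixels
    cy: float
    team: str                   # hypothetical team label, e.g. from kit colour
    embedding: Sequence[float]  # hypothetical appearance vector from a network


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def same_person_score(a: Detection, b: Detection,
                      max_shift_px: float = 40.0) -> float:
    """Score in [0, 1]; 0 means the heuristics rule the pair out."""
    if a.team != b.team:              # heuristic: different kits never match
        return 0.0
    shift = hypot(a.cx - b.cx, a.cy - b.cy)
    if shift > max_shift_px:          # heuristic: too far to move in one frame
        return 0.0
    proximity = 1.0 - shift / max_shift_px
    appearance = max(0.0, cosine(a.embedding, b.embedding))
    return 0.5 * proximity + 0.5 * appearance


def link(previous: List[Detection], current: List[Detection],
         min_score: float = 0.3) -> List[Tuple[int, int]]:
    """Greedily pair detections from two frames, best score first."""
    candidates = sorted(
        ((same_person_score(p, c), i, j)
         for i, p in enumerate(previous)
         for j, c in enumerate(current)),
        reverse=True,
    )
    used_prev, used_cur, links = set(), set(), []
    for score, i, j in candidates:
        if score < min_score:
            break                     # remaining candidates are even weaker
        if i in used_prev or j in used_cur:
            continue                  # each detection is linked at most once
        used_prev.add(i)
        used_cur.add(j)
        links.append((i, j))
    return sorted(links)
```

Greedy matching keeps the sketch short; an optimal assignment method could be swapped in without changing the scoring function. Detections that fail every gate simply stay unlinked, which is how a new track would start in this toy setup.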
Single- and multi-view tracking
We can perform tracking both at the level of an individual camera and across cameras. These trackers work together, with the output of one tracker being used in the other. Metrics derived from the player detections are used to compute association scores between detections in different images, and to link those detections together. Some of these metrics relate to the jersey that a player is wearing. For example, recognizing the number on the back of a jersey can provide a strong clue about the identity of a player. Another of these metrics is image coordinate information: camera calibration parameters can be used to transform the information collected across cameras into real-world coordinates. Combined with prior knowledge about which players are on the pitch, this lets us assign all collected track data to a unique player identity.
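As an illustration of the calibration step, the sketch below maps an image point to pitch coordinates and looks up an identity from a roster. It is a simplification under stated assumptions: it uses a 3x3 ground-plane homography per camera, which is only valid for points on the pitch surface such as a player's feet, whereas full three-dimensional positions would need the complete camera model. The matrix values, the roster, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]  # 3x3 homography, image plane -> pitch plane


def image_to_pitch(H: Matrix, u: float, v: float) -> Tuple[float, float]:
    """Map pixel (u, v) to pitch coordinates in metres using homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    return x / w, y / w              # divide out the homogeneous coordinate


def fuse(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Average the pitch positions that several cameras report for one player."""
    if not points:
        raise ValueError("need at least one position")
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


def identify(team: str, jersey_number: Optional[int],
             roster: Dict[Tuple[str, int], str]) -> Optional[str]:
    """Return the player for (team, number), or None if unknown or unreadable."""
    if jersey_number is None:
        return None
    return roster.get((team, jersey_number))


# Hypothetical calibration for one camera and a two-player roster.
H_cam = [[0.05, 0.0, -10.0],
         [0.0, 0.1, -20.0],
         [0.0, 0.001, 1.0]]
roster = {("home", 7): "Player A", ("home", 10): "Player B"}

foot_on_pitch = image_to_pitch(H_cam, 400.0, 500.0)
print(foot_on_pitch, identify("home", 7, roster))
```

Once detections from every camera live in the same pitch coordinate frame, distances between them become directly comparable, which is what makes cross-camera association scores possible in the first place.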
In the cloud
The tracking pipeline discussed here can be set up in the cloud to handle data received from a football stadium equipped with the BallJames system. It runs in real time (>25 fps), and can stream track data to a receiver anywhere on the planet. From there, the data can be processed further to compute metrics relevant for football analytics, such as heat maps of the pitch and players’ sprint intensities. Future updates will include even more player-specific metrics, such as whether a ball is kicked with the left or the right leg.
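The downstream metrics mentioned above can be derived from a track with very little code. The following is a rough sketch, assuming a track is a list of (time in seconds, x in metres, y in metres) samples on a 105 m by 68 m pitch. The sample format, the 5 m cell size, and the 7 m/s sprint threshold are illustrative assumptions, not values from the BallJames system.

```python
from math import ceil, hypot
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (time in s, x in m, y in m), assumed format


def heat_map(track: List[Sample], length: float = 105.0,
             width: float = 68.0, cell: float = 5.0) -> List[List[int]]:
    """Count how many samples fall in each cell of a grid laid over the pitch."""
    cols, rows = ceil(length / cell), ceil(width / cell)
    grid = [[0] * cols for _ in range(rows)]
    for _, x, y in track:
        if 0.0 <= x <= length and 0.0 <= y <= width:   # ignore off-pitch samples
            col = min(int(x // cell), cols - 1)
            row = min(int(y // cell), rows - 1)
            grid[row][col] += 1
    return grid


def speeds(track: List[Sample]) -> List[float]:
    """Speed in m/s between consecutive samples with increasing timestamps."""
    out = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt > 0:
            out.append(hypot(x1 - x0, y1 - y0) / dt)
    return out


def sprint_fraction(track: List[Sample], threshold: float = 7.0) -> float:
    """Fraction of intervals at or above the (hypothetical) sprint threshold."""
    v = speeds(track)
    return sum(1 for s in v if s >= threshold) / len(v) if v else 0.0
```

At 25 fps, consecutive samples are 0.04 s apart, so raw frame-to-frame speeds are sensitive to small position errors; smoothing the track before differentiating would be a sensible refinement.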