SciSports Data Analysis Innovations

Expected Points Model

08/09/16 | SciSports

Although it’s only mid-September and the national leagues are only a few match days underway, some managers are already facing backlash from the media and supporters after an unexpectedly bad start. Meanwhile, several clubs are performing much better than people expected beforehand.

The key question is whether those clubs have a better team than they were given credit for, or whether they simply had luck on their side in the August matches. SciSports can answer this question with its new Expected Points (xP) model.

First of all, in every match both teams create chances. Every chance on goal in a match that has already been played is treated as a possible goal with a certain expectancy. The value of this expected goal (xG) is determined by our Expected Goals model. For every chance during the course of the match, these values are calculated and ordered.

Subsequently, for each team, the probability of every possible number of goals scored is determined. For example, the probability that team A scores no goal during the whole match is defined as $\prod_{i=1}^{N} (1 - xG_i)$, with i being the index of a certain xG and N being the total number of xG values for that team during the match. This is done for every number of goals up to the maximum N that each team could have scored. Combining these results creates a matrix containing the probability of every possible outcome. As an example, a matrix is created with an xG of 1.4 for team A and an xG of 0.8 for team B:

This matrix results in a 0.512 chance for a win for team A (green), a 0.274 chance for a draw (yellow) and a 0.214 chance for a win for team B (red).
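The matrix step can be sketched in a few lines of Python. This is a minimal illustration, not the SciSports implementation: it assumes that every chance is an independent event that is converted with probability equal to its xG, and that the two teams' goal totals are independent of each other. The function names and the individual chance values are hypothetical; only the totals of 1.4 and 0.8 come from the example above, so the output will not necessarily match the matrix in the article.

```python
from typing import List, Tuple


def goal_distribution(chance_xg: List[float]) -> List[float]:
    """Return P(team scores exactly k goals) for k = 0..N.

    Assumption: each chance is independent and is converted with
    probability equal to its xG value.
    """
    dist = [1.0]  # before any chance: 0 goals with certainty
    for p in chance_xg:
        nxt = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            nxt[k] += prob * (1.0 - p)  # chance missed
            nxt[k + 1] += prob * p      # chance converted
        dist = nxt
    return dist


def outcome_probabilities(
    xg_a: List[float], xg_b: List[float]
) -> Tuple[float, float, float]:
    """Return (P(win A), P(draw), P(win B)) from the score matrix."""
    dist_a = goal_distribution(xg_a)
    dist_b = goal_distribution(xg_b)
    win_a = draw = win_b = 0.0
    for goals_a, pa in enumerate(dist_a):
        for goals_b, pb in enumerate(dist_b):
            cell = pa * pb  # one cell of the score matrix
            if goals_a > goals_b:
                win_a += cell
            elif goals_a == goals_b:
                draw += cell
            else:
                win_b += cell
    return win_a, draw, win_b


# Hypothetical per-chance xG values; only the totals (1.4 and 0.8)
# are taken from the article's example.
chances_a = [0.5, 0.4, 0.3, 0.2]
chances_b = [0.4, 0.3, 0.1]

p_win_a, p_draw, p_win_b = outcome_probabilities(chances_a, chances_b)
print(f"win A: {p_win_a:.3f}, draw: {p_draw:.3f}, win B: {p_win_b:.3f}")
```

The first entry of `goal_distribution` is exactly the "no goal" probability described above: the product of (1 - xG) over all chances of that team.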

The next step is to aggregate the probabilities for each division of points. The probability of a win for team A is the sum of the probabilities of all outcomes in which team A scores at least one goal more than team B. This probability is multiplied by the 3 points available for a win. The same procedure is followed for a draw, which is worth 1 point to each team, and for a win for team B. Combining these three outcomes results in a value for the expected points of both teams. In this case, xP values of 1.81 and 0.92 are calculated for team A and team B, respectively.
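As a quick check of this arithmetic, the example probabilities can be plugged into a small Python function. The function name and its parameters are hypothetical; the probabilities are the ones from the example matrix, and the 3 points for a win and 1 point for a draw are the usual league values.

```python
from typing import Tuple


def expected_points(
    p_win: float,
    p_draw: float,
    p_loss: float,
    win_points: int = 3,
    draw_points: int = 1,
) -> Tuple[float, float]:
    """Return (xP of the team, xP of its opponent) for one match."""
    xp_team = win_points * p_win + draw_points * p_draw
    xp_opponent = win_points * p_loss + draw_points * p_draw
    return xp_team, xp_opponent


# Probabilities from the example: win A, draw, win B.
xp_a, xp_b = expected_points(0.512, 0.274, 0.214)
print(round(xp_a, 2), round(xp_b, 2))  # prints: 1.81 0.92
```

Note that the two xP values do not add up to 3: a draw hands out only 2 points in total, so the sum depends on the draw probability.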

The final step of the model is to combine all matches played so far in a single chart that compares the expected points with the achieved results. We first did this for the Dutch Eredivisie, in which Feyenoord Rotterdam is the somewhat unexpected leader.

In this chart, the teams are ranked by their xP value. The difference between real points (P) and xP is visualized with a bar diagram. The chart also shows the expected goals for, against and difference (xGF, xGA, xGD) and the real goals for (GF), against (GA) and difference (GD). Again, the difference between xGD and GD is visualized with a bar diagram.
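The aggregation behind such a chart could look roughly like the sketch below. It is only an illustration: the function name, the team names and all per-match values are hypothetical placeholders, not data from the Eredivisie chart. Each match is represented as a pair of (xP for that match, points actually earned).

```python
from typing import Dict, List, Tuple

Row = Tuple[str, float, int, float]  # team, xP, P, P - xP


def season_table(matches: Dict[str, List[Tuple[float, int]]]) -> List[Row]:
    """Sum xP and real points per team and rank the teams by xP."""
    rows: List[Row] = []
    for team, results in matches.items():
        xp_total = sum(xp for xp, _ in results)
        p_total = sum(points for _, points in results)
        rows.append((team, xp_total, p_total, p_total - xp_total))
    rows.sort(key=lambda row: row[1], reverse=True)  # highest xP first
    return rows


# Hypothetical values for illustration only.
matches = {
    "Team X": [(1.81, 3), (1.20, 3), (0.95, 1)],
    "Team Y": [(0.92, 0), (2.10, 1), (1.60, 3)],
}

table = season_table(matches)
for team, xp, points, diff in table:
    print(f"{team}: xP={xp:.2f} P={points} P-xP={diff:+.2f}")
```

A positive P - xP means a team has collected more points than its chances would suggest, and a negative value means the opposite.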

Analysing this chart leads to some interesting conclusions. As a quick glance at the real rankings already suggests, ADO Den Haag has a GD that is 2.1 goals better than can be expected from their xG values. This leads to a surplus of 3.1 points above the xP value and an unexpected third place in the Eredivisie table. Based on this information, we predict that ADO Den Haag will most likely not be able to hold this position.

Furthermore, PEC Zwolle might feel hard done by, as their xP value is 2.7 higher than the points they actually earned. This is largely due to their inability to convert their chances: they have scored only once in four matches.

Finally, it is interesting to consider the case of Roda JC Kerkrade. Although the difference between xP and P is negligible (-0.5), there is a large difference between xGD and GD. This is largely due to the match AFC Ajax – Roda JC, which ended 2-2, while the xG distribution gave an expected outcome of 3.41 – 0.41. Luck was definitely not on AFC Ajax’s side that day.

A similar chart can also be presented for the most lucrative national league in football: the Premier League, where United is the overperformer.

Because the Premier League is only three match days underway, the chart contains more outliers. Most interesting are the underachieving Liverpool FC and Crystal Palace and the overachieving Chelsea, Hull City and Burnley. The results of Liverpool FC and Burnley can be explained directly by their match on the 20th of August. Liverpool FC dominated most of the match and created a large number of small chances (xG 1.03 – 0.37). Still, they lost 2 – 0 as Burnley converted two of just three opportunities. You can see our Expected Goals map for this match here.

In conclusion, the xP model is a helpful tool for determining how much (bad) luck a team has had and whether a team is able to convert the chances it creates. Furthermore, the model can indicate that a team is in fact stronger than its actual results show, and so provide support and guidance for criticized managers.