Our analysis utilises generalized linear models (GLMs) to calculate match odds from existing football data sets. Specifically, we use Poisson regression to model the expected home and away goals for a given match. This approach is not new in itself and has been discussed extensively in the statistical literature. Because it models goals directly, the system has the flexibility to price most of the common betting markets (Correct Score, Half Time / Full Time, Asian Handicap, etc.). Our approach improves on some of the early models by changing the regression equations and adding further data to the analysis (such as shots on / off target, crowd attendance and motivational factors).
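To illustrate why modelling goals directly makes several markets available at once, the following is a minimal sketch of turning two expected-goal values into a correct-score grid and home / draw / away probabilities. It assumes the home and away goal counts are independent Poisson variables, which is a simplification and not necessarily what the production model does; the function names, the `max_goals` cut-off and the example means are all hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def score_matrix(home_mean, away_mean, max_goals=10):
    """Grid of P(home = i, away = j) for i, j in 0..max_goals.

    Assumption: home and away goals are independent, so each cell is the
    product of the two marginal Poisson probabilities.
    """
    home = [poisson_pmf(i, home_mean) for i in range(max_goals + 1)]
    away = [poisson_pmf(j, away_mean) for j in range(max_goals + 1)]
    return [[h * a for a in away] for h in home]


def match_odds(matrix):
    """Collapse a score grid into (home win, draw, away win) probabilities.

    The grid is truncated at max_goals, so the three totals are rescaled
    to sum to one.
    """
    size = len(matrix)
    home_win = sum(matrix[i][j] for i in range(size) for j in range(size) if i > j)
    draw = sum(matrix[i][i] for i in range(size))
    away_win = sum(matrix[i][j] for i in range(size) for j in range(size) if i < j)
    total = home_win + draw + away_win
    return home_win / total, draw / total, away_win / total


if __name__ == "__main__":
    # Illustrative expected goals only; real values would come from the fitted model.
    grid = score_matrix(1.5, 1.1)
    print(match_odds(grid))
```

The same grid can be summed over other regions (for example, all cells where the home margin exceeds a handicap) to price other markets.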

Below is a basic example of the equations we use; it does not include all the factors used by the production model. In the equations below *X _{i,j}*

This model is augmented with total shot data. To achieve this we simply extended the above equation with the assumption in the equation below, where *κ* is a scaling factor.

And therefore

where *A _{i,j}*

The regression includes a weighting function (equation 2.2) which allows the model to place greater importance on more recent matches. We have included the same weighting function. In our model, *t=(fd-md)* represents the difference, in days, between the date the match was played (md) and the date chosen to represent the “fit date” (fd). *ξ* is a constant representing the strength of the decay.
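As a rough sketch of how such a time weighting could be computed, the snippet below assumes an exponential decay of the form exp(-ξt). That functional form is an assumption made for illustration, as are the function name and the value of `xi` in the example; only the definition of *t* as the number of days between the match date and the fit date comes from the text.

```python
import math
from datetime import date


def time_weight(match_date, fit_date, xi):
    """Weight for a match played on match_date when fitting as of fit_date.

    Assumption: the decay is exponential, exp(-xi * t), with t in days.
    """
    t = (fit_date - match_date).days  # t = fd - md
    if t < 0:
        raise ValueError("match_date must not be later than fit_date")
    return math.exp(-xi * t)


if __name__ == "__main__":
    fit = date(2020, 6, 1)
    # xi = 0.005 is an arbitrary illustrative value, not a fitted constant.
    for played in (date(2020, 5, 25), date(2019, 6, 1)):
        print(played, time_weight(played, fit, xi=0.005))
```

A larger ξ discounts old matches more aggressively, while ξ = 0 treats every match equally.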

To allow us to vary the importance of the “goals” inferred from the shot data relative to the actual goals scored, we have included a further weighting function that applies to the relevant elements of the fit. This constant, τ, is fixed at 1 for elements representing goals and takes a lesser value for elements representing shots.
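One way to picture the two weightings working together is sketched below. It assumes that τ simply multiplies the time-decay weight of each observation; the text does not state how the weights are combined, so that choice, the function name and the value used for shots are all hypothetical.

```python
def observation_weight(kind, decay_weight, tau_shots=0.5):
    """Overall fit weight for one observation.

    kind         -- "goals" for an actual-goals element, "shots" for a
                    shot-derived element
    decay_weight -- the time-decay weight already computed for the match
    tau_shots    -- hypothetical tau for shot elements (must be <= 1)

    Assumption: the two weights combine by multiplication.
    """
    if kind == "goals":
        tau = 1.0  # fixed at 1 for elements representing goals
    elif kind == "shots":
        tau = tau_shots  # a lesser value for elements representing shots
    else:
        raise ValueError(f"unknown observation kind: {kind!r}")
    return tau * decay_weight


if __name__ == "__main__":
    # The same match contributes a goals element and a down-weighted shots element.
    print(observation_weight("goals", 0.8))
    print(observation_weight("shots", 0.8))
```

Weights of this kind would typically be passed to the GLM fitting routine as per-observation weights.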