0 votes
To calculate the public's estimation of the winning probability of a horse from its starting price in the Smartform database, is it necessary to take account of the bookmakers' vig?  Symbolically, if w_i is the amount wagered on horse i and W = \sum_i w_i is the total amount wagered in the race, is starting_price_decimal = (1 - \delta) / \pi_i where \delta = bookmakers' vig and \pi_i = w_i / W?  Summing up 1 / starting_price_decimal for each race gives a quick-and-dirty median value of about 1.15 (for a particular selection of races and ignoring horses without starting prices).  This would suggest a vig of about 13%.  Does this seem reasonable?
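As a quick check of that arithmetic, here is a minimal sketch that assumes the model above holds exactly: since the \pi_i sum to 1, the sum of 1 / starting_price_decimal over a race is 1 / (1 - \delta), so \delta = 1 - 1 / (book sum). The function name is only illustrative.

```python
def vig_from_book_sum(book_sum: float) -> float:
    """Return delta under the assumed model price_i = (1 - delta) / pi_i.

    Summing 1 / price_i over a race gives 1 / (1 - delta), because the
    pi_i sum to 1, so delta = 1 - 1 / book_sum.
    """
    if book_sum <= 0:
        raise ValueError("book_sum must be positive")
    return 1.0 - 1.0 / book_sum


# The median book sum quoted above.
print(round(vig_from_book_sum(1.15), 4))
```

For a book sum of 1.15 this gives roughly 0.13, which is where the 13% figure comes from.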

In the version of the Smartform database I have, now more than 2 years old, the field tote_win is non-null in fewer than 5% of the rows.  Are more of these data populated in the current version of the database?

Thank you
in Smartform by gillpa Handicapper (730 points)

1 Answer

0 votes
Yes - work out the overround, as in:

http://answers.betwise.net/28/how-do-you-calculate-the-bookmaker-overround-in-smartform

Then, in any particular race, divide 1 / starting_price_decimal for each horse by the sum of 1 / starting_price_decimal over all runners in that race (that is, 1 + overround) to get the percentage winning probability to a theoretical 100% book.

I think this applies as follows:

If we take a theoretical race with 3 horses that the public thinks are evenly matched and are evenly backed, the "true" prices should be 2/1, 2/1 and 2/1, or 3.0, 3.0 and 3.0, or a probability of 1/3.0 each, i.e. 33%.

Greedy bookie has priced this race with all the horses at Evens, so 2.0, or a probability of 1/2 each.

The overround is therefore 50%, ie. the sum of all probabilities is 1.5.

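To get back to the public's estimation, we divide the implied probability of each horse, 1 / starting_price_decimal = 1/2.0 = 0.5, by the sum of all probabilities, 1.5, to get back to 33% each.  Equivalently, multiplying the starting_price_decimal of 2.0 by 1.5 recovers the "true" price of 3.0.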
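The three-horse example can be sketched in a few lines of Python. This assumes simple proportional normalisation, where every runner's implied probability is scaled by the same factor; it is one common way to remove the overround, not necessarily the best one. The function name is hypothetical.

```python
def fair_probabilities(decimal_prices):
    """Normalise implied probabilities so they sum to 1 within one race.

    Returns (probabilities, overround), where overround is the book sum
    minus 1 (0.5 means a 150% book).
    """
    implied = [1.0 / p for p in decimal_prices]
    book_sum = sum(implied)
    return [q / book_sum for q in implied], book_sum - 1.0


# Three horses all priced at Evens (2.0), as in the example above.
probs, overround = fair_probabilities([2.0, 2.0, 2.0])
print(probs, overround)
```

Each horse comes back at one third, with an overround of 0.5.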

In terms of using some form of average vig across all races, I think that works with the following caveats:

1)  Bookmakers' margins are much smaller on favourites, so prices on favourites are probably not far from the public's estimation, and some adjustment may have to be made for this.

2)  Overrounds are often bigger on bigger fields.
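One way to see how much the second caveat matters in your own selection of races is to compute the book sum per race and compare it across field sizes. The sketch below assumes you have already pulled (race_id, starting_price_decimal) pairs out of the database; race_id is an assumed name for the race key, and the prices are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def book_sums_by_field_size(rows):
    """rows: iterable of (race_id, starting_price_decimal) pairs.

    Returns {number of priced runners: median book sum}. Runners with no
    usable price are skipped, so the count covers priced runners only.
    """
    per_race = defaultdict(list)
    for race_id, price in rows:
        if price is not None and price > 1.0:
            per_race[race_id].append(1.0 / price)

    by_size = defaultdict(list)
    for implied in per_race.values():
        by_size[len(implied)].append(sum(implied))
    return {size: median(sums) for size, sums in sorted(by_size.items())}


# Made-up prices for illustration only, not Smartform data.
rows = [
    ("r1", 2.0), ("r1", 2.0), ("r1", 2.0),
    ("r2", 2.5), ("r2", 4.0), ("r2", 4.0), ("r2", 5.0), ("r2", None),
]
print(book_sums_by_field_size(rows))
```

If the median book sum climbs with field size, a single average vig will over-correct small fields and under-correct large ones.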

Re. tote_win:

tote_win only returns a value where the horse has won a race, so it's normal for the majority of these data to be NULL.  It should always return a value using the condition "where finish_position = 1".
by colin Frankel (19.7k points)
...