Missing records in historic_betfair_win_prices

0 votes

Hi

I've just discovered that a number of records are missing from the historic_betfair_win_prices table.

To be clear, I have just re-downloaded the entire table and repopulated it from scratch, so this is not a case of missing updates in my own database.

I have run the following query:

select distinct
    rac.*
from
    smartform.historic_races rac
    left join smartform.historic_betfair_win_prices bsp
        on rac.race_id = bsp.sf_race_id
where
    rac.meeting_date >= '2020-01-01'
    and bsp.sf_race_id is null

It returns 630 records (i.e. 630 races since the start of last year for which Betfair SPs are missing from this table).

A few observations from these results:

- Barring the odd Arab race, the consistent pattern is that entire meetings are missing. It's whole meetings in blocks, not just random races.

- There's obviously an entire missing day of data on 3 Sep 2020. No Betfair SPs at all for seven meetings held on that day.

- After that single day there are no missing meetings at all until 7 Mar 2021. From that point forward, missing meetings occur frequently but seemingly randomly, not far short of one per day on average.

- There seem to be a few courses whose meetings keep reappearing in the 'missing' list. The main ones are: Chelmsford City, Wolverhampton, Newton Abbot, Market Rasen, Leopardstown, Gowran Park, Punchestown, Downpatrick.
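For anyone wanting to check the 'whole meetings' pattern on their own copy, here is a minimal sketch that groups the missing races by meeting. It runs against an in-memory SQLite database with made-up rows rather than the real Smartform MySQL schema. The `course` column on historic_races, the `runner_id` column and all the data are assumptions for illustration only. Only the table names, `race_id`, `sf_race_id` and `meeting_date` come from the query above.

```python
import sqlite3

# Toy stand-in for the Smartform tables; rows and extra columns are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table historic_races (race_id integer, meeting_date text, course text);
    create table historic_betfair_win_prices (sf_race_id integer, runner_id integer);
    insert into historic_races values
        (1, '2021-01-01', 'Course A'), (2, '2021-01-01', 'Course A'),
        (3, '2021-01-01', 'Course B'), (4, '2021-01-01', 'Course B'),
        (5, '2021-01-01', 'Course C'), (6, '2021-01-01', 'Course C');
    insert into historic_betfair_win_prices values
        (3, 10), (3, 11), (4, 12), (5, 13);
""")


def missing_by_meeting(conn):
    """Return (meeting_date, course, races, missing) for every meeting
    that has at least one race with no Betfair SP rows."""
    sql = """
        select rac.meeting_date,
               rac.course,
               count(*) as races,
               sum(case when bsp.sf_race_id is null then 1 else 0 end) as missing
        from historic_races rac
        -- one row per priced race, so runners do not inflate the counts
        left join (select distinct sf_race_id
                   from historic_betfair_win_prices) bsp
               on rac.race_id = bsp.sf_race_id
        where rac.meeting_date >= '2020-01-01'
        group by rac.meeting_date, rac.course
        having sum(case when bsp.sf_race_id is null then 1 else 0 end) > 0
        order by rac.meeting_date, rac.course
    """
    return conn.execute(sql).fetchall()


rows = missing_by_meeting(conn)
for meeting_date, course, races, missing in rows:
    label = "whole meeting" if races == missing else "partial"
    print(meeting_date, course, f"{missing}/{races} races missing", label)
```

Meetings where `missing` equals `races` are missing in their entirety. Anything flagged as partial would point to a different cause, such as the odd Arab race.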

On the last of these points, I know that Betfair have gone slightly random in recent months with the course abbreviations that they have been using in their API and other data feeds, especially for 'multi word' course names. This has proved frustrating. But, if the historic_betfair_win_prices table relies on a "Betfair course to Smartform course" lookup based on the betfair course abbreviation, this might be a possible reason why some data is getting missed.
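To illustrate the kind of failure I have in mind, here is a hedged sketch of a course lookup that normalises labels before matching and reports anything it cannot map, rather than silently dropping it. I do not know how the real lookup behind historic_betfair_win_prices works, and the abbreviations below are invented examples, not actual Betfair values. `COURSE_VARIANTS`, `to_smartform_course` and `unmapped` are hypothetical names.

```python
import re

# Invented label variants per course; real Betfair abbreviations may differ.
COURSE_VARIANTS = {
    "Chelmsford City": ["Chelmsford City", "Chelm C", "ChelmC"],
    "Newton Abbot": ["Newton Abbot", "Newton Ab"],
    "Market Rasen": ["Market Rasen", "Mkt Rasen"],
}


def normalise(label):
    """Lower-case the label and strip everything except letters, so that
    spacing and punctuation differences do not break the match."""
    return re.sub(r"[^a-z]", "", label.lower())


# Build the lookup once, keyed on the normalised form of each variant.
ALIASES = {
    normalise(variant): course
    for course, variants in COURSE_VARIANTS.items()
    for variant in variants
}


def to_smartform_course(label):
    """Return the mapped course name, or None if the label is unknown."""
    return ALIASES.get(normalise(label))


def unmapped(labels):
    """Return the sorted set of labels that have no mapping, so they can be
    reviewed instead of being lost."""
    return sorted({l for l in labels if to_smartform_course(l) is None})


print(to_smartform_course("Chelm  C"))
print(unmapped(["Mkt Rasen", "Newton-Abbot", "Xyz Park"]))
```

If an import step logged the output of something like `unmapped` each day, a change in Betfair's abbreviations would show up as a new unmapped label instead of a silently missing meeting.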

I appreciate that the Betfair SP data is provided 'without warranty', but it would be really good to chase down this issue, as omissions on this scale (perhaps 10%-15% of all records in the last 4 months) risk making it pretty much unusable.

And - speaking personally - this data is a really key part of my system build!

Would appreciate any thoughts or response.

asked Jul 11, 2021 by SlightReturn Listed class (2,850 points)
edited Jul 12, 2021 by SlightReturn

1 Answer

0 votes
The missing BSPs look like an issue with the course mappings as you suggest - we will try and get to the bottom of it shortly, provided that the data has been published.
answered Jul 12, 2021 by colin Frankel (19,280 points)