Daily vs historic data


Given I'm a seasoned IT professional with 30 years+ experience, you'd think I'd be able to intuitively know why there are daily and historic race and runner tables.  I'd have assumed the layout of the daily/historic tables would be the same but perhaps they'd just contain different data based on time, but that's not the case at all.

My initial interest is in race times, which you can only get from the historic_runners table - why is this not available in the daily ones?  The annoying thing is that there are races I can see in the daily race/runner tables, e.g. 1250094, that aren't in the historic tables.  This is pretty useless unless I'm missing something?

I have installed the initial historic data into MySQL and I have set up the updater (which didn't appear to do anything). Should I be downloading anything else?

asked Aug 14, 2023 in Smartform by maddisor Novice (410 points)

1 Answer

Hi -

The key difference is that the daily tables are advance information on races and runners and the historic tables are compiled after the race has been run.

Every day’s advance information is added to the daily tables to build up a record of advance data (i.e. data that is available in advance of the race, a.k.a. racecard data).

We cannot know a race time in advance of a race, only after it has been run, so it only appears in the historic record (a.k.a. results data).

For analysis purposes you will likely only want to use historic data.

For applying any analysis in advance of a race you will likely use the daily information.

Note that we also provide a homogeneous record (which excludes results and therefore race times) for historic and daily data with the daily and historic insights tables - these are all derived variables.
answered Aug 14, 2023 by coltest Listed class (2,780 points)
Thanks for the response. At what time during the day/evening will the day's results be available in the historic tables? (Assuming I've got the update set up correctly, which I believe I have, as the daily data seemed to update.)

I've just run select max(meeting_date) from historic_races and the latest date I have is 2023-08-13, so I assume the updater isn't updating correctly. Are there any log files or anything like that to see if/when your data is updated?

After manually updating, the latest date I get now for historic_races is 2023-08-14 but I still don't see race 1250094, which I do have in daily_races/daily_runners. The meeting_status is 'Dormant'. What does this mean?
Hi -
Daily races are updated around 7.30 pm the day before racing.  Note that there are also tables for advance_daily_races, which present the earlier view of racecards up to 2 or 3 days (depending upon when it is available) before racing.

Historic races (i.e. results) are available from around 5 am the day after racing.

Regarding the race_id you are looking for, this was for racing at Brighton on 11-08-23, so it is in daily races as it was scheduled to happen. However, the race itself - along with most of the meeting - was abandoned, so it never happened, which is why it does not appear in the historic data.
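
To list other races that were scheduled but never produced a result (abandoned races, for example), an anti-join between the two tables is one option. This is only a sketch: it assumes race_id is present in both daily_races and historic_races, and that daily_races has meeting_date and meeting_status columns, so check the names against your own schema.

```sql
-- Races that appear in the racecard data but have no matching result.
-- Assumed columns: race_id in both tables; meeting_date and
-- meeting_status in daily_races.
select d.race_id, d.meeting_date, d.meeting_status
from daily_races d
left join historic_races h on h.race_id = d.race_id
where h.race_id is null            -- no result row exists for this race
  and d.meeting_date < curdate()   -- skip races that have not been run yet
order by d.meeting_date, d.race_id;
```

Since results arrive from around 5 am the day after racing, yesterday's races can also show up here if the query is run before the updater has completed.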
Thanks for the helpful responses.

Another question: I've looked at the quickest run times across the whole range of distance_yards values in historic_races, but the data looks a bit suspect to me.  E.g.:

select distance_yards, min(winning_time_secs)
from historic_races
where winning_time_secs > 10
group by distance_yards
order by 1

Here are a few results:
distance_yards   min(winning_time_secs)
1740             97.34
1745             96.00
1760             61.01 ??
1763             87.23
1764             102.60

Are these accurate?
The data is all licensed from official providers. However, there can occasionally - albeit rarely - be errors.

In particular, historic data before 2008 has more inaccuracies than later data, so with that query you should also add in the meeting_date and the course as a sanity check.

Further, most work with race times involves removing outliers (even when the outliers are accurately reported), so that data point (1760 yards = 61.01 secs) should be removed.
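
As a rough sketch of that sanity check, the fastest time per distance can be joined back to the race it came from, so the date and course are visible next to each value. The column name course is an assumption (use whatever the course column is called in your copy of historic_races), and the 2008 cut-off is optional.

```sql
-- Fastest winning time per distance, shown with the race it came from.
-- Assumed column: course. The meeting_date filter is optional.
select r.distance_yards, r.winning_time_secs, r.meeting_date, r.course
from historic_races r
join (
    select distance_yards, min(winning_time_secs) as best_secs
    from historic_races
    where winning_time_secs > 10
      and meeting_date >= '2008-01-01'
    group by distance_yards
) m on m.distance_yards = r.distance_yards
   and m.best_secs = r.winning_time_secs
where r.meeting_date >= '2008-01-01'
order by r.distance_yards;
```

If two races share the same fastest time at a distance, both rows are returned.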

As a rule of thumb, 12 seconds per furlong is about the fastest pace a horse can achieve, and the achievable pace gets progressively slower the longer the distance.
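
One way to apply that rule of thumb is to flag any winning time that is faster than 12 seconds per furlong for its distance. A furlong is 220 yards, so the floor in seconds is distance_yards / 220 * 12. This is a sketch rather than a definitive filter: race_id is assumed to exist in historic_races, and the 12-second figure is only the rule of thumb above, not a hard limit.

```sql
-- Flag winning times faster than the 12 seconds-per-furlong rule of thumb.
-- 1 furlong = 220 yards, so floor_secs = distance_yards / 220 * 12.
-- Assumed column: race_id.
select race_id, meeting_date, distance_yards, winning_time_secs,
       round(distance_yards / 220 * 12, 2) as floor_secs
from historic_races
where winning_time_secs > 10
  and winning_time_secs < distance_yards / 220 * 12
order by distance_yards, winning_time_secs;
```

For 1760 yards (8 furlongs) the floor works out at 96 seconds, so the 61.01 value would be flagged. Rows returned by this query are candidates for review or exclusion, not proven errors.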

Of course, there are many other factors to consider, not least class of race (or ability of the runners), pace in race, going and course features (eg bends, undulations).
...