0 votes
Can anyone confirm whether or not the Smartform database information has been corrected/adjusted for the draw renumbering that took place a couple of years back (think it happened for right hand courses in March 2011)? I have also noticed that some trainers and jockeys appear more than once with different spellings and wondered whether these will be corrected in the near future or whether I can go through and manually combine those trainers and jockeys that have multiple entries (eg Sir Michael Stoute and Sir Henry Cecil have double entries with and without their titles whlst Ryan Moore appears as Ryan and RL Moore). Should I have to do it myself then thats fine but any advice on how to go about making those corrections myself would be welcomed as I obviously dont want to make damaging irreversible changes to the existing data.
by hdemet Novice (370 points)

2 Answers

0 votes

All the data in the database reflects what was declared at the time.  So, for draw numbers, if a horse ran at a right handed track such as Sandown over 5 furlongs and was drawn 1 (what used to be the widest draw), it actually came from that stall, and the stall has not been given a new number.   From the Flat season starting in 2011 of course, a 1 draw means that horse would be racing next to the rail (where there is a distinct advantage).  

Similarly trainers and jockeys that were previously declared under different naming conventions appear under the naming conventions that were current at the time.  In the case of Henry Cecil, he used to be H R A Cecil, but then of course, Sir Henry was knighted.  So he actually had a different name.    In other cases, such as Ryan Moore to R L Moore, the change is  irritating in that it's without much apparent logic, but that was the way the racecards represented him at one time.  However, the idea of the jockey_id and the trainer_id is that you do have the same ids for the same entities, even if those entities have changed over time.  

So for Henry Cecil, we can run a query as follows:

mysql> select distinct trainer_id, trainer_name from historic_races join historic_runners using (race_id) where trainer_name regexp "Cecil";
+------------+-----------------+
| trainer_id | trainer_name    |
+------------+-----------------+
|       5709 | Cecil Ross      |
|         37 | H R A Cecil     |
|         29 | Ben D Cecil     |
|         37 | H R Cecil       |
|     103666 | H R Cecil       |
|         37 | Sir H R A Cecil |
|     117726 | Lady Cecil      |
+------------+-----------------+
So you can see that whilst Henry Cecil has variously appeared on racecards as H R, H R A and Sir H R A, the id is always consistent.  In this case there, is one exception, where an inconsistent id appears in the database - here is the offending Id:
 
mysql> select distinct meeting_date, trainer_id, trainer_name from historic_races join historic_runners using (race_id) where trainer_name regexp "Cecil" and trainer_id = 103666;
+--------------+------------+--------------+
| meeting_date | trainer_id | trainer_name |
+--------------+------------+--------------+
| 2010-09-04   |     103666 | H R Cecil    |
+--------------+------------+--------------+
 
That is indeed something that will be corrected shortly, since Ids should always be consistent for the same entities - you can expect a data refresh for these kind of id errors in August.
 
Note that representing the data that appeared at the time does have its upsides - the Racing Post for example represent any horse that has been gelded as a gelding - if you query the individual records, you will never know if it raced as a colt since they only carry the current status of the horse.  Just as with all the other data from the time, Smartform represents the gender of the horse was when it actually raced.
 
On the wider question of not making irreversible changes to the data if you want to insert your own data or make any corrections, note that as a subscriber you can always download the current data again.
 
However, if for example you want to make corrections to the draw data to compensate for renumbering, you should not amend any of the tables that are updated on a daily basis; usually the best thing to do is to do one or more of the following:
  • create a new view on the database
  • create a new table in the smartform database
  • create a temporary table in the smartform database
  • create a data structure outside the database using a programming script that connects to the database
In any case there is no need to touch the original database tables.
 

 

by colin Frankel (19.7k points)
edited by colin
Many Thanks - that clarifies things greatly for me.
0 votes
It’s good to see all the corrections. Thanks!
by lauramartin Novice (300 points)
...