How can I find the following data (outlined below)?

0 votes
The data I am looking for is the following: runners' finishing positions in past races, jockey win ratio, trainer win ratio, owner win ratio, and horse win ratio. I am designing a neural network that will take all of these factors into account, but I am struggling to find this data for past races of any kind. Any help would be greatly appreciated!
asked Apr 19, 2021 in Smartform by emdeutsch Plater (150 points)

1 Answer

0 votes
Best answer

Hi, the data mentioned in the question all falls under the definition of derived variables, or features. Users of Smartform create these features themselves, since:

a) There is an edge to be gained in the design of such features - one person's idea of a useful ratio is not the same as another's. For example, if calculating trainer win ratio, do you take the winners-to-runners ratio over a trainer's lifetime, over the past year, or over the past month? Do you calculate by race type, by course type, or by distance? Is it best to use winners or placed horses, or even percentage of field beaten?

b) The design of the derived features - or feature engineering - is itself a key part of the process of applying machine learning techniques such as neural networks, so there is as much of an edge to be gained from using your own tested features as from the application of the model itself.
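
To make the choices in a) concrete, here is a minimal sketch in Python. The `Run` record and its field names are hypothetical and are not the Smartform schema; the functions only illustrate how the lookback window changes a trainer's winners-to-runners ratio, and one possible way to define percentage of field beaten. Only runs dated before the race being predicted are used, so the feature does not leak the result it is meant to predict.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Run:
    # Hypothetical record layout for illustration, not the Smartform schema.
    race_date: date
    trainer: str
    finish_position: int  # 1 = winner
    field_size: int       # number of runners in the race


def win_ratio(runs: Iterable[Run], trainer: str, as_of: date,
              lookback_days: Optional[int] = None) -> Optional[float]:
    """Winners-to-runners ratio for a trainer, using only runs before as_of.

    lookback_days=None means lifetime; otherwise only runs within that many
    days before as_of are counted. Returns None if there are no such runs.
    """
    if lookback_days is None:
        start = date.min
    else:
        start = as_of - timedelta(days=lookback_days)
    past = [r for r in runs
            if r.trainer == trainer and start <= r.race_date < as_of]
    if not past:
        return None
    wins = sum(1 for r in past if r.finish_position == 1)
    return wins / len(past)


def pct_field_beaten(run: Run) -> Optional[float]:
    """Share of rivals finishing behind this runner (1.0 for the winner).

    This is one possible definition; it is undefined for a one-runner race.
    """
    if run.field_size < 2:
        return None
    return (run.field_size - run.finish_position) / (run.field_size - 1)


# Made-up sample data, purely for illustration.
runs = [
    Run(date(2020, 1, 10), "T1", 1, 10),
    Run(date(2020, 6, 1), "T1", 4, 8),
    Run(date(2021, 3, 1), "T1", 1, 12),
    Run(date(2021, 4, 10), "T1", 3, 5),
    Run(date(2021, 4, 15), "T2", 1, 6),
]

as_of = date(2021, 4, 19)
lifetime = win_ratio(runs, "T1", as_of)          # 2 wins from 4 runs
last_year = win_ratio(runs, "T1", as_of, 365)    # 1 win from 3 runs
last_month = win_ratio(runs, "T1", as_of, 30)    # 0 wins from 1 run
```

The same trainer comes out at 50%, about 33% or 0% depending on the window, which is exactly why the choice of definition matters. The same function shape could be reused for jockey, owner or horse by filtering on a different field.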

However, all the raw data is there in the database to enable you to do this. For a list of the "raw data" fields see:

Additionally, we recently published a table of derived features for runners data (including a ready-made set of "ratios" that users can build upon themselves) to assist with machine learning use cases and to give users a starting point for creating similar features for other dimensions (e.g. jockey, trainer, sire, owner). See:

answered Apr 20, 2021 by colin Frankel (17,950 points)
selected Apr 21, 2021 by colin