Thanks Colin. What I tend to do is group race distances into a fairly small set of bands - I seem to get the best profit results this way. E.g. for race id 1260609, daily_races_beta has a distance_yards of 2419, but historic_races_beta has it at 2420. My code essentially checks yardages, in this example, like this:
...
} else if (yards < 2420) {
race.setRaceDistance(2200L);
} else if (yards < 3080) {
race.setRaceDistance(2640L);
}
...
So the problem for race id 1260609 is that 2419 (from daily_races_beta) falls into the 2200-yard band, while 2420 (from historic_races_beta) falls into the 2640-yard band (where there is slightly more race history).
I can just use the nearest furlong as you suggest, which gives 2420 yards, but I actually end up with fewer winners to choose from that way, which results in less profit overall.
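To make the mismatch concrete, here is a minimal, self-contained sketch. The class name DistanceBands and the helper snapToFurlong are made up for illustration, and only the two bands quoted above are modelled (the rest of my bands are left out). It also shows one possible middle ground, which is only an idea and not something I have checked against profit: snap the raw yardage to the nearest furlong first, purely so both tables land on the same value, and then apply the same bands as before.

```java
// Illustrative sketch only: class and helper names are hypothetical.
final class DistanceBands {

    static final long YARDS_PER_FURLONG = 220L;

    // Round a raw yardage to the nearest whole furlong, expressed in yards.
    static long snapToFurlong(long yards) {
        return Math.round((double) yards / YARDS_PER_FURLONG) * YARDS_PER_FURLONG;
    }

    // Only the two bands quoted in the post. The lower bands are omitted,
    // so in this sketch anything below 2420 lands in the 2200 band.
    static long band(long yards) {
        if (yards < 2420) {
            return 2200L;
        } else if (yards < 3080) {
            return 2640L;
        }
        return yards; // placeholder for the bands not shown
    }

    public static void main(String[] args) {
        long daily = 2419L;    // distance_yards from daily_races_beta
        long historic = 2420L; // distance_yards from historic_races_beta

        // Raw values straddle the 2420 boundary, so the bands differ.
        System.out.println(band(daily));    // 2200
        System.out.println(band(historic)); // 2640

        // Snapping first makes both feeds agree before banding.
        System.out.println(band(snapToFurlong(daily)));    // 2640
        System.out.println(band(snapToFurlong(historic))); // 2640
    }
}
```

The bands themselves are unchanged here, so the grouping stays as coarse as before; the snap only stops a one-yard difference between the two tables from flipping a race across a band boundary.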