Hi - a few points here:
As you say, overseas runners with no form in this country are relatively rare. However, when they do occur, there are a number of common form elements that can often be used: in particular, trainer form in this country, sire and dam_sire form, jockey form and so on. Usually there is also a form string representing the places achieved in recent races, albeit not necessarily in this country, as well as the number of days since the previous run. All of these elements are generally significant, certainly in model building, although the situation is not optimal for these particular runners.
Taking the example of the US runner Kaufymaker on the first day of Royal Ascot 2021, we have the following:
Kaufymaker
Trainer: Wesley Ward Owner: Mr Gregory Kaufman Ridden by: John Velazquez Cloth number: 17 Stall number: 3
Age: 2 (b. 2019) Colour: ch Gender: f Weight: 8-12 Bred: USA
Dam: Heaven's Touch (b. 2010) Sire: Jimmy Creed (b. 2009) Dam Sire: Montbrook (b. 1990)
Form: 1 (FlatSix) Forecast price: 5/2 Days since ran: 61
Thus the only form missing is the detail of the race that Kaufymaker won on her previous run at Keeneland. The other form elements are all usable.
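As a rough illustration of how the elements above might be turned into model inputs, here is a minimal Python sketch. The dictionary keys and function names are made up for this example and are not the Smartform schema; the values are copied from the Kaufymaker card above. It assumes the weight is in stones-pounds, the forecast price is fractional odds, and the most recent run is the rightmost character of the form string.

```python
from fractions import Fraction

# Hypothetical field names for illustration only (not the Smartform schema).
runner = {
    "name": "Kaufymaker",
    "trainer": "Wesley Ward",
    "jockey": "John Velazquez",
    "sire": "Jimmy Creed",
    "dam_sire": "Montbrook",
    "weight": "8-12",         # stones-pounds
    "form": "1",              # places in recent races, not necessarily in this country
    "forecast_price": "5/2",  # fractional odds
    "days_since_ran": 61,
}


def weight_to_pounds(weight: str) -> int:
    """Convert a stones-pounds string such as '8-12' to pounds (14 lb per stone)."""
    stones, pounds = weight.split("-")
    return int(stones) * 14 + int(pounds)


def implied_probability(price: str) -> float:
    """Convert fractional odds such as '5/2' to an implied probability."""
    numerator, denominator = price.split("/")
    odds = Fraction(int(numerator), int(denominator))
    return float(1 / (odds + 1))


def form_features(form: str) -> dict:
    """Summarise a form string; non-digit characters are simply ignored here."""
    places = [int(ch) for ch in form if ch.isdigit()]
    return {
        "runs_in_form": len(places),
        "wins_in_form": places.count(1),
        "last_place": places[-1] if places else None,
    }


def build_features(r: dict) -> dict:
    """Collect the features that do not depend on past race details in this country."""
    features = {
        # Categorical keys that could be joined to trainer, jockey and sire statistics.
        "trainer": r["trainer"],
        "jockey": r["jockey"],
        "sire": r["sire"],
        "dam_sire": r["dam_sire"],
        "weight_lb": weight_to_pounds(r["weight"]),
        "forecast_prob": implied_probability(r["forecast_price"]),
        "days_since_ran": r["days_since_ran"],
    }
    features.update(form_features(r["form"]))
    return features


features = build_features(runner)
```

The trainer, jockey, sire and dam sire values are left as plain keys here; in practice they would be looked up against whatever strike-rate or similar statistics you have already derived for racing in this country.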
On the question of including race form from other countries, this would mean licensing data from those countries. All Smartform data for all racing in the UK and Ireland is licensed from official sources for the personal use of Smartform subscribers. However, it's simply not viable to do the same for other countries given limited demand, as the cost can approach or exceed six figures for full history and ongoing updates, depending on usage. An alternative would be to webscrape data from overseas racing sites, but this is not something we would advocate, as it is usually outside the sites' terms of use and may contravene local licences.