Yes - R is a great tool for preprocessing Smartform data and building models etc.
Regarding using R and Amazon etc, from a license point of view this is fine as long as the data is for your personal use and there is no data transfer or resale of data. You can also make 2 copies of the database (eg. locally and on your own AWS instance) provided that they are both under your control / only accessible by yourself.
From a techical point of view, using RMySQL is probably the best library to use, in order to create an R dataframe that you can manipulate further. Here's the start of a recent script as an example:
#load library
library(RMySQL)
#clean up any open connections
all_cons <- dbListConnections(MySQL())
for(con in all_cons) {dbDisconnect(con)}
con <- dbConnect(RMySQL::MySQL(), dbname = "smartform", user="smartform")
horse_query <- paste("select name, race_id, runner_id, race_type_id, race_type, meeting_date, course, distance_yards, distance_behind_winner, starting_price, starting_price_decimal, prize_money, class, trainer_name, jockey_name, jockey_id, trainer_id, finish_position, unfinished, coalesce(finish_position, unfinished) AS 'result', num_runners, going, standard_time_secs, winning_time_secs, stall_number, tack_hood, tack_visor, tack_blinkers, tack_eye_shield, tack_eye_cover,tack_cheek_piece, tack_pacifiers, tack_tongue_strap, coalesce(tack_visor, tack_cheek_piece, tack_eye_cover, tack_eye_shield, tack_hood, tack_blinkers) AS 'headgear', historic_runners.private_handicap, historic_runners.private_handicap_type, from historic_runners join historic_races using (race_id) order by meeting_date DESC;", sep="'")
smartform <- as.data.frame(dbGetQuery(con, horse_query))