Hi,
First time post here.
I have just started training models using the R datarobot package which I am really enjoying.
My training set is 1,234,448 records with 319 columns. When I reduce the set to a small sample, say 10K, it has no problem fitting multiple models.
However, when I try the full dataset, which works via the web console, it fails in R.
Below is the code and the error message:
project <- StartProject(dataSource = trainDat,
                        projectName = projectName,
                        target = target,
                        workerCount = "max",
                        wait = TRUE,
                        smartDownsampled = TRUE)
Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached: [datarobot.edr.qantasloyalty.net] Operation timed out after 600000 milliseconds with 0 bytes received
Hi @Petros,
Have you tried uploading your dataset into the AI Catalog via the DataRobot web app? From there you can continue to work with that data in your R project. Other than that, how big is your dataset? Could a combination of internet speed and data size have triggered the 10-minute timeout?
All the best,
Ira
@Petros, you may also want to try adding the 'maxWait' parameter with a high value; by default it's 600 (seconds). E.g.:
project <- StartProject(dataSource = trainDat,
                        projectName = projectName,
                        target = target,
                        workerCount = "max",
                        wait = TRUE,
                        maxWait = 6000,
                        smartDownsampled = TRUE)
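If the single StartProject() call still times out on a ~1.2M-row upload, another option is to split the upload from the modeling step, so each call gets its own generous maxWait. This is a sketch only: it assumes the two-step SetupProject()/SetTarget() flow in the datarobot package, and the exact placement of workerCount and smartDownsampled in these calls is an assumption worth checking against the package docs.

```r
library(datarobot)

# Step 1 (sketch): upload the data and create the project.
# This is the slow part for a large dataset, so give it a long maxWait.
project <- SetupProject(dataSource = trainDat,
                        projectName = projectName,
                        maxWait = 60 * 60)  # up to an hour (assumed value)

# Step 2 (sketch): set the target and kick off modeling on the
# already-created project, again with its own long maxWait.
SetTarget(project = project,
          target = target,
          smartDownsampled = TRUE,  # placement here is an assumption
          maxWait = 60 * 60)

# Worker count can be raised separately once the project exists.
UpdateProject(project, workerCount = "max")
```

The idea is that if the timeout is hitting during the data transfer, isolating that step makes it easier to see where the 10-minute limit is being exceeded.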
Hi @IraWatt,
Thanks for your response.
I can upload it via the DataRobot web app and it will train multiple models. I then identified the best model and ran the predictions from within R. Obviously I would like to do this end to end.
I did try maxWait but to no avail. I am trying again and will see how it goes.