Running large simulations on a desktop computer with R (tips & tricks)
Let me briefly describe what I learned while attempting to run a large simulation on my desktop computer, as I think it might be useful to others. Here is the problem: I was forced to run a simulation on my desktop computer because brms couldn’t be installed on my department’s high-performance computing (HPC) cluster. My first realisation was: Don’t use Windows!
Problem 0: You cannot stop Windows from rebooting
Trying to get this simulation to finish showed me what a terrible working environment Windows actually is. The biggest problem is that Windows doesn’t give you a choice about whether to update your system. I wasted a couple of days trying to find out whether I could disable automatic updates; it is simply not possible. As a consequence, the computer sometimes just restarted without my approval (outside of “working hours”, despite the fact that the simulation was still running). This infuriated me so much that I set up my desktop machine to dual-boot Windows & Ubuntu. If you don’t use Ubuntu (or a different Linux), I’d suggest you start.
Problem 1: The simulation takes several days but I also work on the computer
For this simulation I had to use my desktop computer in my office. This is the computer that I actually work on. So, I had to come up with a way to pause and restart the simulation whenever I needed the computer for other stuff. This is done relatively easily: whenever I have to use the computer for something else, I just press ESCAPE (for instance when running it via RStudio) or, in my case, CTRL + C when running it via the console. This allowed me to stop and restart the simulation whenever the need arose.
Here are the most important steps:
- Use tryCatch().
- Add a function that saves the progress in case an error occurs.
- Use parallelisation if possible (more on this later).
- Add save points at which the progress is automatically saved, if you don’t save it for each iteration.
- Optional: Predict the time when the whole thing is supposed to finish.
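Put together, the core of such a pausable loop might look like the following sketch. Note that `simulate_once()`, the checkpoint file name, and the save interval are illustrative placeholders of mine, not the original code:

```r
# Sketch of a pausable, checkpointed simulation loop.
# simulate_once() stands in for the real, expensive model fit.
simulate_once <- function(i) mean(rnorm(100))

n_iter  <- 1000
results <- vector("list", n_iter)

for (i in seq_len(n_iter)) {
  results[[i]] <- tryCatch(
    simulate_once(i),
    error = function(e) {                 # save progress, then re-raise
      saveRDS(results, "progress.rds")
      stop(e)
    },
    interrupt = function(e) {             # ESC in RStudio / CTRL + C on the console
      saveRDS(results, "progress.rds")
      stop("Interrupted - progress saved to progress.rds")
    }
  )
  if (i %% 50 == 0) saveRDS(results, "progress.rds")  # regular save point
}
saveRDS(results, "progress.rds")
```

The `interrupt` handler is what makes ESC / CTRL + C safe: the progress is written to disk before the script stops.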
A word on parallelisation: I used foreach and doParallel here, and I found that it is important to run the parallelisation in chunks of the right size. Here, I used foreach() nested inside a for()-loop.
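The chunked pattern looks roughly like this (the worker count, chunk size, and `simulate_once()` placeholder are my own illustrative choices):

```r
library(foreach)
library(doParallel)

simulate_once <- function(i) mean(rnorm(100))  # placeholder for the real work

cl <- makeCluster(2)                 # number of workers: adjust to your machine
registerDoParallel(cl)

n_iter     <- 200
chunk_size <- 50
chunks     <- split(seq_len(n_iter), ceiling(seq_len(n_iter) / chunk_size))

results <- vector("list", n_iter)
for (chunk in chunks) {              # outer for(): one chunk at a time
  results[chunk] <- foreach(i = chunk) %dopar% simulate_once(i)
  saveRDS(results, "progress.rds")   # checkpoint between chunks
}
stopCluster(cl)
```

Checkpointing between chunks (rather than between single iterations) keeps the saving overhead low while still letting you interrupt the script without losing much work.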
The functions predicted_finish(), progressBar_plot() and exit_loop_gracefully() can be found here.
When restarting the script, make sure to load the progress at the beginning, so you can pick up where you stopped:
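For instance like this (again assuming the progress is stored in a `progress.rds` file, which is my own naming):

```r
# At the top of the script: resume from a checkpoint if one exists.
n_iter <- 1000
if (file.exists("progress.rds")) {
  results <- readRDS("progress.rds")
  done    <- sum(!vapply(results, is.null, logical(1)))
} else {
  results <- vector("list", n_iter)  # fresh start
  done    <- 0
}
# ... then continue the simulation at iteration done + 1
```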
This could already be enough for most applications but I actually encountered another obstacle that needed fixing.
Problem 2: brms creates temporary files that fill up the hard disk
The final problem that I encountered was that brms creates temporary files for each model it fits. Normally this is not an issue, but if you fit hundreds of thousands of models, a few MB per model suddenly fill up the disk. The problem is that even if you delete these temporary files, the space is not freed until you close & restart R.
So I wrote a function that checks the current disk space that is remaining…
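A function along these lines does the job on Linux. This is my reconstruction using the GNU `df` utility, not necessarily the original helper:

```r
# Free disk space (in GB) on the partition that `path` lives on.
# Relies on GNU df, so this sketch is Linux-only (e.g. the Ubuntu setup above).
free_disk_gb <- function(path = ".") {
  out <- system2("df", c("-k", "--output=avail", path), stdout = TRUE)
  as.numeric(out[2]) / 1024^2   # df -k reports 1-KiB blocks
}
```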
… and used this function to occasionally check whether enough space is available, and to stop the script if it is not.
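The check itself can be as simple as this sketch, where the 5 GB threshold and the function names are my own choices; `get_free_gb` is any function that returns the remaining space in gigabytes, such as the disk-space helper described above:

```r
# Stop the script cleanly when free disk space drops below `min_gb`.
stop_if_disk_full <- function(get_free_gb, min_gb = 5) {
  free <- get_free_gb()
  if (free < min_gb) {
    stop(sprintf("Only %.1f GB of disk space left - stopping the simulation.",
                 free))
  }
  invisible(free)
}

# Called every so often inside the simulation loop, e.g.:
# if (i %% 100 == 0) stop_if_disk_full(free_disk_gb)
```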
This, together with a bash script that restarts the R script if it doesn’t finish successfully, solved the remaining issues.
Making it extra robust with this bash script
To avoid crashes because my computer ran out of disk space (see above), I created this bash script (rscript_robust.sh), which automatically restarts the R script in case it encounters an error. It can be used as a robust alternative to Rscript.
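A minimal version of such a wrapper could look like this (the 10-second back-off between restarts is my own choice, not necessarily what the original script uses):

```shell
#!/bin/bash
# rscript_robust.sh -- keep restarting an R script until it exits cleanly.
# Usage: ./rscript_robust.sh my_simulation.R

run_until_success() {
  # rerun "$@" until it returns exit status 0
  until "$@"; do
    echo "Exit status $? - restarting in ${RETRY_DELAY:-10} s..." >&2
    sleep "${RETRY_DELAY:-10}"
  done
}

if [ $# -ge 1 ]; then
  run_until_success Rscript "$1"
fi
```

Because the R script saves its progress and loads it again on start-up, every restart simply continues where the previous run stopped.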