One of the frustrations I have with using the R programming language for data
analysis is that I miss the virtual environments I had in Python. Python has
great virtual environments built in, and conda comes with a really good system
as well.
When I got to OPIA, I started to use R (because aside from SAS, that is the other software that people use). It was a bit of a rough transition, but I am getting to the point where I can understand the difference.
As an aside, I will say that while there were some switching costs from Python to R, I was able to get up and running with R really quickly. I was doing actual work with R right away. I won't say that either one is better, but I think it is absolutely worth learning both if you:
- work with people who use the other system or
- have an interest in ways that data analysis can be expressed in code.
While using Python, I got used to working with virtual environments. So, when using R, just having a big pile of installed packages drove me nuts. It just feels like I'm not worrying about reproducibility enough.
So, this weekend I resolved to set up Docker on my home computer and see
if I could get R working on it.
I can quite happily report that I got it working.
What's more, the image that I was using includes LaTeX,
so I could create PDFs with knitr right out of the box.
That is a major advantage, since I was having problems with getting LaTeX
to run on my Windows machine at work.
On Tuesday (Monday is a holiday), I will get Docker running on my new
Windows computer at work.
(I wanted a Mac, but it didn't seem to be super compatible with the systems
I use day to day.)
If that works smoothly, then I will have a nice way to run all my knitr
reports in HTML and PDF.
So, here's what I discovered.
- The `rocker/verse` image includes RStudio.
- You run RStudio Server off the Docker image, so you access RStudio in your web browser.
- You can create a `Dockerfile` that takes an existing image and adds things to it using the following commands:
  - `FROM` - other Docker images
  - `RUN` - run a command (including an R command), which can be used to install other packages.
  - `ADD` - to copy data files into the container.
- Once you create the `Dockerfile`, you use `docker build` to create the image.
- Once you have the image, use `docker run` to start it. Then you can go to RStudio from your web browser.
- You can mount volumes in your `docker run` commands, so you can have your R code as a part of that.
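
To make the steps above concrete, here is a rough sketch of what such a Dockerfile might look like. It is not the exact file I used. The `latest` tag, the `janitor` package, the local `data/` directory, the `my-r-env` image name, and the `changeme` password are all placeholders I made up for illustration. The build and run commands are included as comments and assume a Unix-like shell; the volume path syntax may need adjusting on Windows.

```dockerfile
# Hypothetical Dockerfile: extend an existing image with extra packages and data.
# Assumption: the base image is rocker/verse; pin a version tag instead of
# "latest" if you want the build to be repeatable.
FROM rocker/verse:latest

# RUN executes a command at build time; here, an R command that installs a
# package. "janitor" is only a placeholder package name.
RUN Rscript -e "install.packages('janitor', repos = 'https://cloud.r-project.org')"

# ADD copies files from the build context into the image.
# Assumption: a local data/ directory exists next to this Dockerfile.
ADD data/ /home/rstudio/data/

# Build the image (run in the directory containing this Dockerfile):
#   docker build -t my-r-env .
#
# Start a container, publish RStudio Server on port 8787, and mount the
# current directory so R code on the host is visible inside the container:
#   docker run --rm -p 8787:8787 -e PASSWORD=changeme -v "$(pwd)":/home/rstudio/project my-r-env
#
# Then open http://localhost:8787 in a browser and log in as user "rstudio".
```

Mounting the code as a volume, rather than copying it in with `ADD`, means edits made in RStudio land back on the host, while the installed packages stay fixed inside the image.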