Using GitHub actions with R: Some notes from our #TidyTuesday setup

Beginning in September 2019 we started organizing a #TidyTuesday event at the hacklab in Skopje - KIKA. It has been 15 events in 2019, and it’s been great. One of the things we wanted to do from the outset was to have an online repository of all the code and presentations. Both to have a record or activities for ourselves and to share our work with other R enthusiasts. The goal was to have a GitHub repository to house the code, and then on a separate branch to have the .Rmd files used for presentations rendered into html so that they would be browsable on the Internet.

For the first few events, I did this by hand. I pushed .Rmd on the GitHub repository, and a corresponding html on a gh-pages branch. But this fall, GitHub enabled actions for all users, so we thought it would be good to automate the gh-pages publishing.

The idea was good, but we hit some bumps in the implementation.

Damjan configured the workflow to run on Ubuntu latest (18.04). However, Ubuntu doesn’t have CRAN packages by default and installing them from apt or from CRAN was too slow.

This made us look in another direction. We found a tidyverse Docker container that would solve issues with the installation needed R packages. It almost looked like a win, only to see that some of the .Rmd files load libraries that are not part of the tidyverse Docker container. Adding more dependencies to the workflow is possible, but dependencies have dependencies and that slows down the process. Currently it takes about 12 minutes to deploy the gh-pages branch. It is possible that this will take longer in the future if the repository grows with new notebooks and libraries needed in them.

The current setup, firstly installs several Ubuntu packages through the main.yml configuration.

Afterwards it builds all the R packages needed to render the Rmd files. We do this with:

Rscript -e 'install.packages(read.table("Rdependencies", colClasses = "character")[,1])'

Another thing to be aware of is that the Rmd files must succesfuly render, otherwise the CI will fail. So it is a good practice to check that before pushing the file to GitHub.

The GitHub repository has the code from most, but not all 15 TidyTuesday events at KIKA. Our GitHub actions setup is also maintained in case other R users want to try it and perhaps improve it.

Novica Nakov
Novica Nakov

Data Wrangler.