Debugging and remedying installation issues for R packages on Code Ocean

How to diagnose and remedy R installation issues:

First, one general piece of advice is to add an MRAN snapshot from a day when you know that a package was installed successfully. This is fairly simple on Code Ocean.

If you are still getting installation errors, you may need to dig into the required system-level dependencies. These are typically installed with apt-get, the package manager available in each capsule's environment on Code Ocean.

The rest of this article walks through that process using geojsonio, a CRAN package that converts "data to 'GeoJSON' or 'TopoJSON' from various R classes, including vectors, lists, data frames, shape files, and spatial classes."

Interpreting error messages:

  • Let's say you start a new capsule, select a base environment of R 3.6 (the underlying operating system for which is Ubuntu:18.04), and try to install geojsonio. Your build will fail with the following message:

configure: error: gdal-config not found or not executable.
ERROR: configuration failed for package ‘sf’
* removing ‘/usr/local/lib/R/site-library/sf’
Error in i.p(...) :   
    (converted from warning) installation of one or more packages failed,  
    probably ‘jqr’, ‘protolite’, ‘rgdal’, ‘rgeos’, ‘V8’, ‘geojson’, ‘sf’
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted
  • Figuring out how to parse this for the information that you need can take some practice. For starters, you might try entering the first line into a search engine.

  • Doing so will lead to a Stack Overflow post suggesting that you need an apt-get package called libgdal-dev, which provides the development files for GDAL, the Geospatial Data Abstraction Library.

  • Add libgdal-dev to the capsule's apt-get packages and re-run, and you'll get a different, more verbose error:

------------------------- ANTICONF ERROR ---------------------------
Configuration failed because  was not found. Try installing: 
    * deb: libv8-dev or libnode-dev (Debian / Ubuntu) 
    * rpm: v8-devel (Fedora, EPEL) 
    * brew: v8 (OSX) 
    * csw: libv8_dev (Solaris)
To use a custom libv8, set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------------------------------------------------
ERROR: configuration failed for package ‘V8’
* removing ‘/usr/local/lib/R/site-library/V8’
cat: geojson.out: No such file or directory
Error in i.p(...) :   
    (converted from warning) installation of one or more packages failed,  
    probably ‘jqr’, ‘protolite’, ‘V8’, ‘geojson’
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
  • Note: "ANTICONF ERROR" is your friend. This class of errors typically tells you the name of the package you need. Because each base environment tells you which Linux variant is running (in this example, Ubuntu:18.04), you now know that you need libv8-dev or libnode-dev. (If you search for these packages, you'll discover that libnode-dev is not available for Ubuntu:18.04, so you need libv8-dev.)

  • When you add libv8-dev and try to reinstall, you'll get another ANTICONF ERROR telling you to add libprotobuf-dev.

  • Rinse and repeat, and you'll get an ANTICONF ERROR about libjq-dev.

  • Rinse and repeat once more, and you'll get: Please install the 'protobuf-compiler' package for your system (the apt-get package really is called protobuf-compiler).
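
The "deb:" line of an ANTICONF ERROR can also be pulled out of a saved build log mechanically. The following is a minimal sketch, assuming the log text is held in a shell variable; the variable names build_log and deb_candidates are hypothetical, and the log excerpt is copied from the V8 failure shown above.

```shell
# Excerpt of the V8 build failure from the walkthrough above.
build_log='------------------------- ANTICONF ERROR ---------------------------
Configuration failed because  was not found. Try installing: 
    * deb: libv8-dev or libnode-dev (Debian / Ubuntu) 
    * rpm: v8-devel (Fedora, EPEL) 
    * brew: v8 (OSX) 
    * csw: libv8_dev (Solaris)'

# Keep only the "deb:" line, then strip the label and the trailing
# "(Debian / Ubuntu)" note so that just the candidate names remain.
deb_candidates=$(printf '%s\n' "$build_log" \
    | grep -E '^[[:space:]]*\* deb:' \
    | sed -E 's/^[[:space:]]*\* deb:[[:space:]]*//; s/[[:space:]]*\(.*$//')

echo "$deb_candidates"
```

Each candidate still has to be checked against the capsule's Ubuntu release, for example with apt-cache policy followed by the package name, since not every suggested package exists for every release.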

And that should do it. So the needed apt-get packages, on top of those already installed into the R 3.6 environment, are libgdal-dev libjq-dev libprotobuf-dev libv8-dev protobuf-compiler.
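
For reference, the equivalent installation from a shell might look like the sketch below. It assumes root access in an Ubuntu-based environment; the DRY_RUN switch and the variable names are hypothetical conveniences, not Code Ocean features, and by default the script only prints the command it would run.

```shell
# System packages identified in the walkthrough above.
APT_PACKAGES="libgdal-dev libjq-dev libprotobuf-dev libv8-dev protobuf-compiler"

# Build the install command once so it can be printed or executed.
install_cmd="apt-get install -y --no-install-recommends $APT_PACKAGES"

# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to actually install.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$install_cmd"
else
    # Word splitting of $install_cmd is intentional here.
    apt-get update && $install_cmd
fi
```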

Wow, that was horrible! Can you help me avoid this pain?

Yes, we are really good at dealing with such things. If you let us know that you are having installation issues via email to support@codeocean.com, we will be glad to help.

Why was this so hard?

A variety of interrelated reasons.

  • Code Ocean is based on Linux containers, and capsules typically start from an Ubuntu:16.04 or Ubuntu:18.04 operating system.

  • R packages are often written not entirely in R but partly in lower-level compiled languages, most typically C, C++, and Fortran, for speed.

  • On Mac or Windows, CRAN offers ‘precompiled’ libraries, which means that any low-level C/C++/Fortran source code bundled with the package has already been built into an executable (i.e. translated into machine code) appropriate for that platform -- so these packages tend to download and install quickly.

  • But on Linux, and therefore generally on any container-based platform, it ain’t so, because the Linux ecosystem comprises not only different versions (as with macOS El Capitan vs. Catalina) but also many distributions that aren't fully compatible with one another. For example, if you compile C code on an Ubuntu Linux distribution and share that compiled executable with a friend who uses Fedora, there’s a reasonable chance that you and your friend have different C compilers, which, for a bunch of complicated reasons, means that when they try to load and use your package, it will not work. This makes creating binaries for all the possible permutations of distributions and versions a daunting task.

  • So R packages for Linux are typically distributed as source code, which then needs to be compiled on your system at install time. This means that all the necessary compilers, headers, libraries, etc. must already be installed.

  • R packages may also bring in other R dependencies. On a fresh R instance with geojsonio installed, if you run library(geojsonio); sessionInfo(), you'll see, inter alia:

other attached packages:
[1] geojsonio_0.7.0

loaded via a namespace (and not attached): 
[1] Rcpp_1.0.2         magrittr_1.5       maptools_0.9-5     units_0.6-4        
[5] lattice_0.20-38    R6_2.4.0           httr_1.4.1         tools_3.6.0        
[9] rgdal_1.4-4        parallel_3.6.0     grid_3.6.0         geojson_0.3.2     
[13] KernSmooth_2.23-15 e1071_1.7-2        DBI_1.0.0          jqr_1.1.0         
[17] rgeos_0.5-1        class_7.3-15       lazyeval_0.2.2     sf_0.7-7          
[21] curl_4.0           sp_1.3-1           V8_2.3             compiler_3.6.0    
[25] classInt_0.4-1     jsonlite_1.6       foreign_0.8-70    
  • This means that geojsonio relies on functions from 27 separate packages, each of which needs to be installed, and each of which may depend on other R packages, and system-level dependencies, and so on. As the authors of Tinyverse phrase it, "[e]very dependency you add to your project is an invitation to break your project."
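
Because source installs need compilers and helper programs to be present already, it can help to check for them before starting a build. The sketch below is one way to do that, assuming a POSIX shell; the function name missing_tools is hypothetical, and the list of tools is an assumption drawn from this walkthrough (gdal-config from the first error, protoc as the program shipped by protobuf-compiler, plus the usual C, C++, and Fortran compilers).

```shell
# Print each tool from the argument list that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools assumed relevant to installing geojsonio from source.
missing_tools gcc g++ gfortran gdal-config protoc
```

An empty result means every listed tool was found. It does not guarantee that the matching headers and libraries are installed, so a build can still fail for other reasons.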
