rix: reproducible development environments for R programmers

Bruno Rodrigues

What is R?

  • GNU R is a multi-paradigm programming language (first released 1993)
  • FLOSS implementation of the S language developed at Bell Labs
  • Statistics, machine learning and data science
  • Great for visualisation tasks as well
  • Built-in objects: data frames, matrices, formulas, models…
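The built-in objects listed above can be sketched in a few lines of base R. This is a minimal illustration, not part of the original slides; the variable names and toy values are made up for the example.

```r
# A data frame: a table whose columns may hold different types
df <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

# A matrix: a two-dimensional array holding a single type
m <- matrix(1:6, nrow = 2, ncol = 3)

# A formula: a symbolic model description, stored unevaluated
f <- y ~ x

# A model: lm() fits a linear regression and returns an object of class "lm"
fit <- lm(f, data = df)

# Inspect the class of each object
class(df)
class(m)
class(f)
class(fit)
```

No extra packages are needed: all four object types ship with a stock R installation.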

Hello World!

Regressions:

life_cycle_savings <- read.csv("life_cycle_savings.csv")

fm1 <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = life_cycle_savings)

summary(fm1)

Data manipulation/cleaning:

library(dplyr)
library(tidyr)

starwars %>%
  filter(skin_color == "light") %>%
  select(species, sex, mass) %>%
  group_by(sex, species) %>%
  summarise(
    total_individuals = n(),
    mean_mass = mean(mass, na.rm = TRUE),
    sd_mass = sd(mass, na.rm = TRUE),
    .groups = "drop"
  )

CRAN and Bioconductor

  • CRAN is the repository of R packages to extend the language
  • As of writing, 20331 packages available
  • Bioconductor: repository with a focus on bioinformatics, 2266 more packages
  • All available through nixpkgs in the rPackages set!

Per-project environments

  • In R, per-project environments are not really a thing
  • Existing tools focus on providing a per-project library of packages
  • Docker if reproducibility is required (see Rocker project)
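As a rough sketch of what a per-project library amounts to, base R's `.libPaths()` can place a project-specific directory first on the package search path. The directory name below is hypothetical, and a temporary folder is used only so the example has no side effects. This isolates packages only: the R version and system libraries are still shared, which is the gap the bullets above point at.

```r
# Hypothetical project-local library directory (name and location are assumptions)
project_lib <- file.path(tempdir(), "my_project", "library")
dir.create(project_lib, recursive = TRUE, showWarnings = FALSE)

# Put the project library first on the search path for this session
.libPaths(c(project_lib, .libPaths()))

# install.packages() now installs into project_lib by default,
# and library() looks there first
.libPaths()[1]
```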

rix 1/2

  • Per-project reproducible environments with {rix}.
  • Simply use the provided rix() function:
library(rix)

rix(r_ver = "4.3.1",
    r_pkgs = c("dplyr", "ggplot2"),
    system_pkgs = NULL,
    git_pkgs = NULL,
    tex_pkgs = NULL,
    ide = "rstudio",
    project_path = ".")

rix 2/2

  • Also possible to evaluate a single function inside a “subshell”:
out_nix <- with_nix(
  expr = function() my_function(...),
  program = "R",
  exec_mode = "non-blocking", # run as background process
  project_path = path_to_subshell.nix,
  message_type = "simple" # you can do `"verbose"`, too
)
  • Works from R installed via Nix or not!

To know more

https://b-rodrigues.github.io/rix/