Piping R Code with Magrittr


magrittr (the R package) is named for the surrealist painter René Magritte. His painting "The Treachery of Images" is self-captioned, "This is not a pipe." The work is now owned by and exhibited at the Los Angeles County Museum of Art.

Summary

I’ve been piping code with dplyr for several years now. But because I use it so often, I thought it was time for a refresher and some renewed investigation. Sometimes, revisiting a topic can help break me out of a programming funk. And sure enough, I discovered some new methods.

Introduction

The magrittr package allows for the piping of R code. According to vignette("magrittr"), its name is to be given a strong French pronunciation. The package is named, tongue in cheek, after the artist René Magritte. Just as Magritte’s painting of a pipe is not an actual pipe, neither is magrittr’s operator %>% an actual pipe. Rather, it is a convenient way to write code from left to right without nesting functions or creating temporary variables. Nested function calls must be read from the inside out, and temporary variables clutter up the global environment. Using magrittr, f(x) is equivalent to x %>% f() and to x %>% f(.), where the dot “.” is the placeholder for “x.”

# Load the library
library(magrittr)
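
To make the equivalence concrete, here is a minimal sketch comparing the nested, temporary-variable, and piped forms of the same computation. The vector x and the chosen functions are illustrative assumptions, not taken from the vignette.

```r
library(magrittr)

x <- c(1, 4, 9)

# Nested: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Temporary variables: readable, but they linger in the environment
roots <- sqrt(x)
avg <- mean(roots)
stepwise <- round(avg, 1)

# Piped: read from left to right
piped <- x %>% sqrt() %>% mean() %>% round(1)

# Explicit placeholder form, x %>% f(.)
piped_dot <- x %>% sqrt(.) %>% mean(.) %>% round(., 1)

identical(nested, piped)  # TRUE
```

All four forms give the same value; only the reading order and the number of leftover objects differ.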

Assignment

Some disfavor the use of magrittr for assignment, preferring the traditional <-.

# More common way to assign
x <- 10 %>% divide_by(2)
print(x)
[1] 5

# Less common: assign by name through the pipe
env <- environment()
"x" %>% assign(5, envir = env) %>% print
[1] 5

When to use

Use pipes when you have (1) fewer than 10 steps, (2) a single input and a single output, and (3) simple dependencies between the steps.

Argument Placeholder

From the magrittr tidyverse page cited below, here are two examples of using the “.” as an argument placeholder:

  • x %>% f(y, .) is equivalent to f(y, x)
  • x %>% f(y, z = .) is equivalent to f(y, z = x)
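
As a small illustration of the two rules above (the values are made up for the example), the dot can fill a later positional argument or a named argument:

```r
library(magrittr)

# x %>% f(y, .) is f(y, x): the piped value becomes the second argument
second <- "world" %>% paste("hello", .)

# x %>% f(y, z = .) is f(y, z = x): the piped value fills a named argument
named <- 3 %>% rep("ab", times = .)

second  # "hello world"
named   # "ab" "ab" "ab"
```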

Operators

magrittr provides four pipe operators. The only one that I’ve used consistently is the first.

%>%

# As described above
10 %>% divide_by(5)
[1] 2

%T>%

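Also referred to as the “tee pipe,” %T>% returns its left-hand side rather than the result of the right-hand side. It is helpful for inspecting intermediate output in a series of chained commands, or for calling functions such as plot() that are used for their side effects.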

# With %>%, plot() returns NULL, so str() receives NULL
rnorm(100) %>%
  matrix(ncol = 2) %>%
  plot() %>%
  str()

 NULL

# With %T>%, the matrix passes through plot() and on to str()
rnorm(100) %>%
  matrix(ncol = 2) %T>%
  plot() %>%
  str()

 num [1:50, 1:2] 0.1456 -0.5066 0.827 0.0585 0.2557 ...
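
The same idea applies to any function called for its side effect. A minimal sketch with an illustrative vector: cat() prints its input but returns NULL, so only the tee pipe lets the sorted values reach sum().

```r
library(magrittr)

values <- c(3, 1, 2)

# cat() returns NULL, so a plain pipe hands NULL to sum()
without_tee <- values %>% sort() %>% cat("\n") %>% sum()

# The tee pipe runs cat() for its side effect and passes the sorted vector on
with_tee <- values %>% sort() %T>% cat("\n") %>% sum()

without_tee  # 0
with_tee     # 6
```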

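%$%

The exposition pipe %$% exposes the names within the left-hand side to the expression on the right, which is useful for functions that have no data argument. The block below, taken from the magrittr vignette, does not itself use %$%; it builds the car_data object that later sections reuse.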

car_data <- 
        mtcars %>%
        subset(hp > 100) %>%
        aggregate(. ~ cyl, data = ., FUN = . %>% mean %>% round(2)) %>%
        transform(kpl = mpg %>% multiply_by(0.4251)) %>%
        print
  cyl   mpg   disp     hp drat   wt  qsec   vs   am gear carb       kpl
1   4 25.90 108.05 111.00 3.94 2.15 17.75 1.00 1.00 4.50 2.00 11.010090
2   6 19.74 183.31 122.29 3.59 3.12 17.98 0.57 0.43 3.86 3.43  8.391474
3   8 15.10 353.10 209.21 3.23 4.00 16.77 0.00 0.14 3.29 3.50  6.419010
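
Since the block above relies on %>% alone, here is a minimal sketch of %$% itself. cor() has no data argument, so %$% is used to expose the columns of mtcars by name; the choice of mpg and wt is only illustrative.

```r
library(magrittr)

# %$% exposes the columns of mtcars to the right-hand expression
r_exposed <- mtcars %$% cor(mpg, wt)

# Equivalent base R call
r_base <- cor(mtcars$mpg, mtcars$wt)

all.equal(r_exposed, r_base)  # TRUE
```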

%<>%

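The assignment pipe %<>% pipes the left-hand side through the chain and assigns the result back to it. Its use is disfavored.

# data(LakeHuron): annual lake level, in feet
# Note: this replaces LakeHuron in the workspace with its first three values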
LakeHuron %<>% head(3) %>% print
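
A minimal sketch on a throwaway vector (v and w are illustrative names) shows that %<>% is shorthand for piping and then assigning back with <-, without overwriting a built-in dataset:

```r
library(magrittr)

v <- c(3, 1, 2)
v %<>% sort()      # same effect as v <- v %>% sort()

w <- c(3, 1, 2)
w <- w %>% sort()  # the traditional, more explicit form

identical(v, w)  # TRUE
```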

Functions

A function can be created by piping: a pipeline that begins with the dot placeholder, rather than with a value, returns a function.

Unary Functions

# Unary functions
f <- . %>% head(3)
chickwts %>% f(.)
  weight      feed
1    179 horsebean
2    160 horsebean
3    136 horsebean
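
A pipeline like this can chain several steps and be reused like any other function. A minimal sketch, where top3 and the input vector are illustrative:

```r
library(magrittr)

# A pipeline that starts with the dot is a function of one argument
top3 <- . %>% sort(decreasing = TRUE) %>% head(3)

top3(c(5, 1, 9, 3, 7))       # 9 7 5
c(5, 1, 9, 3, 7) %>% top3()  # same result when piped
```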

Lambda Functions

Functions can be defined and executed within piped code.

Long-hand Notation

car_data %>%
(function(x) {
  if (nrow(x) > 2) 
    rbind(head(x, 1), tail(x, 1))
  else x
})

Shorthand Notation

# Braces create a lambda; the dot stands for the piped value
car_data %>%
{ 
  if (nrow(.) > 2)
    rbind(head(., 1), tail(., 1))
  else .
}

Aliases

Aliases can greatly improve the readability of your code. They are listed on the help page ?magrittr::extract. For example, I’m frequently converting numbers to percentages.

.345 %>% multiply_by(100) %>% round(2) %>% paste0("%")
[1] "34.5%"
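
Each alias stands in for an operator that would otherwise be awkward inside a pipeline. A short sketch pairing a few aliases with their base R equivalents, plus the percentage conversion wrapped as a reusable function (to_pct is a hypothetical helper name):

```r
library(magrittr)

# extract2() is `[[` and extract() is `[`
first_three <- mtcars %>% extract2("mpg") %>% extract(1:3)
identical(first_three, mtcars[["mpg"]][1:3])  # TRUE

# add() is `+`, multiply_by() is `*`, equals() is `==`
check <- 2 %>% add(3) %>% multiply_by(4) %>% equals(20)

# The percentage pipeline as a reusable function
to_pct <- . %>% multiply_by(100) %>% round(2) %>% paste0("%")
to_pct(c(0.345, 0.5))  # "34.5%" "50%"
```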

Examples

mtcars

There are at least two things in the code below that are not obvious, at least to me. First, the piping operator %>% is used inside function arguments, not just at the end of a line. Second, the aggregate function takes a unary function built from a pipeline, FUN = . %>% mean %>% round(2).

# magrittr vignette
car_data <- 
  mtcars %>%
  subset(hp > 100) %>%
  aggregate(. ~ cyl, data = ., FUN = . %>% mean %>% round(2)) %>%
  transform(kpl = mpg %>% multiply_by(0.4251)) %>%
  print
  cyl   mpg   disp     hp drat   wt  qsec   vs   am gear carb       kpl
1   4 25.90 108.05 111.00 3.94 2.15 17.75 1.00 1.00 4.50 2.00 11.010090
2   6 19.74 183.31 122.29 3.59 3.12 17.98 0.57 0.43 3.86 3.43  8.391474
3   8 15.10 353.10 209.21 3.23 4.00 16.77 0.00 0.14 3.29 3.50  6.419010
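
# A dplyr version; set_colnames() is a magrittr alias
library(dplyr)
mtcars %>%
        tibble(.) %>%
        tidyr::drop_na() %>% 
        # tibble() drops the row names, so restore them as a column
        mutate(type = rownames(mtcars)) %>%
        # Drop models that start with "A" or contain an uppercase "L"
        filter(!grepl('^A|L', type)) %>%
        filter(am == 0) %>%
        select(cyl, mpg) %>%
        group_by(cyl) %>%
        summarize(avg_mpg = mpg %>% mean() %>% round(0)) %>%
        arrange(-cyl) %>%
        set_colnames(c("cylinder", "avg_mpg"))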
# A tibble: 3 x 2
  cylinder avg_mpg
     <dbl>   <dbl>
1        8      15
2        6      19
3        4      23
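
The pipeline passed to FUN in the aggregate() call above is easier to see in isolation. As a minimal sketch (avg2 and avg2_explicit are illustrative names), it behaves like an ordinary one-argument function:

```r
library(magrittr)

# The pipeline from the aggregate() call, stored as a function
avg2 <- . %>% mean %>% round(2)

# An ordinary function that does the same thing
avg2_explicit <- function(v) round(mean(v), 2)

by_pipe <- aggregate(mpg ~ cyl, data = mtcars, FUN = avg2)
by_fun  <- aggregate(mpg ~ cyl, data = mtcars, FUN = avg2_explicit)

all.equal(by_pipe, by_fun)  # TRUE
```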

starwars

The starwars dataset ships with dplyr; load it with data(starwars). The examples are from vignette("dplyr").

starwars %>% filter(skin_color == "light", eye_color == "brown")
# A tibble: 7 x 14
  name     height  mass hair_color skin_color eye_color birth_year sex    gender
  <chr>     <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>  <chr> 
1 Leia Or…    150    49 brown      light      brown             19 female femin…
2 Biggs D…    183    84 black      light      brown             24 male   mascu…
3 Cordé       157    NA brown      light      brown             NA female femin…
4 Dormé       165    NA brown      light      brown             NA female femin…
5 Raymus …    188    79 brown      light      brown             NA male   mascu…
6 Poe Dam…     NA    NA brown      light      brown             NA male   mascu…
7 Padmé A…    165    45 brown      light      brown             46 female femin…
# … with 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>

library(kableExtra)
starwars %>%
        select(name, height, mass, birth_year, sex, species, homeworld) %>%
        filter(species == "Droid") %>%
        arrange(desc(height)) %>%
        slice_head(n = 4) %>%
        kbl()

name    height   mass   birth_year   sex    species   homeworld
IG-88      200    140           15   none   Droid     NA
C-3PO      167     75          112   none   Droid     Tatooine
R5-D4       97     32           NA   none   Droid     Tatooine
R2-D2       96     32           33   none   Droid     Naboo

babynames

This example was taken from an R-bloggers post written by Stefan Milton Bache, the author of the magrittr package. The post is included in the acknowledgements.

library(babynames)
library(dplyr)    # filter, group_by, summarize
library(ggplot2)  # qplot, ggtitle
babynames %>% 
    filter(name %>% substr(1, 3) %>% equals("Ste")) %>% 
    group_by(year, sex) %>% 
    summarize(total = sum(n)) %>%
    qplot(year, total, color = sex, data = ., geom = "line") %>%
    add(ggtitle('Names starting with "Ste"')) %>% 
    print

Conditional Piping

Making the flow of the pipe conditional was the subject of a Stack Overflow question.

x <- 1
y <- TRUE
x %>% 
add(1) %>% 
 {if(y) add(.,1) else .}
[1] 3

# purrr::when() returns the value for the first condition that is TRUE
library(purrr)
1:10 %>%
  when(
    sum(.) <=   x ~ sum(.),
    sum(.) <= 2*x ~ sum(.)/2,
    ~ 0,
    x = 60
  )
[1] 55
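
The brace pattern also works inside a function, which keeps the condition out of the calling code. A minimal sketch, where maybe_round and its do_round argument are hypothetical names:

```r
library(magrittr)

# Round only when asked; otherwise pass the value through unchanged
maybe_round <- function(v, do_round = TRUE) {
  v %>%
    sqrt() %>%
    { if (do_round) round(., 1) else . }
}

maybe_round(2)                    # 1.4
maybe_round(2, do_round = FALSE)  # 1.414214
```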

Conclusion

It appears that base R has included its own pipe, |>, in a development version, which may reduce the need for magrittr’s %>% in future R versions. This was announced at the 2020 useR! Conference. Piping is used widely in many languages, and one could reasonably expect that, aside from syntax, the base |> pipe would function similarly to magrittr’s %>% for simple chains. The base pipe does not replace the other magrittr operators covered above, such as %T>%, %$%, and %<>%. Although to be clear, it’s not a real pipe either! Happy piping!
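
For a simple chain, the two pipes can be compared directly. This sketch assumes R 4.1.0 or later, where |> is available; it will not parse on the R 3.6.3 session listed under Reproducibility.

```r
library(magrittr)

x <- c(1, 4, 9)

with_magrittr <- x %>% sqrt() %>% sum()
with_base     <- x |> sqrt() |> sum()  # requires R >= 4.1.0

identical(with_magrittr, with_base)  # TRUE
```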

Acknowledgements

This blog post was made possible thanks to:

The inimitable Hadley Wickham and another of his books, “R for Data Science.” (Specifically, Chapter 18–“Pipes.”)

magrittr: part of the tidyverse

Simplify Your Code with %>%

Simpler R coding with pipes > the present and future of the magrittr package

Disclaimer

The views, analysis and conclusions presented within this article are the author’s alone and not those of any other person, organization or government entity. While I have made every reasonable effort to ensure that the information in this article is correct, it will nonetheless contain errors, inaccuracies and inconsistencies. It is a working paper subject to revision without notice as additional information becomes available. Any liability is disclaimed as to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause. The author received no financial support for the research, authorship, and/or publication of this article.

Reproducibility

─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 3.6.3 (2020-02-29)
 os       macOS Catalina 10.15.7      
 system   x86_64, darwin15.6.0        
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/Chicago             
 date     2021-05-11                  

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version date       lib source        
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.0)
 blogdown    * 1.3     2021-04-14 [1] CRAN (R 3.6.2)
 bookdown      0.21    2020-10-13 [1] CRAN (R 3.6.3)
 bslib         0.2.4   2021-01-25 [1] CRAN (R 3.6.2)
 cachem        1.0.4   2021-02-13 [1] CRAN (R 3.6.2)
 callr         3.5.1   2020-10-13 [1] CRAN (R 3.6.2)
 cli           2.3.1   2021-02-23 [1] CRAN (R 3.6.3)
 codetools     0.2-18  2020-11-04 [1] CRAN (R 3.6.2)
 colorspace    2.0-0   2020-11-11 [1] CRAN (R 3.6.2)
 crayon        1.4.1   2021-02-08 [1] CRAN (R 3.6.2)
 DBI           1.1.1   2021-01-15 [1] CRAN (R 3.6.2)
 desc          1.3.0   2021-03-05 [1] CRAN (R 3.6.3)
 devtools    * 2.3.2   2020-09-18 [1] CRAN (R 3.6.2)
 digest        0.6.27  2020-10-24 [1] CRAN (R 3.6.2)
 dplyr       * 1.0.5   2021-03-05 [1] CRAN (R 3.6.3)
 ellipsis      0.3.1   2020-05-15 [1] CRAN (R 3.6.2)
 evaluate      0.14    2019-05-28 [1] CRAN (R 3.6.0)
 fansi         0.4.2   2021-01-15 [1] CRAN (R 3.6.2)
 fastmap       1.1.0   2021-01-25 [1] CRAN (R 3.6.2)
 fs            1.5.0   2020-07-31 [1] CRAN (R 3.6.2)
 generics      0.1.0   2020-10-31 [1] CRAN (R 3.6.2)
 ggplot2     * 3.3.3   2020-12-30 [1] CRAN (R 3.6.2)
 ggthemes    * 4.2.4   2021-01-20 [1] CRAN (R 3.6.2)
 glue          1.4.2   2020-08-27 [1] CRAN (R 3.6.2)
 gtable        0.3.0   2019-03-25 [1] CRAN (R 3.6.0)
 highr         0.8     2019-03-20 [1] CRAN (R 3.6.0)
 htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 3.6.2)
 httr          1.4.2   2020-07-20 [1] CRAN (R 3.6.2)
 jquerylib     0.1.3   2020-12-17 [1] CRAN (R 3.6.2)
 jsonlite      1.7.2   2020-12-09 [1] CRAN (R 3.6.2)
 kableExtra  * 1.3.4   2021-02-20 [1] CRAN (R 3.6.3)
 knitr         1.32    2021-04-14 [1] CRAN (R 3.6.2)
 lifecycle     1.0.0   2021-02-15 [1] CRAN (R 3.6.2)
 magrittr    * 2.0.1   2020-11-17 [1] CRAN (R 3.6.2)
 memoise       2.0.0   2021-01-26 [1] CRAN (R 3.6.2)
 munsell       0.5.0   2018-06-12 [1] CRAN (R 3.6.0)
 pillar        1.5.1   2021-03-05 [1] CRAN (R 3.6.3)
 pkgbuild      1.2.0   2020-12-15 [1] CRAN (R 3.6.2)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 3.6.0)
 pkgload       1.2.0   2021-02-23 [1] CRAN (R 3.6.3)
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 3.6.0)
 processx      3.4.5   2020-11-30 [1] CRAN (R 3.6.2)
 ps            1.6.0   2021-02-28 [1] CRAN (R 3.6.3)
 purrr       * 0.3.4   2020-04-17 [1] CRAN (R 3.6.2)
 R6            2.5.0   2020-10-28 [1] CRAN (R 3.6.2)
 remotes       2.2.0   2020-07-21 [1] CRAN (R 3.6.2)
 rlang         0.4.10  2020-12-30 [1] CRAN (R 3.6.2)
 rmarkdown     2.7     2021-02-19 [1] CRAN (R 3.6.3)
 rprojroot     2.0.2   2020-11-15 [1] CRAN (R 3.6.2)
 rstudioapi    0.13    2020-11-12 [1] CRAN (R 3.6.2)
 rvest         0.3.6   2020-07-25 [1] CRAN (R 3.6.2)
 sass          0.3.1   2021-01-24 [1] CRAN (R 3.6.2)
 scales        1.1.1   2020-05-11 [1] CRAN (R 3.6.2)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.0)
 stringi       1.5.3   2020-09-09 [1] CRAN (R 3.6.2)
 stringr       1.4.0   2019-02-10 [1] CRAN (R 3.6.0)
 svglite       2.0.0   2021-02-20 [1] CRAN (R 3.6.3)
 systemfonts   1.0.1   2021-02-09 [1] CRAN (R 3.6.2)
 testthat      3.0.2   2021-02-14 [1] CRAN (R 3.6.2)
 tibble        3.1.0   2021-02-25 [1] CRAN (R 3.6.3)
 tidyr         1.1.3   2021-03-03 [1] CRAN (R 3.6.3)
 tidyselect    1.1.0   2020-05-11 [1] CRAN (R 3.6.2)
 usethis     * 2.0.1   2021-02-10 [1] CRAN (R 3.6.2)
 utf8          1.1.4   2018-05-24 [1] CRAN (R 3.6.0)
 vctrs         0.3.6   2020-12-17 [1] CRAN (R 3.6.2)
 viridisLite   0.3.0   2018-02-01 [1] CRAN (R 3.6.0)
 webshot       0.5.2   2019-11-22 [1] CRAN (R 3.6.0)
 withr         2.4.1   2021-01-26 [1] CRAN (R 3.6.2)
 xfun          0.22    2021-03-11 [1] CRAN (R 3.6.2)
 xml2          1.3.2   2020-04-23 [1] CRAN (R 3.6.2)
 yaml          2.2.1   2020-02-01 [1] CRAN (R 3.6.0)

[1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library