6 helpful R functions you may not know

Almost every R user knows about popular packages like dplyr and ggplot2. But with 10,000+ packages on CRAN and still more on GitHub, it isn’t always easy to unearth libraries with great R functions. One of the best ways to find cool, new-to-you R code is to see what other useRs have discovered. So I’m sharing a few of my discoveries, and I hope you’ll share some of yours in return (contact info below).

Choose a ColorBrewer palette from an interactive app. Need a color scheme for a map or app? ColorBrewer is well known as a source of pre-configured palettes, and the RColorBrewer package imports these into R. But it’s not always easy to remember what’s available. The tmaptools package’s palette_explorer creates an interactive application that shows you the possibilities.

First, install tmaptools with install.packages("tmaptools"), then load it with library("tmaptools") and run palette_explorer() (or skip loading tmaptools and run tmaptools::palette_explorer()). You’ll see all available palettes, as well as sliders to adjust options such as the number of colors. There’s also information about basic syntax for using a color scheme below each group of palettes.

palette_explorer also needs the shiny and shinyjs packages installed in order to generate the interactive app.
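The setup steps above can be sketched as follows. This is only an illustration: it checks for the three packages named in the text and launches the explorer only in an interactive session. The variable names needed, is_installed and not_installed are made up for this example and are not part of tmaptools.

```r
# Packages the text says palette_explorer() relies on
needed <- c("tmaptools", "shiny", "shinyjs")

# requireNamespace() returns FALSE (rather than stopping) when a package is absent
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!is_installed]

if (length(not_installed) > 0) {
  message("Install first: ", paste(not_installed, collapse = ", "))
} else if (interactive()) {
  # Opens the palette app; only meaningful in an interactive session
  tmaptools::palette_explorer()
}
```

Calling the function with the tmaptools:: prefix means the package never has to be attached with library().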

Create character vectors without quotation marks. It can be a bit annoying to manually turn Firefox, Chrome, Edge, Safari, Internet Explorer, Opera into the c("Firefox", "Chrome", "Edge", "Safari", "InternetExplorer", "Opera") format R needs in order to use such text as a vector of character strings.

That’s what the Hmisc package’s Cs function was designed to do. After loading the Hmisc package,

Cs(Firefox, Chrome, Edge, Safari, InternetExplorer, Opera)

will evaluate the same as

c("Firefox", "Chrome", "Edge", "Safari", "InternetExplorer", "Opera")

If you’ve ever manually added quotation marks to a lengthy string of words, you’ll appreciate the elegance. Note the lack of a space in InternetExplorer: spaces will trip up the Cs function.
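For readers curious how a function can accept unquoted words at all, here is a minimal base R sketch of the general idea. The name bare_strings is hypothetical and this is not Hmisc’s actual implementation; it only shows how R can capture arguments as text instead of evaluating them.

```r
# bare_strings() is a hypothetical helper, not the Hmisc implementation
bare_strings <- function(...) {
  # substitute() captures the arguments unevaluated, so the bare names
  # are never looked up as variables
  args <- as.list(substitute(list(...)))[-1L]
  # deparse() turns each captured symbol back into text
  unname(vapply(args, function(a) paste(deparse(a), collapse = ""), character(1)))
}

browsers <- bare_strings(Firefox, Chrome, Edge, Safari, InternetExplorer, Opera)
identical(
  browsers,
  c("Firefox", "Chrome", "Edge", "Safari", "InternetExplorer", "Opera")
)
```

This also suggests why a space causes trouble: two words separated by a space are not a single valid R name, so the call fails when R parses it, before the function ever runs.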

RStudio bonus: If you use RStudio, there’s another option for slick vector-string creation. Security pro Bob Rudis created an RStudio add-in that takes selected comma-separated text and adds the necessary quotes and c(). And it can handle spaces. Install it with devtools::install_github("hrbrmstr/hrbraddins") (which means you need the devtools package as well), and you’ll see Bare Combine as an option in the RStudio Tools > Addins menu.

You can run it from that Addins menu, but selecting text and then leaving your coding window to go to the Tools > Addins menu to select Bare Combine doesn’t necessarily feel less cumbersome than typing a few quotation marks. It’s much better to create a custom keyboard shortcut for the add-in.

You can do that by going to Tools > Modify Keyboard Shortcuts. Scroll down until you see Bare Combine in the Addins section, or search for Bare Combine in the filter box. Double-click in the shortcut area and type the keystroke(s) you want to assign to the add-in (I used alt-shift-').

Customizing keyboard shortcuts in RStudio

Now, any time you want to turn comma-separated plain text into an R vector of character strings, you can highlight the text and use your keyboard shortcut.

By the way, RStudio add-ins are mostly just plain R. If you like having keyboard shortcuts for R tasks like this, it might be worth studying the syntax.

Produce an interactive table with one line of code. No matter how much you like and use the command line, sometimes it’s still nice to look at a spreadsheet-like table of data to scan, sort and filter. RStudio offers a basic view like this, but for large data sets I like RStudio’s DT package, a wrapper for the DataTables JavaScript library. DT::datatable(mydf) creates an interactive HTML table; DT::datatable(mydf, filter = "top") adds a filter box above each column.

Example of an HTML table created with the R DT package, an interface to the DataTables JavaScript library.
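A small sketch for trying the DT call described above on data that ships with R. It assumes the DT package may or may not be installed, so the call is guarded; mydf and tbl are placeholder names. DT’s filter argument takes the value "top" to put the filter controls at the head of the table.

```r
# mydf stands in for your own data frame; mtcars ships with R
mydf <- mtcars

tbl <- NULL
if (requireNamespace("DT", quietly = TRUE)) {
  # filter = "top" places a filter control above each column
  tbl <- DT::datatable(mydf, filter = "top")
}

# In an interactive session, printing the widget displays the table
if (interactive() && !is.null(tbl)) print(tbl)
```

The result is an HTML widget, so it can also be embedded in an R Markdown document or a Shiny app instead of being printed.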

Easy file conversions. rio is one of my favorite R packages. Instead of remembering which functions to use for importing which types of files (read.csv? read.table? read_excel?), rio vastly simplifies the process with one import function for a couple of dozen file formats. As long as the file extension is a format that rio recognizes, it will appropriately import from files such as .csv, .json, .xlsx and .html (tables). The same goes for rio’s export command if you’d like to save to a particular file format. But rio has a third major function: convert, which will import and export in one step. Have a million-row Excel file you need to save as a CSV? An HTML table you need to save as JSON? Use a syntax like convert("myfile.xlsx", "myfile.csv"), where the first argument is your existing file and the second is your desired file with the desired extension, and your file will be created.
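Here is a hedged, self-contained sketch of the convert workflow. It uses temporary files and a CSV-to-TSV conversion purely for illustration (the file names are placeholders, not from the article), and it skips the rio calls if the package is not installed.

```r
# Temporary placeholder files created just for this sketch
csv_file <- tempfile(fileext = ".csv")
tsv_file <- tempfile(fileext = ".tsv")
write.csv(head(mtcars), csv_file, row.names = FALSE)

converted <- FALSE
if (requireNamespace("rio", quietly = TRUE)) {
  # convert() imports the first file and exports it in the format
  # implied by the second file's extension
  rio::convert(csv_file, tsv_file)
  converted <- file.exists(tsv_file)

  # import() also chooses its reader from the file extension
  roundtrip <- rio::import(tsv_file)
}
```

Some formats need extra packages that rio only suggests rather than installs by default, so a conversion that fails may simply be missing one of those.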

Copy and paste from R to your clipboard. rio bonus: You can copy between your clipboard and R with rio. Send some data from a small R variable to your clipboard with export(myRobject, "clipboard"). Importing from the clipboard should work as well, although I’ve had mixed success with that.

Import large files quickly, and perhaps save space. I’m working on a project this week that involves a spreadsheet with more than 600,000 rows and 40 columns. Reading it into R took around 25 to 30 seconds, which is doable once but annoying when I had to do it multiple times. The feather binary file format is not only readable by both R and Python, but is considerably faster to read and write. rio handles feather files, or you can use read_feather from the feather package.

For saving space as well as speed, the fst package looks to be an excellent choice because it offers compression. In my testing, write.fst(mydf, "myfile.fst", 100), which is maximum compression, was just as speedy as no compression, and the file took about one-third the space of the original spreadsheet. feather, meanwhile, took up almost double the spreadsheet disk space.
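A minimal sketch of writing and reading a compressed fst file, assuming the fst package is available (the calls are skipped otherwise). It uses the package’s write_fst() and read_fst() functions; the data frame here is a tiny stand-in, so it says nothing about timing or the space savings reported above.

```r
# Small stand-in data frame; the file discussed above had 600,000+ rows
mydf <- data.frame(id = 1:1000, value = rep(c(1.5, 2.5), 500))
fst_file <- tempfile(fileext = ".fst")

fst_ok <- FALSE
if (requireNamespace("fst", quietly = TRUE)) {
  # compress runs from 0 (none) to 100 (maximum)
  fst::write_fst(mydf, fst_file, compress = 100)
  restored <- fst::read_fst(fst_file)
  fst_ok <- identical(dim(restored), dim(mydf))

  # file.size() reports bytes on disk, handy for comparing formats
  fst_bytes <- file.size(fst_file)
}
```

To compare formats on your own data, write the same data frame with several compress values and check file.size() for each result.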

A few more favorites from readers and social media:

More with quotes. In response to the Cs() function that adds quotes, Kwan Lowe touted the usefulness of noquote(), which strips quotes and is useful for importing certain types of data into R. noquote() is a base R function, aimed at making it easier to wrangle variables.
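A short base R illustration of what noquote() does. The vector and variable names are made up for the example. Note that noquote() changes how a character vector is printed; the stored values stay the same.

```r
browsers <- c("Firefox", "Chrome", "Edge")

print(browsers)            # printed with quotation marks
nq <- noquote(browsers)
print(nq)                  # printed without quotation marks

# The values are unchanged; noquote() only adds a class that
# tells print() to leave the quotes off
is_noquote <- inherits(nq, "noquote")
same_values <- identical(unclass(nq), browsers)
```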

Un-factoring factors. Another useful function: unfactor() in the varhandle package, which aims to detect the “real” class of an R data frame column of factors and then turn it into either numeric or character variables.
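To show the problem that un-factoring addresses, here is a base R sketch. It does not use varhandle and is not how unfactor() is implemented; it only demonstrates the classic pitfall and the manual workaround for a factor with numeric-looking labels.

```r
# A factor whose labels look numeric, as often happens after import
f <- factor(c("10", "20", "10"))

# Pitfall: as.numeric() on a factor returns the internal level codes
codes <- as.numeric(f)                  # 1 2 1

# Converting through character first recovers the labelled values
values <- as.numeric(as.character(f))   # 10 20 10
```

A helper that detects the appropriate class for you saves repeating this two-step conversion on every column.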

table() alternative. Need to calculate frequencies of variables in a data frame? “I’m a big fan of xtabs(),” Timothy Teravainen posted at Google+ in response to this blog. “It’s in base R, but I unfortunately went years without knowing about it.”

The format is xtabs(~df$col1 + df$col2), which will return a frequency table with col1 as the rows and col2 as the columns.
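A tiny worked example of the xtabs() format, using a made-up five-row data frame. It also shows the equivalent formula-plus-data form, which avoids repeating df$ and produces cleaner dimension names.

```r
# Made-up data: five observations of two categorical columns
df <- data.frame(
  col1 = c("a", "a", "b", "b", "b"),
  col2 = c("x", "y", "x", "x", "y")
)

# Form given in the text
tab_dollar <- xtabs(~ df$col1 + df$col2)

# Equivalent form: name the columns in the formula, pass the data frame
tab <- xtabs(~ col1 + col2, data = df)
tab

# Individual counts can be pulled out by row and column label
b_and_x <- tab["b", "x"]
```

Here the rows are the values of col1 and the columns are the values of col2, so b_and_x counts the rows where col1 is "b" and col2 is "x".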

Text searching. Finally, if you’ve been using regular expressions to search for text that starts or ends with a certain character string, there’s an easier way. “startsWith() and endsWith(), did I really not know these?” tweeted data scientist Jonathan Carroll. “That’s it, I’m sitting down and reading through dox for every #rstats function.”
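A quick base R comparison of the two approaches, using an illustrative vector of browser names. Both startsWith() and endsWith() return a logical vector, so they can be used directly for subsetting.

```r
browsers <- c("Firefox", "Chrome", "Edge", "Safari", "Opera")

# Fixed-string tests, no regular expression needed
starts_s <- startsWith(browsers, "S")
ends_e <- endsWith(browsers, "e")

browsers[starts_s]
browsers[ends_e]

# The regular-expression equivalents give the same answers
same_start <- identical(starts_s, grepl("^S", browsers))
same_end <- identical(ends_e, grepl("e$", browsers))
```

Because the search string is treated literally, characters such as "." or "$" need no escaping, unlike in a regular expression.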

Want to share your own favorites? Tell me via Twitter @sharon000 or email at smachlis@computerworld.com.

For more on useful R functions, see Great R packages for data import, wrangling and visualization.
