Probably the most useful R function I’ve ever written

15 Aug 2016

The function in question is scriptSearch. I’m not much for superlatives — “most” and “best” imply one dimension, but we live in a multi-dimensional world. I’m making an exception.

The statistic I have in mind for this use of “useful” is the human time saved by a call to the function divided by the waiting time between calls.

I wrote a version of this for a company where I do consulting. Few working days there go by without my using it at least once. Using scriptSearch can easily save half an hour compared with what I would have done before I had the function.

scriptSearch

The two main inputs are:

  • a string to search for
  • a directory to search in

By default it only looks in R scripts in the directory (and its subdirectories).
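
As a rough illustration of the idea, a search of this kind can be sketched in a few lines of base R. This is not the BurStMisc source: the name scriptSearchSketch, its arguments and the format of its result are assumptions made for this example.

```r
# Hypothetical sketch of a script search; not the BurStMisc implementation.
scriptSearchSketch <- function(pattern, path = ".", subdirs = TRUE,
                               suffix = "\\.[rR]$") {
  # Find candidate files, optionally descending into subdirectories
  files <- list.files(path, pattern = suffix, recursive = subdirs,
                      full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    # Keep the lines that match the regular expression
    found <- grep(pattern, lines)
    if (length(found)) paste0(found, ": ", lines[found]) else NULL
  })
  names(hits) <- files
  # Drop files with no matching lines
  hits[lengths(hits) > 0]
}
```

The result of the sketch is a list named by file, where each component holds the matching lines prefixed by their line numbers.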

Examples of directories to search are:

  • directory holding a large collection of R scripts
  • directory holding the source for local R packages
  • personal directory with lots of subdirectories containing R scripts and functions

Examples of uses are:

  • where is blimblam defined?
  • where are all the uses of splishsplash in the local packages (because I want to change its arguments)?
  • a few weeks ago I created a pdf called factor_history, where is the code that produced that?

These uses might be done with something like:

  • scriptSearch("blimblam *<-", "path/to/scriptFarm", sub=FALSE)
  • scriptSearch("splishsplash", "path/to/Rsource")
  • scriptSearch("factor_history", "..")

You may be confused by the asterisk in the first call. The string to search for can be a regular expression. In this case the asterisk means "zero or more spaces", so the search finds assignments no matter how many spaces (including none) sit between the object name and the assignment arrow.
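
A small check of how the asterisk behaves in that pattern, using base R's grepl on some made-up lines. The second pattern, with a question mark, is not from the calls above and is included only for comparison.

```r
# Made-up lines of code to search through
candidates <- c("blimblam<- 1", "blimblam <- 1", "blimblam   <- 1",
                "blimblam(2)")

# " *" matches zero or more spaces before the arrow
star <- grepl("blimblam *<-", candidates)

# " ?" matches at most one space, so the three-space line is missed
question <- grepl("blimblam ?<-", candidates)

print(rbind(star, question))
```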

BurStMisc

scriptSearch was the main motivation for updating the BurStMisc package to version 1.1.  The package is on CRAN.

ntile

The ntile function is also new to BurStMisc.  It returns equally-sized ordered groups from a numeric vector — for example, quintiles or deciles.

A more primitive version of the function appeared in a blog post called “Miles of iles”.  There is some discussion there of alternative functions.
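
To illustrate what "equally-sized ordered groups" means, here is a minimal base R sketch. It is not the BurStMisc code: ntileSketch and its arguments are names invented for this example, and it makes no attempt to handle missing values.

```r
# Hypothetical sketch: assign each value to one of n ordered groups
ntileSketch <- function(x, n) {
  # Break ties by position so that group sizes stay balanced
  r <- rank(x, ties.method = "first")
  # Map ranks 1..length(x) onto groups 1..n
  as.integer(ceiling(r * n / length(x)))
}

x <- c(12, 5, 30, 18, 7, 22, 41, 3, 16, 27)
groups <- ntileSketch(x, 5)   # quintiles of ten values
table(groups)
```

Group 1 holds the smallest values and group n the largest.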

writeExpectTest

While I was preparing the update to BurStMisc, I found that automating the writing of some tests using the testthat package was both warranted and feasible.  The writeExpectTest function is the result.
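
The general idea of generating a test from the current value of an expression can be sketched as below. This is only a guess at the flavor of the approach, not how writeExpectTest works: expectTextSketch is a hypothetical helper, and the only testthat name used is expect_equal, which appears in the generated text rather than being called.

```r
# Hypothetical sketch: build the text of a testthat expectation
# from an expression and the value it currently returns
expectTextSketch <- function(exprText) {
  value <- eval(parse(text = exprText))
  paste0("expect_equal(", exprText, ", ",
         paste(deparse(value), collapse = " "), ")")
}

testLine <- expectTextSketch("sum(1:4)")
cat(testLine, "\n")
```

The generated line can then be pasted into a test file, where it will fail if the expression ever starts returning something different.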

corner

The generally useful function that was already in BurStMisc is corner.  This allows you to see a few rows and columns of a large matrix or data frame — or higher dimensional array.
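
For a sense of what such a function does, here is a minimal sketch that handles only two-dimensional objects. It is not the BurStMisc implementation, and cornerSketch and its argument n are assumptions for this example.

```r
# Hypothetical sketch for matrices and data frames only
cornerSketch <- function(x, n = 5) {
  rows <- seq_len(min(n, nrow(x)))
  cols <- seq_len(min(n, ncol(x)))
  # drop = FALSE keeps the result two-dimensional
  x[rows, cols, drop = FALSE]
}

big <- matrix(seq_len(1000 * 200), nrow = 1000)
small <- cornerSketch(big)
dim(small)
```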

Epilogue

I want to spread the news
That if it feels this good getting used
You just keep on using me

— from “Use Me”  by Bill Withers

8 replies
  1. El-ad David Amir says:

    Thank you for sharing. Out of curiosity, which editor do you use for R code? I’ve been using Sublime, which includes a very similar functionality as part of the IDE. If I recall correctly Atom and vim also include the ability to search through multiple files.

    • Patrick Burns says:

      I use RStudio which I think in general is wonderful, but I wouldn’t mind them fixing whatever makes it hang periodically.

      • Johannes Lips says:

        RStudio has the option to “Find in Files”, which is accessible with Ctrl+Shift+F, which probably is pretty close to the thing you actually would like to achieve.
        Johannes

  2. Gabriele says:

    First of all, very nice post, thanks! Just a minor note for the sake of precision: using the ‘*’ you are allowing for any number of spaces (from 0 to any positive), while it’s with ‘?’ that you allow for either 0 or 1 whitespace in your regexp. Cheers 🙂

  3. Carl Witthoft says:

    Or (horrors! blasphemy!) you could go outside of R and use AgentRansack 🙂 . After all, there’s plenty of times one wishes to find a text string in a non-R, or even a binary document (to the limit that A.R. or similar tools can decode, say, Office documents).

    • Patrick Burns says:

      Actually, it is possible to avoid horrors while still searching in non-R files.

  4. mike says:

    Have you heard about Sublime?

    Really, this is one of the basic functions, even presented on the front page of the application's website.


© Copyright - Burns Statistics