On the acceptance of R

20 Mar 2013

Some history and a prediction.

Past

A discussion broke out on the R-help mailing list in January 2006 about a technical report put out by the statistical computing group at UCLA.  The report in question talked mainly about SAS, SPSS and Stata.  It talked briefly — and not especially positively — about R.  Someone accused it of damning R with faint praise.  It might not be a surprise that the R community had a somewhat higher opinion of R.

You can find that thread with a web search like:

"A comment about R" 2006

There was a mechanism for creating official comments on the technical report.  I edited material from the thread, got some additional views privately, and added a few flourishes myself to produce: R Relative to Statistical Packages.

That was 7 years ago.  There were on the order of 600 packages on CRAN.

Present

Things are different now.  As of St. Paddy’s Day there were 4399 packages on CRAN.  If other major repositories are included, then the number of packages exceeds 6000.

It is unimaginable for something similar to happen in academia now.  In particular, the UCLA site no longer has the technical report available — instead there are substantial resources about R.  In academia, R is highly accepted, and — in some fields — very dominant.

In commercial companies R is now in a position similar to the one it held in academia in 2006.

Future

Which leads to my prediction: In the year 2020 R will be a dominant force in commerce similar to how it currently dominates in academia.

I’m not thinking the transition is automatic.  In academia R’s competition was SAS, SPSS, Stata, Minitab and some others.  In commerce R’s competition is primarily Excel.

That’s a whole different sort of competition.  Those addicted to spreadsheets aren’t likely to give them up easily.  But there are signs of cracks in the spreadsheet wall.

6 replies
  1. gappy says:

    Nice post, as usual. I agree that industry competition for interactive analysis is mostly Excel. But in production processes, do you think R will gain adoption? I see it heavily used in finance, but not in production. It seems that there the competition is with Python. And Python also has a truly innovative interface (if motivated by Mathematica) in its IPython notebook.

  2. Markus says:

    Great post and I tend to agree with your forecast, but I don’t think that R competes with Excel in the commercial environment. My impression is that over the last decade most corporates focused on the objective to get a ‘true’ picture of their past and current performance and hence invested in more systematic data capturing processes, data warehouses and a reporting infrastructure. Indeed, Excel can be a nice front-end for that purpose. However, as corporates are keen to make the next step to look into the future and employ predictive analytics they will need a new product, which goes beyond the capabilities of Excel and R is a good choice for that.

  3. Carl Witthoft says:

    Acceptance in the business world may take longer not so much because of spreadsheet addicts but the pervasive “OMG Open Source bad. Run Run!” attitude of most IT departments and managers. I have seen companies spending huge amounts to license Matlab or IDL rather than allow (let alone encourage) use of R, or scipy for that matter.

  4. Marc Schwartz says:

    Nice post Patrick. There are some commercial domains where SAS, not Excel, is still the dominant competition.

    Clinical trials, which is my own domain, is heavily biased towards SAS still today. That is changing for a variety of reasons, but it is an evolutionary, not revolutionary process and has not yet quite “Crossed The Chasm”. We are aggressively moving in that direction however and activities by folks both inside and outside the FDA are helping.

    Also, in case folks reading this are not aware, there are now two certification/validation oriented documents that are available from the main R web site via the Documentation -> Certification links.

    The first is the recently updated (December 2012) “R-FDA” document (http://www.r-project.org/doc/R-FDA.pdf), which provides guidance for the use of R in regulated clinical trials.

    The second is a new document (http://www.r-project.org/doc/R-SDLC.pdf) which contains a subset of the content of the R-FDA document, for more general applications where there is a need to document R’s Software Development Life Cycle (SDLC).

  5. Tal Galili says:

    Interesting post Pat.
    I wonder where the weak spot for breaking “the spreadsheet barrier” might be…

  6. Patrick Burns says:

    Thanks all for the comments — sorry for the delay in approving them.



© Copyright - Burns Statistics