On the acceptance of R

20 Mar 2013

Some history and a prediction.

Past

A discussion broke out on the R-help mailing list in January 2006 about a technical report put out by the statistical computing group at UCLA.  The report in question talked mainly about SAS, SPSS and Stata.  It talked briefly — and not especially positively — about R.  Someone accused it of damning R with faint praise.  It might not be a surprise that the R community had a somewhat higher opinion of R.

You can find that thread with a web search like:

"A comment about R" 2006

There was a mechanism for creating official comments on the technical report.  I edited material from the thread, got some additional views privately, and added a few flourishes myself to produce “R Relative to Statistical Packages”.

That was 7 years ago, when there were on the order of 600 packages on CRAN.

email address back in action

15 Mar 2013

The email address [email protected] was out of action for a few hours today.  It is back now.

The options mechanism in R

02 Mar 2013

Customization in R.

Basics

Several features benefit from being customizable — either because of personal taste or specifics of the environment.

The way R implements this flexibility is through the options function.  This both sets and reports options.  For example, we can see the names of the options that are set by default:

> names(options())
 [1] "add.smooth"            "browser"              
 [3] "browserNLdisabled"     "check.bounds"         
 [5] "continue"              "contrasts"            
 [7] "defaultPackages"       "demo.ask"             
 [9] "device"                "device.ask.default"   
[11] "digits"                "echo"                 
[13] "editor"                "encoding"             
[15] "example.ask"           "expressions"          
[17] "help.search.types"     "help.try.all.packages"
[19] "help_type"             "HTTPUserAgent"        
[21] "internet.info"         "keep.source"          
[23] "keep.source.pkgs"      "locatorBell"          
[25] "mailer"                "max.print"            
[27] "menu.graphics"         "na.action"            
[29] "nwarnings"             "OutDec"               
[31] "pager"                 "papersize"            
[33] "pdfviewer"             "pkgType"              
[35] "prompt"                "repos"                
[37] "scipen"                "show.coef.Pvalues"    
[39] "show.error.messages"   "show.signif.stars"    
[41] "str"                   "str.dendrogram.last"  
[43] "stringsAsFactors"      "timeout"              
[45] "ts.eps"                "ts.S.compat"          
[47] "unzip"                 "useFancyQuotes"       
[49] "verbose"               "warn"                 
[51] "warning.length"        "width"                
[53] "windowsTimeouts"

options returns a list.  Most of the options are not especially interesting; we’ll highlight a few of the most useful.
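As a minimal sketch of how setting and reporting work together (the choice of the digits option and the value 4 are only for illustration): calling options with a named argument sets that option and invisibly returns the previous value as a list, which makes it easy to restore the old setting afterwards.

```r
# Report a single option; getOption() reads one value by name
getOption("digits")

# Setting an option returns the previous value (invisibly) as a list
old <- options(digits = 4)
print(pi)        # printed with 4 significant digits: 3.142

# Restore the previous setting by handing that list back to options()
options(old)
```

Saving the returned list and passing it back is a common idiom inside functions that need to change an option temporarily.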

Plot ranges of data in R

21 Feb 2013

How to control the limits of data values in R plots.

R has multiple graphics engines.  Here we will talk about the base graphics and the ggplot2 package.

We’ll create a bit of data to use in the examples:

one2ten <- 1:10

ggplot2 demands that you have a data frame:

ggdat <- data.frame(first=one2ten, second=one2ten)

Seriously exciting data, yes?

Default behavior

The default is, not surprisingly, to create limits so that the data fit comfortably.
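A minimal base-graphics sketch of the difference between the default limits and explicit ones.  The limit values are arbitrary, and pdf(NULL) is used here only so that the sketch runs without writing a file; neither is from the original discussion.

```r
one2ten <- 1:10

# Open a graphics device that writes no file (only so the sketch runs anywhere)
pdf(NULL)

# Default: the axis limits are based on the range of the data
plot(one2ten, one2ten)
default_usr <- par("usr")   # extents of the plot region: x1, x2, y1, y2

# Explicit limits: xlim and ylim override the data-based default
plot(one2ten, one2ten, xlim = c(0, 20), ylim = c(-5, 15))
wide_usr <- par("usr")

dev.off()
```

Comparing default_usr with wide_usr shows that the default plot region extends slightly beyond the data range, while the second plot covers the requested limits instead.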

R database interfaces

14 Feb 2013

Several packages on CRAN provide (or relate to) interfaces between databases and R.  Here is a summary, mostly in the words of the package descriptions.  Remember that package names are case-sensitive.

The packages that talk about being DBI-compliant are referring to the DBI package (see below in “Other SQL”).

MySQL

dbConnect: Provides a graphical user interface to connect with databases that use MySQL.

RMySQL: The current version complies with the database interface definition as implemented in the package DBI 0.2-2.

TSMySQL: TSMySQL provides a MySQL interface for TSdbi. Comprehensive examples of all the TS* packages are provided in the vignette Guide.pdf with the TSdata package.

Bricks not monoliths

06 Feb 2013

Chapter 32 of Tao Te Programming advises you to make bricks instead of monoliths.  Here is an example. The example is written with the syntax of R and is a data analysis, but the principle is valid no matter what language you use or what your task is.

Monolith

Here is an outline of a function reminiscent of many novice attempts:

monolith <-
function (data, col="blue", pch=21) 
{
        # transform data
        # fit model to data
        # plot data, uses 'col' and 'pch'
        # get desired model results
        # return desired model results
}

Each of these comment lines may stand for many lines of code, so the whole function runs to pages.
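For contrast, here is one way the same outline might be split into bricks.  This is only a sketch: the function names, the columns x and y, the log transformation and the linear model are all hypothetical stand-ins, not taken from the book.

```r
# Hypothetical bricks: each does one job and can be tested on its own
transform_data <- function(data) {
    # stand-in transformation: log of the response column 'y'
    data$y <- log(data$y)
    data
}

fit_model <- function(data) {
    # stand-in model: simple linear regression of y on x
    lm(y ~ x, data = data)
}

plot_fit <- function(data, fit, col = "blue", pch = 21) {
    # the plotting arguments live only in the function that plots
    plot(data$x, data$y, col = col, pch = pch)
    abline(fit)
    invisible(NULL)
}

model_results <- function(fit) {
    # stand-in for "desired model results": the coefficients
    coef(fit)
}

# A thin wrapper assembles the bricks in the order of the monolith outline
analyze <- function(data, col = "blue", pch = 21) {
    tdata <- transform_data(data)
    fit <- fit_model(tdata)
    plot_fit(tdata, fit, col = col, pch = pch)
    model_results(fit)
}
```

Calling analyze on a data frame with columns x and y does what the monolith did, but each piece can now be checked, reused or replaced separately.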

The three-dots construct in R

30 Jan 2013

There is a mechanism that allows variability in the arguments given to R functions.  Technically it is an ellipsis, but it is more commonly called “…”, dots, dot-dot-dot or three-dots.

Basics

The three-dots construct allows:

  • an arbitrary number and variety of arguments
  • passing arguments on to other functions

Arbitrary arguments

The two prime cases are the c and list functions:

> c
function (..., recursive = FALSE)  .Primitive("c")
> list
function (...)  .Primitive("list")

Both of these allow you to give them as many arguments as you like, and you can name those arguments (which end up as names in the resulting object).
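A short sketch of both uses of the three-dots.  The functions count_args and mean_of_logs are hypothetical, written only for illustration.

```r
# Names given to the arguments become names in the result
c(a = 1, b = 2)
list(first = 1:3, second = "x")

# A hypothetical function that accepts any number of arguments via '...'
count_args <- function(...) {
    args <- list(...)      # capture the dots as a list
    length(args)
}
count_args(1, "two", third = 3)   # 3

# A hypothetical wrapper that passes '...' on to another function
mean_of_logs <- function(x, ...) {
    mean(log(x), ...)      # e.g. na.rm = TRUE reaches mean()
}
mean_of_logs(c(1, NA, exp(2)), na.rm = TRUE)   # 1
```

The wrapper does not need to know which extra arguments mean accepts; whatever is given in the dots is handed on unchanged.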

A corner on convenient data analysis

24 Jan 2013

Many people are of the opinion that R has a corner on convenient data analysis.  That may or may not be true.

But now R literally has a corner that makes data analysis more convenient.  If you have a data frame or a matrix with a few columns, then you can use head and/or tail to make sure that it looks as you expect.  However, the result is unappetizing if there are hundreds or thousands of columns.

That is where corner comes in.  It shows you the first or last few rows of the first or last few columns.

The mtcars dataset can serve as an example even though it isn’t exactly a gigantic dataset.

By default the first 6 rows and first 6 columns are extracted:

> corner(mtcars)
                   mpg cyl disp  hp drat    wt
Mazda RX4         21.0   6  160 110 3.90 2.620
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875
Datsun 710        22.8   4  108  93 3.85 2.320
Hornet 4 Drive    21.4   6  258 110 3.08 3.215
Hornet Sportabout 18.7   8  360 175 3.15 3.440
Valiant           18.1   6  225 105 2.76 3.460

The same thing is done with:

Welcome to the Burns Statistics blog

08 Jan 2013

The most likely topics to appear here are:

  • the R language
  • statistics
  • programming in general
  • optimization
© Copyright - Burns Statistics