More R Key Objects
A characterization of “object” that is at least close to true is:
If you can assign something a name, then it is an object. If you cannot assign something a name, then it is not an object.
In normal speech we think of an object as something that we can hold, turn upside down, and look at. R objects are very much like that.
The main reason that R has such a wide variety of data types is attributes. Objects have a main part, and they can have one or more attributes that modify how either R or the user thinks about the object.
You can have a plain bowl of rice and beans. If you add spice, it is something new. Different spice, different dish.
Attributes are spice.
A very common attribute is names. The elements of an atomic vector can each have a name. The components of a list can each have a name.
class is a very important attribute.
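Here is a minimal sketch of attributes being added to a plain object. The object names (x, lst, d) are made up for illustration, and the date is an arbitrary choice.

```r
# A plain numeric vector has no attributes
x <- c(1.5, 2.5, 3.5)
attributes(x)                 # NULL

# Giving the elements names adds the first attribute
names(x) <- c("a", "b", "c")
attributes(x)                 # a list with one component: names

# The components of a list can have names too
lst <- list(first = 1:3, second = "two")
names(lst)                    # "first" "second"

# class is just another attribute, but it changes how R treats the object
d <- as.Date("2020-01-01")
attributes(d)                 # $class is "Date"
unclass(d)                    # the plain number underneath: 18262
```

Same rice and beans underneath, different spice on top.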
There are characteristics that are inherent in an object.
Objects have a “length”. The length of an atomic vector is the number of elements it has. The length of a list is the number of components that it has. The length of NULL is zero.
Objects have a “mode”. This says what kind of object they are. There is a mode function that will tell you the mode of an object. There is also the typeof function that is slightly more specific about an object.
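A short sketch of these inherent characteristics, using a made-up vector x:

```r
x <- c(2, 4, 6)

length(x)                    # 3 elements
length(list(1, "a", TRUE))   # 3 components
length(NULL)                 # 0

mode(x)                      # "numeric"
typeof(x)                    # "double"

# typeof separates integers from doubles; mode does not
mode(1:3)                    # "numeric"
typeof(1:3)                  # "integer"

mode(sum)                    # "function"
typeof(sum)                  # "builtin"
```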
There are three atomic types that you are likely to care about.
Numeric objects hold numbers.
See More R Numbers.
Logical objects have values that are TRUE or FALSE.
Character objects have a string as each element.
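As a quick illustration, here is one object of each of the three types. The names num, lgl and chr are invented for this sketch.

```r
num <- c(3.14, 42, -1)
lgl <- c(TRUE, FALSE, TRUE)
chr <- c("apple", "banana split")

mode(num)     # "numeric"
mode(lgl)     # "logical"
mode(chr)     # "character"

# Each element of a character vector is a whole string
length(chr)   # 2
nchar(chr)    # 5 12
```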
To us humans matrices and data frames are rectangular objects with rows and columns. They are both poseurs. Both of them are linear structures pretending to be rectangular. They have very different approaches though.
A matrix is a vector that has a dim attribute. The dim is a vector of two integers saying how many rows and columns there are. The length of the matrix is the number of rows times the number of columns. You can see the order of the elements within the matrix by doing a command like:
> matrix(1:15, 5)
     [,1] [,2] [,3]
[1,]    1    6   11
[2,]    2    7   12
[3,]    3    8   13
[4,]    4    9   14
[5,]    5   10   15
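One way to convince yourself that a matrix is a vector with a dim attribute is to build one by hand. This is a sketch; the names m and v are made up.

```r
m <- matrix(1:15, 5)

attributes(m)      # $dim is 5 3
length(m)          # 15: rows times columns

# A single subscript walks down the columns of the underlying vector
m[7]               # 7
m[2, 2]            # 7, the same element

# Start from a plain vector and give it a dim attribute
v <- 1:15
dim(v) <- c(5L, 3L)
identical(v, m)    # TRUE
```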
A data frame has a class attribute that is "data.frame". It is really a list with as many components as there are columns. Each component has to have the same number of elements (the number of rows).
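The list underneath a data frame is easy to see. In this sketch the data frame df and its column names are invented for illustration.

```r
df <- data.frame(id = 1:3, label = c("a", "b", "c"))

class(df)               # "data.frame"
typeof(df)              # "list"
is.list(df)             # TRUE

# The length is the number of columns, not the number of cells
length(df)              # 2

# Every component has the same number of elements: the number of rows
sapply(df, length)      # id 3, label 3

# The attributes include names, class and row.names
names(attributes(df))
```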
Both matrices and data frames can have names for the rows and the columns. (This is mandatory for data frames.) These are implemented differently in the two types of object, but you can get them from either type with the rownames and colnames functions (or both at once with dimnames).
You can test if an object is a data frame with the is.data.frame function.
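The following sketch builds a matrix and a data frame with the same row and column names, then queries both with rownames, colnames and is.data.frame. The objects m and df and their names are made up.

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))
df <- as.data.frame(m)

# The same functions work on both types of object
rownames(m)             # "r1" "r2"
rownames(df)            # "r1" "r2"
colnames(m)             # "a" "b" "c"
colnames(df)            # "a" "b" "c"

# But the names are stored differently
names(attributes(m))    # includes dimnames
names(attributes(df))   # includes names and row.names

is.data.frame(df)       # TRUE
is.data.frame(m)        # FALSE
```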
Circle 8 of The R Inferno discusses a number of possible problems you might have with matrices and data frames.
Factors have two key attributes. They have a class attribute and a levels attribute. The levels attribute is a character vector that gives the possible categories for the object. The basic part of the object is a vector of integers that give the location of each element's category in the levels vector.
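You can take a factor apart to see both pieces. This is a small sketch with a made-up factor f.

```r
f <- factor(c("low", "high", "low", "medium"))

attributes(f)               # $levels and $class
levels(f)                   # "high" "low" "medium"

# The basic part: integers that point into the levels vector
as.integer(f)               # 2 1 2 3

# Looking the integers up in the levels recovers the categories
levels(f)[as.integer(f)]    # "low" "high" "low" "medium"
```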
Circle 8.2 of The R Inferno begins with several ways of going wrong with factors.
All of the atomic modes have a missing value. This is printed as NA.
You test for missing values with the is.na function. For example, is.na(x) will return a logical vector as long as x that is TRUE for the missing values in x and FALSE for the other values.
If you feel compelled to replace missing values by something else (like zero), you are almost surely making life harder for yourself rather than easier.
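A brief sketch of working with missing values, using a made-up vector x. The last lines hint at why replacing missing values with zero can give a misleading answer.

```r
x <- c(3, NA, 7, NA)

is.na(x)                   # FALSE TRUE FALSE TRUE
sum(is.na(x))              # 2 missing values

# Other atomic types have missing values too
is.na(c("a", NA))          # FALSE TRUE
is.na(c(TRUE, NA))         # FALSE TRUE

# Many functions can be told to skip the missing values
mean(x)                    # NA
mean(x, na.rm = TRUE)      # 5

# Replacing NA by zero quietly changes the answer
x0 <- ifelse(is.na(x), 0, x)
mean(x0)                   # 2.5
```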
The str function is one of your best friends. It tells you how an object is structured. Its output may seem cryptic to you at first, but you will soon learn to appreciate the crypticness.
> examp <- list(A=1:10, B=letters, C=list(NULL, TRUE))
> str(examp)
List of 3
 $ A: int [1:10] 1 2 3 4 5 6 7 8 9 10
 $ B: chr [1:26] "a" "b" "c" "d" ...
 $ C:List of 2
  ..$ : NULL
  ..$ : logi TRUE
Functions in R are objects just as numeric vectors are objects. You think that is a good idea. It may take you some time before you realize that you think it is a good idea. But I guarantee you that you think it is a good idea.
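A small sketch of what treating functions as objects lets you do. The function square and the list funs are hypothetical names for this example.

```r
# A function is assigned a name just like any other object
square <- function(x) x^2
mode(square)            # "function"

# It can be handed to another function as an argument
sapply(1:4, square)     # 1 4 9 16

# It can be stored in a list alongside other functions
funs <- list(sq = square, root = sqrt)
funs$root(16)           # 4
funs$sq(3)              # 9
```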
The word “vector” is quite unfortunate in R. There are three distinct meanings:
- an atomic vector
- an object without attributes (except perhaps names)
- an object that has length
If we always said “atomic vector” for the first meaning, there would not be a problem with that. But all of us get bored saying “atomic vector” and shorten it to “vector”.
The second meaning comes from the meaning in mathematics. It is distinguishing a linear structure from a matrix. The latter has a dim attribute, the former does not.
The third meaning is the literal sense of the word. This includes lists, which the first meaning excludes.
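The three meanings can be told apart with a few tests. This is a sketch; the objects x, n, lst and m are made up for illustration.

```r
x   <- 1:6                 # atomic, no attributes
n   <- c(a = 1, b = 2)     # atomic, names only
lst <- list(1, "a")        # a list
m   <- matrix(1:6, 2)      # atomic, with a dim attribute

# First meaning: atomic vector
is.atomic(x)      # TRUE
is.atomic(m)      # TRUE
is.atomic(lst)    # FALSE

# Second meaning: no attributes except perhaps names
is.vector(x)      # TRUE
is.vector(n)      # TRUE
is.vector(m)      # FALSE, because of dim
is.vector(lst)    # TRUE

# Third meaning: anything that has a length, lists included
length(lst)       # 2
```

Note that is.vector tests the second meaning, not the first, which surprises a lot of people.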
rice photo by michaelaw via stock.xchng