To return to the previous page click here or use the back button on your browser.
R sees everything as a vector. These are like columns in a spreadsheet, where we have entries in multiple cells. Importantly, every entry must be of the same data type!
Enter the following in the Script Window
then send it to the Console
Here we’ll use one of the most common R functions c()
to combine these numbers into a vector
lifeSpans <- c(2.1, 1.9, 1.7, 1.8, 2.3, 2.5)
This is a numeric vector, where all entries are double precision entries (i.e. they have decimal points).
Note that this appears differently in the Environment Tab
to the initial object x
. Now we can see details on how many values are in the vector, and what type of values they are.
What would happen if we accidentally included the letter “a” in this vector?
We can perform simple operations on a vector by using some in-built functions. Functions in R, consist of their name, followed by the round braces ()
. We then pass an R object to the function, by including it inside these braces.
mean(lifeSpans)
Note that we haven’t made a new R object. We’ll do that in a few minutes.
Instead of typing the complete object name in the above code, we can just start typing the name inside the function mean()
, then hitting the Tab
key. Try entering mean(life + Tab)
, where + Tab
is the tab key. The available options will appear and you can just choose the appropriate one. This can be a huge time-saver!
Some examples may be sd()
, var()
, max()
, min()
, median()
, summary()
. Use Tab Auto-Complete to avoid typos.
sd(lifeSpans)
max(lifeSpans)
summary(lifeSpans)
We can also perform logical
tests on an entire vector
lifeSpans > 2
Could we apply these functions to integer vectors?
What about character vectors?
Try some if you’d like.
charVec<- c("cats", "internet")
min(charVec)
mean(charVec)
Try the different functions and see what results they give you. You might see confusing error messages, but remember error messages are your friend. They tell you something isn’t right, and (sometimes) give you a clue as to what is wrong, even though they can be frustrating.
In the above, we just printed the results to the screen. Sometimes we may want to save the results as a new R object. For example, every vector also has a length attribute, which is just the number of entries in the vector, and this can be how we find out our sample size.
sampleSize <- length(lifeSpans)
sampleSize
## [1] 6
We can extract single values, or sets of values, from a vector by using square brackets
lifeSpans[1]
lifeSpans[4:6]
One other data type we see in R is a factor. R will sometimes assume that any character strings represent categorical variables.
Groups <- c("treated", "control", "control", "control", "treated", "treated")
Groups <- factor(Groups)
Groups
## [1] treated control control control treated treated
## Levels: control treated
These can be a trap for the unwary, as behind the scenes R has saved these as integers, corresponding to which category (or factor level) the values belong to.
as.integer(Groups)
## [1] 2 1 1 1 2 2
In one sense, these are like a hybrid of an integer vector, where the values represent the category number, and a character vector which contains the various category names. In the old days of puny computers, this was more memory efficient. It also makes sense when you consider statistical modelling is at the heart of R.