We can load data into R using some more built-in functions, and by default R imports each file as a data.frame. R expects to see some kind of delimiter it can use to separate the columns, such as a comma or a tab.
Fortunately RStudio helps us do this by providing the “Import Dataset” button.
The Import Dataset Button
Let’s load the dataset toothData.csv, which is in the data directory.
Click the Import Dataset button and choose From Text File..., then navigate to the data folder and select toothData.csv.
Now a preview screen has appeared, with numerous options.
The Preview Screen
The preview shows the data.frame as it would appear in R. Try changing a few of the options to see what effect they have on the data.frame preview.
Now load the data, un-checking the Strings as factors box, and two lines of code will appear in your console. The second line of code, View(toothData), was entered by the GUI and will have made a preview frame of your data appear, just like a spreadsheet. We can use the function View() to inspect an R object at any time, although large objects will be truncated. Files with many rows or columns place heavy demands on your memory, and there is no practical advantage to looking at these using this method.
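For larger objects, a few summary functions are usually more practical than View(). The sketch below uses a small made-up data.frame (the name demoData and its values are illustrative assumptions, not the contents of toothData.csv) to show head(), dim() and str().

```r
# A small made-up data.frame standing in for a much larger object
demoData <- data.frame(
  len  = c(4.2, 11.5, 7.3, 5.8),
  supp = c("VC", "VC", "OJ", "OJ"),
  dose = c("Low", "Med", "High", "Low"),
  stringsAsFactors = FALSE
)

head(demoData, n = 2)  # print only the first two rows
dim(demoData)          # number of rows, then number of columns
str(demoData)          # compact summary of each column and its type
```

Each of these prints to the console rather than opening a viewer, so they remain usable however large the object is.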
Two important points
toothData
In our sample code, we should see:
toothData <- read.csv("path/to/data/toothData.csv", stringsAsFactors = FALSE)
(NB: path/to/data/ will actually be different for all of you, depending on where you saved the file.)
Notice that this will have loaded the data with text columns kept as character strings, instead of being converted to categorical variables (factors).
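The effect of the stringsAsFactors argument can be checked without a file by passing the contents through the text argument. The values below are made up for illustration and are not taken from toothData.csv.

```r
# Hypothetical file contents, supplied via text= so no file is needed
csvText <- "len,supp,dose
4.2,VC,Low
11.5,VC,Med
7.3,OJ,High"

asChar   <- read.csv(text = csvText, stringsAsFactors = FALSE)
asFactor <- read.csv(text = csvText, stringsAsFactors = TRUE)

class(asChar$supp)     # "character"
class(asFactor$supp)   # "factor"
levels(asFactor$supp)  # levels sorted alphabetically: "OJ" "VC"
```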
We can also load data using the functions read.csv, read.table or read.delim. The code generated by the GUI will have used one of these.
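As a rough illustration of how these three functions differ, the sketch below reads the same made-up table written with three different delimiters. The object names and values are assumptions for demonstration only.

```r
# The same made-up table written with three different delimiters
commaText <- "len,supp\n4.2,VC\n7.3,OJ"
tabText   <- "len\tsupp\n4.2\tVC\n7.3\tOJ"
spaceText <- "len supp\n4.2 VC\n7.3 OJ"

fromCsv   <- read.csv(text = commaText)                   # comma-separated
fromDelim <- read.delim(text = tabText)                   # tab-separated
fromTable <- read.table(text = spaceText, header = TRUE)  # whitespace-separated

# All three calls should give the same two-column data.frame
all.equal(fromCsv, fromDelim)
all.equal(fromCsv, fromTable)
```

Note that read.table does not assume a header row by default, which is why header = TRUE is given explicitly in the last call.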
?read.table
All of the options we saw on the left-hand side of the GUI are listed here as function arguments. Note that for read.csv there are default values set, such as header=TRUE and sep=",". Using these defaults, we could have also loaded the file using the following command. (Remember to put in the correct directory path for your computer instead of path/to/data.)
toothData <- read.csv("path/to/data/toothData.csv", stringsAsFactors = FALSE)
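To see what the read.csv defaults are doing, the sketch below loads the same made-up contents twice: once relying on the defaults, and once with read.table and the header and sep arguments written out. The text and object names are illustrative assumptions.

```r
# Hypothetical file contents, supplied via text= so no file is needed
csvText <- "len,supp,dose\n4.2,VC,Low\n7.3,OJ,High"

# Relying on the read.csv defaults (header = TRUE, sep = ",")
viaCsv <- read.csv(text = csvText, stringsAsFactors = FALSE)

# Spelling out the same settings with read.table
viaTable <- read.table(text = csvText, header = TRUE, sep = ",",
                       stringsAsFactors = FALSE)

# For simple contents like these, both calls should give the same result
all.equal(viaCsv, viaTable)
```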
When converting character strings to factors, R will automatically set the category levels in alphabetic order. Setting these manually can often be preferable, so let’s convert the columns supp and dose to factors. If we are happy with the default alphabetic order, we can simply use:
toothData$supp <- as.factor(toothData$supp)
However, if we wish to specify these in non-alphabetic order, we can set the category order using the levels argument of factor():
toothData$dose <- factor(toothData$dose, levels=c("Low", "Med", "High"))
Notice that in both of the above lines of code, we overwrote the original columns of data in the R object toothData.
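The difference between the two approaches can be checked with levels(). This is a minimal sketch using a made-up character vector rather than the actual toothData columns.

```r
dose <- c("Low", "High", "Med", "Low")  # made-up values

# Default behaviour: levels are sorted alphabetically
levels(as.factor(dose))  # "High" "Low" "Med"

# Setting the order explicitly
doseFactor <- factor(dose, levels = c("Low", "Med", "High"))
levels(doseFactor)       # "Low" "Med" "High"

# Any value not listed in levels is converted to NA without an error
mismatch <- factor(c("Low", "Medium"), levels = c("Low", "Med", "High"))
is.na(mismatch)          # FALSE TRUE
```

Because unmatched values become NA without an error, it is worth checking for unexpected missing values after setting levels by hand.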