20 July 2016
R is the Error Messages
R is very strict about data formats
.xlsx, xls, csv, txt, gtf/gff files + many more
In Day1/data, open RealTimeData.xlsx in Excel (or Libre Office)
Which sheet do you think will be the most problematic to load?
R loves to see
What about all those missing values?
R can happily deal with missing values: \(\implies\) will load as NA
R guesses the number of columns from the first row
Always think in terms of columns
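As a small illustration of missing values loading as NA, the sketch below reads a made-up CSV from a string with base R's read.csv(). The column names and values are hypothetical and are not taken from the workshop data.

```r
# Hypothetical CSV text: two cells are left empty on purpose
csv_text <- "id,weight,height
1,70,
2,,180
3,65,170"

# text = lets read.csv() parse a string instead of a file on disk
df <- read.csv(text = csv_text)

# Empty cells in numeric columns are loaded as NA
print(df)

# Count the missing values in each column
na_counts <- colSums(is.na(df))
print(na_counts)
```

The header row has three fields, so every following row is read as three columns, even when a cell is empty.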
data folder is the file toothData.csv
Script Window as a text file
You will see two lines of code in the Console \(\implies\) two things have just happened
toothData <- read.csv("RAdelaideWorkshop/Day_1/data/toothData.csv")
ALWAYS copy the first line into your script!
View(toothData)
The second line has opened a preview of our R object
R object will be named using the file-name before the .csv
toothData object in the Environment tab (click the arrow)
toothData is known as a data.frame
R equivalent to a spreadsheet
View(toothData)
toothData
head(toothData)
What were the differences between each method?
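To compare the three ways of looking at a data.frame without needing the workshop file, here is a minimal sketch on a made-up object. The name df and all of its values are hypothetical.

```r
# A small made-up data.frame (values are hypothetical)
df <- data.frame(
  id    = 1:10,
  group = rep(c("A", "B"), each = 5),
  value = seq(0.5, 5, by = 0.5)
)

# Typing the object name (or calling print()) shows every row in the Console
print(df)

# head() returns only the first six rows by default
first_rows <- head(df)
print(first_rows)

# View(df) opens the spreadsheet-style preview in RStudio; it needs an
# interactive session, so it is left commented out here
# View(df)
```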
By default (in R versions before 4.0.0), R assumes that a column of text is a categorical variable (i.e. a factor)
stringsAsFactors button during import
R function read.csv()
utils package which is one of the base packages
?read.csv
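The effect of stringsAsFactors can be checked directly. The sketch below sets the argument explicitly in both calls, so it behaves the same whatever the default is in your R version. The CSV text is made up for illustration.

```r
# Hypothetical CSV text with one text column
csv_text <- "sample,treatment
s1,control
s2,drug
s3,control"

# Text columns converted to factors
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Text columns kept as plain character vectors
as_text <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Compare how the same column was stored
print(class(as_factor$treatment))
print(levels(as_factor$treatment))
print(class(as_text$treatment))
```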
read.table(), read.csv() and read.delim()
read.csv()
readr has a similar, but slightly superior version called read_csv()
library(dplyr)
library(readr)
toothData <- read_csv("data/toothData.csv")
local data frame \(\implies\) display in the Console is more convenient
toothData
?read_csv
read_csv()
Arguments have names (file, col_names)
Some arguments have default values (col_names = TRUE)
toothData <- read_csv("data/toothData.csv")
Is equivalent to:
toothData <- read_csv(file = "data/toothData.csv")
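The same idea of matching arguments by position or by name can be tried without any extra packages. This sketch uses base R's read.csv() (whose first two arguments are file and header) rather than read_csv(), and a made-up temporary file rather than toothData.csv.

```r
# Write a tiny made-up CSV to a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c("len,dose", "4.2,0.5", "11.5,1.0"), tmp)

# Argument matched by position: the first argument of read.csv() is file
by_position <- read.csv(tmp)

# The same call with the arguments named, spelling out the default header = TRUE
by_name <- read.csv(file = tmp, header = TRUE)

# Check that both calls returned the same data.frame
same_result <- identical(by_position, by_name)
print(same_result)
```

Naming arguments is optional when they are given in order, but it makes longer calls easier to read.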
If we had a file with 3 blank lines to start, what would our code look like?
toothData <- read_csv("data/toothData.csv", ???)
toothData <- read_csv("data/toothData.csv", skip = 3)
What if the first three lines were comments starting with #?
R uses the first row to determine the number of columns
toothData <- read_csv("data/toothData.csv", comment = "#")
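Both cases can be tried on throwaway files. The sketch below assumes the readr package is installed; the file contents and the object names (blank_file, comment_file, skipped, uncommented) are made up for illustration.

```r
library(readr)

# Hypothetical file that starts with three blank lines
blank_file <- tempfile(fileext = ".csv")
writeLines(c("", "", "", "len,dose", "4.2,0.5", "11.5,1"), blank_file)

# Hypothetical file that starts with three comment lines
comment_file <- tempfile(fileext = ".csv")
writeLines(
  c("# made-up comment 1", "# made-up comment 2", "# made-up comment 3",
    "len,dose", "4.2,0.5", "11.5,1"),
  comment_file
)

# skip = 3 ignores the first three lines before looking for the header
skipped <- read_csv(blank_file, skip = 3)

# comment = "#" ignores anything from a # to the end of the line
uncommented <- read_csv(comment_file, comment = "#")

# Both objects should have the same column names
print(names(skipped))
print(names(uncommented))
```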
read_delim()
read_csv() calls read_delim() using delim = ","
read_csv2() calls read_delim() using delim = ";"
read_tsv() calls read_delim() using delim = "\t"
What function would we call for space-delimited files?
R also has a package for loading .xls and .xlsx files.
library(readxl)
The main function is read_excel()
?read_excel
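Before trying the workshop file, the main arguments of read_excel() can be explored on the example workbook that ships with readxl. This is only a sketch: it assumes a readxl version that provides readxl_example(), and it does not use RealTimeData.xlsx, whose sheet layout is not assumed here.

```r
library(readxl)

# readxl ships a small example workbook; readxl_example() returns its path
path <- readxl_example("datasets.xlsx")

# List the sheet names before deciding what to load
sheets <- excel_sheets(path)
print(sheets)

# Load one sheet by position and another by name
first_sheet <- read_excel(path, sheet = 1)
named_sheet <- read_excel(path, sheet = sheets[2])

# range = restricts the import to a rectangle of cells:
# here a header row plus three data rows from columns A to C
corner <- read_excel(path, sheet = 1, range = "A1:C4")
print(corner)
```

Arguments such as sheet, skip and range are the usual tools when a sheet holds more than one table or has extra rows above the data.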
Try loading each of the sheets from RealTimeData.xlsx
(Remember to call the R objects something)
Do you get any error messages for sheet 3?
How could we load these two separate tables?