20 July 2016
If you've started a new session since last time:
library(dplyr)
library(readr)
read.csv() refers to column names as a header
The header tells R how many columns you have
What happens if we get this wrong?
no_header <- read_csv("data/no_header.csv")
We can easily fix this
no_header <- read_csv("data/no_header.csv", col_names = FALSE)
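A small sketch of what goes wrong here. The temporary file is a hypothetical stand-in for data/no_header.csv, and its values are invented for illustration:

```r
library(readr)

# Hypothetical stand-in for data/no_header.csv: two data rows, no header line
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,Male,Bob,80,180", "2,Female,Ann,60,165"), tmp)

wrong <- read_csv(tmp)                     # first data row is used up as column names
right <- read_csv(tmp, col_names = FALSE)  # columns are named X1, X2, ...

nrow(wrong)    # one row fewer than the file contains
nrow(right)    # every row is kept
names(right)
```

Comparing nrow() for the two versions is a quick way to spot a header that has swallowed a row of data.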
What about that first column?
We can specify what is loaded or skipped using col_types
?read_csv
no_header <- read_csv("data/no_header.csv", col_names = FALSE,
                      col_types = "-ccnnc")
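In the compact col_types string each character describes one column, in order: "-" skips the column, "c" is character and "n" is number. A minimal sketch on an invented six-column file (a hypothetical stand-in for data/no_header.csv):

```r
library(readr)

# Hypothetical stand-in for data/no_header.csv (invented values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,Male,Bob,80,180,car", "2,Female,Ann,60,165,bike"), tmp)

# One character per column: skip, character, character, number, number, character
no_header <- read_csv(tmp, col_names = FALSE, col_types = "-ccnnc")

ncol(no_header)           # the skipped first column is gone
sapply(no_header, class)  # the type each remaining column was loaded as
```

The string needs one character for every column in the file, including the ones being skipped.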
What if we get that wrong?
Let's mis-specify the third column as a number
no_header <- read_csv("data/no_header.csv", col_names = FALSE,
                      col_types = "-cnnnc")
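A sketch of how the mistake shows up, again on an invented stand-in file. readr warns about parsing failures rather than stopping, and problems() can be used to inspect them afterwards:

```r
library(readr)

# Hypothetical stand-in for data/no_header.csv (invented values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,Male,Bob,80,180,car", "2,Female,Ann,60,165,bike"), tmp)

# Third column (the names) wrongly declared as a number
wrong <- read_csv(tmp, col_names = FALSE, col_types = "-cnnnc")

wrong[[2]]        # the names could not be parsed as numbers, so they are NA
problems(wrong)   # details of each parsing failure
```

Note that the data still loads, so it is worth checking for unexpected NA values after setting col_types by hand.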
Let's get it wrong first
comments <- read_csv("data/comments.csv")
Now we can get it right
comments <- read_csv("data/comments.csv", comment = "#")
This will work if there are comments in any rows
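A minimal sketch with comment lines at the top and in the middle of an invented file (a hypothetical stand-in for data/comments.csv):

```r
library(readr)

# Hypothetical stand-in for data/comments.csv (invented values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("# collected in the workshop",
             "name,weight",
             "Bob,80",
             "# second batch",
             "Ann,60"), tmp)

# Lines starting with "#" are ignored
comments <- read_csv(tmp, comment = "#")

names(comments)
nrow(comments)
```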
What happens when you try to load the file bad_colnames.csv?
bad_colnames <- read_csv("data/bad_colnames.csv")
How could we fix this?
Here's my fix
bad_colnames <- read_csv("data/bad_colnames.csv",
                         skip = 1, col_names = FALSE)
colnames(bad_colnames) <- c("rowname", "gender", "name",
                            "weight", "height", "transport")
We can set column names manually…
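As an alternative sketch, col_names also accepts a character vector, so the bad header can be skipped and the names supplied in a single call. The file below is an invented stand-in, and its broken header is only a guess at the kind of problem such a file might have:

```r
library(readr)

# Hypothetical file whose header line is unusable (invented values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("gender name weight",   # header is missing its commas
             "Male,Bob,80",
             "Female,Ann,60"), tmp)

# Skip the bad header line and supply the names directly
fixed <- read_csv(tmp, skip = 1,
                  col_names = c("gender", "name", "weight"))

names(fixed)
nrow(fixed)
```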
The c() function
The most common function in R is c()
It stands for combine, and combines its arguments into a single R object, or vector
With no arguments, c() returns NULL
c()
## NULL
colnames(bad_colnames) <- c()
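A few quick illustrations of c() in base R (the values are arbitrary):

```r
x <- c(1, 2, 3)      # combine three numbers into one vector
y <- c(x, 4)         # combining vectors gives one flat vector
z <- c(1, "a")       # mixed types are coerced to a common type (here character)

length(y)
class(z)
is.null(c())         # c() with no arguments is NULL
```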
What if missing values have been set to "-"?
Let's get it wrong first
missing_data <- read_csv("data/missing_data.csv")
Where have the errors appeared?
Now we can get it right
missing_data <- read_csv("data/missing_data.csv", na = "-")
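A minimal sketch of the difference, using an invented stand-in for data/missing_data.csv:

```r
library(readr)

# Hypothetical stand-in for data/missing_data.csv (invented values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,weight", "Bob,80", "Ann,-"), tmp)

wrong <- read_csv(tmp)            # "-" is kept as text, so weight becomes character
right <- read_csv(tmp, na = "-")  # "-" is read as NA, so weight stays numeric

class(wrong$weight)
class(right$weight)
is.na(right$weight)
```

Setting na replaces the default missing-value strings, so if a file also contains empty cells or "NA", something like na = c("", "NA", "-") keeps all of them.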
After we've edited a file, we might wish to export it
?write_csv
write_delim() can write other delimited formats: .csv, .txt, .tsv etc
R objects can be exported using write_rds()
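A short sketch of the export functions, writing an invented data frame to temporary files so nothing in the working directory is touched:

```r
library(readr)

# Invented example data
df <- data.frame(name = c("Bob", "Ann"), weight = c(80, NA))

out_csv <- tempfile(fileext = ".csv")
out_tsv <- tempfile(fileext = ".tsv")
out_rds <- tempfile(fileext = ".rds")

write_csv(df, out_csv)                  # comma-separated
write_delim(df, out_tsv, delim = "\t")  # tab-separated
write_rds(df, out_rds)                  # the R object itself

back <- read_csv(out_csv)               # missing values are written as NA and read back as NA
identical(read_rds(out_rds), df)        # the .rds file restores the object as it was
```

The .rds route keeps column types exactly, whereas a .csv has to be parsed again when it is reloaded.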