15 April 2019
R are the errors, messages & warnings
R
logical, integer, numeric, character
xlsx, xls, csv, txt, files + many more
First let's get the data for this exercise.
/home/trainee/data.
f <- list.files("~/data/intro_r/", full.names = TRUE)
file.copy(f, "~/R_Training/data/")
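The copy step can be checked before moving on. The sketch below uses hypothetical stand-in folders under tempdir() rather than the course paths, so every folder and file name in it is an assumption made only for illustration.

```r
# Hypothetical stand-in folders so the sketch runs anywhere
src  <- file.path(tempdir(), "intro_r_src")
dest <- file.path(tempdir(), "intro_r_dest")
dir.create(src,  showWarnings = FALSE)
dir.create(dest, showWarnings = FALSE)

# Two placeholder files to copy
writeLines("len,supp,dose", file.path(src, "a.csv"))
writeLines("x,y",           file.path(src, "b.csv"))

# full.names = TRUE returns paths that include the folder, not just the file names
f <- list.files(src, full.names = TRUE)

# file.copy() returns one TRUE/FALSE per file
copied <- file.copy(f, dest, overwrite = TRUE)
all(copied)
list.files(dest)
```

If all(copied) is FALSE, the usual causes are a destination folder that does not exist or files that are already there.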
In data you'll see a set of csv and other excel-type files we've copied
Select toothData.csv and click More > Export...
R loves to see
In the data directory, click toothData.csv and choose View File
This will open a preview in the Script Window (close when you're done)
File > New File > R Script (Or Ctrl+Shift+N)
DataImport.R
library(tidyverse)
min(), max() etc from base
readr, dplyr, tibble, stringr, ggplot2, tidyr and purrr
To import into our R Environment we can either:
Import Dataset, or
Environment Tab
Stop and wait until we're all ready
(Click Update if you don't see this)
Code Preview Box
Import
library(tidyverse)
The code we copied has 3 lines:
1. library(readr): loads the package containing read_csv(); already loaded by library(tidyverse)
2. toothData <- read_csv("data/toothData.csv"): creates an object in the R Environment, named toothData by using the file name.
3. View(toothData): opens a preview in an Excel-like format
Close the preview by clicking the cross and delete the line View(toothData)
read_csv()
In the Environment Tab click the broom icon
R Environment
Environment Tab again and toothData is back
RStudio now uses read_csv() from the package readr by default
read.csv() from the package utils
Other import functions in utils are read.delim() and read.table()
readr has the functions read_tsv(), read_delim() and read_table() etc.
readr over utils
R has its origins in statistical analysis
factors in R \(\implies\) more memory efficient
readr import functions do not assume this
(read_csv(), read_tsv() etc)
toothData is known as a data.frame
R equivalent to a spreadsheet
readr uses a variant called a tibble (originally tbl_df)
data.frame with pretty bows & ribbons
table
I will be lazy and call this a data frame as the differences are so trivial
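A minimal sketch of the difference between the two import families. It uses a tiny made-up csv written to a temporary file (the values are hypothetical, not the real toothData.csv), and it sets stringsAsFactors = TRUE explicitly because that stopped being the default for read.csv() in R 4.0.0.

```r
library(readr)

# A tiny stand-in csv (hypothetical values, not the real toothData.csv)
path <- tempfile(fileext = ".csv")
writeLines(c("len,supp,dose", "4.2,VC,0.5", "11.5,OJ,1", "7.3,VC,2"), path)

# utils: the text column becomes a factor when stringsAsFactors = TRUE
base_df <- read.csv(path, stringsAsFactors = TRUE)
class(base_df)        # "data.frame"
class(base_df$supp)   # "factor"

# readr: text stays as character and the result is a tibble
tbl <- read_csv(path)
class(tbl$supp)           # "character"
inherits(tbl, "tbl_df")   # TRUE
is.data.frame(tbl)        # TRUE, a tibble is still a data.frame
```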
toothData
print(toothData)
head(toothData)
glimpse(toothData)
What were the differences between each method?
$
toothData$len
[]
toothData[1:3, ]
toothData[,"len"]
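The two subsetting styles can be tried without the course files. This sketch assumes the built-in ToothGrowth dataset (columns len, supp, dose) is a reasonable stand-in for toothData; the object name tg is made up for illustration.

```r
library(tibble)

# ToothGrowth ships with R; assumed here to resemble toothData
tg <- as_tibble(ToothGrowth)

tg$len[1:5]    # `$` pulls out one column as a plain vector
tg[1:3, ]      # rows 1 to 3, all columns: still a tibble
tg[, "len"]    # one column by name: a tibble keeps this as a 1-column tibble

# A base data.frame drops to a vector for the same call
class(ToothGrowth[, "len"])   # "numeric"
is.data.frame(tg[, "len"])    # TRUE
```

This is one of the few places where a tibble and a data.frame behave differently.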
data.frame/tibble objects must have column names.
read_csv()
R function read_csv()?
?read_csv
read_csv()
read_csv(file, col_names = TRUE, col_types = NULL,
locale = default_locale(), na = c("", "NA"), quoted_na = TRUE,
quote = "\"", comment = "", trim_ws = TRUE, skip = 0,
n_max = Inf, guess_max = min(1000, n_max),
progress = show_progress())
The function has several arguments (file, col_names etc.)
Where an argument is given by name only (file) we need to specify something
Otherwise a default value is used (col_names = TRUE)
read_csv()
toothData <- read_csv("data/toothData.csv")
Is equivalent to:
toothData <- read_csv(file = "data/toothData.csv")
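A quick way to confirm the equivalence is to run both calls and compare the results. The sketch below uses a small made-up csv in a temporary file, so the path and values are assumptions rather than the course data.

```r
library(readr)

path <- tempfile(fileext = ".csv")
writeLines(c("len,supp,dose", "4.2,VC,0.5", "11.5,OJ,1"), path)

by_position <- read_csv(path)          # first argument is matched to `file`
by_name     <- read_csv(file = path)   # same call with the argument named

# Compare the contents column by column
identical(names(by_position), names(by_name))
identical(by_position$len,  by_name$len)
identical(by_position$supp, by_name$supp)
```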
read_csv()
All arguments for the function were defined somewhere in the GUI.
First Row as Names checkbox
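The First Row as Names checkbox corresponds to the col_names argument. A small sketch of what changes when it is switched off, again using a made-up temporary csv rather than the course file:

```r
library(readr)

path <- tempfile(fileext = ".csv")
writeLines(c("len,supp,dose", "4.2,VC,0.5", "11.5,OJ,1"), path)

with_header <- read_csv(path)                     # col_names = TRUE is the default
no_header   <- read_csv(path, col_names = FALSE)  # header row is treated as data

names(with_header)  # "len" "supp" "dose"
names(no_header)    # "X1" "X2" "X3"
nrow(with_header)   # 2
nrow(no_header)     # 3
```

With col_names = FALSE the header text lands in the first data row, so every column is read as character.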
read_delim()
read_csv() calls read_delim() using delim = ","
read_csv2() calls read_delim() using delim = ";"
read_tsv() calls read_delim() using delim = "\t"
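The relationship between the wrappers and read_delim() can be seen by reading the same file both ways. This is a sketch using made-up comma- and tab-delimited temporary files; the file names and values are assumptions.

```r
library(readr)

csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("len,supp,dose", "4.2,VC,0.5", "11.5,OJ,1"), csv_path)
writeLines(c("len\tsupp\tdose", "4.2\tVC\t0.5", "11.5\tOJ\t1"), tsv_path)

# Wrapper function versus the general function with an explicit delimiter
from_csv   <- read_csv(csv_path)
from_delim <- read_delim(csv_path, delim = ",")
from_tsv   <- read_tsv(tsv_path)
from_tab   <- read_delim(tsv_path, delim = "\t")

identical(from_csv$len,  from_delim$len)
identical(from_tsv$supp, from_tab$supp)
```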
What function would we call for space-delimited files?
R also has a package for loading .xls and .xlsx files.
library(readxl)
The main function is read_excel()
?read_excel
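A minimal sketch of the main readxl calls. It uses the demo workbook datasets.xlsx that is bundled with readxl rather than the course file, and the object names are made up for illustration.

```r
library(readxl)

# readxl_example() returns the path to a demo workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

excel_sheets(path)                                  # names of the sheets in the workbook
first_sheet <- read_excel(path)                     # the first sheet is read by default
mtcars_xl   <- read_excel(path, sheet = "mtcars")   # or pick a sheet by name
by_index    <- read_excel(path, sheet = 2)          # or by position
```

Like the readr functions, read_excel() returns a tibble.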
Export RealTimeData.xlsx from your data folder to your local machine and inspect
Try to load each of the sheets from RealTimeData.xlsx
(Remember to call the R objects something different)
Do you get any weird behaviour for sheet 3?
How could we load these two separate tables?