RMarkdown
Functions for Lesson 5
here, summarise, set_here, dr_here
Packages for Lesson 5
here, dplyr, readr, rmarkdown
Create reproducible HTML, PDF, and Word documents from R using RMarkdown.
* Download the RMarkdown template (right click > Save As) and save it in the same folder as your coding club files. The file should have a .Rmd extension, e.g. Lesson5_rmd.Rmd.
* Open RStudio and run the following code to install the necessary packages:
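packages <- c("pacman", "rmarkdown", "here")
install.packages(packages, dependencies = TRUE) # install from CRAN, including dependencies
lapply(packages, require, character.only = TRUE) # load each package; returns TRUE or FALSE per package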
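Running install.packages() on every run reinstalls packages that are already present. The following is a minimal base-R sketch, assuming you only want to install what is missing. The helper name missing_packages is hypothetical, and the interactive() guard is a design assumption so that knitting a document never triggers an install.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

packages <- c("pacman", "rmarkdown", "here")
to_install <- missing_packages(packages)

# Only install when working interactively (assumption: never install while knitting)
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install, dependencies = TRUE)
}

# Report which of the packages are available, without attaching them
available <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
print(available)
```

Calling missing_packages() again after installing is a quick way to confirm that nothing is left to install.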
Recreate the plot below using the larger NYC Airbnb dataset. Here's the code to read in the data and add the title and labels.
url <- "http://data.insideairbnb.com/united-states/ny/new-york-city/2021-04-07/data/listings.csv.gz"
nyc_full <- readr::read_csv(url) # reads in data
ttl <- "NYC Airbnb availability"
subttl <- "for room types across neighbourhoods"
xlab <- "Availability of property in next 30 days"
ylab <- "Count"
# append this to your ggplot function with a `+`
labs(title = ttl, subtitle = subttl, x = xlab, y = ylab)
The dataset has the following columns:
[1] "id" "listing_url"
[3] "scrape_id" "last_scraped"
[5] "name" "description"
[7] "neighborhood_overview" "picture_url"
[9] "host_id" "host_url"
[11] "host_name" "host_since"
[13] "host_location" "host_about"
[15] "host_response_time" "host_response_rate"
[17] "host_acceptance_rate" "host_is_superhost"
[19] "host_thumbnail_url" "host_picture_url"
[21] "host_neighbourhood" "host_listings_count"
[23] "host_total_listings_count" "host_verifications"
[25] "host_has_profile_pic" "host_identity_verified"
[27] "neighbourhood" "neighbourhood_cleansed"
[29] "neighbourhood_group_cleansed" "latitude"
[31] "longitude" "property_type"
[33] "room_type" "accommodates"
[35] "bathrooms" "bathrooms_text"
[37] "bedrooms" "beds"
[39] "amenities" "price"
[41] "minimum_nights" "maximum_nights"
[43] "minimum_minimum_nights" "maximum_minimum_nights"
[45] "minimum_maximum_nights" "maximum_maximum_nights"
[47] "minimum_nights_avg_ntm" "maximum_nights_avg_ntm"
[49] "calendar_updated" "has_availability"
[51] "availability_30" "availability_60"
[53] "availability_90" "availability_365"
[55] "calendar_last_scraped" "number_of_reviews"
[57] "number_of_reviews_ltm" "number_of_reviews_l30d"
[59] "first_review" "last_review"
[61] "review_scores_rating" "review_scores_accuracy"
[63] "review_scores_cleanliness" "review_scores_checkin"
[65] "review_scores_communication" "review_scores_location"
[67] "review_scores_value" "license"
[69] "instant_bookable" "calculated_host_listings_count"
[71] "calculated_host_listings_count_entire_homes" "calculated_host_listings_count_private_rooms"
[73] "calculated_host_listings_count_shared_rooms" "reviews_per_month"
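The labels suggest a count of listings by 30-day availability, split by room type and neighbourhood. Here is one possible starting point, offered as a sketch rather than the intended solution. It assumes the ggplot2 package is installed, guesses that availability_30, room_type, and neighbourhood_group_cleansed are the relevant columns, and uses a small made-up data frame (toy) in place of nyc_full so it runs without downloading anything.

```r
library(ggplot2)

# Labels from the lesson
ttl <- "NYC Airbnb availability"
subttl <- "for room types across neighbourhoods"
xlab <- "Availability of property in next 30 days"
ylab <- "Count"

# Hypothetical toy data standing in for nyc_full (values are made up)
toy <- data.frame(
  availability_30 = c(0, 0, 5, 12, 30, 30, 7, 21),
  room_type = c("Entire home/apt", "Private room", "Private room", "Entire home/apt",
                "Shared room", "Entire home/apt", "Private room", "Entire home/apt"),
  neighbourhood_group_cleansed = c("Manhattan", "Brooklyn", "Brooklyn", "Manhattan",
                                   "Queens", "Queens", "Manhattan", "Brooklyn")
)

# Histogram of availability, coloured by room type, one panel per neighbourhood group
p <- ggplot(toy, aes(x = availability_30, fill = room_type)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~ neighbourhood_group_cleansed) +
  labs(title = ttl, subtitle = subttl, x = xlab, y = ylab)

print(p)
```

Swapping toy for nyc_full applies the same layers to the real data. The choice of geom and facets is a guess and may need adjusting to match the target plot.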
Using here to set your working directory
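require(here) # load the here package if not already loaded
set_here() # create a .here file in the current working directory to mark it as the project root (takes effect in a new R session)
dr_here() # report the root directory that here() uses and why it was chosen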
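To make the role of the .here file more concrete, the sketch below walks up from a starting folder until it finds a folder containing .here. This is a simplified base-R illustration of the idea, not the here package's actual implementation, which also recognises other project markers. The function name find_root and the demo folder here_demo are hypothetical.

```r
# Hypothetical helper: climb parent folders until one contains the marker file
find_root <- function(start = getwd(), marker = ".here") {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(dir, marker))) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      return(NA_character_) # reached the top of the file system without a match
    }
    dir <- parent
  }
}

# Demo in a temporary folder: put .here at the top and search from two levels down
demo_root <- file.path(tempdir(), "here_demo")
nested <- file.path(demo_root, "lookhere", "lookheretoo")
dir.create(nested, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_root, ".here"))

found <- find_root(nested)
print(found)
```

The search starts in the nested folder but returns the folder holding .here, which is why paths built from the root keep working regardless of which subfolder a script lives in.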
Example 1
1. Create a folder in your local directory called 'lookhere'.
2. Open a new file in TextEdit (Mac) or Notepad (Windows), type in something like " We found the here file! ", then save the file as "heretest.txt" inside the 'lookhere' folder.
require(readr) # provides read_lines()
require(dplyr) # provides the %>% pipe
# example 1
list.files() # print files in your current working directory; 'heretest.txt' is not here, but one level down in the new 'lookhere' folder
here("lookhere", "heretest.txt") # use here to build the path to the file
here("lookhere", "heretest.txt") %>% read_lines() # print the contents in R
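The folder and text file from steps 1 and 2 can also be created from R instead of by hand. The following is a minimal base-R sketch, assuming the current working directory is where the 'lookhere' folder should live; the variable name txt_path is only for illustration.

```r
# Step 1: create the 'lookhere' folder in the current working directory
dir.create("lookhere", showWarnings = FALSE)

# Step 2: write the test sentence to 'heretest.txt' inside that folder
txt_path <- file.path("lookhere", "heretest.txt")
writeLines(" We found the here file! ", txt_path)

# Confirm the file exists, then read it back with base R
print(file.exists(txt_path))
contents <- readLines(txt_path)
print(contents)
```

Creating the test files in code keeps the whole example reproducible, since anyone running the script ends up with the same folder layout.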
Example 2
1. Create another folder within the lookhere folder called lookheretoo.
2. Create another .txt file called heretesttoo.txt and save it in the lookheretoo folder.
# example 2: store the folder and file names as variables
folder1 <- "lookhere"
folder2 <- "lookheretoo"
file <- "heretesttoo.txt"
# build the path to the heretesttoo file inside lookhere/lookheretoo using 'here', then read it
here(folder1, folder2, file) %>% read_lines()
read_lines(here(folder1, folder2, file)) # same call without the pipe
Why is here useful?
* You can step through subfolders by passing each one as a separate function argument.
* You can define these subfolders as variables at the beginning of your R script and refer to them throughout your script without pasting paths together, e.g. paste(folder1, folder2, sep = "/").
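The difference between pasting paths and using here can be seen side by side. In this sketch, paste() and base R's file.path() both produce a relative path that depends on the current working directory, while here() returns a path anchored at the project root. The here call is guarded so the block still runs if the package is not installed; the variable names pasted, joined, and rooted are only for illustration.

```r
folder1 <- "lookhere"
folder2 <- "lookheretoo"
file <- "heretesttoo.txt"

# Manual pasting: the separator has to be spelled out each time
pasted <- paste(folder1, folder2, file, sep = "/")

# Base R alternative: file.path() inserts the separator for you
joined <- file.path(folder1, folder2, file)

print(pasted)
print(joined)

# here() takes the same pieces but anchors the result at the project root
if (requireNamespace("here", quietly = TRUE)) {
  rooted <- here::here(folder1, folder2, file)
  print(rooted)
}
```

The first two results only resolve correctly when the working directory happens to be the project folder. The here version does not rely on that.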
R project (Rproj) file
Option 1: Create an R project in a new directory:
* File > New Project
* Create New Project
* Choose a name for your RProj folder and enter it under Directory name (make it machine-friendly, i.e. no spaces)
* Choose a place for the RProj to live by clicking Browse
* Select Open in new session
Option 2: If you already have a folder just for Emory Coding Club:
* File > New Project
* Existing directory > Browse
* Select Open in new session
here package. What do you see?
Cmd/Ctrl + Shift + K
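# large Airbnb dataset (74 columns, matching the column names listed above)
require(readr) # provides read_csv()
require(dplyr) # provides %>% and glimpse()
url <- "http://data.insideairbnb.com/united-states/ny/new-york-city/2021-04-07/data/listings.csv.gz"
nyc_full <- read_csv(url) # reads in data
nyc_full %>% glimpse() # print a compact overview of every column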
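The Cmd/Ctrl + Shift + K shortcut knits the open .Rmd file in RStudio. The same step can be run from code with rmarkdown::render(). The sketch below writes a tiny stand-in document to a temporary folder and renders it only if rmarkdown and pandoc are available. The file name lesson5_sketch.Rmd and its contents are hypothetical, not the lesson template.

```r
# Build the chunk delimiter from single backticks so this example stays readable
fence <- strrep("`", 3)

# A minimal R Markdown document: YAML header, one sentence, one R chunk
rmd_lines <- c(
  "---",
  'title: "Lesson 5 sketch"',
  "output: html_document",
  "---",
  "",
  "Some text, followed by an R code chunk:",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)

# Write the document to a temporary folder (hypothetical file name)
rmd_path <- file.path(tempdir(), "lesson5_sketch.Rmd")
writeLines(rmd_lines, rmd_path)

# Render to HTML only when the required tools are present
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(out_path)
}
```

Changing the output line in the header to word_document or pdf_document targets the other formats. PDF output additionally needs a LaTeX installation.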