----------------------------------------------------------------

LECTURE 24, 2017/11/27: my.clock()

ORG:

- Instructor office hour today, 4:30-5:30pm, JMHH F36

- Reminder -- Ethics rules for this class:
  . No cell phones, no use of laptops other than for this class!
  . No smelly foods, please!


RECAP: Conditionals and infinite loops

- Syntax & semantics, by example:

  . Execute some code only if TRUE:

      x <- runif(1)
      if(x < 2/3) {
        print("Event with P=2/3")
      }

  . Two choices of code, execute one of them, depending on TRUE/FALSE:

      x <- runif(1)
      if(x < 2/3) {
        print("Event with P=2/3")
      } else {
        print("Event with P=1/3")
      }

  . Infinite loop: Terminating requires user intervention -- in RStudio press ESC in the Console.

      i <- 0
      repeat{
        i <- i+1
        print(i)
      }

  . Potentially infinite loop with conditional termination:

      i <- 0
      repeat{
        i <- i+1
        print(i)
        if(runif(1) < 1/10) { cat("--- The End ---\n"); break }
      }

      # [For probability-friendly students:
      #  . What is the probability distribution of the number of repeats?
      #  . What is the average number of repeats?
      #  . What is the median number of repeats approximately?
      #  After figuring out the answer to the first question,
      #  you may implement the probabilities in R up to a sufficiently large N.
      #  You may use R to answer the next two questions.
      # ]

- Some uses of conditionals:

  . In functions, check argument types to make the function code robust,
    i.e., return gracefully instead of throwing an error.

  . In for-loops, execute code only if the looping variable refers
    to a data structure of suitable type.

  . In repeat-loops, terminate when a condition is met.
    In iterative algorithms the condition is achievement of sufficient precision.
    Example: ...
    [Note: You are now able to solve any nonlinear univariate equation f(x)=0 for continuous f() !]

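A possible illustration of the last two points (argument checks in a function, and a repeat-loop that terminates on sufficient precision): the sketch below solves f(x)=0 by bisection. It is not the example used in class. The function name 'bisect', its arguments 'lo', 'hi', 'tol' and the test function x^3-2 are made up for illustration. It assumes that f is continuous, that lo < hi, and that f(lo) and f(hi) have opposite signs.

```r
# Hypothetical helper, not from the lecture: bisection for f(x)=0 on [lo,hi].
bisect <- function(f, lo, hi, tol=1e-8) {
  # Conditionals for robustness: return NA gracefully instead of throwing an error.
  if(!is.function(f)) { cat("'f' must be a function\n"); return(NA) }
  if(lo >= hi || f(lo)*f(hi) > 0) {
    cat("need lo < hi and opposite signs of f(lo) and f(hi)\n"); return(NA)
  }
  repeat{
    mid <- (lo+hi)/2
    if(hi-lo < tol) { break }          # terminate: sufficient precision achieved
    # Keep the half of the interval on which f changes sign:
    if(f(lo)*f(mid) <= 0) { hi <- mid } else { lo <- mid }
  }
  mid                                  # value of the last statement is returned
}

# Usage: solve x^3 - 2 = 0 on [0,2] and compare with the known root 2^(1/3):
root <- bisect(function(x) { x^3 - 2 }, lo=0, hi=2)
root; 2^(1/3)
```

Each pass through the loop halves the interval, so the loop always terminates. For everyday work R also provides the built-in function uniroot().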
ROADMAP: - while-loops (Chapter 12) - Coercion, implicit and explicit (Chapter 13) - Matrices and arrays (Chapter 14) - Text analysis ---------------------------------------------------------------- LECTURE 23, 2017/11/20: ORG: - Reminder -- Ethics rules for this class: . No cell phones, no use of laptops other than for this class! . No smelly foods, please! RECAP: Programming your own functions - Syntax: . assignment . 'function(args)' . composite expression '{...}' (= function body) - Example: fun <- function(x,y=10) { x+y } - Uses: fun(10,12) fun(20) # use default y=10 z <- fun(-10,+10); z # assign returned value and print fun(1:10, 10:1) # vectorized use of '+' fun(1:10, 1:2) # recycling works x <- -100; fun(x) # no conflict between 'x' symbols - Symbol handling inside functions: . local/internal namespace is allocated for ~ argument symbols assigned to actual user arguments ~ internally defined symbols . symbols are searched in this order: ~ local namespace ~ user namespace ~ system namespaces . warning: In general, do not use user symbols inside functions. . symbols in the local/internal namespace and their assigned values disappear on exit from the function - Value produced by a function: value of the last statement in the function body - Keep in mind: . the expression 'function(...) {...}' executes a computation: it generates a function; . functions are objects, just like vectors, dataframes, lists; . function objects can be assigned to symbols. . 
The following produces a function object and then prints its printed representation: print(function(x,y=10) { x+y }) function(x,y=10) { x+y } and so does the following: foo <- function(x,y=10) { x+y } print(foo) foo ROADMAP: - sapply() and its throw-away functions - Chapter 12: Conditionals and more loops BEFORE WE CONTINUE: a small TG present -- an old-fashioned clock execute the definition of 'my.clock()' below, then run the function: my.clock() my.clock <- function() { # Circles: a <- seq(0,2*pi,len=201) rs <- c(1.25, 1.1) dev.new() par(mar=c(3,3,3,3)) plot(x=c(-rs,rs), y=c(-rs,rs), type="n", xaxt="n", yaxt="n", xlab="", ylab="") for(r in rs) { lines(cos(a)*r, sin(a)*r, lwd=8) } # Hours: angle and radius hrs <- 1:12 a.hrs <- pi/2 - 2*pi/12*hrs r.hrs <- mean(rs) text(x=cos(a.hrs)*r.hrs, y=sin(a.hrs)*r.hrs, lab=hrs) # Minute mis <- seq(0,59, by=5) a.mis <- pi/2 - 2*pi/60*mis r.mis <- rs[2]*0.95 text(x=cos(a.mis)*r.mis, y=sin(a.mis)*r.mis, lab=mis, cex=.7) # Hands: hr, mi, se (radii and line thickness) r.hr <- 0.5; lwd.hr <- 10 r.mi <- 0.9; lwd.mi <- 4 r.se <- 0.95; lwd.se <- 1 # Initialize hands (their angles): a.hr <- 0 a.mi <- 0 a.se <- 0 repeat{ # Erase the old hands by drawing them in white color, assuming this is the background color: lines(c(0,cos(a.hr)*r.hr), c(0,sin(a.hr)*r.hr), lwd=lwd.hr, col="white") lines(c(0,cos(a.mi)*r.mi), c(0,sin(a.mi)*r.mi), lwd=lwd.mi, col="white") lines(c(0,cos(a.se)*r.se), c(0,sin(a.se)*r.se), lwd=lwd.se, col="white") # Get the time of day (and date, ignored) as a string: tm <- Sys.time() # In what follows we translate time 'tm' to angles for the three hands. 
# Note: angle pi/2 = upward vertical direction, subtraction for clockwise motion # Angle of the hand that shows seconds, jumping by the second: se <- as.numeric(substr(tm, 18, 19)) a.se <- pi/2 - 2*pi*se/60 # Angle of the hand that shows minutes, jumping by the minute: mi <- as.numeric(substr(tm, 15, 16)) a.mi <- pi/2 - 2*pi*mi/60 # Angle of the hand that shows hours, moving 'continuously' in proportion to hours, minutes, seconds: hr <- as.numeric(substr(tm, 12, 13)) + mi/60 + se/3600 a.hr <- pi/2 - 2*pi*hr/12 # Drawing the hands: lines(c(0,cos(a.hr)*r.hr), c(0,sin(a.hr)*r.hr), lwd=lwd.hr) lines(c(0,cos(a.mi)*r.mi), c(0,sin(a.mi)*r.mi), lwd=lwd.mi) lines(c(0,cos(a.se)*r.se), c(0,sin(a.se)*r.se), lwd=lwd.se) # The 'axel': points(0, 0, pch=16, cex=4) # Make sure it's all drawn, then sleep for 1/5 second: dev.flush(); Sys.sleep(0.2) } } # end of my.clock() ---------------------------------------------------------------- LECTURE 22, 2017/11/15: ORG: - Quiz 3 Makeup: Friday, Nov 17, 10-11:30am, 350 JMHH . If you want to take the makeup tonight in the TA office hr contact Matt Olson. . Not on quiz: spiral geometry, s/lapply() functions - Point of the quiz: reasoning about code! . Forces slow, methodical thinking based on principles . Learn about the deep differences between slow and fast thinking from the book by Daniel Kahneman: "Thinking, Fast and Slow" ==> Foundation of behavioral economics Nobel prize winner in economics Analysis of the systematic irrationalities in fast thinking, jumping to conclusions. Examples: ~ false causalities ~ post-hoc fallacies [Beware of your own magical thinking in tests!] - HW 2 is due tomorrow Thu, Nov 16, 11pm. Late submissions: ~ No email, please! ~ Canvas time-stamps your submission, so we know about lateness. ~ Budget rules apply, see syllabus. - Reminder -- Ethics rules for this class: . No cell phones, no use of laptops other than for this class! . No smelly foods, please! 
RECAP -- none, see previous recap on loops ROADMAP: Writing your own functions ---------------------------------------------------------------- LECTURE 21, 2017/11/13: ORG: - Quiz 3: . Today in class; . Makeup: Friday, Nov 17, 10-11:30am, 350 JMHH . If you take it, leave and come back in 30min. . Not on quiz: spiral geometry, apply functions - Point of the quiz: reasoning about code! . Forces slow, methodical thinking based on principles . Learn about the deep differences between slow and fast thinking from the book by Daniel Kahneman - HW 2 is due Thu, Nov 16, 11pm. - Instructor office hour today, Mon, Nov 13, 4:30-6:30pm F36 JMHH !!! - Reminder -- Ethics rules for this class: . No cell phones, no use of laptops other than for this class! . No smelly foods, please! RECAP & EXTENSIONS: Loops - Principles of looping: . Syntax (correct form of expressions): One-liner: for(i in 1:100) { print(i); print(2*i) } Multi-liner: for(i in 1:100) { print(i) print(2*i) } Use the second version for complex looping bodies. . Semantics (meaning of expressions, how they work): ~ A looping data structure over whose elements the loop runs Example: 1:100 ~ A looping variable that takes on all elements of the looping data structure in turn Example: i ~ A looping body (composite expression) executed for each value of the looping variable Example: { print(i); print(2*i) } - Here is another use of loops, out of season... 
    # Construct some geometry: 4 circle segments
    a <- seq(0,pi,length=101);  r <- 0.5
    # semi-circle with center (0.5,0):
    x1 <- r*cos(a)+r;  y1 <- r*sin(a)
    # semi-circle with center (-0.5,0):
    x2 <- r*cos(a)-r;  y2 <- r*sin(a)
    a <- seq(pi,1.5*pi,length=101);  r <- 2
    # circle segment with center (1,0):
    x3 <- r*cos(a)+r/2;  y3 <- r*sin(a)
    # clip unwanted coordinates:
    sel <- x3 <= 0;  x3 <- x3[sel];  y3 <- y3[sel]
    # circle segment with center (-1,0): flip sign and reverse previous
    x4 <- -rev(x3);  y4 <- rev(y3)
    # collect the four circle pieces in one vector each for x and y:
    x <- c(x1,x2,x3,x4);  y <- c(y1,y2,y3,y4)
    plot(x,y)
    # Refine and loop over small vertical random dislocations 'eps':
    dev.new()
    par(mar=c(9,8,8,8))
    epsilon <- 0.02
    for(eps in runif(10000,-epsilon,+epsilon)) {
      plot(x, y+eps, type="l", xlim=range(x), ylim=range(y), col="red", lwd=5,
           xaxt="n", yaxt="n", xlab="Will you be my Valentine?", ylab="", cex.lab=2)
    }

- Apply functions: Looping with simple functions

  . sapply():

      x <- sapply(Potpourri, class)
      x   # Print

    This is one of the most useful functions in all of R!
    Here, sapply() loops over the elements of 'Potpourri',
    obtains the string for class() of each element,
    collects the strings in a vector,
    and also copies the element names of 'Potpourri'.

  . Reconstruct the action of sapply() with a for-loop:

      y <- rep(NA, length(Potpourri))   # c(): not recommended
      for(j in 1:length(Potpourri)) {
        y[j] <- class(Potpourri[[j]])
      }
      names(y) <- names(Potpourri)
      y

    Check equality:

      all(x == y)

  . If the results of the looping function (here: class()) are not uniform,
    sapply() returns a list:

      sapply(Potpourri, names)
      sapply(Potpourri[5:length(Potpourri)], names)

    ==> The vectors returned by names() applied to the elements of Potpourri
        are of different lengths (many being NULL),
        hence should be returned as a list.

  . Benefit of sapply():
    + collects results of looping in a data structure
    + no need to initialize a data structure to be filled with results
    + no need to assign names to results

  . Drawback of sapply():
    + so far allows only one function call in the body of the loop
    HOWEVER, wait for the next chapter!
    Once we know how to write our own functions,
    we can give sapply() actions as complex as a for-loop.

  . lapply(): like sapply(), but ALWAYS returns a list

- Rules for using loops in R:

  . Avoid loops if possible -- C and Java programmers!

  . Loops with large numbers of repetitions are SLOW!
    Why? Because R is an interpreted language.
    Interpreting high-level code is
    + fast in terms of human time scales (fractions of a second), but
    + slow in terms of computational time scales (milli/micro-seconds).
    Illustration: A loop with 10,000 iterations
                  and cost of interpreting high-level code
                  at 1 milli-second (0.001 second) per iteration
                  causes a delay of 10000 * 0.001 = 10 seconds.

  . If you are tempted to write a loop for a numeric computation,
    ask first whether there might exist a function in R to do what you want.
    Example: A dataframe might have uniformly numeric columns,
             as when the 12 columns represent monthly precipitations in Philly,
             each row representing one year's worth of monthly data.
             You need the sums of the rows to compute yearly precipitation for each year.
             Should you loop over the rows and apply sum() to each row?
             No! R provides a function rowSums() for this case!
             It is also blindingly fast.


ROADMAP:

- Complex examples of using sapply() [Section 1]
- Writing your own functions: Chapter 11.


----------------------------------------------------------------

REVIEW MATERIAL BASED ON OFFICE HOUR 2017/11/08, BEFORE QUIZ 3:

* Compare dataframes and lists and vectors:

  - vectors:
    . contain only elements of uniform basic data type (no complex elements)
    . have elements stored in consecutive memory locations (workspace locations)
    . have no hierarchy: combining two vectors: c(x,y) becomes one vector, concatenated from x and y
                         length(c(x,y)) is: length(x)+length(y), but neither is known from c(x,y)

  - lists:
    . can contain arbitrary basic and complex types of elements
    . are vectors of addresses pointing at elements stored anywhere in memory (workspace)
    . have hierarchy: length(list(x,y)) is 2

  - dataframes: implemented as lists
    . elements are vectors of same length (interpreted as columns)

* How to understand hierarchy in lists, nested:

  - Define a list:

      x <- list(1, 1:3, list(letters[1:3], NA, c(TRUE,TRUE)), 11:15)

  - Draw a diagram of the list:

                       x
             /     /       \        \
            1    1:3       list     11:15
                         /   |   \
            letters[1:3]    NA    c(TRUE,TRUE)

  - Find out the following lengths based on reasoning:

      length(x)
      length(x[[3]])
      length(x[[3]][[1]])
      length(x[[4]])

  - Single vs double brackets in lists:

    . Example:

        x[[3]]      # reaches for element 3 of list 'x'
        x[c(1,3)]   # reaches for elements 1 and 3 in 'x' and packages them in a list of length 2
                    # i.e., we get a sublist of list 'x'; draw its diagram!

    . Summary: [[]] gets a single element
               []   gets multiple elements wrapped in a list, i.e., forms a sublist

    . Discussion: Single brackets can be more general than forming sublists.
      Example:

        x[c(1,2,1,2)]   # Legal! List of length 4 from elements 1 and 2 of 'x' repeated
                        # Draw its diagram!


----------------------------------------------------------------

LECTURE 20, 2017/11/08:

ORG:

- Quiz 3:
  . Coming Monday, Nov 13, in class
  . Makeup: Friday, Nov 17, 10-11:30am, 350 JMHH
            If you take it, appear 30min late to class.
  . Material: Up to and including 'for' loops (not apply() functions)
              with emphasis on Chapters 8:10,
              but older material will also appear.
              Example: random numbers/simulations
              Not on quiz: spiral geometry

- HW 2 is posted, due Thu, Nov 16, 11pm.

- Instructor office hour today, Wed, Nov 8, 5:30-6:30pm,
  Instructor's office: 471 JMHH

- Reminder -- Ethics rules for this class:
  . No cell phones, no use of laptops other than for this class!
  . No smelly foods, please!


RECOMMENDATIONS FOR QUIZZES:

- Slow yourself down!
- Think slowly and deeply, from principles, not randomly and not approximately! - When there are similar/related concepts, it is probably the difference that matters! Example: sample mean vs true mean list vs vector vs dataframe - Natural generalizations should be recognized: If there is a difference between sample mean and true mean, there is probably a similar difference between sample correlation and true correlation. - Recognize principles: . Why does cor(rnorm(N), rnorm(N)) converge to zero when N --> Inf? . Does the same hold for the following? cor(runif(N), runif(N)) cor(rnorm(N), runif(N)) cor(rnorm(N,m=10,s=1000), runif(N)) - However, if you have no clue to the answer, you should make a random guess. . This is in the nature of multiple choice tests. . Do not forfeit the 1/4 chance of being randomly correct. . If you got your random guess right, be honest to yourself that you won a lottery and you truly do not know the answer. - Finally, if you 'feel you understand it', realize that it counts for nothing. . The only way to truly know that you understand it is honest self-testing. Go over chapters and recaps, cover up answers and recreate them. . If in a quiz you 'feel that choice B should be the correct one', you have truly not understood the material. . You need to see the principles that apply and use stringent reasoning based on them. - As for grades, nobody will fail if . all the homeworks are handed in with proof of valid effort and . all the quizzes are taken (makeup is fine). RECAP: Composite expressions and loops - Composite expressions: . Not useful in themselves, but essential building blocks for loops and functions . Syntax for forming composite expressions? . What does a composite expression return? . How do you achieve printing inside a composite expression? . What happens to the term 2*x in the following composite expression? 
{ x <- runif(3) # 'x' is assigned a vector containing three unif random numbers 2*x # doubles the value of 'x' and discards the result, not printed -log(x) # returns -log of 'x' as the result of the whole composite expression } - Loops: . Plot 'rnorm(100)' first, then add/concatenate 'rnorm(10)' and plot repeatedly, 290 times: Instructions: ~ Assign 'rnorm(100)' to a symbol 'x', then modify 'x' repeatedly. ~ Call dev.new() beforehand once in order to create a plot window outside RStudio. ... dev.new() x <- rnorm(100) plot(x) for(i in 1:290) { x <- c(x,rnorm(10)); plot(x); dev.flush() } . Repeat but use 'pch=16' in plot(). ... x <- rnorm(100) plot(x, pch=16) for(i in 1:290) { x <- c(x,rnorm(10)); plot(x, pch=16); dev.flush() } . Repeat but use 'ylim=c(-4,4)' in plot(). ... x <- rnorm(100) plot(x, pch=16, ylim=c(-4,4)) for(i in 1:290) { x <- c(x,rnorm(10)); plot(x, pch=16, ylim=c(-4,4)); dev.flush() } . Repeat but use 'xlim=c(0,3000)' in plot(). ... x <- rnorm(100) plot(x, pch=16, ylim=c(-4,4), xlim=c(0,3000)) for(i in 1:290) { x <- c(x,rnorm(10)); plot(x, pch=16, ylim=c(-4,4), xlim=c(0,3000)); dev.flush() } . What is the looping variable? Who chooses it? ... 'i', chosen by the programmer . What values does the looping variable take on as the loop is executed? ... 'i' takes on all elements of '1:290' in turn . Can you loop over a list? ... yes, see chapter 10 . Can you loop over a dataframe? ... yes, see chapter 10 ROADMAP: - Uses of loops, contd. - Looping with apply-functions (not on quiz 3) - Functions that have implied loops - Writing your own functions ---------------------------------------------------------------- LECTURE 19, 2017/11/06: ORG: - Quiz 3: . Next Monday, Nov 13, in class . Makeup: Friday, Nov 17, 10-11:30am, 350 JMHH . Material: up to class on Wed, 11/08. - HW 2 is posted, due Thu, Nov 16, 11pm. - No instructor office hour today at 4:30pm. Instead: this Wed, Nov 8, 5:30-6:30pm - Reminder -- Ethics rules for this class: . 
No cell phones, no use of laptops other than for this class! . No smelly foods, please! - Help: What bird was this?? RECAP: Lists - How do you ask whether a data structure is a list? Use the following data structures as examples: is.list(Potpourri) is.list(Salary.df) is.list(1:10) - How do you learn the length of a list and the names of its elements? Use 'Salary.df' to do both: length(Salary.df) names(Salary.df) - 'Mental models' for vectors and lists: . Vectors: ... elements of vectors are allocated as consecutive memory locations a symbol pointing at a vector needs to know the beginning and the length . Lists: ... vector of addresses to arbitrary data structures - In what sense is a list a vector? ... vector of addresses - In what sense is a dataframe a list? ... dataframes ARE lists, but the elements have to be vectors of the same length - Recall: In lists we can access elements using the '$' convention: Potpourri$one Potpourri$"the number pi" Why the quotes in the second example? ... - Does it make sense that we could also access columns of dataframes with the '$' convention? ... of course - Trying to access the "Age" of "matt" in the dataframe 'Salary.df' in three ways: . Using dataframe indexing: single brackets with comma Salary.df["matt","Age"] . Using list/vector indexing: Salary.df[["Age"]]["matt"] # Doesn't work -- we lost the names!! Salary.df[["Age"]] # We lost the names! Salary.df[["Age"]][6] # Works but is not useful, we need to know Matt's position... . Using the "$" convention: Salary.df$Age["matt"] # Doesn't work -- we lost the names again!! Salary.df$Age # We lost the names! Salary.df$Age[6] # Works but is not useful, we need to know Matt's position... ROADMAP: - Regression results as lists - Looping - Writing functions ---------------------------------------------------------------- LECTURE 18, 2017/11/01: ORG: - Solutions to HW 1 and Quiz 2 are posted. - Reminder -- Ethics rules for this class: . 
No cell phones, no use of laptops other than for this class!
  . No smelly foods, please!


RECAP: lists

- # Analyzing a list that contains a list

    miniList <- list(1:4, letters[1:3], list(1:2,11:13) )
    miniList[[1]]
    miniList[[3]]      # The 3rd element is a list, too!
    length(miniList)
    miniList[c(1,3)]   # the 2nd element is the 3rd element of miniList, which is a list in turn
    miniList[c(1,2)]   #

- Why does the following work?

    sillyList <- list(list(list(list("A"), list("B")),
                           list(list("C", 0)),
                           list(1, 2) ),
                      list(TRUE, 3:5),
                      list(list(c("A","Z"), 6 ),
                           list(list(list(Inf))),
                           -Inf ),
                      NA )
    sillyList

  ...

- Draw a tree diagram to understand 'sillyList'.
  Then answer the questions below in two ways:
  based on the diagram and based on code.

- How long is 'sillyList'?
  ...

- How long is the 3rd element of 'sillyList'?
  ...

- Get the 2nd element of the 3rd element of 'sillyList':
  ...

    sillyList[[3]][[2]]

- How long is the result of the previous question?
  ...

- Compare:

    c("A",c("B","C"))
    list("A", list("B","C"))

  What does this say about nesting calls to c() and calls to list()?
  ... c() flattens
      list() maintains hierarchy of construction

- What is the difference between the following two lines?
  Answer first by thinking, then execute.

    list("A","B","C")[1]
    list("A","B","C")[[1]]

- What is the difference between the following two lines?

    list("A","B","C")[1:2]
    list("A","B","C")[c(1,3)]   # a variation of previous
    list("A","B","C")[[1:2]]

- What is the difference between the following two lines?

    list("first"=10, "second"=100, "third"=1000)["first"]
    list("first"=10, "second"=100, "third"=1000)[["first"]]

- What is the difference between the following two lines?

    list("first"=10, "second"=100, "third"=1000)[c("first","third")]
    list("first"=10, "second"=100, "third"=1000)[[c("first","third")]]

- Think through: What do the following three expressions do?

    list(sillyList[[1]], sillyList[[3]]);   sillyList[c(1,3)];   sillyList[-c(2,4)]

  ... They do the same!!!!
Describe in terms of the graph what these two lists are. ROADMAP: - Lists (contd.) - Loops - Writing functions ---------------------------------------------------------------- LECTURE 17, 2017/10/30: ORG: - TA is working on homework and quiz 2 grading (solutions to be posted) - No instructor office hour today -- apologies. - Reminder -- Ethics rules for this class: . No cell phones, no use of laptops other than for this class! . No smelly foods, please! RECAP: x <- c(NA, 2:9, NA); x - How do you ask whether a data structure is of type 'logical'? ... is.logical(x) - How do you find out which positions in a vector contain missing values? ... is.na(x) - How do you count the number of missing values in a vector? ... sum(is.na(x)) - Write a one-liner to replace the missing values of 'x' with the median of the non-missing values of 'x'. ... x[is.na(x)] <- median(x, na.rm=TRUE) x - How can you check whether the values 1, 2 and 3 are contained in 'x'? ... any(1 == x) # One answer at a time. any(2 == x) any(3 == x) c(1,2,3) %in% x # All three answers in one swoop! - How does the operation used in the previous question deal with NAs? ... x <- c(NA, 2:9, NA); x # Recreate original 'x' with NAs c(1,2,3) %in% x c(NA,1,2,3) %in% x - How does this differ from '=='? ... NA == NA NA == 2 NA == 3 NA == x # By the logic of the previous 3 examples, we get a vector of NAs! # ==> This is a typical mistake! This is NOT how we find the NA positions in 'x'!!!!!!!!!!!! However: NA %in% x ==> %in% checks whether NA occurs in 'x', like is.na() !!!!!!!!! - How can you select "A" with probability 1/3 and "B" with probability 2/3? ... ifelse(runif(1000) < 1/3, "A", "B") - How can you find out which elements of 'realEst$Location' contain "SUB"? ... 
    str(realEst)                       # Overview of the dataframe (see chapter on reading data from files)
    realEst$Location                   # Column of interest
    table(realEst$Location)            # It's categorical, hence tabulate it: absolute frequencies, counts
    grepl("SUB", realEst$Location)     # Finally: Searching for labels containing "SUB"
    # Next:
    grepl("SUBNEW", realEst$Location)  # TRUE for "SUBNEW"; note 'substring' is not strict; equality gives TRUE, too.
    realEst$Location == "SUBNEW"       # The more natural way, but does the same as the previous line in this case.
    # Tricky:
    "SUB" <= realEst$Location          # Does the same as grepl("SUB", realEst$Location) in this case


ROADMAP: List structures


----------------------------------------------------------------

LECTURE 16, 2017/10/25:

ORG: TA is working on homework and quiz 2 grading

- Reminder -- Ethics rules for this class:
  . No cell phones, no use of laptops other than for this class!
  . No smelly foods!


RECAP:

- What are the comparison operations:
  ...  ==  !=  >  <=  <  >=

- Describe the comparison operations on the three fundamental data types:
  ...
  ...
  ...

    TRUE > FALSE
    2.3 > 4.6
    "a" < "Aa"

- What happens if different fundamental data types are being compared?
  ...

    TRUE > 2.3
    233 < "2300"
    TRUE < "abc"
    1 < "abc"    # quoted digits before letters

  ==> Coercion: logical --> numeric --> character

- What do you expect from a comparison with NA and NaN?
  ...

    NA < 1
    "abc" < NA

- Why do sort() and unique() depend on comparison operations?
  Which does each depend on?
    sort():   ...  <=, <, >=, >
    unique(): ...  ==, !=

- What are the logical operations?
  ... |
  ... &
  ... !

- Why do we need logical operations?
  ... E.g., for formulating 'database queries':
      Salary > 50K & Salary < 53K   ("Salary between 50K and 53K")

- Generate the "truth tables" for the logical operations,
  i.e., enumerate all possible combinations:
  ...
  ...
  ...
c(TRUE,TRUE,FALSE,FALSE) | c(TRUE,FALSE,TRUE,FALSE) c(TRUE,TRUE,FALSE,FALSE) & c(TRUE,FALSE,TRUE,FALSE) !c(TRUE,FALSE) - How are any() and all() related to '|' and '&', and how do they differ? ... ... any(c(TRUE,FALSE,FALSE)) all(c(TRUE,FALSE,FALSE)) any() and all() map logical vectors to single logical values! These functions are used for computational purposes only, not data analysis. In actual data analysis use table() instead, which is more informative: table(c(TRUE,FALSE,FALSE)) - What do sum() and mean() do on logical data? ... ... sum(): count of TRUE mean(): prop of TRUE - What are functions that return logical values? ... ... ROADMAP: - NA imputation - subset test: x %in% y - conditional selection: ifelse(,,) - searching for strings: grepl(,) - list structures ---------------------------------------------------------------- LECTURE 15, 2017/10/23: ORG: - Makeup quiz, 2nd opportunity: Today, Monday, Oct 23, 4:30pm, right after this class, SHDH 215 (!!! NOT JMHH F36) Please, arrive on time. - Honor code: No communications between those who have taken the quiz and those who have not. - Reminder -- Ethics rules for this class: . No cell phones, no use of laptops other than for this class! . Please, no smelly foods! RECAP: - What are the 4 ways of indexing vectors and dataframes? ... ... ... ... - What is the rule for indexing dataframes? ... - MANDATORY: . The 4 indexing mechanisms MUST BE MEMORIZED! . Whenever there is a question of selecting or deselecting elements of a vector or rows/columns of a dataframe, you must reflexively look here for a solution! - How do logical vectors arise in practice? ... - Examples referring to the dataframe 'Salary.df': In each case, what data structure is returned? . Select the column containing employees'' ages. ... Salary.df[,"Age"] . Select the columns containing employees'' ages and gender. ... Salary.df[,c("Age","Gender")] . Select those employees who make more than Ed. ... 
sel <- Salary.df[,"Salary"] > Salary.df["ed","Salary"] Salary.df[sel,] # Correct, but messy to read: Salary.df[Salary.df[,"Salary"] > Salary.df["ed","Salary"],] . Select gender and age of those employees who make more than Ed but only return age and gender: ... sel <- Salary.df[,"Salary"] > Salary.df["ed","Salary"] Salary.df[sel,c("Age","Gender")] . Select age of those employees who make more than Ed. ... sel <- Salary.df[,"Salary"] > Salary.df["ed","Salary"] Salary.df[sel,"Age"] - When do you need to be cautious when making comparisons of numbers? ... What is a simple solution? ... 0.3-0.1 == 0.5-0.3 round(.3-.1, 5) == round(.5-.3, 5) ROADMAP: - comparisons of character data - counts and proportions of logical data - logic operations - functions that produce logical values ---------------------------------------------------------------- LECTURE 14, 2017/10/18: ORG: - New: Matt Olson (TA) offers makeup quiz 2 on Friday, Oct 20, in 350 JMHH, arrive between 10am and 11am. - Makeup quiz is also (still) offered on Monday, Oct 23, arrive between 4:30pm and 5pm, room TBA (NOT F36) - Honor code: No communications between those who have taken the quiz and those who have not. Section 1 -- Section 2 -- Makeup - Reminder -- Ethics rules for this class No cell phones, no use of laptops other than for this class! RECAP: * Reading dataframes from files: - What functions can you use? In which situations? ... ... - How do you specify the file? ... # Find out R's current working directory: getwd() # If you don't like it, you can set it: setwd("c:/User/STAT-470/") # for example # If the data file is on your computer and in the working directory (default or set by you): dat <- read.csv("some-data.csv", ...) # Or you can give a full 'path', i.e., something of the form disk:/folder/folder/..../, # in which case the working directory is irrelevant: read.csv("c:/user/STAT-470/some-data.csv", ...) # For files on the internet: read.csv(...URL...) 
- What argument should you NEVER forget when handcrafting dataframes or reading them from files? ... * Logical values, indexing with logical values, comparison operations - The symbols 'T' and 'F' for TRUE and FALSE: caution! Why? ... - Indexing with logical values: # Play toy: x <- 1:10 # Indexing with logical values: x[c(TRUE,FALSE,TRUE,FALSE,TRUE,FALSE,TRUE,FALSE,TRUE,FALSE)] # Logical indexing using symbols: sel <- c(TRUE,FALSE,TRUE,FALSE,TRUE,FALSE,TRUE,FALSE,TRUE,FALSE) x[sel] # What happens if the logical vector is too short? x[c(TRUE,FALSE)] # ... # What is surprising when executing the next statements? x[c(TRUE,FALSE,TRUE)] # ... x[c(TRUE,FALSE,FALSE)] # ... x[rep(c(TRUE,FALSE,FALSE), length=length(x))] # Same as previous line # AYT? What do you expect the following line to do? x[TRUE] # ... # AYT? Can one generate repeats with logical indexing the way x[c(3,3,3)] does? # ... - How logical vectors come about: Comparison operations # Comparing individual numbers: 2 < 3; 2 > 3; 2 <= 2; 3 >= 3; 2 == 3; 2 != 2 # Comparing vectors element by element: c(10,0,5) <= c(5,0,7) 1:3 < 2:0 # Typical examples: 'query vectors' StudentHeights >= 66 Salary == 53000 # What happens here? runif(100) > 0.5 # ... ROADMAP: - Comparison operations (contd.) - Logical operations! ---------------------------------------------------------------- LECTURE 13, 2017/10/16: ORG: - Reminder -- Ethics rules for this class No cell phones, no use of laptops other than for this class! - Quiz 2: Wednesday, Oct 18, in class Material: cumulative to Oct 16, emphasis on simulation and indexing, NO logic (Chap. 8) Makeup: Monday, Oct 23, 4:30-5:30pm instructor''s office hour, F36 JMHH If you take the makeup, appear 25 minutes late to class on Wed. [Note to self: find/post previous quiz questions] RECAP: dataframes - Rules about data types in dataframes? ... - Query functions for dataframes? ... class(); dim(), nrow(), ncol(); rownames(), colnames() - Indexing rules for dataframes? ... 
x[,], in both positions, use what you can do in vectors vec[] (pos int, neg int, names)

- Practice: Example 2 in Chapter 07, about 'Salary.df'


ROADMAP:

- Reading dataframes from files
- Logic (Chapter 08)


----------------------------------------------------------------

LECTURE 12, 2017/10/11:

ORG:

- Reminder -- Ethics rules for this class
  No cell phones, no use of laptops other than for this class!

- Quiz 2: Wednesday, Oct 18, in class
  Material: cumulative to Oct 16
  Makeup: Monday, Oct 23, instructor's office hour
          If you take the makeup, appear 25 minutes late to class.


RECAP:

    x <- c(abc=3, .abc=4, .abc_=5)

- How do you ask a vector for the names of its elements?
  ...

    names(x)

- Given an existing vector, how can you assign names to all its elements?
  ...

    names(x) <- c("a","aa","aaa")

- When doing so, do you need to follow symbol syntax for the names?
  ...
  When do you need to follow symbol syntax for named elements?
  As above in the construction of 'x': unquoted arguments abc, .abc, .abc_
  The following versions with quoted argument names would NOT need to follow symbol syntax:

    x <- c("abc"=3, ".abc"=4, ".abc_"=5)
    x <- c('abc'=3, '.abc'=4, '.abc_'=5)

  GENERAL FACT: You can always quote argument names to ALL functions,
                but generally one doesn't use quoted argument names
                because most functions use symbol syntax for their arguments.

- How can you set specific names and leave others alone?
  ...

    names(x)[2] <- "lkajdfafd732*&%&^*&*&*&"

- What needs to be in the brackets when indexing a vector?
  ... x[vector of positive integers between 1 and length(x), possibly with repeats]
  ... x[vector of negative integers between -1 and -length(x), possibly (but uselessly) with repeats]
  ...
x[vector of type character (of strings), subset of names(x), possibly with repeats ] ROADMAP: - Dataframes - Logic ---------------------------------------------------------------- LECTURE 11, 2017/10/09: ORG: - Reminder -- Ethics rules for this class No cell phones, no use of laptops other than for this class! - HW 1: . Due today 11pm . Instructor office hour Monday 4:30-5:30pm, JMHH F36 (<<< note room). RECAP: * FUNDAMENTALS: - Why are the following code lines ok and their returned values sensible? length(3) x <- 3; length(x) - What can you expect when two vector arguments to a function or operation are of unequal length? ... - How can you interpret the following code in light of the previous point? Salary + 1000 # Explain: ... * INDEXING WITH INTEGERS: - From the vector 'Salary', select the first and the last element: ... - Criticize the following 'expressions': Salary[(1,5)] # .... Salary[1,5] # ... - What needs to be in the bracket for SELECTION? ... - From the vector 'Salary', DEselect (remove) the first and the last element: ... - What needs to be in the bracket for DESELECTION? ... - Does the following code work? If yes/no, why? N1 <- floor(log10(0.5)) N2 <- floor(log10(500)) sel <- seq(N1,N2) Salary[sel] # ... - Does the following code work? If yes/no, why? Salary[floor(log10(50:150))] - Select Cecilia and Liz from 'Salary': ... - If we select elements by name, what is the assumption about the vector? ... # that there are names for the elements! * VECTORS WITH NAMED ELEMENTS: - When we form a vector such as x <- c(a=10,b=100) does it mean that there exist symbols 'a' and 'b' in your symbol table? Explain! ... - How can we ask for all names in a vector? Show for 'x': ... names(x) ROADMAP: - Rest of Chapter 6, more about names, indexing with names of elements - Chapter 7: dataframes ---------------------------------------------------------------- LECTURE 10, 2017/10/04: ORG: - Reminder: Ethics rules for this class . 
No cell phones, no use of laptops other than for this class! . Reciprocity: You may remind the instructor of infractions as well! - HW 1: . Due Monday 11pm . Instructor office hour Monday 4:30-5:30pm, JMHH F36 (<<< note room). . No TA office hour tomorrow Thu. . Any questions? RECAP: - First a correction: Chapter 5 . Scale invariance of correlation is limited to positive multipliers: cor(height*c1, weight*c2) = cor(height,weight) (c1,c2 > 0) Please, correct! . What about the square of the correlation? cor(height*c1, weight*c2)^2 = cor(height,weight)^2 Do we still need c1,c2 > 0? ... - Compare: (1) Simulation of a linear regression model versus (2) "Parametric bootstrap" of a linear regression model . What do we do in (1)? ... we specify a, b, sigma, an x vector of length N, then set y <- a+b*x+rnorm(N,s=sigma) . What do we do in (2)? ... we have data, we estimate a, b, sigma => ahat, bhat, sigmahat; yboot <- ahat+bhat*x+rnorm(N,s=sigmahat) # leave x alone! then plot(x, y); plot(x, yboot) and compare ... ==> If the two plots look very different, then the model is deficient . How can (2) inform you about the adequacy of the model? ... - Vector operations: . List some of the functions we mentioned: ... . Classify some of these functions: ... . Missing value treatment by these functions: ... - Concept of recycling: . Explain: ... . Examples: ... 1:10 + 1:2 1:10 * 1:2 1:10 * 1:3 # Now what? 1:10 + 5 # Falls under the concept of recycling ROADMAP: - Chapter 6: Indexing of vectors with integers and strings/names - Chapter 7: Data frames - the bread and butter data structure for statistics ---------------------------------------------------------------- LECTURE 9, 2017/10/02: ORG: - Homework 1: new due date due to fall break - Monday, Oct 9, 11pm - Instructor office hours: Mon 10/02 (today) and 10/09 (next week) 4:30-5:30pm, F36 JMHH No TA office hour this Thu due to fall break. 
RECAP: Random numbers, sampling, and stochastic simulations - In what sense are random numbers random? ... - What random number generators do we know? ... - What are their arguments? ... - What happens to the histograms of random numbers as the number N of draws increases? ... - How do we randomly sample elements of a given vector? ... - Can we sample from any type of vector? ... - What are the two sampling modes? ... - Which of the two sampling modes generates iid draws? ... - In what sense are the draws of the other sampling mode dependent? ... - Intuitively, when is this dependence stronger -- when the input vector is short or long? ... - Describe the ingredients for the simulation of a simple linear regression model: ... N <- 100 x <- runif(N) # Totally arbitrary x-values, but must be more than one to fit a line ## rnorm(N) # another example ## rnorm(N, m=100, s=5) # another example ## rep(1:10,10) # another example, really, the x-values are arbitrary!! a <- 10 b <- 5 y <- a + b*x + rnorm(N,m=0,s=0.1) plot(x, y, pch=16) abline(a=a, b=b, col="red") ROADMAP: - "Parametric bootstrap" simulation of a simple linear regression (Chapter 4) - Vector computations, vector recycling (Chapter 5) - Indexing vectors with integers and strings (Chapter 6) ---------------------------------------------------------------- LECTURE 8, 2017/09/27: ORG: - Quiz 1 results and 1st homework are posted. - TA office hour Thu 4:30-6:30pm, F94 JMHH - Instructor office hour Monday 4:30-5:30pm, F36 JMHH RECAP: Random numbers and stochastic simulations - What is the opposite of stochastic? ... - What do the two opposite terms mean? ... det.: ... stoch.: - Is there true randomness in the world? ... - What is a simulation on a computer? ... - What is the role of random numbers in computer simulations? ... - Are random numbers generated by your computer truly random? ... - How come a random number generator produces a different result every time it is called? ... 
- What is the most fundamental random number generator? ... - What is the underlying probability density function? ... - How should we think of a 'probability distribution'? ... - What links a probability distribution to finite data drawn from it? ... - What are the 'parameters' of uniform distributions? ... - What are their default values in R? ... - What happens to a histogram of random numbers drawn from a uniform distribution as the number of random numbers increases? ... - How can we generate random integers from uniform random numbers? ... - Are these random integers drawn with or without replacement? ... ROADMAP: - Normal distributions and their random numbers - Simulation of a simple linear regression model ---------------------------------------------------------------- LECTURE 7, 2017/09/25: ORG: - Quiz 1 results and 1st homework to be posted shortly. - Instructor office hour today 4:30-5:30pm, 471 JMHH - TA office hour Thu 4:30-6:30pm, F94 JMHH RECAP: - Basic data types: . ... . ... . ... numeric; character; logical [You can use the function class() to find out the data type of R objects.] - Functions we know: log(); log10(); exp(); sqrt() Numeric operations (They are 'syntactic sugar' for functions.) c() c() # Make up an example with named elements seq() rep() length() rev() sort() unique() mean() round() trunc() ceiling() floor() plot() ls() # Rstudio always shows its output in ... rm() # save.image() # Why do you need this rarely in RStudio? help() # " What is the 'data type' of arguments to help()? - What are the two main purposes of functions? . ... compute data structure . ... plotting, printing ('side effects') - Symbols: . Synonyms? ... . Symbol syntax? ... . What do symbols do? ... point at an object ... give permanence to compute objects . Where are symbols kept? ... user symbol tables; user namespace; global environment ... system symbol tables; system namespace; (system environment) . What kind of symbols can be removed by rm()? ... . 
What type of object is R looking for when it sees the symbol 'mean' in mean(1:10) # ? . What type of object is R looking for when it sees the symbol 'mean' in mean # ? . Message about symbols: ~ Symbols can point at data AND at functions. ~ In this sense, data and functions are both objects in R. ~ Users can assign functions to symbols. . Note on RStudio: ~ The upper right panel always shows the output of ls(), divided into two sections: 'Data' and 'Functions' ~ RStudio does NOT allow you to remove symbols using that panel! This may be out of caution: point-and-click removal could do damage that would be difficult to keep track of. . Synonyms: 'user symbol table' = 'user namespace' = 'global environment' RStudio uses the last term for the upper right panel. ROADMAP: - (Pseudo-)Random numbers and simulations - Vector computations - Indexing vectors with integers and names ---------------------------------------------------------------- LECTURE 6, 2017/09/20: ORG: - Chapter files have occasional edits from class to class, apologies! You should be able to copy/edit the new material/versions by hand into your copy from last class. - Reminder: NO CELL PHONES, LAPTOP USE FOR THIS CLASS ONLY, ... [see Syllabus for rules] RECAP: - Three basic/atomic/fundamental data types, with examples: ... ... ... - Three functions we studied in detail, with examples: ... ... ... - Two concepts related to arguments of functions: ... ... ... - Sideline: plot() . Explain to yourself how the x and y arguments work . Explain how curves are drawn, and why this works for the human eye, and why computers are unable to draw perfect curves. [Related to these questions is the notion of 'discretization'.] ROADMAP: - Complete Chapter 3: . a few more functions . more details about symbols . 
saving the workspace - Chapter 4: simulations with random numbers and random sampling ---------------------------------------------------------------- LECTURE 5, 2017/09/18: ORG: - Office hour today: 4:30-5:30pm, 471 JMHH - Today is the end of the course selection period. - Quiz 1 will be graded this week. - A homework is brewing... RECAP: - Three basic/fundamental/atomic data types: . ... numbers, numeric Examples: 2; -2; 2.2; 2e2 ... Uses: counting; measuring; . ... strings/character/text Examples: ... Uses: ... . ... Examples: ... Uses: ... - The simplest composite data type in R: ... vectors - A simple function to create arbitrary examples of the above: ... c() - What happens when vectors are arguments to the above function? ... c(1:3,11:13) - What happens if calls to the function are 'nested'? ... c(1,c(2,c(3,c(4,5)))) - General observations: . Much of R programming consists of calling functions applied to 'arguments'. . Examples of functions we have encountered so far: ... sqrt; sort; ls; rm; abs; log; .... . Calling functions triggers computations that produce a result. . The result is typically one of the following: ~ data/values created by the function, or ~ plots or ~ house keeping action (rm) . In the first case, ~ the data are printed and then disappear, as in c(1,10,100) ~ or the data/values are kept and made available for later use if they are assigned to a symbol, as in x <- c(1,10,100); print(x) ROADMAP: - creating patterned numeric vectors using the functions seq() and rep() - arguments to functions: default arguments, named arguments - operations on numeric vectors: length(), rev(), sort(), unique(), round(), ... - more about symbols: symbol tables, use of symbols for functions - saving the workspace ---------------------------------------------------------------- LECTURE 4, 2017/09/13: - TODAY: QUIZ 1 . If you take the makeup on Friday, please, step out and return in 25 minutes. . All quiz takers, please, fetch a problem set and a bubble form. . 
DO NOT START YET -- KEEP THE PROBLEM SET UPSIDE DOWN! . Start filling in the top of the bubble form: ~ LastName, First Name (printed and bubbled) ~ Penn ID (printed and bubbled) ~ Write 'Quiz 1' in the top right corner. . WAIT FOR THE INSTRUCTOR TO SIGNAL THE START OF THE QUIZ! . RECOMMENDED: ~ Solve the quiz first by circling your choices on the problem set. ~ Transfer your choices to the bubble form at the end. ~ If you make a mistake, ask for a new bubble form. ~ Over-writing causes missing data! . WHEN YOU ARE FINISHED, STAY SEATED AND QUIET !!!!!!!!!!!!!!!! ~ Turn the problem set and the bubble form upside down. Do not turn them upside up anymore!!!!! ~ You may then use your electronic devices quietly. . When the quiz is over, listen carefully for instructions!!!!!! . Ethics reminder! - TA Office hours: Tomorrow Thu, 4:30-6:30pm, JMHH F94 - Makeup quiz: Friday, arrive between 10am and 11am, room TBA. RECAP + RANTS: - Required attitude in programming and data analysis: . We need to be very detail-oriented, very lawyerly. . We are instructing computers to do things for us. . Computers are unforgiving about missed detail. . This is not cool; it is tedious. . However, one can get used to it, and along the way become a more careful thinker. - Symbols: . Symbol syntax . Assignment of values/data to symbols . Housekeeping: listing all symbols, removing symbols . Main difference to math symbols? ... . In complex programs and data analyses we may require lots of symbols, for ~ data ~ intermediate results ~ final results ~ for variables needed in computations - Syntax for R expressions: . Expressions can be spread over multiple lines. How? ... . Multiple short expressions can be written on a single line. How? ... - Missing values arise . as a result of problematic computations, . as a result of values missing in empirical data collection. 
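[Added illustration, not part of the original recap: a minimal sketch of the two sources of missing values listed above. The vector name 'heights' and its values are made up for this example; 0/0 is just one of several computations that produce a missing numeric value.]

```r
# (1) Missing values from problematic computations:
x1 <- 0/0            # undefined arithmetic yields NaN ("not a number")
is.nan(x1)           # TRUE
is.na(x1)            # TRUE: NaN also counts as missing

# (2) Missing values from data collection, coded as NA
#     ('heights' is a hypothetical example vector):
heights <- c(66, NA, 70)
is.na(heights)               # FALSE TRUE FALSE
mean(heights)                # NA: missingness propagates through mean()
mean(heights, na.rm=TRUE)    # 68: drop the NA before averaging
```

Many summary functions propagate NA by default, so the 'na.rm' argument is worth checking in their help pages.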
ROADMAP: Chapter 3 - Two more basic data types: character data, logical data - Forming vectors - Patterned numeric vectors ---------------------------------------------------------------- LECTURE 3, 2017/09/11: - QUIZ 1: Wednesday, Sept 13, beginning of class . The make-up quiz: Friday, Sept 15 Arrive between 10am and 11am on Friday. Room TBA on Canvas. . Material: End of Chapter 2, even if we do not get that far!!!!!!! ==> Self-study! (Includes quantitative literacy items.) (Also includes recap below.) . If you take the makeup quiz, on the day of the in-class quiz, appear to class 25 minutes late. Section 1: 12:25pm, Section 2: 3:25pm . Format: multiple choice, 12 questions, 4 possible answers each (actually: 13) 20 min . Ethics: Those who take the makeup quiz are under honor code not to inform themselves from those who took the quiz; and those who took the quiz at the regular time are under honor code not to inform those who will take the make-up quiz. . Practice quizzes from 2016 are posted, might not perfectly match what we do here... E.g., there will be no questions about integer division and remainder operations. - INSTRUCTOR OFFICE HOURS: Today, 4:30-6:30pm, 471 or 440 JMHH - CLASS ROOM RULES OF CONDUCT: . cell phones, use of computers, .... . Another word on ethics: https://www.washingtonpost.com/entertainment/theater_dance/on-broadway-dramatic-riches-await-in-both-words-and-music/2017/09/07/ - RECAP: . Two TYPES of computer languages? (1) ... (2) ... . What are their purposes? (1) ... (2) ... . Where is R code executed? General terminology: ... RStudio terminology: ... . Using R as a pocket calculator: ~ List the USUAL NUMERIC OPERATIONS by their symbols: ... ~ Do you recall some UNUSUAL NUMERIC OPERATIONS in R? ... ~ Describe the behavior of the 'ladder' operation; what is the result? ... ... ... ~ What happens when numeric operations are applied to a ladder? ... ~ Sort the operations by their order or 'precedence': ... 
~ List numeric functions, with some of their uses: (Section 2) ... ... ... ... ... ... ......... sin, cos, tan Two uses of trig: 1) geometry, 2) time series analysis (periods) sqrt, exp, log . Syntax rules for ~ parens: ... ~ blanks: ... . What is the issue with the result of the following expression? sin(pi) * ROADMAP: - Missing numeric values - Missing data values - Number representations - Symbols/variable names/variables - Assignment - Symbol syntax ---------------------------------------------------------------- LECTURE 2, 2017/09/06: - QUIZ 1: Wednesday, Sept 13, beginning of class . The make-up quiz is on Friday, Sept 15, ToD TBA, Statistics Department, proctored by our receptionist Noelle. 10am-11am time of arrival . Material: End of Monday lecture. . If you take the makeup quiz, please, appear on the day of the quiz to class 25 minutes late. (Section 1: 12:25pm, Section 2: 3:25pm) . Ethics: Those who take the makeup quiz are under honor code not to inform themselves from those who took the quiz. . Format: multiple choice, 12 questions, 4 possible answers each (actually: 13) 20 min - ENROLLMENT: . If you have any doubt about this course, its content, its pace, its instructor,... please, free up your seat. - SYLLABUS: Study it! In particular: . Budget rules for ... . Brownie points from ... . Homework collaboration rules - CODE OF CLASSROOM CONDUCT: . NO CELL PHONES!!! No texting, no messaging, no whatsapp, no emailing, no instagram, no twitter, ... Silence it and put it away! . Do not distract your fellow students from learning. . Use of the LAPTOP is exclusively for class-related activities. -- Honor code -- especially in the back rows . Minimize switching windows on your laptop to avoid peripheral-vision distraction of your fellow students. Fact: You are not alone in this room. . If you must leave early, let the instructor know and sit near an exit. Same if you expect an urgent phone call. . 'Like'-free professional expression, in speech and writing. 
- PREREQUISITES: The course does not assume that you know programming. Students who do know some R, please, be patient the first few weeks and be helpful to your fellow students with lesser background. - IF THIS COURSE IS BELOW YOUR LEVEL, please, free up your slot because others are waiting to get in. - THIS COURSE WILL BE OFFERED AGAIN IN THE FALL OF 2018. - EXPECTATIONS: This course strives to give an in-depth understanding of the R language. It will not strive for encyclopedic breadth. Just the same, you will be acquainted with over a hundred R functions by the end of the semester. - CANVAS is set up. The Syllabus as well as the first chapter are uploaded. Please, post questions on Canvas 'Discussions'. Everybody is encouraged to answer questions to the best of their knowledge and in professional language. Students who stand out answering questions receive brownie points that may increase their grade. - RSTUDIO: This software should be installed on your laptop by now. Did anybody NOT manage to set up RStudio? Two modes of operation of RStudio: (1) Type R code directly into the Console window in the bottom left. ==> You are using R as a calculator or for simple experiments. (2) Edit a file in the upper left window and send R code to the Console according to the choices found under 'Code' in the top bar, most importantly: Run Line(s): Ctrl/Command+Enter We usually use RStudio in mode (2): . In class we edit the current chapter file. . For homework you will edit a homework file. ----------------------------------------------------------------