Computer programming, or coding, is the use of a specific language to execute commands on a computer. The commands are usually called functions, though there’s more to it than that. A function is simply a mapping between some input and some output: it specifies a procedure for turning input into output.
For example, the sample mean is a function. The equation \(\frac{1}{N}\sum X\) spells out the procedure: add up the scores and divide by how many there are. This function has “arguments”, the details that are passed to it and used. In this example, the arguments are N (the sample size) and the sum of the scores.
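To make the idea of a function and its arguments concrete, here is a minimal sketch in R. The name sample_mean and its argument names are made up for illustration (R’s own function for this is mean()), and the numbers are the sum and count of the ten patient scores used later in this tutorial.

```r
# a hypothetical function: maps two inputs (arguments) to one output
sample_mean <- function(sum_x, n) {
  sum_x / n   # the procedure: divide the sum of the scores by the sample size
}

# "call" the function by supplying a value for each argument
result <- sample_mean(sum_x = 31.3, n = 10)
result
```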
At its essence, R is a procedural language, meaning that programming in R is very literal, and involves writing out calls to various functions in order to obtain the output that you’d like to have. It also has “object-oriented” properties in the background. We don’t need to know about those, but they’re great since they make R very flexible as an open source tool. Open source means that anyone in the world can add functions or procedures to R by writing new code in R.
In R, we have a choice between typing commands at the console, which is just the command prompt, and writing out many commands and statements in a file, then running some or all of them when we need them. The latter is basically the same as writing syntax in SPSS. Like syntax, R scripts can be written, saved, re-used, etc. In both cases, the idea is to be able to reconstruct your analysis exactly each time you re-do it. This saves time and energy, and it also makes the analysis “reproducible”, so other scientists can check your work if they desire.
We will use the RStudio Script window to write and save our code. But let’s start with the console.
Type the following statements one by one in the console and see the output. You can actually copy each line one at a time and paste it in your console.
Press ENTER to have the console return results.
Note that any line that starts with # is not read by R. This is called a comment, or “commenting out” a line. These are handy in script files as they explain what is there to a person who doesn’t know what you were hoping to do (which is sometimes you!).
# addition
100 + 100
# subtraction
100 - 100
# multiplication
100 * 100
# division
100 / 100
# Squaring
100^2
# Raising to any power
100^(1.23)
# Using "e" (Euler's Number)
exp(-1.2)
So at this point we can see that R is a handy-dandy calculator. More elaborate computations can be performed using parentheses and some other good stuff. Let’s try it, but now I want you to open up a new Script file. You can do that by clicking File > New File > R Script (or by various other short-cut methods).
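Before moving on, here is a quick illustration of what parentheses do. The two lines differ only in grouping; the numbers are made up for this sketch.

```r
# with parentheses: the addition happens first, then the division
(100 + 50) / 3
# without parentheses: division comes before addition
100 + 50 / 3
```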
The basic unit of data in R is a vector. Think of it like a row or column of data in Excel. Data objects aren’t very useful if we don’t save them to R’s Global Environment (this means the active datasets in R’s memory at any time).
We “assign” data to an object in R using the assignment operator <-. To type it, type the less-than sign < and then the short dash -. We put data in a vector by using the c() syntax. c is short for concatenate, a fancy word meaning string a bunch of stuff together.
When you assign data to an object (x below), it won’t print to the console. To see what’s in that object, you can simply type its name at the console.
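The code for this step isn’t shown here, so this is a minimal sketch that is consistent with the output printed next. I’m assuming the vector simply holds the numbers 1 through 4.

```r
# assign four numbers to an object called x (nothing prints)
x <- c(1, 2, 3, 4)
# type the object's name to print what it holds
x
```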
## [1] 1 2 3 4
Let’s say you’ve got 10 scores from different patients rating the quality of their care on a 5-item scale. The data are averages across the 5 items. A score of 5 means the patient was very satisfied.
Type in the following lines; you can omit the comments. When you are finished, click the button in your script window that says “Source.” This button sends all of your script to the console and runs it all. If instead you want to run one line at a time, highlight it with your cursor, or simply put your cursor on the line, and click the “Run” button. You can do this for the rest of the tutorial (that is, run each new line by itself rather than hitting Source over and over again).
# make a small dataset
mydata <- c(3.0, 2.0, 5.0, 2.5, 3.5, 4.5, 2.0, 3.80, 1.0, 4.0)
# compute the mean using the sum of the scores divided by how many there are (the length of the vector holding them)
mymean <- sum(mydata) / length(mydata)
# compute the sum of squares using similar commands (note the parentheses!)
mySS <- sum(mydata^2) - ((sum(mydata)^2) / length(mydata))
# compute the sample standard deviation by taking the square root of the sample variance
mysd <- sqrt(mySS / (length(mydata) - 1))
Ok, you may be wondering where the output is. Look at the Environment tab in your RStudio pane. You’ll see a list of Values and the names you gave them along with the results. If you did it correctly you should have:
## [1] 3.0 2.0 5.0 2.5 3.5 4.5 2.0 3.8 1.0 4.0
## [1] 3.13
## [1] 14.221
## [1] 1.257025
You’ve done several things here: stored a set of scores, then computed their mean, sum of squares, and standard deviation step by step. Let’s compare our results with R’s built-in functions:
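The comparison code isn’t shown, so here is a sketch of what it would look like, assuming the built-in functions meant are mean() and sd(). The first line repeats the data so the example runs on its own.

```r
mydata <- c(3.0, 2.0, 5.0, 2.5, 3.5, 4.5, 2.0, 3.80, 1.0, 4.0)
# built-in sample mean
mean(mydata)
# built-in sample standard deviation (N - 1 in the denominator)
sd(mydata)
```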
## [1] 3.13
## [1] 1.257025
Aha! We did it. As you can see, many things that you might be tempted to hard code are already functions / commands in R. We’ll see that this is true for many descriptive statistics in the next section. But first we need to start working with real data, and importing it into R.
The two types of files that are likely to contain data you already have are .sav files (SPSS) and either .csv or .xlsx files. I’m grouping the latter two together although they’re actually quite different. To make things easier, I’m going to show you how to import only .csv files and SPSS files.
The best way to make sure the data file you’re importing is accessible is to make a folder on your computer, store the file in that folder, and then change the working directory to that folder. In RStudio’s Files pane, click the Gear icon to open the menu and choose Set As Working Directory.
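The same thing can be done with code. This is a minimal sketch that assumes a made-up folder name (my_r_data, created inside R’s temporary directory so the example runs anywhere); in practice you would point setwd() at the folder that holds your data file.

```r
# hypothetical folder; substitute the folder that holds your data
data_folder <- file.path(tempdir(), "my_r_data")
dir.create(data_folder, showWarnings = FALSE)

# setwd() changes the working directory and invisibly returns the previous one
old_wd <- setwd(data_folder)
getwd()        # confirm where R is now looking for files
list.files()   # list the files R can see there

# go back to where we started
setwd(old_wd)
```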
Data can be cleaned in R quite efficiently, but we won’t worry about that. However, you want your data to adhere to the same format as SPSS files have. That is, you want each row to be a person (unit of observation) and each column to be a different measured or assigned variable. Missing data is fine, and R knows how to handle a variety of missing data.
Let’s try it. First, you’ll need to download the nurses.sav file. Save it to your Downloads folder if it doesn’t automatically go there.
Here’s where it gets a little tricky. To import SPSS data, we will use the read.spss command. That command asks you to give the path to the file, and paths are written a bit differently on Windows PCs than on a Mac. I have a Mac, so the following works for me. Before we run it, we have to load the add-on package that contains the command. It is called foreign, and you may need to install this package first. The following code loads the package, then runs the command by specifying the path to the file in quotes. This assumes that you have already set the working directory to the folder that contains nurses.sav, so the file name alone is enough.
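The import code isn’t shown here, so this is a sketch of what it would look like. The to.data.frame = TRUE argument is my assumption (it matches the data frame structure printed later), and the file.exists() check is only there so the sketch doesn’t stop with an error if the file is somewhere else.

```r
# install.packages("foreign")   # run once, only if the package is not installed
library(foreign)

# with the working directory set, the file name alone is enough
if (file.exists("nurses.sav")) {
  nurses <- read.spss("nurses.sav", to.data.frame = TRUE)
} else {
  message("nurses.sav was not found in ", getwd())
}
```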
Anyway, you can verify that your data imported by looking at the Global Environment pane and clicking on the name of the dataset. It should pop up as a tab in RStudio and if you click on the tab you’ll see something that is very much like a spreadsheet. Also, you should see a blue circle with a little arrow back over by your dataset. It’s a quick way to view the columns in your data. Notice we have a “NA” at various places. This is R’s native missing data symbol and can be handled by various commands in various ways.
One more thing: let’s get some more details about our data before we analyze it. Try this:
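The output below has the layout produced by str(), so that is presumably the command to try. In this sketch, the if block builds a small stand-in from a few values shown in this tutorial, only so the code runs when nurses.sav has not been imported; with the real data you get the full 1000-row listing.

```r
if (!exists("nurses")) {
  # stand-in: first six rows of three columns, copied from this tutorial's output
  nurses <- data.frame(
    age    = c(36, 45, 32, 57, 46, 60),
    gender = factor(c("male", "male", "male", "female", "female", "female"),
                    levels = c("male", "female")),
    stress = c(7, 7, 7, 6, 6, 6)
  )
}
# compact summary: one line per column with its type and first few values
str(nurses)
```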
## 'data.frame': 1000 obs. of 20 variables:
## $ hospital : num 1 1 1 1 1 1 1 1 1 1 ...
## $ ward : num 1 1 1 1 1 1 1 1 1 2 ...
## $ wardid : num 11 11 11 11 11 11 11 11 11 12 ...
## $ nurse : num 1 2 3 4 5 6 7 8 9 10 ...
## $ age : num 36 45 32 57 46 60 23 32 60 45 ...
## $ gender : Factor w/ 2 levels "male","female": 1 1 1 2 2 2 2 2 1 1 ...
## $ experien : num 11 20 7 25 22 22 13 13 17 21 ...
## $ stress : num 7 7 7 6 6 6 6 7 7 6 ...
## $ wardtype : Factor w/ 2 levels "general care",..: 1 1 1 1 1 1 1 1 1 2 ...
## $ hospsize : Factor w/ 3 levels "small","medium",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ expcon : Factor w/ 2 levels "control","experiment": 2 2 2 2 2 2 2 2 2 2 ...
## $ Zage : num -0.582 0.166 -0.914 1.162 0.249 ...
## $ Zgender : num -1.66 -1.66 -1.66 0.6 0.6 ...
## $ Zexperien: num -1.002 0.487 -1.664 1.315 0.818 ...
## $ Zstress : num 2.07 2.07 2.07 1.04 1.04 ...
## $ Zwardtype: num -1 -1 -1 -1 -1 ...
## $ Zhospsize: num 1.78 1.78 1.78 1.78 1.78 ...
## $ Zexpcon : num 0.992 0.992 0.992 0.992 0.992 ...
## $ Cexpcon : num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
## $ Chospsize: num 1 1 1 1 1 1 1 1 1 1 ...
## - attr(*, "variable.labels")= Named chr [1:20] "id" "id" "unique ward id" "id" ...
## ..- attr(*, "names")= chr [1:20] "hospital" "ward" "wardid" "nurse" ...
## - attr(*, "codepage")= int 65001
What you’ve done here is basically print to the screen what you get with the toggle circle button. This is very helpful if you have factor labels (as with gender), which you will see are included with SPSS imports.
The nurses data set has a special type called a data frame. This is basically R’s version of a spreadsheet. It has different columns for variables, and different rows for cases. Each column is of a specific type: num for numeric, chr for character (text), Factor for coded variables, int for integer, and logi for logical (TRUE / FALSE).
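To see those type labels on something small, here is a sketch with a made-up data frame (types_demo and its columns are hypothetical, not part of nurses).

```r
types_demo <- data.frame(
  score  = c(3.5, 4.0),                          # numeric
  id     = 1:2,                                  # integer
  name   = c("Ann", "Bo"),                       # character (text)
  group  = factor(c("control", "experiment")),   # factor (coded variable)
  passed = c(TRUE, FALSE),                       # logical
  stringsAsFactors = FALSE                       # keep text as character in older R versions
)
# one line per column, each starting with its type label
str(types_demo)
```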
We won’t have much use in this class for writing our own little commands and functions; we’ll use the built-in functions. Here we’ll check one out from the psych package. For more on this, type ?psych in the console (the package has to be loaded with library(psych) first for this to work). Actually do it! If successful, you’ll see the help page for psych come up in one of your panes. In that window, click on describe; it will be underlined in the 3rd line of the text. We’re going to use this function now. Note that in the “Usage” section of the help document, there are lots of options for the describe function. They all have default values, so the only thing we need to do is specify what x is.
As with foreign, the psych package needs to be loaded with the library() command. In practice, it’s best to put all of your library() commands at the top of your script so that every package is loaded at the start. All we need to do is specify the data frame containing the data and (optionally) the columns.
library(psych)
# use the describe function from psych on nurses
# use only the age, experien, and stress columns
describe(nurses[ , c("age","experien","stress") ])
## vars n mean sd median trimmed mad min max range skew
## age 1 1000 43.01 12.04 43 42.98 14.83 23 64 41 0.03
## experien 2 1000 17.06 6.04 17 17.06 5.93 1 38 37 0.03
## stress 3 1000 4.98 0.98 5 5.03 1.48 1 7 6 -0.45
## kurtosis se
## age -1.18 0.38
## experien -0.39 0.19
## stress 0.25 0.03
The column labels are pretty self-explanatory, but your view in RStudio may be different. For the most part, running something in the Script window will send output to the Console.
The syntax I used above to select columns to analyze in the nurses data frame is only one option out of many. Let’s take a quick look at several ways to get the values in the “age” column:
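The commands that produced the output below aren’t shown. Here are three common ways that all return the same vector; I’m assuming these are the kind of commands that were used. The if block only builds a six-row stand-in (values copied from this tutorial’s output) so the sketch runs without nurses.sav.

```r
if (!exists("nurses")) {
  nurses <- data.frame(age    = c(36, 45, 32, 57, 46, 60),
                       stress = c(7, 7, 7, 6, 6, 6))
}
# 1. dollar sign and the column name
nurses$age
# 2. double square brackets and the column name in quotes
nurses[["age"]]
# 3. matrix-style indexing: all rows, the column named "age"
nurses[, "age"]
```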
## [1] 36 45 32 57 46 60 23 32 60 45 57 47 32 42 42 53 60 33 64 37 23 61 58 52
## [25] 28 52 43 64 47 62 39 46 58 34 41 56 39 60 57 50 29 57 51 25 27 53 42 43
## [49] 24 48 60 27 61 33 47 50 38 28 33 52 53 31 27 57 47 41 64 45 42 25 34 36
## [73] 55 33 51 60 24 31 58 60 29 24 24 46 43 24 45 64 25 55 42 58 64 47 25 50
## [97] 42 60 33 51 27 43 45 53 53 37 25 32 46 57 27 55 51 24 56 31 46 57 42 53
## [121] 48 25 37 47 29 41 31 52 36 39 62 42 51 29 25 50 23 60 43 43 27 28 36 60
## [145] 64 64 61 27 32 28 64 55 46 45 23 55 60 39 24 61 28 34 34 25 39 64 60 58
## [169] 58 32 41 61 47 60 56 47 31 47 51 61 50 37 23 56 45 23 48 46 64 47 56 51
## [193] 28 61 41 37 25 25 50 64 38 39 43 28 33 39 46 25 52 45 52 33 25 36 50 57
## [217] 43 53 62 60 62 33 33 27 61 23 38 64 64 57 25 56 43 45 23 46 37 31 29 37
## [241] 61 43 39 29 52 58 34 57 37 56 32 29 29 32 25 36 45 58 52 42 31 37 47 31
## [265] 38 58 31 41 62 62 24 28 58 56 50 51 24 28 43 45 61 45 41 31 32 24 57 41
## [289] 58 56 51 45 32 55 53 41 47 41 28 62 34 41 31 57 60 29 46 27 50 45 34 45
## [313] 39 43 28 55 32 38 36 29 48 31 37 58 31 46 60 46 57 50 64 24 37 36 58 64
## [337] 34 47 43 39 43 27 27 55 61 37 36 32 47 27 42 55 45 23 58 60 58 33 50 47
## [361] 23 53 27 53 37 53 52 39 48 64 42 31 34 41 47 48 29 51 48 61 23 61 50 39
## [385] 37 47 24 47 23 36 41 55 56 37 39 24 56 60 42 29 39 61 37 24 31 36 47 64
## [409] 47 31 55 34 29 42 42 56 57 29 33 39 62 60 56 47 34 39 50 38 50 25 34 39
## [433] 42 42 39 56 56 31 58 43 58 37 25 57 52 36 24 42 60 56 55 31 32 42 32 32
## [457] 57 61 41 46 51 31 56 42 45 39 46 36 42 41 37 52 42 57 62 32 56 48 33 34
## [481] 61 24 31 46 27 42 62 57 31 34 57 53 47 34 47 31 25 57 47 61 27 53 34 64
## [505] 34 55 43 29 45 27 47 58 45 64 55 32 29 29 48 45 43 23 60 39 45 36 45 36
## [529] 42 37 41 64 58 62 32 33 25 60 38 48 45 56 46 45 37 57 62 24 61 51 33 60
## [553] 36 52 56 46 50 32 45 28 51 38 55 45 32 43 24 47 47 45 52 45 62 48 50 46
## [577] 61 58 51 33 24 60 43 57 33 62 61 47 60 27 24 31 55 53 23 25 27 45 52 25
## [601] 37 50 41 28 50 52 23 28 24 37 56 58 36 58 41 51 34 58 37 25 45 58 28 43
## [625] 38 38 37 61 64 47 39 45 34 34 52 33 50 45 31 53 28 50 58 39 42 28 43 64
## [649] 47 29 56 23 28 43 60 62 34 36 51 53 58 57 23 23 36 39 33 45 50 34 64 31
## [673] 61 48 34 58 42 51 28 64 58 37 43 33 62 39 55 43 61 52 45 57 32 32 34 32
## [697] 56 57 61 23 24 42 24 57 27 51 58 29 42 43 32 51 47 62 61 47 37 23 27 55
## [721] 25 43 52 25 50 50 58 34 43 46 42 51 46 25 29 48 53 29 51 62 29 46 48 37
## [745] 25 38 24 27 45 41 38 57 45 57 25 43 45 38 46 46 41 45 37 53 39 51 32 25
## [769] 23 41 55 24 55 50 62 46 48 53 56 46 56 29 36 31 53 52 47 56 25 56 61 64
## [793] 41 56 37 36 41 25 62 27 29 36 58 46 37 32 64 52 31 52 47 39 45 53 51 50
## [817] 34 48 32 38 34 62 31 50 47 27 61 48 64 28 36 51 55 31 28 33 38 45 28 38
## [841] 46 57 42 31 37 57 42 42 60 33 48 24 37 37 28 34 33 50 31 29 61 58 48 25
## [865] 34 27 23 58 48 50 45 51 53 25 50 29 64 36 56 34 61 34 58 60 39 24 34 48
## [889] 28 34 42 46 38 31 24 56 23 50 27 60 23 25 24 37 56 42 39 29 64 50 36 46
## [913] 56 31 56 64 47 32 41 41 34 37 48 62 38 24 25 36 23 24 60 41 28 58 24 34
## [937] 33 55 43 57 31 23 37 55 58 55 47 47 28 33 32 52 34 56 42 29 28 55 37 47
## [961] 48 31 25 47 51 34 62 39 46 38 36 25 42 36 32 42 36 51 52 31 25 41 29 23
## [985] 23 38 28 62 45 60 34 41 27 39 33 56 29 32 34 58
Each of the remaining ways prints exactly the same 1000 values, so that output is not repeated here.
The $ “selects” the age variable from nurses, and the result is a vector (the column of ages).
Matrix indexing is really handy, but also tricky. The same idea shows up in many programming languages. A single cell of a matrix or rectangular data array is accessed using its row number and column number in square brackets, separated by a comma. For example, this will give me the age of the nurse in the 6th row. Six specifies the row, while 5 is there since age is the 5th column in the dataset:
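A sketch of the command described here. The if block builds a stand-in with the first six rows of the first five columns (values copied from this tutorial’s output) so the example runs without nurses.sav.

```r
if (!exists("nurses")) {
  nurses <- data.frame(hospital = rep(1, 6), ward = rep(1, 6), wardid = rep(11, 6),
                       nurse = c(1, 2, 3, 4, 5, 6),
                       age = c(36, 45, 32, 57, 46, 60))
}
# row 6, column 5: the age of the nurse in the 6th row
nurses[6, 5]
```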
## [1] 60
Usually, however, we want a whole row or column. We do that simply by leaving either the row or the column entry blank, like so (note: the dataset has 1000 rows, so a whole column prints 1000 values; the head() command prints only the first 6):
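A sketch of what leaving an entry blank looks like; I’m assuming the commands behind the output below were of this form. The stand-in in the if block holds the first six rows of the first five columns, copied from this tutorial’s output.

```r
if (!exists("nurses")) {
  nurses <- data.frame(hospital = rep(1, 6), ward = rep(1, 6), wardid = rep(11, 6),
                       nurse = c(1, 2, 3, 4, 5, 6),
                       age = c(36, 45, 32, 57, 46, 60))
}
# row entry left blank: every row of column 5 (the age column)
nurses[, 5]
# column entry left blank: every column of row 6
nurses[6, ]
# head() shortens a long printout to the first 6 values
head(nurses[, 5])
```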
## [1] 36 45 32 57 46 60 23 32 60 45 57 47 32 42 42 53 60 33 64 37 23 61 58 52
## [25] 28 52 43 64 47 62 39 46 58 34 41 56 39 60 57 50 29 57 51 25 27 53 42 43
## [49] 24 48 60 27 61 33 47 50 38 28 33 52 53 31 27 57 47 41 64 45 42 25 34 36
## [73] 55 33 51 60 24 31 58 60 29 24 24 46 43 24 45 64 25 55 42 58 64 47 25 50
## [97] 42 60 33 51 27 43 45 53 53 37 25 32 46 57 27 55 51 24 56 31 46 57 42 53
## [121] 48 25 37 47 29 41 31 52 36 39 62 42 51 29 25 50 23 60 43 43 27 28 36 60
## [145] 64 64 61 27 32 28 64 55 46 45 23 55 60 39 24 61 28 34 34 25 39 64 60 58
## [169] 58 32 41 61 47 60 56 47 31 47 51 61 50 37 23 56 45 23 48 46 64 47 56 51
## [193] 28 61 41 37 25 25 50 64 38 39 43 28 33 39 46 25 52 45 52 33 25 36 50 57
## [217] 43 53 62 60 62 33 33 27 61 23 38 64 64 57 25 56 43 45 23 46 37 31 29 37
## [241] 61 43 39 29 52 58 34 57 37 56 32 29 29 32 25 36 45 58 52 42 31 37 47 31
## [265] 38 58 31 41 62 62 24 28 58 56 50 51 24 28 43 45 61 45 41 31 32 24 57 41
## [289] 58 56 51 45 32 55 53 41 47 41 28 62 34 41 31 57 60 29 46 27 50 45 34 45
## [313] 39 43 28 55 32 38 36 29 48 31 37 58 31 46 60 46 57 50 64 24 37 36 58 64
## [337] 34 47 43 39 43 27 27 55 61 37 36 32 47 27 42 55 45 23 58 60 58 33 50 47
## [361] 23 53 27 53 37 53 52 39 48 64 42 31 34 41 47 48 29 51 48 61 23 61 50 39
## [385] 37 47 24 47 23 36 41 55 56 37 39 24 56 60 42 29 39 61 37 24 31 36 47 64
## [409] 47 31 55 34 29 42 42 56 57 29 33 39 62 60 56 47 34 39 50 38 50 25 34 39
## [433] 42 42 39 56 56 31 58 43 58 37 25 57 52 36 24 42 60 56 55 31 32 42 32 32
## [457] 57 61 41 46 51 31 56 42 45 39 46 36 42 41 37 52 42 57 62 32 56 48 33 34
## [481] 61 24 31 46 27 42 62 57 31 34 57 53 47 34 47 31 25 57 47 61 27 53 34 64
## [505] 34 55 43 29 45 27 47 58 45 64 55 32 29 29 48 45 43 23 60 39 45 36 45 36
## [529] 42 37 41 64 58 62 32 33 25 60 38 48 45 56 46 45 37 57 62 24 61 51 33 60
## [553] 36 52 56 46 50 32 45 28 51 38 55 45 32 43 24 47 47 45 52 45 62 48 50 46
## [577] 61 58 51 33 24 60 43 57 33 62 61 47 60 27 24 31 55 53 23 25 27 45 52 25
## [601] 37 50 41 28 50 52 23 28 24 37 56 58 36 58 41 51 34 58 37 25 45 58 28 43
## [625] 38 38 37 61 64 47 39 45 34 34 52 33 50 45 31 53 28 50 58 39 42 28 43 64
## [649] 47 29 56 23 28 43 60 62 34 36 51 53 58 57 23 23 36 39 33 45 50 34 64 31
## [673] 61 48 34 58 42 51 28 64 58 37 43 33 62 39 55 43 61 52 45 57 32 32 34 32
## [697] 56 57 61 23 24 42 24 57 27 51 58 29 42 43 32 51 47 62 61 47 37 23 27 55
## [721] 25 43 52 25 50 50 58 34 43 46 42 51 46 25 29 48 53 29 51 62 29 46 48 37
## [745] 25 38 24 27 45 41 38 57 45 57 25 43 45 38 46 46 41 45 37 53 39 51 32 25
## [769] 23 41 55 24 55 50 62 46 48 53 56 46 56 29 36 31 53 52 47 56 25 56 61 64
## [793] 41 56 37 36 41 25 62 27 29 36 58 46 37 32 64 52 31 52 47 39 45 53 51 50
## [817] 34 48 32 38 34 62 31 50 47 27 61 48 64 28 36 51 55 31 28 33 38 45 28 38
## [841] 46 57 42 31 37 57 42 42 60 33 48 24 37 37 28 34 33 50 31 29 61 58 48 25
## [865] 34 27 23 58 48 50 45 51 53 25 50 29 64 36 56 34 61 34 58 60 39 24 34 48
## [889] 28 34 42 46 38 31 24 56 23 50 27 60 23 25 24 37 56 42 39 29 64 50 36 46
## [913] 56 31 56 64 47 32 41 41 34 37 48 62 38 24 25 36 23 24 60 41 28 58 24 34
## [937] 33 55 43 57 31 23 37 55 58 55 47 47 28 33 32 52 34 56 42 29 28 55 37 47
## [961] 48 31 25 47 51 34 62 39 46 38 36 25 42 36 32 42 36 51 52 31 25 41 29 23
## [985] 23 38 28 62 45 60 34 41 27 39 33 56 29 32 34 58
## hospital ward wardid nurse age gender experien stress wardtype hospsize
## 6 1 1 11 6 60 female 22 6 general care large
## expcon Zage Zgender Zexperien Zstress Zwardtype Zhospsize Zexpcon
## 6 experiment 1.411358 0.600153 0.8180786 1.044405 -1.001501 1.777279 0.9915356
## Cexpcon Chospsize
## 6 0.5 1
If we want multiple rows or columns, we give a range or use a vector to name the columns:
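The commands for the three printouts below aren’t shown, so this is a sketch of selections that would give tables with those columns. Whether the original used a row range or head() is my guess. The stand-in in the if block copies the first six rows of five columns from this tutorial’s output.

```r
if (!exists("nurses")) {
  nurses <- data.frame(
    age      = c(36, 45, 32, 57, 46, 60),
    gender   = factor(c("male", "male", "male", "female", "female", "female"),
                      levels = c("male", "female")),
    experien = c(11, 20, 7, 25, 22, 22),
    stress   = c(7, 7, 7, 6, 6, 6),
    hospsize = factor(rep("large", 6), levels = c("small", "medium", "large"))
  )
}
# a range of rows (1 through 6) and a vector naming three columns
nurses[1:6, c("age", "gender", "experien")]
# all rows of two named columns, shortened to the first 6 rows with head()
head(nurses[, c("age", "stress")])
head(nurses[, c("age", "hospsize")])
```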
## age gender experien
## 1 36 male 11
## 2 45 male 20
## 3 32 male 7
## 4 57 female 25
## 5 46 female 22
## 6 60 female 22
## age stress
## 1 36 7
## 2 45 7
## 3 32 7
## 4 57 6
## 5 46 6
## 6 60 6
## age hospsize
## 1 36 large
## 2 45 large
## 3 32 large
## 4 57 large
## 5 46 large
## 6 60 large
Finally, let’s see how to make a new variable in a data frame. We do this by using the $ and naming a new variable on the fly. Here I’m going to make a new composite variable, which is just age times experience:
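A sketch of that step. The new column name age_x_exp is hypothetical (the original name isn’t shown), and the if block only builds a six-row stand-in from values printed in this tutorial.

```r
if (!exists("nurses")) {
  nurses <- data.frame(age      = c(36, 45, 32, 57, 46, 60),
                       experien = c(11, 20, 7, 25, 22, 22))
}
# naming a column that does not exist yet creates it
nurses$age_x_exp <- nurses$age * nurses$experien
# check the first few rows of the inputs and the new variable
head(nurses[, c("age", "experien", "age_x_exp")])
```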
This data set has Z scores for a number of variables. Let’s try to duplicate that with R code and check our results:
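A sketch of the check, assuming the Z score is computed the usual way: subtract the mean and divide by the sample standard deviation. The name Zage2 comes from the output below. The stand-in in the if block has only six rows, so on it Zage2 will not match Zage (its mean and SD come from six nurses rather than 1000); on the full data the two columns should agree, as the output below shows.

```r
if (!exists("nurses")) {
  # stand-in: first six ages and their SPSS Z scores, copied from this tutorial's output
  nurses <- data.frame(
    age  = c(36, 45, 32, 57, 46, 60),
    Zage = c(-0.5817336, 0.1656757, -0.9139156, 1.1622216, 0.2487212, 1.4113581)
  )
}
# Z score: (score - mean) / standard deviation; note the parentheses around the subtraction
nurses$Zage2 <- (nurses$age - mean(nurses$age)) / sd(nurses$age)
# put the SPSS column next to ours for the first six rows
head(nurses[, c("Zage", "Zage2")])
```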
## Zage Zage2
## 1 -0.5817336 -0.5817336
## 2 0.1656757 0.1656757
## 3 -0.9139156 -0.9139156
## 4 1.1622216 1.1622216
## 5 0.2487212 0.2487212
## 6 1.4113581 1.4113581
Notice the liberal use of parentheses. R follows the standard order of operations exactly, so when doing computations like this, it is important to group terms explicitly and check your work.
Here’s one more, a log transformation:
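The transformation code isn’t shown, so this sketch assumes age is the variable being transformed and uses a hypothetical column name, log_age. In R, log() is the natural log by default.

```r
if (!exists("nurses")) {
  # stand-in: first six ages, copied from this tutorial's output
  nurses <- data.frame(age = c(36, 45, 32, 57, 46, 60))
}
# natural log of age, stored as a new column
nurses$log_age <- log(nurses$age)
head(nurses[, c("age", "log_age")])
```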
This is just a taste of the R universe, and we’ll try to stick to the basics throughout the course. I have lots of R resources and suggestions so please reach out to me!
Wow, that was a lot. Our objectives were:
Part of your “homework” is to import some Excel data (as a .csv) into R and do some basic statistics. To do so, you can start with the syntax below and simply insert the file name in quotes, with the .csv extension. If it doesn’t work, it most likely means that either you spelled the file name wrong or the file is not in your working directory.
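The starter syntax isn’t shown here, so this is a sketch that assumes it is built on read.csv(). The file name my_scores.csv and the object name mycsv are made up, and the first two lines write a tiny file to R’s temporary directory only so the example runs on its own.

```r
# for this sketch only: create a small .csv to import
csv_file <- file.path(tempdir(), "my_scores.csv")   # hypothetical file name
write.csv(data.frame(score = c(3.0, 2.0, 5.0, 2.5)), csv_file, row.names = FALSE)

# import the file; with your own file in the working directory,
# the file name in quotes is enough, e.g. read.csv("my_scores.csv")
mycsv <- read.csv(csv_file)

# look at the structure, then compute a basic statistic
str(mycsv)
mean(mycsv$score)
```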