R for Data Science With Sports Applications

2023-09-26

Recap

  • From Lab 1: it is important to keep track of where files are located in your computer.

  • Your hard drive is organized in folders or directories.

  • In Mac os, ~/Desktop is your desktop.

  • If you have a directory called lab 1 in your desktop, ~/Desktop/lab1 is the location of that folder.

  • Run setwd("~/Desktop/lab1") to make that the working directory.

  • list.files() prints the contents of the working directory in your console.

Recap 2

  • R code is organized in functions.

  • Functions take arguments and return values.

  • Data is stored in objects.

  • Assignment (<-) makes variables point to objects.

Recap 3

x <- c(1, 2, 3)
mean(x)
[1] 2
str(mtcars)
'data.frame':   32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

Data frame

  • The most important kind of object in R.
## NBA championship 2017/2018 82 regular season games
df <- readRDS('./nba.rds')

Inspect df

df
                     Team Playoff GP  MIN  PTS  W  L  P2M  P2A      P2p  P3M
1           Atlanta Hawks       N 82 3941 8475 24 58 2213 4471 49.49676  917
2          Boston Celtics       Y 82 3961 8529 55 27 2202 4483 49.11889  939
3           Brooklyn Nets       N 82 3971 8741 28 54 2095 4190 50.00000 1041
4       Charlotte Hornets       N 82 3956 8874 36 46 2373 4873 48.69690  824
5           Chicago Bulls       N 82 3971 8440 27 55 2264 4736 47.80405  906
6     Cleveland Cavaliers       Y 82 3946 9091 50 32 2330 4314 54.01020  981
7        Dallas Mavericks       N 82 3961 8390 24 58 2161 4354 49.63252  967
8          Denver Nuggets       N 82 3976 9020 46 36 2398 4566 52.51862  940
9         Detroit Pistons       N 82 3961 8509 39 43 2322 4756 48.82254  886
10  Golden State Warriors       Y 82 3946 9304 58 24 2583 4611 56.01822  926
11        Houston Rockets       Y 82 3951 9213 65 17 1918 3436 55.82072 1256
12         Indiana Pacers       Y 82 3951 8656 48 34 2604 5073 51.33057  741
13            LA Clippers       N 82 3941 8937 42 40 2525 4808 52.51664  777
14     Los Angeles Lakers       N 82 3981 8862 35 47 2516 4864 51.72697  822
15      Memphis Grizzlies       N 82 3941 8145 22 60 2255 4636 48.64107  758
16             Miami Heat       Y 82 3986 8480 44 38 2281 4491 50.79047  903
17        Milwaukee Bucks       Y 82 3966 8731 44 38 2539 4783 53.08384  718
18 Minnesota Timberwolves       Y 82 3961 8980 47 35 2707 5218 51.87811  658
19   New Orleans Pelicans       Y 82 3991 9161 48 34 2663 4929 54.02719  837
20        New York Knicks       N 82 3966 8566 29 53 2661 5279 50.40727  673
21  Oklahoma City Thunder       Y 82 3966 8844 48 34 2390 4730 50.52854  881
22          Orlando Magic       N 82 3946 8479 25 57 2338 4637 50.42053  844
23     Philadelphia 76ers       Y 82 3956 9004 52 30 2448 4653 52.61122  901
24           Phoenix Suns       N 82 3941 8522 21 61 2390 4855 49.22760  763
25 Portland Trail Blazers       Y 82 3951 8661 49 33 2377 4824 49.27446  845
26       Sacramento Kings       N 82 3951 8104 27 55 2441 5096 47.90031  738
27      San Antonio Spurs       Y 82 3946 8424 47 35 2506 5022 49.90044  696
28        Toronto Raptors       Y 82 3966 9156 59 23 2415 4464 54.09946  968
29              Utah Jazz       Y 82 3951 8540 48 34 2252 4372 51.50961  887
30     Washington Wizards       Y 82 3971 8742 43 39 2461 4845 50.79463  814
    P3A      P3p  FTM  FTA      FTp OREB DREB  AST  TOV STL BLK   PF   PM team
1  2544 36.04560 1298 1654 78.47642  743 2693 1946 1276 638 348 1606 -447  ATL
2  2492 37.68058 1308 1697 77.07720  767 2878 1842 1149 604 373 1618  294  BOS
3  2924 35.60192 1428 1850 77.18919  792 2852 1941 1245 512 390 1688 -307  BKN
4  2233 36.90103 1656 2216 74.72924  827 2901 1770 1041 559 373 1409   21  CHA
5  2549 35.54335 1194 1574 75.85769  790 2873 1923 1147 626 289 1571 -577  CHI
6  2636 37.21548 1488 1909 77.94657  694 2761 1916 1126 582 312 1524   77  CLE
7  2688 35.97470 1167 1530 76.27451  666 2717 1858 1007 578 310 1578 -249  DAL
8  2536 37.06625 1404 1830 76.72131  902 2748 2059 1227 627 404 1533  121  DEN
9  2373 37.33670 1207 1621 74.46021  830 2756 1868 1103 628 317 1508  -12  DET
10 2369 39.08822 1360 1668 81.53477  691 2877 2402 1265 655 612 1607  490  GSW
11 3470 36.19597 1609 2061 78.06890  739 2825 1767 1135 699 392 1597  695  HOU
12 2010 36.86567 1225 1573 77.87667  788 2684 1819 1088 721 340 1544  113  IND
13 2196 35.38251 1556 2095 74.27208  832 2767 1832 1204 628 373 1638    3  LAC
14 2384 34.47987 1364 1910 71.41361  876 2927 1949 1295 633 388 1736 -127  LAL
15 2152 35.22305 1361 1732 78.57968  779 2544 1767 1227 612 396 1900 -509  MEM
16 2506 36.03352 1209 1601 75.51530  763 2801 1862 1178 620 437 1648   39  MIA
17 2024 35.47431 1499 1915 78.27676  688 2579 1905 1135 722 443 1752  -25  MIL
18 1845 35.66396 1592 1980 80.40404  848 2593 1861 1021 689 345 1495  183  MIN
19 2312 36.20242 1324 1716 77.15618  712 2924 2195 1223 657 485 1570  107  NOP
20 1914 35.16196 1225 1557 78.67694  859 2752 1912 1207 552 421 1682 -292  NYK
21 2491 35.36732 1421 1985 71.58690 1024 2671 1750 1147 743 412 1653  280  OKC
22 2405 35.09356 1271 1678 75.74493  722 2692 1921 1192 622 400 1579 -395  ORL
23 2445 36.85072 1405 1868 75.21413  893 2996 2221 1353 682 420 1811  369  PHI
24 2286 33.37708 1453 1962 74.05708  842 2776 1743 1289 569 370 1807 -768  PHX
25 2308 36.61179 1372 1715 80.00000  835 2893 1599 1109 573 423 1599  213  POR
26 1967 37.51906 1008 1371 73.52298  777 2578 1768 1125 643 340 1639 -573  SAC
27 1977 35.20486 1324 1715 77.20117  849 2777 1868 1078 628 460 1408  237  SAS
28 2705 35.78558 1422 1790 79.44134  800 2807 1995 1095 626 500 1783  638  TOR
29 2425 36.57732 1375 1766 77.85957  740 2807 1839 1205 708 420 1608  353  UTA
30 2173 37.45973 1378 1786 77.15566  823 2713 2065 1196 645 353 1746   48  WAS
   Conference  Division Rank
1           E Southeast   15
2           E  Atlantic    2
3           E  Atlantic   12
4           E Southeast   10
5           E   Central   13
6           E   Central    4
7           W Southwest   13
8           W Northwest    9
9           E   Central    9
10          W   Pacific    2
11          W Southwest    1
12          E   Central    5
13          W   Pacific   10
14          W   Pacific   11
15          W Southwest   14
16          E Southeast    6
17          E   Central    7
18          W Northwest    8
19          W Southwest    6
20          E  Atlantic   11
21          W Northwest    4
22          E Southeast   14
23          E  Atlantic    3
24          W   Pacific   15
25          W Northwest    3
26          W   Pacific   12
27          W Southwest    7
28          E  Atlantic    1
29          W Northwest    5
30          E Southeast    8

library(dplyr)
glimpse(df)
Rows: 30
Columns: 28
$ Team       <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlo…
$ Playoff    <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N,…
$ GP         <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN        <int> 3941, 3961, 3971, 3956, 3971, 3946, 3961, 3976, 3961, 3946,…
$ PTS        <int> 8475, 8529, 8741, 8874, 8440, 9091, 8390, 9020, 8509, 9304,…
$ W          <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22,…
$ L          <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60,…
$ P2M        <int> 2213, 2202, 2095, 2373, 2264, 2330, 2161, 2398, 2322, 2583,…
$ P2A        <int> 4471, 4483, 4190, 4873, 4736, 4314, 4354, 4566, 4756, 4611,…
$ P2p        <dbl> 49.49676, 49.11889, 50.00000, 48.69690, 47.80405, 54.01020,…
$ P3M        <int> 917, 939, 1041, 824, 906, 981, 967, 940, 886, 926, 1256, 74…
$ P3A        <int> 2544, 2492, 2924, 2233, 2549, 2636, 2688, 2536, 2373, 2369,…
$ P3p        <dbl> 36.04560, 37.68058, 35.60192, 36.90103, 35.54335, 37.21548,…
$ FTM        <int> 1298, 1308, 1428, 1656, 1194, 1488, 1167, 1404, 1207, 1360,…
$ FTA        <int> 1654, 1697, 1850, 2216, 1574, 1909, 1530, 1830, 1621, 1668,…
$ FTp        <dbl> 78.47642, 77.07720, 77.18919, 74.72924, 75.85769, 77.94657,…
$ OREB       <int> 743, 767, 792, 827, 790, 694, 666, 902, 830, 691, 739, 788,…
$ DREB       <int> 2693, 2878, 2852, 2901, 2873, 2761, 2717, 2748, 2756, 2877,…
$ AST        <int> 1946, 1842, 1941, 1770, 1923, 1916, 1858, 2059, 1868, 2402,…
$ TOV        <int> 1276, 1149, 1245, 1041, 1147, 1126, 1007, 1227, 1103, 1265,…
$ STL        <int> 638, 604, 512, 559, 626, 582, 578, 627, 628, 655, 699, 721,…
$ BLK        <int> 348, 373, 390, 373, 289, 312, 310, 404, 317, 612, 392, 340,…
$ PF         <int> 1606, 1618, 1688, 1409, 1571, 1524, 1578, 1533, 1508, 1607,…
$ PM         <int> -447, 294, -307, 21, -577, 77, -249, 121, -12, 490, 695, 11…
$ team       <fct> ATL, BOS, BKN, CHA, CHI, CLE, DAL, DEN, DET, GSW, HOU, IND,…
$ Conference <fct> E, E, E, E, E, E, W, W, E, W, W, E, W, W, W, E, E, W, W, E,…
$ Division   <fct> Southeast, Atlantic, Atlantic, Southeast, Central, Central,…
$ Rank       <int> 15, 2, 12, 10, 13, 4, 13, 9, 9, 2, 1, 5, 10, 11, 14, 6, 7, …

Dplyr verbs

Key functions. Take a data.frame as input and return a data.frame.

  • filter
  • select
  • mutate
  • group_by
  • summarize
  • arrange

Filter

Filter rows from the data.frame.

playoff_teams <- filter(df, Playoff=='Y')
glimpse(playoff_teams)
Rows: 16
Columns: 28
$ Team       <chr> "Boston Celtics", "Cleveland Cavaliers", "Golden State Warr…
$ Playoff    <fct> Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y
$ GP         <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN        <int> 3961, 3946, 3946, 3951, 3951, 3986, 3966, 3961, 3991, 3966,…
$ PTS        <int> 8529, 9091, 9304, 9213, 8656, 8480, 8731, 8980, 9161, 8844,…
$ W          <int> 55, 50, 58, 65, 48, 44, 44, 47, 48, 48, 52, 49, 47, 59, 48,…
$ L          <int> 27, 32, 24, 17, 34, 38, 38, 35, 34, 34, 30, 33, 35, 23, 34,…
$ P2M        <int> 2202, 2330, 2583, 1918, 2604, 2281, 2539, 2707, 2663, 2390,…
$ P2A        <int> 4483, 4314, 4611, 3436, 5073, 4491, 4783, 5218, 4929, 4730,…
$ P2p        <dbl> 49.11889, 54.01020, 56.01822, 55.82072, 51.33057, 50.79047,…
$ P3M        <int> 939, 981, 926, 1256, 741, 903, 718, 658, 837, 881, 901, 845…
$ P3A        <int> 2492, 2636, 2369, 3470, 2010, 2506, 2024, 1845, 2312, 2491,…
$ P3p        <dbl> 37.68058, 37.21548, 39.08822, 36.19597, 36.86567, 36.03352,…
$ FTM        <int> 1308, 1488, 1360, 1609, 1225, 1209, 1499, 1592, 1324, 1421,…
$ FTA        <int> 1697, 1909, 1668, 2061, 1573, 1601, 1915, 1980, 1716, 1985,…
$ FTp        <dbl> 77.07720, 77.94657, 81.53477, 78.06890, 77.87667, 75.51530,…
$ OREB       <int> 767, 694, 691, 739, 788, 763, 688, 848, 712, 1024, 893, 835…
$ DREB       <int> 2878, 2761, 2877, 2825, 2684, 2801, 2579, 2593, 2924, 2671,…
$ AST        <int> 1842, 1916, 2402, 1767, 1819, 1862, 1905, 1861, 2195, 1750,…
$ TOV        <int> 1149, 1126, 1265, 1135, 1088, 1178, 1135, 1021, 1223, 1147,…
$ STL        <int> 604, 582, 655, 699, 721, 620, 722, 689, 657, 743, 682, 573,…
$ BLK        <int> 373, 312, 612, 392, 340, 437, 443, 345, 485, 412, 420, 423,…
$ PF         <int> 1618, 1524, 1607, 1597, 1544, 1648, 1752, 1495, 1570, 1653,…
$ PM         <int> 294, 77, 490, 695, 113, 39, -25, 183, 107, 280, 369, 213, 2…
$ team       <fct> BOS, CLE, GSW, HOU, IND, MIA, MIL, MIN, NOP, OKC, PHI, POR,…
$ Conference <fct> E, E, W, W, E, E, E, W, W, W, E, W, W, E, W, E
$ Division   <fct> Atlantic, Central, Pacific, Southwest, Central, Southeast, …
$ Rank       <int> 2, 4, 2, 1, 5, 6, 7, 8, 6, 4, 3, 3, 7, 1, 5, 8

Return another data frame with the rows where the second argument is TRUE.

Select

Remove columns from the data frame

df_2 <- select(df, Team, Playoff, W, L)
glimpse(df_2)
Rows: 30
Columns: 4
$ Team    <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlotte…
$ Playoff <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N, Y,…
$ W       <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22, 44…
$ L       <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60, 38…

Mutate

Return a new data frame with a new column:

df_rebs <- mutate(df, REB=OREB+DREB)
glimpse(df_rebs)
Rows: 30
Columns: 29
$ Team       <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlo…
$ Playoff    <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N,…
$ GP         <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN        <int> 3941, 3961, 3971, 3956, 3971, 3946, 3961, 3976, 3961, 3946,…
$ PTS        <int> 8475, 8529, 8741, 8874, 8440, 9091, 8390, 9020, 8509, 9304,…
$ W          <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22,…
$ L          <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60,…
$ P2M        <int> 2213, 2202, 2095, 2373, 2264, 2330, 2161, 2398, 2322, 2583,…
$ P2A        <int> 4471, 4483, 4190, 4873, 4736, 4314, 4354, 4566, 4756, 4611,…
$ P2p        <dbl> 49.49676, 49.11889, 50.00000, 48.69690, 47.80405, 54.01020,…
$ P3M        <int> 917, 939, 1041, 824, 906, 981, 967, 940, 886, 926, 1256, 74…
$ P3A        <int> 2544, 2492, 2924, 2233, 2549, 2636, 2688, 2536, 2373, 2369,…
$ P3p        <dbl> 36.04560, 37.68058, 35.60192, 36.90103, 35.54335, 37.21548,…
$ FTM        <int> 1298, 1308, 1428, 1656, 1194, 1488, 1167, 1404, 1207, 1360,…
$ FTA        <int> 1654, 1697, 1850, 2216, 1574, 1909, 1530, 1830, 1621, 1668,…
$ FTp        <dbl> 78.47642, 77.07720, 77.18919, 74.72924, 75.85769, 77.94657,…
$ OREB       <int> 743, 767, 792, 827, 790, 694, 666, 902, 830, 691, 739, 788,…
$ DREB       <int> 2693, 2878, 2852, 2901, 2873, 2761, 2717, 2748, 2756, 2877,…
$ AST        <int> 1946, 1842, 1941, 1770, 1923, 1916, 1858, 2059, 1868, 2402,…
$ TOV        <int> 1276, 1149, 1245, 1041, 1147, 1126, 1007, 1227, 1103, 1265,…
$ STL        <int> 638, 604, 512, 559, 626, 582, 578, 627, 628, 655, 699, 721,…
$ BLK        <int> 348, 373, 390, 373, 289, 312, 310, 404, 317, 612, 392, 340,…
$ PF         <int> 1606, 1618, 1688, 1409, 1571, 1524, 1578, 1533, 1508, 1607,…
$ PM         <int> -447, 294, -307, 21, -577, 77, -249, 121, -12, 490, 695, 11…
$ team       <fct> ATL, BOS, BKN, CHA, CHI, CLE, DAL, DEN, DET, GSW, HOU, IND,…
$ Conference <fct> E, E, E, E, E, E, W, W, E, W, W, E, W, W, W, E, E, W, W, E,…
$ Division   <fct> Southeast, Atlantic, Atlantic, Southeast, Central, Central,…
$ Rank       <int> 15, 2, 12, 10, 13, 4, 13, 9, 9, 2, 1, 5, 10, 11, 14, 6, 7, …
$ REB        <int> 3436, 3645, 3644, 3728, 3663, 3455, 3383, 3650, 3586, 3568,…

The first argument is a data.frame. The rest of the arguments is one or more expressions. You can use formulas and mathematical operators (-, +, *, /) in those expressions.

Group By

  • Returns a grouped data frame.
  • Does nothing to the data, but subsequent functions behave differently (summarize).
df_grouped <- group_by(df, Playoff)

Summarize

Returns a data frame with a summary of the argument. It will have one row per group in the argument data frame.

tbl <- summarize(df_grouped, avg_pts=mean(PTS))

Like mutate, you need to pass one or more expression, that will be applied to each group in the data. ## Arrange

  • Sorts the data.frame
  • The arguments are the columns used for sorting.
  • Use a minus sign before the argument to sort in descending order (ascending is the default)

Arrange

  • Get the top 5 ranked teams
sorted_df <- arrange(df, Rank)
head(sorted_df, 5)
                   Team Playoff GP  MIN  PTS  W  L  P2M  P2A      P2p  P3M  P3A
1       Houston Rockets       Y 82 3951 9213 65 17 1918 3436 55.82072 1256 3470
2       Toronto Raptors       Y 82 3966 9156 59 23 2415 4464 54.09946  968 2705
3        Boston Celtics       Y 82 3961 8529 55 27 2202 4483 49.11889  939 2492
4 Golden State Warriors       Y 82 3946 9304 58 24 2583 4611 56.01822  926 2369
5    Philadelphia 76ers       Y 82 3956 9004 52 30 2448 4653 52.61122  901 2445
       P3p  FTM  FTA      FTp OREB DREB  AST  TOV STL BLK   PF  PM team
1 36.19597 1609 2061 78.06890  739 2825 1767 1135 699 392 1597 695  HOU
2 35.78558 1422 1790 79.44134  800 2807 1995 1095 626 500 1783 638  TOR
3 37.68058 1308 1697 77.07720  767 2878 1842 1149 604 373 1618 294  BOS
4 39.08822 1360 1668 81.53477  691 2877 2402 1265 655 612 1607 490  GSW
5 36.85072 1405 1868 75.21413  893 2996 2221 1353 682 420 1811 369  PHI
  Conference  Division Rank
1          W Southwest    1
2          E  Atlantic    1
3          E  Atlantic    2
4          W   Pacific    2
5          E  Atlantic    3
  • Multiple arguments break ties
  • How would you print only the name of the teams?

Count

  • Count how many observations for each value of the variable.
  • No arguments counts all the rows
  • If we pass arguments, counts grouping with the variable we passed.
count(df)
   n
1 30
  • How many teams per division?
count(df, Division)
   Division n
1  Atlantic 5
2   Central 5
3 Northwest 5
4   Pacific 5
5 Southeast 5
6 Southwest 5

Remember object types

  • Different functions take different type of objects.

  • df is a data.frame

  • A data.frame is a collection of vectors

  • Vectors can be of different types

glimpse(df)
Rows: 30
Columns: 28
$ Team       <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlo…
$ Playoff    <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N,…
$ GP         <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN        <int> 3941, 3961, 3971, 3956, 3971, 3946, 3961, 3976, 3961, 3946,…
$ PTS        <int> 8475, 8529, 8741, 8874, 8440, 9091, 8390, 9020, 8509, 9304,…
$ W          <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22,…
$ L          <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60,…
$ P2M        <int> 2213, 2202, 2095, 2373, 2264, 2330, 2161, 2398, 2322, 2583,…
$ P2A        <int> 4471, 4483, 4190, 4873, 4736, 4314, 4354, 4566, 4756, 4611,…
$ P2p        <dbl> 49.49676, 49.11889, 50.00000, 48.69690, 47.80405, 54.01020,…
$ P3M        <int> 917, 939, 1041, 824, 906, 981, 967, 940, 886, 926, 1256, 74…
$ P3A        <int> 2544, 2492, 2924, 2233, 2549, 2636, 2688, 2536, 2373, 2369,…
$ P3p        <dbl> 36.04560, 37.68058, 35.60192, 36.90103, 35.54335, 37.21548,…
$ FTM        <int> 1298, 1308, 1428, 1656, 1194, 1488, 1167, 1404, 1207, 1360,…
$ FTA        <int> 1654, 1697, 1850, 2216, 1574, 1909, 1530, 1830, 1621, 1668,…
$ FTp        <dbl> 78.47642, 77.07720, 77.18919, 74.72924, 75.85769, 77.94657,…
$ OREB       <int> 743, 767, 792, 827, 790, 694, 666, 902, 830, 691, 739, 788,…
$ DREB       <int> 2693, 2878, 2852, 2901, 2873, 2761, 2717, 2748, 2756, 2877,…
$ AST        <int> 1946, 1842, 1941, 1770, 1923, 1916, 1858, 2059, 1868, 2402,…
$ TOV        <int> 1276, 1149, 1245, 1041, 1147, 1126, 1007, 1227, 1103, 1265,…
$ STL        <int> 638, 604, 512, 559, 626, 582, 578, 627, 628, 655, 699, 721,…
$ BLK        <int> 348, 373, 390, 373, 289, 312, 310, 404, 317, 612, 392, 340,…
$ PF         <int> 1606, 1618, 1688, 1409, 1571, 1524, 1578, 1533, 1508, 1607,…
$ PM         <int> -447, 294, -307, 21, -577, 77, -249, 121, -12, 490, 695, 11…
$ team       <fct> ATL, BOS, BKN, CHA, CHI, CLE, DAL, DEN, DET, GSW, HOU, IND,…
$ Conference <fct> E, E, E, E, E, E, W, W, E, W, W, E, W, W, W, E, E, W, W, E,…
$ Division   <fct> Southeast, Atlantic, Atlantic, Southeast, Central, Central,…
$ Rank       <int> 15, 2, 12, 10, 13, 4, 13, 9, 9, 2, 1, 5, 10, 11, 14, 6, 7, …

We can access vectors inside a data frame in multiple ways. $ operator.

mean(df$PTS)
[1] 8719.333
  • Dplyr verbs streamline access to vectors

Mutate

df_with_mean <- mutate(df, mean_pts=mean(PTS))
  • Think about data types!