2023-09-26
From Lab 1: it is important to keep track of where files are located on your computer.
Your hard drive is organized into folders, also called directories.
In macOS, ~/Desktop is your desktop.
If you have a directory called lab1 on your desktop, ~/Desktop/lab1 is the location of that folder.
Run setwd("~/Desktop/lab1") to make that the working directory.
list.files() prints the contents of the working directory in your console.
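A minimal sketch of the same workflow that runs on any machine. It assumes a temporary folder named lab1 (created under tempdir()) in place of ~/Desktop/lab1, and a made-up file notes.txt so that there is something to list.

```r
# Hypothetical folder: a temporary "lab1" directory stands in for ~/Desktop/lab1
lab_dir <- file.path(tempdir(), "lab1")
dir.create(lab_dir, showWarnings = FALSE)
file.create(file.path(lab_dir, "notes.txt"))  # made-up file, only for illustration

old_dir <- setwd(lab_dir)   # setwd() returns the previous working directory
getwd()                     # shows the current working directory
contents <- list.files()    # contents of the working directory
print(contents)

setwd(old_dir)              # go back to where we started
```

On your own computer you would replace lab_dir with the real path, for example "~/Desktop/lab1".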
R code is organized in functions.
Functions take arguments and return values.
Data is stored in objects.
Assignment (<-) makes variables point to objects.
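A short sketch of these four ideas together. The variable names x, avg and y are made up for illustration; mtcars is a data set that ships with R, and the last line calls str() on it to print a compact description of that object.

```r
x <- c(4, 8, 15)   # c() is a function: it takes values and returns a vector object
avg <- mean(x)     # mean() takes the vector as its argument and returns a number
print(avg)

y <- x             # y now refers to the same values as x
print(y)

str(mtcars)        # compact description of the built-in mtcars data frame
```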
'data.frame': 32 obs. of 11 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
$ disp: num 160 160 108 258 360 ...
$ hp : num 110 110 93 110 175 105 245 62 95 123 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ wt : num 2.62 2.88 2.32 3.21 3.44 ...
$ qsec: num 16.5 17 18.6 19.4 17 ...
$ vs : num 0 0 1 1 0 1 0 1 1 1 ...
$ am : num 1 1 1 0 0 0 0 0 0 0 ...
$ gear: num 4 4 4 3 3 3 3 4 4 4 ...
$ carb: num 4 4 1 1 2 1 4 2 2 4 ...
Team Playoff GP MIN PTS W L P2M P2A P2p P3M
1 Atlanta Hawks N 82 3941 8475 24 58 2213 4471 49.49676 917
2 Boston Celtics Y 82 3961 8529 55 27 2202 4483 49.11889 939
3 Brooklyn Nets N 82 3971 8741 28 54 2095 4190 50.00000 1041
4 Charlotte Hornets N 82 3956 8874 36 46 2373 4873 48.69690 824
5 Chicago Bulls N 82 3971 8440 27 55 2264 4736 47.80405 906
6 Cleveland Cavaliers Y 82 3946 9091 50 32 2330 4314 54.01020 981
7 Dallas Mavericks N 82 3961 8390 24 58 2161 4354 49.63252 967
8 Denver Nuggets N 82 3976 9020 46 36 2398 4566 52.51862 940
9 Detroit Pistons N 82 3961 8509 39 43 2322 4756 48.82254 886
10 Golden State Warriors Y 82 3946 9304 58 24 2583 4611 56.01822 926
11 Houston Rockets Y 82 3951 9213 65 17 1918 3436 55.82072 1256
12 Indiana Pacers Y 82 3951 8656 48 34 2604 5073 51.33057 741
13 LA Clippers N 82 3941 8937 42 40 2525 4808 52.51664 777
14 Los Angeles Lakers N 82 3981 8862 35 47 2516 4864 51.72697 822
15 Memphis Grizzlies N 82 3941 8145 22 60 2255 4636 48.64107 758
16 Miami Heat Y 82 3986 8480 44 38 2281 4491 50.79047 903
17 Milwaukee Bucks Y 82 3966 8731 44 38 2539 4783 53.08384 718
18 Minnesota Timberwolves Y 82 3961 8980 47 35 2707 5218 51.87811 658
19 New Orleans Pelicans Y 82 3991 9161 48 34 2663 4929 54.02719 837
20 New York Knicks N 82 3966 8566 29 53 2661 5279 50.40727 673
21 Oklahoma City Thunder Y 82 3966 8844 48 34 2390 4730 50.52854 881
22 Orlando Magic N 82 3946 8479 25 57 2338 4637 50.42053 844
23 Philadelphia 76ers Y 82 3956 9004 52 30 2448 4653 52.61122 901
24 Phoenix Suns N 82 3941 8522 21 61 2390 4855 49.22760 763
25 Portland Trail Blazers Y 82 3951 8661 49 33 2377 4824 49.27446 845
26 Sacramento Kings N 82 3951 8104 27 55 2441 5096 47.90031 738
27 San Antonio Spurs Y 82 3946 8424 47 35 2506 5022 49.90044 696
28 Toronto Raptors Y 82 3966 9156 59 23 2415 4464 54.09946 968
29 Utah Jazz Y 82 3951 8540 48 34 2252 4372 51.50961 887
30 Washington Wizards Y 82 3971 8742 43 39 2461 4845 50.79463 814
P3A P3p FTM FTA FTp OREB DREB AST TOV STL BLK PF PM team
1 2544 36.04560 1298 1654 78.47642 743 2693 1946 1276 638 348 1606 -447 ATL
2 2492 37.68058 1308 1697 77.07720 767 2878 1842 1149 604 373 1618 294 BOS
3 2924 35.60192 1428 1850 77.18919 792 2852 1941 1245 512 390 1688 -307 BKN
4 2233 36.90103 1656 2216 74.72924 827 2901 1770 1041 559 373 1409 21 CHA
5 2549 35.54335 1194 1574 75.85769 790 2873 1923 1147 626 289 1571 -577 CHI
6 2636 37.21548 1488 1909 77.94657 694 2761 1916 1126 582 312 1524 77 CLE
7 2688 35.97470 1167 1530 76.27451 666 2717 1858 1007 578 310 1578 -249 DAL
8 2536 37.06625 1404 1830 76.72131 902 2748 2059 1227 627 404 1533 121 DEN
9 2373 37.33670 1207 1621 74.46021 830 2756 1868 1103 628 317 1508 -12 DET
10 2369 39.08822 1360 1668 81.53477 691 2877 2402 1265 655 612 1607 490 GSW
11 3470 36.19597 1609 2061 78.06890 739 2825 1767 1135 699 392 1597 695 HOU
12 2010 36.86567 1225 1573 77.87667 788 2684 1819 1088 721 340 1544 113 IND
13 2196 35.38251 1556 2095 74.27208 832 2767 1832 1204 628 373 1638 3 LAC
14 2384 34.47987 1364 1910 71.41361 876 2927 1949 1295 633 388 1736 -127 LAL
15 2152 35.22305 1361 1732 78.57968 779 2544 1767 1227 612 396 1900 -509 MEM
16 2506 36.03352 1209 1601 75.51530 763 2801 1862 1178 620 437 1648 39 MIA
17 2024 35.47431 1499 1915 78.27676 688 2579 1905 1135 722 443 1752 -25 MIL
18 1845 35.66396 1592 1980 80.40404 848 2593 1861 1021 689 345 1495 183 MIN
19 2312 36.20242 1324 1716 77.15618 712 2924 2195 1223 657 485 1570 107 NOP
20 1914 35.16196 1225 1557 78.67694 859 2752 1912 1207 552 421 1682 -292 NYK
21 2491 35.36732 1421 1985 71.58690 1024 2671 1750 1147 743 412 1653 280 OKC
22 2405 35.09356 1271 1678 75.74493 722 2692 1921 1192 622 400 1579 -395 ORL
23 2445 36.85072 1405 1868 75.21413 893 2996 2221 1353 682 420 1811 369 PHI
24 2286 33.37708 1453 1962 74.05708 842 2776 1743 1289 569 370 1807 -768 PHX
25 2308 36.61179 1372 1715 80.00000 835 2893 1599 1109 573 423 1599 213 POR
26 1967 37.51906 1008 1371 73.52298 777 2578 1768 1125 643 340 1639 -573 SAC
27 1977 35.20486 1324 1715 77.20117 849 2777 1868 1078 628 460 1408 237 SAS
28 2705 35.78558 1422 1790 79.44134 800 2807 1995 1095 626 500 1783 638 TOR
29 2425 36.57732 1375 1766 77.85957 740 2807 1839 1205 708 420 1608 353 UTA
30 2173 37.45973 1378 1786 77.15566 823 2713 2065 1196 645 353 1746 48 WAS
Conference Division Rank
1 E Southeast 15
2 E Atlantic 2
3 E Atlantic 12
4 E Southeast 10
5 E Central 13
6 E Central 4
7 W Southwest 13
8 W Northwest 9
9 E Central 9
10 W Pacific 2
11 W Southwest 1
12 E Central 5
13 W Pacific 10
14 W Pacific 11
15 W Southwest 14
16 E Southeast 6
17 E Central 7
18 W Northwest 8
19 W Southwest 6
20 E Atlantic 11
21 W Northwest 4
22 E Southeast 14
23 E Atlantic 3
24 W Pacific 15
25 W Northwest 3
26 W Pacific 12
27 W Southwest 7
28 E Atlantic 1
29 W Northwest 5
30 E Southeast 8
Rows: 30
Columns: 28
$ Team <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlo…
$ Playoff <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N,…
$ GP <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN <int> 3941, 3961, 3971, 3956, 3971, 3946, 3961, 3976, 3961, 3946,…
$ PTS <int> 8475, 8529, 8741, 8874, 8440, 9091, 8390, 9020, 8509, 9304,…
$ W <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22,…
$ L <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60,…
$ P2M <int> 2213, 2202, 2095, 2373, 2264, 2330, 2161, 2398, 2322, 2583,…
$ P2A <int> 4471, 4483, 4190, 4873, 4736, 4314, 4354, 4566, 4756, 4611,…
$ P2p <dbl> 49.49676, 49.11889, 50.00000, 48.69690, 47.80405, 54.01020,…
$ P3M <int> 917, 939, 1041, 824, 906, 981, 967, 940, 886, 926, 1256, 74…
$ P3A <int> 2544, 2492, 2924, 2233, 2549, 2636, 2688, 2536, 2373, 2369,…
$ P3p <dbl> 36.04560, 37.68058, 35.60192, 36.90103, 35.54335, 37.21548,…
$ FTM <int> 1298, 1308, 1428, 1656, 1194, 1488, 1167, 1404, 1207, 1360,…
$ FTA <int> 1654, 1697, 1850, 2216, 1574, 1909, 1530, 1830, 1621, 1668,…
$ FTp <dbl> 78.47642, 77.07720, 77.18919, 74.72924, 75.85769, 77.94657,…
$ OREB <int> 743, 767, 792, 827, 790, 694, 666, 902, 830, 691, 739, 788,…
$ DREB <int> 2693, 2878, 2852, 2901, 2873, 2761, 2717, 2748, 2756, 2877,…
$ AST <int> 1946, 1842, 1941, 1770, 1923, 1916, 1858, 2059, 1868, 2402,…
$ TOV <int> 1276, 1149, 1245, 1041, 1147, 1126, 1007, 1227, 1103, 1265,…
$ STL <int> 638, 604, 512, 559, 626, 582, 578, 627, 628, 655, 699, 721,…
$ BLK <int> 348, 373, 390, 373, 289, 312, 310, 404, 317, 612, 392, 340,…
$ PF <int> 1606, 1618, 1688, 1409, 1571, 1524, 1578, 1533, 1508, 1607,…
$ PM <int> -447, 294, -307, 21, -577, 77, -249, 121, -12, 490, 695, 11…
$ team <fct> ATL, BOS, BKN, CHA, CHI, CLE, DAL, DEN, DET, GSW, HOU, IND,…
$ Conference <fct> E, E, E, E, E, E, W, W, E, W, W, E, W, W, W, E, E, W, W, E,…
$ Division <fct> Southeast, Atlantic, Atlantic, Southeast, Central, Central,…
$ Rank <int> 15, 2, 12, 10, 13, 4, 13, 9, 9, 2, 1, 5, 10, 11, 14, 6, 7, …
Key dplyr functions. Each one takes a data.frame as input and returns a data.frame.
filter, select, mutate, group_by, summarize, arrange
## Filter
Filter rows from the data.frame.
Rows: 16
Columns: 28
$ Team <chr> "Boston Celtics", "Cleveland Cavaliers", "Golden State Warr…
$ Playoff <fct> Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y
$ GP <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN <int> 3961, 3946, 3946, 3951, 3951, 3986, 3966, 3961, 3991, 3966,…
$ PTS <int> 8529, 9091, 9304, 9213, 8656, 8480, 8731, 8980, 9161, 8844,…
$ W <int> 55, 50, 58, 65, 48, 44, 44, 47, 48, 48, 52, 49, 47, 59, 48,…
$ L <int> 27, 32, 24, 17, 34, 38, 38, 35, 34, 34, 30, 33, 35, 23, 34,…
$ P2M <int> 2202, 2330, 2583, 1918, 2604, 2281, 2539, 2707, 2663, 2390,…
$ P2A <int> 4483, 4314, 4611, 3436, 5073, 4491, 4783, 5218, 4929, 4730,…
$ P2p <dbl> 49.11889, 54.01020, 56.01822, 55.82072, 51.33057, 50.79047,…
$ P3M <int> 939, 981, 926, 1256, 741, 903, 718, 658, 837, 881, 901, 845…
$ P3A <int> 2492, 2636, 2369, 3470, 2010, 2506, 2024, 1845, 2312, 2491,…
$ P3p <dbl> 37.68058, 37.21548, 39.08822, 36.19597, 36.86567, 36.03352,…
$ FTM <int> 1308, 1488, 1360, 1609, 1225, 1209, 1499, 1592, 1324, 1421,…
$ FTA <int> 1697, 1909, 1668, 2061, 1573, 1601, 1915, 1980, 1716, 1985,…
$ FTp <dbl> 77.07720, 77.94657, 81.53477, 78.06890, 77.87667, 75.51530,…
$ OREB <int> 767, 694, 691, 739, 788, 763, 688, 848, 712, 1024, 893, 835…
$ DREB <int> 2878, 2761, 2877, 2825, 2684, 2801, 2579, 2593, 2924, 2671,…
$ AST <int> 1842, 1916, 2402, 1767, 1819, 1862, 1905, 1861, 2195, 1750,…
$ TOV <int> 1149, 1126, 1265, 1135, 1088, 1178, 1135, 1021, 1223, 1147,…
$ STL <int> 604, 582, 655, 699, 721, 620, 722, 689, 657, 743, 682, 573,…
$ BLK <int> 373, 312, 612, 392, 340, 437, 443, 345, 485, 412, 420, 423,…
$ PF <int> 1618, 1524, 1607, 1597, 1544, 1648, 1752, 1495, 1570, 1653,…
$ PM <int> 294, 77, 490, 695, 113, 39, -25, 183, 107, 280, 369, 213, 2…
$ team <fct> BOS, CLE, GSW, HOU, IND, MIA, MIL, MIN, NOP, OKC, PHI, POR,…
$ Conference <fct> E, E, W, W, E, E, E, W, W, W, E, W, W, E, W, E
$ Division <fct> Atlantic, Central, Pacific, Southwest, Central, Southeast, …
$ Rank <int> 2, 4, 2, 1, 5, 6, 7, 8, 6, 4, 3, 3, 7, 1, 5, 8
filter returns another data frame that contains only the rows where the second argument (a condition) is TRUE.
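The exact call behind the output above is not shown, but it is consistent with keeping the playoff teams. A minimal sketch, assuming the dplyr package is installed and using a small hand-copied subset of the team table (the name nba_small is hypothetical):

```r
library(dplyr)

# Hand-copied subset of the team table; nba_small is a made-up name
nba_small <- data.frame(
  Team    = c("Atlanta Hawks", "Boston Celtics", "Golden State Warriors",
              "Houston Rockets", "Phoenix Suns"),
  Playoff = factor(c("N", "Y", "Y", "Y", "N")),
  W       = c(24L, 55L, 58L, 65L, 21L),
  L       = c(58L, 27L, 24L, 17L, 61L)
)

# Second argument: a condition that is TRUE or FALSE for each row
playoff_teams <- filter(nba_small, Playoff == "Y")
print(playoff_teams)

# Conditions can be combined: playoff teams with at least 58 wins
top_teams <- filter(nba_small, Playoff == "Y", W >= 58)
print(top_teams)
```

The original data frame is not modified; filter returns a new one.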
## Select
Keep only the chosen columns and remove the others from the data frame.
Rows: 30
Columns: 4
$ Team <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlotte…
$ Playoff <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N, Y,…
$ W <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22, 44…
$ L <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60, 38…
## Mutate
Return a new data frame with a new column:
Rows: 30
Columns: 29
$ Team <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlo…
$ Playoff <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N,…
$ GP <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN <int> 3941, 3961, 3971, 3956, 3971, 3946, 3961, 3976, 3961, 3946,…
$ PTS <int> 8475, 8529, 8741, 8874, 8440, 9091, 8390, 9020, 8509, 9304,…
$ W <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22,…
$ L <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60,…
$ P2M <int> 2213, 2202, 2095, 2373, 2264, 2330, 2161, 2398, 2322, 2583,…
$ P2A <int> 4471, 4483, 4190, 4873, 4736, 4314, 4354, 4566, 4756, 4611,…
$ P2p <dbl> 49.49676, 49.11889, 50.00000, 48.69690, 47.80405, 54.01020,…
$ P3M <int> 917, 939, 1041, 824, 906, 981, 967, 940, 886, 926, 1256, 74…
$ P3A <int> 2544, 2492, 2924, 2233, 2549, 2636, 2688, 2536, 2373, 2369,…
$ P3p <dbl> 36.04560, 37.68058, 35.60192, 36.90103, 35.54335, 37.21548,…
$ FTM <int> 1298, 1308, 1428, 1656, 1194, 1488, 1167, 1404, 1207, 1360,…
$ FTA <int> 1654, 1697, 1850, 2216, 1574, 1909, 1530, 1830, 1621, 1668,…
$ FTp <dbl> 78.47642, 77.07720, 77.18919, 74.72924, 75.85769, 77.94657,…
$ OREB <int> 743, 767, 792, 827, 790, 694, 666, 902, 830, 691, 739, 788,…
$ DREB <int> 2693, 2878, 2852, 2901, 2873, 2761, 2717, 2748, 2756, 2877,…
$ AST <int> 1946, 1842, 1941, 1770, 1923, 1916, 1858, 2059, 1868, 2402,…
$ TOV <int> 1276, 1149, 1245, 1041, 1147, 1126, 1007, 1227, 1103, 1265,…
$ STL <int> 638, 604, 512, 559, 626, 582, 578, 627, 628, 655, 699, 721,…
$ BLK <int> 348, 373, 390, 373, 289, 312, 310, 404, 317, 612, 392, 340,…
$ PF <int> 1606, 1618, 1688, 1409, 1571, 1524, 1578, 1533, 1508, 1607,…
$ PM <int> -447, 294, -307, 21, -577, 77, -249, 121, -12, 490, 695, 11…
$ team <fct> ATL, BOS, BKN, CHA, CHI, CLE, DAL, DEN, DET, GSW, HOU, IND,…
$ Conference <fct> E, E, E, E, E, E, W, W, E, W, W, E, W, W, W, E, E, W, W, E,…
$ Division <fct> Southeast, Atlantic, Atlantic, Southeast, Central, Central,…
$ Rank <int> 15, 2, 12, 10, 13, 4, 13, 9, 9, 2, 1, 5, 10, 11, 14, 6, 7, …
$ REB <int> 3436, 3645, 3644, 3728, 3663, 3455, 3383, 3650, 3586, 3568,…
The first argument is a data.frame. The remaining arguments are one or more expressions. You can use formulas and mathematical operators (-, +, *, /) in those expressions.
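The REB values in the output above equal OREB + DREB (for example 743 + 2693 = 3436 for Atlanta), so the new column was presumably built like this. A sketch assuming dplyr is installed; nba_small and the extra column WinPct are hypothetical names added for illustration.

```r
library(dplyr)

# Hand-copied subset of the team table; nba_small is a made-up name
nba_small <- data.frame(
  Team = c("Atlanta Hawks", "Boston Celtics", "Golden State Warriors"),
  W    = c(24L, 55L, 58L),
  L    = c(58L, 27L, 24L),
  OREB = c(743L, 767L, 691L),
  DREB = c(2693L, 2878L, 2877L)
)

# Each extra argument is a "name = expression" pair that creates one column
with_reb <- mutate(
  nba_small,
  REB    = OREB + DREB,          # total rebounds
  WinPct = 100 * W / (W + L)     # hypothetical column: percentage of games won
)
print(with_reb)
```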
## Summarize
summarize returns a data frame with a summary of its argument. The result has one row per group in the argument data frame (groups are defined with group_by).
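A minimal sketch of group_by followed by summarize, assuming dplyr is installed. The data frame nba_small is a hypothetical hand-copied subset of the team table, and the summary columns teams and mean_W are made-up names.

```r
library(dplyr)

# Hand-copied subset of the team table; nba_small is a made-up name
nba_small <- data.frame(
  Team       = c("Atlanta Hawks", "Boston Celtics", "Golden State Warriors",
                 "Houston Rockets", "Phoenix Suns"),
  Conference = c("E", "E", "W", "W", "W"),
  W          = c(24L, 55L, 58L, 65L, 21L)
)

# group_by marks the groups; the data itself is unchanged
by_conf <- group_by(nba_small, Conference)

# Each expression is evaluated once per group, giving one row per group
conf_summary <- summarize(by_conf, teams = n(), mean_W = mean(W))
print(conf_summary)
```

With two conferences in the data, the result has two rows.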
Like mutate, you need to pass one or more expressions, which are applied to each group in the data.
## Arrange
Sort the rows of a data.frame:
Team Playoff GP MIN PTS W L P2M P2A P2p P3M P3A
1 Houston Rockets Y 82 3951 9213 65 17 1918 3436 55.82072 1256 3470
2 Toronto Raptors Y 82 3966 9156 59 23 2415 4464 54.09946 968 2705
3 Boston Celtics Y 82 3961 8529 55 27 2202 4483 49.11889 939 2492
4 Golden State Warriors Y 82 3946 9304 58 24 2583 4611 56.01822 926 2369
5 Philadelphia 76ers Y 82 3956 9004 52 30 2448 4653 52.61122 901 2445
P3p FTM FTA FTp OREB DREB AST TOV STL BLK PF PM team
1 36.19597 1609 2061 78.06890 739 2825 1767 1135 699 392 1597 695 HOU
2 35.78558 1422 1790 79.44134 800 2807 1995 1095 626 500 1783 638 TOR
3 37.68058 1308 1697 77.07720 767 2878 1842 1149 604 373 1618 294 BOS
4 39.08822 1360 1668 81.53477 691 2877 2402 1265 655 612 1607 490 GSW
5 36.85072 1405 1868 75.21413 893 2996 2221 1353 682 420 1811 369 PHI
Conference Division Rank
1 W Southwest 1
2 E Atlantic 1
3 E Atlantic 2
4 W Pacific 2
5 E Atlantic 3
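The output above looks like the first five rows after sorting by Rank, although the exact call is not shown. A sketch of that idea, assuming dplyr is installed; nba_small is a hypothetical hand-copied subset of the team table.

```r
library(dplyr)

# Hand-copied subset of the team table; nba_small is a made-up name
nba_small <- data.frame(
  Team = c("Atlanta Hawks", "Boston Celtics", "Golden State Warriors",
           "Houston Rockets", "Toronto Raptors"),
  W    = c(24L, 55L, 58L, 65L, 59L),
  Rank = c(15L, 2L, 2L, 1L, 1L)
)

# Sort rows by Rank, smallest first
by_rank <- arrange(nba_small, Rank)
print(head(by_rank, 3))   # head() keeps only the first rows

# desc() reverses the order: most wins first
by_wins <- arrange(nba_small, desc(W))
print(by_wins)
```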
Different functions take different types of objects.
df is a data.frame
A data.frame is a collection of vectors
Vectors can be of different types
Rows: 30
Columns: 28
$ Team <chr> "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlo…
$ Playoff <fct> N, Y, N, N, N, Y, N, N, N, Y, Y, Y, N, N, N, Y, Y, Y, Y, N,…
$ GP <int> 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,…
$ MIN <int> 3941, 3961, 3971, 3956, 3971, 3946, 3961, 3976, 3961, 3946,…
$ PTS <int> 8475, 8529, 8741, 8874, 8440, 9091, 8390, 9020, 8509, 9304,…
$ W <int> 24, 55, 28, 36, 27, 50, 24, 46, 39, 58, 65, 48, 42, 35, 22,…
$ L <int> 58, 27, 54, 46, 55, 32, 58, 36, 43, 24, 17, 34, 40, 47, 60,…
$ P2M <int> 2213, 2202, 2095, 2373, 2264, 2330, 2161, 2398, 2322, 2583,…
$ P2A <int> 4471, 4483, 4190, 4873, 4736, 4314, 4354, 4566, 4756, 4611,…
$ P2p <dbl> 49.49676, 49.11889, 50.00000, 48.69690, 47.80405, 54.01020,…
$ P3M <int> 917, 939, 1041, 824, 906, 981, 967, 940, 886, 926, 1256, 74…
$ P3A <int> 2544, 2492, 2924, 2233, 2549, 2636, 2688, 2536, 2373, 2369,…
$ P3p <dbl> 36.04560, 37.68058, 35.60192, 36.90103, 35.54335, 37.21548,…
$ FTM <int> 1298, 1308, 1428, 1656, 1194, 1488, 1167, 1404, 1207, 1360,…
$ FTA <int> 1654, 1697, 1850, 2216, 1574, 1909, 1530, 1830, 1621, 1668,…
$ FTp <dbl> 78.47642, 77.07720, 77.18919, 74.72924, 75.85769, 77.94657,…
$ OREB <int> 743, 767, 792, 827, 790, 694, 666, 902, 830, 691, 739, 788,…
$ DREB <int> 2693, 2878, 2852, 2901, 2873, 2761, 2717, 2748, 2756, 2877,…
$ AST <int> 1946, 1842, 1941, 1770, 1923, 1916, 1858, 2059, 1868, 2402,…
$ TOV <int> 1276, 1149, 1245, 1041, 1147, 1126, 1007, 1227, 1103, 1265,…
$ STL <int> 638, 604, 512, 559, 626, 582, 578, 627, 628, 655, 699, 721,…
$ BLK <int> 348, 373, 390, 373, 289, 312, 310, 404, 317, 612, 392, 340,…
$ PF <int> 1606, 1618, 1688, 1409, 1571, 1524, 1578, 1533, 1508, 1607,…
$ PM <int> -447, 294, -307, 21, -577, 77, -249, 121, -12, 490, 695, 11…
$ team <fct> ATL, BOS, BKN, CHA, CHI, CLE, DAL, DEN, DET, GSW, HOU, IND,…
$ Conference <fct> E, E, E, E, E, E, W, W, E, W, W, E, W, W, W, E, E, W, W, E,…
$ Division <fct> Southeast, Atlantic, Atlantic, Southeast, Central, Central,…
$ Rank <int> 15, 2, 12, 10, 13, 4, 13, 9, 9, 2, 1, 5, 10, 11, 14, 6, 7, …
We can access the vectors inside a data frame in multiple ways. One of them is the $ operator.
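A short sketch in base R of the $ operator next to two other common ways of getting at a column, plus a check of each vector's type. The data frame nba_small is a hypothetical hand-copied subset of the team table.

```r
# Hand-copied subset of the team table; nba_small is a made-up name
nba_small <- data.frame(
  Team = c("Atlanta Hawks", "Boston Celtics", "Brooklyn Nets"),
  W    = c(24L, 55L, 28L),
  P2p  = c(49.49676, 49.11889, 50.00000),
  stringsAsFactors = FALSE
)

wins_dollar  <- nba_small$W        # $ operator: column by name
wins_bracket <- nba_small[["W"]]   # double brackets: same vector
wins_index   <- nba_small[, "W"]   # row/column indexing: all rows, column W

# Each column is a vector with its own type
print(class(nba_small$Team))
print(class(nba_small$W))
print(class(nba_small$P2p))

# Once extracted, a column works with any function that takes a vector
print(mean(nba_small$W))
```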