= "November 03, 2023"
due_date_assignment print(due_date_assignment)
November 03, 2023
Arda Türkan
November 3, 2023
This assignment consists of three parts.
“Import Excel to R”
Three differences between R and Python.
Examining and handling “NA Values” on a sample data.
I will give a brief summary of the video: Import Data, Copy Data from Excel to R CSV & TXT Files
Mike Marin discusses how to import data from Excel into R. He outlines the steps for importing data from both CSV and tab-delimited text files. He demonstrates the use of the “read.csv,” “read.table,” and “read.delim” commands in R to import data, using the “file.choose” command to select the data file from a menu. He also highlights the importance of setting the “header” argument to TRUE when the first row of the data set contains variable names. The video covers both CSV and tab-delimited text file import methods.
Although there are many similarities between Python and R languages that are frequently used today, there are also very clear and fundamental differences. We can summarize these three fundamental differences as follows:
While in Python, the assignment operator is a single equality sign “=”, in R, the assignment operator can also be expressed as “<-”, and for the sake of code clarity and convention, the use of “<-” is recommended by the community.
Python
R
Python indentation refers to adding white space before a statement to a particular block of code. In other words, the lines which belongs to same block of code must have a same indent. In R, there is no such obligation. However, indentation is recommended for ease of coding and reading.
Python
unexpected indent (<string>, line 2)
R
The one of the most crucial difference between syntaxes of R and Python is to reference a variable in an array by using its index number. While indexing in Python starts at 0, in R, the first element has an index of 1.
Although there are different data types in both languages, this expression is valid for all data types in both languages.
Python
R
First of all our data can be shown by:
[1] 2 1 3 2 1 3 1 4 3 2 2 NA 2 2 1 4 NA 1 1 2 1 2 2 1
[25] 2 5 NA 2 2 3 1 2 4 1 1 1 4 5 2 3 4 1 2 4 1 1 2 1
[49] 5 NA NA NA 1 1 5 1 3 1 NA 4 4 7 3 2 NA NA 1 NA 4 1 2 2
[73] 3 2 1 2 2 4 3 4 2 3 1 3 2 1 1 1 3 1 NA 3 1 2 2 1
[97] 2 2 1 1 4 1 1 2 3 3 2 2 3 3 3 4 1 1 1 2 NA 4 3 4
[121] 3 1 2 1 NA NA NA NA 1 5 1 2 1 3 5 3 2 2 NA NA NA NA 3 5
[145] 3 1 1 4 2 4 3 3 NA 2 3 2 6 NA 1 1 2 2 1 3 1 1 5 NA
[169] NA 2 4 NA 2 5 1 4 3 3 NA 4 3 1 4 1 1 3 1 1 NA NA 3 5
[193] 2 2 2 3 1 2 2 3 2 1 NA 2 NA 1 NA NA 2 1 1 NA 3 NA 1 2
[217] 2 1 3 2 2 1 1 2 3 1 1 1 4 3 4 2 2 1 4 1 NA 5 1 4
[241] NA 3 NA NA 1 1 5 2 3 3 2 4 NA 3 2 5 NA 2 3 4 6 2 2 2
[265] NA 2 NA 2 NA 3 3 2 2 4 3 1 4 2 NA 2 4 NA 6 2 3 1 NA 2
[289] 2 NA 1 1 3 2 3 3 1 NA 1 4 2 1 1 3 2 1 2 3 1 NA 2 3
[313] 3 2 1 2 3 5 5 1 2 3 3 1 NA NA 1 2 4 NA 2 1 1 1 3 2
[337] 1 1 3 4 NA 1 2 1 1 3 3 NA 1 1 3 5 3 2 3 4 1 4 3 1
[361] NA 2 1 2 2 1 2 2 6 1 2 4 5 NA 3 4 2 1 1 4 2 1 1 1
[385] 1 2 1 4 4 1 3 NA 3 3 NA 2 NA 1 2 1 1 4 2 1 4 4 NA 1
[409] 2 NA 3 2 2 2 1 4 3 6 1 2 3 1 3 2 2 2 1 1 3 2 1 1
[433] 1 3 2 2 NA 4 4 4 1 1 NA 4 3 NA 1 3 1 3 2 4 2 2 2 3
[457] 2 1 4 3 NA 1 4 3 1 3 2 NA 3 NA 1 3 1 4 1 1 1 2 4 3
[481] 1 2 2 2 3 2 3 1 1 NA 3 2 1 1 2 NA 2 2 2 3 3 1 1 2
[505] NA 1 2 1 1 3 3 1 3 1 1 1 1 1 2 5 1 1 2 2 1 1 NA 1
[529] 4 1 2 4 1 3 2 NA 1 1 NA 2 1 1 4 2 3 3 1 5 3 1 1 2
[553] NA 1 1 3 1 3 2 4 NA 2 3 2 1 2 1 1 1 2 2 3 1 5 2 NA
[577] 2 NA 3 2 2 2 1 5 3 2 3 1 NA 3 1 2 2 2 1 2 2 4 NA 6
[601] 1 2 NA 1 1 2 2 3 NA 3 2 3 3 4 2 NA 2 NA 4 NA 1 1 2 2
[625] 3 1 1 1 3 NA 2 5 NA 7 1 NA 4 3 3 1 NA 1 1 1 1 3 2 4
[649] 2 2 3 NA NA 1 4 3 2 2 2 3 2 4 2 2 4 NA NA NA 6 3 3 1
[673] 4 4 2 1 NA 1 6 NA 3 3 2 1 1 6 NA 1 5 1 NA 2 6 2 NA 4
[697] 1 3 1 2 NA 1 1 3 1 2 4 2 1 3 2 4 3 2 2 1 1 5 6 4
[721] 2 2 2 2 4 NA 1 2 2 2 2 4 5 NA NA NA 4 3 3 3 2 4 2 4
[745] NA NA NA NA 2 1 NA 2 4 3 2 NA 2 3 1 3 4 NA 1 2 1 2 NA 3
[769] 1 2 1 2 1 2 1 2 2 2 2 1 1 3 3 1 3 4 3 NA NA 4 2 3
[793] 2 1 3 2 4 2 2 3 1 2 4 3 3 4 NA 1 4 2 1 1 1 3 1 5
[817] 2 2 4 2 NA 1 3 1 2 NA 1 2 1 2 1 NA 1 3 2 3 2 NA 2 1
[841] 4 2 NA NA NA 2 4 2 NA NA 3 1 NA 5 5 2 2 2 NA 2 1 3 1 3
[865] 2 4 2 4 NA 4 1 2 3 2 3 3 2 3 2 2 2 1 3 2 4 2 NA 3
[889] 3 2 2 NA NA 3 2 1 2 4 1 1 1 1 4 3 2 NA 3 2 NA 1 NA 3
[913] 2 1 1 1 2 NA 2 2 3 3 2 NA NA 4 5 2 2 2 1 2 3 1 3 3
[937] 4 3 NA 1 1 1 NA 4 3 5 1 1 2 NA 2 2 2 2 5 2 2 3 1 2
[961] 3 NA 1 2 NA NA 2 NA 3 1 1 2 5 3 5 1 1 4 NA 2 1 3 1 1
[985] 2 4 3 3 3 NA 1 1 2 2 1 1 2 2 NA 2
To see the total number of “NA” values, we use the following code:
We will replace the missing values by 0 and store the data in a new object :
[1] 2 1 3 2 1 3 1 4 3 2 2 0 2 2 1 4 0 1 1 2 1 2 2 1 2 5 0 2 2 3 1 2 4 1 1 1 4
[38] 5 2 3 4 1 2 4 1 1 2 1 5 0 0 0 1 1 5 1 3 1 0 4 4 7 3 2 0 0 1 0 4 1 2 2 3 2
[75] 1 2 2 4 3 4 2 3 1 3 2 1 1 1 3 1 0 3 1 2 2 1 2 2 1 1 4 1 1 2 3 3 2 2 3 3 3
[112] 4 1 1 1 2 0 4 3 4 3 1 2 1 0 0 0 0 1 5 1 2 1 3 5 3 2 2 0 0 0 0 3 5 3 1 1 4
[149] 2 4 3 3 0 2 3 2 6 0 1 1 2 2 1 3 1 1 5 0 0 2 4 0 2 5 1 4 3 3 0 4 3 1 4 1 1
[186] 3 1 1 0 0 3 5 2 2 2 3 1 2 2 3 2 1 0 2 0 1 0 0 2 1 1 0 3 0 1 2 2 1 3 2 2 1
[223] 1 2 3 1 1 1 4 3 4 2 2 1 4 1 0 5 1 4 0 3 0 0 1 1 5 2 3 3 2 4 0 3 2 5 0 2 3
[260] 4 6 2 2 2 0 2 0 2 0 3 3 2 2 4 3 1 4 2 0 2 4 0 6 2 3 1 0 2 2 0 1 1 3 2 3 3
[297] 1 0 1 4 2 1 1 3 2 1 2 3 1 0 2 3 3 2 1 2 3 5 5 1 2 3 3 1 0 0 1 2 4 0 2 1 1
[334] 1 3 2 1 1 3 4 0 1 2 1 1 3 3 0 1 1 3 5 3 2 3 4 1 4 3 1 0 2 1 2 2 1 2 2 6 1
[371] 2 4 5 0 3 4 2 1 1 4 2 1 1 1 1 2 1 4 4 1 3 0 3 3 0 2 0 1 2 1 1 4 2 1 4 4 0
[408] 1 2 0 3 2 2 2 1 4 3 6 1 2 3 1 3 2 2 2 1 1 3 2 1 1 1 3 2 2 0 4 4 4 1 1 0 4
[445] 3 0 1 3 1 3 2 4 2 2 2 3 2 1 4 3 0 1 4 3 1 3 2 0 3 0 1 3 1 4 1 1 1 2 4 3 1
[482] 2 2 2 3 2 3 1 1 0 3 2 1 1 2 0 2 2 2 3 3 1 1 2 0 1 2 1 1 3 3 1 3 1 1 1 1 1
[519] 2 5 1 1 2 2 1 1 0 1 4 1 2 4 1 3 2 0 1 1 0 2 1 1 4 2 3 3 1 5 3 1 1 2 0 1 1
[556] 3 1 3 2 4 0 2 3 2 1 2 1 1 1 2 2 3 1 5 2 0 2 0 3 2 2 2 1 5 3 2 3 1 0 3 1 2
[593] 2 2 1 2 2 4 0 6 1 2 0 1 1 2 2 3 0 3 2 3 3 4 2 0 2 0 4 0 1 1 2 2 3 1 1 1 3
[630] 0 2 5 0 7 1 0 4 3 3 1 0 1 1 1 1 3 2 4 2 2 3 0 0 1 4 3 2 2 2 3 2 4 2 2 4 0
[667] 0 0 6 3 3 1 4 4 2 1 0 1 6 0 3 3 2 1 1 6 0 1 5 1 0 2 6 2 0 4 1 3 1 2 0 1 1
[704] 3 1 2 4 2 1 3 2 4 3 2 2 1 1 5 6 4 2 2 2 2 4 0 1 2 2 2 2 4 5 0 0 0 4 3 3 3
[741] 2 4 2 4 0 0 0 0 2 1 0 2 4 3 2 0 2 3 1 3 4 0 1 2 1 2 0 3 1 2 1 2 1 2 1 2 2
[778] 2 2 1 1 3 3 1 3 4 3 0 0 4 2 3 2 1 3 2 4 2 2 3 1 2 4 3 3 4 0 1 4 2 1 1 1 3
[815] 1 5 2 2 4 2 0 1 3 1 2 0 1 2 1 2 1 0 1 3 2 3 2 0 2 1 4 2 0 0 0 2 4 2 0 0 3
[852] 1 0 5 5 2 2 2 0 2 1 3 1 3 2 4 2 4 0 4 1 2 3 2 3 3 2 3 2 2 2 1 3 2 4 2 0 3
[889] 3 2 2 0 0 3 2 1 2 4 1 1 1 1 4 3 2 0 3 2 0 1 0 3 2 1 1 1 2 0 2 2 3 3 2 0 0
[926] 4 5 2 2 2 1 2 3 1 3 3 4 3 0 1 1 1 0 4 3 5 1 1 2 0 2 2 2 2 5 2 2 3 1 2 3 0
[963] 1 2 0 0 2 0 3 1 1 2 5 3 5 1 1 4 0 2 1 3 1 1 2 4 3 3 3 0 1 1 2 2 1 1 2 2 0
[1000] 2
To check whether there is any “NA” values are left: