(a) Selected Video: Mustafa Gökçe Baydoğan - A Talk on Data Analytics and Industrial Engineering
Brief Summary
In this seminar, Mustafa Gökçe Baydoğan discusses the integral role of Industrial Engineering (IE) in the field of data analytics through various real-world industrial projects. The key highlights of the talk include:
- Types of Analytics: He explains the transition from descriptive and predictive analytics to prescriptive analytics, emphasizing how Operations Research (OR) and Machine Learning (ML) work together to provide actionable insights.
- Case Studies: Two major projects are detailed:
  - Predicting lumber warping in the forest industry by treating digital images as matrices and extracting physical features like knot locations and ring orientations.
  - Managing electricity consumption forecasts and imbalance costs in the energy sector.
- The IE Advantage: He stresses that the strength of an Industrial Engineer lies in identifying the root cause of a problem and ensuring that models are “explainable” for business decision-makers.
- Advice for Students: He encourages aspiring data scientists to learn by “getting their hands dirty” with real datasets rather than just watching tutorials, and he highlights the importance of graduate studies for specialization.
(b) Selected Video: Mustafa Gökçe Baydoğan - A Talk on Data Analytics and Industrial Engineering
Question: In the lumber warping prediction project, how did Mustafa Baydoğan incorporate domain-specific knowledge into the data analytics process, and why was this approach preferred over a purely data-driven (black-box) model?
Answer: Mustafa Baydoğan researched forestry literature to understand the physical causes of wood warping, such as knot density and the orientation of growth rings, which depends on where the lumber was cut from the tree. He transformed digital images into matrices to extract these specific features. This approach was preferred because it made the model explainable (interpretable), allowing industry experts to understand why a certain piece of wood was flagged and to take corrective actions before the drying process.
if (k %% 2 == 0) head(polls_us_election_2016, k) else tail(polls_us_election_2016, k)
state startdate enddate pollster grade
4196 North Carolina 2016-05-20 2016-05-22 Public Policy Polling B+
4197 Kentucky 2016-09-30 2016-10-13 Ipsos A-
4198 Florida 2016-07-30 2016-08-07 Quinnipiac University A-
4199 Pennsylvania 2016-06-08 2016-06-19 Quinnipiac University A-
4200 Ohio 2016-06-30 2016-07-11 Quinnipiac University A-
4201 North Carolina 2016-03-18 2016-03-20 Public Policy Polling B+
4202 South Dakota 2016-10-28 2016-11-02 Ipsos A-
4203 Washington 2016-10-21 2016-11-02 Ipsos A-
4204 Virginia 2016-09-16 2016-09-22 Ipsos A-
4205 Wisconsin 2016-08-04 2016-08-07 Marquette University A
4206 Utah 2016-11-01 2016-11-07 Google Consumer Surveys B
4207 Oregon 2016-10-21 2016-11-02 Ipsos A-
4208 Michigan 2016-01-23 2016-01-26 EPIC-MRA A-
samplesize population rawpoll_clinton rawpoll_trump rawpoll_johnson
4196 928 v 41.00 43.00 3.00
4197 336 lv 39.38 53.08 NA
4198 1056 lv 43.00 43.00 7.00
4199 950 rv 39.00 36.00 9.00
4200 955 rv 36.00 37.00 7.00
4201 843 v 44.00 42.00 NA
4202 170 lv 28.45 47.20 NA
4203 538 lv 46.71 38.33 NA
4204 452 lv 46.54 40.04 NA
4205 683 lv 47.00 34.00 9.00
4206 286 lv 21.33 35.05 9.99
4207 446 lv 46.46 37.41 NA
4208 600 lv 43.00 41.00 NA
rawpoll_mcmullin adjpoll_clinton adjpoll_trump adjpoll_johnson
4196 NA 43.28262 47.12021 -0.036293
4197 NA 38.34430 54.36357 NA
4198 NA 45.19351 46.65680 3.448447
4199 NA 43.35339 41.19061 4.791570
4200 NA 40.73937 42.33380 2.936299
4201 NA 42.13165 43.55006 NA
4202 NA 26.57791 45.43384 NA
4203 NA 45.56387 38.22545 NA
4204 NA 46.47852 40.48017 NA
4205 NA 48.74781 39.07778 4.705020
4206 NA 26.65200 40.57738 9.705791
4207 NA 45.12949 37.10720 NA
4208 NA 42.14966 42.05508 NA
adjpoll_mcmullin
4196 NA
4197 NA
4198 NA
4199 NA
4200 NA
4201 NA
4202 NA
4203 NA
4204 NA
4205 NA
4206 NA
4207 NA
4208 NA
total_na <- sum(is.na(polls_us_election_2016))
print(paste("Total NA count:", total_na))
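The NA count above relies on `is.na()` returning a logical matrix when applied to a data frame. The following is a minimal sketch on a small toy data frame (the `toy` object and its values are hypothetical and are not taken from the polls data), showing how the total and the per-column counts are related:

```r
# Toy data frame (hypothetical values, not the polls data) with three NAs in total.
toy <- data.frame(
  state = c("Ohio", NA, "Utah"),
  rawpoll = c(41.0, NA, 35.5),
  grade = c("A-", "B", NA),
  stringsAsFactors = FALSE
)

# is.na() on a data frame returns a logical matrix; sum() counts its TRUE cells.
total_na <- sum(is.na(toy))

# colSums() on the same matrix gives the NA count for each column.
na_by_column <- colSums(is.na(toy))

print(total_na)
print(na_by_column)
```

The per-column counts always add up to the total, which gives a quick consistency check when the same two calls are run on a larger data set.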
This assignment was prepared by integrating human intelligence (HI) with AI-assisted research tools. The video analysis in Part (a) and (b) was synthesized using AI to provide a concise English summary, while the technical R scripts in Part 4 were developed to meet specific logical constraints.
Verification: All AI-generated outputs, including calculations (e.g., the value of \(k=13\)) and data transformations, have been manually audited and verified for accuracy to ensure they align with the course requirements.
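The value of k can be re-derived by hand from the two inputs used in the assignment script. The sketch below repeats that arithmetic step by step; the intermediate names `name_length` and `total` are introduced here only for illustration:

```r
# Re-derive k from the two inputs used in the assignment script.
my_first <- "Onur Furkan"
my_birth_year <- 2004

name_length <- nchar(my_first)         # 11 characters, including the space
total <- name_length + my_birth_year   # 11 + 2004 = 2015
k <- total %% 15 + 8                   # 2015 %% 15 = 5, so k = 5 + 8 = 13

print(k)

# k is odd, so the script takes the tail() branch and shows the last k rows.
print(k %% 2 == 0)
```

Because `%%` binds more tightly than `+` in R, `total %% 15 + 8` is evaluated as `(total %% 15) + 8`, which matches the expression in the script.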
Prompts used for generation:
1. “Synthesize a professional summary and quiz questions from the provided video transcript in English.”
2. “Develop an R script for data cleaning and NA replacement based on user-defined variables and logical conditions.”
---
title: "Assignment 1"
---

My first assignment has two parts.

## (a) **Selected Video:** Mustafa Gökçe Baydoğan - A Talk on Data Analytics and Industrial Engineering

#### Brief Summary

In this seminar, Mustafa Gökçe Baydoğan discusses the integral role of Industrial Engineering (IE) in the field of data analytics through various real-world industrial projects. The key highlights of the talk include:

- **Types of Analytics:** He explains the transition from descriptive and predictive analytics to prescriptive analytics, emphasizing how Operations Research (OR) and Machine Learning (ML) work together to provide actionable insights.
- **Case Studies:** Two major projects are detailed:
  - Predicting lumber warping in the forest industry by treating digital images as matrices and extracting physical features like knot locations and ring orientations.
  - Managing electricity consumption forecasts and imbalance costs in the energy sector.
- **The IE Advantage:** He stresses that the strength of an Industrial Engineer lies in identifying the root cause of a problem and ensuring that models are "explainable" for business decision-makers.
- **Advice for Students:** He encourages aspiring data scientists to learn by "getting their hands dirty" with real datasets rather than just watching tutorials, and he highlights the importance of graduate studies for specialization.

## (b) **Selected Video:** Mustafa Gökçe Baydoğan - A Talk on Data Analytics and Industrial Engineering

**Question:** In the lumber warping prediction project, how did Mustafa Baydoğan incorporate domain-specific knowledge into the data analytics process, and why was this approach preferred over a purely data-driven (black-box) model?

**Answer:** Mustafa Baydoğan researched forestry literature to understand the physical causes of wood warping, such as knot density and the orientation of growth rings, which depends on where the lumber was cut from the tree. He transformed digital images into matrices to extract these specific features. This approach was preferred because it made the model explainable (interpretable), allowing industry experts to understand why a certain piece of wood was flagged and to take corrective actions before the drying process.

```{r}
#| echo: true
library(dslabs)
data("polls_us_election_2016")

my_first <- "Onur Furkan"
my_birth_year <- 2004

# k is derived from the name length and the birth year
k <- (nchar(my_first) + my_birth_year) %% 15 + 8
print(paste("Computed k value:", k))

# Show the first k rows if k is even, otherwise the last k rows
if (k %% 2 == 0) {
  head(polls_us_election_2016, k)
} else {
  tail(polls_us_election_2016, k)
}

# Total number of NA cells, then the eight columns with the most NAs
total_na <- sum(is.na(polls_us_election_2016))
print(paste("Total NA count:", total_na))
na_counts <- colSums(is.na(polls_us_election_2016))
sort(na_counts, decreasing = TRUE)[1:8]

# Replace NAs in numeric columns with my_birth_year + k
new_data <- polls_us_election_2016
new_data[] <- lapply(new_data, function(x) {
  if (is.numeric(x)) {
    x[is.na(x)] <- my_birth_year + k
  }
  return(x)
})

# Replace NAs in character or factor columns with "<my_first>_<k>"
new_data[] <- lapply(new_data, function(x) {
  if (is.character(x) || is.factor(x)) {
    x <- as.character(x)
    x[is.na(x)] <- paste0(my_first, "_", k)
  }
  return(x)
})

# Show the same rows again and confirm that no NAs remain
if (k %% 2 == 0) {
  head(new_data, k)
} else {
  tail(new_data, k)
}
sum(is.na(new_data))
anyNA(new_data)
```

------------------------------------------------------------------------

### Notes on Methodology

This assignment was prepared by integrating human intelligence (HI) with AI-assisted research tools. The video analysis in Part (a) and (b) was synthesized using AI to provide a concise English summary, while the technical R scripts in Part 4 were developed to meet specific logical constraints.

**Verification:** All AI-generated outputs, including calculations (e.g., the value of $k=13$) and data transformations, have been manually audited and verified for accuracy to ensure they align with the course requirements.

**Prompts used for generation:**

1. "Synthesize a professional summary and quiz questions from the provided video transcript in English."
2. "Develop an R script for data cleaning and NA replacement based on user-defined variables and logical conditions."
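The type-dependent NA replacement used in the chunk above can be checked on a small example before it is applied to the full data set. The sketch below is only an illustration: the `toy` data frame and the names `numeric_fill`, `text_fill`, and `cleaned` are hypothetical, and the two fill values are written out as the constants that `my_birth_year + k` and `paste0(my_first, "_", k)` produce when k is 13.

```r
# Hypothetical fill values: 2004 + 13 and "Onur Furkan" joined with 13.
numeric_fill <- 2017
text_fill <- "Onur Furkan_13"

# Toy data frame (hypothetical values) with one factor and two numeric columns.
toy <- data.frame(
  pollster = factor(c("Ipsos", NA, "EPIC-MRA")),
  samplesize = c(336L, NA, 600L),
  rawpoll_johnson = c(NA, 7.0, NA)
)

cleaned <- toy
cleaned[] <- lapply(cleaned, function(x) {
  if (is.numeric(x)) {
    x[is.na(x)] <- numeric_fill
  } else if (is.character(x) || is.factor(x)) {
    x <- as.character(x)   # assigning an unknown level to a factor would yield NA
    x[is.na(x)] <- text_fill
  }
  x
})

print(cleaned)
print(sum(is.na(cleaned)))
```

Assigning with `cleaned[] <-` keeps the data frame structure while replacing every column, which is why the result of `lapply()` does not need to be converted back with `as.data.frame()`.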