##### About the course

This course begins with an introduction to R and then covers linear algebra for data science from two perspectives. The algebraic view covers vectors, matrices, the product of a matrix and a vector, rank, null space, the solution of an overdetermined set of equations, and the pseudo-inverse. The geometric view covers vectors, distance, projections, and eigenvalue decomposition. The course then turns to statistics: descriptive statistics, the notion of probability, distributions, mean, variance, covariance and the covariance matrix, univariate and multivariate normal distributions, an introduction to hypothesis testing, and confidence intervals for estimates.

Next, the course introduces optimization, followed by a typology of data science problems and a solution framework. It then covers simple linear regression and how to verify the assumptions used in linear regression, followed by multivariate linear regression, model assessment, assessing the importance of different variables, and subset selection. After that, it teaches classification using logistic regression, and it concludes with classification using K Nearest Neighbour and with K-Means clustering. The course is delivered entirely online, so you can access it from anywhere in the world.

##### Learning Outcomes

After completing this course, you will be able to:

- Develop relevant programming abilities.
- Demonstrate proficiency with statistical analysis of data.
- Develop the ability to build and assess data-based models.
- Execute statistical analyses with professional statistical software.
- Demonstrate skill in data management.
- Boost your hireability through innovative and independent learning.
- Get a certificate on successful completion of the course.

##### Target Audience

The course can be taken by:

**Students:** All students who are pursuing courses related to Computer Science and Engineering or Information Technology.

**Teachers/Faculties:** All teachers/faculties who wish to acquire new skills or improve their efficiency in Data Science.

**Professionals:** All working professionals who wish to enhance their skills by learning data science.

##### Why Learn Data Science for Engineers?

Data science has come a long way over the past few years, and it is now integral to understanding how many industries work, however complex and intricate they are. Data science is increasingly seen as central to the future. Data scientists are an integral part of an organization, and they help address major global challenges, which in turn can have far-reaching impacts across countries. Demand for data scientists is increasing so quickly that McKinsey predicted a 50 percent gap between the supply of data scientists and the demand for them by 2018.

##### Course Features

**24X7 Access:** You can view lectures at your own convenience.

**Online lectures:** 20 hours of online lectures with high-quality videos.

**Updated Quality content:** The content is current and is updated regularly to meet industry demands.

##### Test & Evaluation

There will be a final test containing a set of multiple choice questions. Your evaluation will include the scores achieved in the final test.

##### Certification

Certification requires you to complete the final test. Your certificate will be generated online after successful completion of the course.

##### Topics to be covered

**Module-1: Data Science for Engineers Course Philosophy and Expectation** In this module, we will see the course objectives and the expected outcome of the course.

- What are the course objectives?
- What will not be covered?
- What are the course outcomes and objectives?

**Module-2: Introduction to R** This module introduces R as a programming language for data analysis and gives a brief introduction to RStudio.

- What are R and RStudio, and how to get started with them?
- How to write, save and execute R files?

**Module-3: Introduction to R (Continued)** In this module, we will see how to add comments to an R file, clear the RStudio environment, and save the R workspace.

- How to add comments in an R file?
- How to clear the console and environment and how to save the data from the workspace?

**Module-4: Variables and data types in R** In this module, we are going to see the rules for naming variables in R and the basic data types available in R. We are also going to look at two basic R objects, vectors and lists, in detail.

- What are the rules for naming the variables and what are the basic data types in R?
- What are the basic objects in R?

**Module-5: Data Frames** In this module, we are going to introduce the data frame object in R and perform some operations on data frames.

- What is a data frame and how to create it?
- How to access the rows and columns of a data frame and how to edit it?
- How to add and delete extra rows and columns in a data frame?

**Module-6: Recasting and joining of data frames** In this module, we look at what recasting a data frame means and why it is needed, and then at more sophisticated operations on data frames such as recasting and joining.

- What does Recasting of a data frame mean?
- How is the recasting of a data frame done?
- How to join two data frames?

**Module-7: Arithmetic, Logical and Matrix operations in R** In this module, we are going to do arithmetic, logical, and matrix operations in R.

- What are Arithmetic and Logical operations in R?
- How to create a Matrix and access its elements?
- How to access an entry in a matrix and how does the colon operator work?
- How to do matrix concatenation and perform algebraic operations on matrices?

**Module-8: Advanced Programming in R: Functions** We are going to introduce functions in R and explain how to load or source functions and how to call or invoke them. We are also going to see how to pass arguments to functions.

- What are functions in R and how to create and invoke them?
- How to pass arguments to a function and how are functions evaluated in R?

**Module-9: Advanced Programming in R: Functions (Continued)** In this module, we will see functions with multiple inputs and multiple outputs (MIMO) and how to load and call a function. We will also see inline functions and looping over objects using commands such as apply, lapply, and tapply.

- What are the functions with Multiple Input and Multiple Output (MIMO) and Inline functions?
- How to loop over the objects?

**Module-10: Control Structures** We are going to study the if-else-if family of constructs, for loops, nested for loops, for loops with break, and while loops.

- What are the if-else family of constructs and the sequence function in R?
- What are for loops and nested for loops in R?
- What is a while loop in R?

**Module-11: Data Visualization in R: Basic Graphics** In this module, we are going to show how to generate basic graphics such as scatter plots, line plots, and bar plots using R, and give a brief idea of why more sophisticated graphics are needed.

- How to generate scatter, line and bar plots?
- Why is there a need for sophisticated graphics?

**Module-12: Linear Algebra for Data Science** In this module, we will learn about linear algebra and matrices, and how to identify independent attributes and linear relationships among attributes.

- What is linear algebra useful for and what is a matrix?
- How to represent data using matrices in data science?
- How to identify independent variables or attributes in the data matrix?
- How to identify linear relationships among variables or attributes in the data matrix?

**Module-13: Solving Linear Equations** In this tutorial session, we will solve some matrix equation problems.

- What are the general considerations for solving matrix equations?
- How to solve a matrix equation for the case m = n, with examples?
- How to use an optimization perspective to find a solution to the matrix equation in the case m > n?

**Module-14: Solving Linear Equations (Continued)** In this tutorial session, we will solve some more matrix equation problems.

- What is an example of solving a matrix equation using the optimization perspective for the case m > n?
- How to use an optimization perspective to find a solution to the matrix equation in the case m < n?
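As a rough illustration of the m > n and m < n cases in the questions above, here is a minimal sketch. The course itself works in R; Python is used here only for illustration, and the function names and example data are assumptions rather than course material. For m > n the sketch minimizes the squared residual through the normal equations, and for m < n it picks the solution with the smallest norm.

```python
# Illustrative sketch only: function names and example data are assumptions.

def least_squares_two_unknowns(A, b):
    """Minimize ||A x - b||^2 for an m x 2 matrix A (m >= 2) via the normal equations."""
    # Build the 2 x 2 system (A^T A) x = A^T b.
    s00 = sum(r[0] * r[0] for r in A)
    s01 = sum(r[0] * r[1] for r in A)
    s11 = sum(r[1] * r[1] for r in A)
    t0 = sum(r[0] * bi for r, bi in zip(A, b))
    t1 = sum(r[1] * bi for r, bi in zip(A, b))
    det = s00 * s11 - s01 * s01
    if det == 0:
        raise ValueError("columns of A are linearly dependent")
    # Solve the 2 x 2 system with Cramer's rule.
    return [(t0 * s11 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det]


def minimum_norm_single_equation(a, b):
    """Among all x with a . x = b (one equation, n unknowns), return the smallest-norm x."""
    scale = b / sum(ai * ai for ai in a)
    return [ai * scale for ai in a]


# m > n: three equations, two unknowns (no exact solution exists).
print(least_squares_two_unknowns([[1, 1], [1, 2], [1, 3]], [1, 2, 2]))

# m < n: one equation, two unknowns (infinitely many solutions exist).
print(minimum_norm_single_equation([1, 2], 5))
```

Both formulas are special cases of the pseudo-inverse mentioned in the course overview.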

**Module-15: Linear Algebra - Distance, Hyperplanes, and Halfspaces, EigenValues, Eigenvectors** In this module, we will learn about vectors and the notion of distance, and then about unit, orthogonal, orthonormal, and basis vectors through examples.

- What is the concept of Vectors?
- What are Unit, Orthogonal and Orthonormal Vectors?
- What are the Basis Vectors?
- How to find basis vectors of the given set of vectors?

**Module-16: Linear Algebra - Distance, Hyperplanes, and Halfspaces, EigenValues, Eigenvectors (Continued 1)** We are going to look at the geometric representation of lines and planes, the concept of projection with an example, and the generalization of projection.

- How are equations represented geometrically?
- What is the concept of projection?
- How to illustrate projections through example and how projection is generalized?

**Module-17: Linear Algebra - Distance, Hyperplanes, and Halfspaces, EigenValues, Eigenvectors (Continued 2)** In this module, we are going to look at hyperplanes, halfspaces, eigenvalues, and eigenvectors, with examples.

- What are Hyperplanes and what is the concept of Halfspace?
- What are the Eigenvalues and Eigenvectors (Part 1)?
- What are the Eigenvalues and Eigenvectors (Part 2)?

**Module-18: Linear Algebra - Distance, Hyperplanes, and Halfspaces, EigenValues, Eigenvectors (Continued 3)** The objective of this module is to learn about the connections between eigenvectors, column space, and null space.

- What is the connection between eigenvectors, column space and null space (Part 1)?
- What is the connection between eigenvectors, column space and null space (Part 2)?
- What example is taken to explain the connection between eigenvectors, column space and null space?

**Module-19: Statistical Modelling** We will characterize random phenomena: what they are, and how probability can be used as a measure for describing them.

- What are random and discrete phenomena?
- What is Probability and what are Exclusive and Independent events?
- What are the different rules in Probability and what is Conditional Probability?
- How to illustrate Conditional Probability through an example?

**Module-20: Random variables and Probability Mass/Density Function** In this module, we will introduce the notion of a random variable and the idea of probability mass and density functions. We will also see how to characterize these functions, the properties of a PDF, the computation of probabilities using R, and the multivariate normal distribution.

- What is a Random Variable (RV) and Probability Mass/Density Function (PDF)?
- What is the Binomial Mass Function and Gaussian or Normal Density Function?
- What is a Chi-square density function and what is the moment of a pdf?
- What are the properties of a Gaussian RV, how to compute probabilities using R, and what are the other related functions in R?
- What is the joint pdf of two continuous RVs and what is Multivariate Normal Distribution?
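To make the idea of computing a probability from a Gaussian density concrete, here is a minimal sketch. The course does this in R (where `pnorm` plays this role); the Python below is only an illustration, and the function name `normal_cdf` is an assumption. It uses the standard identity that relates the normal cumulative distribution function to the error function.

```python
import math

# Illustrative sketch only: the function name is an assumption.

def normal_cdf(x, mean=0.0, sd=1.0):
    """P(X <= x) for a Gaussian random variable X with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# Probability that a standard normal variable falls within 1.96 standard
# deviations of its mean (the familiar "about 95 percent" interval).
p = normal_cdf(1.96) - normal_cdf(-1.96)
print(round(p, 4))
```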

**Module-21: Simple Statistics** In this module, we will introduce a few statistical measures and how they are used in analysis.

- What is the need for sampling, what are its basic concepts, and what are the two parts of statistical analysis?
- What are the mean, median, and mode?
- What are the measures of spread and properties of sample mean and variance?
- What are the different types of plots for graphical analysis?

**Module-22: Hypotheses Testing** In this module, we will introduce the basics of hypothesis testing and the motivation for it, and look at some cases of hypothesis testing.

- What is the motivation behind hypothesis testing, what is hypothesis testing, and what is its procedure?
- What are one-sided and two-sided tests?
- What are the different errors in hypothesis testing?
- How is hypothesis testing for the mean illustrated using an example?
- How is hypothesis testing for differences in mean illustrated using an example?
- How is hypothesis testing for differences in variance illustrated using an example?

**Module-23: Optimization for Data Science** We will start with a general description of the optimization problem and then point out why understanding optimization is relevant from a data science perspective. We will also introduce the various types of optimization problems, with a focus on the univariate optimization problem.

- What is the concept of optimization?
- What are the components and types of optimization problems?
- What is the univariate optimization problem and what is the concept of local and global optimum?
- What are the conditions for a local optimum in the univariate optimization problem?

**Module-24: Unconstrained Multivariate Optimization** This module covers the unconstrained multivariate optimization problem, the analytical conditions for a minimum in the multivariate problem, and how the conditions in the univariate case translate to the multivariate case.

- What is the multivariate optimization problem?
- What is the concept of local and global optimum in the multivariate optimization problem?
- What are the conditions for a local optimum in the multivariate optimization problem?

**Module-25: Unconstrained Multivariate Optimization (Continued)** In this module, we will learn about directional search for solving an unconstrained multivariate optimization problem.

- How to use a directional search to solve a multivariate optimization problem?
- How to mathematically interpret the solution to the multivariate optimization problem?
- What are steepest descent and the optimum step size?

**Module-26: Gradient (Steepest) Descent (OR) Learning Rule** This module works through a numerical example of how gradient descent works in optimization. In many cases, this is also called the learning rule.

- What is the first step in the learning rule?
- What are the second and third steps in the learning rule?
- What is the fourth step in the learning rule?
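The idea behind gradient descent can be sketched in a few lines. This is not the numerical example used in the course: the objective function, starting point, learning rate, and iteration count below are all assumptions chosen for illustration, and Python stands in for the R used in the course. The sketch uses a fixed step size rather than an optimum step size.

```python
# Illustrative sketch only: objective, start point and learning rate are assumptions.

def objective(x, y):
    """A simple convex function with its minimum at (1, -2)."""
    return (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2


def gradient(x, y):
    """Partial derivatives of the objective with respect to x and y."""
    return 2.0 * (x - 1.0), 4.0 * (y + 2.0)


def gradient_descent(x, y, learning_rate=0.1, steps=200):
    """Repeatedly move against the gradient with a fixed step size."""
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x_min, y_min = gradient_descent(5.0, 5.0)
print(x_min, y_min, objective(x_min, y_min))
```

With this step size each iteration shrinks the distance to the minimum by a constant factor, so the iterates approach (1, -2). A step size that is too large would make the iterates diverge instead.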

**Module-27: Multivariate Optimization with Equality Constraints** In this module, we will study how to solve the multivariate optimization problem with equality constraints and the effect of equality constraints on the optimal solution.

- What is the multivariate optimization problem with equality constraints (part 1)?
- What is the multivariate optimization problem with equality constraints (part 2)?
- What is the multivariate optimization problem with equality constraints (part 3)?

**Module-28: Multivariate Optimization with Inequality Constraints** In this module, we will study how to solve the multivariate optimization problem with inequality constraints and the effect of inequality constraints on the optimal solution.

- What is the multivariate optimization problem with inequality constraints (part 1)?
- What is the multivariate optimization problem with inequality constraints (part 2)?
- What is the multivariate optimization problem with inequality constraints (part 3)?
- What is the multivariate optimization problem with inequality constraints (part 4)?
- What is the multivariate optimization problem with inequality constraints (part 5)?
- What is the multivariate optimization problem with inequality constraints (part 6)?

**Module-29: Introduction to Data Science** The objective of this module is to learn about the various techniques in data science, the types of problems, and the reasons why so many techniques are available.

- What are the various techniques used for solving problems in Data Science?
- What are classification problems (part 1)?
- What are classification problems (part 2)?
- What are the functional approximation problems?
- Why are there many techniques for solving two types of problems (part 1)?
- Why are there many techniques for solving two types of problems (part 2)?

**Module-30: Solving Data Analysis Problems - A Guided Thought Process** In this module, we are going to take a very simple example and use it to illustrate how you should think about solving data science problems. At the end of it, we will come up with a flowchart that is useful.

- How to solve a data analysis problem (part 1)?
- How to solve a data analysis problem (part 2)?
- How to solve a data analysis problem (part 3)?
- What is the conceptual framework for solving Data Analysis Problems?

**Module-31: Predictive Modeling** We are going to introduce the notion of correlation, its types, and what they are useful for.

- What is correlation and what are its various measures?
- What is Pearson's Correlation and how to apply it to Anscombe's data?
- What is Spearman Rank Correlation and how to apply it to Anscombe's data?
- What is Kendall Rank Correlation Coefficient and how to apply it to Anscombe's data?
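The difference between Pearson's correlation and Spearman rank correlation can be shown with a short sketch. The course does this in R; Python is used here only for illustration, and the function names are assumptions. The `x` and `y1` values are the first of Anscombe's four published data sets. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks, with tied values sharing their average rank.

```python
import math

# Illustrative sketch only: function names are assumptions.

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson's coefficient computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# First data set of Anscombe's quartet.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
print(round(pearson(x, y1), 3), round(spearman(x, y1), 3))
```

A monotone but non-linear relationship, such as `y = x ** 3`, gives a Spearman coefficient of 1 while its Pearson coefficient stays below 1.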

**Module-32: Linear Regression** In this module, we are going to introduce regression and its process, along with linear regression as a technique for analyzing data and building models.

- What is regression and what are its types?
- What are the regression methods and what is the regression process?
- How is the Concept of Ordinary Least Squares (OLS) applied to Linear Regression Model?
- How is the Concept of Ordinary Least Squares (OLS) applied to Linear Regression Model (continued)?
- How to test the goodness of fit of OLS Model?
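For a single input variable, the ordinary least squares estimates and one common goodness-of-fit measure have simple closed forms. The sketch below is an illustration in Python rather than the R used in the course; the function names and the sample data are assumptions. It uses R squared as the goodness-of-fit measure, which may or may not be the only measure the course discusses.

```python
# Illustrative sketch only: function names and data are assumptions.

def fit_line(x, y):
    """OLS estimates (intercept, slope) for the model y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Fraction of the variance in y that the fitted line explains."""
    my = sum(y) / len(y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_residual / ss_total


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_line(x, y)
print(intercept, slope, r_squared(x, y, intercept, slope))
```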

**Module-33: Model Assessment** In this module, we are going to assess whether the linear model we have fitted is reasonably good and decide whether the coefficients of the linear model are significant.

- What questions should be asked in the assessment of an OLS model?
- What are the properties of the estimates?
- What are the confidence intervals on regression coefficients and how to perform hypothesis tests on them?
- What are the definitions of the sum-of-squares quantities and what is the F-test for selecting a model?
- How is the F-test applied to an example in R?

**Module-34: Diagnostics to Improve Linear Model Fit** In this module, we will assess the linear model on Anscombe's data sets and look at residual plots, another way of assessing whether a linear model is adequate.

- What are the drawbacks of applying a linear model to Anscombe's dataset?
- What are residual plots and how are they used for assessment of models?
- How are residuals used for checking normality of errors, non-uniform error variance, and outliers in data?
- How is outlier detection illustrated with the help of an example?

**Module-35: Simple Linear Regression Model Building** In this module, we are going to implement simple linear regression in R. As part of this, we are also going to load the data from a .txt file, plot the data, build the linear model, and interpret the summary of the model.

- How to load and view the data, what is its structure and how to visualize it?
- How to build a Simple Linear Regression Model?

**Module-36: Simple Linear Regression Model Assessment** In this module, we are going to look at simple linear regression model assessment. As part of this, we are also going to identify the significant coefficients in the linear model.

- What is the First Level Model Assessment?

**Module-37: Simple Linear Regression Model Assessment (Continued)** This module covers the second level of model assessment. As part of this, we are going to see whether we can improve the quality of the linear model and whether we can identify bad measurements (outliers) in the data.

- What are outliers and how to identify them by residual analysis?
- How to remove outliers, check for the need of refinement and build the refined model?

**Module-38: Multiple Linear Regression** The objective of this module is to learn about multiple linear regression problems, which have one dependent variable but several independent variables, and how to solve them.

- What is the Multiple Linear Regression Problem?
- How to solve the Multiple Linear Regression Problem (part 1)?
- How to solve the Multiple Linear Regression Problem (part 2)?
- How to solve the Multiple Linear Regression Problem (part 3)?
- How to solve the Multiple Linear Regression Problem (part 4)?

**Module-39: Cross-Validation** In this module, we will learn about cross-validation, which is very useful in model building, and use it on a validation data set to determine the optimal number of parameters.

- What is the motivation behind cross-validation and what is the bias-variance trade-off on the test data set?
- What are training and validation datasets, and what is the validation set approach, with an example?
- How is sampling of small data sets done, and what are leave-one-out cross-validation (LOOCV) and k-fold cross-validation?
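The way k-fold cross-validation partitions a data set can be sketched as follows. This is an illustration in Python rather than the R used in the course, and the function name `k_fold_indices` is an assumption. For simplicity the sketch splits the row indices in order without shuffling; in practice the rows are usually shuffled first.

```python
# Illustrative sketch only: the function name is an assumption.

def k_fold_indices(n, k):
    """Split row indices 0..n-1 into k folds; return a (train, validation) pair per fold."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, validation))
        start += size
    return folds


# 5-fold cross-validation on 10 rows: each row is used for validation exactly once.
for train, validation in k_fold_indices(10, 5):
    print(validation, train)

# Leave-one-out cross-validation is the special case k = n.
loocv = k_fold_indices(4, 4)
```

A model is then fitted on each `train` part and scored on the matching `validation` part, and the k scores are averaged.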

**Module-40: Multiple Linear Regression Model Building and Selection** We are going to build a multiple linear regression model, look at the model summary, identify insignificant variables, discard them, and rebuild the model. We will also look at model selection.

- How to load, read and view the data, and how to plot a pairwise scatter plot for it?
- How to build the Multiple Linear Regression Model?

**Module-41: Classification** In this module, we will see the various classification problems and some characteristics of classification problems.

- What is classification and what are binary and multi-class classification problems?
- What are linearly separable and non-linearly separable problems?
- How to solve classification problems?

**Module-42: Logistic Regression** In this module, we will learn the basic idea of logistic regression.

- What is Logistic Regression and what are the various aspects of a Binary classification problem?
- What are Linear and Log models and what is the sigmoid function?
- How to estimate parameters?
- What is the Log-likelihood function?
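The sigmoid function and the log-likelihood that logistic regression maximizes can be written down directly. The sketch below is an illustration in Python rather than the R used in the course; the function names, the single-feature model, and the sample data are assumptions.

```python
import math

# Illustrative sketch only: function names and data are assumptions.

def sigmoid(z):
    """Map any real number to a probability in the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(intercept, slope, xs, ys):
    """Log-likelihood of binary labels ys under p(y = 1 | x) = sigmoid(intercept + slope * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(intercept + slope * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

# Parameters that separate the two classes give a higher (less negative)
# log-likelihood than parameters that ignore the input.
print(log_likelihood(0.0, 0.0, xs, ys))
print(log_likelihood(0.0, 1.0, xs, ys))
```

Parameter estimation then amounts to searching for the intercept and slope that make this value as large as possible.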

**Module-43: Logistic Regression (Continued)** In this module, we will take a very simple example with several data points to show how logistic regression works in practice. We will also introduce the notion of regularization, which helps avoid overfitting in logistic regression.

- What is a logit model, and what is an example problem for it?
- How is the problem solved using Logistic Regression?
- What is Regularization in Logistic Regression?

**Module-44: Performance Measures** The objective of this module is to look at the typical performance measures that are used once a classifier is built, including the ROC curve.

- What is the result of running an R code for any classifier?
- How to measure performance?
- How are the performance parameters illustrated through an example?
- What is ROC?
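A few of the usual performance measures for a binary classifier can be computed from the four confusion matrix counts. The sketch below is an illustration in Python rather than the R used in the course; the function names and the example labels are assumptions, and the course may define additional measures.

```python
# Illustrative sketch only: function names and labels are assumptions.

def confusion_counts(actual, predicted):
    """Return (TP, FP, TN, FN) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, fp, tn, fn


def performance_measures(actual, predicted):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
print(confusion_counts(actual, predicted))
print(performance_measures(actual, predicted))
```

An ROC curve plots sensitivity against one minus specificity as the classification threshold is varied.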

**Module-45: Logistic Regression Implementation in R** In this module, we are going to look at a case study and the problem statement associated with it, and then solve the case study using R.

- What is the Automotive Crash Testing problem?
- How to solve the Automotive Crash Testing problem using R?
- How to build a logistic regression model and find the odds for the Automotive Crash Testing problem?
- How to plot the probabilities and what is the confusion matrix?

**Module-46: K - Nearest Neighbors (kNN)** In this module, we are going to understand a very powerful classification algorithm called k-nearest neighbors, as well as the things to consider before applying it.

- What is k-Nearest Neighbors (kNN)?
- What are the assumptions and algorithm for kNN?
- How is kNN illustrated?
- What are the different things to be considered before applying the kNN algorithm and how to select parameters?
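The core of the kNN algorithm fits in a few lines: find the k training points closest to the query and take a majority vote over their labels. The sketch below is an illustration in Python rather than the R used in the course; the function name, the use of Euclidean distance, and the toy data are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter

# Illustrative sketch only: function name, distance choice and data are assumptions.

def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    # most_common(1) returns the label with the highest vote count.
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(points, labels, (0.5, 0.5), k=3))
print(knn_predict(points, labels, (5.5, 5.5), k=3))
```

Because the vote depends on raw distances, the choice of k and the scaling of the input variables both affect the result.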

**Module-47: K - Nearest Neighbors implementation in R** In this module, we are going to look at a case study for implementing the k-NN algorithm and the problem statement associated with it, and then solve the case study using R.

- What is the problem statement for the case study of Automotive Service Company?
- How to solve the case study problem of Automotive Service Company using R (Part 1)?
- How to solve the case study problem of Automotive Service Company using R (Part 2)?
- How to implement k-Nearest Neighbors using the knn() function and how to apply the kNN algorithm to data?
- What are the results of applying the knn algorithm?

**Module-48: K - means Clustering** The objective of this module is to illustrate the concept of K-means clustering and its disadvantages.

- What is K-means clustering?
- How does the K-means clustering algorithm work, with an example?
- How to determine the number of clusters (K) and what are the disadvantages of K-means?
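The K-means algorithm alternates between two steps: assign each point to its nearest centroid, then move each centroid to the mean of the points assigned to it. The sketch below is an illustration in Python rather than the R used in the course; the function name, the toy data, and the hand-picked starting centroids are assumptions. Real implementations usually choose the starting centroids randomly. It requires Python 3.8 or later for `math.dist`.

```python
import math

# Illustrative sketch only: function name, data and starting centroids are assumptions.

def k_means(points, initial_centroids, max_iterations=100):
    """Return (centroids, labels) after iterating until the assignments stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Converged: no point changed cluster.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = k_means(points, [(0, 0), (10, 10)])
print(centroids)
print(labels)
```

The result depends on the starting centroids and on the chosen number of clusters, which is one of the disadvantages the module refers to.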

**Module-49: K - means Implementation in R** In this module, we are going to look at a case study for implementing the K-means clustering algorithm and the problem statement associated with it, and then solve the case study using R.

- What is the problem statement for the case study of Clustering of trips and its solution?
- How to implement k-means clustering using the kmeans() function and what are its results?

**Module-50: Data Science for Engineers - Summary** This module gives a quick summary of the course and the next logical step after completing it.

- What is the overall course summary and what is the next logical step after learning this course?

**Data Science for Engineers - Final Quiz**
