Fundamentals of Data Analytics

1. BACKGROUND

The purpose of this assignment is to implement a decision tree using different split point evaluation measures and Naïve Bayes classification. It also covers evaluating the performance metrics of a machine learning model.

2. QUESTIONS

Question 1: (30 points)
Given the following table, construct a decision tree using a purity threshold of 100%. Use information gain as the split point evaluation measure. Present all the steps. Next, classify the following points in terms of Risk:
a) (Age=27, Car=Vintage) - 15 points
b) (Age=50, Car=Sports) - 15 points

Point   Age   Car       Risk
X1      25    Sports    Low
X2      20    Vintage   High
X3      25    Sports    Low
X4      45    SUV       High
X5      20    Sports    High
X6      25    SUV       High

Question 2: (30 points)
Given the following table, predict the class label of a tuple using Naïve Bayes classification. Present all the probabilistic calculations.

Customer   Internet Service   Contract   Payment Method   Churn
C1         DSL                Monthly    Cash             No
C2         DSL                Monthly    Credit           No
C3         Broadband          Monthly    Cash             Yes
C4         Fiber Optics       Monthly    Cash             Yes
C5         Fiber Optics       Yearly     Cash             Yes
C6         Fiber Optics       Yearly     Credit           No
C7         Broadband          Yearly     Credit           Yes
C8         DSL                Monthly    Cash             No
C9         DSL                Yearly     Cash             Yes
C10        Fiber Optics       Yearly     Cash             Yes
C11        DSL                Yearly     Credit           Yes
C12        Broadband          Monthly    Credit           Yes
C13        Broadband          Yearly     Cash             Yes
C14        DSL                Monthly    Credit           No
C15        DSL                Monthly    Credit           ?

Next, predict churn or not churn for the following customer:
C15 = (DSL, Monthly, Credit) - 30 points

Question 3: (20 points)
Consider the following confusion matrix. The numbers indicate whether an employee will leave his/her organization or not.

                    Predicted Leave
Actual Leave      No     Yes    Total
No              2326      15     2341
Yes               50     634      684
Total           2376     649     3025

Calculate the following metrics:
a) Accuracy
b) Sensitivity
c) Specificity
d) Precision
e) Recall

After the calculation, discuss the results and explain the performance of the selected machine learning model.
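
For Question 2, the probabilistic calculations can be cross-checked with a short script. This is a minimal sketch, assuming plain frequency estimates with no Laplace smoothing (the brief does not say whether smoothing is expected); the function name naive_bayes_scores is illustrative. It computes the unnormalized score P(class) * product of P(feature value | class) for each class and picks the larger one.

```python
from collections import Counter
from fractions import Fraction

# Labelled rows C1..C14: (Internet Service, Contract, Payment Method, Churn)
rows = [
    ("DSL", "Monthly", "Cash", "No"),
    ("DSL", "Monthly", "Credit", "No"),
    ("Broadband", "Monthly", "Cash", "Yes"),
    ("Fiber Optics", "Monthly", "Cash", "Yes"),
    ("Fiber Optics", "Yearly", "Cash", "Yes"),
    ("Fiber Optics", "Yearly", "Credit", "No"),
    ("Broadband", "Yearly", "Credit", "Yes"),
    ("DSL", "Monthly", "Cash", "No"),
    ("DSL", "Yearly", "Cash", "Yes"),
    ("Fiber Optics", "Yearly", "Cash", "Yes"),
    ("DSL", "Yearly", "Credit", "Yes"),
    ("Broadband", "Monthly", "Credit", "Yes"),
    ("Broadband", "Yearly", "Cash", "Yes"),
    ("DSL", "Monthly", "Credit", "No"),
]


def naive_bayes_scores(rows, query):
    """Return the unnormalized Naive Bayes score of each class for `query`.

    Assumption: raw relative frequencies, no smoothing.
    """
    class_counts = Counter(row[-1] for row in rows)
    total = len(rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = Fraction(n_label, total)  # prior P(class)
        for i, value in enumerate(query):
            # Likelihood P(feature_i = value | class), estimated by counting
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            score *= Fraction(matches, n_label)
        scores[label] = score
    return scores


# Customer C15 from the brief
scores = naive_bayes_scores(rows, ("DSL", "Monthly", "Credit"))
prediction = max(scores, key=scores.get)

for label, score in scores.items():
    print(f"Churn={label}: {score} ~ {float(score):.6f}")
print("Predicted churn for C15:", prediction)
```

Exact fractions are used so that each intermediate probability can be compared against a hand calculation without rounding differences.
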
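
The metrics in Question 3 follow directly from the four cells of the confusion matrix. The sketch below is one way to compute them, assuming that rows are actual values and columns are predicted values, and that "Yes" (the employee leaves) is the positive class; the variable names are illustrative.

```python
# Cells of the confusion matrix from Question 3.
# Assumption: rows = actual, columns = predicted, positive class = "Yes" (leaves).
tn = 2326  # actual No,  predicted No
fp = 15    # actual No,  predicted Yes
fn = 50    # actual Yes, predicted No
tp = 634   # actual Yes, predicted Yes

total = tn + fp + fn + tp

metrics = {
    "accuracy": (tp + tn) / total,   # share of all predictions that are correct
    "sensitivity": tp / (tp + fn),   # true positive rate
    "specificity": tn / (tn + fp),   # true negative rate
    "precision": tp / (tp + fp),     # share of predicted positives that are correct
    "recall": tp / (tp + fn),        # same quantity as sensitivity
}

for name, value in metrics.items():
    print(f"{name:12s} {value:.4f}")
```

Sensitivity and recall are two names for the same ratio, so items b) and e) yield the same value.
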
Question 4: (20 points)
Define the meaning of the following terms:
- Entropy (5 pts)
- Information Gain (5 pts)
- Gini Index (5 pts)
- Confusion Matrix (5 pts)
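
The first three terms can be illustrated numerically on the Question 1 table. This is a minimal sketch, not a full decision tree: the helper names are illustrative, entropy is measured in bits (log base 2), and the Age threshold of 22.5 is an assumed candidate (the midpoint between the observed ages 20 and 25), not something the brief prescribes.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1/p) over the classes."""
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Question 1 data: (Age, Car, Risk) for X1..X6
points = [
    (25, "Sports", "Low"), (20, "Vintage", "High"), (25, "Sports", "Low"),
    (45, "SUV", "High"), (20, "Sports", "High"), (25, "SUV", "High"),
]
labels = [risk for _, _, risk in points]

# Categorical split: one child subset per Car value
by_car = defaultdict(list)
for _, car, risk in points:
    by_car[car].append(risk)
gain_car = information_gain(labels, list(by_car.values()))

# Numeric split: Age <= threshold versus Age > threshold (assumed threshold)
threshold = 22.5
left = [risk for age, _, risk in points if age <= threshold]
right = [risk for age, _, risk in points if age > threshold]
gain_age = information_gain(labels, [left, right])

print(f"Entropy of Risk:       {entropy(labels):.4f}")
print(f"Gini index of Risk:    {gini(labels):.4f}")
print(f"Gain, split on Car:    {gain_car:.4f}")
print(f"Gain, Age <= {threshold}:   {gain_age:.4f}")
```

A complete answer to Question 1 would repeat the gain calculation for every candidate split at each node and recurse until every leaf is 100% pure.
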