Thursday, July 7, 2022

BITS-WILP-Machine Learning - ML - Comprehensive Examination-Regular - 2019-2020

Birla Institute of Technology & Science, Pilani

Work Integrated Learning Programmes Division

Second Semester 2019-20

M.Tech. (Data Science and Engineering)

Comprehensive Examination (Regular)



Course No.         : DSECLZG565

Course Title         : MACHINE LEARNING  

Nature of Exam        : Open Book 

Weightage         : 40% 

Duration         : 2 Hours  

Date of Exam:           July 12, 2020                            Time of Exam: 10:00 AM – 12:00 PM

Note: Assumptions made if any, should be stated clearly at the beginning of your answer. 


Question 1.        [3+3+2+3=11 marks]       

       

  a. Suppose you flip a coin with unknown bias θ, where P(x = H | θ) = θ, five times and observe the outcome HHHHH.

What is the maximum likelihood estimate of θ? [1 mark]

Do you think this is a good estimator? If not, why not? [2 marks]
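As a quick sanity check (a sketch, not part of the exam): the likelihood of k heads in n independent flips is L(θ) = θ^k (1-θ)^(n-k), which is maximized at θ̂ = k/n; a grid search confirms this for HHHHH.

```python
# Likelihood of observing HHHHH in 5 flips: L(theta) = theta**5.
# The closed-form MLE is heads/flips; here we confirm it by grid search.
heads, flips = 5, 5

def likelihood(theta):
    return theta**heads * (1 - theta)**(flips - heads)

grid = [i / 1000 for i in range(1001)]
theta_hat = max(grid, key=likelihood)

print(theta_hat)  # 1.0
```

The estimate θ̂ = 1 also hints at why pure MLE can be a poor estimator on tiny samples: it assigns zero probability to tails ever occurring.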


  b. A disease has four symptoms, and a physician's past history contains the following data. Use a Naïve Bayes classifier to predict whether a new patient with the symptoms given below has the disease. [2 marks]

#   Symp1   Symp2   Symp3    Symp4   Disease
1   yes     no      mild     yes     no
2   yes     yes     no       no      yes
3   yes     no      strong   yes     yes
4   no      yes     mild     yes     yes
5   no      no      no       no      no
6   no      yes     strong   yes     yes
7   no      yes     strong   no      no
8   yes     yes     mild     yes     yes


For a new patient:

Symp1   Symp2   Symp3   Symp4   Disease
yes     no      mild    yes     ?
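A hedged sketch of the Naïve Bayes computation on the table above (no Laplace smoothing, matching the question as stated):

```python
# Past history from the table: (Symp1, Symp2, Symp3, Symp4, Disease).
rows = [
    ("yes", "no",  "mild",   "yes", "no"),
    ("yes", "yes", "no",     "no",  "yes"),
    ("yes", "no",  "strong", "yes", "yes"),
    ("no",  "yes", "mild",   "yes", "yes"),
    ("no",  "no",  "no",     "no",  "no"),
    ("no",  "yes", "strong", "yes", "yes"),
    ("no",  "yes", "strong", "no",  "no"),
    ("yes", "yes", "mild",   "yes", "yes"),
]
new = ("yes", "no", "mild", "yes")  # symptoms of the new patient

scores = {}
for label in ("yes", "no"):
    subset = [r for r in rows if r[4] == label]
    score = len(subset) / len(rows)          # prior P(Disease = label)
    for i, value in enumerate(new):          # class-conditional likelihoods
        score *= sum(r[i] == value for r in subset) / len(subset)
    scores[label] = score

prediction = max(scores, key=scores.get)
print(scores, prediction)  # yes: 0.024 vs no: ~0.0093 -> predict "yes"
```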



  c. Can logistic regression be applied to a multi-class classification problem? State true or false. [1 mark]


  d. Why are log probabilities computed instead of probabilities? [1 mark]

    1. To make computation consistent

    2. To factor into smaller values of probabilities

    3. To factor into larger values of probabilities

    4. None of these
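The practical motivation behind this question can be seen directly: multiplying many small probabilities underflows floating point, while summing their logs stays representable. A minimal sketch:

```python
import math

probs = [1e-5] * 100   # 100 tiny class-conditional probabilities

product = 1.0
for p in probs:
    product *= p       # underflows: 1e-500 is below the double range

log_sum = sum(math.log(p) for p in probs)  # stays finite

print(product)   # 0.0 -- the information is lost
print(log_sum)   # about -1151.3 -- still comparable between classes
```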

  e. i. In a linear relationship y = m*x + b, y is said to be dependent on x when: [1 mark]

  1. m is closer to zero.

  2. m is far from zero.

  3. b is far from zero.

  4. b is closer to zero.


     ii. In a linear relationship between y and x, y is not dependent on x when: [1 mark]

  1. The coefficient is closer to zero.

  2. The coefficient is far from zero.

  3. The intercept is far from zero.

  4. The intercept is closer to zero.


     iii. In a linear regression model y = w0 + w1*x, if the true relationship between y and x is y = 7.5 + 3.2x, then w0 acts as: [1 mark]

  1. Intercepts

  2. Coefficients

  3. Estimators

  4. Residuals


Question 2.                

The following backpropagation network uses an activation function called leaky ReLU, which generates output = input if input >= 0, and output = 0.1 * input if input < 0. At a particular iteration, the weights are indicated in the following figure. Training error is given by E = 0.5*(t-y)^2, where t is the target output and y is the actual output from the network. What are the outputs of the hidden nodes and the actual final output y from the network with x1 = x2 = 1? What will the weights w31 and w12 be in the next iteration with learning rate = 0.1, x1 = x2 = 1, and target output t = 0? Assume the derivative of the activation function is 0 at input = 0, and zero bias at all nodes. [1+1+1+1.5+2.5 = 7 marks]

[Figure not available: network diagram with weights]
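Since the figure with the weights is not reproduced here, only the activation and loss can be stated concretely; a sketch in Python using the conventions in the question (derivative taken as 0 at exactly 0):

```python
def leaky_relu(z):
    # output = input if input >= 0, else 0.1 * input
    return z if z >= 0 else 0.1 * z

def leaky_relu_grad(z):
    # the question's convention: derivative defined as 0 at z == 0
    if z == 0:
        return 0.0
    return 1.0 if z > 0 else 0.1

def loss(t, y):
    # squared-error training loss E = 0.5 * (t - y)**2; dE/dy = -(t - y)
    return 0.5 * (t - y) ** 2
```

With these pieces, the forward pass and the gradient-descent updates for w31 and w12 follow the usual chain rule once the figure's weights are plugged in.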











Question 3.   

  a. Consider training a boosting classifier using decision stumps on the following data set. [Data set figure not available]

i. Circle the examples whose weights will be increased at the end of the first iteration. [2 marks]

ii. How many iterations will it take to achieve zero training error? Explain. [3 marks]


  b. A new mobile phone service chain store would like to open 20 service centres in Bangalore. Each service centre should cover at least one shopping centre and 5,000 households with annual income over Rs 75,000. Design a scalable algorithm that decides the locations of the service centres, taking all the aforementioned constraints into consideration. [5 marks]


Question 4.       

In a clinical trial, the height and weight of patients are recorded as shown in the table below. For an incoming patient with weight = 58 kg and height = 180 cm, classify whether the patient is Under-weight or Normal using the KNN algorithm with K = 3. [5 marks]

Weight (in kg)   Height (in cm)   Class
61               190              Under-weight
62               182              Normal
57               185              Under-weight
51               167              Under-weight
69               176              Normal
56               174              Under-weight
60               173              Normal
55               172              Normal
65               172              Normal
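A hedged sketch of the K = 3 computation on the table above (Euclidean distance on the raw units, since the question specifies no scaling):

```python
import math
from collections import Counter

# (weight kg, height cm, class) from the table above
patients = [
    (61, 190, "Under-weight"), (62, 182, "Normal"),
    (57, 185, "Under-weight"), (51, 167, "Under-weight"),
    (69, 176, "Normal"),       (56, 174, "Under-weight"),
    (60, 173, "Normal"),       (55, 172, "Normal"),
    (65, 172, "Normal"),
]
query = (58, 180)

dists = sorted(
    (math.hypot(w - query[0], h - query[1]), label)
    for w, h, label in patients
)
k3 = [label for _, label in dists[:3]]
prediction = Counter(k3).most_common(1)[0][0]
print(k3, prediction)  # Normal, Under-weight, Under-weight -> "Under-weight"
```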

Question 5.                       

Considering the following data, let x1 and x2 be the features:

                 Positive Points: {(3, 1), (5, 2), (1, 1), (2, 2), (6, -1)}

                 Negative Points: {(-3, 1), (-2, 2), (0, 3), (-3, 4), (-1, 5)}

        Derive the equation of the hyperplane and compute the model parameters. [7 marks]
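The SVM derivation itself is pen-and-paper; as a hedged sanity check that the two classes are linearly separable, a plain perceptron (which finds *a* separating hyperplane, not the max-margin SVM one) converges on this data:

```python
# Labeled points from the question: +1 positives, -1 negatives.
points = [((3, 1), 1), ((5, 2), 1), ((1, 1), 1), ((2, 2), 1), ((6, -1), 1),
          ((-3, 1), -1), ((-2, 2), -1), ((0, 3), -1), ((-3, 4), -1), ((-1, 5), -1)]

w, b = [0.0, 0.0], 0.0
for _ in range(1000):                      # perceptron epochs
    mistakes = 0
    for (x1, x2), y in points:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
            w[0] += y * x1; w[1] += y * x2; b += y
            mistakes += 1
    if mistakes == 0:                      # converged: data is separable
        break

print(w, b)  # one separating hyperplane w . x + b = 0
```

Since the data is separable (all positives have x1 >= 1, all negatives x1 <= 0), the perceptron convergence theorem guarantees termination; the SVM answer is the particular hyperplane that maximizes the margin.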


           


---------------------------------------------------------------------------- 

BITS-WILP-Machine Learning - ML - Comprehensive Examination - 2019-2020

Birla Institute of Technology & Science, Pilani

Work Integrated Learning Programmes Division

Second Semester 2019-20

M.Tech. (Data Science and Engineering)

Comprehensive Examination (Makeup)



Course No.         : DSECLZG565

Course Title         : MACHINE LEARNING  

Nature of Exam        : Open Book 

Weightage         : 40% 

Duration         : 2 Hours  

Date of Exam:           July 12, 2020                            Time of Exam: 10:00 AM – 12:00 PM


Note: Assumptions made if any, should be stated clearly at the beginning of your answer. 


Question 1.    [3+3+2+3=11 Marks]   

             

  a. Suppose you receive messages as sequences of bits (0s and 1s) with unknown bias θ for 1s; a message sequence x1, x2, ..., xn of length n is received.

What value of θ maximizes the likelihood of the observed data (in terms of n)? Assume the sample x1, x2, ..., xn is drawn from a parametric distribution f(x|θ), where f(x|θ) is the Bernoulli probability mass function with parameter θ. [3 marks]
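The standard derivation gives θ̂ = (1/n) Σ xᵢ: the log-likelihood Σ[xᵢ log θ + (1 - xᵢ) log(1 - θ)] has derivative k/θ - (n - k)/(1 - θ), which vanishes at θ = k/n, where k is the number of 1s. A quick numerical check (a sketch using a made-up bit sequence, not exam data):

```python
import math

# Made-up received sequence, for illustration only.
bits = [1, 0, 1, 1, 0, 1, 0, 1]
n, k = len(bits), sum(bits)

theta_hat = k / n          # closed-form Bernoulli MLE
print(theta_hat)           # 0.625

# A grid search over theta agrees with the closed form.
def log_likelihood(theta):
    return k * math.log(theta) + (n - k) * math.log(1 - theta)

grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=log_likelihood)
```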



  b. In the context of Naïve Bayes, what is meant by Laplace smoothing? [1 mark]

  1. Handling extremely low probabilities. 

  2. None of these 

  3. Make zero probabilities non-zero. 

  4. Making probabilities zero.

  c. Why is the Naïve Bayes algorithm called so? [2 marks]


  d. Consider fitting a logistic regression model to predict whether a customer will default on a bank loan, given the customer's bank balance, income, and student/non-student status. The optimal model coefficients are: intercept = -10.86, balance = 0.0057, income = 0.0030, and student = -0.6468. Predict whether a student with a balance of Rs 1,500 and an income of Rs 40,000 will default or not. [2 marks]
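These coefficients match the well-known Default data example (ISLR), where income is measured in thousands; assuming that convention for the income unit, a sketch of the prediction:

```python
import math

# Coefficients from the question; income assumed to be in thousands (Rs).
intercept, b_balance, b_income, b_student = -10.86, 0.0057, 0.0030, -0.6468

balance, income_thousands, student = 1500, 40.0, 1
logit = (intercept + b_balance * balance
         + b_income * income_thousands + b_student * student)
p_default = 1 / (1 + math.exp(-logit))

print(round(logit, 4), round(p_default, 4))  # about -2.8368 -> p ~ 0.055
```

Since p ≈ 0.055 < 0.5, the model predicts the student will not default.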


  e. The regression line for predicting height from weight is height = 1.51*weight + 45.47, where height is in cm and weight in kg. Interpret the equation and find the height of a person whose weight is 100 kg. [2+1 = 3 marks]
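Substituting weight = 100 kg is simple arithmetic; as a sketch:

```python
def predicted_height(weight_kg):
    # regression line from the question: height = 1.51 * weight + 45.47
    # slope 1.51: each extra kg is associated with ~1.51 cm more height;
    # intercept 45.47: the fitted height at weight = 0 (extrapolation).
    return 1.51 * weight_kg + 45.47

print(round(predicted_height(100), 2))  # 196.47 cm
```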


Question 2.  [2+5=7 Marks]   

An odd parity generator outputs a '1' when the number of '1's in an input binary sequence is odd.

  a. What are the parity bits P for a binary sequence (x1, x2) of length 2? x1 and x2 are each either 0 or 1. [2 marks]

  b. Realize an odd parity generator for a binary sequence of length 2 using an MLP built from logic-gate building blocks (with sigmoidal activation functions). Show the network architecture with all weight and bias values. [1+1+3 = 5 marks]
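Odd parity for two bits is XOR (P = 1 for inputs 01 and 10). One standard realization, sketched here with hand-picked weights (one of many valid choices, not the only answer): hidden units computing OR and NAND feed an AND output, with large weights so the sigmoids saturate near 0/1.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def parity(x1, x2):
    # Hidden layer: h1 ~ OR(x1, x2), h2 ~ NAND(x1, x2); output ~ AND(h1, h2).
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # OR gate: weights 20, 20, bias -10
    h2 = sigmoid(-20 * x1 - 20 * x2 + 30)   # NAND gate: weights -20, -20, bias 30
    return sigmoid(20 * h1 + 20 * h2 - 30)  # AND gate: weights 20, 20, bias -30

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(parity(x1, x2)))  # XOR truth table: 0, 1, 1, 0
```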

Question 3.    Answer the following questions. [5+5 =10 Marks]


  a. Consider training an AdaBoost classifier using decision stumps on the following data set. A decision stump classifier chooses a constant value c and classifies all points where x > c as one class and all points where x ≤ c as the other class.

i. What is the initial weight assigned to each data point? [1 mark]

ii. Show the decision boundary for the first decision stump (indicate the positive and negative sides of the decision boundary). [2 marks]

iii. Circle the point whose weight increases in the boosting process. [2 marks]


  b. Suppose you are given the following data points. You will simulate the k-means algorithm to identify TWO clusters in the data. The initial cluster centres are {cluster1: #1}, {cluster2: #10} — the first data point is used as the first cluster centre and the 10th as the second. Simulate the k-means (k = 2) algorithm for one iteration. What are the cluster assignments after one iteration? Assume k-means uses Euclidean distance.

                                     [5 Marks]   



Data #   x      y
1        1.9    0.97
2        1.76   0.84
3        2.32   1.63
4        2.31   2.09
5        1.14   2.11
6        5.02   3.02
7        5.74   3.84
8        2.25   3.47
9        4.71   3.6
10       3.17   4.96

[Figure not available]
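A sketch of one k-means assignment step on the table above, with data points #1 and #10 as the initial centres:

```python
import math

# (x, y) pairs from the table, in order #1 .. #10.
data = [(1.9, 0.97), (1.76, 0.84), (2.32, 1.63), (2.31, 2.09), (1.14, 2.11),
        (5.02, 3.02), (5.74, 3.84), (2.25, 3.47), (4.71, 3.6), (3.17, 4.96)]

centres = [data[0], data[9]]   # cluster1 <- point #1, cluster2 <- point #10

# Assignment step: each point goes to the nearer centre (Euclidean distance).
assignments = [
    min((0, 1), key=lambda c: math.dist(p, centres[c])) + 1
    for p in data
]
print(assignments)  # [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
```

After this assignment, the update step would re-estimate each centre as the mean of its cluster; the question asks only for the assignments after one iteration.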






Question 4. Answer the following questions. [5 Marks]   

Students in a particular class are graded in subjects A, B and C out of 10 points. Based on the information provided in the table below for 8 students, predict using the KNN algorithm whether a student who scored A: 5, B: 7, C: 6 will pass or fail,

  a. when K = 3?

Score in A   Score in B   Score in C   Result
9            5            7            Pass
7            3            6            Fail
5            8            9            Pass
8            6            7            Pass
4            7            8            Fail
6            7            6            Pass
6            8            5            Fail
5            6            5            Fail
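A hedged sketch of the K = 3 computation for the query (A = 5, B = 7, C = 6), using Euclidean distance:

```python
import math
from collections import Counter

# (A, B, C, result) rows from the table above
students = [(9, 5, 7, "Pass"), (7, 3, 6, "Fail"), (5, 8, 9, "Pass"),
            (8, 6, 7, "Pass"), (4, 7, 8, "Fail"), (6, 7, 6, "Pass"),
            (6, 8, 5, "Fail"), (5, 6, 5, "Fail")]
query = (5, 7, 6)

nearest = sorted(
    (math.dist(query, (a, b, c)), result) for a, b, c, result in students
)[:3]
prediction = Counter(label for _, label in nearest).most_common(1)[0][0]
print(nearest, prediction)  # 3 nearest: Pass, Fail, Fail -> "Fail"
```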


Question 5. Answer the following questions. [7 Marks]   


Find the equation of the hyperplane using the linear Support Vector Machine method for the following points.

Positive Points: {(3, 2), (4, 3), (2, 3), (3, -1)}

Negative Points: {(1, 0), (-1, -3), (0, 2), (-1, 2)}





