# NPTEL Introduction to Machine Learning Assignment Week 10 Answers

Are you looking for the NPTEL Introduction to Machine Learning Assignment Week 10 answers? This article collects the questions asked in the Week 10 quiz along with their answers, so it should help you if you are preparing for this exam.

There are 10 questions in total, covering partitional clustering, hierarchical clustering, the BIRCH algorithm, the CURE algorithm, and density-based clustering. The correct answers are marked in green with a tick sign.

Note: If the questions in the exam are different or have changed, please share them with us so that we can update this page with the latest questions and answers.

### NPTEL Introduction to Machine Learning Assignment Week 10 Answers

1. The pairwise distances between 6 points are given below. Which of the options shows the hierarchy of clusters created by the single-link clustering algorithm?

1. (a)
2. (b)
3. (c)
4. (d)

2. For the pairwise distance matrix given in the previous question, which of the following shows the hierarchy of clusters created by the complete-link clustering algorithm?

1. (a)
2. (b)
3. (c)
4. (d)

3. In BIRCH, using the number of points N, the sum of points SUM, and the sum of squared points SS, we can determine the centroid and radius of the combination of any two clusters A and B. How do you determine the radius of the combined cluster (in terms of N, SUM, and SS of the two clusters A and B)?

Radius of a cluster is given by

Note: We use the following definition of radius from the BIRCH paper: ‘Radius is the average distance from the member points to the centroid.’

1. $\text{Radius} = \sqrt{\frac{SS_A}{N_A} - \left(\frac{SUM_A}{N_A}\right)^2 + \frac{SS_B}{N_B} - \left(\frac{SUM_B}{N_B}\right)^2}$
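One way to sanity-check any candidate formula is to compare it numerically against the radius computed directly from the points. The sketch below assumes the standard additivity of BIRCH clustering features (the CF of a merged cluster is the component-wise sum of the two CFs) and the root-mean-square form of the radius used in the BIRCH paper, $R = \sqrt{SS/N - \lVert SUM/N \rVert^2}$. The helper names and the sample points are illustrative only and do not come from the assignment.

```python
import math

def cf(points):
    """Clustering feature (N, SUM, SS) of a list of d-dimensional points."""
    n = len(points)
    dim = len(points[0])
    linear_sum = [sum(p[k] for p in points) for k in range(dim)]
    squared_sum = sum(x * x for p in points for x in p)
    return n, linear_sum, squared_sum

def merge(cf_a, cf_b):
    """CF of the union of two disjoint clusters: add the components."""
    n_a, s_a, ss_a = cf_a
    n_b, s_b, ss_b = cf_b
    return n_a + n_b, [x + y for x, y in zip(s_a, s_b)], ss_a + ss_b

def radius(cluster_feature):
    """Radius from a CF alone: sqrt(SS/N - ||SUM/N||^2)."""
    n, s, ss = cluster_feature
    centroid_sq_norm = sum((x / n) ** 2 for x in s)
    # max(..., 0.0) guards against tiny negative values from rounding
    return math.sqrt(max(ss / n - centroid_sq_norm, 0.0))

def radius_direct(points):
    """Same radius computed from the raw points, for comparison."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    total = sum((p[k] - centroid[k]) ** 2 for p in points for k in range(dim))
    return math.sqrt(total / n)

# Illustrative (made-up) clusters A and B
A = [(0.0, 0.0), (2.0, 0.0)]
B = [(4.0, 3.0), (6.0, 3.0), (5.0, 6.0)]

r_from_cf = radius(merge(cf(A), cf(B)))
r_from_points = radius_direct(A + B)
print(r_from_cf, r_from_points)
```

The two printed values should agree up to floating-point rounding, which is why BIRCH only needs to store N, SUM, and SS per cluster.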

4. Statement 1: CURE is robust to outliers.

Statement 2: Because of multiplicative shrinkage, the effect of outliers is dampened.

1. Statement 1 is true. Statement 2 is true. Statement 2 is the correct reason for statement 1.
2. Statement 1 is true. Statement 2 is true. Statement 2 is not the correct reason for statement 1.
3. Statement 1 is true. Statement 2 is false.
4. Both statements are false.

5. Run K-means on the input features of the iris dataset using the following initialization:

`KMeans(n_clusters=3, random_state=seed)`

Usually, for clustering tasks, we are not given labels, but since we do have labels for our dataset, we can use accuracy to determine how good our clusters are.

Label the prediction class for all the points in a cluster as the majority true label.
E.g., {a, a, b} would be labeled as {a, a, a}.
What is the accuracy of the resulting labels?

1. 0.879
2. 0.893
3. 0.919
4. 0.933
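The majority-label procedure described in this question can be sketched as follows. This is a minimal illustration, not the assignment's reference solution: the function names are hypothetical, `seed` is left as a parameter because its value is not given here, and scikit-learn is imported lazily inside the second function so the pure-Python helper can be tried on its own.

```python
from collections import Counter

def majority_label_accuracy(y_true, cluster_ids):
    """Relabel every cluster with its majority true label, then score accuracy."""
    members = {}
    for label, cid in zip(y_true, cluster_ids):
        members.setdefault(cid, []).append(label)
    # Most common true label inside each cluster
    majority = {cid: Counter(labels).most_common(1)[0][0]
                for cid, labels in members.items()}
    y_pred = [majority[cid] for cid in cluster_ids]
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return y_pred, correct / len(y_true)

def kmeans_iris_accuracy(seed):
    """Run K-means on the iris features and apply the majority-label rule."""
    # Lazy imports: only needed when this function is actually called
    from sklearn.cluster import KMeans
    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    cluster_ids = KMeans(n_clusters=3, random_state=seed).fit_predict(X)
    return majority_label_accuracy(y.tolist(), cluster_ids.tolist())
```

For the example in the question, `majority_label_accuracy(["a", "a", "b"], [0, 0, 0])` relabels all three points as `"a"`. Calling `kmeans_iris_accuracy(seed)` with the seed specified in your assignment gives the relabeled predictions and their accuracy.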

6. For the same clusters obtained in the previous question, calculate the rand-index. Formula for rand-index:

$$R = \frac{a + b}{\binom{n}{2}}$$

Where,

a = number of pairs of elements that are in the same cluster in both label sequences
b = number of pairs of elements that are in different clusters in both label sequences

Note: The two label sequences are given by (1) the ground truth labels and (2) the prediction labels obtained from clustering as directed in Q5.

1. 0.879
2. 0.893
3. 0.919
4. 0.933
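The rand-index definition above translates almost directly into code. The following is a small pure-Python sketch (the function name `rand_index` is our own); it counts agreeing pairs over all pairs of elements, so it is quadratic in the number of points, which is fine for a dataset the size of iris.

```python
from itertools import combinations

def rand_index(labels_1, labels_2):
    """Fraction of element pairs on which two labelings agree."""
    a = 0  # pairs in the same cluster in both labelings
    b = 0  # pairs in different clusters in both labelings
    n_pairs = 0
    for i, j in combinations(range(len(labels_1)), 2):
        same_1 = labels_1[i] == labels_1[j]
        same_2 = labels_2[i] == labels_2[j]
        if same_1 and same_2:
            a += 1
        elif not same_1 and not same_2:
            b += 1
        n_pairs += 1
    return (a + b) / n_pairs
```

Passing the ground truth labels and the majority-relabeled predictions from Q5 gives the quantity asked for here. scikit-learn's `sklearn.metrics.rand_score` computes the same index and can serve as a cross-check.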

7. a in rand-index can be viewed as true positives (pairs of points belonging to the same cluster) and b as true negatives (pairs of points belonging to different clusters). How, then, are the rand-index and accuracy from the previous two questions related?

1. rand-index = accuracy
2. rand-index = 1.01 × accuracy
3. rand-index = accuracy/2
4. None of the above

8. Run BIRCH on the input features of the iris dataset using `Birch(n_clusters=3, threshold=1)`. What is the rand-index obtained?

1. 0.68
2. 0.79
3. 0.88
4. 0.98

9. Run BIRCH on the following values of the threshold parameter: [0.01, 0.02, 0.03, ..., 0.99, 1.00], using the same command as given in the previous question. What value of threshold achieves the best rand-index?

1. 0.12
2. 0.41
3. 0.58
4. 1
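The two BIRCH questions can be explored with a sketch like the one below. It makes several assumptions that the question does not spell out: the threshold grid is taken to be 0.01 to 1.00 in steps of 0.01, the rand-index is computed on the raw BIRCH cluster labels against the ground truth, and ties for the best score are broken in favor of the smallest threshold. The function names are hypothetical.

```python
def best_threshold(scores):
    """Threshold with the highest rand-index; the smallest threshold wins ties."""
    return min(scores, key=lambda t: (-scores[t], t))

def birch_rand_index_sweep(thresholds, n_clusters=3):
    """Rand-index of BIRCH on the iris features for each threshold value."""
    # Lazy imports: only needed when this function is actually called
    from sklearn.cluster import Birch
    from sklearn.datasets import load_iris
    from sklearn.metrics import rand_score

    X, y = load_iris(return_X_y=True)
    scores = {}
    for t in thresholds:
        labels = Birch(n_clusters=n_clusters, threshold=t).fit_predict(X)
        scores[t] = rand_score(y, labels)
    return scores

# Assumed grid: 0.01, 0.02, ..., 1.00
thresholds = [round(0.01 * k, 2) for k in range(1, 101)]
```

`birch_rand_index_sweep([1.0])` covers the single-threshold case from the previous question, and `best_threshold(birch_rand_index_sweep(thresholds))` picks the best value over the grid. Results may vary slightly across scikit-learn versions.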

10. Run PCA on the Iris dataset input features with `n_components=2`. Now run DBSCAN using `DBSCAN(eps=0.5, min_samples=5)` on both the original features and the PCA features. What are their respective numbers of outliers/noisy points detected by DBSCAN?

As an extra, you can plot the PCA features on a 2D plot using `matplotlib.pyplot.scatter` with parameter `c=y_pred` (where `y_pred` is the cluster prediction) to visualise the clusters and outliers.

1. 10, 10
2. 17, 7
3. 21, 11
4. 5, 10
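A hedged sketch of the experiment described in this question is shown below. It relies on the scikit-learn convention that DBSCAN assigns the label `-1` to noise points. The function names are our own, and the libraries are imported lazily so the counting helper can be used independently.

```python
def count_noise(labels):
    """Number of points DBSCAN marked as noise (label -1)."""
    return sum(1 for label in labels if label == -1)

def dbscan_noise_counts():
    """Noise counts on the original iris features and on 2-component PCA features."""
    # Lazy imports: only needed when this function is actually called
    from sklearn.cluster import DBSCAN
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)
    X_pca = PCA(n_components=2).fit_transform(X)

    y_pred_orig = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
    y_pred_pca = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_pca)
    return count_noise(y_pred_orig.tolist()), count_noise(y_pred_pca.tolist())

def plot_pca_clusters():
    """Optional: scatter plot of the PCA features coloured by DBSCAN label."""
    import matplotlib.pyplot as plt
    from sklearn.cluster import DBSCAN
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)
    X_pca = PCA(n_components=2).fit_transform(X)
    y_pred = DBSCAN(eps=0.5, min_samples=5).fit_predict(X_pca)
    plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y_pred)
    plt.show()
```

`dbscan_noise_counts()` returns the pair (noise points on the original features, noise points on the PCA features), in the same order as the answer options.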

### FAQ

What is NPTEL Introduction to Machine Learning?

NPTEL Introduction to Machine Learning is a free online course from IIT Madras, developed by Prof. Balaraman Ravindran. The main aim of this course is to teach the basic concepts of machine learning from a mathematically well-motivated perspective.

Yes, all these answers are 100% correct.