COMP 2211 Exploring Artificial Intelligence

Lab 4 K-Means Clustering

Review


This is a review of K-Means Clustering. It aims to refresh your memory of what you have learned in class.

  • Clustering
  • K-Means Clustering
    • The Algorithm
    • Evaluation of K-Means Clustering
    • Limitation
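
As a quick refresher on the topics above, here is a minimal NumPy sketch of the algorithm and its SSE evaluation. It is not the code from the review notebook: the function names kmeans, assign and sse, the seed parameter, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def assign(X, centroids):
    # Squared distance from every point to every centroid, shape (n, k),
    # then the index of the nearest centroid for each point.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-Means on an (n, d) array X; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct data points (one common choice).
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        labels = assign(X, centroids)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return assign(X, centroids), centroids

def sse(X, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return float(((X - centroids[labels]) ** 2).sum())

# Toy data (made up): two well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = kmeans(X, k=2)
print(labels, centroids, sse(X, labels, centroids))
```

A lower SSE means tighter clusters for a given k. On harder data the result can depend on the initial centroids, which is one of the usual limitations of the method.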

Please download the review notebook by right-clicking the link. Then, select "Save link as..." to save the file to your local disk. Next, upload it to your Google Drive and open it using Google Colab.

Lab Work


A number of lab tasks are given to help you familiarize yourself with the K-Means algorithm (and to practice your NumPy programming skills). Please download the lab tasks notebook by right-clicking the link. Then, select "Save link as..." to save the file to your local disk. Next, upload it to your Google Drive and open it using Google Colab.

Model Answers

Right-click the link and select "Save link as..." to download the solution. Please ask if you do not understand the solution. As I said before, knowing how to use NumPy is extremely important. Please don't hesitate to ask when you're in doubt.

Submission & Deadline

  • Deadline: Friday, 18 March 2022, 23:59
  • You may earn 2 points for each lab via Automated Grading on the ZINC Online Submission System
  • Please check here for a usage overview of ZINC
  • Export and zip lab4_tasks.py (the name should be the same, including its case), and submit the zip file to ZINC
  • You may submit your file multiple times, but only the latest version will be graded
  • Lab work submitted via channels other than ZINC will NOT be accepted. Late submissions will NOT be accepted either

Frequently Asked Questions

  • Q: Why can't I see my output in the ZINC system so that I can debug my program?
    A: The ZINC system is meant to test your program, not to validate it. You should validate your code against the expected results provided in the notebook.
  • Q: Do I need to loop inside the run function of the KClutser class?
    A: No. The loop belongs outside the run function. I have already written it where I call run, so you don't need to worry about convergence.

This list is incomplete; you can help by expanding it
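
To illustrate the second answer, the sketch below shows the general pattern of a run method that performs one iteration while the caller owns the loop. The class OneStepKMeans, its constructor, the stopping rule and the data are hypothetical and are not the lab's actual code; only the attribute name centroid follows the notebook.

```python
import numpy as np

class OneStepKMeans:
    """Hypothetical stand-in for the lab class: run() does a single iteration."""

    def __init__(self, centroids):
        self.centroid = np.asarray(centroids, dtype=float)

    def run(self, X):
        # One assignment step followed by one update step; no loop in here.
        dists = ((X[:, None, :] - self.centroid[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(len(self.centroid)):
            if np.any(labels == j):
                self.centroid[j] = X[labels == j].mean(axis=0)
        return labels

# Made-up 1-D data and starting centroids.
X = np.array([[0.0], [1.0], [9.0], [10.0]])
model = OneStepKMeans(centroids=[[0.0], [1.0]])

# The caller repeats run() and decides when to stop
# (here: when the labels no longer change, an assumed stopping rule).
previous = None
iterations = 0
for _ in range(100):
    labels = model.run(X)
    iterations += 1
    if previous is not None and np.array_equal(labels, previous):
        break
    previous = labels

print(iterations, labels, model.centroid)
```

Keeping run to a single step makes each iteration easy to test on its own, which is presumably why the grading harness drives the loop.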


Errata

This section lists the modifications that were made to the corresponding files. You can either update your copy of the file yourself or download the updated file and move your code into it.

    lab4_tasks.ipynb

  • In the first block of code, in the function def isNotebook():, an extra condition shell == 'Shell' is added so that the code works in Google Colab.
  • In the first block of code, if isNotebook: is changed to if isNotebook():.
  • In the fourth block of code, the line print('SSE: ', sse := SSE(X, output, 3, kmean.centroid)) is replaced with
    sse = SSE(X, output, 3, kmean.centroid)
    print('SSE: ', sse)
  • In the first block of code, import matplotlib.pyplot as plt on the second line is removed.
  • In the result, sse is changed to round(sse, 5) to avoid floating-point rounding errors.
  • More hints and guidance are added to help with vectorizing the code.
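
The sketch below illustrates three of the changes above: the Colab check, the missing call parentheses, and the rounding. The body of isNotebook is a guess at a common way to write such a check and may differ from the notebook; 'ZMQInteractiveShell' as the Jupyter shell name and the sample numbers are assumptions. The change that drops := is not shown; that operator needs Python 3.8 or later, which may be why it was replaced, although the errata do not say so.

```python
def isNotebook():
    """Return True when running inside a Jupyter or Google Colab notebook."""
    try:
        # get_ipython exists only when the code runs under IPython.
        shell = get_ipython().__class__.__name__
    except NameError:
        return False  # plain Python interpreter
    # 'ZMQInteractiveShell' is Jupyter; 'Shell' is what Google Colab reports.
    return shell in ('ZMQInteractiveShell', 'Shell')

# A function object is always truthy, so `if isNotebook:` would run its
# branch everywhere; the call `isNotebook()` is what returns the answer.
always_true = bool(isNotebook)
in_notebook = isNotebook()

# Rounding before comparing hides tiny floating-point differences.
raw = 0.1 + 0.2              # not exactly 0.3 in binary floating point
matches = round(raw, 5) == 0.3

print(always_true, in_notebook, raw == 0.3, matches)
```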
