COMP 2211 Exploring Artificial Intelligence

Programming Assignment 2: Image Classification using CNN

Quick, Draw! Banner

Introduction

The Quick, Draw! dataset is a doodling dataset of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!.

In this assignment, you are given a small subset of the dataset comprising 8 categories and 5000 images per category, on which you will build a CNN model to classify the doodles. You will mainly use the Keras library, with a small amount of NumPy and non-Keras TensorFlow.

Please refer to the Piazza posts or the Assignment Tasks section below for the changelog of the skeleton files since the PA release.

Assignment Tasks

The following bullet points give you a general idea of what to do in this assignment.

  • Loading and Peeking at the dataset
    • Task 1: Reshape X to correct format
    • Task 2: One-hot encode Y
  • Data augmentation
    • Task 3: Build the data augmentation pipeline
  • Build the model
    • Task 4: Build the main CNN model
  • Compile and train the model
    • Task 5: Compile the model
    • Task 6: Train the model
  • Evaluate the model
    • Task 7: Evaluate the model on the test dataset
    • Task 8: Use the model to predict specific images
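For orientation, Tasks 1 and 2 boil down to a reshape and a one-hot encoding. Below is a minimal NumPy sketch on made-up data; the variable names (`x_flat`, `y_int`) and the sample count are illustrative, not from the skeleton:

```python
import numpy as np

# Hypothetical flat data: 100 images, each a 1D array of length 784 (28*28),
# with integer labels in [0, 8) for the 8 categories.
rng = np.random.default_rng(0)
x_flat = rng.random((100, 784))
y_int = rng.integers(0, 8, size=100)

# Task 1 style: reshape to (num_images, 28, 28, 1). The trailing 1 is the
# single grayscale channel that Conv2D layers expect.
x = x_flat.reshape(-1, 28, 28, 1)

# Task 2 style: one-hot encode the integer labels into shape (num_images, 8).
y = np.eye(8)[y_int]

print(x.shape)  # (100, 28, 28, 1)
print(y.shape)  # (100, 8)
```

The same encoding can also be done with `keras.utils.to_categorical`; the `np.eye` trick is shown here only to keep the sketch free of TensorFlow.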

Please download the notebook here. The first code block in the notebook will download the following additional files, but you can also download them manually and comment out the first cell.

The final file structure looks like this:

pa2.ipynb
pa2.py
draw.npz
draw_example/
    drums/
        0.png
        ...
        3.png
    eiffel_tower/
        ....
    ...

If you open the notebook successfully, you should see the following. Colab notebook preview

Changes made to the following files:

  • Not related to code
    • May 6: In notebook Task 8, removed the line "optionally, set the batch size to 32". That is already the default behavior according to the documentation.
    • Apr 25: In the notebook, each image in the dataset is a 1D array of length 784 (28*28), not 728 as previously written.
    • Apr 25: In notebook Task 7, model.evaluate doesn't have parameters for a validation set. I removed the sentence mentioning x_val and y_val; it was copied from Task 6 and I forgot to remove it.
    • Apr 25: In notebook Optional Tasks Parameter Tuning: changed "accuracy" to "loss" in "if the loss doesn't go down/goes down too slowly..."
    • Apr 19: In notebook Task 8, model.predict returns an array of shape (num_images, N_labels), not (num_images,) as previously written.
    • Apr 16: In the notebook, the dataset we use is a subset of Quick Draw containing 8 categories and 5000 (not 1000) images per category.

Submission and Deadline

Deadline: 23:59:00 on May 12, 2022 (Thursday).


ZINC Submission

Create a single zip file that contains pa2.py and draw_model.h5 (your model saved by the code we provided in the notebook). Please do not create a folder inside the zip.
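If you script the packaging step, a Python stdlib sketch like the following keeps the archive flat; the zip filename `pa2_submission.zip` is an arbitrary choice, and the placeholder-file step exists only to make the demo self-contained (skip it if your real files already exist):

```python
import zipfile
from pathlib import Path

# Demo setup only: create placeholder files. In your case pa2.py and
# draw_model.h5 already exist, so do NOT overwrite them like this.
Path("pa2.py").write_text("# your solution\n")
Path("draw_model.h5").write_bytes(b"")

# Write both files at the top level of the archive -- no folder inside the
# zip. Passing arcname as the bare filename strips any directory component.
with zipfile.ZipFile("pa2_submission.zip", "w") as zf:
    for name in ("pa2.py", "draw_model.h5"):
        zf.write(name, arcname=name)

names = zipfile.ZipFile("pa2_submission.zip").namelist()
print(names)  # ['pa2.py', 'draw_model.h5']
```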

Submit the zip file to ZINC. ZINC usage instructions can be found here.

There are 2 ZINC submission entries. The one with validation in its name is for validating that your code runs, and it reports scores immediately. The one without validation in its name is the final destination for your submission, and it won't report scores before the deadline!

Notes:
  • You may submit your file multiple times, but only the latest version will be graded.
  • Submit early to avoid any last-minute problems. Only ZINC submissions will be accepted.
  • The ZINC server will be very busy on the last day, especially in the last few hours, so expect the grading report to arrive slowly. However, as long as your submission is successful, we will grade your latest submission with all test cases after the deadline.
  • If you have encountered any server-side problems or webpage glitches with ZINC, you may post on the ZINC support forum to get attention and help from the ZINC team quickly and directly. If you post on Piazza, you may not get the fastest response as we need to forward your report to them, and then forward their reply to you, etc.

Running Requirement

We grade each function separately. For each function, we have already provided some dummy code so that it runs without raising exceptions. If your implementation of a function raises an error, you will not get the score for that function, as expected. However, a wrong implementation that raises no error does not guarantee any score either (otherwise you could score using the unmodified skeleton code XD).


Reminders

Make sure you upload the correct version of your source files - we only grade what you upload. Some students in the past submitted an empty or wrong file, which is worth zero marks, so double-check the file you have submitted.


Late Submission Policy

There will be a penalty of -1 point (out of 100 points) for every minute you are late.

For instance, if you submit your solution at 1:00:00 a.m. on May 13, there will be a penalty of -61 points for your assignment (since the deadline of assignment 2 is 23:59:00 on May 12).

However, the lowest grade you may get from an assignment is zero: any negative score after the deduction due to a late penalty (and any other penalties) will be reset to zero.

Grading Scheme

  • Task 1: train_x, val_x, test_x have correct shapes and values (1pt for each)
  • Task 2: train_y, val_y, test_y have correct shapes and values (1pt for each)
  • Task 3: AugmentationLayer
    • The Sequential has correct layer types (i.e. classes) in the correct order. (2pt. No intermediate scores if some layers are wrong)
    • The Sequential’s layers are constructed with correct parameters (1pt. No intermediate scores if some layers are wrong. No score if the test case above fails.)
  • Task 4: Building the main model
    • The Sequential has correct layer types (i.e. classes) in the correct order. (3pt. No intermediate scores if some layers are wrong)
    • The Sequential’s layers are constructed with correct parameters (1pt. No intermediate scores if some layers are wrong. No score if the test case above fails.)
  • Task 5: Compiling the model
    • The loss, the optimizer type, the metrics are correctly specified (1pt for each)
    • The learning rate can be set from the function parameter (1pt)
  • Task 6: Training the model
    • The epoch number can be set from the function parameter (1pt)
    • The batch size is set correctly (1pt)
    • The validation data and validation batch size are set correctly (2pt. No intermediate scores)
  • Task 7: Evaluate the model
    • The test data and batch size are set correctly (2pt. No intermediate scores)
  • Task 8: Predict images
    • The function outputs correct values (1pt)
  • Your draw_model
    • The accuracy on test dataset is above 60% (1pt)
    • The main model has correct weights (2pt. The correctness check accounts for the intrinsic randomness of GPU computation. No intermediate scores if some layers' weights are wrong.)

Frequently Asked Questions

Q: ZINC only reports one line of error "test_pa2_pre" instead of multiple lines of test cases. Only one error
A: That's because ZINC fails to even start the test unit. The most likely cause is that you imported third-party libraries other than tensorflow, keras, and numpy; the test environment doesn't have them. Common pitfalls are matplotlib and sklearn.
One special case is tkinter, which passes in the validation submission but fails in the final submission.
Also check whether your IDE auto-imported some library through a wrong tab-completion that you forgot to remove afterwards.

Q: Some plots have "unknown" labels in the captions. Or some plots write an entire array to the captions.
A: The plotting code itself is correct. As long as your tasks return correct values in correct shapes, the plots will have sane captions.

Q: In Task 4, I get the following error: Input 0 of layer conv2d_xx is incompatible with the layer: ...
A: First, don't use the augmentation_layer variable from the notebook. Second, it is better not to write your tasks in the notebook, or at least not to reuse or edit any local variables already defined there. Also see the following question.
The cause of this specific error is explained in Piazza @294.

Q: Can I write my codes in the notebook first, and copy to .py later?
A: You can. But do not rely on any local variables in the notebook (you will get rid of them after pasting anyway), and remember to rerun and validate your code after pasting. For example, using the augmentation_layer variable from the provided code cell in your build_model will cause an error.

Q: Is my model acceptable if I achieve xx/16 in the last task?
A: The last task only demonstrates your implementation of the single-image prediction function. The accuracy on those 16 images involves a lot of luck and does not indicate the quality of your model. Refer to your Task 7 output for the model accuracy, and make sure that number is above 60%.

Q: My accuracies are all zero during training!
A: If you use metrics=[keras.metrics.Accuracy()] or metrics=[tf.keras.metrics.Accuracy()], replace it with metrics=["accuracy"]. The string lets Keras pick the metric matching your loss and labels (categorical accuracy here), whereas the Accuracy class reportedly checks exact equality between labels and raw model outputs, which almost never holds for one-hot targets against probability vectors.

Q: Can we use literal 28 instead of sqrt when we reshape?
A: Yes.

Q: Validation accuracy is greater than training accuracy in this dataset!
A: Good for you to question that! See piazza @327.

Q: My code doesn't work. There is an error/bug. Here is the code. Can you help me fix it?
A: As the assignment is a major course assessment, to be fair, you are supposed to work on it on your own and we should not finish the tasks for you. We are happy to help with explanations and advice, but we shall not directly debug the code for you.

Q: Are we allowed to use external libraries (e.g., scikit-learn) to implement this assignment?
A: In this assignment, we will only be using NumPy and TensorFlow (Keras is part of TensorFlow). You are NOT allowed to import extra external libraries (i.e., no scikit-learn). The goal of this assignment is to get familiar with Keras specifically, by building a CNN image classification model.

Q: Are we allowed to use Python standard libraries (e.g., from collections import defaultdict)?
A: Yes, Python standard libraries are allowed. Please visit here for an official comprehensive list of modules included in Python 3.7 (Colab deploys Python 3.7 for now, please also test on Colab if you use local machines).

Q: If ZINC says I have achieved "Total Score ?/?", does that mean I have passed the assignment and obtained full marks?
A: No, not necessarily. We will re-grade your submitted assignment file using another set of test cases, so you may get different marks if you fail some of the test cases during the re-grading performed after the submission deadline. Please check your code more thoroughly.

...

Solution

Solution: pa2_sol.py and correct_model.h5

There are three grading attempts, and the latest is the final score (because I fixed some bugs in the grading scripts). We are sorry that the ZINC frontend doesn't provide PyTest details. A hacky way to see the error details on ZINC is elaborated below. You can also download the test files and the notebook to run the tests yourself on Colab; how to run them is explained in the notebook.

I have questions regarding the grades!

The appeal process is underway. You can check your error causes first. If you received empty scores due to timeout, received 0 because of syntax issues, or have other questions about the grading program, please wait until I work out the appeal scheme.

The following describes how to see the error messages on ZINC

You can go to your submission page, open the developer panel (e.g., F12 on Chrome-like browsers; Safari requires Preferences - Advanced - Show development menu in menu bar, then opt+cmd+i) and click on the "Network" tab. Then click on the "View details" button of the specific grading result. You can see a "graphql" item pop up in the network tab. (It only shows up upon the first click of this button. If you clicked the button before opening devtool, there will be no new item popping up. You can refresh the page, open devtool first, then click on the button.)

Click on the graphql item and go to the Preview tab on the right. Expand the JSON data along "data - report - sanitizedReports - pyTest - 0 - report - testsuites - 0 - testcases - <some number> - failures - 0". The error traceback is in "context" and any extra message is in "message". ZINC devtool demo The result is one huge string with escaped newline characters. You can paste it into a Python print("..") call or a terminal echo ".." command to see it clearly.

Supplementary Notes

This part is not related to the assignment itself. It contains some extra information in case anyone is interested. Not reading this won't affect your grades.

Q: Why are there so many set_random_seeds in the notebook?

In short, don't copy me. The correct practice in a normal program is to set a seed only once at the top. Imagine all the randomness of the computer comes from lookups in a seemingly random but fixed long string, and the seed controls where to start. You set it once and let the computer proceed from there. If you set the seed a second time with the same value, you "reset" the random state of the program back to when you set it the first time, and code you intend to be random becomes deterministic. In the PA, however, I set it again right before training the model, so no matter how you have played with the model before and how far the random state has moved on, the trained model should be the same. Otherwise, since there is randomness in each layer's initializer and in the dropout layer, different random states may result in different final models.
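The reset behavior is easy to see with Python's stdlib random module (the notebook seeds NumPy and TensorFlow in the same spirit):

```python
import random

random.seed(42)
first = [random.randint(0, 9) for _ in range(5)]

# Setting the same seed again "resets" the random state, so the next draws
# repeat the earlier ones exactly. This is why the notebook re-seeds right
# before training: the trained model comes out the same on every run.
random.seed(42)
second = [random.randint(0, 9) for _ in range(5)]

print(first == second)  # True
```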


Q: I work on my local machine with an all-round IDE. It reports annoying linting issues in the codes.

Short answer: either download this .flake8 file, put it next to your PA scripts and restart your IDE, or disable the linter in your IDE if that doesn't work. Or you can leave the warnings there, as long as your code runs correctly.

If you don't know what a linter is: they are like Grammarly for programming languages, helping you write beautiful and robust code. But our skeleton code contains some deliberate formatting that is not conventionally considered beautiful.

If your linter is flake8 like mine, the above config file will suppress those specific warnings. If you use another linter, see whether it can recognize a flake8 config; otherwise, you may want to turn your linter off. If you don't have a linter or don't know what I am talking about, just ignore this!

Page maintained by
Homepage