COMP 2211 Exploring Artificial Intelligence

Lab 6 Multilayer Perceptron

Review


This part of the lab reviews the Multilayer Perceptron. It aims to refresh your memory of what you have learned in class.

  • Multilayer Perceptron
    • Basics of the MLP
    • Loss function
    • Gradient Descent
    • Procedure for applying an MLP to a real problem
    • Some remarks on MLPs
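The topics above can be tied together in a minimal numerical sketch (this is illustrative code, not the lab's implementation): a one-hidden-layer MLP trained by gradient descent on a mean-squared-error loss, with backpropagation written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: the XOR problem, which a single-layer perceptron cannot solve.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # hidden layer (4 units)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # output layer
lr = 0.5
losses = []

for _ in range(2000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(np.mean((out - y) ** 2))        # MSE loss

    # Backward pass (chain rule through each layer),
    # then one gradient-descent update.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    dW2 = h.T @ d_out; db2 = d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * h * (1 - h)
    dW1 = X.T @ d_h; db1 = d_h.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```

The recorded losses should fall as training progresses, which is the behaviour the lab asks you to inspect when you plot training and validation curves.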

Please download the notebook by right-clicking and selecting "Save link as", then open it using Google Colab. You should see the following if you open the notebook successfully.


Introduction

Sentiment analysis studies the emotions expressed in text. It has applications such as marketing-strategy refinement, social media monitoring, and product analysis.
https://ai.stanford.edu/~amaas/data/sentiment/

In this lab, we will use the IMDb movie reviews dataset to predict each user's attitude from the review they wrote.


Lab Work


A couple of lab tasks are given to you to practice your skills in processing data and building an AI model using an MLP. Please download the notebook, the .py submission template, the dataset, and the GloVe embedding, and open them in Google Colab. You should see the following if you open the notebook successfully.




Submission & Grading

  • Deadline: Friday, 15 April 2022, 23:59
  • You may earn 2 points for each lab via Automated Grading on the ZINC Online Submission System
  • Please check here for a usage overview of ZINC
  • Zip lab6_tasks.py (the name should be the same, including its case), and submit the zip file (i.e. lab6_tasks.py.zip) to ZINC
  • You may submit your file multiple times, but only the latest version will be graded
  • Lab work submitted via channels other than ZINC will NOT be accepted. Late submission will NOT be accepted as well

Frequently Asked Questions

  • Remember to call the preprocessing() function for each movie review when assigning the variable X.
  • Q: Why does my plot of the training and validation accuracy show a horizontal line?
    A: Check whether something is wrong in Tasks 1 and 2. One possible cause is that wordDict stores incorrect or too few word embeddings, leaving no valid word representations for training to succeed.
  • Q: What is embedding_layer in Task 3? I can't find it in the lecture notes.
    A: The embedding layer holds the GloVe word representation for each token ID produced by the tokenizer. X_train and X_test (arrays of integer token IDs) are mapped through the embedding matrix, producing a matrix of GloVe representations with one row per word in a movie review. This part is not covered in the lecture notes, which is why the embedding-layer code is already provided in the notebook.
  • Q: What exactly is passed to the model as training input? Is it an array of shape (150, 100)?
    A: Each movie review is passed as an array of shape (150,); the embedding layer transforms it into shape (150, 100).
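The shape transformation described in the two answers above can be sketched with plain NumPy. Names such as build_embedding_matrix, word_index, and glove are illustrative assumptions rather than the lab's exact API; the point is that row i of the embedding matrix holds the GloVe vector for token ID i, so indexing the matrix with a (150,) array of token IDs yields a (150, 100) matrix.

```python
import numpy as np

def build_embedding_matrix(word_index, glove, dim=100):
    # word_index maps word -> token ID (as a tokenizer would);
    # glove maps word -> dim-dimensional vector.
    # Row 0 is reserved for padding; unknown words stay all-zero.
    matrix = np.zeros((len(word_index) + 1, dim))
    for word, idx in word_index.items():
        vec = glove.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix

# Hypothetical two-word vocabulary and GloVe vectors for illustration.
word_index = {"great": 1, "movie": 2}
glove = {"great": np.full(100, 0.5), "movie": np.full(100, -0.5)}
E = build_embedding_matrix(word_index, glove)

# A padded review of 150 token IDs: two real words, then padding (ID 0).
review_ids = np.zeros(150, dtype=int)
review_ids[:2] = [1, 2]
vectors = E[review_ids]   # shape (150, 100): one GloVe vector per position
```

Inside the model, the Embedding layer performs exactly this lookup, turning each (150,) batch row into a (150, 100) matrix of word representations.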

