# **COMP 2211 Exploring Artificial Intelligence**
## Convolutional Neural Network

![cnn.png](https://miro.medium.com/max/1400/1*irWQaiIjHS27ZAPaVDoj6w.png)

## **Lab Tasks Procedure**
1. Data preprocessing **(Task 1)**
2. Build the model **(Task 2)**
3. Compile the model
4. Train the model
5. Save the model

Make sure the GPU accelerator is enabled in Colab: 'Edit' -> 'Notebook settings' -> 'Hardware accelerator'.

In [None]:
# check your Colab device
import tensorflow as tf
import pprint
device_name = tf.config.list_physical_devices()
pprint.pprint(device_name)

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


## Download dataset

In [None]:
"""
    Download necessary files for sanity check
"""
username = input("Please enter your username: ")
import getpass
password = getpass.getpass("Please enter your password: ")
url = f'https://{username}:{password}@course.cse.ust.hk/comp2211/labs/lab8/task_data.zip'
!wget $url -O task_data.zip
!unzip -q task_data.zip -d .

Please enter your username: zraoac
Please enter your password: ··········
--2022-04-25 03:14:29--  https://zraoac:*password*@course.cse.ust.hk/comp2211/labs/lab8/task_data.zip
Resolving course.cse.ust.hk (course.cse.ust.hk)... 143.89.41.176
Connecting to course.cse.ust.hk (course.cse.ust.hk)|143.89.41.176|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Authentication selected: Basic realm="Enter Your CSD PC/Unix Password"
Reusing existing connection to course.cse.ust.hk:443.
HTTP request sent, awaiting response... 200 OK
Length: 71825930 (68M) [application/zip]
Saving to: ‘task_data.zip’


2022-04-25 03:14:34 (16.9 MB/s) - ‘task_data.zip’ saved [71825930/71825930]



## Dataset: **Fruit Recognition**
---
- Training set size: 15178.
- Number of classes: 33.
- Image size: 100 x 100 pixels.

In [None]:
import os

data_dir = './task_data'
# os.listdir() returns the names of the entries in data_dir (one subfolder per category)
# sorted() puts the list in alphabetical order
category_list = sorted(os.listdir(data_dir))

# create a dict mapping the category name to the class index
cate2Idx = {}
for i in range(len(category_list)):
  cate2Idx[category_list[i]] = i
print(cate2Idx)

{'Apple Braeburn': 0, 'Apple Granny Smith': 1, 'Apricot': 2, 'Avocado': 3, 'Banana': 4, 'Blueberry': 5, 'Cactus fruit': 6, 'Cantaloupe': 7, 'Cherry': 8, 'Clementine': 9, 'Corn': 10, 'Cucumber Ripe': 11, 'Grape Blue': 12, 'Kiwi': 13, 'Lemon': 14, 'Limes': 15, 'Mango': 16, 'Onion White': 17, 'Orange': 18, 'Papaya': 19, 'Passion Fruit': 20, 'Peach': 21, 'Pear': 22, 'Pepper Green': 23, 'Pepper Red': 24, 'Pineapple': 25, 'Plum': 26, 'Pomegranate': 27, 'Potato Red': 28, 'Raspberry': 29, 'Strawberry': 30, 'Tomato': 31, 'Watermelon': 32}
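
The same mapping can be built more compactly with `enumerate` and a dict comprehension. This is only a minimal sketch: `example_categories` and `example_cate2Idx` are hypothetical names, and the short list stands in for the real folder names.

```python
# Hypothetical stand-in for sorted(os.listdir(data_dir)).
example_categories = sorted(['Banana', 'Apricot', 'Avocado'])

# enumerate() yields (index, name) pairs, so each name maps to its position
# in the sorted list.
example_cate2Idx = {name: idx for idx, name in enumerate(example_categories)}
print(example_cate2Idx)
```

Because the list is sorted first, the class index of each category stays the same from run to run.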


In [None]:
# Import libraries
import cv2
import numpy as np

from sklearn.model_selection import train_test_split
import keras
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Dense, Dropout, Flatten

## 1. Data preprocessing

We need to load the data and store it in an appropriate format.

### Task 1

Complete the following code.

1. Load images.
2. Resize images from 100 x 100 to 28 x 28.
3. Save the image data in **x**.
4. Save the corresponding class index in **y**.

In [None]:
# Input: data_dir(str) -- the path of data.
#        cate2Idx(dict) -- mapping the category name to class index.
# Return: x(array) -- the image data; the shape in this task should be (15178, 28, 28, 3).
#         y(array) -- the class index of each image; the shape in this task should be (15178,).
def data_preprocessing(data_dir, cate2Idx):
  x = None
  y = None
  #### TODO HERE
  x = []
  y = []
  category_list = sorted(os.listdir(data_dir))
  for category in category_list:
    category_dir = os.path.join(data_dir, category)
    for img_name in os.listdir(category_dir):
      # cv2.imread returns BGR; convert to RGB, then resize to 28 x 28
      img = cv2.imread(os.path.join(category_dir, img_name))
      img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
      img = cv2.resize(img, (28, 28))
      x.append(img)
      y.append(cate2Idx[category])

  x = np.asarray(x)
  y = np.asarray(y)

  #### END TODO
  return x, y
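
`cv2.imread` returns pixels in BGR channel order, which is why the solution converts each image to RGB. As an illustration only, the channel swap amounts to reversing the last axis of the array. The tiny `bgr` array below is made up for this example and is not part of the dataset.

```python
import numpy as np

# A made-up 1 x 2 image in BGR order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis turns (B, G, R) into (R, G, B).
rgb = bgr[..., ::-1]
print(rgb[0, 0], rgb[0, 1])
```

The image shape is unchanged; only the order of the three channel values in each pixel differs.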

In [None]:
x, y = data_preprocessing(data_dir, cate2Idx)
print(x.shape, y.shape)

(15178, 28, 28, 3) (15178,)


In [None]:
# split the dataset to train and test parts
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

# Convert each integer label into a 33-element one-hot vector.
y_train = np_utils.to_categorical(y_train, len(category_list))
y_test = np_utils.to_categorical(y_test, len(category_list))
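
To see what the one-hot conversion produces, here is a small NumPy sketch of the same idea. It is not the Keras implementation, and the `labels` array is made up for illustration.

```python
import numpy as np

num_classes = 33
labels = np.array([0, 4, 32])  # made-up class indices

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix by the labels encodes them all at once.
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

print(one_hot.shape)            # (3, 33)
print(one_hot.argmax(axis=1))   # recovers the original class indices
```

Each row contains a single 1 at the position of its class index and 0 everywhere else, which is the target format that a softmax output layer with 33 units is trained against.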

## 2. Build the model

### Task 2

Complete the following code. You need to build your own model with at least 2 convolutional layers and 2 dense layers.

In [None]:
# - Only use Conv2D, MaxPooling2D, Dense, Dropout and Flatten.
# - At least 2 convolutional layers and 2 dense layers.
def custom_model():
  model = None
  #### TODO HERE
  model = Sequential([
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 3)),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(units=128, activation='relu'),
    Dense(units=33, activation='softmax')
  ])
  #### END TODO
  return model

In [None]:
model = custom_model()
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 32)        896       
                                                                 
 conv2d_1 (Conv2D)           (None, 24, 24, 64)        18496     
                                                                 
 max_pooling2d (MaxPooling2D  (None, 12, 12, 64)       0         
 )                                                               
                                                                 
 flatten (Flatten)           (None, 9216)              0         
                                                                 
 dense (Dense)               (None, 128)               1179776   
                                                                 
 dense_1 (Dense)             (None, 33)                4257      
                                                        

## 3. Compile the model

In [None]:
# compile the model
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
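
For intuition about the loss chosen above, categorical cross-entropy for one sample is the negative log of the probability the model assigns to the true class. The following is a simplified NumPy sketch, not Keras's implementation; the function name, the `eps` clipping value, and the sample arrays are assumptions made for illustration.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions to avoid log(0), then average
    # -sum(y_true * log(y_pred)) over the batch.
    y_pred = np.clip(y_pred, eps, 1.0)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(per_sample))


# Two made-up samples with three classes each.
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

The loss shrinks toward 0 as the predicted probability of the true class approaches 1.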

## 4. Train the model

You can also try different hyperparameters, such as `batch_size` and `epochs`.

In [None]:
model.fit(x_train, y_train, batch_size=128, epochs=5, validation_data=(x_test, y_test))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f1a802306d0>
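
When tuning `batch_size`, it helps to know how many weight updates one epoch performs. The sketch below works this out for the settings used in this lab. It assumes that `train_test_split` rounds the test share up, which is my understanding of its behaviour rather than something stated in this notebook.

```python
import math

n_samples = 15178   # dataset size from the lab description
test_size = 0.2     # value passed to train_test_split
batch_size = 128    # value passed to model.fit

# Assumption: the test set takes ceil(n_samples * test_size) samples
# and the training set gets the rest.
n_test = math.ceil(n_samples * test_size)
n_train = n_samples - n_test

# One step per batch; the last batch may be smaller than batch_size.
steps_per_epoch = math.ceil(n_train / batch_size)
print(n_train, n_test, steps_per_epoch)
```

Halving `batch_size` roughly doubles the number of steps per epoch, so each epoch performs more, but noisier, updates.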

## 5. Save the model

Save your model and submit it to ZINC.

In [None]:
model_name = 'model_lab8.h5'
model.save(model_name, save_format='h5')