Image recognition with TensorFlow: Improving the model
=======================================================

In the first session, we learned how to build a basic image recognition model in TensorFlow, but our model performed rather poorly. Today, we'll go over various ways to improve it.

.. tab:: Python

   .. code:: python

      import matplotlib.pyplot as plt
      import numpy as np
      import PIL
      import tensorflow as tf

      from tensorflow import keras
      from tensorflow.keras import layers
      from tensorflow.keras.models import Sequential

We'll need to get our dataset again: images of daisies, dandelions, roses, sunflowers, and tulips.

.. tab:: Python

   .. code:: python

      import pathlib

      dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
      data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
      data_dir = pathlib.Path(data_dir)

We can then reload our dataset into Keras, recreating the training and validation splits.

.. tab:: Python

   .. code:: python

      batch_size = 32
      img_height = 180
      img_width = 180

      train_ds = tf.keras.utils.image_dataset_from_directory(
          data_dir,
          validation_split=0.2,
          subset="training",
          seed=123,
          image_size=(img_height, img_width),
          batch_size=batch_size)

      val_ds = tf.keras.utils.image_dataset_from_directory(
          data_dir,
          validation_split=0.2,
          subset="validation",
          seed=123,
          image_size=(img_height, img_width),
          batch_size=batch_size)

      class_names = train_ds.class_names

      AUTOTUNE = tf.data.AUTOTUNE

      train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
      val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

.. tab:: Output

   .. code:: none

      Found 3670 files belonging to 5 classes.
      Using 2936 files for training.
      Found 3670 files belonging to 5 classes.
      Using 734 files for validation.

Our first model was built by constructing various layers that feed into one another. The layers so far included:

- rescaling our values to be between 0 and 1, rather than 0 and 255
- flattening our data into a single dimension, rather than three
- a fully connected Dense layer that uses all pixel information
- another fully connected layer that makes class predictions

We then compile our model, picking an optimizer, a loss function, and a metric to evaluate.

.. tab:: Python

   .. code:: python

      num_classes = len(class_names)

      model = Sequential([
          layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
          layers.Flatten(),
          layers.Dense(32, activation='relu'),
          layers.Dense(num_classes)
      ])

      model.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

      model.summary()

.. tab:: Output

   .. code:: none

      Model: "sequential_1"
      _________________________________________________________________
       Layer (type)                Output Shape              Param #
      =================================================================
       rescaling_1 (Rescaling)     (None, 180, 180, 3)       0

       flatten_1 (Flatten)         (None, 97200)             0

       dense_2 (Dense)             (None, 32)                3110432

       dense_3 (Dense)             (None, 5)                 165

      =================================================================
      Total params: 3,110,597
      Trainable params: 3,110,597
      Non-trainable params: 0
      _________________________________________________________________

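As a quick check on where these parameter counts come from (a small aside, not part of the original lesson), we can reproduce them by hand: flattening a 180×180×3 image gives 97,200 values, and a fully connected layer needs one weight per input for each node, plus one bias per node.

.. code:: python

   # Reproducing the numbers reported by model.summary() by hand.
   flattened = 180 * 180 * 3            # each image becomes 97,200 values
   dense_params = flattened * 32 + 32   # weights + biases for the 32-node Dense layer
   output_params = 32 * 5 + 5           # weights + biases for the 5-class output layer

   print(flattened, dense_params, output_params)   # 97200 3110432 165
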
We then run our model for 10 epochs.

.. tab:: Python

   .. code:: python

      epochs=10
      history = model.fit(
          train_ds,
          validation_data=val_ds,
          epochs=epochs
      )

.. tab:: Output

   .. code:: none

      Epoch 1/10
      92/92 [==============================] - 11s 92ms/step - loss: 2.7531 - accuracy: 0.1952 - val_loss: 1.5996 - val_accuracy: 0.2439
      Epoch 2/10
      92/92 [==============================] - 7s 72ms/step - loss: 1.6110 - accuracy: 0.2466 - val_loss: 1.6066 - val_accuracy: 0.2398
      Epoch 3/10
      92/92 [==============================] - 7s 82ms/step - loss: 1.6054 - accuracy: 0.2459 - val_loss: 1.6048 - val_accuracy: 0.2398
      Epoch 4/10
      92/92 [==============================] - 7s 82ms/step - loss: 1.6036 - accuracy: 0.2459 - val_loss: 1.6036 - val_accuracy: 0.2398
      Epoch 5/10
      92/92 [==============================] - 8s 88ms/step - loss: 1.6024 - accuracy: 0.2459 - val_loss: 1.6028 - val_accuracy: 0.2398
      Epoch 6/10
      92/92 [==============================] - 8s 91ms/step - loss: 1.6015 - accuracy: 0.2459 - val_loss: 1.6022 - val_accuracy: 0.2398
      Epoch 7/10
      92/92 [==============================] - 7s 80ms/step - loss: 1.6009 - accuracy: 0.2459 - val_loss: 1.6021 - val_accuracy: 0.2398
      Epoch 8/10
      92/92 [==============================] - 7s 79ms/step - loss: 1.6005 - accuracy: 0.2459 - val_loss: 1.6019 - val_accuracy: 0.2398
      Epoch 9/10
      92/92 [==============================] - 7s 79ms/step - loss: 1.6003 - accuracy: 0.2459 - val_loss: 1.6018 - val_accuracy: 0.2398
      Epoch 10/10
      92/92 [==============================] - 7s 79ms/step - loss: 1.6002 - accuracy: 0.2459 - val_loss: 1.6017 - val_accuracy: 0.2398

We then plotted how the model's loss and accuracy changed over training, for both the training and validation data.

.. admonition:: Custom functions

   We have turned the code we used last time for building this figure into a :doc:`custom function ` called ``plot_results()``. This allows us to reuse the code several times without needing to type it all out again.

.. tab:: Python

   .. code:: python

      def plot_results(history):
          acc = history.history['accuracy']
          val_acc = history.history['val_accuracy']

          loss = history.history['loss']
          val_loss = history.history['val_loss']

          epochs_range = range(epochs)

          plt.figure(figsize=(8, 8))
          plt.subplot(1, 2, 1)
          plt.plot(epochs_range, acc, label='Training Accuracy')
          plt.plot(epochs_range, val_acc, label='Validation Accuracy')
          plt.ylim(0, 1)
          plt.legend(loc='lower right')
          plt.title('Training and Validation Accuracy')

          plt.subplot(1, 2, 2)
          plt.plot(epochs_range, loss, label='Training Loss')
          plt.plot(epochs_range, val_loss, label='Validation Loss')
          plt.legend(loc='upper right')
          plt.title('Training and Validation Loss')
          plt.show()

      plot_results(history)

.. tab:: Output
   :new-set:

   .. image:: /_static/images/machine-learning/image-recognition/training-validation-1.png
      :align: center

By the end of the session, we will produce a model with a validation accuracy of approximately 75%, whose training history looks like this:

.. image:: /_static/images/machine-learning/image-recognition/good_training.png
   :align: center

Model complexity
----------------

Our original model had 32 nodes in the first Dense layer. We can increase the number of nodes to try to capture more complicated relationships in our image data. In this new model, we have 128 nodes in the first Dense layer.

.. tab:: Python

   .. code:: python

      model = Sequential([
          layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
          layers.Flatten(),
          layers.Dense(128, activation='relu'),
          layers.Dense(num_classes)
      ])

      model.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

      model.summary()

      epochs=10
      history = model.fit(
          train_ds,
          validation_data=val_ds,
          epochs=epochs
      )

      plot_results(history)

.. tab:: Output

   .. code:: none

      Model: "sequential_2"
      _________________________________________________________________
       Layer (type)                Output Shape              Param #
      =================================================================
       rescaling_2 (Rescaling)     (None, 180, 180, 3)       0

       flatten_2 (Flatten)         (None, 97200)             0

       dense_4 (Dense)             (None, 128)               12441728

       dense_5 (Dense)             (None, 5)                 645

      =================================================================
      Total params: 12,442,373
      Trainable params: 12,442,373
      Non-trainable params: 0
      _________________________________________________________________

      Epoch 1/10
      92/92 [==============================] - 14s 146ms/step - loss: 7.5241 - accuracy: 0.3038 - val_loss: 1.7235 - val_accuracy: 0.3597
      Epoch 2/10
      92/92 [==============================] - 12s 134ms/step - loss: 1.7192 - accuracy: 0.3822 - val_loss: 1.9573 - val_accuracy: 0.3529
      Epoch 3/10
      92/92 [==============================] - 13s 141ms/step - loss: 1.4352 - accuracy: 0.4523 - val_loss: 1.8062 - val_accuracy: 0.3910
      Epoch 4/10
      92/92 [==============================] - 13s 137ms/step - loss: 1.6018 - accuracy: 0.4452 - val_loss: 1.5387 - val_accuracy: 0.4169
      Epoch 5/10
      92/92 [==============================] - 12s 131ms/step - loss: 1.3087 - accuracy: 0.5133 - val_loss: 2.5156 - val_accuracy: 0.3406
      Epoch 6/10
      92/92 [==============================] - 13s 143ms/step - loss: 1.2992 - accuracy: 0.5221 - val_loss: 1.6486 - val_accuracy: 0.3924
      Epoch 7/10
      92/92 [==============================] - 14s 148ms/step - loss: 1.1644 - accuracy: 0.5640 - val_loss: 1.4594 - val_accuracy: 0.4292
      Epoch 8/10
      92/92 [==============================] - 13s 144ms/step - loss: 1.0563 - accuracy: 0.6008 - val_loss: 1.3974 - val_accuracy: 0.4673
      Epoch 9/10
      92/92 [==============================] - 13s 139ms/step - loss: 1.1096 - accuracy: 0.5838 - val_loss: 1.4080 - val_accuracy: 0.4537
      Epoch 10/10
      92/92 [==============================] - 13s 141ms/step - loss: 1.0183 - accuracy: 0.6233 - val_loss: 1.5409 - val_accuracy: 0.4060

   .. image:: /_static/images/machine-learning/image-recognition/training-validation-2.png
      :align: center

Convolutional layer
-------------------

Having more nodes helps, but our model's performance is still poor. Dense layers ignore the spatial structure of an image: what are the values of the nearby pixels? To take this structure into account, we need to add a new kind of layer called a **convolutional layer**.

Convolutional layers look for patterns of pixels across small portions of our image, such as vertical or horizontal lines. These filters are passed over the image in an overlapping fashion to pick out where each pattern occurs. Below is an example of a particular filter called an embossing filter.

.. image:: /_static/images/machine-learning/image-recognition/emboss_filter.png
   :align: center

.. image:: /_static/images/machine-learning/image-recognition/Emboss_example.jpg
   :align: center

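To make the idea of a filter more concrete, here is a minimal sketch (not part of the original lesson) that applies a hand-written 3×3 emboss-style kernel to one batch of our training images using ``tf.nn.conv2d()``. The kernel values here are fixed by hand purely for illustration; the filters inside a ``Conv2D`` layer are instead learned during training.

.. code:: python

   # A hand-written 3x3 emboss-style kernel, applied to grayscale versions of one
   # batch of training images. A Conv2D layer learns kernels like this automatically;
   # here we fix one by hand just to see what a single filter does.
   kernel = tf.constant([[-2., -1., 0.],
                         [-1.,  1., 1.],
                         [ 0.,  1., 2.]])
   kernel = tf.reshape(kernel, (3, 3, 1, 1))       # height, width, channels in, channels out

   for images, _ in train_ds.take(1):
       gray = tf.image.rgb_to_grayscale(images)     # (batch, 180, 180, 1)
       embossed = tf.nn.conv2d(gray, kernel, strides=1, padding="SAME")
       print(embossed.shape)                        # (batch, 180, 180, 1)
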
.. admonition:: More on our layers

   We use `layers.Conv2D() `__ to create a convolutional layer. We'll start by applying 16 filters to our 3 RGB channels. ``padding='same'`` just makes sure that the output of each filter is the same size as the input image. We'll use the same ``relu`` activation function as for the Dense layer.

   After the convolutional layer, we'll add a **max pooling layer** with `layers.MaxPooling2D() `__. Max pooling layers summarize regions of the image for each filter, recording how strongly the filter appeared in each region of the image.

   .. image:: /_static/images/machine-learning/image-recognition/MaxpoolSample2.png
      :align: center

We'll also keep our Dense layer, and feed the flattened output of the convolutional layer into it.

.. tab:: Python

   .. code:: python

      model = Sequential([
          layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
          layers.Conv2D(16, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Flatten(),
          layers.Dense(128, activation='relu'),
          layers.Dense(num_classes)
      ])

      model.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

      model.summary()

.. tab:: Output

   .. code:: none

      Model: "sequential_3"
      _________________________________________________________________
       Layer (type)                  Output Shape              Param #
      =================================================================
       rescaling_3 (Rescaling)       (None, 180, 180, 3)       0

       conv2d (Conv2D)               (None, 180, 180, 16)      448

       max_pooling2d (MaxPooling2D)  (None, 90, 90, 16)        0

       flatten_3 (Flatten)           (None, 129600)            0

       dense_6 (Dense)               (None, 128)               16588928

       dense_7 (Dense)               (None, 5)                 645

      =================================================================
      Total params: 16,590,021
      Trainable params: 16,590,021
      Non-trainable params: 0
      _________________________________________________________________

.. tab:: Python
   :new-set:

   .. code:: python

      epochs=10
      history = model.fit(
          train_ds,
          validation_data=val_ds,
          epochs=epochs
      )

.. tab:: Output

   .. code:: none

      Epoch 1/10
      92/92 [==============================] - 23s 248ms/step - loss: 2.2693 - accuracy: 0.4077 - val_loss: 1.1336 - val_accuracy: 0.5409
      Epoch 2/10
      92/92 [==============================] - 22s 241ms/step - loss: 0.9376 - accuracy: 0.6352 - val_loss: 1.1416 - val_accuracy: 0.5286
      Epoch 3/10
      92/92 [==============================] - 22s 239ms/step - loss: 0.6816 - accuracy: 0.7694 - val_loss: 1.0825 - val_accuracy: 0.5736
      Epoch 4/10
      92/92 [==============================] - 23s 247ms/step - loss: 0.4227 - accuracy: 0.8815 - val_loss: 1.0544 - val_accuracy: 0.5926
      Epoch 5/10
      92/92 [==============================] - 24s 258ms/step - loss: 0.2387 - accuracy: 0.9496 - val_loss: 1.1503 - val_accuracy: 0.5913
      Epoch 6/10
      92/92 [==============================] - 23s 250ms/step - loss: 0.1448 - accuracy: 0.9751 - val_loss: 1.1923 - val_accuracy: 0.5886
      Epoch 7/10
      92/92 [==============================] - 24s 264ms/step - loss: 0.0867 - accuracy: 0.9901 - val_loss: 1.3246 - val_accuracy: 0.5586
      Epoch 8/10
      92/92 [==============================] - 23s 246ms/step - loss: 0.0580 - accuracy: 0.9942 - val_loss: 1.2783 - val_accuracy: 0.5599
      Epoch 9/10
      92/92 [==============================] - 24s 260ms/step - loss: 0.0311 - accuracy: 0.9980 - val_loss: 1.3972 - val_accuracy: 0.5599
      Epoch 10/10
      92/92 [==============================] - 25s 275ms/step - loss: 0.0223 - accuracy: 0.9986 - val_loss: 1.4029 - val_accuracy: 0.5899

This model still performs less than optimally, but its validation accuracy is more consistent from epoch to epoch.

.. tab:: Python

   .. code:: python

      plot_results(history)

.. tab:: Output
   :new-set:

   .. image:: /_static/images/machine-learning/image-recognition/training-validation-3.png
      :align: center

Adding more layers
------------------

A single convolutional layer can pick out simple parts of an image, like lines or curves. However, if we feed the output of one convolutional layer into another, the model can start to recognize combinations of these pieces: lines can combine into ovals, and ovals can combine into the petals of a flower by the time we reach a third layer. By looking at combinations of simple parts of an image, the model becomes capable of identifying complex components. We also add a max pooling layer after every convolutional layer to keep summarizing as the model grows increasingly complex.

.. image:: /_static/images/machine-learning/image-recognition/neural_net_layers.jpeg
   :align: center

.. tab:: Python

   .. code:: python

      model = Sequential([
          layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
          layers.Conv2D(16, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Conv2D(32, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Conv2D(64, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Flatten(),
          layers.Dense(128, activation='relu'),
          layers.Dense(num_classes)
      ])

      model.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

      model.summary()

.. tab:: Output

   .. code:: none

      Model: "sequential_5"
      _________________________________________________________________
       Layer (type)                     Output Shape           Param #
      =================================================================
       rescaling_5 (Rescaling)          (None, 180, 180, 3)    0

       conv2d_4 (Conv2D)                (None, 180, 180, 16)   448

       max_pooling2d_4 (MaxPooling2D)   (None, 90, 90, 16)     0

       conv2d_5 (Conv2D)                (None, 90, 90, 32)     4640

       max_pooling2d_5 (MaxPooling2D)   (None, 45, 45, 32)     0

       conv2d_6 (Conv2D)                (None, 45, 45, 64)     18496

       max_pooling2d_6 (MaxPooling2D)   (None, 22, 22, 64)     0

       flatten_5 (Flatten)              (None, 30976)          0

       dense_10 (Dense)                 (None, 128)            3965056

       dense_11 (Dense)                 (None, 5)              645

      =================================================================
      Total params: 3,989,285
      Trainable params: 3,989,285
      Non-trainable params: 0
      _________________________________________________________________

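One detail worth noticing in the summary above is that each ``MaxPooling2D()`` layer halves the height and width of its input, rounding down when the size is odd (180 to 90 to 45 to 22). Here is a minimal sketch (not part of the original lesson) confirming that behaviour on a dummy tensor shaped like the input to the third pooling layer:

.. code:: python

   # The default MaxPooling2D uses a 2x2 window with a stride of 2,
   # so a 45x45 feature map shrinks to 22x22 (odd sizes round down).
   dummy = tf.zeros((1, 45, 45, 64))
   pooled = layers.MaxPooling2D()(dummy)
   print(pooled.shape)   # (1, 22, 22, 64)
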
We then fit this model in the same way as last time.

.. tab:: Python

   .. code:: python

      epochs=10
      history = model.fit(
          train_ds,
          validation_data=val_ds,
          epochs=epochs
      )

.. tab:: Output

   .. code:: none

      Epoch 1/10
      92/92 [==============================] - 31s 333ms/step - loss: 1.3661 - accuracy: 0.4544 - val_loss: 1.0383 - val_accuracy: 0.5722
      Epoch 2/10
      92/92 [==============================] - 35s 379ms/step - loss: 0.9859 - accuracy: 0.6148 - val_loss: 0.9836 - val_accuracy: 0.6253
      Epoch 3/10
      92/92 [==============================] - 31s 332ms/step - loss: 0.8168 - accuracy: 0.6826 - val_loss: 0.9336 - val_accuracy: 0.6512
      Epoch 4/10
      92/92 [==============================] - 32s 345ms/step - loss: 0.6347 - accuracy: 0.7582 - val_loss: 0.9632 - val_accuracy: 0.6485
      Epoch 5/10
      92/92 [==============================] - 34s 368ms/step - loss: 0.4451 - accuracy: 0.8362 - val_loss: 1.0757 - val_accuracy: 0.6322
      Epoch 6/10
      92/92 [==============================] - 30s 331ms/step - loss: 0.3166 - accuracy: 0.8931 - val_loss: 1.1708 - val_accuracy: 0.6362
      Epoch 7/10
      92/92 [==============================] - 30s 329ms/step - loss: 0.1727 - accuracy: 0.9414 - val_loss: 1.4464 - val_accuracy: 0.6076
      Epoch 8/10
      92/92 [==============================] - 31s 335ms/step - loss: 0.1047 - accuracy: 0.9704 - val_loss: 1.6272 - val_accuracy: 0.6185
      Epoch 9/10
      92/92 [==============================] - 33s 358ms/step - loss: 0.0638 - accuracy: 0.9847 - val_loss: 1.6699 - val_accuracy: 0.6512
      Epoch 10/10
      92/92 [==============================] - 28s 309ms/step - loss: 0.0253 - accuracy: 0.9959 - val_loss: 1.8831 - val_accuracy: 0.6662

.. tab:: Python
   :new-set:

   .. code:: python

      plot_results(history)

.. tab:: Output
   :new-set:

   .. image:: /_static/images/machine-learning/image-recognition/training-validation-4.png
      :align: center

Our model is starting to perform a lot better, with validation accuracy above 60%. However, the training accuracy is nearly perfect, and the validation loss is getting worse over time. This pattern is consistent with **overfitting**, which occurs when the model becomes excellent at predicting the training data, but the predictions don't generalize as well to new data. You can think of overfitting as memorizing an answer sheet: when the model is presented with new information that is not on the answer sheet, it does poorly. Looking at our data, how might our model be memorizing the answer sheet?

.. image:: /_static/images/machine-learning/image-recognition/some_flowers.png
   :align: center

Dropout layer
-------------

Overfitting can be caused by a model that is too complicated. New data are unlikely to contain every feature or shape present in our training data, so a model that relies on too many of these specific details will not generalize well. To alleviate this problem, we can use a **dropout layer**. This randomly removes a proportion of the inputs to the next layer, removing some complexity from the model.

.. image:: /_static/images/machine-learning/image-recognition/dropout_layer.jpg
   :align: center

.. admonition:: Dropout layers

   We can use `layers.Dropout() `__ to create a dropout layer. The proportion specified inside the function determines what fraction of the inputs from the previous layer is randomly deactivated during training. This forces the model not to rely too heavily on any specific values, which should reduce some of the overfitting we are seeing.

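To see what a dropout layer actually does, here is a minimal sketch (not part of the original lesson) applying ``layers.Dropout(0.2)`` to a small tensor of ones. During training roughly 20% of the values are zeroed out (and the survivors are scaled up to compensate); outside of training the values pass through unchanged.

.. code:: python

   drop = layers.Dropout(0.2)
   x = tf.ones((1, 10))

   print(drop(x, training=True))    # about 2 of the 10 values become 0, the rest become 1.25
   print(drop(x, training=False))   # unchanged: dropout is inactive outside of training
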
.. tab:: Python

   .. code:: python

      model = Sequential([
          layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
          layers.Conv2D(16, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Conv2D(32, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Conv2D(64, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Dropout(0.2),
          layers.Flatten(),
          layers.Dense(128, activation='relu'),
          layers.Dense(num_classes, name="outputs")
      ])

      model.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

      model.summary()

.. tab:: Output

   .. code:: none

      Model: "sequential_6"
      _________________________________________________________________
       Layer (type)                     Output Shape           Param #
      =================================================================
       rescaling_6 (Rescaling)          (None, 180, 180, 3)    0

       conv2d_7 (Conv2D)                (None, 180, 180, 16)   448

       max_pooling2d_7 (MaxPooling2D)   (None, 90, 90, 16)     0

       conv2d_8 (Conv2D)                (None, 90, 90, 32)     4640

       max_pooling2d_8 (MaxPooling2D)   (None, 45, 45, 32)     0

       conv2d_9 (Conv2D)                (None, 45, 45, 64)     18496

       max_pooling2d_9 (MaxPooling2D)   (None, 22, 22, 64)     0

       dropout (Dropout)                (None, 22, 22, 64)     0

       flatten_6 (Flatten)              (None, 30976)          0

       dense_12 (Dense)                 (None, 128)            3965056

       outputs (Dense)                  (None, 5)              645

      =================================================================
      Total params: 3,989,285
      Trainable params: 3,989,285
      Non-trainable params: 0
      _________________________________________________________________

.. tab:: Python
   :new-set:

   .. code:: python

      epochs = 10
      history = model.fit(
          train_ds,
          validation_data=val_ds,
          epochs=epochs
      )

.. tab:: Output

   .. code:: none

      Epoch 1/10
      92/92 [==============================] - 35s 369ms/step - loss: 1.2947 - accuracy: 0.4377 - val_loss: 1.1064 - val_accuracy: 0.5654
      Epoch 2/10
      92/92 [==============================] - 28s 306ms/step - loss: 0.9720 - accuracy: 0.6277 - val_loss: 1.0085 - val_accuracy: 0.5913
      Epoch 3/10
      92/92 [==============================] - 28s 307ms/step - loss: 0.7915 - accuracy: 0.6989 - val_loss: 0.8812 - val_accuracy: 0.6567
      Epoch 4/10
      92/92 [==============================] - 34s 374ms/step - loss: 0.5771 - accuracy: 0.7916 - val_loss: 0.8650 - val_accuracy: 0.6771
      Epoch 5/10
      92/92 [==============================] - 34s 363ms/step - loss: 0.3629 - accuracy: 0.8747 - val_loss: 1.0409 - val_accuracy: 0.6499
      Epoch 6/10
      92/92 [==============================] - 37s 406ms/step - loss: 0.2467 - accuracy: 0.9176 - val_loss: 1.2296 - val_accuracy: 0.6376
      Epoch 7/10
      92/92 [==============================] - 41s 441ms/step - loss: 0.1355 - accuracy: 0.9564 - val_loss: 1.3323 - val_accuracy: 0.6594
      Epoch 8/10
      92/92 [==============================] - 43s 467ms/step - loss: 0.1058 - accuracy: 0.9690 - val_loss: 1.3769 - val_accuracy: 0.6608
      Epoch 9/10
      92/92 [==============================] - 41s 444ms/step - loss: 0.0664 - accuracy: 0.9799 - val_loss: 1.4244 - val_accuracy: 0.6458
      Epoch 10/10
      92/92 [==============================] - 38s 419ms/step - loss: 0.0539 - accuracy: 0.9826 - val_loss: 1.7785 - val_accuracy: 0.6444

We can see that model performance didn't improve much over the previous version: validation accuracy is still plateauing and validation loss still climbs, so the dropout layer on its own was not enough to fix the overfitting.

.. tab:: Python

   .. code:: python

      plot_results(history)

.. tab:: Output
   :new-set:

   .. image:: /_static/images/machine-learning/image-recognition/training-validation-5.png
      :align: center

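When comparing runs like these, it can be handy to reduce each ``history`` object to a single number instead of eyeballing the plots. A small aside, not part of the original lesson:

.. code:: python

   # Best validation accuracy reached during this run, and the epoch it occurred in.
   best_epoch = int(np.argmax(history.history['val_accuracy']))
   best_val_acc = history.history['val_accuracy'][best_epoch]
   print(f"Best validation accuracy: {best_val_acc:.3f} (epoch {best_epoch + 1})")
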
Data augmentation
-----------------

Our data set is somewhat limited in size, containing only a few thousand images. We could try to increase the sample size by grabbing more images of flowers from the internet, but that can be time consuming. Another potential solution is **data augmentation**: taking the data we already have and applying random transformations to it to generate more training data. For us, this means taking the images we already have and randomly flipping, rotating, and zooming them. These modified images then act as additional training data.

Note: the current version of TensorFlow prints out a lot of warnings when you use data augmentation. This is a bug, and the warnings can be safely ignored.

.. tab:: Python

   .. code:: python

      data_augmentation = keras.Sequential(
          [
              layers.RandomFlip("horizontal", input_shape=(img_height, img_width, 3)),
              layers.RandomRotation(0.1),
              layers.RandomZoom(0.1),
          ]
      )

When we visualize the data, we can see that we have new versions of old images with slight zooms, flips, and rotations.

.. tab:: Python

   .. code:: python

      plt.figure(figsize=(10, 10))
      for images, _ in train_ds.take(1):
          for i in range(9):
              augmented_images = data_augmentation(images)
              ax = plt.subplot(3, 3, i + 1)
              plt.imshow(augmented_images[1].numpy().astype("uint8"))
              plt.axis("off")

.. tab:: Output
   :new-set:

   .. image:: /_static/images/machine-learning/image-recognition/data_augmentation.png
      :align: center

We can insert the data augmentation into the model as its first layer.

.. tab:: Python

   .. code:: python

      model = Sequential([
          data_augmentation,
          layers.Rescaling(1./255),
          layers.Conv2D(16, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Conv2D(32, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Conv2D(64, 3, padding='same', activation='relu'),
          layers.MaxPooling2D(),
          layers.Dropout(0.2),
          layers.Flatten(),
          layers.Dense(128, activation='relu'),
          layers.Dense(num_classes, name="outputs")
      ])

      model.compile(optimizer='adam',
                    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                    metrics=['accuracy'])

      model.summary()

.. tab:: Output

   .. code:: none

      Model: "sequential_8"
      _________________________________________________________________
       Layer (type)                     Output Shape           Param #
      =================================================================
       sequential_7 (Sequential)        (None, 180, 180, 3)    0

       rescaling_7 (Rescaling)          (None, 180, 180, 3)    0

       conv2d_10 (Conv2D)               (None, 180, 180, 16)   448

       max_pooling2d_10 (MaxPooling2D)  (None, 90, 90, 16)     0

       conv2d_11 (Conv2D)               (None, 90, 90, 32)     4640

       max_pooling2d_11 (MaxPooling2D)  (None, 45, 45, 32)     0

       conv2d_12 (Conv2D)               (None, 45, 45, 64)     18496

       max_pooling2d_12 (MaxPooling2D)  (None, 22, 22, 64)     0

       dropout_1 (Dropout)              (None, 22, 22, 64)     0

       flatten_7 (Flatten)              (None, 30976)          0

       dense_13 (Dense)                 (None, 128)            3965056

       outputs (Dense)                  (None, 5)              645

      =================================================================
      Total params: 3,989,285
      Trainable params: 3,989,285
      Non-trainable params: 0
      _________________________________________________________________

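One reassuring detail: the random flip, rotation, and zoom layers are only active while the model is training. During ``model.evaluate()`` or ``model.predict()`` they pass images through unchanged, so building the augmentation into the model does not distort later predictions. A minimal sketch (not part of the original lesson) that checks this:

.. code:: python

   for images, _ in train_ds.take(1):
       # Behaviour outside of training: images should pass through untouched.
       print(bool(tf.reduce_all(data_augmentation(images, training=False) == images)))   # True

       # Behaviour during training: random flips, rotations, and zooms are applied.
       print(bool(tf.reduce_all(data_augmentation(images, training=True) == images)))    # almost certainly False
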
We increase the number of epochs here to give the model more time to reach its minimum validation loss.

.. tab:: Python

   .. code:: python

      epochs=20
      history = model.fit(
          train_ds,
          validation_data=val_ds,
          epochs=epochs
      )

.. tab:: Output

   .. code:: none

      Epoch 1/20
      92/92 [==============================] - 41s 428ms/step - loss: 1.2081 - accuracy: 0.4877 - val_loss: 1.1076 - val_accuracy: 0.5409
      Epoch 2/20
      92/92 [==============================] - 38s 418ms/step - loss: 1.0066 - accuracy: 0.6076 - val_loss: 0.9422 - val_accuracy: 0.6376
      Epoch 3/20
      92/92 [==============================] - 37s 403ms/step - loss: 0.8936 - accuracy: 0.6557 - val_loss: 0.9431 - val_accuracy: 0.6308
      Epoch 4/20
      92/92 [==============================] - 38s 418ms/step - loss: 0.8185 - accuracy: 0.6836 - val_loss: 0.8330 - val_accuracy: 0.6635
      Epoch 5/20
      92/92 [==============================] - 38s 417ms/step - loss: 0.7811 - accuracy: 0.7010 - val_loss: 0.8102 - val_accuracy: 0.6798
      Epoch 6/20
      92/92 [==============================] - 42s 454ms/step - loss: 0.7439 - accuracy: 0.7221 - val_loss: 0.8108 - val_accuracy: 0.7003
      Epoch 7/20
      92/92 [==============================] - 40s 436ms/step - loss: 0.6975 - accuracy: 0.7381 - val_loss: 0.8033 - val_accuracy: 0.6866
      Epoch 8/20
      92/92 [==============================] - 37s 402ms/step - loss: 0.6671 - accuracy: 0.7466 - val_loss: 0.8010 - val_accuracy: 0.6866
      Epoch 9/20
      92/92 [==============================] - 37s 405ms/step - loss: 0.6394 - accuracy: 0.7589 - val_loss: 0.7509 - val_accuracy: 0.6962
      Epoch 10/20
      92/92 [==============================] - 39s 424ms/step - loss: 0.6038 - accuracy: 0.7602 - val_loss: 0.6841 - val_accuracy: 0.7234
      Epoch 11/20
      92/92 [==============================] - 41s 441ms/step - loss: 0.5788 - accuracy: 0.7813 - val_loss: 0.7177 - val_accuracy: 0.7343
      Epoch 12/20
      92/92 [==============================] - 38s 415ms/step - loss: 0.5617 - accuracy: 0.7888 - val_loss: 0.7086 - val_accuracy: 0.7275
      Epoch 13/20
      92/92 [==============================] - 38s 411ms/step - loss: 0.5211 - accuracy: 0.8045 - val_loss: 0.7097 - val_accuracy: 0.7234
      Epoch 14/20
      92/92 [==============================] - 34s 370ms/step - loss: 0.5050 - accuracy: 0.8140 - val_loss: 0.6678 - val_accuracy: 0.7493
      Epoch 15/20
      92/92 [==============================] - 34s 375ms/step - loss: 0.4933 - accuracy: 0.8205 - val_loss: 0.7441 - val_accuracy: 0.7166
      Epoch 16/20
      92/92 [==============================] - 37s 402ms/step - loss: 0.4630 - accuracy: 0.8324 - val_loss: 0.7033 - val_accuracy: 0.7302
      Epoch 17/20
      92/92 [==============================] - 42s 463ms/step - loss: 0.4207 - accuracy: 0.8386 - val_loss: 0.6541 - val_accuracy: 0.7698
      Epoch 18/20
      92/92 [==============================] - 37s 405ms/step - loss: 0.4236 - accuracy: 0.8443 - val_loss: 0.7598 - val_accuracy: 0.7357
      Epoch 19/20
      92/92 [==============================] - 57s 626ms/step - loss: 0.3940 - accuracy: 0.8552 - val_loss: 0.7461 - val_accuracy: 0.7411
      Epoch 20/20
      92/92 [==============================] - 76s 827ms/step - loss: 0.3551 - accuracy: 0.8723 - val_loss: 0.7043 - val_accuracy: 0.7629

Our validation accuracy is now around 75%, much better than where we started. There is still room to improve, though, so feel free to keep experimenting with modifications to the model.

.. tab:: Python

   .. code:: python

      plot_results(history)

.. tab:: Output
   :new-set:

   .. image:: /_static/images/machine-learning/image-recognition/training-validation-6.png

Using our model to make predictions
-----------------------------------

We can now try to predict the flower type of images that were in neither the training set nor the validation set.

We define a function below that downloads an image of a flower from the internet (when provided a URL) and uses the model we've built to predict the flower type.

.. collapse:: More details

   .. container::

      We define a custom function here. Please look at :doc:`this guide about custom functions `.

      We use `tf.keras.utils.get_file() `__ to download an image file from the internet that we specify when we run the function. If the file has already been downloaded, it does not get re-downloaded. `tf.keras.utils.load_img() `__ loads the image in PIL format.

      .. tab:: Python

         .. code:: python

            flower_path = tf.keras.utils.get_file(origin=flower_url)
            img = tf.keras.utils.load_img(
                flower_path, target_size=(img_height, img_width)
            )

      Next, we use `tf.keras.utils.img_to_array() `__ to convert our image data into an array. `tf.expand_dims() `__ adds another dimension to the array, because the model expects its input to arrive in batches. This essentially puts the image into a batch of 1 so that it can be fed through our pipeline.

      .. tab:: Python

         .. code:: python

            img_array = tf.keras.utils.img_to_array(img)
            img_array = tf.expand_dims(img_array, 0) # Create a batch

      We can feed our new image into ``model.predict()`` to get a score for each of the five flower classes. The class with the highest score is the prediction for that image. `tf.nn.softmax() `__ converts the raw scores from ``model.predict()`` so that they all add up to 1.0. You can read more about the softmax function `here `__.

      .. tab:: Python

         .. code:: python

            predictions = model.predict(img_array)
            score = tf.nn.softmax(predictions[0])

      We use `.format() `__ to insert values into our printed string. We can use `np.argmax() `__ to get the index of the highest value in ``score``. We treat the highest relative score as the confidence in our prediction.

      .. tab:: Python

         .. code:: python

            print(
                "This image most likely belongs to {} with a {:.2f} percent confidence."
                .format(class_names[np.argmax(score)], 100 * np.max(score))
            )

.. tab:: Python

   .. code:: python

      def predict_flower(flower_url):
          flower_path = tf.keras.utils.get_file(origin=flower_url)

          img = tf.keras.utils.load_img(
              flower_path, target_size=(img_height, img_width)
          )

          img_array = tf.keras.utils.img_to_array(img)
          img_array = tf.expand_dims(img_array, 0) # Create a batch

          predictions = model.predict(img_array)
          score = tf.nn.softmax(predictions[0])

          print(
              "This image most likely belongs to {} with a {:.2f} percent confidence."
              .format(class_names[np.argmax(score)], 100 * np.max(score))
          )

          return None

Our model makes some successful predictions with high confidence, but it also makes some incorrect predictions with reasonably high confidence.

.. tab:: Python

   .. code:: python

      sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
      tulip_url = "https://helloartsy.com/wp-content/uploads/kids/flowers/how_to_draw_a_tulip/how-to-draw-a-tulip_step-6.jpg"
      dandelion_url1 = "https://www.southernliving.com/thmb/lJ33cHiLHjUloZlfDX1UppUJ_DA=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/GettyImages-1176988236-2000-c8f09103b3f5459ebd1528b7ea264c4e.jpg"
      dandelion_url2 = "https://www.minnesotawildflowers.info/udata/r9ndp23q/pd/taraxacum-officinale-3.jpg"
      rose_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/c/c4/Natural_Rose_1.jpg/3024px-Natural_Rose_1.jpg"
      daisy_url = "https://upload.wikimedia.org/wikipedia/commons/5/53/Belis_peremnis_-_panoramio.jpg"

      print('SUNFLOWER')
      predict_flower(sunflower_url)

      print('TULIP')
      predict_flower(tulip_url)

      print('DANDELION (yellow)')
      predict_flower(dandelion_url1)

      print('DANDELION (white)')
      predict_flower(dandelion_url2)

      print('ROSE')
      predict_flower(rose_url)

      print('DAISY')
      predict_flower(daisy_url)

.. tab:: Output

   .. code:: none

      SUNFLOWER
      1/1 [==============================] - 0s 62ms/step
      This image most likely belongs to sunflowers with a 99.18 percent confidence.
      TULIP
      1/1 [==============================] - 0s 143ms/step
      This image most likely belongs to tulips with a 88.10 percent confidence.
      DANDELION (yellow)
      1/1 [==============================] - 0s 61ms/step
      This image most likely belongs to sunflowers with a 72.07 percent confidence.
      DANDELION (white)
      1/1 [==============================] - 0s 148ms/step
      This image most likely belongs to dandelion with a 98.70 percent confidence.
      ROSE
      1/1 [==============================] - 0s 29ms/step
      This image most likely belongs to roses with a 76.97 percent confidence.
      DAISY
      1/1 [==============================] - 0s 58ms/step
      This image most likely belongs to daisy with a 100.00 percent confidence.