CNN Data Preparation

Summary

  • Images into batches
  • Data normalization

Content

Images into batches

Why make image batches

  • There might not be enough memory (RAM or GPU memory) to load the entire image dataset at once
  • Training on every image in a single step tends to make learning less effective; smaller mini-batches give the model more frequent gradient updates, which usually helps it learn (see the sketch below)
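
The general batching idea can be sketched with tf.data before any real image files are involved. The fake_images and fake_labels tensors below are made-up placeholders (not part of the pizza_steak data used in the next section); only one batch of 32 needs to be held in memory at a time.

import tensorflow as tf

# Hypothetical stand-ins for a real image dataset: 100 random 64x64 RGB
# "images" and 100 binary labels
fake_images = tf.random.uniform((100, 64, 64, 3))
fake_labels = tf.random.uniform((100, 1), maxval=2, dtype=tf.int32)

# Shuffle the samples and group them into batches of 32
dataset = tf.data.Dataset.from_tensor_slices((fake_images, fake_labels))
dataset = dataset.shuffle(100).batch(32).prefetch(tf.data.AUTOTUNE)

# Pull one batch and check its shape
for batch_images, batch_labels in dataset.take(1):
    print(batch_images.shape)  # (32, 64, 64, 3)
    print(batch_labels.shape)  # (32, 1)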

How to make batches

import tensorflow as tf
from tensorflow.keras.utils import image_dataset_from_directory
import pathlib

tf.random.set_seed(42)

train_dir = pathlib.Path("pizza_steak/train")

# Load the images into batches of 32, resizing each one to 224x224 and
# using binary labels (one class per subdirectory)
train_data = image_dataset_from_directory(
    directory=train_dir,
    batch_size=32,
    image_size=(224, 224),
    label_mode="binary",
    seed=42
)

# Grab a single batch and inspect its shape
images, labels = next(iter(train_data.take(1)))
print(images.shape)  # (32, 224, 224, 3)
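
Because image_dataset_from_directory infers one class per subfolder, the loaded dataset can be inspected directly. This assumes pizza_steak/train contains one subfolder per class (e.g. pizza/ and steak/).

# Inspect the inferred class names and the number of batches
print(train_data.class_names)  # e.g. ['pizza', 'steak']
print(len(train_data))         # number of 32-image batches in the dataset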

Data normalization

Why data normalization

  • When input pixel values are scaled from the raw 0-255 range to 0-1, neural networks tend to train faster and perform better

How to normalize data

pixels = tf.constant([[ 2,  34, 255],
                      [35, 231, 123],
                      [52, 231, 111]])

# The Rescaling layer multiplies every element by the given scale factor
# (1/255 maps 0-255 pixel values into the 0-1 range)
layer = tf.keras.layers.Rescaling(scale=1/255)
output = layer(pixels)

print(pixels)
print(output)

"""
tf.Tensor(
[[ 2 34 255]
[ 35 231 123]
[ 52 231 111]], shape=(3, 3), dtype=int32)
tf.Tensor(
[[0.00784314 0.13333334 1. ]
[0.13725491 0.9058824 0.48235297]
[0.20392159 0.9058824 0.43529415]], shape=(3, 3), dtype=float32)
"""