import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import datasets
from tensorflow.keras.layers import Input, Conv2D, Dense, Flatten, Dropout, GlobalMaxPooling2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.models import Model
(X_train, y_train) , (X_test, y_test) = datasets.cifar10.load_data()
X_train.shape
(50000, 32, 32, 3)
y_train.shape
(50000, 1)
y_train = y_train.reshape(-1,)
X_test.shape
(10000, 32, 32, 3)
y_test.shape
(10000, 1)
The CIFAR-10 dataset is a widely used image classification benchmark. It consists of 60,000 32x32 color images in 10 mutually exclusive classes, with 6,000 images per class. Like CIFAR-100, it is a labeled subset of the 80 Million Tiny Images collection; it is not a subset of CIFAR-100. It is often used for training and evaluating machine learning and deep learning models for image classification tasks.
The CIFAR-10 dataset is divided into the following 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
The dataset is typically split into two sets: 50,000 training images and 10,000 test images, which matches the array shapes shown above.
The CIFAR-10 dataset is a standard benchmark for image classification and is used in a wide range of machine learning and deep learning applications.
classes = [
"Airplane",
"Automobile",
"Bird",
"Cat",
"Deer",
"Dog",
"Frog",
"Horse",
"Ship",
"Truck"
]
def plot_sample(index, X=X_train, y=y_train):
    plt.figure(figsize=(15, 2))
    plt.imshow(X[index])
    plt.xlabel(classes[y[index]])
In this code, I defined a function that lets me visualize a sample from a dataset along with its corresponding class label. Let me explain why I created this function:
Function Purpose: The purpose of this function is to plot a single sample from a dataset and label it with its corresponding class name. It's particularly useful when working with image classification tasks, where you want to inspect individual samples.
Parameters:
index: This parameter specifies the index of the sample I want to visualize.
X=X_train: The function takes an optional parameter X, the array of image samples. It defaults to X_train, the training dataset.
y=y_train: Similarly, the function takes an optional parameter y, the labels corresponding to the samples. It defaults to y_train, which contains the training labels.
Plotting the Sample:
plt.figure(figsize=(15, 2)): This line sets the figure size for the plot, making it 15 inches wide and 2 inches tall.
plt.imshow(X[index]): Here, I use plt.imshow() to display the image sample at the specified index (X[index]).
plt.xlabel(classes[y[index]]): This line sets the label for the x-axis of the plot. It looks up the integer label y[index] in the classes list and labels the image with the matching class name. This relies on y being one-dimensional, which is why y_train was reshaped earlier.
The reason for creating this function is to simplify the process of visualizing individual samples from a dataset. It's especially helpful during the development and debugging phases of machine learning and deep learning projects, when you want to verify that the data is being loaded and processed correctly. It also provides a quick way to inspect the training data and check that the images match their class labels.
import random
# Choose 10 random indices
random_indices = random.sample(range(len(X_train)), 10)
# Plot the random samples
for i in random_indices:
    plot_sample(i)
# Scale the pixel values from 0-255 to the range 0-1
X_train, X_test = X_train / 255.0, X_test / 255.0
# Flatten the label values
y_train, y_test = y_train.flatten(), y_test.flatten()
I wrote this code to prepare the image data for training a neural network. Here's what it does and why I included it:
Pixel Value Normalization:
X_train, X_test = X_train / 255.0, X_test / 255.0
I divided the pixel values of the training (X_train) and testing (X_test) datasets by 255.0. The raw pixels are 8-bit integers in the range 0 to 255, so this rescales them to floating-point values between 0 and 1.
Label Flattening:
y_train, y_test = y_train.flatten(), y_test.flatten()
I flattened y_train and y_test. The labels were loaded as two-dimensional arrays of shape (N, 1); flattening turns them into one-dimensional arrays of shape (N,). For y_train this repeats the earlier reshape(-1,) call and changes nothing, but y_test still had shape (10000, 1) at this point.
I performed these operations to ensure that both the input data and the labels are in a format the model can use directly. Scaling the inputs to a small, consistent range generally helps training stability and convergence, while flat integer labels are the format expected by the sparse categorical cross-entropy loss and the scikit-learn metrics used later. Both are standard preprocessing steps.
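The two preprocessing steps can be illustrated without NumPy. The following is a minimal sketch in plain Python; the tiny image, the label list, and the helper names scale_pixels and flatten_labels are hypothetical and only mirror what the array operations above do.

```python
# Hypothetical 2x2 RGB "image" with 8-bit pixel values (0-255).
image = [
    [[0, 128, 255], [64, 64, 64]],
    [[255, 255, 255], [10, 20, 30]],
]
# Labels as loaded: one single-element row per sample, i.e. shape (N, 1).
labels_2d = [[6], [9], [9], [4]]


def scale_pixels(img):
    """Divide every channel value by 255.0 so it falls in [0.0, 1.0]."""
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in img]


def flatten_labels(labels):
    """Turn [[6], [9], ...] into [6, 9, ...]."""
    return [row[0] for row in labels]


scaled = scale_pixels(image)
flat_labels = flatten_labels(labels_2d)

print(scaled[0][0])  # first pixel, now with every channel in [0, 1]
print(flat_labels)   # [6, 9, 9, 4]
```

In the notebook itself the same effect comes from broadcasting (X_train / 255.0) and from flatten(), which are much faster on full arrays.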
# Calculate the number of classes for the output layer
K = len(set(y_train))
print("The number of classes:", K)
The number of classes: 10
I wrote this code to determine and confirm the number of classes in the dataset. Here's what it does and why it's important:
Counting the Classes:
K = len(set(y_train))
I computed K by finding the length of a set created from the y_train array. In other words, I counted the unique class labels present in the training data.
Printing the Number of Classes:
print("The number of classes:", K)
After computing K, I printed it out with a description for clarity.
Why I Did It:
K determines the number of units in the output layer, so it has to match the number of classes in the data.
In summary, this code helps me understand the dataset's classification structure and ensures that I configure the neural network correctly for the specific number of classes in the problem.
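As a small illustration of the counting step, here is a sketch in plain Python. The label list is hypothetical, and the extra contiguity check is my own addition: it reflects the assumption that integer labels used with a K-unit output layer should be exactly 0 to K-1.

```python
# Hypothetical flat label list; CIFAR-10 labels are integers 0-9.
labels = [3, 0, 9, 3, 1, 2, 4, 5, 6, 7, 8, 0]

unique_labels = sorted(set(labels))
num_classes = len(unique_labels)

# Integer labels index the output units directly, so they should be
# exactly 0..K-1 with no gaps.
labels_are_contiguous = unique_labels == list(range(num_classes))

print(num_classes)            # 10
print(labels_are_contiguous)  # True
```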
i = Input(shape = X_train[0].shape)
x = Conv2D(32, (3,3), activation = 'relu', padding='same')(i)
x = BatchNormalization()(x)
x = Conv2D(32, (3,3), activation = 'relu', padding='same')(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2,2))(x)
x = Conv2D(64, (3,3), activation = 'relu', padding='same')(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3,3), activation = 'relu', padding='same')(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2,2))(x)
x = Conv2D(128, (3,3), activation = 'relu', padding='same')(x)
x = BatchNormalization()(x)
x = Conv2D(128, (3,3), activation = 'relu', padding='same')(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2,2))(x)
x = Flatten()(x)
x = Dropout(0.2)(x)
I wrote this code to define the architecture of the convolutional neural network (CNN) model. Let me explain what each part does:
Input Layer
i = Input(shape = X_train[0].shape)
X_train[0].shape defines the shape of a single input sample, here (32, 32, 3). It's essential to specify the input shape so the model knows what to expect.
Convolutional Blocks
The rest of the code stacks three convolutional blocks. Each block applies two Conv2D layers with 3x3 kernels, ReLU activation, and 'same' padding, follows each convolution with BatchNormalization, and ends with a 2x2 MaxPooling2D layer. The number of filters doubles from block to block: 32, then 64, then 128.
Flatten and Dropout
Flatten() converts the final feature maps into a single vector, and Dropout(0.2) randomly sets 20% of that vector's entries to zero during training.
Why I did it:
I created this architecture to build a deep CNN model for image classification. Convolutional layers are excellent at capturing hierarchical features in images, making them suitable for tasks like object recognition.
The addition of BatchNormalization and Dropout layers helps with training stability and regularization, respectively.
By designing the model in this way, I aimed to leverage the power of deep learning to extract intricate features from the input data, which is particularly useful for image classification tasks. This architecture is a common choice for such tasks and can be further customized depending on the specific problem and dataset.
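To see how the tensor shape changes through the three convolutional blocks, the arithmetic can be traced by hand. This is a minimal sketch in plain Python, not Keras code; the helper names conv_same and max_pool_2x2 are hypothetical, and it assumes stride-1 convolutions with 'same' padding and 2x2 pooling with stride 2, as in the model above.

```python
def conv_same(height, width, filters):
    """'same' padding with stride 1 keeps height and width; channels become `filters`."""
    return height, width, filters


def max_pool_2x2(height, width, channels):
    """2x2 pooling with stride 2 halves height and width (floor division)."""
    return height // 2, width // 2, channels


shape = (32, 32, 3)  # one CIFAR-10 image
for filters in (32, 64, 128):
    shape = conv_same(shape[0], shape[1], filters)  # first Conv2D (+ BatchNormalization)
    shape = conv_same(shape[0], shape[1], filters)  # second Conv2D (+ BatchNormalization)
    shape = max_pool_2x2(*shape)
    print(filters, shape)

flattened_size = shape[0] * shape[1] * shape[2]
print(flattened_size)  # 2048
```

The final value, 2048, is the length of the vector that Flatten() hands to the dense layers.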
x = Dense(1024, activation= 'relu')(x)
x = Dropout(0.2)(x)
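I included this code to define the fully connected hidden layer of my neural network. Let me explain what it does:
x = Dense(1024, activation= 'relu')(x)
This adds a fully connected layer with 1024 units and ReLU activation on top of the flattened features.
x = Dropout(0.2)(x)
This randomly sets 20% of that layer's outputs to zero during training.
Why I did it: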
The Dense layer is used to create a fully connected layer in the neural network. It allows the model to learn intricate patterns and relationships in the data.
I chose a ReLU activation function because it's effective in training deep neural networks and helps with the vanishing gradient problem.
The Dropout layer is essential for regularization. Overfitting is a common concern in deep learning, and Dropout helps mitigate it by reducing the network's reliance on any specific set of neurons, making the model more generalizable to unseen data.
This configuration of hidden layers is a common choice for deep neural networks, particularly in image classification tasks. It strikes a balance between complexity and regularization, helping to achieve good performance on a variety of datasets. However, it can be further adjusted and tuned depending on the specific problem and dataset characteristics.
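The behaviour of a dropout layer can be sketched in a few lines of plain Python. This is an illustration of inverted dropout (surviving values are scaled up by 1 / (1 - rate) during training), not the Keras implementation; the function name, its arguments, and the fixed seed are assumptions made for the example.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: zero each value with probability `rate` during training
    and scale the survivors by 1 / (1 - rate); do nothing at inference time."""
    if not training:
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10000

train_out = dropout(activations, rate=0.2, training=True, rng=rng)
eval_out = dropout(activations, rate=0.2, training=False, rng=rng)

dropped_fraction = train_out.count(0.0) / len(train_out)
mean_after = sum(train_out) / len(train_out)

print(round(dropped_fraction, 2))  # close to the dropout rate
print(round(mean_after, 2))        # close to the original mean of 1.0
print(eval_out == activations)     # True: dropout is inactive at inference
```

Scaling the survivors keeps the expected activation unchanged, which is why no adjustment is needed at prediction time.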
x = Dense(K, activation='softmax')(x)
I added this code to define the output layer of my neural network, and I'll explain why:
x = Dense(K, activation='softmax')(x)
I created a Dense layer with K units, where K represents the number of classes in my classification problem (10 for CIFAR-10). The number of units in the output layer must match the number of classes I want to predict.
I used the softmax activation function for the output layer. Softmax is commonly used in multi-class classification problems. It calculates the probability distribution over all classes, making it suitable for determining which class the input data belongs to.
Why I did it:
The output layer of a neural network is responsible for producing the final predictions. In a classification problem, the number of units in this layer corresponds to the number of possible classes, ensuring that the network provides a prediction for each class.
The softmax activation function is ideal for multi-class classification because it normalizes the output values, converting them into class probabilities. This means that the predicted values will sum to 1, and I can interpret the highest probability as the predicted class.
By configuring the output layer in this way, I'm preparing my neural network for a multi-class classification task, and it will generate class probabilities that I can use to make predictions and evaluate the model's performance.
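The softmax step described above can be written out directly. The following is a minimal sketch in plain Python; the logits are hypothetical scores, and subtracting the maximum before exponentiating is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for a 10-class problem such as CIFAR-10.
logits = [1.2, -0.3, 0.5, 3.1, 0.0, 2.4, -1.0, 0.7, 0.1, -0.5]
probs = softmax(logits)

# The predicted class is the index of the largest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)

print(round(sum(probs), 6))  # 1.0
print(predicted_class)       # 3
```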
model = Model(i, x)
model.summary()
Model: "model"
__________________________________________________________________________
 Layer (type)                                Output Shape          Param #
==========================================================================
 input_1 (InputLayer)                        [(None, 32, 32, 3)]   0
 conv2d (Conv2D)                             (None, 32, 32, 32)    896
 batch_normalization (BatchNormalization)    (None, 32, 32, 32)    128
 conv2d_1 (Conv2D)                           (None, 32, 32, 32)    9248
 batch_normalization_1 (BatchNormalization)  (None, 32, 32, 32)    128
 max_pooling2d (MaxPooling2D)                (None, 16, 16, 32)    0
 conv2d_2 (Conv2D)                           (None, 16, 16, 64)    18496
 batch_normalization_2 (BatchNormalization)  (None, 16, 16, 64)    256
 conv2d_3 (Conv2D)                           (None, 16, 16, 64)    36928
 batch_normalization_3 (BatchNormalization)  (None, 16, 16, 64)    256
 max_pooling2d_1 (MaxPooling2D)              (None, 8, 8, 64)      0
 conv2d_4 (Conv2D)                           (None, 8, 8, 128)     73856
 batch_normalization_4 (BatchNormalization)  (None, 8, 8, 128)     512
 conv2d_5 (Conv2D)                           (None, 8, 8, 128)     147584
 batch_normalization_5 (BatchNormalization)  (None, 8, 8, 128)     512
 max_pooling2d_2 (MaxPooling2D)              (None, 4, 4, 128)     0
 flatten (Flatten)                           (None, 2048)          0
 dropout (Dropout)                           (None, 2048)          0
 dense (Dense)                               (None, 1024)          2098176
 dropout_1 (Dropout)                         (None, 1024)          0
 dense_1 (Dense)                             (None, 10)            10250
==========================================================================
Total params: 2,397,226
Trainable params: 2,396,330
Non-trainable params: 896
__________________________________________________________________________
The model summary provides a detailed breakdown of the neural network architecture, its layers, and the number of parameters used in each layer.
Input Layer (input_1): This layer accepts images with a shape of (32, 32, 3), which corresponds to 32x32-pixel images with three color channels (red, green, and blue).
Convolutional Layers (conv2d, conv2d_1, conv2d_2, conv2d_3, conv2d_4, conv2d_5): These layers are responsible for learning features from the input images. They employ convolution operations to detect various patterns and features in the images. Because of 'same' padding, each convolution keeps the spatial size of its input, while the number of filters grows from 32 to 64 to 128. Each convolutional layer has its own set of parameters.
Batch Normalization Layers (batch_normalization, batch_normalization_1, batch_normalization_2, batch_normalization_3, batch_normalization_4, batch_normalization_5): Batch normalization normalizes the activations of the preceding layer over each mini-batch, which helps improve training efficiency and reduces the risk of vanishing/exploding gradients.
Max Pooling Layers (max_pooling2d, max_pooling2d_1, max_pooling2d_2): These layers perform max-pooling operations to down-sample the feature maps, halving the height and width each time (32 to 16 to 8 to 4). This helps in preserving important features while reducing computational complexity.
Flatten Layer (flatten): This layer converts the 4x4x128 feature maps into a 1D vector of 2,048 values, preparing the data for the fully connected layers.
Dropout Layers (dropout, dropout_1): These layers are used for regularization by randomly setting a fraction of input units to 0 during training, which helps prevent overfitting.
Dense Layers (dense, dense_1): These fully connected layers perform the final classification. The first has 1,024 units with ReLU activation. The last dense layer has 10 units, matching the number of classes in the CIFAR-10 dataset, and uses the softmax activation function to generate class probabilities.
Total Parameters: The model contains a total of 2,397,226 parameters, which include weights and biases. Most of them, 2,098,176, sit in the first dense layer.
Trainable Parameters: Out of the total parameters, 2,396,330 are trainable, which means they are updated by the optimizer during training.
Non-trainable Parameters: There are 896 non-trainable parameters. These are the moving mean and moving variance kept by the BatchNormalization layers; they are updated from batch statistics during training rather than by gradient descent.
This model summary provides a comprehensive overview of the network's architecture and parameter count, giving insights into the complexity of the machine learning model used for image classification.
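The parameter counts in the summary can be reproduced with simple arithmetic. The sketch below uses plain Python and hypothetical helper names; it assumes the standard counting rules (one bias per convolution filter and per dense unit, four values per channel for batch normalization) and the layer sizes listed in the summary.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights (kernel_h * kernel_w * in_channels per filter) plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def batch_norm_params(channels):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return (inputs + 1) * units


total = 0
non_trainable = 0
in_channels = 3
for filters in (32, 64, 128):
    for _ in range(2):  # two Conv2D + BatchNormalization pairs per block
        total += conv2d_params(3, 3, in_channels, filters)
        total += batch_norm_params(filters)
        non_trainable += 2 * filters  # moving mean and moving variance
        in_channels = filters

total += dense_params(4 * 4 * 128, 1024)  # Flatten output is 2048 values
total += dense_params(1024, 10)

print(total)                  # 2397226
print(total - non_trainable)  # 2396330
print(non_trainable)          # 896
```

The three printed values match the totals reported by model.summary().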
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
I added this code to compile my neural network, and I'll explain why I did that:
I specified the optimizer as 'adam'. Adam (short for Adaptive Moment Estimation) is a popular optimization algorithm for training neural networks. It combines the advantages of two other methods, RMSprop and Momentum, to efficiently update the model's weights during training. It's well-suited for a wide range of deep learning tasks, and I chose it for its effectiveness.
For the loss function, I selected 'sparse_categorical_crossentropy'. This loss function is appropriate for multi-class classification tasks like mine, where the target labels are integers (in contrast to one-hot encoded vectors). It calculates the cross-entropy loss between the predicted class probabilities and the actual class labels. Using 'sparse_categorical_crossentropy' simplifies the handling of target labels in my dataset.
I added 'accuracy' as a metric to monitor during training. It reports the fraction of correctly classified examples on the training data and, when validation data is supplied to model.fit, on the validation data as well.
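To make the Adam description more concrete, here is a sketch of the update rule for a single scalar weight in plain Python. It follows the commonly published form of Adam (running averages of the gradient and squared gradient, with bias correction); the function name, the toy objective, and the learning rate of 0.01 are assumptions for illustration and are not taken from the notebook, which uses the optimizer's defaults.

```python
import math


def adam_step(weight, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a single scalar weight.

    m and v are the running averages of the gradient and the squared gradient;
    t is the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    weight = weight - lr * m_hat / (math.sqrt(v_hat) + eps)
    return weight, m, v


# On the first step the update size is roughly lr, whatever the gradient's scale.
first_w, _, _ = adam_step(0.0, -6.0, 0.0, 0.0, t=1, lr=0.01)
print(round(first_w, 4))  # 0.01

# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for step in range(1, 5001):
    grad = 2 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, step, lr=0.01)
print(abs(w - 3.0) < 0.1)  # the weight ends up near the minimum at 3
```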
Why I did it:
Compiling the model is a crucial step before training because it sets the configuration for how the network will learn from the data.
I selected 'adam' as the optimizer because it is known for its fast convergence and good performance in a wide range of scenarios. It simplifies the process of finding the optimal model weights.
'sparse_categorical_crossentropy' is the appropriate loss function for my multi-class classification task because it handles integer target labels efficiently. It quantifies the dissimilarity between the predicted probabilities and the true labels.
By adding 'accuracy' as a metric, I can monitor the model's training progress and assess its performance based on classification accuracy. This helps me evaluate how well the model is learning to make accurate predictions.
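The loss and metric chosen above can be computed by hand for a small batch. This is a minimal sketch in plain Python with hypothetical labels and probabilities; it shows the definition (the mean negative log-probability assigned to the true class), not the Keras implementation.

```python
import math


def sparse_categorical_crossentropy(true_labels, predicted_probs):
    """Mean of -log(p[true class]) over the batch; labels are plain integers."""
    losses = [-math.log(probs[label]) for label, probs in zip(true_labels, predicted_probs)]
    return sum(losses) / len(losses)


def accuracy(true_labels, predicted_probs):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = 0
    for label, probs in zip(true_labels, predicted_probs):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        hits += predicted == label
    return hits / len(true_labels)


# Hypothetical 3-class batch: integer labels and softmax-style probabilities.
y_true = [0, 2, 1]
y_prob = [
    [0.7, 0.2, 0.1],  # correct, fairly confident
    [0.1, 0.3, 0.6],  # correct
    [0.5, 0.4, 0.1],  # wrong: class 0 is ranked above the true class 1
]

loss = sparse_categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(round(loss, 4))
print(round(acc, 4))  # 0.6667
```

Because the labels are integers, each one simply selects which probability to take the logarithm of; no one-hot encoding is needed.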
r = model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs=10)
Epoch 1/10
1563/1563 [==============================] - 351s 221ms/step - loss: 1.2825 - accuracy: 0.5567 - val_loss: 1.0883 - val_accuracy: 0.6202
Epoch 2/10
1563/1563 [==============================] - 332s 213ms/step - loss: 0.8343 - accuracy: 0.7100 - val_loss: 0.7954 - val_accuracy: 0.7247
Epoch 3/10
1563/1563 [==============================] - 330s 211ms/step - loss: 0.6806 - accuracy: 0.7661 - val_loss: 0.7119 - val_accuracy: 0.7559
Epoch 4/10
1563/1563 [==============================] - 321s 205ms/step - loss: 0.5757 - accuracy: 0.8032 - val_loss: 0.6538 - val_accuracy: 0.7786
Epoch 5/10
1563/1563 [==============================] - 995s 637ms/step - loss: 0.4909 - accuracy: 0.8315 - val_loss: 0.6698 - val_accuracy: 0.7806
Epoch 6/10
1563/1563 [==============================] - 322s 206ms/step - loss: 0.4167 - accuracy: 0.8586 - val_loss: 0.5669 - val_accuracy: 0.8128
Epoch 7/10
1563/1563 [==============================] - 332s 213ms/step - loss: 0.3489 - accuracy: 0.8787 - val_loss: 0.6242 - val_accuracy: 0.8107
Epoch 8/10
1563/1563 [==============================] - 332s 212ms/step - loss: 0.3061 - accuracy: 0.8934 - val_loss: 0.6445 - val_accuracy: 0.8089
Epoch 9/10
1563/1563 [==============================] - 322s 206ms/step - loss: 0.2526 - accuracy: 0.9123 - val_loss: 0.6428 - val_accuracy: 0.8150
Epoch 10/10
1563/1563 [==============================] - 967s 619ms/step - loss: 0.2198 - accuracy: 0.9255 - val_loss: 0.6621 - val_accuracy: 0.8062
I trained the model using the model.fit function with the following specifications:
X_train and y_train were used as the training dataset.
X_test and y_test were used as the validation dataset.
The number of epochs was set to 10.
The results for each epoch are shown in the training log above.
The primary goal of this training was to develop a deep learning model for image classification that could accurately classify images from the CIFAR-10 dataset into one of its ten classes. Training accuracy rose from 0.5567 in epoch 1 to 0.9255 in epoch 10, which shows that the model was fitting the training data. Validation accuracy rose more slowly and levelled off at roughly 0.81 from epoch 6 onward, while validation loss reached its minimum of 0.5669 in epoch 6 and then increased. That widening gap is a sign that the model began to overfit in the later epochs. Because the test set doubles as the validation set here, these validation figures are not an independent estimate of performance.
The training process involved optimizing the model's weights and biases using the Adam optimizer and minimizing the sparse categorical cross-entropy loss. The use of dropout layers aimed to prevent overfitting.
Overall, the results from this first round of training indicated promising progress towards developing an effective image classification model, and further improvements were achieved in the subsequent round.
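The per-epoch numbers in the first training log can be summarised with a few lines of plain Python. In the notebook the same lists are available as r.history['accuracy'], r.history['val_accuracy'], and r.history['val_loss']; here they are copied from the log so the sketch runs on its own.

```python
# Values per epoch, copied from the first training log above.
train_acc = [0.5567, 0.7100, 0.7661, 0.8032, 0.8315, 0.8586, 0.8787, 0.8934, 0.9123, 0.9255]
val_acc = [0.6202, 0.7247, 0.7559, 0.7786, 0.7806, 0.8128, 0.8107, 0.8089, 0.8150, 0.8062]
val_loss = [1.0883, 0.7954, 0.7119, 0.6538, 0.6698, 0.5669, 0.6242, 0.6445, 0.6428, 0.6621]

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Gap between training and validation accuracy after the last epoch.
final_gap = train_acc[-1] - val_acc[-1]

print(best_epoch)           # 6
print(round(final_gap, 4))  # 0.1193
```

A gap of about 12 percentage points, together with validation loss bottoming out at epoch 6, is the kind of pattern that usually motivates regularization such as data augmentation.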
batch_size = 32
data_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)
train_generator = data_generator.flow(X_train, y_train, batch_size)
steps_per_epoch = X_train.shape[0] // batch_size
r = model.fit(train_generator, validation_data=(X_test, y_test),
              steps_per_epoch=steps_per_epoch, epochs=10)
Epoch 1/10
1562/1562 [==============================] - 362s 231ms/step - loss: 0.6168 - accuracy: 0.7963 - val_loss: 0.5594 - val_accuracy: 0.8184
Epoch 2/10
1562/1562 [==============================] - 336s 215ms/step - loss: 0.5332 - accuracy: 0.8212 - val_loss: 0.5079 - val_accuracy: 0.8292
Epoch 3/10
1562/1562 [==============================] - 3079s 2s/step - loss: 0.4966 - accuracy: 0.8328 - val_loss: 0.5072 - val_accuracy: 0.8302
Epoch 4/10
1562/1562 [==============================] - 377s 242ms/step - loss: 0.4743 - accuracy: 0.8392 - val_loss: 0.5033 - val_accuracy: 0.8332
Epoch 5/10
1562/1562 [==============================] - 352s 225ms/step - loss: 0.4468 - accuracy: 0.8475 - val_loss: 0.5482 - val_accuracy: 0.8228
Epoch 6/10
1562/1562 [==============================] - 368s 236ms/step - loss: 0.4206 - accuracy: 0.8560 - val_loss: 0.4538 - val_accuracy: 0.8425
Epoch 7/10
1562/1562 [==============================] - 364s 233ms/step - loss: 0.4011 - accuracy: 0.8619 - val_loss: 0.4674 - val_accuracy: 0.8430
Epoch 8/10
1562/1562 [==============================] - 354s 227ms/step - loss: 0.3871 - accuracy: 0.8677 - val_loss: 0.4689 - val_accuracy: 0.8368
Epoch 9/10
1562/1562 [==============================] - 357s 229ms/step - loss: 0.3751 - accuracy: 0.8710 - val_loss: 0.5223 - val_accuracy: 0.8289
Epoch 10/10
1562/1562 [==============================] - 358s 229ms/step - loss: 0.3576 - accuracy: 0.8770 - val_loss: 0.4771 - val_accuracy: 0.8399
In the second round of training, I employed data augmentation techniques to further improve the performance of my image classification model. Data augmentation is a common way to improve a model's ability to generalize and perform better on unseen data.
I used the following data augmentation settings:
width_shift_range and height_shift_range set to 0.1: This allowed random horizontal and vertical shifts of up to 10% of the image width and height, simulating variations in object position within the images.
horizontal_flip set to True: This randomly flipped images horizontally, which helps the model become more robust to left-right mirroring of objects.
Here's a summary of the key steps for the second round of training:
batch_size was set to 32.
data_generator was created using tf.keras.preprocessing.image.ImageDataGenerator with the specified augmentation settings.
train_generator was created using the data generator to produce augmented training data.
steps_per_epoch was calculated as the size of the training dataset divided by the batch size, rounded down (50,000 // 32 = 1,562).
Results for each epoch are shown in the training log above.
I conducted this second round of training with data augmentation to enhance the model's ability to recognize patterns in the images despite variations in object position and orientation. Because model.fit was called again on the same model, this round continued from the weights learned in the first round rather than starting from scratch. By the end, validation accuracy had risen from 0.8062 to 0.8399 and validation loss had fallen from 0.6621 to 0.4771, while the gap between training accuracy (0.8770) and validation accuracy narrowed to under four percentage points. These results are consistent with data augmentation having a positive impact on generalization, although part of the gain may simply come from the additional epochs.
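The two kinds of augmentation used above can be illustrated on a tiny image in plain Python. This is only a sketch of the idea: the helper names are hypothetical, the image is a made-up 3x4 grid, and the shift pads with a constant value, whereas ImageDataGenerator by default fills new pixels using its 'nearest' fill mode.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels, fill=0):
    """Shift every row `pixels` columns to the right, padding with `fill`."""
    width = len(image[0])
    return [([fill] * pixels + row)[:width] for row in image]


# Hypothetical 3x4 single-channel image; real CIFAR-10 images are 32x32x3.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

flipped = horizontal_flip(image)
shifted = shift_right(image, 1)
print(flipped)  # [[4, 3, 2, 1], [8, 7, 6, 5], [12, 11, 10, 9]]
print(shifted)  # [[0, 1, 2, 3], [0, 5, 6, 7], [0, 9, 10, 11]]

# Batches per epoch, as in the notebook: 50,000 images in batches of 32.
steps_per_epoch = 50000 // 32
print(steps_per_epoch)  # 1562
```

Each transformed image keeps its original label, so the model sees more varied inputs without any new labelling work.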
plt.plot(r.history['accuracy'], label = 'acc', color = 'blue')
plt.plot(r.history['val_accuracy'], label= 'val_acc', color = 'green')
plt.legend();
After conducting the second round of training with data augmentation, it was essential to evaluate the performance of the machine learning model. One of the most effective ways to understand how well the model was learning from the data was to visualize the training and validation accuracy over the epochs.
To achieve this, I created a line plot that displays the training accuracy (labeled 'acc' and shown in blue) and the validation accuracy (labeled 'val_acc' and shown in green) over the course of training. Because r was reassigned by the second call to model.fit, the plot covers only the ten epochs of the second round.
Analyzing the plot:
In this specific plot, training accuracy improved in every epoch, from 0.7963 to 0.8770. Validation accuracy also trended upward overall, from 0.8184 to 0.8399, but less steadily: it dipped in epochs 5, 8, and 9 and peaked at 0.8430 in epoch 7. The two curves stayed close together, which suggests that the model was learning from the augmented data without overfitting as strongly as in the first round.
This line plot served as a visual indicator of the model's performance during training, helping me check that the model was both learning and generalizing from the dataset. It played a useful role in assessing the second round of training with data augmentation and its effect on model performance.
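The trend in the two curves can also be checked numerically. The sketch below is plain Python; the values are copied from the second training log, and the helper name epochs_with_drop is hypothetical.

```python
# Accuracy per epoch, copied from the second (augmented) training log.
train_acc = [0.7963, 0.8212, 0.8328, 0.8392, 0.8475, 0.8560, 0.8619, 0.8677, 0.8710, 0.8770]
val_acc = [0.8184, 0.8292, 0.8302, 0.8332, 0.8228, 0.8425, 0.8430, 0.8368, 0.8289, 0.8399]


def epochs_with_drop(values):
    """Return the 1-based epochs whose value is lower than the previous epoch's."""
    return [i + 1 for i in range(1, len(values)) if values[i] < values[i - 1]]


train_drops = epochs_with_drop(train_acc)
val_drops = epochs_with_drop(val_acc)
peak_epoch = max(range(len(val_acc)), key=val_acc.__getitem__) + 1

print(train_drops)  # []
print(val_drops)    # [5, 8, 9]
print(peak_epoch)   # 7
```

Training accuracy never decreases, while validation accuracy fluctuates from epoch to epoch, which is typical when the validation set is much smaller than the training set.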
from sklearn.metrics import confusion_matrix , classification_report
classes = ["Airplane","Automobile","Bird","Cat","Deer","Dog","Frog","Horse","Ship","Truck"]
y_pred = model.predict(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)
print("Classification Report: \n", classification_report(y_test, y_pred_classes, target_names=classes))
313/313 [==============================] - 11s 34ms/step
Classification Report:
              precision    recall  f1-score   support

    Airplane       0.75      0.94      0.84      1000
  Automobile       0.87      0.96      0.92      1000
        Bird       0.81      0.78      0.80      1000
         Cat       0.75      0.69      0.72      1000
        Deer       0.83      0.82      0.83      1000
         Dog       0.80      0.77      0.78      1000
        Frog       0.79      0.94      0.86      1000
       Horse       0.94      0.84      0.89      1000
        Ship       0.97      0.78      0.87      1000
       Truck       0.94      0.88      0.91      1000

    accuracy                           0.84     10000
   macro avg       0.85      0.84      0.84     10000
weighted avg       0.85      0.84      0.84     10000
As part of evaluating the performance of my image classification model, I generated a classification report using the scikit-learn library. This classification report provided a comprehensive overview of how well the model was performing across different classes and various evaluation metrics. Here's a breakdown of the report and why I found it valuable:
Precision: I examined the precision scores for each class. Precision measures the accuracy of positive predictions. In my case, it revealed how many of the images predicted as a particular class were correct. Higher precision indicated fewer false positives.
Recall: I also looked at the recall scores. Recall quantifies the model's ability to correctly identify all relevant instances within a class. It told me how many of the actual instances of a class were correctly predicted by the model.
F1-score: The F1-score is the harmonic mean of precision and recall. It provides a balance between precision and recall, which is particularly useful when the dataset has class imbalances.
Support: Support represents the number of actual occurrences of each class in the test set, which is 1,000 for every class here.
The classification report included values for each of the ten classes (in this case, airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck) and three summary rows:
Macro Average: I observed the macro-average precision, recall, and F1-score. It calculates the unweighted mean of each metric across the classes, giving equal weight to each class. This was helpful to understand the model's overall performance across all classes.
Weighted Average: The weighted average is similar to the macro average but weights each class's metrics by its number of samples, which matters when the dataset has class imbalances. Because every class in the CIFAR-10 test set has 1,000 samples, the weighted and macro averages are identical here.
Accuracy: The classification report also reported the overall accuracy of the model. Accuracy measures how many of the total predictions were correct. It gave an indication of the model's overall performance on the test set.
In this specific report, I found that the model performed reasonably well. The precision, recall, and F1-scores for most classes were relatively high, indicating that the model was effective at classifying images into their respective categories. Automobile (F1 0.92) and Truck (0.91) were classified best, while Cat (0.72) and Dog (0.78) were the weakest classes. The accuracy of 84% means that the model made correct predictions for 84 out of every 100 test images.
The classification report provided a comprehensive understanding of the model's strengths and areas that might need improvement. It allowed me to identify specific classes where the model excelled and others where it might require further fine-tuning. Overall, this report was a crucial tool in assessing the model's performance and guiding any necessary adjustments to enhance its accuracy and precision further.
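The metrics in the report follow directly from counts of correct and incorrect predictions. The sketch below is plain Python; the counts in the first example are hypothetical, while the recall values in the second part are copied from the classification report above.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Per-class precision, recall and F1 from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one class: 90 correct hits, 30 false alarms, 10 misses.
p, r, f1 = precision_recall_f1(90, 30, 10)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.75 0.9 0.82

# Per-class recall values copied from the classification report.
recalls = [0.94, 0.96, 0.78, 0.69, 0.82, 0.77, 0.94, 0.84, 0.78, 0.88]
macro_recall = sum(recalls) / len(recalls)
print(round(macro_recall, 2))  # 0.84
```

When every class has the same support, as in this test set, the macro-average recall equals the overall accuracy, which is why both appear as 0.84 in the report.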
import seaborn as sns
import matplotlib.pyplot as plt
# Create the confusion matrix (named cm so it does not shadow the
# confusion_matrix function imported from sklearn.metrics)
cm = tf.math.confusion_matrix(y_test, y_pred_classes).numpy()
# Set the class labels for the heatmap
class_labels = [
    "Airplane",
    "Automobile",
    "Bird",
    "Cat",
    "Deer",
    "Dog",
    "Frog",
    "Horse",
    "Ship",
    "Truck"
]
# Plot the confusion matrix using a heatmap; fmt='d' shows whole-number counts
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_labels, yticklabels=class_labels)
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.show()
I created a heatmap plot of the confusion matrix as part of my model evaluation. The confusion matrix is a crucial tool for understanding how well my image classification model performed across different classes. In this case, I generated a heatmap using the seaborn library and matplotlib to visualize the confusion matrix results. Here's why I did this and what the heatmap revealed:
Confusion Matrix: The confusion matrix is a grid that compares the model's predicted labels to the actual labels in the test dataset. It's particularly useful for multi-class classification tasks, like the one I was working on. The cell in row i and column j counts the test images whose true class is i and whose predicted class is j.
Class Labels: To make the heatmap more interpretable, I labeled the rows and columns of the heatmap with class names. This way, I could easily identify which class the model was confusing with another.
Heatmap Visualization: A heatmap is a graphical representation of the confusion matrix, where each cell's color intensity represents the number of samples that fall into a particular category. I used a blue color map ('Blues') to visualize the results, with darker shades indicating higher values.
By plotting the confusion matrix as a heatmap, I gained the following insights:
Diagonal Elements: The diagonal from the top-left to the bottom-right of the heatmap represented the correct predictions. In other words, it showed how many samples from each class were correctly classified.
Off-diagonal Elements: The off-diagonal cells showed misclassifications. I could see which classes were frequently confused with each other. Darker off-diagonal cells indicated more frequent confusion between those classes.
Class-Specific Performance: I could assess the model's performance for each class individually. For classes with dark diagonal cells and pale off-diagonal cells, the model performed well. For classes with darker off-diagonal cells, the model struggled to distinguish between those classes.
The heatmap visualization of the confusion matrix offered a clear, visual representation of the model's performance. It helped me identify specific areas where the model was excelling and areas where it needed improvement. This information was invaluable for fine-tuning the model and understanding its behavior in a multi-class classification context. It also provided a more intuitive way to grasp the overall model performance, especially when dealing with a large number of classes.
In summary, the confusion matrix heatmap was a vital component of model evaluation, giving me a visual overview of class-specific performance and highlighting potential areas for optimization.
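How a confusion matrix is built, and how the diagonal relates to per-class recall, can be shown with a small plain-Python sketch. The labels below are hypothetical and the helper name build_confusion_matrix is my own; the notebook itself uses tf.math.confusion_matrix.

```python
def build_confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[i][j] counts samples whose true class is i and predicted class is j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


# Hypothetical labels for a 3-class problem.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]

cm = build_confusion_matrix(y_true, y_pred, num_classes=3)
for row in cm:
    print(row)
# [2, 1, 0]
# [0, 2, 1]
# [1, 0, 3]

# Diagonal entries are correct predictions; dividing by the row sum gives recall.
recall_per_class = [cm[i][i] / sum(cm[i]) for i in range(3)]
print([round(value, 2) for value in recall_per_class])  # [0.67, 0.67, 0.75]
```

Reading along a row shows where the images of one true class ended up; reading down a column shows which true classes were mistaken for that prediction.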