import os, cv2, torch
import numpy as np
import matplotlib.pyplot as plt
import torchvision.transforms as tt
import torch.nn as nn
import torch.nn.functional as F
from tqdm import tqdm
from torchsummary import summary
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
from torchvision.utils import save_image, make_grid
I'm importing the necessary libraries for my PyTorch project. Here's a breakdown of what each import statement does:

- os: Allows me to interact with the operating system, such as handling file paths.
- cv2: Provides computer vision functionality (OpenCV), useful for image processing tasks.
- torch: The core PyTorch library.
- numpy: A library for numerical operations in Python, commonly used alongside PyTorch.
- matplotlib.pyplot: Enables me to visualize data, such as plotting graphs and images.
- torchvision.transforms as tt: The transforms module from torchvision, which helps me perform various image transformations.
- torch.nn as nn: PyTorch's neural network module, providing tools to build and train neural networks.
- torch.nn.functional as F: The functional interface to the operations in torch.nn.
- tqdm: A library that creates progress bars, making it easier to monitor long-running tasks.
- torch.utils.data.DataLoader: Used for loading data efficiently, creating batches, and shuffling.
- torchvision.datasets.ImageFolder: A dataset class that loads images from a directory structure.
- torchvision.utils.save_image and torchvision.utils.make_grid: Utility functions for saving images and creating grid displays.

Overall, these libraries cover file handling, image processing, neural network construction, and data loading. The specific functionality of each will come into play as I progress through the project.
The Anime Face Dataset comprises 21,551 anime faces sourced from www.getchu.com. All images in the dataset are resized to a standard size of 64 x 64 pixels (handled by the transform pipeline below), which ensures uniformity and simplifies training for the neural network.
# CONSTANTS
IMAGE_SIZE = 64
BATCH_SIZE = 128
MEAN, STD = (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)
DATA_DIR = './data/'
train_ds = ImageFolder(DATA_DIR, transform = tt.Compose([tt.Resize(IMAGE_SIZE),
tt.CenterCrop(IMAGE_SIZE),
tt.ToTensor(),
tt.Normalize(mean=MEAN,
std=STD)]))
train_dl = DataLoader(train_ds, BATCH_SIZE, shuffle = True,
num_workers = 2, pin_memory = True)
def denorm(img_tensors):
    # Undo Normalize: x_norm = (x - mean) / std  =>  x = x_norm * std + mean
    return img_tensors * STD[0] + MEAN[0]
def show_images(images, nmax=64):
fig, ax = plt.subplots(figsize = (8, 8))
ax.set_xticks([]); ax.set_yticks([])
    ax.imshow(make_grid(denorm(images.detach().cpu()[:nmax]), nrow=8).permute(1, 2, 0))
def show_batch(dl, nmax=64):
for images, _ in dl:
show_images(images, nmax)
break
show_batch(train_dl)
I've defined some constants and implemented a simple data loading and visualization pipeline using PyTorch. Here's a breakdown of the code.

I've set up some constants to control the behavior of my data processing and loading pipeline:

- IMAGE_SIZE: Specifies the size to which images should be resized.
- BATCH_SIZE: Defines the number of images in each batch during training.
- MEAN and STD: The mean and standard deviation used for normalizing the image data.
- DATA_DIR: The directory path where my dataset is located.

I'm using the ImageFolder class from torchvision to load the dataset. The tt.Compose function chains together a series of image transformations:

- tt.Resize: Resizes the image to the specified size.
- tt.CenterCrop: Performs a center crop on the resized image.
- tt.ToTensor: Converts the image to a PyTorch tensor.
- tt.Normalize: Normalizes the image tensor using the specified mean and standard deviation.

The train_dl DataLoader is then created to efficiently load and iterate over batches of data during training. I've used pin_memory=True to enable faster data transfer to the GPU if one is available.

I've defined three additional functions:

- denorm: Reverts the normalization, allowing me to display the original images.
- show_images: Takes a batch of image tensors, denormalizes them, and displays them in a grid.
- show_batch: Takes a DataLoader and displays a batch of images using the show_images function.

Finally, I call show_batch(train_dl) to visualize a batch of training images.

This code provides a solid foundation for loading, transforming, and visualizing image data in a PyTorch project.
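To make the normalization concrete, here is a quick check (not part of the original pipeline; it reuses MEAN, STD, and denorm defined above). With mean = std = 0.5, tt.Normalize maps pixel values from [0, 1] to [-1, 1], and denorm reverses that mapping:

x = torch.tensor([0.0, 0.5, 1.0])        # pixel values after tt.ToTensor
x_norm = (x - MEAN[0]) / STD[0]          # what tt.Normalize computes per channel
print(x_norm)                            # tensor([-1.,  0.,  1.])
print(denorm(x_norm))                    # tensor([0.0000, 0.5000, 1.0000])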
def to_device(data, device):
if isinstance(data, (list, tuple)):
return [to_device(x, device) for x in data]
return data.to(device, non_blocking = True)
class DeviceDataLoader():
def __init__(self, dl, device):
self.dl = dl
self.device = device
def __iter__(self):
for b in self.dl:
yield to_device(b, self.device)
def __len__(self):
return len(self.dl)
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
train_dl = DeviceDataLoader(train_dl, device)
In this code, I've defined a utility function to_device and a custom class DeviceDataLoader to facilitate data movement between the CPU and GPU.

to_device function: This utility helps me move data to the specified device (CPU or GPU), which is particularly useful when dealing with PyTorch tensors. Here's how it works:

- If the data is a list or tuple, it recursively applies the to_device function to each element.
- Otherwise, it calls the .to(device, non_blocking=True) method to move the data to the specified device. The non_blocking=True argument allows asynchronous data transfer when using CUDA (GPU).

DeviceDataLoader class: This custom data loader wraps an existing data loader (dl) and ensures that each batch of data is moved to the specified device. Here's how it's structured:

- The __init__ method initializes the DeviceDataLoader with the provided data loader (dl) and the target device.
- The __iter__ method iterates through batches of data from the original data loader and yields each batch after applying the to_device function to move it to the specified device.
- The __len__ method returns the length of the original data loader.

I determine whether to use the CPU or GPU based on the availability of a CUDA-enabled device and create a PyTorch device (torch.device) accordingly.

Finally, I instantiate the DeviceDataLoader class, passing the original train_dl and the chosen device. This ensures that during training, batches of data are efficiently moved to the specified device, allowing for faster computation, especially on a GPU.

This abstraction for device-specific data movement is a common practice when working with PyTorch on different hardware.
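As a quick illustration (not from the original notebook), here is a minimal sketch of how to_device and DeviceDataLoader behave; the toy dataset below is hypothetical and exists only to show that yielded batches land on the chosen device:

# Hypothetical toy dataset to illustrate DeviceDataLoader; reuses DeviceDataLoader and device from above.
from torch.utils.data import TensorDataset

toy_ds = TensorDataset(torch.randn(8, 3, 64, 64), torch.zeros(8, dtype=torch.long))
toy_dl = DeviceDataLoader(DataLoader(toy_ds, batch_size=4), device)

for images, labels in toy_dl:
    print(images.device, labels.device)  # both tensors now live on `device`
    break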
# Define the discriminator architecture using nn.Sequential
discriminator = nn.Sequential(
# First Convolutional Layer
nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(64),
nn.LeakyReLU(0.2, inplace=True),
# Second Convolutional Layer
nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(128),
nn.LeakyReLU(0.2, inplace=True),
# Third Convolutional Layer
nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(256),
nn.LeakyReLU(0.2, inplace=True),
# Fourth Convolutional Layer
nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(512),
nn.LeakyReLU(0.2, inplace=True),
# Fifth Convolutional Layer (Output Layer)
nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=0, bias=False),
nn.Flatten(),
nn.Sigmoid()
)
discriminator = to_device(discriminator, device)
summary(discriminator, (3, 64, 64))
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 64, 32, 32]           3,072
       BatchNorm2d-2           [-1, 64, 32, 32]             128
         LeakyReLU-3           [-1, 64, 32, 32]               0
            Conv2d-4          [-1, 128, 16, 16]         131,072
       BatchNorm2d-5          [-1, 128, 16, 16]             256
         LeakyReLU-6          [-1, 128, 16, 16]               0
            Conv2d-7            [-1, 256, 8, 8]         524,288
       BatchNorm2d-8            [-1, 256, 8, 8]             512
         LeakyReLU-9            [-1, 256, 8, 8]               0
           Conv2d-10            [-1, 512, 4, 4]       2,097,152
      BatchNorm2d-11            [-1, 512, 4, 4]           1,024
        LeakyReLU-12            [-1, 512, 4, 4]               0
           Conv2d-13              [-1, 1, 1, 1]           8,192
          Flatten-14                    [-1, 1]               0
          Sigmoid-15                    [-1, 1]               0
================================================================
Total params: 2,765,696
Trainable params: 2,765,696
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.05
Forward/backward pass size (MB): 2.81
Params size (MB): 10.55
Estimated Total Size (MB): 13.41
----------------------------------------------------------------
LATENT_SIZE = 128
I'm defining the architecture for the discriminator in my DCGAN (Deep Convolutional Generative Adversarial Network) project using PyTorch. This neural network plays a crucial role in distinguishing between real and generated anime faces. Here's a breakdown of the code:

- The discriminator is composed of five strided convolutional layers stacked inside nn.Sequential; each layer except the last is followed by batch normalization and a Leaky ReLU activation.
- The final nn.Flatten() layer reshapes the output tensor into a 1D vector, and a Sigmoid maps it to a probability.
- The model is moved to the chosen device with the to_device function, which facilitates seamless data movement between the model and the device.
- I set the LATENT_SIZE constant to 128, which represents the size of the latent space. This latent vector serves as the input to the generator in the DCGAN.

In summary, this discriminator takes an image and outputs a probability indicating whether the input is a real or a generated anime face. The Leaky ReLU activations and batch normalization layers contribute to the stability and efficiency of training. The spatial dimensions shrink at each layer, as traced in the sketch below, and the discriminator is ready to be trained alongside the generator in my DCGAN project.
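To see why the 64 x 64 input collapses to a single value, here is a small sketch (assuming the standard convolution output-size formula; it is not code from the original notebook) that traces the spatial size through the discriminator's layers:

def conv_out(size, kernel=4, stride=2, padding=1):
    # Standard convolution output size: floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

size = 64
for layer in range(4):                      # four stride-2 convolutions: 64 -> 32 -> 16 -> 8 -> 4
    size = conv_out(size)
    print(f"after conv {layer + 1}: {size} x {size}")
print("output:", conv_out(size, kernel=4, stride=1, padding=0))  # final 4x4 conv -> 1 x 1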
# Define the generator architecture using nn.Sequential
generator = nn.Sequential(
# First Transposed Convolutional Layer
nn.ConvTranspose2d(LATENT_SIZE, 512, kernel_size=4, stride=1, padding=0, bias=False),
nn.BatchNorm2d(512), # Batch Normalization for stabilization
nn.ReLU(True), # ReLU activation function for non-linearity
# Second Transposed Convolutional Layer
nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(256),
nn.ReLU(True),
# Third Transposed Convolutional Layer
nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(128),
nn.ReLU(True),
# Fourth Transposed Convolutional Layer
nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1, bias=False),
nn.BatchNorm2d(64),
nn.ReLU(True),
# Fifth Transposed Convolutional Layer (Output Layer)
nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1, bias=False),
nn.Tanh() # Tanh activation for output normalization
)
generator = to_device(generator, device)
summary(generator, input_size=(LATENT_SIZE, 1, 1))
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
   ConvTranspose2d-1            [-1, 512, 4, 4]       1,048,576
       BatchNorm2d-2            [-1, 512, 4, 4]           1,024
              ReLU-3            [-1, 512, 4, 4]               0
   ConvTranspose2d-4            [-1, 256, 8, 8]       2,097,152
       BatchNorm2d-5            [-1, 256, 8, 8]             512
              ReLU-6            [-1, 256, 8, 8]               0
   ConvTranspose2d-7          [-1, 128, 16, 16]         524,288
       BatchNorm2d-8          [-1, 128, 16, 16]             256
              ReLU-9          [-1, 128, 16, 16]               0
  ConvTranspose2d-10           [-1, 64, 32, 32]         131,072
      BatchNorm2d-11           [-1, 64, 32, 32]             128
             ReLU-12           [-1, 64, 32, 32]               0
  ConvTranspose2d-13            [-1, 3, 64, 64]           3,072
             Tanh-14            [-1, 3, 64, 64]               0
================================================================
Total params: 3,806,080
Trainable params: 3,806,080
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 3.00
Params size (MB): 14.52
Estimated Total Size (MB): 17.52
----------------------------------------------------------------
I'm defining the architecture for the generator in my DCGAN project using PyTorch. This neural network is responsible for generating synthetic anime faces. Let me break down the code.

I've structured the generator using nn.Sequential, composing it from several transposed convolutional layers:

- First layer: takes the LATENT_SIZE-channel latent vector (1 x 1 spatial size) and produces 512 feature maps of size 4 x 4.
- Second layer: upsamples from 512 to 256 channels, doubling the spatial size to 8 x 8.
- Third layer: upsamples from 256 to 128 channels (16 x 16).
- Fourth layer: upsamples from 128 to 64 channels (32 x 32).
- Fifth layer: produces the final 3-channel, 64 x 64 image, followed by a Tanh activation that maps pixel values to [-1, 1].

I'm ensuring that the generator is placed on the chosen device (CPU or GPU) using the to_device function, which seamlessly moves the model to the desired device.

This generator is designed to take a latent vector as input and produce synthetic anime faces. The transposed convolutional layers are crucial for transforming the latent vector into a spatial representation that resembles the distribution of real anime faces, while batch normalization and ReLU activations enhance the stability and expressiveness of the generator. The worked size calculation below traces this upsampling step by step.

This generator, in conjunction with the discriminator, forms the core of my DCGAN project, aiming to generate realistic anime faces.
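As a counterpart to the discriminator sketch, here is a small illustration (assuming the standard transposed-convolution output-size formula; not part of the original notebook) tracing how the 1 x 1 latent input grows to 64 x 64:

def deconv_out(size, kernel=4, stride=2, padding=1):
    # Transposed convolution output size: (size - 1) * stride - 2*padding + kernel
    return (size - 1) * stride - 2 * padding + kernel

size = deconv_out(1, kernel=4, stride=1, padding=0)   # first layer: 1 x 1 -> 4 x 4
print("after layer 1:", size, "x", size)
for layer in range(2, 6):                             # four stride-2 layers: 4 -> 8 -> 16 -> 32 -> 64
    size = deconv_out(size)
    print(f"after layer {layer}: {size} x {size}")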
# Generate a batch of random latent vectors
xb = torch.randn(BATCH_SIZE, LATENT_SIZE, 1, 1, device=device)  # create latents on the same device as the generator
# Generate fake images using the generator
fake_images = generator(xb)
# Print the shape of the generated fake images
print(fake_images.shape)
torch.Size([128, 3, 64, 64])
# Show the generated fake images
show_images(fake_images)
# Create a directory to save the generated images
sample_dir = 'generated'
os.makedirs(sample_dir, exist_ok=True)
# Generate a fixed set of random latent vectors for consistent visualization
fixed_latent = torch.randn(64, LATENT_SIZE, 1, 1, device=device)
# Define a function to save generated samples to the specified directory
def save_samples(index, latent_tensors, show=True):
fake_images = generator(latent_tensors)
fake_fname = f'generated-images-{index:0=4d}.png'
# Save the generated images
save_image(denorm(fake_images), os.path.join(sample_dir, fake_fname), nrow=8)
print('Saving', fake_fname)
# Optionally, display the saved images in a grid
if show:
fig, ax = plt.subplots(figsize=(8, 8))
ax.set_xticks([]); ax.set_yticks([])
        ax.imshow(make_grid(denorm(fake_images.cpu().detach()), nrow=8).permute(1, 2, 0))
In this section of my DCGAN project, I perform several operations related to the generation and visualization of synthetic images. Let me describe each step:

- Generate random latent vectors: I create a batch of random latent vectors (xb) using a normal distribution with shape (BATCH_SIZE, LATENT_SIZE, 1, 1).
- Generate fake images: I pass the latent vectors (xb) through the generator to turn them into synthetic images (fake_images).
- Inspect the shape of the generated images: I print the shape to confirm the output is a batch of 3 x 64 x 64 images.
- Show the generated images: Using the previously defined function (show_images), I display the generated fake images for visual inspection.
- Create a directory for saving generated samples: I create the 'generated' directory if it doesn't already exist.
- Generate fixed latent vectors for consistency: I generate a fixed set of random latent vectors (fixed_latent) so the visualization of generated samples stays consistent across epochs.
- Define a function to save and display samples: I implement a function (save_samples) responsible for saving generated images with a specific naming convention in the 'generated' directory.
- Save and display generated samples: During training, I call the save_samples function to save generated images from the fixed latent vectors and optionally display them, as illustrated in the quick check below.

In essence, these steps collectively contribute to the exploration, visualization, and consistent storage of synthetic images generated by the DCGAN model.
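As a quick sanity check (not in the original code, and the output will just be noise because the generator is still untrained at this point), save_samples can be called once before training starts:

# Hypothetical pre-training check: index 0 marks samples from the untrained generator.
save_samples(0, fixed_latent, show=True)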
# Define a function to train a GAN model for a specified number of epochs
def fit(model, criterion, epochs, lr, start_idx=1):
# Set discriminator and generator models to training mode
model['discriminator'].train()
model['generator'].train()
# Clear GPU memory to avoid memory issues during training
torch.cuda.empty_cache()
# Initialize lists to store losses and scores
losses_g = []
losses_d = []
real_scores = []
fake_scores = []
# Set up separate Adam optimizers for the discriminator and generator networks
optimizer = {
'discriminator': torch.optim.Adam(model['discriminator'].parameters(),
lr=lr, betas=(0.5, 0.999)),
'generator': torch.optim.Adam(model['generator'].parameters(),
lr=lr, betas=(0.5, 0.999))
}
# Iterate over the specified number of epochs
for epoch in range(epochs):
# Initialize lists to store losses and scores for each batch in the epoch
loss_d_per_epoch = []
loss_g_per_epoch = []
real_score_per_epoch = []
fake_score_per_epoch = []
# Iterate over batches in the training data loader
for real_images, _ in tqdm(train_dl):
# Zero out gradients for the discriminator optimizer
optimizer['discriminator'].zero_grad()
# Forward pass for real images through the discriminator
real_preds = model['discriminator'](real_images)
real_targets = torch.ones(real_images.size(0), 1, device=device)
real_loss = criterion['discriminator'](real_preds, real_targets)
cur_real_score = torch.mean(real_preds).item()
# Generate fake images using the generator
latent = torch.randn(BATCH_SIZE, LATENT_SIZE, 1, 1, device=device)
fake_images = model['generator'](latent)
            # Forward pass for fake images through the discriminator
            # (detach the fakes so this backward pass doesn't build gradients for the generator)
            fake_targets = torch.zeros(fake_images.size(0), 1, device=device)
            fake_preds = model['discriminator'](fake_images.detach())
fake_loss = criterion['discriminator'](fake_preds, fake_targets)
cur_fake_score = torch.mean(fake_preds).item()
# Append scores to lists
real_score_per_epoch.append(cur_real_score)
fake_score_per_epoch.append(cur_fake_score)
# Calculate and backpropagate the total discriminator loss
loss_d = real_loss + fake_loss
loss_d.backward()
optimizer['discriminator'].step()
loss_d_per_epoch.append(loss_d.item())
# Zero out gradients for the generator optimizer
optimizer['generator'].zero_grad()
# Generate new fake images and calculate generator loss
latent = torch.randn(BATCH_SIZE, LATENT_SIZE, 1, 1, device=device)
fake_images = model['generator'](latent)
preds = model['discriminator'](fake_images)
targets = torch.ones(BATCH_SIZE, 1, device=device)
loss_g = criterion['generator'](preds, targets)
# Backpropagate and optimize the generator's parameters
loss_g.backward()
optimizer['generator'].step()
loss_g_per_epoch.append(loss_g.item())
# Store the average losses and scores for the epoch
losses_g.append(np.mean(loss_g_per_epoch))
losses_d.append(np.mean(loss_d_per_epoch))
real_scores.append(np.mean(real_score_per_epoch))
fake_scores.append(np.mean(fake_score_per_epoch))
# Log losses & scores for the last batch in each epoch
print(f"Epoch [{epoch + 1}/{epochs}], "
f"loss_g: {losses_g[-1]:.4f}, "
f"loss_d: {losses_d[-1]:.4f}, "
f"real_score: {real_scores[-1]:.4f}, "
f"fake_score: {fake_scores[-1]:.4f}")
# Save generated samples after the last epoch
if epoch == epochs - 1:
save_samples(epoch + start_idx, fixed_latent, show=False)
# Save the final discriminator and generator models
torch.save(model['discriminator'].state_dict(), 'discriminator.pth')
torch.save(model['generator'].state_dict(), 'generator.pth')
# Return the lists containing losses and scores for both the generator and discriminator
return losses_g, losses_d, real_scores, fake_scores
This code defines a training function for a Generative Adversarial Network (GAN). Let me explain each part:

- Setting up the training environment: I put the discriminator and generator into training mode by calling train() on their respective instances, and I clear cached GPU memory with torch.cuda.empty_cache().
- Initialization and configuration: I initialize lists (losses_g, losses_d, real_scores, fake_scores) to store losses and scores during training, and I set up separate Adam optimizers for the discriminator and the generator.
- Training loop: For each epoch, I iterate over batches from the training data loader (train_dl). For every batch I first update the discriminator on real and fake images, then update the generator so its fakes fool the discriminator.
- Logging and saving: At the end of each epoch I print the average losses and scores; after the last epoch I save sample images with the save_samples function and save both models' weights with torch.save.
- Return: The function returns the per-epoch lists of generator losses, discriminator losses, real scores, and fake scores.

This function facilitates the training of a GAN model and provides insight into the learning progress through printed logs and saved samples.
model = {
'discriminator': discriminator.to(device),
'generator': generator.to(device)
}
criterion = {
'discriminator': nn.BCELoss(),
'generator': nn.BCELoss()
}
lr = 0.0002
epochs = 50
history = fit(model, criterion, epochs, lr)
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:48<00:00, 4.19s/it]
Epoch [1/50], loss_g: 6.0652, loss_d: 0.8181, real_score: 0.7618, fake_score: 0.2467
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:10<00:00, 4.32s/it]
Epoch [2/50], loss_g: 4.5312, loss_d: 0.7919, real_score: 0.7432, fake_score: 0.2549
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:59<00:00, 4.25s/it]
Epoch [3/50], loss_g: 4.5523, loss_d: 0.6902, real_score: 0.7637, fake_score: 0.2340
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:51<00:00, 4.21s/it]
Epoch [4/50], loss_g: 4.7291, loss_d: 0.7549, real_score: 0.7561, fake_score: 0.2433
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:55<00:00, 4.23s/it]
Epoch [5/50], loss_g: 5.0454, loss_d: 0.6721, real_score: 0.7698, fake_score: 0.2240
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:56<00:00, 4.24s/it]
Epoch [6/50], loss_g: 5.2042, loss_d: 0.6109, real_score: 0.7879, fake_score: 0.2079
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:00<00:00, 4.27s/it]
Epoch [7/50], loss_g: 5.5504, loss_d: 0.5896, real_score: 0.7996, fake_score: 0.1984
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:59<00:00, 4.26s/it]
Epoch [8/50], loss_g: 5.5796, loss_d: 0.5388, real_score: 0.8101, fake_score: 0.1842
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:05<00:00, 4.29s/it]
Epoch [9/50], loss_g: 5.3962, loss_d: 0.5155, real_score: 0.8164, fake_score: 0.1796
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:09<00:00, 4.32s/it]
Epoch [10/50], loss_g: 5.6377, loss_d: 0.5032, real_score: 0.8238, fake_score: 0.1749
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:03<00:00, 4.28s/it]
Epoch [11/50], loss_g: 5.3780, loss_d: 0.4539, real_score: 0.8356, fake_score: 0.1607
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:07<00:00, 4.31s/it]
Epoch [12/50], loss_g: 5.5346, loss_d: 0.4654, real_score: 0.8424, fake_score: 0.1565
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:04<00:00, 4.29s/it]
Epoch [13/50], loss_g: 5.4358, loss_d: 0.3985, real_score: 0.8545, fake_score: 0.1439
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:12<00:00, 4.34s/it]
Epoch [14/50], loss_g: 5.5395, loss_d: 0.4467, real_score: 0.8462, fake_score: 0.1519
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:03<00:00, 4.28s/it]
Epoch [15/50], loss_g: 5.5605, loss_d: 0.4454, real_score: 0.8455, fake_score: 0.1526
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:59<00:00, 4.26s/it]
Epoch [16/50], loss_g: 5.3136, loss_d: 0.4184, real_score: 0.8505, fake_score: 0.1461
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:57<00:00, 4.24s/it]
Epoch [17/50], loss_g: 5.3686, loss_d: 0.4461, real_score: 0.8495, fake_score: 0.1526
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:59<00:00, 4.26s/it]
Epoch [18/50], loss_g: 5.3113, loss_d: 0.4180, real_score: 0.8498, fake_score: 0.1459
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:53<00:00, 4.22s/it]
Epoch [19/50], loss_g: 5.2832, loss_d: 0.4279, real_score: 0.8522, fake_score: 0.1472
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:54<00:00, 4.23s/it]
Epoch [20/50], loss_g: 5.2682, loss_d: 0.4070, real_score: 0.8586, fake_score: 0.1409
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:49<00:00, 4.20s/it]
Epoch [21/50], loss_g: 5.2217, loss_d: 0.3837, real_score: 0.8649, fake_score: 0.1321
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:52<00:00, 4.22s/it]
Epoch [22/50], loss_g: 5.2380, loss_d: 0.3737, real_score: 0.8688, fake_score: 0.1324
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:00<00:00, 4.26s/it]
Epoch [23/50], loss_g: 5.1183, loss_d: 0.3593, real_score: 0.8706, fake_score: 0.1269
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:56<00:00, 4.24s/it]
Epoch [24/50], loss_g: 5.0967, loss_d: 0.3602, real_score: 0.8716, fake_score: 0.1275
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:56<00:00, 4.24s/it]
Epoch [25/50], loss_g: 5.1378, loss_d: 0.3683, real_score: 0.8711, fake_score: 0.1276
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:54<00:00, 4.23s/it]
Epoch [26/50], loss_g: 4.9867, loss_d: 0.4029, real_score: 0.8663, fake_score: 0.1335
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:57<00:00, 4.25s/it]
Epoch [27/50], loss_g: 4.7878, loss_d: 0.3279, real_score: 0.8866, fake_score: 0.1135
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:58<00:00, 4.25s/it]
Epoch [28/50], loss_g: 4.9205, loss_d: 0.3779, real_score: 0.8752, fake_score: 0.1243
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:52<00:00, 4.21s/it]
Epoch [29/50], loss_g: 4.6765, loss_d: 0.2789, real_score: 0.8926, fake_score: 0.1079
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:51<00:00, 4.21s/it]
Epoch [30/50], loss_g: 4.9732, loss_d: 0.4974, real_score: 0.8607, fake_score: 0.1390
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:54<00:00, 4.23s/it]
Epoch [31/50], loss_g: 4.3921, loss_d: 0.2553, real_score: 0.8971, fake_score: 0.1018
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:55<00:00, 4.24s/it]
Epoch [32/50], loss_g: 4.6987, loss_d: 0.3743, real_score: 0.8765, fake_score: 0.1228
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:56<00:00, 4.24s/it]
Epoch [33/50], loss_g: 4.6536, loss_d: 0.2659, real_score: 0.8989, fake_score: 0.1010
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:52<00:00, 4.22s/it]
Epoch [34/50], loss_g: 4.7566, loss_d: 0.4439, real_score: 0.8659, fake_score: 0.1326
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:51<00:00, 4.21s/it]
Epoch [35/50], loss_g: 4.5519, loss_d: 0.3645, real_score: 0.8845, fake_score: 0.1153
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:52<00:00, 4.21s/it]
Epoch [36/50], loss_g: 4.7310, loss_d: 0.3432, real_score: 0.8859, fake_score: 0.1144
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:55<00:00, 4.23s/it]
Epoch [37/50], loss_g: 4.5264, loss_d: 0.2636, real_score: 0.9057, fake_score: 0.0936
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:00<00:00, 4.26s/it]
Epoch [38/50], loss_g: 4.6865, loss_d: 0.2616, real_score: 0.9003, fake_score: 0.0989
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:28<00:00, 4.08s/it]
Epoch [39/50], loss_g: 4.6169, loss_d: 0.3611, real_score: 0.8861, fake_score: 0.1144
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:27<00:00, 4.07s/it]
Epoch [40/50], loss_g: 4.5619, loss_d: 0.2195, real_score: 0.9129, fake_score: 0.0828
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:23<00:00, 4.05s/it]
Epoch [41/50], loss_g: 4.5618, loss_d: 0.4803, real_score: 0.8788, fake_score: 0.1251
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:32<00:00, 4.10s/it]
Epoch [42/50], loss_g: 4.3703, loss_d: 0.1656, real_score: 0.9299, fake_score: 0.0696
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:36<00:00, 4.12s/it]
Epoch [43/50], loss_g: 4.5105, loss_d: 0.4910, real_score: 0.8698, fake_score: 0.1299
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:33<00:00, 4.10s/it]
Epoch [44/50], loss_g: 4.3677, loss_d: 0.2912, real_score: 0.9031, fake_score: 0.0966
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:07<00:00, 4.31s/it]
Epoch [45/50], loss_g: 4.6320, loss_d: 0.3030, real_score: 0.9013, fake_score: 0.0986
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:02<00:00, 4.28s/it]
Epoch [46/50], loss_g: 4.6951, loss_d: 0.3971, real_score: 0.8896, fake_score: 0.1101
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:18<00:00, 4.37s/it]
Epoch [47/50], loss_g: 4.1778, loss_d: 0.1618, real_score: 0.9292, fake_score: 0.0700
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:04<00:00, 4.29s/it]
Epoch [48/50], loss_g: 4.5123, loss_d: 0.4282, real_score: 0.8999, fake_score: 0.1008
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [11:59<00:00, 4.26s/it]
Epoch [49/50], loss_g: 4.4507, loss_d: 0.2422, real_score: 0.9115, fake_score: 0.0887
100%|████████████████████████████████████████████████████████████████████████████████| 169/169 [12:01<00:00, 4.27s/it]
Epoch [50/50], loss_g: 4.6000, loss_d: 0.3309, real_score: 0.8960, fake_score: 0.1017 Saving generated-images-0050.png
losses_g, losses_d, real_scores, fake_scores = history
plt.figure(figsize=(15,6))
plt.plot(losses_d, '-')
plt.plot(losses_g, '-')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['Discriminator', 'Generator'])
plt.title('The Discriminator Loss and the Generator Loss');
In this code, I visualize the training history of the GAN I have just trained. The training history includes the losses of both the discriminator and the generator over the epochs.

Here's a breakdown of the code:

- losses_g, losses_d, real_scores, fake_scores = history: I unpack the training history into separate lists for generator losses (losses_g), discriminator losses (losses_d), real scores (real_scores), and fake scores (fake_scores).
- plt.figure(figsize=(15,6)): I create a new figure with a specified size to ensure clarity and visibility.
- plt.plot(losses_d, '-') and plt.plot(losses_g, '-'): I plot the discriminator and generator losses over the epochs as solid lines ('-').
- plt.xlabel('epoch') and plt.ylabel('loss'): I label the axes so the horizontal axis shows the training epoch and the vertical axis shows the loss value.
- plt.legend(['Discriminator', 'Generator']): I add a legend identifying which line corresponds to the discriminator and which to the generator.
- plt.title('The Discriminator Loss and the Generator Loss'): I set a title describing the information the plot conveys.

This visualization helps me analyze the training process by observing how the losses of the discriminator and generator evolve over the course of training, which gives insight into the convergence and stability of the GAN model.
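The real and fake scores are also available from the unpacked history, so a companion plot (not in the original notebook, but built from the same variables) can show how confidently the discriminator separates real images from generated ones over time:

# Optional companion plot: discriminator scores on real vs. generated images per epoch.
plt.figure(figsize=(15, 6))
plt.plot(real_scores, '-')
plt.plot(fake_scores, '-')
plt.xlabel('epoch')
plt.ylabel('score')
plt.legend(['Real score', 'Fake score'])
plt.title('Discriminator Scores on Real and Generated Images');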
generated_img = cv2.imread(f'./generated/generated-images-{epochs:04d}.png')  # matches save_samples' zero-padded naming
generated_img = generated_img[:, :, [2,1,0]]
fig, ax = plt.subplots(figsize=(8,8))
ax.set_xticks([]); ax.set_yticks([])
ax.imshow(generated_img);
In this code, I load and display a generated image from the GAN training process. Let me break down the code:

- generated_img = cv2.imread(...): I use OpenCV (cv2) to read the image file saved for the final epoch. The file name is built from the epochs variable using the same zero-padded naming convention as save_samples, and the image is loaded into the variable generated_img.
- generated_img = generated_img[:, :, [2,1,0]]: I rearrange the color channels of the image. OpenCV loads images in BGR format, and this line swaps the channels to RGB; the indexing [2,1,0] gives the new channel order.
- fig, ax = plt.subplots(figsize=(8,8)): I create a new Matplotlib figure (fig) and axes (ax) for visualization, with the figure size set to 8 x 8 inches.
- ax.set_xticks([]); ax.set_yticks([]): I remove the tick marks on both axes for a cleaner appearance.
- ax.imshow(generated_img): I display the generated image on the axes with the imshow function.

This code segment lets me visually inspect a grid of generated samples from the GAN training process. It's useful for checking the quality and diversity of the generated samples at a particular epoch.
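As a side note (an equivalent approach, not from the original code), OpenCV's built-in color conversion can perform the same BGR-to-RGB swap:

# Equivalent BGR -> RGB conversion using cv2.cvtColor instead of channel indexing.
generated_img = cv2.cvtColor(cv2.imread(f'./generated/generated-images-{epochs:04d}.png'),
                             cv2.COLOR_BGR2RGB)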
import pickle

generator_model = generator

# Load the generator weights saved at the end of training (fit() saved a state_dict to 'generator.pth')
generator_model.load_state_dict(torch.load('generator.pth'))

# Serialize the PyTorch generator model using pickle
with open('generator_model.pkl', 'wb') as f:
    pickle.dump(generator_model, f)

# Deserialize the PyTorch generator model using pickle
with open('generator_model.pkl', 'rb') as f:
    loaded_generator = pickle.load(f)
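To confirm the round trip worked, a short check (not part of the original code) can sample a fresh batch of faces from the deserialized generator and display them with the helper defined earlier:

# Hypothetical check: sample new faces from the deserialized generator.
loaded_generator.eval()
with torch.no_grad():
    z = torch.randn(16, LATENT_SIZE, 1, 1, device=device)
    samples = loaded_generator(z)
show_images(samples.cpu())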
I've also created a user-friendly version of the DCGAN Anime Face Generator using Streamlit, which lets users generate anime faces with a single click.
To explore the Streamlit version, click the button below:
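For reference, here is a minimal, hypothetical sketch of what such a Streamlit app could look like; the file name app.py, the pickled-model path, and the overall structure are assumptions for illustration, not the author's actual app:

# app.py -- hypothetical minimal Streamlit front end for the trained generator.
import pickle
import streamlit as st
import torch
from torchvision.utils import make_grid

LATENT_SIZE = 128

@st.cache_resource
def load_generator():
    # Assumes the pickled generator from above is available next to the app.
    with open('generator_model.pkl', 'rb') as f:
        return pickle.load(f).cpu().eval()

st.title('DCGAN Anime Face Generator')
if st.button('Generate faces'):
    gen = load_generator()
    with torch.no_grad():
        z = torch.randn(16, LATENT_SIZE, 1, 1)
        images = gen(z) * 0.5 + 0.5              # denormalize back to [0, 1]
    grid = make_grid(images, nrow=4).permute(1, 2, 0).numpy()
    st.image(grid, clamp=True)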