import os, cv2, math, random, pydot, graphviz
import numpy as np
import datetime as dt
import tensorflow as tf
import matplotlib.pyplot as plt
from pytube import YouTube
from moviepy.editor import *
from collections import deque
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import *
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical, plot_model
from tensorflow.keras.callbacks import EarlyStopping
%matplotlib inline
I have utilized several Python libraries for this project:
os: for operating system-related functionalities.
cv2: OpenCV for image and video processing.
math and random: for mathematical operations and random number generation.
numpy: for numerical operations on arrays.
datetime: for working with dates and times.
tensorflow: the main deep learning framework for building and training the model.
matplotlib.pyplot: for plotting graphs and visualizations.
collections.deque: a double-ended queue for efficiently adding and removing elements.
pytube (from pytube import YouTube): used for downloading YouTube videos for testing the model.
moviepy.editor: for video editing operations.
sklearn.model_selection.train_test_split: for splitting the dataset into training and testing sets.
tensorflow.keras: including layers, models, utilities, and callbacks.
seed_number = 27
np.random.seed(seed_number)
random.seed(seed_number)
tf.random.set_seed(seed_number)
I established the seed values for NumPy, Python, and TensorFlow to ensure consistent results with each execution.
In the initial step, we will visually explore the data and associated labels to gain insight into the nature of the dataset. The dataset used is the UCF50 - Action Recognition Dataset, which stands out for its realistic videos sourced from YouTube. This characteristic distinguishes it from many other action recognition datasets, which often feature staged performances by actors. The dataset comprises:
50 Action Categories
25 Groups of Videos per Action Category
133 Average Videos per Action Category
199 Average Number of Frames per Video
320 Average Frame Width per Video
240 Average Frame Height per Video
26 Average Frames Per Second per Video
For the visualization process, we will randomly select 20 categories from the dataset. Subsequently, for each selected category, we will choose a random video and visualize the first frame, along with the corresponding labels. This approach allows us to gain insights into a representative subset (20 random videos) of the entire dataset.
# Generate a Matplotlib figure and define its dimensions.
plt.figure(figsize=(20, 20))
# Retrieve the names of every class or category within UCF50 Dataset.
all_classes_names = os.listdir('dataset/UCF50')
# Produce a list comprising 20 randomly generated values. These values will fall
# within the range of 0 to 50, with 50 representing the total number of classes
random_range = random.sample(range(len(all_classes_names)), 20)
# Looping through each of the randomly generated values.
for counter, random_index in enumerate(random_range, 1):
    # Get the name of the selected class.
    selected_class_name = all_classes_names[random_index]
    # Get a list of video files for the selected class.
    video_files_names_list = os.listdir(f'dataset/UCF50/{selected_class_name}')
    # Choose a random video file from the list.
    selected_video_file_name = random.choice(video_files_names_list)
    # Read the first frame from the selected video file.
    video_reader = cv2.VideoCapture(f'dataset/UCF50/{selected_class_name}/{selected_video_file_name}')
    _, bgr_frame = video_reader.read()
    video_reader.release()
    # Convert the frame from BGR to RGB.
    rgb_frame = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)
    # Add the class name as text on the frame.
    cv2.putText(rgb_frame, selected_class_name, (10, 30), cv2.FONT_HERSHEY_TRIPLEX, 1, (255, 255, 255), 2)
    # Display the frame in a subplot on the Matplotlib figure.
    plt.subplot(5, 4, counter); plt.imshow(rgb_frame); plt.axis('off')
Following this, we will conduct preprocessing on the dataset. Initially, we'll read the video files from the dataset and resize the frames of the videos to a consistent width and height. This resizing serves to decrease computational requirements. Additionally, we'll normalize the data to the range [0, 1] by dividing the pixel values by 255. This normalization accelerates convergence during the network training process.
However, before proceeding, let's set up some constants.
# Set the dimensions to which each video frame will be resized in our dataset.
IMAGE_HEIGHT, IMAGE_WIDTH = 64, 64
# Define the number of frames per video sequence fed to the model.
SEQUENCE_LENGTH = 20
# Designate the directory containing the UCF50 dataset.
DATASET_DIR = "dataset/UCF50"
# Specify the list that holds the names of the classes intended for training. Feel free to select any desired set of classes.
CLASSES_LIST = ["WalkingWithDog", "TaiChi", "Swing", "HorseRace", "Basketball", "PushUps"]
Note: The constants IMAGE_HEIGHT, IMAGE_WIDTH, and SEQUENCE_LENGTH can be adjusted for improved results. However, increasing the sequence length helps only up to a certain point; beyond that, larger values mainly lead to higher computational costs.
We will establish a function called frames_extraction() that generates a list containing resized and normalized frames from a video specified as its argument. The function does not keep every frame: it seeks to SEQUENCE_LENGTH positions spaced evenly across the video and reads one frame at each position.
def frames_extraction(video_path):
    '''
    This function extracts the necessary frames from a video after resizing and normalization.
    Parameters:
        video_path: The disk path of the video from which frames are to be extracted.
    Returns:
        frames_list: A list containing the resized and normalized frames of the video.
    '''
    # Initialize an empty list to store frames.
    frames_list = []
    # Open the video file using OpenCV's VideoCapture.
    video_reader = cv2.VideoCapture(video_path)
    # Obtain the total number of frames in the video.
    video_frames_count = int(video_reader.get(cv2.CAP_PROP_FRAME_COUNT))
    # Calculate the window for skipping frames based on the desired sequence length.
    skip_frames_window = max(int(video_frames_count / SEQUENCE_LENGTH), 1)
    # Iterate through the frames to extract the required sequence.
    for frame_counter in range(SEQUENCE_LENGTH):
        # Set the position to read a specific frame based on the skip_frames_window.
        video_reader.set(cv2.CAP_PROP_POS_FRAMES, frame_counter * skip_frames_window)
        # Read the frame from the video.
        success, frame = video_reader.read()
        # Check for read success; if not, print an error message and exit the loop.
        if not success:
            print("Error reading video frames")
            break
        # Resize the frame to the specified dimensions. cv2.resize expects (width, height).
        resized_frame = cv2.resize(frame, (IMAGE_WIDTH, IMAGE_HEIGHT))
        # Normalize the pixel values to the range [0, 1].
        normalized_frame = resized_frame / 255
        # Append the normalized frame to the frames_list.
        frames_list.append(normalized_frame)
    # Release the video reader.
    video_reader.release()
    # Return the list containing the resized and normalized frames.
    return frames_list
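The index arithmetic used by frames_extraction() can be tried out without OpenCV. The sketch below is only an illustration: sampled_frame_indices() and readable_indices() are hypothetical helpers that do not appear in the notebook, and they assume the frame count reported by the video file is accurate.

```python
def sampled_frame_indices(total_frames, sequence_length=20):
    """Return the frame positions that frames_extraction() would seek to."""
    # Same arithmetic as skip_frames_window in frames_extraction().
    skip_frames_window = max(int(total_frames / sequence_length), 1)
    return [i * skip_frames_window for i in range(sequence_length)]


def readable_indices(total_frames, sequence_length=20):
    """Keep only the positions that exist in a video with total_frames frames."""
    return [i for i in sampled_frame_indices(total_frames, sequence_length)
            if i < total_frames]


# A 199-frame video (the dataset's average length) is sampled every 9 frames.
print(sampled_frame_indices(199)[:5])   # [0, 9, 18, 27, 36]
# A 12-frame video has only 12 readable positions, fewer than SEQUENCE_LENGTH.
print(len(readable_indices(12)))        # 12
```

Videos shorter than SEQUENCE_LENGTH frames therefore return fewer than 20 frames, which matters for any later step that expects full sequences.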
Next, we will establish a function named create_dataset(). This function will systematically go through all the classes listed in the CLASSES_LIST constant. For each class, it will invoke the frames_extraction() function on every video file associated with that class, keeping only videos that yield a full sequence of SEQUENCE_LENGTH frames. The function will then return the frames (features), class indices (labels), and file paths of the video files (video_files_paths).
def create_dataset():
    '''
    This function will gather the data from the chosen classes and generate the necessary dataset.
    Returns:
        features: A list containing the extracted frames of the videos.
        labels: A list containing the indices of the classes corresponding to the videos.
        video_files_paths: A list containing the paths of the videos on the disk.
    '''
    # Initialize lists to store dataset components.
    features = []
    labels = []
    video_files_paths = []
    # Iterate through each class in the CLASSES_LIST.
    for class_index, class_name in enumerate(CLASSES_LIST):
        print(f'Extracting Data of Class: {class_name}')
        # Retrieve the list of files in the current class directory.
        files_list = os.listdir(os.path.join(DATASET_DIR, class_name))
        # Iterate through each file in the class.
        for file_name in files_list:
            # Construct the full path to the video file.
            video_file_path = os.path.join(DATASET_DIR, class_name, file_name)
            # Extract frames from the video file using the frames_extraction function.
            frames = frames_extraction(video_file_path)
            # Check if the extracted frames match the desired sequence length.
            if len(frames) == SEQUENCE_LENGTH:
                # Append the frames, class index, and video file path to the respective lists.
                features.append(frames)
                labels.append(class_index)
                video_files_paths.append(video_file_path)
    # Convert lists to numpy arrays.
    features = np.asarray(features)
    labels = np.array(labels)
    # Return the dataset components.
    return features, labels, video_files_paths
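Because every frame is divided by 255, NumPy stores the features as 64-bit floats, so the array can become large. The following back-of-the-envelope sketch estimates its size. The helper features_memory_bytes() is hypothetical, and the video count of 730 is an assumption made for illustration, since the notebook does not report the number of videos.

```python
def features_memory_bytes(num_videos, sequence_length=20, height=64, width=64,
                          channels=3, bytes_per_value=8):
    """Size in bytes of an array shaped (videos, frames, height, width, channels)."""
    return num_videos * sequence_length * height * width * channels * bytes_per_value


num_videos = 730  # hypothetical count, not reported in the notebook

# float64 is what NumPy produces when a uint8 frame is divided by 255.
float64_bytes = features_memory_bytes(num_videos)
# float32 is what the array would occupy after features.astype(np.float32).
float32_bytes = features_memory_bytes(num_videos, bytes_per_value=4)

print(f"float64: {float64_bytes / 2**30:.2f} GiB")
print(f"float32: {float32_bytes / 2**30:.2f} GiB")
```

If memory becomes a constraint when more classes are added, casting the features to float32 is one option to consider, since it halves the footprint.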
Now, we will employ the previously defined create_dataset() function to gather data from the chosen classes and generate the necessary dataset.
# Generate the dataset.
features, labels, video_files_paths = create_dataset()
Extracting Data of Class: WalkingWithDog
Extracting Data of Class: TaiChi
Extracting Data of Class: Swing
Extracting Data of Class: HorseRace
Extracting Data of Class: Basketball
Extracting Data of Class: PushUps
Next, we will transform the labels (representing class indexes) into vectors using one-hot encoding.
# Utilizing Keras's `to_categorical` function to convert labels into vectors through one-hot encoding.
one_hot_encoded_labels = to_categorical(labels)
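One-hot encoding maps each class index to a vector that contains a single 1 at that index. Here is a minimal NumPy sketch of the same mapping, where one_hot() is a hypothetical helper written for illustration and not a Keras function:

```python
import numpy as np


def one_hot(labels, num_classes):
    """Map integer class indices to one-hot rows, similar to to_categorical()."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[np.asarray(labels)]


example_labels = [0, 2, 5]            # WalkingWithDog, Swing, PushUps in CLASSES_LIST
encoded = one_hot(example_labels, num_classes=6)
print(encoded.shape)                  # (3, 6)
print(encoded[1])                     # [0. 0. 1. 0. 0. 0.]
```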
At this point, we possess the essential features (a NumPy array comprising all extracted video frames) and one_hot_encoded_labels (another NumPy array containing class labels in one-hot-encoded format). Consequently, we will partition our data to establish training and testing sets. Prior to the split, we will shuffle the dataset to prevent any bias and ensure the splits accurately represent the overall data distribution.
# Partition the data into a training set (75%) and a test set (25%).
features_train, features_test, labels_train, labels_test = train_test_split(features,
one_hot_encoded_labels,
test_size = 0.25,
shuffle = True,
random_state = seed_number)
In this stage, we will execute the initial approach by employing a combination of ConvLSTM cells. The ConvLSTM cell represents a variation of an LSTM network, incorporating convolution operations within the network architecture. Essentially, it is an LSTM structure with embedded convolutions, enabling it to discern spatial features in the data while considering the temporal relationships.
For video classification, this methodology captures both the spatial relations within individual frames and the temporal relations across frames. A ConvLSTM accepts 3-dimensional input (height, width, num_of_channels) at each time step, whereas a plain LSTM accepts only a 1-dimensional vector. Consequently, an LSTM alone is unsuitable for modeling spatiotemporal data.
For further insights into this architecture, refer to the paper Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting by Xingjian Shi et al. (NIPS 2015).
To build the model, we will employ the Keras ConvLSTM2D recurrent layers. The ConvLSTM2D layer necessitates the specification of the number of filters and the kernel size for implementing the convolutional operations. The output of these layers is ultimately flattened and supplied to the Dense layer with softmax activation, producing the probabilities for each action category.
Additionally, MaxPooling3D layers will be utilized to reduce frame dimensions, minimizing unnecessary computations. Dropout layers will also be incorporated to mitigate overfitting risks associated with the model learning the training data too precisely. The architecture is intentionally kept simple, containing a modest number of trainable parameters. This choice is deliberate, given that we are working with a limited subset of the dataset, which does not demand an expansive model.
def create_convlstm_model():
    '''
    This function will build the necessary ConvLSTM model.
    Returns:
        model: The completed ConvLSTM model as required.
    '''
    # Initialize a Sequential model.
    model = Sequential()
    # Add a ConvLSTM2D layer with specified parameters.
    model.add(ConvLSTM2D(filters=4, kernel_size=(3, 3), activation='tanh',
                         data_format="channels_last", recurrent_dropout=0.2,
                         return_sequences=True,
                         input_shape=(SEQUENCE_LENGTH, IMAGE_HEIGHT, IMAGE_WIDTH, 3)))
    # Add a MaxPooling3D layer with specified parameters.
    model.add(MaxPooling3D(pool_size=(1, 2, 2), padding='same',
                           data_format='channels_last'))
    # Add a TimeDistributed layer with Dropout for regularization.
    model.add(TimeDistributed(Dropout(0.2)))
    # Repeat the pattern with additional ConvLSTM, MaxPooling3D, and TimeDistributed layers.
    # Note: Recurrent dropout is used for regularization.
    model.add(ConvLSTM2D(filters=8, kernel_size=(3, 3), activation='tanh',
                         data_format="channels_last",
                         recurrent_dropout=0.2, return_sequences=True))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), padding='same',
                           data_format='channels_last'))
    model.add(TimeDistributed(Dropout(0.2)))
    model.add(ConvLSTM2D(filters=14, kernel_size=(3, 3), activation='tanh',
                         data_format="channels_last",
                         recurrent_dropout=0.2, return_sequences=True))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), padding='same',
                           data_format='channels_last'))
    model.add(TimeDistributed(Dropout(0.2)))
    model.add(ConvLSTM2D(filters=16, kernel_size=(3, 3), activation='tanh',
                         data_format="channels_last",
                         recurrent_dropout=0.2, return_sequences=True))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), padding='same', data_format='channels_last'))
    # Flatten the output.
    model.add(Flatten())
    # Add a Dense layer with softmax activation for classification.
    model.add(Dense(len(CLASSES_LIST), activation="softmax"))
    # Display model summary.
    model.summary()
    return model
Now, we will employ the previously defined function create_convlstm_model() to build the necessary convlstm model.
# Construct the required convlstm model.
convlstm_model = create_convlstm_model()
# Display the success message.
print("Model Created Successfully!")
Model: "sequential_2"
_________________________________________________________________
Layer (type)                           Output Shape             Param #
=================================================================
conv_lstm2d_4 (ConvLSTM2D)             (None, 20, 62, 62, 4)    1024
max_pooling3d_4 (MaxPooling3D)         (None, 20, 31, 31, 4)    0
time_distributed_15 (TimeDistributed)  (None, 20, 31, 31, 4)    0
conv_lstm2d_5 (ConvLSTM2D)             (None, 20, 29, 29, 8)    3488
max_pooling3d_5 (MaxPooling3D)         (None, 20, 15, 15, 8)    0
time_distributed_16 (TimeDistributed)  (None, 20, 15, 15, 8)    0
conv_lstm2d_6 (ConvLSTM2D)             (None, 20, 13, 13, 14)   11144
max_pooling3d_6 (MaxPooling3D)         (None, 20, 7, 7, 14)     0
time_distributed_17 (TimeDistributed)  (None, 20, 7, 7, 14)     0
conv_lstm2d_7 (ConvLSTM2D)             (None, 20, 5, 5, 16)     17344
max_pooling3d_7 (MaxPooling3D)         (None, 20, 3, 3, 16)     0
flatten_2 (Flatten)                    (None, 2880)             0
dense_2 (Dense)                        (None, 6)                17286
=================================================================
Total params: 50,286
Trainable params: 50,286
Non-trainable params: 0
_________________________________________________________________
Model Created Successfully!
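The shapes and parameter counts in this summary can be reproduced with a little arithmetic. The sketch below is an illustration, not part of the notebook: the helper names are hypothetical, and it assumes the layer settings used in create_convlstm_model() (3 x 3 kernels, 'valid' convolution padding, 'same' pooling padding, biases enabled).

```python
import math


def convlstm2d_params(in_channels, filters, kernel=3):
    """Trainable parameters of a ConvLSTM2D layer with bias.

    Four gates, each convolving the input (in_channels) and the previous
    hidden state (filters) with a kernel x kernel window, plus one bias
    per filter and gate.
    """
    return 4 * filters * (kernel * kernel * (in_channels + filters) + 1)


def conv_valid(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_same(size, pool=2):
    """Spatial size after pooling with padding='same' and stride equal to pool."""
    return math.ceil(size / pool)


size, channels, total = 64, 3, 0
for filters in (4, 8, 14, 16):
    total += convlstm2d_params(channels, filters)
    size = pool_same(conv_valid(size))
    channels = filters

flattened = 20 * size * size * channels      # SEQUENCE_LENGTH * height * width * channels
total += flattened * 6 + 6                   # Dense layer: weights plus biases for 6 classes

print(size, flattened, total)                # 3 2880 50286
```

Most of the parameters sit in the last ConvLSTM2D layer and the Dense layer, which is worth keeping in mind when changing IMAGE_HEIGHT, IMAGE_WIDTH, or SEQUENCE_LENGTH, because the flattened size scales with all three.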
Following this, we will introduce an early stopping callback to mitigate the risk of overfitting. Subsequently, we will commence the training process after compiling the model.
# Define an early stopping callback to monitor validation loss, with patience for improvement and restoration of best weights.
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=10, mode='min', restore_best_weights=True)
# Compile the ConvLSTM model with categorical crossentropy loss, Adam optimizer, and accuracy as a metric.
convlstm_model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['Accuracy'])
# Train the ConvLSTM model using training data, specifying epochs, batch size, shuffle, validation split, and callbacks.
convlstm_model_training_history = convlstm_model.fit(x=features_train, y=labels_train, epochs=20,
batch_size=4, shuffle=True, validation_split=0.2,
callbacks=[early_stopping_callback])
Epoch 1/20
110/110 [==============================] - 123s 999ms/step - loss: 1.8008 - Accuracy: 0.1899 - val_loss: 1.7841 - val_Accuracy: 0.1909
Epoch 2/20
110/110 [==============================] - 98s 887ms/step - loss: 1.7521 - Accuracy: 0.2471 - val_loss: 1.7311 - val_Accuracy: 0.3091
Epoch 3/20
110/110 [==============================] - 97s 878ms/step - loss: 1.6553 - Accuracy: 0.3021 - val_loss: 1.6648 - val_Accuracy: 0.2636
Epoch 4/20
110/110 [==============================] - 98s 888ms/step - loss: 1.4408 - Accuracy: 0.4119 - val_loss: 1.4624 - val_Accuracy: 0.4364
Epoch 5/20
110/110 [==============================] - 96s 876ms/step - loss: 1.1626 - Accuracy: 0.5378 - val_loss: 1.2383 - val_Accuracy: 0.5636
Epoch 6/20
110/110 [==============================] - 96s 876ms/step - loss: 0.8829 - Accuracy: 0.6773 - val_loss: 1.3677 - val_Accuracy: 0.5636
Epoch 7/20
110/110 [==============================] - 95s 868ms/step - loss: 0.7455 - Accuracy: 0.7323 - val_loss: 1.0973 - val_Accuracy: 0.6364
Epoch 8/20
110/110 [==============================] - 97s 878ms/step - loss: 0.4981 - Accuracy: 0.8307 - val_loss: 1.1433 - val_Accuracy: 0.6273
Epoch 9/20
110/110 [==============================] - 97s 879ms/step - loss: 0.4043 - Accuracy: 0.8673 - val_loss: 1.0328 - val_Accuracy: 0.6909
Epoch 10/20
110/110 [==============================] - 96s 871ms/step - loss: 0.3348 - Accuracy: 0.8947 - val_loss: 1.0383 - val_Accuracy: 0.7273
Epoch 11/20
110/110 [==============================] - 97s 879ms/step - loss: 0.2660 - Accuracy: 0.9153 - val_loss: 1.3968 - val_Accuracy: 0.6182
Epoch 12/20
110/110 [==============================] - 96s 876ms/step - loss: 0.2454 - Accuracy: 0.9153 - val_loss: 1.1003 - val_Accuracy: 0.6727
Epoch 13/20
110/110 [==============================] - 97s 883ms/step - loss: 0.3046 - Accuracy: 0.9016 - val_loss: 0.8931 - val_Accuracy: 0.7182
Epoch 14/20
110/110 [==============================] - 96s 877ms/step - loss: 0.0970 - Accuracy: 0.9680 - val_loss: 1.1406 - val_Accuracy: 0.7182
Epoch 15/20
110/110 [==============================] - 96s 875ms/step - loss: 0.0692 - Accuracy: 0.9840 - val_loss: 1.2246 - val_Accuracy: 0.7182
Epoch 16/20
110/110 [==============================] - 96s 873ms/step - loss: 0.0728 - Accuracy: 0.9703 - val_loss: 1.1869 - val_Accuracy: 0.6636
Epoch 17/20
110/110 [==============================] - 97s 882ms/step - loss: 0.1159 - Accuracy: 0.9611 - val_loss: 1.3929 - val_Accuracy: 0.6909
Epoch 18/20
110/110 [==============================] - 96s 875ms/step - loss: 0.0582 - Accuracy: 0.9886 - val_loss: 1.1369 - val_Accuracy: 0.7000
Epoch 19/20
110/110 [==============================] - 96s 876ms/step - loss: 0.0206 - Accuracy: 0.9954 - val_loss: 1.1694 - val_Accuracy: 0.7273
Epoch 20/20
110/110 [==============================] - 95s 862ms/step - loss: 0.0153 - Accuracy: 0.9977 - val_loss: 1.2719 - val_Accuracy: 0.7273
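The 110 steps per epoch in the log above follow from the split settings. The sketch below works through the arithmetic under one loud assumption: the total of 730 usable videos is hypothetical. The notebook never reports it; the number was chosen only because it is consistent with the step counts shown in the logs.

```python
import math

total_videos = 730          # hypothetical; consistent with the logs, not reported by the notebook
test_size, validation_split, batch_size = 0.25, 0.2, 4

# train_test_split rounds the test share up and gives the rest to training.
n_test = math.ceil(test_size * total_videos)
n_train_full = total_videos - n_test

# Keras holds out the last validation_split fraction of the training data.
n_train = math.floor(n_train_full * (1 - validation_split))
n_val = n_train_full - n_train

# Each epoch processes the training samples in batches of batch_size.
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_test, n_train, n_val, steps_per_epoch)   # 183 437 110 110
```

Under the same assumption, evaluating 183 test videos with the default batch size of 32 would take ceil(183 / 32) = 6 steps, which agrees with the 6/6 shown later in the evaluation output.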
# Define an early stopping callback to monitor validation loss, with patience for improvement and restoration of best weights.
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=10, mode='min', restore_best_weights=True)
# Compile the ConvLSTM model with categorical crossentropy loss, Adam optimizer, and accuracy as a metric.
convlstm_model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['Accuracy'])
# Train the ConvLSTM model using training data, specifying epochs, batch size, shuffle, validation split, and callbacks.
convlstm_model_training_history = convlstm_model.fit(x=features_train, y=labels_train, epochs=50,
batch_size=4, shuffle=True, validation_split=0.2,
callbacks=[early_stopping_callback])
Epoch 1/50
110/110 [==============================] - 201s 2s/step - loss: 1.7990 - Accuracy: 0.2014 - val_loss: 1.7638 - val_Accuracy: 0.3273
Epoch 2/50
110/110 [==============================] - 176s 2s/step - loss: 1.6709 - Accuracy: 0.3478 - val_loss: 1.4846 - val_Accuracy: 0.4273
Epoch 3/50
110/110 [==============================] - 182s 2s/step - loss: 1.4404 - Accuracy: 0.4394 - val_loss: 1.3247 - val_Accuracy: 0.4545
Epoch 4/50
110/110 [==============================] - 182s 2s/step - loss: 1.1813 - Accuracy: 0.5217 - val_loss: 1.3389 - val_Accuracy: 0.4909
Epoch 5/50
110/110 [==============================] - 188s 2s/step - loss: 1.0066 - Accuracy: 0.5950 - val_loss: 1.0212 - val_Accuracy: 0.6091
Epoch 6/50
110/110 [==============================] - 184s 2s/step - loss: 0.7437 - Accuracy: 0.7300 - val_loss: 0.9209 - val_Accuracy: 0.6545
Epoch 7/50
110/110 [==============================] - 184s 2s/step - loss: 0.6540 - Accuracy: 0.7757 - val_loss: 0.8339 - val_Accuracy: 0.7545
Epoch 8/50
110/110 [==============================] - 189s 2s/step - loss: 0.4412 - Accuracy: 0.8330 - val_loss: 0.8365 - val_Accuracy: 0.7455
Epoch 9/50
110/110 [==============================] - 186s 2s/step - loss: 0.4046 - Accuracy: 0.8467 - val_loss: 0.8543 - val_Accuracy: 0.7091
Epoch 10/50
110/110 [==============================] - 186s 2s/step - loss: 0.3022 - Accuracy: 0.8879 - val_loss: 0.9104 - val_Accuracy: 0.7364
Epoch 11/50
110/110 [==============================] - 187s 2s/step - loss: 0.2530 - Accuracy: 0.9245 - val_loss: 1.1266 - val_Accuracy: 0.6727
Epoch 12/50
110/110 [==============================] - 184s 2s/step - loss: 0.1212 - Accuracy: 0.9703 - val_loss: 0.9275 - val_Accuracy: 0.7818
Epoch 13/50
110/110 [==============================] - 182s 2s/step - loss: 0.1748 - Accuracy: 0.9474 - val_loss: 0.9752 - val_Accuracy: 0.7545
Epoch 14/50
110/110 [==============================] - 177s 2s/step - loss: 0.0896 - Accuracy: 0.9748 - val_loss: 0.9548 - val_Accuracy: 0.7636
Epoch 15/50
110/110 [==============================] - 178s 2s/step - loss: 0.0528 - Accuracy: 0.9908 - val_loss: 1.0907 - val_Accuracy: 0.7545
Epoch 16/50
110/110 [==============================] - 177s 2s/step - loss: 0.0320 - Accuracy: 0.9908 - val_loss: 1.0649 - val_Accuracy: 0.7818
Epoch 17/50
110/110 [==============================] - 177s 2s/step - loss: 0.0513 - Accuracy: 0.9863 - val_loss: 1.1535 - val_Accuracy: 0.7727
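The 50-epoch run above ends after epoch 17, which is what the early stopping settings would lead us to expect. The sketch below replays the logged val_loss values through a simplified version of the patience rule. early_stopping_epoch() is a hypothetical helper that only approximates the callback's behavior (it assumes the default minimum improvement of zero and ignores weight restoration).

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does.

    Simplified rule: stop once `patience` consecutive epochs pass without the
    validation loss dropping below the best value seen so far.
    """
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_loss values copied from the 50-epoch log above.
val_losses = [1.7638, 1.4846, 1.3247, 1.3389, 1.0212, 0.9209, 0.8339, 0.8365,
              0.8543, 0.9104, 1.1266, 0.9275, 0.9752, 0.9548, 1.0907, 1.0649,
              1.1535]

best_epoch = val_losses.index(min(val_losses)) + 1
print(best_epoch)                                       # 7
print(early_stopping_epoch(val_losses, patience=10))    # 17
```

The lowest validation loss occurs at epoch 7, and ten epochs pass without improvement before training halts. With restore_best_weights=True, the weights used afterwards should therefore be those from epoch 7 rather than from the final epoch.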
Following the training process, we will assess the model's performance on the test set.
model_evaluation_history = convlstm_model.evaluate(features_test, labels_test)
6/6 [==============================] - 5s 826ms/step - loss: 1.1096 - Accuracy: 0.8087
At this point, we will store the model to circumvent the necessity of training it anew for each use.
# Retrieve loss and accuracy from the model evaluation history.
model_evaluation_loss, model_evaluation_accuracy = model_evaluation_history
# Define the date and time format.
date_time_format = '%Y_%m_%d_%H_%M_%S'
# Obtain the current date and time.
current_date_time_dt = dt.datetime.now()
# Convert the date and time to a string with the specified format.
current_date_time_string = dt.datetime.strftime(current_date_time_dt, date_time_format)
# Construct a unique file name based on date, time, loss, and accuracy.
model_file_name = f'convlstm_model_Date_Time_{current_date_time_string}___Loss_{model_evaluation_loss}___Accuracy_{model_evaluation_accuracy}.h5'
# Save the ConvLSTM model with the generated file name.
convlstm_model.save(model_file_name)
In this step, we will establish a function called plot_metric() to depict the training and validation metrics. As we already possess distinct metrics from our training and validation steps, the next step involves visualizing these metrics.
def plot_metric(model_training_history, metric_name_1, metric_name_2, plot_name):
    '''
    This function is designed to create a graph displaying the provided metrics.
    Parameters:
        model_training_history: A history object containing recorded training and validation
                                loss values and metric values across consecutive epochs.
        metric_name_1: The name of the first metric to be visualized in the graph.
        metric_name_2: The name of the second metric to be visualized in the graph.
        plot_name: The title of the graph.
    '''
    # Extract metric values from the training history.
    metric_value_1 = model_training_history.history[metric_name_1]
    metric_value_2 = model_training_history.history[metric_name_2]
    # Generate a range of epochs for x-axis.
    epochs = range(len(metric_value_1))
    # Plot the first metric in blue.
    plt.plot(epochs, metric_value_1, 'blue', label=metric_name_1)
    # Plot the second metric in red.
    plt.plot(epochs, metric_value_2, 'red', label=metric_name_2)
    # Set the title of the graph.
    plt.title(str(plot_name))
    # Add a legend to the graph.
    plt.legend()
Now, we will employ the previously defined function plot_metric() to create a visual representation and comprehend the metrics.
# Plot the training and validation loss metrics for visualization.
plot_metric(convlstm_model_training_history, 'loss', 'val_loss', 'Total Loss vs Total Validation Loss')
# Plot the training and validation accuracy metrics for visualization.
plot_metric(convlstm_model_training_history, 'Accuracy', 'val_Accuracy', 'Total Accuracy vs Total Validation Accuracy')
In this stage, we will implement the LRCN approach, which combines convolutional and LSTM layers within a single model. An alternative is to use two separate models: a CNN (for example, a pre-trained network fine-tuned for the task) extracts spatial features from the frames, and an LSTM then uses those extracted features to predict the action in the video.
However, we will implement the Long-term Recurrent Convolutional Network (LRCN) approach, wherein CNN and LSTM layers are integrated into a unified model. Convolutional layers facilitate spatial feature extraction from video frames, and these spatial features are then input to LSTM layer(s) at each time-step for temporal sequence modeling. This approach enables the network to directly learn spatiotemporal features in an end-to-end training fashion, resulting in a robust model.
For a detailed understanding of this architecture, refer to the paper Long-term Recurrent Convolutional Networks for Visual Recognition and Description by Jeff Donahue et al. (CVPR 2015).
Additionally, we will incorporate the TimeDistributed wrapper layer, enabling the application of the same layer to each frame of the video independently. This wrapper adapts a layer's input shape from (height, width, num_of_channels) to (no_of_frames, height, width, num_of_channels). This capability is advantageous as it allows the entire video to be input into the model in a single step.
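The idea behind applying one layer to every frame can be illustrated in plain NumPy. The sketch below is not how Keras implements the wrapper; time_distributed() and max_pool_2x2() are hypothetical helpers that only demonstrate the shape handling: merge the batch and frame axes, apply a per-image operation, then split the axes again.

```python
import numpy as np


def time_distributed(frame_fn, batch):
    """Apply frame_fn to every frame of a (batch, frames, H, W, C) array."""
    n, t = batch.shape[:2]
    # Merge batch and time so frame_fn sees an ordinary batch of images.
    merged = batch.reshape((n * t,) + batch.shape[2:])
    out = frame_fn(merged)
    # Split the leading axis back into (batch, frames).
    return out.reshape((n, t) + out.shape[1:])


def max_pool_2x2(images):
    """2x2 max pooling over a (N, H, W, C) array with even H and W."""
    n, h, w, c = images.shape
    return images.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))


videos = np.random.rand(2, 20, 64, 64, 3)      # 2 videos of 20 frames each
pooled = time_distributed(max_pool_2x2, videos)
print(pooled.shape)                            # (2, 20, 32, 32, 3)
```

Each frame is processed with the same operation and the frame axis is preserved, so a recurrent layer placed afterwards still receives one feature map per time step.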
In the process of realizing our LRCN architecture, we will employ time-distributed Conv2D layers, succeeded by MaxPooling2D and Dropout layers. The features derived from the Conv2D layers will undergo flattening through the Flatten layer and subsequently be input to an LSTM layer. The predictions for the performed action will be made by a Dense layer with softmax activation, utilizing the output from the LSTM layer.
def create_LRCN_model():
    '''
    This function will build the necessary LRCN model.
    Returns:
        model: The completed LRCN model as required.
    '''
    # Utilize a Sequential model for model construction.
    model = Sequential()
    # TimeDistributed Conv2D layer with 16 filters, kernel size (3, 3), padding, and ReLU activation.
    model.add(TimeDistributed(Conv2D(16, (3, 3), padding='same', activation='relu'),
                              input_shape=(SEQUENCE_LENGTH, IMAGE_HEIGHT, IMAGE_WIDTH, 3)))
    # TimeDistributed MaxPooling2D layer with pool size (4, 4).
    model.add(TimeDistributed(MaxPooling2D((4, 4))))
    # TimeDistributed Dropout layer with dropout rate of 0.25.
    model.add(TimeDistributed(Dropout(0.25)))
    # TimeDistributed Conv2D layer with 32 filters, kernel size (3, 3), padding, and ReLU activation.
    model.add(TimeDistributed(Conv2D(32, (3, 3), padding='same', activation='relu')))
    # TimeDistributed MaxPooling2D layer with pool size (4, 4).
    model.add(TimeDistributed(MaxPooling2D((4, 4))))
    # TimeDistributed Dropout layer with dropout rate of 0.25.
    model.add(TimeDistributed(Dropout(0.25)))
    # TimeDistributed Conv2D layer with 64 filters, kernel size (3, 3), padding, and ReLU activation.
    model.add(TimeDistributed(Conv2D(64, (3, 3), padding='same', activation='relu')))
    # TimeDistributed MaxPooling2D layer with pool size (2, 2).
    model.add(TimeDistributed(MaxPooling2D((2, 2))))
    # TimeDistributed Dropout layer with dropout rate of 0.25.
    model.add(TimeDistributed(Dropout(0.25)))
    # TimeDistributed Conv2D layer with 64 filters, kernel size (3, 3), padding, and ReLU activation.
    model.add(TimeDistributed(Conv2D(64, (3, 3), padding='same', activation='relu')))
    # TimeDistributed MaxPooling2D layer with pool size (2, 2).
    model.add(TimeDistributed(MaxPooling2D((2, 2))))
    # TimeDistributed Flatten layer.
    model.add(TimeDistributed(Flatten()))
    # LSTM layer with 32 units.
    model.add(LSTM(32))
    # Dense layer with softmax activation for classification.
    model.add(Dense(len(CLASSES_LIST), activation='softmax'))
    # Display the model's summary.
    model.summary()
    # Return the constructed LRCN model.
    return model
Now we will use the previously defined function create_LRCN_model() to build the necessary LRCN model.
# Create an LRCN model using the specified function.
LRCN_model = create_LRCN_model()
# Print a success message indicating the model creation.
print("Model Created Successfully!")
Model: "sequential_1"
_________________________________________________________________
Layer (type)                           Output Shape             Param #
=================================================================
time_distributed_3 (TimeDistributed)   (None, 20, 64, 64, 16)   448
time_distributed_4 (TimeDistributed)   (None, 20, 16, 16, 16)   0
time_distributed_5 (TimeDistributed)   (None, 20, 16, 16, 16)   0
time_distributed_6 (TimeDistributed)   (None, 20, 16, 16, 32)   4640
time_distributed_7 (TimeDistributed)   (None, 20, 4, 4, 32)     0
time_distributed_8 (TimeDistributed)   (None, 20, 4, 4, 32)     0
time_distributed_9 (TimeDistributed)   (None, 20, 4, 4, 64)     18496
time_distributed_10 (TimeDistributed)  (None, 20, 2, 2, 64)     0
time_distributed_11 (TimeDistributed)  (None, 20, 2, 2, 64)     0
time_distributed_12 (TimeDistributed)  (None, 20, 2, 2, 64)     36928
time_distributed_13 (TimeDistributed)  (None, 20, 1, 1, 64)     0
time_distributed_14 (TimeDistributed)  (None, 20, 64)           0
lstm (LSTM)                            (None, 32)               12416
dense_1 (Dense)                        (None, 6)                198
=================================================================
Total params: 73,126
Trainable params: 73,126
Non-trainable params: 0
_________________________________________________________________
Model Created Successfully!
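The parameter counts in the LRCN summary can be checked by hand. The sketch below is an illustration with hypothetical helper names; it assumes the layer sizes from create_LRCN_model() (3 x 3 kernels, biases enabled, a 32-unit LSTM, and 6 output classes).

```python
def conv2d_params(in_channels, filters, kernel=3):
    """Weights plus biases of a Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weights plus biases of a Dense layer."""
    return input_dim * units + units


conv_layers = [(3, 16), (16, 32), (32, 64), (64, 64)]   # (in_channels, filters)
counts = [conv2d_params(c_in, f) for c_in, f in conv_layers]
counts.append(lstm_params(input_dim=64, units=32))       # 1 x 1 x 64 features per frame
counts.append(dense_params(input_dim=32, units=6))       # 6 classes

print(counts)        # [448, 4640, 18496, 36928, 12416, 198]
print(sum(counts))   # 73126
```

Because the pooling layers shrink each frame to a single 64-value vector before the LSTM, the recurrent and Dense layers stay small, and most of the parameters sit in the convolutional layers.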
Having inspected the model structure, we will now compile the model and start training.
# Define an early stopping callback to prevent overfitting.
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=15,
mode='min', restore_best_weights=True)
# Compile the LRCN model with categorical crossentropy loss and Adam optimizer.
LRCN_model.compile(loss='categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])
# Train the LRCN model on the training data.
LRCN_model_training_history = LRCN_model.fit(x=features_train, y=labels_train,
epochs=50, batch_size=4, shuffle=True,
validation_split=0.2, callbacks=[early_stopping_callback])
Epoch 1/50 110/110 [==============================] - 19s 130ms/step - loss: 1.8043 - accuracy: 0.1876 - val_loss: 1.7832 - val_accuracy: 0.1818
Epoch 2/50 110/110 [==============================] - 12s 107ms/step - loss: 1.7583 - accuracy: 0.2174 - val_loss: 1.7574 - val_accuracy: 0.2364
Epoch 3/50 110/110 [==============================] - 12s 107ms/step - loss: 1.6856 - accuracy: 0.2838 - val_loss: 1.6683 - val_accuracy: 0.3545
Epoch 4/50 110/110 [==============================] - 12s 109ms/step - loss: 1.5179 - accuracy: 0.3593 - val_loss: 1.6128 - val_accuracy: 0.3636
Epoch 5/50 110/110 [==============================] - 13s 117ms/step - loss: 1.4261 - accuracy: 0.4302 - val_loss: 1.5707 - val_accuracy: 0.3182
Epoch 6/50 110/110 [==============================] - 13s 120ms/step - loss: 1.3523 - accuracy: 0.4531 - val_loss: 1.4226 - val_accuracy: 0.4455
Epoch 7/50 110/110 [==============================] - 12s 107ms/step - loss: 1.3139 - accuracy: 0.4760 - val_loss: 1.3859 - val_accuracy: 0.4455
Epoch 8/50 110/110 [==============================] - 12s 108ms/step - loss: 1.2298 - accuracy: 0.4989 - val_loss: 1.4084 - val_accuracy: 0.4364
Epoch 9/50 110/110 [==============================] - 12s 108ms/step - loss: 1.1237 - accuracy: 0.5789 - val_loss: 1.3420 - val_accuracy: 0.4273
Epoch 10/50 110/110 [==============================] - 12s 108ms/step - loss: 1.0319 - accuracy: 0.6178 - val_loss: 1.3927 - val_accuracy: 0.4364
Epoch 11/50 110/110 [==============================] - 12s 108ms/step - loss: 0.9735 - accuracy: 0.6362 - val_loss: 1.4486 - val_accuracy: 0.4818
Epoch 12/50 110/110 [==============================] - 12s 107ms/step - loss: 0.8285 - accuracy: 0.6911 - val_loss: 1.0597 - val_accuracy: 0.6000
Epoch 13/50 110/110 [==============================] - 12s 107ms/step - loss: 0.7535 - accuracy: 0.7185 - val_loss: 1.1249 - val_accuracy: 0.6000
Epoch 14/50 110/110 [==============================] - 12s 108ms/step - loss: 0.6254 - accuracy: 0.7735 - val_loss: 1.1813 - val_accuracy: 0.5818
Epoch 15/50 110/110 [==============================] - 12s 108ms/step - loss: 0.8066 - accuracy: 0.7551 - val_loss: 0.8911 - val_accuracy: 0.6909
Epoch 16/50 110/110 [==============================] - 12s 108ms/step - loss: 0.5593 - accuracy: 0.7986 - val_loss: 1.1038 - val_accuracy: 0.6182
Epoch 17/50 110/110 [==============================] - 12s 109ms/step - loss: 0.6167 - accuracy: 0.7986 - val_loss: 0.9633 - val_accuracy: 0.6273
Epoch 18/50 110/110 [==============================] - 12s 107ms/step - loss: 0.5495 - accuracy: 0.8055 - val_loss: 1.2317 - val_accuracy: 0.6000
Epoch 19/50 110/110 [==============================] - 12s 108ms/step - loss: 0.4686 - accuracy: 0.8307 - val_loss: 0.7101 - val_accuracy: 0.7727
Epoch 20/50 110/110 [==============================] - 12s 107ms/step - loss: 0.4444 - accuracy: 0.8535 - val_loss: 0.6849 - val_accuracy: 0.7818
Epoch 21/50 110/110 [==============================] - 12s 109ms/step - loss: 0.5357 - accuracy: 0.8215 - val_loss: 0.7453 - val_accuracy: 0.7727
Epoch 22/50 110/110 [==============================] - 12s 108ms/step - loss: 0.3067 - accuracy: 0.9016 - val_loss: 0.6516 - val_accuracy: 0.7636
Epoch 23/50 110/110 [==============================] - 12s 108ms/step - loss: 0.2369 - accuracy: 0.9314 - val_loss: 0.5659 - val_accuracy: 0.8364
Epoch 24/50 110/110 [==============================] - 12s 109ms/step - loss: 0.1581 - accuracy: 0.9542 - val_loss: 0.5450 - val_accuracy: 0.8545
Epoch 25/50 110/110 [==============================] - 12s 107ms/step - loss: 0.2364 - accuracy: 0.9336 - val_loss: 0.9689 - val_accuracy: 0.6909
Epoch 26/50 110/110 [==============================] - 12s 108ms/step - loss: 0.1778 - accuracy: 0.9405 - val_loss: 0.6004 - val_accuracy: 0.8091
Epoch 27/50 110/110 [==============================] - 12s 108ms/step - loss: 0.1751 - accuracy: 0.9519 - val_loss: 0.5509 - val_accuracy: 0.8364
Epoch 28/50 110/110 [==============================] - 12s 107ms/step - loss: 0.1402 - accuracy: 0.9680 - val_loss: 0.7420 - val_accuracy: 0.8091
Epoch 29/50 110/110 [==============================] - 12s 107ms/step - loss: 0.1340 - accuracy: 0.9680 - val_loss: 0.5523 - val_accuracy: 0.8545
Epoch 30/50 110/110 [==============================] - 12s 110ms/step - loss: 0.1950 - accuracy: 0.9382 - val_loss: 1.0864 - val_accuracy: 0.7000
Epoch 31/50 110/110 [==============================] - 12s 108ms/step - loss: 0.2675 - accuracy: 0.9130 - val_loss: 0.7155 - val_accuracy: 0.8000
Epoch 32/50 110/110 [==============================] - 12s 109ms/step - loss: 0.1619 - accuracy: 0.9451 - val_loss: 0.8362 - val_accuracy: 0.7727
Epoch 33/50 110/110 [==============================] - 12s 109ms/step - loss: 0.1174 - accuracy: 0.9680 - val_loss: 0.5287 - val_accuracy: 0.8455
Epoch 34/50 110/110 [==============================] - 12s 108ms/step - loss: 0.0787 - accuracy: 0.9840 - val_loss: 0.5390 - val_accuracy: 0.8455
Epoch 35/50 110/110 [==============================] - 12s 108ms/step - loss: 0.0433 - accuracy: 0.9931 - val_loss: 0.4488 - val_accuracy: 0.8727
Epoch 36/50 110/110 [==============================] - 12s 109ms/step - loss: 0.0402 - accuracy: 0.9931 - val_loss: 0.5689 - val_accuracy: 0.8727
Epoch 37/50 110/110 [==============================] - 12s 108ms/step - loss: 0.0276 - accuracy: 0.9954 - val_loss: 0.4988 - val_accuracy: 0.8818
Epoch 38/50 110/110 [==============================] - 12s 108ms/step - loss: 0.0224 - accuracy: 0.9931 - val_loss: 0.6320 - val_accuracy: 0.8273
Epoch 39/50 110/110 [==============================] - 12s 108ms/step - loss: 0.1956 - accuracy: 0.9382 - val_loss: 0.5862 - val_accuracy: 0.8545
Epoch 40/50 110/110 [==============================] - 12s 108ms/step - loss: 0.1368 - accuracy: 0.9474 - val_loss: 0.6982 - val_accuracy: 0.8182
Epoch 41/50 110/110 [==============================] - 12s 109ms/step - loss: 0.1143 - accuracy: 0.9611 - val_loss: 0.6149 - val_accuracy: 0.8636
Epoch 42/50 110/110 [==============================] - 12s 108ms/step - loss: 0.1412 - accuracy: 0.9542 - val_loss: 0.5040 - val_accuracy: 0.8909
Epoch 43/50 110/110 [==============================] - 12s 107ms/step - loss: 0.1036 - accuracy: 0.9703 - val_loss: 0.4237 - val_accuracy: 0.8909
Epoch 44/50 110/110 [==============================] - 12s 106ms/step - loss: 0.0945 - accuracy: 0.9611 - val_loss: 0.7064 - val_accuracy: 0.8182
Epoch 45/50 110/110 [==============================] - 12s 107ms/step - loss: 0.0956 - accuracy: 0.9680 - val_loss: 0.3921 - val_accuracy: 0.9091
Epoch 46/50 110/110 [==============================] - 12s 107ms/step - loss: 0.1812 - accuracy: 0.9359 - val_loss: 0.4480 - val_accuracy: 0.8909
Epoch 47/50 110/110 [==============================] - 12s 108ms/step - loss: 0.0391 - accuracy: 0.9886 - val_loss: 0.5927 - val_accuracy: 0.8000
Epoch 48/50 110/110 [==============================] - 12s 108ms/step - loss: 0.0404 - accuracy: 0.9863 - val_loss: 0.4228 - val_accuracy: 0.9000
Epoch 49/50 110/110 [==============================] - 12s 110ms/step - loss: 0.0113 - accuracy: 1.0000 - val_loss: 0.3579 - val_accuracy: 0.9091
Epoch 50/50 110/110 [==============================] - 12s 111ms/step - loss: 0.0306 - accuracy: 0.9931 - val_loss: 0.3319 - val_accuracy: 0.9091
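To make the role of the patience setting more concrete, here is a minimal sketch of the rule the early stopping callback applies when it monitors val_loss in 'min' mode. It is a simplified stand-in, not the Keras implementation: simulate_early_stopping() and the toy loss values are hypothetical, and details such as min_delta are left out.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best val_loss).

    Epochs are 1-based. Training stops once `patience` consecutive epochs
    pass without a new lowest validation loss.
    """
    best_loss = float('inf')
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Patience was never exhausted, so every epoch ran.
    return len(val_losses), best_epoch


# Toy validation losses, chosen only to illustrate the rule.
toy_val_losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]
print(simulate_early_stopping(toy_val_losses, patience=3))   # stops early
print(simulate_early_stopping(toy_val_losses, patience=15))  # runs to the end
```

With a small patience the run stops before the late improvement at the last epoch is ever seen, which is the trade-off a larger value such as 15 is meant to avoid.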
Similar to the previous step, we will assess the performance of the trained LRCN model on the test set.
model_evaluation_history = LRCN_model.evaluate(features_test, labels_test)
6/6 [==============================] - 1s 165ms/step - loss: 0.4639 - accuracy: 0.9126
We will now save the model so that it does not have to be retrained each time it is used.
# Retrieve loss and accuracy from the model evaluation history.
model_evaluation_loss, model_evaluation_accuracy = model_evaluation_history
# Define the date and time format.
date_time_format = '%Y_%m_%d_%H_%M_%S'
# Obtain the current date and time.
current_date_time_dt = dt.datetime.now()
# Convert the date and time to a string with the specified format.
current_date_time_string = dt.datetime.strftime(current_date_time_dt, date_time_format)
# Construct a unique file name based on date, time, loss, and accuracy.
model_file_name = f'LRCN_model_Date_Time_{current_date_time_string}___Loss_{model_evaluation_loss}___Accuracy_{model_evaluation_accuracy}.h5'
# Save the LRCN model with the generated file name.
LRCN_model.save(model_file_name)
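The file name built above embeds the loss and accuracy as full-precision floats, which makes for long names. As an optional variation, the sketch below rounds both values to four decimals. build_model_file_name() is a hypothetical helper, and the timestamp in the example is an arbitrary placeholder, not the time of the run described here.

```python
import datetime as dt


def build_model_file_name(prefix, loss, accuracy, timestamp=None):
    """Build a file name from a timestamp and rounded evaluation metrics."""
    # Fall back to the current time when no timestamp is supplied.
    timestamp = timestamp or dt.datetime.now()
    time_string = timestamp.strftime('%Y_%m_%d_%H_%M_%S')
    return (f'{prefix}_model_Date_Time_{time_string}'
            f'___Loss_{loss:.4f}___Accuracy_{accuracy:.4f}.h5')


# Placeholder timestamp, used only so the example output is reproducible.
example_name = build_model_file_name(
    'LRCN', 0.4639, 0.9126, dt.datetime(2024, 1, 2, 3, 4, 5))
print(example_name)
```

Passing the timestamp as an argument keeps the helper easy to test, while the default still stamps the file with the current time.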
Now we will use the previously defined plot_metric() function to visualize the training and validation metrics.
# Plot the training and validation loss metrics for visualization.
plot_metric(LRCN_model_training_history, 'loss', 'val_loss', 'Total Loss vs Total Validation Loss')
# Plot the training and validation accuracy metrics for visualization.
plot_metric(LRCN_model_training_history, 'accuracy', 'val_accuracy', 'Total Accuracy vs Total Validation Accuracy')
Given the promising results, at least for this small set of classes, we will now test the LRCN model on YouTube videos.
We will define a function called download_youtube_videos() that downloads a YouTube video using the pytube library. Given only a video URL, the library handles the download and exposes metadata such as the video title.
def download_youtube_videos(youtube_video_url, output_directory):
    '''
    This function downloads the YouTube video specified by the provided URL.
    Args:
        youtube_video_url: URL of the video to be downloaded.
        output_directory: The directory path where the downloaded video will be stored.
    Returns:
        title: The title of the downloaded YouTube video.
    '''
    # Create a YouTube object for the YouTube video.
    youtube_video = YouTube(youtube_video_url)
    # Obtain the title of the video.
    title = youtube_video.title
    # Retrieve the stream with the highest resolution.
    video_stream = youtube_video.streams.get_highest_resolution()
    # Name the file after the title so that it matches the path built by the caller.
    # Recent pytube releases expect the full file name here, including the extension.
    output_file_name = f'{title}.mp4'
    # Download the video to the specified output directory.
    video_stream.download(output_path=output_directory, filename=output_file_name)
    # Return the title of the downloaded YouTube video.
    return title
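Video titles can contain characters that are not valid in file names, such as '/' or '?', so a path built directly from the title may not point at the file that was actually written. One way to guard against this is to clean the title before using it in a path. The helper below is a hypothetical sketch using only the standard library; it is not part of pytube, and the character set it replaces is an assumption based on common file system restrictions.

```python
import re


def to_safe_file_stem(title, replacement='_'):
    """Turn an arbitrary title into a string that is safe to use as a file name."""
    # Replace characters that are rejected by common file systems.
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title)
    # Collapse runs of whitespace, then trim spaces and dots at the ends.
    cleaned = re.sub(r'\s+', ' ', cleaned).strip(' .')
    # Fall back to a generic name if nothing usable is left.
    return cleaned or 'video'


print(to_safe_file_stem('Tai Chi: Step 1/2?'))
print(to_safe_file_stem('   '))
```

The same cleaned string would have to be used both when downloading and when building the input path, otherwise the two can still disagree.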
Now we will use the previously created download_youtube_videos() function to download a YouTube video for testing the LRCN model.
# Specify the directory where test videos will be stored.
test_videos_directory = 'test_videos'
# Create the test videos directory if it doesn't exist.
os.makedirs(test_videos_directory, exist_ok=True)
# Download a YouTube video for testing using the specified URL and directory.
video_title = download_youtube_videos('https://www.youtube.com/watch?v=8u0qjmHIOcE', test_videos_directory)
# Form the file path for the downloaded video.
input_video_file_path = f'{test_videos_directory}/{video_title}.mp4'
Next, we will define a function called predict_on_video(). It reads a video frame by frame from the given path, runs action recognition on a sliding window of the most recent frames, and writes a copy of the video with the predicted action overlaid.
def predict_on_video(video_file_path, output_file_path, SEQUENCE_LENGTH):
    '''
    This function conducts action recognition on a video using the LRCN model.
    Parameters:
        video_file_path: Path to the video on disk for action recognition.
        output_file_path: Path to store the output video with overlaid predicted actions.
        SEQUENCE_LENGTH: Fixed number of frames forming a sequence input for the model.
    '''
    # Open the video file for reading.
    video_reader = cv2.VideoCapture(video_file_path)
    # Get the original video dimensions.
    original_video_width = int(video_reader.get(cv2.CAP_PROP_FRAME_WIDTH))
    original_video_height = int(video_reader.get(cv2.CAP_PROP_FRAME_HEIGHT))
    # Open a video file for writing with the same dimensions and frame rate.
    video_writer = cv2.VideoWriter(output_file_path,
                                   cv2.VideoWriter_fourcc('M', 'P', '4', 'V'),
                                   video_reader.get(cv2.CAP_PROP_FPS),
                                   (original_video_width, original_video_height))
    # Initialize a deque to store frames for sequence input.
    frames_queue = deque(maxlen=SEQUENCE_LENGTH)
    # Initialize the predicted class name.
    predicted_class_name = ''
    # Process each frame in the video.
    while video_reader.isOpened():
        ok, frame = video_reader.read()
        if not ok:
            break
        # Resize the frame to the required input size.
        resized_frame = cv2.resize(frame, (IMAGE_HEIGHT, IMAGE_WIDTH))
        # Normalize the frame values.
        normalized_frame = resized_frame / 255
        # Add the normalized frame to the frames queue.
        frames_queue.append(normalized_frame)
        # If the queue is full, make a prediction.
        if len(frames_queue) == SEQUENCE_LENGTH:
            predicted_labels_probabilities = LRCN_model.predict(np.expand_dims(frames_queue, axis=0))[0]
            predicted_label = np.argmax(predicted_labels_probabilities)
            predicted_class_name = CLASSES_LIST[predicted_label]
        # Overlay the predicted class name on the frame.
        cv2.putText(frame, predicted_class_name, (10, 30),
                    cv2.FONT_HERSHEY_TRIPLEX, 1, (0, 255, 0), 2)
        # Write the frame to the output video file.
        video_writer.write(frame)
    # Release video readers and writers.
    video_reader.release()
    video_writer.release()
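The sliding-window behaviour of the frames queue can be seen in isolation with a small sketch. It uses plain integers in place of video frames, and sliding_windows() is a hypothetical helper written only for this illustration.

```python
from collections import deque


def sliding_windows(frames, sequence_length):
    """Yield every run of `sequence_length` consecutive frames."""
    # A deque with maxlen silently drops its oldest item when a new one is appended.
    window = deque(maxlen=sequence_length)
    for frame in frames:
        window.append(frame)
        # Nothing is yielded until the window has filled up for the first time.
        if len(window) == sequence_length:
            yield list(window)


# Integers stand in for frames: 6 frames and a window of 4.
frame_ids = list(range(6))
windows = list(sliding_windows(frame_ids, sequence_length=4))
print(windows)
print(len(windows))
```

For a video with N frames, this pattern produces N - SEQUENCE_LENGTH + 1 windows, so the model is called once per frame after the queue first fills. The frames written before that point carry an empty label.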
Now we will apply the predict_on_video() function to the test video downloaded with download_youtube_videos(), then display the resulting video with the predicted actions overlaid.
# Formulate the file path for the output video.
output_video_file_path = f'{test_videos_directory}/{video_title}--Output-SeqLen{SEQUENCE_LENGTH}.mp4'
# Apply action recognition to the test video and generate the output video with predictions.
predict_on_video(input_video_file_path, output_video_file_path, SEQUENCE_LENGTH)
# Display the resulting video with the superimposed predicted actions.
VideoFileClip(output_video_file_path, audio=False, target_resolution=(300, None)).ipython_display()
[==============================] - 0s 49ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 32ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 58ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 42ms/step 1/1 
[==============================] - 0s 54ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 61ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 59ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 44ms/step 1/1 
[==============================] - 0s 43ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 61ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 37ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 55ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 33ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 48ms/step 1/1 
[==============================] - 0s 48ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 37ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 60ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 45ms/step 1/1 
[==============================] - 0s 51ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 53ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 33ms/step 1/1 [==============================] - 0s 34ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 55ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 44ms/step 1/1 
[==============================] - 0s 50ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 60ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 58ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 43ms/step 1/1 
[==============================] - 0s 47ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 58ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 53ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 32ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 35ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 41ms/step 1/1 
[==============================] - 0s 52ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 59ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 37ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 32ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 49ms/step 1/1 
[==============================] - 0s 51ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 38ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 50ms/step 1/1 
[==============================] - 0s 51ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 56ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 37ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 54ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 46ms/step 1/1 
[==============================] - 0s 42ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 68ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 58ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 36ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 51ms/step 1/1 
[==============================] - 0s 39ms/step 1/1 [==============================] - 0s 83ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 61ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 45ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 52ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 57ms/step 1/1 [==============================] - 0s 44ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 50ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 40ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 58ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 46ms/step 1/1 [==============================] - 0s 43ms/step 1/1 [==============================] - 0s 47ms/step 1/1 [==============================] - 0s 48ms/step 1/1 
[==============================] - 0s 44ms/step 1/1 [==============================] - 0s 42ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 49ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 41ms/step 1/1 [==============================] - 0s 37ms/step 1/1 [==============================] - 0s 39ms/step 1/1 [==============================] - 0s 48ms/step 1/1 [==============================] - 0s 51ms/step 1/1 [==============================] - 0s 65ms/step Moviepy - Building video __temp__.mp4. Moviepy - Writing video __temp__.mp4
Moviepy - Done ! Moviepy - video ready __temp__.mp4
Next, I'll define a function that makes a single prediction for an entire video. It samples N (SEQUENCE_LENGTH) evenly spaced frames across the video and classifies them as one sequence. This works well for videos that contain a single activity, and it saves computation because only N frames are processed rather than every frame.
def predict_single_action(video_file_path, SEQUENCE_LENGTH):
    '''
    This function will predict a single action in a video using the LRCN model.
    Args:
    video_file_path: The path of the video stored on disk for action recognition.
    SEQUENCE_LENGTH: The fixed number of frames in a video passed as one sequence to the model.
    '''
    # Open the video file for reading.
    video_reader = cv2.VideoCapture(video_file_path)
    # Retrieve the original video dimensions.
    original_video_width = int(video_reader.get(cv2.CAP_PROP_FRAME_WIDTH))
    original_video_height = int(video_reader.get(cv2.CAP_PROP_FRAME_HEIGHT))
    # Initialize an empty list to store frames.
    frames_list = []
    # Initialize the predicted class name.
    predicted_class_name = ''
    # Get the total number of frames in the video.
    video_frames_count = int(video_reader.get(cv2.CAP_PROP_FRAME_COUNT))
    # Calculate the skip window to evenly sample frames for the sequence.
    skip_frames_window = max(int(video_frames_count / SEQUENCE_LENGTH), 1)
    # Iterate through the frames to build the sequence.
    for frame_counter in range(SEQUENCE_LENGTH):
        # Set the video reader to the frame specified by the skip window.
        video_reader.set(cv2.CAP_PROP_POS_FRAMES, frame_counter * skip_frames_window)
        # Read the frame from the video.
        success, frame = video_reader.read()
        # Break if the frame reading was unsuccessful.
        if not success:
            break
        # Resize the frame to the desired input size.
        # Note: cv2.resize expects the size as (width, height); keep this order
        # consistent with the preprocessing used when the model was trained.
        resized_frame = cv2.resize(frame, (IMAGE_HEIGHT, IMAGE_WIDTH))
        # Normalize the pixel values to the range [0, 1].
        normalized_frame = resized_frame / 255
        # Append the normalized frame to the list.
        frames_list.append(normalized_frame)
    # Release the video reader; all required frames have been read.
    video_reader.release()
    # Stop if the video could not supply a full sequence (unreadable or too short),
    # since the model expects exactly SEQUENCE_LENGTH frames.
    if len(frames_list) < SEQUENCE_LENGTH:
        print(f'Could not read {SEQUENCE_LENGTH} frames from: {video_file_path}')
        return
    # Perform action recognition on the sequence of frames.
    predicted_labels_probabilities = LRCN_model.predict(np.expand_dims(frames_list, axis=0))[0]
    # Determine the predicted action label.
    predicted_label = np.argmax(predicted_labels_probabilities)
    # Map the label to the corresponding action class name.
    predicted_class_name = CLASSES_LIST[predicted_label]
    # Display the predicted action and confidence.
    print(f'Action Predicted: {predicted_class_name}\nConfidence: {predicted_labels_probabilities[predicted_label]}')
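To make the even sampling concrete, here is a minimal sketch that lists the frame positions visited by the sampling loop in predict_single_action(). The helper name sampled_frame_indices and the frame counts are assumptions chosen for illustration; they are not part of the project code.

```python
def sampled_frame_indices(video_frames_count, sequence_length):
    """Return the frame positions the sampling loop would seek to (hypothetical helper)."""
    # Same skip window as in predict_single_action(): never smaller than 1.
    skip_frames_window = max(int(video_frames_count / sequence_length), 1)
    return [frame_counter * skip_frames_window for frame_counter in range(sequence_length)]


# Illustrative values only: a 300-frame video sampled into 20 frames.
print(sampled_frame_indices(300, 20))

# A video with fewer frames than the sequence length: the later positions
# lie beyond the last frame, so reading them would fail.
print(sampled_frame_indices(10, 20))
```

With 300 frames and a sequence length of 20, the skip window is 15, so the loop visits frames 0, 15, 30, and so on up to 285. The second call shows why very short videos need care: the window is clamped to 1, and the positions run past the end of the video.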
Now I will use predict_single_action() to make a single prediction on an entire YouTube test video, which is downloaded with the previously defined download_youtube_videos() function.
# Download the YouTube video.
video_title = download_youtube_videos('https://youtu.be/fc3w827kwyA', test_videos_directory)
# Construct the input YouTube video path.
input_video_file_path = f'{test_videos_directory}/{video_title}.mp4'
# Perform a Single Prediction on the Test Video.
predict_single_action(input_video_file_path, SEQUENCE_LENGTH)
# Display the input video with an increased maxduration.
VideoFileClip(input_video_file_path, audio=False,
target_resolution=(300, None)).ipython_display(maxduration=5000)
1/1 [==============================] - 0s 67ms/step
Action Predicted: TaiChi
Confidence: 0.9097801446914673
Moviepy - Building video __temp__.mp4.
Moviepy - Writing video __temp__.mp4
Moviepy - Done !
Moviepy - video ready __temp__.mp4
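The confidence printed above is the probability the model assigns to its top class. When a prediction looks doubtful, it can help to see the runner-up classes as well. The sketch below ranks a probability vector and returns the k most likely classes; the helper name top_k_predictions, the class names, and the probabilities are made up for illustration and are not output from the trained model.

```python
import numpy as np


def top_k_predictions(probabilities, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs, highest first."""
    probabilities = np.asarray(probabilities, dtype=float)
    # Never ask for more entries than there are classes.
    k = min(k, len(class_names))
    # argsort is ascending, so reverse it and keep the first k indices.
    top_indices = np.argsort(probabilities)[::-1][:k]
    return [(class_names[i], float(probabilities[i])) for i in top_indices]


# Made-up class names and probabilities, for illustration only.
example_classes = ['WalkingWithDog', 'TaiChi', 'Swing', 'HorseRace']
example_probabilities = [0.05, 0.85, 0.07, 0.03]

for name, probability in top_k_predictions(example_probabilities, example_classes, k=2):
    print(f'{name}: {probability:.2f}')
```

In the project, the same helper could be given predicted_labels_probabilities and CLASSES_LIST, provided the function were changed to return the probabilities instead of only printing the top class.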