14. Ice Seal Classification using YOLOv11

14.1. Overview

In this lesson, we will train a YOLOv11 classification model on images of ice-associated seals from the NOAA Alaska Fisheries Science Center. These 640x640 images were extracted from aerial photography using a separate ROI object detector. Because individual seals are small relative to the original imagery, they cannot be identified in adequate detail in a single pass. This necessitates a two-stage approach: the first stage (the ROI object detector) localizes potential seals, and the second stage (the classifier) refines the identification to the species level. The dataset includes the following classes:

  • bearded_pup

  • bearded_seal

  • ribbon_pup

  • ribbon_seal

  • ringed_pup

  • ringed_seal

  • spotted_pup

  • spotted_seal

  • unknown_pup

  • unknown_seal

After training the model, we will run inference on a folder (named transect) containing images of unknown seals. The aggregated predictions will then be visualized using heatmaps and stacked area plots to explore species distribution and ecological relationships.
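To make the two-stage idea concrete, the sketch below computes a fixed-size crop window around a detection center and clamps it to the image bounds. This is one plausible way a first-stage detection could be turned into a 640x640 classifier input. The function name `crop_window` and the shift-to-fit strategy are illustrative assumptions; the lesson does not describe how the ROI detector's crops were actually produced.

```python
def crop_window(cx, cy, img_w, img_h, size=640):
    """Return (left, top, right, bottom) of a size x size window near (cx, cy).

    Hypothetical helper: the window is shifted, not shrunk, when the detection
    lies close to an image edge. Assumes the image is at least `size` pixels
    wide and high.
    """
    if img_w < size or img_h < size:
        raise ValueError("image is smaller than the crop size")
    half = size // 2
    # Clamp the top-left corner so the whole window stays inside the image
    left = min(max(cx - half, 0), img_w - size)
    top = min(max(cy - half, 0), img_h - size)
    return left, top, left + size, top + size


# A detection in the image interior is centered in its window
interior = crop_window(3000, 2000, img_w=6000, img_h=4000)
# A detection near the top-left corner gets a window shifted to stay in bounds
corner = crop_window(100, 50, img_w=6000, img_h=4000)
print(interior, corner)
```

The image dimensions used here are made up for the example.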

14.2. Learning Objectives

  • Train a YOLOv11 classification model on marine seal images.

  • Evaluate the model’s performance using standard metrics.

  • Perform inference on a transect folder of unknown seal images.

  • Aggregate and visualize predictions using heatmaps and stacked area plots.

  • Interpret the ecological relationships and species distributions from the visualization.

14.3. Dataset Description

The dataset comprises 640x640 images of ice-associated seals with the following distribution:

  • bearded_pup: 336 images

  • bearded_seal: 1537 images

  • ribbon_pup: 28 images

  • ribbon_seal: 185 images

  • ringed_pup: 43 images

  • ringed_seal: 2542 images

  • spotted_pup: 190 images

  • spotted_seal: 1329 images

  • unknown_pup: 313 images

  • unknown_seal: 648 images
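The class counts above are strongly imbalanced, which is worth quantifying before training. The sketch below uses only the counts listed in this section; the variable names are illustrative.

```python
# Image counts per class, copied from the list above
class_counts = {
    "bearded_pup": 336,
    "bearded_seal": 1537,
    "ribbon_pup": 28,
    "ribbon_seal": 185,
    "ringed_pup": 43,
    "ringed_seal": 2542,
    "spotted_pup": 190,
    "spotted_seal": 1329,
    "unknown_pup": 313,
    "unknown_seal": 648,
}

total = sum(class_counts.values())
largest = max(class_counts, key=class_counts.get)
smallest = min(class_counts, key=class_counts.get)
imbalance_ratio = class_counts[largest] / class_counts[smallest]

# Print each class with its share of the dataset, largest first
for name, n in sorted(class_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:13s} {n:5d}  {100 * n / total:5.1f}%")
print(f"total images: {total}")
print(f"{largest} has {imbalance_ratio:.1f}x as many images as {smallest}")
```

With a spread this wide, overall accuracy is dominated by the common classes, so per-class results deserve particular attention for rare classes such as ribbon_pup and ringed_pup.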

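Each image file follows a naming convention (e.g., 100_bearded_pup.jpg). Note that Ultralytics classification training does not read a data.yaml file: it takes the dataset's root folder and infers the class labels from the class subfolders inside each split (e.g., train/ and val/ or test/). Ensure that the unzipped dataset keeps this layout and that the path you pass to the training call points at its root.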
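Before training, it can help to confirm that the unzipped dataset is laid out the way you expect. The sketch below counts image files per class folder under one split directory. It assumes a layout of the form <root>/<split>/<class>/<image>, and both the helper name `count_images_per_class` and the train path are assumptions to adjust for your own setup.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_class(split_dir):
    """Count image files in each class subfolder of split_dir."""
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[entry] = sum(
            1 for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


# Assumed location after unzipping in Colab; adjust to your own path
train_dir = "/content/640_yolo_classification_dataset/train"
if os.path.isdir(train_dir):
    for class_name, n in count_images_per_class(train_dir).items():
        print(f"{class_name}: {n}")
else:
    print(f"Folder not found: {train_dir}")
```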

Download the dataset here: https://huggingface.co/datasets/atticus-carter/NOAA_AFSC_MML_Iceseals_Classification/blob/main/640_yolo_classification_dataset.zip

Download the secret transect here: https://huggingface.co/datasets/atticus-carter/NOAA_AFSC_MML_Iceseals_Transects/blob/main/transect_mystery.zip

14.4. Preparing the Environment

Before starting, make sure you are using a GPU-enabled runtime. Run the cell below to check your GPU status.

!nvidia-smi

Next, install the required dependencies:

!pip install ultralytics matplotlib seaborn pandas plotly

Now, import the required libraries:

import os
from ultralytics import YOLO
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

Then unzip your dataset:

!unzip /content/640_yolo_classification_dataset.zip

14.5. Training the Classification Model

The following code trains the YOLOv11 classification model on the seal dataset. Training runs for up to 300 epochs at an image size of 640 pixels, and stops early if validation performance does not improve for 50 consecutive epochs (patience=50). The class labels are read from the dataset's folder structure, so make sure the data argument points at the root of the unzipped dataset. Optionally, you can launch TensorBoard now if you prefer to visualize metrics with it.

%load_ext tensorboard
%tensorboard --logdir /content/runs/classify/train
from ultralytics import YOLO

# Load the YOLOv11 classification model
model = YOLO("yolo11n-cls.pt")

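# Train the model
results = model.train(
    data="/content/640_yolo_classification_dataset",  # dataset root folder
    epochs=300,   # upper bound on the number of training epochs
    patience=50,  # stop early after 50 epochs without improvement
    imgsz=640,    # input image size in pixels
)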

# Print training results
print(results)
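The patience argument in the training call enables early stopping: training ends once the monitored validation metric has gone a set number of epochs without improving. The pure-Python sketch below illustrates that rule on a made-up sequence of scores. It is a simplified illustration of the idea, not Ultralytics' implementation, and `early_stop_epoch` is a hypothetical name.

```python
def early_stop_epoch(scores, patience):
    """Return the index of the epoch at which training would stop.

    Training stops at the first epoch that lies `patience` or more epochs
    after the best score seen so far. If that never happens, the index of
    the last epoch is returned.
    """
    best_score = float("-inf")
    best_epoch = 0
    for epoch, score in enumerate(scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return len(scores) - 1


# Made-up validation scores: improvement stalls after epoch 3
scores = [0.50, 0.62, 0.70, 0.74, 0.73, 0.74, 0.72, 0.73, 0.71]
stop = early_stop_epoch(scores, patience=3)
print(f"stopped at epoch {stop}")
```

A larger patience tolerates longer plateaus at the cost of extra training time, which is the trade-off behind choosing 50 for a 300-epoch budget.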

14.6. Assessing Model Performance

Review the validation output and the images displayed by the cell below to assess the model’s performance. Pay close attention to overall indicators such as top-1 and top-5 accuracy, and to the confusion matrices, which show how each class is being classified. Depending on the Ultralytics version, detection-style plots such as F1 and precision-recall curves may not be written for classification runs. These visuals will help you identify which classes are performing well and where improvements might be needed.

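from IPython.display import Image, display
import os

# Directory of the training run to evaluate. Ultralytics numbers repeated runs
# (train, train2, train3, ...), so adjust this to match your own run.
run_dir = "/content/runs/classify/train"

# Load the best checkpoint from that run and evaluate it on the validation split
model = YOLO(os.path.join(run_dir, "weights", "best.pt"))
results = model.val()
print(f"Top-1 accuracy: {results.top1:.3f}")
print(f"Top-5 accuracy: {results.top5:.3f}")

# List of filenames to display
filenames = [
    "labels.jpg",
    "F1_curve.png",
    "PR_curve.png",
    "P_curve.png",
    "R_curve.png",
    "confusion_matrix.png",
    "confusion_matrix_normalized.png"
]

# Display each image that exists in the run directory; report the ones that do not
for filename in filenames:
    image_path = os.path.join(run_dir, filename)
    if os.path.exists(image_path):
        display(Image(image_path))
    else:
        print(f"Skipping missing file: {image_path}")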
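A confusion matrix also lets you derive per-class precision, recall, and F1 by hand, which is useful when rare classes are hidden behind a high overall accuracy. The sketch below uses a made-up 3-class matrix and assumes that rows are true classes and columns are predicted classes. Check the axis labels of your own plot before applying it, since plotting conventions differ; `per_class_metrics` is a hypothetical helper.

```python
def per_class_metrics(matrix, labels):
    """Compute precision, recall, and F1 for each class.

    Assumes matrix[i][j] is the number of images whose true class is i
    and whose predicted class is j.
    """
    metrics = {}
    n = len(labels)
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Made-up counts for illustration only
labels = ["ringed_seal", "spotted_seal", "ribbon_pup"]
matrix = [
    [90, 10, 0],
    [15, 80, 5],
    [2, 3, 5],
]
metrics = per_class_metrics(matrix, labels)
for label, m in metrics.items():
    print(f"{label:13s} P={m['precision']:.2f} R={m['recall']:.2f} F1={m['f1']:.2f}")
```

In this toy matrix the overall accuracy is 175/210, yet the rare class is right only half the time, which is the pattern to look for in the real results.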

14.7. Inference on the Transect Folder

The transect folder is organized into subfolders, one for every 100 meters along a 3 km transect (e.g., 0m, 100m, 200m, …, 3000m). Each subfolder contains the ROI detection images from that segment. The code below traverses each subfolder, runs inference on all JPEG images within it, and saves the resulting prediction images. It also collects the classification probabilities of every image for later visualization.

!unzip /content/transect_mystery.zip
from ultralytics import YOLO
import os

# Load the trained classification weights for inference (adjust the path to where your weights are stored)
model = YOLO("/content/runs/classify/train3/weights/NOAA_AFSC_MML_Iceseals_Classification.pt")

# Define the path to the transect root folder
transect_root = '/content/transect_mystery'

# List all subfolders inside the transect folder (each subfolder represents a 100m segment)
subfolders = sorted([os.path.join(transect_root, d)
                     for d in os.listdir(transect_root)
                     if os.path.isdir(os.path.join(transect_root, d))])

# Initialize a dictionary to hold predictions per segment
segment_predictions = {}

for folder in subfolders:
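    # List all JPEG images in the current subfolder (any letter case),
    # skipping the 'result_' images saved by an earlier run of this cell
    images = sorted(
        os.path.join(folder, f)
        for f in os.listdir(folder)
        if f.lower().endswith(('.jpg', '.jpeg')) and not f.startswith('result_')
    )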

    # Check if the folder contains any images
    if not images:
        print(f"Warning: Folder '{folder}' contains no JPEG images. Skipping...")
        continue  # Skip to the next folder if no images are found

    # Run batched inference on the images in the current subfolder
    results = list(model(images, stream=True))

    # Collect predictions for aggregation
    segment_probs = []
    for result in results:
        segment_probs.append(result.probs)  # Classification probabilities
        # Save the result image with a modified filename indicating the segment folder
        result.save(filename=os.path.join(folder, 'result_' + os.path.basename(result.path)))

    # Store predictions for this segment folder in the dictionary using the subfolder name as key
    segment_predictions[os.path.basename(folder)] = segment_probs

# Now, segment_predictions is a dictionary keyed by subfolder names (e.g., "0m", "100m", etc.)
# with a list of prediction probability arrays for each image in that segment.
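One detail to be aware of: sorted orders folder names as strings, so a name like 1000m comes before 200m. This does not change what ends up in the dictionary, but if you want to process or print segments in transect order, a numeric sort key helps. The sketch below shows one way to do that; `segment_distance` is a hypothetical helper, and the folder names follow the naming pattern described above.

```python
import re


def segment_distance(name):
    """Return the first integer found in a segment folder name, or -1 if none."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else -1


names = ["0m", "100m", "1000m", "200m", "3000m", "900m"]

lexicographic = sorted(names)                   # string order
numeric = sorted(names, key=segment_distance)   # transect order
print("string order: ", lexicographic)
print("numeric order:", numeric)
```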

14.8. Visualizing Transect Predictions

We now aggregate the predictions from each 100 m segment. For each segment, we compute the mean probability of each class and visualize the result as a heatmap and a stacked area plot. We also count the top-1 prediction of every image and show the counts per segment as a stacked bar chart. The transect data was cleaned beforehand to remove erroneous ROIs via a simple clustering review system.

import re
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go

# Define class labels in the order corresponding to the model outputs
class_labels = [
    'bearded_pup', 'bearded_seal', 'ribbon_pup', 'ribbon_seal',
    'ringed_pup', 'ringed_seal', 'spotted_pup', 'spotted_seal',
    'unknown_pup', 'unknown_seal'
]

# Helper function to extract a numeric value from a segment label
def extract_numeric(segment_label):
    # This finds the first occurrence of one or more digits in the string.
    match = re.search(r'\d+', segment_label)
    if match:
        return int(match.group())
    else:
        # If no number is found, return a default value (e.g., 0)
        return 0

# Create a DataFrame to hold mean probabilities per segment
segment_data = []

# segment_predictions comes from the inference step above: a dictionary mapping
# segment labels to lists of classification probability objects.
for segment, probs_list in segment_predictions.items():
    if len(probs_list) > 0:
        # Compute the mean probabilities over the list of predictions for this segment.
        segment_mean = np.mean([prob.data.cpu().numpy() for prob in probs_list], axis=0)
        # Build a dictionary for this segment, including each species probability.
        row = {"Segment": segment}
        row.update({label: segment_mean[i] for i, label in enumerate(class_labels)})
        segment_data.append(row)

# Create a DataFrame from the aggregated segment data
df_segment = pd.DataFrame(segment_data)

# Extract numeric values from the 'Segment' column for proper sorting.
df_segment['Segment_numeric'] = df_segment['Segment'].apply(extract_numeric)
df_segment = df_segment.sort_values(by="Segment_numeric")


plt.figure(figsize=(10, 6))
# Use only the class columns for the heatmap, while the index is the segment label.
sns.heatmap(df_segment.set_index('Segment')[class_labels], annot=True, cmap='viridis')
plt.title('Heatmap of Mean Classification Probabilities per Transect Segment')
plt.xlabel('Species')
plt.ylabel('Transect Segment (every 100m)')
plt.show()
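The hardcoded class_labels list must match the order of the model's outputs, or every column in these plots will be mislabeled. An Ultralytics model exposes its index-to-name mapping as model.names, so the list can be derived rather than typed. The sketch below uses a plain dictionary standing in for model.names so that it runs without a model; `labels_from_names` is a hypothetical helper.

```python
def labels_from_names(names):
    """Return class names ordered by class index.

    `names` maps integer class index to class name. Raises ValueError if the
    indices are not exactly 0..n-1, which would make positional lookups unsafe.
    """
    indices = sorted(names)
    if indices != list(range(len(names))):
        raise ValueError(f"unexpected class indices: {indices}")
    return [names[i] for i in indices]


# Stand-in for model.names; in the notebook you would pass model.names itself
example_names = {
    0: "bearded_pup", 1: "bearded_seal", 2: "ribbon_pup", 3: "ribbon_seal",
    4: "ringed_pup", 5: "ringed_seal", 6: "spotted_pup", 7: "spotted_seal",
    8: "unknown_pup", 9: "unknown_seal",
}
derived_labels = labels_from_names(example_names)
print(derived_labels)
```

Comparing the derived list with the hardcoded one is a cheap sanity check before interpreting any of the figures.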


df_segment_sorted = df_segment.copy()  # Already sorted by numeric segment

fig = go.Figure()

# Add one stacked trace per class, in the order given by class_labels.
for col in class_labels:
    fig.add_trace(go.Scatter(
        x=df_segment_sorted['Segment'],  # Display the original segment labels on the x-axis
        y=df_segment_sorted[col],
        name=col,
        stackgroup='one',  # Group traces for stacking
        mode='lines',
        fill='tonexty'
    ))

fig.update_layout(
    title='Stacked Area Plot of Mean Species Distribution Along Transect',
    xaxis_title='Transect Segment (every 100m)',
    yaxis_title='Mean Probability',
    legend=dict(x=1.05, y=1),
    width=1000,
    height=600
)

fig.show()


# Create a dictionary to hold counts per segment.
# For each segment, we will count the best guess (i.e. highest probability) for every prediction.
counts = {}

for segment, probs_list in segment_predictions.items():
    # Initialize a count dictionary for this segment.
    counts[segment] = {label: 0 for label in class_labels}
    for prob in probs_list:
        # Convert the tensor to a NumPy array
        arr = prob.data.cpu().numpy()
        # Find the index of the highest probability
        best_idx = np.argmax(arr)
        best_label = class_labels[best_idx]
        counts[segment][best_label] += 1

# Convert the counts dictionary into a DataFrame.
# The keys become the index (segments) and the values (dictionaries) become the row data.
df_counts = pd.DataFrame.from_dict(counts, orient='index')
df_counts.index.name = 'Segment'
df_counts.reset_index(inplace=True)

# Extract numeric values from the 'Segment' column for sorting
df_counts['Segment_numeric'] = df_counts['Segment'].apply(extract_numeric)
df_counts = df_counts.sort_values(by='Segment_numeric')

fig_counts = go.Figure()

for label in class_labels:
    fig_counts.add_trace(go.Bar(
        x=df_counts['Segment'],  # original segment labels on x-axis
        y=df_counts[label],
        name=label
    ))

fig_counts.update_layout(
    barmode='stack',
    title='Counts of Individual Animals with Best Guesses per Segment',
    xaxis_title='Transect Segment (every 100m)',
    yaxis_title='Count',
    width=1000,
    height=600
)

fig_counts.show()
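Raw counts depend on how many ROIs each segment contains, so segments with many detections dominate the bar chart. Converting the counts to per-segment proportions makes the species composition comparable across segments. The sketch below does this for a dictionary shaped like counts, using made-up numbers; `to_proportions` is a hypothetical helper.

```python
def to_proportions(counts):
    """Convert {segment: {label: count}} into {segment: {label: share}}.

    Segments with no predictions are mapped to all-zero shares.
    """
    proportions = {}
    for segment, label_counts in counts.items():
        total = sum(label_counts.values())
        proportions[segment] = {
            label: (n / total if total else 0.0)
            for label, n in label_counts.items()
        }
    return proportions


# Made-up counts for illustration only
example_counts = {
    "0m": {"ringed_seal": 8, "spotted_seal": 2},
    "100m": {"ringed_seal": 1, "spotted_seal": 3},
    "200m": {"ringed_seal": 0, "spotted_seal": 0},
}
shares = to_proportions(example_counts)
for segment, row in shares.items():
    print(segment, row)
```

If you only need the normalized chart rather than the numbers, Plotly can do the normalization at draw time by adding barnorm='percent' to the update_layout call.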