10. Training YOLO Models: A Guide to Understanding Tasks and Modes

10.1. Introduction to YOLO Models

YOLO (You Only Look Once) models are versatile and can be applied to a range of computer vision tasks, including detection, segmentation, classification, and pose estimation. They can also predict oriented bounding boxes (OBB), which is particularly useful for specialized imagery such as satellite or medical images. In marine science, YOLO models can help analyze underwater images and videos, either in real time or after the fact, to detect and classify marine organisms, shipwrecks, or plastic debris. Their ability to process real-time video streams can also be leveraged on autonomous underwater vehicles (AUVs) for monitoring and exploration tasks. This efficiency is crucial in remote marine environments where computational resources are limited.

A note about YOLO versions: as of this writing, the only “official” distributions of YOLO available through Ultralytics are YOLOv5, YOLOv8, and now YOLO11. Other distributions exist; however, they are often less open source, and they tend to be tuned very closely to a few large benchmark datasets without making a large difference for our purposes. For this reason, you may see YOLO imports from any of the official distributions. I have done my best to choose the most stable and easiest-to-use version for each task, but the skills should be easily transferable.

10.2. YOLO Tasks

YOLO models can perform the following tasks:

  • Detect: Identify and localize objects in images or videos.

  • Segment: Divide images or videos into regions corresponding to different objects.

  • Classify: Predict the class label of an image.

  • Pose: Estimate key points for objects in an image.

  • OBB: Detect objects using oriented (rotated) bounding boxes, which fit angled objects more tightly than axis-aligned boxes.

In marine research, detection can be used to identify different fish species or coral structures in video footage, while segmentation can help distinguish between habitat types, such as coral reefs and seagrass beds. Classification can assist in labeling faunal species, and pose estimation can be useful for tracking movement patterns in marine animals. Oriented bounding boxes are valuable in aerial or satellite imagery analysis, where objects like ships or coastal structures appear in varying orientations.
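
As a rough illustration of how the five tasks map onto model files, the sketch below pairs each task with a pretrained checkpoint name. It assumes the `ultralytics` package and the YOLO11 "nano" weights; the dictionary `TASK_WEIGHTS` and the helper `load_model` are hypothetical names introduced here for illustration, not part of the library.

```python
# Mapping from task name to a pretrained checkpoint file.
# Assumption: YOLO11 nano weights, chosen only because they are small.
TASK_WEIGHTS = {
    "detect": "yolo11n.pt",
    "segment": "yolo11n-seg.pt",
    "classify": "yolo11n-cls.pt",
    "pose": "yolo11n-pose.pt",
    "obb": "yolo11n-obb.pt",
}


def load_model(task: str):
    """Return an Ultralytics YOLO model for the given task name."""
    if task not in TASK_WEIGHTS:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {sorted(TASK_WEIGHTS)}"
        )
    # Imported lazily so the mapping can be inspected without ultralytics.
    from ultralytics import YOLO

    # The weights are downloaded on first use if not already on disk.
    return YOLO(TASK_WEIGHTS[task])
```

Calling `load_model("segment")` would, under these assumptions, give a segmentation model ready for training or prediction.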

10.3. YOLO Modes

The model can be used in different modes, depending on the objective:

  • Train: Train a YOLO model on a custom dataset.

  • Val: Validate the model after training.

  • Predict: Make predictions on new data.

  • Export: Export the model for deployment.

  • Track: Track objects across video frames in real time.

  • Benchmark: Evaluate model speed and accuracy after export.

In marine applications, models are often trained and validated on custom datasets collected from ROVs or AUVs during underwater surveys. Prediction and tracking modes can be employed in live deployments to identify and monitor marine fauna in real time, providing valuable data for fisheries management or ecological monitoring. Benchmarking can help confirm that exported models are fast and accurate enough for their target environment, whether that is deep-sea monitoring or coastal surveillance.
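
In the Ultralytics Python API, each mode listed above corresponds to a method of the same name on the model object. The sketch below makes that correspondence explicit with a small dispatcher; `MODES` and `run_mode` are hypothetical helpers written for this lesson, and the file names in the comments are placeholders.

```python
# The six modes, each matching a method name on an Ultralytics YOLO object.
MODES = ("train", "val", "predict", "export", "track", "benchmark")


def run_mode(model, mode: str, **kwargs):
    """Call the method on `model` that matches `mode`, passing kwargs through."""
    if mode not in MODES:
        raise ValueError(f"Unknown mode {mode!r}; expected one of {MODES}")
    return getattr(model, mode)(**kwargs)


# Example usage (assumes ultralytics is installed; paths are placeholders):
#
#   from ultralytics import YOLO
#   model = YOLO("yolo11n.pt")
#   run_mode(model, "train", data="coco8.yaml", epochs=3)
#   run_mode(model, "val")
#   run_mode(model, "predict", source="dive_footage.mp4")
#   run_mode(model, "export", format="onnx")
```

In practice you would usually call `model.train(...)`, `model.val(...)`, and so on directly; the dispatcher is only there to show that modes are ordinary method calls.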

10.4. Key Training Settings

Training settings affect model performance, speed, and accuracy. Important settings include batch size, learning rate, image size, and device selection. For a complete description of each parameter, see the official Train Settings in the Ultralytics documentation.

In marine contexts, training settings must be carefully tuned due to the challenges posed by underwater environments, such as varying light conditions, occlusion, and the presence of marine snow. Optimizing batch sizes and learning rates helps ensure models are robust enough to handle these inconsistencies and perform accurately in diverse settings like coral reefs, hydrothermal vents, or the open ocean.
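
The settings named above are passed to training as keyword arguments. The following is a minimal sketch, assuming the `ultralytics` package; the values are illustrative starting points rather than recommendations, and `DEFAULT_TRAIN_SETTINGS`, `build_train_settings`, and `train` are hypothetical helpers.

```python
# Illustrative starting values for the settings discussed in this section.
DEFAULT_TRAIN_SETTINGS = {
    "epochs": 50,     # passes over the training set
    "batch": 16,      # images per step; lower it if GPU memory runs out
    "imgsz": 640,     # images are resized to this size for training
    "lr0": 0.01,      # initial learning rate
    "device": "cpu",  # e.g. "cpu", 0 for the first GPU, or "mps"
}


def build_train_settings(**overrides):
    """Return a copy of the defaults with any overrides applied."""
    return {**DEFAULT_TRAIN_SETTINGS, **overrides}


def train(data_yaml: str, weights: str = "yolo11n.pt", **overrides):
    """Train a model on the dataset described by `data_yaml`."""
    from ultralytics import YOLO  # lazy import; requires ultralytics

    model = YOLO(weights)
    return model.train(data=data_yaml, **build_train_settings(**overrides))
```

For example, `train("reef_survey.yaml", batch=8, device=0)` would train on the first GPU with a smaller batch, where `reef_survey.yaml` is a placeholder dataset file.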

10.5. Inference Settings

Inference settings control how the model makes predictions on new data. Key parameters include confidence threshold, image size, and device. Adjusting these settings helps fine-tune the model’s predictions to your specific needs. Learn more in the Predict Settings.

In marine research, inference is typically done after field surveys, where large datasets are processed to identify species or objects of interest. Adjusting inference settings can reduce false positives and ensure accurate detections of marine animals like whales or seals, especially when analyzing aerial or underwater images. For marine biologists, proper tuning of these settings can make the difference between accurate monitoring and misleading results.
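
To make the role of the confidence threshold concrete, here is a hedged sketch that runs prediction and tallies detections per class. It assumes the `ultralytics` package and a detection model whose results expose `boxes.cls` and `names`; `predict_settings`, `count_detections`, and `survey_counts` are hypothetical helper names.

```python
from collections import Counter


def predict_settings(conf: float = 0.25, imgsz: int = 640, device="cpu"):
    """Bundle the key inference parameters, checking the threshold range."""
    if not 0.0 <= conf <= 1.0:
        raise ValueError("conf must be between 0 and 1")
    return {"conf": conf, "imgsz": imgsz, "device": device}


def count_detections(results):
    """Tally detections per class name across a list of detection results."""
    counts = Counter()
    for result in results:
        for class_id in result.boxes.cls.tolist():
            counts[result.names[int(class_id)]] += 1
    return counts


def survey_counts(weights_path: str, source: str, conf: float = 0.5):
    """Run prediction on `source` and return per-class detection counts."""
    from ultralytics import YOLO  # lazy import; requires ultralytics

    model = YOLO(weights_path)
    results = model.predict(source, **predict_settings(conf=conf))
    return count_detections(results)
```

Raising `conf` discards low-confidence boxes, which tends to reduce false positives at the cost of missing some true detections, so it is worth comparing counts at a few thresholds.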

10.6. Validation Settings

Validation settings control how the model’s performance is assessed, both during training and afterward. You can explore these parameters in the Validation Settings section.

Validation is critical when using models in marine science, as ocean environments are highly variable, and incorrect model validation can lead to false conclusions. For example, if you’re monitoring coral bleaching events, validation checks whether the model accurately differentiates between healthy and bleached coral, which directly impacts conservation efforts.
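
A validation run returns a metrics object that can be reduced to a few headline numbers. The sketch below assumes the `ultralytics` package and a detection model, whose metrics expose mean precision, mean recall, and mAP values under `metrics.box`; `summarize_box_metrics` and `validate` are hypothetical helpers, and the file names are placeholders.

```python
def summarize_box_metrics(metrics):
    """Extract headline detection metrics as plain floats."""
    box = metrics.box
    return {
        "precision": float(box.mp),   # mean precision over classes
        "recall": float(box.mr),      # mean recall over classes
        "mAP50": float(box.map50),    # mAP at IoU threshold 0.5
        "mAP50-95": float(box.map),   # mAP averaged over IoU 0.5 to 0.95
    }


def validate(weights_path: str, data_yaml: str, split: str = "val"):
    """Validate trained weights on one split of the dataset."""
    from ultralytics import YOLO  # lazy import; requires ultralytics

    model = YOLO(weights_path)
    metrics = model.val(data=data_yaml, split=split, imgsz=640)
    return summarize_box_metrics(metrics)
```

Running `validate` on a held-out split, such as footage from a site not used in training, gives a more honest picture of how the model copes with new conditions.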

10.7. Export Settings

Exporting a model for deployment involves adjusting settings like the target format (ONNX, TorchScript, etc.). Find detailed instructions in the Export Settings.

Exporting trained YOLO models in a marine context allows for their integration into field tools like AUVs or drones. By deploying models directly onto these devices, researchers can conduct in-situ analyses of underwater ecosystems, automatically identifying species or mapping features without needing constant human input. This is essential for remote, long-term monitoring projects.
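
The target format is selected with a single `format` argument at export time. This sketch assumes the `ultralytics` package; `EXPORT_FORMATS` and `export_model` are hypothetical names, and the mapping lists only a handful of the formats the library supports.

```python
# Friendly target name -> value of the `format` argument.
EXPORT_FORMATS = {
    "onnx": "onnx",
    "torchscript": "torchscript",
    "tensorrt": "engine",   # TensorRT exports use the "engine" format name
    "openvino": "openvino",
    "tflite": "tflite",
}


def export_model(weights_path: str, target: str = "onnx", imgsz: int = 640):
    """Export trained weights and return the path of the exported file."""
    if target not in EXPORT_FORMATS:
        raise ValueError(
            f"Unknown target {target!r}; expected one of {sorted(EXPORT_FORMATS)}"
        )
    from ultralytics import YOLO  # lazy import; requires ultralytics

    model = YOLO(weights_path)
    return model.export(format=EXPORT_FORMATS[target], imgsz=imgsz)
```

Which format suits an AUV or drone depends on its onboard hardware, so treat the choice of target here as something to confirm against the device you are deploying to.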

10.8. Augmentation Settings

Augmentation techniques introduce variability into training data to help the model generalize better. Learn how to apply these techniques in the Augmentation Settings.

Marine environments are highly variable, with changes in water clarity, light, and object appearance. Augmentation techniques, like flipping or changing brightness, simulate these variations, allowing YOLO models to perform well under different conditions. For example, augmenting images of fish in low-light conditions can help the model generalize better when deployed in deep-sea environments.
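
Flipping and brightness changes are controlled by training arguments such as `fliplr` and `hsv_v`. The values below are an illustrative guess at a low-light configuration, not tuned settings; `LOW_LIGHT_AUGMENTATION` and `check_augmentation` are hypothetical names, and the dictionary is meant to be passed to `model.train(...)` as extra keyword arguments.

```python
# Illustrative augmentation settings; values are assumptions, not tuned.
LOW_LIGHT_AUGMENTATION = {
    "hsv_h": 0.015,   # hue variation
    "hsv_s": 0.7,     # saturation variation
    "hsv_v": 0.6,     # brightness variation, raised to mimic dim water
    "fliplr": 0.5,    # probability of a left-right flip
    "flipud": 0.0,    # probability of an up-down flip
    "degrees": 10.0,  # random rotation range in degrees
    "mosaic": 1.0,    # probability of combining four images into one
}

# Keys whose values must lie between 0 and 1.
UNIT_RANGE_KEYS = ("hsv_h", "hsv_s", "hsv_v", "fliplr", "flipud", "mosaic")


def check_augmentation(settings: dict) -> dict:
    """Raise ValueError if a fraction or probability falls outside [0, 1]."""
    for key in UNIT_RANGE_KEYS:
        value = settings.get(key, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key}={value} is outside the range 0 to 1")
    return settings


# Example usage (assumes ultralytics; the dataset file is a placeholder):
#
#   from ultralytics import YOLO
#   model = YOLO("yolo11n.pt")
#   model.train(data="deep_sea_fish.yaml", epochs=50,
#               **check_augmentation(LOW_LIGHT_AUGMENTATION))
```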

10.9. Logging, Checkpoints, and Plotting Settings

Proper logging, checkpointing, and plotting are crucial for tracking your model’s progress. Details on how to implement these are available in the Logging, Checkpoints, and Plotting Settings.

In marine research, monitoring the progress of your model during training is essential, especially when working with datasets collected in extreme conditions like the deep sea. Logging ensures that researchers can keep track of where models perform well or need improvement, while checkpointing allows training to resume if interrupted. Visual plots can help compare results across different environmental conditions, providing insights into model performance across multiple marine ecosystems.
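
Resuming after an interruption might look like the sketch below. It assumes the `ultralytics` package and that a run is saved under `project/name` with its latest checkpoint at `weights/last.pt`; that layout, the helper names `last_checkpoint` and `train_or_resume`, and the default project and run names are all assumptions to check against your own run directory.

```python
from pathlib import Path


def last_checkpoint(project: str, name: str) -> Path:
    """Path where the latest checkpoint of a run is assumed to be saved."""
    return Path(project) / name / "weights" / "last.pt"


def train_or_resume(data_yaml: str, project: str = "runs/marine",
                    name: str = "reef_survey"):
    """Resume an interrupted run if a checkpoint exists, else start fresh."""
    from ultralytics import YOLO  # lazy import; requires ultralytics

    checkpoint = last_checkpoint(project, name)
    if checkpoint.exists():
        # Resuming only works for runs that stopped before finishing.
        return YOLO(str(checkpoint)).train(resume=True)

    model = YOLO("yolo11n.pt")
    return model.train(
        data=data_yaml,
        epochs=100,
        project=project,   # parent folder for this experiment's runs
        name=name,         # folder for this particular run
        exist_ok=True,     # reuse the folder rather than creating a new one
        save_period=10,    # also save a checkpoint every 10 epochs
        plots=True,        # write training curves and example plots
    )
```

Keeping one `name` per survey site or environmental condition makes it easier to compare the saved plots across runs afterward.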

10.10. Next Steps

The next lessons will make heavy use of these parameters. It’s important to know them well to optimize training and inference performance. Bookmark the Ultralytics YOLO documentation to stay updated on new features and default values.

In marine science, understanding these settings can significantly enhance your ability to build accurate, deployable models. Whether you’re studying marine life in shallow coastal areas or monitoring deep-sea ecosystems, a solid grasp of these YOLO parameters helps your model handle the complexities of underwater environments.