6. Image Manipulation in Python with PIL and OpenCV
6.1. Overview
In this lesson, we will explore image manipulation using two popular Python libraries: PIL (Pillow) and OpenCV. You will learn how to load, display, resize, rotate, and apply basic filters to images using these libraries. Additionally, we will compare the functionality and performance of PIL and OpenCV.
By the end of this lesson, you will have a better understanding of how to perform image manipulation tasks programmatically and how to choose between PIL and OpenCV depending on your needs.
6.1.1. Learning Objectives
By the end of this section, you will:
Understand the importance of selecting appropriate image augmentations for training machine learning models in underwater imagery analysis.
Learn how to apply image transformations such as shearing, cropping, adding noise, and adjusting brightness using PIL and OpenCV.
Develop a deeper understanding of how augmentations improve model robustness in real-world scenarios, such as handling camera misalignment, murky water conditions, and variable lighting during AUV surveys.
6.2. The Theory Behind Choosing Augmentations for Training Imagery
When training machine learning models for tasks with highly variable conditions, such as underwater image analysis, image augmentation is a powerful tool for improving robustness and performance. Augmentation creates variations of the training data, which helps models generalize to real-world conditions. Each augmentation below targets a specific challenge of underwater surveys.
Shearing simulates the camera tilt commonly seen in AUV surveys, where slight angles and shifts distort how objects are captured. Training on sheared images teaches models to recognize objects even when they are skewed by misalignment during transect movement.
Cropping mimics scenarios where cameras, such as stationary underwater systems, capture only part of an object, for example fish or coral cut off at the edges of the frame. Training on cropped images helps models detect partial objects and handle incomplete data.
Adding noise replicates the challenge of detecting objects in degraded imagery from murky water with sediment or low visibility. Noise simulates particles in the water, preparing models to distinguish features despite visual interference.
Brightness adjustments address lighting conditions that change with depth or time of day. Models exposed to images with different brightness levels adapt better to fluctuations in light intensity.
Rotation helps models handle the misalignment that occurs naturally in dynamic underwater environments, where cameras are not always perfectly horizontal.
Blurring simulates motion or water flow, which is useful when image clarity is compromised by movement during data collection.
Tailoring these augmentations to the challenges of underwater surveys helps build models that are more adaptable, improving detection accuracy and resilience in diverse conditions.
The following sections provide examples of how to implement these augmentations in PIL and OpenCV.
Note
The sections below are meant as a reference that you can come back to when you need these augmentations in future activities.
6.3. Loading and Displaying Images
Before performing any manipulations, we need to load and display images. Both PIL and OpenCV provide easy ways to handle this.
6.3.1. Using PIL:
# Importing required libraries
from PIL import Image
import matplotlib.pyplot as plt
# Load an image using PIL
pil_image = Image.open('/path/to/your/image.jpg')
# Display the image using matplotlib
plt.imshow(pil_image)
plt.axis('off') # Hide axes
plt.title("Image Loaded with PIL")
plt.show()
6.3.2. Using OpenCV:
# Importing required libraries
import cv2
import matplotlib.pyplot as plt
# Load an image using OpenCV
opencv_image = cv2.imread('/path/to/your/image.jpg')
# Convert the image from BGR to RGB (OpenCV loads images in BGR by default)
opencv_image_rgb = cv2.cvtColor(opencv_image, cv2.COLOR_BGR2RGB)
# Display the image using matplotlib
plt.imshow(opencv_image_rgb)
plt.axis('off') # Hide axes
plt.title("Image Loaded with OpenCV")
plt.show()
6.4. Resizing Images
Resizing is one of the most common operations when working with images. We will resize images using both PIL and OpenCV.
6.4.1. Using PIL:
# Resizing an image using PIL
pil_resized_image = pil_image.resize((200, 200)) # Resize to 200x200 pixels
# Display resized image
plt.imshow(pil_resized_image)
plt.axis('off')
plt.title("Resized Image using PIL")
plt.show()
6.4.2. Using OpenCV:
# Resizing an image using OpenCV
opencv_resized_image = cv2.resize(opencv_image_rgb, (200, 200)) # Resize to 200x200 pixels
# Display resized image
plt.imshow(opencv_resized_image)
plt.axis('off')
plt.title("Resized Image using OpenCV")
plt.show()
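Resizing to a fixed 200x200 distorts any image that is not square. When the proportions matter, the target height can be derived from the target width. This is a small sketch with a hypothetical helper name, `resize_keep_aspect`, and a synthetic image. Note that both `Image.resize` and `cv2.resize` take the size as (width, height), while a NumPy array's `shape` is (height, width).

```python
from PIL import Image


def resize_keep_aspect(image, target_width):
    """Resize a PIL image to target_width, scaling the height proportionally."""
    width, height = image.size
    target_height = max(1, round(height * target_width / width))
    return image.resize((target_width, target_height))


demo = Image.new("RGB", (400, 300))      # synthetic 400x300 image
resized = resize_keep_aspect(demo, 200)
print(resized.size)  # (200, 150)
```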
6.5. Rotating Images
Rotating images is another common transformation. We can easily rotate images using both PIL and OpenCV.
6.5.1. Using PIL:
# Rotating an image using PIL
pil_rotated_image = pil_image.rotate(45) # Rotate by 45 degrees
# Display rotated image
plt.imshow(pil_rotated_image)
plt.axis('off')
plt.title("Rotated Image using PIL")
plt.show()
6.5.2. Using OpenCV:
# Rotating an image using OpenCV
# First, we need to define the rotation matrix and apply it to the image
(h, w) = opencv_image_rgb.shape[:2]
center = (w // 2, h // 2)
# Rotate the image by 45 degrees
rotation_matrix = cv2.getRotationMatrix2D(center, 45, 1.0)
opencv_rotated_image = cv2.warpAffine(opencv_image_rgb, rotation_matrix, (w, h))
# Display rotated image
plt.imshow(opencv_rotated_image)
plt.axis('off')
plt.title("Rotated Image using OpenCV")
plt.show()
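Both snippets above keep the original canvas size, so the corners of the rotated image are cut off. In both libraries a positive angle rotates counter-clockwise. If the whole rotated image should stay visible, PIL's `expand=True` option enlarges the canvas. The sketch below uses a synthetic image for illustration.

```python
from PIL import Image

demo = Image.new("RGB", (200, 100), "white")

cropped_corners = demo.rotate(45)            # output keeps the 200x100 canvas
full_view = demo.rotate(45, expand=True)     # canvas grows to fit the rotated image

print(cropped_corners.size)  # (200, 100)
print(full_view.size)        # larger than the original in both dimensions
```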
6.6. Applying Filters
Both PIL and OpenCV offer the ability to apply filters to images. We’ll demonstrate a simple blur effect.
6.6.1. Using PIL (Gaussian Blur):
from PIL import ImageFilter
# Apply Gaussian Blur using PIL
pil_blurred_image = pil_image.filter(ImageFilter.GaussianBlur(5)) # Apply a blur with radius 5
# Display blurred image
plt.imshow(pil_blurred_image)
plt.axis('off')
plt.title("Blurred Image using PIL")
plt.show()
6.6.2. Using OpenCV (Gaussian Blur):
# Apply Gaussian Blur using OpenCV
opencv_blurred_image = cv2.GaussianBlur(opencv_image_rgb, (15, 15), 0) # Apply a blur with a 15x15 kernel
# Display blurred image
plt.imshow(opencv_blurred_image)
plt.axis('off')
plt.title("Blurred Image using OpenCV")
plt.show()
6.7. Adjusting Brightness
Brightness is a key property of an image that you can manipulate to make an image lighter or darker.
6.7.1. Using PIL:
from PIL import ImageEnhance
# Enhance brightness using PIL
enhancer = ImageEnhance.Brightness(pil_image)
pil_bright_image = enhancer.enhance(1.5) # Increase brightness by 50%
# Display brightened image
plt.imshow(pil_bright_image)
plt.axis('off')
plt.title("Brightened Image using PIL")
plt.show()
6.7.2. Using OpenCV:
# Adjust brightness: alpha scales pixel values (gain), beta adds a constant offset
opencv_bright_image = cv2.convertScaleAbs(opencv_image_rgb, alpha=1.2, beta=50) # Multiply by 1.2, then add 50
# Display brightened image
plt.imshow(opencv_bright_image)
plt.axis('off')
plt.title("Brightened Image using OpenCV")
plt.show()
6.8. Shearing Images
Shearing is a transformation that slants the shape of an object. We’ll shear images using PIL and OpenCV.
6.8.1. Using PIL:
from PIL import ImageTransform
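# Create a shear transform: coefficients (a, b, c, d, e, f), where each output pixel (x, y) is read from input position (a*x + b*y + c, d*x + e*y + f)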
shear_matrix = (1, 0.5, 0, 0.5, 1, 0)
# Apply shear using PIL
pil_sheared_image = pil_image.transform(pil_image.size, ImageTransform.AffineTransform(shear_matrix))
# Display sheared image
plt.imshow(pil_sheared_image)
plt.axis('off')
plt.title("Sheared Image using PIL")
plt.show()
6.8.2. Using OpenCV:
import numpy as np
# Apply shear using OpenCV
rows, cols, ch = opencv_image_rgb.shape
M = np.float32([[1, 0.5, 0], [0.5, 1, 0]]) # Shearing matrix (shears along both axes)
opencv_sheared_image = cv2.warpAffine(opencv_image_rgb, M, (cols, rows))
# Display sheared image
plt.imshow(opencv_sheared_image)
plt.axis('off')
plt.title("Sheared Image using OpenCV")
plt.show()
6.9. Flipping Images
Flipping an image horizontally or vertically can be useful in data augmentation for machine learning tasks.
6.9.1. Using PIL:
# Flip an image horizontally using PIL
pil_flipped_image = pil_image.transpose(Image.FLIP_LEFT_RIGHT)
# Display flipped image
plt.imshow(pil_flipped_image)
plt.axis('off')
plt.title("Flipped Image using PIL")
plt.show()
6.9.2. Using OpenCV:
# Flip an image horizontally using OpenCV
opencv_flipped_image = cv2.flip(opencv_image_rgb, 1) # 1 for horizontal flipping
# Display flipped image
plt.imshow(opencv_flipped_image)
plt.axis('off')
plt.title("Flipped Image using OpenCV")
plt.show()
6.10. Cropping Images
Cropping allows you to select a specific region of the image.
6.10.1. Using PIL:
# Crop a region of the image using PIL
left, upper, right, lower = 50, 50, 200, 200
pil_cropped_image = pil_image.crop((left, upper, right, lower))
# Display cropped image
plt.imshow(pil_cropped_image)
plt.axis('off')
plt.title("Cropped Image using PIL")
plt.show()
6.10.2. Using OpenCV:
# Crop a region of the image using OpenCV
opencv_cropped_image = opencv_image_rgb[50:200, 50:200]
# Display cropped image
plt.imshow(opencv_cropped_image)
plt.axis('off')
plt.title("Cropped Image using OpenCV")
plt.show()
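For augmentation, the crop window is usually chosen at random rather than fixed. The following is a minimal sketch of such a helper; the name `random_crop` and the synthetic image are assumptions. Keep in mind that PIL's crop box is (left, upper, right, lower), whereas NumPy slicing takes rows (y) first and columns (x) second.

```python
import numpy as np


def random_crop(image, crop_h, crop_w, rng):
    """Return a random crop_h x crop_w window from a NumPy image array."""
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("Crop size must not exceed the image size")
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    # NumPy indexing is [rows, columns], i.e. [y, x]
    return image[top:top + crop_h, left:left + crop_w]


rng = np.random.default_rng(seed=0)
demo = np.zeros((240, 320, 3), dtype=np.uint8)
patch = random_crop(demo, 150, 150, rng)
print(patch.shape)  # (150, 150, 3)
```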
6.11. Converting Images to Grayscale
Converting an image to grayscale reduces the number of color channels from three to one, which is useful for some computer vision tasks. (See the next session to learn why.)
6.11.1. Using PIL:
# Convert image to grayscale using PIL
pil_gray_image = pil_image.convert('L')
# Display grayscale image
plt.imshow(pil_gray_image, cmap='gray')
plt.axis('off')
plt.title("Grayscale Image using PIL")
plt.show()
6.11.2. Using OpenCV:
# Convert image to grayscale using OpenCV
opencv_gray_image = cv2.cvtColor(opencv_image_rgb, cv2.COLOR_RGB2GRAY)
# Display grayscale image
plt.imshow(opencv_gray_image, cmap='gray')
plt.axis('off')
plt.title("Grayscale Image using OpenCV")
plt.show()
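Both conversions above are documented as a weighted sum of the three channels, with green weighted most heavily. The NumPy sketch below applies those weights by hand to three pure colors; the helper name `to_gray` is hypothetical, and the libraries' own results may differ from it by a rounding step.

```python
import numpy as np


def to_gray(rgb):
    """Weighted-sum grayscale conversion for an RGB uint8 array."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)


pure_colors = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
print(to_gray(pure_colors))  # [[ 76 150  29]]
```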
6.12. Adding Noise to Images
Adding random noise to an image simulates the real-world noise caused by murky water, dirty cameras, and similar conditions, which makes it especially useful for training machine learning models.
6.12.1. Using NumPy and PIL:
import numpy as np
# Add random noise to the image using PIL
pil_noisy_image = np.array(pil_image).astype(np.float64)
noise = np.random.normal(0, 25, pil_noisy_image.shape)
pil_noisy_image = np.clip(pil_noisy_image + noise, 0, 255).astype(np.uint8)
# Display noisy image
plt.imshow(pil_noisy_image)
plt.axis('off')
plt.title("Noisy Image using PIL")
plt.show()
6.12.2. Using OpenCV:
# Add random noise to the image using OpenCV
opencv_noisy_image = opencv_image_rgb.astype(np.float64)
noise = np.random.normal(0, 25, opencv_noisy_image.shape)
opencv_noisy_image = np.clip(opencv_noisy_image + noise, 0, 255).astype(np.uint8)
# Display noisy image
plt.imshow(opencv_noisy_image)
plt.axis('off')
plt.title("Noisy Image using OpenCV")
plt.show()
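Because both versions operate on NumPy arrays, the noise step can be wrapped in one helper and made reproducible with a seeded generator. This is a minimal sketch; the helper name `add_gaussian_noise` and the synthetic gray image are assumptions.

```python
import numpy as np


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to a uint8 image and clip to 0..255."""
    noise = rng.normal(0, sigma, image.shape)
    noisy = image.astype(np.float64) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)


demo = np.full((64, 64, 3), 128, dtype=np.uint8)
noisy_a = add_gaussian_noise(demo, 25, np.random.default_rng(seed=42))
noisy_b = add_gaussian_noise(demo, 25, np.random.default_rng(seed=42))
print(np.array_equal(noisy_a, noisy_b))  # True: same seed, same noise
```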
6.13. Histogram Equalization
Histogram equalization improves the contrast in images by spreading out the intensity values. Pillow offers ImageOps.equalize for this, but this lesson uses OpenCV.
6.13.1. Using OpenCV:
# Apply histogram equalization using OpenCV
opencv_gray_image = cv2.cvtColor(opencv_image_rgb, cv2.COLOR_RGB2GRAY)
opencv_hist_eq_image = cv2.equalizeHist(opencv_gray_image)
# Display equalized image
plt.imshow(opencv_hist_eq_image, cmap='gray')
plt.axis('off')
plt.title("Histogram Equalized Image using OpenCV")
plt.show()
6.14. Edge Detection
Detecting edges is useful for object detection and shape analysis. Pillow provides only a simple edge filter (ImageFilter.FIND_EDGES), so this lesson uses OpenCV's Canny detector.
6.14.1. Using OpenCV:
# Perform edge detection using Canny method in OpenCV
opencv_edges = cv2.Canny(opencv_gray_image, 100, 200)
# Display edges
plt.imshow(opencv_edges, cmap='gray')
plt.axis('off')
plt.title("Edge Detection using OpenCV")
plt.show()
6.15. Blending Two Images
Blending combines two images with a specified weight ratio.
6.15.1. Using OpenCV:
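# Blend two images using OpenCV
# A horizontally flipped copy stands in for a second image here; replace it with your own image
opencv_image2 = cv2.flip(opencv_image_rgb, 1)
# Both images must have the same size, so resize the second image to match the first
opencv_image2 = cv2.resize(opencv_image2, (opencv_image_rgb.shape[1], opencv_image_rgb.shape[0]))
blended_image = cv2.addWeighted(opencv_image_rgb, 0.7, opencv_image2, 0.3, 0) # 70% first image, 30% second image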
# Display blended image
plt.imshow(blended_image)
plt.axis('off')
plt.title("Blended Image using OpenCV")
plt.show()
6.16. Thresholding
Thresholding converts an image to a binary (black-and-white) image by comparing each pixel against a threshold value. This lesson uses OpenCV for thresholding.
6.16.1. Using OpenCV:
# Apply simple thresholding in OpenCV
_, thresholded_image = cv2.threshold(opencv_gray_image, 128, 255, cv2.THRESH_BINARY)
# Display thresholded image
plt.imshow(thresholded_image, cmap='gray')
plt.axis('off')
plt.title("Thresholded Image using OpenCV")
plt.show()
6.17. Affine Transformation
Affine transformations include rotation, translation, scaling, and shearing. They preserve collinearity: straight lines remain straight, and parallel lines remain parallel.
6.17.1. Using OpenCV:
# Define the affine transformation matrix
rows, cols, ch = opencv_image_rgb.shape
src_points = np.float32([[50, 50], [200, 50], [50, 200]])
dst_points = np.float32([[10, 100], [200, 50], [100, 250]])
affine_matrix = cv2.getAffineTransform(src_points, dst_points)
# Apply affine transformation
opencv_affine_image = cv2.warpAffine(opencv_image_rgb, affine_matrix, (cols, rows))
# Display affine transformed image
plt.imshow(opencv_affine_image)
plt.axis('off')
plt.title("Affine Transformation using OpenCV")
plt.show()
6.18. Color Space Conversion: BGR to RGB
OpenCV loads images in BGR (Blue, Green, Red) channel order by default, whereas most other libraries, including PIL and Matplotlib, expect RGB (Red, Green, Blue). To display colors correctly, convert from BGR to RGB before passing an OpenCV image to Matplotlib.
6.18.1. Using OpenCV:
# Convert image from BGR to RGB using OpenCV
opencv_image_bgr = cv2.imread('/path/to/your/image.jpg') # Load the image in BGR format
opencv_image_rgb = cv2.cvtColor(opencv_image_bgr, cv2.COLOR_BGR2RGB) # Convert to RGB
# Display the RGB image
plt.imshow(opencv_image_rgb)
plt.axis('off')
plt.title("BGR to RGB Conversion using OpenCV")
plt.show()