Robust Underwater Image Classification using Image Segmentation, CNN, and dynamic ROI Approximation

Towards Learning Technical Systems

Stefan Bosse1,*

Parth Kasundra2

1University of Bremen, Dept. Mathematics & Computer Science, Bremen, Germany

2marinom GmbH, Bremen

*Presenting author

Stefan Bosse - Robust Underwater Image Classification - Overview

Overview

Let's talk about self-adaptive and intelligent technical systems applied to sensor data.

Object and Region-of-Interest detection in underwater images is a challenge, even for humans and experts!

Domain-specific Automated Region-of-Interest detection is addressed in this work.

Challenges

  • Underwater images are of low quality with respect to illumination conditions, sharpness, and noise.

  • Finding ROIs automatically can help humans to identify relevant regions in the image quickly, or the ROIs can be used as input for automated inspection and structural health monitoring (SHM).

Underwater inspection of technical structures, e.g., piles of offshore wind turbines, typically aims to find material changes of the construction, e.g., rust or coverage with pocks, in order to make decisions about repair and to assess operational safety.

  • Currently, divers have to go under water to inspect the piles of offshore wind turbines.

  • But even when humans inspect the underwater surfaces (under water by a diver or remotely), the scenes are cluttered and the identification of surface coverage is a challenge.

  • Automated visual inspection is desired to reduce maintenance and service times.

Goals

The image segment classification and ROI detection algorithms should be implementable on embedded systems, e.g., directly integrated in camera systems with application-specific co-processor support.

The aim is to achieve an accuracy of at least 85-90% for the predicted images, with a high degree of generalization and independence from various image and environmental parameters such as lighting conditions and background colouration, as well as relevant classification features.

Summary and Highlights

  • We propose and evaluate a hybrid approach with segmented classification using small-scale CNN classifiers (with fewer than 20000 hyper-parameters and fewer than 3 million unity vector operations)

  • A reconstruction of labelled ROIs is provided by using an iterative mean and expandable bounding box algorithm.

    • The iterative bounding box algorithm combined with bounding box overlap checking suppresses spurious wrong segment classifications and yields the best and most accurate matching ROI for a specific classification label, e.g., surfaces with pocks coverage.

The overall classification accuracy (true-positive classification) with respect to single segments is about 70%, but with respect to the iteratively expanded ROI bounding boxes it is about 90%.

Images and Classes

The underwater inspection of technical structures, e.g., construction parts of off-shore wind turbines like piles, involves the identification of various parts in the underwater images:

  1. Background with water, bubbles, and fishes, summarized as feature class B;
  2. Technical structure, e.g., a mono pile of a wind turbine, summarized as feature class P;
  3. Formation of coverage with marine vegetation or organisms on the surface of the structure, summarized as feature class C.

  • The image set consists of RGB underwater images with high variance in
    • illumination conditions,
    • spatial orientation,
    • noise (bubbles, blurring), and
    • colour palettes.
  • The images are snapshots taken from videos recorded by a human diver with an underwater camera.

Methods and Architecture

Methods

  1. Manual image labelling with labelled polygon regions
  2. Segmentation of raw input images into smaller segment images (segment size n × m pixels, e.g., 64 × 64)
  3. Convolutional Neural Networks (CNN)
    • Input: Image Segment
    • Output: Class probabilities (B, P, C) and maximum likelihood selection of best candidate (or none)
  4. Iterative rectangular bounding box approximation using density-based clustering (DBSCAN) and centre-of-mass (COM) computations; COM determines the centre of the bounding box.
    • Input: All labelled segments of input images
    • Output: Set of labelled ROI coordinates
  5. Post-correction of overlapping, conflicting labelled ROIs
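
Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice to skip incomplete border segments, and the probability threshold used for the "none" case are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence, Tuple

CLASSES = ("B", "P", "C")  # background, technical structure (pile), coverage


def segment_origins(width: int, height: int, n: int = 64, m: int = 64) -> List[Tuple[int, int]]:
    """Top-left corners of all complete n x m segments of a width x height image."""
    return [(x, y)
            for y in range(0, height - m + 1, m)
            for x in range(0, width - n + 1, n)]


def select_class(probs: Sequence[float], threshold: float = 0.5) -> Optional[str]:
    """Pick the most probable class, or None if no class reaches the threshold.

    The threshold is a hypothetical rule for the "or none" case.
    """
    best = max(range(len(probs)), key=lambda k: probs[k])
    return CLASSES[best] if probs[best] >= threshold else None
```

Each origin would be cropped to a segment image and passed to the CNN; `select_class` then turns the three output probabilities into one label or no label.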

Architecture

Figure: Overview of the data-flow architecture and the algorithms used

Software

Web browser-based software architecture with remote shell worker processes (Bosse, Appl. Sciences, 2022)

CNN Parameters

  • Four different CNN model architectures were evaluated:
Arch.     Layer    Filter         Activation  Output     Parameter  VecOps
A (8/16)  Conv     [5×5]×8,s=1    -           64×64×8    608        4915200
          Relu     -              relu        64×64×8    32768      32768
          Pool     [2×2]×8,s=2    -           32×32×8    0          8192
          Conv     [5×5]×16,s=1   -           32×32×16   3216       6553600
          Relu     -              relu        32×32×16   16384      16384
          Pool     [3×3]×16,s=3   -           10×10×16   0          1600
          Fc       -              relu        1×1×3      4803       9600
          SoftMax  -              -           3          3          3
          Σ                                              57782      11537347
B (4/8)   Conv     [5×5]×4,s=1    -           64×64×4    304        2457600
          Relu     -              relu        64×64×4    16384      16384
          Pool     [2×2]×4,s=2    -           32×32×4    0          4096
          Conv     [5×5]×8,s=1    -           32×32×8    808        1628400
          Relu     -              relu        32×32×8    8192       8192
          Pool     [3×3]×8,s=3    -           10×10×8    0          800
          Fc       -              relu        1×1×3      2403       4800
          SoftMax  -              -           3          3          3
          Σ                                              28094      4127878

Arch.     Layer    Filter         Activation  Output     Parameter  VecOps
C (8/8)   Conv     [5×5]×8,s=1    -           64×64×8    608        4915200
          Relu     -              relu        64×64×8    32768      32768
          Pool     [2×2]×8,s=2    -           32×32×8    0          8192
          Conv     [5×5]×8,s=1    -           32×32×8    1608       3276800
          Relu     -              relu        32×32×8    8192       8192
          Pool     [3×3]×8,s=3    -           10×10×8    0          800
          Fc       -              relu        1×1×3      2403       4800
          SoftMax  -              -           3          3          3
          Σ                                              45582      8246755
D (4/4)   Conv     [5×5]×4,s=1    -           64×64×4    304        2457600
          Relu     -              relu        64×64×4    16384      16384
          Pool     [2×2]×4,s=2    -           32×32×4    0          4096
          Conv     [5×5]×4,s=1    -           32×32×4    404        819200
          Relu     -              relu        32×32×4    4096       4096
          Pool     [3×3]×4,s=3    -           10×10×4    0          400
          Fc       -              relu        1×1×3      1203       2400
          SoftMax  -              -           3          3          3
          Σ                                              22394      3304179
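
The Conv and Fc entries of the Parameter column agree with the usual counting rules (weights plus one bias per filter or output neuron). The following Python sketch reproduces them. It assumes an RGB input with 3 channels, 5 × 5 convolutions that preserve the spatial size, and pooling without padding; the function names are illustrative and not taken from the original software.

```python
from typing import Dict


def conv_params(k: int, c_in: int, filters: int) -> int:
    """Weights plus one bias per filter of a k x k convolution."""
    return (k * k * c_in + 1) * filters


def fc_params(n_in: int, n_out: int) -> int:
    """Weights plus one bias per output neuron of a fully connected layer."""
    return n_in * n_out + n_out


def pooled_size(size: int, k: int, stride: int) -> int:
    """Output edge length of a pooling layer without padding."""
    return (size - k) // stride + 1


def trainable_params(f1: int, f2: int, size: int = 64,
                     channels: int = 3, classes: int = 3) -> Dict[str, int]:
    """Parameter counts of the Conv, Conv, and Fc layers for filter counts f1/f2."""
    s1 = pooled_size(size, 2, 2)  # 64 -> 32 after the [2x2], s=2 pool
    s2 = pooled_size(s1, 3, 3)    # 32 -> 10 after the [3x3], s=3 pool
    return {
        "conv1": conv_params(5, channels, f1),
        "conv2": conv_params(5, f1, f2),
        "fc": fc_params(s2 * s2 * f2, classes),
    }
```

For example, `trainable_params(8, 16)` gives 608, 3216, and 4803 for architecture A, and `trainable_params(4, 4)` gives 304, 404, and 1203 for architecture D.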

Mean Bounding Box Algorithm (MBB)

  • There is a set of class symbols Σ and a class matrix M consisting of elements labelling an image segment with a class, so that:

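$$\Sigma = \{B, P, C, U\}, \quad \sigma \in \Sigma, \quad \hat{M} = \begin{pmatrix} \sigma_{1,1} & \dots & \sigma_{1,j} \\ \sigma_{2,1} & \dots & \sigma_{2,j} \\ \vdots & & \vdots \\ \sigma_{i,1} & \dots & \sigma_{i,j} \end{pmatrix}$$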

  • The matrix M is flattened to a set of point clouds P={pσ}σ∈Σ, with one point list per class.
  • Each class set pσ contains the matrix positions of the respective elements, i.e., pσ={ ⟨i,j⟩}, for all segments that the CNN assigned to the same label class σ∈Σ.
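
The flattening step can be sketched in Python as follows. This is a minimal illustration rather than the original implementation; the function name `flatten_class_matrix` is hypothetical, and the indices are 0-based here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]


def flatten_class_matrix(matrix: Sequence[Sequence[str]]) -> Dict[str, List[Point]]:
    """Group the (row, column) positions of a class matrix by class label."""
    clouds: Dict[str, List[Point]] = defaultdict(list)
    for i, row in enumerate(matrix):
        for j, label in enumerate(row):
            clouds[label].append((i, j))
    return dict(clouds)
```

Applied to a 2 × 2 class matrix `[["B", "P"], ["C", "P"]]`, the class P would be mapped to the positions (0, 1) and (1, 1).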

  • The DBSCAN clustering returns groups of points that satisfy the clustering conditions, with one list of point groups for each label class:

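$$\mathrm{DBSCAN}: P \rightarrow \{\{p_j\}_j, \{p_k\}_k, \{p_l\}_l, \dots\}, \quad j \neq k \neq l$$

$$P: \{p_i\}_i, \quad i = \{1, 2, 3, \dots, n\}, \quad p_i = \langle i, j \rangle \in \mathbb{R}^2$$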

  • It is assumed that a cluster will contain a majority of correctly classified points (segments) and a minority of scattered, wrongly classified points.
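
For readers unfamiliar with DBSCAN, the following is a compact pure-Python sketch of the standard algorithm applied to the segment positions of one class. The parameter values `eps` and `min_pts` are assumptions chosen for a unit grid; the values used in the presented work are not stated here.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def dbscan(points: List[Point], eps: float = 1.5, min_pts: int = 3) -> List[List[Point]]:
    """Return the clusters of `points`; points that stay noise are left out."""

    def neighbours(idx: int) -> List[int]:
        px, py = points[idx]
        return [k for k, (qx, qy) in enumerate(points)
                if (px - qx) ** 2 + (py - qy) ** 2 <= eps ** 2]

    # None = unvisited, -1 = noise, >= 0 = cluster index
    labels: List[Optional[int]] = [None] * len(points)
    clusters: List[List[Point]] = []
    for idx in range(len(points)):
        if labels[idx] is not None:
            continue
        seeds = neighbours(idx)
        if len(seeds) < min_pts:
            labels[idx] = -1          # not a core point, noise for now
            continue
        cid = len(clusters)
        clusters.append([points[idx]])
        labels[idx] = cid
        queue = list(seeds)
        while queue:
            k = queue.pop()
            if labels[k] == -1:       # former noise becomes a border point
                labels[k] = cid
                clusters[cid].append(points[k])
                continue
            if labels[k] is not None:  # already assigned to a cluster
                continue
            labels[k] = cid
            clusters[cid].append(points[k])
            nb = neighbours(k)
            if len(nb) >= min_pts:    # core point: keep growing the cluster
                queue.extend(nb)
    return clusters
```

Isolated points, such as a single wrongly classified segment far away from all others, end up in no cluster and are thereby filtered out before the bounding box computation.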

Figure: Iterative bounding box expansion with final shrinking of conflicting overlaps

  • The MBB algorithm computes the points ⟨x1,y1,x2,y2⟩ of a bounding box that is centred at the centre-of-mass point c of all points of a cluster. Its outer sides are given by the mean position of all points above or below, and left or right from the point c.

  • A previously computed bounding box is expanded in the next extension iteration using all points outside of the current bounding box.

    • Again, a spatial position averaging is performed, extending the boundary of the bounding box.
    • The expansion is performed iteratively.
    • Each step includes more points, but increases the probability that the bounding box is over-sized with respect to spurious outlier points that result from wrong CNN classifications.
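
One possible reading of this description is sketched below in Python. It is an interpretation, not the original code: the function names, the per-axis treatment of the sides, and the rule that a side stays in place when no points lie beyond it are assumptions.

```python
from statistics import mean
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def _side(values: List[float], fallback: float) -> float:
    """Mean of the values, or the fallback if there are none."""
    return mean(values) if values else fallback


def mean_bounding_box(points: List[Point], iterations: int = 1) -> Box:
    """Mean bounding box of a cluster with `iterations` expansion steps."""
    cx = mean(x for x, _ in points)
    cy = mean(y for _, y in points)
    # Initial box: each side is the mean coordinate of the points on that side of c.
    x1 = _side([x for x, _ in points if x < cx], cx)
    x2 = _side([x for x, _ in points if x > cx], cx)
    y1 = _side([y for _, y in points if y < cy], cy)
    y2 = _side([y for _, y in points if y > cy], cy)
    for _ in range(iterations):
        # Expansion: each side moves to the mean of the points still beyond it.
        x1 = _side([x for x, _ in points if x < x1], x1)
        x2 = _side([x for x, _ in points if x > x2], x2)
        y1 = _side([y for _, y in points if y < y1], y1)
        y2 = _side([y for _, y in points if y > y2], y2)
    return (x1, y1, x2, y2)
```

For a full 5 × 5 grid of points with coordinates 0 to 4, the initial box is (0.5, 0.5, 3.5, 3.5) and one expansion step yields (0, 0, 4, 4). Because each side is an average, a single distant outlier pulls the box outwards only gradually, at the cost of more iterations.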

  • With a high number of iterations, bounding boxes from different classes can overlap.

  • To reduce overlapping conflicts, a class priority is introduced layering the class regions by relevance.

  • After the ROI expansion is done, overlapping bounding boxes with lower priority are shrunk until all overlap conflicts are resolved.

    • Commonly, more than one side of the bounding box can be shrunk to reduce the overlap conflict.
    • The possible candidates are evaluated and sorted with respect to the amount of shrinkage at each side.
    • The lowest shrinkage is applied first. If the conflict is not reduced by shrinking the selected side, the next side is shrunk until the conflict (with one or more higher-priority bounding boxes) is reduced.
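
The side selection can be illustrated with a small Python sketch for the simplified case of one lower-priority box and one higher-priority box. All names are hypothetical, and the handling of several higher-priority boxes at once is not covered.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share an area larger than zero."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def shrink_candidates(low: Box, high: Box) -> List[Tuple[float, Box]]:
    """Single-side shrinks of `low` that clear `high`, smallest shrinkage first."""
    x1, y1, x2, y2 = low
    hx1, hy1, hx2, hy2 = high
    candidates = [
        (hx2 - x1, (hx2, y1, x2, y2)),  # move the x1 side inwards
        (x2 - hx1, (x1, y1, hx1, y2)),  # move the x2 side inwards
        (hy2 - y1, (x1, hy2, x2, y2)),  # move the y1 side inwards
        (y2 - hy1, (x1, y1, x2, hy1)),  # move the y2 side inwards
    ]
    # Keep only shrinks that leave a box with positive width and height.
    valid = [(d, b) for d, b in candidates if b[0] < b[2] and b[1] < b[3]]
    return sorted(valid, key=lambda c: c[0])


def resolve_overlap(low: Box, high: Box) -> Box:
    """Shrink the lower-priority box by the cheapest side that removes the overlap."""
    if not overlaps(low, high):
        return low
    for _, box in shrink_candidates(low, high):
        if not overlaps(box, high):
            return box
    return low  # no single-side shrink helps, e.g. `low` lies inside `high`
```

With `low = (0, 0, 10, 10)` and `high = (8, 2, 12, 6)`, moving the x2 side costs only 2 units, so the result is `(0, 0, 8, 10)`.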

Experiments and Results

Training and test data: Randomly chosen subset of about 10000 image segments taken from about 300 snapshot images

Four models with identical architecture were trained and applied in parallel (each with a different random initialisation and a different subset of images)

Results

Classified bounding boxes for one image using four models trained in parallel (same parameters) but with different random initialisation and training data sub-set (Blue: class background, red: class coverage, green: class free construction surface)

Data Set   Total error (¬TPC) %   Error (¬TP) per class %          Prediction accuracy, class C %
                                  B        P        C          TP        FP       TN        FN
Training   10.6±1.5               5.0±3.4  6.0±2.8  21.0±7.1   79.0±7    4.8±2    94.7±6.6  10.5±3.1
Test       11.1±1.8               5.3±2.6  5.8±3.2  22.0±8.3   78.0±4.3  5.1±2.2  95.1±2.1  11.0±4.4
All        10.9±1.6               4.2±2.8  5.9±3.4  21.7±8     78.4±8    5.0±2.2  95.0±2.2  10.8±4

Accumulated prediction results for the training set, the test set, and the entire data set (union), with statistical features of the model ensemble trained in parallel (using different data subsets and random initialisation). All errors are given with a 2σ standard deviation interval, for N=9000 samples (n=3000 for each class), using CNN architecture A.

CNN Architecture  Parameters  Forward Time           Backward Time
A (8/16)          122587      18 ms (1), 0.5 ms (2)  26 ms (1), 1 ms (2)
B (4/8)           66639       8 ms (1)               10 ms (1)
C (8/8)           104603      12 ms (1)              18 ms (1)
D (4/4)           58047       6 ms (1)               8 ms (1)

Forward and backward (training) times for one 64 × 64 × 3 segment and different CNN architectures using (1) the JavaScript ConvNet.js classifier and (2) TensorFlow (CPU)

Conclusion

Although the overall classification accuracy is about 90%, the high variance of the segment prediction results across differently trained models (a model ensemble in which all models have the same architecture) limits the output quality of the labelled ROI detector, typically resulting in an underestimation of the classified regions and a lack of generalisation.

Nevertheless, the presented static segment prediction with point clustering and iterative selective bounding box approximation with final overlap conflict reduction is still reliable. Similar to random forests, a multi-model prediction with model fusion (e.g., major coverage estimation) is proposed to get the best-matching bounding boxes for the relevant classes.
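
One simple form of such a model fusion is a majority vote over the labels that the individual models assign to the same segment. The Python sketch below shows this idea only; it is an assumption about how the proposed fusion could look, and the tie rule (no decision) is hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def fuse_labels(votes: Sequence[Optional[str]]) -> Optional[str]:
    """Majority vote over the labels that several models give to one segment."""
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    (top, n), *rest = counts.most_common(2)
    # A tie between the two most frequent labels yields no decision.
    if rest and rest[0][1] == n:
        return None
    return top
```

With four models voting `["C", "C", "P", "B"]` for a segment, the fused label is `"C"`; a two-against-two split yields no label.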

The reduction of the CNN complexity with respect to the number of filters and dynamic parameters does not lower the classification accuracy significantly.

  • Although CNNs are less suitable for low-resource embedded systems, the CNN architecture D (4/4) could be implemented in an embedded camera system. The expected overall ROI extraction time for one image frame is about 5 seconds, which is not suitable for real-time operation (maximal latency 100 ms). Using control-path parallelisation, i.e., performing the image segment classifications in parallel, the ROI extraction time could be reduced to 1 second using generic multi-core CPUs, or to 100 ms using FPGA-based co-processors.
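
The effect of parallel segment classification on the frame time can be estimated with simple arithmetic, sketched below in Python. The frame size of 1920 × 1080 pixels and the number of workers are hypothetical values. The 6 ms forward time is the value measured for architecture D with ConvNet.js, so the result does not describe an embedded target and ignores clustering and bounding box computation.

```python
import math


def segment_count(width: int, height: int, seg: int = 64) -> int:
    """Number of complete seg x seg segments in one frame."""
    return (width // seg) * (height // seg)


def classification_time_ms(segments: int, forward_ms: float, workers: int = 1) -> float:
    """Time to classify all segments when `workers` segments run in parallel."""
    return math.ceil(segments / workers) * forward_ms


# Hypothetical 1920 x 1080 frame, 6 ms forward time per segment (architecture D).
segments = segment_count(1920, 1080)
sequential_ms = classification_time_ms(segments, 6)
parallel_ms = classification_time_ms(segments, 6, workers=4)
```

The classification time scales with the number of segments divided by the number of parallel workers, which is why segment-level parallelism is the main lever for approaching real-time operation.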