Here is the confusion matrix:
[10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0;
0, 22, 2, 4, 0, 0, 0, 0, 2, 0, 0;
0, 0, 23, 0, 0, 0, 0, 0, 1, 0, 0;
0, 7, 0, 15, 0, 0, 0, 0, 0, 0, 0;
0, 4, 3, 0, 2, 0, 0, 0, 0, 0, 0;
0, 1, 4, 0, 2, 3, 0, 0, 1, 0, 1;
0, 5, 0, 1, 0, 0, 4, 0, 0, 1, 0;
0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0;
0, 1, 4, 0, 0, 0, 0, 0, 13, 0, 0;
0, 6, 0, 3, 0, 0, 1, 0, 0, 13, 0;
1, 0, 1, 0, 2, 0, 0, 0, 1, 0, 14]
accuracy: 0.697436 (136/195)
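As a sanity check, the reported accuracy can be recomputed from the matrix itself: it is the sum of the diagonal divided by the sum of all entries. The sketch below is plain Python and not the code used for the experiment. The per-row rate it prints reads as per-class recall only under the assumption that rows are true classes and columns are predictions, which the post does not state.

```python
# Confusion matrix copied from above (11 classes).
confusion = [
    [10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 22, 2, 4, 0, 0, 0, 0, 2, 0, 0],
    [0, 0, 23, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 7, 0, 15, 0, 0, 0, 0, 0, 0, 0],
    [0, 4, 3, 0, 2, 0, 0, 0, 0, 0, 0],
    [0, 1, 4, 0, 2, 3, 0, 0, 1, 0, 1],
    [0, 5, 0, 1, 0, 0, 4, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0],
    [0, 1, 4, 0, 0, 0, 0, 0, 13, 0, 0],
    [0, 6, 0, 3, 0, 0, 1, 0, 0, 13, 0],
    [1, 0, 1, 0, 2, 0, 0, 0, 1, 0, 14],
]

# Correctly classified samples sit on the diagonal.
correct = sum(confusion[i][i] for i in range(len(confusion)))
total = sum(sum(row) for row in confusion)
accuracy = correct / total

# Fraction of each row that lies on the diagonal
# (per-class recall, ASSUMING rows are the true classes).
per_row = [row[i] / sum(row) for i, row in enumerate(confusion)]

print(f"accuracy: {accuracy:.6f} ({correct}/{total})")
for i, rate in enumerate(per_row):
    print(f"row {i}: {rate:.2f}")
```

Broken down this way, the fifth and sixth rows are the weakest (2 of 9 and 3 of 12 on the diagonal), so those classes stand to gain the most from extra features.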
So a ~70% classification rate is better than I had anticipated, but we would like our model to be more accurate and robust. One way to do this is to add more features. Today I've been looking at how to use edge and contour detection tools to extract additional, shape-based features.
Using Canny edge detection in OpenCV, I was able to extract some decent edge images.
Parameters:
Lower threshold: 50
Upper threshold: 200
I chose these thresholds based on a sample OpenCV program (houghlines.cpp).
They work well right now, but perhaps later I can look into some form of automatic thresholding.
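One possible form of automatic thresholding is a commonly used heuristic that centres the two Canny thresholds on the median grayscale intensity of the image. The sketch below is an assumption on my part, not something taken from the OpenCV sample: the function name `auto_canny_thresholds` and the `sigma` parameter are hypothetical, and the resulting pair would simply replace the fixed 50 and 200 when calling Canny.

```python
from statistics import median


def auto_canny_thresholds(pixels, sigma=0.33):
    """Return (lower, upper) thresholds centred on the median intensity.

    `pixels` is a flat iterable of 8-bit grayscale values. `sigma` is a
    hypothetical tuning knob: larger values widen the band.
    """
    m = median(pixels)
    lower = round(max(0.0, (1.0 - sigma) * m))
    upper = round(min(255.0, (1.0 + sigma) * m))
    return lower, upper


# Tiny made-up "image", flattened to a list of grayscale values.
sample = [12, 40, 90, 100, 110, 130, 200, 250, 100]
lower, upper = auto_canny_thresholds(sample)
print(lower, upper)
```

Whether this beats the hand-picked values would have to be checked against the actual wasp images; it only removes the need to choose the two numbers by hand.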
Here are the original images and the edge-extracted images:
As one will notice, the edge detection identifies small details such as hair, light reflection, or wing folds. In order to filter out this "noise," I ran a median blur filter on the original images before performing the edge extraction:
Here is an example of the blurred image in grayscale mode alongside the original.
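To see why a median filter suits this job, here is a minimal pure-Python sketch of a 3x3 median blur on a toy image. It is only an illustration of the idea, not the OpenCV implementation, and it copies border pixels unchanged to stay short. The toy image (made up for this example) has a sharp dark-to-bright edge plus one isolated bright speck, standing in for a light reflection.

```python
from statistics import median


def median_blur_3x3(img):
    """Apply a 3x3 median filter to a 2D list of grayscale values.

    Border pixels are copied unchanged to keep the sketch short.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Collect the 3x3 neighbourhood around (x, y).
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# Step edge between columns 2 and 3, plus one bright speck at row 2, column 1.
img = [
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 255, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
]
blurred = median_blur_3x3(img)
for row in blurred:
    print(row)
```

The speck disappears while the step between columns 2 and 3 stays sharp, which is the behaviour wanted before edge extraction. A larger kernel removes larger specks, at the cost of also removing small genuine details.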
From here, I used the OpenCV Contours structure to identify polygons or enclosed shapes formed by the edges. Separate contours are labeled with different colors:
Contour image using 3x3 kernel for median blurring
Contour image using 5x5 kernel for median blurring
I believe the ultimate goal of using the contour detection would be:
To be able to identify the different regions of the wasp (head, thorax, abdomen, antennae, wings) and perhaps even the patterns specific to that species of wasp (such as the 4-window pattern on the back of this "Apodynerus f.formosensis"). This could be simplified to a feature such as Number_of_Regions. One could also measure the length-width ratio and the general shape of each region; these are examples of highly descriptive morphological features.
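A rough sketch of how such features might be computed once contours are available, assuming each contour is a list of (x, y) pixel coordinates. The function name `region_features` and the dictionary keys are hypothetical, and the length-width ratio here is taken from an axis-aligned bounding box, which is only one of several ways to define it.

```python
def region_features(contours):
    """Summarise a list of contours, each a list of (x, y) points."""
    ratios = []
    for pts in contours:
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        # Axis-aligned bounding box size, in pixels.
        width = max(xs) - min(xs) + 1
        height = max(ys) - min(ys) + 1
        # Longer side over shorter side, so the ratio is always >= 1.
        ratios.append(max(width, height) / min(width, height))
    return {"number_of_regions": len(contours), "length_width_ratios": ratios}


# Two made-up regions: an elongated one (wing-like) and a square one.
contours = [
    [(0, 0), (39, 0), (39, 9), (0, 9)],
    [(50, 50), (59, 50), (59, 59), (50, 59)],
]
features = region_features(contours)
print(features)
```

An axis-aligned box makes the ratio depend on how the specimen is oriented in the photo; a rotated bounding box (for example OpenCV's minAreaRect) would probably be a more stable choice.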
Problems with this approach:
Even after median blurring, the edge detector still picks up unwanted details. When the blurring kernel is too large, we lose finer details such as the back pattern or the legs. When the blurring kernel is too small, some regions are closed off, leading to multiple contours where there should be only one (such as the wing).
In addition, I don't know an easy way to remove small, undesirable contours. I thought of thresholding the contours based on pixel count, but that would remove small, desirable details such as the back pattern. I will look into it.
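For reference, the size-thresholding idea can be sketched as follows, using the enclosed polygon area (shoelace formula) as a stand-in for pixel count; with OpenCV, contourArea would play the same role. The contours and the `min_area` values are made up for illustration, and the helper names are hypothetical.

```python
def polygon_area(pts):
    """Shoelace formula: area enclosed by a closed polygon of (x, y) points."""
    n = len(pts)
    twice_area = sum(
        pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def drop_small_contours(contours, min_area):
    """Keep only contours whose enclosed area is at least `min_area`."""
    return [c for c in contours if polygon_area(c) >= min_area]


# Made-up contours: a large wing, a small back-pattern window, a tiny hair.
wing = [(0, 0), (60, 0), (60, 20), (0, 20)]
back_window = [(10, 10), (14, 10), (14, 14), (10, 14)]
hair = [(30, 30), (32, 30), (32, 31), (30, 31)]

kept_loose = drop_small_contours([wing, back_window, hair], min_area=10)
kept_strict = drop_small_contours([wing, back_window, hair], min_area=50)
print(len(kept_loose), len(kept_strict))
```

The example shows the trade-off described above: a low threshold removes the hair and keeps the back pattern, but a threshold high enough to remove larger noise also throws the back pattern away.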