Monday, November 21, 2016

AP186 Activity 8 Blog Report - Morphological Operations - Part 2

Here comes the application part! Ma'am Jing provided us with two images of scattered punched paper: one with circles of roughly the same size, and another with larger circles among the regular circles to simulate cancer cells.

It's 10:00PM and I'm a bit tired, but so far, I've cropped seven images from the image of just regular circles. I did that using GIMP.

Fig. 1. Digital image of scattered punched paper simulating cells.




Fig. 2. Seven subimages covering 256 by 256 pixel areas of all cells in Fig. 1.


Fig. 3. Digital image of simulated cells with some simulated cancer cells.


(11/23/2016, 9:55AM PhST) I was working on my code yesterday afternoon as well as this morning, and so far I think I'm doing alright.

Anyway, here's a quick recap on my progress so far. I checked for the best segmentation threshold using the histogram of the first subimage, which looks like this:
Fig. 4. Number of pixels per grayscale value of the first subimage.

I settled on a threshold of 200, setting all pixels with values greater than 200 to 1 (white) and all others to 0 (black). I found that a threshold higher than 200 loses some information from the cells, while the artifacts left at 200 can be removed by morphological operations.
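As a minimal sketch of that segmentation rule (written in Python rather than the Scilab I actually used, and with a made-up patch of pixel values), the thresholding step could look like this. The function name `segment` and the sample values are only for illustration.

```python
def segment(gray, threshold=200):
    """Return a binary image: 1 where the pixel is brighter than the threshold, else 0."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]

# Hypothetical 3x4 grayscale patch (values 0-255), not taken from the actual subimages.
patch = [
    [30, 120, 201, 255],
    [45, 200, 230, 240],
    [10,  60, 199, 210],
]

binary = segment(patch)
for row in binary:
    print(row)
```

Note that a pixel exactly at 200 stays black, since the rule is strictly "greater than 200".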

So for all seven subimages, I segmented using a grayscale threshold of 200. I then tested the effects of a few morphological operations, including CloseImage, OpenImage, and TopHat, with circular structuring elements of varying radii. I also saw from this source that 'circle' is one of the possible parameter values in place of 'custom':

http://forge.scilab.org/index.php/p/IPD/source/tree/9/macros/CreateStructureElement.sci

I picked:
SE = CreateStructureElement('circle',8);
I(:,:,i) = OpenImage(BW, SE);
And from the first subimage, it seems to successfully remove artifacts and round out some of the broken circles. I then used SearchBlobs, which assigns a number to each blob:
I(:,:,i) = SearchBlobs(I(:,:,i));
I then scaled the labels so that each blob shows up in a different shade of gray:
imwrite((255.0/double(max(I(:,:,i))))*I(:,:,i),'S_C1_0'+string(i)+'_sb.png');
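To show what the labelling and gray-scaling steps do, here is a small Python sketch (not the IPD toolbox code). It assumes blobs are 8-connected, which may or may not match what SearchBlobs does, and the function names `label_blobs` and `to_gray` are made up for this example.

```python
from collections import deque

def label_blobs(binary):
    """Label 8-connected foreground regions 1, 2, 3, ... (0 stays background)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1                      # start a new blob
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count

def to_gray(labels, count):
    """Spread the labels over 0-255 so each blob gets its own shade."""
    return [[round(255 * v / count) if count else 0 for v in row] for row in labels]

# Hypothetical binary image with three separate blobs.
binary = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
]
labels, count = label_blobs(binary)
gray = to_gray(labels, count)
```

The scaling mirrors the 255/max factor in the imwrite line: the highest label maps to white and lower labels to proportionally darker grays.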
I tried the entire process for all, and these are the results:







Fig. 5. Segmented subimages (left) using t=200 and the results (right) after OpenImage and SearchBlobs, with each numbered blob made to show at a different shade of gray.

For the third and sixth subimages, the threshold of 200 left a lot of artifacts. However, the morphological operation seems to have removed them well.
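For anyone curious why opening removes specks and rounds out shapes, below is a rough Python sketch of opening (erosion followed by dilation) with a disk-shaped structuring element. It is not the IPD implementation: the helper names are invented, and I assume pixels outside the image count as background. The test image and the radius of 1 are tiny on purpose so the result can be checked by hand.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of every pixel within `radius` of the origin."""
    span = range(-radius, radius + 1)
    return [(dy, dx) for dy in span for dx in span
            if dy * dy + dx * dx <= radius * radius]

def _hit(img, y, x):
    """True if (y, x) is inside the image and is a foreground pixel."""
    return 0 <= y < len(img) and 0 <= x < len(img[0]) and img[y][x] == 1

def erode(img, offsets):
    """Keep a pixel only if the whole structuring element fits in the foreground."""
    return [[1 if all(_hit(img, r + dy, c + dx) for dy, dx in offsets) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def dilate(img, offsets):
    """Turn a pixel on if the structuring element touches any foreground pixel."""
    return [[1 if any(_hit(img, r + dy, c + dx) for dy, dx in offsets) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def open_image(img, radius):
    """Morphological opening: erosion followed by dilation with the same disk."""
    offsets = disk_offsets(radius)
    return dilate(erode(img, offsets), offsets)

# Hypothetical 7x8 image: a 4x4 square "cell" plus a one-pixel artifact.
image = [[0] * 8 for _ in range(7)]
for r in range(1, 5):
    for c in range(1, 5):
        image[r][c] = 1
image[5][6] = 1  # isolated speck

opened = open_image(image, 1)
for row in opened:
    print("".join("#" if v else "." for v in row))
```

In this toy case the speck disappears and the square loses its four corners, which is the same kind of artifact removal and rounding seen in Fig. 5.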

I then got the areas by iterating over the slices of my hypermatrix, where each slice holds the labelled output of SearchBlobs for one subimage. The maximum of each slice should give the total number of blobs in that slice, so using it as the end point of the inner loop lets me access each numbered blob from 1 up to that number. The area of a blob is just the number of pixels carrying its label, and I append each area to an array I called "areas".
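areas = [];  // collects the pixel area of every blob
for i = 1:7
    for j = 1:max(I(:,:,i))
        // find() returns a row vector of matching indices, so its column count is the area
        area = size(find(I(:,:,i) == j));
        areas($+1) = area(2);
    end
end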
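The same counting idea can be sketched in Python (again, just an illustration, not my Scilab code). This version tallies only the labels that actually appear in the matrix, so a label number with no pixels never produces an area of zero. The label matrix below is made up, with label 2 deliberately missing.

```python
from collections import Counter

def blob_areas(labels):
    """Return the pixel count of every label present, in increasing label order."""
    counts = Counter(v for row in labels for v in row if v > 0)
    return [counts[k] for k in sorted(counts)]

# Hypothetical label matrix: labels 1, 3 and 4 are present, label 2 is not.
labels = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 3],
    [0, 0, 0, 3, 3],
    [4, 0, 0, 0, 0],
]

areas = blob_areas(labels)
print(areas)
```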
I checked the values compiled in the "areas" array. It seems that there are blobs with an area of zero, which I simply removed by making a new list "areas2" that keeps only the nonzero areas:
for i = find(areas > 0)
    areas2($+1) = areas(i);
end
histplot(100,areas2);
The histogram, with 100 bins, of all the pixel areas of each blob turned out like this:
Fig. 6. Histogram of the areas, with x-axis being number of pixels.
I am still trying to interpret what the y-axis of the above histogram physically means, because histplot doesn't seem to give the number of blobs within each bin. I will also get the mean and standard deviation of "areas2" shortly.

However, what I know so far is that "areas2", the new array with those odd zeros removed, lists 74 areas when I checked its size. That means there are 74 blobs in total across all subimages.

(12:29PM PhST) I found out how to make the histogram not normalized, so that it gives the number of blobs within each bin. The default of histplot is normalized, so I just have to set the normalization parameter to false instead of true. (Reference: https://help.scilab.org/docs/6.0.0/en_US/histplot.html)
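To make the difference between the two y-axes concrete, here is a small Python sketch of a fixed-width histogram. I am assuming that "normalized" means the bar heights are scaled so the total area under the histogram is 1, which is my understanding of the histplot default. The function name and the sample areas are hypothetical.

```python
def histogram(values, bin_width=100, n_bins=32, normalized=False):
    """Count values into equal-width bins starting at 0.

    Values at or beyond the last edge are clamped into the final bin.
    With normalized=True, heights are count / (N * bin_width) so the area sums to 1.
    """
    counts = [0] * n_bins
    for v in values:
        index = min(int(v // bin_width), n_bins - 1)
        counts[index] += 1
    if not normalized:
        return counts
    total = len(values)
    return [c / (total * bin_width) for c in counts]

# Hypothetical blob areas in pixels, not my measured data.
sample_areas = [520, 560, 590, 610, 1250, 2900]

counts = histogram(sample_areas)
density = histogram(sample_areas, normalized=True)
print(counts[5], density[5])  # the 500-599 pixel bin, as a count and as a density
```

With raw counts the bar height is directly the number of blobs in that range, which is what I wanted to read off the plot.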

Anyway, below is a better histogram, with 32 bins. Each bin has a 100 pixel range, and the x-axis ranges from areas of 0 pixels to 3200 pixels.

Fig. 7. Histogram of the areas with 32 bins, and the y-axis being the number of blobs within each range of area values.
From the above histogram, the mode is in the 500 to 600 pixel range, with 39 out of 74 blobs falling within that range. But since I need the mean, I will calculate that using Scilab.
-->mean(areas2)
 ans  =

    782.3783783783783

-->stdev(areas2)
 ans  =

    646.4978721575065 
These values, particularly the standard deviation, are too skewed by the overlapping groups of cells that form large blobs. However, among the morphological operations I tried, there doesn't seem to be a combination that eliminates those overlapping groups without eliminating the single cells first. So I don't think a change in morphological operation is the way to go here.

I think that, since the histogram already gives an idea of the outliers, I should calculate the mean and standard deviation with those outliers removed, particularly the blobs above 650 pixels in area. I stored the remaining areas in "areas3". The main goal of this application part is to identify which cells are the cancer cells, so getting a better set of data is key.

-->mean(areas3)
 ans  =

    481.3454545454546

-->stdev(areas3)
 ans  =

    108.5340394090154 
This range is much better for extracting cancer cells, I think. I will think more about this later, since I have class.
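Here is a quick Python sketch of the outlier-removal step with made-up numbers, just to show how much a couple of merged blobs can pull the statistics. I am assuming the sample standard deviation (dividing by n-1), which is what Python's `statistics.stdev` computes. The 650-pixel cutoff is the one I chose from the histogram.

```python
import statistics

CUTOFF = 650  # pixels; blobs larger than this are treated as overlapping groups

# Hypothetical areas: five single cells plus two merged blobs.
all_areas = [480, 500, 520, 540, 560, 1200, 2400]
single_cells = [a for a in all_areas if a <= CUTOFF]

mean_all = statistics.mean(all_areas)
std_all = statistics.stdev(all_areas)
mean_single = statistics.mean(single_cells)
std_single = statistics.stdev(single_cells)

print("all blobs:   ", round(mean_all, 1), round(std_all, 1))
print("single cells:", round(mean_single, 1), round(std_single, 1))
```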

(4:27PM PhST) I will try using the same morphological operations on the image with cancer cells, then pick the blobs that are above the mean plus the standard deviation I obtained.
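The selection step could be sketched like this in Python. The mean and standard deviation are the values I got above, while the function name, the tiny label matrix, and the demo threshold of 3 are only for illustration (real blobs near 590 pixels would be too big to show here).

```python
from collections import Counter

mean_area = 481.3454545454546   # from mean(areas3)
std_area = 108.5340394090154    # from stdev(areas3)
area_threshold = round(mean_area + std_area)

def keep_large_blobs(labels, threshold):
    """Return a binary mask of the blobs whose pixel area exceeds the threshold."""
    areas = Counter(v for row in labels for v in row if v > 0)
    return [[1 if v > 0 and areas[v] > threshold else 0 for v in row]
            for row in labels]

# Hypothetical label matrix with blob areas of 4, 3 and 1 pixels.
labels = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 2, 2],
    [3, 0, 0, 0, 0],
]

mask = keep_large_blobs(labels, 3)  # small demo threshold instead of area_threshold
print(area_threshold)
for row in mask:
    print(row)
```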


(11/25/2016, 10:08AM PhST) I used 590 pixels (mean plus one standard deviation) as the area threshold, with the same grayscale threshold and morphological operation as for the subimages with just normal cells. These are the images I obtained:

Fig. 8. Image with simulated cancer cells among normal cells, segmented.

Fig. 9. Image with artifacts removed through OpenImage with a circular SE of radius 8.

Fig. 10. The previous image with different grayscale values for each numerically assigned blob.

Fig. 11. Blobs whose areas exceed the mean plus one standard deviation.
The above image still has normal cells that overlap with each other. Thus, we need a method that removes the blobs that aren't cancer cells. Since we still have the structuring element from before, we can use it to erode twice and then dilate twice.

I do this through these lines of code:
cancer = bool2s(ErodeImage(bool2s(ErodeImage(cancer,SE)),SE));
cancer = bool2s(DilateImage(bool2s(DilateImage(cancer,SE)),SE));
with the matrix named "cancer" being Fig. 11 and SE being the same circular structuring element of radius 8 used in the previous morphological operations. This leaves us with the actual cancer cells:

Fig. 12. Filtered cancer cells from the previous image.
I'm not sure if there is a better way to extract the cancer cells in Fig. 12, but I am happy that I was able to pinpoint them from the rest.
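My rough guess at why the double erosion works can be checked with some back-of-the-envelope arithmetic in Python. This treats each cell as an ideal circle and treats two erosions with a radius-8 disk as one erosion with a radius-16 disk, which is only approximately true on a pixel grid. The helper name `equivalent_radius` is made up.

```python
import math

def equivalent_radius(area):
    """Radius of an ideal circle with the given pixel area."""
    return math.sqrt(area / math.pi)

se_radius = 8                       # radius of the circular structuring element
passes = 2                          # number of erosions applied
erosion_reach = se_radius * passes  # idealized total erosion depth in pixels

normal_radius = equivalent_radius(481.35)         # mean area of a single normal cell
min_surviving_area = math.pi * erosion_reach ** 2  # smallest ideal circle that survives

print(round(normal_radius, 1), erosion_reach, round(min_surviving_area))
```

Under these assumptions a normal cell is only about 12 pixels in radius, so it is wiped out before the erosion reaches 16 pixels, even when a few of them overlap. Only circles larger than roughly 800 pixels in area should survive to be restored by the two dilations.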

Thank you to Albert Yumol, Zed Fernandez and Robertson Esperanza for helping answer questions.

Self-Evaluation (for both Activity 8 Part 1 and Part 2): 8/10
