But now I'll try to take it slowly. At least until I find myself rushing for deadlines again.
This would be the first part, since it takes me a bit of time to write blogs. However, the next parts shouldn't be too far off into the future, since I'm already done with the results of the activity; only the write-up remains.
Anyway, this activity deals with image segmentation! There are two kinds of image segmentation tackled by the activity: grayscale image segmentation and colored image segmentation. There are further types within colored image segmentation, but I'll get to that later.
The first type is easier, since it only deals with grayscale images. For example, take a grayscale image loaded by Scilab. This should give values from 0 to 255. Using a function that finds all of the values above a particular threshold, setting those values to zero, and setting all the other values to one gives us a clear segment of the image that is below that threshold.
To test this type of image segmentation, we were given a grayscale image of a check by Ma'am Jing.
Fig. 1. Grayscale image of a check.
The one-liner code that worked was:
BW = I < 125;

where 125 is the threshold value and I is the image read by Scilab into a matrix. It works because the code performs an element-wise Boolean evaluation of whether each value is below 125 or not, and places the answers in another matrix, BW.
In binary, True is 1 and False is 0. Thus, creating an image out of the matrix BW gives the segmented image based on the threshold used.
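As a side note, the same one-liner translates directly to other array languages. Here is an equivalent sketch in Python with NumPy (the pixel values are hypothetical, just for illustration; the activity itself used Scilab):

```python
import numpy as np

# A hypothetical tiny grayscale "image" with values from 0 to 255.
I = np.array([[ 30, 120, 200],
              [ 90, 140, 250],
              [ 10, 180,  60]], dtype=np.uint8)

threshold = 125
BW = (I < threshold).astype(np.uint8)  # 1 where darker than the threshold, 0 elsewhere
```

Displaying BW as an image then gives the segmented result, exactly as with the Scilab version.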
For the check, segmentation is done in an effort to extract the handwriting as well as the black text within the image. This would be useful for scanning most black and white documents, since a scanner would not automatically give just 0 values and 255 values where there is black print on the document. It would also give all the grays, which would be even more prominent if the document itself isn't on white paper. For those cases, a different threshold would be ideal.
The threshold gray values I used ranged from 50 to 200 in increments of 25.
Fig. 2. Segmented images using threshold grayscale values of 50 to 200 in increments of 25.
I think the signature in the check starts being clear at the fourth image in Fig. 2, which corresponds to a threshold value of 125. At a grayscale value of 200, much of the text is obscured since the threshold is already met by the darker parts of the paper itself. At 175 and 150, the lines seem to be thicker than necessary, and unnecessary artifacts are already captured. Meanwhile, at grayscale values of 100 and below, the text is already starting to fade, since some parts of the text already miss the threshold used. So at least for this particular image, I would use 125 to get the segmented part that contains text and handwriting, without some of the artifacts given by 150 and 175.
But that's for grayscale image segmentation. What about colored image segmentation?
The PDF given to us for the activity emphasizes the importance of colored image segmentation. A grayscale equivalent of a colored image wouldn't necessarily allow us to get a particular colored object within the image, since the grayscale value of a colored object may be the same as its background. Thus, we need to look at the RGB color channels of the image, get values from there, then segment.
The values that we particularly need are chromaticities. Chromaticity gives color data regardless of how bright or dark the color appears. In obtaining chromaticity, the normalized chromaticity coordinates (NCC) are used. These can be expressed as:
r = R/I
g = G/I
b = B/I
where:
I = R + G + B

Doing this to the color channel values R, G, and B removes the distinction between shades of a particular color, and effectively allows similarly colored but differently shaded parts of an image to be grouped together. For example, a red ball with lit parts and shadowed parts would have a very large range of R, G, and B values, but through NCC, it would have a very small range of r, g, and b values.
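To illustrate (in Python/NumPy rather than the Scilab we used, with made-up pixel values), computing the NCC looks like this:

```python
import numpy as np

# Hypothetical 1 x 2 RGB image: a bright red pixel and a darker shade of the same red.
rgb = np.array([[[200.0, 50.0, 50.0],
                 [100.0, 25.0, 25.0]]])

I = rgb.sum(axis=2)          # I = R + G + B per pixel
I[I == 0] = 1.0              # guard against division by zero on black pixels
r = rgb[:, :, 0] / I
g = rgb[:, :, 1] / I
b = rgb[:, :, 2] / I
```

Both pixels end up with the same (r, g, b) even though their brightness differs, which is exactly the shade-independence we want.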
An image, pixel by pixel, can be counted into a particular coordinate of the NCC space to create a 2D histogram of the colors of that image by how frequent they appear. Note that:
b = 1 - r - g

Since b can be expressed in terms of r and g, having a 2D histogram with just r and g as axes is enough. Any point on the 2D histogram corresponds to a color, with the colors closer to blue lying near the origin.
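As a quick sketch of that counting step (in NumPy rather than Scilab, with a hypothetical 2-by-2 image), NumPy's histogram2d can build the 2D histogram over the r-g plane:

```python
import numpy as np

# Hypothetical 2 x 2 RGB image: two red pixels, one green, one blue.
img = np.array([[[200.0,  50.0,  50.0], [ 50.0, 200.0,  50.0]],
                [[ 50.0,  50.0, 200.0], [200.0,  50.0,  50.0]]])

s = img.sum(axis=2)
r = img[:, :, 0] / s
g = img[:, :, 1] / s

# Count every pixel into a 32 x 32 histogram over the r-g chromaticity plane.
hist, r_edges, g_edges = np.histogram2d(r.ravel(), g.ravel(),
                                        bins=32, range=[[0, 1], [0, 1]])
```

The two red pixels share one chromaticity bin, so that bin's count is 2 while the green and blue pixels land in their own bins.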
If we have the chromaticities of a desired object through the 2D histogram in NCC space, then we can use that 2D histogram to segment. This is first done by checking each pixel of the entire image (with the desired object somewhere within that image) for its NCC. The histogram value for that coordinate is then derived from the desired object's 2D histogram. Then, that histogram value is assigned to the same pixel location in a new image, with the same dimensions as the entire image being segmented. This method is called histogram backprojection.
Once the new image is complete, with all its values coming from the histogram, it can then be segmented like a grayscale image.
We can now test this out with a photograph of a 3D image that is brightly colored and of one color only. The test photo I used was the image in Fig. 3 below, which I obtained from Google Images.
The URL for the image itself was:
http://hgtvhome.sndimg.com/content/dam/images/hgtv/fullset/2014/2/12/0/RX-HGMAG002_Happy-Living-Room-075-a-4x3.jpg.rend.hgtvcom.1280.960.jpeg

The search query I used was "room", and I picked the best image with a color that popped out. In this case it was a blue chair inside a room that mostly has no other blue objects in it. There are, however, the dark blue jeans of the female model in the room, which, by chromaticity, could be mistaken for part of the blue chair. I therefore use the image as a way to challenge the segmentation: it should obtain the chair but not the jeans. I cropped the image to focus on the chair, since Scilab was giving a stacksize error with the full 1280 by 960 image.
Fig. 3. Image of a blue chair used for colored image segmentation.
There are two ways to get the 2D histogram in NCC space: parametric segmentation and non-parametric segmentation. The former uses probabilities and probability distribution functions, while the latter uses binning, converting relatively continuous chromaticity values into integer bin indices.
In parametric segmentation, imagine a blob in NCC space pinpointing the color information of your region of interest. For instance, if it is a blue chair, there would be a blob near the origin of the NCC space. Parametric segmentation assumes that this blob is Gaussian, which means that if the mean and standard deviation are obtained, those would pinpoint the location and the width of the blob in NCC space.
So instead of obtaining a histogram, we simply obtain two means and two standard deviations of the ROI's color information blob: one across the blue-to-red axis of the NCC space, and one across the blue-to-green axis of the NCC space. That gives us four numbers.
Now, when we take a look at the entire image, we can check the chromaticity of each pixel in the image and use the equation below to obtain the probability that the pixel's chromaticity r is in our blob along the blue-to-red axis:

p(r) = (1 / (sigma_r * sqrt(2*pi))) * exp(-(r - mu_r)^2 / (2*sigma_r^2))

In the above equation, mu_r and sigma_r are the mean and standard deviation of the ROI's blob across the blue-to-red axis.
Similarly, the probability that the pixel's chromaticity g is in our blob along the blue-to-green axis is:

p(g) = (1 / (sigma_g * sqrt(2*pi))) * exp(-(g - mu_g)^2 / (2*sigma_g^2))

In the above equation, mu_g and sigma_g are the mean and standard deviation of the ROI's blob across the blue-to-green axis.
In Scilab, r and g are matrices with the size of the entire image, so p(r) and p(g) would give matrices of probabilities with that same size. Multiplying p(r) and p(g) would give the desired probabilities of the image pixels being part of the ROI, since if it hits one axis but doesn't quite hit the other, the product of the probabilities would show this. With that, we are done with parametric segmentation! For further applications, the resulting grayscale image can be segmented through grayscale image segmentation, like we did with the check. But simply getting the grayscale image of probabilities is enough for our purposes in this activity.
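To summarize the parametric method in code, here is a minimal NumPy sketch (the patch and image values are hypothetical, and this is not the exact Scilab script I used):

```python
import numpy as np

def gaussian_prob(x, mu, sigma):
    """Gaussian probability density, evaluated element-wise."""
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def rg_chromaticity(a):
    """Per-pixel r and g chromaticities of an H x W x 3 float array."""
    s = a.sum(axis=2)
    s[s == 0] = 1.0
    return a[:, :, 0] / s, a[:, :, 1] / s

def parametric_segment(image, patch):
    """p(r) * p(g): joint probability that each pixel shares the patch's chromaticity."""
    pr, pg = rg_chromaticity(patch)
    mu_r, sigma_r = pr.mean(), pr.std()
    mu_g, sigma_g = pg.mean(), pg.std()
    r, g = rg_chromaticity(image)
    return gaussian_prob(r, mu_r, sigma_r) * gaussian_prob(g, mu_g, sigma_g)

# Hypothetical bluish ROI patch (slight pixel-to-pixel variation so sigma > 0)
patch = np.array([[[20.0, 20.0, 200.0], [25.0, 20.0, 195.0]],
                  [[18.0, 22.0, 200.0], [22.0, 18.0, 200.0]]])
# Hypothetical image: one bluish pixel and one reddish pixel
image = np.array([[[20.0, 20.0, 200.0], [200.0, 20.0, 20.0]]])
prob = parametric_segment(image, patch)
```

The bluish pixel gets a much higher joint probability than the reddish one, since its chromaticity sits near the patch's means.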
For non-parametric segmentation, the method is different. A couple of steps process the ROI and produce the histogram based on however many bins are desired per blue-to-green and blue-to-red chromaticity axis. Each pixel's chromaticity value from the ROI is binned into the range it fits in. The binning is done by converting the green chromaticity and the red chromaticity each into an integer from 1 to 32, and multiplying the red chromaticity integer by 32. Adding the g integer to the 32*r integer gives a unique value corresponding to that pixel's chromaticity coordinate in NCC space, and a list of these unique integer sums from the ROI is compiled. These are then binned into the 2D histogram, since each integer sum determines where in the histogram its count is placed.
Then, the entire image is examined pixel by pixel for its chromaticities r and g; the corresponding point in the 2D histogram for each (r, g) pair is located, and its histogram value is placed at the same pixel location in a new image with the same dimensions as the entire image. With that, non-parametric segmentation is done; no further processing of the resulting grayscale image is needed.
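The binning and backprojection steps above can be sketched as follows (a NumPy sketch with a hypothetical uniform blue ROI, not my actual Scilab code):

```python
import numpy as np

BINS = 32

def rg_chromaticity(a):
    s = a.sum(axis=2)
    s[s == 0] = 1.0
    return a[:, :, 0] / s, a[:, :, 1] / s

def roi_histogram(roi):
    """Bin the ROI's pixels with the g + 32*r integer encoding described above."""
    r, g = rg_chromaticity(roi)
    ri = np.clip((r * BINS).astype(int), 0, BINS - 1) + 1   # integers 1..32
    gi = np.clip((g * BINS).astype(int), 0, BINS - 1) + 1
    codes = gi + BINS * ri            # unique integer sum per chromaticity cell
    hist = np.zeros((BINS, BINS))
    for c in codes.ravel():
        row = (c - 1) // BINS - 1     # recover the r bin (0-based)
        col = (c - 1) % BINS          # recover the g bin (0-based)
        hist[row, col] += 1
    return hist

def backproject(image, hist):
    """Replace each pixel with its histogram value: a grayscale likelihood map."""
    r, g = rg_chromaticity(image)
    ri = np.clip((r * BINS).astype(int), 0, BINS - 1)
    gi = np.clip((g * BINS).astype(int), 0, BINS - 1)
    return hist[ri, gi] / hist.max()

# Hypothetical uniform bluish ROI and a two-pixel test image
roi = np.tile([20.0, 20.0, 200.0], (3, 3, 1))
image = np.array([[[20.0, 20.0, 200.0], [200.0, 20.0, 20.0]]])
seg = backproject(image, roi_histogram(roi))
```

The bluish pixel maps to the ROI's populated histogram bin and comes out white (1.0), while the reddish pixel maps to an empty bin and comes out black (0.0).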
The following are the regions of interest that I used:
Fig. 4. Regions of interest used for colored image segmentation of the blue chair image.
The orientation of the 2D histogram in terms of color can be seen, for reference, here:
Fig. 5. Reference histogram to get color locations, slightly edited from the Activity 7 manual.
The histograms that I was able to obtain through the non-parametric segmentation steps, respective to each of the above regions of interest, were:
Fig. 6. Histogram for the first blue chair ROI patch in Fig. 4 from non-parametric segmentation.
From the above histogram counter-checked with the reference in Fig. 5, the blue color of the first ROI patch corresponds to the position of the peak in the histogram.
Fig. 7. Histogram for the second blue chair ROI patch in Fig. 4 from non-parametric segmentation.
For Fig. 7, the blue color of the second ROI patch still corresponds to the position of the peak in the histogram, but it is slightly closer to cyan, possibly due to the patch coming from the illuminated side of the chair.
Fig. 8. Histogram for the third blue chair ROI patch in Fig. 4 from non-parametric segmentation.
This time, the histogram covers a larger area and seems to contain multiple peaks. This is because the third ROI patch is a large cropped image that covers both illuminated and shadowed parts of the blue chair.
Fig. 9. Histogram for the purple ROI patch in Fig. 4 from non-parametric segmentation.
Looking at Fig. 5 again for reference, the location of Fig. 9's peak is indeed in the purple region of the histogram.
Fig. 10. Histogram for the skin ROI patch in Fig. 4 from non-parametric segmentation.
Fig. 10 has a peak located near the diagonal but slightly closer to the red corner of the histogram's color triangle, which puts skin at orange.
The following are the resulting images from non-parametric segmentation and parametric segmentation using the five ROI patches in Fig. 4 on the same image in Fig. 3:
Fig. 11. Non-parametric (top) and parametric (bottom) segmentation result for the first blue ROI patch.
Fig. 11 shows that non-parametric segmentation gives a very white result with little gradient, covering a fair amount of the ROI. However, it fails to include the leftmost parts of the chair. In contrast, parametric segmentation includes those leftmost parts as having a small likelihood of belonging to the ROI, but it is less certain of the bottom part of the chair. Both segmentation methods fail to include the illuminated right side of the chair, but this is due to the small ROI patch being used.
Fig. 12. Non-parametric (top) and parametric (bottom) segmentation result for the second blue ROI patch.
Fig. 12 shows a more desirable looking result from the non-parametric segmentation, which seems to imply that it does better at illuminated ROIs. The parametric segmentation result has a very blurry and patchy quality to it. Again, both segmentation methods fail to include the darker parts of the blue chair, and this is due to the ROI only covering a small patch, being from the lighter side.
Fig. 13. Non-parametric (top) and parametric (bottom) segmentation result for the third blue ROI patch.
Fig. 13 shows that non-parametric segmentation is better at discerning the blue chair from the blue jeans of the lady, but with such a large ROI, it seems that the gradient is much smoother with parametric segmentation. From this, I can infer that, if a smooth gradient in the resulting image is desired, parametric segmentation can give that to you, particularly with larger ROIs containing multiple peaks in the histogram, as seen in Fig. 8. One possible explanation for this is that parametric segmentation assumes a Gaussian distribution, which makes all the peaks into one Gaussian peak, producing a better gradient on the resulting image. However, one drawback to that is it accidentally counts the jeans of the lady as part of the ROI, when it really isn't.
Fig. 14. Non-parametric (top) and parametric (bottom) segmentation result for the purple ROI patch.
Fig. 14 shows that for a very small, relatively monochromatic ROI such as the purple blouse of the lady in the picture, both non-parametric and parametric are able to pinpoint it. There are no other purple parts in the image, as well, which aids in producing a resulting segmented image that is just the ROI. Parametric assumes a Gaussian, so it gives a more graduated result with more grays, while non-parametric just gives mostly whites.
Fig. 15. Non-parametric (top) and parametric (bottom) segmentation result for the skin ROI patch.
Lastly, Fig. 15 shows how a skin ROI can be used for segmentation, but since the background also has a lot of browns, including the outline of the lady's hair, more parts are segmented than desired. This shows that color segmentation alone is not enough in certain situations, such as segmenting a person's skin in a room with many browns in the background. Again, non-parametric shows a lot of whites, while parametric shows a slow transition from whites to grays.
Personally, I like how the parametric segmentation results look, since they are usually smooth, save for the illuminated blue chair ROI in Fig. 12. For lighter images, non-parametric is likely better. Non-parametric is also better at discerning slightly different hues, such as the lady's blue jeans from the chair, since it does not assume a Gaussian distribution. That assumption causes parametric segmentation to mistake those blue jeans as part of the ROI, since their chromaticities probably fell within the Gaussian dictated by the mean and standard deviation parameters. For those cases, non-parametric segmentation is preferred.
That marks the end of Activity 7. As per Ma'am Jing's suggestion, I would like to try blogging in parts next time, so that this blog gets updates more frequently. I would like to thank Ma'am Jing for helping create a 3D visualization of the non-parametric histograms, and Robertson Esperanza for answering random questions about the activity.
Self-Evaluation: 8/10