Wednesday, December 7, 2016

AP186 Activity 11 Blog Report - Basic Video Processing

(1:07 PM) The last activity is basic video processing. Although my project involved videos, which gave me some experience, taking the right video at home by myself is still the biggest roadblock at the moment. I've taken five videos so far, one last night and four this morning, each under different settings. None of them give clear frames of the motion of the ball I drop. The first one is too dark; the second had its FPS drop from 29 (in the first) to 12; and while the third fixed the lighting issue, the FPS problem remained. The fourth and fifth ones didn't even report the correct FPS value, since they seemed to hold each frame on screen longer than a single frame period.

I will try again with high exposure and default FPS settings.


(2:39 PM) I ended up taking two more videos, and I'm settling for the second one. The first one didn't even capture the dropping point within its field of view. Anyway, here is my video, taken with high exposure and a frame rate of 29 frames per second (which seems to be the maximum my phone can do). I converted it into a lower quality video so that my internet's upload speed could handle it:


Unfortunately, this activity requires a really good video camera; motion blur is still present even at the best settings I could manage on my phone. It seems I will just have to work with blurry blobs. Hopefully the ROI manages to capture these blurry blobs, since my main issue with the very first video I took was that the ROI colors could not match the ball as it falls.

The biggest issue for all of the videos I have is that because my camera has such a low frame rate relative to the motion of the ball, when the ball becomes a blur as it falls down, it loses its chromaticity. That's why the ROI method doesn't work for it. I tried the grayscale method of image segmentation in the very first video I took for this activity, and the shadows near the floor had the same grayscale value as the blur that was the falling ball. I hope that isn't the case for this seventh video I took, which I am currently using.


(3:03 PM) As I expected, the ROI method doesn't work. Even with high exposure, the chromaticity of the blur is not the same as the ball's chromaticity when it isn't moving. I will try patching together snapshots of the blur in conjunction with the original ROI into a new, larger ROI, and see what happens.


(3:20 PM) I ended up just making the ball in frames where it is stationary have a really low probability of being part of the patched-together ROI. I will just try grayscale image segmentation now.


(3:31 PM) Nope, doesn't quite work either. It looks very similar to my results for the first video I recorded for this activity when grayscale image segmentation is performed.

I think this means that if I use the video I have, I will need to create a customized code for each set of frames, and basically process each set with different parameters just to get the blob for the frames in that set.

I will do that now. Anyway, here are GIF animations of some of my failed attempts:

Fig. 1. GIF animations of failed segmentations. From left to right: first video using color segmentation, first video using grayscale segmentation, seventh video using color segmentation, seventh video using grayscale segmentation.

(6:47 PM) Using nonparametric color segmentation with these per-frame parameters (note that seg_J, the segmentation output, has values between 0 and 1):

Frames 1-3: seg_J > 0, ROI from Frame 1
Frames 4-6: seg_J > 0, ROI from Frame 5
Frames 7-11: seg_J > 0.1, ROI from Frame 9
Frames 12-16: seg_J > 0.3, ROI from Frame 13
Frame 17: seg_J > 0, ROI from Frame 17

I was able to get a segmented version of the first trial's frames. (I forgot to mention that the video contains three trials; the first trial contains 17 frames.) The segmented version looks like:

Fig. 2. Segmented version (right) of the first trial (left) using nonparametric color segmentation.

As seen in Fig. 2, when the ball blurs, it becomes difficult to isolate it from the background. It looks like it disintegrates into powder while falling, until it hits the ground and looks whole again. For the vertical clumps of powder that represent the blurred ball in the middle frames, the middle of each clump should more or less represent the position of the ball at that point in time, so getting the centroid should still work, as long as each clump is joined into one blob and the artifacts are removed.

(7:35 PM) Here is the cleaned version, obtained using morphological operations: closing operators for the hollow blobs, and opening operators for cleaning away artifacts.
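A minimal sketch of this cleanup, assuming IPD's CloseImage() and OpenImage() with a small circular structuring element (the radius here is illustrative, not necessarily the exact value I used):
SE = CreateStructureElement('circle', 3); // illustrative radius
BW = CloseImage(BW, SE); // closing: fill in hollow blobs
BW = OpenImage(BW, SE); // opening: remove small artifacts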

Fig. 3. Segmented frames in Fig. 2 cleaned using morphological operations.
Now to get the centroid locations. I will post this first before I place the results here just in case our internet runs out at a bad time later. I'll edit this shortly.
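Here is a minimal sketch of how each centroid can be computed, assuming the cleaned, labelled frame from SearchBlobs() is in BW_lbl and the ball is blob number 1 (both names hypothetical):
[r, c] = find(BW_lbl == 1); // row and column indices of the ball blob's pixels
cx = mean(c); // centroid x coordinate in pixels
cy = mean(r); // centroid y coordinate in pixels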
x (px)          y (px)
62.040155       18.397668
62.684818       26.245875
63.222785       41.222785
64.428919       64.059538
64.272          95.650667
64.528274       132.66518
66.392709       179.28832
68.240343       231.59485
71.469045       292.85205
70.615854       359.83293
71.282927       435.5561
73.510921       521.00128
77.623872       608.6349
76.209991       713.00044
82.895717       821.03315
86.647849       938.28916
81.093805       1007.1646
The above are pixel coordinates. The corner of the wall spans pixels 55 to 1020 in the image, a height measured to be 147.5 centimeters. This gives a conversion factor of 147.5/(1020 - 55) = 0.1528 centimeters per pixel, or 0.001528 meters per pixel.


Thank you to http://gifmaker.me/ for allowing me to convert frames into GIFs.

This was a very difficult activity to do in such a short time. But this is also the last activity, and that makes it bittersweet. With regards to AP186, I am happy with how the course went. The activities were challenging, and most of them had rough starts. I managed to pull through with lots of help on the earlier activities, and then I had to (and managed to) do many of the later activities on my own, which made them even more fulfilling. I am very proud of how I handled the assigned project, and in a way, I had already overcome the difficulties of video processing through that project. I'm sad I wasn't able to take a good video back at NIP for this last activity. Still, I hope what I've done is enough.

Self-Evaluation: 8/10

Tuesday, December 6, 2016

AP186 Activity 10 Blog Report - Enhancement by Histogram Manipulation

In this activity, we will be using histogram manipulation to enhance dark-looking images.

Take this image, for example:

Fig. 1. Image to be enhanced in this activity.
The grayscale of the above image was obtained using I = rgb2gray(I_rgb). The result is:

Fig 2. Grayscale equivalent of the image in Fig. 1.
The grayscale histogram of the image is then obtainable using plot(imhist(I)), which gives:
Fig. 3. Grayscale histogram of the image.
The maximum and minimum grayscale values were obtained to be 245 (with 1 pixel at that value) and 0 (with 15 pixels at that value). Meanwhile, the maximum and minimum numbers of pixels sharing a particular grayscale value between 0 and 245 are 5257 and 1. The cumulative distribution function (CDF), obtained using plot(y) where y = cumsum(nhist,1) and nhist = imhist(I), is:
Fig. 4. Cumulative distribution function of the image.
The maximum cumulative number of pixels in Fig. 4 is 76800, and this can be used for the desired CDF's maximum as well.

In order to create the desired CDF, I created an array of integers from 0 to 76800 (max cumulative number of pixels) to act as the x-axis of the desired CDF. Then, to act as the y-axis, I divided each element of the x-axis array by 76800 then multiplied it by 245 (max grayscale value) to get the y-axis array. The desired CDF when plotting the x-axis array and the y-axis array then looks like:
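In code form, this construction might look like the following sketch (variable names chosen to match the snippet further below):
desired_x = 0:76800; // cumulative pixel counts, 0 to 76800
desired_y = 245*desired_x/76800; // linear map to grayscale values 0 to 245
plot(desired_x, desired_y); // desired CDF (Fig. 5)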
Fig. 5. Desired cumulative distribution function of the image.
The enhanced image is then obtained by plugging in the grayscale image matrix I (plus 1, since the indices in Scilab start at 1) to the original CDF, and plugging in the result of that to the desired CDF. The resulting array is then reshaped into a matrix with the dimensions of the original image, giving the enhanced image. In code form, this is:
I2 = matrix(desired_y(y(I+1))/255,240,320); // backproject each pixel through the original CDF, then the desired CDF; rescale to [0,1]
where desired_y is the y-axis array of the desired CDF, y is the original CDF, I is the grayscale image matrix, and 240 by 320 is the original image pixel dimensions. The values are divided by 255 to get a range of 0 to 1 in the new image.

The resulting enhanced image, known as a histogram equalized image, looks like:

Fig. 6. Resulting histogram equalized image for the grayscale of the original image.
Upon saving the image in Fig. 6, I immediately read it using imread() to get the histogram of the image using imhist() as well as the CDF using cumsum(). These ended up looking like:
Fig. 7. Histogram of the histogram equalized image in Fig. 6.
The above looks like a more equalized histogram apart from the darker parts, which may be a result of the 0-to-1 fractional values being rounded to integers from 0 to 255 upon saving. Looking at the histogram with just 10 bins, the plot looks very equalized. The CDF of the histogram equalized image is:
Fig. 8. Cumulative distribution function of the histogram equalized image in Fig. 6.
The above CDF seems to imitate the desired CDF, which is expected.

For a hopefully better enhanced image, we have to consider that the human eye does not have a linear response. Thus, we can try using a sigmoid function as the desired CDF.

To create a sigmoid function, I used the second answer in this Stack Overflow question as reference:
http://stackoverflow.com/questions/3741063/scilab-x-6-6-y-1-1e-x-why-it-doesnt-work

Since I need a desired CDF that has an x-axis array from 0 to 76800, similar to Fig. 5, and a y-axis array from 0 to 245 (the desired CDF has to return a grayscale value after histogram manipulation enhancement), the correct modification to the sigmoid function from the source above is:
desired_y = -6:(12/76799):6; // 76800 sample points from -6 to 6
desired_x = 76800*ones(desired_y)./(1+%e.^-desired_y); // sigmoid, scaled to the pixel-count axis
desired_y2 = 245*(desired_y+6)/12; // rescale -6..6 to grayscale 0..245
plot(desired_x,desired_y2); // desired CDF
The desired CDF looks like any old sigmoid function but with the x-axis and y-axis swapped:
Fig. 9. Desired cumulative distribution function following a sigmoid.
I think this is the correct form, since the fourth page of the Activity 10 manual gives a sample desired CDF with a regular looking sigmoid, and the x-axis there has a range of 0 to 255. Thus, the sigmoid must have the x-axis and the y-axis flipped if the 0 to 255 range should actually be on the y-axis in our desired CDF.

The enhanced image using the sigmoid CDF turned out to be:

Fig. 10. Enhanced image for the grayscale of the original image.
Using this technique for a colored image, we reuse some of the code from Activity 7:
I = double(I);
R = I(:,:,1); G = I(:,:,2); B = I(:,:,3);
Int = R + G + B;
We then perform histogram manipulation on the "Int" matrix. I checked the maximum value in the "Int" matrix and obtained 731. I then checked the minimum value, and it is 0, as expected (the sum of three zero channels, after all). Thus, I replaced the 0 to 245 range from the previous code with the 0 to 731 range applicable to the "Int" matrix.
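A sketch of that change, reusing the sigmoid construction from before with only the output range altered:
desired_y = -6:(12/76799):6; // 76800 sample points, as before
desired_x = 76800*ones(desired_y)./(1+%e.^-desired_y);
desired_y2 = 731*(desired_y+6)/12; // rescale to 0..731 instead of 0..245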

In getting the histogram of the "Int" matrix, to be used for obtaining the original CDF, imhist() cannot be used. Thus, we replace the nhist = imhist(I) command with [nhist,nind] = histc(732,Int). Then, in order to use cumsum(), we reshape nhist into 732 rows by 1 column instead of 1 row by 732 columns.
[nhist,nind] = histc(732,Int,normalization=%f); // 732 bins covering values 0..731
nhist = matrix(nhist,732,1); // reshape to a column for cumsum()
y = cumsum(nhist,1); // original CDF of the Int matrix
Anyway, after making those small changes, the code with the sigmoid function works for the "Int" matrix. I named the resulting matrix "Int2". The new RGB channels are then calculated by first getting r, g, and b as follows:
Int(find(Int==0))=100000; // avoids dividing by zero in the chromaticity step
r = R./Int; g = G./Int; b = B./Int;
Then, the new RGB channels are calculated by multiplying the "Int2" matrix with the obtained r, g, and b:
R2 = r.*Int2; G2 = g.*Int2; B2 = b.*Int2;
I_enhanced(:,:,1) = R2; I_enhanced(:,:,2) = G2; I_enhanced(:,:,3) = B2;
I then used 255 to normalize all of the values of the hypermatrix "I_enhanced", which corresponds to the colored enhanced version of the original image. The resulting image is as follows:

Fig. 11. Enhanced colored image using histogram manipulation.
Compared with Fig. 1, Fig. 11 has much better colors on the trees, the background buildings, and even the details of the clothes of some of the spectators. In common with Fig. 6 and Fig. 10, the enhanced images have these weird horizontal scan lines on the very dark parts of the picture. I think this can be attributed to how my cellphone treats those parts when it saves the picture as a relatively low-quality .jpg image.

Although I did this activity in a hurry while still sick, I enjoyed it for how it produced cool resulting images.

Self-Evaluation: 9/10

Monday, December 5, 2016

AP186 Activity 9 Blog Report - Playing Notes by Image Processing

After a week of project making, presenting, and paper writing, as well as four days of being sick, I think I'm ready to proceed with the next activity. There are three activities left, so this would be the third to the last activity.

In this activity, we will be playing notes based on a sheet music image. I tried to think of a good piece of sheet music that wasn't too complicated, and I remembered the song we used to play with our recorders in high school Music class: "Ode to Joy" by Ludwig van Beethoven, or at least its main melody. The song can be summed up by this sheet music from Music-for-Music-Teachers.com:

Fig. 1. Sheet music to be used in this activity.
In the sheet music, there are quarter notes and half notes. The morphological operation I will need should result in blobs that distinguish between the two. But first, I will chop up the entire sheet music image into four parts. Take the first part, here:

Fig. 2. First part of the sheet music.
I plan on placing the entire melody for this first part in a single array. Thus, I initialize it as an empty array S. I then use the reference given in our manual (http://www.phy.mtu.edu/~suits/notefreqs.html) to get the corresponding frequencies for the notes covered by the entire song.
S = []; // empty array that will compile the melody
C = 261.63*2; // note frequencies doubled: one octave above the reference values
D = 293.66*2;
E = 329.63*2;
F = 349.23*2;
G = 392.00*2;
A = 440.00*2;
B = 493.88*2;
And with the code snippet below, also from the manual, we define the function that generates a sound wave based on a note's frequency and a time array.
function n = note(f, t)
    n = sin (2*%pi*f*t);
endfunction;
For the values of t, the sheet music does not specify the tempo or beats per minute of the song. Thus, I will arbitrarily use 240 beats per minute, which corresponds to 0.25 seconds for the quarter note and 0.50 seconds for the half note.
t=soundsec(0.25);
t2=soundsec(0.50);
We now use the image in Fig. 2 to read the sheet music via image processing. After reading using imread() and converting from RGB to grayscale using rgb2gray(), the image is inverted by subtracting each grayscale pixel value from 255. The image is then binarized by setting all grayscale values greater than zero to True and zero values to False. The resulting image is Fig. 3 below:
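In code form, this preprocessing might look like the sketch below (the filename is hypothetical):
I = imread('ode_part1.png'); // hypothetical filename for Fig. 2
Igray = double(rgb2gray(I));
Iinv = 255 - Igray; // inversion: ink becomes bright
BW = Iinv > 0; // binarization: %T wherever there is any ink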

Fig. 3. Result from inversion and binarization of the image in Fig. 2.
A rectangular structuring element with 1 pixel width and 2 pixels height, which looks like a short vertical line, is then used to erode the image. This causes the horizontal lines to disappear.

Fig. 4. Result from erosion using the vertical line structuring element.
Another rectangular structuring element, this time with 2 pixels width and 1 pixel height or a short horizontal line, is used to erode the image in Fig. 4. This causes the vertical lines to disappear.

Fig. 5. Result from erosion using the horizontal line structuring element.
Notice that the half note blob looks like it's separated into two curved lines instead of one hollow oval. Also, it would be best to attach the stray parts of the G-clef and F-clef that look like a note to the main G-clef and F-clef blobs. Thus, I used a rectangular structuring element with 4 pixels width and 1 pixel height, a slightly longer horizontal line than the previous structuring element, to dilate the image, joining the half note blob parts into one hollow oval as well as the stray blobs of the G-clef and F-clef to the main clef blobs.
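A sketch of all three structuring element operations, assuming the 'custom' type of CreateStructureElement() accepts a matrix of booleans (per its source code, linked in my Activity 8 post):
SE_v = CreateStructureElement('custom', [%t; %t]); // 1 wide, 2 tall: its erosion removes the horizontal staff lines
SE_h = CreateStructureElement('custom', [%t %t]); // 2 wide, 1 tall: its erosion removes the vertical stems
SE_h4 = CreateStructureElement('custom', [%t %t %t %t]); // 4 wide, 1 tall: its dilation rejoins the half note ovals
BW2 = ErodeImage(BW, SE_v); // Fig. 4
BW3 = ErodeImage(BW2, SE_h); // Fig. 5
BW4 = DilateImage(BW3, SE_h4); // Fig. 6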

Fig. 6. Result from dilation using the slightly longer horizontal line structuring element.
Now we have a workable image. We use SearchBlobs() to assign a number to each of the blobs in the image, and the area of each blob is checked one by one using size(). I found that pixel areas greater than 40 and less than 45 correspond to the half note blobs, and that pixel areas greater than 45 and less than 60 correspond to the quarter note blobs. Thus, we can filter those blobs out and put them in their own images:

Fig. 7. Quarter note blobs from the first part of the music sheet.
Fig. 8. Half note blobs from the first part of the music sheet.
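A sketch of the area-based filtering that produced Figs. 7 and 8 (thresholds from the text; variable names are hypothetical):
BW_lbl = SearchBlobs(BW4); // number every blob in the cleaned image
quarter = zeros(BW_lbl); half = zeros(BW_lbl);
for k = 1:max(BW_lbl)
    a = size(find(BW_lbl == k), 2); // pixel area of blob k
    if a > 45 & a < 60 then // quarter note blobs
        quarter(find(BW_lbl == k)) = 1;
    elseif a > 40 & a < 45 then // half note blobs
        half(find(BW_lbl == k)) = 1;
    end
end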
I then use SearchBlobs() twice: once on the image containing the quarter notes, and once on the image containing the half notes. Using AnalyzeBlobs() to get the centroid of each blob, I obtained the y-axis locations, which determine the pitch at which each note is played. I then used MS Paint to manually check which y-axis pixel ranges in Fig. 2 correspond to which notes. I found these to be:
// S5: blob measurements from AnalyzeBlobs(); BW1: labelled blobs from SearchBlobs()
// Centroid(2) is the blob's y-coordinate; the ranges below were measured in MS Paint
for i = 1:max(BW1)
    if S5(i).Centroid(2) >= 50 & S5(i).Centroid(2) < 53
        L2 = note(C,t);
    elseif S5(i).Centroid(2) >= 47 & S5(i).Centroid(2) < 50
        L2 = note(D,t);
    elseif S5(i).Centroid(2) >= 44 & S5(i).Centroid(2) < 47
        L2 = note(E,t);
    elseif S5(i).Centroid(2) >= 41 & S5(i).Centroid(2) < 44
        L2 = note(F,t);
    elseif S5(i).Centroid(2) >= 38 & S5(i).Centroid(2) < 41
        L2 = note(G,t);
    elseif S5(i).Centroid(2) >= 35 & S5(i).Centroid(2) < 38
        L2 = note(A,t);
    elseif S5(i).Centroid(2) >= 32 & S5(i).Centroid(2) < 35
        L2 = note(B,t);
    else
        L2 = note(E,t); //this would sound incorrect
    end
    S = cat(2,S,L2);
end
Looping across each numbered blob, I compare its y-axis location against these ranges and place the corresponding note in the array L2. The S array then compiles these arrays via concatenation. The x-axis location of the blob should also be usable to obtain the order in which the notes are played, but since the half note always comes at the end in this particular song, I simply performed the blob scanning on the half notes after the quarter notes, attaching the half note to the end of the array S through concatenation.

I then used the sound() function to listen to the resulting S array, and the wavwrite() function to save it into a .wav file.
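In code form, this might look like the sketch below (I believe 22050 Hz is Scilab's default sampling rate for soundsec(); the filename is hypothetical):
sound(S); // play the compiled melody
wavwrite(S, 22050, 'ode_part1.wav'); // save it as a .wav file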



I made sure the other parts had the same y-axis coordinate positions for the horizontal staff lines of the sheet music. Using the exact same code, but changing the filename to each of the other parts, these were the results:



That ends Activity 9. This was difficult to do since I am still a bit sick. However, I'm glad I was able to do it.

Self-Evaluation: 9/10

Monday, November 21, 2016

AP186 Activity 8 Blog Report - Morphological Operations - Part 2

Here comes the application part! Ma'am Jing provided us with two images of scattered punched paper: one with circles of (relatively) the same size, and another with larger circles among the regular circles to simulate cancer cells.

It's 10:00PM and I'm a bit tired, but so far, I've cropped seven images from the image of just regular circles. I did that using GIMP.

Fig. 1. Digital image of scattered punched paper simulating cells.




Fig. 2. Seven subimages covering 256 by 256 pixel areas of all cells in Fig. 1.


Fig. 3. Digital image of simulated cells with some simulated cancer cells.


(11/23/2016, 9:55AM PhST) I was working on my code yesterday afternoon as well as this morning, and so far I think I'm doing alright.

Anyway, here's a quick recap on my progress so far. I checked for the best segmentation threshold using the histogram of the first subimage, which looks like this:
Fig. 4. Number of pixels per grayscale value of the first subimage.

I settled on a threshold of 200, making all pixels with values greater than 200 white (1) and the rest black (0). I found that a higher threshold loses some information from the cells, while the artifacts that remain at 200 can be removed by morphological operations.
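In code form, the thresholding is a one-liner (a sketch; Igray is a hypothetical name for one grayscale subimage):
BW = Igray > 200; // punched-paper circles become white (1), background black (0)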

So for all seven subimages, I segmented using a grayscale threshold of 200. I then tested the effects of a few morphological operations, including CloseImage, OpenImage, and TopHat, with circular structuring elements of varying radii. I also saw that 'circle' is one of the possible structuring element types in place of 'custom', from this source:

http://forge.scilab.org/index.php/p/IPD/source/tree/9/macros/CreateStructureElement.sci

I picked:
SE = CreateStructureElement('circle',8);
I(:,:,i) = OpenImage(BW, SE); // BW: the segmented subimage; I: stack of processed subimages
And from the first subimage, it seems to successfully remove artifacts and round out some of the broken circles. I then used SearchBlobs, which assigns a number to each blob:
I(:,:,i) = SearchBlobs(I(:,:,i));
I then use this method to make the separate blobs clear through different shades of gray:
imwrite((255.0/double(max(I(:,:,i))))*I(:,:,i),'S_C1_0'+string(i)+'_sb.png'); // scale blob labels to 0..255 grayscale
I tried the entire process for all, and these are the results:







Fig. 5. Segmented subimages (left) using t=200 and the results (right) after OpenImage and SearchBlobs, with each numbered blob made to show at a different shade of gray.

For the third and sixth subimages, the threshold of 200 produced a lot of artifacts. However, the morphological operation seems to have removed them aptly.

I then got the areas by iterating over the slices of my multimatrix containing all of the labelled SearchBlobs results. The maximum of each slice gives the total number of blobs for that slice, so using it as the end point of my iterating index lets me access each numbered blob from 1 to that number. Calculating each blob's area and placing it in an empty array I called "areas" gives all the values I need.
areas = []; // collects blob areas across all seven subimages
for i = 1:7
    for j = 1:max(I(:,:,i)) // max label = number of blobs in slice i
        area = size(find(I(:,:,i) == j)); // [1, pixel count] of blob j
        areas($+1) = area(2); // keep the pixel count
    end
end
I checked the values compiled in the "areas" array. It seems that there are blobs with an area of zero; I simply removed these by making a new list of areas, "areas2", from all entries that aren't zero:
areas2 = []; // nonzero areas only
for i = find(areas > 0)
    areas2($+1) = areas(i);
end
histplot(100,areas2);
The histogram, with 100 bins, of all the pixel areas of each blob turned out like this:
Fig. 6. Histogram of the areas, with x-axis being number of pixels.
I am still trying to interpret the physical equivalent of the above histogram's y-axis. This is because histplot doesn't seem to give the number of blobs within that bin. I will also get the mean and standard deviations of "areas2" shortly.

However, what I know so far is that this new array with the odd zeros removed, "areas2", lists 74 areas when I checked its size. That means there are 74 blobs in total across all subimages.

(12:29PM PhST) I found out how to make the histogram not normalized, so that it gives the number of blobs within each bin. The default of histplot is normalized, so I just have to set the normalization parameter to false instead of true. (Reference: https://help.scilab.org/docs/6.0.0/en_US/histplot.html)

Anyway, below is a better histogram, with 32 bins. Each bin has a 100 pixel range, and the x-axis ranges from areas of 0 pixels to 3200 pixels.
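The call behind Fig. 7 should then look like this sketch:
histplot(32, areas2, normalization=%f); // counts per bin instead of a normalized density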

Fig. 7. Histogram of the areas with 32 bins, and the y-axis being the number of blobs within each range of area values.
From the above histogram, the mode should be in the 500 to 600 pixel range, with 39 out of 74 blobs falling within that range. But since I need the mean, I will calculate it using Scilab.
-->mean(areas2)
 ans  =

    782.3783783783783

-->stdev(areas2)
 ans  =

    646.4978721575065 
These values, particularly the standard deviation, are heavily skewed by the overlapping groups of cells that form large blobs. However, even with the given morphological operations, there doesn't seem to be a combination that eliminates those overlapping groups without eliminating the single cells first. So I don't think a change in morphological operation is the way to go here.

I think that, since the histogram already gives an idea of the outliers, I should calculate the mean and standard deviation with those outliers removed, specifically those above 650 pixels in area. The main goal of this application part is to identify which cells are the cancer cells, so getting a better set of data is key.
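A one-line sketch of this outlier removal:
areas3 = areas2(find(areas2 <= 650)); // keep only blobs of at most 650 px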

-->mean(areas3)
 ans  =

    481.3454545454546

-->stdev(areas3)
 ans  =

    108.5340394090154 
This range is much better for extracting cancer cells, I think. I will think more about this later, since I have class.

(4:27PM PhST) I will try using the same morphological operations on the image with cancer cells, then pick the blobs that are above the mean plus the standard deviation I obtained.


(11/25/2016, 10:08AM PhST) Using 590 (the mean plus standard deviation, rounded) as the area threshold, along with the same grayscale threshold and morphological operation as for the subimages with just normal cells, these are the images I obtained:

Fig. 8. Image with simulated cancer cells among normal cells, segmented.

Fig. 9. Image with artifacts removed through OpenImage with a circular SE of radius 8.

Fig. 10. The previous image with different grayscale values for each numerically assigned blob.
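A sketch of the area filter behind Fig. 11, assuming the labelled image from SearchBlobs() is named Ic (a hypothetical name):
cancer = zeros(Ic); // blank canvas the same size as the labelled image
for k = 1:max(Ic)
    if size(find(Ic == k), 2) > 590 then // keep only blobs above mean + stdev
        cancer(find(Ic == k)) = 1;
    end
end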

Fig. 11. Blobs that exceed the range of areas within the mean plus standard deviation.
The above image still has cells that overlap with each other. Thus, we need a method that removes those that aren't cancer cells. Since we still have the structuring element from before, we can use it to erode twice and then dilate twice.

I do this through these lines of code:
cancer = bool2s(ErodeImage(bool2s(ErodeImage(cancer,SE)),SE)); // erode twice: merged normal cells vanish
cancer = bool2s(DilateImage(bool2s(DilateImage(cancer,SE)),SE)); // dilate twice: restore the surviving blobs' size
with the matrix named "cancer" being Fig. 11 and SE being the same circular structuring element of radius 8 used in the previous morphological operations. This leaves us with the actual cancer cells:

Fig. 12. Filtered cancer cells from the previous image.
I'm not sure if there is a better way to extract the cancer cells in Fig. 12, but I am happy that I was able to pinpoint them from the rest.

Thank you to Albert Yumol, Zed Fernandez and Robertson Esperanza for helping answer questions.

Self-Evaluation (for both Activity 8 Part 1 and Part 2): 8/10