Introduction
This tutorial presents basic concepts on image classification. For this tutorial's purposes, image classification is interpreted as pixel classification, a process in which every pixel in an image is assigned to a class or category. This process can be used to answer questions such as "Which pixels in this image correspond to forests, asphalt, sand, snow?" but not questions like "Does this image contain a red 1966 Volkswagen Beetle?". For some comments on this, see How do I compare two images to see if they are equal?.
Disclaimer: The examples in this tutorial (including data,
classes, samples, methods, parameters and results) were chosen to illustrate the
classification process; their accuracy and applicability cannot be guaranteed
for other purposes. In particular, the code shown here can be used only to
classify color RGB images.
The code presented in this section was written to
be clear and easy to understand, using only a single class with a main
method whenever possible (even avoiding the creation of classes to hold
structures for the classification), and must be adapted for more complex
processing and/or modified to work with other types of images.
Concepts
Pixel-based image classification
I will use a remote sensing image classification example, but the concepts could be easily extended to other domains.
Our task is to identify, for each pixel in an image, which class should be assigned to that pixel. Since there are far fewer classes than possible pixel values, we can consider classification a process of simplification. If the image is classified correctly, tasks such as area measurement and region extraction become straightforward.
There are two main methods for image classification: supervised and unsupervised. In supervised classification, we must teach the classification algorithm how to differentiate one class from another, usually by providing samples of pixels that we know should be assigned to a particular class. The algorithm then uses the information we provided to classify the other pixels in the image.
In unsupervised classification we provide the algorithm with basic information on how many classes we expect to be present in the image, and the algorithm attempts to identify those classes. Some unsupervised algorithms are also known as clustering algorithms. A brief discussion of unsupervised algorithms is given in a separate tutorial.
Most classification methods (both supervised and unsupervised) rely on some distance measure that is calculated over the pixels' values (not their coordinates in the image). These distance measures are discussed in the next section.
Distance in feature space
Regardless of the method used for classification, the idea is that if one pixel is assigned to class A, pixels similar to it should probably be assigned to the same class.
The concept of similarity for pixel classification is very important -- pixels are similar to each other not in regard to their position or context, but in regard to their values. In the case of RGB images, each pixel is represented by three coordinates, usually in the range [0-255], which represent a point in the three-dimensional feature space. The same is true for pixels of multispectral or hyperspectral images: a pixel of an image with 100 bands is a point in the 100-dimensional feature space. Although we can only visualize a few dimensions, the mathematics for calculating pixels' similarity is the same for any number of dimensions.
As an example, the pixel whose RGB values are (128,128,128) can be plotted at the exact center of the RGB cube (or feature space), while the pixels (0,0,255) and (255,0,0) would be located at different vertices of the RGB cube. Note that this is very different from the position of the pixels in the image: neighboring pixels can be very distant in the feature space, and pixels distant in the image can be very close in the feature space.
Two concepts based on distance are relevant for this tutorial: intervals on the feature space and distances in the feature space. An interval between two pixels is a region (a hyperrectangle) in feature space bounded by two pixels' values, and is equivalent to a rule in a rule-based expert system: for example, the region bounded by (18,30,50) and (23,48,90) in the RGB feature space contains all pixels for which the rule "18≤R≤23 and 30≤G≤48 and 50≤B≤90" evaluates as true. A very simple classifier can be constructed to create and apply the rules. This classifier is called the parallelepiped classifier and will be shown in this tutorial.
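The interval rule above can be sketched as a simple membership test. This is a minimal illustration (the class and method names are mine, not part of the tutorial's code), using the example bounds (18,30,50) and (23,48,90) from the text:

```java
public class IntervalRule {
    // Tests whether an RGB pixel falls inside the hyperrectangle bounded by
    // (18,30,50) and (23,48,90), i.e. whether the rule
    // "18<=R<=23 and 30<=G<=48 and 50<=B<=90" evaluates as true.
    static boolean inInterval(int r, int g, int b) {
        return 18 <= r && r <= 23 && 30 <= g && g <= 48 && 50 <= b && b <= 90;
    }

    public static void main(String[] args) {
        System.out.println(inInterval(20, 40, 70)); // all three components in range
        System.out.println(inInterval(20, 40, 95)); // blue component out of range
    }
}
```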
The distance between two pixels in feature space can be calculated in several different ways. For this tutorial we will consider only the Euclidean distance, which is the shortest distance between the pixels in feature space, and is calculated as the square root of the summation of the squared differences between the pixels' values. For example, the distance between (49,37,118) and (33,31,200) is given by the square root of (49-33)*(49-33)+(37-31)*(37-31)+(118-200)*(118-200), or 83.76.
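The Euclidean distance calculation can be written in a few lines; the sketch below (class and method names are mine) works for any number of bands, and reproduces the example from the text:

```java
public class FeatureSpaceDistance {
    // Euclidean distance between two pixels in feature space: the square root
    // of the sum of the squared differences between the pixels' values.
    static double distance(int[] p, int[] q) {
        double sum = 0;
        for (int i = 0; i < p.length; i++) {
            double d = p[i] - q[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // The example from the text: distance between (49,37,118) and (33,31,200).
        double d = distance(new int[]{49, 37, 118}, new int[]{33, 31, 200});
        System.out.printf("%.2f%n", d); // prints 83.76
    }
}
```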
The distance between pixels in feature space is used for comparison: for several algorithms we calculate the distance of a pixel to other pixels (always in feature space) and decide on the class based on the smallest distance. For example, consider that we know that the pixels from class Forest have values around (21,71,40) and pixels from class Urban have pixels with values around (66,75,70). If we are presented with a pixel for which the class is unknown and with values (37,77,53) we can calculate the distances between this pixel and the pixel from the class Forest (21.47) and to the pixel from class Urban (33.67) and decide that since the pixel is closer (in feature space) to Forest than Urban it probably should be assigned to the class Forest. This classification method is called minimum distance classifier and will also be explored in this tutorial.
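The Forest/Urban decision described above can be sketched directly (again with illustrative class and method names of my own):

```java
public class NearestClass {
    // Euclidean distance in feature space, as defined in the text.
    static double distance(int[] p, int[] q) {
        double sum = 0;
        for (int i = 0; i < p.length; i++) {
            double d = p[i] - q[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        int[] pixel  = {37, 77, 53}; // the pixel with unknown class
        int[] forest = {21, 71, 40}; // representative value for class Forest
        int[] urban  = {66, 75, 70}; // representative value for class Urban
        double dForest = distance(pixel, forest); // about 21.47
        double dUrban  = distance(pixel, urban);  // about 33.67
        // Assign the pixel to the class with the smallest distance.
        System.out.println(dForest < dUrban ? "Forest" : "Urban"); // prints Forest
    }
}
```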
Sample extraction for pixel-based classification
Many supervised classification methods require, as input, samples of all the classes that will be used for the classification. Most algorithms won't use the sample pixels' values directly, instead calculating signatures from those pixels that will be used to represent the corresponding classes. Signatures can be considered descriptors for the classes, often containing statistical information about the pixels used as samples. Signatures are specific to a classification method, each one requiring different information about the pixels for the classification step.
To create signatures (or train the supervised classification algorithm) we need to identify regions in a prototypical image that contain pixels for a particular class (the samples) and use those pixels to calculate the signatures. This process must be repeated for each class, and we can use one or more sample regions per class. The identification of samples should be done by an expert on the image features (for our examples, someone with remote sensing training).
At this point one may ask, "If I have to search the image for pixels of the classes I want to find, why not just paint the whole image with colors corresponding to the classes?" Well, usually we need far fewer pixels to train the algorithm than are present in the whole image, and pixel-by-pixel painting is not as easy as it seems: it is slow, expensive and error-prone -- click here (JPG, 149.8K) for a challenge!
Since the type of signature and how it is calculated depends on the classification algorithm, the methods to calculate the signature will be presented separately for each algorithm.
Samples for the examples in this tutorial
Landsat ETM image of Pará, Brazil.
This is the image we will use for classification in this tutorial's examples. This image is from a region close to the Tucuruí dam in Pará state, in the north of Brazil. It is a 3-band image from the Landsat 7 Enhanced Thematic Mapper (ETM+) sensor. Bands 7 (mid-infrared), 4 (near-infrared) and 2 (green) were used to compose the RGB image. The image shown on the left was reduced and enhanced for better visualization; click here (PNG, 1.1M) to get the original, unretouched, 781x671-pixel image that will be used for classification. The classes that will be used for classification are clouds, shadows, water, forest, pasture and urban, and some sample regions were selected on the image for each class. Again, these classes and their samples were chosen for demonstration purposes; for real applications one should ask for the assistance of an expert in remote sensing.
The applications associated with this tutorial use two text files shared by the different classifiers. One of the files is the classes definition file, which contains, in each of its lines, a definition for a class in the classification task. This definition consists of a unique integer identifier, followed by three values which will be used as the reference color for that class, followed by the class name. For example, the line that defines the Urban color can be written as 6 255 80 80 Urban. Lines starting with the hash symbol (#) are considered comments. See also the text file with the classes definition (TXT, 0.3K) for this tutorial.
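For reference, an excerpt of such a classes definition file might look like the fragment below; the comment line is illustrative, and only the Urban entry is quoted from the text above:

```text
# classId  R  G  B  className  (lines starting with # are comments)
6 255 80 80 Urban
```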
We will also need to declare which regions will be used as samples for each class. This is done by writing the coordinates (relative to the image) of the regions in a samples definition file. Each line in this file must contain five numbers: the first is the unique identifier of the class, followed by the coordinates of the upper-left corner of the rectangle that contains the samples, followed by the rectangle's width and height. For example, one of the samples for the class Cloud shown below is defined by the line 1 306 62 6 8. Lines starting with the hash symbol (#) are considered comments. See also the text file with the samples coordinates (TXT, 0.6K) for this tutorial.
Those files were used to define the classes and samples used in this tutorial (samples were extracted with a graphics editing program, which showed the rectangle coordinates for a selected region). Some of the regions sampled for this example are shown below. Click here (JPG, 133.2K) to see all the sampled areas shown over the image. The color used in the rectangles around the samples is the class reference color.
Sample area for classes Cloud and Shadow.
Sample area for class Shadow.
Sample area for class Water.
Sample area for class Forest.
Sample area for class Pasture.
Sample area for class Urban.
At this point we have several pixels for which we manually assigned the classes. Although all pixels for a class appear to have more or less the same color, their values are quite different. We can see the differences in the plots below, which show each of the pixels in the samples projected on the RG, RB and GB planes, using the classes' reference colors (so red pixels belong to the class Urban, for example).
Plot for R/G plane.
Plot for R/B plane.
Plot for G/B plane.
The same data can be visualized in a three-dimensional plot in the interactive applet below. Move the mouse around the plot to see the data from a different perspective; click any mouse button for a quick zoom into the data.
Interactive RGB Plot.
From the plots above we can see that the values of the pixels for some samples are concentrated in a narrow region (water, shadow) while others are spread out (cloud, urban). More important for our purposes is the fact that some of the classes' samples overlap, i.e. there are no lines or planes in the plots that clearly separate any class from the others. We will see how this affects the classification algorithms later.
With some pixels labeled (i.e. identified) for each class, we can calculate the signatures for each class. The signature extraction method depends on the classifier, and will be presented separately in the sections below.
The Parallelepiped Classifier
The parallelepiped classifier is a very simple supervised classifier that uses intervals or bounded regions of pixels' values to determine whether a pixel belongs to a class or not. The intervals' bounding points are obtained from the values of the pixels of samples for the class. Since this classifier is supervised, there are two steps in its use: signature creation (training) and classification.
Signature Creation (training)
The signature creation step uses as input the original image, the classes description and the samples data files to calculate the minimum and maximum bounds for each class. The steps are as follows:
- Read the classes description file and initialize, for each class, the minimum and maximum bounds.
- For each sample region, read the values of all its pixels and adjust the corresponding class' minimum and maximum bounds.
- Save the bounds for each class as its signature.
This simple procedure will calculate the minimum and maximum bounds for each class, which will be used as the signatures for the classes. The procedure is implemented in the CreateParallelepipedSignatures application, shown below.
CreateParallelepipedSignatures.java
/*
 * Part of the Java Image Processing Cookbook, please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook.jsp
 * for information on usage and distribution.
 * Rafael Santos (rafael.santos@lac.inpe.br)
 */
package tutorials.simpleclassifier;

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.TreeMap;

import javax.imageio.ImageIO;

/**
 * This application creates signatures for each class for a parallelepiped classifier.
 * Please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook.jsp
 * for more information on the files and formats used in this class.
 */
public final class CreateParallelepipedSignatures
{
  /**
   * The application entry point. We must pass three parameters: the original
   * image file name, the name of the file with the description of the classes,
   * and the name of the file with the coordinates for the samples.
   * @throws IOException
   */
  public static void main(String[] args) throws IOException
  {
    // Check parameters names.
    if (args.length != 3)
    {
      System.err.println("Must pass three command-line parameters to this application:");
      System.err.println(" - The original image (from which samples will be extracted);");
      System.err.println(" - The file with the classes names and colors");
      System.err.println(" - The file with the samples coordinates");
      System.exit(1);
    }
    // Open the original image.
    BufferedImage input = ImageIO.read(new File(args[0]));
    // Read the classes description file.
    BufferedReader br = new BufferedReader(new FileReader(args[1]));
    // Store the classes color in a map.
    TreeMap<Integer,Color> classMap = new TreeMap<Integer,Color>();
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 4) continue;
      int classId = Integer.parseInt(st.nextToken());
      int r = Integer.parseInt(st.nextToken());
      int g = Integer.parseInt(st.nextToken());
      int b = Integer.parseInt(st.nextToken());
      classMap.put(classId,new Color(r,g,b));
    }
    br.close();
    // Create the structures to represent the bounds for the parallelepipeds,
    // one for each class. Behold the power of the Collections!
    TreeMap<Integer,int[]> minMap = new TreeMap<Integer,int[]>();
    TreeMap<Integer,int[]> maxMap = new TreeMap<Integer,int[]>();
    for(Integer classIndex:classMap.keySet())
    {
      minMap.put(classIndex,new int[]{1000,1000,1000}); // large enough
      maxMap.put(classIndex,new int[]{-1,-1,-1}); // small enough
    }
    // Open the file with the coordinates and get the pixels' values for those
    // coordinates.
    br = new BufferedReader(new FileReader(args[2]));
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 5) continue;
      int classId = Integer.parseInt(st.nextToken());
      int x = Integer.parseInt(st.nextToken());
      int y = Integer.parseInt(st.nextToken());
      int w = Integer.parseInt(st.nextToken());
      int h = Integer.parseInt(st.nextToken());
      Color c = classMap.get(classId);
      if (c != null) // We have a region!
      {
        // Get the bounds for this region.
        int[] min = minMap.get(classId);
        int[] max = maxMap.get(classId);
        // Let's get all pixels values in it.
        for(int row=0;row<=h;row++)
          for(int col=0;col<=w;col++)
          {
            int rgb = input.getRGB(x+col,y+row);
            int r = (int)((rgb&0x00FF0000)>>>16); // Red level
            int g = (int)((rgb&0x0000FF00)>>>8); // Green level
            int b = (int) (rgb&0x000000FF); // Blue level
            // Use those values to adjust the bounds for the parallelepipeds.
            min[0] = Math.min(min[0],r); max[0] = Math.max(max[0],r);
            min[1] = Math.min(min[1],g); max[1] = Math.max(max[1],g);
            min[2] = Math.min(min[2],b); max[2] = Math.max(max[2],b);
          }
        // Put the bounds back on the map.
        minMap.put(classId,min);
        maxMap.put(classId,max);
      }
    }
    br.close();
    // The values on the maps are the bounds for each class. Let's save them
    // to a file so we can reuse them in the classifier.
    BufferedWriter bw = new BufferedWriter(new FileWriter("parallel_signatures.txt"));
    // In each line information for a class.
    for(Integer classId:classMap.keySet())
    {
      bw.write(classId+" ");
      int[] min = minMap.get(classId);
      int[] max = maxMap.get(classId);
      bw.write(min[0]+" "+min[1]+" "+min[2]+" ");
      bw.write(max[0]+" "+max[1]+" "+max[2]+" ");
      bw.newLine();
    }
    bw.close();
  }
}
The CreateParallelepipedSignatures application uses as input the original image, a text file with the classes' description and a text file with the samples' coordinates. The formats of those text files are described in the section Samples for the examples in this tutorial.
The application was executed using the example image, the class definition file and the samples' coordinates file. The resulting signatures are stored in a text file, parallel_signatures.txt (TXT, 0.1K), summarized in the table below.
    | Clouds        | Shadows    | Water      | Forest     | Pasture    | Urban
min | (76,106,103)  | (8,17,31)  | (5,9,33)   | (15,43,36) | (26,62,46) | (43,50,51)
max | (215,166,255) | (17,34,38) | (21,37,45) | (48,98,48) | (69,93,63) | (160,107,142)
Classification
The classification process for the parallelepiped classifier is very simple; all we need is the original image, the classes description (so we can use different colors to represent the different classes) and the signatures created by the CreateParallelepipedSignatures application. The steps are as follows:
- Read the classes description file and the signatures file.
- For each pixel in the image, check whether its values fall within the bounds of each class; assign the pixel to a class whose bounds contain it, or leave it unclassified.
- Paint the corresponding pixel of the output image with the class' reference color (black for unclassified pixels).
The classification algorithm allows rejection of a pixel, meaning that the pixel's RGB values were outside the bounds for all classes, leaving the pixel unclassified. There are ways to avoid rejection; one obvious way is to get more samples, preferably extracted from the regions with unclassified pixels. Depending on the application, a certain amount of unclassified pixels is acceptable.
The classification algorithm is implemented by the ClassifyWithParallelepipedAlgorithm application, shown below.
ClassifyWithParallelepipedAlgorithm.java
/*
 * Part of the Java Image Processing Cookbook, please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook.jsp
 * for information on usage and distribution.
 * Rafael Santos (rafael.santos@lac.inpe.br)
 */
package tutorials.simpleclassifier;

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.TreeMap;

import javax.imageio.ImageIO;

/**
 * This application classifies images using the parallelepiped classifier.
 * Please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook.jsp
 * for more information on the files and formats used in this class.
 */
public class ClassifyWithParallelepipedAlgorithm
{
  /**
   * The application entry point. We must pass three parameters: the original
   * image file name, the name of the file with the description of the classes,
   * and the name of the file with the signatures for the classes.
   * @throws IOException
   */
  public static void main(String[] args) throws IOException
  {
    // Check parameters names.
    if (args.length != 3)
    {
      System.err.println("Must pass three command-line parameters to this application:");
      System.err.println(" - The original image (which will be classified);");
      System.err.println(" - The file with the classes names and colors");
      System.err.println(" - The file with the signatures for each class");
      System.exit(1);
    }
    // Open the original image.
    BufferedImage input = ImageIO.read(new File(args[0]));
    // Read the classes description file.
    BufferedReader br = new BufferedReader(new FileReader(args[1]));
    // Store the classes color in a map.
    TreeMap<Integer,Color> classMap = new TreeMap<Integer,Color>();
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 4) continue;
      int classId = Integer.parseInt(st.nextToken());
      int r = Integer.parseInt(st.nextToken());
      int g = Integer.parseInt(st.nextToken());
      int b = Integer.parseInt(st.nextToken());
      classMap.put(classId,new Color(r,g,b));
    }
    br.close();
    // Read the signatures from a file.
    TreeMap<Integer,int[]> minMap = new TreeMap<Integer,int[]>();
    TreeMap<Integer,int[]> maxMap = new TreeMap<Integer,int[]>();
    br = new BufferedReader(new FileReader(args[2]));
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 7) continue;
      int classId = Integer.parseInt(st.nextToken());
      int[] min = new int[3]; int[] max = new int[3];
      min[0] = Integer.parseInt(st.nextToken());
      min[1] = Integer.parseInt(st.nextToken());
      min[2] = Integer.parseInt(st.nextToken());
      max[0] = Integer.parseInt(st.nextToken());
      max[1] = Integer.parseInt(st.nextToken());
      max[2] = Integer.parseInt(st.nextToken());
      minMap.put(classId,min);
      maxMap.put(classId,max);
    }
    br.close();
    // Create a color image to hold the results of the classification.
    int w = input.getWidth(); int h = input.getHeight();
    BufferedImage results = new BufferedImage(w,h,BufferedImage.TYPE_INT_RGB);
    // Do the classification, pixel by pixel, selecting which class they should be assigned to.
    for(int row=0;row<h;row++)
      for(int col=0;col<w;col++)
      {
        int rgb = input.getRGB(col,row);
        int r = (int)((rgb&0x00FF0000)>>>16); // Red level
        int g = (int)((rgb&0x0000FF00)>>>8); // Green level
        int b = (int) (rgb&0x000000FF); // Blue level
        // To which class should we assign this pixel?
        Color assignedClass = new Color(0,0,0); // unassigned.
        for(int key:minMap.keySet())
        {
          if (isBetween(r,g,b,minMap.get(key),maxMap.get(key)))
          {
            assignedClass = classMap.get(key);
          }
        }
        // With the color, paint the output image.
        results.setRGB(col,row,assignedClass.getRGB());
      }
    // At the end, store the resulting image.
    ImageIO.write(results,"PNG",new File("classified-with-parallelepiped.png"));
  }

  private static boolean isBetween(int r,int g,int b,int[] min,int[] max)
  {
    return ((min[0] <= r) && (r <= max[0]) &&
            (min[1] <= g) && (g <= max[1]) &&
            (min[2] <= b) && (b <= max[2]));
  }
}
The application was executed with the image, class description and signature files used as examples. The resulting image is shown below.
Image classified with the parallelepiped method.
This is the image obtained as the result of applying the parallelepiped classification algorithm. The colors of the pixels are the same as specified in the class description file (red for Urban, blue for Water, etc.), with some pixels in black because they were rejected by the classifier. The size of the image shown on the left was reduced for better visualization; click here (PNG, 91.9K) to get the original 781x671-pixel classified image.
We can see some errors in the classification, with apparent misclassification involving the classes Urban and Pasture and (more noticeably) Water and Shadow. We can see in the projected plots of the samples' distributions that the values of the samples for those classes are similar, with many superposed values -- rectangles that bound the sample values for one class would intersect the bounding rectangles of other classes.
The approach and implementation shown here have a serious problem: if a pixel belongs to more than one class (which can happen when the classes' bounds in feature space overlap), the pixel will be classified as one of those classes, determined by the implementation of the algorithm (the one used here picks the class with the highest identifier). Since the pixel could belong to more than one class, a tie-breaking procedure should be applied, or the pixel should be assigned to yet another class (e.g. undecided).
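One way to make the ambiguity explicit is to count how many classes' bounds contain the pixel and reserve a special result for multiple matches. The sketch below is not part of the original code; the class name, the made-up bounds and the id conventions (0 for unclassified, -1 for undecided) are illustrative:

```java
import java.util.Map;
import java.util.TreeMap;

public class UndecidedDemo {
    static boolean isBetween(int r, int g, int b, int[] min, int[] max) {
        return min[0] <= r && r <= max[0] && min[1] <= g && g <= max[1]
            && min[2] <= b && b <= max[2];
    }

    // Returns the single matching class id, 0 if no class matches
    // (unclassified), or -1 if more than one class matches (undecided).
    static int classify(int r, int g, int b,
                        Map<Integer,int[]> minMap, Map<Integer,int[]> maxMap) {
        int matches = 0, lastId = 0;
        for (int id : minMap.keySet()) {
            if (isBetween(r, g, b, minMap.get(id), maxMap.get(id))) {
                matches++;
                lastId = id;
            }
        }
        if (matches == 0) return 0;   // rejected by all classes
        if (matches > 1) return -1;   // ambiguous: bounds overlap here
        return lastId;
    }

    public static void main(String[] args) {
        TreeMap<Integer,int[]> minMap = new TreeMap<Integer,int[]>();
        TreeMap<Integer,int[]> maxMap = new TreeMap<Integer,int[]>();
        // Two made-up classes with overlapping bounds.
        minMap.put(1, new int[]{0, 0, 0});    maxMap.put(1, new int[]{100, 100, 100});
        minMap.put(2, new int[]{50, 50, 50}); maxMap.put(2, new int[]{150, 150, 150});
        System.out.println(classify(10, 10, 10, minMap, maxMap));    // prints 1
        System.out.println(classify(60, 60, 60, minMap, maxMap));    // prints -1
        System.out.println(classify(200, 200, 200, minMap, maxMap)); // prints 0
    }
}
```

In the classification step, the -1 result could then be painted with a dedicated "undecided" color rather than silently picking one of the matching classes.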
The Minimum Distance Classifier
The minimum distance classifier is also a very simple supervised classifier that uses a central point (in feature space) to represent a class. The central point is calculated as the average of all pixels in all samples for that class. Classification is performed by calculating the distance (always in feature space) from a pixel with unknown class to the central point of each class and choosing the class that yields the smallest distance. Again, since it is a supervised classification algorithm, there are two steps in its use: signature creation and classification.
Signature Creation (training)
The signature creation step is very similar to that used by the parallelepiped classifier, except that instead of bounds we create a single data vector corresponding to the average of all pixels in all samples for a particular class. The steps are as follows:
- Read the classes description file and create, for each class, an accumulator vector and a pixel counter.
- For each sample region, add the values of all its pixels to the class' accumulator and increment its counter.
- Divide each accumulator by its counter and save the resulting average vector as the class' signature.
This simple procedure will calculate the average vector for each class, which will be used as signatures for the classes in the classification step. The procedure is implemented in the CreateMinimumDistanceSignatures application, shown below.
CreateMinimumDistanceSignatures.java
/*
 * Part of the Java Image Processing Cookbook, please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook/index.jsp
 * for information on usage and distribution.
 * Rafael Santos (rafael.santos@lac.inpe.br)
 */
package tutorials.simpleclassifier;

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.TreeMap;

import javax.imageio.ImageIO;

/**
 * This application creates signatures for each class for a minimum distance classifier.
 * Please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook
 * for more information on the files and formats used in this class.
 */
public final class CreateMinimumDistanceSignatures
{
  /**
   * The application entry point. We must pass three parameters: the original
   * image file name, the name of the file with the description of the classes,
   * and the name of the file with the coordinates for the samples.
   * @throws IOException
   */
  public static void main(String[] args) throws IOException
  {
    // Check parameters names.
    if (args.length != 3)
    {
      System.err.println("Must pass three command-line parameters to this application:");
      System.err.println(" - The original image (from which samples will be extracted);");
      System.err.println(" - The file with the classes names and colors");
      System.err.println(" - The file with the samples coordinates");
      System.exit(1);
    }
    // Open the original image.
    BufferedImage input = ImageIO.read(new File(args[0]));
    // Read the classes description file.
    BufferedReader br = new BufferedReader(new FileReader(args[1]));
    // Store the classes color in a map.
    TreeMap<Integer,Color> classMap = new TreeMap<Integer,Color>();
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 4) continue;
      int classId = Integer.parseInt(st.nextToken());
      int r = Integer.parseInt(st.nextToken());
      int g = Integer.parseInt(st.nextToken());
      int b = Integer.parseInt(st.nextToken());
      classMap.put(classId,new Color(r,g,b));
    }
    br.close();
    // Create the structures to represent the signature for the minimum distance
    // classifier: the average value of the pixels in the samples for each class.
    TreeMap<Integer,double[]> avgMap = new TreeMap<Integer,double[]>();
    // We will also need to count the number of pixels in a class' samples.
    TreeMap<Integer,Integer> countMap = new TreeMap<Integer,Integer>();
    for(Integer classIndex:classMap.keySet())
    {
      avgMap.put(classIndex,new double[]{0,0,0});
      countMap.put(classIndex,0);
    }
    // Open the file with the coordinates and get the pixels' values for those
    // coordinates.
    br = new BufferedReader(new FileReader(args[2]));
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 5) continue;
      int classId = Integer.parseInt(st.nextToken());
      int x = Integer.parseInt(st.nextToken());
      int y = Integer.parseInt(st.nextToken());
      int w = Integer.parseInt(st.nextToken());
      int h = Integer.parseInt(st.nextToken());
      Color c = classMap.get(classId);
      if (c != null) // We have a region!
      {
        double[] accum = avgMap.get(classId);
        int count = countMap.get(classId);
        // Let's get all pixels values in it.
        for(int row=0;row<=h;row++)
          for(int col=0;col<=w;col++)
          {
            int rgb = input.getRGB(x+col,y+row);
            int r = (int)((rgb&0x00FF0000)>>>16); // Red level
            int g = (int)((rgb&0x0000FF00)>>>8); // Green level
            int b = (int) (rgb&0x000000FF); // Blue level
            // Add them to the average value.
            accum[0] += r; accum[1] += g; accum[2] += b;
            count++;
          }
        // Put the average and count values back on the map.
        avgMap.put(classId,accum);
        countMap.put(classId,count);
      }
    }
    br.close();
    // Write the average value vector, doing the actual averaging before.
    BufferedWriter bw = new BufferedWriter(new FileWriter("mindist_signatures.txt"));
    // In each line information for a class.
    for(Integer classId:classMap.keySet())
    {
      bw.write(classId+" ");
      double[] avg = avgMap.get(classId);
      int count = countMap.get(classId);
      bw.write(avg[0]/count+" "+avg[1]/count+" "+avg[2]/count+" ");
      bw.newLine();
    }
    bw.close();
  }
}
This application uses the original image, a text file with the classes' description and a text file with the samples' coordinates. The formats of those text files are described in the section Samples for the examples in this tutorial.
The signatures created with this application were stored in a text file, mindist_signatures.txt (TXT, 0.3K), summarized in the table below.
    | Clouds                   | Shadows               | Water                | Forest                 | Pasture               | Urban
avg | (166.45, 143.21, 215.82) | (11.67, 21.75, 33.53) | (9.35, 11.60, 37.23) | (23.90, 71.039, 40.92) | (45.39, 74.82, 55.87) | (98.93, 69.54, 84.20)
Classification
Classification with the minimum distance algorithm is also very simple. All we need is the original image, the classes description and the signatures created with the CreateMinimumDistanceSignatures application. The steps are as follows:
- Read the classes description file and the signatures (average vectors) file.
- For each pixel in the image, calculate the Euclidean distance (in feature space) from the pixel to each class' average vector.
- Assign the pixel to the class with the smallest distance and paint the corresponding pixel of the output image with that class' reference color.
This algorithm does not, by itself, consider rejection of pixels: a pixel will be classified as the class with the closest center in feature space even if "closest" is still quite far. We can change the algorithm to allow rejection when the closest distance is larger than a threshold passed to the algorithm. The algorithm (with the distance rejection option) is implemented by the class ClassifyWithMinimumDistanceAlgorithm, shown below.
/*
 * Part of the Java Image Processing Cookbook, please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook/index.jsp
 * for information on usage and distribution.
 * Rafael Santos (rafael.santos@lac.inpe.br)
 */
package tutorials.simpleclassifier;

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.TreeMap;

import javax.imageio.ImageIO;

/**
 * This application classifies images using the minimum distance classifier.
 * Please see
 * http://www.lac.inpe.br/~rafael.santos/JIPCookbook
 * for more information on the files and formats used in this class.
 */
public class ClassifyWithMinimumDistanceAlgorithm
{
  /**
   * The application entry point. We must pass at least three parameters: the original
   * image file name, the name of the file with the description of the classes,
   * and the name of the file with the signatures for the classes.
   * If an additional numeric parameter is passed, it will be used as a threshold
   * for classification (see tutorial for more information).
   * @throws IOException
   */
  public static void main(String[] args) throws IOException
  {
    // Check parameters names.
    if (args.length < 3)
    {
      System.err.println("Must pass at least three command-line parameters to this application:");
      System.err.println(" - The original image (which will be classified);");
      System.err.println(" - The file with the classes names and colors");
      System.err.println(" - The file with the signatures for each class");
      System.err.println(" - (optionally) a threshold for minimum distance for classification");
      System.exit(1);
    }
    // Open the original image.
    BufferedImage input = ImageIO.read(new File(args[0]));
    // Read the classes description file.
    BufferedReader br = new BufferedReader(new FileReader(args[1]));
    // Store the classes color in a map.
    TreeMap<Integer,Color> classMap = new TreeMap<Integer,Color>();
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 4) continue;
      int classId = Integer.parseInt(st.nextToken());
      int r = Integer.parseInt(st.nextToken());
      int g = Integer.parseInt(st.nextToken());
      int b = Integer.parseInt(st.nextToken());
      classMap.put(classId,new Color(r,g,b));
    }
    br.close();
    // Read the signatures (the average vectors) from a file.
    TreeMap<Integer,double[]> avgMap = new TreeMap<Integer,double[]>();
    br = new BufferedReader(new FileReader(args[2]));
    while(true)
    {
      String line = br.readLine();
      if (line == null) break;
      if (line.startsWith("#")) continue;
      StringTokenizer st = new StringTokenizer(line);
      if (st.countTokens() < 4) continue;
      int classId = Integer.parseInt(st.nextToken());
      double[] avg = new double[3];
      avg[0] = Double.parseDouble(st.nextToken());
      avg[1] = Double.parseDouble(st.nextToken());
      avg[2] = Double.parseDouble(st.nextToken());
      avgMap.put(classId,avg);
    }
    br.close();
    // If a fourth parameter was passed, use it as the rejection threshold.
    double threshold = Double.MAX_VALUE;
    if (args.length > 3) threshold = Double.parseDouble(args[3]);
    // Create a color image to hold the results of the classification.
    int w = input.getWidth(); int h = input.getHeight();
    BufferedImage results = new BufferedImage(w,h,BufferedImage.TYPE_INT_RGB);
    // Do the classification, pixel by pixel.
    for(int row=0;row<h;row++)
      for(int col=0;col<w;col++)
      {
        int rgb = input.getRGB(col,row);
        int r = (int)((rgb&0x00FF0000)>>>16); // Red level
        int g = (int)((rgb&0x0000FF00)>>>8); // Green level
        int b = (int) (rgb&0x000000FF); // Blue level
        // Find the class whose center is closest to this pixel.
        Color assignedClass = new Color(0,0,0); // unassigned.
        double smallestDistance = Double.MAX_VALUE;
        for(int key:avgMap.keySet())
        {
          double[] avg = avgMap.get(key);
          double distance = Math.sqrt((r-avg[0])*(r-avg[0])+
                                      (g-avg[1])*(g-avg[1])+
                                      (b-avg[2])*(b-avg[2]));
          if (distance < smallestDistance)
          {
            smallestDistance = distance;
            assignedClass = classMap.get(key);
          }
        }
        // Reject the pixel if even the smallest distance is above the threshold.
        if (smallestDistance > threshold) assignedClass = new Color(0,0,0);
        results.setRGB(col,row,assignedClass.getRGB());
      }
    // At the end, store the resulting image.
    ImageIO.write(results,"PNG",new File("classified-with-minimum-distance.png"));
  }
}