A PROJECT REPORT ON Moving Object Detection Based On Kirsch Operator Combined With Optical Flow
ABSTRACT
The detection of moving objects is important in many tasks, such as video surveillance and moving object tracking. Although several methods exist for moving object detection, it remains a challenging problem. In this paper, a new method that combines the Kirsch operator with the optical flow method (KOF) is proposed. On the one hand, the Kirsch operator is used to compute the contours of the objects in the video. On the other hand, the optical flow method is adopted to establish the motion vector field for the video sequence. The Otsu method is then applied after the optical flow step in order to distinguish the moving objects from the background clearly. Finally, the contour information is fused with the motion vector field information to label the moving objects in the video sequences. The experimental results show that the proposed method is effective for moving object detection.
I. INTRODUCTION

Moving object detection is the first step in video analysis. It can be used in many areas such as video surveillance, traffic monitoring and people tracking. Generally speaking, there are three common motion segmentation techniques: frame difference, background subtraction and the optical flow method. The frame difference method has low computational complexity and is easy to implement, but it generally does a poor job of extracting the complete shapes of certain types of moving objects. The background subtraction method subtracts a reference background image from the current frame; pixels where the difference is above a threshold are classified as belonging to the moving object. The Mixture of Gaussians method has been widely used for background modeling since it was proposed by Friedman and Russell. Stauffer presented an adaptive background mixture model based on a mixture of K Gaussian distributions. The optical flow method can detect moving objects even when the camera moves, but it takes more time because of its computational complexity, and it is very sensitive to noise. The motion area usually appears quite noisy in real images, and optical flow estimation involves only local computation, so the optical flow method cannot detect the exact contour of the moving object. From the above it is clear that the traditional moving object detection methods have some shortcomings:
• The frame difference method cannot detect the exact contour of the moving object.
• The optical flow method is sensitive to noise.
The KOF method proposed in this paper addresses these problems. It uses the Kirsch operator to acquire the boundary information of the moving objects, while the optical flow method is used to obtain their motion vector field. The two kinds of information are then fused, and finally the moving objects are labeled with their minimum bounding rectangles. The experimental results show that the proposed method is effective.
2. PROPOSED MOVING OBJECT DETECTION METHOD

A. The outline of the method

The process of the KOF method is shown in Fig. 1. The proposed method mainly consists of edge detection, optical flow, data fusion and morphological operations. Considering the requirements of simplicity and effectiveness, the Kirsch operator is used for edge detection. For the optical flow, the Lucas-Kanade method is adopted, which can quickly provide a dense optical flow field for the moving objects. The binarization step adopts the Otsu algorithm, which selects the threshold used to distinguish the background from the moving objects self-adaptively. However, because of noise, the optical flow method cannot detect the accurate boundaries of the moving objects; the edge detection algorithm mentioned above solves this problem. Moreover, the edge image obtained by the Kirsch operator can be regarded as a space gradient, while the optical flow image is a time gradient. Combining the space gradient information with the time gradient information gives more accurate information about the moving objects, so in the data fusion the AND operator is applied between the edge binary image and the optical flow binary image. In order to obtain a more exact contour of the moving objects, morphological operations such as closing and hole filling are applied. Finally, the moving objects are extracted from the image.
B. The edge detection method

Kirsch operator

The Kirsch operator is a non-linear edge detector that finds the maximum edge strength in a few predetermined directions.

Mathematical description

The operator is calculated for eight directions, each differing by 45°, as

h(n, m) = max_{k=0,...,7} sum_{i=-1..1} sum_{j=-1..1} g_k(i, j) · f(n+i, m+j)

where g_k denotes the k-th direction kernel. The first two kernels are

g_0 = [ 5 5 5; -3 0 -3; -3 -3 -3 ],   g_1 = [ 5 5 -3; 5 0 -3; -3 -3 -3 ],

and the remaining kernels are obtained by further 45° rotations of the mask.
The edge image can be regarded as the space gradient. There are several gradient operators, such as the Sobel, Roberts and Kirsch operators. As the Kirsch operator can adjust the threshold automatically according to the characteristics of the image, the Kirsch gradient operator is chosen to extract the contours of the objects. The Kirsch operator has eight window templates, and each template makes its greatest response to a particular direction. The eight template operators are shown in Fig. 2. Except for the outermost rows and columns, every pixel and its 3×3 eight-neighborhood in the image are convolved with these eight templates respectively, so every pixel has eight outputs; the maximum of the eight template outputs is chosen as the value at that position. The gray value of a point and its eight neighbors in the image are illustrated in Fig. 3.
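The template-and-maximum rule just described can be sketched in a few lines. The code below is an illustrative pure-Python sketch, not the report's own implementation: it builds the eight Kirsch kernels as 45° rotations of the base mask and returns the maximum template response at one pixel, matching the S(i, j) = max{q_k} rule.

```python
# Illustrative sketch: Kirsch edge response at a single pixel.
# Each of the 8 kernels is a 45-degree rotation of the base mask;
# the edge intensity is the maximum of the 8 template responses.

def kirsch_kernels():
    # Border positions of a 3x3 window, in circular order.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    base = [5, 5, 5, -3, -3, -3, -3, -3]
    kernels = []
    for r in range(8):
        k = [[0] * 3 for _ in range(3)]       # centre coefficient stays 0
        for idx, (i, j) in enumerate(ring):
            k[i][j] = base[(idx - r) % 8]
        kernels.append(k)
    return kernels

def kirsch_response(patch):
    """patch: 3x3 list of grey values; returns the maximum response q_k."""
    return max(
        sum(k[i][j] * patch[i][j] for i in range(3) for j in range(3))
        for k in kirsch_kernels()
    )

# A vertical step edge (dark left, bright right) gives a strong response,
# while a uniform patch gives zero response:
patch = [[0, 0, 255],
         [0, 0, 255],
         [0, 0, 255]]
print(kirsch_response(patch))   # 3825
```

Note that the responses sum to zero on constant regions (the 5s and -3s cancel), so only intensity transitions produce large values of S(i, j).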
Assume q_k (k = 0, 1, ..., 7) is the output produced by the k-th template of the Kirsch operator. q_k can be obtained from the equation

q_k = M_k * P (k = 0, 1, ..., 7)   (1)

where M_k is the k-th template operator among the eight Kirsch templates and P contains the gray values of a pixel and its 3×3 eight-neighborhood. The edge intensity S(i, j) of the pixel P(i, j) is defined as

S(i, j) = max{ q_k } (k = 0, 1, ..., 7).

Every pixel undergoes the operation above, so the edge intensity image S is obtained. If the gray value difference between the object and the background is small and the detected edge features are not obvious, the follow-up processing cannot continue, so a binarization step is necessary: when the value of the edge intensity image is above a threshold, the pixel is classified as part of the object's edge. After this operation, the edge binary image is acquired. Several video sequences, both outdoor and indoor, were tested; the results are shown in Fig. 4. The first column is the outdoor scene and the second column is the indoor scene. It is clear that the Sobel and Roberts operators lose part of the contours of the objects, while the Kirsch operator detects the boundaries of the objects clearly.

Otsu's method

In computer vision and image processing, Otsu's method is used to automatically perform histogram-shape-based image thresholding, i.e., the reduction of a gray-level image to a binary image. The algorithm assumes that the image to be thresholded contains two classes of pixels (e.g., foreground and background) and then calculates the optimum threshold separating those two classes so that their combined spread (intra-class variance) is minimal. The extension of the original method to multi-level thresholding is referred to as the multi-Otsu method.

Method

In Otsu's method we exhaustively search for the threshold that minimizes the intra-class variance, defined as a weighted sum of the variances of the two classes:
σ_w²(t) = ω_0(t) σ_0²(t) + ω_1(t) σ_1²(t)

Weights ω_0 and ω_1 are the probabilities of the two classes separated by a threshold t, and σ_0² and σ_1² are the variances of these classes. Otsu shows that minimizing the intra-class variance is equivalent to maximizing the inter-class variance [2]:

σ_b²(t) = σ² − σ_w²(t) = ω_0(t) ω_1(t) [μ_0(t) − μ_1(t)]²

which is expressed in terms of the class probabilities ω_i and class means μ_i, which in turn can be updated iteratively. This idea yields an effective algorithm.

ALGORITHM
1. Compute the histogram and the probability of each intensity level.
2. Set up initial ω_i(0) and μ_i(0).
3. Step through all possible thresholds t = 1, ..., maximum intensity:
   1. Update ω_i and μ_i.
   2. Compute σ_b²(t).
4. The desired threshold corresponds to the maximum of σ_b²(t).
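The algorithm above can be written compactly as a single sweep over the histogram, accumulating ω_0 and the class-0 sum incrementally. The following is a minimal sketch (the toy histogram is an assumption for illustration, not data from the report):

```python
# Illustrative sketch of Otsu's method: sweep all thresholds t and pick the
# one maximising the between-class variance omega0 * omega1 * (mu0 - mu1)^2.

def otsu_threshold(hist):
    """hist: list of pixel counts per grey level. Pixels <= t form class 0."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0.0                      # running weight (count) of class 0
    sum0 = 0.0                    # running intensity sum of class 0
    best_t, best_var = 0, -1.0
    for t in range(len(hist) - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:    # one class empty: no valid split here
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal toy histogram over 8 grey levels: dark background, bright object.
hist = [10, 30, 10, 0, 0, 8, 25, 7]
print(otsu_threshold(hist))   # 2 -- the valley between the two modes
```

Because the update is incremental, the whole search is O(L) in the number of grey levels after the O(N) histogram pass, which is why the method is practical as a per-frame binarization step.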
C. The Optical Flow method

Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene. Optical flow techniques, such as motion detection, object segmentation, time-to-collision and focus-of-expansion calculations, motion-compensated encoding, and stereo disparity measurement, utilize this motion of the objects' surfaces and edges.

Estimation of the optical flow

Sequences of ordered images allow the estimation of motion as either instantaneous image velocities or discrete image displacements [5]. Fleet and Weiss provide a tutorial introduction to gradient-based optical flow [6]. John L. Barron, David J. Fleet, and Steven Beauchemin provide a performance analysis of a number of optical flow techniques, emphasizing the accuracy and density of the measurements [7]. Optical flow methods try to calculate the motion between two image frames taken at times t and t + Δt at every voxel position. These methods are called differential because they are based on local Taylor series approximations of the image signal; that is, they use partial derivatives with respect to the spatial and temporal coordinates. For a 2D+t dimensional case (3D or n-D cases are similar), a voxel at location (x, y, t) with intensity I(x, y, t) will have moved by Δx, Δy and Δt between the two image frames, and the following image constraint equation can be given:
I(x, y, t) = I(x + Δx, y + Δy, t + Δt).

Assuming the movement to be small, the image constraint at I(x, y, t) can be developed with a Taylor series to give

I(x + Δx, y + Δy, t + Δt) = I(x, y, t) + (∂I/∂x) Δx + (∂I/∂y) Δy + (∂I/∂t) Δt + H.O.T.

From these equations it follows that

(∂I/∂x) Δx + (∂I/∂y) Δy + (∂I/∂t) Δt = 0

or, dividing through by Δt,

(∂I/∂x) Vx + (∂I/∂y) Vy + ∂I/∂t = 0,

where Vx and Vy are the x and y components of the velocity or optical flow of I(x, y, t), and ∂I/∂x, ∂I/∂y and ∂I/∂t are the derivatives of the image at (x, y, t) in the corresponding directions. Writing Ix, Iy and It for these derivatives, this becomes

Ix Vx + Iy Vy = −It.
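The constraint Ix Vx + Iy Vy + It = 0 can be verified numerically on a synthetic pattern. In this sketch (an assumed test pattern, not from the report), a linear brightness ramp translates one pixel per frame in x, so central differences recover the gradients exactly and the residual is zero:

```python
# Numerical check of the brightness constancy constraint
# Ix*Vx + Iy*Vy + It = 0 on an exactly-translating linear ramp.

def I(x, y, t):
    # Brightness pattern moving with velocity Vx = 1, Vy = 0.
    return 3.0 * (x - t) + 2.0 * y

x, y, t = 5, 5, 0
Ix = (I(x + 1, y, t) - I(x - 1, y, t)) / 2.0   # central difference in x
Iy = (I(x, y + 1, t) - I(x, y - 1, t)) / 2.0   # central difference in y
It = (I(x, y, t + 1) - I(x, y, t - 1)) / 2.0   # central difference in t
Vx, Vy = 1.0, 0.0
print(Ix * Vx + Iy * Vy + It)   # 0.0 for this pattern
```

On real images the residual is only approximately zero (the Taylor expansion drops higher-order terms, and the images are noisy), which is one reason the flow field is noisy in practice.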
This is one equation in two unknowns and cannot be solved as such. This is known as the aperture problem of optical flow algorithms. To find the optical flow, another set of equations is needed, given by some additional constraint. All optical flow methods introduce additional conditions for estimating the actual flow.

Methods for determining optical flow
• Phase correlation – inverse of the normalized cross-power spectrum.
• Block-based methods – minimizing the sum of squared differences or the sum of absolute differences, or maximizing the normalized cross-correlation.
• Differential methods of estimating optical flow, based on partial derivatives of the image signal and/or the sought flow field and higher-order partial derivatives, such as:
  o Lucas–Kanade method – regarding image patches and an affine model for the flow field.
  o Horn–Schunck method – optimizing a functional based on residuals from the brightness constancy constraint and a particular regularization term expressing the expected smoothness of the flow field.
  o Buxton–Buxton method – based on a model of the motion of edges in image sequences [8].
  o Black–Jepson method – coarse optical flow via correlation.
  o General variational methods – a range of modifications/extensions of Horn–Schunck, using other data terms and other smoothness terms.
• Discrete optimization methods – the search space is quantized, and image matching is addressed through label assignment at every pixel, such that the corresponding deformation minimizes the distance between the source and the target image [9]. The optimal solution is often recovered through min-cut/max-flow algorithms, linear programming or belief propagation methods.
Uses of optical flow

Motion estimation and video compression have developed as a major aspect of optical flow research. While the optical flow field is superficially similar to a dense motion field derived from the techniques of motion estimation, optical flow is the study not only of the determination of the optical flow field itself, but also of its use in estimating the three-dimensional nature and structure of the scene, as well as the 3D motion of objects and the observer relative to the scene. Optical flow has been used by robotics researchers in many areas, such as object detection and tracking, image dominant plane extraction, movement detection, robot navigation and visual odometry. Applications of optical flow include the problem of inferring not only the motion of the observer and objects in the scene, but also the structure of objects and the environment. Since awareness of motion and the generation of mental maps of the structure of our environment are critical components of animal (and human) vision, the conversion of this innate ability to a computer capability is similarly crucial in the field of machine vision. Consider a five-frame clip of a ball moving from the bottom left of the field of vision to the top right. Motion estimation techniques can determine that on a two-dimensional plane the ball is moving up and to the right, and vectors describing this motion can be extracted from the sequence of frames. For the purposes of video compression (e.g., MPEG), the sequence is now described as well as it needs to be. However, in the field of machine vision, the question of whether the ball is moving to the right or the observer is moving to the left is unknowable yet critical information. Even if a static, patterned background were present in the five frames, we could not confidently state that the ball was moving to the right, because the pattern might be at an infinite distance from the observer.
There are several methods for computing the optical flow, such as differential, matching, energy-based, and phase-based methods. In this paper, the Lucas-Kanade method is used. The optical flow constraint equation is as in (2):

∇I(x, t) · V + I_t(x, t) = 0   (2)

where V = (u, v)^T, u is the horizontal component of the optical flow and v is the vertical component. From a Taylor expansion of (2), or more generally from the assumption that intensity is conserved, dI(x, t)/dt = 0, the gradient constraint equation is derived:

I_x u + I_y v + I_t = 0   (3)

where I_t(x, t) denotes the partial time derivative of I(x, t). Lucas and Kanade assume that the motion vector remains constant in a small spatial neighborhood, and they use weighted least squares to estimate the optical flow. So in the small spatial neighborhood Ω, the error of the optical flow is defined as

E(V) = Σ_{x∈Ω} W²(x) [∇I(x, t) · V + I_t(x, t)]²   (4)

where W(x) denotes a window function that gives more influence to constraints at the center of the neighborhood than to those at the periphery. The solution to (4) is given by

A^T W² A V = A^T W² b   (5)

where, for n points x_i ∈ Ω at a single time t,

A = [∇I(x_1), ..., ∇I(x_n)]^T,  W = diag[W(x_1), ..., W(x_n)],  b = −(I_t(x_1), ..., I_t(x_n))^T.

The solution to (5) is V = (A^T W² A)^{-1} A^T W² b. A single component of the optical flow cannot reflect the motion information of the objects, so the two components must be combined. By experiment, the optical flow image, namely the time gradient image, is defined in this paper as the flow magnitude sqrt(u² + v²). After the optical flow method, the Otsu algorithm is adopted in the binarization step. The Otsu algorithm selects the threshold used to distinguish the moving objects from the background adaptively; it is a classic non-parametric, unsupervised adaptive threshold selection method.
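For a single neighborhood, solving (5) only requires inverting a 2×2 matrix. The sketch below is illustrative, not the report's code: it forms the normal equations A^T W² A V = A^T W² b from per-pixel gradients and solves them in closed form. The gradients come from an assumed synthetic quadratic pattern translating with velocity (1, 2), for which the constraint holds exactly.

```python
# Minimal Lucas-Kanade sketch: weighted least-squares flow (u, v) over a
# small neighbourhood, solving the 2x2 normal equations in closed form.

def lucas_kanade(points, grads, w=None):
    """grads: dict (x, y) -> (Ix, Iy, It). Solves A^T W^2 A V = A^T W^2 b."""
    if w is None:
        w = {p: 1.0 for p in points}          # uniform window function
    a11 = sum(w[p] * grads[p][0] ** 2 for p in points)
    a12 = sum(w[p] * grads[p][0] * grads[p][1] for p in points)
    a22 = sum(w[p] * grads[p][1] ** 2 for p in points)
    b1 = -sum(w[p] * grads[p][0] * grads[p][2] for p in points)
    b2 = -sum(w[p] * grads[p][1] * grads[p][2] for p in points)
    det = a11 * a22 - a12 * a12               # ~0 signals the aperture problem
    u = (a22 * b1 - a12 * b2) / det
    v = (a11 * b2 - a12 * b1) / det
    return u, v

# 3x3 neighbourhood of a quadratic pattern moving with true flow (1, 2):
# Ix = 2x, Iy = 2y, and It = -(Ix*1 + Iy*2) so the constraint is exact.
pts = [(x, y) for x in (4, 5, 6) for y in (4, 5, 6)]
grads = {(x, y): (2.0 * x, 2.0 * y, -2.0 * x - 4.0 * y) for (x, y) in pts}
u, v = lucas_kanade(pts, grads)
print(round(u, 6), round(v, 6))   # recovers the true flow (1, 2)
```

When the gradient directions in the window are all parallel (e.g., along a straight edge), det is zero and the system is singular; this is exactly the aperture problem noted earlier, and it is why textured neighborhoods give the most reliable flow vectors.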
D. Data fusion and the morphological operation

After the above operations, the space gradient binary image and the time gradient binary image are acquired. In order to obtain the exact contour, the AND operator is applied between the two binary images in the data fusion. The process can be expressed by the following equation:

D_bw(i, j) = S_bw(i, j) AND T_bw(i, j)

where D_bw denotes the result of the data fusion, S_bw denotes the space gradient binary image, T_bw denotes the time gradient binary image, and (i, j) denotes the coordinates of a pixel in the image. Morphological operators such as closing and hole filling are then used to eliminate discontinuities in the objects. Finally, the area of each connected region is calculated, and regions whose area is below a threshold are discarded. The remaining regions are considered to be the moving objects.
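The fusion and area-filtering steps above can be sketched as follows. This is an illustrative sketch, not the report's implementation: the toy grids, the 4-connectivity choice, and the flood-fill labelling are assumptions made for the example (the report does not specify the connectivity it uses, and its morphological closing/hole-filling steps are omitted here).

```python
# Illustrative sketch of the fusion step: pixel-wise AND of the space-
# gradient (edge) and time-gradient (flow) binary images, then discarding
# connected regions whose area falls below a threshold.

def fuse_and_filter(S_bw, T_bw, min_area):
    rows, cols = len(S_bw), len(S_bw[0])
    # D_bw(i, j) = S_bw(i, j) AND T_bw(i, j)
    D = [[S_bw[i][j] & T_bw[i][j] for j in range(cols)] for i in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if D[i][j] and not seen[i][j]:
                stack, region = [(i, j)], []
                seen[i][j] = True
                while stack:                      # flood-fill one 4-connected region
                    r, c = stack.pop()
                    region.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols and D[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            stack.append((nr, nc))
                if len(region) < min_area:        # small regions are treated as noise
                    for r, c in region:
                        D[r][c] = 0
    return D

S = [[1, 1, 0, 1],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
T = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 1]]
print(fuse_and_filter(S, T, min_area=2))
```

In the toy example, the 2×2 block survives both the AND and the area test, while the isolated pixel at the bottom right (present in both masks but of area 1) is discarded as noise.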
DIGITAL IMAGE PROCESSING
BACKGROUND:

Digital image processing is an area characterized by the need for extensive experimental work to establish the viability of proposed solutions to a given problem. An important characteristic underlying the design of image processing systems is the significant level of testing and experimentation that normally is required before arriving at an acceptable solution. This characteristic implies that the ability to formulate approaches and quickly prototype candidate solutions generally plays a major role in reducing the cost and time required to arrive at a viable system implementation.

What is DIP?

An image may be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y and the amplitude values of f are all finite discrete quantities, we call the image a digital image. The field of DIP refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location and value; these elements are called pixels. Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma rays to radio waves. They can also operate on images generated by sources that humans are not accustomed to associating with images.
There is no general agreement among authors regarding where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input and output of a process are images. This is a limiting and somewhat artificial boundary. The area of image analysis (image understanding) lies between image processing and computer vision.
There are no clear-cut boundaries in the continuum from image processing at one end to complete vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, and high-level processes. Low-level processes involve primitive operations such as image preprocessing to reduce noise, contrast enhancement and image sharpening; a low-level process is characterized by the fact that both its inputs and outputs are images. Mid-level processes on images involve tasks such as segmentation, description of objects to reduce them to a form suitable for computer processing, and classification of individual objects; a mid-level process is characterized by the fact that its inputs generally are images but its outputs are attributes extracted from those images. Finally, high-level processing involves "making sense" of an ensemble of recognized objects, as in image analysis, and, at the far end of the continuum, performing the cognitive functions normally associated with human vision.
Digital image processing, as already defined, is used successfully in a broad range of areas of exceptional social and economic value.
What is an image?
An image is represented as a two-dimensional function f(x, y), where x and y are spatial coordinates and the amplitude of f at any pair of coordinates (x, y) is called the intensity of the image at that point.
Gray scale image:
A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane, where I(x, y) is the intensity of the image at the point (x, y). I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] × [0, b], we have I: [0, a] × [0, b] → [0, ∞).

Color image:

A color image can be represented by three functions: R(x, y) for red, G(x, y) for green and B(x, y) for blue. An image may be continuous with respect to the x and y coordinates and also in amplitude. Converting such an image to digital form requires that the coordinates as well as the amplitude be digitized. Digitizing the coordinate values is called sampling; digitizing the amplitude values is called quantization.
Coordinate convention:

The result of sampling and quantization is a matrix of real numbers. We use two principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns; we say that the image is of size M x N. The values of the coordinates (x, y) are discrete quantities. For notational clarity and convenience, we use integer values for these discrete coordinates. In many image processing books, the image origin is defined to be at (x, y) = (0, 0), and the next coordinate values along the first row of the image are (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row; it does not mean that these are the actual values of the physical coordinates when the image was sampled. The following figure shows the coordinate convention. Note that x ranges from 0 to M-1 and y from 0 to N-1 in integer increments.

The coordinate convention used in the toolbox to denote arrays differs from the preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the same as in the previous paragraph, in the sense that the first element of a coordinate tuple, (r, c), refers to a row and the second to a column. The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to N in integer increments. The IPT documentation refers to these as pixel coordinates. Less frequently, the toolbox also employs another coordinate convention, called spatial coordinates, which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the variables x and y.

Image as matrices:

The preceding discussion leads to the following representation for a digitized image function:

          f(0,0)     f(0,1)     ...  f(0,N-1)
f(x,y) =  f(1,0)     f(1,1)     ...  f(1,N-1)
          ...        ...        ...  ...
          f(M-1,0)   f(M-1,1)   ...  f(M-1,N-1)

The right side of this equation is a digital image by definition. Each element of this array is called an image element, picture element, pixel or pel. The terms image and pixel are used throughout the rest of our discussion to denote a digital image and its elements. A digital image can be represented naturally as a MATLAB matrix:

     f(1,1)  f(1,2)  ...  f(1,N)
f =  f(2,1)  f(2,2)  ...  f(2,N)
     ...     ...     ...  ...
     f(M,1)  f(M,2)  ...  f(M,N)

where f(1,1) = f(0,0) (note the use of a monospace font to denote MATLAB quantities). Clearly the two representations are identical, except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q; for example, f(6, 2) is the element in the sixth row and second column of the matrix f. Typically we use the letters M and N to denote the number of rows and columns, respectively, in a matrix. A 1xN matrix is called a row vector, an Mx1 matrix is called a column vector, and a 1x1 matrix is a scalar. Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array and so on. Variables must begin with a letter and contain only letters, numerals and underscores. As noted in the previous paragraph, all MATLAB quantities are written using monospace characters; we use conventional Roman italic notation, such as f(x, y), for mathematical expressions.

Reading images:

Images are read into the MATLAB environment using function imread, whose syntax is
imread('filename')

Format name   Description                        Recognized extensions
TIFF          Tagged Image File Format           .tif, .tiff
JPEG          Joint Photographic Experts Group   .jpg, .jpeg
GIF           Graphics Interchange Format        .gif
BMP           Windows Bitmap                     .bmp
PNG           Portable Network Graphics          .png
XWD           X Window Dump                      .xwd
Here filename is a string containing the complete name of the image file (including any applicable extension). For example, the command line

>> f = imread('8.jpg');

reads the JPEG image (see the table above) into image array f. Note the use of single quotes (') to delimit the string filename. The semicolon at the end of a command line is used by MATLAB to suppress output; if a semicolon is not included, MATLAB displays the results of the operation(s) specified in that line. The prompt symbol (>>) designates the beginning of a command line as it appears in the MATLAB command window. When, as in the preceding command line, no path is included in filename, imread reads the file from the current directory and, if that fails, tries to find the file in the MATLAB search path. The simplest way to read an image from a specified directory is to include a full or relative path to that directory in filename. For example,

>> f = imread('D:\myimages\chestxray.jpg');

reads the image from a folder called myimages on the D: drive, whereas

>> f = imread('.\myimages\chestxray.jpg');

reads the image from the myimages subdirectory of the current working directory. The Current Directory window on the MATLAB desktop toolbar displays MATLAB's current working directory and provides a simple, manual way to change it. The table above lists some of the most popular image/graphics formats supported by imread and imwrite. Function size gives the row and column dimensions of an image:

>> size(f)
ans = 1024 1024
This function is particularly useful in programming, when used in the following form to determine the size of an image automatically:

>> [M, N] = size(f);

This syntax returns the number of rows (M) and columns (N) in the image. The whos function displays additional information about an array. For instance, the statement

>> whos f

gives

Name   Size        Bytes     Class
f      1024x1024   1048576   uint8 array

Grand total is 1048576 elements using 1048576 bytes

The uint8 entry refers to one of several MATLAB data classes. A semicolon at the end of a whos line has no effect, so normally one is not used.

Displaying images:

Images are displayed on the MATLAB desktop using function imshow, which has the basic syntax

imshow(f, g)

where f is an image array and g is the number of intensity levels used to display it. If g is omitted, it defaults to 256 levels. Using the syntax

imshow(f, [low high])

displays as black all values less than or equal to low and as white all values greater than or equal to high; the values in between are displayed as intermediate intensity values using the default number of levels. Finally, the syntax

imshow(f, [ ])

sets variable low to the minimum value of array f and high to its maximum value. This form of imshow is useful for displaying images that have a low dynamic range or that have positive and negative values. Function pixval is used frequently to display the intensity values of individual pixels interactively. This function displays a cursor overlaid on an image; as the cursor is moved over the image with the mouse, the coordinates of the cursor position and the corresponding intensity values are shown on a display that appears below the figure window. When working with color images, the coordinates as well as the red, green and blue components are displayed. If the left button on the mouse is clicked and then held
pressed, pixval displays the Euclidean distance between the initial and current cursor locations. The syntax form of interest here is pixval, which shows the cursor on the last image displayed; clicking the X button on the cursor window turns it off. The following statements read from disk an image called rose_512.tif, extract basic information about the image, and display it using imshow:

>> f = imread('rose_512.tif');
>> whos f

Name   Size      Bytes    Class
f      512x512   262144   uint8 array

Grand total is 262144 elements using 262144 bytes

>> imshow(f)

A semicolon at the end of an imshow line has no effect, so normally one is not used. If another image, g, is displayed using imshow, MATLAB replaces the image on the screen with the new image. To keep the first image and output a second image, we use function figure, as follows:

>> figure, imshow(g)

Using the statement

>> imshow(f), figure, imshow(g)

displays both images.
Note that more than one command can be written on a line, as long as the different commands are properly delimited by commas or semicolons. As mentioned earlier, a semicolon is used whenever it is desired to suppress screen output from a command line. Suppose that we have just read an image h and find that using imshow produces a dim image. It is clear that this image has a low dynamic range, which can be remedied for display purposes by using the statement

>> imshow(h, [ ])

WRITING IMAGES:

Images are written to disk using function imwrite, which has the following basic syntax:

imwrite(f, 'filename')

With this syntax, the string contained in filename must include a recognized file format extension. Alternatively, the desired format can be specified explicitly with a third input argument. For example, the following command writes f to a TIFF file named patient10_run1:

>> imwrite(f, 'patient10_run1', 'tif')

or, alternatively,

>> imwrite(f, 'patient10_run1.tif')

If filename contains no path information, then imwrite saves the file in the current working directory. The imwrite function can have other parameters, depending on the file format selected. Most of the work in the following deals with either JPEG or TIFF images, so we focus attention here on these two formats.
A more general imwrite syntax, applicable only to JPEG images, is

imwrite(f, 'filename.jpg', 'quality', q)

where q is an integer between 0 and 100 (the lower the number, the higher the degradation due to JPEG compression). For example, for q = 25 the applicable syntax is

>> imwrite(f, 'bubbles25.jpg', 'quality', 25)

The image for q = 15 has false contouring that is barely visible, but this effect becomes quite pronounced for q = 5 and q = 0. Thus, an acceptable solution with some margin for error is to compress the images with q = 25. In order to get an idea of the compression achieved and to obtain other image file details, we can use function imfinfo, which has the syntax

imfinfo filename

Here filename is the complete file name of the image stored on disk. For example,

>> imfinfo bubbles25.jpg

outputs the following information (note that some fields contain no information in this case):

Filename: 'bubbles25.jpg'
FileModDate: '04-Jan-2003 12:31:26'
FileSize: 13849
Format: 'jpg'
FormatVersion: ''
Width: 714
Height: 682
BitDepth: 8
ColorType: 'grayscale'
FormatSignature: ''
Comment: {}
where FileSize is in bytes. The number of bytes in the original image is computed simply by multiplying Width by Height by BitDepth and dividing the result by 8; the result is 486948. Dividing this by the file size gives the compression ratio: 486948/13849 = 35.16. This compression ratio was achieved while maintaining image quality consistent with the requirements of the application. In addition to the obvious advantages in storage space, this reduction allows the transmission of approximately 35 times the amount of uncompressed data per unit time. The information fields displayed by imfinfo can be captured into a so-called structure variable that can be used for subsequent computations. Using the preceding image as an example and assigning the name K to the structure variable, we use the syntax

>> K = imfinfo('bubbles25.jpg');
to store into variable K all the information generated by the command imfinfo. The information generated by imfinfo is accessed through the structure variable by means of fields, separated from K by a dot. For example, the image height and width are now stored in the structure fields K.Height and K.Width.
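The arithmetic above can be checked quickly outside MATLAB. The following is a small Python check (used here only for verification; the values are the imfinfo figures quoted in the text):

```python
# Check of the compression-ratio arithmetic from the imfinfo output above:
# uncompressed size = Width * Height * BitDepth / 8, and the compression
# ratio divides that by the file size on disk.
width, height, bit_depth = 714, 682, 8
file_size = 13849                         # bytes, from the FileSize field
uncompressed = width * height * bit_depth // 8
ratio = uncompressed / file_size
print(uncompressed, round(ratio, 2))      # 486948 35.16
```

This confirms both figures given in the text: 486948 bytes uncompressed and a compression ratio of about 35.16.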
As an illustration, consider the following use of structure variable K to compute the compression ratio for bubbles25.jpg:

>> K = imfinfo('bubbles25.jpg');
>> image_bytes = K.Width * K.Height * K.BitDepth / 8;
>> compressed_bytes = K.FileSize;
>> compression_ratio = image_bytes / compressed_bytes

compression_ratio = 35.162

Note that imfinfo was used in two different ways. The first was to type imfinfo bubbles25.jpg at the prompt, which resulted in the information being displayed on the screen. The second was to type K = imfinfo('bubbles25.jpg'), which resulted in the information generated by imfinfo being stored in K. These two different ways of calling imfinfo are an example of command-function duality, an important concept that is explained in more detail in the MATLAB online documentation.

A more general imwrite syntax, applicable only to TIFF images, has the form

imwrite(g, 'filename.tif', 'compression', 'parameter', 'resolution', [colres rowres])

where 'parameter' can have one of the following principal values: 'none' indicates no compression; 'packbits' indicates packbits compression (the default for nonbinary images); and 'ccitt' indicates CCITT compression (the default for binary images). The 1x2 array [colres rowres] contains two integers that give the column resolution and row resolution in dots per unit. For example, if the image dimensions are in inches, colres is the number of dots (pixels) per inch (dpi) in the vertical direction, and similarly for rowres in the horizontal direction. Specifying the resolution by a single scalar, res, is equivalent to writing [res res].

>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', [300 300])
The values of the vector [colres rowres] were determined by multiplying 200 dpi by the ratio 2.25/1.5, which gives 300 dpi. Rather than do the computation manually, we could write
>> res = round(200*2.25/1.5);
>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', res)
where function round rounds its argument to the nearest integer. It is important to note that the number of pixels was not changed by these commands; only the scale of the image changed. The original 450×450 image at 200 dpi is of size 2.25×2.25 inches. The new 300-dpi image is identical, except that its 450×450 pixels are distributed over a 1.5×1.5-inch area. Processes such as this are useful for controlling the size of an image in a printed document without sacrificing resolution. Often it is necessary to export images to disk the way they appear on the MATLAB desktop. This is especially true with plots. The contents of a figure window can be exported to disk in two ways. The first is to use the File pull-down menu in the figure window and then choose Export. With this option the user can select a location, filename, and format. More control over export parameters is obtained by using the print command:
print -fno -dfileformat -rresno filename
where no refers to the figure number of the figure window of interest, fileformat refers to one of the file formats in the table above, resno is the resolution in dpi, and filename is the name we wish to assign the file. If we simply type print at the prompt, MATLAB prints (to the default printer) the contents of the last figure window displayed. It is also possible to specify other options with print, such as a specific printing device. Data Classes:
Although we work with integer coordinates, the values of the pixels themselves are not restricted to be integers in MATLAB. The table above lists the various data classes supported by MATLAB and IPT for representing pixel values. The first eight entries in the table are referred to as numeric data classes. The ninth entry is the char class and, as shown, the last entry is referred to as the logical data class. All numeric computations in MATLAB are done using double quantities, so this is also a frequent data class encountered in image processing applications. Class uint8 is also encountered frequently, especially when reading data from storage devices, as 8-bit images are the most common representations found in practice. These two data classes, class logical, and, to a lesser degree, class uint16 constitute the primary data classes on which we focus. Many IPT functions, however, support all the data classes listed in the table. Data class double requires 8 bytes to represent a number; uint8 and int8 require 1 byte each; uint16 and int16 require 2 bytes each; and uint32, int32 and single require 4 bytes each.
Name      Description
double    Double-precision, floating-point numbers in the approximate range ±10^308 (8 bytes per element).
uint8     Unsigned 8-bit integers in the range [0, 255] (1 byte per element).
uint16    Unsigned 16-bit integers in the range [0, 65535] (2 bytes per element).
uint32    Unsigned 32-bit integers in the range [0, 4294967295] (4 bytes per element).
int8      Signed 8-bit integers in the range [-128, 127] (1 byte per element).
int16     Signed 16-bit integers in the range [-32768, 32767] (2 bytes per element).
int32     Signed 32-bit integers in the range [-2147483648, 2147483647] (4 bytes per element).
single    Single-precision, floating-point numbers in the approximate range ±10^38 (4 bytes per element).
char      Characters (2 bytes per element).
logical   Values are 0 or 1 (1 byte per element).
The char data class holds characters in Unicode representation. A character string is merely a 1×n array of characters. A logical array contains only the values 0 and 1, with each element stored in memory using one byte. Logical arrays are created using the function logical or by using relational operators.
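The per-element sizes in the table can be sanity-checked with NumPy dtypes as stand-ins for the MATLAB classes (an illustrative sketch; MATLAB's 2-byte char has no direct NumPy analogue and is omitted):

```python
# Per-element storage sizes of the numeric classes, mirrored via NumPy dtypes.
import numpy as np

sizes = {
    "double":  np.dtype(np.float64).itemsize,  # 8 bytes
    "single":  np.dtype(np.float32).itemsize,  # 4 bytes
    "uint8":   np.dtype(np.uint8).itemsize,    # 1 byte
    "uint16":  np.dtype(np.uint16).itemsize,   # 2 bytes
    "uint32":  np.dtype(np.uint32).itemsize,   # 4 bytes
    "int8":    np.dtype(np.int8).itemsize,     # 1 byte
    "int16":   np.dtype(np.int16).itemsize,    # 2 bytes
    "int32":   np.dtype(np.int32).itemsize,    # 4 bytes
    "logical": np.dtype(np.bool_).itemsize,    # 1 byte, like MATLAB logical
}
print(sizes)
```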
Image Types: The toolbox supports four types of images: 1. Intensity images 2. Binary images 3. Indexed images 4. RGB images
Most monochrome image processing operations are carried out using binary or intensity images, so our initial focus is on these two image types; indexed and RGB colour images are discussed afterwards.
Intensity Images: An intensity image is a data matrix whose values have been scaled to represent intensities. When the elements of an intensity image are of class uint8 or class uint16, they have integer values in the range [0, 255] and [0, 65535], respectively. If the image is of class double, the values are floating-point numbers. Values of scaled, double intensity images are in the range [0, 1] by convention. Binary Images: Binary images have a very specific meaning in MATLAB. A binary image is a logical array of 0s and 1s. Thus, an array of 0s and 1s whose values are of data class, say, uint8 is not considered a binary image in MATLAB. A numeric array is converted to binary using the function logical. Thus, if A is a numeric array consisting of 0s and 1s, we create a logical array B using the statement
B = logical(A)
If A contains elements other than 0s and 1s, use of the logical function converts all nonzero quantities to logical 1s and all entries with value 0 to logical 0s. Using relational and logical operators also creates logical arrays. To test whether an array is logical, we use the islogical function:
islogical(C)
If C is a logical array, this function returns a 1; otherwise it returns a 0. Logical arrays can be converted to numeric arrays using the data class conversion functions. Indexed Images: An indexed image has two components: a data matrix of integers, X, and a colormap matrix, map. Matrix map is an m×3 array of class double containing floating-point values in the range [0, 1]. The length m of the map is equal to the number of colors it defines. Each row of map specifies the red, green and blue components of a single color. An indexed image uses "direct mapping" of pixel intensity values to colormap values. The color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with values less than or equal to 1 point to the first row in map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second row, and so on. RGB Images: An RGB color image is an M×N×3 array of color pixels, where each color pixel is a triplet corresponding to the red, green and blue components of an RGB image at a specific spatial location. An RGB image may be viewed as a "stack" of three gray-scale images that, when fed into the red, green and blue inputs of a color monitor, produce a color image on the screen. By convention, the three images forming an RGB color image are referred to as the red, green and blue component images. The data class of the component images determines their range of values. If an RGB image is of class double, the range of values is [0, 1].
Similarly, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16, respectively. The number of bits used to represent the pixel values of the component images determines the bit depth of an RGB image. For example, if each component image is an 8-bit image, the corresponding RGB image is said to be 24 bits deep. Generally, the number of bits in all component images is the same. In this case, the number of possible colors in an RGB image is (2^b)^3, where b is the number of bits in each component image. For the 8-bit case, the number is 16,777,216 colors.
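The colour-count arithmetic above is a one-liner to verify:

```python
# With b bits per component image, an RGB image can represent (2^b)^3 colours.
b = 8
num_colors = (2 ** b) ** 3
print(num_colors)  # 16777216 for the 8-bit case
```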
CONCLUSIONS In this paper, a novel method which combines the Kirsch operator with optical flow is proposed for moving object detection. The edge image can be considered as the space gradient, while the optical flow image is the time gradient. The KOF method contains both the space gradient information and the time gradient information. The Otsu algorithm and morphologic operations are also used as supporting techniques. Compared with the three traditional moving object detection methods, the KOF method not only gives the exact boundary of the moving objects, but also has better anti-noise performance. Although the method is somewhat time-consuming, the fast development of computer hardware can mitigate this problem. The experiment results prove that the method is effective for moving object detection.
REFERENCES:
1. C. Stauffer and W. E. L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 747–757, Aug. 2000.
2. K. Q. Huang, L. S. Wang, T. N. Tan, and S. Maybank, "A real-time object detecting and tracking system for outdoor night surveillance," Pattern Recognition, vol. 41, pp. 432–444, Jan. 2008.
3. J. R. Bergen, P. J. Burt, R. Hingorani, and S. Peleg, "A three-frame algorithm for estimating two-component image motion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 9, pp. 886–896, Sep. 1992.
4. R. J. Radke, S. Andra, O. Al-Kofahi, and B. Roysam, "Image change detection algorithms: a systematic survey," IEEE Transactions on Image Processing, vol. 14, no. 3, pp. 294–307, Mar. 2005.
5. O. Miller, A. Averbuch, and Y. Keller, "Automatic adaptive segmentation of moving objects based on spatio-temporal information," in Proc. VIIth Digital Image Computing: Techniques and Applications, 2003, pp. 1007–1016.
2. PROPOSED MOVING OBJECT METHOD A. The outline of the method The process of the KOF method is shown in Fig. 1. The proposed method mainly consists of edge detection, optical flow, data fusion and morphologic operation. Considering the requirements of simplicity and effectiveness, the Kirsch operator is used for the edge detection. For the optical flow task, the Lucas–Kanade method is adopted, which can quickly provide a dense optical flow vector field for the moving object. The binarization process adopts the Otsu algorithm, which decides the threshold used to distinguish the background and the moving objects self-adaptively. However, because of noise, the optical flow method cannot detect the accurate boundaries of the moving objects. The edge detection algorithm mentioned above can solve this problem. Moreover, the edge image acquired by the Kirsch operator can be regarded as the space gradient, while the optical flow image is the time gradient. Combining the space gradient information with the time gradient information gives us more accurate information about the moving objects, so in the data fusion step the AND operator is applied between the edge binary image and the optical flow binary image. In order to get a more exact contour of the moving objects, morphologic operations such as Close and Hole Filling are implemented. Finally, the moving object is extracted from the image.
B. The edge detection method
Kirsch operator
The Kirsch operator is a non-linear edge detector that finds the maximum edge strength in a few predetermined directions.
Mathematical description
The operator is calculated as follows for directions with 45° difference:

h(n, m) = max_{z = 1, ..., 8} sum_{i = -1..1} sum_{j = -1..1} g_ij^(z) · f(n + i, m + j)

where the direction kernels are

g^(1) = [  5   5   5 ]      g^(2) = [  5   5  -3 ]
        [ -3   0  -3 ]              [  5   0  -3 ]
        [ -3  -3  -3 ]              [ -3  -3  -3 ]

and so on, each subsequent kernel being a 45° rotation of the previous one.
The edge image can be regarded as the space gradient. There are several gradient operators, such as Sobel, Roberts, Kirsch, etc. As the Kirsch operator can adjust the threshold automatically according to the character of the image, the Kirsch gradient operator is chosen to extract the contour of the object. The Kirsch operator has eight window templates, and every template gives its greatest response to a particular direction. The eight template operators are shown in Fig. 2. Except for the outermost columns and rows, every pixel and its 3×3 eight-neighborhood in an image are convolved with these eight templates respectively, so every pixel has eight outputs; the maximum of the eight template outputs is chosen as the value at that position. The gray value of a point and its eight neighborhoods in the image are illustrated in Fig. 3.
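To make the eight-template maximum-response idea concrete, here is an illustrative NumPy sketch (a re-implementation for demonstration, not the report's MATLAB code): the eight Kirsch templates are generated as 45° rotations of a base kernel, each interior pixel is correlated with all eight, and the maximum response is kept.

```python
# Illustrative Kirsch edge-intensity computation: S(i, j) = max_k (M_k . P),
# where P is the 3x3 neighbourhood of pixel (i, j). Border pixels are skipped,
# matching the "except the outermost row/column" remark in the text.
import numpy as np

def kirsch_templates():
    """Return the eight 3x3 Kirsch templates (45-degree rotations)."""
    base = np.array([[5, 5, 5],
                     [-3, 0, -3],
                     [-3, -3, -3]])
    # Positions around the centre, walked clockwise starting top-left.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [base[r, c] for r, c in ring]
    templates = []
    for k in range(8):
        t = np.zeros((3, 3), dtype=int)        # centre weight stays 0
        for (r, c), v in zip(ring, np.roll(vals, k)):
            t[r, c] = v
        templates.append(t)
    return templates

def kirsch_edge_intensity(img):
    """Maximum template response at each interior pixel."""
    img = img.astype(float)
    h, w = img.shape
    s = np.zeros((h, w))
    for t in kirsch_templates():
        resp = np.zeros((h, w))
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                resp[1:-1, 1:-1] += t[di + 1, dj + 1] * img[1 + di:h - 1 + di,
                                                            1 + dj:w - 1 + dj]
        s = np.maximum(s, resp)
    return s

# A vertical step edge: the template facing the bright side dominates.
step = np.zeros((5, 5))
step[:, 3:] = 10.0
print(kirsch_edge_intensity(step)[2, 2])  # strong response on the boundary
```

On this step edge, the template whose three positive weights face the bright column yields the largest response, which is why the maximum over the eight orientations localizes edges of any direction.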
Assume q_k (k = 0, 1, ..., 7) is the output produced by the k-th template of the Kirsch operator; q_k can be obtained from the equation below:

q_k = M_k * P    (k = 0, 1, ..., 7)    (1)

where M_k is the k-th template operator of the eight Kirsch operators and P holds the gray values of a pixel and its 3×3 eight-neighborhood. The edge intensity S(i, j) of P(i, j) is defined as S(i, j) = max{ q_k } (k = 0, 1, ..., 7). Every pixel undergoes the operation above, so the edge intensity image S is obtained. If the gray-value difference between the object and the background in the image is small and the detected edge feature is not obvious, the follow-up processing cannot continue, so a binarization step is necessary: when the value of the edge intensity image is above a threshold, the pixel is classified as an edge of the object. After the above operation, the edge binary image is acquired. Some video sequences, including both outdoor and indoor scenes, were tested; the results are shown in Fig. 4. The first
column is the outdoor scene and the second column is the indoor scene. It is clear that the Sobel and Roberts operators lose part of the contour of the objects, while the Kirsch operator detects the boundary of the objects clearly.
Otsu's method
In computer vision and image processing, Otsu's method is used to automatically perform histogram-shape-based image thresholding, i.e., the reduction of a gray-level image to a binary image. The algorithm assumes that the image to be thresholded contains two classes of pixels (e.g. foreground and background), then calculates the optimum threshold separating those two classes so that their combined spread (intra-class variance) is minimal. The extension of the original method to multi-level thresholding is referred to as the multi-Otsu method.
Method
In Otsu's method we exhaustively search for the threshold that minimizes the intra-class variance, defined as a weighted sum of the variances of the two classes:

σ_w²(t) = ω_0(t) σ_0²(t) + ω_1(t) σ_1²(t)

Weights ω_i are the probabilities of the two classes separated by a threshold t, and σ_i² are the variances of these classes. Otsu shows that minimizing the intra-class variance is the same as maximizing the inter-class variance: [2]

σ_b²(t) = σ² - σ_w²(t) = ω_0(t) ω_1(t) [μ_0(t) - μ_1(t)]²

which is expressed in terms of class probabilities ω_i and class means μ_i, which in turn can be updated iteratively. This idea yields an effective algorithm.
ALGORITHM
1. Compute the histogram and probabilities of each intensity level.
2. Set up initial ω_i(0) and μ_i(0).
3. Step through all possible thresholds t = 1, ..., maximum intensity:
   1. Update ω_i and μ_i.
   2. Compute σ_b²(t).
4. The desired threshold corresponds to the maximum of σ_b²(t).
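The steps above can be sketched in a few lines of NumPy (illustrative only; this version searches thresholds exhaustively and recomputes the class statistics at each step rather than using the incremental ω/μ updates):

```python
# Otsu threshold search: keep the t that maximizes the between-class
# variance w0 * w1 * (mu0 - mu1)^2.
import numpy as np

def otsu_threshold(gray, levels=256):
    hist = np.bincount(gray.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                      # probability of each level
    best_t, best_var = 0, -1.0
    for t in range(1, levels):
        w0, w1 = p[:t].sum(), p[t:].sum()      # class probabilities
        if w0 == 0 or w1 == 0:
            continue                           # one class empty: skip t
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, levels) * p[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

# Two well-separated intensity clusters: the threshold falls between them.
img = np.array([10] * 50 + [200] * 50, dtype=np.uint8).reshape(10, 10)
t = otsu_threshold(img)
print(10 < t <= 200)
```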
C. The Optical Flow method Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene. Optical flow techniques such as motion detection, object segmentation, time-to-collision and focus-of-expansion calculations, motion-compensated encoding, and stereo disparity measurement utilize this motion of the objects' surfaces and edges. Estimation of the optical flow Sequences of ordered images allow the estimation of motion as either instantaneous image velocities or discrete image displacements. [5] Fleet and Weiss provide a tutorial introduction to gradient-based optical flow. [6] John L. Barron, David J. Fleet, and Steven Beauchemin provide a performance analysis of a number of optical flow techniques; it emphasizes the accuracy and density of measurements. [7] The optical flow methods try to calculate the motion between two image frames which are taken at times t and t + Δt at every voxel position. These methods are called differential since they are based on local Taylor series approximations of the image signal; that is, they use partial derivatives with respect to the spatial and temporal coordinates. For a 2D+t dimensional case (3D or n-D cases are similar), a voxel at location (x, y, t) with intensity I(x, y, t) will have moved by Δx, Δy and Δt between the two image frames, and the following image constraint equation can be given:
I(x, y, t) = I(x + Δx, y + Δy, t + Δt)

Assuming the movement to be small, the image constraint at I(x, y, t) can be developed with a Taylor series to get:

I(x + Δx, y + Δy, t + Δt) = I(x, y, t) + (∂I/∂x) Δx + (∂I/∂y) Δy + (∂I/∂t) Δt + H.O.T.

From these equations it follows that:

(∂I/∂x) Δx + (∂I/∂y) Δy + (∂I/∂t) Δt = 0

or, dividing by Δt,

(∂I/∂x) Vx + (∂I/∂y) Vy + ∂I/∂t = 0

where Vx and Vy are the x and y components of the velocity or optical flow of I(x, y, t), and ∂I/∂x, ∂I/∂y and ∂I/∂t are the derivatives of the image at (x, y, t) in the corresponding directions. Writing Ix, Iy and It for these derivatives, we have:

Ix Vx + Iy Vy = -It
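As a quick numerical illustration of the constraint Ix·Vx + Iy·Vy + It = 0 (a hypothetical example, not from the report): for a linear-intensity ramp translated by a known (vx, vy) between two frames, the derivatives satisfy the constraint exactly.

```python
# Brightness-constancy check on a synthetic linear ramp I = 3x + 5y that
# translates by (vx, vy) = (1, 2) between two frames.
import numpy as np

h, w = 8, 8
y, x = np.mgrid[0:h, 0:w].astype(float)
vx, vy = 1.0, 2.0                          # assumed motion between frames

frame1 = 3.0 * x + 5.0 * y                 # I at time t
frame2 = 3.0 * (x - vx) + 5.0 * (y - vy)   # I at time t + 1 (scene shifted)

ix, iy = 3.0, 5.0                          # spatial derivatives of the ramp
it = (frame2 - frame1)[0, 0]               # temporal derivative (constant)

print(ix * vx + iy * vy + it)              # 0.0: the constraint holds
```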
This is an equation in two unknowns and cannot be solved as such. This is known as the aperture problem of the optical flow algorithms. To find the optical flow another set of equations is needed, given by some additional constraint. All optical flow methods introduce additional conditions for estimating the actual flow. Methods for determining optical flow
• Phase correlation – inverse of normalized cross-power spectrum
• Block-based methods – minimizing sum of squared differences or sum of absolute differences, or maximizing normalized cross-correlation
• Differential methods of estimating optical flow, based on partial derivatives of the image signal and/or the sought flow field and higher-order partial derivatives, such as:
  o Lucas–Kanade method – regarding image patches and an affine model for the flow field
  o Horn–Schunck method – optimizing a functional based on residuals from the brightness constancy constraint, and a particular regularization term expressing the expected smoothness of the flow field
  o Buxton–Buxton method – based on a model of the motion of edges in image sequences [8]
  o Black–Jepson method – coarse optical flow via correlation
  o General variational methods – a range of modifications/extensions of Horn–Schunck, using other data terms and other smoothness terms
• Discrete optimization methods – the search space is quantized, and then image matching is addressed through label assignment at every pixel, such that the corresponding deformation minimizes the distance between the source and the target image. [9] The optimal solution is often recovered through min-cut max-flow algorithms, linear programming or belief propagation methods.
Uses of optical flow Motion estimation and video compression have developed as a major aspect of optical flow research. While the optical flow field is superficially similar to a dense motion field derived from the techniques of motion estimation, optical flow is the study not only of the determination of the optical flow field itself, but also of its use in estimating the three-dimensional nature and structure of the scene, as well as the 3D motion of objects and the observer relative to the scene. Optical flow has been used by robotics researchers in many areas such as object detection and tracking, image dominant-plane extraction, movement detection, robot navigation and visual odometry. Applications of optical flow include the problem of inferring not only the motion of the observer and objects in the scene, but also the structure of objects and the environment. Since awareness of motion and the generation of mental maps of the structure of our environment are critical components of animal (and human) vision, the conversion of this innate ability to a computer capability is similarly crucial in the field of machine vision. Consider a five-frame clip of a ball moving from the bottom left of a field of vision to the top right. Motion estimation techniques can determine that on a two-dimensional plane the ball is moving up and to the right, and vectors describing this motion can be extracted from the sequence of frames. For the purposes of video compression (e.g., MPEG), the sequence is now described as well as it needs to be. However, in the field of machine vision, the question of whether the ball is moving to the right or whether the observer is moving to the left is unknowable yet critical information. Even if a static, patterned background were present in the five frames, we could not confidently state that the ball was moving to the right, because the pattern might be at an infinite distance from the observer.
There are several methods for computing the optical flow, such as differential, matching, energy-based, and phase-based methods. In this paper, the Lucas–Kanade method is used. The optical flow constraint equation, stating that intensity is conserved along the motion, is as in (2):

I(x, t) = I(x - V t, 0)    (2)

where V = (u, v)^T, u is the horizontal component of the optical flow and v is the vertical component. From a Taylor expansion of (2), or more generally from the assumption that intensity is conserved, dI(x, t)/dt = 0, the gradient constraint equation is derived:

∇I(x, t) · V + I_t(x, t) = 0    (3)

where I_t(x, t) denotes the partial time derivative of I(x, t). Lucas and Kanade assume that the motion vector keeps constant in a small spatial neighborhood, and they use weighted least squares to estimate the optical flow. So in the small spatial neighborhood Ω, the error of the optical flow is defined as:

E = Σ_{x ∈ Ω} W²(x) [∇I(x, t) · V + I_t(x, t)]²    (4)

where W(x) denotes a window function that gives more influence to constraints at the center of the neighborhood than to those at the periphery. The solution to (4) is given by

A^T W² A V = A^T W² b    (5)

where, for n points x_i ∈ Ω at a single time t,

A = [∇I(x_1), ..., ∇I(x_n)]^T,  W = diag[W(x_1), ..., W(x_n)],  b = -[I_t(x_1), ..., I_t(x_n)]^T.

The solution to (5) is V = (A^T W² A)^(-1) A^T W² b. A single component of the optical flow cannot reflect the motion information of the objects, so the two components must be combined. By experiment, the optical flow image, scilicet the time-gradient image, is defined in this paper as x = sqrt(u² + v²). After the optical flow method, in the binarization process, the Otsu algorithm is adopted. The Otsu algorithm can select the threshold used to distinguish the moving object and the background adaptively. It is a classic non-parametric, unsupervised adaptive threshold selection method.
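The weighted least-squares solve in (5), V = (A^T W² A)^(-1) A^T W² b, can be sketched for a single window as follows (illustrative only; the quadratic-ramp gradients and the true flow (1, -0.5) are assumptions made up for the demo):

```python
# One Lucas-Kanade window: stack the spatial gradients into A, set b = -It,
# and solve the (weighted) normal equations for the flow V = (u, v).
import numpy as np

def lucas_kanade_window(ix, iy, it, w=None):
    """Estimate (u, v) for one window from per-pixel gradient samples."""
    a = np.stack([ix.ravel(), iy.ravel()], axis=1)         # n x 2 matrix A
    b = -it.ravel()                                        # b = -It
    w2 = np.ones(len(b)) if w is None else w.ravel() ** 2  # W^2 weights
    ata = a.T @ (w2[:, None] * a)                          # A^T W^2 A
    atb = a.T @ (w2 * b)                                   # A^T W^2 b
    return np.linalg.solve(ata, atb)

# Synthetic 5x5 window with gradients of I = x^2 + y^2, so Ix and Iy vary
# across the window and A^T A is well-conditioned (no aperture problem).
y, x = np.mgrid[0:5, 0:5].astype(float)
ix, iy = 2.0 * x, 2.0 * y
u_true, v_true = 1.0, -0.5
it = -(ix * u_true + iy * v_true)   # It chosen so brightness constancy holds

uv = lucas_kanade_window(ix, iy, it)
print(uv)  # recovers (u_true, v_true)
```

Note that if Ix and Iy were constant over the window, A^T A would be singular: that is exactly the aperture problem mentioned earlier, and the reason Lucas–Kanade needs windows with varied gradient directions.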
D. Data fusion and the morphologic operation After the above operations, the space-gradient binary image and the time-gradient binary image are acquired. In order to get the exact contour, the AND operator is applied between the two binary images in the data fusion step. The process can be written as the following equation:

D_bw(i, j) = S_bw(i, j) AND T_bw(i, j)

where D_bw denotes the result of the data fusion, S_bw denotes the space-gradient binary image, T_bw denotes the time-gradient binary image, and (i, j) denotes the coordinates of a pixel in the image. Then the morphologic operators such as Close and Hole Filling are used to eliminate discontinuities of the object. Finally, the area of each connected region is calculated, and regions whose area is below a threshold are discarded. The remaining regions are considered the moving objects.
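The AND fusion of the two binary images can be sketched as follows (a toy example with hand-made masks; the Close / Hole-Filling morphology and the area filtering are omitted for brevity):

```python
# Pixel-wise AND fusion of the space-gradient (edge) and time-gradient
# (optical flow) binary masks, as in D_bw(i, j) = S_bw(i, j) AND T_bw(i, j).
import numpy as np

s_bw = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 1]], dtype=bool)   # space-gradient binary image
t_bw = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 1],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)   # time-gradient binary image

d_bw = s_bw & t_bw                            # fusion: keep pixels in both
print(d_bw.sum())                             # number of surviving pixels
```

Only pixels flagged by both the edge detector and the optical flow survive, which is why the fusion suppresses isolated noise responses from either source alone.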
DIGITAL IMAGE PROCESSING
BACKGROUND: Digital image processing is an area characterized by the need for extensive experimental work to establish the viability of proposed solutions to a given problem. An important characteristic underlying the design of image processing systems is the significant level of testing & experimentation that normally is required before arriving at an acceptable solution. This characteristic implies that the ability to formulate approaches & quickly prototype candidate solutions generally plays a major role in reducing the cost & time required to arrive at a viable system implementation. What is DIP? An image may be defined as a two-dimensional function f(x, y), where x & y are spatial coordinates, & the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y & the amplitude values of f are all finite discrete quantities, we call the image a digital image. The field of DIP refers to processing digital images by means of a digital computer. A digital image is composed of a finite number of elements, each of which has a particular location & value; the elements are called pixels. Vision is the most advanced of our senses, so it is not surprising that images play the single most important role in human perception. However, unlike humans, who are limited to the visual band of the EM spectrum, imaging machines cover almost the entire EM spectrum, ranging from gamma to radio waves. They can also operate on images generated by sources that humans are not accustomed to associating with images.
There is no general agreement among authors regarding where image processing stops & other related areas, such as image analysis & computer vision, start. Sometimes a distinction is made by defining image processing as a discipline in which both the input & output of a process are images. This is a limiting & somewhat artificial boundary. The area of image analysis (image understanding) is in between image processing & computer vision.
There are no clear-cut boundaries in the continuum from image processing at one end to complete vision at the other. However, one useful paradigm is to consider three types of computerized processes in this continuum: low-, mid-, & high-level processes. A low-level process involves primitive operations such as image preprocessing to reduce noise, contrast enhancement & image sharpening; it is characterized by the fact that both its inputs & outputs are images. A mid-level process on images involves tasks such as segmentation, description of objects to reduce them to a form suitable for computer processing, & classification of individual objects; it is characterized by the fact that its inputs generally are images but its outputs are attributes extracted from those images. Finally, higher-level processing involves "making sense" of an ensemble of recognized objects, as in image analysis, & at the far end of the continuum, performing the cognitive functions normally associated with human vision.
Digital image processing, as already defined is used successfully in a broad range of areas of exceptional social & economic value.
What is an image?
An image is represented as a two-dimensional function f(x, y), where x and y are spatial coordinates and the amplitude of f at any pair of coordinates (x, y) is called the intensity of the image at that point.
Gray scale image:
A grayscale image is a function I(x, y) of the two spatial coordinates of the image plane. I(x, y) is the intensity of the image at the point (x, y) on the image plane. I(x, y) takes non-negative values; assuming the image is bounded by a rectangle [0, a] × [0, b], we have I: [0, a] × [0, b] → [0, ∞). Color image:
It can be represented by three functions, R(x, y) for red, G(x, y) for green and B(x, y) for blue. An image may be continuous with respect to the x and y coordinates and also in amplitude. Converting such an image to digital form requires that the coordinates as well as the amplitude be digitized. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.
Coordinate convention: The result of sampling and quantization is a matrix of real numbers. We use two principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns. We say that the image is of size M x N. The values of the coordinates (x, y) are discrete quantities. For notational clarity and convenience, we use integer values for these discrete coordinates. In many image processing books, the image origin is defined to be at (x, y) = (0, 0). The next coordinate values along the first row of the image are (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row; it does not mean that these are the actual values of the physical coordinates when the image was sampled. The following figure shows the coordinate convention. Note that x ranges from 0 to M-1 and y from 0 to N-1 in integer increments.
The coordinate convention used in the toolbox to denote arrays differs from the preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the same as the order discussed in the previous paragraph, in the sense that the first element of a coordinate tuple, (a, b), refers to a row and the second to a column. The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M and c from 1 to N in integer increments. IPT documentation refers to these as pixel coordinates. Less frequently, the toolbox also employs another coordinate convention, called spatial coordinates, which uses x to refer to columns and y to refer to rows. This is the opposite of our use of the variables x and y.
Image as Matrices: The preceding discussion leads to the following representation for a digitized image function:

         f(0,0)      f(0,1)      ...   f(0,N-1)
f(x,y) = f(1,0)      f(1,1)      ...   f(1,N-1)
         ...         ...               ...
         f(M-1,0)    f(M-1,1)    ...   f(M-1,N-1)

The right side of this equation is a digital image by definition. Each element of this array is called an image element, picture element, pixel, or pel. The terms image and pixel are used throughout the rest of our discussions to denote a digital image and its elements. A digital image can be represented naturally as a MATLAB matrix:

    f(1,1)   f(1,2)   ...   f(1,N)
f = f(2,1)   f(2,2)   ...   f(2,N)
    ...      ...            ...
    f(M,1)   f(M,2)   ...   f(M,N)
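As a minimal sketch of the shift in origin described above (the variable names and the 4x4 test matrix are illustrative, not from the report):

```matlab
% Sketch: relating the textbook origin (0,0) to MATLAB's (1,1) indexing.
f = magic(4);        % a 4x4 test "image"
[M, N] = size(f);    % M rows, N columns
top_left = f(1, 1);  % MATLAB element f(1,1) corresponds to f(0,0) in the
                     % textbook convention; rows r = 1..M, columns c = 1..N
```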
where f(1,1) = f(0,0) (note the use of a monospace font to denote MATLAB quantities). Clearly the two representations are identical, except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q. For example, f(6, 2) is the element in the sixth row and second column of the matrix f. Typically we use the letters M and N, respectively, to denote the number of rows and columns in a matrix. A 1xN matrix is called a row vector, whereas an Mx1 matrix is called a column vector. A 1x1 matrix is a scalar. Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array, and so on. Variables must begin with a letter and contain only letters, numerals, and underscores. As noted in the previous paragraph, all MATLAB quantities are written using monospace characters. We use conventional Roman italic notation, such as f(x, y), for mathematical expressions. Reading Images: Images are read into the MATLAB environment using function imread, whose syntax is
imread('filename')

Format name   Description                        Recognized extensions
TIFF          Tagged Image File Format           .tif, .tiff
JPEG          Joint Photographic Experts Group   .jpg, .jpeg
GIF           Graphics Interchange Format        .gif
BMP           Windows Bitmap                     .bmp
PNG           Portable Network Graphics          .png
XWD           X Window Dump                      .xwd
Here filename is a string containing the complete name of the image file (including any applicable extension). For example, the command line

>> f = imread('chestxray.jpg');

reads the JPEG image (see the table above) chestxray into image array f. Note the use of single quotes (') to delimit the string filename. The semicolon at the end of a command line is used by MATLAB for suppressing output; if a semicolon is not included, MATLAB displays the results of the operation(s) specified in that line. The prompt symbol (>>) designates the beginning of a command line, as it appears in the MATLAB command window. When, as in the preceding command line, no path is included in filename, imread reads the file from the current directory and, if that fails, it tries to find the file in the MATLAB search path. The simplest way to read an image from a specified directory is to include a full or relative path to that directory in filename. For example,

>> f = imread('D:\myimages\chestxray.jpg');

reads the image from a folder called myimages on the D: drive, whereas

>> f = imread('.\myimages\chestxray.jpg');

reads the image from the myimages subdirectory of the current working directory. The current directory window on the MATLAB desktop toolbar displays MATLAB's current working directory and provides a simple, manual way to change it. The table above lists some of the most popular image/graphics formats supported by imread and imwrite. Function size gives the row and column dimensions of an image:

>> size(f)
ans =
    1024  1024
This function is particularly useful in programming, when used in the following form to determine automatically the size of an image:

>> [M, N] = size(f);

This syntax returns the number of rows (M) and columns (N) in the image. The whos function displays additional information about an array. For instance, the statement

>> whos f

gives

Name    Size         Bytes     Class
f       1024x1024    1048576   uint8 array

Grand total is 1048576 elements using 1048576 bytes

The uint8 entry shown refers to one of several MATLAB data classes. A semicolon at the end of a whos line has no effect, so normally one is not used.
Displaying Images: Images are displayed on the MATLAB desktop using function imshow, which has the basic syntax

imshow(f, g)

where f is an image array and g is the number of intensity levels used to display it. If g is omitted, it defaults to 256 levels. Using the syntax

imshow(f, [low high])

displays as black all values less than or equal to low, and as white all values greater than or equal to high. The values in between are displayed as intermediate intensity values using the default number of levels. Finally, the syntax

imshow(f, [ ])

sets variable low to the minimum value of array f and high to its maximum value. This form of imshow is useful for displaying images that have a low dynamic range or that have positive and negative values. Function pixval is used frequently to display the intensity values of individual pixels interactively. This function displays a cursor overlaid on an image. As the cursor is moved over the image with the mouse, the coordinates of the cursor position and the corresponding intensity values are shown on a display that appears below the figure window. When working with color images, the coordinates as well as the red, green, and blue components are displayed. If the left button on the mouse is clicked and then held
pressed, pixval displays the Euclidean distance between the initial and current cursor locations. The syntax form of interest here is

pixval

which shows the cursor on the last image displayed. Clicking the X button on the cursor window turns it off. The following statements read from disk an image called rose_512.tif, extract basic information about the image, and display it using imshow:

>> f = imread('rose_512.tif');
>> whos f

Name    Size       Bytes    Class
f       512x512    262144   uint8 array

Grand total is 262144 elements using 262144 bytes

>> imshow(f)

A semicolon at the end of an imshow line has no effect, so normally one is not used. If another image, g, is displayed using imshow, MATLAB replaces the image on the screen with the new image. To keep the first image and output a second image, we use function figure as follows:

>> figure, imshow(g)

Using the statement

>> imshow(f), figure, imshow(g)

displays both images.
Note that more than one command can be written on a line, as long as different commands are properly delimited by commas or semicolons. As mentioned earlier, a semicolon is used whenever it is desired to suppress screen output from a command line. Suppose that we have just read an image h and find that using imshow produces a dim image. It is clear that this image has a low dynamic range, which can be remedied for display purposes by using the statement

>> imshow(h, [ ])

WRITING IMAGES: Images are written to disk using function imwrite, which has the following basic syntax:

imwrite(f, 'filename')

With this syntax, the string contained in filename must include a recognized file format extension. Alternatively, the desired format can be specified explicitly with a third input argument. For example, the following command writes f to a TIFF file named patient10_run1:

>> imwrite(f, 'patient10_run1', 'tif')

or, alternatively,

>> imwrite(f, 'patient10_run1.tif')

If filename contains no path information, then imwrite saves the file in the current working directory. The imwrite function can have other parameters, depending on the file format selected. Most of the work in the following deals either with JPEG or TIFF images, so we focus attention here on these two formats.
A more general imwrite syntax, applicable only to JPEG images, is

imwrite(f, 'filename.jpg', 'quality', q)

where q is an integer between 0 and 100 (the lower the number, the higher the degradation due to JPEG compression). For example, for q = 25 the applicable syntax is

>> imwrite(f, 'bubbles25.jpg', 'quality', 25)

The image for q = 15 has false contouring that is barely visible, but this effect becomes quite pronounced for q = 5 and q = 0. Thus, an acceptable solution with some margin for error is to compress the images with q = 25. In order to get an idea of the compression achieved and to obtain other image file details, we can use function imfinfo, which has syntax

imfinfo filename

Here filename is the complete file name of the image stored on disk. For example,

>> imfinfo bubbles25.jpg

outputs the following information (note that some fields contain no information in this case):

Filename:        'bubbles25.jpg'
FileModDate:     '04-Jan-2003 12:31:26'
FileSize:        13849
Format:          'jpg'
FormatVersion:   ''
Width:           714
Height:          682
BitDepth:        8
ColorType:       'grayscale'
FormatSignature: ''
Comment:         {}
where FileSize is in bytes. The number of bytes in the original image is computed simply by multiplying Width by Height by BitDepth and dividing the result by 8. The result is 486948. Dividing this by FileSize gives the compression ratio: 486948/13849 = 35.16.
To store into variable K all the information generated by command imfinfo, we write K = imfinfo(...); the information generated by imfinfo is then accessed through the fields of the structure variable K, each separated from K by a dot. For example, the image height and width are now stored in the structure fields K.Height and K.Width.
As an illustration, consider the following use of structure variable K to compute the compression ratio for bubbles25.jpg:

>> K = imfinfo('bubbles25.jpg');
>> image_bytes = K.Width * K.Height * K.BitDepth / 8;
>> compressed_bytes = K.FileSize;
>> compression_ratio = image_bytes / compressed_bytes

compression_ratio =
    35.162

Note that imfinfo was used in two different ways. The first was to type imfinfo bubbles25.jpg at the prompt, which resulted in the information being displayed on the screen. The second was to type K = imfinfo('bubbles25.jpg'), which resulted in the information generated by imfinfo being stored in K. These two different ways of calling imfinfo are an example of command-function duality, an important concept that is explained in more detail in the MATLAB online documentation.
A more general imwrite syntax, applicable only to TIFF images, has the form

imwrite(g, 'filename.tif', 'compression', 'parameter', 'resolution', [colres rowres])

where 'parameter' can have one of the following principal values: 'none' indicates no compression; 'packbits' indicates packbits compression (the default for nonbinary images); and 'ccitt' indicates ccitt compression (the default for binary images). The 1x2 array [colres rowres] contains two integers that give the column resolution and row resolution in dots per unit (the default unit is inches). For example, if the image dimensions are in inches, colres is the number of dots (pixels) per inch (dpi) in the vertical direction, and similarly for rowres in the horizontal direction. Specifying the resolution by a single scalar, res, is equivalent to writing [res res].

>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', [300 300])
The values of the vector [colres rowres] were determined by multiplying 200 dpi by the ratio 2.25/1.5, which gives 300 dpi. Rather than do the computation manually, we could write

>> res = round(200*2.25/1.5);
>> imwrite(f, 'sf.tif', 'compression', 'none', 'resolution', res)

where function round rounds its argument to the nearest integer. It is important to note that the number of pixels was not changed by these commands; only the scale of the image changed. The original 450x450 image at 200 dpi is of size 2.25x2.25 inches. The new 300-dpi image is identical, except that its 450x450 pixels are distributed over a 1.5x1.5-inch area. Processes such as this are useful for controlling the size of an image in a printed document without sacrificing resolution.
Often it is necessary to export images to disk the way they appear on the MATLAB desktop. This is especially true with plots. The contents of a figure window can be exported to disk in two ways. The first is to use the file pull-down menu in the figure window and then choose export; with this option the user can select a location, filename, and format. More control over export parameters is obtained by using the print command:

print -fno -dfileformat -rresno filename

where no refers to the number of the figure window of interest, fileformat refers to one of the file formats in the table above, resno is the resolution in dpi, and filename is the name we wish to assign the file. If we simply type print at the prompt, MATLAB prints (to the default printer) the contents of the last figure window displayed. It is possible also to specify other options with print, such as a specific printing device.
Data Classes:
Although we work with integer coordinates, the values of pixels themselves are not restricted to be integers in MATLAB. The table below lists the various data classes supported by MATLAB and IPT for representing pixel values. The first eight entries in the table are referred to as numeric data classes. The ninth entry is the char class and, as shown, the last entry is referred to as the logical data class. All numeric computations in MATLAB are done in double quantities, so this is also a frequent data class encountered in image processing applications. Class uint8 also is encountered frequently, especially when reading data from storage devices, as 8-bit images are the most common representation found in practice. These two data classes, class logical, and, to a lesser degree, class uint16 constitute the primary data classes on which we focus. Many IPT functions, however, support all the data classes listed in the table. Data class double requires 8 bytes to represent a number; uint8 and int8 require one byte each; uint16 and int16 require 2 bytes; and uint32, int32, and single require 4 bytes each.

Name      Description
double    Double-precision, floating-point numbers in the approximate range ±10^308 (8 bytes per element).
uint8     Unsigned 8-bit integers in the range [0, 255] (1 byte per element).
uint16    Unsigned 16-bit integers in the range [0, 65535] (2 bytes per element).
uint32    Unsigned 32-bit integers in the range [0, 4294967295] (4 bytes per element).
int8      Signed 8-bit integers in the range [-128, 127] (1 byte per element).
int16     Signed 16-bit integers in the range [-32768, 32767] (2 bytes per element).
int32     Signed 32-bit integers in the range [-2147483648, 2147483647] (4 bytes per element).
single    Single-precision, floating-point numbers in the approximate range ±10^38 (4 bytes per element).
char      Characters (2 bytes per element).
logical   Values are 0 or 1 (1 byte per element).

The char data class holds characters in Unicode representation. A character string is merely a 1xn array of characters. A logical array contains only the values 0 and 1, with each element being stored in memory using one byte; logical arrays are created using function logical or by using relational operators.
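As a brief sketch of how these classes behave in practice (the array values here are illustrative only):

```matlab
% Sketch: inspecting and converting the data class of an image array.
f = uint8([0 128 255]);  % class uint8, values in [0, 255]
c = class(f);            % returns the class name as a string, 'uint8'
g = double(f) / 255;     % convert to class double, scaled to [0, 1]
```

Note that the plain double(f) conversion changes the class but not the values; the division by 255 is what rescales a uint8 image into the conventional [0, 1] double range.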
Image Types: The toolbox supports four types of images:
1. Intensity images
2. Binary images
3. Indexed images
4. RGB images
Most monochrome image processing operations are carried out using binary or intensity images, so our initial focus is on these two image types. Indexed and RGB colour images are discussed afterwards.
Intensity Images: An intensity image is a data matrix whose values have been scaled to represent intensities. When the elements of an intensity image are of class uint8 or class uint16, they have integer values in the range [0, 255] and [0, 65535], respectively. If the image is of class double, the values are floating-point numbers; values of scaled, double intensity images are in the range [0, 1] by convention.
Binary Images: Binary images have a very specific meaning in MATLAB: a binary image is a logical array of 0s and 1s. Thus, an array of 0s and 1s whose values are of a numeric data class, say uint8, is not considered a binary image in MATLAB. A numeric array is converted to binary using function logical. Thus, if A is a numeric array consisting of 0s and 1s, we create a logical array B using the statement

B = logical(A)

If A contains elements other than 0s and 1s, use of the logical function converts all nonzero quantities to logical 1s and all entries with value 0 to logical 0s. Using relational and logical operators also creates logical arrays. To test whether an array is logical, we use the islogical function:

islogical(c)
If c is a logical array, this function returns a 1; otherwise it returns a 0. Logical arrays can be converted to numeric arrays using the data class conversion functions.
Indexed Images: An indexed image has two components: a data matrix of integers, X, and a colormap matrix, map. Matrix map is an m*3 array of class double containing floating-point values in the range [0, 1]. The length m of the map is equal to the number of colors it defines. Each row of map specifies the red, green, and blue components of a single color. An indexed image uses "direct mapping" of pixel intensity values to colormap values: the color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with values less than or equal to 1 point to the first row in map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second row, and so on.
RGB Images: An RGB color image is an M*N*3 array of color pixels, where each color pixel is a triplet corresponding to the red, green, and blue components of an RGB image at a specific spatial location. An RGB image may be viewed as a "stack" of three gray-scale images that, when fed into the red, green, and blue inputs of a color monitor, produce a color image on the screen. By convention, the three images forming an RGB color image are referred to as the red, green, and blue component images. The data class of the component images determines their range of values. If an RGB image is of class double, the range of values is [0, 1]; similarly, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16, respectively. The number of bits used to represent the pixel values of the component images determines the bit depth of an RGB image. For example, if each component image is an 8-bit image, the corresponding RGB image is said to be 24 bits deep. Generally, the number of bits in all component images is the same; in this case the number of possible colors in an RGB image is (2^b)^3, where b is the number of bits in each component image. For the 8-bit case the number is 16,777,216 colors.
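The image types above can be sketched in a few lines (a minimal illustration; the array sizes and values are invented for the example, not taken from the report):

```matlab
% Sketch: the four toolbox image types described above.
I   = rand(4, 4);          % intensity image: class double, values in [0, 1]
B   = logical(I > 0.5);    % binary image: a logical array of 0s and 1s
t   = islogical(B);        % returns 1, confirming B is a binary image
rgb = cat(3, I, I, I);     % RGB image: an M x N x 3 stack of component planes
```

Stacking the same plane three times, as done for rgb here, simply yields a gray image expressed in RGB form; distinct red, green, and blue planes would produce color.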
CONCLUSIONS In this paper, a novel method which combines the Kirsch operator with the optical flow is proposed for moving object detection. The edge image can be considered the space gradient, while the optical flow image is the time gradient; the KOF method thus contains both the space gradient information and the time gradient information. The Otsu algorithm and morphologic operations are also used as supporting techniques. Compared with the three traditional moving object detection methods, the KOF method not only gives the exact boundary of the moving objects, but also has better anti-noise performance. Although the method is somewhat time-consuming, the rapid development of computer hardware can mitigate this problem. The experiment results prove that the method is effective for moving object detection.
REFERENCES: 1. C. Stauffer and W. E. L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 747-757, Aug. 2000. 2. K. Q. Huang, L. S. Wang, T. N. Tan, and S. Maybank, "A real-time object detecting and tracking system for outdoor night surveillance," Pattern Recognition, vol. 41, pp. 432-444, Jan. 2008.
3. J. R. Bergen, P. J. Burt, R. Hingorani, and S. Peleg, "A three-frame algorithm for estimating two-component image motion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 9, pp. 886-896, Sep. 1992. 4. R. J. Radke, S. Andra, O. Al-Kofahi, and B. Roysam, "Image change detection algorithms: a systematic survey," IEEE Transactions on Image Processing, vol. 14, no. 3, pp. 294-307, Mar. 2005. 5. O. Miller, A. Averbuch, and Y. Keller, "Automatic adaptive segmentation of moving objects based on spatio-temporal information," in Proc. VIIth Digital Image Computing: Techniques and Applications, 2003, pp. 1007-1016.