
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 9, SEPTEMBER 2012

Multifocusing and Depth Estimation Using a Color Shift Model-Based Computational Camera
Sangjin Kim, Student Member, IEEE, Eunsung Lee, Student Member, IEEE, Monson H. Hayes, Fellow, IEEE, and Joonki Paik, Senior Member, IEEE

Abstract—This paper presents a novel approach to depth estimation using a multiple color-filter aperture (MCA) camera and its application to multifocusing. An image acquired by the MCA camera contains spatially varying misalignment among the RGB color channels, where the direction and length of the misalignment are a function of the distance of an object from the plane of focus. Therefore, if the misalignment is estimated from the MCA output image, multifocusing and depth estimation become possible using a set of image processing algorithms. We first segment the image into multiple clusters having approximately uniform misalignment using a color-based region classification method, and then find a rectangular region that encloses each cluster. For each of the rectangular regions in the RGB color channels, color shifting vectors are estimated using a phase correlation method. After the set of three clusters is aligned in the direction opposite to the estimated color shifting vectors, the aligned clusters are fused to produce an approximately in-focus image. Because of the finite size of the color-filter apertures, the fused image still contains a certain amount of spatially varying out-of-focus blur, which is removed by a truncated constrained least-squares filter followed by a spatially adaptive artifact removal filter. Experimental results show that the MCA-based multifocusing method significantly enhances the visual quality of an image containing multiple objects at different distances, and can be fully or partially incorporated into multifocusing or extended depth of field systems. The MCA camera also realizes single camera-based depth estimation, where the displacement between the multiple apertures plays the role of the baseline of a stereo vision system. Experimental results show that the estimated depth is accurate enough to perform a variety of vision-based tasks, such as image understanding, description, and robot vision.

Index Terms—Color shifting vector, computational camera, depth estimation, multifocusing.

Manuscript received July 9, 2011; revised May 16, 2012; accepted May 21, 2012. Date of publication June 8, 2012; date of current version August 22, 2012. This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science, and Technology under Grant 2009-0081059. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Wai-Kuen Cham. The authors are with the Department of Image, Chung-Ang University, Seoul 156-756, Korea (e-mail: layered372@wm.cau.ac.kr; lessel7@wm.cau.ac.kr; mhh3@gatech.edu; paikj@cau.ac.kr). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2012.2202671

I. INTRODUCTION

Multifocusing is a fundamental problem in image acquisition when multiple objects are located at different distances from the camera. A digital multifocusing system would be useful and important in a variety of applications such as consumer digital cameras, camcorders, and video surveillance systems. Conventional cameras have come a long way in dealing with the problems associated with imperfect focal settings and the resulting blur. However, when there are multiple objects located at a variety of different distances, and when an image is captured with a small F-stop (large aperture), the removal of out-of-focus blur on objects outside the depth of field remains an interesting and important area of research.

Recently, computational cameras have been developed to acquire information that cannot be easily obtained with a conventional camera [1]–[5]. Using this additional information, digital post-processing allows a variety of important imaging problems to be solved, such as refocusing, increased dynamic range, depth-guided editing, and relighting [6], [7]. In this paper, we present a novel solution to multifocusing and depth estimation using a computational camera that has multiple color-filtered apertures (MCA) [8]–[11]. To the best of our knowledge, the fundamental theory of color shift model-based auto-focusing using the MCA configuration was first proposed by Maik et al. in [8], where multiple images with different focal settings were captured and fused for auto-focusing. A similar MCA configuration was used in [9] for single-camera depth estimation and matting. Although digital image restoration and enhancement algorithms have been proposed for improving the performance of the MCA computational camera in [10] and [11], they are based on the assumption that the out-of-focus blur is constant over the image.

The MCA camera with the proposed image processing algorithms enables automatic, computationally efficient multifocusing using a three-step process: (i) image segmentation for classifying clusters, (ii) color shift model-based registration and fusion, and (iii) image restoration. More specifically, an image acquired by the MCA camera is first segmented into multiple clusters, each of which has uniform color, and the rectangular region enclosing each cluster is then generated. We assume that all the pixels in each cluster correspond to a patch of an object that is at approximately the same distance from the camera. This uniform-distance assumption within a cluster covers a wide range of scenes composed of multiple objects. Since the distance of an object to the camera is uniquely related to the color shifting vector produced by the MCA camera, the next step is to estimate the color shifting vector of the corresponding cluster.


Fig. 1. Definition of clusters and the corresponding rectangular regions. An object consists of one or more clusters.

For generating the support of phase correlation-based color channel shift estimation [12], [13], we first extend each cluster to the corresponding rectangular region, as shown in Fig. 1. Given the color shifting vectors, the corresponding cluster regions in all three color channels are registered and fused to generate a color-aligned image.

The color-aligned image produced by registration and fusion still contains spatially varying out-of-focus blur due to the different distances of objects. To realize spatially adaptive image restoration for multifocusing, a truncated constrained least-squares (TCLS) filter is proposed. More specifically, since the point-spread function (PSF) of the blur is related to the distance of an object from the depth of field or the plane of focus, a set of PSFs over a given range of distances is computed using the method proposed in [14], [15]. From the PSFs, a set of TCLS filters is then generated and stored. When an image is captured by the MCA camera, the best-matching TCLS filter coefficients are selected based on the depth information associated with the corresponding color shifting vectors. In order to remove undesired artifacts that result from the image acquisition and restoration processes, we propose a spatially adaptive artifact removal filter that performs spatially adaptive weighted mixing of the color channel-aligned image and the TCLS-restored image.

As previously noted, the MCA camera is also able to perform depth estimation from the color shifting vectors, which correspond to the direction and length of the misalignment between the RGB color channels. Experimental results show that the estimated depth is sufficiently accurate to perform a variety of vision-based tasks, such as image understanding, description, and robot vision.

II. BACKGROUND

This section reviews and summarizes prior work related to the proposed method.

Multiple aperture systems: The idea of using a multiple aperture lens has been explored using a micro lens array [16]. The image produced with a micro lens array is fundamentally inferior to that of a normal single-lens camera, since the resolution of each micro lens in the array is significantly limited by diffraction. A new computational imaging architecture has also been proposed for refocusing [17]. This architecture can acquire four images of the same scene, each taken with a different aperture setting. However, this approach has a complicated optical design that includes a relay optics tube, a four-way aperture-splitting mirror, four sets of

mirrors, and imaging lenses. This additional optical structure necessitates both geometric and radiometric calibration. In this context, the MCA system can be viewed as an evolutionary step toward single camera-based multifocusing and depth estimation without using multiple sensors or a sophisticated alignment of optical elements.

Extended depth of field: The wave-front coding system has an optical sub-system with a special modulation transfer function (MTF), which can be represented as a separable combination of one-dimensional (1D) ambiguity functions with no zero response in the frequency domain. Such a separable MTF can be restored by a simple inverse filter in the digital signal processing module. As a result, the wave-front coding system can provide a uniformly focused image [18]–[20]. An alternative approach has been proposed using chromatic aberrations [21], [22]. The spectral focal sweep camera uses intentionally maximized chromatic aberration of the lens to provide a small PSF independent of depth [22]. The primary advantage of this camera is that it does not require an expensive corrected lens. In spite of the common objective of multifocusing, the proposed method differs from extended depth of field (EDoF) systems in that: i) it uses a non-traditional aperture instead of a newly designed lens; ii) it provides additional depth information as well as multifocusing; and iii) it produces a variable PSF that depends on the depth of an object, while an EDoF system produces a sufficiently small PSF that is independent of depth, which results in a more favorable implementation and better image restoration. On the other hand, the MCA camera realizes single camera-based depth estimation, which is the major advantage over conventional stereo-based depth estimation.

Coded aperture: Several researchers have proposed specially designed coded apertures that allow the recovery of high-resolution image information and depth information from a single image. In a coded aperture imaging system, a specially patterned mask is placed on the aperture of the camera to change the frequency characteristics of the defocus blur. Levin et al. focused on the estimation of correct depth at the cost of restored image quality [23], whereas Raskar et al. designed a coded aperture with a spectrum that is as flat as possible, without zeros, for easy deblurring [24]. Veeraraghavan et al. used a narrowband coded aperture to recover full-resolution image information for the in-focus parts of the scene [25]. Single image-based depth estimation heavily relies on specific frequencies and image priors, which results in increased noise sensitivity, limited spatial resolution, and distance ambiguity. An advanced approach using asymmetric coded aperture pairs was proposed by Zhou et al. to overcome the limitations of a single coded aperture [26]. Their work presented an optimized pair of coded apertures that guarantees estimation of the relative defocus while preserving frequency content for image quality. The proposed work enables the MCA camera to estimate depth using relative shifts among color channels instead of the size of the blur pattern. Under the MCA framework, accurately estimated depth information guarantees the correct PSF, at the cost of additional color registration processes.


Depth-from-focus: Depth-from-focus uses the camera's focusing function for depth recovery by searching for the most in-focus (or sharpest) position in a sequence of images taken with various focusing parameters [27]–[29]. The advantage of this approach is that searching for the best focus position does not require a large amount of computation. On the other hand, it requires a large number of images for accurate depth estimation, which may be an issue in high-resolution imaging systems. A new image restoration framework based on depth-from-focus has been proposed for decoupling the effects of defocus and exposure using a sequence of images obtained by varying the aperture setting [30]. The proposed approach using the MCA camera differs from the depth-from-focus approach in that it estimates depth from relative shifts among color channels instead of the amount or size of defocus. While the accuracy of the depth-from-focus method relies on the PSF model and parameter estimation, that of the proposed method depends only on the direction and length of the color shifting vectors.

Color-filter apertures: Maik et al. originally proposed the color shift model using multiple color-filtered apertures for auto-focusing in [8]. Although they developed the basic concepts of the color shifting property of the MCA camera, their auto-focusing method has inherent limitations, such as the requirement of manual user strokes and the use of empirically tuned parameters for color channel registration and fusion. Furthermore, the resulting image still contains undesired out-of-focus blur. Improved versions include a digital multifocusing method that overcomes the limitations of Maik's approach under the assumption that the out-of-focus blur is spatially invariant [10], [11]. However, these methods do not consider that the degree of out-of-focus blur depends on the distance of the object from the camera. The use of multiple color-filtered apertures has also been proposed for depth estimation. For example, a camera with two color apertures [31] and a three color aperture camera [9] have been used to obtain depth maps from a single image. In the three color aperture system, a sixteen-level depth map is produced by exploiting the tendency of colors in natural images to form elongated clusters in the RGB color space. In spite of the number of systems that have been developed using color-filtered apertures, the design of a practical system that is simultaneously capable of spatially varying multifocusing and depth estimation has remained an open problem. The proposed MCA-based depth estimation method provides a generalized framework for simultaneous depth estimation and multifocusing at any level, from pixels to layers. On the other hand, since the proposed method is based on the disparity between RGB color channels, its performance requires that all three primary colors be present. Since real scenes contain at least two primary colors, a color boosting algorithm has been proposed for color-invariant feature description for the MCA camera in [32].

III. MULTIPLE COLOR-FILTER APERTURE CAMERA

An aperture of an optical system adjusts the amount of incoming light passing through the lens and determines the focal length and the depth of field of a camera.

Fig. 2. Two different single-aperture models. (a) Aperture is aligned on the optical axis of the camera. (b) Aperture is shifted away from the optical axis, which produces various convergence positions according to the distance of an object.

With a conventional camera, the center of the aperture is aligned with the optical axis of the lens, and the convergence pattern on the imaging sensor forms either a point or a circular region depending on the distance of the object from the lens, as shown in Fig. 2(a). However, if the center of the aperture is shifted away from the optical axis, the convergence position will be either above or below the optical axis, depending on whether an object is in the near-focus or far-focus plane, as illustrated in Fig. 2(b). More specifically, when the aperture is above the optical axis, the convergence position of an object at the near-focus plane will be shifted to a point above the optical center on the imaging plane, whereas the convergence position of an object at the far-focus plane will be shifted to a point below the optical center on the imaging plane. As a result, the distance of an object from the camera can be estimated from the location of the convergence pattern. To make this measurement feasible, a multiple aperture lens is required. Instead of a single shifted aperture, consider a lens with a pair of apertures covered with green and blue filters, respectively, as illustrated in Fig. 3(a). With this configuration, an object in a far-focus position will converge at two different locations in the G and B color channels, respectively. An object in a near-focus plane will also converge at two different locations, but in the reverse order. For example, the green aperture will form an image of a near-focused object above the optical center, while forming an image of a far-focused object below the optical center. As a result, the distance of an object from the camera may be estimated by finding the relative displacement of the images in the two color channels.



Fig. 3. (a) Two-aperture model with green and blue color filters. (b) Three-aperture model with RGB color filters.

Fig. 5. Real out-of-focus images acquired by (a) conventional and (b) MCA cameras. The original scene has continuously changing depths, where the left side is farthest and the right side is the closest.


Fig. 4. MCA camera and different focusing patterns depending on the distance between an object and the camera. (a) Digital single-lens reflex (DSLR) camera equipped with the MCA. (b) An object located at the near-focus plane produces misaligned color channels. (c) An object located at the in-focus plane produces an image on the sensor with all three color channels aligned (converged). (d) An object located at the far-focus plane produces color channels misaligned in the direction opposite to that of (b).

Since a full color image requires red, green, and blue color channels, the two-aperture model can be extended to a three-aperture model, as illustrated in Fig. 3(b). In a similar manner to the two-aperture model, the image of an object seen through the three colored apertures will converge at different locations on the imaging sensor, where the displacement of each color image depends on the distance of the object from the lens. We call this configuration the multiple color-filter aperture (MCA) model. In order to estimate the depth of an object and to perform multifocusing, the convergence patterns must be found in the red, green, and blue channels separately. Acquisition of these images can be accomplished using a single image sensor with a color filter array, such as the Bayer pattern, or by using three RGB imaging sensors [33]. Since most digital cameras use a color filter array, the use of RGB color filters in the MCA camera is reasonable. A camera equipped with the MCA is shown in Fig. 4(a), where the three RGB color filters form a regular (equilateral) triangle whose centroid is on the optical center. Three different cases, with near-focused, in-focus, and far-focused objects, are shown in Figs. 4(b)–(d), respectively.

Real images acquired by a conventional digital camera and by the MCA camera are shown in Figs. 5(a) and 5(b), respectively. Note that in the image captured by the MCA camera, color misalignment appears on the right- and left-hand sides of the image, corresponding to the relative displacement of object regions from the in-focus plane. With a conventional camera using a single-aperture lens and an RGB color filter array (CFA), an object outside the in-focus plane produces the same out-of-focus blur in each color channel without color misalignment. Because of the small, fixed size of the apertures in the MCA camera, the amount of out-of-focus blur is relatively small, which is an important property justifying the use of a simple finite impulse response (FIR) image restoration filter. The trade-off, however, is that with a small aperture the camera needs a longer shutter time, which may cause motion blur; removal of motion blur is outside the scope of this paper.

Based on the color shift model of the MCA camera and the assumption of sufficiently small apertures, the focusing problem turns into a registration and alignment problem between color channels. If an image represents a single-plane scene, space-invariant registration and alignment will focus the image. On the other hand, for an image containing multiple objects at different depths, spatially varying registration and alignment are necessary. The step-by-step description of the proposed multifocusing algorithm using the MCA camera is presented in the following section, and the depth estimation algorithm is presented in Section V.

IV. MULTIFOCUSING USING THE MCA CAMERA

The color image acquired by the MCA camera,

$$f_c(x, y), \quad \text{for } c \in \{R, G, B\}, \tag{1}$$

goes through a number of image processing blocks to obtain the all-focused image. The proposed image processing algorithms include: i) automatic segmentation of objects and the corresponding rectangular regions; ii) estimation of the color shifting vectors of the segmented regions among the three color channels; and iii) image restoration with additional noise smoothing, as shown in Fig. 6. The specific blocks and processing tasks are described in the following subsections, and depth estimation using the MCA camera is presented in Section V.
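As a minimal structural sketch (not the authors' implementation), the pipeline can be organized as follows; the three stage functions are passed in as callables, since each stage is developed in a later subsection, and all names are illustrative.

```python
import numpy as np

def multifocus(mca_rgb, segment, register, restore):
    """Skeleton of the three-stage pipeline of Fig. 6. `segment` maps the
    image to integer cluster labels (step i); `register` aligns the RGB
    channels of one rectangular region (step ii); `restore` applies the
    depth-matched TCLS filtering and artifact removal (step iii)."""
    labels = segment(mca_rgb)
    out = np.array(mca_rgb, dtype=float)
    for i in np.unique(labels):
        ys, xs = np.nonzero(labels == i)              # cluster C_i
        y0, y1, x0, x1 = ys.min(), ys.max(), xs.min(), xs.max()
        region = out[y0:y1 + 1, x0:x1 + 1]            # rectangle R_i
        result = restore(register(region))
        mask = labels[y0:y1 + 1, x0:x1 + 1] == i      # fuse clusters, cf. Eq. (12)
        region[mask] = result[mask]
    return out
```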



Fig. 7. Segmentation examples. (a) Input image $f_c(x, y)$. (b) Color-segmented clusters $C_i$ and uniformly partitioned blocks. (c) Blocked clusters. (d) Rectangular regions $R_i$ enclosing $C_i$.

Fig. 6. Proposed image processing algorithms, including segmentation, registration, and restoration for multifocusing and depth estimation using the MCA camera.

A. Automatic Segmentation of Objects With Different Depths

An image acquired by the MCA camera contains spatially varying misalignment among the RGB color channels, where the direction and length of the misalignment are uniquely determined by the distance of an object from the camera. Under the assumption that an appropriately segmented region has uniform distance, we first segment the MCA output image into multiple uniform-depth regions. Although various image segmentation algorithms have been proposed in the literature [34], [35], the proposed segmentation algorithm is developed in pursuit of: i) a color transformation that compensates for the potential loss of a specific color channel and for misaligned color edges in the RGB color space; ii) non-iterative color-based processing; and iii) description of both the real object shapes and the corresponding rectangular regions for efficient registration.

The input RGB color image is transformed into the CIE L*a*b* color space as

$$t_d(x, y) = T[f_c(x, y)], \quad \text{for } d \in \{L, a, b\}, \tag{2}$$

where the details of the RGB to L*a*b* color transformation can be found in [36], [37]. Due to color misalignment, the boundary of an object is located differently in the RGB color channels. The L*a*b* color space averages the misalignment and provides both perceptual linearity and uniformity in the segmented region.

The proposed segmentation method modifies the K-means clustering algorithm, starting with K automatically selected seeds obtained by the hill-climbing method in the three-dimensional (3D) histogram of the L*a*b* color space of the input image [38], [39]. The hill-climbing method selects a bin in the histogram and searches for its peak. This process repeats until all bins are climbed, provided $H_w > \theta_w$, where $H_w$ and $\theta_w$ respectively represent the width of the hill and a threshold value. On the other hand, if a hill is slim enough to contain five or fewer bins, we compare the average size of all hills, $H_{av}$, with the size of the $i$-th hill; if $H_i > H_{av}$ and $H_w < \theta_w$, the current hill is determined to be dominant. For the experiments, $\theta_w$ is set to 40. The identified peaks are used as the set of initial seeds for K-means clustering. The K-means algorithm then classifies the input image into $N_C$ clusters,

$$C_i, \quad \text{for } i = 1, \ldots, N_C, \tag{3}$$

where $C_i$ represents the set of pixels classified into the $i$-th cluster. After labeling the clusters, the segmented image is partitioned into multiple uniform blocks, which are then merged using 8-connectivity to produce the same number of blocked clusters. For the registration process described in the following subsection, the smallest rectangular region enclosing the corresponding blocked cluster is determined such that

$$R_i \supseteq C_i, \quad \text{for } i = 1, \ldots, N_C. \tag{4}$$

In other words, if $(x, y) \in C_i$, then $(x, y) \in R_i$. Examples of each step of the proposed segmentation algorithm are shown in Fig. 7, and a minimal sketch of the seeding-and-clustering step follows.
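The sketch below substitutes a simple local-maximum search over the 3D histogram for the full hill-climbing procedure of [38], [39]; the bin count and neighborhood size are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import maximum_filter
from scipy.cluster.vq import kmeans2
from skimage import color

def segment_mca_image(rgb, bins=16):
    """Cluster an image into N_C approximately uniform-depth regions:
    peaks of the 3-D L*a*b* histogram seed K-means (Eqs. (2)-(3))."""
    lab = color.rgb2lab(rgb).reshape(-1, 3)                  # Eq. (2)
    hist, edges = np.histogramdd(lab, bins=bins)
    # Bins that dominate their 3x3x3 neighborhood act as seeds (a crude
    # stand-in for the hill-climbing search of [38], [39]).
    peaks = (hist == maximum_filter(hist, size=3)) & (hist > 0)
    idx = np.nonzero(peaks)
    seeds = np.column_stack([0.5 * (e[i] + e[i + 1])
                             for e, i in zip(edges, idx)])
    _, labels = kmeans2(lab, seeds, minit='matrix')          # clusters C_i
    return labels.reshape(rgb.shape[:2])

def enclosing_rect(labels, i):
    """Smallest rectangle R_i enclosing cluster C_i (Eq. (4))."""
    ys, xs = np.nonzero(labels == i)
    return xs.min(), ys.min(), xs.max(), ys.max()
```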

B. Color Channel Registration

In all $N_C$ rectangular regions determined by the segmentation algorithm illustrated in Fig. 7, shifting vectors among the color channels must be estimated for both multifocusing and depth estimation, using edge-based phase correlation with the Canny edge detector [40].


TABLE I
ROOT MEAN-SQUARED ERROR VALUES OF THREE COLOR IMAGE REGISTRATION METHODS

Measure type                       Test 1   Test 2   Test 3   Test 4   Average
Input (unregistered)               16.42    15.19    19.02    20.37    17.75
Sum of squared difference [41]      9.41     9.01    12.12    10.67    10.30
Mutual information [42], [43]       5.53     5.12     6.46     7.33     6.11
Proposed                            3.68     3.01     5.94     5.18     4.45

Let $f_c^i$ be the part of the input image in $R_i$, defined by

$$f_c^i(x - x_i,\ y - y_i) = f_c(x, y), \quad \text{for } (x, y) \in R_i \text{ and } c \in \{R, G, B\}, \tag{5}$$

where $(x_i, y_i)$ represents the coordinate of the top-left corner of $R_i$. We consider $f_G^i$ as the reference color channel and estimate the color shifting vectors of each region in $f_R^i$ and $f_B^i$ from the reference channel. If we assume that $f_R^i$ is uniformly shifted from $f_G^i$ by $(\Delta x, \Delta y)$, then the following relationship is satisfied:

$$f_R^i(x, y) = f_G^i(x - \Delta x,\ y - \Delta y), \tag{6}$$

and the normalized cross power spectrum is given as

$$\frac{F_R^i(u, v)\, F_G^{i*}(u, v)}{\left| F_R^i(u, v)\, F_G^{i*}(u, v) \right|} = e^{-j 2\pi (u \Delta x + v \Delta y)}, \tag{7}$$

where $F_c^i(u, v)$ represents the discrete Fourier transform (DFT) of $f_c^i(x, y)$. If $\Delta x$ and $\Delta y$ are both integers, the inverse transform of (7) results in a shifted version of the Dirac delta function, $\delta(x - \Delta x,\ y - \Delta y)$, whose center indicates the translation vector. On the other hand, if the translation is fractional, it can be considered as an integer translation followed by downsampling, and Foroosh et al. [12] showed that the corresponding phase correlation is the inverse discrete Fourier transform (IDFT) of a downsampled cross power spectrum of a filtered unit impulse. If the filter has a rectangular frequency response, the phase correlation can be approximated as

$$r_{GR}^i(x, y) \approx \frac{\sin\!\big(\pi (Mx - \Delta x')\big)\, \sin\!\big(\pi (Ny - \Delta y')\big)}{\pi^2 (Mx - \Delta x')(Ny - \Delta y')}, \tag{8}$$

where $(\Delta x', \Delta y')$ represents the integer translation vector, and $M$ and $N$ respectively represent the downsampling factors in the horizontal and vertical directions. By finding the location of the peak of the continuous two-dimensional (2D) sinc function in (8), we can estimate the subpixel color shifting vector of region $R_i$ between the G and R channels as

$$(\Delta x_{GR}^i,\ \Delta y_{GR}^i) = (\Delta x' / M,\ \Delta y' / N). \tag{9}$$

The use of the phase correlation method is justified by the assumption that each rectangular region has a uniform translation among the color channels. We also compare the performance of the proposed method with existing color registration algorithms. The first comparison uses the joint histogram among color channels, which exhibits a slim or flat distribution if the registered color channels have high correlation, as shown in Fig. 8. The second comparison manually registers all color channels to obtain a ground-truth image, and then computes root mean-squared error values against the ground truth, as shown in Table I.

Fig. 8. Performance evaluation of three color registration algorithms. (a) Input image acquired by the MCA camera. (b) Registered image by the sum of squared difference method [41]. (c) Registered image using mutual information [42], [43]. (d) Registered image by the proposed edge-based phase correlation method. Note that a better registered color image has higher correlation among channels, which results in the slimmest distribution in the RGB color coordinates.

Fig. 9. Configuration of the three apertures (top: green, left: red, right: blue) with color shifting vectors and the ideal focusing point.

In the MCA camera, the locations of the three apertures form a regular triangle around the optical axis, as shown in Fig. 9. To register all three channels, we first align the R and B channels to the G channel, and then move the three aligned channels to the center of the triangle.
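For illustration, the following sketch implements the integer-precision core of (6), (7), and (9); the subpixel refinement through the sinc model of (8) and the Canny edge-map preprocessing are omitted for brevity.

```python
import numpy as np

def color_shift_vector(ref, mov):
    """Shift of channel `mov` relative to `ref` via phase correlation.
    With mov = f_R^i and ref = f_G^i, the peak of the inverse DFT of the
    normalized cross power spectrum (Eq. (7)) sits at (dx, dy)."""
    F_ref = np.fft.fft2(ref)
    F_mov = np.fft.fft2(mov)
    cross = F_mov * np.conj(F_ref)
    cross /= np.abs(cross) + 1e-12            # normalize, Eq. (7)
    corr = np.real(np.fft.ifft2(cross))       # approx. Dirac delta at (dx, dy)
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    if dy > ref.shape[0] // 2:                # unwrap negative shifts
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return dx, dy
```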


Fig. 11. Finally registered image. (a) Input image acquired by the MCA camera. (b) Registered image with reduced artifacts.

Fig. 10. Registration result and artifacts. (a) Registration process for registered cluster regions $g_c^i(x, y)$ using (11). (b) Lost pixels (top) and an unnatural border (bottom).

Given $(\Delta x_{GR}^i,\ \Delta y_{GR}^i)$ in (9), the corresponding compensation vectors of the RGB channels for the ideal focusing point are respectively determined as

$$(\Delta x_R^i,\ \Delta y_R^i) = -(\Delta x_{GR}^i,\ \Delta y_{GR}^i) + \left(0,\ \tfrac{2}{3}\Delta y_{GR}^i\right) = \left(-\Delta x_{GR}^i,\ -\tfrac{1}{3}\Delta y_{GR}^i\right),$$

$$(\Delta x_B^i,\ \Delta y_B^i) = -(-\Delta x_{GR}^i,\ \Delta y_{GR}^i) + \left(0,\ \tfrac{2}{3}\Delta y_{GR}^i\right) = \left(\Delta x_{GR}^i,\ -\tfrac{1}{3}\Delta y_{GR}^i\right),$$

$$(\Delta x_G^i,\ \Delta y_G^i) = \left(0,\ \tfrac{2}{3}\Delta y_{GR}^i\right). \tag{10}$$

Although the color shifting vectors are estimated from the rectangular regions $R_i$, registration is performed on the clusters $C_i$ defined in (3). The $i$-th cluster region can be considered as the intersection of $f_c^i$ and $C_i$, expressed as $f_c^i(x, y) \cap C_i$. Therefore, the registered cluster regions are expressed as

$$g_c^i(x, y) = f_c^i(x - \Delta x_c^i,\ y - \Delta y_c^i) \cap C_i, \quad \text{for } c \in \{R, G, B\}. \tag{11}$$

Fig. 10 shows registered cluster regions containing registration artifacts such as lost pixels and unnatural borders. To reduce these artifacts, a Gaussian smoothing filter is applied, and the finally registered image is obtained by fusing the $N_C$ cluster regions as

$$g_c(x, y) = \bigcup_{i = 1, \ldots, N_C} g_c^i(x, y), \tag{12}$$

and a sample result is shown in Fig. 11.

C. Image Restoration

In this section we present an efficient, real-time image restoration algorithm and an additional artifact removal method. The registered image given in (12) still contains undesired out-of-focus blur. In addition, the image is further degraded by factors caused by the finite-size apertures and the lateral displacement of the color-filtered apertures around the optical axis, which result in low exposure, color mixing, deviation of color convergence, and divergence of light rays, to name a few. To overcome these problems under a real-time processing framework, we present a complete solution for digital multifocusing incorporated into the MCA camera in the form of an FIR filter structure.

1) TCLS Image Restoration Filter: Since there is a one-to-one relationship between the PSF and the distance of the corresponding cluster, we can generate a priori a set of PSFs and the corresponding TCLS filter coefficients based on the estimated color shifting vector in each region. Generation of the PSF and TCLS filter datasets consists of three steps: (i) generation of a one-dimensional (1D) step response using the edge pattern of a registered cluster region, (ii) estimation of the 2D PSF from the generated 1D step response, and (iii) generation of the TCLS filter coefficients from the estimated PSF. We adopt the PSF estimation method proposed in [14], [15] under the assumptions that out-of-focus blur results in an isotropic PSF and that the input image contains an ideal step edge at the linear boundary of an object. Based on the estimated PSF for every distance in the practical focusing range, the corresponding TCLS filter coefficients are computed a priori and stored for later use.

The proposed TCLS filter is derived from the classical constrained least-squares (CLS) filter, with a practical truncation applied to realize an FIR filter structure. Given the PSF of each color channel, $h_c(i, j)$ for $c \in \{R, G, B\}$, or equivalently its frequency response $H_c(u, v)$, the frequency response of the CLS filter is given as [36]

$$R_c(u, v) = \frac{H_c^*(u, v)}{|H_c(u, v)|^2 + \lambda\, |P(u, v)|^2}, \tag{13}$$

where $P(u, v)$ represents a high-pass filter and $\lambda$ is the regularization parameter that controls the relative amounts of data fidelity and smoothness. Frequency-domain CLS filtering can be considered as a spatial-domain filter whose support is the entire image, ignoring the boundary problem. The spatial-domain filter can be expressed as the inverse DFT of $R_c(u, v)$,

$$r_c(x, y) = \mathcal{F}^{-1}[R_c(u, v)], \tag{14}$$

where $\mathcal{F}^{-1}[\cdot]$ represents the inverse DFT operation. Since a filter with very large support is impractical, and since significant energy is concentrated in the central part of the coefficients, a reduced, finite-support approximation of the CLS filter can be obtained by appropriate truncation.


Fig. 12. Relationship between color shifting vectors, PSFs, and the TCLS filter coefficients. (a) Direction and size of color shifting vectors. (b) Corresponding PSFs of the three color channels. (c) Corresponding TCLS filter coefficients for the R channel.

A simple approach to obtain such a filter is to multiply the full-support CLS filter coefficients by a Gaussian function of appropriate standard deviation $\sigma$ and truncate the result outside $3\sigma$. While the Gaussian function provides a stable truncated restoration filter, it tends to attenuate the filter coefficients excessively because of its exponential decay, and such excessive attenuation results in insufficient restoration of image detail. Based on empirical optimization for both performance and efficiency, we adopted the raised-cosine function instead. Let the filter size be $2m + 1$; the truncated filter using the raised-cosine window is then given as

$$r_c^T(x, y) = \begin{cases} r_c(x, y) \cos\!\left(\dfrac{\pi x}{2m}\right) \cos\!\left(\dfrac{\pi y}{2m}\right), & |x|, |y| \le m \\[4pt] 0, & \text{otherwise.} \end{cases} \tag{15}$$

While the raised-cosine window minimizes the attenuation of the filter coefficients, the resulting filter may become less stable, so special care is needed in selecting the filter size. In this paper we designed a TCLS filter of size 13×13, but in general the size of the raised-cosine function is determined by the size of the PSF. This window tends to keep only the coefficients in the primary and secondary lobes of the CLS filter while attenuating or removing the remaining coefficients in the third and higher lobes. A set of examples including the direction and size of color shifting vectors, the corresponding PSFs, and the corresponding 15×15 TCLS filter coefficients is shown in Fig. 12. The cluster-adaptively restored image using the TCLS filter is obtained as

$$f_c^T(x, y) = g_c(x, y) \ast r_c^T(x, y), \tag{16}$$

where $\ast$ denotes two-dimensional convolution.
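A sketch of the construction (13)–(15) is given below, assuming a 3×3 Laplacian for the high-pass constraint $P(u, v)$; the DFT size, regularization weight, and half-size m are illustrative parameters.

```python
import numpy as np

def tcls_filter(psf, shape=(64, 64), lam=0.01, m=6):
    """Truncated constrained least-squares kernel (Eqs. (13)-(15)):
    CLS frequency response -> spatial kernel -> raised-cosine truncation
    to a (2m+1) x (2m+1) FIR filter (13 x 13 for m = 6, as in the paper)."""
    H = np.fft.fft2(psf, shape)
    lap = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float)
    P = np.fft.fft2(lap, shape)
    R = np.conj(H) / (np.abs(H) ** 2 + lam * np.abs(P) ** 2)   # Eq. (13)
    r = np.fft.fftshift(np.real(np.fft.ifft2(R)))              # Eq. (14)
    cy, cx = shape[0] // 2, shape[1] // 2
    r = r[cy - m:cy + m + 1, cx - m:cx + m + 1]                # truncate
    y, x = np.mgrid[-m:m + 1, -m:m + 1]
    window = np.cos(np.pi * x / (2 * m)) * np.cos(np.pi * y / (2 * m))
    return r * window                                          # Eq. (15)
```

In the proposed system, one such kernel would be precomputed for each distance in the practical focusing range and selected at run time from the estimated color shifting vector.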

2) Spatially Adaptive Artifacts Removal: Although most image restoration methods can successfully remove degradation factors in the presence of a certain amount of noise, the restored image still contains undesired artifacts, including ringing around strong edges, boundary artifacts due to discontinuity, and noise amplification or clustering in flat regions. Spatially adaptive image restoration is the most widely used approach for removing such artifacts [44]. Spatially adaptive processing, however, breaks the space-invariance assumption of the restoration filter and makes frequency-domain implementation impossible. As a result, most adaptive image restoration algorithms take an iterative framework [45], which rules out real-time restoration.

To obtain acceptable restored image quality in real time, we propose a compact, efficient implementation of a spatially adaptive mixing method that takes detailed high-frequency regions from the restored image and flat regions from the color-registered image. The weighted mixture of the two images is expressed as

$$\hat{f}_c(x, y) = \alpha(x, y)\, g_c(x, y) + \big(1 - \alpha(x, y)\big)\, f_c^T(x, y), \quad 0 \le \alpha \le 1, \tag{17}$$

where $g_c$ represents the color channel registered image, $f_c^T$ the image restored by the TCLS filter, and $\alpha(x, y)$ the spatially varying weighting factor, which is determined as [45]

$$\alpha(x, y) = \frac{1}{1 + \kappa\, v(x, y)}, \tag{18}$$

where $v(x, y)$ represents the local variance in the neighborhood of $(x, y)$, and $\kappa$ is chosen so that $\alpha$ is distributed as uniformly as possible in $[0, 1]$; a typical value of $\kappa$, chosen empirically, is approximately 1000. The alpha map images produced according to (18) are shown in Fig. 13(e) and (f). As a result, the finally restored images have sharply restored edges with suppressed artifacts in flat regions, as shown in Fig. 13(g) and (h). Although a bilateral filter can also reduce noise while preserving sharp edges, it cannot restore sharp details in the image. On the other hand, the proposed spatially adaptive artifact removal filter, together with the TCLS filter, can enhance the sharpness of the image.

The size of a PSF produced by the MCA camera is very small because of the finite, small apertures. To show that the proposed restoration method, including the TCLS restoration filter and adaptive mixing, is suitable for MCA camera-based multifocusing, we compared its performance with well-known algorithms, namely the Wiener filter, ForWaRD [46], SADCT [47], and TwIST [48], using the 512×512 Lena image with simulated blur and additive white Gaussian noise. The performance of the image restoration methods is summarized in Table II in terms of peak signal-to-noise ratio (PSNR). The proposed method definitely outperforms the classical Wiener filter and provides performance similar to ForWaRD. Although the SADCT and TwIST methods give higher PSNR than the proposed method, they cannot be used for real-time restoration in MCA camera-based multifocusing because of their image transformations and iterative structure.
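The mixing step itself is only a few lines; the sketch below assumes single-channel float images and a local-variance window size of 7, which is not specified in the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_mix(g, f_tcls, kappa=1000.0, win=7):
    """Spatially adaptive artifact removal (Eqs. (17)-(18)). Flat areas
    (low local variance) take the registered image g; detailed areas take
    the TCLS-restored image f_tcls. Applied per color channel."""
    g = np.asarray(g, float)
    f_tcls = np.asarray(f_tcls, float)
    mean = uniform_filter(g, size=win)
    var = uniform_filter(g * g, size=win) - mean ** 2   # local variance v(x, y)
    alpha = 1.0 / (1.0 + kappa * var)                   # Eq. (18)
    return alpha * g + (1.0 - alpha) * f_tcls           # Eq. (17)
```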


TABLE II
PSNR VALUES OF FOUR DIFFERENT RESTORATION ALGORITHMS AND THE PROPOSED METHOD (dB)

Blur size  Restoration type   SNR 20 dB  SNR 30 dB  SNR 40 dB  SNR 50 dB
3×3        Wiener             10.923     16.667     23.582     25.923
           ForWaRD            20.923     23.120     28.452     31.978
           SADCT              31.012     32.829     34.232     34.903
           TwIST              31.320     32.892     35.123     35.821
           Proposed           15.123     20.023     28.001     30.921
5×5        Wiener              9.434     16.523     22.923     24.4321
           ForWaRD            20.923     23.120     26.324     31.342
           SADCT              29.350     32.321     33.433     34.679
           TwIST              30.770     32.182     34.496     35.712
           Proposed           14.130     19.034     26.132     29.821
7×7        Wiener              9.023     15.332     21.232     23.172
           ForWaRD            19.532     21.423     25.031     30.132
           SADCT              28.172     30.423     31.531     33.002
           TwIST              29.123     31.182     32.183     34.218
           Proposed           13.321     17.923     22.487     26.134


Fig. 14. Sample MCA image and the distribution of color shifting vectors. (a) MCA image focused on the mid-depth object (the eye region of the green, bottom-right fish). (b) Distribution of color shifting vectors (vectors corresponding to the far-focus region point upward, and vice versa).


Fig. 13. Image restoration for a real defocused image and the MCA output image. (a) and (b) Real photographically defocused image acquired by a DSLR camera and the color channel registered image, respectively. (c) and (d) Restored images by the TCLS filter. (e) and (f) Alpha map images. (g) and (h) Adaptively mixed images using (17).

V. DEPTH ESTIMATION

The recovery of depth information of a scene from 2D images is a fundamental problem in many image processing and computer vision applications, such as object recognition, scene interpretation, 3D reconstruction, and multifocusing. Although various depth estimation algorithms have been proposed, accurate estimation of depth with a moderate amount of computation remains a challenging problem. Recently, computational cameras that combine special optics with signal processing algorithms have been developed to perform a variety of special functions, including depth estimation [9], [23]. In this section, we present a novel approach to depth estimation using the color shift model-based MCA camera, which was originally proposed in [8] and is, to the best of our knowledge, the first use of multiple color-filtered apertures in the literature.

In the previous section, we estimated only one color shifting vector for each region $R_i$, namely the green-to-red vector $(\Delta x_{GR}^i, \Delta y_{GR}^i)$, and all three color channels were aligned using the relationship given in (10). In this section, by contrast, we estimate two color shifting vectors,

$$(\Delta x_{Gc}^i,\ \Delta y_{Gc}^i), \quad \text{for } c \in \{R, B\}, \tag{19}$$

in order to improve the accuracy of the depth estimate. The color shifting vectors of an MCA camera image provide the depth information of a cluster $C_i$ given in (3), and are estimated from the corresponding rectangular region $R_i$ given in (4). More specifically, the farther a cluster moves away from the focusing plane, the longer the vectors become. The three vectors, green-to-red (GR), green-to-blue (GB), and red-to-blue (RB), form a regular triangle. These properties of the color shifting vectors are used to robustly estimate the distance of a cluster from the camera, as sketched below.
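The per-region lookup can be sketched as follows; the calibrated distance map of Fig. 16 (discussed below) is modeled as sorted arrays mapping squared vector length to distance for the current focal setting and focus side (near or far, chosen from the sign of the vectors' y-components), and the rejection tolerance is an illustrative assumption.

```python
import numpy as np

def estimate_depth(v_gr, v_gb, lut_len2, lut_dist, tol=0.25):
    """Region depth from the GR and GB shifting vectors (Eq. (19)).
    lut_len2 / lut_dist: calibrated squared-length -> distance samples
    (sorted by length) for the current focal setting and focus side."""
    l_gr = float(np.dot(v_gr, v_gr))                 # squared lengths
    l_gb = float(np.dot(v_gb, v_gb))
    # Outlier rejection: |GR| and |GB| must agree (regular-triangle property).
    if abs(l_gr - l_gb) > tol * max(l_gr, l_gb, 1e-9):
        return None
    d_gr = np.interp(l_gr, lut_len2, lut_dist)       # GR map lookup (Fig. 16)
    d_gb = np.interp(l_gb, lut_len2, lut_dist)       # GB map lookup
    return 0.5 * (d_gr + d_gb)                       # average of the two maps
```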



Fig. 15. (a) Test pattern. (b) Four MCAs with different baselines and the corresponding acquired images.


Fig. 17. Proposed depth estimation process. (a) Rectangular regions in the image. (b) Color coded clusters enclosed by the corresponding regions. (c) Distribution of color shifting vectors for the corresponding clusters. (d) Estimated distances of six regions. (e) Result of depth estimation in conjunction with the proposed cluster-based segmentation result.

Fig. 16. Relationship between the squared length of color shifting vectors and the corresponding distance using four MCAs of baseline length 14, 18, 24, and 28 millimeters. (a) Color shifting vectors of GR channels at 1.0 meters or closer. (b) Color shifting vectors of GR channels at 1.0 meters or farther. (c) Color shifting vectors of GB channels at 1.0 meters or closer. (d) Color shifting vectors of GB channels at 1.0 meters or farther.

Real estimated vectors are shown in Fig. 14, which illustrates the fundamental property of the color shifting vector distribution. As shown in Fig. 14(b), the length of a vector is proportional to the distance from the focusing plane, and a region on the near-focus plane produces vectors whose direction is opposite to that of a region on the far-focus plane. Since the lengths of the GR and GB color shifting vectors at the same distance are theoretically equal, an undesired variation (outlier) is rejected based on the squared length difference between the GR and GB color shifting vectors. As implied by the MCA model shown in Fig. 4, the length of a color shifting vector is also related to the displacement among the RGB color-filter apertures. In order to estimate the distance $D_i$ of a region, we use a test pattern and four MCAs with different baseline lengths of 14, 18, 24, and 28 millimeters, as shown in Fig. 15. Four distance maps, generated using the test pattern at distances ranging from 0.4 m to 3.6 m in 0.02 m intervals, are shown in Fig. 16. The proposed depth estimation function can be used in depth-based vision applications such as visual surveillance, to track objects with depth information. The distance of a region is computed by averaging the distances obtained from the GR and GB maps. Note that we generate a priori the distance map corresponding to each focal setting of the camera, because the distance map depends on the distance of an in-focus object. Furthermore, since the PSF is fixed for a specific distance, we can also generate the PSF using the distance map, as mentioned in the previous section. Fig. 17 shows step-by-step results of the proposed depth estimation workflow. Figs. 17(a) and (b) respectively show the regions $R_i$ and the corresponding color-coded clusters $C_i$. Fig. 17(c) shows the color shifting vector distribution grouped by region, where the upper and lower dotted circles respectively contain far- and near-focused clusters.



Fig. 18. Experimental setup. (a) MCA camera with RGB color-filtered apertures. (b) Its internal configuration.

TABLE III
HARDWARE SPECIFICATIONS OF THE MCA CAMERA

Camera type      Digital single-lens reflex (DSLR)
R, G, B filters  Green: Kodak Wratten filter No. 58;
                 Blue: Kodak Wratten filter No. 47;
                 Red: Kodak Wratten filter No. 25
Focusing         APO-Symmar-L-150-5.6, 11, 22 (f-5.6, f-11, f-22)
Sensor           23.7 × 15.6 mm RGB CCD, 6.31 million total pixels
Lens mounting    Schneider Apo-Tele-Xenar, relative aperture/focal length 5.6/250
Shutter speed    30 to 1/4,000 sec and Bulb
Color mode       Triple mode for R, G, and B channels

The distance of each cluster is estimated from the pair of color shifting vectors of the same size, using the distance map shown in Fig. 16. The corresponding regions enclosing clusters with estimated distances are shown in Fig. 17(d). The result of depth estimation in conjunction with the proposed cluster-based image segmentation is shown in Fig. 17(e).

VI. EXPERIMENTAL RESULTS

In this section, we present experimental results of the proposed algorithm for multifocusing and depth estimation using the MCA camera. We used test images of size 1504×1000 acquired by the MCA camera, which is equipped with a lens of focal length 150 mm and three RGB color-filtered apertures 14 mm apart from each other. We first describe the experimental setup, and then provide step-by-step results of the proposed algorithm compared with existing algorithms.

A. Experimental Setup

The experimental setup of the MCA camera equipped with RGB color-filtered apertures and its internal configuration are shown in Fig. 18. The hardware specifications of the MCA camera are listed in Table III.

B. Multifocusing Using the MCA Camera

Multifocusing is accomplished by three-step post-processing of the MCA output image: (i) cluster-based image segmentation, (ii) color shift model-based registration and fusion, and (iii) image restoration, as shown in Fig. 6.


Fig. 19. Image processing results for digital multifocusing of the MCA output image. (a) Input indoor image acquired by the MCA camera. (b) Clustered image enclosed by the corresponding rectangular regions. (c) Color channel registered image. (d) Restored image by the TCLS filter. (e) Alpha map image for spatially adaptive artifact removal. (f) Finally restored image using the spatially adaptive artifact removal filter.

Fig. 19(a) shows the MCA output image, which contains multiple objects at different distances. Its segmented clusters, enclosed by the corresponding rectangular regions, are shown in Fig. 19(b), where all the pixels in a region have approximately the same color shifting vector. Note that there is a one-to-one relationship between the length of a color shifting vector and the corresponding distance of a cluster. Fig. 19(c) shows the registered image, where the color misalignment is corrected but out-of-focus blur still exists. To overcome this problem, the proposed TCLS image restoration filter is used with the alpha map-based spatially adaptive artifact removal filter. The resulting restored and artifact-removed images are respectively shown in Figs. 19(d) and (f). The same experiment was performed on an outdoor image, with the results shown in Fig. 20. As previously discussed with the performance comparison given in Table II, the combination of the TCLS and spatially adaptive artifact removal filters is the preferred choice for real-time multifocusing of images acquired by the MCA camera, on the condition that the size of the PSF for a color channel registered region is relatively small.



Fig. 20. Image processing results for digital multifocusing of the MCA image. (a) Input outdoor image acquired by the MCA camera. (b) Clustered image enclosed by the corresponding rectangular regions. (c) Color channel registered image. (d) Finally restored image using the TCLS filter followed by the alpha map-based spatially adaptive artifact removal filter.

TABLE IV
CPU PROCESSING TIME IN SECONDS FOR THE PROPOSED AND EXISTING RESTORATION METHODS (INPUT IMAGE: 1504×1000)

Denoising type   σ = 20     σ = 30     σ = 40     σ = 50     Average
ForWaRD           132.98     133.92     132.12     132.23     132.81
SADCT            1830.37    1831.23    1832.23    1831.32    1831.28
TwIST             108.32     108.33     109.82     109.85     109.08
Proposed            2.13       1.91       2.19       2.14       2.09

ForWaRD           134.92     133.32     135.11     135.32     134.66
SADCT            1826.99    1831.00    1829.32    1828.12    1828.86
TwIST             110.13     110.32     110.11     110.99     110.38
Proposed            1.91       2.10       2.32       2.99       2.33

ForWaRD           134.23     133.99     132.90     133.00     133.53
SADCT            1830.13    1830.33    1830.32    1831.32    1830.52
TwIST             110.32     110.12     110.11     110.53     110.27
Proposed            2.02       2.13       1.99       2.83       2.24

ForWaRD           134.99     135.23     134.21     134.21     134.66
SADCT            1829.62    1830.32    1830.79    1830.78    1830.37
TwIST             109.12     109.00     109.21     109.99     109.33
Proposed            1.82       1.96       2.03       2.12       1.98

Fig. 21. Comparison of five different restoration algorithms using the color channel registered image of Fig. 19(c). (a) Lower right part of the registered image. (b) Restored image by the Wiener filter. (c) Restored image by ForWaRD. (d) Restored image by SADCT. (e) Restored image by TwIST. (f) Restored image by the proposed method. (g) Vertical edge profile (G channel) of the 27th line in the region marked by the yellow rectangle.

In addition, we compare the performance of the proposed algorithm with the image restoration algorithms listed in Table II using the color registered image shown in Fig. 19(c). As shown in Fig. 21, all restoration methods give similar, comparable results, whereas only the proposed method can be implemented in the form of a spatially variant FIR filter. Although the result of the Wiener filter shown in Fig. 21(b) may change with different selections of the signal-to-noise power ratio, we provide the visually best result. Another comparison of the four image restoration methods, in terms of CPU processing time, is summarized in Table IV.

C. Depth Estimation

The MCA camera enables the explicit estimation of depth information using the direction and length of the estimated color shifting vectors in each region. Figs. 22(a) and (b) respectively show the color shifting vector distributions for the images of Figs. 19 and 20.


TABLE V
ERRORS IN DEPTH ESTIMATION FOR FIG. 22(c) AND (d)

Region   Estimated distance   True distance    Error
1        4.8 m / 18 m         5.2 m / 19.5 m   0.3 m / 1.5 m
2        6 m / 26 m           6.7 m / 29.7 m   0.7 m / 3.7 m
3        7.3 m / 41 m         8 m / 49.7 m     1.3 m / 8.7 m

(Values are given as Fig. 22(c) / Fig. 22(d).)


Fig. 23. Comparison of depth estimation of three different algorithms. (a) Input image acquired by the MCA camera. (b) Result of depth estimation using the single-eye range method. (c) Result of depth estimation using the MI-based method. (d) Result of the proposed depth estimation algorithm.


Fig. 22. Depth estimation using the proposed method. (a) and (b) Color shifting vector distributions for two images shown in Figs. 19 and 20, respectively. (c) and (d) Estimated distances in three regions. (e) and (f) Final estimated depth images in conjunction with the proposed cluster-based segmentation result.

We used the distance maps generated by the camera when set to focus on an object 5.8 m away for Fig. 19 and 21.4 m away for Fig. 20. The corresponding rectangular regions of clusters and their distances are shown in Figs. 22(c) and (d). The results of depth estimation in conjunction with the proposed image segmentation are shown in Figs. 22(e) and (f). Table V shows the estimation errors of the distances of three regions in Figs. 22(c) and (d). As shown in the table, the accuracy of the distance estimation using the distance map is approximately 90% on average.

The performance of the proposed depth estimation algorithm is evaluated by comparing it with the single-eye range method relying on high-pass filtering [31] and the mutual information (MI)-based method coupled with iterative graph-cut optimization [49]. As shown in Fig. 23(b), the single-eye range method mostly fails to estimate the depth of the defocused scene background. The result of the MI-based method is shown in Fig. 23(c).

Since the correspondence measure of the MI-based method is defined for two images, we take the average of the measure over the three pairs of RGB channels (GR, GB, and BR). Although the MI-based method provides acceptable performance, in the sense that it does not assume a priori knowledge of the intensity relationships, some portions of the foreground are lost. On the other hand, the proposed depth estimation method produces the best result, as shown in Fig. 23(d).

VII. CONCLUSION

In this paper, we presented a novel approach to multifocusing and depth estimation using the MCA camera. An image acquired by the MCA camera contains spatially varying misalignment among the RGB color channels, where the direction and length of the misalignment are uniquely determined by the distance of an object from the camera. For multifocusing the MCA output image, we have proposed a set of image processing algorithms including: i) cluster-based segmentation and generation of the corresponding rectangular regions; ii) color shifting vector estimation for registration and cluster-based fusion; and iii) image restoration using the TCLS filter and spatially adaptive artifact removal. Furthermore, as a byproduct of the MCA camera-based multifocusing process, we can estimate the depth of each region using the direction and length of the corresponding pair of color shifting vectors.


The estimated depth information may be used in various depth-based vision applications. The extended set of experimental results confirms that the proposed set of image processing algorithms enables the MCA camera to significantly enhance the visual quality of an image containing multiple objects at different distances, and can be fully or partially incorporated into multifocusing or EDoF systems in the form of an FIR filter structure. Experimental results also show that the estimated depth is accurate enough to perform a variety of vision-based tasks, such as image understanding, description, and robotics.

REFERENCES
[1] E. Adelson and J. Wang, "Single lens stereo with a plenoptic camera," IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 2, pp. 99–106, Feb. 1992.
[2] R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, "Light field photography with a hand-held plenoptic camera," Dept. Comput. Sci., Stanford Univ., Stanford, CA, Tech. Rep. CTSR 2005-02, 2005.
[3] R. Ng, "Fourier slice photography," ACM Trans. Graph., vol. 24, no. 3, pp. 735–744, 2005.
[4] R. Horstmeyer, G. Euliss, R. Athale, and M. Levoy, "Flexible multimodal camera using a light field architecture," in Proc. IEEE Int. Conf. Comput. Photography, Apr. 2009, pp. 1–8.
[5] A. Mohan, X. Huang, J. Tumblin, and R. Raskar, "Sensing increased image resolution using aperture masks," in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., Jun. 2008, pp. 1–8.
[6] S. Nayar, "Computational cameras: Redefining the image," Computer, vol. 39, no. 8, pp. 30–38, 2006.
[7] T. Bishop, S. Zanetti, and P. Favaro, "Light field superresolution," in Proc. IEEE Int. Conf. Comput. Photography, Apr. 2009, pp. 1–9.
[8] V. Maik, D. Cho, J. Shin, D. Har, and J. Paik, "Color shift model-based segmentation and fusion for digital autofocusing," J. Imaging Sci. Technol., vol. 51, no. 4, pp. 368–379, 2007.
[9] Y. Bando, B. Chen, and T. Nishita, "Extracting depth and matte using a color-filtered aperture," ACM Trans. Graph., vol. 27, no. 5, pp. 134–142, 2008.
[10] S. Kim, E. Lee, V. Maik, and J. Paik, "Real-time image restoration for digital multifocusing in a multiple color-filter aperture camera," Opt. Eng., vol. 49, no. 4, p. 040502, Apr. 2010.
[11] E. Lee, W. Kang, S. Kim, and J. Paik, "Color shift model-based image enhancement for digital multifocusing based on a multiple color-filter aperture camera," IEEE Trans. Consumer Electron., vol. 56, no. 2, pp. 317–323, May 2010.
[12] H. Foroosh, J. Zerubia, and M. Berthod, "Extension of phase correlation to subpixel registration," IEEE Trans. Image Process., vol. 11, no. 3, pp. 188–200, Mar. 2002.
[13] M. Hagiwara, M. Abe, and M. Kawamata, "Estimation method of frame displacement for old films using phase-only correlation," J. Signal Process., vol. 8, no. 5, pp. 421–429, 2004.
[14] S. Kim, S. Park, and J. Paik, "Simultaneous out-of-focus blur estimation and restoration for digital auto-focusing system," IEEE Trans. Consumer Electron., vol. 44, no. 3, pp. 1071–1075, Aug. 1998.
[15] V. Maik, D. Cho, J. Shin, and J. Paik, "Regularized restoration using image fusion for digital auto-focusing," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 10, pp. 1360–1369, Oct. 2007.
[16] J. Tanida, R. Shogenji, Y. Kitamura, K. Yamada, M. Miyamoto, and S. Miyatake, "Color imaging with an integrated compound imaging system," Opt. Exp., vol. 11, no. 18, pp. 2109–2117, 2003.
[17] P. Green, W. Sun, W. Matusik, and F. Durand, "Multi-aperture photography," ACM Trans. Graph., vol. 26, no. 3, pp. 68–77, 2007.
[18] E. Dowski and G. Johnson, "Wavefront coding: A modern method of achieving high performance and/or low cost imaging systems," Proc. SPIE, vol. 3779, pp. 137–145, Oct. 1999.
[19] S. Bradburn, W. Cathey, and E. Dowski, "Realizations of focus invariance in optical-digital systems with wave-front coding," Appl. Opt., vol. 36, no. 35, pp. 9157–9166, 1997.
[20] E. Dowski and W. Cathey, "Extended depth of field through wave-front coding," Appl. Opt., vol. 34, no. 11, pp. 1859–1866, 1995.
[21] F. Guichard, H. Nguyen, R. Tessières, M. Pyanet, I. Tarchouna, and F. Cao, "Extended depth-of-field using sharpness transport across color channels," Proc. SPIE, vol. 7250, p. 72500N, 2009.

[22] O. Cossairt and S. Nayar, "Spectral focal sweep: Extended depth of field from chromatic aberrations," in Proc. IEEE Int. Conf. Comput. Photography, Mar. 2010, pp. 1–8.
[23] A. Levin, R. Fergus, F. Durand, and W. Freeman, "Image and depth from a conventional camera with a coded aperture," ACM Trans. Graph., vol. 26, no. 3, p. 70, 2007.
[24] R. Raskar, A. Agrawal, and J. Tumblin, "Coded exposure photography: Motion deblurring using fluttered shutter," ACM Trans. Graph., vol. 25, no. 3, pp. 795–804, 2006.
[25] A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan, and J. Tumblin, "Dappled photography: Mask enhanced cameras for heterodyned light fields and coded aperture refocusing," ACM Trans. Graph., vol. 26, no. 3, pp. 69–82, 2007.
[26] C. Zhou, S. Lin, and S. Nayar, "Coded aperture pairs for depth from defocus," in Proc. IEEE 12th Int. Conf. Comput. Vis., Oct. 2009, pp. 325–332.
[27] M. Subbarao and J. Tyan, "Selecting the optimal focus measure for autofocusing and depth-from-focus," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 8, pp. 864–870, Aug. 1998.
[28] M. Asif and T. Choi, "Shape from focus using multilayer feedforward neural networks," IEEE Trans. Image Process., vol. 10, no. 11, pp. 1670–1675, Nov. 2001.
[29] P. Favaro, A. Mennucci, and S. Soatto, "Observing shape from defocused images," Int. J. Comput. Vis., vol. 52, no. 1, pp. 25–43, 2003.
[30] S. Hasinoff and K. Kutulakos, "A layer-based restoration framework for variable-aperture photography," in Proc. IEEE 11th Int. Conf. Comput. Vis., Oct. 2007, pp. 1–8.
[31] Y. Amari and E. Adelson, "Single-eye range estimation by using displaced apertures with color filters," in Proc. Int. Conf. Ind. Electron., Control, Instrum., Autom., Power Electron. Motion Control, Nov. 1992, pp. 1588–1592.
[32] J. Lim and J. Paik, "Color-invariant three-dimensional feature descriptor for color-shift-model-based image processing," Opt. Eng., vol. 50, no. 11, p. 117005, Nov. 2011.
[33] J. Nakamura, Image Sensors and Signal Processing for Digital Still Cameras. Boca Raton, FL: CRC Press, 2006.
[34] Q. Yu and D. Clausi, "IRGS: Image segmentation using edge penalties and region growing," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 12, pp. 2126–2139, Dec. 2008.
[35] M. Farmer and A. Jain, "A wrapper-based approach to image segmentation and classification," IEEE Trans. Image Process., vol. 14, no. 12, pp. 2060–2072, Dec. 2005.
[36] R. Gonzalez and R. Woods, Digital Image Processing, 3rd ed. Englewood Cliffs, NJ: Prentice-Hall, 2007.
[37] A. Mavrinac, J. Wu, X. Chen, and K. Tepe, "Competitive learning techniques for color image segmentation," in Proc. Congr. Image Signal Process., vol. 3, 2008, pp. 644–649.
[38] Z. Aghbari and R. Al-Haj, "Hill-manipulation: An effective algorithm for color image segmentation," Image Vis. Comput., vol. 24, no. 8, pp. 894–903, 2006.
[39] T. Ohashi, Z. Aghbari, and A. Makinouchi, "Hill-climbing algorithm for efficient color-based image segmentation," in Signal Processing, Pattern Recognition, and Applications. Phuket, Thailand: ACTA Press, 2003.
[40] J. Canny, "A computational approach to edge detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 8, no. 6, pp. 679–698, Nov. 1986.
[41] M. Shimizu, S. Yoshimura, M. Tanaka, and M. Okutomi, "Super-resolution from image sequence under influence of hot-air optical turbulence," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2008, pp. 1–8.
[42] A. Collignon, F. Maes, D. Delaere, D. Vandermeulen, P. Suetens, and G. Marchal, "Automated multi-modality image registration based on information theory," Inf. Process. Med. Imaging, vol. 14, no. 6, pp. 263–274, 1995.
[43] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximization of mutual information," IEEE Trans. Med. Imaging, vol. 16, no. 2, pp. 187–198, Apr. 1997.
[44] B. Zhang and J. Allebach, "Adaptive bilateral filter for sharpness enhancement and noise removal," IEEE Trans. Image Process., vol. 17, no. 5, pp. 664–678, May 2008.
[45] A. Katsaggelos, "Iterative image restoration algorithms," Opt. Eng., vol. 28, no. 7, pp. 735–748, Jul. 1989.
[46] R. Neelamani, H. Choi, and R. Baraniuk, "ForWaRD: Fourier-wavelet regularized deconvolution for ill-conditioned systems," IEEE Trans. Signal Process., vol. 52, no. 2, pp. 418–433, Feb. 2004.
[47] A. Foi, V. Katkovnik, and K. Egiazarian, "Pointwise shape-adaptive DCT for high-quality denoising and deblocking of grayscale and color images," IEEE Trans. Image Process., vol. 16, no. 5, pp. 1395–1411, May 2007.
[48] J. Bioucas-Dias and M. Figueiredo, "A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration," IEEE Trans. Image Process., vol. 16, no. 12, pp. 2992–3004, Dec. 2007.
[49] J. Kim, V. Kolmogorov, and R. Zabih, "Visual correspondence using energy minimization and mutual information," in Proc. 9th IEEE Int. Conf. Comput. Vis., Oct. 2003, pp. 1033–1040.

Sangjin Kim (S'09) was born in Seoul, Korea, in 1978. He received the B.S. degree in electronic engineering from Kangnam University, Korea, in 2003, and the M.S. and Ph.D. degrees in image engineering from Chung-Ang University, Seoul, Korea, in 2005 and 2009, respectively. He is currently a Post-Doctoral Researcher with the Image Processing and Intelligent Systems Laboratory, Chung-Ang University. His current research interests include image restoration, computational cameras, and real-time object tracking.

Eunsung Lee (S'09) was born in Seoul, Korea, in 1982. He received the B.S. degree in electronic engineering and the M.S. degree in image engineering from Chung-Ang University, Seoul, Korea, in 2009 and 2011, respectively, where he is currently pursuing the Ph.D. degree in image processing. His current research interests include image restoration, computational cameras, and image enhancement.

Monson H. Hayes (F'92) received the B.S. degree from the University of California, Berkeley, in 1971, and the Sc.D. degree in electrical engineering and computer science from the Massachusetts Institute of Technology, Cambridge, in 1981. He was a Systems Engineer with Aerojet Electrosystems until 1974. He then joined the faculty of the Georgia Institute of Technology (Georgia Tech), Atlanta, where he was a Professor of electrical and computer engineering. Currently, he is serving as an Associate Chair with the School of Electrical and Computer Engineering, Georgia Tech, and as an Associate Director for Georgia Tech Savannah. Since joining the faculty at Georgia Tech, he has become internationally recognized for his contributions to the fields of digital signal processing, image and video processing, and engineering education. He is currently a Distinguished Foreign Professor with the Graduate School of Advanced Imaging Science, Multimedia, and Film, Chung-Ang University, Seoul, Korea. He has published over 150 papers and is the author of two textbooks. His current research interests include face and gesture recognition, image and video processing, adaptive signal processing, and engineering education. Dr. Hayes was a recipient of the Presidential Young Investigator Award and the IEEE Senior Award. He has received numerous awards and distinctions from professional societies and from Georgia Tech. He has served the IEEE Signal Processing Society in numerous positions, including Chairman of the DSP Technical Committee from 1995 to 1997. He served as an Associate Editor for the IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING from 1984 to 1988 and the IEEE TRANSACTIONS ON EDUCATION from 2000 to 2010. He was the Secretary-Treasurer of the ASSP Publications Board from 1986 to 1988, the Chairman of the ASSP Publications Board from 1992 to 1994, General Chairman of ICASSP in 1996, and General Chairman of ICIP in 2006.

Joonki Paik (SM'11) was born in Seoul, Korea, in 1960. He received the B.S. degree in control and instrumentation engineering from Seoul National University, Seoul, Korea, in 1984, and the M.S. and Ph.D. degrees in electrical engineering and computer science from Northwestern University, Evanston, IL, in 1987 and 1990, respectively. From 1990 to 1993, he was with Samsung Electronics, where he designed image stabilization chipsets for consumer camcorders. Since 1993, he has been a faculty member at Chung-Ang University, Seoul, where he is currently a Professor with the Graduate School of Advanced Imaging Science, Multimedia, and Film. From 1999 to 2002, he was a Visiting Professor with the Department of Electrical and Computer Engineering, University of Tennessee, Knoxville. Since 2005, he has been the Head of the National Research Laboratory in the field of image processing and intelligent systems. In 2008, he was a Full-Time Technical Consultant with the System LSI Division, Samsung Electronics, where he developed various computational photographic techniques, including an extended depth of field system. From 2005 to 2007, he served as the Dean of the Graduate School of Advanced Imaging Science, Multimedia, and Film, and as the Director of the Seoul Future Contents Convergence Cluster established by the Seoul Research and Business Development Program. He is currently a member of the Presidential Advisory Board for Scientific/Technical Policy of the Korean Government and a Technical Consultant of the Korean Supreme Prosecutors' Office for computational forensics. Dr. Paik was a recipient of the Chester Sall Award from the IEEE Consumer Electronics Society, the Academic Award from the Institute of Electronic Engineers of Korea, and the Best Research Professor Award from Chung-Ang University. He has served the IEEE Consumer Electronics Society as an Editorial Board Member.
