Estimation of Multiple Illuminant Colors Using Color Lines of Single Image
Yusuke Uchimi
Department of Computer Science and Engineering
Toyohashi University of Technology, Toyohashi, Japan
Email: [email protected]

Takao Jinno
Computer Science and Engineering
Osaka Institute of Technology, Hirakata, Japan
Email: [email protected]

Shigeru Kuriyama
Department of Computer Science and Engineering
Toyohashi University of Technology, Toyohashi, Japan
Email: [email protected]
Abstract—Color estimation of illuminants has been applied to various applications such as white balance correction and color normalization for machine vision. Color estimation from a single image, however, is still challenging because each pixel value depends on three components: the illumination, the reflection, and the characteristics of the image sensor, whose decomposition is a hard inverse problem.
This research estimates the color of illuminants using color lines obtained from each superpixel, by assuming that the component of reflected light is locally constant. This article demonstrates the effectiveness of this method for multiple illuminants of different colors through a comparative evaluation against an existing state-of-the-art technique.
I. INTRODUCTION
In the field of computer vision, estimating the unknown properties of illuminants is very important because their color information is often referred to for enhancing the performance of various types of applications. For example, the performance of object tracking is improved by analyzing the color histogram of the lighting environment. The estimation of illuminant color can also correct white balance or normalize the colors of unnaturally tinted images. Extracting the physical quantity of an illuminant as a scene-independent value can serve not only object detection but also re-lighting of scenes lit by colorful illumination.
The color constancy of the human visual system can recognize the true colors of objects lit by colored illuminations. In contrast, the color estimation of illumination from images is usually a hard problem because the estimation is affected by many factors: the optical properties of object surfaces, the lighting conditions when capturing images, the characteristics of image sensors, and so forth.
To overcome this difficulty, some methods introduced the gray world assumption [1], which assumes that the average of object colors converges to gray, and another method [2] extended this assumption to the average of edge colors.
These methods presume that a single illuminant uniformly affects the brightness of the whole scene; the brightness of an actual scene, however, is often non-uniformly distributed owing to the effects of multiple illuminants.
Unlike the gray world assumption, some methods were proposed on the basis of human color constancy by applying the retinex theory proposed by Land et al. [3]. According to this theory, the human visual system perceives an object by its surface reflectance while removing the effects of illumination. With this model, the pixel value of an image at location $(x, y)$, denoted by $I(x, y)$, is represented by the product of two components, the illumination $L(x, y)$ and the reflection $R(x, y)$, as

$$I(x, y) = L(x, y)\,R(x, y). \qquad (1)$$

This theory can be extended to handle non-uniform illuminations by assuming their spatial smoothness. The entire distribution of the illumination can be estimated by taking the convolution of the local average with a Gaussian function [4], [5].
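As a small illustration of this smoothing model, the following Python sketch estimates a spatially smooth illumination component by Gaussian-filtering each channel; the function name, the sigma value, and the epsilon guard are illustrative assumptions, not the exact procedure of [4], [5].

```python
# A minimal sketch of illumination estimation by local averaging,
# in the spirit of [4], [5]; sigma and the epsilon guard are assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_illumination(image, sigma=50.0):
    """Approximate L(x, y) of Eq. (1) by a Gaussian-smoothed local
    average, assuming spatially smooth illumination."""
    img = image.astype(np.float64) + 1e-6          # avoid division by zero
    # Smooth each color channel independently to get the local average.
    L = np.stack([gaussian_filter(img[..., c], sigma) for c in range(3)],
                 axis=-1)
    R = img / L                                    # reflectance up to scale
    return L, R
```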
For the estimation of multiple illuminants, corrections of white balance were proposed [6]–[8] that use user interactions for scenes including multiple illuminants of different color temperatures. Other approaches without any interaction use images captured under different conditions, e.g., with or without flash, or movies from a sliding camera [9]. Although these methods based on multiple images can estimate static illuminations, they are unfeasible for time-variant lighting conditions, which are often introduced to enhance aesthetic performance with digitally controllable color LEDs.
This research aims to estimate the colors of illuminants of unknown characteristics from a single image. The estimation is also extensible to dynamic scenes that include time-variant illuminations or moving objects, using per-frame computations.
II. RELATED WORKS
A. Color lines
Omer et al. proposed the theory of color lines [10], as shown in Fig. 1, which encodes the color distribution within a local region as a straight line. This theory is derived from the observation that the distribution of locally aggregated pixel colors emerges along a corresponding line. It is noteworthy that this linear distribution bends at the color saturation point.
Fig. 1: Concept of color lines. (a) Assumption. (b) Actual image.
Fig. 2: Schematic overview of our method
B. Gray pixels
Yang et al. proposed gray pixels [11] for estimating multiple illuminant colors from a single image by searching for achromatic color regions. Here we decompose equation (1) of the retinex theory into each color channel $c \in \{r, g, b\}$ as

$$I_c(x, y) = L_c(x, y)\,R_c(x, y) \qquad (2)$$

where $L_c(x, y)$ can be regarded as a constant in logarithmic space by assuming the invariance of the illumination component among nearby pixels. The standard deviation within a near-field region therefore becomes a reference value that is independent of the lighting conditions. The gray pixels method detects achromatic pixels on the basis of this invariant criterion, assuming that achromatic regions are included in most natural images. This approach can estimate illuminant colors even for multiple illuminants if the scene contains an achromatic region corresponding to each color. For scenes involving over- or under-exposure, however, it often causes fatal errors.
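A rough sketch of this invariance criterion is given below, assuming float RGB input; the window size and the greyness measure (the relative spread of the per-channel local contrasts) are our own illustrative choices, not the exact formulation of [11].

```python
# Rough sketch of the grey-pixel criterion: in log space the local
# standard deviation of each channel does not depend on the locally
# constant illumination, and a grey surface gives (near-)equal values
# across channels. Window size and the greyness measure are assumptions.
import numpy as np
from scipy.ndimage import uniform_filter

def greyness_map(image, win=3, eps=1e-6):
    log_img = np.log(image.astype(np.float64) + eps)
    # Local standard deviation per channel: sqrt(E[x^2] - E[x]^2).
    mean = uniform_filter(log_img, size=(win, win, 1))
    mean_sq = uniform_filter(log_img ** 2, size=(win, win, 1))
    std = np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))
    # Small relative spread of the three channel contrasts indicates
    # a candidate achromatic (grey) pixel.
    return std.std(axis=-1) / (std.mean(axis=-1) + eps)
```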
III. COLOR ESTIMATION
For objects lit by diffuse illumination, the color distribution in a local region exhibits a small deviation owing to the location-dependent variation of the radiation intensity. Under the assumption of color invariance for an identical object, the object color is regarded as constant within a local region and deviates in the direction of the illuminant color. Based on this observation, we can numerically estimate the straight line, called a color line, that connects the true object color and the illuminant color (see Fig. 1). Ideally, the color lines estimated for all local regions lit by the same illuminant have an intersection near the illuminant color, while being extended from each object color.
Our method estimates the illuminant color by relying on this assumption; it is therefore desirable that many object colors be included within the local regions lit by the same illumination. Fig. 2 shows the overview of our method.
A. Image division into local regions
We introduce the method called SLIC [12] to subdivide an image into superpixels, which correspond to small local regions whose pixels have similar brightness or chromaticity, in order to identify local regions on the same object. For each region, pixels are excluded when they have crushed shadows or over-saturation in RGB color space, because the former carry no effective color information and the latter bend color lines in false directions. After this pixel-wise exclusion, we exclude local regions where the number of effective pixels falls below a threshold.
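A minimal sketch of this division step, using the SLIC implementation in scikit-image, might look as follows; the segment count, the shadow/saturation thresholds, and the minimum pixel count are assumed values for illustration.

```python
# Sketch of the region-division step with scikit-image's SLIC;
# n_segments, the exposure thresholds, and min_pixels are assumptions.
import numpy as np
from skimage.segmentation import slic

def effective_regions(image, n_segments=400, low=0.02, high=0.98,
                      min_pixels=50):
    """image: float RGB in [0, 1]. Returns per-region pixel arrays."""
    labels = slic(image, n_segments=n_segments, compactness=10)
    # Exclude crushed shadows and over-saturated pixels in RGB.
    valid = np.all((image > low) & (image < high), axis=-1)
    regions = []
    for lab in np.unique(labels):
        mask = (labels == lab) & valid
        if mask.sum() >= min_pixels:        # drop regions with few pixels
            regions.append(image[mask])     # effective pixels only
    return regions
```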
B. Estimation of color lines
To simplify the detection of intersections, we compute color lines in a two-dimensional chromaticity diagram. Since our method treats local color distributions, the chromaticity variables $u'$ and $v'$ are introduced as

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = M \begin{pmatrix} R \\ G \\ B \end{pmatrix}, \quad M = \begin{pmatrix} 0.577 & 0.186 & 0.188 \\ 0.297 & 0.627 & 0.075 \\ 0.027 & 0.071 & 0.991 \end{pmatrix},$$

$$u' = \frac{4X}{X + 15Y + 3Z}, \qquad v' = \frac{9Y}{X + 15Y + 3Z} \qquad (3)$$

where $u'$ and $v'$ relate to color (i.e., hue and chroma) and span the range $[0.0, 0.7]$. Color lines are then estimated from the distributions of the $(u', v')$ variables.

Fig. 3: Local distribution of pixel values. (a) Effective local region. (b) Ineffective local region.

Fig. 4: Example of vote
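For reference, a small Python sketch of the RGB-to-$(u', v')$ conversion of equation (3), using the matrix $M$ above; the epsilon guard against a zero denominator is an added assumption.

```python
# Conversion from RGB to (u', v') with the matrix M and Eq. (3);
# the epsilon guard against a zero denominator is an added assumption.
import numpy as np

M = np.array([[0.577, 0.186, 0.188],
              [0.297, 0.627, 0.075],
              [0.027, 0.071, 0.991]])

def rgb_to_uv(rgb):
    """rgb: (..., 3) array. Returns the (u', v') chromaticities."""
    xyz = rgb @ M.T
    X, Y, Z = xyz[..., 0], xyz[..., 1], xyz[..., 2]
    denom = X + 15.0 * Y + 3.0 * Z + 1e-12
    return np.stack([4.0 * X / denom, 9.0 * Y / denom], axis=-1)
```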
For small regions where the radiation intensity is invariant, the non-linear distribution of pixel values makes the computation of color lines difficult. We therefore omit local regions if the flatness of the color distribution is less than a threshold, where the flatness is computed as the ratio of the first and second principal components, as shown in Fig. 3.
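The line fit and flatness test could be sketched as follows with a principal component analysis of a region's chromaticities; the flatness threshold is an assumed value.

```python
# Sketch of the color line fit: PCA over a region's (u', v') samples,
# keeping the region only if its distribution is sufficiently line-like.
# The flatness threshold is an assumed value.
import numpy as np

def fit_color_line(uv, flatness_thresh=5.0):
    """uv: (N, 2) chromaticities of one region. Returns (point, direction)
    of the fitted line, or None when the flatness test fails."""
    mean = uv.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(uv, rowvar=False))
    # Flatness: ratio of the first to the second principal component.
    if eigval[1] < flatness_thresh * max(eigval[0], 1e-12):
        return None                          # too round: discard region
    return mean, eigvec[:, 1]                # line through the region mean
```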
C. Color discrimination
For a single illuminant, color lines intersect near the corresponding color. For multiple illuminants, however, intersections are also densely generated at intermediate locations between the illuminant colors, and we introduce a voting mechanism to remove such outliers.

The region of the chromaticity diagram is uniformly divided to discretize the $(u', v')$ coordinates into cells, and every possible intersection casts a vote into the cell that contains it. Fig. 4 shows an example of voting in $(u', v')$ coordinates, where brighter colors correspond to larger numbers of votes. It is noteworthy that the magnitude of the estimation errors depends on the area of each cell.
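A minimal sketch of this voting step intersects every pair of estimated color lines and accumulates votes on a discretized grid; the cell size and the degeneracy tolerance below are assumptions, while the $[0, 0.7]$ extent follows the text.

```python
# Sketch of the voting step: intersect every pair of color lines and
# vote into a discretized (u', v') grid; the cell size is an assumption.
import numpy as np
from itertools import combinations

def vote_intersections(lines, cell=0.01, extent=0.7):
    """lines: list of (point, unit_direction) pairs in (u', v')."""
    n = int(extent / cell)
    votes = np.zeros((n, n), dtype=int)
    for (p1, d1), (p2, d2) in combinations(lines, 2):
        # Solve p1 + t*d1 = p2 + s*d2 for the intersection point.
        A = np.column_stack([d1, -d2])
        if abs(np.linalg.det(A)) < 1e-9:
            continue                         # near-parallel: no vote
        t, _ = np.linalg.solve(A, p2 - p1)
        u, v = p1 + t * d1
        if 0.0 <= u < extent and 0.0 <= v < extent:
            votes[int(v / cell), int(u / cell)] += 1
    return votes
```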
The number of voted cells often largely exceeds the number of illuminants because many cells near the true illuminant colors receive votes. We therefore utilize K-means clustering to narrow down the cells. Although the number of clusters corresponds to the number of different illuminant colors, we intentionally set it to a larger value for robustly selecting possible candidates, which are further narrowed down using some heuristics. In the following experiment, we set the number of clusters to three for two illuminant colors.
As the three clusters include one redundant candidate, we finally choose the two colors of highest saturation among them. This strategy of excluding the color of lowest saturation derives from the assumption that the tertiary color is likely estimated as an intermediate color between the two true colors, whose saturation tends to be decreased.

Fig. 5: Experimental setup

Fig. 6: Sampled illuminant color
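The clustering and selection step might be sketched as below; approximating saturation by the distance from the D65 white point in $(u', v')$ is our own simplification for illustration.

```python
# Sketch of candidate selection: K-means over the voted cells (k = 3 for
# two illuminants), keeping the two most saturated centers. Approximating
# saturation by the distance from the D65 white point is our own choice.
import numpy as np
from sklearn.cluster import KMeans

D65_UV = np.array([0.1978, 0.4683])          # (u'_n, v'_n) of D65

def select_illuminants(votes, cell=0.01, k=3):
    ys, xs = np.nonzero(votes)
    pts = np.column_stack([xs, ys]) * cell + cell / 2.0   # cell centers
    km = KMeans(n_clusters=k, n_init=10).fit(
        pts, sample_weight=votes[ys, xs])
    centers = km.cluster_centers_
    sat = np.linalg.norm(centers - D65_UV, axis=1)        # pseudo-saturation
    return centers[np.argsort(sat)[-2:]]                  # drop lowest one
```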
D. Metric of estimation error
The estimated brightness of an illuminant color is essentially a variable factor because it deeply depends on physical conditions irrelevant to the illuminant color, e.g., the reflectance of objects and the distances between the objects and the illuminant or image sensor. For this reason, we evaluate the estimation error by computing the Euclidean distance in $(u', v')$ coordinates, by which the effect of brightness can be omitted.
For each scene, the error metric $E$ is then computed between the $(u', v')$ values of the ground truth, denoted by $g_c^{uv}$ for the $c$-th color, and the estimated values, denoted by $e_c^{uv}$, as

$$E = \frac{1}{|C|} \sum_{c \in C} \left\| g_c^{uv} - e_c^{uv} \right\|_2 \qquad (4)$$
where $\| \cdot \|_2$ represents the L2-norm and $|C|$ is the number of tested colors $c \in C$ for each scene.

Fig. 7: Estimations of illuminant colors

Fig. 8: Example of failed estimation
In addition, we evaluate the estimation error with respect to pure colors by computing hue angles in the $L^* u^* v^*$ color space, denoted by $h$, which is computed from the $(u', v')$ values as

$$h = \arctan\left(\frac{v^*}{u^*}\right) = \arctan\left(\frac{13 L^* (v' - v'_n)}{13 L^* (u' - u'_n)}\right) = \arctan\left(\frac{v' - v'_n}{u' - u'_n}\right) \qquad (5)$$

where $(u'_n, v'_n)$ represents the value corresponding to a white stimulus, computed from the D65 white point given by $(0.9504, 1.0000, 1.0888)$ in the XYZ color space.
The estimation error for each scene, denoted by $H$, is then given by

$$H = \frac{1}{|C|} \sum_{c \in C} \left| g_c^h - e_c^h \right| \qquad (6)$$

where $g_c^h$ and $e_c^h$ denote the hue angles of the ground truth and the estimate, respectively, for the $c$-th color.
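Putting the two metrics together, the following is a small sketch of equations (4)–(6), assuming the correspondence between ground-truth and estimated colors has already been resolved.

```python
# Sketch of the two error metrics of Eqs. (4) and (6), assuming the
# correspondence between ground-truth and estimated colors is resolved.
import numpy as np

def metric_E(gt_uv, est_uv):
    """Mean Euclidean distance in (u', v'), Eq. (4)."""
    d = np.asarray(gt_uv) - np.asarray(est_uv)
    return np.mean(np.linalg.norm(d, axis=1))

def metric_H(gt_uv, est_uv, white=(0.1978, 0.4683)):
    """Mean absolute hue-angle difference in degrees, Eqs. (5) and (6)."""
    def hue(uv):
        d = np.asarray(uv) - np.asarray(white)
        return np.degrees(np.arctan2(d[:, 1], d[:, 0]))
    diff = (hue(gt_uv) - hue(est_uv) + 180.0) % 360.0 - 180.0
    return np.mean(np.abs(diff))
```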
IV. EXPERIMENTAL EVALUATION
We experimentally evaluated our method using sample images captured for scenes lit by two illuminants of different colors, as shown in Fig. 5. A comparative evaluation against the method based on gray pixels [11] is demonstrated by computing the similarity between the true and estimated colors.
The gray pixels method requires specifying the estimated percentage of pixels included in achromatic regions, and we set it to 10% for all samples, as recommended in [11].
The lighting conditions were supplied using a commercially available color LED lighting device (Philips Hue Bloom) while changing its color with control parameters for hue and saturation. The true color was evaluated using the pixel values corresponding to the 18% gray patch of a color chart. We sampled images by controlling two illuminant colors cast from the left and right sides with 6 representative colors, {red, green, blue, orange, cyan, magenta}, whose chroma was set at 3 fixed levels, 128, 192, and 254, for the range of $[0, 255]$, as shown in Fig. 6. Since the brightness of an illuminant is treated as a variable component, we tentatively rendered the colors by adding $L = 0.75$ to the $(u', v')$ coordinates. For each saturation level, a pair of hue components was sampled in six patterns, {(red, green), (red, blue), (blue, green), (orange, cyan), (orange, magenta), (cyan, magenta)}, which constitute 18 pairs of colors. As a result, a total of 36 ($= |C|$) illuminant colors were used in capturing each sample scene.
Fig. 7 shows examples of the color estimations, where ICL abbreviates the name of our method, Intersection of Color Lines; the color of each square indicates the estimated color, labeled with the two types of errors, $E$ (top) and $H$ (bottom). Notice that the indicated colors are rendered by setting $L = 0.75$, in the same way as in Fig. 6. In computing the errors, the correspondence of the two estimated colors is automatically determined so as to minimize the error $E$ or $H$. The rightmost squares for our method indicate the tertiary colors that are finally excluded.
TABLE I: Estimation errors for various pairs of illuminants' colors

(a) Average distance in (u', v') coordinates (= E)

          Gray pixels   ICL (proposed)   ICL (optimum)
Scene 1   0.108         0.082            0.058
Scene 2   0.101         0.067            0.060
Scene 3   0.089         0.082            0.067
Scene 4   0.067         0.095            0.056
Scene 5   0.124         0.104            0.081
Scene 6   0.078         0.094            0.062
Average   0.094         0.087            0.064

(b) Average difference in hue angles (= H)

          Gray pixels   ICL (proposed)   ICL (optimum)
Scene 1   54.599        23.687           14.923
Scene 2   46.753        18.716           17.469
Scene 3   46.896        44.097           37.563
Scene 4   31.060        33.104           17.714
Scene 5   46.466        30.819           17.624
Scene 6   30.853        35.100           19.492
Average   42.771        30.921           20.798
Fig. 9: Accuracy in hue angles. (a) Threshold = 10°. (b) Threshold = 20°. (c) Threshold = 30°.
Fig. 10: Estimation for general scenes. (a) A single illuminant¹. (b) Double illuminants². (c) Triple illuminants³. (d) Double illuminants⁴.
Table I shows the comparison of the estimation errors of our method against the gray pixels. We here also computed the errors for the optimal selection of two colors among the three candidates, shown in the rightmost column, which demonstrates the potential of our redundant selection mechanism. The result shows that our method achieved higher accuracy (lower errors) for scenes 1 and 2, which include objects with specular surfaces; this implies that color lines are robustly drawn in the regions of specular reflections. On the other hand, our method loses estimation accuracy for the non-glossy scenes 4 and 6, which have no specular reflection. Fig. 8 shows an example of failing to detect correct intersections under such a condition.

¹By David Bramhall https://flic.kr/p/NtfgWy
²By rawartistsmedia https://flic.kr/p/noozzW
³By Marcus https://flic.kr/p/gRXazw
⁴By Frédéric BISSON https://flic.kr/p/8WB4ij
Fig. 9 shows the accuracy of the color estimation, computed by judging the equivalence to the ground truth at three threshold levels. We here introduce a simple definition of accuracy: for each scene we count 1 if both illuminants' colors are evaluated as correctly estimated, 0.5 if only one color is evaluated as correct, and 0 otherwise. The accuracy is then computed by averaging these values over all 6 scenes. The result in Fig. 9 shows that our method achieved higher accuracy than the gray pixels for every scene.
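This scoring rule can be written compactly as follows; the per-illuminant correctness test by a hue-angle threshold mirrors the definition above.

```python
# Compact form of the accuracy score used in Fig. 9: per scene, count 1
# if both illuminant colors pass the hue-angle threshold, 0.5 if one does.
import numpy as np

def scene_score(hue_errors, threshold):
    correct = sum(e <= threshold for e in hue_errors)  # 0, 1, or 2
    return {2: 1.0, 1: 0.5, 0: 0.0}[correct]

def accuracy(per_scene_errors, threshold):
    return np.mean([scene_score(e, threshold) for e in per_scene_errors])
```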
Fig. 10 demonstrates examples of estimation for general night scenes, downloaded from an image-hosting website, using the same parameters as in our experimental setting. The scenes in (a), (b), and (c) appear to be lit by one, two, and three colors, respectively. Although we have no ground-truth information about their colors, the estimated color appearances are plausible. Our method, however, failed to estimate the blue component of scene (d), because the corresponding intersection is hard to detect owing to the small area lit by the blue illuminant.
Scenes 5 and 6 contain no achromatic object; such a condition violates the premise of the gray pixels. The result for scene 5 shows the higher accuracy of our method; for scene 6, however, the gray pixels achieved higher accuracy than our method, in spite of its prerequisite of the existence of achromatic objects. This implies that our advantage for scenes without achromatic objects is currently limited to those including specular reflections and, to our surprise, the estimation accuracy of the gray pixels is not degraded for non-glossy scenes.
The higher accuracy of the optimal selection, shown in the rightmost column of Table I, reveals that our redundant selection strategy can increase the power of the color estimation if the final narrowing-down process is properly implemented.
We omitted illuminant colors of low saturation as outliers, assuming that such low saturation is caused by the color mixture of multiple illuminants. This assumption, however, is not valid for closely arranged illuminants, which often causes mis-estimation of the tertiary color. This suggests a limitation of our selection strategy.
V. CONCLUSIONS
This article has proposed a method of estimating multiple illuminant colors of unknown properties from a single image.
Since our method requires no estimation of the area occupied by achromatic regions, the whole process can be fully automated, thereby overcoming the restriction of the gray pixels. Illuminant colors can be robustly and accurately estimated even for scenes having no achromatic object, provided they include regions of specular reflections.
Our method fails to detect correct intersections for certain combinations of object colors; if one object color resides between the illuminant color and another object color, the corresponding intersection is hard to obtain. This defect, however, could be mitigated by introducing the extensions explained below.
The first extension utilizes brightness information; undesirable intersections can be excluded by omitting those located in low-brightness regions, by checking the corresponding $L$ values. This screening process assumes that the illuminant color resides in high-brightness regions when objects of specular reflectance exist.
The second extension iteratively estimates illuminant colors to increase reliability. The votes are weighted by the distance between a color line and a tentatively estimated color, and the weights and estimated colors are iteratively updated until reaching a convergent state.
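One possible shape of this iterative re-weighting is sketched below purely as an illustration: the Gaussian weighting kernel, the convergence tolerance, and the use of the foot of the perpendicular from the estimate to each line are all assumptions.

```python
# One possible shape of the iterative re-weighting, purely illustrative:
# the Gaussian kernel, the tolerance, and using the foot of the
# perpendicular from the estimate to each line are all assumptions.
import numpy as np

def refine_estimate(lines, init_color, bandwidth=0.05, iters=10):
    """lines: (point, unit_direction) pairs; init_color: (u', v')."""
    est = np.asarray(init_color, dtype=float)
    for _ in range(iters):
        feet, wts = [], []
        for p, d in lines:
            foot = p + np.dot(est - p, d) * d     # closest point on line
            dist = np.linalg.norm(est - foot)
            feet.append(foot)
            wts.append(np.exp(-(dist / bandwidth) ** 2))
        new_est = np.average(feet, axis=0, weights=wts)
        if np.linalg.norm(new_est - est) < 1e-6:  # convergent state
            break
        est = new_est
    return est
```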
The third extension optimizes the superpixels. The reliability of color lines can be increased by merging superpixels of similar colors, because a larger number of pixels per superpixel reduces the effect of noisy components.
Our method requires that multiple object colors be included in a scene. As this limitation is not so severe for ordinary natural images, it could be further relaxed by integrating with other methods such as gray pixels. The estimation accuracy of our method is degraded for scenes including no specular reflection, as shown in Fig. 8, and adaptively alternating between the methods depending on glossiness could improve the estimation accuracy.
The optimal superpixel size depends on the features of a scene, and this parameter should be carefully adjusted. Inappropriate subdivision of local regions often incorrectly excludes most superpixels lit by an illuminant, which causes mis-estimation due to the lack of corresponding color lines.
Adding an advanced clustering mechanism and smarter heuristics in the final selection process could increase the estimation accuracy. Our future work also includes re-lighting based on the estimated colors, which is one of the important and valuable applications of our color estimation.
ACKNOWLEDGEMENT
This work was supported by JSPS KAKENHI Grant Number JP26730088.
REFERENCES

[1] G. Buchsbaum, "A spatial processor model for object colour perception," Journal of the Franklin Institute, vol. 310, no. 1, pp. 1–26, 1980.
[2] J. van de Weijer, T. Gevers, and A. Gijsenij, "Edge-based color constancy," IEEE Transactions on Image Processing, vol. 16, no. 9, pp. 2207–2214, 2007.
[3] E. H. Land and J. J. McCann, "Lightness and retinex theory," Journal of the Optical Society of America, vol. 61, no. 1, pp. 1–11, 1971.
[4] M. Ebner, Color Constancy Using Local Color Shifts, ser. Lecture Notes in Computer Science, vol. 3023, 2004.
[5] M. Ebner, "Color constancy based on local space average color," Machine Vision and Applications, vol. 20, no. 5, pp. 283–301, 2009.
[6] E. Hsu, T. Mertens, S. Paris, S. Avidan, and F. Durand, "Light mixture estimation for spatially varying white balance," ACM Transactions on Graphics, vol. 27, no. 3, 2008.
[7] A. Bousseau, S. Paris, and F. Durand, "User-assisted intrinsic images," ACM Transactions on Graphics, vol. 28, no. 5, pp. 130:1–130:10, 2009.
[8] I. Boyadzhiev, K. Bala, S. Paris, and F. Durand, "User-guided white balance for mixed lighting conditions," ACM Transactions on Graphics, vol. 31, no. 6, 2012.
[9] V. Prinet, D. Lischinski, and M. Werman, "Illuminant chromaticity from image sequences," in Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 3320–3327.
[10] I. Omer and M. Werman, "Color lines: Image specific color representation," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, 2004, pp. II-946–II-953.
[11] K.-F. Yang, S.-B. Gao, and Y.-J. Li, "Efficient illuminant estimation for color constancy using grey pixels," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 2254–2263.
[12] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, "SLIC superpixels compared to state-of-the-art superpixel methods," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2274–2281, 2012.