Color in Image and Video Processing: Most Recent Trends and Future Research Directions

The motivation of this paper is to provide an overview of the most recent trends and of the future research directions in color image and video processing. Rather than covering all aspects of the domain, this survey covers issues related to the most active research areas in the last two years. It presents the most recent trends as well as the state-of-the-art, with a broad survey of the relevant literature, in the main active research areas in color imaging. It also focuses on the most promising research areas in color imaging science. This survey gives an overview about the issues, controversies, and problems of color image science. It focuses on human color vision, perception, and interpretation. It focuses also on acquisition systems, consumer imaging applications, and medical imaging applications. Next it gives a brief overview about the solutions, recommendations, most recent trends, and future trends of color image science. It focuses on color space, appearance models, color difference metrics, and color saliency. It focuses also on color features, color-based object tracking, scene illuminant estimation and color constancy, quality assessment and fidelity assessment, color characterization and calibration of a display device. It focuses on quantization, filtering and enhancement, segmentation, coding and compression, watermarking, and lastly on multispectral color image processing. Lastly, it addresses the research areas which still need addressing and which are the next and future perspectives of color in image and video processing.


BACKGROUND AND MOTIVATION
The perception of color is of paramount importance in many applications, such as digital imaging, multimedia systems, visual communications, computer vision, entertainment, and consumer electronics. In the last fifteen years, color has become a key element of many, if not all, modern image and video processing systems. It is well known that color plays a central role in digital cinematography, modern consumer electronics solutions, digital photography systems such as digital cameras, video displays, video-enabled cellular phones, and printing solutions. In these applications, compression- and transmission-based algorithms as well as color management algorithms provide the foundation for cost-effective, seamless processing of visual information through the processing pipeline. Moreover, color is also crucial to many pattern recognition and multimedia systems, where color-based feature extraction and color segmentation have proven pertinent in detecting and classifying objects in various areas ranging from industrial inspection to geomatics and to biomedical applications.
Over the years, several important contributions were made in the field of color image processing. It is only in the last decades that a better understanding of color vision, colorimetry, and color appearance has been utilized in the design of image processing methodologies [1]. The first special issue on this aspect was written by McCann in 1998 [2]. According to McCann, the problem with display devices and printing devices is that they work one pixel at a time, while the human visual system (HVS) analyzes the whole image from spatial information. The color we see at a pixel is controlled by that pixel and all the other pixels in the field of view [2]. In our view, the future of color image processing lies in the use of human vision models that compute the color appearance of spatial information, rather than low-level signal processing models based on pixels, together with frequency and temporal information and semantic models. Human color vision is an essential tool for those who wish to contribute to the development of color image processing solutions and also for those who wish to develop a new generation of color image processing algorithms based on high-level concepts.
EURASIP Journal on Image and Video Processing
A number of special issues, including survey papers that review the state-of-the-art in the area of color image processing, have been published in the past decades. More recently, in 2005, a special issue on color image processing was written for the signal processing community to understand the fundamental differences between color and grayscale imaging [1]. In the same year, a special issue on multidimensional image processing was edited by Lukac et al. [3]. This issue overviewed recent trends in multidimensional image processing, ranging from image acquisition to image and video coding, to color image processing and analysis, and to color image encryption. In 2007, a special issue on color image processing was edited by Lukac et al. [4] to fill the existing gap between researchers and practitioners that work in this area. In 2007, a book on color image processing was published to cover processing and application aspects of digital color imaging [5].
Several books have also been published on the topic. For example, Lukac and Plataniotis edited a book [6] which examines the techniques, algorithms, and solutions for digital color imaging, emphasizing emerging topics such as secure imaging, semantic processing, and digital camera image processing.
Since 2006, we have observed a significant increase in the number of papers devoted to color image processing in the image processing community. In this survey, we discuss the main problems examined by these papers and the principal solutions proposed to address them. The motivation of this paper is to provide a comprehensive overview of the most recent trends and of the future research directions in color image and video processing. Rather than covering all aspects of the domain, this survey covers issues related to the most active research areas in the last two years. It presents the most recent trends as well as the state-of-the-art, with a broad survey of the relevant literature, in the main active research areas in color imaging. It also focuses on the most promising research areas in color imaging science. Lastly, it addresses the research areas which still need addressing and which are the next and future perspectives of color in image and video processing.
This survey is intended for graduate students, researchers, and practitioners who have a good knowledge of color science and digital imaging and who want to know and understand the most recent advances and research in digital color imaging. This survey is organized as follows: after an introduction about the background and the motivation of this work, Section 2 gives an overview about the issues, controversies, and problems of color image science. This section focuses on human color vision, perception, and interpretation. Section 3 presents the issues, controversies, and problems of color image applications. This section focuses on acquisition systems, consumer imaging applications, and medical imaging applications. Section 4 gives a brief overview about the solutions, recommendations, most recent trends, and future trends of color image science. This section focuses on color space, appearance models, color difference metrics, and color saliency. Section 5 presents the most recent advances and research in color image analysis. Section 5 focuses on color features, color-based object tracking, scene illuminant estimation and color constancy, quality assessment and fidelity assessment, color characterization and calibration of a display device. Next, Section 6 presents the most recent advances and research in color image processing. Section 6 focuses on quantization, filtering and enhancement, segmentation, coding and compression, watermarking, and lastly on multispectral color image processing. Finally, conclusions and suggestions for future work are drawn in Section 7.

COLOR IMAGE SCIENCE AT PRESENT: ISSUES, CONTROVERSIES, PROBLEMS

Background
The science of color imaging may be defined as the study of color images and the application of scientific methods to their measurement, generation, analysis, and representation. It includes all types of image processing, including optical image production, sensing, digitalization, electronic protection, encoding, processing, and transmission over communications channels. It draws on diverse disciplines from applied mathematics, computing, physics, engineering, and social as well as behavioural sciences, including human-computer interface design, artistic design, photography, media communications, biology, physiology, and cognition. Although digital image processing has been studied for some 30 years as an academic discipline, its focus in the past has largely been in the specific fields of photographic science, medicine, remote sensing, nondestructive testing, and machine vision. Previous image processing and computer vision research programs have primarily focused on intensity (grayscale) images. Color was considered merely a dimensional extension of intensity; that is, color images were treated as three gray-value images, without taking into consideration the multidimensional nature of human color perception or of color sensory systems in general. The importance of color image science has been driven in recent years by the accelerating proliferation of inexpensive color technology in desktop computers and consumer imaging devices, ranging from monitors and printers to scanners and digital color cameras. What now endows the field with critical importance in mainstream information technology is the very wide availability of the Internet and World Wide Web, augmented by CD-ROM and DVD storage, as a means of quickly and cheaply transferring color image data. The introduction of digital entertainment systems such as digital television and digital cinema required the replacement of the analog processing stages in the imaging chain by digital processing modules, opening the way for the introduction to the imaging pipeline of the speed and flexibility afforded by digital technology. The convergence of digital media, moreover, makes it possible to apply techniques from one field to another, and gives the public access to heterogeneous multimedia systems.
For several years, we have been witnessing the development of worldwide image communication using a large variety of color display and printing technologies. As a result, "cross media" image transfer has become a challenge [7]. Likewise, the requirement for accurate color reproduction has pushed the development of new multispectral imaging systems. The effective design of color imaging products relies on a range of disciplines, for it operates at the very heart of the human-computer interface, matching human perception with computer-based image generation.
Until recently, the design of efficient color imaging systems was guided by the criterion that "what the user cannot see does not matter." This has been, so far, the only guiding principle for image filtering and coding. It no longer holds: in modern applications, this criterion is not sufficient. For example, it should be possible to reconstruct on a display the image of a painting from a digital archive under different illuminations. From the point of view of human vision, the problem is that visual perception is one of the most elusive and changeable of all aspects of human cognition, and depends on a multitude of factors. Successful research and development of color imaging products must therefore combine a broad understanding of psychophysical methods with a significant technical ability in engineering, computer science, applied mathematics, and behavioral science.

Human color vision
The human color vision system is immensely complicated. For a better understanding of its complexity, a short introduction is given here. The reflected light from an object enters the eye, first passes through the cornea and lens, and creates an inverted image on the retina at the back of the eyeball. The retinal surface contains millions of photoreceptors of two types: rods and cones. The former are sensitive to very low levels of light but cannot see color. Color information is detected at normal (daylight) levels of illumination by the three types of cones, named L, M, and S, corresponding to light-sensitive pigments at long, medium, and short wavelengths, respectively. The visible spectrum ranges from about 380 to 780 nanometers (nm). The situation is complicated by the retinal distribution of the photoreceptors: the cone density is the highest in the foveal region in a central visual field of approximately 2° diameter, whereas the rods are absent from the fovea but attain maximum density in an annulus of 18° eccentricity, that is, in the peripheral visual field. The information acquired by rods and cones is encoded and transmitted via the optic nerve to the brain as one luminance channel (black-white) and two opponent chrominance channels (red-green and yellow-blue), as proposed by the opponent-process theory of color vision of Hering. These visual signals are successively processed in the lateral geniculate nucleus (LGN) and visual cortex (V1), and then propagated to several nearby visual areas in the brain for further extraction of features. Finally, the higher cognitive functions of object recognition and color perception are attained.
At very low illumination levels, when the stimulus has a luminance lower than approximately 0.01 cd/m², only the rods are active and give monochromatic vision, known as scotopic vision. When the luminance of the stimulus is greater than approximately 10 cd/m², at normal indoor and daylight levels of illumination in a moderate surround, the cones alone mediate color vision, known as photopic vision. Between 0.01 and 10 cd/m² there is a gradual changeover from scotopic to photopic vision as the retinal illuminance increases, and in this domain of mesopic vision both cones and rods make significant contributions to the visual response.
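As a minimal sketch, the three regimes can be expressed as a threshold test on stimulus luminance. The function name `vision_regime` is hypothetical, and the hard limits of 0.01 and 10 cd/m² are the approximate values quoted above; in reality the changeover is gradual.

```python
def vision_regime(luminance_cd_m2: float) -> str:
    """Classify a stimulus luminance (cd/m^2) with the approximate limits from the text.

    Assumption: hard thresholds stand in for what is really a gradual transition.
    """
    if luminance_cd_m2 < 0:
        raise ValueError("luminance must be non-negative")
    if luminance_cd_m2 < 0.01:
        return "scotopic"   # rods only, monochromatic vision
    if luminance_cd_m2 > 10.0:
        return "photopic"   # cones alone mediate color vision
    return "mesopic"        # both rods and cones contribute


if __name__ == "__main__":
    for value in (0.001, 1.0, 45.0):
        print(value, vision_regime(value))
```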
Yet the mesopic condition is commonly encountered in dark-surround or dim-surround conditions for viewing of television, cinema, and conference projection displays, so it is important to have an appropriate model of color appearance. The cinema viewing condition is particularly interesting, because although the screen luminance is definitely photopic, with a standard white luminance of 40-50 cd/m², the observers in the audience are adapted to a dark surround in the peripheral field which is definitely in the mesopic region. Also, the screen fills a larger field of view than is normal for television, so the retinal stimulus extends further into the peripheral field where rods may make a contribution. Additionally, the image on the screen changes continuously and the average luminance level of dark scenes may be well down into the mesopic region. Under such conditions, the rod contribution cannot be ignored. There is no official CIE standard yet available for mesopic photometry, although in Division 1 of the CIE there is a technical committee dedicated to this aspect of human vision: TC1-58 "Visual Performance in the Mesopic Range."
When dealing with the perception of static and moving images, visual contrast sensitivity plays an important role in the filtering of visual information processed simultaneously in the various visual "channels." The high-frequency active channels (also known as parvocellular or P channels) enable detail perception; the medium-frequency active channels allow shape recognition, whereas the low-frequency active channels (also known as magnocellular or M channels) are more sensitive to motion. Spatial contrast sensitivity functions (CSFs) are generally used to quantify these responses and are divided into two types: achromatic and chromatic. Achromatic contrast sensitivity is generally higher than chromatic. For achromatic sensitivity, the maximum sensitivity to luminance occurs at a spatial frequency of approximately 5 cycles/degree. The maximum chrominance sensitivity is only about one tenth of the maximum luminance sensitivity. The chrominance sensitivities fall off above 1 cycle/degree, particularly for the blue-yellow opponent channel, thus requiring a much lower spatial bandwidth than luminance. For a nonstatic stimulus, as in all refreshed display devices, the temporal contrast sensitivity function must also be considered. To further complicate matters, the spatial and temporal CSFs are not separable and so must be investigated and reported as a function on the time-space frequency plane.
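One common engineering consequence of the lower spatial bandwidth of chrominance is that chrominance planes can be stored at a reduced resolution. The sketch below is only an illustration of that idea and is not taken from the surveyed work: the function name `subsample_2x2` and the choice of a 2x2 block average are assumptions.

```python
def subsample_2x2(plane):
    """Average non-overlapping 2x2 blocks of a chrominance plane.

    `plane` is a list of equally long rows of numbers; even dimensions are assumed.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("plane dimensions must be even")
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


if __name__ == "__main__":
    chroma = [[1, 3, 5, 7],
              [1, 3, 5, 7]]
    print(subsample_2x2(chroma))
```

The luminance plane would be kept at full resolution, since achromatic sensitivity extends to higher spatial frequencies.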
Few research groups have been working on the mesopic domain; however, there is a need for investigation. For example, there is a need to develop metrics for perceived contrasts in the mesopic domain [8]. In 2005, Walkey et al. proposed a model which provided insight into the activity and interactions of the achromatic and chromatic mechanisms involved in the perception of contrasts [9]. However, the proposed model does not offer significant improvement over other models in the high mesopic range or in the mid-to-low mesopic range, because the mathematical model used is not suited to fitting these extreme ranges correctly.
Likewise, there is a need to determine the limits of visibility, for example, the minimum brightness contrast between foreground and background, in different viewing conditions. For example, Ojanpaa et al. investigated the effect of luminance and color contrasts on the speed of reading and visual search as a function of character size. It would be interesting to extend this study to small displays such as mobile devices and to various viewing conditions such as strong ambient light. According to Kuang et al., contrast judgement as well as colorfulness has to be analysed as a function of highlight contrasts and shadow contrasts [10].

Low-level description and high-level interpretation
In recent years, research efforts have also focused on semantically meaningful automatic image extraction [11].
According to Dasiapoulou et al. [11], these efforts have not bridged the gap between low-level visual features that can be automatically extracted from visual content (e.g., with saliency descriptors) and the high-level concepts capturing the conveyed meaning. Even if conceptual models such as MPEG-7 have been introduced to model high-level concepts, we are still confronted with the problem of extracting the objects of a scene (i.e., the regions of an image) at an intermediate level between the low level and the high level.
Perhaps the most promising way to bridge this gap is to focus the research activity on new and improved human visual models. Traditional models are based either on a data-driven description or on a knowledge-based description. Likewise, there is generally a gap between traditional computer vision science and human vision science: the former considers that there is a hierarchy of intermediate levels between signal-domain information and semantic understanding, whereas the latter considers that the relationships between visual features in the human visual system are too complex to be modeled by a hierarchical model. Alternative models have attempted to bridge the gap between low-level descriptions and high-level interpretations by encompassing a structured representation of objects, events, and relations that are directly related to semantic entities. However, there is still plenty of room for new alternative models, additional descriptors, and methodologies for an efficient fusion of descriptors [11].
Image-based models as well as learning-based approaches are techniques that have been widely used in the area of object recognition and scene classification. They consider that humans can recognize objects either from their shapes or from their color and their texture. This information is considered as low-level data because it is extracted by the human vision system during the preattentive stage. Conversely, high-level data (i.e., semantic data) is extracted during the interpretation stage. There is no consensus in human vision science on how to model intermediate stages between the preattentive and interpretation stages, because we do not have a complete knowledge of visual areas and of neural mechanisms. Moreover, the neural pathways are interconnected and the cognitive mechanisms are very complex. Consequently, there is no consensus on a single human vision model.
We believe that the future of image understanding will advance through the development of human vision models which better take into account the hierarchy of visual image processing stages from the preattentive stage to the interpretation stage. With such a model, we could bridge the gap between low-level descriptors and high-level interpretation. With a better knowledge of the interpretation stage of the human vision system, we could analyze images at the semantic level in a way that matches human perception.

COLOR IMAGE APPLICATIONS: ISSUES, CONTROVERSIES, PROBLEMS
When we speak about color image science, it is fundamental to consider first the problems of acquisition and reproduction of color images, but also the problems of expertise in particular disciplinary fields (meteorologists, climatologists, geographers, historians, etc.). To illustrate the problems of acquisition, we discuss demosaicking technologies. Next, to illustrate the problems with the display of color images, we discuss digital cinema. Lastly, to illustrate the problems of particular expertise, we consider medical applications.

Color acquisition systems
For several years, we have seen the development of single-chip technologies based on the use of color filter arrays (CFAs) [12]. The main problems these technologies have to face are the demosaicking and the denoising of the resulting images [13][14][15]. Numerous solutions to these problems have been published. Among the most recent ones, Li proposed in [16] a demosaicking algorithm in the color difference domain based on successive approximations in order to suppress color misregistration and zipper artefacts in the demosaicked images. Chaix de Lavarène et al. proposed in [17] a demosaicking algorithm based on a linear minimization of the mean square error (MSE). Tsai and Song proposed in [18] a demosaicking algorithm based on edge-adaptive filtering and postprocessing schemes in order to reduce aliasing error in the red and blue channels by exploiting high-frequency information of the green channel. On the other hand, L. Zhang and D. Zhang proposed in [19] a joint demosaicking-zooming algorithm based on the computation of the color difference signals using the high spectral-spatial correlations in the CFA image to suppress artefacts arising from demosaicking as well as zippers and rings arising from zooming. Likewise, Chung and Chan proposed in [20] a joint demosaicking-zooming algorithm based on the interpolation of edge information extracted from raw sensor data in order to preserve edge features in the output image. Lastly, Wu and Zhang proposed in [21,22] a temporal color video demosaicking algorithm based on motion estimation and data fusion in order to reduce color artefacts over the intraframes. In these papers, the authors considered that the temporal dimension of a color mosaic image sequence could reveal new information on the color components missing due to the mosaic subsampling, information which is otherwise unavailable in the spatial domain of individual frames. Each pixel of the current frame is then matched to another in a reference frame via motion analysis, such that the CCD sensor samples different color components of the same object position in the two frames. Next, the resulting interframe estimates of missing color components are fused with suitable intraframe estimates to achieve a more robust color restoration. In [23], Lukac and Plataniotis comprehensively surveyed demosaicking, demosaicked image postprocessing, and camera image zooming solutions that utilize data-adaptive and spectral modeling principles to produce camera images with an enhanced visual quality.
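To make the demosaicking problem concrete, the following minimal sketch fills in the missing color components of a CFA image by plain bilinear interpolation. This is a textbook baseline, not any of the algorithms of [16]-[22]; the RGGB Bayer layout and the names `bayer_channel` and `demosaic_bilinear` are assumptions made for illustration.

```python
def bayer_channel(y, x):
    """Channel sampled at (y, x) in an assumed RGGB mosaic: 0 = R, 1 = G, 2 = B."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2


def demosaic_bilinear(mosaic):
    """Estimate an RGB image from a single-plane RGGB mosaic (list of rows).

    Each missing component is the mean of the same-channel samples in the
    3x3 neighbourhood, which is bilinear interpolation away from the borders.
    """
    height, width = len(mosaic), len(mosaic[0])
    if height < 2 or width < 2:
        raise ValueError("mosaic must be at least 2x2")
    rgb = [[[0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for c in range(3):
                if bayer_channel(y, x) == c:
                    rgb[y][x][c] = float(mosaic[y][x])  # measured sample
                    continue
                samples = [
                    mosaic[j][i]
                    for j in range(max(0, y - 1), min(height, y + 2))
                    for i in range(max(0, x - 1), min(width, x + 2))
                    if bayer_channel(j, i) == c
                ]
                rgb[y][x][c] = sum(samples) / len(samples)
    return rgb


if __name__ == "__main__":
    # Mosaic of a uniform patch with R = 10, G = 20, B = 30.
    flat = [[(10, 20, 30)[bayer_channel(y, x)] for x in range(4)] for y in range(4)]
    print(demosaic_bilinear(flat)[1][2])
```

Because each channel is interpolated independently, this baseline produces the color misregistration and zipper artefacts at edges that the cited methods are designed to suppress.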
Demosaicking techniques have also been studied in relation to other image processing tasks, such as compression (e.g., see [24]).

Color in consumer imaging applications
Digital color image processing is increasingly becoming a core technology for future products in consumer imaging. Unlike past solutions where consumer imaging was entirely reliant on traditional photography, increasingly diverse color image sources, including (digitized) photographic media, images from digital still or video cameras, synthetically generated images, and hybrids, are fuelling the consumer imaging pipeline. The diversity on the image capturing and generation side is mirrored by an increasing diversity of the media on which color images are reproduced. Besides being printed on photographic paper, consumer pictures are also reproduced on toner- or inkjet-based systems or viewed on digital displays. The variety of image sources and reproduction media, in combination with diverse illumination and viewing conditions, creates challenges in managing the reproduction of color in a consistent and systematic way. The solution of this problem involves not only mastering the principles of photomechanical color reproduction, but also understanding the intrinsic relations between visual image appearance and quantitative image quality measurements. Much is expected from improved standards that describe the interfaces of various capturing and reproduction devices so that they can be combined into better and more reliably working systems.
To achieve "what you see is what you get" (WYSIWYG) color reproduction when capturing, processing, storing, and displaying visual data, the color in visual data should be managed so that whenever and however images are displayed their appearance remains perceptually constant. In the photographic, display, and printing industries, color appearance models, color management methods, and standards are already available, notably from the International Color Consortium (ICC, see http://www.color.org/), the International Commission on Illumination (CIE) Divisions 1 "Vision and Color" (see http://www.bio.im.hiroshima-cu.ac.jp/~cie1) and 8 "Image Technology" (see http://www.colour.org/), the International Electrotechnical Commission (IEC) TC100 "Multimedia for today and tomorrow" (see http://tc100.iec.ch/about/structure/tc100ta2.htm/), and the International Organisation for Standardisation (ISO), such as ISO TC42 "Photography" (see http://www.i3a.org/iso.html/), ISO TC159 "Visual Display," and ISO TC171 "Document Management" (see http://www.iso.org/iso/). A computer system that enables WYSIWYG color to be achieved is called a color management system. Typical components include the following: (i) a color appearance model (CAM) capable of predicting color appearance under a wide variety of viewing conditions, for example, the CIECAM02 model recommended by CIE; (ii) device characterization models for mapping between the color primaries of each imaging device and the color stimulus seen by a human observer, as defined by CIE specifications; (iii) a device profile format for embodying the translation from a device characterization to a color appearance space proposed by ICC.
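As a minimal sketch of component (ii), a device characterization model maps device color values to CIE tristimulus values. The example below assumes, purely for illustration, a display that behaves like the sRGB reference display of IEC 61966-2-1 (D65 white point); the text does not name a specific device model, and the names `srgb_decode` and `srgb_to_xyz` are hypothetical.

```python
# Linear-RGB to CIE XYZ matrix for the sRGB primaries (IEC 61966-2-1, D65).
SRGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def srgb_decode(value):
    """Undo the sRGB transfer function for one component in [0, 1]."""
    if value <= 0.04045:
        return value / 12.92
    return ((value + 0.055) / 1.055) ** 2.4


def srgb_to_xyz(r, g, b):
    """Map nonlinear sRGB values in [0, 1] to CIE XYZ (Y = 1 for display white)."""
    linear = [srgb_decode(c) for c in (r, g, b)]
    return tuple(sum(m * c for m, c in zip(row, linear)) for row in SRGB_TO_XYZ)


if __name__ == "__main__":
    print(srgb_to_xyz(1.0, 1.0, 1.0))  # tristimulus values of the display white
    print(srgb_to_xyz(0.5, 0.5, 0.5))
```

A real characterization would be fitted to measurements of the actual device; the resulting XYZ values are what a color appearance model such as CIECAM02 takes as input.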
Although rapid progress has been made in graphic arts, web applications, HDTV, and so forth towards the development of a comprehensive suite of standards for color management, similar efforts in other application domains such as cinematography are still in their infancy. It should be noted, for example, that cinematographic color reproduction is performed in a rather ad hoc, primitive manner due to the nature of its processing and its unique viewing conditions [25]. Likewise, there are problems in achieving effective color management for cinematographic applications [26].
In particular, in cinematographic applications the concept of "film look" is very important; it depends on the content of the film (e.g., the hue of the actors' skin or the hue of the sky) [27]. Most color management processes minimize the errors of color rendering without taking into account the image content. Likewise, the spread of digital film applications (DFAs) in the postproduction industry introduces color management problems. These problems arise in the processing of data when the encoding is done with different device primary colors (CMY or RGB). The current workflow in postproduction is to transform film material into the digital domain to perform the color grading (artistic color correction) and then to record the finalised images back to film. Displays used for color grading, such as CRTs and digital projectors, have completely different primary colors compared to negative and positive film stocks. An uncalibrated display of the digital data during the color grading sessions may produce a totally different color impression compared to the colors and the "film look" of the images printed on film. In order to achieve perceptually satisfactory cinematographic color management, it is highly desirable to model the color appearance under the cinema viewing conditions, based on a large set of color appearance data accumulated from experiments with observers under controlled conditions [28]. In postproduction, there is a need for automatic color transfer toolboxes (e.g., color balance, RGB channel alignment, color grade transfer, color correction). Unfortunately, little attention has been paid to color transfer in a video or in a film. Most color transfer algorithms have been defined for still images from a reference image, or for image sequences from key frames in a video clip [29]. Moreover, the key frames computed for video sequences are arbitrarily selected regardless of the color content of these frames. A common feature of color transfer algorithms is that they operate on the whole image independently of the image's semantic content (however, an observer who sees a football match in a stadium is more sensitive to the color of the ground than to the color of the steps). Moreover, they do not take into account metadata such as the script of the scenario or the lighting conditions under which the scene was filmed. Nevertheless, such metadata is used by the Digital Cinema System Specification for testing digital projectors and theatre equipment [30].
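The content-independent behaviour of global color transfer can be seen in a minimal sketch that matches the per-channel mean and standard deviation of a source image to those of a reference image. This is a generic statistics-matching scheme chosen for illustration, not a method from the cited references; working directly in RGB and the name `transfer_channel_stats` are assumptions.

```python
from statistics import mean, pstdev


def transfer_channel_stats(source, reference):
    """Give `source` the per-channel mean and spread of `reference`.

    Both arguments are non-empty lists of (r, g, b) tuples. Every pixel is
    treated identically, regardless of what it depicts.
    """
    result = [list(pixel) for pixel in source]
    for c in range(3):
        src = [pixel[c] for pixel in source]
        ref = [pixel[c] for pixel in reference]
        mu_src, mu_ref = mean(src), mean(ref)
        sd_src, sd_ref = pstdev(src), pstdev(ref)
        gain = sd_ref / sd_src if sd_src > 0 else 1.0
        for k, value in enumerate(src):
            result[k][c] = (value - mu_src) * gain + mu_ref
    return [tuple(pixel) for pixel in result]


if __name__ == "__main__":
    src = [(0, 0, 0), (2, 2, 2)]
    ref = [(10, 20, 30), (14, 20, 38)]
    print(transfer_channel_stats(src, ref))
```

Since the statistics are computed over all pixels, a large background region can dominate the mapping applied to a small but perceptually important region, which is the limitation noted above.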
The problems of color reproduction in graphic arts are in many regards similar to those in consumer imaging, except that much of the image capturing and reproduction is in a controlled and mature industrial environment, making it generally easier to manage the variability. A particularly important color problem in graphic arts is the consistency and predictability of the "digital color proof" with regard to the final print. According to Bochko et al., the design of a system for accurate digital archiving of fine art paintings has attracted increasing interest [31]. Excellent results have been achieved under controlled illumination conditions, but it is expected that approaching this problem using multispectral techniques will result in a color reproduction that is more stable under different illumination conditions. Archiving the current condition of a painting with high accuracy in digital form is important both to preserve it for the future and to restore it. For example, Berns worked on digital restoration of faded paintings and drawings using a paint-mixing model and digital imaging of the artwork with a color-managed camera [32]. Until 2005, Berns also managed a research program entitled "Art Spectral Imaging" which focused on spectral-based color capture, archiving, and reproduction [30].
Another interesting problem in graphic arts is colorization. Colorization is a computerized process that adds color to a monochrome image or movie. Few methods for motion pictures have been published (e.g., [33]). Various applications, such as comics (manga), cartoon films, and satellite images, have been reported (e.g., [34]). In addition, the technology is used not only to color images but also for image encoding [35]. In recent years, techniques developed in other fields of image processing, such as image matting [36], image inpainting [37], and physical reflection models [38], have been applied to colorization. The scope of colorization is not limited to coloring algorithms but extends to the color-to-gray problem (e.g., [39]). This problem is interesting and could be a new direction in colorization. The accuracy of colorization for monochrome video needs to be improved and should be considered an essential challenge for the future.

Color in medical imaging
In general, medical imaging focuses mostly on analysing the content of the images rather than the artefacts linked to the technologies used.
Most of the images, such as X-ray and tomographic images, echographs, or thermographs, are monochrome in nature. In a first application of color image processing, pseudocolorization was used to aid the interpretation of transmitted microscopy (including stereo microscopy, 3D reconstructed images, and fluorescence microscopy) [40]. In biomedical imaging, an area of increasing significance in society, color information has been used extensively, amongst other things, to detect skin lesions, glaucoma in eyes [41], and microaneurysms in color fundus images [42], to measure blood-flow velocities in the orbital vessels, and to analyze tissue microarrays (TMAs) or cDNA microarrays [43,44]. Current approaches are based on colorimetric interpretation, but multispectral approaches can lead to more reliable diagnoses. Multispectral image processing may also become an important core technology for the business units "nondestructive testing" and "aerial photography," assuming that these groups expand their applications into the domain of digital image processing. The main problem in medical imaging is to model the image formation process (e.g., digital microscopes [45], endoscopes [46], color-Doppler echocardiography [47]) and to correlate image interpretation with physics-based models. In medical applications, lighting conditions are usually controlled. However, several medical applications are faced with the problem of noncontrolled illumination, such as in dentistry [48] or in surgery.
Another important problem addressed in medical imaging is the quality of images and displays (e.g., sensitivity, contrast, spatial uniformity, color shifts across the grayscale, angular-related changes of contrast, and angular color shifts) [49][50][51]. To address the problem of image quality, some systems classify images by assigning them to one of a number of quality classes, such as in retinal screening [50]. To classify image structures found within the image, Niemeijer et al. used a clustering approach based on multiscale filterbanks. The proposed method was compared, using different feature sets (e.g., image structure or color histograms) and classifiers, with the ratings of a human observer. The best system, based on a Support Vector Machine, had performance close to optimal, with an area under the ROC curve of 0.9968.
Another problem medical imaging has to face is how to quantify the evolution of a phenomenon and, more generally, how to assist diagnosis. Unfortunately, few studies have been published in this domain. Conventional image processing based on low-level features, such as clustering or segmentation, may be used to analyze the color contrast between neighboring pixels or the color homogeneity of regions in order to follow the evolution of a phenomenon in medical imaging applications, but it is not suited to high-level interpretation. Perhaps a combination of low-level features such as color features, geometrical features, and structure features could improve the relevance of the analysis (e.g., see [52]). Another strategy would consist of extracting high-level metadata from specimens to characterize them, to abstract their interpretation, and to correlate them with clinical data, and then of using these metadata for automated and accurate analysis of digitized images.
Lastly, dentistry is faced with complex lighting phenomena (e.g., translucency, opacity, light scattering, gloss effect, etc.) which are difficult to control. Likewise, cosmetic science is faced with the same problems. The main tasks of dentistry and cosmetic science are color correction, gloss correction, and face shape correction.

Color in other applications
We have mentioned in this section several problems of medical applications, but we could also mention the problems of assisting diagnosis in each area of particular expertise (meteorologists, climatologists, geographers, historians, etc.). Likewise, we could mention the problems of image and display quality in web applications, HDTV, graphic arts, and so on, or applications of nondestructive quality control in numerous areas, including paintings, varnishes, and materials in the car industries, aeronautical packaging, or the control of products in the food industry. Numerous papers have shown that even if most of the problems in color image science are similar for various applications, color imaging solutions are closely linked to the kinds of image and to the applications.

Color spaces
Rather than using a conventional color space, another solution consists of using an ad hoc color space based on the most characteristic color components of a given set of images. Thus, Benedetto et al. [53] proposed to use the YST color space to watermark images of human faces, where Y, S, and T represent, respectively, the brightness component, the average color value of a set of different colors of human faces, and the color component orthogonal to the two others. The YST color space is then used to watermark images that have the same color characteristics as the set of images used. Such a watermarking process is robust to illumination changes because the S component is relatively invariant to illumination changes.
Other solutions have also been proposed for other kinds of processes, such as the following.
(i) For segmentation. The Fisher distance strategy has been proposed in [54] in order to perform figure-ground segmentation. The idea is to maximize the foreground/background class separability using a linear discriminant analysis (LDA) method.
(ii) For feature detection. The diversification principle strategy has been proposed in [55] in order to perform selection and fusion of color components. The idea is to exploit the nonperfect correlation between color components or feature detection algorithms through a weighting scheme which yields maximal feature discrimination. Considering that a tradeoff exists between color invariant components and their discriminating power, the authors proposed to weight color components automatically to arrive at a proper balance between color invariance under varying viewing conditions (repeatability) and discriminative power (distinctiveness).
(iii) For tracking. The adaptive color space switching strategy has been proposed in [56] in order to perform tracking under varying illumination. The idea is to dynamically select, among all conventional color spaces, the best color space for a given task (e.g., tracking) as a function of the state of the environment.
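To make the class-separability idea of item (i) concrete, the following minimal sketch computes a Fisher discriminant direction for two sets of color samples. It is not the method of [54]: the function names, the small ridge term, and the synthetic samples are assumptions introduced only for this illustration.

```python
import numpy as np

def fisher_color_axis(foreground, background):
    """Unit 3-vector maximizing Fisher's criterion between two color sample sets.

    foreground, background: arrays of shape (n, 3) holding color triplets.
    """
    fg = np.asarray(foreground, dtype=float)
    bg = np.asarray(background, dtype=float)
    mean_fg, mean_bg = fg.mean(axis=0), bg.mean(axis=0)
    # Within-class scatter: sum of the scatter matrices of both classes.
    scatter = (fg - mean_fg).T @ (fg - mean_fg) + (bg - mean_bg).T @ (bg - mean_bg)
    # Tiny ridge term (an assumption) keeps the solve stable for degenerate samples.
    scatter += 1e-6 * np.eye(3)
    w = np.linalg.solve(scatter, mean_fg - mean_bg)
    return w / np.linalg.norm(w)

def fisher_ratio(w, foreground, background):
    """Between-class separation divided by within-class scatter along direction w."""
    fg = np.asarray(foreground, dtype=float) @ w
    bg = np.asarray(background, dtype=float) @ w
    within = ((fg - fg.mean()) ** 2).sum() + ((bg - bg.mean()) ** 2).sum()
    return (fg.mean() - bg.mean()) ** 2 / within

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical foreground/background color samples in a unit RGB cube.
    fg = rng.normal([0.6, 0.4, 0.3], 0.05, size=(200, 3))
    bg = rng.normal([0.4, 0.5, 0.35], 0.05, size=(300, 3))
    w = fisher_color_axis(fg, bg)
    print("discriminant axis:", w)
    print("separability along it:", fisher_ratio(w, fg, bg))
```

Projecting every pixel onto the returned axis yields a single ad hoc color component on which a simple threshold separates the two classes better than any individual R, G, or B channel would for these samples.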
These solutions could be extended to more image processing tasks than those initially considered, provided that they are adapted to these tasks. The proper use and understanding of these solutions is necessary for the development of new color image processing algorithms. In our opinion, there is room for the development of other solutions for choosing the best color space for a given image processing task. Lastly, to decompose color data into different components, such as a lightness component and a color component, new techniques have recently appeared, such as quaternion theory [57,58] or other mathematical models based on polar representation [59]. For example, Denis et al. [57] used the quaternion representation for edge detection in color images. They constrained the discrete quaternionic Fourier transform to avoid information loss during processing and defined new spatial and frequency operators to filter color images. Shi and Funt [58] used the quaternion representation for segmenting color images. They showed that the quaternion color texture representation can be used to successfully divide an image into regions on the basis of texture.
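As a generic illustration of how a quaternion representation can separate a lightness-like component from a chromatic one, the sketch below encodes an RGB triplet as a pure quaternion and splits it into parts parallel and perpendicular to the gray axis. It does not reproduce the operators defined in [57] or [58]; the function names and the choice of the gray axis (i + j + k)/√3 are assumptions made for this example.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions given as [w, x, y, z]."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    ])

def split_gray_axis(rgb):
    """Split a color into parts parallel and perpendicular to the gray axis.

    The color is encoded as the pure quaternion q = R i + G j + B k and
    mu is the unit pure quaternion along the gray axis.
    """
    q = np.array([0.0, *rgb], dtype=float)
    mu = np.array([0.0, 1.0, 1.0, 1.0]) / np.sqrt(3.0)
    # For a unit pure mu, mu*q*mu reflects q in the plane perpendicular to mu.
    reflected = qmul(qmul(mu, q), mu)
    parallel = 0.5 * (q - reflected)       # achromatic (lightness-like) part
    perpendicular = 0.5 * (q + reflected)  # chromatic part
    return parallel, perpendicular

if __name__ == "__main__":
    par, perp = split_gray_axis([0.8, 0.4, 0.2])
    print("achromatic part:", par[1:])
    print("chromatic part: ", perp[1:])
```

The achromatic part carries the mean of the three channels on each component, while the chromatic part sums to zero, so the two parts add back to the original color.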

Color image appearance (CAM)
The aim of the color appearance model is to model how the human visual system perceives the color of an object or of an image under different points of view, different lighting conditions, and with different backgrounds.
The principal role of a CAM is to achieve successful color reproduction across different media, for example, to transform input images from film scanners and cameras onto displays, film printers, and data projectors while taking the human visual system (HVS) into account. To this end, a CAM must adapt to the viewing conditions, that is, ambient light, surround color, screen type, viewing angle, and distance. The standard CIECAM02 [60] has been successfully tested at various industrial sites for graphic arts applications, but it needs to be tested before being used in other viewing conditions (e.g., cinematographic viewing conditions).
Research efforts have been devoted to developing a color appearance model for predicting color appearance under different viewing conditions. A complete model should predict various well-known visual phenomena such as the Stevens effect, the Hunt effect, the Bezold-Brücke effect, simultaneous contrast, crispening, color constancy, memory color, discounting-the-illuminant, light, dark, and chromatic adaptation, the surround effect, and spatial and temporal vision. All these phenomena are caused by changes in viewing parameters, primarily illuminance level, field size, background, surround, viewing distance, spatial and temporal variations, viewing mode (illuminant, surface, reflecting, self-luminous, or transparent), structure effect, shadow, transparency, neon effect, saccade effect, stereo depth, and so forth.
Many color appearance models have been developed since 1980. The most recent one is CIECAM02 [60]. Although CIECAM02 does provide satisfactory predictions for a wide range of viewing conditions, many limitations remain. Let us consider four of these limitations: (1) objective determination of viewing parameters; (2) prediction of color appearance under mesopic vision; (3) incorporation of spatial effects for evaluating static images; (4) consideration of the temporal effects of the human visual system for moving images.
The first limitation is due to the fact that in CIECAM02 the viewing conditions need to be defined in terms of illumination (light source and luminance level), luminance factor of the background, and surround (average, dim, or dark). Many of these parameters are very difficult to define, which leads to confusion in industrial application and deviations in experimentation. The surround condition is highly critical for predicting accurate color appearance, especially when associated with viewing conditions for different media. Typically, viewing a photograph or a print in a normal office environment is regarded as a "bright" or "average" surround, whereas watching TV in a dimly lit living room can be categorized as a "dim" surround, and observing projected slides and cinema images in a darkened room is a "dark" surround. Users currently have to determine which viewing condition parameter values should be used. Recent work has been carried out by Kwak et al. [61] to better predict changes in color appearance with different viewing parameters.
The second shortcoming concerns the state of visual adaptation at low light levels (mesopic vision). Most models of color appearance assume photopic vision and completely disregard the contribution of rods at low levels of luminance. There are few color appearance datasets for mesopic vision, and the experimental data from conventional vision research are difficult to apply to color appearance modeling because of the different experimental techniques employed (haploscopic matching, flicker photometry, etc.). The only color appearance model so far that includes a rod contribution is the Hunt 1994 model but, when this was adapted to produce CIECAM97s and later CIECAM02, the contributions of the rod signal to the achromatic luminance channel were omitted [62]. In a recent study, color appearance under mesopic vision conditions was investigated using a magnitude estimation technique [8,63]. Larger stimuli covering both foveal and perifoveal regions were used to probe the effect of the rods. It was confirmed that colors looked "brighter" and more colorful for a 10-degree patch than for a 2-degree patch, an effect that grew at lower luminance levels. It seemed that perceived brightness was increased by the larger relative contribution of the rods at lower luminance levels and that the increased brightness induced higher colorfulness. It was also found that colors with green-blue hues were more affected by the rods than other colors, an effect that corresponds to the spectral sensitivity of the rod cell and is known as the "Purkinje shift" phenomenon. Analysis of the experimental results led to the development of an improved lightness predictor, which gave superior results to eight other color appearance models in the mesopic region [61].
The third shortcoming is linked to the problem that the luminance of the white point and the luminance range (white-to-dark, e.g., from highlight to shadow) of the scene may have a profound impact on the color appearance. Likewise, the background surrounding the objects in a scene influences the judgement of human evaluators when assessing video quality using segmented content.
For the last shortcoming, an interesting direction to be pursued is the incorporation of spatial and temporal effects of the human visual system into color appearance models. For example, although foveal acuity is far better than peripheral acuity, many studies have shown that the near periphery resembles foveal vision for moving and flickering gratings. This is especially true for sensitivity to small vertical displacements and for detection of coherent movement in peripherally viewed random-dot patterns. Central foveal and peripheral vision are qualitatively similar in spatial-temporal visual performance, and this phenomenon has to be taken into account in color appearance modeling. Numerous papers have investigated spatial and temporal effects [64][65][66][67].
Several studies have shown that the human visual system is more sensitive to low frequencies than to high frequencies. Likewise, several studies have shown that the human visual system is less sensitive to noise in dark and bright regions than in other regions. Lastly, the human visual system is highly insensitive to distortions in regions of high activity (e.g., salient regions) and is more sensitive to distortions near edges (object contours) than in highly textured areas. Unfortunately, these spatial effects are not sufficiently taken into account by the CIECAM97s or CIECAM02 color appearance models. A new technical committee, the TC1-68 "Effect of stimulus size on colour appearance," was created in 2005 to compare the appearance of small and large uniform stimuli on a neutral background. Even though numerous papers have been published on this topic, in particular in the proceedings of the CIE Expert Symposium on Visual Appearance organized in 2006 [68][69][70][71], there is a need for further research on spatial effects.
The main limitation, for color imaging, of the color appearance models previously described is that they can only predict the appearance of a single stimulus under "reference conditions" such as a uniform background. These models can be used successfully in color imaging, as they are able to compute the influence of viewing conditions, such as the surround lighting or the overall viewing luminance, on the appearance of a single color patch. The problem with these models is that the interactions between individual pixels are mostly ignored. To deal with this problem, spatial appearance models have been developed, such as the iCAM [64], which take into account both the spatial and color properties of the stimuli and the viewing conditions. The goal in developing the iCAM was to create a single model applicable to image appearance, image rendering, and image quality specifications and evaluations. This model was built upon previous research in uniform color spaces, the importance of image surround, algorithms for image difference and image quality measurement [72], insights into observers' eye movements while performing various visual imaging tasks, adaptation to natural scenes, and an earlier model of spatial and color vision applied to color appearance problems and high dynamic range (HDR) imaging.
The iCAM model has a sound theoretical background; however, it is based on empirical equations rather than on a standardized color appearance model such as CIECAM02, and some parts are still not fully implemented. It is quite efficient in dealing with still images, but it needs to be improved and extended for video appearance [64]. Moreover, the filters implemented are only spatial and cannot contribute to improving color rendering for mesopic conditions with high contrast ratios and a large viewing field. Consequently, the concept of and the need for image appearance modeling are still under discussion in Division 1 of the CIE, in particular in the TC 1-60 "Contrast Sensitivity Function (CSF) for Detection and Discrimination." Likewise, how to define and predict the appearance of a complex image is still an open question.
Appreciating the principles of color image appearance and, more generally, the principles of visual appearance opens the door to improving color image processing algorithms. For example, the development of emotional models related to color perception should contribute to the understanding of color and light effects in images (see CIE Color Reportership R1-32 "Emotional Aspects of Color"). As another example, the development of measurement scales that relate to perceived texture should help to analyze textured color images. Likewise, the development of measurement scales that relate to perceived gloss should help to describe perceived colorimetric effects. Numerous studies have been done on the "science" of appearance in the CIE Technical Committee TC 1-65 "Visual Appearance Measurement."

Color difference metrics
Beyond the problem of describing color appearance, there also arises the problem of measuring color differences in a color space. The CIEDE2000 color difference formula was standardized by the CIE in 2000 in order to compensate for some errors in the CIELAB and CIE94 formulas [73]. Unfortunately, the CIEDE2000 color difference formula suffers from mathematical discontinuities [74].
In order to develop/test new color spaces with Euclidean color difference formulas, new reliable experimental datasets need to be used (e.g., using visual displays, under illuminating/viewing conditions close to the "reference conditions" suggested for the CAM). This need has recently been expressed by the Technical Committee CIE TC 1-55 "Uniform color space for industrial color difference evaluation" [75]. The aim of this TC is to propose "a Euclidean color space where color differences can be evaluated for reliable experimental data with better accuracy than the one achieved by the CIEDE2000 formula." (See recent studies of the TC1-63 "Validity of the range of the CIEDE2000" and R1-39 "Alternative Forms of the CIEDE2000 Colour-Difference Equations.") The usual color difference formulas, such as the CIEDE2000 formula, have been developed to predict color differences under specific illuminating/viewing conditions close to the "reference conditions." Conversely, the CIECAM97s and CIECAM02 color appearance models have been developed to predict the change of color appearance under various viewing conditions. These CIECAM97s and CIECAM02 models involve seven attributes: brightness (Q), lightness (J), colorfulness (M), chroma (C), saturation (s), hue composition (H), and hue angle (h).
Lastly, let us note that while the CIE L*a*b* ΔE metric can be seen as a Euclidean color metric, the S-CIELAB space has the advantage of taking into account the differences in sensitivity of the HVS in the spatial domain, for example between homogeneous and textured areas.
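To illustrate the difference between a Euclidean color difference and a weighted one, the following sketch computes the CIELAB ΔE*ab (a plain Euclidean distance) and the CIE94 difference for a pair of L*a*b* values. The function names are our own, the sample colors are arbitrary, and the CIE94 weights shown are the commonly quoted graphic arts values (K1 = 0.045, K2 = 0.015, with all parametric factors set to 1); CIEDE2000 is deliberately left out of this sketch.

```python
import math

def delta_e76(lab1, lab2):
    """CIELAB color difference: Euclidean distance between two (L*, a*, b*) triplets."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))

def delta_e94(lab_ref, lab_sample, k_l=1.0, k1=0.045, k2=0.015):
    """CIE94 color difference; the chroma of the reference color sets the weights."""
    l1, a1, b1 = lab_ref
    l2, a2, b2 = lab_sample
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    d_l = l1 - l2
    d_c = c1 - c2
    # Hue difference squared, clamped at zero to absorb rounding errors.
    d_h_sq = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2, 0.0)
    s_c = 1.0 + k1 * c1
    s_h = 1.0 + k2 * c1
    return math.sqrt((d_l / k_l) ** 2 + (d_c / s_c) ** 2 + d_h_sq / s_h ** 2)

if __name__ == "__main__":
    reference = (50.0, 10.0, 10.0)
    sample = (50.0, 13.0, 14.0)
    print("Delta E*ab:", delta_e76(reference, sample))
    print("Delta E94: ", delta_e94(reference, sample))
```

Because the CIE94 weights depend on the chroma of the reference color, swapping the two arguments changes the result, which is one reason why such formulas do not define a Euclidean distance in L*a*b*.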

COLOR IMAGE PROCESSING
The following subsections focus on the most recent trends in quantization, filtering and enhancement, segmentation, coding and compression, watermarking, and lastly on multispectral color image processing. Several state-of-the-art surveys on various aspects of image processing have been published in the past. Rather than describing the issues raised by these topics in general terms, we focus on the specificities of color in advanced topics.

Color image quantization
The goal of a quantization method is to build a set of representative colors such that the perceived difference between the original image and the quantized one is as small as possible. The definition of relevant criteria to characterize the perceived image quality is still an open problem. One criterion commonly used by quantization algorithms is the minimization of the distance between each input color and its representative. Such a criterion may be measured by the total squared error, whose minimization reduces the distances within each cluster. A dual approach tries to maximize the distance between clusters. Note that the distance of each color to its representative depends on the color space in which the mean squared error is computed. Several strategies have been developed to quantize a color image; among them, vector quantization (VQ) is the most popular. VQ can also be used as an image coding technique that achieves a high data compression ratio [76].
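The total squared error criterion can be sketched with a small k-means style quantizer, shown below. This is only a minimal illustration under our own assumptions (Lloyd iterations, random initialization from the distinct input colors, a fixed iteration count, and hypothetical function and parameter names); it is not one of the specific algorithms surveyed here.

```python
import numpy as np

def quantize_colors(pixels, k, iterations=20, seed=0):
    """Build a palette of at most k colors by reducing the total squared error.

    pixels: array-like of color triplets (any shape ending in 3).
    Returns (palette, labels, total_squared_error).
    """
    pixels = np.asarray(pixels, dtype=float).reshape(-1, 3)
    rng = np.random.default_rng(seed)
    unique = np.unique(pixels, axis=0)
    # Initialize the palette with distinct colors drawn from the image.
    chosen = rng.choice(len(unique), size=min(k, len(unique)), replace=False)
    palette = unique[chosen]
    for _ in range(iterations):
        # Assign each pixel to its nearest representative color.
        dist = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each representative to the mean of its cluster (skip empty clusters).
        for j in range(len(palette)):
            members = pixels[labels == j]
            if len(members):
                palette[j] = members.mean(axis=0)
    dist = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    labels = dist.argmin(axis=1)
    error = dist[np.arange(len(pixels)), labels].sum()
    return palette, labels, error

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    image = rng.random((16, 16, 3))  # hypothetical RGB image with values in [0, 1]
    palette, labels, error = quantize_colors(image, k=8)
    print("palette size:", len(palette), "total squared error:", error)
```

Running the same function on pixels expressed in another color space (e.g., L*a*b* instead of RGB) generally changes both the palette and the error, which reflects the remark above that the distance to the representative depends on the color space used.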
In previous years, image quantization algorithms were very useful because most computers used 8-bit color palettes, but nowadays all displays, even those of cell phones, have a high bit depth. Image quantization algorithms are considered much less useful today because of the increasing power of most digital imaging devices and the decreasing cost of memory. The future of color quantization does not lie with the display community, because the bit depth of all triprimary displays is currently at least 24 bits (or higher, e.g., 48 bits). Conversely, the future of color quantization will be guided by the image processing community, because typical color imaging processes such as compression, watermarking, filtering, segmentation, or retrieval rely on quantization.
It has been demonstrated that the quality of a quantized image depends on the image content and on the gray levels of the color palette (LUT); likewise, the quality of a compression or watermarking process based on a quantization process depends on these features [77]. In order to illustrate this aspect, let us consider the problem of color image watermarking. Several papers have proposed a color watermarking scheme based on a quantization process. Among them, Pei and Chen [78] proposed an approach which embeds two watermarks in the same host image: one in the a*b* chromatic plane, carrying a fragile message, by modulating the indexes of a color palette obtained by color quantization, and another in the L* lightness component, carrying a robust message, using a gray-level palette also obtained by quantization. Chareyron et al. [79] proposed a vector watermarking scheme which embeds one watermark in the xyY color space by modulating the color values of pixels previously selected by color quantization. This scheme is based on the minimization of color changes between the watermarked image and the host image in the L*a*b* color space.

Color image filtering and enhancement
The function of a filtering and signal enhancement module is to transform a signal into another more suitable for a given processing task. As such, filters and signal enhancement modules find applications in image processing, computer vision, telecommunications, geophysical signal processing, and biomedicine. However, the most popular filtering application is the process of detecting and removing unwanted noise from a signal of interest, such as color images and video sequences. Noise affects the perceptual quality of the image, decreasing not only the appreciation of the image but also the performance of the task for which the image was intended. Therefore, filtering is an essential part of any image processing system, whether the final product is used for human inspection, such as visual inspection, or for automatic analysis.
In the past decade, several color image processing algorithms have been proposed for filtering and noise reduction, targeting in particular additive impulsive and Gaussian noise, speckle noise, additive mixture noise, and striping noise. A comprehensive class of vector filtering operators has been proposed, researched, and developed to effectively smooth noise, enhance signals, detect edges, and segment color images [80]. This framework, which has supplanted previously proposed solutions, appears to offer the best performance reported to date and has inspired a number of variants based on the framework of [81], such as those reported in [82][83][84][85][86][87][88][89][90].
Most of these solutions are able to outperform classical rank-order techniques. However, they do not produce convincing results for additive noise [89] and fall short of delivering the performance reported in [80]. It should be added at this point that classical color filters are designed to perform a fixed amount of smoothing, so that they are not able to adapt to local image statistics [89]. Conversely, adaptive filters are designed to filter only those pixels that are likely to be noisy while leaving the rest of the pixels unchanged. For example, Jin and Li [88] proposed a "switching" filter which better preserves thin lines, fine details, and image edges. Other filtering techniques, able to suppress impulsive noise and keep image structures by modifying the importance of the central pixel in the filtering process, have also been developed [90]. They provide better detail preservation while the impulses are reduced [90]. A disadvantage of these techniques is that some parameters have to be tuned in order to achieve an appropriate performance. To solve this problem, a new technique based on a fuzzy metric has recently been developed, in which an adaptive parameter is automatically determined at each image location by using local statistics [90]. This new technique is a variant of the filtering technique proposed in [91]. Numerous filtering techniques also use morphological operators, wavelets, or partial differential equations [92,93].
Several research groups worldwide have been working on these problems, although none of the proposed solutions seems to outperform the adaptive designs reported in [80]. Nevertheless, there is room for improvement in existing vector image processing to achieve a tradeoff between detail preservation (e.g., edge sharpness) and noise suppression. The challenge of color image denoising results mainly from two aspects: the diversity of the noise characteristics and the nonstationary statistics of the underlying image structures [87].
The main problem these groups have to face is how to evaluate the effectiveness of a given algorithm. As for other image processing algorithms, the effectiveness of an algorithm is image-dependent and application-dependent. Although there is no universal method for color image filtering and enhancement, the design criteria accompanying the frameworks reported in [80,81,86] appear to offer the best guidance to researchers and practitioners.
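The switching principle described above (filter only the pixels that look noisy) can be sketched with a basic vector median filter, a classical rank-order operator for color images. The code below is a simplified illustration and not the filter of [88] or [90]: the function names, the 3 × 3 window, the Euclidean distance, and the fixed threshold are all assumptions chosen for brevity.

```python
import numpy as np

def vector_median(window):
    """Return the color in the window with the smallest summed distance to all others.

    window: array of shape (n, 3).
    """
    diffs = window[:, None, :] - window[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    return window[dist.sum(axis=1).argmin()]

def switching_vmf(image, threshold=0.2, radius=1):
    """Replace a pixel by the vector median of its window only if it looks like an impulse.

    image: array of shape (H, W, 3) with values in [0, 1].
    threshold: hypothetical decision level on the distance to the vector median.
    """
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    out = image.copy()
    size = 2 * radius + 1
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            window = padded[y:y + size, x:x + size].reshape(-1, 3)
            median = vector_median(window)
            # Switching rule: pixels close to the vector median are left untouched.
            if np.linalg.norm(image[y, x] - median) > threshold:
                out[y, x] = median
    return out

if __name__ == "__main__":
    img = np.full((5, 5, 3), 0.5)
    img[2, 2] = [1.0, 0.0, 0.0]  # a single impulsive outlier
    print(switching_vmf(img)[2, 2])
```

The fixed threshold is exactly the kind of parameter that, as noted above, has to be tuned by hand; adaptive designs replace it with a value derived from local image statistics.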

Color image segmentation
Color image segmentation refers to partitioning an image into different regions that are homogeneous with respect to some image feature. Color image segmentation is usually the first task of any image analysis process. All subsequent tasks, such as feature extraction and object recognition, rely heavily on the quality of the segmentation. Without a good segmentation algorithm, an object may never be recognizable. Oversegmenting an image will split an object into different regions, while undersegmenting it will group various objects into one region. In this way, the segmentation step determines the eventual success or failure of the analysis. For this reason, considerable care is taken to improve the state-of-the-art in color image segmentation. The latest survey on color image segmentation techniques was published in 2007 by Paulus [94]. This survey discussed the advantages and disadvantages of classical segmentation techniques, such as histogram thresholding, clustering, edge detection, region-based methods, vector-based methods, and fuzzy techniques, as well as physics-based methods. Since then, physics-based methods as well as those based on fuzzy logic concepts appear to offer the most promising results. Methodologies utilizing active contour concepts [95] or hybrid methods combining global information, such as image histograms, with local information, such as regions and edge information [96,97], appear to deliver efficient results.
Color image segmentation is a rather demanding task, and the solutions developed have to deal effectively with image shadows, illumination variations, and highlights. Amongst the most promising lines of work in the area is the computation of image invariants that are robust to photometric effects [54,98,99]. Unfortunately, too many color invariant models have been introduced in the open literature, which makes it quite difficult to select the best model and to combine it with local image structures (e.g., color derivatives) in order to produce the best result. In [100], Gevers et al. survey the possible solutions available to the practitioner. In specific applications, shadow, shading, illumination, and highlight edges have to be identified and processed separately from geometrical edges such as corners and T-junctions. To address this issue, local differential structures and color invariants in a multidimensional feature space were used in [100] to detect salient image structures (i.e., edges) on the basis of their physical nature. In [101], the authors proposed a classification of edges into five classes, namely, object edges, reflectance edges, illumination/shadow edges, specular edges, and occlusion edges, to enhance the performance of the segmentation solution utilized.
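As a minimal example of a photometric invariant, the sketch below computes normalized rgb chromaticities, which are unchanged when a pixel is scaled by a shading or intensity factor (assuming a matte surface and no highlights). This is one well-known invariant chosen for illustration and not necessarily the model used in [54,98,99]; the function name and the guard value `eps` are assumptions.

```python
import numpy as np

def normalized_rgb(image, eps=1e-12):
    """Map each RGB pixel to (r, g, b) = (R, G, B) / (R + G + B).

    image: array of shape (..., 3) with nonnegative values.
    eps: small guard (an assumption) that avoids division by zero on black pixels.
    """
    image = np.asarray(image, dtype=float)
    total = image.sum(axis=-1, keepdims=True)
    return image / np.maximum(total, eps)

if __name__ == "__main__":
    lit = np.array([[[0.6, 0.3, 0.1]]])
    shaded = 0.3 * lit  # same surface under a weaker illumination intensity
    print(normalized_rgb(lit))
    print(normalized_rgb(shaded))  # same chromaticities as the lit pixel
```

Segmenting on such chromaticities rather than on raw RGB values keeps a uniformly colored but unevenly shaded object in a single region, at the cost of discarding intensity information and of instability near black pixels.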
Shadow segmentation is of particular importance in applications such as video object extraction and tracking. Several research proposals have been developed in an attempt to detect a particular class of shadows in video images, namely, moving cast shadows, based on the shadow's spectral and geometric properties [102]. The problem is that cast shadow models cannot be effectively used to detect other classes of shadows, such as self-shadows or shadows in diffuse penumbra [102], suggesting that existing shadow segmentation solutions could be further improved using invariant color features.
Presently, the main focus of the color image processing community appears to be the fusion of several low-level image features so that image content can be better described and processed. Several studies have proposed solutions to combine color derivative features and color invariant features, color features and other low-level features (e.g., color and texture [103], color and shape [100]), or low-level features and high-level features (e.g., from graph representation [104]). However, none of the proposed solutions appears to provide the expected performance, which has led to solutions that borrow ideas and concepts from sister signal processing communities. For example, in [105] the authors propose the utilization of color masks and MPEG-7 descriptors in order to segment prespecified target objects in video sequences. According to this solution, available a priori information on specified target objects, such as skin color features in head-and-shoulder sequences, is used to automatically segment these objects by focusing on a small part of the image. In the opinion of the authors, the future of color image segmentation solutions will rely heavily on the development and use of intermediate-level features derived using saliency descriptors and on the use of a priori information.
Color segmentation can be used in numerous applications, such as skin detection. Skin detection plays an important role in a wide range of image processing applications, ranging from face detection, face tracking, and content-based image retrieval systems to various human-computer interaction domains [106][107][108][109]. A survey of skin modeling and classification strategies based on color information was published by Kakumanu et al. in 2007 [108].

Color coding and compression
A number of video coding standards have been developed, ITU-T H.261, H.263, ISO/IEC MPEG-1, MPEG-2, MPEG-4, and H.264/AVC, and deployed in multimedia applications such as video conferencing, storage video, video-on-demand, digital television broadcasting, and Internet video streaming [110].In most of the developed solutions, color has played only a peripheral role.However, in the opinion of the authors, video coding solutions could be further improved by utilizing color and its properties.Most of the traditional video coding techniques are based on the hypothesis that the so-called luminance component, that is the Y channel in the YCbCr color space representation, provides meaningful textural details which can deliver acceptable performance without resorting to the use of chrominance planes.This fundamental design assumption explains the use of models with separate luminance and chrominance components in most transform-based video coding solutions.In [110], the authors suggested the utilization of the same distribution function for both the luminance and chrominance components demonstrating the effectiveness of a nonseparable color model both in terms of compression ratio and compressed sequence picture quality.
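The separation into luminance and chrominance components discussed above can be illustrated with a minimal sketch. It assumes the full-range ITU-R BT.601 coefficients (as used in JFIF); the text does not state which YCbCr variant a given codec uses, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB triple to YCbCr.

    Assumption: full-range BT.601 coefficients; results are not clamped
    or rounded, so values may fall slightly outside [0, 255].
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b            # luminance
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue-difference chroma
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red-difference chroma
    return y, cb, cr
```

For any gray pixel (r = g = b) both chrominance values stay at the neutral level 128, so all of the signal lies in the Y channel. This is consistent with the design assumption described above, in which luminance is treated as the main carrier of textural detail.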
Unfortunately, different codecs use different chroma subsampling ratios, as appropriate to their compression needs. For example, video compression schemes for Web and DVD use a 4 : 2 : 0 color sampling pattern, whereas the DV standard uses a 4 : 1 : 1 sampling ratio. A common problem when an end user wants to watch a video stream encoded with a specific codec is that, if the exact codec is not present and properly installed on the user's machine, the video will not play (or will not play optimally). Spatial and temporal downsampling may also be used to reduce the raw data rate before the basic encoding process. In transform-based encoding, the most popular transform is the 8 × 8 discrete cosine transform (DCT).
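To make the sampling patterns concrete, the following sketch subsamples one chroma plane in the 4 : 2 : 0 manner and counts the samples per frame for several schemes. It is an illustration only: averaging each 2 × 2 block is one simple choice of filter, real codecs differ in filtering and chroma siting, and both function names are hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    `plane` is a list of equal-length rows; even dimensions are assumed.
    """
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even plane dimensions")
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def samples_per_frame(width, height, scheme):
    """Total number of Y, Cb, and Cr samples in one frame."""
    # Fraction of luma resolution kept by each chroma plane.
    chroma_fraction = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25, "4:1:1": 0.25}
    luma = width * height
    return luma + 2 * int(luma * chroma_fraction[scheme])
```

With these definitions, 4 : 2 : 0 and 4 : 1 : 1 carry the same number of samples (half of 4 : 4 : 4); they differ in where the chroma resolution is removed, which is one reason material encoded under one pattern does not map directly onto the other.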
In the area of still image compression, there has been a growing interest in wavelet-based embedded image coders because they enable high quality at large compression ratios, very fast encoding and decoding, progressive transmission, low computational complexity, low dynamic memory requirements, and so forth [111]. The recent survey of [112] summarized color image compression techniques based on subband transform coding principles. The discrete cosine transform (DCT), the discrete Fourier transform (DFT), the Karhunen-Loeve transform (KLT), and the wavelet tree decomposition were reviewed. The authors proposed a rate-distortion model to determine the optimal color components and the optimal bit allocation for the compression. It is interesting to note that these authors demonstrated that the YUV, YIQ, and KLT color spaces are not optimal for reducing bit allocation. There has also been great interest in vector quantization (VQ) because VQ provides a high compression ratio, and better performance than any other block coding technique may be obtained by increasing the vector length and codebook size. Lin and Chen extended this technique by developing a spread neural network with penalized fuzzy c-means (PFCM) clustering based on interpolative VQ for color image compression [113].
In [114], Dhara and Chanda surveyed color image compression techniques that are based on block truncation coding (BTC). The authors' recommendations to increase the performance of BTC include a proposal to reduce the interplane redundancy between color components prior to applying pattern fitting (PF) on each color plane separately. The work includes recommendations on determining the size of the pattern book, the number of levels in patterns, and the block size based on the entropy of each color plane. The resulting solution offers competitive coding gains at a fraction of the coding/decoding time required by existing solutions such as JPEG. In [115], the authors proposed a color image coding strategy which combines localized spatial correlation and intercolor correlation between color components in order to build a cost-effective solution with progressive transmission. Their idea is to exploit the correlation between color components instead of decorrelating them before applying the compression. Inspired by the huge success of set-partitioning sorting algorithms such as SPIHT and SPECK, there has also been extensive research on color image coding using the zerotree structure. For example, Nagaraj et al. proposed a color set partitioned embedded block coder (CSPECK) to handle color still images in the YUV 4 : 2 : 0 format [111]. By treating all color planes as one unit at the coding stage, CSPECK generates a single mixed bit-stream so that the decoder can reconstruct the color image with the best quality at that bit-rate.
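For readers unfamiliar with block truncation coding, the basic idea can be sketched for a single block of a single color plane. This is a generic two-level variant that preserves the block mean (in the style of absolute moment BTC), not the pattern-fitting method of [114]; the function names are hypothetical.

```python
def btc_encode(block):
    """Encode a flat list of pixel values as (low, high, bitmap).

    Pixels at or above the block mean are marked 1 and are reconstructed
    as `high`; the others are marked 0 and reconstructed as `low`.
    """
    mean = sum(block) / len(block)
    bitmap = [1 if p >= mean else 0 for p in block]
    highs = [p for p, bit in zip(block, bitmap) if bit]
    lows = [p for p, bit in zip(block, bitmap) if not bit]
    high = sum(highs) / len(highs)  # never empty: max(block) >= mean
    low = sum(lows) / len(lows) if lows else high  # flat block: one level
    return low, high, bitmap


def btc_decode(low, high, bitmap):
    """Rebuild the block from its two levels and the bitmap."""
    return [high if bit else low for bit in bitmap]
```

Each block is thus stored as two quantization levels plus one bit per pixel. Applied independently to three color planes, this yields three bitmaps per block, which is the interplane redundancy that the color BTC schemes discussed in this section try to reduce.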
Although it is well known that interframe-based coding schemes (such as MPEG), which exploit redundancy in the temporal domain, outperform intrabased coding schemes (like Motion JPEG or Motion JPEG2000) in terms of compression ratio, intrabased coding schemes have their own advantages, such as embeddedness, frame-by-frame editing, arbitrary frame extraction, and robustness to bit errors in error-prone channel environments, which the former schemes fail to provide [111]. Nagaraj et al. built on this observation to extend CSPECK to video by coding each frame of the sequence in an intrabased setting. They called this scheme Motion-SPECK and compared its performance on QCIF and CIF sequences against Motion-JPEG2000. The intended applications of such a video coder are high-end and emerging video applications such as high-quality digital video recording systems and professional broadcasting systems.
In general, the quality of a compressed video sequence is measured automatically by computing the PSNR on multimedia videos, consisting of CIF and QCIF video sequences compressed at various bit rates and frame rates [111,116]. However, the PSNR has been found to correlate poorly with subjective quality ratings, particularly at low bit rates and low frame rates. To address this problem, Ong et al. proposed an objective video quality measurement method that correlates better with human perception than the PSNR and the video structural similarity method [116]. On the other hand, Süsstrunk and Winkler reviewed the typical visual artifacts that occur due to high compression ratios and/or transmission errors [117]. They discussed no-reference artifact metrics for blockiness, blurriness, and colorfulness. In our opinion, objective video quality metrics will be useful for weighting the frame rate of coding algorithms with regard to content richness fidelity, distortion invisibility, and so forth. Much research has been carried out in this area, but little of it has focused on color information (see Section 6.5).
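As a reminder of what the PSNR measures, a minimal sketch is given below. It assumes 8-bit samples (peak value 255) supplied as flat sequences; for a color frame, the samples of all channels can simply be concatenated, which is one convention among several. The function name is hypothetical.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the measure depends only on the mean squared error, two distortions with the same error energy receive the same score regardless of where they occur or how visible they are, which helps explain the poor correlation with subjective ratings noted above.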
Lastly, it is interesting to note that even though the goals of compression and data hiding methods are by definition contradictory, these methods can be used jointly. While data hiding methods add perceptually irrelevant information in order to embed data, compression methods remove this irrelevancy and redundancy to reduce storage requirements. In the opinion of the authors, the future of color image compression will rely heavily on the development of joint methods combining compression and data hiding. For example, Lin and Chen proposed a color image hiding scheme which first compresses color data by an interpolative VQ scheme (IVQ), then encrypts color IVQ indices, sorts the codebooks of secret color image information, and embeds them into the frequency domain of the cover color image by the Hadamard transform (HT) [113]. On the other hand, Chang et al. [118] proposed a reversible hiding scheme which first compresses color data by a block-truncation coding scheme (BTC), then applies a genetic algorithm to reduce the number of binary bitmaps from three to one, and embeds the secret bits from the common bitmap and the three quantization levels of each block. According to Chang et al., unlike VQ with its codebook, BTC never requires any auxiliary information during the encoding and decoding procedures. In addition, BTC-compressed images usually maintain acceptable visual quality, and the output can be compressed further by using other lossless compression methods.

Color image watermarking
Over the last few years, color has become a major component not only in watermarking applications but also in security, steganography, and cryptography applications of multimedia content. In this section, we discuss only watermarking; for other topics, refer to the survey written by Lukac and Plataniotis in 2007 [5]. In watermarking, we tend to watermark the perceptually significant part of the image to ensure robustness rather than fidelity (except for fragile watermarks and authentication). Therefore, the whole challenge is how to introduce more and more significant information without perceptibility while keeping the distortion minimal. On the one hand, this relies upon encryption techniques and, on the other, on the integration of HSV models. Most watermarking schemes use either one or two perceptual components, such as color and frequency components. Obviously, the issue is how to combine the individual components so that a watermark with increased robustness and adequate imperceptibility is obtained [119,120].
Most of the recently proposed watermarking techniques operate in the spatial color image domain. The main advantage of spatial domain watermarking schemes is that their computational cost is lower than that of watermarking solutions operating in the transform image domain. One of the first spatial-domain watermarking schemes, the so-called least significant bit (LSB) scheme, was based on the principle of inserting the watermark in the low-order bits of the image pixels. Unfortunately, LSB techniques are highly sensitive to noise, and their watermarks can be easily removed. Moreover, as LSB solutions applied to color images use color transforms which are not reversible on a fixed-point processor, the watermark can be destroyed and the original image cannot be recovered, even if only the least significant bits are altered [121]. This problem is not specific to LSB techniques; it concerns any color image watermarking algorithm based on nonreversible forward and inverse color transforms on a fixed-point processor. Another problem with LSB-based methods is that most of them are built for raw image data rather than for the compressed image formats that are usually used across the Internet today [118]. To address this problem, Chang et al. proposed a reversible hiding method based on block truncation coding of compressed color images. The reversibility of this scheme is based on the order of the quantization levels of each block and on a property of natural images, namely that adjacent pixels are usually similar.
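The LSB principle, and its fragility, can be shown in a few lines. The sketch below works on a flat list of 8-bit sample values from a single channel and embeds the watermark bits in order from the first sample; practical schemes typically choose embedding positions with a secret key, which is omitted here. The function names are hypothetical.

```python
def embed_lsb(samples, bits):
    """Return a copy of `samples` whose first len(bits) values carry `bits`.

    Each bit (0 or 1) replaces the least significant bit of one sample,
    so no sample changes by more than one gray level.
    """
    if len(bits) > len(samples):
        raise ValueError("not enough samples to hold the watermark")
    marked = list(samples)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the LSB, then set it
    return marked


def extract_lsb(samples, count):
    """Read back the first `count` embedded bits."""
    return [s & 1 for s in samples[:count]]
```

For example, embedding the bits 1, 0, 1, 0 into the samples 200, 13, 54, 255 gives 201, 12, 55, 254. Adding a single gray level of noise to every marked sample flips every extracted bit, which illustrates why such watermarks are so easily destroyed.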
In the authors' opinion, watermarking quality can be improved through the utilization of the appearance models and color saliency maps.As a line for future research, it will be interesting to examine how to combine the various saliency maps that influence the visual attention, namely, the intensity map, contrast map, edginess map, texture map, and the location map [119,122,123].
Generally, when a new watermarking method is proposed, some empirical results are provided so that performance claims can be validated. However, at present there is no systematic framework or body of standard metrics and testing techniques that allows for a systematic comparative evaluation of watermarking methods. Even for benchmarked systems such as Stirmark or Checkmark, comparative evaluation of performance is still an open question [122]. From a color image processing perspective, the main weakness of these benchmarking techniques is that they are limited to gray-level images. Thus, in order to compute the fidelity between an original and a watermarked image, color images have to be converted to grayscale images. Moreover, such benchmarks use a black-box approach to compute the performance of a given scheme: they first compute various performance metrics, which they then combine to produce an overall performance score. According to Wilkinson [122], a number of separate performance metrics must be computed to describe the performance of a watermarking scheme more fully. Likewise, Xenos et al. [119] proposed a model based on four quality factors and approximately twenty criteria organized in three levels of analysis (i.e., high level, middle level, and low level). According to this recommendation, four major factors are considered as part of the evaluation procedure: high-level properties, such as the image type; color-related information, such as the depth and basic colors; color features, such as the brightness, saturation, and hue; and regional information, such as the contrast, location, size, and color of image patches. In the opinion of the authors, it will be interesting to undertake new investigations towards the development of a new generation of comprehensive benchmarking systems capable of measuring the quality of the watermarking process in terms of color perception.
Similar to solutions developed for still color images, the development of quality metrics that can accurately and consistently measure the perceptual differences between original and watermarked video sequences is a key technical challenge.Winkler [124] showed that the video quality metrics (VQM) could automatically predict the perceptual quality of video streams for a broad variety of video applications.In the author's opinion, these metrics could be refined through the utilization of high-level color descriptors.Unfortunately, very few works had been reported in the literature on the objective evaluation of the quality of watermarked videos.

Multispectral color image processing
A multispectral color imaging system is a system which captures and describes color information by a greater number of sensors than an RGB device, resulting in a color representation that uses more than three parameters. The problem with conventional color imaging systems is that they have some limitations, namely, dependence on the illuminant and on the characteristics of the imaging system. On the other hand, multispectral color imaging systems, based on spectral reflectance, are device and illuminant independent [7,30,31].
During the last few years, the importance of multispectral imagery has sharply increased following the development of new optical devices and the introduction of new applications. Trichromatic RGB color imaging is becoming unsatisfactory for many advanced applications, as well as for the interfacing of input/output devices and for color rendering in imaging systems. Color imaging must become spectrophotometric; multispectral color imaging is therefore the technique of the immediate future.
The advantages of multispectral systems are beginning to be appreciated by a growing group of researchers, many of whom have devoted considerable efforts over the past few years to developing new techniques. The importance of this subject is reflected by an increasing number of publications in journals and conference proceedings. Consequently, in 2002 the CIE established a new technical committee (TC8-07) devoted to this field (see http://www.colour.org/tc8-07/). For several years, this technical committee has been surveying the existing data formats for multispectral images in order to specify the requirements of a general data format and to define a new such format, based on previous knowledge.
To further understand existing solutions and to facilitate the development of new algorithms, a unified color representation is needed.This can be achieved only by using spectral approach to color.The basis for the theory is the spectral color signal reaching the detection system (human eye, eye of a solitary bee, or an artificial detector in industry).Some approaches towards this theory have been proposed including Karhunen-Loeve transform-based subspaces of a Hilbert space, time-frequency analysis by using Wigner distribution, group theory with Lie algebra, or quaternion as color representation in a complex space.Learning this unified spectral color theory requires much effort and cooperation between theorists and practitioners.In future, metrology in multispectral imaging measurements will be a large and diffuse research field and will be directly aimed at generic and precompetitive multisector-based activities [125,126] (see also http://www.multispectral.org/).
There are two main differences between multispectral remote sensing systems and multispectral color imaging systems. Firstly, in remote sensing systems, the information is captured through narrowband spectral filters to record the spectrum without attempting to match a human observer. Conversely, color imaging systems use spectral filters to recover the visible spectrum so as to match the human observer. Secondly, most remote sensing systems classify the acquired data into a number of known categories to reduce the amount of information. Conversely, in color imaging systems the goal is to acquire data without loss of visual information, that is, without dimensionality reduction of the acquired data. An alternative approach for multispectral color imaging systems involves sampling the spectra of images while preserving visual information. Numerous techniques can be used for recovering illuminant and surface reflectance data from recorded images. These techniques were reviewed and compared by Bochko et al. [31]. According to Bochko et al., there is room for improvement in existing methods for reconstructing reflectance spectra.

COLOR IMAGE ANALYSIS
According to several studies, color is perhaps the most expressive of all visual features.Furthermore, color features are robust to several image processing transforms such as geometric transform (e.g., translation and rotation of the regions of interest) and to partial occlusion and pose variations.For several years, the main challenge for color image analysis, and particularly for image retrieval and object recognition, has been to develop high-level features modeling the semantics of image content.The problem that we have to face is that there is a gap between this objective and the set of features which have been identified and experimented.Meanwhile, many low-level image features have been identified, such as color, texture, shape, and structure [127,128], or at an intermediate level, such as spatial arrangement and multiscale saliency.However, few high-level image features have been identified [11], with regard to the variety of images which can be seen and to the number of entity features which can be identified by an observer.

Color features
In numerous applications, such as color image indexing and retrieval, color features are used to compare or match objects through similarity metrics.However, Schettini et al. have shown that the robustness, the effectiveness, and the efficiency of color features in image indexing are still open issues [129].
Color features are also used to classify color regions or to recognize color objects in images [130][131][132]. Classical object matching methods are based on template matching, color histogram matching, or hybrid models [133]. Hurtu et al. [134] showed that taking into account the spatial organization of colors and the independence relationships between pixels improves the performance of classifiers. The spatial organization of colors is a key element of the structure of an image and ultimately one of the first to be perceived. It is an intermediate feature between low-level content such as color histograms and image semantics [134]. The earth mover's distance (EMD) is a robust method for comparing the spatial organization of color between images [135]. A major advantage of the EMD is that it does not require segmenting the image into regions, a step which is unnecessary and sometimes misleading. One of the main problems for color object recognition methods is coping with the object color uncertainty caused by different illumination conditions. To address this problem, van Gemert et al. proposed to use high-order invariant features with an entropy-based similarity measure [136]. Other invariant features have been considered, such as correlograms or SIFT. SIFT has been shown to be the most robust local invariant feature descriptor [137,138]. A colored SIFT, more robust than the conventional SIFT descriptors with respect to color and photometric variations, has been proposed by Abdel-Hakim and Farag [138]. As structural information and color information are often complementary, Schügerl et al. proposed a combined object redetection method using SIFT and MPEG-7 color descriptors extracted around the same interest points [139].
In order to unify efforts aiming to define descriptors that effectively and efficiently capture image content, the International Organization for Standardization (ISO) developed the MPEG-7 standard, specifically designed for the description of multimedia content [128,140]. The main problem with MPEG-7 is that it focuses on the representation of descriptions and their encoding rather than on the extraction of descriptors. For example, description schemes used in MPEG-7 specify complex structures and semantics grouping descriptors and other description schemes, such as segments and regions, which require a segmentation of multimedia data (still images and videos). However, MPEG-7 specifies neither how to automatically segment still images and videos into regions and segments, nor how to segment objects at the semantic level. The creation and application of MPEG-7 descriptions are outside the scope of the MPEG-7 standard. Another problem with MPEG-7 is its complexity; for example, 13 low-level descriptors have been defined to represent color, texture, and shape [11]. For some image analysis tasks, such as content-based image retrieval, it may be more efficient to combine selected MPEG-7 descriptors in a compact way rather than using several of them independently [141].
MPEG-7 provides seven color descriptors, namely, color space, color quantization, dominant colors, scalable color, color layout, color-structure, and group of frames/group of pictures color.Among them, some are histogram-derived descriptors such as the scalable color descriptor (SCD) constructed from a fixed HSV color space quantization and a Haar transform encoding.Others provided spatial information on the image color distribution, such as (i) the color layout descriptor (CLD) constructed in the YCbCr color space from a DCT transform with quantization, (ii) the color structure descriptor (CSD) constructed from the hue-min-max-difference (HMMD) using an 8 × 8 structuring element and a nonuniform quantization, (iii) the region locator descriptor (RLD) when regions are concerned.
The main advantage of these kinds of descriptors is that they can be used to embed into a color histogram some information on the spatial localization of color content.Several studies have shown the effectiveness of these descriptors in the context of image retrieval.Berreti et al. [128] noted that these descriptors may not be appropriate for capturing binary spatial relationships between complex spatial entities.
Other studies have also shown the effectiveness of combining spatial and color information in the context of content-based image retrieval.For example, Heidemann [142] proposed to compute color features from a local principal component analysis in order to represent spatial color distribution.The representation used is based on image local windows which are selected by two complementary data driven attentive mechanisms: a symmetry-based saliency map and an edge and corner detector.According to Dasiapoulou et al. [11], since the performance of the analysis depends on the availability of sufficiently descriptive and representative concepts definitions, among the future priorities is the investigation of additional descriptors and methodologies for their effective fusion.We particularly think of inherent objects colors descriptors which could only be computed for certain types of objects such as the sky, the sand, the vegetation, and a face.For example, to face the problem of skin region segmentation, several papers have tried to define the inherent colors of skin.Li et al. [99] conducted a survey on this problem.We also think about fuzzy spatial connectivity descriptors which could be used to measure the homogeneity of a region.Prados-Suárez et al. [143] proposed a fuzzy image segmentation algorithm based on Weber's parametric t-norm.According to the authors, Weber's t-norm provides suitable homogeneity measures to segment imprecise regions due to shadows, highlights, and color gradients.One advantage of this technique is to provide a way to define the "semantics" of homogeneity.
Beyond the problem of color image analysis, there is also a need to define which metadata (e.g., MPEG7 metadata) could be useful to increase the performance of analysis methods.For example, viewing conditions and display parameters are metadata which could be useful for accurate coding, representation, and analysis of color images.According to Ramanath et al. [144], all data which affect the image data need to be included with the image data.Thus, to characterize a digital camera we need to know (or to determine) the illumination under which the image was recorded.

Color saliency
The aim of color saliency models is to model how the human visual system perceives the colors in an image as a function of their local spatial organization [145,146].
The selection of regions of interest is directed by both neurological and cognitive resources. Neurological resources refer to bottom-up (stimuli-based) information, whereas cognitive resources refer to top-down (task-dependent) cues. Bottom-up information is controlled by low-level image features that stimulate achromatic and chromatic parallel pathways of the human visual system. Top-down cues are controlled by high-level cognitive strategies largely influenced by memory and task-oriented constraints.
One drawback of most existing models is that color information is either not integrated in the computation of the saliency map [100] or taken into account only through the raw RGB components of color images. For example, van de Weijer et al. proposed a salient point detector based upon the analysis of the statistics of color derivatives [147]. Another drawback is that the local spatial organization of the visual scene generally plays no part in the construction of saliency maps. However, it is well known that a large uniform patch does not attract visual attention as much as a finely textured structure. Moreover, color appearance depends strongly on local spatial arrangements: the surroundings largely influence the color appearance of a surface.
According to numerous studies, the future of visual attention models will follow the development of perceptual multiscale saliency map based on a competitive process between all bottom-up cues (color, intensity, orientation, location, motion) [68,[148][149][150].In order to be consistent with human visual perception, color information must be exploited on the basis of chromatic channel opponencies.Likewise, in order to be consistent with neural mechanisms, all features must be quantified in the LMS color space.During the competitive process color information must be modulated by local spatial arrangements of the visual scene.

In our opinion, new saliency models which better take into account color perception through neural mechanisms will be developed in the next decade. The detection of regions of interest (ROI) based on visual attention mechanisms is currently an active research area in the image processing community [151][152][153][154]. For example, Hu et al. propose a visual attention region (VAR) process which involves the selection of a part of the sensory information by the primary visual cortex in the brain, based on features such as intensity, color, orientation, and size. The uniqueness of a combination of these features at a location compared to its neighbourhood indicates a high saliency for that region.

Color-based object tracking
Color-based object tracking has long been an active research topic in the image processing and computer vision community, with applications ranging from visual surveillance to image coding, robotics, and human-computer interaction [155,156]. One of the most commonly used techniques, the so-called mean-shift (MS) solution, was developed to use color, amongst other features, in order to segment and track objects of interest [157]. Dynamic MS models using stochastic estimators, such as the celebrated Kalman filter or the so-called particle filter, have been used to cope with large displacements, occlusions and, to some extent, scale changes of the tracked objects in color video sequences. Other innovative methods such as stream tensors, block-matching, and relaxation with local descriptors could be optimized with color information. It has been suggested that the performance of these trackers is greatly improved through the use of local object color models instead of global ones [157]. The color model is particularly useful when the object under consideration undergoes partial occlusion [158]. Two categories of color models are traditionally used in tracking applications, namely semiparametric and nonparametric models. In general terms, semiparametric models use a mixture of Gaussians (MoG) to estimate color distributions. The EM algorithm is one of the algorithms best suited to fitting MoGs to color histograms [54]. Nonparametric models use similarity measures such as the Bhattacharyya distance to match color histograms. The mean-shift algorithm is one of the most popular techniques used for color histogram matching. According to Peihua [156], determining the number of histogram bins is an important yet unresolved problem in color-based object tracking. The main difficulty is to account for illumination changes or noise while retaining good discriminative power [56].
In 2007, Muñoz-Salinas et al. [159] proposed a people detection and tracking method based on stereo vision and color. Although color models are a useful tool for dealing with partial occlusion, they have yet to be used effectively for tracking multiple targets that share the same color information (e.g., in a football game telecast) [160]. The fundamental assumption of these solutions, namely that the image background has a sufficiently different color structure from the objects to be tracked, is rather restrictive. In order to alleviate this problem, Czyz et al. recommend the use of observation features such as appearance models or contours in addition to color [160]. On the other hand, Holcombe and Cavanagh demonstrated that the perception of object motion depends upon its color [149]. Such features, coupled with the use of stereo camera configurations, will allow for the development of solutions that can effectively deal with occlusions, illumination variations, camera motion, and fade in/out motion. Appearance models could be particularly helpful in color-based tracking applications, in particular for face tracking. Numerous studies have integrated visual appearance attributes based on color information [161].
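The histogram matching step used by nonparametric models can be sketched as follows. The sketch assumes two color histograms with the same binning, normalizes them, and computes the distance in the form sqrt(1 - BC) that is common in mean-shift tracking, where BC is the Bhattacharyya coefficient; the classical definition of the Bhattacharyya distance is -ln(BC). The function names are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative bin counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Overlap between two normalized histograms (1.0 means identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def histogram_distance(hist_a, hist_b):
    """Distance sqrt(1 - BC) between two histograms with identical bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must use the same bins")
    bc = bhattacharyya_coefficient(normalize(hist_a), normalize(hist_b))
    return math.sqrt(max(0.0, 1.0 - bc))  # guard against rounding above 1
```

The distance is 0 for histograms with the same shape and 1 for histograms with no bins in common. Since the value depends directly on how colors are assigned to bins, the sketch also makes clear why the choice of the number of bins matters so much.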

Scene illuminant estimation and color constancy
Scene illuminant estimation is related to the ability of the human visual system to perceive the color appearance of an object as invariant under light of varying intensity and spectral distribution. This ability, called color constancy, demonstrates at least a subconscious ability to separate the illuminant spectral distribution from the surface spectral reflectance function.
Most previous studies supposed that the scene illuminant had a continuous spectral-power distribution such as incandescent lamp light and daylight.Illuminants with spikes, such as fluorescent illuminant, were neglected as a target scene illuminant.Spectral distribution of a fluorescent lamp or a mercury arc lamp includes intense spectral lines and a weaker continuous spectrum.Currently, fluorescent sources are often being used as indoor lighting; therefore we have to discuss illuminant estimation while inferring scene illumination with a spiky spectrum.Note that wavelengths of the line spectra are inherent to fluorescent material.Therefore, an illuminant classification approach is proposed for inferring the fluorescent type by knowing the wavelength positions of spikes [162].This approach requires a spectral camera system with narrow band filtration.
So far, most of the illuminant estimation methods made the hypothesis that only one source illuminates the scene.It is obvious that this hypothesis is not realistic and severely limits the applicability of the algorithms to real scenes.Attempts have started to overcome the limitations in several ways.One approach is the use of an active illumination.For instance, a camera flush can be used as an active light source with the known illuminant properties.In this case, illuminant is estimated using two images of the same scene, which are captured under the unknown ambient scene illuminant and under a combination of the scene illumination and the camera flash.This approach is limited in scenes to which it can be applied because it is based on light reflection of the camera flash from scene objects to the camera.
The second approach may be directed at solving the illuminant estimation problem for composite illuminants involving both spiky and continuous spectra.Note that the ambient illumination in indoors and outdoors often is a compound of fluorescence and daylight (or incandescent lamp).Therefore, we pose a new illuminant estimation problem: estimation of light source components from a single image of natural scene [163].Obviously the usual RGB camera systems are inappropriate for solving this problem.
The third approach may be directed at omnidirectional scene illuminant estimation [164].Again we should note that many illumination sources are present in natural scenes, and the case of just one source is an exception.Therefore, illuminant estimation includes definitely the problem of estimating spatial distribution of light sources by omnidirectional observations.The omnidirectional measuring systems use a special tool for observing the surrounding scene, such as a mirrored ball and a fisheye lens.A method for estimating an omnidirectional distribution of the scene illuminant spectral power distribution was proposed based on a calibrated measuring system using a mirrored ball and an RGB digital camera [165].Thus, the illuminant estimation problems for various scenes with more than one illumination source must be a promising area of future research.
As the problem of illuminant estimation is in general ill-posed, there is room for improvement in existing scene illuminant estimation and correction methods [166][167][168][169][170]. Another problem we are faced with is how to evaluate the performance of a given algorithm. The problem is that the performance of an algorithm is image-dependent. According to Hordley [167], three requirements have to be addressed to solve this problem. First, we must choose an appropriate set of test images. Second, we must define suitable error measures. Third, we must consider how best to summarize these errors over the set of test images. Note that these three requirements can be extended to any image processing task, such as edge detection, segmentation, and compression. In our opinion, the evaluation of the performance of an algorithm is one of the main problems to face in many fields because most image processing algorithms are image-dependent.
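The second and third requirements can be made concrete with a short sketch. It assumes, for illustration only, that the error measure is the angle between the estimated and the true illuminant RGB vectors and that the per-image errors are summarized by their mean, median, and maximum; these are common choices in the color constancy literature but are not prescribed by the text above, and the function names are hypothetical.

```python
import math
import statistics

def angular_error_deg(estimate, truth):
    """Angle in degrees between two illuminant RGB vectors (intensity-invariant)."""
    dot = sum(e * t for e, t in zip(estimate, truth))
    norm = math.sqrt(sum(e * e for e in estimate)) * math.sqrt(sum(t * t for t in truth))
    if norm == 0.0:
        raise ValueError("illuminant vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

def summarize_errors(errors):
    """Summary statistics over a test set; the median is less sensitive to outliers."""
    return {
        "mean": statistics.mean(errors),
        "median": statistics.median(errors),
        "max": max(errors),
    }

# Usage with made-up per-image errors: one hard image dominates the mean.
summary = summarize_errors([1.0, 2.0, 30.0])
```

Because performance is image-dependent, a single poorly handled image can dominate the mean, which is one reason the choice of summary statistic matters as much as the choice of error measure.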
Unfortunately, in image processing, most algorithms for tasks such as scene segmentation, object recognition, and tracking do not take scene illumination changes into account. Illumination dependence is one of the main problems to face in computer vision. Several illuminant estimation algorithms have been developed for trichromatic devices, such as the max-RGB and gray-world methods, and three statistical approaches: gamut mapping, Bayesian estimation, and neural networks. These methods have been reviewed and compared in [167]. Moreover, a method was proposed that combines a physics-based method with the statistical methods. Most of these algorithms are described as color constancy algorithms since the goal is to have the colors of the image remain constant regardless of the illumination spectrum. Color constancy is the ability to measure colors of objects independently of the color of the light source [171]. Since 2005, numerous studies have been based on region attributes [172,173]. For example, van de Weijer et al. [173] proposed a bottom-up method based on high-level visual information to improve illuminant estimation. The authors modeled the image as a mixture of semantic classes (e.g., grass is green, sky is blue, road is gray) and used class features (e.g., texture, position, and color information) to evaluate the likelihood of the semantic content.
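The two simplest estimators named above, gray-world and max-RGB, can be sketched in a few lines. The pixel representation (a list of (r, g, b) tuples), the function names, and the diagonal per-channel correction step are assumptions made for this illustration; the correction is a generic von Kries-style scaling, not the procedure of any specific cited work.

```python
def gray_world(pixels):
    """Gray-world estimate: the average scene color is taken as the illuminant color."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def max_rgb(pixels):
    """Max-RGB estimate: the per-channel maximum is taken as the illuminant color."""
    return tuple(max(p[c] for p in pixels) for c in range(3))

def correct(pixels, illuminant):
    """Diagonal correction: scale each channel so the estimated illuminant becomes neutral."""
    level = sum(illuminant) / 3.0
    # Leave a channel unchanged if its estimate is zero, to avoid dividing by zero.
    gains = [level / v if v > 0 else 1.0 for v in illuminant]
    return [tuple(p[c] * gains[c] for c in range(3)) for p in pixels]

# Usage: a gray scene lit by a reddish light, so every pixel is proportional to (4, 2, 1).
scene = [(200, 100, 50), (100, 50, 25)]
balanced = correct(scene, gray_world(scene))  # each output pixel has equal channels
```

Both estimators rest on strong assumptions about scene content (an achromatic average, or the presence of a maximally reflective surface), which is why their accuracy varies from image to image.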

Quality assessment and fidelity assessment
In general, the fidelity of a color image to a reference is evaluated by some algorithmic perceptual distance measure. On the other hand, the quality of a color image can be assessed from perceptual features without any reference. The problem of objective quality assessment linked to perception is open because it depends on several parameters, such as the kind of image viewed, the expertise of the viewer, the field of study, and the perceived quality of services (PQoS) [117][174][175][176][177]. Furthermore, subjective evaluations generally differ depending on whether or not the observers are experts in the area. Numerous studies integrate appearance attributes, such as spatial/spatial-frequency distribution, to assess local perceptual distortions due to quantization noise in the block-based DCT domain, such as those resulting from JPEG or MPEG encoding [178][179][180]. Other studies integrate several appearance attributes, such as the contrast masking effect and color correlation, into a feature vector to obtain more reliable quality scores [181].
It is generally believed that measuring the fidelity between two color images is a difficult problem due to the subjective nature of the human visual system (HVS). Many reported works use the mean-squared error (MSE) distance between two images. MSE is often evaluated in the L*a*b* color space, although there have been reported works where MSE measures are first computed componentwise and then added together to produce an overall error value. The peak signal-to-noise ratio (PSNR) is also often used to compute the fidelity between two images. It should be noted that although these two measures are simple to use and evaluate, they do not correlate well with human perception [182]. By construction, MSE treats all errors equally regardless of image content and error type. However, it is well known that the human visual system's response is highly nonuniform with regard to spatial frequencies or colors. Thus, the development of new fidelity metrics, albeit more complicated, that correlate well with the human visual system's response is still an open research problem. A promising idea appears to be the development of fidelity metrics which combine human sensitivity to color differences with human sensitivity to spatial frequencies, as is done with the S-CIELAB space or the iCAM color space [64].
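The limitation of MSE noted above can be seen in a small numerical sketch. Images are represented here as flat lists of channel values, and the function names and the default peak value of 255 are assumptions for this example. The two distortions below have exactly the same MSE and PSNR, although one is a single localized error and the other is a uniform shift of all values.

```python
import math

def mse(image_a, image_b):
    """Mean-squared error between two equally sized flat lists of channel values."""
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)

def psnr(image_a, image_b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(image_a, image_b)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)

reference = [100, 100, 100, 100]
localized = [110, 100, 100, 100]   # one value off by 10
uniform = [105, 105, 105, 105]     # every value off by 5

# Both distortions give an MSE of 25, so MSE and PSNR cannot tell them apart.
same_score = mse(reference, localized) == mse(reference, uniform)
```

A perceptually motivated metric would be expected to score these two cases differently, since the visibility of an error depends on its spatial structure and on the surrounding content.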
The problem of fidelity assessment is more difficult for video images than for still images because spatiotemporal effects and memory effects occur. Numerous studies have shown that these two effects are linked to color information. Furthermore, subjective quality assessment is absolutely necessary in critical situations such as postproduction tasks in cinematographic applications and standardization processes. Over the last several years, a number of subjective assessment methods have been introduced for the evaluation of video image quality [174,183] [27,28]. In cinematographic postproduction or color management, subjective assessments are more critical because they involve highly skilled and experienced specialists in color correction and creative production, the so-called "golden eyes." The problems with film assessment are the following: (1) the number and availability of "golden eyes" are limited; (2) a film projector cannot be stopped repeatedly during projection (e.g., for the voting of observers or for the projection of other media); (3) the quality estimation differs between the preattentive phase and the attentive phase; (4) a memory effect may significantly impact the quality estimation when the quality of a video does not vary over time; (5) to make the final assessment comparable to the real application case, short segments of real film content in which color quality is highly important are used for the tests.
In the opinion of the authors, when evaluating the quality of a processed video with regard to fidelity, it is necessary to consider not only the overall video but also individual sequences in the video and individual regions within sequences. This requires choosing a representative set of test sequences that satisfy a wide range of constraints, such as large gamut, presence of natural colors, high contrast, high sharpness, and presence of quite uniform areas. For example, the content of these test sequences must correspond to specific use cases such as "hue of skin color," "saturation of blue sky," and "tone of night scene."

Color characterization and calibration of a display device
Several standard data sets can be used to characterize or to calibrate a device, such as the Macbeth Color Checker (24 patches), the Macbeth DC Color Checker (237 patches), the Munsell data set (426 patches), or the IT8.7/2 chart (288 patches). The problem with these data sets is that the number of color patches is limited, and that the patches either do not cover the gamut of the device to calibrate regularly and entirely or are too far apart. In other words, these data sets are defined according to constraints which are not optimal for color processing algorithms based on LUTs, such as calibration or compression, because an interpolation function is needed to ensure a continuum of values between points in the LUT. Color interpolation consists of constructing a continuous function of 3 independent variables that interpolates color data values which are only known (measured) at some points in the three-dimensional space. Different interpolation methods can be used depending on the nature of the input data values [184]. Another problem with such data sets is that it is difficult to know how to update algorithms if the data set used for a characterization process can no longer be used. For example, as it is now impossible to buy the Macbeth DC Color Checker because it is no longer produced, how could one update algorithms based on this data set?
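To make the role of interpolation in LUT-based processing concrete, the sketch below performs trilinear interpolation in a regular 3D LUT. Trilinear interpolation is only one of the possible methods and is chosen here for brevity; the nested-list LUT layout, the normalization of inputs to [0, 1], and the function name are assumptions for this example. Measured patches from a real chart are generally not on a regular grid, in which case other interpolation schemes are needed.

```python
def trilinear_lookup(lut, r, g, b):
    """Interpolate an output color from a regular 3D LUT.

    lut[i][j][k] holds the output tuple measured at grid node (i, j, k);
    the LUT is assumed to have the same number of nodes (at least 2) on each
    axis, and r, g, b are input coordinates normalized to [0, 1].
    """
    n = len(lut)

    def locate(value):
        # Position on the grid, split into a lower node index and a fraction.
        x = min(max(value, 0.0), 1.0) * (n - 1)
        low = min(int(x), n - 2)
        return low, x - low

    (i, fr), (j, fg), (k, fb) = locate(r), locate(g), locate(b)
    out = [0.0] * len(lut[0][0][0])
    # Weighted sum over the 8 corners of the enclosing grid cell.
    for di, wr in ((0, 1.0 - fr), (1, fr)):
        for dj, wg in ((0, 1.0 - fg), (1, fg)):
            for dk, wb in ((0, 1.0 - fb), (1, fb)):
                weight = wr * wg * wb
                node = lut[i + di][j + dj][k + dk]
                for c, value in enumerate(node):
                    out[c] += weight * value
    return tuple(out)

# Usage: a 3x3x3 identity LUT should return its input unchanged.
identity = [[[(i / 2, j / 2, k / 2) for k in range(3)] for j in range(3)] for i in range(3)]
result = trilinear_lookup(identity, 0.25, 0.5, 1.0)
```

The sparser the grid, the larger the cells over which the device response is assumed to vary linearly, which is why a limited or irregular set of patches directly limits the accuracy of a LUT-based characterization.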
Another problem with color reproduction devices is that the number of colors that they can reproduce is limited. Numerous gamut mapping strategies have been developed to ensure that all the colors can be reproduced on output devices (e.g., displays or printers) without visual artefacts. By definition, a gamut mapping operation converts input values from a source image to output values of a reproduction in a way that compensates for differences in the input and output gamut volume shapes [185]. The problem with conventional gamut mapping techniques is that they do not include adjusting for preferred colors or adapting colors for different lighting/viewing conditions. On the other hand, a color rendering operation converts an encoded representation of a scene (e.g., a raw capture) to a reproduction in a way that includes gamut mapping and image preference adjustments, and also compensates for differences in viewing conditions, tonal range, and so forth. Color rerendering is similar to color rendering, except that it starts with a source image that is already a reproduction, and produces a new, different reproduction, typically for a different kind of display [186].
The problem of gamut mapping (and of color rendering) is far from being optimally solved [187]. The development of universal gamut mapping algorithms is still under discussion in the Technical Committee CIE TC 8-03 "Gamut Mapping." With the development of multispectral devices, new problems appear. For example, with the advent of high-fidelity six-color display devices, the higher dimensionality of the mapping becomes a problem [188]. Likewise, how could one display a color image coded on three primaries on a six-primary system? Inversely, how could one display a color image coded on six primaries on a three-primary system? Some solutions have been proposed but new research is needed [189].
With the development of new display technologies, such as flat panels, large-area color displays for TV, or small color displays (e.g., cell phones, PDAs), a multitude of new problems appear. The advantage of multiprimary displays in comparison with conventional displays is that the gamut is expanded, the metamerism is minimized, the brightness is higher, and the color break-up effect is reduced [190][191][192]. Other technologies, such as hybrid spatial-temporal displays [193], provide evolutionary advances. In particular, they have been used for large displays to further enhance spatial and temporal image qualities [194]. Inversely, substantial improvements of image quality and full-color reproduction are required for mobile displays. Even if color management and image processing enhancement for mobile displays are currently active areas of investigation and engineering development [195], there is a need for efficient image processing techniques compatible with the processing resources of mobile devices [166,194]. Most of the new display technologies use new image processing functionalities, such as contrast stretching, saturation enhancement, and sharpening, to increase the image quality and the color rendering of displays. Among these advancements, let us point out the video attention model based on content recomposition proposed by Cheng et al. [120]. The idea of this model is to provide effective small-size videos which emphasize the important aspects of a scene while faithfully retaining the background context (e.g., for a football match retransmission) [196]. That is achieved by explicitly separating the manipulation of different video objects. The model first extracts user-interest objects; next, visual attention features (intensity, color, and motion) are computed and fused according to a high-level combination strategy; lastly, these objects are reintegrated with the direct-resized background to match the specific screen size. The problem with mobile displays is their power consumption. In order to reduce power consumption while preserving perceived brightness, several issues have been considered, such as the use of backlights for LCD mobile displays. However, some problems have been only partly solved, such as the visibility loss under strong ambient light (e.g., daylight) or the changes of color contrast due to the screen size of mobile displays.
Generally, the contrast of a display device is defined as a ratio depending on the black level and the white level. It is well known that gamma correction influences both the image brightness and the image contrast, since it modifies the black level, the white point, and the brightness ratio. Furthermore, it is well known that the ambient light, the viewing angle, the screen depth (sharpness), and the screen size have an influence on the display image quality [197][198][199]. Lastly, it is well known that the chromaticity coordinates of the primaries, the size of the gamut, and the lightness of the white point have an influence on the rendering of color, which depends on the saturation of colors [200]. Most of these parameters and their relationships are taken into account in color appearance models. Incremental advances in displayed image quality could be obtained thanks to the development of sophisticated modeling and analysis tools based on color appearance models.
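The dependence of contrast on the black level, the white level, and the ambient light can be illustrated with a deliberately simple model. The sketch assumes that ambient light reflected by the screen adds the same luminance to both the white and the black level; this additive term, the function name, and the numerical values are assumptions for illustration and do not describe a measured device.

```python
def contrast_ratio(white_level, black_level, reflected_ambient=0.0):
    """Ratio of white to black luminance (e.g., in cd/m^2).

    reflected_ambient is the luminance added to both levels by ambient light
    reflected from the screen (assumed additive in this simple model).
    """
    denominator = black_level + reflected_ambient
    if denominator <= 0.0:
        raise ValueError("effective black level must be positive")
    return (white_level + reflected_ambient) / denominator

# Usage with hypothetical values: the same display in a dark room and in daylight.
dark_room = contrast_ratio(500.0, 0.5)
daylight = contrast_ratio(500.0, 0.5, reflected_ambient=50.0)
```

Under this model the ratio collapses as soon as the reflected ambient luminance exceeds the native black level, which is consistent with the loss of visibility observed on mobile displays under strong ambient light.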
Over the last several years, new technologies have appeared on the market and significant progress has been achieved in display screens. These new technologies are due to the development of small, large, or 3D displays and the new uses associated with these displays. They have introduced new problems in terms of image quality evaluation on such screens (e.g., the "presence" on a 3D screen [198]). The problem is that several descriptors contribute to characterizing a screen in terms of image quality and that these descriptors are device-dependent. Furthermore, the majority of these descriptors are correlated. Perhaps it could be possible to reduce the concept of "image quality" to a limited number of descriptors, but this seems very complicated, especially as this concept varies according to the content of the image. However, we can consider that the brightness, the contrast, the color rendering, and the sharpness are the main quality descriptors required to qualify an image without specific artefacts [201].
To improve image quality, three strategies can be used. The first one relies on ad hoc tools based on color management, such as gamut mapping, contrast enhancement (from contrast ratio, black level, and high peak luminance), contrast stretching, sharpening, and saturation enhancement, to improve color rendering and to expand the color gamut (e.g., see [195]). The second uses conventional image processing tools to enhance color image contrast. The last strategy uses high-level concepts based on color appearance models (as a function of viewing distance, screen type, background, surround, ambient light, etc.) or content-image dependency models (e.g., see [202]).
Until now, only ad hoc tools have given efficient results because they take into account the limits of the technologies used. The problem with conventional image processing tools is that in the best case their parameters have to be adapted to the application, and in the worst case their fundamental concepts have to be extended to the application or completely modified [195]. In our opinion, the tools based on high-level concepts are the most promising way to improve image quality. The first solutions which have been developed in this direction, such as those developed by Liu et al. [203], have strengthened this opinion.

DISCUSSION AND CONCLUSIONS
Knowledge of recent trends in color science, color systems, appropriate processing algorithms, and device characteristics is necessary to fully harness the functionalities and specificities of the capture, representation, description, analysis, interpretation, processing, exchange, and output of color images. The field of digital color imaging is a highly interdisciplinary area involving elements of physics, psychophysics, visual science, physiology, psychology, computational algorithms, systems engineering, and mathematical optimization. While excellent surveys and reference material exist in each of these areas, the goal of this survey was to present the most recent trends of these diverse elements as they relate to digital color imaging in a single and concise compilation and to put forward relevant information. The aim of this survey was to aid researchers with expertise in a specific domain who seek a better understanding of other domains in the field. Researchers can also use it as an up-to-date reference because it offers a broad survey of the relevant literature.
In this survey, we presented an extensive overview of the up-to-date techniques for color image analysis and processing. We reviewed the following critical issues regarding still images and videos.
(i) Choice of appropriate color space. Most image analysis and processing methods are affected by the choice of color space and by the color metric used. Improving the accuracy of color metrics is a very active topic in color imaging.
(ii) Dropping the intensity component so as to obtain parameters that are robust to illumination conditions. Many processes are affected by illumination conditions. However, applying color constancy techniques as a preprocessing step has been shown to improve the performance of numerous processes. All techniques which do not make any explicit assumptions about the scene content are very promising. Dynamic adaptation techniques which transform the existing color models so as to adapt to changing viewing conditions are also promising.
(iii) Reducing the gap between low-level features and high-level interpretation. To improve the accuracy of the analysis, various low-level features such as shape, spatial, and motion information can be used along with color information. Likewise, various perceptual features such as saliency and appearance models can be used to improve the accuracy of the analysis.
(iv) Incorporating color perception models (though not necessarily physiological ones) into color processing algorithms. Improving the performance of algorithms through simpler and readily applicable models is a very active topic in digital imaging. This is particularly relevant for problems of coding, description, object recognition, compression, quality assessment, and so forth. Many processes are affected by perceptual quality criteria. All techniques including perceptual quality criteria are very promising.
The objective of this survey was not to cover all types of applications of color imagery, but to identify the topics that have given rise to new advances in the last two years and for which there has been more theoretical research than in other topics. We have shown that the future of color image processing will be guided by the use of human vision models that compute the color appearance of spatial information, rather than by low-level signal processing models based on pixels, and also by frequency and temporal information and by the use of semantic models. In particular, we focused on color constancy, illuminant estimation, mesopic vision, appearance models, saliency, and so forth. Knowledge of human color vision is an essential tool for those who wish to contribute to the development of color image processing solutions and also for those who wish to develop a new generation of color image processing algorithms based on high-level concepts. We have also shown that color characterization and calibration of digital systems is a key element for several applications such as medical imaging and consumer imaging applications. The quality of images and displays is a very active topic for particular fields of expertise. This paper also focused on active topics such as object tracking or skin detection, and on advanced topics such as image/video coding, compression, and watermarking, for which several open problems have been identified.
Most of these subjective video quality assessment methods follow recommendations published by the Video Quality Experts Group [VQEG] or by the Society of Motion Picture and Television Engineers and the International Telecommunication Union [SMPTE 196M-2003, ITU-R BT.500-10], such as the double-stimulus continuous quality scale method (DSCQS), the simultaneous double stimulus for continuous evaluation method (SDSCE), or the double-stimulus impairment scale method (DSIS) [ITU-R BT.500-10]. Inversely, few subjective assessment methods have been proposed to evaluate film image quality.