Digital Object Identifier 10.1109/ACCESS.2018.2804379
ABSTRACT Salient object detection aims at finding the most conspicuous objects in an image, i.e., those that most strongly catch the user's attention. Traditional contrast-based salient object detection algorithms focus on highlighting the most dissimilar regions and generally fail to detect complex salient objects. In this paper, we distill a salient object detection principle from existing contrast-based methods: dissimilarity produces contrast, while contrast leads to saliency. Guided by this principle, we propose a generalized framework to detect complex salient objects. First, we propose a set of region dissimilarity definitions inspired by diverse saliency cues. Then, multiple contrast contexts are encoded to derive dissimilarity matrices. Afterwards, multiple contrast transformations are designed to convert the dissimilarity matrices into unified ultra-contrast features. Finally, these ultra-contrast features are mapped to saliency values through logistic regression. The proposed framework is capable of flexibly integrating different kinds of region dissimilarity definitions, region contexts, and contrast transformations. The experimental results demonstrate that our ultra-contrast based saliency detection method outperforms existing contrast-based algorithms in terms of three metrics on four datasets.
INDEX TERMS Saliency detection, salient object segmentation, ultra-contrast, region dissimilarity.
FIGURE 2. An example of complex salient object detection and segmentation results produced by the proposed algorithm compared with 10 other state-of-the-art salient object detection algorithms, where the complex salient object is composed of four distinct basic objects: the ‘‘person’’, the ‘‘clothing’’, the ‘‘hat’’, and the ‘‘bag’’. From (a) to (l): (a) Input image and corresponding ground truth mask, (b) RC [6], (c) DSR [13], (d) GS [10], (e) HSD [9], (f) MC [14], (g) MR [11], (h) SO [12], (i) DRFI [15], (j) BL [16], (k) LPS [17], (l) Ours.
FIGURE 3. The proposed framework for computing ultra-contrast based saliency map. (a) Input image. (b) Over-segmentation. (c) Nine
region dissimilarity matrices, generated by integrating three dissimilarity definitions and three context types. (d) Four contrast
transformations. (e) Ultra-contrast feature vectors. (f) Ultra-contrast based saliency map.
the major differences between them are the mathematical models and the post-processing techniques. Besides, deep learning based saliency detection works [24], [25] also utilize both local and global context to train their models. Thus, context is important to the saliency detection task, and different contexts play different roles. In this paper, we encode three contexts in this manner, while only one context needs to be explicitly computed.
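The excerpt states that only the global context is computed in practice (Section III notes that the other contexts are derived from the global dissimilarity matrices) but does not spell out the derivation here. The following is a minimal sketch of one plausible mechanism, assuming the local context restricts comparisons to spatially adjacent regions and the boundary context restricts them to image-border regions; both restrictions are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

def context_dissimilarity(D_global, adjacency, on_boundary):
    """Derive local- and boundary-context dissimilarity matrices by masking
    a single precomputed global dissimilarity matrix (hypothetical scheme).

    D_global    : (N, N) array of pairwise region dissimilarities
    adjacency   : (N, N) boolean array, True where two regions are adjacent
    on_boundary : (N,) boolean array, True for regions touching the border
    """
    # Local context: each region is only compared with its spatial neighbors.
    D_local = np.where(adjacency, D_global, np.nan)
    # Boundary context: each region is only compared with border regions.
    D_boundary = np.where(on_boundary[None, :], D_global, np.nan)
    # Excluded entries are NaN, so downstream statistics should use
    # NaN-aware reductions such as np.nanmean or np.nanvar.
    return D_local, D_boundary
```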
B. REGION DISSIMILARITY
In this paper, we propose a set of region dissimilarity definitions to encode diverse saliency cues. Among all existing saliency cues, color is the most frequent one. Most existing works [1], [6], [8]–[10], [12]–[15], [17], [19], [26]–[28] take color difference as the most important cue for detecting salient objects. Wang et al. [20] use the gradient distribution to compute gradient contrast and combine it with color contrast to generate a saliency map. Jiang et al. [21] exploit object detection results as a complementary saliency map to boost performance. Specifically, they fuse three saliency maps based on three saliency cues into the final saliency map via element-wise multiplication. In addition, Meng et al. [29] exploit inter-image region dissimilarity to assist foreground object segmentation in multiple images. Instead of designing sophisticated methods to combine multiple saliency cues, we propose a unified architecture (the region dissimilarity matrix) to integrate them.

C. CONTRAST TRANSFORMATION
Contrast transformation refers to the method by which region dissimilarities are transformed into contrast or saliency values. Most existing works [1], [6], [7], [9], [12], [19], [20], [23] compute the mean dissimilarity as the contrast value and simply treat this contrast value as the final saliency value. Li et al. [30] propose to compute the saliency value by adding segmentation results. Jiang et al. [15] propose to train a random forest on all kinds of region features to predict a region's saliency value. Wei et al. [10] adopt a novel geodesic distance to compute the region dissimilarity and regard it as the saliency value. Tang et al. [31] propose to fuse several existing saliency maps into a new saliency map. In this paper, we assert that there is a distinction between contrast and saliency, and devise different contrast transformations to extract contrast features from diverse dissimilarity matrices.

III. GENERALIZED FRAMEWORK
This section presents the architecture of our salient object detection framework, as shown in Fig. 3. As illustrated in Fig. 3 (b), the first step is to over-segment an input image into N regions. Next, a set of distinct dissimilarity matrices is computed (Fig. 3 (c)), where three kinds of dissimilarity definitions and three contexts are shown for illustration purposes. As introduced earlier, only the global dissimilarity matrices are calculated in practice, from which the local and boundary dissimilarity matrices are derived.
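To make the data flow of Fig. 3 (c)–(f) concrete, here is a minimal sketch under stated assumptions: the dissimilarity matrices and contrast transformations are supplied as callables, and the classifier follows the scikit-learn predict_proba convention. The authors' implementation is in MATLAB, so this is an illustration of the pipeline, not their code.

```python
import numpy as np

def ultra_contrast_saliency(matrices, contrast_fns, logreg):
    """Map region dissimilarity matrices to region saliency values.

    matrices     : list of (N, N) region dissimilarity arrays (Fig. 3 (c))
    contrast_fns : list of functions, each mapping one region's dissimilarity
                   row to a scalar contrast value (Fig. 3 (d))
    logreg       : trained binary classifier exposing predict_proba,
                   e.g. sklearn.linear_model.LogisticRegression
    """
    n = matrices[0].shape[0]
    # Fig. 3 (e): one ultra-contrast feature vector per region, built by
    # applying every contrast transformation to every dissimilarity matrix.
    features = np.array([[g(D[i]) for D in matrices for g in contrast_fns]
                         for i in range(n)])
    # Fig. 3 (f): the positive-class probability is the region's saliency.
    return logreg.predict_proba(features)[:, 1]
```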
The saliency maps produced by the individual transformations are shown separately in Fig. 6. It can be seen that the different transformations (b)(c)(d)(e) highlight different salient regions. For example, Average Contrast (b) highlights the bigger salient object while ignoring the smaller one, because the smaller one has a lower mean region dissimilarity than the bigger one. In contrast, Michelson Contrast (d) and Variance Contrast (e) highlight the smaller salient object. These results demonstrate the diversity of the proposed transformations. Finally, our ultra-contrast saliency map (f), which combines all transformations, successfully detects all parts of the salient objects.
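The section of the paper that defines the transformations exactly is not reproduced in this excerpt, but the names suggest plausible forms. Below is a hedged sketch of three of the four transformations discussed above, each mapping one region's dissimilarity row to a scalar, with the Michelson form following [36]; the paper's precise definitions may differ.

```python
import numpy as np

def average_contrast(d):
    # Mean dissimilarity: the classical contrast value used by [1], [6], etc.
    return np.mean(d)

def michelson_contrast(d):
    # Michelson's (max - min) / (max + min) ratio [36], applied here to a
    # region's dissimilarities; the epsilon guards against division by zero.
    hi, lo = np.max(d), np.min(d)
    return (hi - lo) / (hi + lo + 1e-12)

def variance_contrast(d):
    # Measures the spread of the dissimilarities rather than their level,
    # which is why it responds to objects that Average Contrast misses.
    return np.var(d)
```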
V. TRAINING
In this section, we present how to train a logistic regression model on the proposed ultra-contrast features. The training samples are extracted by the following procedure: we first define an indicator image $I_i$ for region $r_i$ as a binary mask, where label 1 marks the pixels that belong to region $r_i$ and 0 marks all other pixels. Then we compute an overlap score [39] between the indicator image $I_i$ and the corresponding ground-truth binary mask $S$ as follows:

$$s(I_i, S) = \frac{|I_i \cap S|}{|I_i \cup S|} \tag{12}$$
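Eq. (12) is the standard intersection-over-union overlap; a direct rendering for binary masks:

```python
import numpy as np

def overlap_score(indicator, gt_mask):
    """Eq. (12): intersection-over-union between a region's binary
    indicator image and the ground-truth mask (both boolean arrays)."""
    inter = np.logical_and(indicator, gt_mask).sum()
    union = np.logical_or(indicator, gt_mask).sum()
    return inter / union if union > 0 else 0.0
```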
If the overlap score is above $\tau$, the region is treated as a positive sample. It is worth noting that when $\tau$ is set to 0, there is an imbalance between the numbers of positive and negative samples. Training would fail if we directly used the native loss function of logistic regression [40], because it penalizes the two types of mistakes symmetrically. Similar to [40], our solution is to modify the training loss: specifically, we multiply the loss of the positive samples by a parameter $\beta$:

$$\omega^*(\beta) = \arg\min_{\omega}\ \frac{1}{2}\omega^{T}\omega + \sum_{i=1}^{N_n} \log\!\left(1 + e^{\omega^{T} x_i}\right) + \beta \sum_{j=1}^{N_p} \log\!\left(1 + e^{-\omega^{T} x_j}\right) \tag{13}$$

where $N_p$ and $N_n$ refer to the numbers of positive and negative samples, respectively.
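Eq. (13) is L2-regularized logistic regression with the positive-class loss scaled by $\beta$, which maps directly onto scikit-learn's class_weight mechanism. A sketch of an equivalent formulation follows; the authors trained in MATLAB, so this is illustrative rather than their code.

```python
from sklearn.linear_model import LogisticRegression

def train_weighted_logreg(X, y, beta):
    """Fit the weighted logistic regression of Eq. (13).

    X    : (num_samples, num_features) ultra-contrast feature matrix
    y    : (num_samples,) binary labels from the overlap-score rule
    beta : weight multiplying the loss of the positive samples
    """
    # class_weight scales each class's log-loss term; with C = 1 the default
    # L2 penalty matches the (1/2) * ||w||^2 regularizer of Eq. (13).
    model = LogisticRegression(C=1.0, class_weight={0: 1.0, 1: beta})
    return model.fit(X, y)
```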
VI. UNIFIED SALIENT OBJECT SEGMENTATION
In SaliencyCut [6], an initial saliency map is used to produce a rough saliency mask, and iterative GrabCut [41] is then run to obtain the final segmentation result. The disadvantage of this method is that it does not make full use of the saliency map: the map is only used to locate the salient object roughly, so the initial saliency map becomes less important. In the following, we present a unified algorithm that produces a segmentation mask from a saliency map by making full use of the map. Specifically, the saliency value of a pixel is directly treated as the probability that it belongs to a salient object. Besides that, only a smoothness term is additionally considered. Combining these two factors, we construct the objective function as follows:

$$E(y) = \sum_{i \in V} U(y_i) + \lambda \sum_{(i,j) \in E} V(y_i, y_j) \tag{14}$$

where $y_i$ is the segmentation variable of pixel $i$.

The first term $U(y_i)$ is the data potential, designed to constrain the final segmentation $y_i$ to stay close to the saliency map; it is computed as follows:

$$U(y_i) = -\log\!\left(s_i\, y_i + (1 - s_i)(1 - y_i)\right) \tag{15}$$

where $s_i$ is the saliency value of pixel $i$.

The second term is the edge-preserving smoothness potential, designed to smooth pixels within salient objects and within backgrounds separately; here we use the Kolmogorov and Zabih [42] interaction energy:

$$V(y_i, y_j) = d(i, j)\,[y_i \neq y_j]\, e^{-\beta \lVert f_i - f_j \rVert^2} \tag{16}$$

where $d(i, j)$ represents the distance between regions $i$ and $j$, $[\cdot]$ refers to the indicator function, $\beta$ is the parameter that weights the feature distance, and $f_i$ is the color feature vector of region $i$.

Note that the objective function is a submodular binary discrete optimization problem, so it can be minimized exactly using graph cuts [42].
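Because Eq. (14) is submodular in binary variables, a single max-flow computation minimizes it exactly. Below is a minimal sketch using the PyMaxflow library, our choice for illustration; the paper only states that graph cuts [42] are used, and the pairwise weights are assumed to be precomputed from Eq. (16).

```python
import numpy as np
import maxflow  # PyMaxflow; the authors' own implementation is in MATLAB

def segment_from_saliency(saliency, edges, weights, lam=1.0):
    """Minimize the energy of Eq. (14) exactly with one graph cut.

    saliency : (N,) saliency values in (0, 1), one per node (Eq. (15) unary)
    edges    : list of (i, j) neighbor pairs
    weights  : matching list of pairwise weights
               d(i, j) * exp(-beta * ||f_i - f_j||**2) from Eq. (16)
    """
    eps = 1e-12
    g = maxflow.Graph[float]()
    nodes = g.add_nodes(len(saliency))
    for i, s in enumerate(saliency):
        # Convention: sink segment = label 1. Cutting the source edge pays
        # U(1) = -log(s); cutting the sink edge pays U(0) = -log(1 - s).
        g.add_tedge(nodes[i], -np.log(max(s, eps)), -np.log(max(1 - s, eps)))
    for (i, j), w in zip(edges, weights):
        # Potts-style pairwise term of Eq. (16): cost lam * w if labels differ.
        g.add_edge(nodes[i], nodes[j], lam * w, lam * w)
    g.maxflow()
    return np.array([g.get_segment(n) for n in nodes], dtype=int)
```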
VII. EXPERIMENTS
To evaluate the effectiveness of the proposed method, we design three groups of comparative experiments. First, we compare the performance of the proposed ultra-contrast features with the DRFI [15] features across different segmentation levels. Second, salient object detection experiments are conducted in comparison with state-of-the-art algorithms. Third, salient object segmentation experiments are conducted to investigate the performance of the proposed method on the object segmentation task.
TABLE 2. Performance of the proposed ultra-contrast features compared with the DRFI [15] features on four datasets across different segmentation levels. (a) MSRA-B. (b) PASCAL-S. (c) ECSSD. (d) DUT-OMRON.
TABLE 3. Saliency detection results of the proposed method in terms of three metrics on four public datasets, compared with 10 state-of-the-art algorithms. For each metric, the top three results are shown in red, green, and blue, respectively. (a) MSRA-B. (b) PASCAL-S. (c) ECSSD. (d) DUT-OMRON.
FIGURE 7. Precision-recall curves of the proposed ultra-contrast based algorithm compared with 10 other state-of-the-art saliency detection algorithms on four datasets. From left to right: (a) MSRA-B, (b) PASCAL-S, (c) ECSSD, (d) DUT-OMRON.
LPS [17], RC [6], DSR [13], MC [14], GS [10], MR [11], SO [12], DRFI [15], and HSD [9], where DRFI [15] is the leading algorithm over all seven datasets reported in [45].

The MAXF, ADAPF, and MAE scores on the four datasets are shown in Table 3, and the corresponding PR curves are plotted in Fig. 7. In Fig. 7, our UC consistently outperforms these algorithms in terms of both precision and recall by a significant margin on all four datasets, while DRFI takes the second position. This proves the effectiveness of the proposed algorithm compared with state-of-the-art contrast-based algorithms. As Table 3 shows, our UC also consistently outperforms these algorithms in terms of all three metrics on the four datasets. On MSRA-B, our UC is the only method whose MAE is below 10%.

all other saliency maps into our energy function (14) to produce the segmentation mask. The salient object segmentation experiment is conducted on MSRA-B [19]. The most well-known segmentation evaluation metric, the intersection-over-union (IoU) score [39], is adopted, denoted as $S_o$.

Table 4 shows the results of our UC compared with 10 other state-of-the-art algorithms: SF [8], RC [6], LPS [17], DSR [13], GS [10], MC [14], MR [11], SO [12], HSD [9], DRFI [15], and another salient object segmentation algorithm, SaliencyCut [6]. It can be seen that the mean overlap score of our UC is 75.53%, which is 5% higher than that of the best salient object segmentation algorithm, i.e., SaliencyCut [6].
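For reference, the three detection metrics can be computed in a few lines. The sketch below assumes saliency maps and ground truth normalized to [0, 1], $\beta^2 = 0.3$ (the convention in this literature), and an adaptive threshold of twice the mean saliency; the last choice is a common convention that this excerpt does not confirm.

```python
import numpy as np

def mae(sal, gt):
    # Mean absolute error between saliency map and binary ground truth.
    return np.mean(np.abs(sal - gt))

def f_measure(sal, gt, thresh, beta2=0.3):
    # Weighted F-measure at one binarization threshold.
    pred = sal >= thresh
    tp = np.logical_and(pred, gt > 0.5).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max((gt > 0.5).sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom > 0 else 0.0

def maxf_adapf(sal, gt):
    # MAXF: best F over a threshold sweep; ADAPF: F at an adaptive
    # threshold, assumed here to be twice the mean saliency value.
    maxf = max(f_measure(sal, gt, t) for t in np.linspace(0, 1, 256))
    adapf = f_measure(sal, gt, min(2 * sal.mean(), 1.0))
    return maxf, adapf
```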
FIGURE 8. Visual comparison of the proposed method UC with other state-of-the-art contrast based methods. From left to
right: (a) source image, (b) RC [6], (c) DSR [13], (d) GS [10], (e) HSD [9], (f) MC [14], (g) MR [11], (h) SO [12], (i) DRFI [15], (j) LPS [17],
(k) SaliencyCut [6], (l) UC, (m) Ground truth mask.
TABLE 4. Results of salient object segmentation of the proposed method compared with other 11 algorithms on MSRA-B.
TABLE 5. Quantitative results of the ablation experiment regarding different region dissimilarities on MSRA-B.
algorithms. The first group, denoted with a green border, presents an input image containing a simple salient object. It can be seen that most methods have the ability to segment simple salient objects accurately. The second group, denoted with a blue border, contains seven images containing complex salient objects. As can be seen from this group, except for our UC, most contrast-based methods fail to detect these complex salient objects. Take the person in the fifth row of this group as an illustration, where the ‘‘face’’, ‘‘cloth’’, and ‘‘bottle’’ are highlighted by the proposed object dissimilarity, texture dissimilarity, and edge dissimilarity, respectively. In contrast, the color of the ‘‘cloth’’ is similar to its background, so RC [6] misses it. On the other hand, because this ‘‘person’’ lies on the border of the image, the methods [11], [13], [14] based on the boundary context miss the boundary part of the salient object.
C. ABLATION ANALYSIS
The proposed framework is capable of integrating different types of region dissimilarities, contrast transformations, and contexts. In order to survey the effectiveness of these components, we conduct ablation experiments on MSRA-B by removing one component at a time. Since there are three key factors in our framework, i.e., region dissimilarity, contrast transformation, and context encoding, we evaluate one factor at a time while keeping the other factors unchanged.

1) REGION DISSIMILARITY
The results of the ablation analysis regarding region dissimilarity are presented in Table 5. From this table, first of all, we can see that all the proposed region dissimilarities contribute to the final result, which justifies the choice of utilizing multiple region dissimilarity types. For color dissimilarity, there is about a 1% drop in terms of MAXF when removing any kind of color dissimilarity. This result proves the effectiveness of using multiple color channels. Removing texture dissimilarity also leads to a 1% decrease. Removing object dissimilarity, one of our high-level region dissimilarities, causes a moderate decrease. The edge and spatial dissimilarities can be seen as complementary cues, since there is a relatively smaller decrease when removing them separately.

The visual comparisons of the different region dissimilarity types are shown in Fig. 9, where each column corresponds to one dissimilarity; mRGB and mGradMag are chosen to represent the color and texture dissimilarities, respectively. From the second column we can see that, when only spatial dissimilarity is used, the algorithm always segments the objects lying in the center of an image. Take the first row of Fig. 9 as an illustration for color dissimilarity, where the goal is to segment a yellow flower from a cluttered background. In RGB color space, pure yellow is [255 255 0], and the background color ranges from dark green to black, which contains a very low red component, because
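This excerpt does not define mRGB. Taking the name at face value, i.e., the mean RGB vector of each region (an assumption), a sketch of such a color dissimilarity matrix would be:

```python
import numpy as np

def mrgb_dissimilarity(regions_rgb):
    """Sketch of an mRGB-style color dissimilarity, assuming 'mRGB' denotes
    the mean RGB vector of each region (an assumption from the name).

    regions_rgb : list of (num_pixels_k, 3) arrays, one per region.
    Returns an (N, N) matrix of pairwise mean-color distances.
    """
    means = np.array([r.mean(axis=0) for r in regions_rgb])  # (N, 3) mean colors
    diff = means[:, None, :] - means[None, :, :]             # pairwise differences
    return np.linalg.norm(diff, axis=-1)                     # Euclidean distance
```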
2) CONTRAST TRANSFORMATION
To prove the effectiveness of the proposed contrast transformations, we conduct an ablation experiment by removing one contrast transformation at a time. The results are shown in Table 6, and it can be seen that all contrast transformations contribute to the final result. Concretely, removing CA (Average Contrast) causes about a 2% decrease in the final result in terms of MAXF, which demonstrates that a region's saliency
D. EFFICIENCY ANALYSIS
To analyze the efficiency of the proposed framework, we briefly illustrate the implementation flowchart in Fig. 12, which can be divided into three categories. The first is the pre-computation procedure, including the extraction of the superpixel map, the edge map, and the objectness map. The second and third categories are region dissimilarity computation and contrast transformation, respectively.
The experiments are run on a single thread of an Intel i5 CPU at 2.60 GHz, and the code is written in MATLAB. All results are reported in Table 8. First of all, it takes less than 0.001 s to compute each of the four contrast transformations, because each involves only simple operations, such as computing the mean or variance of a small matrix. Even though there are dozens of dissimilarity matrices, the total transformation cost is only about 0.001 s per dissimilarity matrix. In summary, it takes around 3.5 seconds to compute a saliency map from the input.

FIGURE 11. Visual comparison of the proposed framework regarding different contexts. From left to right: (a) Source image, (b) Global, (c) Local, (d) Boundary, (e) All, (f) Ground truth. The green and red frames denote successes and failures, respectively.

TABLE 7. Quantitative results of the ablation experiment regarding different contexts on MSRA-B.
FIGURE 13. Performances of integrating our proposed ultra-contrast features into deep learning based features.

VIII. CONCLUSIONS
In this paper, we focus on solving a problem that exists in contrast-based salient object detection algorithms: missing parts of complex salient objects. To this end, we propose a unified salient object detection framework, which includes three main blocks: dissimilarity definition, contrast transformation, and context selection. The dissimilarity definition block is designed to integrate multiple and diverse saliency cues. Then, the contrast transformation block is utilized to transform the dissimilarity matrices into region-level ultra-contrast features. Finally, the ultra-contrast features are mapped to saliency values. The experimental results show that the proposed ultra-contrast saliency detection framework significantly outperforms existing contrast-based algorithms. Furthermore, we show that deep learning based features can be integrated into our framework, and the experimental results demonstrate that the proposed ultra-contrast features are complementary to deep learning based features.

REFERENCES
[1] L. Itti, C. Koch, and E. Niebur, ‘‘A model of saliency-based visual attention for rapid scene analysis,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 11, pp. 1254–1259, Nov. 1998.
[2] S. Belongie, G. Mori, and J. Malik, ‘‘Matching with shape contexts,’’ in Statistics and Analysis of Shapes. Boston, MA, USA: Birkhäuser, 2006, pp. 81–105.
[3] A. Rabinovich, A. Vedaldi, and S. J. Belongie, ‘‘Does image segmentation improve object categorization?’’ Dept. Comput. Sci. Eng., Univ. California, San Diego, CA, USA, Tech. Rep. CS2007-0908, 2007.
[4] L. Marchesotti, C. Cifarelli, and G. Csurka, ‘‘A framework for visual saliency detection with applications to image thumbnailing,’’ in Proc. IEEE Int. Conf. Comput. Vis., Sep./Oct. 2009, pp. 2232–2239.
[5] H. Jiang, ‘‘Human pose estimation using consistent max covering,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 9, pp. 1911–1918, Sep. 2011.
[6] M.-M. Cheng, G.-X. Zhang, N. J. Mitra, X. Huang, and S.-M. Hu, ‘‘Global contrast based salient region detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2011, pp. 409–416.
[7] Y. Zhai and M. Shah, ‘‘Visual attention detection in video sequences using spatiotemporal cues,’’ in Proc. ACM Int. Conf. Multimedia, 2006, pp. 815–824.
[8] F. Perazzi, P. Krahenbuhl, Y. Pritch, and A. Hornung, ‘‘Saliency filters: Contrast based filtering for salient region detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2012, pp. 733–740.
[9] Q. Yan, L. Xu, J. Shi, and J. Jia, ‘‘Hierarchical saliency detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 1155–1162.
[10] Y. Wei, F. Wen, W. Zhu, and J. Sun, ‘‘Geodesic saliency using background priors,’’ in Proc. Eur. Conf. Comput. Vis., 2012, pp. 29–42.
[11] C. Yang, L. Zhang, H. Lu, X. Ruan, and M.-H. Yang, ‘‘Saliency detection via graph-based manifold ranking,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 3166–3173.
[12] W. Zhu, S. Liang, Y. Wei, and J. Sun, ‘‘Saliency optimization from robust background detection,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 2814–2821.
[13] X. Li, H. Lu, L. Zhang, X. Ruan, and M.-H. Yang, ‘‘Saliency detection via dense and sparse reconstruction,’’ in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 2976–2983.
[14] B. Jiang, L. Zhang, H. Lu, C. Yang, and M.-H. Yang, ‘‘Saliency detection via absorbing Markov chain,’’ in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 1665–1672.
[15] H. Jiang, J. Wang, Z. Yuan, Y. Wu, N. Zheng, and S. Li, ‘‘Salient object detection: A discriminative regional feature integration approach,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 2083–2090.
[16] N. Tong, H. Lu, X. Ruan, and M.-H. Yang, ‘‘Salient object detection via bootstrap learning,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 1884–1892.
[17] H. Li, H. Lu, Z. Lin, X. Shen, and B. Price, ‘‘Inner and inter label propagation: Salient object detection in the wild,’’ IEEE Trans. Image Process., vol. 24, no. 10, pp. 3176–3186, Oct. 2015.
[18] S. Goferman, L. Zelnik-Manor, and A. Tal, ‘‘Context-aware saliency detection,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 10, pp. 1915–1926, Oct. 2012.
[19] T. Liu et al., ‘‘Learning to detect a salient object,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 2, pp. 353–367, Feb. 2011.
[20] K. Wang, L. Lin, J. Lu, C. Li, and K. Shi, ‘‘PISA: Pixelwise image saliency by aggregating complementary appearance contrast measures with edge-preserving coherence,’’ IEEE Trans. Image Process., vol. 24, no. 10, pp. 3019–3033, Oct. 2015.
[21] P. Jiang, H. Ling, J. Yu, and J. Peng, ‘‘Salient region detection by UFO: Uniqueness, focusness and objectness,’’ in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 1976–1983.
[22] X. Li, Y. Li, C. Shen, A. Dick, and A. van den Hengel, ‘‘Contextual hypergraph modeling for salient object detection,’’ in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 3328–3335.
[23] Y.-F. Ma and H.-J. Zhang, ‘‘Contrast-based image attention analysis by using fuzzy growing,’’ in Proc. ACM Int. Conf. Multimedia, 2003, pp. 374–381.
[24] R. Zhao, W. Ouyang, H. Li, and X. Wang, ‘‘Saliency detection by multi-context deep learning,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 1265–1274.
[25] L. Wang, H. Lu, X. Ruan, and M.-H. Yang, ‘‘Deep networks for saliency detection via local estimation and global search,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2015, pp. 3183–3192.
[26] Y. Ren, Z. Wang, and M. Xu, ‘‘Learning-based saliency detection of face images,’’ IEEE Access, vol. 5, pp. 6502–6514, 2017.
[27] C. Scharfenberger, A. G. Chung, A. Wong, and D. A. Clausi, ‘‘Salient region detection using self-guided statistical non-redundancy in natural images,’’ IEEE Access, vol. 4, pp. 48–60, 2016.
[28] H. Du, Z. Liu, H. Song, L. Mei, and Z. Xu, ‘‘Improving RGBD saliency detection using progressive region classification and saliency fusion,’’ IEEE Access, vol. 4, pp. 8987–8994, 2016.
[29] F. Meng, H. Li, Q. Wu, B. Luo, and K. N. Ngan, ‘‘Weakly supervised part proposal segmentation from multiple images,’’ IEEE Trans. Image Process., vol. 26, no. 8, pp. 4019–4031, Aug. 2017.
[30] H. Li, F. Meng, and K. N. Ngan, ‘‘Co-salient object detection from multiple images,’’ IEEE Trans. Multimedia, vol. 15, no. 8, pp. 1896–1909, Dec. 2013.
[31] L. Tang, H. Li, and T. Chen, ‘‘Extract salient objects from natural images,’’ in Proc. IEEE Int. Symp. Intell. Signal Process. Commun. Syst., Dec. 2010, pp. 1–4.
[32] P. Dollár and C. L. Zitnick, ‘‘Fast edge detection using structured forests,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 8, pp. 1558–1570, Aug. 2015.
[33] K. E. A. van de Sande, J. R. R. Uijlings, T. Gevers, and A. W. M. Smeulders, ‘‘Segmentation as selective search for object recognition,’’ in Proc. IEEE Int. Conf. Comput. Vis., Nov. 2011, pp. 1879–1886.
[34] E. Peli, ‘‘Contrast in complex images,’’ J. Opt. Soc. Amer. A, Opt. Image Sci., vol. 7, no. 10, pp. 2032–2040, 1990.
[35] M. Pavel, G. Sperling, T. Riedl, and A. Vanderbeek, ‘‘Limits of visual communication: The effect of signal-to-noise ratio on the intelligibility of American Sign Language,’’ J. Opt. Soc. Amer. A, Opt. Image Sci., vol. 4, no. 12, pp. 2355–2365, 1987.
[36] A. A. Michelson, Studies in Optics. North Chelmsford, MA, USA: Courier Corporation, 1995.
[37] R. F. Hess, ‘‘Contrast-coding in amblyopia. II. On the physiological basis of contrast recruitment,’’ Proc. Roy. Soc. London B, Biol. Sci., vol. 217, no. 1208, pp. 331–340, Feb. 1983.
[38] R. F. Hess and E. R. Howell, ‘‘The threshold contrast sensitivity function in strabismic amblyopia: Evidence for a two type classification,’’ Vis. Res., vol. 17, no. 9, pp. 1049–1055, 1977.
[39] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, ‘‘The Pascal visual object classes (VOC) challenge,’’ Int. J. Comput. Vis., vol. 88, no. 2, pp. 303–338, Sep. 2009.
[40] Z. Ren and G. Shakhnarovich, ‘‘Image segmentation by cascaded region agglomeration,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 2011–2018.
[41] C. Rother, V. Kolmogorov, and A. Blake, ‘‘GrabCut: Interactive foreground extraction using iterated graph cuts,’’ ACM Trans. Graph., vol. 23, no. 3, pp. 309–314, 2004.
[42] V. Kolmogorov and R. Zabih, ‘‘What energy functions can be minimized via graph cuts?’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 2, pp. 147–159, Feb. 2004.
[43] Y. Li, X. Hou, C. Koch, J. M. Rehg, and A. L. Yuille, ‘‘The secrets of salient object segmentation,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 280–287.
[44] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, ‘‘SLIC superpixels compared to state-of-the-art superpixel methods,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 11, pp. 2274–2282, Nov. 2012.
[45] A. Borji, M.-M. Cheng, H. Jiang, and J. Li, ‘‘Salient object detection: A benchmark,’’ IEEE Trans. Image Process., vol. 24, no. 12, pp. 5706–5722, Dec. 2015.

LIANGZHI TANG received the B.Sc. and M.Sc. degrees from the School of Electronic Engineering, University of Electronic Science and Technology of China, in 2008 and 2011, respectively, where he is currently pursuing the Ph.D. degree under the supervision of Prof. H. Li. His research interests include saliency detection, object segmentation, and deep convolutional neural networks.

FANMAN MENG (S'12–M'14) received the Ph.D. degree in signal and information processing from the University of Electronic Science and Technology of China, Chengdu, China, in 2014. From 2013 to 2014, he was a Research Assistant with the Division of Visual and Interactive Computing, Nanyang Technological University, Singapore. He is currently an Associate Professor with the School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu, China. He has authored or co-authored numerous technical articles in well-known international journals and conferences. His research interests include image segmentation and object detection. He is a member of the IEEE CAS Society. He received the Best Student Paper Honorable Mention Award at the 12th Asian Conference on Computer Vision, Singapore, in 2014, and the Top 10% Paper Award at the IEEE International Conference on Image Processing, Paris, France, in 2014.

QINGBO WU received the B.E. degree in education of applied electronic technology from Hebei Normal University in 2009 and the Ph.D. degree in signal and information processing from the University of Electronic Science and Technology of China in 2015. In 2014, he was a Research Assistant with the Image and Video Processing Laboratory, The Chinese University of Hong Kong. From 2014 to 2015, he was a Visiting Scholar with the Image & Vision Computing Laboratory, University of Waterloo, Waterloo, ON, Canada. He is currently a Lecturer with the School of Electronic Engineering, University of Electronic Science and Technology of China. His research interests include image/video coding, quality evaluation, and perceptual modeling and processing.

NII LONGDON SOWAH received the B.Sc. degree in computer engineering from the Kwame Nkrumah University of Science and Technology in 2009 and the M.Sc. degree in communication engineering from the University of Electronic Science and Technology of China in 2012, where he is currently pursuing the Ph.D. degree with the Intelligent Visual Information Processing and Communication Laboratory. His research interests include object tracking and image clustering.

KAI TAN received the M.A.Sc. degree from Shandong Normal University in 2013. He is currently pursuing the Ph.D. degree in signal and information processing with the University of Electronic Science and Technology of China, under the supervision of Prof. H. Li. His research interests include visual attention, image recognition, crowd analysis, neural networks, and deep learning.

HONGLIANG LI (M'06–SM'11) received the Ph.D. degree in electronics and information engineering from Xi'an Jiaotong University, Xi'an, China, in 2005. From 2005 to 2006, he was a Research Associate with the Visual Signal Processing and Communication Laboratory, The Chinese University of Hong Kong (CUHK), Hong Kong. From 2006 to 2008, he was a Post-Doctoral Fellow with the Visual Signal Processing and Communication Laboratory, CUHK. He is currently a Professor with the School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu, China. He has authored or co-authored numerous technical articles in well-known international journals and conferences. He is a co-editor of the book Video Segmentation and its Applications (Springer, 2011). His research interests include image segmentation, object detection, image and video coding, visual attention, and multimedia communication systems. He has been involved in many professional activities. He served as a TPC member for a number of international conferences, e.g., ICME 2012, ICME 2013, ISCAS 2013, PCM 2007, PCM 2009, and VCIP 2010. He served as the Technical Program Co-Chair of VCIP 2016 and ISPACS 2009, the General Co-Chair of ISPACS 2010, the Publicity Co-Chair of the IEEE VCIP 2013, and the Local Chair of the IEEE ICME 2014. He is a member of the Editorial Board of the Journal of Visual Communication and Image Representation and an Area Editor of Signal Processing: Image Communication (Elsevier).