Pérez-Pellitero, E.; Salvador, J.; Ruiz-Hidalgo, J.; Rosenhahn, B. IEEE Transactions on Image Processing, Vol. 25, No. 6, pp. 2456-2468. DOI: 10.1109/TIP.2016.2549362. Publication date: 2016-03-31. Journal article.
Dictionary-based super-resolution (SR) algorithms usually select dictionary atoms based on distance or similarity metrics. Although the optimal selection of the nearest neighbors is of central importance for such methods, the impact of using proper metrics for SR has been overlooked in the literature, mainly due to the widespread use of the Euclidean distance. In this paper, we present a very fast regression-based algorithm, which builds on densely populated anchored neighborhoods and sublinear search structures. We perform a study of the nature of the features commonly used for SR, observing that those features usually lie on the unit hypersphere, where every point has a diametrically opposite one, i.e., its antipode, with the same modulus and angle but the opposite direction. Although we validate the benefits of using antipodally invariant metrics, most binary splits use the Euclidean distance, which does not handle antipodes optimally. In order to benefit from both worlds, we propose a simple yet effective antipodally invariant transform that can be easily included in the Euclidean distance calculation. We modify the original spherical hashing algorithm with this metric in our antipodally invariant spherical hashing scheme, obtaining the same performance as a pure antipodally invariant metric. We round out our contributions with a novel feature transform that obtains a better coarse approximation of the input image thanks to iterative backprojection. Our method, which we name antipodally invariant SR, improves quality (peak signal-to-noise ratio) and is faster than any other state-of-the-art method.
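To illustrate why antipodes matter for nearest-neighbor selection, the following is a minimal Python sketch. It is not the transform proposed in the paper, which the abstract does not specify. It assumes unit-norm feature vectors and uses two generic antipodally invariant measures: the absolute cosine similarity, and the smaller of the Euclidean distances to a point and to its antipode. All function names are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def euclidean(a, b):
    """Plain Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def antipodal_distance(a, b):
    """Distance that treats b and -b as the same point (assumed definition)."""
    neg_b = [-y for y in b]
    return min(euclidean(a, b), euclidean(a, neg_b))

def abs_cosine(a, b):
    """Absolute cosine similarity of two unit vectors."""
    return abs(sum(x * y for x, y in zip(a, b)))

a = normalize([1.0, 2.0, 2.0])
b = [-x for x in a]  # antipode of a on the unit hypersphere
```

Under the plain Euclidean distance, `a` and its antipode `b` are as far apart as two unit vectors can be, whereas both invariant measures treat them as identical.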
Speckle noise filtering on polarimetric SAR (PolSAR) images remains a challenging task because it is difficult to reduce scatterer-dependent noise while preserving both the polarimetric and the spatial information. This challenge is particularly acute on single look complex images, where little information about the scattering process can be derived from a rank-1 covariance matrix. This paper analyzes and evaluates the performance of a set of PolSAR speckle filters. Filter performance is measured by a set of ten different indicators, including relative errors on incoherent target decomposition parameters, coherences, polarimetric signatures, point target, and edge preservation. The result is a performance profile for each individual filter. The methodology consists of simulating a set of artificial PolSAR images on which the various filters are evaluated. The image morphology is stochastic and determined by a Markov random field, and the number of scattering classes is allowed to vary so that a large range of image configurations can be explored. Evaluation on real PolSAR images is also considered. Results show that filter performance needs to be assessed using a complete set of indicators, including distributed scatterer parameters, radiometric parameters, and spatial information preservation.
This paper provides an alternative solution to the costly representation of multi-view video data, which can be used for both rendering and scene analyses. Initially, a new efficient Monte Carlo discrete surface reconstruction method for foreground objects with static background is presented, which outperforms volumetric techniques and is suitable for GPU environments. Some extensions are also presented, which allow a speeding up of the reconstruction by exploiting multi-resolution and temporal correlations. Then, a fast meshing algorithm is applied, which allows interpolating a continuous surface from the discrete reconstructed points. As shown by the experimental results, the original video frames can be approximated with high accuracy by projecting the reconstructed foreground objects onto the original viewpoints. Furthermore, the reconstructed scene can be easily projected onto any desired virtual viewpoint, thus simplifying the design of free-viewpoint video applications. In our experimental results, we show that our techniques for reconstruction and meshing compare favorably with the state-of-the-art, and we also introduce a rule-of-thumb for effective application of the method with a good quality versus representation cost trade-off.
This paper proposes a system that relates objects
in an image using occlusion cues and arranges them according
to depth. The system does not rely on a priori knowledge of
the scene structure and focuses on detecting special points,
such as T-junctions and highly convex contours, to infer the
depth relationships between objects in the scene. The system
makes extensive use of the binary partition tree as hierarchical
region-based image representation jointly with a new approach
for candidate T-junction estimation. Since some regions may
not involve T-junctions, occlusion is also detected by examining
convex shapes on region boundaries. Combining T-junctions and
convexity leads to a system that relies only on low-level depth
cues and not on semantic information. Nevertheless, it
performs as well as or better than the state of the art
while not assuming any type of scene.
As an extension of the automatic depth ordering system, a
semi-automatic approach is also proposed. If the user provides
the depth order for a subset of regions in the image, the system
is able to easily integrate this user information into the final
depth order for the complete image. For some applications, user
interaction can naturally be integrated, improving the quality of
the automatically generated depth map.
The optimal exploitation of the information provided by hyperspectral images requires the development of advanced image-processing tools. This paper proposes the construction and the processing of a new region-based hierarchical hyperspectral image representation relying on the binary partition tree (BPT). This hierarchical region-based representation can be interpreted as a set of hierarchical regions stored in a tree structure. Hence, the BPT succeeds in presenting: 1) the decomposition of the image in terms of coherent regions, and 2) the inclusion relations of the regions in the scene. Based on region-merging techniques, the BPT construction is investigated by studying the hyperspectral region models and the associated similarity metrics. Once the BPT is constructed, the fixed tree structure allows implementing efficient and advanced application-dependent techniques on it. The application-dependent processing of the BPT is generally implemented through a specific pruning of the tree. In this paper, a pruning strategy is proposed and discussed in a classification context. Experimental results on various hyperspectral data sets demonstrate the interest and good performance of the BPT representation.
The purpose of the current work is to propose, under a statistical framework, a family of unsupervised region merging techniques providing a set of the most relevant region-based explanations of an image at different levels of analysis. These techniques are characterized by general and nonparametric region models, with neither color nor texture homogeneity assumptions, and a set of innovative merging criteria, based on information theory statistical measures. The scale consistency of the partitions is assured through i) a size regularization term in the merging criteria and a classical merging order, or ii) a novel scale-based merging order that avoids the region size homogeneity imposed by the use of a size regularization term. Moreover, a partition significance index is defined to automatically determine the subset of most representative partitions from the created hierarchy. The most significant automatically extracted partitions show the ability to represent the semantic content of the image from a human point of view. Finally, a complete and exhaustive evaluation of the proposed techniques is performed, using not only different databases for the two main addressed problems (object-oriented segmentation of generic images and texture image segmentation), but also specific evaluation features in each case: under- and oversegmentation error, and a large set of region-based, pixel-based and error consistency indicators, respectively. Results are promising, outperforming in most indicators both object-oriented and texture state-of-the-art segmentation techniques.
Local moments have attracted attention as local features in applications such as edge detection and texture segmentation. The main reason for this is that they are inherently integral-based features, so that their use reduces the effect of uncorrelated noise. The computation of local moments, when viewed as a neighborhood operation, can be interpreted as a convolution of the image with a set of masks. Nevertheless, moments computed inside overlapping windows are not independent and convolution does not take this fact into account. By introducing a matrix formulation and the concept of accumulation moments, this paper presents an algorithm which is computationally much more efficient than convolving and yet as simple.
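The point that overlapping windows share most of their samples can be made concrete with a small 1-D sketch in Python. This is not the matrix formulation or the accumulation moments of the paper, whose details the abstract does not give. It is a plain running-sum recurrence, assumed here only for illustration, that produces the same order-0 and order-1 local moments as the window-by-window (mask-based) computation. The function names and the convention of measuring coordinates from the window start are assumptions.

```python
def local_moments_direct(f, w):
    """Orders 0 and 1 computed independently in every window of width w."""
    out = []
    for i in range(len(f) - w + 1):
        m0 = sum(f[i + k] for k in range(w))
        m1 = sum(k * f[i + k] for k in range(w))
        out.append((m0, m1))
    return out

def local_moments_running(f, w):
    """Same moments, each window updated from the previous one."""
    m0 = sum(f[:w])
    m1 = sum(k * f[k] for k in range(w))
    out = [(m0, m1)]
    for i in range(len(f) - w):
        m0 = m0 - f[i] + f[i + w]      # drop the oldest sample, add the newest
        m1 = m1 + w * f[i + w] - m0    # shift window coordinates by one (uses new m0)
        out.append((m0, m1))
    return out

signal = [3, 1, 4, 1, 5, 9, 2, 6]
direct = local_moments_direct(signal, 3)
running = local_moments_running(signal, 3)
```

The direct version costs a number of operations proportional to the window width per output sample, while the running version needs a constant number per sample; both return identical results on `signal`.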
This paper discusses the interest of binary partition trees as a region-oriented image representation. Binary partition trees concentrate in a compact and structured representation a set of meaningful regions that can be extracted from an image. They offer a multiscale representation of the image and define a translation invariant 2-connectivity rule among regions. As shown in this paper, this representation can be used for a large number of processing goals such as filtering, segmentation, information retrieval and visual browsing. Furthermore, the processing of the tree representation leads to very efficient algorithms. Finally, for some applications, it may be interesting to compute the binary partition tree once and to store it for subsequent use for various applications. In this context, the paper shows that the amount of bits necessary to encode a binary partition tree remains moderate.
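As a rough illustration of how a binary partition tree can be built by region merging, here is a minimal Python sketch on a 1-D signal. It assumes the simplest possible choices, which are not taken from the paper: each sample starts as its own region, a region is modeled by its mean, and the two adjacent regions with the closest means are merged at each step. The names `Node`, `build_bpt` and `count_nodes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    total: float                      # sum of the sample values in the region
    size: int                         # number of samples in the region
    left: Optional["Node"] = None     # children are None for leaves
    right: Optional["Node"] = None

    @property
    def mean(self) -> float:
        return self.total / self.size

def build_bpt(samples: List[float]) -> Node:
    """Merge the two most similar adjacent regions until one root remains."""
    regions = [Node(total=s, size=1) for s in samples]
    while len(regions) > 1:
        # Index of the adjacent pair whose means differ the least.
        i = min(range(len(regions) - 1),
                key=lambda k: abs(regions[k].mean - regions[k + 1].mean))
        a, b = regions[i], regions[i + 1]
        regions[i:i + 2] = [Node(a.total + b.total, a.size + b.size, a, b)]
    return regions[0]

def count_nodes(node: Optional[Node]) -> int:
    """Number of nodes in the subtree rooted at node."""
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)

root = build_bpt([1.0, 1.1, 5.0, 5.2, 9.0])
```

A tree built this way over N initial regions always has 2N - 1 nodes, which is one reason its storage cost stays moderate; pruning the tree at different depths yields partitions at different scales.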
This paper deals with a class of morphological operators called connected operators. These operators filter the signal by merging its flat zones. As a result, they do not create any new contours and are very attractive for filtering tasks where the contour information has to be preserved. This paper shows that connected operators work implicitly on a structured representation of the image made of flat zones. The max-tree is proposed as a suitable and efficient structure to deal with the processing steps involved in antiextensive connected operators. A formal definition of the various processing steps involved in the operator is proposed and, as a result, several lines of generalization are developed. First, the notion of connectivity and its definition are analyzed. Several modifications of the traditional approach are presented. They lead to connected operators that are able to deal with texture. They also allow the definition of connected operators with less leakage than the classical ones. Second, a set of simplification criteria are proposed and discussed. They lead to simplicity-, entropy-, and motion-oriented operators. The problem of using a nonincreasing criterion is analyzed. Its solution is formulated as an optimization problem that can be very efficiently solved by a Viterbi (1979) algorithm. Finally, several implementation issues are discussed showing that these operators can be very efficiently implemented.
This paper presents a prediction technique for partition sequences. It uses a region-by-region approach that consists of four steps: region parameterization, region prediction, region ordering, and partition creation. The time evolution of each region is divided into two types: regular motion and shape deformation. Both types of evolution are parameterized by means of the Fourier descriptors and they are separately predicted in the Fourier domain. The final predicted partition is built from the ordered combination of the predicted regions, using morphological tools. With this prediction technique, two different applications are addressed in the context of segmentation-based coding approaches. Noncausal partition prediction is applied to partition interpolation, and examples using complete partitions are presented. In turn, causal partition prediction is applied to partition extrapolation for coding purposes, and examples using complete partitions as well as sequences of binary images (shape information in video object planes, VOPs) are presented.
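One reason Fourier descriptors suit a split between motion and shape deformation can be seen in a short Python sketch. It is a generic illustration, not the parameterization used in the paper. It assumes a closed contour sampled as a list of (x, y) points, treated as complex numbers, and computes a normalized discrete Fourier transform. The function name and the sample contour are hypothetical.

```python
import cmath

def fourier_descriptors(contour):
    """Normalized DFT of a closed contour given as a list of (x, y) points."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]

# Contour of a square, sampled at eight boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
shifted = [(x + 3, y - 1) for x, y in square]   # rigid translation by (3, -1)

fd_a = fourier_descriptors(square)
fd_b = fourier_descriptors(shifted)
```

A pure translation changes only the zero-order coefficient, which is the contour centroid under this normalization, while all higher-order coefficients, which carry the shape, stay the same.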
The objective of this paper is to introduce a fourth-order cost function of the displaced frame difference (DFD) capable of estimating motion even for small regions or blocks. Using higher than second-order statistics is appropriate in case the image sequence is severely corrupted by additive Gaussian noise. Some results are presented and compared to those obtained from the mean kurtosis and the mean square error of the DFD.
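The rationale for going beyond second-order statistics is that the fourth-order cumulant of a zero-mean Gaussian variable is zero, so additive Gaussian noise contributes nothing to it in expectation. The Python sketch below is a toy 1-D block-matching example under stated assumptions. It does not reproduce the cost function introduced in the paper, which the abstract does not give. It compares the mean square error of the DFD with a kurtosis-style sample estimate, E[d^4] - 3 E[d^2]^2, on noise-free data. The function names and the sign convention for the displacement are assumptions.

```python
def dfd(prev, curr, start, size, v):
    """Displaced frame difference of a 1-D block for candidate displacement v."""
    return [curr[i] - prev[i - v] for i in range(start, start + size)]

def mse(d):
    """Mean square error of the DFD (second-order cost)."""
    return sum(x * x for x in d) / len(d)

def fourth_order_cumulant(d):
    """Sample estimate of E[d^4] - 3 E[d^2]^2, assuming d is zero-mean."""
    m2 = sum(x ** 2 for x in d) / len(d)
    m4 = sum(x ** 4 for x in d) / len(d)
    return m4 - 3 * m2 ** 2

def estimate(prev, curr, start, size, candidates, cost):
    """Return the candidate displacement with the smallest cost."""
    return min(candidates, key=lambda v: cost(dfd(prev, curr, start, size, v)))

prev = [0, 1, 4, 9, 7, 3, 8, 2, 6, 5, 1, 0]
curr = [0, 0] + prev[:-2]            # prev shifted right by two samples
candidates = range(-2, 3)

v_mse = estimate(prev, curr, 4, 6, candidates, mse)
v_c4 = estimate(prev, curr, 4, 6, candidates,
                lambda d: abs(fourth_order_cumulant(d)))
```

In this noise-free toy case both costs select the true displacement of two samples; the two criteria are expected to differ only once the frames are corrupted by noise, which this sketch does not simulate.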
This paper deals with the use of some morphological tools for image and video coding. Mathematical morphology can be considered as a shape-oriented approach to signal processing, and some of its features make it very useful for compression. Rather than describing a coding algorithm, the purpose of this paper is to describe some morphological tools that have proved attractive for compression. Four sets of morphological transformations are presented: connected operators, the region-growing version of the watershed, the geodesic skeleton, and a morphological interpolation technique. The authors discuss their implementation, and show how they can be used for image and video segmentation, contour coding, and texture coding.
This correspondence deals with the notion of connected operators. Starting from the definition for operators acting on sets, it is shown how to extend it to operators acting on functions. Typically, a connected operator acting on a function is a transformation that enlarges the partition of the space created by the flat zones of the function. It is shown that from any connected operator acting on sets, one can construct a connected operator for functions (however, this is not the only way of generating connected operators for functions). Moreover, the concept of pyramid is introduced in a formal way. It is shown that, if a pyramid is based on connected operators, the flat zones of the functions increase with the level of the pyramid. In other words, the flat zones are nested. Filters by reconstruction are defined and their main properties are presented. Finally, some examples of application of connected operators and use of flat zones are described.
We propose the use of higher order statistics (HOS)-based methods to address the problem of image restoration. The restoration strategy is based on the fact that the phase information of the original image and its HOS are not distorted by some types of blurring. The difficulties associated with the combination of 2-D signals and their HOS are reduced by means of the Radon transform. Two methods that apply the weight-slice algorithm over the projections are developed. Simulation results illustrate the performance of the proposed methods.
This paper deals with a hierarchical morphological segmentation algorithm for image sequence coding. Mathematical morphology is very attractive for this purpose because it efficiently deals with geometrical features such as size, shape, contrast, or connectivity that can be considered as segmentation-oriented features. The algorithm follows a top-down procedure. It first takes into account the global information and produces a coarse segmentation, that is, with a small number of regions. Then, the segmentation quality is improved by introducing regions corresponding to more local information. The algorithm, considering sequences as being functions on a 3-D space, directly segments 3-D regions. A 3-D approach is used to get a segmentation that is stable in time and to directly solve the region correspondence problem. Each segmentation stage relies on four basic steps: simplification, marker extraction, decision, and quality estimation. The simplification removes information from the sequence to make it easier to segment. Morphological filters based on partial reconstruction are proven to be very efficient for this purpose, especially in the case of sequences. The marker extraction identifies the presence of homogeneous 3-D regions. It is based on constrained flat region labeling and morphological contrast extraction. The goal of the decision is to precisely locate the contours of regions detected by the marker extraction. This decision is performed by a modified watershed algorithm. Finally, the quality estimation concentrates on the coding residue, all the information about the 3-D regions that have not been properly segmented and therefore coded. The procedure allows the introduction of the texture and contour coding schemes within the segmentation algorithm. The coding residue is transmitted to the next segmentation stage to improve the segmentation and coding quality. 
Finally, segmentation and coding examples are presented to show the validity and interest of the coding approach.