Elsevier

Pattern Recognition

Volume 36, Issue 10, October 2003, Pages 2377-2394
Pattern Recognition

On the concept of best achievable compression ratio for lossy image coding

https://doi.org/10.1016/S0031-3203(03)00047-5Get rights and content

Abstract

The trade-off between image fidelity and coding rate is reached with several techniques, but all of them require an ability to measure distortion. The problem is that finding a general enough measure of perceptual quality has proven to be an elusive goal. Here, we propose a novel technique for deriving an optimal compression ratio for lossy coding based on the relationship between information theory and the problem of testing hypotheses. The best achievable compression ratio determines a boundary between achievable and non-achievable regions in the trade-off between source fidelity and coding rate. The resultant performance bound is operational in that it is directly achievable by a constructive procedure, as suggested in a theorem that states the relationship between the best achievable compression ratio and the Kullback–Leibler information gain. As an example of the proposed technique, we analyze the effects of lossy compression at the best achievable compression ratio on the identification of breast cancer microcalcifications.

Introduction

Digital image transmission and storage are facing major challenges due to growing size of image datasets and practical limitations in transmission bandwidth and storage space. This explains why image compression has become a popular necessity: applying coding techniques reduces the image transfer time, and therefore reduces the cost; applying image compression also reduces the storage requirements, network traffic, and therefore improves the efficiency.

During the past two decades, various lossless and lossy image coding techniques have been developed (for a list of references see Ref. [1]). Typical lossless coders can attain compression ratios of only 2:1 or 3:1 for most images, thus users often prefer to deal with lossy algorithms which can achieve high compression rates, e.g., 50:1 or more. The problem is that high compression ratios are possible at the cost of imperfect source representation. Compression is lossy in that the decoded images are not exact copies of the originals but, if the properties of the human visual system are correctly exploited, original and decoded images will be almost indistinguishable. The trade-off between image distortion and coding rate may be stated as follows [2]: How much fidelity in the representation are willing to give up in order to reduce the storage or the number of bits required to transmit the data?

As an example, consider compression of digitized mammograms (most of the authors digitize them with a spatial resolution of 0.1 or 0.05mm producing a huge amount of data). Radiologists look for certain signs and characteristics indicative of cancer when evaluating a mammogram. Among these signs is the presence of clustered microcalcifications. Individual breast cancer microcalcifications appear as small objects of variable shape with a roughly circular average form, and therefore, compression artifacts in JPEG coding [3] (the international standard for lossy compression of still images) as blurring or blocking artifacts could affect to the number or extension of these microcalcifications. In such scenario, the important point is how to find an optimal compression level at which to trade-off between compression ratio and source fidelity in the sense of diagnostic usefulness.

Optimization of the trade-off between image fidelity and coding rate requires an ability to measure distortion [2], [4], [5], [6], [7]. However, the perceived distortion in visual content is a very difficult quantity to measure, as the characteristics of the human visual system are complex and not well understood [8], [9]. There are some attempts for using the human's eye capabilities for measuring perception and distortion [10], [11], [12], [13], [14], [15]. Nevertheless, in practice, highly imperfect distortion models such as the sum of squared differences or its equivalents, known as mean squared error (MSE) or peak signal-to-noise ratio (PSNR) are used in most actual comparisons.

To circumvent the lack of knowledge of what distortion measures are more suitable for images, here we propose a novel technique for deriving an optimal performance bound (it has been termed the “best achievable” compression ratio) based on the relationship between information theory and the problem of testing hypotheses. The best achievable compression ratio for lossy coders determines a boundary between achievable and non-achievable regions in the trade-off between source fidelity and coding rate. The resultant bounds are tight for situations of practical relevance. These performance bounds are directly achievable by a constructive procedure as suggested in a theorem, given in Section 2, which proves the relationship between the “best achievable” compression ratio and the Kullback–Leibler information gain [16].

This measure of information gain is to be characterized in Section 3 with a minimal number of properties which are natural for image processing and thus desirable. This section also determines the form of all error functions that satisfy these properties which we have stated to be desirable.

As an example of the proposed technique for achieving optimal performance bounds, we analyze whether lossy compression at the best achievable compression ratio can be used for the detection of breast cancer microcalcifications (Section 4). Several experiments are also performed to investigate the comparative results of the Kullback–Leibler (KL) information gain and various quantitative distortion measures for predicting image fidelity in the sense of diagnostic usefulness (Section 5). The main conclusions of this paper are summarized in Section 6.

Section snippets

Best achievable compression ratio

Let X1,X2,…,XN be a sequence of N symbols from the set of gray levels G={l|l=1,2,…,l|G|}. A 2D digital image can be interpreted as a sequence X1,X2,…,XN of N symbols, with Xi being the gray level at pixel i. We use the notation I to denote a particular sequence of gray levels x1,…,xN.

Let I be the original image; Iq(i) be the reconstruction of the original image at compression ratio q(i); P and Pq(i) be the discrete probability distributions, with P=(p(l/I))l and Pq(i)=(p(l/Iq(i)))l, that

Axiomatic characterization of relative information

As stated in Theorem 1, the KL information gain plays a key role in obtaining the best achievable level of compression for lossy coding. And this section is intended to characterize the KL gain with a minimal number of properties which are natural for image processing and thus desirable.

The first postulate states a property of how unexpected a single event of a digital image was.

Postulate 1

A measure U of how unexpected the single eventgray level l occurswas, depends only upon its probability p.

This

Effects of lossy compression at best achievable compression ratio

This section analyzes the effects of lossy compression at the best achievable level on a specific application of image feature extraction. The coding scheme will be the international standard JPEG [3]. And the particular problem of feature extraction here discussed is the identification of breast cancer microcalcifications. Thus the point is to decide whether lossy JPEG coding at the best achievable compression ratio q(i) can be used for the specific problem of detection of individual

Performance of the KL information gain for predicting image fidelity

Now we show the performance of the Kullback–Leibler information gain for predicting image fidelity in terms of diagnostic usefulness. This will be evaluated by making use of the degree of correlation between the FP/TP ratio and the KL gain. The information gain is applied to quantify the image distortion by means of the difference between the original mammographic image and the images reconstructed after compression.

The correlation coefficient is given bycorr(y,z)=i=199(yi−μy)(zi−μz)i=199(yi−μ

Conclusion

The focus of this paper has been to a large extent the reformulation of the trade-off between image distortion and coding rate in terms of the derivation of an optimal compression level based on the relationship between information theory and the decision problems of hypothesis testing. This optimal level will allow us to determine a boundary between achievable and non-achievable regions, and consequently, it provides useful information to benchmark specific applications.

As an example of this,

Summary

High compression ratios are possible at the cost of imperfect source representation, and therefore, the natural question is: how much fidelity in the representation are willing to give up in order to reduce the storage or the number of bits required to transmit the data? The trade-off between image fidelity and coding rate is reached with several techniques, but all of them require an ability to measure distortion. The problem is that finding a general enough measure of perceptual quality has

Acknowledgements

This paper was prepared while Xose R. Fdez-Vidal was on leave from The Department of Applied Physics at Santiago University, visiting the Department of Computer Science and A.I. at Granada University (Spain). The digitized mammographic images were provided by courtesy of the Mammographic Image Analysis Society as well as the National Expert and training centre for breast cancer screening and the Department of radiology at the University of Nijmegen, The Netherlands. This research was sponsored

About the Author—JOSE A. GARCÍA was born in Almeria, Spain. He received the M.S. and Ph.D. degrees, both in Mathematics, from the University of Granada in 1987 and 1992, respectively. Since 1988, he has been with the Computer Science Department (DECSAI) at Granada University where he is now an Associate Professor. Author of over 100 technical papers and two books, he has devoted the last 14 years to develop computer vision models for biomedicine, astronomy, cartography, feature extraction,

References (22)

  • X.R. Fdez-Vidal et al.

    Using models of feature perception in distortion measure guidance

    Pattern Recognition Lett.

    (1998)
  • M.P. Eckert et al.

    Perceptual quality metrics applied to still image compression

    Signal Process.

    (1998)
  • O. Egger et al.

    High-performance compression of visual information—a tutorial review—part Istill pictures

    Proc. IEEE

    (1999)
  • A. Ortega et al.

    Rate-distortion methods for image and video compression

    IEEE Signal Process. Mag.

    (1998)
  • JTC1 Committee, Digital compression and coding of continuous-tone still images, International Organization...
  • T. Berger

    Rate-Distortion Theory, A Mathematical Theory for Data Compression

    (1971)
  • N. Jayant et al.

    Digital Coding and Waveforms: Principles and Applications to Speech and Video

    (1984)
  • G.M. Schuster et al.

    Rate-distortion Based Video Compression

    (1997)
  • Y.-S. Saw

    Rate-quality Optimized Video Coding

    (1999)
  • R. Rodriguez-Sanchez et al.

    The RGFF representational modela system for the automatically learned partitioning of visual patterns in digital images

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1999)
  • X.R. Fdez-Vidal et al.

    Computing visual target distinctness through selective filtering, statistical features, and visual patterns

    Opt. Eng.

    (2000)
  • Cited by (6)

    About the Author—JOSE A. GARCÍA was born in Almeria, Spain. He received the M.S. and Ph.D. degrees, both in Mathematics, from the University of Granada in 1987 and 1992, respectively. Since 1988, he has been with the Computer Science Department (DECSAI) at Granada University where he is now an Associate Professor. Author of over 100 technical papers and two books, he has devoted the last 14 years to develop computer vision models for biomedicine, astronomy, cartography, feature extraction, clustering, image representation, image distortion, visual target distinctness, and image compression. Dr. Garcı́a is a member of the IAPR Association and the IEEE Computer Society.

    About the Author—JOAQUÍN FDEZ-VALDIVIA was born in Granada, Spain. He received the M.S. and Ph.D. degrees, both in Mathematics, from the University of Granada in 1986 and 1991, respectively. Since 1988, he has been with the Computer Science Department (DECSAI) at Granada University where he is now an Associate Professor. His current interest includes computer vision, image analysis, feature detection, visual perception, image coding, and biomedical applications. Dr. Fdez-Valdivia is a member of the IAPR Association and the IEEE Computer Society. His research work is summarized in over 100 papers in scientific journals and conference proceedings and two books in the field of Computer Vision.

    About the Author—XOSE R. FDEZ-VIDAL was born in Lugo, Spain. He received the M.S. and Ph.D. degrees, both in Physics, from the University of Santiago de Compostela in 1991 and 1996, respectively. Since 1992, he has been with the Applied Physics Department at Santiago de Compostela University where he is now an Associate Professor. His current interest includes image processing, multiresolution methods, wavelets and distortion measures. Dr. Fdez-Vidal is a member of the IAPR Association.

    About the Author—ROSA RODRIGUEZ- SÁNCHEZ was born in Granada, Spain. She received the M.S. and Ph.D. degrees, both in Computer Science, from the University of Granada in 1996 and 1999, respectively. Currently she is with the Computer Science Department (DECSAI) at Granada University where she is now an Associate Professor. Her current interest includes computer vision, visual perception and image coding. Dr. Rodriguez-Sanchez is a member of the IAPR Association.

    View full text