Transform Domain Watermarking: Adaptive Selection of the Watermark's Position and Length ‡

V. Fotopoulos a,b and A. N. Skodras b,c

a Electronics Laboratory, University of Patras, GR-26110 Patras, Greece
b Research Academic Computer Technology Institute, GR-26221 Patras, Greece
c Hellenic Open University, School of Sciences & Technology, GR-26222 Patras, Greece
ABSTRACT

Transform domain methods have dominated the watermarking field from its early stages. In these methods, some coefficients are selected and modified according to certain rules. The two most important numbers in this process are the length and the position of the watermark, and these are usually chosen heuristically. To address this problem, an adaptive scheme for the selection of the proper coefficients is analysed in the present communication.
1. INTRODUCTION

The work of Cox et al. 1 signalled the opening of the watermarking field. Many researchers have followed the same trail, using frequency domain watermarking with slight variations. Methods compatible with this scheme generally include the following steps:

• Determine a frequency transform
• Perform the transform
• Select transformed coefficients
• Alter the selected coefficients according to some rule
• Inverse transform
• Save the watermarked image
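The steps above can be sketched in Python. This is a minimal illustration, not the authors' exact code: the multiplicative rule c' = c(1 + αw) of Cox et al. and the index-sum approximation of zig-zag scanning are assumptions made for the sketch.

```python
import numpy as np
from scipy.fft import dctn, idctn

def embed_watermark(image, watermark, M, L, alpha=0.1):
    """Generic transform-domain embedding: transform, reorder by
    frequency, skip M coefficients, alter the next L, inverse transform."""
    coeffs = dctn(image, norm='ortho')                 # full-frame 2-D DCT
    rows, cols = np.indices(coeffs.shape)
    # Approximate zig-zag scanning: order coefficients by index sum
    order = np.argsort((rows + cols).ravel(), kind='stable')
    v = coeffs.ravel()[order]                          # low -> high frequency
    v[M:M + L] *= 1.0 + alpha * watermark              # Cox-style multiplicative rule
    restored = np.empty_like(v)
    restored[order] = v                                # undo the reordering
    return idctn(restored.reshape(coeffs.shape), norm='ortho')
```

A zero watermark leaves the image untouched, since every selected coefficient is multiplied by 1.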
A variety of transforms have been used in such applications, like the DCT 2,3, DFT 4, DWT 5,6 or others 7,8,9. The transform can be applied full frame or in smaller image blocks. The transformed coefficients are then reordered into a vector in order of frequency: the first coefficients represent low frequencies, while the last correspond to high frequencies. The reordering scheme depends on the transform used. For the DCT, zig-zag scanning is performed, which is quite understandable since the low frequency (and highest energy) coefficients are concentrated at the upper left corner of the transformed block. Other transforms concentrate energy in different regions of the transform block, so the scanning should differ accordingly. This scheme performs generally well, but two parameters are heuristically chosen: from the ordered coefficient vector, the first M are usually skipped and the next L are selected for watermarking. The selection of these parameters has until now been based on the following assumptions:

1. Parameter M should not be too low, because changing the first coefficients of the vector (which correspond to low frequencies) would heavily degrade the image.
2. M should not be too large either, because then high frequencies would be used; altering these coefficients would produce visible artefacts and would be vulnerable to low pass filtering and similar processes.
3. The L parameter should be sufficiently large (because of the statistical hypothesis implied) but not too large, in order to avoid image degradation.
There is no sufficient theoretical background for the selection of these parameters. It is observed that even now, almost a decade after the watermarking boom, the selections are the same as those used in the first two years. In this communication we are not going to present a theoretical investigation, but rather an efficient calculation algorithm that can be used to maximize the efficiency of the detection.

‡ This work has been funded by a GSRT grant (ΠΑΒΕΤ 2000 – 00ΒΕ363).

Visual Communications and Image Processing 2003, Touradj Ebrahimi, Thomas Sikora, Editors, Proceedings of SPIE Vol. 5150 (2003) © 2003 SPIE · 0277-786X/03/$15.00
2. ADAPTATION OF L AND M

In the watermarking literature, common values of L and M are usually drawn from the range 5000-10000. Using a large value for L mainly serves statistical purposes, but at the same time it produces more distortion. As for M, the number of skipped coefficients, the selection depends on the desired frequency band for the watermark embedding. If low frequencies are used, then M is usually selected in the range 1000-2000; in that case, very good robustness is achieved in exchange for stronger image degradation, compared to the case of middle frequencies, where M is selected in the range 5000-10000. All these numbers correspond to 512x512 images, which yield 262144 transform coefficients (full frame transform). For different image sizes, these numbers change proportionally.

To perform the selection of these parameters in an adaptive way, some decision rule is needed. This rule is the maximization of the similarity function used in the detection process. This function depends on L and M, which is why, from now on, we refer to it as nsim(M,L). For the parametric analysis, for different pairs of M and L, we watermark the selected coefficients and test the output of the detector. The detector output is then visualized as a 3D graph, in which the main task is the localization of the maximum value, i.e. the optimum pair of L and M for watermarking the selected image. If an image has R rows and C columns, the number of available transform coefficients is N = R x C. The initial values for M and L are chosen to be 0.004N and 0.002N respectively. The calculations are performed in increments of 64 for each parameter, for obvious computational reasons, and continue until M = 0.016N and L = 0.02N. Figure 1 depicts the algorithm used.

Figure 1: The M and L selection algorithm (flowchart: the R x C image is frequency transformed (DCT, Hartley etc.), its energy spectrum is combined with a pseudo-random noise generator, the coefficients are reordered, and the L and M parameters are selected starting from N = R x C, M = 0.004N, L = 0.002N; the detection output nsim(M,L) is computed, L is incremented by 64 until L = 0.02N, then M is incremented by 64 until M = 0.016N, and finally nsim(MO, LO) = max(detection output) yields the optimum pair MO, LO).

To draw some early conclusions, the method was applied to the grayscale image "cameraman" of size 256x256. The 3D mesh produced is presented in Figure 2, where M runs along the left axis and L along the right. It is easily observed that the detection decays as L increases; for L > 10000, in particular, the decay is very fast. As for M (keeping L constant), there are some distinct peaks. Which one should be chosen? Strictly speaking, the largest value corresponds to the lowest value of M, but by choosing the larger value of M, although the detection would be only slightly lower, one would benefit from a better PSNR and visual result.

Figure 2: Detection as a function of M and L.

To test the scheme in another way, we applied it to four test images with two different frequency transforms, the DCT and the Hartley transform. The reason is to examine the relationship between the transform and the detection obtained. The scheme does not depend on the transform selected; it can be applied without problems to any frequency domain method. It is, however, worth examining how different transforms affect the overall scheme's performance. The results for the L and M pairs that produce the maximum detection are given in Table 1, while the 3D meshes are shown in Figures 3 and 4. The results show that M should be as low as possible: the value of M that produced the maximum output was the starting one (1024) for all combinations of images and transforms. Contrary to M, the L parameter varies significantly. It is interesting to note that for the "house" image, L is less than half that of the other images. The "Girl" image requires the most coefficients in the case of the DCT, but if the Hartley transform is used, it is "Lena" that requires the most.
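The search loop of Figure 1 amounts to a brute-force grid search. A sketch is given below, where `detector(M, L)` is a hypothetical callback that embeds the watermark with the given pair and returns nsim(M, L); the bounds and the step of 64 follow the text.

```python
import numpy as np

def adaptive_LM(N, detector, step=64):
    """Grid search of Figure 1: M runs from 0.004N to 0.016N and L from
    0.002N to 0.02N, both in increments of 64; the pair maximizing the
    detector output nsim(M, L) is returned."""
    best = (-np.inf, None, None)
    for M in range(int(0.004 * N), int(0.016 * N) + 1, step):
        for L in range(int(0.002 * N), int(0.02 * N) + 1, step):
            out = detector(M, L)
            if out > best[0]:
                best = (out, M, L)
    return best  # (max nsim, M_O, L_O)
```

For N = 262144 (a 512x512 image) this examines roughly 50 x 74 pairs, consistent with the ~3,700 pairs quoted later in the text.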
In general, it is observed that for the DCT more coefficients are needed compared to the Hartley transform.

Table 1: M and L pairs producing the maximum detection for different transforms (DCT & Hartley)

Image     DCT (M, L)      Hartley (M, L)
Lena      1024, 1984      1024, 1600
House     1024, 832       1024, 576
Girl      1024, 3840      1024, 1024
Baboon    1024, 3648      1024, 1280
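The similarity function nsim is not defined explicitly in this excerpt; a plausible form, assuming the normalized correlation detector of Cox et al., would be:

```python
import numpy as np

def nsim(coeffs, watermark, M, L):
    """Normalized correlation between the candidate watermark and the
    L frequency-ordered coefficients following the M skipped ones."""
    x = coeffs[M:M + L]
    return float(np.dot(x, watermark) / np.sqrt(np.dot(x, x)))
```

A coefficient vector carrying the watermark then yields a much larger response than an unrelated pseudo-random sequence.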
3. DETECTION IMPROVEMENT IN PRACTICE

After those preliminary tests, the next logical step towards the evaluation of the scheme is to incorporate the selection algorithm into a common transform domain watermarking scheme. The scheme selected was based on the Hartley transform, in order to exhibit the improvement achieved by the adaptation of L and M for a transform that does not perform as well as others, like the DCT or wavelets. The corresponding results are given in Table 2. The image used is "Lena" and the attacks are the non-geometric attacks of the "CheckMark" software 10, a very efficient benchmarking tool for watermarking applications.
Figure 3: Selecting parameters for DCT.

Figure 4: Selecting parameters for Hartley.
Table 2: Hartley transform based watermarking (with/without adaptation)

                    Without Adaptation              With Adaptation
                    (heuristically chosen parameters)
Attack (number)     Detections    Percentage        Detections    Percentage
reSample (1)        1             100%              1             100%
Filtering (3)       3             100%              3             100%
ColorReduce (2)     2             100%              2             100%
JPEG (12)           11            92%               12            100%
Wavelet (10)        9             90%               10            100%
MAP (6)             5             83%               6             100%
ML (7)              3             43%               7             100%
Remodulation (4)    0             0%                4             100%
Copy (1)            0             0%                1             100%
Total (46)          34            74%               46            100%
The performed tests, although they exclude geometrical attacks, are extremely hard. Filtering includes various low-pass Gaussian-windowed, sharpening and median filters, and more. Compression with JPEG goes down to a quality factor of 10, while wavelet compression goes down to 0.1 bpp. Denoising with Wiener filtering, hard and soft thresholding, and denoising combined with remodulation are also included. The "Copy" attack involves estimating the watermark by means of Wiener filtering and copying it onto another image. For more information on the nature and parameters of the attacks, one can refer to 11. It is observed from the table that without the assistance of the adaptive scheme, the Hartley transform based scheme fails at the highest compression rates (both JPEG and wavelet), and for the Remodulation and Copy attacks. It fails even worse in the maximum likelihood estimation (ML) attacks, where it misses 4 out of 7. From this point of view, the adaptive selection of parameters produces a great improvement: the scheme that originally succeeded in only 74% of the cases now always succeeds. There is not a single case in which the enhanced method fails to identify the watermark. Although results are given for one image only, the improvement is clearly evident.
4. CONCLUSIONS – FUTURE DIRECTIONS

The Hartley transform has been selected intentionally to show how the results of a method based on a transform not known to perform well in watermarking applications can be boosted significantly by the proposed algorithm. Not only the Hartley, but all schemes that perform watermarking this way (frequency transform, selection of coefficients, multiplication by a pseudo-random noise sequence) can benefit from the proposed technique; the scheme is independent of the transform used. Some questions, though, remain to be answered. One is the possibility that false positive errors are increased. The answer to this question cannot easily be given, because it depends on the detection threshold selection (fixed, dynamic or other). In any case, many more experiments on a larger image set are needed to quantify this percentage. Another important issue has to do with the robustness vs. quality trade-off. We have seen that there may be two or more easily distinguishable peaks in the 3D surface. These correspond to local maxima of the nsim function. Until now we have used the global maximum, which happened to correspond to the lowest pair of values. This was the best solution in terms of the detector's reliability and robustness, but one could sacrifice a bit of robustness and select a lower peak that performs much better in terms of quality (larger PSNR values). A quality threshold could be included in the scheme, such as the just noticeable difference (JND), mean absolute
error (MAE), mean squared error (MSE), root mean square error (RMS) or other. The algorithm should then be changed as follows:

• Transform the input image
• Produce the 3D mesh for the nsim function
• For each pair of L, M values, find the corresponding quality metric
• Divide each value of the nsim function by the corresponding quality metric
• Find the global maximum of the resulting function
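The quality-weighted selection step can be sketched as follows. The division of the nsim mesh by the metric mesh follows the text; the weighting of each mesh by w1 and w2 is an assumption, since the exact use of the weights is not specified.

```python
import numpy as np

def quality_weighted_peak(nsim_mesh, metric_mesh, w1=1.0, w2=1.0):
    """Divide each nsim value by the corresponding quality metric
    (e.g. MSE, where lower is better) and locate the global maximum."""
    score = (w1 * nsim_mesh) / (w2 * metric_mesh)    # penalize high distortion
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return i, j                                      # mesh indices of the chosen (M, L) pair
```

A peak with a slightly lower nsim but a much lower distortion metric can thus win over the global nsim maximum.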
Figure 5: Scheme improvement by quality metric inclusion (block diagram: the original image is frequency transformed and L, M pairs are created; the detector's 3D mesh is weighted by w1 and the quality metric 3D mesh by w2; candidate L, M pairs are formed and the global maximum yields the selected pair L0, M0).
Additionally, weights w1 and w2 could be introduced for the division of nsim by the metric (Figure 5). Another approach could perform the inverse transform for each pair of L, M values in order to compute the JND between the original and watermarked images; of course, such an addition would result in significant computational cost. A query for "adaptive watermarking" in the Google search engine yields almost 6,500 hits. Interesting ideas have been proposed, some of them coming from different research fields, like genetic algorithms for example 12. None of them, though, addresses the adaptation problem in the same way as the proposed method. The algorithm presented in this communication is brute force and suboptimal, because it does not account for all possible L, M pairs; for a typical 512x512 image, 3,700 different pairs are examined. The performance boost achieved is impressive. Of course, the computational cost is discouraging for real time applications. On the other hand, if robustness is the main issue, as in a digital rights management (DRM) system, such an improvement is highly desirable. This fact, along with the independence from the transform selection, makes the algorithm a useful add-on for many watermarking applications. Of course, there is still room for improvement.
REFERENCES

1. I. Cox, M. Miller, J. Bloom, Digital Watermarking, Morgan Kaufmann Publishers, 2002.
2. I. Cox, J. Killian, F. Thompson Leighton, T. Shamoon, "Secure spread spectrum watermarking for multimedia", IEEE Trans. on Image Processing, Vol. 6, No. 12, pp. 1673-1687, December 1997.
3. M. Barni, F. Bartolini, V. Capellini, A. Piva, "A DCT-domain system for robust image watermarking", Signal Processing, Vol. 66, pp. 357-372, 1998.
4. M. Ramkumar, A. N. Akansu, A. A. Alatan, "A Robust Data-Hiding Scheme for Images Using DFT", Proc. 6th IEEE Conference on Image Processing (ICIP'99), pp. 211-215, Kobe, Japan, October 1999.
5. L. Xie, G. R. Arce, "Joint Wavelet Compression and Authentication Watermarking", Proc. 5th IEEE Conference on Image Processing (ICIP'98), Chicago, IL, USA, 1998.
6. J. R. Kim, Y. S. Moon, "A Robust Wavelet Based Digital Watermark Using Level-Adaptive Thresholding", Proc. 6th IEEE Conference on Image Processing (ICIP'99), p. 202, Kobe, Japan, October 1999.
7. S. A. M. Gilani, A. N. Skodras, "DLT-Based Digital Image Watermarking", Proc. First IEEE Balkan Conference on Signal Processing, Communications, Circuits and Systems, Istanbul, Turkey, June 1-3, 2000.
8. S. Pereira, J. J. K. O'Ruanaidh, T. Pun, "Secure Robust Digital Watermarking Using the Lapped Orthogonal Transform", SPIE Conference on Security and Watermarking of Multimedia Content, Proceedings of SPIE, Vol. 3657, pp. 21-30, San Jose, California, USA, January 23-29, 1999.
9. V. Fotopoulos, A. N. Skodras, "A Subband DCT Approach to Image Watermarking", Proc. X European Signal Processing Conference (EUSIPCO-2000), Tampere, Finland, September 5-8, 2000.
10. http://watermarking.unige.ch/Checkmark/
11. S. Voloshynovskiy, S. Pereira, V. Iquise, T. Pun, "Attack modeling: Towards a second generation benchmark", Signal Processing, Special Issue: Information Theoretic Issues in Digital Watermarking, May 2001.
12. C. H. Huang, J. L. Wu, "A watermark optimization technique based on genetic algorithms", SPIE Electronic Imaging 2000, San Jose, January 2000.