Digital Audio Watermarking and DRM Enforcement

Digital Rights Management (DRM) is a topic that never fails to stir up controversial and still unresolved discussions about digital copyright and how much control publishers have over published, copyrighted media. The implementations of many widely accepted DRM methodologies are proprietary. This post implements one simple but effective DRM technique, digital audio watermarking, and follows it with a discussion of DRM enforcement.

The Origin Story

Back at the turn of the millennium, portable MP3 players were taking on traditional music systems, the internet was a thing, and people figured out they need not buy all of their music when it could be ripped off of Napster. Creators and production houses now had a problem: they had to figure out who was authorized to use their media and who was consuming it illegally. This led to the adoption of DRM techniques in audio files, with iTunes being an early adopter that restricted access to music from its store to authorized users only. It had its Achilles' heel, though. For one, users complained that they could not play legally purchased music because their device was labelled incompatible. Multi-device streaming was not a thing yet, and users often had to pay multiple times to access their music on multiple devices. These early models had many more caveats and soon fell out of use.

Importance in modern times

Music, audiobooks, podcasts and all sorts of streamed audio have never been in a brighter spotlight than now. Ergo, there are lots of intellectual property agreements to be agreed to and enforced. Though this happens behind the scenes and end users rarely notice the DRM enforcement in streaming apps, it helps to understand a fair bit of what is happening underneath. This post demonstrates one such DRM enforcement technique for audio files: digital audio watermarking.

Digital Audio Watermarking

Digital audio watermarking is a technique used to embed inaudible information in digital audio for purposes including ownership verification, copyright enforcement, and carrying sensitive information. Digital watermarking has been proposed as a valid solution to the requirement of copyright protection and authentication of multimedia data in a networked environment, since it makes it possible to identify the author, owner, distributor or authorized consumer of a piece of multimedia data.

Almost all audio watermarking techniques are based on the perceptual properties of the human auditory system. Inaudibility, robustness to attacks and the watermark data rate are the three main pillars to take care of in this process. The watermarked signal should be perceptually similar to the original audio signal.

Watermarks can be embedded in the Fourier domain, time domain, sub-band domain or wavelet domain, or by echo hiding. Whichever domain is used, an effective watermarking scheme must satisfy the following requirements:

  • Imperceptibility
    The quality of the audio should be retained after adding the watermark. Imperceptibility can be evaluated using both objective and subjective measures. According to International Federation of the Phonographic Industry (IFPI) recommendations, a watermarked audio signal should maintain a Signal to Noise Ratio (SNR) of more than 20 dB.
  • Security
    Watermarked signals should not reveal any clues about the watermarks in them. The security of the watermarking procedure must depend on secret keys, not on the secrecy of the watermarking algorithm.
  • Robustness
    The watermark should remain extractable from a watermarked audio signal after various signal processing attacks.
  • Payload
    The amount of data that can be embedded into the host audio signal without losing imperceptibility. For audio signals, data payload refers to the number of watermark data bits that may be reliably embedded within a host signal per unit of time, usually measured in bits per second (bps). The data payload should be more than 20 bps.
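
The SNR requirement above can be checked numerically. The following is a minimal MATLAB sketch, not part of the original experiment: the two tones are hypothetical stand-ins for a host signal and its watermarked version, and the sampling rate is an assumption.

```matlab
% Hypothetical signals for illustration only (not the post's audio files).
fs = 44100;                                  % assumed sampling rate in Hz
t = (0:fs-1)' / fs;                          % one second of sample times
host = 0.5 * sin(2*pi*1000*t);               % stand-in for the original audio
marked = host + 0.005 * sin(2*pi*3000*t);    % stand-in for the watermarked audio

% SNR in dB: power of the host over power of the embedding distortion
distortion = marked - host;
snrDb = 10 * log10(sum(host.^2) / sum(distortion.^2));

meetsIfpi = snrDb > 20;                      % IFPI recommendation quoted above
fprintf('SNR = %.1f dB, above 20 dB: %d\n', snrDb, meetsIfpi);
```

With these amplitudes the distortion is 100 times smaller than the host, so the script reports an SNR of 40 dB, comfortably above the 20 dB recommendation.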

Singular Value Decomposition (SVD) is a useful tool of Linear Algebra with several applications in image compression, watermarking and other areas of signal processing.

A few years ago, SVD was explored for image watermarking applications. It is an effective numerical analysis tool used to analyze matrices, and the SVD of a matrix has interesting properties that can be used to our advantage.

To implement one iteration of the digital audio watermarking process, MATLAB® can be used to model the process. SVD is available directly as a built-in function in MATLAB®:
% This function returns the singular values of 
% matrix A in descending order
s = svd(A); 
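
The watermarking steps later in this post need the full decomposition rather than only the singular values. As a small illustration, the three-output form of `svd` returns all three factors; the matrix `A` below is an arbitrary example, not data from the post.

```matlab
A = [4 0; 3 -5];                 % arbitrary example matrix
[U, S, V] = svd(A);              % A = U*S*V', with S diagonal

reconErr = norm(A - U*S*V');     % reconstruction error, near machine precision
s = diag(S);                     % the same values that s = svd(A) returns

fprintf('reconstruction error = %.2e\n', reconErr);
```

Changing the diagonal of `S` slightly and multiplying the factors back together changes `A` only slightly, which is the behaviour a watermark embedded in the singular values relies on.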

The Discrete Wavelet Transform (DWT) has received a tremendous amount of interest in many important signal processing applications in recent times. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location in time. With the DWT, the audio signal can be split into sub-bands ranging from low frequency to high frequency. The human ear is less sensitive to the high-frequency spectrum, which is why the high-frequency component is usually discarded in audio compression. Information to be hidden is therefore embedded into the low-frequency component, where it is more likely to survive such processing.

DWT is available directly as a function in MATLAB® (it ships with the Wavelet Toolbox):

% This returns the single-level discrete wavelet 
% transform (DWT) of the vector x using the 
% wavelet specified by wname. 
% The return values are the approximation 
% coefficients vector cA and
% detail coefficients vector cD of the DWT.
[cA,cD] = dwt(x,wname);
% This returns the single-level reconstructed 
% approximation coefficients vector X based 
% on approximation and detail coefficients 
% vectors cA and cD, and using the wavelet 'wname'. 
% The idwt command performs a single-level
% one-dimensional wavelet reconstruction with 
% respect to a particular wavelet
X = idwt(cA,cD,wname);
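
To see the two calls working together, here is a short round-trip sketch. The Haar wavelet, the 440 Hz tone and the sampling rate are assumptions chosen for illustration, and the Wavelet Toolbox is required for `dwt` and `idwt`.

```matlab
fs = 8000;                           % assumed sampling rate in Hz
t = (0:1023)' / fs;                  % 1024 samples
x = sin(2*pi*440*t);                 % hypothetical test tone
wname = 'haar';                      % assumed wavelet

[cA, cD] = dwt(x, wname);            % low- and high-frequency sub-bands
xRec = idwt(cA, cD, wname);          % single-level reconstruction

roundTripErr = max(abs(x(:) - xRec(:)));
fprintf('%d samples -> %d + %d coefficients, max error %.2e\n', ...
        numel(x), numel(cA), numel(cD), roundTripErr);
```

Each sub-band holds half as many coefficients as the input, and the reconstruction matches the input up to floating-point error.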

Algorithm to embed a watermark in a given audio file

Audio watermarking algorithm
  • 1
    Sample the original audio signal at a chosen sampling rate (a fixed number of samples per second). Then partition the sampled signal into frames, each containing a fixed number of samples.
  • 2
    Perform the DWT on each frame of the original audio signal. This operation produces two sub-bands: the approximation sub-band A and the details sub-band D. In the code these are represented as cA and cD.
  • 3
    Apply SVD to the approximation sub-band A produced by the DWT. SVD decomposes the DWT coefficients into three matrices, U, S and VT, where U and V are orthogonal (unitary) matrices and S is the diagonal matrix of singular values.
  • 4
    Perform steps 2 and 3 on the watermark signal too.
  • 5
    Embed the watermark into the DWT-SVD-transformed original audio signal according to the formula Sem = S + (k ∗ Sw), where Sem = singular matrix of the watermarked audio signal; S = singular matrix of the original audio signal; Sw = singular matrix of the watermark audio signal; and k is a real number which acts as the key.
  • 6
    Produce the final watermarked audio signal as follows:
    • Apply the inverse SVD operation using the U and VT matrices, which were unchanged, and the S matrix, which has been modified according to the equation in step 5.
    • Apply the inverse DWT operation to obtain each watermarked audio frame. The overall watermarked audio signal is obtained by concatenating all watermarked frames.
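
The embedding steps can be sketched in MATLAB as below. This is a minimal sketch under stated assumptions rather than the exact script used for this post: the Haar wavelet, the 2048-sample frame length, the reshape of each coefficient vector into a 32x32 matrix before the SVD, the value k = 0.01 and the synthetic tones standing in for real audio are all choices made here for illustration.

```matlab
% Assumed parameters (not specified in the algorithm above)
fs = 44100;  n = 32;  frameLen = 2 * n^2;  numFrames = 4;
k = 0.01;    wname = 'haar';

% Synthetic stand-ins for the host and watermark audio
t = (0:numFrames*frameLen-1)' / fs;
host = 0.5*sin(2*pi*440*t) + 0.2*sin(2*pi*1200*t);
wm   = 0.3*sin(2*pi*880*t);

% Step 1: partition into frames, one frame per column
hostFrames   = reshape(host, frameLen, numFrames);
wmFrames     = reshape(wm, frameLen, numFrames);
markedFrames = zeros(frameLen, numFrames);

for f = 1:numFrames
    % Step 2: single-level DWT of the host frame and the watermark frame
    [cA, cD] = dwt(hostFrames(:, f), wname);
    [cAw, ~] = dwt(wmFrames(:, f), wname);

    % Steps 3 and 4: SVD of the approximation coefficients, arranged as
    % an n-by-n matrix (the arrangement is an assumption)
    [U, S, V] = svd(reshape(cA, n, n));
    Sw = svd(reshape(cAw, n, n));        % singular values of the watermark

    % Step 5: Sem = S + k * Sw
    Sem = S + k * diag(Sw);

    % Step 6: inverse SVD, then inverse DWT with the untouched details
    cAem = reshape(U * Sem * V', [], 1);
    markedFrames(:, f) = idwt(cAem, cD, wname);
end

marked = markedFrames(:);                % concatenate the frames
snrDb = 10*log10(sum(host.^2) / sum((marked - host).^2));
fprintf('SNR of watermarked signal: %.1f dB\n', snrDb);
```

A smaller k gives a higher SNR but leaves less margin for recovering the watermark after processing, so k trades imperceptibility against robustness.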

Algorithm to extract the watermark from a watermarked audio file

Audio watermark recovery algorithm
  • 1
    Perform steps 2 and 3 of the embedding procedure to obtain the S matrix for every frame of the watermarked audio signal.
  • 2
    Compute the singular matrix of the watermark from the DWT-SVD-transformed watermarked audio signal according to the formula Mex = (Sem − S)/k,
    where Mex = singular matrix of the extracted watermark audio signal, Sem = singular matrix of the watermarked audio signal, S = singular matrix of the original audio signal, and k is the same value chosen in the watermarking process.
  • 3
    Produce the final watermark audio signal as follows:
    • Apply the inverse SVD operation using the U and VT matrices of the watermark signal, which were unchanged, and the Mex matrix obtained from the equation above.
    • Apply the inverse DWT operation to obtain each watermark frame and piece them all together.
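
The recovery steps can be sketched for a single frame as follows. The same assumptions apply as for any illustration here: Haar wavelet, a 2048-sample frame reshaped to 32x32, k = 0.01 and synthetic tones. The sketch also assumes that the extractor keeps the host's S matrix and the watermark's U, V and detail coefficients as side information, since the formula Mex = (Sem − S)/k cannot be evaluated without S.

```matlab
% Assumed parameters and synthetic signals (illustration only)
n = 32;  frameLen = 2 * n^2;  k = 0.01;  wname = 'haar';
t = (0:frameLen-1)' / 44100;
hostFrame = 0.5*sin(2*pi*440*t) + 0.2*sin(2*pi*1200*t);
wmFrame   = 0.3*sin(2*pi*880*t);

% Embedding side, condensed, so that this example runs on its own
[cA, cD]     = dwt(hostFrame, wname);
[cAw, cDw]   = dwt(wmFrame, wname);
[U, S, V]    = svd(reshape(cA, n, n));
[Uw, Sw, Vw] = svd(reshape(cAw, n, n));
markedFrame  = idwt(reshape(U * (S + k*Sw) * V', [], 1), cD, wname);

% Recovery step 1: DWT and SVD of the watermarked frame
[cAm, ~] = dwt(markedFrame, wname);
Sem = diag(svd(reshape(cAm, n, n)));

% Recovery step 2: Mex = (Sem - S) / k
Mex = (Sem - S) / k;

% Recovery step 3: inverse SVD with the watermark's own U and V, then
% inverse DWT with the watermark's detail coefficients (side information)
cAwRec = reshape(Uw * Mex * Vw', [], 1);
wmRec  = idwt(cAwRec, cDw, wname);

recoveryErr = max(abs(wmRec(:) - wmFrame(:)));
fprintf('max recovery error: %.2e\n', recoveryErr);
```

In this ideal case, with no processing between embedding and extraction, the watermark comes back up to floating-point error. Any attack on the watermarked signal would perturb Sem and the error would be amplified by 1/k.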

Practical implementation of this process

The watermarking process presented above is extremely rudimentary and serves as a proof of concept. Present-day systems such as Widevine use more sophisticated processes to enforce DRM. However, one methodology of enforcement can be as follows:

A media server must watermark the music and movies it hosts dynamically, on the go. A prebuilt media player must run on the client device, such as a laptop, desktop or smartphone. When this device requests a media file, the server sends the watermarked file and an encrypted secret key to the device, where the de-watermarking process takes place. The media is played to the user only if the watermarks match, so only authorized users can view the content; pirated media will carry different watermarks and will not play on the user's device. Schemes like this are fairly easy to implement but rather difficult to convince everyone to adopt. As a side note, music downloaded on smartphones from streaming apps is stored as encrypted files to protect against illegal distribution.

How the process fares on actual audio files

Here are a few waveforms processed by MATLAB® and visualized using Audacity:

[Figure: Host audio file waveform]
[Figure: Watermark audio file waveform]
[Figure: Waveform of watermarked audio file generated by MATLAB]
[Figure: Waveform of watermark extracted from watermarked audio]

The waveforms illustrate the efficacy of the algorithm in achieving the watermarking process. They suggest that no practical degradation of audio occurs after watermarking, but this remains to be tested on a larger number of audio files spanning the entire human auditory range. This should be a good enough process to enforce minimalist DRM.

Enter, caveats

  • Potential exists for echo, chorus and reverb effects in the watermarked signal. These need to be processed and rectified to achieve a better signal-to-noise ratio in the extracted audio signal.
  • The watermarking schemes are developed under the assumption that there is perfect synchronization between the original and the watermarked audio signals. The work can be extended to de-synchronized watermarked audio using synchronization sequences.
  • The keying scheme from above, a single real number k, is far from ideal. This can be worked around with industry-standard cryptographic techniques such as AES for encryption and SHA-family hashes for integrity checks.

The acceptance part

Would you, as the authorized end consumer of a media file, be accepting of ‘You have zero control over your bits anyway. Get over it.’?

Not all paying consumers are thieves. Would you not rather buy a movie online or from a streaming service than wait for the disc to get released and go through the hassle of maintaining it? Would you not rather stream music than go through the process of building your own authorized library? Undoubtedly we are entering into the terrain of copyright ownership and licensing management with this topic here. The whole point of DRM is to make sure that copyrighted digital files are restricted from being illegally reproduced and redistributed.

DMCA can cry strikes all it wants, but peer-to-peer networks are never bothered by the strikes. Static content is impossible to secure against digital redistribution. Without financial incentives, many independent creators are thrown off guard and forced to find alternatives. The quality of content produced might well depend on the support of paying consumers. Shifting to OpenCulture-based implementations might work for well-known creators, but not for everyone. Individual donations are not voluminous enough to consider abolishing strict DRM enforcement. A perfect and balanced digital media marketplace needs the support of all involved. Otherwise, it becomes a closed library like Netflix, YouTube or Spotify, which heavily enforce DRM and end up annoying a large user base. Unfortunately, mass redistribution doesn’t require much technical skill, and unprotected static media ends up a victim of digital piracy. This is the reason a dynamic, on-the-go solution was highlighted in the section above. Streaming aggregators seem to have the best solution against these practices so far. Their access model is login-based rather than tied to physical tokens of ownership, which presents yet another challenge: unauthorized co-operation among users. Stern legal remedies should help the case, though.

If there is a key takeaway from this post, it should most certainly be this –

In my opinion, content protection and rights management exist only as vestigial efforts to preserve existing models of content sales for as long as the bulk of the consumer market remains clueless. History has shown every content-protection scheme invented for consumer-grade goods to have almost no impact on piracy, and little impact on casual copying, except when it has doomed the technology carrying it. This is inevitable.
The question before us is not about how to protect the bits, but how to protect the investments in creation of the bits, and how best to preserve the relationships between people and content. I submit that establishing a market for licenses to digital content is the last best hope for providing a continuing revenue stream for static content.

Mark S. Manasse, Compaq Systems Research Center, Palo Alto, CA


Check out an interesting licensing model created to go against the conventional copyright ideologies, aptly named - Copyleft
