Digital Audio Watermarking and DRM Enforcement

Audio Feb 8, 2021

Digital Rights Management (DRM) never fails to stir up controversial and still unresolved discussions about digital copyright and how much control publishers have over published, copyrighted media. Implementations of many widely accepted DRM methodologies are proprietary. This post attempts to implement one simple but effective DRM technique, Digital Audio Watermarking, followed by a discussion on DRM enforcement.

The Origin Story

Back at the turn of the millennium, portable MP3 players were taking on traditional music player installations, the internet was a thing, and people figured out they need not buy all of their music when it could be ripped off of Napster. Creators and production houses now had a problem: they had to figure out who was authorized to use their media and who was consuming it illegally. This led to the adoption of DRM techniques in audio files, with iTunes being an early adopter that restricted access to music from its store to authorized users. It had its Achilles’ heels though, one being users complaining that they could not play their legally purchased music because their device was labelled incompatible. Multi-device streaming was not a thing yet, and users often had to pay multiple times to access their music on multiple devices. These early models had many more caveats and were soon abandoned.

Importance in modern times

Music, audiobooks, podcasts and audio streaming of all sorts have never been in a brighter spotlight than now. Ergo, there are lots of intellectual property agreements to be agreed to and enforced. Though these happen behind the scenes and end users rarely notice the DRM enforcement by streaming apps, it helps to understand a fair bit of what is happening underneath. This post demonstrates one such DRM enforcement technique for audio files: Digital Audio Watermarking.

Digital Audio Watermarking

Digital audio watermarking is a technique used to embed inaudible information in digital audio for purposes including ownership verification, copyright enforcement, and carrying sensitive information. Digital watermarking has been proposed as a valid solution to the requirement of copyright protection and authentication of multimedia data in a networked environment, since it makes it possible to identify the author, owner, distributor or authorized consumer of the data.

Almost all audio watermarking techniques are based on the perceptual properties of the human auditory system. Inaudibility, robustness to attacks and the watermark data rate are the three main pillars of this process. The watermarked signal must be perceptually similar to the original audio signal.

The watermarks can be embedded in the Fourier domain, time domain, sub-band domain, wavelet domain and by echo hiding. As such, an effective watermarking scheme must satisfy the following requirements:

Modus Operandi

Singular Value Decomposition (SVD) is a useful tool of Linear Algebra with several applications in image compression, watermarking and other areas of signal processing.

A few years ago, SVD was explored for image watermarking applications. It is an effective numerical tool for analyzing matrices. The SVD of a matrix has interesting properties that are used to our advantage:

  • The sizes of the matrices from SVD transformation are not fixed and the matrices need not be square
  • Changing Singular Value (SV) slightly does not affect the quality of the signal much
  • The SVs are largely stable under common signal processing operations
  • The SVs satisfy intrinsic algebraic properties

To implement one iteration of the digital audio watermarking process, MATLAB® can be employed as the tool of choice to model the process. SVD is available directly through MATLAB®’s built-in svd function.
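For readers without MATLAB®, the decomposition and the second property listed above can be sketched in Python with NumPy. This is only an illustration under assumptions: the matrix is random stand-in data rather than audio, and the 1% nudge is an arbitrary choice, not a value from this post.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))      # a non-square matrix works fine

# Thin SVD: A = U @ diag(s) @ Vt, with s sorted in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
reconstruction_ok = np.allclose(A, U @ np.diag(s) @ Vt)

# Nudge the largest singular value by 1% and rebuild the matrix
s_marked = s.copy()
s_marked[0] *= 1.01
A_marked = U @ np.diag(s_marked) @ Vt

# The Frobenius norm of the change equals the change in the singular value,
# so the relative change of the whole matrix cannot exceed 1%
change = np.linalg.norm(A_marked - A)
relative_change = change / np.linalg.norm(A)
print(reconstruction_ok, relative_change)
```

Because U and Vt are orthonormal, a small change to a singular value stays a small change in the rebuilt matrix, which is what makes singular values a convenient place to hide data.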

Discrete Wavelet Transform (DWT) has received a tremendous amount of interest in many important signal processing applications in recent times. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location in time. With the DWT, the audio signal can be decomposed into sub-bands ranging from low frequency to high frequency. The human ear is less sensitive to the high frequency spectrum, which is why the high frequency component is usually discarded in the audio compression process. A watermark placed there would be discarded with it, so the information to be hidden is embedded into the low frequency component instead.

DWT is available in MATLAB® through the dwt function of the Wavelet Toolbox.
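To make the low/high frequency split concrete, here is a minimal single-level DWT in plain Python. It assumes the Haar wavelet purely for simplicity (the post does not say which wavelet was used), and the function names haar_dwt and haar_idwt are made up for this sketch.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    r = math.sqrt(2.0)
    # Scaled pairwise sums keep the low frequency content
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    # Scaled pairwise differences keep the high frequency content
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the rebuilt sample pairs."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / r)
        out.append((a - d) / r)
    return out

samples = [0.0, 0.2, 0.4, 0.3, -0.1, -0.5, -0.2, 0.1]
cA, cD = haar_dwt(samples)        # low and high frequency halves
restored = haar_idwt(cA, cD)      # matches samples up to rounding error
```

The approximation coefficients cA are the low frequency component where a watermark would be embedded, and cD is the part a lossy compressor is more likely to throw away.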

Algorithm to embed a watermark in a given audio file
Algorithm to extract the watermark from a watermarked audio file
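
The two algorithm figures are not reproduced here. As a stand-in, the sketch below shows one common way DWT and SVD are combined for non-blind watermarking; it is an assumption for illustration and not necessarily the algorithm used in this post. The Haar wavelet, the strength alpha, the 8×8 binary watermark and all function names are hypothetical choices.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: (approximation, detail) coefficients."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def haar_idwt(cA, cD):
    """Inverse of haar_dwt."""
    x = np.empty(2 * len(cA))
    x[0::2] = (cA + cD) / np.sqrt(2)
    x[1::2] = (cA - cD) / np.sqrt(2)
    return x

def embed(host, watermark, alpha=0.05):
    """Hide an n x n watermark in the low frequency band of 2*n*n samples."""
    n = watermark.shape[0]
    cA, cD = haar_dwt(host)
    U, s, Vt = np.linalg.svd(cA.reshape(n, n))
    # Add the scaled watermark to the singular value matrix, then decompose
    # the sum again to get the singular values of the marked signal
    Uw, sw, Vwt = np.linalg.svd(np.diag(s) + alpha * watermark)
    marked_cA = (U @ np.diag(sw) @ Vt).ravel()
    key = (Uw, Vwt, s)               # side information needed to extract
    return haar_idwt(marked_cA, cD), key

def extract(marked, key, alpha=0.05):
    """Recover the watermark from a marked signal using the key."""
    Uw, Vwt, s = key
    n = len(s)
    cA, _ = haar_dwt(marked)
    s_marked = np.linalg.svd(cA.reshape(n, n), compute_uv=False)
    return (Uw @ np.diag(s_marked) @ Vwt - np.diag(s)) / alpha

rng = np.random.default_rng(1)
n = 8
host = rng.uniform(-1, 1, size=2 * n * n)       # stand-in for audio samples
watermark = rng.integers(0, 2, size=(n, n)).astype(float)
marked, key = embed(host, watermark)
recovered = extract(marked, key)
```

Extraction here needs the key produced at embedding time, so the scheme is non-blind. That key also carries much of the watermark’s structure, so this pattern should be read as a demonstration of the mechanics rather than a secure design.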

Practical implementation of this process

The watermarking process presented above is extremely rudimentary and serves as a proof of concept. Current systems such as Widevine use more sophisticated processes to enforce DRM. However, one methodology of enforcement can be as follows:

A media server hosting media files must watermark the music and movies it hosts dynamically, on the go. A prebuilt media player must run on a client device such as a laptop, desktop or smartphone. When this device requests a media file, the server sends the watermarked file and an encrypted secret key to the device, where the watermark extraction takes place. The media is played to the user only if the watermarks match, so only authorized users can view the content; pirated media will carry a different watermark and will not play on the user’s device. Such schemes are fairly easy to implement, but it is rather difficult to convince everyone to adopt them. As a side note, music downloaded on smartphones from streaming apps is typically stored as encrypted files to protect against illegal distribution.
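
The “play only if the watermarks match” step could look roughly like the sketch below. Everything in it is an assumption made for illustration: deriving the expected watermark bits from an HMAC over a user ID and media ID, the 64-bit length, and the names expected_watermark and may_play are hypothetical and not taken from any real DRM system.

```python
import hashlib
import hmac

def expected_watermark(secret_key, user_id, media_id, n_bits=64):
    """Derive the watermark bits a given user's copy should carry (max 256)."""
    message = f"{user_id}:{media_id}".encode()
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()
    # Unpack the digest bytes into a list of 0/1 bits, most significant first
    bits = [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]
    return bits[:n_bits]

def may_play(extracted_bits, secret_key, user_id, media_id, max_bit_errors=0):
    """Allow playback only if the extracted bits match the expected ones."""
    expected = expected_watermark(secret_key, user_id, media_id, len(extracted_bits))
    errors = sum(a != b for a, b in zip(extracted_bits, expected))
    return errors <= max_bit_errors

key = b"demo-secret"                          # placeholder key for the example
bits = expected_watermark(key, "alice", "track-1")
tampered = [bits[0] ^ 1] + bits[1:]           # same bits with one flipped
print(may_play(bits, key, "alice", "track-1"))       # True
print(may_play(tampered, key, "alice", "track-1"))   # False
```

The max_bit_errors parameter is there because a real extractor rarely recovers every bit cleanly after compression or resampling, so some tolerance is usually needed.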

How the process fares on actual audio files

Here are a few waveforms processed by MATLAB® and visualized using Audacity:

Host audio file waveform
Watermark audio file waveform
Waveform of watermarked audio file generated by MATLAB
Waveform of watermark extracted from watermarked audio

The waveforms illustrate the efficacy of the algorithm in achieving the watermarking process. They suggest that no practical degradation of audio occurs after watermarking, but this remains to be tested on a larger number of audio files spanning the entire human auditory range. This should be a good enough process to enforce minimalist DRM.

Enter, caveats

  • Potential exists for echo, chorus and reverb effects in the watermarked signal. These are to be processed and rectified to achieve a better Signal-to-Noise ratio in the extracted audio signal.
  • The watermarking schemes are developed under the assumption that there is perfect synchronization between the original and the watermarked audio. The work can be extended to de-synchronized watermarked audio using synchronization sequences.
  • The ciphering technique from above is far from ideal. It can be replaced with industry standard cryptography, such as AES for encryption and the SHA family for integrity hashing.
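
The Signal-to-Noise ratio mentioned in the first caveat can be measured with a few lines of Python. This is a minimal sketch that treats the difference between a reference signal and a processed one as noise; the function name snr_db and the sample values are made up for the example.

```python
import math

def snr_db(reference, processed):
    """Signal-to-Noise ratio in dB, treating (reference - processed) as noise."""
    if len(reference) != len(processed):
        raise ValueError("signals must have the same length")
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, processed))
    if noise_power == 0:
        return math.inf              # identical signals, no noise at all
    return 10.0 * math.log10(signal_power / noise_power)

reference = [1.0, -1.0, 1.0, -1.0]
processed = [1.1, -1.0, 1.0, -1.0]   # one sample disturbed by 0.1
print(round(snr_db(reference, processed), 2))   # about 26.02 dB
```

The same function works both for host versus watermarked audio (inaudibility) and for original versus extracted watermark (extraction quality), as long as the two signals are aligned sample for sample.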

The acceptance part

Would you, as the authorized end consumer of a media file, be accepting of ‘You have zero control over your bits anyway. Get over it.’?

Not all paying consumers are thieves. Would you not rather buy a movie online or from a streaming service than wait for the disc to get released and go through the hassle of maintaining it? Would you not rather stream music than go through the process of building your own authorized library? Undoubtedly we are entering into the terrain of copyright ownership and licensing management with this topic here. The whole point of DRM is to make sure that copyrighted digital files are restricted from being illegally reproduced and redistributed.

DMCA can cry strikes all it wants to, but peer-to-peer networks are never bothered by the strikes. Static content is impossible to secure against digital redistribution. Without financial incentives, many independent creators are thrown off guard and forced to find alternatives. The quality of content produced might as well depend on the support of paying consumers. Shifting to OpenCulture based implementations might work for well known creators but not for everyone. Individual donations are not voluminous enough to consider abolishing strict DRM enforcement. A perfect and balanced digital media marketplace needs the support of all involved. Else, it becomes a closed library like Netflix/YouTube/Spotify, which heavily enforce DRM and end up annoying a large user base. Unfortunately, mass redistribution doesn’t require much technical skill, and unprotected static media ends up a victim of digital piracy. This is the reason a dynamic, on the go solution was highlighted in the section above. Streaming aggregators seem to have the best solution against these practices so far. Their access is login based rather than tied to physical tokens of ownership, which presents yet another challenge: unauthorized co-operation among users. Stern legal remedies should help the case though.

If there is a key takeaway from this post, it should most certainly be this –

In my opinion, content protection and rights management exist only as vestigial efforts to preserve existing models of content sales for as long as the bulk of the consumer market remains clueless. History has shown every content-protection scheme invented for consumer-grade goods to have almost no impact on piracy, and little impact on casual copying, except when it has doomed the technology carrying it. This is inevitable.
The question before us is not about how to protect the bits, but how to protect the investments in creation of the bits, and how best to preserve the relationships between people and content. I submit that establishing a market for licenses to digital content is the last best hope for providing a continuing revenue stream for static content.

- Mark S. Manasse, Compaq Systems Research Center, Palo Alto, CA


Akshobhya Jamadagni
