MSc Thesis

Perceptual Compression of Digital Audio -- Abstract

This document presents the implementation of a perceptual CODEC that takes as input an uncompressed stereo file sampled at 44100 Hz and quantized at 16 bits. The waveform is processed in blocks of 512 samples and transformed using the Modified Discrete Cosine Transform (MDCT). The resulting coefficients are quantized based on information gathered from a psychoacoustic model and packed into a variable-bit-rate file. The decoder takes this file as input and outputs a file in the same format as the original input. The average data rates observed ranged from 58 kbps to 340 kbps, with rates around 240 kbps being the most common. Subjective quality measurements are presented for the implemented format, MPEG-1 Layer 3, and MPEG-4 Low Complexity Advanced Audio Coding. Based on the results of these tests, the CODEC achieves a high quality profile.
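
As an illustration of the block transform stage described in the abstract, here is a minimal Python sketch of a windowed Modified Discrete Cosine Transform with overlap-add reconstruction. It is not the thesis code. It assumes that each 512-sample block overlaps its neighbour by 50% (so every block yields 256 coefficients) and that a sine window is used; the abstract states neither detail. The function names are hypothetical, and the psychoacoustic quantization and bit packing are left out.

```python
import math

def sine_window(length):
    # Sine window (an assumption); it satisfies w[n]^2 + w[n + length/2]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block):
    # Direct O(N^2) MDCT: 2N windowed samples in, N coefficients out.
    two_n = len(block)
    half = two_n // 2
    n0 = 0.5 + half / 2.0
    return [
        sum(block[n] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
            for n in range(two_n))
        for k in range(half)
    ]

def imdct(coeffs):
    # Inverse MDCT: N coefficients in, 2N time-aliased samples out.
    # The 2/N scale matches a window whose squared halves sum to 1.
    half = len(coeffs)
    n0 = 0.5 + half / 2.0
    return [
        (2.0 / half) * sum(coeffs[k] * math.cos(math.pi / half * (n + n0) * (k + 0.5))
                           for k in range(half))
        for n in range(2 * half)
    ]

def analyze(signal, block_len=512):
    # Split into 50%-overlapping blocks, window each one, and transform it.
    hop = block_len // 2
    w = sine_window(block_len)
    return [
        mdct([w[n] * signal[start + n] for n in range(block_len)])
        for start in range(0, len(signal) - block_len + 1, hop)
    ]

def synthesize(frames, block_len=512):
    # Inverse-transform each block, window it again, and overlap-add.
    hop = block_len // 2
    w = sine_window(block_len)
    out = [0.0] * (hop * (len(frames) + 1))
    for i, coeffs in enumerate(frames):
        y = imdct(coeffs)
        for n in range(block_len):
            out[i * hop + n] += w[n] * y[n]
    return out
```

With unquantized coefficients, `synthesize(analyze(x))` reproduces `x` everywhere except the first and last half block, which are covered by only one window. In a CODEC of the kind described here, the loss would come from quantizing the coefficients between those two calls.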

99% confidence interval, Tukey's HSD=0.085

To download the PDF document in Spanish, click here (sorry, no English version).
The source code is available here.
The Win32 executable file is here. You'll need the FFTW DLL, available here.

The sound files used for testing can be found here.