Audioread Bug with Opus (maximum value returned exceeding 1)

Hello,
I think there is a bug in the audioread function when reading Opus files: the maximum sample values it returns are greater than 1. The source files are perfectly fine. They come from a well-mastered CD, and the Opus files were encoded from it with ffmpeg. I have run this code in MATLAB R2023b and in the R2025a prerelease, and the issue persists in both. Please help.
Output:
--------------------------------------------------
File: C:\Users\Admin\Desktop\Datasets\Compressor\input.wav
Max Value: 0.98535 0.98593
Filename: 'C:\Users\Admin\Desktop\Datasets\Compressor\input.wav'
CompressionMethod: 'Uncompressed'
NumChannels: 2
SampleRate: 44100
TotalSamples: 166800900
Duration: 3.7823e+03
Title: []
Comment: []
Artist: []
BitsPerSample: 16
--------------------------------------------------
--------------------------------------------------
"File: " "C:\Users\Admin\Desktop\Datasets\Compressor\MP3_Encodes\C…"
Max Value: 1 1
Filename: 'C:\Users\Admin\Desktop\Datasets\Compressor\MP3_Encodes\CBR\CBR_128_MP3.mp3'
CompressionMethod: 'MP3'
NumChannels: 2
SampleRate: 44100
TotalSamples: 166803837
Duration: 3.7824e+03
Title: []
Comment: []
Artist: []
BitRate: 128
--------------------------------------------------
--------------------------------------------------
File: C:\Users\Admin\Desktop\Datasets\Compressor\Opus_Presets\CBR\CBR_128_Opus.opus
Max Value: 1.3834 1.4113
Filename: 'C:\Users\Admin\Desktop\Datasets\Compressor\Opus_Presets\CBR\CBR_128_Opus.opus'
CompressionMethod: 'Opus'
NumChannels: 2
SampleRate: 48000
TotalSamples: 181552000
Duration: 3.7823e+03
Title: []
Comment: []
Artist: []
--------------------------------------------------
Code to reproduce bug:
% Define the list of input files (add more file paths as needed)
inputFiles = {
    'C:\Users\Admin\Desktop\Datasets\Compressor\input.wav', % original WAV file
    "C:\Users\Admin\Desktop\Datasets\Compressor\MP3_Encodes\CBR\CBR_128_MP3.mp3",
    'C:\Users\Admin\Desktop\Datasets\Compressor\Opus_Presets\CBR\CBR_128_Opus.opus'
    };
% Loop through each input file
for i = 1:length(inputFiles)
    % Get the current input file path
    inputFile = inputFiles{i};
    % Read the audio file
    [audioIn, inputFs] = audioread(inputFile);
    % Find the maximum absolute value in each channel of the audio data
    maxValue = max(abs(audioIn));
    % Display the result for each file in a more readable format using disp
    disp('--------------------------------------------------');
    disp(['File: ', inputFile]);
    disp(['Max Value: ', num2str(maxValue)]);
    info = audioinfo(inputFile);
    disp(info);
    disp('--------------------------------------------------');
end
EDIT:
The plot thickens: in every file I tested, the Opus peak values come out larger than the MP3 and WAV peak values by a factor of roughly sqrt(2).
  2 Comments
Walter Roberson on 4 Feb 2025
It would be interesting to see the result of audioread() with the 'native' option.
Ivan Rodionov on 4 Feb 2025
Edited: Walter Roberson on 5 Feb 2025
@Walter Roberson Hello Walter, and thank you for your reply! I have tried this code, and here is what I am getting. Please tell me if I did it right:
Code changes
[audioIn, inputFs] = audioread(inputFile, 'native');
% Find the maximum absolute value in the audio data
maxValue = max(abs(audioIn));
% Display the result for each file in a more readable format using disp
disp('--------------------------------------------------');
disp(['File: ', inputFile]);
disp(['Max Value: ', num2str(maxValue)]);
info = audioinfo(inputFile);
disp(info);
disp('--------------------------------------------------');
Output:
peakanalyzer
--------------------------------------------------
File: C:\Users\Admin\Desktop\Datasets\Compressor\input.wav
Max Value: 32288 32307
Filename: 'C:\Users\Admin\Desktop\Datasets\Compressor\input.wav'
CompressionMethod: 'Uncompressed'
NumChannels: 2
SampleRate: 44100
TotalSamples: 166800900
Duration: 3.7823e+03
Title: []
Comment: []
Artist: []
BitsPerSample: 16
--------------------------------------------------
--------------------------------------------------
"File: " "C:\Users\Admin\Desktop\Datasets\Compressor\MP3_Encodes\CBR\CBR_128_MP3.mp3"
Max Value: 1 1
Filename: 'C:\Users\Admin\Desktop\Datasets\Compressor\MP3_Encodes\CBR\CBR_128_MP3.mp3'
CompressionMethod: 'MP3'
NumChannels: 2
SampleRate: 44100
TotalSamples: 166803837
Duration: 3.7824e+03
Title: []
Comment: []
Artist: []
BitRate: 128
--------------------------------------------------
--------------------------------------------------
File: C:\Users\Admin\Desktop\Datasets\Compressor\Opus_Presets\CBR\CBR_128_Opus.opus
Max Value: 1.3834 1.4113
Filename: 'C:\Users\Admin\Desktop\Datasets\Compressor\Opus_Presets\CBR\CBR_128_Opus.opus'
CompressionMethod: 'Opus'
NumChannels: 2
SampleRate: 48000
TotalSamples: 181552000
Duration: 3.7823e+03
Title: []
Comment: []
Artist: []
--------------------------------------------------
I am now very confused about why the MP3 does not report its maximum as an integer sample value the way the WAV file does, and why the Opus file is still doing its own thing.
Thank you again and I would much appreciate your input,
Ivan
P.S. Good to hear from you in a non-GPU-drama-related context :)
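A quick way to see what the 'native' option actually returned is to inspect the class of the output. The sketch below is self-contained: it writes a hypothetical one-second test tone to a 16-bit WAV file and to an OGG file in the temporary folder, then reads both back with 'native'. The audioread documentation lists single as the native type for compressed formats, which would explain an MP3 peak of 1 rather than 32767, but treat that as an assumption to verify on your own release.

```matlab
% Hypothetical test signal: one second of a 440 Hz tone at half scale.
fs = 44100;
x  = 0.5 * sin(2*pi*440*(0:fs-1).'/fs);

% Write the same signal as uncompressed 16-bit WAV and as compressed OGG.
wavFile = [tempname, '.wav'];
oggFile = [tempname, '.ogg'];
audiowrite(wavFile, x, fs, 'BitsPerSample', 16);
audiowrite(oggFile, x, fs);

% Read both back with 'native' and report the class and peak of each result.
yWav = audioread(wavFile, 'native');
yOgg = audioread(oggFile, 'native');
fprintf('WAV native class: %s, peak %g\n', class(yWav), double(max(abs(yWav))));
fprintf('OGG native class: %s, peak %g\n', class(yOgg), double(max(abs(yOgg))));

% Clean up the temporary files.
delete(wavFile);
delete(oggFile);
```

If the compressed file comes back as a floating-point class, its peak is on the +/- 1 scale and not in integer sample units, so it cannot be compared directly with the int16 peak of the WAV file.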


Answers (2)

Walter Roberson on 5 Feb 2025
I would say it is a bug. audioread() documents
  • If you do not specify dataType, or dataType is 'double', then y is of type double, and matrix elements are normalized values between −1.0 and 1.0.
but that normalization is clearly not happening for opus files.
In the few sample tests I did, the maximum output was 1.22317087650299. I would not identify that as sqrt(2) myself... I would speculate it is more like 1.25 maximum.
  1 Comment
Ivan Rodionov on 5 Feb 2025
Edited: Walter Roberson on 5 Feb 2025
Hello Walter,
I concur that it is a bug. As for the sqrt(2) term, compare the peak values of the original file with those of the Opus file: for some reason only the peaks are amplified, by a factor of about sqrt(2), while the mean value stays the same.
EDIT:
I've made a temporary (and very ugly) fix: if the file is Opus, I convert it to WAV with ffmpeg and load the WAV instead.
function [audioData, fs] = loadAudioFile(filePath)
    % Check if the file is an OPUS file by checking the extension
    [~, ~, ext] = fileparts(filePath);
    if strcmpi(ext, '.opus')
        % Define the temporary WAV file path
        tempWavFile = [tempname, '.wav'];
        % Run FFmpeg to convert OPUS to WAV (no resampling). Note: with no
        % codec option FFmpeg writes 16-bit PCM WAV by default, so any peaks
        % above full scale are clipped during this conversion.
        ffmpegCmd = sprintf('ffmpeg -i "%s" "%s"', filePath, tempWavFile);
        status = system(ffmpegCmd);
        if status ~= 0
            error('Error converting OPUS file to WAV using FFmpeg.');
        end
        % Load the converted WAV file
        [audioData, fs] = audioread(tempWavFile);
        % Remove the temporary WAV file
        delete(tempWavFile);
    else
        % For non-OPUS files, load with MATLAB's default audio reader
        [audioData, fs] = audioread(filePath);
    end
end



Jimmy Lapierre on 5 Feb 2025
Hi, I was not able to reproduce peaks of that amplitude when using audiowrite, but I did observe some amplitudes above 1. It might be worse in your case with CBR encoding (less able to handle difficult parts of the signal).
To check whether the issue is with the peaks or with the overall gain, please compare the energy before (original WAV) and after (read from Opus), for example as the RMS level sqrt(sum(x.^2)/length(x)). The two values should be similar.
If so, I suspect this is just an artefact of the lossy coding, and the decoder not clipping the end result. I also think the OPUS decoder is implemented in floating point, so the math is not saturating either. With MP3, OGG or OPUS encoding, there are changes to phase and amplitude that will produce a waveform that sounds like the original but is different. In other words, if the energy is similar to the original signal, I think these amplitude variations are normal. It is good practice to reduce the amplitude of a signal before lossy encoding to avoid such overshoots.
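The point about phase changes can be illustrated without any codec. The sketch below is a synthetic example, not output from the Opus decoder: it builds two signals from the same harmonics with the same magnitudes and changes only their phases. The RMS level from the formula above stays the same, while the peak value changes considerably. All names and parameter values here are illustrative assumptions.

```matlab
fs = 48000;                 % sample rate in Hz (arbitrary for this demo)
t  = (0:fs-1).' / fs;       % one second of samples as a column vector
f0 = 100;                   % fundamental frequency in Hz
k  = 1:2:19;                % odd harmonics of a band-limited square wave

% Same harmonic magnitudes (1/k), different phases.
x = sum(sin(2*pi*f0*t*k) ./ k, 2);   % original phases
y = sum(cos(2*pi*f0*t*k) ./ k, 2);   % every harmonic shifted by 90 degrees

% RMS level, using the formula quoted in the answer.
rmsOf = @(s) sqrt(sum(s.^2) / length(s));

fprintf('RMS  x: %.4f   y: %.4f\n', rmsOf(x), rmsOf(y));
fprintf('Peak x: %.4f   y: %.4f\n', max(abs(x)), max(abs(y)));
```

Applied to the files in the question, the same two measurements (after resampling or at least trimming to a common duration) would show whether the Opus result differs from the original in overall level or only in its peaks.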
  4 Comments
Walter Roberson on 18 Feb 2025
audioread() is defined as returning the range +/- 1 unless the 'native' option is used.
  • If you do not specify dataType, or dataType is 'double', then y is of type double, and matrix elements are normalized values between −1.0 and 1.0.
So it is a bug if values outside the range +/- 1 are returned by audioread() without 'native'.
It does not matter for this purpose that the decoder might be using a floating point implementation: the value should be strictly in the +/- 1 range.
There are several possible fixes:
  • implementations might be changed, possibly post-scaling signals to detect signals outside +/-1 and if so scale by 1/max(abs(signal))
  • documentation might be changed to indicate which decoders the +/- 1 range is valid for
  • the line in the documentation might be dropped completely, leaving the output range undefined
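The post-scaling idea in the first bullet could be sketched as follows. This is only an illustration of scaling by 1/max(abs(signal)) when the peak exceeds 1, not how audioread is or would be implemented; the variable names and the sample data are hypothetical.

```matlab
% Hypothetical decoded stereo signal with overshoots beyond +/- 1.
x = [0.5 -1.4; 1.2 0.3; -0.9 0.8];

% Peak over all channels, so one common gain preserves the channel balance.
peak = max(abs(x(:)));

% Scale down only when the peak exceeds 1; leave in-range signals untouched.
% For an all-zero signal, 1/peak is Inf and min() still returns a gain of 1.
gain = min(1, 1/peak);
y = x * gain;

fprintf('Peak before: %.4f, gain applied: %.4f, peak after: %.4f\n', ...
    peak, gain, max(abs(y(:))));
```

One consequence of this approach is that the overall level of the returned signal depends on its largest overshoot, which is the trade-off against hard clipping raised in the reply below.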
Jimmy Lapierre on 18 Feb 2025
That is a fair point. My intention was to request a doc change as it is better to give you the opportunity to scale down the decoded signal instead of hard clipping in a way that is totally out of your control.

Release

R2023b
