Pliant audio system

The Pliant audio system only works under Linux or the FullPliant operating system, because it relies on Linux kernel drivers to deal with audio boards, and on external libraries available in mainstream Linux distributions to handle MP3 (lossy) and FLAC (lossless) compression.

Principles

The Pliant audio system uses a push-through-filters mechanism: samples are pushed from one filter to the next.

Samples pushed from a filter to the next one are encoded as 32-bit floating point values.

The audio filter interface is defined in /pliant/audio/core/prototype.pli

var Link:AudioPrototype p

An audio filter exposes three methods: 'open' for setup, 'write' for providing it samples, and 'close' for ending operations. So, the correct sequence is to call 'open' once, then 'write' several times, and finally 'close' once. 'write' and 'close' should not be called if 'open' returned failure.

method p open channels rate options -> status
  oarg_rw AudioPrototype p ; arg Int channels rate ; arg Str options ; arg ExtendedStatus status

No bits-per-sample value is passed, since samples are always provided as Float32.

method p write samples count -> status
  oarg_rw AudioPrototype p ; arg Address samples ; arg Int count ; arg Status status

'count' is the number of samples per channel, so assuming that the 'channels' parameter passed to 'open' was 2, the size of the buffer pointed to by 'samples' is 8*count bytes (4 bytes per Float32 value, times 2 channels).
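As a quick check of that arithmetic, here is a minimal Python sketch. The function name is hypothetical and is not part of the Pliant API; it only restates the rule above for any number of channels.

```python
BYTES_PER_SAMPLE = 4  # one Float32 value


def write_buffer_size(channels: int, count: int) -> int:
    """Size in bytes of the buffer handed to 'write' for 'count' samples per channel."""
    return BYTES_PER_SAMPLE * channels * count
```

With 2 channels, 'write_buffer_size(2, count)' gives 8*count, as stated above.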

method p close -> status
  oarg_rw AudioPrototype p ; arg ExtendedStatus status

Now I can explain the principle of the whole machinery.

   •   An audio source is a function that receives a filter as a parameter. It gets samples by decoding a file, or by recording (from an audio input board), and outputs them by calling the 'write' method of the filter it received as a parameter.

   •   A real audio filter binds to a second filter and modifies audio samples on the fly. It receives the samples through its 'write' method, modifies them, then calls the 'write' method of the second filter.

   •   Lastly, an audio final filter receives samples through its 'write' method, and stores them in a file or puts them in the hardware buffer of an audio output board (play). It is called an audio final filter, as opposed to an audio target or anything else, because it uses the exact same API as a real audio filter, that is 'open', 'write' and 'close'.
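The three roles can be sketched in a few lines of Python. This is only an illustration of the push principle: the class and function names ('Collector', 'Gain', 'play_constant') are made up for the example, statuses are reduced to booleans, and samples are plain Python lists instead of Float32 buffers.

```python
class Collector:
    """Final filter: stores what it receives (stands in for a file or an audio board)."""

    def __init__(self):
        self.samples = []
        self.opened = False

    def open(self, channels, rate, options):
        self.opened = True
        return True

    def write(self, samples):
        self.samples.extend(samples)
        return True

    def close(self):
        self.opened = False
        return True


class Gain:
    """Real filter: scales the samples, then pushes them to the next filter."""

    def __init__(self, next_filter, factor):
        self.next = next_filter
        self.factor = factor

    def open(self, channels, rate, options):
        return self.next.open(channels, rate, options)

    def write(self, samples):
        return self.next.write([s * self.factor for s in samples])

    def close(self):
        return self.next.close()


def play_constant(output, value, blocks):
    """Source: generates samples and pushes them into the filter it was given."""
    if not output.open(2, 44100, ""):
        return False  # never call 'write' or 'close' after a failed 'open'
    for _ in range(blocks):
        output.write([value] * 4)
    return output.close()
```

Calling 'play_constant(Gain(Collector(), 0.5), 1.0, 2)' pushes two blocks through the gain filter into the collector; the source never needs to know how many filters sit behind the one it was given.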

Two extra methods are provided:

method p pending -> count
  oarg_rw AudioPrototype p ; arg Int count

Returns the number of audio samples that have already been provided through the 'write' method, but have not been consumed yet. This is useful to synchronize time on an audio final filter.

method p write16 samples16 count decoder -> status
  oarg_rw AudioPrototype p ; arg Address samples16 ; arg Int count ; arg_rw AudioDecoder decoder ; arg Status status

'write' expects samples to be provided as a buffer of 32-bit floating point values.
'write16' allows samples to be provided as 16-bit signed integers.
You have to provide as the 'decoder' argument a properly configured local variable of type 'AudioDecoder', so that the conversion from 16-bit signed integers to 32-bit floating point can be done if the filter cannot directly handle 16-bit signed integer samples.
'write16' is provided to avoid unnecessary re-encoding from integer to float and back in some simple situations, but since modern processors are so powerful, you can just forget it in most applications.
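To give an idea of what such a conversion involves, here is a minimal Python sketch. It is not the 'AudioDecoder' implementation: the little-endian layout and the 1/32768 scale factor are assumptions chosen for the example, and the function name is hypothetical.

```python
import struct


def int16_to_float32(raw: bytes) -> bytes:
    """Convert a buffer of 16-bit signed integers to a buffer of 32-bit floats.

    Assumes little-endian data and maps -32768..32767 onto roughly -1.0..1.0.
    """
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw)
    floats = [v / 32768.0 for v in ints]
    return struct.pack("<%df" % count, *floats)
```

The output buffer is twice as large as the input one, since each 2-byte integer becomes a 4-byte float.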

Audio sources

As I have just explained, audio sources generate the audio samples. An audio source function does not return until all audio samples have been played. So, it has a 'stop' parameter that enables another thread to request it to stop immediately.
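The role of 'stop' can be pictured with the following Python sketch, where a 'threading.Event' stands in for the 'CBool' flag. The function and its parameters are hypothetical; the point is only that the source checks the flag between two blocks.

```python
import threading


def constant_source(write, stop: threading.Event, max_blocks: int) -> int:
    """Push blocks until the track ends or 'stop' is set; returns the number of blocks pushed."""
    pushed = 0
    while pushed < max_blocks and not stop.is_set():
        write([0.0] * 4)  # one block of silence
        pushed += 1
    return pushed
```

Another thread (the user interface, for example) calls 'stop.set()' and the source returns after the block it is currently pushing.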

A stream (or a file)

This is implemented as function 'audio_play' in module /pliant/audio/format/all.pli

function audio_play stream format output options stop -> status
  arg_rw Stream stream ; arg Str format ; oarg_rw AudioPrototype output ; arg Str options ; arg_rw CBool stop ; arg ExtendedStatus status

Supported formats are 'wave' (uncompressed), 'flac' and 'mp3'. The 'format' parameter can also be a file name, provided its extension matches one of the 3 supported formats. If 'format' is empty, the 'audio_play' function will (try to) discover it from the file content. 'wav' is accepted as an alias for 'wave'.
all.pli is just a dispatching module. The code is in libflac.pli, libmad.pli for MP3, and wave.pli. Each of them implements an 'xxx_play' function that should be fairly straightforward to read.
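The dispatching logic can be sketched as follows in Python. This is not the code of all.pli: the function name is hypothetical, and the content checks are simply the usual signatures of these formats ('RIFF'...'WAVE' for WAV, 'fLaC' for FLAC, an 'ID3' tag or an MPEG frame sync for MP3), which may differ from what Pliant actually tests.

```python
def guess_format(fmt: str, header: bytes) -> str:
    """Return 'wave', 'flac', 'mp3', or '' when the format cannot be determined."""
    names = {"wav": "wave", "wave": "wave", "flac": "flac", "mp3": "mp3"}
    lowered = fmt.lower()
    # 'fmt' is either a format name or a file name with a known extension
    if lowered in names:
        return names[lowered]
    if "." in lowered:
        extension = lowered.rsplit(".", 1)[1]
        if extension in names:
            return names[extension]
    # otherwise, look at the first bytes of the content
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wave"
    if header[:4] == b"fLaC":
        return "flac"
    if header[:3] == b"ID3":
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"  # MPEG audio frame sync: 11 bits set
    return ""
```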

An audio board (record sound)

This is implemented as function 'device_play' in module /pliant/audio/core/device.pli

function device_play output options stop -> status
  oarg_rw AudioPrototype output ; arg Str options ; arg_rw CBool stop ; arg ExtendedStatus status

Maybe this function should be called 'device_record' since it will record from the audio board in the end. Here is a possible value for 'options':

"board 0 channels 4 rate 96000 bits 24 oss alsa volume 10 mic_volume 80"

See the 'An audio board (play sound)' paragraph below for details about these parameters.

Audio filters

Tone adjustment

This is implemented in /pliant/audio/filter/tone.pli

The core of it is the shelving filter implementation. A shelving filter is defined by three parameters: the power adjustment below the threshold frequency, the threshold frequency, and the power adjustment above the threshold frequency. See Wikipedia.

I have implemented shelving through experimenting and listening, so that in the end, I don't know if what I have implemented is well known (100th reinvention of the wheel) or new.
My general idea was to sharpen or smooth the audio curve (I've worked in the image processing field for years). So, I compare the current value with the average value in the recent past, then just increase the difference (amplify high frequencies) or decrease it (reduce high frequencies).
If you read the code,
'l' is the level of the current sample,
'ref' is the average value of 'l' on the 'w' previous samples, where 'w' has been computed at setup time from the requested threshold frequency.
The corrected sample will be a barycenter of 'l' with weight 's' and 'ref' with weight '1-s', where 's' has also been computed at setup time from the difference between the gains requested for the low and high frequencies.
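Here is a minimal single-channel Python sketch of that barycenter, under stated assumptions: the function name is hypothetical, 'w' and 's' are passed in directly because the formulas that derive them from the threshold frequency and the gains are not given here, and the first samples use a shorter history.

```python
from collections import deque


def sharpen(samples, w, s):
    """Blend each sample 'l' with 'ref', the average of the 'w' previous samples.

    s > 1 increases the difference to the recent past (more high frequencies),
    s < 1 decreases it (less high frequencies), s = 1 leaves the signal unchanged.
    """
    history = deque(maxlen=w)
    out = []
    for l in samples:
        ref = sum(history) / len(history) if history else l
        out.append(s * l + (1.0 - s) * ref)
        history.append(l)
    return out
```

On a step from 0 to 1, 's' equal to 2 overshoots to 2 on the first sample after the step, while 's' equal to 0.5 only reaches 0.5, which is the sharpen versus smooth behavior described above.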

Then in the 'AudioTone' type that implements the Pliant general tone adjustment filtering mechanism, I just set an array of shelving filters according to the requested adjustments. Here is a sample containing all possible parameters (whether it makes sense to adjust all of them at once is another story):

"bass 3 bass_hz 200 tone 1 tone_hz 1000 trebble -3 trebble_hz 5000 gain -1"

Please notice that in the program I've misspelled treble as 'trebble', so you have to do the same or your settings will just be ignored. Anyway, I promise: I will correct the spelling as soon as we reach 1000000 users of Pliant audio player :-)
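All the option strings in this document share the same 'keyword value' shape, with some keywords (such as 'oss' or 'best') taking no value. The following Python sketch shows one way such a string could be split; it is not Pliant's own option parser, the function name is hypothetical, and it does not handle quoted text values.

```python
def parse_options(options: str) -> dict:
    """Split 'keyword value keyword value ...' into a dictionary.

    A keyword followed by a number gets that number as a float;
    a keyword followed by anything else is recorded as a flag (True).
    """
    tokens = options.split()
    result = {}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        value = True
        if i + 1 < len(tokens):
            try:
                value = float(tokens[i + 1])
                i += 1  # the next token was this keyword's value
            except ValueError:
                pass  # the next token is another keyword
        result[key] = value
        i += 1
    return result
```

A filter then simply looks up the keywords it knows, which is why a keyword spelled differently from what the code expects is silently ignored.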

Crossfeed

According to my personal taste, the best recordings (assuming that you use a high quality audio system both at recording and listening time) are the ones done with only two properly installed microphones (ORTF positioning). Anyway, this recording technique has three drawbacks:

   •   it prevents cheating on the relative power of the various instruments, so the singer might be covered by the percussion,

   •   it requires cardioid microphones, which have a bad response curve compared to omni-directional ones,

   •   it is very sensitive to unwanted background noise.

So, today's standard is rather to use one microphone per audio source (instrument, singer), then mix down to two tracks. The huge problem is that current mixing techniques are still very naive, so that the final sound is often completely unrealistic.

When using speakers, this is not such a serious issue because the speakers automatically provide some crossfeed and positioning information, but when using a headset, it can be very disturbing because the brain continuously tries to analyze inconsistent information.
Crossfeed is a processing step that builds a sound that will 'look' more natural to the brain, and so reduces fatigue.

The general principle of crossfeed (it can be achieved either through hardware or software) is to add the right audio channel to the left one (and vice versa), attenuated, delayed and maybe tone adjusted. This recreates the information that is always present in a real situation, and that the brain desperately searches for when listening to a badly mixed recording through headphones.

The Pliant crossfeed implementation is in the /pliant/audio/filter/crossfeed.pli module and can be adjusted through the following parameters:

"crossfeed -9 crossfeed_delay 250 crossfeed_difference 0.5 crossfeed_filter_db -6 crossfeed_filter_hz 1000"

'crossfeed' is the number of dB of amplification of the signal that will be copied to the opposite side.
'crossfeed_delay' is the delay of the copied signal, and is related to the size of the head, so you should not change it unless your head is unusually big.
'crossfeed_difference' has no effect if it's zero, but if it's one, then it is not the signal of one side that will be copied to the other, but the difference between the two. Copying the difference can be interesting in order to avoid the muddying effect of the crossfeed on the sound, but the drawback is a reduced general bass level, because the bass will be less cross copied than the trebles. Intermediate values are possible, so do your own experiments.
'crossfeed_filter_db' and 'crossfeed_filter_hz' make it possible to apply a shelving filter before copying. It is intended to cut the crossfeed on high frequencies. I've implemented this because the ear is more directional with high frequencies, so the natural crossfeed of a sound coming from one side to the opposite ear is expected to be lower on high frequencies.
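The following Python sketch shows how these parameters could combine. It is one possible reading of the description, not the code of crossfeed.pli: the function name is hypothetical, the delay is expressed directly in samples (the unit of 'crossfeed_delay' is not stated here), the 'difference' blend is my interpretation, and the shelving filter on the copied signal is left out.

```python
def crossfeed(left, right, gain_db, delay, difference=0.0):
    """Add to each channel an attenuated, delayed copy of the opposite one.

    gain_db    -- amplification of the copied signal in dB (negative attenuates)
    delay      -- delay of the copied signal, in samples (assumption)
    difference -- 0 copies the opposite channel, 1 copies (opposite - own)
    """
    g = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude factor
    out_left, out_right = [], []
    for i in range(len(left)):
        j = i - delay
        l_past = left[j] if j >= 0 else 0.0
        r_past = right[j] if j >= 0 else 0.0
        out_left.append(left[i] + g * (r_past - difference * l_past))
        out_right.append(right[i] + g * (l_past - difference * r_past))
    return out_left, out_right
```

With 'difference' set to 1, anything identical on both channels (typically the bass) produces no crossfeed at all, which matches the drawback mentioned above.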

Resample

It changes the sampling frequency and is implemented in /pliant/audio/filter/resample.pli

It can be activated through the 'rate' keyword:

"rate 44100"

A sample usage could be in 'audio_convert' high level function described at the end of this document.
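To make the operation concrete, here is a single-channel Python sketch using plain linear interpolation. This is a swapped-in textbook technique, not necessarily what resample.pli does, the function name is hypothetical, and no low-pass filtering is applied before reducing the rate.

```python
def resample_linear(samples, in_rate, out_rate):
    """Resample one channel from in_rate to out_rate by linear interpolation."""
    if not samples:
        return []
    n_out = len(samples) * out_rate // in_rate
    out = []
    for k in range(n_out):
        pos = k * in_rate / out_rate  # position in the input, in input samples
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]  # repeat the last sample at the end
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out
```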

Select

It extracts a subpart of the track, and is implemented in /pliant/audio/filter/select.pli. Start and stop times are defined in seconds:

"from 60 to 180"

Also useful in 'audio_convert'.
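Since times are given in seconds while filters handle samples, the selection boils down to a conversion through the sampling rate. A minimal single-channel Python sketch, with a hypothetical function name and the whole track held in a list for simplicity:

```python
def select(samples, rate, start_s, stop_s):
    """Keep the samples between start_s and stop_s (in seconds) of one channel."""
    first = int(start_s * rate)
    last = int(stop_s * rate)
    return samples[first:last]
```

At 44100 Hz, "from 60 to 180" therefore keeps samples 2646000 up to (but not including) 7938000 of each channel.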

Info

Provides information about the audio track in a text variable. The variable is intended to be read by another thread, so its access is protected by a semaphore.

It is activated by providing the 'info' option to the 'audio_filter' or 'audio_convert' functions described below:

"info title[dq]Me playing 'Etoile des neiges' on the accordion[dq]"

Here is a sample value that the application might get while the track is playing:

"title[dq]Me playing 'Etoile des neiges' on the accordion[dq] elapsed 12.3 maxi 0.2 clip maxi0 -0.3 maxi1 0.2 forever_maxi 2.1"

'elapsed' is the time from the track begin, in seconds,
'maxi' is the maximum output level in the recent time, in dB, so it should be negative, or the sound is clipping, which is very bad, and in such a case, 'clip' keyword is added.
'maxi0', 'maxi1', 'maxi2', 'maxi3', ... are the respective maximum levels on various channels,
and 'forever_maxi' is the maximum on any channel since the beginning of the track.
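The levels can be pictured with the following Python sketch. It assumes that full scale corresponds to an absolute sample value of 1.0, so that 0 dB is the clipping limit; this is consistent with the description above but is not stated explicitly, and both function names are hypothetical.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to full scale (assumed to be 1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


def report(samples):
    """Build a 'maxi' entry, adding the 'clip' keyword when the level is positive."""
    level = peak_db(samples)
    text = "maxi %.1f" % level
    if level > 0.0:
        text += " clip"
    return text
```

Halving the amplitude lowers the level by about 6 dB, so a track peaking at 0.5 reports roughly -6.0.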

Display information

The 'AudioWhat' filter implemented in /pliant/audio/filter/what.pli displays information about the audio track on the console, such as the number of channels, the sampling rate, and sometimes the bits per sample. It will be modified to use the Pliant trace mechanism.

All at once

In module /pliant/audio/filter/all.pli, an 'audio_filter' function is provided that sets all filters at once through the single 'effect' parameter:

function audio_filter base effect report sem -> final
  oarg_rw AudioPrototype base ; arg Str effect ; arg_rw Str report ; arg_rw Sem sem ; arg Link:AudioPrototype final

'report' and 'sem' are used to provide feedback from the optional 'Info' filter.

Audio final filters

A stream (or a file)

This is implemented as function 'audio_output' in module /pliant/audio/format/all.pli

function audio_output stream format -> output
  arg_rw Stream stream ; arg Str format ; arg Link:AudioPrototype output

Supported values for 'format' parameters are 'wav', 'wave', 'flac' and 'mp3'.

When the 'open' method of the 'AudioPrototype' object returned by 'audio_output' is called, some options are used to configure compression.

For FLAC compression:

"best bits 24"

'best' means try to achieve best possible FLAC compression ratio, at the expense of a lot of computing power consumed, so slow compression speed.
'bits' enables to set the bits per sample in the FLAC file (default is 16).

For MP3 compression:

"kbps 192 hifi best fast very_fast"

Don't set all parameters as in this example, since they conflict!
'kbps' specifies the file compression ratio. It is not a good idea to set this parameter, since a variable compression ratio is better. Moreover, setting 'kbps' below 192 will produce noise, not sound.
'hifi' is the option that asks for maximum quality. Standard quality is obtained through setting neither 'kbps' nor 'hifi'.
'best', 'fast' or 'very_fast' enable to adjust the compression speed versus optimal result. Once again, not specifying anything is generally a great idea.

An audio board (play sound)

This is implemented as type AudioDevice in module /pliant/audio/core/device.pli

Since it implements the filter API, the audio device is configured through the 'options' parameter it receives when its 'open' method is called. Possible options are:

"oss alsa board 3 bits 24 volume 75"

'oss' forces to use Linux OSS as opposed to Alsa,
'alsa' forces to use Linux Alsa as opposed to OSS (using both at once makes no sense),
'board' enables to select the audio board where 0 is the index of the first available board,
and 'bits' enables to set the bits per sample to use on the board, with the default value being 16, and the number of channels and samples rate being provided by the audio stream content.
'volume' enables to set all output volumes, and 'master_volume' 'pcm_volume' 'speaker_volume' 'line_volume' 'headphone_volume' and 'mic_volume' can be used to set various volumes on the board.

A full sample

The following code is a simplified version of what you can find in the main function of the Pliant audio player (function 'audio_play' in module /pliant/appli/audio.pli):

module "/pliant/language/unsafe.pli"
module "/pliant/language/stream.pli"
module "/pliant/audio/core/device.pli" # provides AudioDevice
module "/pliant/audio/filter/all.pli" # provides audio_filter
module "/pliant/audio/format/all.pli" # provides audio_play

function play file options stop -> status
  arg Str file options ; arg_rw CBool stop ; arg ExtendedStatus status
  stop := false
  # open the file
  var Link:Stream stream :> new Stream
  status := stream open file options in+safe
  if status=failure
    return
  # get the default audio board as the final audio filter
  var Link:AudioDevice device :> new AudioDevice
  # build the filters requested in 'options' in front of the device
  # ('report' and 'sem' are only used by the optional Info filter)
  var Link:AudioPrototype final :> audio_filter device options (var Str report) (var Sem sem)
  # play the track; returns when it ends or when 'stop' gets set by another thread
  status := audio_play stream file final options stop

# let's play some track, with 3 dB boost on the bass (and -3 dB gain in order to avoid clipping)
play "file:/tmp/track.mp3" "gain -3 bass 3" (var CBool stop)

The player

An audio player is provided in /pliant/appli/audio.ui (accessed through 'Application' 'Audio' from Pliant main menu).

Its user interface is rough, but the various filters (tone adjustment, crossfeed) implemented in the Pliant audio system can be used and easily adjusted. So, the audio quality of this player, from an audiophile point of view, mostly depends on the quality of the Pliant shelving filter and crossfeed compared to mainstream implementations. Please feel free to review, test and comment!

Audio files have to be stored in file:/audio/ directory. I really have to make this parametric :-(

Extra utilities

Module /pliant/audio/appli/convert.pli provides a high level 'audio_convert' function:

var ExtendedStatus s := audio_convert "file:/tmp/tracks/" "file:/tmp/tracks/" "format [dq]mp3[dq] replace"

'replace' means that the original files will be removed after re-encoding.