Porting/Audio/Q and A RE-PA

(A list of questions Jonwil asked Juho Hämäläinen in IRC query. Posted here for public discussion and documenting the answers)

Latest revision as of 14:23, 27 October 2014

TMO "Re: the Fremantle Porting Task Force, or 'how to run maemo on Neo900'": http://talk.maemo.org/showthread.php?p=1443720

TMO "Re: What we know about alsa-policy-enforcement and alsaped?": http://talk.maemo.org/showthread.php?p=1242804

(2014-10-25Sat13:35:22)<freemangordon> https://gitorious.org/pulseaudio-nokia/pulseaudio-nokia/commit/4e0dfeb82759ce334bec8882edcafe7e04992a50

jonwil	Page 15 of this PDF http://linuxplumbersconf.org/2009/slides/Jyri-Sarha-audio_miniconf_slides.pdf shows a block diagram of the N900 audio system (or at least what I assume is the N900 audio system). Is there any chance you can expand on which algorithms (e.g. stw, xprot, eap, aep, drc, eq, iir, fir etc) match with the blocks marked "Mic EQ", "music processing", "recording processing",...
	jonwil	..."transducer processing" and "call audio processing"?
	jonwil	I have been examining the Meego code and the Harmattan code (both the open bits in pulseaudio-modules-meego and the closed bits in pulseaudio-nokia) with a view to possibly using the blobs or parts of them for the Neo900 audio work (since they are more open than the ones on the N900). Comparing what Meego has to what Fremantle has, it would appear as though the Meego and Harmattan blobs are...
	jonwil	...missing functions labeled stw. Are you aware of any other pieces in the Meego or Harmattan blobs that are missing or that might have been built differently? (ignoring anything to do with different versions of compilers, PulseAudio, libc or other things, just talking about the actual algorithms and such in the pulseaudio-nokia blobs). Or for that matter anything that is present in the...
	jonwil	...Meego or Harmattan blobs but isn't used on Fremantle?
	jonwil	Other than the cmtspeech stuff, are you aware of anything in the nokia Fremantle Pulseaudio blobs that talks to or is dependent on the N900 cellular modem?
	jonwil	Which of these Nokia algorithms are necessary for safe system functioning and which are just going to give not-so-nice audio if we dont have them? Obviously xprot is something we probably need in order to protect the speakers/earpiece/etc from damage but what else? (the cellular modem we are using in the Neo900 already does all sorts of voice processing so identifying which bits we need and...
	jonwil	...which bits we can leave to the cellular modem will save us a lot of time)
	jonwil	Are you aware of open-source implementations of algorithms similar to the proprietary Nokia ones?
	jonwil	Anything you can share on these questions (whenever you have the time) will be great (if you cant share some of this because you dont have the info, have forgotten it or are unwilling/unable to share due to NDA, that's ok, anything you CAN share on this pulseaudio-nokia stuff will really help us out)
	jonwil	Some more questions for when you have the time:
	jonwil	Do you know if the stereo widening stuff is actually used on the N900 on Fremantle? All of the .parameters files that control the blobs set x-maemo.stereo-widening = "false" which I assume means "turn it off". If its in fact not used that means we can ignore it (it would also explain why its not even compiled in for Meego)
	jonwil	nb_eeq is "narrowband ear equalizer"
	jonwil	nb_meq is "narrowband mic equalizer"
	jonwil	wb_eeq is "wideband ear equalizer"
	jonwil	wb_meq is "wideband mic equalizer"
	jonwil	What is the difference between "narrowband" and "wideband" here?
	jonwil	What is xprot displacement?
	jonwil	Can you tell me which of these sets of parameters will be tied to the specific details of the microphone in the N900 or of the FMTX in the N900? (bluetooth is digital so changing bluetooth chips shouldn't change the way the audio needs to be processed and we are using the same earpiece, speakers, headphone, lineout and tvout setup so those wont change)
	jonwil	aep
	jonwil	nb_eeq
	jonwil	nb_meq
	jonwil	wb_eeq
	jonwil	wb_meq
	jonwil	mumdrc.ul
	jonwil	mumdrc.dl
	jonwil	limiter.ul
	jonwil	limiter.dl
	jonwil	xprot
	jonwil	That will tell us which bits we need to dig deep into and retune for our needs
	jonwil	retune/replace
	jonwil	BIG thankyou for any help you may give btw, its so great to find someone who actually understands this alphabet soup of audio processing


copy of answer from http://wiki.maemo.org/User_talk:Jusa

===================================================================

As a preface, I may mix up things between Fremantle and Harmattan, given that it has been quite a while since I worked with the former. But as a general rule, Fremantle and Harmattan are (from the PulseAudio components' point of view) pretty close to each other. For the algorithms, when you look at the meego modules, wherever there is algorithm processing on the Harmattan side, there likely was pretty much identical processing in Fremantle as well. The meego modules are an evolution of the modules written for Fremantle, and one of the goals was also to separate the closed source algorithms from the source tree. To achieve this, "algorithm-hook"s were introduced, which just execute the algorithms from separate modules (shipped with Harmattan).
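
The hook idea described above can be illustrated with a minimal sketch. It is not the Harmattan API: the class, the method names and the hook name "example.drc" are all invented for illustration, and the real modules are written in C. It only shows the pattern of open code firing a named hook and separately shipped code attaching the processing.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical stand-in for the hook mechanism: named slots that the open
# module fires, and that separately shipped modules attach processing to.
Processor = Callable[[List[float]], List[float]]


class AlgorithmHooks:
    def __init__(self) -> None:
        self._slots: Dict[str, List[Processor]] = defaultdict(list)

    def connect(self, name: str, processor: Processor) -> None:
        """Attach a processing callback to a named hook."""
        self._slots[name].append(processor)

    def fire(self, name: str, samples: List[float]) -> List[float]:
        """Run every callback attached to the hook, in connection order.

        With nothing attached the audio passes through unchanged, which is
        what lets the open code run without the closed modules.
        """
        for processor in self._slots.get(name, []):
            samples = processor(samples)
        return samples


hooks = AlgorithmHooks()
# "example.drc" is an invented hook name, and halving the signal is only a
# placeholder for a real algorithm.
hooks.connect("example.drc", lambda s: [x * 0.5 for x in s])
```

Firing a hook that nothing is connected to returns the input untouched, so a port could start with empty hooks and add processing one slot at a time.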

MeeGo != Harmattan, but the PulseAudio modules were named meego when open sourcing them. So in Harmattan you have source trees pulseaudio-modules-meego (containing open source code) and pulseaudio-modules-nokia (containing proprietary algorithms).

Page 15 of this PDF http://linuxplumbersconf.org/2009/slides/Jyri-Sarha-audio_miniconf_slides.pdf shows a block diagram of the N900 audio system (or at least what I assume is the N900 audio system). Is there any chance you can expand on which algorithms (e.g. stw, xprot, eap, aep, drc, eq, iir, fir etc) match with the blocks marked "Mic EQ", "music processing", "recording processing", "transducer processing" and "call audio processing"?

Mic EQ:

  • fir/iir

Music processing:

  • mudrc

Recording processing:

  • mudrc
  • (I think there was agc/automatic gain control at some point... hm)

Transducer processing:

  • iir/fir
  • xprot

Call audio processing:

  • aep
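
For reference, the mapping listed above can be written down as a small lookup table. The block and algorithm names are taken from the answer as given (including the spelling mudrc); the helper function is only an illustrative assumption about how one might query it.

```python
# Block diagram label -> algorithms named in the answer above.
BLOCK_ALGORITHMS = {
    "Mic EQ": ["fir", "iir"],
    "Music processing": ["mudrc"],
    "Recording processing": ["mudrc"],  # possibly agc as well at some point
    "Transducer processing": ["iir", "fir", "xprot"],
    "Call audio processing": ["aep"],
}


def blocks_using(algorithm: str) -> list:
    """Return the block labels that list the given algorithm, sorted by name."""
    return sorted(
        block for block, algorithms in BLOCK_ALGORITHMS.items()
        if algorithm in algorithms
    )
```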

I have been examining the Meego code and the Harmattan code (both the open bits in pulseaudio-modules-meego and the closed bits in pulseaudio-nokia) with a view to possibly using the blobs or parts of them for the Neo900 audio work (since they are more open than the ones on the N900). Comparing what Meego has to what Fremantle has, it would appear as though the Meego and Harmattan blobs are missing functions labeled stw. Are you aware of any other pieces in the Meego or Harmattan blobs that are missing or that might have been built differently? (ignoring anything to do with different versions of compilers, PulseAudio, libc or other things, just talking about the actual algorithms and such in the pulseaudio-nokia blobs).

stw - Stereo Widening. Kind of useful with the N900, since that has stereo speakers; dropped with the N9, since the N9 has only a mono speaker. It might be that it was dropped from the N900 as well, if it wasn't seen as enough of an improvement to the sound compared to the resource usage.

The algorithms themselves are pretty much the same, although I think AEP saw quite a big change. All the configuration parameter binary blobs in the parameters files were just fed to the algorithms as is. Tuning was done with a proprietary plugin to Fast Trace (iirc).

Or for that matter anything that is present in the Meego or Harmattan blobs but isn't used on Fremantle?

Other than the cmtspeech stuff, are you aware of anything in the nokia Fremantle Pulseaudio blobs that talks to or is dependent on the N900 cellular modem?

Not that I know of.

Which of these Nokia algorithms are necessary for safe system functioning and which are just going to give not-so-nice audio if we don't have them? Obviously xprot is something we probably need in order to protect the speakers/earpiece/etc from damage but what else? (the cellular modem we are using in the Neo900 already does all sorts of voice processing so identifying which bits we need and which bits we can leave to the cellular modem will save us a lot of time)

I don't think any of the algorithms are needed if the hardware is run conservatively enough. XProt helps to get more loudness safely, but it would work just as well to run the speaker a bit quieter.

From an audio quality perspective, voice call algorithms are really crucial. It is impossible to get a device certified without echo cancellation and friends, and almost impossible without EQs. I remember it was really hard, even with the super equipment the tuning guys had.
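
"Running the speaker a bit quieter" amounts to a fixed attenuation ahead of the output. The sketch below shows only that arithmetic; it is not what XProt does, and the 6 dB default is an arbitrary illustrative figure, not a value known to be safe for any speaker.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def attenuate(samples, headroom_db: float = 6.0):
    """Scale samples down by a fixed headroom.

    headroom_db is a hypothetical tuning value; the right figure would have
    to come from measurements on the actual hardware.
    """
    gain = db_to_gain(-headroom_db)
    return [x * gain for x in samples]
```

A fixed cut like this gives up loudness all the time, whereas a protection algorithm would presumably only intervene when needed, which is the trade-off the answer describes.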

Are you aware of open-source implementations of algorithms similar to the proprietary Nokia ones?

Not really. Maybe speex.

Do you know if the stereo widening stuff is actually used on the N900 on Fremantle? All of the .parameters files that control the blobs set x-maemo.stereo-widening = "false" which I assume means "turn it off". If it's in fact not used that means we can ignore it (it would also explain why it's not even compiled in for Meego)

Answered above.

nb_eeq is "narrowband ear equalizer", nb_meq is "narrowband mic equalizer", wb_eeq is "wideband ear equalizer", wb_meq is "wideband mic equalizer". What is the difference between "narrowband" and "wideband" here? What is xprot displacement?

Narrowband/wideband: Voice call bandwidth. DocScrutinizer05 already wrote about this in IRC.

XProt displacement - I don't remember. Probably something about the safe travel range of the speaker coil.
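
As a rough illustration of the naming, the sketch below decodes the four equalizer names from the question. The sample rates and passbands are the conventional telephony figures for narrowband and wideband speech; whether the Nokia parameter sets assume exactly these values is an assumption, not something stated in the answer.

```python
# Conventional telephony figures; assumed, not confirmed for these blobs.
VOICE_BANDS = {
    "nb": {"label": "narrowband", "sample_rate_hz": 8000, "passband_hz": (300, 3400)},
    "wb": {"label": "wideband", "sample_rate_hz": 16000, "passband_hz": (50, 7000)},
}
EQ_TARGETS = {"eeq": "ear equalizer", "meq": "mic equalizer"}


def describe_eq(name: str):
    """Split a name like 'nb_eeq' into a description and a sample rate."""
    band, target = name.split("_", 1)
    info = VOICE_BANDS[band]
    return f"{info['label']} {EQ_TARGETS[target]}", info["sample_rate_hz"]
```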

Can you tell me which of these sets of parameters will be tied to the specific details of the microphone in the N900 or of the FMTX in the N900? (bluetooth is digital so changing bluetooth chips shouldn't change the way the audio needs to be processed and we are using the same earpiece, speakers, headphone, lineout and tvout setup so those won't change)

aep, nb_eeq, nb_meq, wb_eeq, wb_meq, mumdrc.ul, mumdrc.dl, limiter.ul, limiter.dl, xprot. That will tell us which bits we need to dig deep into and retune/replace for our needs.

  • mumdrc.ul: Multiband DRC uplink (run in record module)
  • mumdrc.dl: Multiband DRC downlink (run in music module)
  • limiter.ul: Limiter (don't remember more specifically) uplink
  • limiter.dl: same with downlink
  • aep: Algorithm package especially meant for voice call processing, includes echo cancelling, comfort noise, whatnot.
  • I think eap is also mentioned somewhere, iirc that contains the algorithms for DRC etc.
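
The .ul/.dl suffix convention in the list above can be captured in a few lines. The function name is made up for illustration; only the suffix meanings come from the answer.

```python
DIRECTIONS = {"ul": "uplink", "dl": "downlink"}


def split_parameter_name(name: str):
    """Split 'mumdrc.ul' into ('mumdrc', 'uplink').

    Names without a recognised direction suffix are returned unchanged with
    None as the direction.
    """
    base, sep, suffix = name.rpartition(".")
    if sep and suffix in DIRECTIONS:
        return base, DIRECTIONS[suffix]
    return name, None
```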

I think the best way to verify these is to look at the different places in Harmattan where meego_algorithm_hook_fire() is called; as said above, those places pretty much match what's happening in Fremantle as well.
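
One way to collect those call sites is a small script over a checked-out source tree. This is a sketch under assumptions: the tree layout is unknown here, so it simply walks every .c file under a directory you pass in, and the function name find_hook_calls is made up.

```python
import re
from pathlib import Path

# Matches the function name followed by an opening parenthesis.
CALL = re.compile(r"\bmeego_algorithm_hook_fire\s*\(")


def find_hook_calls(root):
    """Return (relative path, line number, stripped line) for each match."""
    hits = []
    for path in sorted(Path(root).rglob("*.c")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if CALL.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits
```

The pattern also matches the function's own definition, so the list needs a quick manual pass to separate call sites from the implementation.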

oh and one more thing, if you know anything at all about alsaped, alsaped.conf or alsa-policy-enforcement, feel free to share :)

IIRC alsaped and alsa-policy-enforcement were doing the same thing as the alsa-mixer stuff in Harmattan. That is, they set ALSA interface switches to achieve a given output/input routing. Porting the alsaped configuration to a recent PulseAudio alsa-mixer configuration should be doable (and might actually already be done for the N900 Nemo port, not sure).

I'd probably need to take a look at how the Harmattan stuff was laid out to better remember these details, but as I currently don't have that available I'm just running with my (bad) memory. I do have an N9, though (probably even with the stock image flashed), so I could take a look at the Harmattan root filesystem at some point as well.
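
The routing idea, a table of switch states per route, can be sketched as below. The control names are deliberately fake placeholders; the real N900 mixer control names and the alsaped.conf format are not given here and are not reproduced.

```python
# Hypothetical routes and hypothetical control names, for illustration only.
ROUTES = {
    "speaker": {"Example Speaker Switch": True, "Example Headphone Switch": False},
    "headphone": {"Example Speaker Switch": False, "Example Headphone Switch": True},
}


def switch_changes(current: str, target: str) -> dict:
    """Return only the switches whose state differs between two routes."""
    before, after = ROUTES[current], ROUTES[target]
    return {
        name: state for name, state in after.items()
        if before.get(name) != state
    }
```

Computing only the difference keeps the number of mixer writes small when changing routes; porting a real configuration would mean filling the table from the existing alsaped settings.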

=======================================