beamform

ROS package that carries out simple 1D beamforming strategies, using JACK as input/output audio server.

IMPORTANT, EOL: This package will reach end-of-life when ROS Noetic does. A ROS2 version of this package, which works with ROS2 Humble, is available at https://github.com/balkce/beamform2.

It includes a custom ROS message type, JackAudio, defined in the package “jack_msgs”. It also uses a custom API for communication between ROS and JACK, referred to as “rosjack”, which is not to be confused with https://github.com/balkce/rosjack. Although both serve similar purposes, in this repository rosjack is built as a library for ROS-JACK interoperability, not as a ROS package.

Included beamformers:

Two YAML files must be configured:

The direction of interest of all beamformers can be changed on-the-fly by publishing to the topic /theta, of type std_msgs/Float32. The angle is given in degrees: 0 is front, -90 is left, 90 is right, 180 is back.
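As a minimal sketch of how the direction of interest might be set from a script, the following assumes that /theta carries a std_msgs/Float32 message and that rospy is available. The node name theta_setter and the helper wrap_angle_deg are hypothetical, and wrapping the angle to (-180, 180] is an assumption made here for convenience, not something the package is known to require.

```python
def wrap_angle_deg(angle):
    """Wrap an angle in degrees to the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    # Map -180 to 180 so that "back" has a single representation.
    return 180.0 if wrapped == -180.0 else wrapped


def publish_theta(angle_deg, topic="/theta"):
    """Publish one direction of interest (needs a running ROS master)."""
    # Imported lazily so the helper above can be used without ROS installed.
    import rospy
    from std_msgs.msg import Float32

    rospy.init_node("theta_setter", anonymous=True)  # hypothetical node name
    pub = rospy.Publisher(topic, Float32, queue_size=1, latch=True)
    pub.publish(Float32(data=wrap_angle_deg(angle_deg)))
    rospy.sleep(0.5)  # give subscribers time to receive the latched message
```

With a beamformer running, calling publish_theta(-90) would point the beamformer to the left under the convention described above.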

Note for MVDR: it uses only a small portion of the frequencies for speed. It decides which frequencies to use based on a frequency range and basic energy thresholding, both of which can be configured in the mvdr.launch file.
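The frequency selection described in the MVDR note can be illustrated with a small sketch. This is not the package's code: the function name select_bins and the parameter names min_freq, max_freq and min_mag are hypothetical, and the actual parameters in mvdr.launch may be named and applied differently.

```python
def select_bins(magnitudes, sample_rate, fft_size, min_freq, max_freq, min_mag):
    """Return indices of FFT bins inside [min_freq, max_freq] with magnitude >= min_mag."""
    selected = []
    for k, mag in enumerate(magnitudes):
        freq = k * sample_rate / fft_size  # centre frequency of bin k in Hz
        if min_freq <= freq <= max_freq and mag >= min_mag:
            selected.append(k)
    return selected


# Toy spectrum: 5 bins spaced 1000 Hz apart (sample_rate=8000, fft_size=8).
bins = select_bins([0.0, 0.9, 0.05, 0.7, 0.6], 8000, 8, 1000, 3000, 0.1)
```

Only the selected bins would then go through the costly per-frequency processing, which is where the speed-up comes from.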

Note for GSC: it uses a dynamic mu that changes depending on the SNR of the current sample. To facilitate its configuration, the gsc.launch file includes the values for the starting mu, the maximum mu, and the filter size, as well as whether a VAD should be used (and its accompanying VAD threshold) and whether the behavior of the mu value should be stored in an external text file (/home/user/mu_behavior.txt).

Note for LCMV: similar to MVDR, it uses only a small portion of the frequencies for speed. It also takes the positions of the interferences into account in order to null them. For this, beamform_config.yaml also stores the initial directions of the interferences; LCMV ignores any interference whose absolute value is greater than 180, along with all the ones that follow it in the interference list. In addition to the /theta topic, LCMV also listens to the /theta_interference topic, which uses a custom InterfTheta message with the structure {id,angle}. If the interference id is outside the range [1, number of interferences], a new interference is added to the list. If the new angle of an existing interference is too close to another current interference, it is eliminated.
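The interference bookkeeping described in the LCMV note can be sketched as follows. This is one reading of the rules, not the package's implementation: the function names are hypothetical, and the min_separation value of 10 degrees is an assumption, since the note does not state how close is "too close".

```python
def active_interferences(angles):
    """Keep angles up to (not including) the first one with |angle| > 180."""
    active = []
    for angle in angles:
        if abs(angle) > 180:
            break  # this entry and everything after it is ignored
        active.append(angle)
    return active


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def update_interference(angles, interf_id, angle, min_separation=10.0):
    """Apply one {id, angle} update and return the new list."""
    angles = list(angles)
    if not 1 <= interf_id <= len(angles):
        angles.append(angle)  # id outside [1, n]: add a new interference
        return angles
    index = interf_id - 1  # ids are 1-based
    others = angles[:index] + angles[index + 1:]
    if any(angular_distance(angle, other) < min_separation for other in others):
        del angles[index]  # too close to another interference: drop it
    else:
        angles[index] = angle
    return angles
```

For example, starting from [30, -60], an update with id 5 and angle 120 appends a third interference, while an update with id 2 and angle 35 removes the second one because it lands within 10 degrees of the first.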

Note for GSS: similar to MVDR and LCMV, it uses only a small portion of the frequencies for speed (although it is fast enough that the magnitude threshold can be considerably lower than for MVDR and LCMV). Like LCMV, it also considers the positions of the interferences; however, it uses them, together with the position of the source of interest, in its inner processes. Interferences are inserted and removed in the same way as in LCMV. Although in its original form GSS is able to separate all the sources (of interest and interferences), this implementation only provides the source of interest, since providing more than one output would require reworking the whole ROS package.

Note for Phase: it computes the weight of each frequency bin by checking whether the inter-microphone phase difference is below a threshold; if it is not, it multiplies the reference microphone magnitude by a factor to reduce the presence of not-in-phase interferences. To facilitate its configuration, the phase.launch file includes the values for the minimum phase and the magnitude factor, as well as the size of its output smoothing filter.
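A per-bin sketch of the phase-based masking idea follows. It assumes that the microphone spectra have already been aligned towards the direction of interest and that a bin is kept only when every microphone is within the threshold of the reference; both are assumptions made for illustration, and the names phase_mask_bin, min_phase_deg and mag_factor are hypothetical.

```python
import cmath
import math


def phase_mask_bin(ref, others, min_phase_deg, mag_factor):
    """Return the output value of one frequency bin.

    ref: complex spectrum value at the reference microphone
    others: complex spectrum values at the other microphones, assumed to be
            already aligned towards the direction of interest
    """
    for value in others:
        # Phase difference between this microphone and the reference, in degrees.
        diff = abs(math.degrees(cmath.phase(value * ref.conjugate())))
        if diff >= min_phase_deg:
            return ref * mag_factor  # not in phase: attenuate the reference bin
    return ref  # in phase at every microphone: keep the reference bin
```

Scaling the complex reference value by a real factor reduces its magnitude while leaving its phase untouched, which matches the description of multiplying the reference microphone magnitude by a factor.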

Note for PhaseMPF: it carries out the Phase beamformer and an anti-Phase beamformer (by using the negative mask). Both outputs are then fed into a bi-channel post-filter, which is a variation of the multi-channel post-filter of Valin et al. (2007), the source code of which can be found at https://github.com/introlab/manyears/blob/master/manyears-C/dsplib/Separation/postfilter.c. In turn, this post-filter uses the Minima Controlled Recursive Averaging (MCRA) noise estimator of Cohen and Berdugo (2001). To facilitate its configuration, the phasempf.launch file includes all the parameters of the Phase beamformer and of the modified post-filter, including its MCRA module.

Note for audio file: the audio file is a 16-bit WAV file with the sample rate with which the JACK server is configured. If the file path in rosjack_config.yaml is empty, the default path will be used: /home/user/rosjack_write_file.wav.
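To check that a written file has the format described above, a short sketch using Python's standard wave module could look like this. The function name check_wav is hypothetical, and the JACK sample rate has to be supplied by the caller (for instance 48000), since this sketch does not query the JACK server.

```python
import wave


def check_wav(path, jack_sample_rate):
    """Return True if the file is 16-bit PCM at the given sample rate."""
    with wave.open(path, "rb") as wav:
        is_16_bit = wav.getsampwidth() == 2  # sample width in bytes
        rate_matches = wav.getframerate() == jack_sample_rate
    return is_16_bit and rate_matches
```

For the default output path this would be called as check_wav("/home/user/rosjack_write_file.wav", 48000), replacing 48000 with the rate the JACK server is actually running at.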

Additional utilities included

The following nodes are also included:

Dependencies

Packages that can be installed through the official apt repositories:

References

Some useful references: