Late in my teenage years, I would often record myself playing games.
I did not do anything with the videos (I had a reluctance to post
anything on the internet even back then). However, I wanted to be able
to keep an easy record of what happened in the game so I could easily
refer back to important story beats and whatnot, as well as capture
great moments of my own making (I am still saddened to this day that I
do not have footage of me shooting a plasma grenade with a plasma pistol
in Halo: Reach). As I tinkered with it more, I investigated the
idea of splitting up my audio to different audio channels in the video
for the potential of making editing easier. I could dump the game audio
to one channel, my microphone to another, and output from my chat
application to a third. Back then, pulseaudio was the default audio
handler on major Linux distributions, and it had the flexibility to set
up and route virtual speakers and microphones for those willing to learn
the commands and put them in their .profile
file. When I went back to Windows for a brief stint, I used VB-Audio's freemium products to
set up a number of these virtual speakers.
I eventually tired of recording all my games. I wanted to be able to play games in whatever environment (Steam Deck, my 1080p laptop, etc.), and I was not willing to put up with the variations in quality, so I gave up the idea of recording everything. However, I have recently been investigating this again. I have been playing some visual-novel-type games with an out-of-state friend, where I host a game and stream it to him, and we decide together what to do. We primarily work over Signal, and Signal does not share desktop audio. As such, I needed to be able to send the game audio alongside my microphone output to him. On top of that, I have also taken to sending my music player output to whoever I am chatting with, creating a custom soundtrack for the game. I avoid this on public platforms like Discord, but I am willing to do it on secure platforms like Signal, or platforms that I manage, like Mumble or the Simple Voice Chat Minecraft mod. With these new developments, I began to revisit my custom audio software setup.
These days, Pipewire is the preferred audio system for Linux distros. As such, I needed to learn the new setup. Up until now, I relied on a rough bash script to create the appropriate sinks and loopback streams (recreating them as needed, since Henry Stickmin had a bad habit of closing its audio streams any time audio was not playing). However, now that I am using more loopbacks, I decided to look into something more permanent.
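For context, the old throwaway script approach looked something like the sketch below, using PulseAudio's pactl tool (the sink name here is illustrative, and these modules had to be reloaded whenever a stream died):

```shell
# Create a virtual sink; applications can dump audio into it.
pactl load-module module-null-sink sink_name=music_sink \
    sink_properties=device.description=MusicSink

# Loop the sink's monitor back into the default speakers so the
# audio is still audible while remaining a separate stream.
pactl load-module module-loopback source=music_sink.monitor
```

The fragility of this approach (modules vanish on restart, loopbacks die with their streams) is exactly what the declarative configuration below avoids.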
As a solution, I discovered that you can define audio objects in a
configuration file, and Pipewire will create them whenever the service
starts. By putting files in the $XDG_CONFIG_HOME/pipewire/pipewire.conf.d/
directory, I can have everything set up whenever I start the computer,
and configure my applications as needed.
When I was setting up my scheme last night, I had the following goals:
- I wanted a custom speaker (or "sink") for my music. My music player would dump its audio into this sink. Anything sent to this sink is then also sent (via a "loopback") to my primary speakers, and to my microphone output (or "source"), alongside whatever my actual microphone picks up.
- I wanted a sink for my applications. For cases where I am sharing a game, I need to send the application output alongside my microphone output as well. However, I do not always want to do this (I do not want to send my Minecraft game audio to chat).
- I wanted a sink to dump my chat audio into. It will be looped to my speakers. However, it will not be sent to the microphone.
In all of these, each category of audio (each sink) can be recorded separately by software like OBS Studio, allowing each to be dumped to a different audio channel for later mastering.
With these goals, I used a series of Pipewire loopback modules,
defining them in /etc/pipewire/pipewire.conf.d/10-music-sink.conf
(to give all my users access to this setup).
context.modules = []
The configuration file vaguely resembles Python lists and dictionaries, although it does not use commas. All of my modules are defined within the square brackets.
To start, I create a "combined microphone." Both my hardware microphone and all relevant sinks will send their data to that sink. Anything sent to the combined microphone sink will then be able to be read as a microphone that chat applications like Signal and Mumble can use.
context.modules = [
{ name = libpipewire-module-loopback
args = {
capture.props = {
node.name = "combined_microphone.input"
stream.dont-remix = true
node.passive = true
}
playback.props = {
node.name = "combined_microphone"
node.description = "Combined Microphone"
media.class = "Audio/Source"
}
}
}
]
By restarting Pipewire with systemctl --user restart pipewire, we can now see
a new input, "Combined Microphone" (you may need to enable virtual audio
objects in your UI).
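We can also confirm the new source exists from the command line. This is just a sanity check; pactl requires the pipewire-pulse compatibility layer to be running:

```shell
# Via the PulseAudio compatibility layer:
pactl list short sources | grep combined_microphone

# Or via the native Pipewire tools:
pw-cli ls Node | grep combined_microphone
```

If nothing shows up, check the Pipewire logs with journalctl --user -u pipewire for syntax errors in the configuration file.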
An explanation of what each aspect does:
- name = libpipewire-module-loopback: This defines what type of module we are making. For this project, we are using exclusively loopback modules. This module type can create sinks and sources, but can also handle redirecting.
- capture.props: This controls how we get audio data into this module. It has two modes. By setting the media.class property to "Audio/Sink," I can create a new sink that we can have applications dump data to. By leaving it out, the module can instead be configured to take its data from an existing source or sink, usually by specifying the node.target property. If node.target is also left out, the loopback module will capture the default source (i.e. the hardware microphone). This capture will even keep up with changes to the default, letting us change what microphone we use without issue.
- playback.props: This controls how we send out data. Here, we set media.class to "Audio/Source" to create a new source, a new microphone. The idea is that I can configure chat applications to listen to this microphone and receive whatever I send to this object. If I omitted media.class, I could instead send the received audio data to a sink, such as my primary speakers (configurable via node.target).
- node.name: The name of the source/sink. This is the name that will be used to reference the source/sink elsewhere in the file. As a standard, I add an ".input" suffix to the capture properties when I am making a new source, or an ".output" suffix to the playback properties when I am making a new sink. This is just a personal convention, however.
- node.description: A human-readable name for sinks and sources that will show up in GUIs. I only define this when I am also defining the media.class property.
- stream.dont-remix: By setting this to "true," I can ensure that the channels (i.e. left output, right output) are not changed. I may need to reconsider this if I ever get a more complex audio hardware setup. However, since all my devices use stereo audio, this works.
- node.passive: By setting this to "true," I can reduce resource usage when no audio is flowing through this source or sink. I use this for objects that are purely used for connecting two objects.
- stream.capture.sink: This is not used in the above example, but I set it to "true" when I am sending audio between sinks. I cannot find documentation on what it does, but it is needed in some of the definitions below.
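As an illustration of the node.target alternative described above, a loopback with no media.class on either side simply pipes the default microphone into a named existing sink. This is only a sketch, not part of my setup; the ALSA node name below is a made-up example and would differ per machine:

```
{ name = libpipewire-module-loopback
    args = {
        capture.props = {
            # No media.class and no node.target: follow the default source
            stream.dont-remix = true
        }
        playback.props = {
            # No media.class: play into an existing sink by name
            # (this node name is hypothetical)
            node.target = "alsa_output.pci-0000_00_1f.3.analog-stereo"
        }
    }
}
```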
With this setup, I now have a virtual microphone alongside however many hardware microphones I have plugged in, and whatever hardware microphone I have set to "default" will be echoed in the virtual microphone. At the moment, it is not useful. We need more objects for it to do something interesting. Let us define a new sink to dump audio into.
context.modules = [
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
args = {
capture.props = {
node.name = music_sink
node.description = "Music Sink"
media.class = Audio/Sink
}
playback.props = {
node.name = "music_sink.output"
stream.dont-remix = true
node.passive = true
}
}
}
]
This time, we create a virtual speaker called music_sink by setting the media.class property within capture.props to "Audio/Sink." By omitting the
class within the playback props, we echo anything dumped into the new
sink to the default speakers.
With this setup, we now have a new audio output that will echo its data to the default speaker. By setting a music player to send its output to this new sink, we can still hear the music. However, we can record this output (i.e. via OBS) separately from the rest of the desktop.
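If the music player has no output-device setting of its own, its stream can also be moved from the command line (a sketch using pactl; the stream id 42 is illustrative and would come from the listing):

```shell
# List active playback streams and note the music player's id
pactl list short sink-inputs

# Move that stream (here assumed to be id 42) into the virtual sink
pactl move-sink-input 42 music_sink
```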
Let us set up a few more.
context.modules = [
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
args = {
capture.props = {
node.name = application_sink
node.description = "Application Sink"
media.class = Audio/Sink
}
playback.props = {
node.name = "application_sink.output"
stream.dont-remix = true
node.passive = true
}
}
}
{ name = libpipewire-module-loopback
args = {
capture.props = {
node.name = chat_sink
node.description = "Chat Sink"
media.class = Audio/Sink
}
playback.props = {
node.name = "chat_sink.output"
stream.dont-remix = true
node.passive = true
}
}
}
]
Now, we have three new sinks: one for the music player, one for applications/games, and one for chat applications. All of them will send their data to the default speaker, but will also keep their information separate.
Now, we could leave things here, and our setup would work quite well
for streaming and recording. However, I want to be able to send some
data (music) over chat, so we need to create more loopbacks and take
advantage of the combined_microphone
source we set up earlier. Let us add a new loopback that will connect
two existing objects.
context.modules = [
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
...
}
{ name = libpipewire-module-loopback
args = {
capture.props = {
node.target = music_sink
stream.capture.sink = true
}
playback.props = {
node.target = "combined_microphone.input"
stream.dont-remix = true
node.passive = true
}
}
}
]
This new loopback takes everything we dump into music_sink and sends it to the combined_microphone. Now, if we reload, our
music player can be heard on combined_microphone. By setting a chat
application to listen to combined_microphone, your friends can hear both
your voice and whatever music you are playing.
Now, we could also set up a similar loopback for applications, letting me supplement the screen share. However, this presents a conflict. I want to share Ace Attorney over Signal. However, I do not want to share Minecraft over voice chat. That said, I still want to dump both into an isolated sink to record separately. To solve this, I will create yet another sink, along with a new loopback.
context.modules = [
...
{ name = libpipewire-module-loopback
args = {
capture.props = {
node.name = application_loopback
node.description = "Application Loopback"
media.class = Audio/Sink
}
playback.props = {
node.target = "application_sink"
stream.dont-remix = true
node.passive = true
}
}
}
...
{ name = libpipewire-module-loopback
args = {
audio.position = [ FL FR ]
capture.props = {
node.target = application_loopback
stream.capture.sink = true
}
playback.props = {
node.target = "combined_microphone.input"
stream.dont-remix = true
node.passive = true
}
}
}
]
The first module defined above creates a sink, application_loopback, that
echoes not to the speakers, but to application_sink. The second module then
also echoes application_loopback to the combined
microphone. Thus, we have two options for applications. By sending audio to
application_loopback, it is echoed both to
application_sink for recording and to
combined_microphone for sharing.
Alternatively, we can dump it to application_sink directly if we only want to
record.
The final file can be found here.
By placing it in the appropriate folder and restarting Pipewire (systemctl --user restart pipewire), we now have
several new audio outputs and a new input. From there, we set up our
environment as follows:
- Set the default speaker to the hardware speaker we want to hear everything out of. Set the default microphone to the hardware microphone we will be speaking into.
- Configure the music player to dump to the music_sink sink. If the application does not support it natively, configure it in an OS volume mixer (KDE handles this without needing extra packages).
- Configure your game to output to either application_loopback if you are running a screen share, or application_sink if you only need to record.
- Configure your chat application (Mumble, Signal, Discord, etc.) to output to chat_sink. Configure your chat application to use combined_microphone as the input.
- In your screen recording software (OBS, etc.), set the microphone to your default hardware microphone. Set up different audio sources to read application_sink and chat_sink (and optionally music_sink), and set each source to save to a different audio channel.
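For the first step, WirePlumber's wpctl tool can set the defaults from the command line (the numeric ids below are illustrative; take them from the wpctl status output):

```shell
# Show all nodes with their ids; the current defaults are marked
wpctl status

# Set the default speaker and microphone by id (ids are examples)
wpctl set-default 55
wpctl set-default 61
```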