Summary of what I am trying to achieve:
I'm currently doing some work on a Discord bot. I'm trying to join a voice channel, which is the easy part, and then use the combined audio of the speakers in that voice channel as input for a webpage in a web browser. It doesn't really matter which browser it is as long as it can be controlled with Selenium.
What I've tried/looked into so far
My bot so far is written in Python using the discord.py API wrapper. Unfortunately, receiving audio (as opposed to sending it) isn't well implemented in discord.py, let alone documented. This made me decide to switch to Node.js (i.e. discord.js) for the voice channel part of my bot.
After switching to discord.js it was pretty easy to determine who's talking and create an audio stream (PCM stream) for that user. For the next part I thought I'd just pipe the audio stream to a virtual microphone and select that as the audio input in the browser. You can even use FFmpeg from within Node.js to get something that looks like this:
const Discord = require("discord.js");
const ffmpeg = require("fluent-ffmpeg"); // provides the ffmpeg(...) builder used below
const client = new Discord.Client();

client.on('ready', () => {
  const voiceChannel = client.channels.get('SOME_CHANNEL_ID');
  voiceChannel.join()
    .then(conn => {
      console.log('Connected');
      const receiver = conn.createReceiver();
      conn.on('speaking', (user, speaking) => {
        if (speaking) {
          // One PCM stream per speaking user
          const audioStream = receiver.createPCMStream(user);
          // Convert the raw PCM to 16 kHz mono s16le and pipe it onward
          ffmpeg(audioStream)
            .inputFormat('s32le')
            .audioFrequency(16000)
            .audioChannels(1)
            .audioCodec('pcm_s16le')
            .format('s16le')
            .pipe(someVirtualMic); // placeholder: the virtual microphone I still need to create
        }
      });
    })
    .catch(console.log);
});

client.login('SOME_TOKEN');
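Since the goal is the combined audio of all speakers, and the snippet above produces one stream per user, the per-user audio would have to be mixed somewhere along the way. The following is only a rough sketch of that step, assuming the chunks are already signed 16-bit little-endian PCM with the same sample rate and channel count (for example, the s16le output of the FFmpeg step). `mixPcm16` is a hypothetical helper name, not part of discord.js.

```javascript
// Mix several buffers of signed 16-bit little-endian PCM into a single buffer.
// Assumption: all buffers share the same sample rate and channel count.
function mixPcm16(buffers) {
  // Output is as long as the longest input (rounded down to whole samples);
  // shorter inputs are treated as silence past their end.
  const byteLength = Math.max(0, ...buffers.map(b => b.length - (b.length % 2)));
  const out = Buffer.alloc(byteLength);

  for (let offset = 0; offset < byteLength; offset += 2) {
    let sum = 0;
    for (const buf of buffers) {
      if (offset + 2 <= buf.length) {
        sum += buf.readInt16LE(offset);
      }
    }
    // Clamp to the 16-bit range so loud overlaps clip instead of wrapping around.
    const clamped = Math.max(-32768, Math.min(32767, sum));
    out.writeInt16LE(clamped, offset);
  }
  return out;
}
```

Summing with clipping is the simplest possible mixer; it ignores timing differences between speakers, so chunks would need to be aligned before being passed in.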
This last part, creating and streaming to a virtual microphone, has proven to be rather complicated. I've read a ton of SO posts and documentation on both the Advanced Linux Sound Architecture (ALSA) and the JACK Audio Connection Kit, but I simply can't figure out how to set up a virtual microphone that will show up as a mic in my browser, or how to pipe audio to it.
Any help or pointers to a solution would be greatly appreciated!
Addendum
For the past couple of days I've kept looking into this issue. I've now learned about ALSA loopback devices and feel that the solution must be there.
I've pretty much followed a post that talks about loopback devices and aims to achieve the following:
Simply imagine that you have a physical link between one OUT and one IN of the same device.
I've set up the devices as described in the post and now two new audio devices show up when selecting a microphone in Firefox. I'd expect one, but that may be because I don't entirely understand the loopback devices (yet).
The loopback devices are created and I think that they're linked (if I understood the aforementioned article correctly). Assuming that's the case, the only problem I have left to tackle is streaming the audio via FFmpeg from within Node.js.
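For illustration, streaming into the loopback from Node.js might look roughly like the sketch below, which spawns the `ffmpeg` binary and feeds it raw PCM on stdin. Several things here are assumptions rather than facts from the setup above: the loopback card is assumed to have the default `snd-aloop` id `Loopback`, the input is assumed to be 48 kHz stereo `s32le`, and `buildFfmpegArgs` and `pipeToLoopback` are hypothetical helper names. With `snd-aloop`, audio played to `hw:Loopback,0,0` should become available on the matching capture device `hw:Loopback,1,0`, which is the side the browser would use as a microphone.

```javascript
const { spawn } = require('child_process');

// Build ffmpeg arguments: raw PCM from stdin in, ALSA playback device out.
// Assumptions: 48 kHz stereo s32le input; device name matches your loopback card.
function buildFfmpegArgs(device = 'hw:Loopback,0,0') {
  return [
    '-f', 's32le', '-ar', '48000', '-ac', '2', '-i', 'pipe:0', // describe the raw input
    '-ar', '16000', '-ac', '1',                                // resample to 16 kHz mono
    '-f', 'alsa', device,                                      // play into the ALSA device
  ];
}

// Pipe a readable PCM stream into ffmpeg, which plays it into the loopback device.
function pipeToLoopback(pcmStream, device) {
  const child = spawn('ffmpeg', buildFfmpegArgs(device), {
    stdio: ['pipe', 'ignore', 'inherit'], // stdin from us, ffmpeg logs to our stderr
  });
  pcmStream.pipe(child.stdin);
  return child;
}
```

Keeping the argument list in its own function means the same arguments can first be tried by hand on the command line, where problems with the device name or sample format are easier to spot.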
ffmpeg directly from command-line until you figure out the microphone part. Easier to debug :/ – Ramsden