# announcements
m
@polite-kilobyte-67570 has started to think about/explore a framework for LiveKit processors/plugins. He may have some thoughts here 🙂
d
today, you could also perform (1) directly on the MediaStreamTrack objects acquired on the browser side. LiveKit's advanced publish APIs can take a MediaStreamTrack, so it's compatible with the w3c spec.
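For instance, a minimal sketch of that publish path with livekit-client; the `url` and `token` values are assumed placeholders for a real deployment:

```typescript
import { Room } from 'livekit-client';

// Minimal sketch: publish a raw w3c MediaStreamTrack through LiveKit.
// `url` and `token` are placeholders for real connection credentials.
async function publishCamera(url: string, token: string) {
  const room = new Room();
  await room.connect(url, token);

  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [camTrack] = stream.getVideoTracks();

  // publishTrack accepts the MediaStreamTrack directly, so any effects
  // already applied to the track on the browser side carry through.
  await room.localParticipant.publishTrack(camTrack, { name: 'camera' });
}
```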
p
I haven’t experimented with scenario 2 yet. Could you share a bit more about your use case and how you imagine this working? Are you imagining an additional hop between the client and SFU that only does a media-track transform?
a
@polite-kilobyte-67570 That's correct. I imagine effects would be applied after the VideoFrame leaves the client browser and before the SFU performs its more advanced functionality like adaptive stream and dynamic broadcast, possibly deployed at the SFU to reduce transcoding overhead. Relevant parameters could be carried in the participant's metadata or within the track data channel. A few approaches that could exist:
• Twirp-based API extension
  ◦ Configure where your processor is running and have the API pass metadata and frames (audio chunks as well) to it
• Webhooks-based
  ◦ Similar to the above; however, I believe webhooks are primarily used for client APIs
• Track-based
  ◦ Define an interface similar to mediacapture-transform for tracks
  ◦ Deploy as part of a stripped-down SFU or as a client track relay
The Twirp comm channel likely flows in the wrong direction. It and webhooks have the benefit of possibly reducing transcode overhead. Track-based seems more flexible to me (a rough interface sketch follows below), but leads to more encode/decode. Building it as a plug-in may be reasonable, but I'm not familiar with how best to do that in Go, or what headaches would come from multi-language/platform support.
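As an illustration only, here is one hypothetical shape the Track-based interface could take. Every name in this sketch (VideoFrameLike, TrackTransformer, TrackRelay, and their methods) is invented for discussion and is not part of any LiveKit API:

```typescript
// Hypothetical sketch only; none of these names exist in LiveKit.
// A track-level analogue of mediacapture-transform: a transformer
// consumes decoded frames from an input track and yields frames for
// an output track, running on a relay node instead of in the browser.
interface VideoFrameLike {
  timestampUs: number;
  data: Uint8Array; // raw pixels in some agreed-upon format
}

interface TrackTransformer {
  // Called once per decoded frame; returns the (possibly modified) frame.
  transform(frame: VideoFrameLike): Promise<VideoFrameLike>;
}

// The relay (a stripped-down SFU or client-side track proxy) would
// decode, run registered transformers, re-encode, and forward.
interface TrackRelay {
  register(trackId: string, transformer: TrackTransformer): void;
  unregister(trackId: string): void;
}
```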
p
From what you describe, I think the easiest way to achieve something along those lines today would be to deploy your own TURN server and “misuse” it to modify packets. https://github.com/pion/turn/blob/master/examples/turn-server/add-software-attribute/main.go
a
Thanks, that does look like a way to do it. mediacapture-transform also seems to be supported browser-side (as an experimental feature in some browsers). Do you have an opinion on whether it should be included in the server SDKs, or built as some separate system?
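For concreteness, a sketch of that browser-side path using mediacapture-transform together with livekit-client. The API is experimental (Chromium-only at the time of this conversation), and the class declarations below are stand-ins for types not yet in the standard TypeScript DOM lib:

```typescript
import { Room } from 'livekit-client';

// Stand-in declarations for the experimental mediacapture-transform API;
// these classes exist in Chromium but not yet in TypeScript's DOM lib.
declare class MediaStreamTrackProcessor {
  constructor(init: { track: MediaStreamTrack });
  readable: ReadableStream<VideoFrame>;
}
declare class MediaStreamTrackGenerator extends MediaStreamTrack {
  constructor(init: { kind: 'video' | 'audio' });
  writable: WritableStream<VideoFrame>;
}

async function publishProcessedCamera(room: Room) {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [camTrack] = stream.getVideoTracks();

  // Processor exposes decoded VideoFrames; generator turns frames written
  // to it back into a live MediaStreamTrack.
  const processor = new MediaStreamTrackProcessor({ track: camTrack });
  const generator = new MediaStreamTrackGenerator({ kind: 'video' });

  const effect = new TransformStream<VideoFrame, VideoFrame>({
    transform(frame, controller) {
      // A real effect would draw the frame to an OffscreenCanvas and
      // enqueue a new VideoFrame; this sketch passes frames through.
      controller.enqueue(frame);
    },
  });

  // Runs for the lifetime of the track; intentionally not awaited.
  void processor.readable.pipeThrough(effect).pipeTo(generator.writable);

  // The generator is itself a MediaStreamTrack, so it publishes directly.
  await room.localParticipant.publishTrack(generator, { name: 'processed' });
}
```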
@polite-kilobyte-67570 at that point, when the message is intercepted at the TURN server, the video frame would still be encoded, right? I'm under the impression that a lot of the functionality in the SFU, with respect to determining encoding type and state, would need to be re-implemented there. Do you think intercepting at the SFU receiver layer might be easier?
p
yeah, I think it's generally quite an undertaking to achieve what you described, especially if you want to manipulate frames at the pixel level. I'm not an expert on this, but AFAIK SFUs generally don't decode video frames; the architecture that does this kind of processing would be an MCU. And because an MCU also handles decoding and re-encoding, it is generally a lot more resource-hungry than an SFU. We do have plans to expose an easy-to-use interface for stream transforms on the client side, but there are currently no plans regarding deployable transform nodes in front of a LiveKit server. It's an interesting thought for sure, though! The reason I suggested using a TURN server as a starting point was mainly that at least the relaying to the actual SFU wouldn't have to be reimplemented, but I don't think it's a perfect solution either.
a
Great, thanks again for all of the feedback!
So multiple quality streams are published directly from the client. I think I missed that the SFU wasn't itself resizing to the various stream qualities. That would further increase the amount of processing that would need to happen at the MCU/SFU level.
p
exactly, that is if the publisher has simulcast enabled. @dry-elephant-14928 has written a great blog post about this topic: https://blog.livekit.io/an-introduction-to-webrtc-simulcast-6c5f1f6402eb/
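For reference, a minimal sketch of enabling simulcast with livekit-client, so the client itself encodes and publishes multiple layers; `simulcast`, `videoSimulcastLayers`, and the `VideoPresets` values are from the library but may vary by version:

```typescript
import { Room, VideoPresets } from 'livekit-client';

// With simulcast on, the *client* encodes and publishes several quality
// layers of the same track; the SFU only picks which one to forward.
async function publishWithSimulcast(room: Room, camTrack: MediaStreamTrack) {
  await room.localParticipant.publishTrack(camTrack, {
    simulcast: true,
    // lower-resolution layers sent alongside the full-resolution encoding
    videoSimulcastLayers: [VideoPresets.h180, VideoPresets.h360],
  });
}
```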
a
Yeah, I think I knew this but somehow glossed over it. I'm now considering whether the client could go through a Track (or MediaStreamTrack) proxy, ignoring considerations like node/region selection for now. I'm going to look at the Track code in more depth to see whether that seems feasible.