Assuming you're using RTMP to publish a stream to an input, the FLV spec allows only one video track and one audio track. I can't speak to how Cloudflare Stream specifically behaves in a scenario like that, but arguably it's undefined behavior and out of spec.
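
If the source has extra tracks, one way to stay within a single video track and a single audio track is to select them explicitly before publishing. Here's a rough sketch that builds an ffmpeg command using `-map` to pick the first video and first audio stream. The function name `build_ffmpeg_args`, the input file, and the RTMP URL are placeholders I made up for illustration, and `-c copy` assumes the source codecs are already ones FLV can carry (e.g. H.264 and AAC); otherwise you'd need to re-encode.

```python
def build_ffmpeg_args(source: str, rtmp_url: str) -> list[str]:
    """Build an ffmpeg command line that publishes exactly one video
    and one audio track over RTMP (hypothetical helper, not an official API)."""
    return [
        "ffmpeg",
        "-re",            # read the input at its native frame rate
        "-i", source,
        "-map", "0:v:0",  # first video stream of the first input only
        "-map", "0:a:0",  # first audio stream of the first input only
        "-c", "copy",     # assumption: codecs are already FLV-compatible
        "-f", "flv",      # RTMP publishing uses the FLV muxer
        rtmp_url,
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own file and ingest URL.
    print(" ".join(build_ffmpeg_args("input.mkv", "rtmp://example.invalid/live/STREAM_KEY")))
```

Any additional video or audio streams in the source are simply left out, so the server only ever sees one of each.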