obs-outputs: Correct FLV CTS calculation #12151
cc @norihiro who may more quickly understand the math.
I found a few secondary sources that confirm that CTS (composition time) should be calculated as PTS - DTS, specifically in milliseconds, so these changes make sense. That said, I don't have access to a 'source of truth' for the FLV format.
Checking the example:
def get_ms_time(ts: int, den: int):
    return ts * 1000 // den
timebase_den = 60 # 60 FPS: 1/60
# 0 b-frames
dts = pts = 120
cts = pts - dts # == 0
dts_ms = get_ms_time(dts, timebase_den) # 2000
cts_ms = get_ms_time(cts, timebase_den) # 0
The relevant OBS function get_ms_time() is:
#define MILLISECOND_DEN 1000
static int32_t get_ms_time(struct encoder_packet *packet, int64_t val)
{
	return (int32_t)(val * MILLISECOND_DEN / packet->timebase_den);
}
In the example, dts = pts = 120 and timebase_den = 60, passed as ts and den, returns 2000.
In OBS' get_ms_time, packet->timebase_den = 60 and packet->pts = 120 (passed as packet and packet->pts) returns 2000.
Trying to follow between the example and OBS code, I got a little confused because the parameters are swapped.
I'm trying to follow along here, but I'm having trouble finding where the math goes wrong within the changed lines. With the sample values in a mocked up test, changing between 0 and 1 b-frames, I notice that time_ms (line 311) is the value that changes from 2000 to 1999, which itself seems odd. All other values seem consistent.
In order for this to be correct, we have to convert both PTS and DTS to milliseconds first, then calculate the offset, which will give us
Edit: Also to clarify, with this change all the PTS values when b-frames are used will match the ones when no b-frames are used. Right now in OBS at 60 FPS, 40 out of 60 frames in a second will have a wrong timestamp.
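The "40 out of 60" figure can be checked with a short sketch. It assumes a 1/60 timebase and a DTS that trails the PTS by exactly one frame (a hypothetical one-b-frame case); `count_mismatches` is an illustrative helper, not an OBS function. For non-negative timestamps, Python's floor division matches the truncating integer division in OBS's `get_ms_time()`.

```python
def get_ms_time(ts: int, den: int) -> int:
    """Convert a timestamp in 1/den units to whole milliseconds, rounding down."""
    return ts * 1000 // den


def count_mismatches(first_pts: int, n_frames: int, offset: int = 1, den: int = 60) -> int:
    """Count frames whose reassembled PTS (DTS ms + CTS ms) differs from the true PTS ms.

    `offset` is the assumed PTS - DTS distance in frames (hypothetical).
    """
    wrong = 0
    for pts in range(first_pts, first_pts + n_frames):
        dts = pts - offset
        # Old approach: round DTS and the CTS offset independently, then add.
        old_pts_ms = get_ms_time(dts, den) + get_ms_time(pts - dts, den)
        if old_pts_ms != get_ms_time(pts, den):
            wrong += 1
    return wrong


# One second of frames at 60 FPS, starting at pts = 60 to keep DTS non-negative.
print(count_mismatches(60, 60))  # 40
```

Under the same assumptions an offset of three frames produces no mismatches, because 3 * 1000 / 60 is exactly 50 ms and nothing is lost to rounding.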
Okay, I think I follow now. Thanks.
This is correct; get PTS and DTS in milliseconds first, and then the (consistent) CTS is evaluated by calculating PTS - DTS.
Using x264 encoder with bf=3, Using the PR 12151,
On the master (FYI):
So, DTS+CTS is still 1999 instead of 2000. Is this the expected result?
The PR description is just an illustrative example of how the math goes wrong, not what OBS will actually produce. The DTS in FLV is an unsigned integer, meaning that we cannot start at PTS
The main issue that this PR solves is that the PTS for frames (but especially keyframes) between multiple renditions in multitrack does not line up when different amounts of b-frames are used (e.g. with AMD cards or NVIDIA Turing/10-series not supporting b-frames for HEVC).
To further explain the reasoning here, when we convert the timestamps to milliseconds the deltas between DTS timestamps will be 2/3
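The rendition misalignment can be sketched with synthetic numbers. The example below assumes a 1/60 timebase and that a track with k b-frames has a DTS trailing its PTS by k frames; both are simplifications for illustration, and real encoder offsets may differ. `flv_pts_old` and `flv_pts_new` are hypothetical helper names, not OBS functions.

```python
def get_ms_time(ts: int, den: int) -> int:
    """Convert a timestamp in 1/den units to whole milliseconds, rounding down."""
    return ts * 1000 // den


def flv_pts_old(pts: int, dts: int, den: int) -> int:
    """PTS as a demuxer would reassemble it when DTS and CTS are rounded independently."""
    return get_ms_time(dts, den) + get_ms_time(pts - dts, den)


def flv_pts_new(pts: int, dts: int, den: int) -> int:
    """PTS when CTS is computed as the difference of the millisecond values."""
    dts_ms = get_ms_time(dts, den)
    cts_ms = get_ms_time(pts, den) - dts_ms
    return dts_ms + cts_ms


DEN = 60
PTS = 120  # the same frame in every rendition

# Tracks 0..3 with an assumed DTS offset of 0, 1, 2 and 3 frames respectively.
old = [flv_pts_old(PTS, PTS - k, DEN) for k in range(4)]
new = [flv_pts_new(PTS, PTS - k, DEN) for k in range(4)]

print(old)  # [2000, 1999, 1999, 2000]
print(new)  # [2000, 2000, 2000, 2000]
```

With independent rounding, the same frame lands on different millisecond values depending on the track's offset; taking the difference after conversion gives every track the same PTS.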
Here's a real-world sample of timestamps that shows the problem with a multitrack output that intentionally has different b-frame numbers configured as follows (from track 0 to 3): 0, 1, 2, 3
Without the fix:
With the fix:
Thank you for the explanation.
The conversion to milliseconds rounds down. Rounding down the DTS and CT values independently accumulates error, so the PTS (DTS + CT) can come out incorrect (off-by-one).
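The off-by-one bound follows from a general property of the floor function. Writing $d$ for the timebase denominator and assuming non-negative timestamps (so that truncating division equals floor), the two roundings relate as:

```latex
\left\lfloor \tfrac{1000\,\mathrm{DTS}}{d} \right\rfloor
+ \left\lfloor \tfrac{1000\,(\mathrm{PTS}-\mathrm{DTS})}{d} \right\rfloor
\;\le\;
\left\lfloor \tfrac{1000\,\mathrm{PTS}}{d} \right\rfloor
\;\le\;
\left\lfloor \tfrac{1000\,\mathrm{DTS}}{d} \right\rfloor
+ \left\lfloor \tfrac{1000\,(\mathrm{PTS}-\mathrm{DTS})}{d} \right\rfloor + 1
```

The sum of the independently rounded parts is therefore never more than 1 ms short of the directly converted PTS, and it is exactly 1 ms short whenever the two discarded fractional parts add up to at least 1.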
Description
Fixes the CTS (composition timestamp/offset) calculation in OBS's native FLV muxer.
Motivation and Context
Currently the offset calculation rounds down both the DTS and CTS, resulting in an off-by-one error in the assembled PTS (DTS+CTS) timestamp.
To illustrate the problem:
However, if we convert both parts to milliseconds before calculating the difference, the math works out:
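A minimal sketch of both calculations, assuming a 1/60 timebase and a hypothetical frame with pts = 120 and dts = 119. These values are chosen for illustration because they reproduce the 1999 versus 2000 figures discussed in the conversation; the variable names are not taken from the OBS source.

```python
TIMEBASE_DEN = 60  # assumed 60 FPS timebase (1/60)


def get_ms_time(ts: int, den: int = TIMEBASE_DEN) -> int:
    """Convert a timestamp in 1/den units to whole milliseconds, rounding down."""
    return ts * 1000 // den


pts, dts = 120, 119  # hypothetical frame, DTS one frame behind PTS

# Offset computed in timebase units, then converted: both parts round down.
old_dts_ms = get_ms_time(dts)        # 1983 (from 1983.33)
old_cts_ms = get_ms_time(pts - dts)  # 16   (from 16.67)
old_pts_ms = old_dts_ms + old_cts_ms

# Both timestamps converted first, offset taken in milliseconds.
new_dts_ms = get_ms_time(dts)               # 1983
new_cts_ms = get_ms_time(pts) - new_dts_ms  # 2000 - 1983 = 17
new_pts_ms = new_dts_ms + new_cts_ms

print(old_pts_ms)  # 1999
print(new_pts_ms)  # 2000
```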
How Has This Been Tested?
Copious amounts of logging. Also verified that FFmpeg will produce a file with the expected PTS alignments.
Types of changes
Checklist: