Stream is not deleted immediately #1789

Open
getroot opened this issue Feb 24, 2025 · 11 comments
Labels: patched (Patch applied)

Comments

getroot (Member) commented Feb 24, 2025

Discussed in #1782

Originally posted by hernanrz February 20, 2025
We've been having this issue for a while now: some users experience fairly consistent disconnects/reconnects in OBS. We've ruled out network issues on the users' end and tried different bitrate settings, which don't seem to make a difference. We're wondering whether a bug in OME could be the cause.

About our server setup

  • We have multiple concurrent streams
  • We use the API to record every stream that starts and call the API to stop recording when the stream disconnects (and stays disconnected for up to 5 seconds)
  • We're using the latest OME version
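The stop-after-a-grace-period rule from the second bullet can be sketched as a small debouncer. This is only an illustration under assumptions, not the poster's backend: `RecordingDebouncer`, `start_recording`, and `stop_recording` are hypothetical names standing in for whatever code calls the recording API, and the 5-second grace period is taken from the description above.

```python
import time
from typing import Callable, Dict, Set


class RecordingDebouncer:
    """Stop a recording only after its stream has stayed offline for a grace period."""

    def __init__(self,
                 start_recording: Callable[[str], None],
                 stop_recording: Callable[[str], None],
                 grace_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # start_recording / stop_recording are hypothetical callbacks that would
        # wrap the real API requests; they are injected so the logic is testable.
        self._start = start_recording
        self._stop = stop_recording
        self._grace = grace_seconds
        self._clock = clock
        self._recording: Set[str] = set()
        self._offline_since: Dict[str, float] = {}

    def on_stream_started(self, stream: str) -> None:
        # A reconnect inside the grace period cancels the pending stop.
        self._offline_since.pop(stream, None)
        if stream not in self._recording:
            self._recording.add(stream)
            self._start(stream)

    def on_stream_stopped(self, stream: str) -> None:
        # Remember when the stream first went offline; do not stop yet.
        if stream in self._recording:
            self._offline_since.setdefault(stream, self._clock())

    def tick(self) -> None:
        """Call periodically; stops recordings whose grace period has expired."""
        now = self._clock()
        for stream, since in list(self._offline_since.items()):
            if now - since >= self._grace:
                del self._offline_since[stream]
                self._recording.discard(stream)
                self._stop(stream)
```

With this shape, a brief disconnect followed by a reconnect inside the grace window produces no stop/start pair, so the backend sees one continuous recording.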

Config file
Server.xml.txt

OBS Logs from the disconnects
obslog.txt

@getroot getroot added the in progress Being actively worked on but may take some time to complete label Feb 24, 2025
@dimiden dimiden added patched Patch applied and removed in progress Being actively worked on but may take some time to complete labels Feb 26, 2025
dimiden (Member) commented Feb 26, 2025

@naanlizard
There was an issue where socket-related events could be delayed when traffic surged, which caused RTMP streams not to be deleted immediately.
This has been fixed and committed, so streams should now be deleted properly.
Thank you! 👍

naanlizard (Contributor) commented Feb 26, 2025

Unfortunately, it looks like this has not fixed the problem, and/or it causes crashes.

Logs are attached; let me know if you need more.

Unfortunately, due to a config error, I don't think we can get the dump file. I'll see about making sure that works in the future.

Attached are the backend and OME logs for the flickering we had, and also a log from around the crash.

ome-crash.txt ome-logs.txt backend-logs.txt

Dockerfile has

# https://github.com/AirenSoft/OvenMediaEngine/commit/59432ef00ae2fafef36f124ede6dc4d982876253
FROM airensoft/ovenmediaengine@sha256:594c097275d670ea3cecde8d54ca3b2fb67f1252434af8208400bdb90f0f5034

dimiden (Member) commented Feb 27, 2025

@naanlizard
In our Testbed, I conducted a repeated connection and disconnection test with hundreds of connections for over 10 hours and confirmed that there were no issues. It's unfortunate that you're still able to reproduce it. 😭

  1. The crash seems to have occurred during the shutdown process of OME v0.17.3, which appears to be unrelated to this issue, so I will set it aside for now.
  2. To analyze the cause, I need logs showing when the stream corresponding to #default#live/79658 was started and stopped in v0.18.0. Could you provide the full logs from when v0.18.0 was started at 2025-02-26 13:13:04.018 until the issue occurred at 2025-02-26 15:47:17.304?
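Pulling the requested time window out of a large log file can be sketched as below. This is a rough sketch that assumes each log entry begins with a timestamp in the `2025-02-26 13:13:04.018` form quoted above, optionally preceded by `[`; the actual log layout is not shown in this thread, and `parse_timestamp` and `lines_in_window` are hypothetical helper names.

```python
from datetime import datetime
from typing import Iterable, Iterator, Optional

# Assumed layout: "YYYY-MM-DD HH:MM:SS.mmm" at the start of each entry.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TIMESTAMP_LENGTH = 23


def parse_timestamp(line: str) -> Optional[datetime]:
    """Return the leading timestamp of a log line, or None if it has none."""
    text = line.lstrip("[ ")[:TIMESTAMP_LENGTH]
    try:
        return datetime.strptime(text, TIMESTAMP_FORMAT)
    except ValueError:
        return None


def lines_in_window(lines: Iterable[str],
                    start: datetime,
                    end: datetime,
                    needle: Optional[str] = None) -> Iterator[str]:
    """Yield lines stamped within [start, end].

    Lines without a timestamp are treated as continuations of the previous
    entry. If needle is given, only lines containing it are yielded.
    """
    inside = False
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None:
            inside = start <= stamp <= end
        if inside and (needle is None or needle in line):
            yield line
```

Passing `needle="live/79658"` would narrow the output to lines that mention that stream, while omitting it keeps the full window, which is what the request above asks for.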

naanlizard (Contributor) commented Feb 27, 2025

Ah, I apologize, I grabbed the wrong logs for the crash. (Of course, when 0.17.3 stopped, it logged a SIGSEGV, from stopping the old container to start the new one.)

Unfortunately, due to a logger problem, I don't have those logs. What I did manage to get are the logs below: the clean.txt file covers the last 3 hours, without the listed log lines, and the crash file is from an example SIGSEGV (older than the last 3 hours).

omelogs-noHTTP-noStream_metrics-nolonger-clean.txt
ome-logs-crash.txt

I'll update with more information when I have it. Thank you for your hard work here.

getroot (Member, Author) commented Feb 28, 2025

@naanlizard Thanks for reporting the crash. Unfortunately, it was very hard to find any hints in the logs. If you get a dump, please share it. And this crash didn't happen in 0.17.3?

naanlizard (Contributor) commented Feb 28, 2025 via email

naanlizard (Contributor) commented

No crashes since. Any ideas about the duplicate streams/flickering, even on the new version?

dimiden (Member) commented Mar 2, 2025

We are reviewing several scenarios, but we have not yet identified any suspicious code.
Could you provide logs from the moment the affected stream last stopped publishing to the point where it was shown as duplicated?

dimiden (Member) commented Mar 4, 2025

@naanlizard
I have improved the logging so that crash dumps are also displayed in the log files. Additionally, if the crash dump file cannot be saved in the dumps directory, it will now be saved in the logs directory.

If the issue occurs again, please attach the logs containing the crash. And, if possible, please also provide the full logs from the dates before and after the stream duplication error occurred.
If it is difficult to upload here, I can provide a Google Drive URL where you can upload the files.
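The fallback described above (try the dumps directory, otherwise use the logs directory) follows a general pattern that can be sketched as follows. This is not OME's implementation, only a minimal illustration of the idea; `first_writable_dir` is a hypothetical helper, and the directory names in the usage note are placeholders.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def first_writable_dir(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate directory in which a file can actually be created."""
    for directory in candidates:
        try:
            directory.mkdir(parents=True, exist_ok=True)
            # Creating a real temporary file probes the permissions of the
            # user the process is running as; the file is removed on close.
            with tempfile.NamedTemporaryFile(dir=directory):
                pass
        except OSError:
            # Covers permission errors, read-only filesystems, and the like.
            continue
        return directory
    return None
```

A caller might use `first_writable_dir([Path("dumps"), Path("logs")])` and report an error only when the result is `None`, so a process without write access to the first location still leaves a dump somewhere.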

naanlizard (Contributor) commented

I'll upgrade ASAP. The problem was that we were running the container as a non-root user, which didn't have write permissions in the folder where OME wanted to create the crash folder. I think that may be a bug, or at least an oversight, in the image; I'll double-check, though.

So far, no more crashes, which is strange.
