--preserved_segments_outside_live_window can fail to delete old segments #533
Rather than changing the packager, adding the above workaround to the docs may be enough to close this issue.
Yes, we need to improve the packager to retry the file deletion if it fails. I'll add it to the v2.4 milestone for now. But it is not a good sign if every file deletion needs to be retried. As mentioned in #509 (comment), if we know that the server needs to hold the file for some more time, it is better to give it more time, e.g. by increasing --preserved_segments_outside_live_window (a value of 6 corresponds to 1 minute when the segment duration is 10 seconds).
Not "every", just the few that fail. Even a low percentage of failures will eventually consume all storage space. That's not a good sign!
OK, that is much better. Anyway, agreed that we'll need the retry logic so the files do not accumulate!
I won't get to test the fix for a little while, but the code looks great. Thanks!
Hello, I am using an attenuator to reproduce the issue. The attenuator receives ATSC data and sends it to an ATSC RF receiver. The data is then turned into IP streams, fed into shaka-packager, and eventually streamed out. I raise the level of the attenuator until the data is affected by it (I can visually see pixelation) and let it run.
@ohMoshko Interesting. I admit that we did not take data corruption into consideration when designing this solution. Do you mind describing in a bit more detail what is happening? For example, what does packager see for "bad" data? Valid TS streams with incorrect timestamps?
What is causing the crash? Out of storage space? Out of memory?
I do not know. One possibility is that the timestamp is corrupted, causing SlideWindow() to exit early: https://github.com/google/shaka-packager/blob/master/packager/hls/base/media_playlist.cc#L581; it might also be possible that MediaPlaylist::SlideWindow() or MediaPlaylist::AddSegmentInfoEntry() is not invoked at all.
The files to be deleted are stored in
I am feeding the packager with different ATSC services. The services that are affected by the low signal power start accumulating .mp4 files (audio and video). The services' media directory grows to around ~900 MB and then the Linux OOM killer is invoked and kills my app that is running shaka-packager.

I can confirm that when the signal is fading, MediaPlaylist::SlideWindow() and MediaPlaylist::AddSegmentInfoEntry() are invoked, but the timestamp is probably corrupted, causing SlideWindow() to exit early. I can record the TS and send it to you if that would help.
--preserved_segments_outside_live_window appears to make a single attempt to remove old segment files, never retrying. For example, this happens when throttling a 1 Gbps link down to 750 Kbps just as the player is starting the next high-res segment. That segment can stay open for some time before the player quits trying.
Workaround: At the end of my packager startup script, I added this code:
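The original snippet was not captured in this excerpt. Purely as an illustration of the kind of cleanup such a script could perform, here is a hypothetical single-pass sweep (directory, extension, and age threshold are all assumptions, not the original code):

```shell
#!/bin/sh
# Hypothetical cleanup pass (NOT the original workaround, which was not
# captured here). Deletes segment files older than a cutoff so that
# anything the packager failed to remove is eventually cleaned up.
# Arguments: $1 = media output directory, $2 = max age in minutes.
cleanup_stale_segments() {
  find "$1" -name '*.mp4' -mmin +"$2" -exec rm -f {} + 2>/dev/null
}

# Example: run a pass every 30 seconds from the startup script, e.g.
#   while true; do cleanup_stale_segments /var/media/live 5; sleep 30; done
```

Because each pass re-scans the directory, a file that was held open during one pass is simply removed on a later pass, which is the behavior the workaround relies on.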
Possible resolution strategy: Keep a list of all segments generated, and try to delete all segments outside the window each time a new segment is generated, removing the list entry when deletion succeeds. This will ensure multiple tries for every segment outside the window. Perhaps emit an error after a significant number of deletion attempts fail (~10 or so), to let the operator address downstream apps that may be keeping the file open/locked.
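The strategy above can be sketched in shell as a standalone illustration (the real fix would live inside the packager's C++ code; all names here are made up): deletions that fail stay on a pending list and are retried on every subsequent pass.

```shell
#!/bin/sh
# Illustrative sketch of the retry-list strategy (hypothetical names; the
# actual fix belongs inside the packager, not a shell script).
PENDING_LIST="${TMPDIR:-/tmp}/pending_deletes.$$"
: > "$PENDING_LIST"

# Record a segment that has fallen outside the live window.
queue_segment_for_delete() {
  printf '%s\n' "$1" >> "$PENDING_LIST"
}

# Attempt every pending deletion; entries that still fail remain on the
# list so they are retried the next time a new segment is generated.
retry_pending_deletes() {
  [ -s "$PENDING_LIST" ] || return 0
  remaining=$(mktemp)
  while IFS= read -r seg; do
    if [ -e "$seg" ] && ! rm -f "$seg" 2>/dev/null; then
      printf '%s\n' "$seg" >> "$remaining"   # still held open; keep for retry
    fi
  done < "$PENDING_LIST"
  mv "$remaining" "$PENDING_LIST"
}
```

A packager-internal version would also count attempts per entry so it can log an error after ~10 failures, as suggested above.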
Originally posted by @BobCu in #509 (comment)