Filter for min and max age / incomplete live streams #640

Open
padok opened this issue Jun 30, 2024 · 6 comments

padok commented Jun 30, 2024

Hello!

I'm currently having issues downloading live streams using Podsync. Sometimes the feed gets updated before or shortly after the live stream has ended, which causes the downloaded video to miss part of the beginning, the end, or both.

This happens to me especially with the following config:

  [feeds.LTT]
  url = "https://www.youtube.com/user/LinusTechTips"
  page_size = 10
  update_period = "1h"
  quality = "high"
  filters = {not_title = "LTT TV.*", max_duration = 18000}
  format = "video"
  max_height = 720
  clean = { keep_last = 10 }

It mostly affects their long Friday live streams.

I'm proposing a filter that lets me specify a minimum age for videos. This way Podsync could hold off on the first, possibly faulty, download until a set amount of time has passed.
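To illustrate the idea, here is a minimal sketch in Python. It is for illustration only: Podsync itself is written in Go, and `Episode`, `old_enough`, and `min_age_days` are hypothetical names, not existing Podsync types or options.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Episode:
    # Hypothetical stand-in for a feed entry; not Podsync's real type.
    video_id: str
    published: datetime


def old_enough(episode: Episode, min_age_days: int, now: datetime) -> bool:
    """Return True once the episode has been public for at least min_age_days."""
    return now - episode.published >= timedelta(days=min_age_days)


now = datetime(2024, 8, 27, tzinfo=timezone.utc)
episodes = [
    # Published four hours before "now": still too fresh.
    Episode("fresh-stream", datetime(2024, 8, 26, 20, 0, tzinfo=timezone.utc)),
    # Published a week before "now": safe to download.
    Episode("older-video", datetime(2024, 8, 20, tzinfo=timezone.utc)),
]
ready = [e.video_id for e in episodes if old_enough(e, min_age_days=1, now=now)]
print(ready)  # ['older-video']
```

An episode that is filtered out this way would simply be picked up on a later update once it has aged past the threshold.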

I didn't find any related issue or documentation, but I would be happy if someone could point me in the right direction if this feature already exists in one form or another.

Thanks!

LewisSpring commented Aug 27, 2024

Looks like this might already be a feature?

filters = { title = "regex for title here", not_title = "regex for negative title match", description = "...", not_description = "...", min_duration = 0, max_duration = 86400, max_age = 365 }

Edit: At least for max_age. Will test min_age now.

LewisSpring commented Aug 27, 2024

Setting min_age does not seem to work. Now trying this:
youtube_dl_args = ["--datebefore today-1week"]

Edit: It seems to think that it's not a valid argument?

time="2024-08-27T16:46:05Z" level=error msg="youtube-dl error: /tmp/podsync-691529531/GKPSAUzczO8.%(ext)s" error="failed to execute youtube-dl: exit status 2"
time="2024-08-27T16:46:05Z" level=error msg="\nUsage: youtube-dl [OPTIONS] URL [URL...]\n\nyoutube-dl: error: no such option: --datebefore today-2weeks\n"

But it does work in the shell:

/app # youtube-dl --datebefore today-2weeks -s "https://www.youtube.com/watch?v=GKPSAUzczO8&list=PL8mG-RkN2uTw7PhlnAr4pZZz2Qub"
IbujH
[youtube] Extracting URL: https://www.youtube.com/watch?v=GKPSAUzczO8
[youtube] GKPSAUzczO8: Downloading webpage
[youtube] GKPSAUzczO8: Downloading ios player API JSON
[youtube] GKPSAUzczO8: Downloading web creator player API JSON
[youtube] GKPSAUzczO8: Downloading m3u8 information
[download] 2024-08-24 upload date is not in range 0001-01-01 to 2024-08-13
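
The range in that last line can be checked by hand. A small Python sketch, assuming the relative date `today-2weeks` was evaluated on the day the log lines above were written (2024-08-27):

```python
from datetime import date, timedelta

today = date(2024, 8, 27)            # date of the log lines above (assumed)
cutoff = today - timedelta(weeks=2)  # what "today-2weeks" resolves to
upload = date(2024, 8, 24)           # upload date reported in the output

print(cutoff)            # 2024-08-13
print(upload <= cutoff)  # False, so the video is outside the range
```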

LewisSpring commented Aug 27, 2024

According to #424, the option and its argument need to be given as separate list entries, and that seems to be working.
Now retesting using youtube_dl_args = ["--datebefore", "today-1week"]
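
A likely explanation, sketched in Python: a shell splits the command line on whitespace before the program sees it, whereas each entry of a list like `youtube_dl_args` is presumably handed to the process as one argument. That is an assumption based on the error message, not on Podsync's source. `shlex.split` stands in for the shell's splitting here, and `URL` is a placeholder.

```python
import shlex

# What the shell produces from the command line typed by hand.
shell_argv = shlex.split("youtube-dl --datebefore today-1week -s URL")
print(shell_argv)
# ['youtube-dl', '--datebefore', 'today-1week', '-s', 'URL']

# One list entry containing a space stays a single argument,
# so the program sees an unknown option "--datebefore today-1week".
joined = ["--datebefore today-1week"]

# Two entries match what the shell would have produced.
separated = ["--datebefore", "today-1week"]

print(joined[0] in shell_argv)                      # False
print(all(arg in shell_argv for arg in separated))  # True
```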

Edit: I think this confuses the pager, as it doesn't actually download anything for the video that is too new.
But it seems to work for the other, older videos as far as I can tell right now (the downloads are taking time...).

It shows either of these behaviours:

time="2024-08-27T17:09:51Z" level=info msg="downloading episodes" page_size=5
time="2024-08-27T17:09:51Z" level=info msg="download count: 5"
time="2024-08-27T17:09:51Z" level=info msg="! downloading episode https://youtube.com/watch?v=EefvOLKoXdg" episode_id=EefvOLKoXdg index=0
time="2024-08-27T17:09:55Z" level=info msg="! downloading episode https://youtube.com/watch?v=G2tYxHT-EkA" episode_id=G2tYxHT-EkA index=1
time="2024-08-27T17:20:35Z" level=info msg="creating file: /app/data/wantest/G2tYxHT-EkA.mp3" name=wantest/G2tYxHT-EkA.mp3
time="2024-08-27T17:26:55Z" level=info msg="downloading episodes" page_size=10
time="2024-08-27T17:26:55Z" level=info msg="skipping due to already downloaded" episode_id=EefvOLKoXdg
time="2024-08-27T17:26:55Z" level=info msg="download count: 9"
time="2024-08-27T17:26:55Z" level=info msg="! downloading episode https://youtube.com/watch?v=G2tYxHT-EkA" episode_id=G2tYxHT-EkA index=0

I think using the filters line would be a better solution.

LewisSpring commented Aug 27, 2024

Update, now the downloads have finished.
At least this time around, the episode that is "too new" is added to the RSS feed, but the file is not present on the web server. This is not too bad a solution if it's consistent.

Edit: Though this might cause problems as soon as the file is old enough and needs to be downloaded. Testing that now.

Edit2: so far so good. Just downloading as normal.

time="2024-08-28T08:34:00Z" level=info msg="download count: 1"
time="2024-08-28T08:34:00Z" level=info msg="! downloading episode https://youtube.com/watch?v=GKPSAUzczO8" episode_id=GKPSAUzczO8 index=0
time="2024-08-28T08:47:19Z" level=info msg="creating file: /app/data/wan/GKPSAUzczO8.mp4" name=wan/GKPSAUzczO8.mp4
time="2024-08-28T08:49:40Z" level=info msg="successfully downloaded file \"GKPSAUzczO8\"" episode_id=GKPSAUzczO8 index=0
time="2024-08-28T08:49:47Z" level=info msg="downloaded 1 episode(s)"
time="2024-08-28T08:49:47Z" level=info msg="running cleaner" count=10 feed_id=wan
time="2024-08-28T08:49:48Z" level=info msg="creating file: /app/data/wan.xml" name=wan.xml

padok commented Oct 26, 2024

Let me try to summarize your findings, and thank you for sharing them!

So, as far as I understood it, the --datebefore argument in yt-dlp makes it behave as if an episode is already present, when the time condition is not met. This causes Podsync to skip the download but still add the episode to the RSS feed. When the file reaches the required age, yt-dlp downloads the episode the next time Podsync triggers it. Since Podsync has already added the episode to the RSS feed, the feed does not get updated again.

The main issue is that some podcast clients might not automatically retry downloading the episodes after they initially fail.

Therefore, we still need Podsync to support this in filtering or to only add entries to the feed if the corresponding files are present.
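
The second option could look roughly like the following Python sketch. It is illustrative only: Podsync is written in Go, `entries_with_files` is a hypothetical helper, and the `.mp4` extension and flat directory layout are assumptions.

```python
import tempfile
from pathlib import Path


def entries_with_files(episode_ids, data_dir, extension=".mp4"):
    """Keep only the episode IDs whose media file already exists in data_dir."""
    return [
        eid for eid in episode_ids
        if (Path(data_dir) / f"{eid}{extension}").is_file()
    ]


with tempfile.TemporaryDirectory() as data_dir:
    # Simulate one finished download; the other episode was skipped.
    (Path(data_dir) / "G2tYxHT-EkA.mp4").write_bytes(b"")
    feed = entries_with_files(["GKPSAUzczO8", "G2tYxHT-EkA"], data_dir)

print(feed)  # ['G2tYxHT-EkA']
```

With a check like this, a skipped episode would only appear in the feed after its file exists, so podcast clients would never see an entry they cannot download.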

Let me know if I understood your findings correctly.

LewisSpring commented Oct 31, 2024

Exactly!
Well done for deciphering it 😅
I should have taken the time to summarise.
