mirror of https://github.com/yt-dlp/yt-dlp.git synced 2025-08-16 20:28:14 -04:00

Compare commits

92 Commits

Author SHA1 Message Date
github-actions
41bd0dc4d7
[version] update
Created by: pukkandan

:ci skip all :ci run dl
2023-02-17 12:31:30 +00:00
pukkandan
a0a7c01542
Release 2023.02.17
2023-02-17 17:52:25 +05:30
pukkandan
45b2ee6f4f
Update to ytdl-commit-2dd6c6e
[YouTube] Avoid crash if uploader_id extraction fails
2dd6c6edd8

Except:
    * 295736c9cba714fb5de7d1c3dd31d86e50091cf8 [jsinterp] Improve parsing
    * 384f632e8a9b61e864a26678d85b2b39933b9bae [ITV] Overhaul ITV extractor
    * 33db85c571304bbd6863e3407ad8d08764c9e53b [feat]: Add support to external downloader aria2p
2023-02-17 17:52:23 +05:30
pukkandan
a538772969
[cleanup] Misc
Closes #5897
2023-02-17 17:52:22 +05:30
HobbyistDev
30031be974
[extractor/tempo] Add IVXPlayer extractor (#5837)
Authored by: HobbyistDev
2023-02-17 14:46:46 +05:30
HobbyistDev
9acca71237
[extractor/boxcast] Add extractor (#5983)
Authored by: HobbyistDev
Closes #5769
2023-02-17 14:35:46 +05:30
Henrik Heimbuerger
d50ea3ce5a
[extractor/nebula] Remove broken cookie support (#5979)
Authored by: hheimbuerger
Closes #4002
2023-02-17 14:02:55 +05:30
bashonly
c61cf091a5
[extractor/youtube] uploader_id includes @ with handle
Authored by: bashonly
2023-02-17 02:14:45 -06:00
Chris Caruso
f737fb16d8
[ExtractAudio] Handle outtmpl without ext (#6005)
Authored by: carusocr
Closes #5968
2023-02-17 13:36:15 +05:30
Friedrich Rehren
5e1a54f63e
[extractor/SportDeutschland] Fix extractor (#6041)
Authored by: FriedrichRehren
Closes #3005
2023-02-17 13:14:26 +05:30
HobbyistDev
31c279a2a2
[extractor/hypergryph] Add extractor (#6094)
Authored by: HobbyistDev, bashonly
Closes #6052
2023-02-17 09:33:04 +05:30
HobbyistDev
a4ad59ff2d
[extractor/anchorfm] Add episode extractor (#6092)
Authored by: HobbyistDev, bashonly
Closes #6081
2023-02-17 09:29:04 +05:30
Alex Ionescu
b25d6cb963
[utils] Fix race condition in make_dir (#6089)
Authored by: aionescu
2023-02-17 08:59:32 +05:30
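The change itself is not shown on this page. As a general illustration only (not the code from this commit), a check-then-create sequence can fail when another process creates the directory between the check and the call. Asking the OS to tolerate an existing directory avoids that window. The helper name `ensure_parent_dir` is hypothetical.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the parent directory of `path`, tolerating concurrent creation.

    Hypothetical helper for illustration; not yt-dlp's make_dir.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True keeps this safe if another process creates the
        # directory between any earlier existence check and this call.
        os.makedirs(parent, exist_ok=True)
    return parent


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, 'a', 'b', 'video.mp4')
        ensure_parent_dir(target)
        ensure_parent_dir(target)  # second call is a no-op, not an error
        print(os.path.isdir(os.path.dirname(target)))
```

A path with no directory component returns an empty string and creates nothing.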
HobbyistDev
3616300155
[extractor/yappy] Add extractor (#6111)
Authored by: HobbyistDev
Closes #3522
2023-02-17 08:49:24 +05:30
qbnu
e4a8b1769e
[extractor/vocaroo] Add extractor (#6117)
Authored by: qbnu, SuperSonicHub1
Closes #6152
2023-02-17 08:48:07 +05:30
JChris246
da880559a6
[extractor/ebay] Add extractor (#6170)
Closes #6134
Authored by: JChris246
2023-02-17 08:44:33 +05:30
Felix Yan
65e5c021e7
[utils] Don't use Content-length with encoding (#6176)
Authored by: felixonmars
Closes #3772, #6178
2023-02-17 08:38:45 +05:30
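As background for this commit title: in HTTP, `Content-Length` gives the size of the body as transferred, so when a `Content-Encoding` such as gzip is applied it does not describe the decoded data. The sketch below shows one way to respect that. It is not the code from this commit, and `expected_body_size` and the plain-dict `headers` argument (with exact-case keys) are hypothetical.

```python
def expected_body_size(headers):
    """Return the declared size only when it describes the decoded body.

    Hypothetical helper; `headers` is assumed to be a plain dict.
    """
    encoding = headers.get('Content-Encoding', 'identity').strip().lower()
    if encoding not in ('', 'identity'):
        # The length counts compressed bytes, so it says nothing reliable
        # about the size of the data after decoding.
        return None
    length = headers.get('Content-Length')
    return int(length) if length and length.isdigit() else None


if __name__ == '__main__':
    print(expected_body_size({'Content-Length': '100'}))
    print(expected_body_size({'Content-Length': '100', 'Content-Encoding': 'gzip'}))
```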
OIRNOIR
a9189510ba
[extractor/nitter] Update instance list (#6236)
Authored by: OIRNOIR
2023-02-17 08:36:16 +05:30
HobbyistDev
10fd9e6ee8
[extractor/odkmedia] Add OnDemandChinaEpisodeIE (#6116)
Authored by: HobbyistDev, pukkandan
2023-02-17 08:30:07 +05:30
HobbyistDev
72671a212d
[extractor/viu] Add ViuOTTIndonesiaIE extractor (#6099)
Authored by: HobbyistDev
Closes #1757
2023-02-17 08:27:52 +05:30
Siddhartha Sahu
376aa24b15
Improve default subtitle language selection (#6240)
Authored by: sdht0
2023-02-17 01:25:01 +05:30
Simon Sawicki
c9d14bd22a
[extractor/crunchyroll] Fix incorrect premium-only error
Closes #6234

Authored by: Grub4K
2023-02-16 15:54:11 +01:00
bashonly
149eb0bbf3
[extractor/youtube] Fix uploader_id extraction
Closes #6247
Authored by: bashonly
2023-02-16 08:51:45 -06:00
pukkandan
9ebac35577
Bugfix for 39f32f1715c0dffb7626dda7307db6388bb7abaa
when `--ignore-no-formats-error`
2023-02-16 17:06:54 +05:30
bashonly
8b37c58f8b
[extractor/nfl] Add NFLPlus extractors (#6222)
Closes #6165
Authored by: bashonly
2023-02-14 02:57:24 +00:00
Greg Sadetsky
d3bb187f01
[extractor/NZOnScreen] Add extractor (#6208)
Authored by: gregsadetsky, pukkandan
Closes #6193
2023-02-14 08:22:27 +05:30
pukkandan
44699d10dc
[extractor/crunchyroll] Better message for premium videos
Closes #6227
2023-02-14 01:07:07 +05:30
Marenga
a9c685453f
[extractor/vk] Fix playlists for new API (#6122)
Authored by: the-marenga
Closes #6219
2023-02-13 11:37:47 +05:30
pukkandan
c154302c58
Bugfix for 39f32f1715c0dffb7626dda7307db6388bb7abaa
2023-02-13 01:35:54 +05:30
pukkandan
5712943b76
Imply --no-progress when --print
2023-02-13 01:19:51 +05:30
pukkandan
39f32f1715
Sanitize formats before sorting
Closes #4501
2023-02-13 01:19:51 +05:30
shirt
365b900605
[Build] Update pyinstaller
2023-02-12 10:57:57 -05:00
nixxo
c6b657867a
[extractor/rcs] Fix extractors (#5700)
Authored by: nixxo, pukkandan
Closes #5683
2023-02-12 20:13:20 +05:30
Lesmiscore
a4f1683221
[extractor/AbemaTV] Cache user token whenever appropriate (#6216)
Authored by: Lesmiscore
2023-02-12 23:02:09 +09:00
Simon Sawicki
b6795fd310
[extractor/twitter] Fix --no-playlist and add media view_count when using GraphQL (#6211)
Authored by: Grub4K
2023-02-12 14:43:26 +01:00
pukkandan
2e269bd998
[pyinst] Fix for pyinstaller 5.8
Fixes comment https://github.com/yt-dlp/yt-dlp/issues/1839#issuecomment-1427002271
2023-02-12 18:43:21 +05:30
Bruno Guerreiro
78a78fa74d
[extractor/youtube] Add hyperpipe instances (#6020)
Authored by: Generator
2023-02-12 14:03:45 +05:30
HobbyistDev
0ba87dd279
[extractor/biliintl] Add intro and ending chapters (#6018)
Authored by: HobbyistDev
2023-02-12 13:24:36 +05:30
Roland Hieber
05799a48c7
[extractor/youtube] Update invidious and piped instances (#6030)
Authored by: rohieb
2023-02-12 13:22:07 +05:30
ByteDream
93abb7406b
[extractor/crunchyroll] Add intro chapter (#6023)
Authored by: ByteDream
2023-02-12 13:17:12 +05:30
LowSuggestion912
b23167e754
[extractor/common] Fix _search_nuxt_data (#6062)
Authored by: LowSuggestion912
2023-02-12 12:55:24 +05:30
Chris Caruso
417cdaae08
[extractor/ximalaya] Update album _VALID_URL (#6110)
Authored by: carusocr
Closes #6059
2023-02-12 10:23:24 +05:30
sepro
b3eaab7ca2
[extractor/vlive] Replace with VLiveWebArchiveIE (#6196)
vlive has shut down: https://web.archive.org/web/20221031171019/https://www.vlive.tv/notice/4749

Authored by: seproDev
2023-02-12 10:17:03 +05:30
lauren n. liberda
a31d0fa6c3
[extractor/tvp] Support stream.tvp.pl (#6139)
Authored by: selfisekai
2023-02-12 10:13:10 +05:30
sepro
cc2389c8ac
[extractor/npo] Fix extractor and add HD support (#6155)
Authored by: seproDev
2023-02-12 10:05:24 +05:30
Chris Caruso
20266508dd
[extractor/bfmtv] Support rmc prefix (#6025)
Authored by: carusocr
Closes #6021
2023-02-12 09:59:41 +05:30
qulaz
cc13293c28
[extractor/clyp] Support wav (#6102)
Authored by: qulaz
2023-02-12 09:58:15 +05:30
oxamun
989f47b631
[extractor/tnaflix] Fix extractor (#6086)
Closes #6085
Authored by: oxamun, bashonly
2023-02-12 09:51:29 +05:30
JChris246
7d5f919bad
[extractor/Stripchat] Fix extractor (#5985)
Authored by bashonly, JChris246
Closes #5963, closes #5866
2023-02-12 09:47:37 +05:30
panatexxa
c62e64cf01
[extractor/moviepilot] Fix extractor (#5954)
Authored by: panatexxa
2023-02-12 09:45:16 +05:30
pmitchell86
c085cc2def
[extractor/91porn] Fix title and comment extraction (#5932)
Authored by: pmitchell86
Fixes #3256
2023-02-12 09:43:31 +05:30
Alex Berg
7708df8da0
[extractor/Hidive] Fix subtitles and age-restriction (#5828)
Authored by: chexxor
Closes #408
2023-02-12 09:17:52 +05:30
pukkandan
b85faf6ffb
[devscripts/pyinstaller] Analyze sub-modules of Cryptodome
Ref: https://github.com/yt-dlp/yt-dlp/issues/6185#issuecomment-1423523986
2023-02-12 03:07:32 +05:30
Master
203a06f855
[extractor/radiko] Fix format sorting for Time Free (#6159)
Authored by: road-master
2023-02-11 19:24:10 +09:00
Simon Sawicki
6839ae1f6d
[utils] traverse_obj: Fix more bugs
and cleanup uses of `default=[]`

Continued from b1bde57bef878478e3503ab07190fd207914ade9
2023-02-10 19:36:55 +05:30
LeoniePhiline
c0cd13fb1c
[extractor/vimeo] Fix playerConfig extraction (#6203)
Authored by: bashonly, LeoniePhiline
Closes #6149
2023-02-10 19:20:29 +05:30
Ha Tien Loi
f14c233348
[extractor/DouyuTV]: Use new API (#6074)
Authored by: hatienl0i261299
2023-02-09 02:11:04 +05:30
pukkandan
768a001781
[compat_utils] Simplify EnhancedModule
2023-02-09 01:47:13 +05:30
pukkandan
acb1042a9f
[devscripts] Provide pyinstaller hooks
Closes #6185
2023-02-09 01:46:56 +05:30
Stefan Lobbenmeier
f40e32fb1a
[extractor/servus] Rewrite extractor (#6036)
Closes #1076, closes #4240, closes #2748, closes #1045, closes #1498
Authored by: FrankZ85, Ashish0804, StefanLobbenmeier

Co-authored-by: FrankZ85 <43293037+FrankZ85@users.noreply.github.com>
2023-02-08 11:35:32 +05:30
bashonly
e61acb40b2
[extractor/wrestleuniverse] Add extractors (#6158)
Authored by bashonly, Grub4K
Closes #6120

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2023-02-08 11:12:11 +05:30
bashonly
7e68567e50
[downloader/hls] Allow extractors to provide AES key (#6158)
and related cleanup

Authored by: bashonly, Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2023-02-08 11:09:32 +05:30
JChris246
f7efe6dc95
[extractor/pornez] Handle relative URLs in iframe (#6171)
Authored by: JChris246
Closes #6162
2023-02-08 10:50:19 +05:30
Simon Sawicki
b1bde57bef
[utils] traverse_obj: Fix several behavioral problems
See #6180 for further info

Authored by: Grub4K
2023-02-08 04:11:08 +01:00
pukkandan
88426d9446
[compat_utils] Improve passthrough_module
2023-02-08 08:23:36 +05:30
pukkandan
f6a765ceb5
[dependencies] Standardize Cryptodome imports
2023-02-08 07:28:46 +05:30
pukkandan
754c84e2e4
Support module level __bool__ and property
2023-02-08 07:28:45 +05:30
pukkandan
7aefd19afe
Make title completely non-fatal
Ref: https://github.com/yt-dlp/yt-dlp/pull/6158#discussion_r1096984349
2023-02-07 01:18:04 +05:30
Felix Yan
fbbb5508ea
[extractor/huya] Support HD streams (#6172)
Authored by: felixonmars
2023-02-07 00:54:47 +05:30
OMEGA_RAZER
c77df98b1a
[extractor/reddit] Support user posts (#6173)
Authored by: OMEGARAZER
2023-02-06 19:21:39 +05:30
Jeroen Jacobs
d27bde9883
[extractor/GoPlay] Use new API (#6151)
Authored by: jeroenj
Closes #6032
2023-02-04 04:12:43 +05:30
sepro
0fe87a8730
[extractor/zdf] Use android API endpoint for UHD downloads (#6150)
Authored by: seproDev
2023-02-04 04:08:29 +05:30
Matumo
3b161265ad
[extractor/niconico] Add support for like history (#5705)
Authored by: Matumo, pukkandan
2023-02-04 00:20:06 +05:30
chio0hai
389896df85
[extractor/txxx] Add extractors (#5240)
Authored by: chio0hai
Closes #5021
2023-02-04 00:17:00 +05:30
pukkandan
b032ff0f03
[extractor/youtube] Handle consent.youtube
2023-02-03 23:53:42 +05:30
pukkandan
dad2210c0c
[extractor/youtube] Support /live/ URL
2023-02-03 23:53:41 +05:30
Jasper Rebane
9cfdbcbf3f
[extractor/freesound] Workaround invalid URL in webpage (#6147)
Authored by: rebane2001
Closes #6146
2023-02-03 20:08:51 +05:30
lauren n. liberda
7543c9c99b
[extractor/twitter] Fix graphql extraction on some tweets (#6075)
Authored by: selfisekai
2023-02-02 19:02:14 +05:30
Simon Sawicki
acacb57c7e
[extractor/rumble] Fix format sorting
Closes #6119
Authored by: pukkandan
2023-02-02 07:12:36 +01:00
Simon Sawicki
776995bc10
[utils] traverse_obj: Various improvements
- Add `set` key for transformations/filters
- Add `re.Match` group names
- Fix behavior for `expected_type` with `dict` key
- Raise for filter function signature mismatch in debug

Authored by: Grub4K
2023-02-02 06:40:19 +01:00
pukkandan
8b008d6254
[jsinterp] Support if statements
Closes #6131
2023-02-01 09:40:16 +05:30
Lesmiscore
83c4970e52
[utils] Fix time_seconds to use the provided TZ (#6118)
Authored by: Lesmiscore, Grub4K

Fixes https://github.com/yt-dlp/yt-dlp/pull/6056
2023-01-31 22:30:00 +09:00
bashonly
8aa0bd5d10
[extractor/generic] Avoid catastrophic backtracking in KVS regex
Authored by: bashonly
2023-01-29 00:59:37 -06:00
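The regex touched by this commit is not shown on this page. As a generic illustration of the problem named in the title, nested quantifiers can force a backtracking engine to try exponentially many ways to split the same run of characters before it reports a failed match. The patterns and the name `same_verdict` below are illustrative only and unrelated to the KVS extractor.

```python
import re

# Nested quantifiers: on a non-matching input the engine may try
# exponentially many ways to split the run of "a" characters.
RISKY = re.compile(r'^(?:a+)+$')

# Same set of accepted strings with a single quantifier, so a failed
# match is rejected after one linear scan.
SAFE = re.compile(r'^a+$')


def same_verdict(text):
    """Check that both patterns agree on whether `text` matches."""
    return bool(RISKY.match(text)) == bool(SAFE.match(text))


if __name__ == '__main__':
    # Only short inputs go through RISKY, to keep the run time trivial.
    for sample in ('a', 'aaaa', 'aaab', '', 'b'):
        assert same_verdict(sample)
    # A long adversarial input is only fed to the SAFE pattern.
    print(SAFE.match('a' * 10000 + 'b'))
```

Rewriting the pattern so that each character can be consumed in only one way is the usual remedy; the accepted strings stay the same.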
Simon Sawicki
37e325b92f
[utils] Use local kernel32 for file locking on Windows
Ref: https://github.com/ytdl-org/youtube-dl/issues/21545

Authored by: Grub4K
2023-01-25 22:32:07 +01:00
pukkandan
59d7de0da5
Fix --concat-playlist
Closes #6080
2023-01-24 03:43:48 +05:30
pukkandan
88d8928bf7
[plugins] Fix zip search paths
Closes #6011
2023-01-20 23:35:34 +05:30
bashonly
176a068cde
[extractor/nbc] Fix XML parsing
Python 3.7 compat bug in cb73b8460c3ce6d37ab651a4e44bb23b10056154
Authored by: bashonly
2023-01-16 15:38:33 -06:00
bashonly
5ab3534d44
[extractor/slideslive] Fix slides and chapters/duration (#6024)
* Fix slides/thumbnails extraction
* Extract duration to fix issues w/ `--embed-chapters`, `--split-chapters`
* Add `InfoExtractor._extract_mpd_vod_duration` method
* Expand applicability of `InfoExtractor._parse_m3u8_vod_duration` method
Authored by: bashonly
2023-01-14 19:52:03 +00:00
bashonly
cb73b8460c
[extractor/nbc] Fix NBC and NBCStations extractors (#6033)
Improve `InfoExtractor._parse_smil_formats` extension detection
Closes #6019
Authored by: bashonly
2023-01-14 16:40:42 +00:00
bashonly
7481998b16
[extractor/drtv] Fix bug in ab4cbef (#6034)
Fixes bug in ab4cbef ab4cbeff00ac08f142f78a6281aa0c1124a59daa
Closes #5993
Authored by: bashonly
2023-01-14 16:35:47 +00:00
pukkandan
87ebab0615
[extractor/embedly] Embedded links may be for other extractors
Bug in bfd973ece3369c593b5e82a88cc16de80088a73e
Closes #5987
2023-01-08 00:39:12 +05:30
Marek Hudik
355d781bed
[extractor/rozhlas] Add extractor RozhlasVltavaIE (#5951)
Authored by: amra
2023-01-07 20:37:10 +05:30
127 changed files with 5599 additions and 2332 deletions

View File

@@ -7,7 +7,7 @@ body:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: checkboxes
id: checklist
@@ -18,13 +18,13 @@ body:
options:
- label: I'm reporting a broken site
required: true
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.02.17** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
@@ -62,7 +62,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.02.17 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -70,8 +70,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
Latest version: 2023.02.17, Current version: 2023.02.17
yt-dlp is up to date (2023.02.17)
<more lines>
render: shell
validations:

View File

@@ -7,7 +7,7 @@ body:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: checkboxes
id: checklist
@@ -18,13 +18,13 @@ body:
options:
- label: I'm reporting a new site support request
required: true
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.02.17** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
- label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
@@ -74,7 +74,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.02.17 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -82,8 +82,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
Latest version: 2023.02.17, Current version: 2023.02.17
yt-dlp is up to date (2023.02.17)
<more lines>
render: shell
validations:

View File

@@ -7,7 +7,7 @@ body:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: checkboxes
id: checklist
@@ -18,11 +18,11 @@ body:
options:
- label: I'm requesting a site-specific feature
required: true
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.02.17** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
@@ -70,7 +70,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.02.17 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -78,8 +78,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
Latest version: 2023.02.17, Current version: 2023.02.17
yt-dlp is up to date (2023.02.17)
<more lines>
render: shell
validations:

View File

@@ -7,7 +7,7 @@ body:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: checkboxes
id: checklist
@@ -18,13 +18,13 @@ body:
options:
- label: I'm reporting a bug unrelated to a specific site
required: true
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.02.17** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
@@ -55,7 +55,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.02.17 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -63,8 +63,8 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
Latest version: 2023.02.17, Current version: 2023.02.17
yt-dlp is up to date (2023.02.17)
<more lines>
render: shell
validations:

View File

@@ -7,7 +7,7 @@ body:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: checkboxes
id: checklist
@@ -20,9 +20,9 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.02.17** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
@@ -51,7 +51,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.02.17 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -59,7 +59,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
Latest version: 2023.02.17, Current version: 2023.02.17
yt-dlp is up to date (2023.02.17)
<more lines>
render: shell

View File

@@ -7,7 +7,7 @@ body:
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\* field
required: true
- type: markdown
attributes:
@@ -26,9 +26,9 @@ body:
required: true
- label: I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
required: true
- label: I've verified that I'm running yt-dlp version **2023.01.06** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
- label: I've verified that I'm running yt-dlp version **2023.02.17** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true
@@ -57,7 +57,7 @@ body:
[debug] Command-line config: ['-vU', 'test:youtube']
[debug] Portable config "yt-dlp.conf": ['-i']
[debug] Encodings: locale cp65001, fs utf-8, pref cp65001, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [9d339c4] (win32_exe)
[debug] yt-dlp version 2023.02.17 [9d339c4] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22000-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
@@ -65,7 +65,7 @@ body:
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
Latest version: 2023.02.17, Current version: 2023.02.17
yt-dlp is up to date (2023.02.17)
<more lines>
render: shell

View File

@@ -18,7 +18,7 @@ body:
required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true

View File

@@ -18,7 +18,7 @@ body:
required: true
- label: I've checked that none of provided URLs [violate any copyrights](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#is-the-website-primarily-used-for-piracy) or contain any [DRM](https://en.wikipedia.org/wiki/Digital_rights_management) to the best of my knowledge
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true

View File

@@ -16,7 +16,7 @@ body:
required: true
- label: I've checked that all provided URLs are playable in a browser with the same IP and same login details
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true

View File

@@ -18,7 +18,7 @@ body:
required: true
- label: I've checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/yt-dlp/yt-dlp/wiki/FAQ#video-url-contains-an-ampersand--and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true

@@ -16,7 +16,7 @@ body:
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar issues **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true

@@ -22,7 +22,7 @@ body:
required: true
- label: I've verified that I'm running yt-dlp version **%(version)s** ([update instructions](https://github.com/yt-dlp/yt-dlp#update)) or later (specify commit)
required: true
- label: I've searched the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
- label: I've searched [known issues](https://github.com/yt-dlp/yt-dlp/issues/3766) and the [bugtracker](https://github.com/yt-dlp/yt-dlp/issues?q=) for similar questions **including closed ones**. DO NOT post duplicates
required: true
- label: I've read the [guidelines for opening an issue](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#opening-an-issue)
required: true

@@ -30,7 +30,7 @@ Fixes #
- [ ] [Searched](https://github.com/yt-dlp/yt-dlp/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests
- [ ] Checked the code with [flake8](https://pypi.python.org/pypi/flake8) and [ran relevant tests](https://github.com/yt-dlp/yt-dlp/blob/master/CONTRIBUTING.md#developer-instructions)
### In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options:
### In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check all of the following options that apply:
- [ ] I am the original author of this code and I am willing to release it under [Unlicense](http://unlicense.org/)
- [ ] I am not the original author of this code but it is in public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence)

@@ -255,7 +255,7 @@ jobs:
- name: Install Requirements
run: | # Custom pyinstaller built with https://github.com/yt-dlp/pyinstaller-builds
python -m pip install -U pip setuptools wheel py2exe
pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/x86_64/pyinstaller-5.8.0-py3-none-any.whl" -r requirements.txt
- name: Prepare
run: |
@@ -291,7 +291,7 @@ jobs:
- name: Install Requirements
run: |
python -m pip install -U pip setuptools wheel
pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.3-py3-none-any.whl" -r requirements.txt
pip install -U "https://yt-dlp.github.io/Pyinstaller-Builds/i686/pyinstaller-5.8.0-py3-none-any.whl" -r requirements.txt
- name: Prepare
run: |

@@ -4,6 +4,7 @@ coletdjnz/colethedj (collaborator)
Ashish0804 (collaborator)
nao20010128nao/Lesmiscore (collaborator)
bashonly (collaborator)
Grub4K (collaborator)
h-h-h-h
pauldubois98
nixxo
@@ -319,7 +320,6 @@ columndeeply
DoubleCouponDay
Fabi019
GautamMKGarg
Grub4K
itachi-19
jeroenj
josanabr
@@ -381,3 +381,27 @@ gschizas
JC-Chung
mzhou
OndrejBakan
ab4cbef
aionescu
amra
ByteDream
carusocr
chexxor
felixonmars
FrankZ85
FriedrichRehren
gregsadetsky
LeoniePhiline
LowSuggestion912
Matumo
OIRNOIR
OMEGARAZER
oxamun
pmitchell86
qbnu
qulaz
rebane2001
road-master
rohieb
sdht0
seproDev

@@ -10,6 +10,259 @@
* Dispatch the workflow https://github.com/yt-dlp/yt-dlp/actions/workflows/build.yml on master
-->
# 2023.02.17
* Merge youtube-dl: Upto [commit/2dd6c6e](https://github.com/ytdl-org/youtube-dl/commit/2dd6c6e)
* Fix `--concat-playlist`
* Imply `--no-progress` when `--print`
* Improve default subtitle language selection by [sdht0](https://github.com/sdht0)
* Make `title` completely non-fatal
* Sanitize formats before sorting by [pukkandan](https://github.com/pukkandan)
* Support module level `__bool__` and `property`
* [dependencies] Standardize `Cryptodome` imports
* [hls] Allow extractors to provide AES key by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [ExtractAudio] Handle outtmpl without ext by [carusocr](https://github.com/carusocr)
* [extractor/common] Fix `_search_nuxt_data` by [LowSuggestion912](https://github.com/LowSuggestion912)
* [extractor/generic] Avoid catastrophic backtracking in KVS regex by [bashonly](https://github.com/bashonly)
* [jsinterp] Support `if` statements
* [plugins] Fix zip search paths
* [utils] `traverse_obj`: Various improvements by [Grub4K](https://github.com/Grub4K)
* [utils] `traverse_obj`: Fix more bugs
* [utils] `traverse_obj`: Fix several behavioral problems by [Grub4K](https://github.com/Grub4K)
* [utils] Don't use Content-length with encoding by [felixonmars](https://github.com/felixonmars)
* [utils] Fix `time_seconds` to use the provided TZ by [Grub4K](https://github.com/Grub4K), [Lesmiscore](https://github.com/Lesmiscore)
* [utils] Fix race condition in `make_dir` by [aionescu](https://github.com/aionescu)
* [utils] Use local kernel32 for file locking on Windows by [Grub4K](https://github.com/Grub4K)
* [compat_utils] Improve `passthrough_module`
* [compat_utils] Simplify `EnhancedModule`
* [build] Update pyinstaller
* [pyinst] Fix for pyinstaller 5.8
* [devscripts] Provide `pyinstaller` hooks
* [devscripts/pyinstaller] Analyze sub-modules of `Cryptodome`
* [cleanup] Misc fixes and cleanup
* [extractor/anchorfm] Add episode extractor by [HobbyistDev](https://github.com/HobbyistDev), [bashonly](https://github.com/bashonly)
* [extractor/boxcast] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/ebay] Add extractor by [JChris246](https://github.com/JChris246)
* [extractor/hypergryph] Add extractor by [HobbyistDev](https://github.com/HobbyistDev), [bashonly](https://github.com/bashonly)
* [extractor/NZOnScreen] Add extractor by [gregsadetsky](https://github.com/gregsadetsky), [pukkandan](https://github.com/pukkandan)
* [extractor/rozhlas] Add extractor RozhlasVltavaIE by [amra](https://github.com/amra)
* [extractor/tempo] Add IVXPlayer extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/txxx] Add extractors by [chio0hai](https://github.com/chio0hai)
* [extractor/vocaroo] Add extractor by [SuperSonicHub1](https://github.com/SuperSonicHub1), [qbnu](https://github.com/qbnu)
* [extractor/wrestleuniverse] Add extractors by [Grub4K](https://github.com/Grub4K), [bashonly](https://github.com/bashonly)
* [extractor/yappy] Add extractor by [HobbyistDev](https://github.com/HobbyistDev)
* **[extractor/youtube] Fix `uploader_id` extraction** by [bashonly](https://github.com/bashonly)
* [extractor/youtube] Add hyperpipe instances by [Generator](https://github.com/Generator)
* [extractor/youtube] Handle `consent.youtube`
* [extractor/youtube] Support `/live/` URL
* [extractor/youtube] Update invidious and piped instances by [rohieb](https://github.com/rohieb)
* [extractor/91porn] Fix title and comment extraction by [pmitchell86](https://github.com/pmitchell86)
* [extractor/AbemaTV] Cache user token whenever appropriate by [Lesmiscore](https://github.com/Lesmiscore)
* [extractor/bfmtv] Support `rmc` prefix by [carusocr](https://github.com/carusocr)
* [extractor/biliintl] Add intro and ending chapters by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/clyp] Support `wav` by [qulaz](https://github.com/qulaz)
* [extractor/crunchyroll] Add intro chapter by [ByteDream](https://github.com/ByteDream)
* [extractor/crunchyroll] Better message for premium videos
* [extractor/crunchyroll] Fix incorrect premium-only error by [Grub4K](https://github.com/Grub4K)
* [extractor/DouyuTV] Use new API by [hatienl0i261299](https://github.com/hatienl0i261299)
* [extractor/embedly] Embedded links may be for other extractors
* [extractor/freesound] Workaround invalid URL in webpage by [rebane2001](https://github.com/rebane2001)
* [extractor/GoPlay] Use new API by [jeroenj](https://github.com/jeroenj)
* [extractor/Hidive] Fix subtitles and age-restriction by [chexxor](https://github.com/chexxor)
* [extractor/huya] Support HD streams by [felixonmars](https://github.com/felixonmars)
* [extractor/moviepilot] Fix extractor by [panatexxa](https://github.com/panatexxa)
* [extractor/nbc] Fix `NBC` and `NBCStations` extractors by [bashonly](https://github.com/bashonly)
* [extractor/nbc] Fix XML parsing by [bashonly](https://github.com/bashonly)
* [extractor/nebula] Remove broken cookie support by [hheimbuerger](https://github.com/hheimbuerger)
* [extractor/nfl] Add `NFLPlus` extractors by [bashonly](https://github.com/bashonly)
* [extractor/niconico] Add support for like history by [Matumo](https://github.com/Matumo), [pukkandan](https://github.com/pukkandan)
* [extractor/nitter] Update instance list by [OIRNOIR](https://github.com/OIRNOIR)
* [extractor/npo] Fix extractor and add HD support by [seproDev](https://github.com/seproDev)
* [extractor/odkmedia] Add `OnDemandChinaEpisodeIE` by [HobbyistDev](https://github.com/HobbyistDev), [pukkandan](https://github.com/pukkandan)
* [extractor/pornez] Handle relative URLs in iframe by [JChris246](https://github.com/JChris246)
* [extractor/radiko] Fix format sorting for Time Free by [road-master](https://github.com/road-master)
* [extractor/rcs] Fix extractors by [nixxo](https://github.com/nixxo), [pukkandan](https://github.com/pukkandan)
* [extractor/reddit] Support user posts by [OMEGARAZER](https://github.com/OMEGARAZER)
* [extractor/rumble] Fix format sorting by [pukkandan](https://github.com/pukkandan)
* [extractor/servus] Rewrite extractor by [Ashish0804](https://github.com/Ashish0804), [FrankZ85](https://github.com/FrankZ85), [StefanLobbenmeier](https://github.com/StefanLobbenmeier)
* [extractor/slideslive] Fix slides and chapters/duration by [bashonly](https://github.com/bashonly)
* [extractor/SportDeutschland] Fix extractor by [FriedrichRehren](https://github.com/FriedrichRehren)
* [extractor/Stripchat] Fix extractor by [JChris246](https://github.com/JChris246), [bashonly](https://github.com/bashonly)
* [extractor/tnaflix] Fix extractor by [bashonly](https://github.com/bashonly), [oxamun](https://github.com/oxamun)
* [extractor/tvp] Support `stream.tvp.pl` by [selfisekai](https://github.com/selfisekai)
* [extractor/twitter] Fix `--no-playlist` and add media `view_count` when using GraphQL by [Grub4K](https://github.com/Grub4K)
* [extractor/twitter] Fix graphql extraction on some tweets by [selfisekai](https://github.com/selfisekai)
* [extractor/vimeo] Fix `playerConfig` extraction by [LeoniePhiline](https://github.com/LeoniePhiline), [bashonly](https://github.com/bashonly)
* [extractor/viu] Add `ViuOTTIndonesiaIE` extractor by [HobbyistDev](https://github.com/HobbyistDev)
* [extractor/vk] Fix playlists for new API by [the-marenga](https://github.com/the-marenga)
* [extractor/vlive] Replace with `VLiveWebArchiveIE` by [seproDev](https://github.com/seproDev)
* [extractor/ximalaya] Update album `_VALID_URL` by [carusocr](https://github.com/carusocr)
* [extractor/zdf] Use android API endpoint for UHD downloads by [seproDev](https://github.com/seproDev)
* [extractor/drtv] Fix bug in [ab4cbef](https://github.com/yt-dlp/yt-dlp/commit/ab4cbef) by [bashonly](https://github.com/bashonly)
### 2023.02.17
#### Core changes
### Core changes
- [Bugfix for 39f32f1715c0dffb7626dda7307db6388bb7abaa](https://github.com/yt-dlp/yt-dlp/commit/9ebac35577e61c3d25fafc959655fa3ab04ca7ef) by [pukkandan](https://github.com/pukkandan)
- [Bugfix for 39f32f1715c0dffb7626dda7307db6388bb7abaa](https://github.com/yt-dlp/yt-dlp/commit/c154302c588c3d4362cec4fc5545e7e5d2bcf7a3) by [pukkandan](https://github.com/pukkandan)
- [Fix `--concat-playlist`](https://github.com/yt-dlp/yt-dlp/commit/59d7de0da545944c48a82fc2937b996d7cd8cc9c) by [pukkandan](https://github.com/pukkandan)
- [Imply `--no-progress` when `--print`](https://github.com/yt-dlp/yt-dlp/commit/5712943b764ba819ef479524c32700228603817a) by [pukkandan](https://github.com/pukkandan)
- [Improve default subtitle language selection](https://github.com/yt-dlp/yt-dlp/commit/376aa24b1541e2bfb23337c0ae9bafa5bb3787f1) ([#6240](https://github.com/yt-dlp/yt-dlp/issues/6240)) by [sdht0](https://github.com/sdht0)
- [Make `title` completely non-fatal](https://github.com/yt-dlp/yt-dlp/commit/7aefd19afed357c80743405ec2ace2148cba42e3) by [pukkandan](https://github.com/pukkandan)
- [Sanitize formats before sorting](https://github.com/yt-dlp/yt-dlp/commit/39f32f1715c0dffb7626dda7307db6388bb7abaa) by [pukkandan](https://github.com/pukkandan)
- [Support module level `__bool__` and `property`](https://github.com/yt-dlp/yt-dlp/commit/754c84e2e416cf6609dd0e4632b4985a08d34043) by [pukkandan](https://github.com/pukkandan)
- [Update to ytdl-commit-2dd6c6e](https://github.com/yt-dlp/yt-dlp/commit/48fde8ac4ccbaaea868f6378814dde395f649fbf) by [pukkandan](https://github.com/pukkandan)
- [extractor/douyutv]: [Use new API](https://github.com/yt-dlp/yt-dlp/commit/f14c2333481c63c24017a41ded7d8f36726504b7) ([#6074](https://github.com/yt-dlp/yt-dlp/issues/6074)) by [hatienl0i261299](https://github.com/hatienl0i261299)
- compat_utils
- [Improve `passthrough_module`](https://github.com/yt-dlp/yt-dlp/commit/88426d9446758c707fb511408f2d6f56de952db4) by [pukkandan](https://github.com/pukkandan)
- [Simplify `EnhancedModule`](https://github.com/yt-dlp/yt-dlp/commit/768a00178109508893488e53a0e720b117fbccf6) by [pukkandan](https://github.com/pukkandan)
- dependencies
- [Standardize `Cryptodome` imports](https://github.com/yt-dlp/yt-dlp/commit/f6a765ceb59c55aea06921880c1c87d1ff36e5de) by [pukkandan](https://github.com/pukkandan)
- jsinterp
- [Support `if` statements](https://github.com/yt-dlp/yt-dlp/commit/8b008d62544b82e24a0ba36c30e8e51855d93419) by [pukkandan](https://github.com/pukkandan)
- plugins
- [Fix zip search paths](https://github.com/yt-dlp/yt-dlp/commit/88d8928bf7630801865cf8728ae5c77234324b7b) by [pukkandan](https://github.com/pukkandan)
- utils
- [Don't use Content-length with encoding](https://github.com/yt-dlp/yt-dlp/commit/65e5c021e7c5f23ecbc6a982b72a02ac6cd6900d) ([#6176](https://github.com/yt-dlp/yt-dlp/issues/6176)) by [felixonmars](https://github.com/felixonmars)
- [Fix `time_seconds` to use the provided TZ](https://github.com/yt-dlp/yt-dlp/commit/83c4970e52839ce8761ec61bd19d549aed7d7920) ([#6118](https://github.com/yt-dlp/yt-dlp/issues/6118)) by [Grub4K](https://github.com/Grub4K), [Lesmiscore](https://github.com/Lesmiscore)
- [Fix race condition in `make_dir`](https://github.com/yt-dlp/yt-dlp/commit/b25d6cb96337d479bdcb41768356da414c3aa835) ([#6089](https://github.com/yt-dlp/yt-dlp/issues/6089)) by [aionescu](https://github.com/aionescu)
- [Use local kernel32 for file locking on Windows](https://github.com/yt-dlp/yt-dlp/commit/37e325b92ff9d784715ac0e5d1f7d96bf5f45ad9) by [Grub4K](https://github.com/Grub4K)
- traverse_obj
- [Fix more bugs](https://github.com/yt-dlp/yt-dlp/commit/6839ae1f6dde4c0442619e351b3f0442312ab4f9) by [pukkandan](https://github.com/pukkandan)
- [Fix several behavioral problems](https://github.com/yt-dlp/yt-dlp/commit/b1bde57bef878478e3503ab07190fd207914ade9) by [Grub4K](https://github.com/Grub4K)
- [Various improvements](https://github.com/yt-dlp/yt-dlp/commit/776995bc109c5cd1aa56b684fada2ce718a386ec) by [Grub4K](https://github.com/Grub4K)
### Extractor changes
- [Fix `_search_nuxt_data`](https://github.com/yt-dlp/yt-dlp/commit/b23167e7542c177f32b22b29857b637dc4aede69) ([#6062](https://github.com/yt-dlp/yt-dlp/issues/6062)) by [LowSuggestion912](https://github.com/LowSuggestion912)
- 91porn
- [Fix title and comment extraction](https://github.com/yt-dlp/yt-dlp/commit/c085cc2def9862ac8a7619ce8ea5dcc177325719) ([#5932](https://github.com/yt-dlp/yt-dlp/issues/5932)) by [pmitchell86](https://github.com/pmitchell86)
- abematv
- [Cache user token whenever appropriate](https://github.com/yt-dlp/yt-dlp/commit/a4f16832213d9e29beecf685d6cd09a2f0b48c87) ([#6216](https://github.com/yt-dlp/yt-dlp/issues/6216)) by [Lesmiscore](https://github.com/Lesmiscore)
- anchorfm
- [Add episode extractor](https://github.com/yt-dlp/yt-dlp/commit/a4ad59ff2ded208bf33f6fe07299a3449eadccdc) ([#6092](https://github.com/yt-dlp/yt-dlp/issues/6092)) by [bashonly](https://github.com/bashonly), [HobbyistDev](https://github.com/HobbyistDev)
- bfmtv
- [Support `rmc` prefix](https://github.com/yt-dlp/yt-dlp/commit/20266508dd6247dd3cf0e97b9b9f14c3afc046db) ([#6025](https://github.com/yt-dlp/yt-dlp/issues/6025)) by [carusocr](https://github.com/carusocr)
- biliintl
- [Add intro and ending chapters](https://github.com/yt-dlp/yt-dlp/commit/0ba87dd279d3565ed93c559cf7880ad61eb83af8) ([#6018](https://github.com/yt-dlp/yt-dlp/issues/6018)) by [HobbyistDev](https://github.com/HobbyistDev)
- boxcast
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/9acca71237f42a4775008e51fe26e42f0a39c552) ([#5983](https://github.com/yt-dlp/yt-dlp/issues/5983)) by [HobbyistDev](https://github.com/HobbyistDev)
- clyp
- [Support `wav`](https://github.com/yt-dlp/yt-dlp/commit/cc13293c2819b5461be211a9729fd02bb1e2f476) ([#6102](https://github.com/yt-dlp/yt-dlp/issues/6102)) by [qulaz](https://github.com/qulaz)
- crunchyroll
- [Add intro chapter](https://github.com/yt-dlp/yt-dlp/commit/93abb7406b95793f6872d12979b91d5f336b4f43) ([#6023](https://github.com/yt-dlp/yt-dlp/issues/6023)) by [ByteDream](https://github.com/ByteDream)
- [Better message for premium videos](https://github.com/yt-dlp/yt-dlp/commit/44699d10dc8de9c6a338f4a8e5c63506ec4d2118) by [pukkandan](https://github.com/pukkandan)
- [Fix incorrect premium-only error](https://github.com/yt-dlp/yt-dlp/commit/c9d14bd22ab31e2a41f9f8061843668a06db583b) by [Grub4K](https://github.com/Grub4K)
- drtv
- [Fix bug in ab4cbef](https://github.com/yt-dlp/yt-dlp/commit/7481998b169b2a52049fc33bff82034d6563ead4) ([#6034](https://github.com/yt-dlp/yt-dlp/issues/6034)) by [bashonly](https://github.com/bashonly)
- ebay
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/da880559a6ecbbf374cc9f3378e696b55b9599af) ([#6170](https://github.com/yt-dlp/yt-dlp/issues/6170)) by [JChris246](https://github.com/JChris246)
- embedly
- [Embedded links may be for other extractors](https://github.com/yt-dlp/yt-dlp/commit/87ebab0615b1bf9b14b478b055e7059d630b4833) by [pukkandan](https://github.com/pukkandan)
- freesound
- [Workaround invalid URL in webpage](https://github.com/yt-dlp/yt-dlp/commit/9cfdbcbf3f17be51f5b6bb9bb6d880b2f3d67362) ([#6147](https://github.com/yt-dlp/yt-dlp/issues/6147)) by [rebane2001](https://github.com/rebane2001)
- generic
- [Avoid catastrophic backtracking in KVS regex](https://github.com/yt-dlp/yt-dlp/commit/8aa0bd5d10627ece3c1815c01d02fb8bf22847a7) by [bashonly](https://github.com/bashonly)
- goplay
- [Use new API](https://github.com/yt-dlp/yt-dlp/commit/d27bde98832e3b7ffb39f3cf6346011b97bb3bc3) ([#6151](https://github.com/yt-dlp/yt-dlp/issues/6151)) by [jeroenj](https://github.com/jeroenj)
- hidive
- [Fix subtitles and age-restriction](https://github.com/yt-dlp/yt-dlp/commit/7708df8da05c94270b43e0630e4e20f6d2d62c55) ([#5828](https://github.com/yt-dlp/yt-dlp/issues/5828)) by [chexxor](https://github.com/chexxor)
- huya
- [Support HD streams](https://github.com/yt-dlp/yt-dlp/commit/fbbb5508ea98ed8709847f5ecced7d70ff05e0ee) ([#6172](https://github.com/yt-dlp/yt-dlp/issues/6172)) by [felixonmars](https://github.com/felixonmars)
- hypergryph
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/31c279a2a2c2ef402a9e6dad9992b310d16439a6) ([#6094](https://github.com/yt-dlp/yt-dlp/issues/6094)) by [bashonly](https://github.com/bashonly), [HobbyistDev](https://github.com/HobbyistDev)
- moviepilot
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c62e64cf0122e52fa2175dd1b004ca6b8e1d82af) ([#5954](https://github.com/yt-dlp/yt-dlp/issues/5954)) by [panatexxa](https://github.com/panatexxa)
- nbc
- [Fix XML parsing](https://github.com/yt-dlp/yt-dlp/commit/176a068cde4f2d9dfa0336168caead0b1edcb8ac) by [bashonly](https://github.com/bashonly)
- [Fix `NBC` and `NBCStations` extractors](https://github.com/yt-dlp/yt-dlp/commit/cb73b8460c3ce6d37ab651a4e44bb23b10056154) ([#6033](https://github.com/yt-dlp/yt-dlp/issues/6033)) by [bashonly](https://github.com/bashonly)
- nebula
- [Remove broken cookie support](https://github.com/yt-dlp/yt-dlp/commit/d50ea3ce5abc3b0defc0e5d1e22b22ce9b01b07b) ([#5979](https://github.com/yt-dlp/yt-dlp/issues/5979)) by [hheimbuerger](https://github.com/hheimbuerger)
- nfl
- [Add `NFLPlus` extractors](https://github.com/yt-dlp/yt-dlp/commit/8b37c58f8b5494504acdb5ebe3f8bbd26230f725) ([#6222](https://github.com/yt-dlp/yt-dlp/issues/6222)) by [bashonly](https://github.com/bashonly)
- niconico
- [Add support for like history](https://github.com/yt-dlp/yt-dlp/commit/3b161265add30613bde2e46fca214fe94d09e651) ([#5705](https://github.com/yt-dlp/yt-dlp/issues/5705)) by [Matumo](https://github.com/Matumo), [pukkandan](https://github.com/pukkandan)
- nitter
- [Update instance list](https://github.com/yt-dlp/yt-dlp/commit/a9189510baadf0dccd2d4d363bc6f3a441128bb0) ([#6236](https://github.com/yt-dlp/yt-dlp/issues/6236)) by [OIRNOIR](https://github.com/OIRNOIR)
- npo
- [Fix extractor and add HD support](https://github.com/yt-dlp/yt-dlp/commit/cc2389c8ac72a514d4e002a0f6ca5a7d65c7eff0) ([#6155](https://github.com/yt-dlp/yt-dlp/issues/6155)) by [seproDev](https://github.com/seproDev)
- nzonscreen
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/d3bb187f01e1e30db05e639fc23a2e1935d777fe) ([#6208](https://github.com/yt-dlp/yt-dlp/issues/6208)) by [gregsadetsky](https://github.com/gregsadetsky), [pukkandan](https://github.com/pukkandan)
- odkmedia
- [Add `OnDemandChinaEpisodeIE`](https://github.com/yt-dlp/yt-dlp/commit/10fd9e6ee833c88edf6c633f864f42843a708d32) ([#6116](https://github.com/yt-dlp/yt-dlp/issues/6116)) by [HobbyistDev](https://github.com/HobbyistDev), [pukkandan](https://github.com/pukkandan)
- pornez
- [Handle relative URLs in iframe](https://github.com/yt-dlp/yt-dlp/commit/f7efe6dc958eb0689cb9534ff0b4e592040be8df) ([#6171](https://github.com/yt-dlp/yt-dlp/issues/6171)) by [JChris246](https://github.com/JChris246)
- radiko
- [Fix format sorting for Time Free](https://github.com/yt-dlp/yt-dlp/commit/203a06f8554df6db07d8f20f465ecbfe8a14e591) ([#6159](https://github.com/yt-dlp/yt-dlp/issues/6159)) by [road-master](https://github.com/road-master)
- rcs
- [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/c6b657867ad68af6b930ed0aa11ec5d93ee187b7) ([#5700](https://github.com/yt-dlp/yt-dlp/issues/5700)) by [nixxo](https://github.com/nixxo), [pukkandan](https://github.com/pukkandan)
- reddit
- [Support user posts](https://github.com/yt-dlp/yt-dlp/commit/c77df98b1a477a020a57141464d10c0f4d0fdbc9) ([#6173](https://github.com/yt-dlp/yt-dlp/issues/6173)) by [OMEGARAZER](https://github.com/OMEGARAZER)
- rozhlas
- [Add extractor RozhlasVltavaIE](https://github.com/yt-dlp/yt-dlp/commit/355d781bed497cbcb254bf2a2737b83fa51c84ea) ([#5951](https://github.com/yt-dlp/yt-dlp/issues/5951)) by [amra](https://github.com/amra)
- rumble
- [Fix format sorting](https://github.com/yt-dlp/yt-dlp/commit/acacb57c7e173b93c6e0f0c43e61b9b2912719d8) by [pukkandan](https://github.com/pukkandan)
- servus
- [Rewrite extractor](https://github.com/yt-dlp/yt-dlp/commit/f40e32fb1ac67be5bdbc8e32a3c235abfc4be260) ([#6036](https://github.com/yt-dlp/yt-dlp/issues/6036)) by [Ashish0804](https://github.com/Ashish0804), [FrankZ85](https://github.com/FrankZ85), [StefanLobbenmeier](https://github.com/StefanLobbenmeier)
- slideslive
- [Fix slides and chapters/duration](https://github.com/yt-dlp/yt-dlp/commit/5ab3534d44231f7711398bc3cfc520e2efd09f50) ([#6024](https://github.com/yt-dlp/yt-dlp/issues/6024)) by [bashonly](https://github.com/bashonly)
- sportdeutschland
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/5e1a54f63e393c218a40949012ff0de0ce63cb15) ([#6041](https://github.com/yt-dlp/yt-dlp/issues/6041)) by [FriedrichRehren](https://github.com/FriedrichRehren)
- stripchat
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7d5f919bad07017f4b39b55725491b1e9717d47a) ([#5985](https://github.com/yt-dlp/yt-dlp/issues/5985)) by [bashonly](https://github.com/bashonly), [JChris246](https://github.com/JChris246)
- tempo
- [Add IVXPlayer extractor](https://github.com/yt-dlp/yt-dlp/commit/30031be974d210f451100339699ef03b0ddb5f10) ([#5837](https://github.com/yt-dlp/yt-dlp/issues/5837)) by [HobbyistDev](https://github.com/HobbyistDev)
- tnaflix
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/989f47b6315541989bb507f26b431d9586430995) ([#6086](https://github.com/yt-dlp/yt-dlp/issues/6086)) by [bashonly](https://github.com/bashonly), [oxamun](https://github.com/oxamun)
- tvp
- [Support `stream.tvp.pl`](https://github.com/yt-dlp/yt-dlp/commit/a31d0fa6c315b1145d682361149003d98f1e3782) ([#6139](https://github.com/yt-dlp/yt-dlp/issues/6139)) by [selfisekai](https://github.com/selfisekai)
- twitter
- [Fix `--no-playlist` and add media `view_count` when using GraphQL](https://github.com/yt-dlp/yt-dlp/commit/b6795fd310f1dd61dddc9fd08e52fe485bdc8a3e) ([#6211](https://github.com/yt-dlp/yt-dlp/issues/6211)) by [Grub4K](https://github.com/Grub4K)
- [Fix graphql extraction on some tweets](https://github.com/yt-dlp/yt-dlp/commit/7543c9c99bcb116b085fdb1f41b84a0ead04c05d) ([#6075](https://github.com/yt-dlp/yt-dlp/issues/6075)) by [selfisekai](https://github.com/selfisekai)
- txxx
- [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/389896df85ed14eaf74f72531da6c4491d6b73b0) ([#5240](https://github.com/yt-dlp/yt-dlp/issues/5240)) by [chio0hai](https://github.com/chio0hai)
- vimeo
- [Fix `playerConfig` extraction](https://github.com/yt-dlp/yt-dlp/commit/c0cd13fb1c71b842c3d272d0273c03542b467766) ([#6203](https://github.com/yt-dlp/yt-dlp/issues/6203)) by [bashonly](https://github.com/bashonly), [LeoniePhiline](https://github.com/LeoniePhiline)
- viu
- [Add `ViuOTTIndonesiaIE` extractor](https://github.com/yt-dlp/yt-dlp/commit/72671a212d7c939329cb5d34335fa089dd3acbd3) ([#6099](https://github.com/yt-dlp/yt-dlp/issues/6099)) by [HobbyistDev](https://github.com/HobbyistDev)
- vk
- [Fix playlists for new API](https://github.com/yt-dlp/yt-dlp/commit/a9c685453f7019bee94170f936619c6db76c964e) ([#6122](https://github.com/yt-dlp/yt-dlp/issues/6122)) by [the-marenga](https://github.com/the-marenga)
- vlive
- [Replace with `VLiveWebArchiveIE`](https://github.com/yt-dlp/yt-dlp/commit/b3eaab7ca2e118d4db73dcb44afd9c8717db8b67) ([#6196](https://github.com/yt-dlp/yt-dlp/issues/6196)) by [seproDev](https://github.com/seproDev)
- vocaroo
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/e4a8b1769e19755acba6d8f212208359905a3159) ([#6117](https://github.com/yt-dlp/yt-dlp/issues/6117)) by [qbnu](https://github.com/qbnu), [SuperSonicHub1](https://github.com/SuperSonicHub1)
- wrestleuniverse
- [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/e61acb40b2cb6ef45508d72235026d458c9d5dff) ([#6158](https://github.com/yt-dlp/yt-dlp/issues/6158)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)
- ximalaya
- [Update album `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/417cdaae08fc447c9d15c53a88e2e9a027cdbf0a) ([#6110](https://github.com/yt-dlp/yt-dlp/issues/6110)) by [carusocr](https://github.com/carusocr)
- yappy
- [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/361630015535026712bdb67f804a15b65ff9ee7e) ([#6111](https://github.com/yt-dlp/yt-dlp/issues/6111)) by [HobbyistDev](https://github.com/HobbyistDev)
- youtube
- [Add hyperpipe instances](https://github.com/yt-dlp/yt-dlp/commit/78a78fa74dbc888d20f1b65e1382bf99131597d5) ([#6020](https://github.com/yt-dlp/yt-dlp/issues/6020)) by [Generator](https://github.com/Generator)
- [Fix `uploader_id` extraction](https://github.com/yt-dlp/yt-dlp/commit/149eb0bbf34fa8fdf8d1e2aa28e17479d099e26b) by [bashonly](https://github.com/bashonly)
- [Handle `consent.youtube`](https://github.com/yt-dlp/yt-dlp/commit/b032ff0f032512bd6fc70c9c1994d906eacc06cb) by [pukkandan](https://github.com/pukkandan)
- [Support `/live/` URL](https://github.com/yt-dlp/yt-dlp/commit/dad2210c0cb9cf03702a9511817ee5ec646d7bc8) by [pukkandan](https://github.com/pukkandan)
- [Update invidious and piped instances](https://github.com/yt-dlp/yt-dlp/commit/05799a48c7dec12b34c8bf951c8d2eceedda59f8) ([#6030](https://github.com/yt-dlp/yt-dlp/issues/6030)) by [rohieb](https://github.com/rohieb)
- [`uploader_id` includes `@` with handle](https://github.com/yt-dlp/yt-dlp/commit/c61cf091a54d3aa3c611722035ccde5ecfe981bb) by [bashonly](https://github.com/bashonly)
- zdf
- [Use android API endpoint for UHD downloads](https://github.com/yt-dlp/yt-dlp/commit/0fe87a8730638490415d630f48e61d264d89c358) ([#6150](https://github.com/yt-dlp/yt-dlp/issues/6150)) by [seproDev](https://github.com/seproDev)
### Downloader changes
- hls
- [Allow extractors to provide AES key](https://github.com/yt-dlp/yt-dlp/commit/7e68567e508168b345266c0c19812ad50a829eaa) ([#6158](https://github.com/yt-dlp/yt-dlp/issues/6158)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)
### Postprocessor changes
- extractaudio
- [Handle outtmpl without ext](https://github.com/yt-dlp/yt-dlp/commit/f737fb16d8234408c85bc189ccc926fea000515b) ([#6005](https://github.com/yt-dlp/yt-dlp/issues/6005)) by [carusocr](https://github.com/carusocr)
- pyinst
- [Fix for pyinstaller 5.8](https://github.com/yt-dlp/yt-dlp/commit/2e269bd998c61efaf7500907d114a56e5e83e65e) by [pukkandan](https://github.com/pukkandan)
### Misc. changes
- build
- [Update pyinstaller](https://github.com/yt-dlp/yt-dlp/commit/365b9006051ac7d735c20bb63c4907b758233048) by [pukkandan](https://github.com/pukkandan)
- cleanup
- Miscellaneous: [76c9c52](https://github.com/yt-dlp/yt-dlp/commit/76c9c523071150053df7b56956646b680b6a6e05) by [pukkandan](https://github.com/pukkandan)
- devscripts
- [Provide pyinstaller hooks](https://github.com/yt-dlp/yt-dlp/commit/acb1042a9ffa8769fe691beac1011d6da1fcf321) by [pukkandan](https://github.com/pukkandan)
- pyinstaller
- [Analyze sub-modules of `Cryptodome`](https://github.com/yt-dlp/yt-dlp/commit/b85faf6ffb700058e774e99c04304a7a9257cdd0) by [pukkandan](https://github.com/pukkandan)
### 2023.01.06

@@ -8,6 +8,7 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
## [pukkandan](https://github.com/pukkandan)
[![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/pukkandan)
[![gh-sponsor](https://img.shields.io/badge/_-Github-red.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/pukkandan)
* Owner of the fork
@@ -25,8 +26,9 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
## [coletdjnz](https://github.com/coletdjnz)
[![gh-sponsor](https://img.shields.io/badge/_-Sponsor-red.svg?logo=githubsponsors&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/coletdjnz)
[![gh-sponsor](https://img.shields.io/badge/_-Github-red.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/coletdjnz)
* Improved plugin architecture
* YouTube improvements including: age-gate bypass, private playlists, multiple-clients (to avoid throttling) and a lot of under-the-hood improvements
* Added support for new websites YoutubeWebArchive, MainStreaming, PRX, nzherald, Mediaklikk, StarTV etc
* Improved/fixed support for Patreon, panopto, gfycat, itv, pbs, SouthParkDE etc
@ -57,3 +59,11 @@ You can also find lists of all [contributors of yt-dlp](CONTRIBUTORS) and [autho
* `--cookies-from-browser` support for Firefox containers
* Added support for new websites Genius, Kick, NBCStations, Triller, VideoKen etc
* Improved/fixed support for Anvato, Brightcove, Instagram, ParamountPlus, Reddit, SlidesLive, TikTok, Twitter, Vimeo etc
## [Grub4K](https://github.com/Grub4K)
[![ko-fi](https://img.shields.io/badge/_-Ko--fi-red.svg?logo=kofi&labelColor=555555&style=for-the-badge)](https://ko-fi.com/Grub4K) [![gh-sponsor](https://img.shields.io/badge/_-Github-red.svg?logo=github&labelColor=555555&style=for-the-badge)](https://github.com/sponsors/Grub4K)
* Rework internals like `traverse_obj`, various core refactors and bugs fixes
* Helped fix crunchyroll, Twitter, wrestleuniverse, wistia, slideslive etc

View File

@ -74,7 +74,7 @@ offlinetest: codetest
$(PYTHON) -m pytest -k "not download"
# XXX: This is hard to maintain
CODE_FOLDERS = yt_dlp yt_dlp/downloader yt_dlp/extractor yt_dlp/postprocessor yt_dlp/compat
CODE_FOLDERS = yt_dlp yt_dlp/downloader yt_dlp/extractor yt_dlp/postprocessor yt_dlp/compat yt_dlp/dependencies
yt-dlp: yt_dlp/*.py yt_dlp/*/*.py
mkdir -p zip
for d in $(CODE_FOLDERS) ; do \

View File

@ -76,7 +76,7 @@ yt-dlp is a [youtube-dl](https://github.com/ytdl-org/youtube-dl) fork based on t
# NEW FEATURES
* Merged with **youtube-dl v2021.12.17+ [commit/195f22f](https://github.com/ytdl-org/youtube-dl/commit/195f22f)** <!--([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21))--> and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* Merged with **youtube-dl v2021.12.17+ [commit/2dd6c6e](https://github.com/ytdl-org/youtube-dl/commit/2dd6c6e)** ([exceptions](https://github.com/yt-dlp/yt-dlp/issues/21)) and **youtube-dlc v2020.11.11-3+ [commit/f9401f2](https://github.com/blackjack4494/yt-dlc/commit/f9401f2a91987068139c5f757b12fc711d4c0cee)**: You get all the features and patches of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) in addition to the latest [youtube-dl](https://github.com/ytdl-org/youtube-dl)
* **[SponsorBlock Integration](#sponsorblock-options)**: You can mark/remove sponsor sections in YouTube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
@ -788,7 +788,7 @@ You can also fork the project on GitHub and run your fork's [build workflow](.gi
--prefer-insecure Use an unencrypted connection to retrieve
information about the video (Currently
supported only for YouTube)
--add-header FIELD:VALUE Specify a custom HTTP header and its value,
--add-headers FIELD:VALUE Specify a custom HTTP header and its value,
separated by a colon ":". You can use this
option multiple times
--bidi-workaround Work around terminals that lack
@ -1511,7 +1511,7 @@ The available fields are:
- `source`: The preference of the source
- `proto`: Protocol used for download (`https`/`ftps` > `http`/`ftp` > `m3u8_native`/`m3u8` > `http_dash_segments`> `websocket_frag` > `mms`/`rtsp` > `f4f`/`f4m`)
- `vcodec`: Video Codec (`av01` > `vp9.2` > `vp9` > `h265` > `h264` > `vp8` > `h263` > `theora` > other)
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` `ac4` > > `eac3` > `ac3` > `dts` > other)
- `acodec`: Audio Codec (`flac`/`alac` > `wav`/`aiff` > `opus` > `vorbis` > `aac` > `mp4a` > `mp3` > `ac4` > `eac3` > `ac3` > `dts` > other)
- `codec`: Equivalent to `vcodec,acodec`
- `vext`: Video Extension (`mp4` > `mov` > `webm` > `flv` > other). If `--prefer-free-formats` is used, `webm` is preferred.
- `aext`: Audio Extension (`m4a` > `aac` > `mp3` > `ogg` > `opus` > `webm` > other). If `--prefer-free-formats` is used, the order changes to `ogg` > `opus` > `webm` > `mp3` > `m4a` > `aac`
@ -1741,6 +1741,8 @@ $ yt-dlp --replace-in-metadata "title,uploader" "[ _]" "-"
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=android_embedded,web;include_live_dash" --extractor-args "funimation:version=uncut"`
Note: In CLI, `ARG` can use `-` instead of `_`; e.g. `youtube:player-client` becomes `youtube:player_client`
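As a rough illustration of the `KEY:ARGS` syntax described above, the following sketch splits such a string into an extractor key and a mapping of arguments to value lists. The function name `parse_extractor_args` is hypothetical and this is not yt-dlp's own parser; in particular, mapping a bare argument such as `include_live_dash` to an empty list is an assumption made for this example.

```python
def parse_extractor_args(spec):
    """Split 'KEY:ARG=VAL1,VAL2;ARG2' into (key, {arg: [values]}).

    Hypothetical helper for illustration only; not part of yt-dlp.
    """
    key, _, args = spec.partition(':')
    parsed = {}
    for item in args.split(';'):
        if not item:
            continue  # ignore empty segments such as a trailing ';'
        name, _, values = item.partition('=')
        # On the CLI, `-` may be used in place of `_` in ARG
        name = name.strip().replace('-', '_')
        # Assumption: an argument without '=' gets an empty value list
        parsed[name] = [v.strip() for v in values.split(',')] if values else []
    return key.strip(), parsed


key, args = parse_extractor_args(
    'youtube:player-client=android_embedded,web;include_live_dash')
print(key, args)
```

Here `player-client` and `player_client` end up under the same dictionary key, which is the equivalence the note describes.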
The following extractors use this feature:
#### youtube
@ -1887,7 +1889,7 @@ with YoutubeDL() as ydl:
ydl.download(URLS)
```
Most likely, you'll want to use various options. For a list of options available, have a look at [`yt_dlp/YoutubeDL.py`](yt_dlp/YoutubeDL.py#L180).
Most likely, you'll want to use various options. For a list of options available, have a look at [`yt_dlp/YoutubeDL.py`](yt_dlp/YoutubeDL.py#L184).
**Tip**: If you are porting your code from youtube-dl to yt-dlp, one important point to look out for is that we do not guarantee the return value of `YoutubeDL.extract_info` to be json serializable, or even be a dictionary. It will be dictionary-like, but if you want to ensure it is a serializable dictionary, pass it through `YoutubeDL.sanitize_info` as shown in the [example below](#extracting-information)

View File

@ -58,7 +58,7 @@ NO_SKIP = '''
label: DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE
description: Fill all fields even if you think it is irrelevant for the issue
options:
- label: I understand that I will be **blocked** if I remove or skip any mandatory\\* field
- label: I understand that I will be **blocked** if I *intentionally* remove or skip any mandatory\\* field
required: true
'''.strip()

View File

@ -37,7 +37,7 @@ def main():
'--icon=devscripts/logo.ico',
'--upx-exclude=vcruntime140.dll',
'--noconfirm',
*dependency_options(),
'--additional-hooks-dir=yt_dlp/__pyinstaller',
*opts,
'yt_dlp/__main__.py',
]
@ -77,30 +77,6 @@ def version_to_list(version):
return list(map(int, version_list)) + [0] * (4 - len(version_list))
def dependency_options():
# Due to the current implementation, these are auto-detected, but explicitly add them just in case
dependencies = [pycryptodome_module(), 'mutagen', 'brotli', 'certifi', 'websockets']
excluded_modules = ('youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts')
yield from (f'--hidden-import={module}' for module in dependencies)
yield '--collect-submodules=websockets'
yield from (f'--exclude-module={module}' for module in excluded_modules)
def pycryptodome_module():
try:
import Cryptodome # noqa: F401
except ImportError:
try:
import Crypto # noqa: F401
print('WARNING: Using Crypto since Cryptodome is not available. '
'Install with: pip install pycryptodomex', file=sys.stderr)
return 'Crypto'
except ImportError:
pass
return 'Cryptodome'
def set_version_info(exe, version):
if OS_NAME == 'win32':
windows_set_version(exe, version)
@ -109,7 +85,6 @@ def set_version_info(exe, version):
def windows_set_version(exe, version):
from PyInstaller.utils.win32.versioninfo import (
FixedFileInfo,
SetVersion,
StringFileInfo,
StringStruct,
StringTable,
@ -118,6 +93,11 @@ def windows_set_version(exe, version):
VSVersionInfo,
)
try:
from PyInstaller.utils.win32.versioninfo import SetVersion
except ImportError: # Pyinstaller >= 5.8
from PyInstaller.utils.win32.versioninfo import write_version_info_to_executable as SetVersion
version_list = version_to_list(version)
suffix = MACHINE and f'_{MACHINE}'
SetVersion(exe, VSVersionInfo(

View File

@ -92,7 +92,10 @@ def build_params():
params = {'data_files': data_files}
if setuptools_available:
params['entry_points'] = {'console_scripts': ['yt-dlp = yt_dlp:main']}
params['entry_points'] = {
'console_scripts': ['yt-dlp = yt_dlp:main'],
'pyinstaller40': ['hook-dirs = yt_dlp.__pyinstaller:get_hook_dirs'],
}
else:
params['scripts'] = ['yt-dlp']
return params

View File

@ -63,14 +63,15 @@
- **AluraCourse**: [<abbr title="netrc machine"><em>aluracourse</em></abbr>]
- **Amara**
- **AmazonMiniTV**
- **amazonminitv:season**: Amazon MiniTV Series, "minitv:season:" prefix
- **amazonminitv:series**
- **amazonminitv:season**: Amazon MiniTV Season, "minitv:season:" prefix
- **amazonminitv:series**: Amazon MiniTV Series, "minitv:series:" prefix
- **AmazonReviews**
- **AmazonStore**
- **AMCNetworks**
- **AmericasTestKitchen**
- **AmericasTestKitchenSeason**
- **AmHistoryChannel**
- **AnchorFMEpisode**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **Angel**
- **AnimalPlanet**
@ -177,6 +178,7 @@
- **BlackboardCollaborate**
- **BleacherReport**
- **BleacherReportCMS**
- **blerp**
- **blogger.com**
- **Bloomberg**
- **BokeCC**
@ -184,6 +186,7 @@
- **BooyahClips**
- **BostonGlobe**
- **Box**
- **BoxCastVideo**
- **Bpb**: Bundeszentrale für politische Bildung
- **BR**: Bayerischer Rundfunk
- **BravoTV**
@ -364,6 +367,7 @@
- **dw:article**
- **EaglePlatform**
- **EbaumsWorld**
- **Ebay**
- **EchoMsk**
- **egghead:course**: egghead.io course
- **egghead:lesson**: egghead.io lesson
@ -595,6 +599,7 @@
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV
- **IVXPlayer**
- **Iwara**
- **iwara:playlist**
- **iwara:user**
@ -626,6 +631,7 @@
- **KickVOD**
- **KinjaEmbed**
- **KinoPoisk**
- **Kommunetv**
- **KompasVideo**
- **KonserthusetPlay**
- **Koo**
@ -773,6 +779,7 @@
- **Mofosex**
- **MofosexEmbed**
- **Mojvideo**
- **MonsterSirenHypergryphMusic**
- **Morningstar**: morningstar.com
- **Motherless**
- **MotherlessGroup**
@ -878,6 +885,8 @@
- **NFHSNetwork**
- **nfl.com**
- **nfl.com:article**
- **nfl.com:plus:episode**
- **nfl.com:plus:replay**
- **NhkForSchoolBangumi**
- **NhkForSchoolProgramList**
- **NhkForSchoolSubject**: Portal page for each school subjects, like Japanese (kokugo, 国語) or math (sansuu/suugaku or 算数・数学)
@ -890,7 +899,7 @@
- **nickelodeonru**
- **nicknight**
- **niconico**: [<abbr title="netrc machine"><em>niconico</em></abbr>] ニコニコ動画
- **niconico:history**: NicoNico user history. Requires cookies.
- **niconico:history**: NicoNico user history or likes. Requires cookies.
- **niconico:playlist**
- **niconico:series**
- **niconico:tag**: NicoNico video tag URLs
@ -940,6 +949,7 @@
- **NYTimesArticle**
- **NYTimesCooking**
- **nzherald**
- **NZOnScreen**
- **NZZ**
- **ocw.mit.edu**
- **OdaTV**
@ -949,6 +959,7 @@
- **OktoberfestTV**
- **OlympicsReplay**
- **on24**: ON24
- **OnDemandChinaEpisode**
- **OnDemandKorea**
- **OneFootball**
- **OnePlacePodcast**
@ -1063,7 +1074,10 @@
- **Pornotube**
- **PornoVoisines**
- **PornoXO**
- **PornTop**
- **PornTube**
- **Pr0gramm**
- **Pr0grammStatic**
- **PrankCast**
- **PremiershipRugby**
- **PressTV**
@ -1115,6 +1129,8 @@
- **RaiSudtirol**
- **RayWenderlich**
- **RayWenderlichCourse**
- **RbgTum**
- **RbgTumCourse**
- **RBMARadio**
- **RCS**
- **RCSEmbeds**
@ -1149,6 +1165,7 @@
- **RoosterTeethSeries**: [<abbr title="netrc machine"><em>roosterteeth</em></abbr>]
- **RottenTomatoes**
- **Rozhlas**
- **RozhlasVltava**
- **RTBF**: [<abbr title="netrc machine"><em>rtbf</em></abbr>]
- **RTDocumentry**
- **RTDocumentryPlaylist**
@ -1485,6 +1502,7 @@
- **twitter:card**
- **twitter:shortener**
- **twitter:spaces**
- **Txxx**
- **udemy**: [<abbr title="netrc machine"><em>udemy</em></abbr>]
- **udemy:course**: [<abbr title="netrc machine"><em>udemy</em></abbr>]
- **UDNEmbed**: 聯合影音
@ -1572,14 +1590,13 @@
- **Viu**
- **viu:ott**: [<abbr title="netrc machine"><em>viu</em></abbr>]
- **viu:playlist**
- **ViuOTTIndonesia**
- **Vivo**: vivo.sx
- **vk**: [<abbr title="netrc machine"><em>vk</em></abbr>] VK
- **vk:uservideos**: [<abbr title="netrc machine"><em>vk</em></abbr>] VK - User's Videos
- **vk:wallpost**: [<abbr title="netrc machine"><em>vk</em></abbr>]
- **vlive**: [<abbr title="netrc machine"><em>vlive</em></abbr>]
- **vlive:channel**: [<abbr title="netrc machine"><em>vlive</em></abbr>]
- **vlive:post**: [<abbr title="netrc machine"><em>vlive</em></abbr>]
- **vm.tiktok**
- **Vocaroo**
- **Vodlocker**
- **VODPl**
- **VODPlatform**
@ -1628,6 +1645,7 @@
- **wdr:mobile**: (**Currently broken**)
- **WDRElefant**
- **WDRPage**
- **web.archive:vlive**: web.archive.org saved vlive videos
- **web.archive:youtube**: web.archive.org saved youtube videos, "ytarchive:" prefix
- **Webcamerapl**
- **Webcaster**
@ -1653,6 +1671,8 @@
- **WorldStarHipHop**
- **wppilot**
- **wppilot:channels**
- **WrestleUniversePPV**
- **WrestleUniverseVOD**
- **WSJ**: Wall Street Journal
- **WSJArticle**
- **WWE**
@ -1689,6 +1709,7 @@
- **YandexVideo**
- **YandexVideoPreview**
- **YapFiles**
- **Yappy**
- **YesJapan**
- **yinyuetai:video**: 音悦Tai
- **YleAreena**

View File

@ -69,6 +69,7 @@ class TestInfoExtractor(unittest.TestCase):
<meta name="og:test1" content='foo > < bar'/>
<meta name="og:test2" content="foo >//< bar"/>
<meta property=og-test3 content='Ill-formatted opengraph'/>
<meta property=og:test4 content=unquoted-value/>
'''
self.assertEqual(ie._og_search_title(html), 'Foo')
self.assertEqual(ie._og_search_description(html), 'Some video\'s description ')
@ -81,6 +82,7 @@ class TestInfoExtractor(unittest.TestCase):
self.assertEqual(ie._og_search_property(('test0', 'test1'), html), 'foo > < bar')
self.assertRaises(RegexNotFoundError, ie._og_search_property, 'test0', html, None, fatal=True)
self.assertRaises(RegexNotFoundError, ie._og_search_property, ('test0', 'test00'), html, None, fatal=True)
self.assertEqual(ie._og_search_property('test4', html), 'unquoted-value')
def test_html_search_meta(self):
ie = self.ie

View File

@ -26,7 +26,7 @@ from yt_dlp.aes import (
key_expansion,
pad_block,
)
from yt_dlp.dependencies import Cryptodome_AES
from yt_dlp.dependencies import Cryptodome
from yt_dlp.utils import bytes_to_intlist, intlist_to_bytes
# the encrypted data can be generated with 'devscripts/generate_aes_testdata.py'
@ -48,7 +48,7 @@ class TestAES(unittest.TestCase):
data = b'\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6\x27\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd'
decrypted = intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(data), self.key, self.iv))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
if Cryptodome_AES:
if Cryptodome:
decrypted = aes_cbc_decrypt_bytes(data, intlist_to_bytes(self.key), intlist_to_bytes(self.iv))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
@ -78,7 +78,7 @@ class TestAES(unittest.TestCase):
decrypted = intlist_to_bytes(aes_gcm_decrypt_and_verify(
bytes_to_intlist(data), self.key, bytes_to_intlist(authentication_tag), self.iv[:12]))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
if Cryptodome_AES:
if Cryptodome:
decrypted = aes_gcm_decrypt_and_verify_bytes(
data, intlist_to_bytes(self.key), authentication_tag, intlist_to_bytes(self.iv[:12]))
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)

View File

@ -10,6 +10,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import is_download_test, try_rm
from yt_dlp import YoutubeDL
from yt_dlp.utils import DownloadError
def _download_restricted(url, filename, age):
@ -25,10 +26,14 @@ def _download_restricted(url, filename, age):
ydl.add_default_info_extractors()
json_filename = os.path.splitext(filename)[0] + '.info.json'
try_rm(json_filename)
ydl.download([url])
res = os.path.exists(json_filename)
try_rm(json_filename)
return res
try:
ydl.download([url])
except DownloadError:
pass
else:
return os.path.exists(json_filename)
finally:
try_rm(json_filename)
@is_download_test
@ -38,12 +43,12 @@ class TestAgeRestriction(unittest.TestCase):
self.assertFalse(_download_restricted(url, filename, age))
def test_youtube(self):
self._assert_restricted('07FYdnEawAQ', '07FYdnEawAQ.mp4', 10)
self._assert_restricted('HtVdAasjOgU', 'HtVdAasjOgU.mp4', 10)
def test_youporn(self):
self._assert_restricted(
'http://www.youporn.com/watch/505835/sex-ed-is-it-safe-to-masturbate-daily/',
'505835.mp4', 2, old_age=25)
'https://www.youporn.com/watch/16715086/sex-ed-in-detention-18-asmr/',
'16715086.mp4', 2, old_age=25)
if __name__ == '__main__':

View File

@ -31,6 +31,9 @@ class TestCompat(unittest.TestCase):
# TODO: Test submodule
# compat.asyncio.events # Must not raise error
with self.assertWarns(DeprecationWarning):
compat.compat_pycrypto_AES # Must not raise error
def test_compat_expanduser(self):
old_home = os.environ.get('HOME')
test_str = R'C:\Documents and Settings\тест\Application Data'

View File

@ -155,6 +155,38 @@ class TestJSInterpreter(unittest.TestCase):
self.assertEqual(jsi.call_function('z'), 5)
self.assertEqual(jsi.call_function('y'), 2)
def test_if(self):
jsi = JSInterpreter('''
function x() {
let a = 9;
if (0==0) {a++}
return a
}''')
self.assertEqual(jsi.call_function('x'), 10)
jsi = JSInterpreter('''
function x() {
if (0==0) {return 10}
}''')
self.assertEqual(jsi.call_function('x'), 10)
jsi = JSInterpreter('''
function x() {
if (0!=0) {return 1}
else {return 10}
}''')
self.assertEqual(jsi.call_function('x'), 10)
""" # Unsupported
jsi = JSInterpreter('''
function x() {
if (0!=0) {return 1}
else if (1==0) {return 2}
else {return 10}
}''')
self.assertEqual(jsi.call_function('x'), 10)
"""
def test_for_loop(self):
jsi = JSInterpreter('''
function x() { a=0; for (i=0; i-10; i++) {a++} return a }

View File

@ -105,6 +105,7 @@ from yt_dlp.utils import (
sanitized_Request,
shell_quote,
smuggle_url,
str_or_none,
str_to_int,
strip_jsonp,
strip_or_none,
@ -1999,8 +2000,8 @@ Line 1
# Test Ellipsis behavior
self.assertCountEqual(traverse_obj(_TEST_DATA, ...),
(item for item in _TEST_DATA.values() if item is not None),
msg='`...` should give all values except `None`')
(item for item in _TEST_DATA.values() if item not in (None, {})),
msg='`...` should give all non discarded values')
self.assertCountEqual(traverse_obj(_TEST_DATA, ('urls', 0, ...)), _TEST_DATA['urls'][0].values(),
msg='`...` selection for dicts should select all values')
self.assertEqual(traverse_obj(_TEST_DATA, (..., ..., 'url')),
@ -2015,6 +2016,29 @@ Line 1
msg='function as query key should perform a filter based on (key, value)')
self.assertCountEqual(traverse_obj(_TEST_DATA, lambda _, x: isinstance(x[0], str)), {'str'},
msg='exceptions in the query function should be caught')
if __debug__:
with self.assertRaises(Exception, msg='Wrong function signature should raise in debug'):
traverse_obj(_TEST_DATA, lambda a: ...)
with self.assertRaises(Exception, msg='Wrong function signature should raise in debug'):
traverse_obj(_TEST_DATA, lambda a, b, c: ...)
# Test set as key (transformation/type, like `expected_type`)
self.assertEqual(traverse_obj(_TEST_DATA, (..., {str.upper}, )), ['STR'],
msg='Function in set should be a transformation')
self.assertEqual(traverse_obj(_TEST_DATA, (..., {str})), ['str'],
msg='Type in set should be a type filter')
self.assertEqual(traverse_obj(_TEST_DATA, {dict}), _TEST_DATA,
msg='A single set should be wrapped into a path')
self.assertEqual(traverse_obj(_TEST_DATA, (..., {str.upper})), ['STR'],
msg='Transformation function should not raise')
self.assertEqual(traverse_obj(_TEST_DATA, (..., {str_or_none})),
[item for item in map(str_or_none, _TEST_DATA.values()) if item is not None],
msg='Function in set should be a transformation')
if __debug__:
with self.assertRaises(Exception, msg='Sets with length != 1 should raise in debug'):
traverse_obj(_TEST_DATA, set())
with self.assertRaises(Exception, msg='Sets with length != 1 should raise in debug'):
traverse_obj(_TEST_DATA, {str.upper, str})
# Test alternative paths
self.assertEqual(traverse_obj(_TEST_DATA, 'fail', 'str'), 'str',
@ -2060,15 +2084,23 @@ Line 1
{0: ['https://www.example.com/1', 'https://www.example.com/0']},
msg='triple nesting in dict path should be treated as branches')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'fail'}), {},
msg='remove `None` values when dict key')
msg='remove `None` values when top level dict key fails')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'fail'}, default=...), {0: ...},
msg='do not remove `None` values if `default`')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'dict'}), {0: {}},
msg='do not remove empty values when dict key')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'dict'}, default=...), {0: {}},
msg='do not remove empty values when dict key and a default')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('dict', ...)}), {0: []},
msg='if branch in dict key not successful, return `[]`')
msg='use `default` if key fails and `default`')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'dict'}), {},
msg='remove empty values when dict key')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 'dict'}, default=...), {0: ...},
msg='use `default` when dict key and `default`')
self.assertEqual(traverse_obj(_TEST_DATA, {0: {0: 'fail'}}), {},
msg='remove empty values when nested dict key fails')
self.assertEqual(traverse_obj(None, {0: 'fail'}), {},
msg='default to dict if pruned')
self.assertEqual(traverse_obj(None, {0: 'fail'}, default=...), {0: ...},
msg='default to dict if pruned and default is given')
self.assertEqual(traverse_obj(_TEST_DATA, {0: {0: 'fail'}}, default=...), {0: {0: ...}},
msg='use nested `default` when nested dict key fails and `default`')
self.assertEqual(traverse_obj(_TEST_DATA, {0: ('dict', ...)}), {},
msg='remove key if branch in dict key not successful')
# Testing default parameter behavior
_DEFAULT_DATA = {'None': None, 'int': 0, 'list': []}
@ -2092,20 +2124,55 @@ Line 1
msg='if branched but not successful return `[]`, not `default`')
self.assertEqual(traverse_obj(_DEFAULT_DATA, ('list', ...)), [],
msg='if branched but object is empty return `[]`, not `default`')
self.assertEqual(traverse_obj(None, ...), [],
msg='if branched but object is `None` return `[]`, not `default`')
self.assertEqual(traverse_obj({0: None}, (0, ...)), [],
msg='if branched but state is `None` return `[]`, not `default`')
branching_paths = [
('fail', ...),
(..., 'fail'),
100 * ('fail',) + (...,),
(...,) + 100 * ('fail',),
]
for branching_path in branching_paths:
self.assertEqual(traverse_obj({}, branching_path), [],
msg='if branched but state is `None`, return `[]` (not `default`)')
self.assertEqual(traverse_obj({}, 'fail', branching_path), [],
msg='if branching in last alternative and previous did not match, return `[]` (not `default`)')
self.assertEqual(traverse_obj({0: 'x'}, 0, branching_path), 'x',
msg='if branching in last alternative and previous did match, return single value')
self.assertEqual(traverse_obj({0: 'x'}, branching_path, 0), 'x',
msg='if branching in first alternative and non-branching path does match, return single value')
self.assertEqual(traverse_obj({}, branching_path, 'fail'), None,
msg='if branching in first alternative and non-branching path does not match, return `default`')
# Testing expected_type behavior
_EXPECTED_TYPE_DATA = {'str': 'str', 'int': 0}
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=str), 'str',
msg='accept matching `expected_type` type')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=int), None,
msg='reject non matching `expected_type` type')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'int', expected_type=lambda x: str(x)), '0',
msg='transform type using type function')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str',
expected_type=lambda _: 1 / 0), None,
msg='wrap expected_type fuction in try_call')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, ..., expected_type=str), ['str'],
msg='eliminate items that expected_type fails on')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=str),
'str', msg='accept matching `expected_type` type')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=int),
None, msg='reject non matching `expected_type` type')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'int', expected_type=lambda x: str(x)),
'0', msg='transform type using type function')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, 'str', expected_type=lambda _: 1 / 0),
None, msg='wrap expected_type fuction in try_call')
self.assertEqual(traverse_obj(_EXPECTED_TYPE_DATA, ..., expected_type=str),
['str'], msg='eliminate items that expected_type fails on')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 100, 1: 1.2}, expected_type=int),
{0: 100}, msg='type as expected_type should filter dict values')
self.assertEqual(traverse_obj(_TEST_DATA, {0: 100, 1: 1.2, 2: 'None'}, expected_type=str_or_none),
{0: '100', 1: '1.2'}, msg='function as expected_type should transform dict values')
self.assertEqual(traverse_obj(_TEST_DATA, ({0: 1.2}, 0, {int_or_none}), expected_type=int),
1, msg='expected_type should not filter non final dict values')
self.assertEqual(traverse_obj(_TEST_DATA, {0: {0: 100, 1: 'str'}}, expected_type=int),
{0: {0: 100}}, msg='expected_type should transform deep dict values')
self.assertEqual(traverse_obj(_TEST_DATA, [({0: '...'}, {0: '...'})], expected_type=type(...)),
[{0: ...}, {0: ...}], msg='expected_type should transform branched dict values')
self.assertEqual(traverse_obj({1: {3: 4}}, [(1, 2), 3], expected_type=int),
[4], msg='expected_type regression for type matching in tuple branching')
self.assertEqual(traverse_obj(_TEST_DATA, ['data', ...], expected_type=int),
[], msg='expected_type regression for type matching in dict result')
# Test get_all behavior
_GET_ALL_DATA = {'key': [0, 1, 2]}
@ -2145,14 +2212,17 @@ Line 1
traverse_string=True), '.',
msg='traverse into converted data if `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', ...),
traverse_string=True), list('str'),
msg='`...` branching into string should result in list')
traverse_string=True), 'str',
msg='`...` should result in string (same value) if `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', slice(0, None, 2)),
traverse_string=True), 'sr',
msg='`slice` should result in string if `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', lambda i, v: i or v == "s"),
traverse_string=True), 'str',
msg='function should result in string if `traverse_string`')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', (0, 2)),
traverse_string=True), ['s', 'r'],
msg='branching into string should result in list')
self.assertEqual(traverse_obj(_TRAVERSE_STRING_DATA, ('str', lambda _, x: x),
traverse_string=True), list('str'),
msg='function branching into string should result in list')
msg='branching should result in list if `traverse_string`')
# Test is_user_input behavior
_IS_USER_INPUT_DATA = {'range8': list(range(8))}
@ -2189,6 +2259,8 @@ Line 1
msg='failing str key on a `re.Match` should return `default`')
self.assertEqual(traverse_obj(mobj, 8), None,
msg='failing int key on a `re.Match` should return `default`')
self.assertEqual(traverse_obj(mobj, lambda k, _: k in (0, 'group')), ['0123', '3'],
msg='function on a `re.Match` should give group name as well')
if __name__ == '__main__':

View File

@ -134,6 +134,10 @@ _NSIG_TESTS = [
'https://www.youtube.com/s/player/7a062b77/player_ias.vflset/en_US/base.js',
'NRcE3y3mVtm_cV-W', 'VbsCYUATvqlt5w',
),
(
'https://www.youtube.com/s/player/dac945fd/player_ias.vflset/en_US/base.js',
'o8BkRxXhuYsBCWi6RplPdP', '3Lx32v_hmzTm6A',
),
]

View File

@ -554,7 +554,7 @@ class YoutubeDL:
'vbr', 'fps', 'vcodec', 'container', 'filesize', 'filesize_approx', 'rows', 'columns',
'player_url', 'protocol', 'fragment_base_url', 'fragments', 'is_from_start',
'preference', 'language', 'language_preference', 'quality', 'source_preference',
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'downloader_options',
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'hls_aes', 'downloader_options',
'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time'
}
_format_selection_exts = {
@ -1777,7 +1777,7 @@ class YoutubeDL:
return {
**info,
'playlist_index': 0,
'__last_playlist_index': max(ie_result['requested_entries'] or (0, 0)),
'__last_playlist_index': max(ie_result.get('requested_entries') or (0, 0)),
'extractor': ie_result['extractor'],
'extractor_key': ie_result['extractor_key'],
}
@ -2411,11 +2411,7 @@ class YoutubeDL:
def _fill_common_fields(self, info_dict, final=True):
# TODO: move sanitization here
if final:
title = info_dict.get('title', NO_DEFAULT)
if title is NO_DEFAULT:
raise ExtractorError('Missing "title" field in extractor result',
video_id=info_dict['id'], ie=info_dict['extractor'])
info_dict['fulltitle'] = title
title = info_dict['fulltitle'] = info_dict.get('title')
if not title:
if title == '':
self.write_debug('Extractor gave empty title. Creating a generic title')
@ -2470,15 +2466,8 @@ class YoutubeDL:
def sort_formats(self, info_dict):
formats = self._get_formats(info_dict)
if not formats:
return
# Backward compatibility with InfoExtractor._sort_formats
field_preference = formats[0].pop('__sort_fields', None)
if field_preference:
info_dict['_format_sort_fields'] = field_preference
formats.sort(key=FormatSorter(
self, info_dict.get('_format_sort_fields', [])).calculate_preference)
self, info_dict.get('_format_sort_fields') or []).calculate_preference)
def process_video_result(self, info_dict, download=True):
assert info_dict.get('_type', 'video') == 'video'
@ -2565,9 +2554,13 @@ class YoutubeDL:
info_dict['requested_subtitles'] = self.process_subtitles(
info_dict['id'], subtitles, automatic_captions)
self.sort_formats(info_dict)
formats = self._get_formats(info_dict)
# Backward compatibility with InfoExtractor._sort_formats
field_preference = (formats or [{}])[0].pop('__sort_fields', None)
if field_preference:
info_dict['_format_sort_fields'] = field_preference
# or None ensures --clean-infojson removes it
info_dict['_has_drm'] = any(f.get('has_drm') for f in formats) or None
if not self.params.get('allow_unplayable_formats'):
@ -2605,44 +2598,12 @@ class YoutubeDL:
if not formats:
self.raise_no_formats(info_dict)
formats_dict = {}
# We check that all the formats have the format and format_id fields
for i, format in enumerate(formats):
for format in formats:
sanitize_string_field(format, 'format_id')
sanitize_numeric_fields(format)
format['url'] = sanitize_url(format['url'])
if not format.get('format_id'):
format['format_id'] = str(i)
else:
# Sanitize format_id from characters used in format selector expression
format['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', format['format_id'])
format_id = format['format_id']
if format_id not in formats_dict:
formats_dict[format_id] = []
formats_dict[format_id].append(format)
# Make sure all formats have unique format_id
common_exts = set(itertools.chain(*self._format_selection_exts.values()))
for format_id, ambiguous_formats in formats_dict.items():
ambigious_id = len(ambiguous_formats) > 1
for i, format in enumerate(ambiguous_formats):
if ambigious_id:
format['format_id'] = '%s-%d' % (format_id, i)
if format.get('ext') is None:
format['ext'] = determine_ext(format['url']).lower()
# Ensure there is no conflict between id and ext in format selection
# See https://github.com/yt-dlp/yt-dlp/issues/1282
if format['format_id'] != format['ext'] and format['format_id'] in common_exts:
format['format_id'] = 'f%s' % format['format_id']
for i, format in enumerate(formats):
if format.get('format') is None:
format['format'] = '{id} - {res}{note}'.format(
id=format['format_id'],
res=self.format_resolution(format),
note=format_field(format, 'format_note', ' (%s)'),
)
if format.get('ext') is None:
format['ext'] = determine_ext(format['url']).lower()
if format.get('protocol') is None:
format['protocol'] = determine_protocol(format)
if format.get('resolution') is None:
@ -2654,16 +2615,46 @@ class YoutubeDL:
if (info_dict.get('duration') and format.get('tbr')
and not format.get('filesize') and not format.get('filesize_approx')):
format['filesize_approx'] = int(info_dict['duration'] * format['tbr'] * (1024 / 8))
format['http_headers'] = self._calc_headers(collections.ChainMap(format, info_dict))
# Add HTTP headers, so that external programs can use them from the
# json output
full_format_info = info_dict.copy()
full_format_info.update(format)
format['http_headers'] = self._calc_headers(full_format_info)
# Remove private housekeeping stuff
# This is copied to http_headers by the above _calc_headers and can now be removed
if '__x_forwarded_for_ip' in info_dict:
del info_dict['__x_forwarded_for_ip']
self.sort_formats({
'formats': formats,
'_format_sort_fields': info_dict.get('_format_sort_fields')
})
# Sanitize and group by format_id
formats_dict = {}
for i, format in enumerate(formats):
if not format.get('format_id'):
format['format_id'] = str(i)
else:
# Sanitize format_id from characters used in format selector expression
format['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', format['format_id'])
formats_dict.setdefault(format['format_id'], []).append(format)
# Make sure all formats have unique format_id
common_exts = set(itertools.chain(*self._format_selection_exts.values()))
for format_id, ambiguous_formats in formats_dict.items():
ambigious_id = len(ambiguous_formats) > 1
for i, format in enumerate(ambiguous_formats):
if ambigious_id:
format['format_id'] = '%s-%d' % (format_id, i)
# Ensure there is no conflict between id and ext in format selection
# See https://github.com/yt-dlp/yt-dlp/issues/1282
if format['format_id'] != format['ext'] and format['format_id'] in common_exts:
format['format_id'] = 'f%s' % format['format_id']
if format.get('format') is None:
format['format'] = '{id} - {res}{note}'.format(
id=format['format_id'],
res=self.format_resolution(format),
note=format_field(format, 'format_note', ' (%s)'),
)
if self.params.get('check_formats') is True:
formats = LazyList(self._check_formats(formats[::-1]), reverse=True)
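The format_id handling in this hunk can be illustrated in isolation. The following is a minimal sketch, not the project's code: `uniquify_format_ids` and `COMMON_EXTS` are hypothetical names, and the extension set is an assumed stand-in for `self._format_selection_exts`.

```python
import re

# Assumption: a small stand-in for the extensions known to the format selector
COMMON_EXTS = {'mp4', 'webm', 'm4a', 'mp3'}


def uniquify_format_ids(formats, common_exts=COMMON_EXTS):
    """Sanitize format_id values, then make every one of them unique."""
    groups = {}
    for i, fmt in enumerate(formats):
        if not fmt.get('format_id'):
            # Fall back to the position in the list
            fmt['format_id'] = str(i)
        else:
            # Replace characters that have a meaning in format selector expressions
            fmt['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', fmt['format_id'])
        groups.setdefault(fmt['format_id'], []).append(fmt)

    for format_id, group in groups.items():
        ambiguous = len(group) > 1
        for i, fmt in enumerate(group):
            if ambiguous:
                fmt['format_id'] = '%s-%d' % (format_id, i)
            # An id such as "mp4" would be confused with an extension filter
            if fmt['format_id'] != fmt.get('ext') and fmt['format_id'] in common_exts:
                fmt['format_id'] = 'f%s' % fmt['format_id']
    return formats
```

Grouping with `setdefault` keeps the first loop to a single pass, which is the same simplification the hunk makes over the older `if format_id not in formats_dict` form.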
@ -2819,10 +2810,14 @@ class YoutubeDL:
self.params.get('subtitleslangs'), {'all': all_sub_langs}, use_regex=True)
except re.error as e:
raise ValueError(f'Wrong regex for subtitlelangs: {e.pattern}')
elif normal_sub_langs:
requested_langs = ['en'] if 'en' in normal_sub_langs else normal_sub_langs[:1]
else:
requested_langs = ['en'] if 'en' in all_sub_langs else all_sub_langs[:1]
requested_langs = LazyList(itertools.chain(
['en'] if 'en' in normal_sub_langs else [],
filter(lambda f: f.startswith('en'), normal_sub_langs),
['en'] if 'en' in all_sub_langs else [],
filter(lambda f: f.startswith('en'), all_sub_langs),
normal_sub_langs, all_sub_langs,
))[:1]
if requested_langs:
self.to_screen(f'[info] {video_id}: Downloading subtitles: {", ".join(requested_langs)}')
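The new default for subtitle languages chains several candidate sources and takes the first hit. A minimal sketch of that priority order, assuming plain lists of language codes; `pick_default_sub_lang` is a hypothetical helper name and `itertools.islice` stands in for the `LazyList(...)[:1]` slice used in the hunk:

```python
import itertools


def pick_default_sub_lang(normal_sub_langs, all_sub_langs):
    """Return a list with at most one language, preferring English variants."""
    candidates = itertools.chain(
        ['en'] if 'en' in normal_sub_langs else [],
        filter(lambda lang: lang.startswith('en'), normal_sub_langs),
        ['en'] if 'en' in all_sub_langs else [],
        filter(lambda lang: lang.startswith('en'), all_sub_langs),
        normal_sub_langs, all_sub_langs,
    )
    # Only the first candidate is consumed, so later sources are never evaluated
    return list(itertools.islice(candidates, 1))
```

Compared with the previous two-branch version, a regional variant such as `en-US` is now preferred over an unrelated first language.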
@ -3670,6 +3665,7 @@ class YoutubeDL:
format_field(f, 'asr', '\t%s', func=format_decimal_suffix),
join_nonempty(
self._format_out('UNSUPPORTED', 'light red') if f.get('ext') in ('f4f', 'f4m') else None,
self._format_out('DRM', 'light red') if f.get('has_drm') else None,
format_field(f, 'language', '[%s]'),
join_nonempty(format_field(f, 'format_note'),
format_field(f, 'container', ignore=(None, f.get('ext'))),
@ -3769,12 +3765,13 @@ class YoutubeDL:
source = detect_variant()
if VARIANT not in (None, 'pip'):
source += '*'
klass = type(self)
write_debug(join_nonempty(
f'{"yt-dlp" if REPOSITORY == "yt-dlp/yt-dlp" else REPOSITORY} version',
__version__,
f'[{RELEASE_GIT_HEAD}]' if RELEASE_GIT_HEAD else '',
'' if source == 'unknown' else f'({source})',
'' if _IN_CLI else 'API',
'' if _IN_CLI else 'API' if klass == YoutubeDL else f'API:{self.__module__}.{klass.__qualname__}',
delim=' '))
if not _IN_CLI:


@ -318,10 +318,6 @@ def validate_options(opts):
if outtmpl_default == '':
opts.skip_download = None
del opts.outtmpl['default']
if outtmpl_default and not os.path.splitext(outtmpl_default)[1] and opts.extractaudio:
raise ValueError(
'Cannot download a video and extract audio into the same file! '
f'Use "{outtmpl_default}.%(ext)s" instead of "{outtmpl_default}" as the output template')
def parse_chapters(name, value):
chapters, ranges = [], []
@ -708,6 +704,7 @@ def parse_options(argv=None):
'dumpjson', 'dump_single_json', 'getdescription', 'getduration', 'getfilename',
'getformat', 'getid', 'getthumbnail', 'gettitle', 'geturl'
))
opts.quiet = opts.quiet or any_getting or opts.print_json or bool(opts.forceprint)
playlist_pps = [pp for pp in postprocessors if pp.get('when') == 'playlist']
write_playlist_infojson = (opts.writeinfojson and not opts.clean_infojson
@ -743,7 +740,7 @@ def parse_options(argv=None):
'client_certificate': opts.client_certificate,
'client_certificate_key': opts.client_certificate_key,
'client_certificate_password': opts.client_certificate_password,
'quiet': opts.quiet or any_getting or opts.print_json or bool(opts.forceprint),
'quiet': opts.quiet,
'no_warnings': opts.no_warnings,
'forceurl': opts.geturl,
'forcetitle': opts.gettitle,


@ -0,0 +1,5 @@
import os
def get_hook_dirs():
return [os.path.dirname(__file__)]


@ -0,0 +1,57 @@
import ast
import os
import sys
from pathlib import Path
from PyInstaller.utils.hooks import collect_submodules
def find_attribute_accesses(node, name, path=()):
if isinstance(node, ast.Attribute):
path = [*path, node.attr]
if isinstance(node.value, ast.Name) and node.value.id == name:
yield path[::-1]
for child in ast.iter_child_nodes(node):
yield from find_attribute_accesses(child, name, path)
def collect_used_submodules(name, level):
for dirpath, _, filenames in os.walk(Path(__file__).parent.parent):
for filename in filenames:
if not filename.endswith('.py'):
continue
with open(Path(dirpath) / filename, encoding='utf8') as f:
for submodule in find_attribute_accesses(ast.parse(f.read()), name):
yield '.'.join(submodule[:level])
def pycryptodome_module():
try:
import Cryptodome # noqa: F401
except ImportError:
try:
import Crypto # noqa: F401
print('WARNING: Using Crypto since Cryptodome is not available. '
'Install with: pip install pycryptodomex', file=sys.stderr)
return 'Crypto'
except ImportError:
pass
return 'Cryptodome'
def get_hidden_imports():
yield 'yt_dlp.compat._legacy'
yield from collect_submodules('websockets')
crypto = pycryptodome_module()
for sm in set(collect_used_submodules('Cryptodome', 2)):
yield f'{crypto}.{sm}'
# These are auto-detected, but explicitly add them just in case
yield from ('mutagen', 'brotli', 'certifi')
hiddenimports = list(get_hidden_imports())
print(f'Adding imports: {hiddenimports}')
excludedimports = ['youtube_dl', 'youtube_dlc', 'test', 'ytdlp_plugins', 'devscripts']
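To see what the hook's AST walk collects, the same traversal can be run on an in-memory snippet instead of the package's files. This is a sketch under that assumption: `find_attribute_accesses` is copied from the hook above, while `used_submodules` and the sample source are hypothetical.

```python
import ast


def find_attribute_accesses(node, name, path=()):
    """Yield attribute chains (outermost name first) rooted at the variable `name`."""
    if isinstance(node, ast.Attribute):
        path = [*path, node.attr]
        if isinstance(node.value, ast.Name) and node.value.id == name:
            yield path[::-1]
    for child in ast.iter_child_nodes(node):
        yield from find_attribute_accesses(child, name, path)


def used_submodules(source, name, level):
    """Collect the first `level` attribute components of every access to `name`."""
    return {
        '.'.join(chain[:level])
        for chain in find_attribute_accesses(ast.parse(source), name)
    }


SAMPLE = '''
cipher = Cryptodome.Cipher.AES.new(key, Cryptodome.Cipher.AES.MODE_CBC, iv)
digest = Cryptodome.Hash.SHA1.new()
'''
print(used_submodules(SAMPLE, 'Cryptodome', 2))
```

With `level=2`, each access is cut down to its submodule path, which is what the hook then prefixes with the detected package name.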


@ -2,17 +2,17 @@ import base64
from math import ceil
from .compat import compat_ord
from .dependencies import Cryptodome_AES
from .dependencies import Cryptodome
from .utils import bytes_to_intlist, intlist_to_bytes
if Cryptodome_AES:
if Cryptodome:
def aes_cbc_decrypt_bytes(data, key, iv):
""" Decrypt bytes with AES-CBC using pycryptodome """
return Cryptodome_AES.new(key, Cryptodome_AES.MODE_CBC, iv).decrypt(data)
return Cryptodome.Cipher.AES.new(key, Cryptodome.Cipher.AES.MODE_CBC, iv).decrypt(data)
def aes_gcm_decrypt_and_verify_bytes(data, key, tag, nonce):
""" Decrypt bytes with AES-GCM using pycryptodome """
return Cryptodome_AES.new(key, Cryptodome_AES.MODE_GCM, nonce).decrypt_and_verify(data, tag)
return Cryptodome.Cipher.AES.new(key, Cryptodome.Cipher.AES.MODE_GCM, nonce).decrypt_and_verify(data, tag)
else:
def aes_cbc_decrypt_bytes(data, key, iv):


@ -1,5 +1,4 @@
import contextlib
import errno
import json
import os
import re
@ -39,11 +38,7 @@ class Cache:
fn = self._get_cache_fn(section, key, dtype)
try:
try:
os.makedirs(os.path.dirname(fn))
except OSError as ose:
if ose.errno != errno.EEXIST:
raise
os.makedirs(os.path.dirname(fn), exist_ok=True)
self._ydl.write_debug(f'Saving {section}.{key} to cache')
write_json_file({'yt-dlp_version': __version__, 'data': data}, fn)
except Exception:
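The change above swaps a manual `errno.EEXIST` check for `exist_ok=True`. A small sketch of the resulting behavior; `ensure_parent_dir` and the file names are hypothetical and only serve the illustration:

```python
import os
import tempfile


def ensure_parent_dir(fn):
    """Create the directory that will hold `fn`; an existing directory is not an error."""
    parent = os.path.dirname(fn)
    os.makedirs(parent, exist_ok=True)
    return parent


with tempfile.TemporaryDirectory() as root:
    cache_file = os.path.join(root, 'section', 'key.json')
    ensure_parent_dir(cache_file)
    # A second call is harmless, unlike os.makedirs() without exist_ok
    ensure_parent_dir(cache_file)
    created = os.path.isdir(os.path.dirname(cache_file))
```

Letting `os.makedirs` tolerate an existing directory also avoids a separate check-then-create step, which could race with another process creating the same directory.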


@ -8,7 +8,7 @@ from .compat_utils import passthrough_module
# XXX: Implement this the same way as other DeprecationWarnings without circular import
passthrough_module(__name__, '._legacy', callback=lambda attr: warnings.warn(
DeprecationWarning(f'{__name__}.{attr} is deprecated'), stacklevel=3))
DeprecationWarning(f'{__name__}.{attr} is deprecated'), stacklevel=5))
# HTMLParseError has been deprecated in Python 3.3 and removed in
@ -70,9 +70,3 @@ if compat_os_name in ('nt', 'ce'):
return userhome + path[i:]
else:
compat_expanduser = os.path.expanduser
# NB: Add modules that are imported dynamically here so that PyInstaller can find them
# See https://github.com/pyinstaller/pyinstaller-hooks-contrib/issues/438
if False:
from . import _legacy # noqa: F401


@ -1,5 +1,6 @@
""" Do not use! """
import base64
import collections
import ctypes
import getpass
@ -29,6 +30,7 @@ from asyncio import run as compat_asyncio_run # noqa: F401
from re import Pattern as compat_Pattern # noqa: F401
from re import match as compat_Match # noqa: F401
from . import compat_expanduser, compat_HTMLParseError, compat_realpath
from .compat_utils import passthrough_module
from ..dependencies import Cryptodome_AES as compat_pycrypto_AES # noqa: F401
from ..dependencies import brotli as compat_brotli # noqa: F401
@ -47,23 +49,25 @@ def compat_setenv(key, value, env=os.environ):
env[key] = value
compat_base64_b64decode = base64.b64decode
compat_basestring = str
compat_casefold = str.casefold
compat_chr = chr
compat_collections_abc = collections.abc
compat_cookiejar = http.cookiejar
compat_cookiejar_Cookie = http.cookiejar.Cookie
compat_cookies = http.cookies
compat_cookies_SimpleCookie = http.cookies.SimpleCookie
compat_etree_Element = etree.Element
compat_etree_register_namespace = etree.register_namespace
compat_cookiejar = compat_http_cookiejar = http.cookiejar
compat_cookiejar_Cookie = compat_http_cookiejar_Cookie = http.cookiejar.Cookie
compat_cookies = compat_http_cookies = http.cookies
compat_cookies_SimpleCookie = compat_http_cookies_SimpleCookie = http.cookies.SimpleCookie
compat_etree_Element = compat_xml_etree_ElementTree_Element = etree.Element
compat_etree_register_namespace = compat_xml_etree_register_namespace = etree.register_namespace
compat_filter = filter
compat_get_terminal_size = shutil.get_terminal_size
compat_getenv = os.getenv
compat_getpass = getpass.getpass
compat_getpass = compat_getpass_getpass = getpass.getpass
compat_html_entities = html.entities
compat_html_entities_html5 = html.entities.html5
compat_HTMLParser = html.parser.HTMLParser
compat_html_parser_HTMLParseError = compat_HTMLParseError
compat_HTMLParser = compat_html_parser_HTMLParser = html.parser.HTMLParser
compat_http_client = http.client
compat_http_server = http.server
compat_input = input
@ -72,6 +76,8 @@ compat_itertools_count = itertools.count
compat_kwargs = lambda kwargs: kwargs
compat_map = map
compat_numeric_types = (int, float, complex)
compat_os_path_expanduser = compat_expanduser
compat_os_path_realpath = compat_realpath
compat_print = print
compat_shlex_split = shlex.split
compat_socket_create_connection = socket.create_connection
@ -81,7 +87,9 @@ compat_struct_unpack = struct.unpack
compat_subprocess_get_DEVNULL = lambda: DEVNULL
compat_tokenize_tokenize = tokenize.tokenize
compat_urllib_error = urllib.error
compat_urllib_HTTPError = urllib.error.HTTPError
compat_urllib_parse = urllib.parse
compat_urllib_parse_parse_qs = urllib.parse.parse_qs
compat_urllib_parse_quote = urllib.parse.quote
compat_urllib_parse_quote_plus = urllib.parse.quote_plus
compat_urllib_parse_unquote_plus = urllib.parse.unquote_plus
@ -90,8 +98,10 @@ compat_urllib_parse_urlunparse = urllib.parse.urlunparse
compat_urllib_request = urllib.request
compat_urllib_request_DataHandler = urllib.request.DataHandler
compat_urllib_response = urllib.response
compat_urlretrieve = urllib.request.urlretrieve
compat_xml_parse_error = etree.ParseError
compat_urlretrieve = compat_urllib_request_urlretrieve = urllib.request.urlretrieve
compat_xml_parse_error = compat_xml_etree_ElementTree_ParseError = etree.ParseError
compat_xpath = lambda xpath: xpath
compat_zip = zip
workaround_optparse_bug9161 = lambda: None
legacy = []


@ -1,5 +1,6 @@
import collections
import contextlib
import functools
import importlib
import sys
import types
@ -10,61 +11,73 @@ _Package = collections.namedtuple('Package', ('name', 'version'))
def get_package_info(module):
parent = module.__name__.split('.')[0]
parent_module = None
with contextlib.suppress(ImportError):
parent_module = importlib.import_module(parent)
for attr in ('__version__', 'version_string', 'version'):
version = getattr(parent_module, attr, None)
if version is not None:
break
return _Package(getattr(module, '_yt_dlp__identifier', parent), str(version))
return _Package(
name=getattr(module, '_yt_dlp__identifier', module.__name__),
version=str(next(filter(None, (
getattr(module, attr, None)
for attr in ('__version__', 'version_string', 'version')
)), None)))
def _is_package(module):
try:
module.__getattribute__('__path__')
except AttributeError:
return False
return True
return '__path__' in vars(module)
def passthrough_module(parent, child, allowed_attributes=None, *, callback=lambda _: None):
parent_module = importlib.import_module(parent)
child_module = None # Import child module only as needed
def _is_dunder(name):
return name.startswith('__') and name.endswith('__')
class PassthroughModule(types.ModuleType):
def __getattr__(self, attr):
if _is_package(parent_module):
with contextlib.suppress(ImportError):
return importlib.import_module(f'.{attr}', parent)
ret = self.__from_child(attr)
if ret is _NO_ATTRIBUTE:
raise AttributeError(f'module {parent} has no attribute {attr}')
callback(attr)
return ret
class EnhancedModule(types.ModuleType):
def __bool__(self):
return vars(self).get('__bool__', lambda: True)()
def __from_child(self, attr):
if allowed_attributes is None:
if attr.startswith('__') and attr.endswith('__'):
return _NO_ATTRIBUTE
elif attr not in allowed_attributes:
def __getattribute__(self, attr):
try:
ret = super().__getattribute__(attr)
except AttributeError:
if _is_dunder(attr):
raise
getter = getattr(self, '__getattr__', None)
if not getter:
raise
ret = getter(attr)
return ret.fget() if isinstance(ret, property) else ret
def passthrough_module(parent, child, allowed_attributes=(..., ), *, callback=lambda _: None):
"""Passthrough parent module into a child module, creating the parent if necessary"""
def __getattr__(attr):
if _is_package(parent):
with contextlib.suppress(ImportError):
return importlib.import_module(f'.{attr}', parent.__name__)
ret = from_child(attr)
if ret is _NO_ATTRIBUTE:
raise AttributeError(f'module {parent.__name__} has no attribute {attr}')
callback(attr)
return ret
@functools.lru_cache(maxsize=None)
def from_child(attr):
nonlocal child
if attr not in allowed_attributes:
if ... not in allowed_attributes or _is_dunder(attr):
return _NO_ATTRIBUTE
nonlocal child_module
child_module = child_module or importlib.import_module(child, parent)
if isinstance(child, str):
child = importlib.import_module(child, parent.__name__)
with contextlib.suppress(AttributeError):
return getattr(child_module, attr)
if _is_package(child):
with contextlib.suppress(ImportError):
return passthrough_module(f'{parent.__name__}.{attr}',
importlib.import_module(f'.{attr}', child.__name__))
if _is_package(child_module):
with contextlib.suppress(ImportError):
return importlib.import_module(f'.{attr}', child)
with contextlib.suppress(AttributeError):
return getattr(child, attr)
return _NO_ATTRIBUTE
return _NO_ATTRIBUTE
# Python 3.6 does not have module level __getattr__
# https://peps.python.org/pep-0562/
sys.modules[parent].__class__ = PassthroughModule
parent = sys.modules.get(parent, types.ModuleType(parent))
parent.__class__ = EnhancedModule
parent.__getattr__ = __getattr__
return parent
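One effect of `EnhancedModule` is that a module object can report itself as falsy, which is what allows checks such as `if Cryptodome:` elsewhere in this diff. A reduced sketch of only that part; the `__getattribute__` override is left out and the module names are made up:

```python
import types


class EnhancedModule(types.ModuleType):
    def __bool__(self):
        # Delegate truthiness to a module-level __bool__ callable, if one is defined
        return vars(self).get('__bool__', lambda: True)()


# Stand-in for a dependency that failed to import
missing = types.ModuleType('no_Cryptodome')
missing.__class__ = EnhancedModule
missing.__bool__ = lambda: False

# Stand-in for a dependency that imported normally
present = types.ModuleType('some_dependency')
present.__class__ = EnhancedModule

print(bool(missing), bool(present))
```

Reassigning `__class__` on a module instance is permitted for `types.ModuleType` subclasses, so an already imported module can be upgraded in place.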


@ -0,0 +1,30 @@
import types
from ..compat import functools
from ..compat.compat_utils import passthrough_module
try:
import Cryptodome as _parent
except ImportError:
try:
import Crypto as _parent
except (ImportError, SyntaxError): # Old Crypto gives SyntaxError in newer Python
_parent = types.ModuleType('no_Cryptodome')
__bool__ = lambda: False
passthrough_module(__name__, _parent, (..., '__version__'))
del passthrough_module
@property
@functools.cache
def _yt_dlp__identifier():
if _parent.__name__ == 'Crypto':
from Crypto.Cipher import AES
try:
# In pycrypto, mode defaults to ECB. See:
# https://www.pycryptodome.org/en/latest/src/vs_pycrypto.html#:~:text=not%20have%20ECB%20as%20default%20mode
AES.new(b'abcdefghijklmnop')
except TypeError:
return 'pycrypto'
return _parent.__name__


@ -23,24 +23,6 @@ else:
certifi = None
try:
from Cryptodome.Cipher import AES as Cryptodome_AES
except ImportError:
try:
from Crypto.Cipher import AES as Cryptodome_AES
except (ImportError, SyntaxError): # Old Crypto gives SyntaxError in newer Python
Cryptodome_AES = None
else:
try:
# In pycrypto, mode defaults to ECB. See:
# https://www.pycryptodome.org/en/latest/src/vs_pycrypto.html#:~:text=not%20have%20ECB%20as%20default%20mode
Cryptodome_AES.new(b'abcdefghijklmnop')
except TypeError:
pass
else:
Cryptodome_AES._yt_dlp__identifier = 'pycrypto'
try:
import mutagen
except ImportError:
@ -84,12 +66,16 @@ else:
xattr._yt_dlp__identifier = 'pyxattr'
from . import Cryptodome
all_dependencies = {k: v for k, v in globals().items() if not k.startswith('_')}
available_dependencies = {k: v for k, v in all_dependencies.items() if v}
# Deprecated
Cryptodome_AES = Cryptodome.Cipher.AES if Cryptodome else None
__all__ = [
'all_dependencies',
'available_dependencies',


@ -104,6 +104,7 @@ class ExternalFD(FragmentFD):
return all((
not info_dict.get('to_stdout') or Features.TO_STDOUT in cls.SUPPORTED_FEATURES,
'+' not in info_dict['protocol'] or Features.MULTIPLE_FORMATS in cls.SUPPORTED_FEATURES,
not traverse_obj(info_dict, ('hls_aes', ...), 'extra_param_to_segment_url'),
all(proto in cls.SUPPORTED_PROTOCOLS for proto in info_dict['protocol'].split('+')),
))


@ -360,7 +360,8 @@ class FragmentFD(FileDownloader):
if not decrypt_info or decrypt_info['METHOD'] != 'AES-128':
return frag_content
iv = decrypt_info.get('IV') or struct.pack('>8xq', fragment['media_sequence'])
decrypt_info['KEY'] = decrypt_info.get('KEY') or _get_key(info_dict.get('_decryption_key_url') or decrypt_info['URI'])
decrypt_info['KEY'] = (decrypt_info.get('KEY')
or _get_key(traverse_obj(info_dict, ('hls_aes', 'uri')) or decrypt_info['URI']))
# Don't decrypt the content in tests since the data is explicitly truncated and it's not to a valid block
# size (see https://github.com/ytdl-org/youtube-dl/pull/27660). Tests only care that the correct data downloaded,
# not what it decrypts to.
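When the playlist gives no `IV`, the hunk derives one from the fragment's media sequence number with `struct.pack('>8xq', ...)`. A short sketch of what that format string produces; `default_hls_iv` is a hypothetical wrapper name:

```python
import struct


def default_hls_iv(media_sequence):
    """Build a 16-byte IV: 8 zero pad bytes, then the sequence number as a big-endian 64-bit integer."""
    return struct.pack('>8xq', media_sequence)


print(default_hls_iv(1).hex())
```

The result is always 16 bytes long, matching the AES block size.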
@ -382,7 +383,7 @@ class FragmentFD(FileDownloader):
max_workers = self.params.get('concurrent_fragment_downloads', 1)
if max_progress > 1:
self._prepare_multiline_status(max_progress)
is_live = any(traverse_obj(args, (..., 2, 'is_live'), default=[]))
is_live = any(traverse_obj(args, (..., 2, 'is_live')))
def thread_func(idx, ctx, fragments, info_dict, tpe):
ctx['max_progress'] = max_progress


@ -7,8 +7,15 @@ from . import get_suitable_downloader
from .external import FFmpegFD
from .fragment import FragmentFD
from .. import webvtt
from ..dependencies import Cryptodome_AES
from ..utils import bug_reports_message, parse_m3u8_attributes, update_url_query
from ..dependencies import Cryptodome
from ..utils import (
bug_reports_message,
parse_m3u8_attributes,
remove_start,
traverse_obj,
update_url_query,
urljoin,
)
class HlsFD(FragmentFD):
@ -63,7 +70,7 @@ class HlsFD(FragmentFD):
can_download, message = self.can_download(s, info_dict, self.params.get('allow_unplayable_formats')), None
if can_download:
has_ffmpeg = FFmpegFD.available()
no_crypto = not Cryptodome_AES and '#EXT-X-KEY:METHOD=AES-128' in s
no_crypto = not Cryptodome and '#EXT-X-KEY:METHOD=AES-128' in s
if no_crypto and has_ffmpeg:
can_download, message = False, 'The stream has AES-128 encryption and pycryptodomex is not available'
elif no_crypto:
@ -150,6 +157,13 @@ class HlsFD(FragmentFD):
i = 0
media_sequence = 0
decrypt_info = {'METHOD': 'NONE'}
external_aes_key = traverse_obj(info_dict, ('hls_aes', 'key'))
if external_aes_key:
external_aes_key = binascii.unhexlify(remove_start(external_aes_key, '0x'))
assert len(external_aes_key) in (16, 24, 32), 'Invalid length for HLS AES-128 key'
external_aes_iv = traverse_obj(info_dict, ('hls_aes', 'iv'))
if external_aes_iv:
external_aes_iv = binascii.unhexlify(remove_start(external_aes_iv, '0x').zfill(32))
byte_range = {}
discontinuity_count = 0
frag_index = 0
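The external key and IV handling added here amounts to hex decoding with an optional `0x` prefix. A minimal sketch, assuming `hls_aes` is a plain dict with optional `key` and `iv` strings; `parse_hls_aes` is a hypothetical helper, the prefix stripping is written out instead of calling `remove_start`, and a `ValueError` replaces the `assert` from the hunk:

```python
import binascii


def _strip_0x(value):
    return value[2:] if value.startswith('0x') else value


def parse_hls_aes(hls_aes):
    """Decode the externally supplied AES key and IV into bytes (None if absent)."""
    key = hls_aes.get('key')
    if key:
        key = binascii.unhexlify(_strip_0x(key))
        if len(key) not in (16, 24, 32):
            raise ValueError('Invalid length for HLS AES-128 key')
    iv = hls_aes.get('iv')
    if iv:
        # Left-pad short IVs with zeros to 32 hex digits (16 bytes)
        iv = binascii.unhexlify(_strip_0x(iv).zfill(32))
    return key, iv
```

For instance, `parse_hls_aes({'key': '0x' + '00' * 16, 'iv': '0x1'})` would yield a 16-byte zero key and an IV whose last byte is 1.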
@ -165,10 +179,7 @@ class HlsFD(FragmentFD):
frag_index += 1
if frag_index <= ctx['fragment_index']:
continue
frag_url = (
line
if re.match(r'^https?://', line)
else urllib.parse.urljoin(man_url, line))
frag_url = urljoin(man_url, line)
if extra_query:
frag_url = update_url_query(frag_url, extra_query)
@ -190,10 +201,7 @@ class HlsFD(FragmentFD):
return False
frag_index += 1
map_info = parse_m3u8_attributes(line[11:])
frag_url = (
map_info.get('URI')
if re.match(r'^https?://', map_info.get('URI'))
else urllib.parse.urljoin(man_url, map_info.get('URI')))
frag_url = urljoin(man_url, map_info.get('URI'))
if extra_query:
frag_url = update_url_query(frag_url, extra_query)
@ -218,15 +226,18 @@ class HlsFD(FragmentFD):
decrypt_url = decrypt_info.get('URI')
decrypt_info = parse_m3u8_attributes(line[11:])
if decrypt_info['METHOD'] == 'AES-128':
if 'IV' in decrypt_info:
if external_aes_iv:
decrypt_info['IV'] = external_aes_iv
elif 'IV' in decrypt_info:
decrypt_info['IV'] = binascii.unhexlify(decrypt_info['IV'][2:].zfill(32))
if not re.match(r'^https?://', decrypt_info['URI']):
decrypt_info['URI'] = urllib.parse.urljoin(
man_url, decrypt_info['URI'])
if extra_query:
decrypt_info['URI'] = update_url_query(decrypt_info['URI'], extra_query)
if decrypt_url != decrypt_info['URI']:
decrypt_info['KEY'] = None
if external_aes_key:
decrypt_info['KEY'] = external_aes_key
else:
decrypt_info['URI'] = urljoin(man_url, decrypt_info['URI'])
if extra_query:
decrypt_info['URI'] = update_url_query(decrypt_info['URI'], extra_query)
if decrypt_url != decrypt_info['URI']:
decrypt_info['KEY'] = None
elif line.startswith('#EXT-X-MEDIA-SEQUENCE'):
media_sequence = int(line[22:])


@ -211,7 +211,12 @@ class HttpFD(FileDownloader):
ctx.stream = None
def download():
data_len = ctx.data.info().get('Content-length', None)
data_len = ctx.data.info().get('Content-length')
if ctx.data.info().get('Content-encoding'):
# Content-encoding is present, Content-length is not reliable anymore as we are
# doing auto decompression. (See: https://github.com/yt-dlp/yt-dlp/pull/6176)
data_len = None
# Range HTTP header may be ignored/unsupported by a webserver
# (e.g. extractor/scivee.py, extractor/bambuser.py).
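The reasoning in this hunk is that `Content-length` describes the compressed body, so it cannot be compared with the number of bytes written once the response is decompressed automatically. A sketch of that decision, assuming the headers are available as a plain dict with these exact key spellings; `expected_length` is a hypothetical helper and the `int` conversion is added for the example:

```python
def expected_length(headers):
    """Return the expected body size in bytes, or None when it cannot be trusted."""
    if headers.get('Content-encoding'):
        # The stream is decoded on the fly, so the advertised length is for other data
        return None
    data_len = headers.get('Content-length')
    return int(data_len) if data_len is not None else None
```

Returning `None` makes later size checks behave as if the server had sent no length at all.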


@ -21,7 +21,8 @@ from .youtube import ( # Youtube is moved to the top to improve performance
YoutubeYtBeIE,
YoutubeYtUserIE,
YoutubeWatchLaterIE,
YoutubeShortsAudioPivotIE
YoutubeShortsAudioPivotIE,
YoutubeConsentRedirectIE,
)
from .abc import (
@ -101,6 +102,7 @@ from .americastestkitchen import (
AmericasTestKitchenIE,
AmericasTestKitchenSeasonIE,
)
from .anchorfm import AnchorFMEpisodeIE
from .angel import AngelIE
from .anvato import AnvatoIE
from .aol import AolIE
@ -121,6 +123,7 @@ from .applepodcasts import ApplePodcastsIE
from .archiveorg import (
ArchiveOrgIE,
YoutubeWebArchiveIE,
VLiveWebArchiveIE,
)
from .arcpublishing import ArcPublishingIE
from .arkena import ArkenaIE
@ -236,12 +239,14 @@ from .bleacherreport import (
BleacherReportIE,
BleacherReportCMSIE,
)
from .blerp import BlerpIE
from .blogger import BloggerIE
from .bloomberg import BloombergIE
from .bokecc import BokeCCIE
from .bongacams import BongaCamsIE
from .bostonglobe import BostonGlobeIE
from .box import BoxIE
from .boxcast import BoxCastVideoIE
from .booyah import BooyahClipsIE
from .bpb import BpbIE
from .br import (
@ -505,6 +510,7 @@ from .dw import (
)
from .eagleplatform import EaglePlatformIE, ClipYouEmbedIE
from .ebaumsworld import EbaumsWorldIE
from .ebay import EbayIE
from .echomsk import EchoMskIE
from .egghead import (
EggheadCourseIE,
@ -743,6 +749,7 @@ from .hungama import (
HungamaAlbumPlaylistIE,
)
from .hypem import HypemIE
from .hypergryph import MonsterSirenHypergryphMusicIE
from .hytale import HytaleIE
from .icareus import IcareusIE
from .ichinanalive import (
@ -855,6 +862,7 @@ from .kicker import KickerIE
from .kickstarter import KickStarterIE
from .kinja import KinjaEmbedIE
from .kinopoisk import KinoPoiskIE
from .kommunetv import KommunetvIE
from .kompas import KompasVideoIE
from .konserthusetplay import KonserthusetPlayIE
from .koo import KooIE
@ -1195,6 +1203,8 @@ from .nfhsnetwork import NFHSNetworkIE
from .nfl import (
NFLIE,
NFLArticleIE,
NFLPlusEpisodeIE,
NFLPlusReplayIE,
)
from .nhk import (
NhkVodIE,
@ -1285,8 +1295,10 @@ from .nytimes import (
)
from .nuvid import NuvidIE
from .nzherald import NZHeraldIE
from .nzonscreen import NZOnScreenIE
from .nzz import NZZIE
from .odatv import OdaTVIE
from .odkmedia import OnDemandChinaEpisodeIE
from .odnoklassniki import OdnoklassnikiIE
from .oftv import (
OfTVIE,
@ -1450,6 +1462,7 @@ from .puhutv import (
PuhuTVIE,
PuhuTVSerieIE,
)
from .pr0gramm import Pr0grammStaticIE, Pr0grammIE
from .prankcast import PrankCastIE
from .premiershiprugby import PremiershipRugbyIE
from .presstv import PressTVIE
@ -1511,6 +1524,10 @@ from .raywenderlich import (
RayWenderlichCourseIE,
)
from .rbmaradio import RBMARadioIE
from .rbgtum import (
RbgTumIE,
RbgTumCourseIE,
)
from .rcs import (
RCSIE,
RCSEmbedsIE,
@ -1555,7 +1572,10 @@ from .rokfin import (
)
from .roosterteeth import RoosterTeethIE, RoosterTeethSeriesIE
from .rottentomatoes import RottenTomatoesIE
from .rozhlas import RozhlasIE
from .rozhlas import (
RozhlasIE,
RozhlasVltavaIE,
)
from .rte import RteIE, RteRadioIE
from .rtlnl import (
RtlNlIE,
@ -1845,7 +1865,7 @@ from .telequebec import (
)
from .teletask import TeleTaskIE
from .telewebion import TelewebionIE
from .tempo import TempoIE
from .tempo import TempoIE, IVXPlayerIE
from .tencent import (
IflixEpisodeIE,
IflixSeriesIE,
@ -2044,6 +2064,10 @@ from .twitter import (
TwitterSpacesIE,
TwitterShortenerIE,
)
from .txxx import (
TxxxIE,
PornTopIE,
)
from .udemy import (
UdemyIE,
UdemyCourseIE
@ -2169,17 +2193,14 @@ from .viu import (
ViuIE,
ViuPlaylistIE,
ViuOTTIE,
ViuOTTIndonesiaIE,
)
from .vk import (
VKIE,
VKUserVideosIE,
VKWallPostIE,
)
from .vlive import (
VLiveIE,
VLivePostIE,
VLiveChannelIE,
)
from .vocaroo import VocarooIE
from .vodlocker import VodlockerIE
from .vodpl import VODPlIE
from .vodplatform import VODPlatformIE
@ -2266,6 +2287,10 @@ from .wppilot import (
WPPilotIE,
WPPilotChannelsIE,
)
from .wrestleuniverse import (
WrestleUniverseVODIE,
WrestleUniversePPVIE,
)
from .wsj import (
WSJIE,
WSJArticleIE,
@ -2314,6 +2339,7 @@ from .yandexvideo import (
ZenYandexChannelIE,
)
from .yapfiles import YapFilesIE
from .yappy import YappyIE
from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE
from .yle_areena import YleAreenaIE


@ -156,7 +156,7 @@ class AbemaTVBaseIE(InfoExtractor):
def _generate_aks(cls, deviceid):
deviceid = deviceid.encode('utf-8')
# add 1 hour and then drop minute and secs
ts_1hour = int((time_seconds(hours=9) // 3600 + 1) * 3600)
ts_1hour = int((time_seconds() // 3600 + 1) * 3600)
time_struct = time.gmtime(ts_1hour)
ts_1hour_str = str(ts_1hour).encode('utf-8')
@ -190,6 +190,16 @@ class AbemaTVBaseIE(InfoExtractor):
if self._USERTOKEN:
return self._USERTOKEN
username, _ = self._get_login_info()
AbemaTVBaseIE._USERTOKEN = username and self.cache.load(self._NETRC_MACHINE, username)
if AbemaTVBaseIE._USERTOKEN:
# try authentication with locally stored token
try:
self._get_media_token(True)
return
except ExtractorError as e:
self.report_warning(f'Failed to login with cached user token; obtaining a fresh one ({e})')
AbemaTVBaseIE._DEVICE_ID = str(uuid.uuid4())
aks = self._generate_aks(self._DEVICE_ID)
user_data = self._download_json(
@ -300,6 +310,11 @@ class AbemaTVIE(AbemaTVBaseIE):
_TIMETABLE = None
def _perform_login(self, username, password):
self._get_device_token()
if self.cache.load(self._NETRC_MACHINE, username) and self._get_media_token():
self.write_debug('Skipping logging in')
return
if '@' in username: # don't strictly check if it's email address or not
ep, method = 'user/email', 'email'
else:
@ -319,6 +334,7 @@ class AbemaTVIE(AbemaTVBaseIE):
AbemaTVBaseIE._USERTOKEN = login_response['token']
self._get_media_token(True)
self.cache.store(self._NETRC_MACHINE, username, AbemaTVBaseIE._USERTOKEN)
def _real_extract(self, url):
# starting download using infojson from this extractor is undefined behavior,
@ -416,7 +432,7 @@ class AbemaTVIE(AbemaTVBaseIE):
f'https://api.abema.io/v1/video/programs/{video_id}', video_id,
note='Checking playability',
headers=headers)
ondemand_types = traverse_obj(api_response, ('terms', ..., 'onDemandType'), default=[])
ondemand_types = traverse_obj(api_response, ('terms', ..., 'onDemandType'))
if 3 not in ondemand_types:
# cannot acquire decryption key for these streams
self.report_warning('This is a premium-only stream')
@ -489,7 +505,7 @@ class AbemaTVTitleIE(AbemaTVBaseIE):
})
yield from (
self.url_result(f'https://abema.tv/video/episode/{x}')
for x in traverse_obj(programs, ('programs', ..., 'id'), default=[]))
for x in traverse_obj(programs, ('programs', ..., 'id')))
def _entries(self, playlist_id, series_version):
return OnDemandPagedList(


@ -191,7 +191,7 @@ query content($sessionIdToken: String!, $deviceLocale: String, $contentId: ID!,
class AmazonMiniTVSeasonIE(AmazonMiniTVBaseIE):
IE_NAME = 'amazonminitv:season'
_VALID_URL = r'amazonminitv:season:(?:amzn1\.dv\.gti\.)?(?P<id>[a-f0-9-]+)'
IE_DESC = 'Amazon MiniTV Series, "minitv:season:" prefix'
IE_DESC = 'Amazon MiniTV Season, "minitv:season:" prefix'
_TESTS = [{
'url': 'amazonminitv:season:amzn1.dv.gti.0aa996eb-6a1b-4886-a342-387fbd2f1db0',
'playlist_mincount': 6,
@ -250,6 +250,7 @@ query getEpisodes($sessionIdToken: String!, $clientId: String, $episodeOrSeasonI
class AmazonMiniTVSeriesIE(AmazonMiniTVBaseIE):
IE_NAME = 'amazonminitv:series'
_VALID_URL = r'amazonminitv:series:(?:amzn1\.dv\.gti\.)?(?P<id>[a-f0-9-]+)'
IE_DESC = 'Amazon MiniTV Series, "minitv:series:" prefix'
_TESTS = [{
'url': 'amazonminitv:series:amzn1.dv.gti.56521d46-b040-4fd5-872e-3e70476a04b0',
'playlist_mincount': 3,


@ -11,7 +11,7 @@ from ..utils import (
class AmericasTestKitchenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:cooks(?:country|illustrated)/)?(?P<resource_type>episode|videos)/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?(?:americastestkitchen|cooks(?:country|illustrated))\.com/(?:cooks(?:country|illustrated)/)?(?P<resource_type>episode|videos)/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.americastestkitchen.com/episode/582-weeknight-japanese-suppers',
'md5': 'b861c3e365ac38ad319cfd509c30577f',
@ -72,6 +72,12 @@ class AmericasTestKitchenIE(InfoExtractor):
}, {
'url': 'https://www.americastestkitchen.com/cooksillustrated/videos/4478-beef-wellington',
'only_matching': True,
}, {
'url': 'https://www.cookscountry.com/episode/564-when-only-chocolate-will-do',
'only_matching': True,
}, {
'url': 'https://www.cooksillustrated.com/videos/4478-beef-wellington',
'only_matching': True,
}]
def _real_extract(self, url):
@ -100,7 +106,7 @@ class AmericasTestKitchenIE(InfoExtractor):
class AmericasTestKitchenSeasonIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com(?P<show>/cookscountry)?/episodes/browse/season_(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?(?P<show>americastestkitchen|(?P<cooks>cooks(?:country|illustrated)))\.com(?:(?:/(?P<show2>cooks(?:country|illustrated)))?(?:/?$|(?<!ated)(?<!ated\.com)/episodes/browse/season_(?P<season>\d+)))'
_TESTS = [{
# ATK Season
'url': 'https://www.americastestkitchen.com/episodes/browse/season_1',
@ -117,29 +123,73 @@ class AmericasTestKitchenSeasonIE(InfoExtractor):
'title': 'Season 12',
},
'playlist_count': 13,
}, {
# America's Test Kitchen Series
'url': 'https://www.americastestkitchen.com/',
'info_dict': {
'id': 'americastestkitchen',
'title': 'America\'s Test Kitchen',
},
'playlist_count': 558,
}, {
# Cooks Country Series
'url': 'https://www.americastestkitchen.com/cookscountry',
'info_dict': {
'id': 'cookscountry',
'title': 'Cook\'s Country',
},
'playlist_count': 199,
}, {
'url': 'https://www.americastestkitchen.com/cookscountry/',
'only_matching': True,
}, {
'url': 'https://www.cookscountry.com/episodes/browse/season_12',
'only_matching': True,
}, {
'url': 'https://www.cookscountry.com',
'only_matching': True,
}, {
'url': 'https://www.americastestkitchen.com/cooksillustrated/',
'only_matching': True,
}, {
'url': 'https://www.cooksillustrated.com',
'only_matching': True,
}]
def _real_extract(self, url):
show_path, season_number = self._match_valid_url(url).group('show', 'id')
season_number = int(season_number)
season_number, show1, show = self._match_valid_url(url).group('season', 'show', 'show2')
show_path = ('/' + show) if show else ''
show = show or show1
season_number = int_or_none(season_number)
slug = 'cco' if show_path == '/cookscountry' else 'atk'
slug, title = {
'americastestkitchen': ('atk', 'America\'s Test Kitchen'),
'cookscountry': ('cco', 'Cook\'s Country'),
'cooksillustrated': ('cio', 'Cook\'s Illustrated'),
}[show]
season = 'Season %d' % season_number
facet_filters = [
'search_document_klass:episode',
'search_show_slug:' + slug,
]
if season_number:
playlist_id = 'season_%d' % season_number
playlist_title = 'Season %d' % season_number
facet_filters.append('search_season_list:' + playlist_title)
else:
playlist_id = show
playlist_title = title
season_search = self._download_json(
'https://y1fnzxui30-dsn.algolia.net/1/indexes/everest_search_%s_season_desc_production' % slug,
season, headers={
playlist_id, headers={
'Origin': 'https://www.americastestkitchen.com',
'X-Algolia-API-Key': '8d504d0099ed27c1b73708d22871d805',
'X-Algolia-Application-Id': 'Y1FNZXUI30',
}, query={
'facetFilters': json.dumps([
'search_season_list:' + season,
'search_document_klass:episode',
'search_show_slug:' + slug,
]),
'attributesToRetrieve': 'description,search_%s_episode_number,search_document_date,search_url,title' % slug,
'facetFilters': json.dumps(facet_filters),
'attributesToRetrieve': 'description,search_%s_episode_number,search_document_date,search_url,title,search_atk_episode_season' % slug,
'attributesToHighlight': '',
'hitsPerPage': 1000,
})
@ -162,4 +212,4 @@ class AmericasTestKitchenSeasonIE(InfoExtractor):
}
return self.playlist_result(
entries(), 'season_%d' % season_number, season)
entries(), playlist_id, playlist_title)
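The playlist selection introduced in this hunk can be sketched in isolation. This is a minimal standalone illustration that assumes the same slug table as the diff; the helper name `build_playlist_query` and the `SHOWS` constant are hypothetical and not part of yt-dlp.

```python
# Hypothetical helper (not in yt-dlp): mirrors how the new code chooses the
# playlist id/title and the Algolia facet filters for a whole show or a season.
SHOWS = {
    'americastestkitchen': ('atk', "America's Test Kitchen"),
    'cookscountry': ('cco', "Cook's Country"),
    'cooksillustrated': ('cio', "Cook's Illustrated"),
}


def build_playlist_query(show, season_number=None):
    slug, title = SHOWS[show]
    facet_filters = [
        'search_document_klass:episode',
        'search_show_slug:' + slug,
    ]
    if season_number:
        # A season URL narrows the search to that season's episodes.
        playlist_id = 'season_%d' % season_number
        playlist_title = 'Season %d' % season_number
        facet_filters.append('search_season_list:' + playlist_title)
    else:
        # A bare show URL returns every episode of the series.
        playlist_id, playlist_title = show, title
    return playlist_id, playlist_title, facet_filters
```

Without a season number the filter list stays at two entries, which appears to be why the series-level tests expect several hundred playlist entries.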


@ -0,0 +1,98 @@
from .common import InfoExtractor
from ..utils import (
clean_html,
float_or_none,
int_or_none,
str_or_none,
traverse_obj,
unified_timestamp
)
class AnchorFMEpisodeIE(InfoExtractor):
_VALID_URL = r'https?://anchor\.fm/(?P<channel_name>\w+)/(?:embed/)?episodes/[\w-]+-(?P<episode_id>\w+)'
_EMBED_REGEX = [rf'<iframe[^>]+\bsrc=[\'"](?P<url>{_VALID_URL})']
_TESTS = [{
'url': 'https://anchor.fm/lovelyti/episodes/Chrisean-Rock-takes-to-twitter-to-announce-shes-pregnant--Blueface-denies-he-is-the-father-e1tpt3d',
'info_dict': {
'id': 'e1tpt3d',
'ext': 'mp3',
'title': ' Chrisean Rock takes to twitter to announce she\'s pregnant, Blueface denies he is the father!',
'description': 'md5:207d167de3e28ceb4ddc1ebf5a30044c',
'thumbnail': 'https://s3-us-west-2.amazonaws.com/anchor-generated-image-bank/production/podcast_uploaded_nologo/1034827/1034827-1658438968460-5f3bfdf3601e8.jpg',
'duration': 624.718,
'uploader': 'Lovelyti ',
'uploader_id': '991541',
'channel': 'lovelyti',
'modified_date': '20230121',
'modified_timestamp': 1674285178,
'release_date': '20230121',
'release_timestamp': 1674285179,
'episode_id': 'e1tpt3d',
}
}, {
# embed url
'url': 'https://anchor.fm/apakatatempo/embed/episodes/S2E75-Perang-Bintang-di-Balik-Kasus-Ferdy-Sambo-dan-Ismail-Bolong-e1shjqd',
'info_dict': {
'id': 'e1shjqd',
'ext': 'mp3',
'title': 'S2E75 Perang Bintang di Balik Kasus Ferdy Sambo dan Ismail Bolong',
'description': 'md5:9e95ad9293bf00178bf8d33e9cb92c41',
'duration': 1042.008,
'thumbnail': 'https://s3-us-west-2.amazonaws.com/anchor-generated-image-bank/production/podcast_uploaded_episode400/2627805/2627805-1671590688729-4db3882ac9e4b.jpg',
'release_date': '20221221',
'release_timestamp': 1671595916,
'modified_date': '20221221',
'modified_timestamp': 1671590834,
'channel': 'apakatatempo',
'uploader': 'Podcast Tempo',
'uploader_id': '2585461',
'season': 'Season 2',
'season_number': 2,
'episode_id': 'e1shjqd',
}
}]
_WEBPAGE_TESTS = [{
'url': 'https://podcast.tempo.co/podcast/192/perang-bintang-di-balik-kasus-ferdy-sambo-dan-ismail-bolong',
'info_dict': {
'id': 'e1shjqd',
'ext': 'mp3',
'release_date': '20221221',
'duration': 1042.008,
'season': 'Season 2',
'modified_timestamp': 1671590834,
'uploader_id': '2585461',
'modified_date': '20221221',
'description': 'md5:9e95ad9293bf00178bf8d33e9cb92c41',
'season_number': 2,
'title': 'S2E75 Perang Bintang di Balik Kasus Ferdy Sambo dan Ismail Bolong',
'release_timestamp': 1671595916,
'episode_id': 'e1shjqd',
'thumbnail': 'https://s3-us-west-2.amazonaws.com/anchor-generated-image-bank/production/podcast_uploaded_episode400/2627805/2627805-1671590688729-4db3882ac9e4b.jpg',
'uploader': 'Podcast Tempo',
'channel': 'apakatatempo',
}
}]
def _real_extract(self, url):
channel_name, episode_id = self._match_valid_url(url).group('channel_name', 'episode_id')
api_data = self._download_json(f'https://anchor.fm/api/v3/episodes/{episode_id}', episode_id)
return {
'id': episode_id,
'title': traverse_obj(api_data, ('episode', 'title')),
'url': traverse_obj(api_data, ('episode', 'episodeEnclosureUrl'), ('episodeAudios', 0, 'url')),
'ext': 'mp3',
'vcodec': 'none',
'thumbnail': traverse_obj(api_data, ('episode', 'episodeImage')),
'description': clean_html(traverse_obj(api_data, ('episode', ('description', 'descriptionPreview')), get_all=False)),
'duration': float_or_none(traverse_obj(api_data, ('episode', 'duration')), 1000),
'modified_timestamp': unified_timestamp(traverse_obj(api_data, ('episode', 'modified'))),
'release_timestamp': int_or_none(traverse_obj(api_data, ('episode', 'publishOnUnixTimestamp'))),
'episode_id': episode_id,
'uploader': traverse_obj(api_data, ('creator', 'name')),
'uploader_id': str_or_none(traverse_obj(api_data, ('creator', 'userId'))),
'season_number': int_or_none(traverse_obj(api_data, ('episode', 'podcastSeasonNumber'))),
'channel': channel_name or traverse_obj(api_data, ('creator', 'vanitySlug')),
}


@ -1,8 +1,10 @@
import json
import re
import urllib.error
import urllib.parse
from .common import InfoExtractor
from .naver import NaverBaseIE
from .youtube import YoutubeBaseInfoExtractor, YoutubeIE
from ..compat import compat_HTTPError, compat_urllib_parse_unquote
from ..utils import (
@ -945,3 +947,237 @@ class YoutubeWebArchiveIE(InfoExtractor):
if not info.get('title'):
info['title'] = video_id
return info
class VLiveWebArchiveIE(InfoExtractor):
IE_NAME = 'web.archive:vlive'
IE_DESC = 'web.archive.org saved vlive videos'
_VALID_URL = r'''(?x)
(?:https?://)?web\.archive\.org/
(?:web/)?(?:(?P<date>[0-9]{14})?[0-9A-Za-z_*]*/)? # /web and the version index is optional
(?:https?(?::|%3[Aa])//)?(?:
(?:(?:www|m)\.)?vlive\.tv(?::(?:80|443))?/(?:video|embed)/(?P<id>[0-9]+) # VLive URL
)
'''
_TESTS = [{
'url': 'https://web.archive.org/web/20221221144331/http://www.vlive.tv/video/1326',
'md5': 'cc7314812855ce56de70a06a27314983',
'info_dict': {
'id': '1326',
'ext': 'mp4',
'title': "Girl's Day's Broadcast",
'creator': "Girl's Day",
'view_count': int,
'uploader_id': 'muploader_a',
'uploader_url': None,
'uploader': None,
'upload_date': '20150817',
'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
'timestamp': 1439816449,
'like_count': int,
'channel': 'Girl\'s Day',
'channel_id': 'FDF27',
'comment_count': int,
'release_timestamp': 1439818140,
'release_date': '20150817',
'duration': 1014,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://web.archive.org/web/20221221182103/http://www.vlive.tv/video/16937',
'info_dict': {
'id': '16937',
'ext': 'mp4',
'title': '첸백시 걍방',
'creator': 'EXO',
'view_count': int,
'subtitles': 'mincount:12',
'uploader_id': 'muploader_j',
'uploader_url': 'http://vlive.tv',
'uploader': None,
'upload_date': '20161112',
'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
'timestamp': 1478923074,
'like_count': int,
'channel': 'EXO',
'channel_id': 'F94BD',
'comment_count': int,
'release_timestamp': 1478924280,
'release_date': '20161112',
'duration': 906,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://web.archive.org/web/20221127190050/http://www.vlive.tv/video/101870',
'info_dict': {
'id': '101870',
'ext': 'mp4',
'title': '[ⓓ xV] “레벨이들 매력에 반해? 안 반해?” 움직이는 HD 포토 (레드벨벳:Red Velvet)',
'creator': 'Dispatch',
'view_count': int,
'subtitles': 'mincount:6',
'uploader_id': 'V__FRA08071',
'uploader_url': 'http://vlive.tv',
'uploader': None,
'upload_date': '20181130',
'thumbnail': r're:^https?://.*\.(?:jpg|png)$',
'timestamp': 1543601327,
'like_count': int,
'channel': 'Dispatch',
'channel_id': 'C796F3',
'comment_count': int,
'release_timestamp': 1543601040,
'release_date': '20181130',
'duration': 279,
},
'params': {
'skip_download': True,
},
}]
# The wayback machine has special timestamp and "mode" values:
# timestamp:
# 1 = the first capture
# 2 = the last capture
# mode:
# id_ = Identity - perform no alterations of the original resource, return it as it was archived.
_WAYBACK_BASE_URL = 'https://web.archive.org/web/2id_/'
def _download_archived_page(self, url, video_id, *, timestamp='2', **kwargs):
for retry in self.RetryManager():
try:
return self._download_webpage(f'https://web.archive.org/web/{timestamp}id_/{url}', video_id, **kwargs)
except ExtractorError as e:
if isinstance(e.cause, urllib.error.HTTPError) and e.cause.code == 404:
raise ExtractorError('Page was not archived', expected=True)
retry.error = e
continue
def _download_archived_json(self, url, video_id, **kwargs):
page = self._download_archived_page(url, video_id, **kwargs)
if not page:
raise ExtractorError('Page was not archived', expected=True)
else:
return self._parse_json(page, video_id)
def _extract_formats_from_m3u8(self, m3u8_url, params, video_id):
m3u8_doc = self._download_archived_page(m3u8_url, video_id, note='Downloading m3u8', query=params, fatal=False)
if not m3u8_doc:
return
# Rewrite the M3U8 document so that segment URLs point to the archive domain
m3u8_doc = m3u8_doc.splitlines()
url_base = m3u8_url.rsplit('/', 1)[0]
first_segment = None
for i, line in enumerate(m3u8_doc):
if not line.startswith('#'):
m3u8_doc[i] = f'{self._WAYBACK_BASE_URL}{url_base}/{line}?{urllib.parse.urlencode(params)}'
first_segment = first_segment or m3u8_doc[i]
# Segments may not have been archived. See https://web.archive.org/web/20221127190050/http://www.vlive.tv/video/101870
urlh = self._request_webpage(HEADRequest(first_segment), video_id, errnote=False,
fatal=False, note='Check first segment availability')
if urlh:
formats, subtitles = self._parse_m3u8_formats_and_subtitles('\n'.join(m3u8_doc), ext='mp4', video_id=video_id)
if subtitles:
self._report_ignoring_subs('m3u8')
return formats
# Closely follows the logic of the ArchiveTeam grab script
# See: https://github.com/ArchiveTeam/vlive-grab/blob/master/vlive.lua
def _real_extract(self, url):
video_id, url_date = self._match_valid_url(url).group('id', 'date')
webpage = self._download_archived_page(f'https://www.vlive.tv/video/{video_id}', video_id, timestamp=url_date)
player_info = self._search_json(r'__PRELOADED_STATE__\s*=', webpage, 'player info', video_id)
user_country = traverse_obj(player_info, ('common', 'userCountry'))
main_script_url = self._search_regex(r'<script\s+src="([^"]+/js/main\.[^"]+\.js)"', webpage, 'main script url')
main_script = self._download_archived_page(main_script_url, video_id, note='Downloading main script')
app_id = self._search_regex(r'appId\s*=\s*"([^"]+)"', main_script, 'app id')
inkey = self._download_archived_json(
f'https://www.vlive.tv/globalv-web/vam-web/video/v1.0/vod/{video_id}/inkey', video_id, note='Fetching inkey', query={
'appId': app_id,
'platformType': 'PC',
'gcc': user_country,
'locale': 'en_US',
}, fatal=False)
vod_id = traverse_obj(player_info, ('postDetail', 'post', 'officialVideo', 'vodId'))
vod_data = self._download_archived_json(
f'https://apis.naver.com/rmcnmv/rmcnmv/vod/play/v2.0/{vod_id}', video_id, note='Fetching vod data', query={
'key': inkey.get('inkey'),
'pid': 'rmcPlayer_16692457559726800', # partially unix time and partially random. Fixed value used by archiveteam project
'sid': '2024',
'ver': '2.0',
'devt': 'html5_pc',
'doct': 'json',
'ptc': 'https',
'sptc': 'https',
'cpt': 'vtt',
'ctls': '%7B%22visible%22%3A%7B%22fullscreen%22%3Atrue%2C%22logo%22%3Afalse%2C%22playbackRate%22%3Afalse%2C%22scrap%22%3Afalse%2C%22playCount%22%3Atrue%2C%22commentCount%22%3Atrue%2C%22title%22%3Atrue%2C%22writer%22%3Atrue%2C%22expand%22%3Afalse%2C%22subtitles%22%3Atrue%2C%22thumbnails%22%3Atrue%2C%22quality%22%3Atrue%2C%22setting%22%3Atrue%2C%22script%22%3Afalse%2C%22logoDimmed%22%3Atrue%2C%22badge%22%3Atrue%2C%22seekingTime%22%3Atrue%2C%22muted%22%3Atrue%2C%22muteButton%22%3Afalse%2C%22viewerNotice%22%3Afalse%2C%22linkCount%22%3Afalse%2C%22createTime%22%3Afalse%2C%22thumbnail%22%3Atrue%7D%2C%22clicked%22%3A%7B%22expand%22%3Afalse%2C%22subtitles%22%3Afalse%7D%7D',
'pv': '4.26.9',
'dr': '1920x1080',
'cpl': 'en_US',
'lc': 'en_US',
'adi': '%5B%7B%22type%22%3A%22pre%22%2C%22exposure%22%3Afalse%2C%22replayExposure%22%3Afalse%7D%5D',
'adu': '%2F',
'videoId': vod_id,
'cc': user_country,
})
formats = []
streams = traverse_obj(vod_data, ('streams', ...))
if len(streams) > 1:
self.report_warning('Multiple streams found. Only the first stream will be downloaded.')
stream = streams[0]
max_stream = max(
stream.get('videos') or [],
key=lambda v: traverse_obj(v, ('bitrate', 'video'), default=0), default=None)
if max_stream is not None:
params = {arg.get('name'): arg.get('value') for arg in stream.get('keys', []) if arg.get('type') == 'param'}
formats = self._extract_formats_from_m3u8(max_stream.get('source'), params, video_id) or []
# For parts of the project MP4 files were archived
max_video = max(
traverse_obj(vod_data, ('videos', 'list', ...)),
key=lambda v: traverse_obj(v, ('bitrate', 'video'), default=0), default=None)
if max_video is not None:
video_url = self._WAYBACK_BASE_URL + max_video.get('source')
urlh = self._request_webpage(HEADRequest(video_url), video_id, errnote=False,
fatal=False, note='Check video availability')
if urlh:
formats.append({'url': video_url})
return {
'id': video_id,
'formats': formats,
**traverse_obj(player_info, ('postDetail', 'post', {
'title': ('officialVideo', 'title', {str}),
'creator': ('author', 'nickname', {str}),
'channel': ('channel', 'channelName', {str}),
'channel_id': ('channel', 'channelCode', {str}),
'duration': ('officialVideo', 'playTime', {int_or_none}),
'view_count': ('officialVideo', 'playCount', {int_or_none}),
'like_count': ('officialVideo', 'likeCount', {int_or_none}),
'comment_count': ('officialVideo', 'commentCount', {int_or_none}),
'timestamp': ('officialVideo', 'createdAt', {lambda x: int_or_none(x, scale=1000)}),
'release_timestamp': ('officialVideo', 'willStartAt', {lambda x: int_or_none(x, scale=1000)}),
})),
**traverse_obj(vod_data, ('meta', {
'uploader_id': ('user', 'id', {str}),
'uploader': ('user', 'name', {str}),
'uploader_url': ('user', 'url', {url_or_none}),
'thumbnail': ('cover', 'source', {url_or_none}),
}), expected_type=lambda x: x or None),
**NaverBaseIE.process_subtitles(vod_data, lambda x: [self._WAYBACK_BASE_URL + x]),
}
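The Wayback Machine URL scheme and the playlist rewriting used by this extractor can be sketched outside yt-dlp. This is a minimal illustration using only the standard library; the function names `wayback_url` and `rewrite_m3u8` are hypothetical, and unlike the diff the sketch skips blank playlist lines.

```python
import urllib.parse

# "2" selects the last capture; "id_" requests the resource exactly as archived.
WAYBACK_BASE_URL = 'https://web.archive.org/web/2id_/'


def wayback_url(url, timestamp='2'):
    # Hypothetical helper: build an identity-mode Wayback URL for `url`.
    return f'https://web.archive.org/web/{timestamp}id_/{url}'


def rewrite_m3u8(m3u8_doc, m3u8_url, params):
    # Hypothetical helper: point every segment line at its archived copy.
    url_base = m3u8_url.rsplit('/', 1)[0]
    query = urllib.parse.urlencode(params)
    lines = m3u8_doc.splitlines()
    for i, line in enumerate(lines):
        # Assumption: blank lines are left alone (the diff does not check this).
        if line and not line.startswith('#'):
            lines[i] = f'{WAYBACK_BASE_URL}{url_base}/{line}?{query}'
    return '\n'.join(lines)
```

Tag lines starting with `#` pass through unchanged, so the rewritten text should still be parseable as a playlist.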


@ -5,7 +5,7 @@ from ..utils import extract_attributes
class BFMTVBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?bfmtv\.com/'
_VALID_URL_BASE = r'https?://(?:www\.|rmc\.)?bfmtv\.com/'
_VALID_URL_TMPL = _VALID_URL_BASE + r'(?:[^/]+/)*[^/?&#]+_%s[A-Z]-(?P<id>\d{12})\.html'
_VIDEO_BLOCK_REGEX = r'(<div[^>]+class="video_block"[^>]*>)'
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
@ -31,6 +31,9 @@ class BFMTVIE(BFMTVBaseIE):
'uploader_id': '876450610001',
'upload_date': '20201002',
'timestamp': 1601629620,
'duration': 44.757,
'tags': ['bfmactu', 'politique'],
'thumbnail': 'https://cf-images.eu-west-1.prod.boltdns.net/v1/static/876450610001/5041f4c1-bc48-4af8-a256-1b8300ad8ef0/cf2f9114-e8e2-4494-82b4-ab794ea4bc7d/1920x1080/match/image.jpg',
},
}]
@ -81,6 +84,20 @@ class BFMTVArticleIE(BFMTVBaseIE):
}, {
'url': 'https://www.bfmtv.com/sante/covid-19-oui-le-vaccin-de-pfizer-distribue-en-france-a-bien-ete-teste-sur-des-personnes-agees_AN-202101060275.html',
'only_matching': True,
}, {
'url': 'https://rmc.bfmtv.com/actualites/societe/transports/ce-n-est-plus-tout-rentable-le-bioethanol-e85-depasse-1eu-le-litre-des-automobilistes-regrettent_AV-202301100268.html',
'info_dict': {
'id': '6318445464112',
'ext': 'mp4',
'title': 'Le plein de bioéthanol fait de plus en plus mal à la pompe',
'description': None,
'uploader_id': '876630703001',
'upload_date': '20230110',
'timestamp': 1673341692,
'duration': 109.269,
'tags': ['rmc', 'show', 'apolline de malherbe', 'info', 'talk', 'matinale', 'radio'],
'thumbnail': 'https://cf-images.eu-west-1.prod.boltdns.net/v1/static/876630703001/5bef74b8-9d5e-4480-a21f-60c2e2480c46/96c88b74-f9db-45e1-8040-e199c5da216c/1920x1080/match/image.jpg'
}
}]
def _real_extract(self, url):


@ -6,6 +6,7 @@ import urllib.error
import urllib.parse
from .common import InfoExtractor, SearchInfoExtractor
from ..dependencies import Cryptodome
from ..utils import (
ExtractorError,
GeoRestrictedError,
@ -893,22 +894,15 @@ class BiliIntlBaseIE(InfoExtractor):
}
def _perform_login(self, username, password):
try:
from Cryptodome.PublicKey import RSA
from Cryptodome.Cipher import PKCS1_v1_5
except ImportError:
try:
from Crypto.PublicKey import RSA
from Crypto.Cipher import PKCS1_v1_5
except ImportError:
raise ExtractorError('pycryptodomex not found. Please install', expected=True)
if not Cryptodome:
raise ExtractorError('pycryptodomex not found. Please install', expected=True)
key_data = self._download_json(
'https://passport.bilibili.tv/x/intl/passport-login/web/key?lang=en-US', None,
note='Downloading login key', errnote='Unable to download login key')['data']
public_key = RSA.importKey(key_data['key'])
password_hash = PKCS1_v1_5.new(public_key).encrypt((key_data['hash'] + password).encode('utf-8'))
public_key = Cryptodome.PublicKey.RSA.importKey(key_data['key'])
password_hash = Cryptodome.Cipher.PKCS1_v1_5.new(public_key).encrypt((key_data['hash'] + password).encode('utf-8'))
login_post = self._download_json(
'https://passport.bilibili.tv/x/intl/passport-login/web/login/password?lang=en-US', None, data=urlencode_postdata({
'username': username,
@ -939,6 +933,19 @@ class BiliIntlIE(BiliIntlBaseIE):
'episode': 'Episode 2',
'timestamp': 1602259500,
'description': 'md5:297b5a17155eb645e14a14b385ab547e',
'chapters': [{
'start_time': 0,
'end_time': 76.242,
'title': '<Untitled Chapter 1>'
}, {
'start_time': 76.242,
'end_time': 161.161,
'title': 'Intro'
}, {
'start_time': 1325.742,
'end_time': 1403.903,
'title': 'Outro'
}],
}
}, {
# Non-Bstation page
@ -953,6 +960,19 @@ class BiliIntlIE(BiliIntlBaseIE):
'episode': 'Episode 3',
'upload_date': '20211219',
'timestamp': 1639928700,
'chapters': [{
'start_time': 0,
'end_time': 88.0,
'title': '<Untitled Chapter 1>'
}, {
'start_time': 88.0,
'end_time': 156.0,
'title': 'Intro'
}, {
'start_time': 1173.0,
'end_time': 1259.535,
'title': 'Outro'
}],
}
}, {
# Subtitle with empty content
@ -976,6 +996,20 @@ class BiliIntlIE(BiliIntlBaseIE):
'upload_date': '20221212',
'title': 'Kimetsu no Yaiba Season 3 Official Trailer - Bstation',
}
}, {
# episode id without intro and outro
'url': 'https://www.bilibili.tv/en/play/1048837/11246489',
'info_dict': {
'id': '11246489',
'ext': 'mp4',
'title': 'E1 - Operation \'Strix\' <Owl>',
'description': 'md5:b4434eb1a9a97ad2bccb779514b89f17',
'timestamp': 1649516400,
'thumbnail': 'https://pic.bstarstatic.com/ogv/62cb1de23ada17fb70fbe7bdd6ff29c29da02a64.png',
'episode': 'Episode 1',
'episode_number': 1,
'upload_date': '20220409',
},
}, {
'url': 'https://www.biliintl.com/en/play/34613/341736',
'only_matching': True,
@ -1028,12 +1062,31 @@ class BiliIntlIE(BiliIntlBaseIE):
def _real_extract(self, url):
season_id, ep_id, aid = self._match_valid_url(url).group('season_id', 'ep_id', 'aid')
video_id = ep_id or aid
chapters = None
if ep_id:
intro_ending_json = self._call_api(
f'/web/v2/ogv/play/episode?episode_id={ep_id}&platform=web',
video_id, fatal=False) or {}
if intro_ending_json.get('skip'):
# FIXME: start and end times seem to be off by a few seconds even though they are correct according to ogv.*.js
# ref: https://p.bstarstatic.com/fe-static/bstar-web-new/assets/ogv.2b147442.js
chapters = [{
'start_time': float_or_none(traverse_obj(intro_ending_json, ('skip', 'opening_start_time')), 1000),
'end_time': float_or_none(traverse_obj(intro_ending_json, ('skip', 'opening_end_time')), 1000),
'title': 'Intro'
}, {
'start_time': float_or_none(traverse_obj(intro_ending_json, ('skip', 'ending_start_time')), 1000),
'end_time': float_or_none(traverse_obj(intro_ending_json, ('skip', 'ending_end_time')), 1000),
'title': 'Outro'
}]
return {
'id': video_id,
**self._extract_video_metadata(url, video_id, season_id),
'formats': self._get_formats(ep_id=ep_id, aid=aid),
'subtitles': self.extract_subtitles(ep_id=ep_id, aid=aid),
'chapters': chapters
}
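The chapter construction added here converts the millisecond offsets from the `skip` object into seconds. A minimal standalone sketch, assuming the same four field names as the diff; the helper `build_skip_chapters` is hypothetical and stands in for the `float_or_none(..., 1000)` calls.

```python
def build_skip_chapters(skip):
    # Hypothetical helper: `skip` maps field names to millisecond offsets.
    def to_seconds(key):
        value = skip.get(key)
        return None if value is None else value / 1000

    return [{
        'start_time': to_seconds('opening_start_time'),
        'end_time': to_seconds('opening_end_time'),
        'title': 'Intro',
    }, {
        'start_time': to_seconds('ending_start_time'),
        'end_time': to_seconds('ending_end_time'),
        'title': 'Outro',
    }]
```

The updated tests also list an untitled leading chapter starting at 0, which this sketch does not produce; that entry is presumably filled in elsewhere in yt-dlp.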

167
yt_dlp/extractor/blerp.py Normal file

@ -0,0 +1,167 @@
import json
from .common import InfoExtractor
from ..utils import strip_or_none, traverse_obj
class BlerpIE(InfoExtractor):
IE_NAME = 'blerp'
_VALID_URL = r'https?://(?:www\.)?blerp\.com/soundbites/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'https://blerp.com/soundbites/6320fe8745636cb4dd677a5a',
'info_dict': {
'id': '6320fe8745636cb4dd677a5a',
'title': 'Samsung Galaxy S8 Over the Horizon Ringtone 2016',
'uploader': 'luminousaj',
'uploader_id': '5fb81e51aa66ae000c395478',
'ext': 'mp3',
'tags': ['samsung', 'galaxy', 's8', 'over the horizon', '2016', 'ringtone'],
}
}, {
'url': 'https://blerp.com/soundbites/5bc94ef4796001000498429f',
'info_dict': {
'id': '5bc94ef4796001000498429f',
'title': 'Yee',
'uploader': '179617322678353920',
'uploader_id': '5ba99cf71386730004552c42',
'ext': 'mp3',
'tags': ['YEE', 'YEET', 'wo ha haah catchy tune yee', 'yee']
}
}]
_GRAPHQL_OPERATIONNAME = "webBitePageGetBite"
_GRAPHQL_QUERY = (
'''query webBitePageGetBite($_id: MongoID!) {
web {
biteById(_id: $_id) {
...bitePageFrag
__typename
}
__typename
}
}
fragment bitePageFrag on Bite {
_id
title
userKeywords
keywords
color
visibility
isPremium
owned
price
extraReview
isAudioExists
image {
filename
original {
url
__typename
}
__typename
}
userReactions {
_id
reactions
createdAt
__typename
}
topReactions
totalSaveCount
saved
blerpLibraryType
license
licenseMetaData
playCount
totalShareCount
totalFavoriteCount
totalAddedToBoardCount
userCategory
userAudioQuality
audioCreationState
transcription
userTranscription
description
createdAt
updatedAt
author
listingType
ownerObject {
_id
username
profileImage {
filename
original {
url
__typename
}
__typename
}
__typename
}
transcription
favorited
visibility
isCurated
sourceUrl
audienceRating
strictAudienceRating
ownerId
reportObject {
reportedContentStatus
__typename
}
giphy {
mp4
gif
__typename
}
audio {
filename
original {
url
__typename
}
mp3 {
url
__typename
}
__typename
}
__typename
}
''')
def _real_extract(self, url):
audio_id = self._match_id(url)
data = {
'operationName': self._GRAPHQL_OPERATIONNAME,
'query': self._GRAPHQL_QUERY,
'variables': {
'_id': audio_id
}
}
headers = {
'Content-Type': 'application/json'
}
json_result = self._download_json('https://api.blerp.com/graphql',
audio_id, data=json.dumps(data).encode('utf-8'), headers=headers)
bite_json = json_result['data']['web']['biteById']
info_dict = {
'id': bite_json['_id'],
'url': bite_json['audio']['mp3']['url'],
'title': bite_json['title'],
'uploader': traverse_obj(bite_json, ('ownerObject', 'username'), expected_type=strip_or_none),
'uploader_id': traverse_obj(bite_json, ('ownerObject', '_id'), expected_type=strip_or_none),
'ext': 'mp3',
'tags': list(filter(None, map(strip_or_none, (traverse_obj(bite_json, 'userKeywords', expected_type=list) or []))) or None)
}
return info_dict
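The request this extractor sends is a JSON-encoded GraphQL POST body. A minimal sketch of that encoding step using only the standard library; the helper name `build_graphql_request` is hypothetical, and the query string passed in the usage line is a placeholder rather than the full query from the diff.

```python
import json


def build_graphql_request(operation_name, query, bite_id):
    # Hypothetical helper: assemble the body and headers for a GraphQL POST.
    body = json.dumps({
        'operationName': operation_name,
        'query': query,
        'variables': {'_id': bite_id},
    }).encode('utf-8')
    headers = {'Content-Type': 'application/json'}
    return body, headers


# Placeholder query text; the real extractor sends the full _GRAPHQL_QUERY.
body, headers = build_graphql_request(
    'webBitePageGetBite', 'query webBitePageGetBite($_id: MongoID!) { ... }',
    '6320fe8745636cb4dd677a5a')
```

Passing the body as bytes mirrors the `data=json.dumps(data).encode('utf-8')` argument in the diff.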

102
yt_dlp/extractor/boxcast.py Normal file

@ -0,0 +1,102 @@
from .common import InfoExtractor
from ..utils import (
js_to_json,
traverse_obj,
unified_timestamp
)
class BoxCastVideoIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://boxcast\.tv/(?:
view-embed/|
channel/\w+\?(?:[^#]+&)?b=|
video-portal/(?:\w+/){2}
)(?P<id>[\w-]+)'''
_EMBED_REGEX = [r'<iframe[^>]+src=["\'](?P<url>https?://boxcast\.tv/view-embed/[\w-]+)']
_TESTS = [{
'url': 'https://boxcast.tv/view-embed/in-the-midst-of-darkness-light-prevails-an-interdisciplinary-symposium-ozmq5eclj50ujl4bmpwx',
'info_dict': {
'id': 'da1eqqgkacngd5djlqld',
'ext': 'mp4',
'thumbnail': r're:https?://uploads\.boxcast\.com/(?:[\w+-]+/){3}.+\.png$',
'title': 'In the Midst of Darkness Light Prevails: An Interdisciplinary Symposium',
'release_timestamp': 1670686812,
'release_date': '20221210',
'uploader_id': 're8w0v8hohhvpqtbskpe',
'uploader': 'Children\'s Health Defense',
}
}, {
'url': 'https://boxcast.tv/video-portal/vctwevwntun3o0ikq7af/rvyblnn0fxbfjx5nwxhl/otbpltj2kzkveo2qz3ad',
'info_dict': {
'id': 'otbpltj2kzkveo2qz3ad',
'ext': 'mp4',
'uploader_id': 'vctwevwntun3o0ikq7af',
'uploader': 'Legacy Christian Church',
'title': 'The Quest | 1: Beginner\'s Bay | Jamie Schools',
'thumbnail': r're:https?://uploads.boxcast.com/(?:[\w-]+/){3}.+\.jpg'
}
}, {
'url': 'https://boxcast.tv/channel/z03fqwaeaby5lnaawox2?b=ssihlw5gvfij2by8tkev',
'info_dict': {
'id': 'ssihlw5gvfij2by8tkev',
'ext': 'mp4',
'thumbnail': r're:https?://uploads.boxcast.com/(?:[\w-]+/){3}.+\.jpg$',
'release_date': '20230101',
'uploader_id': 'ds25vaazhlu4ygcvffid',
'release_timestamp': 1672543201,
'uploader': 'Lighthouse Ministries International - Beltsville, Maryland',
'description': 'md5:ac23e3d01b0b0be592e8f7fe0ec3a340',
'title': 'New Year\'s Eve CROSSOVER Service at LHMI | December 31, 2022',
}
}]
_WEBPAGE_TESTS = [{
'url': 'https://childrenshealthdefense.eu/live-stream/',
'info_dict': {
'id': 'da1eqqgkacngd5djlqld',
'ext': 'mp4',
'thumbnail': r're:https?://uploads\.boxcast\.com/(?:[\w+-]+/){3}.+\.png$',
'title': 'In the Midst of Darkness Light Prevails: An Interdisciplinary Symposium',
'release_timestamp': 1670686812,
'release_date': '20221210',
'uploader_id': 're8w0v8hohhvpqtbskpe',
'uploader': 'Children\'s Health Defense',
}
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
webpage_json_data = self._search_json(
r'var\s*BOXCAST_PRELOAD\s*=', webpage, 'broadcast data', display_id,
transform_source=js_to_json, default={})
# Ref: https://support.boxcast.com/en/articles/4235158-build-a-custom-viewer-experience-with-boxcast-api
broadcast_json_data = (
traverse_obj(webpage_json_data, ('broadcast', 'data'))
or self._download_json(f'https://api.boxcast.com/broadcasts/{display_id}', display_id))
view_json_data = (
traverse_obj(webpage_json_data, ('view', 'data'))
or self._download_json(f'https://api.boxcast.com/broadcasts/{display_id}/view',
display_id, fatal=False) or {})
formats, subtitles = [], {}
if view_json_data.get('status') == 'recorded':
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
view_json_data['playlist'], display_id)
return {
'id': str(broadcast_json_data['id']),
'title': (broadcast_json_data.get('name')
or self._html_search_meta(['og:title', 'twitter:title'], webpage)),
'description': (broadcast_json_data.get('description')
or self._html_search_meta(['og:description', 'twitter:description'], webpage)
or None),
'thumbnail': (broadcast_json_data.get('preview')
or self._html_search_meta(['og:image', 'twitter:image'], webpage)),
'formats': formats,
'subtitles': subtitles,
'release_timestamp': unified_timestamp(broadcast_json_data.get('streamed_at')),
'uploader': broadcast_json_data.get('account_name'),
'uploader_id': broadcast_json_data.get('account_id'),
}


@ -1,9 +1,5 @@
from .common import InfoExtractor
from ..utils import (
traverse_obj,
float_or_none,
int_or_none
)
from ..utils import float_or_none, int_or_none, make_archive_id, traverse_obj
class CallinIE(InfoExtractor):
@ -35,6 +31,54 @@ class CallinIE(InfoExtractor):
'episode_number': 1,
'episode_id': '218b979630a35ead12c6fd096f2996c56c37e4d0dc1f6dc0feada32dcf7b31cd'
}
}, {
'url': 'https://www.callin.com/episode/fcc-commissioner-brendan-carr-on-elons-PrumRdSQJW',
'md5': '14ede27ee2c957b7e4db93140fc0745c',
'info_dict': {
'id': 'c3dab47f237bf953d180d3f243477a84302798be0e0b29bc9ade6d60a69f04f5',
'ext': 'ts',
'title': 'FCC Commissioner Brendan Carr on Elons Starlink',
'description': 'Or, why the government doesnt like SpaceX',
'channel': 'The Pull Request',
'channel_url': 'https://callin.com/show/the-pull-request-ucnDJmEKAa',
'duration': 3182.472,
'series_id': '7e9c23156e4aecfdcaef46bfb2ed7ca268509622ec006c0f0f25d90e34496638',
'uploader_url': 'http://thepullrequest.com',
'upload_date': '20220902',
'episode': 'FCC Commissioner Brendan Carr on Elons Starlink',
'display_id': 'fcc-commissioner-brendan-carr-on-elons-PrumRdSQJW',
'series': 'The Pull Request',
'channel_id': '7e9c23156e4aecfdcaef46bfb2ed7ca268509622ec006c0f0f25d90e34496638',
'view_count': int,
'uploader': 'Antonio García Martínez',
'thumbnail': 'https://d1z76fhpoqkd01.cloudfront.net/shows/legacy/1ade9142625344045dc17cf523469ced1d93610762f4c886d06aa190a2f979e8.png',
'episode_id': 'c3dab47f237bf953d180d3f243477a84302798be0e0b29bc9ade6d60a69f04f5',
'timestamp': 1662100688.005,
}
}, {
'url': 'https://www.callin.com/episode/episode-81-elites-melt-down-over-student-debt-lzxMidUnjA',
'md5': '16f704ddbf82a27e3930533b12062f07',
'info_dict': {
'id': '8d06f869798f93a7814e380bceabea72d501417e620180416ff6bd510596e83c',
'ext': 'ts',
'title': 'Episode 81- Elites MELT DOWN over Student Debt Victory? Rumble in NYC?',
'description': 'Lets talk todays episode about the primary election shake up in NYC and the elites melting down over student debt cancelation.',
'channel': 'The DEBRIEF With Briahna Joy Gray',
'channel_url': 'https://callin.com/show/the-debrief-with-briahna-joy-gray-siiFDzGegm',
'duration': 10043.16,
'series_id': '61cea58444465fd26674069703bd8322993bc9e5b4f1a6d0872690554a046ff7',
'uploader_url': 'http://patreon.com/badfaithpodcast',
'upload_date': '20220826',
'episode': 'Episode 81- Elites MELT DOWN over Student Debt Victory? Rumble in NYC?',
'display_id': 'episode-',
'series': 'The DEBRIEF With Briahna Joy Gray',
'channel_id': '61cea58444465fd26674069703bd8322993bc9e5b4f1a6d0872690554a046ff7',
'view_count': int,
'uploader': 'Briahna Gray',
'thumbnail': 'https://d1z76fhpoqkd01.cloudfront.net/shows/legacy/461ea0d86172cb6aff7d6c80fd49259cf5e64bdf737a4650f8bc24cf392ca218.png',
'episode_id': '8d06f869798f93a7814e380bceabea72d501417e620180416ff6bd510596e83c',
'timestamp': 1661476708.282,
}
}]
def try_get_user_name(self, d):
@ -86,6 +130,7 @@ class CallinIE(InfoExtractor):
return {
'id': id,
'_old_archive_ids': [make_archive_id(self, display_id.rsplit('-', 1)[-1])],
'display_id': display_id,
'title': title,
'formats': formats,


@ -1,9 +1,5 @@
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
url_or_none,
)
from ..utils import int_or_none, url_or_none
class CamModelsIE(InfoExtractor):
@ -17,32 +13,11 @@ class CamModelsIE(InfoExtractor):
def _real_extract(self, url):
user_id = self._match_id(url)
webpage = self._download_webpage(
url, user_id, headers=self.geo_verification_headers())
manifest_root = self._html_search_regex(
r'manifestUrlRoot=([^&\']+)', webpage, 'manifest', default=None)
if not manifest_root:
ERRORS = (
("I'm offline, but let's stay connected", 'This user is currently offline'),
('in a private show', 'This user is in a private show'),
('is currently performing LIVE', 'This model is currently performing live'),
)
for pattern, message in ERRORS:
if pattern in webpage:
error = message
expected = True
break
else:
error = 'Unable to find manifest URL root'
expected = False
raise ExtractorError(error, expected=expected)
manifest = self._download_json(
'%s%s.json' % (manifest_root, user_id), user_id)
'https://manifest-server.naiadsystems.com/live/s:%s.json' % user_id, user_id)
formats = []
thumbnails = []
for format_id, format_dict in manifest['formats'].items():
if not isinstance(format_dict, dict):
continue
@ -82,12 +57,20 @@ class CamModelsIE(InfoExtractor):
'quality': -10,
})
else:
if format_id == 'jpeg':
thumbnails.append({
'url': f['url'],
'width': f['width'],
'height': f['height'],
'format_id': f['format_id'],
})
continue
formats.append(f)
return {
'id': user_id,
'title': user_id,
'thumbnails': thumbnails,
'is_live': True,
'formats': formats,
'age_limit': 18

View File

@ -9,22 +9,22 @@ from ..utils import (
class ClypIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?clyp\.it/(?P<id>[a-z0-9]+)'
_TESTS = [{
'url': 'https://clyp.it/ojz2wfah',
'md5': '1d4961036c41247ecfdcc439c0cddcbb',
'url': 'https://clyp.it/iynkjk4b',
'md5': '4bc6371c65210e7b372097fce4d92441',
'info_dict': {
'id': 'ojz2wfah',
'ext': 'mp3',
'title': 'Krisson80 - bits wip wip',
'description': '#Krisson80BitsWipWip #chiptune\n#wip',
'duration': 263.21,
'timestamp': 1443515251,
'upload_date': '20150929',
'id': 'iynkjk4b',
'ext': 'ogg',
'title': 'research',
'description': '#Research',
'duration': 51.278,
'timestamp': 1435524981,
'upload_date': '20150628',
},
}, {
'url': 'https://clyp.it/b04p1odi?token=b0078e077e15835845c528a44417719d',
'info_dict': {
'id': 'b04p1odi',
'ext': 'mp3',
'ext': 'ogg',
'title': 'GJ! (Reward Edit)',
'description': 'Metal Resistance (THE ONE edition)',
'duration': 177.789,
@ -34,6 +34,17 @@ class ClypIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
'url': 'https://clyp.it/v42214lc',
'md5': '4aca4dfc3236fb6d6ddc4ea08314f33f',
'info_dict': {
'id': 'v42214lc',
'ext': 'wav',
'title': 'i dont wanna go (old version)',
'duration': 113.528,
'timestamp': 1607348505,
'upload_date': '20201207',
},
}]
def _real_extract(self, url):
@ -59,8 +70,20 @@ class ClypIE(InfoExtractor):
'url': format_url,
'format_id': format_id,
'vcodec': 'none',
'acodec': ext.lower(),
})
page = self._download_webpage(url, video_id=audio_id)
wav_url = self._html_search_regex(
r'var\s*wavStreamUrl\s*=\s*["\'](?P<url>https?://[^\'"]+)', page, 'url', default=None)
if wav_url:
formats.append({
'url': wav_url,
'format_id': 'wavStreamUrl',
'vcodec': 'none',
'acodec': 'wav',
})
title = metadata['Title']
description = metadata.get('Description')
duration = float_or_none(metadata.get('Duration'))
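
The `wavStreamUrl` lookup added to `ClypIE` in this hunk can be tried in isolation. This is a minimal sketch: the regex is copied from the diff, while `find_wav_url` and the sample page are hypothetical stand-ins for `_html_search_regex` and a real Clyp page (the real helper also unescapes HTML entities, which this sketch does not).

```python
import re

# Regex copied from the hunk above
WAV_URL_RE = r'var\s*wavStreamUrl\s*=\s*["\'](?P<url>https?://[^\'"]+)'


def find_wav_url(page):
    """Return the WAV stream URL embedded in the page script, or None."""
    match = re.search(WAV_URL_RE, page)
    return match.group('url') if match else None


# Hypothetical page fragment, for illustration only
sample_page = '<script>var wavStreamUrl = "https://audio.example.com/track.wav?sig=1";</script>'
wav_url = find_wav_url(sample_page)
missing = find_wav_url('<script>var other = 1;</script>')
```

Returning `None` on a miss mirrors the `default=None` in the diff, so the WAV format is only appended when the variable is actually present.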

View File

@ -32,6 +32,7 @@ from ..utils import (
FormatSorter,
GeoRestrictedError,
GeoUtils,
HEADRequest,
LenientJSONDecoder,
RegexNotFoundError,
RetryManager,
@ -81,6 +82,7 @@ from ..utils import (
update_url_query,
url_basename,
url_or_none,
urlhandle_detect_ext,
urljoin,
variadic,
xpath_element,
@ -218,6 +220,17 @@ class InfoExtractor:
* no_resume The server does not support resuming the
(HTTP or RTMP) download. Boolean.
* has_drm The format has DRM and cannot be downloaded. Boolean
* extra_param_to_segment_url A query string to append to each
fragment's URL, or to update each existing query string
with. Only applied by the native HLS/DASH downloaders.
* hls_aes A dictionary of HLS AES-128 decryption information
used by the native HLS downloader to override the
values in the media playlist when an '#EXT-X-KEY' tag
is present in the playlist:
* uri The URI from which the key will be downloaded
* key The key (as hex) used to decrypt fragments.
If `key` is given, any key URI will be ignored
* iv The IV (as hex) used to decrypt fragments
* downloader_options A dictionary of downloader options
(For internal use only)
* http_chunk_size Chunk size for HTTP downloads
@ -1325,7 +1338,7 @@ class InfoExtractor:
# Helper functions for extracting OpenGraph info
@staticmethod
def _og_regexes(prop):
content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?))'
content_re = r'content=(?:"([^"]+?)"|\'([^\']+?)\'|\s*([^\s"\'=<>`]+?)(?=\s|/?>))'
property_re = (r'(?:name|property)=(?:\'og%(sep)s%(prop)s\'|"og%(sep)s%(prop)s"|\s*og%(sep)s%(prop)s\b)'
% {'prop': re.escape(prop), 'sep': '(?:&#x3A;|[:-])'})
template = r'<meta[^>]+?%s[^>]+?%s'
@ -1657,11 +1670,8 @@ class InfoExtractor:
if js is None:
return {}
args = dict(zip(arg_keys.split(','), arg_vals.split(',')))
for key, val in args.items():
if val in ('undefined', 'void 0'):
args[key] = 'null'
args = dict(zip(arg_keys.split(','), map(json.dumps, self._parse_json(
f'[{arg_vals}]', video_id, transform_source=js_to_json, fatal=fatal) or ())))
ret = self._parse_json(js, video_id, transform_source=functools.partial(js_to_json, vars=args), fatal=fatal)
return traverse_obj(ret, traverse) or {}
@ -2178,13 +2188,23 @@ class InfoExtractor:
return self._parse_m3u8_vod_duration(m3u8_vod or '', video_id)
def _parse_m3u8_vod_duration(self, m3u8_vod, video_id):
if '#EXT-X-PLAYLIST-TYPE:VOD' not in m3u8_vod:
if '#EXT-X-ENDLIST' not in m3u8_vod:
return None
return int(sum(
float(line[len('#EXTINF:'):].split(',')[0])
for line in m3u8_vod.splitlines() if line.startswith('#EXTINF:'))) or None
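
The `#EXT-X-ENDLIST` check in `_parse_m3u8_vod_duration` can be exercised without the extractor class. The sketch below reimplements the same logic as a free function; `parse_m3u8_vod_duration` and the sample playlists are illustrative assumptions, not part of yt-dlp.

```python
def parse_m3u8_vod_duration(m3u8_text):
    """Sum the #EXTINF durations of a finished playlist, truncated to an int."""
    # Without #EXT-X-ENDLIST the playlist may still grow (live), so give up
    if '#EXT-X-ENDLIST' not in m3u8_text:
        return None
    total = sum(
        float(line[len('#EXTINF:'):].split(',')[0])
        for line in m3u8_text.splitlines() if line.startswith('#EXTINF:'))
    return int(total) or None


# Hypothetical playlists, for illustration only
finished = '\n'.join([
    '#EXTM3U',
    '#EXTINF:9.5,', 'seg0.ts',
    '#EXTINF:10.25,', 'seg1.ts',
    '#EXTINF:4.0,', 'seg2.ts',
    '#EXT-X-ENDLIST',
])
still_live = '\n'.join(['#EXTM3U', '#EXTINF:9.5,', 'seg0.ts'])

vod_duration = parse_m3u8_vod_duration(finished)
live_duration = parse_m3u8_vod_duration(still_live)
```

Keying on `#EXT-X-ENDLIST` rather than `#EXT-X-PLAYLIST-TYPE:VOD` also covers finished playlists that omit the optional playlist-type tag.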
def _extract_mpd_vod_duration(
self, mpd_url, video_id, note=None, errnote=None, data=None, headers={}, query={}):
mpd_doc = self._download_xml(
mpd_url, video_id,
note='Downloading MPD VOD manifest' if note is None else note,
errnote='Failed to download VOD manifest' if errnote is None else errnote,
fatal=False, data=data, headers=headers, query=query) or {}
return int_or_none(parse_duration(mpd_doc.get('mediaPresentationDuration')))
@staticmethod
def _xpath_ns(path, namespace=None):
if not namespace:
@ -2311,7 +2331,8 @@ class InfoExtractor:
height = int_or_none(medium.get('height'))
proto = medium.get('proto')
ext = medium.get('ext')
src_ext = determine_ext(src)
src_ext = determine_ext(src, default_ext=None) or ext or urlhandle_detect_ext(
self._request_webpage(HEADRequest(src), video_id, note='Requesting extension info', fatal=False))
streamer = medium.get('streamer') or base
if proto == 'rtmp' or streamer.startswith('rtmp'):

View File

@ -20,8 +20,12 @@ class CrunchyrollBaseIE(InfoExtractor):
_NETRC_MACHINE = 'crunchyroll'
params = None
@property
def is_logged_in(self):
return self._get_cookies(self._LOGIN_URL).get('etp_rt')
def _perform_login(self, username, password):
if self._get_cookies(self._LOGIN_URL).get('etp_rt'):
if self.is_logged_in:
return
upsell_response = self._download_json(
@ -46,7 +50,7 @@ class CrunchyrollBaseIE(InfoExtractor):
}).encode('ascii'))
if login_response['code'] != 'ok':
raise ExtractorError('Login failed. Server message: %s' % login_response['message'], expected=True)
if not self._get_cookies(self._LOGIN_URL).get('etp_rt'):
if not self.is_logged_in:
raise ExtractorError('Login succeeded but did not set etp_rt cookie')
def _get_embedded_json(self, webpage, display_id):
@ -116,6 +120,7 @@ class CrunchyrollBetaIE(CrunchyrollBaseIE):
'episode': 'To the Future',
'episode_number': 73,
'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg$',
'chapters': 'count:2',
},
'params': {'skip_download': 'm3u8', 'format': 'all[format_id~=hardsub]'},
}, {
@ -136,6 +141,7 @@ class CrunchyrollBetaIE(CrunchyrollBaseIE):
'episode': 'Porter Robinson presents Shelter the Animation',
'episode_number': 0,
'thumbnail': r're:^https://www.crunchyroll.com/imgsrv/.*\.jpeg$',
'chapters': 'count:0',
},
'params': {'skip_download': True},
'skip': 'Video is Premium only',
@ -154,8 +160,11 @@ class CrunchyrollBetaIE(CrunchyrollBaseIE):
episode_response = self._download_json(
f'{api_domain}/cms/v2{bucket}/episodes/{internal_id}', display_id,
note='Retrieving episode metadata', query=params)
if episode_response.get('is_premium_only') and not episode_response.get('playback'):
raise ExtractorError('This video is for premium members only.', expected=True)
if episode_response.get('is_premium_only') and not bucket.endswith('crunchyroll'):
if self.is_logged_in:
raise ExtractorError('This video is for premium members only', expected=True)
else:
self.raise_login_required('This video is for premium members only')
stream_response = self._download_json(
f'{api_domain}{episode_response["__links__"]["streams"]["href"]}', display_id,
@ -209,6 +218,17 @@ class CrunchyrollBetaIE(CrunchyrollBaseIE):
f['quality'] = hardsub_preference(hardsub_lang.lower())
formats.extend(adaptive_formats)
chapters = None
# if no intro chapter is available, a 403 without usable data is returned
intro_chapter = self._download_json(f'https://static.crunchyroll.com/datalab-intro-v2/{internal_id}.json',
display_id, fatal=False, errnote=False)
if isinstance(intro_chapter, dict):
chapters = [{
'title': 'Intro',
'start_time': float_or_none(intro_chapter.get('startTime')),
'end_time': float_or_none(intro_chapter.get('endTime'))
}]
return {
'id': internal_id,
'title': '%s Episode %s %s' % (
@ -235,6 +255,7 @@ class CrunchyrollBetaIE(CrunchyrollBaseIE):
'ext': subtitle_data.get('format')
}] for lang, subtitle_data in get_streams('subtitles')
},
'chapters': chapters
}
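
The intro-chapter handling added to `CrunchyrollBetaIE` reduces to a small transformation. The following is a sketch under stated assumptions: `intro_chapters` and the local `float_or_none` are hypothetical stand-ins (the real `float_or_none` lives in `yt_dlp.utils`), and the input dicts are made up.

```python
def float_or_none(value):
    """Simplified stand-in for yt_dlp.utils.float_or_none."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def intro_chapters(intro):
    """Build a one-entry chapter list, or None when no usable JSON came back."""
    # A failed non-fatal download yields a non-dict value, so only dicts count
    if not isinstance(intro, dict):
        return None
    return [{
        'title': 'Intro',
        'start_time': float_or_none(intro.get('startTime')),
        'end_time': float_or_none(intro.get('endTime')),
    }]


with_intro = intro_chapters({'startTime': 12.5, 'endTime': '102'})
without_intro = intro_chapters(False)
```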

View File

@ -1,6 +1,7 @@
import time
import hashlib
import re
import urllib
from .common import InfoExtractor
from ..utils import (
@ -13,7 +14,7 @@ from ..utils import (
class DouyuTVIE(InfoExtractor):
IE_DESC = '斗鱼'
_VALID_URL = r'https?://(?:www\.)?douyu(?:tv)?\.com/(?:[^/]+/)*(?P<id>[A-Za-z0-9]+)'
_VALID_URL = r'https?://(?:www\.)?douyu(?:tv)?\.com/(topic/\w+\?rid=|(?:[^/]+/))*(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'http://www.douyutv.com/iseven',
'info_dict': {
@ -22,7 +23,7 @@ class DouyuTVIE(InfoExtractor):
'ext': 'flv',
'title': 're:^清晨醒脑!根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.png',
'uploader': '7师傅',
'is_live': True,
},
@ -37,7 +38,7 @@ class DouyuTVIE(InfoExtractor):
'ext': 'flv',
'title': 're:^小漠从零单排记——CSOL2躲猫猫 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 'md5:746a2f7a253966a06755a912f0acc0d2',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.png',
'uploader': 'douyu小漠',
'is_live': True,
},
@ -53,13 +54,28 @@ class DouyuTVIE(InfoExtractor):
'ext': 'flv',
'title': 're:^清晨醒脑!根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.png',
'uploader': '7师傅',
'is_live': True,
},
'params': {
'skip_download': True,
},
}, {
'url': 'https://www.douyu.com/topic/ydxc?rid=6560603',
'info_dict': {
'id': '6560603',
'display_id': '6560603',
'ext': 'flv',
'title': 're:^阿余:新年快乐恭喜发财! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': 're:.*直播时间.*',
'thumbnail': r're:^https?://.*\.png',
'uploader': '阿涛皎月Carry',
'live_status': 'is_live',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.douyu.com/xiaocang',
'only_matching': True,
@ -79,28 +95,24 @@ class DouyuTVIE(InfoExtractor):
room_id = self._html_search_regex(
r'"room_id\\?"\s*:\s*(\d+),', page, 'room id')
# Grab metadata from mobile API
# Grab metadata from API
params = {
'aid': 'wp',
'client_sys': 'wp',
'time': int(time.time()),
}
params['auth'] = hashlib.md5(
f'room/{video_id}?{urllib.parse.urlencode(params)}zNzMV1y4EMxOHS6I5WKm'.encode()).hexdigest()
room = self._download_json(
'http://m.douyu.com/html5/live?roomId=%s' % room_id, video_id,
note='Downloading room info')['data']
f'http://www.douyutv.com/api/v1/room/{room_id}', video_id,
note='Downloading room info', query=params)['data']
# 1 = live, 2 = offline
if room.get('show_status') == '2':
raise ExtractorError('Live stream is offline', expected=True)
# Grab the URL from PC client API
# The m3u8 url from mobile API requires re-authentication every 5 minutes
tt = int(time.time())
signContent = 'lapi/live/thirdPart/getPlay/%s?aid=pcclient&rate=0&time=%d9TUk5fjjUjg9qIMH3sdnh' % (room_id, tt)
sign = hashlib.md5(signContent.encode('ascii')).hexdigest()
video_url = self._download_json(
'http://coapi.douyucdn.cn/lapi/live/thirdPart/getPlay/' + room_id,
video_id, note='Downloading video URL info',
query={'rate': 0}, headers={
'auth': sign,
'time': str(tt),
'aid': 'pcclient'
})['data']['live_url']
video_url = urljoin('https://hls3-akm.douyucdn.cn/', self._search_regex(r'(live/.*)', room['hls_url'], 'URL'))
formats, subs = self._extract_m3u8_formats_and_subtitles(video_url, room_id)
title = unescapeHTML(room['room_name'])
description = room.get('show_details')
@ -110,12 +122,13 @@ class DouyuTVIE(InfoExtractor):
return {
'id': room_id,
'display_id': video_id,
'url': video_url,
'title': title,
'description': description,
'thumbnail': thumbnail,
'uploader': uploader,
'is_live': True,
'subtitles': subs,
'formats': formats,
}
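
The `auth` parameter that the new Douyu code attaches to the room API request is an MD5 digest over the request path, the urlencoded query, and a fixed salt. A minimal sketch, assuming only the standard library: the salt and parameter names are copied from the diff, while `sign_room_request` and the sample inputs are hypothetical.

```python
import hashlib
import urllib.parse

# Salt copied from the diff; everything else here is illustrative
SALT = 'zNzMV1y4EMxOHS6I5WKm'


def sign_room_request(room, timestamp):
    """Return the query parameters including the derived 'auth' digest."""
    params = {'aid': 'wp', 'client_sys': 'wp', 'time': timestamp}
    # The digest covers the params as they stand before 'auth' is added
    payload = f'room/{room}?{urllib.parse.urlencode(params)}{SALT}'
    params['auth'] = hashlib.md5(payload.encode()).hexdigest()
    return params


signed = sign_room_request('iseven', 1676600000)
```

Note that the diff signs with `video_id` but requests the URL with `room_id`; the sketch takes a single `room` argument and makes no claim about which value the server actually checks.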

View File

@ -184,9 +184,10 @@ class DRTVIE(InfoExtractor):
data = self._download_json(
programcard_url, video_id, 'Downloading video JSON', query=query)
supplementary_data = self._download_json(
SERIES_API % f'/episode/{raw_video_id}', raw_video_id,
default={}) if re.search(r'_\d+$', raw_video_id) else {}
supplementary_data = {}
if re.search(r'_\d+$', raw_video_id):
supplementary_data = self._download_json(
SERIES_API % f'/episode/{raw_video_id}', raw_video_id, fatal=False) or {}
title = str_or_none(data.get('Title')) or re.sub(
r'\s*\|\s*(?:TV\s*\|\s*DR|DRTV)$', '',

36
yt_dlp/extractor/ebay.py Normal file
View File

@ -0,0 +1,36 @@
from .common import InfoExtractor
from ..utils import remove_end
class EbayIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ebay\.com/itm/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.ebay.com/itm/194509326719',
'info_dict': {
'id': '194509326719',
'ext': 'mp4',
'title': 'WiFi internal antenna adhesive for wifi 2.4GHz wifi 5 wifi 6 wifi 6E full bands',
},
'params': {'skip_download': 'm3u8'}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_json = self._search_json(r'"video":', webpage, 'video json', video_id)
formats = []
for key, url in video_json['playlistMap'].items():
if key == 'HLS':
formats.extend(self._extract_m3u8_formats(url, video_id, fatal=False))
elif key == 'DASH':
formats.extend(self._extract_mpd_formats(url, video_id, fatal=False))
else:
self.report_warning(f'Unsupported format {key}', video_id)
return {
'id': video_id,
'title': remove_end(self._html_extract_title(webpage), ' | eBay'),
'formats': formats
}

View File

@ -61,14 +61,43 @@ class EmbedlyIE(InfoExtractor):
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'http://www.permacultureetc.com/2022/12/comment-greffer-facilement-les-arbres-fruitiers.html',
'info_dict': {
'id': 'pfUK_ADTvgY',
'ext': 'mp4',
'title': 'Comment greffer facilement les arbres fruitiers ? (mois par mois)',
'description': 'md5:d3a876995e522f138aabb48e040bfb4c',
'view_count': int,
'upload_date': '20221210',
'comment_count': int,
'live_status': 'not_live',
'channel_id': 'UCsM4_jihNFYe4CtSkXvDR-Q',
'channel_follower_count': int,
'tags': ['permaculture', 'jardinage', 'dekarz', 'autonomie', 'greffe', 'fruitiers', 'arbres', 'jardin forêt', 'forêt comestible', 'damien'],
'playable_in_embed': True,
'uploader': 'permaculture agroécologie etc...',
'channel': 'permaculture agroécologie etc...',
'thumbnail': 'https://i.ytimg.com/vi/pfUK_ADTvgY/sddefault.jpg',
'duration': 1526,
'channel_url': 'https://www.youtube.com/channel/UCsM4_jihNFYe4CtSkXvDR-Q',
'age_limit': 0,
'uploader_id': 'permacultureetc',
'like_count': int,
'uploader_url': 'http://www.youtube.com/user/permacultureetc',
'categories': ['Education'],
'availability': 'public',
},
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):
# Bypass suitable check
def _extract_from_webpage(cls, url, webpage):
# Bypass "ie=cls" and suitable check
for mobj in re.finditer(r'class=["\']embedly-card["\'][^>]href=["\'](?P<url>[^"\']+)', webpage):
yield mobj.group('url')
yield cls.url_result(mobj.group('url'))
for mobj in re.finditer(r'class=["\']embedly-embed["\'][^>]src=["\'][^"\']*url=(?P<url>[^&]+)', webpage):
yield urllib.parse.unquote(mobj.group('url'))
yield cls.url_result(urllib.parse.unquote(mobj.group('url')))
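
The second pattern in `_extract_from_webpage` pulls a percent-encoded target out of the embed's `url=` query parameter and decodes it. A small sketch of that step: the regex is copied from the diff, while `embedded_urls` and the sample markup are assumptions for illustration.

```python
import re
import urllib.parse

# Regex copied from the hunk above
EMBED_RE = r'class=["\']embedly-embed["\'][^>]src=["\'][^"\']*url=(?P<url>[^&]+)'


def embedded_urls(webpage):
    """Yield the decoded target URL of every embedly-embed iframe found."""
    for mobj in re.finditer(EMBED_RE, webpage):
        yield urllib.parse.unquote(mobj.group('url'))


# Hypothetical markup, for illustration only
sample = ('<iframe class="embedly-embed" '
          'src="//cdn.embedly.com/widgets/media.html'
          '?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DpfUK_ADTvgY&key=abc"></iframe>')
found = list(embedded_urls(sample))
```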
def _real_extract(self, url):
qs = parse_qs(url)

View File

@ -52,6 +52,7 @@ class FreesoundIE(InfoExtractor):
tags_str = get_element_by_class('tags', webpage)
tags = re.findall(r'<a[^>]+>([^<]+)', tags_str) if tags_str else None
audio_url = re.sub(r'^https?://freesound\.org(https?://)', r'\1', audio_url)
audio_urls = [audio_url]
LQ_FORMAT = '-lq.mp3'

View File

@ -48,7 +48,7 @@ class GameJoltBaseIE(InfoExtractor):
post_hash_id, note='Downloading comments list page %d' % page)
if not comments_data.get('comments'):
break
for comment in traverse_obj(comments_data, (('comments', 'childComments'), ...), expected_type=dict, default=[]):
for comment in traverse_obj(comments_data, (('comments', 'childComments'), ...), expected_type=dict):
yield {
'id': comment['id'],
'text': self._parse_content_as_text(

View File

@ -864,20 +864,6 @@ class GenericIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg$',
},
},
{
# JWPlayer config passed as variable
'url': 'http://www.txxx.com/videos/3326530/ariele/',
'info_dict': {
'id': '3326530_hq',
'ext': 'mp4',
'title': 'ARIELE | Tube Cup',
'uploader': 'www.txxx.com',
'age_limit': 18,
},
'params': {
'skip_download': True,
}
},
{
# Video.js embed, multiple formats
'url': 'http://ortcam.com/solidworks-урок-6-настройка-чертежа_33f9b7351.html',
@ -2637,11 +2623,11 @@ class GenericIE(InfoExtractor):
# Look for generic KVS player (before json-ld bc of some urls that break otherwise)
found = self._search_regex((
r'<script\b[^>]+?\bsrc\s*=\s*(["\'])https?://(?:\S+?/)+kt_player\.js\?v=(?P<ver>\d+(?:\.\d+)+)\1[^>]*>',
r'kt_player\s*\(\s*(["\'])(?:(?!\1)[\w\W])+\1\s*,\s*(["\'])https?://(?:\S+?/)+kt_player\.swf\?v=(?P<ver>\d+(?:\.\d+)+)\2\s*,',
r'<script\b[^>]+?\bsrc\s*=\s*(["\'])https?://(?:(?!\1)[^?#])+/kt_player\.js\?v=(?P<ver>\d+(?:\.\d+)+)\1[^>]*>',
r'kt_player\s*\(\s*(["\'])(?:(?!\1)[\w\W])+\1\s*,\s*(["\'])https?://(?:(?!\2)[^?#])+/kt_player\.swf\?v=(?P<ver>\d+(?:\.\d+)+)\2\s*,',
), webpage, 'KVS player', group='ver', default=False)
if found:
self.report_detected('KWS Player')
self.report_detected('KVS Player')
if found.split('.')[0] not in ('4', '5', '6'):
self.report_warning(f'Untested major version ({found}) in player engine - download may fail.')
return [self._extract_kvs(url, webpage, video_id)]

View File

@ -76,11 +76,11 @@ class GoPlayIE(InfoExtractor):
}
api = self._download_json(
f'https://api.viervijfzes.be/content/{video_id}',
video_id, headers={'Authorization': self._id_token})
f'https://api.goplay.be/web/v1/videos/long-form/{video_id}',
video_id, headers={'Authorization': 'Bearer %s' % self._id_token})
formats, subs = self._extract_m3u8_formats_and_subtitles(
api['video']['S'], video_id, ext='mp4', m3u8_id='HLS')
api['manifestUrls']['hls'], video_id, ext='mp4', m3u8_id='HLS')
info_dict.update({
'id': video_id,

View File

@ -1,5 +1,3 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
@ -39,15 +37,27 @@ class HiDiveIE(InfoExtractor):
form = self._search_regex(
r'(?s)<form[^>]+action="/account/login"[^>]*>(.+?)</form>',
webpage, 'login form', default=None)
if not form: # logged in
if not form:
return
data = self._hidden_inputs(form)
data.update({
'Email': username,
'Password': password,
})
self._download_webpage(
login_webpage = self._download_webpage(
self._LOGIN_URL, None, 'Logging in', data=urlencode_postdata(data))
# If the user has multiple profiles on their account, select one. For now pick the first profile.
profile_id = self._search_regex(r'<button [^>]+?data-profile-id="(\w+)"', login_webpage, 'profile_id', default=None)
if profile_id is None:
return # If only one profile, Hidive auto-selects it
profile_id_hash = self._search_regex(r'<button [^>]+?data-hash="(\w+)"', login_webpage, 'profile_id_hash')
self._request_webpage(
'https://www.hidive.com/ajax/chooseprofile', None,
data=urlencode_postdata({
'profileId': profile_id,
'hash': profile_id_hash,
'returnUrl': '/dashboard'
}))
def _call_api(self, video_id, title, key, data={}, **kwargs):
data = {
@ -60,26 +70,6 @@ class HiDiveIE(InfoExtractor):
'https://www.hidive.com/play/settings', video_id,
data=urlencode_postdata(data), **kwargs) or {}
def _extract_subtitles_from_rendition(self, rendition, subtitles, parsed_urls):
for cc_file in rendition.get('ccFiles', []):
cc_url = url_or_none(try_get(cc_file, lambda x: x[2]))
# name is used since we cant distinguish subs with same language code
cc_lang = try_get(cc_file, (lambda x: x[1].replace(' ', '-').lower(), lambda x: x[0]), str)
if cc_url not in parsed_urls and cc_lang:
parsed_urls.add(cc_url)
subtitles.setdefault(cc_lang, []).append({'url': cc_url})
def _get_subtitles(self, url, video_id, title, key, parsed_urls):
webpage = self._download_webpage(url, video_id, fatal=False) or ''
subtitles = {}
for caption in set(re.findall(r'data-captions=\"([^\"]+)\"', webpage)):
renditions = self._call_api(
video_id, title, key, {'Captions': caption}, fatal=False,
note=f'Downloading {caption} subtitle information').get('renditions') or {}
for rendition_id, rendition in renditions.items():
self._extract_subtitles_from_rendition(rendition, subtitles, parsed_urls)
return subtitles
def _real_extract(self, url):
video_id, title, key = self._match_valid_url(url).group('id', 'title', 'key')
settings = self._call_api(video_id, title, key)
@ -104,10 +94,20 @@ class HiDiveIE(InfoExtractor):
f['format_note'] = f'{version}, {extra}'
formats.extend(frmt)
subtitles = {}
for rendition_id, rendition in settings['renditions'].items():
audio, version, extra = rendition_id.split('_')
for cc_file in rendition.get('ccFiles') or []:
cc_url = url_or_none(try_get(cc_file, lambda x: x[2]))
cc_lang = try_get(cc_file, (lambda x: x[1].replace(' ', '-').lower(), lambda x: x[0]), str)
if cc_url not in parsed_urls and cc_lang:
parsed_urls.add(cc_url)
subtitles.setdefault(cc_lang, []).append({'url': cc_url})
return {
'id': video_id,
'title': video_id,
'subtitles': self.extract_subtitles(url, video_id, title, key, parsed_urls),
'subtitles': subtitles,
'formats': formats,
'series': title,
'season_number': int_or_none(

View File

@ -1,5 +1,6 @@
import hashlib
import random
import re
from ..compat import compat_urlparse, compat_b64decode
@ -37,7 +38,7 @@ class HuyaLiveIE(InfoExtractor):
}]
_RESOLUTION = {
'蓝光4M': {
'蓝光': {
'width': 1920,
'height': 1080,
},
@ -76,11 +77,15 @@ class HuyaLiveIE(InfoExtractor):
if re_secret:
fm, ss = self.encrypt(params, stream_info, stream_name)
for si in stream_data.get('vMultiStreamInfo'):
display_name, bitrate = re.fullmatch(
r'(.+?)(?:(\d+)M)?', si.get('sDisplayName')).groups()
rate = si.get('iBitRate')
if rate:
params['ratio'] = rate
else:
params.pop('ratio', None)
if bitrate:
rate = int(bitrate) * 1000
if re_secret:
params['wsSecret'] = hashlib.md5(
'_'.join([fm, params['u'], stream_name, ss, params['wsTime']]).encode()).hexdigest()
@ -90,7 +95,7 @@ class HuyaLiveIE(InfoExtractor):
'tbr': rate,
'url': update_url_query(f'{stream_url}/{stream_name}.{stream_info.get("sFlvUrlSuffix")}',
query=params),
**self._RESOLUTION.get(si.get('sDisplayName'), {}),
**self._RESOLUTION.get(display_name, {}),
})
return {
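
The `re.fullmatch` call added to `HuyaLiveIE` splits a display name such as `蓝光4M` into a quality label and an optional bitrate suffix, which is why the `_RESOLUTION` key changed from `蓝光4M` to `蓝光`. A minimal sketch: the regex is copied from the diff, and `split_display_name` is a hypothetical helper name.

```python
import re


def split_display_name(display_name):
    """Split e.g. '蓝光4M' into ('蓝光', 4000); names without a suffix give None."""
    name, mbps = re.fullmatch(r'(.+?)(?:(\d+)M)?', display_name).groups()
    # The diff converts the megabit suffix to kbit/s with a factor of 1000
    return name, (int(mbps) * 1000 if mbps else None)


with_rate = split_display_name('蓝光4M')
two_digits = split_display_name('蓝光10M')
without_rate = split_display_name('超清')
```

The lazy `(.+?)` matters here: it lets the optional `(\d+)M` group claim the trailing digits instead of having them absorbed into the label.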

View File

@ -0,0 +1,32 @@
from .common import InfoExtractor
from ..utils import js_to_json, traverse_obj
class MonsterSirenHypergryphMusicIE(InfoExtractor):
_VALID_URL = r'https?://monster-siren\.hypergryph\.com/music/(?P<id>\d+)'
_TESTS = [{
'url': 'https://monster-siren.hypergryph.com/music/514562',
'info_dict': {
'id': '514562',
'ext': 'wav',
'artist': ['塞壬唱片-MSR'],
'album': 'Flame Shadow',
'title': 'Flame Shadow',
}
}]
def _real_extract(self, url):
audio_id = self._match_id(url)
webpage = self._download_webpage(url, audio_id)
json_data = self._search_json(
r'window\.g_initialProps\s*=', webpage, 'data', audio_id, transform_source=js_to_json)
return {
'id': audio_id,
'title': traverse_obj(json_data, ('player', 'songDetail', 'name')),
'url': traverse_obj(json_data, ('player', 'songDetail', 'sourceUrl')),
'ext': 'wav',
'vcodec': 'none',
'artist': traverse_obj(json_data, ('player', 'songDetail', 'artists')),
'album': traverse_obj(json_data, ('musicPlay', 'albumDetail', 'name'))
}

View File

@ -1,17 +1,20 @@
import re
import urllib.error
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse,
)
from ..compat import compat_parse_qs
from ..utils import (
HEADRequest,
ExtractorError,
determine_ext,
error_to_compat_str,
extract_attributes,
int_or_none,
merge_dicts,
parse_iso8601,
strip_or_none,
try_get,
traverse_obj,
url_or_none,
urljoin,
)
@ -20,14 +23,90 @@ class IGNBaseIE(InfoExtractor):
return self._download_json(
'http://apis.ign.com/{0}/v3/{0}s/slug/{1}'.format(self._PAGE_TYPE, slug), slug)
def _checked_call_api(self, slug):
try:
return self._call_api(slug)
except ExtractorError as e:
if isinstance(e.cause, urllib.error.HTTPError) and e.cause.code == 404:
e.cause.args = e.cause.args or [
e.cause.geturl(), e.cause.getcode(), e.cause.reason]
raise ExtractorError(
'Content not found: expired?', cause=e.cause,
expected=True)
raise
def _extract_video_info(self, video, fatal=True):
video_id = video['videoId']
formats = []
refs = traverse_obj(video, 'refs', expected_type=dict) or {}
m3u8_url = url_or_none(refs.get('m3uUrl'))
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
f4m_url = url_or_none(refs.get('f4mUrl'))
if f4m_url:
formats.extend(self._extract_f4m_formats(
f4m_url, video_id, f4m_id='hds', fatal=False))
for asset in (video.get('assets') or []):
asset_url = url_or_none(asset.get('url'))
if not asset_url:
continue
formats.append({
'url': asset_url,
'tbr': int_or_none(asset.get('bitrate'), 1000),
'fps': int_or_none(asset.get('frame_rate')),
'height': int_or_none(asset.get('height')),
'width': int_or_none(asset.get('width')),
})
mezzanine_url = traverse_obj(
video, ('system', 'mezzanineUrl'), expected_type=url_or_none)
if mezzanine_url:
formats.append({
'ext': determine_ext(mezzanine_url, 'mp4'),
'format_id': 'mezzanine',
'quality': 1,
'url': mezzanine_url,
})
thumbnails = traverse_obj(
video, ('thumbnails', ..., {'url': 'url'}), expected_type=url_or_none)
tags = traverse_obj(
video, ('tags', ..., 'displayName'),
expected_type=lambda x: x.strip() or None)
metadata = traverse_obj(video, 'metadata', expected_type=dict) or {}
title = traverse_obj(
metadata, 'longTitle', 'title', 'name',
expected_type=lambda x: x.strip() or None)
return {
'id': video_id,
'title': title,
'description': strip_or_none(metadata.get('description')),
'timestamp': parse_iso8601(metadata.get('publishDate')),
'duration': int_or_none(metadata.get('duration')),
'thumbnails': thumbnails,
'formats': formats,
'tags': tags,
}
class IGNIE(IGNBaseIE):
"""
Extractor for some of the IGN sites, like www.ign.com, es.ign.com de.ign.com.
Some videos of it.ign.com are also supported
"""
_VALID_URL = r'https?://(?:.+?\.ign|www\.pcmag)\.com/videos/(?:\d{4}/\d{2}/\d{2}/)?(?P<id>[^/?&#]+)'
_VIDEO_PATH_RE = r'/(?:\d{4}/\d{2}/\d{2}/)?(?P<id>.+?)'
_PLAYLIST_PATH_RE = r'(?:/?\?(?P<filt>[^&#]+))?'
_VALID_URL = (
r'https?://(?:.+?\.ign|www\.pcmag)\.com/videos(?:%s)'
% '|'.join((_VIDEO_PATH_RE + r'(?:[/?&#]|$)', _PLAYLIST_PATH_RE)))
IE_NAME = 'ign.com'
_PAGE_TYPE = 'video'
@ -42,7 +121,13 @@ class IGNIE(IGNBaseIE):
'timestamp': 1370440800,
'upload_date': '20130605',
'tags': 'count:9',
}
'display_id': 'the-last-of-us-review',
'thumbnail': 'https://assets1.ignimgs.com/vid/thumbnails/user/2014/03/26/lastofusreviewmimig2.jpg',
'duration': 440,
},
'params': {
'nocheckcertificate': True,
},
}, {
'url': 'http://www.pcmag.com/videos/2015/01/06/010615-whats-new-now-is-gogo-snooping-on-your-data',
'md5': 'f1581a6fe8c5121be5b807684aeac3f6',
@ -54,84 +139,48 @@ class IGNIE(IGNBaseIE):
'timestamp': 1420571160,
'upload_date': '20150106',
'tags': 'count:4',
}
},
'skip': '404 Not Found',
}, {
'url': 'https://www.ign.com/videos/is-a-resident-evil-4-remake-on-the-way-ign-daily-fix',
'only_matching': True,
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):
grids = re.findall(
r'''(?s)<section\b[^>]+\bclass\s*=\s*['"](?:[\w-]+\s+)*?content-feed-grid(?!\B|-)[^>]+>(.+?)</section[^>]*>''',
webpage)
return filter(None,
(urljoin(url, m.group('path')) for m in re.finditer(
r'''<a\b[^>]+\bhref\s*=\s*('|")(?P<path>/videos%s)\1'''
% cls._VIDEO_PATH_RE, grids[0] if grids else '')))
def _real_extract(self, url):
display_id = self._match_id(url)
video = self._call_api(display_id)
video_id = video['videoId']
metadata = video['metadata']
title = metadata.get('longTitle') or metadata.get('title') or metadata['name']
display_id, filt = self._match_valid_url(url).group('id', 'filt')
if display_id:
return self._extract_video(url, display_id)
return self._extract_playlist(url, filt or 'all')
formats = []
refs = video.get('refs') or {}
def _extract_playlist(self, url, display_id):
webpage = self._download_webpage(url, display_id)
m3u8_url = refs.get('m3uUrl')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
return self.playlist_result(
(self.url_result(u, self.ie_key())
for u in self._extract_embed_urls(url, webpage)),
playlist_id=display_id)
f4m_url = refs.get('f4mUrl')
if f4m_url:
formats.extend(self._extract_f4m_formats(
f4m_url, video_id, f4m_id='hds', fatal=False))
def _extract_video(self, url, display_id):
video = self._checked_call_api(display_id)
for asset in (video.get('assets') or []):
asset_url = asset.get('url')
if not asset_url:
continue
formats.append({
'url': asset_url,
'tbr': int_or_none(asset.get('bitrate'), 1000),
'fps': int_or_none(asset.get('frame_rate')),
'height': int_or_none(asset.get('height')),
'width': int_or_none(asset.get('width')),
})
info = self._extract_video_info(video)
mezzanine_url = try_get(video, lambda x: x['system']['mezzanineUrl'])
if mezzanine_url:
formats.append({
'ext': determine_ext(mezzanine_url, 'mp4'),
'format_id': 'mezzanine',
'quality': 1,
'url': mezzanine_url,
})
thumbnails = []
for thumbnail in (video.get('thumbnails') or []):
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({
'url': thumbnail_url,
})
tags = []
for tag in (video.get('tags') or []):
display_name = tag.get('displayName')
if not display_name:
continue
tags.append(display_name)
return {
'id': video_id,
'title': title,
'description': strip_or_none(metadata.get('description')),
'timestamp': parse_iso8601(metadata.get('publishDate')),
'duration': int_or_none(metadata.get('duration')),
return merge_dicts({
'display_id': display_id,
'thumbnails': thumbnails,
'formats': formats,
'tags': tags,
}
}, info)
class IGNVideoIE(InfoExtractor):
class IGNVideoIE(IGNBaseIE):
_VALID_URL = r'https?://.+?\.ign\.com/(?:[a-z]{2}/)?[^/]+/(?P<id>\d+)/(?:video|trailer)/'
_TESTS = [{
'url': 'http://me.ign.com/en/videos/112203/video/how-hitman-aims-to-be-different-than-every-other-s',
@ -143,7 +192,16 @@ class IGNVideoIE(InfoExtractor):
'description': 'Taking out assassination targets in Hitman has never been more stylish.',
'timestamp': 1444665600,
'upload_date': '20151012',
}
'display_id': '112203',
'thumbnail': 'https://sm.ign.com/ign_me/video/h/how-hitman/how-hitman-aims-to-be-different-than-every-other-s_8z14.jpg',
'duration': 298,
'tags': 'count:13',
},
'expected_warnings': ['HTTP Error 400: Bad Request'],
}, {
'url': 'http://me.ign.com/ar/angry-birds-2/106533/video/lrd-ldyy-lwl-lfylm-angry-birds',
'only_matching': True,
@ -163,22 +221,38 @@ class IGNVideoIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
req = HEADRequest(url.rsplit('/', 1)[0] + '/embed')
url = self._request_webpage(req, video_id).geturl()
parsed_url = urllib.parse.urlparse(url)
embed_url = urllib.parse.urlunparse(
parsed_url._replace(path=parsed_url.path.rsplit('/', 1)[0] + '/embed'))
webpage, urlh = self._download_webpage_handle(embed_url, video_id)
new_url = urlh.geturl()
ign_url = compat_parse_qs(
compat_urllib_parse_urlparse(url).query).get('url', [None])[0]
urllib.parse.urlparse(new_url).query).get('url', [None])[-1]
if ign_url:
return self.url_result(ign_url, IGNIE.ie_key())
return self.url_result(url)
video = self._search_regex(r'(<div\b[^>]+\bdata-video-id\s*=\s*[^>]+>)', webpage, 'video element', fatal=False)
if not video:
if new_url == url:
raise ExtractorError('Redirect loop: ' + url)
return self.url_result(new_url)
video = extract_attributes(video)
video_data = video.get('data-settings') or '{}'
video_data = self._parse_json(video_data, video_id)['video']
info = self._extract_video_info(video_data)
return merge_dicts({
'display_id': video_id,
}, info)
class IGNArticleIE(IGNBaseIE):
_VALID_URL = r'https?://.+?\.ign\.com/(?:articles(?:/\d{4}/\d{2}/\d{2})?|(?:[a-z]{2}/)?feature/\d+)/(?P<id>[^/?&#]+)'
_VALID_URL = r'https?://.+?\.ign\.com/(?:articles(?:/\d{4}/\d{2}/\d{2})?|(?:[a-z]{2}/)?(?:[\w-]+/)*?feature/\d+)/(?P<id>[^/?&#]+)'
_PAGE_TYPE = 'article'
_TESTS = [{
'url': 'http://me.ign.com/en/feature/15775/100-little-things-in-gta-5-that-will-blow-your-mind',
'info_dict': {
'id': '524497489e4e8ff5848ece34',
'id': '72113',
'title': '100 Little Things in GTA 5 That Will Blow Your Mind',
},
'playlist': [
@@ -186,34 +260,43 @@ class IGNArticleIE(IGNBaseIE):
'info_dict': {
'id': '5ebbd138523268b93c9141af17bec937',
'ext': 'mp4',
'title': 'GTA 5 Video Review',
'title': 'Grand Theft Auto V Video Review',
'description': 'Rockstar drops the mic on this generation of games. Watch our review of the masterly Grand Theft Auto V.',
'timestamp': 1379339880,
'upload_date': '20130916',
'tags': 'count:12',
'thumbnail': 'https://assets1.ignimgs.com/thumbs/userUploaded/2021/8/16/gta-v-heistsjpg-e94705-1629138553533.jpeg',
'display_id': 'grand-theft-auto-v-video-review',
'duration': 501,
},
},
{
'info_dict': {
'id': '638672ee848ae4ff108df2a296418ee2',
'ext': 'mp4',
'title': '26 Twisted Moments from GTA 5 in Slow Motion',
'title': 'GTA 5 In Slow Motion',
'description': 'The twisted beauty of GTA 5 in stunning slow motion.',
'timestamp': 1386878820,
'upload_date': '20131212',
'duration': 202,
'tags': 'count:25',
'display_id': 'gta-5-in-slow-motion',
'thumbnail': 'https://assets1.ignimgs.com/vid/thumbnails/user/2013/11/03/GTA-SLO-MO-1.jpg',
},
},
],
'params': {
'playlist_items': '2-3',
'skip_download': True,
},
'expected_warnings': ['Backend fetch failed'],
}, {
'url': 'http://www.ign.com/articles/2014/08/15/rewind-theater-wild-trailer-gamescom-2014?watch',
'info_dict': {
'id': '53ee806780a81ec46e0790f8',
'title': 'Rewind Theater - Wild Trailer Gamescom 2014',
},
'playlist_count': 2,
'playlist_count': 1,
'expected_warnings': ['Backend fetch failed'],
}, {
# videoId pattern
'url': 'http://www.ign.com/articles/2017/06/08/new-ducktales-short-donalds-birthday-doesnt-go-as-planned',
@@ -236,18 +319,84 @@ class IGNArticleIE(IGNBaseIE):
'only_matching': True,
}]
def _checked_call_api(self, slug):
try:
return self._call_api(slug)
except ExtractorError as e:
if isinstance(e.cause, urllib.error.HTTPError):
e.cause.args = e.cause.args or [
e.cause.geturl(), e.cause.getcode(), e.cause.reason]
if e.cause.code == 404:
raise ExtractorError(
'Content not found: expired?', cause=e.cause,
expected=True)
elif e.cause.code == 503:
self.report_warning(error_to_compat_str(e.cause))
return
raise
def _real_extract(self, url):
display_id = self._match_id(url)
article = self._call_api(display_id)
article = self._checked_call_api(display_id)
def entries():
media_url = try_get(article, lambda x: x['mediaRelations'][0]['media']['metadata']['url'])
if media_url:
yield self.url_result(media_url, IGNIE.ie_key())
for content in (article.get('content') or []):
for video_url in re.findall(r'(?:\[(?:ignvideo\s+url|youtube\s+clip_id)|<iframe[^>]+src)="([^"]+)"', content):
yield self.url_result(video_url)
if article:
# obsolete ?
def entries():
media_url = traverse_obj(
article, ('mediaRelations', 0, 'media', 'metadata', 'url'),
expected_type=url_or_none)
if media_url:
yield self.url_result(media_url, IGNIE.ie_key())
for content in (article.get('content') or []):
for video_url in re.findall(r'(?:\[(?:ignvideo\s+url|youtube\s+clip_id)|<iframe[^>]+src)="([^"]+)"', content):
if url_or_none(video_url):
yield self.url_result(video_url)
return self.playlist_result(
entries(), article.get('articleId'),
traverse_obj(
article, ('metadata', 'headline'),
expected_type=lambda x: x.strip() or None))
webpage = self._download_webpage(url, display_id)
playlist_id = self._html_search_meta('dable:item_id', webpage, default=None)
if playlist_id:
def entries():
for m in re.finditer(
r'''(?s)<object\b[^>]+\bclass\s*=\s*("|')ign-videoplayer\1[^>]*>(?P<params>.+?)</object''',
webpage):
flashvars = self._search_regex(
r'''(<param\b[^>]+\bname\s*=\s*("|')flashvars\2[^>]*>)''',
m.group('params'), 'flashvars', default='')
flashvars = compat_parse_qs(extract_attributes(flashvars).get('value') or '')
v_url = url_or_none((flashvars.get('url') or [None])[-1])
if v_url:
yield self.url_result(v_url)
else:
playlist_id = self._search_regex(
r'''\bdata-post-id\s*=\s*("|')(?P<id>[\da-f]+)\1''',
webpage, 'id', group='id', default=None)
nextjs_data = self._search_nextjs_data(webpage, display_id)
def entries():
for player in traverse_obj(
nextjs_data,
('props', 'apolloState', 'ROOT_QUERY', lambda k, _: k.startswith('videoPlayerProps('), '__ref')):
# skip promo links (which may not always be served, e.g. on GH CI servers)
if traverse_obj(nextjs_data,
('props', 'apolloState', player.replace('PlayerProps', 'ModernContent')),
expected_type=dict):
continue
video = traverse_obj(nextjs_data, ('props', 'apolloState', player), expected_type=dict) or {}
info = self._extract_video_info(video, fatal=False)
if info:
yield merge_dicts({
'display_id': display_id,
}, info)
return self.playlist_result(
entries(), article.get('articleId'),
strip_or_none(try_get(article, lambda x: x['metadata']['headline'])))
entries(), playlist_id or display_id,
re.sub(r'\s+-\s+IGN\s*$', '', self._og_search_title(webpage, default='')) or None)


@@ -585,7 +585,7 @@ class IqIE(InfoExtractor):
'langCode': self._get_cookie('lang', 'en_us'),
'deviceId': self._get_cookie('QC005', '')
}, fatal=False)
ut_list = traverse_obj(vip_data, ('data', 'all_vip', ..., 'vipType'), expected_type=str_or_none, default=[])
ut_list = traverse_obj(vip_data, ('data', 'all_vip', ..., 'vipType'), expected_type=str_or_none)
else:
ut_list = ['0']
@@ -617,7 +617,7 @@ class IqIE(InfoExtractor):
self.report_warning('This preview video is limited%s' % format_field(preview_time, None, ' to %s seconds'))
# TODO: Extract audio-only formats
for bid in set(traverse_obj(initial_format_data, ('program', 'video', ..., 'bid'), expected_type=str_or_none, default=[])):
for bid in set(traverse_obj(initial_format_data, ('program', 'video', ..., 'bid'), expected_type=str_or_none)):
dash_path = dash_paths.get(bid)
if not dash_path:
self.report_warning(f'Unknown format id: {bid}. It is currently not being extracted')
@@ -628,7 +628,7 @@ class IqIE(InfoExtractor):
fatal=False), 'data', expected_type=dict)
video_format = traverse_obj(format_data, ('program', 'video', lambda _, v: str(v['bid']) == bid),
expected_type=dict, default=[], get_all=False) or {}
expected_type=dict, get_all=False) or {}
extracted_formats = []
if video_format.get('m3u8Url'):
extracted_formats.extend(self._extract_m3u8_formats(
@@ -669,7 +669,7 @@ class IqIE(InfoExtractor):
})
formats.extend(extracted_formats)
for sub_format in traverse_obj(initial_format_data, ('program', 'stl', ...), expected_type=dict, default=[]):
for sub_format in traverse_obj(initial_format_data, ('program', 'stl', ...), expected_type=dict):
lang = self._LID_TAGS.get(str_or_none(sub_format.get('lid')), sub_format.get('_name'))
subtitles.setdefault(lang, []).extend([{
'ext': format_ext,


@@ -2,11 +2,8 @@ import json
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
qualities,
)
from ..dependencies import Cryptodome
from ..utils import ExtractorError, int_or_none, qualities
class IviIE(InfoExtractor):
@@ -94,18 +91,8 @@ class IviIE(InfoExtractor):
for site in (353, 183):
content_data = (data % site).encode()
if site == 353:
try:
from Cryptodome.Cipher import Blowfish
from Cryptodome.Hash import CMAC
pycryptodome_found = True
except ImportError:
try:
from Crypto.Cipher import Blowfish
from Crypto.Hash import CMAC
pycryptodome_found = True
except ImportError:
pycryptodome_found = False
continue
if not Cryptodome:
continue
timestamp = (self._download_json(
self._LIGHT_URL, video_id,
@@ -118,7 +105,8 @@ class IviIE(InfoExtractor):
query = {
'ts': timestamp,
'sign': CMAC.new(self._LIGHT_KEY, timestamp.encode() + content_data, Blowfish).hexdigest(),
'sign': Cryptodome.Hash.CMAC.new(self._LIGHT_KEY, timestamp.encode() + content_data,
Cryptodome.Cipher.Blowfish).hexdigest(),
}
else:
query = {}
@@ -138,7 +126,7 @@ class IviIE(InfoExtractor):
extractor_msg = 'Video %s does not exist'
elif site == 353:
continue
elif not pycryptodome_found:
elif not Cryptodome:
raise ExtractorError('pycryptodomex not found. Please install', expected=True)
elif message:
extractor_msg += ': ' + message


@@ -0,0 +1,31 @@
from .common import InfoExtractor
from ..utils import update_url
class KommunetvIE(InfoExtractor):
_VALID_URL = r'https://(\w+)\.kommunetv\.no/archive/(?P<id>\w+)'
_TEST = {
'url': 'https://oslo.kommunetv.no/archive/921',
'md5': '5f102be308ee759be1e12b63d5da4bbc',
'info_dict': {
'id': '921',
'title': 'Bystyremøte',
'ext': 'mp4'
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
headers = {
'Accept': 'application/json'
}
data = self._download_json('https://oslo.kommunetv.no/api/streams?streamType=1&id=%s' % video_id, video_id, headers=headers)
title = data['stream']['title']
file = data['playlist'][0]['playlist'][0]['file']
url = update_url(file, query=None, fragment=None)
formats = self._extract_m3u8_formats(url, video_id, ext='mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)
return {
'id': video_id,
'formats': formats,
'title': title
}


@@ -1,11 +1,5 @@
from .dailymotion import DailymotionIE
from .common import InfoExtractor
from ..utils import (
parse_iso8601,
try_get,
)
import re
class MoviepilotIE(InfoExtractor):
@@ -16,21 +10,21 @@ class MoviepilotIE(InfoExtractor):
_TESTS = [{
'url': 'https://www.moviepilot.de/movies/interstellar-2/',
'info_dict': {
'id': 'x7xdut5',
'id': 'x7xdpkk',
'display_id': 'interstellar-2',
'ext': 'mp4',
'title': 'Interstellar',
'thumbnail': r're:https://\w+\.dmcdn\.net/v/SaXev1VvzitVZMFsR/x720',
'timestamp': 1400491705,
'description': 'md5:7dfc5c1758e7322a7346934f1f0c489c',
'thumbnail': r're:https://\w+\.dmcdn\.net/v/SaV-q1ZganMw4HVXg/x1080',
'timestamp': 1605010596,
'description': 'md5:0ae9cb452af52610c9ffc60f2fd0474c',
'uploader': 'Moviepilot',
'like_count': int,
'view_count': int,
'uploader_id': 'x6nd9k',
'upload_date': '20140519',
'duration': 140,
'upload_date': '20201110',
'duration': 97,
'age_limit': 0,
'tags': ['Alle Trailer', 'Movie', 'Third Party'],
'tags': ['Alle Trailer', 'Movie', 'Verleih'],
},
}, {
'url': 'https://www.moviepilot.de/movies/interstellar-2/trailer',
@@ -45,14 +39,14 @@ class MoviepilotIE(InfoExtractor):
'display_id': 'queen-slim',
'title': 'Queen & Slim',
'ext': 'mp4',
'thumbnail': r're:https://\w+\.dmcdn\.net/v/SbUM71WtomSjVmI_q/x720',
'timestamp': 1571838685,
'description': 'md5:73058bcd030aa12d991e4280d65fbebe',
'thumbnail': r're:https://\w+\.dmcdn\.net/v/SbUM71ZeG2N975lf2/x1080',
'timestamp': 1605555825,
'description': 'md5:83228bb86f5367dd181447fdc4873989',
'uploader': 'Moviepilot',
'like_count': int,
'view_count': int,
'uploader_id': 'x6nd9k',
'upload_date': '20191023',
'upload_date': '20201116',
'duration': 138,
'age_limit': 0,
'tags': ['Movie', 'Verleih', 'Neue Trailer'],
@@ -72,12 +66,12 @@ class MoviepilotIE(InfoExtractor):
'display_id': 'muellers-buero',
'title': 'Müllers Büro',
'ext': 'mp4',
'description': 'md5:57501251c05cdc61ca314b7633e0312e',
'timestamp': 1287584475,
'description': 'md5:4d23a8f4ca035196cd4523863c4fe5a4',
'timestamp': 1604958457,
'age_limit': 0,
'duration': 82,
'upload_date': '20101020',
'thumbnail': r're:https://\w+\.dmcdn\.net/v/SaMes1WfAm1d6maq_/x720',
'upload_date': '20201109',
'thumbnail': r're:https://\w+\.dmcdn\.net/v/SaMes1Zg3lxLv9j5u/x1080',
'uploader': 'Moviepilot',
'like_count': int,
'view_count': int,
@@ -91,22 +85,13 @@ class MoviepilotIE(InfoExtractor):
webpage = self._download_webpage(f'https://www.moviepilot.de/movies/{video_id}/trailer', video_id)
duration = try_get(
re.match(r'P(?P<hours>\d+)H(?P<mins>\d+)M(?P<secs>\d+)S',
self._html_search_meta('duration', webpage, fatal=False) or ''),
lambda mobj: sum(float(x) * y for x, y in zip(mobj.groups(), (3600, 60, 1))))
# _html_search_meta is not used since we don't want name=description to match
description = self._html_search_regex(
'<meta[^>]+itemprop="description"[^>]+content="([^>"]+)"', webpage, 'description', fatal=False)
clip = self._search_nextjs_data(webpage, video_id)['props']['initialProps']['pageProps']
return {
'_type': 'url_transparent',
'ie_key': DailymotionIE.ie_key(),
'display_id': video_id,
'title': self._og_search_title(webpage),
'url': self._html_search_meta('embedURL', webpage),
'thumbnail': self._html_search_meta('thumbnailURL', webpage),
'description': description,
'duration': duration,
'timestamp': parse_iso8601(self._html_search_meta('uploadDate', webpage), delimiter=' ')
'title': clip.get('title'),
'url': f'https://www.dailymotion.com/video/{clip["videoRemoteId"]}',
'description': clip.get('summary'),
}


@@ -1,5 +1,16 @@
import re
from .common import InfoExtractor
from ..utils import js_to_json
from ..utils import (
MONTH_NAMES,
clean_html,
get_element_by_class,
get_element_by_id,
int_or_none,
js_to_json,
qualities,
unified_strdate,
)
class MyVideoGeIE(InfoExtractor):
@@ -11,37 +22,50 @@ class MyVideoGeIE(InfoExtractor):
'id': '3941048',
'ext': 'mp4',
'title': 'The best prikol',
'upload_date': '20200611',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'md5:d72addd357b0dd914e704781f7f777d8',
'description': 'md5:5c0371f540f5888d603ebfedd46b6df3'
}
'uploader': 'chixa33',
'description': 'md5:5b067801318e33c2e6eea4ab90b1fdd3',
},
}
_MONTH_NAMES_KA = ['იანვარი', 'თებერვალი', 'მარტი', 'აპრილი', 'მაისი', 'ივნისი', 'ივლისი', 'აგვისტო', 'სექტემბერი', 'ოქტომბერი', 'ნოემბერი', 'დეკემბერი']
_quality = staticmethod(qualities(('SD', 'HD')))
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<h1[^>]*>([^<]+)</h1>', webpage, 'title')
description = self._og_search_description(webpage)
thumbnail = self._html_search_meta(['og:image'], webpage)
uploader = self._search_regex(r'<a[^>]+class="mv_user_name"[^>]*>([^<]+)<', webpage, 'uploader', fatal=False)
title = (
self._og_search_title(webpage, default=None)
or clean_html(get_element_by_class('my_video_title', webpage))
or self._html_search_regex(r'<title\b[^>]*>([^<]+)</title\b', webpage, 'title'))
jwplayer_sources = self._parse_json(
self._search_regex(
r"(?s)jwplayer\(\"mvplayer\"\).setup\(.*?sources: (.*?])", webpage, 'jwplayer sources'),
video_id, transform_source=js_to_json)
r'''(?s)jwplayer\s*\(\s*['"]mvplayer['"]\s*\)\s*\.\s*setup\s*\(.*?\bsources\s*:\s*(\[.*?])\s*[,});]''', webpage, 'jwplayer sources', fatal=False)
or '',
video_id, transform_source=js_to_json, fatal=False)
def _formats_key(f):
if f['label'] == 'SD':
return -1
elif f['label'] == 'HD':
return 1
else:
return 0
formats = self._parse_jwplayer_formats(jwplayer_sources or [], video_id)
for f in formats or []:
f['quality'] = self._quality(f['format_id'])
jwplayer_sources = sorted(jwplayer_sources, key=_formats_key)
description = (
self._og_search_description(webpage)
or get_element_by_id('long_desc_holder', webpage)
or self._html_search_meta('description', webpage))
formats = self._parse_jwplayer_formats(jwplayer_sources, video_id)
uploader = self._search_regex(r'<a[^>]+class="mv_user_name"[^>]*>([^<]+)<', webpage, 'uploader', fatal=False)
upload_date = get_element_by_class('mv_vid_upl_date', webpage)
# as the ka locale may not be present, roll a local date conversion
upload_date = (unified_strdate(
# translate any ka month to an en one
re.sub('|'.join(self._MONTH_NAMES_KA),
lambda m: MONTH_NAMES['en'][self._MONTH_NAMES_KA.index(m.group(0))],
upload_date, flags=re.I))
if upload_date else None)
return {
'id': video_id,
@@ -49,5 +73,9 @@ class MyVideoGeIE(InfoExtractor):
'description': description,
'uploader': uploader,
'formats': formats,
'thumbnail': thumbnail
'thumbnail': self._og_search_thumbnail(webpage),
'upload_date': upload_date,
'view_count': int_or_none(get_element_by_class('mv_vid_views', webpage)),
'like_count': int_or_none(get_element_by_id('likes_count', webpage)),
'dislike_count': int_or_none(get_element_by_id('dislikes_count', webpage)),
}


@@ -21,6 +21,23 @@ from ..utils import (
class NaverBaseIE(InfoExtractor):
_CAPTION_EXT_RE = r'\.(?:ttml|vtt)'
@staticmethod # NB: Used in VLiveWebArchiveIE
def process_subtitles(vod_data, process_url):
ret = {'subtitles': {}, 'automatic_captions': {}}
for caption in traverse_obj(vod_data, ('captions', 'list', ...)):
caption_url = caption.get('source')
if not caption_url:
continue
type_ = 'automatic_captions' if caption.get('type') == 'auto' else 'subtitles'
lang = caption.get('locale') or join_nonempty('language', 'country', from_dict=caption) or 'und'
if caption.get('type') == 'fan':
lang += '_fan%d' % next(i for i in itertools.count(1) if f'{lang}_fan{i}' not in ret[type_])
ret[type_].setdefault(lang, []).extend({
'url': sub_url,
'name': join_nonempty('label', 'fanName', from_dict=caption, delim=' - '),
} for sub_url in process_url(caption_url))
return ret
def _extract_video_info(self, video_id, vid, key):
video_data = self._download_json(
'http://play.rmcnmv.naver.com/vod/play/v2.0/' + vid,
@@ -79,34 +96,18 @@ class NaverBaseIE(InfoExtractor):
]
return [caption_url]
automatic_captions = {}
subtitles = {}
for caption in get_list('caption'):
caption_url = caption.get('source')
if not caption_url:
continue
sub_dict = automatic_captions if caption.get('type') == 'auto' else subtitles
lang = caption.get('locale') or join_nonempty('language', 'country', from_dict=caption) or 'und'
if caption.get('type') == 'fan':
lang += '_fan%d' % next(i for i in itertools.count(1) if f'{lang}_fan{i}' not in sub_dict)
sub_dict.setdefault(lang, []).extend({
'url': sub_url,
'name': join_nonempty('label', 'fanName', from_dict=caption, delim=' - '),
} for sub_url in get_subs(caption_url))
user = meta.get('user', {})
return {
'id': video_id,
'title': title,
'formats': formats,
'subtitles': subtitles,
'automatic_captions': automatic_captions,
'thumbnail': try_get(meta, lambda x: x['cover']['source']),
'view_count': int_or_none(meta.get('count')),
'uploader_id': user.get('id'),
'uploader': user.get('name'),
'uploader_url': user.get('url'),
**self.process_subtitles(video_data, get_subs),
}


@@ -3,29 +3,31 @@ import json
import re
from .common import InfoExtractor
from .theplatform import ThePlatformIE
from .theplatform import ThePlatformIE, default_ns
from .adobepass import AdobePassIE
from ..compat import compat_urllib_parse_unquote
from ..utils import (
ExtractorError,
HEADRequest,
RegexNotFoundError,
UserNotLive,
clean_html,
int_or_none,
parse_age_limit,
parse_duration,
RegexNotFoundError,
smuggle_url,
str_or_none,
traverse_obj,
try_get,
unified_strdate,
unescapeHTML,
unified_timestamp,
update_url_query,
url_basename,
variadic,
xpath_attr,
)
class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
_VALID_URL = r'https?(?P<permalink>://(?:www\.)?nbc\.com/(?:classic-tv/)?[^/]+/video/[^/]+/(?P<id>n?\d+))'
_VALID_URL = r'https?(?P<permalink>://(?:www\.)?nbc\.com/(?:classic-tv/)?[^/]+/video/[^/]+/(?P<id>(?:NBCE|n)?\d+))'
_TESTS = [
{
@@ -38,10 +40,18 @@ class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'timestamp': 1424246400,
'upload_date': '20150218',
'uploader': 'NBCU-COM',
'episode': 'Jimmy Fallon Surprises Fans at Ben & Jerry\'s',
'episode_number': 86,
'season': 'Season 2',
'season_number': 2,
'series': 'Tonight Show: Jimmy Fallon',
'duration': 237.0,
'chapters': 'count:1',
'tags': 'count:4',
'thumbnail': r're:https?://.+\.jpg',
},
'params': {
# m3u8 download
'skip_download': True,
'skip_download': 'm3u8',
},
},
{
@@ -55,11 +65,7 @@ class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'upload_date': '20141206',
'uploader': 'NBCU-COM',
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'Only works from US',
'skip': 'page not found',
},
{
# HLS streams require the 'hdnea3' cookie
@@ -73,10 +79,59 @@ class NBCIE(ThePlatformIE): # XXX: Do not subclass from concrete IE
'upload_date': '20090315',
'uploader': 'NBCU-COM',
},
'params': {
'skip_download': True,
'skip': 'page not found',
},
{
# manifest url does not have extension
'url': 'https://www.nbc.com/the-golden-globe-awards/video/oprah-winfrey-receives-cecil-b-de-mille-award-at-the-2018-golden-globes/3646439',
'info_dict': {
'id': '3646439',
'ext': 'mp4',
'title': 'Oprah Winfrey Receives Cecil B. de Mille Award at the 2018 Golden Globes',
'episode': 'Oprah Winfrey Receives Cecil B. de Mille Award at the 2018 Golden Globes',
'episode_number': 1,
'season': 'Season 75',
'season_number': 75,
'series': 'The Golden Globe Awards',
'description': 'Oprah Winfrey receives the Cecil B. de Mille Award at the 75th Annual Golden Globe Awards.',
'uploader': 'NBCU-COM',
'upload_date': '20180107',
'timestamp': 1515312000,
'duration': 570.0,
'tags': 'count:8',
'thumbnail': r're:https?://.+\.jpg',
'chapters': 'count:1',
},
'params': {
'skip_download': 'm3u8',
},
},
{
# new video_id format
'url': 'https://www.nbc.com/quantum-leap/video/bens-first-leap-nbcs-quantum-leap/NBCE125189978',
'info_dict': {
'id': 'NBCE125189978',
'ext': 'mp4',
'title': 'Ben\'s First Leap | NBC\'s Quantum Leap',
'description': 'md5:a82762449b7ec4bb83291a7b355ebf8e',
'uploader': 'NBCU-COM',
'series': 'Quantum Leap',
'season': 'Season 1',
'season_number': 1,
'episode': 'Ben\'s First Leap | NBC\'s Quantum Leap',
'episode_number': 1,
'duration': 170.171,
'chapters': [],
'timestamp': 1663956155,
'upload_date': '20220923',
'tags': 'count:10',
'age_limit': 0,
'thumbnail': r're:https?://.+\.jpg',
},
'expected_warnings': ['Ignoring subtitle tracks'],
'params': {
'skip_download': 'm3u8',
},
'skip': 'Only works from US',
},
{
'url': 'https://www.nbc.com/classic-tv/charles-in-charge/video/charles-in-charge-pilot/n3310',
@@ -600,32 +655,36 @@ class NBCStationsIE(InfoExtractor):
_TESTS = [{
'url': 'https://www.nbclosangeles.com/news/local/large-structure-fire-in-downtown-la-prompts-smoke-odor-advisory/2968618/',
'md5': '462041d91bd762ef5a38b7d85d6dc18f',
'info_dict': {
'id': '2968618',
'ext': 'mp4',
'title': 'Large Structure Fire in Downtown LA Prompts Smoke Odor Advisory',
'description': None,
'description': 'md5:417ed3c2d91fe9d301e6db7b0942f182',
'timestamp': 1661135892,
'upload_date': '20220821',
'upload_date': '20220822',
'uploader': 'NBC 4',
'uploader_id': 'KNBC',
'channel_id': 'KNBC',
'channel': 'nbclosangeles',
},
'params': {
'skip_download': 'm3u8',
},
}, {
'url': 'https://www.telemundoarizona.com/responde/huracan-complica-reembolso-para-televidente-de-tucson/2247002/',
'md5': '0917dcf7885be1023a9220630d415f67',
'info_dict': {
'id': '2247002',
'ext': 'mp4',
'title': 'Huracán complica que televidente de Tucson reciba reembolso',
'title': 'Huracán complica que televidente de Tucson reciba reembolso',
'description': 'md5:af298dc73aab74d4fca6abfb12acb6cf',
'timestamp': 1660886507,
'upload_date': '20220819',
'uploader': 'Telemundo Arizona',
'uploader_id': 'KTAZ',
'channel_id': 'KTAZ',
'channel': 'telemundoarizona',
},
'params': {
'skip_download': 'm3u8',
},
}]
_RESOLUTIONS = {
@@ -641,51 +700,42 @@ class NBCStationsIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
nbc_data = self._search_json(
r'<script>var\s*nbc\s*=', webpage, 'NBC JSON data', video_id)
r'<script>\s*var\s+nbc\s*=', webpage, 'NBC JSON data', video_id)
pdk_acct = nbc_data.get('pdkAcct') or 'Yh1nAC'
fw_ssid = traverse_obj(nbc_data, ('video', 'fwSSID'))
fw_network_id = traverse_obj(nbc_data, ('video', 'fwNetworkID'), default='382114')
video_data = self._parse_json(self._html_search_regex(
r'data-videos="([^"]*)"', webpage, 'video data', default='{}'), video_id)
video_data = variadic(video_data)[0]
video_data.update(self._parse_json(self._html_search_regex(
r'data-meta="([^"]*)"', webpage, 'metadata', default='{}'), video_id))
video_data = self._search_json(
r'data-videos="\[', webpage, 'video data', video_id, default={}, transform_source=unescapeHTML)
video_data.update(self._search_json(
r'data-meta="', webpage, 'metadata', video_id, default={}, transform_source=unescapeHTML))
if not video_data:
raise ExtractorError('No video metadata found in webpage', expected=True)
formats = []
info, formats, subtitles = {}, [], {}
is_live = int_or_none(video_data.get('mpx_is_livestream')) == 1
query = {
'formats': 'MPEG-DASH none,M3U none,MPEG-DASH none,MPEG4,MP3',
'format': 'SMIL',
'fwsitesection': fw_ssid,
'fwNetworkID': traverse_obj(nbc_data, ('video', 'fwNetworkID'), default='382114'),
'pprofile': 'ots_desktop_html',
'sensitive': 'false',
'w': '1920',
'h': '1080',
'mode': 'LIVE' if is_live else 'on-demand',
'vpaid': 'script',
'schema': '2.0',
'sdk': 'PDK 6.1.3',
}
if video_data.get('mpx_is_livestream') == '1':
live = True
player_id = traverse_obj(
video_data, 'mpx_m3upid', ('video', 'meta', 'mpx_m3upid'), 'mpx_pid',
('video', 'meta', 'mpx_pid'), 'pid_streaming_web_medium')
query = {
'mbr': 'true',
'assetTypes': 'LegacyRelease',
'fwsitesection': fw_ssid,
'fwNetworkID': fw_network_id,
'pprofile': 'ots_desktop_html',
'sensitive': 'false',
'w': '1920',
'h': '1080',
'rnd': '1660303',
'mode': 'LIVE',
'format': 'SMIL',
'tracking': 'true',
'formats': 'M3U+none,MPEG-DASH+none,MPEG4,MP3',
'vpaid': 'script',
'schema': '2.0',
'SDK': 'PDK+6.1.3',
}
info = {
'title': f'{channel} livestream',
}
if is_live:
player_id = traverse_obj(video_data, ((None, ('video', 'meta')), (
'mpx_m3upid', 'mpx_pid', 'pid_streaming_web_medium')), get_all=False)
info['title'] = f'{channel} livestream'
else:
live = False
player_id = traverse_obj(
video_data, ('video', 'meta', 'pid_streaming_web_high'), 'pid_streaming_web_high',
('video', 'meta', 'mpx_pid'), 'mpx_pid')
player_id = traverse_obj(video_data, (
(None, ('video', 'meta')), ('pid_streaming_web_high', 'mpx_pid')), get_all=False)
date_string = traverse_obj(video_data, 'date_string', 'date_gmt')
if date_string:
@@ -693,63 +743,58 @@ class NBCStationsIE(InfoExtractor):
r'datetime="([^"]+)"', date_string, 'date string', fatal=False)
else:
date_string = traverse_obj(
nbc_data, ('dataLayer', 'adobe', 'prop70'), ('dataLayer', 'adobe', 'eVar70'),
('dataLayer', 'adobe', 'eVar59'))
nbc_data, ('dataLayer', 'adobe', ('prop70', 'eVar70', 'eVar59')), get_all=False)
video_url = traverse_obj(video_data, ('video', 'meta', 'mp4_url'), 'mp4_url')
video_url = traverse_obj(video_data, ((None, ('video', 'meta')), 'mp4_url'), get_all=False)
if video_url:
height = url_basename(video_url).split('-')[1].split('p')[0]
height = self._search_regex(r'\d+-(\d+)p', url_basename(video_url), 'height', default=None)
formats.append({
'url': video_url,
'ext': 'mp4',
'width': int_or_none(self._RESOLUTIONS.get(height)),
'height': int_or_none(height),
'format_id': f'http-{height}',
'format_id': 'http-mp4',
})
query = {
'mbr': 'true',
'assetTypes': 'LegacyRelease',
'fwsitesection': fw_ssid,
'fwNetworkID': fw_network_id,
'format': 'redirect',
'manifest': 'm3u',
'Tracking': 'true',
'Embedded': 'true',
'formats': 'MPEG4',
}
info = {
'title': video_data.get('title') or traverse_obj(
nbc_data, ('dataLayer', 'contenttitle'), ('dataLayer', 'title'),
('dataLayer', 'adobe', 'prop22'), ('dataLayer', 'id')),
'description': traverse_obj(video_data, 'summary', 'excerpt', 'video_hero_text'),
'upload_date': str_or_none(unified_strdate(date_string)),
'timestamp': int_or_none(unified_timestamp(date_string)),
}
info.update({
'title': video_data.get('title') or traverse_obj(nbc_data, (
'dataLayer', (None, 'adobe'), ('contenttitle', 'title', 'prop22')), get_all=False),
'description':
traverse_obj(video_data, 'summary', 'excerpt', 'video_hero_text')
or clean_html(traverse_obj(nbc_data, ('dataLayer', 'summary'))),
'timestamp': unified_timestamp(date_string),
})
if not player_id:
raise ExtractorError(
'No video player ID or livestream player ID found in webpage', expected=True)
smil = None
if player_id and fw_ssid:
smil = self._download_xml(
f'https://link.theplatform.com/s/{pdk_acct}/{player_id}', video_id,
note='Downloading SMIL data', query=query, fatal=is_live)
if smil:
manifest_url = xpath_attr(smil, f'.//{{{default_ns}}}video', 'src', fatal=is_live)
subtitles = self._parse_smil_subtitles(smil, default_ns)
fmts, subs = self._extract_m3u8_formats_and_subtitles(
manifest_url, video_id, 'mp4', m3u8_id='hls', fatal=is_live,
live=is_live, errnote='No HLS formats found')
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
headers = {'Origin': f'https://www.{channel}.com'}
manifest, urlh = self._download_webpage_handle(
f'https://link.theplatform.com/s/{pdk_acct}/{player_id}', video_id,
headers=headers, query=query, note='Downloading manifest')
if live:
manifest_url = self._search_regex(r'<video src="([^"]*)', manifest, 'manifest URL')
else:
manifest_url = urlh.geturl()
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', headers=headers, m3u8_id='hls',
fatal=live, live=live, errnote='No HLS formats found'))
if not formats:
self.raise_no_formats('No video content found in webpage', expected=True)
elif is_live:
try:
self._request_webpage(
HEADRequest(formats[0]['url']), video_id, note='Checking live status')
except ExtractorError:
raise UserNotLive(video_id=channel)
return {
'id': str_or_none(video_id),
'id': video_id,
'channel': channel,
'uploader': str_or_none(nbc_data.get('on_air_name')),
'uploader_id': str_or_none(nbc_data.get('callLetters')),
'channel_id': nbc_data.get('callLetters'),
'uploader': nbc_data.get('on_air_name'),
'formats': formats,
'is_live': live,
'subtitles': subtitles,
'is_live': is_live,
**info,
}


@@ -1,11 +1,9 @@
import itertools
import json
import time
import urllib.error
import urllib.parse
from .common import InfoExtractor
from ..utils import ExtractorError, parse_iso8601, try_get
from ..utils import ExtractorError, parse_iso8601
_BASE_URL_RE = r'https?://(?:www\.)?(?:watchnebula\.com|nebula\.app|nebula\.tv)'
@@ -15,11 +13,10 @@ class NebulaBaseIE(InfoExtractor):
_nebula_api_token = None
_nebula_bearer_token = None
_zype_access_token = None
def _perform_nebula_auth(self, username, password):
if not username or not password:
self.raise_login_required()
self.raise_login_required(method='password')
data = json.dumps({'email': username, 'password': password}).encode('utf8')
response = self._download_json(
@@ -33,38 +30,10 @@ class NebulaBaseIE(InfoExtractor):
note='Logging in to Nebula with supplied credentials',
errnote='Authentication failed or rejected')
if not response or not response.get('key'):
self.raise_login_required()
# save nebula token as cookie
self._set_cookie(
'nebula.app', 'nebula-auth',
urllib.parse.quote(
json.dumps({
"apiToken": response["key"],
"isLoggingIn": False,
"isLoggingOut": False,
}, separators=(",", ":"))),
expire_time=int(time.time()) + 86400 * 365,
)
self.raise_login_required(method='password')
return response['key']
def _retrieve_nebula_api_token(self, username=None, password=None):
"""
Check cookie jar for valid token. Try to authenticate using credentials if no valid token
can be found in the cookie jar.
"""
nebula_cookies = self._get_cookies('https://nebula.app')
nebula_cookie = nebula_cookies.get('nebula-auth')
if nebula_cookie:
self.to_screen('Authenticating to Nebula with token from cookie jar')
nebula_cookie_value = urllib.parse.unquote(nebula_cookie.value)
nebula_api_token = self._parse_json(nebula_cookie_value, None).get('apiToken')
if nebula_api_token:
return nebula_api_token
return self._perform_nebula_auth(username, password)
def _call_nebula_api(self, url, video_id=None, method='GET', auth_type='api', note=''):
assert method in ('GET', 'POST',)
assert auth_type in ('api', 'bearer',)
@ -95,35 +64,24 @@ class NebulaBaseIE(InfoExtractor):
note='Authorizing to Nebula')
return response['token']
def _fetch_zype_access_token(self):
"""
Get a Zype access token, which is required to access video streams -- in our case: to
generate video URLs.
"""
user_object = self._call_nebula_api('https://api.watchnebula.com/api/v1/auth/user/', note='Retrieving Zype access token')
access_token = try_get(user_object, lambda x: x['zype_auth_info']['access_token'], str)
if not access_token:
if try_get(user_object, lambda x: x['is_subscribed'], bool):
# TODO: Reimplement the same Zype token polling the Nebula frontend implements
# see https://github.com/ytdl-org/youtube-dl/pull/24805#issuecomment-749231532
raise ExtractorError(
'Unable to extract Zype access token from Nebula API authentication endpoint. '
'Open an arbitrary video in a browser with this account to generate a token',
expected=True)
raise ExtractorError('Unable to extract Zype access token from Nebula API authentication endpoint')
return access_token
def _fetch_video_formats(self, slug):
stream_info = self._call_nebula_api(f'https://content.watchnebula.com/video/{slug}/stream/',
video_id=slug,
auth_type='bearer',
note='Fetching video stream info')
manifest_url = stream_info['manifest']
return self._extract_m3u8_formats_and_subtitles(manifest_url, slug)
def _build_video_info(self, episode):
zype_id = episode['zype_id']
zype_video_url = f'https://player.zype.com/embed/{zype_id}.html?access_token={self._zype_access_token}'
fmts, subs = self._fetch_video_formats(episode['slug'])
channel_slug = episode['channel_slug']
channel_title = episode['channel_title']
return {
'id': episode['zype_id'],
'display_id': episode['slug'],
'_type': 'url_transparent',
'ie_key': 'Zype',
'url': zype_video_url,
'formats': fmts,
'subtitles': subs,
'webpage_url': f'https://nebula.tv/{episode["slug"]}',
'title': episode['title'],
'description': episode['description'],
'timestamp': parse_iso8601(episode['published_at']),
@ -133,27 +91,26 @@ class NebulaBaseIE(InfoExtractor):
'height': key,
} for key, tn in episode['assets']['thumbnail'].items()],
'duration': episode['duration'],
'channel': episode['channel_title'],
'channel': channel_title,
'channel_id': channel_slug,
'channel_url': f'https://nebula.app/{channel_slug}',
'uploader': episode['channel_title'],
'channel_url': f'https://nebula.tv/{channel_slug}',
'uploader': channel_title,
'uploader_id': channel_slug,
'uploader_url': f'https://nebula.app/{channel_slug}',
'series': episode['channel_title'],
'creator': episode['channel_title'],
'uploader_url': f'https://nebula.tv/{channel_slug}',
'series': channel_title,
'creator': channel_title,
}
def _perform_login(self, username=None, password=None):
self._nebula_api_token = self._retrieve_nebula_api_token(username, password)
self._nebula_api_token = self._perform_nebula_auth(username, password)
self._nebula_bearer_token = self._fetch_nebula_bearer_token()
self._zype_access_token = self._fetch_zype_access_token()
class NebulaIE(NebulaBaseIE):
_VALID_URL = rf'{_BASE_URL_RE}/videos/(?P<id>[-\w]+)'
_TESTS = [
{
'url': 'https://nebula.app/videos/that-time-disney-remade-beauty-and-the-beast',
'url': 'https://nebula.tv/videos/that-time-disney-remade-beauty-and-the-beast',
'md5': '14944cfee8c7beeea106320c47560efc',
'info_dict': {
'id': '5c271b40b13fd613090034fd',
@ -167,19 +124,17 @@ class NebulaIE(NebulaBaseIE):
'uploader': 'Lindsay Ellis',
'uploader_id': 'lindsayellis',
'timestamp': 1533009600,
'uploader_url': 'https://nebula.app/lindsayellis',
'uploader_url': 'https://nebula.tv/lindsayellis',
'series': 'Lindsay Ellis',
'average_rating': int,
'display_id': 'that-time-disney-remade-beauty-and-the-beast',
'channel_url': 'https://nebula.app/lindsayellis',
'channel_url': 'https://nebula.tv/lindsayellis',
'creator': 'Lindsay Ellis',
'duration': 2212,
'view_count': int,
'thumbnail': r're:https://\w+\.cloudfront\.net/[\w-]+\.jpeg?.*',
},
},
{
'url': 'https://nebula.app/videos/the-logistics-of-d-day-landing-craft-how-the-allies-got-ashore',
'url': 'https://nebula.tv/videos/the-logistics-of-d-day-landing-craft-how-the-allies-got-ashore',
'md5': 'd05739cf6c38c09322422f696b569c23',
'info_dict': {
'id': '5e7e78171aaf320001fbd6be',
@ -192,19 +147,17 @@ class NebulaIE(NebulaBaseIE):
'channel_id': 'realengineering',
'uploader': 'Real Engineering',
'uploader_id': 'realengineering',
'view_count': int,
'series': 'Real Engineering',
'average_rating': int,
'display_id': 'the-logistics-of-d-day-landing-craft-how-the-allies-got-ashore',
'creator': 'Real Engineering',
'duration': 841,
'channel_url': 'https://nebula.app/realengineering',
'uploader_url': 'https://nebula.app/realengineering',
'channel_url': 'https://nebula.tv/realengineering',
'uploader_url': 'https://nebula.tv/realengineering',
'thumbnail': r're:https://\w+\.cloudfront\.net/[\w-]+\.jpeg?.*',
},
},
{
'url': 'https://nebula.app/videos/money-episode-1-the-draw',
'url': 'https://nebula.tv/videos/money-episode-1-the-draw',
'md5': 'ebe28a7ad822b9ee172387d860487868',
'info_dict': {
'id': '5e779ebdd157bc0001d1c75a',
@ -217,14 +170,12 @@ class NebulaIE(NebulaBaseIE):
'channel_id': 'tom-scott-presents-money',
'uploader': 'Tom Scott Presents: Money',
'uploader_id': 'tom-scott-presents-money',
'uploader_url': 'https://nebula.app/tom-scott-presents-money',
'uploader_url': 'https://nebula.tv/tom-scott-presents-money',
'duration': 825,
'channel_url': 'https://nebula.app/tom-scott-presents-money',
'view_count': int,
'channel_url': 'https://nebula.tv/tom-scott-presents-money',
'series': 'Tom Scott Presents: Money',
'display_id': 'money-episode-1-the-draw',
'thumbnail': r're:https://\w+\.cloudfront\.net/[\w-]+\.jpeg?.*',
'average_rating': int,
'creator': 'Tom Scott Presents: Money',
},
},
@ -251,7 +202,7 @@ class NebulaSubscriptionsIE(NebulaBaseIE):
_VALID_URL = rf'{_BASE_URL_RE}/myshows'
_TESTS = [
{
'url': 'https://nebula.app/myshows',
'url': 'https://nebula.tv/myshows',
'playlist_mincount': 1,
'info_dict': {
'id': 'myshows',
@ -279,7 +230,7 @@ class NebulaChannelIE(NebulaBaseIE):
_VALID_URL = rf'{_BASE_URL_RE}/(?!myshows|videos/)(?P<id>[-\w]+)'
_TESTS = [
{
'url': 'https://nebula.app/tom-scott-presents-money',
'url': 'https://nebula.tv/tom-scott-presents-money',
'info_dict': {
'id': 'tom-scott-presents-money',
'title': 'Tom Scott Presents: Money',
@ -287,13 +238,13 @@ class NebulaChannelIE(NebulaBaseIE):
},
'playlist_count': 5,
}, {
'url': 'https://nebula.app/lindsayellis',
'url': 'https://nebula.tv/lindsayellis',
'info_dict': {
'id': 'lindsayellis',
'title': 'Lindsay Ellis',
'description': 'Enjoy these hottest of takes on Disney, Transformers, and Musicals.',
},
'playlist_mincount': 100,
'playlist_mincount': 2,
},
]
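
The cookie handling removed in this diff stored the Nebula API token as URL-quoted, compact JSON under a `nebula-auth` cookie and read it back from the cookie jar. The round trip can be sketched with the standard library alone. The helper names `encode_auth_cookie` and `decode_auth_cookie` are hypothetical and are not part of yt-dlp; only the payload shape is taken from the removed lines.

```python
import json
import urllib.parse


def encode_auth_cookie(api_token):
    # Hypothetical helper: builds the same payload shape as the removed code
    payload = {'apiToken': api_token, 'isLoggingIn': False, 'isLoggingOut': False}
    # Compact separators keep the cookie value short before URL-quoting
    return urllib.parse.quote(json.dumps(payload, separators=(',', ':')))


def decode_auth_cookie(cookie_value):
    # Hypothetical helper: returns the token, or None if the value is unusable
    try:
        data = json.loads(urllib.parse.unquote(cookie_value))
    except ValueError:
        return None
    return data.get('apiToken') if isinstance(data, dict) else None
```

Returning `None` on malformed input is a choice made for this sketch; the removed code passed the value to `self._parse_json` instead.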

@ -1,10 +1,18 @@
import base64
import json
import re
import time
import uuid
from .anvato import AnvatoIE
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
determine_ext,
get_element_by_class,
traverse_obj,
urlencode_postdata,
)
@ -54,15 +62,14 @@ class NFLBaseIE(InfoExtractor):
)/
'''
_VIDEO_CONFIG_REGEX = r'<script[^>]+id="[^"]*video-config-[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}[^"]*"[^>]*>\s*({.+});?\s*</script>'
_ANVATO_PREFIX = 'anvato:GXvEgwyJeWem8KCYXfeoHWknwP48Mboj:'
def _parse_video_config(self, video_config, display_id):
video_config = self._parse_json(video_config, display_id)
item = video_config['playlist'][0]
mcp_id = item.get('mcpID')
if mcp_id:
info = self.url_result(
'anvato:GXvEgwyJeWem8KCYXfeoHWknwP48Mboj:' + mcp_id,
'Anvato', mcp_id)
info = self.url_result(f'{self._ANVATO_PREFIX}{mcp_id}', AnvatoIE, mcp_id)
else:
media_id = item.get('id') or item['entityId']
title = item.get('title')
@ -157,3 +164,138 @@ class NFLArticleIE(NFLBaseIE):
'nfl-c-article__title', webpage)) or self._html_search_meta(
['og:title', 'twitter:title'], webpage)
return self.playlist_result(entries, display_id, title)
class NFLPlusReplayIE(NFLBaseIE):
IE_NAME = 'nfl.com:plus:replay'
_VALID_URL = r'https?://(?:www\.)?nfl\.com/plus/games/[\w-]+/(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.nfl.com/plus/games/giants-at-vikings-2022-post-1/1572108',
'info_dict': {
'id': '1572108',
'ext': 'mp4',
'title': 'New York Giants at Minnesota Vikings',
'description': 'New York Giants play the Minnesota Vikings at U.S. Bank Stadium on January 15, 2023',
'uploader': 'NFL',
'upload_date': '20230116',
'timestamp': 1673864520,
'duration': 7157,
'categories': ['Game Highlights'],
'tags': ['Minnesota Vikings', 'New York Giants', 'Minnesota Vikings vs. New York Giants'],
'thumbnail': r're:^https?://.*\.jpg',
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(f'{self._ANVATO_PREFIX}{video_id}', AnvatoIE, video_id)
class NFLPlusEpisodeIE(NFLBaseIE):
IE_NAME = 'nfl.com:plus:episode'
_VALID_URL = r'https?://(?:www\.)?nfl\.com/plus/episodes/(?P<id>[\w-]+)'
_TESTS = [{
'note': 'premium content',
'url': 'https://www.nfl.com/plus/episodes/kurt-s-qb-insider-conference-championships',
'info_dict': {
'id': '1576832',
'ext': 'mp4',
'title': 'Kurt\'s QB Insider: Conference Championships',
'description': 'md5:944f7fab56f7a37430bf8473f5473857',
'uploader': 'NFL',
'upload_date': '20230127',
'timestamp': 1674782760,
'duration': 730,
'categories': ['Analysis'],
'tags': ['Cincinnati Bengals at Kansas City Chiefs (2022-POST-3)'],
'thumbnail': r're:^https?://.*\.jpg',
},
'params': {'skip_download': 'm3u8'},
}]
_CLIENT_DATA = {
'clientKey': '4cFUW6DmwJpzT9L7LrG3qRAcABG5s04g',
'clientSecret': 'CZuvCL49d9OwfGsR',
'deviceId': str(uuid.uuid4()),
'deviceInfo': base64.b64encode(json.dumps({
'model': 'desktop',
'version': 'Chrome',
'osName': 'Windows',
'osVersion': '10.0',
}, separators=(',', ':')).encode()).decode(),
'networkType': 'other',
'nflClaimGroupsToAdd': [],
'nflClaimGroupsToRemove': [],
}
_ACCOUNT_INFO = {}
_API_KEY = None
_TOKEN = None
_TOKEN_EXPIRY = 0
def _get_account_info(self, url, video_id):
cookies = self._get_cookies('https://www.nfl.com/')
login_token = traverse_obj(cookies, (
(f'glt_{self._API_KEY}', f'gig_loginToken_{self._API_KEY}',
lambda k, _: k.startswith('glt_') or k.startswith('gig_loginToken_')),
{lambda x: x.value}), get_all=False)
if not login_token:
self.raise_login_required()
account = self._download_json(
'https://auth-id.nfl.com/accounts.getAccountInfo', video_id,
note='Downloading account info', data=urlencode_postdata({
'include': 'profile,data',
'lang': 'en',
'APIKey': self._API_KEY,
'sdk': 'js_latest',
'login_token': login_token,
'authMode': 'cookie',
'pageURL': url,
'sdkBuild': traverse_obj(cookies, (
'gig_canary_ver', {lambda x: x.value.partition('-')[0]}), default='13642'),
'format': 'json',
}), headers={'Content-Type': 'application/x-www-form-urlencoded'})
self._ACCOUNT_INFO = traverse_obj(account, {
'signatureTimestamp': 'signatureTimestamp',
'uid': 'UID',
'uidSignature': 'UIDSignature',
})
if len(self._ACCOUNT_INFO) != 3:
raise ExtractorError('Failed to retrieve account info with provided cookies', expected=True)
def _get_auth_token(self, url, video_id):
if not self._ACCOUNT_INFO:
self._get_account_info(url, video_id)
token = self._download_json(
'https://api.nfl.com/identity/v3/token%s' % (
'/refresh' if self._ACCOUNT_INFO.get('refreshToken') else ''),
video_id, headers={'Content-Type': 'application/json'}, note='Downloading access token',
data=json.dumps({**self._CLIENT_DATA, **self._ACCOUNT_INFO}, separators=(',', ':')).encode())
self._TOKEN = token['accessToken']
self._TOKEN_EXPIRY = token['expiresIn']
self._ACCOUNT_INFO['refreshToken'] = token['refreshToken']
def _real_extract(self, url):
slug = self._match_id(url)
if not self._API_KEY:
webpage = self._download_webpage(url, slug, fatal=False) or ''
self._API_KEY = self._search_regex(
r'window\.gigyaApiKey=["\'](\w+)["\'];', webpage, 'API key',
default='3_Qa8TkWpIB8ESCBT8tY2TukbVKgO5F6BJVc7N1oComdwFzI7H2L9NOWdm11i_BY9f')
if not self._TOKEN or self._TOKEN_EXPIRY <= int(time.time()):
self._get_auth_token(url, slug)
video_id = self._download_json(
f'https://api.nfl.com/content/v1/videos/episodes/{slug}', slug, headers={
'Authorization': f'Bearer {self._TOKEN}',
})['mcpPlaybackId']
return self.url_result(f'{self._ANVATO_PREFIX}{video_id}', AnvatoIE, video_id)
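
The `_CLIENT_DATA` dict above sends `deviceInfo` as base64-encoded compact JSON alongside a random `deviceId`. A minimal sketch of how those two fields are built follows. The function name `build_client_data` and its parameters are hypothetical, and the client key and secret are deliberately left out.

```python
import base64
import json
import uuid


def build_client_data(model='desktop', version='Chrome', os_name='Windows', os_version='10.0'):
    # Hypothetical helper: serialize the device description without extra whitespace
    device_info = json.dumps({
        'model': model,
        'version': version,
        'osName': os_name,
        'osVersion': os_version,
    }, separators=(',', ':'))
    return {
        # A fresh random UUID identifies this "device"
        'deviceId': str(uuid.uuid4()),
        # base64 works on bytes, so encode first and decode the result back to str
        'deviceInfo': base64.b64encode(device_info.encode()).decode(),
        'networkType': 'other',
    }
```

In the diff the dict is a class attribute, so `deviceId` is generated once at import time. Calling this sketch repeatedly yields a new ID each time.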

@ -675,8 +675,8 @@ class NiconicoSeriesIE(InfoExtractor):
class NiconicoHistoryIE(NiconicoPlaylistBaseIE):
IE_NAME = 'niconico:history'
IE_DESC = 'NicoNico user history. Requires cookies.'
_VALID_URL = r'https?://(?:www\.|sp\.)?nicovideo\.jp/my/history'
IE_DESC = 'NicoNico user history or likes. Requires cookies.'
_VALID_URL = r'https?://(?:www\.|sp\.)?nicovideo\.jp/my/(?P<id>history(?:/like)?)'
_TESTS = [{
'note': 'PC page, with /video',
@ -694,23 +694,29 @@ class NiconicoHistoryIE(NiconicoPlaylistBaseIE):
'note': 'mobile page, without /video',
'url': 'https://sp.nicovideo.jp/my/history',
'only_matching': True,
}, {
'note': 'PC page',
'url': 'https://www.nicovideo.jp/my/history/like',
'only_matching': True,
}, {
'note': 'Mobile page',
'url': 'https://sp.nicovideo.jp/my/history/like',
'only_matching': True,
}]
def _call_api(self, list_id, resource, query):
path = 'likes' if list_id == 'history/like' else 'watch/history'
return self._download_json(
'https://nvapi.nicovideo.jp/v1/users/me/watch/history', 'history',
f'Downloading {resource}', query=query,
headers=self._API_HEADERS)['data']
f'https://nvapi.nicovideo.jp/v1/users/me/{path}', list_id,
f'Downloading {resource}', query=query, headers=self._API_HEADERS)['data']
def _real_extract(self, url):
list_id = 'history'
list_id = self._match_id(url)
try:
mylist = self._call_api(list_id, 'list', {
'pageSize': 1,
})
mylist = self._call_api(list_id, 'list', {'pageSize': 1})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
self.raise_login_required('You have to be logged in to get your watch history')
self.raise_login_required('You have to be logged in to get your history')
raise
return self.playlist_result(self._entries(list_id), list_id, **self._parse_owner(mylist))
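
The change above lets one extractor serve both the watch history and the likes list by capturing the URL suffix as the list ID and mapping it to an API path. A standalone sketch of that mapping, assuming only the regex and the path rule shown in the diff; the function name `api_url_for` is hypothetical:

```python
import re

# Pattern copied from the updated _VALID_URL in the diff
VALID_URL = r'https?://(?:www\.|sp\.)?nicovideo\.jp/my/(?P<id>history(?:/like)?)'


def api_url_for(url):
    # Hypothetical helper: returns the API endpoint for a history or likes page
    match = re.match(VALID_URL, url)
    if not match:
        return None
    list_id = match.group('id')
    # 'history/like' maps to the likes endpoint, plain 'history' to watch history
    path = 'likes' if list_id == 'history/like' else 'watch/history'
    return f'https://nvapi.nicovideo.jp/v1/users/me/{path}'
```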

@ -39,59 +39,99 @@ class NitterIE(InfoExtractor):
)
HTTP_INSTANCES = (
'nitter.42l.fr',
'nitter.pussthecat.org',
'nitter.nixnet.services',
'nitter.lacontrevoie.fr',
'nitter.fdn.fr',
'nitter.1d4.us',
'nitter.kavin.rocks',
'nitter.unixfox.eu',
'nitter.domain.glass',
'nitter.eu',
'nitter.namazso.eu',
'nitter.actionsack.com',
'birdsite.xanny.family',
'nitter.hu',
'twitr.gq',
'nitter.moomoo.me',
'nittereu.moomoo.me',
'bird.from.tf',
'bird.trom.tf',
'nitter.it',
'twitter.censors.us',
'twitter.grimneko.de',
'nitter.alefvanoon.xyz',
'n.hyperborea.cloud',
'nitter.ca',
'nitter.grimneko.de',
'twitter.076.ne.jp',
'twitter.mstdn.social',
'nitter.fly.dev',
'notabird.site',
'nitter.weiler.rocks',
'nitter.silkky.cloud',
'nitter.sethforprivacy.com',
'nttr.stream',
'nitter.cutelab.space',
'nitter.nl',
'nitter.mint.lgbt',
'nitter.bus-hit.me',
'fuckthesacklers.network',
'nitter.govt.land',
'nitter.datatunnel.xyz',
'nitter.esmailelbob.xyz',
'tw.artemislena.eu',
'de.nttr.stream',
'nitter.winscloud.net',
'nitter.tiekoetter.com',
'nitter.spaceint.fr',
'twtr.bch.bar',
'nitter.exonip.de',
'nitter.mastodon.pro',
'nitter.notraxx.ch',
# not in the list anymore
'nitter.skrep.in',
'nitter.snopyta.org',
'nitter.privacy.com.de',
'nitter.poast.org',
'nitter.bird.froth.zone',
'nitter.dcs0.hu',
'twitter.dr460nf1r3.org',
'nitter.garudalinux.org',
'twitter.femboy.hu',
'nitter.cz',
'nitter.privacydev.net',
'nitter.evil.site',
'tweet.lambda.dance',
'nitter.kylrth.com',
'nitter.foss.wtf',
'nitter.priv.pw',
'nitter.tokhmi.xyz',
'nitter.catalyst.sx',
'unofficialbird.com',
'nitter.projectsegfau.lt',
'nitter.eu.projectsegfau.lt',
'singapore.unofficialbird.com',
'canada.unofficialbird.com',
'india.unofficialbird.com',
'nederland.unofficialbird.com',
'uk.unofficialbird.com',
'n.l5.ca',
'nitter.slipfox.xyz',
'nitter.soopy.moe',
'nitter.qwik.space',
'read.whatever.social',
'nitter.rawbit.ninja',
'nt.vern.cc',
'ntr.odyssey346.dev',
'nitter.ir',
'nitter.privacytools.io',
'nitter.sneed.network',
'n.sneed.network',
'nitter.manasiwibi.com',
'nitter.smnz.de',
'nitter.twei.space',
'nitter.inpt.fr',
'nitter.d420.de',
'nitter.caioalonso.com',
'nitter.at',
'nitter.drivet.xyz',
'nitter.pw',
'nitter.nicfab.eu',
'bird.habedieeh.re',
'nitter.hostux.net',
'nitter.adminforge.de',
'nitter.platypush.tech',
'nitter.mask.sh',
'nitter.pufe.org',
'nitter.us.projectsegfau.lt',
'nitter.arcticfoxes.net',
't.com.sb',
'nitter.kling.gg',
'nitter.ktachibana.party',
'nitter.riverside.rocks',
'nitter.girlboss.ceo',
'nitter.lunar.icu',
'twitter.moe.ngo',
'nitter.freedit.eu',
'ntr.frail.duckdns.org',
'nitter.librenode.org',
'n.opnxng.com',
'nitter.plus.st',
)
DEAD_INSTANCES = (
@ -117,6 +157,32 @@ class NitterIE(InfoExtractor):
'nitter.weaponizedhumiliation.com',
'nitter.vxempire.xyz',
'tweet.lambda.dance',
'nitter.ca',
'nitter.42l.fr',
'nitter.pussthecat.org',
'nitter.nixnet.services',
'nitter.eu',
'nitter.actionsack.com',
'nitter.hu',
'twitr.gq',
'nittereu.moomoo.me',
'bird.from.tf',
'twitter.grimneko.de',
'nitter.alefvanoon.xyz',
'n.hyperborea.cloud',
'twitter.mstdn.social',
'nitter.silkky.cloud',
'nttr.stream',
'fuckthesacklers.network',
'nitter.govt.land',
'nitter.datatunnel.xyz',
'de.nttr.stream',
'twtr.bch.bar',
'nitter.exonip.de',
'nitter.mastodon.pro',
'nitter.notraxx.ch',
'nitter.skrep.in',
'nitter.snopyta.org',
)
INSTANCES = NON_HTTP_INSTANCES + HTTP_INSTANCES + DEAD_INSTANCES
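
Moving a host from `HTTP_INSTANCES` to `DEAD_INSTANCES` keeps it in the combined `INSTANCES` tuple, so its URLs can still be recognized. One way such a tuple can be turned into a host matcher is sketched below. This is an illustration under assumptions, not the extractor's actual code: the tuples are shortened, `NON_HTTP_INSTANCES` is an empty stand-in, and `is_known_instance` is a hypothetical name.

```python
import re

NON_HTTP_INSTANCES = ()  # stand-in; the real contents are not shown in this hunk
HTTP_INSTANCES = ('nitter.lacontrevoie.fr', 'nitter.fdn.fr')
DEAD_INSTANCES = ('nitter.ca', 'nitter.42l.fr')

INSTANCES = NON_HTTP_INSTANCES + HTTP_INSTANCES + DEAD_INSTANCES

# Escape each host so that '.' matches a literal dot only
INSTANCES_RE = '(?:%s)' % '|'.join(map(re.escape, INSTANCES))


def is_known_instance(host):
    # Hypothetical helper: True only if the whole host equals a listed instance
    return re.fullmatch(INSTANCES_RE, host) is not None
```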

@ -1,36 +1,22 @@
import random
import re
import urllib.parse
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
determine_ext,
ExtractorError,
fix_xml_ampersands,
int_or_none,
merge_dicts,
orderedSet,
parse_duration,
qualities,
str_or_none,
strip_jsonp,
unified_strdate,
try_call,
unified_timestamp,
url_or_none,
urlencode_postdata,
)
class NPOBaseIE(InfoExtractor):
def _get_token(self, video_id):
return self._download_json(
'http://ida.omroep.nl/app.php/auth', video_id,
note='Downloading token')['token']
class NPOIE(NPOBaseIE):
class NPOIE(InfoExtractor):
IE_NAME = 'npo'
IE_DESC = 'npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl'
_VALID_URL = r'''(?x)
@ -58,6 +44,7 @@ class NPOIE(NPOBaseIE):
'description': 'Dagelijks tussen tien en elf: nieuws, sport en achtergronden.',
'upload_date': '20140622',
},
'skip': 'Video was removed',
}, {
'url': 'http://www.npo.nl/de-mega-mike-mega-thomas-show/27-02-2009/VARA_101191800',
'md5': 'da50a5787dbfc1603c4ad80f31c5120b',
@ -69,29 +56,41 @@ class NPOIE(NPOBaseIE):
'upload_date': '20090227',
'duration': 2400,
},
'skip': 'Video was removed',
}, {
'url': 'http://www.npo.nl/tegenlicht/25-02-2013/VPWON_1169289',
'md5': 'f8065e4e5a7824068ed3c7e783178f2c',
'md5': '1b279c0547f6b270e014c576415268c5',
'info_dict': {
'id': 'VPWON_1169289',
'ext': 'm4v',
'title': 'Tegenlicht: Zwart geld. De toekomst komt uit Afrika',
'description': 'md5:52cf4eefbc96fffcbdc06d024147abea',
'ext': 'mp4',
'title': 'Zwart geld: de toekomst komt uit Afrika',
'description': 'md5:dffaf3d628a9c36f78ca48d834246261',
'upload_date': '20130225',
'duration': 3000,
'creator': 'NED2',
'series': 'Tegenlicht',
'timestamp': 1361822340,
'thumbnail': 'https://images.npo.nl/tile/1280x720/142854.jpg',
'episode': 'Zwart geld: de toekomst komt uit Afrika',
'episode_number': 18,
},
}, {
'url': 'http://www.npo.nl/de-nieuwe-mens-deel-1/21-07-2010/WO_VPRO_043706',
'info_dict': {
'id': 'WO_VPRO_043706',
'ext': 'm4v',
'ext': 'mp4',
'title': 'De nieuwe mens - Deel 1',
'description': 'md5:518ae51ba1293ffb80d8d8ce90b74e4b',
'duration': 4680,
'episode': 'De nieuwe mens - Deel 1',
'thumbnail': 'https://images.npo.nl/tile/1280x720/6289.jpg',
'timestamp': 1279716057,
'series': 'De nieuwe mens - Deel 1',
'upload_date': '20100721',
},
'params': {
'skip_download': True,
}
},
}, {
# non asf in streams
'url': 'http://www.npo.nl/hoe-gaat-europa-verder-na-parijs/10-01-2015/WO_NOS_762771',
@ -102,20 +101,25 @@ class NPOIE(NPOBaseIE):
},
'params': {
'skip_download': True,
}
},
'skip': 'Video was removed',
}, {
'url': 'http://www.ntr.nl/Aap-Poot-Pies/27/detail/Aap-poot-pies/VPWON_1233944#content',
'info_dict': {
'id': 'VPWON_1233944',
'ext': 'm4v',
'ext': 'mp4',
'title': 'Aap, poot, pies',
'description': 'md5:c9c8005d1869ae65b858e82c01a91fde',
'description': 'md5:4b46b1b9553b4c036a04d2a532a137e6',
'upload_date': '20150508',
'duration': 599,
'episode': 'Aap, poot, pies',
'thumbnail': 'https://images.poms.omroep.nl/image/s1280/c1280x720/608118.jpg',
'timestamp': 1431064200,
'series': 'Aap, poot, pies',
},
'params': {
'skip_download': True,
}
},
}, {
'url': 'http://www.omroepwnl.nl/video/fragment/vandaag-de-dag-verkiezingen__POMS_WNL_853698',
'info_dict': {
@ -128,7 +132,8 @@ class NPOIE(NPOBaseIE):
},
'params': {
'skip_download': True,
}
},
'skip': 'Video was removed',
}, {
# audio
'url': 'http://www.npo.nl/jouw-stad-rotterdam/29-01-2017/RBX_FUNX_6683215/RBX_FUNX_7601437',
@ -140,7 +145,8 @@ class NPOIE(NPOBaseIE):
},
'params': {
'skip_download': True,
}
},
'skip': 'Video was removed',
}, {
'url': 'http://www.zapp.nl/de-bzt-show/gemist/KN_1687547',
'only_matching': True,
@ -169,6 +175,25 @@ class NPOIE(NPOBaseIE):
}, {
'url': 'https://npo.nl/KN_1698996',
'only_matching': True,
}, {
'url': 'https://www.npo3.nl/the-genius/21-11-2022/VPWON_1341105',
'info_dict': {
'id': 'VPWON_1341105',
'ext': 'mp4',
'duration': 2658,
'series': 'The Genius',
'description': 'md5:db02f1456939ca63f7c408f858044e94',
'title': 'The Genius',
'timestamp': 1669062000,
'creator': 'NED3',
'episode': 'The Genius',
'thumbnail': 'https://images.npo.nl/tile/1280x720/1827650.jpg',
'episode_number': 8,
'upload_date': '20221121',
},
'params': {
'skip_download': True,
},
}]
@classmethod
@ -179,25 +204,32 @@ class NPOIE(NPOBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
return self._get_info(url, video_id) or self._get_old_info(video_id)
def _get_info(self, url, video_id):
token = self._download_json(
'https://www.npostart.nl/api/token', video_id,
'Downloading token', headers={
'Referer': url,
'X-Requested-With': 'XMLHttpRequest',
})['token']
player = self._download_json(
'https://www.npostart.nl/player/%s' % video_id, video_id,
'Downloading player JSON', data=urlencode_postdata({
'autoplay': 0,
'share': 1,
'pageUrl': url,
'hasAdConsent': 0,
'_token': token,
}))
if urllib.parse.urlparse(url).netloc in ['www.ntr.nl', 'ntr.nl']:
player = self._download_json(
f'https://www.ntr.nl/ajax/player/embed/{video_id}', video_id,
'Downloading player JSON', query={
'parameters[elementId]': f'npo{random.randint(0, 999)}',
'parameters[sterReferralUrl]': url,
'parameters[autoplay]': 0,
})
else:
self._request_webpage(
'https://www.npostart.nl/api/token', video_id,
'Downloading token', headers={
'Referer': url,
'X-Requested-With': 'XMLHttpRequest',
})
player = self._download_json(
f'https://www.npostart.nl/player/{video_id}', video_id,
'Downloading player JSON', data=urlencode_postdata({
'autoplay': 0,
'share': 1,
'pageUrl': url,
'hasAdConsent': 0,
}), headers={
'x-xsrf-token': try_call(lambda: urllib.parse.unquote(
self._get_cookies('https://www.npostart.nl')['XSRF-TOKEN'].value))
})
player_token = player['token']
@ -210,7 +242,7 @@ class NPOIE(NPOBaseIE):
video_id, 'Downloading %s profile JSON' % profile, fatal=False,
query={
'profile': profile,
'quality': 'npo',
'quality': 'npoplus',
'tokenId': player_token,
'streamType': 'broadcast',
})
@ -291,188 +323,8 @@ class NPOIE(NPOBaseIE):
return info
def _get_old_info(self, video_id):
metadata = self._download_json(
'http://e.omroep.nl/metadata/%s' % video_id,
video_id,
# We have to remove the javascript callback
transform_source=strip_jsonp,
)
error = metadata.get('error')
if error:
raise ExtractorError(error, expected=True)
# For some videos actual video id (prid) is different (e.g. for
# http://www.omroepwnl.nl/video/fragment/vandaag-de-dag-verkiezingen__POMS_WNL_853698
# video id is POMS_WNL_853698 but prid is POW_00996502)
video_id = metadata.get('prid') or video_id
# titel is too generic in some cases so utilize aflevering_titel as well
# when available (e.g. http://tegenlicht.vpro.nl/afleveringen/2014-2015/access-to-africa.html)
title = metadata['titel']
sub_title = metadata.get('aflevering_titel')
if sub_title and sub_title != title:
title += ': %s' % sub_title
token = self._get_token(video_id)
formats = []
urls = set()
def is_legal_url(format_url):
return format_url and format_url not in urls and re.match(
r'^(?:https?:)?//', format_url)
QUALITY_LABELS = ('Laag', 'Normaal', 'Hoog')
QUALITY_FORMATS = ('adaptive', 'wmv_sb', 'h264_sb', 'wmv_bb', 'h264_bb', 'wvc1_std', 'h264_std')
quality_from_label = qualities(QUALITY_LABELS)
quality_from_format_id = qualities(QUALITY_FORMATS)
items = self._download_json(
'http://ida.omroep.nl/app.php/%s' % video_id, video_id,
'Downloading formats JSON', query={
'adaptive': 'yes',
'token': token,
})['items'][0]
for num, item in enumerate(items):
item_url = item.get('url')
if not is_legal_url(item_url):
continue
urls.add(item_url)
format_id = self._search_regex(
r'video/ida/([^/]+)', item_url, 'format id',
default=None)
item_label = item.get('label')
def add_format_url(format_url):
width = int_or_none(self._search_regex(
r'(\d+)[xX]\d+', format_url, 'width', default=None))
height = int_or_none(self._search_regex(
r'\d+[xX](\d+)', format_url, 'height', default=None))
if item_label in QUALITY_LABELS:
quality = quality_from_label(item_label)
f_id = item_label
elif item_label in QUALITY_FORMATS:
quality = quality_from_format_id(format_id)
f_id = format_id
else:
quality, f_id = [None] * 2
formats.append({
'url': format_url,
'format_id': f_id,
'width': width,
'height': height,
'quality': quality,
})
# Example: http://www.npo.nl/de-nieuwe-mens-deel-1/21-07-2010/WO_VPRO_043706
if item.get('contentType') in ('url', 'audio'):
add_format_url(item_url)
continue
try:
stream_info = self._download_json(
item_url + '&type=json', video_id,
'Downloading %s stream JSON'
% item_label or item.get('format') or format_id or num)
except ExtractorError as ee:
if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404:
error = (self._parse_json(
ee.cause.read().decode(), video_id,
fatal=False) or {}).get('errorstring')
if error:
raise ExtractorError(error, expected=True)
raise
# Stream URL instead of JSON, example: npo:LI_NL1_4188102
if isinstance(stream_info, compat_str):
if not stream_info.startswith('http'):
continue
video_url = stream_info
# JSON
else:
video_url = stream_info.get('url')
if not video_url or 'vodnotavailable.' in video_url or video_url in urls:
continue
urls.add(video_url)
if determine_ext(video_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, ext='mp4',
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
else:
add_format_url(video_url)
is_live = metadata.get('medium') == 'live'
if not is_live:
for num, stream in enumerate(metadata.get('streams', [])):
stream_url = stream.get('url')
if not is_legal_url(stream_url):
continue
urls.add(stream_url)
# smooth streaming is not supported
stream_type = stream.get('type', '').lower()
if stream_type in ['ss', 'ms']:
continue
if stream_type == 'hds':
f4m_formats = self._extract_f4m_formats(
stream_url, video_id, fatal=False)
# f4m downloader downloads only piece of live stream
for f4m_format in f4m_formats:
f4m_format['preference'] = -5
formats.extend(f4m_formats)
elif stream_type == 'hls':
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, ext='mp4', fatal=False))
# Example: http://www.npo.nl/de-nieuwe-mens-deel-1/21-07-2010/WO_VPRO_043706
elif '.asf' in stream_url:
asx = self._download_xml(
stream_url, video_id,
'Downloading stream %d ASX playlist' % num,
transform_source=fix_xml_ampersands, fatal=False)
if not asx:
continue
ref = asx.find('./ENTRY/Ref')
if ref is None:
continue
video_url = ref.get('href')
if not video_url or video_url in urls:
continue
urls.add(video_url)
formats.append({
'url': video_url,
'ext': stream.get('formaat', 'asf'),
'quality': stream.get('kwaliteit'),
'preference': -10,
})
else:
formats.append({
'url': stream_url,
'quality': stream.get('kwaliteit'),
})
subtitles = {}
if metadata.get('tt888') == 'ja':
subtitles['nl'] = [{
'ext': 'vtt',
'url': 'http://tt888.omroep.nl/tt888/%s' % video_id,
}]
return {
'id': video_id,
'title': title,
'description': metadata.get('info'),
'thumbnail': metadata.get('images', [{'url': None}])[-1]['url'],
'upload_date': unified_strdate(metadata.get('gidsdatum')),
'duration': parse_duration(metadata.get('tijdsduur')),
'formats': formats,
'subtitles': subtitles,
'is_live': is_live,
}
class NPOLiveIE(NPOBaseIE):
class NPOLiveIE(InfoExtractor):
IE_NAME = 'npo.nl:live'
_VALID_URL = r'https?://(?:www\.)?npo(?:start)?\.nl/live(?:/(?P<id>[^/?#&]+))?'
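
The updated `_get_info` in this diff no longer reads a token from the JSON response. It sends the URL-unquoted `XSRF-TOKEN` cookie as an `x-xsrf-token` header and falls back to `None` when the cookie is missing. A standard-library sketch of that lookup, assuming a raw `Cookie` header string as input; the function name `xsrf_header_value` is hypothetical:

```python
import urllib.parse
from http.cookies import SimpleCookie


def xsrf_header_value(cookie_header):
    # Hypothetical helper: parse a raw "name=value; name2=value2" cookie string
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get('XSRF-TOKEN')
    # The cookie value is URL-quoted, so unquote it before using it as a header
    return urllib.parse.unquote(morsel.value) if morsel is not None else None
```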

@ -0,0 +1,93 @@
from .common import InfoExtractor
from ..utils import (
float_or_none,
int_or_none,
remove_end,
strip_or_none,
traverse_obj,
url_or_none,
)
class NZOnScreenIE(InfoExtractor):
_VALID_URL = r'^https://www\.nzonscreen\.com/title/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.nzonscreen.com/title/shoop-shoop-diddy-wop-cumma-cumma-wang-dang-1982',
'info_dict': {
'id': '726ed6585c6bfb30',
'ext': 'mp4',
'format_id': 'hi',
'display_id': 'shoop-shoop-diddy-wop-cumma-cumma-wang-dang-1982',
'title': 'Monte Video - "Shoop Shoop, Diddy Wop"',
'description': 'Monte Video - "Shoop Shoop, Diddy Wop"',
'alt_title': 'Shoop Shoop Diddy Wop Cumma Cumma Wang Dang | Music Video',
'thumbnail': r're:https://www\.nzonscreen\.com/content/images/.+\.jpg',
'duration': 158,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.nzonscreen.com/title/shes-a-mod-1964?collection=best-of-the-60s',
'info_dict': {
'id': '3dbe709ff03c36f1',
'ext': 'mp4',
'format_id': 'hi',
'display_id': 'shes-a-mod-1964',
'title': 'Ray Columbus - \'She\'s A Mod\'',
'description': 'Ray Columbus - \'She\'s A Mod\'',
'alt_title': 'She\'s a Mod | Music Video',
'thumbnail': r're:https://www\.nzonscreen\.com/content/images/.+\.jpg',
'duration': 130,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://www.nzonscreen.com/title/puha-and-pakeha-1968/overview',
'info_dict': {
'id': 'f86342544385ad8a',
'ext': 'mp4',
'format_id': 'hi',
'display_id': 'puha-and-pakeha-1968',
'title': 'Looking At New Zealand - Puha and Pakeha',
'alt_title': 'Looking at New Zealand - \'Pūhā and Pākehā\' | Television',
'description': 'An excerpt from this television programme.',
'duration': 212,
'thumbnail': r're:https://www\.nzonscreen\.com/content/images/.+\.jpg',
},
'params': {'skip_download': 'm3u8'},
}]
def _extract_formats(self, playlist):
for quality, (id_, url) in enumerate(traverse_obj(
playlist, ('h264', {'lo': 'lo_res', 'hi': 'hi_res'}), expected_type=url_or_none).items()):
yield {
'url': url,
'format_id': id_,
'ext': 'mp4',
'quality': quality,
'height': int_or_none(playlist.get('height')) if id_ == 'hi' else None,
'width': int_or_none(playlist.get('width')) if id_ == 'hi' else None,
'filesize_approx': float_or_none(traverse_obj(playlist, ('h264', f'{id_}_res_mb')), invscale=1024**2),
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
playlist = self._parse_json(self._html_search_regex(
r'data-video-config=\'([^\']+)\'', webpage, 'media data'), video_id)
return {
'id': playlist['uuid'],
'display_id': video_id,
'title': strip_or_none(playlist.get('label')),
'description': strip_or_none(playlist.get('description')),
'alt_title': strip_or_none(remove_end(
self._html_extract_title(webpage, default=None) or self._og_search_title(webpage),
' | NZ On Screen')),
'thumbnail': traverse_obj(playlist, ('thumbnail', 'path')),
'duration': float_or_none(playlist.get('duration')),
'formats': list(self._extract_formats(playlist)),
'http_headers': {
'Referer': 'https://www.nzonscreen.com/',
'Origin': 'https://www.nzonscreen.com/',
}
}
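
Review note: the format-building step in `_extract_formats` above can be tried without yt-dlp. The following is a minimal sketch, assuming a made-up `playlist` dict and a hypothetical standalone function `extract_formats`; it replaces `traverse_obj`/`url_or_none` with a plain truthiness check, so it is a simplification and not the extractor's actual code.

```python
def extract_formats(playlist):
    """Hypothetical standalone version of the lo/hi format loop."""
    h264 = playlist.get('h264') or {}
    # Keep only the renditions that are present, in lo -> hi order,
    # so that enumerate() gives the better rendition the higher quality.
    available = [
        (format_id, h264[key])
        for format_id, key in (('lo', 'lo_res'), ('hi', 'hi_res'))
        if h264.get(key)
    ]
    for quality, (format_id, url) in enumerate(available):
        size_mb = h264.get(f'{format_id}_res_mb')
        yield {
            'url': url,
            'format_id': format_id,
            'ext': 'mp4',
            'quality': quality,
            # Dimensions describe the high-resolution rendition only.
            'height': playlist.get('height') if format_id == 'hi' else None,
            'width': playlist.get('width') if format_id == 'hi' else None,
            # Sizes are reported in MiB; convert to bytes.
            'filesize_approx': size_mb * 1024 ** 2 if size_mb is not None else None,
        }


# Sample data is invented for illustration.
sample_playlist = {
    'height': 720,
    'width': 1280,
    'h264': {
        'lo_res': 'https://example.com/lo.mp4',
        'hi_res': 'https://example.com/hi.mp4',
        'lo_res_mb': 5.5,
        'hi_res_mb': 20,
    },
}
formats = list(extract_formats(sample_playlist))
```

Because `quality` comes from the enumeration index, a page that only offers one rendition still yields a single format with `quality` 0.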

@ -0,0 +1,105 @@
import json
import urllib.error
from .common import InfoExtractor
from ..utils import (
ExtractorError,
GeoRestrictedError,
float_or_none,
traverse_obj,
try_call
)
class OnDemandChinaEpisodeIE(InfoExtractor):
_VALID_URL = r'https?://www\.ondemandchina\.com/\w+/watch/(?P<series>[\w-]+)/(?P<id>ep-(?P<ep>\d+))'
_TESTS = [{
'url': 'https://www.ondemandchina.com/en/watch/together-against-covid-19/ep-1',
'info_dict': {
'id': '264394',
'ext': 'mp4',
'duration': 3256.88,
'title': 'EP 1 The Calling',
'alt_title': '第1集 令出如山',
'thumbnail': 'https://d2y2efdi5wgkcl.cloudfront.net/fit-in/256x256/media-io/2020/9/11/image.d9816e81.jpg',
'description': '疫情严峻,党政军民学、东西南北中协同应考',
'tags': ['Social Humanities', 'Documentary', 'Medical', 'Social'],
}
}]
_QUERY = '''
query Episode($programSlug: String!, $episodeNumber: Int!) {
episode(
programSlug: $programSlug
episodeNumber: $episodeNumber
kind: "series"
part: null
) {
id
title
titleEn
titleKo
titleZhHans
titleZhHant
synopsis
synopsisEn
synopsisKo
synopsisZhHans
synopsisZhHant
videoDuration
images {
thumbnail
}
}
}'''
def _real_extract(self, url):
program_slug, display_id, ep_number = self._match_valid_url(url).group('series', 'id', 'ep')
webpage = self._download_webpage(url, display_id)
video_info = self._download_json(
'https://odc-graphql.odkmedia.io/graphql', display_id,
headers={'Content-type': 'application/json'},
data=json.dumps({
'operationName': 'Episode',
'query': self._QUERY,
'variables': {
'programSlug': program_slug,
'episodeNumber': int(ep_number),
},
}).encode())['data']['episode']
try:
source_json = self._download_json(
f'https://odkmedia.io/odc/api/v2/playback/{video_info["id"]}/', display_id,
headers={'Authorization': '', 'service-name': 'odc'})
except ExtractorError as e:
if isinstance(e.cause, urllib.error.HTTPError):
error_data = self._parse_json(e.cause.read(), display_id)['detail']
raise GeoRestrictedError(error_data)
raise  # not an HTTP error: re-raise so source_json is never left unbound
formats, subtitles = [], {}
for source in traverse_obj(source_json, ('sources', ...)):
if source.get('type') == 'hls':
fmts, subs = self._extract_m3u8_formats_and_subtitles(source.get('url'), display_id)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
self.report_warning(f'Unsupported format {source.get("type")}', display_id)
return {
'id': str(video_info['id']),
'duration': float_or_none(video_info.get('videoDuration'), 1000),
'thumbnail': (traverse_obj(video_info, ('images', 'thumbnail'))
or self._html_search_meta(['og:image', 'twitter:image'], webpage)),
'title': (traverse_obj(video_info, 'title', 'titleEn')
or self._html_search_meta(['og:title', 'twitter:title'], webpage)
or self._html_extract_title(webpage)),
'alt_title': traverse_obj(video_info, 'titleKo', 'titleZhHans', 'titleZhHant'),
'description': (traverse_obj(
video_info, 'synopsisEn', 'synopsisKo', 'synopsisZhHans', 'synopsisZhHant', 'synopsis')
or self._html_search_meta(['og:description', 'twitter:description', 'description'], webpage)),
'formats': formats,
'subtitles': subtitles,
'tags': try_call(lambda: self._html_search_meta('keywords', webpage).split(', '))
}
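
Review note: the request body sent to the GraphQL endpoint above can be reproduced outside the extractor. This is a minimal sketch, assuming a trimmed-down query and a hypothetical helper name `build_episode_payload`; it only builds the bytes and does not contact the server.

```python
import json

# Trimmed-down stand-in for the extractor's _QUERY (assumption: only two fields kept).
EPISODE_QUERY = '''
query Episode($programSlug: String!, $episodeNumber: Int!) {
  episode(programSlug: $programSlug, episodeNumber: $episodeNumber, kind: "series", part: null) {
    id
    title
  }
}'''


def build_episode_payload(program_slug, ep_number):
    """Hypothetical helper: serialize the GraphQL POST body as UTF-8 bytes."""
    return json.dumps({
        'operationName': 'Episode',
        'query': EPISODE_QUERY,
        'variables': {
            'programSlug': program_slug,
            # The URL regex captures the episode number as a string,
            # while the query declares it as Int!.
            'episodeNumber': int(ep_number),
        },
    }).encode()


payload = build_episode_payload('together-against-covid-19', '1')
```

The `int()` conversion matters: the query declares `$episodeNumber: Int!`, and the value captured from the URL is a string.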

@ -412,7 +412,7 @@ class PanoptoIE(PanoptoBaseIE):
return {
'id': video_id,
'title': delivery.get('SessionName'),
'cast': traverse_obj(delivery, ('Contributors', ..., 'DisplayName'), default=[], expected_type=lambda x: x or None),
'cast': traverse_obj(delivery, ('Contributors', ..., 'DisplayName'), expected_type=lambda x: x or None),
'timestamp': session_start_time - 11640000000 if session_start_time else None,
'duration': delivery.get('Duration'),
'thumbnail': base_url + f'/Services/FrameGrabber.svc/FrameRedirect?objectId={video_id}&mode=Delivery&random={random()}',
@ -563,7 +563,7 @@ class PanoptoListIE(PanoptoBaseIE):
base_url, '/Services/Data.svc/GetFolderInfo', folder_id,
data={'folderID': folder_id}, fatal=False)
return {
'title': get_first(response, 'Name', default=[])
'title': get_first(response, 'Name')
}
def _real_extract(self, url):

@ -310,7 +310,7 @@ class PatreonIE(PatreonBaseIE):
f'posts/{post_id}/comments', post_id, query=params, note='Downloading comments page %d' % page)
cursor = None
for comment in traverse_obj(response, (('data', ('included', lambda _, v: v['type'] == 'comment')), ...), default=[]):
for comment in traverse_obj(response, (('data', ('included', lambda _, v: v['type'] == 'comment')), ...)):
count += 1
comment_id = comment.get('id')
attributes = comment.get('attributes') or {}

@ -1,26 +1,48 @@
import urllib.parse
from .common import InfoExtractor
from ..utils import (
parse_duration,
determine_ext,
int_or_none,
parse_duration,
remove_end,
unified_strdate,
ExtractorError,
)
class Porn91IE(InfoExtractor):
IE_NAME = '91porn'
_VALID_URL = r'(?:https?://)(?:www\.|)91porn\.com/.+?\?viewkey=(?P<id>[\w\d]+)'
_VALID_URL = r'(?:https?://)(?:www\.|)91porn\.com/view_video\.php\?([^#]+&)?viewkey=(?P<id>\w+)'
_TEST = {
_TESTS = [{
'url': 'http://91porn.com/view_video.php?viewkey=7e42283b4f5ab36da134',
'md5': '7fcdb5349354f40d41689bd0fa8db05a',
'md5': 'd869db281402e0ef4ddef3c38b866f86',
'info_dict': {
'id': '7e42283b4f5ab36da134',
'title': '18岁大一漂亮学妹水嫩性感再爽一次',
'description': 'md5:1ff241f579b07ae936a54e810ad2e891',
'ext': 'mp4',
'duration': 431,
'upload_date': '20150520',
'comment_count': int,
'view_count': int,
'age_limit': 18,
}
}
}, {
'url': 'https://91porn.com/view_video.php?viewkey=7ef0cf3d362c699ab91c',
'md5': 'f8fd50540468a6d795378cd778b40226',
'info_dict': {
'id': '7ef0cf3d362c699ab91c',
'title': '真实空乘,冲上云霄第二部',
'description': 'md5:618bf9652cafcc66cd277bd96789baea',
'ext': 'mp4',
'duration': 248,
'upload_date': '20221119',
'comment_count': int,
'view_count': int,
'age_limit': 18,
}
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@ -29,32 +51,45 @@ class Porn91IE(InfoExtractor):
webpage = self._download_webpage(
'http://91porn.com/view_video.php?viewkey=%s' % video_id, video_id)
if '作为游客你每天只可观看10个视频' in webpage:
raise ExtractorError('91 Porn says: Daily limit 10 videos exceeded', expected=True)
if '视频不存在,可能已经被删除或者被举报为不良内容!' in webpage:
raise ExtractorError('91 Porn says: Video does not exist', expected=True)
title = self._search_regex(
r'<div id="viewvideo-title">([^<]+)</div>', webpage, 'title')
title = title.replace('\n', '')
daily_limit = self._search_regex(
r'作为游客,你每天只可观看([\d]+)个视频', webpage, 'exceeded daily limit', default=None, fatal=False)
if daily_limit:
raise ExtractorError(f'91 Porn says: Daily limit {daily_limit} videos exceeded', expected=True)
video_link_url = self._search_regex(
r'<textarea[^>]+id=["\']fm-video_link[^>]+>([^<]+)</textarea>',
webpage, 'video link')
videopage = self._download_webpage(video_link_url, video_id)
r'document\.write\(\s*strencode2\s*\(\s*((?:"[^"]+")|(?:\'[^\']+\'))', webpage, 'video link')
video_link_url = self._search_regex(
r'src=["\']([^"\']+)["\']', urllib.parse.unquote(video_link_url), 'unquoted video link')
info_dict = self._parse_html5_media_entries(url, videopage, video_id)[0]
formats, subtitles = self._get_formats_and_subtitle(video_link_url, video_id)
duration = parse_duration(self._search_regex(
r'时长:\s*</span>\s*(\d+:\d+)', webpage, 'duration', fatal=False))
comment_count = int_or_none(self._search_regex(
r'留言:\s*</span>\s*(\d+)', webpage, 'comment count', fatal=False))
info_dict.update({
return {
'id': video_id,
'title': title,
'duration': duration,
'comment_count': comment_count,
'age_limit': self._rta_search(webpage),
})
'title': remove_end(self._html_extract_title(webpage).replace('\n', ''), 'Chinese homemade video').strip(),
'formats': formats,
'subtitles': subtitles,
'upload_date': unified_strdate(self._search_regex(
r'<span\s+class=["\']title-yakov["\']>(\d{4}-\d{2}-\d{2})</span>', webpage, 'upload_date', fatal=False)),
'description': self._html_search_regex(
r'<span\s+class=["\']more title["\']>\s*([^<]+)', webpage, 'description', fatal=False),
'duration': parse_duration(self._search_regex(
r'时长:\s*<span[^>]*>\s*(\d+(?::\d+){1,2})', webpage, 'duration', fatal=False)),
'comment_count': int_or_none(self._search_regex(
r'留言:\s*<span[^>]*>\s*(\d+)\s*</span>', webpage, 'comment count', fatal=False)),
'view_count': int_or_none(self._search_regex(
r'热度:\s*<span[^>]*>\s*(\d+)\s*</span>', webpage, 'view count', fatal=False)),
'age_limit': 18,
}
return info_dict
def _get_formats_and_subtitle(self, video_link_url, video_id):
ext = determine_ext(video_link_url)
if ext == 'm3u8':
formats, subtitles = self._extract_m3u8_formats_and_subtitles(video_link_url, video_id, ext='mp4')
else:
formats = [{'url': video_link_url, 'ext': ext}]
subtitles = {}
return formats, subtitles

@ -1,5 +1,5 @@
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import int_or_none, urljoin
class PornezIE(InfoExtractor):
@ -20,7 +20,8 @@ class PornezIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
iframe_src = self._html_search_regex(
r'<iframe[^>]+src="(https?://pornez\.net/player/\?[^"]+)"', webpage, 'iframe', fatal=True)
r'<iframe[^>]+src="([^"]+)"', webpage, 'iframe', fatal=True)
iframe_src = urljoin('https://pornez.net', iframe_src)
title = self._html_search_meta(['name', 'twitter:title', 'og:title'], webpage, 'title', default=None)
if title is None:
title = self._search_regex(r'<h1>(.*?)</h1>', webpage, 'title', fatal=True)

@ -0,0 +1,97 @@
import re
from .common import InfoExtractor
from ..utils import merge_dicts
class Pr0grammStaticIE(InfoExtractor):
# Possible urls:
# https://pr0gramm.com/static/5466437
_VALID_URL = r'https?://pr0gramm\.com/static/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://pr0gramm.com/static/5466437',
'md5': '52fa540d70d3edc286846f8ca85938aa',
'info_dict': {
'id': '5466437',
'ext': 'mp4',
'title': 'pr0gramm-5466437 by g11st',
'uploader': 'g11st',
'upload_date': '20221221',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
# Fetch media sources
entries = self._parse_html5_media_entries(url, webpage, video_id)
media_info = entries[0]
# Fetch author
uploader = self._html_search_regex(r'by\W+([\w-]+)\W+', webpage, 'uploader')
# Fetch approx upload timestamp from filename
# Have None-defaults in case the extraction fails
uploadDay = None
uploadMon = None
uploadYear = None
uploadTimestr = None
# (//img.pr0gramm.com/2022/12/21/62ae8aa5e2da0ebf.mp4)
m = re.search(r'//img\.pr0gramm\.com/(?P<year>[\d]+)/(?P<mon>[\d]+)/(?P<day>[\d]+)/\w+\.\w{,4}', webpage)
if m:
# Up to a day of accuracy should suffice...
uploadDay = m.groupdict().get('day')
uploadMon = m.groupdict().get('mon')
uploadYear = m.groupdict().get('year')
uploadTimestr = uploadYear + uploadMon + uploadDay
return merge_dicts({
'id': video_id,
'title': 'pr0gramm-%s%s' % (video_id, (' by ' + uploader) if uploader else ''),
'uploader': uploader,
'upload_date': uploadTimestr
}, media_info)
# This extractor is for the primary URL (used for sharing, and shown in the
# location bar). Since this page loads the DOM via JS, yt-dlp can't find any
# video information here. So redirect to a compatibility version of the site,
# which contains the <video> element itself, without requiring JS to be run.
class Pr0grammIE(InfoExtractor):
# Possible urls:
# https://pr0gramm.com/new/546637
# https://pr0gramm.com/new/video/546637
# https://pr0gramm.com/top/546637
# https://pr0gramm.com/top/video/546637
# https://pr0gramm.com/user/g11st/uploads/5466437
# https://pr0gramm.com/user/froschler/dafur-ist-man-hier/5091290
# https://pr0gramm.com/user/froschler/reinziehen-1elf/5232030
# https://pr0gramm.com/user/froschler/1elf/5232030
# https://pr0gramm.com/new/5495710:comment62621020 <- this is not the id!
# https://pr0gramm.com/top/fruher war alles damals/5498175
_VALID_URL = r'https?:\/\/pr0gramm\.com\/(?!static/\d+).+?\/(?P<id>[\d]+)(:|$)'
_TEST = {
'url': 'https://pr0gramm.com/new/video/5466437',
'info_dict': {
'id': '5466437',
'ext': 'mp4',
'title': 'pr0gramm-5466437 by g11st',
'uploader': 'g11st',
'upload_date': '20221221',
}
}
def _generic_title():
return "oof"
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(
'https://pr0gramm.com/static/' + video_id,
video_id=video_id,
ie=Pr0grammStaticIE.ie_key())
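
Review note: the upload-date logic in `Pr0grammStaticIE` relies on the date being encoded in the media path. The following is a minimal sketch of that idea in isolation, assuming a hypothetical helper name `upload_date_from_webpage` and an invented HTML snippet; the pattern is the one from the diff, written with `\d` instead of `[\d]`.

```python
import re

# Same pattern as in Pr0grammStaticIE, with \d in place of [\d].
_IMG_PATH_RE = re.compile(
    r'//img\.pr0gramm\.com/(?P<year>\d+)/(?P<mon>\d+)/(?P<day>\d+)/\w+\.\w{,4}')


def upload_date_from_webpage(webpage):
    """Hypothetical helper: return YYYYMMDD from the first media path, or None."""
    m = _IMG_PATH_RE.search(webpage)
    if not m:
        # No media path found: leave the date unset instead of failing.
        return None
    return m.group('year') + m.group('mon') + m.group('day')


sample = '<video src="//img.pr0gramm.com/2022/12/21/62ae8aa5e2da0ebf.mp4"></video>'
date = upload_date_from_webpage(sample)
```

The result is only accurate to the day, which is what the `upload_date` field expects anyway.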

@ -1,5 +1,4 @@
import base64
import re
import urllib.parse
from .common import InfoExtractor
@ -15,6 +14,23 @@ from ..utils import (
class RadikoBaseIE(InfoExtractor):
_FULL_KEY = None
_HOSTS_FOR_TIME_FREE_FFMPEG_UNSUPPORTED = (
'https://c-rpaa.smartstream.ne.jp',
'https://si-c-radiko.smartstream.ne.jp',
'https://tf-f-rpaa-radiko.smartstream.ne.jp',
'https://tf-c-rpaa-radiko.smartstream.ne.jp',
'https://si-f-radiko.smartstream.ne.jp',
'https://rpaa.smartstream.ne.jp',
)
_HOSTS_FOR_TIME_FREE_FFMPEG_SUPPORTED = (
'https://rd-wowza-radiko.radiko-cf.com',
'https://radiko.jp',
'https://f-radiko.smartstream.ne.jp',
)
# The following URL forces a Live connection rather than Time Free
_HOSTS_FOR_LIVE = (
'https://c-radiko.smartstream.ne.jp',
)
def _auth_client(self):
_, auth1_handle = self._download_webpage_handle(
@ -92,9 +108,9 @@ class RadikoBaseIE(InfoExtractor):
formats = []
found = set()
for url_tag in m3u8_urls:
pcu = url_tag.find('playlist_create_url')
pcu = url_tag.find('playlist_create_url').text
url_attrib = url_tag.attrib
playlist_url = update_url_query(pcu.text, {
playlist_url = update_url_query(pcu, {
'station_id': station,
**query,
'l': '15',
@ -118,9 +134,10 @@ class RadikoBaseIE(InfoExtractor):
'X-Radiko-AuthToken': auth_token,
})
for sf in subformats:
if re.fullmatch(r'[cf]-radiko\.smartstream\.ne\.jp', domain):
# Prioritize live radio vs playback based on extractor
sf['preference'] = 100 if is_onair else -100
if (is_onair ^ pcu.startswith(self._HOSTS_FOR_LIVE)) or (
not is_onair and pcu.startswith(self._HOSTS_FOR_TIME_FREE_FFMPEG_UNSUPPORTED)):
sf['preference'] = -100
sf['format_note'] = 'not preferred'
if not is_onair and url_attrib['timefree'] == '1' and time_to_skip:
sf['downloader_options'] = {'ffmpeg_args': ['-ss', time_to_skip]}
formats.extend(subformats)
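
Review note: the new preference condition combines an XOR with a host check, which is easy to misread. The following is a minimal sketch of the same condition as a standalone function, assuming a hypothetical name `is_deprioritized` and shortened copies of the host tuples from the diff.

```python
# Shortened copies of the tuples added to RadikoBaseIE (not the full lists).
HOSTS_FOR_LIVE = ('https://c-radiko.smartstream.ne.jp',)
HOSTS_FOR_TIME_FREE_FFMPEG_UNSUPPORTED = (
    'https://c-rpaa.smartstream.ne.jp',
    'https://rpaa.smartstream.ne.jp',
)


def is_deprioritized(playlist_create_url, is_onair):
    """Hypothetical helper mirroring the condition added in the diff."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    is_live_host = playlist_create_url.startswith(HOSTS_FOR_LIVE)
    # XOR: a live-only host while extracting Time Free,
    # or a non-live host while extracting a live broadcast.
    mismatched = is_onair ^ is_live_host
    ffmpeg_unsupported = (
        not is_onair
        and playlist_create_url.startswith(HOSTS_FOR_TIME_FREE_FFMPEG_UNSUPPORTED))
    return mismatched or ffmpeg_unsupported
```

In the diff, formats for which this condition holds get `preference` -100 and the note `not preferred`; all other formats keep their default preference.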

@ -0,0 +1,93 @@
import re
from .common import InfoExtractor
class RbgTumIE(InfoExtractor):
_VALID_URL = r'https://live\.rbg\.tum\.de/w/(?P<id>.+)'
_TESTS = [{
# Combined view
'url': 'https://live.rbg.tum.de/w/cpp/22128',
'md5': '53a5e7b3e07128e33bbf36687fe1c08f',
'info_dict': {
'id': 'cpp/22128',
'ext': 'mp4',
'title': 'Lecture: October 18. 2022',
'series': 'Concepts of C++ programming (IN2377)',
}
}, {
# Presentation only
'url': 'https://live.rbg.tum.de/w/I2DL/12349/PRES',
'md5': '36c584272179f3e56b0db5d880639cba',
'info_dict': {
'id': 'I2DL/12349/PRES',
'ext': 'mp4',
'title': 'Lecture 3: Introduction to Neural Networks',
'series': 'Introduction to Deep Learning (IN2346)',
}
}, {
# Camera only
'url': 'https://live.rbg.tum.de/w/fvv-info/16130/CAM',
'md5': 'e04189d92ff2f56aedf5cede65d37aad',
'info_dict': {
'id': 'fvv-info/16130/CAM',
'ext': 'mp4',
'title': 'Fachschaftsvollversammlung',
'series': 'Fachschaftsvollversammlung Informatik',
}
}, ]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
m3u8 = self._html_search_regex(r'(https://.+?\.m3u8)', webpage, 'm3u8')
lecture_title = self._html_search_regex(r'(?si)<h1.*?>(.*)</h1>', webpage, 'title')
lecture_series_title = self._html_search_regex(
r'(?s)<title\b[^>]*>\s*(?:TUM-Live\s\|\s?)?([^:]+):?.*?</title>', webpage, 'series')
formats = self._extract_m3u8_formats(m3u8, video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls')
return {
'id': video_id,
'title': lecture_title,
'series': lecture_series_title,
'formats': formats,
}
class RbgTumCourseIE(InfoExtractor):
_VALID_URL = r'https://live\.rbg\.tum\.de/course/(?P<id>.+)'
_TESTS = [{
'url': 'https://live.rbg.tum.de/course/2022/S/fpv',
'info_dict': {
'title': 'Funktionale Programmierung und Verifikation (IN0003)',
'id': '2022/S/fpv',
},
'params': {
'noplaylist': False,
},
'playlist_count': 13,
}, {
'url': 'https://live.rbg.tum.de/course/2022/W/set',
'info_dict': {
'title': 'SET FSMPIC',
'id': '2022/W/set',
},
'params': {
'noplaylist': False,
},
'playlist_count': 6,
}, ]
def _real_extract(self, url):
course_id = self._match_id(url)
webpage = self._download_webpage(url, course_id)
lecture_series_title = self._html_search_regex(r'(?si)<h1.*?>(.*)</h1>', webpage, 'title')
lecture_urls = []
for lecture_url in re.findall(r'(?i)href="/w/(.+)(?<!/cam)(?<!/pres)(?<!/chat)"', webpage):
lecture_urls.append(self.url_result('https://live.rbg.tum.de/w/' + lecture_url, ie=RbgTumIE.ie_key()))
return self.playlist_result(lecture_urls, course_id, lecture_series_title)
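
Review note: the course extractor filters out the camera, presentation, and chat variants of each lecture with negative lookbehinds. The following is a minimal sketch of how that pattern behaves, assuming invented markup with one link per line (the `.+` group is greedy, so several links on one line would be merged).

```python
import re

# Pattern copied from RbgTumCourseIE: skip hrefs ending in /cam, /pres or /chat.
_LECTURE_HREF_RE = r'(?i)href="/w/(.+)(?<!/cam)(?<!/pres)(?<!/chat)"'

# Made-up markup (assumption); real pages may be laid out differently.
sample_page = '\n'.join([
    '<a href="/w/cpp/22128">Combined</a>',
    '<a href="/w/cpp/22128/CAM">Camera</a>',
    '<a href="/w/cpp/22128/PRES">Slides</a>',
    '<a href="/w/cpp/22129">Combined</a>',
])

# Only the combined-view links survive the lookbehind checks.
lecture_paths = re.findall(_LECTURE_HREF_RE, sample_page)
```

The `(?i)` flag also applies inside the lookbehinds, so `/CAM` and `/cam` are both rejected.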

@ -3,9 +3,18 @@ import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
HEADRequest,
base_url,
clean_html,
extract_attributes,
get_element_html_by_class,
get_element_html_by_id,
int_or_none,
js_to_json,
mimetype2ext,
sanitize_url,
traverse_obj,
try_call,
url_basename,
urljoin,
)
@ -15,41 +24,8 @@ class RCSBaseIE(InfoExtractor):
# based on VideoPlayerLoader.prototype.getVideoSrc
# and VideoPlayerLoader.prototype.transformSrc from
# https://js2.corriereobjects.it/includes2013/LIBS/js/corriere_video.sjs
_ALL_REPLACE = {
'media2vam.corriere.it.edgesuite.net':
'media2vam-corriere-it.akamaized.net',
'media.youreporter.it.edgesuite.net':
'media-youreporter-it.akamaized.net',
'corrierepmd.corriere.it.edgesuite.net':
'corrierepmd-corriere-it.akamaized.net',
'media2vam-corriere-it.akamaized.net/fcs.quotidiani/vr/videos/':
'video.corriere.it/vr360/videos/',
'.net//': '.net/',
}
_MP4_REPLACE = {
'media2vam.corbologna.corriere.it.edgesuite.net':
'media2vam-bologna-corriere-it.akamaized.net',
'media2vam.corfiorentino.corriere.it.edgesuite.net':
'media2vam-fiorentino-corriere-it.akamaized.net',
'media2vam.cormezzogiorno.corriere.it.edgesuite.net':
'media2vam-mezzogiorno-corriere-it.akamaized.net',
'media2vam.corveneto.corriere.it.edgesuite.net':
'media2vam-veneto-corriere-it.akamaized.net',
'media2.oggi.it.edgesuite.net':
'media2-oggi-it.akamaized.net',
'media2.quimamme.it.edgesuite.net':
'media2-quimamme-it.akamaized.net',
'media2.amica.it.edgesuite.net':
'media2-amica-it.akamaized.net',
'media2.living.corriere.it.edgesuite.net':
'media2-living-corriere-it.akamaized.net',
'media2.style.corriere.it.edgesuite.net':
'media2-style-corriere-it.akamaized.net',
'media2.iodonna.it.edgesuite.net':
'media2-iodonna-it.akamaized.net',
'media2.leitv.it.edgesuite.net':
'media2-leitv-it.akamaized.net',
}
_UUID_RE = r'[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}'
_RCS_ID_RE = r'[\w-]+-\d{10}'
_MIGRATION_MAP = {
'videoamica-vh.akamaihd': 'amica',
'media2-amica-it.akamaized': 'amica',
@ -90,183 +66,140 @@ class RCSBaseIE(InfoExtractor):
'vivimilano-vh.akamaihd': 'vivimilano',
'media2-youreporter-it.akamaized': 'youreporter'
}
_MIGRATION_MEDIA = {
'advrcs-vh.akamaihd': '',
'corriere-f.akamaihd': '',
'corrierepmd-corriere-it.akamaized': '',
'corrprotetto-vh.akamaihd': '',
'gazzetta-f.akamaihd': '',
'gazzettapmd-gazzetta-it.akamaized': '',
'gazzprotetto-vh.akamaihd': '',
'periodici-f.akamaihd': '',
'periodicisecure-vh.akamaihd': '',
'videocoracademy-vh.akamaihd': ''
}
def _get_video_src(self, video):
mediaFiles = video.get('mediaProfile').get('mediaFile')
src = {}
# audio
if video.get('mediaType') == 'AUDIO':
for aud in mediaFiles:
# todo: check
src['mp3'] = aud.get('value')
# video
else:
for vid in mediaFiles:
if vid.get('mimeType') == 'application/vnd.apple.mpegurl':
src['m3u8'] = vid.get('value')
if vid.get('mimeType') == 'video/mp4':
src['mp4'] = vid.get('value')
for source in traverse_obj(video, (
'mediaProfile', 'mediaFile', lambda _, v: v.get('mimeType'))):
url = source['value']
for s, r in (
('media2vam.corriere.it.edgesuite.net', 'media2vam-corriere-it.akamaized.net'),
('media.youreporter.it.edgesuite.net', 'media-youreporter-it.akamaized.net'),
('corrierepmd.corriere.it.edgesuite.net', 'corrierepmd-corriere-it.akamaized.net'),
('media2vam-corriere-it.akamaized.net/fcs.quotidiani/vr/videos/', 'video.corriere.it/vr360/videos/'),
('http://', 'https://'),
):
url = url.replace(s, r)
# replace host
for t in src:
for s, r in self._ALL_REPLACE.items():
src[t] = src[t].replace(s, r)
for s, r in self._MP4_REPLACE.items():
src[t] = src[t].replace(s, r)
type_ = mimetype2ext(source['mimeType'])
if type_ == 'm3u8' and '-vh.akamaihd' in url:
# still needed for some old content: see _TESTS #3
matches = re.search(r'(?:https?:)?//(?P<host>[\w\.\-]+)\.net/i(?P<path>.+)$', url)
if matches:
url = f'https://vod.rcsobjects.it/hls/{self._MIGRATION_MAP[matches.group("host")]}{matches.group("path")}'
if traverse_obj(video, ('mediaProfile', 'geoblocking')) or (
type_ == 'm3u8' and 'fcs.quotidiani_!' in url):
url = url.replace('vod.rcsobjects', 'vod-it.rcsobjects')
if type_ == 'm3u8' and 'vod' in url:
url = url.replace('.csmil', '.urlset')
if type_ == 'mp3':
url = url.replace('media2vam-corriere-it.akamaized.net', 'vod.rcsobjects.it/corriere')
# switch cdn
if 'mp4' in src and 'm3u8' in src:
if ('-lh.akamaihd' not in src.get('m3u8')
and 'akamai' in src.get('mp4')):
if 'm3u8' in src:
matches = re.search(r'(?:https*:)?\/\/(?P<host>.*)\.net\/i(?P<path>.*)$', src.get('m3u8'))
src['m3u8'] = 'https://vod.rcsobjects.it/hls/%s%s' % (
self._MIGRATION_MAP[matches.group('host')],
matches.group('path').replace(
'///', '/').replace(
'//', '/').replace(
'.csmil', '.urlset'
)
)
if 'mp4' in src:
matches = re.search(r'(?:https*:)?\/\/(?P<host>.*)\.net\/i(?P<path>.*)$', src.get('mp4'))
if matches:
if matches.group('host') in self._MIGRATION_MEDIA:
vh_stream = 'https://media2.corriereobjects.it'
if src.get('mp4').find('fcs.quotidiani_!'):
vh_stream = 'https://media2-it.corriereobjects.it'
src['mp4'] = '%s%s' % (
vh_stream,
matches.group('path').replace(
'///', '/').replace(
'//', '/').replace(
'/fcs.quotidiani/mediacenter', '').replace(
'/fcs.quotidiani_!/mediacenter', '').replace(
'corriere/content/mediacenter/', '').replace(
'gazzetta/content/mediacenter/', '')
)
else:
src['mp4'] = 'https://vod.rcsobjects.it/%s%s' % (
self._MIGRATION_MAP[matches.group('host')],
matches.group('path').replace('///', '/').replace('//', '/')
)
yield {
'type': type_,
'url': url,
'bitrate': source.get('bitrate')
}
if 'mp3' in src:
src['mp3'] = src.get('mp3').replace(
'media2vam-corriere-it.akamaized.net',
'vod.rcsobjects.it/corriere')
if 'mp4' in src:
if src.get('mp4').find('fcs.quotidiani_!'):
src['mp4'] = src.get('mp4').replace('vod.rcsobjects', 'vod-it.rcsobjects')
if 'm3u8' in src:
if src.get('m3u8').find('fcs.quotidiani_!'):
src['m3u8'] = src.get('m3u8').replace('vod.rcsobjects', 'vod-it.rcsobjects')
def _create_http_formats(self, m3u8_formats, video_id):
for f in m3u8_formats:
if f['vcodec'] == 'none':
continue
http_url = re.sub(r'(https?://[^/]+)/hls/([^?#]+?\.mp4).+', r'\g<1>/\g<2>', f['url'])
if http_url == f['url']:
continue
if 'geoblocking' in video.get('mediaProfile'):
if 'm3u8' in src:
src['m3u8'] = src.get('m3u8').replace('vod.rcsobjects', 'vod-it.rcsobjects')
if 'mp4' in src:
src['mp4'] = src.get('mp4').replace('vod.rcsobjects', 'vod-it.rcsobjects')
if 'm3u8' in src:
if src.get('m3u8').find('csmil') and src.get('m3u8').find('vod'):
src['m3u8'] = src.get('m3u8').replace('.csmil', '.urlset')
http_f = f.copy()
del http_f['manifest_url']
format_id = try_call(lambda: http_f['format_id'].replace('hls-', 'https-'))
urlh = self._request_webpage(HEADRequest(http_url), video_id, fatal=False,
note=f'Check filesize for {format_id}')
if not urlh:
continue
return src
def _create_formats(self, urls, video_id):
formats = []
formats = self._extract_m3u8_formats(
urls.get('m3u8'), video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False)
if urls.get('mp4'):
formats.append({
'format_id': 'http-mp4',
'url': urls['mp4']
http_f.update({
'format_id': format_id,
'url': http_url,
'protocol': 'https',
'filesize_approx': int_or_none(urlh.headers.get('Content-Length', None)),
})
return formats
yield http_f
def _create_formats(self, sources, video_id):
for source in sources:
if source['type'] == 'm3u8':
m3u8_formats = self._extract_m3u8_formats(
source['url'], video_id, 'mp4', m3u8_id='hls', fatal=False)
yield from m3u8_formats
yield from self._create_http_formats(m3u8_formats, video_id)
elif source['type'] == 'mp3':
yield {
'format_id': 'https-mp3',
'ext': 'mp3',
'acodec': 'mp3',
'vcodec': 'none',
'abr': source.get('bitrate'),
'url': source['url'],
}
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
cdn, video_id = self._match_valid_url(url).group('cdn', 'id')
display_id, video_data = None, None
if 'cdn' not in mobj.groupdict():
raise ExtractorError('CDN not found in url: %s' % url)
# for leitv/youreporter/viaggi don't use the embed page
if ((mobj.group('cdn') not in ['leitv.it', 'youreporter.it'])
and (mobj.group('vid') == 'video')):
url = 'https://video.%s/video-embed/%s' % (mobj.group('cdn'), video_id)
page = self._download_webpage(url, video_id)
video_data = None
# look for json video data url
json = self._search_regex(
r'''(?x)url\s*=\s*(["'])
(?P<url>
(?:https?:)?//video\.rcs\.it
/fragment-includes/video-includes/.+?\.json
)\1;''',
page, video_id, group='url', default=None)
if json:
if json.startswith('//'):
json = 'https:%s' % json
video_data = self._download_json(json, video_id)
# if json url not found, look for json video data directly in the page
if re.match(self._UUID_RE, video_id) or re.match(self._RCS_ID_RE, video_id):
url = f'https://video.{cdn}/video-json/{video_id}'
else:
# RCS normal pages and most of the embeds
json = self._search_regex(
r'[\s;]video\s*=\s*({[\s\S]+?})(?:;|,playlist=)',
page, video_id, default=None)
if not json and 'video-embed' in url:
page = self._download_webpage(url.replace('video-embed', 'video-json'), video_id)
json = self._search_regex(
r'##start-video##({[\s\S]+?})##end-video##',
page, video_id, default=None)
if not json:
# if no video data found try search for iframes
emb = RCSEmbedsIE._extract_url(page)
webpage = self._download_webpage(url, video_id)
data_config = get_element_html_by_id('divVideoPlayer', webpage) or get_element_html_by_class('divVideoPlayer', webpage)
if data_config:
data_config = self._parse_json(
extract_attributes(data_config).get('data-config'),
video_id, fatal=False) or {}
if data_config.get('newspaper'):
cdn = f'{data_config["newspaper"]}.it'
display_id, video_id = video_id, data_config.get('uuid') or video_id
url = f'https://video.{cdn}/video-json/{video_id}'
else:
json_url = self._search_regex(
r'''(?x)url\s*=\s*(["'])
(?P<url>
(?:https?:)?//video\.rcs\.it
/fragment-includes/video-includes/[^"']+?\.json
)\1;''',
webpage, video_id, group='url', default=None)
if json_url:
video_data = self._download_json(sanitize_url(json_url, scheme='https'), video_id)
display_id, video_id = video_id, video_data.get('id') or video_id
if not video_data:
webpage = self._download_webpage(url, video_id)
video_data = self._search_json(
'##start-video##', webpage, 'video data', video_id, default=None,
end_pattern='##end-video##', transform_source=js_to_json)
if not video_data:
# try searching for iframes
emb = RCSEmbedsIE._extract_url(webpage)
if emb:
return {
'_type': 'url_transparent',
'url': emb,
'ie_key': RCSEmbedsIE.ie_key()
}
if json:
video_data = self._parse_json(
json, video_id, transform_source=js_to_json)
if not video_data:
raise ExtractorError('Video data not found in the page')
formats = self._create_formats(
self._get_video_src(video_data), video_id)
description = (video_data.get('description')
or clean_html(video_data.get('htmlDescription'))
or self._html_search_meta('description', page))
uploader = video_data.get('provider') or mobj.group('cdn')
return {
'id': video_id,
'display_id': display_id,
'title': video_data.get('title'),
'description': description,
'uploader': uploader,
'formats': formats
'description': (clean_html(video_data.get('description'))
or clean_html(video_data.get('htmlDescription'))
or self._html_search_meta('description', webpage)),
'uploader': video_data.get('provider') or cdn,
'formats': list(self._create_formats(self._get_video_src(video_data), video_id)),
}
@ -296,7 +229,7 @@ class RCSEmbedsIE(RCSBaseIE):
\1''']
_TESTS = [{
'url': 'https://video.rcs.it/video-embed/iodonna-0001585037',
'md5': '623ecc8ffe7299b2d0c1046d8331a9df',
'md5': '0faca97df525032bb9847f690bc3720c',
'info_dict': {
'id': 'iodonna-0001585037',
'ext': 'mp4',
@ -305,38 +238,31 @@ class RCSEmbedsIE(RCSBaseIE):
'uploader': 'rcs.it',
}
}, {
# redownload the page changing 'video-embed' in 'video-json'
'url': 'https://video.gazzanet.gazzetta.it/video-embed/gazzanet-mo05-0000260789',
'md5': 'a043e3fecbe4d9ed7fc5d888652a5440',
'info_dict': {
'id': 'gazzanet-mo05-0000260789',
'ext': 'mp4',
'title': 'Valentino Rossi e papà Graziano si divertono col drifting',
'description': 'md5:a8bf90d6adafd9815f70fc74c0fc370a',
'uploader': 'rcd',
}
}, {
'url': 'https://video.corriere.it/video-embed/b727632a-f9d0-11ea-91b0-38d50a849abb?player',
'match_only': True
}, {
'url': 'https://video.gazzetta.it/video-embed/49612410-00ca-11eb-bcd8-30d4253e0140',
'match_only': True
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.iodonna.it/video-iodonna/personaggi-video/monica-bellucci-piu-del-lavoro-oggi-per-me-sono-importanti-lamicizia-e-la-famiglia/',
'info_dict': {
'id': 'iodonna-0002033648',
'ext': 'mp4',
'title': 'Monica Bellucci: «Più del lavoro, oggi per me sono importanti l\'amicizia e la famiglia»',
'description': 'md5:daea6d9837351e56b1ab615c06bebac1',
'uploader': 'rcs.it',
}
}]
@staticmethod
def _sanitize_urls(urls):
# add protocol if missing
for i, e in enumerate(urls):
if e.startswith('//'):
urls[i] = 'https:%s' % e
# clean iframes urls
for i, e in enumerate(urls):
urls[i] = urljoin(base_url(e), url_basename(e))
return urls
def _sanitize_url(url):
url = sanitize_url(url, scheme='https')
return urljoin(base_url(url), url_basename(url))
@classmethod
def _extract_embed_urls(cls, url, webpage):
return cls._sanitize_urls(list(super()._extract_embed_urls(url, webpage)))
return map(cls._sanitize_url, super()._extract_embed_urls(url, webpage))
class RCSIE(RCSBaseIE):
@ -349,37 +275,53 @@ class RCSIE(RCSBaseIE):
|corrierefiorentino\.
)?corriere\.it
|(?:gazzanet\.)?gazzetta\.it)
/(?!video-embed/).+?/(?P<id>[^/\?]+)(?=\?|/$|$)'''
/(?!video-embed/)[^?#]+?/(?P<id>[^/\?]+)(?=\?|/$|$)'''
_TESTS = [{
# json iframe directly from id
'url': 'https://video.corriere.it/sport/formula-1/vettel-guida-ferrari-sf90-mugello-suo-fianco-c-elecrerc-bendato-video-esilarante/b727632a-f9d0-11ea-91b0-38d50a849abb',
'md5': '0f4ededc202b0f00b6e509d831e2dcda',
'md5': '14946840dec46ecfddf66ba4eea7d2b2',
'info_dict': {
'id': 'b727632a-f9d0-11ea-91b0-38d50a849abb',
'ext': 'mp4',
'title': 'Vettel guida la Ferrari SF90 al Mugello e al suo fianco c\'è Leclerc (bendato): il video è esilarante',
'description': 'md5:93b51c9161ac8a64fb2f997b054d0152',
'description': 'md5:3915ce5ebb3d2571deb69a5eb85ac9b5',
'uploader': 'Corriere Tv',
}
}, {
# video data inside iframe
# search for video id inside the page
'url': 'https://viaggi.corriere.it/video/norvegia-il-nuovo-ponte-spettacolare-sopra-la-cascata-di-voringsfossen/',
'md5': 'da378e4918d2afbf7d61c35abb948d4c',
'md5': 'f22a92d9e666e80f2fffbf2825359c81',
'info_dict': {
'id': '5b7cd134-e2c1-11ea-89b3-b56dd0df2aa2',
'display_id': 'norvegia-il-nuovo-ponte-spettacolare-sopra-la-cascata-di-voringsfossen',
'ext': 'mp4',
'title': 'La nuova spettacolare attrazione in Norvegia: il ponte sopra Vøringsfossen',
'description': 'md5:18b35a291f6746c0c8dacd16e5f5f4f8',
'uploader': 'DOVE Viaggi',
}
}, {
'url': 'https://video.gazzetta.it/video-motogp-catalogna-cadute-dovizioso-vale-rossi/49612410-00ca-11eb-bcd8-30d4253e0140?vclk=Videobar',
'md5': 'eedc1b5defd18e67383afef51ff7bdf9',
# only audio format https://github.com/yt-dlp/yt-dlp/issues/5683
'url': 'https://video.corriere.it/cronaca/audio-telefonata-il-papa-becciu-santita-lettera-che-mi-ha-inviato-condanna/b94c0d20-70c2-11ed-9572-e4b947a0ebd2',
'md5': 'aaffb08d02f2ce4292a4654694c78150',
'info_dict': {
'id': '49612410-00ca-11eb-bcd8-30d4253e0140',
'id': 'b94c0d20-70c2-11ed-9572-e4b947a0ebd2',
'ext': 'mp3',
'title': 'L\'audio della telefonata tra il Papa e Becciu: «Santità, la lettera che mi ha inviato è una condanna»',
'description': 'md5:c0ddb61bd94a8d4e0d4bb9cda50a689b',
'uploader': 'Corriere Tv',
'formats': [{'format_id': 'https-mp3', 'ext': 'mp3'}],
}
}, {
# old content still needs cdn migration
'url': 'https://viaggi.corriere.it/video/milano-varallo-sesia-sul-treno-a-vapore/',
'md5': '2dfdce7af249654ad27eeba03fe1e08d',
'info_dict': {
'id': 'd8f6c8d0-f7d7-11e8-bfca-f74cf4634191',
'display_id': 'milano-varallo-sesia-sul-treno-a-vapore',
'ext': 'mp4',
'title': 'Dovizioso, il contatto con Zarco e la caduta. E anche Vale finisce a terra',
'description': 'md5:8c6e905dc3b9413218beca11ebd69778',
'uploader': 'AMorici',
'title': 'Milano-Varallo Sesia sul treno a vapore',
'description': 'md5:6348f47aac230397fe341a74f7678d53',
'uploader': 'DOVE Viaggi',
}
}, {
'url': 'https://video.corriere.it/video-360/metro-copenaghen-tutta-italiana/a248a7f0-e2db-11e9-9830-af2de6b1f945',
@@ -391,13 +333,15 @@ class RCSVariousIE(RCSBaseIE):
_VALID_URL = r'''(?x)https?://www\.
(?P<cdn>
leitv\.it|
youreporter\.it
youreporter\.it|
amica\.it
)/(?:[^/]+/)?(?P<id>[^/]+?)(?:$|\?|/)'''
_TESTS = [{
'url': 'https://www.leitv.it/benessere/mal-di-testa-come-combatterlo-ed-evitarne-la-comparsa/',
'md5': '92b4e63667b8f95acb0a04da25ae28a1',
'url': 'https://www.leitv.it/benessere/mal-di-testa/',
'md5': '3b7a683d105a7313ec7513b014443631',
'info_dict': {
'id': 'mal-di-testa-come-combatterlo-ed-evitarne-la-comparsa',
'id': 'leitv-0000125151',
'display_id': 'mal-di-testa',
'ext': 'mp4',
'title': 'Cervicalgia e mal di testa, il video con i suggerimenti dell\'esperto',
'description': 'md5:ae21418f34cee0b8d02a487f55bcabb5',
@@ -405,12 +349,24 @@ class RCSVariousIE(RCSBaseIE):
}
}, {
'url': 'https://www.youreporter.it/fiume-sesia-3-ottobre-2020/',
'md5': '8dccd436b47a830bab5b4a88232f391a',
'md5': '3989b6d603482611a2abd2f32b79f739',
'info_dict': {
'id': 'fiume-sesia-3-ottobre-2020',
'id': 'youreporter-0000332574',
'display_id': 'fiume-sesia-3-ottobre-2020',
'ext': 'mp4',
'title': 'Fiume Sesia 3 ottobre 2020',
'description': 'md5:0070eef1cc884d13c970a4125063de55',
'uploader': 'youreporter.it',
}
}, {
'url': 'https://www.amica.it/video-post/saint-omer-al-cinema-il-film-leone-dargento-che-ribalta-gli-stereotipi/',
'md5': '187cce524dfd0343c95646c047375fc4',
'info_dict': {
'id': 'amica-0001225365',
'display_id': 'saint-omer-al-cinema-il-film-leone-dargento-che-ribalta-gli-stereotipi',
'ext': 'mp4',
'title': '"Saint Omer": al cinema il film Leone d\'argento che ribalta gli stereotipi',
'description': 'md5:b1c8869c2dcfd6073a2a311ba0008aa8',
'uploader': 'rcs.it',
}
}]


@@ -14,7 +14,7 @@ from ..utils import (
class RedditIE(InfoExtractor):
_VALID_URL = r'https?://(?P<subdomain>[^/]+\.)?reddit(?:media)?\.com/r/(?P<slug>[^/]+/comments/(?P<id>[^/?#&]+))'
_VALID_URL = r'https?://(?P<subdomain>[^/]+\.)?reddit(?:media)?\.com/(?P<slug>(?:r|user)/[^/]+/comments/(?P<id>[^/?#&]+))'
_TESTS = [{
'url': 'https://www.reddit.com/r/videos/comments/6rrwyj/that_small_heart_attack/',
'info_dict': {
@@ -58,6 +58,29 @@ class RedditIE(InfoExtractor):
'age_limit': 0,
'channel_id': 'aww',
},
}, {
# User post
'url': 'https://www.reddit.com/user/creepyt0es/comments/nip71r/i_plan_to_make_more_stickers_and_prints_check/',
'info_dict': {
'id': 'zasobba6wp071',
'ext': 'mp4',
'display_id': 'nip71r',
'title': 'I plan to make more stickers and prints! Check them out on my Etsy! Or get them through my Patreon. Links below.',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'thumbnails': 'count:5',
'timestamp': 1621709093,
'upload_date': '20210522',
'uploader': 'creepyt0es',
'duration': 6,
'like_count': int,
'dislike_count': int,
'comment_count': int,
'age_limit': 0,
'channel_id': 'u_creepyt0es',
},
'params': {
'skip_download': True,
},
}, {
# videos embedded in reddit text post
'url': 'https://www.reddit.com/r/KamenRider/comments/wzqkxp/finale_kamen_rider_revice_episode_50_family_to/',
@@ -84,6 +107,7 @@ class RedditIE(InfoExtractor):
'dislike_count': int,
'comment_count': int,
'age_limit': 0,
'channel_id': 'dumbfuckers_club',
},
}, {
'url': 'https://www.reddit.com/r/videos/comments/6rrwyj',
@@ -124,10 +148,10 @@ class RedditIE(InfoExtractor):
self._set_cookie('.reddit.com', 'reddit_session', self._gen_session_id())
self._set_cookie('.reddit.com', '_options', '%7B%22pref_quarantine_optin%22%3A%20true%7D')
data = self._download_json(f'https://{subdomain}reddit.com/r/{slug}/.json', video_id, fatal=False)
data = self._download_json(f'https://{subdomain}reddit.com/{slug}/.json', video_id, fatal=False)
if not data:
# Fall back to old.reddit.com in case the requested subdomain fails
data = self._download_json(f'https://old.reddit.com/r/{slug}/.json', video_id)
data = self._download_json(f'https://old.reddit.com/{slug}/.json', video_id)
data = data[0]['data']['children'][0]['data']
video_url = data['url']
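
The following is a minimal sketch of what the widened `_VALID_URL` and the slug-based JSON endpoint achieve together. The pattern is copied from the new line in this diff; the helper name `build_json_url` and the handling of a missing subdomain (`or ''`) are assumptions made for illustration, not the extractor's own code.

```python
import re

# Pattern copied from the updated RedditIE._VALID_URL in this diff.
VALID_URL = (
    r'https?://(?P<subdomain>[^/]+\.)?reddit(?:media)?\.com/'
    r'(?P<slug>(?:r|user)/[^/]+/comments/(?P<id>[^/?#&]+))'
)


def build_json_url(url):
    """Return (post_id, json_url) for a post URL, or None if it does not match.

    Hypothetical helper: the slug now carries the leading "r/" or "user/"
    segment, so the endpoint no longer hard-codes "/r/".
    """
    match = re.match(VALID_URL, url)
    if not match:
        return None
    subdomain = match.group('subdomain') or ''  # assumption: treat a missing subdomain as empty
    slug = match.group('slug')
    return match.group('id'), f'https://{subdomain}reddit.com/{slug}/.json'
```

With this shape, subreddit posts and user-profile posts resolve through the same code path.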


@@ -1,8 +1,5 @@
from .common import InfoExtractor
from ..utils import (
int_or_none,
remove_start,
)
from ..utils import extract_attributes, int_or_none, remove_start, traverse_obj
class RozhlasIE(InfoExtractor):
@@ -45,3 +42,138 @@ class RozhlasIE(InfoExtractor):
'duration': duration,
'vcodec': 'none',
}
class RozhlasVltavaIE(InfoExtractor):
_VALID_URL = r'https?://(?:\w+\.rozhlas|english\.radio)\.cz/[\w-]+-(?P<id>\d+)'
_TESTS = [{
'url': 'https://wave.rozhlas.cz/papej-masicko-porcujeme-a-bilancujeme-filmy-a-serialy-ktere-letos-zabily-8891337',
'md5': 'ba2fdbc1242fc16771c7695d271ec355',
'info_dict': {
'id': 8891337,
'title': 'md5:21f99739d04ab49d8c189ec711eef4ec',
},
'playlist_count': 1,
'playlist': [{
'md5': 'ba2fdbc1242fc16771c7695d271ec355',
'info_dict': {
'id': '10520988',
'ext': 'mp3',
'title': 'Papej masíčko! Porcujeme a bilancujeme filmy a seriály, které to letos zabily',
'description': 'md5:1c6d29fb9564e1f17fc1bb83ae7da0bc',
'duration': 1574,
'artist': 'Aleš Stuchlý',
'channel_id': 'radio-wave',
},
}]
}, {
'url': 'https://wave.rozhlas.cz/poslechnete-si-neklid-podcastovy-thriller-o-vine-strachu-a-vztahu-ktery-zasel-8554744',
'info_dict': {
'id': 8554744,
'title': 'Poslechněte si Neklid. Podcastový thriller o vině, strachu a vztahu, který zašel příliš daleko',
},
'playlist_count': 5,
'playlist': [{
'md5': '93d4109cf8f40523699ae9c1d4600bdd',
'info_dict': {
'id': '9890713',
'ext': 'mp3',
'title': 'Neklid #1',
'description': '1. díl: Neklid: 1. díl',
'duration': 1025,
'artist': 'Josef Kokta',
'channel_id': 'radio-wave',
'chapter': 'Neklid #1',
'chapter_number': 1,
},
}, {
'md5': 'e9763235be4a6dcf94bc8a5bac1ca126',
'info_dict': {
'id': '9890716',
'ext': 'mp3',
'title': 'Neklid #2',
'description': '2. díl: Neklid: 2. díl',
'duration': 768,
'artist': 'Josef Kokta',
'channel_id': 'radio-wave',
'chapter': 'Neklid #2',
'chapter_number': 2,
},
}, {
'md5': '00b642ea94b78cc949ac84da09f87895',
'info_dict': {
'id': '9890722',
'ext': 'mp3',
'title': 'Neklid #3',
'description': '3. díl: Neklid: 3. díl',
'duration': 607,
'artist': 'Josef Kokta',
'channel_id': 'radio-wave',
'chapter': 'Neklid #3',
'chapter_number': 3,
},
}, {
'md5': 'faef97b1b49da7df874740f118c19dea',
'info_dict': {
'id': '9890728',
'ext': 'mp3',
'title': 'Neklid #4',
'description': '4. díl: Neklid: 4. díl',
'duration': 621,
'artist': 'Josef Kokta',
'channel_id': 'radio-wave',
'chapter': 'Neklid #4',
'chapter_number': 4,
},
}, {
'md5': '6e729fa39b647325b868d419c76f3efa',
'info_dict': {
'id': '9890734',
'ext': 'mp3',
'title': 'Neklid #5',
'description': '5. díl: Neklid: 5. díl',
'duration': 908,
'artist': 'Josef Kokta',
'channel_id': 'radio-wave',
'chapter': 'Neklid #5',
'chapter_number': 5,
},
}]
}]
def _extract_video(self, entry):
chapter_number = int_or_none(traverse_obj(entry, ('meta', 'ga', 'contentSerialPart')))
return {
'id': entry['meta']['ga']['contentId'],
'title': traverse_obj(entry, ('meta', 'ga', 'contentName')),
'description': entry.get('title'),
'duration': entry.get('duration'),
'artist': traverse_obj(entry, ('meta', 'ga', 'contentAuthor')),
'channel_id': traverse_obj(entry, ('meta', 'ga', 'contentCreator')),
'chapter': traverse_obj(entry, ('meta', 'ga', 'contentNameShort')) if chapter_number else None,
'chapter_number': chapter_number,
'formats': [{
'url': audio_link['url'],
'ext': audio_link.get('variant'),
'format_id': audio_link.get('variant'),
'abr': audio_link.get('bitrate'),
'acodec': audio_link.get('variant'),
'vcodec': 'none',
} for audio_link in entry['audioLinks']],
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
# FIXME: Use get_element_text_and_html_by_tag when it accepts less strict html
data = self._parse_json(extract_attributes(self._search_regex(
r'(<div class="mujRozhlasPlayer" data-player=\'[^\']+\'>)',
webpage, 'player'))['data-player'], video_id)['data']
return {
'_type': 'playlist',
'id': data.get('embedId'),
'title': traverse_obj(data, ('series', 'title')),
'entries': map(self._extract_video, data['playlist']),
}
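
As a rough illustration of the `data-player` parsing step in `RozhlasVltavaIE._real_extract`, the sketch below uses only the standard library. `html.unescape` stands in for yt-dlp's `extract_attributes`, and the sample page and the `parse_player` helper are hypothetical; the real attribute payload is much larger.

```python
import html
import json
import re

# Hypothetical payload and page fragment, shaped like the fields the diff reads.
PAYLOAD = {'data': {'embedId': 8891337, 'playlist': [{'duration': 1574}]}}
SAMPLE_PAGE = (
    '<div class="mujRozhlasPlayer" data-player=\'%s\'>'
    % html.escape(json.dumps(PAYLOAD))
)


def parse_player(webpage):
    """Return the 'data' object stored in the data-player attribute, or None."""
    match = re.search(
        r'<div class="mujRozhlasPlayer" data-player=\'([^\']+)\'>', webpage)
    if not match:
        return None
    # The attribute value is HTML-escaped JSON, so unescape before decoding.
    return json.loads(html.unescape(match.group(1)))['data']
```

Each element of the returned `playlist` would then be mapped to one playlist entry.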


@@ -186,7 +186,7 @@ class RumbleEmbedIE(InfoExtractor):
'filesize': 'size',
'width': 'w',
'height': 'h',
}, default={})
}, expected_type=lambda x: int(x) or None)
})
subtitles = {

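The effect of `expected_type=lambda x: int(x) or None` in the Rumble hunk can be sketched without yt-dlp: values are coerced to `int`, and a zero or unparsable value is treated as missing. The `pick_meta` helper below is hypothetical, and it assumes that `traverse_obj` drops such values from the resulting dict.

```python
def pick_meta(meta):
    """Map raw keys to format fields, dropping zero or invalid numbers.

    Hypothetical stand-in for the traverse_obj call in the diff.
    """
    mapping = {'filesize': 'size', 'width': 'w', 'height': 'h'}
    result = {}
    for field, key in mapping.items():
        try:
            value = int(meta[key]) or None  # same expression as the lambda: 0 becomes None
        except (KeyError, TypeError, ValueError):
            value = None
        if value is not None:
            result[field] = value
    return result
```

A reported size of `0` is therefore omitted instead of being passed on as a real file size.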

@@ -1,11 +1,13 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
ExtractorError,
float_or_none,
format_field,
int_or_none,
join_nonempty,
traverse_obj,
unescapeHTML,
unified_timestamp,
urlencode_postdata,
url_or_none,
)
@@ -15,32 +17,41 @@ class ServusIE(InfoExtractor):
(?:www\.)?
(?:
servus\.com/(?:(?:at|de)/p/[^/]+|tv/videos)|
(?:servustv|pm-wissen)\.com/videos
(?:servustv|pm-wissen)\.com/(?:[^/]+/)?v(?:ideos)?
)
/(?P<id>[aA]{2}-\w+|\d+-\d+)
/(?P<id>[aA]{2}-?\w+|\d+-\d+)
'''
_TESTS = [{
# new URL schema
'url': 'https://www.servustv.com/videos/aa-1t6vbu5pw1w12/',
'md5': '60474d4c21f3eb148838f215c37f02b9',
# URL schema v3
'url': 'https://www.servustv.com/natur/v/aa-28bycqnh92111/',
'info_dict': {
'id': 'AA-1T6VBU5PW1W12',
'id': 'AA-28BYCQNH92111',
'ext': 'mp4',
'title': 'Die Grünen aus Sicht des Volkes',
'alt_title': 'Talk im Hangar-7 Voxpops Gruene',
'description': 'md5:1247204d85783afe3682644398ff2ec4',
'title': 'Klettersteige in den Alpen',
'description': 'md5:25e47ddd83a009a0f9789ba18f2850ce',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 62.442,
'timestamp': 1605193976,
'upload_date': '20201112',
'series': 'Talk im Hangar-7',
'season': 'Season 9',
'season_number': 9,
'episode': 'Episode 31 - September 14',
'episode_number': 31,
}
'duration': 2823,
'timestamp': 1655752333,
'upload_date': '20220620',
'series': 'Bergwelten',
'season': 'Season 11',
'season_number': 11,
'episode': 'Episode 8 - Vie Ferrate Klettersteige in den Alpen',
'episode_number': 8,
},
'params': {'skip_download': 'm3u8'}
}, {
# old URL schema
'url': 'https://www.servustv.com/natur/v/aa-1xg5xwmgw2112/',
'only_matching': True,
}, {
'url': 'https://www.servustv.com/natur/v/aansszcx3yi9jmlmhdc1/',
'only_matching': True,
}, {
# URL schema v2
'url': 'https://www.servustv.com/videos/aa-1t6vbu5pw1w12/',
'only_matching': True,
}, {
# URL schema v1
'url': 'https://www.servus.com/de/p/Die-Gr%C3%BCnen-aus-Sicht-des-Volkes/AA-1T6VBU5PW1W12/',
'only_matching': True,
}, {
@@ -60,85 +71,65 @@ class ServusIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url).upper()
token = self._download_json(
'https://auth.redbullmediahouse.com/token', video_id,
'Downloading token', data=urlencode_postdata({
'grant_type': 'client_credentials',
}), headers={
'Authorization': 'Basic SVgtMjJYNEhBNFdEM1cxMTpEdDRVSkFLd2ZOMG5IMjB1NGFBWTBmUFpDNlpoQ1EzNA==',
})
access_token = token['access_token']
token_type = token.get('token_type', 'Bearer')
video = self._download_json(
'https://sparkle-api.liiift.io/api/v1/stv/channels/international/assets/%s' % video_id,
video_id, 'Downloading video JSON', headers={
'Authorization': '%s %s' % (token_type, access_token),
})
'https://api-player.redbull.com/stv/servus-tv?timeZone=Europe/Berlin',
video_id, 'Downloading video JSON', query={'videoId': video_id})
if not video.get('videoUrl'):
self._report_errors(video)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
video['videoUrl'], video_id, 'mp4', m3u8_id='hls')
formats = []
thumbnail = None
for resource in video['resources']:
if not isinstance(resource, dict):
continue
format_url = url_or_none(resource.get('url'))
if not format_url:
continue
extension = resource.get('extension')
type_ = resource.get('type')
if extension == 'jpg' or type_ == 'reference_keyframe':
thumbnail = format_url
continue
ext = determine_ext(format_url)
if type_ == 'dash' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
format_url, video_id, mpd_id='dash', fatal=False))
elif type_ == 'hls' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif extension == 'mp4' or ext == 'mp4':
formats.append({
'url': format_url,
'format_id': type_,
'width': int_or_none(resource.get('width')),
'height': int_or_none(resource.get('height')),
})
attrs = {}
for attribute in video['attributes']:
if not isinstance(attribute, dict):
continue
key = attribute.get('fieldKey')
value = attribute.get('fieldValue')
if not key or not value:
continue
attrs[key] = value
title = attrs.get('title_stv') or video_id
alt_title = attrs.get('title')
description = attrs.get('long_description') or attrs.get('short_description')
series = attrs.get('label')
season = attrs.get('season')
episode = attrs.get('chapter')
duration = float_or_none(attrs.get('duration'), scale=1000)
season = video.get('season')
season_number = int_or_none(self._search_regex(
r'Season (\d+)', season or '', 'season number', default=None))
episode = video.get('chapter')
episode_number = int_or_none(self._search_regex(
r'Episode (\d+)', episode or '', 'episode number', default=None))
return {
'id': video_id,
'title': title,
'alt_title': alt_title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': unified_timestamp(video.get('lastPublished')),
'series': series,
'title': video.get('title'),
'description': self._get_description(video_id) or video.get('description'),
'thumbnail': video.get('poster'),
'duration': float_or_none(video.get('duration')),
'timestamp': unified_timestamp(video.get('currentSunrise')),
'series': video.get('label'),
'season': season,
'season_number': season_number,
'episode': episode,
'episode_number': episode_number,
'formats': formats,
'subtitles': subtitles,
}
def _get_description(self, video_id):
info = self._download_json(
f'https://backend.servustv.com/wp-json/rbmh/v2/media_asset/aa_id/{video_id}?fieldset=page',
video_id, fatal=False)
return join_nonempty(*traverse_obj(info, (
('stv_short_description', 'stv_long_description'),
{lambda x: unescapeHTML(x.replace('\n\n', '\n'))})), delim='\n\n')
def _report_errors(self, video):
playability_errors = traverse_obj(video, ('playabilityErrors', ...))
if not playability_errors:
raise ExtractorError('No videoUrl and no information about errors')
elif 'FSK_BLOCKED' in playability_errors:
details = traverse_obj(video, ('playabilityErrorDetails', 'FSK_BLOCKED'), expected_type=dict)
message = format_field(''.join((
format_field(details, 'minEveningHour', ' from %02d:00'),
format_field(details, 'maxMorningHour', ' to %02d:00'),
format_field(details, 'minAge', ' (Minimum age %d)'),
)), None, 'Only available%s') or 'Blocked by FSK with unknown availability'
elif 'NOT_YET_AVAILABLE' in playability_errors:
message = format_field(
video, (('playabilityErrorDetails', 'NOT_YET_AVAILABLE', 'availableFrom'), 'currentSunrise'),
'Only available from %s') or 'Video not yet available with unknown availability'
else:
message = f'Video unavailable: {", ".join(playability_errors)}'
raise ExtractorError(message, expected=True)
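
The ID normalization and the season/episode number parsing in the new `ServusIE._real_extract` can be approximated with plain `re`. This is only a sketch: `normalize_id` and `number_from_label` are hypothetical names, and the extractor itself uses `self._search_regex` together with `int_or_none`.

```python
import re


def normalize_id(raw_id):
    """Upper-case the matched ID, as the extractor does before calling the API."""
    return raw_id.upper()


def number_from_label(label, prefix):
    """Return the integer that follows `prefix` in `label`, or None if absent."""
    match = re.search(rf'{re.escape(prefix)} (\d+)', label or '')
    return int(match.group(1)) if match else None
```

A missing `season` or `chapter` field yields `None` here, which mirrors the `default=None` behaviour in the diff.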


@@ -29,6 +29,7 @@ class SlidesLiveIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.jpg',
'thumbnails': 'count:42',
'chapters': 'count:41',
'duration': 1638,
},
'params': {
'skip_download': 'm3u8',
@@ -45,6 +46,7 @@ class SlidesLiveIE(InfoExtractor):
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'thumbnails': 'count:640',
'chapters': 'count:639',
'duration': 9832,
},
'params': {
'skip_download': 'm3u8',
@@ -61,6 +63,7 @@ class SlidesLiveIE(InfoExtractor):
'timestamp': 1643728135,
'thumbnails': 'count:3',
'chapters': 'count:2',
'duration': 5889,
},
'params': {
'skip_download': 'm3u8',
@@ -110,6 +113,7 @@ class SlidesLiveIE(InfoExtractor):
'timestamp': 1629671508,
'upload_date': '20210822',
'chapters': 'count:7',
'duration': 326,
},
'params': {
'skip_download': 'm3u8',
@@ -126,6 +130,7 @@ class SlidesLiveIE(InfoExtractor):
'timestamp': 1654714970,
'upload_date': '20220608',
'chapters': 'count:6',
'duration': 171,
},
'params': {
'skip_download': 'm3u8',
@@ -142,6 +147,7 @@ class SlidesLiveIE(InfoExtractor):
'timestamp': 1622806321,
'upload_date': '20210604',
'chapters': 'count:15',
'duration': 306,
},
'params': {
'skip_download': 'm3u8',
@@ -158,6 +164,7 @@ class SlidesLiveIE(InfoExtractor):
'timestamp': 1654714896,
'upload_date': '20220608',
'chapters': 'count:8',
'duration': 295,
},
'params': {
'skip_download': 'm3u8',
@@ -174,6 +181,7 @@ class SlidesLiveIE(InfoExtractor):
'thumbnails': 'count:22',
'upload_date': '20220608',
'chapters': 'count:21',
'duration': 294,
},
'params': {
'skip_download': 'm3u8',
@@ -196,6 +204,7 @@ class SlidesLiveIE(InfoExtractor):
'thumbnails': 'count:30',
'upload_date': '20220608',
'chapters': 'count:31',
'duration': 272,
},
}, {
'info_dict': {
@@ -237,6 +246,7 @@ class SlidesLiveIE(InfoExtractor):
'thumbnails': 'count:43',
'upload_date': '20220608',
'chapters': 'count:43',
'duration': 315,
},
}, {
'info_dict': {
@@ -285,6 +295,23 @@ class SlidesLiveIE(InfoExtractor):
'params': {
'skip_download': 'm3u8',
},
}, {
# /v3/ slides, .png only, service_name = yoda
'url': 'https://slideslive.com/38983994',
'info_dict': {
'id': '38983994',
'ext': 'mp4',
'title': 'Zero-Shot AutoML with Pretrained Models',
'timestamp': 1662384834,
'upload_date': '20220905',
'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'thumbnails': 'count:23',
'chapters': 'count:22',
'duration': 295,
},
'params': {
'skip_download': 'm3u8',
},
}, {
# service_name = yoda
'url': 'https://slideslive.com/38903721/magic-a-scientific-resurrection-of-an-esoteric-legend',
@@ -311,6 +338,7 @@ class SlidesLiveIE(InfoExtractor):
'timestamp': 1629671508,
'upload_date': '20210822',
'chapters': 'count:7',
'duration': 326,
},
'params': {
'skip_download': 'm3u8',
@@ -369,15 +397,28 @@ class SlidesLiveIE(InfoExtractor):
return m3u8_dict
def _extract_formats(self, cdn_hostname, path, video_id):
formats = []
formats.extend(self._extract_m3u8_formats(
def _extract_formats_and_duration(self, cdn_hostname, path, video_id, skip_duration=False):
formats, duration = [], None
hls_formats = self._extract_m3u8_formats(
f'https://{cdn_hostname}/{path}/master.m3u8',
video_id, 'mp4', m3u8_id='hls', fatal=False, live=True))
formats.extend(self._extract_mpd_formats(
f'https://{cdn_hostname}/{path}/master.mpd',
video_id, mpd_id='dash', fatal=False))
return formats
video_id, 'mp4', m3u8_id='hls', fatal=False, live=True)
if hls_formats:
if not skip_duration:
duration = self._extract_m3u8_vod_duration(
hls_formats[0]['url'], video_id, note='Extracting duration from HLS manifest')
formats.extend(hls_formats)
dash_formats = self._extract_mpd_formats(
f'https://{cdn_hostname}/{path}/master.mpd', video_id, mpd_id='dash', fatal=False)
if dash_formats:
if not duration and not skip_duration:
duration = self._extract_mpd_vod_duration(
f'https://{cdn_hostname}/{path}/master.mpd', video_id,
note='Extracting duration from DASH manifest')
formats.extend(dash_formats)
return formats, duration
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -406,44 +447,42 @@ class SlidesLiveIE(InfoExtractor):
assert service_name in ('url', 'yoda', 'vimeo', 'youtube')
service_id = player_info['service_id']
slides_info_url = None
slides, slides_info = [], []
slide_url_template = 'https://slides.slideslive.com/%s/slides/original/%s%s'
slides, slides_info = {}, []
if player_info.get('slides_json_url'):
slides_info_url = player_info['slides_json_url']
slides = traverse_obj(self._download_json(
slides_info_url, video_id, fatal=False,
note='Downloading slides JSON', errnote=False), 'slides', expected_type=list) or []
for slide_id, slide in enumerate(slides, start=1):
slides = self._download_json(
player_info['slides_json_url'], video_id, fatal=False,
note='Downloading slides JSON', errnote=False) or {}
slide_ext_default = '.png'
slide_quality = traverse_obj(slides, ('slide_qualities', 0))
if slide_quality:
slide_ext_default = '.jpg'
slide_url_template = f'https://cdn.slideslive.com/data/presentations/%s/slides/{slide_quality}/%s%s'
for slide_id, slide in enumerate(traverse_obj(slides, ('slides', ...), expected_type=dict), 1):
slides_info.append((
slide_id, traverse_obj(slide, ('image', 'name')),
traverse_obj(slide, ('image', 'extname'), default=slide_ext_default),
int_or_none(slide.get('time'), scale=1000)))
if not slides and player_info.get('slides_xml_url'):
slides_info_url = player_info['slides_xml_url']
slides = self._download_xml(
slides_info_url, video_id, fatal=False,
player_info['slides_xml_url'], video_id, fatal=False,
note='Downloading slides XML', errnote='Failed to download slides info')
for slide_id, slide in enumerate(slides.findall('./slide'), start=1):
slide_url_template = 'https://cdn.slideslive.com/data/presentations/%s/slides/big/%s%s'
for slide_id, slide in enumerate(slides.findall('./slide') if slides else [], 1):
slides_info.append((
slide_id, xpath_text(slide, './slideName', 'name'),
slide_id, xpath_text(slide, './slideName', 'name'), '.jpg',
int_or_none(xpath_text(slide, './timeSec', 'time'))))
slides_version = int(self._search_regex(
r'https?://slides\.slideslive\.com/\d+/v(\d+)/\w+\.(?:json|xml)',
slides_info_url, 'slides version', default=0))
if slides_version < 4:
slide_url_template = 'https://cdn.slideslive.com/data/presentations/%s/slides/big/%s.jpg'
else:
slide_url_template = 'https://slides.slideslive.com/%s/slides/original/%s.png'
chapters, thumbnails = [], []
if url_or_none(player_info.get('thumbnail')):
thumbnails.append({'id': 'cover', 'url': player_info['thumbnail']})
for slide_id, slide_path, start_time in slides_info:
for slide_id, slide_path, slide_ext, start_time in slides_info:
if slide_path:
thumbnails.append({
'id': f'{slide_id:03d}',
'url': slide_url_template % (video_id, slide_path),
'url': slide_url_template % (video_id, slide_path, slide_ext),
})
chapters.append({
'title': f'Slide {slide_id:03d}',
@@ -473,7 +512,12 @@ class SlidesLiveIE(InfoExtractor):
if service_name == 'url':
info['url'] = service_id
elif service_name == 'yoda':
info['formats'] = self._extract_formats(player_info['video_servers'][0], service_id, video_id)
formats, duration = self._extract_formats_and_duration(
player_info['video_servers'][0], service_id, video_id)
info.update({
'duration': duration,
'formats': formats,
})
else:
info.update({
'_type': 'url_transparent',
@@ -486,7 +530,7 @@ class SlidesLiveIE(InfoExtractor):
f'https://player.vimeo.com/video/{service_id}',
{'http_headers': {'Referer': url}})
video_slides = traverse_obj(slides, (..., 'video', 'id'))
video_slides = traverse_obj(slides, ('slides', ..., 'video', 'id'))
if not video_slides:
return info
@@ -500,7 +544,7 @@ class SlidesLiveIE(InfoExtractor):
'videos': ','.join(video_slides),
}, note='Downloading video slides info', errnote='Failed to download video slides info') or {}
for slide_id, slide in enumerate(slides, 1):
for slide_id, slide in enumerate(traverse_obj(slides, ('slides', ...)), 1):
if not traverse_obj(slide, ('video', 'service')) == 'yoda':
continue
video_path = traverse_obj(slide, ('video', 'id'))
@@ -508,7 +552,8 @@ class SlidesLiveIE(InfoExtractor):
video_path, 'video_servers', ...), get_all=False)
if not cdn_hostname or not video_path:
continue
formats = self._extract_formats(cdn_hostname, video_path, video_id)
formats, _ = self._extract_formats_and_duration(
cdn_hostname, video_path, video_id, skip_duration=True)
if not formats:
continue
yield {

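The slide URL selection introduced in the SlidesLive hunks can be summarized in a small sketch. The two templates and the default extensions are taken from the diff; the `slide_url` helper and its parameter names are hypothetical, and `extname or default_ext` only approximates `traverse_obj(..., default=...)`.

```python
def slide_url(video_id, name, extname=None, quality=None):
    """Build a slide image URL following the logic in the diff.

    Without a slide quality, the "original" PNG location is used; with one,
    the CDN location for that quality is used and the default becomes JPG.
    """
    template = 'https://slides.slideslive.com/%s/slides/original/%s%s'
    default_ext = '.png'
    if quality:
        default_ext = '.jpg'
        template = f'https://cdn.slideslive.com/data/presentations/%s/slides/{quality}/%s%s'
    return template % (video_id, name, extname or default_ext)
```

Carrying the extension per slide is what lets a single template serve both `.png` and `.jpg` presentations.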

@@ -1,95 +1,110 @@
from .common import InfoExtractor
from ..utils import (
clean_html,
float_or_none,
int_or_none,
parse_iso8601,
parse_qs,
strip_or_none,
try_get,
format_field,
traverse_obj,
unified_timestamp,
strip_or_none
)
class SportDeutschlandIE(InfoExtractor):
_VALID_URL = r'https?://sportdeutschland\.tv/(?P<id>(?:[^/]+/)?[^?#/&]+)'
_TESTS = [{
'url': 'https://sportdeutschland.tv/badminton/re-live-deutsche-meisterschaften-2020-halbfinals?playlistId=0',
'url': 'https://sportdeutschland.tv/blauweissbuchholztanzsport/buchholzer-formationswochenende-2023-samstag-1-bundesliga-landesliga',
'info_dict': {
'id': '5318cac0275701382770543d7edaf0a0',
'id': '983758e9-5829-454d-a3cf-eb27bccc3c94',
'ext': 'mp4',
'title': 'Re-live: Deutsche Meisterschaften 2020 - Halbfinals - Teil 1',
'duration': 16106.36,
},
'params': {
'noplaylist': True,
# m3u8 download
'skip_download': True,
},
'title': 'Buchholzer Formationswochenende 2023 - Samstag - 1. Bundesliga / Landesliga',
'description': 'md5:a288c794a5ee69e200d8f12982f81a87',
'live_status': 'was_live',
'channel': 'Blau-Weiss Buchholz Tanzsport',
'channel_url': 'https://sportdeutschland.tv/blauweissbuchholztanzsport',
'channel_id': '93ec33c9-48be-43b6-b404-e016b64fdfa3',
'display_id': '9839a5c7-0dbb-48a8-ab63-3b408adc7b54',
'duration': 32447,
'upload_date': '20230114',
'timestamp': 1673730018.0,
}
}, {
'url': 'https://sportdeutschland.tv/badminton/re-live-deutsche-meisterschaften-2020-halbfinals?playlistId=0',
'url': 'https://sportdeutschland.tv/deutscherbadmintonverband/bwf-tour-1-runde-feld-1-yonex-gainward-german-open-2022-0',
'info_dict': {
'id': 'c6e2fdd01f63013854c47054d2ab776f',
'title': 'Re-live: Deutsche Meisterschaften 2020 - Halbfinals',
'description': 'md5:5263ff4c31c04bb780c9f91130b48530',
'duration': 31397,
},
'playlist_count': 2,
}, {
'url': 'https://sportdeutschland.tv/freeride-world-tour-2021-fieberbrunn-oesterreich',
'only_matching': True,
'id': '95b97d9a-04f6-4880-9039-182985c33943',
'ext': 'mp4',
'title': 'BWF Tour: 1. Runde Feld 1 - YONEX GAINWARD German Open 2022',
'description': 'md5:2afb5996ceb9ac0b2ac81f563d3a883e',
'live_status': 'was_live',
'channel': 'Deutscher Badminton Verband',
'channel_url': 'https://sportdeutschland.tv/deutscherbadmintonverband',
'channel_id': '93ca5866-2551-49fc-8424-6db35af58920',
'display_id': '95c80c52-6b9a-4ae9-9197-984145adfced',
'duration': 41097,
'upload_date': '20220309',
'timestamp': 1646860727.0,
}
}]
def _real_extract(self, url):
display_id = self._match_id(url)
data = self._download_json(
'https://backend.sportdeutschland.tv/api/permalinks/' + display_id,
meta = self._download_json(
'https://api.sportdeutschland.tv/api/stateless/frontend/assets/' + display_id,
display_id, query={'access_token': 'true'})
asset = data['asset']
title = (asset.get('title') or asset['label']).strip()
asset_id = asset.get('id') or asset.get('uuid')
asset_id = traverse_obj(meta, 'id', 'uuid')
info = {
'id': asset_id,
'title': title,
'description': clean_html(asset.get('body') or asset.get('description')) or asset.get('teaser'),
'duration': int_or_none(asset.get('seconds')),
'channel_url': format_field(meta, ('profile', 'slug'), 'https://sportdeutschland.tv/%s'),
**traverse_obj(meta, {
'title': (('title', 'name'), {strip_or_none}),
'description': 'description',
'channel': ('profile', 'name'),
'channel_id': ('profile', 'id'),
'is_live': 'currently_live',
'was_live': 'was_live'
}, get_all=False)
}
videos = asset.get('videos') or []
if len(videos) > 1:
playlist_id = parse_qs(url).get('playlistId', [None])[0]
if not self._yes_playlist(playlist_id, asset_id):
videos = [videos[int(playlist_id)]]
def entries():
for i, video in enumerate(videos, 1):
video_id = video.get('uuid')
video_url = video.get('url')
if not (video_id and video_url):
continue
formats = self._extract_m3u8_formats(
video_url.replace('.smil', '.m3u8'), video_id, 'mp4', fatal=False)
if not formats and not self.get_param('ignore_no_formats'):
continue
yield {
'id': video_id,
'formats': formats,
'title': title + ' - ' + (video.get('label') or 'Teil %d' % i),
'duration': float_or_none(video.get('duration')),
}
videos = meta.get('videos') or []
if len(videos) > 1:
info.update({
'_type': 'multi_video',
'entries': entries(),
})
else:
formats = self._extract_m3u8_formats(
videos[0]['url'].replace('.smil', '.m3u8'), asset_id, 'mp4')
section_title = strip_or_none(try_get(data, lambda x: x['section']['title']))
info.update({
'formats': formats,
'display_id': asset.get('permalink'),
'thumbnail': try_get(asset, lambda x: x['images'][0]),
'categories': [section_title] if section_title else None,
'view_count': int_or_none(asset.get('views')),
'is_live': asset.get('is_live') is True,
'timestamp': parse_iso8601(asset.get('date') or asset.get('published_at')),
})
'entries': self.process_video_or_stream(asset_id, video)
} for video in enumerate(videos) if video.get('formats'))
elif len(videos) == 1:
info.update(
self.process_video_or_stream(asset_id, videos[0])
)
livestream = meta.get('livestream')
if livestream is not None:
info.update(
self.process_video_or_stream(asset_id, livestream)
)
return info
def process_video_or_stream(self, asset_id, video):
video_id = video['id']
video_src = video['src']
video_type = video['type']
token = self._download_json(
f'https://api.sportdeutschland.tv/api/frontend/asset-token/{asset_id}',
video_id, query={'type': video_type, 'playback_id': video_src})['token']
formats = self._extract_m3u8_formats(f'https://stream.mux.com/{video_src}.m3u8?token={token}', video_id)
video_data = {
'display_id': video_id,
'formats': formats,
}
if video_type == 'mux_vod':
video_data.update({
'duration': video.get('duration'),
'timestamp': unified_timestamp(video.get('created_at'))
})
return video_data

Some files were not shown because too many files have changed in this diff.