mirror of https://gitlab.com/ytdl-org/youtube-dl.git synced 2026-04-27 00:00:04 -04:00

Compare commits

...

135 Commits

Author SHA1 Message Date
Filippo Valsorda d479e34043 release 2012.11.27 2012-11-27 00:22:39 +01:00
Philipp Hagemeister 240089e5df remove accidental remnants 2012-11-27 00:14:12 +01:00
Philipp Hagemeister 1c469a9480 New optoin --restrict-filenames 2012-11-26 23:58:46 +01:00
Philipp Hagemeister 71f36332dd Remove redundancy in instructions 2012-11-26 23:40:51 +01:00
Philipp Hagemeister 8179d2ba74 Merge branch 'master' of github.com:rg3/youtube-dl 2012-11-26 23:25:04 +01:00
Philipp Hagemeister df4bad3245 Document configuration 2012-11-26 23:24:55 +01:00
Filippo Valsorda a7b5c8d6a8 fix FAQ on how to compile (also, starnge fix in the Makefile) 2012-11-26 22:35:12 +01:00
Philipp Hagemeister 92b91c1878 Use character instead of byte strings 2012-11-26 04:23:20 +01:00
Philipp Hagemeister 7ec1a206ea Remove longs (int does the right thing since Python 2.2, see PEP 237) 2012-11-26 04:13:43 +01:00
Philipp Hagemeister 51937c0869 Add some parentheses around print for #180 2012-11-26 04:05:54 +01:00
Philipp Hagemeister 6b50761222 Merge pull request #538 from zejn/patch-1
Also enable album URLs on Vimeo.
2012-11-25 18:04:11 -08:00
Philipp Hagemeister 6571408dc6 Merge pull request #545 from FiloSottile/alias
Kill (alias) --literal and %(title)
2012-11-25 15:57:57 -08:00
Filippo Valsorda b6fab35b9f alias %(title)s to %(stitle)s 2012-11-25 20:39:42 +01:00
Filippo Valsorda baec15387c aliased --literal to --title 2012-11-25 20:28:49 +01:00
zejn 297d7fd9c0 Also enable album URLs on Vimeo. 2012-11-21 13:24:14 +01:00
Filippo Valsorda 5002aea371 release 2012.11.17 2012-11-17 14:02:31 +01:00
Filippo Valsorda 74033a662d Reworked Vimeo file selection logic (quality, codec) - closes #530 2012-11-13 21:53:18 +01:00
Filippo Valsorda 0526e4f55a Merge pull request #522 from art-zhitnik/master
--(match|reject)-title utf8 fix
2012-11-11 06:22:10 -08:00
Art Zhitnik 39973a0236 Solve the bug of parsing titles with unicode (cyrillic) 2012-11-11 14:09:12 +10:00
Filippo Valsorda 5d40a470a2 quiet the HTMLParser debug info - closes #517 2012-11-09 12:32:07 +01:00
Filippo Valsorda 4cc391461a fix DailyMotion official users videos - closes #281 - by @yvestan 2012-11-07 14:44:10 +01:00
Filippo Valsorda bf95333e5e fixed MetacafeIE (uploader nickname regex) - closes #515 2012-11-06 23:08:10 +01:00
Philipp Hagemeister b7a34316d2 -x for --extract-audio, one of the most popular options 2012-10-30 17:41:38 +01:00
Philipp Hagemeister 74e453bdea New --id option for the old default filename pattern 2012-10-30 17:37:53 +01:00
Philipp Hagemeister 156a59e7a9 Additional tests in file name sanitation 2012-10-29 08:19:54 +01:00
Philipp Hagemeister aeca861f22 Merge pull request #502 from FiloSottile/new_sanitize_filename
My sanitize_filename proposal
2012-10-28 15:33:59 -07:00
Filippo Valsorda 42cb53fcfa modified filename escaping to a "smarter" one 2012-10-28 22:47:02 +01:00
Filippo Valsorda fe4d68e196 slight change to Dailymotion uploader regex (fix) 2012-10-28 21:43:43 +01:00
Philipp Hagemeister 25b7fd9c01 Merge pull request #491 from tyll/master
Update install target
2012-10-26 01:10:25 -07:00
Till Maas e79e8b7dc4 Update install target
- Allow to configure destination directories to fulfill the needs of
  different distributions
- Support DESTDIR variable for staging installation when packaging
- Do not set user/group to root. It requires 'make install' to run as
  root, but then this is the default behaviour anyways.
2012-10-25 21:19:13 +02:00
Filippo Valsorda 965a8b2bc4 Merge pull request #488 from Tailszefox/local
Fix audio bitrate quality for ffmpeg/avconv (closes #487)
2012-10-24 11:42:31 -07:00
Tailszefox f06eaa873e Fix audio bitrate quality for ffmpeg/avconv 2012-10-23 16:37:12 +02:00
Philipp Hagemeister ece34e8951 Merge pull request #486 from Tailszefox/local
Added duration for YouTube videos
2012-10-23 05:53:28 -07:00
Tailszefox 2262a32dd7 Added duration for YouTube videos 2012-10-22 18:32:42 +02:00
Philipp Hagemeister c6c0e23a32 Support raw playlist parameters (Closes #482) 2012-10-22 13:01:36 +02:00
Philipp Hagemeister 02b324a23d Restore 2.5 compat by activating with_statement future 2012-10-22 12:51:20 +02:00
Filippo Valsorda b8005afc20 handle YT urls with #/ redirects (closes #484) 2012-10-22 09:15:27 +02:00
Philipp Hagemeister 073522bc6c Don't use 2.7+ check_output 2012-10-19 23:28:37 +02:00
Philipp Hagemeister 9248cb0549 Merge pull request #472 from gcmalloc/master
Test proposal
2012-10-19 05:48:12 -07:00
gcmalloc 6b41b61119 correcting travis 2012-10-19 12:53:20 +02:00
gcmalloc 591bbe9c90 changing test from md5 to filesize, the file changed between download 2012-10-19 12:53:20 +02:00
gcmalloc fc7376016c cleaning the test that doesn't work with the api for the moment 2012-10-19 12:53:20 +02:00
gcmalloc 97a37c2319 some assertion on the file downloaded 2012-10-19 12:53:20 +02:00
gcmalloc 3afed78a6a removing testing video 2012-10-19 12:53:20 +02:00
gcmalloc 4279a0ca98 correcting test to be compatible with python2.6 2012-10-19 12:53:20 +02:00
gcmalloc edcc7d2dd3 StringIO used by nosetests do not merge with the way youtube-dl handle sys.stdout and sys.stderr 2012-10-19 12:53:19 +02:00
gcmalloc 7f60b5aa40 correction on the test 2012-10-19 12:53:19 +02:00
gcmalloc aeeb29a356 adding travis support 2012-10-15 10:58:35 +02:00
Filippo Valsorda 902b2a0a45 New IE: YouTube channels (closes #396) 2012-10-14 13:48:18 +02:00
gcmalloc 6d9c22cd26 correcting the makefile according to the new one 2012-10-12 20:30:01 +02:00
gcmalloc 729baf58b2 removing extended globbing for the find utility 2012-10-12 20:25:22 +02:00
gcmalloc 4c9afeca34 adding xvideo 2012-10-12 20:25:22 +02:00
gcmalloc 6da7877bf5 adding facebook test 2012-10-12 20:25:22 +02:00
gcmalloc b4e5de51ec adding photobucket test 2012-10-12 20:25:22 +02:00
gcmalloc a4b5f22554 adding metacafe test 2012-10-12 20:25:22 +02:00
gcmalloc ff08984246 adding dailymotion test 2012-10-12 20:25:22 +02:00
gcmalloc 137c5803c3 some changes to keep the same standard 2012-10-12 20:25:22 +02:00
gcmalloc 3eec021a1f removing unused global modifier 2012-10-12 20:25:22 +02:00
gcmalloc 5a33b73309 correcting the makefile 2012-10-12 20:25:22 +02:00
gcmalloc 0b4e98490b changing test video 2012-10-12 20:24:58 +02:00
gcmalloc 80a846e119 correction on the test for the utils.py 2012-10-12 20:24:58 +02:00
gcmalloc 434d60cd95 adding clean rule in the makefile 2012-10-12 20:24:58 +02:00
gcmalloc efe8902f0b adding download test with md5 check 2012-10-12 20:24:58 +02:00
gcmalloc 44fb345437 adding TestCase class and corresponding test 2012-10-12 20:24:58 +02:00
gcmalloc 9993976ae4 correction on the sanitize title method, change in title resulting 2012-10-12 20:24:58 +02:00
gcmalloc b387fb0385 adding test rule in the Makefile 2012-10-12 20:24:58 +02:00
Filippo Valsorda 10daa766a1 support EDU YouTube playlists (closes #407) 2012-10-11 08:27:19 +02:00
Filippo Valsorda 7b107eea51 release 2012.10.09 2012-10-09 15:53:20 +02:00
Filippo Valsorda 646b885cbf Added missing dependencies to Makefile 2012-10-09 15:49:24 +02:00
Filippo Valsorda 0bfd0b598a Re-engineered Dailymotion qualities selection (thanks @knagano, sort of merges #176) 2012-10-09 12:28:44 +02:00
Filippo Valsorda fd873c69a4 Merge PR #422 from 'kevinamadeus/master'
Add InfoExtractor for Google Plus video
(with fixes)
2012-10-09 10:48:49 +02:00
Filippo Valsorda d64db7409b Merge pull request #458 from grimreaper/patch-1
There is nothing bash specific in release.sh, switch to /bin/sh
2012-10-09 01:16:40 -07:00
Philipp Hagemeister 27fec0e3bd Merge branch 'master' of github.com:rg3/youtube-dl 2012-10-08 22:14:28 +02:00
Philipp Hagemeister 65f934dc93 Correct detect_executables on Windows (Closes #447, #457) 2012-10-08 22:14:19 +02:00
grimreaper d51d784f85 There is nothing bash specific here
/bin/bash is always wrong. Since there is nothing bash specific here, switch to /bin/sh
2012-10-06 10:00:40 -03:00
Filippo Valsorda aa85963987 Merge pull request #452 from Tailszefox/local
Added uploaded date for Dailymotion
2012-10-03 11:29:51 -07:00
Tailszefox 413575f7a5 Added uploaded date for Dailymotion 2012-10-03 10:57:46 +02:00
Philipp Hagemeister b7b4796bf2 Fix docs 2012-10-01 18:39:24 +02:00
Philipp Hagemeister fcbc8c830e Merge branch 'master' of github.com:rg3/youtube-dl 2012-10-01 18:38:19 +02:00
Philipp Hagemeister f48ce130c7 Fix doc of extractor field 2012-10-01 18:38:10 +02:00
Filippo Valsorda 13e69f546c Merged, modified and compiled Dailymotion pull request #446 by @Steap 2012-09-30 21:45:43 +02:00
Cyril Roelandt 63ec7b7479 DailymotionIE: There is not necessarily an underscore in a Dailymotion URL. 2012-09-30 15:47:37 +02:00
Cyril Roelandt 7b6d7001d8 DailymotionIE: some videos do not use the "hqURL", "sdURL", "ldURL" keywords. In this case, the "video_url" keyword should be looked for. 2012-09-30 15:47:29 +02:00
Filippo Valsorda 39ce6e79e7 Updated youtube-dl.exe 2012-09-29 19:12:56 +02:00
Filippo Valsorda 5c961d89df Merge pull request #403 from FiloSottile/re_VERBOSE 2012-09-29 17:05:40 +02:00
Filippo Valsorda 3c4d6c9eba Not all Dailymotion videos have an hqURL, now downloads highest quality available 2012-09-29 16:53:06 +02:00
Filippo Valsorda 349e2e3e21 Fixed DailymotionIE, now downloads high-def mp4s, which might be too much (?) 2012-09-29 16:38:38 +02:00
Filippo Valsorda 551fa9dfbf adding new --output replacements. Thanks @danut007ro (closes #442) 2012-09-29 15:49:10 +02:00
Filippo Valsorda ce3674430b added new FAQ on exe dependency 2012-09-29 15:35:07 +02:00
Filippo Valsorda 5cdfaeb37b New FAQ: What is this binary file? (+ small fix to other one) 2012-09-28 19:55:18 +02:00
Philipp Hagemeister 38612b4edc update default UA string (Closes #390) 2012-09-27 23:38:11 +02:00
Philipp Hagemeister 6c5b442a9b Add recent breakage to FAQ (Closes #433) 2012-09-27 23:30:17 +02:00
Philipp Hagemeister 5a5523698d Add new field "extractor" to the info dictionary 2012-09-27 20:48:16 +02:00
Philipp Hagemeister 05a2c206be Merge pull request #425 from danut007ro/master
Provider (youtube, etc) is now saved in info_dict
2012-09-27 11:45:07 -07:00
Philipp Hagemeister 8ca21983d8 Merge pull request #432 from cryzed/master
Fixed YouTube playlist parsing
2012-09-27 11:42:58 -07:00
Philipp Hagemeister 20326b8b1b Let Makefile use youtube-dl source code instead of compiled binary 2012-09-27 20:21:20 +02:00
Philipp Hagemeister 5d534e2fe6 Improve option definitions 2012-09-27 20:19:27 +02:00
Philipp Hagemeister 234e230c87 Merge remote-tracking branch 'FiloSottille/vbr'
Conflicts:
	youtube-dl
	youtube-dl.exe
2012-09-27 20:18:29 +02:00
Philipp Hagemeister 34ae0f9d20 Merge branch 'master' of github.com:rg3/youtube-dl 2012-09-27 19:56:29 +02:00
Philipp Hagemeister df09e5f9e1 Merge pull request #405 from hdclark/master
Support for custom user agent
2012-09-27 10:56:25 -07:00
cryzed 3af2f7656c Fixed YouTube playlist parsing 2012-09-27 19:48:29 +02:00
Philipp Hagemeister 74e716bb64 original test video 2012-09-27 19:44:44 +02:00
Philipp Hagemeister 85f76ac90b Merge remote-tracking branch 'FiloSottille/automation' 2012-09-27 19:41:51 +02:00
Philipp Hagemeister 7f36e39676 Merge remote-tracking branch 'FiloSottille/supports'
Conflicts:
	youtube-dl
2012-09-27 19:24:41 +02:00
Philipp Hagemeister ebe3f89ea4 Merge xnxx.com Support (NSFW). Test URL (SFW): http://video.xnxx.com/video1443330/youtube-dl_testvid_a_and_9829_._and_amp_and_38_ 2012-09-27 18:55:56 +02:00
danut007ro ae16f68f4a Provider (youtube, etc) is now saved in info_dict, so template filename can be something like %(provider)s_%(id)s.%(ext)s
This can be useful because videos should also be identified by their providers since id's can be the same on multiple providers.
2012-09-27 00:35:31 +03:00
danut007ro 3cd98c7894 Removed provider (mistake) and add provider parameter to process_info 2012-09-27 00:07:20 +03:00
danut007ro 2866e68838 Merge branch 'master' of https://github.com/rg3/youtube-dl 2012-09-26 21:09:44 +03:00
danut007ro be8786a6a4 Every extractor also return it's name. 2012-09-26 21:00:28 +03:00
Filippo Valsorda 0e841bdc54 add PREFIX option to make install 2012-09-26 00:10:39 +02:00
Filippo Valsorda 225dceb046 moved make release to devscripts/release.sh 2012-09-25 23:56:01 +02:00
Kevin Kwan d443aca863 Add InfoExtractor for Google Plus video 2012-09-25 16:21:02 +08:00
hdclark ea46fe2dd4 Added support for custom user agents.
Added a few simple lines to add support for the flag "--user-agent" to pass a custom string to std_header['User-Agent'].
2012-08-22 23:40:35 -07:00
Filippo Valsorda 202e76cfb0 Made the YouTubeIE regex verbose/commented 2012-08-20 00:58:10 +02:00
Filippo Valsorda 3a68d7b467 tweaked the --audio-quality input validation/specification 2012-08-19 23:25:16 +02:00
Filippo Valsorda 795cc5059a Re-engineered XNXXIE to actually exit on ERRORs even with -i 2012-08-19 18:46:23 +02:00
Filippo Valsorda 5dc846fad0 Merge pull request #398 from tempname/master 2012-08-19 18:39:43 +02:00
Filippo Valsorda d5c4c4c10e bugfix and standarize the youku.com support 2012-08-19 17:44:34 +02:00
Filippo Valsorda 1ac3e3315e Merge pull request #395 from thesues/master 2012-08-19 17:08:39 +02:00
Filippo Valsorda 0e4dc2fc74 Merge 'rbrito/support-tube.majestyc.net' (PR #391) with small fix 2012-08-19 17:00:20 +02:00
tempname 154b55dae3 added InfoExtractor for XNXX 2012-08-15 20:57:27 -03:00
tempname 6de7ef9b8d added InfoExtractor for XNXX 2012-08-15 20:54:03 -03:00
dongmao zhang 392105265c Merge branch 'master' of github.com:thesues/youtube-dl
Conflicts:
	youtube-dl
	youtube_dl/InfoExtractors.py
2012-08-10 18:32:28 +08:00
dongmao zhang 51661d8600 add www.youku.com support 2012-08-09 13:54:19 +08:00
dongmao zhang b5809a68bf merge 2012-08-09 12:26:26 +08:00
dongmao zhang 7733d455c8 fix 0a->0A bug 2012-08-09 03:14:02 +08:00
dongmao zhang 0a98b09bc2 youku default to download hd2 video 2012-08-09 02:53:21 +08:00
dongmao zhang 302efc19ea add youku support 2012-08-09 02:04:02 +08:00
Filippo Valsorda dce1088450 A more "make-esque" Makefile with file targets and dependencies 2012-08-03 20:10:54 +02:00
Filippo Valsorda 7a7c093ab0 added one-step realese script 'make release version=nn' - closes #158 2012-08-01 18:40:27 +02:00
Filippo Valsorda ce7b2a40d0 added automatically generated bash-completion; closes #191 2012-08-01 17:26:50 +02:00
Filippo Valsorda cfcec69331 auto-generating manpage from README.md (closes #151); redesigned Makefile 2012-08-01 11:54:27 +02:00
Filippo Valsorda 91645066e2 Merge branch 'joehillen/master' - pull request #381 2012-08-01 11:35:04 +02:00
joehillen ef0c08cdfe Added install target to Makefile. 2012-07-22 13:36:22 -07:00
Filippo Valsorda b24676ce88 changed --audio-quality behaviour to support both CBR and VBR 2012-07-14 19:43:24 +02:00
21 changed files with 1207 additions and 250 deletions
+9
@@ -0,0 +1,9 @@
language: python
#specify the python version
python:
- "2.6"
- "2.7"
#command to install the setup
install:
# command to run tests
script: nosetests test --nocapture
+1 -1
@@ -1 +1 @@
2012.09.27
2012.11.27
+48 -17
@@ -1,26 +1,57 @@
default: update
all: youtube-dl README.md youtube-dl.1 youtube-dl.bash-completion LATEST_VERSION
# TODO: re-add youtube-dl.exe, and make sure it's 1. safe and 2. doesn't need sudo
update: compile update-readme update-latest
clean:
rm -f youtube-dl youtube-dl.exe youtube-dl.1 LATEST_VERSION
update-latest:
./youtube-dl.dev --version > LATEST_VERSION
PREFIX=/usr/local
BINDIR=$(PREFIX)/bin
MANDIR=$(PREFIX)/man
SYSCONFDIR=/etc
update-readme:
@options=$$(COLUMNS=80 ./youtube-dl.dev --help | sed -e '1,/.*General Options.*/ d' -e 's/^\W\{2\}\(\w\)/### \1/') && \
header=$$(sed -e '/.*## OPTIONS/,$$ d' README.md) && \
footer=$$(sed -e '1,/.*## FAQ/ d' README.md) && \
echo "$${header}" > README.md && \
echo >> README.md && \
echo '## OPTIONS' >> README.md && \
echo "$${options}" >> README.md&& \
echo >> README.md && \
echo '## FAQ' >> README.md && \
echo "$${footer}" >> README.md
install: youtube-dl youtube-dl.1 youtube-dl.bash-completion
install -d $(DESTDIR)$(BINDIR)
install -m 755 youtube-dl $(DESTDIR)$(BINDIR)
install -d $(DESTDIR)$(MANDIR)/man1
install -m 644 youtube-dl.1 $(DESTDIR)$(MANDIR)/man1
install -d $(DESTDIR)$(SYSCONFDIR)/bash_completion.d
install -m 644 youtube-dl.bash-completion $(DESTDIR)$(SYSCONFDIR)/bash_completion.d/youtube-dl
compile:
test:
nosetests2 --nocapture test
.PHONY: all clean install test README.md youtube-dl.bash-completion
# TODO un-phony README.md and youtube-dl.bash_completion by reading from .in files and generating from them
youtube-dl: youtube_dl/*.py
zip --quiet --junk-paths youtube-dl youtube_dl/*.py
echo '#!/usr/bin/env python' > youtube-dl
cat youtube-dl.zip >> youtube-dl
rm youtube-dl.zip
chmod a+x youtube-dl
.PHONY: default compile update update-latest update-readme
youtube-dl.exe: youtube_dl/*.py
bash devscripts/wine-py2exe.sh build_exe.py
README.md: youtube_dl/*.py
@options=$$(COLUMNS=80 python -m youtube_dl --help | sed -e '1,/.*General Options.*/ d' -e 's/^\W\{2\}\(\w\)/## \1/') && \
header=$$(sed -e '/.*# OPTIONS/,$$ d' README.md) && \
footer=$$(sed -e '1,/.*# CONFIGURATION/ d' README.md) && \
echo "$${header}" > README.md && \
echo >> README.md && \
echo '# OPTIONS' >> README.md && \
echo "$${options}" >> README.md&& \
echo >> README.md && \
echo '# CONFIGURATION' >> README.md && \
echo "$${footer}" >> README.md
youtube-dl.1: README.md
pandoc -s -w man README.md -o youtube-dl.1
youtube-dl.bash-completion: README.md
@options=`egrep -o '(--[a-z-]+) ' README.md | sort -u | xargs echo` && \
content=`sed "s/opts=\"[^\"]*\"/opts=\"$${options}\"/g" youtube-dl.bash-completion` && \
echo "$${content}" > youtube-dl.bash-completion
LATEST_VERSION: youtube_dl/__init__.py
python -m youtube_dl --version > LATEST_VERSION
+64 -22
@@ -1,16 +1,19 @@
# youtube-dl
% youtube-dl(1)
## USAGE
youtube-dl [options] url [url...]
# NAME
youtube-dl
## DESCRIPTION
# SYNOPSIS
**youtube-dl** [OPTIONS] URL [URL...]
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.x (x being at least 6), and it is not platform specific. It should work in
your Unix box, in Windows or in Mac OS X. It is released to the public domain,
which means you can modify it, redistribute it or use it however you like.
## OPTIONS
# OPTIONS
-h, --help print this help text and exit
--version print program version and exit
-U, --update update this program to latest version
@@ -18,10 +21,11 @@ which means you can modify it, redistribute it or use it however you like.
-r, --rate-limit LIMIT download rate limit (e.g. 50k or 44.6m)
-R, --retries RETRIES number of retries (default is 10)
--dump-user-agent display the current browser identification
--user-agent UA specify a custom user agent
--list-extractors List all supported extractors and the URLs they
would handle
### Video Selection:
## Video Selection:
--playlist-start NUMBER playlist video to start at (default is 1)
--playlist-end NUMBER playlist video to end at (default is last)
--match-title REGEX download only matching titles (regex or caseless
@@ -30,17 +34,21 @@ which means you can modify it, redistribute it or use it however you like.
caseless sub-string)
--max-downloads NUMBER Abort after downloading NUMBER files
### Filesystem Options:
## Filesystem Options:
-t, --title use title in file name
-l, --literal use literal title in file name
--id use video ID in file name
-l, --literal [deprecated] alias of --title
-A, --auto-number number downloaded files starting from 00000
-o, --output TEMPLATE output filename template. Use %(stitle)s to get the
-o, --output TEMPLATE output filename template. Use %(title)s to get the
title, %(uploader)s for the uploader name,
%(autonumber)s to get an automatically incremented
number, %(ext)s for the filename extension,
%(upload_date)s for the upload date (YYYYMMDD), and
%% for a literal percent. Use - to output to
stdout.
%(upload_date)s for the upload date (YYYYMMDD),
%(extractor)s for the provider (youtube, metacafe,
etc), %(id)s for the video id and %% for a literal
percent. Use - to output to stdout.
--restrict-filenames Avoid some characters such as "&" and spaces in
filenames
-a, --batch-file FILE file containing URLs to download ('-' for stdin)
-w, --no-overwrites do not overwrite files
-c, --continue resume partially downloaded files
@@ -53,7 +61,7 @@ which means you can modify it, redistribute it or use it however you like.
--write-description write video description to a .description file
--write-info-json write video metadata to a .info.json file
### Verbosity / Simulation Options:
## Verbosity / Simulation Options:
-q, --quiet activates quiet mode
-s, --simulate do not download the video and do not write anything
to disk
@@ -68,7 +76,7 @@ which means you can modify it, redistribute it or use it however you like.
--console-title display progress in console titlebar
-v, --verbose print various debugging information
### Video Format Options:
## Video Format Options:
-f, --format FORMAT video format code
--all-formats download all available video formats
--prefer-free-formats prefer free video formats unless a specific one is
@@ -80,22 +88,27 @@ which means you can modify it, redistribute it or use it however you like.
--srt-lang LANG language of the closed captions to download
(optional) use IETF language tags like 'en'
### Authentication Options:
## Authentication Options:
-u, --username USERNAME account username
-p, --password PASSWORD account password
-n, --netrc use .netrc authentication data
### Post-processing Options:
--extract-audio convert video files to audio-only files (requires
## Post-processing Options:
-x, --extract-audio convert video files to audio-only files (requires
ffmpeg or avconv and ffprobe or avprobe)
--audio-format FORMAT "best", "aac", "vorbis", "mp3", "m4a", or "wav";
best by default
--audio-quality QUALITY ffmpeg/avconv audio bitrate specification, 128k by
default
--audio-quality QUALITY ffmpeg/avconv audio quality specification, insert a
value between 0 (better) and 9 (worse) for VBR or a
specific bitrate like 128K (default 5)
-k, --keep-video keeps the video file on disk after the post-
processing; the video is erased by default
## FAQ
# CONFIGURATION
You can configure youtube-dl by placing default arguments (such as `--extract-audio --no-mtime` to always extract the audio and not copy the mtime) into `/etc/youtube-dl.conf` and/or `~/.local/config/youtube-dl.conf`.
# FAQ
### Can you please put the -b option back?
@@ -117,13 +130,42 @@ The URLs youtube-dl outputs require the downloader to have the correct cookies.
youtube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
## COPYRIGHT
### ERROR: unable to download video ###
youtube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
### SyntaxError: Non-ASCII character ###
The error
File "youtube-dl", line 2
SyntaxError: Non-ASCII character '\x93' ...
means you're using an outdated version of Python. Please update to Python 2.6 or 2.7.
To run youtube-dl under Python 2.5, you'll have to manually check it out like this:
git clone git://github.com/rg3/youtube-dl.git
cd youtube-dl
python -m youtube_dl --help
Please note that Python 2.5 is not supported anymore.
### What is this binary file? Where has the code gone?
Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
### The exe throws a *Runtime error from Visual C++*
To run the exe you need to install first the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).
# COPYRIGHT
youtube-dl is released into the public domain by the copyright holders.
This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
## BUGS
# BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>
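
The `--output` template documented in the README hunk above uses `%(name)s` placeholders and `%%` for a literal percent sign, which matches Python's printf-style string formatting with a mapping. The following is a minimal sketch under that assumption; the `expand_template` helper and the metadata values (other than the video id, which appears in the test suite) are hypothetical and only illustrate how such a template would expand.

```python
# Hypothetical metadata for illustration; only the id is taken from the test suite.
info = {
    'id': 'BaW_jenozKc',
    'title': 'youtube-dl test video',
    'ext': 'mp4',
    'extractor': 'youtube',
}


def expand_template(template, info):
    """Expand a %(name)s-style template with plain printf-style formatting.

    Assumption: the real program may sanitize values before this step.
    """
    return template % info


# Combine provider and id, as suggested by the %(extractor)s commit message.
filename = expand_template('%(extractor)s_%(id)s.%(ext)s', info)
# '%%' yields a single literal percent sign.
literal = expand_template('100%% %(title)s.%(ext)s', info)

print(filename)
print(literal)
```

Prefixing the id with the extractor name keeps files apart when two providers happen to use the same id.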
Regular → Executable
+11
@@ -0,0 +1,11 @@
#!/bin/sh
if [ -z "$1" ]; then echo "ERROR: specify version number like this: $0 1994.09.06"; exit 1; fi
version="$1"
if [ ! -z "`git tag | grep "$version"`" ]; then echo 'ERROR: version already present'; exit 1; fi
if [ ! -z "`git status --porcelain`" ]; then echo 'ERROR: the working directory is not clean; commit or stash changes'; exit 1; fi
sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/__init__.py
make all
git add -A
git commit -m "release $version"
git tag -m "Release $version" "$version"
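
The release script above rewrites the `__version__` line with `sed` before building and tagging. As a rough illustration of that one substitution step, here is a Python sketch; the `set_version` helper is hypothetical and is not part of the repository.

```python
import re


def set_version(source, version):
    """Replace the value of a __version__ = '...' assignment in source text.

    Hypothetical helper mirroring the sed expression in the release script.
    A function is used as the replacement so the version string is inserted
    literally, without backslash-escape processing.
    """
    pattern = r"__version__ = '.*'"
    return re.sub(pattern, lambda m: "__version__ = '%s'" % version, source)


before = "# header\n__version__ = '2012.09.27'\nother = 1\n"
after = set_version(before, '2012.11.27')
print(after)
```

Like the `sed` call, the pattern only matches within a single line, so unrelated lines are left untouched.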
Regular → Executable
+1
@@ -0,0 +1 @@
{"username": null, "listformats": null, "skip_download": false, "usenetrc": false, "max_downloads": null, "noprogress": false, "forcethumbnail": false, "forceformat": false, "format_limit": null, "ratelimit": null, "nooverwrites": false, "forceurl": false, "writeinfojson": false, "simulate": false, "playliststart": 1, "continuedl": true, "password": null, "prefer_free_formats": false, "nopart": false, "retries": 10, "updatetime": true, "consoletitle": false, "verbose": true, "forcefilename": false, "ignoreerrors": false, "logtostderr": false, "format": null, "subtitleslang": null, "quiet": false, "outtmpl": "%(id)s.%(ext)s", "rejecttitle": null, "playlistend": -1, "writedescription": false, "forcetitle": false, "forcedescription": false, "writesubtitles": false, "matchtitle": null}
-29
@@ -1,29 +0,0 @@
# -*- coding: utf-8 -*-

# Various small unit tests

import os,sys
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

import youtube_dl

def test_simplify_title():
    assert youtube_dl._simplify_title(u'abc') == u'abc'
    assert youtube_dl._simplify_title(u'abc_d-e') == u'abc_d-e'

    assert youtube_dl._simplify_title(u'123') == u'123'

    assert u'/' not in youtube_dl._simplify_title(u'abc/de')
    assert u'abc' in youtube_dl._simplify_title(u'abc/de')
    assert u'de' in youtube_dl._simplify_title(u'abc/de')
    assert u'/' not in youtube_dl._simplify_title(u'abc/de///')

    assert u'\\' not in youtube_dl._simplify_title(u'abc\\de')
    assert u'abc' in youtube_dl._simplify_title(u'abc\\de')
    assert u'de' in youtube_dl._simplify_title(u'abc\\de')

    assert youtube_dl._simplify_title(u'ä') == u'ä'
    assert youtube_dl._simplify_title(u'кириллица') == u'кириллица'

    # Strip underlines
    assert youtube_dl._simplify_title(u'\'a_') == u'a'
+93
@@ -0,0 +1,93 @@
#!/usr/bin/env python2
import unittest
import hashlib
import os
import json

from youtube_dl.FileDownloader import FileDownloader
from youtube_dl.InfoExtractors import YoutubeIE, DailymotionIE
from youtube_dl.InfoExtractors import MetacafeIE, BlipTVIE


class DownloadTest(unittest.TestCase):
    PARAMETERS_FILE = "test/parameters.json"
    #calculated with md5sum:
    #md5sum (GNU coreutils) 8.19

    YOUTUBE_SIZE = 1993883
    YOUTUBE_URL = "http://www.youtube.com/watch?v=BaW_jenozKc"
    YOUTUBE_FILE = "BaW_jenozKc.mp4"

    DAILYMOTION_MD5 = "d363a50e9eb4f22ce90d08d15695bb47"
    DAILYMOTION_URL = "http://www.dailymotion.com/video/x33vw9_tutoriel-de-youtubeur-dl-des-video_tech"
    DAILYMOTION_FILE = "x33vw9.mp4"

    METACAFE_SIZE = 5754305
    METACAFE_URL = "http://www.metacafe.com/watch/yt-_aUehQsCQtM/the_electric_company_short_i_pbs_kids_go/"
    METACAFE_FILE = "_aUehQsCQtM.flv"

    BLIP_MD5 = "93c24d2f4e0782af13b8a7606ea97ba7"
    BLIP_URL = "http://blip.tv/cbr/cbr-exclusive-gotham-city-imposters-bats-vs-jokerz-short-3-5796352"
    BLIP_FILE = "5779306.m4v"

    XVIDEO_MD5 = ""
    XVIDEO_URL = ""
    XVIDEO_FILE = ""

    def test_youtube(self):
        #let's download a file from youtube
        with open(DownloadTest.PARAMETERS_FILE) as f:
            fd = FileDownloader(json.load(f))
        fd.add_info_extractor(YoutubeIE())
        fd.download([DownloadTest.YOUTUBE_URL])
        self.assertTrue(os.path.exists(DownloadTest.YOUTUBE_FILE))
        self.assertEqual(os.path.getsize(DownloadTest.YOUTUBE_FILE), DownloadTest.YOUTUBE_SIZE)

    def test_dailymotion(self):
        with open(DownloadTest.PARAMETERS_FILE) as f:
            fd = FileDownloader(json.load(f))
        fd.add_info_extractor(DailymotionIE())
        fd.download([DownloadTest.DAILYMOTION_URL])
        self.assertTrue(os.path.exists(DownloadTest.DAILYMOTION_FILE))
        md5_down_file = md5_for_file(DownloadTest.DAILYMOTION_FILE)
        self.assertEqual(md5_down_file, DownloadTest.DAILYMOTION_MD5)

    def test_metacafe(self):
        #this emulates a skip, to be 2.6 compatible
        with open(DownloadTest.PARAMETERS_FILE) as f:
            fd = FileDownloader(json.load(f))
        fd.add_info_extractor(MetacafeIE())
        fd.add_info_extractor(YoutubeIE())
        fd.download([DownloadTest.METACAFE_URL])
        self.assertTrue(os.path.exists(DownloadTest.METACAFE_FILE))
        self.assertEqual(os.path.getsize(DownloadTest.METACAFE_FILE), DownloadTest.METACAFE_SIZE)

    def test_blip(self):
        with open(DownloadTest.PARAMETERS_FILE) as f:
            fd = FileDownloader(json.load(f))
        fd.add_info_extractor(BlipTVIE())
        fd.download([DownloadTest.BLIP_URL])
        self.assertTrue(os.path.exists(DownloadTest.BLIP_FILE))
        md5_down_file = md5_for_file(DownloadTest.BLIP_FILE)
        self.assertEqual(md5_down_file, DownloadTest.BLIP_MD5)

    def tearDown(self):
        # remove any downloaded files so each test starts clean
        if os.path.exists(DownloadTest.YOUTUBE_FILE):
            os.remove(DownloadTest.YOUTUBE_FILE)
        if os.path.exists(DownloadTest.DAILYMOTION_FILE):
            os.remove(DownloadTest.DAILYMOTION_FILE)
        if os.path.exists(DownloadTest.METACAFE_FILE):
            os.remove(DownloadTest.METACAFE_FILE)
        if os.path.exists(DownloadTest.BLIP_FILE):
            os.remove(DownloadTest.BLIP_FILE)


def md5_for_file(filename, block_size=2**20):
    # open in binary mode so the digest matches md5sum on every platform
    with open(filename, 'rb') as f:
        md5 = hashlib.md5()
        while True:
            data = f.read(block_size)
            if not data:
                break
            md5.update(data)
        return md5.hexdigest()
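
The download tests above compare a file's MD5 digest against a stored value, reading the file in fixed-size blocks so large downloads need not fit in memory. The sketch below shows the same block-wise hashing on a temporary file and checks it against a one-shot digest of the same bytes; the payload and the small block size are assumptions chosen only to exercise several read iterations.

```python
import hashlib
import os
import tempfile


def md5_for_file(filename, block_size=2**20):
    """Return the hex MD5 digest of a file, read in blocks of block_size bytes."""
    md5 = hashlib.md5()
    # Binary mode avoids newline translation, which would change the digest.
    with open(filename, 'rb') as f:
        while True:
            data = f.read(block_size)
            if not data:
                break
            md5.update(data)
    return md5.hexdigest()


# Illustrative payload containing bytes that text mode could mangle.
payload = b'\x00\x01binary\r\npayload' * 1000

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(payload)
    # A small block size forces several loop iterations.
    digest = md5_for_file(path, block_size=4096)
finally:
    os.remove(path)

print(digest == hashlib.md5(payload).hexdigest())
```

Feeding the hash object block by block gives the same digest as hashing the whole content at once, which is why the block size can be tuned freely.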
+70
@@ -0,0 +1,70 @@
# -*- coding: utf-8 -*-
# Various small unit tests
import unittest
#from youtube_dl.utils import htmlentity_transform
from youtube_dl.utils import timeconvert
from youtube_dl.utils import sanitize_filename
from youtube_dl.utils import unescapeHTML
from youtube_dl.utils import orderedSet
class TestUtil(unittest.TestCase):
def test_timeconvert(self):
self.assertTrue(timeconvert('') is None)
self.assertTrue(timeconvert('bougrg') is None)
def test_sanitize_filename(self):
self.assertEqual(sanitize_filename(u'abc'), u'abc')
self.assertEqual(sanitize_filename(u'abc_d-e'), u'abc_d-e')
self.assertEqual(sanitize_filename(u'123'), u'123')
self.assertEqual(u'abc-de', sanitize_filename(u'abc/de'))
self.assertFalse(u'/' in sanitize_filename(u'abc/de///'))
self.assertEqual(u'abc-de', sanitize_filename(u'abc/<>\\*|de'))
self.assertEqual(u'xxx', sanitize_filename(u'xxx/<>\\*|'))
self.assertEqual(u'yes no', sanitize_filename(u'yes? no'))
self.assertEqual(u'this - that', sanitize_filename(u'this: that'))
self.assertEqual(sanitize_filename(u'AT&T'), u'AT&T')
self.assertEqual(sanitize_filename(u'ä'), u'ä')
self.assertEqual(sanitize_filename(u'кириллица'), u'кириллица')
forbidden = u'"\0\\/'
for fc in forbidden:
for fbc in forbidden:
self.assertTrue(fbc not in sanitize_filename(fc))
def test_sanitize_filename_restricted(self):
self.assertEqual(sanitize_filename(u'abc', restricted=True), u'abc')
self.assertEqual(sanitize_filename(u'abc_d-e', restricted=True), u'abc_d-e')
self.assertEqual(sanitize_filename(u'123', restricted=True), u'123')
self.assertEqual(u'abc-de', sanitize_filename(u'abc/de', restricted=True))
self.assertFalse(u'/' in sanitize_filename(u'abc/de///', restricted=True))
self.assertEqual(u'abc-de', sanitize_filename(u'abc/<>\\*|de', restricted=True))
self.assertEqual(u'xxx', sanitize_filename(u'xxx/<>\\*|', restricted=True))
self.assertEqual(u'yes_no', sanitize_filename(u'yes? no', restricted=True))
self.assertEqual(u'this_-_that', sanitize_filename(u'this: that', restricted=True))
forbidden = u'"\0\\/&: \'\t\n'
for fc in forbidden:
print('input: ' + fc + ', result: ' + repr(sanitize_filename(fc, restricted=True)))
for fbc in forbidden:
self.assertTrue(fbc not in sanitize_filename(fc, restricted=True))
def test_ordered_set(self):
self.assertEqual(orderedSet([1,1,2,3,4,4,5,6,7,3,5]), [1,2,3,4,5,6,7])
self.assertEqual(orderedSet([]), [])
self.assertEqual(orderedSet([1]), [1])
#keep the list ordered
self.assertEqual(orderedSet([135,1,1,1]), [135,1])
def test_unescape_html(self):
self.assertEqual(unescapeHTML(u"%20;"), u"%20;")
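The `test_ordered_set` cases above pin down the expected behavior: duplicates are dropped and first-seen order is kept. Below is a minimal Python 3 sketch of a function that satisfies those cases. The name `ordered_set_sketch` is hypothetical; this is not the `orderedSet` implementation in `youtube_dl.utils`, which this diff does not show.

```python
def ordered_set_sketch(iterable):
    """Return the items of `iterable` without duplicates, in first-seen order."""
    result = []
    for item in iterable:
        # A list membership test is O(n), but it also works for unhashable items.
        if item not in result:
            result.append(item)
    return result
```

A list-based membership test is used here for simplicity; a set of seen items would be faster for large inputs of hashable values.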
BIN
Binary file not shown.
+248
@@ -0,0 +1,248 @@
.TH youtube-dl 1 ""
.SH NAME
.PP
youtube-dl
.SH SYNOPSIS
.PP
\f[B]youtube-dl\f[] [OPTIONS] URL [URL...]
.SH DESCRIPTION
.PP
\f[B]youtube-dl\f[] is a small command-line program to download videos
from YouTube.com and a few more sites.
It requires the Python interpreter, version 2.x (x being at least 6),
and it is not platform specific.
It should work on your Unix box, on Windows or on Mac OS X.
It is released to the public domain, which means you can modify it,
redistribute it or use it however you like.
.SH OPTIONS
.IP
.nf
\f[C]
-h,\ --help\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ print\ this\ help\ text\ and\ exit
--version\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ print\ program\ version\ and\ exit
-U,\ --update\ \ \ \ \ \ \ \ \ \ \ \ \ update\ this\ program\ to\ latest\ version
-i,\ --ignore-errors\ \ \ \ \ \ continue\ on\ download\ errors
-r,\ --rate-limit\ LIMIT\ \ \ download\ rate\ limit\ (e.g.\ 50k\ or\ 44.6m)
-R,\ --retries\ RETRIES\ \ \ \ number\ of\ retries\ (default\ is\ 10)
--dump-user-agent\ \ \ \ \ \ \ \ display\ the\ current\ browser\ identification
--user-agent\ UA\ \ \ \ \ \ \ \ \ \ specify\ a\ custom\ user\ agent
--list-extractors\ \ \ \ \ \ \ \ List\ all\ supported\ extractors\ and\ the\ URLs\ they
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ would\ handle
\f[]
.fi
.SS Video Selection:
.IP
.nf
\f[C]
--playlist-start\ NUMBER\ \ playlist\ video\ to\ start\ at\ (default\ is\ 1)
--playlist-end\ NUMBER\ \ \ \ playlist\ video\ to\ end\ at\ (default\ is\ last)
--match-title\ REGEX\ \ \ \ \ \ download\ only\ matching\ titles\ (regex\ or\ caseless
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ sub-string)
--reject-title\ REGEX\ \ \ \ \ skip\ download\ for\ matching\ titles\ (regex\ or
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ caseless\ sub-string)
--max-downloads\ NUMBER\ \ \ Abort\ after\ downloading\ NUMBER\ files
\f[]
.fi
.SS Filesystem Options:
.IP
.nf
\f[C]
-t,\ --title\ \ \ \ \ \ \ \ \ \ \ \ \ \ use\ title\ in\ file\ name
--id\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ use\ video\ ID\ in\ file\ name
-l,\ --literal\ \ \ \ \ \ \ \ \ \ \ \ [deprecated]\ alias\ of\ --title
-A,\ --auto-number\ \ \ \ \ \ \ \ number\ downloaded\ files\ starting\ from\ 00000
-o,\ --output\ TEMPLATE\ \ \ \ output\ filename\ template.\ Use\ %(title)s\ to\ get\ the
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ title,\ %(uploader)s\ for\ the\ uploader\ name,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ %(autonumber)s\ to\ get\ an\ automatically\ incremented
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ number,\ %(ext)s\ for\ the\ filename\ extension,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ %(upload_date)s\ for\ the\ upload\ date\ (YYYYMMDD),
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ %(extractor)s\ for\ the\ provider\ (youtube,\ metacafe,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ etc),\ %(id)s\ for\ the\ video\ id\ and\ %%\ for\ a\ literal
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ percent.\ Use\ -\ to\ output\ to\ stdout.
--restrict-filenames\ \ \ \ \ Avoid\ some\ characters\ such\ as\ "&"\ and\ spaces\ in
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ filenames
-a,\ --batch-file\ FILE\ \ \ \ file\ containing\ URLs\ to\ download\ (\[aq]-\[aq]\ for\ stdin)
-w,\ --no-overwrites\ \ \ \ \ \ do\ not\ overwrite\ files
-c,\ --continue\ \ \ \ \ \ \ \ \ \ \ resume\ partially\ downloaded\ files
--no-continue\ \ \ \ \ \ \ \ \ \ \ \ do\ not\ resume\ partially\ downloaded\ files\ (restart
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ from\ beginning)
--cookies\ FILE\ \ \ \ \ \ \ \ \ \ \ file\ to\ read\ cookies\ from\ and\ dump\ cookie\ jar\ in
--no-part\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ do\ not\ use\ .part\ files
--no-mtime\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ do\ not\ use\ the\ Last-modified\ header\ to\ set\ the\ file
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ modification\ time
--write-description\ \ \ \ \ \ write\ video\ description\ to\ a\ .description\ file
--write-info-json\ \ \ \ \ \ \ \ write\ video\ metadata\ to\ a\ .info.json\ file
\f[]
.fi
.SS Verbosity / Simulation Options:
.IP
.nf
\f[C]
-q,\ --quiet\ \ \ \ \ \ \ \ \ \ \ \ \ \ activates\ quiet\ mode
-s,\ --simulate\ \ \ \ \ \ \ \ \ \ \ do\ not\ download\ the\ video\ and\ do\ not\ write\ anything
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ to\ disk
--skip-download\ \ \ \ \ \ \ \ \ \ do\ not\ download\ the\ video
-g,\ --get-url\ \ \ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ URL
-e,\ --get-title\ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ title
--get-thumbnail\ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ thumbnail\ URL
--get-description\ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ video\ description
--get-filename\ \ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ output\ filename
--get-format\ \ \ \ \ \ \ \ \ \ \ \ \ simulate,\ quiet\ but\ print\ output\ format
--no-progress\ \ \ \ \ \ \ \ \ \ \ \ do\ not\ print\ progress\ bar
--console-title\ \ \ \ \ \ \ \ \ \ display\ progress\ in\ console\ titlebar
-v,\ --verbose\ \ \ \ \ \ \ \ \ \ \ \ print\ various\ debugging\ information
\f[]
.fi
.SS Video Format Options:
.IP
.nf
\f[C]
-f,\ --format\ FORMAT\ \ \ \ \ \ video\ format\ code
--all-formats\ \ \ \ \ \ \ \ \ \ \ \ download\ all\ available\ video\ formats
--prefer-free-formats\ \ \ \ prefer\ free\ video\ formats\ unless\ a\ specific\ one\ is
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ requested
--max-quality\ FORMAT\ \ \ \ \ highest\ quality\ format\ to\ download
-F,\ --list-formats\ \ \ \ \ \ \ list\ all\ available\ formats\ (currently\ youtube\ only)
--write-srt\ \ \ \ \ \ \ \ \ \ \ \ \ \ write\ video\ closed\ captions\ to\ a\ .srt\ file
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (currently\ youtube\ only)
--srt-lang\ LANG\ \ \ \ \ \ \ \ \ \ language\ of\ the\ closed\ captions\ to\ download
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (optional)\ use\ IETF\ language\ tags\ like\ \[aq]en\[aq]
\f[]
.fi
.SS Authentication Options:
.IP
.nf
\f[C]
-u,\ --username\ USERNAME\ \ account\ username
-p,\ --password\ PASSWORD\ \ account\ password
-n,\ --netrc\ \ \ \ \ \ \ \ \ \ \ \ \ \ use\ .netrc\ authentication\ data
\f[]
.fi
.SS Post-processing Options:
.IP
.nf
\f[C]
-x,\ --extract-audio\ \ \ \ \ \ convert\ video\ files\ to\ audio-only\ files\ (requires
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ ffmpeg\ or\ avconv\ and\ ffprobe\ or\ avprobe)
--audio-format\ FORMAT\ \ \ \ "best",\ "aac",\ "vorbis",\ "mp3",\ "m4a",\ or\ "wav";
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ best\ by\ default
--audio-quality\ QUALITY\ \ ffmpeg/avconv\ audio\ quality\ specification,\ insert\ a
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ value\ between\ 0\ (better)\ and\ 9\ (worse)\ for\ VBR\ or\ a
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ specific\ bitrate\ like\ 128K\ (default\ 5)
-k,\ --keep-video\ \ \ \ \ \ \ \ \ keeps\ the\ video\ file\ on\ disk\ after\ the\ post-
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ processing;\ the\ video\ is\ erased\ by\ default
\f[]
.fi
.SH CONFIGURATION
.PP
You can configure youtube-dl by placing default arguments (such as
\f[C]--extract-audio\ --no-mtime\f[] to always extract the audio and not
copy the mtime) into \f[C]/etc/youtube-dl.conf\f[] and/or
\f[C]~/.local/config/youtube-dl.conf\f[].
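As an illustration of the CONFIGURATION paragraph above, here is a minimal Python 3 sketch of how default arguments could be read from such files. It assumes the files hold shell-style arguments and that `#` starts a comment; both points are assumptions, as are the helper names `parse_config_text` and `read_default_args`. This is not the loader youtube-dl itself uses.

```python
import shlex


def parse_config_text(text):
    """Split configuration text into an argument list (assumed shell-style quoting)."""
    # comments=True makes shlex drop everything after an unquoted '#'.
    return shlex.split(text, comments=True)


def read_default_args(paths):
    """Collect arguments from every readable file in `paths`, in order."""
    args = []
    for path in paths:
        try:
            with open(path, encoding='utf-8') as handle:
                args.extend(parse_config_text(handle.read()))
        except OSError:
            # A missing or unreadable configuration file is simply skipped.
            continue
    return args
```

A caller would pass the two paths named above, expanding `~` with `os.path.expanduser`, and prepend the result to the command-line arguments so that explicit options can still override the defaults.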
.SH FAQ
.SS Can you please put the -b option back?
.PP
Most people asking this question are not aware that youtube-dl now
defaults to downloading the highest available quality as reported by
YouTube, which will be 1080p or 720p in some cases, so you no longer
need the -b option.
For some videos, YouTube may not report a specific high-quality format
you\[aq]re interested in as available.
In that case, simply request it with the -f option and youtube-dl will
try to download it.
.SS I get HTTP error 402 when trying to download a video. What\[aq]s this?
.PP
Apparently YouTube requires you to pass a CAPTCHA test if you download
too much.
We\[aq]re considering providing a way to let you solve the
CAPTCHA (https://github.com/rg3/youtube-dl/issues/154), but at the
moment, your best course of action is to point a web browser to the
YouTube URL, solve the CAPTCHA, and restart youtube-dl.
.SS I have downloaded a video but how can I play it?
.PP
Once the video is fully downloaded, use any video player, such as
vlc (http://www.videolan.org) or mplayer (http://www.mplayerhq.hu/).
.SS The links provided by youtube-dl -g are not working anymore
.PP
The URLs youtube-dl outputs require the downloader to have the correct
cookies.
Use the \f[C]--cookies\f[] option to write the required cookies into a
file, and advise your downloader to read cookies from that file.
Some sites also require a common user agent to be used; use
\f[C]--dump-user-agent\f[] to see the one in use by youtube-dl.
.SS ERROR: no fmt_url_map or conn information found in video info
.PP
YouTube switched to a new video info format in July 2011, which is
not supported by old versions of youtube-dl.
You can update youtube-dl with \f[C]sudo\ youtube-dl\ --update\f[].
.SS ERROR: unable to download video
.PP
YouTube has required an additional signature since September 2012, which is
not supported by old versions of youtube-dl.
You can update youtube-dl with \f[C]sudo\ youtube-dl\ --update\f[].
.SS SyntaxError: Non-ASCII character
.PP
The error
.IP
.nf
\f[C]
File\ "youtube-dl",\ line\ 2
SyntaxError:\ Non-ASCII\ character\ \[aq]\\x93\[aq]\ ...
\f[]
.fi
.PP
means you\[aq]re using an outdated version of Python.
Please update to Python 2.6 or 2.7.
.PP
To run youtube-dl under Python 2.5, you\[aq]ll have to manually check it
out like this:
.IP
.nf
\f[C]
git\ clone\ git://github.com/rg3/youtube-dl.git
cd\ youtube-dl
python\ -m\ youtube_dl\ --help
\f[]
.fi
.PP
Please note that Python 2.5 is not supported anymore.
.SS What is this binary file? Where has the code gone?
.PP
Since June 2012 (#342), youtube-dl has been packed as an executable zipfile.
Simply unzip it (you might need to rename it to \f[C]youtube-dl.zip\f[] first
on some systems) or clone the git repository, as laid out above.
If you modify the code, you can run it by executing the
\f[C]__main__.py\f[] file.
To recompile the executable, run \f[C]make\ youtube-dl\f[].
.SS The exe throws a \f[I]Runtime error from Visual C++\f[]
.PP
To run the exe, you first need to install the Microsoft Visual C++ 2008
Redistributable
Package (http://www.microsoft.com/en-us/download/details.aspx?id=29).
.SH COPYRIGHT
.PP
youtube-dl is released into the public domain by the copyright holders.
.PP
This README file was originally written by Daniel Bolton
(<https://github.com/dbbolton>) and is likewise released into the public
domain.
.SH BUGS
.PP
Bugs and suggestions should be reported at:
<https://github.com/rg3/youtube-dl/issues>
.PP
Please include:
.IP \[bu] 2
Your exact command line, like
\f[C]youtube-dl\ -t\ "http://www.youtube.com/watch?v=uHlDtZ6Oc3s&feature=channel_video_title"\f[].
A common mistake is not to escape the \f[C]&\f[].
Putting URLs in quotes should solve this problem.
.IP \[bu] 2
The output of \f[C]youtube-dl\ --version\f[]
.IP \[bu] 2
The output of \f[C]python\ --version\f[]
.IP \[bu] 2
The name and version of your Operating System ("Ubuntu 11.04 x64" or
"Windows 7 x64" is usually enough).
+14
@@ -0,0 +1,14 @@
__youtube-dl()
{
local cur prev opts
COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"
opts="--all-formats --audio-format --audio-quality --auto-number --batch-file --console-title --continue --cookies --dump-user-agent --extract-audio --format --get-description --get-filename --get-format --get-thumbnail --get-title --get-url --help --id --ignore-errors --keep-video --list-extractors --list-formats --literal --match-title --max-downloads --max-quality --netrc --no-continue --no-mtime --no-overwrites --no-part --no-progress --output --password --playlist-end --playlist-start --prefer-free-formats --quiet --rate-limit --reject-title --restrict-filenames --retries --simulate --skip-download --srt-lang --title --update --user-agent --username --verbose --version --write-description --write-info-json --write-srt"
if [[ ${cur} == * ]] ; then
COMPREPLY=( $(compgen -W "${opts}" -- ${cur}) )
return 0
fi
}
complete -F __youtube-dl youtube-dl
-6
@@ -1,6 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import youtube_dl
youtube_dl.main()
BIN
Binary file not shown.
+62 -55
@@ -13,7 +13,7 @@ import urllib2
if os.name == 'nt':
import ctypes
from utils import *
@@ -44,37 +44,38 @@ class FileDownloader(object):
Available options:
username: Username for authentication purposes.
password: Password for authentication purposes.
usenetrc: Use netrc for authentication instead.
quiet: Do not print messages to stdout.
forceurl: Force printing final URL.
forcetitle: Force printing title.
forcethumbnail: Force printing thumbnail URL.
forcedescription: Force printing description.
forcefilename: Force printing final filename.
simulate: Do not download the video files.
format: Video format code.
format_limit: Highest quality format to try.
outtmpl: Template for output names.
ignoreerrors: Do not stop on download errors.
ratelimit: Download speed limit, in bytes/sec.
nooverwrites: Prevent overwriting files.
retries: Number of times to retry for HTTP error 5xx
continuedl: Try to continue downloads if possible.
noprogress: Do not print the progress bar.
playliststart: Playlist item to start at.
playlistend: Playlist item to end at.
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logtostderr: Log messages to stderr instead of stdout.
consoletitle: Display progress in console window's titlebar.
nopart: Do not use temporary .part files.
updatetime: Use the Last-modified header to set output file timestamps.
writedescription: Write the video description to a .description file
writeinfojson: Write the video metadata to a .info.json file
writesubtitles: Write the video subtitles to a .srt file
subtitleslang: Language of the subtitles to download
username: Username for authentication purposes.
password: Password for authentication purposes.
usenetrc: Use netrc for authentication instead.
quiet: Do not print messages to stdout.
forceurl: Force printing final URL.
forcetitle: Force printing title.
forcethumbnail: Force printing thumbnail URL.
forcedescription: Force printing description.
forcefilename: Force printing final filename.
simulate: Do not download the video files.
format: Video format code.
format_limit: Highest quality format to try.
outtmpl: Template for output names.
restrictfilenames: Do not allow "&" and spaces in file names
ignoreerrors: Do not stop on download errors.
ratelimit: Download speed limit, in bytes/sec.
nooverwrites: Prevent overwriting files.
retries: Number of times to retry for HTTP error 5xx
continuedl: Try to continue downloads if possible.
noprogress: Do not print the progress bar.
playliststart: Playlist item to start at.
playlistend: Playlist item to end at.
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logtostderr: Log messages to stderr instead of stdout.
consoletitle: Display progress in console window's titlebar.
nopart: Do not use temporary .part files.
updatetime: Use the Last-modified header to set output file timestamps.
writedescription: Write the video description to a .description file
writeinfojson: Write the video metadata to a .info.json file
writesubtitles: Write the video subtitles to a .srt file
subtitleslang: Language of the subtitles to download
"""
params = None
@@ -139,23 +140,23 @@ class FileDownloader(object):
new_min = max(bytes / 2.0, 1.0)
new_max = min(max(bytes * 2.0, 1.0), 4194304) # Do not surpass 4 MB
if elapsed_time < 0.001:
return long(new_max)
return int(new_max)
rate = bytes / elapsed_time
if rate > new_max:
return long(new_max)
return int(new_max)
if rate < new_min:
return long(new_min)
return long(rate)
return int(new_min)
return int(rate)
@staticmethod
def parse_bytes(bytestr):
"""Parse a string indicating a byte quantity into a long integer."""
"""Parse a string indicating a byte quantity into an integer."""
matchobj = re.match(r'(?i)^(\d+(?:\.\d+)?)([kMGTPEZY]?)$', bytestr)
if matchobj is None:
return None
number = float(matchobj.group(1))
multiplier = 1024.0 ** 'bkmgtpezy'.index(matchobj.group(2).lower())
return long(round(number * multiplier))
return int(round(number * multiplier))
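For reference, the `parse_bytes` logic shown in this hunk can be run on its own. The sketch below is a standalone Python 3 transcription of the new lines (the `@staticmethod` wrapper is dropped); it shows how an optional unit suffix maps to a power of 1024.

```python
import re


def parse_bytes(bytestr):
    """Parse a string indicating a byte quantity into an integer."""
    matchobj = re.match(r'(?i)^(\d+(?:\.\d+)?)([kMGTPEZY]?)$', bytestr)
    if matchobj is None:
        return None
    number = float(matchobj.group(1))
    # An empty suffix gives index 0 (multiplier 1); 'k' gives 1024, 'm' gives 1024**2, ...
    multiplier = 1024.0 ** 'bkmgtpezy'.index(matchobj.group(2).lower())
    return int(round(number * multiplier))
```

With this, a `--rate-limit` value such as `50k` becomes 51200 bytes per second, and input that does not match the pattern yields `None`.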
def add_info_extractor(self, ie):
"""Add an InfoExtractor object to the end of the list."""
@@ -173,7 +174,6 @@ class FileDownloader(object):
if not self.params.get('quiet', False):
terminator = [u'\n', u''][skip_eol]
output = message + terminator
if 'b' not in self._screen_file.mode or sys.version_info[0] < 3: # Python 2 lies about the mode of sys.stdout/sys.stderr
output = output.encode(preferredencoding(), 'ignore')
self._screen_file.write(output)
@@ -181,7 +181,8 @@ class FileDownloader(object):
def to_stderr(self, message):
"""Print message to stderr."""
print >>sys.stderr, message.encode(preferredencoding())
assert type(message) == type(u'')
sys.stderr.write((message + u'\n').encode(preferredencoding()))
def to_cons_title(self, message):
"""Set console/terminal window title to message."""
@@ -323,6 +324,7 @@ class FileDownloader(object):
template_dict = dict(info_dict)
template_dict['epoch'] = unicode(long(time.time()))
template_dict['autonumber'] = unicode('%05d' % self._num_downloads)
template_dict['title'] = template_dict['stitle'] # Keep both for backwards compatibility
filename = self.params['outtmpl'] % template_dict
return filename
except (ValueError, KeyError), err:
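The output template is expanded with Python's `%` operator against a dictionary of video fields, as the hunk above shows. Here is a minimal Python 3 sketch of that step; the function name `prepare_filename_sketch` and its `num_downloads` parameter are hypothetical stand-ins for the `FileDownloader` method and its internal counter.

```python
def prepare_filename_sketch(outtmpl, info_dict, num_downloads=0):
    """Expand an output template such as '%(title)s-%(id)s.%(ext)s'."""
    template_dict = dict(info_dict)
    # Zero-padded counter, available to templates as %(autonumber)s.
    template_dict['autonumber'] = '%05d' % num_downloads
    try:
        return outtmpl % template_dict
    except (ValueError, KeyError):
        # Malformed template or a field the extractor did not provide.
        return None
```

Returning `None` on a bad template mirrors the error branch in the diff, which reports the problem instead of raising.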
@@ -334,17 +336,21 @@ class FileDownloader(object):
title = info_dict['title']
matchtitle = self.params.get('matchtitle', False)
if matchtitle and not re.search(matchtitle, title, re.IGNORECASE):
return u'[download] "' + title + '" title did not match pattern "' + matchtitle + '"'
if matchtitle:
matchtitle = matchtitle.decode('utf8')
if not re.search(matchtitle, title, re.IGNORECASE):
return u'[download] "' + title + '" title did not match pattern "' + matchtitle + '"'
rejecttitle = self.params.get('rejecttitle', False)
if rejecttitle and re.search(rejecttitle, title, re.IGNORECASE):
return u'"' + title + '" title matched reject pattern "' + rejecttitle + '"'
if rejecttitle:
rejecttitle = rejecttitle.decode('utf8')
if re.search(rejecttitle, title, re.IGNORECASE):
return u'"' + title + '" title matched reject pattern "' + rejecttitle + '"'
return None
def process_info(self, info_dict):
"""Process a single dictionary returned by an InfoExtractor."""
info_dict['stitle'] = sanitize_filename(info_dict['title'])
info_dict['stitle'] = sanitize_filename(info_dict['title'], self.params.get('restrictfilenames'))
reason = self._match_entry(info_dict)
if reason is not None:
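The title filtering in `_match_entry` can be sketched in isolation. The Python 3 version below needs no `decode('utf8')` step, since `str` patterns are already Unicode; the function name `match_entry_sketch` and its keyword parameters are hypothetical.

```python
import re


def match_entry_sketch(title, matchtitle=None, rejecttitle=None):
    """Return a reason string if the title should be skipped, else None."""
    if matchtitle and not re.search(matchtitle, title, re.IGNORECASE):
        return '[download] "%s" title did not match pattern "%s"' % (title, matchtitle)
    if rejecttitle and re.search(rejecttitle, title, re.IGNORECASE):
        return '"%s" title matched reject pattern "%s"' % (title, rejecttitle)
    return None
```

Because the match is case-insensitive and Unicode-aware, a Cyrillic pattern in lower case matches a capitalized title, which is the situation the `utf8` fix in this hunk addresses for Python 2.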
@@ -357,20 +363,20 @@ class FileDownloader(object):
raise MaxDownloadsReached()
filename = self.prepare_filename(info_dict)
# Forced printings
if self.params.get('forcetitle', False):
print info_dict['title'].encode(preferredencoding(), 'xmlcharrefreplace')
print(info_dict['title'].encode(preferredencoding(), 'xmlcharrefreplace'))
if self.params.get('forceurl', False):
print info_dict['url'].encode(preferredencoding(), 'xmlcharrefreplace')
print(info_dict['url'].encode(preferredencoding(), 'xmlcharrefreplace'))
if self.params.get('forcethumbnail', False) and 'thumbnail' in info_dict:
print info_dict['thumbnail'].encode(preferredencoding(), 'xmlcharrefreplace')
print(info_dict['thumbnail'].encode(preferredencoding(), 'xmlcharrefreplace'))
if self.params.get('forcedescription', False) and 'description' in info_dict:
print info_dict['description'].encode(preferredencoding(), 'xmlcharrefreplace')
print(info_dict['description'].encode(preferredencoding(), 'xmlcharrefreplace'))
if self.params.get('forcefilename', False) and filename is not None:
print filename.encode(preferredencoding(), 'xmlcharrefreplace')
print(filename.encode(preferredencoding(), 'xmlcharrefreplace'))
if self.params.get('forceformat', False):
print info_dict['format'].encode(preferredencoding(), 'xmlcharrefreplace')
print(info_dict['format'].encode(preferredencoding(), 'xmlcharrefreplace'))
# Do nothing else if in simulate mode
if self.params.get('simulate', False):
@@ -399,10 +405,10 @@ class FileDownloader(object):
except (OSError, IOError):
self.trouble(u'ERROR: Cannot write description file ' + descfn)
return
if self.params.get('writesubtitles', False) and 'subtitles' in info_dict and info_dict['subtitles']:
# subtitles download errors are already managed as troubles in relevant IE
# that way it will silently go on when used with unsupporting IE
# that way it will silently go on when used with unsupporting IE
try:
srtfn = filename.rsplit('.', 1)[0] + u'.srt'
self.report_writesubtitles(srtfn)
@@ -448,7 +454,7 @@ class FileDownloader(object):
except (ContentTooShortError, ), err:
self.trouble(u'ERROR: content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
return
if success:
try:
self.post_process(filename, info_dict)
@@ -474,6 +480,7 @@ class FileDownloader(object):
# Extract information from URL and process it
videos = ie.extract(url)
for video in videos or []:
video['extractor'] = ie.IE_NAME
try:
self.increment_downloads()
self.process_info(video)
+502 -78
@@ -13,6 +13,8 @@ import urllib
import urllib2
import email.utils
import xml.etree.ElementTree
import random
import math
from urlparse import parse_qs
try:
@@ -95,7 +97,26 @@ class InfoExtractor(object):
class YoutubeIE(InfoExtractor):
"""Information extractor for youtube.com."""
_VALID_URL = r'^((?:https?://)?(?:youtu\.be/|(?:\w+\.)?youtube(?:-nocookie)?\.com/|tube.majestyc.net/)(?!view_play_list|my_playlists|artist|playlist)(?:(?:(?:v|embed|e)/)|(?:(?:watch(?:_popup)?(?:\.php)?)?(?:\?|#!?)(?:.+&)?v=))?)?([0-9A-Za-z_-]+)(?(1).+)?$'
_VALID_URL = r"""^
(
(?:https?://)? # http(s):// (optional)
(?:youtu\.be/|(?:\w+\.)?youtube(?:-nocookie)?\.com/|
tube\.majestyc\.net/) # the various hostnames, with wildcard subdomains
(?:.*?\#/)? # handle anchor (#/) redirect urls
(?!view_play_list|my_playlists|artist|playlist) # ignore playlist URLs
(?: # the various things that can precede the ID:
(?:(?:v|embed|e)/) # v/ or embed/ or e/
|(?: # or the v= param in all its forms
(?:watch(?:_popup)?(?:\.php)?)? # preceding watch(_popup|.php) or nothing (like /?v=xxxx)
(?:\?|\#!?) # the params delimiter ? or # or #!
(?:.+&)? # any other preceding param (like /?s=tuff&v=xxxx)
v=
)
)? # optional -> youtube.com/xxxx is OK
)? # all until now is optional -> you can pass the naked ID
([0-9A-Za-z_-]+) # here it is! the YouTube video ID
(?(1).+)? # if we found the ID, everything can follow
$"""
_LANG_URL = r'http://www.youtube.com/?hl=en&persist_hl=1&gl=US&persist_gl=1&opt_out_ackd=1'
_LOGIN_URL = 'https://www.youtube.com/signup?next=/&gl=US&hl=en'
_AGE_URL = 'http://www.youtube.com/verify_age?next_url=/&gl=US&hl=en'
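The rewritten `_VALID_URL` only works when compiled with `re.VERBOSE`, which is why `suitable` and `_real_extract` now pass that flag. The Python 3 sketch below copies the pattern from this hunk and wraps it in a hypothetical helper, `extract_video_id_sketch`, to show which group carries the video ID. The sample video ID in the usage note is made up for illustration.

```python
import re

# Pattern copied from the hunk above; whitespace and '#' comments are ignored
# under re.VERBOSE, so literal '#' characters are written as '\#'.
VALID_URL = r"""^
(
    (?:https?://)?                                   # http(s):// (optional)
    (?:youtu\.be/|(?:\w+\.)?youtube(?:-nocookie)?\.com/|
       tube\.majestyc\.net/)                         # the various hostnames
    (?:.*?\#/)?                                      # handle anchor (#/) redirect urls
    (?!view_play_list|my_playlists|artist|playlist)  # ignore playlist URLs
    (?:                                              # things that can precede the ID:
        (?:(?:v|embed|e)/)                           # v/ or embed/ or e/
        |(?:                                         # or the v= param in all its forms
            (?:watch(?:_popup)?(?:\.php)?)?          # preceding watch(_popup|.php) or nothing
            (?:\?|\#!?)                              # the params delimiter ? or # or #!
            (?:.+&)?                                 # any other preceding param
            v=
        )
    )?                                               # optional -> youtube.com/xxxx is OK
)?                                                   # all optional -> naked ID is accepted
([0-9A-Za-z_-]+)                                     # the YouTube video ID
(?(1).+)?                                            # if a prefix matched, anything can follow
$"""


def extract_video_id_sketch(url):
    """Return the video ID (group 2) or None if the URL is not recognized."""
    mobj = re.match(VALID_URL, url, re.VERBOSE)
    return mobj.group(2) if mobj else None
```

For example, `extract_video_id_sketch('http://www.youtube.com/watch?v=BaW_jenozKc')` and `extract_video_id_sketch('BaW_jenozKc')` both yield the bare ID, while a playlist URL is rejected by the negative lookahead.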
@@ -134,6 +155,10 @@ class YoutubeIE(InfoExtractor):
}
IE_NAME = u'youtube'
def suitable(self, url):
"""Receives a URL and returns True if suitable for this IE."""
return re.match(self._VALID_URL, url, re.VERBOSE) is not None
def report_lang(self):
"""Report attempt to set language."""
self._downloader.to_screen(u'[youtube] Setting language')
@@ -188,9 +213,9 @@ class YoutubeIE(InfoExtractor):
return srt
def _print_formats(self, formats):
print 'Available formats:'
print('Available formats:')
for x in formats:
print '%s\t:\t%s\t[%s]' %(x, self._video_extensions.get(x, 'flv'), self._video_dimensions.get(x, '???'))
print('%s\t:\t%s\t[%s]' %(x, self._video_extensions.get(x, 'flv'), self._video_dimensions.get(x, '???')))
def _real_initialize(self):
if self._downloader is None:
@@ -213,7 +238,7 @@ class YoutubeIE(InfoExtractor):
else:
raise netrc.NetrcParseError('No authenticators for %s' % self._NETRC_MACHINE)
except (IOError, netrc.NetrcParseError), err:
self._downloader.to_stderr(u'WARNING: parsing .netrc: %s' % str(err))
self._downloader.to_stderr(u'WARNING: parsing .netrc: %s' % compat_str(err))
return
# Set language
@@ -222,7 +247,7 @@ class YoutubeIE(InfoExtractor):
self.report_lang()
urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.to_stderr(u'WARNING: unable to set language: %s' % str(err))
self._downloader.to_stderr(u'WARNING: unable to set language: %s' % compat_str(err))
return
# No authentication to be performed
@@ -245,7 +270,7 @@ class YoutubeIE(InfoExtractor):
self._downloader.to_stderr(u'WARNING: unable to log in: bad username or password')
return
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.to_stderr(u'WARNING: unable to log in: %s' % str(err))
self._downloader.to_stderr(u'WARNING: unable to log in: %s' % compat_str(err))
return
# Confirm age
@@ -258,7 +283,7 @@ class YoutubeIE(InfoExtractor):
self.report_age_confirmation()
age_results = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to confirm age: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to confirm age: %s' % compat_str(err))
return
def _real_extract(self, url):
@@ -268,7 +293,7 @@ class YoutubeIE(InfoExtractor):
url = 'http://www.youtube.com/' + urllib.unquote(mobj.group(1)).lstrip('/')
# Extract video id from URL
mobj = re.match(self._VALID_URL, url)
mobj = re.match(self._VALID_URL, url, re.VERBOSE)
if mobj is None:
self._downloader.trouble(u'ERROR: invalid URL: %s' % url)
return
@@ -280,7 +305,7 @@ class YoutubeIE(InfoExtractor):
try:
video_webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
# Attempt to extract SWF player URL
@@ -302,7 +327,7 @@ class YoutubeIE(InfoExtractor):
if 'token' in video_info:
break
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video info webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video info webpage: %s' % compat_str(err))
return
if 'token' not in video_info:
if 'reason' in video_info:
@@ -365,7 +390,7 @@ class YoutubeIE(InfoExtractor):
try:
srt_list = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
raise Trouble(u'WARNING: unable to download video subtitles: %s' % str(err))
raise Trouble(u'WARNING: unable to download video subtitles: %s' % compat_str(err))
srt_lang_list = re.findall(r'name="([^"]*)"[^>]+lang_code="([\w\-]+)"', srt_list)
srt_lang_list = dict((l[1], l[0]) for l in srt_lang_list)
if not srt_lang_list:
@@ -382,13 +407,19 @@ class YoutubeIE(InfoExtractor):
try:
srt_xml = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
raise Trouble(u'WARNING: unable to download video subtitles: %s' % str(err))
raise Trouble(u'WARNING: unable to download video subtitles: %s' % compat_str(err))
if not srt_xml:
raise Trouble(u'WARNING: unable to download video subtitles')
video_subtitles = self._closed_captions_xml_to_srt(srt_xml.decode('utf-8'))
except Trouble as trouble:
self._downloader.trouble(trouble[0])
if 'length_seconds' not in video_info:
self._downloader.trouble(u'WARNING: unable to extract video duration')
video_duration = ''
else:
video_duration = urllib.unquote_plus(video_info['length_seconds'][0])
# token
video_token = urllib.unquote_plus(video_info['token'][0])
@@ -455,7 +486,8 @@ class YoutubeIE(InfoExtractor):
'thumbnail': video_thumbnail.decode('utf-8'),
'description': video_description,
'player_url': player_url,
'subtitles': video_subtitles
'subtitles': video_subtitles,
'duration': video_duration
})
return results
@@ -494,7 +526,7 @@ class MetacafeIE(InfoExtractor):
self.report_disclaimer()
disclaimer = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to retrieve disclaimer: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to retrieve disclaimer: %s' % compat_str(err))
return
# Confirm age
@@ -507,7 +539,7 @@ class MetacafeIE(InfoExtractor):
self.report_age_confirmation()
disclaimer = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to confirm age: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to confirm age: %s' % compat_str(err))
return
def _real_extract(self, url):
@@ -531,7 +563,7 @@ class MetacafeIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable retrieve video webpage: %s' % compat_str(err))
return
# Extract URL, uploader and title from webpage
@@ -571,7 +603,7 @@ class MetacafeIE(InfoExtractor):
return
video_title = mobj.group(1).decode('utf-8')
mobj = re.search(r'(?ms)By:\s*<a .*?>(.+?)<', webpage)
mobj = re.search(r'submitter=(.*?);', webpage)
if mobj is None:
self._downloader.trouble(u'ERROR: unable to extract uploader nickname')
return
@@ -592,7 +624,7 @@ class MetacafeIE(InfoExtractor):
class DailymotionIE(InfoExtractor):
"""Information Extractor for Dailymotion"""
_VALID_URL = r'(?i)(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/video/([^_/]+)_([^/]+)'
_VALID_URL = r'(?i)(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/video/([^/]+)'
IE_NAME = u'dailymotion'
def __init__(self, downloader=None):
@@ -613,9 +645,9 @@ class DailymotionIE(InfoExtractor):
self._downloader.trouble(u'ERROR: invalid URL: %s' % url)
return
video_id = mobj.group(1)
video_id = mobj.group(1).split('_')[0].split('?')[0]
video_extension = 'flv'
video_extension = 'mp4'
# Retrieve video webpage to extract further information
request = urllib2.Request(url)
@@ -624,25 +656,34 @@ class DailymotionIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable retrieve video webpage: %s' % compat_str(err))
return
# Extract URL, uploader and title from webpage
self.report_extraction(video_id)
mobj = re.search(r'(?i)addVariable\(\"sequence\"\s*,\s*\"([^\"]+?)\"\)', webpage)
mobj = re.search(r'\s*var flashvars = (.*)', webpage)
if mobj is None:
self._downloader.trouble(u'ERROR: unable to extract media URL')
return
sequence = urllib.unquote(mobj.group(1))
mobj = re.search(r',\"sdURL\"\:\"([^\"]+?)\",', sequence)
if mobj is None:
self._downloader.trouble(u'ERROR: unable to extract media URL')
flashvars = urllib.unquote(mobj.group(1))
for key in ['hd1080URL', 'hd720URL', 'hqURL', 'sdURL', 'ldURL', 'video_url']:
if key in flashvars:
max_quality = key
self._downloader.to_screen(u'[dailymotion] Using %s' % key)
break
else:
self._downloader.trouble(u'ERROR: unable to extract video URL')
return
mediaURL = urllib.unquote(mobj.group(1)).replace('\\', '')
# if needed add http://www.dailymotion.com/ if relative URL
mobj = re.search(r'"' + max_quality + r'":"(.+?)"', flashvars)
if mobj is None:
self._downloader.trouble(u'ERROR: unable to extract video URL')
return
video_url = mediaURL
video_url = urllib.unquote(mobj.group(1)).replace('\\/', '/')
# TODO: support choosing qualities
mobj = re.search(r'<meta property="og:title" content="(?P<title>[^"]*)" />', webpage)
if mobj is None:
@@ -650,17 +691,28 @@ class DailymotionIE(InfoExtractor):
return
video_title = unescapeHTML(mobj.group('title').decode('utf-8'))
mobj = re.search(r'(?im)<span class="owner[^\"]+?">[^<]+?<a [^>]+?>([^<]+?)</a></span>', webpage)
video_uploader = u'NA'
mobj = re.search(r'(?im)<span class="owner[^\"]+?">[^<]+?<a [^>]+?>([^<]+?)</a>', webpage)
if mobj is None:
self._downloader.trouble(u'ERROR: unable to extract uploader nickname')
return
video_uploader = mobj.group(1)
# looking for official user
mobj_official = re.search(r'<span rel="author"[^>]+?>([^<]+?)</span>', webpage)
if mobj_official is None:
self._downloader.trouble(u'WARNING: unable to extract uploader nickname')
else:
video_uploader = mobj_official.group(1)
else:
video_uploader = mobj.group(1)
video_upload_date = u'NA'
mobj = re.search(r'<div class="[^"]*uploaded_cont[^"]*" title="[^"]*">([0-9]{2})-([0-9]{2})-([0-9]{4})</div>', webpage)
if mobj is not None:
video_upload_date = mobj.group(3) + mobj.group(2) + mobj.group(1)
return [{
'id': video_id.decode('utf-8'),
'url': video_url.decode('utf-8'),
'uploader': video_uploader.decode('utf-8'),
'upload_date': u'NA',
'upload_date': video_upload_date,
'title': video_title,
'ext': video_extension.decode('utf-8'),
'format': u'NA',
@@ -702,7 +754,7 @@ class GoogleIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
# Extract URL, uploader, and title from webpage
@@ -741,7 +793,7 @@ class GoogleIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
mobj = re.search(r'<img class=thumbnail-img (?:.* )?src=(http.*)>', webpage)
if mobj is None:
@@ -797,7 +849,7 @@ class PhotobucketIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
# Extract URL, uploader, and title from webpage
@@ -867,7 +919,7 @@ class YahooIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
mobj = re.search(r'\("id", "([0-9]+)"\);', webpage)
@@ -891,7 +943,7 @@ class YahooIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
# Extract uploader and title from webpage
@@ -949,7 +1001,7 @@ class YahooIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
# Extract media URL from playlist XML
@@ -978,7 +1030,7 @@ class VimeoIE(InfoExtractor):
"""Information extractor for vimeo.com."""
# _VALID_URL matches Vimeo URLs
_VALID_URL = r'(?:https?://)?(?:(?:www|player).)?vimeo\.com/(?:groups/[^/]+/)?(?:videos?/)?([0-9]+)'
_VALID_URL = r'(?:https?://)?(?:(?:www|player).)?vimeo\.com/(?:(?:groups|album)/[^/]+/)?(?:videos?/)?([0-9]+)'
IE_NAME = u'vimeo'
def __init__(self, downloader=None):
@@ -1007,7 +1059,7 @@ class VimeoIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
# Now we begin extracting as much information as we can from what we
@@ -1048,21 +1100,32 @@ class VimeoIE(InfoExtractor):
timestamp = config['request']['timestamp']
# Vimeo specific: extract video codec and quality information
# First consider quality, then codecs, then take everything
# TODO bind to format param
codecs = [('h264', 'mp4'), ('vp8', 'flv'), ('vp6', 'flv')]
for codec in codecs:
if codec[0] in config["video"]["files"]:
video_codec = codec[0]
video_extension = codec[1]
if 'hd' in config["video"]["files"][codec[0]]: quality = 'hd'
else: quality = 'sd'
files = { 'hd': [], 'sd': [], 'other': []}
for codec_name, codec_extension in codecs:
if codec_name in config["video"]["files"]:
if 'hd' in config["video"]["files"][codec_name]:
files['hd'].append((codec_name, codec_extension, 'hd'))
elif 'sd' in config["video"]["files"][codec_name]:
files['sd'].append((codec_name, codec_extension, 'sd'))
else:
files['other'].append((codec_name, codec_extension, config["video"]["files"][codec_name][0]))
for quality in ('hd', 'sd', 'other'):
if len(files[quality]) > 0:
video_quality = files[quality][0][2]
video_codec = files[quality][0][0]
video_extension = files[quality][0][1]
self._downloader.to_screen(u'[vimeo] %s: Downloading %s file at %s quality' % (video_id, video_codec.upper(), video_quality))
break
else:
self._downloader.trouble(u'ERROR: no known codec found')
return
video_url = "http://player.vimeo.com/play_redirect?clip_id=%s&sig=%s&time=%s&quality=%s&codecs=%s&type=moogaloop_local&embed_location=" \
%(video_id, sig, timestamp, quality, video_codec.upper())
%(video_id, sig, timestamp, video_quality, video_codec.upper())
return [{
'id': video_id,
@@ -1162,7 +1225,7 @@ class GenericIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
except ValueError, err:
# since this is the last-resort InfoExtractor, if
@@ -1283,7 +1346,7 @@ class YoutubeSearchIE(InfoExtractor):
try:
data = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download API page: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download API page: %s' % compat_str(err))
return
api_response = json.loads(data)['data']
@@ -1360,7 +1423,7 @@ class GoogleSearchIE(InfoExtractor):
try:
page = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % compat_str(err))
return
# Extract video identifiers
@@ -1443,7 +1506,7 @@ class YahooSearchIE(InfoExtractor):
try:
page = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % compat_str(err))
return
# Extract video identifiers
@@ -1469,9 +1532,9 @@ class YahooSearchIE(InfoExtractor):
class YoutubePlaylistIE(InfoExtractor):
"""Information Extractor for YouTube playlists."""
_VALID_URL = r'(?:https?://)?(?:\w+\.)?youtube\.com/(?:(?:course|view_play_list|my_playlists|artist|playlist)\?.*?(p|a|list)=|user/.*?/user/|p/|user/.*?#[pg]/c/)(?:PL)?([0-9A-Za-z-_]+)(?:/.*?/([0-9A-Za-z_-]+))?.*'
_VALID_URL = r'(?:(?:https?://)?(?:\w+\.)?youtube\.com/(?:(?:course|view_play_list|my_playlists|artist|playlist)\?.*?(p|a|list)=|user/.*?/user/|p/|user/.*?#[pg]/c/)(?:PL|EC)?|PL|EC)([0-9A-Za-z-_]+)(?:/.*?/([0-9A-Za-z_-]+))?.*'
_TEMPLATE_URL = 'http://www.youtube.com/%s?%s=%s&page=%s&gl=US&hl=en'
_VIDEO_INDICATOR_TEMPLATE = r'/watch\?v=(.+?)&amp;list=(PL)?%s&'
_VIDEO_INDICATOR_TEMPLATE = r'/watch\?v=(.+?)&amp;([^&"]+&amp;)*list=.*?%s'
_MORE_PAGES_INDICATOR = r'yt-uix-pager-next'
IE_NAME = u'youtube:playlist'
@@ -1513,7 +1576,7 @@ class YoutubePlaylistIE(InfoExtractor):
try:
page = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % compat_str(err))
return
# Extract video identifiers
@@ -1539,6 +1602,56 @@ class YoutubePlaylistIE(InfoExtractor):
return
class YoutubeChannelIE(InfoExtractor):
"""Information Extractor for YouTube channels."""
_VALID_URL = r"^(?:https?://)?(?:youtu\.be|(?:\w+\.)?youtube(?:-nocookie)?\.com)/channel/([0-9A-Za-z_-]+)(?:/.*)?$"
_TEMPLATE_URL = 'http://www.youtube.com/channel/%s/videos?sort=da&flow=list&view=0&page=%s&gl=US&hl=en'
_MORE_PAGES_INDICATOR = r'yt-uix-button-content">Next' # TODO
IE_NAME = u'youtube:channel'
def report_download_page(self, channel_id, pagenum):
"""Report attempt to download channel page with given number."""
self._downloader.to_screen(u'[youtube] Channel %s: Downloading page #%s' % (channel_id, pagenum))
def _real_extract(self, url):
# Extract channel id
mobj = re.match(self._VALID_URL, url)
if mobj is None:
self._downloader.trouble(u'ERROR: invalid url: %s' % url)
return
# Download channel pages
channel_id = mobj.group(1)
video_ids = []
pagenum = 1
while True:
self.report_download_page(channel_id, pagenum)
url = self._TEMPLATE_URL % (channel_id, pagenum)
request = urllib2.Request(url)
try:
page = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % compat_str(err))
return
# Extract video identifiers
ids_in_page = []
for mobj in re.finditer(r'href="/watch\?v=([0-9A-Za-z_-]+)&', page):
if mobj.group(1) not in ids_in_page:
ids_in_page.append(mobj.group(1))
video_ids.extend(ids_in_page)
if re.search(self._MORE_PAGES_INDICATOR, page) is None:
break
pagenum = pagenum + 1
for id in video_ids:
self._downloader.download(['http://www.youtube.com/watch?v=%s' % id])
return
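The per-page ID collection in YoutubeChannelIE can be tried in isolation. The following is a minimal sketch, not the extractor itself: `collect_video_ids` and the sample markup are hypothetical, only the regular expression is taken from the class above, and the snippet is written to run on a current Python rather than the Python 2 this code base targets.

```python
import re

# Same pattern the extractor applies to each channel page.
VIDEO_ID_RE = r'href="/watch\?v=([0-9A-Za-z_-]+)&'


def collect_video_ids(page):
    """Return the video IDs found in page, in first-seen order, without duplicates."""
    ids_in_page = []
    for mobj in re.finditer(VIDEO_ID_RE, page):
        if mobj.group(1) not in ids_in_page:
            ids_in_page.append(mobj.group(1))
    return ids_in_page


# Hypothetical markup: the first video is linked twice (thumbnail and title).
sample_page = (
    '<a href="/watch?v=abc123&amp;list=UU">thumb</a>'
    '<a href="/watch?v=abc123&amp;list=UU">title</a>'
    '<a href="/watch?v=x_Y-9&amp;list=UU">thumb</a>'
)
print(collect_video_ids(sample_page))
```

The list-based membership test keeps the order in which videos appear on the page, which a plain `set` would not guarantee.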
class YoutubeUserIE(InfoExtractor):
"""Information Extractor for YouTube users."""
@@ -1583,7 +1696,7 @@ class YoutubeUserIE(InfoExtractor):
try:
page = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % compat_str(err))
return
# Extract video identifiers
@@ -1655,7 +1768,7 @@ class BlipTVUserIE(InfoExtractor):
mobj = re.search(r'data-users-id="([^"]+)"', page)
page_base = page_base % mobj.group(1)
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download webpage: %s' % compat_str(err))
return
@@ -1743,7 +1856,7 @@ class DepositFilesIE(InfoExtractor):
self.report_download_webpage(file_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve file webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve file webpage: %s' % compat_str(err))
return
# Search for the real file URL
@@ -1860,7 +1973,7 @@ class FacebookIE(InfoExtractor):
else:
raise netrc.NetrcParseError('No authenticators for %s' % self._NETRC_MACHINE)
except (IOError, netrc.NetrcParseError), err:
self._downloader.to_stderr(u'WARNING: parsing .netrc: %s' % str(err))
self._downloader.to_stderr(u'WARNING: parsing .netrc: %s' % compat_str(err))
return
if useremail is None:
@@ -1880,7 +1993,7 @@ class FacebookIE(InfoExtractor):
self._downloader.to_stderr(u'WARNING: unable to log in: bad username/password, or exceeded login rate limit (~3/min). Check credentials or wait.')
return
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.to_stderr(u'WARNING: unable to log in: %s' % str(err))
self._downloader.to_stderr(u'WARNING: unable to log in: %s' % compat_str(err))
return
def _real_extract(self, url):
@@ -1897,7 +2010,7 @@ class FacebookIE(InfoExtractor):
page = urllib2.urlopen(request)
video_webpage = page.read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
# Start extracting information
@@ -2031,13 +2144,13 @@ class BlipTVIE(InfoExtractor):
'urlhandle': urlh
}
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video info webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video info webpage: %s' % compat_str(err))
return
if info is None: # Regular URL
try:
json_code = urlh.read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to read video info webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to read video info webpage: %s' % compat_str(err))
return
try:
@@ -2105,7 +2218,7 @@ class MyVideoIE(InfoExtractor):
self.report_download_webpage(video_id)
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
self.report_extraction(video_id)
@@ -2367,7 +2480,7 @@ class CollegeHumorIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
m = re.search(r'id="video:(?P<internalvideoid>[0-9]+)"', webpage)
@@ -2386,7 +2499,7 @@ class CollegeHumorIE(InfoExtractor):
try:
metaXml = urllib2.urlopen(xmlUrl).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video info XML: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video info XML: %s' % compat_str(err))
return
mdoc = xml.etree.ElementTree.fromstring(metaXml)
@@ -2432,7 +2545,7 @@ class XVideosIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
self.report_extraction(video_id)
@@ -2518,7 +2631,7 @@ class SoundcloudIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
self.report_extraction('%s/%s' % (uploader, slug_title))
@@ -2545,7 +2658,7 @@ class SoundcloudIE(InfoExtractor):
mobj = re.search('track-description-value"><p>(.*?)</p>', webpage)
if mobj:
description = mobj.group(1)
# upload date
upload_date = None
mobj = re.search("pretty-date'>on ([\w]+ [\d]+, [\d]+ \d+:\d+)</abbr></h2>", webpage)
@@ -2553,7 +2666,7 @@ class SoundcloudIE(InfoExtractor):
try:
upload_date = datetime.datetime.strptime(mobj.group(1), '%B %d, %Y %H:%M').strftime('%Y%m%d')
except Exception, e:
self._downloader.to_stderr(str(e))
self._downloader.to_stderr(compat_str(e))
# for soundcloud, a request to a cross domain is required for cookies
request = urllib2.Request('http://media.soundcloud.com/crossdomain.xml', std_headers)
@@ -2597,7 +2710,7 @@ class InfoQIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
self.report_extraction(url)
@@ -2683,15 +2796,15 @@ class MixcloudIE(InfoExtractor):
return None
def _print_formats(self, formats):
print 'Available formats:'
print('Available formats:')
for fmt in formats.keys():
for b in formats[fmt]:
try:
ext = formats[fmt][b][0]
print '%s\t%s\t[%s]' % (fmt, b, ext.split('.')[-1])
print('%s\t%s\t[%s]' % (fmt, b, ext.split('.')[-1]))
except TypeError: # we have no bitrate info
ext = formats[fmt][0]
print '%s\t%s\t[%s]' % (fmt, '??', ext.split('.')[-1])
print('%s\t%s\t[%s]' % (fmt, '??', ext.split('.')[-1]))
break
def _real_extract(self, url):
@@ -2711,7 +2824,7 @@ class MixcloudIE(InfoExtractor):
self.report_download_json(file_url)
jsonData = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve file: %s' % str(err))
self._downloader.trouble(u'ERROR: Unable to retrieve file: %s' % compat_str(err))
return
# parse JSON
@@ -2895,7 +3008,7 @@ class MTVIE(InfoExtractor):
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % compat_str(err))
return
mobj = re.search(r'<meta name="mtv_vt" content="([^"]+)"/>', webpage)
@@ -2928,7 +3041,7 @@ class MTVIE(InfoExtractor):
try:
metadataXml = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video metadata: %s' % str(err))
self._downloader.trouble(u'ERROR: unable to download video metadata: %s' % compat_str(err))
return
mdoc = xml.etree.ElementTree.fromstring(metadataXml)
@@ -2955,3 +3068,314 @@ class MTVIE(InfoExtractor):
}
return [info]
class YoukuIE(InfoExtractor):
_VALID_URL = r'(?:http://)?v\.youku\.com/v_show/id_(?P<ID>[A-Za-z0-9]+)\.html'
IE_NAME = u'Youku'
def __init__(self, downloader=None):
InfoExtractor.__init__(self, downloader)
def report_download_webpage(self, file_id):
"""Report webpage download."""
self._downloader.to_screen(u'[Youku] %s: Downloading webpage' % file_id)
def report_extraction(self, file_id):
"""Report information extraction."""
self._downloader.to_screen(u'[Youku] %s: Extracting information' % file_id)
def _gen_sid(self):
nowTime = int(time.time() * 1000)
random1 = random.randint(1000,1998)
random2 = random.randint(1000,9999)
return "%d%d%d" %(nowTime,random1,random2)
def _get_file_ID_mix_string(self, seed):
mixed = []
source = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\:._-1234567890")
seed = float(seed)
for i in range(len(source)):
seed = (seed * 211 + 30031 ) % 65536
index = math.floor(seed / 65536 * len(source) )
mixed.append(source[int(index)])
source.remove(source[int(index)])
#return ''.join(mixed)
return mixed
def _get_file_id(self, fileId, seed):
mixed = self._get_file_ID_mix_string(seed)
ids = fileId.split('*')
realId = []
for ch in ids:
if ch:
realId.append(mixed[int(ch)])
return ''.join(realId)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
self._downloader.trouble(u'ERROR: invalid URL: %s' % url)
return
video_id = mobj.group('ID')
info_url = 'http://v.youku.com/player/getPlayList/VideoIDS/' + video_id
request = urllib2.Request(info_url, None, std_headers)
try:
self.report_download_webpage(video_id)
jsondata = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error) as err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
self.report_extraction(video_id)
try:
config = json.loads(jsondata)
video_title = config['data'][0]['title']
seed = config['data'][0]['seed']
format = self._downloader.params.get('format', None)
supported_format = config['data'][0]['streamfileids'].keys()
if format is None or format == 'best':
if 'hd2' in supported_format:
format = 'hd2'
else:
format = 'flv'
ext = u'flv'
elif format == 'worst':
format = 'mp4'
ext = u'mp4'
else:
format = 'flv'
ext = u'flv'
fileid = config['data'][0]['streamfileids'][format]
seg_number = len(config['data'][0]['segs'][format])
keys=[]
for i in xrange(seg_number):
keys.append(config['data'][0]['segs'][format][i]['k'])
#TODO check error
# Youku can only be viewed from mainland China
except:
self._downloader.trouble(u'ERROR: unable to extract info section')
return
files_info=[]
sid = self._gen_sid()
fileid = self._get_file_id(fileid, seed)
# characters 8 and 9 of fileid (zero-based, i.e. fileid[8:10]) hold the segment number
# and are replaced with the index of each segment
for index, key in enumerate(keys):
temp_fileid = '%s%02X%s' % (fileid[0:8], index, fileid[10:])
download_url = 'http://f.youku.com/player/getFlvPath/sid/%s_%02X/st/flv/fileid/%s?k=%s' % (sid, index, temp_fileid, key)
info = {
'id': '%s_part%02d' % (video_id, index),
'url': download_url,
'uploader': None,
'title': video_title,
'ext': ext,
'format': u'NA'
}
files_info.append(info)
return files_info
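The file ID descrambling in YoukuIE is easier to follow outside the class. Below is a minimal standalone sketch of the same steps, assuming the alphabet and the generator constants shown above; the function names `mix_string` and `decode_file_id` and the sample inputs are hypothetical, and the snippet targets a current Python rather than Python 2.

```python
import math

# Alphabet copied from _get_file_ID_mix_string; the backslash is escaped here.
SOURCE = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890"


def mix_string(seed):
    """Shuffle SOURCE with the seeded linear congruential generator."""
    mixed = []
    source = list(SOURCE)
    seed = float(seed)
    for _ in range(len(source)):
        seed = (seed * 211 + 30031) % 65536
        # Pick a position in the shrinking alphabet and move that character over.
        index = int(math.floor(seed / 65536 * len(source)))
        mixed.append(source.pop(index))
    return mixed


def decode_file_id(file_id, seed):
    """Map each '*'-separated index in file_id to a character of the shuffled alphabet."""
    mixed = mix_string(seed)
    return ''.join(mixed[int(ch)] for ch in file_id.split('*') if ch)


# Hypothetical inputs; real values come from the getPlayList JSON response.
print(decode_file_id('0*1*2*', 1234))
```

Because every character is removed from the alphabet once it has been picked, the result is always a permutation of the alphabet, and the same seed always yields the same permutation.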
class XNXXIE(InfoExtractor):
"""Information extractor for xnxx.com"""
_VALID_URL = r'^http://video\.xnxx\.com/video([0-9]+)/(.*)'
IE_NAME = u'xnxx'
VIDEO_URL_RE = r'flv_url=(.*?)&amp;'
VIDEO_TITLE_RE = r'<title>(.*?)\s+-\s+XNXX.COM'
VIDEO_THUMB_RE = r'url_bigthumb=(.*?)&amp;'
def report_webpage(self, video_id):
"""Report webpage download"""
self._downloader.to_screen(u'[%s] %s: Downloading webpage' % (self.IE_NAME, video_id))
def report_extraction(self, video_id):
"""Report information extraction"""
self._downloader.to_screen(u'[%s] %s: Extracting information' % (self.IE_NAME, video_id))
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
if mobj is None:
self._downloader.trouble(u'ERROR: invalid URL: %s' % url)
return
video_id = mobj.group(1).decode('utf-8')
self.report_webpage(video_id)
# Get webpage content
try:
webpage = urllib2.urlopen(url).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: unable to download video webpage: %s' % err)
return
result = re.search(self.VIDEO_URL_RE, webpage)
if result is None:
self._downloader.trouble(u'ERROR: unable to extract video url')
return
video_url = urllib.unquote(result.group(1).decode('utf-8'))
result = re.search(self.VIDEO_TITLE_RE, webpage)
if result is None:
self._downloader.trouble(u'ERROR: unable to extract video title')
return
video_title = result.group(1).decode('utf-8')
result = re.search(self.VIDEO_THUMB_RE, webpage)
if result is None:
self._downloader.trouble(u'ERROR: unable to extract video thumbnail')
return
video_thumbnail = result.group(1).decode('utf-8')
info = {'id': video_id,
'url': video_url,
'uploader': None,
'upload_date': None,
'title': video_title,
'ext': 'flv',
'format': 'flv',
'thumbnail': video_thumbnail,
'description': None,
'player_url': None}
return [info]
class GooglePlusIE(InfoExtractor):
"""Information extractor for plus.google.com."""
_VALID_URL = r'(?:https://)?plus\.google\.com/(?:\w+/)*?(\d+)/posts/(\w+)'
IE_NAME = u'plus.google'
def __init__(self, downloader=None):
InfoExtractor.__init__(self, downloader)
def report_extract_entry(self, url):
"""Report downloading entry"""
self._downloader.to_screen(u'[plus.google] Downloading entry: %s' % url.decode('utf-8'))
def report_date(self, upload_date):
"""Report entry date"""
self._downloader.to_screen(u'[plus.google] Entry date: %s' % upload_date)
def report_uploader(self, uploader):
"""Report uploader"""
self._downloader.to_screen(u'[plus.google] Uploader: %s' % uploader.decode('utf-8'))
def report_title(self, video_title):
"""Report title"""
self._downloader.to_screen(u'[plus.google] Title: %s' % video_title.decode('utf-8'))
def report_extract_vid_page(self, video_page):
"""Report information extraction."""
self._downloader.to_screen(u'[plus.google] Extracting video page: %s' % video_page.decode('utf-8'))
def _real_extract(self, url):
# Extract id from URL
mobj = re.match(self._VALID_URL, url)
if mobj is None:
self._downloader.trouble(u'ERROR: Invalid URL: %s' % url)
return
post_url = mobj.group(0)
video_id = mobj.group(2)
video_extension = 'flv'
# Step 1, Retrieve post webpage to extract further information
self.report_extract_entry(post_url)
request = urllib2.Request(post_url)
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve entry webpage: %s' % compat_str(err))
return
# Extract upload date
upload_date = u'NA'
pattern = 'title="Timestamp">(.*?)</a>'
mobj = re.search(pattern, webpage)
if mobj:
upload_date = mobj.group(1)
# Convert timestring to a format suitable for filename
upload_date = datetime.datetime.strptime(upload_date, "%Y-%m-%d")
upload_date = upload_date.strftime('%Y%m%d')
self.report_date(upload_date)
# Extract uploader
uploader = u'NA'
pattern = r'rel\="author".*?>(.*?)</a>'
mobj = re.search(pattern, webpage)
if mobj:
uploader = mobj.group(1)
self.report_uploader(uploader)
# Extract title
# Get the first line for title
video_title = u'NA'
pattern = r'<meta name\=\"Description\" content\=\"(.*?)[\n<"]'
mobj = re.search(pattern, webpage)
if mobj:
video_title = mobj.group(1)
self.report_title(video_title)
# Step 2, Simulate clicking the image box to launch video
pattern = '"(https\://plus\.google\.com/photos/.*?)",,"image/jpeg","video"\]'
mobj = re.search(pattern, webpage)
if mobj is None:
self._downloader.trouble(u'ERROR: unable to extract video page URL')
return
video_page = mobj.group(1)
request = urllib2.Request(video_page)
try:
webpage = urllib2.urlopen(request).read()
except (urllib2.URLError, httplib.HTTPException, socket.error), err:
self._downloader.trouble(u'ERROR: Unable to retrieve video webpage: %s' % compat_str(err))
return
self.report_extract_vid_page(video_page)
# Extract video links on video page
"""Extract video links of all sizes"""
pattern = '\d+,\d+,(\d+),"(http\://redirector\.googlevideo\.com.*?)"'
mobj = re.findall(pattern, webpage)
if len(mobj) == 0:
self._downloader.trouble(u'ERROR: unable to extract video links')
return
# Sort in resolution
links = sorted(mobj)
# Choose the last entry of the sort, i.e. highest resolution
video_url = links[-1]
# Only get the url. The resolution part in the tuple has no use anymore
video_url = video_url[-1]
# Treat escaped \u0026 style hex
video_url = unicode(video_url, "unicode_escape")
return [{
'id': video_id.decode('utf-8'),
'url': video_url,
'uploader': uploader.decode('utf-8'),
'upload_date': upload_date.decode('utf-8'),
'title': video_title.decode('utf-8'),
'ext': video_extension.decode('utf-8'),
'format': u'NA',
'player_url': None,
}]
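One detail of the link selection in GooglePlusIE is worth checking by hand: `re.findall` returns the captured number as a string, so a plain `sorted()` compares the tuples lexicographically. The sketch below uses made-up links and assumes the captured number is a height such as 360, 720 or 1080; `pick_highest_resolution` and the URLs are hypothetical, and the numeric ordering is one possible alternative, not what the extractor does.

```python
def pick_highest_resolution(links):
    """links: (resolution, url) string tuples, as re.findall would return them."""
    return max(links, key=lambda link: int(link[0]))[1]


# Hypothetical matches; the real URLs point at redirector.googlevideo.com.
links = [
    ('1080', 'http://example.invalid/hd'),
    ('360', 'http://example.invalid/low'),
    ('720', 'http://example.invalid/mid'),
]

# String comparison puts '1080' before '360' and '720'.
print(sorted(links)[-1][1])
# Comparing the resolution as an integer picks the 1080 entry.
print(pick_highest_resolution(links))
```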
+16 -9
@@ -71,13 +71,14 @@ class FFmpegExtractAudioPP(PostProcessor):
@staticmethod
def detect_executables():
available = {'avprobe' : False, 'avconv' : False, 'ffmpeg' : False, 'ffprobe' : False}
for path in os.environ["PATH"].split(os.pathsep):
for program in available.keys():
exe_file = os.path.join(path, program)
if os.path.isfile(exe_file) and os.access(exe_file, os.X_OK):
available[program] = exe_file
return available
def executable(exe):
try:
subprocess.Popen([exe, '-version'], stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
except OSError:
return False
return exe
programs = ['avprobe', 'avconv', 'ffmpeg', 'ffprobe']
return dict((program, executable(program)) for program in programs)
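The new detection strategy replaces the manual `PATH` scan with an attempt to actually start each program. A minimal standalone sketch of that idea follows, assuming only the standard `subprocess` module; the `programs` parameter is a hypothetical addition to make the function easy to try with other names.

```python
import subprocess


def executable(exe):
    """Return exe if it can be started, False otherwise."""
    try:
        subprocess.Popen(
            [exe, '-version'],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        ).communicate()
    except OSError:
        # Raised when the program cannot be found or started.
        return False
    return exe


def detect_executables(programs=('avprobe', 'avconv', 'ffmpeg', 'ffprobe')):
    """Map each program name to itself if it starts, or to False."""
    return dict((program, executable(program)) for program in programs)


print(detect_executables())
```

Letting the operating system resolve the name also covers cases a hand-written `PATH` walk may miss, such as the `.exe` suffix on Windows. The exit status is deliberately ignored: only the ability to start the process is tested.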
def get_audio_codec(self, path):
if not self._exes['ffprobe'] and not self._exes['avprobe']: return None
@@ -142,14 +143,20 @@ class FFmpegExtractAudioPP(PostProcessor):
extension = 'mp3'
more_opts = []
if self._preferredquality is not None:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality]
if int(self._preferredquality) < 10:
more_opts += [self._exes['avconv'] and '-q:a' or '-aq', self._preferredquality]
else:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality + 'k']
else:
# We convert the audio (lossy)
acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
extension = self._preferredcodec
more_opts = []
if self._preferredquality is not None:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality]
if int(self._preferredquality) < 10:
more_opts += [self._exes['avconv'] and '-q:a' or '-aq', self._preferredquality]
else:
more_opts += [self._exes['avconv'] and '-b:a' or '-ab', self._preferredquality + 'k']
if self._preferredcodec == 'aac':
more_opts += ['-f', 'adts']
if self._preferredcodec == 'm4a':
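The quality handling added in this hunk treats values below 10 as a VBR quality level and anything else as a bitrate in kbit/s. A minimal sketch of that decision as a pure function, assuming the same option names as above; `quality_opts` and its parameters are hypothetical names used only for illustration.

```python
def quality_opts(preferredquality, have_avconv):
    """Build the extra audio quality options for the converter command line."""
    if preferredquality is None:
        return []
    if int(preferredquality) < 10:
        # Small numbers are passed through as a VBR quality level.
        return ['-q:a' if have_avconv else '-aq', preferredquality]
    # Larger numbers are taken as a bitrate and get a 'k' suffix.
    return ['-b:a' if have_avconv else '-ab', preferredquality + 'k']


print(quality_opts('5', True))
print(quality_opts('128', False))
```

Keeping the decision in one helper would also avoid repeating the same four lines in both branches of the codec selection.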
+43 -26
@@ -1,6 +1,8 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import with_statement
__authors__ = (
'Ricardo Garcia Gonzalez',
'Danny Colligan',
@@ -19,7 +21,7 @@ __authors__ = (
)
__license__ = 'Public Domain'
__version__ = '2012.09.27'
__version__ = '2012.11.27'
UPDATE_URL = 'https://raw.github.com/rg3/youtube-dl/master/youtube-dl'
UPDATE_URL_VERSION = 'https://raw.github.com/rg3/youtube-dl/master/LATEST_VERSION'
@@ -46,7 +48,7 @@ from PostProcessor import *
def updateSelf(downloader, filename):
''' Update the program file with the latest version from the repository '''
# Note: downloader only used for options
if not os.access(filename, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % filename)
@@ -64,7 +66,7 @@ def updateSelf(downloader, filename):
directory = os.path.dirname(exe)
if not os.access(directory, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % directory)
try:
urlh = urllib2.urlopen(UPDATE_URL_EXE)
newcontent = urlh.read()
@@ -73,20 +75,18 @@ def updateSelf(downloader, filename):
outf.write(newcontent)
except (IOError, OSError), err:
sys.exit('ERROR: unable to download latest version')
try:
bat = os.path.join(directory, 'youtube-dl-updater.bat')
b = open(bat, 'w')
print >> b, """
b.write("""
echo Updating youtube-dl...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s"
del "%s"
""" %(exe, exe, bat)
\n""" %(exe, exe, bat))
b.close()
os.startfile(bat)
except (IOError, OSError), err:
sys.exit('ERROR: unable to overwrite current version')
@@ -186,16 +186,18 @@ def parseOpts():
general.add_option('-r', '--rate-limit',
dest='ratelimit', metavar='LIMIT', help='download rate limit (e.g. 50k or 44.6m)')
general.add_option('-R', '--retries',
dest='retries', metavar='RETRIES', help='number of retries (default is 10)', default=10)
dest='retries', metavar='RETRIES', help='number of retries (default is %default)', default=10)
general.add_option('--dump-user-agent',
action='store_true', dest='dump_user_agent',
help='display the current browser identification', default=False)
general.add_option('--user-agent',
dest='user_agent', help='specify a custom user agent', metavar='UA')
general.add_option('--list-extractors',
action='store_true', dest='list_extractors',
help='List all supported extractors and the URLs they would handle', default=False)
selection.add_option('--playlist-start',
dest='playliststart', metavar='NUMBER', help='playlist video to start at (default is 1)', default=1)
dest='playliststart', metavar='NUMBER', help='playlist video to start at (default is %default)', default=1)
selection.add_option('--playlist-end',
dest='playlistend', metavar='NUMBER', help='playlist video to end at (default is last)', default=-1)
selection.add_option('--match-title', dest='matchtitle', metavar='REGEX',help='download only matching titles (regex or caseless sub-string)')
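The hunk above replaces hard-coded defaults in the help strings with optparse's `%default` placeholder. As a minimal sketch of what that placeholder does (using a bare `OptionParser` rather than the option groups from `parseOpts()`, and a hypothetical helper name `retries_help`), the following shows the default value being substituted when the help text is rendered:

```python
import optparse

# A bare parser for illustration only; the real code attaches options to
# option groups ("general", "selection", ...) created inside parseOpts().
parser = optparse.OptionParser()
parser.add_option('-R', '--retries',
                  dest='retries', metavar='RETRIES',
                  help='number of retries (default is %default)', default=10)


def retries_help(p):
    """Return the rendered help text with whitespace collapsed (hypothetical helper)."""
    return ' '.join(p.format_help().split())


if __name__ == '__main__':
    # optparse expands %default to the option's default when formatting help.
    print(retries_help(parser))
```

Because the help string no longer repeats the literal value, changing `default=` keeps the displayed help in sync without a second edit.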
@@ -261,13 +263,18 @@ def parseOpts():
filesystem.add_option('-t', '--title',
action='store_true', dest='usetitle', help='use title in file name', default=False)
filesystem.add_option('--id',
action='store_true', dest='useid', help='use video ID in file name', default=False)
filesystem.add_option('-l', '--literal',
action='store_true', dest='useliteral', help='use literal title in file name', default=False)
action='store_true', dest='usetitle', help='[deprecated] alias of --title', default=False)
filesystem.add_option('-A', '--auto-number',
action='store_true', dest='autonumber',
help='number downloaded files starting from 00000', default=False)
filesystem.add_option('-o', '--output',
dest='outtmpl', metavar='TEMPLATE', help='output filename template. Use %(stitle)s to get the title, %(uploader)s for the uploader name, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(upload_date)s for the upload date (YYYYMMDD), and %% for a literal percent. Use - to output to stdout.')
dest='outtmpl', metavar='TEMPLATE', help='output filename template. Use %(title)s to get the title, %(uploader)s for the uploader name, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(upload_date)s for the upload date (YYYYMMDD), %(extractor)s for the provider (youtube, metacafe, etc), %(id)s for the video id and %% for a literal percent. Use - to output to stdout.')
filesystem.add_option('--restrict-filenames',
action='store_true', dest='restrictfilenames',
help='Avoid some characters such as "&" and spaces in filenames', default=False)
filesystem.add_option('-a', '--batch-file',
dest='batchfile', metavar='FILE', help='file containing URLs to download (\'-\' for stdin)')
filesystem.add_option('-w', '--no-overwrites',
@@ -292,12 +299,12 @@ def parseOpts():
help='write video metadata to a .info.json file', default=False)
postproc.add_option('--extract-audio', action='store_true', dest='extractaudio', default=False,
postproc.add_option('-x', '--extract-audio', action='store_true', dest='extractaudio', default=False,
help='convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)')
postproc.add_option('--audio-format', metavar='FORMAT', dest='audioformat', default='best',
help='"best", "aac", "vorbis", "mp3", "m4a", or "wav"; best by default')
postproc.add_option('--audio-quality', metavar='QUALITY', dest='audioquality', default='128K',
help='ffmpeg/avconv audio bitrate specification, 128k by default')
postproc.add_option('--audio-quality', metavar='QUALITY', dest='audioquality', default='5',
help='ffmpeg/avconv audio quality specification, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default 5)')
postproc.add_option('-k', '--keep-video', action='store_true', dest='keepvideo', default=False,
help='keeps the video file on disk after the post-processing; the video is erased by default')
@@ -326,6 +333,7 @@ def gen_extractors():
"""
return [
YoutubePlaylistIE(),
YoutubeChannelIE(),
YoutubeUserIE(),
YoutubeSearchIE(),
YoutubeIE(),
@@ -351,6 +359,9 @@ def gen_extractors():
MixcloudIE(),
StanfordOpenClassroomIE(),
MTVIE(),
YoukuIE(),
XNXXIE(),
GooglePlusIE(),
GenericIE()
]
@@ -368,6 +379,9 @@ def _real_main():
jar.load()
except (IOError, OSError), err:
sys.exit(u'ERROR: unable to open cookie file')
# Set user agent
if opts.user_agent is not None:
std_headers['User-Agent'] = opts.user_agent
# Dump user agent
if opts.dump_user_agent:
@@ -413,10 +427,10 @@ def _real_main():
parser.error(u'using .netrc conflicts with giving username/password')
if opts.password is not None and opts.username is None:
parser.error(u'account username missing')
if opts.outtmpl is not None and (opts.useliteral or opts.usetitle or opts.autonumber):
parser.error(u'using output template conflicts with using title, literal title or auto number')
if opts.usetitle and opts.useliteral:
parser.error(u'using title conflicts with using literal title')
if opts.outtmpl is not None and (opts.usetitle or opts.autonumber or opts.useid):
parser.error(u'using output template conflicts with using title, video ID or auto number')
if opts.usetitle and opts.useid:
parser.error(u'using title conflicts with using video ID')
if opts.username is not None and opts.password is None:
opts.password = getpass.getpass(u'Type account password and press return:')
if opts.ratelimit is not None:
@@ -444,6 +458,10 @@ def _real_main():
if opts.extractaudio:
if opts.audioformat not in ['best', 'aac', 'mp3', 'vorbis', 'm4a', 'wav']:
parser.error(u'invalid audio format specified')
if opts.audioquality:
opts.audioquality = opts.audioquality.strip('k').strip('K')
if not opts.audioquality.isdigit():
parser.error(u'invalid audio quality specified')
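The check added above can be read in isolation. The sketch below wraps the same two steps (strip a trailing `k`/`K`, then require digits) in a hypothetical function `normalize_audio_quality`, raising `ValueError` where the original calls `parser.error`; both the function name and the exception are assumptions made so the snippet runs on its own:

```python
def normalize_audio_quality(value):
    """Strip a 'k'/'K' suffix and require the remainder to be all digits.

    Mirrors the validation in the hunk above; raises ValueError instead of
    calling parser.error (an assumption for this standalone sketch).
    """
    stripped = value.strip('k').strip('K')
    if not stripped.isdigit():
        raise ValueError('invalid audio quality specified')
    return stripped


if __name__ == '__main__':
    print(normalize_audio_quality('128K'))  # bitrate form
    print(normalize_audio_quality('5'))     # VBR quality form
```

Note that `str.strip` removes the character from both ends, so a leading `k` would be dropped as well as a trailing one.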
# File downloader
fd = FileDownloader({
@@ -463,15 +481,14 @@ def _real_main():
'format_limit': opts.format_limit,
'listformats': opts.listformats,
'outtmpl': ((opts.outtmpl is not None and opts.outtmpl.decode(preferredencoding()))
or (opts.format == '-1' and opts.usetitle and u'%(stitle)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and opts.useliteral and u'%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and opts.usetitle and u'%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and u'%(id)s-%(format)s.%(ext)s')
or (opts.usetitle and opts.autonumber and u'%(autonumber)s-%(stitle)s-%(id)s.%(ext)s')
or (opts.useliteral and opts.autonumber and u'%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and u'%(stitle)s-%(id)s.%(ext)s')
or (opts.useliteral and u'%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and opts.autonumber and u'%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and u'%(title)s-%(id)s.%(ext)s')
or (opts.useid and u'%(id)s.%(ext)s')
or (opts.autonumber and u'%(autonumber)s-%(id)s.%(ext)s')
or u'%(id)s.%(ext)s'),
'restrictfilenames': opts.restrictfilenames,
'ignoreerrors': opts.ignoreerrors,
'ratelimit': opts.ratelimit,
'nooverwrites': opts.nooverwrites,
+25 -7
@@ -19,13 +19,18 @@ except ImportError:
import StringIO
std_headers = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:5.0.1) Gecko/20100101 Firefox/5.0.1',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0',
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate',
'Accept-Language': 'en-us,en;q=0.5',
}
try:
compat_str = unicode # Python 2
except NameError:
compat_str = str
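The `compat_str` alias above picks the text type by probing for the Python 2 name `unicode`. A small sketch of how such an alias might be used follows; the helper `to_text` is hypothetical and not part of the diff:

```python
try:
    compat_str = unicode  # Python 2: the text type is named unicode
except NameError:
    compat_str = str      # Python 3: str is already the text type


def to_text(value):
    """Convert a value to the interpreter's text type (hypothetical helper)."""
    return compat_str(value)


if __name__ == '__main__':
    print(type(to_text(42)).__name__)
```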
def preferredencoding():
"""Get preferred encoding.
@@ -83,7 +88,6 @@ class IDParser(HTMLParser.HTMLParser):
HTMLParser.HTMLParser.__init__(self)
def error(self, message):
print >> sys.stderr, self.getpos()
if self.error_count > 10 or self.started:
raise HTMLParser.HTMLParseError(message, self.getpos())
self.rawdata = '\n'.join(self.html.split('\n')[self.getpos()[0]:]) # skip one line
@@ -190,14 +194,28 @@ def timeconvert(timestr):
if timetuple is not None:
timestamp = email.utils.mktime_tz(timetuple)
return timestamp
def sanitize_filename(s):
"""Sanitizes a string so it could be used as part of a filename."""
def sanitize_filename(s, restricted=False):
"""Sanitizes a string so it could be used as part of a filename.
If restricted is set, use a stricter subset of allowed characters.
"""
def replace_insane(char):
if char in u' .\\/|?*<>:"' or ord(char) < 32:
if char == '?' or ord(char) < 32 or ord(char) == 127:
return ''
elif char == '"':
return '' if restricted else '\''
elif char == ':':
return '_-' if restricted else ' -'
elif char in '\\/|*<>':
return '-'
if restricted and (char in '&\'' or char.isspace()):
return '_'
return char
return u''.join(map(replace_insane, s)).strip('_')
result = u''.join(map(replace_insane, s))
while '--' in result:
result = result.replace('--', '-')
return result.strip('-')
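To make the effect of the new `restricted` flag concrete, here is a standalone transcription of the added lines of `sanitize_filename`, adjusted only so that it runs on Python 3 (plain `''.join` instead of `u''.join`). It is a sketch for experimenting with inputs, not a copy of the file as committed:

```python
def sanitize_filename(s, restricted=False):
    """Sanitize a string so it can be used as part of a filename.

    If restricted is set, use a stricter subset of allowed characters.
    """
    def replace_insane(char):
        # Drop '?', control characters and DEL entirely.
        if char == '?' or ord(char) < 32 or ord(char) == 127:
            return ''
        elif char == '"':
            return '' if restricted else '\''
        elif char == ':':
            return '_-' if restricted else ' -'
        elif char in '\\/|*<>':
            return '-'
        # In restricted mode, '&', apostrophes and whitespace become '_'.
        if restricted and (char in '&\'' or char.isspace()):
            return '_'
        return char

    result = ''.join(map(replace_insane, s))
    # Collapse runs of dashes produced by adjacent replacements.
    while '--' in result:
        result = result.replace('--', '-')
    return result.strip('-')


if __name__ == '__main__':
    print(sanitize_filename('abc/de: "fg"?'))
    print(sanitize_filename('abc/de: "fg"?', restricted=True))
```

Compared with the removed version, spaces and dots are no longer rejected by default; only `--restrict-filenames` replaces whitespace and `&`.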
def orderedSet(iterable):
""" Remove all duplicates from the input iterable """