API — pytube 15.0.0 documentation


class pytube.YouTube(url: str, on_progress_callback: Optional[Callable[[Any, bytes, int], None]] = None, on_complete_callback: Optional[Callable[[Any, Optional[str]], None]] = None, proxies: Dict[str, str] = None, use_oauth: bool = False, allow_oauth_cache: bool = True)

Core developer interface for pytube.


    author

Get the video author.

Return type: str


    check_availability()

Check whether the video is available.

Raises different exceptions based on why the video is unavailable, otherwise does nothing.


    fmt_streams

Returns a list of streams if they have been initialized.

If the streams have not been initialized, finds all relevant streams and initializes them.

class pytube.contrib.playlist.Playlist(url: str, proxies: Optional[Dict[str, str]] = None)

Load a YouTube playlist from a URL.


    count(value) → integer

Return the number of occurrences of value.


    index(value[, start[, stop]]) → integer

Return the first index of value. Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.


    last_updated

Extract the date that the playlist was last updated.

For some playlists, this is a specific date, which is returned as a datetime object. For other playlists, it is an estimate such as “1 week ago”. Because the value arrives as a string, pytube parses it on a best-effort basis and returns the raw string when parsing is not possible.


    trimmed(video_id: str) → Iterable[str]

Retrieve a list of YouTube video URLs trimmed at the given video ID.

For example, if the playlist has video IDs 1, 2, 3, 4, calling trimmed(3) returns [1, 2].

Parameters:

video_id ( str ) – Video ID to trim the returned list of playlist URLs at.
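
The trimming behaviour described above can be sketched with a small generator. This is only an illustration of the idea, not pytube's implementation; `trim_before` and the sample IDs are hypothetical.

```python
from typing import Iterable, Iterator


def trim_before(video_ids: Iterable[str], stop_id: str) -> Iterator[str]:
    """Yield IDs in order, stopping just before stop_id (hypothetical helper)."""
    for vid in video_ids:
        if vid == stop_id:
            return
        yield vid


print(list(trim_before(["1", "2", "3", "4"], "3")))  # ['1', '2']
```

If the ID never appears, this sketch simply yields the whole sequence.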

class pytube.contrib.channel.Channel(url: str, proxies: Optional[Dict[str, str]] = None)


    about_html

Get the html for the /about page.

Currently unused for any functionality.


    index(value[, start[, stop]]) → integer

Return the first index of value. Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.


    last_updated

Extract the date that the playlist was last updated.


    trimmed(video_id: str) → Iterable[str]

Retrieve a list of YouTube video URLs trimmed at the given video ID.

For example, if the playlist has video IDs 1, 2, 3, 4, calling trimmed(3) returns [1, 2].

Parameters:

video_id ( str ) – Video ID to trim the returned list of playlist URLs at.

class pytube.Stream(stream: Dict[KT, VT], monostate: pytube.monostate.Monostate)

Container for stream manifest data.


    default_filename

Generate filename based on the video title.


    download(output_path: Optional[str] = None, filename: Optional[str] = None, filename_prefix: Optional[str] = None, skip_existing: bool = True, timeout: Optional[int] = None, max_retries: Optional[int] = 0) → str

Write the media stream to disk.

Parameters:

output_path ( str or None ) – (optional) Output path for writing media file. If one is not specified, defaults to the current working directory.
filename ( str or None ) – (optional) Output filename (stem only) for writing media file. If one is not specified, the default filename is used.
filename_prefix ( str or None ) – (optional) A string that will be prepended to the filename, for example a number in a playlist or the name of a series. If one is not specified, nothing will be prepended. This is separate from filename so you can use the default filename but still add a prefix.
skip_existing ( bool ) – (optional) Skip existing files. Defaults to True.
timeout ( int ) – (optional) Request timeout length in seconds. Uses system default.
max_retries ( int ) – (optional) Number of retries to attempt after socket timeout. Defaults to 0.

Returns: Path to the saved video
Return type: str


    on_progress(chunk: bytes, file_handler: BinaryIO, bytes_remaining: int)

On progress callback function.

This function writes the binary data to the file, then checks if an additional callback is defined in the monostate. This is exposed to allow things like displaying a progress bar.

Parameters:

chunk ( bytes ) – Segment of media file binary data, not yet written to disk.
file_handler ( io.BufferedWriter ) – The file handle where the media is being written to.
bytes_remaining ( int ) – The delta between the total file size in bytes and amount already downloaded.

Return type: None
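
As a rough illustration of the progress-bar use case, the sketch below converts bytes_remaining into a percentage. The callback follows the (stream, chunk, bytes_remaining) shape given for on_progress_callback in the YouTube constructor. `percent_complete` and `show_progress` are hypothetical names, and the sketch assumes the stream object exposes its total size as `filesize`.

```python
from typing import Any


def percent_complete(total_size: int, bytes_remaining: int) -> float:
    """Convert a bytes_remaining value into a completion percentage."""
    if total_size <= 0:
        return 0.0
    return 100.0 * (total_size - bytes_remaining) / total_size


def show_progress(stream: Any, chunk: bytes, bytes_remaining: int) -> None:
    # Assumption: stream.filesize holds the total size in bytes.
    done = percent_complete(stream.filesize, bytes_remaining)
    print(f"{done:.1f}% downloaded")
```

A function like `show_progress` would be passed as on_progress_callback when constructing the YouTube object.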


    parse_codecs()

Get the video/audio codecs from list of codecs.

Parse a variable-length list of codecs and return a constant two-element tuple, with the video codec as the first element and the audio codec as the second. An element is None if that codec is not available (adaptive streams only).
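
The mapping from a variable-length codec list to a fixed two-element tuple might look like the sketch below. It is a standalone approximation, not the method itself: `split_codecs` and its `has_video` / `has_audio` flags are hypothetical stand-ins for state the Stream object keeps internally.

```python
from typing import List, Optional, Tuple


def split_codecs(
    codecs: List[str], has_video: bool, has_audio: bool
) -> Tuple[Optional[str], Optional[str]]:
    """Return (video_codec, audio_codec), using None for a missing track."""
    video: Optional[str] = None
    audio: Optional[str] = None
    if has_video and has_audio:
        # Progressive stream: both codecs are listed, video first.
        video, audio = codecs[0], codecs[1]
    elif has_video:
        video = codecs[0]
    elif has_audio:
        audio = codecs[0]
    return video, audio


print(split_codecs(["avc1.64001F", "mp4a.40.2"], True, True))
print(split_codecs(["opus"], False, True))
```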

class pytube.query.StreamQuery(fmt_streams)

Interface for querying the available media streams.

    all() → List[pytube.streams.Stream]

Get all the results represented by this query as a list.


        filter(fps=None, res=None, resolution=None, mime_type=None, type=None, subtype=None, file_extension=None, abr=None, bitrate=None, video_codec=None, audio_codec=None, only_audio=None, only_video=None, progressive=None, adaptive=None, is_dash=None, custom_filter_functions=None)

Apply the given filtering criterion.

Parameters:

fps ( int or None ) – (optional) The frames per second.
resolution ( str or None ) – (optional) Alias to res .
res ( str or None ) – (optional) The video resolution.
mime_type ( str or None ) – (optional) Two-part identifier for file formats and format contents, composed of a “type” and a “subtype”.
type ( str or None ) – (optional) Type part of the mime_type (e.g.: audio, video).
subtype ( str or None ) – (optional) Sub-type part of the mime_type (e.g.: mp4, mov).
file_extension ( str or None ) – (optional) Alias to subtype .
abr ( str or None ) – (optional) Average bitrate (ABR) refers to the average amount of data transferred per unit of time (e.g.: 64kbps, 192kbps).
bitrate ( str or None ) – (optional) Alias to abr .
video_codec ( str or None ) – (optional) Video compression format.
audio_codec ( str or None ) – (optional) Audio compression format.
progressive ( bool ) – Excludes adaptive streams (one file contains both audio and video tracks).
adaptive ( bool ) – Excludes progressive streams (audio and video are on separate tracks).
is_dash ( bool ) – Include/exclude dash streams.
only_audio ( bool ) – Excludes streams with video tracks.
only_video ( bool ) – Excludes streams with audio tracks.
custom_filter_functions ( list or None ) – (optional) Interface for defining complex filters without subclassing.
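
One way to picture how these criteria combine is as a list of predicates that every surviving stream must satisfy, with custom_filter_functions appended to that list. The sketch below uses a hypothetical `FakeStream` dataclass and `filter_streams` function, covers only three of the built-in criteria, and is not pytube's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FakeStream:
    """Hypothetical stand-in for a stream; not pytube's Stream class."""
    resolution: Optional[str]
    subtype: str
    is_progressive: bool


def filter_streams(
    streams: List[FakeStream],
    res: Optional[str] = None,
    subtype: Optional[str] = None,
    progressive: Optional[bool] = None,
    custom_filter_functions: Optional[List[Callable[[FakeStream], bool]]] = None,
) -> List[FakeStream]:
    """Keep only the streams that satisfy every supplied criterion."""
    predicates: List[Callable[[FakeStream], bool]] = []
    if res is not None:
        predicates.append(lambda s: s.resolution == res)
    if subtype is not None:
        predicates.append(lambda s: s.subtype == subtype)
    if progressive:
        predicates.append(lambda s: s.is_progressive)
    predicates.extend(custom_filter_functions or [])
    return [s for s in streams if all(p(s) for p in predicates)]


streams = [
    FakeStream("720p", "mp4", True),
    FakeStream("1080p", "webm", False),
    FakeStream(None, "mp4", False),  # audio-only entry
]
print(filter_streams(streams, subtype="mp4", progressive=True))
```

Passing a callable through custom_filter_functions allows an arbitrary condition without subclassing, which matches the purpose stated for that parameter.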


         


        get_by_resolution(resolution: str)

Get the corresponding Stream for a given resolution.

Stream must be a progressive mp4.

Parameters: resolution ( str ) – Video resolution, e.g. “720p”, “480p”, “360p”, “240p”, “144p”
Return type: Stream or None
Returns: The Stream matching the given resolution or None if not found.


        index(value[, start[, stop]]) → integer

Return the first index of value. Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

Return type: Stream or None
Returns: The last result of this query or None if the result doesn’t contain any streams.

Parameters: is_otf ( bool ) – Set to False to retrieve only non-OTF streams
Return type: StreamQuery
Returns: A StreamQuery object with otf filtered streams

class pytube.Caption(caption_track: Dict[KT, VT])

Container for caption tracks.


        download(title: str, srt: bool = True, output_path: Optional[str] = None, filename_prefix: Optional[str] = None) → str

Write the media stream to disk.

Parameters:

title ( str ) – Output filename (stem only) for writing media file. If one is not specified, the default filename is used.
srt ( bool ) – Set to True to download srt, False to download xml. Defaults to True.
filename_prefix ( str or None )

Return type: str


        generate_srt_captions()

Generate “SubRip Subtitle” captions.

Takes the xml captions from xml_captions() and recompiles them into the “SubRip Subtitle” format.

class pytube.query.CaptionQuery(captions: List[pytube.captions.Caption])

Interface for querying the available captions.

    all() → List[pytube.captions.Caption]

Get all the results represented by this query as a list.

Parameters: lang_code ( str ) – The code that identifies the caption language.
Return type: Caption or None
Returns: The Caption matching the given lang_code or None if it does not exist.

Parameters: continuation ( str ) – Continuation string for fetching results.
Return type: tuple
Returns: A tuple of a list of YouTube objects and a continuation string.

Parameters: continuation ( str ) – Continuation string for fetching results.
Return type: dict
Returns: The raw json object returned by the innertube API.


            get_next_results()

Use the stored continuation string to fetch the next set of results.

This method does not return the results, but instead updates the results property.


            results

Return search results.

On first call, will generate and return the first set of results. Additional results can be generated using .get_next_results() .


pytube.extract.apply_descrambler(stream_data: Dict[KT, VT]) → None

Apply various in-place transforms to YouTube’s media stream data.

Creates a list of dictionaries by string splitting on commas, then taking each list item, parsing it as a query string, converting it to a dict and unquoting the value.

Example :

>>> d = {'foo': 'bar=1&var=test,em=5&t=url%20encoded'}
>>> apply_descrambler(d, 'foo')
>>> print(d)
{'foo': [{'bar': '1', 'var': 'test'}, {'em': '5', 't': 'url encoded'}]}
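
The transform described above (split on commas, parse each piece as a query string, unquote the values) can be approximated with the standard library. The function below is a hypothetical sketch named `descramble_field`; it takes the key explicitly, as the doctest does, and does not claim to match the library's current signature.

```python
from typing import Any, Dict
from urllib.parse import parse_qsl


def descramble_field(stream_data: Dict[str, Any], key: str) -> None:
    """Replace stream_data[key] (a comma-joined string) with a list of dicts."""
    raw = str(stream_data[key])
    # parse_qsl splits on "&" and "=" and percent-decodes each value.
    stream_data[key] = [dict(parse_qsl(item)) for item in raw.split(",")]


d = {"foo": "bar=1&var=test,em=5&t=url%20encoded"}
descramble_field(d, "foo")
print(d)
```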
pytube.extract.apply_signature(stream_manifest: Dict[KT, VT], vid_info: Dict[KT, VT], js: str) → None
Apply the decrypted signature to the stream manifest.
pytube.extract.channel_name(url: str) → str
Extract the channel_name or channel_id from a YouTube url.
This function supports the following patterns:
https://youtube.com/c/channel_name/*
https://youtube.com/channel/channel_id/*
https://youtube.com/u/channel_name/*
https://youtube.com/user/channel_id/*
pytube.extract.get_ytcfg(html: str) → str
Get the entirety of the ytcfg object.
This is built over multiple pieces, so we have to find all matches and
combine the dicts together.
pytube.extract.get_ytplayer_config(html: str) → Any
Get the YouTube player configuration data from the watch html.
Extract the ytplayer_config, which is json data embedded within the
watch html and serves as the primary source of obtaining the stream
manifest data.
pytube.extract.get_ytplayer_js(html: str) → Any
Get the YouTube player base JavaScript path.
Parameters:
html (str) – The html contents of the watch page.
pytube.extract.initial_data(watch_html: str) → str
Extract the ytInitialData json from the watch_html page.
This mostly contains metadata necessary for rendering the page on-load,
such as video information, copyright notices, etc.
Parameters:
watch_html (str) – Html of the watch page
pytube.extract.initial_player_response(watch_html: str) → str
Extract the ytInitialPlayerResponse json from the watch_html page.
This mostly contains metadata necessary for rendering the page on-load,
such as video information, copyright notices, etc.
Parameters:
watch_html (str) – Html of the watch page
pytube.extract.js_url(html: str) → str
Get the base JavaScript url.
Construct the base JavaScript url, which contains the decipher
“transforms”.
pytube.extract.metadata(initial_data) → Optional[pytube.metadata.YouTubeMetadata]
Get the informational metadata for the video.
e.g.:
'Song': '강남스타일(Gangnam Style)',
'Artist': 'PSY',
'Album': 'PSY SIX RULES Pt.1',
'Licensed to YouTube by': 'YG Entertainment Inc. […]'
pytube.extract.mime_type_codec(mime_type_codec: str) → Tuple[str, List[str]]
Parse the type data.
Breaks up the data in the type key of the manifest, which contains the
mime type and codecs serialized together, and splits them into separate
elements.
Example:
mime_type_codec('audio/webm; codecs="opus"') -> ('audio/webm', ['opus'])
Parameters:
mime_type_codec (str) – String containing mime type and codecs.
Return type: tuple
Returns: The mime type and a list of codecs.
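
A minimal sketch of this kind of split is shown below. The regular expression is an assumption made for illustration and may differ from the pattern pytube uses; `split_mime_type_codec` is a hypothetical name.

```python
import re
from typing import List, Tuple

# Assumed pattern for illustration; pytube's own regex may differ.
_MIME_CODEC_RE = re.compile(r'^\s*([\w-]+/[\w.+-]+)\s*;\s*codecs="([^"]*)"\s*$')


def split_mime_type_codec(value: str) -> Tuple[str, List[str]]:
    """Split 'type/subtype; codecs="a, b"' into (mime_type, [codecs])."""
    match = _MIME_CODEC_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised type string: {value!r}")
    mime_type, codecs = match.groups()
    return mime_type, [c.strip() for c in codecs.split(",")]


print(split_mime_type_codec('audio/webm; codecs="opus"'))
print(split_mime_type_codec('video/mp4; codecs="avc1.64001F, mp4a.40.2"'))
```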
pytube.extract.playability_status(watch_html: str) → (str, str)
Return the playability status and status explanation of a video.
For example, a video may have a status of LOGIN_REQUIRED, and an explanation
of “This is a private video. Please sign in to verify that you may see it.”
This explanation is what gets incorporated into the media player overlay.
pytube.extract.playlist_id(url: str) → str
Extract the playlist_id from a YouTube url.
This function supports the following patterns:
https://youtube.com/playlist?list=playlist_id
https://youtube.com/watch?v=video_id&list=playlist_id
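
Both supported patterns carry the ID in the `list` query parameter, so one possible approach is to read it with `urllib.parse`. The helper name `playlist_id_from_url` and the sample ID are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def playlist_id_from_url(url: str) -> str:
    """Return the value of the 'list' query parameter."""
    query = parse_qs(urlsplit(url).query)
    if "list" not in query:
        raise ValueError(f"no playlist id in {url!r}")
    return query["list"][0]


print(playlist_id_from_url("https://youtube.com/playlist?list=PL_example"))
print(playlist_id_from_url("https://youtube.com/watch?v=abc&list=PL_example"))
```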
pytube.extract.publish_date(watch_html: str)
Extract publish date
Parameters:
watch_html (str) – The html contents of the watch page.
pytube.extract.video_id(url: str) → str
Extract the video_id from a YouTube url.
This function supports the following patterns:
https://youtube.com/watch?v=video_id
https://youtube.com/embed/video_id
https://youtu.be/video_id
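
The three patterns share one property: the ID follows either `v=` or a `/`. A regular expression along those lines is sketched below. It assumes IDs are exactly 11 characters drawn from letters, digits, `_` and `-`; the pattern and the name `video_id_from_url` are illustrative and not taken from the library.

```python
import re

# Assumption: a video ID is 11 characters from [0-9A-Za-z_-].
_VIDEO_ID_RE = re.compile(r"(?:v=|/)([0-9A-Za-z_-]{11})")


def video_id_from_url(url: str) -> str:
    """Return the first 11-character ID that follows 'v=' or '/'."""
    match = _VIDEO_ID_RE.search(url)
    if match is None:
        raise ValueError(f"no video id in {url!r}")
    return match.group(1)


for sample in (
    "https://youtube.com/watch?v=abcdefghijk",
    "https://youtube.com/embed/abcdefghijk",
    "https://youtu.be/abcdefghijk",
):
    print(video_id_from_url(sample))
```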
pytube.extract.video_info_url_age_restricted(video_id: str, embed_html: str) → str
Construct the video_info url.
Cipher
This module contains all logic necessary to decipher the signature.
YouTube’s strategy to restrict downloading videos is to send a ciphered version
of the signature to the client, along with the decryption algorithm obfuscated
in JavaScript. For the clients to play the videos, JavaScript must take the
ciphered version, cycle it through a series of “transform functions,” and then
sign the media URL with the output.
This module is responsible for (1) finding and extracting those “transform
functions”, (2) mapping them to Python equivalents, and (3) taking the ciphered
signature and decoding it.
pytube.cipher.get_initial_function_name(js: str) → str
Extract the name of the function responsible for computing the signature.
Parameters:
js (str) – The contents of the base.js asset file.
pytube.cipher.get_throttling_plan(js: str)
Extract the “throttling plan”.
The “throttling plan” is a list of tuples used for calling functions
in the c array. The first element of the tuple is the index of the
function to call, and any remaining elements of the tuple are arguments
to pass to that function.
pytube.cipher.get_transform_map(js: str, var: str) → Dict[KT, VT]
Build a transform function lookup.
Build a lookup table of obfuscated JavaScript function names to the
Python equivalents.
Parameters:
js (str) – The contents of the base.js asset file.
var (str) – The obfuscated variable name that stores an object with all functions
that descramble the signature.
pytube.cipher.get_transform_object(js: str, var: str) → List[str]
Extract the “transform object”.
The “transform object” contains the function definitions referenced in the
“transform plan”. The var argument is the obfuscated variable name
which contains these functions, for example, given the function call
DE.AJ(a,15) returned by the transform plan, “DE” would be the var.
Parameters:
js (str) – The contents of the base.js asset file.
var (str) – The obfuscated variable name that stores an object with all functions
that descramble the signature.
>>> get_transform_object(js, 'DE')
['AJ:function(a){a.reverse()}',
'VR:function(a,b){a.splice(0,b)}',
'kT:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c}']
pytube.cipher.get_transform_plan(js: str) → List[str]
Extract the “transform plan”.
The “transform plan” is the functions that the ciphered signature is
cycled through to obtain the actual signature.
pytube.cipher.js_splice(arr: list, start: int, delete_count=None, *items)
Implementation of javascript’s splice function.
Parameters:
arr (list) – Array to splice
start (int) – Index at which to start changing the array
delete_count (int) – Number of elements to delete from the array
*items – Items to add to the array
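
For readers unfamiliar with JavaScript's `Array.prototype.splice`, the sketch below reproduces its core behaviour in Python: it mutates the list in place and returns the removed elements. `splice_like_js` is a hypothetical name, and the sketch is not a copy of the library's function.

```python
from typing import Any, List, Optional


def splice_like_js(
    arr: List[Any], start: int, delete_count: Optional[int] = None, *items: Any
) -> List[Any]:
    """Mutate arr the way Array.prototype.splice does; return removed items."""
    n = len(arr)
    # A negative start counts back from the end; both ends are clamped.
    start = max(n + start, 0) if start < 0 else min(start, n)
    if delete_count is None:
        delete_count = n - start
    delete_count = max(0, min(delete_count, n - start))
    removed = arr[start:start + delete_count]
    arr[start:start + delete_count] = list(items)
    return removed


data = [1, 2, 3, 4, 5]
print(splice_like_js(data, 1, 2, "x"), data)
```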
pytube.cipher.reverse(arr: List[T], _: Optional[Any])
Reverse elements in a list.
This function is equivalent to:
function(a, b) { a.reverse() }
This method takes an unused b variable because the transform functions
are universally sent two arguments.
Example:
>>> reverse([1, 2, 3, 4], None)
[4, 3, 2, 1]
pytube.cipher.splice(arr: List[T], b: int)
Add/remove items to/from a list.
This function is equivalent to:
function(a, b) { a.splice(0, b) }
Example:
>>> splice([1, 2, 3, 4], 2)
[1, 2]
pytube.cipher.swap(arr: List[T], b: int)
Swap positions at b modulus the list length.
This function is equivalent to:
function(a, b) { var c=a[0];a[0]=a[b%a.length];a[b]=c }
Example:
>>> swap([1, 2, 3, 4], 2)
[3, 2, 1, 4]
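
To show how a sequence of such transforms could be chained, the sketch below applies a small plan to a string. Everything here is illustrative: the function names, the plan, and the input are made up, and `drop_first` models what is left in the array after `a.splice(0, b)` (the JavaScript call itself returns the removed elements).

```python
from typing import Callable, List, Tuple

Transform = Callable[[List[str], int], List[str]]


def reverse_all(arr: List[str], _: int) -> List[str]:
    return arr[::-1]


def drop_first(arr: List[str], b: int) -> List[str]:
    # After a.splice(0, b) the array keeps everything from index b onward.
    return arr[b:]


def swap_first(arr: List[str], b: int) -> List[str]:
    r = b % len(arr)
    out = arr[:]
    out[0], out[r] = out[r], out[0]
    return out


def apply_plan(signature: str, plan: List[Tuple[Transform, int]]) -> str:
    """Run the signature through each (function, argument) step in order."""
    chars = list(signature)
    for fn, arg in plan:
        chars = fn(chars, arg)
    return "".join(chars)


plan = [(reverse_all, 0), (drop_first, 2), (swap_first, 3)]
print(apply_plan("abcdefgh", plan))  # cedfba
```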
pytube.cipher.throttling_cipher_function(d: list, e: str)
This ciphers d with e to generate a new list.
In the javascript, the operation is as follows:
var h = [A-Za-z0-9-_], f = 96;  // simplified from switch-case loop
d.forEach(
    function(l,m,n){
        this.push(
            n[m]=h[
                (h.indexOf(l)-h.indexOf(this[m])+m-32+f--)%h.length
            ]
        )
    },
    e.split("")
)
pytube.cipher.throttling_mod_func(d: list, e: int)
Perform the modular function from the throttling array functions.
In the javascript, the modular operation is as follows:
e = (e % d.length + d.length) % d.length
We simply translate this to python here.
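
The double-modulo form exists because JavaScript's `%` keeps the sign of the left operand, so a negative e would otherwise give a negative index. The sketch below emulates that behaviour with `math.fmod` and compares it against Python's `%`, which already returns a non-negative result for a positive divisor. Both function names are hypothetical.

```python
import math


def js_remainder(a: int, b: int) -> int:
    """Emulate JavaScript's % operator, whose result takes the sign of a."""
    return int(math.fmod(a, b))


def normalized_index(e: int, length: int) -> int:
    # Literal translation of (e % length + length) % length with JS semantics.
    return js_remainder(js_remainder(e, length) + length, length)


print(js_remainder(-1, 5), normalized_index(-1, 5), -1 % 5)  # -1 4 4
```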
pytube.cipher.throttling_nested_splice(d: list, e: int)
Nested splice function in throttling js.
In the javascript, the operation is as follows:
function(d,e){
e=(e%d.length+d.length)%d.length;
d.splice(
d.splice(
While testing, all this seemed to do is swap element 0 and e,
but the actual process is preserved in case there was an edge
case that was not considered.
pytube.cipher.throttling_prepend(d: list, e: int)
In the javascript, the operation is as follows:
function(d,e){
    e=(e%d.length+d.length)%d.length;
    d.splice(-e).reverse().forEach(
        function(f){
            d.unshift(f)
        }
    )
}
Effectively, this moves the last e elements of d to the beginning.
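
The net effect stated above can be written with slices. The sketch below is a hypothetical in-place version named `move_tail_to_front`; it assumes a non-empty list and is not the library's code.

```python
from typing import Any, List


def move_tail_to_front(d: List[Any], e: int) -> None:
    """Move the last e elements of d to the front, in place."""
    e = e % len(d)  # assumes d is non-empty
    if e:
        d[:] = d[-e:] + d[:-e]


values = [1, 2, 3, 4, 5]
move_tail_to_front(values, 2)
print(values)  # [4, 5, 1, 2, 3]
```

Slice assignment (`d[:] = ...`) keeps the same list object, so callers holding a reference see the change.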
pytube.cipher.throttling_reverse(arr: list)
Reverses the input list.
Needs to do an in-place reversal so that the passed list gets changed.
To accomplish this, we create a reversed copy, and then change each
individual element.
pytube.cipher.throttling_unshift(d: list, e: int)
Rotates the elements of the list to the right.
In the javascript, the operation is as follows:
for(e=(e%d.length+d.length)%d.length;e--;)d.unshift(d.pop())
exception pytube.exceptions.MembersOnly(video_id: str)
Video is members-only.
YouTube has special videos that are only viewable to users who have
subscribed to a content creator.
ref: https://support.google.com/youtube/answer/7544492?hl=en
exception pytube.exceptions.PytubeError
Base pytube exception that all others inherit.
This is done to not pollute the built-in exceptions, which could result
in unintended errors being unexpectedly and incorrectly handled within
implementers’ code.
exception pytube.exceptions.RegexMatchError(caller: str, pattern: Union[str, Pattern[AnyStr]])
Regex pattern did not return any matches.
class pytube.helpers.DeferredGeneratorList(generator)
A wrapper class for deferring list generation.
Pytube has some continuation generators that create web calls, which means
that any time a full list is requested, all of those web calls must be
made at once, which could lead to slowdowns. This will allow individual
elements to be queried, so that slowdowns only happen as necessary. For
example, you can iterate over elements in the list without accessing them
all simultaneously. This should allow for speed improvements for playlist
and channel interactions.
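
The deferral idea can be illustrated with a much simpler wrapper that pulls from the generator only as far as a requested index and caches what it has seen. `LazyList` is a hypothetical class written for this explanation; it supports only non-negative indexes and is not the library's implementation.

```python
from typing import Any, Iterator, List


class LazyList:
    """Hypothetical simplified wrapper; caches items as they are produced."""

    def __init__(self, generator: Iterator[Any]) -> None:
        self._generator = generator
        self._items: List[Any] = []

    def __getitem__(self, index: int) -> Any:
        if index < 0:
            raise IndexError("negative indexes are not supported in this sketch")
        # Pull from the generator only until the requested index exists.
        while len(self._items) <= index:
            try:
                self._items.append(next(self._generator))
            except StopIteration:
                raise IndexError("index out of range") from None
        return self._items[index]

    def generate_all(self) -> List[Any]:
        """Exhaust the generator and return every item."""
        self._items.extend(self._generator)
        return self._items


lazy = LazyList(n * n for n in range(1000))
print(lazy[3])  # only the first four squares have been computed
```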
generate_all()
Generate all items.
pytube.helpers.create_mock_html_json(vid_id) → Dict[str, Any]
Generate a json.gz file with sample html responses.
Parameters:
vid_id (str) – YouTube video id
Return type: dict
Returns: Dict used to generate the json.gz file
pytube.helpers.deprecated(reason: str) → Callable
This is a decorator which can be used to mark functions
as deprecated. It will result in a warning being emitted
when the function is used.
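
A decorator of this kind is commonly built on the standard `warnings` module. The sketch below shows one plausible shape under that assumption; `deprecated_sketch` is a hypothetical name and the warning text is invented.

```python
import functools
import warnings
from typing import Any, Callable


def deprecated_sketch(reason: str) -> Callable:
    """Return a decorator that emits a DeprecationWarning on each call."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                category=DeprecationWarning,
                stacklevel=2,  # point the warning at the caller
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated_sketch("use new_helper instead")
def old_helper() -> int:
    return 1
```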
pytube.helpers.generate_all_html_json_mocks()
Regenerate the video mock json files for all current test videos.
This should automatically output to the test/mocks directory.
pytube.helpers.regex_search(pattern: str, string: str, group: int) → str
Shortcut method to search a string for a given pattern.
Parameters:
pattern (str) – A regular expression pattern.
string (str) – A target string to search.
group (int) – Index of group to return.
Return type: str or tuple
Returns: Substring pattern matches.
pytube.helpers.safe_filename(s: str, max_length: int = 255) → str
Sanitize a string making it safe to use as a filename.
This function was based off the limitations outlined here:
https://en.wikipedia.org/wiki/Filename.
pytube.helpers.setup_logger(level: int = 40, log_filename: Optional[str] = None) → None
Create a configured instance of logger.
pytube.helpers.target_directory(output_path: Optional[str] = None) → str
Function for determining target directory of a download.
Returns an absolute path (if relative one given) or the current
path (if none given). Makes directory if it does not exist.
pytube.helpers.uniqueify(duped_list: List[T]) → List[T]
Remove duplicate items from a list, while maintaining list order.
Parameters:
duped_list (List) – List to remove duplicates from
Return type: List
Returns: De-duplicated list
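
Order-preserving de-duplication is typically done by tracking the items already seen. The sketch below shows that approach under the assumption that items are hashable; `dedupe_keep_order` is a hypothetical name.

```python
from typing import Hashable, List, Set, TypeVar

T = TypeVar("T", bound=Hashable)


def dedupe_keep_order(items: List[T]) -> List[T]:
    """Return items with later duplicates removed, keeping first occurrences."""
    seen: Set[T] = set()
    result: List[T] = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_keep_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```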
pytube.request.post(url, extra_headers=None, data=None, timeout=<object object>)
Send an http POST request.
Parameters:
url (str) – The URL to perform the POST request for.
extra_headers (dict) – Extra headers to add to the request
data (dict) – The data to send on the POST request
Return type: str
Returns: UTF-8 encoded string of response
pytube.request.seq_stream(url, timeout=<object object>, max_retries=0)
Read the response in sequence.
Parameters:
url (str) – The URL to perform the GET request for.
Return type: Iterable[bytes]
pytube.request.stream(url, timeout=<object object>, max_retries=0)
Read the response in chunks.
Parameters:
url (str) – The URL to perform the GET request for.
Return type: Iterable[bytes]