>>> d = {'foo': 'bar=1&var=test,em=5&t=url%20encoded'}
>>> apply_descrambler(d, 'foo')
>>> print(d)
{'foo': [{'bar': '1', 'var': 'test'}, {'em': '5', 't': 'url encoded'}]}
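The behaviour in the example above can be approximated with the standard library. This is a minimal sketch, not pytube's implementation; the name `descramble_sketch` is hypothetical, and it assumes the value is a comma-separated list of URL-encoded query strings.

```python
from urllib.parse import parse_qsl


def descramble_sketch(stream_data: dict, key: str) -> None:
    """Replace stream_data[key] with a list of decoded dicts, in place.

    Hypothetical helper: assumes the value is a comma-separated list of
    URL-encoded query strings, as in the example above.
    """
    stream_data[key] = [
        dict(parse_qsl(item))  # parse_qsl also decodes escapes such as %20
        for item in stream_data[key].split(",")
    ]


d = {"foo": "bar=1&var=test,em=5&t=url%20encoded"}
descramble_sketch(d, "foo")
print(d)
```

Like the documented function, the sketch mutates the dictionary and returns nothing.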
pytube.extract.apply_signature(stream_manifest: Dict[KT, VT], vid_info: Dict[KT, VT], js: str) → None
Apply the decrypted signature to the stream manifest.
pytube.extract.channel_name(url: str) → str
Extract the channel_name or channel_id from a YouTube url.
This function supports the following patterns:
https://youtube.com/c/channel_name/*
https://youtube.com/channel/channel_id/*
https://youtube.com/u/channel_name/*
https://youtube.com/user/channel_id/*
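The four URL styles listed above could be matched with a single regular expression. The sketch below is illustrative only: the function name `channel_path_sketch`, the regex, and the returned "/style/identifier" format are assumptions, not pytube's documented behaviour.

```python
import re

# Hypothetical pattern covering the c/, channel/, u/ and user/ URL styles.
_CHANNEL_RE = re.compile(r"youtube\.com/(c|channel|u|user)/([^/?#]+)")


def channel_path_sketch(url: str) -> str:
    """Return '/<style>/<identifier>' for a channel url (assumed format)."""
    match = _CHANNEL_RE.search(url)
    if match is None:
        raise ValueError(f"not a recognised channel url: {url}")
    style, identifier = match.groups()
    return f"/{style}/{identifier}"


print(channel_path_sketch("https://youtube.com/c/SomeChannel/videos"))
```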
pytube.extract.get_ytcfg(html: str) → str
Get the entirety of the ytcfg object.
This is built over multiple pieces, so we have to find all matches and
combine the dicts together.
pytube.extract.get_ytplayer_config(html: str) → Any
Get the YouTube player configuration data from the watch html.
Extract the ytplayer_config, which is json data embedded within the
watch html and serves as the primary source of obtaining the stream
manifest data.
pytube.extract.get_ytplayer_js(html: str) → Any
Get the YouTube player base JavaScript path.
Parameters: html (str) – The html contents of the watch page.
pytube.extract.initial_data(watch_html: str) → str
Extract the ytInitialData json from the watch_html page.
This mostly contains metadata necessary for rendering the page on-load,
such as video information, copyright notices, etc.
Parameters: watch_html (str) – Html of the watch page.
pytube.extract.initial_player_response(watch_html: str) → str
Extract the ytInitialPlayerResponse json from the watch_html page.
This mostly contains metadata necessary for rendering the page on-load,
such as video information, copyright notices, etc.
Parameters: watch_html (str) – Html of the watch page.
pytube.extract.js_url(html: str) → str
Get the base JavaScript url.
Construct the base JavaScript url, which contains the decipher
“transforms”.
pytube.extract.metadata(initial_data) → Optional[pytube.metadata.YouTubeMetadata]
Get the informational metadata for the video.
e.g.:
'Song': '강남스타일(Gangnam Style)',
'Artist': 'PSY',
'Album': 'PSY SIX RULES Pt.1',
'Licensed to YouTube by': 'YG Entertainment Inc. […]'
pytube.extract.mime_type_codec(mime_type_codec: str) → Tuple[str, List[str]]
Parse the type data.
Breaks up the data in the type key of the manifest, which contains the
mime type and codecs serialized together, and splits them into separate
elements.
Example:
mime_type_codec('audio/webm; codecs="opus"') -> ('audio/webm', ['opus'])
Parameters: mime_type_codec (str) – String containing mime type and codecs.
Return type: tuple
Returns: The mime type and a list of codecs.
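The split described above can be sketched with one regular expression. The function name `parse_mime_type_codec` and the pattern are assumptions made for illustration; the library's own pattern may differ.

```python
import re
from typing import List, Tuple


def parse_mime_type_codec(value: str) -> Tuple[str, List[str]]:
    """Split a 'type' string into (mime_type, [codecs]).

    Hypothetical helper: assumes the form  <mime>; codecs="<a>, <b>".
    """
    match = re.search(r'(\w+/\w+);\s*codecs="([^"]+)"', value)
    if match is None:
        raise ValueError(f"unrecognised type string: {value!r}")
    mime_type, codecs = match.groups()
    return mime_type, [codec.strip() for codec in codecs.split(",")]


print(parse_mime_type_codec('audio/webm; codecs="opus"'))
```

A string with several codecs, such as `'video/mp4; codecs="avc1.64001F, mp4a.40.2"'`, yields a list with one entry per codec.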
pytube.extract.playability_status(watch_html: str) → (str, str)
Return the playability status and status explanation of a video.
For example, a video may have a status of LOGIN_REQUIRED, and an explanation
of “This is a private video. Please sign in to verify that you may see it.”
This explanation is what gets incorporated into the media player overlay.
pytube.extract.playlist_id(url: str) → str
Extract the playlist_id from a YouTube url.
This function supports the following patterns:
https://youtube.com/playlist?list=playlist_id
https://youtube.com/watch?v=video_id&list=playlist_id
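Both patterns carry the identifier in the `list` query parameter, so a standard-library sketch is enough to illustrate the idea. The name `playlist_id_sketch` is hypothetical and this is not necessarily how pytube extracts it.

```python
from urllib.parse import parse_qs, urlparse


def playlist_id_sketch(url: str) -> str:
    """Return the value of the 'list' query parameter (hypothetical helper)."""
    query = parse_qs(urlparse(url).query)
    if "list" not in query:
        raise ValueError(f"no playlist id in url: {url}")
    return query["list"][0]


print(playlist_id_sketch("https://youtube.com/watch?v=abc&list=PL123"))
```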
pytube.extract.publish_date(watch_html: str)
Extract the publish date.
Parameters: watch_html (str) – The html contents of the watch page.
pytube.extract.video_id(url: str) → str
Extract the video_id from a YouTube url.
This function supports the following patterns:
https://youtube.com/watch?v=video_id
https://youtube.com/embed/video_id
https://youtu.be/video_id
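One way to cover all three patterns is a single regular expression. The sketch below assumes that a video id is exactly 11 characters drawn from letters, digits, `_` and `-`; that assumption, the regex, and the name `video_id_sketch` are illustrative and not taken from the documentation above.

```python
import re

# Assumption: ids are 11 characters from [0-9A-Za-z_-], preceded by "v=" or "/".
_VIDEO_ID_RE = re.compile(r"(?:v=|/)([0-9A-Za-z_-]{11})(?![0-9A-Za-z_-])")


def video_id_sketch(url: str) -> str:
    """Return the first 11-character id found in the url (hypothetical helper)."""
    match = _VIDEO_ID_RE.search(url)
    if match is None:
        raise ValueError(f"no video id found in url: {url}")
    return match.group(1)


print(video_id_sketch("https://youtu.be/dQw4w9WgXcQ"))
```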
pytube.extract.video_info_url_age_restricted(video_id: str, embed_html: str) → str
Construct the video_info url.
Cipher
This module contains all logic necessary to decipher the signature.
YouTube’s strategy to restrict downloading videos is to send a ciphered version
of the signature to the client, along with the decryption algorithm obfuscated
in JavaScript. For the client to play a video, JavaScript must take the
ciphered version, cycle it through a series of “transform functions,” and then
sign the media URL with the output.
This module is responsible for (1) finding and extracting those “transform
functions,” (2) mapping them to Python equivalents, and (3) taking the ciphered
signature and decoding it.
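As a rough illustration of step (3), the sketch below cycles a string through a list of Python transforms. The three transforms mirror the reverse, splice and swap operations that appear in the get_transform_object example on this page; the names `decipher_sketch` and `example_plan`, and the plan itself, are made up for the demonstration and do not represent a real signature.

```python
from typing import Callable, List, Sequence, Tuple

Transform = Callable[[List[str], int], List[str]]


def reverse(arr: List[str], _: int) -> List[str]:
    """Python stand-in for a JavaScript a.reverse() transform."""
    return arr[::-1]


def splice(arr: List[str], b: int) -> List[str]:
    """Python stand-in for a.splice(0, b): drop the first b items."""
    return arr[b:]


def swap(arr: List[str], b: int) -> List[str]:
    """Swap the first item with the item at index b % len(arr)."""
    out = list(arr)
    i = b % len(out)
    out[0], out[i] = out[i], out[0]
    return out


def decipher_sketch(ciphered: str, plan: Sequence[Tuple[Transform, int]]) -> str:
    """Apply each (transform, argument) pair of the plan in order."""
    chars = list(ciphered)
    for transform, argument in plan:
        chars = transform(chars, argument)
    return "".join(chars)


# Made-up plan, for illustration only.
example_plan = [(reverse, 0), (splice, 2), (swap, 3)]
print(decipher_sketch("abcdefgh", example_plan))  # cedfba
```

Each transform returns a new list rather than mutating its input, which keeps the steps easy to test in isolation.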
pytube.cipher.get_initial_function_name(js: str) → str
Extract the name of the function responsible for computing the signature.
Parameters: js (str) – The contents of the base.js asset file.
pytube.cipher.get_throttling_plan(js: str)
Extract the “throttling plan”.
The “throttling plan” is a list of tuples used for calling functions
in the c array. The first element of the tuple is the index of the
function to call, and any remaining elements of the tuple are arguments
to pass to that function.
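The tuple layout described above can be pictured with a toy interpreter. This is a sketch under stated assumptions: `run_plan_sketch` is a hypothetical name, the array `c` here holds only callables, and the remaining tuple elements are passed as literal arguments. A real plan extracted from base.js may encode its arguments differently.

```python
from typing import Any, Callable, List, Sequence, Tuple


def run_plan_sketch(
    c: List[Callable[..., Any]], plan: Sequence[Tuple[Any, ...]]
) -> List[Any]:
    """Call c[index](*args) for every (index, *args) tuple in the plan."""
    results = []
    for index, *args in plan:
        results.append(c[index](*args))
    return results


# Toy "c array" with two functions: reverse a string, rotate a string left.
c = [lambda s: s[::-1], lambda s, n: s[n:] + s[:n]]
plan = [(0, "abc"), (1, "abcdef", 2)]
print(run_plan_sketch(c, plan))  # ['cba', 'cdefab']
```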
pytube.cipher.get_transform_map(js: str, var: str) → Dict[KT, VT]
Build a transform function lookup.
Build a lookup table of obfuscated JavaScript function names to the
Python equivalents.
Parameters:
- js (str) – The contents of the base.js asset file.
- var (str) – The obfuscated variable name that stores an object with all functions
that descramble the signature.
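A lookup of this kind could be built by matching each JavaScript function body against a few known shapes. The sketch below works on entries of the form shown in the get_transform_object example on this page; the regex patterns, the Python stand-ins, and the name `build_transform_map_sketch` are assumptions, not the library's actual tables.

```python
import re
from typing import Callable, Dict, List


def _swap(arr: List[str], b: int) -> List[str]:
    """Swap the first item with the item at index b % len(arr)."""
    out = list(arr)
    i = b % len(out)
    out[0], out[i] = out[i], out[0]
    return out


# Hypothetical (pattern, Python equivalent) pairs, checked in order.
_MAPPERS = (
    (r"\.reverse\(\)", lambda arr, _: arr[::-1]),
    (r"\.splice\(0,\w\)", lambda arr, b: arr[b:]),
    (r"var\s\w=\w\[0\];", _swap),
)


def build_transform_map_sketch(transform_object: List[str]) -> Dict[str, Callable]:
    """Map each obfuscated function name to a Python stand-in."""
    mapping: Dict[str, Callable] = {}
    for entry in transform_object:
        name, body = entry.split(":", 1)
        for pattern, python_fn in _MAPPERS:
            if re.search(pattern, body):
                mapping[name] = python_fn
                break
        else:
            raise ValueError(f"no Python equivalent for: {body}")
    return mapping


entries = [
    "AJ:function(a){a.reverse()}",
    "VR:function(a,b){a.splice(0,b)}",
    "kT:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c}",
]
transform_map = build_transform_map_sketch(entries)
print(sorted(transform_map))  # ['AJ', 'VR', 'kT']
```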
pytube.cipher.get_transform_object(js: str, var: str) → List[str]
Extract the “transform object”.
The “transform object” contains the function definitions referenced in the
“transform plan”. The var argument is the obfuscated variable name
which contains these functions. For example, given the function call
DE.AJ(a,15) returned by the transform plan, “DE” would be the var.
Parameters:
- js (str) – The contents of the base.js asset file.
- var (str) – The obfuscated variable name that stores an object with all functions
that descramble the signature.
>>> get_transform_object(js, 'DE')
['AJ:function(a){a.reverse()}',
'VR:function(a,b){a.splice(0,b)}',
'kT:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b]=c}']
pytube.cipher.get_transform_plan(js: str) → List[str]
Extract the “transform plan”.
The “transform plan” is the list of functions that the ciphered signature is
cycled through to obtain the actual signature.
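Each step of such a plan is a JavaScript call string like the DE.AJ(a,15) example mentioned on this page. A minimal sketch for pulling one step apart follows; the helper name `parse_plan_step` and the regex are assumptions, and the pattern only handles plain word-character identifiers.

```python
import re
from typing import Tuple

# Hypothetical pattern: <var>.<function>(<array name>,<integer argument>)
_CALL_RE = re.compile(r"^(\w+)\.(\w+)\(\w+,(\d+)\)$")


def parse_plan_step(step: str) -> Tuple[str, str, int]:
    """Split 'DE.AJ(a,15)' into ('DE', 'AJ', 15)."""
    match = _CALL_RE.match(step)
    if match is None:
        raise ValueError(f"unrecognised plan step: {step!r}")
    var, function_name, argument = match.groups()
    return var, function_name, int(argument)


print(parse_plan_step("DE.AJ(a,15)"))  # ('DE', 'AJ', 15)
```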
pytube.cipher.js_splice(arr: list, start: int, delete_count=None, *items)
Implementation of JavaScript’s splice function.
Parameters: