sum(random() < p for i in range(n))
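The number of trials n should be a non-negative integer. The probability of success p should be in the range 0.0 <= p <= 1.0. The result is an integer in the range 0 <= X <= n.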
Added in version 3.12.
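As a rough illustration of the equivalence stated above, the sketch below implements the sum-of-trials expression directly and checks that the results stay within 0 <= X <= n. The helper name `bernoulli_sum` and the seed are assumptions made for this example and are not part of the module; on Python 3.12 or later, `binomialvariate(n, p)` is the call to use in practice.

```python
from random import Random

def bernoulli_sum(rng, n, p):
    """Count successes in n independent trials, each succeeding with probability p."""
    return sum(rng.random() < p for _ in range(n))

rng = Random(12345)  # arbitrary seed, used only to make the run repeatable
draws = [bernoulli_sum(rng, 10, 0.3) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)  # expected to be close to n * p = 3.0
```

Every value in `draws` lies between 0 and 10, and the sample mean should land near 3.0.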
random.uniform(a, b)
Return a random floating-point number N such that a <= N <= b for a <= b and b <= N <= a for b < a.
The end-point value b may or may not be included in the range, depending on floating-point rounding in the expression a + (b-a) * random().
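The following minimal sketch shows that the endpoints may be passed in either order, and evaluates the expression quoted above by hand. The seed and sample sizes are arbitrary choices for the example.

```python
from random import Random

rng = Random(2024)  # arbitrary seed for repeatability

# Endpoints may be given in either order; results stay between them.
ascending = [rng.uniform(2.5, 10.0) for _ in range(1_000)]
descending = [rng.uniform(10.0, 2.5) for _ in range(1_000)]

# The expression described in the text, evaluated directly.
a, b = 2.5, 10.0
manual = a + (b - a) * rng.random()
```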
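random.expovariate(lambd=1.0)
Exponential distribution. lambd is 1.0 divided by the desired mean. It should be nonzero. (The parameter would be called "lambda", but that is a reserved word in Python.) Returned values range from 0 to positive infinity if lambd is positive, and from negative infinity to 0 if lambd is negative.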
Changed in version 3.12: Added the default value for lambd.
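To make the relationship between lambd and the mean concrete, the sketch below draws samples with lambd set to 1.0 divided by a desired mean of 5.0 and compares the sample mean against that target. It also draws with a negative lambd to show the sign of the results. The seed and sample sizes are assumptions for illustration.

```python
from random import Random
from statistics import fmean

rng = Random(7)  # arbitrary seed
desired_mean = 5.0
lambd = 1.0 / desired_mean

samples = [rng.expovariate(lambd) for _ in range(100_000)]
sample_mean = fmean(samples)  # expected to be close to desired_mean

# A negative lambd mirrors the distribution onto the non-positive numbers.
negative = [rng.expovariate(-lambd) for _ in range(1_000)]
```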
random.gammavariate(alpha, beta)
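Gamma distribution. (Not the gamma function!) The shape and scale parameters, alpha and beta, must have positive values. (Calling conventions vary, and some sources define 'beta' as the inverse of the scale.)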
The probability distribution function is:
          x ** (alpha - 1) * math.exp(-x / beta)
pdf(x) =  --------------------------------------
            math.gamma(alpha) * beta ** alpha
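The density above can be written as a small Python function. This is only a sketch: the name `gamma_pdf`, the seed, and the parameter values are assumptions for illustration. With alpha = 1 the expression reduces to an exponential density, and for the shape/scale convention used here the theoretical mean is alpha * beta, which the samples should approximate.

```python
import math
from random import Random
from statistics import fmean

def gamma_pdf(x, alpha, beta):
    """Density of the gamma distribution with shape alpha and scale beta."""
    return (x ** (alpha - 1) * math.exp(-x / beta)
            / (math.gamma(alpha) * beta ** alpha))

# With alpha = 1 and beta = 1 the density at x is simply exp(-x).
density_at_one = gamma_pdf(1.0, alpha=1.0, beta=1.0)

rng = Random(99)  # arbitrary seed
alpha, beta = 2.0, 3.0
samples = [rng.gammavariate(alpha, beta) for _ in range(100_000)]
sample_mean = fmean(samples)  # theoretical mean is alpha * beta = 6.0
```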
random.gauss(mu=0.0, sigma=1.0)
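Normal distribution, also called the Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly faster than the normalvariate() function defined below.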
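Multithreading note: When two threads call this function simultaneously, it is possible that they will receive the same return value. This can be avoided in three ways. 1) Have each thread use a different instance of the random number generator. 2) Put locks around all calls. 3) Use the slower, but thread-safe normalvariate() function instead.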
Changed in version 3.11: mu and sigma now have default arguments.
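The first of the three options in the multithreading note (one generator instance per thread) might look like the sketch below. The function name `worker`, the seeds, and the thread count are assumptions for illustration only.

```python
import threading
from random import Random

def worker(seed, out, index):
    rng = Random(seed)  # private generator, not shared with other threads
    out[index] = [rng.gauss(0.0, 1.0) for _ in range(1_000)]

results = [None] * 4
threads = [threading.Thread(target=worker, args=(seed, results, i))
           for i, seed in enumerate(range(100, 104))]  # arbitrary seeds
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each thread owns its generator, no locking is needed around the gauss() calls, and each thread's stream depends only on its own seed.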
Class that implements the default pseudo-random number generator used by the random module.
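Changed in version 3.11: Formerly the seed could be any hashable object. Now it must be one of the following types: None, int, float, str, bytes, or bytearray.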
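Subclasses of Random should override the following methods if they wish to make use of a different basic generator: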
seed(a=None, version=2)
Override this method in subclasses to customise the seed() behaviour of Random instances.
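Notes on Reproducibility
Most of the random module's algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change:
If a new seeding method is added, then a backward compatible seeder will be offered.
The generator's random() method will continue to produce the same sequence when the compatible seeder is given the same seed.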
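A minimal sketch of what repeatable output looks like in practice: two generators created with the same seed, and one generator that is re-seeded, all yield the same random() sequence. The seed value is arbitrary. The sketch only demonstrates same-seed behaviour within a single interpreter; it does not by itself verify the cross-version guarantee.

```python
from random import Random

first = Random(2023)
second = Random(2023)
run_a = [first.random() for _ in range(5)]
run_b = [second.random() for _ in range(5)]

# Re-seeding an existing generator restarts the same sequence.
first.seed(2023)
run_c = [first.random() for _ in range(5)]
```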
>>> random() # Random float: 0.0 <= x < 1.0
>>> uniform(2.5, 10.0) # Random float: 2.5 <= x <= 10.0
>>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds
>>> randrange(10) # Integer from 0 to 9 inclusive
>>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive
>>> choice(['win', 'lose', 'draw']) # Single random element from a sequence
>>> deck = 'ace two three four'.split()
>>> shuffle(deck) # Shuffle a list
['four', 'two', 'ace', 'three']
>>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement
[40, 10, 50, 30]
>>> # Six roulette wheel spins (weighted sampling with replacement)
>>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
['red', 'green', 'black', 'black', 'red', 'black']
>>> # Deal 20 cards without replacement from a deck
>>> # of 52 playing cards, and determine the proportion of cards
>>> # with a ten-value: ten, jack, queen, or king.
>>> deal = sample(['tens', 'low cards'], counts=[16, 36], k=20)
>>> deal.count('tens') / 20
>>> # Estimate the probability of getting 5 or more heads from 7 spins
>>> # of a biased coin that settles on heads 60% of the time.
>>> sum(binomialvariate(n=7, p=0.6) >= 5 for i in range(10_000)) / 10_000
>>> # Probability of the median of 5 samples being in middle two quartiles
>>> def trial():
...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
>>> sum(trial() for i in range(10_000)) / 10_000
Example of statistical bootstrapping using resampling with replacement to estimate a confidence interval for the mean of a sample:
from statistics import fmean as mean
from random import choices
data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95]
means = sorted(mean(choices(data, k=len(data))) for i in range(100))
print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
      f'interval from {means[5]:.1f} to {means[94]:.1f}')
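Example of a resampling permutation test to determine the statistical significance or p-value of an observed difference between the effects of a drug versus a placebo: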
# Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
from statistics import fmean as mean
from random import shuffle
drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
observed_diff = mean(drug) - mean(placebo)
n = 10_000
count = 0
combined = drug + placebo
for i in range(n):
    shuffle(combined)  # reassign the drug/placebo labels at random
    new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
    count += (new_diff >= observed_diff)
print(f'{n} label reshufflings produced only {count} instances with a difference')
print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
print(f'hypothesis that there is no difference between the drug and the placebo.')
Simulation of arrival times and service deliveries for a multiserver queue:
from heapq import heapify, heapreplace
from random import expovariate, gauss
from statistics import mean, quantiles
average_arrival_interval = 5.6
average_service_time = 15.0
stdev_service_time = 3.5
num_servers = 3
waits = []
arrival_time = 0.0
servers = [0.0] * num_servers  # time when each server becomes available
heapify(servers)
for i in range(1_000_000):
    arrival_time += expovariate(1.0 / average_arrival_interval)
    next_server_available = servers[0]
    wait = max(0.0, next_server_available - arrival_time)
    waits.append(wait)  # record the wait so the statistics below have data
    service_duration = max(0.0, gauss(average_service_time, stdev_service_time))
    service_completed = arrival_time + wait + service_duration
    heapreplace(servers, service_completed)
print(f'Mean wait: {mean(waits):.1f} Max wait: {max(waits):.1f}')
print('Quartiles:', [round(q, 1) for q in quantiles(waits)])
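Statistics for Hackers, a video tutorial by Jake Vanderplas on statistical analysis using just a few fundamental concepts, including simulation, sampling, shuffling, and cross-validation.
Economics Simulation, a simulation of a marketplace by Peter Norvig that shows effective use of many of the tools and distributions provided by this module (gauss, uniform, sample, betavariate, choice, triangular, and randrange).
A Concrete Introduction to Probability (using Python), a tutorial by Peter Norvig covering the basics of probability theory, how to write simulations, and how to perform data analysis using Python.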
These recipes show how to efficiently make random selections from the combinatoric iterators in the itertools module:
import random

def random_product(*args, repeat=1):
    "Random selection from itertools.product(*args, **kwds)"
    pools = [tuple(pool) for pool in args] * repeat
    return tuple(map(random.choice, pools))

def random_permutation(iterable, r=None):
    "Random selection from itertools.permutations(iterable, r)"
    pool = tuple(iterable)
    r = len(pool) if r is None else r
    return tuple(random.sample(pool, r))

def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)

def random_combination_with_replacement(iterable, r):
    "Choose r elements with replacement.  Order the result to match the iterable."
    # Result will be in set(itertools.combinations_with_replacement(iterable, r)).
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.choices(range(n), k=r))
    return tuple(pool[i] for i in indices)
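As a quick sanity check of one of these recipes, the sketch below repeats random_combination() and confirms that every result is a member of the full set produced by itertools.combinations(). Enumerating the full set is done here only for verification on a tiny input; the recipe itself never materializes it. The input string and repetition count are arbitrary.

```python
import itertools
import random

def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)

letters = 'ABCDE'
all_pairs = set(itertools.combinations(letters, 2))  # every 2-element combination
picks = [random_combination(letters, 2) for _ in range(200)]
```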
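The default random() returns multiples of 2⁻⁵³ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced and are exactly representable as Python floats. However, many other representable floats in that interval are not possible selections. For example, 0.05954861408025609 isn't an integer multiple of 2⁻⁵³.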
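The claim about multiples of 2⁻⁵³ can be checked with a short sketch. Scaling a float in this range by 2⁵³ is exact, so the scaled value is a whole number only when the original was a multiple of 2⁻⁵³. The helper name `is_multiple_of_2_pow_minus_53` is an assumption made for this example.

```python
from random import random

SCALE = 2 ** 53

def is_multiple_of_2_pow_minus_53(x):
    """Return True if the float x is an exact integer multiple of 2**-53."""
    # Multiplying by a power of two is exact for floats in [0.0, 1.0).
    return (x * SCALE).is_integer()

draws_ok = all(is_multiple_of_2_pow_minus_53(random()) for _ in range(10_000))
counterexample_ok = is_multiple_of_2_pow_minus_53(0.05954861408025609)
```

Here `draws_ok` reports on values produced by the default random(), while `counterexample_ok` tests the float quoted in the text above.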
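The following recipe takes a different approach. All floats in the interval are possible selections. The mantissa comes from a uniform distribution of integers in the range 2⁵² ≤ mantissa < 2⁵³. The exponent comes from a geometric distribution where exponents smaller than -53 occur half as often as the next larger exponent.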
from random import Random
from math import ldexp
class FullRandom(Random):

    def random(self):
        mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
        exponent = -53
        x = 0
        while not x:
            x = self.getrandbits(32)
            exponent += x.bit_length() - 32
        return ldexp(mantissa, exponent)
All real valued distributions in the class will use the new method:
>>> fr = FullRandom()
>>> fr.random()
>>> fr.expovariate(0.25)
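The recipe is conceptually equivalent to an algorithm that chooses from all the multiples of 2⁻¹⁰⁷⁴ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced, but most have to be rounded down to the nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest positive unnormalized float and is equal to math.ulp(0.0).)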
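The parenthetical about 2⁻¹⁰⁷⁴ can be confirmed with a few lines. This sketch assumes Python 3.9 or later, where math.ulp() is available; the variable names are illustrative.

```python
import math

smallest = math.ulp(0.0)                         # smallest positive float
same_value = smallest == math.ldexp(1.0, -1074)  # i.e. 2**-1074
halves_to_zero = smallest / 2 == 0.0             # no positive float lies below it
```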
Generating Pseudo-random Floating-Point Values, a paper by Allen B. Downey describing ways to generate more fine-grained floats than normally generated by random().
Command-line usage
Added in version 3.13.
The random module can be executed from the command line.
python -m random [-h] [-c CHOICE [CHOICE ...] | -i N | -f N] [input ...]
The following options are accepted:
-h, --help
Show the help message and exit.
-f <N>, --float <N>
Print a random floating-point number between 0 and N inclusive, using uniform().
If no options are given, the output depends on the input:
String or multiple: same as --choice
Integer: same as --integer
Float: same as --float
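The dispatch rule described above can be sketched as follows. This is not the module's actual implementation: the function name `classify_input` is hypothetical, and treating an empty input list as an error is an assumption, since the text does not describe that case.

```python
def classify_input(args):
    """Return which option the given inputs would be treated as (sketch only)."""
    if not args:
        # Not covered by the rules above; rejected here as an assumption.
        raise ValueError("at least one input is required")
    if len(args) > 1:
        return "choice"      # multiple inputs: same as --choice
    text = args[0]
    try:
        int(text)
        return "integer"     # a single integer: same as --integer
    except ValueError:
        pass
    try:
        float(text)
        return "float"       # a single float: same as --float
    except ValueError:
        return "choice"      # a single non-numeric string: same as --choice
```

For instance, `classify_input(["6"])` yields "integer" and `classify_input(["1.8"])` yields "float", while any list of several inputs is treated as a choice.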
Command-line example
Here are some examples of the random command-line interface:
$ # Choose one at random
$ python -m random egg bacon sausage spam "Lobster Thermidor aux crevettes with a Mornay sauce"
Lobster Thermidor aux crevettes with a Mornay sauce
$ # Random integer
$ python -m random 6
$ # Random floating-point number
$ python -m random 1.8
$ # With explicit arguments
$ python -m random --choice egg bacon sausage spam "Lobster Thermidor aux crevettes with a Mornay sauce"
$ python -m random --integer 6
$ python -m random --float 1.8
$ python -m random --integer 6
$ python -m random --float 6
