pdf(x) = (x ** (alpha - 1) * math.exp(-x / beta)) / (math.gamma(alpha) * beta ** alpha)
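The density above can be written as a small Python function for checking values by hand. This is only a sketch: the name `gamma_pdf` is an illustrative assumption and is not part of the random module, and it assumes x > 0, alpha > 0, and beta > 0.

```python
import math
import random

def gamma_pdf(x, alpha, beta):
    """Evaluate the gamma density shown above (hypothetical helper, not a random-module API)."""
    return (x ** (alpha - 1) * math.exp(-x / beta)) / (math.gamma(alpha) * beta ** alpha)

# With alpha=1 the density reduces to the exponential density exp(-x / beta) / beta.
print(gamma_pdf(2.0, alpha=1.0, beta=2.0))

# The mean of a gamma distribution is alpha * beta, so a large sample drawn with
# gammavariate() should average close to that value.
rng = random.Random(12345)  # seeded only to make the sketch repeatable
samples = [rng.gammavariate(2.0, 3.0) for _ in range(100_000)]
print(sum(samples) / len(samples))  # expected to be near 2.0 * 3.0
```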
random.gauss(mu=0.0, sigma=1.0)
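Normal distribution, also called the Gaussian distribution. mu is the mean, and sigma is the standard deviation. This is slightly faster than the normalvariate() function defined below.
Multithreading note: When two threads call this function simultaneously, it is possible that they will receive the same return value. This can be avoided in three ways. 1) Have each thread use a different instance of the random number generator. 2) Put locks around all calls. 3) Use the slower, but thread-safe normalvariate() function instead.
Changed in version 3.11: mu and sigma now have default arguments.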
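The first option in the multithreading note (one generator instance per thread) might look like the sketch below. The `worker` function, the seeds 0 to 3, and the thread count are illustrative assumptions, not part of the module.

```python
import random
import threading

def worker(seed, out, index):
    # Each thread owns its own Random instance, so no gauss() state is
    # shared between threads.
    rng = random.Random(seed)
    out[index] = [rng.gauss(0.0, 1.0) for _ in range(5)]

results = [None] * 4
threads = [threading.Thread(target=worker, args=(seed, results, i))
           for i, seed in enumerate(range(4))]
for t in threads:
    t.start()
for t in threads:
    t.join()

for values in results:
    print(values)
```

Seeding each instance differently also keeps the per-thread streams reproducible from run to run, regardless of how the threads are scheduled.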
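Notes on Reproducibility
Sometimes it is useful to be able to reproduce the sequences given by a pseudo-random number generator. By reusing a seed value, the same sequence should be reproducible from run to run as long as multiple threads are not running.
Most of the random module's algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change: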
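If a new seeding method is added, then a backward compatible seeder will be offered.
The generator's random() method will continue to produce the same sequence when the compatible seeder is given the same seed.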
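A minimal sketch of reusing a seed to reproduce a sequence within a single-threaded program. The seed value 2023 is arbitrary and chosen only for illustration.

```python
import random

first = random.Random(2023)   # the seed value 2023 is arbitrary
second = random.Random(2023)

run_a = [first.random() for _ in range(3)]
run_b = [second.random() for _ in range(3)]
print(run_a == run_b)  # True: same seed, same sequence

# Re-seeding the module-level generator works the same way.
random.seed(2023)
x = random.random()
random.seed(2023)
y = random.random()
print(x == y)  # True
```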
Basic examples:
>>> random() # Random float: 0.0 <= x < 1.0
0.37444887175646646
>>> uniform(2.5, 10.0) # Random float: 2.5 <= x <= 10.0
3.1800146073117523
>>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds
5.148957571865031
>>> randrange(10) # Integer from 0 to 9 inclusive
>>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive
>>> choice(['win', 'lose', 'draw']) # Single random element from a sequence
'draw'
>>> deck = 'ace two three four'.split()
>>> shuffle(deck) # Shuffle a list in place (returns None)
>>> deck
['four', 'two', 'ace', 'three']
>>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement
[40, 10, 50, 30]
>>> # Six roulette wheel spins (weighted sampling with replacement)
>>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
['red', 'green', 'black', 'black', 'red', 'black']
>>> # Deal 20 cards without replacement from a deck
>>> # of 52 playing cards, and determine the proportion of cards
>>> # with a ten-value: ten, jack, queen, or king.
>>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20)
>>> dealt.count('tens') / 20
>>> # Estimate the probability of getting 5 or more heads from 7 spins
>>> # of a biased coin that settles on heads 60% of the time.
>>> def trial():
...     return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5
...
>>> sum(trial() for i in range(10_000)) / 10_000
0.4169
>>> # Probability of the median of 5 samples being in middle two quartiles
>>> def trial():
...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
...
>>> sum(trial() for i in range(10_000)) / 10_000
0.7958
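The simulated estimate for the biased-coin example above (5 or more heads from 7 spins at 60% heads) can be cross-checked against the exact binomial probability. This is a sketch; the helper name `prob_at_least` is an illustrative assumption.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """Exact P(X >= k_min) for X ~ Binomial(n, p) (hypothetical helper name)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(k_min, n + 1))

exact = prob_at_least(5, 7, 0.60)
print(f'{exact:.4f}')  # 0.4199
```

The simulated value of 0.4169 shown above is within ordinary sampling error of this exact figure for 10,000 trials.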
Example of statistical bootstrapping using resampling with replacement to estimate a confidence interval for the mean of a sample:
# https://www.thoughtco.com/example-of-bootstrapping-3126155
from statistics import fmean as mean
from random import choices
data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95]
means = sorted(mean(choices(data, k=len(data))) for i in range(100))
print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
      f'interval from {means[5]:.1f} to {means[94]:.1f}')
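Example of a resampling permutation test to determine the statistical significance or p-value of an observed difference between the effects of a drug versus a placebo: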
# Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
from statistics import fmean as mean
from random import shuffle
drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
observed_diff = mean(drug) - mean(placebo)
n = 10_000
count = 0
combined = drug + placebo
for i in range(n):
    shuffle(combined)
    new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
    count += (new_diff >= observed_diff)
print(f'{n} label reshufflings produced only {count} instances with a difference')
print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
print(f'hypothesis that there is no difference between the drug and the placebo.')
Simulation of arrival times and service deliveries for a multiserver queue:
from heapq import heapify, heapreplace
from random import expovariate, gauss
from statistics import mean, quantiles
average_arrival_interval = 5.6
average_service_time = 15.0
stdev_service_time = 3.5
num_servers = 3
waits = []
arrival_time = 0.0
servers = [0.0] * num_servers # time when each server becomes available
heapify(servers)
for i in range(1_000_000):
    arrival_time += expovariate(1.0 / average_arrival_interval)
    next_server_available = servers[0]
    wait = max(0.0, next_server_available - arrival_time)
    waits.append(wait)
    service_duration = max(0.0, gauss(average_service_time, stdev_service_time))
    service_completed = arrival_time + wait + service_duration
    heapreplace(servers, service_completed)
print(f'Mean wait: {mean(waits):.1f} Max wait: {max(waits):.1f}')
print('Quartiles:', [round(q, 1) for q in quantiles(waits)])
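Statistics for Hackers, a video tutorial by Jake Vanderplas on statistical analysis using just a few fundamental concepts including simulation, sampling, shuffling, and cross-validation.
Economics Simulation, a simulation of a marketplace by Peter Norvig that shows effective use of many of the tools and distributions provided by this module (gauss, uniform, sample, betavariate, choice, triangular, and randrange).
A Concrete Introduction to Probability (using Python), a tutorial by Peter Norvig covering the basics of probability theory, how to write simulations, and how to perform data analysis using Python.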
Recipes
These recipes show how to efficiently make random selections from the combinatoric iterators in the itertools module:
import random

def random_product(*args, repeat=1):
    "Random selection from itertools.product(*args, repeat=repeat)"
    pools = [tuple(pool) for pool in args] * repeat
    return tuple(map(random.choice, pools))

def random_permutation(iterable, r=None):
    "Random selection from itertools.permutations(iterable, r)"
    pool = tuple(iterable)
    r = len(pool) if r is None else r
    return tuple(random.sample(pool, r))

def random_combination(iterable, r):
    "Random selection from itertools.combinations(iterable, r)"
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)

def random_combination_with_replacement(iterable, r):
    "Choose r elements with replacement. Order the result to match the iterable."
    # Result will be in set(itertools.combinations_with_replacement(iterable, r)).
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.choices(range(n), k=r))
    return tuple(pool[i] for i in indices)
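The default random() returns multiples of 2⁻⁵³ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced and are exactly representable as Python floats. However, many other representable floats in that interval are not possible selections. For example, 0.05954861408025609 isn't an integer multiple of 2⁻⁵³.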
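One way to check whether a float is a multiple of 2⁻⁵³ is sketched below. The helper name is an illustrative assumption, and the statement about what random() returns relies on the CPython implementation described in the paragraph above. The same check can be applied to the example value quoted above.

```python
import random

def is_multiple_of_2_to_minus_53(x):
    # Scaling a float by a power of two is exact in this range, so the
    # product is an integer exactly when x is a multiple of 2**-53.
    return (x * 2 ** 53).is_integer()

rng = random.Random(7)  # arbitrary seed, used only for repeatability
print(all(is_multiple_of_2_to_minus_53(rng.random()) for _ in range(1_000)))  # True

# A representable float in the interval that is not a multiple of 2**-53:
print(is_multiple_of_2_to_minus_53(2 ** -60))  # False
```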
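The following recipe takes a different approach. All floats in the interval are possible selections. The mantissa comes from a uniform distribution of integers in the range 2⁵² ≤ mantissa < 2⁵³. The exponent comes from a geometric distribution where exponents smaller than -53 occur half as often as the next larger exponent.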
from random import Random
from math import ldexp
class FullRandom(Random):

    def random(self):
        mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
        exponent = -53
        x = 0
        while not x:
            x = self.getrandbits(32)
            exponent += x.bit_length() - 32
        return ldexp(mantissa, exponent)
All real valued distributions in the class will use the new method:
>>> fr = FullRandom()
>>> fr.random()
0.05954861408025609
>>> fr.expovariate(0.25)
8.87925541791544
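The recipe is conceptually equivalent to an algorithm that chooses from all the multiples of 2⁻¹⁰⁷⁴ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced, but most have to be rounded down to the nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest positive unnormalized float and is equal to math.ulp(0.0).)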
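The relationship between math.ulp(0.0) and 2⁻¹⁰⁷⁴ can be confirmed directly. This short check assumes Python 3.9 or later, where math.ulp() is available, and IEEE 754 double-precision floats.

```python
import math

smallest = math.ulp(0.0)
print(smallest)                            # 5e-324
print(smallest == math.ldexp(1.0, -1074))  # True
print(smallest / 2)                        # 0.0, nothing smaller is representable
```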
Generating Pseudo-random Floating-Point Values, a paper by Allen B. Downey describing ways to generate more fine-grained floats than normally generated by random().
© Copyright 2001-2023, Python Software Foundation.
This page is licensed under the Python Software Foundation License Version 2.
Examples, recipes, and other code in the documentation are additionally licensed under the Zero Clause BSD License.
See History and License for more information.
Last updated on Sep 21, 2023.