利用 Python 可變預設參數特性,保留歷史報價與計算
December 1, 2023
可變預設參數 (Mutable Default Arguments)
函數的預設參數如果是可變物件 (Mutable Default Arguments)
在 Python 中就只會初始化一次,並不斷復用:
def doubled(val, the_list=[]):
the_list.append(val)
the_list.append(val)
return the_list
print(doubled(10))
# [10, 10]
print(doubled(99))
# [10, 10, 99, 99]
之所以會這樣,是因為函數是物件,預設參數如同屬性。
而常見的可變物件包含字典、列表等。
在一般情況中,我們都會鼓勵預設參數應該盡量使用不可變物件。
利用該特性,保留歷史報價與計算
然而在輕量處理快速大量湧入的報價時,反而可以利用這個機制
設計暫存區(Buffer)存放歷史報價在可變的預設參數:
# 計算平均 Bid Price
# 並保留最多 maxBuffer 個 Bid Price
def avgBid(newBid, lookback, histBids=[], maxBuffer=10):
histBids.append(newBid)
histBids = histBids[-maxBuffer:]
lookback = min(lookback, len(histBids))
print(f"Bids: {histBids}")
avg = sum(histBids[-lookback:]) / lookback
return avg
for x in range(2, 100, 7):
print(f"Avg Bid: {avgBid(x, 2)}")
# 計算最近 2 個平均報價, 並保留 10 個歷史報價
#
# Bids: [2]
# Avg Bid: 2.0
# Bids: [2, 9]
# Avg Bid: 5.5
# Bids: [2, 9, 16]
# Avg Bid: 12.5
# Bids: [2, 9, 16, 23]
#
# ...
#
# Bids: [23, 30, 37, 44, 51, 58, 65, 72, 79, 86]
# Avg Bid: 82.5
# Bids: [30, 37, 44, 51, 58, 65, 72, 79, 86, 93]
# Avg Bid: 89.5
顯然的潛在威脅:副作用
請留意,這樣的操作潛在有 很大的副作用 ,例如:
函數重複呼叫,可能會混入重複或其他的輸入到 buffer 中
用 Thread 監聽並執行,並沒有 Thread safe
如果要盡可能降低副作用,可以考慮內嵌函數 (inner function) 的函數工廠
要用就生產一個,函數內的 buffer 間互相獨立:
import threading
import time
def avgBidFactory():
def avgBid(id, newBid, lookback, histBids=[], maxBuffer=10):
histBids.append(newBid)
histBids = histBids[-maxBuffer:]
print(f"({id}) Bids: {histBids}")
lookback = min(lookback, len(histBids))
avg = sum(histBids[-lookback:]) / lookback
return avg
return avgBid
def test(id):
avgBidFunc = avgBidFactory()
for x in range(2, 100, 7):
print(f"({id}) Avg Bid: {avgBidFunc(id, x, 2)}")
time.sleep(0.3)
t1 = threading.Thread(target=test, args=(1,)).start()
t2 = threading.Thread(target=test, args=(2,)).start()
# (2) Bids: [2, 9]
# (2) Avg Bid: 5.5
# (1) Bids: [2, 9]
# (1) Avg Bid: 5.5
# (2) Bids: [2, 9, 16]
# (1) Bids: [2, 9, 16]
# (1) Avg Bid: 12.5
# (2) Avg Bid: 12.5
#
# ...
#
# (1) Avg Bid: 82.5
# (2) Bids: [23, 30, 37, 44, 51, 58, 65, 72, 79, 86]
# (2) Avg Bid: 82.5
# (2) Bids: [30, 37, 44, 51, 58, 65, 72, 79, 86, 93]
# (1) Bids: [30, 37, 44, 51, 58, 65, 72, 79, 86, 93]
# (1) Avg Bid: 89.5
# (2) Avg Bid: 89.5
因此得十分確定函數在掌握之中,最好是輕量、單一的函數。
速度也不見得有優勢
另外這樣並沒有比較快(通常是更慢):
import timeit
import statistics
# Method 1: 利用可變預設參數
def avgBid1(newBid, lookback, histBids=[], maxBuffer=10):
histBids.append(newBid)
histBids = histBids[-maxBuffer:]
lookback = min(lookback, len(histBids))
avg = sum(histBids[-lookback:]) / lookback
return avg
# Method 2: 一般寫法
def avgBid2(histBids, lookback):
lookback = min(lookback, len(histBids))
avg = sum(histBids[-lookback:]) / lookback
return avg
def test1():
for x in range(1, 10000, 3):
avgBid1(x, 7)
def test2():
buffer = []
for x in range(1, 10000, 3):
buffer.append(x)
buffer = buffer[-10:]
avg = avgBid2(buffer, 7)
executionTimes1 = [timeit.timeit(test1, number=1000) for _ in range(10)]
mean1 = statistics.stdev(executionTimes1)
stdDev1 = statistics.stdev(executionTimes1)
print(f"Time1: {mean1:.6f} ±{2*stdDev1:.6f} s")
# 利用可變預設參數 Time1: 0.084309 ±0.168618 s
executionTimes2 = [timeit.timeit(test2, number=1000) for _ in range(10)]
mean2 = statistics.stdev(executionTimes2)
stdDev2 = statistics.stdev(executionTimes2)
print(f"Time2: {mean2:.6f} ±{2*stdDev2:.6f} s")
# 一般寫法 Time2: 0.021163 ±0.042325 s
結論:危機就是轉機,謹慎用之
視情況可謹慎使用,例如在業務起頭或末端的輕量處理,就可以考慮。
例如:算完就 print, 寫 file, 送出到網路 … 等。
就不需要特地還要建立一個前值變數保存,然後運算完更新前值,然後前值又暴露在一堆地方。