利用 Python 可變預設參數特性,保留歷史報價與計算

December 1, 2023

img

可變預設參數 (Mutable Default Arguments)

函數的預設參數如果是可變物件 (Mutable Default Arguments)

在 Python 中就只會初始化一次,並不斷復用:

def doubled(val, the_list=[]):
    the_list.append(val)
    the_list.append(val)
    return the_list

print(doubled(10))
# [10, 10]

print(doubled(99))
# [10, 10, 99, 99]

之所以會這樣,是因為函數是物件,預設參數如同屬性。

而常見的可變物件包含字典、列表等。

在一般情況中,我們都會鼓勵預設參數應該盡量使用不可變物件。


利用該特性,保留歷史報價與計算

然而在輕量處理快速大量湧入的報價時,反而可以利用這個機制

設計暫存區(Buffer)存放歷史報價在可變的預設參數:

# 計算平均 Bid Price
# 並保留最多 maxBuffer 個 Bid Price
def avgBid(newBid, lookback, histBids=[], maxBuffer=10):
    histBids.append(newBid)
    histBids = histBids[-maxBuffer:]
    lookback = min(lookback, len(histBids))
    print(f"Bids: {histBids}")
    avg = sum(histBids[-lookback:]) / lookback

    return avg

for x in range(2, 100, 7):
    print(f"Avg Bid: {avgBid(x, 2)}")

# 計算最近 2 個平均報價, 並保留 10 個歷史報價
#
# Bids: [2]
# Avg Bid: 2.0
# Bids: [2, 9]
# Avg Bid: 5.5
# Bids: [2, 9, 16]
# Avg Bid: 12.5
# Bids: [2, 9, 16, 23]
#
# ...
#
# Bids: [23, 30, 37, 44, 51, 58, 65, 72, 79, 86]
# Avg Bid: 82.5
# Bids: [30, 37, 44, 51, 58, 65, 72, 79, 86, 93]
# Avg Bid: 89.5


顯然的潛在威脅:副作用

請留意,這樣的操作潛在有 很大的副作用 ,例如:

函數重複呼叫,可能會混入重複或其他的輸入到 buffer 中

用 Thread 監聽並執行,並沒有 Thread safe

如果要盡可能降低副作用,可以考慮內嵌函數 (inner function) 的函數工廠

要用就生產一個,函數內的 buffer 間互相獨立:

import threading
import time

def avgBidFactory():
    def avgBid(id, newBid, lookback, histBids=[], maxBuffer=10):
        histBids.append(newBid)
        histBids = histBids[-maxBuffer:]
        print(f"({id}) Bids: {histBids}")
        lookback = min(lookback, len(histBids))
        avg = sum(histBids[-lookback:]) / lookback
        return avg
    return avgBid

def test(id):
    avgBidFunc = avgBidFactory()
    for x in range(2, 100, 7):
        print(f"({id}) Avg Bid: {avgBidFunc(id, x, 2)}")
        time.sleep(0.3)

t1 = threading.Thread(target=test, args=(1,)).start()
t2 = threading.Thread(target=test, args=(2,)).start()

# (2) Bids: [2, 9]
# (2) Avg Bid: 5.5
# (1) Bids: [2, 9]
# (1) Avg Bid: 5.5
# (2) Bids: [2, 9, 16]
# (1) Bids: [2, 9, 16]
# (1) Avg Bid: 12.5
# (2) Avg Bid: 12.5
#
# ...
#
# (1) Avg Bid: 82.5
# (2) Bids: [23, 30, 37, 44, 51, 58, 65, 72, 79, 86]
# (2) Avg Bid: 82.5
# (2) Bids: [30, 37, 44, 51, 58, 65, 72, 79, 86, 93]
# (1) Bids: [30, 37, 44, 51, 58, 65, 72, 79, 86, 93]
# (1) Avg Bid: 89.5
# (2) Avg Bid: 89.5

因此得十分確定函數在掌握之中,最好是輕量、單一的函數。


速度也不見得有優勢

另外這樣並沒有比較快(通常是更慢):

import timeit
import statistics


# Method 1: 利用可變預設參數
def avgBid1(newBid, lookback, histBids=[], maxBuffer=10):
    histBids.append(newBid)
    histBids = histBids[-maxBuffer:]
    lookback = min(lookback, len(histBids))
    avg = sum(histBids[-lookback:]) / lookback
    return avg

# Method 2: 一般寫法
def avgBid2(histBids, lookback):
    lookback = min(lookback, len(histBids))
    avg = sum(histBids[-lookback:]) / lookback
    return avg

def test1():
    for x in range(1, 10000, 3):
        avgBid1(x, 7)

def test2():
    buffer = []
    for x in range(1, 10000, 3):
        buffer.append(x)
        buffer = buffer[-10:]
        avg = avgBid2(buffer, 7)

executionTimes1 = [timeit.timeit(test1, number=1000) for _ in range(10)]
mean1 = statistics.stdev(executionTimes1)
stdDev1 = statistics.stdev(executionTimes1)
print(f"Time1: {mean1:.6f} ±{2*stdDev1:.6f} s")
# 利用可變預設參數 Time1: 0.084309 ±0.168618 s

executionTimes2 = [timeit.timeit(test2, number=1000) for _ in range(10)]
mean2 = statistics.stdev(executionTimes2)
stdDev2 = statistics.stdev(executionTimes2)
print(f"Time2: {mean2:.6f} ±{2*stdDev2:.6f} s")
# 一般寫法 Time2: 0.021163 ±0.042325 s


結論:危機就是轉機,謹慎用之

視情況可謹慎使用,例如在業務起頭或末端的輕量處理,就可以考慮。

例如:算完就 print, 寫 file, 送出到網路 … 等。

就不需要特地還要建立一個前值變數保存,然後運算完更新前值,然後前值又暴露在一堆地方。

Tags: api python