What is a good approach for a fast FIFO queue?
I am sampling from an external device in Python and storing the values in a FIFO queue. I have a fixed-size array: I enqueue a new sample at one end and dequeue the "oldest" value from the other end (terminology from https://stackabuse.com/stacks-and-queues-in-python/). I have tried several implementations, and the performance of each depends strongly on the size of the FIFO array; see the example below. Is there a faster way to implement a FIFO queue than the ones I have gathered? Also, beyond the speed I can measure for a given queue size, are there other concerns I should be aware of with these approaches?

import numpy as np
import time
import numba

@numba.njit
def fifo(sig_arr, n):
    # Shift-based FIFO: every enqueue moves all elements one slot to the left.
    for i in range(n):
        sig_arr[:-1] = sig_arr[1:]
        sig_arr[-1] = i
    return

n = 1000000 # number of enqueues/dequeues
for m in [100, 1000, 10000]: # fifo queue length
    print("FIFO array length is:" + str(m))
    print("Numpy-based queue")
    sig_arr_np = np.zeros(m)
    for _ in range(5):
        tic = time.time()
        for i in range(n):
            sig_arr_np[:-1] = sig_arr_np[1:]
            sig_arr_np[-1] = i
        print(time.time() - tic)

    print("Jitted numpy-based queue")
    sig_arr_jit = np.zeros(m)
    for _ in range(5):
        tic = time.time()
        fifo(sig_arr_jit, n)
        print(time.time()-tic)

    print("list-based queue")
    sig_arr_list = [0]*m
    for _ in range(5):
        tic = time.time()
        for i in range(n):
            sig_arr_list.append(i)
            sig_arr_list.pop(0)
        print(time.time() - tic)
print("done...")

output:

FIFO array length is:100
Numpy-based queue
0.7159860134124756
0.7160656452178955
0.7072808742523193
0.6405529975891113
0.6402220726013184
Jitted numpy-based queue
0.34624767303466797
0.10235905647277832
0.09779787063598633
0.10352706909179688
0.1059865951538086
list-based queue
0.19921231269836426
0.18682050704956055
0.178941011428833
0.190687894821167
0.18914198875427246
FIFO array length is:1000
Numpy-based queue
0.7035880088806152
0.7174069881439209
0.7061927318572998
0.7100749015808105
0.7161743640899658
Jitted numpy-based queue
0.4495429992675781
0.4449293613433838
0.4404451847076416
0.4400477409362793
0.43927478790283203
list-based queue
0.2652933597564697
0.26186203956604004
0.2784764766693115
0.27001261711120605
0.2699151039123535
FIFO array length is:10000
Numpy-based queue
2.0453989505767822
1.9288575649261475
1.9308562278747559
1.9575252532958984
2.048408269882202
Jitted numpy-based queue
5.075503349304199
5.083268404006958
5.181215286254883
5.115811109542847
5.163492918014526
list-based queue
1.2474076747894287
1.2347135543823242
1.2435767650604248
1.2809157371520996
1.237732172012329
done...

EDIT: Here I added the solution suggested by Jeff H. and set the deque to a fixed size so that the .pop() method is not needed, which makes it a little faster.

import time
from collections import deque

n = 1000000 # number of enqueues/dequeues
for m in [100, 1000, 10000]: # fifo queue length
    print("deque-list-based queue")
    d = deque([None] * m, m)  # fixed size: appending discards the oldest element
    for _ in range(3):
        tic = time.time()
        for i in range(n):
            d.append(i)
        print(time.time() - tic) 
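For reference, with maxlen set, appending to a full deque silently discards the element at the opposite end, which is why the explicit pop is no longer needed. A minimal sketch of that behavior (the variable names here are just illustrative):

```python
from collections import deque

# A fixed-size FIFO: appending to a full deque discards the oldest item.
fifo = deque(maxlen=3)
for sample in [10, 20, 30, 40]:
    fifo.append(sample)

# The first sample (10) was discarded when 40 was appended.
print(list(fifo))  # [20, 30, 40]

# popleft() still works when the oldest value should be consumed explicitly.
oldest = fifo.popleft()
print(oldest)  # 20
```

Note that a bounded deque only discards on append; if the consumer needs the discarded value, read it with popleft() before appending.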
Romonaromonda answered 10/6, 2020 at 20:48 Comment(3)
Have you profiled your script to see where it's spending most of its time? If not, I suggest you do that before trying to optimize it. See How can you profile a Python script? – Boyles
Hi @martineau. Thanks for the suggestion. No, I have not done that. However, I am only comparing the execution time of two lines of code in each implementation. – Romonaromonda
I was thinking about profiling the implementation of the FIFO. – Boyles
Why aren't you trying the natural choice, collections.deque?

All of your implementations above suffer from the same poor performance because every enqueue/dequeue shifts the entire buffer, an O(N) operation, whether the backing store is a list or a numpy array. A proper FIFO data structure does both operations in amortized constant time, O(1).

Consider:

import time
from collections import deque

n = 1000000 # number of enqueues/dequeues
for m in [100, 1000, 10000, 1_000_000]: # fifo queue length
    print(f'\nqueue length: {m}')
    print('deque')
    d = deque(range(m))
    for _ in range(5):
        tic = time.time()
        for i in range(n):
            d.append(i)
            d.pop()
        print(time.time() - tic)
print("done...")

Yields the following (note the larger m values and the near-constant time, better than all of the above at any size):

queue length: 100
deque
0.13888287544250488
0.13873004913330078
0.13820695877075195
0.1369168758392334
0.1436598300933838

queue length: 1000
deque
0.1434800624847412
0.13672494888305664
0.1380469799041748
0.14961719512939453
0.13932228088378906

queue length: 10000
deque
0.14437294006347656
0.14214491844177246
0.13336801528930664
0.14667487144470215
0.1375408172607422

queue length: 1000000
deque
0.13426589965820312
0.13596534729003906
0.13602590560913086
0.13472890853881836
0.134993314743042
done...
[Finished in 3.4s]
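As a side note, if downstream code needs the samples as a contiguous numpy array (e.g. for vectorized filtering), a circular buffer avoids shifting entirely: keep a write index and overwrite in place, so each enqueue is O(1). A rough sketch (the RingBuffer class and its method names are made up for illustration, not a library API):

```python
import numpy as np

class RingBuffer:
    """Fixed-size FIFO over a numpy array; each push is O(1)."""
    def __init__(self, size):
        self.buf = np.zeros(size)
        self.idx = 0  # position of the oldest element / next write slot

    def push(self, value):
        # Overwrite the oldest element instead of shifting the whole array.
        self.buf[self.idx] = value
        self.idx = (self.idx + 1) % len(self.buf)

    def ordered(self):
        # Materialize the contents oldest-to-newest (O(N), only when needed).
        return np.concatenate((self.buf[self.idx:], self.buf[:self.idx]))

rb = RingBuffer(4)
for v in [1, 2, 3, 4, 5]:
    rb.push(v)
print(rb.ordered())  # [2. 3. 4. 5.]
```

ordered() still costs O(N), so call it only when a linear view of the queue is actually required.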
Pennsylvania answered 10/6, 2020 at 21:16 Comment(3)
Thanks for the suggestion, Jeff H. It performs much better the larger the array, as you pointed out. When I run it, the jitted version of the "numpy solution" is slightly faster for arrays shorter than 150 elements, but the jitted version is very poor above a length of 2000 and even becomes slower than the non-jitted version. – Romonaromonda
I just realized that the speed of the jitted version for the small array is because I also jit the for loop, which I guess would not work in general. When I do not do that, the deque approach is always the best. – Romonaromonda
Talking about FIFO, you may want to use popleft instead of pop; this benchmark always removes the just-added element. – Baiss
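For completeness, the correction suggested in that comment, applied to the answer's inner loop, might look like this sketch (timings should be close to the original, since popleft is also O(1)):

```python
import time
from collections import deque

n = 100_000             # enqueue/dequeue pairs
d = deque(range(1000))  # pre-filled fixed-length FIFO
tic = time.time()
for i in range(n):
    d.append(i)
    d.popleft()         # remove the oldest element, not the one just appended
print(f"{time.time() - tic:.3f}s for {n} enqueue/dequeue pairs")
```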
