I am building a FIX engine in C++ but I don't have a reference to know what would be considered a good performance number. Taking into account the network time and the FIX parsing time, what would be a good time in microseconds for a client to send a FIX message to a server? Also does anyone know the current lowest possible latencies expected for this simple FIX-message-from-client-to-server operation?
That will depend on how fast your FIX engine can parse bytes into a FixMessage object and, more importantly, on how fast your network code is. Are you writing the network stack too? Writing a FIX engine looks simple from the outside, but it is actually a complex task with many corner cases and features you have to cover. Are you going to support retransmission? Asynchronous audit-logging? FIX session timers? Repeating groups inside repeating groups? You should consider using an open-source or commercial FIX engine.
As for the latencies you should expect, I am unaware of any FIX engine that can go below 4.5 microseconds. That is the total one-way time to write a FixMessage object to a ByteBuffer, transfer the ByteBuffer over the network to the server, and have the server read the ByteBuffer from the network and parse it back into a FixMessage object. If you are using a decent FIX engine, the bottleneck will be the network I/O, not the FIX parsing.
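Since the question is about a C++ engine, the write-and-parse halves of that one-way path can be sketched roughly as below. This is only a minimal illustration, not how CoralFIX or any other engine does it: the names `Field`, `encode`, `decode` and `parseDigits` are made up for the example, it allocates freely, and it ignores session logic, BodyLength validation and repeating groups. What it does show is the tag=value framing with the SOH delimiter, BodyLength (tag 9) and the modulo-256 CheckSum (tag 10).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

constexpr char SOH = '\x01';                // FIX field delimiter
using Field = std::pair<int, std::string>;  // (tag, value), hypothetical type

// Sum of all bytes modulo 256, as used by the FIX CheckSum field (tag 10).
unsigned checksum(std::string_view bytes) {
    unsigned sum = 0;
    for (unsigned char c : bytes) sum += c;
    return sum % 256;
}

// Parse a short, non-empty run of ASCII digits; returns -1 on anything else.
long parseDigits(std::string_view s) {
    if (s.empty() || s.size() > 9) return -1;
    long v = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return -1;
        v = v * 10 + (c - '0');
    }
    return v;
}

// Serialise the body fields (everything after tag 9) into a full message.
std::string encode(const std::string& beginString, const std::vector<Field>& body) {
    assert(!beginString.empty());
    std::string payload;
    for (const auto& [tag, value] : body)
        payload += std::to_string(tag) + '=' + value + SOH;
    std::string msg = "8=" + beginString + SOH +
                      "9=" + std::to_string(payload.size()) + SOH + payload;
    char trailer[16];
    std::snprintf(trailer, sizeof trailer, "10=%03u%c", checksum(msg), SOH);
    return msg + trailer;
}

// Split into tag=value pairs; false on malformed input or checksum mismatch.
bool decode(std::string_view msg, std::vector<Field>& out) {
    out.clear();
    std::size_t pos = 0, checksumStart = 0;
    while (pos < msg.size()) {
        const std::size_t eq = msg.find('=', pos);
        const std::size_t end = msg.find(SOH, pos);
        if (eq == std::string_view::npos || end == std::string_view::npos || eq > end)
            return false;
        const long tag = parseDigits(msg.substr(pos, eq - pos));
        if (tag < 0) return false;
        if (tag == 10) checksumStart = pos;  // checksum covers bytes before "10="
        out.emplace_back(static_cast<int>(tag),
                         std::string(msg.substr(eq + 1, end - eq - 1)));
        pos = end + 1;
    }
    if (out.empty() || out.back().first != 10) return false;
    const long declared = parseDigits(out.back().second);
    return declared >= 0 &&
           static_cast<unsigned>(declared) == checksum(msg.substr(0, checksumStart));
}
```

A Heartbeat, for instance, could be built with `encode("FIX.4.4", {{35, "0"}, {49, "CLIENT"}, {56, "SERVER"}, {34, "1"}})` and fed straight back into `decode`. A latency-sensitive engine would avoid the `std::string` allocations and parse in place, which is exactly the part worth benchmarking.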
To give you some numbers, here are the benchmark results for CoralFIX, which is a FIX engine written in Java. If you can go below that, please let me know :)
Messages: 1,800,000 (one-way)
Avg Time: 4.774 micros
Min Time: 4.535 micros
Max Time: 69.516 micros
75% = [avg: 4.712 micros, max: 4.774 micros]
90% = [avg: 4.726 micros, max: 4.825 micros]
99% = [avg: 4.761 micros, max: 5.46 micros]
99.9% = [avg: 4.769 micros, max: 7.07 micros]
99.99% = [avg: 4.772 micros, max: 9.481 micros]
99.999% = [avg: 4.773 micros, max: 24.017 micros]
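If you want to report your own C++ numbers in a comparable shape, something like the following might do. It assumes (the output above does not say so explicitly) that each percentile line gives the average and the maximum over the fastest N% of the samples; `Slice` and `fastestSlice` are names invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Slice {
    double avg;  // mean of the selected samples
    double max;  // slowest of the selected samples
};

// Average and maximum over the fastest `fraction` (0 < fraction <= 1) of the
// samples. Takes the vector by value because it sorts its own copy.
Slice fastestSlice(std::vector<double> samples, double fraction) {
    assert(!samples.empty() && fraction > 0.0 && fraction <= 1.0);
    std::sort(samples.begin(), samples.end());
    std::size_t n = static_cast<std::size_t>(samples.size() * fraction);
    if (n == 0) n = 1;  // always keep at least one sample
    const double sum = std::accumulate(samples.begin(), samples.begin() + n, 0.0);
    return {sum / static_cast<double>(n), samples[n - 1]};
}
```

Calling `fastestSlice(latenciesMicros, 0.99)` would then give the pair to print on a "99%" line, with `latenciesMicros` being whatever vector of one-way times your own benchmark collected.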
Disclaimer: I am one of the developers of CoralFIX.
Hannibal ante portas! . . . do not lose time to optimise in a wrong place
For the lowest numbers achievable in principle, do not forget to check the ASIC / FPGA-based FIX-Protocol solutions. Any sequential / concurrent serial processing has a hard time becoming faster than a parallel-silicon engine solution.
One can hardly get much finer than about a 25 ns resolution on a code-driven measurement such as aStopwatch.start(); callProcessUnderTest(); aStopwatch.stop(), but the real problems and issues are somewhere else.
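To see what resolution your own machine gives for such a start/stop measurement, a small C++ sketch along these lines may help. It uses `std::chrono::steady_clock`; the function names `minTimerOverheadNs` and `timeNs` are hypothetical, and the figure you get depends on your platform's clock source, so no particular number is implied here.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>
#include <utility>

using Clock = std::chrono::steady_clock;

// Smallest observed gap between two back-to-back clock reads, in nanoseconds.
// This is a floor on what any start()/stop() pair can resolve on this machine.
std::int64_t minTimerOverheadNs(int iterations) {
    assert(iterations > 0);
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < iterations; ++i) {
        const auto t0 = Clock::now();
        const auto t1 = Clock::now();
        const std::int64_t d =
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        best = std::min(best, d);
    }
    return best;
}

// The stopwatch pattern itself: time one call of the process under test.
template <typename F>
std::int64_t timeNs(F&& processUnderTest) {
    const auto t0 = Clock::now();
    std::forward<F>(processUnderTest)();
    const auto t1 = Clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Anything you measure with `timeNs` that is not well above `minTimerOverheadNs(...)` is mostly measuring the stopwatch, not the code.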
For comparison, 20 ns of latency corresponds to as little as about 5 m of AOC (active optical cable) in the 120 Gbps interconnects used in colocation houses / HPC clusters.
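The cable figure is easy to sanity-check. The sketch below computes one-way propagation delay from length, assuming a refractive index of about 1.468 for the fibre core (a typical value for silica, not something stated above); `fibreDelayNs` is a name made up for the example.

```cpp
#include <cassert>

constexpr double kSpeedOfLightMps = 299'792'458.0;  // in vacuum, metres per second

// One-way propagation delay in nanoseconds through `metres` of optical fibre.
// refractiveIndex = 1.468 is an assumed, typical value for silica fibre.
double fibreDelayNs(double metres, double refractiveIndex = 1.468) {
    assert(metres >= 0.0 && refractiveIndex >= 1.0);
    return metres * refractiveIndex / kSpeedOfLightMps * 1e9;
}
```

Under that assumption `fibreDelayNs(5.0)` lands between 24 and 25 ns, so a few metres of cable really do cost as much as the whole stopwatch resolution discussed above.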
0: Define REFERENCE POINTs: Avoid comparing oranges to apples
Performance tuning and latency minimisation are both thrilling and remarkable efforts.
At first sight, asking the world for the "lowest possible anything" sounds attractive, if not sexy, as is common to hear in recent media. However, any serious attempt to answer "lowest possible" is hard without proper care spent on disambiguating the problem ( if not demystifying it, especially to avoid MAR/COM-generated promises of Answers before one asks / Instant Karma / Eternal Heaven et al; you know them well enough that I need not go on ).
It is nothing new under the sun that, for this very reason, ITU-T / ITU-R and later the IETF have spent immense effort on systematically defining specifications so as to avoid any potential misunderstanding, let alone misinterpretation ( be it in definitions of standards, in acceptance-testing procedures, or in specifications of the minimum performance envelope a product / service has to meet so as to be fully interoperable ).
So before taking any figure, be it in [ms], [us] or [ns], be sure that we all have the same reference setup of the System-Under-Test, and be doubly sure between which two reference points [FROM]-[TO] the presented figure was in fact measured.
1: Define TEST SCENARIO ( incl. FIX-Protocol message ) End-to-End
________________________________________________________________________
+0 [us]-[__BaseLINE__] a decision to send anything is made @ <localhost>
|
|- <localhost> process wishes to request
| a FIX-MarketData
| Streaming Updates
| for a single item EURCHF
| during LON session opening time
|
|- <localhost> process wishes to request
| a FIX-MarketData
| Single FullRefresh
| for an itemset of:
| { EURUSD, GBPUSD, USDJPY,
| AUDUSD, USDCAD, USDCHF }
| during LON session opening time
|
+ [us]-o======< REFERENCE POINT[A] >===================================
|
|- transfer all DATA to a formatting / assembly process-entity
|
+ [us]-o======< REFERENCE POINT[B] >===================================
|
|- completed a FIX-message payload to be sent
|
+ [us]-o======< REFERENCE POINT[C] >===================================
|
|- completed a FIX-message Header/Trailer/CRC to dispatch
|
+ [us]-o======< REFERENCE POINT[D] >===================================
|
|- inlined SSH/DES/AES cryptor communication service processing
|
+ [us]-o======< REFERENCE POINT[E] >===================================
|
|- L3/2 transport service protocol SAR / PMD
|
+ [us]-o======< REFERENCE POINT[F] >===================================
|
|- L2/1 PHY-device wire-on/off-load process ( NIC / FPGA )-engine
|
+ [us]-o======< REFERENCE POINT[G] >===================================
|
|- E2E transport xmit/delivery processing / latency
|
+ [us]-o======< REFERENCE POINT[H] >===================================
|
|- L1/2 PHY-device on "receiving"-side wire-on/off-load process
|
+ [us]-o======< REFERENCE POINT[I] >===================================
|
|- L2/3 transport recv/handshaking processing / latency
|
+ [us]-o======< REFERENCE POINT[J] >===================================
|
|- inlined SSH/DES/AES decryptor processing
|
+ [us]-o======< REFERENCE POINT[K] >===================================
|
|- incoming FIX-message Header/Trailer/CRC check-in
|
+ [us]-o======< REFERENCE POINT[L] >===================================
|
|- authentication / FIX-Protocol message-counter cross-validation
|
+ [us]-o======< REFERENCE POINT[M] >===================================
|
|- FIX-message requested service content de-mapping
|
+ [us]-o======< REFERENCE POINT[N] >===================================
|
|- FIX-message requested service execution / handling
|
+ [us]-o======< REFERENCE POINT[O] >===================================
|
|- FIX-message requested service response pre-processing / assy
|
+ [us]-o======< REFERENCE POINT[P] >===================================
|
[__FinishLINE__] Ready To Send anything back to <localhost>
|
+ [us]-o======< REFERENCE POINT[Q] >===================================
________|_______________________________________________________________
: SUBTOTAL BEFORE A REQUESTED SERVICE'S RESPONSE-DELIVERY STARTS
________________________________________________________________________
As an inspiration, try to imagine a uniform latency-reporting structure for all vendors ( or for your project's internal Dev/Test teams ) like this example.
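In C++, such a structure could start out as small as the sketch below: one timestamp slot per reference point [A]..[Q] from the scenario above, plus a report of the time spent between consecutive recorded points. The class name `LatencyTrace` and its methods are invented for illustration. It only covers points stamped on one host with one clock; points on the receiving side would need synchronised clocks or hardware timestamps, which this sketch does not attempt.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>

class LatencyTrace {
public:
    using Clock = std::chrono::steady_clock;
    static constexpr std::size_t kPoints = 17;  // reference points 'A' .. 'Q'

    // Record "now" as the timestamp of the given reference point.
    void mark(char point) {
        const std::size_t i = index(point);
        stamps_[i] = Clock::now();
        seen_[i] = true;
    }

    // Elapsed nanoseconds between two recorded reference points [from]-[to].
    std::int64_t nsBetween(char from, char to) const {
        const std::size_t a = index(from), b = index(to);
        assert(seen_[a] && seen_[b]);
        return std::chrono::duration_cast<std::chrono::nanoseconds>(
                   stamps_[b] - stamps_[a]).count();
    }

    // Print one line per pair of consecutive recorded points.
    void report() const {
        bool havePrev = false;
        char prev = 'A';
        for (std::size_t i = 0; i < kPoints; ++i) {
            if (!seen_[i]) continue;
            const char cur = static_cast<char>('A' + i);
            if (havePrev)
                std::printf("[%c]->[%c] %lld ns\n", prev, cur,
                            static_cast<long long>(nsBetween(prev, cur)));
            prev = cur;
            havePrev = true;
        }
    }

private:
    static std::size_t index(char point) {
        assert(point >= 'A' && point <= 'Q');
        return static_cast<std::size_t>(point - 'A');
    }
    std::array<Clock::time_point, kPoints> stamps_{};
    std::array<bool, kPoints> seen_{};
};
```

The sender side would call `mark('A')` when the decision to send is made, `mark('D')` once header and trailer are complete, and so on, then publish `report()` alongside any headline number so that readers know which [FROM]-[TO] pair it refers to.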
2: Define FULL CONTEXT ( incl. when/what background parallel-activities (workload/noise) you test )
If you need the lowest latencies, then the FIX client and FIX server should be on the same server, or even in the same application, using an IPC solution like Disruptor.
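To give a feel for that kind of in-process hand-off in C++, here is a minimal single-producer / single-consumer ring buffer. It is not the LMAX Disruptor and has none of its batching or multi-consumer features; it is just a sketch in the same spirit, with the name `SpscRing` invented for the example. It is only safe when exactly one thread calls `tryPush` and exactly one thread calls `tryPop`.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer side: returns false if the ring is full.
    bool tryPush(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head - tail_.load(std::memory_order_acquire) == N) return false;
        slots_[head & (N - 1)] = value;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer side: returns std::nullopt if the ring is empty.
    std::optional<T> tryPop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return std::nullopt;
        T value = slots_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return value;
    }

private:
    std::array<T, N> slots_{};
    // Keep the two counters on separate cache lines to avoid false sharing
    // (64 bytes is an assumed cache-line size).
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
};
```

With both ends in one process, the "network" part of the latency budget is replaced by a couple of cache-line transfers between cores, which is why this setup gives the lowest figures but also says little about a real client-to-exchange path.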
I attended a presentation of the Singapore Stock Exchange a few years ago. They had recently purchased the Nasdaq OMX platform and claimed the lowest matching times at the time, at around 8 microseconds if messages were sent via the native protocol. They then said they support FIX, which would result in 2-3 microseconds on top of the matching time...
I guess you can use this 2-3 microsecond number as a sort of a minimal FIX overhead that an exchange claiming to be the fastest managed to achieve :)