The semantics of Linux's asynchronous file IO (AIO) are well described in the man pages of io_setup(2), io_submit(2) and io_getevents(2).
However, without diving into the block IO subsystem, the operational side of the implementation is a little less clear. An aio_context allocates a queue for sending io_events back to a specific client in user space. But is there more to it?
- Consider a file read sequentially, chunk by chunk. Can requests, especially in Direct IO (DIO), be merged? What if requests for two files are interleaved into one aio_context? What if requests for one file are sent to two different aio_contexts?
- How are requests prioritized and scheduled in the above cases, with one or multiple aio_contexts?
- Is it possible that requests from two aio_contexts get interleaved at some point? (This would cause more seek latency than intended.)
- Does the thread or the CPU calling io_submit influence how the request is scheduled? Is the NUMA node containing the target buffer taken into consideration?
More broadly, to which hardware resources (NUMA nodes, CPU cores, physical drives, file systems and files) should aio_contexts be assigned, and at what level of granularity?
Maybe it doesn't really matter and aio_contexts are no more than an abstraction for user-space programs. I'm asking because I have observed a performance decrease when concurrently reading multiple files, each with its own aio_context, compared to a manual round-robin serialization of chunk requests into a single aio_context.