How to configure ASP.NET Kestrel for low latency?
I am trying to implement an ASP.NET Core 2.2 application that serves HTTP requests with the lowest possible latency (not throughput; this is not for production but for a kind of competition). The application is supposed to run in a Linux Docker container environment with 4 cores, and my handlers are CPU-bound at 0.2..3 ms each. Connections are pre-created and kept alive, but I am currently seeing about 0.6..0.8 ms of processing time for empty handlers (replying with 200 OK), with noticeable jitter and occasional spikes to 20-50 ms that I can't explain.

Are there any particular settings of Kestrel/sockets/threads/CLR that can help minimize the response time of each request? Or is going the C/C++ route with epoll my only option if I want to get it down to 0.1..0.2 ms?

Angadreme answered 9/1, 2019 at 4:38 Comment(8)
You are on .NET - the CLR is "slow by design". The "occasional spikes to 20-50 ms" are perhaps GC cycles. If you need extremely fast code, go with C++ and BoostSanitarium
I specifically made sure GC does not play a role here: buffers are pre-allocated and pooled, and there are no new objects created when processing a request. Curiously, I observed similar spikes even without Kestrel, on plain sockets.Angadreme
You can't fully control the GC from your application (because the GC is not part of your application), so its cycles still affect your app.Sanitarium
How is it not a programming question? Perhaps the words "set up" were confusing, changed to "implement".Angadreme
Do you have to use Kestrel and Web.API? There's lots of (useful) middleware in the request processing pipeline. If you implement a simple HTTP listener using TcpListener you can probably do better (of course you would never do that in production).Gibbon
Kestrel was my first option (stripped out of all unnecessary middleware), but I also experimented with TcpListener with awaitable extensions as well, parsing the HTTP header manually. It was slightly better than Kestrel but not by much (kudos to Kestrel developers). The latency was still high at [min=600 μs, max=4000 μs], measured within each second. What's puzzling me is that both numbers decreased to [min=400 μs, max=2500 μs] when the load increased from 1 to 1000 requests per second!Angadreme
As written, the question does not appear to be about programming and development; it reads as yet another "how do I configure my server" question. Perhaps you should provide a Minimal, Complete, and Verifiable example and some profiling data.Roundelay
I feel like there needs to be more info to support a claim like this, e.g. what setup you are running and what you are using to test the performance. Not saying it's not true, just that a lot of things can influence the results.Stuccowork
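On the GC question raised in the comments above: the runtime does expose a latency-oriented knob via the standard `System.Runtime` API. Below is a minimal sketch; the effect is workload-dependent and it cannot eliminate pauses entirely.

```csharp
using System;
using System.Runtime;

// Sketch: ask the GC to avoid blocking gen-2 collections while serving.
// SustainedLowLatency trades memory footprint for fewer long pauses;
// it does not turn the GC off.
class GcTuning
{
    static void Main()
    {
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
        Console.WriteLine(GCSettings.LatencyMode); // prints "SustainedLowLatency"
    }
}
```

Server GC (`<ServerGarbageCollection>true</ServerGarbageCollection>` in the csproj) is the other common knob, though it optimizes throughput rather than worst-case pause time.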
Low latency is certainly possible with ASP.NET Core / Kestrel.

Here is a tiny web app to demonstrate this...

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Http;

public class Program
{
    public static void Main(string[] args)
    {
        IWebHost host = new WebHostBuilder()
            .UseKestrel()
            .Configure(app =>
            {
                // notice how we don't have app.UseMvc()?
                app.Map("/hello", SayHello);  // <-- ex: "http://localhost/hello"
            })
            .Build();

        host.Run();
    }

    private static void SayHello(IApplicationBuilder app)
    {
        app.Run(async context =>
        {
            // implement your own response
            await context.Response.WriteAsync("Hello World!");
        });
    }
}

I have answered similar questions before, here and here.

If you wish to compare the ASP.NET Core framework against others, this is a great visual: https://www.techempower.com/benchmarks/#section=data-r16&hw=ph&test=plaintext. As you can see, ASP.NET Core has exceptional results and is the leading framework for C#.

In my code block above I noted the lack of app.UseMvc(). If you do need it, I have done a very detailed answer about getting better latency in this answer: What is the difference between AddMvc() and AddMvcCore()?
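Beyond dropping MVC, a few server-level knobs can shave per-request overhead. Here is a sketch against the ASP.NET Core 2.x API; the specific values (like matching `IOQueueCount` to the core count) are assumptions to be measured, not recommendations:

```csharp
using System.Net;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Server.Kestrel.Core;
using Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets;

// Sketch (ASP.NET Core 2.x era): server-level settings that can reduce
// per-request overhead. Their latency impact is workload-dependent.
public class Program
{
    public static IWebHost BuildWebHost(string[] args) =>
        new WebHostBuilder()
            .UseKestrel(options =>
            {
                options.AddServerHeader = false;     // skip emitting the Server: header
                options.AllowSynchronousIO = false;  // forbid blocking I/O on pool threads
                options.Listen(IPAddress.Any, 8080);
            })
            .UseSockets(o => o.IOQueueCount = 4)     // assumption: match the 4 cores
            .Configure(app => { /* map handlers here */ })
            .Build();
}
```

As always with micro-latency work, change one setting at a time and measure under the actual load pattern.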


.NET Core Runtime (CoreRT)

If you still need more performance, I would encourage you to look at .NET Core Runtime (CoreRT).

Note that at the time of this writing, this option should be reviewed in more detail before adopting it for a production system.

"CoreRT brings much of the performance and all of the deployment benefits of native compilation, while retaining your ability to write in your favorite .NET programming language."

CoreRT offers great benefits that are critical for many apps.

  • The native compiler generates a SINGLE FILE, including the app, managed dependencies and CoreRT.
  • Native compiled apps startup faster since they execute already compiled code. They don't need to generate machine code at runtime nor load a JIT compiler.
  • Native compiled apps can use an optimizing compiler, resulting in faster throughput from higher-quality code (C++ compiler optimizations). Both the LLILC and IL-to-CPP compilers rely on optimizing compilers.

These benefits open up some new scenarios for .NET developers

  • Copy a single file executable from one machine and run on another (of the same kind) without installing a .NET runtime.
  • Create and run a docker image that contains a single file executable (e.g. one file in addition to Ubuntu 14.04).
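As a sketch of how CoreRT was typically wired in at the time of writing (the `Microsoft.DotNet.ILCompiler` package name is real, but the version pattern and pre-release feed were moving targets, so treat the details as assumptions):

```xml
<!-- csproj fragment (sketch): pull in the experimental CoreRT/ILCompiler
     package, then publish ahead-of-time for a specific runtime. -->
<ItemGroup>
  <PackageReference Include="Microsoft.DotNet.ILCompiler" Version="1.0.0-alpha-*" />
</ItemGroup>
```

Publishing then produces a single native executable for the target, e.g. `dotnet publish -r linux-x64 -c Release`.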

Linux-specific optimizations

There is a nice library that targets a very specialized case, in particular Linux (though the code is safe on other operating systems). The idea behind this optimization is to replace the libuv transport library that ASP.NET Core uses with a Linux-specific transport.

It uses the kernel primitives directly to implement the Transport API. This reduces the number of heap-allocated objects (e.g. uv_buf_t, SocketAsyncEventArgs), which means there is less GC pressure. Implementations built on top of a cross-platform API must pool objects to achieve the same.

using RedHat.AspNetCore.Server.Kestrel.Transport.Linux; // <--- note this!

public static IWebHost BuildWebHost(string[] args) =>
    WebHost.CreateDefaultBuilder(args)
        .UseLinuxTransport()     // <--- and note this!!!
        .UseStartup<Startup>()
        .Build();

// note: it's safe to call UseLinuxTransport on non-Linux platforms; it is a no-op there

You can take a look at the repository for that transport on GitHub: https://github.com/redhat-developer/kestrel-linux-transport


Source: https://developers.redhat.com/blog/2018/07/24/improv-net-core-kestrel-performance-linux/

Booby answered 9/1, 2019 at 20:34 Comment(9)
Thanks, this is what I started with. Then I even removed the app.Map() and replaced it with app.Run() to eliminate the path-parsing code, but the latency is still a few times higher than what I would get with C++/epoll(). Losing hope now - unless there is some socket-level tweaking that I am not aware of, I'll have to resort to switching the language for the task.Angadreme
Did you try CoreRT? It will compile the entire solution and you will get better performance.Booby
Tried it too: created a native Linux-x64 app, but it didn't change the results significantly. The thing is, all "hot" code areas are JIT-optimized after a certain number of iterations. In fact, in some cases the runtime JIT may do even a better job than a static AOT compiler, since it can collect real execution stats before optimization instead of doing static analysis and guessing, so a good "warm-up" run after the start would trigger the JIT optimization.Angadreme
For your native C/C++ counterpart that you claim is faster, are you using libuv / kestrel? The performance issues might not be with the language, but instead the core technology for request handling.Booby
I have added additional content for linux optimizationsBooby
I tried both socket transport and libuv transport, libuv has demonstrated even higher latency. Re: linux transport: according to the developer, "There is no data that suggests this Transport offers a significant benefit compared to the Libuv/Sockets Transport for real workloads. For this reason we won't publish a supported version to nuget.org." github.com/redhat-developer/kestrel-linux-transport/issues/61 - but I'll give it a try anywaysAngadreme
@YuriyL -- It's possible that you are not simulating enough load/stress in your scenario. I want to avoid a long dialog here, as it could get your question flagged as unsuitable for StackOverflow.Booby
Hopefully this will answer your question.Booby
Just one note on the Run example: since we are talking about micro-optimizations, the .NET Core team actually uses a trick to get this performance in the TechEmpower tests. github.com/aspnet/Benchmarks/blob/…Stuccowork

Thanks to everyone who answered. I ended up implementing my own HTTP server around epoll_wait() syscalls; this was the only way to get the latency down to the level I need. Kestrel's latency was about 2-2.5 times higher.

Please keep in mind that Kestrel is still an excellent choice for most production needs; it is optimized for maximum throughput with reasonable latency.
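For readers curious what the framework-free path looks like, here is a minimal raw-socket sketch in C# (this is not the author's epoll implementation, which was written against the syscall directly; this version skips HTTP parsing, multi-connection handling, and error handling entirely):

```csharp
using System.Net;
using System.Net.Sockets;
using System.Text;

// Minimal raw-socket responder: accept one keep-alive connection and
// answer every read with a fixed 200 OK. No framework and no
// per-request allocations after startup. Sketch only: a real server
// needs HTTP parsing, multiple connections, and error handling.
class RawServer
{
    static readonly byte[] Response = Encoding.ASCII.GetBytes(
        "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: keep-alive\r\n\r\n");

    static void Main()
    {
        var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 8080));
        listener.Listen(128);

        byte[] buffer = new byte[4096];            // pre-allocated, reused
        using (Socket client = listener.Accept())
        {
            client.NoDelay = true;                 // disable Nagle's algorithm
            while (client.Receive(buffer) > 0)     // blocking read: no scheduler hops
                client.Send(Response);
        }
    }
}
```

The design point being illustrated: a pinned, blocking read on a pre-accepted connection removes the thread-pool and scheduler hops that dominate latency for sub-millisecond handlers.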

Angadreme answered 19/1, 2019 at 15:35 Comment(0)
