How much overhead does the NewRelic PHP agent add?

Asked 28/3, 2014 at 0:36 Answered 7/4, 2014 at 15:42

Solved php performance profiling monitoring newrelic

Without a doubt, NewRelic is taking the world by storm, with many successful deployments. But what are the downsides of using it in production?

The PHP monitoring agent works as a .so extension. If I understand correctly, it connects to a separate aggregation service on the same system, which filters the data and pushes it to the NewRelic cloud.

This suggests that it works transparently under the hood. But is that actually true? Any monitoring, profiling or API service adds some overhead to the entire stack. The extension itself is 0.6 MB, which is added to each PHP process. That isn't much, so my concern is rather CPU and I/O.

woodzu.vipserv.org/ec2-with-newrelic.png
The image shows CPU utilization on production EC2 t1.micro instances with the NewRelic agent (top blue line) and without the agent (other lines).

What does NewRelic actually do that causes the additional overhead?
What are the other downsides of using it?

Illyes answered 28/3, 2014 at 0:36 Comment(2)

I'm also curious what performance hit (especially latency overhead) to expect when running newrelic with the 'default' install setup from their site. I've got multiple load-balanced images--should I only run newrelic on one of them? – Ona 6/4, 2014 at 23:32

I think it would be great if we could compare our results. – Illyes 7/4, 2014 at 6:19

Your mileage may vary based on the settings, your particular site's code base, etc.

The additional overhead you're seeing comes less from the memory used than from the tracing and profiling of your PHP code, the gathering of analytic data on it, and the DB request profiling. Basically, some additional overhead is hooked into every PHP function call. You would see similar overhead if you left Xdebug or ZendDebugger running or profiling on a machine. Any module will use some resources, and the ones that hook in deep for profiling can be the costliest. That said, I've seen that New Relic has config settings to dial back how intensively it profiles, so you might be able to lighten its hit more than you could with, say, Xdebug.

All that being said, with the newrelic shared PHP module loaded with the default setup and config from their site, my company's overall server response latency went up about 15-20% across the board when we turned it on for all our production machines. I'm only talking about the time it takes for php-fpm to generate an initial response. Our site is http://www.nara.me. The newrelic-daemon and newrelic-sysmon services are running as well, but I doubt they have any impact on response time.

Don't get me wrong, I love New Relic, but the performance hit in my specific situation doesn't make me want to keep the PHP module running on all our live load-balanced machines. We'll probably keep it running on one machine all the time. We do plan to keep the sysmon stuff going 100% of the time, and to keep the module installed but disabled on the other machines in case we need it for troubleshooting.

My advice is this:

Wrap any calls to New Relic functions in if (function_exists($function_name)) blocks so your code can run without error if the New Relic module isn't loaded.
If you have multiple identical servers behind a load balancer sharing the same code, only enable the PHP module on one image to save performance. You can keep the sysmon stuff running if you use New Relic for this.
If you have just one server, only enable the shared PHP module when you need it--when you're actually profiling your code or MySQL--unless a 10-20% performance hit isn't a problem.
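
As a minimal sketch of the first point, a small wrapper can keep the function_exists check in one place. The helper name nr_call, the transaction name and the custom parameter are made up for illustration. newrelic_name_transaction() and newrelic_add_custom_parameter() are functions the agent provides when the extension is loaded.

```php
<?php
// Hypothetical helper (not part of the New Relic API): calls the named
// function only if it exists, so the code still runs when the agent
// extension is not loaded.
function nr_call(string $function_name, ...$args)
{
    if (function_exists($function_name)) {
        return $function_name(...$args);
    }
    return null; // agent not loaded: silently skip the call
}

// Example usage with illustrative values; both calls become no-ops
// on machines where the newrelic module is disabled.
nr_call('newrelic_name_transaction', '/checkout/confirm');
nr_call('newrelic_add_custom_parameter', 'cart_items', 3);
```

With this in place, the same code base can be deployed to the one machine that has the module enabled and to the ones that don't.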

One other thing to remember if your main source of info is the New Relic website: they get paid by the number of machines you're monitoring, so don't expect them to talk you out of using it on anything less than 100% of your machines, even if it's not needed. I think one of their FAQs or blog posts states that you should expect some performance impact, but that if you use it as intended and fix the issues it reveals, you should recoup the lost latency. I agree, but I think once you fix the issues, you should limit the exposure to the smallest number of servers you need.

Ona answered 7/4, 2014 at 15:42 Comment(4)

I like the idea of only 1 server running the module all the time. It won't give me detailed data across all instances but should give a fair estimate from which I could work out some solutions. Anyway, I'm going to investigate the matter in more detail as this php.ini option looks promising: newrelic.transaction_tracer.detail=0 – Illyes 8/4, 2014 at 6:45

@Illyes I actually rename the newrelic.ini in my /etc/php.d directory to newrelic.ini.disabled on machines where I don't want the module running. That way nothing is loaded into the PHP runtime. – Ona 8/4, 2014 at 13:42

@Illyes Just an update. I was surprised to see that when I did a yum update, it automatically created another newrelic.ini, so when PHP reloaded, the module started again! I noticed a 20% performance hit and figured it out. – Ona 8/7, 2014 at 21:36

Thanks for the update. I'll keep my eyes open for the next update. – Illyes 9/7, 2014 at 9:56

-1

The agent shouldn't add much overhead, given the way it is designed. Because of the level of detail required to adequately troubleshoot the problem, this seems like a good question to ask at https://support.newrelic.com

Erickaericksen answered 28/3, 2014 at 8:21 Comment(1)

Hi Walden. Any idea how it is designed? It's not open source, is it? – Illyes 28/3, 2014 at 9:51
