How can I tell Sentry not to alert certain exceptions?

I have a Rails 5 application using raven-ruby to send exceptions to Sentry, which then sends alerts to our Slack.

Raven.configure do |config|
  config.dsn = ENV['SENTRY_DSN']
  config.environments = %w[ production development ]
  config.excluded_exceptions += []
  config.async = lambda { |event|
    SentryWorker.perform_async(event.to_hash)
  }
end

class SentryWorker < ApplicationWorker
  sidekiq_options queue: :default

  def perform(event)
    Raven.send_event(event)
  end
end

It's normal for our Sidekiq jobs to throw exceptions and be retried. These are mostly intermittent API errors and timeouts which clear up on their own in a few minutes. Sentry is dutifully sending these false alarms to our Slack.

I've already added the retry_count to the jobs. How can I prevent Sentry from sending exceptions with a retry_count < N to Slack while still alerting for other exceptions? An example that should not be alerted will have extra context like this:

sidekiq: {
  context: "Job raised exception",
  job: {
    args: [{...}],
    class: "SomeWorker",
    created_at: 1540590745.3296254,
    enqueued_at: 1540607026.4979043,
    error_class: "HTTP::TimeoutError",
    error_message: "Timed out after using the allocated 13 seconds",
    failed_at: 1540590758.4266324,
    jid: "b4c7a68c45b7aebcf7c2f577",
    queue: "default",
    retried_at: 1540600397.5804272,
    retry: true,
    retry_count: 2
  },
}

What are the pros and cons of not sending them to Sentry at all vs sending them to Sentry but not being alerted?

Marmion answered 27/10, 2018 at 3:58 Comment(6)
I think what you want is to monitor how often this error occurs overall anyway, not whether it still happens after the nth retry. – Jp
@MarkusUnterwaditzer How often it happens overall is why I want to keep sending them to Sentry, but not get alerted. I don't have control over how flaky the APIs are; I just have to compensate for that. One or two retries for a job is normal. At five we want to investigate. – Marmion
I posted two workarounds for this. I'm not entirely sure if they're a good idea. – Jp
Possible duplicate of Sidekiq retry count in job – Lucila
This is not part of the worker class on purpose: github.com/mperham/sidekiq/issues/845 Use a client middleware: https://mcmap.net/q/559321/-sidekiq-retry-count-in-job and medium.com/appaloosa-store-engineering/… – Lucila
@LukasEklund I've already added the retry_count, thank you. The issue is more about Sentry than Sidekiq. – Marmion

You can filter out the entire event if the retry_count is < N (this can be done inside the Sidekiq worker you posted). You will lose the data on how often this happens without alerting, but the alerts themselves will not be too noisy.

class SentryWorker < ApplicationWorker
  sidekiq_options queue: :default

  def perform(event)
    # The event hash has been round-tripped through JSON by Sidekiq,
    # so its keys are strings, not symbols.
    retry_count = event.dig('extra', 'sidekiq', 'job', 'retry_count')
    if retry_count.nil? || retry_count > N
      Raven.send_event(event)
    end
  end
end
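One subtlety worth checking in a worker like this: Sidekiq serializes job payloads to JSON, so the event hash it hands back has string keys, and `dig` with symbols will silently return nil. A quick demonstration of the round-trip (standalone sketch, no Sidekiq required):

```ruby
require "json"

# Sidekiq stores job arguments as JSON, so symbol keys come back as strings.
event = { extra: { sidekiq: { job: { retry_count: 2 } } } }
round_tripped = JSON.parse(JSON.generate(event))

round_tripped.dig("extra", "sidekiq", "job", "retry_count")  # => 2
round_tripped.dig("extra", "sidekiq", "job", :retry_count)   # => nil
```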

Another idea is to set a different fingerprint depending on whether this is a retry or not. Like this:

class MyJobProcessor < Raven::Processor
  def process(data)
    retry_count = data.dig(:extra, :sidekiq, :job, :retry_count)
    if (retry_count || 0) < N
      data["fingerprint"] = ["will-retry-again", "{{default}}"]
    end
    data # a processor must return the (possibly modified) event data
  end
end

See https://docs.sentry.io/learn/rollups/?platform=javascript#custom-grouping

I didn't test this, but it should split your issues into two, depending on whether Sidekiq will retry them. You can then ignore one group but still look at it whenever you need the data.
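The same idea also works with a tag instead of a fingerprint, which an alert rule ("An event's tags match {key} {comparison} {value}") can then match. A minimal sketch of the tagging logic as a plain function, outside any Raven class; the tag name `will_retry` and the threshold `N` are assumptions, not Sentry conventions:

```ruby
# Sketch: tag events that Sidekiq will retry again, so an alert rule
# can skip them. Tag name "will_retry" and threshold N are assumptions.
N = 5

def tag_retryable(data)
  retry_count = data.dig("extra", "sidekiq", "job", "retry_count")
  if (retry_count || 0) < N
    data["tags"] ||= {}
    data["tags"]["will_retry"] = "yes"  # match this tag in the alert rule
  end
  data
end

event = { "extra" => { "sidekiq" => { "job" => { "retry_count" => 2 } } } }
tag_retryable(event)
event["tags"]  # => { "will_retry" => "yes" }
```

An alert rule can then be configured to notify Slack only when the `will_retry` tag is absent.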

Jp answered 28/10, 2018 at 14:32 Comment(5)
Thanks for your answer. I get what the processor is doing. What should I do on the Sentry side? – Marmion
@Marmion I don't understand the question. Are you asking for clarification on the last sentence? – Jp
I understand that it's adding a fingerprint, though I'm not entirely sure what that does. I can't find how to use that fingerprint to control the alerts on Sentry (sentry.io/settings/<team>/<project>/alerts/rules/…). I have "An event's tags match {key} {comparison} {value}", so maybe I can add will-retry as a tag. – Marmion
Sure, you can also add it as a tag and use the tag to configure the alert. I guess that's a better idea. With the fingerprint you'd get a separate issue you can just mute, with no alert configuration. – Jp
I think I see now. Thank you very much! – Marmion

Summary

An option that has worked well for me is configuring Sentry's should_capture alongside Sidekiq's sidekiq_retries_exhausted, with a custom attribute on the exception.

Details

1a. Add the custom attribute

You can add a custom attribute to an exception by defining it on your error class (or on StandardError) with attr_accessor:

class SomeError < StandardError
  attr_accessor :ignore

  alias ignore? ignore
end

1b. Rescue the error, set the custom attribute, & re-raise

def perform
  # do something
rescue SomeError => e
  e.ignore = true
  raise e
end

2. Configure should_capture

should_capture lets you decide whether an exception should be captured: the exception is passed to it, so you can check the custom attribute on it.

# Guard with respond_to? so exceptions without the attribute are still captured.
config.should_capture = lambda { |e| !(e.respond_to?(:ignore?) && e.ignore?) }

3. Flip the custom attribute when retries are exhausted

There are two ways to define what happens when a job dies, depending on the version of Sidekiq being used. To apply the behaviour globally on Sidekiq v5.1+, use a death handler. To apply it to a particular worker, or on versions before 5.1, use sidekiq_retries_exhausted.

sidekiq_retries_exhausted { |_job, ex| ex.ignore = false }
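Put together, the flow can be simulated in plain Ruby, with a lambda standing in for Raven's should_capture option (this is a sketch of the mechanism, not the real Sentry or Sidekiq API):

```ruby
# Sketch of the ignore-flag flow with plain-Ruby stand-ins for Sentry
# and Sidekiq: capture is skipped while the job will still be retried.
class SomeError < StandardError
  attr_accessor :ignore
  alias ignore? ignore
end

# Stand-in for config.should_capture: skip exceptions flagged as ignorable.
should_capture = lambda { |e| !(e.respond_to?(:ignore?) && e.ignore?) }

err = SomeError.new("flaky API timed out")

err.ignore = true            # set in the worker's rescue while retries remain
should_capture.call(err)     # => false: no Sentry event during retries

err.ignore = false           # flipped in sidekiq_retries_exhausted
should_capture.call(err)     # => true: captured once retries run out
```

The respond_to? guard also means exceptions raised by libraries, which lack the accessor, are always captured.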

Chivalric answered 18/10, 2019 at 10:48 Comment(3)
This doesn't work for me because the exception doesn't have an ignore= method: NoMethodError Exception: undefined method 'ignore=' – Intercolumniation
@Intercolumniation I added a code example and further explanation to my answer. You can set an attr_accessor on your error class (or on StandardError). – Chivalric
But what if the error is defined by a library? I'd have to monkey patch it, right? – Intercolumniation

A much cleaner approach, if you are trying to ignore all exceptions of a certain class, is to add them to your config file:

config.excluded_exceptions += ['ActionController::RoutingError', 'ActiveRecord::RecordNotFound']

In the above example, the exceptions Rails uses to generate 404 responses will be suppressed.

See the docs for more configuration options

Specious answered 2/12, 2020 at 2:33 Comment(1)
Thank you, but the question is about conditionally excluding exceptions, not whole classes of exceptions. – Marmion

From my point of view, the best option is to let Sentry receive all the exceptions and use Sentry's alert rules to decide which ones are forwarded to Slack. To configure the alerts, go to the Alerts option in the main menu of your Sentry account.

In the following picture I configure an alert that only sends a notification to Slack if an exception of type ControllerException occurs more than 10 times.

[Screenshot: configure alert in Sentry]

With this alert we only receive the notification in Slack when all the conditions are met.

Bioplasm answered 26/6, 2022 at 16:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.