sentry-sidekiq throwing ActiveJob::SerializationError #1790

Closed
modosc opened this issue Apr 12, 2022 · 8 comments

modosc commented Apr 12, 2022

Issue Description

same as #1585

we are seeing these entries in our logs (~3k in the last week):

"[ActiveJob] Failed enqueuing Sentry::SendEventJob to Sidekiq(default): ActiveJob::SerializationError (Can't serialize a Hash with reserved key "_aj_symbol_keys")"
[ActiveJob] Failed enqueuing Sentry::SendEventJob to Sidekiq(default): ActiveJob::SerializationError (Can't serialize a Hash with reserved key "_aj_ruby2_keywords")

i realize there's a theoretical solution here but somehow it's not working correctly.

    sentry-rails (5.2.1)
      sentry-ruby-core (~> 5.2.1)
    sentry-resque (5.2.1)
      sentry-ruby-core (~> 5.2.1)
    sentry-ruby (5.2.1)
      sentry-ruby-core (= 5.2.1)
    sentry-ruby-core (5.2.1)
    sentry-sidekiq (5.2.1)
      sentry-ruby-core (~> 5.2.1)

Reproduction Steps

we don't have a reproducible case - this is a private company's repo.

Expected Behavior

sending errors should work

Actual Behavior

we are losing errors.

Ruby Version

3.1.1

SDK Version

5.2.1

Integration and Its Version

No response

Sentry Config

# frozen_string_literal: true

Sentry.init do |config|
  config.dsn = Rails.application.secrets[:sentry_dsn]

  config.breadcrumbs_logger = %i[active_support_logger http_logger]

  # Send messages async to prevent blocking in application and keep tolerance if Sentry
  # has an outage.
  config.async = ->(event, hint) { Sentry::SendEventJob.perform_later(event, hint) }

  # Filter out protected data
  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
  config.before_send = ->(event, hint) { filter.filter(event.to_hash) }

  config.excluded_exceptions += [
    "ActionController::RoutingError",
    "ActiveRecord::RecordNotFound",
    "RedisLock::AlreadyLocked", # These are very high volume and we just need to suppress for now
  ]

  config.release = ENV["SOURCE_COMMIT"]
  config.environment = ENV["SENTRY_ENV"] || Rails.env
end

st0012 commented Apr 13, 2022

@modosc Thanks for reporting this. I think it's caused by the combination of the async option and Sentry::SendEventJob, but without a reproduction app I can't be certain. And as I mentioned in #1585, the async option is now discouraged, and if your reason for using it is the one described in your comment:

  # Send messages async to prevent blocking in application and keep tolerance if Sentry
  # has an outage.

I can tell you that it's no longer necessary. The SDK now has its own background worker that discards unsent/overflowed events without interrupting or delaying the main request. On the contrary, using async could actually be harmful to your system (as described in #1522). So I highly recommend dropping the use of async.
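
For reference, a minimal sketch of your config with async dropped and everything else kept as-is (I haven't run this against your app, so treat it as a starting point rather than a drop-in replacement):

# frozen_string_literal: true

Sentry.init do |config|
  config.dsn = Rails.application.secrets[:sentry_dsn]

  config.breadcrumbs_logger = %i[active_support_logger http_logger]

  # No config.async here: the SDK's built-in background worker already sends
  # events off the main thread and drops them if Sentry is unreachable.

  # Filter out protected data
  filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
  config.before_send = ->(event, hint) { filter.filter(event.to_hash) }

  config.excluded_exceptions += [
    "ActionController::RoutingError",
    "ActiveRecord::RecordNotFound",
    "RedisLock::AlreadyLocked",
  ]

  config.release = ENV["SOURCE_COMMIT"]
  config.environment = ENV["SENTRY_ENV"] || Rails.env
end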

modosc commented Apr 13, 2022

apparently today was the wrong day to make that change:

[FILTERED][PRODUCTION] main:0> {"entity.name":"FILTERED","entity.type":"SERVICE","hostname":"FILTERED","message":"[Transport] Sending envelope with items [event] ddadbef91232440cafc5af0d22c272bc to Sentry","log.level":"INFO","logger.name":"sentry","timestamp":1649877565997}
{"entity.name":"FILTERED","entity.type":"SERVICE","hostname":"FILTERED","message":"Event sending failed: the server responded with status 429","log.level":"ERROR","logger.name":"sentry","timestamp":1649877566038}
{"entity.name":"FILTERED","entity.type":"SERVICE","hostname":"FILTERED","message":"Unreported Event: test by jon","log.level":"INFO","logger.name":"sentry","timestamp":1649877566038}
{"entity.name":"FILTERED","entity.type":"SERVICE","hostname":"FILTERED,"message":"exception happened in background worker: the server responded with status 429","log.level":"ERROR","logger.name":"sentry","timestamp":1649877566038}

i don't think this is a reasonable way forward.

st0012 commented Apr 13, 2022

Status 429 means the payload size exceeded the limit (200kb). And since async doesn't change what was sent, that should have happened in your current setup as well. Can you check if some of the sidekiq jobs also failed because of the 429 responses?

Also, if you can switch to debug level for logging, it should tell you what info was oversized and caused the error. You can switch it partially, just for the SDK, with:

Sentry.init do |config|
  # assign a custom logger so you don't accidentally change the Rails app's log level as well
  config.logger = Logger.new($stdout) # or whatever log source you have
  config.logger.level = :debug
end

The output should look like this:

{event_id: 34, level: 7, timestamp: 22, environment: 13, server_name: 14, modules: 935, message: 5, user: 2, tags: 2, contexts: 820791, extra: 2, fingerprint: 2, platform: 6, sdk: 40, threads: 51690}

modosc commented Apr 13, 2022

i will look at this but again - this is very surprising behavior.

modosc commented Apr 13, 2022

also all the docs say 429 is quota exceeded?

https://docs.sentry.io/product/accounts/quotas/

but we aren't listed as being over quota. i tried filing a support ticket as well.

st0012 commented Apr 13, 2022

Sorry, my memory screwed me up 🤦‍♂️

  • 413 means oversized.
  • 429 means the events are being rate-limited.

So in your case, the events are being rate-limited because of high volume. That's actually valid behavior if your application is seeing a high rate of exceptions.

And the reason you only see it after async is removed is probably that events are now sent much faster (no job enqueuing and no competing with other jobs for workers).

st0012 commented Apr 13, 2022

BTW, the SDK will automatically rate-limit events once it receives a 429 response, so it won't keep making unnecessary requests to Sentry.
But if you want to reduce the volume of events, you can set config.sample_rate to something like 0.5 (50%).
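
For example, something like this alongside your existing config (0.5 is just an illustrative value; tune it to your volume):

Sentry.init do |config|
  # Randomly sample roughly 50% of error events client-side, which reduces
  # how many requests hit Sentry's rate limit.
  config.sample_rate = 0.5
end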

modosc commented Apr 14, 2022

ok thanks.
