---
title: Troubleshooting Kafka
sidebar_title: Kafka
sidebar_order: 2
---

## Offset Out Of Range Error

```
Exception: KafkaError{code=OFFSET_OUT_OF_RANGE,val=1,str="Broker: Offset out of range"}
```

This happens when Kafka and the consumers get out of sync. Possible reasons are:

  1. Running out of disk space or memory
  2. Having a sustained event spike that causes very long processing times, causing Kafka to drop messages as they go past the retention time
  3. Date/time out of sync issues due to a restart or suspend/resume cycle
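
A quick way to rule out the first and third causes is to check free disk space inside the Kafka container and compare its clock against the host. This is only a convenience check, assuming the stock self-hosted compose setup with a service named `kafka`:

```shell
# Check free disk space inside the Kafka container
docker compose exec kafka df -h

# Compare the host clock with the container clock to spot date/time drift
date -u; docker compose exec kafka date -u
```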

### Recovery

**Note:** These solutions may result in data loss when resetting the offsets of the snuba consumers.

#### Proper solution

The proper solution is as follows (reported by @rmisyurev):

1. Retrieve the consumer group list:

   ```shell
   docker compose run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --list
   ```

2. Get the group info:

   ```shell
   docker compose run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --group snuba-consumers --describe
   ```

3. Preview what will happen to the offsets by using a dry run (optional):

   ```shell
   docker compose run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --group snuba-consumers --topic events --reset-offsets --to-latest --dry-run
   ```

4. Set the offsets to latest and execute:

   ```shell
   docker compose run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --group snuba-consumers --topic events --reset-offsets --to-latest --execute
   ```
You can replace `snuba-consumers` with other consumer groups or `events` with other topics when needed.
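
If you are not sure which group is out of sync, a small loop can dry-run the reset for every group that step 1 reported, so you can inspect the proposed offsets before committing anything with `--execute`. This is only an illustrative sketch built from the commands above, not part of the official runbook, and it assumes the `--list` output contains only group names:

```shell
# Dry-run the offset reset for every consumer group reported by --list.
# Nothing is changed until you rerun with --execute instead of --dry-run.
for group in $(docker compose run --rm kafka kafka-consumer-groups \
    --bootstrap-server kafka:9092 --list); do
  docker compose run --rm kafka kafka-consumer-groups \
    --bootstrap-server kafka:9092 --group "$group" \
    --all-topics --reset-offsets --to-latest --dry-run
done
```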

#### Another option

This option is as follows (reported by @gabn88):

1. Set the offsets to latest and execute:

   ```shell
   docker compose run --rm kafka kafka-consumer-groups --bootstrap-server kafka:9092 --all-groups --all-topics --reset-offsets --to-latest --execute
   ```

Unlike the proper solution, this involves resetting the offsets of all consumer groups and all topics.

#### Nuclear option

The nuclear option is removing all Kafka-related volumes and recreating them, which will cause data loss. Any data that was still pending in those volumes will be gone once they are deleted.

1. Stop the instance:

   ```shell
   docker compose down --volumes
   ```

2. Remove the Kafka & Zookeeper related volumes (you can confirm the exact volume names first, as shown after this list):

   ```shell
   docker volume rm sentry-kafka
   docker volume rm sentry-zookeeper
   ```

3. Run the install script again:

   ```shell
   ./install.sh
   ```

4. Start the instance:

   ```shell
   docker compose up --wait
   ```
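
The volume names above match the stock self-hosted installation; if yours differ, you can list them before removing anything. A minimal check, assuming only that Docker is installed:

```shell
# List volume names and keep only the Kafka / Zookeeper ones
docker volume ls --format '{{.Name}}' | grep -E 'kafka|zookeeper'
```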

## Reducing disk usage

If you want to reduce the disk space used by Kafka, you'll need to carefully calculate how much data you are ingesting and how much data loss you can tolerate, and then follow the recommendations in this awesome StackOverflow post or this post on our community forum.
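
As a starting point for that calculation, you can check how much disk space each topic currently occupies. This sketch assumes the `kafka-log-dirs` tool is available in the image (it ships with the Confluent Kafka images used by self-hosted Sentry); the JSON output lists the size of every partition in bytes:

```shell
# Show how much disk space each topic/partition currently uses (sizes are in bytes)
docker compose run --rm kafka kafka-log-dirs --bootstrap-server kafka:9092 --describe
```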

You could, however, add these to the Kafka container's environment variables (by @csvan):

```yaml
services:
  kafka:
    # ...
    environment:
      KAFKA_LOG_RETENTION_HOURS: 24
      KAFKA_LOG_CLEANER_ENABLE: true
      KAFKA_LOG_CLEANUP_POLICY: delete
```
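
For the new settings to take effect, the Kafka container has to be recreated; a plain restart keeps the old environment. Assuming the stock compose setup:

```shell
# Recreate the kafka service so the new environment variables are applied
docker compose up -d kafka
```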