When I look for a way to count the messages in a topic, this command comes up as a good option:

kafka-run-class kafka.tools.GetOffsetShell --broker-list broker1:9092,broker2:9092,broker3:9092 --topic rev-dly-upd --time -1

The only thing is, when I change the retention.ms config to retention.ms=1000 and then verify the topic configuration by running kafka-topics --describe --zookeeper zookeeper1:2181 --topic rev-dly-upd, I can clearly see that the config is set to 1000:

Topic:rev-dly-upd   PartitionCount:8    ReplicationFactor:3 Configs:retention.ms=1000
    Topic: rev-dly-upd  Partition: 0    Leader: 159 Replicas: 159,96,160    Isr: 159,96,160
    Topic: rev-dly-upd  Partition: 1    Leader: 160 Replicas: 160,159,94    Isr: 94,160,159
    Topic: rev-dly-upd  Partition: 2    Leader: 94  Replicas: 94,160,95 Isr: 95,94,160
    Topic: rev-dly-upd  Partition: 3    Leader: 95  Replicas: 95,94,96  Isr: 95,96,94
    Topic: rev-dly-upd  Partition: 4    Leader: 96  Replicas: 96,95,159 Isr: 95,96,159
    Topic: rev-dly-upd  Partition: 5    Leader: 159 Replicas: 159,160,94    Isr: 159,94,160
    Topic: rev-dly-upd  Partition: 6    Leader: 160 Replicas: 160,94,95 Isr: 94,160,95
    Topic: rev-dly-upd  Partition: 7    Leader: 94  Replicas: 94,95,96  Isr: 95,96,94

Yet when I run kafka-run-class kafka.tools.GetOffsetShell --broker-list broker1:9092,broker2:9092,broker3:9092 --topic rev-dly-upd --time -1, I still always get records returned. What could the reasons be?

Offsets are not truncated when msgs are truncated. The data of those messages should be gone, however, the offsets will not be reused. I understand GetOffsetShell to be a tool to list the offsets of all partitions? Did you try to actually consume the topics and see if the data is indeed there? – Marius Waldal Aug 1, 2018 at 12:30

Basically, if the data for an offset is missing, then the consumer just seeks forward to the next available one. The LogCleaner should be resetting the earliest offsets, but that thread can stop working and you need to monitor it from the running server logs. In any case it should give you an approximate count, assuming topic is not compacted. The alternative of consuming and doing line count on a topic isn't reliable: 1) There can be newlines within data 2) console consumer never ends, so wc won't stop – OneCricketeer Sep 12, 2018 at 13:22
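
To make the point in the comments above concrete, here is a minimal toy model in Python of how a partition's start and end offsets behave when retention removes data. It is only a sketch: the ToyPartition class and its method names are made up for illustration and are not part of Kafka. Real brokers delete whole log segments on a periodic check, so deletion is neither instant nor per-message.

```python
class ToyPartition:
    """Toy model of one partition's log; NOT Kafka's real implementation."""

    def __init__(self):
        self.log_start_offset = 0  # roughly what --time -2 (earliest) reports
        self.log_end_offset = 0    # roughly what --time -1 (latest) reports
        self.records = {}          # offset -> payload still on disk

    def append(self, payload):
        # Each new record gets the next offset; offsets are never reused.
        self.records[self.log_end_offset] = payload
        self.log_end_offset += 1

    def expire_all(self):
        # Simulate retention removing every record currently in the log.
        # The start offset moves forward; the end offset stays where it is.
        self.records.clear()
        self.log_start_offset = self.log_end_offset

    def retained_count(self):
        return self.log_end_offset - self.log_start_offset


p = ToyPartition()
for i in range(5):
    p.append(f"msg-{i}")
p.expire_all()      # "purge" by letting retention expire everything
p.append("msg-5")   # one new message arrives afterwards

print(p.log_end_offset)    # 6
print(p.log_start_offset)  # 5
print(p.retained_count())  # 1
```

In this model the latest offset still reads 6 after the purge even though only one record remains, which matches the behaviour described in the question: --time -1 alone reports where the next message will be written, not how many messages are retained.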

Basically, I had to stop using kafka-run-class kafka.tools.GetOffsetShell to count the messages in a topic. If you google "how to count messages in kafka topic", a lot of posts will lead you to think that the above command, given the right arguments, gives you a count of total messages. However, if messages have been purged during the lifespan of the topic, it will not give you an accurate count. Instead, you have to do something like open a console consumer, write its output to a text file, and then count the lines of that file with old-fashioned wc -l.
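
One caveat with the wc -l approach, raised in the comments on the question, is that a line count equals a message count only if no message contains a newline. The short Python sketch below shows the mismatch using made-up sample payloads. It does not talk to Kafka; it only mimics a dump that writes one message followed by a newline.

```python
# Made-up payloads for illustration; the second one contains an embedded newline.
records = ["order-1", "line one\nline two", "order-3"]

# Mimic a text dump that writes each message followed by a newline.
dumped = "".join(r + "\n" for r in records)

# wc -l counts newline characters, so count them the same way here.
line_count = dumped.count("\n")

print(len(records))  # 3
print(line_count)    # 4
```

The dump also has to terminate before wc -l can report anything. kafka-console-consumer has a --timeout-ms option that makes it exit once no message arrives within the given interval, which may help here; treat the right timeout value as something to work out for your own cluster.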

Messages within a topic can't be deleted unless it's compacted, so what do you mean by "purged"? If you do --time -1 and --time -2, you can take a look at the difference to count number of offsets/messages in the partitions – OneCricketeer Sep 12, 2018 at 13:24

Purged just means I allowed the messages to exhaust their retention period. I force this by changing the retention period to 1 second, letting the messages be deleted, and then changing the retention setting back to what it had been. The way you are doing it with adjusting the time period is OK, but then I have to keep track of when the topic was last purged. And in a troubleshooting situation I could lose confidence about whether it was purged at all, unless I had a really good auditing system set up that I didn't allow myself to bypass and no manual purges, which I don't have. – uh_big_mike_boi Sep 12, 2018 at 13:39
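
The --time -1 / --time -2 difference suggested in the comment above can be sketched in Python as follows. This assumes GetOffsetShell prints one topic:partition:offset line per partition. The function names and the sample offsets are made up for illustration and are not real output from the rev-dly-upd topic. As noted in the comments on the question, the result is only an approximate count and assumes the topic is not compacted.

```python
def parse_offsets(output):
    """Parse 'topic:partition:offset' lines into {partition: offset}."""
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right: the last two fields are partition and offset.
        _topic, partition, offset = line.rsplit(":", 2)
        offsets[int(partition)] = int(offset)
    return offsets


def retained_per_partition(latest_output, earliest_output):
    """Latest minus earliest offset for each partition."""
    latest = parse_offsets(latest_output)
    earliest = parse_offsets(earliest_output)
    return {p: latest[p] - earliest[p] for p in sorted(latest)}


# Hypothetical captured output of the two runs (--time -1 and --time -2).
latest_sample = """rev-dly-upd:0:120
rev-dly-upd:1:95
"""
earliest_sample = """rev-dly-upd:0:100
rev-dly-upd:1:95
"""

counts = retained_per_partition(latest_sample, earliest_sample)
print(counts)                # {0: 20, 1: 0}
print(sum(counts.values()))  # 20
```

Because both numbers are read from the broker at query time, this does not require keeping track of when the topic was last purged: the earliest offset moves forward on its own once retention has actually removed the data.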
