Remove offsets version limit #1199

Closed
wants to merge 2 commits into from

Conversation

Jiayi-Liao

Actually, the offsets functions work well when the Kafka version is lower than 0.10.0; the only difference is the API_VERSION of the offset request. I think we should let more users enjoy the feature.

@tvoinarovskyi
Collaborator

@buptljy Could you explain why you are sure it works for brokers 0.10.0 and lower? It's explicitly stated in https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html that the feature is 0.10.1 and above.

For example, 0.10.0 brokers do not support offsetsForTimes, because this feature was added
in version 0.10.1

I'm not super familiar with how v0 of ListOffsetRequest was handled, but KIP-79 states that:

In ListOffsetRequest/ListOffsetResponse v0, we return a list of offsets which is smaller
than or equals to the target time. This was because the timestamp search is based on
the segment create time. In order to make sure not to miss any messages, users may
have to consume from some earlier segments. After KIP-33, we can accurately find the
messages based on the timestamps, so there is no need to return a list of offsets to the
users anymore.

So the search on Kafka 0.10.0 will be quite inaccurate. Do you think that's acceptable?
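To make the difference described in the KIP-79 excerpt concrete, here is a toy model of the two lookup styles. It is only a sketch: the segment layout, the function names `lookup_v0` and `lookup_v1`, and all timestamps are made up for illustration and are not the broker's actual code.

```python
from bisect import bisect_left

# Hypothetical toy log. Each segment is
# (base_offset, segment_time_ms, [timestamps of its messages, ascending]).
SEGMENTS = [
    (0, 1_000, [100, 400, 1_000]),
    (3, 2_000, [1_200, 1_500, 2_000]),
    (6, 3_000, [2_100, 2_600, 3_000]),
]


def lookup_v0(segments, target_ms, max_offsets=1):
    """Segment-granularity search (v0 style, simplified).

    Returns base offsets of segments whose segment-level time is at or
    before the target, newest first.
    """
    candidates = [base for base, seg_time, _ in segments if seg_time <= target_ms]
    return sorted(candidates, reverse=True)[:max_offsets]


def lookup_v1(segments, target_ms):
    """Message-granularity search (v1 style, simplified).

    Returns the earliest offset whose message timestamp is >= target,
    or None if no such message exists.
    """
    for base, _, timestamps in segments:
        i = bisect_left(timestamps, target_ms)
        if i < len(timestamps):
            return base + i
    return None


coarse = lookup_v0(SEGMENTS, 1_300)  # start of an earlier segment
exact = lookup_v1(SEGMENTS, 1_300)   # first message at or after the target
```

In this model a consumer that seeks to the v0 answer has to read and discard every message between the segment start and the message it actually wanted, which is the "consume from some earlier segments" cost the KIP mentions.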

@Jiayi-Liao
Author

@tvoinarovskyi You're right that the requests differ between Kafka versions; I didn't know that before. After reading your comments I reviewed the kafka-0.8.2 source code and found that the beginning_offsets and end_offsets functions still work there, and their results are accurate. Let me explain:

  • The source code in kafka.server.KafkaApis$fetchOffsetsBefore shows that the latest and earliest offsets are accurate.
  • When we set auto.offset.reset and start to consume records, the client sends an earliest/latest offset request, and the result is accurate.

Maybe we can keep the version limit in 'offsets_for_times' and remove the version limits in the other two functions. Do you think that's acceptable?
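A minimal sketch of what that split could look like, assuming the broker version is available as a tuple such as `(0, 8, 2)`. The `MIN_BROKER_VERSION` table, the `check_supported` helper, and the local `UnsupportedVersionError` class are hypothetical names for illustration, not the code in this PR.

```python
class UnsupportedVersionError(Exception):
    """Local stand-in for the client's own error type (hypothetical)."""


# Minimum broker version per method under the proposal above.
# None means "no limit". The values are assumptions taken from this thread.
MIN_BROKER_VERSION = {
    "offsets_for_times": (0, 10, 1),
    "beginning_offsets": None,
    "end_offsets": None,
}


def check_supported(method, api_version):
    """Raise if `method` needs a newer broker than `api_version`."""
    minimum = MIN_BROKER_VERSION[method]
    if minimum is not None and api_version < minimum:
        raise UnsupportedVersionError(
            "{} is not supported for broker version {}".format(method, api_version)
        )


# Earliest/latest lookups stay available on old brokers.
check_supported("beginning_offsets", (0, 8, 2))
check_supported("end_offsets", (0, 8, 2))
```

Only the timestamp search stays gated, so calling `check_supported("offsets_for_times", (0, 10, 0))` raises while the same call with `(0, 10, 1)` passes.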

@tvoinarovskyi
Collaborator

@blugowski Yes, I think it's a good idea. The Java client probably doesn't have this limit either, except for offsetsForTimes.

@Jiayi-Liao
Author

Jiayi-Liao commented Sep 4, 2017

@tvoinarovskyi To remove the unnecessary commits, I've updated my code in #1200.

@dpkp
Owner

dpkp commented Sep 8, 2017

The difference is that the older API version responds using the file timestamps of the log segments. The granularity is coarser, but this is probably not that important. The bigger problem is that when replicas are reassigned, the log file timestamps are altered, so the response is no longer related to the message time. Nonetheless, I'm happy to let users who know what they are doing use the old API version.
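The reassignment problem can be illustrated with a small toy model. This is a sketch under loud assumptions: the `offsets_before` helper, the segment tuples, and the copy time of 9_000 ms are invented for the example, and a real broker's behavior in the empty case may differ.

```python
def offsets_before(segments, target_ms):
    """Toy v0-style lookup: base offsets of segments whose file
    timestamp is at or before target_ms, newest first."""
    return sorted(
        (base for base, file_time in segments if file_time <= target_ms),
        reverse=True,
    )


# Hypothetical segments as (base_offset, file_timestamp_ms) on the original replica.
original = [(0, 1_000), (100, 2_000), (200, 3_000)]

# Assumption: after a reassignment the new replica's files all carry the
# time they were copied (9_000 here), not the time the messages were written.
reassigned = [(base, 9_000) for base, _ in original]

before = offsets_before(original, 2_500)
after = offsets_before(reassigned, 2_500)
```

The same query gives different answers on the two replicas even though the messages are identical, because the lookup keys on file timestamps rather than message timestamps.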

3 participants