Must have support for batching when writing to Kafka #9
Comments
Config - batch.size Type: int |
The batch size is defined by the property connect.cosmosdb..task.batch.size and is used to set MaxItemCount in the Change Feed options: changeFeedOptions.setMaxItemCount(setting.batchSize). |
We must ensure that the new Java code supports reading from Cosmos DB in a batch and writing to Kafka in a batch. |
Officially, KafkaProducer and ProducerRecord do not let you send an explicit batch, but you can get batching by configuring properties in ProducerConfig. With batch.size, the producer batches up records that are going to the same partition and sends them in one request. From the producer documentation: "The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition. This helps performance on both the client and the server. This configuration controls the default batch size in bytes. No attempt will be made to batch records larger than this size." |
some more information on producer client and batching - |
Does Cosmos DB support receiving a batch of items from the Change Feed at once? If we batch Cosmos DB messages, what do we do with the checkpoint and watermarks? The disadvantage is that if the connector fails on receiving the 6th message, before it has flushed anything to Kafka, we lose the 5 messages buffered so far. Cosmos DB will consider them already delivered to the connector and will not send them again. |
For this first pass we will park batch support and come back to it later. |
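
One possible way to address the checkpoint concern raised above is to advance the checkpoint only after a batch has been flushed successfully. The sketch below is a minimal illustration under that assumption, not the connector's actual design: `BatchingBuffer`, its `flusher` callback (standing in for a batched write to Kafka), and the string continuation tokens are all hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: buffer change feed items and checkpoint only after a successful flush. */
class BatchingBuffer {
    private final int batchSize;
    // Stands in for a batched write to Kafka; assumed to throw if the write fails.
    private final Consumer<List<String>> flusher;
    private final List<String> buffer = new ArrayList<>();
    private String pendingToken;    // token of the newest buffered item
    private String committedToken;  // last token known to be safely flushed

    BatchingBuffer(int batchSize, Consumer<List<String>> flusher) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.flusher = flusher;
    }

    /** Buffers one item and flushes once the batch is full. */
    void accept(String item, String continuationToken) {
        buffer.add(item);
        pendingToken = continuationToken;
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes the buffered items; the checkpoint moves only if the write succeeds. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // If this throws, committedToken is left unchanged, so a restart
        // could resume from the last flushed position.
        flusher.accept(new ArrayList<>(buffer));
        committedToken = pendingToken;
        buffer.clear();
    }

    String committedToken() {
        return committedToken;
    }
}
```

With this ordering, a failure while 5 messages are buffered leaves the committed token at the previous batch, so those messages would be read again on restart. The cost is possible duplicates rather than loss.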
When reading from Cosmos DB and writing to Kafka, batching should be the default behaviour.
Batch size should default to a chosen value, but be configurable by the user.
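
As a rough illustration of these two requirements, the sketch below resolves a batch size from user configuration with a fallback default, and builds the producer-side batching properties discussed earlier. The key `example.task.batch.size` and the default of 100 are placeholders chosen for this example, not the connector's real property name or default. `batch.size` and `linger.ms` are standard Kafka producer settings, shown here as plain strings so the sketch needs no Kafka dependency.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical sketch: a defaulted, user-configurable batch size. */
class BatchSettings {
    // Placeholder key and default, assumed for illustration only.
    static final String TASK_BATCH_SIZE_KEY = "example.task.batch.size";
    static final int DEFAULT_TASK_BATCH_SIZE = 100;

    /** Returns the configured batch size (item count), or the default when unset. */
    static int resolveBatchSize(Map<String, String> config) {
        String raw = config.get(TASK_BATCH_SIZE_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_TASK_BATCH_SIZE;
        }
        int value = Integer.parseInt(raw.trim());
        if (value < 1) {
            throw new IllegalArgumentException("batch size must be >= 1, got " + value);
        }
        return value;
    }

    /** Builds producer properties that influence batching per partition. */
    static Properties producerBatchingProps(int batchBytes, long lingerMs) {
        Properties props = new Properties();
        // Upper bound, in bytes, on a batch of records for one partition.
        props.setProperty("batch.size", Integer.toString(batchBytes));
        // How long the producer may wait for more records before sending.
        props.setProperty("linger.ms", Long.toString(lingerMs));
        return props;
    }
}
```

Note that the two sizes measure different things: the task batch size counts items read from the Change Feed, while the producer's `batch.size` is a byte limit per partition.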