RIP 47 Data Layout V2
- Current State: Development
- Authors: Li Zhanhui
- Shepherds: -
- Mailing List discussion: dev@rocketmq.apache.org
- Pull Request:
- Released: no
- Will we add a new module? -- No.
- Will we add new APIs? -- No.
- Will we add new features? -- No.
Yes.
Clients prior to 5.x share exactly the same data layout as the store module on the server side, which strictly prevents the two from evolving independently. In other words, brokers cannot make any change to the storage data layout unless all consumers are upgraded first, which is virtually impossible in practice. This poses a series of blocking challenges to the development of RocketMQ.
There are a few known limitations:
- The length of a topic name can be at most 128 bytes;
- The properties length occasionally overruns once more system properties are appended;
- Deleted topics re-appear after system crash recovery. The ABA issue is impossible to resolve under the current data layout paradigm. For example, create a topic T, send a few messages, delete T, then re-create topic T and send another batch of messages. If the system then crashes and recovers, it is hard to maintain data integrity;
- The current data layout assumes IPv4 for the born host and store host. It is awkward to make it IPv6 compatible;
- System properties and user-defined key-value pairs are mingled together. Newly added system properties risk conflicting with existing user code.
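The last limitation can be illustrated with a minimal sketch. It is not taken from the v1 or v2 implementation: the classes `MingledProperties` and `SeparatedProperties` and the key `TRACE_ID` are hypothetical, chosen only to show how a shared key space lets a newly added system property overwrite a value that user code relies on, and how separate key spaces avoid that.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical: system and user entries live in one shared key space.
final class MingledProperties {
    private final Map<String, String> properties = new HashMap<>();

    void putUserProperty(String key, String value) {
        properties.put(key, value);
    }

    // A system property added by a newer release lands in the same map.
    void putSystemProperty(String key, String value) {
        properties.put(key, value);
    }

    String getUserProperty(String key) {
        return properties.get(key);
    }
}

// Hypothetical: system and user entries are kept in separate key spaces.
final class SeparatedProperties {
    private final Map<String, String> system = new HashMap<>();
    private final Map<String, String> user = new HashMap<>();

    void putUserProperty(String key, String value) {
        user.put(key, value);
    }

    void putSystemProperty(String key, String value) {
        system.put(key, value);
    }

    String getUserProperty(String key) {
        return user.get(key);
    }

    String getSystemProperty(String key) {
        return system.get(key);
    }
}
```

With the mingled variant, a user value stored under `TRACE_ID` is silently replaced as soon as a system property with the same key is written; with the separated variant both values survive side by side.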
- All issues listed in the previous section will be fundamentally resolved.
- See the previous section; the problems stated there will be solved.
N/A
Nothing specific.
- The current data layout is named v1. Brokers will, by default, assume that client SDKs support v1.
- When messages stored in v2 format are delivered to SDKs with only v1 capability, they are reflowed into v1 format.
- On connection, new SDKs sync their capabilities to brokers, so that brokers may deliver messages in v2 format directly.
- Serialization and deserialization of v2 are supposed to be almost zero overhead. Overall, its format is as follows.
- FlatBuffers is targeted for message header serialization and deserialization because 1) it is well supported across popular programming languages; 2) its IDL is backward and forward compatible when adding/removing fields; 3) it has extremely good performance.
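The delivery rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed broker code: `DataLayout`, `CapabilityRegistry`, `sync`, `deliveryLayout`, and the string client IDs are all hypothetical names, and the actual capability-sync protocol is left to Task1.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical enum naming the two layouts discussed in this proposal.
enum DataLayout { V1, V2 }

// Hypothetical broker-side registry of what each connected client can decode.
final class CapabilityRegistry {
    private final Map<String, Set<DataLayout>> declared = new ConcurrentHashMap<>();

    // Called when a new SDK syncs its capabilities on connection.
    void sync(String clientId, Set<DataLayout> layouts) {
        Set<DataLayout> copy = EnumSet.noneOf(DataLayout.class);
        copy.add(DataLayout.V1); // every client is assumed to understand v1
        copy.addAll(layouts);
        declared.put(clientId, copy);
    }

    // Chooses the layout in which a stored message is sent to the client.
    DataLayout deliveryLayout(String clientId, DataLayout storedLayout) {
        // Clients that never synced are assumed to support v1 only.
        Set<DataLayout> supported =
                declared.getOrDefault(clientId, EnumSet.of(DataLayout.V1));
        if (supported.contains(storedLayout)) {
            return storedLayout; // deliver directly, no conversion
        }
        return DataLayout.V1; // reflow a v2 message into v1
    }
}
```

A client that has never synced receives v2-stored messages reflowed into v1, while a client that declared v2 support receives them unchanged. Defaulting to v1 keeps already-released SDKs working without any change on their side.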
There is no API/interface change involved.
Compatibility is built in by design. Future SDKs will enjoy all the benefits, while already-released client SDKs will still be supported as before.
We split this proposal into several tasks:
- Task1: Implement client capability sync on connection.
- Task2: Discuss the v2 data layout and the message header IDL.
- Task3: Store messages in v2 and deliver messages according to client capabilities.
- Task4: Release a remoting-based client SDK that supports the v2 data layout.
Keep the status quo and continue with the blockers mentioned above.
Copyright © 2016~2022 The Apache Software Foundation.