re_datastore: standardize the payload schema for insertion #435

Closed
Tracked by #520
teh-cmc opened this issue Dec 2, 2022 · 3 comments
Labels
🏹 arrow Apache Arrow ⛃ re_datastore affects the datastore itself

Comments

@teh-cmc

teh-cmc commented Dec 2, 2022

Standardize and put into writing the insert-payload schema:

  • should the entire payload be a list, for client-side batching?
  • how strict do we want to be? how dynamic do we want to be?
  • what are we allowed to do? what is forbidden?
    • is having the same component present multiple times legal?
    • is passing no instances legal?
    • etc
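
To make the questions above concrete, here is a minimal sketch of what a strict validation pass could look like. Everything in it is hypothetical: `Policy`, `PayloadError` and `validate` are illustrative names, not part of `re_datastore`, and the two flags stand for decisions this issue has not made yet. Components are reduced to `(name, number of instances)` pairs in place of real Arrow arrays.

```rust
use std::collections::HashSet;

/// Hypothetical knobs, one per open question listed in this issue.
#[derive(Debug, Clone, Copy)]
pub struct Policy {
    /// Is having the same component present multiple times legal?
    pub allow_duplicate_components: bool,
    /// Is passing no instances legal?
    pub allow_zero_instances: bool,
}

/// Hypothetical error type describing which rule was violated.
#[derive(Debug, PartialEq, Eq)]
pub enum PayloadError {
    DuplicateComponent(String),
    NoInstances,
}

/// Checks a list of `(component name, number of instances)` pairs against `policy`.
/// The pairs stand in for the real Arrow arrays (an assumption of this sketch).
pub fn validate(components: &[(&str, usize)], policy: Policy) -> Result<(), PayloadError> {
    let mut seen = HashSet::new();
    for (name, nb_instances) in components {
        // `insert` returns false when the name was already present.
        if !seen.insert(*name) && !policy.allow_duplicate_components {
            return Err(PayloadError::DuplicateComponent(name.to_string()));
        }
        if *nb_instances == 0 && !policy.allow_zero_instances {
            return Err(PayloadError::NoInstances);
        }
    }
    Ok(())
}
```

Keeping the rules in a single `Policy` value means a written-down schema can map each rule to one flag, whichever way the answers end up going.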
@teh-cmc

teh-cmc commented Dec 2, 2022

Example for reference:

This insertion:

    let (schema, components) = build_message(
        &ent_path,
        [build_log_time(now_plus_20ms), build_frame_nr(frame41)],
        [build_instances(nb_instances), build_rects(nb_instances)],
    );

turns into this schema:

schema: Schema {
    fields: [
        Field {
            name: "timelines",
            data_type: Struct(
                [
                    Field {
                        name: "log_time",
                        data_type: Timestamp(
                            Nanosecond,
                            None,
                        ),
                        is_nullable: false,
                        metadata: {
                            "RERUN:timeline": "Time",
                        },
                    },
                    Field {
                        name: "frame_nr",
                        data_type: Int64,
                        is_nullable: false,
                        metadata: {
                            "RERUN:timeline": "Sequence",
                        },
                    },
                ],
            ),
            is_nullable: false,
            metadata: {},
        },
        Field {
            name: "components",
            data_type: Struct(
                [
                    Field {
                        name: "instances",
                        data_type: List(
                            Field {
                                name: "item",
                                data_type: UInt32,
                                is_nullable: true,
                                metadata: {},
                            },
                        ),
                        is_nullable: false,
                        metadata: {},
                    },
                    Field {
                        name: "rects",
                        data_type: List(
                            Field {
                                name: "item",
                                data_type: Struct(
                                    [
                                        Field {
                                            name: "x",
                                            data_type: Float32,
                                            is_nullable: false,
                                            metadata: {},
                                        },
                                        Field {
                                            name: "y",
                                            data_type: Float32,
                                            is_nullable: false,
                                            metadata: {},
                                        },
                                        Field {
                                            name: "w",
                                            data_type: Float32,
                                            is_nullable: false,
                                            metadata: {},
                                        },
                                        Field {
                                            name: "h",
                                            data_type: Float32,
                                            is_nullable: false,
                                            metadata: {},
                                        },
                                    ],
                                ),
                                is_nullable: true,
                                metadata: {},
                            },
                        ),
                        is_nullable: false,
                        metadata: {},
                    },
                ],
            ),
            is_nullable: false,
            metadata: {},
        },
    ],
    metadata: {
        "RERUN:entity_path": "this/that",
    },
}

with this payload:

components: Chunk {
    arrays: [
        StructArray[{log_time: 2022-12-02 15:14:23.186512372, frame_nr: 41}],
        StructArray[{instances: [4262470174, 24667012, 2536452249], rects: [{x: 0, y: 0, w: 0, h: 0}, {x: 1, y: 1, w: 1, h: 1}, {x: 2, y: 2, w: 2, h: 2}]}],
    ],
}
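
For readers who find the Arrow debug output hard to scan, the same row can be mirrored in plain Rust. This is only a sketch: `Row`, `Rect`, `example_row` and `arity_matches` are hypothetical names, the types are ordinary structs rather than Arrow arrays, and the rule that every component list must have one entry per instance is an assumption, not something the datastore is confirmed to enforce.

```rust
/// Plain-Rust mirror of one `rects` item: Struct { x, y, w, h }, all Float32.
#[derive(Debug, Clone, PartialEq)]
pub struct Rect {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

/// Plain-Rust mirror of one row of the example payload (not the Arrow layout).
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub log_time_ns: i64,    // timelines.log_time: Timestamp(Nanosecond, None)
    pub frame_nr: i64,       // timelines.frame_nr: Int64
    pub instances: Vec<u32>, // components.instances: List<UInt32>
    pub rects: Vec<Rect>,    // components.rects: List<Struct>
}

/// Assumed rule: each component list carries exactly one entry per instance.
pub fn arity_matches(row: &Row) -> bool {
    row.instances.len() == row.rects.len()
}

/// Rebuilds the component values shown in the payload above.
/// The log time is passed in by the caller rather than hard-coded.
pub fn example_row(log_time_ns: i64) -> Row {
    let rects = (0..3)
        .map(|i| {
            let v = i as f32;
            Rect { x: v, y: v, w: v, h: v }
        })
        .collect();
    Row {
        log_time_ns,
        frame_nr: 41,
        instances: vec![4262470174, 24667012, 2536452249],
        rects,
    }
}
```

With the example values, `instances` and `rects` both hold three entries, so `arity_matches` returns `true`; dropping one rect makes it return `false`.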

@jondo2010

The work in #501 addresses a large part of this, at least in code, by factoring out matching encode/decode function pairs (implemented mostly as From conversions between helper types).
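
As a rough illustration of what a matching encode/decode pair built on From can look like, here is a self-contained sketch. The types `FrameNr` and `WireTimeline` are invented for this example and are not the helper types from #501; the "wire" form is a plain struct standing in for an Arrow field and value.

```rust
/// Hypothetical client-side helper type for the `frame_nr` timeline.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameNr(pub i64);

/// Hypothetical wire form: a named Int64 value, standing in for an Arrow column.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WireTimeline {
    pub name: String,
    pub value: i64,
}

/// Encode: helper type -> wire form.
impl From<FrameNr> for WireTimeline {
    fn from(frame: FrameNr) -> Self {
        WireTimeline {
            name: "frame_nr".to_owned(),
            value: frame.0,
        }
    }
}

/// Decode: wire form -> helper type.
impl From<WireTimeline> for FrameNr {
    fn from(wire: WireTimeline) -> Self {
        FrameNr(wire.value)
    }
}

/// Encodes then decodes; a matching pair should give back the input.
pub fn roundtrip(frame: FrameNr) -> FrameNr {
    FrameNr::from(WireTimeline::from(frame))
}
```

Writing both directions side by side makes a round-trip test trivial, which is the main benefit of pairing them. A real decoder would probably also have to reject malformed input, in which case `TryFrom` would fit better than `From`.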

The arity and duplication questions are not answered.

@teh-cmc

teh-cmc commented Mar 21, 2023

See #1619

@teh-cmc teh-cmc closed this as completed Mar 21, 2023