Kaitai Struct: use a field (u1) as a type chooser or as part of the next attribute (str) #1118

Open
Kaipheu opened this issue Jul 31, 2024 · 2 comments

Kaipheu commented Jul 31, 2024

Hello, thanks for this very useful project!

I've spent a day trying to describe part of a file format.

Two u1 "del" bytes are used as delimiters, and their value also determines the structure of the data between them. These blocks are chained, like this:

struct : del|content|del|del|content|del...

values : 01|AC4A55|01|03|0022AA|03...

So I can describe that as:

seq:
  - id: delimiter_start
    type: u1
    enum: content_enum_type
  - id: content
    type:
      switch-on: delimiter_start
      cases:
        "content_enum_type::one": type_one
        "content_enum_type::two": type_two
        "content_enum_type::three": type_three
  - id: delimiter_stop
    type: u1
types:
  type_one:
    seq:
      - id: a
        type: u1
      - id: b
        type: u1
      - id: c
        type: u1
  type_two:
    seq:
      # u2/u4 need a default endianness (meta/endian) or an explicit u2le/u2be
      - id: d
        type: u2
      - id: e
        type: u1

  type_three:
    seq:
      - id: f
        type: u4
enums:
  content_enum_type:
    0x1: one
    0x2: two
    0x3: three

This works well, but there is one case I can't handle: UTF-8 strings are stored without any delimiters, i.e.: del|content|del|string_content|del|content|del

This is where I'm stuck: I need to read delimiter_start, but in the string case not consume it, because that byte is already part of the string.

I've tried to use instances, but I can't get the position of delimiter_start. I can't create a substream to work with relative positions either, because the content length isn't fixed (so I can't use size) and the terminator can be one of two different values (so I can't use terminator).

Is there a solution I'm missing, or does a new feature need to be implemented?
Everything I have tried runs into the same problem: once delimiter_start has been read, I can't "un-read" it, which results in:
delimiter_start = H, string = ELLO WORLD.

I also tried this:

...
  - id: delimiter_start
    type: u1
    enum: content_enum_type
    if: delimiter_start == content_enum_type::one
#    if: delimiter_start == content_enum_type::two
#    if: delimiter_start == content_enum_type::three
...

But this doesn't work; the compiled JS looks like this:

...
if(this.delimiterStart == myType.content_enum_type.ONE){
   this.delimiterStart = this._io.readU1();
}
...

this.delimiterStart is still undefined when the condition is evaluated, because it is only assigned by the readU1() inside the if, so the condition is always false.
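
The effect can be reproduced outside Kaitai Struct. The following is a minimal sketch in plain JavaScript; `FakeStream`, `Block`, and `ONE` are hypothetical stand-ins that only mimic the shape of the generated code above. They are not the actual compiler output or the Kaitai runtime.

```javascript
// Hypothetical stand-in for a Kaitai stream: only readU1() is modelled.
class FakeStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
  }
  readU1() {
    return this.bytes[this.pos++];
  }
}

const ONE = 1; // stands in for myType.content_enum_type.ONE

// Mimics the generated pattern: the condition tests the very field
// that the guarded read is supposed to assign.
class Block {
  constructor(io) {
    this._io = io;
    if (this.delimiterStart == ONE) {
      this.delimiterStart = this._io.readU1();
    }
  }
}

const io = new FakeStream([0x01, 0xac, 0x4a, 0x55, 0x01]);
const block = new Block(io);
console.log(block.delimiterStart); // undefined: the guarded read never ran
console.log(io.pos);               // 0: no byte was consumed
```

Even though the first byte is 0x01, the comparison happens before anything is read, so the field is never populated and the stream position does not move.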

Member

GreyCat commented Aug 4, 2024

Sounds like your "stopping delimiter" is present in some cases and missing in some others. Why not include it into types (e.g. type_one, type_two, type_three), where it is necessary, but omit it from the UTF-8-related type where it's not needed?

Author

Kaipheu commented Aug 26, 2024

Hello, thank you for your answer!
Yes, I can do that, but a field that has no delimiters will have its first byte parsed as a delimiter and not as part of the field's content.
Something like: delimiter_start = H, string = ELLO WORLD.
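
To make the required behaviour concrete, here is a minimal hand-written JavaScript sketch of "look at the first byte, and consume it only if it is a delimiter". It is not Kaitai Struct code and does not show how to express this in a .ksy file. It rests on several assumptions that the thread does not confirm: the only delimiter values are 0x01 to 0x03, string bytes never use those values, delimited content never contains its own delimiter value, and an undelimited string runs until the next delimiter or the end of input. The names `DELIMITERS` and `parseBlock` are hypothetical.

```javascript
// Assumption: these are the only byte values that act as delimiters.
const DELIMITERS = new Set([0x01, 0x02, 0x03]);

// Parses one block starting at `pos`; returns the block and the next position.
function parseBlock(bytes, pos) {
  const first = bytes[pos]; // peek: inspect the byte without consuming it
  if (DELIMITERS.has(first)) {
    // Delimited block: consume the delimiter, then take everything up to
    // the closing delimiter (assumed to repeat the opening value).
    const end = bytes.indexOf(first, pos + 1);
    if (end === -1) throw new Error("missing closing delimiter");
    return {
      block: { delimiter: first, content: bytes.slice(pos + 1, end) },
      next: end + 1,
    };
  }
  // Undelimited string: `first` is already content, so nothing is skipped.
  let end = pos;
  while (end < bytes.length && !DELIMITERS.has(bytes[end])) end++;
  const text = new TextDecoder("utf-8").decode(bytes.slice(pos, end));
  return { block: { delimiter: null, content: text }, next: end };
}

// Sample input: 01|AC4A55|01, then "HELLO WORLD", then 03|0022AA|03.
const sample = Uint8Array.from([
  0x01, 0xac, 0x4a, 0x55, 0x01,
  ...new TextEncoder().encode("HELLO WORLD"),
  0x03, 0x00, 0x22, 0xaa, 0x03,
]);

const blocks = [];
for (let pos = 0; pos < sample.length; ) {
  const { block, next } = parseBlock(sample, pos);
  blocks.push(block);
  pos = next;
}
console.log(blocks[1].content); // "HELLO WORLD", with the leading H intact
```

The point of the sketch is only the order of operations: the decision is made on a byte that has not yet been consumed, so the string keeps its first character.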
