-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add stateless alternative to buffered's Reader class #112
Comments
I definitely agree with the general idea, but I'd prefer to see a signature like this: fun box peek_u32_be(seq: ReadSeq[U8] box, offset: USize): U32 ?
fun box peek_u32_le(seq: ReadSeq[U8] box, offset: USize): U32 ?
I don't think this is actually a common use case? I think pretty much everyone using a feature like this in Pony is using it to read either a network protocol or a file format, which is designed either as big-endian or little-endian as part of the protocol/format specification, regardless of the system architecture it's running on. So I'd be really surprised to see any code "in the wild" that doesn't make a static choice for big-endian vs little-endian. |
Totally agree with 1.
About
To give some context, this is a case where I would have needed it if I wanted to support multiple architecture for the same Pony app: I am doing a wrapper for SDL2. To deal with events (and emulate polymorphism in C) SDL2 returns a union of different structs. The first byte, give the event type and can be used to cast the union to the correct structure. What I did (by doing the same thing that the Rust wrapper) is generate a byte array big enough to fit all the structs of the union and gave it to SDL2 via FFI. Then, once SDL2 has filled it, I read the first byte to know the event type and then read the content using functions like To sump up, I think the following functions covers all cases while maintaining a simple API (except maybe for the functions' names) with good performance for the common cases. // The functions you probably want to use in 99% of cases.
fun box peek_u32_be(seq: ReadSeq[U8] box, offset: USize): U32 ?
fun box peek_u32_le(seq: ReadSeq[U8] box, offset: USize): U32 ?
// A one that just call one of the upper 2 depending of a parameter
type EndianMode is (BigEndian | LittleEndian)
fun box peek_u32(array: ReadSeq[U8] box, endianMode: EndianMode, offset: USize) : U32 ?
// And basically a one that retrieve the processor EndianMode and call the previous one
fun box peek_u32_processor_endian_mode(seq: ReadSeq[U8] box, offset: USize): U32 ? |
@codec-abc - Thanks for giving those concrete examples (TIFF and SDL2) to justify the motivation. You convinced me 😉. I like the idea you shared of providing both variants so that those who know the choice statically pay no extra perf cost, and get a more succinct syntax. 👍
Whoops, that was a bit of a typo. I should have said "I'd prefer to see it as a type parameter rather than a runtime parameter". Something like this: fun box peek_u32[E: EndianMode](array: Array[U8] box, offset: USize) : U32 ? However, I'm okay with the approach you shared above in your latest comment, so I don't think that's necessary. |
Since this issue is focused on a very small point it shouldn't be too hard to write a proper RFC (and even the implementation after that) which I will try to do in the near future. But before writing a proper one, I would like your opinion -and anyone interested too- on the following points:
|
It's unclear to me what the actual changes are. I use Reader in a lot of high performance code so I'm very interested in the actual changes to it and the possible performance ramifications. I'd rather see this does as a separate package/class that could be eventually brought together if we think that is appropriate. Ideally, I'd like to see these exist as a 3rd party library first. Evolve some there (as I suspect they are going to be prone to some API changes early on) and then moved over for inclusion in the standard library. |
The changes are to add the following stateless functions to the standard library fun peek_X_be(seq: ReadSeq[U8] box, offset: USize): X?
fun peek_X_le(seq: ReadSeq[U8] box, offset: USize): X?
fun peek_X[E: EndianMode](array: ReadSeq[U8] box, offset: USize): X?
fun peek_X_processor_endian_mode(seq: ReadSeq[U8] box, offset: USize): X? for any X in (U16 | I16 | U32 | I32 | U64 | I64 | U128 | I128 | F32 | F64 | String) and nothing more. Obviously, The Reader class could use it internally (as it provide more abstraction about reading data from a chunk of bytes) but it is not even necessary. |
Reader is stateful. If you are proposing adding stateless methods to it, while leaving the rest of Reader stateful, I think that's a confusing API. I think stateless methods such as these would make sense in a primitive of their own. |
I do agree. Also I think it makes sense to add the reverse ones at the same time, ie fun x_to_U8_be(x: X): ReadSeq[U8]
fun x_to_U8_le(x: X): ReadSeq[U8]
fun x_to_U8[E: EndianMode](x: X): ReadSeq[U8]
fun x_to_U8_processor_endian_mode(x: X): ReadSeq[U8] |
@SeanTAllen - As I understood it, the proposal was not to change the way That is, the idea is that you should be able to, for example, read a big-endian I32 from a particular offset in a |
The Reader class of the buffered package contains functions that are useful to read common basic type (such as F32, F64, U32, etc..) from a byte array. Unfortunately, those functions are not stateless and cannot be used in some wider context (my use case was a Array[U8 val] ref). It would be great to add stateless functions that Reader can be built on and Pony's users can reused too. They would have a signature similar to this one:
I suggest that it could be a great addition to add endian-less function that would pick automatically one of the two method depending on which architecture it is running to avoid code like this:
The text was updated successfully, but these errors were encountered: