Unable to make it work for nested types in different Schema Files. #44
Comments
@markfarnan I think the correct place to report this problem is https://issues.apache.org/jira/projects/AVRO
Thanks Martin. So I take it this should work (in theory!) the way I'm using it? Put another way, the Schema returned in the [0] element of the vector (in this case) SHOULD be a full schema? I didn't want to report an error if I'm just using it wrong.
I am not sure what the initial purpose of this API was.
Sounds good. Interestingly, if I derive the schemas from the objects (using --derive-schemas in rsgen-avro), the resulting schema works fine with the Writer. I've attached two files for reference in case they are useful: the first is the 'failing' parsed schema (the result of the println) that just uses the list; the second is the 'working' schema that I obtained from ProtocolException::get_schema().
It looks like the parse_list function is just putting in references, which are unresolvable, whereas get_schema() is putting in valid unions that embed the referred schema. Note: I don't think I can use derive-schemas for my project, as the resulting file from rsgen-avro has dozens of errors; some types don't support derive. I'll try to work out how to do that test / Jira for Avro as soon as I can, as using these Avro schemas is critical for us switching to Rust for some new work.
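To make the distinction concrete, here is a minimal sketch of "reference" versus "embedded" schemas. It uses a toy `Schema` enum, not apache_avro's types; the `inline_refs` helper and the field names (`error`, `message`, `code`) are assumptions made up for illustration, and it does not handle recursive types.

```rust
use std::collections::HashMap;

/// Toy schema model (NOT apache_avro's `Schema`): a field type is either a
/// primitive, a full record definition, or a by-name reference to another type.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Primitive(&'static str),
    Record { name: String, fields: Vec<(String, Schema)> },
    Ref(String),
}

/// Replace every `Ref` with the definition found in `known`, so the result is
/// fully self-contained. Returns an error naming the first unresolved type.
/// Note: a self-referencing type would recurse forever in this simple sketch.
fn inline_refs(schema: &Schema, known: &HashMap<String, Schema>) -> Result<Schema, String> {
    match schema {
        Schema::Primitive(_) => Ok(schema.clone()),
        Schema::Ref(name) => {
            let target = known
                .get(name)
                .ok_or_else(|| format!("unresolved reference: {}", name))?;
            inline_refs(target, known)
        }
        Schema::Record { name, fields } => {
            let mut resolved = Vec::with_capacity(fields.len());
            for (field_name, field_type) in fields {
                resolved.push((field_name.clone(), inline_refs(field_type, known)?));
            }
            Ok(Schema::Record { name: name.clone(), fields: resolved })
        }
    }
}

/// Hypothetical stand-in for a type defined in one schema file.
fn error_info() -> Schema {
    Schema::Record {
        name: "ErrorInfo".to_string(),
        fields: vec![
            ("message".to_string(), Schema::Primitive("string")),
            ("code".to_string(), Schema::Primitive("int")),
        ],
    }
}

/// Hypothetical stand-in for a type in another file that only refers to `ErrorInfo` by name.
fn protocol_exception() -> Schema {
    Schema::Record {
        name: "ProtocolException".to_string(),
        fields: vec![("error".to_string(), Schema::Ref("ErrorInfo".to_string()))],
    }
}
```

Calling `inline_refs(&protocol_exception(), &known)` with `ErrorInfo` registered in `known` yields a record whose `error` field holds the full definition; with an empty map it returns an error naming `ErrorInfo`, which is analogous in spirit to the resolution error reported in this issue.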
Indeed issues related to Avro
This is strange, yes please open a separate issue for this with a basic example to reproduce the issue.
This is strange too, please open a separate issue too. Beside, if I've understood correctly the This is unexpected indeed, but this issue should be addressed in |
JIRA and PR created in Apache Avro for this issue. I'll see if I can isolate some examples for the other issues. The SchemaDerive one is going to be more interesting, as the schema it breaks on has a dozen or so dependencies.
Are you including |
Yes: [dependencies] It's weird, because several things annotated with #[serde(with = "serde_bytes")] work fine; just a couple don't. The AvroSchema stuff breaks on some enums with "error: AvroSchema derive does not work for enums with non unit structs". It doesn't seem to like things like this (I'm too new at this to have a clue as to why): #[derive(Debug, PartialEq, Clone, serde::Serialize, apache_avro::AvroSchema)]
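For readers unfamiliar with the term in that error message, the sketch below contrasts a unit-only enum with one whose variants carry data. Both enums are hypothetical (they are not the types from this project, and no derive macro is applied); the mapping to Avro concepts in the comments is an assumption based on the wording of the error, not a statement of how apache_avro is implemented.

```rust
/// Unit-only enum: every variant is just a name, much like a symbol in an
/// Avro `enum`. Presumably this is the shape the derive macro accepts.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red,
    Green,
}

impl Color {
    /// The symbol name each variant would plausibly correspond to.
    fn symbol(self) -> &'static str {
        match self {
            Color::Red => "Red",
            Color::Green => "Green",
        }
    }
}

/// Enum with non-unit variants: some variants carry a payload, so a plain list
/// of symbols cannot describe it. Conceptually it is closer to an Avro union.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Long(i64),
    Text { value: String },
}

impl Value {
    /// Name of the Avro type each branch would plausibly map to (illustrative only).
    fn branch(&self) -> &'static str {
        match self {
            Value::Null => "null",
            Value::Long(_) => "long",
            Value::Text { .. } => "string",
        }
    }
}
```

An enum like `Value` is the kind of shape that would trigger the "non unit" complaint if the derive were applied to it, under the assumption above.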
See apache/avro#1921 |
Thanks. Unfortunately I'm too new at Rust for that to make a lot of sense yet! However, on a separate point: rsgen already knows a fully valid, resolved schema when it creates the objects in the first place (with all the extra fields, like doc and real namespaces). Would it be possible instead for rsgen to store that schema with the objects themselves, so that 'getSchema' actually returns a guaranteed valid schema, namely the parsed version that was used to create them in the first place? This is the approach taken by the Avro implementation in Go. It actually takes it a step further and creates Serialize and Deserialize functions that return valid Go objects, but that doesn't seem necessary with Serde if the schemas are correct.
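A rough sketch of what "storing the schema with the object" could look like is shown here. The `EmbeddedSchema` trait, the `schema_of` helper, and the `ErrorInfo` fields are all hypothetical names invented for this illustration; none of them are part of rsgen-avro or apache_avro, and the JSON text is a made-up stand-in for whatever the generator originally parsed.

```rust
/// Hypothetical trait: a generated type carries the exact schema text it was
/// generated from, so no schema has to be re-derived at runtime.
trait EmbeddedSchema {
    /// Raw schema JSON captured at code-generation time.
    const RAW_SCHEMA: &'static str;

    /// Convenience accessor returning the stored schema text.
    fn raw_schema() -> &'static str {
        Self::RAW_SCHEMA
    }
}

/// Stand-in for a generated struct; the field names are illustrative only.
#[allow(dead_code)]
struct ErrorInfo {
    message: String,
    code: i32,
}

impl EmbeddedSchema for ErrorInfo {
    const RAW_SCHEMA: &'static str = r#"{"type":"record","name":"ErrorInfo","namespace":"Energistics.Etp.v12.Datatypes","fields":[{"name":"message","type":"string"},{"name":"code","type":"int"}]}"#;
}

/// Generic helper: fetch the stored schema for any type that embeds one.
fn schema_of<T: EmbeddedSchema>() -> &'static str {
    T::raw_schema()
}
```

With this shape, `schema_of::<ErrorInfo>()` returns the stored JSON unchanged, including the namespace. A real implementation would still need to decide how types that depend on one another share or embed their definitions.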
This should be easy to do by providing an impl for AvroSchemaComponent |
It's not really implementable as long as Besides I have an implementation for this on branch impl-schema-component. It looks fine for simple cases but if the schema has dependencies etc, it might not work... |
The schema variable line 430 should be the expanded version of the schema to take into account of the dependencies. I was looking to add a recursive construction of the schema along the fields of the objects. |
It would be ideal if it:
As an aside, I've gotten the complex structs working, using PR 28 from serde_bytes plus merging a couple of other PRs for Avro (3684 for multi-schema, plus the 'fixed' ones) and some hacking for the nullable unions. If interested, this is all in branch 'etp-working' of https://github.com/markfarnan/avro/tree/master/lang/rust/avro
However, a lot of fixes are required to the output of rsgen-avro before it is usable. It won't even compile 'out of the box' due to some misplaced serde_bytes entries. I plan to narrow them all down and make requests once the upstream rust-avro and serde-bytes changes are merged and I can see the final decisions on them. If you're interested, the schemas and generated file are here: https://github.com/Bardasz/etp-rs/tree/main/schema
The most annoying change is to go through and put 'serde-bytes' above each 'uuid' entry (which is a type for [u8; 16]). Once the upstream is sorted, I'll try to turn any remaining errors into failing tests or simple examples.
Wow It might be interesting to have custom variant naming options for such cases, something like Looking forward to the upstream apache-avro update so that we can tackle the issues you mentioned ! Thanks for looking into all this. |
You can see why I want to rename it, I'm sure :). That complex nullable 'DataValue' union and the fixed [u8; 16] were the two main culprits of various problems.
just for notice:
should be corrected with #47 |
@markfarnan with the latest release I have generated code for ETP v1.2 using the following approach:
git clone https://bitbucket.org/energistics/etpv1.git
cd etpv1/
git checkout etpv1.2
cd ..
cargo new --bin gen-etp
cd gen-etp/
cargo add apache-avro serde
# then add `mod etp;` in the first line of main.rs
cd ..
rsgen-avro "etpv1/src/Schemas/Energistics/**/*" gen-etp/src/etp.rs
The only compilation error that I have now is that 3 versions of
Looking for guidance on how to make rsgen-avro and avro-rust work when there are multiple schema files, where one file relies on a type in another file.
rsgen-avro seems to create the schemas correctly.
Schema::parse_list seems to parse the schemas correctly.
Writer throws validation errors. (I've also tried using lower-level primitives, such as to_avro_datum; same issue.)
Line 81 in the sample code below: Err(SchemaResolutionError(Name { name: "ErrorInfo", namespace: Some("Energistics.Etp.v12.Datatypes") }))
Sample code to show the issue:
NOTES:
-- This is a subset of what I need to do. The full schema definition I need to use is 200+ .avsc files, with multiple nested / reused types.
-- I can't change the structure of the schema files, as this is a published industry standard in our domain for data interchange using a wire protocol. (I will actually use to_avro_datum for the final version, as we just want the raw bytes, not the headers etc.)
-- I'm getting some other errors with rsgen (serde-bytes, and being unable to use derive-schemas) that I might post separately. I can share the full set of schemas if it is useful for debugging.
Sample code showing the problem.