Cannot generate code from directories with cross dependencies #6

Closed
batconjurer opened this issue Nov 19, 2020 · 7 comments

Comments

@batconjurer
Contributor

Suppose I have a directory with two files: Thing.avsc and UUID.avsc with the following contents:

// Thing.avsc
{
	"name": "Thing",
	"type": "record",
	"fields": [
		{"name": "id", "type": "UUID"},
		{"name": "other", "type": "float"}
	]
}

// UUID.avsc
{
	"name": "UUID",
	"type": "record",
	"fields": [
		{"name": "bytes", "type": "bytes"}
	]
}

Code generation fails with Avro failure: Failed to parse schema: Unknown type: UUID because the code generator does not seem to support dependencies across files.

@lerouxrgd
Owner

Indeed, all files are currently expected to be standalone; cross-file definitions are not supported.

@lerouxrgd
Owner

I added a note about this limitation in the README.md.

In practice, though, Avro schemas are better when self-contained, as this is how they are written as the header of big compressed files (typically stored in Hadoop). Just out of curiosity, what would be your use case?

A first simple solution would be to require the schemas in the input directory to have names that reflect their inter-dependencies, something like 01_foo.avsc and 02_bar.avsc, and then just generate them in order.
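As a rough illustration of the naming-based approach, the sketch below sorts schema paths by file name before they would be handed to the generator. The function name `order_by_name` is hypothetical and not part of the crate; it uses only the standard library.

```rust
use std::path::PathBuf;

/// Hypothetical helper: sorts schema paths by file name, so that
/// `01_foo.avsc` is processed before `02_bar.avsc`.
/// The comparison is lexicographic, so numeric prefixes need zero padding
/// (`02_` sorts before `10_`, but `2_` would sort after `10_`).
fn order_by_name(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    paths.sort_by(|a, b| a.file_name().cmp(&b.file_name()));
    paths
}
```

Each schema would then be generated in the returned order, with previously parsed types made available to the ones that follow.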

A better solution would be to build a proper dependency graph (with petgraph for instance). The tricky part is that schemas can be recursively nested. This is currently handled for self-contained schemas through the function deps_stack, defined as follows:

/// Utility function to find the ordered, nested dependencies of an Avro `schema`.
/// Explores nested `schema`s in a breadth-first fashion, pushing them on a stack
/// at the same time in order to have them ordered.
/// It is similar to traversing the `schema` tree in a post-order fashion.
fn deps_stack(schema: &Schema) -> Vec<&Schema> { }

In any case, handling schemas defined across files would start from there, I think.
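To make the dependency-graph idea concrete, here is a minimal sketch of ordering schemas across files with a depth-first, post-order traversal. It is not the crate's implementation and does not use petgraph or avro-rs: `generation_order` and its `deps` map (schema name to the names of the types it references from other files) are assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper: returns schema names ordered so that every schema
/// appears after the schemas it references.
fn generation_order(deps: &BTreeMap<String, Vec<String>>) -> Result<Vec<String>, String> {
    // Post-order visit: a schema is emitted only after all of its dependencies.
    fn visit(
        name: &str,
        deps: &BTreeMap<String, Vec<String>>,
        done: &mut BTreeSet<String>,
        in_progress: &mut BTreeSet<String>,
        order: &mut Vec<String>,
    ) -> Result<(), String> {
        if done.contains(name) {
            return Ok(());
        }
        // Seeing a name that is still being visited means the files form a cycle.
        if !in_progress.insert(name.to_string()) {
            return Err(format!("cyclic dependency involving {}", name));
        }
        let children = deps
            .get(name)
            .ok_or_else(|| format!("unknown type: {}", name))?;
        for child in children {
            visit(child, deps, done, in_progress, order)?;
        }
        in_progress.remove(name);
        done.insert(name.to_string());
        order.push(name.to_string());
        Ok(())
    }

    let mut done = BTreeSet::new();
    let mut in_progress = BTreeSet::new();
    let mut order = Vec::new();
    for name in deps.keys() {
        visit(name, deps, &mut done, &mut in_progress, &mut order)?;
    }
    Ok(order)
}
```

With the two files from the original report, a map of `Thing -> [UUID]` and `UUID -> []` yields the order `UUID`, then `Thing`. Treating every cross-file cycle as an error is a simplification of this sketch; a real implementation would need to decide how to handle recursive types.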

@batconjurer
Contributor Author

batconjurer commented Nov 30, 2020 via email

@lerouxrgd
Owner

Following your PR to avro-rs, released in 0.13.0, I added basic support for cross dependencies in the branch cross-deps.

Could you give it a try and let me know if that works for you?

@batconjurer
Contributor Author

Wow, you got that done very quickly! Yes, it looks like it works very well.

@lerouxrgd
Owner

OK, nice!

I'll try to use a glob pattern instead of a directory path so that nested directories can be taken into account.

After that I'll merge it to master.

@lerouxrgd
Owner

lerouxrgd commented Feb 1, 2021

This has been merged to master and released as 0.9.0.
