How to handle ZOD schemas at scale? Nominal ZOD schemas? #3909

Open
gabe-cc opened this issue Dec 12, 2024 · 0 comments

gabe-cc commented Dec 12, 2024

Hi,

I have a bunch of related problems that arise from scaling to many ZOD schemas while ZOD is fully structural and nameless.

I was writing about them internally, but I thought it might be worth discussing here too.

  1. I'm interested in the best ways to manage each of these problems in a ZOD paradigm.
  2. I'm interested in discussing the best way to introduce names to ZOD. It might be that there's a trick that lets one do it as a separate library, that it must be integrated into ZOD itself, or that, for structural or roadmap reasons, it should be designed as an alternative library focusing on that use case.

A few problems

Let's consider the following deep schema (weewooZ).

const foobarZ = z.object({
  foo : z.number() ,
  bar : z.string() ,
}) ;
type Foobar = z.infer<typeof foobarZ> ;

const weewooZ = z.array(z.object({
  wee : foobarZ ,
  woo : whateverZ
})) ;
type Weewoo = z.infer<typeof weewooZ> ;

A) Bad errors on deep schemas

I get a bad error message when I use weewooZ.parse on [ { wee : { foo : "42" , bar : "lol" } , woo : ... } ].
I get something like error at location 0.wee.foo, foo had the wrong type.
What I would want is instead two errors:

  1. validating weewooZ, at location 0.wee, { foo : "42" , bar : "lol" } did not follow the schema foobarZ
  2. validating foobarZ, at location foo, "42" did not match type number
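
To make the desired output concrete, here is a minimal sketch of a validator where a named sub-schema opens a new error frame. Everything in it (`Frame`, `Validator`, `named`, `object`, `array`) is hypothetical and written from scratch for illustration; it is not ZOD's API, and `woo` is assumed to be a string because `whateverZ` is left unspecified above.

```typescript
// Hypothetical mini-validator, NOT ZOD's API: just enough to show named error frames.
type Frame = { schema: string; path: string; message: string };
type Validator = (value: unknown, path: string[], schema: string) => Frame[];

// Build a single error frame for the schema currently being validated.
const fail = (schema: string, path: string[], value: unknown, what: string): Frame[] => [
  { schema, path: path.join("."), message: JSON.stringify(value) + " " + what },
];

const num: Validator = (v, path, schema) =>
  typeof v === "number" ? [] : fail(schema, path, v, "did not match type number");
const str: Validator = (v, path, schema) =>
  typeof v === "string" ? [] : fail(schema, path, v, "did not match type string");

const object = (fields: Array<[string, Validator]>): Validator => (v, path, schema) => {
  if (typeof v !== "object" || v === null) return fail(schema, path, v, "is not an object");
  const record = v as { [key: string]: unknown };
  let frames: Frame[] = [];
  for (const [key, check] of fields) {
    frames = frames.concat(check(record[key], path.concat(key), schema));
  }
  return frames;
};

const array = (item: Validator): Validator => (v, path, schema) => {
  if (!Array.isArray(v)) return fail(schema, path, v, "is not an array");
  let frames: Frame[] = [];
  v.forEach((element: unknown, index: number) => {
    frames = frames.concat(item(element, path.concat(String(index)), schema));
  });
  return frames;
};

// `named` opens a new frame: inner errors are located relative to the named schema,
// and the enclosing schema gets one extra frame pointing at the reference.
const named = (name: string, inner: Validator): Validator => (v, path, schema) => {
  const innerFrames = inner(v, [], name);
  if (innerFrames.length === 0) return [];
  return fail(schema, path, v, "did not follow the schema " + name).concat(innerFrames);
};

const foobarZ = named("foobarZ", object([["foo", num], ["bar", str]]));
const weewooZ = array(object([["wee", foobarZ], ["woo", str]]));

const frames = weewooZ([{ wee: { foo: "42", bar: "lol" }, woo: "ok" }], [], "weewooZ");
for (const f of frames) {
  console.log("validating " + f.schema + ", at location " + f.path + ", " + f.message);
}
```

With this input the sketch produces two frames, one for weewooZ at 0.wee and one for foobarZ at foo, which is the shape of error I am after.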

B) Unrelated inferred types

It is obvious to us that Foobar and Weewoo[number]["wee"] are one and the same. But it isn't to Typescript! As a result, error messages are just terrible and always show the fully unfolded types.

Sometimes, quite unpredictably, it gets worse, and Typescript silently fails to prove the equality between two types, without returning an error.

C) Transpilation

When I transpile ZOD schemas to something else (zod2lol), I lose all the structure that comes from the schemas referencing each other.

Here, if I write a transpiler as a fold on weewooZ, I lose the fact that wee came from a reference to foobarZ. At best, this leads to code duplication, but at worst, I lose a reference that was actually semantically meaningful.
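
As an illustration of what a reference-preserving fold could look like, here is a small sketch over a hand-written schema tree with an explicit `ref` node. The `SchemaNode` type, `emit`, and the `env` list are all hypothetical names made up for this example; ZOD does not expose such a node today, and `woo` is assumed to be a string.

```typescript
// Hypothetical schema tree with an explicit `ref` node; none of this is ZOD's API.
type SchemaNode =
  | { kind: "number" }
  | { kind: "string" }
  | { kind: "array"; item: SchemaNode }
  | { kind: "object"; fields: Array<[string, SchemaNode]> }
  | { kind: "ref"; name: string };

// The fold: each node kind maps to a TypeScript type expression.
// A `ref` is emitted by name instead of being unfolded.
function emit(node: SchemaNode): string {
  switch (node.kind) {
    case "number":
      return "number";
    case "string":
      return "string";
    case "array":
      return "Array<" + emit(node.item) + ">";
    case "object":
      return "{ " + node.fields.map(([key, field]) => key + ": " + emit(field)).join("; ") + " }";
    case "ref":
      return node.name;
  }
}

// The environment: every named schema is declared once.
const env: Array<[string, SchemaNode]> = [
  ["Foobar", { kind: "object", fields: [["foo", { kind: "number" }], ["bar", { kind: "string" }]] }],
  [
    "Weewoo",
    {
      kind: "array",
      item: { kind: "object", fields: [["wee", { kind: "ref", name: "Foobar" }], ["woo", { kind: "string" }]] },
    },
  ],
];

const output = env.map(([name, node]) => "type " + name + " = " + emit(node) + ";").join("\n");
console.log(output);
```

Here the emitted `Weewoo` mentions `Foobar` by name rather than duplicating its fields, so the reference survives the transpilation.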

Names in ZOD

I am not sure what the best way to work around this is.
I have learnt about .brand, but it solves none of the above. Typescript does not treat two types with the same brand as equal: { __brand : 'a' , foo : number } and { __brand : 'a' , foo : SomeComplexYetUnresolvedType } are not considered equal.

I have two ideas, both of which seem quite expensive.

(i) ZOD Environments

Add a notion of "schema environments" to ZOD.

On usage, it might look like:

import { Environment as ZodEnvironment } from 'zod' ;
const z = new ZodEnvironment() ;

const foobarZ = z.object(...).name('Foobar') ;
const weewooZ = z.object({
  wee : z.name('Foobar') , // or `foobarZ` and it fetches the name from it directly
  ...
}).name('Weewoo') ;

Then, when validating (A) or transpiling (C), the references are tracked within each schema and can be resolved through the environment.

This doesn't address problem B) unfortunately. A way to do so might be to have a nice pattern to tell ZOD that a sub-schema has a specific TS type? Just having a natural pattern supported by ZOD would go a long way.
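
One candidate pattern is to declare the named type by hand as an interface and annotate the schema with it, so that the inferred type is the named interface rather than an unfolded object type. The sketch below uses a local stand-in (`Schema<T>`, `Infer`, `isRecord`, `Same` are all hypothetical helpers defined here, not ZOD's API) so that it runs without any dependency. My understanding is that the analogous annotation in ZOD is `z.ZodType<Foobar>`, as used for recursive schemas, at the cost of losing object-specific methods on the annotated schema; I have not verified how well that scales. `woo` is assumed to be a string.

```typescript
// Local stand-in for a schema that carries its output type; hypothetical, not ZOD's API.
interface Schema<T> {
  parse(value: unknown): T;
}
type Infer<S> = S extends Schema<infer T> ? T : never;

// Named types declared by hand as interfaces, so the compiler keeps their names.
interface Foobar {
  foo: number;
  bar: string;
}
interface WeewooItem {
  wee: Foobar;
  woo: string;
}

const isRecord = (v: unknown): v is { [key: string]: unknown } =>
  typeof v === "object" && v !== null;

// The annotation `Schema<Foobar>` pins the output type to the named interface.
const foobarZ: Schema<Foobar> = {
  parse(value) {
    if (isRecord(value)) {
      const foo = value["foo"];
      const bar = value["bar"];
      if (typeof foo === "number" && typeof bar === "string") return { foo, bar };
    }
    throw new Error("value did not follow the schema Foobar");
  },
};

const weewooZ: Schema<WeewooItem[]> = {
  parse(value) {
    if (!Array.isArray(value)) throw new Error("value is not an array");
    return value.map((item: unknown): WeewooItem => {
      if (!isRecord(item)) throw new Error("item is not an object");
      const woo = item["woo"];
      if (typeof woo !== "string") throw new Error("woo is not a string");
      return { wee: foobarZ.parse(item["wee"]), woo };
    });
  },
};

type Weewoo = Infer<typeof weewooZ>;

// Compile-time check: the inferred field type is mutually assignable with `Foobar`.
type Same<A, B> = [A] extends [B] ? ([B] extends [A] ? true : false) : false;
const weeIsFoobar: Same<Weewoo[number]["wee"], Foobar> = true;

const parsed: Weewoo = weewooZ.parse([{ wee: { foo: 42, bar: "lol" }, woo: "ok" }]);
```

The duplication between the interface and the schema is the obvious downside, which is why first-class support for such a pattern would help.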

(ii) Type DSL with Names

It might be that ZOD should not be pushed beyond what it is great at. And what it is great at is validation of self-contained schemas and deep integration with a JS/TS project.

If we want more, like keeping track of schemas calling each other and generating great Typescript types, we might need to make more compromises.

What I am thinking of is:

  1. A type DSL with names (a strong candidate could be JSON Schema with its clunky $id, $ref and $defs)
  2. Static-time transpilation of files written in this DSL to Typescript (such as https://www.npmjs.com/package/json-schema-to-typescript)
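
For reference, this is roughly how the running example could look in JSON Schema, written here as a TypeScript constant so that it can be inspected in code. The `$id` URL is a made-up placeholder, `woo` is assumed to be a string, and `refName` is a hypothetical helper that only handles local `#/$defs/<Name>` references.

```typescript
// Assumed JSON Schema document for the running example; the $id is a placeholder.
const weewooSchema = {
  $id: "https://example.com/weewoo.schema.json",
  $defs: {
    Foobar: {
      type: "object",
      properties: { foo: { type: "number" }, bar: { type: "string" } },
      required: ["foo", "bar"],
    },
  },
  type: "array",
  items: {
    type: "object",
    properties: {
      wee: { $ref: "#/$defs/Foobar" }, // the reference is kept by name
      woo: { type: "string" },
    },
    required: ["wee", "woo"],
  },
};

// Extract the definition name from a local reference, or undefined for anything else.
function refName(ref: string): string | undefined {
  const prefix = "#/$defs/";
  return ref.indexOf(prefix) === 0 ? ref.slice(prefix.length) : undefined;
}

const weeRef = refName(weewooSchema.items.properties.wee.$ref);
const definedNames = Object.keys(weewooSchema.$defs);
```

The point is that `wee` carries a name that a validator or transpiler can look up, instead of an inlined copy of the Foobar structure.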

This solves all 3 problems in one go. But this captures something different from ZOD: a web of types, instead of complex validators.

The main disadvantage of this approach compared to ZOD is that schemas must be written in a separate file, and that support for JS refinements would be clunky at best.
I think those are major drawbacks, which is why I think that option (i), with some nice pattern to address problem B), would be better.

(iii) Other library that I'm not aware of??

I checked other libraries, but it might be that there is one that satisfies all of this that I'm not aware of.


Assuming that solution (i) makes sense, what would be the best way to go about implementing it?


Cheers
