Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

It’s too easy to use a schema or document with errors #709

Closed
Tracked by #641
SimonSapin opened this issue Oct 19, 2023 · 13 comments · Fixed by #752
Closed
Tracked by #641

It’s too easy to use a schema or document with errors #709

SimonSapin opened this issue Oct 19, 2023 · 13 comments · Fixed by #752
Assignees
Labels
apollo-compiler issues/PRs pertaining to semantic analysis & validation

Comments

@SimonSapin
Copy link
Contributor

This is breaking change that we should either make before 1.0 leaves beta, or decide not to make.

Status quo

Typical intended usage of apollo-compiler 1.0.0-beta.4 looks like this:

let schema = apollo_compiler::Schema::parse(schema_input, "schema.graphql");
schema.validate()?;
let doc = apollo_compiler::ExecutableDocument::parse(&schema, input, "doc.graphql");
doc.validate(&schema);
do_something_with(&schema, &doc)

However it’s very easy to forget validation:

let schema = apollo_compiler::Schema::parse(schema_input, "schema.graphql");
let doc = apollo_compiler::ExecutableDocument::parse(&schema, input, "doc.graphql");
do_something_with(&schema, &doc)

… and get a schema and document that look fine and contain some of the expected things, but in fact may have error and silently be missing other things. Specifically, two categories of errors can be encountered before running "full validation":

  • Parse errors (invalid grammar or limit exceeded)
  • So-called "build errors" during conversion to a high-level data structure. For example: inserting a definition in a name-indexed map can fail if the name was already used.

In either case an error is recorded and returned by .validate(). But in the second usage example they’ll be ignored and may cause unexpected results later, when they’re harder to debug.

Full validation as well is a property that may be useful to encode in the Rust type system. Maybe even ExecutableDocument should only be created with a fully valid schema?

Proposal

Change parse methods to return a Result in order to force callers to handle the error case (even if with unwrap).

This result should not always be tied to that of full validation. Federation for example needs to parse subgraph schemas that are invalid per spec because their root Query type may be implicit. Federation code fills in the blanks by programmatically mutating a Schema before calling validation.

If each different state is a different Rust type we may need to pick names for up to 4 types just for schemas:

Schema::parse() -> Result<Schema, SchemaWithParseOrBuildError>
Schema::validate(self) -> Result<ValidSchema, InvalidSchema>

… and similarly for executable documents.

Unless parse runs full validation but only when it encounters a build or parse error? In which case both results could share a InvalidSchema error type.

@SimonSapin SimonSapin added the apollo-compiler issues/PRs pertaining to semantic analysis & validation label Oct 19, 2023
@SimonSapin
Copy link
Contributor Author

Sketch of a possible API:

/// Immutable, validated. Execution would require this.
///
/// FIXME: keep this name? "Document" is made implicit
struct Executable {}

/// Mutable, may or may not be valid
///
/// FIXME: some other name?
struct PartialExecutable {}

/// Immutable, validated.
/// 
/// `Executable` and `PartialExecutable` require this
struct Schema {}

/// Mutable, may or may not be valid
///
/// FIXME: some other name?
struct PartialSchema {}

struct WithErrors<T> {
    pub errors: DiagnosticList,

    /// Some components may be omitted when `errors` has a corresponding diagnostic
    pub document: T,
}


impl Schema {
    /// If `Err`, `DiagnosticList` may contain any kind of error
    pub fn parse_and_validate(
        source_text: impl Into<String>,
        path: impl AsRef<Path>,
    ) -> Result<Self, WithErrors<PartialSchema>> {}
}

impl PartialSchema {
    /// If `Err`, `DiagnosticList` may contain parse errors or/and build errors
    pub fn parse(
        source_text: impl Into<String>,
        path: impl AsRef<Path>,
    ) -> Result<Self, WithErrors<Self>> {}

    /// If `Err`, `DiagnosticList` only contains validation errors
    pub fn validate(self) -> Result<Schema, WithErrors<Self>> {}
}


impl Executable {
    /// If `Err`, `DiagnosticList` may contain any kind of error
    pub fn parse_and_validate(
        schema: &Schema,
        source_text: impl Into<String>,
        path: impl AsRef<Path>,
    ) -> Result<Self, WithErrors<PartialExecutable>> {}
}

impl PartialExecutable {
    /// If `Err`, `DiagnosticList` may contain parse errors or/and build errors
    pub fn parse(
        schema: &Schema,
        source_text: impl Into<String>,
        path: impl AsRef<Path>,
    ) -> Result<Self, WithErrors<Self>> {}

    /// If `Err`, `DiagnosticList` only contains validation errors
    pub fn validate(self) -> Result<Executable, WithErrors<Self>> {}
}

impl ast::Document {
    /// If `Err`, `DiagnosticList` only contains parse errors
    pub fn parse(
        source_text: impl Into<String>,
        path: impl AsRef<Path>,
    ) -> Result<Self, WithErrors<Self>> {}

    // FIXME: names of these four methods?

    /// If `Err`, `DiagnosticList` only contains build errors
    pub fn to_partial_schema(&self) -> Result<PartialSchema, WithErrors<PartialSchema>> {}

    /// If `Err`, `DiagnosticList` may contains build errors and/or validation errors
    pub fn to_schema_validate(&self) -> Result<Schema, WithErrors<PartialSchema>> {}

    /// If `Err`, `DiagnosticList` only contains build errors
    pub fn to_partial_executable(
        &self,
        schema: &Schema,
    ) -> Result<PartialExecutable, WithErrors<PartialExecutable>> {}

    /// If `Err`, `DiagnosticList` may contains build errors and/or validation errors
    pub fn to_executable_validate(
        &self,
        schema: &Schema,
    ) -> Result<Executable, WithErrors<PartialExecutable>> {}    
}

@SimonSapin
Copy link
Contributor Author

Also:

impl Schema {
    pub fn into_partial(self) -> PartialSchema {}
}

impl Executable {
    pub fn into_partial(self) -> PartialExecutable {}
}

@qwerdenkerXD
Copy link

Having all those different types is kinda confusing to me, why don't you do sth. like the following?

Schema::parse(...) -> Result<Self, DiagnosticList>  // parsing, including validation
Schema::parse_unchecked(...) -> Self  // only parsing, as is the current behaviour of parse(...)

and same thing for ExecutableDocument

@SimonSapin
Copy link
Contributor Author

It’s possible, but would sacrifice encoding in the type system "this schema/document is valid" and "this function can only be called with a valid schema". Would things be easier to understand with a wrapper type?

/// Mutable, may or may not be valid
struct Executable {}

/// Mutable, may or may not be valid
struct Schema {}

/// Immutable (`Arc::make_mut` not exposed), can only be obtained from `validate`, implements Deref
struct Valid<T>(Arc<T>);

struct WithErrors<T> {
    pub errors: DiagnosticList,
    pub document: T,
}

impl Schema {
    fn parse_and_validate(self) -> Result<Valid<Self>, WithError<Self>>
    fn parse() -> Result<Self, WithError<Self>>
    fn validate(self) -> Result<Valid<Self>, WithError<Self>>
}

@qwerdenkerXD
Copy link

Yes, the Valid-wrapper is way more intuitive. How are validation warnings handled?

@SimonSapin
Copy link
Contributor Author

Hmm good point! I kinda forgot about those 😅

Like now, if validation finds both errors and warnings they’d end up in the same DiagnosticList in Result::Err. But for the only-warnings case they should be part of Result::Ok somehow. I’d rather not make it part of Valid<T>: after you either print, log, or ignore warnings they don’t need to be kept around while the document continues being used. What if Ok contained a tuple? Then by symmetry Err could too:

fn validate(self) -> Result<(Valid<Self>, DiagnosticList), (Self, DiagnosticList)>

… but part of the motivation for the WithError type was to make it implement Debug such that validate().unwrap() prints (only) diagnostics. Which would leave:

fn validate(self) -> Result<(Valid<Self>, DiagnosticList), WithErrors<Self>>

Or, equivalent but maybe feels less awkward, rename WithErrors to WithDiagnostics and:

fn validate(self) -> Result<WithDiagnostics<Valid<Self>>, WithDiagnostics<Self>>

@SimonSapin
Copy link
Contributor Author

At this point I wonder: should warnings be part of validate in the first place? What if we declare that validate only returns validation errors, and there’s a separate method like fn lint(&self) -> DiagnosticList for warnings? Would anyone call it? If not, is it worth having non-error diagnostics at all in apollo-compiler when there’s only two?

@goto-bus-stop
Copy link
Member

I think there is a good argument for splitting them up as you currently have to filter out warnings and "advice" in the most common use cases.

@qwerdenkerXD
Copy link

I agree with splitting them up. In the case of the mentioned WithDiagnostics-wrapper it would also be a little bit contradictory if you had an empty list.

@qwerdenkerXD
Copy link

Concerning the Valid-wrapper, I think that it doesn't really solve the issue as you're still able to easily forget validation when using the unvalidated parse function because you have access to all functions of the Schema or Executable.
Maybe the first sketch of the API is really the better, as the partial/unvalidated variant enables modification only and evolves via successful validation.

@SimonSapin
Copy link
Contributor Author

Sorry this was unclear, but the idea is that ExecutableDocument::parse would take a &Valid<Schema> parameter instead of &Schema. Similarly, execution and introspection APIs would take &Valid<ExecutableDocument>

@qwerdenkerXD
Copy link

ah then it's fine, didn't know you're working on an execution api, very cool

@SimonSapin
Copy link
Contributor Author

By the way, the PR linked above contains a full execution engine but only to support introspection. The current plan is to keep it private at first, and later if someone is interested consider polishing it for other use cases. Some execution-related things like input variable coercion will likely be public though.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
apollo-compiler issues/PRs pertaining to semantic analysis & validation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants