Add option to change error constructor #444

Open
mysteriouslyseeing opened this issue Dec 2, 2024 · 0 comments · May be fixed by #445
Labels
enhancement New feature or request

Comments

@mysteriouslyseeing

Following up from discussion in #440:
The current implementation constructs a token's Error type using its Default implementation. That works fine as a default, but it becomes a problem if you want the error to reflect the Lexer's current span, for example to show users of a lexer exactly where an error occurred. It's not impossible: the current workaround is to attach a catch-all regex with an error-returning callback to an arbitrary token variant, like the following:

use logos::Logos;

#[derive(Logos)]
#[logos(error = String)]
enum Token {
    #[token("a", priority = 1)]
    A,
    #[token("b", priority = 1)]
    // Catch-all workaround: any other character becomes an error that records the span.
    #[regex(".", callback = |lex| {
        Err::<(), String>(format!("Syntax error at {:?}: unrecognised character '{}'", lex.span(), lex.slice()))
    })]
    B,
}

Note that you have to add priority = 1 to both A and B because "." also matches "a" and "b". You also have to spell out the type parameters of the Result (Err::<(), String>) because Rust cannot infer the Ok type on its own.

A cleaner solution would be to let users supply their own constructor for Error, used in place of Default::default(), through an attribute such as #[logos(error_callback = ...)] or something similar. With it, the previous example would look like this:

use logos::Logos;

#[derive(Logos)]
#[logos(error = String)]
#[logos(error_callback = |lex| {
    format!("Syntax error at {:?}: unrecognised character '{}'", lex.span(), lex.slice())
})]
enum Token {
    #[token("a")]
    A,
    #[token("b")]
    B,
}
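
To make the intended behaviour concrete, here is a minimal sketch that does not use logos at all. MiniLexer, next_with and lex_all are hypothetical names invented for this illustration, and the sketch assumes the error constructor would receive a shared reference to the lexer so it can read span() and slice(). It only shows the difference between building the error from Default and building it from a callback that sees the lexer state; it is not how logos is implemented.

```rust
use std::ops::Range;

#[derive(Debug, PartialEq)]
enum Token {
    A,
    B,
}

/// Hypothetical stand-in for a lexer (NOT the logos API).
struct MiniLexer<'s> {
    source: &'s str,
    pos: usize,
    span: Range<usize>,
}

impl<'s> MiniLexer<'s> {
    fn new(source: &'s str) -> Self {
        MiniLexer { source, pos: 0, span: 0..0 }
    }

    /// Byte range of the most recently lexed character.
    fn span(&self) -> Range<usize> {
        self.span.clone()
    }

    /// Text covered by the current span.
    fn slice(&self) -> &'s str {
        &self.source[self.span.clone()]
    }

    /// Lex one character. On failure the error is built by `make_error`,
    /// which can inspect the lexer, instead of by `Default::default()`.
    fn next_with<E>(&mut self, make_error: impl Fn(&Self) -> E) -> Option<Result<Token, E>> {
        let ch = self.source[self.pos..].chars().next()?;
        self.span = self.pos..self.pos + ch.len_utf8();
        self.pos = self.span.end;
        Some(match ch {
            'a' => Ok(Token::A),
            'b' => Ok(Token::B),
            _ => Err(make_error(&*self)),
        })
    }
}

/// Lex the whole input, reporting the span and text of each bad character.
fn lex_all(source: &str) -> Vec<Result<Token, String>> {
    let mut lex = MiniLexer::new(source);
    let mut out = Vec::new();
    while let Some(item) = lex.next_with(|lex| {
        format!(
            "Syntax error at {:?}: unrecognised character '{}'",
            lex.span(),
            lex.slice()
        )
    }) {
        out.push(item);
    }
    out
}
```

With this sketch, lex_all("ab!") yields Ok(Token::A), Ok(Token::B) and then Err("Syntax error at 2..3: unrecognised character '!'"), whereas passing |_| String::default() as the constructor yields an empty string with no location information. The token variants need no priority tweaks or catch-all regex, which is the main ergonomic gain the proposed attribute is after.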