Inconsistent tokens between the same struct defined inside vs outside of function #49604


Closed
dtolnay opened this issue Apr 2, 2018 · 5 comments
Labels
A-decl-macros-2-0 Area: Declarative macros 2.0 (#39412)

Comments

@dtolnay
Member

dtolnay commented Apr 2, 2018

The following script reproduces the issue as of rustc 1.26.0-nightly (06fa27d 2018-04-01).

The repro crate defines two identical structs, S1 at the top level outside of a function and S2 inside of a function. The repro_derive crate prints the Debug representation of the first four input token trees. For both structs the first four token trees are #, [doc = "..."], #, [doc = "..."].

Pay attention to the tokens corresponding to the = signs that come after "doc". For S1 (defined outside of function), the first and second = tokens both say kind: Tree.

TokenStream {
    kind: Tree(
        Token(
            Span {/* ... */},
            Eq
        )
    )
},

But in S2 (defined within a function), the first = token is kind: Tree while the second = token is kind: JointTree. It would be good to understand what is causing the inconsistency between the tokenization of S1 vs S2.

TokenStream {
    kind: JointTree(
        Token(
            Span {/* ... */},
            Eq
        )
    )
},
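For context on what the "joint" marker is meant to convey, here is a minimal sketch in plain Rust. It is a simplified model, not rustc's tokenizer: the `Spacing` enum only mirrors the names of `proc_macro::Spacing`, and `is_punct` and `punct_spacing` are hypothetical helpers invented for this illustration. The model marks a punctuation character as joint only when another punctuation character follows it immediately, so the `=` in a `doc = "..."` attribute would be expected to come out as alone.

```rust
// Simplified model of "joint" vs "alone" punctuation.
// Assumption: this is an illustration only and does not reflect how rustc
// actually builds its token streams.

#[derive(Debug, PartialEq, Clone, Copy)]
enum Spacing {
    /// The punctuation character is followed by something other than punctuation.
    Alone,
    /// The punctuation character is immediately followed by more punctuation.
    Joint,
}

/// Characters treated as punctuation in this sketch (a hypothetical subset).
fn is_punct(c: char) -> bool {
    "=<>!&|+-*/%^.:,;#".contains(c)
}

/// Returns each punctuation character of `src` with its spacing.
/// Limitation: string literal contents are not skipped, so keep inputs simple.
fn punct_spacing(src: &str) -> Vec<(char, Spacing)> {
    let chars: Vec<char> = src.chars().collect();
    let mut out = Vec::new();
    for (i, &c) in chars.iter().enumerate() {
        if !is_punct(c) {
            continue;
        }
        // Joint only if the very next character is also punctuation.
        let joint = chars.get(i + 1).map_or(false, |&next| is_punct(next));
        out.push((c, if joint { Spacing::Joint } else { Spacing::Alone }));
    }
    out
}
```

Under this model, `punct_spacing("a == b")` reports the first `=` as `Joint` and the second as `Alone`, while `punct_spacing("doc = \"X2\"")` reports a single `Alone`. That is why a joint marker on the `=` of a doc attribute looks suspicious.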

Repro script

#!/bin/sh

cargo new --lib repro_derive
cargo new --lib repro

echo >repro_derive/src/lib.rs '
#![feature(proc_macro)]

extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro_derive(Repro, attributes(repro))]
pub fn derive_repro(input: TokenStream) -> TokenStream {
    for tt in input.into_iter().take(4) {
        println!("{:#?}", tt);
    }
    TokenStream::empty()
}
'

echo >>repro_derive/Cargo.toml '
[lib]
proc-macro = true
'

echo >repro/src/lib.rs '
#![allow(dead_code)]

#[macro_use]
extern crate repro_derive;

/// X1
/// Y1
#[repro]
#[derive(Repro)]
struct S1;

fn f() {
    /// X2
    /// Y2
    #[repro]
    #[derive(Repro)]
    struct S2;
}
'

echo >>repro/Cargo.toml '
repro_derive = { path = "../repro_derive" }
'

cargo build --manifest-path repro/Cargo.toml

@alexcrichton

@dtolnay
Member Author

dtolnay commented Apr 2, 2018

Possibly related to #47941 -- the tokens of /// X1, /// Y1, and /// X2 all have correct-looking lo and hi byte positions. The tokens of /// Y2 have lo and hi equal to 0.

@alexcrichton
Member

I've confirmed that all of the Joint tokens are gone with #49597; the reason Joint showed up at all is a bug in ec1a8f0.

As to why there are different spans here, it looks like this is due to #43081. For recursive items we don't save off a token stream to return later, so the spans will all be "wrong".

For example if you change the procedural macro to look like:

#![feature(proc_macro)]

extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro_derive(Repro, attributes(repro))]
pub fn derive_repro(input: TokenStream) -> TokenStream {
    input.into_iter()
        .next()
        .unwrap()
        .span()
        .error("test")
        .emit();
    TokenStream::empty()
}

and then you also remove the doc comments, you'll get:

   Compiling repro v0.1.0 (file:///home/alex/code/wut/repro)
error: test
  --> src/lib.rs:10:1
   |
10 | struct S1;
   | ^^^^^^

error: test
 --> <macro expansion>:1:1
  |
1 | struct S2;
  | ^^^^^^

error: aborting due to 2 previous errors

error: Could not compile `repro`.

To learn more, run the command again with --verbose.

Note how the first error has a filename/line number, while the second has only a bogus one.

Should this be closed in favor of #43081?

@dtolnay
Member Author

dtolnay commented Apr 3, 2018

If I understand correctly the explanation in #49596 (comment), with #49597 the proc_macro::Spacing::Joint is correctly gone but there is still an incorrect syntax::tokenstream::TokenStreamKind::JointTree in the underlying token stream -- which is a separate bug that should also be fixed. The links in your explanation are all in the code of proc_macro::TokenStream::from_internal(syntax::tokenstream::TokenStream, ...) so I believe only the proc_macro::TokenStream has been fixed while the syntax::tokenstream::TokenStream is still wrong.

@alexcrichton
Member

@dtolnay hm I think we may need to work on a different reproduction though? In testing with #49597 I noticed that all the JointTree instances in the test in this issue were gone and couldn't get them to reappear...

@dtolnay dtolnay self-assigned this Apr 3, 2018
@dtolnay
Member Author

dtolnay commented Apr 3, 2018

I assigned myself and will try to reproduce after your changes land, and close if I cannot.

@dtolnay dtolnay closed this as completed Apr 29, 2018