File gets parsed correctly, but the offset information indicating symbol location seems wrong #238
This is just an error with printing, basically:
- Applied the suggestion in julia-vscode/CSTParser.jl#238
Thanks! This seems to work well. However, I noticed that it still breaks on Unicode characters. Basically, for some reason their contribution to "span" and "fullspan" is
^^ all size measurements (span/fullspan) are in bytes rather than characters
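To make the byte/character distinction concrete, here is a small sketch using only Base functions. The string is an arbitrary example, not taken from the file discussed in this issue.

```julia
# 'α' and 'β' each occupy 2 bytes in UTF-8; the ASCII characters occupy 1 byte each.
s = "α + β"

n_chars = length(s)   # number of characters
n_bytes = sizeof(s)   # number of bytes (code units for a UTF-8 String)

println("characters: ", n_chars, ", bytes: ", n_bytes)
```

Any size reported in bytes will therefore run ahead of the character count as soon as a non-ASCII character appears earlier in the string.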
I see. That explains the values, but still it seems like the wrong measurement to use if you want to annotate where the error appears in a string? Unless there is a function that can convert bytes to character position (I'm pretty new to Julia, so I don't know of one). You could modify the previous code to use:
To get the character index for a given byte index for a string, do this:
Julia string handling, especially when it comes to Unicode, is good, but probably requires you to read the manual section :) It is here and quite helpful.
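A minimal sketch of one way to do this conversion (not necessarily the snippet that was originally posted) relies on Base's three-argument `length(s, i, j)`, which counts the characters between two byte indices. The helper name `byte_to_char_index` is made up for illustration, and it assumes a 1-based byte index; a 0-based byte offset would need `+ 1` first.

```julia
# Hypothetical helper (the name is illustrative, not part of any package):
# map a 1-based byte index into `s` to the 1-based index of the character
# that contains that byte.
byte_to_char_index(s::AbstractString, byte_index::Integer) = length(s, 1, byte_index)

s = "α = 1"   # 'α' takes 2 bytes, so byte indices and character indices diverge

first_char = byte_to_char_index(s, 1)          # byte 1 starts 'α'
space_char = byte_to_char_index(s, 3)          # byte 3 is the space after 'α'
last_char  = byte_to_char_index(s, sizeof(s))  # last byte is the '1'

println((first_char, space_char, last_char))
```

Because `length(s, 1, j)` only counts valid character start indices, a byte index that falls in the middle of a multi-byte character still maps to that character.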
- Use byte to character count conversion as suggested by julia-vscode/CSTParser.jl#238 (comment) - TODO can improve not reading the file string each time
Fantastic! That suggestion works, thanks a lot. Just for your information, I am building a Flycheck parser for Julia in Emacs using StaticLint.jl in this repo (where I incorporated your advice): https://github.com/dmalyuta/julia-staticlint
Very nice!
I've been exploring using StaticLint.jl to do static error analysis on my code, but an issue that I'm running into (and I think it has to do with CSTParser) is that the file offsets that CSTParser outputs for where commands are located in the file are not exactly the beginning and end of the commands. Sometimes it seems quite arbitrary. Consider the following super simple file:
I parse the file using the following code:
And this is what it outputs:
My understanding is that on the left we have information of the form `<starting offset>:<ending offset>`, where `offset` is basically the character number, starting from the beginning of the file, that points to the beginning and ending of the corresponding operation/string/variable/etc. So take for example the above output:
- `LinearAlgebra` is said to reside on offsets `1:17`. In fact, it is on offsets `7:19`.
- `include` command is said to reside on offsets `24:30`. This is correct!
- `foo.jl` is said to reside on offsets `31:38`. In fact, it is on offsets `32:39`.
So you see, already in this simple file there seem to be three cases, each with different behaviour: a "really bad" error (`LinearAlgebra`), an error that is an offset of one character (`foo.jl`), and a correct answer (`include`).
Perhaps I do not fully understand how CSTParser works, in which case I'd really appreciate an explanation. At the end of the day, I would like to get to a state where StaticLint.jl returns the correct file offset pointing to the error in the file. I believe that it is currently not doing so because CSTParser returns this weird-looking information.
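One way to see where a range like `7:19` comes from is to accumulate byte offsets while walking a syntax tree. The sketch below uses a hypothetical `Node` type, not CSTParser's own `EXPR`, and assumes that `span` counts the bytes of a token itself while `fullspan` also counts its trailing whitespace. The example tree assumes the file starts with the line `using LinearAlgebra`, which is consistent with the `7:19` range mentioned above.

```julia
# Hypothetical stand-in for a syntax-tree node; NOT CSTParser's EXPR type.
struct Node
    name::String
    span::Int               # bytes covered by the token itself (assumption)
    fullspan::Int           # bytes including trailing whitespace (assumption)
    children::Vector{Node}
end
Node(name, span, fullspan) = Node(name, span, fullspan, Node[])

# Collect (name, first_byte:last_byte) for every leaf, carrying a running
# 0-based byte offset that advances by each child's `fullspan`.
function byte_ranges!(out, node::Node, offset::Int=0)
    if isempty(node.children)
        push!(out, (node.name, (offset + 1):(offset + node.span)))
    else
        child_offset = offset
        for child in node.children
            byte_ranges!(out, child, child_offset)
            child_offset += child.fullspan
        end
    end
    return out
end

# "using LinearAlgebra\n": `using` is 5 bytes plus 1 space,
# `LinearAlgebra` is 13 bytes plus 1 newline.
tree = Node("toplevel", 19, 20,
            [Node("using", 5, 6), Node("LinearAlgebra", 13, 14)])

ranges = byte_ranges!(Tuple{String,UnitRange{Int}}[], tree)
foreach(println, ranges)
```

The resulting ranges are byte ranges, so they match character positions only for ASCII-only input.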
My system info:
https://github.com/julia-vscode/CSTParser.jl#master