Generic pattern skips non-letter characters at the end #3014

Closed
TIMONz1535 opened this issue Dec 28, 2024 · 2 comments

TIMONz1535 commented Dec 28, 2024

How are you using the lua-language-server?

Visual Studio Code Extension (sumneko.lua)

Which OS are you using?

Windows

What is the issue affecting?

Annotations

Expected Behaviour

feature #2484

A generic pattern does not recognize non-letter characters after `T`, for example `T`.Base or `T`*

Here is my example:

---@generic T
---@param t MainContainer.`T`
---@return T
local function SubContainerClass(t) end

local mainContainerBase = SubContainerClass('Base')           -- MainContainer.Base
local mainContainerSubExtra = SubContainerClass('SubExtra')   -- MainContainer.SubExtra

---@generic T
---@param t `T`.Base
---@return T
local function ContainerClassBase(t) end

local mainContainerBase = ContainerClassBase('MainContainer') -- BUG: MainContainer instead of MainContainer.Base
local extraContainerBase = ContainerClassBase('ExtraContainer') -- BUG: ExtraContainer instead of ExtraContainer.Base

---@generic T
---@param t `T`*
---@return T
local function GetObjectPointer(t) end

---@generic T
---@param t `T`2
---@return T
local function GetObject2(t) end

local entityPointer = GetObjectPointer('Entity') -- BUG: Entity instead of Entity*
local entity2 = GetObject2('Entity') -- BUG: Entity instead of Entity2

Actual Behaviour

---@generic T
---@param t1 A.`T`Base -- works
---@param t2 A`T`Base -- works
---@param t3 A*`T` -- works
---@param t4 A`T`.Base -- nothing after `T`
---@param t5 `T`.Base -- nothing after `T`
---@param t6 `T`Base -- nothing after `T`
---@param t7 A*`T`* -- nothing after `T`
---@param t8 `T`* -- nothing after `T`
---@param t9 A`T`2 -- seems to be completely broken
---@return T
local function Test(t1,t2,t3,t4,t5,t6,t7,t8,t9) end

Reproduction steps

  1. Create a generic pattern with *, . or a number after `T`
  2. Nothing after `T` is included in the resolved type

Additional Notes

A class name can look very weird and still work.

I just want to use pointer Class*

Log File

No response

@tomlau10

This seems to be due to how luadoc is parsed in script/parser/luadoc.lua 🤔

The token types

`T` is a code token, while MainContainer is a name token

  • from the lpeg rules here:
    name = (m.R('az', 'AZ', '09', '\x80\xff') + m.S('_')) * (m.R('az', 'AZ', '__', '09', '\x80\xff') + m.S('_.*-'))^0,
  • this roughly translates to the regex \w[\w.*-]*
    which means a name can only start with a \w character; it CANNOT start with . / * / -
    The . / * / - characters may follow and repeat freely, so you can have some weird class names
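To make the rule above concrete, here is a minimal sketch that approximates the `name` rule with a plain Lua pattern instead of LPeg. The helper `isName` is hypothetical and only checks whether a whole string would fit the rule; it does not model the rest of the tokenizer, where other rules may take precedence.

```lua
-- Approximation of the `name` rule using a Lua pattern (not LPeg).
-- First character: letter, digit, underscore, or a byte in \x80-\xff.
-- Following characters: the same set plus '.', '*' and '-'.
local NAME = '^[a-zA-Z0-9_\128-\255][a-zA-Z0-9_%.%*%-\128-\255]*$'

-- Hypothetical helper: true if the whole string fits the rule.
local function isName(s)
    return s:match(NAME) ~= nil
end

print(isName('MainContainer.Base')) -- true: '.' is allowed after the first character
print(isName('Entity*'))            -- true: '*' is allowed after the first character
print(isName('.Base'))              -- false: a name cannot start with '.'
print(isName('*'))                  -- false: a name cannot start with '*'
```

This is why `MainContainer.` in front of `T` is absorbed into one name token, while a `.` or `*` that directly follows the code token cannot begin a new name.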

The parse code logic

When it parses a generic pattern type, there are 2 code paths:

function parseTypeUnit(parent)
    local result = parseFunction(parent)
                or parseTable(parent)
                or parseTuple(parent)
                or parseString(parent)
                or parseCode(parent)
                or parseInteger(parent)
                or parseBoolean(parent)
                or parseParen(parent)
                or parseCodePattern(parent)

  • parseCode: if the type starts with a code token, only that single code token is consumed; nothing after it is parsed as part of the type
    => this explains why `T`.Base doesn't work

    local function parseCode(parent)
        local tp, content = peekToken()
        if not tp or tp ~= 'code' then
            return nil
        end
        nextToken()
        local code = {
            type = 'doc.type.code',
            start = getStart(),
            finish = getFinish(),
            parent = parent,
            [1] = content,
        }
        return code
    end

  • parseCodePattern: the type must start with a name token, followed by a code token in the middle, which may in turn be followed by more name tokens

    local function parseCodePattern(parent)
        local tp, pattern = peekToken()
        if not tp or tp ~= 'name' then
            return nil
        end
        local codeOffset
        local finishOffset
        local content
        for i = 2, 8 do
            local next, nextContent = peekToken(i)
            if not next or TokenFinishs[Ci+i-1] + 1 ~= TokenStarts[Ci+i] then
                if codeOffset then
                    finishOffset = i
                    break
                end
                --- non-contiguous name, invalid
                return nil
            end
            if next == 'code' then
                if codeOffset and content ~= nextContent then
                    -- multiple generics are not supported for now
                    return nil
                end
                codeOffset = i
                pattern = pattern .. "%s"
                content = nextContent
            elseif next ~= 'name' then
                return nil
            else
                pattern = pattern .. nextContent
            end
        end
        local start = getStart()
        for _ = 2, finishOffset do
            nextToken()
        end
        local code = {
            type = 'doc.type.code',
            start = start,
            finish = getFinish(),
            parent = parent,
            pattern = pattern,
            [1] = content,
        }
        return code
    end

    • BUT‼️ because a name token can only start with \w, you currently cannot have Base.`T`.Base, yet Base.`T`Base will work
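The token walk described above can be sketched in isolation. This is a simplified, hypothetical model, not the server's code: tokens are plain tables, `buildPattern` is an invented name, and the contiguity check and the 8-token lookahead limit of the real function are left out.

```lua
-- Simplified, hypothetical model of the parseCodePattern token walk.
-- Each token is a table { type = 'name' | 'code' | 'symbol', text = string }.
local function buildPattern(tokens)
    local first = tokens[1]
    if not first or first.type ~= 'name' then
        return nil -- a leading code token such as `T` is rejected here
    end
    local pattern = first.text
    local content
    for i = 2, #tokens do
        local tok = tokens[i]
        if tok.type == 'code' then
            if content and content ~= tok.text then
                return nil -- more than one distinct generic is not supported
            end
            content = tok.text
            pattern = pattern .. '%s'
        elseif tok.type == 'name' then
            pattern = pattern .. tok.text
        else
            return nil -- a symbol or any other token type ends the attempt
        end
    end
    if not content then
        return nil -- no code token was seen, so this is not a generic pattern
    end
    return pattern, content
end

-- Base.`T`Base: name "Base.", code "T", name "Base"
local ok = buildPattern({
    { type = 'name', text = 'Base.' },
    { type = 'code', text = 'T' },
    { type = 'name', text = 'Base' },
})
print(ok) -- Base.%sBase

-- Base.`T`.Base: the '.' after the code token is a symbol, so the walk fails
local failed = buildPattern({
    { type = 'name', text = 'Base.' },
    { type = 'code', text = 'T' },
    { type = 'symbol', text = '.' },
    { type = 'name', text = 'Base' },
})
print(failed) -- nil
```

In this model, the `.` after `T` is the only difference between the accepted and the rejected input, which matches the behaviour reported in the issue.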

An attempt to fix

I just want to use pointer Class*

I tried to make this work by changing 2 places:

  1. Change the name rule to allow starting with a *
  2. Merge parseCode and parseCodePattern such that name token after a code can be parsed
diff --git forkSrcPrefix/script/parser/luadoc.lua forkDstPrefix/script/parser/luadoc.lua
index d108cebc26c64fb8506d525cce8ffcca3e085e1e..f12e26b4bdb485568c7265553e024dc313917029 100644
--- forkSrcPrefix/script/parser/luadoc.lua
+++ forkDstPrefix/script/parser/luadoc.lua
@@ -71,7 +71,7 @@ Symbol              <-  ({} {
     er = '\r',
     et = '\t',
     ev = '\v',
-    name = (m.R('az', 'AZ', '09', '\x80\xff') + m.S('_')) * (m.R('az', 'AZ', '__', '09', '\x80\xff') + m.S('_.*-'))^0,
+    name = (m.R('az', 'AZ', '09', '\x80\xff') + m.S('_*')) * (m.R('az', 'AZ', '__', '09', '\x80\xff') + m.S('_.*-'))^0,
     Char10 = function (char)
         ---@type integer?
         char = tonumber(char)
@@ -738,12 +738,17 @@ end
 
 local function parseCodePattern(parent)
     local tp, pattern = peekToken()
-    if not tp or tp ~= 'name' then
+    if not tp or (tp ~= 'name' and tp ~= 'code') then
         return nil
     end
     local codeOffset
     local finishOffset
     local content
+    if tp == 'code' then
+        codeOffset = 1
+        content = pattern
+        pattern = "%s"
+    end
     for i = 2, 8 do
         local next, nextContent = peekToken(i)
         if not next or TokenFinishs[Ci+i-1] + 1 ~= TokenStarts[Ci+i] then
@@ -834,7 +839,7 @@ function parseTypeUnit(parent)
                 or parseTable(parent)
                 or parseTuple(parent)
                 or parseString(parent)
-                or parseCode(parent)
                 or parseInteger(parent)
                 or parseBoolean(parent)
                 or parseParen(parent)
  • So far this allows the following use case:
---@generic T
---@param a `T`*
---@return T
function ToPtrClass(a) end

local a = ToPtrClass('A') --> a: A*

However, it still doesn't cover all the cases that you reported.
I also don't know whether this would cause other side effects, since a name (identifier) generally should not be allowed to start with a *.
Still, you or others might like to pick it up from here 😄, or even better, ask the maintainers for their opinion first.
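For readers following the patch, the role of the `%s` placeholder stored in `pattern` can be illustrated with a short sketch. It assumes the placeholder is later filled in with the string literal passed at the call site, in the manner of `string.format`; that is an assumption about how the pattern is consumed, and `resolve` is a hypothetical helper, not a function from the server.

```lua
-- Assumption: a stored pattern is expanded like a format string, with the
-- string literal from the call site substituted for '%s'.
-- `resolve` is a hypothetical helper used only for illustration.
local function resolve(pattern, literal)
    return string.format(pattern, literal)
end

-- Pattern that the patched parser would build for `T`*
print(resolve('%s*', 'A'))                 -- A*

-- Pattern that the existing parser already builds for MainContainer.`T`
print(resolve('MainContainer.%s', 'Base')) -- MainContainer.Base
```

Under this assumption, a pattern that starts with `%s` behaves like any other pattern once the parser accepts a leading code token.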

TIMONz1535 added a commit to TIMONz1535/lua-language-server that referenced this issue Dec 29, 2024
…Pattern to support "`T`.*-", prevent crash with "`T``T`" and when tokens >= 8, fix wrong getStart of result.
TIMONz1535 added a commit to TIMONz1535/lua-language-server that referenced this issue Dec 29, 2024
…-" without name token before code. Added tests.
TIMONz1535 added a commit to TIMONz1535/lua-language-server that referenced this issue Dec 29, 2024
@TIMONz1535

TIMONz1535 commented Dec 29, 2024

I fixed it without touching the name parsing, but I found a few edge cases:

---@class MehClass-Sub
---@class MehClass..Sub
---@class MehClass...Sub
---@class MehClass--Sub

---@param t `T`-Sub -- ok
---@param t `T`..Sub -- ok
---@param t `T`...Sub -- doesn't work because `...` is an individual symbol and I don't want to support it.
---@param t `T`--Sub -- doesn't work because `--` becomes a comment!

I also found an error in the luadoc parser: the token 2.0 is treated as a name token, although a name should only start with a letter or an underscore.

a.-1 (name)
_0w* (name)
0 (integer)
2.0 (should be `integer symbol integer` but it's `name`)
2_0 (integer name)
.*-2.0 (should be `symbol symbol integer symbol integer` but it's `symbol symbol symbol name`)
.*-2_0 (symbol symbol integer name)

Update: I see that the @version directive relies on this behavior. It expects a single name token. The version could be carried by some other single token (a new number token type, for example), but not by the sequence integer symbol integer.

            if tp ~= 'name' then
                pushWarning {
                    type  = 'LUADOC_MISS_VERSION',
                    start  = getStart(),
                    finish = getFinish(),
                }
                break
            end
            version.version = tonumber(text) or text
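The last line of that excerpt can be tried on its own. A minimal sketch, with `toVersion` as a hypothetical wrapper around the same expression:

```lua
-- Same expression as in the excerpt: numeric text becomes a number,
-- anything else is kept as a string. `toVersion` is a hypothetical wrapper.
local function toVersion(text)
    return tonumber(text) or text
end

print(type(toVersion('5.4'))) -- number
print(type(toVersion('JIT'))) -- string
```

This only works if the whole version, such as 5.4, arrives as the text of one token, which is why splitting it into `integer symbol integer` would break the directive.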

TIMONz1535 added a commit to TIMONz1535/lua-language-server that referenced this issue Jan 6, 2025
…y and comment without a space. Fixed regression.
TIMONz1535 added a commit to TIMONz1535/lua-language-server that referenced this issue Jan 6, 2025
…y and comment without a space. Fixed regression.