[feature-request] Collection words from current file please #22

skywind3000 · 2020-03-03T18:09:10Z

This plugin runs more smoothly than the All AutoComplete plugin, but one thing that really holds me back is that All AutoComplete can suggest words from the current document.

That would be very helpful when I am typing terms that are not in the word list, for example my name: if I have entered my name before, All AutoComplete can suggest it when I start typing it again.

But All AutoComplete freezes for seconds if the word list is too long (e.g. 40k words), so I really hope this plugin can collect words from the current document.

yzhang-gh · 2020-03-04T01:29:31Z

Thank you for the feedback.

I know VSCode has word-based suggestions, but only when there is no other completion provider (ref). And there is a good reason: "We do this to prevent mixing good suggestions with word-based suggestions".

I guess the current dictionary already covers most of the words we use. You can add more words in your user settings (or even create a PR).

yzhang-gh · 2020-06-30T01:08:27Z

I have two concerns about this feature:

The quality of suggestions. This extension uses a relatively small built-in dictionary rather than the ones used by some spell checkers (because those contain, for example, many uncommon abbreviations).

Proposal: collect words (space-split) from the current file, and use a filter to remove unwanted words (e.g. Chinese words). (Probably we should only enable this feature in Markdown files. There may be problems collecting words from HTML/LaTeX/Python/… files.)

The performance.

Proposal: 1) incrementally rebuild the dictionary, and 2) disable this feature when editing large files. (We already have such a feature (performance sense) in Markdown All in One.)
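The two proposals above could look roughly like the following. This is only a minimal sketch, not the extension's actual implementation: the names collectWords, MAX_CHARS, WORD_PATTERN and minLength are hypothetical, and the size threshold is an arbitrary illustrative value.

```javascript
// Hypothetical size limit; not a value taken from the extension.
const MAX_CHARS = 100000;

// Keep tokens that start with an ASCII letter and contain only ASCII
// letters/digits, optionally joined by apostrophes or hyphens.
// Tokens in other scripts (e.g. Chinese words) do not match.
const WORD_PATTERN = /^[A-Za-z][A-Za-z0-9]*(?:['-][A-Za-z0-9]+)*$/;

function collectWords(text, minLength = 4) {
    if (text.length > MAX_CHARS) {
        // Large file: skip collection and rely on the built-in dictionary only.
        return [];
    }
    const words = new Set();
    for (const token of text.split(/\s+/)) {
        // Strip surrounding punctuation such as commas, periods and brackets.
        const word = token.replace(/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu, '');
        if (word.length >= minLength && WORD_PATTERN.test(word)) {
            words.add(word);
        }
    }
    return [...words];
}
```

For instance, collectWords("My name is skywind3000, hello 你好世界 world.") keeps name, skywind3000, hello and world, and drops the short tokens and the Chinese token.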

yzhang-gh · 2020-06-30T01:13:32Z

BTW, in case you don't know, this extension now offers suggestions if you are editing string/comment parts in Python/JS/TS files (need to enable the programmingLanguage option). I love it and am going to enable it by default starting from the next release. Feedback is welcome 😉.

starship863 · 2021-01-14T08:44:27Z

Thank you so much for this very cool extension!

I think @skywind3000 made a rather good suggestion which could become a very useful feature. It should be optional so that the default behaviour does not disturb users with low-quality suggestions. Users who do a lot of typing, such as paper writing, might benefit a lot from turning this feature on.

I can't tell you how much I have benefited from this extension. Thank you very much again.

BTW: @skywind3000 Thank you very much for your cool work (you know what I mean :-)

yzhang-gh · 2021-01-14T15:20:27Z

@starship863 Thanks for the feedback.

As you mentioned paper writing, do you expect this for LaTeX documents?

starship863 · 2021-01-15T01:40:48Z

The present version works well for LaTeX documents. I have been using this extension in LaTeX documents for a long time.

For the new feature, I hope it can ignore the LaTeX macros starting with \.
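One naive way to ignore macros starting with \ is to strip control words before collecting. The sketch below is an illustration only: stripLatexMacros and collectLatexWords are hypothetical names, and the regular expressions are a rough approximation, not a LaTeX parser.

```javascript
// Remove LaTeX control words: a backslash followed by letters (or @),
// with an optional trailing star, e.g. \section* or \emph.
function stripLatexMacros(text) {
    return text.replace(/\\[A-Za-z@]+\*?/g, ' ');
}

// Collect the remaining runs of two or more ASCII letters, without duplicates.
function collectLatexWords(text) {
    const matches = stripLatexMacros(text).match(/[A-Za-z]{2,}/g) || [];
    return [...new Set(matches)];
}
```

Note that this only drops the macro names. Macro arguments such as citation keys or file names are still collected as words, so a real implementation would likely need more context than a regular expression provides.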

yzhang-gh · 2021-01-15T06:14:14Z

Just tried and realized we have to make use of some Markdown/LaTeX parsers, otherwise there will be tons of low-quality suggestions.

ArithmeticError · 2024-01-09T12:48:54Z

I also ran into this problem. I solved it by modifying the implementation of the DictionaryCompletionItemProvider class in the completion.js file under ~\.vscode\extensions\yzhang.dictionary-completion-1.2.2\out\src, where ~ is the Windows user directory. The specific modifications are as follows:

class DictionaryCompletionItemProvider {
    constructor(fileType) {
        this.fileType = fileType;
    }
    provideCompletionItems(document, position, _token) {
        // add the words of the current document to the built-in word list
        const seenWords = new Set(allWords);
        const allWords_for_completion = [...allWords];
        for (let i = 0; i < document.lineCount; ++i) {
            // match() returns only the word tokens; split() with a capture
            // group would also return the separators and empty strings
            const words = document.lineAt(i).text.match(/\w+/g) || [];
            for (const word of words) {
                if (!seenWords.has(word)) {
                    seenWords.add(word);
                    allWords_for_completion.push(word);
                }
            }
        }

        const lineText = document.lineAt(position.line).text;
        const textBefore = lineText.substring(0, position.character);
        const docTextBefore = document.getText(new vscode.Range(new vscode.Position(0, 0), position));
        const wordBefore = textBefore.replace(/\W/g, ' ').split(/[\s]+/).pop();
        const firstLetter = wordBefore.charAt(0);
        const followingChar = lineText.charAt(position.character);
        const addSpace = vscode.workspace.getConfiguration('dictCompletion').get('addSpaceAfterCompletion') && !followingChar.match(/[ ,.:;?!\-]/);
        if (wordBefore.length < vscode.workspace.getConfiguration('dictCompletion').get('leastNumOfChars')) {
            return [];
        }
        switch (this.fileType) {
            case "markdown":
                // [caption](don't complete here)
                if (/\[[^\]]*\]\([^\)]*$/.test(textBefore)) {
                    return [];
                }
                return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
            case "latex":
                // `|` means cursor
                // \command|
                if (/\\[^ {\[]*$/.test(textBefore)) {
                    return [];
                }
                // \begin[...|] or \begin{...}[...|]
                if (/\\(documentclass|usepackage|begin|end|cite|ref|includegraphics)({[^}]*}|)?\[[^\]]*$/.test(textBefore)) {
                    return [];
                }
                // \begin{...|} or \begin[...]{...|}
                if (/\\(documentclass|usepackage|begin|end|cite|ref|includegraphics|input|include)(\[[^\]]*\]|)?{[^}]*$/.test(textBefore)) {
                    return [];
                }
                return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
            case "html":
                // <don't complete here>
                if (/<[^>]*$/.test(textBefore)) {
                    return [];
                }
                //// Inside <style> or <script>
                let docBefore = document.getText(new vscode.Range(new vscode.Position(0, 0), position));
                if (docBefore.includes('<style>')
                    && (!docBefore.includes('</style>')
                        || docBefore.match(/<style>/g).length > docBefore.match(/<\/style>/g).length)) {
                    return [];
                }
                if (docBefore.includes('<script>')
                    && (!docBefore.includes('</script>')
                        || docBefore.match(/<script>/g).length > docBefore.match(/<\/script>/g).length)) {
                    return [];
                }
                return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
            case "javascript":
            case "typescript":
                //// Multiline comment
                if (/\/\*((?!\*\/)[\W\w])*$/.test(docTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const tmpTextBeforeJs = textBefore.replace(/(?<!\\)('|").*?(?<!\\)\1/g, '');
                if (/\/{2,}/.test(tmpTextBeforeJs) //// inline comment
                    || (/(?<!\\)['"]/.test(tmpTextBeforeJs) //// inline string
                        && !/(import|require)/.test(tmpTextBeforeJs.split(/['"]/)[0]) //// reject if in import/require clauses
                    )) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
            case "python":
                //// Multiline comment (This check should go before inline comment/string check)
                const tmpDocTextBefore = docTextBefore.replace(/('''|""")[\W\w]*?\1/g, '');
                if (/('''|""")((?!\1)[\W\w])*$/.test(tmpDocTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const inlineCheckStr1 = textBefore.replace(/('''|""")/g, '').replace(/f?(?<!\\)('|").*?(?<!\\)\1/g, '');
                const inlineCheckStr2 = inlineCheckStr1.replace(/((?<!\\){).*?((?<!\\)})/g, '');
                if (/#+/.test(inlineCheckStr1)
                    || /(?<!\\|f)['"]/.test(inlineCheckStr1)
                    || /f(?<!\\)['"][^{]*$/.test(inlineCheckStr2)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
            // TMP adapted from JS/TS
            case "c":
                //// Multiline comment
                if (/\/\*((?!\*\/)[\W\w])*$/.test(docTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const tmpTextBeforeC = textBefore.replace(/(?<!\\)(").*?(?<!\\)\1/g, '');
                if (/\/{2,}/.test(tmpTextBeforeC) //// inline comment
                    || (/(?<!\\)"/.test(tmpTextBeforeC) //// inline string
                        && !/#include/.test(tmpTextBeforeC.split(/"/)[0]) //// reject if in include clauses
                    )) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
            // TMP not tested
            case "vue":
                // <don't complete here>
                if (/<[^>]*$/.test(textBefore)) {
                    return [];
                }
                //// Multiline comment
                if (/\/\*((?!\*\/)[\W\w])*$/.test(docTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const tmpTextBeforeVue = textBefore.replace(/(?<!\\)('|").*?(?<!\\)\1/g, '');
                if (/\/{2,}/.test(tmpTextBeforeVue) //// inline comment
                    || (/(?<!\\)['"]/.test(tmpTextBeforeVue) //// inline string
                        && !/(import|require)/.test(tmpTextBeforeVue.split(/['"]/)[0]) //// reject if in import/require clauses
                    )) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
        }
    }
    completeByFirstLetter(firstLetter, wordlist, addSpace = false) {
        if (firstLetter.toLowerCase() == firstLetter) {
            // Lowercase letter
            let completions = wordlistToComplItems(wordlist.filter(w => w.toLowerCase().startsWith(firstLetter)));
            if (addSpace) {
                completions.forEach(item => item.insertText = item.label + ' ');
            }
            return new Promise((resolve, reject) => resolve(completions));
        }
        else {
            // Uppercase letter
            let completions = wordlist.filter(w => w.toLowerCase().startsWith(firstLetter.toLowerCase()))
                .map(w => {
                let newLabel = w.charAt(0).toUpperCase() + w.slice(1);
                let newItem = new vscode.CompletionItem(newLabel, vscode.CompletionItemKind.Text);
                if (addSpace) {
                    newItem.insertText = newLabel + ' ';
                }
                return newItem;
            });
            return new Promise((resolve, reject) => resolve(completions));
        }
    }
}

The main change is to add the words of the current file to the original word list. Since I am not very familiar with code optimization, this code may need further optimization. For example, every completion re-reads and re-tokenizes the entire file, which may slow down completion in long documents.
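One possible way to avoid rescanning on every completion is to cache the collected words per document and rebuild them only when the document changes. The sketch below assumes a document object with uri, version and getText(), which VS Code's TextDocument provides. The names wordCache and wordsForDocument are hypothetical. It is a version-based cache, not a truly incremental rebuild.

```javascript
// Cache of collected words: uri string -> { version, words }.
const wordCache = new Map();

function wordsForDocument(document) {
    const key = document.uri.toString();
    const cached = wordCache.get(key);
    // The version number increases on every change to the document,
    // so an unchanged version means the cached words are still valid.
    if (cached && cached.version === document.version) {
        return cached.words;
    }
    const words = new Set(document.getText().match(/\w+/g) || []);
    wordCache.set(key, { version: document.version, words });
    return words;
}
```

With this, repeated completions between edits reuse the same Set, and the full scan happens once per edit instead of once per completion request.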

yzhang-gh · 2024-01-09T13:30:25Z

Thanks for sharing. Would you mind opening a PR?

ArithmeticError · 2024-01-09T15:17:33Z

OK, I have opened a PR. But I think my approach may not be very efficient, so it would be better to add a toggle between this and the original behavior. Thank you for creating this plugin!

yzhang-gh · 2024-01-14T16:15:27Z

It is now supported in v1.3.0 (you need to set the option collectWordsFromCurrentFile to true).

yzhang-gh mentioned this issue Jun 30, 2020: Include default autocompletion results #26 (Closed)

yzhang-gh added the discussion label Jun 30, 2020

yzhang-gh added the feature and help wanted labels and removed the discussion label Jan 14, 2021

yzhang-gh linked a pull request Jan 14, 2024 that will close this issue: add the document text English wordlists #44 (Merged)

yzhang-gh added a commit that referenced this issue Jan 14, 2024: 🔖 v1.3.0 (437db8f)

yzhang-gh closed this as completed Jan 14, 2024

yzhang-gh removed the help wanted label Jan 14, 2024
