[feature-request] Collection words from current file please #22

Closed
skywind3000 opened this issue Mar 3, 2020 · 11 comments · Fixed by #44

@skywind3000

This plugin runs more smoothly than the All AutoComplete plugin, but one thing that really holds me back is that All AutoComplete can suggest words from the current document.

That is very helpful when I am typing terms that are not in the word list, such as my name: if I have entered my name before, All AutoComplete can suggest it when I start typing it again.

But All AutoComplete gets stuck for seconds if the word list is too long (e.g. 40k words), so I really hope this plugin can collect words from the current document.

@yzhang-gh
Owner

Thank you for the feedback.

I know VS Code has word-based suggestions, but only when there is no other completion provider (ref), and for a good reason: "We do this to prevent mixing good suggestions with word-based suggestions".

I guess the current dictionary already covers most of the words we use. You can add more words in your user settings (or even create a PR).

@yzhang-gh
Owner

yzhang-gh commented Jun 30, 2020

I have two concerns about this feature:

  • The quality of suggestions. This extension uses a relatively small built-in dictionary rather than those used by some spell checkers (because they contain, for example, many uncommon abbreviations).

    Proposal: collect words (space-split) from the current file, and use a filter to remove some unwanted words (e.g. Chinese words). (Probably we should only enable this feature in Markdown files. There may be problems collecting words from HTML/LaTeX/Python/… files.)

  • The performance.

    Proposal: 1) incrementally rebuild the dictionary, and 2) disable this feature when editing large files. (We already have such a feature (performance sense) in Markdown All in One.)
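
A minimal sketch of the first proposal together with the large-file guard, written as plain JavaScript outside the extension. The names `collectWords` and `MAX_CHARS`, the 50,000-character threshold, and the ASCII-letter filter are all assumptions made for illustration; none of them come from the extension's source.

```javascript
// Hypothetical helper: collect candidate words from a document's text.
// MAX_CHARS is an assumed threshold, not a value taken from the extension.
const MAX_CHARS = 50000;

function collectWords(text, maxChars = MAX_CHARS) {
    // Skip large files entirely so completion stays responsive.
    if (text.length > maxChars) {
        return [];
    }
    const words = new Set();
    for (const token of text.split(/\s+/)) {
        // Trim non-letters at both ends, e.g. "(word)," -> "word".
        const word = token.replace(/^[^A-Za-z]+|[^A-Za-z]+$/g, '');
        // Keep only ASCII-letter words of length >= 2 (inner hyphens and
        // apostrophes allowed); this drops Chinese words, numbers, and URLs.
        if (/^[A-Za-z][A-Za-z'-]*[A-Za-z]$/.test(word)) {
            words.add(word);
        }
    }
    return [...words];
}
```

The filter is deliberately strict: it would also reject accented Latin words, so a real implementation would probably need a wider character class.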

@yzhang-gh
Owner

BTW, in case you don't know, this extension now offers suggestions if you are editing string/comment parts in Python/JS/TS files (need to enable the programmingLanguage option). I love it and am going to enable it by default starting from the next release. Feedback is welcome 😉.

@starship863

Thank you so much for this very cool extension!

I think @skywind3000 made a rather good suggestion, which could become a very useful feature. It should be optional so that the default behaviour does not disturb users with low-quality suggestions. Users who do a lot of typing, such as when writing papers, might benefit a lot from turning this feature on.

I can't tell you how much I have benefited from this extension. Thank you very much again.

BTW: @skywind3000 Thank you very much for your cool work (you know what I mean :-)

@yzhang-gh
Owner

@starship863 Thanks for the feedback.

As you mentioned paper writing, do you expect this for LaTeX documents?

@starship863

starship863 commented Jan 15, 2021

The present version works well for LaTeX documents. I have been using this extension in LaTeX documents for a long time.

For the new feature, I hope it can ignore LaTeX macros, which start with \
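
One way this request could be approached is to strip control words before collecting words. The sketch below is plain JavaScript under stated assumptions: `collectLatexWords` is a hypothetical name, and the regular expressions are a rough approximation of LaTeX syntax rather than a parser.

```javascript
// Hypothetical helper: collect words from LaTeX source, ignoring macros.
function collectLatexWords(text) {
    // Replace control words such as \begin, \cite, or \section* with a space.
    const cleaned = text.replace(/\\[A-Za-z@]+\*?/g, ' ');
    // Treat the remaining runs of two or more ASCII letters as words.
    return [...new Set(cleaned.match(/[A-Za-z]{2,}/g) || [])];
}
```

Macro arguments still get through: for `\begin{equation}` the word `equation` is collected, and citation keys would be too. Filtering those out would need more knowledge of LaTeX structure than a regular expression has.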

@yzhang-gh
Owner

Just tried and realized we have to make use of some Markdown/LaTeX parsers, otherwise there will be tons of low-quality suggestions.

@ArithmeticError
Contributor

I ran into this too. I got the behaviour I wanted by modifying the implementation of the DictionaryCompletionItemProvider class in completion.js under ~\.vscode\extensions\yzhang.dictionary-completion-1.2.2\out\src, where ~ is the Windows user profile directory. The modifications are as follows:

class DictionaryCompletionItemProvider {
    constructor(fileType) {
        this.fileType = fileType;
    }
    provideCompletionItems(document, position, _token) {
        // Add words from the current document to the built-in word list.
        // A Set gives constant-time duplicate checks, and match() yields only
        // the word tokens (no separators or empty strings).
        const seenWords = new Set(allWords);
        const allWords_for_completion = [...allWords];
        for (let i = 0; i < document.lineCount; ++i) {
            const words = document.lineAt(i).text.match(/\w+/g) || [];
            for (const word of words) {
                if (!seenWords.has(word)) {
                    seenWords.add(word);
                    allWords_for_completion.push(word);
                }
            }
        }

        const lineText = document.lineAt(position.line).text;
        const textBefore = lineText.substring(0, position.character);
        const docTextBefore = document.getText(new vscode_1.Range(new vscode_1.Position(0, 0), position));
        const wordBefore = textBefore.replace(/\W/g, ' ').split(/[\s]+/).pop();
        const firstLetter = wordBefore.charAt(0);
        const followingChar = lineText.charAt(position.character);
        const addSpace = vscode.workspace.getConfiguration('dictCompletion').get('addSpaceAfterCompletion') && !followingChar.match(/[ ,.:;?!\-]/);
        if (wordBefore.length < vscode.workspace.getConfiguration('dictCompletion').get('leastNumOfChars')) {
            return [];
        }
        switch (this.fileType) {
            case "markdown":
                // [caption](don't complete here)
                if (/\[[^\]]*\]\([^\)]*$/.test(textBefore)) {
                    return [];
                }
                return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
            case "latex":
                // `|` means cursor
                // \command|
                if (/\\[^ {\[]*$/.test(textBefore)) {
                    return [];
                }
                // \begin[...|] or \begin{...}[...|]
                if (/\\(documentclass|usepackage|begin|end|cite|ref|includegraphics)({[^}]*}|)?\[[^\]]*$/.test(textBefore)) {
                    return [];
                }
                // \begin{...|} or \begin[...]{...|}
                if (/\\(documentclass|usepackage|begin|end|cite|ref|includegraphics|input|include)(\[[^\]]*\]|)?{[^}]*$/.test(textBefore)) {
                    return [];
                }
                return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
            case "html":
                // <don't complete here>
                if (/<[^>]*$/.test(textBefore)) {
                    return [];
                }
                //// Inside <style> or <script>
                let docBefore = document.getText(new vscode.Range(new vscode.Position(0, 0), position));
                if (docBefore.includes('<style>')
                    && (!docBefore.includes('</style>')
                        || docBefore.match(/<style>/g).length > docBefore.match(/<\/style>/g).length)) {
                    return [];
                }
                if (docBefore.includes('<script>')
                    && (!docBefore.includes('</script>')
                        || docBefore.match(/<script>/g).length > docBefore.match(/<\/script>/g).length)) {
                    return [];
                }
                return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
            case "javascript":
            case "typescript":
                //// Multiline comment
                if (/\/\*((?!\*\/)[\W\w])*$/.test(docTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const tmpTextBeforeJs = textBefore.replace(/(?<!\\)('|").*?(?<!\\)\1/g, '');
                if (/\/{2,}/.test(tmpTextBeforeJs) //// inline comment
                    || (/(?<!\\)['"]/.test(tmpTextBeforeJs) //// inline string
                        && !/(import|require)/.test(tmpTextBeforeJs.split(/['"]/)[0]) //// reject if in import/require clauses
                    )) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
            case "python":
                //// Multiline comment (This check should go before inline comment/string check)
                const tmpDocTextBefore = docTextBefore.replace(/('''|""")[\W\w]*?\1/g, '');
                if (/('''|""")((?!\1)[\W\w])*$/.test(tmpDocTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const inlineCheckStr1 = textBefore.replace(/('''|""")/g, '').replace(/f?(?<!\\)('|").*?(?<!\\)\1/g, '');
                const inlineCheckStr2 = inlineCheckStr1.replace(/((?<!\\){).*?((?<!\\)})/g, '');
                if (/#+/.test(inlineCheckStr1)
                    || /(?<!\\|f)['"]/.test(inlineCheckStr1)
                    || /f(?<!\\)['"][^{]*$/.test(inlineCheckStr2)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
            // TMP adapted from JS/TS
            case "c":
                //// Multiline comment
                if (/\/\*((?!\*\/)[\W\w])*$/.test(docTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const tmpTextBeforeC = textBefore.replace(/(?<!\\)(").*?(?<!\\)\1/g, '');
                if (/\/{2,}/.test(tmpTextBeforeC) //// inline comment
                    || (/(?<!\\)"/.test(tmpTextBeforeC) //// inline string
                        && !/#include/.test(tmpTextBeforeC.split(/"/)[0]) //// reject if in include clauses
                    )) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
            // TMP not tested
            case "vue":
                // <don't complete here>
                if (/<[^>]*$/.test(textBefore)) {
                    return [];
                }
                //// Multiline comment
                if (/\/\*((?!\*\/)[\W\w])*$/.test(docTextBefore)) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                //// Inline comment or string
                const tmpTextBeforeVue = textBefore.replace(/(?<!\\)('|").*?(?<!\\)\1/g, '');
                if (/\/{2,}/.test(tmpTextBeforeVue) //// inline comment
                    || (/(?<!\\)['"]/.test(tmpTextBeforeVue) //// inline string
                        && !/(import|require)/.test(tmpTextBeforeVue.split(/['"]/)[0]) //// reject if in import/require clauses
                    )) {
                    return this.completeByFirstLetter(firstLetter, allWords_for_completion, addSpace);
                }
                return [];
        }
    }
    completeByFirstLetter(firstLetter, wordlist, addSpace = false) {
        if (firstLetter.toLowerCase() == firstLetter) {
            // Lowercase letter
            let completions = wordlistToComplItems(wordlist.filter(w => w.toLowerCase().startsWith(firstLetter)));
            if (addSpace) {
                completions.forEach(item => item.insertText = item.label + ' ');
            }
            return new Promise((resolve, reject) => resolve(completions));
        }
        else {
            // Uppercase letter
            let completions = wordlist.filter(w => w.toLowerCase().startsWith(firstLetter.toLowerCase()))
                .map(w => {
                let newLabel = w.charAt(0).toUpperCase() + w.slice(1);
                let newItem = new vscode.CompletionItem(newLabel, vscode.CompletionItemKind.Text);
                if (addSpace) {
                    newItem.insertText = newLabel + ' ';
                }
                return newItem;
            });
            return new Promise((resolve, reject) => resolve(completions));
        }
    }
}

The main change is to add the words of the current file to the original matching string list. Since I am not very familiar with code optimization, there may be areas where this code needs further optimization. For example, each completion requires re-reading the characters of the entire file for segmentation, which may slow down the completion of long documents.
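
The rescanning cost mentioned above could be reduced by caching the collected words per document and rescanning only after an edit. The following is a sketch under assumptions, not the approach taken in the PR or in any release: `wordCache` and `getDocumentWords` are hypothetical names, while `uri`, `version`, and `getText()` are members of VS Code's `TextDocument`.

```javascript
// Hypothetical cache: one entry per document URI, reused while the
// document's version number is unchanged.
const wordCache = new Map();

function getDocumentWords(document) {
    const key = document.uri.toString();
    const cached = wordCache.get(key);
    if (cached && cached.version === document.version) {
        return cached.words; // no edits since the last scan
    }
    // Rescan the whole text only after an edit.
    const words = new Set(document.getText().match(/\w+/g) || []);
    wordCache.set(key, { version: document.version, words });
    return words;
}
```

This still rescans the full text after every edit, so it is not the incremental rebuild proposed earlier in the thread; it only avoids repeated scans between edits. Entries are never evicted here, so a real implementation would likely drop them when a document is closed.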

@yzhang-gh
Owner

Thanks for sharing. Would you mind opening a PR?

@ArithmeticError
Contributor

OK, I have opened a PR. But I think my approach may not be very efficient, so it would be better to add a command that toggles between this and the original behaviour. Thank you for creating this plugin!

@yzhang-gh yzhang-gh linked a pull request Jan 14, 2024 that will close this issue
yzhang-gh added a commit that referenced this issue Jan 14, 2024
@yzhang-gh
Owner

It is now supported in v1.3.0 (you need to set the option collectWordsFromCurrentFile to true).
