Feature Request: Include an option in textstat_summary to retrieve normalised counts for URLs etc #38

Open
dshgna opened this issue May 13, 2021 · 0 comments

I find textstat_summary() very useful to compare the textual features between two or more groups.

However, the counts of punctuation marks, URLs, numbers, symbols, tags, and emojis largely scale with the number of characters, tokens, or types: a longer text will tend to have more URLs, for example, simply because it is longer. Comparing texts of different lengths therefore requires normalizing these counts by length.

I usually end up writing a custom function for this, and I think it would be very useful to have it as a built-in option in textstat_summary().
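
To illustrate the kind of normalization meant here, below is a minimal sketch in plain Python rather than quanteda code. The input rows, the key names (`tokens`, `urls`, `emojis`), and the helper `normalise_counts` are all hypothetical; they are only assumed to resemble the per-document counts that textstat_summary() reports. Each count is rescaled to a rate per 100 tokens.

```python
# Hypothetical per-document counts. The keys are assumed to resemble
# the columns of a textstat_summary() result; they are not real output.
summaries = [
    {"document": "short", "tokens": 50, "urls": 2, "emojis": 1},
    {"document": "long", "tokens": 500, "urls": 5, "emojis": 0},
]


def normalise_counts(row, features, by="tokens", per=100):
    """Return a copy of `row` with each feature expressed per `per` units of `by`."""
    length = row[by]
    out = dict(row)
    for feature in features:
        # A rate is undefined for an empty document, so report None there.
        out[feature] = row[feature] * per / length if length else None
    return out


normalised = [normalise_counts(r, ["urls", "emojis"]) for r in summaries]
for row in normalised:
    print(row["document"], row["urls"], row["emojis"])
```

In this made-up data the longer document has more URLs in absolute terms (5 versus 2) but a lower rate per 100 tokens, which is the comparison the raw counts hide. Whether to divide by characters, tokens, or types would presumably be a user choice, represented here by the `by` argument.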
