Create plugin to determine file categories #1745

Open
johnmhoran opened this issue Oct 3, 2019 · 2 comments

Comments

@johnmhoran
Member

This plugin will apply a set of rules to certain fields/values collected during a scan (e.g., file_type, mime_type) and add a category (or similarly named) field and associated value (e.g., Java, JavaScript) to the JSON output file.
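For illustration only, here is a minimal sketch of the idea described above: look at a couple of scan fields and attach a category value to each file entry. The rule table, the `categorize` and `add_categories` helpers, the sample `file_type`/`mime_type` strings, and the assumption that the scan results are a dict with a `files` list are all hypothetical; this is not the plugin's actual code or ruleset.

```python
import json

# Hypothetical rules: the first rule whose substring appears in the named
# scan field decides the category. Order matters ("javascript" before "java").
RULES = [
    {"category": "JavaScript", "field": "mime_type", "contains": "javascript"},
    {"category": "Java", "field": "file_type", "contains": "java source"},
]


def categorize(resource, rules=RULES):
    """Return the category of the first matching rule, or None."""
    for rule in rules:
        value = (resource.get(rule["field"]) or "").lower()
        if rule["contains"] in value:
            return rule["category"]
    return None


def add_categories(scan):
    """Add a 'category' field to every entry in the scan's 'files' list."""
    for resource in scan.get("files", []):
        resource["category"] = categorize(resource)
    return scan


# Made-up sample data shaped like a scan result (an assumption).
scan = {
    "files": [
        {"path": "src/Main.java", "file_type": "Java source, ASCII text",
         "mime_type": "text/x-java"},
        {"path": "web/app.js", "file_type": "ASCII text",
         "mime_type": "application/javascript"},
        {"path": "README", "file_type": "ASCII text",
         "mime_type": "text/plain"},
    ]
}

result = add_categories(scan)
print(json.dumps(result, indent=2))
```

Files that match no rule get a `category` of `None` here; whether the real plugin should omit the field instead is an open design choice.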

@johnmhoran johnmhoran self-assigned this Oct 3, 2019
johnmhoran added a commit that referenced this issue Oct 7, 2019
* Install by navigating to /scancode-toolkit/plugins/scancode-categories/
   and running 'pip install .'
* Rules comprise a set of any() and all() functions contained as string
   values in a list of JSON objects.
* Current test ruleset is quite small, based solely on scan of
   bionic-master-libc-bionic.tar.gz-extract
* Current working ruleset:
   /scancode-categories/src/python_rules/python_rules_01.py
* Command example: scancode -i -n 2 <path to codebase> --categories
   <path to JSON object> --json <path to JSON output file>
* Currently uses JSON object inside .py file.  Test of .json file coming soon.
* Unable thus far to create working rules (1) using YAML or text files or
   (2) without including Python code (any/all() functions) inside the
   rules themselves.
* Code not yet cleaned up -- still a WIP.

Signed-off-by: John M. Horan <johnmhoran@gmail.com>
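As a rough sketch of what "any() and all() functions contained as string values in a list of JSON objects" could look like, the snippet below stores each rule's test as a Python expression string and evaluates it with `eval()`. The rule contents, the `test` and `category` key names, and the `categorize` helper are assumptions made for illustration, not the contents of the rule files named in the commit. Since `eval()` runs arbitrary code, this approach is only reasonable for rule files you trust.

```python
import json

# Hypothetical ruleset: a JSON list of objects whose "test" value is a
# Python any()/all() expression over the lowercased scan fields.
RULES_JSON = """
[
  {"category": "C++",
   "test": "any(s in file_type for s in ['c++ source', 'c++ header'])"},
  {"category": "script",
   "test": "all(['text' in mime_type, 'script' in file_type])"}
]
"""


def categorize(resource, rules):
    """Evaluate each rule's expression; return the first matching category."""
    # Everything goes into the globals dict so that names used inside
    # generator expressions resolve correctly under eval().
    names = {
        "__builtins__": {},  # expose nothing beyond the names listed here
        "any": any,
        "all": all,
        "file_type": (resource.get("file_type") or "").lower(),
        "mime_type": (resource.get("mime_type") or "").lower(),
    }
    for rule in rules:
        if eval(rule["test"], names):
            return rule["category"]
    return None


rules = json.loads(RULES_JSON)

cpp = categorize(
    {"file_type": "C++ source, ASCII text", "mime_type": "text/x-c"}, rules)
script = categorize(
    {"file_type": "POSIX shell script, ASCII text executable",
     "mime_type": "text/x-shellscript"}, rules)
other = categorize(
    {"file_type": "data", "mime_type": "application/octet-stream"}, rules)
print(cpp, script, other)
```

Keeping the expressions as strings is what lets the rules live in plain JSON, at the cost of embedding Python in the rule file, which matches the limitation noted in the commit message.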
johnmhoran added a commit that referenced this issue Oct 7, 2019
* .json ruleset performs as intended.
* Cleaned up plugin_categories.py

Signed-off-by: John M. Horan <johnmhoran@gmail.com>
johnmhoran added a commit that referenced this issue Oct 10, 2019
* New rules: /scancode-categories/src/json_rules/json_rules_simple_01.json
* Seems to work well on test codebase
   bionic-master-libc-bionic.tar.gz-extract (largely C++).
* Next steps include expanding rules using more-diverse test codebases.
* No formal test suite yet but coming soon.
* This branch also includes code for 'Hello ScanCode' plugin
   illustrated in ScanCode wiki entry 'How To: Add a post scan plugin'
   (see /scancode-hello/).

Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@pombredanne
Member

See also #426.
I think we could eventually bundle all this as part of the --info scans... this is quite essential, IMHO.

@johnmhoran
Member Author

2 participants