New summary/primary Content Type prototype #1754
johnmhoran added a commit that referenced this issue on Oct 11, 2019:
* Includes LICENSE.txt which seems to have been added during the './configure.bat' process. Signed-off-by: John M. Horan <johnmhoran@gmail.com>
johnmhoran added a commit that referenced this issue on Oct 11, 2019:
* Added basic directory structure and basic files. * Initial test code in plugin_primary_content.py. Signed-off-by: John M. Horan <johnmhoran@gmail.com>
johnmhoran added a commit that referenced this issue on Oct 11, 2019:
Signed-off-by: John M. Horan <johnmhoran@gmail.com>
As an initial step to get familiar with contenttype.py and related files/processes, I will:
johnmhoran added a commit that referenced this issue on Oct 14, 2019:
Signed-off-by: John M. Horan <johnmhoran@gmail.com>
johnmhoran added a commit that referenced this issue on Oct 14, 2019:
Signed-off-by: John M. Horan <johnmhoran@gmail.com>
johnmhoran added a commit that referenced this issue on Oct 14, 2019:
Signed-off-by: John M. Horan <johnmhoran@gmail.com>
johnmhoran changed the title from "Prototype new summary/primary Content Type prototype" to "New summary/primary Content Type prototype" on Oct 15, 2019
johnmhoran added a commit that referenced this issue on Oct 15, 2019:
* Also added 'make.bat' and related test to 'is_build()' property. Signed-off-by: John M. Horan <johnmhoran@gmail.com>
The code behind this issue and branch is now in https://github.com/nexB/typecode, but there was never a PR for https://github.com/nexB/scancode-toolkit/compare/1754-primary-content
ScanCode currently reports the following 9 fields for determining the content type of a file:
When organizing a codebase for analysis, it would be useful to consolidate this data into a single field that indicates the level and type of analysis to apply. The focus should be on identifying source and binary files ("program/code" files) that are copyrightable and likely to be licensed. There will be many specialized file types (e.g. those used only by a proprietary software program) that will not be covered by this feature.
In the prototype phase we will use a ScanCode plugin to analyze a Scan and annotate it with a new primary_content_type field that designates the primary Content Type (primary for analysis) in the format: Language-Type.
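To make the Language-Type format concrete, here is a minimal sketch of how such a value might be derived for one file. It works on a plain dict rather than the ScanCode plugin API, and the key names (programming_language, is_source, is_script) and the rule that a script flag takes precedence over a source flag are assumptions for illustration, not the prototype's actual logic.

```python
def primary_content_type(resource):
    """Return a 'Language-Type' string for one scanned file, or None.

    `resource` is a plain dict; the keys used here are assumed names
    for existing content type fields, not a confirmed schema.
    """
    language = resource.get("programming_language")
    if not language:
        # No detected language: not treated as a program/code file here.
        return None
    # Assumption: a file flagged as a script is reported as Script even
    # if it is also flagged as source.
    if resource.get("is_script"):
        kind = "Script"
    elif resource.get("is_source"):
        kind = "SourceCode"
    else:
        return None
    return f"{language}-{kind}"


example = {"programming_language": "Python", "is_source": True, "is_script": False}
print(primary_content_type(example))
```

A real plugin would attach this value to each resource as primary_content_type; that wiring is left out here because it depends on the plugin interface.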
An initial test can be to report a primary Content Type that distinguishes between SourceCode and Scripts for files written in programming languages (e.g. Python, Ruby) that have both. There should be patterns in the current set of Content Type data to make this distinction.
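One way to look for such patterns is to tally which flag combinations occur per language in an existing scan. The sketch below assumes a list of per-file dicts with hypothetical keys programming_language, is_source and is_script; the sample data is made up for illustration and is not real scan output.

```python
from collections import Counter


def tally_flag_patterns(files):
    """Count (language, is_source, is_script) combinations across files.

    `files` is a list of dicts; the key names are assumptions about the
    scan data, and files with no detected language are skipped.
    """
    counts = Counter()
    for f in files:
        language = f.get("programming_language")
        if not language:
            continue
        key = (language, bool(f.get("is_source")), bool(f.get("is_script")))
        counts[key] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    {"programming_language": "Python", "is_source": True, "is_script": False},
    {"programming_language": "Python", "is_source": True, "is_script": True},
    {"programming_language": "Python", "is_source": True, "is_script": True},
    {"programming_language": None, "is_source": False, "is_script": False},
]
for pattern, count in tally_flag_patterns(sample).most_common():
    print(pattern, count)
```

If one combination dominates for scripts and another for regular source files in a given language, that combination is a candidate rule for the SourceCode versus Script distinction.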