Need summary file type #873

Closed
mjherzog opened this issue Dec 15, 2017 · 2 comments

Comments

@mjherzog
Member

We currently have several "file type" fields returned from a scan:

  • Type: either File or Directory
  • MIME Type
  • File Type
  • Binary
  • Text File
  • Archive File
  • Media File
  • Source File
  • Script File
  • Package Type

For this topic, I will ignore Type since this just covers File vs Directory and focus on files only. We need some simpler way to identify the file type in one field to facilitate filtering in AboutCode Manager and other tools. MIME Type and File Type each have pros and cons.

  • In many cases MIME Type seems more useful because it summarizes the type a bit more, e.g. "text/x-shellscript" is probably more useful than the corresponding File Types like "Bourne-Again shell script, ASCII text executable" and "POSIX shell script, ASCII text executable", because I primarily want to find all of the script files (which often do not have an extension).

  • On the other hand, MIME Type seems to use "application/octet-stream" as a catch-all (the "octet-stream" subtype is used to indicate that a body contains arbitrary binary data), which is not very helpful.

It may be the case that we could get the best result with a new Summary File Type field where the possible values are: Binary, Archive, Text, Media, Source, or Script, but I am not sure whether a scan will resolve to only one of these values (presumably we have multiple fields today because of some overlap).
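To make the overlap question concrete, here is a minimal sketch of how a single Summary File Type could be derived from the existing boolean flags. The key names (`is_archive`, `is_media`, and so on) and the precedence order are assumptions for illustration only; nothing in this issue fixes either of them.

```python
# Hypothetical precedence: when several flags are true, the first match wins.
# Both the key names and the ordering are assumptions, not confirmed behavior.
PRECEDENCE = [
    ("is_archive", "Archive"),
    ("is_media", "Media"),
    ("is_script", "Script"),
    ("is_source", "Source"),
    ("is_binary", "Binary"),
    ("is_text", "Text"),
]


def summary_file_type(resource):
    """Return one summary value for a scanned file, or None if no flag is set."""
    for key, label in PRECEDENCE:
        if resource.get(key):
            return label
    return None


# A file flagged as both Text and Source resolves to the more specific value.
example = {"is_text": True, "is_source": True, "is_script": False}
print(summary_file_type(example))
```

Putting the broad flags (Binary, Text) last means a more specific flag always wins when two overlap, which is one way to guarantee a single value per file.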

The primary use case is that I want to easily filter for Binary and Source code files which are the primary targets for analysis. The secondary use case is to easily filter for chunks like Script or Media files. This will also be important for filtering DeltaCode results to set up alerts/warnings for code files, but ignore or lower the priority of changes to Script or Media files.
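As a rough illustration of these two use cases, the sketch below filters file records on a hypothetical `summary_file_type` key and maps each value to an alert priority. The key name, the record layout, and the priority labels are all assumptions, not existing AboutCode Manager or DeltaCode behavior.

```python
# Hypothetical mapping from summary type to alert priority.
ALERT_PRIORITY = {
    "Binary": "high",
    "Source": "high",
    "Script": "low",
    "Media": "low",
}


def filter_by_type(files, wanted):
    """Keep only the records whose summary type is in `wanted`."""
    return [f for f in files if f.get("summary_file_type") in wanted]


def alert_priority(record, default="normal"):
    """Look up the priority for a record, falling back to `default`."""
    return ALERT_PRIORITY.get(record.get("summary_file_type"), default)


files = [
    {"path": "src/main.c", "summary_file_type": "Source"},
    {"path": "lib/libfoo.so", "summary_file_type": "Binary"},
    {"path": "docs/logo.png", "summary_file_type": "Media"},
]

# Primary use case: narrow the scan down to the main analysis targets.
targets = filter_by_type(files, {"Binary", "Source"})
print([f["path"] for f in targets])
```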

@mjherzog
Member Author

I reviewed some scans and noticed that many shell script files show up as Text rather than Script, so the current identification of Script: true/false is not going to help much.
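One possible workaround, sketched here under assumptions, is to fall back on the MIME Type and File Type strings when the Script flag is false. The key names (`is_script`, `mime_type`, `file_type`) are assumed, and the matching rules use only the example strings quoted earlier in this issue; this is not how the scanner currently decides.

```python
def looks_like_script(resource):
    """Heuristic: trust the flag first, then fall back to the type strings."""
    if resource.get("is_script"):
        return True
    mime = (resource.get("mime_type") or "").lower()
    file_type = (resource.get("file_type") or "").lower()
    # "text/x-shellscript" and "... shell script, ..." are the examples
    # quoted in this issue; other script types would need their own rules.
    return mime == "text/x-shellscript" or "shell script" in file_type


# A shell script that the scan flagged only as Text.
misflagged = {
    "is_text": True,
    "is_script": False,
    "mime_type": "text/x-shellscript",
    "file_type": "POSIX shell script, ASCII text executable",
}
print(looks_like_script(misflagged))
```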

@pombredanne
Member

I have merged this in #426
