GUI takes much longer to analyse archives now that it expands sub archives recursively #370

Closed
jcharlet opened this issue Jan 30, 2020 · 3 comments

Comments

@jcharlet
Contributor

jcharlet commented Jan 30, 2020

Issue

Since we now expand all archives recursively (for the supported formats), analysis can take much longer than on 6.4 and can appear to stall on some archives. This is only because the analysis goes deeper and needs more time; it does eventually finish. Browsing sub archives can also be disabled.

Context

This follows PR #334.

Solution

Do we need to manage this?

Should we stop expanding archives beyond a certain number of sub archives? @Dclipsham @DavidUnderdown what are your opinions on that?

@jcharlet jcharlet changed the title from "GUI fails to finish analyzing one tgz file on Download folder" to "GUI takes much longer to analyse archives now that it expands sub archives recursively" Jan 30, 2020
@jcharlet jcharlet added this to the 6.5 milestone Jan 30, 2020
@jcharlet jcharlet self-assigned this Jan 30, 2020
@jcharlet jcharlet added question and removed invalid labels Jan 30, 2020
@nishihatapalmer
Contributor

nishihatapalmer commented Feb 4, 2020

DROID was designed to process nested archives - zip, tar and gz did this from the start. Some later archive format implementations missed the trick of how to do that, and so fixing that has meant more archives being processed correctly.

It is an interesting question to ask whether you can process too many! It is now possible to configure which archive types are processed. Limiting to a certain depth would certainly improve performance, but I guess the question is why you are interested in the contents of archives at one depth but not a lower one.

I suppose one might be willing to trade off lower levels of detail, on the grounds that the main files you're interested in are mainly in, say, the first or second levels. Or maybe you'd only want to drill down into particular archive types to a certain level, but others could be different?
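To make the depth-limit idea concrete, here is a minimal Python sketch. It is not DROID code (DROID is written in Java), and every name in it (`DEPTH_LIMITS`, `archive_type`, `read_entries`, `walk`) is hypothetical. It assumes archives can be recognised by file extension, which is a simplification for illustration only, and it handles only zip and uncompressed tar. Each archive type gets its own limit: an archive found at depth `d` is opened only if `d` is below the limit for its type, and a limit of `None` means no limit.

```python
import io
import tarfile
import zipfile

# Hypothetical per-type limits on nesting depth; None means "expand without limit".
DEPTH_LIMITS = {"zip": 2, "tar": 1}


def archive_type(name):
    """Guess the archive type from the extension (assumption: extensions are reliable)."""
    lower = name.lower()
    if lower.endswith(".zip"):
        return "zip"
    if lower.endswith(".tar"):
        return "tar"
    return None


def read_entries(kind, data):
    """Yield (entry_name, entry_bytes) for the regular files in one archive."""
    if kind == "zip":
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield info.filename, zf.read(info)
    elif kind == "tar":
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tf:
            for member in tf.getmembers():
                if member.isfile():
                    yield member.name, tf.extractfile(member).read()


def walk(name, data, depth=1, limits=DEPTH_LIMITS):
    """Yield (path, depth) for every file, expanding nested archives
    only while the limit for that archive type allows it."""
    for entry_name, entry_data in read_entries(archive_type(name), data):
        path = f"{name}!/{entry_name}"
        yield path, depth
        inner = archive_type(entry_name)
        if inner is None:
            continue  # not an archive we know how to expand
        limit = limits.get(inner)
        if limit is None or depth < limit:
            yield from walk(path, entry_data, depth + 1, limits)
```

With `limits={"zip": 2}`, a zip nested three levels deep is still listed as a file at level 2, but its contents are never read, which is where the time saving would come from. The trade-off is the one described above: anything below the cut-off goes unidentified.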

@jcharlet
Contributor Author

jcharlet commented Feb 4, 2020

Thanks @nishihatapalmer!
One use case I hit while testing DROID locally: scanning a directory that happens to contain one large application archive (an 80 MB archive of the Postman software, I think). Everything else is fairly quick to process, but then it suddenly dives, seemingly forever, into that one archive, which shouldn't be there in the first place.

I don't know how frequently that can happen when scanning department archives though.

And I guess we could argue that you can select the files you want to process, and maybe set the biggest files aside to be processed later?
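As a rough illustration of setting the biggest files aside, here is a minimal Python sketch that splits a directory into files to process now and files to leave for later, based on a size threshold. It is not part of DROID; the function name `split_by_size` and the threshold value are hypothetical.

```python
from pathlib import Path


def split_by_size(root, threshold_bytes):
    """Return (process_now, set_aside): files at or below the threshold,
    and files larger than it, both in sorted path order."""
    process_now, set_aside = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue  # skip directories and other non-regular entries
        if path.stat().st_size > threshold_bytes:
            set_aside.append(path)
        else:
            process_now.append(path)
    return process_now, set_aside
```

For example, `split_by_size("some/folder", 50 * 1024 * 1024)` would leave an 80 MB archive like the one above in the second list, so the rest of the folder could be scanned first.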

@sparkhi
Collaborator

sparkhi commented Feb 6, 2020

Jeremie: discussed with David. It's acceptable, since we check the sub archives thoroughly, and this is the expected behaviour.


3 participants