forked from clearlydefined/crawler
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Reduce the number of fetches harvesting one component (clearlydefined…
…#475)

* Cache in-progress fetch promises and fetched results

  Cache in-progress fetch promises and fetched results for maven
  Add a unit test for gitCloner
  Cache fetch results from gitCloner
  Add a unit test for pypiFetch
  Cache fetch results from pypiFetch
  Minor refactoring
  Cache fetch results from npmjsFetch
  Add unit tests for rubyGem
  Cache fetch results from rubyGemFetch
  Cache fetch results from packagistFetch
  Cache fetch results from crateioFetch
  Cache fetch results from debianFetch
  Cache fetch results from goFetch
  Deep clone cached result on copy
  Cache fetch results from nugetFetch
  Add unit tests for podFetch
  Cache results from podFetch

  Delay fetchResult construction until end of fetch.
  Delay fetchResult construction and transfer the cleanup of the download directory until the end of the fetch. This ensures that when an error occurs, the cleanup of the download directory is still tracked in the request.

  Minor refactoring
  Minor refactoring
  Remove todo to avoid merge conflict
  Adapt tests after merge

* Add ScopedQueueSets

  ScopedQueueSets contains local and global scoped queue sets. The local scoped queue set holds tasks to be performed on the fetched result (package) that is currently being processed and is cached locally on the crawler instance. This avoids refetching and increases the cache hit rate. The global scoped queue set holds the queues shared among crawler instances. The local queue set is popped before the global one, which ensures that the cache is used before it expires.

* Publish requests on local queues to global upon crawler shutdown

  Fix and add tests
  Allow graceful shutdown

* Minor refactor and add more tests

* Update docker file to relay the shutdown signal

* Add config for dispatcher.fetched cache

  Now that ScopedQueueSets is introduced, the tool tasks on the same fetched result (in the local scoped queue set) are processed consecutively. Therefore, the cache TTL for the fetched result can now be reduced.

* Address review comments

* Removed --init option in docker run

  In my previous changes:
  - the nodejs application is run as PID 1 in the docker container, and
  - the application can handle termination signals.
  Therefore, the --init option is no longer necessary and has been removed from the docker run command.
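To make the first bullet concrete, here is a minimal sketch of caching in-progress fetch promises with a TTL and handing out deep clones of the cached result. It is an illustration under assumptions, not the crawler's code: `FetchCache`, `deepClone`, the `ttlMs` parameter, and the JSON-based clone are all hypothetical names and choices that do not appear in this commit view.

```javascript
// Hypothetical helper: none of these names come from the crawler itself.
// JSON round-tripping is a simple stand-in for a deep clone of plain data.
function deepClone(value) {
  return JSON.parse(JSON.stringify(value))
}

class FetchCache {
  constructor(ttlMs = 60000, now = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
    this.entries = new Map() // key -> { promise, expires }
  }

  // Returns a promise for a private copy of the fetch result for `key`.
  // `fetcher` runs only when there is no live entry, so concurrent callers
  // for the same key share one in-progress fetch.
  get(key, fetcher) {
    const hit = this.entries.get(key)
    if (hit && hit.expires > this.now()) return hit.promise.then(deepClone)

    // The executor runs synchronously, so the entry is stored before any
    // other caller can ask for the same key.
    const promise = new Promise(resolve => resolve(fetcher()))
    const entry = { promise, expires: this.now() + this.ttlMs }
    this.entries.set(key, entry)

    // Drop failed fetches so that a later call can retry.
    promise.catch(() => {
      if (this.entries.get(key) === entry) this.entries.delete(key)
    })
    return promise.then(deepClone)
  }
}
```

Storing the promise rather than the resolved value is what lets a second request arriving mid-fetch reuse the first one; cloning on the way out keeps one consumer from mutating what another sees.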
1 parent cc4af2e, commit 3fea16e
Showing 64 changed files with 2,005 additions and 278 deletions.
@@ -50,4 +50,4 @@ ENV NODE_ENV "localhost"
 
 ENV PORT 5000
 EXPOSE 5000
-ENTRYPOINT ["npm", "start"]
+ENTRYPOINT ["node", "index.js"]
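With `node` as the entrypoint, the commit message states that the application runs as PID 1 and handles termination signals itself. A minimal sketch of what such handling could look like follows; `onSignals` and the `shutdown` callback are hypothetical names, and the actual handler in `index.js` is not shown in this view.

```javascript
// Hypothetical sketch: `onSignals` and `shutdown` are illustrative names.
// `proc` is expected to look like Node's `process` (an EventEmitter with exit()).
function onSignals(proc, shutdown, signals = ['SIGTERM', 'SIGINT']) {
  let started = false
  for (const signal of signals) {
    proc.on(signal, () => {
      if (started) return // ignore repeated signals while shutting down
      started = true
      // Run the (possibly async) cleanup, then exit with a matching code.
      new Promise(resolve => resolve(shutdown(signal)))
        .then(() => proc.exit(0), () => proc.exit(1))
    })
  }
}

// Possible wiring in an entry file (commented out so the sketch has no side effects):
// onSignals(process, async signal => { /* flush queues, close server */ })
```

Passing `proc` in rather than using the global `process` is only a design choice for this sketch; it makes the handler testable with a plain `EventEmitter`.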
@@ -86,4 +86,4 @@ COPY . "${APPDIR}"
 
 ENV PORT 5000
 EXPOSE 5000
-ENTRYPOINT ["npm", "start"]
+ENTRYPOINT ["node", "index.js"]
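Once the shutdown signal reaches the crawler, the commit message says that requests on local queues are published to the global ones. The sketch below illustrates that step together with the local-before-global pop order. It is an in-memory stand-in under assumptions: `ScopedQueues`, `push`, `pop`, and `publish` are hypothetical names, and the real ScopedQueueSets API is not visible in this commit view.

```javascript
// Hypothetical in-memory stand-in for the local/global scoping idea.
class ScopedQueues {
  constructor() {
    this.local = [] // requests whose fetched result is cached on this instance
    this.global = [] // stand-in for the queues shared across crawler instances
  }

  push(request, scope = 'global') {
    const queue = scope === 'local' ? this.local : this.global
    queue.push(request)
  }

  // Local work is popped first so cached fetch results are used before expiry.
  pop() {
    return this.local.length > 0 ? this.local.shift() : this.global.shift()
  }

  // On shutdown, hand any remaining local requests over to the shared queue
  // so another instance can pick them up. Returns the number moved.
  publish() {
    const moved = this.local.length
    this.global.push(...this.local)
    this.local = []
    return moved
  }
}
```

For example, after pushing one global and two local requests, `pop()` returns the first local request, and `publish()` then moves the remaining local request behind the global one.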
@@ -1,3 +1,3 @@
 CALL docker kill cdcrawler
 CALL mkdir C:\temp\crawler-data
-CALL docker run --rm --name cdcrawler --env-file %~dp0\..\..\env.list -p 5000:5000 -p 9229:9229 -v C:\temp\crawler-data:/tmp/cd --entrypoint npm cdcrawler:latest run local
+CALL docker run --rm --name cdcrawler --env-file %~dp0\..\..\env.list -p 5000:5000 -p 9229:9229 -v C:\temp\crawler-data:/tmp/cd --entrypoint node cdcrawler:latest --inspect-brk=0.0.0.0:9229 index.js