add support for a secondary queue #571

Open
2 tasks
elrayle opened this issue Apr 17, 2024 · 1 comment

Comments

@elrayle
Collaborator

elrayle commented Apr 17, 2024

Description

Add support for configuring a secondary queue. When the crawler is ready to process more requests, it will look for requests in the primary queue first. If none are found, it will look for requests in the secondary queue.

Tasks:

  • add configuration for two queues identified as primary and secondary
  • update queue processing code to pull requests from the secondary queue when the primary is empty
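
The fallback behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the crawler's actual code: `InMemoryQueue` and `FallbackQueue` are hypothetical names, and a real implementation would wrap whatever queue backend the crawler is configured with.

```python
from collections import deque
from typing import Deque, Optional


class InMemoryQueue:
    """Hypothetical stand-in for a real queue backend."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items: Deque[str] = deque()

    def push(self, request: str) -> None:
        self._items.append(request)

    def pop(self) -> Optional[str]:
        # Return None instead of raising when the queue is empty.
        return self._items.popleft() if self._items else None


class FallbackQueue:
    """Pulls from the primary queue first, then from the secondary."""

    def __init__(self, primary: InMemoryQueue, secondary: InMemoryQueue) -> None:
        self.primary = primary
        self.secondary = secondary

    def pop(self) -> Optional[str]:
        request = self.primary.pop()
        if request is not None:
            return request
        # Primary is empty: fall back to the secondary queue.
        return self.secondary.pop()


primary = InMemoryQueue("primary")
secondary = InMemoryQueue("secondary")
secondary.push("backfill-1")
primary.push("urgent-1")

queues = FallbackQueue(primary, secondary)
order = [queues.pop(), queues.pop(), queues.pop()]
```

Even though the secondary request was queued first, the primary request is returned first, and the final call returns `None` once both queues are empty.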

Configuration

As an example, the Kubernetes config in crawler.yml would change as shown in the two snippets that follow. Other places also need updates, including the other example configs, the config defaults in code, and the code that processes the configs and queues.

Current Configuration

            - name: CRAWLER_QUEUE_PREFIX
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: CRAWLER_QUEUE_PREFIX

Proposed Configuration

            - name: CRAWLER_QUEUE_PREFIX
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: CRAWLER_QUEUE_PREFIX

            - name: CRAWLER_SECONDARY_QUEUE_PREFIX
              valueFrom:
                secretKeyRef:
                  name: secrets
                  key: CRAWLER_SECONDARY_QUEUE_PREFIX
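
On the consuming side, reading the two settings might look something like the sketch below. It is written in Python purely for illustration. The function name `read_queue_prefixes` is hypothetical, and treating the secondary prefix as optional is an assumption made here so that existing deployments would keep working unchanged. The issue does not specify that behavior.

```python
from typing import Mapping, Optional, Tuple


def read_queue_prefixes(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (primary_prefix, secondary_prefix) from an environment mapping."""
    # Assumption: the primary prefix must be present.
    primary = env["CRAWLER_QUEUE_PREFIX"]
    # Assumption: the secondary prefix is optional; unset or empty means
    # "no secondary queue".
    secondary = env.get("CRAWLER_SECONDARY_QUEUE_PREFIX") or None
    return primary, secondary


# Example values are made up for illustration.
both = read_queue_prefixes({
    "CRAWLER_QUEUE_PREFIX": "main",
    "CRAWLER_SECONDARY_QUEUE_PREFIX": "backfill",
})
primary_only = read_queue_prefixes({"CRAWLER_QUEUE_PREFIX": "main"})
```

In a real process the mapping passed in would be `os.environ`.
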
@qtomlinson
Collaborator

qtomlinson commented Apr 24, 2024

Instead of adding a secondary queue (QueueSet internally), an alternative way to achieve the same goal could be to make the QueueSet configurable. Currently, the QueueSet is hardcoded internally as prefix-immediate, prefix-normal, prefix-soon, and prefix-later, and the prefix is the only configurable part. Rather than configuring the prefix and constructing the names implicitly, it may be possible to make the list of queue names itself configurable. The configuration would then specify the existing prefix-immediate, prefix-normal, prefix-soon, and prefix-later queues explicitly, and additional queues could be added alongside those four. The weights for the queues in the QueueSet can already be configured, which allows prioritization when pulling from specific queues.
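
A rough Python sketch of that idea, for illustration only: `resolve_queue_names` and `build_schedule` are hypothetical helpers, not existing crawler functions. How the crawler actually applies its weights is not described in this thread, so the schedule shown here is just one simple interpretation (a queue with weight n appears n times in the pull order).

```python
from typing import Dict, List, Optional

# The four suffixes that are currently hardcoded, per the comment above.
DEFAULT_SUFFIXES = ["immediate", "normal", "soon", "later"]


def resolve_queue_names(prefix: str, names: Optional[List[str]] = None) -> List[str]:
    """Use an explicit list of queue names if given, else derive from the prefix."""
    if names:
        return list(names)
    return [f"{prefix}-{suffix}" for suffix in DEFAULT_SUFFIXES]


def build_schedule(names: List[str], weights: Dict[str, int]) -> List[str]:
    """Expand queue names into a pull order; unlisted queues get weight 1."""
    schedule: List[str] = []
    for name in names:
        schedule.extend([name] * weights.get(name, 1))
    return schedule


# Today's behavior: names derived implicitly from the prefix.
implicit = resolve_queue_names("prefix")

# Proposed behavior: the same four queues listed explicitly, plus an extra one.
explicit = resolve_queue_names("prefix", implicit + ["prefix-backfill"])

schedule = build_schedule(explicit, {"prefix-immediate": 3, "prefix-backfill": 1})
```

With explicit names, an extra low-priority queue such as the made-up `prefix-backfill` plays the role of the secondary queue without a second QueueSet.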
