Add .get_task method to Schedulers - APS v4 #953

Open
1 task done
HK-Mattew opened this issue Aug 10, 2024 · 10 comments

Comments

@HK-Mattew
Contributor

Things to check first

  • I have searched the existing issues and didn't find my feature already requested there

Feature description

Hello,

My suggestion is to add a .get_task(task_id=...) method to the schedulers.

Use case

I found myself in a situation where I needed to pass a Task instance directly to the .add_job method so that the job would use an existing task configuration.

I could fetch all tasks with .get_tasks, but I would then have to filter that list every time I need a single specific task. That is not a good fit for my use case, and I believe this suggestion would be useful to others as well.
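For reference, the filtering workaround described above might look like the following sketch. It assumes only that each object returned by .get_tasks() exposes an `id` attribute, as the Task repr printed later in this thread suggests. `find_task` and `FakeTask` are hypothetical names used for illustration and are not part of APScheduler.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass
class FakeTask:
    """Hypothetical stand-in for a scheduler task; only `id` matters here."""
    id: str
    max_running_jobs: int = 1


def find_task(tasks: Iterable[Any], task_id: str) -> Optional[Any]:
    """Return the first task whose `id` equals task_id, or None if absent."""
    return next((task for task in tasks if task.id == task_id), None)


# With a real scheduler this would be (assumption):
#     find_task(scheduler.get_tasks(), "task1")
tasks = [FakeTask("task1", max_running_jobs=5), FakeTask("task2")]
found = find_task(tasks, "task1")
missing = find_task(tasks, "nope")
```

This scans the full list on every lookup, which is the overhead a dedicated .get_task method would avoid.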

@agronholm
Owner

I'll consider this, but I'm curious as to why you would need to pass a Task instance to add_job(). Could you explain that?

@HK-Mattew
Contributor Author

> I'll consider this, but I'm curious as to why you would need to pass a Task instance to add_job(). Could you explain that?

Because .add_job calls .configure_task internally. If I pass the task ID to .add_job, it overwrites some of the configuration I had previously set on the task.

However, passing the Task instance does not overwrite my configuration.

I did not report this as a bug because I am not sure whether it is one or whether this is the expected behavior.

@agronholm
Owner

Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

@HK-Mattew
Contributor Author

> Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

Good to know. I'll reproduce this now.

@HK-Mattew
Contributor Author

> Passing the task ID to add_job() should never overwrite any task configuration. Can you give me a reproducible example where you demonstrate such behavior?

Here is the sample code:

from apscheduler import Scheduler, SchedulerRole
from apscheduler.executors.async_ import AsyncJobExecutor
from apscheduler.executors.thread import ThreadPoolJobExecutor
from apscheduler.executors.subprocess import ProcessPoolJobExecutor
from apscheduler.datastores.mongodb import MongoDBDataStore
import config



scheduler_web_configs = dict(
    data_store=MongoDBDataStore(
        client_or_uri=config.MONGO_DB_URI,
        database=config.MONGO_DB_NAME
    ),
    role=SchedulerRole.scheduler,
    max_concurrent_jobs=100,
    job_executors={
        'async': AsyncJobExecutor(),
        'threadpool': ThreadPoolJobExecutor(),
        'processpool': ProcessPoolJobExecutor(),
    }
)



def func_to_task_1():
    ...


with Scheduler(**scheduler_web_configs) as scheduler:
    
    # Configure the task explicitly with non-default settings.
    scheduler.configure_task(
        func_or_task_id='task1',
        func=func_to_task_1,
        job_executor='async',
        max_running_jobs=5
    )

    print(scheduler.get_tasks())

    """
    [print result]

    [Task(id='task1', func='__main__:func_to_task_1', job_executor='async',
    max_running_jobs=5, misfire_grace_time=None, metadata={}, running_jobs=0)]
    """


    # Add a job by task ID only; no task settings are passed here.
    scheduler.add_job(
        func_or_task_id='task1'
    )

    print(scheduler.get_tasks())

    """
    [print result]
    [Task(id='task1', func='__main__:func_to_task_1', job_executor='threadpool',
    max_running_jobs=1, misfire_grace_time=None, metadata={}, running_jobs=0)]
    """

In the output you can see that the .add_job method overwrote some of my task settings, namely the max_running_jobs and job_executor fields.
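To make the difference between the two printed results explicit, here is a small sketch that diffs two task snapshots. `TaskSnapshot` and `changed_fields` are hypothetical helpers, not APScheduler APIs. The field values are copied from the two print results above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TaskSnapshot:
    """Hypothetical record holding a few of the fields printed above."""
    id: str
    job_executor: str
    max_running_jobs: int


def changed_fields(before, after):
    """Map each field name whose value differs to a (before, after) tuple."""
    return {
        f.name: (getattr(before, f.name), getattr(after, f.name))
        for f in fields(before)
        if getattr(before, f.name) != getattr(after, f.name)
    }


# Values copied from the two print results above.
before = TaskSnapshot("task1", job_executor="async", max_running_jobs=5)
after = TaskSnapshot("task1", job_executor="threadpool", max_running_jobs=1)
diff = changed_fields(before, after)
print(diff)
```

Both changed fields were set explicitly in the configure_task() call and were not passed to add_job().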

@agronholm mentioned this issue on Aug 11, 2024
@agronholm
Owner

Ok, I understand the problem now, and it's a design issue. I'll have to refactor the add_task() data store method.

@mmmcorpsvit

@agronholm, sorry to ask, but are there any updates on the fix?

@agronholm
Owner

I'm making some progress once in a while, but it seems that every time I fix something, I uncover another problem. The rabbit hole is deep ☹️
I'll get it done Soon(tm). But I have people in other projects constantly asking for updates, not just APScheduler...

@agronholm
Owner

The hard work on AnyIO's next release is done, so I can focus on this now. Getting incremental updates to task configuration is the crux of the problem here. I'm still working on a solution to that.

@agronholm
Owner

Sorry for the delay. I'm having a bit of trouble fixing AsyncScheduler.configure_task() to work with the data stores in a sane way, so I'm currently experimenting with different ways of implementing full and partial task updates. This might take some time, unfortunately.
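As an illustration of what a partial task update could mean, here is a generic sketch; it is not the maintainer's implementation and not how APScheduler's data stores work. It uses a dedicated sentinel rather than None to mark arguments the caller did not supply, since None can itself be a meaningful value (the output above shows misfire_grace_time=None). `TaskConfig`, `UNSET` and `partial_update` are hypothetical names.

```python
from dataclasses import dataclass, replace


class _Unset:
    """Sentinel type: marks an argument the caller did not supply."""

    def __repr__(self):
        return "UNSET"


UNSET = _Unset()


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical task record, not APScheduler's Task class."""
    id: str
    job_executor: str = "threadpool"
    max_running_jobs: int = 1


def partial_update(existing, *, job_executor=UNSET, max_running_jobs=UNSET):
    """Return a copy of `existing` with only the supplied fields replaced."""
    candidates = (
        ("job_executor", job_executor),
        ("max_running_jobs", max_running_jobs),
    )
    supplied = {name: value for name, value in candidates if value is not UNSET}
    return replace(existing, **supplied)


configured = TaskConfig("task1", job_executor="async", max_running_jobs=5)

# No settings supplied: nothing is overwritten.
untouched = partial_update(configured)

# One setting supplied: only that field changes.
bumped = partial_update(configured, max_running_jobs=10)
```

A full update, by contrast, would rebuild the record from defaults plus the supplied arguments, which matches the overwrite seen in the example earlier in this thread.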

3 participants