-
Notifications
You must be signed in to change notification settings - Fork 124
vine: separate blocked and ready tasks #4275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
vine: separate blocked and ready tasks #4275
Conversation
…' into separate_pending_and_ready_tasks
|
Can we implement this with the changes to the priority queue? That is, give blocked tasks a small priority and not change anything else in the code? |
|
I agree that implementing the whole thing with flexible priorities may mitigate the problem, but I don't see it as a solid abstraction for a long-term solution. For ineligible tasks, demoting them to the blocked list or lowering their priority serves the same purpose, which is to delay the execution and consider later. What's elegant of demoting is that it gives us distinct states to track tasks, which allows us to decide what operations are appropriate on each of the states. For For As it stands, even if recovery tasks are removed in the near future, there are still other cases where a task can become ineligible and should be delayed, to which I think adding a blocked list is more reasonable. |
|
If a task has a 'blocked' priority, we can still check why it is blocked. It is not too much about the efficiency, as I think both methods are about the same cost, but rather reducing the complexity of the code. In the future, every time we need to update the ready task queue we will have to do the same for the blocked task queue, and I rather do not increase the state that we have to keep. |
|
I might be wrong, but I guess adding the blocked queue does reduce the complexity of the code? Instead, it makes tasks more maintainable. On the one hand, distinguishing the two states allow us to adopt different operations to each without mixing them up. On the other hand, lowering the priority doesn't solve the fundamental problem: the blocked tasks are still left in the scheduling queue and might be reconsidered several times, their scheduling frequency and order may beyond our control over time. Meanwhile, it bings two other problems: 1) tasks with positive and negative priorities should be treated differently, and 2) to what extend the priority value should be lowered. If a future change is needed to the ready queue, we don't need to apply the same thing to the blocked queue, because blocked tasks will be rotated to the ready queue and handled there soon or later. |
|
The blocked queue is simply implementing lowest priority? As the recent interaction with parsl+wq, priority is important for users and we should have a system that works well with resetting the priority queue when resources become available (e.g. a workers joins or a task result is return). Given this, if we assign lowest priority to "blocked" tasks they will get checked only when there is nothing else to do. The blocked list helps when priority is not important so that we can rotate the priority queue. However, the operation of taskvine itself now depends on priorities (resource exhaustion and recovery tasks). Thus, if we add a blocked list we should remove priorities from taskvine, as otherwise it is very hard to predict what taskvine is doing. Since priorities are important to users, I don't think this is the way we want to go. |
|
And also, I guess we are talking different complexities... From my perspective adding a second data structure increases the complexity of the code, as we need to make sure to maintain it correctly across all possible state transitions. We may be able to do it today when we have fresh in our minds, but not in a year when we forgot what the code was doing. |
|
That's a fair argument, adding an additional data structure would increase the maintainance complexity in the future... I think we can go with that. Then if we lower the priority of tasks when they are found to be ineligible, the iterating cursor doesn't appear to be helpful. Do we want to get rid of that feature and stick with peaking at the top? |
|
What if we add the blocked list but only for logical blocks? If an input file is not there, then the task is put in the block list. If this makes sense, then the changes in this pr should be reduced, e.g., expiring_tasks function should not be removed/changed. You may want to add different return values to consider_task with an enum so you know if it was logical, resources, etc. what prevented the task to be scheduled. |
|
I think this is a good approach, and I definitely understand that there are too many things going on in the scheduling function so we need to be very circumspect! Returning tasks with inputs missing or library missing seems to be the direction we'll have to take sooner or later. Given that the implementation requires adhering the overall architecture and some coding philosophy behind it, I'm afraid that I couldn't get it right. Hou about you check it out later? That way we might have a more clear idea of what to do here! |
|
Sounds good! Like me make a sketch to see if it does what you need! |
|
Jin, I would like for you to keep the responsibility of implementing this. You and Ben are having a constructive discussion about how it should work. I want you guys to come to a shared understanding, and then Jin should implement it. |
|
Got it! |
|
Jin, before you continue adding to this pr, please split it into smaller chunks. For example, the first chunk should only add the block list with no other changes to the expiring list function. Since this is a structural change, lets split it into changes that we can more easily evaluate and think about. |
|
Sounds great! |

Proposed Changes
A new data structure
struct list *blocked_tasksis added to track tasks that cannot be run.We rotate the list for a fixed amount of time, by default
q->scheduling_depth(100), to give all tasks an equal chance of being reconsidered, and the priority queue is populated with eligible tasks for fast dispatch.Merge Checklist
The following items must be completed before PRs can be merged.
Check these off to verify you have completed all steps.
make testRun local tests prior to pushing.make formatFormat source code to comply with lint policies. Note that some lint errors can only be resolved manually (e.g., Python)make lintRun lint on source code prior to pushing.