A greedy heap solution for taking the maximum number of courses before their deadlines.
Problem Restatement
We are given a list of courses.
Each course is represented as:
[duration, lastDay]duration is how many days the course takes.
lastDay is the latest day by which the course must be finished.
We start on day 1.
We can take only one course at a time.
We need to return the maximum number of courses we can finish before or on their deadlines.
Input and Output
| Item | Meaning |
|---|---|
| Input | A list courses, where courses[i] = [duration_i, lastDay_i] |
| Output | Maximum number of courses that can be taken |
| Constraint | Courses must be taken continuously |
| Constraint | Only one course can be taken at a time |
| Constraint | A course must finish on or before lastDay_i |
The official constraints are:
| Constraint | Value |
|---|---|
courses.length | 1 <= courses.length <= 10^4 |
duration_i | 1 <= duration_i <= 10^4 |
lastDay_i | 1 <= lastDay_i <= 10^4 |
Examples
Example 1:
courses = [[100, 200], [200, 1300], [1000, 1250], [2000, 3200]]One optimal plan is:
| Course | Duration | Finish Day |
|---|---|---|
[100, 200] | 100 | 100 |
[1000, 1250] | 1000 | 1100 |
[200, 1300] | 200 | 1300 |
We can take 3 courses.
The course [2000, 3200] cannot also fit, because it would make the total time too large.
So the answer is:
3Example 2:
courses = [[1, 2]]The course takes 1 day and must finish by day 2.
So the answer is:
1Example 3:
courses = [[3, 2], [4, 3]]The first course takes 3 days but must finish by day 2.
The second course takes 4 days but must finish by day 3.
Neither course can be taken.
So the answer is:
0First Thought: Try All Subsets
A direct approach is to try every subset of courses.
For each subset, we can check whether there is an order that finishes all selected courses before their deadlines.
But there are too many subsets.
For n courses, there are:
2**npossible subsets.
Since courses.length can be 10^4, this is impossible.
We need a greedy method.
Key Insight
Sort courses by deadline.
Then process courses from the earliest deadline to the latest deadline.
While processing courses in this order, suppose we decide to take some courses. Let time be the total duration of those chosen courses.
If time is less than or equal to the current deadline, the chosen set is still feasible.
If time becomes greater than the current deadline, we have chosen too much work. We must remove one course.
To keep as many courses as possible, we should remove the course with the largest duration.
Why?
Because removing the longest course frees the most time while removing only one course.
This gives future courses the best chance to fit.
So we use a max-heap to store the durations of chosen courses.
Python has a min-heap, so we store negative durations.
Algorithm
- Sort
coursesbylastDay. - Initialize
time = 0. - Initialize an empty heap.
- For each course
[duration, lastDay]:- Add the course duration to
time. - Push
-durationinto the heap. - If
time > lastDay, remove the longest selected course.
- Add the course duration to
- Return the number of courses left in the heap.
The heap contains exactly the courses we keep.
Implementation
import heapq
class Solution:
def scheduleCourse(self, courses: list[list[int]]) -> int:
courses.sort(key=lambda course: course[1])
time = 0
heap = []
for duration, last_day in courses:
time += duration
heapq.heappush(heap, -duration)
if time > last_day:
longest = -heapq.heappop(heap)
time -= longest
return len(heap)Code Explanation
First, we sort by deadline:
courses.sort(key=lambda course: course[1])This makes us handle tighter deadlines earlier.
Then we maintain the total duration of selected courses:
time = 0The heap stores selected course durations as negative numbers:
heap = []For every course, we first tentatively take it:
time += duration
heapq.heappush(heap, -duration)If this makes the schedule invalid:
if time > last_day:we remove the selected course with the largest duration:
longest = -heapq.heappop(heap)
time -= longestBecause the heap stores negative values, the most negative heap entry represents the largest original duration.
At the end, the heap size is the number of courses we kept:
return len(heap)Correctness
After sorting by lastDay, when we are processing a course with deadline D, all previously processed courses have deadlines less than or equal to D.
The heap stores the durations of courses currently selected among the courses processed so far. The variable time is the total duration of those selected courses.
If time <= D, then the selected courses can be scheduled in deadline order up to the current course, because their total duration does not exceed the current largest deadline.
If time > D, then the selected set is too large to finish by the current deadline. We must remove at least one selected course. Removing the longest selected course reduces time as much as possible while decreasing the number of selected courses by exactly one.
This choice is safe. Among all ways to keep the same number of courses after removing one course, dropping the longest course leaves the smallest total time. A smaller total time can only make future deadlines easier to satisfy.
Therefore, after each step, the heap contains the largest possible number of courses among the courses seen so far, with the smallest possible total time among such choices.
By the end, all courses have been processed, so the heap size is the maximum number of courses that can be taken.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n log n) | Sorting takes O(n log n), and each heap operation takes O(log n) |
| Space | O(n) | The heap may store up to n course durations |
Testing
def run_tests():
s = Solution()
assert s.scheduleCourse([[100, 200], [200, 1300], [1000, 1250], [2000, 3200]]) == 3
assert s.scheduleCourse([[1, 2]]) == 1
assert s.scheduleCourse([[3, 2], [4, 3]]) == 0
assert s.scheduleCourse([[5, 5], [4, 6], [2, 6]]) == 2
assert s.scheduleCourse([[2, 5], [2, 6], [2, 7]]) == 3
assert s.scheduleCourse([[10, 5], [1, 20], [1, 20]]) == 2
print("all tests passed")
run_tests()Test meaning:
| Test | Why |
|---|---|
[[100,200],[200,1300],[1000,1250],[2000,3200]] | Official main example |
[[1,2]] | Single feasible course |
[[3,2],[4,3]] | No course can fit |
[[5,5],[4,6],[2,6]] | Requires dropping a longer course |
[[2,5],[2,6],[2,7]] | All courses fit |
[[10,5],[1,20],[1,20]] | Impossible long course should not block short courses |