This change makes for a clearer (less confusing) path of execution in the scheduler, now the code to execute when a thread awakes is closer to the code that puts the thread to sleep (WaitSynch1, WaitSynchN). It also allows us to implement the special wake up behavior of ReplyAndReceive without hacking up WaitObject::WakeupAllWaitingThreads.
If savestates are desired in the future, we can change this implementation to one similar to the CoreTiming event system, where we first register the callback functions at startup and assign their identifiers to the Thread callback variable instead of directly assigning a lambda to the wake up callback variable.
This is necessary for loading multiple processes at the same time.
The main thread will be automatically scheduled when necessary once the scheduler runs.
After hwtesting and reverse engineering the kernel, it was found that the CTROS scheduler performs no priority boosting for threads like this, although some other forms of scheduling priority-starved threads might take place.
For example, it was found that hardware interrupts might cause low-priority threads to run if the CPU is preempted in the middle of an SVC handler that deschedules the current (high priority) thread before scheduling it again.
This commit removes the overly general THREADSTATUS_WAIT_SYNCH and replaces it with two more granular statuses:
THREADSTATUS_WAIT_SYNCH_ANY when a thread waits on objects via WaitSynchronization1 or WaitSynchronizationN with wait_all = false.
THREADSTATUS_WAIT_SYNCH_ALL when a thread waits on objects via WaitSynchronizationN with wait_all = true.
The implementation is based on reverse engineering of the 3DS's kernel.
A mutex holder's priority will be temporarily boosted to the best priority among any threads that want to acquire any of its held mutexes.
When the holder releases the mutex, it's priority will be boosted to the best priority among the threads that want to acquire any of its remaining held mutexes.
Threads will now be awakened when the objects they're waiting on are signaled, instead of repeating the WaitSynchronization call every now and then.
The scheduler is now called once after every SVC call, and once after a thread is awakened from sleep by its timeout callback.
This new implementation is based off reverse-engineering of the real kernel.
See https://gist.github.com/Subv/02f29bd9f1e5deb7aceea1e8f019c8f4 for a more detailed description of how the real kernel handles rescheduling.
Each thread gets a 0x200-byte area from the 0x1000-sized page, when all 8 thread slots in a single page are used up, the kernel allocates a new page to hold another 8 entries.
This is consistent with what the real kernel does.