Verified with a hwtest and implemented based on reverse engineering.
Thread A's priority will get bumped to the highest priority among all the threads that are waiting for a mutex that A holds.
Once A releases the mutex and ownership is transferred to B, A's priority will return to normal and B's priority will be bumped.
Switch mutexes are no longer kernel objects, they are managed in userland and only use the kernel to handle the contention case.
Mutex addresses store a special flag value (0x40000000) to notify the guest code that there are still some threads waiting for the mutex to be released. This flag is updated when a thread calls ArbitrateUnlock.
TODO:
* Fix svcWaitProcessWideKey
* Fix svcSignalProcessWideKey
* Remove the Mutex class.
Ported from citra PR #3091
The delay specified here is from a Nintendo 3DS, and should be measured in a Nintendo Switch.
This change is enough to prevent Puyo Puyo Tetris's main thread starvation.
This change makes for a clearer (less confusing) path of execution in the scheduler, now the code to execute when a thread awakes is closer to the code that puts the thread to sleep (WaitSynch1, WaitSynchN). It also allows us to implement the special wake up behavior of ReplyAndReceive without hacking up WaitObject::WakeupAllWaitingThreads.
If savestates are desired in the future, we can change this implementation to one similar to the CoreTiming event system, where we first register the callback functions at startup and assign their identifiers to the Thread callback variable instead of directly assigning a lambda to the wake up callback variable.
Don't automatically assume that Thread::Create will only be called when the parent process is currently scheduled. This assumption will be broken when applets or system modules are loaded.
This is necessary for loading multiple processes at the same time.
The main thread will be automatically scheduled when necessary once the scheduler runs.
After hwtesting and reverse engineering the kernel, it was found that the CTROS scheduler performs no priority boosting for threads like this, although some other forms of scheduling priority-starved threads might take place.
For example, it was found that hardware interrupts might cause low-priority threads to run if the CPU is preempted in the middle of an SVC handler that deschedules the current (high priority) thread before scheduling it again.
This commit removes the overly general THREADSTATUS_WAIT_SYNCH and replaces it with two more granular statuses:
THREADSTATUS_WAIT_SYNCH_ANY when a thread waits on objects via WaitSynchronization1 or WaitSynchronizationN with wait_all = false.
THREADSTATUS_WAIT_SYNCH_ALL when a thread waits on objects via WaitSynchronizationN with wait_all = true.
The implementation is based on reverse engineering of the 3DS's kernel.
A mutex holder's priority will be temporarily boosted to the best priority among any threads that want to acquire any of its held mutexes.
When the holder releases the mutex, it's priority will be boosted to the best priority among the threads that want to acquire any of its remaining held mutexes.
Threads will now be awakened when the objects they're waiting on are signaled, instead of repeating the WaitSynchronization call every now and then.
The scheduler is now called once after every SVC call, and once after a thread is awakened from sleep by its timeout callback.
This new implementation is based off reverse-engineering of the real kernel.
See https://gist.github.com/Subv/02f29bd9f1e5deb7aceea1e8f019c8f4 for a more detailed description of how the real kernel handles rescheduling.
Each thread gets a 0x200-byte area from the 0x1000-sized page, when all 8 thread slots in a single page are used up, the kernel allocates a new page to hold another 8 entries.
This is consistent with what the real kernel does.
This commit fixes several kernel object leaks. The most severe of them
was threads not being removed from the private handle table used for
CoreTiming events. This resulted in Threads never being released, which
in turn held references to Process, causing CodeSets to never be freed
when loading other applications.
memory.cpp/h contains definitions related to acessing memory and
configuring the address space
mem_map.cpp/h contains higher-level definitions related to configuring
the address space accoording to the kernel and allocating memory.
Involves making asserts use printf instead of the log functions (log functions are asynchronous and, as such, the log won't be printed in time)
As such, the log type argument was removed (printf obviously can't use it, and it's made obsolete by the file and line printing)
Also removed some GEKKO cruft.
* Simplifies scheduling logic, specifically regarding thread status. It should be much clearer which statuses are valid
for a thread at any given point in the system.
* Removes dead code from thread.cpp.
* Moves the implementation of resetting a ThreadContext to the corresponding core's implementation.
Other changes:
* Fixed comments in arm interfaces.
* Updated comments in thread.cpp
* Removed confusing, useless, functions like MakeReady() and ChangeStatus() from thread.cpp.
* Removed stack_size from Thread. In the CTR kernel, the thread's stack would be allocated before thread creation.
During normal operation, a thread waiting on an WaitObject and the
object hold mutual references to each other for the duration of the
wait.
If a process is forcefully terminated (The CTR kernel has a SVC to do
this, TerminateProcess, though no equivalent exists for threads.) its
threads would also be stopped and destroyed, leaving dangling pointers
in the WaitObjects.
The solution is to simply have the Thread remove itself from WaitObjects
when it is stopped. The vector of Threads in WaitObject has also been
changed to hold SharedPtrs, just in case. (Better to have a reference
cycle than a crash.)
This should speed up compile times a bit, as well as enable more liberal
use of forward declarations. (Due to SharedPtr not trying to emit the
destructor anymore.)
- Separate wait checking from waiting the current thread
- Resume thread when wait_all=true only if all objects are available at once
- Set output to correct wait object index when there are duplicate handles
This thread will not actually execute instructions, it will only advance the timing/events and try to yield immediately to the next ready thread, if there aren't any ready threads then it will be rescheduled and start its job again.
Replace all the C-style complicated buffer management with a std::deque.
In addition to making the code easier to understand it also adds support
for non-POD IdTypes.
Also clean the rest of the code to follow our code style.
This handle manager more closely mirrors the behaviour of the CTR-OS
one. In addition object ref-counts and support for DuplicateHandle have
been added.
Note that support for DuplicateHandle is still experimental, since parts
of the kernel still use Handles internally, which will likely cause
troubles if two different handles to the same object are used to e.g.
wait on a synchronization primitive.