Does this behavior differ between the x86_64 and arm64e kernels?
There are a number of issues that you should consider when deciding whether to use floating point math or AltiVec vector math in the kernel.
First, the kernel takes a speed penalty whenever floating-point math or AltiVec instructions are used in a system call context (or other similar mechanisms where a user thread executes in a kernel context), as floating-point and AltiVec registers are only maintained when they are in use.
Note: In cases where AltiVec or floating point has already been used in user space in the calling thread, there is no additional penalty for using them in the kernel. Thus, for things like audio drivers, the above does not apply.
The last time I looked at this, the kernel had a lazy mechanism for saving and restoring the non-general-purpose registers. Let’s focus on floating point for the moment. On entry to the kernel the system disables the FPU. If your kernel code accesses the FPU, it traps within the kernel, which saves the FPU state to the user thread’s context, clears the registers, and then returns to your kernel code. The FPU state then has to be restored as you leave the kernel.
In general, you should avoid using floating-point math or AltiVec instructions in the kernel unless doing so will result in a significant speedup. It is not forbidden, but is strongly discouraged.
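To make the lazy save and restore concrete, here is a minimal user-space sketch of the sequence described above: disable the FPU on kernel entry, trap on first use, save and clear the registers, and restore on exit. This is a toy model and not kernel code; every name in it (`toy_fpu`, `toy_thread`, `toy_kernel_enter`, and so on) is hypothetical, and it only mirrors my recollection of the mechanism, not what any particular kernel version actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_FP_REGS 4

/* Toy model only: none of these names exist in a real kernel. */
typedef struct {
    double regs[NUM_FP_REGS];  /* live FPU registers */
    bool enabled;              /* false means the next FP access traps */
} toy_fpu;

typedef struct {
    double saved_fp[NUM_FP_REGS];  /* user FP state kept in the thread context */
    bool fp_saved;                 /* true once kernel code has touched the FPU */
    int traps;                     /* number of FPU-unavailable traps taken */
} toy_thread;

/* Kernel entry: disable the FPU instead of saving its state eagerly. */
void toy_kernel_enter(toy_fpu *fpu, toy_thread *t) {
    fpu->enabled = false;
    t->fp_saved = false;
}

/* Trap handler: save user state, clear the registers, re-enable the FPU. */
static void toy_fpu_trap(toy_fpu *fpu, toy_thread *t) {
    memcpy(t->saved_fp, fpu->regs, sizeof fpu->regs);
    memset(fpu->regs, 0, sizeof fpu->regs);
    t->fp_saved = true;
    t->traps++;
    fpu->enabled = true;
}

/* Kernel code writing an FP register: only the first access pays for a trap. */
void toy_kernel_fp_write(toy_fpu *fpu, toy_thread *t, int reg, double value) {
    if (!fpu->enabled) {
        toy_fpu_trap(fpu, t);
    }
    fpu->regs[reg] = value;
}

/* Kernel exit: restore user FP state only if the kernel actually used the FPU. */
void toy_kernel_exit(toy_fpu *fpu, toy_thread *t) {
    if (t->fp_saved) {
        memcpy(fpu->regs, t->saved_fp, sizeof fpu->regs);
        t->fp_saved = false;
    }
    fpu->enabled = true;
}
```

In this model a system call that never touches floating point costs nothing extra, while one that does pays for a trap, a save, and a restore. That is the speed penalty the quoted text is warning about.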