Easy to deadlock with new proc_iterate

Ever since 10.15.5 (I think it was) brought in the new proc_lock_ APIs, it has been quite easy to deadlock when namei() lookups and mount run at the same time.

Stack 1

       *1000  unix_syscall64 + 698 (kernel.development + 9558170) [0xffffff8000b1d89a]
         *1000  lstat64 + 47 (kernel.development + 4947279) [0xffffff80006b7d4f]
           *1000  fstatat_internal + 327 (kernel.development + 4944567) [0xffffff80006b72b7]
             *1000  nameiat + 117 (kernel.development + 4919557) [0xffffff80006b1105]
               *1000  namei + 3857 (kernel.development + 4813841) [0xffffff8000697411]
                 *1000  lookup + 1842 (kernel.development + 4817810) [0xffffff8000698392]
                   *1000  lookup_handle_found_vnode + 677 (kernel.development + 4814677) [0xffffff8000697755]
                     *1000  vfs_busy + 79 (kernel.development + 4847775) [0xffffff800069f89f]
                       *1000  IORWLockRead + 738 (kernel.development + 3527154) [0xffffff800055d1f2]

Stack 2

            1000  mount + 10 (libsystem_kernel.dylib + 41114) [0x7fff72fc109a]
             *1000  hndl_unix_scall64 + 22 (kernel.development + 1622534) [0xffffff800038c206]
               *1000  unix_syscall64 + 698 (kernel.development + 9558170) [0xffffff8000b1d89a]
                 *1000  mount + 78 (kernel.development + 4901838) [0xffffff80006acbce]
                   *1000  __mac_mount + 1330 (kernel.development + 4903186) [0xffffff80006ad112]
                     *1000  mount_common + 4860 (kernel.development + 4897964) [0xffffff80006abcac]
                       *1000  checkdirs + 115 (kernel.development + 4901059) [0xffffff80006ac8c3]
                         *1000  proc_iterate + 892 (kernel.development + 8110892) [0xffffff80009bc32c]
                           *1000  checkdirs_callback + 139 (kernel.development + 4901547) [0xffffff80006acaab]
                             *1000  IORWLockWrite + 1240 (kernel.development + 3528664) [0xffffff800055d7d8]

The mount call will vfs_busy(), then wait for proc_dirs_lock_exclusive() (IORWLockWrite in checkdirs_callback()). The stat call, on the other hand, grabs proc_dirs_lock_shared() in namei(); then, because it needs to cross a mountpoint, it calls lookup_traverse_mountpoints(), which calls vfs_busy() (IORWLockRead). Each thread holds the lock the other is waiting for: a classic A-B, B-A deadlock.
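To make the ordering problem concrete, here is a minimal sketch of a lock-order check, written as a toy model in Python rather than kernel code. The lock labels "mount_busy" and "proc_dirs_lock" are just names I picked for the two locks described above, and the helpers order_edges() and find_cycle() are hypothetical, not anything in XNU. It only records which lock each path takes while holding another and reports a cycle if the orders conflict.

```python
from collections import defaultdict

# Toy model: each path lists the locks it acquires, in order.
# The lock names are illustrative labels, not real kernel objects.
PATHS = {
    "mount": ["mount_busy", "proc_dirs_lock"],   # vfs_busy, then dirs lock (exclusive)
    "lstat": ["proc_dirs_lock", "mount_busy"],   # dirs lock (shared), then vfs_busy
}


def order_edges(paths):
    """Return {held: {wanted, ...}} for every lock taken while another is held."""
    edges = defaultdict(set)
    for acquisitions in paths.values():
        for i, held in enumerate(acquisitions):
            for wanted in acquisitions[i + 1:]:
                edges[held].add(wanted)
    return edges


def find_cycle(edges):
    """Depth-first search; return one cycle as a list of lock names, or None."""
    def visit(node, stack):
        if node in stack:
            # Reached a lock already on the current path: ordering cycle.
            return stack[stack.index(node):] + [node]
        for nxt in sorted(edges.get(node, ())):
            found = visit(nxt, stack + [node])
            if found:
                return found
        return None

    for start in sorted(edges):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(order_edges(PATHS)))
```

With the two paths above, the check reports the cycle mount_busy -> proc_dirs_lock -> mount_busy, which is the A-B, B-A pattern in the two stacks. It does not model shared versus exclusive modes, so it is a conservative approximation at best.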

I am having a hard time either 1) avoiding it or 2) detecting that it is about to happen, since everything is opaque, and settings like NOCROSSMNT are not something I can set.
