Thanks to everyone who has read my post and has been debugging the deadlock! A few days ago the source packages for Ventura 13.3 were published, and among them I found webdavfs-395. The following diff is one that I'm very happy to see:
https://github.com/apple-oss-distributions/webdavfs/commit/b7756b02549929bb18062ebcd76f0bbb75a120cb
That change does in fact target exactly the use case of calling ubc_msync() on mmap'ed files at cleanup, but unfortunately real-world testing on a fresh Ventura 13.5b4 system still triggers the deadlock, even though it includes webdavfs-395.
In other words, there still seems to be a code path that goes through webdav_vnop_pageout, determines that is_open == TRUE, and ends up calling webdav_vnop_close -> ... -> webdav_fsync -> ubc_msync, which means the newly introduced flag WEBDAV_PAGEOUT_CLOSE_IN_RECLAIM wasn't set in webdav_vnop_pageout. So is vnode_isrecycled not really the relevant check, or is the page-out function called multiple times (either recursively or in sequence, e.g. due to webdav_unmount flushing twice), or is it something else?
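To make the suspected path concrete, here is a rough user-space model of it in C. This is only a sketch of my reading, not the webdavfs source: every `model_` name is hypothetical, `recycled` merely stands in for whatever vnode_isrecycled reports, and `pager_lock_held` stands in for a lock that I assume is already held further up the stack when the inner ubc_msync is reached (the repeated kernel frame at +987212 in the dump below suggests this, but I have not confirmed it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a vnode; not the kernel's struct vnode. */
struct model_vnode {
    bool is_open;         /* file is still open (e.g. mmap'ed) at unmount */
    bool recycled;        /* stand-in for the vnode_isrecycled() result */
    bool pager_lock_held; /* assumed lock held by the page-out caller */
};

enum model_result { MODEL_OK, MODEL_WOULD_DEADLOCK };

/* Stand-in for ubc_msync: it would block if the lock is already held. */
static enum model_result model_msync(const struct model_vnode *vp)
{
    return vp->pager_lock_held ? MODEL_WOULD_DEADLOCK : MODEL_OK;
}

/* Stand-in for the close -> fsync chain: the flag skips the flush. */
static enum model_result model_close(const struct model_vnode *vp,
                                     bool close_in_reclaim)
{
    if (close_in_reclaim) {
        return MODEL_OK; /* flush skipped, so no nested msync */
    }
    return model_msync(vp);
}

/* Stand-in for the page-out entry point reached during unmount. */
static enum model_result model_pageout(struct model_vnode *vp)
{
    assert(vp != NULL);
    enum model_result result = MODEL_OK;

    vp->pager_lock_held = true; /* assumption: caller already holds it */
    if (vp->is_open) {
        /* The flag is only derived from the recycled state here. */
        bool close_in_reclaim = vp->recycled;
        result = model_close(vp, close_in_reclaim);
    }
    vp->pager_lock_held = false;
    return result;
}
```

Calling `model_pageout` on a vnode with `is_open` and `recycled` both true returns `MODEL_OK`, while the same call with `recycled` false returns `MODEL_WOULD_DEADLOCK`. If this model is anywhere near the real behaviour, the deadlock on 13.5b4 would mean the vnode is not reported as recycled at the moment page-out runs during unmount.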
My test case remains the same as above, and for completeness I'm including a fresh thread dump from the deadlocked unmount thread on Ventura 13.5b4:
Date/Time: 2023-07-03 10:35:06.796 +0200
End time: 2023-07-03 10:35:16.801 +0200
OS Version: macOS 13.5 (Build 22G5059d)
Architecture: arm64e
Report Version: 40
Data Source: Stackshots
Shared Cache: 725CB32F-D723-38F2-8952-4D21C1FD290B slid base address 0x198a8c000, slide 0x18a8c000 (System Primary)
Shared Cache: D6EB184C-4628-3C49-9D21-5D5A97D08FDC slid base address 0x1bd9e8000, slide 0x3d9e8000 (DriverKit)
Shared Cache: 4E3FAD7E-E5B0-35FD-BF81-F0E22E907F07 slid base address 0x7ff8084cc000, slide 0x84cc000 (Rosetta)
Duration: 10.00s
Steps: 1001 (10ms sampling interval)
Hardware model: Macmini9,1
Active cpus: 8
HW page size: 16384
VM page size: 16384
[...]
Process: diskarbitrationd [537]
UUID: 9DF766CA-6596-3311-8A02-20FC83FD3A24
Path: /usr/libexec/diskarbitrationd
Codesigning ID: com.apple.diskarbitrationd
Shared Cache: 725CB32F-D723-38F2-8952-4D21C1FD290B slid base address 0x198a8c000, slide 0x18a8c000 (System Primary)
Architecture: arm64e
Parent: launchd [1]
UID: 0
Sudden Term: Tracked
Footprint: 2993 KB
Time Since Fork: 1033s
Num samples: 1001 (1-1001)
Note: 1 idle work queue thread omitted
Thread 0xf69 1001 samples (1-1001) priority 31 (base 31)
1001 _dispatch_sig_thread + 60 (libdispatch.dylib + 97144) [0x198cefb78]
1001 __sigsuspend_nocancel + 8 (libsystem_kernel.dylib + 34280) [0x198e535e8]
*1001 ??? (kernel.release.t8103 + 5219752) [0xfffffe00088aa5a8]
Thread 0x55ad 1001 samples (1-1001) priority 46 (base 31)
1001 thread_start + 8 (libsystem_pthread.dylib + 7584) [0x198e86da0]
1001 _pthread_start + 148 (libsystem_pthread.dylib + 28584) [0x198e8bfa8]
1001 ??? (diskarbitrationd + 99436) [0x102a2c46c]
1001 unmount + 8 (libsystem_kernel.dylib + 54736) [0x198e585d0]
*1001 ??? (kernel.release.t8103 + 30712) [0xfffffe00083b77f8]
*1001 ??? (kernel.release.t8103 + 1599252) [0xfffffe0008536714]
*1001 ??? (kernel.release.t8103 + 6321364) [0xfffffe00089b74d4]
*1001 ??? (kernel.release.t8103 + 2282292) [0xfffffe00085dd334]
*1001 ??? (kernel.release.t8103 + 2283096) [0xfffffe00085dd658]
*1001 vnode_iterate + 528 (kernel.release.t8103 + 2192876) [0xfffffe00085c75ec]
*1001 ??? (kernel.release.t8103 + 5437556) [0xfffffe00088df874]
*1001 ??? (kernel.release.t8103 + 987212) [0xfffffe00084a104c]
*1001 ??? (kernel.release.t8103 + 990312) [0xfffffe00084a1c68]
*1001 ??? (kernel.release.t8103 + 930160) [0xfffffe0008493170]
*1001 ??? (kernel.release.t8103 + 930448) [0xfffffe0008493290]
*1001 ??? (kernel.release.t8103 + 5806728) [0xfffffe0008939a88]
*1001 webdav_vnop_pageout + 452 (com.apple.filesystems.webdav + 17112) [0xfffffe000b37d708]
*1001 webdav_vnop_close + 64 (com.apple.filesystems.webdav + 9628) [0xfffffe000b37b9cc]
*1001 webdav_vnop_close_locked + 96 (com.apple.filesystems.webdav + 19916) [0xfffffe000b37e1fc]
*1001 webdav_close_mnomap + 264 (com.apple.filesystems.webdav + 20212) [0xfffffe000b37e324]
*1001 webdav_fsync + 416 (com.apple.filesystems.webdav + 20704) [0xfffffe000b37e510]
*1001 ubc_msync + 184 (kernel.release.t8103 + 5438968) [0xfffffe00088dfdf8]
*1001 ??? (kernel.release.t8103 + 987212) [0xfffffe00084a104c]
*1001 ??? (kernel.release.t8103 + 989672) [0xfffffe00084a19e8]
*1001 lck_rw_sleep + 132 (kernel.release.t8103 + 461616) [0xfffffe0008420b30]
*1001 ??? (kernel.release.t8103 + 555992) [0xfffffe0008437bd8]
*1001 ??? (kernel.release.t8103 + 562548) [0xfffffe0008439574]
I'm incredibly thankful that there's someone actively working on this problem, and please let me know if I can help in any way.
Thank you for sharing this info. I just experienced the same thing but wouldn't have figured out a solution without this thread. I've submitted FB12508762 so this can be improved.