Thank you for sharing this info. I just experienced the same thing but wouldn't have figured out a solution without this thread. I've submitted FB12508762 so this can be improved.
Thanks to everyone who has read my post and has been debugging the deadlock! Just a few days ago the source packages for Ventura 13.3 were published, and among them I found webdavfs-395. The following diff is one that I'm very happy to see:
https://github.com/apple-oss-distributions/webdavfs/commit/b7756b02549929bb18062ebcd76f0bbb75a120cb
That change does in fact target exactly the use-case of calling ubc_msync() on mmap'ed files at clean-up, but unfortunately real-world testing of a fresh Ventura 13.5b4 still triggers the deadlock despite including webdavfs-395.
In other words, there still seems to be a code path that goes through webdav_vnop_pageout, determines that is_open == TRUE, and ends up calling webdav_vnop_close -> ... -> webdav_fsync -> ubc_msync, which means the newly introduced flag WEBDAV_PAGEOUT_CLOSE_IN_RECLAIM wasn't set in webdav_vnop_pageout. So is vnode_isrecycled not really the relevant check, or is the page-out function called multiple times (either recursively or in sequence, e.g. due to webdav_unmount flushing twice), or is it something else?
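To make the suspected control flow easier to reason about, here is a small user-space model of it. This is only a sketch and not the webdavfs source: struct model_node and the model_* functions are hypothetical names, and it assumes (based on my reading of the diff) that the flag is derived from a vnode_isrecycled()-style check and that setting it makes the close path skip the msync. Instead of actually blocking, the model counts how often an msync is attempted while a page-out is still in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; NOT the real XNU/webdavfs structure. */
struct model_node {
    bool is_open;        /* file is still open through the mmap reference */
    bool is_recycled;    /* what a vnode_isrecycled()-style check would report */
    int  pageout_depth;  /* > 0 while a page-out is in flight */
    int  blocked_msyncs; /* msync attempts made while a page-out was in flight */
};

/* Models ubc_msync: in the kernel this would wait on a lock that the
 * in-flight page-out already holds. Here the attempt is only recorded. */
void model_msync(struct model_node *n)
{
    if (n->pageout_depth > 0) {
        n->blocked_msyncs++;
    }
}

/* Models the close path (close -> fsync -> msync). The skip_msync argument
 * plays the role of the assumed effect of WEBDAV_PAGEOUT_CLOSE_IN_RECLAIM. */
void model_close(struct model_node *n, bool skip_msync)
{
    if (!skip_msync) {
        model_msync(n);
    }
    n->is_open = false;
}

/* Models the page-out path: if the node is still open, it is closed from
 * inside the page-out. The flag is set only when the recycled check is true,
 * which is the assumption under test. */
void model_pageout(struct model_node *n)
{
    n->pageout_depth++;
    if (n->is_open) {
        bool close_in_reclaim = n->is_recycled;
        model_close(n, close_in_reclaim);
    }
    assert(n->pageout_depth > 0);
    n->pageout_depth--;
}

/* True if the model ever tried to msync from inside a page-out. */
bool model_would_deadlock(const struct model_node *n)
{
    return n->blocked_msyncs > 0;
}
```

Calling model_pageout() on a node with is_open set and is_recycled clear makes model_would_deadlock() return true, while the same call with is_recycled set does not. Under these assumptions, the stack below would be explained by a page-out reaching an open node that the recycled check does not cover.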
My test case remains the same as above, and for completeness I'm including a fresh thread dump from the deadlocked unmount thread on Ventura 13.5b4:
Date/Time: 2023-07-03 10:35:06.796 +0200
End time: 2023-07-03 10:35:16.801 +0200
OS Version: macOS 13.5 (Build 22G5059d)
Architecture: arm64e
Report Version: 40
Data Source: Stackshots
Shared Cache: 725CB32F-D723-38F2-8952-4D21C1FD290B slid base address 0x198a8c000, slide 0x18a8c000 (System Primary)
Shared Cache: D6EB184C-4628-3C49-9D21-5D5A97D08FDC slid base address 0x1bd9e8000, slide 0x3d9e8000 (DriverKit)
Shared Cache: 4E3FAD7E-E5B0-35FD-BF81-F0E22E907F07 slid base address 0x7ff8084cc000, slide 0x84cc000 (Rosetta)
Duration: 10.00s
Steps: 1001 (10ms sampling interval)
Hardware model: Macmini9,1
Active cpus: 8
HW page size: 16384
VM page size: 16384
[...]
Process: diskarbitrationd [537]
UUID: 9DF766CA-6596-3311-8A02-20FC83FD3A24
Path: /usr/libexec/diskarbitrationd
Codesigning ID: com.apple.diskarbitrationd
Shared Cache: 725CB32F-D723-38F2-8952-4D21C1FD290B slid base address 0x198a8c000, slide 0x18a8c000 (System Primary)
Architecture: arm64e
Parent: launchd [1]
UID: 0
Sudden Term: Tracked
Footprint: 2993 KB
Time Since Fork: 1033s
Num samples: 1001 (1-1001)
Note: 1 idle work queue thread omitted
Thread 0xf69 1001 samples (1-1001) priority 31 (base 31)
1001 _dispatch_sig_thread + 60 (libdispatch.dylib + 97144) [0x198cefb78]
1001 __sigsuspend_nocancel + 8 (libsystem_kernel.dylib + 34280) [0x198e535e8]
*1001 ??? (kernel.release.t8103 + 5219752) [0xfffffe00088aa5a8]
Thread 0x55ad 1001 samples (1-1001) priority 46 (base 31)
1001 thread_start + 8 (libsystem_pthread.dylib + 7584) [0x198e86da0]
1001 _pthread_start + 148 (libsystem_pthread.dylib + 28584) [0x198e8bfa8]
1001 ??? (diskarbitrationd + 99436) [0x102a2c46c]
1001 unmount + 8 (libsystem_kernel.dylib + 54736) [0x198e585d0]
*1001 ??? (kernel.release.t8103 + 30712) [0xfffffe00083b77f8]
*1001 ??? (kernel.release.t8103 + 1599252) [0xfffffe0008536714]
*1001 ??? (kernel.release.t8103 + 6321364) [0xfffffe00089b74d4]
*1001 ??? (kernel.release.t8103 + 2282292) [0xfffffe00085dd334]
*1001 ??? (kernel.release.t8103 + 2283096) [0xfffffe00085dd658]
*1001 vnode_iterate + 528 (kernel.release.t8103 + 2192876) [0xfffffe00085c75ec]
*1001 ??? (kernel.release.t8103 + 5437556) [0xfffffe00088df874]
*1001 ??? (kernel.release.t8103 + 987212) [0xfffffe00084a104c]
*1001 ??? (kernel.release.t8103 + 990312) [0xfffffe00084a1c68]
*1001 ??? (kernel.release.t8103 + 930160) [0xfffffe0008493170]
*1001 ??? (kernel.release.t8103 + 930448) [0xfffffe0008493290]
*1001 ??? (kernel.release.t8103 + 5806728) [0xfffffe0008939a88]
*1001 webdav_vnop_pageout + 452 (com.apple.filesystems.webdav + 17112) [0xfffffe000b37d708]
*1001 webdav_vnop_close + 64 (com.apple.filesystems.webdav + 9628) [0xfffffe000b37b9cc]
*1001 webdav_vnop_close_locked + 96 (com.apple.filesystems.webdav + 19916) [0xfffffe000b37e1fc]
*1001 webdav_close_mnomap + 264 (com.apple.filesystems.webdav + 20212) [0xfffffe000b37e324]
*1001 webdav_fsync + 416 (com.apple.filesystems.webdav + 20704) [0xfffffe000b37e510]
*1001 ubc_msync + 184 (kernel.release.t8103 + 5438968) [0xfffffe00088dfdf8]
*1001 ??? (kernel.release.t8103 + 987212) [0xfffffe00084a104c]
*1001 ??? (kernel.release.t8103 + 989672) [0xfffffe00084a19e8]
*1001 lck_rw_sleep + 132 (kernel.release.t8103 + 461616) [0xfffffe0008420b30]
*1001 ??? (kernel.release.t8103 + 555992) [0xfffffe0008437bd8]
*1001 ??? (kernel.release.t8103 + 562548) [0xfffffe0008439574]
I'm incredibly thankful that there's someone actively working on this problem, and please let me know if I can help in any way.