For clarity, let me call the ARM machine the client system and the x86 machine the server system.
What are the network conditions between these two machines? What are the latency and bandwidth of the connection? Note that latency in particular has a huge effect here.
I am deleting a directory tree on the server system from a Java application running on the client system. Java uses basic system calls (rmdir and unlink) to delete items.
Just to clarify, where are those directory commands actually being issued? Are you:
- Calling rmdir/unlink on the Mac, targeting the files in the SMB mount.
OR
- Telling your server app to delete those files "directly" and then viewing the changes through the SMB mount on the Mac?
Note that while race conditions are possible in both cases, they're all but guaranteed in the second.
Jumping to here:
It appears that there is a race condition. The operation to delete S apparently succeeded, but did not take effect immediately. The operation to delete D somehow overtook the previous operation and failed as a result.
What's the actual SMB server? More specifically, is it a Windows machine? I'm not sure how widely it's being used*, but the SMB2/3 delete works by:
*This was part of SMB2 which isn't exactly "new".
smb2fs_smb_delete(struct smb_share *share, struct smbnode *np, enum vtype vnode_type,
...
/*
* Looking at Win <-> Win with SMB 2/3, delete is handled by opening the file
* with Delete and "Read Attributes", then a Set Info is done to set
* "Delete on close", then a Close is sent.
*/
...
That's a fairly elegant approach, but I believe it can mean that a delete ends up being "deferred" because some other process/client has the directory open.
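When a delete has been deferred like that, the visible symptom on the client is often an rmdir that fails with ENOTEMPTY (or EEXIST on some systems) even though the directory "looks" empty. One common mitigation is to retry the rmdir briefly and give the pending "delete on close" time to land. A hedged sketch; the function name, retry count, and delay are arbitrary choices of mine, not values from any Apple source:

```c
// Retry rmdir(2) a few times when it fails with ENOTEMPTY/EEXIST,
// on the theory that a child's "delete on close" is still pending
// on the server and will complete shortly.
#include <errno.h>
#include <unistd.h>

static int rmdir_with_retry(const char *path, int attempts) {
    for (int i = 0; i < attempts; i++) {
        if (rmdir(path) == 0)
            return 0;
        if (errno != ENOTEMPTY && errno != EEXIST)
            return -1;                 // some other failure; give up now
        usleep(100 * 1000);            // 100 ms pause before trying again
    }
    return -1;                         // errno from the last rmdir is preserved
}
```

This only papers over the race, of course; it doesn't remove it, and it does nothing for the case where another client holds the file open indefinitely.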
However, please keep in mind that this is only one example among many. It's simply part of the nature of network file systems that it's nearly impossible to create one that:
- "Feels" like a local file system under normal usage conditions.
- Doesn't exhibit "weird behavior" under specific conditions and/or when monitored more closely.
In the category of weird, I'm not sure how these connect to each other:
A few seconds later, directory S disappeared (as viewed in Terminal on the server system)!
...
If I unmount the volume and reconnect, I see the same bad state in Terminal. Listing D shows S. Listing S gets the fts_read error.
Are you saying that the server and the client are persistently showing inconsistent results, particularly across unmounts? The unmount is important here because, as far as the system is concerned, it basically "forgets" everything it knows about the previous volume state when it unmounts the volume. So any data it's showing came from the server*.
*Is the client mounting multiple shares from the server, particularly shares that "overlap", so the client can "see" the same directory through two different mountpoints? Things become more complicated when multiple shares are involved because the client doesn't "know" that both shares are from the same source.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware