st_dev of mount point directory is different to device ID of device-file

I have an NTFS volume mounted at '/Volumes/usb_vol':

#mount
Filesystem       Mounted on
/dev/disk5s1     /Volumes/usb_vol

The following simple code reports different device IDs for the device file and the mount point directory:

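#include <sys/types.h>  // major() and minor() on macOS
#include <sys/stat.h>
#include <iostream>

int main(int argc, char* argv[])
{
    struct stat buf;

    for (int i = 1; i < argc; i++)
    {
        std::cout << argv[i] << std::endl;

        if (stat(argv[i], &buf) < 0)
        {
            continue;
        }

        // std::hex is sticky, so switch back to std::dec before the next iteration.
        if (S_ISBLK(buf.st_mode))
        {
            std::cout << "st_rdev (" << major(buf.st_rdev) << "/" << minor(buf.st_rdev) << ") hex: " << std::hex << buf.st_rdev << std::dec << std::endl;
        }
        else
        {
            std::cout << "st_dev (" << major(buf.st_dev) << "/" << minor(buf.st_dev) << ") hex: " << std::hex << buf.st_dev << std::dec << std::endl;
        }
    }

    return 0;
}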

Output:

/dev/disk5s1
st_rdev (1/22) hex: 1000016

/Volumes/usb_vol
st_dev (48/119) hex: 30000077

I believe this is expected, but I have not found any explanation of this behaviour. Is there an explanation for the difference between these values?

Can I assume that stat() will report (48/119) for all objects located on this file system?

Thank you for the help!

Answered by DTS Engineer in 818928022

I have an NTFS volume mounted at '/Volumes/usb_vol'

What mounted this volume? Was it our read-only driver or was it a 3rd party read/write driver?

I believe this is expected

Expected is such a tricky word... I am both:

  • Surprised, in that this is definitely not what I would have expected stat to return.

  • Not Surprised, in that I know that the values returned by stat (and similar functions) are VERY loosely defined, to the point that stat can (in theory) basically return "anything".

but I have not found any explanation of this behaviour. Is there an explanation for the difference between these values?

I haven't looked into it in detail, but I suspect it's caused by one of two things:

  1. If it's our driver, then it's probably a side effect of the user land VFS driver. We've been moving more of our file systems over to that system and that transition changes "details" of how the file system "presents" itself to the higher level system. This is the first time I've heard of this, but there are other places where this is visible.

  2. If it's a 3rd party driver, then it can basically return anything it wants.

Can you share its entry from the volume list when you run "mount" in Terminal? That might provide more detail about what's actually going on. Also, what does "statfs()" return when run on the same path?

Can I assume that stat() will report (48/119) for all objects located on this file system?

Well, yes. That is, the values of st_dev and st_rdev will be identical for any objects located on the same filesystem. However, this is basically a truism, as part of what defines objects as "being on the same volume" is those values matching.
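A minimal sketch of that equality check, not taken from the thread: it compares st_dev for two paths and deliberately never decodes the value into major/minor numbers. The name onSameFilesystem is hypothetical, and the sketch assumes a POSIX system and C++17.

```cpp
#include <sys/stat.h>
#include <cassert>   // assert() for the usage check mentioned below
#include <optional>
#include <string>

// Hypothetical helper: reports whether two paths live on the same filesystem.
// Returns std::nullopt if either stat() call fails. st_dev is only compared
// for equality; it is treated as an opaque value and never interpreted.
std::optional<bool> onSameFilesystem(const std::string& a, const std::string& b)
{
    struct stat sa {};
    struct stat sb {};
    if (stat(a.c_str(), &sa) != 0 || stat(b.c_str(), &sb) != 0)
    {
        return std::nullopt;
    }
    return sa.st_dev == sb.st_dev;
}
```

For instance, assert(onSameFilesystem(".", ".").value_or(false)) should hold wherever the current directory can be stat'ed. Returning an optional keeps "stat failed" distinct from "different filesystems".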

With all that context, the big question here is "What are you actually trying to do?". In my experience, stat() is rarely the "right" API choice. For higher level apps focused on "files", it's better/safer/faster to get the same information through our higher level APIs. For lower level apps that are "volume" focused, I think you're often better off moving "down" an API layer, typically into DiskArbitration, sometimes IOKit. The advantage of the lower level APIs is harder to summarize, but it's things like:

  • They give you access to a much broader information set.

  • They're closer to (or "are") "the truth", which makes it easier to understand "what's actually going on".

As a concrete example, things like software RAID or APFS volume containers mean that you can't reliably determine the relationship between different volumes based on device paths. In concrete terms, it's entirely possible that a volume at "/dev/rdisk3s2" and another at "/dev/rdisk4s3" are in fact located on exactly the same physical device. Detecting that relationship is straightforward in IOKit and basically impossible above that layer.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hello,

Sorry for the delayed reply (Christmas and a few weeks of holidays). The volume is mounted read-only by the default driver.

Can you share its entry from the volume list when you run "mount" in Terminal? That might provide more detail about what's actually going on.

# df /dev/disk4s1    
Filesystem   512-blocks  Used Available Capacity iused ifree %iused  Mounted on
/dev/disk4s1   30715832 93776  30622056     1%       1     0  100%   /Volumes/CCCOMA_X64FRE_RU-RU_DV9

# mount  | grep disk4s1 
/dev/disk4s1 on /Volumes/CCCOMA_X64FRE_RU-RU_DV9 (ntfs, local, nodev, nosuid, read-only, noowners, noatime, fskit)

That is, the values of st_dev and st_rdev will be identical for any objects located on the same filesystem.

Thank you a lot for this confirmation! I can implement my task based on this info.

Thank you a lot for your help!

So, starting with the mount itself:

# mount  | grep disk4s1 
/dev/disk4s1 on /Volumes/CCCOMA_X64FRE_RU-RU_DV9 (ntfs, local, nodev, nosuid, read-only, noowners, noatime, fskit)

The "fskit" at the end indicates what I said here:

  1. If it's our driver, then it's probably a side effect of the user land VFS driver. We've been moving more of our file systems over to that system and that transition changes "details" of how the file system "presents" itself to the higher level system. This is the first time I've heard of this, but there are other places where this is visible.

...is in fact what's going on here.

Thank you a lot for this confirmation! I can implement my task based on this info.

You haven't said what you're actually trying to do, but there are some points I want to expand on from my earlier reply:

Not Surprised, in that I know that the values returned by stat (and similar functions) are VERY loosely defined, to the point that stat can (in theory) basically return "anything".

Making this explicit, it is a programming error to assume that there is ANY inherent relationship between the value returned by st_dev and the value returned by st_rdev. It has HAPPENED to be the case that our block storage VFS drivers tended to use the disk/rdisk pattern, but that was simply a side effect of their implementation, NOT any kind of requirement of the system.

My broader recommendation here is actually that these values should basically be treated as opaque values and not really "interpreted" in any way.

In my experience, stat() is rarely the "right" API choice. For higher level apps focused on "files", it's better/safer/faster to get the same information through our higher level APIs. For lower level apps that are "volume" focused, I think you're often better off moving "down" an API layer, typically into DiskArbitration, sometimes IOKit.

Expanding on what I said here, the basic issue is that the "stat" API is VERY "awkwardly" placed within the broader system architecture. For "low level" use cases (for example, opening and writing dev nodes) it doesn't really map to the underlying hardware. For example, if you want to map volumes to physical devices, that cannot be done through stat. It LOOKS like it could be done by taking the st_rdev (/dev/disk4s1 -> slice) and dropping the slice number(s) to find the whole device (/dev/disk4 -> whole device), but that will give you the wrong answer in a variety of edge cases.
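To make the tempting shortcut concrete, here is a sketch of the name-stripping heuristic described above. It is illustrative only: naiveWholeDiskName is a hypothetical helper, it assumes BSD names of the form disk<N>s<M>, and the name it returns is not guaranteed to be the physical device (RAID and APFS containers are the cases already mentioned in this thread).

```cpp
#include <cassert>   // assert() for the usage check mentioned below
#include <cctype>
#include <optional>
#include <string>

// Hypothetical helper showing the naive heuristic: "disk4s1" -> "disk4".
// It only manipulates the name. It knows nothing about what actually backs
// the disk, so the result may not be the real hardware parent.
std::optional<std::string> naiveWholeDiskName(const std::string& bsdName)
{
    const std::string prefix = "disk";
    if (bsdName.compare(0, prefix.size(), prefix) != 0)
    {
        return std::nullopt;  // not of the assumed disk<N>... form
    }

    // Consume the disk number that follows the prefix.
    std::size_t i = prefix.size();
    const std::size_t digitsStart = i;
    while (i < bsdName.size() && std::isdigit(static_cast<unsigned char>(bsdName[i])))
    {
        ++i;
    }
    if (i == digitsStart)
    {
        return std::nullopt;  // "disk" with no number
    }

    // Everything after the disk number (the slice suffixes) is dropped.
    return bsdName.substr(0, i);
}
```

Under these assumptions, assert(naiveWholeDiskName("disk4s1").value() == "disk4") holds, which is exactly why the shortcut looks plausible even though the answer is only a name.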

For "high level" use cases, the stat layer is harder to use and provides less information than NSURL. What most apps really need/want is a "standardized" concept of what a volume "is" that gives the app access to all the relevant information in a single API. That's exactly what our higher level APIs were designed to do. For example, if you want to determine if two objects are on the same volume, you can do that with NSURL by retrieving "NSURLVolumeIdentifierKey" on both objects and then comparing the returned identifiers. That can be done via stat as well; however, with NSURL you can also retrieve NSURLVolumeNameKey and NSURLVolumeLocalizedNameKey*, neither of which can really be accessed through stat. More broadly, most of the data returned by the NSURL API can be retrieved through other APIs, but those APIs span multiple frameworks and abstraction layers. For example, most of the "Supports" keys (like NSURLVolumeSupportsFileCloningKey) come from getattrlist, but a key like "NSURLVolumeIsEjectableKey" actually comes from IOKit.

*The fact that volume names can be localized is Yet Another Easily Overlooked Edge Case™, except this time I don't think there's any lower level API that will get you the data.

Lastly, one side note on performance. This forum post goes into the details, but the summary is that if you're working with a large file hierarchy, our high level iteration APIs are faster and easier to use than any straightforward implementation you're likely to write.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

In my case, stat struct is provided by Endpoint Security message.

It looks like the best solution is to use the DiskArbitration framework. DiskArbitration can be used to find the relationship between a mount point (DAVolumePath) and a physical device (DADiskGetBSDName).

If st_dev and st_rdev are identical for any objects located on the same filesystem, the mount point directory has the same st_dev and st_rdev too. So, a bsdName can be found for any object. And finally, the device can be found in the I/O Registry by bsdName.

Thank you a lot for detailed answer!!!

In my case, stat struct is provided by Endpoint Security message.

Ahh, that makes sense then.

One note here:

It looks like the best solution is to use the DiskArbitration framework.

I have to confess, I'm a bit of a fan boy™ when it comes to the DiskArb framework, having had the (dubious) pleasure of using its original (and AWFUL) private API before the public API was introduced in 10.4. The API looks a bit "dated" by modern standards, but it's very good at what it does.

DiskArbitration can be used to find the relationship between a mount point (DAVolumePath) and a physical device (DADiskGetBSDName).

The main thing to be aware of (and take advantage of) is that DiskArbitration can "find out" about (and block) actions "before" they happen, meaning "before" the EndpointSecurity framework is actually involved. For example, if you wanted to control/prevent mounts, the best way (IMHO) would be to:

  1. Use DARegisterDiskMountApprovalCallback() to block the DiskArb mount.

  2. Use the ES auth callback as the "backstop" if #1 is bypassed (for example, by someone directly calling "mount()").

This approach gives a "smoother" and less system-disruptive UI experience, since the system fully expects that #1 can occur, which isn't really true for an arbitrary mount call.

Similarly, for physically attached (not remote) devices, DARegisterDiskAppearedCallback(::::) is triggered from IOKit (not the BSD side), so it should be called well before* ES becomes "aware" of the device. I'd actually use the appeared/disappeared callbacks to maintain a simple lookup table that maps dev nodes to my own object. That object would then track the DADisk and the data I was caching about that disk. The big thing I'd avoid here is "frequently" calling DADiskCopyDescription. The data returned isn't going to change unless the device/mount state changes, and the function requires an IPC to diskarbitrationd. You definitely don't want to be doing that in every open auth.

*I'm hedging here because I think it MIGHT be possible to force a scenario where the DA callback was delayed, but it's not a normal/realistic scenario.
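The lookup table idea can be sketched roughly as below. This is only an outline under stated assumptions: DiskInfo, DiskTable and their members are hypothetical names, the DiskArbitration calls themselves are left out (the two "on..." methods are where the appeared/disappeared callbacks would feed in), and a mutex is included on the assumption that callbacks and auth handlers run on different threads.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

// Hypothetical cached record. A real client would fill this once, from the
// disk's description, when the disk appears or its mount state changes.
struct DiskInfo
{
    std::string volumePath;  // mount point, empty if not mounted
    std::string volumeKind;  // for example "ntfs"
};

// Hypothetical table keyed by BSD name (for example "disk4s1").
class DiskTable
{
public:
    // Intended to be called from the disk-appeared callback.
    void onAppeared(const std::string& bsdName, DiskInfo info)
    {
        assert(!bsdName.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        disks_[bsdName] = std::move(info);
    }

    // Intended to be called from the disk-disappeared callback.
    void onDisappeared(const std::string& bsdName)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        disks_.erase(bsdName);
    }

    // In-process lookup for hot paths such as an open auth handler:
    // no IPC, just a map search under the lock.
    std::optional<DiskInfo> find(const std::string& bsdName) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = disks_.find(bsdName);
        if (it == disks_.end())
        {
            return std::nullopt;
        }
        return it->second;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, DiskInfo> disks_;
};
```

The point of the design is that the expensive description fetch happens only when a disk appears or changes, while every auth decision reads the cached copy.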

A note here:

relationship between a mount point (DAVolumePath) and a physical device (DADiskGetBSDName).

FYI, firmlinks mean that it's possible for an object on one volume to exist "outside" its mount point. Their usage is currently quite limited, but it's something to be aware of when you think about these issues.

If st_dev and st_rdev are identical for any objects located on the same filesystem, the mount point directory has the same st_dev and st_rdev too.

Yes, however, I'm afraid it's "messier" than that. The big issue here is union mounts, where multiple volumes mount at the same mount point "target". The system's support for this is somewhat poor and buggy. For example, the shell doesn't properly "merge" the two directories' contents (so the root directory only shows the contents of the second volume) and the Finder completely refuses to display any directory on the "masked" volume. However, the contents are in fact mounted and can be accessed even while "hidden".

Making that concrete, if you start with two volumes which each contain a single directory, you can union mount them like this:

sudo mount -t apfs -o union,nobrowse <bsd path 1> <mountpoint> 
sudo mount -t apfs -o union,nobrowse <bsd path 2> <mountpoint> 

You'd expect this output:

[SilverBrick:~/mountpoint] kevine% ls
Vol1/ Vol2/

But you'll actually get this:

[SilverBrick:~/mountpoint] kevine% ls
Vol2/

However, the crazy part is that if you do this:

[SilverBrick:~/mountpoint] kevine% cd Vol1
~/mountpoint/Vol1
[SilverBrick:~/mountpoint/Vol1] kevine% ls
<contents of the first volume's "Vol1">

In other words, the contents of the first volume aren't displaying correctly at the mountpoint, but they are still accessible if you know what's going on.

So, a bsdName can be found for any object. And finally, the device can be found in the I/O Registry by bsdName.

DiskArb will give you this object with DADiskCopyIOMedia(_:) and the underlying whole device using DADiskCopyWholeDisk(_:). However, keep in mind that APFS and RAID both make determining the underlying hardware parent(s) more complicated than simply retrieving the whole parent. If that's something you want to do, then that would probably be worth a new thread.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thank you a lot for sharing knowledge. These are very valuable tips which will save a lot of my time.

Thank you again!

Thank you again!

You're very welcome.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
