creating files follow up / code critique

hello wise ones...


so I spent the last week reading about NSURL, NSData, NSFileHandle, and sandboxing. I have some code here that works... I'd like to share it with you to see if I am indeed following best practices, and if anyone has any suggestions as to how to improve upon it, I welcome all constructive criticism.


this app is just a test app to see if I can read from one file a user selects, and write it to a file the app creates in the destination of the user's choosing. I have three buttons and a textField... the buttons are "select file", "select folder" (destination), and "run". The textField allows the user to input the desired name of the file. In Xcode I have set the sandbox capabilities to allow read/write access to user-selected files.



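@interface RootViewController : NSViewController
{
    NSURL *readFileURL;
    NSURL *writeFileFolderURL;
}

// Outlet for the file-name text field (used below as _writeFileNameTextField).
@property (weak) IBOutlet NSTextField *writeFileNameTextField;

@end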



@implementation RootViewController

- (IBAction)selectFileButton:(NSButton *)sender
{
    NSOpenPanel *selectFile = [NSOpenPanel openPanel];
    [selectFile setCanChooseFiles:YES];
    [selectFile setCanChooseDirectories:NO]; // this panel picks a file, not a folder
    [selectFile setAllowsMultipleSelection:NO];
    [selectFile setPrompt:@"Select"];
    [selectFile setMessage:@"Select video file..."];

    // Only keep the URL if the user confirmed the panel.
    if ([selectFile runModal] == NSModalResponseOK) {
        readFileURL = [selectFile.URLs firstObject];
    }
}


- (IBAction)selectFolderButton:(NSButton *)sender
{
    NSOpenPanel *selectFolder = [NSOpenPanel openPanel];
    [selectFolder setCanChooseFiles:NO];
    [selectFolder setCanChooseDirectories:YES];
    [selectFolder setAllowsMultipleSelection:NO];
    [selectFolder setPrompt:@"Select"];
    [selectFolder setMessage:@"Select destination folder..."];

    // Only keep the URL if the user confirmed the panel.
    if ([selectFolder runModal] == NSModalResponseOK) {
        writeFileFolderURL = [selectFolder.URLs firstObject];
    }
}


- (IBAction)runButton:(NSButton *)sender
{
    NSError *readFileError = nil;
    NSFileHandle *readFileHandle = [NSFileHandle fileHandleForReadingFromURL:readFileURL error:&readFileError];

    // Test the returned handle, not the error pointer, to detect failure.
    if (!readFileHandle) {
        NSLog(@"Fail: %@", [readFileError localizedDescription]);
        return;
    }

    NSMutableData *readDataBuffer = [NSMutableData new];
    NSData *readDataLoad = [readFileHandle readDataOfLength:50]; // get first 50 (50 is just a random number for testing purposes)
    [readDataBuffer appendData:readDataLoad];
    readDataLoad = [readFileHandle readDataOfLength:50]; // get next 50
    [readDataBuffer appendData:readDataLoad];

    NSString *writeFileTempName = [_writeFileNameTextField stringValue]; // get user's desired name
    NSString *writeFileName = [writeFileTempName stringByAppendingString:@".mid"]; // append .mid
    NSURL *destinationURL = [writeFileFolderURL URLByAppendingPathComponent:writeFileName]; // append file name to URL

    NSError *writeError = nil; // create error pointer
    BOOL success = [readDataBuffer writeToURL:destinationURL options:0 error:&writeError]; // write to file
    if (!success) {
        NSLog(@"Fail: %@", [writeError localizedDescription]);
    }

    [readFileHandle closeFile];
}

@end


some questions:

1. I'm assuming that by setting my sandbox settings in XCode that the same settings will apply to my stand alone app, yes?

2. The way I am using readDataOfLength (which is a NSData object) and then handing it off to my readDataBuffer (which is an NSMutableData object) seems clunky. Is there a better way? The idea here is to slowly amass the data I need in the readDataBuffer, and then only have to writeToURL once at the end.

3. I'm using two different error pointers... does that matter, or should I just be using one pointer and be done with it?


Thanks as always for your help and guidance.

Accepted Reply

1. I'm assuming that by setting my sandbox settings in XCode that the same settings will apply to my stand alone app, yes?

I suspect I’m missing the point of your question. The settings you configure via the App Sandbox slice of the Capabilities tab are the sandbox settings for your standalone app. They don’t have anything to do with Xcode itself. Can you elaborate on your concern here?

2. The way I am using readDataOfLength (which is a NSData object) and then handing it off to my readDataBuffer (which is an NSMutableData object) seems clunky. Is there a better way? The idea here is to slowly amass the data I need in the readDataBuffer, and then only have to writeToURL once at the end.

Why are you doing that?

The normal reason that folks use an incremental read API (like -readDataOfLength:) is that they want to process an arbitrarily large file without using an arbitrary amount of memory. In your case, however, you’re taking those chunks of data and accumulating them in readDataBuffer, and so you’re going to use that much memory anyway. And if you’re going to do that, then you might as well just read all the data at once, using code like this:

NSData * readData = [NSData dataWithContentsOfURL:readFileURL];

3. I'm using two different error pointers... does that matter, or should I just be using one pointer and be done with it?

I would typically declare one error value and then reuse it where appropriate, but it doesn’t really matter.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Replies

Thanks eskimo... you've confirmed what I thought, that my stand alone app will have the capabilities I give it in that tab.


The reason I am just reading small bits of a file is because the file is huge... 30 GB, but could be as big as 60 GB (uncompressed video). So... I am just seeking through the file, finding the bits I need, doing some processes, and moving on.


One error... perfect.


Thank you!

I am just seeking through the file, finding the bits I need, doing some processes, and moving on.

Ah, yeah, in that case you need a streaming API.

NSFileHandle will work for this but whether that’s the best option depends on how your processing code works.

The issue with NSFileHandle is that it does no user-space buffering. If, for example, you read the file one byte at a time then each -readDataOfLength: call will result in a call to the read system call (documented in its man page). That is incredibly inefficient.

You have a couple of options here:

  • You can structure your processing code so that it can work on large buffers of data (option A). This can be quite tricky to get right if the data you’re looking for spans a buffer boundary.

  • You can use a read API that does user-space buffering (option B).

With regards option A, the API I recommend for streaming through a file of that magnitude is Dispatch I/O (to learn more, read the dispatch_io_create man page). It was specifically designed for this task.
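The buffer-boundary problem mentioned under option A can be sketched in plain C. This is only an illustration, not Dispatch I/O itself: it uses the read system call as a stand-in data source, and the helper name count_pattern and its parameters are made up for the example. The idea is to carry the last patlen - 1 bytes of each buffer over to the front of the next one, so a match that straddles two reads is still seen exactly once.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: counts occurrences of `pat` in the file behind `fd`,
 * reading at most `chunk` bytes per read() call.
 * Returns the count, or -1 on a bad argument, allocation failure or read error. */
long count_pattern(int fd, const unsigned char *pat, size_t patlen, size_t chunk)
{
    if (patlen == 0 || chunk == 0) return -1;

    size_t keep = patlen - 1;             /* tail bytes carried between reads */
    unsigned char *buf = malloc(keep + chunk);
    if (buf == NULL) return -1;

    size_t have = 0;                      /* carried-over bytes now in buf */
    long count = 0;

    for (;;) {
        ssize_t n = read(fd, buf + have, chunk);
        if (n < 0) { free(buf); return -1; }
        if (n == 0) break;                /* end of file */

        size_t total = have + (size_t)n;
        for (size_t i = 0; i + patlen <= total; i++) {
            if (memcmp(buf + i, pat, patlen) == 0) count++;
        }

        /* Keep the tail: it is shorter than the pattern, so nothing in it
         * has been counted yet, but it may start a match that finishes in
         * the next read. */
        have = total < keep ? total : keep;
        memmove(buf, buf + total - have, have);
    }

    free(buf);
    return count;
}
```

In real use the chunk size would be large (hundreds of kilobytes or more) so that the number of system calls stays small; the tiny sizes only make sense for testing the boundary logic.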

With regards option B, one good API for this is the C standard I/O library (see the stdio man page). It lets you read lines (fgets), read and parse based on a format string (fscanf), and read chunks (fread).
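For option B, a minimal sketch of seeking and reading chunks with stdio might look like the following. The helper names use_large_buffer and read_at are hypothetical, and the buffer size is an arbitrary choice for illustration. fseeko is used rather than fseek because it takes an off_t offset, which is 64 bits wide on macOS and so can address positions well beyond 4 GB.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: gives the stream a fully buffered stdio buffer of
 * `size` bytes. Must be called before any other operation on the stream.
 * Returns 0 on success, nonzero on failure (same as setvbuf). */
int use_large_buffer(FILE *fp, size_t size)
{
    return setvbuf(fp, NULL, _IOFBF, size);
}

/* Hypothetical helper: seeks to `offset` and reads up to `len` bytes into
 * `dst`. Returns the number of bytes actually read (0 if the seek fails). */
size_t read_at(FILE *fp, off_t offset, void *dst, size_t len)
{
    if (fseeko(fp, offset, SEEK_SET) != 0) return 0;
    return fread(dst, 1, len, fp);
}
```

A caller would open the file with fopen(path, "rb"), call use_large_buffer once, and then call read_at for each region of interest. Whether a large stdio buffer actually helps depends on the access pattern, so it is worth measuring.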

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Pay attention to what you are doing in those action methods. They are meant to be relatively quick. They execute on the main, user-interface thread. If you are processing 60 GB of data, that is going to lock up your app, display the "beach ball" cursor, and flag your app to the system as "non-responsive".


As eskimo suggests, Dispatch I/O would solve that problem as it forces you into asynchronous logic. But your code suggests typical programmer linear thinking. You will have to get your brain re-wired for asynchronous logic. And doing dispatch I/O is mind-numbingly complicated. The function calls themselves are documented. But there is no documentation or publicly-known best-practices for using it in real-world software.


You might find it easier to just spawn linear logic in a background thread. Look for the "dispatch_async" function. This is related to Dispatch I/O, both being part of GCD (Grand Central Dispatch), but much easier to use.


If this video data is binary, make sure to properly handle endianness. Also review default buffer sizes and buffering modes. stdio tends to assume you are reading and writing small text files with line buffering.
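One common way to handle endianness is to assemble multi-byte values from individual bytes with shifts, which gives the same result regardless of the CPU's native byte order. A small sketch, with made-up helper names (which byte order the video format actually uses is something to check in its specification):

```c
#include <stdint.h>

/* Hypothetical helper: decodes a 32-bit big-endian value from 4 bytes. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: decodes a 32-bit little-endian value from 4 bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

Reading through a byte pointer like this also avoids alignment problems that can come from casting a buffer address directly to uint32_t *.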


Finally, it is possible that you might need to do really low-level, raw I/O using file descriptors with read() and write(). You can combine this with both C and C++ stdio buffering. By default, many higher-level I/O functions, like NSFileHandle or even C stdio, may use memory-mapped files. These can sometimes be really slow because they essentially use the virtual memory system for I/O. That works well when you need to read most of a relatively small file with random access. But it may not be appropriate for very large files being used the way you describe. It has been many years since I encountered this problem (before massive amounts of RAM and SSDs), so just keep an eye out for performance.
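As a rough sketch of raw file-descriptor I/O: the example below uses pread, a POSIX relative of read() that takes an explicit file offset, which is convenient when jumping around a large file. The helper name pread_full is made up for the example. It loops because a single call may legitimately return fewer bytes than requested.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reads up to `len` bytes starting at `offset`,
 * retrying after short reads and EINTR.
 * Returns the number of bytes read (less than `len` only at end of file),
 * or -1 on error. */
ssize_t pread_full(int fd, void *dst, size_t len, off_t offset)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = pread(fd, (char *)dst + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted, try again */
            return -1;
        }
        if (n == 0) break;                 /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Because pread does not move the descriptor's file position, there is no separate seek call to make before each read.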

thanks for the heads up. I originally wrote this program I'm working on in C... and it works, it will run through a 30 GB file in about a minute, I haven't tried anything bigger than that. Now I'm trying to convert it over to a stand alone app with an easy to use interface... Xcode makes that part easy, but still being a novice when it comes down to XCode / Obj.-C I'm running into issues with my conception of how things should work. For example, I was pretty miffed that you can't do bitwise operations on a NSData object. Doesn't that seem like it should be a gimmie... just let me &= 0xFFFF this thing!


Anyway, thank you all, I seriously can't tell you how awesome you all are to be helping me out. I'm going to keep chugging on this and I'm sure you'll find out soon enough what roadblocks I run into. 🙂

You can do bitwise operations on NSData. One way would be to use NSMutableData. Another way would be to create an NSData from an existing buffer without copying, and then operate on the buffer.

You can do bitwise operations on an NSMutableData object? Interesting... I guess if I can't find it in the documentation, and nobody has ever asked how to do that on StackOverflow... I just assume it can't be done. So much still to learn. :/

You can do bitwise operations on an NSMutableData object?

Yes (from a certain point of view).

NSMutableData exposes a mutableBytes property that’s a pointer to the underlying storage. It’s legal to modify that data as you see fit, although you’ll be using C constructs rather than anything specific to Objective-C. For example:

NSMutableData * d = [[@"Hello Cruel World!" dataUsingEncoding:NSUTF8StringEncoding] mutableCopy];
((uint8_t *) d.mutableBytes)[0] |= 0x20;
NSString * s = [[NSString alloc] initWithData:d encoding:NSUTF8StringEncoding];
NSLog(@"%@", s);
// -> hello Cruel World!

IMPORTANT The pointer returned by mutableBytes is invalidated when the NSMutableData object changes, for example, if you add or remove bytes from the object.
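To apply a mask across a whole buffer rather than a single byte, a plain C loop over the bytes is enough. The helper name mask_bytes below is made up for illustration; it works on any writable byte buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: ANDs every byte of the buffer with `mask`, in place. */
void mask_bytes(uint8_t *bytes, size_t length, uint8_t mask)
{
    for (size_t i = 0; i < length; i++) {
        bytes[i] &= mask;
    }
}
```

With an NSMutableData object d, the call would presumably be mask_bytes(d.mutableBytes, d.length, mask), made while the object is not being resized.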

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"