Another data storage question

I saw a question similar to this but I wanted to ask it pertaining to my situation.


Basically I have an app that is like an encyclopedia. The app will contain "articles" about different subjects. I will be writing all of these articles. The user will not be capable of writing or editing any of the articles. The user just types in a search term and sees the articles I have written.


Now the question is, how do I store all of these articles? Right now I have them all in a giant txt file which I am parsing into arrays of strings every time the user searches. Is there some obvious or better way to store all of this data? Thanks.


P.S. One problem I am having is that I use special characters to delimit where one article ends and another begins and also to designate other information about the articles. It can be a pain because if I make one typo with these delimiting characters, it will crash the app and I will have to search this entire text file to find where the typo was.

Replies

What is the average size of an article? What is the total size of all the articles?


In roughly increasing order of sophistication, your choices include:


1. A giant text file. This is not absolutely a terrible thing.


2. A giant file with some internal structure that either lets you read only parts of it, or that allows you to parse it more easily.


3. A folder with one text file per article. You make this into a "package", which means that it looks to a user like a single file, if that's necessary.


4. A database with one article per record, or something like that.


There's no real correct answer, but a pragmatic approach is to do something different only if you need to solve a problem you know you have, and not one you think you might have.
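As a rough illustration of option 2, and of the typo problem mentioned in the original post, here is a minimal sketch in C of a checker that reports the line number of the first malformed delimiter instead of crashing. The `@@ ` header marker and the name `validate_articles` are hypothetical, chosen only for this example; the real file format in the app is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical format (an assumption of this sketch): a line of the form
 * "@@ Title" opens an article, and the lines after it are the body. */

/* Returns 0 if the text is well formed, otherwise the 1-based number of the
 * first bad line. On success, stores the number of articles found in
 * *article_count when that pointer is not NULL. */
static int validate_articles(const char *text, int *article_count)
{
    assert(text != NULL);

    int line_no = 0;
    int count = 0;
    const char *p = text;

    while (*p != '\0') {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);
        line_no++;

        if (len >= 2 && p[0] == '@' && p[1] == '@') {
            /* Header line: needs "@@", a space, and at least one more character. */
            if (len < 4 || p[2] != ' ') {
                return line_no;
            }
            count++;
        } else if (count == 0 && len > 0) {
            /* Body text before any header belongs to no article. */
            return line_no;
        }

        p += len;
        if (*p == '\n') {
            p++;
        }
    }

    if (article_count != NULL) {
        *article_count = count;
    }
    return 0;
}
```

Running a check like this once over the file, for example in a unit test or a build step, points straight at the offending line, so a typo is caught before the app ships rather than at search time.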

How many articles do you consider ?

100 ? 1000 ? More ?

In any case, I would avoid a single flat text file. You found the problems yourself.


If there are around 100, you could store each one individually in the app resources, as separate files.


Beyond that, you should consider Core Data, which will also let you create indexes, …


Last point: look at the App Store conditions. Such an app, if it has no advanced functions, may not be accepted on the store.

The articles are small - mostly 100 words or less, many are less than 20 words. My plan is that there will be thousands of them. Maybe 5-10 thousand total once I'm done writing them all.


My other concern is about how to protect the data from somebody simply copying it all. Once somebody has the app, is all of this information easy to extract if I use, say, Core Data? I am talking about someone looking at the app file and trying to "steal" the encyclopedia, so to speak. Which approach would protect well against that?

One other question: do the articles change independently of the app? Or is your app always using the articles baked in to it?

The articles are small - mostly 100 words or less, many are less than 20 words. My plan is that there will be thousands of them. Maybe 5-10 thousand total once I'm done writing them all.

I’d probably use a SQLite file for this. Critically, the article sizes you’ve described are well within SQLite’s abilities. You may also be able to take advantage of SQLite’s full text search (FTS) feature.

My other concern is about how to protect the data from somebody simply copying it all.

I’ve posted about this many times. The posts most appropriate to your case are in the old DevForums, here and on this thread. If you have problems accessing these, let me know and I’ll copy them over to the new forums.

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Eskimo, please do copy those two threads you reference in your answer to the new forums as the provided links don't seem to be working anymore. Thanks!

Some links in a 2 year old post are obsolete ? Is it really surprising ? 😉 Certainly not the emitter's fault !

please do copy those two threads you reference in your answer to the new forums as the provided links don't seem to be working anymore.

Not a problem. The old DevForums was shut down a while ago and thus there’s no way to follow those links )-: However, I keep copies of everything I post, so I was able to resurrect that content and paste it in below (-:

Note The first item is the specific post I referenced, and the remaining items are my posts on that thread (hopefully there’s enough quoted text for you to understand the questions I was addressing). I’ve added titles and dates to establish the historical context, and a few editorial remarks.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"


Title Additional Security measure for valuable data stored within the app
Date 10 Sep 2010

Actually about data that is shipped with my app.

First up, unsupported device modifications are completely orthogonal to this issue. Protecting assets is a challenge, even in fully supported configurations. Once the user has sync’d your app to the computer, they can simply go into the iTunes folder, grab the .ipa file, rename it to .zip, and double click to unzip it.

[While syncing apps via iTunes is no longer standard, there are still easy ways for folks to access your .ipa — Quinn, 26 Apr 2019]

Secondly, there are two approaches you can take here:

  • Kid sister security [1] — If you want to protect your assets from casual browsing, you can just scramble them in some way. For example, you could encrypt them with some simple symmetric key encryption scheme (DES, AES, and so on) and then hard-wire the key in your application. For that matter, XORing the file with 0xAA would probably be a good start.

    Keep in mind that, if you’re dealing with an SQLite database, you don’t have to scramble the whole thing. You could just scramble the critical columns of the critical tables. SQLite doesn’t support this directly, but it’s easy to wrap your SQLite access to do this.

  • Real security — There is no way to solve this problem in the general case. The asset has to ship with your application, and your application has to ship with the code and keys required to access that asset. A determined hacker will be able to reverse engineer whatever protection scheme you dream up. So, if you need something beyond kid sister security, you have to come up with an alternative approach. For example, you could require the user to sign up for a web service that vends the asset to that user specifically (either piecemeal or as an entire database that you then protect with the new iOS 4 data protection APIs).
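The XOR idea from the kid sister bullet above can be sketched in a few lines of C. This is only an illustration: `xor_scramble` is a hypothetical name, and the result is obfuscation against casual browsing, not encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XORs every byte of buf with mask, in place. Calling it a second time with
 * the same mask restores the original bytes, so the same function both
 * scrambles and unscrambles. */
static void xor_scramble(uint8_t *buf, size_t len, uint8_t mask)
{
    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        buf[i] ^= mask;
    }
}
```

The asset would be scrambled once at build time with a mask such as 0xAA, shipped in that form, and unscrambled in memory after the app reads it. The same routine could be applied to individual column values rather than a whole file.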

Finally, I want to point out that what you’re trying to create here is a DRM scheme. That is, you want to give the asset to the user but restrict what the user can do with it. As such, be aware that DTS does not support DRM scheme development. In our experience DRM schemes run counter to DTS’s primary goal, which is to help developers create software that works reliably, now and in the future. So, while I’m happy to help you with the specific technical details of this problem (for example, using Common Crypto to do DES, or using the new data protection APIs), the design of your DRM scheme is your own concern.

[1] http://en.wikiquote.org/wiki/Bruce_Schneier


Title Include API Key in Application
Date 31 Jul 2013

My concern is how to include it in the application in the safest way.

It’s hard to offer advice here because we don’t know the cost/benefit trade-offs you’re making. Here’s my general points:

  • If the data is included in your app, or can be fetched by your app without using some per-user identification, it can be reverse engineered: It just boils down to a question of how much time an attacker wants to spend on the problem.

  • This means that any attempt to hide this data is simply obfuscation. There’s lots of info and resources on that topic available on the ’net.

  • You can literally spend the rest of your life coming up with clever obfuscation schemes. This is where the cost/benefit trade-off comes in. A simple scheme will be vulnerable to attack. A complex scheme will take a long time to create.

In most cases I recommend that folks err on the side of ‘simple’. Obviously putting the data into your Info.plist is probably a little too easy, but putting the data in a C static variable, obfuscated by XORing it with another variable that contains randomly generated bytes, will defeat all but the most experienced attackers.

[Despite all the references to C here, there’s no reason you couldn’t implement similar techniques in Swift. Swift didn’t exist when I posted this originally! — Quinn, 26 Apr 2019]


Title Include API Key in Application
Date 31 Jul 2013

Is it right?

Sounds about right.

Other question, is there some example of code or API in obj-c/cocoa to do this kind of XOR?

I would just do this in C.

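#include <stddef.h>
#include <stdint.h>

static uint8_t kScrambledKey[16] = { ... };
static uint8_t kRandomNumbers[16] = { ... };
static uint8_t sKey[16];

// XOR each scrambled byte with its matching random byte to recover the key.
static void UnscrambleKey(void)
{
    for (size_t i = 0; i < sizeof(sKey); i++) {
        sKey[i] = kScrambledKey[i] ^ kRandomNumbers[i];
    }
}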

A nice trick here is to put the definition of kScrambledKey and kRandomNumbers in a separate .c file and have the main code #include that. That way you can automatically generate this .c file at build time.


Title Include API Key in Application
Date 1 Aug 2013

Ok thanks so much! I don’t see the real difference of having both keys in separated files... Just more time to find the key if doing reverse engineering?

Separate source files; everything gets compiled and linked into the same executable before it’s deployed to the user. The separate source files are there simply to make your build-time key generation easier.

Other thing, in order to generate kScrambledKey I need to do out of the application before building it, using a function ScrambleKey that uses as parameters kRandomNumbers and sKey... Is it right? On this case, how would be in C that function?

Your ScrambleKey looks fine, but you have to a) come up with kRandomNumbers (which you can get from SecRandomCopyBytes), and b) write the results out to a .c file that the compiler builds into your app.


Title Include API Key in Application
Date 5 Aug 2013

If I call the function to generate random data, it has to be done at runtime, so I can’t include the generated random numbers as a static char[]

Right. You would do that in a separate command line tool that runs at build time.

Here’s a summary of the process:

  • Your app has the following code:

    #include "MyCryptoData.c"
    
    static uint8_t sKey[16];
    
    static void UnscrambleKey(void)
    {
        for (size_t i = 0; i < sizeof(sKey); i++) {
            sKey[i] = kScrambledKey[i] ^ kRandomNumbers[i];
        }
    }
    
  • You have a separate command line tool. You run it like this:

    $ MyScrambleKeyTool 'YourAPIKey' > MyCryptoData.c
    

    It takes your API key, scrambles it with some random bytes, and writes the result to stdout. So, MyCryptoData.c ends up looking like this:

    $ cat MyCryptoData.c
    static uint8_t kScrambledKey[16]  = { ... };
    static uint8_t kRandomNumbers[16] = { ... };
    
  • When you build your app, the UnscrambleKey function gets the scrambled API key and the random bytes required to unscramble it.

Et voilà!
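For anyone wondering what the build-time tool might look like inside, here is a minimal sketch of its core in C. Several things here are assumptions and not part of the original posts: the random bytes come from /dev/urandom rather than SecRandomCopyBytes so the sketch has no framework dependency, keys shorter than 16 bytes are zero padded and longer ones truncated, and every name other than ScrambleKey, kScrambledKey and kRandomNumbers is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { kKeyLength = 16 };   /* matches the 16-byte arrays used by the app code */

/* Fills buf with random bytes read from /dev/urandom (an assumption of this
 * sketch). Returns 0 on success, -1 on failure. */
static int FillRandom(uint8_t *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) {
        return -1;
    }
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}

/* XORs the key with the random bytes. Keys shorter than kKeyLength are zero
 * padded and longer keys are truncated. */
static void ScrambleKey(const char *key, const uint8_t *randomBytes, uint8_t *scrambled)
{
    assert(key != NULL && randomBytes != NULL && scrambled != NULL);

    uint8_t padded[kKeyLength] = { 0 };
    size_t keyLen = strlen(key);
    memcpy(padded, key, keyLen < kKeyLength ? keyLen : kKeyLength);

    for (size_t i = 0; i < kKeyLength; i++) {
        scrambled[i] = padded[i] ^ randomBytes[i];
    }
}

/* Writes one array definition in the form the app code expects to #include. */
static void WriteArray(FILE *out, const char *name, const uint8_t *bytes)
{
    fprintf(out, "static uint8_t %s[%d] = {", name, (int)kKeyLength);
    for (size_t i = 0; i < kKeyLength; i++) {
        fprintf(out, "%s0x%02x", i == 0 ? " " : ", ", (unsigned)bytes[i]);
    }
    fprintf(out, " };\n");
}

/* Core of the tool: generates fresh random bytes, scrambles the key with
 * them, and writes both arrays to out. Returns 0 on success. */
static int GenerateCryptoData(const char *apiKey, FILE *out)
{
    uint8_t randomBytes[kKeyLength];
    uint8_t scrambled[kKeyLength];

    if (FillRandom(randomBytes, sizeof(randomBytes)) != 0) {
        return -1;
    }
    ScrambleKey(apiKey, randomBytes, scrambled);
    WriteArray(out, "kScrambledKey", scrambled);
    WriteArray(out, "kRandomNumbers", randomBytes);
    return 0;
}
```

A `main` that calls `GenerateCryptoData(argv[1], stdout)` would turn this into the command line tool described above, with its output redirected into MyCryptoData.c by the build.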

For another take on this issue, see this article.
