Serializing generic structs

What is the best approach to archive/serialize generic structs?


I cannot use NSCoding (since structs are not NSObjects).

I can translate my structs to some other representation (such as JSON strings), but I'm struggling to unarchive/deserialize the structs back from the JSON if the structs are generic.


More precisely, I'm wrestling with the type system to allow me to express this in an elegant, maintainable way.

Answered by QuinceyMorris in 18725022

Well, it's a bit voluminous, and getting the generic creation function to compile was a bit strange, but this is what I ended up with:


protocol SerializableType {
  static var serializableName: String { get }
  var serializableProperties: [String: Any] { get }
  init (serializedProperties: [String: Any])
}

extension Int: SerializableType {
  static var serializableName: String { return "Int" }
  var serializableProperties: [String: Any] {
  return ["value": self]
  }
  init (serializedProperties: [String: Any]) {
  self = serializedProperties ["value"] as! Int
  }
}

extension String: SerializableType {
  static var serializableName: String { return "String" }
  var serializableProperties: [String: Any] {
  return ["value": self]
  }
  init (serializedProperties: [String: Any]) {
  self = serializedProperties ["value"] as! String
  }
}

struct A: SerializableType {
  static var serializableName: String { return "A" }
  var serializableProperties: [String: Any] {
  return ["a1": a1, "a2": a2]
  }
  init (a1: Int, a2: String) {
  self.a1 = a1
  self.a2 = a2
  }
  init (serializedProperties: [String: Any]) {
  a1 = serializedProperties ["a1"] as! Int
  a2 = serializedProperties ["a2"] as! String
  }
  var a1: Int
  var a2: String
}

struct B<T: SerializableType>: SerializableType {
  static var serializableName: String { return "B<\(T.serializableName)>" }
  var serializableProperties: [String: Any] {
  return ["b1": b1, "b2": b2, "b3": b3 as Any]
  }
  init (b1: Int, b2: String, b3: T) {
  self.b1 = b1
  self.b2 = b2
  self.b3 = b3
  }
  init (serializedProperties: [String: Any]) {
  b1 = serializedProperties ["b1"] as! Int
  b2 = serializedProperties ["b2"] as! String
  b3 = serializedProperties ["b3"] as! T
  }
  var b1: Int
  var b2: String
  var b3: T
}

func serialize (instance: SerializableType) -> [String: Any] {
  return ["type": instance.dynamicType.serializableName, "values": instance.serializableProperties]
}

func deserialize (serialization: [String: Any]) -> SerializableType {

  let type = serialization ["type"] as! String
  let typeName: String = …
  let typeParameter: String? = …
  let values = serialization ["values"] as! [String: Any]

  if let typeParameter = typeParameter {
  switch typeParameter {

  case "A":
       return deserializedValueWithParameter (A.self, typeName: typeName, values: values)
  case "Int":
       return deserializedValueWithParameter (Int.self, typeName: typeName, values: values)
  case "String":
       return deserializedValueWithParameter (String.self, typeName: typeName, values: values)
  default:
       fatalError ()
  }
  }

  else {
  switch typeName {
  case "A":
       return A (serializedProperties: values)
  case "Int":
       return Int (serializedProperties: values)
  case "String":
       return String (serializedProperties: values)
  default:
       fatalError ()
  }
  }
}

func deserializedValueWithParameter<T: SerializableType> (parameterType: T.Type, typeName: String, values: [String: Any]) -> SerializableType {

  switch typeName {
  case "B":
       return B<T> (serializedProperties: values)
  default:
       fatalError ()
  }
}

let b = B (b1: 0, b2: "X", b3: 1)
let bb = serialize (b)
let bbb = deserialize (bb)


Note:

— The code for serializing and deserializing actually needs to be recursive, but for reasons of space I didn't write that code here.

— I've already written all of the non-generic implementation of this as real code in a project I'm working on. In the real code, known types like Int and String are special cased, rather than handled like the above.

— I didn't write the code to analyze the "B<A>" string into its components, so the last 'bbb = …' line crashes, but the rest of it works in a playground.

— I didn't write any error handling, I just threw in "as!" everywhere.

— I didn't really serialize anything, just converted to a plist-style dictionary that can be serialized easily into JSON, or a plist, or a NSKeyedArchiver archive, according to preference.

— But the number of cases for generics is additive, not multiplicative, so this approach looks a lot better if the number of serializable types is bigger.

— Sorry, the indentation got messed up when I pasted.
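
For what it's worth, the missing piece mentioned in the notes (splitting a name like "B<A>" into its components) could look roughly like the sketch below. splitTypeName is a hypothetical helper, not part of the code above. It is written in current Swift syntax rather than the Swift 2 syntax used in this thread, and it assumes at most one type parameter per level.

```swift
// Hypothetical helper (not from the original post): splits "B<Int>" into
// ("B", "Int") and leaves a plain name such as "A" as ("A", nil).
// Assumption: a type has at most one type parameter.
func splitTypeName(_ full: String) -> (name: String, parameter: String?) {
    guard let open = full.firstIndex(of: "<"), full.hasSuffix(">") else {
        return (full, nil)
    }
    let name = String(full[..<open])
    // Everything between the first "<" and the final ">".
    let inner = full[full.index(after: open)..<full.index(before: full.endIndex)]
    return (name, String(inner))
}

let generic = splitTypeName("B<Int>")
let plain = splitTypeName("A")
print(generic.name, generic.parameter ?? "none")
print(plain.name, plain.parameter ?? "none")
```

A nested name such as "B<B<Int>>" comes back as ("B", "B<Int>"), so the same function can be applied again to the parameter if recursion is needed.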

I'm asking for the best approach, not an approach that doesn't use any dynamic typing features.

You should check out the "Developing iOS 8 Apps with Swift" Stanford course on iTunes U. In the first three lectures a calculator is developed that uses an enum and a struct to create a stack that handles numeric and string data. Then in the first part of lecture 5 (if memory serves correctly) an example of translating the stack to a string and back again is shown. The approach used in that course may work for you.

Just for clarification: that would be the solution of using enums that I have mentioned several times above, e.g.:

"/.../ (well, you can use an enum as I said previously, but the uncertainty as to which of that set of possible Printer<Hmm> types a certain value is will propagate throughout your code, and you'll have to switch case on/in your enum because of that)."
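
To make the enum approach in that quote concrete, here is a minimal sketch in current Swift syntax. Printer<T> is only a stand-in (its real definition is not shown in this part of the thread), and AnyPrinter, unarchivePrinter and describe are hypothetical names chosen for illustration.

```swift
// Hypothetical stand-in for the generic struct discussed in the thread.
struct Printer<T> {
    let value: T
}

// One case per concrete Printer type that the archive may contain.
enum AnyPrinter {
    case int(Printer<Int>)
    case string(Printer<String>)
}

// Hypothetical decoder: picks the case from a value only known at runtime.
func unarchivePrinter(_ raw: Any) -> AnyPrinter? {
    switch raw {
    case let number as Int:
        return .int(Printer(value: number))
    case let text as String:
        return .string(Printer(value: text))
    default:
        return nil
    }
}

// Every consumer has to switch again to reach the statically typed struct.
func describe(_ printer: AnyPrinter) -> String {
    switch printer {
    case .int(let p):
        return "Printer<Int>(\(p.value))"
    case .string(let p):
        return "Printer<String>(\(p.value))"
    }
}
```

The second switch is the "propagating uncertainty" described in the quote: the enum records which Printer type was decoded, but code that uses the value still has to handle each case.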


Neither this solution, nor any other solution, will be able to give you back any "encoded struct" (as in: one of many possible actual struct types), which is what MikeA is asking for, at least according to my best possible understanding of the following:


MikeA wrote:

"I'm repeatedly asking for the best way to approach struct decoding in Swift"


I'm just saying that "struct decoding", in any general and meaningful sense, is impossible, and that this is not due to some limitation of Swift. It's as impossible as getting your compiler to typecheck and give compile time errors for stuff that has been entered or will be entered into some unknown JSON-file.


If this is not what you (MikeA) asked for in the above question, because you do in fact already know all of this, then I think you would also know how to rephrase your question to better communicate what you are actually asking for.


Sorry if this is just me going on and on about the same thing again (but the questions are going on and on too), I don't mean to upset anyone.

What about


enum JsonError : ErrorType {
    case InvalidType
}
enum InvalidData {
    case Example
}
func unarchive() throws -> Foo<Any> {
    let readFromJson : Any;
    let random = arc4random();
    if random < UInt32.max / 3 {
        readFromJson = 1234
    } else if random < UInt32.max / 3 * 2 {
        readFromJson = "Hello"
    } else {
        readFromJson = InvalidData.Example
    }

    switch(readFromJson) {
    case let num as Int:
        return Foo(value:num)
    case let string as String:
        return Foo(value:string)
    case _: throw JsonError.InvalidType;
    }
}

for _ in 0 ..< 10 {
    do {
        let foo = try unarchive()
        print(foo.value)
    } catch {
        print("invalid data")
    }
}


You want to express that unarchive does not return just anything but a Foo. At compile time, however, you cannot know the type parameter of Foo, so you declare the return value as Foo<Any>.


Inside unarchive you need, of course, different branches (depending on the data in the JSON file).

I've already suggested this kind of solution. It was given as one of the possible solutions in my first post to this thread. Here's that particular code example again for completeness:

import Darwin // (for arc4random)

// Some generic struct type of which you want to create values initialized by deserialization:
struct Foo<T> {
    typealias ValueType = T
    let value: ValueType
}

/*
// AFAICS you want a deserializer to do something impossible that is essentially something very similar to this:
func createFooWithSomeValueTypeThatIsOnlyKnownAtRuntime() -> Foo<???> {
    if arc4random() < UInt32.max / 2 {
        return Foo(value: 1234)
    } else {
        return Foo(value: "Hello")
    }
}
*/

// You can of course fight the type system and do something like this
// (which will create a bunch of Foo<Any> values).

func getSomeValueThatCanOnlyBeKnownAtRuntime() -> Any {
    if arc4random() < UInt32.max / 2 {
        return 1234
    } else {
        return "Hello"
    }
}

for _ in 0 ..< 10 {
    let foo = Foo(value: getSomeValueThatCanOnlyBeKnownAtRuntime())
    print(foo.value)
}


(EDIT: Note that it is creating the Foo<Any>'s in the for loop at the end. You can alt-click (option-click) on the foo constant to have Xcode show you its type.)


But evidently this was not what MikeA is asking for (see conversation following that first post of mine).

I know, my suggestion is based on your code. The main difference is the usage of the Any type to parameterize the Foo type when declaring the return type.

Because MikeA wrote:

func unarchivePrinter(s: String) -> Printer<???>

(obviously not compiling) and

func unarchivePrinter(s: String) -> Any

complaining that in the second case the return value being of type Printer is not expressed.

That's why I assumed he is looking for

func unarchivePrinter(s: String) -> Printer<Any>

I edited my previous post to highlight the fact that the code is actually creating Foo<Any>'s from the "statically unknowable source" (the function returning the Any's).


Your code helps to clarify what I meant to show in my code (which I admit was perhaps obfuscating the Foo<Any> part a bit, but I did spell it out clearly in the text of the post), thanks.

Actually I just took your code commented with:

"AFAICS you want a deserializer to do something impossible that is essentially something very similar to this:"


And made it not so impossible by replacing your Foo<???> with Foo<Any>
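
As a rough illustration of what a caller then has to do with a Foo<Any>, the sketch below narrows it back to a concrete Foo<U> with a conditional cast. The narrow function is a hypothetical helper, not something posted in this thread, and the code uses current Swift syntax.

```swift
// Same shape as the Foo struct posted earlier in the thread.
struct Foo<T> {
    let value: T
}

// Hypothetical helper: try to recover a concrete Foo<U> from a Foo<Any>.
// Returns nil when the stored value is not actually a U.
func narrow<U>(_ foo: Foo<Any>, to type: U.Type) -> Foo<U>? {
    guard let value = foo.value as? U else { return nil }
    return Foo<U>(value: value)
}

let erased = Foo<Any>(value: 1234)
let asInt = narrow(erased, to: Int.self)       // succeeds: the value is an Int
let asString = narrow(erased, to: String.self) // nil: the value is not a String
```

The cast happens at runtime, so the compiler still cannot tell in advance which of the two calls will succeed.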

Thanks cwrindfuss, finally a sane answer from someone.


I went ahead and watched the lecture. In case someone else wants to see it, it is Lecture 5. Objective-C Compatibility, Property List, Views.


I paste the actual coding and decoding code here:


var program: AnyObject { // guaranteed to be a PropertyList
      get {
            return opStack.map { $0.description }
      }
      set {
            if let opSymbols = newValue as? Array<String> {
                var newOpStack = [Op]()
                for opSymbol in opSymbols {
                      if let op = knownOps[opSymbol] {
                          newOpStack.append(op)
                      } else if let operand = NSNumberFormatter().numberFromString(opSymbol)?.doubleValue {
                          newOpStack.append(.Operand(operand))
                      }
                }
                opStack = newOpStack
            }
      }
}


The code encodes and decodes an array of Ops to and from Property Lists. Op is an enum.


The coding to PropertyList occurs on line 3. The $0 parameter actually corresponds to an Op enum. The "description" computed property does the actual encoding of Op to PropertyList there.


This will be a crucial point later, so I repeat this: encoding Op to PropertyList is implemented in Op's "description" property.


Decoding of individual Ops, on the other hand, is done on lines 9-12, in the setter of the program property that belongs to CalculatorBrain object.


This is also a crucial point, so I repeat it: decoding Op from PropertyList is not implemented in Op, but in an entirely different class (CalculatorBrain).

To sum up:

  • Encoding of Op is implemented in the Op enum
  • Decoding of Op is implemented in the CalculatorBrain class


This asymmetry is heavily problematic. It works fine for small, educational examples such as the Stanford calculator project. However, this is totally terrible for larger projects. It's a maintenance disaster. Decoding and encoding of type X should happen in one place, ideally in the place where X itself is defined.
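
To show what "one place" could mean for this particular example, here is a sketch in which a simplified Op owns both directions. This Op is a hypothetical reduction of the course's enum (no closures, only a symbol), the knownSymbols set and the init?(propertyList:) initializer are assumptions made for illustration, and the syntax is current Swift.

```swift
// Simplified, hypothetical version of the calculator's Op enum.
enum Op {
    case operand(Double)
    case operation(String)

    // Assumption of this sketch: the set of operation symbols is fixed.
    static let knownSymbols: Set<String> = ["+", "-", "*", "/"]

    // Encoding: Op -> property-list value (a String).
    var description: String {
        switch self {
        case .operand(let value):
            return String(value)
        case .operation(let symbol):
            return symbol
        }
    }

    // Decoding lives right next to the encoding, inside the same type.
    init?(propertyList: Any) {
        guard let text = propertyList as? String else { return nil }
        if Op.knownSymbols.contains(text) {
            self = .operation(text)
        } else if let number = Double(text) {
            self = .operand(number)
        } else {
            return nil
        }
    }
}

let stack: [Op] = [.operand(3), .operand(4), .operation("+")]
let encoded = stack.map { $0.description }
let decoded = encoded.compactMap { Op(propertyList: $0) }
```

With this shape, CalculatorBrain would only map over the array in both directions and would not need to know how a single Op is represented.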


This is the case with Objective-C's NSCoding protocol. You implement two methods, initWithCoder: and encodeWithCoder:, in the same class, right next to each other, without the need to make any changes to the code of any other class/struct/enum/whatever.


I have yet to find a good encoding/decoding architecture that also preserves this property in Swift, particularly when you want to encode/decode enums and structs that cannot conform to the NSCoding protocol. If the enums and structs are generic, the problem becomes even bigger.


Again, the approach outlined in the (very fine, by the way) Stanford course will work for small toy projects, but is not really scalable to larger apps with more complicated models.

What I meant by those ??? was of course:


??? = What on earth should we put here, Int or String, String or Int, how could the compiler know what it should be, Int or String?


Any doesn't count in that alternative, as it is not giving back the "generic struct" that the OP was originally asking to deserialize.


I thought this was clear, especially as I also gave the Foo<Any> solution as an alternative to the impossible, and wrote about Foo<Any> in the text after the code (see for yourself in my first post to this thread (which I haven't edited since then)).


I think it's time for me to leave this thread now ... : )

Then why do you not just copy the NSCoding protocol by creating your own protocol with the same methods/semantics?

Laszlo, it's a pleasure to finally see the code of someone who understands what he is talking about :-)


While this is a workable approach for decoding things, it is not without its problems.

Please see my reply to cwrindfuss above for more detailed discussion why this approach is problematic in Swift.


This may very well be the best solution possible in Swift (or close to it), but it has some huge disadvantages when compared to the NSCoding approach in Objective-C, particularly with regard to scalability and maintainability.


Specifically, it leads to situations where the code for decoding a particular struct/enum is detached from the rest of the code of that struct/enum.

See above.

And on and on it goes. I'll leave this thread now, and perhaps someone brave and pedagogically superior to me will be able to explain to you, for example, that:


What is possible to do (dynamically) in Objective C is (of course) also possible to do (dynamically) in Swift (although of course not statically, using structs the way you seem to believe/require).

That's actually a great question. Maybe this can be done, but I haven't yet found a way to do it. That is why I'm asking this question.


Reimplementing a "clone" of the NSCoding protocol for structs isn't straightforward because of generic structs and generic enums.


Maybe it can be done with some of the new DP2 improvements (ability to call init on a type object). I'm not sure, that's why I'm asking.
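
As a sketch of how calling init on a type object might help here: the example below keeps a table from type names to metatypes and asks the looked-up metatype to construct the value. It reuses the SerializableType protocol from the accepted answer, rewritten in current Swift syntax and with B cut down to one property. TypeRegistry is a hypothetical type, and nothing here is claimed to match what was available in DP2.

```swift
protocol SerializableType {
    static var serializableName: String { get }
    var serializableProperties: [String: Any] { get }
    init(serializedProperties: [String: Any])
}

extension Int: SerializableType {
    static var serializableName: String { return "Int" }
    var serializableProperties: [String: Any] { return ["value": self] }
    init(serializedProperties: [String: Any]) {
        self = serializedProperties["value"] as! Int
    }
}

struct B<T: SerializableType>: SerializableType {
    static var serializableName: String { return "B<\(T.serializableName)>" }
    var b3: T
    var serializableProperties: [String: Any] { return ["b3": b3] }
    init(b3: T) { self.b3 = b3 }
    init(serializedProperties: [String: Any]) {
        b3 = serializedProperties["b3"] as! T
    }
}

// Hypothetical registry: full type name -> metatype ("type object").
struct TypeRegistry {
    private var types: [String: SerializableType.Type] = [:]

    mutating func register(_ metatype: SerializableType.Type) {
        types[metatype.serializableName] = metatype
    }

    func deserialize(_ serialization: [String: Any]) -> SerializableType? {
        guard let name = serialization["type"] as? String,
              let values = serialization["values"] as? [String: Any],
              let metatype = types[name] else { return nil }
        // The initializer is called on the metatype value itself.
        return metatype.init(serializedProperties: values)
    }
}

var registry = TypeRegistry()
registry.register(Int.self)
registry.register(B<Int>.self)

let original = B(b3: 7)
let archived: [String: Any] = [
    "type": type(of: original).serializableName,
    "values": original.serializableProperties
]
let restored = registry.deserialize(archived)
```

One caveat: every concrete instantiation (B<Int>, B<String>, ...) has to be registered separately, so this trades the switch statements for a table but does not remove the need to enumerate the types up front.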


EDIT: I forgot to add, this also entails re-implementing NSKeyedArchiver and NSKeyedUnarchiver, as these won't be able to work on structs, even if those structs supported NSCoding-like methods. Re-implementing those two classes isn't exactly a piece of cake, either. Another reason why I'm asking this question in the first place.

Something like

import Foundation // (for NSCoder)

protocol MyNSCoding {
    func encodeWithCoder(aCoder: NSCoder)
    init?(coder aDecoder: NSCoder)
}

enum FooEnum : MyNSCoding {
    case A
    case B

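    func encodeWithCoder(aCoder: NSCoder) {
        // Record which case is being encoded (0 for .A, 1 for .B).
        aCoder.encodeInt(self == .A ? 0 : 1, forKey: "self")
    }

    init?(coder aDecoder: NSCoder) {
        // Restore the case that was encoded; fail on unknown values.
        switch aDecoder.decodeIntForKey("self") {
        case 0: self = .A
        case 1: self = .B
        default: return nil
        }
    }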
}

struct FooStruct<T where T:MyNSCoding> : MyNSCoding {
    let bar : Int
    let baz : T

    func encodeWithCoder(aCoder: NSCoder) {
        aCoder.encodeInt64(Int64(bar), forKey: "bar")
        baz.encodeWithCoder(aCoder)
    }

    init?(coder aDecoder: NSCoder) {
        bar = Int(aDecoder.decodeInt64ForKey("bar"))
        if let b = T(coder: aDecoder) {
            baz = b
        } else {
            return nil
        }
    }
}
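
Since NSKeyedArchiver and NSKeyedUnarchiver were raised as a concern earlier in the thread, here is a rough sketch of the same idea with a small dictionary-backed coder in place of NSCoder. DictionaryCoder and StructCoding are hypothetical names, the code is in current Swift syntax, and it only illustrates the shape of such a design; it is nowhere near a full archiver. Nested values get their own coder, so their keys cannot collide with the keys of the enclosing value.

```swift
// Hypothetical counterpart of MyNSCoding that does not depend on NSCoder.
protocol StructCoding {
    func encode(with coder: DictionaryCoder)
    init?(coder: DictionaryCoder)
}

// Hypothetical minimal coder: a keyed store of plist-style values.
final class DictionaryCoder {
    private(set) var storage: [String: Any]
    init(storage: [String: Any] = [:]) { self.storage = storage }

    func encode(_ value: Any, forKey key: String) { storage[key] = value }
    func decode<V>(_ type: V.Type, forKey key: String) -> V? {
        return storage[key] as? V
    }

    // Nested values are stored as their own sub-dictionary.
    func encodeNested(_ value: StructCoding, forKey key: String) {
        let nested = DictionaryCoder()
        value.encode(with: nested)
        storage[key] = nested.storage
    }
    func decodeNested<V: StructCoding>(_ type: V.Type, forKey key: String) -> V? {
        guard let dict = storage[key] as? [String: Any] else { return nil }
        return V(coder: DictionaryCoder(storage: dict))
    }
}

enum FooEnum: Int, StructCoding {
    case a, b

    func encode(with coder: DictionaryCoder) {
        coder.encode(rawValue, forKey: "case")
    }
    init?(coder: DictionaryCoder) {
        guard let raw = coder.decode(Int.self, forKey: "case"),
              let value = FooEnum(rawValue: raw) else { return nil }
        self = value
    }
}

struct FooStruct<T: StructCoding>: StructCoding {
    let bar: Int
    let baz: T

    init(bar: Int, baz: T) {
        self.bar = bar
        self.baz = baz
    }
    func encode(with coder: DictionaryCoder) {
        coder.encode(bar, forKey: "bar")
        coder.encodeNested(baz, forKey: "baz")
    }
    init?(coder: DictionaryCoder) {
        guard let bar = coder.decode(Int.self, forKey: "bar"),
              let baz = coder.decodeNested(T.self, forKey: "baz") else { return nil }
        self.bar = bar
        self.baz = baz
    }
}

let coder = DictionaryCoder()
FooStruct(bar: 3, baz: FooEnum.b).encode(with: coder)
let copy = FooStruct<FooEnum>(coder: DictionaryCoder(storage: coder.storage))
```

Note that the caller still names FooStruct<FooEnum> when decoding. Encoding and decoding sit side by side in each type, but the question of choosing the type parameter at runtime is left open by this sketch.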