Serializing generic structs

What is the best approach to archive/serialize generic structs?


I cannot use NSCoding (since structs are not NSObjects).

I can translate my structs to some other representation (such as JSON strings), but I'm struggling to unarchive/deserialize the structs back from the JSON if the structs are generic.


More precisely, I'm wrestling with the type system to allow me to express this in an elegant, maintainable way.

Accepted Reply

Well, it's a bit voluminous, and getting the generic creation function to compile was a bit strange, but this is what I ended up with:


protocol SerializableType {
  static var serializableName: String { get }
  var serializableProperties: [String: Any] { get }
  init (serializedProperties: [String: Any])
}

extension Int: SerializableType {
  static var serializableName: String { return "Int" }
  var serializableProperties: [String: Any] {
    return ["value": self]
  }
  init (serializedProperties: [String: Any]) {
    self = serializedProperties ["value"] as! Int
  }
}

extension String: SerializableType {
  static var serializableName: String { return "String" }
  var serializableProperties: [String: Any] {
    return ["value": self]
  }
  init (serializedProperties: [String: Any]) {
    self = serializedProperties ["value"] as! String
  }
}

struct A: SerializableType {
  static var serializableName: String { return "A" }
  var serializableProperties: [String: Any] {
    return ["a1": a1, "a2": a2]
  }
  init (a1: Int, a2: String) {
    self.a1 = a1
    self.a2 = a2
  }
  init (serializedProperties: [String: Any]) {
    a1 = serializedProperties ["a1"] as! Int
    a2 = serializedProperties ["a2"] as! String
  }
  var a1: Int
  var a2: String
}

struct B<T: SerializableType>: SerializableType {
  static var serializableName: String { return "B<\(T.serializableName)>" }
  var serializableProperties: [String: Any] {
    return ["b1": b1, "b2": b2, "b3": b3 as Any]
  }
  init (b1: Int, b2: String, b3: T) {
    self.b1 = b1
    self.b2 = b2
    self.b3 = b3
  }
  init (serializedProperties: [String: Any]) {
    b1 = serializedProperties ["b1"] as! Int
    b2 = serializedProperties ["b2"] as! String
    b3 = serializedProperties ["b3"] as! T
  }
  var b1: Int
  var b2: String
  var b3: T
}

func serialize (instance: SerializableType) -> [String: Any] {
  return ["type": instance.dynamicType.serializableName, "values": instance.serializableProperties]
}

func deserialize (serialization: [String: Any]) -> SerializableType {

  let type = serialization ["type"] as! String
  let typeName: String = …
  let typeParameter: String? = …
  let values = serialization ["values"] as! [String: Any]

  if let typeParameter = typeParameter {
    // Generic type: dispatch on the type parameter first.
    switch typeParameter {
    case "A":
      return deserializedValueWithParameter (A.self, typeName: typeName, values: values)
    case "Int":
      return deserializedValueWithParameter (Int.self, typeName: typeName, values: values)
    case "String":
      return deserializedValueWithParameter (String.self, typeName: typeName, values: values)
    default:
      fatalError ()
    }
  }
  else {
    // Non-generic type: dispatch on the type name directly.
    switch typeName {
    case "A":
      return A (serializedProperties: values)
    case "Int":
      return Int (serializedProperties: values)
    case "String":
      return String (serializedProperties: values)
    default:
      // A switch over a String must be exhaustive.
      fatalError ()
    }
  }
}

func deserializedValueWithParameter<T: SerializableType> (parameterType: T.Type, typeName: String, values: [String: Any]) -> SerializableType {

  switch typeName {
  case "B":
    return B<T> (serializedProperties: values)
  default:
    fatalError ()
  }
}

let b = B (b1: 0, b2: "X", b3: 1)
let bb = serialize (b)
let bbb = deserialize (bb)


Note:

- The code for serializing and deserializing actually needs to be recursive, but for reasons of space I didn't write that code here.

- I've already written all of the non-generic implementation of this as real code in a project I'm working on. In the real code, known types like Int and String are special-cased, rather than handled like the above.

- I didn't write the code to analyze the "B<A>" string into its components, so the last 'bbb = …' line crashes, but the rest of it works in a playground.

- I didn't write any error handling, I just threw in "as!" everywhere.

- I didn't really serialize anything, just converted to a plist-style dictionary that can be serialized easily into JSON, or a plist, or an NSKeyedArchiver archive, according to preference.

- But the number of cases for generics is additive, not multiplicative: one case per type parameter plus one case per generic type, rather than one case per combination. So this approach looks a lot better as the number of serializable types grows.

- Sorry, the indentation got messed up when I pasted.
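For the missing step mentioned in the notes (splitting a string such as "B<A>" into its components), a minimal sketch could look like the following. It is written in current Swift syntax, which differs slightly from the syntax used in the answer. The function name splitTypeName is hypothetical and not part of the original answer, and the sketch assumes at most one type parameter.

```swift
// Hypothetical helper, not part of the original answer.
// Splits "B<A>" into ("B", "A") and "A" into ("A", nil).
// Assumes a single type parameter; for a nested name such as
// "B<B<Int>>" the parameter is returned unparsed as "B<Int>".
func splitTypeName(_ type: String) -> (typeName: String, typeParameter: String?) {
    // No "<", or no closing ">" at the end: treat it as a non-generic name.
    guard let open = type.firstIndex(of: "<"), type.last == ">" else {
        return (type, nil)
    }
    let name = String(type[..<open])
    let parameterStart = type.index(after: open)
    let parameterEnd = type.index(before: type.endIndex)
    return (name, String(type[parameterStart..<parameterEnd]))
}

let parts = splitTypeName("B<A>")
let plain = splitTypeName("Int")
```

The two results could then feed the typeName and typeParameter values that deserialize switches on.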

Replies

For the structs I'm using in a project of my own, I opted to go with JSON encoding and decoding. For generics - I split the responsibility between the struct and the encode/decode caller:


func encode<T>(encodeInfo: T -> JSON) -> JSON {
...
}


So the generic defines an encode method which encodes what it meaningfully can about itself, and which takes a function or closure to encode the generic portion. The caller of the encode/decode methods owns the specific instance of the generic, and so will know the specific type associated with it, so it's in a good position to provide a serialization/deserialization implementation for the otherwise generic portion.
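A rough sketch of that split of responsibilities, in current Swift syntax, might look like this. The JSON typealias, the Box struct, and its label field are assumptions made for illustration only; the poster's actual JSON type and method body are not shown in the thread.

```swift
// Assumption: JSON is a plain dictionary. The poster's real JSON type is not shown.
typealias JSON = [String: Any]

// Hypothetical generic struct used only for illustration.
struct Box<T> {
    let label: String
    let value: T

    // The struct encodes the part it understands (label) and
    // delegates the generic part to a closure supplied by the caller.
    func encode(encodeInfo: (T) -> Any) -> JSON {
        return ["label": label, "value": encodeInfo(value)]
    }
}

// The caller owns a concrete Box<Int>, so it knows how to encode the Int.
let box = Box(label: "count", value: 42)
let json = box.encode { $0 }
```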

Thanks monyshuk. What does the decode look like and how is the decoding mechanism structured? That seems to be the harder part.

Please note that I might be totally misunderstanding you, but anyway:


import Darwin // (for arc4random)

// Some generic struct type of which you want to create values initialized by deserialization:
struct Foo<T> {
    typealias ValueType = T
    let value: ValueType
}

// AFAICS you want a deserializer to do something impossible that is essentially something very similar to this:
/*
func createFooWithSomeValueTypeThatIsOnlyKnownAtRuntime() -> Foo<???> {
    if arc4random() < UInt32.max / 2 {
        return Foo(value: 1234)
    } else {
        return Foo(value: "Hello")
    }
}
*/


You can of course fight the type system and do something like this:

func getSomeValueThatCanOnlyBeKnownAtRuntime() -> Any {
    if arc4random() < UInt32.max / 2 {
        return 1234
    } else {
        return "Hello"
    }
}


for _ in 0 ..< 10 {
    let foo = Foo(value: getSomeValueThatCanOnlyBeKnownAtRuntime())
    print(foo.value)
}


Which will create a bunch of Foo<Any> values. You can try any number of different solutions, but you will always end up with something like this, ie using Any or an enum with associated values covering the set of possible types. You will never be able to get what I guess you really want, ie you will never be able to do something like this:

let deserializedFoo: Foo<DeserializedType> = Foo(deserializedFrom: someRuntimeSource)


Simply because the DeserializedType is only known at runtime, it can't be known at compile time. Generics is about compile time, deserialization is about runtime.
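The enum alternative mentioned above could be sketched as follows, in current Swift syntax. The names AnyFoo, decodeFoo and describe are hypothetical, and the set of cases (Int and String) is an assumption chosen to match the earlier example.

```swift
// Same shape as the Foo struct above, repeated so the sketch is self-contained.
struct Foo<T> {
    let value: T
}

// One case per value type the archive is allowed to contain.
enum AnyFoo {
    case int(Foo<Int>)
    case string(Foo<String>)
}

// Hypothetical decoder: inspects a runtime value and picks the matching case.
func decodeFoo(from raw: Any) -> AnyFoo? {
    if let i = raw as? Int { return .int(Foo(value: i)) }
    if let s = raw as? String { return .string(Foo(value: s)) }
    return nil
}

// The runtime uncertainty shows up as a switch wherever the value is used.
func describe(_ foo: AnyFoo) -> String {
    switch foo {
    case .int(let f): return "Foo<Int>(\(f.value))"
    case .string(let f): return "Foo<String>(\(f.value))"
    }
}
```

Each case carries a fully typed Foo, but the caller only learns which one it got by switching at runtime.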

Thanks Jens, you understand correctly. This is exactly what I'm trying to achieve:

func createFooWithSomeValueTypeThatIsOnlyKnownAtRuntime() -> Foo<???> {
    if arc4random() < UInt32.max / 2 {
        return Foo(value: 1234)
    } else {
        return Foo(value: "Hello")
    }
}


Let's say we have a Printer<T> struct:

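struct Printer<T>
{
     let value: T
     init(value: T)
     {
          self.value = value
     }
     func print()
     {
          // Qualified with the module name so that this calls the
          // global print function rather than this method.
          Swift.print(value)
     }
}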


I create 2 variables:


let doublePrinter = Printer<Double>(value: 123.456)
let stringPrinter = Printer<String>(value: "Hello")


Then, I somehow archive these two structs into string for saving them on disk.


Later, I want to unarchive those two structs back from the string.


I want to write the unarchive function that will take the string and return a Printer struct. Since I do not know what struct is serialized in the string (only that it is some kind of Printer), it is impossible to express this:


func unarchivePrinter(s: String) -> Printer<???>
{
     ...
}


However, I can instead change this to return Any, as you suggested:


func unarchivePrinter(s: String) -> Any
{
     ....
}


This will work, but the type of the function now doesn't communicate that it returns some kind of Printer.


Is there really no better way of unarchiving generic Swift structs without falling back to using a bunch of Any values?

I *know* that the unarchived structs are Printers. I want to let the compiler and typechecker know, too.

EDIT: I realize (now, afterwards) that I use the word Printer here in another meaning than how you used it in the example ... see my next post instead.


If you (and most importantly the compiler!) can actually know that, in some place, in your code, there will be a Foo which will always be of some type (eg Printer), then there is of course no problem, as you can simply do eg:

let unarchivedPrinterFoo: Foo<Printer> = Foo(unarchiveAsPrinter: unarchiver)


Not being able to do something like this most likely means that Printer is in fact not knowable at compile time after all, but somehow you got the idea that it was, probably because there are a number of immediate/close steps in some causality chain that clearly indicates that this must be a Printer, but further down/back in that causality chain, there is something unknowable to the compiler (like the contents of some json file).


Generics is a tool for working with stuff that the compiler can know. I get the impression that you want to use it for something that it was never intended for.

All I really want is to unarchive an array of generic structs. I have


protocol Vehicle { }

struct Train<T>: Vehicle {
...
}

I have an array of these:


let vehicles: [Vehicle] = ...


I archive them into a JSON file (somehow).


Then, I want to unarchive them back from the file and get a [Vehicle] array back.

Let's presume the trains are encoded in JSON format, but it may be anything, really.


I just want to find an elegant way to unarchive a [Vehicle] array back from the JSON representation, when the array may actually contain different concrete types of trains, such as Train<Coal>, Train<Passenger>, Train<Wood>...


I totally agree with you that the createFooWithSomeValueTypeThatIsOnlyKnownAtRuntime function you suggested is trying to use the type system in a way that it clearly can't handle. But I have yet to find a way to do the unarchiving that doesn't try to abuse the type system and is also maintainable and elegant. That was the original question.


(By the way, the only reason I need to do this at all is because structs can NOT conform to NSCoding protocol, so I cannot use Cocoa's NSKeyedArchiver to do the archiving and unarchiving. If that were possible, I wouldn't be asking this question. I'm looking for a nice alternative since the NSCoding approach cannot be used for complex struct-based models. How would you read and write the model in the Crustacean example to file, for example?)

Ok, I think I finally know how to explain what I've been trying to say all along now, let's see how it goes : ) But first let's look at these from your previous post:

let doublePrinter = Printer<Double>(value: 123.456)
let stringPrinter = Printer<String>(value: "Hello")


Just so that we are on the same page. Here, doublePrinter and stringPrinter are of two separate types. There is no type Printer, Printer is not a type (on its own). It's only by giving it a T (by entering some text in your code, statically, at compile time) that you can get a type from whatever that Printer thing is, and the type you get by giving it a certain T is not a "subclass" or anything of Printer.
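A small sketch of that point, in current Swift syntax. The Printer definition is repeated in reduced form so the block is self-contained, and the sameType constant is a name introduced only for this illustration.

```swift
// Reduced version of the Printer struct from the previous post.
struct Printer<T> {
    let value: T
}

let doublePrinter = Printer<Double>(value: 123.456)
let stringPrinter = Printer<String>(value: "Hello")

// Each specialization is a separate type, so the metatypes differ.
let sameType = type(of: doublePrinter) == type(of: stringPrinter)

// The following does not compile, because no single T fits both elements:
// let printers: [Printer] = [doublePrinter, stringPrinter]
```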


And as for the following, also from your previous post:

Is there really no better way of unarchiving generic Swift structs without falling back to using a bunch of Any values?

I *know* that the unarchived structs are Printers. I want to let the compiler and typechecker know, too.


Well, you know that because you happen to know about some runtime-specific string or JSON file or something like that. How should the compiler be able to know about something which can change at runtime? You can of course make a bunch of Printer<Any>'s if the compiler can know they will always be some kind of Printer<?> types. But you can never get back a Printer<Int> or Printer<Double> if you can't know the T-part of Printer at compile time. (Well, you can use an enum as I said previously, but the uncertainty as to which of that set of possible Printer<Hmm> types a certain value is will propagate throughout your code, and you'll have to switch case on/in your enum because of that.)


The only way to "let the compiler and the typechecker know, too" is by doing something really crazy like:

Including some specific archive of choice within your source code, and make sure it is done in such a way that type checker / compiler can interpret it (ie you must turn it into code, ie do the unarchiving yourself(!)). So your "unarchiver" would now only be able to handle exactly this included archive. But you could of course add as many such (unarchived) archives as you want, letting the compiler know about those too ... This makes no sense, I know (and that is the point). But:


I think you will understand what I mean if you think a bit about the chronological and causal order of these steps:

#1. You write your code, here's where type checking and compilation happens, and you hit "Build".

#2. You get your executable.

#3. You change something in a JSON archive that the executable has been set up to unarchive from.

#4. You run your executable.

#5. You get frustrated about the fact that what happened in #3 can't be known by the compiler at #1.


IMHO You are actually asking if Swift will allow you to travel back in time (in an elegant and maintainable way)!

: )


I think this might also be (part of) the reason for why "the NSCoding approach cannot be used for complex struct-based models".

You are not answering the question. You are also totally missing the real problem. Try actually writing code that archives an array of generic structs to file and then unarchives it back and you'll finally know what I've been talking about the whole time.

The problem is that, unless I am totally misunderstanding you, you are asking for the impossible and it is impossible in principle, not just because of some limitation of Swift generics.


As I tried to show in detail above: You are essentially asking Swift to allow you to travel back in time, or perhaps rather the compiler to predict what is going to happen in the future. You are asking if there is a way of enabling the compiler to magically predict eg what will be in some JSON file at runtime, ie after the compiler has finished compiling the program.


And now that I have spent a lot of time to really explain why this is impossible, what generics are, what compile time and runtime are, you are refusing to accept that it is impossible. Reread my last answer again, especially the last part with the #1 #2 #3 #4 #5.


You simply can not "unarchive generic structs" in any meaningful and general way, not as long as a "generic struct" is a compile time thing, and "unarchiving something" is a runtime thing. Making generic structs dynamic would be as meaningless and absurd as trying to make unarchiving a compile time thing, ie you lose all useful aspects of them in doing so.


The Bar part of a Foo<Bar> is a compile time thing, it is not something that can be changed dynamically at runtime, simple as that.


You are not happy with getting what not only Swift, but reality itself (time, causality), can give you, ie a Foo<Any> or an enum with associated values for its two cases Foo<Bar> and Foo<Baz>. And you refuse to understand that the compiler can't possibly know / magically predict whether some archive will unarchive into a Foo<Bar> or into a Foo<Baz>.

Please forget everything you and I wrote in this thread so far. Then, show me how you would archive an array of generic structs to file and unarchive it back, by any means necessary. You are still going back to your original implementation suggestion, going on and on about why that particular approach won't work. I fully understand why that approach won't work, that's why I'm asking this question in the first place! I'm not married to that solution. I want to see a solution - any good solution! - that works.


I have a model made of structs, some of them generic. All I want to do is archive my struct-based model into a file and then unarchive it back. I'm looking for the best way to do this. I haven't yet found a good solution.


Please, tell me how you would archive and unarchive the model from Apple's Crustacean example. Show me the actual code.

So the particular kind of "arrays of generic structs" (which you want to be able to archive and unarchive) does not need to have elements whose types are conforming to a protocol with associated type constraints such as this:

protocol SomeThing {
    typealias TypeOfThing
    var thing: TypeOfThing { get }
}

?

(Because, as I guess you already know, it is not possible to even have an array like the one on the last line here:

struct GenericThing<T> : SomeThing {
    typealias TypeOfThing = T // This line can be left out as the line below will allow the compiler to infer it.
    let thing: TypeOfThing
}
struct StringThing : SomeThing {
    typealias TypeOfThing = String // This line can be left out as the line below will allow the compiler to infer it.
    let thing: String
}
let myArrayOfSomethings: [SomeThing] = [GenericThing(thing: 1234), StringThing(thing: "Hello"), GenericThing(thing: 12.34)]

)


As I've already said, you'll need to devise some way of creating that array which will not only work for stuff that can be known at compile time, as it will also have to work for stuff that is only possible to know at runtime, long after the compiler is done. Once you have decided on that, you will also be able to create that array. For example like this:

let myArrayOfSomethings: [Any] = [GenericThing(thing: 1234), StringThing(thing: "Hello"), GenericThing(thing: 12.34)]


And there are of course various ways to do that in a way that is perhaps less disappointing to you than to have to resort to [Any], but all those alternatives have also already been suggested. It's just that you will never be able to get back an array where one certain element is eg a GenericThing<Int>, but instead you might have at best some GenericThing<Any>, or an enum which can hold some particular GenericThing<X> where the X is only possible to know at runtime, thus requiring that enum.

I agree with everything you wrote, but please proceed to the problem of actually unarchiving of the structs. I'm begging you for the third time now. You're still writing about things that aren't really the main problem here. You'll see what I mean when you actually try to write the unarchiver. The main problem is that even if we use the Any-returning function (which is not ideal, but so be it if it cannot be helped), that unarchiving function will have to have information about all unarchivable structs concentrated in one place in my code. If I add another structure type, I will also need to remember to update that decoding function to include that new structure type. Information about decoding all kinds of structs will be centralized in that function, which is very, VERY bad design. In Objective-C, you can de-centralize the decoding just by implementing initWithCoder method on all classes (which will put the decoding code in each class it belongs to).
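One possible shape for keeping the decoding code next to each type, loosely analogous to initWithCoder, is sketched below in current Swift syntax. Nothing here comes from the thread: the Archivable protocol, the ArchiveRegistry struct, and the Coal and Train fields are all hypothetical names chosen for illustration, and the dictionary layout ("type" and "values" keys) is an assumption.

```swift
// Hypothetical protocol: each type owns its own encoding and decoding.
protocol Archivable {
    static var archiveName: String { get }
    init?(archived: [String: Any])
    var archived: [String: Any] { get }
}

// Hypothetical registry mapping stored type names to decoders.
struct ArchiveRegistry {
    var decoders: [String: ([String: Any]) -> Archivable?] = [:]

    mutating func register<T: Archivable>(_ type: T.Type) {
        decoders[T.archiveName] = { T(archived: $0) }
    }

    func decode(_ entry: [String: Any]) -> Archivable? {
        guard let name = entry["type"] as? String,
              let values = entry["values"] as? [String: Any],
              let decoder = decoders[name] else { return nil }
        return decoder(values)
    }
}

struct Coal: Archivable {
    static let archiveName = "Coal"
    let tons: Int
    init(tons: Int) { self.tons = tons }
    init?(archived: [String: Any]) {
        guard let tons = archived["tons"] as? Int else { return nil }
        self.tons = tons
    }
    var archived: [String: Any] { return ["tons": tons] }
}

struct Train<Cargo: Archivable>: Archivable {
    static var archiveName: String { return "Train<\(Cargo.archiveName)>" }
    let cargo: Cargo
    init(cargo: Cargo) { self.cargo = cargo }
    init?(archived: [String: Any]) {
        // The generic part is decoded by the cargo type itself.
        guard let raw = archived["cargo"] as? [String: Any],
              let cargo = Cargo(archived: raw) else { return nil }
        self.cargo = cargo
    }
    var archived: [String: Any] { return ["cargo": cargo.archived] }
}

var registry = ArchiveRegistry()
registry.register(Coal.self)
registry.register(Train<Coal>.self)

let original = Train(cargo: Coal(tons: 3))
let entry: [String: Any] = ["type": Train<Coal>.archiveName, "values": original.archived]
let decoded = registry.decode(entry)
```

The registry still needs one registration line per concrete specialization, such as Train<Coal>. What moves out of the central function is the decoding logic itself, and the result is still an existential that the caller has to downcast.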


The fact that you'll probably have to use Any type as some return value somewhere is a tangential, minor issue compared to the bigger picture here.


Please, tell me how you would archive and unarchive the model from Apple's Crustacean example. Show me the actual code.

Even if I did write some code, I am pretty sure that you would not be satisfied with the solution. But I might post some code anyway later if I have the time.

The reason why I think you will not be happy with my solution is that the main problem is that you seem unwilling to accept that it is impossible to get static type information for something that is inherently dynamic. The solution will always have to be able to represent all possible dynamic variations in the type information.

So the easiest (and least type-safe) way would be to use dynamic data structures for everything (this is how people typically do everything in Objective C). Otherwise you will have to make the type system aware of all variations that are possible at runtime, which, taken to an extreme, would mean encoding in your code / the type system all possible type variations that the archives imply. This is probably not very practical.

You just have to decide where in that spectrum you would like your solution to land. Swift adds nice static features that are simply not there in eg Objective C, so it gives you and the compiler some new tools, but that doesn't mean that those tools are somehow possible to turn inside out and be used "dynamically", that would just be strange.

In Objective C you almost never have to think about this stuff, ie what is possible/practical/motivated to do statically vs dynamically, since almost everything is just dynamic, sort of unnecessarily extremely so. But in Swift you have to deal with that, which gives you more power as a programmer and the compiler more power as a compiler / type checker (providing compile time errors and making optimizations that would not have been possible in eg Objective C). Just because of the static features in Swift, it is not possible to suddenly make everything static and thus type checkable. There are of course still problems (like yours) that are only possible / practical to solve using the dynamic tools that eg Objective C uses for everything (including stuff that would be much better done statically).

There's no general "best" way of solving all problems of this kind, just as there is no general best way of solving any given (sufficiently complex) class of problems.

This is the fourth time in a row you totally ignored what I wrote to you. You keep repeating the same thing over and over again without addressing the main question. Please don't bother anymore. Maybe someone else can add something productive to the discussion instead of this totally tangential philosophical discourse on static/dynamic typing. That is NOT what the question is about at all. I'm repeatedly asking for the best way to approach struct decoding in Swift, requesting code. Instead of writing some code, you keep lecturing me in entry-level static/dynamic programming topics that I am 100% familiar with. Please stop doing that. Have a nice day.

This thread, and the "solution" to your problem is ALL about static vs dynamic typing and value vs reference semantics.


Let's make this really simple and say we have a JSON file in which we have encoded one of the primitive Swift types (Int, Double, UInt8 et al, which are defined in the std lib as structs).


How could it (ever) be possible to write a deserializer that would not have to take into account all the different possible types of structs that such a JSON file would be allowed to encode?


This is why you would use dynamic typing for such tasks.


So this is the last time I repeat this (promise): You are asking for the impossible, unless you are willing to give in to the fact that you need some dynamic typing here, and that you can not "serialize and deserialize structs" as such.