Extracting polymorphic types in json4s

I am using json4s to work with JSON objects in my Scala code. I want to convert JSON data to an internal representation. The following learning test illustrates my problem:

"Polymorphic deserialization" should "be possible" in {
    import org.json4s._
    import org.json4s.jackson.JsonMethods.parse
    import org.json4s.jackson.Serialization
    import org.json4s.jackson.Serialization.write
    val json =
      """
        |{"animals": [{
        |  "name": "Pluto"
        |  }]
        |}
      """.stripMargin
    implicit val format = Serialization.formats(ShortTypeHints(List(classOf[Dog], classOf[Bird])))
    val animals = parse(json) \ "animals"
    val ser = write(Animals(Dog("pluto") :: Bird(canFly = true) :: Nil))
    System.out.println(ser)
    // animals.extract[Animal] shouldBe Dog("Pluto") // Does not deserialize, because Animal cannot be constructed
}

Suppose there is a JSON object that contains a list of Animals. Animal is an abstract type and hence cannot be instantiated. Instead, I want to parse the JSON structure so that it returns either Dog or Bird objects. The two have different signatures:

case class Dog(name: String) extends Animal
case class Bird(canFly: Boolean) extends Animal

Because their signatures are distinct, they can be told apart without a class tag in the JSON object. (To be precise, the JSON structure I receive does not provide such tags.)

I tried to serialize a list of Animal objects (see the code). The result is: Ser: {"animals":[{"jsonClass":"Dog","name":"pluto"},{"jsonClass":"Bird","canFly":true}]}

As you can see, when serializing, json4s adds the class-tag jsonClass.

How can I deserialize a JSON object that does not provide such a tag? Is it possible to achieve this by extending TypeHints?

I also found a similar question: [json4s]:Extracting Array of different objects, with a solution that somehow uses generics instead of subclassing. However, if I understand correctly, that solution does not let me simply pass in the JSON object and get an internal representation back. Instead, I would need to select the form that is not None (while checking all possible types in the inheritance hierarchy). This is a bit tedious, since I have multiple polymorphic classes at different depths in the JSON structure.
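To make the "select the form that is not None" idea concrete, here is a rough sketch of such a manual fallback. It is not the linked answer's code; the object name ManualFallback and its helper methods are hypothetical, and it assumes json4s with the Jackson backend on the classpath and that extracting a case class with a missing required field fails (so extractOpt yields None).

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

sealed trait Animal
case class Dog(name: String) extends Animal
case class Bird(canFly: Boolean) extends Animal

// Hypothetical helper, for illustration only.
object ManualFallback {
  implicit val formats: Formats = DefaultFormats

  // Try each concrete subtype in turn and keep the first one that extracts.
  def toAnimal(json: JValue): Option[Animal] =
    json.extractOpt[Dog].orElse(json.extractOpt[Bird])

  // Walk the "animals" array and drop elements that match no known subtype.
  def animalsIn(doc: String): List[Animal] =
    (parse(doc) \ "animals").children.flatMap(toAnimal)
}
```

Every polymorphic supertype needs its own chain like toAnimal, and the chain has to be called by hand at each place the type occurs, which is where the tedium comes from.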

Parisparish answered 22/9, 2014 at 18:37 Comment(3)
Did you ever find an answer to that? I'm facing the same challenge here... - Dishevel
Unfortunately, I didn't find an answer. As a workaround I agreed with the guy who created the serialized JSON to add type hints; but this is obviously not a solution if you can't influence the JSON schema. I'm still interested in an answer and have a bit more knowledge about json4s than I had at the time of writing the question, so I'll try to come up with a solution. - Parisparish
@Dishevel Thank you for reviving the question. I found extending CustomSerializer to be a fairly simple solution (though the code for extracting large polymorphic structures may become a bit bloated). I hope this also helps you solve your problem. - Parisparish

Ultimately, on the project that led to this question, I agreed with the guy creating the serialized JSON on adding type hints for all polymorphic types. In retrospect this solution is probably the cleanest, because it allows the JSON schema to be extended later without the danger of introducing ambiguity.
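For completeness, a minimal sketch of that type-hint route, reusing the ShortTypeHints setup from the question. The object name TypeHintRoundTrip is hypothetical, and the sketch assumes json4s with the Jackson backend and top-level case classes (so the short class names are plain Dog and Bird).

```scala
import org.json4s._
import org.json4s.jackson.Serialization
import org.json4s.jackson.Serialization.{read, write}

sealed trait Animal
case class Dog(name: String) extends Animal
case class Bird(canFly: Boolean) extends Animal
case class Animals(animals: List[Animal])

// Hypothetical wrapper, for illustration only.
object TypeHintRoundTrip {
  // The hints make json4s write a class tag for Dog and Bird and use it when reading.
  implicit val formats: Formats =
    Serialization.formats(ShortTypeHints(List(classOf[Dog], classOf[Bird])))

  def serialize(a: Animals): String = write(a)

  // Works only if the JSON carries the class tags written by serialize.
  def deserialize(json: String): Animals = read[Animals](json)
}
```

With the tags present, TypeHintRoundTrip.deserialize(TypeHintRoundTrip.serialize(x)) should give back a value equal to x, and adding a new subtype later only requires extending the hint list.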

Nevertheless, there exists a fairly simple solution (not just a workaround) to the actual problem.

The type org.json4s.Formats, which is an implicit value in our scope, provides a function +(org.json4s.Serializer[A]). This function allows us to add new custom serializers. So for each polymorphic supertype (in our case this concerns only Animal), we can define a custom serializer. In our example, where we have

trait Animal
case class Dog(name: String) extends Animal
case class Bird(canFly: Boolean) extends Animal

a custom serializer that operates without type hints would look as follows:

class AnimalSerializer extends CustomSerializer[Animal](format => ( {
  case JObject(List(JField("name", JString(name)))) => Dog(name)
  case JObject(List(JField("canFly", JBool(canFly)))) => Bird(canFly)
}, {
  case Dog(name) => JObject(JField("name", JString(name)))
  case Bird(canFly) => JObject(JField("canFly", JBool(canFly)))
}))

Thanks to the function + we can add multiple custom serializers, while keeping the default serializers.

case class AnimalList(animals: List[Animal])

val json =
  """
    |{"animals": [
    |  {"name": "Pluto"},
    |  {"name": "Goofy"},
    |  {"canFly": false},
    |  {"name": "Rover"}
    |  ]
    |}
  """.stripMargin
implicit val format = Serialization.formats(NoTypeHints) + new AnimalSerializer
println(parse(json).extract[AnimalList])

prints

AnimalList(List(Dog(Pluto), Dog(Goofy), Bird(false), Dog(Rover)))
Parisparish answered 26/11, 2014 at 23:40 Comment(4)
This works indeed, thanks for pursuing your investigation... I feel like the type hints work better if you have a grip on the JSON producer, though, as this will indeed rapidly become bloated... - Dishevel
Note also that this works because the serialized types for Dog and Bird are different (JString vs. JBool). In the more general case, you'll also need to serialize the name of the type in order to discern between them. - Parisi
Also be aware that the deserialization part of your custom serializer class expects the JSON fields to appear in the same order as in your pattern match. More details here: link - Interrogatory
To reduce the amount of boilerplate code: 1) When using the technique in the previous comment, you could use case-class extraction in the deserializer part of the CustomSerializer class. 2) For the serializer part of the CustomSerializer class, you could use the DSL to produce JValues: more info about DSL - Interrogatory
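The last two comments can be combined into a variant of the serializer that does not depend on field order. This is only a sketch: the names FieldBasedAnimalSerializer and FieldBasedDemo are hypothetical, it assumes json4s with the Jackson backend, and it decides the subtype purely by which field is present, so a subtype added later with an overlapping field name would need extra care.

```scala
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods.parse

sealed trait Animal
case class Dog(name: String) extends Animal
case class Bird(canFly: Boolean) extends Animal
case class AnimalList(animals: List[Animal])

// Hypothetical variant of AnimalSerializer from the answer above.
class FieldBasedAnimalSerializer extends CustomSerializer[Animal](_ => {
  // Plain formats are enough to extract the concrete case classes.
  implicit val plain: Formats = DefaultFormats

  // Pick the subtype by the field that is present, then let json4s
  // do ordinary case-class extraction (field order no longer matters).
  val deserialize: PartialFunction[JValue, Animal] = {
    case obj: JObject if (obj \ "name") != JNothing   => obj.extract[Dog]
    case obj: JObject if (obj \ "canFly") != JNothing => obj.extract[Bird]
  }

  // The JsonDSL implicits turn a (String, value) pair into a JObject.
  val serialize: PartialFunction[Any, JValue] = {
    case Dog(name)    => ("name" -> name)
    case Bird(canFly) => ("canFly" -> canFly)
  }

  (deserialize, serialize)
})

object FieldBasedDemo {
  implicit val formats: Formats = DefaultFormats + new FieldBasedAnimalSerializer

  def readAnimals(json: String): List[Animal] =
    parse(json).extract[AnimalList].animals
}
```

Because the subtype is chosen by field presence rather than by matching the exact field list, objects that carry additional fields are still accepted, which may or may not be what you want.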
