How do I avoid n+1 queries with Spring Data REST?

Question. How do I avoid n+1 queries with Spring Data REST?

Background. When querying Spring Data REST for a list of resources, each of the resulting top-level resources has links to the associated resources, as opposed to having the associated resources embedded directly in the top-level resources. For example, if I query for a list of data centers, the associated regions appear as links, like this:

{
  "links" : [ {
    "rel" : "self",
    "href" : "http://localhost:2112/api/datacenters/1"
  }, {
    "rel" : "datacenters.DataCenter.region",
    "href" : "http://localhost:2112/api/datacenters/1/region"
  } ],
  "name" : "US East 1a",
  "key" : "amazon-us-east-1a"
}

It is pretty typical, however, to want to get the associated information without having to do n+1 queries. To stick with the example above, I might want to display a list of data centers and their associated regions in a UI.

What I've tried. I created a custom query on my RegionRepository to get all the regions for a given set of data center keys:

@RestResource(path = "find-by-data-center-key-in")
Page<Region> findByDataCentersKeyIn(
    @Param("key") Collection<String> keys,
    Pageable pageable);

Unfortunately the links this query generates don't overlap with the links that the data center query above generates. Here are the links I get for the custom query:

http://localhost:2112/api/regions/search/find-by-data-center-key-in?key=amazon-us-east-1a&key=amazon-us-east-1b

{
  "links" : [ ],
  "content" : [ {
    "links" : [ {
      "rel" : "self",
      "href" : "http://localhost:2112/api/regions/1"
    }, {
      "rel" : "regions.Region.datacenters",
      "href" : "http://localhost:2112/api/regions/1/datacenters"
    }, {
      "rel" : "regions.Region.infrastructureprovider",
      "href" : "http://localhost:2112/api/regions/1/infrastructureprovider"
    } ],
    "name" : "US East (N. Virginia)",
    "key" : "amazon-us-east-1"
  }, {
    "links" : [ {
      "rel" : "self",
      "href" : "http://localhost:2112/api/regions/1"
    }, {
      "rel" : "regions.Region.datacenters",
      "href" : "http://localhost:2112/api/regions/1/datacenters"
    }, {
      "rel" : "regions.Region.infrastructureprovider",
      "href" : "http://localhost:2112/api/regions/1/infrastructureprovider"
    } ],
    "name" : "US East (N. Virginia)",
    "key" : "amazon-us-east-1"
  } ],
  "page" : {
    "size" : 20,
    "totalElements" : 2,
    "totalPages" : 1,
    "number" : 1
  }
}

The challenge seems to be that the data center query returns links that aren't particularly informative once you already understand the shape of the data. For example, I already know that the region for data center 1 is at /datacenters/1/region, so if I want actual information about which specific region is involved, I have to follow the link to get it. In particular I have to follow the link to get the canonical URI that shows up in the bulk queries that would allow me to avoid n+1 queries.

Johppah answered 8/4, 2013 at 19:3 Comment(4)
The problem isn't REST, which doesn't mandate this sort of approach at all. The problem is that the data model you're mapping into JSON is rather sparse in what it says at any point; it most certainly doesn't need to be like that. (I prefer to use XML queries because then I can send back richer structures more easily; I find that being able to distinguish attributes and content helps here.) - Codeine
Still, the real problem is that you've just slapped data structures into a serializer rather than planning what info you actually want to send back in response to each request. Methinks you might wish to revisit that. - Codeine
Agree with you Donal re: REST generally. Just to clarify, however, this is the way the Spring Data REST framework works. (It is generating the JSON representations based on backend entity definitions.) Certainly I can revisit my choice of framework, but I want to explore what I can do with it before I abandon it. - Johppah
What I would like to see, however, is something along the lines you mention: at least provide the canonical URI as the link so I can bulk query the associated resources and then connect them through the canonical URI. - Johppah

The reason Spring Data REST works like this is the following: by default, we assume every application repository is a primary resource of the REST service. Thus, if you expose a repository for an entity's related object, you get links rendered to it, and we expose the assignment of one entity to another via a nested resource (e.g. foo/{id}/bar).

To prevent this, annotate the related repository interface with @RestResource(exported = false), which prevents the entities managed by this repository from becoming top-level resources.

The more general approach is to start by letting Spring Data REST expose the resources you want managed, with the default rules applied. You can then customize the rendering and links by implementing ResourceProcessor<T> and registering your implementation as a Spring bean. The ResourceProcessor then lets you customize the data rendered, the links added to the representation, and so on.

For everything else, implement controllers manually (potentially blending into the URI space of the default controllers) and add links to them through ResourceProcessor implementations. An example of this can be seen in the Spring RESTBucks sample. The sample project uses Spring Data REST to manage Order instances and adds a custom controller for the more complex payment process. Beyond that, it adds a link to the Order resource that points to the manually implemented code.

Rabbitfish answered 9/4, 2013 at 12:42 Comment(4)
Thanks Oliver. I want the entities involved to be top-level resources with relationships between them. I looked at ResourceProcessor, but its process() method takes a Resource as a param, and I don't see how I would add an associated Resource's canonical URL without first GETting the resource using its noncanonical URL. - Johppah
To make it concrete: I want to display a table containing 20 data centers along with their associated regions. Each data center has a link like this: /datacenter/{id}/region. How do I get the regions without individually calling each /datacenter/{id}/region? I don't think I can add the Region's canonical URL as a Link because the information about the Region simply isn't available in the DataCenter's Link to the Region. Hopefully that clarifies the problem I am facing. - Johppah
Note also that this isn't a one-off. Most of my top-level resources (and there are dozens; this is a configuration management system) have associations that I want to be able to display in the list view. Is the thought that I would write custom controllers in all of those cases? - Johppah
You could let the ResourceProcessor look up the associated regions using a repository and add them to the wrapped datacenter instance. Another option is to expose a finder on the RegionRepository to retrieve all regions for a datacenter and trigger that from the client. - Rabbitfish
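
Whichever side does the lookup (a ResourceProcessor on the server or the client after a bulk finder call), the idea is the same: collect the region identifiers for the whole page, fetch the regions once, and join in memory. The following is only a rough plain-Java sketch of that idea. DataCenter, Region, Row, RegionLookup and attachRegions are hypothetical stand-ins made up for illustration, not Spring Data REST types, and the sketch assumes each data center carries its region's id.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java sketch: every type here is a hypothetical stand-in,
// not part of Spring Data REST or Spring HATEOAS.
final class BatchRegionLookup {

    static final class Region {
        final long id;
        final String key;
        Region(long id, String key) { this.id = id; this.key = key; }
    }

    static final class DataCenter {
        final long id;
        final String key;
        final long regionId; // assumed foreign key to the data center's region
        DataCenter(long id, String key, long regionId) {
            this.id = id; this.key = key; this.regionId = regionId;
        }
    }

    // One row of the list view: a data center paired with its region.
    static final class Row {
        final DataCenter dataCenter;
        final Region region;
        Row(DataCenter dataCenter, Region region) {
            this.dataCenter = dataCenter; this.region = region;
        }
    }

    // Stand-in for a repository finder that loads many regions in a single query.
    interface RegionLookup {
        List<Region> findAllById(Collection<Long> ids);
    }

    // Resolves the regions for a whole page with one lookup instead of one per row.
    static List<Row> attachRegions(List<DataCenter> page, RegionLookup regions) {
        Set<Long> regionIds = new LinkedHashSet<>();
        for (DataCenter dc : page) {
            regionIds.add(dc.regionId); // duplicates collapse, so each region is requested once
        }
        Map<Long, Region> byId = new HashMap<>();
        for (Region region : regions.findAllById(regionIds)) {
            byId.put(region.id, region);
        }
        List<Row> rows = new ArrayList<>();
        for (DataCenter dc : page) {
            rows.add(new Row(dc, byId.get(dc.regionId))); // region is null if the lookup did not return it
        }
        return rows;
    }

    public static void main(String[] args) {
        List<DataCenter> page = List.of(
                new DataCenter(1, "amazon-us-east-1a", 1),
                new DataCenter(2, "amazon-us-east-1b", 1));
        // In-memory lookup standing in for a real repository call.
        RegionLookup lookup = ids -> List.of(new Region(1, "amazon-us-east-1"));
        for (Row row : attachRegions(page, lookup)) {
            System.out.println(row.dataCenter.key + " -> " + row.region.key);
        }
    }
}
```

For a page of 20 data centers this costs two queries (one for the page, one for the regions) rather than 21. The same join works on the client if the key is the region's canonical URI instead of a numeric id.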

Spring Data REST will only create the representation you describe if the serializer that is configured inside the Jackson ObjectMapper is triggered by seeing a PersistentEntityResource, which is a special kind of Resource that is used inside Spring Data REST.

If you create a ResourceProcessor<Resource<MyPojo>> and return a new Resource<MyPojo>(origResource.getContent(), origResource.getLinks()), then the default Spring Data REST serialization machinery will not be triggered and Jackson's normal serialization rules will apply.

Note, however, that the reason Spring Data REST handles associations the way it does is that it's very difficult to arbitrarily stop traversing an object graph when serializing to JSON. By rendering associations as links, it guarantees that the serializer won't start traversing an object graph that is N levels deep, which would slow down both serialization and the transfer of the resulting representation over the wire.

Ensuring that Jackson does not try to serialize a PersistentEntityResource, which is what it does in the default configuration, will ensure that none of the Spring Data REST handling of associations is triggered. The downside, of course, is that none of Spring Data REST's helpers will be triggered either. If you still want links to the associated resources, you'll have to create those yourself and add them to the outgoing plain Resource.

Rachele answered 10/4, 2013 at 13:22 Comment(0)
