Saving data into multiple collections in DocumentDb
Asked Answered
F

1

5

In DocumentDb, what is the best way and place to decouple data in order to save them in separate collections?

So far, most of the examples of how to manage data with DocumentDb use simple objects but in real life, we hardly ever do. I just want to understand how and where I need to handle my complex classes before I save them as Json objects in DocumentDb.

Let's look at the following example. I'll be saving my project information into the Projects collection but I do NOT want to save full names of people in the project team within a project document. I just want to save their EmployeeId's in the project document. I have a separate Employees collection where I want to save person/employee specific information. My project object looks like this:

public class Project
{
   [JsonProperty(PropertyName="id")]
   public int ProjectId {get; set;}

   [JsonProperty(PropertyName="projectName")]
   public string ProjectName {get; set;}

   [JsonProperty(PropertyName="projectType")]
   public string ProjectType {get; set;}

   [JsonProperty(PropertyName="projectTeam")]
   public List<TeamMember> ProjectTeam {get; set}
}

My TeamMember class inherits from Employee object and looks like this:

public class TeamMember : Employee
{
   [JsonProperty(PropertyName="position")]
   public string Position {get; set;}
}

My Employee class looks like this:

public class Employee
{
   [JsonProperty(PropertyName="id")]
   public int EmployeeId {get; set;}

   [JsonProperty(PropertyName="firstName")]
   public string FirstName {get; set;}

   [JsonProperty(PropertyName="lastName")]
   public string LastName {get; set;}

   [JsonProperty(PropertyName="gender")]
   public string Gender {get; set;}

   [JsonProperty(PropertyName="emailAddress")]
   public string EmailAddress {get; set;}
}

Before saving to Projects collection, here's an example of what my Project document should look like:

{
   id: 12345,
   projectName: "My first project",
   projectType: "Construction Project",
   projectTeam: [
      { id: 7777, position: "Engineer" },
      { id: 8998, position: "Project Manager" }
   ]
}

As you can see, I decoupled my project information from the employee data in order to store them in their own collections, Projects and Employees collections respectively.

Let's not get into why I should or should not decouple data. I just want to see how and where I should handle decoupling in order to produce fastest results. I want to follow the best practices so I just want to see how experts working with DocumentDb handle this scenario.

I can think of two places to handle this but I want to understand if there's a better, more direct way to do this:

  1. I can convert my Project class into a JSON object within my C# code and pass the JSON object to DocumentDb for storage.
  2. Alternatively, I can pass my Project object directly to DocumentDb, into a JavaScript stored procedure and I can handle decoupling and storing data in two or more collections within DocumentDb.

Here's what I'd like to know:

  1. Which is the right place to handle decoupling data?
  2. Which would provide better performance?
  3. Is there a better way to handle this? I keep reading about how I can just pass my POCO classes to DocumentDb and it will just handle them for me. Would DocumentDb handle this more complex scenarios? If so, how?

I appreciate your help. Thank you.

Fustian answered 11/9, 2014 at 5:22 Comment(1)
Possible duplicate of Single or Multiple Entities Per Collection in DocumentDBWeld
L
7

in a NoSql store such as this, you can store different types of documents with different schemas in the same collection.

please do not treat collections as tables. think of collections rather as units of partition and boundaries for execution of queries, transactions etc.

so with that in mind, there is nothing wrong with storing your project document as shown and including the employee documents in to the same collection.

now saying all of that; if you still wanted to do this, then you can ... in order to achieve this your project object would have to change. instead of having TeamMember : Employee (which would include the entire Employee object) have the TeamMember object mimic what you want from your JSON ... i.e.

class TeamMember
{
   int id {get;set;}
   string position {get;set;}
}

Now when DocumentDB serializes your project object you would end up with JSON that resembled what you wanted. And then you could separately save your Employee object somewhere else.

If you don't want to do this, or can't do this because you don't control the definition of the Model or because other parts of the system are built to depend on this already then you could investigate building a custom JSON converter for your Project object that would spit out the JSON you wanted. Then decorate your Project object with that JsonConverter and when DocumentDB does the conversion the correct result would be created each time.

Lotz answered 11/9, 2014 at 18:2 Comment(2)
Ryan, you're right! I was treating collections as tables. How would I tell what type of document I want to query if I store both projects and employees in the same collection?Fustian
The easiest way to do this today would be to add a "type" attribute to each JSON doc and include this in the query. WHERE type=project or WHERE type=employee I know you're only limited to 3 clauses in the WHERE today, but this is likely to be changed any day now to allow more. We'd love to hear if of other ways that would make this easier for you, so please keep the feedback coming.Lotz

© 2022 - 2024 — McMap. All rights reserved.