I want to implement a taxonomy structure (geo terms) for my Node.js application with a NoSQL database. I had a similar taxonomy structure in MySQL, but it's time to move forward and learn something new, so I decided to try a different approach and use a NoSQL (document-oriented) database for my test app. The taxonomy structure is simple. There are five levels: country (e.g. United Kingdom) → region (England) → county (Merseyside) → city/town/village (Liverpool) → part of the city (Toxteth).
The obvious choice is a tree structure, but the devil is in the detail: historically, some cities and towns belonged to other counties. The idea is to tag people with the city or town they were born in and to filter them later by geo tags, so I have to respect the fact that Liverpool and Manchester (among others) were part of Lancashire at the time some people were born. Otherwise the results users get from my geo filter will be incorrect.
Example: John Doe was born in Blackburn (Lancashire) back in 1957. Paul Brown was born in 1960 in Liverpool (Lancashire, now Merseyside). Georgia Doe (nee Jones) was born in Wirral (Cheshire, now Merseyside) 5 years later. Their son Ringo was born in Liverpool (Merseyside by that time) in 1982.
John is a Lancastrian by birth, Paul is both a Lancastrian and a Merseysider, Georgia is from Cheshire and Merseyside at the same time, and Ringo is from Merseyside. So they should be categorized accordingly when I search by county. But with a simple one-to-many structure that follows the modern structure of the country, they'll never be filtered as they should be.
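To make the problem concrete, here is a minimal sketch in plain JavaScript (no database involved). The names `modernCounty`, `people` and `bornInCounty` are made up for illustration, and the lookup deliberately knows only the modern county of each town:

```javascript
// Hypothetical one-to-many lookup: every town has exactly one (modern) county.
const modernCounty = {
  Blackburn: 'Lancashire',
  Liverpool: 'Merseyside',
  Wirral: 'Merseyside',
};

// The four people from the example above, with their birth years.
const people = [
  { name: 'John', born: 1957, town: 'Blackburn' },
  { name: 'Paul', born: 1960, town: 'Liverpool' },
  { name: 'Georgia', born: 1965, town: 'Wirral' },
  { name: 'Ringo', born: 1982, town: 'Liverpool' },
];

// Filter people by the modern county of their birthplace.
function bornInCounty(county) {
  return people
    .filter((p) => modernCounty[p.town] === county)
    .map((p) => p.name);
}

console.log(bornInCounty('Lancashire')); // [ 'John' ], Paul is missing
console.log(bornInCounty('Cheshire'));   // [], Georgia is missing
console.log(bornInCounty('Merseyside')); // [ 'Paul', 'Georgia', 'Ringo' ]
```

The Merseyside filter happens to work, but the Lancashire and Cheshire filters lose Paul and Georgia, which is exactly the incorrect result described above.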
How can I implement this collection in a NoSQL (primarily document-oriented) database while respecting the complexity of its structure? I googled it and searched the Stack* sites, but I still have no clue what to do next. In my opinion, there are a few possible ways to solve it:
Use an SQL-like data structure:
[ {'name': 'United Kingdom', 'unique_id': 1}, {'name': 'England', 'unique_id': 2, 'parents': [1]}, {'name': 'Merseyside', 'unique_id': 3, 'parents': [2]}, {'name': 'Lancashire', 'unique_id': 4, 'parents': [2]}, {'name': 'Liverpool', 'unique_id': 5, 'parents': [3, 4]} ]
Use a tree structure with some references:
{'name': 'United Kingdom', 'unique_id': 1, 'children': [ {'name': 'England', 'unique_id': 2, 'children': [ {'name': 'Merseyside', 'unique_id': 3, 'children': [ {'name': 'Liverpool', 'unique_id': 5, 'alternate_parents': [4]} ]}, {'name': 'Lancashire', 'unique_id': 4} ]} ]}
Use a tree structure with no references (one-to-many) and add an "alternate parent" tag to a document manually:
{'name': 'United Kingdom', 'unique_id': 1, 'children': [ {'name': 'England', 'unique_id': 2, 'children': [ {'name': 'Merseyside', 'unique_id': 3, 'children': [ {'name': 'Liverpool', 'unique_id': 5} ]}, {'name': 'Lancashire', 'unique_id': 4} ]} ]}
Stick with SQL.
Try to implement a database-less taxonomy.
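For what it's worth, here is a rough sketch of how the first option (a flat collection with several parents per term) might be queried. It is only an illustration in plain JavaScript, not a recommendation for any particular database. Everything beyond the five terms listed above is an assumption: the extra terms with ids 6 to 8, the `until` field on a parent link, the `birthplace` field, and the helper names. The `until: 1974` value reflects the 1974 county boundary change and is simplified to whole years.

```javascript
// Flat collection, as in the first option. A parent link without `until` is
// the current parent; `until` (hypothetical) marks a historical parent that
// applies only to births before that year.
const terms = [
  { id: 1, name: 'United Kingdom', parents: [] },
  { id: 2, name: 'England', parents: [{ id: 1 }] },
  { id: 3, name: 'Merseyside', parents: [{ id: 2 }] },
  { id: 4, name: 'Lancashire', parents: [{ id: 2 }] },
  { id: 5, name: 'Liverpool', parents: [{ id: 3 }, { id: 4, until: 1974 }] },
  { id: 6, name: 'Cheshire', parents: [{ id: 2 }] },
  { id: 7, name: 'Wirral', parents: [{ id: 3 }, { id: 6, until: 1974 }] },
  { id: 8, name: 'Blackburn', parents: [{ id: 4 }] },
];
const termById = new Map(terms.map((t) => [t.id, t]));

// People are tagged with the id of the town they were born in.
const people = [
  { name: 'John', born: 1957, birthplace: 8 },
  { name: 'Paul', born: 1960, birthplace: 5 },
  { name: 'Georgia', born: 1965, birthplace: 7 },
  { name: 'Ringo', born: 1982, birthplace: 5 },
];

// Names of the parents that apply to a person: the current parent, plus any
// historical parent whose link was still valid in the birth year.
function parentNamesAtBirth(person) {
  return termById.get(person.birthplace).parents
    .filter((link) => link.until === undefined || person.born < link.until)
    .map((link) => termById.get(link.id).name);
}

// Everyone who should show up when filtering by the given county.
function bornIn(county) {
  return people
    .filter((p) => parentNamesAtBirth(p).includes(county))
    .map((p) => p.name);
}

console.log(bornIn('Lancashire')); // [ 'John', 'Paul' ]
console.log(bornIn('Merseyside')); // [ 'Paul', 'Georgia', 'Ringo' ]
console.log(bornIn('Cheshire'));   // [ 'Georgia' ]
```

Note that a bare `parents: [3, 4]` on Liverpool would also put Ringo (born 1982) under Lancashire, so some kind of date information on the link, or a manual tag on the person, seems to be needed either way.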
Please give me some advice on this. I'm a newbie with NoSQL (I haven't designed any such databases yet), so this is a real design issue for me.
And I'm new to stack* so feel free to correct me if I did anything wrong with this post :) Thank you!
EDIT: I've chosen @Jonathan's answer as the solution. I think it suits my needs better (there will be other documents to store in my database and tag with those terms), especially with the mapReduce functionality suggested by @Valentyn.
But if your app doesn't need document collections, a graph database (based on relationships, not documents), as suggested by @Philipp, is probably the best solution.