Both approaches are standard for this. The first is also the standard approach in RDBMS technologies (file-based translations are another method, but that is not possible here).
As for which is best here, I am leaning towards the second, considering your use case.
Some of the reasons would be:
- One single document load for all translations and product data, no JOINs
- A single contiguous read from disk
- Atomic updates when adding new languages or changing the translations of a single product
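To make the second option concrete, here is a minimal sketch of what such a document could look like, using plain Python dicts rather than a real driver. The field names (`price`, `translations`), the SKU and the helper functions are all assumptions made for illustration, not something taken from your schema.

```python
# Hypothetical product document for the embedded-translations layout.
# Every field name and value here is an illustrative assumption.
product = {
    "_id": "sku-1001",
    "price": {"EUR": 19.99, "GBP": 17.49},
    "translations": {
        "en": {"name": "Desk lamp", "description": "A small LED desk lamp."},
        "fr": {"name": "Lampe de bureau", "description": "Une petite lampe LED."},
        "es": {"name": "Lámpara de escritorio", "description": "Una lámpara LED pequeña."},
    },
}


def localized(doc, lang, fallback="en"):
    """Return the translation for lang, or the fallback language if missing."""
    translations = doc["translations"]
    return translations.get(lang, translations[fallback])


def add_language(doc, lang, fields):
    """Add or replace one language in place.

    Against a real database this would be one single-document update,
    which is what makes the change atomic for that product.
    """
    doc["translations"][lang] = fields


add_language(product, "de", {"name": "Schreibtischlampe", "description": "Eine kleine LED-Lampe."})
print(localized(product, "fr")["name"])  # translation exists
print(localized(product, "it")["name"])  # no Italian, falls back to English
```

Everything a page needs (price plus every language) comes back from one document load, which is the main attraction of this layout.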
But it creates some downsides:
- Updating could (and probably will) create fragmentation, which can be remedied to some extent (not completely) by usePowerOf2Sizes
- All your ops will now go to one single part of your hard disk, which may create a bottleneck. However, your scenario is such that you update rarely, if at all, so this shouldn't be a problem.
As a side note: I judge that fragmentation might not be too much of a problem for you. The reason is that you only really bulk import products, probably from a CSV, so your documents will probably not regularly outgrow the power-of-2 allocation they received at insertion. As such, this point might be moot.
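The reasoning about power-of-2 allocation can be illustrated with a little arithmetic. This is a deliberately simplified model that assumes each record gets the smallest power-of-2 slot that fits it, with a 32-byte minimum. The real allocator differs between server versions and storage engines, and the function names are made up for this sketch.

```python
def power_of_two_allocation(doc_bytes, minimum=32):
    """Smallest power of two >= doc_bytes (simplified allocation model)."""
    size = minimum
    while size < doc_bytes:
        size *= 2
    return size


def growth_headroom(doc_bytes):
    """Bytes a document can grow before it no longer fits its slot."""
    return power_of_two_allocation(doc_bytes) - doc_bytes


# Hypothetical document sizes at insertion time, in bytes.
for inserted in (900, 3000, 5000):
    print(inserted, power_of_two_allocation(inserted), growth_headroom(inserted))
```

Under this model a document only has to move (and leave a hole behind) when an update pushes it past its headroom, which a bulk-imported catalogue should rarely do.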
So overall, if planned right, the second option is a good one. However, there are some considerations to take into account:
- Could the multiple descriptions/fields push the document past the 16MB document size limit?
- How do you manually pad the document to use space efficiently and prevent fragmentation?
Those are your biggest concerns if you go with the second option.
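Both concerns can be checked roughly before you import anything. The sketch below uses the UTF-8 length of the JSON form as a stand-in for the stored size, which is only an approximation of the real BSON size. The `_padding` field name, the helper names and the sample document are assumptions for illustration. The padding idea is to insert the document with a throwaway field and remove that field afterwards, so the record keeps spare room for growth.

```python
import json

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # the 16MB document limit, in bytes


def approx_size(doc):
    """Rough size estimate: UTF-8 length of the JSON form, not exact BSON."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def with_padding(doc, target_bytes, field="_padding"):
    """Return a copy of doc whose estimated size is at least target_bytes.

    The filler goes into a throwaway field that would be removed again
    right after the insert. The original doc is left untouched.
    """
    padded = dict(doc)
    missing = target_bytes - approx_size(padded)
    if missing > 0:
        padded[field] = "x" * missing
    return padded


# Hypothetical product with two translated descriptions.
doc = {
    "_id": "sku-1001",
    "translations": {
        "en": {"description": "A small LED desk lamp. " * 20},
        "fr": {"description": "Une petite lampe LED. " * 20},
    },
}

print(approx_size(doc), "of", MAX_DOCUMENT_BYTES, "bytes")
print(approx_size(with_padding(doc, 4096)))
```

Running the estimate over your largest products during the CSV import would tell you quickly whether the limit is a realistic worry.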
Considering that you can fit all of the works of Shakespeare into 4MB with room to spare, I am not sure you will ever reach the 16MB limit. If you do, it would take some considerable text, and maybe storing the images as binary inside the document.
Coming back to the first option, your largest concern will be duplication of certain data, e.g. price (France and Spain both use the Euro), unless you use two kinds of document: one to house the common data and one per translation (this will actually make 4 documents, but only two queries).
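As a rough sketch of that split, the example below uses two in-memory dicts as stand-ins for two collections: one common document holding the price, and one document per translation. The names (`products`, `translations`, `load_product`) and the three languages are assumptions for illustration only.

```python
# Hypothetical stand-in for the collection of common (untranslated) data.
products = {
    "sku-1001": {"_id": "sku-1001", "price": {"EUR": 19.99}},
}

# Hypothetical stand-in for the translations collection,
# keyed by (product id, language).
translations = {
    ("sku-1001", "en"): {"lang": "en", "name": "Desk lamp"},
    ("sku-1001", "fr"): {"lang": "fr", "name": "Lampe de bureau"},
    ("sku-1001", "es"): {"lang": "es", "name": "Lámpara de escritorio"},
}


def load_product(sku, lang):
    """Build one localized product from two lookups (two queries)."""
    common = products[sku]            # query 1: shared data such as price
    text = translations[(sku, lang)]  # query 2: the requested language
    return {**common, **text}


french = load_product("sku-1001", "fr")
print(french["name"], french["price"]["EUR"])
print(len(products) + len(translations), "documents stored for one product")
```

The price lives in exactly one place, so France and Spain cannot drift apart, at the cost of a second lookup per product.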
Considering that this catalogue will never be updated except in bulk, duplicated data will not matter too much (however, with future expansion in mind, I would be cautious), so:
- You can use one document per translation and not worry about updating prices atomically across all regions
- You have one disk read without the fragmentation
- No need to manually pad your documents
So both options are viable, but I am leaning towards the second.