@Joachim Bøggild linked to Mike Williamson's article: https://mikewilliamson.wordpress.com/2015/07/16/data-modeling-with-arangodb/
I agree with Williamson that "Compact by default" is generally the way to go. You can extract vertices (a.k.a. nodes) from properties if and when the need actually emerges. It also avoids creating an overly interconnected graph structure, which would slow down traversal queries.
However, in this case, I think Tag vertices (i.e. "documents", in your terminology) are worth having, because you can then store metadata on the tag (like a count) and connect it to other tags and sub-tags. That seems very useful, and immediately foreseeable, in the particular case of tags. A vertex that you can add more relationships to when you need them is also very extensible, so it keeps your future options open (or at least easier to reach).
It seems Williamson agrees that Tags warrant special consideration:
"But not everything belongs together. Any attribute that contains a
complex data structure (like the “comments” array or the “tags” array)
deserves a little scrutiny as it might make sense as a vertex (or
vertices) of its own."
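To make "extract vertices from properties when the need emerges" concrete, here is a minimal in-memory sketch in plain Python. It is not ArangoDB code: the function name `extract_tag_vertices`, the `count` attribute, and the sample documents are all made up for illustration. It only shows the shape of the data before and after the split.

```python
from collections import Counter


def extract_tag_vertices(documents):
    """Split embedded 'tags' arrays into tag vertices and document->tag edges.

    `documents` maps a document key to a dict that may hold a 'tags' list.
    Returns (documents without 'tags', tag vertices, edges).
    All names here are hypothetical; this is not a database API.
    """
    counts = Counter()
    edges = []
    stripped = {}
    for key, doc in documents.items():
        # dict.fromkeys de-duplicates while preserving order, so a tag
        # repeated inside one document is only counted once.
        for name in dict.fromkeys(doc.get("tags", [])):
            counts[name] += 1
            edges.append((key, name))
        stripped[key] = {k: v for k, v in doc.items() if k != "tags"}
    # Each tag becomes its own vertex and can now carry metadata.
    tag_vertices = {name: {"count": n} for name, n in counts.items()}
    return stripped, tag_vertices, edges


# "Compact by default": tags embedded as a property array.
compact_docs = {
    "doc1": {"title": "Modeling with graphs", "tags": ["graph", "modeling"]},
    "doc2": {"title": "Traversal basics", "tags": ["graph"]},
}

# Later, once tags need metadata or relationships of their own:
docs, tags, tagged_with = extract_tag_vertices(compact_docs)
```

After the split, `tags` holds one entry per tag with its usage count, and `tagged_with` holds the edges; sub-tag edges could be added the same way as a second edge list.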
The original question by @ropeladder raises the main objection: that separate tag vertices would add overhead (an extra query). I think it might be premature optimization to think too much about performance at this stage. After all, the extra query might be fast, or it might be joined with and included in the original query. In any case, I would quote this:
“In general, it’s bad practice to try to conflate nodes to preserve
query-time efficiency. If we model in accordance with the questions we
want to ask of our data, an accurate representation of the domain will
emerge. Graph databases maintain fast query times even when storing
vast amounts of data. Learning to trust our graph database is
important when learning to structure our graphs without denormalizing
them.”
(From page 64, chapter 'Avoiding Anti-patterns', of the book 'Graph Databases', co-written by Emil Eifrem, co-founder of Neo4j, another very popular native graph database. The book is free and available online here: https://neo4j.com/graph-databases-book/)
See also this article on some anti-patterns (dense vs sparse graphs), to supplement Williamson's points: https://neo4j.com/blog/dark-side-neo4j-worst-practices/
Extra section included for completeness, for those who want to dive a little deeper into this question:
Answering Williamson's own criteria for deciding whether something should be a vertex/node on its own, instead of leaving it as a property on the document vertex:
Will it be accessed on its own? (i.e. showing tags without the document)
Yes. Browsing tags available in the system could be useful.
Will you be running a graph measurement (like GRAPH_BETWEENNESS) on it?
Unsure. Likely not.
Will it be edited on its own?
Yes, probably. A user could edit it separately. Maybe an admin/moderator wants to clean up the tag names (correct spelling errors), or clean up their structure (if you have sub-tags).
Do/could the tags have relationships of their own? (assuming you care)
Yes, they could: sub-tags, or connections to kinds of content other than documents. It's also very useful to be able to click a tag and immediately see all documents with that tag. That would presumably be sub-optimal with tags stored as a property array on each document, whereas a graph database is fundamentally optimized for querying vertices adjacent to other vertices.
Would/should this attribute exist without its parent vertex?
Yes. A tag could/should exist even if the last tagged document was deleted. Someone might want to use that tag later on, and it represents domain information you might want to preserve.
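Two of the answers above (finding all documents for a tag, and a tag outliving its last document) can be sketched in plain Python. This is only an in-memory illustration, not how ArangoDB stores or queries data: the class `TagGraph`, its methods, and `documents_for_scan` are hypothetical names. A real database may also be able to index embedded arrays, so the scan version shows the unindexed case only.

```python
from collections import defaultdict


def documents_for_scan(compact_docs, tag_name):
    """Embedded-array model: every document has to be inspected."""
    return sorted(
        key for key, doc in compact_docs.items()
        if tag_name in doc.get("tags", [])
    )


class TagGraph:
    """Tags as vertices, with document<->tag edges kept as adjacency sets."""

    def __init__(self):
        self.documents = {}
        self.tags = {}
        self.docs_by_tag = defaultdict(set)  # tag name -> document keys
        self.tags_by_doc = defaultdict(set)  # document key -> tag names

    def add_document(self, key, title, tag_names=()):
        self.documents[key] = {"title": title}
        for name in tag_names:
            # Create the tag vertex on first use; it can hold metadata later.
            self.tags.setdefault(name, {"name": name})
            self.docs_by_tag[name].add(key)
            self.tags_by_doc[key].add(name)

    def documents_for(self, tag_name):
        """Follow the edges of one tag vertex; no scan over all documents."""
        return sorted(self.docs_by_tag.get(tag_name, ()))

    def delete_document(self, key):
        """Remove the document and its edges, but keep the tag vertices."""
        self.documents.pop(key, None)
        for name in self.tags_by_doc.pop(key, set()):
            self.docs_by_tag[name].discard(key)


graph = TagGraph()
graph.add_document("doc1", "Modeling with graphs", ["graph"])
graph.add_document("doc2", "Traversal basics", ["graph", "modeling"])

graph_docs = graph.documents_for("graph")
graph.delete_document("doc2")
modeling_survives = "modeling" in graph.tags
```

In this sketch, deleting `doc2` removes its edges, while the `modeling` tag vertex stays available for later use, which matches the last answer above.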