One way to evaluate a word2vec model is to develop a "ground truth" set of word pairs. The ground truth represents words that should ideally be closest together in vector space. For example, if your corpus is related to customer service, the vectors for "dissatisfied" and "disappointed" should ideally have a small Euclidean distance or a large cosine similarity.
Create a table of ground-truth pairs, say 200 of them, covering the word pairs that matter most for your industry or topic. To assess which word2vec model is best, calculate the distance for each pair under each model and sum the 200 distances. The model with the smallest total distance is the best one (or the largest total, if you sum cosine similarities instead).
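The procedure could be sketched roughly as follows. This is only an illustration under assumptions: each "model" here is a plain dict mapping a word to its vector (with gensim you would look vectors up on the trained model instead), the two pairs and the 2-dimensional vectors are made-up toy data, and the function names are hypothetical. It uses cosine distance (1 minus cosine similarity), so smaller totals are better.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def total_pair_distance(vectors, pairs):
    """Sum the distance over every ground-truth pair for one model."""
    return sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)

def best_model(models, pairs):
    """Score every model and return (name of lowest-scoring model, all scores)."""
    scores = {name: total_pair_distance(vecs, pairs) for name, vecs in models.items()}
    return min(scores, key=scores.get), scores

# Toy ground truth: in practice this would be your ~200 domain-specific pairs.
ground_truth_pairs = [
    ("dissatisfied", "disappointed"),
    ("refund", "reimbursement"),
]

# Toy stand-ins for two trained models (assumption: word -> vector dicts).
models = {
    "model_a": {
        "dissatisfied": [1.0, 0.0], "disappointed": [1.0, 0.0],
        "refund": [0.0, 1.0], "reimbursement": [0.0, 1.0],
    },
    "model_b": {
        "dissatisfied": [1.0, 0.0], "disappointed": [0.0, 1.0],
        "refund": [0.0, 1.0], "reimbursement": [1.0, 0.0],
    },
}

winner, scores = best_model(models, ground_truth_pairs)
print(winner, scores)
```

Note that a pair containing a word missing from a model's vocabulary raises a `KeyError` in this sketch. You would need to decide whether to skip such pairs or penalize the model, and apply the same rule to every model so the totals stay comparable.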
I like this way better than the "eyeball" method, whatever that means.