I have not been able to find any discussion that says one method of mergesort should be faster than the other.
Bottom-up and top-down merge sorts, as well as other variants, have been well studied during the 90s. In a nutshell, if you measure the cost as the number of comparisons of individual keys, the best costs are the same (~ (n lg n)/2), the worst cost of top-down is lower than or equal to the worst case of bottom-up (but both ~ n lg n) and the average cost of top-down is lower than or equal to the average case of bottom-up (but both ~ n lg n), where "lg n" is the binary logarithm. The differences stem from the linear terms. Of course, if n=2^p, the two variants are in fact exactly the same. This means that, comparison-wise, top-down is always better than bottom-up. Furthermore, it has been proved that the "half-half" splitting strategy of top-down merge sort is optimal. The research papers are from Flajolet, Golin, Panny, Prodinger, Chen, Hwang and Sedgewick.
Here is what I came up in my book Design and Analysis of Purely Functional Programs (College Publications, UK), in Erlang:
tms([X|T=[_|U]]) -> cutr([X],T,U);
tms(T) -> T.
cutr(S,[Y|T],[_,_|U]) -> cutr([Y|S],T,U);
cutr(S, T, U) -> mrg(tms(S),tms(T)).
mrg( [], T) -> T;
mrg( S, []) -> S;
mrg(S=[X|_],[Y|T]) when X > Y -> [Y|mrg(S,T)];
mrg( [X|S], T) -> [X|mrg(S,T)].
Note that this is not a stable sort. Also, in Erlang (and OCaml), you need to use aliases (ALIAS=...) in the patterns if you want to save memory. The trick here is to find the middle of the list without knowing its length. This is done by cutr/3 which handles two pointers to the input list: one is incremented by one and the other by two, so when the second reaches the end, the first one is in the middle. (I learnt this from a paper by Olivier Danvy.) This way, you don't need to keep track of the length and you don't duplicate the cells of the second half of the list, so you only need (1/2)n lg n extra space, versus n lg n. This is not well known.
It is often claimed that the bottom-up variant is preferable for functional languages or linked list (Knuth, Panny, Prodinger), but I don't think this is true.
I was puzzled like you by the lack of discussion on merge sorts, so I did my own research and wrote a large chapter about it. I am currently preparing a new edition with more material on merge sorts.
By the way, there are other variants: queue merge sort and on-line merge sort (I discuss the latter in my book).
[EDIT: As the measure for the cost is the number of comparisons, there is no difference between choosing an array versus a linked list. Of course, if you implement the top-down variant with linked lists, you have to be clever, as you don't necessarily know the number of keys, but you'll need to traverse a least half the keys, each time, and reallocate, in total (1/2)n lg n cells (if you are clever). Bottom-up merge sort with linked lists actually requires more extra memory, n lg n + n cells. So, even with linked lists, the top-down variant is the best choice. As far as the length of the program goes, your mileage may vary, but in a functional language, top-down merge sort can be made shorter than bottom-up, if stability is not required. There are some papers that discuss implementations issues of merge sort, like in-place (for which you need arrays), or stability etc. For instance, A Meticulous Analysis of Mergesort Programs, by Katajainen and Larsson Traff (1997).]