The OS and MMU page management responsibilities are 2 sides of the same mechanism, that lives on the boundary between architecture and micro-architecture.
The first side defines the "contract" between the hardware and the software that runs over it (in this case - the OS) - if you want to use virtual memory, you need build and maintain a page table as described in that contract.
The MMU side, on the other hand, is a hardware unit that's responsible for performing the HW tasks of the address translation. This may or may not include hardware optimizations, these are usually hidden and may be implemented in various ways to run under the hood, as long as it maintains the hardware side of the contract.
In theory, the MMU may decide to issue a set of memory accesses for each translation (a page walk), in order to achieve the required behavior. However, since it's a performance critical element, most MMUs optimize this by caching the results of previous page walks inside the TLB, just like a cache stores the results of previous accesses (actually, on some implementations, the caches themselves may also store some of the accesses to the page table since it usually resides in cacheable memory). The MMU can manage multiple TLBs (most implementations separate the ones for data and code pages, and some have 2nd level TLBs), and provide the translation from there without you noticing that except for the faster access time.
It should also be noted that the hardware must guard against many corner cases that can harm the coherency of such TLB "caching" of previous translations, for example page aliasing or remaps during usage. On some machines, the nastier cases even require a massive flush flow called TLB shootdown.