Bump mapping was originally proposed by Jim Blinn back in 1978. His method works by perturbing the surface normal at each texel, using the height of that texel and the heights of the surrounding texels.
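To make the idea concrete, here is a rough sketch of deriving a perturbed normal from a height map with central differences. It is not Blinn's exact formulation. The function name bumped_normal, the strength parameter, the flat +Z base normal and the edge clamping are all assumptions made for illustration.

```python
import math

def bumped_normal(height, x, y, strength=1.0):
    """Perturb a flat +Z normal using the heights around texel (x, y).

    height is a list of rows of floats; strength is a hypothetical scale knob.
    """
    rows, cols = len(height), len(height[0])
    # Clamp neighbour lookups at the texture edges (an assumption; wrapping also works).
    left = height[y][max(x - 1, 0)]
    right = height[y][min(x + 1, cols - 1)]
    up = height[max(y - 1, 0)][x]
    down = height[min(y + 1, rows - 1)][x]
    # Central differences approximate the slope along u and v.
    du = (right - left) * 0.5 * strength
    dv = (down - up) * 0.5 * strength
    # Tilt the normal away from the uphill direction, then renormalise.
    nx, ny, nz = -du, -dv, 1.0
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)
```

On a flat height map this returns the unperturbed normal (0, 0, 1). On a ramp that rises along x, the normal leans towards negative x.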
This is quite similar to DUDV bump mapping (you may recall that the original environment-mapped bump mapping, as introduced in DX6, was DUDV). It works by pre-calculating the derivatives from above, so that you can skip the first stage of the calculation at render time (as it does not change each frame).
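As a rough illustration of the pre-calculation step, the sketch below builds a table of (du, dv) deltas once and then uses a stored delta to nudge a lookup coordinate. The names build_dudv_map and perturbed_lookup are hypothetical, and the single scalar scale is a simplifying assumption standing in for whatever transform the hardware applies to the deltas.

```python
def build_dudv_map(height):
    """Precompute (du, dv) height differences for every texel, once, ahead of rendering."""
    rows, cols = len(height), len(height[0])
    dudv = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Forward differences, clamped at the right and bottom edges.
            du = height[y][min(x + 1, cols - 1)] - height[y][x]
            dv = height[min(y + 1, rows - 1)][x] - height[y][x]
            row.append((du, dv))
        dudv.append(row)
    return dudv

def perturbed_lookup(u, v, dudv_texel, scale=1.0):
    """Offset an environment-map lookup coordinate by the stored deltas."""
    du, dv = dudv_texel
    return (u + du * scale, v + dv * scale)
```

Per frame, only perturbed_lookup needs to run. The differencing in build_dudv_map happens once when the texture is created.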
Normal mapping is a very similar technique that works by simply replacing the normal at each texel position with one stored in a texture. Conceptually it's much simpler.
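A minimal sketch of what "replacing the normal" looks like, assuming the common convention of packing a normal into 8-bit RGB so that 0..255 maps to -1..1. The function names decode_normal and lambert are made up for this example, and light_dir is assumed to be normalised already.

```python
import math

def decode_normal(r, g, b):
    """Unpack an 8-bit RGB texel into a unit-length normal (assumed 0..255 -> -1..1)."""
    nx, ny, nz = (c / 255.0 * 2.0 - 1.0 for c in (r, g, b))
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def lambert(normal, light_dir):
    """Diffuse term: clamped dot product of the normal and a normalised light direction."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
```

The typical pale blue texel (128, 128, 255) decodes to a normal pointing almost straight out of the surface, which is why normal maps look mostly blue.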
There is another technique that produces "similar" results. It is called emboss bump mapping. This method works by using multipass rendering. Basically, you subtract a grayscale heightmap from the previous pass, but offset it by a small amount based on the light direction.
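The subtract-and-offset step can be sketched on the CPU like this. It is only an approximation of the idea. The function name emboss, the 0.5 mid-grey bias, the clamping, and the convention that (light_dx, light_dy) is a whole-texel offset pointing towards the light are all assumptions.

```python
def emboss(height, light_dx, light_dy):
    """Subtract a copy of the height map, shifted towards the light, from the original.

    (light_dx, light_dy) is an assumed integer texel offset towards the light.
    Results are biased by 0.5 so flat areas come out mid-grey.
    """
    rows, cols = len(height), len(height[0])
    out = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Sample the shifted copy, clamping at the texture edges.
            sx = min(max(x + light_dx, 0), cols - 1)
            sy = min(max(y + light_dy, 0), rows - 1)
            diff = height[y][x] - height[sy][sx]
            row.append(min(max(0.5 + diff, 0.0), 1.0))
        out.append(row)
    return out
```

Slopes that face the light come out brighter than 0.5 and slopes that face away come out darker. The result would then be combined with the base texture in a later pass.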
There are other ways of emulating surface topology as well.
Elevation mapping uses the height map as an alpha texture and then renders multiple slices through that texture, each with a different alpha value, to simulate the change in height. If not performed correctly, however, the slices can be very visible.
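One way to picture the slicing is as a per-texel alpha test against a rising threshold. This is only a sketch. The function name visible_slices and the evenly spaced thresholds are assumptions, and a base polygon is assumed to sit underneath the slices.

```python
def visible_slices(height_value, num_slices):
    """Return the indices of the slices whose alpha test passes for one texel.

    Slice i is assumed to use the threshold i / num_slices; the texel is drawn
    on that slice only if its height (used as alpha) exceeds the threshold.
    """
    return [i for i in range(num_slices) if height_value > i / num_slices]
```

With only a handful of slices, every height between two thresholds lands on the same slice, which is where the visible stepping comes from.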
Displacement mapping works by generating a 3D mesh that uses the texture as its basis. This, obviously, massively increases your vertex count.
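A small sketch of why the vertex count grows: if you build one vertex per texel, the mesh size follows the texture size directly. The function name displace_grid, the one-vertex-per-texel layout and the scale parameter are assumptions for illustration. Real implementations often tessellate more coarsely or adaptively.

```python
def displace_grid(height, scale=1.0):
    """Build a grid mesh with one vertex per texel, raised along z by the height."""
    rows, cols = len(height), len(height[0])
    vertices = [(float(x), float(y), height[y][x] * scale)
                for y in range(rows) for x in range(cols)]
    triangles = []
    for y in range(rows - 1):
        for x in range(cols - 1):
            i = y * cols + x
            # Two triangles per grid cell, indexing into the vertex list.
            triangles.append((i, i + 1, i + cols))
            triangles.append((i + 1, i + cols + 1, i + cols))
    return vertices, triangles
```

Under this layout, a 256 x 256 height map turns a single quad into 65,536 vertices.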
Steep parallax mapping, relief mapping, etc. are the newest techniques. They work by casting a ray through the heightmap until it intersects the surface. This has the big advantage that a lump which should block out the texture behind it now does: the ray stops where it first hits and never reaches the heightmap behind that point, so it always displays the "closest" texel.
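The ray cast can be sketched in one dimension as a simple linear march. This is a simplified illustration rather than any particular published algorithm. The function name march_heightfield, the fixed step count, the nearest-texel lookup and the units (x in texels, z in height units from 1 at the top to 0 at the bottom) are all assumptions.

```python
def march_heightfield(height, start_x, dx, dz, steps=64):
    """Step a ray down through a 1D height field and return the first texel it hits.

    The ray starts at (start_x, z=1) and travels along (dx, dz) with dz < 0.
    """
    if dz >= 0:
        raise ValueError("the ray must point down into the surface (dz < 0)")
    x, z = float(start_x), 1.0
    dz_step = 1.0 / steps
    dx_step = dx / -dz * dz_step  # horizontal travel per depth step
    texel = 0
    for _ in range(steps + 1):
        texel = min(max(int(round(x)), 0), len(height) - 1)
        if z <= height[texel]:
            return texel  # first intersection: the "closest" texel
        x += dx_step
        z -= dz_step
    return texel  # no hit found within the step budget
```

On a flat height map a shallow ray lands on the far texel, but as soon as a tall bump sits in the way the march stops on the bump instead. That is the occlusion behaviour described above.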