A Framebuffer object is not actually a buffer. It is an aggregator object that holds one or more attachments, which in turn are the actual buffers. You can think of the Framebuffer as a C struct in which every member is a pointer to a buffer. Without any attachments, a Framebuffer object has a very small memory footprint.
Now each buffer attached to a Framebuffer can be a Renderbuffer or a texture.
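To make the struct analogy concrete, here is a minimal sketch in plain C. It is only a mental model: every type and function name below (pixel_store, attachment, framebuffer, store_create, attach, framebuffer_pixel_bytes) is hypothetical, and none of it is OpenGL's real API or how a driver actually lays things out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model for illustration; these are NOT OpenGL's real types. */
enum attachment_kind { ATTACH_NONE = 0, ATTACH_RENDERBUFFER, ATTACH_TEXTURE };

/* The actual pixel memory, owned by a renderbuffer or a texture. */
struct pixel_store {
    int width;
    int height;
    int bytes_per_pixel;
    unsigned char *bytes;
};

/* One attachment point: a tagged pointer to storage owned elsewhere. */
struct attachment {
    enum attachment_kind kind;
    struct pixel_store *store;    /* NULL while nothing is attached */
};

#define MAX_COLOR_ATTACHMENTS 4   /* arbitrary limit chosen for this sketch */

/* The framebuffer itself holds no pixels, only attachment points. */
struct framebuffer {
    struct attachment color[MAX_COLOR_ATTACHMENTS];
    struct attachment depth;
};

/* Allocate zero-filled pixel storage; returns NULL on allocation failure. */
struct pixel_store *store_create(int width, int height, int bytes_per_pixel)
{
    struct pixel_store *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->width = width;
    s->height = height;
    s->bytes_per_pixel = bytes_per_pixel;
    s->bytes = calloc((size_t)width * (size_t)height, (size_t)bytes_per_pixel);
    if (s->bytes == NULL) {
        free(s);
        return NULL;
    }
    return s;
}

void store_destroy(struct pixel_store *s)
{
    if (s != NULL) {
        free(s->bytes);
        free(s);
    }
}

/* Point an attachment at some storage, or detach with (ATTACH_NONE, NULL). */
void attach(struct attachment *point, enum attachment_kind kind,
            struct pixel_store *store)
{
    assert((kind == ATTACH_NONE) == (store == NULL));
    point->kind = kind;
    point->store = store;
}

static size_t attachment_bytes(const struct attachment *a)
{
    if (a->kind == ATTACH_NONE)
        return 0;
    return (size_t)a->store->width * (size_t)a->store->height
         * (size_t)a->store->bytes_per_pixel;
}

/* Total pixel memory reachable through the framebuffer's attachments. */
size_t framebuffer_pixel_bytes(const struct framebuffer *fb)
{
    size_t total = attachment_bytes(&fb->depth);
    for (int i = 0; i < MAX_COLOR_ATTACHMENTS; i++)
        total += attachment_bytes(&fb->color[i]);
    return total;
}
```

In this model a zero-initialized struct framebuffer reaches no pixel memory at all. Storage only shows up once you call something like attach(&fb.color[0], ATTACH_RENDERBUFFER, rb), and the same attachment slot accepts either kind of buffer.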
A Renderbuffer is an actual buffer (an array of bytes, integers, or pixels). It stores pixel values in a native, implementation-dependent format, so it is optimized for offscreen rendering. In other words, drawing to a Renderbuffer can be faster than drawing to a texture, depending on the implementation. The drawback is that, because of that native format, a Renderbuffer cannot be sampled in a shader the way a texture can, so reading from it is much harder: you have to read the pixels back (for example with glReadPixels) or copy them somewhere else first. Nevertheless, once a Renderbuffer has been painted, its content can be copied very quickly to the screen (or to another Renderbuffer) with a framebuffer blit (glBlitFramebuffer). This means that a Renderbuffer can be used to efficiently implement the double buffer pattern that you mentioned.
Renderbuffers are a relatively new concept, introduced together with Framebuffer objects. The other option is to attach a texture and render into it, which can be slower because a texture uses a standard format. Rendering to a texture is nevertheless quite useful when one needs to perform multiple passes over each pixel to build a scene, or to draw one scene onto a surface in another scene!
The OpenGL wiki has this page that shows more details and links.