For detection, I would recommend Hessian-Affine and MSER if you need invariance to factors such as viewpoint change, or FAST if you need real-time performance.
FAST does a similar job to the Harris corner detector, but runs much faster.
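To show why FAST is cheap, here is a minimal pure-Python sketch of its segment test: a pixel counts as a corner if at least n contiguous pixels on a 16-pixel circle around it are all brighter or all darker than the centre by some threshold. The function names, the list-of-lists grayscale image, and the defaults (threshold=20, n=9) are my assumptions for illustration. The sketch leaves out the high-speed pre-test and non-maximum suppression that real implementations use, so treat it as an illustration rather than a reference implementation.

```python
# Offsets (dx, dy) of the 16-pixel Bresenham circle of radius 3, in circular order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def has_contiguous_run(flags, n):
    """Return True if the circular sequence `flags` has n consecutive True values."""
    run = 0
    # Append the first n-1 flags so that runs crossing the end of the list are seen.
    for flag in flags + flags[:n - 1]:
        run = run + 1 if flag else 0
        if run >= n:
            return True
    return False


def is_fast_corner(img, x, y, threshold=20, n=9):
    """Segment test at (x, y); img is a list of rows of grayscale values.

    The caller must keep (x, y) at least 3 pixels away from the image border.
    """
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [p > center + threshold for p in ring]
    darker = [p < center - threshold for p in ring]
    return has_contiguous_run(brighter, n) or has_contiguous_run(darker, n)


def detect_fast(img, threshold=20, n=9):
    """Run the segment test on every pixel that has a full circle inside the image."""
    h, w = len(img), len(img[0])
    return [(x, y)
            for y in range(3, h - 3)
            for x in range(3, w - 3)
            if is_fast_corner(img, x, y, threshold, n)]


if __name__ == "__main__":
    # Bright quadrant on a dark background: its tip at (4, 4) is a corner.
    corner_img = [[200 if (x >= 4 and y >= 4) else 50 for x in range(9)]
                  for y in range(9)]
    # Straight vertical edge: (4, 4) lies on the edge but is not a corner.
    edge_img = [[200 if x >= 4 else 50 for x in range(9)] for y in range(9)]
    print(is_fast_corner(corner_img, 4, 4))  # True
    print(is_fast_corner(edge_img, 4, 4))    # False
```

Each pixel needs only 16 comparisons and no gradients or matrix computations, which is where the speed comes from.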
Have a look at "Local Invariant Feature Detectors: A Survey" and "A Comparison of Affine Region Detectors", where many detectors are tested and described very well.
Update: "WxBS: Wide Baseline Stereo Generalizations" presents an extended benchmark of novel and classical detectors and descriptors.
Second, the description step is usually slower than detection, so for real-time performance you have to use a GPU or a binary descriptor such as BRIEF or FREAK.
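Part of what makes binary descriptors attractive for real-time use is that they are compared with the Hamming distance, which is just an XOR followed by a bit count. Below is a minimal sketch of brute-force matching under that metric. The function names and the toy 8-bit descriptors stored as Python ints are my assumptions for illustration; real binary descriptors are typically a few hundred bits long and would be matched with an optimized library routine.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_nearest(queries, train, max_distance=None):
    """Brute-force nearest-neighbour matching under Hamming distance.

    Returns a list of (query_index, train_index, distance) tuples. Matches
    whose distance exceeds max_distance are dropped when a limit is given.
    """
    if not train:
        return []
    matches = []
    for qi, q in enumerate(queries):
        # Find the closest train descriptor for this query descriptor.
        ti, dist = min(((i, hamming(q, t)) for i, t in enumerate(train)),
                       key=lambda pair: pair[1])
        if max_distance is None or dist <= max_distance:
            matches.append((qi, ti, dist))
    return matches


if __name__ == "__main__":
    train = [0b11110000, 0b00001111, 0b10101010]
    queries = [0b11110001, 0b00101010]
    print(match_nearest(queries, train))  # [(0, 0, 1), (1, 2, 1)]
```

In practice you would usually add a ratio test or cross-check on top of this to reject ambiguous matches.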
Update2: "HPatches (Homography Patches) dataset and benchmark" and the corresponding workshop at ECCV 2016: http://www.iis.ee.ic.ac.uk/ComputerVision/DescrWorkshop/index.html
Update3: "Comparative Evaluation of Hand-Crafted and Learned Local Features": an evaluation of descriptors (and, to a lesser extent, detectors) on a large-scale 3D reconstruction task. CVPR 2017.
Update4: "Interest point detectors stability evaluation on ApolloScape dataset": a detector evaluation on an autonomous driving dataset. ECCVW 2018.
Update5: "From handcrafted to deep local invariant features": a huge survey paper covering handcrafted and learned features, 2018.
Update6: "Image Matching across Wide Baselines: From Paper to Practice": a large-scale benchmark of the above-mentioned and more recent methods on camera pose estimation. IJCV, 2020.