You might try a multi-scale histogram of oriented gradients (HOG). It won't be fully scale-invariant, but if your data are constrained to a reasonable range of scales (often the case in practice), this can work well.
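To make the idea concrete, here is a minimal sketch of the multi-scale part: compute a (deliberately simplified) HOG descriptor at each level of a coarse image pyramid and concatenate the results. This is an illustrative toy, not a drop-in HOG implementation — real HOG adds block normalization, interpolated voting, etc.

```python
import numpy as np

def hog_descriptor(img, cell=8, bins=9):
    """Simplified HOG: per-cell orientation histograms, L2-normalized per cell.
    (Real HOG also does overlapping block normalization and soft binning.)"""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180          # unsigned orientations
    bin_idx = (ang / (180.0 / bins)).astype(int) % bins
    ch, cw = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            m = mag[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            b = bin_idx[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            for k in range(bins):
                hist[i, j, k] = m[b == k].sum()
    norm = np.linalg.norm(hist, axis=2, keepdims=True) + 1e-6
    return (hist / norm).ravel()

def multiscale_hog(img, n_scales=3):
    """Concatenate HOG descriptors over a pyramid, halving resolution each level."""
    feats = []
    cur = img.astype(float)
    for _ in range(n_scales):
        feats.append(hog_descriptor(cur))
        h, w = cur.shape
        cur = cur[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.concatenate(feats)
```

In practice you would use a tuned library implementation (e.g. `skimage.feature.hog` or OpenCV's `HOGDescriptor`) per pyramid level; the point here is just that the pyramid is what buys you coverage across the scale range.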
Another approach, depending on your intended application, would be to leverage poselets, even though they are built on top of a non-scale-invariant descriptor like plain HOG, or appearance models. If the annotations in your training data include examples of the items you want to detect at a variety of scales, then the Procrustes-style distance used in poselet training should take care of much of the scale invariance for you. This may not be satisfactory, though, if your primary goal is something other than localized detection of parts.
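To see why a Procrustes-style distance gives you scale invariance essentially for free, here is a minimal sketch (my own illustration, not the poselets code) of comparing two sets of corresponding keypoint annotations after factoring out translation, scale, and rotation:

```python
import numpy as np

def procrustes_distance(X, Y):
    """Similarity-invariant distance between two (n_points, 2) keypoint sets.

    Translation, scale, and rotation are all removed before comparison, so
    two annotations of the same configuration at different scales come out
    as near-identical.
    """
    # Remove translation: center both point sets
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Remove scale: normalize to unit Frobenius norm
    X = X / np.linalg.norm(X)
    Y = Y / np.linalg.norm(Y)
    # Remove rotation: orthogonal Procrustes via SVD of the cross-covariance
    U, _, Vt = np.linalg.svd(X.T @ Y)
    R = U @ Vt
    return np.linalg.norm(X @ R - Y)
```

Because the scale is divided out before the residual is measured, training examples annotated at different sizes cluster together, which is what lets the detector generalize across the scale range present in the annotations.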
As an aside, I think it's rather unfortunate that SIFT and SURF could be patented in this way, given that the work behind them was (at least in part) funded with taxpayer dollars through grants.