Static code analysis for detecting passing a wchar_t* to BSTR
Asked Answered
C

4

14

Since a BSTR is only a typedef for wchar_t* our code base has several (many?) places where string literals are passed to a method expecting a BSTR this can mess up with marshallers or anyone who tries to use any BSTR specific method (e.g. SysStringLen).

Is there any way to statically detect this type of misuse?

I tried compiling with VC10 /Wall and with static code analysis Microsoft All Rules but the following offending piece of code doesn't get flagged by either of them.

void foo(BSTR str)  
{
    std::cout << SysStringLen(str) << std::endl; 
}

int _tmain()
{
    foo(L"Don't do that");
}

Update: After trying to vandalize wtypes.h into detecting these kinds of transgressions I've given up.

I tried two paths, both of which I got to work with my sample program above but once I tried a real project they failed.

  1. create a class named BSTR but since a VARIANT has a BSTR as a union member the new class couldn't have any constructors or assignment operators this broke every place were NULL was treated as a BSTR. I tried to replace NULL with a type that has conversion operators but after adding dozens of new operators (comparison, conversion etc.) I started to run into ambiguous calls and gave up.
  2. I then tried the way suggested by @CashCow and @Hans (makeing BSTR a typedef to another type of pointer). That didn't work either, after adding toBSTR and fromBSTR methods and littering comutil.h (_bstr_t) and other places with conversions I finally got to the point where the compiler choked at headers produced from IDLs (default values are translated to literal wide strings).

In short I've given up on trying to achieve this on my own, if anyone knows of a code analysis tool that can help I would be very happy to hear about it.

Capapie answered 24/1, 2012 at 11:33 Comment(11)
BSTR may be such a typedef but that doesn't mean it's a null-terminated string. Its string representation actually beings with the 3rd wchar_t character, the first 2 being a length prefix (thus allowing length of up to 0xFFFF). They can contain embedded nulls. I think they are usually null-terminated.Maltha
@CashCow, not exactly, the length is placed before the actual data (this is what SysStringLen looks for) that's why it's incorrect to treat a wchar_t* as a BSTR.Capapie
Yes, the length is before the data. When you call SysAllocString it returns you the 5th byte. When you pass in a BSTR you pass in the address of the 5th allocated byte.Maltha
@Maltha I understand all this, what is your point?Capapie
Making BSTR an alias for wchar_t* was quite intentional. One possible approach is to carefully make a backup copy of the SDK's wtypes.h header and edit the definition to, say, a pointer to an empty struct. Rebuild your projects and the compiler will complain loudly wherever a wchar_t* is substituted.Rosmunda
@HansPassant that would cause many other compilation errors e.g. where a BSTR is passed to wcscmp.Capapie
Yup, that was the intentional part. You can't have it both ways.Rosmunda
You write an overload of wcscmp that takes your new BSTR so that code compiles. Then you take it out again.Maltha
@Maltha and for every other string function that is used... I suppose that would work.Capapie
@HansPassant I've updated the question with the (lack of) progress I made. Any further suggestions are welcome.Capapie
It's unfortunate that typedef enum class BSTRchar : wchar_t {} *BSTR; doesn't work either... scoped enums don't have implicit upcast.Radom
A
4

I believe Coverity claims to detect these sorts of vulnerabilities. I remember them mentioning COM stuff specifically when they demo'd to a company I worked for.

Their datasheet certainly seems to imply they check for classes of improper BSTR usage. They have a demo-period; you could try it and see if it flags your sample input.

Aam answered 8/3, 2012 at 0:56 Comment(1)
After checking it out I see that COM.BSTR.CONV detects this pattern, thanks!Capapie
M
1

Are you able to change your methods to take _bstr_t or CComBSTR instead?

If not then as a literal is technically a const wchar_t *, if there is a compiler setting to not allow literal->non-const pointer conversion then you can do that.

Failing that, there is a possibility of modifying the definition of BSTR to be unsigned short *. Then if you build all your source, you will get compiler errors wherever a literal is being passed in and you can fix all this code. Then I would suggest changing it back...

Maltha answered 24/1, 2012 at 12:19 Comment(1)
Option 2 (prevent conversions from const wchar_t*) isn't enough since not all instances are literal strings. Option 3 (typedef unsigned short* BSTR) wouldn't work since we sometimes pass a BSTR to wcscmp (et al.) which doesn't accept unsigned short*Capapie
P
1

You may try compiling with Clang, it's static/dynamic analysis may find what you are looking for.

Proem answered 3/2, 2012 at 19:56 Comment(2)
Does Clang support compiling Microsoft specific C++?Capapie
Pretty sure it does now, and continually improving.Proem
P
0

Overload all functions with the BSTR and forward them with the appropriate conversion.

void foo( BSTR str )
{
    std::cout << SysStringLen(str) << std::endl; 
}

void foo( const WCHAR *str )
{
    foo( SysAllocString( str ));
}

int _tmain()
{
    foo( L"don't do this" );
    return 0;
}

Or, to generate compiler errors, change all of the parameter types from BSTR to something else and look for the errors:

typedef UINT bar;

void foo( bar _str )
{
    // make the compiler happy below
    BSTR str = (BSTR)_str;
    std::cout << SysStringLen(str) << std::endl;
}

int _tmain()
{
    foo( L"don't do this" );
    foo( (bar)42 );
    return 0;
}

error C2664: 'foo' : cannot convert parameter 1 from 'const wchar_t [14]' to 'bar'

I assume the C2664 error and the 'const wchar_t[]' type identified by the compiler is what you want the compiler to find for every internal call made to the function using the BSTR?

Proem answered 1/2, 2012 at 20:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.