How can I best write unit test cases for a Parser?

Asked 13/1, 2009 at 15:51 Answered 4/2, 2009 at 9:20

I am writing a parser which generates 32-bit opcodes for each command. For example, for the following statement:

set lcl_var = 2

my parser generates the following opcodes:

// load immdshort 2 (loads the value 2)
0x10000010
// strlocal lclvar (lcl_var is converted to an index to identify the var)
0x01000002

Please note that lcl_var can be anything, i.e., any variable name can be given. How can I write unit test cases for this? Can we avoid hard-coding the values? Is there a way to make the tests generic?

Mud answered 13/1, 2009 at 15:51 Comment(1)

Hard coding is best: the unit test should tell you very specifically where the error is in the code base. If the test is generic, the error could be in the "list of valid codes" rather than in the parser. – Bargello 14/2, 2009 at 10:23

It depends on how you structured your parser. A Unit-Test tests a single UNIT.

So, if you want to test your entire parser as a single unit, you can give it a list of commands and verify it produces the correct opcodes (which you checked manually when you wrote the test). You can write tests for each command, and test the normal usage, edge-case usage, just-beyond-edge-case usage. For example, test that:

set lcl_var = 2

results in:

0x10000010 0x01000002

And the same for 0, -1, MAX_INT-1, MAX_INT+1, ...

You know the correct result for these values. Same goes for different variables.

Lashondalashonde answered 14/1, 2009 at 10:19 Comment(0)

If your question is "How do I run the same test with different inputs and expected values without writing one xUnit test per input-output combination?", then the answer is to use something like the RowTest NUnit extension. I wrote a quick getting-started post about it on my blog recently. An example of this would be:

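using System.Collections.Generic;
using System.Text;
using NUnit.Framework;
// [RowTest] and [Row] come from the RowTest extension; also import its
// namespace, which depends on the NUnit version you use.

[TestFixture]
public class TestExpression
{
    // Each [Row] supplies one (input, expected output) pair to the test method.
    [RowTest]
    [Row(" 2 + 3 ", "2 3 +")]
    [Row(" 2 + (30 + 50 ) ", "2 30 50 + +")]
    [Row("  ( (10+20) + 30 ) * 20-8/4 ", "10 20 + 30 + 20 * 8 4 / -")]
    [Row("0-12000-(16*4)-20", "0 12000 - 16 4 * - 20 -")]
    public void TestConvertInfixToPostfix(string sInfixExpr, string sExpectedPostfixExpr)
    {
        // Expression is the class under test.
        Expression converter = new Expression();
        List<object> postfixExpr = converter.ConvertInfixToPostfix(sInfixExpr);

        // Join the terms with single spaces so they can be compared as one string.
        StringBuilder sb = new StringBuilder();
        foreach (object term in postfixExpr)
        {
            sb.AppendFormat("{0} ", term.ToString());
        }
        Assert.AreEqual(sExpectedPostfixExpr, sb.ToString().Trim());
    }
}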
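
Since the asker mentions C++, the same row-per-case idea can be sketched there with a plain table of inputs and hard-coded expected opcodes. This is only a rough sketch: parseStatement is a toy stand-in for the real parser, and its encoding, the symbol table and the variable other_var are all invented so that "set lcl_var = 2" reproduces the two opcodes from the question. Replace the expected values with ones you have checked by hand.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using Opcodes = std::vector<std::uint32_t>;

// Toy stand-in for the real parser. The encoding is INVENTED for illustration;
// only the pair for "set lcl_var = 2" is taken from the question.
Opcodes parseStatement(const std::string& line) {
    // Hypothetical symbol table: variable name -> local index.
    static const std::map<std::string, std::uint32_t> locals = {
        {"lcl_var", 2}, {"other_var", 3}};

    std::istringstream in(line);
    std::string keyword, name, equals;
    std::uint32_t value = 0;
    if (!(in >> keyword >> name >> equals >> value) || keyword != "set" ||
        equals != "=" || locals.count(name) == 0) {
        return {};  // An empty result signals a parse error in this sketch.
    }
    return {0x10000000u | ((value & 0xFFFFu) << 3),  // load immdshort (assumed)
            0x01000000u | locals.at(name)};          // strlocal (assumed)
}

// One row of the table: an input statement and its hard-coded expected output.
struct Row {
    std::string input;
    Opcodes expected;
};

// Returns the inputs whose parsed opcodes differ from the expected ones.
std::vector<std::string> failingInputs(const std::vector<Row>& rows) {
    std::vector<std::string> failures;
    for (const Row& row : rows) {
        if (parseStatement(row.input) != row.expected) {
            failures.push_back(row.input);
        }
    }
    return failures;
}

void testSetStatements() {
    const std::vector<Row> rows = {
        {"set lcl_var = 2", {0x10000010u, 0x01000002u}},    // from the question
        {"set lcl_var = 3", {0x10000018u, 0x01000002u}},    // toy encoding
        {"set other_var = 2", {0x10000010u, 0x01000003u}},  // toy encoding
        {"set unknown = 2", {}},                            // unknown variable
    };
    assert(failingInputs(rows).empty());
}
```

Adding a case is then one more line in the table, and a failure names the exact input that broke.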

Bithia answered 4/2, 2009 at 9:20 Comment(0)

int[] opcodes = Parser.GetOpcodes("set lcl_var = 2");
Assert.AreEqual(2, opcodes.Length);
Assert.AreEqual(0x10000010, opcodes[0]);
Assert.AreEqual(0x01000002, opcodes[1]);

Humblebee answered 14/1, 2009 at 15:25 Comment(0)

What do you want to test? Do you want to know whether the correct "store" instruction is created? Whether the right variable is picked up? Make up your mind what you want to know and the test will be obvious. As long as you don't know what you want to achieve, you will not know how to test the unknown.

In the meantime, just write a simple test. Tomorrow or some later day, you will come back to this place because something broke. At that time, you will know more about what you want to do, and it may be simpler to design a test.

Today, don't try to be the person you will be tomorrow.

Philomena answered 13/1, 2009 at 15:51 Comment(0)

It's not clear if you are looking for a methodology or a specific technology to use for your testing.

As far as methodology goes, maybe you don't want to do extensive unit testing. Perhaps a better approach would be to write some programs in your domain-specific language and then execute the opcodes to produce a result. The test programs would then check this result instead of checking the generated opcodes each time. This way you can exercise a lot of code but check only one result at the end. Start with simple programs to flush out obvious bugs and then move to harder ones.
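
As a rough illustration of checking the result instead of the opcodes, here is a minimal C++ sketch of a two-instruction interpreter. Everything about the decoding is an assumption made up for this example (top byte selects the instruction, load immdshort keeps its value shifted left by 3); it is chosen only so that the opcode pair from the question stores 2 into local index 2. In a real test the opcode vector would come from your parser rather than being written out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Minimal machine state for the sketch: one accumulator plus local variables.
struct Machine {
    std::uint32_t accumulator = 0;
    std::map<std::uint32_t, std::uint32_t> locals;  // local index -> value
};

// Executes a program under the INVENTED decoding described above.
// Returns false as soon as an unknown instruction is encountered.
bool execute(const std::vector<std::uint32_t>& program, Machine& machine) {
    for (std::uint32_t opcode : program) {
        const std::uint32_t kind = opcode >> 24;            // assumed layout
        const std::uint32_t operand = opcode & 0x00FFFFFFu;  // assumed layout
        switch (kind) {
            case 0x10:  // load immdshort: put the immediate in the accumulator
                machine.accumulator = operand >> 3;
                break;
            case 0x01:  // strlocal: store the accumulator into a local
                machine.locals[operand] = machine.accumulator;
                break;
            default:
                return false;
        }
    }
    return true;
}

void testSetLocalEndToEnd() {
    // Opcodes quoted in the question for "set lcl_var = 2".
    const std::vector<std::uint32_t> program = {0x10000010u, 0x01000002u};
    Machine machine;
    assert(execute(program, machine));
    // Only the final state is checked, not the individual opcodes.
    assert(machine.locals.at(2) == 2u);
}
```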

Another approach is to automatically generate programs in your domain-specific language along with the expected opcodes. This can be very simple, like writing a Perl script that produces a set of programs like:

set lcl_var = 2

set lcl_var = 3
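
The answer suggests a Perl script; purely as a sketch, the same generation step could look like this in C++, the language the asker is using. The function name generateSetPrograms is hypothetical, and it produces only the source statements. The expected opcodes would still have to come from a trusted source, such as values you verified by hand.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Produces one "set <variable> = <value>" program per value in [first, last].
std::vector<std::string> generateSetPrograms(const std::string& variable,
                                             int first, int last) {
    std::vector<std::string> programs;
    for (int value = first; value <= last; ++value) {
        programs.push_back("set " + variable + " = " + std::to_string(value));
    }
    return programs;
}

void testGenerateSetPrograms() {
    const std::vector<std::string> programs =
        generateSetPrograms("lcl_var", 2, 3);
    assert(programs.size() == 2);
    assert(programs[0] == "set lcl_var = 2");
    assert(programs[1] == "set lcl_var = 3");
}
```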

Once you have a suite of test programs in your language with known-correct output, you can work backwards and generate unit tests that check each opcode. Since you already have the opcodes, it becomes a matter of inspecting the parser's output for correctness and reviewing its code.

While I've not used cppunit, I've used an in-house tool that was very much like cppunit. It was easy to implement unit tests with that tool.

Rubin answered 13/1, 2009 at 15:51 Comment(0)

You don't specify what language you're writing the parser in, so I'm going to assume for the sake of argument that you're using an object-oriented language.

If this is the case, then dependency injection could help you out here. If the destination of the emitted opcodes is an instance of a class (like File, for instance), try giving your emitter class a constructor that takes an object of that type to use as the destination for emitted code. Then, from a unit test, you can pass in a mock object that's an instance of a subclass of your destination class, capture the emitted opcodes for specific statements, and assert that they are correct.

If your destination class isn't easily extensible, you may want to create an interface based on it that both the destination class and your mock class can implement.
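
A minimal C++ sketch of this idea follows (the asker has said the parser is in C++). All names here (OpcodeSink, RecordingSink, Emitter, emitSetLocal) are hypothetical, and the encoding inside emitSetLocal is invented so that it reproduces the two opcodes from the question; your emitter will have its own interface and encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Interface for wherever the opcodes end up (a file, a memory buffer, ...).
class OpcodeSink {
public:
    virtual ~OpcodeSink() = default;
    virtual void write(std::uint32_t opcode) = 0;
};

// Test double: records every opcode written so the test can inspect it.
class RecordingSink : public OpcodeSink {
public:
    void write(std::uint32_t opcode) override { opcodes.push_back(opcode); }
    std::vector<std::uint32_t> opcodes;
};

// Hypothetical emitter that receives its destination through the constructor.
class Emitter {
public:
    explicit Emitter(OpcodeSink& sink) : sink_(sink) {}

    // Emits the opcodes for "set <local> = <value>" using an INVENTED encoding.
    void emitSetLocal(std::uint32_t localIndex, std::uint32_t value) {
        sink_.write(0x10000000u | ((value & 0xFFFFu) << 3));  // load immdshort
        sink_.write(0x01000000u | localIndex);                // strlocal
    }

private:
    OpcodeSink& sink_;
};

void testEmitSetLocal() {
    RecordingSink sink;
    Emitter emitter(sink);
    emitter.emitSetLocal(2, 2);  // corresponds to "set lcl_var = 2"

    const std::vector<std::uint32_t> expected = {0x10000010u, 0x01000002u};
    assert(sink.opcodes == expected);
}
```

In production code, a file-backed class would implement the same interface, so the emitter itself never needs to know whether it is being tested.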

Mantelletta answered 13/1, 2009 at 15:59 Comment(1)

I am writing the parser using C++ – Mud 13/1, 2009 at 16:3

As I understand it, you would first write a test for your specific example, i.e. where the input to your parser is:

set lcl_var = 2

and the output is:

0x10000010 // load immdshort 2
0x01000002 // strlocal lclvar

When you have implemented the production code to pass that test and refactored it, then, if you are not satisfied that it can handle any local variable, write another test with a different local variable and see whether it passes. For example, a new test with input:

set lcl_var2 = 2

And write your new test to expect the different output that you want. Keep doing this until you are satisfied that your production code is robust enough.

Gowk answered 14/1, 2009 at 10:18 Comment(0)
