Line continuation/wrapping in doctest

I am using doctest.testmod() to do some basic testing. I have a function that returns a long string, say get_string(). Something like:

def get_string(a, b):
    r''' (a, b) -> c

    >>> get_string(1, 2)
    'This is \n\n a long \n string with new \
    space characters \n\n'
    # Doctest should work but does not.

    '''
    return ('This is \n\n a long \n string ' + \
            'with new space characters \n\n')

The problem is that the doctest is not passing because it is expecting a single line string, and it is treating the wrap as a \n character. Is there a way to get around this?

PS: This is not the actual function I am working with, but a minimal version for your sake.

Rolfrolfe answered 2/3, 2015 at 8:41 Comment(1)
Note: the backslash in your return line is useless, if not harmful. Parentheses are enough to continue a line (and should be preferred anyway). Even the + is superfluous, since adjacent string literals are concatenated automatically. - Ory
8

I don't think you have understood how doctest works. It does not check whether the output is somehow "equivalent"; it only checks whether the output is identical (allowing only very minor variations, such as an ellipsis). From the documentation:

The doctest module searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown.

Doctest compares the raw text of the output (not a string literal, not a Python expression) with the sample output you provide. Since it doesn't know that the text between the quotes represents a string literal, it cannot interpret it the way you want.

In other words: the only thing you can do is to simply put the whole output on one line as in:

>>> get_string(1, 2)
'This is \n\n a long \n string with new space characters \n\n'

If the output of this is too long you can try to modify the example to produce a shorter string (e.g. cutting it to 50 characters: get_string(1, 2)[:50]). If you look at doctests of different projects you'll find different hacks to make doctests more readable while providing reliable output.
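
The two suggestions above might look like this in Python 3. This is only a sketch: the function body is taken from the question, the slice length 19 is an arbitrary choice for illustration, and `run_examples` is a hypothetical helper (not part of doctest) that runs the examples in a docstring and reports the counts.

```python
import doctest


def get_string(a, b):
    r'''(a, b) -> c

    The expected output must sit on one physical line:

    >>> get_string(1, 2)
    'This is \n\n a long \n string with new space characters \n\n'

    Slicing the result keeps the expected output short:

    >>> get_string(1, 2)[:19]
    'This is \n\n a long \n'
    '''
    return ('This is \n\n a long \n string '
            'with new space characters \n\n')


def run_examples(func):
    """Hypothetical helper: return (failed, attempted) for func's doctests."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


if __name__ == '__main__':
    print(run_examples(get_string))
```

The docstring is raw, so each \n in the expected output stays as a backslash followed by n, which is exactly what the repr of the returned string prints.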

Ory answered 2/3, 2015 at 10:55 Comment(0)
13

You can use the NORMALIZE_WHITESPACE option (see also the full list of option flags in the doctest documentation).

Here is an example from the doctest documentation (Python 3 version):

>>> print(list(range(20)))  # doctest: +NORMALIZE_WHITESPACE
[0,   1,  2,  3,  4,  5,  6,  7,  8,  9,
10,  11, 12, 13, 14, 15, 16, 17, 18, 19]

Fuchs answered 29/5, 2017 at 9:39 Comment(1)
This answer does not make it obvious, but the NORMALIZE_WHITESPACE directive normalizes whitespace in all the output (including what looks like a string literal). So it works with the example from the question too, once you remove the backslash. - Exteroceptor
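
Applied to the function from the question, the directive could be used as sketched below. This assumes Python 3 and a raw docstring; `run_examples` is a hypothetical helper, not part of doctest. With the flag set, doctest collapses every run of whitespace (including the line break in the expected output) to a single space in both the expected and the actual output before comparing them.

```python
import doctest


def get_string(a, b):
    r'''(a, b) -> c

    >>> get_string(1, 2)  # doctest: +NORMALIZE_WHITESPACE
    'This is \n\n a long \n string with new
    space characters \n\n'
    '''
    return ('This is \n\n a long \n string '
            'with new space characters \n\n')


def run_examples(func):
    """Hypothetical helper: return (failed, attempted) for func's doctests."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


if __name__ == '__main__':
    print(run_examples(get_string))
```

The \n sequences shown in the repr are literal backslash-n text, not whitespace, so a misplaced newline in the returned string is still detected. The trade-off is that differences in the amount of ordinary whitespace, such as two spaces instead of one, go unnoticed.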
5

If you are testing against a long single-line string in the output, you can keep the expected output within 80 characters (for PEP 8 goodness) by using doctest's ELLIPSIS option, where ... matches any substring. While it is generally used for variable output such as object addresses, it works just fine with (long) fixed output as well, for example:

def get_string(a, b):
    r''' (a, b) -> c

    >>> get_string(1, 2)  # doctest: +ELLIPSIS
    'This is ... string with newline characters \n\n'
    '''
    return ('This is \n\n a long \n string '
            'with newline characters \n\n')

There is a small loss of exactness in the matching, but this is usually not critical for tests.

Affirmative answered 27/1, 2016 at 16:31 Comment(0)
1

From doctest's docs:

If you continue a line via backslashing in an interactive session, or for any other reason use a backslash, you should use a raw docstring, which will preserve your backslashes exactly as you type them:

>>> def f(x):
...     r'''Backslashes in a raw docstring: m\n'''
...
>>> print(f.__doc__)
Backslashes in a raw docstring: m\n

Otherwise, you can double each backslash in the doctest and use a regular (non-raw) docstring.
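
Both spellings from the quoted passage can be sketched as follows (Python 3). The function names `joined_raw` and `joined_escaped` and the helper `run_examples` are made up for this illustration; the point is only that the two expected-output lines contain the same text once Python has processed the escapes.

```python
import doctest


def joined_raw():
    r'''Raw docstring: backslashes are kept exactly as typed.

    >>> joined_raw()
    'line one\nline two'
    '''
    return 'line one\nline two'


def joined_escaped():
    '''Regular docstring: each backslash has to be doubled.

    >>> joined_escaped()
    'line one\\nline two'
    '''
    return 'line one\nline two'


def run_examples(func):
    """Hypothetical helper: return (failed, attempted) for func's doctests."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


if __name__ == '__main__':
    for func in (joined_raw, joined_escaped):
        print(func.__name__, run_examples(func))
```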

Bales answered 2/3, 2015 at 8:46 Comment(0)
1

A simple solution is >>> repr(get_string(1,2)); that will escape new lines and use a single-line string just for the test.

I'd prefer a solution where you can say:

>>> get_string(1,2)
first line
second line

fourth

In your case, this is a problem because you have trailing white space.
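
For a string without trailing whitespace, the multi-line form above can be made to work with print() and doctest's <BLANKLINE> marker, since a truly empty line would end the expected output. A sketch, assuming Python 3; `get_lines` and `run_examples` are hypothetical names used only here.

```python
import doctest


def get_lines():
    '''Return a short multi-line string with no trailing whitespace.

    >>> print(get_lines())
    first line
    second line
    <BLANKLINE>
    fourth
    '''
    return 'first line\nsecond line\n\nfourth'


def run_examples(func):
    """Hypothetical helper: return (failed, attempted) for func's doctests."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


if __name__ == '__main__':
    print(run_examples(get_lines))
```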

Also note that it's not possible to test the line continuation character.

"a" + \
"b"

is exactly the same as

"a" + "b"

namely "ab"

Ahola answered 2/3, 2015 at 8:48 Comment(6)
Sounds like the best solution so far, although I would like to test for newline characters too, since in this case things could go wrong in my program if they are badly placed. Thanks though! - Rolfrolfe
repr() allows exactly that, so I don't understand your comment. - Ahola
Actually, I haven't been able to implement the doctest with repr(...). It does escape the newlines, but the line continuation is still interpreted as a newline character, and thus the test doesn't pass. Could you show me an example of what it would look like? - Rolfrolfe
Ah. There is no way to test a line continuation character, since a backslash followed by a newline is simply swallowed. It doesn't appear in the result at all. The parser removes it before the string class has a chance to see it. So "a" + \ "b" is exactly the same as "a" + "b" or "ab" after calling the method. There is no way to write a test for this unless you ask the Python interpreter for the source code of the method and find the respective lines. - Ahola
Oh, the line continuation that I'm having trouble with is the one in the docstring. I am effectively making a test against a single-line string (the line continuation in the actual function doesn't matter). However, the problem is that the string in the doctest is interpreted as having several lines. - Rolfrolfe
Well, you can't use line continuation in a docstring. I'm pretty sure that the output of the Python methods isn't parsed by Python, so there is nothing that could remove the line continuation. You will have to paste a very, very long single-line string into the doctest. - Ahola
0

Another option, which your documentation readers may prefer, is to use pprint to pretty up the output instead of doing it by hand.

POEM = """ My mother groaned, my father wept:
Into the dangerous world I leapt,
Helpless, naked, piping loud,
Like a fiend hid in a cloud.

Struggling in my father’s hands,
Striving against my swaddling bands,
Bound and weary, I thought best
To sulk upon my mother’s breast."""

def poem():
    """
    Enumerate the lines of INFANT SORROW

    >>> import pprint
    >>> pprint.pprint(poem())
    {0: ' My mother groaned, my father wept:',
     1: 'Into the dangerous world I leapt,',
     2: 'Helpless, naked, piping loud,',
     3: 'Like a fiend hid in a cloud.',
     4: '',
     5: 'Struggling in my father’s hands,',
     6: 'Striving against my swaddling bands,',
     7: 'Bound and weary, I thought best',
     8: 'To sulk upon my mother’s breast.'}
    """
    lines = POEM.split("\n")
    return dict(enumerate(lines))

Dabbs answered 10/2, 2023 at 10:45 Comment(0)
