How to skip whitespace but use it as a token delimiter in a parser combinator

I am trying to build a small parser where the tokens (luckily) never contain whitespace. Whitespace (spaces, tabs and newlines) essentially acts as the token delimiter (apart from cases where there are brackets etc.).

I am extending the RegexParsers class. If I turn skipWhitespace on, the parser greedily joins tokens together whenever the next token matches the regular expression of the previous one. If I turn skipWhitespace off, on the other hand, parsing fails because the whitespace is not part of any definition. I am trying to match the BNF as closely as possible, and given that whitespace is almost always the delimiter (apart from brackets and a few other cases where the delimiter is explicitly defined in the BNF), is there a way to avoid putting a whitespace regex in all my definitions?

UPDATE

This is a small test example where the tokens are being joined together:

import scala.util.parsing.combinator.RegexParsers

object TestParser extends RegexParsers {
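  // "(test", then a single name, then the closing bracket
  def test  = "(test" ~> name <~ ")"

  // A name is a letter followed by any number of letters, digits, '_' or '-'
  def name : Parser[String] = (letter ~ (anyChar*)) ^^ { case first ~ rest => (first :: rest).mkString}

  // Each of these matches exactly one character
  def anyChar = letter | digit | "_".r | "-".r
  def letter = """[a-zA-Z]""".r
  def digit = """\d""".r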

  def main(args: Array[String]): Unit = {

    val s = "(test hello these should not be joined and I should get an error)"

    val res = parseAll(test, s)
    res match {
      case Success(r, n) => println(r)
      case Failure(msg, n) => println(msg)
      case Error(msg, n) => println(msg)
    }

  }

}

In the above case I just get the words joined into a single string instead of an error. Something similar happens if I change test to the following: I expect a list of the separate words after test, but instead I get a one-element list containing one long string with the spaces between the words removed:

def test  = "(test" ~> (name+) <~ ")"
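
A likely explanation, offered as a sketch rather than a definitive answer: with skipWhitespace on, RegexParsers skips leading whitespace before every literal and regex parser. Because name is assembled from one-character regexes, the space between two words is skipped before the next character is matched, so the words run together. The version below assumes the scala-parser-combinators module is on the classpath and matches a whole name with a single regex. The object name TokenParser and the parser names single and many are made up for illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical object name, chosen for this sketch only.
object TokenParser extends RegexParsers {
  // skipWhitespace is left at its default (true). Whitespace is skipped
  // before each regex or literal parser, so a token has to be matched by
  // ONE regex if whitespace is to end it.
  def name: Parser[String] = """[a-zA-Z][a-zA-Z0-9_-]*""".r

  // Exactly one name between "(test" and ")".
  def single: Parser[String] = "(test" ~> name <~ ")"

  // One or more names; rep1 is the method form of the postfix `+`.
  def many: Parser[List[String]] = "(test" ~> rep1(name) <~ ")"
}
```

With this definition, parseAll(single, ...) on the example input should fail after the first word, since ")" is expected where the second word begins, and many should return one list element per word. Note that the "(test" literal still does not require whitespace after it, so that boundary would need its own check if it matters for the grammar.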
