How to test for texts not fitting an Instaparse-grammar (Clojure)?

I wrote a project that parses strings with a context-free grammar in Instaparse (Clojure). Now I'd like to test the parsing results for several input strings, some of which might not fit the grammar. So far I have only tested that such strings do not parse to the expected result. I think it would be more accurate to test for exceptions using (is (thrown? ...)). Are any exceptions thrown? It seems to me that some output (containing Parse error...) is generated, but no exception is thrown.

My project.clj is:

(defproject com.stackoverflow.clojure/tests "0.1.0-SNAPSHOT"
  :description "Tests of Clojure test-framework."
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.6.0"]
                 [instaparse "1.3.4"]])

My core source is:

(ns com.stackoverflow.clojure.testInstaparseWrongGrammar
  (:require [instaparse.core :as insta]))

(def parser (insta/parser "
    <sentence> = words <DOT>
    DOT        = '.'
    <words>    = word (<SPACE> word)*
    SPACE      = ' '
    word     = #'(?U)\\w+'
"))

(defn formatter [expr] 
  (->> (parser expr)
       (insta/transform {:word identity})
       (apply str)))

My test source is:

(ns com.stackoverflow.clojure.testInstaparseWrongGrammar-test
  (:require [clojure.test :refer :all]
            [com.stackoverflow.clojure.testInstaparseWrongGrammar :refer :all]))

(deftest parser-tests
  (is (= [[:word "Hello"] [:word "World"]] (parser "Hello World.")))
  (is (not (= [[:word "Hello"] [:word "World"]] (parser "Hello World?"))))
  ;(parser "Hello World?")     gives:
  ;
  ;Parse error at line 1, column 12:
  ;Hello World?
  ;           ^
  ;Expected one of:
  ;"." (followed by end-of-string)
  ;" "
)

(deftest formatter-tests
  (is (= "HelloWorld" (formatter "Hello World.")))
  (is (not (= "HelloWorld" (formatter "Hello World?"))))
  ;(formatter "Hello World?")     gives:
  ;"[:index 11][:reason [{:tag :string, :expecting \".\", :full true} {:tag :string, :expecting \" \"}]][:text \"Hello World?\"][:column 12][:line 1]"
)

; run the tests
(run-tests)

How should I test for the errors (here: when the sentence ends with a ? instead of a .)?

Mycah answered 13/10, 2014 at 11:28 Comment(0)

Instaparse does not throw an exception on a parse error; instead, it returns a "failure object" (ref: parse errors). You can test for a failure object with (insta/failure? result).
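If returning a failure object is acceptable, the tests can check for it directly instead of expecting an exception. The following is a minimal sketch, assuming instaparse is on the classpath and reusing the grammar from the question; the namespace name and test name are made up for illustration.

```clojure
(ns example.failure-test
  (:require [clojure.test :refer [deftest is]]
            [instaparse.core :as insta]))

;; Same grammar as in the question.
(def parser
  (insta/parser "
    <sentence> = words <DOT>
    DOT        = '.'
    <words>    = word (<SPACE> word)*
    SPACE      = ' '
    word       = #'(?U)\\w+'
"))

(deftest failure-object-tests
  ;; Input that fits the grammar does not produce a failure object.
  (is (not (insta/failure? (parser "Hello World."))))
  ;; Input that does not fit produces a failure object; nothing is thrown.
  (is (insta/failure? (parser "Hello World?"))))
```

This keeps the parser itself unchanged, at the cost of every caller having to remember to check the result.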

If you want your parser/formatter to throw an exception on unexpected input, wrap the raw parser in your core namespace:

(ns com.stackoverflow.clojure.testInstaparseWrongGrammar
  (:require [instaparse.core :as insta]
            [instaparse.failure :as fail]))

(def raw-parser (insta/parser "
    <sentence> = words <DOT>
    DOT        = '.'
    <words>    = word (<SPACE> word)*
    SPACE      = ' '
    word     = #'(?U)\\w+'
"))

; pretty-print a failure as a string
(defn- failure->string [result]
  (with-out-str (fail/pprint-failure result)))

; create an Exception with the pretty-printed failure message
(defn- failure->exn [result]
  (Exception. (failure->string result)))  

(defn parser [expr]
  (let [result (raw-parser expr)]
    (if (insta/failure? result)
      (throw (failure->exn result))
      result)))

(defn formatter [expr]
  (->> (parser expr)
       (insta/transform {:word identity})
       (apply str)))

...and now you can use (is (thrown? ...)) in the test:

(deftest parser-tests
  (is (= [[:word "Hello"] [:word "World"]] (parser "Hello World.")))
  ; the throwing parser raises before any comparison could happen,
  ; so it is enough to call it inside thrown?
  (is (thrown? Exception (parser "Hello World?"))))

This approach uses instaparse to pretty-print the failure and wraps that in an Exception. Another approach is to use ex-info as outlined in this answer.
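The ex-info variant might look roughly like the sketch below. It assumes instaparse is on the classpath and a Clojure version that provides ex-info and ex-data; the namespace, the function name parse-or-throw, and the keys placed in the data map are illustrative choices, not taken from the linked answer.

```clojure
(ns example.ex-info-parser
  (:require [instaparse.core :as insta]))

;; Same grammar as in the question.
(def raw-parser
  (insta/parser "
    <sentence> = words <DOT>
    DOT        = '.'
    <words>    = word (<SPACE> word)*
    SPACE      = ' '
    word       = #'(?U)\\w+'
"))

;; Hypothetical helper: returns the parse result, or throws an
;; ExceptionInfo that carries the failure position as data.
(defn parse-or-throw [expr]
  (let [result (raw-parser expr)]
    (if (insta/failure? result)
      (let [failure (insta/get-failure result)]
        (throw (ex-info (str "Parse error at line " (:line failure)
                             ", column " (:column failure))
                        {:line   (:line failure)
                         :column (:column failure)
                         :text   (:text failure)})))
      result)))
```

A test can then use (is (thrown? clojure.lang.ExceptionInfo (parse-or-throw "Hello World?"))), and a catch block can read the position back with (ex-data e) rather than parsing the message string.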

Rivard answered 13/10, 2014 at 15:49 Comment(3)
How do I get the information from the failure object? To begin with, I'd like to do two things (if possible). First: add the line number to my exception message. Second: add the nicely formatted error message to my exception. Moreover, for creating a new Exception class, the easiest way seems to be implementing it in Java. Is that correct? - Mycah
...and what exactly do you mean by "failure object"? I thought there are no objects (with methods and variables) in Clojure. So how can I, in general, access the methods and variables of such objects? - Mycah
@Mycah The code above now includes a text description of the parse error (line, column, etc.) in the Exception. The "failure object" is a map (technically a record created by defrecord) with some well-known keys; for example, the line number can be accessed with (:line result). - Rivard
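As a small illustration of the keyword access mentioned in the last comment, the sketch below pulls the position out of a failure. It assumes instaparse is on the classpath; failure-position is a hypothetical helper name, and the keys used are the ones visible in the formatter output quoted in the question (:line, :column, :index).

```clojure
(ns example.failure-fields
  (:require [instaparse.core :as insta]))

;; Same grammar as in the question.
(def parser
  (insta/parser "
    <sentence> = words <DOT>
    DOT        = '.'
    <words>    = word (<SPACE> word)*
    SPACE      = ' '
    word       = #'(?U)\\w+'
"))

;; Hypothetical helper: returns a plain map with the failure position,
;; or nil when the input parsed successfully.
(defn failure-position [result]
  (when (insta/failure? result)
    (let [failure (insta/get-failure result)]
      {:line   (:line failure)
       :column (:column failure)
       :index  (:index failure)})))

;; For "Hello World?" the question reports line 1, column 12, index 11.
(failure-position (parser "Hello World?"))
```

Because the failure is a record, the same keywords work as with any map; no Java class or accessor methods are needed.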
