New to Spark, and new to Ada, so this question may be overly broad. However, it's asked in good faith, as part of an attempt to understand Spark. Besides direct answers to the questions below, I welcome critique of style, workflow, etc.
As my first foray into Spark, I chose to try to implement (easy) and prove correctness (unsuccessful so far) the function .
Question: What is the proper way of implementing and proving the correctness of this function?
I started with the following util.ads
:
package Util is
function Floor_Log2(X : Positive) return Natural with
Post => 2**Floor_Log2'Result <= X and then X < 2**(Floor_Log2'Result + 1);
end Util;
I have no pre-condition because the ranges of the input fully expresses the only interesting pre-condition. The post-condition I wrote based on the mathematical definition; however, I have an immediate concern here. If X
is Positive'Last
, then 2**(Floor_Log2'Result + 1)
exceeds Positive'Last
and Natural'Last
. Already I'm up against my limited knowledge of Ada here, so: Sub-question 1: What is the type of the sub-expression in the post condition, and is this overflow a problem? Is there a general way to resolve it? To avoid the issue in this particular case, I revised the specification to the less-intuitive but equivalent:
package Util is
function Floor_Log2(X : Positive) return Natural with
Post => 2**Floor_Log2'Result <= X and then X/2 < 2**Floor_Log2'Result;
end Util;
There are many ways to implement this function, and I'm not particularly concerned about performance at this point, so I'd be happy with any of them. I'd consider the "natural" implementation (given my particular C background) to be something like the following util.adb
:
package body Util is
function Floor_Log2 (X : Positive) return Natural is
I : Natural := 0;
Remaining : Positive := X;
begin
while Remaining > 1 loop
I := I + 1;
Remaining := Remaining / 2;
end loop;
return I;
end Floor_Log2;
end Util;
Attempting to prove this with no loop invariants fails, as expected. Results (this and all results are GNATprove level 4, invoked from GPS as gnatprove -P%PP -j0 %X --ide-progress-bar -u %fp --level=4 --report=all
):
util.adb:6:13: info: initialization of "Remaining" proved[#2]
util.adb:7:15: info: initialization of "I" proved[#0]
util.adb:7:17: medium: overflow check might fail[#5]
util.adb:8:23: info: initialization of "Remaining" proved[#1]
util.adb:8:33: info: range check proved[#4]
util.adb:8:33: info: division check proved[#8]
util.adb:10:14: info: initialization of "I" proved[#3]
util.ads:3:14: medium: postcondition might fail, cannot prove 2**Floor_Log2'Result <= X[#7]
util.ads:3:15: medium: overflow check might fail[#9]
util.ads:3:50: info: division check proved[#6]
util.ads:3:56: info: overflow check proved[#10]
Most of the errors here make basic sense to me. Starting with the first overflow check, GNATprove cannot prove that the loop terminates in less than Natural'Last
iterations (or at all?), so it cannot prove that I := I + 1
doesn't overflow. We know that this isn't the case, because Remaining
is decreasing. I tried to express this adding the loop variant pragma Loop_Variant (Decreases => Remaining)
, and GNATprove was able to prove that loop variant, but the potential overflow of I := I + 1
is unchanged, presumedly because proving the loop terminates at all is not equivalent to proving that it terminates in less than Positive'Last
iterations. A tighter constraint would show that the loop terminates in at most Positive'Size
iterations, but I'm not sure how to prove that. Instead, I "forced it" by adding a pragma Assume (I <= Remaining'Size)
; I know this is bad practice, the intent here was purely to let me see how far I could get with this first issue "swept under the covers." As expected, this assumption lets the prover prove all range checks in the implementation file. Sub-question 2: What is the correct way to prove that I
does not overflow in this loop?
However, we've still made no progress on proving the postcondition. A loop invariant is clearly needed. One loop invariant that holds at the top of the loop is that pragma Loop_Invariant (Remaining * 2**I <= X and then X/2 < Remaining * 2**I)
; besides being true, this invariant has the nice property that it is clearly equivalent to the post-condition when the loop termination condition is true. However, as expected GNATprove is unable to prove this invariant: medium: loop invariant might fail after first iteration, cannot prove Remaining * 2**I <= X[#20]
. This makes sense, because the inductive step here is non-obvious. With division on the real numbers one could imagine a straightforward lemma stating that for all I, X * 2**I = (X/2) * 2**(I+1)
, but (a) I don't expect GNATprove to know that without a lemma being provided, and (b) it's messier with integer division. So, Sub-Question 3a: Is this the appropriate loop invariant to try to use to prove this implementation? Sub-Question 3b: If so, what's the right way to prove it? Externally prove a lemma and use that? If so, what exactly does that mean?
At this point, I thought I'd explore a completely different implementation, just to see if it led anywhere different:
package body Util is
function Floor_Log2 (X : Positive) return Natural is
begin
for I in 1 .. X'Size - 1 loop
if 2**I > X then
return I - 1;
end if;
end loop;
return X'Size - 1;
end Floor_Log2;
end Util;
This is a less intuitive implementation to me. I didn't explore this second implementation as much, but I leave it here to show what I tried; to give a potential avenue for other solutions to the main question; and to raise additional sub-questions.
The idea here was to bypass some of the proof around overflow of I and termination conditions by making termination and ranges explicit. Somewhat to my surprise, the prover first choked on overflow checking the expression 2**I
. I had expected 2**(X'Size - 1)
to be provably within the bounds of X
-- but again, I'm up against the limits of my Ada knowledge. Sub-Question 4: Is this expression actually overflow-free in Ada, and how can that be proven?
This has turned out to be a long question... but I think the questions I'm raising, in the context of a nearly-trivial example, are relatively general and likely to be useful to others who, like me, are trying to understand if and how Spark is relevant to them.
type Big is mod 2 ** Integer'Size;
and change your conditions to2 ** Floor_Log2'Result <= Big (X) and Big (X) < 2 ** (Floor_Log2'Result + 1)
and similar to avoid the overflow warnings. – Krute