Erlang: Will adding type spec to code make dialyzer more effective?
Asked Answered
K

1

8

I have a project that doesn't have -spec or -type in code, currently dialyzer can find some warnings, most of them are in machine generated codes.

Will adding type specs to code make dialyzer find more errors?

Off the topic, is there any tool to check whether the specs been violated or not?

Kwiatkowski answered 21/12, 2015 at 7:5 Comment(0)
S
11

Adding typespecs will dramatically improve the accuracy of Dialyzer.

Because Erlang is a dynamic language Dialyzer must default to a rather broad interpretation of types unless you give it hints to narrow the "success" typing that it will go by. Think of it like giving Dialyzer a filter by which it can transform a set of possible successes to a subset of explicit types that should ever work.

This is not the same as Haskell, where the default assumption is failure and all code must be written with successful typing to be compiled at all -- Dialyzer must default to assume success unless it knows for sure that a type is going to fail.

Typespecs are the main part of this, but Dialyzer also checks guards, so a function like

increment(A) -> A + 1.

Is not the same as

increment(A) when A > 100 -> A + 1.

Though both may be typed as

-spec increment(integer()) -> integer().

Most of the time you only care about integer values being integer(), pos_integer(), neg_integer(), or non_neg_integer(), but occasionally you need an arbitrary range bounded only on one side -- and the type language has no way to represent this currently (though personally I would like to see a declaration of 100..infinity work as expected).

The unbounded-range of when A > 100 requires a guard, but a bounded range like when A > 100 and A < 201 could be represented in the typespec alone:

-spec increment(101..200) -> pos_integer().
increment(A) -> %stuff.

Guards are fast with the exception of calling length/1 (which you should probably never actually need in a guard), so don't worry with the performance overhead until you actually know and can demonstrate that you have a performance problem that comes from guards. Using guards and typespecs to constrain Dialyzer is extremely useful. It is also very useful as documentation for yourself and especially if you use edoc, as the typespec will be shown there, making APIs less mysterious and easy to toy with at a glance.

There is some interesting literature on the use of Dialyzer in existing codebases. A well-documented experience is here: Gradual Typing of Erlang Programs: A Wrangler Experience. (Unfortunately some of the other links I learned a lot from previously have disappeared or moved. (!.!) A careful read of the Wrangler paper, skimming over the User's Guide and man page, playing with Dialyzer, and some prior experience in a type system like Haskell's will more than prepare you for getting a lot of mileage out of Dialyzer, though.)

[On a side note, I've spoken with a few folks before about specifying "pure" functions that could be guaranteed as strongly typed either with a notation or by using a different definition syntax (maybe Prolog's :- instead of Erlang's ->... or something), but though that would be cool, and it is very possible even now to concentrate side-effects in a tiny part of the program and pass all results back in a tuple of {Results, SideEffectsTODO}, this is simply not a pressing need and Erlang works pretty darn well as-is. But Dialyzer is indeed very helpful for showing you where you've lost track of yourself!]

Sherronsherry answered 21/12, 2015 at 8:0 Comment(12)
Thanks for the detailed answer :) Before asking this, I added a little specs here and there to the code base, it didn't help a thing, guess I added too less or added the trivial part (the part that dialyzer guessed accurately). I still miss Haskell's type system :(Kwiatkowski
@NotanID I miss it, too! That's why I would like to have a way to declare compiler-enforced strict/pure functions. Overall the benefits in production of Erlang's approach are pretty overwhelming (and you won't get random build failures because someone twiddled forbidden mysteries in Cabal this week), but it would be nice to have some enforced method of pure definitions or a formal method of monadifying operations with a deferred-side-effects burrito like {Values, TODOs} that Dialyzer or the compiler knew to complain about if it smelled funny. (Hmmm... the "stinky burrito" type system.)Sherronsherry
That being said, Dialyzer is also great at determining types on it's own. I'd suggest only adding specific type specs using the typer tool to identify places where Dialyzer didn't figure out the types. Personally, I think excessive type speccing makes the code more unreadable.Pearlypearman
@AdamLindberg Absolutely! On smallish projects (and its pretty amazing how far a small bit of Erlang can actually go) I sometimes only use typespecs as documentation for myself regarding functions I want to write as I layout the code in my head. But! On a few larger projects (game servers or business application services, etc.) I've found typespecs to be pretty helpful -- there can be some pretty long chains of value passthrough and manipulation.Sherronsherry
For sure, but I argue that adding them when Dialyzer already figured out the correct type is unnecessary. The documentation aspect of it can in many cases be solved by good variable and function names.Pearlypearman
@AdamLindberg I've frequently thought that Dialyzer had the "correct" types in mind, only to discover that when I actually specced a program it was riddled with type errors that just weren't normally causing problems, but could certainly be made to with the correct user inputs. This changed my mind about it -- because I've found for-real bugs that power users of my programs would have run into eventually. This is especially helpful if you make frequent use of the "every line is an assertion" style. But its certainly not necessary.Sherronsherry
@Sherronsherry Static-typing for Erlang? guess it's too good to be true.Kwiatkowski
@AdamLindberg the problem with documenting with good names is that they are not guaranteed to be correct, (and sometimes it's hard to come up with a good name that expresses the return value, this is especially true when your native language is not English), while static-typing or dialyzer-verified-spec provides some level of guarantee. for some people this is not a problem, but for those paranoid, ahh...Kwiatkowski
@Sherronsherry Actually, other than annotating some functions are pure, wouldn't it be better if pure is the default, and add annotations to allow functions to have side-effect?Kwiatkowski
@NotanID That would require changes to the runtime that are simply incompatible with the way Erlang works (and has always worked). Dynamic types are necessary for many of the runtime features we expect to work properly. Adding a pure declaration as a separate language semantic, though, would be entirely possible. And we want side-effects in Erlang -- this entire style of concurrency is based on messaging, not calling, after all.Sherronsherry
@NotanID The same applies to type specifications.Pearlypearman
@NotanID This is not EEP-worthy yet (I would want to subject it to a thoroughly humiliating session with ROK first, to be honest), but this post has motivated me to write a post of my own about it.Sherronsherry

© 2022 - 2024 — McMap. All rights reserved.