Requiring type declaration in Julia

Is there any way to explicitly require in Julia (e.g. say within a module or package) that types must be declared? Does e.g. PackageCompiler or Lint.jl have any support for such checks? More broadly, does the Julia standard distribution itself provide any static code analyzer or equivalent that could help check this requirement?

As a motivating example, say we want to make sure that our growing production code base only accepts code that is always type declared, under the hypothesis that large code bases with type declarations tend to be more maintainable.

If we want to enforce that condition, does Julia in its standard distribution provide any mechanisms to require type declaration or help advance that goal? (e.g. anything that could be checked via linters, commit hooks, or equivalent?)

Homochromatic answered 30/12, 2019 at 15:7 Comment(1)
Not sure how much this helps, but, similar to Bogumil's thoughts, hasmethod(f, (Any,)) will return false if no generic method has been defined. You'd still need to match the number of arguments though (i.e. hasmethod(f, (Any, Any)) for a two-argument function). (Comment by Pammie)

The short answer is: no, there is currently no tooling for type checking your Julia code. It is possible in principle, however, and some work has been done in this direction in the past, but there isn't a good way to do it right now.

The longer answer is that "type annotations" are a red herring here: what you really want is type checking, so the broader part of your question is actually the right question. I can talk a little bit about why type annotations are a red herring, some other things that aren't the right solution, and what the right kind of solution would look like.

Requiring type annotations probably doesn't accomplish what you want: one could just put ::Any on any field, argument or expression and it would have a type annotation, but not one that tells you or the compiler anything useful about the actual type of that thing. It adds a lot of visual noise without actually adding any information.
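To make this concrete, here is a small sketch (the struct names Loose and Annotated are made up for illustration) showing that explicit ::Any field annotations leave Julia with the same type information as no annotations at all:

```julia
# Hypothetical structs, for illustration only.
struct Loose
    a
    b
end

struct Annotated
    a::Any
    b::Any
end

# Both report the same field types: the annotation added no information.
println(fieldtypes(Loose))      # (Any, Any)
println(fieldtypes(Annotated))  # (Any, Any)
```

A rule that only demands the presence of an annotation would accept Annotated and reject Loose, even though the two are equivalent as far as the compiler is concerned.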

What about requiring concrete type annotations? That rules out just putting ::Any on everything (which is what Julia implicitly does anyway). However, there are many perfectly valid uses of abstract types that this would make illegal. For example, the definition of the identity function is

identity(x) = x

What concrete type annotation would you put on x under this requirement? The definition applies for any x, regardless of type—that's kind of the point of the function. The only type annotation that is correct is x::Any. This is not an anomaly: there are many function definitions that require abstract types in order to be correct, so forcing those to use concrete types would be quite limiting in terms of what kind of Julia code one can write.
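As a rough illustration of a less extreme case than identity, consider a function annotated with an abstract type. The function double below is a made-up example, not something from Base:

```julia
# `double` is a hypothetical example; `Number` is an abstract type.
double(x::Number) = 2x

println(double(3))      # 6
println(double(1.5))    # 3.0
println(double(1//2))   # 1//1

# The annotation is meaningful, but it is not a concrete type.
println(isabstracttype(Number))  # true
println(isconcretetype(Number))  # false
```

A "concrete annotations only" rule would force one copy of this method per numeric type, which is exactly the kind of code that multiple dispatch on abstract types is meant to avoid.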

There's a notion of "type stability" that is often talked about in Julia. The term appears to have originated in the Julia community, but has been picked up by other dynamic language communities, like R. It's a little tricky to define, but it roughly means that if you know the concrete types of the arguments of a method, you know the type of its return value as well. Even if a method is type stable, that's not quite enough to guarantee that it would type check because type stability doesn't talk about any rules for deciding whether something type checks or not. But this is getting in the right direction: you'd like to be able to check that each method definition is type stable.
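One way to get a feel for type stability is to ask inference what it thinks a method returns for given concrete argument types. The sketch below uses Base.return_types, which is an unexported reflection helper, so treat it as an exploratory tool rather than a stable API. The functions stable and unstable are invented for the example:

```julia
# Hypothetical functions, for illustration only.
stable(x) = x > 0 ? 1.0 : 0.0    # always returns a Float64
unstable(x) = x > 0 ? 1 : 1.0    # returns an Int or a Float64 depending on the value

# Ask inference for the return type given a concrete argument type.
rt_stable = first(Base.return_types(stable, (Int,)))
rt_unstable = first(Base.return_types(unstable, (Int,)))

println(rt_stable, " is concrete: ", isconcretetype(rt_stable))
println(rt_unstable, " is concrete: ", isconcretetype(rt_unstable))
```

A check along these lines (is the inferred return type concrete for the argument types you care about?) is roughly what a stability test would look like, although it needs a list of argument types to try.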

You may not want to require type stability, even if you could. Since Julia 1.0, it has become common to use small unions. This started with the redesign of the iteration protocol, which now uses nothing to indicate that iteration is done versus returning a (value, state) tuple when there are more values to iterate. The find* functions in the standard library also use a return value of nothing to indicate that no value has been found. These are technically type instabilities, but they are intentional, and the compiler is quite good at reasoning about them and optimizing around the instability. So at least small unions probably must be allowed in code. Moreover, there's no clear place to draw the line, although perhaps one could say that a return type of Union{Nothing, T} is acceptable, but not anything more unpredictable than that.
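The small-union pattern described here can be seen directly at the REPL. This is only a sketch using findfirst and iterate from Base; the variable names are arbitrary:

```julia
xs = [10, 20, 30]

hit = findfirst(==(20), xs)    # an index into xs
miss = findfirst(==(99), xs)   # `nothing`, since 99 is not present

println(hit)               # 2
println(miss === nothing)  # true

# The iteration protocol follows the same pattern.
println(iterate(xs) === nothing)     # false: a (value, state) tuple came back
println(iterate(Int[]) === nothing)  # true: nothing left to iterate

# Ask inference what it expects this findfirst call to return.
println(Base.return_types(findfirst, (typeof(==(20)), Vector{Int})))
```

A strict "concrete return types only" rule would reject both of these standard library functions, which is why such a rule probably needs an exception for unions with Nothing.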

What you probably really want, however, rather than requiring type annotations or type stability, is to have a tool that will check that your code cannot throw method errors, or perhaps more broadly that it will not throw any kind of unexpected error. The compiler can often precisely determine which method will be called at each call site, or at least narrow it down to a couple of methods. That's how it generates fast code—full dynamic dispatch is very slow (much slower than vtables in C++, for example). If you have written incorrect code, on the other hand, the compiler may emit an unconditional error: the compiler knows you made a mistake but doesn't tell you until runtime since those are the language semantics. One could require that the compiler be able to determine which methods might be called at each call site: that would guarantee that the code will be fast and that there are no method errors. That's what a good type checking tool for Julia should do. There's a great foundation for this sort of thing since the compiler already does much of this work as part of the process of generating code.
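To illustrate the kind of mistake such a tool would catch, here is a minimal sketch with two invented functions, area and report. The definition of report is accepted without complaint, and the problem only shows up when the offending call runs:

```julia
# Hypothetical functions, for illustration only.
area(r::Real) = pi * r^2
report(x) = string("area = ", area(x))

# Defining `report` succeeds even though `report("two")` can never work.
# The mistake only surfaces when the call actually runs.
err = try
    report("two")
catch e
    e
end

println(err isa MethodError)          # true

# Reflection can answer the narrower question "is there a matching method?"
println(hasmethod(area, (String,)))   # false
println(hasmethod(area, (Int,)))      # true
```

hasmethod only answers the question for one signature at a time; the checker described above would have to ask it for every call site, using the argument types that inference derives.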

Churchwoman answered 2/1, 2020 at 15:47 Comment(0)

This is an interesting question. The key question is what we mean by "type declared". If you mean that there is a ::SomeType annotation in every method definition, then this is somewhat tricky to check, because Julia offers several ways to generate code dynamically. Maybe there is a complete solution in this sense, but I do not know of one (I would love to learn about it).

What comes to my mind, though, and seems relatively simple to do, is to check whether any method defined within a module accepts Any as an argument. This is similar but not equivalent to the earlier statement, as:

julia> z1(x::Any) = 1
z1 (generic function with 1 method)

julia> z2(x) = 1
z2 (generic function with 1 method)

julia> methods(z1)
# 1 method for generic function "z1":
[1] z1(x) in Main at REPL[1]:1

julia> methods(z2)
# 1 method for generic function "z2":
[1] z2(x) in Main at REPL[2]:1

look the same to the methods function, since the signature of both functions accepts x as Any.
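The shared signature can also be inspected programmatically. The following is a minimal sketch that relies on the sig field of Method, which is an internal detail and may change between Julia versions:

```julia
z1(x::Any) = 1
z2(x) = 1

# `sig` holds the method signature as a Tuple type:
# the first parameter is the function's own type, the rest are the arguments.
sig1 = first(methods(z1)).sig
sig2 = first(methods(z2)).sig

println(sig1.parameters[2] === Any)  # true
println(sig2.parameters[2] === Any)  # true
```

So once a method has been defined, there is no way to tell from its signature whether the author wrote ::Any or nothing at all.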

Now, to check whether any method defined in a module/package accepts Any as an argument, something like the following code could be used (I have not tested it extensively, as I have just written it down, but it seems to cover most cases):

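function check_declared(m::Module, f::Function)
    for mf in methods(f)
        # only look at methods that were defined in module m
        if mf.module == m
            # strip all `where` clauses to reach the underlying Tuple type
            b = Base.unwrap_unionall(mf.sig)
            x = b.parameters
            # x[1] is the type of the function itself, so arguments start at 2
            for i in 2:length(x)
                if x[i] == Any
                    println(mf)
                    break
                end
            end
        end
    end
end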

function check_declared(m::Module)
    # names(m) lists only the exported names of m
    for n in names(m)
        try
            f = getfield(m, n)
            if f isa Function
                check_declared(m, f)
            end
        catch
            # modules sometimes return names that cannot be resolved in their scope
        end
    end
end

Now when you run it on the Base.Iterators module you get:

julia> check_declared(Iterators)
cycle(xs) in Base.Iterators at iterators.jl:672
drop(xs, n::Integer) in Base.Iterators at iterators.jl:628
enumerate(iter) in Base.Iterators at iterators.jl:133
flatten(itr) in Base.Iterators at iterators.jl:869
repeated(x) in Base.Iterators at iterators.jl:694
repeated(x, n::Integer) in Base.Iterators at iterators.jl:714
rest(itr::Base.Iterators.Rest, state) in Base.Iterators at iterators.jl:465
rest(itr) in Base.Iterators at iterators.jl:466
rest(itr, state) in Base.Iterators at iterators.jl:464
take(xs, n::Integer) in Base.Iterators at iterators.jl:572

and when you check, e.g., the DataStructures.jl package you get:

julia> check_declared(DataStructures)
compare(c::DataStructures.LessThan, x, y) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\heaps.jl:66
compare(c::DataStructures.GreaterThan, x, y) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\heaps.jl:67
cons(h, t::LinkedList{T}) where T in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\list.jl:13
dec!(ct::Accumulator, x, a::Number) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\accumulator.jl:86
dequeue!(pq::PriorityQueue, key) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\priorityqueue.jl:288
dequeue_pair!(pq::PriorityQueue, key) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\priorityqueue.jl:328
enqueue!(s::Queue, x) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\queue.jl:28
findkey(t::DataStructures.BalancedTree23, k) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\balanced_tree.jl:277
findkey(m::SortedDict, k_) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\sorted_dict.jl:245
findkey(m::SortedSet, k_) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\sorted_set.jl:91
heappush!(xs::AbstractArray, x) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\heaps\arrays_as_heaps.jl:71
heappush!(xs::AbstractArray, x, o::Base.Order.Ordering) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\heaps\arrays_as_heaps.jl:71
inc!(ct::Accumulator, x, a::Number) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\accumulator.jl:68
incdec!(ft::FenwickTree{T}, left::Integer, right::Integer, val) where T in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\fenwick.jl:64
nil(T) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\list.jl:15
nlargest(acc::Accumulator, n) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\accumulator.jl:161
nsmallest(acc::Accumulator, n) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\accumulator.jl:175
reset!(ct::Accumulator{#s14,V} where #s14, x) where V in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\accumulator.jl:131
searchequalrange(m::SortedMultiDict, k_) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\sorted_multi_dict.jl:226
searchsortedafter(m::Union{SortedDict, SortedMultiDict, SortedSet}, k_) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\tokens2.jl:154
sizehint!(d::RobinDict, newsz) in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\robin_dict.jl:231
update!(h::MutableBinaryHeap{T,Comp} where Comp, i::Int64, v) where T in DataStructures at D:\AppData\.julia\packages\DataStructures\iymwN\src\heaps\mutable_binary_heap.jl:250

What I propose is not a full solution to your question but I found it useful for myself so I thought of sharing it.

EDIT

The code above only accepts f that is a Function. In general, values of other types can be callable too. In that case the check_declared(m::Module, f::Function) signature could be changed to check_declared(m::Module, f) (then the function itself would accept Any as its second argument :)), and all evaluated names would be passed to it. Inside the function you would then have to check that methods(f) has positive length (as methods for a non-callable returns a value that has length 0).
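A rough sketch of that generalization follows. It is a variant rather than the exact code above: the name untyped_methods and the toy module Demo are made up for illustration, it returns the offending methods instead of printing them (which is easier to use from a test or a commit hook), and it still relies on the internal sig field of Method:

```julia
# Hypothetical toy module used only to exercise the check.
module Demo
export typed, untyped, mixed, LIMIT
typed(x::Int) = x + 1
untyped(x) = x
mixed(a::Int, b) = a
const LIMIT = 10          # not callable: contributes no methods
end

# Collect methods defined in `m` that accept `Any` for at least one argument.
function untyped_methods(m::Module)
    found = Method[]
    for n in names(m)                   # exported names only
        isdefined(m, n) || continue     # skip names that cannot be resolved
        f = getfield(m, n)
        f isa Module && continue        # names(m) includes the module itself
        # for a non-callable value, methods(f) is empty and the loop is skipped
        for mf in methods(f)
            mf.module == m || continue
            params = Base.unwrap_unionall(mf.sig).parameters
            # params[1] is the type of the callable itself
            if any(p -> p === Any, params[2:end])
                push!(found, mf)
            end
        end
    end
    return found
end

foreach(println, untyped_methods(Demo))
```

In a test suite this could become an assertion such as isempty(untyped_methods(MyPackage)), where MyPackage stands for your own module. Note that when a name refers to a type, methods(f) lists its constructors, so default constructors of structs with untyped fields would be reported as well.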

Rattoon answered 30/12, 2019 at 18:27 Comment(0)