Downcast traits inside Rc for AST manipulation
Asked Answered
P

1

18

I'm trying to manipulate ASTs in Rust. There will be lots of manipulations, and I want my trees to be immutable, so to save time all references will be Rcs.

My tree nodes will look like this:

enum Condition {
    Equals(Rc<Expression>, Rc<Expression>),
    LessThan(Rc<Expression>, Rc<Expression>),
    ...
}

enum Expression {
    Plus(Rc<Expression>, Rc<Expression>),
    ...
}

I want to replace a random node of a given type with another node of the same type. To do generic operations on trees I've made a trait:

trait AstNode {
    fn children(&self) -> Vec<Rc<AstNode>>;
}

And all nodes implement this. This allows me to walk the tree without having to destructure each node type for every operation, by simply calling children().

I also want to clone a node while updating only one of its children, and leaving the other ones in place. Assume that I've been able to generate nodes of the right concrete type (and I'm happy for the program to panic if I'm wrong). I'll add the following method to the trait:

trait AstNode {
    fn clone_with_children(&self, new_children: Vec<Rc<AstNode>>) -> Self
        where Self: Sized;
}

My plan is to take the children returned by childen(), replace one of them, and call clone_with_children() to construct a node of the same enum variant but with one node replaced.

My problem is how to write clone_with_children().

I need to downcast Rc<AstNode> to Rc<Expression> (or what have you), while keeping the refcount inside the Rc the same, but none of the downcasting libraries I've found seem to be able to do that.

Is what I want possible, or should I do it completely differently?

Prostomium answered 13/10, 2016 at 14:39 Comment(8)
You've got another problem coming, actually: you cannot return Self (a trait object cannot use Self by value), you need to return Rc<AstNode> probably.Fructiferous
Is there any reason why Expression does not implement the AstNode trait, which would let dynamic dispatch do the right thing?Fructiferous
I think the AST manipulation is a distraction here. What you really want to know is whether it's possible to downcast Rc<Trait> to Rc<Object>, where Object implements Trait, right?Carpogonium
@MatthieuM., Expression does implement AstNode, I hoped that would be obvious. Doesn't help for downcasting.Prostomium
@trentcl that's true, but I included the broader context in hopes that someone could propose an altogether better solution if the downcasting turns out to be impossibleProstomium
@rix0rrr: I meant that if Expression implements AstNone, then Expression can implement clone_with_children and when it does then you do not need to downcast any longer, you can just invoke clone_with_children and it is dynamically dispatched to current concrete type.Fructiferous
I seem to have finally pieced down your question; whether I am right or wrong I would encourage you to produce a MCVE showing in which context you need to downcast (and cannot). My guess, given the lack of code sample, is that AstNode::clone_with_children returns a Rc<AstNode> which cannot be substituted for a Rc<Expression> in the conditions and expressions... I can see many ways to circumvent the issue, but it's hard to predict in advance which are useful and which are useless...Fructiferous
I've edited my answer somewhat, but the second part of the question is a bit open-ended. Possibly you should consider asking on Programmers.SE or the Rust language forums for more perspectives.Carpogonium
C
8

Note: In this answer I will use the dyn Trait syntax to make it more clear when a type is a trait object. The older way to write Rc<dyn Trait> is Rc<Trait>. See What does "dyn" mean in a type?

No, you can't downcast Rc<dyn Trait> to Rc<Concrete>, because trait objects like dyn Trait don't contain any information about the concrete type the data belongs to.

Here's an excerpt from the official documentation that applies to all pointers to trait objects (&dyn Trait, Box<dyn Trait>, Rc<dyn Trait>, etc.):

pub struct TraitObject {
    pub data: *mut (),
    pub vtable: *mut (),
}

The data field points to the struct itself, and the vtable field points to a collection of function pointers, one for each method of the trait. At runtime, that's all you have. And that's not sufficient to reconstruct the struct's type. (With Rc<dyn Trait>, the block data points to also contains the strong and weak reference counts, but no additional type information.)

But there are at least 3 other options.

Put the common behavior in the trait

First, you could add all the operations that you need to do on Expressions or Conditions to the trait AstNode, and implement them for each struct. This way you never need to call a method that isn't available on the trait object, because the trait contains all the methods you need.

This likely entails replacing most Rc<Expression> and Rc<Condition> members in the tree with Rc<dyn AstNode>, since you can't downcast Rc<dyn AstNode> (but see below about Any):

enum Condition {
    Equals(Rc<dyn AstNode>, Rc<dyn AstNode>),
    LessThan(Rc<dyn AstNode>, Rc<dyn AstNode>),
    ...
}

A variation on this might be writing methods on AstNode that take &self and return references to various concrete types:

trait AstNode {
    fn as_expression(&self) -> Option<&Expression> { None }
    fn as_condition(&self) -> Option<&Condition> { None }
    ...
}

impl AstNode for Expression {
    fn as_expression(&self) -> Option<&Expression> { Some(self) }
}

impl AstNode for Condition {
    fn as_condition(&self) -> Option<&Condition> { Some(self) }
}

Instead of downcasting Rc<dyn AstNode> to Rc<Condition>, just store it as an AstNode and call e.g. rc.as_condition().unwrap().method_on_condition(), if you're confident rc is in fact an Rc<Condition>.

Double down on enums

Second, you could create another enum that unifies Condition and Expression, and do away with trait objects entirely. This is what I have done in the AST of my own Scheme interpreter. With this solution, no downcasting is required because all the type information is in the enum variant. (Also with this solution, you definitely have to replace Rc<Condition> or Rc<Expression> if you need to get an Rc<Node> out of it.)

enum Node {
    Condition(Condition),
    Expression(Expression),
    // you may add more here
}
impl Node {
    fn children(&self) -> Vec<Rc<Node>> { ... }
}

Downcast with Any

A third option is to use Any, and Rc::downcast each Rc<dyn Any> into its concrete type as needed.

A slight variation on that would be to add a method fn as_any(&self) -> &Any { self } to AstNode, and then you can call Expression methods (that take &self) by writing node.as_any().downcast_ref::<Expression>().method_on_expression(). But there is currently no way to (safely) upcast an Rc<dyn Trait> to an Rc<dyn Any> (this could change in the future, though).

Any is, strictly speaking, the closest thing to an answer to your question. I don't recommend it because downcasting, or needing to downcast, is often an indication of a poor design. Even in languages with class inheritance, like Java, if you want to do the same kind of thing (store a bunch of nodes in an ArrayList<Node>, for example), you'd have to either make all needed operations available on the base class or somewhere enumerate all the subclasses that you might need to downcast to, which is a terrible anti-pattern. Anything you'd do here with Any would be comparable in complexity to just changing AstNode to an enum.

TL;DR

You need to store each node of the AST as a type that (a) provides all the methods you might need to call and (b) unifies all the types you might need to put in one. Option 1 uses trait objects, while option 2 uses enums, but they're pretty similar in principle. A third option is to use Any to enable downcasting.

See also

Carpogonium answered 14/10, 2016 at 1:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.