How is code stored and executed on the C++ abstract machine?

Asked 27/9, 2020 at 20:12 Answered 28/9, 2020 at 18:29

c++ language-lawyer program-counter abstract-machine

In the first book I read about C++, it went a little bit into the details of how code is actually executed on a machine (it mentioned the program counter, the call stack, return addresses, and such). I found it really fascinating to get to know how this stuff works, although I'm aware that it isn't really necessary to know how the computer works to write good code.

When reading up on the same subjects on this Q/A site, I found out that it by no means has to be the way I had learned before, because what I had read about was only one particular implementation of C++, depending on a certain computer architecture and a certain compiler. C++ code could just as well run on something completely different, as long as one has a compliant compiler which behaves the "right" way. What the right way is, is then defined by the standard and the behavior of an "abstract machine" (I hope I got it right so far).

Of course, I'd still like to know whether concepts like the code-segment of memory or the program counter are still "somehow" pictured in the standard, and if they are, to what extent are they pictured? How is the concept of code-pieces being executed one after another described in the abstract machine?

Since it was asked in a comment whether I'd like to have the standard repeated to me: I wasn't able to understand the standard well enough to pin down exactly what it says about the abstract machine, or which statements of the standard can be interpreted as statements about an abstract concept of "program counter", "code storage", etc. So yes, out of inability, I ask the community to interpret what's written in the standard. The expected outcome of this interpretation is the most detailed conception of the internal structure of the abstract machine that still matches the criterion of being "abstract".

Rancourt answered 27/9, 2020 at 20:12 Comment(9)

You can make your code more robust, and more efficient by understanding how code is actually executed on a machine. For example, a processor's data cache. Implementing code to take advantage of the data cache will make the code run more efficiently (and faster), rather than having the processor reload the cache often. Similarly with the instruction cache. – Tallent 27/9, 2020 at 20:26

Added Language-Lawyer tag. – Tallent 27/9, 2020 at 20:28

I'm not sure I understand your question. Are you asking us to repeat, interpret and/or summarize what's written in the C++ standard? – Rotatory 27/9, 2020 at 21:5

@UlrichEckhardt see the edit (which was basically an answer to your question before I decided to edit my question). – Rancourt 27/9, 2020 at 21:26

@ThomasMatthews Some MCUs don't have a cache and instead execute directly from memory. So I agree with you - understanding the underlying hardware is really necessary. For example, I have worked on systems before that had 256 bytes of memory - you really needed to know that to write code that would work! – Marrissa 27/9, 2020 at 21:39

Are you asking what an abstract machine is and how to understand it, or do you understand the topic and are asking what limitations the standard puts on the abstract machine? I don't understand "How is code stored and executed on the C++ abstract machine?": no one knows, because it's "abstract", as the name says. "How" is always a broad question. Is the term "abstract" ("abstraction") confusing you? – Villada 27/9, 2020 at 21:42

youtube.com/watch?v=ZAji7PkXaKY – Marrissa 27/9, 2020 at 21:42

I think the question needs more focus. How about adding a small program (a few lines of code), and describing what the implementation that you read in the book would do with the program? Then the answer could be an explanation of how the c++ abstract machine evaluates that particular program. – Czechoslovak 27/9, 2020 at 21:42

It is abstract, which allows it to wave hands at the gritty implementation details of code storage and execution. – Star 27/9, 2020 at 22:11

Short answer: it's not.

We don't actually execute code on the abstract machine of the C++ spec (or any abstract machine -- other languages also define them). We execute code on real machines implemented with transistors, or in software running on transistors. The abstract machine in the language spec is used to define boundaries on what the code on the real machine will do -- it must run "as if" it is running on the abstract machine, at least as far as the appearance to the environment of the abstract machine definition is concerned.

The relevant quote from the standard is:

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.

There's no real solid definition of what exactly "observable behavior" is, however.

So why even define these abstract machines? Well, mostly because there are many different real machines and you want to say that your code will run the same way on any of them. Real machines are also very complex and hard to reason about. So the language spec defines an abstract machine that is a simplification of the kinds of real machines it expects to run on. In particular, the details of how code is stored and executed are mostly "abstracted away" in the abstract machine -- it doesn't specify them, so an implementation can use whatever mechanisms the real target provides and still be compliant with the spec.

Caenogenesis answered 28/9, 2020 at 0:35 Comment(5)

I got that code is not actually run on the abstract machine. From your answer I conclude that the internal structure of the abstract machine doesn't say anything about code storage, program counters, etc ... the code just "runs". If so, the question kind of remains for me. There still must be some things we can say about the code execution on the abstract machine (for example, as the other answer points out, that statements are executed sequentially, and therefore, what it even means that a statement is executed on an abstract machine). But maybe I should ask this in a new question. – Rancourt 29/9, 2020 at 14:49

There is definition of what observable behavior is. – Mockup 4/10, 2020 at 10:21

Not in any spec I've seen. There are occasional mentions of things that are not "observable behavior" (such as side effects in constructors and destructors of unnamed temps), but no hard and fast definition of what is. Most compilers seem to treat only things that could be observed by other threads or processes as "observable", not things that might be inspected by a debugger or some such. – Caenogenesis 4/10, 2020 at 19:21

There is a section [intro.abstract] in the standard that ends with "These collectively are referred to as the observable behavior of the program." eel.is/c++draft/intro.abstract#6 – Afrikah 20/12, 2021 at 11:3

@BoP: This is a recent improvement, but is still pretty vague. What constitutes "an interactive device" is still completely undefined, but at least it gets at the "intent" of what the observable behavior of the abstract machine is. It also suggests that how code is stored and executed is NOT part of the "observable behavior", even though something like a debugger might be able to "observe" it. But then maybe a debugger is an "interactive device" – Caenogenesis 20/12, 2021 at 22:7

The standard doesn't specify how the abstract machine works internally; that's the whole point. The concept is used to abstract away the inner workings of physical machines.

code-segment of memory or the program counter are still "somehow" pictured in the standard

No. The standard just says (roughly speaking) that statements are executed sequentially, explains the evaluation order, etc. It doesn't have a notion of processor instructions or a program counter. Function pointers are described as completely opaque, pointing to "functions" rather than individual instructions. It doesn't even guarantee that functions are stored in the same memory as the data.

The standard also doesn't introduce the concepts of the stack and the heap. It only describes the lifetime of objects created in different ways. Pointers are carefully described so that they are not required to be plain numeric addresses. There's no notion of registers, cache, ...

Runagate answered 28/9, 2020 at 18:29 Comment(1)

Ok. But there is a concept of "execution" of a statement, and this execution can take place before or after another "execution" ... I won't get an answer on how these executions work internally, because as you say that part is abstracted away, and there really isn't an "inside" to the abstract machine. But I'd still like to know in how much detail the work of the abstract machine can be described. Up until now I know it's a strange thing that changes things in memory, and as you say it does so sequentially, following the threads of the program. Is there more we CAN say about that? – Rancourt 29/9, 2020 at 15:4
