What is the difference between assembly code and bytecode?
S

7

43

While researching the differences in meaning between source code, bytecode, assembly code, machine code, compilers, linkers, interpreters, assemblers and all the rest, the one thing that still confuses me is the difference between bytecode and assembly code.

In particular, the introduction of this Wikipedia article describing CIL confused me, since it seems to use both terms (assembly code and bytecode) interchangeably, making me think they might mean exactly the same thing.

Superorder answered 23/11, 2009 at 11:1 Comment(3)
If you have access to Andrew Tanenbaum's book Structured Computer Organization, it gives a technically correct definition of the two terms.Viviennevivify
See also e.g. this answer to a similar question about Java.Mouton
See also https://mcmap.net/q/391098/-what-exactly-is-bytecodeMouton
T
24

Assembly code normally means the human-readable form of a machine's native language (the so-called machine language). Bytecode, on the other hand, is normally a language that is interpreted by a bytecode interpreter, so it is not the processor's native language.

Why the confusion, then? You can't compare assembly language and bytecode this way. A bytecode can also have an assembly language, meaning a human-readable form of it, because "assembly language" does not necessarily imply a real machine. It is simply the human-readable form of some native language. For processors, that native language is machine code, but you can equally have assembly code for a pseudo-machine (or interpreted machine), such as one that runs bytecode.
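To make the "assembly form of a bytecode" point concrete, here is a minimal sketch using CPython, assuming a reasonably recent Python 3. The function `add` is just an illustration. `co_code` holds the raw bytecode bytes, and the standard `dis` module renders the same instructions as readable mnemonics. The exact opcode names differ between Python versions, so no output is shown.

```python
import dis


def add(a, b):
    # Illustrative function; any small function would do.
    return a + b


# The raw bytecode: opaque bytes meant for the CPython interpreter.
raw = add.__code__.co_code

# The same instructions as mnemonics: an "assembly" view of the bytecode.
listing = [ins.opname for ins in dis.get_instructions(add)]

print(raw.hex())
print(listing)
```

Neither form is native machine code. Both describe the same program for CPython's virtual machine, once as bytes and once as text.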

See also: Assembly Language

Further confusion arises, as you can see in the discussion here, because IT people (myself included) tend to be lax in their wording. "Assembly language" is often used when speaking about machine code. This is not strictly correct, because assembly language is only the human-readable form of some machine's code.

Turnpike answered 23/11, 2009 at 13:36 Comment(1)
What IT people do (hopefully) is to abstract from form. The mapping between machine code and assembly is performed to optimise for the intended audience, CPU or human.Thesis
E
3

Assembly code normally refers to code that, once assembled into machine code, can be executed by a CPU, whilst bytecode is executed in a virtual machine.

The source of confusion over CIL might be related to the fact that machine code for CPU X can be interpreted by a Virtual Machine running on CPU Y (for example).

Note that a Virtual Machine implementation can be crafted to interpret any machine code and/or bytecode: it is left to the developers and their aspiration (and time on their hands) ;-)

Electrokinetics answered 23/11, 2009 at 11:8 Comment(1)
Once again: Assembly code is not executed by a real CPU. What is executed is "Machine Code". Assembly code is the human readable form of a machine code (or in some cases: byte code).Turnpike
B
3

I remember that since the beginning of microcontrollers and microprocessors, the word "assembly" has been used to designate machine code written in a human-readable way. It seems to me that Microsoft caused confusion by using the same word, "Assembly", to name what would be the bytecode produced by their .NET Framework compilers. So in this case I'd say that what "bytecode" means to the Java runtime is similar to what this new use of the word "Assembly" means for the Microsoft .NET runtime environment. Am I wrong to assume that?

Balderas answered 13/1, 2013 at 0:24 Comment(1)
Pretty sure that conclusion about Java is incorrect (everything else is correct, though). IDK if Java has a name for collections of bytecode (in a .jar file?), but if it does I don't think it's called "an assembly". Maybe "package" or "library". There is stuff like maven.apache.org/plugins/maven-assembly-plugin that collects a bunch of stuff including documentation into an "assembly", in a similar sense to .net, but AFAIK unrelated. For manipulating Java bytecode (including at runtime via reflection?) there's java-bytecode-asm for a Java package.Yandell
D
1

Assembled code is runnable on a CPU with a specific instruction set, while bytecode can be executed in a virtual machine (such as the Java runtime) on any CPU that can run the VM.
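As a rough illustration of "executed in a virtual machine", here is a toy stack-machine interpreter. The opcode values (`PUSH`, `ADD`, `MUL`, `HALT`) and the `run` function are invented for this sketch and do not match the JVM or any other real VM. The point is only that the same byte sequence runs wherever this interpreter runs, regardless of the host CPU.

```python
# Hypothetical opcodes for a toy stack machine (not any real VM's encoding).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(bytecode: bytes) -> int:
    """Interpret the toy bytecode and return the value on top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to decode
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            # PUSH takes a one-byte operand that follows the opcode.
            stack.append(bytecode[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {pc - 1}")


# Encodes (2 + 3) * 4 for the toy machine.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)  # prints 20
```

A real VM may also compile the bytecode to native machine code (JIT) instead of interpreting it byte by byte. This sketch only shows plain interpretation.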

Dispersive answered 23/11, 2009 at 13:51 Comment(2)
"Assembled code" is also called "machine code" -- just for clarification. See link in my answer.Turnpike
If you meant "Assembly Code", this of course is not machine code, since it must be "assembled" by an Assembler first -- the result then is machine code.Turnpike
M
0

Assembler is a macro language. It is a set of instructions used to instruct the CPU or another device. It is translated into machine code, which consists of instructions the CPU can read.

Bytecodes are instructions for a virtual machine. They are interpreted and still need to be translated into machine code before being executed.

Mesoderm answered 23/11, 2009 at 11:10 Comment(1)
Assembler is not necessarily a macro language. Assembler in its basic form is just a human-readable form of machine code.Turnpike
W
-1

Bytecode is mainly for platform independence and needs a virtual environment to run.

Assembly code is human-readable machine code (at a slightly higher level) that is run directly by the CPU.

Bytecode is not machine- or hardware-specific, whereas assembly code is: it deals with the hardware directly.

Waterlogged answered 13/1, 2020 at 3:1 Comment(1)
As other answers have pointed out, you can have an assembly language for bytecode, i.e. a human-readable text version of bytecode. There's even an SO tag for java-bytecode-asm. (So yes, there's often a distinction between asm for hardware machine code vs. other assembly languages.)Yandell
T
-3

Assembly code is (represents) the native code for the processor you are programming.

Bytecode is a term for the binary version of the "commands" that are compiled to be executed by an interpreter, or a virtual machine.

In essence bytecodes define the opcodes for a virtual processor, while assembly consists of the opcodes of a physical processor. (we will ignore the microcode inside the CPU for now :-) )
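To illustrate the relationship between mnemonics and opcodes, here is a small assembler/disassembler sketch for an imaginary processor. The `OPCODES` table and the `assemble`/`disassemble` helpers are made up for this example. The same text-to-bytes mapping applies whether the target is a physical CPU or a virtual one.

```python
# Hypothetical mnemonic-to-opcode table for an imaginary processor.
OPCODES = {"PUSH": 0x01, "ADD": 0x02, "HALT": 0xFF}
MNEMONICS = {value: name for name, value in OPCODES.items()}
HAS_OPERAND = {"PUSH"}  # instructions followed by a one-byte operand


def assemble(source: str) -> bytes:
    """Translate human-readable assembly text into opcode bytes."""
    out = bytearray()
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic = parts[0].upper()
        out.append(OPCODES[mnemonic])
        if mnemonic in HAS_OPERAND:
            out.append(int(parts[1]))
    return bytes(out)


def disassemble(code: bytes) -> str:
    """Translate opcode bytes back into assembly text."""
    lines = []
    i = 0
    while i < len(code):
        mnemonic = MNEMONICS[code[i]]
        i += 1
        if mnemonic in HAS_OPERAND:
            lines.append(f"{mnemonic} {code[i]}")
            i += 1
        else:
            lines.append(mnemonic)
    return "\n".join(lines)


text = "PUSH 2\nPUSH 3\nADD\nHALT"
binary = assemble(text)
print(binary.hex())         # prints 0102010302ff
print(disassemble(binary))  # prints the original four lines
```

The assembly text and the bytes carry the same information. Only the bytes are what the (real or virtual) processor consumes.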

Thesis answered 23/11, 2009 at 11:9 Comment(3)
Not totally correct. Assembly code is the human readable form of a machine code. Machine code is the native code for a processor.Turnpike
@Juergen, you mix form and content, it is a matter of detail or context that decides between terminology like "cpu opcodes", "machine language", "assembler". In the context of OP's question they are equivalent imho.Thesis
IMHO you mix things up, since IT people are often lax in their wording, things get confusing. Assembly language is a human readable representation of some machine language (can also be a virtual machine -- e.g. bytecode) period. See the wikipedia article I linked in my answer.Turnpike
