How Is Bytecode Interpreted by the JVM?

A high-level language like Java relies on a compiler to translate source code into bytecode, and the bytecode is then interpreted by the JVM. The javac compiler generates the bytecode and writes it to a .class file.

Have you ever wondered what a .class file contains? It is nothing but bytes, usually displayed as hexadecimal code. But which code is used for which purpose? If you have asked yourself that, you are in the right place.

Let’s write code for a subtract operation and compile it to generate the bytecode.

public class AddToValue {
  public static void main(String[] args) {
    long maxLong = Long.MAX_VALUE;
    long secondMinimumPositiveLong = 1;
    long secondMaxLong = maxLong - secondMinimumPositiveLong;
    System.out.println(secondMaxLong);
  }
}

This high-level language hides the complexity of how the JVM or the operating system reads the code. Declaring the maximum long value is a simple statement to us; how the value of the maxLong variable gets loaded into a CPU register is not our concern. We rely on the abstraction layer.

However, high-level code must still be converted into instructions for the virtual machine, known as bytecode.

Let’s compile this program.

javac AddToValue.java

The compiler produces AddToValue.class. Viewed as a hexadecimal dump, its contents look like this:

    cafe babe 0000 0037 001f 0a00 0800 1107
    0012 057f ffff ffff ffff ff09 0013 0014
    0a00 1500 1607 0017 0700 1801 0006 3c69
    6e69 743e 0100 0328 2956 0100 0443 6f64
    6501 000f 4c69 6e65 4e75 6d62 6572 5461
    626c 6501 0004 6d61 696e 0100 1628 5b4c
    6a61 7661 2f6c 616e 672f 5374 7269 6e67
    3b29 5601 000a 536f 7572 6365 4669 6c65
    0100 0f41 6464 546f 5661 6c75 652e 6a61
    7661 0c00 0900 0a01 000e 6a61 7661 2f6c
    616e 672f 4c6f 6e67 0700 190c 001a 001b
    0700 1c0c 001d 001e 0100 0a41 6464 546f
    5661 6c75 6501 0010 6a61 7661 2f6c 616e
    672f 4f62 6a65 6374 0100 106a 6176 612f
    6c61 6e67 2f53 7973 7465 6d01 0003 6f75
    7401 0015 4c6a 6176 612f 696f 2f50 7269
    6e74 5374 7265 616d 3b01 0013 6a61 7661
    2f69 6f2f 5072 696e 7453 7472 6561 6d01
    0007 7072 696e 746c 6e01 0004 284a 2956
    0021 0007 0008 0000 0000 0002 0001 0009
    000a 0001 000b 0000 001d 0001 0001 0000
    0005 2ab7 0001 b100 0000 0100 0c00 0000
    0600 0100 0000 0100 0900 0d00 0e00 0100
    0b00 0000 3c00 0400 0700 0000 1414 0003
    400a 421f 2165 3705 b200 0516 05b6 0006
    b100 0000 0100 0c00 0000 1600 0500 0000
    0300 0400 0400 0600 0500 0b00 0600 1300
    0700 0100 0f00 0000 0200 10
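The first eight bytes of this dump form the class-file header: the magic number cafe babe, then the minor version 0000 and the major version 0037 (55 in decimal, the version used by Java 11). As a minimal sketch, the header can be decoded with a ByteBuffer. The class name ClassHeader and the method describe are hypothetical names chosen for illustration, and the bytes are hard-coded from the dump rather than read from disk.

```java
import java.nio.ByteBuffer;

public class ClassHeader {
    // Decodes the magic number and version fields that start every .class file.
    static String describe(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class-file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();             // 4 bytes: 0xCAFEBABE
        int minor = buf.getShort() & 0xFFFF;  // 2 bytes: minor version (unsigned)
        int major = buf.getShort() & 0xFFFF;  // 2 bytes: major version (unsigned)
        return String.format("magic=%08x minor=%d major=%d", magic, minor, major);
    }

    public static void main(String[] args) {
        // First eight bytes of the dump: cafe babe 0000 0037
        byte[] header = {
            (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
            0x00, 0x00, 0x00, 0x37
        };
        System.out.println(describe(header));
    }
}
```

To inspect a real file, the same method could be given the result of Files.readAllBytes for AddToValue.class.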

According to The Java Virtual Machine Instruction Set, an opcode is a single byte, so there are 256 possible opcode values (not all of them are assigned to instructions). The generated bytecode uses a handful of them, and we will work out which byte corresponds to which instruction below. Unlike native machine code, this instruction set does not vary between x86 and x64: the JVM is a stack-based machine, so its instructions work on an operand stack rather than on specific registers that a given processor might not have.

Let’s disassemble this class and print the instructions from the bytecode.

javap -c AddToValue.class

Compiled from "AddToValue.java"
public class AddToValue {
  public AddToValue();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: ldc2_w        #3                  // long 9223372036854775807l
       3: lstore_1
       4: lconst_1
       5: lstore_3
       6: lload_1
       7: lload_3
       8: lsub
       9: lstore        5
      11: getstatic     #5                  // Field java/lang/System.out:Ljava/io/PrintStream;
      14: lload         5
      16: invokevirtual #6                  // Method java/io/PrintStream.println:(J)V
      19: return
}

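We will map the bytecode to these printed instructions. First, we have to understand some terms.

opcode: short for operation code. It is the one-byte numeric code that specifies the operation to be performed. For example, the opcode for lsub is 0x65.

operand stack: each method frame has an operand stack, a last-in-first-out stack. Instructions push values onto it, pop their operands from it, and push their results back. A long or double value counts as two units of stack depth.

mnemonic: the short, human-readable name of an instruction, for example istore_1.

In other words, a mnemonic is simply the readable form of an opcode: each mnemonic corresponds to exactly one opcode.

Now let’s look at the instructions for the subtract operation.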

    long maxLong = Long.MAX_VALUE;
    long secondMinimumPositiveLong = 1;
    long secondMaxLong = maxLong - secondMinimumPositiveLong;

For these three human-readable statements, we get 8 bytecode instructions:

       0: ldc2_w        #3                  // long 9223372036854775807l
       3: lstore_1
       4: lconst_1
       5: lstore_3
       6: lload_1
       7: lload_3
       8: lsub
       9: lstore        5

Each of these instructions has a corresponding hexadecimal opcode:

    Mnemonic    Opcode (hex)
    ldc2_w      14
    lstore_1    40
    lconst_1    0a
    lstore_3    42
    lload_1     1f
    lload_3     21
    lsub        65
    lstore      37

This sequence of hexadecimal codes appears inside the .class file:

Bytecode
cafe babe 0000 0037 001f 0a00 0800 1107
0012 057f ffff ffff ffff ff09 0013 0014
0a00 1500 1607 0017 0700 1801 0006 3c69
6e69 743e 0100 0328 2956 0100 0443 6f64
6501 000f 4c69 6e65 4e75 6d62 6572 5461
626c 6501 0004 6d61 696e 0100 1628 5b4c
6a61 7661 2f6c 616e 672f 5374 7269 6e67
3b29 5601 000a 536f 7572 6365 4669 6c65
0100 0f41 6464 546f 5661 6c75 652e 6a61
7661 0c00 0900 0a01 000e 6a61 7661 2f6c
616e 672f 4c6f 6e67 0700 190c 001a 001b
0700 1c0c 001d 001e 0100 0a41 6464 546f
5661 6c75 6501 0010 6a61 7661 2f6c 616e
672f 4f62 6a65 6374 0100 106a 6176 612f
6c61 6e67 2f53 7973 7465 6d01 0003 6f75
7401 0015 4c6a 6176 612f 696f 2f50 7269
6e74 5374 7265 616d 3b01 0013 6a61 7661
2f69 6f2f 5072 696e 7453 7472 6561 6d01
0007 7072 696e 746c 6e01 0004 284a 2956
0021 0007 0008 0000 0000 0002 0001 0009
000a 0001 000b 0000 001d 0001 0001 0000
0005 2ab7 0001 b100 0000 0100 0c00 0000
0600 0100 0000 0100 0900 0d00 0e00 0100
0b00 0000 3c00 0400 0700 0000 1414 0003 (final three bytes 14 00 03 -> ldc2_w #3)
400a 421f 2165 3705 b200 0516 05b6 0006 (40 0a 42 1f 21 65 37 05 -> lstore_1 lconst_1 lstore_3 lload_1 lload_3 lsub lstore 5)
b100 0000 0100 0c00 0000 1600 0500 0000
0300 0400 0400 0600 0500 0b00 0600 1300
0700 0100 0f00 0000 0200 10
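To make the mapping concrete, here is a minimal sketch of a decoder that walks the instruction bytes and prints each mnemonic with its offset. It only knows the eight opcodes discussed in this article, so it is an illustration rather than a real disassembler. The class name MiniDisassembler and the method decode are hypothetical, and the input bytes are copied by hand from the dump.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MiniDisassembler {
    // Opcodes from this article that take no operand bytes.
    private static final Map<Integer, String> NO_OPERAND = Map.of(
        0x40, "lstore_1",
        0x0a, "lconst_1",
        0x42, "lstore_3",
        0x1f, "lload_1",
        0x21, "lload_3",
        0x65, "lsub");

    // Decodes a code array into "offset: mnemonic" lines.
    static List<String> decode(int[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc];
            if (opcode == 0x14) {
                // ldc2_w is followed by a two-byte constant-pool index.
                int index = (code[pc + 1] << 8) | code[pc + 2];
                out.add(pc + ": ldc2_w #" + index);
                pc += 3;
            } else if (opcode == 0x37) {
                // lstore is followed by a one-byte local-variable index.
                out.add(pc + ": lstore " + code[pc + 1]);
                pc += 2;
            } else if (NO_OPERAND.containsKey(opcode)) {
                out.add(pc + ": " + NO_OPERAND.get(opcode));
                pc += 1;
            } else {
                // Anything else is outside the scope of this sketch.
                out.add(pc + ": unknown 0x" + Integer.toHexString(opcode));
                pc += 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 14 00 03 40 0a 42 1f 21 65 37 05, taken from the dump
        int[] code = {0x14, 0x00, 0x03, 0x40, 0x0a, 0x42, 0x1f, 0x21, 0x65, 0x37, 0x05};
        decode(code).forEach(System.out::println);
    }
}
```

The offsets it prints (0, 3, 4, 5, 6, 7, 8, 9) line up with the offsets in the javap -c output, because ldc2_w occupies three bytes and lstore occupies two.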

What each instruction does to the operand stack and the local variables:

    Mnemonic    Description
    ldc2_w      push 9223372036854775807l onto the stack from the constant pool
    lstore_1    pop 9223372036854775807l and store it in local variable 1
    lconst_1    push 1l onto the stack
    lstore_3    pop 1l and store it in local variable 3
    lload_1     push 9223372036854775807l from local variable 1
    lload_3     push 1l from local variable 3
    lsub        pop both values, compute 9223372036854775807l - 1l, and push the result
    lstore      pop 9223372036854775806l and store it in local variable 5
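The same sequence can be traced by hand in ordinary Java. The sketch below is a simplified model, not how a JVM is implemented: a Deque stands in for the operand stack and a long array stands in for the local variables. The class name OperandStackDemo is hypothetical. One simplification to note: in a real frame a long occupies two consecutive local-variable slots (which is why the indices are 1, 3 and 5), whereas here each long sits in a single array element.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackDemo {
    // Replays the eight instructions and returns the value left in local variable 5.
    static long run() {
        Deque<Long> stack = new ArrayDeque<>(); // simplified operand stack
        long[] locals = new long[7];            // simplified local variables

        stack.push(Long.MAX_VALUE);   // ldc2_w #3 : push 9223372036854775807
        locals[1] = stack.pop();      // lstore_1  : pop into local 1
        stack.push(1L);               // lconst_1  : push 1
        locals[3] = stack.pop();      // lstore_3  : pop into local 3
        stack.push(locals[1]);        // lload_1   : push local 1
        stack.push(locals[3]);        // lload_3   : push local 3

        // lsub: the top of the stack is the right-hand operand.
        long right = stack.pop();
        long left = stack.pop();
        stack.push(left - right);

        locals[5] = stack.pop();      // lstore 5  : pop into local 5
        return locals[5];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Running it prints 9223372036854775806, the same value the original program prints for secondMaxLong.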

In summary, we have seen how bytecode holds the instructions that the JVM executes, and how each mnemonic maps to an opcode in hexadecimal form.

ref:

  1. List of Java bytecode instructions
  2. The Java Virtual Machine Instruction Set
  3. The Java Virtual Machine