Most programmers rarely give a second thought to the bit patterns their compilers generate. After all, they don’t need to. Compilers have freed them from the low-level details of ones and zeroes. If you are a Java programmer, the JVM has converted your box to a virtual Java machine. It’s a convenient and effective abstraction that allows you to develop wonderful things.
Still … there is a fundamental appeal to writing assembly language, clearing away the virtual cloud, and working with bits and bytes directly, and without a net. At the assembly level, it’s crucial to pay attention to all the ones and zeroes. In fact, paying attention to ones and zeroes is the way forward … the way to get a grip on the hundreds of instructions that are available to you as an assembler programmer.
Assembler instructions fall naturally into groups based on the binary pattern of each instruction. This pattern is called an “instruction format”. For example, the MVC (Move Character) instruction belongs to a group of instructions called “Storage to Storage”. Specifically, MVC is Storage to Storage, type one (SS1). Here’s the format, bit by bit:
Bits 0 – 7 | The operation code
Bits 8 – 15 | The length associated with operand 1
Bits 16 – 19 | The base register for operand 1
Bits 20 – 31 | The displacement for operand 1
Bits 32 – 35 | The base register for operand 2
Bits 36 – 47 | The displacement for operand 2
Suppose you had coded MVC X,Y and you saw the assembled instruction in memory presented in hex format: D2 03 C0 04 C0 08. The D2 is the operation code for MVC. The 03 represents the length associated with operand 1 (really the length – 1, so four bytes will be moved). C004 is the base/displacement address of X (base register 12, displacement 004), and C008 is the base/displacement address of Y (base register 12, displacement 008). This is a fairly boring array of information, yes? Perhaps so, until you realize that this is the only information presented to the CPU when it’s time to execute the instruction.
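To make the bit layout concrete, here is a minimal Python sketch that pulls the six SS1 fields out of the object code above using shifts and masks. The function name decode_ss1 and the field names are my own labels for illustration; they are not part of any assembler or of VisibleZ.

```python
def decode_ss1(instruction: bytes) -> dict:
    """Split a 6-byte SS1 instruction into its fields (bits numbered 0-47, left to right)."""
    if len(instruction) != 6:
        raise ValueError("an SS1 instruction is exactly 6 bytes long")
    word = int.from_bytes(instruction, "big")  # treat the 48 bits as one integer

    def field(first_bit: int, last_bit: int) -> int:
        width = last_bit - first_bit + 1
        shift = 47 - last_bit  # distance from the field's last bit to the right edge
        return (word >> shift) & ((1 << width) - 1)

    return {
        "opcode": field(0, 7),
        "length_code": field(8, 15),   # stored as length - 1
        "base1": field(16, 19),
        "disp1": field(20, 31),
        "base2": field(32, 35),
        "disp2": field(36, 47),
    }


fields = decode_ss1(bytes.fromhex("D203C004C008"))
bytes_moved = fields["length_code"] + 1

for name, value in fields.items():
    print(f"{name:12} = X'{value:X}'")
print(f"bytes moved  = {bytes_moved}")
```

Running it on D2 03 C0 04 C0 08 gives opcode D2, length code 3 (four bytes moved), base register C with displacement 004 for X, and base register C with displacement 008 for Y.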
So what information does the CPU know about your instruction at execution time?
- The operation – MVC
- The number of bytes associated with X. (This length may or may not match the actual length of X!)
- The beginning address of X
- The beginning address of Y
And what exactly doesn’t the CPU understand about your instruction?
- The ending address of X
- The ending address of Y
- The type of information stored in X or Y
- Whether X and Y will overlap if the operation is repeated for the specified length
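The two lists above can be sketched as a toy simulation. The Python below is an illustration only, assuming a simplified machine: memory is a bytearray, the register contents are hypothetical values I made up, and address-mode wraparound is ignored. The mvc function sees nothing but the decoded fields, so it copies one byte at a time, left to right, with no idea where X or Y end or whether they overlap.

```python
def effective_address(registers: list, base: int, displacement: int) -> int:
    """Base register contents plus displacement; a base field of 0 contributes zero."""
    base_value = registers[base] if base != 0 else 0
    return base_value + displacement


def mvc(memory: bytearray, registers: list, length_code: int,
        b1: int, d1: int, b2: int, d2: int) -> None:
    """Toy MVC: move length_code + 1 bytes from operand 2 to operand 1, left to right."""
    dest = effective_address(registers, b1, d1)
    src = effective_address(registers, b2, d2)
    for i in range(length_code + 1):
        memory[dest + i] = memory[src + i]  # no checks on type, field ends, or overlap


registers = [0] * 16
registers[12] = 0x100                 # hypothetical contents of base register 12
memory = bytearray(0x200)             # hypothetical 512-byte storage area
memory[0x104:0x10C] = b"*ABCDEFG"

# Operand 1 starts one byte past operand 2, so the two fields overlap.
mvc(memory, registers, length_code=6, b1=12, d1=0x005, b2=12, d2=0x004)
result = bytes(memory[0x104:0x10C])
print(result)
```

Because each byte is copied before the next one is read, the leading asterisk propagates through all eight bytes. Nothing in the instruction warned the CPU that this would happen.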
What’s really interesting and helpful is that lots of instructions fall into the SS1 instruction format. The facts you learned above apply to all the instructions in this group. Knowing that an instruction is SS1 is half the battle in learning how the instruction works.
So here’s my plan. Early on, learn the following instruction formats: SS1, SS2, SI, RS, RX, and RR. There are other types, but these six types will carry you a long way into this journey.
I’ve developed a software product called VisibleZ that will also help you on this journey. VisibleZ is written in Java, and is an object code emulator for IBM mainframes that helps you visualize instructions as you single-step through object code programs. It will also help you learn each instruction type. You can download the product from the product homepage: http://csc.columbusstate.edu/woolbright/visiblez.xml .
You will also find a series of lessons that will help you get started. Start with the lesson called “Reading Objectcode” and you’ll soon be an expert on the six instruction formats mentioned above.