Shift instructions move the bits of a register or memory location to the right
or left, either by an immediate count or by the count held in the CL register.
They are also a very quick way to multiply or divide by 2 or by powers of 2,
since only a shift of bits is involved. For general purpose registers there are
4 shift instructions, 4 rotate instructions and 2 double-precision shift
instructions.
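
The link between shifting and powers of 2 can be checked with a small model. This is a sketch in Python rather than assembly, and the function names are made up for illustration. Python's `>>` on a negative integer behaves like an arithmetic right shift, so it rounds toward minus infinity, whereas a truncating division rounds toward zero.

```python
def times_pow2(x: int, n: int) -> int:
    """Multiply x by 2**n using a left shift (hypothetical helper)."""
    return x << n


def div_pow2_floor(x: int, n: int) -> int:
    """Divide x by 2**n using an arithmetic right shift; rounds toward -infinity."""
    return x >> n


print(times_pow2(5, 3))       # 40, the same as 5 * 8
print(div_pow2_floor(40, 3))  # 5, the same as 40 // 8
print(div_pow2_floor(-7, 1))  # -4, the shift rounds down
print(int(-7 / 2))            # -3, truncating division rounds toward zero
```

The last two lines show why a right shift of a negative value is not always interchangeable with a signed division instruction.
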
The shift arithmetic left SAL and shift logical left SHL instructions perform
the same operation: they shift the bits in the destination operand to the left.
For each bit position shifted, the most significant bit is moved into the CF
(carry flag) in the RFLAGS register, and the least significant bit is cleared.
Similarly, the shift arithmetic right SAR and shift logical right SHR
instructions shift the bits in the destination operand to the right, with the
least significant bit being moved into the CF flag. However, the most
significant bit is cleared only by the SHR instruction. The SAR instruction
leaves it unchanged, which preserves the sign of the unshifted value.
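
The difference between the three shifts is easiest to see on a value whose top and bottom bits are both set. The following is a minimal Python model, not assembly. The function names mirror the mnemonics, but the helpers themselves are illustrative assumptions, they work one bit at a time, and they return the result together with the last bit moved into CF.

```python
def shl(value: int, count: int, width: int = 32):
    """Model SHL/SAL: the MSB goes to CF, the LSB is cleared."""
    mask = (1 << width) - 1
    value &= mask
    cf = None  # None means "CF not changed" (count of 0)
    for _ in range(count):
        cf = (value >> (width - 1)) & 1
        value = (value << 1) & mask
    return value, cf


def shr(value: int, count: int, width: int = 32):
    """Model SHR: the LSB goes to CF, the MSB is cleared."""
    value &= (1 << width) - 1
    cf = None
    for _ in range(count):
        cf = value & 1
        value >>= 1
    return value, cf


def sar(value: int, count: int, width: int = 32):
    """Model SAR: the LSB goes to CF, the MSB (sign bit) keeps its value."""
    value &= (1 << width) - 1
    sign = value & (1 << (width - 1))
    cf = None
    for _ in range(count):
        cf = value & 1
        value = (value >> 1) | sign
    return value, cf


for name, fn in (("shl", shl), ("shr", shr), ("sar", sar)):
    result, cf = fn(0x80000001, 1)
    print(f"{name}: {result:#010x} CF={cf}")
# shl: 0x00000002 CF=1
# shr: 0x40000000 CF=1
# sar: 0xc0000000 CF=1
```
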
The rotate left ROL and rotate through carry left RCL instructions shift all
their bits to their more-significant bit locations. The rotate right ROR and
rotate through carry right RCR instructions shift all their bits to their
less-significant bit locations. With ROL and ROR the bit that leaves one end is
rotated back into the other end, and a copy of it is placed in CF: the most
significant bit moves to the least significant bit location for ROL, and the
least significant bit moves to the most significant bit location for ROR. The
RCL and RCR instructions include the carry flag CF in the rotation: the bit
that leaves the operand goes into CF, and the previous value of CF enters at
the other end. The overflow flag OF is defined only for the 1-bit rotations.
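
The four rotations can be modelled in the same spirit. This is a hedged Python sketch, not assembly: the helper names follow the mnemonics but are assumptions of this example, OF is not modelled, and the count is taken as already masked by the processor.

```python
def rol(value: int, count: int, width: int = 32):
    """Model ROL: the MSB wraps around to the LSB and is copied into CF."""
    mask = (1 << width) - 1
    value &= mask
    cf = None  # None means "CF not changed" (count of 0)
    for _ in range(count):
        cf = (value >> (width - 1)) & 1
        value = ((value << 1) & mask) | cf
    return value, cf


def ror(value: int, count: int, width: int = 32):
    """Model ROR: the LSB wraps around to the MSB and is copied into CF."""
    value &= (1 << width) - 1
    cf = None
    for _ in range(count):
        cf = value & 1
        value = (value >> 1) | (cf << (width - 1))
    return value, cf


def rcl(value: int, count: int, cf: int, width: int = 32):
    """Model RCL: the MSB goes to CF, the old CF enters at the LSB."""
    mask = (1 << width) - 1
    value &= mask
    for _ in range(count):
        msb = (value >> (width - 1)) & 1
        value = ((value << 1) & mask) | cf
        cf = msb
    return value, cf


def rcr(value: int, count: int, cf: int, width: int = 32):
    """Model RCR: the LSB goes to CF, the old CF enters at the MSB."""
    value &= (1 << width) - 1
    for _ in range(count):
        lsb = value & 1
        value = (value >> 1) | (cf << (width - 1))
        cf = lsb
    return value, cf


print([hex(x) for x in rol(0x80000001, 1)])        # ['0x3', '0x1']
print([hex(x) for x in ror(0x80000001, 1)])        # ['0xc0000000', '0x1']
print([hex(x) for x in rcl(0x80000001, 1, cf=0)])  # ['0x2', '0x1']
print([hex(x) for x in rcr(0x80000001, 1, cf=0)])  # ['0x40000000', '0x1']
```

With the carry flag initially clear, RCL and RCR give the same results as SHL and SHR for a 1-bit count, because the bit that enters the operand is the old CF rather than the bit that just left.
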
In 64-bit mode, the default operation size is 32 bits, and the mask width for
the CL register is 5 bits (maximum value 31). This means that, by default, the
maximum number of bit shifts is 31. To change the operation size to 64 bits,
and the mask width for the CL register to 6 bits (maximum value 63), the REX.W
prefix needs to be used. The assembler adds it automatically if 64-bit
registers such as RBX are used. If 32-bit registers such as EBX are used, no
REX prefix is added. If the extra registers R8 - R15 are used, a REX prefix is
added by the assembler. This holds for all the rotate and shift instructions.
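
The effect of the count mask can be sketched as follows. This is an illustrative Python helper, not an instruction or a library function, and it only covers the 32-bit and 64-bit operand sizes discussed here.

```python
def effective_count(count: int, operand_bits: int) -> int:
    """Return the shift count actually used after masking (hypothetical helper).

    Assumption: only the 32-bit and 64-bit operand sizes are modelled.
    """
    if operand_bits == 64:
        return count & 0x3F  # 6-bit mask, maximum 63
    if operand_bits == 32:
        return count & 0x1F  # 5-bit mask, maximum 31
    raise ValueError("only 32-bit and 64-bit operands are modelled")


print(effective_count(31, 32))  # 31
print(effective_count(32, 32))  # 0, a count of 32 wraps to no shift at all
print(effective_count(32, 64))  # 32
print(effective_count(65, 64))  # 1
```

A practical consequence is that a 32-bit register cannot be cleared by shifting it by 32 in a single instruction, since the masked count is 0.
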
Here is an example of what the opcodes look like when registers of different sizes are used:
- If the instruction is rol eax, 1, the opcode generated (in hexadecimal notation) is D1 C0.
- If the instruction is rol rax, 1, the opcode generated (in hexadecimal notation) is 48 D1 C0. You can see the REX.W prefix (48) added by the assembler.
- If the instruction is rol r8, 1, the opcode generated (in hexadecimal notation) is 49 D1 C0. The REX prefix here is 49 (REX.W together with REX.B, which selects R8).
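
The byte patterns in the list above follow from how the REX prefix and the ModRM byte are built. The Python sketch below reproduces them for a register-direct, rotate-by-1 ROL only. The function and its parameters are assumptions made for illustration, not part of any assembler, and registers are given by their hardware number (0 for EAX/RAX, 3 for EBX/RBX, 8 for R8).

```python
def encode_rol_by_1(reg_number: int, is_64bit: bool) -> bytes:
    """Encode `rol reg, 1` for a 32-bit or 64-bit register (illustrative only)."""
    rex = 0x40                 # fixed high nibble of every REX prefix
    if is_64bit:
        rex |= 0x08            # REX.W: 64-bit operand size
    if reg_number >= 8:
        rex |= 0x01            # REX.B: extends the register field to R8-R15
    # ModRM: mod=11 (register direct), reg field = /0 (ROL), rm = low 3 bits
    modrm = 0xC0 | (reg_number & 7)
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0xD1, modrm])


print(encode_rol_by_1(0, False).hex(" ").upper())  # D1 C0     rol eax, 1
print(encode_rol_by_1(0, True).hex(" ").upper())   # 48 D1 C0  rol rax, 1
print(encode_rol_by_1(8, True).hex(" ").upper())   # 49 D1 C0  rol r8, 1
print(encode_rol_by_1(8, False).hex(" ").upper())  # 41 D1 C0  rol r8d, 1
```

The last line shows that a 32-bit use of an extra register still needs a REX prefix, only without the W bit.
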
Below is a sample program to count the number of bits that are on (value 1) in a 4-byte integer entered by the user at the prompt.

    section .rodata
        prompt1    db "Enter a number:",0
        prompt2    db "The number of bits that are on in %d is %d.",10,0
        num_format db "%d",0    ; format strings must be null-terminated

    section .text
        global main
        extern printf, scanf

    main:
        push rbp
        mov rbp, rsp
        sub rsp, 16     ; room for the 4-byte integer at [rbp-8]; 16 bytes keep RSP 16-byte aligned at the calls
        push rbx
        push r12
        push r13
        push r14
        push r15
        pushfq

        ; read in the 4-byte integer
        mov rdi, dword prompt1
        xor rax, rax
        call printf
        lea rsi, [rbp-8]
        mov rdi, dword num_format
        xor rax, rax
        call scanf

        ; count the bits that have value 1
        mov eax, [rbp-8]    ; since we deal with a 4-byte integer we use EAX here.
                            ; If we want to work with a 64-bit integer we will use RAX instead.
        mov rcx, 64         ; set the maximum number of bits you want to count, in this case 64 (register size).
        xor rdx, rdx
    count_loop:
        rol rax, 1          ; since we want to rotate the bits so as to maintain the unshifted value we use RAX.
        adc rdx, 0          ; since the most-significant bit is moved into CF, we add with carry.
        loop count_loop

        ; print the result
        ; the third argument is the counted value, but that is already stored in RDX.
        mov rsi, rax        ; move the original 4-byte integer value into RSI. We can also use EAX and ESI.
        mov rdi, dword prompt2
        xor rax, rax
        call printf

        popfq               ; restore in the reverse order of the pushes
        pop r15
        pop r14
        pop r13
        pop r12
        pop rbx
        leave               ; restores RSP and RBP, which also releases the reserved stack bytes
        ret

As we can see, the above program keeps the original 4-byte integer value in
the same register RAX, because it uses the ROL instruction: after 64 one-bit
rotations every bit is back in its starting position. We could also use the
SHL instruction and the same count would be obtained. However, we would lose
the original value in the RAX register and would have to load it again from
memory to print it in the result. Will this program work in the same way if
the ROL instruction is replaced by the ROR instruction?
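
The comparison between ROL and SHL in the counting loop can be checked with a small model. This Python sketch mirrors the loop of the program one iteration at a time. The function names are hypothetical, and the variables are named after the registers they stand for.

```python
MASK64 = (1 << 64) - 1


def count_bits_rol(value: int):
    """Model the loop with `rol rax, 1` / `adc rdx, 0`; returns (count, final RAX)."""
    rax, rdx = value & MASK64, 0
    for _ in range(64):                   # mov rcx, 64 ... loop count_loop
        cf = rax >> 63                    # the MSB is copied into CF
        rax = ((rax << 1) & MASK64) | cf  # ...and wraps around to the LSB
        rdx += cf                         # adc rdx, 0
    return rdx, rax


def count_bits_shl(value: int):
    """Same loop with `shl rax, 1`: the bits shifted out are not rotated back in."""
    rax, rdx = value & MASK64, 0
    for _ in range(64):
        cf = rax >> 63
        rax = (rax << 1) & MASK64
        rdx += cf
    return rdx, rax


print(count_bits_rol(0b1011))  # (3, 11): count is 3, RAX still holds the input
print(count_bits_shl(0b1011))  # (3, 0): same count, but RAX ends up as 0
```
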
To read in an 8-byte long integer, we would have to make the following changes:
- replace "%d" with "%ld" in all the string prompts and formats, and
- replace mov eax, [rbp-8] with mov rax, [rbp-8].
To assemble and link the above program, saved as bitcount.asm, we do the following:

    $ yasm -f elf64 bitcount.asm
    $ ld -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
        /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \
        bitcount.o /usr/lib/x86_64-linux-gnu/crtn.o -lc -o bitcount.out

There are two double-precision shift instructions,
SHLD and SHRD … to be done!