Shifts are operations in which the bits of a register or memory location
are moved to the right or left by a count that is given either as an immediate
value or in the CL register. They are also a very quick way to multiply or
divide by powers of 2, since only a shift of bits is involved. There are 4
shift instructions, 4 rotate instructions and 2 double-precision shift
instructions for general-purpose registers.
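As a minimal sketch of the multiply and divide use, assuming the same yasm and libc startup setup that is used for bitcount.asm later in this section, the fragment below multiplies by 8 and then divides by 4. Returning the result as the process exit status is only an illustrative choice.

```nasm
section .text
global main
main:
    mov eax, 5
    shl eax, 3          ; 5 * 2^3 = 40
    shr eax, 2          ; 40 / 2^2 = 10 (unsigned division)
    ret                 ; main returns EAX, so the exit status is 10
```

After running the program, `echo $?` in the shell should show 10.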
The shift arithmetic left SAL and shift logical left SHL instructions
perform the same operation, and shift the bits in the destination operand to the
left. For each shifted bit, the most significant bit is moved into the CF flag
(carry flag) in the RFLAGS register, and the least significant bit is cleared.
Similarly, the shift arithmetic right SAR and shift logical right SHR
instructions shift the bits in the destination operand to the right, with the
least significant bit being moved into the CF flag. However, the most
significant bit is cleared only by the SHR instruction. The SAR instruction
leaves it unchanged, which preserves the sign of the original value in the
destination operand.
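The difference between SAR and SHR only shows up for negative values. The sketch below, under the same assumptions as the rest of this section (yasm, libc startup code, result reported through the exit status for illustration), shifts -16 right by two bits with both instructions.

```nasm
section .text
global main
main:
    mov eax, -16        ; 0xFFFFFFF0
    mov edx, eax        ; second copy for the logical shift
    sar eax, 2          ; sign bit is copied in: EAX = 0xFFFFFFFC = -4
    shr edx, 2          ; zeros are shifted in:  EDX = 0x3FFFFFFC = 1073741820
    neg eax             ; EAX = 4, so that it fits in an exit status
    ret                 ; exit status 4
```

SAR therefore behaves like a signed division by 4 here, while SHR treats the same bit pattern as a large unsigned number.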
The rotate left ROL and rotate through carry left RCL instructions shift all
bits toward the more-significant bit positions, and the most significant bit is
rotated back toward the least significant bit position. The rotate right ROR and
rotate through carry right RCR instructions shift all bits toward the
less-significant bit positions, and the least significant bit is rotated
back toward the most significant bit position. ROL and ROR also copy the bit that wraps around into the carry flag CF. The RCL and RCR instructions include
CF in the rotation itself: the bit shifted out goes into CF, and the previous value of CF enters the vacated bit position. The overflow flag OF is defined only for the 1-bit rotations.
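To make the difference between ROL and RCL concrete, here is a small sketch on 8-bit registers. It assumes the same yasm and libc setup as the sample program in this section, and the exit status is used only to make the result visible.

```nasm
section .text
global main
main:
    mov al, 0x81        ; 1000 0001b
    rol al, 1           ; bit 7 wraps around to bit 0: AL = 0000 0011b, CF = 1
    mov dl, 0x81        ; same bit pattern again (mov does not change flags)
    clc                 ; clear CF so that a 0 is rotated in
    rcl dl, 1           ; 9-bit rotation through CF: DL = 0000 0010b, CF = 1
    add al, dl          ; 3 + 2 = 5
    movzx eax, al       ; zero-extend the result into EAX
    ret                 ; exit status 5
```

With ROL the top bit reappears immediately in bit 0, whereas with RCL it waits in CF and the old CF value is what enters bit 0.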
In 64-bit mode, the default operation size is 32 bits, and the shift count
is masked to 5 bits (maximum value 31). This means that by default the maximum
number of bit-shifts is 31. To change the operation size to 64 bits, and
the width of the count mask to 6 bits (maximum value 63), the REX.W prefix
needs to be used. The assembler adds it automatically when 64-bit
registers such as RAX or RBX are used. If
32-bit registers such as EAX or EBX are used, no REX prefix is added. If the extended registers R8 to R15 are used,
the corresponding REX prefix is added by the assembler. This is valid for all
the rotate and shift instructions.
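The effect of the count mask can be seen by shifting with a count larger than 31. The sketch below assumes the same yasm and libc setup as the sample program in this section; the count of 33 and the use of the exit status are illustrative choices.

```nasm
section .text
global main
main:
    mov cl, 33          ; shift count larger than 31
    mov eax, 1
    shl eax, cl         ; 32-bit operand: count is 33 & 31 = 1, EAX = 2
    mov rdx, 1
    shl rdx, cl         ; 64-bit operand (REX.W): count is 33 & 63 = 33, RDX = 2^33
    shr rdx, 32         ; bring the bit back down: RDX = 2
    add eax, edx        ; 2 + 2 = 4
    ret                 ; exit status 4
```

The same count in CL therefore produces a 1-bit shift on EAX but a 33-bit shift on RDX.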
Here is an example of what the opcodes look like when registers of different sizes are used:
- If the instruction is rol eax, 1, the opcode generated (in hexadecimal notation) is D1 C0.
- If the instruction is rol rax, 1, the opcode generated (in hexadecimal notation) is 48 D1 C0. You can see that 0x48 is the REX.W prefix added by the assembler.
- If the instruction is rol r8, 1, the opcode generated (in hexadecimal notation) is 49 D1 C0. The REX prefix here is 0x49.
Below is a sample program to count the number of bits that are on (value 1) in a 4-byte integer entered by the user at the prompt.
section .rodata
prompt1 db "Enter a number:",0
prompt2 db "The number of bits that are on in %d are %d.",10,0
num_format db "%d",0
section .text
global main
extern printf, scanf
main:
push rbp
mov rbp, rsp
sub rsp, 16 ; the 4-byte integer is read to [rbp-8]; reserving 16 bytes keeps the stack 16-byte aligned at the calls below
push rbx
push r12
push r13
push r14
push r15
pushfq
; read in the 4-byte integer
mov rdi, dword prompt1
xor rax, rax
call printf
lea rsi, [rbp-8]
mov rdi, dword num_format
xor rax, rax
call scanf
; count the bits that have value 1
mov eax, [rbp-8] ; since we deal with a 4-byte integer we use EAX here.
; If we want to work with a 64-bit integer we will use RAX instead.
mov rcx, 64 ; set the maximum number of bits you want to count, in this case 64 (register size).
xor rdx, rdx
count_loop:
rol rax, 1 ; since we want to rotate the bits so as to maintain the unshifted value we use RAX.
adc rdx, 0 ; since the most-significant bit is moved into CF, we add with carry.
loop count_loop
; print the result
; the third argument is the counted value, but that is already stored in RDX.
mov rsi, rax ; move the original 4-byte integer value into RSI. We can also use EAX and ESI.
mov rdi, dword prompt2
xor rax, rax
call printf
; the space reserved with sub rsp is released by leave below, so no extra pop is needed here
popfq
pop r15
pop r14
pop r13
pop r12
pop rbx
leave
ret
As we can see, the above program ends up with the original 4-byte integer value in
the same register RAX, because 64 one-bit ROL rotations bring every bit back to
its starting position. We
could also use the SHL instruction and the same count would be obtained.
However, we would end up losing the original value in the RAX register and would
have to get it back from memory to print it on screen in the results. Would this
program work in the same way if the ROL instruction were replaced
by the ROR instruction?
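The SHL variant mentioned above can be sketched as follows. This is not the program from this section but a reduced illustration: the value 0xF0F0 is a hypothetical stand-in for the user input, and the count is returned as the exit status instead of being printed.

```nasm
section .text
global main
main:
    mov eax, 0xF0F0     ; hypothetical sample value with 8 bits set
    mov ecx, 32         ; only the 32 bits of EAX need to be examined
    xor edx, edx        ; bit counter
count_loop:
    shl eax, 1          ; most significant bit goes to CF and is lost from EAX
    adc edx, 0          ; add CF to the counter
    loop count_loop     ; decrement RCX and repeat until it reaches 0
    mov eax, edx        ; EAX is 0 by now, so return the count instead
    ret                 ; exit status 8
```

After the loop EAX no longer holds the input, which is why a program that wants to print the original value has to reload it from memory.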
To read in an 8-byte long integer, we would have to make the following changes:
- replace "%d" with "%ld" in all the string prompts and formats, and
- replace mov eax, [rbp-8] with mov rax, [rbp-8].
To assemble and link the program above, saved as bitcount.asm, we do the following:
$ yasm -f elf64 bitcount.asm
$ ld -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
/usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \
bitcount.o /usr/lib/x86_64-linux-gnu/crtn.o -lc -o bitcount.out
Download bitcount.asm, asm_io.inc and asm_io.asm.
There are two double-precision shift instructions, SHLD and SHRD … to be done!
