gdb msb lsb float little endian

1. gdb float
2. disassemble
3. cpp weekly assembly
- 3.1. C++ Weekly - Ep 34 - Reading Assembly Language - Part 1 (intel syntax, AT&T syntax)
- 3.2. C++ Weekly - Ep 35 - Reading Assembly Language - Part 2

1 gdb float

#include <iostream>

int main() {
    float f = 8.5;
    std::cout << f << '\n';
    return 0;
}

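This is a little-endian example: x/4tb prints the 4 bytes of f in binary starting at the lowest address, so the least significant byte comes first.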

(gdb) p f
$1 = 8.5

(gdb) x/4tb &f
0x7fffffffc22c: 00000000        00000000        00001000        01000001

Python

# split by space and reverse
''.join('00000000        00000000        00001000        01000001'.split()[::-1])

1.1 float 8.5 IEEE

8.5 (https://resources.nerdfirst.net/float)

Final Bit String: 0 10000010 0001000 00000000 00000000
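
The same bit string can be produced from C++ instead of reversing the gdb output by hand. This is a minimal sketch, assuming float is a 32-bit IEEE 754 value (as on x86-64); float_bits and float_bits_example are illustrative names, not from the original notes.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Format the 32 bits of a float as "sign exponent mantissa", most significant
// bit first. Assumes float is IEEE 754 binary32 (true on x86-64).
std::string float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // copy the bytes instead of casting pointers
    const std::string bits = std::bitset<32>(u).to_string();
    return bits.substr(0, 1) + ' ' + bits.substr(1, 8) + ' ' + bits.substr(9);
}

// Example: 8.5 should reproduce the bit string listed above.
void float_bits_example() {
    assert(float_bits(8.5f) == "0 10000010 00010000000000000000000");
}
```

Because the bits are read from an integer value rather than from memory byte by byte, the result does not depend on the byte order of the machine.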

1.2 little or big endian

Intel processors are little endian machines

int x = 0x76543210; 
char *c = (char*) &x;

Take a pointer c of type char* and assign it the address of x, cast to a char pointer.

On a little-endian architecture, printing *c gives 0x10 (the least significant byte of x).

On a big-endian architecture, printing *c gives 0x76 (the most significant byte of x).
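
The check described above can be wrapped in a small helper. This is a sketch under the assumption that the machine is either purely little endian or purely big endian; first_byte, is_little_endian and endian_example are hypothetical names, and memcpy is used in place of the pointer cast.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Return the byte stored at the lowest address of x.
unsigned first_byte(std::uint32_t x) {
    unsigned char b;
    std::memcpy(&b, &x, 1);  // read only the first byte of x's storage
    return b;
}

// True when the lowest address holds the least significant byte.
bool is_little_endian() {
    return first_byte(0x76543210u) == 0x10u;
}

// Example: the first byte is 0x10 on little endian and 0x76 on big endian.
void endian_example() {
    assert(first_byte(0x76543210u) == (is_little_endian() ? 0x10u : 0x76u));
}
```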

1.3 verilog

input [7:0] address;

Note the [7:0] means we're using the little-endian convention

You start with 0 at the rightmost bit to begin the vector, then move to the left.

http://www.asic-world.com/verilog/verilog_one_day1.html

—

Does a single byte stored by the CPU also have a big-endian or little-endian bit order internally?

For example, 0xB4 (10110100):

lsb                                                   msb
--------------------------------------------------------------->
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   0  |   0  |   1  |   0  |   1  |   1  |   0  |   1  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

msb                                                         lsb
---------------------------------------------------------------->
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   1  |   0  |   1  |   1  |   0  |   1  |   0  |   0  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

But the smallest unit the CPU operates on is one byte, so the bit order inside a byte is a black box to application programs.

For example, 0x12345678:

Little Endian

low address                                   high address
----------------------------------------->
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     78     |      56    |     34      |     12    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Big Endian

low address                                   high address
-------------------------------------------------------->
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     12     |      34    |     56      |     78    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

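Big endian means the most significant byte (MSB) is stored at the lowest address; little endian stores the least significant byte (LSB) there.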
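
The two layouts of 0x12345678 can be reproduced with shifts, independent of the host byte order. This is an illustrative sketch; Bytes, to_little_endian, to_big_endian and byte_order_example are made-up names, and index 0 stands for the lowest address.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 4>;  // index 0 = lowest address

// Least significant byte first.
Bytes to_little_endian(std::uint32_t v) {
    return {{static_cast<std::uint8_t>(v), static_cast<std::uint8_t>(v >> 8),
             static_cast<std::uint8_t>(v >> 16), static_cast<std::uint8_t>(v >> 24)}};
}

// Most significant byte first: the little-endian layout reversed.
Bytes to_big_endian(std::uint32_t v) {
    const Bytes le = to_little_endian(v);
    return {{le[3], le[2], le[1], le[0]}};
}

// Example: matches the two diagrams for 0x12345678.
void byte_order_example() {
    assert((to_little_endian(0x12345678u) == Bytes{{0x78, 0x56, 0x34, 0x12}}));
    assert((to_big_endian(0x12345678u) == Bytes{{0x12, 0x34, 0x56, 0x78}}));
}
```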

1.4 float IEEE 754

- equation https://youtu.be/gc1Nl3mmCuY?t=315
- exponent: unsigned number (offset 127) https://youtu.be/gc1Nl3mmCuY?t=450
- https://www.youtube.com/watch?v=gc1Nl3mmCuY
- https://www.youtube.com/watch?v=b2FgF2sUoS8
- https://resources.nerdfirst.net/float simulator
- #self-note
- denormal float->bit https://youtu.be/b2FgF2sUoS8?t=507
- denormal float->bit https://youtu.be/b2FgF2sUoS8?t=540
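
The equation and the offset-127 exponent can be tried out by splitting a float into its three fields and rebuilding the value from them. This is a rough sketch, assuming a 32-bit IEEE 754 float; FloatFields, split_float and value_of are hypothetical names, and infinities and NaN (exponent 255) are deliberately not handled.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, stored with an offset of 127
    std::uint32_t fraction;  // 23 bits
};

FloatFields split_float(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return {u >> 31, (u >> 23) & 0xFFu, u & 0x7FFFFFu};
}

// Rebuild the value from the fields. Exponent 255 (inf/NaN) is not handled.
double value_of(const FloatFields& p) {
    const double sign = p.sign ? -1.0 : 1.0;
    const double frac = p.fraction / 8388608.0;  // fraction / 2^23
    if (p.exponent == 0) {
        return sign * std::ldexp(frac, -126);  // denormal: no implicit leading 1
    }
    return sign * std::ldexp(1.0 + frac, static_cast<int>(p.exponent) - 127);
}

// Example: 8.5 = 1.0625 * 2^(130 - 127).
void ieee754_example() {
    assert(split_float(8.5f).exponent == 130);
    assert(value_of(split_float(8.5f)) == 8.5);
}
```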

1.5 two's complement

https://www.cnblogs.com/zhangziqiu/archive/2011/03/30/computercode.html

In a two's complement system, negating an integer x gives the same value as applying NOT to it and then adding 1. (-x ≡ ~x + 1)

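Sign-magnitude (原碼, "sm"), ones' complement (反碼, "oc"), two's complement (補碼, "tc")

Sign-magnitude

The sign-magnitude form is the sign bit plus the absolute value of the number: the first bit holds the sign and the remaining bits hold the value. For example, with 8 bits:

[+1]sm = 0000 0001
[-1]sm = 1000 0001

The first bit is the sign bit, so the range of an 8-bit number is:

[1111 1111 , 0111 1111]

that is,

[-127 , 127]

Ones' complement

The ones' complement of a positive number is the number itself. The ones' complement of a negative number is its sign-magnitude form with the sign bit unchanged and all other bits inverted.

[+1] = [00000001]sm = [00000001]oc

[-1] = [10000001]sm = [11111110]oc

When a ones' complement value is negative, its magnitude is not obvious at a glance. It is usually converted back to sign-magnitude before working out the value.

Two's complement

The two's complement of a positive number is the number itself.

The two's complement of a negative number is its sign-magnitude form with the sign bit unchanged, all other bits inverted, and then 1 added (that is, the ones' complement plus 1).

[+1] = [00000001]sm = [00000001]oc = [00000001]tc
[-1] = [10000001]sm = [11111110]oc = [11111111]tc

A negative two's complement value is likewise hard to read directly. It is usually converted to sign-magnitude to work out its value.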
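
The three 8-bit encodings and the identity -x ≡ ~x + 1 can be checked with a few helpers. This is a minimal sketch for values in [-127, 127]; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Sign-magnitude: sign bit followed by the 7-bit absolute value.
std::uint8_t sign_magnitude(int v) {
    assert(v >= -127 && v <= 127);
    const int magnitude = v < 0 ? -v : v;
    return static_cast<std::uint8_t>(v < 0 ? (0x80 | magnitude) : magnitude);
}

// Ones' complement: for negatives, keep the sign bit and invert the other 7 bits.
std::uint8_t ones_complement(int v) {
    const std::uint8_t sm = sign_magnitude(v);
    return v < 0 ? static_cast<std::uint8_t>(sm ^ 0x7F) : sm;
}

// Two's complement: for negatives, ones' complement plus 1.
std::uint8_t twos_complement(int v) {
    const std::uint8_t oc = ones_complement(v);
    return v < 0 ? static_cast<std::uint8_t>(oc + 1) : oc;
}

// -x == ~x + 1, done on an unsigned type so wraparound is well defined.
std::uint32_t negate(std::uint32_t x) { return ~x + 1u; }
```

For -1 the helpers give 0x81, 0xFE and 0xFF, matching the three bit patterns listed above.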

1.6 stack

Journey to the Stack, Part I
- https://www.zcfy.cc/article/journey-to-the-stack-part-i
- https://manybutfinite.com/post/journey-to-the-stack/

2 disassemble

The instruction pointer is a register that stores the memory address of the next instruction.

On x86-64, that register is %rip.

We can access the instruction pointer using the $rip variable, or alternatively we can use the architecture independent $pc

(gdb) x/i $pc
0x100000e94 <natural_generator+4>:  movl   $0x1,-0x4(%rbp)

The instruction pointer always contains the address of the next instruction to be run.

In GDB 7.0 or later, you can just run set disassemble-next-line on, which shows all the instructions that make up the next line of source

set disassemble-next-line on

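Setting a breakpoint on a function's first assembly instruction https://wizardforcel.gitbooks.io/100-gdb-tips/break-on-first-assembly-code.html

The usual command for a function breakpoint, "b func" (b is short for break), does not put the breakpoint at the very start of the function at the assembly level; it stops after the function prologue.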

(gdb) disassemble

To put the breakpoint at the very start of the function at the assembly level, use "b *func":

(gdb) b *func

—

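The "disas /m fun" command (disas is short for disassemble) maps a function's source code to its assembly instructions.

To see only the address range that corresponds to one source line: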

(gdb) i line 13
Line 13 of "foo.c" starts at address 0x4004e9 <main+37> and ends at 0x40050c <main+72>.

disassemble [Start],[End]
(gdb) disassemble 0x4004e9, 0x40050c

Display the assembly instructions that are about to be executed:

display /i $pc
display /3i $pc

(gdb) i registers
(gdb) info registers

The output above does not include the floating-point and vector registers. To see those as well:

(gdb) i all-registers

C++ disassembly notes https://bot-man-jl.github.io/articles/?post=2019/Cpp-Disassembly-Notes

3 cpp weekly assembly

3.1 C++ Weekly - Ep 34 - Reading Assembly Language - Part 1 (intel syntax, AT&T syntax)

https://www.youtube.com/watch?v=my39Gpt6bvY

Compiler Explorer https://godbolt.org/

int main()
{
}

No optimization, Intel assembly syntax: the destination is on the left (rbp in "mov rbp, rsp") and the source is on the right (rsp).

main:
  push rbp
  mov rbp, rsp
  mov eax, 0
  pop rbp
  ret

4:00 return 0; return 5;

xor eax, eax

There is also AT&T syntax. My understanding is that AT&T syntax is the older syntax, and it is more or less readable depending on who you talk to.

main:
  xorl %eax, %eax 
  ret

int main()
{
  return 5;
}

AT&T syntax

main:
  movl $5, %eax
  ret

intel syntax

main:
  mov eax, 5
  ret

In a way I prefer AT&T syntax. I think it is a little more readable, because left-to-right movement makes more sense to me, but a lot of the world is moving towards Intel syntax.

3.2 C++ Weekly - Ep 35 - Reading Assembly Language - Part 2

https://www.youtube.com/watch?v=R3HZJ1h2BVY

I have been focusing specifically on 64-bit Intel-compatible assembly, but these concepts should apply to most other things.

https://godbolt.org/

int main(int argc, const char* [])
{
  return 5;
}

main:
  push rbp
  mov rbp, rsp
  mov DWORD PTR [rbp-4], edi
  mov QWORD PTR [rbp-16], rsi
  mov eax, 5
  pop rbp
  ret

return 5 + 3*argc

-O1 3:06

main:
  lea eax, [rdi+5+rdi*2]
  ret

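lea uses the syntax for indexing a memory location, but here it only computes the value rdi + rdi*2 + 5 and never touches memory.

rdi holds argc, a 4-byte int in its low half (edi); the compiler uses it to do the math: argc + 2*argc + 5 = 5 + 3*argc.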
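
The arithmetic behind the lea instruction can be compared against the source expression directly. This is only an illustrative sketch; source_form, lea_form and lea_example are made-up names, and lea_form simply spells out base + displacement + index*scale in C++.

```cpp
#include <cassert>

// The expression as written in the source.
int source_form(int argc) { return 5 + 3 * argc; }

// Mirrors lea eax, [rdi+5+rdi*2]: base + displacement + index*scale.
int lea_form(int argc) { return argc + 5 + argc * 2; }

// Example: both forms agree for a range of small argc values.
void lea_example() {
    for (int a = 0; a < 100; ++a) {
        assert(source_form(a) == lea_form(a));
    }
}
```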