API Reference: https://api.binary.ninja/binaryninja.architecture-module.html#binaryninja.architecture.Architecture

The architecture plugin is where we define things such as
- endianness
- max_instr_length
- default_int_size
- instr_alignment
- stack_pointer

```python
class CoolVMArch(Architecture):
    name = "coolvm"
    endianness = Endianness.BigEndian
    default_int_size = 1
    max_instr_length = 4  # attribute name matches the list above
    instr_alignment = 4
    stack_pointer = "sp"
```

# Registers
We can also define the registers. In `CoolVM`, the `init_program` function sets the registers to 0.

```c
int64_t init_program(struct program_struct* arg1)
{
    arg1->sp = 0
    arg1->sp = malloc(bytes: 0x400)
    int64_t rax_5
    if (arg1->sp != 0)
    {
        arg1->reg1 = 0
        arg1->reg2 = 0
        arg1->reg3 = 0
        arg1->reg4 = 0
        arg1->exit = 0
        rax_5 = 0
    }
    else
    {
        rax_5 = 1
    }
    return rax_5
}
```

We can see there are 4 general purpose registers, an exit register (if this value is 1, the VM exits), a stack pointer (with 0x400 bytes of stack space), and, while not shown in this function, a pc register.

For each register, we create a `RegisterInfo` object (https://api.binary.ninja/binaryninja.architecture-module.html?highlight=registerinfo#binaryninja.architecture.RegisterInfo), which defines the name and size, as well as whether it is a sub-register. `CoolVM`, however, does not have any sub-registers.

```python
regs = {}
regs['sp'] = RegisterInfo("sp", 1)
regs['pc'] = RegisterInfo("pc", 1)
regs['exit'] = RegisterInfo("exit", 1)
# General purpose registers r1..r4, each 1 byte wide
for x in range(1, 5):
    reg_name = f"r{x}"
    regs[reg_name] = RegisterInfo(reg_name, 1)
```

# Intrinsics
We can make the instructions `read`, `print`, and `exit` intrinsics, so that when binja decompiles these instructions they are treated as black-box functions.

API Reference: https://api.binary.ninja/binaryninja.architecture-module.html?highlight=registerinfo#binaryninja.architecture.IntrinsicInfo

An intrinsic can have inputs and outputs. In the case of `CoolVM`, `read` and `print` both take one argument as an input.
`read` modifies the register passed as its input, but we don't need to tell binja that. Lastly, we create a `printStr` intrinsic, which is used in the [[Workflow]] that combines multiple single-character `print` instructions into one `printStr` instruction.

```python
intrinsics = {
    "read": IntrinsicInfo([Type.char()], []),
    "print": IntrinsicInfo([Type.char()], []),
    "exit": IntrinsicInfo([], []),
    "printStr": IntrinsicInfo([], [])
}
```

# Disassembling
Next, we create an instance of our disassembler, defined in [[Disassembler]], in the `__init__` function of the class. The disassembler is used for both disassembling and lifting instructions.

```python
def __init__(self):
    super().__init__()  # let the base Architecture class run its own setup
    self.disassembler = CoolVMDisassembler()
```

Binja has two functions related to disassembling:
- `get_instruction_info`
- `get_instruction_text`

## get_instruction_info
API Reference: https://api.binary.ninja/binaryninja.architecture-module.html?highlight=registerinfo#binaryninja.architecture.InstructionInfo

An `InstructionInfo` holds the size of the instruction and any branches it can take. Since `CoolVM` has a fixed instruction length of 4, we can hard-code that. The disassembler returns two values, `tokens` and `branch_conds`. The tokens aren't used in this function, but the branch conditions are. `branch_conds` is a list of `BranchInfo` objects, a small class we define ourselves.
The branch types (binja's `BranchType` enum) are as follows:

```python
class BranchType(enum.IntEnum):
    UnconditionalBranch = 0
    FalseBranch = 1
    TrueBranch = 2
    CallDestination = 3
    FunctionReturn = 4
    SystemCall = 5
    IndirectBranch = 6
    ExceptionBranch = 7
    UnresolvedBranch = 127
    UserDefinedBranch = 128
```

Some branch types do not use a target (like `FunctionReturn`), so `target` is optional.

```python
class BranchInfo:
    def __init__(self, _type, target=None):
        self.type = _type      # a BranchType value
        self.target = target   # destination address, or None if unused
```

So the final code for the function is:

```python
def get_instruction_info(self, data, addr):
    # Only the branch information is needed here; the tokens are ignored.
    _, branch_conds = self.disassembler.disas(data, addr)
    instr_info = InstructionInfo(4)  # fixed 4-byte instructions
    for branch_info in branch_conds:
        if branch_info.target is not None:
            instr_info.add_branch(branch_info.type, branch_info.target)
        else:
            instr_info.add_branch(branch_info.type)
    return instr_info
```

## get_instruction_text
This function determines how binja displays the instruction in the linear and graph views: it returns the actual text of the instruction as a list of tokens, along with the type of each token. Our disassembler does the heavy lifting and returns the correct tokens. The second value this function returns is the instruction size, which for `CoolVM` is always `4`.

```python
def get_instruction_text(self, data, addr):
    tokens, _ = self.disassembler.disas(data, addr)
    return tokens, 4
```

# Lifting
Lifting is how you go from disassembly to decompilation. In binja, you write the basic `LowLevelIL` for each instruction. Binja will propagate that information up to `MediumLevelIL` and `HighLevelIL`, and also provides `Pseudo-C`.

The actual lifting is implemented in our [[Lifter]]. However, the Arch class has a function that calls our lifter, so we also need to add the lifter class to our `__init__` function.

## get_instruction_low_level_il
Call the lifter, then return the instruction size.
```python
def get_instruction_low_level_il(self, data, addr, il):
    self.lifter.lift(data, addr, il)
    return 4
```

# Code
The full source can be found at [coolvm_binja/arch.py at master · thisusernameistaken/coolvm_binja (github.com)](https://github.com/thisusernameistaken/coolvm_binja/blob/master/arch.py)

```python
from binaryninja import (
    Architecture,
    Endianness,
    InstructionInfo,
    IntrinsicInfo,
    RegisterInfo,
    Type
)
from .disassembler import CoolVMDisassembler
# Assumption: the lifter lives in its own module (see [[Lifter]]);
# adjust the import if your layout differs.
from .lifter import CoolVMLifter

class CoolVMArch(Architecture):
    name = "coolvm"
    endianness = Endianness.BigEndian
    default_int_size = 1
    max_instr_length = 4
    instr_alignment = 4
    stack_pointer = "sp"

    regs = {}
    regs['sp'] = RegisterInfo("sp", 1)
    regs['pc'] = RegisterInfo("pc", 1)
    regs['exit'] = RegisterInfo("exit", 1)
    # General purpose registers r1..r4
    for x in range(1, 5):
        reg_name = f"r{x}"
        regs[reg_name] = RegisterInfo(reg_name, 1)

    intrinsics = {
        "read": IntrinsicInfo([Type.char()], []),
        "print": IntrinsicInfo([Type.char()], []),
        "exit": IntrinsicInfo([], []),
        "printStr": IntrinsicInfo([], [])
    }

    def __init__(self):
        super().__init__()  # let the base Architecture class run its own setup
        self.disassembler = CoolVMDisassembler()
        self.lifter = CoolVMLifter()

    def get_instruction_info(self, data, addr):
        _, branch_conds = self.disassembler.disas(data, addr)
        instr_info = InstructionInfo(4)
        for branch_info in branch_conds:
            if branch_info.target is not None:
                instr_info.add_branch(branch_info.type, branch_info.target)
            else:
                instr_info.add_branch(branch_info.type)
        return instr_info

    def get_instruction_text(self, data, addr):
        tokens, _ = self.disassembler.disas(data, addr)
        return tokens, 4

    def get_instruction_low_level_il(self, data, addr, il):
        self.lifter.lift(data, addr, il)
        return 4
```
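
# Trying the branch loop outside binja
The branch-collection loop in `get_instruction_info` can be exercised without Binary Ninja installed. The following is a minimal sketch under stated assumptions: `FakeInstructionInfo` is a hypothetical stand-in that only records what `add_branch` receives, `StubDisassembler` is a hypothetical decoder, and the opcode values `0x01` and `0x02` are invented for the demo and are not the real `CoolVM` encoding. Only `BranchInfo` and the loop itself mirror the code above.

```python
import enum
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


class BranchType(enum.IntEnum):
    # Subset of the branch types listed earlier.
    UnconditionalBranch = 0
    FalseBranch = 1
    TrueBranch = 2
    FunctionReturn = 4


class BranchInfo:
    def __init__(self, _type, target=None):
        self.type = _type
        self.target = target


@dataclass
class FakeInstructionInfo:
    """Hypothetical stand-in for InstructionInfo; it only records branches."""
    length: int
    branches: List[Tuple[BranchType, Optional[int]]] = field(default_factory=list)

    def add_branch(self, branch_type, target=None):
        self.branches.append((branch_type, target))


class StubDisassembler:
    """Hypothetical decoder; the opcodes below are invented for this demo."""

    def disas(self, data, addr):
        opcode = data[0]
        if opcode == 0x01:
            # Pretend conditional jump: taken target is in the last byte,
            # fall-through is the next 4-byte instruction.
            return [], [
                BranchInfo(BranchType.TrueBranch, data[3]),
                BranchInfo(BranchType.FalseBranch, addr + 4),
            ]
        if opcode == 0x02:
            # Pretend return: a branch type with no target.
            return [], [BranchInfo(BranchType.FunctionReturn)]
        return [], []  # ordinary instruction, no branches


def get_instruction_info(disassembler, data, addr):
    # Same loop as the architecture method, using the stand-in info object.
    _, branch_conds = disassembler.disas(data, addr)
    instr_info = FakeInstructionInfo(4)
    for branch_info in branch_conds:
        if branch_info.target is not None:
            instr_info.add_branch(branch_info.type, branch_info.target)
        else:
            instr_info.add_branch(branch_info.type)
    return instr_info


if __name__ == "__main__":
    info = get_instruction_info(StubDisassembler(), bytes([0x01, 0, 0, 0x20]), 0x10)
    for branch_type, target in info.branches:
        print(branch_type.name, target)
```

A conditional instruction reports two branches (taken and fall-through), while a return reports one branch with no target. That is why `target` is checked against `None` before `add_branch` is called.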