A First Look at the remill Source Code (1)

remill is a library that translates machine code into LLVM IR. The library focuses only on lifting to LLVM IR, so to use it as a tool you also need another official project, McSema.

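Next we will walk through the remill source code. Since the official McSema is fairly large, we will use a small example written by the remill authors as our entry point.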

Let's first read the README for this small tool to get a rough idea of what it does.

remill-lift is an example program that shows how to use some of the Remill
APIs, specifically, the TraceLifter API.

Here is an example usage of remill-lift:

remill-lift-6.0 --arch amd64 --ir_out /dev/stdout --bytes c704ba01000000

This lifts the AMD64 mov DWORD PTR [rdx + rdi * 4], 1 to LLVM bitcode. It will output the lifted module to the stdout, showing something similar to the following:

; Function Attrs: noinline nounwind ssp
define %struct.Memory* @sub_0(%struct.State* noalias dereferenceable(3280), i64, %struct.Memory* noalias) local_unnamed_addr #0 {
entry:
  %3 = getelementptr inbounds %struct.State, %struct.State* %0, i64 0, i32 6, i32 33, i32 0, i32 0
  %4 = getelementptr inbounds %struct.State, %struct.State* %0, i64 0, i32 6, i32 7, i32 0, i32 0
  %5 = getelementptr inbounds %struct.State, %struct.State* %0, i64 0, i32 6, i32 11, i32 0, i32 0
  %6 = load i64, i64* %4, align 8
  %7 = load i64, i64* %5, align 8
  %8 = shl i64 %7, 2
  %9 = add i64 %8, %6
  %10 = add i64 %1, 7
  store i64 %10, i64* %3, align 8
  %11 = tail call %struct.Memory* @__remill_write_memory_32(%struct.Memory* %2, i64 %9, i32 1) #3
  %12 = tail call %struct.Memory* @__remill_missing_block(%struct.State* nonnull %0, i64 %10, %struct.Memory* %11)
  ret %struct.Memory* %12
}
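To connect the bytes c704ba01000000 with the address arithmetic in the IR (the shl by 2 followed by an add), here is a minimal sketch of how the SIB byte 0xba splits into scale, index and base. SibFields, DecodeSib and EffectiveAddress are hypothetical names made up for this illustration; they are not remill APIs, and the sketch ignores REX prefixes and displacements.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the three fields packed into an x86 SIB byte.
struct SibFields {
  unsigned scale;  // Multiplier applied to the index register: 1, 2, 4 or 8.
  unsigned index;  // Index register number (7 is rdi when no REX prefix is used).
  unsigned base;   // Base register number (2 is rdx when no REX prefix is used).
};

// Split a SIB byte: bits 7..6 hold log2(scale), bits 5..3 the index
// register, bits 2..0 the base register.
SibFields DecodeSib(uint8_t sib) {
  return {1u << (sib >> 6), (sib >> 3) & 7u, sib & 7u};
}

// Compute base + index * scale, the same value the IR builds with
// `shl` (multiply by 4) followed by `add`.
uint64_t EffectiveAddress(uint64_t base, uint64_t index, unsigned scale) {
  assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
  return base + index * scale;
}
```

For 0xba this gives scale 4, index 7 (rdi) and base 2 (rdx), which matches the [rdx + rdi * 4] operand from the README.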

In short, it translates a piece of machine code into LLVM IR.

Let's look at the main function.

The main function

We skip over the initialization and argument checks.

First, UnhexlifyInputBytes converts the hex string given on the command line into bytes and stores them in a value of type Memory.
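The conversion step can be sketched roughly as follows. This is not the tool's actual implementation: ByteMemory, HexNibble and UnhexlifySketch are hypothetical names, and modelling memory as a sparse address-to-byte map is an assumption made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Assumption: memory is modelled as a sparse address -> byte map.
using ByteMemory = std::map<uint64_t, uint8_t>;

// Convert one hex digit to its numeric value. The caller must have
// validated the character already.
static uint8_t HexNibble(char c) {
  const unsigned char uc = static_cast<unsigned char>(c);
  assert(std::isxdigit(uc));
  if (std::isdigit(uc)) {
    return static_cast<uint8_t>(c - '0');
  }
  return static_cast<uint8_t>(std::tolower(uc) - 'a' + 10);
}

// Hypothetical helper: decode `hex` into bytes placed at consecutive
// addresses starting at `base`, each address truncated with `addr_mask`.
// Returns an empty map if the input is not valid hex of even length.
ByteMemory UnhexlifySketch(const std::string &hex, uint64_t base,
                           uint64_t addr_mask) {
  ByteMemory memory;
  if (hex.size() % 2 != 0) {
    return memory;
  }
  for (char c : hex) {
    if (!std::isxdigit(static_cast<unsigned char>(c))) {
      return memory;
    }
  }
  for (size_t i = 0; i < hex.size(); i += 2) {
    const uint8_t byte = static_cast<uint8_t>(
        (HexNibble(hex[i]) << 4) | HexNibble(hex[i + 1]));
    memory[(base + i / 2) & addr_mask] = byte;
  }
  return memory;
}
```

Calling UnhexlifySketch("c704ba01000000", 0, ~0ULL) yields seven entries at addresses 0 through 6, starting with 0xc7.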

Then a series of initializations builds the trace_lifter:

Memory memory = UnhexlifyInputBytes(addr_mask);
SimpleTraceManager manager(memory);
remill::IntrinsicTable intrinsics(module);
remill::InstructionLifter inst_lifter(arch, intrinsics);
remill::TraceLifter trace_lifter(inst_lifter, manager);

What follows is the actual lifting: main calls the Lift function, so let's step into it.

// Lift all discoverable traces starting from `--entry_address` into
// `module`.
trace_lifter.Lift(FLAGS_entry_address);

trace_lifter.Lift

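Here too we skip a pile of checks and jump straight to the decoding part.

It starts with a while loop that runs as long as there are instructions that have not been processed yet.

It then pops an instruction address and gets (or creates) the block for that address.

If the block is non-empty, the instruction has already been lifted, so the loop moves on.

If the address is the head of an existing trace (and not the head of the trace currently being lifted), the instruction is neither decoded nor lifted; a tail call into that trace is emitted instead and the loop moves on.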

// Decode instructions.
while (!state.inst_work_list.empty()) {
  const auto inst_addr = state.PopInstructionAddress();

  state.block = state.GetOrCreateBlock(inst_addr);
  state.switch_inst = nullptr;

  // We have already lifted this instruction block.
  if (!state.block->empty()) {
    continue;
  }

  // Check to see if this instruction corresponds with an existing
  // trace head, and if so, tail-call into that trace directly without
  // decoding or lifting the instruction.
  if (inst_addr != trace_addr) {
    if (auto inst_as_trace = GetLiftedTraceDeclaration(inst_addr)) {
      AddTerminatingTailCall(state.block, inst_as_trace);
      continue;
    }
  }
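The work-list pattern used by this loop can be shown in isolation. The sketch below is a simplified stand-in, not remill code: Successors and VisitOnce are hypothetical names, the successor table replaces real decoding, and using an ordered set as the work list is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical input: for each instruction address, the addresses that
// control flow can reach next. Real code would get these by decoding.
using Successors = std::map<uint64_t, std::vector<uint64_t>>;

// Process every reachable address exactly once and return the order in
// which addresses were handled.
std::vector<uint64_t> VisitOnce(uint64_t entry, const Successors &succs) {
  std::set<uint64_t> work_list{entry};
  std::set<uint64_t> lifted;
  std::vector<uint64_t> order;

  while (!work_list.empty()) {
    // Pop the lowest pending address.
    const uint64_t addr = *work_list.begin();
    work_list.erase(work_list.begin());

    // Already handled: the counterpart of the non-empty block check.
    if (!lifted.insert(addr).second) {
      continue;
    }
    order.push_back(addr);

    // Queue the successors so they get processed later.
    const auto it = succs.find(addr);
    if (it != succs.end()) {
      for (uint64_t next : it->second) {
        work_list.insert(next);
      }
    }
  }

  assert(order.size() == lifted.size());
  return order;
}
```

The "already lifted" test is what makes the loop terminate on code with back edges: a jump back to an address that was handled before is popped and dropped immediately.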

This part reads the bytes of the instruction:

  // Read instruction bytes.
  state.inst_bytes.clear();
  for (size_t i = 0; i < state.max_inst_bytes; ++i) {
    const auto byte_addr = (inst_addr + i) & addr_mask;
    if (byte_addr < inst_addr) {
      break;  // 32- or 64-bit address overflow.
    }
    uint8_t byte = 0;
    if (!manager.TryReadExecutableByte(byte_addr, &byte)) {
      DLOG(WARNING)
          << "Couldn't read executable byte at "
          << std::hex << byte_addr << std::dec;
      break;
    }
    state.inst_bytes.push_back(static_cast<char>(byte));
  }

  // No executable bytes here.
  if (state.inst_bytes.empty()) {
    AddTerminatingTailCall(state.block, intrinsics->missing_block);
    continue;
  }

  state.inst.Reset();
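The address-mask and overflow check in the read loop is easy to miss, so here is a self-contained sketch of the same idea. ByteMemory and ReadInstBytesSketch are hypothetical names, and the map lookup stands in for the trace manager's TryReadExecutableByte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Assumption: memory is a sparse address -> byte map; the real tool asks
// a trace manager for each byte instead.
using ByteMemory = std::map<uint64_t, uint8_t>;

// Collect up to `max_inst_bytes` bytes starting at `inst_addr`, stopping
// early when the masked address wraps around or a byte is unmapped.
std::string ReadInstBytesSketch(const ByteMemory &memory, uint64_t inst_addr,
                                uint64_t addr_mask, size_t max_inst_bytes) {
  assert((inst_addr & addr_mask) == inst_addr);  // Start must fit the mask.
  std::string inst_bytes;
  for (size_t i = 0; i < max_inst_bytes; ++i) {
    const uint64_t byte_addr = (inst_addr + i) & addr_mask;
    if (byte_addr < inst_addr) {
      break;  // The address wrapped past the end of the address space.
    }
    const auto it = memory.find(byte_addr);
    if (it == memory.end()) {
      break;  // Nothing mapped here.
    }
    inst_bytes.push_back(static_cast<char>(it->second));
  }
  return inst_bytes;
}
```

With a 32-bit mask and an instruction starting at 0xFFFFFFFE, the third byte would land at address 0, which is below inst_addr, so only two bytes are returned even if address 0 is mapped.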

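Next, the instruction is decoded directly. For x86 the decoding is done with XED, and the result is written into state.inst. I won't go into the details here.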

(void) arch->DecodeInstruction(inst_addr, state.inst_bytes, state.inst);

After that comes the part that actually lifts the instruction, so let's step into it.

auto lift_status = inst_lifter.LiftIntoBlock(state.inst, state.block);

inst_lifter.LiftIntoBlock

It starts with a bit of setup and then calls GetInstructionFunction, so let's step into that as well.

llvm::Function *func = block->getParent();
llvm::Module *module = func->getParent();
llvm::Function *isel_func = nullptr;
auto status = kLiftedInstruction;

if (arch_inst.IsValid()) {
  isel_func = GetInstructionFunction(module, arch_inst.function);

GetInstructionFunction

The comment here states it clearly: find the function that implements the semantics of this instruction.

// Try to find the function that implements this semantics.
llvm::Function *GetInstructionFunction(llvm::Module *module,
                                       const std::string &function) {
  std::stringstream ss;
  ss << "ISEL_" << function;
  auto isel_name = ss.str();

  auto isel = FindGlobaVariable(module, isel_name);
  if (!isel) {
    return nullptr;  // Falls back on `UNIMPLEMENTED_INSTRUCTION`.
  }

  if (!isel->isConstant() || !isel->hasInitializer()) {
    LOG(FATAL)
        << "Expected a `constexpr` variable as the function pointer for "
        << "instruction semantic function " << function
        << ": " << LLVMThingToString(isel);
  }

  auto sem = isel->getInitializer()->stripPointerCasts();
  return llvm::dyn_cast_or_null<llvm::Function>(sem);
}
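Stripped of the LLVM details, the lookup is a name-based dispatch: prefix the instruction's function name with ISEL_, look it up, and fall back when nothing is registered. The sketch below imitates that with a plain map of function pointers. SemanticsFn, IselTable, AndSem, FindSemantics and the key ISEL_AND_EXAMPLE are all hypothetical names invented for this example; the real code searches global variables in an LLVM module.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical signature for a semantics function in this sketch.
using SemanticsFn = uint64_t (*)(uint64_t, uint64_t);

// Hypothetical table standing in for the module's global variables.
using IselTable = std::map<std::string, SemanticsFn>;

// Example semantics: bitwise AND of two operands.
static uint64_t AndSem(uint64_t lhs, uint64_t rhs) {
  return lhs & rhs;
}

// Look up the semantics registered under "ISEL_" + function. Returns
// nullptr when nothing is registered, so the caller can fall back to an
// "unimplemented instruction" path.
SemanticsFn FindSemantics(const IselTable &table,
                          const std::string &function) {
  assert(!function.empty());
  const auto it = table.find("ISEL_" + function);
  return it == table.end() ? nullptr : it->second;
}
```

A caller would register AndSem under "ISEL_AND_EXAMPLE" and then ask for "AND_EXAMPLE"; asking for an unregistered name gives nullptr, which mirrors the fallback noted in the comment above.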

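The remill documentation on GitHub has an example of this: How to implement the semantics of an instruction.

You implement a function like the one below. Then, much like fcd, clang is used to compile it to IR, which is optimized afterwards.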

template <typename D, typename S1, typename S2>
DEF_SEM(AND, D dst, S1 src1, S2 src2) {
  auto lhs = Read(src1);
  auto rhs = Read(src2);
  auto res = UAnd(lhs, rhs);
  WriteZExt(dst, res);
  // SetFlagsLogical(state, lhs, rhs, res);
}
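To see what the read, AND, zero-extend-and-write sequence amounts to, here is a plain C++ approximation that runs without remill's macros. AndSketch is a hypothetical function; it only mirrors the data flow of the semantics above and says nothing about how DEF_SEM, Read, UAnd or WriteZExt are actually defined.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the AND semantics: combine two source
// operands of type S and store the zero-extended result into a
// destination of type D.
template <typename D, typename S>
void AndSketch(D &dst, S src1, S src2) {
  static_assert(std::is_unsigned<D>::value && std::is_unsigned<S>::value,
                "operands are modelled as unsigned integers");
  static_assert(sizeof(D) >= sizeof(S),
                "destination must be at least as wide as the sources");
  const S res = static_cast<S>(src1 & src2);
  // Converting an unsigned value to a wider unsigned type zero-extends
  // it, so the upper bits of `dst` end up cleared.
  dst = static_cast<D>(res);
}
```

For example, with a 64-bit destination that initially holds all ones, AndSketch(dst, 0xF0F0F0F0u, 0xFF00FF00u) leaves 0x00000000F000F000 in dst: the upper 32 bits are cleared rather than preserved.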