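fcd Source Code Analysis, Part 3

Preface

This article analyzes fcd's fixindirects pass and its calling-convention analysis.

Main Text

The previous two articles covered how fcd generates IR. The next stage is optimizing that IR. Passing -n once makes fcd print the unoptimized IR, so we can pick any program and try it: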

define void @udelete(%struct.x86_regs*) !fcd.vaddr !2811 !fcd.recoverable !3 {
entry:
%1 = getelementptr inbounds %struct.x86_regs, %struct.x86_regs* %0, i64 0, i32 9, i32 0
%2 = load i64, i64* %1, align 8, !tbaa !4, !alias.scope !2812
%3 = getelementptr inbounds %struct.x86_regs, %struct.x86_regs* %0, i64 0, i32 8, i32 0
%4 = load i64, i64* %3, align 8, !tbaa !4, !alias.scope !2815
%5 = add i64 %4, -8
%6 = inttoptr i64 %5 to i64*
store i64 %2, i64* %6, align 4, !fcd.prgmem !3
%7 = getelementptr inbounds %struct.x86_regs, %struct.x86_regs* %0, i64 0, i32 7, i32 0
%8 = load i64, i64* %7, align 8, !tbaa !4, !alias.scope !2820, !noalias !2827
%9 = add i64 %4, -16
%10 = inttoptr i64 %9 to i64*

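As you can see, the only parameter of this function's IR is a %struct.x86_regs pointer, and the body is a long series of operations that load a value out of %struct.x86_regs, operate on it, and store it back. That produces a very large number of instructions and gets in the way of further optimization.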
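To make the contrast concrete, here is a minimal C++ sketch of the two function shapes. ToyRegs, liftedAdd and recoveredAdd are hypothetical names invented for illustration; they are not fcd types, and the real %struct.x86_regs has many more fields.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified stand-in for fcd's %struct.x86_regs.
struct ToyRegs {
    std::uint64_t rax = 0;
    std::uint64_t rdi = 0;
    std::uint64_t rsi = 0;
};

// "Lifted" shape: every input and output travels through the register struct,
// so the body is a chain of loads and stores on that struct.
void liftedAdd(ToyRegs* regs) {
    assert(regs != nullptr);
    std::uint64_t a = regs->rdi;  // load an input from the struct
    std::uint64_t b = regs->rsi;  // load another input from the struct
    regs->rax = a + b;            // store the result back into the struct
}

// "Recovered" shape: explicit parameters and an explicit return value.
std::uint64_t recoveredAdd(std::uint64_t a, std::uint64_t b) {
    return a + b;
}
```

Both functions compute the same thing, but only the second one exposes its inputs and output in its signature, which is roughly what the optimized IR achieves.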

Passing -n twice prints the optimized IR:

define void @udelete(i64 %rip) !fcd.vaddr !28 !fcd.recoverable !3 {
entry:
%stackframe = alloca <{ [12 x i8], i32, [8 x i8], i64 }>, align 8, !fcd.stackframe !3
%0 = getelementptr inbounds <{ [12 x i8], i32, [8 x i8], i64 }>, <{ [12 x i8], i32, [8 x i8], i64 }>* %stackframe, i64 0, i32 3
store i64 %rip, i64* %0, align 8, !fcd.prgmem !3
%1 = ptrtoint <{ [12 x i8], i32, [8 x i8], i64 }>* %stackframe to i64
call void @write(i64 4197992)
%2 = call i64 @input_choice(i64 4198002, i64 %1)
%registers.sroa.0.8.extract.trunc = trunc i64 %2 to i32
%3 = getelementptr inbounds <{ [12 x i8], i32, [8 x i8], i64 }>, <{ [12 x i8], i32, [8 x i8], i64 }>* %stackframe, i64 0, i32 1
store i32 %registers.sroa.0.8.extract.trunc, i32* %3, align 4, !fcd.prgmem !3
%4 = icmp ult i32 %registers.sroa.0.8.extract.trunc, 16
br i1 %4, label %"400e97", label %"400e81"

This looks more reasonable: the pile of struct.x86_regs accesses is gone.

So let's look at how this transformation is done.

The main function

First, in main:

if (!mainObj.optimizeAndTransformModule(*module, errs(), executable.get()))

main calls the function above.

Stepping into it, we find that it runs a series of passes:

auto passManager = createBasePassManager();
passManager.add(new ExecutableWrapper(executable));
passManager.add(createParameterRegistryPass());
passManager.add(createExternalAAWrapperPass(&Main::aliasAnalysisHooks));
for (Pass* pass : optimizeAndTransformPasses)
{
    passManager.add(pass);
}
passManager.run(module);

optimizeAndTransformPasses contains the following passes:

vector<string> passNames = {
    "globaldce",
    "fixindirects",
    "argrec",
    "sroa",
    "intnarrowing",
    "signext",
    "instcombine",
    "intops",
    "simplifyconditions",
    // <-- custom passes go here with the default pass pipeline
    "instcombine",
    "gvn",
    "simplifycfg",
    "instcombine",
    "gvn",
    "recoverstackframe",
    "dse",
    "sccp",
    "simplifycfg",
    "eliminatecasts",
    "instcombine",
    "memssadle",
    "dse",
    "instcombine",
    "sroa",
    "instcombine",
    "globaldce",
    "simplifycfg",
};

Some of these passes are fcd's own and some ship with LLVM.

I won't go into the LLVM ones. The focus here is on what the passes written for fcd do.

fixindirects

This is a module pass, so the main logic lives in the runOnModule function:

virtual bool runOnModule(Module& m) override
{
    // FIXME: avoid references to x86 intrinsics directly.

    bool changed = false;
    if (Function* indJump = m.getFunction("x86_jump_intrin"))
    {
        changed |= fixIndirectJumps(*indJump);
    }

    if (Function* indCall = m.getFunction("x86_call_intrin"))
    {
        changed |= fixIndirectCalls(*indCall);
    }

    return changed;
}

It first looks up the x86_jump_intrin function and passes it to fixIndirectJumps.

fixIndirectJumps

This function starts by obtaining the intptr type, the void type, and the __indirect_jump Function.

It then enters a loop:

for (Value* user : vector<Value*>(callIntrin.user_begin(), callIntrin.user_end()))
{
    if (auto call = dyn_cast<CallInst>(user))
    {
        Value* destination = call->getArgOperand(2);
        auto intptrDestination = CastInst::Create(CastInst::BitCast, destination, intptrTy, "", call);
        CallInst::Create(indirectJump, { intptrDestination }, "", call);
        call->eraseFromParent();
    }
}

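The loop visits every call to the intrinsic that was passed in. Although the parameter is named callIntrin, runOnModule passes x86_jump_intrin here, so these are calls to x86_jump_intrin.

It then reads the jump destination: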

Value* destination = call->getArgOperand(2);

It emits a call to __indirect_jump in its place:

auto intptrDestination = CastInst::Create(CastInst::BitCast, destination, intptrTy, "", call);
CallInst::Create(indirectJump, { intptrDestination }, "", call);

Finally, it erases the original call to the intrinsic (x86_jump_intrin in this function):

call->eraseFromParent();
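One detail of the loop is worth spelling out: it iterates over a vector copied from the intrinsic's user list rather than over the list itself, presumably because erasing a call removes it from that list while the loop is running. The sketch below models the same pattern with standard containers only. ToyFunction, ToyModule and fixIndirectJumpsToy are hypothetical names, not LLVM or fcd APIs.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-in for a function whose use list tracks the calls that reference it.
struct ToyFunction {
    std::list<int> users;  // ids of the call instructions using this function
};

struct ToyModule {
    ToyFunction jumpIntrin;
    std::vector<std::string> emitted;  // replacement instructions, in order

    // Erasing a call also drops it from the callee's use list.
    void eraseCall(int id) { jumpIntrin.users.remove(id); }
};

// Replaces every call to the toy jump intrinsic and reports whether anything changed.
bool fixIndirectJumpsToy(ToyModule& m) {
    bool changed = false;
    // Snapshot the users first: eraseCall mutates the list we would otherwise be walking.
    std::vector<int> snapshot(m.jumpIntrin.users.begin(), m.jumpIntrin.users.end());
    for (int id : snapshot) {
        m.emitted.push_back("call __indirect_jump (was call #" + std::to_string(id) + ")");
        m.eraseCall(id);
        changed = true;
    }
    assert(m.jumpIntrin.users.empty());  // every user has been replaced
    return changed;
}
```

Iterating m.jumpIntrin.users directly in the range-for while calling eraseCall would invalidate the iterator in use, which is the hazard the copy avoids.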

fixIndirectCalls

This function first uses getAnalysis to obtain the ParameterRegistry pass, then uses getTargetInfo to obtain information about the target program, such as its architecture:

ParameterRegistry& params = getAnalysis<ParameterRegistry>();
auto target = TargetInfo::getTargetInfo(*callIntrin.getParent());

It then enters a loop over every call to x86_call_intrin.

The first if checks that the user is a call instruction; the second uses the ParameterRegistry pass obtained above to analyze the call site:

for (Value* user : vector<Value*>(callIntrin.user_begin(), callIntrin.user_end()))
{
    if (auto call = dyn_cast<CallInst>(user))
    if (auto info = params.analyzeCallSite(CallSite(call)))

Let's step into analyzeCallSite.

It starts by allocating a CallInformation:

unique_ptr<CallInformation> info(new CallInformation);

Then it walks the list of calling conventions:

for (CallingConvention* cc : ccChain)

ccChain is initialized in setupCCChain, which I won't cover here. It holds three calling conventions: [CallingConvention_x86_64_systemv, CallingConvention_AnyArch_AnyCC, CallingConvention_AnyArch_Interactive]. While the chain is being walked, as soon as one calling convention completes its analysis, its result is returned directly.
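The "first convention that succeeds wins" behaviour described above can be sketched with standard C++ only. This is a toy model under stated assumptions: ToyCallInfo, ToyConvention and analyzeWithChain are hypothetical names, and the real fcd interfaces take LLVM types rather than strings.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Toy result object standing in for CallInformation.
struct ToyCallInfo {
    std::string analyzedBy;
};

// A toy convention either fills in the info and returns true, or declines with false.
struct ToyConvention {
    std::string name;
    std::function<bool(const std::string& callee, ToyCallInfo& info)> analyze;
};

// Tries each convention in order and returns the result of the first that succeeds.
// Returns nullptr when every convention declines.
std::unique_ptr<ToyCallInfo> analyzeWithChain(const std::vector<ToyConvention>& chain,
                                              const std::string& callee) {
    // Mirrors the excerpt: the info object is allocated once, before the loop.
    auto info = std::make_unique<ToyCallInfo>();
    for (const ToyConvention& cc : chain) {
        assert(static_cast<bool>(cc.analyze));  // every entry must carry a callback
        if (cc.analyze(callee, *info)) {
            info->analyzedBy = cc.name;
            return info;  // stop at the first successful convention
        }
    }
    return nullptr;
}
```

Ordering the chain from most specific to most general means the generic fallbacks only run when the specific convention cannot handle the call.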

Here we analyze CallingConvention_x86_64_systemv.

CallingConvention_x86_64_systemv analyzeFunction

First it obtains the target information.

Then it adds rip and rsp as parameters of the call:

TargetInfo& targetInfo = registry.getTargetInfo();

// We always need rip and rsp.
callInfo.addParameter(ValueInformation::IntegerRegister, targetInfo.registerNamed("rip"));
callInfo.addParameter(ValueInformation::IntegerRegister, targetInfo.registerNamed("rsp"));

Next come assertions checking that the function takes a single parameter of type %struct.x86_regs*:

assert(function.arg_size() == 1);
auto regs = function.arg_begin();
auto pointerType = dyn_cast<PointerType>(regs->getType());
assert(pointerType != nullptr && pointerType->getElementType()->getStructName() == "struct.x86_regs");

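Below that is a loop that finds the GetElementPtrInst instructions used to access register fields of struct.x86_regs, and uses registerInfo to work out which register each one refers to.

Each register (regName) and its GetElementPtrInst are then inserted into geps: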

unordered_multimap<const TargetRegisterInfo*, GetElementPtrInst*> geps;
for (auto& use : regs->uses())
{
    if (GetElementPtrInst* gep = dyn_cast<GetElementPtrInst>(use.getUser()))
    if (const TargetRegisterInfo* regName = targetInfo.registerInfo(*gep))
    {
        geps.insert({regName, gep});
    }
}

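Next comes another large loop.

According to its comment, it finds registers that are read before they are written. Put simply, it determines which registers carry the function's parameters.

The registers it checks are const char* parameterRegisters[] = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };

After that, it finds the parameters passed on the stack. I won't go through either of these two analyses in detail.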
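The read-before-write idea for register parameters can be illustrated on a flat trace of register accesses. This is only a sketch under a strong simplifying assumption: it treats the function as a straight-line sequence, whereas fcd works on LLVM IR with control flow. ToyAccess and findRegisterParameters are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class AccessKind { Read, Write };

// One register access in a straight-line toy trace.
struct ToyAccess {
    std::string reg;
    AccessKind kind;
};

// Returns the System V integer argument registers whose first access is a read,
// listed in ABI order. Registers outside that set are ignored.
std::vector<std::string> findRegisterParameters(const std::vector<ToyAccess>& trace) {
    static const char* const parameterRegisters[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};

    std::set<std::string> written;
    std::set<std::string> readFirst;
    for (const ToyAccess& access : trace) {
        if (access.kind == AccessKind::Write) {
            written.insert(access.reg);
        } else if (written.count(access.reg) == 0) {
            // Read with no earlier write: the value must have come from the caller.
            readFirst.insert(access.reg);
        }
    }

    std::vector<std::string> result;
    for (const char* name : parameterRegisters) {
        if (readFirst.count(name) != 0) {
            result.push_back(name);
        }
    }
    assert(result.size() <= 6);  // there are only six candidate registers
    return result;
}
```

In this toy, a register that is written and then read (rsi in a trace such as write rsi, read rsi) is not reported, because the function itself produced the value.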

Finally, it looks for the registers used for the return value, checking rax and rdx:

vector<const TargetRegisterInfo*> usedReturns;
usedReturns.reserve(2);

for (const char* regName : returnRegisters)
{
    const TargetRegisterInfo* regInfo = targetInfo.registerNamed(regName);
    auto range = geps.equal_range(regInfo);
    for (auto iter = range.first; iter != range.second; ++iter)
    {
        bool hasStore = any_of(iter->second->use_begin(), iter->second->use_end(), [](Use& use)
        {
            return isa<StoreInst>(use.getUser());
        });

        if (hasStore)
        {
            usedReturns.push_back(regInfo);
            break;
        }
    }
}

for (const TargetRegisterInfo* reg : ipaFindUsedReturns(registry, function, usedReturns))
{
    // return value!
    callInfo.addReturn(ValueInformation::IntegerRegister, reg);
}
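The first loop in the excerpt above can be mimicked with standard containers to show how equal_range and any_of cooperate. This is a toy model: ToyGep, UserKind and findStoredReturnRegisters are hypothetical, registers are keyed by name instead of TargetRegisterInfo pointers, and the follow-up filtering done by ipaFindUsedReturns is not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

enum class UserKind { Load, Store };

// Toy stand-in for a GetElementPtrInst: all we keep is what kinds of users it has.
struct ToyGep {
    std::vector<UserKind> users;
};

// Returns the candidate return registers (rax, rdx) that have at least one
// address computation whose result is stored through.
std::vector<std::string> findStoredReturnRegisters(
    const std::unordered_multimap<std::string, ToyGep>& geps) {
    static const char* const returnRegisters[] = {"rax", "rdx"};

    std::vector<std::string> usedReturns;
    usedReturns.reserve(2);
    for (const char* regName : returnRegisters) {
        // equal_range yields every entry recorded for this register.
        auto range = geps.equal_range(regName);
        for (auto iter = range.first; iter != range.second; ++iter) {
            const std::vector<UserKind>& users = iter->second.users;
            bool hasStore = std::any_of(users.begin(), users.end(),
                                        [](UserKind kind) { return kind == UserKind::Store; });
            if (hasStore) {
                usedReturns.push_back(regName);
                break;  // one store is enough; move on to the next register
            }
        }
    }
    assert(usedReturns.size() <= 2);
    return usedReturns;
}
```

A multimap fits here because the same register can be reached through several separate address computations, and each of them has to be inspected.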

fixIndirectCalls

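Back in fixIndirectCalls, analyzeCallSite has told us which registers carry the incoming arguments and which carry the return values.

ArgumentRecovery::createFunctionType then builds the LLVM FunctionType, the call is rebuilt according to that FunctionType, and finally the original call to x86_call_intrin is erased: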

if (auto info = params.analyzeCallSite(CallSite(call)))
{
    Function& parent = *call->getParent()->getParent();
    Module& module = *parent.getParent();

    string name;
    raw_string_ostream(name) << "indirect_" << indirectCallCount;
    ++indirectCallCount;

    FunctionType* ft = ArgumentRecovery::createFunctionType(*target, *info, module, name);
    Value* callable = CastInst::CreateBitOrPointerCast(call->getOperand(2), ft->getPointerTo(), "", call);
    Value* registers = call->getOperand(1);
    CallInst* result = ArgumentRecovery::createCallSite(*target, *info, *callable, *registers, *call);
    result->takeName(call);
    call->eraseFromParent();
}
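As a rough analogy in plain C++, the last step amounts to taking a destination that is only known as an integer address, casting it to a pointer to a function with the recovered signature, and calling through it. The sketch below assumes a two-argument signature purely for illustration. RecoveredFn, nextIndirectName, callThroughAddress and toyTarget are hypothetical names, and the integer-to-function-pointer cast relies on behaviour that mainstream platforms provide.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical recovered signature: two integer arguments, one integer result.
using RecoveredFn = std::uint64_t (*)(std::uint64_t, std::uint64_t);

// Builds names of the form "indirect_N" and advances the counter, like the excerpt.
std::string nextIndirectName(unsigned& counter) {
    std::ostringstream name;
    name << "indirect_" << counter;
    ++counter;
    return name.str();
}

// Casts an integer destination to the recovered function-pointer type and calls it.
std::uint64_t callThroughAddress(std::uintptr_t destination,
                                 std::uint64_t a, std::uint64_t b) {
    assert(destination != 0);
    auto callable = reinterpret_cast<RecoveredFn>(destination);
    return callable(a, b);
}

// A target to call indirectly in the usage example.
std::uint64_t toyTarget(std::uint64_t a, std::uint64_t b) {
    return a * b;
}
```

For example, passing reinterpret_cast<std::uintptr_t>(&toyTarget) as the destination makes callThroughAddress behave like a direct call to toyTarget.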