Learning LLVM (17): lldb Python callbacks

Posted on 2021-09-08 Edited on 2021-12-24 In llvm

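lldb script development from scratch. This is the third post in the series and covers most of the places where lldb hooks into Python.

Most of the content here can also be found in the official documentation: https://lldb.llvm.org/use/python-reference.html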

Importing a Python file

Assuming the file is called peda.py, it can be imported with the command below.

command script import /path/to/peda.py

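After that, functions defined in peda can be called from lldb's Python interpreter as peda.func().

On import, lldb automatically calls the __lldb_init_module function in the script, if it defines one. Its signature is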

import lldb

def __lldb_init_module(debugger: lldb.SBDebugger, internal_dict: dict):
    pass

We can use debugger for initialization work, typically by calling HandleCommand to run lldb commands that change settings and register functions.

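Running code with script

script is an lldb command. Without arguments it enters an interactive Python interpreter; with arguments it runs the given Python code directly.

For example: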

(lldb) script
Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.
>>> print (lldb.debugger)
Debugger (instance: "debugger_1", id: 1)
>>> exit()

(lldb) script print(lldb)
<module 'lldb' from '/Library/Developer/CommandLineTools/Library/PrivateFrameworks/LLDB.framework/Resources/Python/lldb/__init__.py'>

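stop-hook (important)

Note (2021-12-24): Python support for this API only arrived in lldb 12. https://github.com/llvm/llvm-project/commit/b65966cff65bfb66de59621347ffd97238d3f645

When it fires: whenever the process enters the "stop" state, for example after hitting a breakpoint, a step-in, or a step-over.

This hook matters a lot: well-known gdb plugins such as pwndbg do their interface enhancements at this moment, including printing registers, code, and the backtrace. The command is:

target stop-hook add -P py-class

Once a file such as peda.py has been imported as described above, its classes can be used here, for example peda.PedaHook. lldb constructs the class when the hook is added and calls its handle_stop method on every stop.

The Python class must look like this: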

import lldb

class PedaHook:
    def __init__(self, target: lldb.SBTarget, extra_args: lldb.SBStructuredData, _dict: dict):
        # Runs once, when the hook is added to the target.
        print("Construct hook")
    def handle_stop(self, exe_ctx: lldb.SBExecutionContext, stream: lldb.SBStream):
        # Runs on every stop.
        print("Trigger hook")
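To make the idea of a "context display" more concrete, here is a small sketch of a hook that reports the current function and program counter on each stop. The class name ContextHook and its output format are hypothetical. It writes through stream.Print and returns True, which according to the official reference keeps the process stopped (False would ask lldb to continue).

```python
class ContextHook:
    """Hypothetical stop-hook that prints a one-line context on every stop."""

    def __init__(self, target, extra_args, internal_dict):
        # Built once, when `target stop-hook add -P ...` runs.
        self.stops = 0

    def handle_stop(self, exe_ctx, stream):
        self.stops += 1
        frame = exe_ctx.GetFrame()
        # Write to `stream` so the text appears with lldb's own stop report.
        stream.Print("[stop #%d] %s pc=0x%x\n"
                     % (self.stops, frame.GetFunctionName(), frame.GetPC()))
        # True keeps the process stopped; False asks lldb to continue.
        return True
```

Assuming the class lives in peda.py, it would be attached with target stop-hook add -P peda.ContextHook.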

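command script (important)

Registers a custom command. There are two forms: binding to a Python function or binding to a Python class. The function form suits simple cases and the class form suits more complex ones; in essence they do the same thing.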

command script add [-f <python-function>] [-h <help-text>] [-s <script-cmd-synchronicity>] <cmd-name>
command script add [-c <python-class>] [-s <script-cmd-synchronicity>] <cmd-name>

Binding to a Python function

The bound function must take the following parameters:

import lldb

def func(debugger: lldb.SBDebugger, command: str, result: lldb.SBCommandReturnObject, internal_dict: dict):
    pass

command is the argument string passed when the custom command is invoked.

As before, import the .py file first, then run command script add -f peda.func func; after that the func command is available.
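Because command arrives as one raw string, the function usually has to split it itself. The sketch below does so with shlex and reports through the result object; the usage text and output format are made up for illustration.

```python
import shlex


def func(debugger, command, result, internal_dict):
    """Echo the parsed arguments back to the user."""
    # `command` is everything typed after the command name, as one string.
    args = shlex.split(command)
    if not args:
        # Mark the command as failed and show a usage hint.
        result.SetError("usage: func <arg> [<arg> ...]")
        return
    for index, arg in enumerate(args):
        result.AppendMessage("arg[%d] = %s" % (index, arg))
```

shlex.split is used here so that quoted arguments such as "b c" stay together as one argument.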

Binding to a Python class

The bound class must provide the following methods:

class CommandObjectType:
    def __init__(self, debugger: lldb.SBDebugger, internal_dict):
        pass
    def __call__(self, debugger: lldb.SBDebugger, command: str, exe_ctx, result):
        pass
    def get_short_help(self) -> str:
        pass
    def get_long_help(self) -> str:
        pass

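The only difference from the function form is that it is more complete: the constructor runs when the command is registered, the __call__ magic method runs when the command is invoked, get_short_help supplies the text for the global help listing, and get_long_help supplies the text shown when help is run on the command itself.

In practice, because Python is so flexible, lldb only needs to be able to resolve the symbol, so inheritance, abstract classes, mixins and similar techniques can be used to make the code more elegant.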

thread step-scripted

I have not used this; https://lldb.llvm.org/use/python-reference.html has an example.

breakpoint command

I have not used this; https://lldb.llvm.org/use/python-reference.html has an example.
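For orientation only, a callback in the shape the official reference describes might look like the sketch below. The function name on_hit and the log format are hypothetical, and this is not something the post itself tested.

```python
def on_hit(frame, bp_loc, internal_dict):
    """Hypothetical callback: log the hit and let the process keep running."""
    hits = bp_loc.GetBreakpoint().GetHitCount()
    print("hit #%d in %s" % (hits, frame.GetFunctionName()))
    # Per the official reference, returning False tells lldb not to stop
    # at this breakpoint; other return values keep the normal stop.
    return False
```

Assuming the function lives in peda.py, it would be attached to breakpoint 1 with breakpoint command add -F peda.on_hit 1.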

frame recognizer

I have not used this; https://lldb.llvm.org/use/python-reference.html has an example.

type summary add

Customizes how values of a type, such as a struct, are printed; rarely needed.
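As a rough sketch of what such a summary can look like, the function below formats a struct with two integer fields. The struct Point with members x and y is an assumption made up for this example.

```python
def point_summary(valobj, internal_dict):
    """Hypothetical summary for a C struct `Point { int x; int y; }`."""
    # valobj is the lldb.SBValue being printed.
    x = valobj.GetChildMemberWithName("x").GetValueAsSigned()
    y = valobj.GetChildMemberWithName("y").GetValueAsSigned()
    return "(%d, %d)" % (x, y)
```

Assuming the function lives in peda.py, it would be registered with type summary add -F peda.point_summary Point.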

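Conclusion

With the trigger points above, plus lldb's own comprehensive API, the rest of the work is like stacking building blocks, and the learning cost is low.

The next post covers some simple things to try. Stay tuned!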