Vector系列内存布局分析(一)

17年10月仔细分析和利用了一个CVE,结果拖到现在才有空写出来,主要是讲一下android里的一个很常见的结构 "Vector",如何使用堆溢出来尽可能做多的事情。网上没有查到过详细介绍Vector实现的文章,本文,算是先河么?

一、简述

Vector可以被理解为一个变长的数组,开发者可以使用insert, size, clear等API来对其进行操作,用法简单易懂,会自动在堆上分配内存,以及在适当的情况下进行析构。这里的android::Vector与C++的std::vector是不一样的,需要明确一下。

SortedVector继承于Vector,增加的功能是对每次数据变化时主动进行了排序,利用的时候需要维护好sorted的状态,否则会在遍历时出现各种问题。

KeyedVector继承于SortedVector,增加的功能是根据key的值去查找到value,并且key也是按照顺序存放和查询的。

在android自带的共享库里,可以经常看到它的身影,所以也可以通过泄漏出Vector的虚表地址来获得整个module的绝对地址。本文主要介绍Vector,另外两种可以通过debug和读源码的方式去理解,如果时间允许,我会在后续的文章里把另外两种结构也去讲一下。

二、结构

函数定义位于 system/core/include/utils/Vector.h和VectorImpl.h ,实现位于VectorImpl.cpp。

(为了方便阅读,贴一下链接好了)

控制结构为

1
2
3
4
5
6
7
class VectorImpl
{
void * mStorage; // base address of the vector
size_t mCount; // number of items
const uint32_t mFlags;
const size_t mItemSize;
};

VectorImpl.h和VectorImpl.cpp里定义了返回成员变量的几个函数

1
2
3
4
5
6
7
8
9
size_t VectorImpl::itemSize() const {
return mItemSize;
}

inline const void* arrayImpl() const { return mStorage; }

inline size_t size() const { return mCount; }

inline bool isEmpty() const { return mCount == 0; }

VectorImpl的构造方法只有一个,VectorImpl(size_t size, uint32_t flag) ,例如Vector<int>的构造函数,被反汇编出来的是,android::VectorImpl::VectorImpl(&vector_obj, 4uLL, 7u); ,那个flag不是很懂有啥用,共有3个flag,在VectorImpl.h里被定义,构造方法基本为空,只设置了itemSize和flag。

析构方法里检查了是否有内存泄漏,必须要由编译器主动调用finish_vector,应该就是释放掉堆上占用的内存了。

1
2
3
4
5
6
7
8
VectorImpl::~VectorImpl()
{
ALOGW_IF(mCount,
"[%p] subclasses of VectorImpl must call finish_vector()"
" in their destructor. Leaking %d bytes.",
this, (int)(mCount*mItemSize));
// We can't call _do_destroy() here because the vtable is already gone.
}

Vector本身是不存放capacity信息的,(这一点与KeyedVector不同,KeyedVector的内存布局里是包含capacity信息的),通过mStorage的大小和mItemSize来计算得到当前容量。

1
2
3
4
5
6
7
size_t VectorImpl::capacity() const
{
if (mStorage) {
return SharedBuffer::bufferFromData(mStorage)->size() / mItemSize;
}
return 0;
}

三、内存布局和增删改

Vector<int>为例,写一段简单的代码,arm64,按照时间顺序讲一下内存的变化。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <utils/Vector.h>
#include <stdlib.h>

char buffer[100];

int main() {

printf("input something to startn");
scanf("%s", buffer);


android::Vector<int> vector;

printf("Hello, World!n");
unsigned int seed = (unsigned int) time(NULL);
srand(seed);
int* ppp = (int*) (&vector);
while (1) {
printf("Input Commandn");
scanf("%s", buffer);
if (strstr(buffer, "add")) {
int r = (int) random() % 100;
printf("random insert %dn", r);
vector.add(r);
printf("after insert capacity is %zu, size is %zu, ptr is %pn", vector.capacity(), vector.size(), ppp[2]);
} else if (strstr(buffer, "del")) {
printf("del vector[0]n");
vector.removeAt(0);
printf("after del capacity is %zu, size is %zu, ptr is %pn", vector.capacity(), vector.size(), ppp[2]);
} else if (strstr(buffer, "clear")) {
printf("clear vectorn");
vector.clear();
} else if (strstr(buffer, "exit")) {
printf("exitn");
break;
} else {
printf("invalid inputn");
continue;
}
}
printf("Bye, World!n");
return 0;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
LOCAL_PATH:= $(call my-dir)
include $(CLEAR_VARS)

LOCAL_MODULE:=hello_vector

LOCAL_CPPFLAGS += -g

LOCAL_SRC_FILES:= main.cpp

LOCAL_SHARED_LIBRARIES:=
libcutils
libutils


include $(BUILD_EXECUTABLE)

每个Vector的控制结构大小为40byte(0x28byte),分别为

1
2
3
4
5
6
7
8
 offset  size  comment 
-------------------
0 8 vtable
8 8 storage
16 8 count
24 4 flag
28 4 junk
32 8 itemSize

初始化前的vector,从EF0到F18。

初始化后的vector,写了vtable、flag、itemSize。

添加1个元素后的vector,创建了buffer,写了count。

在添加第4和第5个元素时,进行了堆上的扩展,并且可以观察到在之前存放数据的位置仍然有残留。

关注的2个属性的变化如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
after insert capacity is 4, size is 1
after insert capacity is 4, size is 2
after insert capacity is 4, size is 3
after insert capacity is 4, size is 4
after insert capacity is 8, size is 5
after insert capacity is 8, size is 6
after insert capacity is 8, size is 7
after insert capacity is 8, size is 8
after insert capacity is 14, size is 9
after insert capacity is 14, size is 10
after insert capacity is 14, size is 11
after insert capacity is 14, size is 12
after insert capacity is 14, size is 13
after insert capacity is 14, size is 14
after insert capacity is 23, size is 15

触发了2次扩容,更新规则在VectorImpl.cpp的_grow函数里,当capacity不足时,按照公式扩大容量,所以storage的长度其实是可以被预测的。扩展时使用SharedBuffer,也就是简单的malloc/memcpy/free,在memcpy的时候复制数据。例如我们这里最开始capacity是0,4,8,14,23这样的,这个按照 x + x/2 + 1 计算 ,使用safe_add宏,为什么4变成8、8变成14,可能是可能是向上取整,细节也懒得深究了。

增加元素时的主要代码是

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
ssize_t VectorImpl::add()
{
return add(0);
}
ssize_t VectorImpl::add(const void* item)
{
return insertAt(item, size());
}

ssize_t VectorImpl::insertAt(size_t index, size_t numItems)
{
return insertAt(0, index, numItems);
}

ssize_t VectorImpl::insertAt(const void* item, size_t index, size_t numItems)
{
if (index > size())
return BAD_INDEX;
void* where = _grow(index, numItems);
if (where) {
if (item) {
_do_splat(where, item, numItems);
} else {
_do_construct(where, numItems);
}
}
return where ? index : (ssize_t)NO_MEMORY;
}

逻辑比较清晰,判断是否OOB,再判断是否需要扩展长度,扩展成功的话,根据item是否为空来执行splat和construct,之后返回ERROR或者返回index。这里注意insertAt的最后一个参数是被省略掉的,看起来有参数不一致的现象,其实在VectorImpl.h里有默认值。

do_splat和do_construct的代码均位于Vector.h和TypeHelper.h里,splat根据元素的类型来插入新的元素,construct则重新构建一遍。

在删除元素时,操作有点不一样,主要表现在capacity的变化上和数据的擦除上。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
after del capacity is 23, size is 14
after del capacity is 23, size is 13
after del capacity is 23, size is 12
after del capacity is 23, size is 11
after del capacity is 20, size is 10
after del capacity is 18, size is 9
after del capacity is 16, size is 8
after del capacity is 14, size is 7
after del capacity is 12, size is 6
after del capacity is 10, size is 5
after del capacity is 8, size is 4
after del capacity is 6, size is 3
after del capacity is 4, size is 2
after del capacity is 4, size is 1
after del capacity is 4, size is 0
after del capacity is 4, size is 0

capacity的规律也好观察,capacity=max(4, min(capacity, size*2)),关键是内存的操作,是否有堆上的重新分配。

removeAt调用到removeItemsAt

1
2
3
inline  ssize_t         removeItemsAt(size_t index, size_t count = 1);
//! remove one item
inline ssize_t removeAt(size_t index) { return removeItemsAt(index); }

removeItemsAt调用到_shrink,实现较为复杂。主要思路就是如果new_size < capacity/2 时,使用SharedBuffer重新布局;否则,使用原本的buffer,再调用_do_move_backward将空缺的位置补齐。

当new_size < capacity/2时,代码如下,条件成立的分支里,不进行copy,而是直接返回一个新的对象,在editResize里调用realloc。在条件不成立的分支里,使用alloc来创建并将数据全部复制。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
if ((where == new_size) &&
(mFlags & HAS_TRIVIAL_COPY) &&
(mFlags & HAS_TRIVIAL_DTOR))
{
const SharedBuffer* cur_sb = SharedBuffer::bufferFromData(mStorage);
SharedBuffer* sb = cur_sb->editResize(new_capacity * mItemSize);
if (sb) {
mStorage = sb->data();
} else {
return;
}
} else {
SharedBuffer* sb = SharedBuffer::alloc(new_capacity * mItemSize);
if (sb) {
void* array = sb->data();
if (where != 0) {
_do_copy(array, mStorage, where);
}
if (where != new_size) {
const void* from = reinterpret_cast<const uint8_t *>(mStorage) + (where+amount)*mItemSize;
void* dest = reinterpret_cast<uint8_t *>(array) + where*mItemSize;
_do_copy(dest, from, new_size - where);
}
release_storage();
mStorage = const_cast<void*>(array);
} else{
return;
}
}

当new_size >= capacity/2时,代码如下,比较简单,平移整块数据。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SharedBuffer* sb = SharedBuffer::alloc(new_capacity * mItemSize);
if (sb) {
void* array = sb->data();
if (where != 0) {
_do_copy(array, mStorage, where);
}
if (where != new_size) {
const void* from = reinterpret_cast<const uint8_t *>(mStorage) + (where+amount)*mItemSize;
void* dest = reinterpret_cast<uint8_t *>(array) + where*mItemSize;
_do_copy(dest, from, new_size - where);
}
release_storage();
mStorage = const_cast<void*>(array);
} else{
return;
}

内存布局懒得用IDA截图了,打log看下。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
after del capacity is 23, size is 21, ptr is 0x5c26018
after del capacity is 23, size is 20, ptr is 0x5c26018
after del capacity is 23, size is 19, ptr is 0x5c26018
after del capacity is 23, size is 18, ptr is 0x5c26018
after del capacity is 23, size is 17, ptr is 0x5c26018
after del capacity is 23, size is 16, ptr is 0x5c26018
after del capacity is 23, size is 15, ptr is 0x5c26018
after del capacity is 23, size is 14, ptr is 0x5c26018
after del capacity is 23, size is 13, ptr is 0x5c26018
after del capacity is 23, size is 12, ptr is 0x5c26018
after del capacity is 23, size is 11, ptr is 0x5c26018
after del capacity is 20, size is 10, ptr is 0x5c39018
after del capacity is 18, size is 9, ptr is 0x5c2d078
after del capacity is 16, size is 8, ptr is 0x5c2d0d8
after del capacity is 14, size is 7, ptr is 0x5c34018
after del capacity is 12, size is 6, ptr is 0x5c34068
after del capacity is 10, size is 5, ptr is 0x5c33018
after del capacity is 8, size is 4, ptr is 0x5c33058
after del capacity is 6, size is 3, ptr is 0x5c277f8
after del capacity is 4, size is 2, ptr is 0x5c27828
after del capacity is 4, size is 1, ptr is 0x5c277f8
after del capacity is 4, size is 0, ptr is 0x5c277f8
after del capacity is 4, size is 0, ptr is 0x5c277f8

在capacity变化的时候,每次都会精准地进行一次堆空间的申请和释放,并且堆上会残留大量的历史数据。

编辑操作的话,非共享的情况下会引起堆上的操作,其他时候基本用不到。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
void* VectorImpl::editArrayImpl()
{
if (mStorage) {
const SharedBuffer* sb = SharedBuffer::bufferFromData(mStorage);
SharedBuffer* editable = sb->attemptEdit();
if (editable == 0) {
// If we're here, we're not the only owner of the buffer.
// We must make a copy of it.
editable = SharedBuffer::alloc(sb->size());
// Fail instead of returning a pointer to storage that's not
// editable. Otherwise we'd be editing the contents of a buffer
// for which we're not the only owner, which is undefined behaviour.
LOG_ALWAYS_FATAL_IF(editable == NULL);
_do_copy(editable->data(), mStorage, mCount);
release_storage();
mStorage = editable->data();
}
}
return mStorage;
}

嗯,大概有这些知识已经足够进行简单的堆布局了,俗话说得好,书读百遍,其义自现,代码也是一样,多看、多debug,慢慢就懂Vector怎么实现了,不是很麻烦的。

四、常见利用操作

如果按照规范使用Vector,或者不按照规范使用Vector的话,都不会出啥大问题,里面check挺多的,如果配合其他的堆溢出漏洞或者越界写漏洞,去覆盖Vector的内容的话,还是有机会去跑shellcode的。

1、信息泄漏。Vector是个很常用的结构体,当元素内容不同的时候,对应的虚表也不一样,如果有越界读,这是一个很好的选择来泄漏libutils.so的地址。

2、 内存布局。就拿Vector<int>为例,如果输入可控,mStorage本身的内容就完全可控,可以用来存放连续的shellcode(虽然需要大量的IO)。但实际中没怎么见人用。另外可以拿来打乱jemalloc里的linkedList顺序,具体在另一篇关于三星的利用里会讲,非常神奇的一系列操作,到现在都感觉很得意。

3、将一处堆溢出转化为更多的堆溢出。Vector不大好操作,因为capacity是计算出来的,如果可以将SharedBuffer所创建的buffer的size改掉,即可让以后的add操作无限向后延伸。但实际中也没怎么见人用过。

(别的方法想起了再补上)

五、结语

其实Vector挺无聊的,敬请期待SortedVector、KeyedVector等后续文章,以及CVE的分析。