Linux Physical Memory Image Analysis

Translated by hanniba911@yahoo.com.cn.

Linux Physical Memory Image Analysis (Digital forensics of the physical memory)

Translator's notes on terminology:

· Compromised machine: the term covers both a machine that was broken into and a suspect's machine under investigation, so the translation uses one neutral term, "target machine", for both.
· Zone: a very large portion of memory. There are only three: 0 to 16 MB, 16 MB to 896 MB, and 896 MB to the end of physical memory.
· Region: fundamentally different from a zone. A region is a small stretch of memory that corresponds to a segment of an executable file after it is loaded. Do not confuse the two. From smallest to largest: byte -> page -> region -> zone -> whole physical memory.
· Field: a variable, pointer, or nested struct that makes up a struct (a member).
· Struct: a C structure.

Mariusz Burdach, Mariusz.Burdach@seccure.net
Warsaw, March 2005

Abstract

This paper presents methods for analyzing physical memory from a compromised machine. With these methods it is possible to extract useful information from memory, such as the full content of files, detailed information about each process, and processes that were executed and then terminated in the past. The paper aims to explain the concepts of digital investigation of volatile memory. The techniques covered here lead you through analyzing important kernel structures and recovering the contents of files from physical memory.

In addition, a technique for detecting hidden User Mode processes is discussed in depth. It detects processes hidden by methods such as function hooking or direct kernel object manipulation (DKOM). Based on the methods discussed in this paper, a proof-of-concept toolkit called idetect is presented.
This toolkit can help an investigator extract information from a memory image or from the memory object on a live system.

1. Introduction

In the past, the procedure for making an accurate and reliable copy of data from a compromised machine was limited to storage media such as hard disks. This means that forensic analysis relied on evidence found in file systems. There are several reasons for using such a procedure. First, the acquisition procedure is quite easy and does not require an experienced investigator: it is enough to remove power from the compromised machine and then protect the crime scene. The second reason is more important. In most cases, the examination tools available on the market can only be used to investigate file systems. Some forensic tools, such as EnCase EE or ProDiscover IR, help digital investigators preserve some data from a live system, but for several reasons these tools are much more useful in an incident response process. It is quite obvious that if we omit volatile data during acquisition, we can lose evidence. Furthermore, the sophisticated infection methods used by tools such as the FU rootkit or the SQL Slammer worm show that in the near future memory content may be the only place where evidence can be found. Injection of malicious code into running processes by internet worms and viruses is increasingly popular.
For example, SQL Slammer resides only in memory and never writes anything to disk.

There are also other advantages of performing a memory investigation. Suppose we need to recover part of an email or part of a document lost after a word processor crash. Where should we look for it? Even the simple task of searching for strings in main memory is sometimes very useful and lets us extract interesting information, such as commands typed by an intruder [6].

The examples above show that memory investigation is critical for digital forensics. It is worth mentioning that the most interesting information can be found when the compromised system has not been rebooted. In this paper I discuss some techniques for finding evidence in a preserved memory image.

2. Problems with memory acquisition procedure

Most standards and best practice guidelines, such as the "Computer Security Incident Handling Guide" from NIST or RFC 3227 "Guidelines for Evidence Collection and Archiving", include procedures for gathering volatile data. Some of the data that must be acquired is specified in these documents.
For example: current network connections, running processes, user sessions, kernel parameters, open files, etc. But to gather this data an investigator must use several tools such as netstat, lsof, and ifconfig. These tools collect only the obvious data and leave most of the system's memory unanalyzed. Moreover, these tools are executed from user mode. Even statically linked tools can print unreliable data if the kernel has been modified.

The perfect tool for collecting volatile data should not rely on the operating system. Such solutions exist, and one of them is described in "Digital Investigation" magazine, Vol. 1 No. 1. The hardware-based solution described there, called Tribble, is almost perfect. Unfortunately, its special PCI card must be physically installed in a machine before an intrusion occurs, and it is obviously impossible to install such a card in every machine on the internet. A memory acquisition procedure should be usable in every environment, so in most cases it must be a software solution. The only thing an investigator can do when an intrusion occurs is to limit the memory collection process to a few steps, which minimizes the impact on the compromised machine. First, he should dump main memory using only one command. Second, he should remove power from the compromised machine and then preserve the remaining storage media, such as hard disks and floppy disks. The dd tool can be used to dump main memory; it makes a bit-by-bit copy from one file to another. Additionally, the content of main memory has to be saved to storage other than the local file systems. One solution is to send the data to a remote host.
A well-known tool that supports sending files over the network is netcat. In the Linux operating system there are two files, /dev/mem and /proc/kcore, that correspond to main memory (RAM). The size of the dumped memory is equal to the size of RAM. The /proc/kcore object is presented in the ELF core format, so it can easily be analyzed with the gdb tool. The /proc/kcore file is slightly larger than RAM because of the ELF file header.

The whole memory can be dumped as shown below:

#/mnt/cdrom/dd if=/dev/mem | /mnt/cdrom/nc

Once we have the dumped memory image, we can start the digital investigation.

3. Introduction to analysis of the physical memory

3.1 Limitations of the paper

To limit the size of this document it was necessary to specify a few conditions:

· The 2.4.20 kernel release is used in all examples. Similar investigations can be performed with other kernel releases.
· The total size of physical memory is less than 896 MB. When physical memory is larger, additional calculations must be performed to locate page frames properly.
· The page frame size is 4 KB.
This is the default value used in almost every Linux distribution.

The proof-of-concept toolkit idetect is used to simplify the described investigation. After simple modifications, the presented tools can be used on live systems during an incident response.

(Translator's note on the 896 MB condition: a colleague explained that this follows from the kernel's virtual memory layout. Physical memory above 896 MB is treated as high memory, which the kernel cannot reach through its direct mapping.)

3.2 Symbols

During the digital investigation the System.map file can be very helpful. This file is a map of the addresses of important kernel symbols. Every time you compile a new kernel, the addresses of various symbols change. The symbols included in that file provide helpful information for investigators. Let's say we want to enumerate the addresses of system calls. These addresses are stored in a kernel structure called the system call table, and the sys_call_table symbol gives the address of this table. Using the cat and grep commands we obtain that address.

$ cat /boot/System.map | grep sys_call_table
c030a0f0 D sys_call_table

Listing 1 presents the first few entries of the system call table.

(gdb) x/256 0xc030a0f0
0xc030a0f0: 0xc0128fa0 0xc011f8e0 0xc0107aa0 0xc0146cb0
0xc030a100: 0xc0146df0 0xc0146220 0xc0146370 0xc0120060
0xc030a110: 0xc01462c0 0xc0154510 0xc0154070 0xc0107bb0
0xc030a120: 0xc01457f0 0xc0120d40 0xc01536b0 0xc0145b70
0xc030a130: 0xc012ca00 0xc0128fa0 0xc014e910 0xc0146b40
…
Listing 1. The result of running the gdb tool against the /proc/kcore file.

The index of each entry in this table corresponds to a system call number defined in the file /usr/include/asm/unistd.h.

#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
…

Entry n of the table holds the address of the handler for system call number n. For example, the sys_write function (entry 4) is at 0xc0146df0, the sys_open function (entry 5) is at 0xc0146220, and so on.

The System.map file is usually located in the /boot directory on a local file system.

4. An introduction to the digital investigation of the physical memory

The terminology used in the digital investigation of physical memory is similar to that used for file systems. We can define data units and meta-data units. A data unit contains raw data such as executable code or a data section from a memory-mapped file. Data units can also contain the content of the stack or some meta data such as a process descriptor. Data units have a fixed size. In most systems a data unit is 4 KB, the default size of a page frame.

The meta-data unit is where descriptive data about various memory structures is stored. This kind of unit includes structures such as page descriptors, process descriptors, memory regions, and so on.

4.1 Virtual Address Space

In most examples in this paper, virtual (linear) addresses are used. All modern operating systems, including Linux, use this kind of address to access the contents of memory cells.
In the x86 architecture with 32-bit CPUs, a single 32-bit unsigned integer can address up to 4 GB. (Translator's note: the author seems to mean that on 32-bit x86 each process has a 4 GB address space.)

The Linux operating system divides this address space into two parts. The upper 1 GB (0xc0000000 to 0xffffffff) is reserved for the kernel of the operating system; this area can be accessed only when the CPU is switched into Kernel Mode. The remaining 3 GB is called User Land.

4.2 Physical addresses

Physical addresses are used to address memory cells in memory chips. They are represented as 32-bit unsigned integers. The CPU control unit transforms a linear address into a physical address automatically. Helpfully, the calculation from a linear address to a physical one is quite simple, as will be shown several times in the following chapters.

5. Map of system memory

In this chapter important structures of kernel memory are discussed. Only elements useful for forensic investigators are described. For detailed information about each structure discussed in this document, see books [1][2] listed in the references.

5.1 Uniform Memory Access

On the x86 architecture, Linux uses physical memory as a homogeneous, shared resource. This method is called Uniform Memory Access (UMA), and it means that the operating system sees the memory of the computer as a single node. This node is represented by the static pg_data_t structure. The symbol contig_page_data contains the address of this structure.
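Looking up a symbol such as contig_page_data comes down to reading System.map, where each line has the form "address type name". The following is a minimal sketch of such a lookup in Python. The function name parse_system_map is an illustrative assumption and not part of idetect; the sample lines are the System.map entries quoted in this paper.

```python
def parse_system_map(text):
    """Return {symbol: (address, type_letter)} from the text of a System.map file."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, kind, name = parts
        symbols[name] = (int(addr, 16), kind)
    return symbols


# Sample entries taken from the listings in this paper.
sample = """c030a0f0 D sys_call_table
c03e6238 B zone_table
c034c000 D init_task_union
"""

symbols = parse_system_map(sample)
address, kind = symbols["sys_call_table"]
print(hex(address), kind)  # prints: 0xc030a0f0 D
```

On a real system the text would come from the System.map file that matches the running kernel; a map from a different build gives wrong addresses.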
The pg_data_t struct contains information such as the size of the node (that is, the total size of physical memory), the number of zones in the node, the address of the table of page descriptors for this node, and much more. At this point it is important to understand what zones are.

5.2 Zones

Physical memory (the node) is partitioned into three zones: ZONE_DMA, ZONE_NORMAL and ZONE_HIGHMEM. As book [1] (Daniel P. Bovet and Marco Cesati, Understanding the Linux Kernel, 2nd edition) explains, there are reasons for this partitioning:

"However, real computer architectures have hardware constraints that may limit the way page frames can be used. In particular, the Linux kernel must deal with two hardware constraints of the 80 x 86 architecture:
- The Direct Memory Access (DMA) processors for ISA buses have a strong limitation: they are able to address only the first 16 MB of RAM.
- In modern 32-bit computers with lots of RAM, the CPU cannot directly access all physical memory because the linear address space is too small.
To cope with these two limitations, Linux partitions the physical memory in three zones."

The size of the ZONE_DMA zone is 16 MB. The size of the ZONE_NORMAL zone is 896 MB minus the 16 MB of ZONE_DMA, that is, 880 MB. Memory above 896 MB belongs to the ZONE_HIGHMEM zone. This last zone contains page frames that the kernel cannot access directly because of the limits of a single 32-bit unsigned integer. Each memory zone has its own descriptor of type zone_struct.
This structure is defined in the file /usr/src/linux-2.4/include/linux/mmzone.h.

In all examples I used a machine with 128 MB of physical memory. This means that all user and kernel data is stored in the ZONE_NORMAL zone and sometimes in the ZONE_DMA zone. Pointers to the zone descriptors are kept in the zone_table array. The address of the zone_table symbol is stored in the System.map file.

$cat System.map | grep zone_table
c03e6238 B zone_table

The letter "B" means that the symbol is in the uninitialized data section (known as BSS).

The content of the zone_table array is presented in Listing 2.

· zone_table[0] stores the address of the ZONE_DMA descriptor
· zone_table[1] stores the address of the ZONE_NORMAL descriptor
· zone_table[2] stores the address of the ZONE_HIGHMEM descriptor

003e6230 80 9b 34 c0 ce ff 02 00 80 9b 34 c0 80 9e 34 c0 |..4.......4...4.|
003e6240 80 a1 34 c0 00 00 00 00 00 00 00 00 00 00 00 00 |..4.............|
003e6250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
003e6260 ce ff 02 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
Listing 2. Fragment of physical memory with the zone_table array.

The array begins at physical offset 0x003e6238 (the virtual address 0xc03e6238 minus 0xc0000000). Read as little-endian 32-bit values, its three entries are 0xc0349b80, 0xc0349e80 and 0xc034a180. At address 0xc0349e80 we can therefore find the zone descriptor for the ZONE_NORMAL zone.

Each zone descriptor contains a lot of information about its own page frames. There is also an address of the mem_map array.
This table stores the page descriptors of every page frame in the zone. Before looking closer at the mem_map array, let's focus on page frames and page descriptors.

5.3 Page frames and page descriptors

Most data used by the CPU is stored in physical memory in the form of page frames (in fact, physical memory is partitioned into fixed-length page frames). Each page frame is 4 KB by default. x86 processors can sometimes use other page frame sizes, such as 4 MB or 2 MB, but the standard memory allocation unit is 4 KB (only the standard size is discussed in this document). All volatile data is stored in page frames. For instance, when a 7 KB file is mapped, its content (code and data segments) is stored in two page frames of physical memory. When a process requests a memory item, the system uses a linear (virtual) address to access the requested data. To read the data properly, the hardware Memory Management Unit (MMU) automatically translates the virtual address to a physical one. A page may be marked as paged in or paged out. If the page is paged in, the memory access can proceed once the virtual address has been translated to a physical address. If the requested page is paged out, the access raises a page fault, and the page has to be located in the swap area and loaded into physical memory first.
These two possibilities will be discussed in the next sections.

The kernel uses page descriptors to keep track of all physical pages. Each page frame has a corresponding page descriptor, which stores information about the state of the page.

The structure of the page descriptor is defined in the file /usr/src/linux-2.4/include/linux/mm.h.

typedef struct page {
    struct list_head list;
    struct address_space *mapping;
    unsigned long index;
    struct page *next_hash;
    atomic_t count;
    unsigned long flags;
    struct list_head lru;
    union {
        struct pte_chain *chain;
        pte_addr_t direct;
    } pte;
    unsigned char age;
    struct page **pprev_hash;
    struct buffer_head * buffers;
#if defined(CONFIG_HIGHMEM) || defined(WANT_PAGE_VIRTUAL)
    void *virtual;
#endif /* CONFIG_HIGMEM || WANT_PAGE_VIRTUAL */
} mem_map_t;

The mapping field is a pointer to an address_space struct. The virtual field holds the kernel virtual address of the page frame where the data is stored.
When the total amount of physical memory is less than 896 MB, it is easy to calculate the real (physical) address of each page frame by subtracting PAGE_OFFSET (0xc0000000) from the address stored in the virtual field.

The list field contains pointers to the next and previous page descriptors that belong to the same memory region.

As mentioned, all page descriptors are stored in one global mem_map array.

5.4 The mem_map array

In fact, we can identify three mem_map arrays; each zone has its own. Where the first mem_map array (for the ZONE_DMA zone) ends, the second mem_map array (for the ZONE_NORMAL zone) begins.

If we know the size of the page descriptor, it is quite easy to find the start address of the mem_map array for ZONE_NORMAL. ZONE_NORMAL has a special meaning because most page frames allocated by user processes belong to this zone. For most 2.4.x kernels the size of the page descriptor is 56 (0x38) bytes. We know that the total size of ZONE_DMA is 16 MB (0x01000000). We need 4096 (0x1000) page frames to cover the ZONE_DMA zone (0x00000000 to 0x01000000).
The size of the mem_map array for the ZONE_DMA zone is therefore 0x38 bytes * 0x1000 = 0x38000 bytes.

I have to mention that the Linux operating system maps kernel virtual addresses onto physical addresses starting from PAGE_OFFSET. For instance, the virtual address 0xc0000000 corresponds to the physical address 0x00000000, the address 0xc0001000 corresponds to 0x00001000, and so on.

It is also important to note that the mem_map array is physically placed in page frames that belong to the ZONE_NORMAL zone. The location of the mem_map array is shown in Figure 1.

Figure 1. The mem_map array is stored in the ZONE_NORMAL zone. (Legend: yellow is ZONE_DMA, green is ZONE_NORMAL, blue is ZONE_HIGHMEM.)

Based on the above scheme, we can assume that the mem_map for the ZONE_DMA zone starts at the physical address 0x01000000. In fact, it starts at offset 0x30 from that address. Now we can easily locate the physical start address of the mem_map array for ZONE_NORMAL, which is 0x01000030 + 0x00038000 = 0x01038030 (the corresponding virtual address is 0xc1038030).

I have focused on the ZONE_NORMAL zone, but digital investigators must also look at the ZONE_DMA zone.
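The address arithmetic used in this section can be written down as a short Python sketch. All constants are the values quoted in the text for the example machine (2.4.x kernel, less than 896 MB of RAM). The function names are illustrative assumptions, and the descriptor lookup assumes the per-zone mem_map arrays are laid out back to back, as described above.

```python
PAGE_OFFSET = 0xC0000000        # start of the kernel's direct mapping
PAGE_SIZE = 0x1000              # 4 KB page frames
PAGE_DESC_SIZE = 0x38           # size of a page descriptor (56 bytes)
ZONE_DMA_SIZE = 0x01000000      # 16 MB
MEM_MAP_DMA_PHYS = 0x01000030   # physical start of the mem_map array for ZONE_DMA


def virt_to_phys(vaddr):
    """Kernel virtual address -> physical address (valid below 896 MB)."""
    return vaddr - PAGE_OFFSET


def phys_to_virt(paddr):
    """Physical address -> kernel virtual address (valid below 896 MB)."""
    return paddr + PAGE_OFFSET


def mem_map_normal_phys():
    """Physical start of the mem_map array for ZONE_NORMAL."""
    dma_frames = ZONE_DMA_SIZE // PAGE_SIZE          # 4096 page frames
    return MEM_MAP_DMA_PHYS + dma_frames * PAGE_DESC_SIZE


def page_descriptor_phys(frame_paddr):
    """Physical address of the descriptor of the frame containing frame_paddr."""
    frame_number = frame_paddr // PAGE_SIZE
    return MEM_MAP_DMA_PHYS + frame_number * PAGE_DESC_SIZE


print(hex(mem_map_normal_phys()))                 # prints: 0x1038030
print(hex(phys_to_virt(mem_map_normal_phys())))   # prints: 0xc1038030
```

The descriptor of the first ZONE_NORMAL frame (physical address 0x01000000) is the first element of the second array, so page_descriptor_phys(0x01000000) returns the same address as mem_map_normal_phys().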
Under some conditions, page frames that belong to user processes can be allocated in the ZONE_DMA zone. This happens when all page frames in ZONE_NORMAL have already been allocated.

When files are mapped into main memory, their inode structures (discussed in section 5.6) have an associated address_space structure. The mapping field in the page descriptor points exactly to this structure.

5.5 The address_space struct

This object can be associated with a regular file. When a new process is created, an executable file is mapped into the process address space and the address_space object is initialized in memory. Each memory-mapped file has its own address_space structure. The address_space structure is defined in the /usr/src/linux-2.4/include/linux/fs.h source file.

struct address_space {
    struct list_head clean_pages; /* list of clean pages */
    struct list_head dirty_pages; /* list of dirty pages */
    struct list_head locked_pages; /* list of locked pages */
    unsigned long nrpages; /* number of total pages */
    struct address_space_operations *a_ops; /* methods */
    struct inode *host; /* owner: inode, block_device */
    struct vm_area_struct *i_mmap; /* list of private mappings */
    struct vm_area_struct *i_mmap_shared; /* list of shared mappings */
    spinlock_t i_shared_lock; /* and spinlock protecting it */
    int gfp_mask; /* how to allocate the pages */
};

The address_space object includes doubly linked lists of all page descriptors of the mapped file that are stored in main memory.
If we count all these page descriptors, we get the total number of physical page frames, which equals the value kept in the nrpages field. The main role of the address_space object is to link these page frames with the methods associated with the mapped file. Two fields in the address_space structure are really important for us. The host field points to the inode structure of the memory-mapped file, and the i_mmap field points to the memory region to which these page frames belong.

5.6 The inode struct

This structure describes a memory-mapped file. A lot of useful information can be obtained from this object. An investigator can determine the directory from which the file was executed (if the inode describes an executable file), find the MAC times, and so on. The structure is defined in the file /usr/src/linux-2.4/include/linux/fs.h. We don't have to fully understand the role of every field in the inode structure right now; let's focus on a few important ones. The i_ino field contains the inode number. The i_dentry field points to a dentry structure, which describes the directory entry and contains the name of the memory-mapped file. The i_atime, i_mtime and i_ctime fields correspond to the access, modification and change times, known as the MAC times.
The i_mapping field points to the already familiar address_space structure.

5.7 The dentry struct

As mentioned, the dentry structure describes a directory object. This object is defined in the file /usr/src/linux-2.4/include/linux/dcache.h. The dentry structure includes the d_iname array, in which the name of the file is stored.

5.8 Memory regions

We remember that the address_space structure has a field that points to the proper memory region. Each memory region is described by a vm_area_struct structure. This object represents a memory region reserved for a User Mode process. For instance, each file mapped into the process address space occupies at least three regions. The first region holds the executable code, the second represents the initialized data segment, and the last represents the heap. There are also additional memory regions for the stack, shared libraries, and so on. In the pseudo file system procfs each process has a file called maps. This file contains the addresses of all memory regions.
For example, as illustrated in Listing 3, the process with PID = 1143 has 12 memory regions.

$ cat /proc/1143/maps
08048000-0804c000 r-xp 00000000 08:02 385967 /usr/sbin/atd
0804c000-0804d000 rw-p 00003000 08:02 385967 /usr/sbin/atd
0804d000-0804f000 rwxp 00000000 00:00 0
40000000-40015000 r-xp 00000000 08:02 337446 /lib/ld-2.3.2.so
40015000-40016000 rw-p 00014000 08:02 337446 /lib/ld-2.3.2.so
40016000-40018000 rw-p 00000000 00:00 0
4001d000-40028000 r-xp 00000000 08:02 337467 /lib/libnss_files-2.3.2.so
40028000-40029000 rw-p 0000a000 08:02 337467 /lib/libnss_files-2.3.2.so
42000000-4212e000 r-xp 00000000 08:02 433839 /lib/tls/libc-2.3.2.so
4212e000-42131000 rw-p 0012e000 08:02 433839 /lib/tls/libc-2.3.2.so
42131000-42133000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0
Listing 3. Memory regions of selected process.

Each region is described by the vm_area_struct structure, which is defined in the /usr/src/linux-2.4/include/linux/mm.h source file.

struct vm_area_struct {
    struct mm_struct * vm_mm;
    unsigned long vm_start;
    unsigned long vm_end;
    struct vm_area_struct *vm_next;
    pgprot_t vm_page_prot;
    unsigned long vm_flags;
    rb_node_t vm_rb;
    struct vm_area_struct *vm_next_share;
    struct vm_area_struct **vm_pprev_share;
    struct vm_operations_struct * vm_ops;
    unsigned long vm_pgoff;
    struct file * vm_file;
    unsigned long vm_raend;
    void * vm_private_data;
};

Every memory region starts at the address contained in the vm_start field. The vm_end field contains the first address after the end of the region, which is why the end of one line in Listing 3 can equal the start of the next. All regions that belong to a process address space are linked by the vm_next field.
This field points to the next memory region occupied by the process. Several flags are associated with the page frames of a memory region. If a particular region contains the code of a mapped file, then the following flags are set: VM_READ, VM_EXEC and VM_EXECUTABLE. If a region is associated with a mapped file, then the vm_file field points to an object of type struct file. The file structure contains a field which leads to the inode structure.

In the vm_area_struct structure the vm_mm field points to a special data structure. This structure, of type mm_struct, is called a memory descriptor.

5.9 The mm_struct struct

Every User Mode process has exactly one mm_struct descriptor. This structure contains all information related to the process address space. The mm_struct is defined in /usr/src/linux-2.4/include/linux/sched.h.

    struct mm_struct {
        struct vm_area_struct * mmap;
        …
        pgd_t * pgd;
        …
        atomic_t mm_count;
        int map_count;
        …
    } mm_stat;

The mmap field points to a linked list of all memory regions, each of type vm_area_struct. The mm_struct object has a pointer to the Page Global Directory (the pgd field). The value stored in this field can be used to find all page frames of a process. It is also useful if we want to locate page frames which are swapped out to disk. The mm_count field is a reference counter for the memory descriptor, and the map_count field holds the number of memory regions of the process.
Additionally, all memory descriptors are linked by a doubly linked list (see Figure 3).

5.10 The task_struct object

This is the last kernel structure described in this paper. Every process is represented by a process descriptor that includes information about the current state of the process. This structure is defined in the /usr/src/linux-2.4/include/linux/sched.h file. As we can expect, it has an mm field which points to the mm_struct structure. For kernel threads, which are also described by a task_struct, the pointer to the mm_struct object is NULL.

All structures of type task_struct are linked by a doubly linked list. Figure 2 illustrates the doubly linked list which links all existing process descriptors. The head of this list is kept in a structure called init_task_union.

Figure 2. A doubly linked list.

The address of init_task_union is exported. The init_task_union represents process number 0, called swapper. If we find its address, we can enumerate almost all processes in a running state.

The init_task_union symbol is stored in the data section, as shown below.

    $ cat /boot/System.map | grep init_task_union
    c034c000 D init_task_union

To list all processes we can walk this doubly linked list. Notice that this technique is resistant to function hooking that aims to hide some processes.
This is because we read the list directly from the physical memory object. But what about processes which are unlinked from the list? An intruder can use a technique called DKOM (Direct Kernel Object Manipulation) to hide some processes. In the Linux operating system the linked process descriptors are used by the scheduler to allocate CPU time. Thus, it is possible to change the scheduler code so that it uses a second list of processes which is seen only by the scheduler (the hidden processes are attached to that list). A method of detecting such processes will be presented in the next section.

Figure 3 illustrates the relations between all previously described objects. This map will be very helpful during digital investigations of the physical memory.

Figure 3. Relations between data structures of each process.

I have already described the important structures of the kernel memory, so now we can use this knowledge to find evidence in the physical memory of the compromised machine.

6. Recovering content of files from the main memory and the memory image

As illustrated in Figure 3, it is quite easy to identify all page frames which belong to a memory mapped file, that is, the page frames which are still in the main memory and have not been swapped out. Starting from the task_struct structure of the selected process, we can locate the address_space structure, then enumerate all page descriptors and finally read the addresses stored in the virtual field of each page descriptor.
As mentioned previously, the virtual field points to the address where the content of the requested file is stored. Similar operations can also be performed on a live system during incident handling. Therefore, this method is an alternative to copying the exe objects from the /proc file system. A big advantage of such methods is that we do not use the ptrace() function. Incident handlers often use tools such as memfetch or pcat to dump a suspicious process. Unfortunately, these tools use the ptrace() function. When a suspicious process is already traced or is protected against such tools, it is not possible to dump its whole address space correctly. Let's discuss a simple example. To perform this test I ran the nc tool and then set the value stored in the ptrace field to a value other than 0 (the ptrace field is part of the task_struct structure). This means that the process appears to be already traced. The fragment of a loadable kernel module which performs this task is listed below.

    task_lock(current);
    current->ptrace=1;
    task_unlock(current);

Next, as illustrated in Listing 4, I tried to dump the code segment of the nc process by using the pcat and memgrep tools.

    [root@mh302 bin]# ps -ef | grep nc
    mariusz 9111 2563 0 22:08 pts/3 00:00:00 nc
    [root@mh302 memgrep-0.8.0]# ./memgrep -p 9111 -d -a text -l 100 | more
    ptrace(ATTACH): Operation not permitted
    memgrep_initialize(): Couldn’t open medium device.
    [root@mh302 bin]# ./pcat 9111
    ./pcat: ptrace PTRACE_ATTACH: Operation not permitted

Listing 4. Attempts of dumping the code section from the selected process.

As you can see, it is impossible to dump the content of the text section this way. Dumping selected page frames directly from memory is much more effective. As shown in Listing 5, I used a tool called taskenum to enumerate all page descriptors of the memory region which holds the code segment of the suspicious process.

    $ ./taskenum -a |grep nc
    Pid: 9111 [nc] task_struct:[0xc38f7ff4] mm:[0xc068a880]
    $ ./taskenum -p 9111|more
    The process number 9111
    Pid: 9111 [nc] task_struct:[0xc38f8000] mm:[0xc068a880]
    Number of memory regions: 10
    Mapping address: c29e9528
    Number of pages: 4
    page desc addresses: c10abfd8 c1074278 c10473e0 c1095de8
    Address range: 8048000-804c000 (vma: c223dc80)
     i file (nc) c1d50400
     i dir c3083180
     i inode c29e9480
     i mapping c29e9528
    …

Listing 5. Using the taskenum tool to enumerate page descriptors of a selected process.

Every page descriptor contains the virtual field, which holds the kernel virtual address of the page frame. Let's examine the first page descriptor, which is available at the address 0xc10abfd8.

    0xc10abfd8: 0xc1074278 0xc29e9528 0xc29e9528 0x00000001
    0xc10abfe8: 0xc1059c48 0x00000003 0x010400cc 0xc1095e04
    0xc10abff8: 0xc10473fc 0x03549124 0x00000099 0xc1279fa4
    0xc10ac008: 0xc3a7a300 0xc3123000

The last word, 0xc3123000, is the virtual field, so this page frame starts at the physical address 0x03123000 (0xc3123000 - 0xc0000000).

Now we can dump the page frame using the dd tool. We only have to convert the physical address into decimal notation (0x03123000 = 51523584).

    $ dd if=/dev/mem of=page bs=1 count=4096 skip=51523584

Unfortunately, this technique may not allow us to dump the whole address space of a suspicious process. To dump all page frames we must consider that some page frames can be swapped out.
Additionally, the address space of the process contains page frames which are shared between processes, as well as page frames which contain data such as the stack or the heap. To locate all page frames we must enumerate all page table entries of the process. The mm_struct object has the pgd field which points to the Page Global Directory. Entries of this table contain pointers to the Page Tables. Finally, we can identify entries which point to page frames or to swapped-out page frames in the swap file (in this case the Page Table entries store indexes).
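
The address arithmetic used for the dd example and the page table lookup described in this section can be sketched in a few lines of Python. This is only an illustration, not part of taskenum or idetect. It assumes a 32-bit x86 kernel with 4 KB pages, a kernel mapping that starts at 0xc0000000 (as the subtraction above implies), and classic two-level paging without PAE. All function names are hypothetical, and the classification of not-present entries is deliberately simplified.

```python
PAGE_SIZE = 4096          # 4 KB page frames, as in the dd example (count=4096)
PAGE_OFFSET = 0xC0000000  # assumed start of the kernel's direct mapping


def kernel_virt_to_phys(vaddr):
    """Turn a kernel virtual address (e.g. a page's virtual field) into a physical one."""
    if vaddr < PAGE_OFFSET:
        raise ValueError("not an address inside the kernel's direct mapping")
    return vaddr - PAGE_OFFSET


def split_linear_address(addr):
    """Split a 32-bit linear address into (directory index, table index, offset).

    Assumes two-level paging without PAE: 10 + 10 + 12 bits.
    """
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF


def classify_pte(entry):
    """Roughly classify a 32-bit Page Table entry.

    Bit 0 is the hardware Present bit. A non-zero entry with that bit clear
    is reported as 'not_present'; it may describe a swapped-out page, but
    decoding it is kernel specific and is not attempted here.
    """
    if entry & 1:
        return ("present", entry & 0xFFFFF000)  # physical address of the frame
    if entry == 0:
        return ("unmapped", None)
    return ("not_present", entry)


def read_page(image_path, phys_addr):
    """Read one page frame from a raw memory image (or /dev/mem) at a physical address."""
    with open(image_path, "rb") as image:
        image.seek(phys_addr)
        return image.read(PAGE_SIZE)
```

With these helpers, kernel_virt_to_phys(0xc3123000) gives 51523584, the value passed to dd as skip in the example above, and read_page("memory.img", 51523584) would return the same 4096 bytes from a raw image (the file name memory.img is a placeholder). To walk a whole address space, one would read the Page Global Directory pointed to by pgd, use split_linear_address() to pick the directory and table entries for each address in a memory region, and pass each Page Table entry to classify_pte().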
