Introducing Linux Kernel Symbols

In kernel development, we sometimes need to examine kernel state or reuse kernel facilities, and to do so we have to access (read, write, or execute) kernel symbols. In this article, we will see how the kernel maintains its symbol table and how we can use kernel symbols.

This article is more of a guide to reading kernel source code and to kernel development, so we will work a lot with source code.

What are kernel symbols

Let’s begin with some basic knowledge. In a programming language, a symbol is either a variable or a function. More generally, a symbol is a name representing a region of memory that stores data (a variable, for reading and writing) or instructions (a function, for executing). To make cooperation among the various kernel subsystems easier, the Linux kernel has thousands of global symbols. A global variable is defined outside of any function body. A global function is declared without inline and static. All global symbols are listed in /proc/kallsyms. It looks like this:

    $ tail /proc/kallsyms
    ffffffff81da9000 b .brk.dmi_alloc
    ffffffff81db9000 B __brk_limit
    ffffffffff600000 T vgettimeofday
    ffffffffff600140 t vread_tsc
    ffffffffff600170 t vread_hpet
    ffffffffff600180 D __vsyscall_gtod_data
    ffffffffff600400 T vtime
    ffffffffff600800 T vgetcpu
    ffffffffff600880 D __vgetcpu_mode
    ffffffffff6008c0 D __jiffies

It’s in nm’s output format. The first column is the symbol’s address, the second column is the symbol type, and the third is the symbol name. You can find a detailed description of the type letters in nm’s man page.

In general, one will tell you this is the output of nm vmlinux. However, some entries in this symbol table come from loadable kernel modules, so how can they be listed here? Let’s see how this table is generated.
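To make the column layout concrete, here is a minimal user-space sketch that splits one line of this format into its fields. The struct ksym_line type and the parse_kallsyms_line() function are hypothetical names made up for this illustration, and the optional "[module]" field is the tag that the s_show() code quoted later in this article appends for module symbols.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for one line of /proc/kallsyms. */
struct ksym_line {
    unsigned long long addr;  /* column 1: address, printed in hex */
    char type;                /* column 2: nm-style type letter */
    char name[128];           /* column 3: symbol name */
    char module[64];          /* optional "[module]" tag, "" if absent */
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
int parse_kallsyms_line(const char *line, struct ksym_line *out)
{
    int n;

    memset(out, 0, sizeof(*out));
    /* "%63[^]]" reads everything up to the closing bracket of the tag. */
    n = sscanf(line, "%llx %c %127s [%63[^]]]",
               &out->addr, &out->type, out->name, out->module);
    return n >= 3 ? 0 : -1;
}
```

Feeding it the line "ffffffff81db9000 B __brk_limit" should yield the address 0xffffffff81db9000, the type 'B', the name "__brk_limit" and an empty module field.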

How is /proc/kallsyms generated

As we have seen in the last two articles, the contents of procfs files are generated on reading, so don’t try to find this file anywhere on your disk. But we can go directly to the kernel source for the answer. First, let’s find the code that creates this file in kernel/kallsyms.c.

    static const struct file_operations kallsyms_operations = {
            .open = kallsyms_open,
            .read = seq_read,
            .llseek = seq_lseek,
            .release = seq_release_private,
    };

    static int __init kallsyms_init(void)
    {
            proc_create("kallsyms", 0444, NULL, &kallsyms_operations);
            return 0;
    }
    device_initcall(kallsyms_init);

On creating the file, the kernel associates the open() operation with kallsyms_open(), read() with seq_read(), llseek() with seq_lseek(), and release() with seq_release_private(). Here we see that this file is a sequence file.

The details of sequence files are beyond the scope of this article. There is a comprehensive description in the kernel documentation; please go through Documentation/filesystems/seq_file.txt if you don’t know what a sequence file is. In short, because of the page limitation in proc_read_t, the kernel introduced sequence files as a way to provide large amounts of information to user space.

Ok, back to the source. kallsyms_open() does nothing more than create and reset the iterator for the seq_read operation and, of course, set the seq_operations:

    static const struct seq_operations kallsyms_op = {
            .start = s_start,
            .next = s_next,
            .stop = s_stop,
            .show = s_show
    };
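The order in which these four callbacks are invoked can be imitated in user space. The following is only a sketch of the calling contract, not the kernel's seq_read(), which also deals with buffering, partial reads and restarts. All names here (demo_seq_ops, demo_seq_dump, the names_* callbacks) are hypothetical, and the ctx pointer stands in for the struct seq_file argument of the real callbacks.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space mirror of struct seq_operations. */
struct demo_seq_ops {
    void *(*start)(void *ctx, long *pos);
    void *(*next)(void *ctx, void *v, long *pos);
    void (*stop)(void *ctx, void *v);
    int (*show)(FILE *out, void *v);
};

/* Call the callbacks in seq_file order; returns the number of records shown. */
int demo_seq_dump(FILE *out, const struct demo_seq_ops *ops, void *ctx)
{
    long pos = 0;
    int shown = 0;
    void *v = ops->start(ctx, &pos);

    while (v != NULL) {
        if (ops->show(out, v) < 0)
            break;
        shown++;
        v = ops->next(ctx, v, &pos);
    }
    ops->stop(ctx, v);
    return shown;
}

/* Sample iterator over a NULL-terminated array of names. */
static void *names_start(void *ctx, long *pos)
{
    const char **names = ctx;
    return names[*pos] ? (void *)&names[*pos] : NULL;
}

static void *names_next(void *ctx, void *v, long *pos)
{
    const char **names = ctx;
    (void)v;
    ++*pos;
    return names[*pos] ? (void *)&names[*pos] : NULL;
}

static void names_stop(void *ctx, void *v)
{
    (void)ctx;
    (void)v;
}

static int names_show(FILE *out, void *v)
{
    return fprintf(out, "%s\n", *(const char **)v);
}

const struct demo_seq_ops names_ops = {
    .start = names_start,
    .next = names_next,
    .stop = names_stop,
    .show = names_show,
};
```

Calling demo_seq_dump(stdout, &names_ops, names) with a NULL-terminated array of strings prints one name per line, in the same start, show, next, ..., stop order that the kernel uses for s_start(), s_show(), s_next() and s_stop().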

So, for our purposes, we care about s_start() and s_next(). They both invoke update_iter(), whose core is get_ksymbol_core() for symbols built into the kernel image and get_ksymbol_mod() for symbols that come from modules. Following get_ksymbol_mod(), we finally reach module_get_kallsym() in kernel/module.c:

    int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
                            char *name, char *module_name, int *exported)
    {
            struct module *mod;

            preempt_disable();
            list_for_each_entry_rcu(mod, &modules, list) {
                    if (symnum < mod->num_symtab) {
                            *value = mod->symtab[symnum].st_value;
                            *type = mod->symtab[symnum].st_info;
                            strlcpy(name, mod->strtab + mod->symtab[symnum].st_name,
                                    KSYM_NAME_LEN);
                            strlcpy(module_name, mod->name, MODULE_NAME_LEN);
                            *exported = is_exported(name, *value, mod);
                            preempt_enable();
                            return 0;
                    }
                    symnum -= mod->num_symtab;
            }
            preempt_enable();
            return -ERANGE;
    }

module_get_kallsym() iterates over all modules and their symbols. Five properties are assigned values: value is the symbol’s address, type is the symbol’s type, name is the symbol’s name, module_name is the name of the module the symbol belongs to (it is empty for symbols built into the core kernel), and exported indicates whether the symbol is exported. Have you ever wondered why there are so many “local” symbols (the type character is in lower case) in the symbol table? Let’s have a look at s_show():

            if (iter->module_name[0]) {
                    char type;

                    /*
                     * Label it "global" if it is exported,
                     * "local" if not exported.
                     */
                    type = iter->exported ? toupper(iter->type) :
                                            tolower(iter->type);
                    seq_printf(m, "%0*lx %c %s\t[%s]\n",
                               (int)(2 * sizeof(void *)),
                               iter->value, type, iter->name, iter->module_name);
            } else
                    seq_printf(m, "%0*lx %c %s\n",
                               (int)(2 * sizeof(void *)),
                               iter->value, iter->type, iter->name);

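Ok, clear about it? For symbols that belong to a module, only exported symbols are labeled as “global” (upper case); everything else is shown as “local” (lower case), even if it is global from the C language point of view.

After the iteration has finished, we see the contents of /proc/kallsyms.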
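The labelling rule from s_show() is small enough to restate as a stand-alone function. This is a user-space sketch; the name kallsyms_display_type() and its from_module parameter are made up for illustration, with from_module standing for the iter->module_name[0] test in the kernel code.

```c
#include <ctype.h>

/*
 * Mirror of the labelling rule in s_show(): a module symbol is shown in
 * upper case when exported and in lower case otherwise; a symbol from the
 * core kernel keeps the type letter it already has.
 */
char kallsyms_display_type(char type, int from_module, int exported)
{
    if (!from_module)
        return type;
    return exported ? (char)toupper((unsigned char)type)
                    : (char)tolower((unsigned char)type);
}
```

So a module function with type 'T' that is not exported is displayed as 't', while the same letter coming from the core kernel is printed unchanged.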
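The way module_get_kallsym() turns one flat symbol number into a (module, symbol) pair is easy to try out in user space. The sketch below assumes a hypothetical struct fake_module that keeps only a name and a symbol count; the kernel walks an RCU-protected list of struct module instead of an array.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct module: only a name and a symbol count. */
struct fake_module {
    const char *name;
    unsigned int num_symtab;
};

/*
 * Map a flat index onto (module, local index) the way module_get_kallsym()
 * does: subtract each module's symbol count until the index fits.
 * Returns NULL when symnum is past the last symbol (the kernel returns
 * -ERANGE in that case).
 */
const struct fake_module *locate_symbol(const struct fake_module *mods,
                                        size_t nmods, unsigned int symnum,
                                        unsigned int *local_index)
{
    size_t i;

    for (i = 0; i < nmods; i++) {
        if (symnum < mods[i].num_symtab) {
            *local_index = symnum;
            return &mods[i];
        }
        symnum -= mods[i].num_symtab;
    }
    return NULL;
}
```

With two modules holding 3 and 2 symbols, index 4 lands on the second module at local index 1, and index 5 is out of range, which is the condition that ends the iteration over /proc/kallsyms.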

How to access symbols

Here, access can mean read, write, or execute. Let’s have a look at this simple module:

    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/jiffies.h>

    MODULE_AUTHOR("Stephen Zhang");
    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Use exported symbols");

    static int __init lkm_init(void)
    {
        printk(KERN_INFO "[%s] module loaded.\n", __this_module.name);
        printk("[%s] current jiffies: %lu.\n", __this_module.name, jiffies);
        return 0;
    }

    static void __exit lkm_exit(void)
    {
        printk(KERN_INFO "[%s] module unloaded.\n", __this_module.name);
    }

    module_init(lkm_init);
    module_exit(lkm_exit);

In this module, we used printk() and jiffies, which are both symbols from kernel space. Why are these symbols available in our code? Because they are “exported”.

You can think of kernel symbols as visible at three different levels in the kernel source code:

- “static”, and therefore visible only within their own source file,
- “external”, and therefore potentially visible to any other code built into the kernel itself, and
- “exported”, and therefore visible and available to any loadable module.

The kernel uses two macros to export symbols:

- EXPORT_SYMBOL exports the symbol to any loadable module.
- EXPORT_SYMBOL_GPL exports the symbol only to GPL-licensed modules.

We find the two symbols exported in the kernel source code:

    kernel/printk.c:EXPORT_SYMBOL(printk);
    kernel/time.c:EXPORT_SYMBOL(jiffies);

Apart from examining the kernel code to find out whether a symbol is exported, is there an easier way to identify it? Sure! Every exported entry has another symbol prefixed with __ksymtab_, e.g.

    ffffffff81a4ef00 r __ksymtab_printk
    ffffffff81a4eff0 r __ksymtab_jiffies

Let’s just have another look at the definition of EXPORT_SYMBOL:

    /* For every exported symbol, place a struct in the __ksymtab section */
    #define __EXPORT_SYMBOL(sym, sec)                               \
            extern typeof(sym) sym;                                 \
            __CRC_SYMBOL(sym, sec)                                  \
            static const char __kstrtab_##sym[]                     \
            __attribute__((section("__ksymtab_strings"), aligned(1))) \
            = MODULE_SYMBOL_PREFIX #sym;                            \
            static const struct kernel_symbol __ksymtab_##sym       \
            __used                                                  \
            __attribute__((section("__ksymtab" sec), unused))       \
            = { (unsigned long)&sym, __kstrtab_##sym }

    #define EXPORT_SYMBOL(sym)                                      \
            __EXPORT_SYMBOL(sym, "")

The last definition in the macro places a struct kernel_symbol named __ksymtab_##sym into the __ksymtab section, which is how the entry ends up in the exported symbol table.
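The same trick works in user space, which makes it easy to experiment with. The sketch below assumes GCC or Clang on an ELF platform such as Linux, where the linker provides __start_<section> and __stop_<section> symbols for any section whose name is a valid C identifier. DEMO_EXPORT, struct demo_symbol and demo_lookup() are hypothetical names for this illustration and are not kernel APIs.

```c
#include <string.h>

/* Hypothetical user-space analogue of struct kernel_symbol. */
struct demo_symbol {
    unsigned long value;   /* address of the exported object */
    const char *name;      /* its name as a string */
};

/* Place one record per "exported" symbol into the section demo_ksymtab. */
#define DEMO_EXPORT(sym)                                        \
    static const struct demo_symbol __demo_ksymtab_##sym        \
    __attribute__((section("demo_ksymtab"), used))              \
    = { (unsigned long)&sym, #sym }

int demo_counter = 42;
int demo_answer(void) { return demo_counter; }

DEMO_EXPORT(demo_counter);
DEMO_EXPORT(demo_answer);

/* Section bounds supplied by the linker (assumption: GNU ld, gold or lld). */
extern const struct demo_symbol __start_demo_ksymtab[];
extern const struct demo_symbol __stop_demo_ksymtab[];

/* Linear search of the table; returns 0 when the name was not exported. */
unsigned long demo_lookup(const char *name)
{
    const struct demo_symbol *s;

    for (s = __start_demo_ksymtab; s < __stop_demo_ksymtab; s++)
        if (strcmp(s->name, name) == 0)
            return s->value;
    return 0;
}
```

A call such as demo_lookup("demo_counter") returns the address of demo_counter, much as the module loader resolves a name against the kernel's __ksymtab entries, while a name that was never passed to DEMO_EXPORT yields 0.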

There is one more thing worth noting: __this_module is not an exported symbol, nor is it defined anywhere in the kernel source. In the kernel, all we can find about __this_module is the following two lines:

    extern struct module __this_module;
    #define THIS_MODULE (&__this_module)

How?! It’s not defined in the kernel, so what is there to link against at insmod time? Don’t panic. Have you noticed the temporary file hello.mod.c generated while compiling the module? Here is the definition of __this_module:

    struct module __this_module
    __attribute__((section(".gnu.linkonce.this_module"))) = {
     .name = KBUILD_MODNAME,
     .init = init_module,
    #ifdef CONFIG_MODULE_UNLOAD
     .exit = cleanup_module,
    #endif
     .arch = MODULE_ARCH_INIT,
    };

So far, as we have seen, we can use any exported symbol directly in our module; the only thing we have to do is include the corresponding header file, or just provide the right declaration. Then what if we want to access the other symbols in the kernel? It is usually not a good idea: a symbol that is not exported is not meant to be touched by anyone else, and leaving it alone avoids potential disasters. Still, someday, just to satisfy our curiosity, or because we know exactly what we are doing, we may have to access non-exported symbols. Let’s go further.

How to access non-exported symbol

For each symbol in the kernel, we have an entry in /proc/kallsyms, and we have addresses for all of them. Since we are in the kernel, we can see any bit we want to see! Just read from that address. Let’s take resume_file as an example. Source code comes first:

    #include <linux/module.h>
    #include <linux/kallsyms.h>
    #include <linux/string.h>

    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Access non-exported symbols");
    MODULE_AUTHOR("Stephen Zhang");

    static int __init lkm_init(void)
    {
        char *sym_name = "resume_file";
        unsigned long sym_addr = kallsyms_lookup_name(sym_name);
        char filename[256];

        /* kallsyms_lookup_name() returns 0 when the symbol is not found. */
        if (!sym_addr)
            return -ENOENT;

        /* strncpy() does not terminate the buffer on truncation, so do it here. */
        strncpy(filename, (char *)sym_addr, sizeof(filename) - 1);
        filename[sizeof(filename) - 1] = '\0';
        printk(KERN_INFO "[%s] %s (0x%lx): %s\n", __this_module.name, sym_name, sym_addr, filename);
        return 0;
    }

    static void __exit lkm_exit(void)
    {
    }

    module_init(lkm_init);
    module_exit(lkm_exit);

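Here, instead of parsing /proc/kallsyms to find a symbol’s address, we use kallsyms_lookup_name() to do it. (This function is itself an exported symbol in the kernel versions discussed here; since Linux 5.7 it is no longer exported to modules.) Then we just treat the address as a char *, which matches the type of resume_file, and read it using strncpy().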
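For comparison, the parsing approach mentioned above could look roughly like this from user space. It is a sketch only: lookup_in_kallsyms() is a hypothetical helper, it returns the first matching name, and it takes an already opened stream so that it can be fed either /proc/kallsyms or a saved copy.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a kallsyms-formatted stream for `name` and return its address,
 * or 0 when the symbol is not listed.
 */
unsigned long long lookup_in_kallsyms(FILE *fp, const char *name)
{
    char line[512];
    char sym[256];
    char type;
    unsigned long long addr;

    while (fgets(line, sizeof(line), fp)) {
        /* Each line is "<hex address> <type> <name>", optionally "[module]". */
        if (sscanf(line, "%llx %c %255s", &addr, &type, sym) == 3 &&
            strcmp(sym, name) == 0)
            return addr;
    }
    return 0;
}
```

A caller would open the file with fopen("/proc/kallsyms", "r") and pass the stream in. On many systems unprivileged users see all-zero addresses in this file (the kptr_restrict sysctl governs this), so the lookup is normally run as root.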

Let’s see what happens when we load the module:

    $ sudo insmod lkm_hello.ko
    $ dmesg | tail -n 1
    [lkm_hello] resume_file (0xffffffff81c17140): /dev/sda6
    $ grep resume_file /proc/kallsyms
    ffffffff81c17140 d resume_file

Yeah! We did it! And we see that the symbol address returned by kallsyms_lookup_name() is exactly the same as in /proc/kallsyms. Just as you can read, you can also write to a symbol’s address, but be careful: some addresses are in the rodata section or the text section, which cannot be written. If you try to write to a read-only address, you will probably get a kernel oops. However, this does not mean it is impossible: you can turn off the protection. Follow the instructions in the system call hooking example listed under Reference. The basic idea is to change the page attributes:

    int set_page_rw(long unsigned int _addr)
    {
        struct page *pg;
        pgprot_t prot;

        pg = virt_to_page(_addr);
        prot.pgprot = VM_READ | VM_WRITE;
        return change_page_attr(pg, 1, prot);
    }

    int set_page_ro(long unsigned int _addr)
    {
        struct page *pg;
        pgprot_t prot;

        pg = virt_to_page(_addr);
        prot.pgprot = VM_READ;
        return change_page_attr(pg, 1, prot);
    }

Conclusion

Well, that’s a lot for one post. In this article, we first dug into the Linux kernel source code to find out how the kernel symbol table is generated. Then we learned how to use exported kernel symbols in our modules. Finally, we saw the tricky way to access all kernel symbols from within a module.

Reference

- Kernel Symbols: What’s Available to Your Module, What Isn’t
- Documentation/filesystems/seq_file.txt
- Linux Kernel: System call hooking example
