Linux Kernel Interrupts and Exceptions

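Interrupts:

Maskable interrupts: every interrupt request (IRQ) issued by an I/O device is maskable. A masked interrupt is ignored by the CPU until its mask is cleared.

Nonmaskable interrupts (NMI): raised only by critical events (such as hardware failures).

Exceptions:

Processor-detected exceptions (faults, traps, aborts)

Programmed exceptions (software interrupts): triggered explicitly by the programmer with instructions such as INT or INT3, and usually handled as traps. Their main use is implementing system calls.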

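Interrupt Descriptor Table (IDT): 256 entries, each of which associates one interrupt or exception vector with its handler. There are three types of descriptors:

Task Gate Descriptor. Linux barely uses this type; in the trap_init code shown later in this article it appears only for the double fault exception (vector 8) on 32-bit x86.

Interrupt Gate Descriptor. Used to handle interrupts. The CPU clears the IF flag when it enters the handler through this gate.

Trap Gate Descriptor. Used to handle exceptions. Unlike an interrupt gate, it leaves the IF flag unchanged.

Linux further groups the gates it installs as follows:

Interrupt gate: used for hardware interrupts. The DPL is 0, so user mode cannot invoke it directly with the int instruction. Hardware interrupts skip this DPL check, so they can still be serviced while the CPU is running in user mode. See set_intr_gate.

DPL 3 trap gate (system gate): used for system calls. The DPL is 3, so user mode may invoke it directly with the int instruction; this is what makes system calls through int $0x80 possible. Only vector 0x80 uses this kind of gate. See set_system_gate.

DPL 0 trap gate: used for CPU exceptions. User mode cannot invoke it directly with the int instruction. Exceptions raised by the CPU itself skip this check, so a CPU exception can still be delivered while running in user mode. See set_trap_gate.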
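To make the three gate flavors concrete, here is a minimal sketch in C of how a gate's type and DPL could be combined into the 16-bit attribute word of a descriptor. The helper name gate_attr and the enum are hypothetical and exist only for illustration; the bit positions (present bit 15, DPL in bits 13-14, type in bits 8-11) follow the x86 gate descriptor layout.

```c
#include <stdint.h>

/* x86 gate types as encoded in bits 8..11 of the attribute word. */
enum gate_type {
    GATE_TASK      = 0x5,
    GATE_INTERRUPT = 0xE,  /* 32-bit interrupt gate: IF is cleared on entry */
    GATE_TRAP      = 0xF   /* 32-bit trap gate: IF is left unchanged */
};

/* Hypothetical helper: build the attribute word of a present gate. */
static uint16_t gate_attr(enum gate_type type, unsigned dpl)
{
    return (uint16_t)(0x8000u               /* P: descriptor is present */
                      | ((dpl & 3u) << 13)  /* DPL: who may use "int n" */
                      | ((unsigned)type << 8));
}
```

Under these assumptions, gate_attr(GATE_INTERRUPT, 0) evaluates to 0x8E00, gate_attr(GATE_TRAP, 0) to 0x8F00, and gate_attr(GATE_TRAP, 3) to 0xEF00, which correspond to the interrupt gate, the DPL 0 trap gate, and the DPL 3 trap gate described above.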

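While an instruction executes, the control unit checks whether an interrupt or exception has occurred. If so, once that instruction has completed, the hardware proceeds as follows:

1) Determines the vector number i of the interrupt or exception.

2) Reads the i-th gate descriptor from the IDT (the idtr register points to the IDT).

3) Uses the selector in the i-th entry, together with gdtr, to look up the segment descriptor in the GDT. This descriptor gives the base address of the segment containing the handler; the offset within that segment comes from the gate descriptor.

4) Performs the privilege checks. It compares the CPL in cs with the DPL of the segment descriptor found in the GDT and raises a general protection exception if the CPL is numerically lower than that DPL, because the handler must not be less privileged than the code it interrupts. For programmed exceptions it additionally compares the CPL with the DPL of the gate descriptor and requires the CPL to be numerically less than or equal to the gate's DPL. Why? The INT instruction lets a user-mode process raise an interrupt with any vector from 0 to 255. To keep users from triggering arbitrary vectors with INT, initialization sets the DPL of the gate for vector 0x80 (the one used for system calls) to 3 and sets the DPL of every gate that must stay off-limits to 0, so this check catches the illegal cases.

5) Checks whether the privilege level changes, which normally means a transition from user mode to kernel mode. If it does, the control unit must start using the stack associated with the new privilege level:

a. Reads the tr register to access the TSS of the running process. Why? Because every transition from user mode to kernel mode has to obtain the kernel stack pointer from the TSS.

b. Loads ss and esp with the stack segment and stack pointer of the new privilege level; both values are found in the TSS.

c. Saves the user-mode ss and esp on the new (kernel) stack; these values give the logical address of the user-mode stack.

6) If a fault occurred, loads cs and eip with the address of the faulting instruction so that the instruction can be executed again after the handler finishes.

7) Saves eflags, cs, and eip on the stack.

8) If the exception carries a hardware error code, saves it on the stack.

9) Loads cs and eip with, respectively, the segment selector and the offset fields of the i-th gate descriptor in the IDT. These values form the logical address of the first instruction of the interrupt or exception handler.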
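The privilege checks and the stack-switch decision in the steps above can be sketched in C as follows. This is only a model under stated assumptions, not how the CPU is implemented: the struct int_request type and both function names are hypothetical, and privilege levels are plain integers where 0 is the most privileged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the inputs to the hardware checks. */
struct int_request {
    unsigned cpl;          /* low two bits of cs when the event occurs */
    unsigned handler_dpl;  /* DPL of the handler's code segment in the GDT */
    unsigned gate_dpl;     /* DPL of the IDT gate */
    bool     programmed;   /* true for "int n"/int3, false for hardware events */
};

/* Returns true if delivery is allowed, false where the CPU would raise
 * a general protection exception instead. */
static bool delivery_allowed(const struct int_request *r)
{
    /* The handler may not be less privileged than the interrupted code. */
    if (r->cpl < r->handler_dpl)
        return false;
    /* Only programmed exceptions are checked against the gate DPL. */
    if (r->programmed && r->gate_dpl < r->cpl)
        return false;
    return true;
}

/* A stack switch is needed when the handler is more privileged than
 * the interrupted code (typically user mode to kernel mode). */
static bool needs_stack_switch(const struct int_request *r)
{
    return r->handler_dpl < r->cpl;
}
```

In this model, a user-mode int through a DPL 3 gate passes, a user-mode int through a DPL 0 gate is rejected, and a hardware interrupt arriving in user mode passes regardless of the gate DPL and requires a stack switch.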

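Returning from an interrupt or exception:

After the interrupt or exception has been handled, the handler executes an iret instruction, which makes the control unit do the following:

1) Loads cs, eip, and eflags with the values saved on the stack. If a hardware error code was pushed on the stack, the handler must pop it before executing iret.

2) Checks whether the CPL of the handler equals the value in the two least significant bits of the restored cs (this tells whether the interrupted code was running in kernel mode or user mode). If they are equal, the interrupted code was running in kernel mode and iret finishes; otherwise it continues with step 3.

3) Loads ss and esp from the stack, which switches back to the stack of the old privilege level.

4) Examines the ds, es, fs, and gs segment registers. If any of them holds a selector that refers to a segment descriptor whose DPL is numerically lower than the CPL (that is, more privileged than the code being returned to), it clears that register. This prevents a malicious user program from using these registers to access the kernel address space.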
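The iret steps above can be modeled with a short C sketch. It is an illustration under assumptions rather than a description of real hardware: struct cpu_state, model_iret, and scrub_selector are hypothetical names, and the saved frame is represented as an array of 32-bit words starting at the saved eip.

```c
#include <stdbool.h>
#include <stdint.h>

struct cpu_state {
    uint32_t eip, cs, eflags, esp, ss;
};

/*
 * Hypothetical model of iret. "stack" points at the saved eip; any
 * hardware error code must already have been popped by the handler.
 * Returns true if the return crossed privilege levels.
 */
static bool model_iret(const uint32_t *stack, unsigned handler_cpl,
                       struct cpu_state *cpu)
{
    /* Step 1: restore eip, cs and eflags. */
    cpu->eip    = stack[0];
    cpu->cs     = stack[1];
    cpu->eflags = stack[2];

    /* Step 2: same privilege level, nothing more to restore. */
    if ((cpu->cs & 3u) == handler_cpl)
        return false;

    /* Step 3: switch back to the stack of the old privilege level. */
    cpu->esp = stack[3];
    cpu->ss  = stack[4];
    return true;
}

/* Step 4: a data segment selector is cleared if its descriptor's DPL is
 * numerically lower (more privileged) than the CPL being returned to. */
static uint32_t scrub_selector(uint32_t selector, unsigned desc_dpl,
                               unsigned new_cpl)
{
    return desc_dpl < new_cpl ? 0 : selector;
}
```

The model only reads the saved esp and ss when the privilege level changes, which mirrors the fact that those two words are on the stack only if the interrupt arrived from a less privileged level.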

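In short, hardware interrupts and exceptions work as follows. When an interrupt arrives, the device asserts an interrupt pin, the pin number is mapped to a vector number, and the vector selects the corresponding entry in the interrupt descriptor table (IDT). The base address of the GDT is taken from the gdtr register, and the GDT is searched for the segment descriptor identified by the selector in the IDT entry; this descriptor specifies the base address of the segment containing the interrupt or exception handler. The privilege checks are performed and the context is saved. cs and eip are then loaded with the segment selector and offset fields of the i-th gate descriptor in the IDT, which give the logical address of the first instruction of the handler. When handling is complete, the handler must execute an iret instruction to hand control back to the interrupted process.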

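Interrupt flow:

Initializing the interrupt descriptor table

During kernel initialization, the setup_idt assembly function fills all 256 entries with the same interrupt gate, which points to the ignore_int handler.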

/*
 * setup_idt
 *
 * sets up a idt with 256 entries pointing to
 * ignore_int, interrupt gates. It doesn't actually load
 * idt - that can be done only after paging has been enabled
 * and the kernel moved to PAGE_OFFSET. Interrupts
 * are enabled elsewhere, when we can be relatively
 * sure everything is ok.
 *
 * Warning: %esi is live across this function.
 */
setup_idt:
        lea ignore_int,%edx
        movl $(__KERNEL_CS << 16),%eax
        movw %dx,%ax            /* selector = 0x0010 = cs */
        movw $0x8E00,%dx        /* interrupt gate - dpl=0, present */

        lea idt_table,%edi
        mov $256,%ecx
rp_sidt:
        movl %eax,(%edi)
        movl %edx,4(%edi)
        addl $8,%edi
        dec %ecx
        jne rp_sidt

.macro  set_early_handler handler,trapno
        lea \handler,%edx
        movl $(__KERNEL_CS << 16),%eax
        movw %dx,%ax
        movw $0x8E00,%dx        /* interrupt gate - dpl=0, present */
        lea idt_table,%edi
        movl %eax,8*\trapno(%edi)
        movl %edx,8*\trapno+4(%edi)
.endm

        set_early_handler handler=early_divide_err,trapno=0
        set_early_handler handler=early_illegal_opcode,trapno=6
        set_early_handler handler=early_protection_fault,trapno=13
        set_early_handler handler=early_page_fault,trapno=14

        ret
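The register shuffling in setup_idt may be easier to follow as C. The sketch below is a user-space illustration, not kernel code: struct gate_words, pack_gate, and fill_idt are hypothetical names, and the two 32-bit words correspond to what the assembly builds in %eax and %edx before storing them into an 8-byte IDT entry.

```c
#include <stdint.h>

/* The two 32-bit halves of an 8-byte IDT entry. */
struct gate_words {
    uint32_t lo;  /* selector in bits 16..31, handler offset bits 0..15 */
    uint32_t hi;  /* handler offset bits 16..31, attribute word in bits 0..15 */
};

/* Hypothetical helper mirroring the %eax/%edx construction in setup_idt. */
static struct gate_words pack_gate(uint32_t handler, uint16_t selector,
                                   uint16_t attr)
{
    struct gate_words g;
    g.lo = ((uint32_t)selector << 16) | (handler & 0xFFFFu);  /* %eax */
    g.hi = (handler & 0xFFFF0000u) | attr;                    /* %edx */
    return g;
}

/* Fill every entry with the same interrupt gate, like the rp_sidt loop. */
static void fill_idt(struct gate_words idt[256], uint32_t handler,
                     uint16_t selector)
{
    for (int i = 0; i < 256; i++)
        idt[i] = pack_gate(handler, selector, 0x8E00);  /* present, DPL 0 */
}
```

The handler address is split across the two words, which is why the assembly first loads the full address into %edx, copies its low half into %ax, and then overwrites the low half of %edx with the attribute word 0x8E00.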

start_kernel then calls the trap_init function, which adds entries to the IDT (mostly for exception handling).

void __init trap_init(void)
{
        int i;

#ifdef CONFIG_EISA
        void __iomem *p = early_ioremap(0x0FFFD9, 4);

        if (readl(p) == 'E' + ('I'<<8) + ('S'<<16) + ('A'<<24))
                EISA_bus = 1;
        early_iounmap(p, 4);
#endif

        set_intr_gate(0, &divide_error);
        set_intr_gate_ist(1, &debug, DEBUG_STACK);
        set_intr_gate_ist(2, &nmi, NMI_STACK);
        /* int3 can be called from all */
        set_system_intr_gate_ist(3, &int3, DEBUG_STACK);
        /* int4 can be called from all */
        set_system_intr_gate(4, &overflow);
        set_intr_gate(5, &bounds);
        set_intr_gate(6, &invalid_op);
        set_intr_gate(7, &device_not_available);
#ifdef CONFIG_X86_32
        set_task_gate(8, GDT_ENTRY_DOUBLEFAULT_TSS);
#else
        set_intr_gate_ist(8, &double_fault, DOUBLEFAULT_STACK);
#endif
        set_intr_gate(9, &coprocessor_segment_overrun);
        set_intr_gate(10, &invalid_TSS);
        set_intr_gate(11, &segment_not_present);
        set_intr_gate_ist(12, &stack_segment, STACKFAULT_STACK);
        set_intr_gate(13, &general_protection);
        set_intr_gate(14, &page_fault);
        set_intr_gate(15, &spurious_interrupt_bug);
        set_intr_gate(16, &coprocessor_error);
        set_intr_gate(17, &alignment_check);
#ifdef CONFIG_X86_MCE
        set_intr_gate_ist(18, &machine_check, MCE_STACK);
#endif
        set_intr_gate(19, &simd_coprocessor_error);

        /* Reserve all the builtin and the syscall vector: */
        for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++)
                set_bit(i, used_vectors);

#ifdef CONFIG_IA32_EMULATION
        set_system_intr_gate(IA32_SYSCALL_VECTOR, ia32_syscall);
        set_bit(IA32_SYSCALL_VECTOR, used_vectors);
#endif

#ifdef CONFIG_X86_32
        if (cpu_has_fxsr) {
                printk(KERN_INFO "Enabling fast FPU save and restore... ");
                set_in_cr4(X86_CR4_OSFXSR);
                printk("done.\n");
        }
        if (cpu_has_xmm) {
                printk(KERN_INFO
                        "Enabling unmasked SIMD FPU exception support... ");
                set_in_cr4(X86_CR4_OSXMMEXCPT);
                printk("done.\n");
        }

        set_system_trap_gate(SYSCALL_VECTOR, &system_call);
        set_bit(SYSCALL_VECTOR, used_vectors);
#endif

        /*
         * Should be a barrier for any external CPU state:
         */
        cpu_init();

        x86_init.irqs.trap_init();
}
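The set_bit calls in trap_init record which vectors are already taken so that they are not handed out to devices later. A minimal user-space sketch of that bookkeeping might look like the following. The helper names mark_used, is_used, and reserve_builtin_vectors are hypothetical, and the constants 0x20 and 0x80 are stated here as assumptions matching the usual 32-bit x86 values of FIRST_EXTERNAL_VECTOR and SYSCALL_VECTOR.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_VECTORS            256
#define FIRST_EXTERNAL_VECTOR 0x20  /* assumed: vectors 0..31 belong to exceptions */
#define SYSCALL_VECTOR        0x80  /* assumed: the int $0x80 system call vector */
#define BITS_PER_WORD         (sizeof(unsigned long) * CHAR_BIT)

/* One bit per vector; a set bit means the vector is reserved. */
static unsigned long used_vectors[(NR_VECTORS + BITS_PER_WORD - 1) / BITS_PER_WORD];

static void mark_used(unsigned v)
{
    used_vectors[v / BITS_PER_WORD] |= 1UL << (v % BITS_PER_WORD);
}

static bool is_used(unsigned v)
{
    return (used_vectors[v / BITS_PER_WORD] >> (v % BITS_PER_WORD)) & 1UL;
}

/* Mirrors the reservation loop plus the system call vector. */
static void reserve_builtin_vectors(void)
{
    for (unsigned i = 0; i < FIRST_EXTERNAL_VECTOR; i++)
        mark_used(i);
    mark_used(SYSCALL_VECTOR);
}
```

After reserve_builtin_vectors runs, vectors 0 through 31 and vector 0x80 are marked as used, while the remaining vectors from 32 upward stay available for external interrupts.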

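Exception handling

Exception handlers have a standard structure consisting of three parts:

1) Save the contents of most registers on the kernel stack (this part is written in assembly language).

2) Handle the exception with a high-level C function.

3) Exit from the handler through ret_from_exception.

For example, the assembly entry for the divide error exception (division by zero) is: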

ENTRY(divide_error)
        RING0_INT_FRAME
        pushl $0                        # no error code
        CFI_ADJUST_CFA_OFFSET 4
        pushl $do_divide_error
        CFI_ADJUST_CFA_OFFSET 4
        jmp error_code
        CFI_ENDPROC
END(divide_error)

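The divide_error entry point is the handler address stored in the corresponding IDT entry, so this is the first code to run after the exception occurs. If the control unit does not automatically push a hardware error code when the exception is raised, the assembly fragment contains a pushl $0 instruction to pad the stack with a null value. It then pushes the address of the high-level C function, whose name is the exception handler's name with a do_ prefix, and jumps to error_code.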

error_code:
        /* the function address is in %gs's slot on the stack */
        pushl %fs
        CFI_ADJUST_CFA_OFFSET 4
        /*CFI_REL_OFFSET fs, 0*/
        pushl %es
        CFI_ADJUST_CFA_OFFSET 4
        /*CFI_REL_OFFSET es, 0*/
        pushl %ds
        CFI_ADJUST_CFA_OFFSET 4
        /*CFI_REL_OFFSET ds, 0*/
        pushl %eax
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET eax, 0
        pushl %ebp
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET ebp, 0
        pushl %edi
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET edi, 0
        pushl %esi
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET esi, 0
        pushl %edx
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET edx, 0
        pushl %ecx
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET ecx, 0
        pushl %ebx
        CFI_ADJUST_CFA_OFFSET 4
        CFI_REL_OFFSET ebx, 0
        cld
        movl $(__KERNEL_PERCPU), %ecx
        movl %ecx, %fs
        UNWIND_ESPFIX_STACK
        GS_TO_REG %ecx
        movl PT_GS(%esp), %edi          # get the function address
        movl PT_ORIG_EAX(%esp), %edx    # get the error code
        movl $-1, PT_ORIG_EAX(%esp)     # no syscall to restart
        REG_TO_PTGS %ecx
        SET_KERNEL_GS %ecx
        movl $(__USER_DS), %ecx
        movl %ecx, %ds
        movl %ecx, %es
        TRACE_IRQS_OFF
        movl %esp,%eax                  # pt_regs pointer
        call *%edi
        jmp ret_from_exception

The error_code assembly mainly saves most of the registers and then executes call *%edi to invoke the C function whose address was pushed on the stack earlier.
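The hand-off from the per-exception stub to error_code and on to the C function can be modeled in plain C. This is a simplified sketch under assumptions, not the kernel's pt_regs layout: struct regs_model keeps only the two stack slots that matter here, and every name ending in _model or starting with model_ is hypothetical.

```c
struct regs_model;
typedef void (*trap_handler_t)(struct regs_model *regs, long error_code);

/* Only the two slots the stub fills in before jumping to error_code. */
struct regs_model {
    trap_handler_t handler_slot;  /* models the %gs slot: C handler address */
    long orig_eax;                /* models PT_ORIG_EAX: the error code */
};

static long last_error_code = -2;  /* records what the C handler received */

/* Stand-in for the high-level C handler. */
static void do_divide_error_model(struct regs_model *regs, long error_code)
{
    (void)regs;
    last_error_code = error_code;
}

/* Models the tail of error_code. */
static void model_error_code(struct regs_model *regs)
{
    trap_handler_t handler = regs->handler_slot;  /* movl PT_GS(%esp), %edi */
    long error_code = regs->orig_eax;             /* movl PT_ORIG_EAX(%esp), %edx */
    regs->orig_eax = -1;                          /* no syscall to restart */
    handler(regs, error_code);                    /* call *%edi */
}

/* Models the divide_error stub: no hardware error code, so push a dummy 0. */
static void model_divide_error_entry(struct regs_model *regs)
{
    regs->orig_eax = 0;                           /* pushl $0 */
    regs->handler_slot = do_divide_error_model;   /* pushl $do_divide_error */
    model_error_code(regs);                       /* jmp error_code */
}
```

The point of the dummy 0 is that every exception reaches error_code with the same stack shape, so one shared routine can fetch the handler address and the error code from fixed slots.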

In the Linux 2.6 kernel, these do_ functions are defined with macros:

DO_ERROR_INFO(0, SIGFPE, "divide error", divide_error, FPE_INTDIV, regs->ip)
DO_ERROR(4, SIGSEGV, "overflow", overflow)
DO_ERROR(5, SIGSEGV, "bounds", bounds)
DO_ERROR_INFO(6, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->ip)
DO_ERROR(9, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun)
DO_ERROR(10, SIGSEGV, "invalid TSS", invalid_TSS)
DO_ERROR(11, SIGBUS, "segment not present", segment_not_present)
#ifdef CONFIG_X86_32
DO_ERROR(12, SIGBUS, "stack segment", stack_segment)
#endif

Let's look at the definition of one of these macros:

#define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr)         \
dotraplinkage void do_##name(struct pt_regs *regs, long error_code)     \
{                                                                       \
        siginfo_t info;                                                 \
        info.si_signo = signr;                                          \
        info.si_errno = 0;                                              \
        info.si_code = sicode;                                          \
        info.si_addr = (void __user *)siaddr;                           \
        if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr)  \
                                                        == NOTIFY_STOP) \
                return;                                                 \
        conditional_sti(regs);                                          \
        do_trap(trapnr, signr, str, regs, error_code, &info);           \
}

As the macro shows, every one of these functions ends by calling do_trap.
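The token-pasting trick behind these macros can be tried in user space. The sketch below is a stripped-down illustration, not the kernel macro: DO_ERROR_MODEL, do_trap_model, and struct trap_record are hypothetical names, and the generated functions only record their arguments instead of delivering a signal.

```c
#include <signal.h>

/* What the shared back end was last asked to handle. */
struct trap_record {
    int trapnr;
    int signr;
    const char *str;
    long error_code;
};

static struct trap_record last_trap;

/* Stand-in for do_trap: the single function all generated handlers call. */
static void do_trap_model(int trapnr, int signr, const char *str,
                          long error_code)
{
    last_trap.trapnr = trapnr;
    last_trap.signr = signr;
    last_trap.str = str;
    last_trap.error_code = error_code;
}

/* name is pasted after do_ to form the handler's function name. */
#define DO_ERROR_MODEL(trapnr, signr, str, name)        \
static void do_##name(long error_code)                  \
{                                                       \
        do_trap_model(trapnr, signr, str, error_code);  \
}

/* Each line expands to a complete function definition. */
DO_ERROR_MODEL(4, SIGSEGV, "overflow", overflow)
DO_ERROR_MODEL(9, SIGFPE, "coprocessor segment overrun", coprocessor_segment_overrun)
```

Each invocation of the macro produces one handler (here do_overflow and do_coprocessor_segment_overrun) that differs only in the constants it forwards, which is why the kernel can describe a whole family of exception handlers in a few lines.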

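Returning from an exception

When the C function that handles the exception returns, execution continues with a jmp instruction to ret_from_exception (the final jmp in the error_code assembly routine above).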

ret_from_exception:
        preempt_stop(CLBR_ANY)
ret_from_intr:
        GET_THREAD_INFO(%ebp)
check_userspace:
        movl PT_EFLAGS(%esp), %eax      # mix EFLAGS and CS
        movb PT_CS(%esp), %al
        andl $(X86_EFLAGS_VM | SEGMENT_RPL_MASK), %eax
        cmpl $USER_RPL, %eax
        /* the interrupted code was running in kernel mode */
        jb resume_kernel                # not returning to v8086 or userspace

        /* the interrupted code was running in user space */
ENTRY(resume_userspace)
        LOCKDEP_SYS_EXIT
        DISABLE_INTERRUPTS(CLBR_ANY)    # make sure we don't miss an interrupt
                                        # setting need_resched or sigpending
                                        # between sampling and the iret
        TRACE_IRQS_OFF
        movl TI_flags(%ebp), %ecx
        andl $_TIF_WORK_MASK, %ecx      # is there any work to be done on
                                        # int/exception return?
        jne work_pending
        jmp restore_all
END(ret_from_exception)

#ifdef CONFIG_PREEMPT
ENTRY(resume_kernel)
        DISABLE_INTERRUPTS(CLBR_ANY)
        /* with kernel preemption enabled, fall through to need_resched */
        cmpl $0,TI_preempt_count(%ebp)  # non-zero preempt_count ?
        /* non-zero: resume the interrupted code */
        jnz restore_all
need_resched:
        movl TI_flags(%ebp), %ecx       # need_resched set ?
        testb $_TIF_NEED_RESCHED, %cl
        jz restore_all
        testl $X86_EFLAGS_IF,PT_EFLAGS(%esp)    # interrupts off (exception path) ?
        jz restore_all
        call preempt_schedule_irq
        jmp need_resched
END(resume_kernel)
#endif
        CFI_ENDPROC
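The two decisions made on this return path, "am I going back to user space?" and "should the kernel be preempted?", can be restated as a small C sketch. The function names are hypothetical, and the constant values are written out here as assumptions matching the usual x86 definitions (VM flag 0x00020000, IF flag 0x00000200, RPL mask and user RPL both 3).

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_VM    0x00020000u  /* virtual-8086 mode flag */
#define X86_EFLAGS_IF    0x00000200u  /* interrupt enable flag */
#define SEGMENT_RPL_MASK 0x3u
#define USER_RPL         0x3u

/* Models check_userspace: true means resume_userspace, false resume_kernel. */
static bool returning_to_user(uint32_t saved_eflags, uint32_t saved_cs)
{
    /* movl PT_EFLAGS, %eax then movb PT_CS, %al: the low byte comes from cs. */
    uint32_t mixed = (saved_eflags & 0xFFFFFF00u) | (saved_cs & 0xFFu);

    mixed &= X86_EFLAGS_VM | SEGMENT_RPL_MASK;
    /* "jb resume_kernel" is taken only when mixed < USER_RPL. */
    return mixed >= USER_RPL;
}

/* Models resume_kernel/need_resched: true means preempt_schedule_irq runs. */
static bool should_preempt_kernel(int preempt_count, bool need_resched,
                                  uint32_t saved_eflags)
{
    if (preempt_count != 0)                /* preemption is disabled */
        return false;
    if (!need_resched)                     /* nothing is waiting to run */
        return false;
    if (!(saved_eflags & X86_EFLAGS_IF))   /* interrupted code had interrupts off */
        return false;
    return true;
}
```

Mixing the VM flag with the RPL bits lets a single unsigned comparison cover both ordinary user mode (RPL 3) and virtual-8086 mode, where the saved cs is a real-mode segment value and the VM bit alone pushes the result above USER_RPL.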
