Process Memory and cgroup Memory Accounting in Linux

In the Linux kernel, per-process memory accounting and cgroup memory accounting have some things in common but also differ in several ways.

Process memory accounting

Generally speaking, the memory used by an application process falls into the following categories:

(1) Anonymous pages in user-mode address spaces, e.g. memory allocated with malloc, or mmap with MAP_ANONYMOUS; when the system is short of memory, the kernel can swap these pages out;

(2) Mapped pages in user-mode address spaces, including file-backed mappings and tmpfs mappings; the former are e.g. mmap of a regular file, the latter e.g. SysV IPC shared memory; when the system is short of memory, the kernel can reclaim these pages, but may first have to write the data back to the file;

(3) Pages in the page cache of a disk file; these appear when a program reads or writes a file through the ordinary read/write syscalls; when the system is short of memory, the kernel can reclaim these pages, but may first have to write the data back to the file;

(4) Buffer pages, which also belong to the page cache, e.g. those used when reading block device files directly.

Cases (1) and (2) are counted as the process's RSS, while (3) and (4) belong to the page cache.
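To make cases (1) and (2) concrete, here is a minimal user-space sketch (the file path and sizes are arbitrary example values): it touches an anonymous mapping and a file-backed mapping, and both then show up in the process's RSS.

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define SZ (16 * 1024 * 1024)   /* 16 MB of anonymous memory, example value */

int main(void)
{
        /* case (1): anonymous memory, charged to anon_rss once touched */
        char *anon = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (anon == MAP_FAILED)
                return 1;
        memset(anon, 1, SZ);                    /* fault the pages in */

        /* case (2): file-backed mapping, charged to file_rss once touched */
        int fd = open("/etc/services", O_RDONLY);   /* any existing file works */
        char *file = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0);
        if (file != MAP_FAILED) {
                volatile char c = file[0];      /* fault one page in */
                (void)c;
        }

        printf("pid %d: run 'grep VmRSS /proc/%d/status' in another shell\n",
               getpid(), getpid());
        pause();                                /* keep the process alive for inspection */
        return 0;
}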

Several files related to process memory accounting:

/proc/[pid]/stat

(23) vsize %lu
     Virtual memory size in bytes.
(24) rss %ld
     Resident Set Size: number of pages the process has in real memory.
     This is just the pages which count toward text, data, or stack space.
     This does not include pages which have not been demand-loaded in,
     or which are swapped out.

Calculating RSS:

This corresponds to the RSS column in top; see do_task_stat:

#define get_mm_rss(mm) \
        (get_mm_counter(mm, file_rss) + get_mm_counter(mm, anon_rss))

That is, RSS = file_rss + anon_rss.

/proc/[pid]/statm

Provides information about memory usage, measured in pages. The columns are:

    size     (1) total program size (same as VmSize in /proc/[pid]/status)
    resident (2) resident set size (same as VmRSS in /proc/[pid]/status)
    share    (3) shared pages (i.e., backed by a file)
    text     (4) text (code)
    lib      (5) library (unused in Linux 2.6)
    data     (6) data + stack
    dt       (7) dirty pages (unused in Linux 2.6)

See the function proc_pid_statm:

int task_statm(struct mm_struct *mm, int *shared, int *text,
               int *data, int *resident)
{
        *shared = get_mm_counter(mm, file_rss);
        *text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK))
                >> PAGE_SHIFT;
        *data = mm->total_vm - mm->shared_vm;
        *resident = *shared + get_mm_counter(mm, anon_rss);
        return mm->total_vm;
}
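As a quick way to see these columns from user space, here is a minimal sketch that reads /proc/self/statm and converts pages to kilobytes; it assumes nothing beyond a Linux /proc filesystem.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
        unsigned long size, resident, share, text, lib, data, dt;
        long page_kb = sysconf(_SC_PAGESIZE) / 1024;
        FILE *fp = fopen("/proc/self/statm", "r");

        if (!fp)
                return 1;
        if (fscanf(fp, "%lu %lu %lu %lu %lu %lu %lu",
                   &size, &resident, &share, &text, &lib, &data, &dt) != 7) {
                fclose(fp);
                return 1;
        }
        fclose(fp);

        /* resident = file_rss + anon_rss, share = file_rss (see task_statm above) */
        printf("size %lu kB, resident %lu kB, share %lu kB\n",
               size * page_kb, resident * page_kb, share * page_kb);
        return 0;
}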

top's SHR column corresponds to file_rss. In fact, shared memory used by a process is also counted in file_rss, because shared memory is backed by tmpfs.
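A small, hypothetical demo of this: attach a SysV shared-memory segment and touch it; the touched pages are charged to file_rss, so both RES and SHR grow in top. The 8 MB size is an arbitrary example value.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SZ (8 * 1024 * 1024)    /* 8 MB, arbitrary example size */

int main(void)
{
        int id = shmget(IPC_PRIVATE, SZ, IPC_CREAT | 0600);
        char *p;

        if (id < 0) {
                perror("shmget");
                return 1;
        }
        p = shmat(id, NULL, 0);
        if (p == (void *)-1) {
                perror("shmat");
                return 1;
        }
        memset(p, 1, SZ);       /* fault the pages in: file_rss (and SHR) grows by ~8 MB */

        printf("pid %d: compare RES/SHR in top, or the 'share' column of /proc/%d/statm\n",
               getpid(), getpid());
        pause();

        shmdt(p);
        shmctl(id, IPC_RMID, NULL);     /* remove the segment */
        return 0;
}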

Calculating anon_rss and file_rss

static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
                unsigned long address, pmd_t *pmd,
                pgoff_t pgoff, unsigned int flags, pte_t orig_pte)
{
        ...
        if (flags & FAULT_FLAG_WRITE) {
                if (!(vma->vm_flags & VM_SHARED)) {
                        anon = 1;       /* anon page */
        ...
        if (anon) {
                inc_mm_counter(mm, anon_rss);
                page_add_new_anon_rmap(page, vma, address);
        } else {
                inc_mm_counter(mm, file_rss);
                page_add_file_rmap(page);
        }
        ...
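The following user-space sketch mirrors that distinction (the file path is an arbitrary example): reading a page of a MAP_PRIVATE file mapping charges file_rss, while writing to it triggers copy-on-write and the private copy is charged to anon_rss.

#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
        int fd = open("/etc/services", O_RDONLY);   /* any existing file */
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        volatile char c;

        if (p == MAP_FAILED)
                return 1;

        c = p[0];       /* read fault: the page-cache page is mapped, charged to file_rss */
        (void)c;
        p[0] = 'x';     /* write fault on a private mapping: COW copy, charged to anon_rss */

        pause();        /* inspect /proc/<pid>/smaps: the mapping now holds an anonymous page */
        return 0;
}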

cgroup memory accounting

stat file

The memory.stat file includes the following statistics:

# per-memory cgroup local status
cache          - # of bytes of page cache memory.
rss            - # of bytes of anonymous and swap cache memory (includes
                 transparent hugepages).
rss_huge       - # of bytes of anonymous transparent hugepages.
mapped_file    - # of bytes of mapped file (includes tmpfs/shmem)
pgpgin         - # of charging events to the memory cgroup. The charging
                 event happens each time a page is accounted as either mapped
                 anon page(RSS) or cache page(Page Cache) to the cgroup.
pgpgout        - # of uncharging events to the memory cgroup. The uncharging
                 event happens each time a page is unaccounted from the cgroup.
swap           - # of bytes of swap usage
dirty          - # of bytes that are waiting to get written back to the disk.
writeback      - # of bytes of file/anon cache that are queued for syncing to
                 disk.
inactive_anon  - # of bytes of anonymous and swap cache memory on inactive
                 LRU list.
active_anon    - # of bytes of anonymous and swap cache memory on active
                 LRU list.
inactive_file  - # of bytes of file-backed memory on inactive LRU list.
active_file    - # of bytes of file-backed memory on active LRU list.
unevictable    - # of bytes of memory that cannot be reclaimed (mlocked etc).
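As a sketch of how these counters can be consumed, the following program parses a memory.stat file and prints a few of the byte counters. The /sys/fs/cgroup/memory path is an assumption (a typical cgroup v1 mount point) and may differ on your system.

#include <stdio.h>
#include <string.h>

int main(void)
{
        /* assumed cgroup v1 mount point; adjust to your system */
        const char *path = "/sys/fs/cgroup/memory/memory.stat";
        FILE *fp = fopen(path, "r");
        char name[64];
        unsigned long long val;

        if (!fp) {
                perror("fopen");
                return 1;
        }
        /* each line of memory.stat is "<counter> <value>" */
        while (fscanf(fp, "%63s %llu", name, &val) == 2) {
                if (!strcmp(name, "cache") || !strcmp(name, "rss") ||
                    !strcmp(name, "mapped_file"))
                        printf("%-12s %llu bytes\n", name, val);
        }
        fclose(fp);
        return 0;
}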

Related code

static void
mem_cgroup_get_local_stat(struct mem_cgroup *mem, struct mcs_total_stat *s)
{
        s64 val;

        /* per cpu stat */
        val = mem_cgroup_read_stat(&mem->stat, MEM_CGROUP_STAT_CACHE);
        s->stat[MCS_CACHE] += val * PAGE_SIZE;
        val = mem_cgroup_read_stat(&mem->stat, MEM_CGROUP_STAT_RSS);
        s->stat[MCS_RSS] += val * PAGE_SIZE;
        val = mem_cgroup_read_stat(&mem->stat, MEM_CGROUP_STAT_FILE_MAPPED);
        s->stat[MCS_FILE_MAPPED] += val * PAGE_SIZE;
        val = mem_cgroup_read_stat(&mem->stat, MEM_CGROUP_STAT_PGPGIN_COUNT);
        s->stat[MCS_PGPGIN] += val;
        val = mem_cgroup_read_stat(&mem->stat, MEM_CGROUP_STAT_PGPGOUT_COUNT);
        s->stat[MCS_PGPGOUT] += val;
        if (do_swap_account) {
                val = mem_cgroup_read_stat(&mem->stat, MEM_CGROUP_STAT_SWAPOUT);
                s->stat[MCS_SWAP] += val * PAGE_SIZE;
        }

        /* per zone stat */
        val = mem_cgroup_get_local_zonestat(mem, LRU_INACTIVE_ANON);
        s->stat[MCS_INACTIVE_ANON] += val * PAGE_SIZE;
        val = mem_cgroup_get_local_zonestat(mem, LRU_ACTIVE_ANON);
        s->stat[MCS_ACTIVE_ANON] += val * PAGE_SIZE;
        val = mem_cgroup_get_local_zonestat(mem, LRU_INACTIVE_FILE);
        s->stat[MCS_INACTIVE_FILE] += val * PAGE_SIZE;
        val = mem_cgroup_get_local_zonestat(mem, LRU_ACTIVE_FILE);
        s->stat[MCS_ACTIVE_FILE] += val * PAGE_SIZE;
        val = mem_cgroup_get_local_zonestat(mem, LRU_UNEVICTABLE);
        s->stat[MCS_UNEVICTABLE] += val * PAGE_SIZE;
}

Data structures:

struct mem_cgroup {
        ...
        /*
         * statistics. This must be placed at the end of memcg.
         */
        struct mem_cgroup_stat stat;    /* statistics */
};

/*
 * Statistics for memory cgroup.
 */
enum mem_cgroup_stat_index {
        /*
         * For MEM_CONTAINER_TYPE_ALL, usage = pagecache + rss.
         */
        MEM_CGROUP_STAT_CACHE,          /* # of pages charged as cache */
        MEM_CGROUP_STAT_RSS,            /* # of pages charged as anon rss */
        MEM_CGROUP_STAT_FILE_MAPPED,    /* # of pages charged as file rss */
        MEM_CGROUP_STAT_PGPGIN_COUNT,   /* # of pages paged in */
        MEM_CGROUP_STAT_PGPGOUT_COUNT,  /* # of pages paged out */
        MEM_CGROUP_STAT_EVENTS,         /* sum of pagein + pageout for internal use */
        MEM_CGROUP_STAT_SWAPOUT,        /* # of pages, swapped out */

        MEM_CGROUP_STAT_NSTATS,
};

struct mem_cgroup_stat_cpu {
        s64 count[MEM_CGROUP_STAT_NSTATS];
} ____cacheline_aligned_in_smp;

struct mem_cgroup_stat {
        struct mem_cgroup_stat_cpu cpustat[0];
};

rss and cache
