Commit | Line | Data |
---|---|---|
1da177e4 LT |
1 | Documentation for /proc/sys/vm/* kernel version 2.2.10 |
2 | (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> | |
3 | ||
4 | For general info and legal blurb, please look in README. | |
5 | ||
6 | ============================================================== | |
7 | ||
8 | This file contains the documentation for the sysctl files in | |
9 | /proc/sys/vm and is valid for Linux kernel version 2.2. | |
10 | ||
11 | The files in this directory can be used to tune the operation | |
12 | of the virtual memory (VM) subsystem of the Linux kernel and | |
13 | the writeout of dirty data to disk. | |
14 | ||
15 | Default values and initialization routines for most of these | |
16 | files can be found in mm/swap.c. | |
17 | ||
18 | Currently, these files are in /proc/sys/vm: | |
19 | - overcommit_memory | |
20 | - page-cluster | |
21 | - dirty_ratio | |
22 | - dirty_background_ratio | |
23 | - dirty_expire_centisecs | |
24 | - dirty_writeback_centisecs | |
25 | - max_map_count | |
26 | - min_free_kbytes | |
27 | - laptop_mode | |
28 | - block_dump | |
9d0243bc | 29 | - drop_caches |
1743660b | 30 | - zone_reclaim_mode |
fadd8fbd | 31 | - panic_on_oom |
1da177e4 LT |
32 | |
33 | ============================================================== | |
34 | ||
35 | dirty_ratio, dirty_background_ratio, dirty_expire_centisecs, | |
36 | dirty_writeback_centisecs, vfs_cache_pressure, laptop_mode, | |
9d0243bc | 37 | block_dump, swap_token_timeout, drop_caches: |
1da177e4 LT |
38 | |
39 | See Documentation/filesystems/proc.txt | |
40 | ||
41 | ============================================================== | |
42 | ||
43 | overcommit_memory: | |
44 | ||
45 | This value contains a flag that enables memory overcommitment. | |
46 | ||
47 | When this flag is 0, the kernel attempts to estimate the amount | |
48 | of free memory left when userspace requests more memory. | |
49 | ||
50 | When this flag is 1, the kernel pretends there is always enough | |
51 | memory until it actually runs out. | |
52 | ||
53 | When this flag is 2, the kernel uses a "never overcommit" | |
54 | policy that attempts to prevent any overcommit of memory. | |
55 | ||
56 | This feature can be very useful because there are a lot of | |
57 | programs that malloc() huge amounts of memory "just-in-case" | |
58 | and don't use much of it. | |
59 | ||
60 | The default value is 0. | |
61 | ||
62 | See Documentation/vm/overcommit-accounting and | |
63 | security/commoncap.c::cap_vm_enough_memory() for more information. | |
64 | ||
65 | ============================================================== | |
66 | ||
67 | overcommit_ratio: | |
68 | ||
69 | When overcommit_memory is set to 2, the committed address | |
70 | space is not permitted to exceed swap plus this percentage | |
71 | of physical RAM. See above. | |
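
As a rough worked example of the limit described above, the sketch below
computes swap plus a percentage of physical RAM. The function name and the
sample sizes are hypothetical, and the calculation ignores any other
adjustments the kernel may apply.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate commit limit in kB when overcommit_memory is 2.

    Simplified sketch: swap plus overcommit_ratio percent of RAM.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


# Hypothetical machine: 8 GiB of RAM, 2 GiB of swap, ratio of 50 percent.
ram_kb = 8 * 1024 * 1024
swap_kb = 2 * 1024 * 1024
print(commit_limit_kb(ram_kb, swap_kb, 50))  # 6291456 kB, i.e. 6 GiB
```
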
72 | ||
73 | ============================================================== | |
74 | ||
75 | page-cluster: | |
76 | ||
77 | The Linux VM subsystem avoids excessive disk seeks by reading | |
78 | multiple pages on a page fault. The number of pages it reads | |
79 | is dependent on the amount of memory in your machine. | |
80 | ||
81 | The number of pages the kernel reads in at once is equal to | |
82 | 2 ^ page-cluster. Values above 2 ^ 5 don't make much sense | |
83 | for swap because we only cluster swap data in 32-page groups. | |
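
The relationship between page-cluster and the read size can be sketched as
follows. The helper name is hypothetical and the 4096-byte page size is an
assumption; the real page size depends on the architecture.

```python
def pages_read_at_once(page_cluster, page_size=4096):
    """Return (pages, bytes) read in one go for a given page-cluster value.

    page_size defaults to 4096 bytes, which is an assumption.
    """
    pages = 1 << page_cluster  # 2 ^ page-cluster
    return pages, pages * page_size


for value in (0, 3, 5):
    pages, size = pages_read_at_once(value)
    print(f"page-cluster={value}: {pages} pages, {size} bytes")
```
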
84 | ||
85 | ============================================================== | |
86 | ||
87 | max_map_count: | |
88 | ||
89 | This file contains the maximum number of memory map areas a process | |
90 | may have. Memory map areas are used as a side-effect of calling | |
91 | malloc, directly by mmap and mprotect, and also when loading shared | |
92 | libraries. | |
93 | ||
94 | While most applications need less than a thousand maps, certain | |
95 | programs, particularly malloc debuggers, may consume lots of them, | |
96 | e.g., up to one or two maps per allocation. | |
97 | ||
98 | The default value is 65536. | |
99 | ||
100 | ============================================================== | |
101 | ||
102 | min_free_kbytes: | |
103 | ||
104 | This is used to force the Linux VM to keep a minimum number | |
105 | of kilobytes free. The VM uses this number to compute a pages_min | |
106 | value for each lowmem zone in the system. Each lowmem zone gets | |
107 | a number of reserved free pages in proportion to its size. | |
8ad4b1fb RS |
108 | |
109 | ============================================================== | |
110 | ||
111 | percpu_pagelist_fraction: | |
112 | ||
113 | This is the largest fraction of the pages in each zone (the high mark, | |
114 | pcp->high) that may be allocated to any single per cpu page list. The minimum | |
115 | value is 8, which means that no more than 1/8th of the pages in each zone | |
116 | can be allocated to any single per_cpu_pagelist. This entry only changes the | |
117 | value for the hot per cpu pagelists. For example, a value of 100 allocates | |
118 | at most 1/100th of each zone to each per cpu page list. | |
119 | ||
120 | The batch value of each per cpu pagelist is also updated as a result. It is | |
121 | set to pcp->high/4. The upper limit of batch is (PAGE_SHIFT * 8). | |
122 | ||
123 | The initial value is zero. The kernel does not use this value at boot time to set | |
124 | the high water marks for each per cpu page list. | |
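
The high and batch arithmetic described above can be sketched like this.
The function name is hypothetical, PAGE_SHIFT is assumed to be 12 (4 KiB
pages), and the lower bound of 1 on batch is an assumption that the text
above does not state.

```python
def pcp_limits(zone_pages, fraction, page_shift=12):
    """Return (high, batch) for one per cpu page list.

    Sketch only: high is 1/fraction of the zone, batch is high/4 capped
    at PAGE_SHIFT * 8. page_shift=12 and the floor of 1 are assumptions.
    """
    if fraction < 8:
        raise ValueError("the minimum value for the fraction is 8")
    high = zone_pages // fraction
    batch = min(max(1, high // 4), page_shift * 8)
    return high, batch


# Hypothetical zone of 262144 pages (1 GiB with 4 KiB pages), fraction 100.
print(pcp_limits(262144, 100))
```
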
1743660b CL |
125 | |
126 | ============================================================== | |
127 | ||
128 | zone_reclaim_mode: | |
129 | ||
1b2ffb78 CL |
130 | Zone_reclaim_mode allows someone to set more or less aggressive approaches to |
131 | reclaim memory when a zone runs out of memory. If it is set to zero then no | |
132 | zone reclaim occurs. Allocations will be satisfied from other zones / nodes | |
133 | in the system. | |
134 | ||
135 | The value is a bitwise OR of the following flags: | |
136 | ||
137 | 1 = Zone reclaim on | |
138 | 2 = Zone reclaim writes dirty pages out | |
139 | 4 = Zone reclaim swaps pages | |
2a16e3f4 | 140 | 8 = Also do a global slab reclaim pass |
1b2ffb78 CL |
141 | |
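
Because the value is a combination of bits, a given setting can be decoded
with a small helper such as the one below. The names are hypothetical; the
bit values and descriptions are the ones listed above.

```python
ZONE_RECLAIM_FLAGS = {
    1: "Zone reclaim on",
    2: "Zone reclaim writes dirty pages out",
    4: "Zone reclaim swaps pages",
    8: "Also do a global slab reclaim pass",
}


def decode_zone_reclaim_mode(mode):
    """Return the descriptions of the bits set in mode, lowest bit first."""
    return [desc for bit, desc in sorted(ZONE_RECLAIM_FLAGS.items())
            if mode & bit]


# A value of 3 combines bit 1 and bit 2.
for line in decode_zone_reclaim_mode(3):
    print(line)
```
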
142 | zone_reclaim_mode is set during bootup to 1 if it is determined that pages | |
143 | from remote zones will cause a measurable performance reduction. The | |
1743660b | 144 | page allocator will then reclaim easily reusable pages (those page |
1b2ffb78 CL |
145 | cache pages that are currently not used) before allocating off node pages. |
146 | ||
147 | It may be beneficial to switch off zone reclaim if the system is | |
148 | used for a file server and all of memory should be used for caching files | |
149 | from disk. In that case the caching effect is more important than | |
150 | data locality. | |
151 | ||
152 | Allowing zone reclaim to write out pages stops processes that are | |
153 | writing large amounts of data from dirtying pages on other nodes. Zone | |
154 | reclaim will write out dirty pages if a zone fills up and so effectively | |
155 | throttle the process. This may decrease the performance of a single process | |
156 | since it cannot use all of system memory to buffer the outgoing writes | |
157 | anymore but it preserves the memory on other nodes so that the performance | |
158 | of other processes running on other nodes will not be affected. | |
1743660b | 159 | |
1b2ffb78 CL |
160 | Allowing regular swap effectively restricts allocations to the local |
161 | node unless explicitly overridden by memory policies or cpuset | |
162 | configurations. | |
1743660b | 163 | |
2a16e3f4 CL |
164 | It may be advisable to allow slab reclaim if the system makes heavy |
165 | use of files and builds up large slab caches. However, the slab | |
166 | shrink operation is global, may take a long time and free slabs | |
167 | in all nodes of the system. | |
168 | ||
fadd8fbd KH |
169 | ============================================================== |
170 | ||
171 | panic_on_oom: | |
172 | ||
173 | This enables or disables the panic on out-of-memory feature. If this is set to 1, | |
174 | the kernel panics when out-of-memory happens. If this is set to 0, the kernel | |
175 | invokes the oom_killer, which kills some rogue process. Usually, oom_killer can kill | |
176 | rogue processes and the system will survive. If you want to panic the system | |
177 | rather than kill rogue processes, set this to 1. | |
178 | ||
179 | The default value is 0. | |
180 |