Commit | Line | Data |
---|---|---|
2d6fff63 DH |
1 | =============================== |
2 | FS-CACHE NETWORK FILESYSTEM API | |
3 | =============================== | |
4 | ||
5 | There's an API by which a network filesystem can make use of the FS-Cache | |
6 | facilities. This is based around a number of principles: | |
7 | ||
8 | (1) Caches can store a number of different object types. There are two main | |
9 | object types: indices and files. The first is a special type used by | |
10 | FS-Cache to make finding objects faster and to make retiring of groups of | |
11 | objects easier. | |
12 | ||
13 | (2) Every index, file or other object is represented by a cookie. This cookie | |
14 | may or may not have anything associated with it, but the netfs doesn't | |
15 | need to care. | |
16 | ||
17 | (3) Barring the top-level index (one entry per cached netfs), the index | |
18 | hierarchy for each netfs is structured according to the whim of the netfs. | |
19 | ||
20 | This API is declared in <linux/fscache.h>. | |
21 | ||
22 | This document contains the following sections: | |
23 | ||
24 | (1) Network filesystem definition | |
25 | (2) Index definition | |
26 | (3) Object definition | |
27 | (4) Network filesystem (un)registration | |
28 | (5) Cache tag lookup | |
29 | (6) Index registration | |
30 | (7) Data file registration | |
31 | (8) Miscellaneous object registration | |
94d30ae9 | 32 | (9) Setting the data file size |
2d6fff63 DH |
33 | (10) Page alloc/read/write |
34 | (11) Page uncaching | |
da9803bc | 35 | (12) Index and data file consistency |
94d30ae9 DH |
36 | (13) Cookie enablement |
37 | (14) Miscellaneous cookie operations | |
38 | (15) Cookie unregistration | |
39 | (16) Index invalidation | |
40 | (17) Data file invalidation | |
41 | (18) FS-Cache specific page flags | |
2d6fff63 DH |
42 | |
43 | ||
44 | ============================= | |
45 | NETWORK FILESYSTEM DEFINITION | |
46 | ============================= | |
47 | ||
48 | FS-Cache needs a description of the network filesystem. This is specified | |
49 | using a record of the following structure: | |
50 | ||
51 | struct fscache_netfs { | |
52 | uint32_t version; | |
53 | const char *name; | |
54 | struct fscache_cookie *primary_index; | |
55 | ... | |
56 | }; | |
57 | ||
58 | The first two fields should be filled in before registration, and the third | |
59 | will be filled in by the registration function; any other fields should just be | |
60 | ignored and are for internal use only. | |
61 | ||
62 | The fields are: | |
63 | ||
64 | (1) The name of the netfs (used as the key in the toplevel index). | |
65 | ||
66 | (2) The version of the netfs (if the name matches but the version doesn't, the | |
67 | entire in-cache hierarchy for this netfs will be scrapped and begun | |
68 | afresh). | |
69 | ||
70 | (3) The cookie representing the primary index will be allocated according to | |
71 | another parameter passed into the registration function. | |
72 | ||
73 | For example, kAFS (linux/fs/afs/) uses the following definitions to describe | |
74 | itself: | |
75 | ||
76 | struct fscache_netfs afs_cache_netfs = { | |
77 | .version = 0, | |
78 | .name = "afs", | |
79 | }; | |
80 | ||
81 | ||
82 | ================ | |
83 | INDEX DEFINITION | |
84 | ================ | |
85 | ||
86 | Indices are used for two purposes: | |
87 | ||
88 | (1) To aid the finding of a file based on a series of keys (such as AFS's | |
89 | "cell", "volume ID", "vnode ID"). | |
90 | ||
91 | (2) To make it easier to discard a subset of all the files cached based around | |
92 | a particular key - for instance to mirror the removal of an AFS volume. | |
93 | ||
94 | However, since it's unlikely that any two netfs's are going to want to define | |
95 | their index hierarchies in quite the same way, FS-Cache tries to impose as few | |
96 | restraints as possible on how an index is structured and where it is placed in | |
97 | the tree. The netfs can even mix indices and data files at the same level, but | |
98 | it's not recommended. | |
99 | ||
25985edc | 100 | Each index entry consists of a key of indeterminate length plus some auxiliary |
2d6fff63 DH |
101 | data, also of indeterminate length. |
102 | ||
103 | There are some limits on indices: | |
104 | ||
105 | (1) Any index containing non-index objects should be restricted to a single | |
106 | cache. Any such objects created within an index will be created in the | |
107 | first cache only. The cache in which an index is created can be | |
108 | controlled by cache tags (see below). | |
109 | ||
110 | (2) The entry data must be atomically journallable, so it is limited to about | |
111 | 400 bytes at present. At least 400 bytes will be available. | |
112 | ||
113 | (3) The depth of the index tree should be judged with care as the search | |
114 | function is recursive. Too many layers will run the kernel out of stack. | |
115 | ||
116 | ||
117 | ================= | |
118 | OBJECT DEFINITION | |
119 | ================= | |
120 | ||
121 | To define an object, a structure of the following type should be filled out: | |
122 | ||
123 | struct fscache_cookie_def | |
124 | { | |
125 | uint8_t name[16]; | |
126 | uint8_t type; | |
127 | ||
128 | struct fscache_cache_tag *(*select_cache)( | |
129 | const void *parent_netfs_data, | |
130 | const void *cookie_netfs_data); | |
131 | ||
132 | uint16_t (*get_key)(const void *cookie_netfs_data, | |
133 | void *buffer, | |
134 | uint16_t bufmax); | |
135 | ||
136 | void (*get_attr)(const void *cookie_netfs_data, | |
137 | uint64_t *size); | |
138 | ||
139 | uint16_t (*get_aux)(const void *cookie_netfs_data, | |
140 | void *buffer, | |
141 | uint16_t bufmax); | |
142 | ||
143 | enum fscache_checkaux (*check_aux)(void *cookie_netfs_data, | |
144 | const void *data, | |
145 | uint16_t datalen); | |
146 | ||
147 | void (*get_context)(void *cookie_netfs_data, void *context); | |
148 | ||
149 | void (*put_context)(void *cookie_netfs_data, void *context); | |
150 | ||
151 | void (*mark_pages_cached)(void *cookie_netfs_data, | |
152 | struct address_space *mapping, | |
153 | struct pagevec *cached_pvec); | |
154 | ||
155 | void (*now_uncached)(void *cookie_netfs_data); | |
156 | }; | |
157 | ||
158 | This has the following fields: | |
159 | ||
160 | (1) The type of the object [mandatory]. | |
161 | ||
162 | This is one of the following values: | |
163 | ||
164 | (*) FSCACHE_COOKIE_TYPE_INDEX | |
165 | ||
166 | This defines an index, which is a special FS-Cache type. | |
167 | ||
168 | (*) FSCACHE_COOKIE_TYPE_DATAFILE | |
169 | ||
170 | This defines an ordinary data file. | |
171 | ||
172 | (*) Any other value between 2 and 255 | |
173 | ||
174 | This defines an extraordinary object such as an XATTR. | |
175 | ||
176 | (2) The name of the object type (NUL terminated unless all 16 chars are used) | |
177 | [optional]. | |
178 | ||
179 | (3) A function to select the cache in which to store an index [optional]. | |
180 | ||
181 | This function is invoked when an index needs to be instantiated in a cache | |
182 | during the instantiation of a non-index object. Only the immediate index | |
183 | parent for the non-index object will be queried. Any indices above that | |
184 | in the hierarchy may be stored in multiple caches. This function does not | |
185 | need to be supplied for any non-index object or any index that will only | |
186 | have index children. | |
187 | ||
188 | If this function is not supplied or if it returns NULL then the first | |
19f59460 | 189 | cache in the parent's list will be chosen, or failing that, the first |
2d6fff63 DH |
190 | cache in the master list. |
191 | ||
192 | (4) A function to retrieve an object's key from the netfs [mandatory]. | |
193 | ||
194 | This function will be called with the netfs data that was passed to the | |
195 | cookie acquisition function and the maximum length of key data that it may | |
196 | provide. It should write the required key data into the given buffer and | |
197 | return the quantity it wrote. | |
198 | ||
199 | (5) A function to retrieve attribute data from the netfs [optional]. | |
200 | ||
201 | This function will be called with the netfs data that was passed to the | |
202 | cookie acquisition function. It should return the size of the file if | |
203 | this is a data file. The size may be used to govern how much space must | |
204 | be reserved for this file in the cache. | |
205 | ||
206 | If the function is absent, a file size of 0 is assumed. | |
207 | ||
25985edc | 208 | (6) A function to retrieve auxiliary data from the netfs [optional]. |
2d6fff63 DH |
209 | |
210 | This function will be called with the netfs data that was passed to the | |
25985edc LDM |
211 | cookie acquisition function and the maximum length of auxiliary data that |
212 | it may provide. It should write the auxiliary data into the given buffer | |
2d6fff63 DH |
213 | and return the quantity it wrote. |
214 | ||
25985edc | 215 | If this function is absent, the auxiliary data length will be set to 0. |
2d6fff63 | 216 | |
25985edc | 217 | The length of the auxiliary data buffer may be dependent on the key |
2d6fff63 DH |
218 | length. A netfs mustn't rely on being able to provide more than 400 bytes |
219 | for both. | |
220 | ||
25985edc | 221 | (7) A function to check the auxiliary data [optional]. |
2d6fff63 DH |
222 | |
223 | This function will be called to check that a match found in the cache for | |
25985edc | 224 | this object is valid. For instance with AFS it could check the auxiliary |
2d6fff63 DH |
225 | data against the data version number returned by the server to determine |
226 | whether the index entry in a cache is still valid. | |
227 | ||
228 | If this function is absent, it will be assumed that matching objects in a | |
229 | cache are always valid. | |
230 | ||
231 | If present, the function should return one of the following values: | |
232 | ||
233 | (*) FSCACHE_CHECKAUX_OKAY - the entry is okay as is | |
234 | (*) FSCACHE_CHECKAUX_NEEDS_UPDATE - the entry requires update | |
235 | (*) FSCACHE_CHECKAUX_OBSOLETE - the entry should be deleted | |
236 | ||
25985edc | 237 | This function can also be used to extract data from the auxiliary data in |
2d6fff63 DH |
238 | the cache and copy it into the netfs's structures. |
239 | ||
240 | (8) A pair of functions to manage contexts for the completion callback | |
241 | [optional]. | |
242 | ||
243 | The cache read/write functions are passed a context which is then passed | |
244 | to the I/O completion callback function. To ensure this context remains | |
245 | valid until after the I/O completion is called, two functions may be | |
246 | provided: one to get an extra reference on the context, and one to drop a | |
247 | reference to it. | |
248 | ||
249 | If the context is not used or is a type of object that won't go out of | |
250 | scope, then these functions are not required. These functions are not | |
251 | required for indices as indices may not contain data. These functions may | |
252 | be called in interrupt context and so may not sleep. | |
253 | ||
254 | (9) A function to mark a page as retaining cache metadata [optional]. | |
255 | ||
256 | This is called by the cache to indicate that it is retaining in-memory | |
257 | information for this page and that the netfs should uncache the page when | |
258 | it has finished. This does not indicate whether there's data on the disk | |
259 | or not. Note that several pages at once may be presented for marking. | |
260 | ||
261 | The PG_fscache bit is set on the pages before this function would be | |
262 | called, so the function need not be provided if this is sufficient. | |
263 | ||
264 | This function is not required for indices as they're not permitted data. | |
265 | ||
266 | (10) A function to unmark all the pages retaining cache metadata [mandatory]. | |
267 | ||
268 | This is called by FS-Cache to indicate that a backing store is being | |
269 | unbound from a cookie and that all the marks on the pages should be | |
270 | cleared to prevent confusion. Note that the cache will have torn down all | |
271 | its tracking information so that the pages don't need to be explicitly | |
272 | uncached. | |
273 | ||
274 | This function is not required for indices as they're not permitted data. | |
275 | ||
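As a rough illustration of the key and auxiliary data callbacks described in
(4), (6) and (7) above, the following userspace sketch assumes a hypothetical
netfs object, struct example_vnode, that is keyed on a 32-bit vnode ID and
stores a 64-bit data version as its auxiliary data. The enum is a local
stand-in for the one in <linux/fscache.h> so that the sketch compiles outside
the kernel; a real netfs would use the kernel definitions. Returning 0 when
the buffer is too small is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the enum declared in <linux/fscache.h>. */
enum fscache_checkaux {
	FSCACHE_CHECKAUX_OKAY,
	FSCACHE_CHECKAUX_NEEDS_UPDATE,
	FSCACHE_CHECKAUX_OBSOLETE,
};

/* Hypothetical netfs object; not part of FS-Cache. */
struct example_vnode {
	uint32_t vnode_id;	/* used as the index key */
	uint64_t data_version;	/* used as the auxiliary data */
};

/* get_key(): copy the key into the buffer and return the length written. */
uint16_t example_get_key(const void *cookie_netfs_data,
			 void *buffer, uint16_t bufmax)
{
	const struct example_vnode *vnode = cookie_netfs_data;

	assert(buffer != NULL);
	if (bufmax < sizeof(vnode->vnode_id))
		return 0;	/* assumption: write nothing if it won't fit */
	memcpy(buffer, &vnode->vnode_id, sizeof(vnode->vnode_id));
	return sizeof(vnode->vnode_id);
}

/* get_aux(): copy the auxiliary data and return the length written. */
uint16_t example_get_aux(const void *cookie_netfs_data,
			 void *buffer, uint16_t bufmax)
{
	const struct example_vnode *vnode = cookie_netfs_data;

	assert(buffer != NULL);
	if (bufmax < sizeof(vnode->data_version))
		return 0;	/* assumption: write nothing if it won't fit */
	memcpy(buffer, &vnode->data_version, sizeof(vnode->data_version));
	return sizeof(vnode->data_version);
}

/* check_aux(): compare the cached data version with the current one. */
enum fscache_checkaux example_check_aux(void *cookie_netfs_data,
					const void *data, uint16_t datalen)
{
	const struct example_vnode *vnode = cookie_netfs_data;
	uint64_t cached_version;

	if (datalen != sizeof(cached_version))
		return FSCACHE_CHECKAUX_OBSOLETE;
	memcpy(&cached_version, data, sizeof(cached_version));
	return cached_version == vnode->data_version ?
		FSCACHE_CHECKAUX_OKAY : FSCACHE_CHECKAUX_OBSOLETE;
}
```

The key and auxiliary data here total 12 bytes, well inside the 400 bytes that
a netfs may rely on for both.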
276 | ||
277 | =================================== | |
278 | NETWORK FILESYSTEM (UN)REGISTRATION | |
279 | =================================== | |
280 | ||
281 | The first step is to declare the network filesystem to the cache. This also | |
282 | involves specifying the layout of the primary index (for AFS, this would be the | |
283 | "cell" level). | |
284 | ||
285 | The registration function is: | |
286 | ||
287 | int fscache_register_netfs(struct fscache_netfs *netfs); | |
288 | ||
289 | It just takes a pointer to the netfs definition. It returns 0 or an error as | |
290 | appropriate. | |
291 | ||
292 | For kAFS, registration is done as follows: | |
293 | ||
294 | ret = fscache_register_netfs(&afs_cache_netfs); | |
295 | ||
296 | The last step is, of course, unregistration: | |
297 | ||
298 | void fscache_unregister_netfs(struct fscache_netfs *netfs); | |
299 | ||
300 | ||
301 | ================ | |
302 | CACHE TAG LOOKUP | |
303 | ================ | |
304 | ||
305 | FS-Cache permits the use of more than one cache. To permit particular index | |
306 | subtrees to be bound to particular caches, the second step is to look up cache | |
307 | representation tags. This step is optional; it can be left entirely up to | |
308 | FS-Cache as to which cache should be used. The problem with doing that is that | |
309 | FS-Cache will always pick the first cache that was registered. | |
310 | ||
311 | To get the representation for a named tag: | |
312 | ||
313 | struct fscache_cache_tag *fscache_lookup_cache_tag(const char *name); | |
314 | ||
315 | This takes a text string as the name and returns a representation of a tag. It | |
316 | will never return an error. It may return a dummy tag, however, if it runs out | |
317 | of memory; this will inhibit caching with this tag. | |
318 | ||
319 | Any representation so obtained must be released by passing it to this function: | |
320 | ||
321 | void fscache_release_cache_tag(struct fscache_cache_tag *tag); | |
322 | ||
323 | The tag will be retrieved by FS-Cache when it calls the object definition | |
324 | operation select_cache(). | |
325 | ||
326 | ||
327 | ================== | |
328 | INDEX REGISTRATION | |
329 | ================== | |
330 | ||
331 | The third step is to inform FS-Cache about part of an index hierarchy that can | |
332 | be used to locate files. This is done by requesting a cookie for each index in | |
333 | the path to the file: | |
334 | ||
335 | struct fscache_cookie * | |
336 | fscache_acquire_cookie(struct fscache_cookie *parent, | |
337 | const struct fscache_cookie_def *def, | |
94d30ae9 DH |
338 | void *netfs_data, |
339 | bool enable); | |
2d6fff63 DH |
340 | |
341 | This function creates an index entry in the index represented by parent, | |
342 | filling in the index entry by calling the operations pointed to by def. | |
343 | ||
344 | Note that this function never returns an error - all errors are handled | |
345 | internally. It may, however, return NULL to indicate no cookie. It is quite | |
346 | acceptable to pass this token back to this function as the parent to another | |
347 | acquisition (or even to the relinquish cookie, read page and write page | |
348 | functions - see below). | |
349 | ||
350 | Note also that no indices are actually created in a cache until a non-index | |
351 | object needs to be created somewhere down the hierarchy. Furthermore, an index | |
352 | may be created in several different caches independently at different times. | |
353 | This is all handled transparently, and the netfs doesn't see any of it. | |
354 | ||
94d30ae9 DH |
355 | A cookie will be created in the disabled state if enable is false. A cookie |
356 | must be enabled to do anything with it. A disabled cookie can be enabled by | |
357 | calling fscache_enable_cookie() (see below). | |
358 | ||
2d6fff63 DH |
359 | For example, with AFS, a cell would be added to the primary index. This index |
360 | entry would have a dependent inode containing a volume location index for the | |
361 | volume mappings within this cell: | |
362 | ||
363 | cell->cache = | |
364 | fscache_acquire_cookie(afs_cache_netfs.primary_index, | |
365 | &afs_cell_cache_index_def, | |
94d30ae9 | 366 | cell, true); |
2d6fff63 DH |
367 | |
368 | Then when a volume location was accessed, it would be entered into the cell's | |
369 | index and an inode would be allocated that acts as a volume type and hash chain | |
370 | combination: | |
371 | ||
372 | vlocation->cache = | |
373 | fscache_acquire_cookie(cell->cache, | |
374 | &afs_vlocation_cache_index_def, | |
94d30ae9 | 375 | vlocation, true); |
2d6fff63 DH |
376 | |
377 | And then a particular flavour of volume (R/O for example) could be added to | |
378 | that index, creating another index for vnodes (AFS inode equivalents): | |
379 | ||
380 | volume->cache = | |
381 | fscache_acquire_cookie(vlocation->cache, | |
382 | &afs_volume_cache_index_def, | |
94d30ae9 | 383 | volume, true); |
2d6fff63 DH |
384 | |
385 | ||
386 | ====================== | |
387 | DATA FILE REGISTRATION | |
388 | ====================== | |
389 | ||
390 | The fourth step is to request a data file be created in the cache. This is | |
391 | identical to index cookie acquisition. The only difference is that the type in | |
392 | the object definition should be something other than index type. | |
393 | ||
394 | vnode->cache = | |
395 | fscache_acquire_cookie(volume->cache, | |
396 | &afs_vnode_cache_object_def, | |
94d30ae9 | 397 | vnode, true); |
2d6fff63 DH |
398 | |
399 | ||
400 | ================================= | |
401 | MISCELLANEOUS OBJECT REGISTRATION | |
402 | ================================= | |
403 | ||
404 | An optional step is to request an object of miscellaneous type be created in | |
405 | the cache. This is almost identical to index cookie acquisition. The only | |
406 | difference is that the type in the object definition should be something other | |
407 | than index type. Whilst the parent object could be an index, it's more likely | |
408 | it would be some other type of object such as a data file. | |
409 | ||
410 | xattr->cache = | |
411 | fscache_acquire_cookie(vnode->cache, | |
412 | &afs_xattr_cache_object_def, | |
94d30ae9 | 413 | xattr, true); |
2d6fff63 DH |
414 | |
415 | Miscellaneous objects might be used to store extended attributes or directory | |
416 | entries for example. | |
417 | ||
418 | ||
419 | ========================== | |
420 | SETTING THE DATA FILE SIZE | |
421 | ========================== | |
422 | ||
423 | The fifth step is to set the physical attributes of the file, such as its size. | |
424 | This doesn't automatically reserve any space in the cache, but permits the | |
425 | cache to adjust its metadata for data tracking appropriately: | |
426 | ||
427 | int fscache_attr_changed(struct fscache_cookie *cookie); | |
428 | ||
429 | The cache will return -ENOBUFS if there is no backing cache or if there is no | |
430 | space to allocate any extra metadata required in the cache. The attributes | |
431 | will be accessed with the get_attr() cookie definition operation. | |
432 | ||
433 | Note that attempts to read or write data pages in the cache over this size may | |
434 | be rebuffed with -ENOBUFS. | |
435 | ||
436 | This operation schedules an attribute adjustment to happen asynchronously at | |
437 | some point in the future, and as such, it may happen after the function returns | |
438 | to the caller. The attribute adjustment excludes read and write operations. | |
439 | ||
440 | ||
441 | ===================== | |
696f69b6 | 442 | PAGE ALLOC/READ/WRITE |
2d6fff63 DH |
443 | ===================== |
444 | ||
445 | And the sixth step is to store and retrieve pages in the cache. There are | |
446 | three functions that are used to do this. | |
447 | ||
448 | Note: | |
449 | ||
450 | (1) A page should not be re-read or re-allocated without uncaching it first. | |
451 | ||
452 | (2) A read or allocated page must be uncached when the netfs page is released | |
453 | from the pagecache. | |
454 | ||
455 | (3) A page should only be written to the cache if previously read or allocated. | |
456 | ||
457 | This permits the cache to maintain its page tracking in proper order. | |
458 | ||
459 | ||
460 | PAGE READ | |
461 | --------- | |
462 | ||
463 | Firstly, the netfs should ask FS-Cache to examine the caches and read the | |
464 | contents cached for a particular page of a particular file if present, or else | |
465 | allocate space to store the contents if not: | |
466 | ||
467 | typedef | |
468 | void (*fscache_rw_complete_t)(struct page *page, | |
469 | void *context, | |
470 | int error); | |
471 | ||
472 | int fscache_read_or_alloc_page(struct fscache_cookie *cookie, | |
473 | struct page *page, | |
474 | fscache_rw_complete_t end_io_func, | |
475 | void *context, | |
476 | gfp_t gfp); | |
477 | ||
478 | The cookie argument must specify a cookie for an object that isn't an index, | |
479 | the page specified will have the data loaded into it (and is also used to | |
480 | specify the page number), and the gfp argument is used to control how any | |
481 | memory allocations made are satisfied. | |
482 | ||
483 | If the cookie indicates the inode is not cached: | |
484 | ||
485 | (1) The function will return -ENOBUFS. | |
486 | ||
487 | Else if there's a copy of the page resident in the cache: | |
488 | ||
489 | (1) The mark_pages_cached() cookie operation will be called on that page. | |
490 | ||
491 | (2) The function will submit a request to read the data from the cache's | |
492 | backing device directly into the page specified. | |
493 | ||
494 | (3) The function will return 0. | |
495 | ||
496 | (4) When the read is complete, end_io_func() will be invoked with: | |
497 | ||
498 | (*) The netfs data supplied when the cookie was created. | |
499 | ||
500 | (*) The page descriptor. | |
501 | ||
502 | (*) The context argument passed to the above function. This will be | |
503 | maintained with the get_context/put_context functions mentioned above. | |
504 | ||
505 | (*) An argument that's 0 on success or negative for an error code. | |
506 | ||
507 | If an error occurs, it should be assumed that the page contains no usable | |
5a6f282a | 508 | data. fscache_readpages_cancel() may need to be called. |
2d6fff63 DH |
509 | |
510 | end_io_func() will be called in process context if the read results in | |
511 | an error, but it might be called in interrupt context if the read is | |
512 | successful. | |
513 | ||
514 | Otherwise, if there's not a copy available in cache, but the cache may be able | |
515 | to store the page: | |
516 | ||
517 | (1) The mark_pages_cached() cookie operation will be called on that page. | |
518 | ||
519 | (2) A block may be reserved in the cache and attached to the object at the | |
520 | appropriate place. | |
521 | ||
522 | (3) The function will return -ENODATA. | |
523 | ||
524 | This function may also return -ENOMEM or -EINTR, in which case it won't have | |
525 | read any data from the cache. | |
526 | ||
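The return values above suggest a small dispatch step in the caller. The
sketch below maps each documented return value of fscache_read_or_alloc_page()
to a follow-up action; the enum and function names are hypothetical, and the
choice of action for each case is one possible policy rather than something
FS-Cache mandates.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical follow-up actions; names are illustrative only. */
enum example_read_action {
	EXAMPLE_WAIT_FOR_CACHE,		/* read dispatched; end_io_func() will run */
	EXAMPLE_FETCH_THEN_WRITE,	/* block reserved; fetch, then write to cache */
	EXAMPLE_FETCH_ONLY,		/* object not cached; fetch from the server */
	EXAMPLE_ERROR,			/* nothing was read from the cache */
};

/* Map a return value of fscache_read_or_alloc_page() to an action. */
enum example_read_action example_classify_read(int ret)
{
	assert(ret <= 0);	/* documented returns are 0 or a negative errno */

	switch (ret) {
	case 0:
		return EXAMPLE_WAIT_FOR_CACHE;
	case -ENODATA:
		return EXAMPLE_FETCH_THEN_WRITE;
	case -ENOBUFS:
		return EXAMPLE_FETCH_ONLY;
	default:		/* -ENOMEM, -EINTR and anything else */
		return EXAMPLE_ERROR;
	}
}
```

In the -ENODATA case the page has been marked, so it must still be uncached
when the netfs page is released from the pagecache.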
527 | ||
528 | PAGE ALLOCATE | |
529 | ------------- | |
530 | ||
531 | Alternatively, if there's not expected to be any data in the cache for a page | |
532 | because the file has been extended, a block can simply be allocated instead: | |
533 | ||
534 | int fscache_alloc_page(struct fscache_cookie *cookie, | |
535 | struct page *page, | |
536 | gfp_t gfp); | |
537 | ||
538 | This is similar to the fscache_read_or_alloc_page() function, except that it | |
539 | never reads from the cache. It will return 0 if a block has been allocated, | |
540 | rather than -ENODATA as the other would. One or the other must be performed | |
541 | before writing to the cache. | |
542 | ||
543 | The mark_pages_cached() cookie operation will be called on the page if | |
544 | successful. | |
545 | ||
546 | ||
547 | PAGE WRITE | |
548 | ---------- | |
549 | ||
550 | Secondly, if the netfs changes the contents of the page (either due to an | |
551 | initial download or if a user performs a write), then the page should be | |
552 | written back to the cache: | |
553 | ||
554 | int fscache_write_page(struct fscache_cookie *cookie, | |
555 | struct page *page, | |
556 | gfp_t gfp); | |
557 | ||
558 | The cookie argument must specify a data file cookie, the page specified should | |
559 | contain the data to be written (and is also used to specify the page number), | |
560 | and the gfp argument is used to control how any memory allocations made are | |
561 | satisfied. | |
562 | ||
563 | The page must have first been read or allocated successfully and must not have | |
564 | been uncached before writing is performed. | |
565 | ||
566 | If the cookie indicates the inode is not cached then: | |
567 | ||
568 | (1) The function will return -ENOBUFS. | |
569 | ||
570 | Else if space can be allocated in the cache to hold this page: | |
571 | ||
572 | (1) PG_fscache_write will be set on the page. | |
573 | ||
574 | (2) The function will submit a request to write the data to the cache's | |
575 | device directly from the page specified. | |
576 | ||
577 | (3) The function will return 0. | |
578 | ||
579 | (4) When the write is complete PG_fscache_write is cleared on the page and | |
580 | anyone waiting for that bit will be woken up. | |
581 | ||
582 | Else if there's no space available in the cache, -ENOBUFS will be returned. It | |
583 | is also possible for the PG_fscache_write bit to be cleared when no write took | |
584 | place if unforeseen circumstances arose (such as a disk error). | |
585 | ||
586 | Writing takes place asynchronously. | |
587 | ||
588 | ||
589 | MULTIPLE PAGE READ | |
590 | ------------------ | |
591 | ||
592 | A facility is provided to read several pages at once, as requested by the | |
593 | readpages() address space operation: | |
594 | ||
595 | int fscache_read_or_alloc_pages(struct fscache_cookie *cookie, | |
596 | struct address_space *mapping, | |
597 | struct list_head *pages, | |
598 | int *nr_pages, | |
599 | fscache_rw_complete_t end_io_func, | |
600 | void *context, | |
601 | gfp_t gfp); | |
602 | ||
603 | This works in a similar way to fscache_read_or_alloc_page(), except: | |
604 | ||
605 | (1) Any page it can retrieve data for is removed from pages and nr_pages and | |
606 | dispatched for reading to the disk. Reads of adjacent pages on disk may | |
607 | be merged for greater efficiency. | |
608 | ||
609 | (2) The mark_pages_cached() cookie operation will be called on several pages | |
610 | at once if they're being read or allocated. | |
611 | ||
612 | (3) If there was a general error, then that error will be returned. | |
613 | ||
614 | Else if some pages couldn't be allocated or read, then -ENOBUFS will be | |
615 | returned. | |
616 | ||
617 | Else if some pages couldn't be read but were allocated, then -ENODATA will | |
618 | be returned. | |
619 | ||
620 | Otherwise, if all pages had reads dispatched, then 0 will be returned, the | |
621 | list will be empty and *nr_pages will be 0. | |
622 | ||
623 | (4) end_io_func will be called once for each page being read as the reads | |
624 | complete. It will be called in process context if error != 0, but it may | |
625 | be called in interrupt context if there is no error. | |
626 | ||
627 | Note that a return of -ENODATA, -ENOBUFS or any other error does not preclude | |
628 | some of the pages being read and some being allocated. Those pages will have | |
629 | been marked appropriately and will need uncaching. | |
630 | ||
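The precedence of return values listed in (3) can be modelled in a few lines.
This is only a sketch of the ordering described above, not the kernel's
implementation; struct example_pages_outcome is a hypothetical summary that
FS-Cache does not expose.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-call summary; not an FS-Cache structure. */
struct example_pages_outcome {
	int general_error;	/* 0, or a negative errno such as -ENOMEM */
	int nr_unhandled;	/* pages neither read nor allocated */
	int nr_allocated;	/* pages allocated but with no data to read */
};

/* Apply the precedence: general error, then -ENOBUFS, then -ENODATA. */
int example_pages_result(const struct example_pages_outcome *outcome)
{
	assert(outcome->general_error <= 0);

	if (outcome->general_error)
		return outcome->general_error;
	if (outcome->nr_unhandled > 0)
		return -ENOBUFS;
	if (outcome->nr_allocated > 0)
		return -ENODATA;
	return 0;		/* every page had a read dispatched */
}
```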
631 | ||
5a6f282a MT |
632 | CANCELLATION OF UNREAD PAGES |
633 | ---------------------------- | |
634 | ||
635 | If one or more pages are passed to fscache_read_or_alloc_pages() but not then | |
636 | read from the cache and also not read from the underlying filesystem then | |
637 | those pages will need to have any marks and reservations removed. This can be | |
638 | done by calling: | |
639 | ||
640 | void fscache_readpages_cancel(struct fscache_cookie *cookie, | |
641 | struct list_head *pages); | |
642 | ||
643 | prior to returning to the caller. The cookie argument should be as passed to | |
644 | fscache_read_or_alloc_pages(). Every page in the pages list will be examined | |
645 | and any that have PG_fscache set will be uncached. | |
646 | ||
647 | ||
2d6fff63 DH |
648 | ============== |
649 | PAGE UNCACHING | |
650 | ============== | |
651 | ||
652 | To uncache a page, this function should be called: | |
653 | ||
654 | void fscache_uncache_page(struct fscache_cookie *cookie, | |
655 | struct page *page); | |
656 | ||
657 | This function permits the cache to release any in-memory representation it | |
658 | might be holding for this netfs page. This function must be called once for | |
659 | each page on which the read or write page functions above have been called to | |
660 | make sure the cache's in-memory tracking information gets torn down. | |
661 | ||
662 | Note that pages can't be explicitly deleted from a data file. The whole | |
663 | data file must be retired (see the relinquish cookie function below). | |
664 | ||
665 | Furthermore, note that this does not cancel the asynchronous read or write | |
666 | operation started by the read/alloc and write functions, so the page | |
201a1542 | 667 | invalidation functions must use: |
2d6fff63 DH |
668 | |
669 | bool fscache_check_page_write(struct fscache_cookie *cookie, | |
670 | struct page *page); | |
671 | ||
672 | to see if a page is being written to the cache, and: | |
673 | ||
674 | void fscache_wait_on_page_write(struct fscache_cookie *cookie, | |
675 | struct page *page); | |
676 | ||
677 | to wait for it to finish if it is. | |
678 | ||
679 | ||
201a1542 DH |
680 | When releasepage() is being implemented, a special FS-Cache function exists to |
681 | manage the heuristics of coping with vmscan trying to eject pages, which may | |
682 | conflict with the cache trying to write pages to the cache (which may itself | |
683 | need to allocate memory): | |
684 | ||
685 | bool fscache_maybe_release_page(struct fscache_cookie *cookie, | |
686 | struct page *page, | |
687 | gfp_t gfp); | |
688 | ||
689 | This takes the netfs cookie, and the page and gfp arguments as supplied to | |
690 | releasepage(). It will return false if the page cannot be released yet for | |
691 | some reason and if it returns true, the page has been uncached and can now be | |
692 | released. | |
693 | ||
694 | To make a page available for release, this function may wait for an outstanding | |
695 | storage request to complete, or it may attempt to cancel the storage request - | |
696 | in which case the page will not be stored in the cache this time. | |
697 | ||
698 | ||
c902ce1b DH |
699 | BULK INODE PAGE UNCACHE |
700 | ----------------------- | |
701 | ||
702 | A convenience routine is provided to perform an uncache on all the pages | |
703 | attached to an inode. This assumes that the pages on the inode correspond on a | |
704 | 1:1 basis with the pages in the cache. | |
705 | ||
706 | void fscache_uncache_all_inode_pages(struct fscache_cookie *cookie, | |
707 | struct inode *inode); | |
708 | ||
709 | This takes the netfs cookie that the pages were cached with and the inode that | |
710 | the pages are attached to. This function will wait for pages to finish being | |
711 | written to the cache and for the cache to finish with the page generally. No | |
712 | error is returned. | |
713 | ||
714 | ||
da9803bc DH |
715 | =============================== |
716 | INDEX AND DATA FILE CONSISTENCY | |
717 | =============================== | |
718 | ||
719 | To find out whether auxiliary data for an object is up to date within the | |
720 | cache, the following function can be called: | |
721 | ||
722 | int fscache_check_consistency(struct fscache_cookie *cookie); | |
723 | ||
724 | This will call back to the netfs to check whether the auxiliary data associated | |
725 | with a cookie is correct. It returns 0 if it is and -ESTALE if it isn't; it | |
726 | may also return -ENOMEM and -ERESTARTSYS. | |
2d6fff63 DH |
727 | |
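One way a netfs might act on the result of the consistency check is sketched
below as a userspace model. The struct fields, the stubbed function bodies and
the mynetfs_revalidate_cache() name are hypothetical, and the policy of
invalidating the object on -ESTALE is an assumption rather than something this
document requires.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel cookie. */
struct fscache_cookie {
	int check_result;	/* value the stubbed check will return */
	bool invalidated;	/* set when the stubbed invalidate runs */
};

/* Stubs that model only the documented return values and effects. */
int fscache_check_consistency(struct fscache_cookie *cookie)
{
	return cookie->check_result;
}

void fscache_invalidate(struct fscache_cookie *cookie)
{
	cookie->invalidated = true;
}

/*
 * Hypothetical netfs helper: returns 0 if the cache may be used from
 * now on, or a negative error code passed through from the check.
 */
int mynetfs_revalidate_cache(struct fscache_cookie *cookie)
{
	int ret = fscache_check_consistency(cookie);

	if (ret == -ESTALE) {
		/* Assumed policy: discard the stale contents and carry on. */
		fscache_invalidate(cookie);
		return 0;
	}
	return ret;	/* 0, or an error such as -ENOMEM */
}
```
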
728 | To request an update of the index data for an index or other object, the | |
729 | following function should be called: | |
730 | ||
731 | void fscache_update_cookie(struct fscache_cookie *cookie); | |
732 | ||
733 | This function will refer back to the netfs_data pointer stored in the cookie by | |
734 | the acquisition function to obtain the data to write into each revised index | |
735 | entry. The update method in the parent index definition will be called to | |
736 | transfer the data. | |
737 | ||
738 | Note that partial updates may happen automatically at other times, such as when | |
739 | data blocks are added to a data file object. | |
740 | ||
741 | ||
94d30ae9 DH |
742 | ================= |
743 | COOKIE ENABLEMENT | |
744 | ================= | |
745 | ||
746 | Cookies exist in one of two states: enabled and disabled. If a cookie is | |
747 | disabled, it ignores all attempts to acquire child cookies; check, update or | |
748 | invalidate its state; allocate, read or write backing pages - though it is | |
749 | still possible to uncache pages and relinquish the cookie. | |
750 | ||
751 | The initial enablement state is set by fscache_acquire_cookie(), but the cookie | |
752 | can be enabled or disabled later. To disable a cookie, call: | |
753 | ||
754 | void fscache_disable_cookie(struct fscache_cookie *cookie, | |
755 | bool invalidate); | |
756 | ||
757 | If the cookie is not already disabled, this locks the cookie against other | |
758 | enable and disable ops, marks the cookie as being disabled, discards or | |
759 | invalidates any backing objects and waits for cessation of activity on any | |
760 | associated object before unlocking the cookie. | |
761 | ||
762 | All possible failures are handled internally. The caller should consider | |
763 | calling fscache_uncache_all_inode_pages() afterwards to make sure all page | |
764 | markings are cleared up. | |
765 | ||
766 | Cookies can be enabled or reenabled with: | |
767 | ||
768 | void fscache_enable_cookie(struct fscache_cookie *cookie, | |
769 | bool (*can_enable)(void *data), | |
770 | void *data); | |
771 | ||
772 | If the cookie is not already enabled, this locks the cookie against other | |
773 | enable and disable ops, invokes can_enable() and, if the cookie is not an index | |
774 | cookie, will begin the procedure of acquiring backing objects. | |
775 | ||
776 | The optional can_enable() function is passed the data argument and returns a | |
777 | ruling as to whether or not enablement should actually be permitted to begin. | |
778 | ||
779 | All possible failures are handled internally. The cookie will only be marked | |
780 | as enabled if provisional backing objects are allocated. | |
781 | ||
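The interplay between disabling, enabling and the can_enable() ruling can be
modelled in userspace as follows. This is only a sketch: the cookie fields,
the stub bodies, struct mynetfs_inode and the policy of disabling the cache
while a file is open for writing are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel cookie. */
struct fscache_cookie { bool enabled; bool invalidated; };

/* Stubs that model only the state changes described in this section. */
void fscache_disable_cookie(struct fscache_cookie *cookie, bool invalidate)
{
	if (!cookie->enabled)
		return;			/* already disabled */
	cookie->enabled = false;
	cookie->invalidated = invalidate;
}

void fscache_enable_cookie(struct fscache_cookie *cookie,
			   bool (*can_enable)(void *data), void *data)
{
	if (cookie->enabled)
		return;			/* already enabled */
	if (can_enable && !can_enable(data))
		return;			/* the netfs vetoed enablement */
	cookie->enabled = true;
}

/* Hypothetical netfs inode that tracks how many writers are active. */
struct mynetfs_inode {
	int writers;
	struct fscache_cookie cookie;
};

/* Ruling passed as can_enable(): only cache when nobody is writing. */
bool mynetfs_can_enable(void *data)
{
	struct mynetfs_inode *ni = data;

	return ni->writers == 0;
}

/* Hypothetical open hook implementing the assumed policy. */
void mynetfs_open(struct mynetfs_inode *ni, bool for_write)
{
	if (for_write) {
		ni->writers++;
		fscache_disable_cookie(&ni->cookie, true);
		/* A real netfs would now consider uncaching its pages. */
	} else {
		fscache_enable_cookie(&ni->cookie, mynetfs_can_enable, ni);
	}
}
```

In this model a read-only open after a writer has appeared leaves the cookie
disabled, because the ruling function refuses enablement.
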
782 | ||
2d6fff63 DH |
783 | =============================== |
784 | MISCELLANEOUS COOKIE OPERATIONS | |
785 | =============================== | |
786 | ||
787 | There are a number of operations that can be used to control cookies: | |
788 | ||
789 | (*) Cookie pinning: | |
790 | ||
791 | int fscache_pin_cookie(struct fscache_cookie *cookie); | |
792 | void fscache_unpin_cookie(struct fscache_cookie *cookie); | |
793 | ||
794 | These operations permit data cookies to be pinned into the cache and to | |
795 | have the pinning removed. They are not permitted on index cookies. | |
796 | ||
797 | The pinning function will return 0 if successful, -ENOBUFS if the cookie | |
798 | isn't backed by a cache, -EOPNOTSUPP if the cache doesn't support pinning, | |
799 | -ENOSPC if there isn't enough space to honour the operation, -ENOMEM or | |
800 | -EIO if there's any other problem. | |
801 | ||
802 | (*) Data space reservation: | |
803 | ||
804 | int fscache_reserve_space(struct fscache_cookie *cookie, loff_t size); | |
805 | ||
806 | This permits a netfs to request cache space be reserved to store up to the | |
807 | given amount of a file. It is permitted to ask for more than the current | |
808 | size of the file to allow for future file expansion. | |
809 | ||
810 | If size is given as zero then the reservation will be cancelled. | |
811 | ||
812 | The function will return 0 if successful, -ENOBUFS if the cookie isn't | |
813 | backed by a cache, -EOPNOTSUPP if the cache doesn't support reservations, | |
814 | -ENOSPC if there isn't enough space to honour the operation, -ENOMEM or | |
815 | -EIO if there's any other problem. | |
816 | ||
817 | Note that this doesn't pin an object in a cache; it can still be culled to | |
818 | make space if it's not in use. | |
819 | ||
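Since the pinning and reservation functions share the same set of return
values, a netfs might funnel both through one helper. The sketch below shows
one possible grouping; the enum, the mynetfs_classify() name and the decision
to treat -ENOBUFS and -EOPNOTSUPP alike are assumptions, not part of the
FS-Cache API.

```c
#include <errno.h>

/* Hypothetical outcome classes for a pin or reservation attempt. */
enum mynetfs_cache_outcome {
	MYNETFS_CACHE_OK,		/* the request was honoured */
	MYNETFS_CACHE_UNAVAILABLE,	/* no backing cache, or unsupported */
	MYNETFS_CACHE_NO_SPACE,		/* the cache is too full */
	MYNETFS_CACHE_FAILED,		/* -ENOMEM, -EIO or anything else */
};

/*
 * Map the documented return values of fscache_pin_cookie() and
 * fscache_reserve_space() onto the classes above.
 */
enum mynetfs_cache_outcome mynetfs_classify(int ret)
{
	switch (ret) {
	case 0:
		return MYNETFS_CACHE_OK;
	case -ENOBUFS:
	case -EOPNOTSUPP:
		return MYNETFS_CACHE_UNAVAILABLE;
	case -ENOSPC:
		return MYNETFS_CACHE_NO_SPACE;
	default:
		return MYNETFS_CACHE_FAILED;
	}
}
```
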
820 | ||
821 | ===================== | |
822 | COOKIE UNREGISTRATION | |
823 | ===================== | |
824 | ||
825 | To get rid of a cookie, this function should be called: | |
826 | ||
827 | void fscache_relinquish_cookie(struct fscache_cookie *cookie, | |
94d30ae9 | 828 | bool retire); |
2d6fff63 DH |
829 | |
830 | If retire is true, then the object will be marked for recycling, and all | |
831 | copies of it will be removed from all active caches in which it is present. | |
832 | Not only that but all child objects will also be retired. | |
833 | ||
834 | If retire is false, then the object may be available again when next the | |
835 | acquisition function is called. Retirement here will overrule the pinning on a | |
836 | cookie. | |
837 | ||
838 | One very important note - relinquish must NOT be called for a cookie unless all | |
839 | the cookies for "child" indices, objects and pages have been relinquished | |
840 | first. | |
841 | ||
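The ordering rule above can be illustrated with a small userspace model in
which an index cookie has a few data file cookies as children. Everything
except the prototype of fscache_relinquish_cookie() is hypothetical: the
cookie fields, the stub body that merely records an ordering violation, and
the mynetfs_volume structure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel cookie. */
struct fscache_cookie {
	struct fscache_cookie *parent;
	int live_children;
	bool relinquished;
	bool ordering_violated;	/* set if relinquished before its children */
};

/* Stub: records the call and flags a parent-before-child mistake. */
void fscache_relinquish_cookie(struct fscache_cookie *cookie, bool retire)
{
	(void)retire;
	if (cookie->live_children != 0)
		cookie->ordering_violated = true;
	cookie->relinquished = true;
	if (cookie->parent)
		cookie->parent->live_children--;
}

#define MYNETFS_MAX_FILES 4

/* Hypothetical netfs volume: one index cookie plus its file cookies. */
struct mynetfs_volume {
	struct fscache_cookie index;
	struct fscache_cookie files[MYNETFS_MAX_FILES];
	int nr_files;
};

/* Tear down children first, and only then the index they hang off. */
void mynetfs_put_volume(struct mynetfs_volume *vol, bool retire)
{
	for (int i = 0; i < vol->nr_files; i++)
		fscache_relinquish_cookie(&vol->files[i], retire);
	fscache_relinquish_cookie(&vol->index, retire);
}
```
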
842 | ||
ef778e7a DH |
843 | ================== |
844 | INDEX INVALIDATION | |
845 | ================== | |
846 | ||
847 | There is no direct way to invalidate an index subtree. To do this, the caller | |
848 | should relinquish and retire the cookie they have, and then acquire a new one. | |
849 | ||
850 | ||
851 | ====================== | |
852 | DATA FILE INVALIDATION | |
853 | ====================== | |
854 | ||
855 | Sometimes it will be necessary to invalidate an object that contains data. | |
856 | Typically this will be necessary when the server tells the netfs of a foreign | |
857 | change - at which point the netfs has to throw away all the state it had for an | |
858 | inode and reload from the server. | |
859 | ||
860 | To indicate that a cache object should be invalidated, the following function | |
861 | can be called: | |
862 | ||
863 | void fscache_invalidate(struct fscache_cookie *cookie); | |
864 | ||
865 | This can be called with spinlocks held as it defers the work to a thread pool. | |
866 | All extant storage, retrieval and attribute change ops at this point are | |
867 | cancelled and discarded. Some future operations will be rejected until the | |
868 | cache has had a chance to insert a barrier in the operations queue. After | |
869 | that, operations will be queued again behind the invalidation operation. | |
870 | ||
871 | The invalidation operation will perform an attribute change operation and an | |
872 | auxiliary data update operation as it is very likely these will have changed. | |
873 | ||
874 | Using the following function, the netfs can wait for the invalidation operation | |
875 | to have reached a point at which it can start submitting ordinary operations | |
876 | once again: | |
2d6fff63 | 877 | |
ef778e7a | 878 | void fscache_wait_on_invalidate(struct fscache_cookie *cookie); |
2d6fff63 DH |
879 | |
880 | ||
881 | =========================== | |
882 | FS-CACHE SPECIFIC PAGE FLAG | |
883 | =========================== | |
884 | ||
885 | FS-Cache makes use of a page flag, PG_private_2, for its own purpose. This is | |
886 | given the alternative name PG_fscache. | |
887 | ||
888 | PG_fscache is used to indicate that the page is known by the cache, and that | |
889 | the cache must be informed if the page is going to go away. It's an indication | |
890 | to the netfs that the cache has an interest in this page, where an interest may | |
891 | be a pointer to it, resources allocated or reserved for it, or I/O in progress | |
892 | upon it. | |
893 | ||
894 | The netfs can use this information in methods such as releasepage() to | |
895 | determine whether it needs to uncache a page or update it. | |
896 | ||
897 | Furthermore, if this bit is set, releasepage() and invalidatepage() operations | |
898 | will be called on a page to get rid of it, even if PG_private is not set. This | |
899 | allows caching to be attempted on a page before read_cache_pages() is called | |
900 | after fscache_read_or_alloc_pages(), as the former will try to release pages it | |
901 | was given under certain circumstances. | |
902 | ||
903 | This bit does not overlap with other bits such as PG_private. This means that FS-Cache | |
904 | can be used with a filesystem that uses the block buffering code. | |
905 | ||
906 | There are a number of operations defined on this flag: | |
907 | ||
908 | int PageFsCache(struct page *page); | |
909 | void SetPageFsCache(struct page *page); | |
910 | void ClearPageFsCache(struct page *page); | |
911 | int TestSetPageFsCache(struct page *page); | |
912 | int TestClearPageFsCache(struct page *page); | |
913 | ||
914 | These functions are, respectively, bit test, bit set, bit clear, bit | |
915 | test-and-set and bit test-and-clear operations on PG_fscache. |
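
The semantics of the five operations can be seen in the following userspace
model. It is a sketch only: the struct page layout and the bit number chosen
for PG_fscache are hypothetical, and unlike the kernel's bit operations this
model makes no attempt to be atomic.

```c
/* Hypothetical userspace stand-in for struct page. */
struct page { unsigned long flags; };

/* Hypothetical bit number; the kernel derives it from PG_private_2. */
enum { PG_fscache = 3 };

/* Bit test: non-zero if the cache has an interest in the page. */
int PageFsCache(struct page *page)
{
	return (int)((page->flags >> PG_fscache) & 1UL);
}

/* Bit set. */
void SetPageFsCache(struct page *page)
{
	page->flags |= 1UL << PG_fscache;
}

/* Bit clear. */
void ClearPageFsCache(struct page *page)
{
	page->flags &= ~(1UL << PG_fscache);
}

/* Bit test-and-set: returns the value the bit had beforehand. */
int TestSetPageFsCache(struct page *page)
{
	int old = PageFsCache(page);

	SetPageFsCache(page);
	return old;
}

/* Bit test-and-clear: returns the value the bit had beforehand. */
int TestClearPageFsCache(struct page *page)
{
	int old = PageFsCache(page);

	ClearPageFsCache(page);
	return old;
}
```
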