Commit | Line | Data |
---|---|---|
1da177e4 LT |
1 | Changes since 2.5.0: |
2 | ||
3eb43f68 | 3 | --- |
1da177e4 LT |
4 | [recommended] |
5 | ||
6 | New helpers: sb_bread(), sb_getblk(), sb_find_get_block(), set_bh(), | |
7 | sb_set_blocksize() and sb_min_blocksize(). | |
8 | ||
9 | Use them. | |
10 | ||
11 | (sb_find_get_block() replaces 2.4's get_hash_table()) | |
12 | ||
3eb43f68 | 13 | --- |
1da177e4 LT |
14 | [recommended] |
15 | ||
16 | New methods: ->alloc_inode() and ->destroy_inode(). | |
17 | ||
18 | Remove inode->u.foo_inode_i | |
19 | Declare | |
20 | struct foo_inode_info { | |
21 | /* fs-private stuff */ | |
22 | struct inode vfs_inode; | |
23 | }; | |
24 | static inline struct foo_inode_info *FOO_I(struct inode *inode) | |
25 | { | |
26 | return container_of(inode, struct foo_inode_info, vfs_inode); | |
27 | } | |
28 | ||
29 | Use FOO_I(inode) instead of &inode->u.foo_inode_i; | |
30 | ||
3eb43f68 | 31 | Add foo_alloc_inode() and foo_destroy_inode() - the former should allocate |
1da177e4 LT |
32 | foo_inode_info and return the address of ->vfs_inode, the latter should free |
33 | FOO_I(inode) (see in-tree filesystems for examples). | |
34 | ||
35 | Make them ->alloc_inode and ->destroy_inode in your super_operations. | |
36 | ||
12debc42 DH |
37 | Keep in mind that now you need explicit initialization of private data |
38 | typically between calling iget_locked() and unlocking the inode. | |
1da177e4 LT |
39 | |
40 | At some point that will become mandatory. | |
41 | ||
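The embedding scheme above can be sketched in userspace roughly as follows. This is only a model: struct inode here is a stand-in, container_of() is re-implemented locally, and foo_private, foo_alloc_inode() and foo_destroy_inode() are hypothetical names that drop the super_block argument and slab cache a real filesystem would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct inode (assumption). */
struct inode {
	unsigned long i_ino;
};

/* Same idea as the kernel's container_of(): step back from a pointer to
 * a member to the structure that embeds it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct foo_inode_info {
	int foo_private;        /* fs-private stuff (hypothetical field) */
	struct inode vfs_inode; /* generic inode embedded in the private one */
};

static inline struct foo_inode_info *FOO_I(struct inode *inode)
{
	return container_of(inode, struct foo_inode_info, vfs_inode);
}

/* Models ->alloc_inode(): allocate the container, hand out ->vfs_inode. */
struct inode *foo_alloc_inode(void)
{
	struct foo_inode_info *fi = calloc(1, sizeof(*fi));

	return fi ? &fi->vfs_inode : NULL;
}

/* Models ->destroy_inode(): free the container, not the embedded inode. */
void foo_destroy_inode(struct inode *inode)
{
	assert(inode != NULL); /* a NULL inode is a caller bug in this sketch */
	free(FOO_I(inode));
}
```

The point of the layout is that FOO_I() is pure pointer arithmetic, so the generic code can keep passing struct inode pointers around while the filesystem recovers its private data without an extra allocation or lookup.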
42 | --- | |
43 | [mandatory] | |
44 | ||
45 | Change of file_system_type method (->read_super to ->get_sb) | |
46 | ||
47 | ->read_super() is no more. Ditto for DECLARE_FSTYPE and DECLARE_FSTYPE_DEV. | |
48 | ||
49 | Turn your foo_read_super() into a function that would return 0 in case of | |
50 | success and negative number in case of error (-EINVAL unless you have more | |
51 | informative error value to report). Call it foo_fill_super(). Now declare | |
52 | ||
454e2398 DH |
53 | int foo_get_sb(struct file_system_type *fs_type, |
54 | int flags, const char *dev_name, void *data, struct vfsmount *mnt) | |
1da177e4 | 55 | { |
454e2398 DH |
56 | return get_sb_bdev(fs_type, flags, dev_name, data, foo_fill_super, |
57 | mnt); | |
1da177e4 LT |
58 | } |
59 | ||
60 | (or similar with s/bdev/nodev/ or s/bdev/single/, depending on the kind of | |
61 | filesystem). | |
62 | ||
63 | Replace DECLARE_FSTYPE... with explicit initializer and have ->get_sb set as | |
64 | foo_get_sb. | |
65 | ||
66 | --- | |
67 | [mandatory] | |
68 | ||
69 | Locking change: ->s_vfs_rename_sem is taken only by cross-directory renames. | |
70 | Most likely there is no need to change anything, but if you relied on | |
71 | global exclusion between renames for some internal purpose - you need to | |
72 | change your internal locking. Otherwise exclusion guarantees remain the | |
73 | same (i.e. parents and victim are locked, etc.). | |
74 | ||
75 | --- | |
76 | [informational] | |
77 | ||
78 | Now we have the exclusion between ->lookup() and directory removal (by | |
79 | ->rmdir() and ->rename()). If you used to need that exclusion and do | |
80 | it by internal locking (most of filesystems couldn't care less) - you | |
81 | can relax your locking. | |
82 | ||
83 | --- | |
84 | [mandatory] | |
85 | ||
86 | ->lookup(), ->truncate(), ->create(), ->unlink(), ->mknod(), ->mkdir(), | |
87 | ->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename() | |
88 | and ->readdir() are called without BKL now. Grab it on entry, drop upon return | |
89 | - that will guarantee the same locking you used to have. If your method or its | |
90 | parts do not need BKL - better yet, now you can shift lock_kernel() and | |
91 | unlock_kernel() so that they would protect exactly what needs to be | |
92 | protected. | |
93 | ||
94 | --- | |
95 | [mandatory] | |
96 | ||
34e5053f AB |
97 | BKL is also moved from around sb operations. BKL should have been shifted into |
98 | individual fs sb_op functions. If you don't need it, remove it. | |
1da177e4 LT |
99 | |
100 | --- | |
101 | [informational] | |
102 | ||
103 | check for ->link() target not being a directory is done by callers. Feel | |
104 | free to drop it... | |
105 | ||
106 | --- | |
107 | [informational] | |
108 | ||
c2b38989 | 109 | ->link() callers hold ->i_mutex on the object we are linking to. Some of your |
1da177e4 LT |
110 | problems might be over... |
111 | ||
112 | --- | |
113 | [mandatory] | |
114 | ||
115 | new file_system_type method - kill_sb(superblock). If you are converting | |
116 | an existing filesystem, set it according to ->fs_flags: | |
117 | FS_REQUIRES_DEV - kill_block_super | |
118 | FS_LITTER - kill_litter_super | |
119 | neither - kill_anon_super | |
120 | FS_LITTER is gone - just remove it from fs_flags. | |
121 | ||
122 | --- | |
123 | [mandatory] | |
124 | ||
125 | FS_SINGLE is gone (actually, that had happened back when ->get_sb() | |
126 | went in - and hadn't been documented ;-/). Just remove it from fs_flags | |
127 | (and see ->get_sb() entry for other actions). | |
128 | ||
129 | --- | |
130 | [mandatory] | |
131 | ||
c2b38989 JJS |
132 | ->setattr() is called without BKL now. Caller _always_ holds ->i_mutex, so |
133 | watch for ->i_mutex-grabbing code that might be used by your ->setattr(). | |
134 | Callers of notify_change() need ->i_mutex now. | |
1da177e4 LT |
135 | |
136 | --- | |
137 | [recommended] | |
138 | ||
139 | New super_block field "struct export_operations *s_export_op" for | |
140 | explicit support for exporting, e.g. via NFS. The structure is fully | |
141 | documented at its declaration in include/linux/fs.h, and in | |
dc7a0816 | 142 | Documentation/filesystems/nfs/Exporting. |
1da177e4 LT |
143 | |
144 | Briefly it allows for the definition of decode_fh and encode_fh operations | |
145 | to encode and decode filehandles, and allows the filesystem to use | |
146 | a standard helper function for decode_fh, and provide file-system specific | |
147 | support for this helper, particularly get_parent. | |
148 | ||
149 | It is planned that this will be required for exporting once the code | |
150 | settles down a bit. | |
151 | ||
152 | [mandatory] | |
153 | ||
154 | s_export_op is now required for exporting a filesystem. | |
155 | isofs, ext2, ext3, reiserfs, fat | |
156 | can be used as examples of very different filesystems. | |
157 | ||
158 | --- | |
159 | [mandatory] | |
160 | ||
161 | iget4() and the read_inode2 callback have been superseded by iget5_locked() | |
162 | which has the following prototype, | |
163 | ||
164 | struct inode *iget5_locked(struct super_block *sb, unsigned long ino, | |
165 | int (*test)(struct inode *, void *), | |
166 | int (*set)(struct inode *, void *), | |
167 | void *data); | |
168 | ||
169 | 'test' is an additional function that can be used when the inode | |
170 | number is not sufficient to identify the actual file object. 'set' | |
171 | should be a non-blocking function that initializes those parts of a | |
172 | newly created inode to allow the test function to succeed. 'data' is | |
173 | passed as an opaque value to both test and set functions. | |
174 | ||
12debc42 DH |
175 | When the inode has been created by iget5_locked(), it will be returned with the |
176 | I_NEW flag set and will still be locked. The filesystem then needs to finalize | |
177 | the initialization. Once the inode is initialized it must be unlocked by | |
178 | calling unlock_new_inode(). | |
1da177e4 LT |
179 | |
180 | The filesystem is responsible for setting (and possibly testing) i_ino | |
181 | when appropriate. There is also a simpler iget_locked function that | |
182 | just takes the superblock and inode number as arguments and does the | |
183 | test and set for you. | |
184 | ||
185 | e.g. | |
b46980fe DH |
186 | inode = iget_locked(sb, ino); |
187 | if (inode->i_state & I_NEW) { | |
188 | err = read_inode_from_disk(inode); | |
189 | if (err < 0) { | |
190 | iget_failed(inode); | |
191 | return err; | |
192 | } | |
193 | unlock_new_inode(inode); | |
194 | } | |
195 | ||
196 | Note that if the process of setting up a new inode fails, then iget_failed() | |
197 | should be called on the inode to render it dead, and an appropriate error | |
198 | should be passed back to the caller. | |
1da177e4 LT |
199 | |
200 | --- | |
201 | [recommended] | |
202 | ||
203 | ->getattr() finally getting used. See instances in nfs, minix, etc. | |
204 | ||
205 | --- | |
206 | [mandatory] | |
207 | ||
208 | ->revalidate() is gone. If your filesystem had it - provide ->getattr() | |
209 | and let it call whatever you had as ->revalidate() + (for symlinks that | |
210 | had ->revalidate()) add calls in ->follow_link()/->readlink(). | |
211 | ||
212 | --- | |
213 | [mandatory] | |
214 | ||
215 | ->d_parent changes are not protected by BKL anymore. Read access is safe | |
216 | if at least one of the following is true: | |
217 | * filesystem has no cross-directory rename() | |
1da177e4 LT |
218 | * we know that parent had been locked (e.g. we are looking at |
219 | ->d_parent of ->lookup() argument). | |
220 | * we are called from ->rename(). | |
221 | * the child's ->d_lock is held | |
222 | Audit your code and add locking if needed. Notice that any place that is | |
223 | not protected by the conditions above is risky even in the old tree - you | |
224 | had been relying on BKL and that's prone to screwups. Old tree had quite | |
225 | a few holes of that kind - unprotected access to ->d_parent leading to | |
226 | anything from oops to silent memory corruption. | |
227 | ||
228 | --- | |
229 | [mandatory] | |
230 | ||
231 | FS_NOMOUNT is gone. If you use it - just set MS_NOUSER in flags | |
232 | (see rootfs for one kind of solution and bdev/socket/pipe for another). | |
233 | ||
234 | --- | |
235 | [recommended] | |
236 | ||
237 | Use bdev_read_only(bdev) instead of is_read_only(kdev). The latter | |
238 | is still alive, but only because of the mess in drivers/s390/block/dasd.c. | |
239 | As soon as it gets fixed is_read_only() will die. | |
240 | ||
241 | --- | |
242 | [mandatory] | |
243 | ||
244 | ->permission() is called without BKL now. Grab it on entry, drop upon | |
245 | return - that will guarantee the same locking you used to have. If | |
246 | your method or its parts do not need BKL - better yet, now you can | |
247 | shift lock_kernel() and unlock_kernel() so that they would protect | |
248 | exactly what needs to be protected. | |
249 | ||
250 | --- | |
251 | [mandatory] | |
252 | ||
253 | ->statfs() is now called without BKL held. BKL should have been | |
254 | shifted into individual fs sb_op functions where it's not clear that | |
255 | it's safe to remove it. If you don't need it, remove it. | |
256 | ||
257 | --- | |
258 | [mandatory] | |
259 | ||
260 | is_read_only() is gone; use bdev_read_only() instead. | |
261 | ||
262 | --- | |
263 | [mandatory] | |
264 | ||
265 | destroy_buffers() is gone; use invalidate_bdev(). | |
266 | ||
267 | --- | |
268 | [mandatory] | |
269 | ||
270 | fsync_dev() is gone; use fsync_bdev(). NOTE: lvm breakage is | |
271 | deliberate; as soon as struct block_device * is propagated in a reasonable | |
272 | way by that code fixing will become trivial; until then nothing can be | |
273 | done. | |
1e231735 CH |
274 | |
275 | [mandatory] | |
276 | ||
277 | block truncation on error exit from ->write_begin, and ->direct_IO | |
278 | moved from generic methods (block_write_begin, cont_write_begin, | |
279 | nobh_write_begin, blockdev_direct_IO*) to callers. Take a look at | |
280 | ext2_write_failed and callers for an example. | |
281 | ||
282 | [mandatory] | |
283 | ||
b9f61c3c | 284 | ->truncate is gone. The whole truncate sequence needs to be |
1e231735 CH |
285 | implemented in ->setattr, which is now mandatory for filesystems |
286 | implementing on-disk size changes. Start with a copy of the old inode_setattr | |
287 | and vmtruncate, and then reorder the vmtruncate + foofs_vmtruncate sequence to | |
288 | be in order of zeroing blocks using block_truncate_page or similar helpers, | |
289 | size update and finally on-disk truncation, which should not fail. | |
290 | inode_change_ok now includes the size checks for ATTR_SIZE and must be called | |
291 | in the beginning of ->setattr unconditionally. | |
336fb3b9 AV |
292 | |
293 | [mandatory] | |
294 | ||
295 | ->clear_inode() and ->delete_inode() are gone; ->evict_inode() should | |
296 | be used instead. It gets called whenever the inode is evicted, whether it has | |
297 | remaining links or not. Caller does *not* evict the pagecache or inode-associated | |
91b0abe3 JW |
298 | metadata buffers; the method has to use truncate_inode_pages_final() to get rid |
299 | of those. Caller makes sure async writeback cannot be running for the inode while | |
300 | (or after) ->evict_inode() is called. | |
f283c86a DC |
301 | |
302 | ->drop_inode() returns int now; it's called on final iput() with | |
303 | inode->i_lock held and it returns true if the filesystem wants the inode to be | |
304 | dropped. As before, generic_drop_inode() is still the default and it's been | |
305 | updated appropriately. generic_delete_inode() is also alive and it consists | |
306 | simply of return 1. Note that all actual eviction work is done by caller after | |
307 | ->drop_inode() returns. | |
308 | ||
dbd5768f JK |
309 | As before, clear_inode() must be called exactly once on each call of |
310 | ->evict_inode() (as it used to be for each call of ->delete_inode()). Unlike | |
311 | before, if you are using inode-associated metadata buffers (i.e. | |
312 | mark_buffer_dirty_inode()), it's your responsibility to call | |
313 | invalidate_inode_buffers() before clear_inode(). | |
336fb3b9 AV |
314 | |
315 | NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out | |
316 | if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput() | |
317 | may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly | |
318 | free the on-disk inode, you may end up doing that while ->write_inode() is writing | |
319 | to it. | |
fe15ce44 NP |
320 | |
321 | --- | |
322 | [mandatory] | |
323 | ||
324 | .d_delete() now only advises the dcache as to whether or not to cache | |
325 | unreferenced dentries, and is now only called when the dentry refcount goes to | |
326 | 0. Even on 0 refcount transition, it must be able to tolerate being called 0, | |
327 | 1, or more times (e.g. constant, idempotent). | |
621e155a NP |
328 | |
329 | --- | |
330 | [mandatory] | |
331 | ||
332 | .d_compare() calling convention and locking rules are significantly | |
333 | changed. Read updated documentation in Documentation/filesystems/vfs.txt (and | |
334 | look at examples of other filesystems) for guidance. | |
b1e6a015 NP |
335 | |
336 | --- | |
337 | [mandatory] | |
338 | ||
339 | .d_hash() calling convention and locking rules are significantly | |
340 | changed. Read updated documentation in Documentation/filesystems/vfs.txt (and | |
341 | look at examples of other filesystems) for guidance. | |
b5c84bf6 NP |
342 | |
343 | --- | |
344 | [mandatory] | |
345 | dcache_lock is gone, replaced by fine grained locks. See fs/dcache.c | |
346 | for details of what locks to replace dcache_lock with in order to protect | |
347 | particular things. Most of the time, a filesystem only needs ->d_lock, which | |
348 | protects *all* the dcache state of a given dentry. | |
fa0d7e3d NP |
349 | |
350 | -- | |
351 | [mandatory] | |
352 | ||
353 | Filesystems must RCU-free their inodes, if they can have been accessed | |
354 | via rcu-walk path walk (basically, if the file can have had a path name in the | |
355 | vfs namespace). | |
356 | ||
049b3c10 AV |
357 | Even though i_dentry and i_rcu share storage in a union, we will |
358 | initialize the former in inode_init_always(), so just leave it alone in | |
359 | the callback. It used to be necessary to clean it there, but not anymore | |
360 | (starting at 3.2). | |
34286d66 NP |
361 | |
362 | -- | |
363 | [recommended] | |
364 | vfs now tries to do path walking in "rcu-walk mode", which avoids | |
365 | atomic operations and scalability hazards on dentries and inodes (see | |
a82416da NP |
366 | Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes |
367 | (above) are examples of the changes required to support this. For more complex | |
34286d66 NP |
368 | filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so |
369 | no changes are required to the filesystem. However, this is costly and loses | |
370 | the benefits of rcu-walk mode. We will begin to add filesystem callbacks that | |
371 | are rcu-walk aware, shown below. Filesystems should take advantage of this | |
372 | where possible. | |
373 | ||
374 | -- | |
375 | [mandatory] | |
376 | d_revalidate is a callback that is made on every path element (if | |
377 | the filesystem provides it), which requires dropping out of rcu-walk mode. This | |
378 | may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD should be | |
379 | returned if the filesystem cannot handle rcu-walk. See | |
b74c79e9 NP |
380 | Documentation/filesystems/vfs.txt for more details. |
381 | ||
47016077 AG |
382 | permission is an inode permission check that is called on many or all |
383 | directory inodes on the way down a path walk (to check for exec permission). It | |
384 | must now be rcu-walk aware (mask & MAY_NOT_BLOCK). See | |
385 | Documentation/filesystems/vfs.txt for more details. | |
92424157 JB |
386 | |
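A rough userspace model of the rcu-walk rule for both callbacks might look like the following. The flag values are illustrative copies defined locally (the real ones live in the kernel headers), and struct foo_dentry_state, foo_d_revalidate() and foo_permission() are hypothetical, simplified stand-ins for the real method signatures.

```c
#include <errno.h>

/* Illustrative flag values; the authoritative definitions are in the
 * kernel's include/linux/namei.h and include/linux/fs.h. */
#define LOOKUP_RCU    0x0040
#define MAY_EXEC      0x0001
#define MAY_NOT_BLOCK 0x0080

/* Hypothetical per-dentry state for the sketch. */
struct foo_dentry_state {
	int valid;      /* result a full revalidation would give */
	int must_block; /* non-zero if revalidation has to sleep */
};

/* Models ->d_revalidate(): bail out with -ECHILD when asked to work in
 * rcu-walk mode but the check would have to block. */
int foo_d_revalidate(const struct foo_dentry_state *st, unsigned int flags)
{
	if (st->must_block && (flags & LOOKUP_RCU))
		return -ECHILD; /* the VFS retries in ref-walk mode */

	return st->valid;
}

/* Models ->permission(): same idea, keyed off MAY_NOT_BLOCK in mask. */
int foo_permission(int check_must_block, int mask)
{
	if (check_must_block && (mask & MAY_NOT_BLOCK))
		return -ECHILD;

	return 0; /* access granted in this simplified model */
}
```

Returning -ECHILD is not a failure of the lookup; it only tells the caller to drop out of rcu-walk mode and call the method again where blocking is allowed.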
387 | -- | |
388 | [mandatory] | |
389 | In ->fallocate() you must check the mode option passed in. If your | |
390 | filesystem does not support hole punching (deallocating space in the middle of a | |
391 | file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode. | |
392 | Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set, | |
393 | so the i_size should not change when hole punching, even when punching the end of | |
394 | a file off. | |
1a102ff9 AV |
395 | |
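The mode check described above could be sketched like this for a filesystem without hole punching. The flag values are defined locally so the example builds without kernel headers, and foo_fallocate_check_mode() is a hypothetical helper, not an existing kernel function.

```c
#include <errno.h>

/* Local copies of the fallocate flag bits, guarded in case the system
 * headers already provide them. */
#ifndef FALLOC_FL_KEEP_SIZE
#define FALLOC_FL_KEEP_SIZE 0x01
#endif
#ifndef FALLOC_FL_PUNCH_HOLE
#define FALLOC_FL_PUNCH_HOLE 0x02
#endif

/*
 * Hypothetical mode check for a filesystem that cannot punch holes.
 * Returns 0 if the mode is acceptable, or a negative errno.
 */
int foo_fallocate_check_mode(int mode)
{
	/* No hole punching support: refuse the request up front. */
	if (mode & FALLOC_FL_PUNCH_HOLE)
		return -EOPNOTSUPP;

	/* Reject any other flag this sketch does not know about. */
	if (mode & ~FALLOC_FL_KEEP_SIZE)
		return -EOPNOTSUPP;

	return 0;
}
```

A real ->fallocate() would run a check of this kind before touching any on-disk state, so that an unsupported request fails without side effects.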
396 | -- | |
397 | [mandatory] | |
398 | ->get_sb() is gone. Switch to use of ->mount(). Typically it's just | |
399 | a matter of switching from calling get_sb_... to mount_... and changing the | |
400 | function type. If you were doing it manually, just switch from setting ->mnt_root | |
401 | to some pointer to returning that pointer. On errors return ERR_PTR(...). | |
76fe3276 AV |
402 | |
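The "return the pointer, or ERR_PTR(...) on error" convention can be modelled in userspace as below. ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented here along the lines of the kernel helpers; struct dentry is a stand-in and foo_mount() is a hypothetical, heavily simplified ->mount() instance that takes only a device name.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Stand-in for the kernel's struct dentry (assumption). */
struct dentry {
	int dummy;
};

static struct dentry foo_root;

/* Models a ->mount() instance: return the root dentry on success and an
 * encoded error otherwise, instead of filling in ->mnt_root. */
struct dentry *foo_mount(const char *dev_name)
{
	if (!dev_name || !*dev_name)
		return ERR_PTR(-EINVAL);

	return &foo_root;
}
```

Callers then test the result with IS_ERR() and recover the error code with PTR_ERR(), so a single return value carries both the success and the failure case.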
403 | -- | |
404 | [mandatory] | |
4e34e719 | 405 | ->permission() and generic_permission() have lost flags |
76fe3276 | 406 | argument; instead of passing IPERM_FLAG_RCU we add MAY_NOT_BLOCK into mask. |
4e34e719 CH |
407 | generic_permission() has also lost the check_acl argument; ACL checking |
408 | has been taken to VFS and filesystems need to provide a non-NULL ->i_op->get_acl | |
409 | to read an ACL from disk. | |
982d8165 JB |
410 | |
411 | -- | |
412 | [mandatory] | |
413 | If you implement your own ->llseek() you must handle SEEK_HOLE and | |
414 | SEEK_DATA. You can handle this by returning -EINVAL, but it would be nicer to | |
415 | support it in some way. The generic handler assumes that the entire file is | |
416 | data and there is a virtual hole at the end of the file. So if the provided | |
417 | offset is less than i_size and SEEK_DATA is specified, return the same offset. | |
418 | If the above is true for the offset and you are given SEEK_HOLE, return the end | |
419 | of the file. If the offset is i_size or greater return -ENXIO in either case. | |
02c24a82 JB |
420 | |
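The generic SEEK_HOLE/SEEK_DATA behaviour described above (whole file is data, one virtual hole at the end) can be written down as a small pure function. foo_seek_hole_data() is a hypothetical helper that takes i_size as a plain argument; treating negative offsets like out-of-range ones is a choice made for this sketch.

```c
#include <errno.h>

/* Values used by Linux; guarded in case the libc headers define them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#endif
#ifndef SEEK_HOLE
#define SEEK_HOLE 4
#endif

/*
 * Returns the resulting file position, or a negative errno.
 * Only SEEK_DATA and SEEK_HOLE are handled; everything else is -EINVAL.
 */
long long foo_seek_hole_data(long long offset, long long i_size, int whence)
{
	if (whence != SEEK_DATA && whence != SEEK_HOLE)
		return -EINVAL;

	/* At or past the end of the file there is neither data nor a hole. */
	if (offset < 0 || offset >= i_size)
		return -ENXIO;

	/* The whole file is data, so the offset itself is the next data... */
	if (whence == SEEK_DATA)
		return offset;

	/* ...and the only hole is the virtual one at the end of the file. */
	return i_size;
}
```

A filesystem that tracks real holes would replace the last two branches with a lookup in its extent map, but the error cases would stay the same.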
421 | [mandatory] | |
422 | If you have your own ->fsync() you must make sure to call | |
423 | filemap_write_and_wait_range() so that all dirty pages are synced out properly. | |
424 | You must also keep in mind that ->fsync() is not called with i_mutex held | |
425 | anymore, so if you require i_mutex locking you must make sure to take it and | |
426 | release it yourself. | |
32991ab3 AV |
427 | |
428 | -- | |
429 | [mandatory] | |
430 | d_alloc_root() is gone, along with a lot of bugs caused by code | |
431 | misusing it. Replacement: d_make_root(inode). The difference is, | |
432 | d_make_root() drops the reference to inode if dentry allocation fails. | |
0b728e19 AV |
433 | |
434 | -- | |
435 | [mandatory] | |
00cd8dd3 AV |
436 | The witch is dead! Well, 2/3 of it, anyway. ->d_revalidate() and |
437 | ->lookup() do *not* take struct nameidata anymore; just the flags. | |
ebfc3b49 AV |
438 | -- |
439 | [mandatory] | |
440 | ->create() doesn't take struct nameidata *; unlike the previous | |
441 | two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that | |
442 | local filesystems can ignore that argument - they are guaranteed that the | |
443 | object doesn't exist. It's remote/distributed ones that might care... | |
ecf3d1f1 JL |
444 | -- |
445 | [mandatory] | |
446 | FS_REVAL_DOT is gone; if you used to have it, add ->d_weak_revalidate() | |
447 | in your dentry operations instead. | |
5c0ba4e0 AV |
448 | -- |
449 | [mandatory] | |
450 | vfs_readdir() is gone; switch to iterate_dir() instead | |
2233f31a AV |
451 | -- |
452 | [mandatory] | |
453 | ->readdir() is gone now; switch to ->iterate() | |
4aa32895 CH |
454 | [mandatory] |
455 | vfs_follow_link has been removed. Filesystems must use nd_set_link | |
456 | from ->follow_link for normal symlinks, or nd_jump_link for magic | |
457 | /proc/<pid> style links. | |
5a3cd992 AV |
458 | -- |
459 | [mandatory] | |
460 | iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be | |
461 | called with both ->i_lock and inode_hash_lock held; the former is *not* | |
462 | taken anymore, so verify that your callbacks do not rely on it (none | |
463 | of the in-tree instances did). inode_hash_lock is still held, | |
464 | of course, so they are still serialized wrt removal from inode hash, | |
465 | as well as wrt set() callback of iget5_locked(). | |
41d28bca AV |
466 | -- |
467 | [mandatory] | |
468 | d_materialise_unique() is gone; d_splice_alias() does everything you | |
469 | need now. Remember that they have opposite orders of arguments ;-/ | |
78d28e65 AV |
470 | -- |
471 | [mandatory] | |
472 | f_dentry is gone; use f_path.dentry, or, better yet, see if you can avoid | |
473 | it entirely. | |
5d5d5689 AV |
474 | -- |
475 | [mandatory] | |
476 | never call ->read() and ->write() directly; use __vfs_{read,write} or | |
477 | wrappers; instead of checking for ->write or ->read being NULL, look for | |
478 | FMODE_CAN_{WRITE,READ} in file->f_mode. | |
479 | -- | |
480 | [mandatory] | |
481 | do _not_ use new_sync_{read,write} for ->read/->write; leave it NULL | |
482 | instead. | |
84363182 AV |
483 | -- |
484 | [mandatory] | |
485 | ->aio_read/->aio_write are gone. Use ->read_iter/->write_iter. | |
203bc643 AV |
486 | --- |
487 | [recommended] | |
488 | for embedded ("fast") symlinks just set inode->i_link to wherever the | |
489 | symlink body is and use simple_follow_link() as ->follow_link(). | |
490 | -- | |
491 | [mandatory] | |
492 | calling conventions for ->follow_link() have changed. Instead of returning | |
493 | cookie and using nd_set_link() to store the body to traverse, we return | |
494 | the body to traverse and store the cookie using explicit void ** argument. | |
495 | nameidata isn't passed at all - nd_jump_link() doesn't need it and | |
496 | nd_[gs]et_link() is gone. | |
497 | -- | |
498 | [mandatory] | |
499 | calling conventions for ->put_link() have changed. It gets inode instead of | |
500 | dentry, it does not get nameidata at all and it gets called only when cookie | |
501 | is non-NULL. Note that link body isn't available anymore, so if you need it, | |
502 | store it as cookie. | |
8a81252b ED |
503 | -- |
504 | [mandatory] | |
505 | __fd_install() & fd_install() can now sleep. Callers should not | |
506 | hold a spinlock or other resources that do not allow a schedule. | |
21fc61c7 AV |
507 | -- |
508 | [mandatory] | |
509 | any symlink that might use page_follow_link_light/page_put_link() must | |
510 | have inode_nohighmem(inode) called before anything might start playing with | |
e8ecde25 AV |
511 | its pagecache. No highmem pages should end up in the pagecache of such |
512 | symlinks. That includes any preseeding that might be done during symlink | |
513 | creation. __page_symlink() will honour the mapping gfp flags, so once | |
514 | you've done inode_nohighmem() it's safe to use, but if you allocate and | |
515 | insert the page manually, make sure to use the right gfp flags. | |
6b255391 AV |
516 | -- |
517 | [mandatory] | |
518 | ->follow_link() is replaced with ->get_link(); same API, except that | |
519 | * ->get_link() gets inode as a separate argument | |
520 | * ->get_link() may be called in RCU mode - in that case NULL | |
521 | dentry is passed | |
fceef393 AV |
522 | -- |
523 | [mandatory] | |
524 | ->get_link() gets struct delayed_call *done now, and should do | |
525 | set_delayed_call() where it used to set *cookie. | |
526 | ->put_link() is gone - just give the destructor to set_delayed_call() | |
527 | in ->get_link(). | |
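A userspace model of the delayed_call pattern might look like the following. The struct and the two helpers are re-implemented here to mirror the kernel ones; foo_get_link(), foo_free_link() and foo_links_freed are hypothetical, and the sketch returns NULL on allocation failure where a kernel instance would return ERR_PTR(-ENOMEM).

```c
#include <stdlib.h>
#include <string.h>

/* A destructor plus its argument, to be run once the caller is done. */
struct delayed_call {
	void (*fn)(void *);
	void *arg;
};

static inline void set_delayed_call(struct delayed_call *call,
				    void (*fn)(void *), void *arg)
{
	call->fn = fn;
	call->arg = arg;
}

static inline void do_delayed_call(struct delayed_call *call)
{
	if (call->fn)
		call->fn(call->arg);
}

int foo_links_freed; /* counter used only to observe the destructor */

/* Takes over the job the old ->put_link() used to do. */
static void foo_free_link(void *arg)
{
	free(arg);
	foo_links_freed++;
}

/* Models ->get_link(): return the body to traverse and register the
 * destructor where the old code would have stored *cookie. */
const char *foo_get_link(const char *target, struct delayed_call *done)
{
	size_t len = strlen(target) + 1;
	char *body = malloc(len);

	if (!body)
		return NULL;

	memcpy(body, target, len);
	set_delayed_call(done, foo_free_link, body);
	return body;
}
```

The caller owns the struct delayed_call and invokes do_delayed_call() when it has finished walking the link body, so a link that needs no cleanup simply never calls set_delayed_call().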
ce23e640 AV |
528 | -- |
529 | [mandatory] | |
530 | ->getxattr() and xattr_handler.get() get dentry and inode passed separately. | |
531 | dentry might be yet to be attached to inode, so do _not_ use its ->d_inode | |
532 | in the instances. Rationale: !@#!@# security_d_instantiate() needs to be | |
533 | called before we attach dentry to inode. | |
84e710da AV |
534 | -- |
535 | [mandatory] | |
536 | symlinks are no longer the only inodes that do *not* have i_bdev/i_cdev/ | |
537 | i_pipe/i_link union zeroed out at inode eviction. As the result, you can't | |
538 | assume that non-NULL value in ->i_link at ->destroy_inode() implies that | |
539 | it's a symlink. Checking ->i_mode is really needed now. In-tree we had | |
540 | to fix shmem_destroy_callback() that used to take that kind of shortcut; | |
541 | watch out, since that shortcut is no longer valid. | |
9902af79 AV |
542 | -- |
543 | [mandatory] | |
544 | ->i_mutex is replaced with ->i_rwsem now. inode_lock() et al. work as | |
545 | they used to - they just take it exclusive. However, ->lookup() may be | |
546 | called with parent locked shared. Its instances must not | |
547 | * use d_instantiate() and d_rehash() separately - use d_add() or | |
548 | d_splice_alias() instead. | |
549 | * use d_rehash() alone - call d_add(new_dentry, NULL) instead. | |
550 | * in the unlikely case when (read-only) access to filesystem | |
551 | data structures needs exclusion for some reason, arrange it | |
552 | yourself. None of the in-tree filesystems needed that. | |
553 | * rely on ->d_parent and ->d_name not changing after dentry has | |
554 | been fed to d_add() or d_splice_alias(). Again, none of the | |
555 | in-tree instances relied upon that. | |
556 | We are guaranteed that lookups of the same name in the same directory | |
557 | will not happen in parallel ("same" in the sense of your ->d_compare()). | |
558 | Lookups on different names in the same directory can and do happen in | |
559 | parallel now. | |
61922694 AV |
560 | -- |
561 | [recommended] | |
562 | ->iterate_shared() is added; it's a parallel variant of ->iterate(). | |
563 | Exclusion on struct file level is still provided (as well as that | |
564 | between it and lseek on the same struct file), but if your directory | |
565 | has been opened several times, you can get these called in parallel. | |
566 | Exclusion between that method and all directory-modifying ones is | |
567 | still provided, of course. | |
568 | ||
569 | Often enough ->iterate() can serve as ->iterate_shared() without any | |
570 | changes - it is a read-only operation, after all. If you have any | |
571 | per-inode or per-dentry in-core data structures modified by ->iterate(), | |
572 | you might need something to serialize the access to them. If you | |
573 | do dcache pre-seeding, you'll need to switch to d_alloc_parallel() for | |
574 | that; look for in-tree examples. | |
575 | ||
576 | Old method is only used if the new one is absent; eventually it will | |
577 | be removed. Switch while you still can; the old one won't stay. | |
9cf843e3 AV |
578 | -- |
579 | [mandatory] | |
580 | ->atomic_open() calls without O_CREAT may happen in parallel. | |
3767e255 AV |
581 | -- |
582 | [mandatory] | |
583 | ->setxattr() and xattr_handler.set() get dentry and inode passed separately. | |
584 | dentry might be yet to be attached to inode, so do _not_ use its ->d_inode | |
585 | in the instances. Rationale: !@#!@# security_d_instantiate() needs to be | |
586 | called before we attach dentry to inode and !@#!@##!@$!$#!@#$!@$!@$ smack | |
587 | ->d_instantiate() uses not just ->getxattr() but ->setxattr() as well. | |
6fa67e70 AV |
588 | -- |
589 | [mandatory] | |
590 | ->d_compare() doesn't get parent as a separate argument anymore. If you | |
591 | used it for finding the struct super_block involved, dentry->d_sb will | |
592 | work just as well; if it's something more complicated, use dentry->d_parent. | |
593 | Just be careful not to assume that fetching it more than once will yield | |
594 | the same value - in RCU mode it could change under you. |