Using RCU to Protect Read-Mostly Arrays


Although RCU is more commonly used to protect linked lists, it can
also be used to protect arrays.  Three situations are as follows:

1.  Hash Tables

2.  Static Arrays

3.  Resizeable Arrays

Each of these three situations involves an RCU-protected pointer to an
array that is separately indexed.  It might be tempting to consider use
of RCU to instead protect the index into an array; however, this use
case is -not- supported.  The problem with RCU-protected indexes into
arrays is that compilers can play way too many optimization games with
integers, which means that the rules governing handling of these indexes
are far more trouble than they are worth.  If RCU-protected indexes into
arrays prove to be particularly valuable (which they have not thus far),
explicit cooperation from the compiler will be required to permit them
to be safely used.
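
The distinction can be illustrated with a userspace C11 sketch (all
names here are invented for illustration, not taken from kernel code).
Publishing a pointer lets the reader's dereference carry an address
dependency from the pointer load to the data load, which is what
rcu_dereference() relies on; an integer index carries no such
dependency, so the compiler may freely duplicate, re-fetch, or
speculate it.

```c
#include <stdatomic.h>

struct arr {
	int size;
	int data[8];
};

/* Supported pattern: RCU protects a pointer to the array.  The
 * reader dereferences exactly the pointer value it loaded, so the
 * pointer load is ordered before the data load by the address
 * dependency (rcu_dereference() in the kernel). */
static _Atomic(struct arr *) cur;

static int read_entry(int i)
{
	/* Userspace stand-in for rcu_dereference(). */
	struct arr *a = atomic_load_explicit(&cur, memory_order_consume);

	return (i < a->size) ? a->data[i] : -1;
}

/* Unsupported pattern (not shown): an RCU-protected integer index
 * into a static array.  Nothing ties the index load to the data
 * load, so no dependency-ordering guarantee is possible. */
```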

That aside, each of the three RCU-protected pointer situations is
described in the following sections.


Situation 1: Hash Tables

Hash tables are often implemented as an array, where each array entry
has a linked-list hash chain.  Each hash chain can be protected by RCU
as described in the listRCU.txt document.  This approach also applies
to other array-of-list situations, such as radix trees.
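
The pattern can be roughly sketched in userspace C (a bucket array
plus per-bucket chains; the names here are made up, and the kernel
primitives that would actually protect the chains are indicated only
in comments):

```c
#include <stdlib.h>

#define NBUCKETS 16

struct node {
	int key;
	int val;
	struct node *next;
};

/* The bucket array itself is never resized; only the chains change. */
static struct node *table[NBUCKETS];

/* Reader side: in the kernel this runs between rcu_read_lock() and
 * rcu_read_unlock(), and walks the chain with
 * hlist_for_each_entry_rcu(). */
static struct node *lookup(int key)
{
	struct node *n;

	for (n = table[key % NBUCKETS]; n != NULL; n = n->next)
		if (n->key == key)
			return n;
	return NULL;
}

/* Updater side: a new node is published at the head of its chain.
 * The kernel would publish it with rcu_assign_pointer() (or
 * hlist_add_head_rcu()) so the node's fields are visible to readers
 * before the node itself is. */
static void insert(int key, int val)
{
	struct node *n = malloc(sizeof(*n));

	n->key = key;
	n->val = val;
	n->next = table[key % NBUCKETS];
	table[key % NBUCKETS] = n;
}
```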


Situation 2: Static Arrays

Static arrays, where the data (rather than a pointer to the data) is
located in each array element, and where the array is never resized,
have not been used with RCU.  Rik van Riel recommends using seqlock in
this situation, which would also have minimal read-side overhead as long
as updates are rare.

Quick Quiz:  Why is it so important that updates be rare when
		using seqlock?
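
A minimal userspace sketch of the seqlock idea follows.  It is
deliberately simplified: the kernel's seqlock_t also serializes
concurrent writers and inserts the required memory barriers, and the
names here are invented.

```c
#include <stdatomic.h>

#define N 4

static int arr[N];
static atomic_uint seq;		/* even: stable; odd: update in progress */

/* Writer: make the sequence odd, update the array, make the sequence
 * even again (write_seqlock()/write_sequnlock() in the kernel). */
static void set_entry(int i, int v)
{
	atomic_fetch_add(&seq, 1);
	arr[i] = v;
	atomic_fetch_add(&seq, 1);
}

/* Reader: snapshot the sequence, read, and retry if a writer was (or
 * became) active in the meantime.  This retry loop is why updates
 * must be rare: a steady stream of writers can make readers spin
 * indefinitely. */
static int get_entry(int i)
{
	unsigned int s;
	int v;

	do {
		s = atomic_load(&seq);
		v = arr[i];
	} while ((s & 1) || atomic_load(&seq) != s);
	return v;
}
```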


Situation 3: Resizeable Arrays

Use of RCU for resizeable arrays is demonstrated by the grow_ary()
function formerly used by the System V IPC code.  The array is used
to map from semaphore, message-queue, and shared-memory IDs to the data
structure that represents the corresponding IPC construct.  The grow_ary()
function does not acquire any locks; instead its caller must hold the
ids->sem semaphore.

The grow_ary() function, shown below, does some limit checks, allocates a
new ipc_id_ary, copies the old to the new portion of the new, initializes
the remainder of the new, updates the ids->entries pointer to point to
the new array, and invokes ipc_rcu_putref() to free up the old array.
Note that rcu_assign_pointer() is used to update the ids->entries pointer,
which includes any memory barriers required on whatever architecture
you are running on.

	static int grow_ary(struct ipc_ids* ids, int newsize)
	{
		struct ipc_id_ary* new;
		struct ipc_id_ary* old;
		int i;
		int size = ids->entries->size;

		if(newsize > IPCMNI)
			newsize = IPCMNI;
		if(newsize <= size)
			return newsize;

		new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize +
				    sizeof(struct ipc_id_ary));
		if(new == NULL)
			return size;
		new->size = newsize;
		memcpy(new->p, ids->entries->p,
		       sizeof(struct kern_ipc_perm *)*size +
		       sizeof(struct ipc_id_ary));
		for(i=size; i<newsize; i++) {
			new->p[i] = NULL;
		}
		old = ids->entries;

		/*
		 * Use rcu_assign_pointer() to make sure the memcpyed
		 * contents of the new array are visible before the new
		 * array becomes visible.
		 */
		rcu_assign_pointer(ids->entries, new);

		ipc_rcu_putref(old);
		return newsize;
	}

The ipc_rcu_putref() function decrements the array's reference count
and then, if the reference count has dropped to zero, uses call_rcu()
to free the array after a grace period has elapsed.
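
The shape of that helper can be sketched in userspace C.  The names
and the layout of the reference-counted allocation are stand-ins, and
the real ipc_rcu_putref() defers the free with call_rcu() rather than
freeing immediately as this mock does:

```c
#include <stdlib.h>

struct ary_mock {
	int refcount;
	/* the array payload would follow here, as in ipc_rcu_alloc() */
};

static int rcu_frees;	/* counts runs of the stand-in "RCU callback" */

/* Stand-in for the call_rcu() callback: in the kernel this runs only
 * after a grace period, once all pre-existing readers have finished
 * with the old array. */
static void free_ary(struct ary_mock *a)
{
	free(a);
	rcu_frees++;
}

/* Stand-in for ipc_rcu_putref(): drop one reference; dropping the
 * last reference schedules the deferred free. */
static void putref(struct ary_mock *a)
{
	if (--a->refcount == 0)
		free_ary(a);	/* call_rcu() in the kernel */
}
```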

The array is traversed by the ipc_lock() function.  This function
indexes into the array under the protection of rcu_read_lock(),
using rcu_dereference() to pick up the pointer to the array so
that it may later safely be dereferenced -- memory barriers are
required on the Alpha CPU.  Since the size of the array is stored
with the array itself, there can be no array-size mismatches, so
a simple check suffices.  The pointer to the structure corresponding
to the desired IPC object is placed in "out", with NULL indicating
a non-existent entry.  After acquiring "out->lock", the "out->deleted"
flag indicates whether the IPC object is in the process of being
deleted, and, if not, the pointer is returned.

	struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id)
	{
		struct kern_ipc_perm* out;
		int lid = id % SEQ_MULTIPLIER;
		struct ipc_id_ary* entries;

		rcu_read_lock();
		entries = rcu_dereference(ids->entries);
		if(lid >= entries->size) {
			rcu_read_unlock();
			return NULL;
		}
		out = entries->p[lid];
		if(out == NULL) {
			rcu_read_unlock();
			return NULL;
		}
		spin_lock(&out->lock);

		/* ipc_rmid() may have already freed the ID while ipc_lock
		 * was spinning: here verify that the structure is still valid
		 */
		if (out->deleted) {
			spin_unlock(&out->lock);
			rcu_read_unlock();
			return NULL;
		}
		return out;
	}


Answer to Quick Quiz:

	The reason that it is important that updates be rare when
	using seqlock is that frequent updates can livelock readers.
	One way to avoid this problem is to assign a seqlock for
	each array entry rather than to the entire array.