View Javadoc
1   //******************************************************************************
2   //
3   // File:    IntegerForLoop.java
4   // Package: edu.rit.pj
5   // Unit:    Class edu.rit.pj.IntegerForLoop
6   //
7   // This Java source file is copyright (C) 2007 by Alan Kaminsky. All rights
8   // reserved. For further information, contact the author, Alan Kaminsky, at
9   // ark@cs.rit.edu.
10  //
11  // This Java source file is part of the Parallel Java Library ("PJ"). PJ is free
12  // software; you can redistribute it and/or modify it under the terms of the GNU
13  // General Public License as published by the Free Software Foundation; either
14  // version 3 of the License, or (at your option) any later version.
15  //
16  // PJ is distributed in the hope that it will be useful, but WITHOUT ANY
17  // WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
18  // A PARTICULAR PURPOSE. See the GNU General Public License for more details.
19  //
20  // Linking this library statically or dynamically with other modules is making a
21  // combined work based on this library. Thus, the terms and conditions of the GNU
22  // General Public License cover the whole combination.
23  //
24  // As a special exception, the copyright holders of this library give you
25  // permission to link this library with independent modules to produce an
26  // executable, regardless of the license terms of these independent modules, and
27  // to copy and distribute the resulting executable under terms of your choice,
28  // provided that you also meet, for each linked independent module, the terms
29  // and conditions of the license of that module. An independent module is a module
30  // which is not derived from or based on this library. If you modify this library,
31  // you may extend this exception to your version of the library, but you are not
32  // obligated to do so. If you do not wish to do so, delete this exception
33  // statement from your version.
34  //
35  // A copy of the GNU General Public License is provided in the file gpl.txt. You
36  // may also obtain a copy of the GNU General Public License on the World Wide
37  // Web at http://www.gnu.org/licenses/gpl.html.
38  //
39  //******************************************************************************
40  package edu.rit.pj;
41  
42  /**
43   * Class IntegerForLoop is the abstract base class for one variation of a
44   * parallel for loop that is executed inside a {@linkplain ParallelRegion}. The
45   * loop index data type is <code>int</code>. The loop stride is implicit (+1).
46   * <P>
47   * To execute a parallel for loop, create a {@linkplain ParallelRegion} object;
48   * create an instance of a concrete subclass of class IntegerForLoop; and pass
49   * this instance to the parallel region's <code>execute()</code> method. Either
50   * every parallel team thread must call the parallel region's <code>execute()</code>
51   * method with identical arguments, or every thread must not call the
52   * <code>execute()</code> method. You can do all this using an anonymous inner
53   * class; for example:
54   * <PRE>
55   *     new ParallelRegion()
56   *         {
57   *         . . .
58   *         public void run()
59   *             {
60   *             . . .
61   *             execute (0, 99, new IntegerForLoop()
62   *                 {
63   *                 // Thread local variable declarations
64   *                 . . .
65   *                 public void start()
66   *                     {
67   *                     // Per-thread pre-loop initialization code
68   *                     . . .
69   *                     }
70   *                 public void run (int first, int last)
71   *                     {
72   *                     // Loop code
73   *                     . . .
74   *                     }
75   *                 public void finish()
76   *                     {
77   *                     // Per-thread post-loop finalization code
78   *                     . . .
79   *                     }
80   *                 });
81   *             }
82   *         . . .
83   *         }
84   * </PRE>
85   * <P>
86   * The parallel region's <code>execute()</code> method does the following. Each
87   * parallel team thread calls the parallel for loop's <code>start()</code> method
88   * once before beginning any loop iterations. The range of loop indexes is
89   * divided into "chunks" and the chunks are apportioned among the threads, in a
90   * manner determined by the parallel for loop's schedule as returned by the
91   * <code>schedule()</code> method. Each thread repeatedly calls the parallel for
92   * loop's <code>run()</code> method, passing in a different chunk on each call,
93   * until all the chunks assigned to that thread have been performed. When a
94   * thread has finished calling <code>run()</code>, the thread calls the parallel for
95   * loop's <code>finish()</code> method. Then the thread waits at an implicit
96   * barrier. When all the threads have reached the barrier, the
97   * <code>execute()</code> method returns.
98   * <P>
99   * Note that each parallel team thread actually creates its own instance of the
100  * parallel for loop class and passes that instance to the parallel region's
101  * <code>execute()</code> method. Thus, any fields declared in the parallel for loop
102  * class will <I>not</I> be shared by all the threads, but instead will be
103  * private to each thread.
104  * <P>
105  * The <code>start()</code> method is intended for performing per-thread
106  * initialization before starting the loop iterations. If no such initialization
107  * is needed, omit the <code>start()</code> method.
108  * <P>
109  * The <code>run()</code> method contains the code for the loop. The first and last
110  * indexes for a chunk of loop iterations are passed in as arguments. The loop
111  * stride is implicit (+1). The parallel for loop's <code>run()</code> method must
112  * be coded this way:
113  * <PRE>
114  *     public void run (int first, int last)
115  *         {
116  *         for (int i = first; i &lt;= last; ++ i)
117  *             {
118  *             // Loop body code
119  *             . . .
120  *             }
121  *         }
122  * </PRE> with the loop indexes running from <code>first</code> to <code>last</code>
123  * inclusive and increasing by +1 on each iteration.
124  * <P>
125  * The <code>finish()</code> method is intended for performing per-thread
126  * finalization after finishing the loop iterations. If no such finalization is
127  * needed, omit the <code>finish()</code> method.
128  * <P>
129  * Sometimes a portion of a parallel for loop has to be executed sequentially in
130  * the order of the loop indexes, while the rest of the parallel for loop can be
131  * executed concurrently. For example, the loop body is performing some
132  * computation that can be executed in parallel for different loop indexes, but
133  * the results of each computation must be written to a file sequentially in the
134  * order of the loop indexes. The <code>ordered()</code> method is provided for this
135  * purpose. A call to the <code>ordered()</code> method may appear once in the
136  * parallel for loop's <code>run()</code> method, like so:
137  * <PRE>
138  *     public void run (int first, int last)
139  *         {
140  *         for (int i = first; i &lt;= last; ++ i)
141  *             {
142  *             // This portion executed concurrently
143  *             . . .
144  *             ordered (new ParallelSection()
145  *                 {
146  *                 public void run()
147  *                     {
148  *                     // This portion executed sequentially
149  *                     // in the order of the loop indexes
150  *                     . . .
151  *                     }
152  *                 });
153  *             // This portion executed concurrently again
154  *             . . .
155  *             }
156  *         }
157  * </PRE> When called, the <code>ordered()</code> method waits until the
158  * <code>ordered()</code>
159  * method has been called and has returned in all loop iterations prior to the
160  * current loop iteration. Then the <code>ordered()</code> method calls the given
161  * parallel section's <code>run()</code> method. When the parallel section's
162  * <code>run()</code> method returns, the <code>ordered()</code> method returns. If the
163  * parallel section's <code>run()</code> method throws an exception, the
164  * <code>ordered()</code> method throws that same exception.
165  * <P>
166  * It is possible to stop a parallel for loop using the <code>stopLoop()</code>
167  * method, like this:
168  * <PRE>
169  *     public void run (int first, int last)
170  *         {
171  *         for (int i = first; i &lt;= last; ++ i)
172  *             {
173  *             // Loop body
174  *             . . .
175  *             if (/&#42;time to stop the loop&#42;/)
176  *                 {
177  *                 stopLoop();
178  *                 break;
179  *                 }
180  *             // More loop body
181  *             . . .
182  *             }
183  *         }
184  * </PRE> Once <code>stopLoop()</code> is called, after each parallel team thread
185  * finishes executing its current chunk of iterations, each thread will execute
186  * no further chunks and will proceed to finish the parallel for loop. Note well
187  * that stopping a parallel for loop is not the same as executing a
188  * <code>break</code> statement in a regular for loop. The parallel for loop does
189  * not stop until each thread, <I>including the thread that called
190  * <code>stopLoop()</code></I>, has finished its current <I>chunk</I> of iterations.
191  * Thus, depending on the parallel for loop's schedule, additional iterations
192  * may be executed after <code>stopLoop()</code> is called. (The <code>break</code>
193  * statement in the above example causes the thread that called
194  * <code>stopLoop()</code> to finish its chunk of iterations early.)
195  * <P>
196  * Normally, at the end of the parallel for loop, the parallel team threads wait
197  * for each other at a barrier. To eliminate this barrier wait, include
198  * {@link edu.rit.pj.BarrierAction#NO_WAIT BarrierAction.NO_WAIT} in the <code>execute()</code>
199  * method call:
200  * <PRE>
201  *     new ParallelRegion()
202  *         {
203  *         . . .
204  *         public void run()
205  *             {
206  *             . . .
207  *             execute (0, 99, new IntegerForLoop()
208  *                 {
209  *                 . . .
210  *                 },
211  *             BarrierAction.NO_WAIT);
212  *             . . .
213  *             }
214  *         }
215  * </PRE> To execute a section of code in a single thread as part of the barrier
216  * synchronization, include an instance of class {@linkplain BarrierAction} in
217  * the <code>execute()</code> method call. The barrier action object's
218  * <code>run()</code> method contains the code to be executed in a single thread
219  * while the other threads wait:
220  * <PRE>
221  *     new ParallelRegion()
222  *         {
223  *         . . .
224  *         public void run()
225  *             {
226  *             . . .
227  *             execute (0, 99, new IntegerForLoop()
228  *                 {
229  *                 . . .
230  *                 },
231  *             new BarrierAction()
232  *                 {
233  *                 public void run()
234  *                     {
235  *                     // Single-threaded code goes here
236  *                     . . .
237  *                     }
238  *                 });
239  *             . . .
240  *             }
241  *         }
242  * </PRE> For further information, see class {@linkplain BarrierAction}.
243  * <P>
244  * If the parallel for loop's <code>start()</code>, <code>run()</code>, or
245  * <code>finish()</code> method throws an exception in one of the threads, then that
246  * thread executes no further code in the loop, and the parallel region's
247  * <code>execute()</code> method throws that same exception in that thread.
248  * Furthermore, the other threads in the parallel team also execute no further
249  * code in the loop after finishing their current chunks. Thus, if one thread
250  * throws an exception, the whole parallel for loop exits with some (perhaps
251  * none) of the iterations unperformed.
252  *
253  * @author Alan Kaminsky
254  * @version 11-Nov-2007
255  */
256 public abstract class IntegerForLoop
257         extends ParallelForLoop {
258 
259 // Hidden data members.
260     // Parallel for loop schedule.
261     IntegerSchedule mySchedule;
262 
263     // Loop index for ordered() construct.
264     int myOrderedIndex;
265 
266 // Exported constructors.
267     /**
268      * Construct a new parallel for loop.
269      */
270     public IntegerForLoop() {
271         super();
272     }
273 
274 // Exported operations.
275     /**
276      * Determine this parallel for loop's schedule. The schedule determines how
277      * the loop iterations are apportioned among the parallel team threads. For
278      * further information, see class {@linkplain IntegerSchedule}.
279      * <P>
280      * The <code>schedule()</code> method may be overridden in a subclass to return
281      * the desired schedule. If not overridden, the default is a runtime
282      * schedule (see {@link edu.rit.pj.IntegerSchedule#runtime()}).
283      *
284      * @return Schedule for this parallel for loop.
285      */
286     public IntegerSchedule schedule() {
287         return IntegerSchedule.runtime();
288     }
289 
290     /**
291      * Perform per-thread initialization actions before starting the loop
292      * iterations.
293      * <P>
294      * The <code>start()</code> method may be overridden in a subclass. If not
295      * overridden, the <code>start()</code> method does nothing.
296      *
297      * @exception Exception The <code>start()</code> method may throw any exception.
298      * @throws java.lang.Exception if any.
299      */
300     public void start()
301             throws Exception {
302     }
303 
304     /**
305      * Execute one chunk of iterations of this parallel for loop. The
306      * <code>run()</code> method must perform the loop body for indexes
307      * <code>first</code> through <code>last</code> inclusive, increasing the loop index
308      * by +1 after each iteration.
309      * <P>
310      * The <code>run()</code> method must be overridden in a subclass.
311      *
312      * @param first First loop index.
313      * @param last Last loop index.
314      * @exception Exception The <code>run()</code> method may throw any exception.
315      * @throws java.lang.Exception if any.
316      */
317     public abstract void run(int first,
318             int last)
319             throws Exception;
320 
321     /**
322      * Perform per-thread finalization actions after finishing the loop
323      * iterations.
324      * <P>
325      * The <code>finish()</code> method may be overridden in a subclass. If not
326      * overridden, the <code>finish()</code> method does nothing.
327      *
328      * @exception Exception The <code>finish()</code> method may throw any
329      * exception.
330      * @throws java.lang.Exception if any.
331      */
332     public void finish()
333             throws Exception {
334     }
335 
336     /**
337      * Execute the given section of code in order of the loop indexes. A call to
338      * the <code>ordered()</code> method may appear in this parallel for loop's
339      * <code>run()</code> method. When called, the <code>ordered()</code> method waits
340      * until the <code>ordered()</code> method has been called and has returned in
341      * all loop iterations prior to the current loop iteration. Then the
342      * <code>ordered()</code> method calls the <code>run()</code> method of
343      * <code>theParallelSection</code>. When the parallel section's <code>run()</code>
344      * method returns, the <code>ordered()</code> method returns. If the parallel
345      * section's <code>run()</code> method throws an exception, the
346      * <code>ordered()</code> method throws that same exception.
347      * <P>
348      * The <code>ordered()</code> method is used when a portion of a parallel for
349      * loop has to be executed sequentially in the order of the loop indexes,
350      * while the rest of the parallel for loop can be executed concurrently.
351      * <P>
352      * <I>Note:</I> Either the <code>ordered()</code> method must be called exactly
353      * once during each call of the parallel for loop's <code>run()</code> method,
354      * or the <code>ordered()</code> method must not be called at all.
355      *
356      * @param theSection Parallel section to execute in order.
357      * @exception NullPointerException (unchecked exception) Thrown if
358      * <code>theSection</code> is null.
359      * @exception IllegalStateException (unchecked exception) Thrown if no
360      * parallel team is executing this parallel for loop.
361      * @exception Exception Thrown if <code>theSection</code>'s <code>run()</code>
362      * method throws an exception.
363      * @throws java.lang.Exception if any.
364      */
365     public final void ordered(ParallelSection theSection)
366             throws Exception {
367         // Verify preconditions.
368         if (theSection == null) {
369             throw new IllegalStateException("IntegerForLoop.ordered(): Parallel section is null");
370         }
371         if (myTeam == null) {
372             throw new IllegalStateException("IntegerForLoop.ordered(): No parallel team executing");
373         }
374 
375         // Wait until the ordered() construct has finished for all previous
376         // iterations.
377         if (mySchedule.myOrderedIndex != this.myOrderedIndex) {
378             Spinner spinner = new Spinner();
379             while (mySchedule.myOrderedIndex != this.myOrderedIndex) {
380                 spinner.spin();
381             }
382         }
383 
384         // Execute parallel section. Propagate any exception.
385         theSection.myTeam = this.myTeam;
386         try {
387             theSection.run();
388         } finally {
389             theSection.myTeam = null;
390 
391             // Notify that the ordered construct has finished for this
392             // iteration.
393             ++this.myOrderedIndex;
394             mySchedule.myOrderedIndex = this.myOrderedIndex;
395         }
396     }
397 
398     /**
399      * Stop this parallel for loop. Once <code>stopLoop()</code> is called, after
400      * each parallel team thread finishes executing its current chunk of
401      * iterations, each thread will execute no further chunks and will proceed
402      * to finish this parallel for loop.
403      *
404      * @exception IllegalStateException (unchecked exception) Thrown if no
405      * parallel team is executing this parallel for loop.
406      */
407     public final void stopLoop() {
408         if (myTeam == null) {
409             throw new IllegalStateException("ParallelForLoop.stopLoop(): No parallel team executing");
410         }
411         mySchedule.myBreak = true;
412     }
413 
414 // Hidden operations.
415     /**
416      * Execute one chunk of iterations of this parallel for loop. This method
417      * performs common processing, then calls the <code>run()</code> method.
418      *
419      * @param first First loop index.
420      * @param last Last loop index.
421      *
422      * @exception Exception This method may throw any exception.
423      */
424     void commonRun(int first,
425             int last)
426             throws Exception {
427         myOrderedIndex = first;
428         run(first, last);
429     }
430 
431     // Kludge to avert false sharing in multithreaded programs.
432     // Padding fields.
433     volatile long p0 = 1000L;
434     volatile long p1 = 1001L;
435     volatile long p2 = 1002L;
436     volatile long p3 = 1003L;
437     volatile long p4 = 1004L;
438     volatile long p5 = 1005L;
439     volatile long p6 = 1006L;
440     volatile long p7 = 1007L;
441     volatile long p8 = 1008L;
442     volatile long p9 = 1009L;
443     volatile long pa = 1010L;
444     volatile long pb = 1011L;
445     volatile long pc = 1012L;
446     volatile long pd = 1013L;
447     volatile long pe = 1014L;
448     volatile long pf = 1015L;
449 
450     // Method to prevent the JDK from optimizing away the padding fields.
451     long preventOptimization() {
452         return p0 + p1 + p2 + p3 + p4 + p5 + p6 + p7 +
453             p8 + p9 + pa + pb + pc + pd + pe + pf;
454     }
455 }