

PIG-3400 FS commands do not work with S3 paths

Review Request #13122 - Created July 31, 2013 and updated

Cheolsoo Park
PIG-3400
Reviewers
pig
pig-git
Makes fs utility commands work with s3 paths. Documents fs utility commands in the Pig manual. (They were not documented at all.)
All unit tests pass.

Manually verified the following commands with s3 paths:
pig -e 'ls s3://<path>'
pig -e 'mkdir s3://<path>'
pig -e 'rm s3://<path>'
pig -e 'cp s3://<path1> s3://<path2>'
pig -e 'mv s3://<path1> s3://<path2>'
pig -e 'copyToLocal s3://<path>/<file> .'
pig -e 'copyFromLocal <file> s3://<path>' 
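The manual checks above could be collected into a small dry-run script, sketched below. This is only an illustration, not part of the patch: `S3_BASE`, `LOCAL_FILE`, the `example-bucket` location, and the `dir`, `copy`, and `moved` names are hypothetical placeholders standing in for the elided `<path>` and `<file>` values, and the ordering is just one arrangement in which each command has an input to work on. By default the script only prints the `pig -e` command lines; setting `RUN=1` would invoke `pig`, which assumes Pig and S3 credentials are already configured.

```shell
#!/bin/sh
# Dry-run sketch of the manual verification list.
# S3_BASE and LOCAL_FILE are hypothetical placeholders, not values from this review.
S3_BASE="${S3_BASE:-s3://example-bucket/pig-3400}"
LOCAL_FILE="${LOCAL_FILE:-sample.txt}"

# Emit one fs utility command per line, ordered so each step has its input.
smoke_commands() {
    cat <<EOF
mkdir $S3_BASE/dir
copyFromLocal $LOCAL_FILE $S3_BASE/dir
ls $S3_BASE/dir
cp $S3_BASE/dir/$LOCAL_FILE $S3_BASE/dir/copy
mv $S3_BASE/dir/copy $S3_BASE/dir/moved
copyToLocal $S3_BASE/dir/moved .
rm $S3_BASE/dir
EOF
}

# Print each command by default; set RUN=1 to actually invoke pig.
run_smoke() {
    smoke_commands | while IFS= read -r cmd; do
        if [ "${RUN:-0}" = "1" ]; then
            # Redirect stdin so pig cannot consume the remaining command lines.
            pig -e "$cmd" </dev/null || exit 1
        else
            printf "pig -e '%s'\n" "$cmd"
        fi
    done
}

run_smoke
```

Running it without `RUN=1` is a cheap way to review the exact command lines before pointing them at a real bucket.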
src/docs/src/documentation/content/xdocs/cmds.xml
Revision 38babd2 New Change
[20] 69 lines
70
   <title>Usage</title>
70
   <title>Usage</title>
71
   <p>Use the fs command to invoke any FsShell command from within a Pig script or Grunt shell. 
71
   <p>Use the fs command to invoke any FsShell command from within a Pig script or Grunt shell. 
72
   The fs command greatly extends the set of supported file system commands and the capabilities
72
   The fs command greatly extends the set of supported file system commands and the capabilities
73
   supported for existing commands such as ls that will now support globbing. For a complete list of
73
   supported for existing commands such as ls that will now support globbing. For a complete list of
74
   FsShell commands, see 
74
   FsShell commands, see 
75
   <a href="http://hadoop.apache.org/common/docs/current/file_system_shell.html">File System Shell Guide</a></p>
75
   <a href="http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html">File System Shell Guide</a></p>
76
   </section>
76
   </section>
77
   
77
   
78
   <section>
78
   <section>
79
   <title>Examples</title>
79
   <title>Examples</title>
80
   <p>In these examples a directory is created, a file is copied, a file is listed.</p>
80
   <p>In these examples a directory is created, a file is copied, a file is listed.</p>
[+20] [20] 54 lines
135
</p>
135
</p>
136
   </section>
136
   </section>
137
   
137
   
138
   <section>
138
   <section>
139
   <title>Example</title>
139
   <title>Example</title>
140
   <p>In this example the ls command is invoked.</p>
140
   <p>In this example, the ls command is invoked.</p>
141
<source>
141
<source>
142
grunt> sh ls 
142
grunt> sh ls 
143
bigdata.conf 
143
bigdata.conf 
144
nightly.conf 
144
nightly.conf 
145
..... 
145
..... 
[+20] [20] 7 lines
153
 
153
 
154
 <!-- ======================================================== -->         
154
 <!-- ======================================================== -->         
155
        
155
        
156
   <section id="utillity-cmds">
156
   <section id="utillity-cmds">
157
   <title>Utility Commands</title>
157
   <title>Utility Commands</title>

    
   
158

   

    
   
159
    <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
160
       <section id="cat">

    
   
161
           <title>cat</title>

    
   
162
           <p>Copies a source path to stdout.</p>

    
   
163
           

    
   
164
           <section>

    
   
165
               <title>Syntax </title>

    
   
166
               <table>

    
   
167
                   <tr>

    
   
168
                       <td>

    
   
169
                           <p>cat URI</p>

    
   
170
                       </td>

    
   
171
                   </tr> 

    
   
172
               </table>

    
   
173
           </section>

    
   
174
           

    
   
175
           <section>

    
   
176
               <title>Terms</title>

    
   
177
               <table>

    
   
178
                   <tr>

    
   
179
                       <td>

    
   
180
                           <p>key</p>

    
   
181
                       </td>

    
   
182
                       <td>

    
   
183
                           <p>Description.</p>

    
   
184
                       </td>

    
   
185
                   </tr>

    
   
186
                   <tr>

    
   
187
                       <td>

    
   
188
                           <p>URI</p>

    
   
189
                       </td>

    
   
190
                       <td>

    
   
191
                           <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
192
                       </td>

    
   
193
                   </tr> 

    
   
194
               </table></section>

    
   
195
           <section>

    
   
196
               <title>Example</title>

    
   
197
               <p>In this example, the cat command prints the content of file1 on stdout.</p>

    
   
198
               <source>

    
   
199
grunt&gt; cat hdfs://nn1.example.com/file1</source>

    
   
200
           </section>

    
   
201
       </section>

    
   
202

   

    
   
203
       <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
204
       <section id="cd">

    
   
205
           <title>cd</title>

    
   
206
           <p>Change the current working directory.</p>

    
   
207
           

    
   
208
           <section>

    
   
209
               <title>Syntax </title>

    
   
210
               <table>

    
   
211
                   <tr>

    
   
212
                       <td>

    
   
213
                           <p>cd path</p>

    
   
214
                       </td>

    
   
215
                   </tr> 

    
   
216
               </table>

    
   
217
           </section>

    
   
218
           

    
   
219
           <section>

    
   
220
               <title>Terms</title>

    
   
221
               <table>

    
   
222
                   <tr>

    
   
223
                       <td>

    
   
224
                           <p>key</p>

    
   
225
                       </td>

    
   
226
                       <td>

    
   
227
                           <p>Description.</p>

    
   
228
                       </td>

    
   
229
                   </tr>

    
   
230
                   <tr>

    
   
231
                       <td>

    
   
232
                           <p>path</p>

    
   
233
                       </td>

    
   
234
                       <td>

    
   
235
                           <p>Path must be an HDFS path in MapReduce mode and a local path in local mode.</p>

    
   
236
                       </td>

    
   
237
                   </tr> 

    
   
238
               </table></section>

    
   
239
           <section>

    
   
240
               <title>Example</title>

    
   
241
               <p>In this example, the cd command changes the current directory to /user/hadoop.</p>

    
   
242
               <source>

    
   
243
grunt&gt; cd /user/hadoop</source>

    
   
244
           </section>

    
   
245
       </section>
158
   
246
   
159
   <!-- +++++++++++++++++++++++++++++++++++++++ -->
247
   <!-- +++++++++++++++++++++++++++++++++++++++ -->
160
       <section id="clear">
248
       <section id="clear">
161
           <title>clear</title>
249
           <title>clear</title>
162
           <p>Clear the screen of Pig grunt shell and position the cursor at top of the screen.</p>
250
           <p>Clear the screen of Pig grunt shell and position the cursor at top of the screen.</p>
[+20] [20] 29 lines
192
                       </td>
280
                       </td>
193
                   </tr> 
281
                   </tr> 
194
               </table></section>
282
               </table></section>
195
           <section>
283
           <section>
196
               <title>Example</title>
284
               <title>Example</title>
197
               <p>In this example the clear command clean up Pig grunt shell.</p>
285
               <p>In this example, the clear command clears the Pig grunt shell screen.</p>
198
               <source>
286
               <source>
199
grunt&gt; clear</source>
287
grunt&gt; clear</source>
200
           </section>
288
           </section>
201
           
289
           
202
       </section>
290
       </section>
203

    
   
291

   
204
   <!-- +++++++++++++++++++++++++++++++++++++++ -->
292
       <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
293
       <section id="copyFromLocal">

    
   
294
           <title>copyFromLocal</title>

    
   
295
           <p>Copy a file or directory from local source to remote destination.</p>

    
   
296
           

    
   
297
           <section>

    
   
298
               <title>Syntax </title>

    
   
299
               <table>

    
   
300
                   <tr>

    
   
301
                       <td>

    
   
302
                           <p>copyFromLocal path URI</p>

    
   
303
                       </td>

    
   
304
                   </tr> 

    
   
305
               </table>

    
   
306
           </section>

    
   
307
           

    
   
308
           <section>

    
   
309
               <title>Terms</title>

    
   
310
               <table>

    
   
311
                   <tr>

    
   
312
                       <td>

    
   
313
                           <p>key</p>

    
   
314
                       </td>

    
   
315
                       <td>

    
   
316
                           <p>Description.</p>

    
   
317
                       </td>

    
   
318
                   </tr>

    
   
319
                   <tr>

    
   
320
                       <td>

    
   
321
                           <p>URI</p>

    
   
322
                       </td>

    
   
323
                       <td>

    
   
324
                           <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
325
                       </td>

    
   
326
                   </tr> 

    
   
327
               </table></section>

    
   
328
           <section>

    
   
329
               <title>Example</title>

    
   
330
               <p>In this example, the copyFromLocal command copies file1 to file2.</p>

    
   
331
               <source>

    
   
332
grunt&gt; copyFromLocal file1 hdfs://nn1.example.com/user/hadoop/file2</source>

    
   
333
           </section>

    
   
334
       </section>

    
   
335

   

    
   
336
       <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
337
       <section id="copyToLocal">

    
   
338
           <title>copyToLocal</title>

    
   
339
           <p>Copy a file or directory from remote source to local destination.</p>

    
   
340
           

    
   
341
           <section>

    
   
342
               <title>Syntax </title>

    
   
343
               <table>

    
   
344
                   <tr>

    
   
345
                       <td>

    
   
346
                           <p>copyToLocal URI path</p>

    
   
347
                       </td>

    
   
348
                   </tr> 

    
   
349
               </table>

    
   
350
           </section>

    
   
351
           

    
   
352
           <section>

    
   
353
               <title>Terms</title>

    
   
354
               <table>

    
   
355
                   <tr>

    
   
356
                       <td>

    
   
357
                           <p>key</p>

    
   
358
                       </td>

    
   
359
                       <td>

    
   
360
                           <p>Description.</p>

    
   
361
                       </td>

    
   
362
                   </tr>

    
   
363
                   <tr>

    
   
364
                       <td>

    
   
365
                           <p>URI</p>

    
   
366
                       </td>

    
   
367
                       <td>

    
   
368
                           <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
369
                       </td>

    
   
370
                   </tr> 

    
   
371
               </table></section>

    
   
372
           <section>

    
   
373
               <title>Example</title>

    
   
374
               <p>In this example, the copyToLocal command copies file1 to file2.</p>

    
   
375
               <source>

    
   
376
grunt&gt; copyToLocal hdfs://nn1.example.com/user/hadoop/file1 file2</source>

    
   
377
           </section>

    
   
378
       </section>

    
   
379
   

    
   
380
       <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
381
       <section id="cp">

    
   
382
           <title>cp</title>

    
   
383
           <p>Copy a file or directory from source to destination.</p>

    
   
384
           

    
   
385
           <section>

    
   
386
               <title>Syntax </title>

    
   
387
               <table>

    
   
388
                   <tr>

    
   
389
                       <td>

    
   
390
                           <p>cp URI URI</p>

    
   
391
                       </td>

    
   
392
                   </tr> 

    
   
393
               </table>

    
   
394
           </section>

    
   
395
           

    
   
396
           <section>

    
   
397
               <title>Terms</title>

    
   
398
               <table>

    
   
399
                   <tr>

    
   
400
                       <td>

    
   
401
                           <p>key</p>

    
   
402
                       </td>

    
   
403
                       <td>

    
   
404
                           <p>Description.</p>

    
   
405
                       </td>

    
   
406
                   </tr>

    
   
407
                   <tr>

    
   
408
                       <td>

    
   
409
                           <p>URI</p>

    
   
410
                       </td>

    
   
411
                       <td>

    
   
412
                           <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
413
                       </td>

    
   
414
                   </tr> 

    
   
415
               </table></section>

    
   
416
           <section>

    
   
417
               <title>Example</title>

    
   
418
               <p>In this example, the cp command copies file1 to file2.</p>

    
   
419
               <source>

    
   
420
grunt&gt; cp hdfs://nn1.example.com/user/hadoop/file1 hdfs://nn1.example.com/user/hadoop/file2</source>

    
   
421
           </section>

    
   
422
       </section>

    
   
423

   

    
   
424
   <!-- +++++++++++++++++++++++++++++++++++++++ -->
205
  <section id="exec">
425
  <section id="exec">
206
   <title>exec</title>
426
   <title>exec</title>
207
   <p>Run a Pig script.</p>
427
   <p>Run a Pig script.</p>
208
   
428
   
209
   <section>
429
   <section>
210
   <title>Syntax</title>
430
   <title>Syntax</title>
211
   <table>
431
   <table>
212
       <tr>
432
       <tr>
213
            <td>
433
            <td>
214
               <p>exec [–param param_name = param_value] [–param_file file_name] [script]  </p>
434
               <p>exec [–param param_name=param_value] [–param_file file_name] [script]  </p>
215
            </td>
435
            </td>
216
         </tr> 
436
         </tr> 
217
   </table></section>
437
   </table></section>
218
   
438
   
219
   <section>
439
   <section>
220
   <title>Terms</title>
440
   <title>Terms</title>
221
   <table>
441
   <table>
222
    
442
    
223
        <tr>
443
        <tr>
224
            <td>
444
            <td>
225
               <p>–param param_name = param_value</p>
445
               <p>–param param_name=param_value</p>
226
            </td>
446
            </td>
227
            <td>
447
            <td>
228
               <p>See <a href="cont.html#Parameter-Sub">Parameter Substitution</a>.</p>
448
               <p>See <a href="cont.html#Parameter-Sub">Parameter Substitution</a>.</p>
229
            </td>
449
            </td>
230
        </tr>
450
        </tr>
[+20] [20] 26 lines
257
   <p id="exec-debug">For comparison, see the <a href="#run">run</a> command. Both the exec and run commands are useful for debugging because you can modify a Pig script in an editor and then rerun the script in the Grunt shell without leaving the shell. Also, both commands promote Pig script modularity as they allow you to reuse existing components.</p>
477
   <p id="exec-debug">For comparison, see the <a href="#run">run</a> command. Both the exec and run commands are useful for debugging because you can modify a Pig script in an editor and then rerun the script in the Grunt shell without leaving the shell. Also, both commands promote Pig script modularity as they allow you to reuse existing components.</p>
258
   </section>
478
   </section>
259
   
479
   
260
   <section>
480
   <section>
261
   <title>Examples</title>
481
   <title>Examples</title>
262
   <p>In this example the script is displayed and run.</p>
482
   <p>In this example, the script is displayed and run.</p>
263

    
   
483

   
264
<source>
484
<source>
265
grunt&gt; cat myscript.pig
485
grunt&gt; cat myscript.pig
266
a = LOAD 'student' AS (name, age, gpa);
486
a = LOAD 'student' AS (name, age, gpa);
267
b = LIMIT a 3;
487
b = LIMIT a 3;
268
DUMP b;
488
DUMP b;
269

    
   
489

   
270
grunt&gt; exec myscript.pig
490
grunt&gt; exec myscript.pig
271
(alice,20,2.47)
491
(alice,20,2.47)
272
(luke,18,4.00)
492
(luke,18,4.00)
273
(holly,24,3.27)
493
(holly,24,3.27)
274
</source>
494
</source>
275

    
   
495

   
276
   <p>In this example parameter substitution is used with the exec command.</p>
496
   <p>In this example, parameter substitution is used with the exec command.</p>
277
<source>
497
<source>
278
grunt&gt; cat myscript.pig
498
grunt&gt; cat myscript.pig
279
a = LOAD 'student' AS (name, age, gpa);
499
a = LOAD 'student' AS (name, age, gpa);
280
b = ORDER a BY name;
500
b = ORDER a BY name;
281

    
   
501

   
282
STORE b into '$out';
502
STORE b into '$out';
283

    
   
503

   
284
grunt&gt; exec –param out=myoutput myscript.pig
504
grunt&gt; exec –param out=myoutput myscript.pig
285
</source>
505
</source>
286

    
   
506

   
287
      <p>In this example multiple parameters are specified.</p>
507
      <p>In this example, multiple parameters are specified.</p>
288
<source>
508
<source>
289
grunt&gt; exec –param p1=myparam1 –param p2=myparam2 myscript.pig
509
grunt&gt; exec –param p1=myparam1 –param p2=myparam2 myscript.pig
290
</source>
510
</source>
291
   </section>
511
   </section>
292
   </section>   
512
   </section>   
[+20] [20] 106 lines
399
               <title>Usage</title>
619
               <title>Usage</title>
400
               <p>The history command shows the statements used so far.</p></section>
620
               <p>The history command shows the statements used so far.</p></section>
401
           
621
           
402
           <section>
622
           <section>
403
               <title>Example</title>
623
               <title>Example</title>
404
               <p>In this example the history command shows all the statements with line numbers and without them.</p>
624
               <p>In this example, the history command shows all the statements with line numbers and without them.</p>
405
               <source>
625
               <source>
406
grunt&gt; a = LOAD 'student' AS (name, age, gpa);
626
grunt&gt; a = LOAD 'student' AS (name, age, gpa);
407
grunt&gt; b = order a by name;
627
grunt&gt; b = order a by name;
408
grunt&gt; history
628
grunt&gt; history
409
1 a = LOAD 'student' AS (name, age, gpa);
629
1 a = LOAD 'student' AS (name, age, gpa);
[+20] [20] 42 lines
452
   <p>The kill command will attempt to kill any MapReduce jobs associated with the Pig job. Under certain conditions, however, this may fail; for example, when a Pig job is killed and does not have a chance to call its shutdown procedures.</p>
672
   <p>The kill command will attempt to kill any MapReduce jobs associated with the Pig job. Under certain conditions, however, this may fail; for example, when a Pig job is killed and does not have a chance to call its shutdown procedures.</p>
453
   </section>
673
   </section>
454
   
674
   
455
   <section>
675
   <section>
456
   <title>Example</title>
676
   <title>Example</title>
457
   <p>In this example the job with id job_0001 is killed.</p>
677
   <p>In this example, the job with id job_0001 is killed.</p>
458
<source>
678
<source>
459
grunt&gt; kill job_0001
679
grunt&gt; kill job_0001
460
</source>
680
</source>
461
   </section></section>
681
   </section></section>

    
   
682

   

    
   
683
   <!-- +++++++++++++++++++++++++++++++++++++++ -->      

    
   
684
   <section id="ls">

    
   
685
   <title>ls</title>

    
   
686
   <p>For a file, returns stat on the file. For a directory, returns a list of its direct children, as in Unix.</p>

    
   
687
   

    
   
688
   <section>

    
   
689
   <title>Syntax</title>

    
   
690
   <table>

    
   
691
       <tr>

    
   
692
            <td>

    
   
693
                <p>ls URI</p>

    
   
694
            </td>

    
   
695
         </tr> 

    
   
696
   </table></section>

    
   
697
   

    
   
698
   <section>

    
   
699
   <title>Terms</title>

    
   
700
   <table>

    
   
701
       <tr>

    
   
702
            <td>

    
   
703
               <p>URI</p>

    
   
704
            </td>

    
   
705
            <td>

    
   
706
               <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
707
            </td>

    
   
708
         </tr> 

    
   
709
   </table></section>

    
   
710
   

    
   
711
   <section>

    
   
712
   <title>Example</title>

    
   
713
   <p>In this example, the ls command returns the stat of file1.</p>

    
   
714
<source>

    
   
715
grunt&gt; ls hdfs://nn1.example.com/user/hadoop/file1

    
   
716
</source>

    
   
717
   </section></section>

    
   
718
   

    
   
719
    <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
720
    <section id="mkdir">

    
   
721
        <title>mkdir</title>

    
   
722
        <p>Creates a directory.</p>

    
   
723
        

    
   
724
        <section>

    
   
725
            <title>Syntax </title>

    
   
726
            <table>

    
   
727
                <tr>

    
   
728
                    <td>

    
   
729
                        <p>mkdir URI</p>

    
   
730
                    </td>

    
   
731
                </tr> 

    
   
732
            </table>

    
   
733
        </section>

    
   
734
        

    
   
735
        <section>

    
   
736
            <title>Terms</title>

    
   
737
            <table>

    
   
738
                <tr>

    
   
739
                    <td>

    
   
740
                        <p>key</p>

    
   
741
                    </td>

    
   
742
                    <td>

    
   
743
                        <p>Description.</p>

    
   
744
                    </td>

    
   
745
                </tr>

    
   
746
                <tr>

    
   
747
                    <td>

    
   
748
                        <p>URI</p>

    
   
749
                    </td>

    
   
750
                    <td>

    
   
751
                        <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
752
                    </td>

    
   
753
                </tr> 

    
   
754
            </table></section>

    
   
755
        <section>

    
   
756
            <title>Example</title>

    
   
757
            <p>In this example, the mkdir command creates dir.</p>

    
   
758
            <source>

    
   
759
grunt&gt; mkdir hdfs://nn1.example.com/user/hadoop/dir</source>

    
   
760
        </section>

    
   
761
    </section>

    
   
762

   

    
   
763
    <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
764
    <section id="mv">

    
   
765
        <title>mv</title>

    
   
766
        <p>Moves a file or directory from source to destination.</p>

    
   
767
        

    
   
768
        <section>

    
   
769
            <title>Syntax </title>

    
   
770
            <table>

    
   
771
                <tr>

    
   
772
                    <td>

    
   
773
                        <p>mv URI URI</p>

    
   
774
                    </td>

    
   
775
                </tr> 

    
   
776
            </table>

    
   
777
        </section>

    
   
778
        

    
   
779
        <section>

    
   
780
            <title>Terms</title>

    
   
781
            <table>

    
   
782
                <tr>

    
   
783
                    <td>

    
   
784
                        <p>key</p>

    
   
785
                    </td>

    
   
786
                    <td>

    
   
787
                        <p>Description.</p>

    
   
788
                    </td>

    
   
789
                </tr>

    
   
790
                <tr>

    
   
791
                    <td>

    
   
792
                        <p>URI</p>

    
   
793
                    </td>

    
   
794
                    <td>

    
   
795
                        <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
796
                    </td>

    
   
797
                </tr> 

    
   
798
            </table></section>

    
   
799
        <section>

    
   
800
            <title>Example</title>

    
   
801
            <p>In this example, the mv command moves file1 to file2.</p>

    
   
802
            <source>

    
   
803
grunt&gt; mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2</source>

    
   
804
        </section>

    
   
805
    </section>

    
   
806

   

    
   
807
    <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
808
    <section id="pwd">

    
   
809
        <title>pwd</title>

    
   
810
        <p>Output the path of the current working directory.</p>

    
   
811
        

    
   
812
        <section>

    
   
813
            <title>Syntax </title>

    
   
814
            <table>

    
   
815
                <tr>

    
   
816
                    <td>

    
   
817
                        <p>pwd</p>

    
   
818
                    </td>

    
   
819
                </tr> 

    
   
820
            </table>

    
   
821
        </section>

    
   
822
        

    
   
823
        <section>

    
   
824
            <title>Terms</title>

    
   
825
            <table>

    
   
826
                <tr>

    
   
827
                    <td>

    
   
828
                        <p>key</p>

    
   
829
                    </td>

    
   
830
                    <td>

    
   
831
                        <p>Description.</p>

    
   
832
                    </td>

    
   
833
                </tr>

    
   
834
                <tr>

    
   
835
                    <td>

    
   
836
                        <p>none</p>

    
   
837
                    </td>

    
   
838
                    <td>

    
   
839
                        <p>no parameters</p>

    
   
840
                    </td>

    
   
841
                </tr> 

    
   
842
            </table></section>

    
   
843
        <section>

    
   
844
            <title>Example</title>

    
   
845
            <p>In this example, the pwd command prints the current working directory.</p>

    
   
846
            <source>

    
   
847
grunt&gt; pwd</source>

    
   
848
        </section>

    
   
849
    </section>
462
   
850
   
463
   <!-- +++++++++++++++++++++++++++++++++++++++ -->
851
   <!-- +++++++++++++++++++++++++++++++++++++++ -->
464
   <section id="quit">
852
   <section id="quit">
465
   <title>quit</title>
853
   <title>quit</title>
466
   <p>Quits from the Pig grunt shell.</p>
854
   <p>Quits from the Pig grunt shell.</p>
[+20] [20] 25 lines
492
   <title>Usage</title>
880
   <title>Usage</title>
493
   <p>The quit command enables you to quit or exit the Pig grunt shell.</p></section>
881
   <p>The quit command enables you to quit or exit the Pig grunt shell.</p></section>
494
   
882
   
495
   <section>
883
   <section>
496
   <title>Example</title>
884
   <title>Example</title>
497
   <p>In this example the quit command exits the Pig grunt shall.</p>
885
   <p>In this example, the quit command exits the Pig grunt shell.</p>
498
<source>
886
<source>
499
grunt&gt; quit
887
grunt&gt; quit
500
</source>
888
</source>
501
   </section>
889
   </section>
502
   </section>
890
   </section>
503
   
891

   

    
   
892
    <!-- +++++++++++++++++++++++++++++++++++++++ -->

    
   
893
    <section id="rm">

    
   
894
        <title>rm</title>

    
   
895
        <p>Delete a file or directory.</p>

    
   
896
        

    
   
897
        <section>

    
   
898
            <title>Syntax </title>

    
   
899
            <table>

    
   
900
                <tr>

    
   
901
                    <td>

    
   
902
                        <p>rm URI</p>

    
   
903
                    </td>

    
   
904
                </tr> 

    
   
905
            </table>

    
   
906
        </section>

    
   
907
        

    
   
908
        <section>

    
   
909
            <title>Terms</title>

    
   
910
            <table>

    
   
911
                <tr>

    
   
912
                    <td>

    
   
913
                        <p>key</p>

    
   
914
                    </td>

    
   
915
                    <td>

    
   
916
                        <p>Description.</p>

    
   
917
                    </td>

    
   
918
                </tr>

    
   
919
                <tr>

    
   
920
                    <td>

    
   
921
                        <p>URI</p>

    
   
922
                    </td>

    
   
923
                    <td>

    
   
924
                        <p>URI can be a local, hdfs, or s3 path. If no URI scheme is given, the default file system is assumed.</p>

    
   
925
                    </td>

    
   
926
                </tr> 

    
   
927
            </table></section>

    
   
928
        <section>

    
   
929
            <title>Example</title>

    
   
930
            <p>In this example, the rm command deletes file.</p>

    
   
931
            <source>

    
   
932
grunt&gt; rm hdfs://nn.example.com/file</source>

    
   
933
        </section>

    
   
934
    </section>

    
   
935

   
504
    <!-- +++++++++++++++++++++++++++++++++++++++ -->     
936
    <!-- +++++++++++++++++++++++++++++++++++++++ -->     
505
   <section id="run">
937
   <section id="run">
506
   <title>run</title>
938
   <title>run</title>
507
   <p>Run a Pig script.</p>
939
   <p>Run a Pig script.</p>
508
   
940
   
509
   <section>
941
   <section>
510
   <title>Syntax</title>
942
   <title>Syntax</title>
511
   <table>
943
   <table>
512
       <tr>
944
       <tr>
513
            <td>
945
            <td>
514
               <p>run [–param param_name = param_value] [–param_file file_name] script </p>
946
               <p>run [–param param_name=param_value] [–param_file file_name] script </p>
515
            </td>
947
            </td>
516
         </tr> 
948
         </tr> 
517
   </table></section>
949
   </table></section>
518
   
950
   
519
   <section>
951
   <section>
520
   <title>Terms</title>
952
   <title>Terms</title>
521
   <table>
953
   <table>
522
    
954
    
523
         <tr>
955
         <tr>
524
            <td>
956
            <td>
525
               <p>–param param_name = param_value</p>
957
               <p>–param param_name=param_value</p>
526
            </td>
958
            </td>
527
            <td>
959
            <td>
528
               <p>See <a href="cont.html#Parameter-Sub">Parameter Substitution</a>.</p>
960
               <p>See <a href="cont.html#Parameter-Sub">Parameter Substitution</a>.</p>
529
            </td>
961
            </td>
530
         </tr>
962
         </tr>
[25 lines collapsed]
556
   <p>For comparison, see the <a href="#exec">exec</a> command. Both the run and exec commands are useful for debugging because you can modify a Pig script in an editor and then rerun the script in the Grunt shell without leaving the shell. Also, both commands promote Pig script modularity as they allow you to reuse existing components.</p>
988
   <p>For comparison, see the <a href="#exec">exec</a> command. Both the run and exec commands are useful for debugging because you can modify a Pig script in an editor and then rerun the script in the Grunt shell without leaving the shell. Also, both commands promote Pig script modularity as they allow you to reuse existing components.</p>
557
  </section>
989
  </section>
558
   
990
   
559
   <section>
991
   <section>
560
   <title>Example</title>
992
   <title>Example</title>
561
   <p>In this example the script interacts with the results of commands issued via the Grunt shell.</p>
993
   <p>In this example, the script interacts with the results of commands issued via the Grunt shell.</p>
562
<source>
994
<source>
563
grunt&gt; cat myscript.pig
995
grunt&gt; cat myscript.pig
564
b = ORDER a BY name;
996
b = ORDER a BY name;
565
c = LIMIT b 10;
997
c = LIMIT b 10;
566

    
   
998

   
[7 lines collapsed]
574
(alice,20,2.47)
1006
(alice,20,2.47)
575
(alice,27,1.95)
1007
(alice,27,1.95)
576
(alice,36,2.27)
1008
(alice,36,2.27)
577
</source>
1009
</source>
578

    
   
1010

   
579
   <p>In this example parameter substitution is used with the run command.</p>
1011
   <p>In this example, parameter substitution is used with the run command.</p>
580
<source>
1012
<source>
581
grunt&gt; a = LOAD 'student' AS (name, age, gpa);
1013
grunt&gt; a = LOAD 'student' AS (name, age, gpa);
582

    
   
1014

   
583
grunt&gt; cat myscript.pig
1015
grunt&gt; cat myscript.pig
584
b = ORDER a BY name;
1016
b = ORDER a BY name;
[123 lines collapsed]
708
   </p>
1140
   </p>
709
   </section>
1141
   </section>
710
   
1142
   
711
   <section>
1143
   <section>
712
   <title>Examples</title>
1144
   <title>Examples</title>
713
   <p>In this example key value pairs are set at the command line.</p>
1145
   <p>In this example, key value pairs are set at the command line.</p>
714
<source>
1146
<source>
715
grunt&gt; SET debug 'on'
1147
grunt&gt; SET debug 'on'
716
grunt&gt; SET job.name 'my job'
1148
grunt&gt; SET job.name 'my job'
717
grunt&gt; SET default_parallel 100
1149
grunt&gt; SET default_parallel 100
718
</source>
1150
</source>
719

    
   
1151

   
720
<p>In this example default_parallel is set in the Pig script; all MapReduce jobs that get launched will use 20 reducers.</p>
1152
<p>In this example, default_parallel is set in the Pig script; all MapReduce jobs that get launched will use 20 reducers.</p>
721
<source>
1153
<source>
722
SET default_parallel 20;
1154
SET default_parallel 20;
723
A = LOAD 'myfile.txt' USING PigStorage() AS (t, u, v);
1155
A = LOAD 'myfile.txt' USING PigStorage() AS (t, u, v);
724
B = GROUP A BY t;
1156
B = GROUP A BY t;
725
C = FOREACH B GENERATE group, COUNT(A.t) as mycount;
1157
C = FOREACH B GENERATE group, COUNT(A.t) as mycount;
726
D = ORDER C BY mycount;
1158
D = ORDER C BY mycount;
727
STORE D INTO 'mysortedcount' USING PigStorage();
1159
STORE D INTO 'mysortedcount' USING PigStorage();
728
</source>
1160
</source>
729

    
   
1161

   
730
<p>In this example multiple key value pairs are set in the Pig script. These key value pairs are put in job-conf by Pig (making the pairs available to Pig and Hadoop). This is a script-wide setting; if a key value is defined multiple times in the script the last value will take effect and will be set for all jobs generated by the script. </p>
1162
<p>In this example, multiple key value pairs are set in the Pig script. These key value pairs are put in job-conf by Pig (making the pairs available to Pig and Hadoop). This is a script-wide setting; if a key value is defined multiple times in the script the last value will take effect and will be set for all jobs generated by the script. </p>
731
<source>
1163
<source>
732
...
1164
...
733
SET mapred.map.tasks.speculative.execution false; 
1165
SET mapred.map.tasks.speculative.execution false; 
734
SET pig.logfile mylogfile.log; 
1166
SET pig.logfile mylogfile.log; 
735
SET my.arbitrary.key my.arbitary.value; 
1167
SET my.arbitrary.key my.arbitary.value; 
[9 lines collapsed]
src/org/apache/pig/tools/grunt/GruntParser.java
Revision c785084 New Change
 