

Sqoop2: Devguide: Describe client API for embedding Sqoop client in applications

Review Request #10271 - Created April 3, 2013 and updated

Vasanth kumar RJ
sqoop-925
Reviewers
Sqoop
sqoop-sqoop2
Sqoop 2 client API Developer guide.

Also updated copyright information.
Done

Diff revision 5 (Latest)


  1. docs/src/site/sphinx/ClientAPI.rst
  2. docs/src/site/sphinx/conf.py
  3. docs/src/site/sphinx/index.rst
docs/src/site/sphinx/ClientAPI.rst
New File

    
   
.. Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.

    
   
======================
Sqoop Client API Guide
======================

This document explains how to use the Sqoop Client API from an external application. The Client API lets you execute the functions of the sqoop commands programmatically. It requires the Sqoop Client JAR and its dependencies.

The Client API is explained here using the Generic JDBC Connector properties. Before running an application that uses the Sqoop Client API, check that the Sqoop server is running.

    
   
Workflow
========

The following workflow must be followed to execute a job on the Sqoop server.

  1. Create a connection using a Connector ID (cid) - Creates the connection and returns a Connection ID (xid)
  2. Create a job using a Connection ID (xid)       - Creates the job and returns a Job ID (jid)
  3. Submit the job using a Job ID (jid)            - Submits the Sqoop job to the server

    
   
Project Dependencies
====================

Add the following Maven dependency to your project:

::

  <dependency>
    <groupId>org.apache.sqoop</groupId>
    <artifactId>sqoop-client</artifactId>
    <version>${requestedVersion}</version>
  </dependency>

    
   
Initialization
==============

First, initialize the SqoopClient class with the server URL as its argument.

::

  String url = "http://localhost:12000/sqoop/";
  SqoopClient client = new SqoopClient(url);

The server URL can be changed later by calling the setServerUrl(String) method.

::

  client.setServerUrl(newUrl);

    
   
Connection
==========

The Client API allows you to create, update and delete connections. Creating or updating a connection requires the connector forms and the framework forms. You have to retrieve the connector and framework forms, then update their values.

Create Connection
-----------------

First, create a new connection by invoking the newConnection(cid) method with a connector ID; it returns an MConnection object with a dummy ID. Then fill in the connection and framework forms as shown below, and invoke createConnection with the updated connection object.

::

  //Dummy connection object
  MConnection newCon = client.newConnection(1);

  //Get connection and framework forms. Set name for connection
  MConnectionForms conForms = newCon.getConnectorPart();
  MConnectionForms frameworkForms = newCon.getFrameworkPart();
  newCon.setName("MyConnection");

  //Set connection forms values
  conForms.getStringInput("connection.connectionString").setValue("jdbc:mysql://localhost/my");
  conForms.getStringInput("connection.jdbcDriver").setValue("com.mysql.jdbc.Driver");
  conForms.getStringInput("connection.username").setValue("root");
  conForms.getStringInput("connection.password").setValue("root");

  frameworkForms.getIntegerInput("security.maxConnections").setValue(0);

  Status status = client.createConnection(newCon);
  if (status.canProceed()) {
    System.out.println("Created. New Connection ID : " + newCon.getPersistenceId());
  } else {
    System.out.println("Check for status and forms error ");
  }

    
   
status.canProceed() returns true if the status is FINE or ACCEPTABLE. The status returned by the code above reflects the validation of the connector and framework forms.

On successful execution, a new connection ID is assigned to the connection. The getPersistenceId() method returns that ID.
You can retrieve a connection using the methods below.

+----------------------------+--------------------------------------+
|   Method                   | Description                          |
+============================+======================================+
| ``getConnection(xid)``     | Returns a connection object.         |
+----------------------------+--------------------------------------+
| ``getConnections()``       | Returns a list of connection objects |
+----------------------------+--------------------------------------+

    
   
List of status codes
--------------------

+------------------+------------------------------------------------------------------------------------------------------------+
| Status           | Description                                                                                                |
+==================+============================================================================================================+
| ``FINE``         | There are no issues, no warnings.                                                                          |
+------------------+------------------------------------------------------------------------------------------------------------+
| ``ACCEPTABLE``   | Validated entity is correct enough to be processed. There might be some warnings, but no errors.           |
+------------------+------------------------------------------------------------------------------------------------------------+
| ``UNACCEPTABLE`` | There are serious issues with validated entity. We cannot proceed until reported issues are resolved.      |
+------------------+------------------------------------------------------------------------------------------------------------+
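The relationship between these status codes and canProceed() can be sketched with a small stand-in enum. This is only an illustration under stated assumptions, not the Sqoop source: ``StatusSketch``, its nested ``Status`` enum and the ``advise`` helper are hypothetical names that mirror the three codes in the table and the rule that FINE and ACCEPTABLE allow processing to continue.

```java
// Hypothetical stand-in for illustration only; it is NOT the Sqoop Status class.
class StatusSketch {

  // Mirrors the three status codes listed in the table above.
  enum Status {
    FINE,
    ACCEPTABLE,
    UNACCEPTABLE;

    // True for FINE and ACCEPTABLE, as the guide states for canProceed().
    boolean canProceed() {
      return this == FINE || this == ACCEPTABLE;
    }
  }

  // Describes what a caller would typically do for each status (assumed wording).
  static String advise(Status status) {
    switch (status) {
      case FINE:
        return "proceed";
      case ACCEPTABLE:
        return "proceed, but review warnings";
      default:
        return "fix errors before retrying";
    }
  }

  public static void main(String[] args) {
    for (Status status : Status.values()) {
      System.out.println(status + " -> canProceed=" + status.canProceed()
          + ", " + advise(status));
    }
  }
}
```

Checking canProceed() first and only then inspecting the individual form messages keeps the common success path short, which is the pattern the connection and job examples in this guide follow.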

    
   
View Error or Warning message
-----------------------------

For an ACCEPTABLE or UNACCEPTABLE status, you have to iterate over the connector part forms and the framework part forms to get the actual warning or error messages. The code below shows how to iterate over the forms and print the message of each input.

::

  printMessage(newCon.getConnectorPart().getForms());
  printMessage(newCon.getFrameworkPart().getForms());

  private static void printMessage(List<MForm> formList) {
    for (MForm form : formList) {
      List<MInput<?>> inputlist = form.getInputs();
      if (form.getValidationMessage() != null) {
        System.out.println("Form message: " + form.getValidationMessage());
      }
      for (MInput<?> minput : inputlist) {
        if (minput.getValidationStatus() == Status.ACCEPTABLE) {
          System.out.println("Warning:" + minput.getValidationMessage());
        } else if (minput.getValidationStatus() == Status.UNACCEPTABLE) {
          System.out.println("Error:" + minput.getValidationMessage());
        }
      }
    }
  }

    
   
Job
===

A job object holds the database configuration, the input or output configuration, and the resources required to execute it as a Hadoop job. Creating a job object requires filling in the connector part and framework part forms.

The code below shows how to create an import job.

    
   
::

  String url = "http://localhost:12000/sqoop/";
  SqoopClient client = new SqoopClient(url);
  //Creating dummy job object
  MJob newjob = client.newJob(1, org.apache.sqoop.model.MJob.Type.IMPORT);
  MJobForms connectorForm = newjob.getConnectorPart();
  MJobForms frameworkForm = newjob.getFrameworkPart();

  newjob.setName("ImportJob");
  //Database configuration
  connectorForm.getStringInput("table.schemaName").setValue("");
  //Input either table name or sql
  connectorForm.getStringInput("table.tableName").setValue("table");
  //connectorForm.getStringInput("table.sql").setValue("select id,name from table where ${CONDITIONS}");
  connectorForm.getStringInput("table.columns").setValue("id,name");
  connectorForm.getStringInput("table.partitionColumn").setValue("id");
  //Set boundary value only if required
  //connectorForm.getStringInput("table.boundaryQuery").setValue("");

  //Output configurations
  frameworkForm.getEnumInput("output.storageType").setValue("HDFS");
  frameworkForm.getEnumInput("output.outputFormat").setValue("TEXT_FILE"); //Other option: SEQUENCE_FILE
  frameworkForm.getStringInput("output.outputDirectory").setValue("/output");

  //Job resources
  frameworkForm.getIntegerInput("throttling.extractors").setValue(1);
  frameworkForm.getIntegerInput("throttling.loaders").setValue(1);

  Status status = client.createJob(newjob);
  if (status.canProceed()) {
    System.out.println("New Job ID: " + newjob.getPersistenceId());
  } else {
    System.out.println("Check for status and forms error ");
  }

  //Print errors or warnings
  printMessage(newjob.getConnectorPart().getForms());
  printMessage(newjob.getFrameworkPart().getForms());

    
   
Creating an export job is the same as creating an import job; only a few input configuration values change.

::

  String url = "http://localhost:12000/sqoop/";
  SqoopClient client = new SqoopClient(url);
  MJob newjob = client.newJob(1, org.apache.sqoop.model.MJob.Type.EXPORT);
  MJobForms connectorForm = newjob.getConnectorPart();
  MJobForms frameworkForm = newjob.getFrameworkPart();

  newjob.setName("ExportJob");
  //Database configuration
  connectorForm.getStringInput("table.schemaName").setValue("");
  //Input either table name or sql
  connectorForm.getStringInput("table.tableName").setValue("table");
  //connectorForm.getStringInput("table.sql").setValue("select id,name from table where ${CONDITIONS}");
  connectorForm.getStringInput("table.columns").setValue("id,name");

  //Input configurations
  frameworkForm.getStringInput("input.inputDirectory").setValue("/input");

  //Job resources
  frameworkForm.getIntegerInput("throttling.extractors").setValue(1);
  frameworkForm.getIntegerInput("throttling.loaders").setValue(1);

  Status status = client.createJob(newjob);
  if (status.canProceed()) {
    System.out.println("New Job ID: " + newjob.getPersistenceId());
  } else {
    System.out.println("Check for status and forms error ");
  }

  //Print errors or warnings
  printMessage(newjob.getConnectorPart().getForms());
  printMessage(newjob.getFrameworkPart().getForms());

    
   
Managing connection and job
---------------------------
After creating a connection or job object, you can update or delete it using the functions below.

+----------------------------------+------------------------------------------------------------------------------------+
|   Method                         | Description                                                                        |
+==================================+====================================================================================+
| ``updateConnection(connection)`` | Invoke update with connection object and check status for any errors or warnings   |
+----------------------------------+------------------------------------------------------------------------------------+
| ``deleteConnection(xid)``        | Delete connection. Deletes only if specified connection is not used by any job     |
+----------------------------------+------------------------------------------------------------------------------------+
| ``updateJob(job)``               | Invoke update with job object and check status for any errors or warnings          |
+----------------------------------+------------------------------------------------------------------------------------+
| ``deleteJob(jid)``               | Delete job                                                                         |
+----------------------------------+------------------------------------------------------------------------------------+

    
   
Job Submission
==============

Job submission requires a job ID. On successful submission, the getStatus() method returns "BOOTING" or "RUNNING".

::

  //Job submission start
  MSubmission submission = client.startSubmission(1);
  System.out.println("Status : " + submission.getStatus());
  if (submission.getStatus().isRunning() && submission.getProgress() != -1) {
    System.out.println("Progress : " + String.format("%.2f %%", submission.getProgress() * 100));
  }
  System.out.println("Hadoop job id :" + submission.getExternalId());
  System.out.println("Job link : " + submission.getExternalLink());
  Counters counters = submission.getCounters();
  if (counters != null) {
    System.out.println("Counters:");
    for (CounterGroup group : counters) {
      System.out.print("\t");
      System.out.println(group.getName());
      for (Counter counter : group) {
        System.out.print("\t\t");
        System.out.print(counter.getName());
        System.out.print(": ");
        System.out.println(counter.getValue());
      }
    }
  }
  if (submission.getExceptionInfo() != null) {
    System.out.println("Exception info : " + submission.getExceptionInfo());
  }


  //Check job status (reuses the submission variable declared above)
  submission = client.getSubmissionStatus(1);
  if (submission.getStatus().isRunning() && submission.getProgress() != -1) {
    System.out.println("Progress : " + String.format("%.2f %%", submission.getProgress() * 100));
  }

  //Stop a running job; the call is made on the client with the job ID
  client.stopSubmission(1);
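The progress formatting used twice in the snippet above can be pulled into one helper. The sketch below is a minimal, standalone illustration: ``ProgressFormatSketch`` and ``formatProgress`` are hypothetical names, not part of the Sqoop client, and treating -1 as "progress not available" is an assumption read off the ``!= -1`` check in the code above.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Sqoop client API.
class ProgressFormatSketch {

  // Formats a progress fraction (0.0 to 1.0) as a percentage string.
  // Assumption: -1 means "progress not available", as the check above suggests.
  static String formatProgress(double progress) {
    if (progress == -1) {
      return "n/a";
    }
    // Locale.ROOT keeps the decimal separator a dot on every machine.
    return String.format(Locale.ROOT, "%.2f %%", progress * 100);
  }

  public static void main(String[] args) {
    System.out.println("Progress : " + formatProgress(0.5));
    System.out.println("Progress : " + formatProgress(-1));
  }
}
```

Passing an explicit locale is a design choice here: the two-argument String.format in the original snippet uses the default locale, so the decimal separator may differ between machines.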

    
   
Describe Forms
==============

You can view the input values of connection or job forms, labelled using the built-in resource bundle.

::

  String url = "http://localhost:12000/sqoop/";
  SqoopClient client = new SqoopClient(url);
  //Use getJob(jid) for describing job.
  //While printing connection forms, pass connector id to getResourceBundle(cid).
  describe(client.getConnection(1).getConnectorPart().getForms(), client.getResourceBundle(1));
  describe(client.getConnection(1).getFrameworkPart().getForms(), client.getFrameworkResourceBundle());

  void describe(List<MForm> forms, ResourceBundle resource) {
    for (MForm mf : forms) {
      System.out.println(resource.getString(mf.getLabelKey()) + ":");
      List<MInput<?>> mis = mf.getInputs();
      for (MInput<?> mi : mis) {
        System.out.println(resource.getString(mi.getLabelKey()) + " : " + mi.getValue());
      }
      System.out.println();
    }
  }
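The label lookup in describe() relies on the standard java.util.ResourceBundle contract: each label key is resolved to display text with getString(). The standalone sketch below shows that mechanism with an in-memory bundle. Everything in it is an assumption for illustration: ``DescribeSketch``, ``DemoBundle``, the key names, the labels and the values are hypothetical and do not come from a Sqoop server.

```java
import java.util.LinkedHashMap;
import java.util.ListResourceBundle;
import java.util.Map;
import java.util.ResourceBundle;

// Illustration only: keys, labels and values below are hypothetical.
class DescribeSketch {

  // A tiny in-memory bundle standing in for the one returned by the server.
  static class DemoBundle extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
      return new Object[][] {
        {"connection.jdbcDriver.label", "JDBC Driver Class"},
        {"connection.username.label", "Username"},
      };
    }
  }

  // Resolves each label key through the bundle and pairs it with its value.
  static String describe(Map<String, String> inputs, ResourceBundle resource) {
    StringBuilder out = new StringBuilder();
    for (Map.Entry<String, String> input : inputs.entrySet()) {
      out.append(resource.getString(input.getKey()))
         .append(" : ")
         .append(input.getValue())
         .append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the inputs in insertion order when printed.
    Map<String, String> inputs = new LinkedHashMap<>();
    inputs.put("connection.jdbcDriver.label", "com.mysql.jdbc.Driver");
    inputs.put("connection.username.label", "root");
    System.out.print(describe(inputs, new DemoBundle()));
  }
}
```

Note that getString() throws MissingResourceException for an unknown key, so a label key that is absent from the bundle fails loudly rather than printing an empty label.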

    
   
The Sqoop 2 Client API tutorial above explained how to create a connection, create a job and submit a job.
docs/src/site/sphinx/conf.py
Revision 642d065 New Change
 
docs/src/site/sphinx/index.rst
Revision 07f1c31 New Change
 