HP IDOL OnDemand Hello World Tutorial

Tuesday, July 8, 2014

Ruby call HP IDOL OnDemand

Hewlett Packard has developed a set of JSON-based REST APIs that provide “Big Data”-style processing capabilities, allowing developers to process information embedded in unstructured text and images in previously inaccessible formats. This platform is called IDOL OnDemand; the APIs are published at https://www.idolondemand.com/developer/apis

In this post, I will use Ruby to call the HP IDOL OnDemand APIs. Three APIs are used for demonstration: Find Similar, OCR Document, and Sentiment Analysis.

First, we define a method that sends a request to IDOL OnDemand and returns the response. It accepts the API name and the parameter data, and returns the parsed JSON object.

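 require 'net/http'
 require 'json'
 # send a GET request to the given IDOL OnDemand API and return the parsed JSON
 def getResponse(apiName, data)
     uri = URI("http://api.idolondemand.com/1/api/sync/%s/v1" % apiName)
     uri.query = URI.encode_www_form(data)
     res = Net::HTTP.get_response(uri)
     return JSON.parse(res.body)
 end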

To call the IDOL OnDemand APIs, we need an API key, since the apiKey parameter is required for every request sent to OnDemand. Sign up on the IDOL OnDemand developer page (https://www.idolondemand.com/developer/apis) to get one. In this tutorial, we will find text similar to the words "Hello World" in wiki pages; for detailed request and response information, see https://www.idolondemand.com/developer/apis/findsimilar . Besides the apiKey parameter, we need to pass "text=Hello World" and set "print=all" to get the text content of each wiki page. Here is the code:

 # get the response of Find Similar API with text "Hello World"
 jsonResponse = getResponse("findsimilar",{:text => "Hello World", :print => "all", :apiKey => apiKey })

The JSON response is now stored in jsonResponse. For the detailed response format, see https://www.idolondemand.com/developer/apis/findsimilar#response . We can extract and print the reference and content with the following code:

 result = ""
 jsonResponse["documents"].each do |item|
   result += item["reference"] + "\n"
   # keep only the beginning of each document's content
   result += item["content"][0..100] + "\n"
 end
 puts result

Calling the OCR Document and Sentiment Analysis APIs works the same way as above.
The OCR Document API uses the following URL format:
http://api.idolondemand.com/1/api/sync/ocrdocument/v1?apiKey={apiKey}&url={url}

The url parameter will be an image url: http://www.java-made-easy.com/images/hello-world.jpg

The API extracts the text content from the image. For details about the API, see:

https://www.idolondemand.com/developer/apis/ocrdocument#overview

The Sentiment Analysis API uses the following URL format:

http://api.idolondemand.com/1/api/sync/analyzesentiment/v1?apiKey={apiKey}&url={url}

In this tutorial, the API gives us the sentiment score and rating for the wiki page: http://en.wikipedia.org/wiki/Hello_world_program

For details about the API, see:

https://www.idolondemand.com/developer/apis/analyzesentiment#overview

Monday, July 7, 2014

Objective C call HP IDOL OnDemand

In this post, I will use Objective C to call the HP IDOL OnDemand APIs. Three APIs are used for demonstration: Find Similar, OCR Document, and Sentiment Analysis.

First, create an Objective C class called IdolCall, which contains a method that sends a request to IDOL OnDemand and returns the response, as well as a utility method that encodes parameters for use in a URL.

 #import <Foundation/Foundation.h>

 @interface IdolCall : NSObject
 - (NSDictionary *)getResponse:(NSString *)urlStr;
 - (NSString *)getEncodedParam:(NSString *)paramStr;
 @end

 @implementation IdolCall
 // send request to given url and get the json response
 - (NSDictionary *)getResponse:(NSString *)urlStr {
   NSURL *url = [NSURL URLWithString:urlStr];
   NSData *data = [NSData dataWithContentsOfURL:url];
   // JSONObjectWithData: raises an exception on nil data, so stop here if the download failed
   if (data == nil) {
     return nil;
   }
   NSDictionary *dic = [NSJSONSerialization JSONObjectWithData:data options:NSJSONReadingAllowFragments error:nil];
   return dic;
 }
 // encode the parameter string
 - (NSString *)getEncodedParam:(NSString *)paramStr {
   return (NSString *)CFBridgingRelease(CFURLCreateStringByAddingPercentEscapes(NULL, (__bridge CFStringRef)paramStr, NULL, (__bridge CFStringRef)@"!*'\"();:@&=+$,/?%#[]% ", kCFStringEncodingUTF8));
 }
 @end

 // your apiKey goes here  
 NSString * apiKey = @"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";  
 // define the API urls which will be used later.  
 NSString * findSimilar = @"http://api.idolondemand.com/1/api/sync/findsimilar/v1";  
 NSString * ocrDocument = @"http://api.idolondemand.com/1/api/sync/ocrdocument/v1";  
 NSString * analyzeSentiment = @"http://api.idolondemand.com/1/api/sync/analyzesentiment/v1";  
 // create IdolCall  
 IdolCall *idolCall = [[IdolCall alloc]init];

In this tutorial, we will find text similar to the words "Hello World" in wiki pages; for detailed request and response information, see https://www.idolondemand.com/developer/apis/findsimilar . Besides the apiKey parameter, we need to pass "text=Hello World" and set "print=all" to get the text content of each wiki page. Here is the code:

 // encode the parameter  
 NSString * encoded = [idolCall getEncodedParam:@"Hello World"];  
 // construct the url  
 NSString * url = [NSString stringWithFormat:@"%@?apiKey=%@&text=%@&print=all", findSimilar, apiKey, encoded];  
 // send the request to IDOL OnDemand  
 NSDictionary *dic = [idolCall getResponse:url];

The JSON response is now stored in the NSDictionary dic. For the detailed response format, see https://www.idolondemand.com/developer/apis/findsimilar#response . We can extract and print the reference and content with the following code:

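 NSArray *documents = [dic objectForKey:@"documents"];
 for (NSDictionary *doc in documents) {
   NSString *ref = [doc objectForKey:@"reference"];
   NSString *content = [doc objectForKey:@"content"];
   NSLog(@"%@", ref);
   // print at most the first 100 characters; substringToIndex: throws if the index exceeds the length
   NSLog(@"%@", [content substringToIndex:MIN((NSUInteger)100, content.length)]);
 }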

For the OCR Document API, the url parameter will be an image URL: http://www.java-made-easy.com/images/hello-world.jpg

The API extracts the text content from the image. For details about the API, see:

https://www.idolondemand.com/developer/apis/ocrdocument#overview

The Sentiment Analysis API uses the following URL format:

http://api.idolondemand.com/1/api/sync/analyzesentiment/v1?apiKey={apiKey}&url={url}

In this tutorial, the API gives us the sentiment score and rating for the wiki page: http://en.wikipedia.org/wiki/Hello_world_program

For details about the API, see:

https://www.idolondemand.com/developer/apis/analyzesentiment#overview

Tuesday, June 17, 2014

Python call OnDemand indexing and search

It's quite simple to call the HP IDOL OnDemand APIs from Python.
In Python 3.2.3, just use urllib to send the HTTP/HTTPS request and the json module to parse the result.
import json
import urllib.parse
import urllib.request

I define this method to send the request and parse the JSON result.
def getJson(url, params):
    data = urllib.parse.urlencode(params)
    data = data.encode('utf-8')
    # passing data makes this a POST request
    req = urllib.request.Request(url, data)
    response = urllib.request.urlopen(req, timeout=500000)
    jsonstr = response.read().decode("utf-8", 'ignore')
    return json.loads(jsonstr)
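
To make it clearer what getJson puts on the wire, here is a minimal sketch of just the encoding step, run without any network access. The helper name encode_params and the placeholder key 'YOUR_API_KEY' are assumptions for illustration; they are not part of the original post.

```python
import urllib.parse

def encode_params(params):
    """Return the URL-encoded, UTF-8 byte string that would be sent as the POST body."""
    return urllib.parse.urlencode(params).encode('utf-8')

# 'YOUR_API_KEY' is a placeholder (assumption), not a real key.
body = encode_params({'text': 'Hello World', 'print': 'all', 'apikey': 'YOUR_API_KEY'})

# Spaces are encoded as '+', and each key=value pair is joined with '&'.
print(body)
```

Because urlencode handles the escaping, parameter values such as document URLs can be passed to getJson as plain strings.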

The following APIs will be used:
createtextindex_url = 'https://api.idolondemand.com/1/api/sync/createtextindex/v1'
storeobject_url = 'https://api.idolondemand.com/1/api/sync/storeobject/v1'
viewdocument_url = 'https://api.idolondemand.com/1/api/sync/viewdocument/v1'
findrelated_url = 'https://api.idolondemand.com/1/api/sync/findrelatedconcepts/v1'
findsimilar_url = 'https://api.idolondemand.com/1/api/sync/findsimilar/v1'

The following parameters will be used:
params_createtextindex = {'index' : text, 'flavor' : 'explorer', 'apikey' : apikey }
params_storeobject = {'url' : doc_url, 'apikey' : apikey }
params_viewdocument = {'url' : doc_url, 'highlight_expression': 'physical activity', 'start_tag': '<b>', 'apikey' : apikey }

params_findrelated = {'url' : doc_url, 'apikey' : apikey }
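
The post does not show how the URLs and parameter dictionaries are combined, so the following is only a sketch of one possible wiring, not necessarily the original author's. The helper run_calls, the stub fake_get_json, and the values of apikey and doc_url are all hypothetical. The stub stands in for getJson so that the sketch runs without network access.

```python
def run_calls(get_json, calls):
    """Call get_json(url, params) for each (name, url, params) entry, in order,
    and return the parsed results keyed by name."""
    results = {}
    for name, url, params in calls:
        results[name] = get_json(url, params)
    return results

# Placeholder values (assumptions): substitute your own key and document URL.
apikey = 'YOUR_API_KEY'
doc_url = 'http://example.com/document.html'

base = 'https://api.idolondemand.com/1/api/sync/'
calls = [
    ('storeobject', base + 'storeobject/v1',
     {'url': doc_url, 'apikey': apikey}),
    ('viewdocument', base + 'viewdocument/v1',
     {'url': doc_url, 'highlight_expression': 'physical activity',
      'start_tag': '<b>', 'apikey': apikey}),
    ('findrelated', base + 'findrelatedconcepts/v1',
     {'url': doc_url, 'apikey': apikey}),
]

# Stub in place of getJson: it only echoes back what it was asked to send.
def fake_get_json(url, params):
    return {'url': url, 'param_names': sorted(params)}

results = run_calls(fake_get_json, calls)
for name in results:
    print(name, results[name]['url'])
```

Passing getJson instead of fake_get_json would send the real requests, given a valid API key.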


Monday, June 16, 2014

Nodejs call HP IDOL OnDemand

Hewlett Packard has developed a set of JSON-based REST APIs that provide “Big Data”-style processing capabilities, allowing developers to process information embedded in unstructured text and images in previously inaccessible formats. This platform is called IDOL OnDemand; the APIs are published at https://www.idolondemand.com/developer/apis

In this post, I will use Node.js to call the HP IDOL OnDemand APIs. Three APIs are used for demonstration: Find Similar, OCR Document, and Analyze Sentiment.

Since these APIs are all REST-based and require authorization, https.get (http://nodejs.org/api/https.html#https_https_get_options_callback) will be used to send requests to the HP IDOL OnDemand server and get the JSON result.

First, we need to set the request headers, which are common to all requests:

// Set the request headers
var headers = {
  'User-Agent': 'Super Agent/0.0.1',
  'Content-Type': 'application/x-www-form-urlencoded'
};

Then we construct the request data. The apiKey parameter is required for every request; sign up on the IDOL OnDemand developer page (https://www.idolondemand.com/developer/apis) to get one. A call to the OCR Document API only needs the 'url' and 'apiKey' parameters (see https://www.idolondemand.com/developer/apis/ocrdocument/#request), so the full request URL will be:
https://api.idolondemand.com/1/api/sync/ocrdocument/v1?url={url_value}&apikey={apikey_value}
Note that the value of the url parameter must be encoded, which can be done with the encodeURIComponent method in JavaScript.
/**
 * Get the OCR document options.
 * @param url the image url to extract the text.
 */
var get_ocr_options = function (url) {
  return {
    host: 'api.idolondemand.com',
    port: 443,
    path: '/1/api/sync/ocrdocument/v1?url=' + encodeURIComponent(url).replace(/%20/g,'+') + '&apikey=' + apikey,
    headers: headers
  };
};

Now we send the request to the server and handle the response data in the callback function. The response data is in JSON format; for its structure, see https://www.idolondemand.com/developer/apis/ocrdocument/#response . When the response ends, we parse the 'text_block' field, which holds the extracted text. We use the built-in JSON object to parse the response data:
var https = require('https');

// https.get calls req.end() automatically
var req = https.get(get_ocr_options(image_url), function(response) {
  // accumulate the response body chunk by chunk
  var str = '';
  response.on('data', function (chunk) {
    str += chunk;
  });
  response.on('end', function () {
    var json = JSON.parse(str);
    console.log(json.text_block[0].text);
  });
});

Calling the Find Similar API is similar to calling the OCR Document API; the only differences are the API path and the request parameters. According to https://www.idolondemand.com/developer/apis/findsimilar/#request , besides the 'text' and 'apiKey' parameters, we need to set 'print=all' if we want to get the text content. So the full request URL will be:
https://api.idolondemand.com/1/api/sync/findsimilar/v1?text={text_value}&print=all&apikey={apikey_value}
Note that the text_value should also be encoded with the encodeURIComponent method.
/**
 * Get the Find Similar options.
 * @param text The text content to process.
 */
var get_findsimilar_options = function (text) {
  return {
    host: 'api.idolondemand.com',
    port: 443,
    path: '/1/api/sync/findsimilar/v1?text=' + encodeURIComponent(text).replace(/%20/g,'+') + '&print=all&apikey=' + apikey,
    headers: headers
  };
};

The Analyze Sentiment API call needs the text output of the Find Similar call, which is the content of a wiki article and contains many words. I save that content to a local file and then post the file content to the Analyze Sentiment API. The Analyze Sentiment post URL will be:
https://api.idolondemand.com/1/api/sync/analyzesentiment/v1
The following code block shows how to post a local file to that URL:
var r = request.post(analyzesentiment_post_url, function optionalCallback (err, httpResponse, body) {
  if (err) {
    return console.error(err);
  }
  var json = JSON.parse(body);
  // output the score and rating
  console.log("Score:" + json.aggregate.score + " Rating:" + json.aggregate.sentiment);
});
// create form to post data
var form = r.form();
form.append('apiKey', apikey);
form.append('file', fs.createReadStream(path.join(__dirname, file)));

To keep the API calls in order (Find Similar, then OCR Document, then Analyze Sentiment), I use async.waterfall (https://github.com/caolan/async#waterfall), which is widely used by Node.js developers. The calling sequence can be controlled with the following code block:

async.waterfall([
  function(callback){
    // send request to Find Similar API
    callback(null, response);
  },
  function(response, callback){
    // parse the response and output data
    callback(null);
  },
  function(callback){
    // send request to OCR Document API
    callback(null, response);
  },
  function(response, callback){
    // parse the response and output data
    callback(null);
  },
  function(callback){
    // send request to Analyze Sentiment
    callback(null, response);
  },
  function(response, callback){
    // parse the response and output data
    callback(null);
  }
], function (err, result) {
});