public class WebTableReader extends NutchTool implements org.apache.hadoop.util.Tool
Modifier and Type | Class and Description |
---|---|
static class | WebTableReader.WebTableRegexMapper: Filters the entries from the table based on a regex. |
static class | WebTableReader.WebTableStatCombiner |
static class | WebTableReader.WebTableStatMapper |
static class | WebTableReader.WebTableStatReducer |
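
The regex filtering described for WebTableReader.WebTableRegexMapper can be illustrated with a small standalone sketch. This is not the Nutch implementation: the class `RegexEntryFilter`, the method `filterKeys`, and the use of a plain `Map` in place of the web table are all hypothetical, and the sketch assumes that an entry is kept only when its key (a URL) matches the regex in full.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; it does not use or reproduce any Nutch class.
public class RegexEntryFilter {
    private final Pattern pattern;

    public RegexEntryFilter(String regex) {
        // Compile once and reuse the pattern for every entry.
        this.pattern = Pattern.compile(regex);
    }

    // Returns the keys that match the regex in full, in iteration order.
    // Assumption: full match (matches()), not a substring search (find()).
    public List<String> filterKeys(Map<String, String> entries) {
        List<String> kept = new ArrayList<>();
        for (String key : entries.keySet()) {
            if (pattern.matcher(key).matches()) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // A LinkedHashMap stands in for the table so the order is predictable.
        Map<String, String> table = new LinkedHashMap<>();
        table.put("http://example.org/a.html", "page A");
        table.put("http://example.org/b.pdf", "page B");
        table.put("http://other.example.com/", "page C");

        RegexEntryFilter filter = new RegexEntryFilter("http://example\\.org/.*");
        // Prints [http://example.org/a.html, http://example.org/b.pdf]
        System.out.println(filter.filterKeys(table));
    }
}
```

Compiling the pattern once in the constructor avoids recompiling it for each entry, which matters when the same filter is applied to a large number of rows.
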
Modifier and Type | Field and Description |
---|---|
static org.slf4j.Logger | LOG |

Fields inherited from class NutchTool: currentJob, currentJobNum, numJobs, results, status
Constructor and Description |
---|
WebTableReader() |
Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
void | processDumpJob(String output, org.apache.hadoop.conf.Configuration config, String regex, boolean content, boolean headers, boolean links, boolean text) |
void | processStatJob(boolean sort) |
Map<String,Object> | run(Map<String,Object> args): Runs the tool, using a map of arguments. |
int | run(String[] args) |

Methods inherited from class NutchTool: getProgress, getStatus, killJob, stopJob
public void processDumpJob(String output, org.apache.hadoop.conf.Configuration config, String regex, boolean content, boolean headers, boolean links, boolean text) throws IOException, ClassNotFoundException, InterruptedException

public int run(String[] args) throws Exception
Specified by: run in interface org.apache.hadoop.util.Tool
Throws: Exception

Copyright © 2015 The Apache Software Foundation