public class ParseUtil
extends org.apache.hadoop.conf.Configured
Utility class that uses Parsers to obtain Parse objects.

Modifier and Type | Field and Description |
---|---|
static org.slf4j.Logger | LOG |
Constructor and Description |
---|
ParseUtil(org.apache.hadoop.conf.Configuration conf) |
Modifier and Type | Method and Description |
---|---|
org.apache.hadoop.conf.Configuration | getConf() |
Parse | parse(String url, WebPage page) |
void | process(String key, WebPage page) Parses the given web page and stores the parsed content within the page. |
void | setConf(org.apache.hadoop.conf.Configuration conf) |
public ParseUtil(org.apache.hadoop.conf.Configuration conf)
conf - the Hadoop configuration
public org.apache.hadoop.conf.Configuration getConf()
Specified by: getConf in interface org.apache.hadoop.conf.Configurable
Overrides: getConf in class org.apache.hadoop.conf.Configured
public void setConf(org.apache.hadoop.conf.Configuration conf)
Specified by: setConf in interface org.apache.hadoop.conf.Configurable
Overrides: setConf in class org.apache.hadoop.conf.Configured
public Parse parse(String url, WebPage page) throws ParserNotFound, ParseException
Iterates through Parsers until a successful parse is performed and a Parse object is returned. If the parse is unsuccessful, a message is logged at the WARNING level, and an empty parse is returned.
ParserNotFound - If no suitable parser is found.
ParseException - If there is an error parsing.
Copyright © 2015 The Apache Software Foundation
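
The fallback behavior described for parse(String url, WebPage page) can be illustrated with a minimal, self-contained sketch. It is not the Nutch implementation: SimpleParse, SimpleParser, and parseWithFallback are hypothetical stand-ins for Parse, Parser, and the parse method, java.util.logging replaces the slf4j LOG field, and an IllegalStateException stands in for ParserNotFound. The sketch also assumes that a parser which throws is skipped in favor of the next one, which may differ from how the real method reports a ParseException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

/** Illustrative sketch only; none of these types are the real Nutch classes. */
class ParseFallbackSketch {

    /** Hypothetical stand-in for a parse result. */
    static final class SimpleParse {
        final String text;
        final boolean success;

        SimpleParse(String text, boolean success) {
            this.text = text;
            this.success = success;
        }

        /** An "empty parse": no text, marked unsuccessful. */
        static SimpleParse empty() {
            return new SimpleParse("", false);
        }
    }

    /** Hypothetical stand-in for a parser plugin. */
    interface SimpleParser {
        SimpleParse parse(String url, String content) throws Exception;
    }

    private static final Logger LOG =
        Logger.getLogger(ParseFallbackSketch.class.getName());

    /**
     * Tries each parser in order and returns the first successful result.
     * If every parser fails, logs a warning and returns an empty parse.
     */
    static SimpleParse parseWithFallback(String url, String content,
                                         List<SimpleParser> parsers) {
        if (parsers.isEmpty()) {
            // Assumption: stands in for the documented ParserNotFound case.
            throw new IllegalStateException("No suitable parser found for " + url);
        }
        for (SimpleParser parser : parsers) {
            try {
                SimpleParse result = parser.parse(url, content);
                if (result != null && result.success) {
                    return result;
                }
            } catch (Exception e) {
                // Assumption: a failing parser is logged and the next one is tried.
                LOG.warning("Parser failed for " + url + ": " + e);
            }
        }
        LOG.warning("Unable to successfully parse " + url);
        return SimpleParse.empty();
    }

    public static void main(String[] args) {
        List<SimpleParser> parsers = Arrays.asList(
            (url, content) -> { throw new IllegalArgumentException("unsupported content"); },
            (url, content) -> new SimpleParse(content.trim(), true));
        SimpleParse parse = parseWithFallback("http://example.org/", "  hello  ", parsers);
        System.out.println(parse.success + " " + parse.text);
    }
}
```

Returning an empty result instead of null lets callers handle a failed parse without a null check, which matches the "empty parse is returned" wording above.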