public class AmaExtensions
extends java.lang.Object
The oneWEX XSL extension functions are implemented as Xalan Java extensions. To enable them, add the following namespace declaration to the stylesheet.
xmlns:ama="xalan://com.ibm.es.ama.zing.converter.converters.xslt.AmaExtensions"
Modifier and Type | Method and Description |
---|---|
static java.lang.String | concat(org.w3c.dom.traversal.NodeIterator ns, java.lang.String separator): The ama:concat function returns a string concatenation of the set of values contained in the expression argument. |
static void | crawlEnqueueUrl(java.lang.String url): During dataset ingestion, calling this function adds a URL to the crawl queue and schedules it for a crawl. |
static java.lang.String | documentGroupId(): Returns the document's group-id. |
static java.lang.String | documentUrl(): Returns the document's URL. |
static java.lang.String | formatDate(long seconds, java.lang.String format): The ama:formatDate function requires one argument, which is the date/time (expressed in Unix date-seconds) that is to be formatted. |
static java.lang.Object | get(java.lang.String key): Returns the node-set associated with a string (see ama:set). |
static java.lang.Object | ifElse(boolean test, java.lang.Object object1, java.lang.Object object2): Simple if-else statement (which does not exist in XPath 1.0). |
static void | log(java.lang.String message) |
static void | log(java.lang.String message, java.lang.String level): The ama:log function outputs message to the system log. |
static java.lang.String | nodeToStr(org.w3c.dom.Node node) |
static java.lang.String | nodeToStr(org.w3c.dom.Node node, boolean prettyPrint, boolean preamble): Outputs an XML nodeset as a string. |
static java.lang.String | nodeToStr(org.w3c.dom.traversal.NodeIterator ns) |
static java.lang.String | nodeToStr(org.w3c.dom.traversal.NodeIterator ns, boolean prettyPrint, boolean preamble) |
static long | parseDate(java.lang.String dateStr, java.lang.String format): Attempt to parse a string and extract a date from it. |
static java.lang.String | replace(java.lang.String str, java.lang.String pattern, java.lang.String newStr): Replace the substrings in str matching the java.util.regex regular expression pattern with newStr. |
static java.lang.String | replace(java.lang.String str, java.lang.String pattern, java.lang.String newStr, java.lang.String option) |
static void | set(java.lang.String key, java.lang.Object nodeSet): Associate a node-set with a string. |
static java.lang.String | strToLower(java.lang.String str): Convert the UTF-8 string str to lowercase. |
static java.lang.String | strToUpper(java.lang.String str): Convert the UTF-8 string str to uppercase. |
static boolean | test(java.lang.String str, java.lang.String pattern, java.lang.String patternType): Test if a string matches a certain pattern. |
static java.lang.String | urlResolveBase(java.lang.String url, java.lang.String base): Convert a relative URL url into a fully qualified URL using a base URL base. |
public static java.lang.String documentUrl()
public static java.lang.String documentGroupId()
public static void log(java.lang.String message, java.lang.String level)
The ama:log function outputs message to the system log.
message - the message to be logged.
level - a string representing the log level. The default is 'INFO'. See java.util.logging.Level.
public static void log(java.lang.String message)
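Because the level argument refers to java.util.logging.Level, a level string can be resolved with the standard Level.parse call. The following is a minimal sketch of how the documented default of 'INFO' might be applied; the class name LogLevelSketch, the method resolveLevel, and the fallback for unknown names are assumptions for illustration and are not part of AmaExtensions.

```java
import java.util.Locale;
import java.util.logging.Level;

// Hypothetical helper for illustration only; not the oneWEX implementation.
class LogLevelSketch {
    // Resolve a level name such as "WARNING". Falling back to INFO for a
    // missing or unknown name is an assumption; the documentation only
    // states that INFO is the default.
    static Level resolveLevel(String level) {
        if (level == null || level.isEmpty()) {
            return Level.INFO;
        }
        try {
            return Level.parse(level.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Level.INFO;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveLevel("warning"));
        System.out.println(resolveLevel(null));
    }
}
```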
public static java.lang.String formatDate(long seconds, java.lang.String format)
The ama:formatDate function requires one argument, which is the date/time (expressed in Unix date-seconds) that is to be formatted. This is often the output value produced by the ama:parseDate function.
The ama:formatDate function also accepts an optional second argument, which is a format string composed of the same format specifiers as are accepted by the standard strftime function. If the second argument is not present, the date/time output format defaults to "%Y-%m-%dT%H:%M:%S%z" (RFC 3339 format).
This function supports the following formatting codes:
%a or %A | The day-of-the-week name, either abbreviated (e.g. "Mon") or full (e.g. "Monday"). |
%b or %B | The month name, either abbreviated (e.g. "Jan") or full (e.g. "January"). |
%c | The preferred date and time representation for the current locale. |
%d | The day of the month (1-31). |
%H | The hour in 24-hour format (00-23), as a 2-digit number. |
%I | The hour in 12-hour format (01-12), as a 2-digit number. |
%j | The day number in the year (1-366), with or without leading zeros. |
%m | The month number (1-12). |
%M | The minute (0-59). |
%p | "AM" or "PM", or "am" or "pm". Required when using 12-hour format. |
%S | The seconds (0-61). |
%U | The week number in the year (0-53), where Sunday is the first day of the week. |
%w | The weekday number (0-6), where 0 is Sunday. |
%W | The week number in the year (0-53), where Monday is the first day of the week. |
%y | The 2-digit year, where 69-99 refer to 1969-1999 and 00-68 refer to 2000-2068. |
%Y | The 4-digit year. |
%Z | The time zone name or abbreviation. |
%% | The % character. |
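To make the default output format concrete, the sketch below renders a Unix timestamp with a java.time pattern chosen to mirror "%Y-%m-%dT%H:%M:%S%z". It is an illustration only: the class FormatDateSketch and its method are hypothetical, and the time zone is passed explicitly because the zone that ama:formatDate applies is not stated here.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration; not the oneWEX implementation.
class FormatDateSketch {
    // java.time equivalent of the strftime pattern "%Y-%m-%dT%H:%M:%S%z".
    private static final DateTimeFormatter DEFAULT_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssZ");

    // Format seconds since the Unix epoch in the given zone (the zone is an
    // explicit parameter here because the source does not document it).
    static String formatEpochSeconds(long seconds, ZoneId zone) {
        return DEFAULT_FORMAT.format(Instant.ofEpochSecond(seconds).atZone(zone));
    }

    public static void main(String[] args) {
        System.out.println(formatEpochSeconds(0L, ZoneOffset.UTC));
        System.out.println(formatEpochSeconds(86400L, ZoneOffset.ofHours(2)));
    }
}
```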
seconds - the number of seconds since 00:00:00 on January 1, 1970, Coordinated Universal Time (UTC).
format - a format string composed of the same format specifiers that are accepted by the strftime() function.
public static long parseDate(java.lang.String dateStr, java.lang.String format)
The optional format string specifies how the date should be parsed. The format specifiers are intended to be compatible with PHP or C strftime() format strings. This function supports the following codes:
%a or %A | The day-of-the-week name, either abbreviated (e.g. "Mon") or full (e.g. "Monday"). |
%b, %B, or %h | The month name, either abbreviated (e.g. "Jan") or full (e.g. "January"). |
%c | A shortcut for the time and date in %a %b %e %H:%M:%S %Y format, e.g. Mon Jul 7 15:30:45 2007. |
%C | The century (00-99). |
%d or %e | The day of the month (1-31), with or without leading zeros. |
%D | A shortcut for the date in %m/%d/%y form, e.g. 03/19/06. |
%F | A shortcut for the date in %Y-%m-%d form, e.g. 2004-09-25. |
%H | The hour in 24-hour format (00-23), as a 2-digit number with leading zeros. |
%I | The hour in 12-hour format (01-12), as a 2-digit number with leading zeros. Requires %p, the "AM/PM" specifier, to be included in the format string. |
%j | The day number in the year (1-366), with or without leading zeros. |
%k | The hour in 24-hour format (0-23), as a 1- or 2-digit number without leading zeros. |
%l | The hour in 12-hour format (1-12), as a 1- or 2-digit number without leading zeros. Requires %p, the "AM/PM" specifier, to be included in the format string. |
%m | The month number (1-12), with or without leading zeros. |
%M | The minute (0-59), with or without leading zeros. |
%n | The newline character. |
%p | "AM" or "PM", or "am" or "pm". Required when using 12-hour format. |
%r | A shortcut for the time in 12-hour format: %I:%M:%S %p, e.g. 10:40:22 PM. |
%R | A shortcut for the 24-hour time without seconds: %H:%M. |
%s | The number of seconds since the Unix Epoch (00:00:00 UTC on 1 January 1970). |
%S | The seconds (0-61), with or without leading zeros. |
%t | The tab character. |
%T | A shortcut for the 24-hour time with seconds: %H:%M:%S, e.g. 23:59:59. |
%u | The weekday number (1-7), with or without leading zeros, where 1 is Monday. |
%U | The week number in the year (0-53), with or without leading zeros, where Sunday is the first day of the week. |
%w | The weekday number (0-6), with or without leading zeros, where 0 is Sunday. |
%W | The week number in the year (0-53), with or without leading zeros, where Monday is the first day of the week. |
%y | The 2-digit year, with or without leading zeros, where 69-99 refer to 1969-1999 and 00-68 refer to 2000-2068. |
%Y | The 4-digit year. |
%Z | The time zone. This may be specified either as a standard abbreviation (listed below), or as a signed offset in hours (and optionally minutes). |
%% | The % character. |
In the format string, the space character matches any number of whitespace characters.
Numeric time zone offsets may be in formats such as +0200, -03:30, +12, or -5. Supported time zone abbreviations are: GMT, UT, UTC, Z, WET, WEST, BST, ART, BRT, BRST, NST, NDT, AST, ADT, CLT, CLST, EST, EDT, CST, CDT, MST, MDT, PST, PDT, AKST, AKDT, HST, HAST, HADT, SST, WAT, CET, CEST, MET, MEZ, MEST, MESZ, EET, EEST, CAT, SAST, EAT, MSK, MSD, IST, SGT, KST, JST, GST, NZST, and NZDT.
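The numeric offset forms listed above (+0200, -03:30, +12, -5) can be read with a single regular expression. The following sketch converts such an offset to seconds east of UTC; the class ZoneOffsetSketch and the method offsetSeconds are hypothetical names, and the code illustrates the accepted notations rather than the parser that ama:parseDate actually uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the oneWEX implementation.
class ZoneOffsetSketch {
    // Sign, one or two hour digits, then optional two-digit minutes that may
    // be preceded by a colon.
    private static final Pattern OFFSET =
            Pattern.compile("([+-])(\\d{1,2})(?::?(\\d{2}))?");

    // Convert a numeric offset such as "+0200" or "-03:30" to seconds.
    static int offsetSeconds(String text) {
        Matcher m = OFFSET.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a numeric offset: " + text);
        }
        int hours = Integer.parseInt(m.group(2));
        int minutes = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        int total = hours * 3600 + minutes * 60;
        return m.group(1).equals("-") ? -total : total;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"+0200", "-03:30", "+12", "-5"}) {
            System.out.println(s + " -> " + offsetSeconds(s));
        }
    }
}
```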
dateStr - the date string to parse.
format - a format string identifying the format of the date supplied as the first argument. If not supplied, the function will try to apply the date ranges listed below.
timezone - a string identifying the time zone to use. This can be either a numeric time zone offset or a time zone abbreviation, as listed above. If not specified, the default value for timezone is 'localtime'.
public static void crawlEnqueueUrl(java.lang.String url)
url needs to be an absolute URL. If the URL is relative, ama:urlResolveBase should be used before calling this extension.
As of Watson Explorer oneWEX v12.0.1, ama:crawlEnqueueUrl is supported only by web crawlers.
url - the (absolute) URL to enqueue.
public static java.lang.String replace(java.lang.String str, java.lang.String pattern, java.lang.String newStr)
Replace the substrings in str matching the java.util.regex regular expression pattern with newStr.
str - the string on which to do the replacement(s).
pattern - the regex pattern (see the regex specification or RegularExpression.info for detailed information).
newStr - the replacement string.
public static java.lang.String replace(java.lang.String str, java.lang.String pattern, java.lang.String newStr, java.lang.String option)
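Since the three-argument form is described in terms of java.util.regex, its effect can be approximated with the standard Matcher.replaceAll call. The sketch below assumes that every match is replaced and that group references such as $1 in newStr behave as they do in Matcher.replaceAll; neither point is confirmed here. The class ReplaceSketch is hypothetical, and the option argument of the four-argument overload is not modeled because its values are not documented.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not the oneWEX implementation.
class ReplaceSketch {
    // Replace every substring of str that matches the regex pattern.
    static String replace(String str, String pattern, String newStr) {
        return Pattern.compile(pattern).matcher(str).replaceAll(newStr);
    }

    public static void main(String[] args) {
        // Literal pattern: turn dashes into slashes.
        System.out.println(replace("2004-09-25", "-", "/"));
        // Capturing group: reuse the matched digits in the replacement.
        System.out.println(replace("report_v12.pdf", "_v(\\d+)", "-rev$1"));
    }
}
```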
public static java.lang.String strToLower(java.lang.String str)
str - the string to convert.
public static java.lang.String strToUpper(java.lang.String str)
str - the string to convert.
public static boolean test(java.lang.String str, java.lang.String pattern, java.lang.String patternType)
Example:
<xsl:if test="ama:test(document/@url, 'http://*.pdf', 'wc')"> ... </xsl:if>
str - the string on which to try to match the pattern.
pattern - the pattern to be matched.
patternType - the type of pattern matching that should be used in the comparison, which is one of the following:
java.util.regex.Pattern.
java.util.regex.Pattern.CASE_INSENSITIVE.
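The 'wc' pattern type used in the example above can be pictured as a wildcard that is translated into a java.util.regex.Pattern. The following sketch does this under two stated assumptions: '*' matches any run of characters while every other character is literal, and the whole string must match. The class WildcardSketch and its methods are hypothetical, and the actual matching rules of ama:test may differ.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not the oneWEX implementation.
class WildcardSketch {
    // Translate a wildcard into a regex: '*' becomes ".*" and all other
    // text is quoted so that characters like '.' stay literal.
    static Pattern wildcardToRegex(String wildcard, boolean ignoreCase) {
        String[] parts = wildcard.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        int flags = ignoreCase ? Pattern.CASE_INSENSITIVE : 0;
        return Pattern.compile(regex.toString(), flags);
    }

    // Whole-string match (assumed semantics).
    static boolean matches(String str, String wildcard, boolean ignoreCase) {
        return wildcardToRegex(wildcard, ignoreCase).matcher(str).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("http://example.com/a.pdf", "http://*.pdf", false));
        System.out.println(matches("http://example.com/a.html", "http://*.pdf", false));
    }
}
```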
public static java.lang.String concat(org.w3c.dom.traversal.NodeIterator ns, java.lang.String separator)
The ama:concat function returns a string concatenation of the set of values contained in the expression argument.
The optional argument separator specifies a string to use when joining the value strings.
ns - a set of values.
separator - a string.
public static java.lang.String nodeToStr(org.w3c.dom.Node node, boolean prettyPrint, boolean preamble)
ns - the XML nodeset.
prettyPrint - if true, the string output of the function will be an indented and attractively formatted version of the input XML. The default value for this parameter is false().
preamble - if true, the standard XML preamble will be included in the output string. The default value for this parameter is false().
public static java.lang.String nodeToStr(org.w3c.dom.traversal.NodeIterator ns, boolean prettyPrint, boolean preamble)
public static java.lang.String nodeToStr(org.w3c.dom.traversal.NodeIterator ns)
public static java.lang.String nodeToStr(org.w3c.dom.Node node)
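The prettyPrint and preamble flags correspond closely to two standard JAXP output properties, INDENT and OMIT_XML_DECLARATION. The sketch below serializes a DOM node with the JDK's built-in transformer to show that correspondence. The class NodeToStrSketch and its helper parse are hypothetical, and the exact whitespace produced by the real ama:nodeToStr may differ.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration; not the oneWEX implementation.
class NodeToStrSketch {
    // Serialize a node; preamble controls the XML declaration and
    // prettyPrint controls indentation.
    static String nodeToStr(Node node, boolean prettyPrint, boolean preamble) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, preamble ? "no" : "yes");
            t.setOutputProperty(OutputKeys.INDENT, prettyPrint ? "yes" : "no");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parse an XML string and return its root element (test helper).
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node root = parse("<a><b>x</b></a>");
        System.out.println(nodeToStr(root, false, false));
        System.out.println(nodeToStr(root, true, true));
    }
}
```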
public static java.lang.Object ifElse(boolean test, java.lang.Object object1, java.lang.Object object2)
Simple if-else statement (which does not exist in XPath 1.0). It is more flexible than xsl:choose because it returns an object (xsl:choose forces the conversion of objects to strings).
test - the test condition.
object1 - the object (string, node-set, etc.) to return if the condition is true.
object2 - the object to return if the condition is false.
public static void set(java.lang.String key, java.lang.Object nodeSet)
Associate a node-set with a string (see ama:get).
key - a key string.
nodeSet - the node-set to store.
public static java.lang.Object get(java.lang.String key)
Returns the node-set associated with a string (see ama:set).
key - the "key" string.
Returns: the node-set stored with ama:set, or an empty node-set if it does not exist.
public static java.lang.String urlResolveBase(java.lang.String url, java.lang.String base)
Convert a relative URL url into a fully qualified URL using a base URL base. If url is already fully qualified then it is left unchanged.
url - the URL to disambiguate.
base - the base URL.
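The resolution described for ama:urlResolveBase resembles what java.net.URI.resolve does: relative references are resolved against the base, and a URL that already has a scheme is returned as is. The sketch below uses that standard call to illustrate the expected results. The class UrlResolveSketch and the example URLs are hypothetical, and the real extension may handle edge cases differently.

```java
import java.net.URI;

// Hypothetical illustration; not the oneWEX implementation.
class UrlResolveSketch {
    // Resolve url against base; an absolute url is returned unchanged.
    static String resolve(String url, String base) {
        return URI.create(base).resolve(url).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/docs/guide/index.html";
        System.out.println(resolve("page2.html", base));
        System.out.println(resolve("../img/logo.png", base));
        System.out.println(resolve("https://other.example.org/x", base));
    }
}
```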