Sources:
URLs: website source data, https, batch URL generation
UNIX Commands: use STDOUT of a command-line program as a source
MySQL Interface: use the mysql client software to perform queries on a MySQL database
SOAP/XML-RPC: search Google using the Google API (your own key required)
Local Files: process entire directories of files

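As an illustration of the "UNIX Commands" source, here is a minimal Python sketch of treating a command's STDOUT as a stream of input lines. It is not Anthracite's own implementation; the helper name `command_source` and the example command are assumptions made only for this sketch.

```python
import subprocess
import sys

def command_source(argv):
    """Run a command and return its STDOUT as a list of text lines.

    Hypothetical helper: shows the general idea of using a
    command-line program as a data source.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

# Assumed example command: the Python interpreter printing two URLs.
# Any program that writes to STDOUT could stand in here.
lines = command_source([
    sys.executable,
    "-c",
    "print('http://example.com/a'); print('http://example.com/b')",
])
```

Passing the command as an argument list (rather than a shell string) avoids quoting problems when arguments contain spaces.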
Processes:
Extract Table Data
Change Tags
Text Between / Text Near
Extract using Regular Expressions
Find/Replace, Strip/Split, Wrap Text
Any UNIX command using STDIN/STDOUT (e.g., grep, uniq, sort, perl, sed, awk)
Extract Links

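To make two of these processes concrete, the following Python sketch shows one possible reading of "Text Between" and "Extract Links". It only illustrates the general techniques and is not how Anthracite implements them; the names `text_between`, `LinkExtractor`, and `extract_links`, as well as the sample HTML, are assumptions.

```python
import re
from html.parser import HTMLParser

def text_between(text, start, end):
    """Return every substring found between the start and end markers."""
    # Escape the markers so they are matched literally, not as regex syntax.
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    return re.findall(pattern, text, flags=re.DOTALL)

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links

# Assumed sample input, for illustration only.
sample = '<p>Price: <b>42</b></p><a href="http://example.com/x">x</a><a href="/y">y</a>'
prices = text_between(sample, "<b>", "</b>")
links = extract_links(sample)
```

The non-greedy `(.*?)` keeps each match as short as possible, so several marker pairs on one page yield several separate results instead of one long one.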
Outputs:
Format results from multiple sources into reports.
Store results within Anthracite.
Export results to HTML, CSV, XML, and TXT files.
Export results to MySQL.
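
As a rough illustration of the CSV export option, the Python sketch below serializes extracted rows to CSV text. The column names, the sample row, and the helper `to_csv` are assumptions for this example and say nothing about the file layout Anthracite actually produces.

```python
import csv
import io

def to_csv(rows, header):
    """Serialize a header and rows to CSV text (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

# Assumed sample result row; the comma in the name forces quoting.
rows = [("http://example.com/a", "Widget, large", "42")]
text = to_csv(rows, ["url", "name", "price"])
```

Using the `csv` module instead of joining fields by hand ensures that values containing commas or quotes are escaped correctly.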