


anthracite curl source object


CURL is powerful open source software by Daniel Stenberg that is built into MacOS X and available from the command line.

Anthracite allows you to use CURL as a Source, enabling many capabilities for accessing websites, including SSL/HTTPS servers, sites that use usernames/passwords, and posting custom headers and form arguments.

The CURL Source Object is among the most complex in Anthracite, and its settings are broken into three tabs: General, Headers and Form Data.


General




The General Tab of the CURL Source Object is where you enter the basic information about what should be loaded: the URL and any HTTP basic authentication username and password required. It is also where you specify what data you want returned and whether redirections should be followed automatically.

URL - enter a basic http or https URL here, or try CURL's options as described in the man page (see below).

User/Password - the HTTP basic authorization credentials to be used if required. These will be stored in your MacOS X System Keychain.

Return Type Selector - Selects what data you want returned, in particular whether the Header sent by the server should be returned.

Follow location hints - if selected, CURL will follow any redirects that it encounters trying to load the URL you specified (e.g., if a server redirects you to a cached page or alternate source).
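As a rough illustration of how the General settings relate to the command line tool, the sketch below builds a curl argument list from the same four pieces of information. The function name general_tab_args and its parameters are hypothetical, and the mapping is an assumption based on the curl man page; Anthracite's actual internals are not documented here.

```python
import shlex
from typing import List, Optional


def general_tab_args(url: str,
                     user: Optional[str] = None,
                     password: Optional[str] = None,
                     include_header: bool = False,
                     follow_redirects: bool = False) -> List[str]:
    """Build a curl argument list from General-tab style settings.

    Hypothetical helper: the mapping to curl options is an assumption,
    not a description of what Anthracite itself runs.
    """
    args = ["curl", "--silent"]
    if user is not None:
        # --user takes "name:password" for HTTP basic authentication
        args += ["--user", f"{user}:{password or ''}"]
    if include_header:
        # --include puts the response header in front of the body
        args.append("--include")
    if follow_redirects:
        # --location re-issues the request when the server redirects
        args.append("--location")
    args.append(url)
    return args


if __name__ == "__main__":
    # Print the command instead of running it, so no network access is needed
    example = general_tab_args("https://example.com/",
                               include_header=True,
                               follow_redirects=True)
    print(shlex.join(example))
```

Printing the command rather than executing it makes it easy to paste into the Terminal and compare the result with what the Source Object returns.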


Headers


Headers are information included with the request to the web server that is used to pass variables or set parameters for the request.


User Agent - Allows you to enter a custom User Agent string that will be sent to the server. Sometimes a server or script will require a particular browser to be used to access a site; the User Agent is what tells the server what browser you are using.

Cookie ("-b" arg to curl) - The Cookie field allows you to set the cookie value to be passed to the server. To get the "Set-Cookie" value being sent by a server, use the CURL object to return the Header of the page (as described in the General tab above) and examine it for the value to enter here. Note that you only need to enter the cookie value itself, not the "Cookie:" header prefix.

Referer - Allows you to specify the URL of the page that referred you to the site that you are loading. Browsers often pass this automatically, and some dynamically generated pages use the referrer to alter the content displayed.

The Additional Header Lines table allows you to add any other headers needed, such as those in RFC2616 section 5.3 or any custom X-Headers you may need (or want) to add.
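The Headers settings can be sketched the same way. The example below first pulls the cookie value out of a returned Header, then builds curl arguments for the User Agent, Cookie, Referer and additional header lines. The names cookie_values and headers_tab_args are hypothetical, and the option mapping (beyond the "-b" noted above) is an assumption taken from the curl man page.

```python
from typing import Dict, List, Optional


def cookie_values(raw_header: str) -> List[str]:
    """Return the name=value part of each Set-Cookie line in a response header."""
    found = []
    for line in raw_header.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "set-cookie":
            # Attributes such as Path or Expires follow the first ";"
            # and are not part of the value sent back to the server
            found.append(value.split(";", 1)[0].strip())
    return found


def headers_tab_args(user_agent: Optional[str] = None,
                     cookie: Optional[str] = None,
                     referer: Optional[str] = None,
                     extra: Optional[Dict[str, str]] = None) -> List[str]:
    """Build curl arguments for Headers-tab style settings (assumed mapping)."""
    args: List[str] = []
    if user_agent:
        args += ["--user-agent", user_agent]
    if cookie:
        # Same option as the short form "-b"; only the value, no "Cookie:" prefix
        args += ["--cookie", cookie]
    if referer:
        args += ["--referer", referer]
    for name, value in (extra or {}).items():
        # One --header option per additional header line
        args += ["--header", f"{name}: {value}"]
    return args


if __name__ == "__main__":
    # Made-up response header, as it might look with the Header returned
    sample = ("HTTP/1.1 200 OK\r\n"
              "Content-Type: text/html\r\n"
              "Set-Cookie: session=abc123; Path=/\r\n\r\n")
    cookies = cookie_values(sample)
    print(headers_tab_args(cookie="; ".join(cookies),
                           extra={"X-Example": "1"}))
```

The sample header and the X-Example header line are invented for illustration; substitute the header your server actually returns.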


Form Data


HTML Forms send the data entered by a user in a webpage as a set of "key value" pairs for each of the fields. Often, URLs will include encoded information, such as a starting record number in a batch of results.

("--form" args to curl)

Get/Post PopUp - Allows you to specify the GET or POST method expected by the server for this request.

Arg/Val pairs - a Table to enter the pairs of items to be sent to the server, such as would have been entered by a user or in a navigation form.
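To make the Arg/Val idea concrete, the sketch below turns a list of pairs into either an encoded URL (for GET) or a series of "--form" arguments (for POST). The function name form_tab_args is hypothetical. Treating GET as an encoded query string is an assumption for illustration; how Anthracite itself handles the GET case is not stated here.

```python
from typing import List, Sequence, Tuple
from urllib.parse import urlencode


def form_tab_args(url: str,
                  method: str,
                  pairs: Sequence[Tuple[str, str]]) -> List[str]:
    """Build curl arguments for Form Data style Arg/Val pairs (assumed mapping)."""
    method = method.upper()
    if method == "GET":
        if not pairs:
            return [url]
        # Assumption: GET pairs travel as an encoded query string on the URL
        separator = "&" if "?" in url else "?"
        return [url + separator + urlencode(list(pairs))]
    if method == "POST":
        args: List[str] = []
        for key, value in pairs:
            # One --form option per pair; note that curl treats a value
            # starting with "@" or "<" as a file reference
            args += ["--form", f"{key}={value}"]
        return args + [url]
    raise ValueError(f"unsupported method: {method}")


if __name__ == "__main__":
    # Made-up search form with a starting record number
    pairs = [("q", "red shoes"), ("start", "20")]
    print(form_tab_args("http://example.com/search", "GET", pairs))
    print(form_tab_args("http://example.com/search", "POST", pairs))
```

The example URL and field names are invented; the "start" pair mirrors the starting record number case mentioned above.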


More Info
Because Anthracite's CURL Source Object is based on the UNIX command line tool, you can get more information about the inner workings of the CURL Source by reviewing the manual page ("manpage"). From Anthracite's Window menu, choose "Find UNIX Commands" and then enter "curl" as the command to look up, or type "man curl" in the Terminal at a command prompt.

See: Anthracite > Windows > Find UNIX Commands


CURL/libCurl: http://curl.haxx.se/


RFC2616 HTTP Client Request Headers (see esp. 5.3): http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html




Last Update: 10/18/2004


Copyright © 2004, All Rights Reserved