How to Use the Google API Source


The Google API Source Object lets you run a query against Google's search engine and use the matching URLs it returns as the list of sources to load. It works like an ordinary Google query, except that instead of getting back a single page listing links to 10 matching pages at a time, Anthracite loads the full HTML of the matching pages themselves, whether that is 10 pages, 50 pages, or 1000 pages.


Of course, you can then feed the output of your Google query (which could be many, many pages) through your Anthracite Process chains.



Step One: Get A Google API Key

Sign up at Google Accounts and you will receive a special access key that we'll use in the next step.


Step Two: Create and Configure the Google API Source Object

Enter the key you receive from Google in the "Key" field, where it appears as bullets rather than the actual text. The key is stored in your System Keychain, so you can manage it there if you need to.



Next, enter your search terms in the "Query" field just as you would in Google's search field. This is an editable search menu that keeps track of your queries.

With the Google API Source, you can specify the index of the first result you'd like from the Google database, for example to get the results starting at page 500. If desired, enter this value in the "Start Index" field.

The "Max Results" field specifies how many pages you want returned from the Google database. Each query is expected to return 10 URLs, so a Max Results value of 100 would use ten queries.
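
The relationship between "Start Index", "Max Results", and the number of queries used can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Anthracite's actual implementation: the function name plan_queries and the idea of splitting a request into (start, count) batches are assumptions made for the example, as is treating the start index as a simple offset into the results.

```python
RESULTS_PER_QUERY = 10  # from the text: each query is expected to return 10 URLs


def plan_queries(start_index, max_results, per_query=RESULTS_PER_QUERY):
    """Split a request into (start, count) batches of at most per_query results.

    Hypothetical helper for illustration only; Anthracite may batch differently.
    """
    if start_index < 0 or max_results < 0:
        raise ValueError("start_index and max_results must be non-negative")
    batches = []
    start = start_index
    remaining = max_results
    while remaining > 0:
        count = min(per_query, remaining)  # the last batch may be smaller
        batches.append((start, count))
        start += count
        remaining -= count
    return batches


if __name__ == "__main__":
    # 100 results starting at index 0 take ten queries of 10 URLs each.
    print(len(plan_queries(0, 100)))
    # 25 results starting at index 500 take three queries.
    print(plan_queries(500, 25))
```

Each tuple in the returned list stands for one query, so the length of the list is the number of queries that would count against your daily total.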

All these settings are described in the Google Web APIs - Reference document on the Google website, and so we'll just quickly cover the remaining optional settings.

"Should Filter" enables Google's automatic filtering of duplicate results and its "host crowding" of URLs from the same source.

"Safe Search" is Google's method for attempting to eliminate adult content from the results.

The "Restrict" field lets you place geographic or topical limits on the results. Results can be limited to a particular country such as Iraq (enter "countryIQ") or to one of several topics, such as Macintosh (enter "mac"). These restricts, including tables of the specific configuration settings, are explained in detail on the Google Web APIs - Reference page.

The "Language" field lets you specify a particular set of languages from which you'd like the results, such as pages in Chinese (enter "lang_zh-CN").
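
To summarize the fields covered so far, here is a small Python sketch that models them as a settings record with some basic sanity checks. The class name GoogleSourceSettings, the attribute names, the default values, and the validation rules are all assumptions made for this example; they are not taken from Anthracite or from Google's documentation. Only the sample values ("countryIQ", "mac", "lang_zh-CN") come from the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GoogleSourceSettings:
    """Hypothetical mirror of the fields in the Google API Source edit view."""
    key: str                    # "Key" field (your Google API key)
    query: str                  # "Query" field
    start_index: int = 0        # "Start Index" field (default is an assumption)
    max_results: int = 10       # "Max Results" field (default is an assumption)
    should_filter: bool = True  # "Should Filter" option
    safe_search: bool = False   # "Safe Search" option
    restrict: str = ""          # "Restrict" field, e.g. "countryIQ" or "mac"
    language: str = ""          # "Language" field, e.g. "lang_zh-CN"

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings look sane."""
        problems = []
        if not self.key:
            problems.append("Key is required")
        if not self.query.strip():
            problems.append("Query is empty")
        if self.start_index < 0:
            problems.append("Start Index must not be negative")
        if self.max_results < 1:
            problems.append("Max Results must be at least 1")
        return problems


if __name__ == "__main__":
    settings = GoogleSourceSettings(
        key="your-key-here",
        query="anthracite",
        restrict="countryIQ",
        language="lang_zh-CN",
    )
    print(settings.validate())
```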

At the bottom of the Google Source Object Edit View, you'll also see a line of text telling you what your current Query count is for today.

There may be restrictions on your Google API use with your Google Account, such as a limit of 10,000 pages per day (1,000 queries x 10 URLs per query), so please be sure to familiarize yourself with their service before using this powerful tool.
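
As a worked example of that limit, the following Python sketch checks whether a planned "Max Results" value would still fit in the day's remaining queries. The figures of 1,000 queries per day and 10 URLs per query are the example numbers from the paragraph above and may not match your account; the function names are hypothetical.

```python
DAILY_QUERY_LIMIT = 1000  # example figure from the text; check your own account
URLS_PER_QUERY = 10       # example figure from the text


def queries_needed(max_results, per_query=URLS_PER_QUERY):
    """Number of queries needed for max_results URLs, rounding up."""
    return -(-max_results // per_query)  # ceiling division on integers


def fits_daily_limit(max_results, queries_used_today, limit=DAILY_QUERY_LIMIT):
    """True if the request fits in the queries left for today."""
    return queries_used_today + queries_needed(max_results) <= limit


if __name__ == "__main__":
    print(DAILY_QUERY_LIMIT * URLS_PER_QUERY)  # pages per day under these figures
    print(queries_needed(95))                  # 95 results still cost 10 queries
    print(fits_daily_limit(500, 960))          # 50 more queries would exceed 1000
```

The "queries used today" figure corresponds to the query count shown at the bottom of the Google Source Object Edit View.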



Last Updated: 7/16/2004