"Anthracite Apache Template"
anthracite apache template output
This example shows how to build formatted output with Anthracite from pre-built HTML templates that use "Apache"-style tags.
This page was built with Anthracite, and incorporates both user-generated content and built-in functions.
The tables here are built from the output of Anthracite processes: in this case, the titles, links, and 8-sentence summaries of pages returned from a Google query on "Regular Expressions":
| Steve Ramsay's Guide to Regular Expressions | Using a good regex engine and a well-crafted regular expression, one can easily search through a text file (or a hundred text files) searching for words that have the suffix ".html" (but only if the word begins with a capital letter and occurs at the beginning of the line), replace the .html suffix with a .sgml suffix, and then change all the lower case characters to upper case.
As you might guess from this example, concision is everything when it comes to crafting regular expressions, and while this syntax won't win any beauty prizes, it follows a logical and fairly standardized format which you can learn to read and write easily with just a little bit of practice.
...Ordinary macros (in particular, editable macros such as those generated by the major word processors and editors) tend not to be as fast, as flexible, as portable, as concise, or as fault-tolerant as regular expressions, but they have the advantage of being much more readable; even people with no programming background whatsoever can usually make enough sense of a macro script to change it if the need arises.
...This line, as you might guess, asks egrep to find instances of the pattern "serendipitous" in the file "foobar" and write the results to a file called "hits".
...There's always more than one way to do it with regular expressions, and in fact, if we use single-character metacharacters and quantifiers in conjunction with one another, we can search for almost all the variant spellings of "blurfle" ("bllurfle," "bllurrfle", "bbluuuuurrrfffllle", and so on).
If we work this out, we come out with something like: "find a 'b' followed by any character any number of times (including zero times) followed by an 'e'."
...Suppose, for example, that we want to take the "blurfle" files listed in Important.files, list them out separately, run a program called "fragellate" on each one, and then append each successive output to a file called "fraggled_files."
...So, this sed routine will do precisely what the earlier one did: find all the instances of blurfle followed by a number between zero and nine and replace it with "fragellate blurfle[some number] >>fraggled files". |
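The "blurfle" patterns described in the summary above can be tried out with a short sketch. It uses Python's `re` module rather than egrep or sed, and the sample strings and variable names are illustrative assumptions, not text from the guide.

```python
import re

# "a 'b' followed by any character any number of times
# (including zero times) followed by an 'e'"
LOOSE = re.compile(r"b.*e")

# One or more of each letter catches stretched spellings of "blurfle".
STRETCHED = re.compile(r"b+l+u+r+f+l+e+")

# Hypothetical test words; the last two are not spellings of "blurfle".
samples = ["blurfle", "bllurrfle", "bbluuuuurrrfffllle", "be", "bottle"]

loose_hits = [s for s in samples if LOOSE.fullmatch(s)]
stretched_hits = [s for s in samples if STRETCHED.fullmatch(s)]

print(loose_hits)      # every sample starts with "b" and ends with "e"
print(stretched_hits)  # only the three "blurfle" variants
```

The loose pattern accepts "be" and "bottle" as well, which is why the stricter letter-by-letter form is usually the better choice when only spelling variants are wanted.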
| A Tao of Regular Expressions | Most examples are presented as vi substitution commands or as grep file search commands, but they are representative examples and the concepts can be applied in the use of tools such as sed, awk, perl and other programs that support regular expressions.
...Consider a file named test.txt consisting of the following lines: "he is a rat", "he is in a rut", "the food is Rotten", "I like root beer". We can use grep to test our regular expressions.
...Why would we use an expression like [^,]*, instead of something more straightforward like .*, to match the first parameter? Consider applying the pattern .*, to the string "10,7,2". Should it match "10," or "10,7," ? To resolve this ambiguity, regular expressions will always match the longest string possible.
...The particular piece of software that needs this data will not work if there are any whitespace characters (spaces or tabs) before or after the commas.... Here are a few lines from the data we have: Bill Jones, HI-TEK Corporation , CA, 95011 Sharon Lee Smith, Design Works Incorporated, CA, 95012 B.... The following substitution command will remove the excess spaces: :%s/[ \t]*,[ \t]*/,/g To break it down: [ \t] matches a space or tab character; [ \t]* matches 0 or more spaces or tabs; [ \t]*, matches 0 or more spaces or tabs followed by a comma; and finally [ \t]*,[ \t]* matches 0 or more spaces or tabs followed by a comma followed by 0 or more spaces or tabs.
...Suppose you have a multi-character sequence that repeats. For example, consider the following: Billy tried really hard Sally tried really really hard Timmy tried really really really hard Johnny tried really really really really hard Now suppose you want to change "really", "really really", and any number of consecutive "really" strings to a single word: "very". The command :%s/\(really \)\(really \)*/very / changes the text above to: Billy tried very hard Sally tried very hard Timmy tried very hard Johnny tried very hard The expression \(really \)* matches 0 or more sequences of "really ".
...Note that the edits don't actually happen to the input file; sed simply processes each line of the file with the command you supply and echoes the result to its standard output. |
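The three substitutions summarized above (longest-match behaviour, trimming whitespace around commas, and collapsing repeated words) can be sketched in Python's `re` module instead of vi or sed. The exact spacing in the sample record is an assumption, since the original whitespace did not survive in the summary.

```python
import re

# Greedy ".*," takes the longest possible match; "[^,]*," stops at the first comma.
line = "10,7,2"
greedy = re.match(r".*,", line).group()
bounded = re.match(r"[^,]*,", line).group()

# Remove spaces or tabs on either side of each comma
# (the vi command :%s/[ \t]*,[ \t]*/,/g from the summary).
record = "Bill Jones,   HI-TEK Corporation ,  CA, 95011"  # spacing is assumed
cleaned = re.sub(r"[ \t]*,[ \t]*", ",", record)

# Collapse any run of "really " into a single "very ".
# Python groups use plain parentheses where vi writes \( and \).
text = "Johnny tried really really really really hard"
collapsed = re.sub(r"(really )(really )*", "very ", text)

print(greedy, bounded)
print(cleaned)
print(collapsed)
```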
| Regular Expression HOWTO | Emacs-style patterns are slightly less readable and don't provide as many features, so there's not much reason to use the regex module when writing new code, though you might encounter old code that uses it.
...The solution is to use Python's raw string notation for regular expressions; backslashes are not handled in any special way in a string literal prefixed with "r", so r"\n" is a two-character string containing "\" and "n", while "\n" is a one-character string containing a newline.
...However, the search method of RegexObject instances scans through the string, so the match may not start at zero in that case.
...These functions take the same arguments as the corresponding RegexObject method, with the RE string added as the first argument, and still return either None or a MatchObject instance.
...Backreferences like this aren't often useful for just searching through a string -- there are few text formats which repeat data in this way -- but you'll soon find out that they're very useful when performing string substitutions.
...Except for the fact that you can't retrieve the contents of what the group matched, a non-capturing group behaves exactly the same as a capturing group; you can put anything inside it, repeat it with a repetition metacharacter such as "*", and nest it within other groups (capturing or non-capturing).
...If you're matching a fixed string, or a single character class, and you're not using any re features such as the IGNORECASE flag, then the full power of regular expressions may not be required.
...Quick-and-dirty patterns will handle common cases, but HTML and XML have special cases that will break the obvious regular expression; by the time you've written a regular expression that handles all of the possible cases, the patterns will be very complicated. |
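A few of the points in the summary above can be checked directly. This is a minimal sketch using the current `re` module; the sample strings are assumptions chosen for illustration.

```python
import re

# Raw strings leave backslashes alone: r"\n" is two characters, "\n" is one.
raw = r"\n"
cooked = "\n"

pattern = re.compile(r"\d+")
at_start = pattern.match("abc 123")    # match() only tries position 0, so None
anywhere = pattern.search("abc 123")   # search() scans forward and finds "123"

# A non-capturing group (?:...) can be repeated like any group,
# but its contents cannot be retrieved and it takes no group number.
grouped = re.match(r"(?:ab)+(c)", "ababc")

print(len(raw), len(cooked))
print(at_start, anywhere.start())
print(grouped.groups())
```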
| Regular-Expressions.info - Regex Tutorial, Examples and Reference - Regexp Patterns | In a text editor like EditPad Pro or a specialized text processing tool like PowerGREP, you could use the regular expression \b[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z0-9._%-]{2,4}\b to search for an email address.... A very similar regular expression (replace the first \b with ^ and the last one with $) can be used by a programmer to check if the user entered a properly formatted email address.... Learning regular expressions and using their power will save you many hours doing similar text processing in your own code.... You can use them in powerful search and replace operations to quickly make changes across large numbers of files.... Two popular and useful tools are the text editor EditPad Pro and the all-round regex text processing utility PowerGREP.... PowerGREP is probably the most powerful regex-based text processing tool available today.... Use regular expressions to search through large numbers of text and binary files, such as source code, correspondence, server or system logs, reference texts, archives, etc. Quickly find the files you are looking for, or extract the information you need.... Perform comprehensive text and binary replacement operations for easy maintenance of web sites, source code, reports, etc. Preview replacements before modifying files, and stay safe with flexible backup and undo options. |
| Regular Expression Library -- presented by ASPSmith.com Training | RegExLib.com, the Internet's first regular expression Library.
...just figured out a new expression that does something useful.
...We recommend "The Regulator" as a tool for testing regular expressions.
offered by this site to allow you to submit patterns to this site directly from the tool itself.
View all regular expression resources...
Blogs.RegexAdvice.com is a blogging community devoted to the topic of regular expressions.
Here are the latest entries from blogs aggregated on that site...
...Expressions, Parsers, Validators, User Feedback and the tradeoffs we make. |
| oreilly.com -- Online Catalog: Mastering Regular Expressions | Register your book to get email notification of new editions, special offers, and more.
...This book has been updated--the edition you're requesting is out of print. Please visit the catalog page of the latest edition.
...regular expressions, a powerful tool for manipulating text and data, are found in scripting languages, editors, programming environments, and specialized tools. In this book, author Jeffrey Friedl leads you through the steps of crafting a regular expression that gets the job done. He examines a variety of tools and uses them in an extensive array of examples, with a major focus on Perl. |
| Learning to Use Regular Expressions | A symbol with a special meaning can be matched, but to do so you must prefix it with the backslash character (this includes the backslash character itself: to match one backslash in the target, your regular expression should include "\\").
...One of the most powerful and common things you can do with regular expressions is to specify how many times an atom occurs in a complete regular expression.
...There is only one quantifier included with "basic" regular expression syntax, the asterisk ("*"); in English this has the meaning "some or none" or "zero or more."
...Without quantifiers, grouping expressions doesn't really serve as much purpose, but once we can add a quantifier to a subexpression we can say something about the occurrence of the subexpression as a whole.
...Using extended regular expressions, you can specify arbitrary pattern occurrence counts using a more verbose syntax than the question-mark, plus-sign, and asterisk quantifiers.
...Simply repeating the same grouped subexpression later in the regular expression does not match the same targets as using a backreference (but you have to decide what it is you actually want to match in either case).
...The way you actually specify replacements will vary between tools: a text editor might have a dialog box; command-line tools will use delimiters between match and replacement, programming languages will typically call functions with arguments for match and replacement patterns.
...Most of the time, if you are using regular expressions to modify a target text, you will want to match more general patterns than just literal strings. |
| PCRE - Perl Compatible Regular Expressions | The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API.
...PCRE was originally written for the Exim MTA, but is now used by many high-profile open source projects, including Python, Apache, PHP, KDE, Postfix, Analog, and nmap. Other interesting projects using PCRE include Ferite, Onyx, Hypermail, and Askemos.
...You can download it from its official home via anonymous FTP, or via HTTP or FTP from SourceForge.
Contributed source code, including C++ wrappers and sample Makefiles, is also available from the FTP site.
...You can read the text version of the PCRE man pages. For Perl 5 regular expression syntax, read the Perl regular expressions man page. |
| PHPBuilder.com - Learning to Use Regular Expressions by Example | I eventually got to know how to do it, mostly through experimenting, and seeing there wasn't much to it, I decided to write down this straight-out introduction to the syntax and a step-by-step on building regular expressions to validate money and e-mail address strings.
...You can see that if you don't use either of the two characters we mentioned, as in the last example, you're saying that the pattern may occur anywhere inside the string -- you're not "hooking" it to any of the edges.
..."ab*": matches a string that has an a followed by zero or more b's ("a", "ab", "abbb", etc.); "ab+": same, but there's at least one b ("ab", "abbb", etc.); "ab?": there might be a b or not; "a?b+$": a possible a followed by one or more b's ending a string.
..."[ab]": matches a string that has either an a or a b (that's the same as "a|b"); "[a-d]": a string that has lowercase letters 'a' through 'd' (that's equal to "a|b|c|d" and even "[abcd]"); "^[a-zA-Z]": a string that starts with a letter; "[0-9]%": a string that has a single digit before a percent sign; ",[a-zA-Z0-9]$": a string that ends in a comma followed by an alphanumeric character.
You can also list which characters you DON'T want -- just use a '^' as the first symbol in a bracket expression (i.e., "%[^a-zA-Z]%" matches a string with a character that is not a letter between two percent signs).
...On top of that, you must escape the backslash character itself in PHP3 strings, so, for instance, the regular expression "(\$|Y)[0-9]+" would have the function call: ereg("(\\$|Y)[0-9]+", $str) (what string does that validate?)
Just don't forget that bracket expressions are an exception to that rule--inside them, all special characters, including the backslash ('\'), lose their special powers (i.e., "[*\+?{}.]" matches exactly any of the characters inside the brackets). |
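The quantifier and bracket-expression examples from the summary above can be exercised with a small sketch. It uses Python's `re` module rather than PHP's `ereg`; these simple patterns read the same in both, but the helper `hits` and the candidate strings are assumptions made for illustration.

```python
import re

def hits(pattern, candidates):
    """Return the candidates in which the pattern occurs anywhere."""
    return [c for c in candidates if re.search(pattern, c)]

words = ["a", "ab", "abbb", "b", "cab"]

star = hits(r"ab*", words)         # an "a" followed by zero or more "b"s
plus = hits(r"ab+", words)         # same, but at least one "b"
anchored = hits(r"^a?b+$", words)  # optional "a", then "b"s, spanning the whole string

# A leading "^" inside brackets negates the set:
# a non-letter between two percent signs.
negated = hits(r"%[^a-zA-Z]%", ["%1%", "%a%", "50%"])

print(star)
print(plus)
print(anchored)
print(negated)
```

Because the first two patterns are not "hooked" to either edge of the string, "cab" matches them as well.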
| Regular Expressions | Conceptually, the implementation must examine every possible match and among those that yield the leftmost longest total matches, pick the one that does the longest match for the leftmost subexpression and so on. Note that this means that matching by subexpressions is context-dependent: a subexpression within a larger RE may match a different string from the one it would match as an independent RE, and two instances of the same subexpression within the same larger RE may match different lengths even in similar sequences of characters.
...The XCU specification specifies within the individual descriptions of those standard utilities employing regular expressions whether they permit matching of newline characters; if not stated otherwise, the use of literal newline characters or any escape sequence equivalent produces undefined results.
... BREs Matching a Single Character or Collating Element A BRE ordinary character, a special character preceded by a backslash or a period matches a single character.
... Periods in BREs A period (.), when used outside a bracket expression, is a BRE that matches any character in the supported character set except NUL.
...(left-bracket followed by a period, equals-sign or colon) are special inside a bracket expression and are used to delimit collating symbols, equivalence class expressions and character class expressions.
...To use a hyphen as the starting range point, it must either come first in the bracket expression or be specified as a collating symbol, for example: [][.-.]-0], which matches either a right bracket or any character or collating element that collates between hyphen and 0, inclusive.
...When a BRE matching a single character, a subexpression or a back-reference is followed by the special character asterisk (*), together with that asterisk it matches what zero or more consecutive occurrences of the BRE would match.
.../* -------------------------------------------- Basic regular expression -------------------------------------------- */ basic_reg_exp : RE_expression | L_ANCHOR | R_ANCHOR | L_ANCHOR R_ANCHOR | L_ANCHOR RE_expression | RE_expression R_ANCHOR | L_ANCHOR RE_expression R_ANCHOR ; RE_expression : simple_RE | RE_expression simple_RE ; simple_RE : nondupl_RE | nondupl_RE RE_dupl_symbol ; nondupl_RE : one_character_RE | Back_open_paren RE_expression Back_close_paren | Back_open_paren Back_close_paren | BACKREF ; one_character_RE : ORD_CHAR | QUOTED_CHAR | '.' |
| Text Searching in Bugzilla | Often a query including a well chosen text search pattern can result in a list of bugs that has both more relevant and fewer irrelevant bugs than would be listed if a simple substring was used.
...This guide is designed to provide you with enough background knowledge and real-life examples to be able to use the other text matching types mozilla offers (especially regular expressions) -- and that's how you'll get the experience.
...As powerful as regex pattern matching and boolean charts are, neither one is the only tool you'll use. Text searching will normally be done as part of a query that includes other constraints, and it is easier and more efficient to constrain your search using other fields when you can, rather than trying to specify multiple constraints using only text matching.
For example, rather than creating a regex pattern to match variations on Windows operating system names, select each of them from the Operating System field, using ctrl-click or cmd-click, to constrain your search to bugs that afflict Windows.
...bugzilla.mozilla.org and most other installations use MySQL, which implements POSIX-compatible regex -- GNU extensions such as "\w" for word characters are not available.
...Database-specific (MySQL) The information on regex matching in this guide is specific to Bugzilla installations using MySQL; some of it may not apply to Bugzilla installations using other database engines.
...If you are used to searching on Windows with patterns like "*.html", you need to know that in regex patterns, "*" matches zero or more occurrences of the preceding character. |
| perlre | By default, the "^" character is guaranteed to match only the beginning of the string, the "$" character only the end (or before the newline at the end), and Perl does certain optimizations with the assumption that the string contains only one line.... You may, however, wish to treat a string as a multi-line buffer, such that the "^" will match after any newline within the string, and "$" will match before any newline.
...For reasons of security, this construct is forbidden if the regular expression involves run-time interpolation of variables, unless the perilous use re 'eval' pragma has been used (see re), or the variables contain results of qr// operator (see perlop/"qr/STRING/imosx").
...Be aware, however, that this pattern currently triggers a warning message under the use warnings pragma or -w switch saying it "matches the null string many times"): On simple groups, such as the pattern (?> [^()]+ ), a comparable effect may be achieved by negative look-ahead, as in [^()]+ (?!
...What's happening is that you've asked "Is it true that at the start of $x, following 0 or more non-digits, you have something that's not 123?" If the pattern matcher had let \D* expand to "ABC", this would have caused the whole pattern to fail.
...In other words, the two zero-width assertions next to each other work as though they're ANDed together, just as you'd use any built-in assertions: /^$/ matches only if you're at the beginning of the line AND the end of the line simultaneously.
...Note also that zero-length look-ahead/look-behind assertions will not backtrack to make the tail match, since they are in "logical" context: only whether they match is considered relevant.
...Characters may be specified using a metacharacter syntax much like that used in C: "\n" matches a newline, "\t" a tab, "\r" a carriage return, "\f" a form feed, etc. More generally, \nnn, where nnn is a string of octal digits, matches the character whose ASCII value is nnn. |
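The anchor and look-ahead behaviour described in the summary above can be reproduced in a short sketch. It uses Python's `re` module, whose `^`, `$`, and `(?!...)` behave like Perl's for these cases; the sample strings, and the combined `(?=\d)(?!123)` form used to show two adjacent assertions, are illustrative assumptions.

```python
import re

text = "first\nsecond\n"

# By default "^" matches only at the start of the string.
single_line = re.findall(r"^\w+$", text)
# In multi-line mode "^" and "$" also match around embedded newlines.
multi_line = re.findall(r"^\w+$", text, flags=re.MULTILINE)

# \D* backtracks from "ABC" to "AB" so that (?!123) can succeed,
# which is why this pattern matches even though "123" is present.
surprising = re.search(r"^\D*(?!123)", "ABC123")

# Two adjacent zero-width assertions are effectively ANDed:
# the next character must be a digit AND must not start "123".
rejected = re.search(r"^\D*(?=\d)(?!123)", "ABC123")
accepted = re.search(r"^\D*(?=\d)(?!123)", "ABC456")

print(single_line, multi_line)
print(surprising.group(), rejected, accepted.group())
```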
| Regular Expressions Reference | |
| Matchmaking with regular expressions | This is because the period character matches everything, including the space, the tab character, and even line breaks: regular expression: t.n Matches: tan, Ten, tin, ton, t n, t#n, tpn, etc. The bracket notation To solve the problem of the period's indiscriminate matches, you can specify characters you consider meaningful with the bracket ("[]") expression, so that only those characters would match the regular expression.... "Toon" would not match because you can only match a single character within the bracket notation: regular expression: t[aeio]n Matches: tan, Ten, tin, ton The OR operator If you want to match "toon" in addition to all the words matched in the previous section, you can use the "|" notation, which is basically an OR operator.... You can also use parentheses for groupings (more on that later): regular expression: t(a|e|i|o|oo)n Matches: tan, Ten, tin, ton, toon The quantifier notations Table 1 shows the quantifier notations used to determine how many times a given notation to the immediate left of the quantifier notation should repeat itself: Table 1.
...You can obtain a match using the PatternMatcher object in one of several ways, with the string to be matched against the regular expression passed in as the first parameter: boolean matches(String input, Pattern pattern): Used if the input string and the regular expression should match exactly; in other words, the regular expression should totally describe the string input boolean matchesPrefix(String input, Pattern pattern): Used if the regular expression should match the beginning of the input string boolean contains(String input, Pattern pattern): Used if the regular expression should match part of the input string (i.e., should be a substring) You could also pass in a PatternMatcherInput object instead of a String object to the above three method calls; if you did so, you could continue matching from the point at which the last match was found in the string.
...First, create the two regular expression strings and compile them into a Pattern object using the Perl5Compiler.
...As previously mentioned, this object lets you continue matching from where the last match was found in the string; thus, it's perfect for extracting the font tag's name-value pair.
... color: red More HTML processing Let's continue with another HTML example.
...The fourth parameter is the actual string on which you wish to perform the substitution, and the last parameter lets you specify whether you wish to substitute on every occurrence of the pattern found (Util.SUBSTITUTE_ALL) or only substitute a specified number of times. |
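The period, bracket, and OR examples from the summary above can be compared side by side. This sketch uses Python's `re` module rather than the Java PatternMatcher API the article describes, and the word list is an assumption. Python's `.` does not match a newline unless `re.DOTALL` is set, so line breaks are not tested here.

```python
import re

# Hypothetical candidate words, all lowercase to keep matching case-sensitive.
words = ["tan", "tin", "ton", "t n", "t#n", "toon"]

def whole_matches(pattern):
    """Return the words that the pattern describes completely."""
    return [w for w in words if re.fullmatch(pattern, w)]

dot = whole_matches(r"t.n")                  # any single character in the middle
bracket = whole_matches(r"t[aeio]n")         # exactly one of the listed vowels
alternation = whole_matches(r"t(a|e|i|o|oo)n")  # the OR form also admits "toon"

print(dot)
print(bracket)
print(alternation)
```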
| PERL Regular Expressions | . Match any character; \w Match "word" character (alphanumeric plus "_"); \W Match non-word character; \s Match whitespace character; \S Match non-whitespace character; \d Match digit character; \D Match non-digit character; \t Match tab; \n Match newline; \r Match return; \f Match formfeed; \a Match alarm (bell, beep, etc); \e Match escape; \021 Match octal char (in this case 21 octal); \xf0 Match hex char (in this case f0 hexadecimal). You can follow any character, wildcard, or series of characters and/or wildcards with a repetition.
...Powerful regular expressions can be made with groups. At its simplest, you can match either all lowercase or name case like this: if($string =~ m/(B|b)ill (C|c)linton/) {print "It is Clinton, all right!\n"} Detect all strings containing vowels: if($string =~ m/(A|E|I|O|U|Y|a|e|i|o|u|y)/) {print "String contains a vowel!\n"} Detect if the line starts with any of the last three presidents: if($string =~ m/^(Clinton|Bush|Reagan)/i) {print "$string\n"}; Note that the parenthesized element will appear as $1 in statements that follow the regular expression.
...For instance, to match anything that is not a vowel, do this: if($string =~ /[^AEIOUYaeiouy]/){print "This string contains a non-vowel"} Contrast to this: if($string !~ /[AEIOUYaeiouy]/){print "This string contains no vowels at all"} Best Uses of Character Classes Print all people whose name begins with A through E if($string =~ m/^[A-E]/) {print "$string\n"} If character classes are giving you quirky results, consider using groups!
...The directory output would look something like this: Volume in drive D has no label Volume Serial Number is 4547-15E0 Directory of D:\polo\marco. <DIR> 12-18-97 11:14a ... <DIR> 12-18-97 11:14a ..INDEX HTM 3,237 02-06-98 3:12p index.htmAPPDEV HTM 6,388 12-24-97 5:13p appdev.htmNORM HTM 5,297 12-24-97 5:13p norm.htmIMAGES <DIR> 12-18-97 11:14a imagesTCBK GIF 532 06-02-97 3:14p tcbk.gifLSQL HTM 5,027 12-24-97 5:13p lsql.htmCRASHPRF HTM 11,403 12-24-97 5:13p crashprf.htmWS_FTP LOG 5,416 12-24-97 5:24p WS_FTP.LOGFIBB HTM 10,234 12-24-97 5:13p fibb.htmMEMLEAK HTM 19,736 12-24-97 5:13p memleak.htmLITTPERL <DIR> 02-06-98 1:58p littperl 9 file(s) 67,270 bytes 4 dir(s) 132,464,640 bytes free UUUUgly!... Now count the bytes in the directory: my($totalBytes) = 0;while(<STDIN>) { my($line) = $_; chomp($line); if($line !~ /<DIR>/) #directories don't count { #*** only lines with dates at position 28 **** if ($line =~ /.{12}((\d| |,){14}) \d\d-\d\d-\d\d/) { my($bytes) = $1; $bytes =~ s/,//; #substitute nothing for comma -- delete commas $totalBytes += $bytes; } } }print "$totalBytes bytes in directory.\n"; Note the group within a group, where the inner one is used for character alternation, and the outer is used as a selection.
...Here's a simple example: $string =~ m/Bill Clinton/; #return true if var $string contains the name of the president $string =~ tr/Bill Clinton/Al Gore/; #replace the president with the vice president !~ Just like =~, except negated.... Here are simple examples: $string =~ m/Bill Clinton/; #return true if var $string contains the name of the president $string =~ tr/Bill Clinton/Al Gore/; #replace the president with the vice president m The match operator.... Here are some examples: $string =~ m/Bill Clinton/; #return true if var $string contains the name of the president $string =~ /Bill Clinton/; #same result as previous statement ^ This is the "beginning of line" symbol. |
| Regular Expressions in java | News Item 12/16/2002: Stevesoft has released a new version of Phreida (http://javaregex.com/phreida).
...News Item 11/26/2002: Update: The version of xmlser is upgraded to release candidate 5 (v 0.98).
...But that's not all you can get: there are also some science-fiction-themed t-shirts and mouse pads.
...Since jdk 1.4 now has a stable Regex product, we feel that it is time to release package pat under the Lesser GPL.
...Online since May of 1996: Many Java software companies and packages have come and gone since Java was invented.... Documentation: Whether you want javadoc comments, a tutorial, or examples of use it's all available at left.... Speed: If the optimize() method is called package pat uses a Boyer-Moore Horspool search (if the pattern begins with a literal text string).... Java 1.0 support: If you need to restrict yourself to using java 1.0 you can still use pat. |
| The Regex Coach - interactive regular expressions | The Linux version comes with the executable and a small text file with X resources which you can (optionally) use to enhance the optical appearance of the application - either load it temporarily with xrdb -merge or use it permanently by appending it to your ~/.Xresources file.
...The Regex Coach enables you to try out the behaviour of Perl's regular expression operators (namely m//, s///, and split) interactively and in "real time", i.e. as soon as you make changes somewhere the results are instantly displayed.
...It works like this: If you've selected a valid subexpression of the regular expression in the regex pane the corresponding part of the target string is shown in orange.
...Apart from highlighting the part of the target string which corresponds to the selected area in the regex pane you can also highlight the parts which correspond to captured register groups (enclosed by parentheses) in the regular expression.
...The headline above the scan buttons which usually says "Scan from 0" will change accordingly showing a message like "Scan #n from m" which means that the regex engine is trying to find the nth match starting at character m of the target string. The target message area will be changed as well - it'll say "Match #n from k to l" instead of "Match from k to l" (or it'll say "No further match" instead of "No match" if you've pressed the scan forward button too often).
...Some of them remain, however: The engine will only try to match from position 0 if the regex starts with .* and is in single-line mode.
...It might be worthwhile to note that, due to the dynamic nature of Lisp, The Regex Coach could be written without changing a single line of code in the CL-PPCRE engine itself, although the application has to track information and query the engine while the regular expression is parsed and the scanners are built. |
| Web Server Administration | Get answers to these and other questions in this, part 2 of Managing Users from the book Linux Administration, A Beginner's Guide, third edition by Steven Graham and Steve Shah (McGraw-Hill/Osborne, 0072225629, 2002).
...(from the book Linux Administration, A Beginner's Guide, third edition by Steven Graham and Steve Shah, McGraw-Hill/Osborne, ISBN:0072225629, 2002). Discuss this article!
...The chapter is from the book, Network Security: The Complete Reference, by Mark Rhodes-Ousley, Roberta Bragg, and Keith Strassberg (McGraw-Hill/Osborne, 2003, ISBN: 0072226978). Discuss this article!
...In this tutorial, find out how to obtain, install and use the popular ht://Dig indexing engine to add powerful, effective search capabilities to your site with minimal time and fuss. |
| oreilly.com -- Online Catalog: Mastering Regular Expressions, 2nd Edition | Register your book to get email notification of new editions, special offers, and more.
...regular expressions are an extremely powerful tool for manipulating text and data. They are now standard features in a wide range of languages and popular tools, including Perl, Java, VB.NET and C# (and any language using the .NET Framework), PHP, Python, Ruby, Tcl, MySQL, awk, and Emacs. If you don't use regular expressions yet, you will discover in this book a whole new world of mastery over your data. If you already use them, you'll appreciate this book's unprecedented detail and breadth of coverage. |
|
Amazon.com: Books: Mastering Regular Expressions
| Book News, Inc. A tutorial and reference to using the technique of regular expressions to facilitate the handling of text and data in Perl and other scripting languages, editors such as Emacs, programming environments including Delphi and Visual C++, and specialized tools such as Expect.
...regular expressions can be found everywhere, such as in scripting languages (including Perl, Tcl, awk, and Python); editors (including Emacs, vi, and Nisus Writer); programming environments (including Delphi and Visual C++); and specialized tools (including lex, Expect, and sed).
...They are powerful ways to search, manipulate and parse text fields and can often take several lines of code and shrink it down to a mystic, but powerful, expression. If you have ever had to parse a file for information, you know that one of the things that still haunts any programmer nowadays is how to match text.... At times it may seem like you have to read over a section twice, but you will realize that as you carry forth into the next section the material you read previously has turned into something you can now apply -- not just another example you can cut and paste and never really learn the technique behind. This is a powerful book, covering many, many pages.... I find myself using it several times a week to look up information on regular expressions and to help solidify knowledge of techniques that I have used in the past. Whether you are a Windows, Unix, or even Macintosh person -- RegEx holds the key to text manipulation -- and this book holds the map you need to find that key.
...Actually you'll probably want to read it again as the first time round you were so glued to the pages you didn't have time to try out the examples yourself. In a book such as this, layout and typographical conventions are of utmost importance, and this book gets this spot on. An author who can cover this subject without simply using masses of examples and dry outlines of selected syntax arrangements deserves an accolade.... It stimulates the juices and is a struggle to put down (to the detriment of your hands-on practice as mentioned above). I was quite wary of exploring the territory of regular expressions and used to be very ambivalent towards Perl but this book helped to ease me into a whole new world of script programming. This book is not just for Perl geeks. |
| Regular Expressions | Note: This lesson covers an API introduced in the latest release of the Java™ 2 Platform Standard Edition, version 1.4.
...java.util.regex API for pattern matching with regular expressions. This lesson does not assume that you have any previous experience with regular expressions.... This lesson starts with the basics, and gradually builds to cover more advanced techniques.
Note: We recommend that you open java.util.regex API specification in a separate browser window.
...Describes the basic predefined character classes for whitespace, word, and digit characters.
...Examines other useful methods of the Pattern class, and explores advanced features such as compiling with flags and using embedded flag expressions.
Describes the commonly-used methods of the Matcher class. |
|
Pattern (Java 2 Platform SE v1.4.2)
| Backslashes within string literals in Java source code are interpreted as required by the Java Language Specification as either Unicode escapes or other character escapes.
...In Perl, \1 through \9 are always interpreted as back references; a backslash-escaped number greater than 9 is treated as a back reference if at least that many subexpressions exist, otherwise it is interpreted, if possible, as an octal escape.... In this class, \1 through \9 are always interpreted as back references, and a larger number is accepted as a back reference if at least that many subexpressions exist at that point in the regular expression, otherwise the parser will drop digits until the number is smaller or equal to the existing number of groups or it is one digit.
...In this class, embedded flags always take effect at the point at which they appear, whether they are at the top level or within a group; in the latter case, flags are restored at the end of the group just as in Perl.
...Case-insensitive matching can also be enabled via the embedded flag expression (?i).
...If this pattern does not match any subsequence of the input then the resulting array has just one element, namely the input sequence in string form.
...If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.... If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded. |
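The back-reference, embedded-flag and no-match split behaviour summarized in the Pattern entry above can be tried out in a few lines. This is only a rough sketch in Python's standard `re` module, used here as a stand-in for java.util.regex; the sample strings and variable names are made up for illustration. Java's split limit parameter is not reproduced, because Python's `maxsplit` argument counts differently.

```python
import re

# (?i) at the start of the pattern turns on case-insensitive matching.
case_insensitive = re.findall(r'(?i)regex', "Regex, REGEX and regex")

# \1 refers back to whatever the first group matched: here, a doubled word.
doubled = re.search(r'\b(\w+) \1\b', "this is is a test")
doubled_word = doubled.group(1) if doubled else None

# When the delimiter never matches, split returns a single element
# holding the whole input string.
no_match_split = re.split(r',', "no delimiters here")
```

In Java the same ideas would go through Pattern.compile and Matcher rather than module-level functions, so treat this as an illustration of the concepts, not of the Java API.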
| 4.2 re -- Regular expression operations | This module provides regular expression matching operations similar to those found in Perl. Regular expression pattern strings may not contain null bytes, but can specify the null byte using the \number notation.
...regular expressions use the backslash character ("\") to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python's usage of the same character for the same purpose in string literals; for example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be "\\", and each backslash must be expressed as "\\" inside a regular Python string literal.
The solution is to use Python's raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with "r". So r"\n" is a two-character string containing "\" and "n", while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.
...The second edition of the book no longer covers Python at all, but the first edition covered writing good regular expression patterns in great detail. |
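The backslash collision described in the re entry above is easy to check directly. A minimal sketch, assuming Python 3 and only the standard `re` module; the sample path and variable names are illustrative and not taken from the original page.

```python
import re

# A regular string literal needs four backslashes to spell the
# two-character regex \\ (which matches one literal backslash).
escaped_pattern = '\\\\'
# The raw-string form spells the same regex with only two.
raw_pattern = r'\\'

# Sample text (made up for illustration): a Windows-style path.
sample = r'C:\temp\notes.txt'

same_pattern = escaped_pattern == raw_pattern
backslash_count = len(re.findall(raw_pattern, sample))

# r"\n" is two characters (backslash, n); "\n" is one newline character.
raw_length = len(r"\n")
plain_length = len("\n")
```

Both pattern spellings compile to the same expression; the raw form is simply easier to read and harder to get wrong.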
| Jeffrey Friedl's Mastering Regular Expressions | This is the web site for the second edition of Mastering Regular Expressions, by Jeffrey Friedl.
...To help you decide if it's worth your while to upgrade from the first edition, I wrote this article at the O'Reilly Network.
...The first edition contained a 100-page chapter devoted to Perl, so many who merely glanced at it assumed it was a book on Perl regular expressions instead of the general book on all regular expressions that it was.... However, let me assure you that while the Second Edition does have a huge chapter devoted to Perl, it has 350 other pages on general and language-specific topics (and an additional 50 pages for the table of contents and the index). The cover is still "Perl blue", but like before, the content aims at all regular expressions.
...The full index is available in one large (about 250k) web page, suitable for searching via your browser's search function.
As a testament to the hard work I put in to this book, the first errata wasn't reported until the book had been out for over a full week :-) (By contrast, it took about five minutes for the first typo to be found in the first edition.)
...You can fetch program and data listings from one page, or from a range of pages. |
| PERL -- Regular Expressions | The patterns used in pattern matching are regular expressions such as those supplied in the Version 8 regexp routines.... The scope of $<digit> (and $`, $& and $') extends to the end of the enclosing BLOCK or eval string, or to the next pattern match with subexpressions.... Within the pattern, \10, \11, etc. refer back to substrings if there have been at least that many left parens before the backreference.
...By default, the ^ character is only guaranteed to match at the beginning of the string, the $ character only at the end (or before the newline at the end) and perl does certain optimizations with the assumption that the string contains only one line.... You may, however, wish to treat a string as a multi-line buffer, such that the ^ will match after any newline within the string, and $ will match before any newline.
...Any item of a regular expression may be followed with digits in curly brackets of the form {n,m}, where n gives the minimum number of times to match the item and m gives the maximum.
...Unlike some other regular expression languages, there are no backslashed symbols that aren't alphanumeric.... This makes it simple to quote a string that you want to use for a pattern but that you are afraid might contain metacharacters. |
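The multi-line anchors, {n,m} quantifiers and metacharacter quoting mentioned in the Perl entry above have close counterparts in other regex engines. The sketch below uses Python's standard `re` module as a stand-in, not Perl itself; the sample strings and names are assumptions made for illustration.

```python
import re

text = "first line\nsecond line\nthird line"

# By default ^ matches only at the start of the whole string.
default_hits = re.findall(r'^\w+', text)
# With re.MULTILINE, ^ also matches right after each newline.
multiline_hits = re.findall(r'^\w+', text, flags=re.MULTILINE)

# {n,m}: match the preceding item at least n and at most m times.
digit_runs = re.findall(r'\d{2,4}', "7 42 12345")

# re.escape backslash-quotes metacharacters so a string is matched literally.
literal = "a.b"
quoted = re.escape(literal)
literal_match = re.fullmatch(quoted, "a.b") is not None
wildcard_avoided = re.fullmatch(quoted, "axb") is None
```

Quoting user-supplied text before building a pattern from it is the usual reason to reach for an escape helper like this.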
Last Updated: 08/27/04 02:48 PM
Generated with Anthracite/1.0.8
|