I generally don't subscribe to the UNIX religion, but there is one area where I am an unabashed convert: regular expressions. Yeah, the syntax is a little scary, but for processing strings, nothing is more effective. The RegEx is the power drill of the programmer's toolkit: not appropriate for every job, but the go-to tool for a lot of common jobs. And what could be more common than the humble string, particularly in this day and age of HTML, XML, SOAP, and other plain text formats? Most modern development languages have complete Regular Expression support-- even in the IDE for things like search and replace.
Over the last four years I've experimented with a number of commercial, freeware, and even homegrown RegEx tools. In the .NET era, I started with Expresso, and I recently found out about Regulator, which is hands down the most impressive free RegEx tool I've encountered to date. But that was before I met my new best friend, RegexBuddy:
I belatedly realized after I created this screenshot I may have accidentally picked the complicated "run away screaming" example. Great for me as an intermediate regex user, but not so great for introducing people to the miracle of RegEx. So let me apologize by way of explanation: this regex captures all valid HTML 4.0 tags. It also exploits a very powerful feature called named captures-- see the ?<element> and ?<attr> highlighted in that tannish-brown? In .NET you can refer to those matches with a very simple, logical syntax:
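' reg is a Regex built from the pattern above; strHTML holds the HTML to scan
Dim mc As MatchCollection = reg.Matches(strHTML)
For Each m As Match In mc
    ' Look up each named capture by the name used in the pattern
    Console.WriteLine(m.Groups("element").Value)
    Console.WriteLine(m.Groups("attr").Value)
Next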
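If you want to try named captures without the scary pattern from the screenshot, here is a minimal sketch. The pattern below is my own deliberately simplified stand-in, not the full HTML 4.0 regex, and the sample input and the module name NamedCaptureDemo are invented for illustration. It only handles double-quoted attribute values.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedCaptureDemo
    Sub Main()
        ' Hypothetical, simplified pattern (NOT the regex from the screenshot).
        ' It matches an opening tag, capturing the tag name as "element"
        ' and each double-quoted attribute's name as "attr".
        Dim pattern As String = "<(?<element>[a-zA-Z][a-zA-Z0-9]*)(?:\s+(?<attr>[a-zA-Z-]+)=""[^""]*"")*\s*/?>"
        Dim reg As New Regex(pattern)

        ' Sample input invented for this example.
        Dim strHTML As String = "<a href=""index.html"" title=""Home""><br/>"

        For Each m As Match In reg.Matches(strHTML)
            Console.WriteLine("element: " & m.Groups("element").Value)
            ' A group inside a repeated construct matches several times;
            ' .Captures holds every one, while .Value is only the last.
            For Each c As Capture In m.Groups("attr").Captures
                Console.WriteLine("  attr: " & c.Value)
            Next
        Next
    End Sub
End Module
```

Run against that sample string, this should list the a tag with its href and title attributes, then the br tag with no attributes. The thing to notice is the Captures loop: when a named group sits inside a repeat, Groups("attr").Value only gives you the final match, so you walk the Captures collection to see them all.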
The one unique, killer feature that RegexBuddy has is super fast, real-time highlighting of all possible matches as you type the regular expression. That has always been my complaint about regex composition: it's difficult to tell beforehand what the effect of your regex will be until you "run" it and browse all the matches. With RegexBuddy, you don't have to-- just type and watch. No running required. But that's not the only great feature: the plain text regex decomposition and the pre-built regex library are also best of breed. Needless to say, highly recommended, and currently my preferred tool. It's not free, but TANSTAAFL.
Once you come to grips with the basics of regular expressions, you'll want a handy cheat sheet of the syntax. The best one I've found is VisiBone's JavaScript foldout. There's also an online version. All the VisiBone stuff is super cool, and brings back warm memories of those incredible Beagle Brothers posters I had for the Apple //. However, the information density does get a little ridiculous on the VisiBone cards, so I'd go with the foldouts or the wall charts, unless you enjoy squinting a lot. If you just can't get enough, and you want to learn about the thrilling history of regular expressions and understand how they work under the hood (try to envision me stifling a yawn at this point), there's also the O'Reilly book.
You may not even need to know the syntax if you can drop prebuilt regexes into your code. Why build what you can steal? There are a number of sites with growing prebuilt repositories of regular expressions:
I also found this crazy tool written in LISP (!)
http://www.weitz.de/regex-coach/
Which can actually step through a regex interactively-- an actual debug mode-- and display the parse tree. Very interesting and definitely unique.
It ain't free anymore! $29.99....
Andrew on March 13, 2007 9:49 AM
The regulator on the other hand looks like a great free alternative.
Ben Blok on October 30, 2007 8:41 AM
There's now also the very swish, free, JavaScript-powered RegexPal (cutely similar name, hmm?) at http://regexpal.com/ .
Earle Martin on April 10, 2008 5:47 AM
My RegEx Tool of choice is Rad Software Regular Expression Designer, which is free and works well, at least for my needs:
http://www.radsoftware.com.au/regexdesigner/
Superb article, with fantastic references to resources. I've always used regular expressions sparingly, simply because I have to wrap my head around the syntax each time... but you have in a single post linked to a really concise set of great resources.
Bug on April 10, 2008 12:20 PM
Nothing beats RegexBuddy. I pay for it (personally) as well. I've tried all the freebies, and they cannot compare. It's honestly not much money (about $40), and it will save you oodles and oodles of time. Just think, if your time is worth ~$40/hr (as mine is thanks to my dilbert-esque corporate job), then you can pay for this in just one hour. Even if it takes 8 hours of your time to pay for this (which would be $5.00 per hour for 8 hours), it's worth it!
Do yourself 2 favors. #1, learn regular expressions. #2, buy RegexBuddy.
Regards
Mick on April 10, 2008 9:01 PM
Have you considered, like, I dunno... just learning the language?
It's not that hard.
Darth Mainer on April 17, 2008 12:19 PM
>>Have you considered, like, I dunno... just learning the language?
>>It's not that hard.
This is a crazy statement!
Sure the RegEx concepts of matching are not hard to understand. The problem is the human usability of the syntax is one of the worst I have ever encountered in all of software development.
If RegEx were designed today it would have fancy features like "keywords" instead of overloading the '?' character to have 7 different meanings.
ecards on July 21, 2008 3:55 PM
For JavaScript regex testing, I use this online one - very easy to use and all.
http://www.pagecolumn.com/tool/regtest.htm
Hans Zena on November 13, 2008 1:43 AM
I am in the process of learning Perl and also Java. RegEx's are perhaps one of the most important, and also complex aspects of these languages in order to extract information from input data and produce the desired output. Someone suggested just learning the language. I fully agree, but I feel that getting a tool like RegexBuddy would actually help me to learn Regex more rapidly than struggling through many books, websites, etc. I am going to buy the thing. For $40 I think it is worth it.
David on December 22, 2008 4:48 PM
Content (c) 2009 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved.