My Code Here: June 2011

Thursday, 30 June 2011

Facebook: An Ugly Stupid Service

... designed to teach you to systematically undervalue your privacy.

Author and freedom fighter Cory Doctorow in great form on this subject, censorship, psychology, and education (from TEDxObserver):

Rediscovered this morning at this Boing Boing post - itself inspired by John Scalzi's Whatever.

TEDTalks are distributed under a Creative Commons (CC) licence.

Monday, 27 June 2011

Security Testing Part 4 - Pentests

A Monotone Spectrum

Pentests, short for Penetration Tests, describe the simulation of malicious attacks for the purpose of evaluating a computer system or network's security. These attacks are available in any colour, as long as it's greyscale...

In The Clear

Pentests are quite often clear box (or white box, or full disclosure) attacks, simply because as testers we more often than not have full access to, and complete knowledge of, the infrastructure to be tested. So, why not make best use of this advantage? Such knowledge includes all available information about the internal data structures and algorithms employed, the relevant source code used to implement these, the network diagrams, IP address data, actual passwords, and so on.

Examples of white box testing include:

API, Fault Injection and Mutation Testing methods.
Most static testing, e.g., code reviews, inspections and Valkyries (damn autocorrect: I meant walkthroughs); testing where the software isn't actually used. Instead for example, the code may be read, or scanned automatically for syntactic validity.
Code coverage can only be delivered through white box testing. Without inspecting the source code, there can be no guarantee that any given code path is exercised.

White box methodology is frequently used to evaluate the completeness of test suites created through black box testing methods (see below). This strategy allows the examination of the rarely tested parts of a system, ensuring coverage of the most important function points.

In The Black

Pentests can also of course be black box (also known as blind) tests, where there is assumed to be no prior knowledge of the infrastructure to be tested. As testers, we must first determine the location and extent of the system under test, before commencing analysis. When we use black box penetration testing methods, we are assuming the role of a real, external, black hat hacker, who is trying to intrude into our system without much actual knowledge about it. By contrast - almost literally! - white box testing can be seen as simulating what might happen after a leak of sensitive information, or equivalently, during an "inside job", when the attacker has access to confidential information.

Grey Goo

In between these two extremes, a school of grey box testing has evolved. As the name suggests, these pentests combine aspects of black and white box attacks. The main reason to do this is to provide customised test coverage for various elements in distributed systems. For example, we might use our knowledge of internal data structures and algorithms for the purpose of designing our test cases. Then when it comes to the point of actually executing these tests, we perform them as a normal user, i.e., at a black box level.

A good example of grey boxing is when we modify the content of a data repository, which is not itself part of the delimited system under test. That's not something which a user would normally be able to do, so it can introduce a white box element into an otherwise black box test or attack suite. Another example in a similar vein might be the determination of error message contents, or system boundary values, by reverse engineering.

Note that most instances of input data manipulation and/or output formatting do not qualify as grey box techniques, because by definition, input and output are not part of the system under test. So for example, integration testing between two code modules written by two distinct developers, where only certain interfaces are exposed for test, is still regarded as black boxing.

Careful Now!

Elementary white box penetration testing can often be done automatically, and therefore cheaply. Black box attacks are another matter entirely. Because you are literally attacking a network (often a working production system) blindly, your test activities will inevitably comprise actual security attacks. You will cause denial of service, both intentionally and as a side effect of the stress you put on network response time via vulnerability scanning. At worst, you might cause actual harm to the system, rendering it just as inoperable as had a real black hat attacked. Much of the time and effort required with black box pentests lies in trying not to destroy things, while still reaching deeply enough to expose vulnerabilities.

Pronounced "Awe Stem"

The OSSTMM, or Open Source Security Testing Methodology Manual, is both a peer-reviewed security testing, metric measurement and analysis methodology, and a philosophy of operational security. It is a Creative Commons licensed publication of the Institute for Security and Open Methodologies (ISECOM). As such, the encapsulated methodology, covering what / when / where to test, is itself free to use and distribute under the Open Methodology License (OML).

The Manual's primary objective is to create a scientific methodology and metrics for operational security evaluation, based upon test results. It suits most kinds of security audit: penetration tests, ethical hacks, security and vulnerability assessments, and so on. Secondarily it acts as a central reference in all security tests regardless of the size of the organization, technology, or protection, and provides analyst guidelines, enabling a certified OSSTMM audit by assuring:

test thoroughness and legal compliance;
inclusion of all necessary channels;
results quantification, consistency and repeatability; and
that factual information is derived exclusively from tests.

According to the ISECOM website, a handbook version of version 3 of the manual will be "available soon".

Previously:

Part 1 - Overview
Part 2 - Lab Work
Part 3 - The Attack Surface

Thursday, 23 June 2011

Remember EA_Spouse?

Well She Wrote A Book

No, not about the labour practices of a top video games firm in 2004 and beyond. Although that singular LiveJournal article certainly showed, among other things, that the (then anonymous) Erin Hoffman had a terrific talent for certain kinds of writing; EA: The Human Story was nominated for Joel Spolsky's Best Software Essays of 2004.

Subsequent events showed her to be equally determined, single purposed and shit stirring when deciding to embark upon a campaign, to highlight or right a wrong, to raise the profile of an issue she feels is getting brushed under the beanbag. Her gamewatch.org forum today holds over 12,000 posts in as many topics, though it seems not to have changed that world; for example, one comment from five years into the project (July 2009) revealed:

They're still doing it. I have a friend who is working 6am to 9pm 7 days a week as his project approaches release.

Despite Riccitiello's assurances otherwise, his middle management is fighting him and refusing to change. They are still paying below-the-poverty-line wages, they still are incapable of figuring out a schedule that doesn't involve abuse of its employees, and they are still playing games with employee classifications to avoid providing full benefits.

I'm in the industry, and if my company ever got acquired by EA, I would quit on the spot. My salary would be cut, my hours increased without compensation, and my work transformed into a bureaucratic mess (I've heard how heavy in middle management EA is). I'd be spending more time filling out useless make-the-managers-look-busy reports and attending endless meetings than coding and documenting. Nothing is worth this price, and people looking to enter the industry need to realize that.

Anyone but EA.

So Not About That Then

So no, like I said, the book's not actually about any of that. As you might more reasonably have guessed, Sword of Fire and Sea - subtitled The Chaos Knight, Book One - is a fantasy, written by one "obsessed with hidden truths, and the responsibility involved in uncovering them." Main character Captain Vidarian Rulorat is the last surviving member of his family. Obligated to an allegiance with the High Temple of Kara'zul by his great-grandfather's abdication of imperial commission (for love of a fire priestess, no less), Vidarian struggles to resolve the conflicts between the real world of his family legacy, and Andovar's hidden and morally ambiguous history.

One of the things drawing me towards this title, in addition to its glowing reviews by multiple Hugo Award winning SF/F novelists ("Read it and be swept away" says Allen Steele), is its length. Or rather, the dearth of it. Erin has made the very deliberate choice to keep it succinct. Short novels, she says, are rare in fantasy these days. She loves the short form, and obviously hopes many others secretly do too. From her Big Idea piece via John Scalzi:

I want to get in, get euphoric, and get out, without getting bogged down in lengthy genealogy records or endless hikes across Mordor.

Available to preorder at amazon.co.uk.

Sunday, 19 June 2011

Computer Museum (2)

Re-Animation

Fun though it was to dig out my old Sharp PC-1211 for the previous article in this series, it was a little disheartening to realise that any attempt to procure its toxic little mercury batteries would likely land me on a terrorist watchlist for the rest of my life. So it was a happy surprise to be reminded that its successor, the PC-1500, takes four bog standard AA cells. Here it is, recently emerged from the loft, all powered up and asking for permission to erase its now random memory contents, the dream debris of its (almost three decade) nap.

This 1983 purchase was funded through the usual channels. In other words, I initially went halfers with my pal Brian. Then once I'd accumulated enough buroo cheques, I did him over like Eric Cartman for full ownership. Honestly, sometimes I wonder how my friends ever did put up with my behaviour. But the time between striking this partnership and its ultimate betrayal was a golden age; the PC-1500 turned out to be a hacker's wet dream.

Peek And Poke

We discovered various ways to abuse this BASIC programmable pocket computer, forcing it to interpret pre-crafted code memory contents as data, and vice-versa. This first revealed that the BASIC ROM recognised several keywords not mentioned in the official user's manual. Not just any keywords, but the holy trinity of PEEK, POKE, and the almighty CALL. Soon we had discovered enough single machine code I/O instructions to discern that the processor was very Z80-like in its architecture, at which point Troy fell, and shit was lost.

Soon armed with the full processor instruction set, we started writing super fast Moon Landers, Star Treks, Snakes, Space Invaders... and of course my own personal favourite, Son of the Revenge of Complex Arithmetic III. Having figured out the display hardware too, we had what felt like unlimited graphics power. Although the monochrome LCD was just 156×7 pixels, it was much cleaner and sharper looking than that of the PC-1211, which had used an ugly yellow-green filter to protect its fragile, almost still prototypical, vampiric liquid crystals, from damage by daylight.

Actually on second thoughts, I think Dominoes was my favourite, for several reasons. It was the first game I wrote using the full power of the machine, and the one that paid off the initial purchase costs. The domino images were pixel-perfect, and the game let me introduce my dad to computers (no mean feat for a geek in the early 80s), because its UI was so friendly: to play the 3-4 domino, you just typed 34. Finally, the AI was terrific; it made a truly formidable opponent. How did I achieve this level of awe? I programmed it to look at your hand.

Technology Caught Evolving

So here's what my PC-1500 looks like inside. Notice the two 0.1" pitch chips, the ones labelled TC5514P, sticking up like sore thumbs on an otherwise 0.05" pitch surface mount 2-layer board with through plating. Those are 1K by 4 bit static RAMs. The big LH5801 on the bottom board is the CMOS static 8-bit CPU, its LH5811 neighbour the peripheral I/O controller (an unwritten law said these always had to be named +10 higher than the corresponding CPU part number). The whole machine, like all such pioneers, screams a thoroughgoing compromise of new and old technology; 6V performance versus 130mW C-MOS battery life.

No Peripheral Vision

We never did fork out for the audio cassette interface and printer. This might seem unbelievable now, but it's true: the ritual and preamble to playing a computer game involved an hour or so of typing it all back in again. From your own notebooks, or from multipage magazine listings. Quite often this was the point at which games evolved, as you'd notice some possible improvement, or identify a great extension, each time you laboriously re-typed the now familiar code.

That was bad enough for BASIC code. But now we had to enter machine code and hexadecimal data, all without the aid of an assembler. Not something you want to have to do even once, but we did it every day. More than the lack of program memory, it was this tiresome drudgery that taught us only ever to write optimum code, first time.

I did design, and build into one of those pale blue Marshalls Electronics project boxes, a DC power supply to use with the PC-1500. Sadly I never really trusted my own crowbar overvoltage protection circuit enough to use it for more than a few minutes at a time. Eventually binned it just last year.

Previously: Computer Museum (1)

Update (June 25): the day after I power it up, the old PC-1500 begins haemorrhaging from the top right corner of the display. Click the photo on the right to sleuth the evidence. What the hell is this? So I dismantle it...

Inside there's this 10cm dark red opaque plastic block, soft yet brittle, seemingly stuck to the top of the display with strawberry jam. That's it on the right hand side, while the trail of blood can now be seen on the PCB, display, bezel, and other components of the casing. This is going to take a lot of solvent, and a few drums of cotton buds...

Half a bottle of meths and 100 cotton buds later: burp. Well that was an epic, what a fiddly stripdown, clean and rebuild. But the operation, the sticky gunky gluey plastic blockectomy, has been a success, and the wee beastie is back in pristine working condition. As for the thing I removed, not been able to find any reference to that in the online PC-1500 reference material. I suspect it may have been a thermal mass.

The late seventies belonged to the bright, power hungry, red 7-segment LED wristwatch and calculator. Arriving in their New Romantic mullets, the first LCD replacements were quite temperamental, by which I mean, temperature sensitive. They needed thermistor circuits to stabilize their viewing angle. My guess is that the soft plastic strip I've removed and discarded was designed to average out fluctuations in the sensed temperature, e.g. to stop the display from fading when the device was held in warm hands.

This work, including photography, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported Licence.

Saturday, 18 June 2011

Sonny Marvello

Update 16 December 2011

This post has been removed.

Update 7 January 2012

On the other hand, the band's offensive remarks haven't yet been removed from Facebook, and are still publicly viewable. I guess it does no further harm to repeat them here:

https://www.facebook.com/sonnymarvello/posts/3015725752922

Friday, 17 June 2011

Prog Rock & Metal News

Read All Around It

Most mornings, Google News is your excellent first stop shop for all that's happened in the night. The exception is Sunday, when nothing lighter than the dead tree edition of The Sunday Times will satisfy (online access not required, thanks). Truth is, I would still buy that weekly hundredweight of paper if everything except Dan Cairns's new music reviews got blacked out.

For all those other, lesser days of the week, the customizability of the Google News U.K. page* ensures there's always more than enough news, knowledge and gossip on tap. The important step is that customization. My own choice of standard sections, given geography and blogging interests, holds no surprises: World, U.K., Scotland, Glasgow, Sci/Tech, Physics, Astronomy, Space, Computer Security, Video Games, Entertainment, and Rock Music.

If you haven't personalised your Google News page yet, use the Add a section link at the top right to find and add news categories that interest you, and the Edit this page link to organise these, or to delete those of no interest. I'll never forget that enlightened day when my whole life, well my Google News experience anyway, improved tenfold as I finally got rid of those toxic default Business, Sports, Health and Spotlight pages.

The Lost Chord

But there's one more, essential category of update you won't want to be without. I speak obviously of the latest events in the world of progressive metal music. None of the standard supplied pages can quite satisfy the exacting criteria of this specific thirst for knowledge. Despair not, fellow geek headbangers, I bring you good news! Literally. I mean I've created a suitable custom section, based on the simple query "prog rock, progressive metal", which teases out of the Googleplex, just the optimum mixture of attention worthy, classic and modern, metal, prog, and their bastard offsprogs.

On the Add a section page under Search for sections, type progressive metal and hit Search. You should get just one result, titled Prog Rock & Metal; that's my page. Here's a link to its current content:

http://news.google.co.uk/news/section?pz=1&cf=all&ned=uk&hl=en&csid=7d6ecf57b214b344

Never a week goes by without at least one or two interesting developments popping up here, which I haven't yet seen elsewhere. And the page is currently enjoying a great surge in popularity! Last time I checked, I think there were a total of approximately four subscribers. Wow! Clearly this is a phenomenon whose something something now! And just imagine: you can be subscriber number five. That's right, I'm letting you in on the ground floor of this unique opportunity.

You're welcome.

* Actually the BBC News site gets roughly equal time; public service output is an excellent antidote to filter bubbling.

Update (July 18): Yes, in the wake of the phone hacking scandal, my entire extended family and I have now joined the inevitable total boycott of all News International publications. To my shame, I should have done this many years ago. Yet for all its faults, crimes, and abuses, still there's nothing out there to approach the quality of The Sunday Times when it's good. Trouble is, we can no longer determine when those times are, and when by contrast, it is being controlled - nay, written - by organised criminals. My Secondary English teacher Mrs Abraitis once enjoined us all to commit to this, then venerable, paper; and as many a raped choirboy might attest, such formative imperatives often cast long shadows.

Today I can only hope that Dan Cairns will move his musical journalism expertise quickly elsewhere.

Monday, 13 June 2011

Focus!

A Fortuitous Juxtaposition

RSS readers will have missed the coincidence of expression in my Technology sidebar today, where these two articles from different Microsoft blogs collided:

Nice to know the Key to Success at Microsoft is not something that you need to steal.

A Really Good 4096-bit AES Key Service

Where Do I Sign?

Sorry, there's just no such thing. Oh it exists in principle, yes. In theory. In Plato's universe of ideals, yes, you can buy them there. Pick them up for free, in fact. But as Robert X. Cringely (who is not a spy) explains in the article When Engineers Lie, "Build a really good 4096-bit AES key service and watch the Justice Department introduce themselves to you."

Bob also answers the perennial question about architectural secrecy: why is it needed at all, if 1024- or 2048-bit codes really would take thousands of years to crack? Isn’t the encryption, combined with a hard limit on login attempts, good enough? The answer is no. Part of the explanation is that the U.S. government insists on nobbling the key services of every provider - RSA, Cisco, Microsoft, just everybody - to ensure they're sufficiently insecure, to enable snooping. The cost of noncompliance is, of course, jail.

For the obligatory silver lining, Bob points to IPv6 and Open Source, which he reckons "are beginning to close some of those security doors that have been improperly propped open." Go read his article at I, Cringely to find out why.

V For Vendetta

Recently this little blog hasn't been doing its job, of reporting on the latest big security breaches, how they were done, whodunnit, and to whom. Quite simply there have been far too many; an almost unprecedented number of successful attacks, on a scale rarely seen previously. And as I've said before, the media coverage of this phenomenon has been such as to render my monthly Security Digest utterly redundant.

Instead we sit on the sidelines enjoying the mayhem, grumbling agreement with Bob Cringely. Or with Patrick Gray at Risky.biz, who reckons that we - the professional security community - secretly love LulzSec. Patrick's rant is still more entertaining than Bob's, and just as thought provoking, as he laments his ten years of futility, trying to get businesses to see and acknowledge the potential for chaos which LulzSec is now so ably demonstrating, and which is gradually earning the reluctant respect of security researchers:

Security types like LulzSec, because they're proving what a mess we're in. They're pointing at the elephant in the room and saying "LOOK AT THE GIGANTIC FUCKING ELEPHANT IN THE ROOM ZOMG WHY CAN'T YOU SEE IT??? ITS TRUNK IS IN YR COFFEE FFS!!!"

Not Necessarily the Official SDL Position

Local hero and a familiar name to readers of this blog, Adam Shostack chimes in to express reluctant, or guarded, agreement. In his article Are Lulz Our Best Practice? he confides that he takes "a certain amount of pleasure in watching LulzSec. Whoever’s doing it are actually entertaining, when they’re not breaking the law. And even sometimes when they are."

In prescribing the way out of our present troubles, Adam returns to the main point being made by Bob in the first article above. Salvation lies in admitting the breaches that do occur, talking about them and about the wider world of security openly, transparently and honestly.

Sunday, 12 June 2011

Seven Seas of Rhye

Wouldn't exactly call myself a Queen fan.

Commercially, they had some spectacular ups; some surprising, and some predictable, downs. Theatrically, they were always sans pareil. Lyrically of course, Freddie never stopped improving for all of his life. But musically speaking, seriously, I don't think they ever bettered this song. The rolling sixth beloved of all forms of popular music, and equally overused in them all, gets a twist in the opening piano arpeggio. Verse lengths are permuted by variations in the end pause (and one minor deviation). An excursion into the subdominant bridge resolves almost too smoothly into the tonic return. And the ingenious, percussive meter of the final verse alone is worth the ticket price.

Saw them play this in Glasgow in 1973, when they supported Mott The Hoople. They got encored - while Ian Hunter, exiting stage right, had to beg for theirs! That tour was the very first, the last, the only time Queen played support for anyone.

Saturday, 4 June 2011

Simple Regex #3½: Balancing Groups

Undo Revisited

I've been taken to task for failing to provide a Regex-based solution to the problem in the previous article of this series, namely, removal of backtracked elements from a path. One correspondent even left a comment containing the missing Regex solution! Another complained that my I/O path specific .NET workaround would be zero help to someone just happening upon my article, maybe while researching some other aspect of the underlying patterns.

It's a fair cop, and the only way to salvage even a wee bit of reputation is to provide a full explanation of the proposed Regex solution. In mitigation, the proposed mechanism (Balancing Group Definitions, described below) is also unique to the .NET implementation of Regex. But hey, I'm only the piano player...

This is one of those curious cases where the attempt to generalise the problem reveals a simplifying hidden structure. The first underlying pattern mentioned above is the Undo (or sometimes the Delete) pattern. Consider the original example:

d:\project\client\forms\..\..\version.cs

The entire midsection "\client\forms\..\.." is redundant; we'd like to delete such missteps, obtaining a "normalised" answer such as

d:\project\version.cs

Consider the operating system's behaviour as it attempts to follow the original verbose path. It acts like a finite state machine. Each time it encounters a subdirectory name, like \client or \forms, it enters that subdirectory. Each time it encounters the parent token \.. it reverses back out of the most recently entered subdirectory.

But look what happens when we replace the phrases subdirectory name with letter, and parent token with backspace. Now the problem can be seen to be completely analogous with that of correcting a typed word.

A Typo Processor

Let's use the # symbol to represent the pressing of the backspace key. Then our new problem is to take an input comprising a sequence of letters and # symbols, and parsing it, recover the correctly typed word. For example, if I misspell yield as yeild but correct the mistake manually before completing the word, then the input string (the received sequence of keystrokes) might be yei##ield, and the required output is, of course, yield.
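As a reference point for the required behaviour, here is a minimal Python sketch that replays the keystrokes with an explicit stack and no Regex at all. The function name apply_backspaces is hypothetical, and the sketch assumes that a # arriving when nothing has been typed is simply ignored, which the problem statement above does not specify.

```python
def apply_backspaces(keystrokes: str) -> str:
    """Replay a keystroke sequence in which '#' stands for backspace."""
    typed = []  # stack of characters currently on the line
    for key in keystrokes:
        if key == "#":
            if typed:  # assumption: backspace on an empty line is a no-op
                typed.pop()
        else:
            typed.append(key)
    return "".join(typed)


print(apply_backspaces("yei##ield"))
```

Anything a Regex-based approach produces for well-formed input can be checked against this.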

The pattern that our Typo Detector has to detect is: a number (one or more) of # symbols, immediately preceded by exactly that same number of letters. This then is the pattern - in the example case, ei## - that must be removed, in order to, erm, yield the corrected word.

Following the naive reasoning of the original article, we might try using something like \w# (one "word" character, followed by a backspace token) to match the typo pattern. And just as before, this fails in the second simplest test case (the ei## of the previous paragraph). What we need is \w\w## to match that case. But then we also need \w\w\w### to detect the case where three consecutive mistyped characters are corrected. And so on. In fact what we need is approximated by the pseudopattern

\w{n}#{n}

where n cannot be specified in advance, but

is greater than zero, and
represents the same number in both positions.
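As an aside, and not the route this article goes on to take, one engine-neutral way round the unknown n is to apply the naive \w# pattern repeatedly until the string stops changing, since each pass removes the innermost letter-and-backspace pairs. A rough Python sketch, in which the names PAIR and collapse are hypothetical:

```python
import re

PAIR = re.compile(r"\w#")  # one word character followed by one backspace token


def collapse(keystrokes: str) -> str:
    """Delete letter-plus-backspace pairs repeatedly until none remain."""
    while True:
        reduced = PAIR.sub("", keystrokes)
        if reduced == keystrokes:
            return reduced
        keystrokes = reduced


print(PAIR.sub("", "yei##ield"))  # a single pass leaves one '#' behind
print(collapse("yei##ield"))
```

This needs a loop around the Regex, so it doesn't answer the challenge of finding each redundant segment with a single pattern.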

Actually, we want even more than that. What about cases where I make a mistake while correcting a previous typo? For example yei#i##ield. Ideally we'd also like permuted edit patterns like this one, \w\w#\w##, extracted as entire single units. Unbelievably, this functionality is readily provided by the Microsoft proprietary Regex extension known as...

The Balancing Group Definition

But first, let's revise the Group concept from the ground up. Groups are simply added to a Regex by surrounding any pattern subexpression in parentheses. For example, the pattern

0141(\d{3})\d{4}

not only matches any BT landline number in Glasgow, but also allows the central three-digit exchange code to be retrieved conveniently as a group for use elsewhere. These anonymous groups are retrieved by numerical index, which can quickly get a bit too fiddly for comfort.
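For anyone wanting to try this away from .NET, anonymous groups behave the same way in Python's re module. A minimal sketch, in which the sample number is made up and the variable names are hypothetical:

```python
import re

GLASGOW = re.compile(r"0141(\d{3})\d{4}")

match = GLASGOW.fullmatch("01415550123")  # made-up number for illustration
exchange = match.group(1)  # groups are numbered from 1, left to right
print(exchange)
```

Index 0 is reserved for the whole match, which is part of what makes numerical retrieval fiddly once a pattern grows several groups.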

The next level of usability is the named group. The prefix ? within a group introduces its name, which should be enclosed in either apostrophes or angle brackets. For example, the pattern

0141(?'exchange'\d{3})\d{4}

allows retrieval of the three-digit exchange group by name. And we are nearly there. Because it's an interesting implementation detail about groups, and in particular named groups, that they are pushed on to a stack as they are identified. This means that they can also be popped off. If we can push a group for each word character typed, pop one for each backspace token, and determine when the stack becomes empty, then we can detect all redundant groups in the input stream.
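Python's re module offers a rough point of comparison here, with two differences worth labelling: it spells the name prefix ?P<name>, and it keeps only the last capture of each group, not the stack of captures described above. So this sketch can illustrate retrieval by name but not the popping. The sample number is again made up.

```python
import re

named = re.compile(r"0141(?P<exchange>\d{3})\d{4}")
match = named.fullmatch("01415550123")  # made-up number for illustration
print(match.group("exchange"))

# A repeated group is overwritten on each iteration in Python,
# so only its final capture can be retrieved afterwards.
repeated = re.fullmatch(r"(?P<letter>\w)+", "abc")
print(repeated.group("letter"))
```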

The balancing group definition will delete ("pop") a previous group, and replace it ("push") with the interval between the previously defined group and the current group. Syntactically, we append the previous group name to the current one, using a hyphen for separation. So it looks like this:

(?'curr-prev' subexpression)

The current group name curr is optional, since nothing says that all groups have to be named. The previous group name prev is mandatory; we do need some way of referring to it, since we're about to pop it.

Here is the magic pattern:

\w((?>\w(?'N')|#(?'-N'))*(?(N)(?!)))#

Since it detects an erroneous word character which is subsequently deleted, it begins naturally enough with \w( and ends with )#. Hey this is easy! Now look at the first interior pattern, west of the asterisk:

(?>\w(?'N')|#(?'-N'))

This is saying a couple of things:

If there's another word character \w then push a new group, which we call N.
Otherwise if there's a backspace token # then pop the most recent group named N.

The asterisk allows this process to happen any number of times, including zero. This corresponds to typing some characters, deleting a few, typing some more, and so on. Penultimately, east of the asterisk and just prior to the final #, we find another expression:

(?(N)(?!))

This is a trick! If there are still undeleted word character groups on the stack, i.e. an N that hasn't yet been popped, this construction forces a match against the pattern (?!) about which all we need know is that it will fail. Well okay, it's called an expressionless negative lookahead, if you really must know everything! Otherwise so long as the final # appears, we have found a wholly redundant segment.
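Since balancing groups are not available outside .NET, the following Python sketch emulates by hand what the magic pattern does, with an integer standing in for the stack of N groups. It is one reading of the pattern's behaviour under the assumptions stated in the comments, not a substitute for running it in .NET; the names match_at and redundant_segments are hypothetical, and Python's \w may differ from .NET's in edge cases.

```python
import re

WORD = re.compile(r"\w")


def match_at(keys: str, start: int):
    """Return the end index of a redundant segment starting at `start`, or None."""
    if start >= len(keys) or not WORD.fullmatch(keys[start]):
        return None  # the pattern must open with \w
    depth = 0  # how many N groups are currently on the stack
    for pos in range(start + 1, len(keys)):
        char = keys[pos]
        if WORD.fullmatch(char):
            depth += 1  # \w(?'N') pushes a group
        elif char == "#":
            if depth == 0:
                return pos + 1  # nothing left to pop: this is the closing #
            depth -= 1  # #(?'-N') pops a group
        else:
            return None  # any other character ends the attempt
    return None  # input ran out before the closing #


def redundant_segments(keys: str):
    """Collect non-overlapping redundant segments, scanning left to right."""
    found, pos = [], 0
    while pos < len(keys):
        end = match_at(keys, pos)
        if end is None:
            pos += 1
        else:
            found.append(keys[pos:end])
            pos = end
    return found


print(redundant_segments("yei#i##ield"))
```

Deleting every segment this returns leaves the corrected word.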

Back To The Path

If in our magic pattern we now replace \w with a pattern for a subdirectory name, such as \\\w+, and if we further replace # with the parent directory pattern \\\.\. throughout, then we transform it into the path normalizer that was the subject of the original article. The details are horrendous, because filesystem subdirectory names can contain other than word characters, including dots. We have to exclude \.. from these, and \. requires even more special handling. Nevertheless we have solved the fundamental problem of applying Regex here, and as it so happens that all of my subdirectory names were well behaved and dotless in any case, this solution would have worked for me.
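The same push-and-pop idea can be sketched without Regex for the path case. This Python version is an illustration only, not the .NET workaround from the original article. The function name normalise is hypothetical, and the sketch assumes backslash separators and that the first element (the drive) must never be popped.

```python
def normalise(path: str) -> str:
    """Collapse parent-directory steps in a backslash-separated path."""
    kept = []  # stack of directory names entered so far
    for part in path.split("\\"):
        if part == ".." and len(kept) > 1:  # assumption: never pop the drive
            kept.pop()
        elif part != ".":  # a lone '.' means "stay here"
            kept.append(part)
    return "\\".join(kept)


print(normalise(r"d:\project\client\forms\..\..\version.cs"))
```

Because it compares whole path elements instead of matching characters, dotted subdirectory names need no special handling here. Python's standard ntpath.normpath performs a similar normalisation for Windows-style paths.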

Further Generalisation

The Undo/Delete pattern itself represents a simple subset of the problems that these Regex Grouping Constructs are capable of solving. Refer to the MSDN Regex topic on the applicability of Balancing Group Definitions for a complete guide to applying these constructs in any given recursively nested construct parsing context, such as matching parentheses, opening and closing brackets, quotation marks, mathematical expressions, or lines of program code containing multiple nested calls (most of which cannot be parsed by Regex implementations that lack balancing groups or an equivalent recursion feature).

Though be warned, the expository technical language used over there assumes at least some familiarity with the domain of abstract formal languages. Perhaps a better, more gentle introduction to balancing groups can be found in either of these articles:

Update (20 June 2011): I was going to share my .NET Regex Tester online with you, before I noticed that Derek Slager's is, in his own words, better... so, here's a link to his:

http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx

Good luck!

Wednesday, 1 June 2011

Tweets - May 2011

LibDems didn't just neuter their own ability to criticise FPTP and the adversarial system. They willingly corrupted it. http://bit.ly/mvAmEh
— John Kerr (@dogbiscuituk) May 5, 2011

In almost 20,000 tests to date I've always been found to wake up next day. Often after poisoning. Evidence so far is that I'll live forever.
— John Kerr (@dogbiscuituk) May 10, 2011

There’s no pitch-tuner or mixboard here. You know why? Because these folks are real musicians. Making music is what they do.
— John Kerr (@dogbiscuituk) May 10, 2011

Throw away 90% of everything, including the 10% that's left. #adviceforwriters
— John Kerr (@dogbiscuituk) May 11, 2011

Don't want a second preference? Fine, just put a "1" in your favourite box. But why on earth vote to prevent anyone else expressing theirs?
— John Kerr (@dogbiscuituk) May 15, 2011

The Wikipedia page "List of numbers" http://bit.ly/r47Wn begins "This list is incomplete; you can help by expanding it." http://xkcd.com/899
— John Kerr (@dogbiscuituk) May 16, 2011

Try entering an XKCD page address into a URL shortener. Makes it longer.
— John Kerr (@dogbiscuituk) May 16, 2011

SNOY "choice of games we're offering is good value"..."there's going to be a minority of people out there who have some of those games" WTF?
— John Kerr (@dogbiscuituk) May 18, 2011

Mark my words: console overheating problems with LA Noire (both PS3 and XBox) are being caused by excessive hard disk usage during gameplay.
— John Kerr (@dogbiscuituk) May 22, 2011

@johnmoe Then she should try an English Superinjunction
— John Kerr (@dogbiscuituk) May 23, 2011

Pat said: You know, a book is really a living thing. You are asking me to dissect it for you. Anytime you dissect a living thing, it dies.
— John Kerr (@dogbiscuituk) May 28, 2011

I was trying to sing "Kitten With a Loaded Gun", but it kept coming out "Chicken..." instead. Why was that? Clearly it was your fault.
— John Kerr (@dogbiscuituk) May 28, 2011

@stephenfry Are you sure it was Gaga's acoustic Poker Face and not Molly Lewis? Cos if it was Molly, there's something else you should know!
— John Kerr (@dogbiscuituk) May 28, 2011

Fed 12 from a single £8 leg of lamb. Well done my wife! Dishes can wait.
— John Kerr (@dogbiscuituk) May 29, 2011

Met a giant eraser walking down the street. I said, "Deleted to meet you!"
— John Kerr (@dogbiscuituk) May 29, 2011

Motion capture animations can be parameterised and replayed live. Great. LA Noire's MotionScan is only ever used for cutscenes. Why bother?
— John Kerr (@dogbiscuituk) May 30, 2011
