A regular expression is a sequence of characters used to match, extract, and/or replace patterns in text.
Regular expressions can be useful (potentially even comprehensive) for parsing microsyntaxes, e.g. ISO dates.
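As a minimal sketch of parsing such a microsyntax, the following Python example handles only the `YYYY-MM-DD` calendar form of an ISO date. The pattern and the `parse_iso_date` helper are illustrative assumptions, not part of any standard API; the regular expression checks the shape, and `datetime.date` then rejects impossible values.

```python
import re
from datetime import date

# Illustrative pattern: covers only the YYYY-MM-DD calendar-date form.
ISO_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_iso_date(text):
    """Return a date for a YYYY-MM-DD string, or None if it is not one."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # The regex only checks the shape; date() rejects month 13, 30 February, etc.
        return date(year, month, day)
    except ValueError:
        return None
```

Other ISO forms (week dates, ordinal dates, times and offsets) would each need their own pattern, or a dedicated date library.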
Why not? Using regular expressions to parse more complex syntaxes (e.g. HTML) can lead to hard-to-diagnose errors and is a potential source of vulnerabilities.
- 2009-11 You can't parse (X)HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. […] http://stackoverflow.com/a/1732454/682648
- html regex
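One way to see the kind of error this causes is the hedged Python sketch below. The naive `DIV` pattern and the `DivText` class are hypothetical names made up for illustration. The pattern stops at the first closing tag when elements are nested, while the standard library's `html.parser` tracks nesting depth and recovers all of the text.

```python
import re
from html.parser import HTMLParser

html = "<div>outer <div>inner</div> tail</div>"

# Naive attempt: "everything between <div> and </div>".
DIV = re.compile(r"<div>(.*?)</div>", re.DOTALL)
naive = DIV.search(html).group(1)  # stops at the first </div>


class DivText(HTMLParser):
    """Collect the text found inside any <div>, tracking nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


parser = DivText()
parser.feed(html)
parser.close()
text = "".join(parser.chunks)
```

Here `naive` ends up as `outer <div>inner`, cutting the outer element short, while `text` is `outer inner tail`.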
This section is a stub.
As with most coding, find a regular expression that works for your use case, then copy and paste it.
There are many different implementations of regular expressions. This means a regular expression that works on one platform may not be supported on another.
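A small, hedged illustration of such a difference uses Python's `re` module. It spells named groups as `(?P<name>…)` and rejects the `(?<name>…)` spelling accepted by some other engines, such as JavaScript's. The `compiles` helper is a hypothetical name introduced only for this sketch.

```python
import re

# Named group in the spelling that Python's re module accepts.
python_style = re.compile(r"(?P<year>[0-9]{4})")


def compiles(pattern):
    """Report whether Python's re module accepts the given pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


accepted = compiles(r"(?P<year>[0-9]{4})")  # Python spelling
rejected = compiles(r"(?<year>[0-9]{4})")   # spelling used by e.g. JavaScript
```

Checking that a copied pattern compiles on the target platform catches syntax differences like this one, though it does not catch patterns that compile everywhere but match differently.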
Because of its widespread use, PCRE (Perl Compatible Regular Expressions) has in turn introduced new syntax. Some of these additions have since been brought back into the original Perl implementation.
PHP used to support POSIX Extended regular expressions through the ereg set of functions. These were deprecated in PHP 5.3 in favour of the PCRE implementation.
The PHP Group documents some notable differences between POSIX and PCRE. These should help with getting started on converting between the two formats, though the list isn't an exhaustive comparison.