regular expression
A regular expression is a sequence of characters used to match, extract, and/or replace patterns in text.
Why
Regular expressions can be useful, and potentially even sufficient on their own, for parsing microsyntaxes, e.g. ISO dates.
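As a minimal sketch of the ISO date case, the Python snippet below matches only the calendar form YYYY-MM-DD. The names ISO_DATE and parse_iso_date are made up for this illustration, and the pattern is an assumption about which subset of ISO 8601 a site would care about: it ignores times, week dates and ordinal dates.

```python
import re

# Calendar dates only (YYYY-MM-DD); hypothetical pattern for illustration.
ISO_DATE = re.compile(
    r"(?P<year>[0-9]{4})-(?P<month>0[1-9]|1[0-2])-(?P<day>0[1-9]|[12][0-9]|3[01])"
)


def parse_iso_date(text):
    """Return (year, month, day) as ints, or None if text is not YYYY-MM-DD."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return None
    return int(match["year"]), int(match["month"]), int(match["day"])


print(parse_iso_date("2009-11-13"))  # (2009, 11, 13)
print(parse_iso_date("2009-13-01"))  # None, there is no month 13
```

The pattern checks the shape of the string, not the calendar: it still accepts 2009-02-31, so a real date library is needed if validity matters.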
Why not
Using regular expressions to parse more complex syntaxes (e.g. HTML) can lead to very weird errors, and can become a source of vulnerabilities.
- 2009-11 You can't parse (X)HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. […] http://stackoverflow.com/a/1732454/682648
- html regex
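To illustrate the kind of weird error meant here, the following sketch runs a naive paragraph-matching pattern and Python's standard html.parser over the same markup. The sample markup and the ParagraphText class are invented for this example; the pattern is a typical copy/paste attempt, not one taken from any particular site.

```python
import re
from html.parser import HTMLParser

html = '<p title="a > b">first</p><!-- <p>not real</p> --><p>second</p>'

# Naive attempt: everything between <p ...> and </p>.
naive = re.findall(r"<p[^>]*>(.*?)</p>", html)


class ParagraphText(HTMLParser):
    """Collects the text of each <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while outside a <p> element

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.paragraphs.append("".join(self._buffer))
            self._buffer = None


parser = ParagraphText()
parser.feed(html)
parser.close()

print(naive)              # [' b">first', 'not real', 'second']
print(parser.paragraphs)  # ['first', 'second']
```

The regular expression is tripped up by the ">" inside the attribute value and also returns the paragraph that sits inside a comment, while the parser handles both.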
How to
Like most coding, find a regular expression that works for your use-case, and copy/paste.
Flavours
There are many different implementations of regular expressions. This means a regular expression that works on one platform may not be supported on another.
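One rough way to see such a difference is to ask an engine whether it accepts a pattern at all. The sketch below uses Python's built-in re module as the "other platform"; is_supported, RECURSIVE_PARENS and FLAT_PARENS are names invented for this example. The first pattern relies on the (?R) recursion syntax available in PCRE, which Python's re module does not implement.

```python
import re


def is_supported(pattern):
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Balanced parentheses using PCRE-style recursion.
RECURSIVE_PARENS = r"\((?:[^()]|(?R))*\)"
# A single, non-nested pair of parentheses.
FLAT_PARENS = r"\([^()]*\)"

print(is_supported(RECURSIVE_PARENS))  # False
print(is_supported(FLAT_PARENS))       # True
```

A pattern that compiles can still behave differently between engines, so this only catches outright syntax incompatibilities.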
PCRE
PCRE, or Perl Compatible Regular Expressions, is an implementation that started in 1997 with the aim of bringing Perl’s regular expressions to other platforms.
Because of its widespread use, it has in turn introduced new syntax of its own. Some of these additions have since been brought back into the original Perl implementation.
Differences between PCRE and Perl are documented on Wikipedia and by The PHP Group.
PCRE2 was released at the start of 2015. PHP switched from PCRE to PCRE2 in PHP 7.3, which was released in December 2018.
POSIX
The POSIX standard defines regular expressions for use by operating systems in an interoperable manner. This flavour is implemented by CLI tools such as grep.
PHP used to support POSIX Extended regular expressions through the ereg set of functions. These functions were deprecated in PHP 5.3 in favour of the PCRE implementation.
The PHP Group documents some notable differences between POSIX and PCRE. These should help with getting started on converting between the two formats, though the list isn’t an exhaustive comparison.
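As a small illustration of what such a conversion can involve, the sketch below rewrites POSIX bracket classes such as [:digit:] into explicit ranges, so that a pattern written for a POSIX tool can be tried in an engine that lacks them. Python's re module stands in for that engine here. The POSIX_CLASSES table and the posix_classes_to_ranges function are invented for this example, the ranges assume ASCII text, and the plain string replacement assumes the classes only appear inside bracket expressions.

```python
import re

# Assumed ASCII equivalents for a few POSIX bracket classes.
POSIX_CLASSES = {
    "[:alnum:]": "a-zA-Z0-9",
    "[:alpha:]": "a-zA-Z",
    "[:digit:]": "0-9",
    "[:lower:]": "a-z",
    "[:upper:]": "A-Z",
    "[:space:]": r" \t\n\r\f\v",
}


def posix_classes_to_ranges(pattern):
    """Replace POSIX bracket classes in the pattern with explicit ranges."""
    for name, ranges in POSIX_CLASSES.items():
        pattern = pattern.replace(name, ranges)
    return pattern


identifier = posix_classes_to_ranges("[[:alpha:]_][[:alnum:]_]*")
print(identifier)  # [a-zA-Z_][a-zA-Z0-9_]*
print(bool(re.fullmatch(identifier, "h_card2")))  # True
print(bool(re.fullmatch(identifier, "2fast")))    # False
```

This covers only one syntactic difference. Other differences, such as escaping rules or how the longest match is chosen, are not handled by a textual rewrite like this.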
See Also
- https://regexr.com/ - excellent regex validator and expression explainer (what each piece does)
- https://regex101.com/ - another regex explainer like RegExr above, this one allows you to write tests
- https://en.wikipedia.org/wiki/Regular_expression