A regular expression is a sequence of characters used to match, extract, and/or replace patterns in text.
Regular expressions can be useful, and sometimes entirely sufficient, for parsing microsyntaxes such as ISO dates.
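As a minimal sketch of parsing a microsyntax, the Python snippet below matches a calendar date in the `YYYY-MM-DD` form. The names `ISO_DATE` and `parse_iso_date` are made up for illustration, and only this one ISO 8601 form is handled, not the full standard. The regex checks the shape, while `datetime.date` rejects impossible values such as month 13.

```python
import re
from datetime import date

# Shape of a YYYY-MM-DD calendar date (one ISO 8601 form among several).
ISO_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_iso_date(text):
    """Return a date for a YYYY-MM-DD string, or None if it does not parse."""
    match = ISO_DATE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        # The regex only checks the shape; date() checks the actual values.
        return date(year, month, day)
    except ValueError:
        return None


print(parse_iso_date("2009-11-13"))   # 2009-11-13
print(parse_iso_date("2009-13-01"))   # None (there is no month 13)
print(parse_iso_date("13 Nov 2009"))  # None (wrong shape)
```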
Why not? Using regular expressions to parse more complex syntaxes (e.g. HTML) can lead to subtle, hard-to-diagnose errors and may become a source of security vulnerabilities.
- 2009-11 You can't parse (X)HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. […] http://stackoverflow.com/a/1732454/682648
- html regex
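To illustrate the kind of error this can cause, here is a small Python sketch comparing a naive regex with the standard library's `html.parser`. The sample markup, the `NAIVE_HREF` pattern, and the `LinkCollector` class are assumptions made up for this example. The regex silently finds nothing because the `href` is not the first attribute and uses single quotes, both of which are valid HTML.

```python
import re
from html.parser import HTMLParser

# Valid HTML: href is the second attribute and uses single quotes.
HTML = "<p>See <a class=\"u-url\" href='https://example.com/'>example</a></p>"

# Naive pattern: assumes href comes first and is always double-quoted.
NAIVE_HREF = re.compile(r'<a href="([^"]*)">')


class LinkCollector(HTMLParser):
    """Collects href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def hrefs_with_regex(html):
    return NAIVE_HREF.findall(html)


def hrefs_with_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


print(hrefs_with_regex(HTML))   # []
print(hrefs_with_parser(HTML))  # ['https://example.com/']
```

Patching the regex for each such case tends to grow without end, which is the point of the Stack Overflow answer quoted above.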
As with most coding, find a regular expression that works for your use case, then copy and paste it.
There are many different implementations of regular expressions. This means a regular expression that works on one platform may not be supported on another.
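As one concrete case, the sketch below uses `\K`, an escape that Perl and PCRE support for resetting the start of the reported match, but which Python's built-in `re` module rejects as a bad escape. The helper name `compiles_in_python_re` is hypothetical. The lookbehind variant is only a rough equivalent that happens to work for this fixed-width prefix.

```python
import re

# \K is understood by Perl and PCRE, but not by Python's re module.
PCRE_ONLY = r"foo\Kbar"

# A fixed-width lookbehind gives a similar result here and works in both.
PORTABLE = r"(?<=foo)bar"


def compiles_in_python_re(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


print(compiles_in_python_re(PCRE_ONLY))  # False
print(compiles_in_python_re(PORTABLE))   # True
print(re.search(PORTABLE, "foobar").group())  # bar
```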
PCRE, or Perl Compatible Regular Expressions, is an implementation that started in 1997 with the aim of bringing Perl’s regular expressions to other platforms.
Because of its widespread use, PCRE has in turn introduced new syntax of its own, some of which has since been brought back into the original Perl implementation.
Differences between PCRE and Perl are documented on Wikipedia and by The PHP Group.
PCRE2 was released at the start of 2015. PHP switched from PCRE to PCRE2 in PHP 7.3, which was released in December 2018.
The POSIX standard defines regular expressions for use by operating systems in an interoperable manner. This flavour is implemented by CLI tools such as grep.
PHP used to support POSIX Extended regular expressions through the ereg set of functions. These were deprecated in PHP 5.3 in favour of the PCRE implementation.
The PHP Group documents some notable differences between POSIX and PCRE. These should help you get started on converting between the two formats, though the list is not an exhaustive comparison.
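One small piece of such a conversion can be sketched in Python: rewriting POSIX bracket classes like `[:digit:]` into explicit ASCII ranges for engines that lack them. The `POSIX_CLASSES` table and the `expand_posix_classes` helper are hypothetical, cover only a handful of classes, and assume the C/POSIX locale. PCRE itself accepts these classes inside brackets, so this is an illustration of the dialect gap rather than a required step when moving to PCRE.

```python
import re

# ASCII equivalents of a few POSIX bracket classes (C/POSIX locale assumed).
POSIX_CLASSES = {
    "[:digit:]": "0-9",
    "[:alpha:]": "A-Za-z",
    "[:alnum:]": "A-Za-z0-9",
    "[:upper:]": "A-Z",
    "[:lower:]": "a-z",
    "[:space:]": r" \t\n\r\f\v",
}


def expand_posix_classes(pattern):
    """Replace known POSIX class names with explicit character ranges."""
    for name, ranges in POSIX_CLASSES.items():
        pattern = pattern.replace(name, ranges)
    return pattern


converted = expand_posix_classes("^[[:alpha:]]+-[[:digit:]]{4}$")
print(converted)  # ^[A-Za-z]+-[0-9]{4}$
print(re.search(converted, "abc-2024") is not None)  # True
```

A plain string replacement like this does not understand escaping or nesting, so the output should be checked by hand or with one of the validators listed below.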
- https://regexr.com/ - excellent regex validator and expression explainer (what each piece does)
- https://regex101.com/ - another regex explainer like RegExr above, this one allows you to write tests