regular expression
A regular expression (regex) is a sequence of characters that defines a pattern for matching, extracting, and/or replacing text. Regular expressions are used in many IndieWeb implementations and libraries, such as autolinkers.
Why
Regular expressions can be useful (potentially even comprehensive) for parsing microsyntaxes, e.g. ISO dates.
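As a rough illustration of parsing a microsyntax, the sketch below handles only the YYYY-MM-DD calendar-date form of ISO dates. The names `ISO_DATE_RE` and `parse_iso_date` are made up for this example, and the other ISO 8601 forms (week dates, ordinal dates, times) are deliberately left out.

```python
import re
from datetime import date

# Hypothetical pattern for the YYYY-MM-DD form only; re.ASCII keeps \d to 0-9.
ISO_DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})", re.ASCII)


def parse_iso_date(text):
    """Return a datetime.date for a YYYY-MM-DD string, or None if it is invalid."""
    match = ISO_DATE_RE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # The regex only checks the shape; date() rejects values such as month 13.
        return date(year, month, day)
    except ValueError:
        return None
```

The regex checks the shape of the string and the `date` constructor checks the values, so an input such as `2009-13-01` is rejected even though it matches the pattern.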
Why not
Using regular expressions to parse more complex syntaxes (e.g. HTML) may lead to errors that are hard to diagnose, and can be a source of vulnerabilities.
- 2009-11 You can't parse (X)HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. […] http://stackoverflow.com/a/1732454/682648
- html regex
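One way to see the problem is to compare a naive pattern with a real parser on a small input. The sketch below is only an illustration: `NAIVE_HREF_RE`, `LinkCollector`, and the sample markup are invented for this example, and Python's standard `html.parser` stands in for whatever HTML parser a site actually uses.

```python
import re
from html.parser import HTMLParser

# Sample markup: one link inside a comment, one real link with a single-quoted href.
SAMPLE = "<!-- <a href=\"/old\">old</a> --><a href='/new'>new</a>"

# Hypothetical naive pattern: an <a ...> tag with a double-quoted href.
NAIVE_HREF_RE = re.compile(r'<a\s[^>]*href="([^"]*)"')


class LinkCollector(HTMLParser):
    """Collect href values from <a> start tags, ignoring comments."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def regex_hrefs(markup):
    return NAIVE_HREF_RE.findall(markup)


def parser_hrefs(markup):
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.hrefs
```

On `SAMPLE`, the regex reports the commented-out link and misses the real one, while the parser reports only the real one. Patching the pattern for comments and single quotes fixes this input but not the next edge case.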
How to
As with most coding, find a regular expression that works for your use case, and copy/paste it.
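For a sense of what such a snippet can look like once adapted, here is a minimal autolinker sketch. It is not taken from any IndieWeb library: `URL_RE` and `auto_link` are hypothetical names, the URL pattern is deliberately simple, and real autolinkers handle many more cases (bare domains, @-names, balanced parentheses).

```python
import html
import re

# Hypothetical, simplified pattern: http(s) URLs up to whitespace, quotes, or angle brackets.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def auto_link(text):
    """Escape text for HTML and wrap http(s) URLs in <a> elements."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:!?)")
        end = match.start() + len(url)
        parts.append(html.escape(text[last:match.start()]))
        escaped = html.escape(url, quote=True)
        parts.append(f'<a href="{escaped}">{escaped}</a>')
        last = end
    # Escape whatever follows the final URL (or the whole text if none matched).
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Whatever pattern is copied, it is worth running it against inputs from your own site, since the escaping and the trailing-punctuation handling above are exactly the kind of detail a pasted regex tends to omit.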
IndieWeb Examples
Tantek
Tantek Çelik uses regexes in his personal site's code, in his cassis.js library, for
- autolinking - see auto_link_re() function near line 1312
- extracting: ASIN
- detecting: typical post slug URL segment
Tools
- https://regexr.com/ - excellent regex validator and expression explainer (what each piece does)
- https://regex101.com/ - another regex explainer like RegExr above, this one allows you to write tests
- https://regexlearn.com/ - an interactive regex course.
- capjamesg learned a lot about regex through this course.
Variants
There are many different implementations of regular expressions. This means a regular expression that works on one platform may not be supported, or may behave differently, on another.
See https://en.wikipedia.org/wiki/Regular_expression#Syntax for documentation of various regular expression syntaxes and variants like PCRE, differences between Perl Compatible Regular Expressions and Perl, PHP Group documentation thereof, PCRE2 use in PHP 7.3, and POSIX.
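A small example of how the same pattern can mean different things: in Python 3, `\d` in a `str` pattern matches any Unicode decimal digit by default, and only 0-9 when the `re.ASCII` flag is set (in JavaScript, `\d` matches only 0-9). The helper name `matches_digits` below is made up for this sketch.

```python
import re

# The digits one, two, three in Arabic-Indic script.
ARABIC_INDIC = "\u0661\u0662\u0663"


def matches_digits(text, ascii_only=False):
    """Return True if the whole text matches \\d+ under the chosen mode."""
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(r"\d+", text, flags) is not None
```

With the default flags `matches_digits(ARABIC_INDIC)` is true, and with `ascii_only=True` it is false, so a pattern copied between platforms may accept different input than intended.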