Regular Expressions (RegEx)

Keyboard Maestro uses ICU Regular Expressions (aka RegEx or RegExp), which are very similar to PCRE (Perl Compatible Regular Expressions). You can read the ICU documentation by choosing ICU Regular Expression Reference from the Help menu in Keyboard Maestro.

In any Action where the term “matches” is used, you may (optionally) use a regular expression.

The two most commonly used RegEx Actions are:

In all of the RegEx searches, you can choose to search in any one of the following: System Clipboard, Variable, File, Text entered into the Action, and Named Clipboard.

There are many Actions in Keyboard Maestro where you can use a regular expression, which may not be obvious. For more info see Places where Regular Expressions can be used.

Search Modifiers

The ICU calls these modifiers “flag options”.

The search modifier “Pattern to Use” shown below is placed at the very beginning of the Search/Find Regular Expression box.
For example:
(?m)^\s*\d+[\t]+
as shown in Example #1 below.

Purpose | Pattern to Use | Description
Global | NONE | The Search using Regular Expression Action returns only the first match it finds in the source string. To make it a “global” search, put this Action in a For Each Action that uses the [ Substrings Matching in ] Collection, which loops through all matches found in the source string. Other tools often indicate this with the /g modifier.
Case Insensitive | (?i) | Matching is case-insensitive. In some Actions this is NOT necessary, since the Action already provides an “ignoring case” option.
Dot includes EOL | (?s) | A “.” in a pattern will match a line terminator in the input text. Note that a CR LF pair in text behaves as a single line terminator, and will match a single “.” in a RE pattern.
Multi-Line | (?m) | Controls the behavior of “^” and “$” in a pattern. If used, “^” and “$” will also match at the start and end of each line within the input text.
Word Boundaries | (?w) | Controls the behavior of \b in a pattern. If used, word boundaries are found according to the definitions of word found in Unicode UAX 29, Text Boundaries.
Comments | (?x) | If used, white space within regular expressions is ignored and you can use #line-comments.
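
As a rough illustration of what these modifiers do, here is a minimal sketch in Python. Note the assumption: Python's re module is not the ICU engine that Keyboard Maestro uses. It happens to accept the same (?i), (?m), (?s) and (?x) prefixes, but it has no (?w) modifier, so that one is not shown. The sample text and variable names are made up for the example.

```python
import re

# Hypothetical sample text with three lines.
text = "Alpha\nbeta\nGAMMA"

# (?i): case-insensitive matching.
case_insensitive = re.findall(r"(?i)gamma", text)

# (?m): "^" also matches at the start of each line.
line_starts = re.findall(r"(?m)^\w", text)

# (?s): "." also matches a line terminator.
without_s = re.search(r"Alpha.beta", text)      # no match: "." stops at "\n"
with_s = re.search(r"(?s)Alpha.beta", text)     # matches across the line break

# (?x): white space in the pattern is ignored and #comments are allowed.
verbose = re.search(
    r"""(?x)
    (\w+)   # first word
    \n      # line break
    (\w+)   # second word
    """,
    text,
)

print(case_insensitive, line_starts, without_s, with_s.group(0), verbose.groups())
```

In each case the modifier sits at the very beginning of the pattern, just as it would in the Search/Find Regular Expression box.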

Capture Groups

  • Use the form of $<CG#>, or \<CG#> (in ver 8+) in the Action Replace box, where <CG#> is the number for the capture group.
    • Examples:
      • $1, $2, $23
      • ${1}, ${2}, ${23}
      • ${name} - named capture group (v9.2+, 10.13+)
      • \1, \2, \3 - single digit only (v8.0+)
    • This is the same as \<CG#> used in other apps/languages, like BBEdit.
    • The zeroth capture group (eg $0) is the entire match.

    For more information, see Capture Groups.
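
To show the idea of capture group references in a replacement, here is a small sketch in Python. It is an analogue only, since Python's re module uses a different syntax from Keyboard Maestro: named groups are written (?P<name>...), and the replacement uses \1 or \g<name> where Keyboard Maestro uses $1 or ${name}. The pattern and the sample text are hypothetical.

```python
import re

# Hypothetical pattern: "Last, First" with two named capture groups.
pattern = r"(?P<last>\w+),\s*(?P<first>\w+)"
text = "Doe, Jane"

# Numbered references (Keyboard Maestro: $2 $1 or \2 \1).
by_number = re.sub(pattern, r"\2 \1", text)

# Named references (Keyboard Maestro: ${first} ${last}).
by_name = re.sub(pattern, r"\g<first> \g<last>", text)

# The zeroth group is the entire match (Keyboard Maestro: $0).
whole_match = re.sub(pattern, r"[\g<0>]", text)

print(by_number)    # Jane Doe
print(by_name)      # Jane Doe
print(whole_match)  # [Doe, Jane]
```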

Use of Text Tokens

You can use Keyboard Maestro Text Tokens anywhere appropriate in both the Search pattern and the Replace pattern.

ICU 55+ Metacharacters

For the currently complete list, see ICU Regular Expression Metacharacters.

ICU 55 was released on 2015-04-01, and is available in the following software versions:

  • macOS 10.11+ (El Capitan) except for JavaScript
  • Keyboard Maestro Native Actions

These Metacharacters, such as \h and \R, offer powerful solutions to long-standing RegEx problems. They can replace the complicated RegEx patterns previously required, and are recommended for use if you are running the required versions. ICU 55 is not available in all RegEx engines; a notable exception is JavaScript (and JXA), even in High Sierra. So the Metacharacters are available in the native Keyboard Maestro Actions that use RegEx, but not necessarily in Execute Script Actions that use another language. They are available in Execute AppleScript Actions that use ASObjC RegEx.

New Metacharacters in ICU 55+

Character | Alternate Expression (Pre ICU 55) | Description
\h | [^\S\r\n\f] | Match a Horizontal White Space character. They are characters with Unicode General Category of Space_Separator plus the ASCII tab (\u0009).
\H | [\S\r\n\f] | Match a non-Horizontal White Space character.
\k<name> | No Alternative | Named Capture Back Reference.
\R | (?:\r?\n|\r) | Match a new line character, or the sequence CR LF. The new line characters are \u000a, \u000b, \u000c, \u000d, \u0085, \u2028, \u2029.
\v | [\n\r] | Match a new line character. The new line characters are \u000a, \u000b, \u000c, \u000d, \u0085, \u2028, \u2029. Does not match the new line sequence CR LF.
\V | [^\n\r] | Match a non-new line character.

For an in-depth discussion, see this Forum topic: RegEx for Horizontal Whitespace.
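
The alternate expressions in the table are useful in engines that lack the ICU 55 metacharacters. As a sketch, Python's re module (an assumption for illustration; it is not ICU and does not support \h or \R) can use the alternates for \h and \R directly. The function names here are hypothetical.

```python
import re

# Alternates taken from the table above.
HORIZONTAL_WS = r"[^\S\r\n\f]"   # stands in for \h
LINE_BREAK = r"(?:\r?\n|\r)"     # stands in for \R

def collapse_horizontal_whitespace(text):
    """Replace each run of horizontal white space with one space, keeping line breaks."""
    return re.sub(HORIZONTAL_WS + "+", " ", text)

def split_lines(text):
    """Split on LF, CR, or the CR LF pair, treating CR LF as one break."""
    return re.split(LINE_BREAK, text)

print(repr(collapse_horizontal_whitespace("a \t b\n  c")))  # 'a b\n c'
print(split_lines("a\r\nb\nc\rd"))                          # ['a', 'b', 'c', 'd']
```

The alternates are approximations: for example, the \R alternate covers only CR, LF and CR LF, not the other new line characters listed in the table.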

Examples

Example #1: Remove all line numbers from a string with multiple lines
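
The effect of the pattern (?m)^\s*\d+[\t]+ from the Search Modifiers section can be sketched outside Keyboard Maestro. The following Python snippet is only an analogue of a Search and Replace with an empty replacement (Python's re module is not ICU, although this particular pattern behaves the same way in both); the function name and sample text are hypothetical.

```python
import re

# Multi-line mode, optional leading white space, the line number, then one or more tabs.
LINE_NUMBER = r"(?m)^\s*\d+[\t]+"

def remove_line_numbers(text):
    """Delete the leading line number and its trailing tab(s) from every line."""
    return re.sub(LINE_NUMBER, "", text)

numbered = "1\tfirst line\n2\tsecond line\n 3\t\tthird line"
print(remove_line_numbers(numbered))
# first line
# second line
# third line
```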

Example #2: Extract Capture Group for Multiple Matches to Multiple Lines in a Variable

Sometimes you may need to extract a RegEx Capture Group after a match is made, and do this for multiple lines (matches) in the entire source string. To achieve this, you need to:

  1. Use the For Each Action to get each match into a variable (MatchString)
  2. Then use a Search Variable Action to get the Capture Group for that match

Both Actions can use the same RegEx pattern.
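
The two steps can be sketched in Python as follows. This is an analogue under stated assumptions, not the Keyboard Maestro macro itself: re.finditer stands in for the For Each Action with the Substrings Matching collection, re.search stands in for the Search Variable Action, and the pattern, function name and sample text are hypothetical.

```python
import re

# Hypothetical pattern: capture the word following "Name:" at the start of a line.
PATTERN = r"(?m)^Name:\s*(\w+)"

def capture_group_per_match(source, pattern=PATTERN):
    """Return capture group 1 of every match, one per line."""
    results = []
    # Step 1: loop over every match and keep the whole matched substring.
    for match in re.finditer(pattern, source):
        match_string = match.group(0)
        # Step 2: search that substring with the same pattern to get the capture group.
        inner = re.search(pattern, match_string)
        results.append(inner.group(1))
    return "\n".join(results)

source = "Name: Alice\nAge: 30\nName: Bob"
print(capture_group_per_match(source))
# Alice
# Bob
```

In Python the second search is redundant because each match object already carries its groups; it is kept here only to mirror the two-Action structure described above.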


For a complete macro using this method, see:
MACRO: Get List of RegEx Capture Group of Multiple Matches

See also:

Software

  • BBEdit – A Programming Editor with PCRE regular expression support.
    • The commercial demo expires in 30 days and reverts to the still very powerful BBEdit-Lite (freeware).

  • Patterns – A regular expression analyzer available on the app-store.
  • RegExRX – A regular expression analyzer available on the app-store.

  • RegEx101.com – An Outstanding and Comprehensive Online RegEx Analyzer.
  • Regexr – Regex testing and explaining, examples and references.

Keywords: Regular Expression, RegEx, RegExp, Find, Replace, Match

Regular_Expressions.txt · Last modified: 2022/11/10 03:29 by ccstone