
CL_ABAP_REGEX

Compile and apply regular expressions in ABAP.


Purpose

CL_ABAP_REGEX compiles a regular expression into a reusable regex object; its create_pcre factory (available from ABAP release 7.55) uses PCRE syntax. CL_ABAP_MATCHER applies that pattern to a string and exposes each match as a MATCH_RESULT structure (offset, length, submatches). Use these classes when the built-in FIND and REPLACE statements are not flexible enough, for example to read capture groups, iterate over all matches, or replace with back-references.

PCRE, not POSIX

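The create_pcre factory compiles Perl-Compatible Regular Expressions (PCRE). Syntax such as \d, \w, lookaheads (?=...), and non-greedy quantifiers (*?) all work. POSIX character classes are valid only inside a bracket expression, so write [[:alpha:]] rather than a bare [:alpha:]. The older POSIX regex dialect, used by the REGEX addition of FIND and REPLACE, is a separate syntax and is obsolete in recent releases.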
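As a minimal sketch of the PCRE features named above, the following snippet uses a lookahead and a bracketed POSIX class. It assumes a release that provides create_pcre and inline declarations; the sample text and variable names are illustrative only.

```abap
DATA(lv_text) = `Total 250 EUR, tax 40 USD`.

" Lookahead: digits that are followed by whitespace and 'EUR',
" without including the currency in the match itself
DATA(lo_amount)   = cl_abap_regex=>create_pcre( pattern = '\d+(?=\sEUR)' ).
DATA(lo_matcher1) = lo_amount->create_matcher( text = lv_text ).

IF lo_matcher1->find_next( ) = abap_true.
  " Submatch 0 is the complete match
  DATA(lv_amount) = lo_matcher1->get_submatch( 0 ).
  WRITE: / |Amount: { lv_amount }|.
ENDIF.

" POSIX class inside a bracket expression
DATA(lo_alpha)    = cl_abap_regex=>create_pcre( pattern = '[[:alpha:]]+' ).
DATA(lo_matcher2) = lo_alpha->create_matcher( text = lv_text ).

IF lo_matcher2->find_next( ) = abap_true.
  DATA(lv_word) = lo_matcher2->get_submatch( 0 ).
  WRITE: / |First word: { lv_word }|.
ENDIF.
```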


Key classes and methods

Class / Method                              Description
CL_ABAP_REGEX=>create_pcre( pattern )       Static factory: compiles a PCRE pattern, returns a CL_ABAP_REGEX instance
lo_regex->create_matcher( text = ... )      Returns a CL_ABAP_MATCHER bound to text
lo_matcher->match( )                        ABAP_TRUE if the entire string matches the pattern
lo_matcher->find_next( )                    Advances to the next match; returns ABAP_TRUE if one was found
lo_matcher->get_match( )                    Returns a MATCH_RESULT structure for the current match
lo_matcher->get_submatch( index )           Returns a capture group as a string; index 0 is the whole match
lo_matcher->replace_all( newtext = ... )    Replaces all matches; returns the number of replacements (the result is in lo_matcher->text)
ls_match-offset                             Zero-based start position of the match
ls_match-length                             Length of the matched substring

Examples

Test if a string matches a pattern (email validation)

DATA lo_regex   TYPE REF TO cl_abap_regex.
DATA lo_matcher TYPE REF TO cl_abap_matcher.
DATA lv_email   TYPE string VALUE 'user@example.com'.

lo_regex = cl_abap_regex=>create_pcre(
              pattern = '^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$' ).

lo_matcher = lo_regex->create_matcher( text = lv_email ).

IF lo_matcher->match( ) = abap_true.
  MESSAGE 'Valid email address' TYPE 'I'.
ELSE.
  MESSAGE 'Invalid email address' TYPE 'W'.
ENDIF.

Find all matches with find_next

DATA lo_regex   TYPE REF TO cl_abap_regex.
DATA lo_matcher TYPE REF TO cl_abap_matcher.
DATA ls_match   TYPE match_result.
DATA lv_text    TYPE string VALUE 'Order 1001 and order 1002 are pending'.
DATA lv_token   TYPE string.

lo_regex   = cl_abap_regex=>create_pcre( pattern = '\d+' ).
lo_matcher = lo_regex->create_matcher( text = lv_text ).

WHILE lo_matcher->find_next( ) = abap_true.
  " get_match( ) returns a MATCH_RESULT structure, not an object reference
  ls_match = lo_matcher->get_match( ).
  lv_token = substring( val = lv_text off = ls_match-offset len = ls_match-length ).
  WRITE: / |Found number: { lv_token }|.
ENDWHILE.
" Output:
"   Found number: 1001
"   Found number: 1002

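When all matches are needed at once, the matcher's find_all method can replace the loop over find_next. The snippet below is a sketch of that variant, reusing the sample text from the example above and assuming inline declarations are available.

```abap
DATA(lv_text)    = `Order 1001 and order 1002 are pending`.
DATA(lo_regex)   = cl_abap_regex=>create_pcre( pattern = '\d+' ).
DATA(lo_matcher) = lo_regex->create_matcher( text = lv_text ).

" find_all( ) returns a table of MATCH_RESULT structures, one per match
DATA(lt_matches) = lo_matcher->find_all( ).

LOOP AT lt_matches INTO DATA(ls_match).
  " Cut the matched text out of the source string
  DATA(lv_token) = substring( val = lv_text
                              off = ls_match-offset
                              len = ls_match-length ).
  WRITE: / |Found number: { lv_token }|.
ENDLOOP.
```
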
Replace all occurrences

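DATA lo_regex   TYPE REF TO cl_abap_regex.
DATA lo_matcher TYPE REF TO cl_abap_matcher.
DATA lv_input   TYPE string VALUE 'Hello   World   ABAP'.
DATA lv_result  TYPE string.
DATA lv_count   TYPE i.

lo_regex   = cl_abap_regex=>create_pcre( pattern = '\s+' ).
lo_matcher = lo_regex->create_matcher( text = lv_input ).

" replace_all( ) returns the number of replacements, not the new string.
" The backquote literal keeps the blank; a trailing blank in ' ' is ignored.
lv_count  = lo_matcher->replace_all( newtext = ` ` ).
" The modified text is available in the matcher's TEXT attribute
lv_result = lo_matcher->text.
" lv_result = 'Hello World ABAP', lv_count = 2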

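Replacement text can also refer to capture groups. The following sketch swaps a "Lastname, Firstname" pair; it assumes the $1/$2 placeholder syntax for back-references in PCRE replacements, and the sample name is illustrative.

```abap
" Two capture groups: last name and first name
DATA(lo_regex)   = cl_abap_regex=>create_pcre( pattern = '(\w+),\s*(\w+)' ).
DATA(lo_matcher) = lo_regex->create_matcher( text = `Doe, John` ).

" $1 and $2 refer to the first and second capture group
lo_matcher->replace_all( newtext = `$2 $1` ).

" The rewritten text is read from the matcher, expected here: John Doe
DATA(lv_result) = lo_matcher->text.
WRITE: / lv_result.
```
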
Simple alternative

For one-off checks or replacements, the built-in statements are shorter and avoid OO boilerplate. The PCRE addition requires release 7.55 or later; the older REGEX addition uses POSIX syntax and is obsolete.

DATA lv_text   TYPE string VALUE 'SAP123'.
DATA lv_result TYPE string.
DATA lv_offset TYPE i.
DATA lv_length TYPE i.

" Test match
FIND FIRST OCCURRENCE OF PCRE '\d+' IN lv_text
  MATCH OFFSET lv_offset
  MATCH LENGTH lv_length.
IF sy-subrc = 0.
  WRITE: / |Digits at offset { lv_offset }, length { lv_length }|.
ENDIF.

" Replace
lv_result = lv_text.
REPLACE ALL OCCURRENCES OF PCRE '\d+' IN lv_result WITH '###'.
" lv_result = 'SAP###'

When to use each approach

Use FIND/REPLACE with the PCRE addition (or REGEX on older releases) for simple one-shot tests or replacements. Use CL_ABAP_REGEX with CL_ABAP_MATCHER when you need to iterate over matches, extract capture groups, or reuse the compiled pattern across many strings.
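Capture groups are the case not covered by the examples above. A minimal sketch, assuming an ISO-style date embedded in free text (the sample string and variable names are illustrative):

```abap
" Three capture groups: year, month, day
DATA(lo_regex)   = cl_abap_regex=>create_pcre( pattern = '(\d{4})-(\d{2})-(\d{2})' ).
DATA(lo_matcher) = lo_regex->create_matcher( text = `Delivered on 2024-03-15` ).

IF lo_matcher->find_next( ) = abap_true.
  " get_submatch( n ) returns the text of the n-th capture group
  DATA(lv_year)  = lo_matcher->get_submatch( 1 ).
  DATA(lv_month) = lo_matcher->get_submatch( 2 ).
  DATA(lv_day)   = lo_matcher->get_submatch( 3 ).
  WRITE: / |Year { lv_year }, month { lv_month }, day { lv_day }|.
ENDIF.
```

get_submatch should only be called after a successful match or find_next, since there is no current match otherwise.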


Common pitfalls

  • Reuse CL_ABAP_REGEX objects: pattern compilation is comparatively expensive. Create the regex once (e.g., in the class constructor or a static attribute) and call create_matcher() for each new string.
" Bad: recompiles the pattern on every loop iteration
LOOP AT lt_strings INTO DATA(lv_str).
  DATA(lo_rx) = cl_abap_regex=>create_pcre( pattern = '\d+' ).
  ...
ENDLOOP.

" Good: compile once, then create a new matcher per string
DATA(lo_rx) = cl_abap_regex=>create_pcre( pattern = '\d+' ).
LOOP AT lt_strings INTO DATA(lv_str).
  DATA(lo_m) = lo_rx->create_matcher( text = lv_str ).
  ...
ENDLOOP.
  • Case sensitivity: patterns are case-sensitive by default. Prefix with (?i) for case-insensitive matching:
lo_regex = cl_abap_regex=>create_pcre( pattern = '(?i)hello' ).
  • Backslashes and literal types: in text field literals ('...') and string literals (`...`), \ is not an escape character, so patterns like \d are written as-is. In string templates (|...|), \ is an escape character and must be doubled (\\d), and {, } and | must be escaped as well. Prefer '...' or `...` for patterns; for a pattern that contains single quotes, use a backquote literal or double the quote ('').
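
The difference between the literal types can be sketched as follows. All three calls are intended to compile without error; the variable names are illustrative.

```abap
" Text field literal: the backslash is taken literally
DATA(lo_digits) = cl_abap_regex=>create_pcre( pattern = '\d+' ).

" Backquote string literal: convenient when the pattern contains single quotes
DATA(lo_quoted) = cl_abap_regex=>create_pcre( pattern = `'\w+'` ).

" String template: the backslash is an escape character and must be doubled,
" so this template yields the pattern \d+
DATA(lo_templ)  = cl_abap_regex=>create_pcre( pattern = |\\d+| ).
```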

See also

  • CL_ABAP_MATCHER: class documentation (SE24)
  • MATCH_RESULT: structure describing a single match (offset, length, submatches)
  • SAP Help: ABAP — Regular Expressions
