
I have the following error :

Warning: preg_replace(): Unknown modifier ']' in xxx.php on line 38

This is the code on line 38 :

<?php echo str_replace("</ul></div>", "", preg_replace("<div[^>]*><ul[^>]*>", "", wp_nav_menu(array('theme_location' => 'nav', 'echo' => false)) )); ?>

Can someone please help me to fix this problem?

Add delimiters around the pattern: "/<div[^>]*><ul[^>]*>/" – raina77ow Dec 20 '13 at 14:07
@mario I don't really see why you put a bounty here? Are you really looking for new answers here? If yes what's wrong with the current one? – Rizier123 Jul 2 '15 at 9:36
@Rizier123 The bounty description says it all: "One or more of the answers is exemplary and worthy of an additional bounty." – birgire Jul 2 '15 at 10:16
Yes, this isn't meant to attract more answers. The existing one is a pretty excellent example already. It's a great visual explanation, and likely applicable to many similar cases. And such mini bounties are mainly intended as a temporary public bookmark - to make it better known. And perhaps establish this as another universal reference. (Though it could make sense to craft an artificial CW answer with extra examples + links afterwards…) – mario Jul 2 '15 at 19:33
@mario If this should get an artificial answer, shouldn't we change the example a bit? I mean the OP is parsing HTML with regexes. I'm with you that the answer shows a lot of effort (and I like him and his posts) but I'm asking: Is this necessary? I mean a short "You need to enclose your regex with a delimiter" and a link to the (very good!) documentation would have been enough. Isn't it? IMHO all that extra information is going in the wrong direction and may confuse (expected newbie) users more than it helps. – hek2mgl Jul 2 '15 at 21:49

Why the error occurs

In PHP, a regular expression needs to be enclosed within a pair of delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character; /, #, ~ are the most commonly used ones. Note that it is also possible to use bracket style delimiters where the opening and closing brackets are the starting and ending delimiter, i.e. <pattern_goes_here>, [pattern_goes_here] etc. are all valid.

The "Unknown modifier X" error usually occurs in the following two cases:

  • When your regular expression is missing delimiters.

  • When you use the delimiter inside the pattern without escaping it.

In this case, the regular expression is <div[^>]*><ul[^>]*>. The regex engine considers everything from < to > as the regex pattern, and everything afterwards as modifiers.

Regex: <div[^>  ]*><ul[^>]*>
       │     │  │          │
       └──┬──┘  └────┬─────┘
       pattern    modifiers

] here is an unknown modifier, because it appears after the closing > delimiter, which is why PHP throws that error.

Depending on the pattern, the unknown modifier complaint might just as well have been about *, +, p, / or ), or almost any other letter or symbol. Only the letters in imsxeADSUXJu are accepted as PCRE modifiers (and e, deprecated in PHP 5.5, was removed in PHP 7).
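
The failure can be reproduced outside WordPress. The following is a minimal sketch, assuming a hard-coded $html string as a hypothetical stand-in for the wp_nav_menu() output. It collects the warning text through a temporary error handler instead of letting PHP print it:

```php
<?php
// Hypothetical stand-in for the markup returned by wp_nav_menu().
$html = '<div class="nav"><ul id="menu"><li>Home</li></ul></div>';

// Collect warning messages instead of letting PHP print them.
$messages = [];
set_error_handler(function (int $errno, string $errstr) use (&$messages): bool {
    $messages[] = $errstr;
    return true; // mark the warning as handled
});

// No explicit delimiters: "<" and the first ">" are taken as the delimiter pair,
// so everything after that first ">" is read as modifiers.
$result = preg_replace("<div[^>]*><ul[^>]*>", "", $html);

restore_error_handler();

var_dump($result);   // NULL, because the pattern could not be compiled
print_r($messages);  // contains the "Unknown modifier ']'" warning text
```

Note that preg_replace() returns NULL on such an error, so the surrounding str_replace() call in the question ends up working on an empty value.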

How to fix it

The fix is easy. Just wrap your regex pattern with any valid delimiters. In this case, you could choose ~ and get the following:

~<div[^>]*><ul[^>]*>~
│                   │
│                   └─ ending delimiter
└───────────────────── starting delimiter
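
Applied to the code from the question, the fix might look like the sketch below. The $menu string is a hypothetical stand-in for wp_nav_menu(array('theme_location' => 'nav', 'echo' => false)), so that the snippet runs without WordPress:

```php
<?php
// Hypothetical stand-in for the output of wp_nav_menu(... 'echo' => false).
$menu = '<div class="nav"><ul id="menu"><li>Home</li></ul></div>';

// Same pattern as in the question, now wrapped in ~ delimiters.
$inner = preg_replace('~<div[^>]*><ul[^>]*>~', '', $menu);

// Strip the closing tags exactly as the original line did.
$inner = str_replace('</ul></div>', '', $inner);

echo $inner; // <li>Home</li>
```

Splitting the nested call into two statements is only for readability; the one-liner from the question works the same way once the delimiters are added.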

If you're receiving this error despite having used a delimiter, it might be because the pattern itself contains unescaped occurrences of the said delimiter.

Or escape delimiters

/foo[^/]+bar/i would certainly throw an error, because the / inside the character class is taken as the ending delimiter. You can escape it with a \ backslash wherever it appears within the regex:

/foo[^\/]+bar/i
│      │     │
└──────┼─────┴─ actual delimiters
       └─────── escaped slash(/) character

This is a tedious job if your regex pattern contains many occurrences of the delimiter character.

The cleaner way, of course, would be to use a different delimiter altogether, ideally a character that does not appear anywhere inside the regex pattern, for example #foo[^/]+bar#i.
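
Both variants behave identically. A small sketch, with $subject as an invented sample string:

```php
<?php
$subject = 'FOO-123-bar'; // invented sample input

// Variant 1: keep / as delimiter and escape the slash inside the pattern.
$escaped = preg_match('/foo[^\/]+bar/i', $subject);

// Variant 2: switch to # as delimiter, so the slash needs no escaping.
$hash = preg_match('#foo[^/]+bar#i', $subject);

var_dump($escaped, $hash); // int(1) and int(1)
```

Single quotes are used for the PHP strings so that the backslash reaches the regex engine unchanged.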

    
I noticed that the same occurs when one of the delimiters is inside a preg_quote(), thus something like preg_replace('/'.preg_quote('/').'/i','',$string); gives the same error of the topic. Shouldn't the slash get escaped by preg_quote()? – TechNyquist Apr 29 at 9:47

Other examples

The reference answer already explains the reason for "Unknown modifier" warnings. This is just a comparison of other typical variants.

  • When forgetting to add regex /delimiters/, the first non-letter symbol will be assumed to be one. Therefore the warning is often about what follows a grouping (…), […] meta symbol:

    preg_match("[a-zA-Z]+:\s*.$"
                ↑      ↑⬆
    
  • Sometimes your regex already uses a custom delimiter (: here), but still contains the same character as an unescaped literal. It's then mistaken as a premature delimiter. Which is why the very next symbol receives the "Unknown modifier ❌" trophy:

    preg_match(":\[[\d:/]+\]:"
                ↑     ⬆     ↑
    
  • When using the classic / delimiter, take care to not have it within the regex literally. This most frequently happens when trying to match unescaped filenames:

    preg_match("/pathname/filename/i"
                ↑        ⬆         ↑
    

    Or when matching angle/square bracket style tags:

    preg_match("/<%tmpl:id>(.*)</%tmpl:id>/Ui"
                ↑               ⬆         ↑
    
  • Templating-style (Smarty or BBCode) regex patterns often require {…} or […] brackets. Both should usually be escaped. (An outermost {} pair being the exception though).

    They also get misinterpreted as paired delimiters when no actual delimiter is used. If they're then also used as literal character within, then that's, of course … an error.

    preg_match("{bold[^}]+}"
                ↑      ⬆  ↑
    
  • Whenever the warning says "Delimiter must not be alphanumeric or backslash" then you also entirely forgot delimiters:

    preg_match("ab?c*"
                ↑
    
  • "Unknown modifier 'g'" often indicates a regex that was copied verbatim from JavaScript or Perl.

    preg_match("/abc+/g"
                      ⬆
    

    PHP doesn't use the /g global flag. Instead the preg_replace function works on all occurrences, and preg_match_all is the "global" searching counterpart to the single-occurrence preg_match.

    So, just remove the /g flag.

    See also:
    · Warning: preg_replace(): Unknown modifier 'g'
    · preg_replace: bad regex == 'Unknown Modifier'?

  • A more peculiar case pertains to the PCRE_EXTENDED /x flag. This is often (or should be) used for making regexps more spacious and readable.

    It allows inline # comments. PHP implements the regex delimiters atop PCRE, but it doesn't treat # in any special way. Which is how a literal delimiter in a # comment can become an error:

    preg_match("/
       ab?c+  # Comment with / slash in between
    /x"
    

    (Also noteworthy that using # as #abc+#x delimiter can be doubly inadvisable.)

  • Interpolating variables into a regex requires them to be pre-escaped, or be valid regexps themselves. You can't tell beforehand if this is gonna work:

     preg_match("/id=$var;/"
                 ↑    ↺   ↑
    

    It's best to apply $var = preg_quote($var, "/") in such cases.

    See also:
    · Unknown modifier '/' in ...? what is it?

    Another alternative is using \Q…\E escapes for unquoted literal strings:

     preg_match("/id=\Q{$var}\E;/mix");
    

    Note that this is merely a convenient shortcut, not dependable/safe. It would fall apart if $var contained a literal '\E' itself (however unlikely), and it does not protect against $var containing the delimiter either.

  • Deprecated modifier /e is an entirely different problem. This has nothing to do with delimiters, but the implicit expression interpretation mode being phased out. See also: Replace deprecated preg_replace /e with preg_replace_callback
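
Two of the cases above, the /g flag and variable interpolation, can be sketched as follows. The values of $var and the subject strings are invented for illustration:

```php
<?php
// Instead of a /g flag: preg_match_all() finds every occurrence.
$count = preg_match_all('/abc+/', 'abc abcc abccc', $matches);
// $count is 3 and $matches[0] holds 'abc', 'abcc', 'abccc'

// Interpolating a value that contains the delimiter and a regex meta character.
$var = 'a/b.c'; // hypothetical input

// Passing the delimiter as second argument makes preg_quote() escape it too.
$quoted = preg_quote($var, '/'); // a\/b\.c
$found  = preg_match('/id=' . $quoted . ';/', 'id=a/b.c;');

// Without the second argument the slash stays unescaped, because / is not
// a regex meta character by itself.
$unsafe = preg_quote($var); // a/b\.c
```

The last line also covers the situation from the comment under the first answer: preg_quote('/') returns the slash unchanged unless '/' is passed as the delimiter argument.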

Alternative regex delimiters

As mentioned already, the quickest solution to this error is just picking a distinct delimiter. Any symbol that is not alphanumeric, a backslash, or whitespace can be used. Visually distinctive ones are often preferred:

  • ~abc+~
  • !abc+!
  • @abc+@
  • #abc+#
  • =abc+=
  • %abc+%

Technically you could use $abc$ or |abc| for delimiters. However, it's best to avoid symbols that serve as regex meta characters themselves.

The hash # as delimiter is rather popular too. But care should be taken in combination with the x/PCRE_EXTENDED readability modifier. You can't use # inline or (?#…) comments then, because those would be confused as delimiters.

Quote-only delimiters

Occasionally you see " and ' used as regex delimiters, paired with their counterpart as PHP string enclosure:

  preg_match("'abc+'"
  preg_match('"abc+"'

Which is perfectly valid as far as PHP is concerned. It's sometimes convenient and unobtrusive, but not always legible in IDEs and editors.

Paired delimiters

An interesting variation is paired delimiters. Instead of using the same symbol on both ends of a regex, you can use any <...> (...) [...] {...} bracket/braces combination.

  preg_match("(abc+)"   # just delimiters here, not a capture group

While most of them also serve as regex meta characters, you can often use them without further effort. As long as those specific braces/parens within the regex are paired or escaped correctly, these variants are quite readable.
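
A short sketch of paired delimiters that also appear inside the pattern. The subjects are invented sample strings:

```php
<?php
// The outer ( ) are delimiters; the inner (b+) is a real capture group.
// Because the inner parentheses are balanced, PHP finds the correct end.
preg_match('(a(b+)c)', 'xabbc', $m);
// $m[0] is 'abbc', $m[1] is 'bb'

// Curly braces as delimiters, with a quantifier {2} inside the pattern
// and the i modifier after the closing brace.
$ok = preg_match('{^[a-f]{2}$}i', 'AB');

var_dump($m, $ok);
```

An unbalanced brace, as in the {bold[^}]+} example above, has to be escaped or the pattern needs a different delimiter.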

Fancy regex delimiters

A somewhat lazy trick (which is not endorsed hereby) is using non-printable ASCII characters as delimiters. This works easily in PHP by using double quotes for the regex string, and octal escapes for delimiters:

 preg_match("\001 abc+ \001mix"

The \001 is just a control character that's not usually needed. Therefore it's highly unlikely to appear within most regex patterns. Which makes it suitable here, even though not very legible.

Sadly you can't use Unicode glyphs as delimiters. PHP only allows single-byte characters. And why is that? Well, glad you asked:

PHP's delimiters atop PCRE

The preg_* functions utilize the PCRE regex engine, which itself doesn't care or provide for delimiters. For resemblance with Perl the preg_* functions implement them. Which is also why you can use modifier letters /ism instead of just constants as parameter.

See ext/pcre/php_pcre.c on how the regex string is preprocessed:

  • First all leading whitespace is ignored.

  • Any non-alphanumeric symbol is taken as presumed delimiter. Note that PHP only honors single-byte characters:

    delimiter = *p++;
    if (isalnum((int)*(unsigned char *)&delimiter) || delimiter == '\\') {
            php_error_docref(NULL,E_WARNING, "Delimiter must not…");
            return NULL;
    }
    
  • The rest of the regex string is traversed left-to-right. Backslash-escaped characters are skipped, so an escaped delimiter does not end the pattern.

  • Should the delimiter be found again, the remainder is verified to only contain modifier letters.

  • If the delimiter is one of the pairable braces/brackets ([{< (closed by )]}> respectively), then the processing logic is more elaborate.

    int brackets = 1;   /* brackets nesting level */
    while (*pp != 0) {
            if (*pp == '\\' && pp[1] != 0) pp++;
            else if (*pp == end_delimiter && --brackets <= 0)
                    break;
            else if (*pp == start_delimiter)
                    brackets++;
            pp++;
    }
    

    It looks for correctly paired left and right delimiter, but ignores other braces/bracket types when counting.

  • The raw regex string is passed to the PCRE backend only after delimiter and modifier flags have been cut out.

Now this is all somewhat irrelevant. But explains where the delimiter warnings come from. And this whole procedure is all to have a minimum of Perl compatibility. There are a few minor deviations of course, like the […] character class context not receiving special treatment in PHP.
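
To make the steps above tangible, here is a rough PHP re-implementation of the splitting logic. This is only a sketch: split_regex() is a hypothetical helper written for this answer, it is not part of PHP, and it simplifies what php_pcre.c does (for example, it does not validate the modifier letters).

```php
<?php
// Hypothetical helper: splits a preg_* style string into pattern and modifiers,
// following the steps described above. Not PHP's actual implementation.
function split_regex(string $regex): array
{
    $regex = ltrim($regex); // leading whitespace is ignored
    if ($regex === '') {
        throw new InvalidArgumentException('Empty regular expression');
    }

    $start = $regex[0];
    if (ctype_alnum($start) || $start === '\\') {
        throw new InvalidArgumentException('Delimiter must not be alphanumeric or backslash');
    }

    // Bracket-style delimiters close with their counterpart.
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $end   = $pairs[$start] ?? $start;
    $depth = 1;
    $len   = strlen($regex);

    for ($i = 1; $i < $len; $i++) {
        $c = $regex[$i];
        if ($c === '\\' && $i + 1 < $len) {
            $i++; // skip the escaped character
        } elseif ($c === $end && ($end === $start || --$depth <= 0)) {
            break; // ending delimiter found
        } elseif ($c === $start) {
            $depth++; // nested opening bracket (paired delimiters only)
        }
    }
    if ($i >= $len) {
        throw new InvalidArgumentException('No ending delimiter found');
    }

    return [
        'pattern'   => (string) substr($regex, 1, $i - 1),
        'modifiers' => (string) substr($regex, $i + 1),
    ];
}

// The pattern from the question: everything after the first ">" is "modifiers".
$broken = split_regex('<div[^>]*><ul[^>]*>');
// pattern: div[^    modifiers: ]*><ul[^>]*>

$fixed = split_regex('~<div[^>]*><ul[^>]*>~');
// pattern: <div[^>]*><ul[^>]*>    modifiers: (empty)
```

The first character of the leftover "modifiers" string is exactly the symbol named in the warning, ] in the case of the question.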
