Package com.cedarsoftware.util
Class RegexUtilities
java.lang.Object
com.cedarsoftware.util.RegexUtilities
Utility class for safe and efficient regular expression operations.
Provides ReDoS (Regular Expression Denial of Service) protection through
timeout enforcement and performance optimization through pattern caching.
Key Features
- ReDoS Protection: Enforces configurable timeouts on regex operations to prevent catastrophic backtracking
- Pattern Caching: Caches compiled Pattern objects to avoid repeated compilation overhead
- Thread Safety: All operations are thread-safe with concurrent caching
- Invalid Pattern Tracking: Remembers invalid patterns to avoid repeated compilation attempts
Security Configuration
Security features can be controlled via system properties:
- cedarsoftware.security.enabled: Enable/disable all security features (default: true)
- cedarsoftware.regex.timeout.enabled: Enable/disable regex timeout (default: true)
- cedarsoftware.regex.timeout.milliseconds: Timeout in milliseconds (default: 5000)
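The properties above can be read with plain JDK calls. The following is a minimal sketch, not the library's own code: the class RegexSecurityConfigSketch and its methods are hypothetical, and only the property names and defaults come from this page. Whether RegexUtilities re-reads the properties on every call, or how the global switch interacts with the timeout switch, is not stated here, so the sketch simply reads each value independently.

```java
/** Hypothetical helper (not part of RegexUtilities) that reads the documented properties. */
final class RegexSecurityConfigSketch {
    static final String SECURITY_ENABLED = "cedarsoftware.security.enabled";
    static final String TIMEOUT_ENABLED = "cedarsoftware.regex.timeout.enabled";
    static final String TIMEOUT_MILLIS = "cedarsoftware.regex.timeout.milliseconds";

    /** Reads a boolean property, using the documented default of true when it is unset. */
    static boolean readFlag(String name) {
        return Boolean.parseBoolean(System.getProperty(name, "true"));
    }

    /** Reads the timeout, using the documented default of 5000 ms when it is unset. */
    static long readTimeoutMillis() {
        String raw = System.getProperty(TIMEOUT_MILLIS, "5000");
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            // Assumption of this sketch: malformed values fall back to the default.
            return 5000L;
        }
    }

    public static void main(String[] args) {
        // Same effect as launching the JVM with
        // -Dcedarsoftware.regex.timeout.milliseconds=250
        System.setProperty(TIMEOUT_MILLIS, "250");
        System.out.println("security enabled: " + readFlag(SECURITY_ENABLED));
        System.out.println("timeout enabled:  " + readFlag(TIMEOUT_ENABLED));
        System.out.println("timeout (ms):     " + readTimeoutMillis());
    }
}
```

Setting the properties on the command line with -D is usually preferable to System.setProperty, since it takes effect before any class reads them.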
Usage Examples
// Safe matching with timeout protection
Pattern pattern = RegexUtilities.getCachedPattern("\\d+");
boolean matches = RegexUtilities.safeMatches(pattern, "12345");
// Safe find operation with result capture
SafeMatchResult result = RegexUtilities.safeFind(pattern, "abc123def");
if (result.matched()) {
    String found = result.group(0); // "123"
}
// Case-insensitive pattern caching
Pattern ciPattern = RegexUtilities.getCachedPattern("hello", true);
Author: John DeRegnaucourt (jdereg@gmail.com)
Copyright (c) Cedar Software LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
License
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
-
Nested Class Summary
Nested Classes
- static class SafeMatchResult: Immutable container for regex match results.
-
Method Summary
- static void clearPatternCache(): Clears all cached patterns.
- static Pattern getCachedPattern(String regex): Gets a cached Pattern for the given regex string.
- static Pattern getCachedPattern(String regex, boolean caseInsensitive): Gets a cached Pattern for the given regex string with case sensitivity option.
- static Pattern getCachedPattern(String regex, int flags): Gets a cached Pattern for the given regex string with specific flags.
- static Map getPatternCacheStats(): Gets statistics about the pattern cache.
- static long getRegexTimeoutMilliseconds(): Gets the configured regex timeout in milliseconds.
- static boolean isRegexTimeoutEnabled(): Checks if regex timeout protection is enabled.
- static boolean isSecurityEnabled(): Checks if security features are enabled.
- static SafeMatchResult safeFind(Pattern pattern, String input): Safely executes a pattern find operation with timeout protection.
- static boolean safeMatches(Pattern pattern, String input): Safely executes a pattern match operation with timeout protection.
- static String safeReplaceAll(Pattern pattern, String input, String replacement): Safely replaces all occurrences of the pattern with the replacement string.
- static String safeReplaceFirst(Pattern pattern, String input, String replacement): Safely replaces the first occurrence of the pattern with the replacement string.
- static String[] safeSplit(Pattern pattern, String input): Safely splits the input string around matches of the pattern.
-
Method Details
-
isSecurityEnabled
public static boolean isSecurityEnabled()
Checks if security features are enabled.
Returns:
- true if security is enabled (default), false otherwise
-
isRegexTimeoutEnabled
public static boolean isRegexTimeoutEnabled()
Checks if regex timeout protection is enabled.
Returns:
- true if timeout is enabled (default), false otherwise
-
getRegexTimeoutMilliseconds
public static long getRegexTimeoutMilliseconds()
Gets the configured regex timeout in milliseconds.
Returns:
- timeout in milliseconds (default: 5000)
-
getCachedPattern
public static Pattern getCachedPattern(String regex)
Gets a cached Pattern for the given regex string. Patterns are compiled once and cached for reuse.
Parameters:
- regex: The regular expression string
Returns:
- Cached Pattern object, or null if the pattern is invalid
-
getCachedPattern
public static Pattern getCachedPattern(String regex, boolean caseInsensitive)
Gets a cached Pattern for the given regex string with case sensitivity option.
Parameters:
- regex: The regular expression string
- caseInsensitive: If true, pattern matching will be case-insensitive
Returns:
- Cached Pattern object, or null if the pattern is invalid
-
getCachedPattern
public static Pattern getCachedPattern(String regex, int flags)
Gets a cached Pattern for the given regex string with specific flags.
Parameters:
- regex: The regular expression string
- flags: Match flags, a bit mask of Pattern constants (CASE_INSENSITIVE, MULTILINE, etc.)
Returns:
- Cached Pattern object, or null if the pattern is invalid
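The caching behavior described for getCachedPattern (compile once, reuse, return null for invalid patterns and remember them) can be approximated with a concurrent map. This is a minimal sketch under stated assumptions, not the library's implementation: PatternCacheSketch, its key format, and its size/invalidCount accessors are all hypothetical names chosen for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical sketch of a thread-safe pattern cache with invalid-pattern tracking. */
final class PatternCacheSketch {
    private static final ConcurrentMap<String, Pattern> CACHE = new ConcurrentHashMap<>();
    private static final Set<String> INVALID = ConcurrentHashMap.newKeySet();

    /** Returns a cached Pattern for regex + flags, or null if the regex does not compile. */
    static Pattern get(String regex, int flags) {
        if (regex == null) {
            return null;
        }
        // The flags are part of the key so "hello" and case-insensitive "hello" differ.
        String key = flags + ":" + regex;
        if (INVALID.contains(key)) {
            return null; // known-bad pattern: skip the compile attempt
        }
        Pattern cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        try {
            Pattern compiled = Pattern.compile(regex, flags);
            // If another thread won the race, keep its instance so callers share one object.
            Pattern raced = CACHE.putIfAbsent(key, compiled);
            return raced != null ? raced : compiled;
        } catch (PatternSyntaxException e) {
            INVALID.add(key);
            return null;
        }
    }

    /** Convenience overload mirroring the boolean case-sensitivity option. */
    static Pattern get(String regex, boolean caseInsensitive) {
        return get(regex, caseInsensitive ? Pattern.CASE_INSENSITIVE : 0);
    }

    static int size() { return CACHE.size(); }

    static int invalidCount() { return INVALID.size(); }

    static void clear() {
        CACHE.clear();
        INVALID.clear();
    }
}
```

Because an invalid pattern yields null rather than an exception, callers of a cache like this need a null check before using the result.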
-
clearPatternCache
public static void clearPatternCache()
Clears all cached patterns. Useful for freeing memory in long-running applications.
-
getPatternCacheStats
Gets statistics about the pattern cache.
Returns:
- Map containing cache statistics (size, invalidCount)
-
safeMatches
public static boolean safeMatches(Pattern pattern, String input)
Safely executes a pattern match operation with timeout protection. This protects against ReDoS (Regular Expression Denial of Service) attacks.
Parameters:
- pattern: The Pattern to match against
- input: The input string to match
Returns:
- true if the entire input matches the pattern, false otherwise
Throws:
- SecurityException: if the operation times out (possible ReDoS attack)
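This page does not describe how the timeout is enforced internally. One common JDK-only technique is to run the match on a worker thread and bound the wait with Future.get. The sketch below illustrates that technique under stated assumptions: RegexTimeoutSketch, runWithTimeout, matchesWithTimeout, and InterruptibleInput are hypothetical names, and the explicit timeout parameter stands in for the configured value.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.regex.Pattern;

/** Hypothetical sketch of timeout-bounded regex matching; not the library's code. */
final class RegexTimeoutSketch {
    // Daemon workers so a runaway match cannot keep the JVM alive.
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "regex-timeout-sketch");
        t.setDaemon(true);
        return t;
    });

    /** Matcher ignores interrupts, so this view aborts charAt() once the worker is interrupted. */
    private static final class InterruptibleInput implements CharSequence {
        private final CharSequence inner;
        InterruptibleInput(CharSequence inner) { this.inner = inner; }
        @Override public char charAt(int index) {
            if (Thread.currentThread().isInterrupted()) {
                throw new IllegalStateException("regex evaluation interrupted");
            }
            return inner.charAt(index);
        }
        @Override public int length() { return inner.length(); }
        @Override public CharSequence subSequence(int start, int end) {
            return new InterruptibleInput(inner.subSequence(start, end));
        }
        @Override public String toString() { return inner.toString(); }
    }

    /** Runs the task on a worker and throws SecurityException if it exceeds the timeout. */
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) {
        Future<T> future = WORKERS.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            throw new SecurityException(
                    "Regex operation timed out after " + timeoutMillis + " ms (possible ReDoS)");
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            throw new SecurityException("Regex operation interrupted");
        } catch (ExecutionException e) {
            throw new IllegalStateException("Regex operation failed", e.getCause());
        }
    }

    /** Full-input match bounded by timeoutMillis. */
    static boolean matchesWithTimeout(Pattern pattern, String input, long timeoutMillis) {
        Callable<Boolean> task = () -> pattern.matcher(new InterruptibleInput(input)).matches();
        return runWithTimeout(task, timeoutMillis);
    }
}
```

The interruptible wrapper matters because cancelling the Future alone only sets the interrupt flag; without a check inside charAt, a backtracking match would keep consuming CPU after the caller has already received the exception.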
-
safeFind
public static SafeMatchResult safeFind(Pattern pattern, String input)
Safely executes a pattern find operation with timeout protection. Returns a SafeMatchResult containing the match data if found.
Parameters:
- pattern: The Pattern to search for
- input: The input string to search
Returns:
- SafeMatchResult containing match data, or an unmatched result if not found
Throws:
- SecurityException: if the operation times out (possible ReDoS attack)
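Returning an immutable result instead of a live Matcher means the caller never touches mutable matcher state after the timed operation has finished. The sketch below shows one way such a container could be built from the JDK alone. MatchSnapshotSketch is a hypothetical stand-in: only matched() and group(int) appear in the usage example on this page, while groupCount() and the null return for an unmatched result are assumptions of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical immutable snapshot of a find() result; not the library's SafeMatchResult. */
final class MatchSnapshotSketch {
    private final boolean matched;
    private final String[] groups; // groups[0] is the whole match; never exposed directly

    private MatchSnapshotSketch(boolean matched, String[] groups) {
        this.matched = matched;
        this.groups = groups;
    }

    /** Runs find() once and copies every group out of the Matcher. */
    static MatchSnapshotSketch find(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        if (!m.find()) {
            return new MatchSnapshotSketch(false, new String[0]);
        }
        String[] copy = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            copy[i] = m.group(i); // may be null for a group that did not participate
        }
        return new MatchSnapshotSketch(true, copy);
    }

    boolean matched() { return matched; }

    /** Number of capturing groups, excluding group 0. */
    int groupCount() { return matched ? groups.length - 1 : 0; }

    /** Returns the captured text, or null when nothing matched (assumption of this sketch). */
    String group(int index) {
        if (!matched) {
            return null;
        }
        if (index < 0 || index >= groups.length) {
            throw new IndexOutOfBoundsException("No group " + index);
        }
        return groups[index];
    }
}
```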
-
safeReplaceFirst
public static String safeReplaceFirst(Pattern pattern, String input, String replacement)
Safely replaces the first occurrence of the pattern with the replacement string.
Parameters:
- pattern: The Pattern to search for
- input: The input string
- replacement: The replacement string
Returns:
- The input string with the first match replaced
Throws:
- SecurityException: if the operation times out (possible ReDoS attack)
-
safeReplaceAll
public static String safeReplaceAll(Pattern pattern, String input, String replacement)
Safely replaces all occurrences of the pattern with the replacement string.
Parameters:
- pattern: The Pattern to search for
- input: The input string
- replacement: The replacement string
Returns:
- The input string with all matches replaced
Throws:
- SecurityException: if the operation times out (possible ReDoS attack)
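This page does not say how the replacement string is interpreted. Assuming the two replace methods follow the standard java.util.regex.Matcher semantics, `$n` in the replacement refers to a capturing group and a literal dollar sign must be quoted. The sketch below shows those JDK semantics only; ReplaceSemanticsSketch is a hypothetical class and adds no timeout protection.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical demo of plain JDK replacement semantics (no timeout protection). */
final class ReplaceSemanticsSketch {
    static String replaceFirst(Pattern pattern, String input, String replacement) {
        return pattern.matcher(input).replaceFirst(replacement);
    }

    static String replaceAll(Pattern pattern, String input, String replacement) {
        return pattern.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");
        System.out.println(replaceFirst(digits, "a1b22c333", "#")); // a#b22c333
        System.out.println(replaceAll(digits, "a1b22c333", "#"));   // a#b#c#

        // $1 and $2 refer to capturing groups in the replacement string.
        Pattern pair = Pattern.compile("(\\w+)=(\\w+)");
        System.out.println(replaceAll(pair, "x=1, y=2", "$2=$1")); // 1=x, 2=y

        // quoteReplacement makes '$' and '\' literal in the replacement.
        System.out.println(replaceAll(digits, "cost 5", Matcher.quoteReplacement("$5"))); // cost $5
    }
}
```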
-
safeSplit
public static String[] safeSplit(Pattern pattern, String input)
Safely splits the input string around matches of the pattern.
Parameters:
- pattern: The Pattern to split on
- input: The input string to split
Returns:
- Array of strings split around pattern matches
Throws:
- SecurityException: if the operation times out (possible ReDoS attack)
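For reference, the JDK's own Pattern.split(CharSequence) drops trailing empty strings and returns the whole input as a single element when nothing matches. Whether safeSplit behaves identically is an assumption, since this page does not say. SplitSemanticsSketch below is a hypothetical class that shows only the plain JDK behavior, without timeout protection.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

/** Hypothetical demo of plain JDK split semantics (no timeout protection). */
final class SplitSemanticsSketch {
    static String[] split(Pattern pattern, String input) {
        return pattern.split(input);
    }

    public static void main(String[] args) {
        Pattern comma = Pattern.compile("\\s*,\\s*");
        // Trailing empty strings are discarded by Pattern.split(CharSequence).
        System.out.println(Arrays.toString(split(comma, "a, b ,c,,"))); // [a, b, c]
        // With no match, the result is a one-element array holding the input.
        System.out.println(Arrays.toString(split(comma, "abc"))); // [abc]
    }
}
```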
-