MySQL Regular ExpressionsA regular expression is a special string that describes a search pattern. It's a powerful tool to give a concise and flexible way for identifying text strings such as characters and words based on patterns. It uses its own syntax that can be interpreted by a regular expression processor. A regular expression is widely used in almost all platforms, from programming languages to databases, including MySQL. A regular expression uses the backslash as an escape character that should be considered in the pattern match if double backslashes have used. The regular expressions are not case sensitive. It is abbreviated as REGEX or REGEXP in MySQL. The advantage of using regular expression is that we are not limited to search for a string based on a fixed pattern with the percent (%) sign and underscore (_) in the LIKE operator. The regular expression has more meta-characters that allow more flexibility and control while performing pattern matching. We have previously learned about wildcards, which allows us to get a similar result as regular expressions. So we may ask why we learn regular expressions if we will get the same result as the wildcards. It is because regular expressions allow us to search data matching even more complex ways compared to wildcards. SyntaxMySQL adapts the regular expression implemented by Henry Spencer. MySQL allows us to match patterns right in the SQL statements by using the REGEXP operator. The following is the basic syntax that illustrates the use of regular expressions in MySQL: In this syntax, the column_list indicates the column name returns in the result set. The table_name is the name of the table that data will be retrieved using the pattern. The WHERE field_name represents the column name on which the regular expression is performed. The REGEXP is the regular expression operator, and the pattern is the search condition to be matched by REGEXP. We can also use the RLIKE operator, which is the synonym for REGEXP that gives the same results as REGEXP. We can avoid the confusion to use this statement with the LIKE operator by using the REGEXP instead of LIKE. This statement returns true if a value in the WHERE field_name matches the pattern. Otherwise, it returns false. If either field_name or pattern is NULL, the result is always NULL. The negation form of the REGEXP operator is NOT REGEXP. Regular Expression Meta-CharactersThe following table shows the most commonly used meta-characters and constructs in a regular expression:
Let us understand the regular expressions using practical examples given below: Suppose we have a table named student_info that contains the following data. We will demonstrate various examples based on this table data. If we want to search for students whose name start with "A or B", we can use a regular expression together with the meta-characters as follows: Executing the statement, we will get the desired result. See the below output: If we want to get the student information whose name ends with k, we can use 'k$' meta-character to match the end of a string as follows: Executing the statement, we will get the desired result. See the below output: If we want to get the student information whose name contains exactly six characters, we can do this using '^' and '$ meta-characters. These characters match the beginning and end of the student name and repeat {6} times of any character '.' in-between as shown in the following statement: Executing the statement, we will get the desired result. See the below output: If we want to get the student info whose subjects contains 'i' characters, we can do this by using the below query: Executing the statement, we will get the desired result. See the below output: Regular Expression Functions and OperatorsThe following are the list of regular functions and operators in MySQL:
Let's see all of them in detail. REGEXP, RLIKE, & REGEXP_LIKE()Although these functions and operators return the same result, REGEXP_LIKE() gives us more functionality with the optional parameters. We can use them as follows: These statements give output whether string expression matches regular expression pattern or not. We will get 1 if an expression matches the pattern. Otherwise, they return 0. The below examples explain it more clearly. In the below image, the first statement returns '1' because 'B' is in the range A-Z. The second statement limited the range of the pattern to B-Z. So 'A' will not match any character within the range, and MySQL returns 0. Here we have used the alias match_ and not_match_ so that the returned column will be more understandable. REGEXP_LIKE() ParameterThe following are the five possible parameters to modify the function output:
Example In this example, we have added the 'c' and 'i' as an optional parameter, which invokes case-sensitive and case-insensitive matching. The first query gives the output 0 because 'a' is in the range 'a-z', but not in the range of capital letters A-Z. The second query gives the output 1 because of case-insensitive features. NOT REGEXP & NOT RLIKEThey are regular expression operators that compare the specified pattern and return the result, which does not match the patterns. These operators return 1 if no match is found. Otherwise, they return 0. We can use these functions as follows: Example The below statement returns 0 because 'a' is found in the given range. Here is the output: REGEXP_INSTR()It is a function that gives a result when the starting index of substring expression matches the pattern. It returns 0 if there is no match found. If either expression or pattern is NULL, it returns NULL. Here indexing starts at 1. This function uses various optional parameters that are pos, occurrence, return_option, match_type, etc. Example Suppose we want to get the index position of substring 'a' within expr (a b c d e f a). The first query returns 1 because we have not set any optional parameters, which is the string's first index. The second query returns 13 because we have modified the query with optional parameter occurrence. REGEXP_REPLACE()This function replaces the specified string character by matching characters and then returns the resulting string. If any expression, pattern, or replaceable string is not found, it will return NULL. This function can be used as follows: The replace character uses the optional parameters such as pos, occurrence, and match_type. Example This statement replaces the 'tutorials' pattern with the 'javat' pattern. Here is the output: REGEXP_SUBSTRING()This function returns the substring of an expression that matches the specified pattern. If the expression or specified pattern or even no match is found, it returns NULL. This function can be used as follows: The pattern uses the optional parameters such as pos, occurrence, and match_type. Example This statement returns the 'point' pattern, which is the third occurrence of the given range. Here is the output: Next TopicMySQL RLIKE |