How to Split a String in Java with Delimiter?
In Java, splitting a string is an important and frequently used operation. Java provides multiple ways to split a string, but the most common is the split() method of the String class. In this section, we will learn how to split a String in Java with a delimiter. We will also look at other ways to split a string, such as the StringTokenizer class and the Scanner.useDelimiter() method. Before moving to the topic, let's understand what a delimiter is.
What is a delimiter?
In Java, delimiters are the characters that split (separate) a string into tokens. Java allows us to define any character as a delimiter. Several of the splitting facilities that Java provides, such as the Scanner and StringTokenizer classes, use whitespace as the delimiter by default.
Before moving to the program, let's understand how a string is structured.
A string is made up of two kinds of text: tokens and delimiters. The tokens are the meaningful words, and the delimiters are the characters that split or separate the tokens.
To work with delimiters in Java, we should be familiar with Java regular expressions. This matters when the delimiter is a character that has a special meaning in regular expressions, such as the dot (.) or the pipe (|). Let's understand tokens and delimiters through an example.
Example of Delimiter
String: Javatpoint is the best website to learn new technologies.
In the above string, the tokens are Javatpoint, is, the, best, website, to, learn, new, and technologies, and the delimiters are the whitespace characters between adjacent tokens.
How to Split a String in Java with Delimiter?
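Java provides the following ways to split a string into tokens: the Scanner.next() method, the String.split() method, the StringTokenizer class, and the Scanner.useDelimiter() method.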
Using Scanner.next() Method
It is a method of the Scanner class. It finds and returns the next complete token from the scanner. By default, it splits the input into tokens using whitespace as the delimiter. A complete token is preceded and followed by input that matches the delimiter pattern.
It throws NoSuchElementException if no more tokens are available. It also throws IllegalStateException if the scanner is closed.
Let's create a program that splits a string using the next() method, which uses whitespace to split the string into tokens.
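A minimal sketch of such a program might look like the following. The class name ScannerNextExample and the helper method tokenize are illustrative names, and the input is assumed to be the example sentence used earlier, stored in a variable named str.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerNextExample {

    // Collects every whitespace-separated token that Scanner.next() returns.
    static List<String> tokenize(String str) {
        List<String> tokens = new ArrayList<>();
        // The Scanner reads from the string itself rather than from System.in.
        try (Scanner scanner = new Scanner(str)) {
            // hasNext() guards against the NoSuchElementException thrown by next().
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed input: the example sentence from the delimiter section.
        String str = "Javatpoint is the best website to learn new technologies";
        for (String token : tokenize(str)) {
            System.out.println(token);
        }
    }
}
```

Each token is printed on its own line.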
Javatpoint is the best website to learn new technologies
In the above program, note that we passed the string variable str to the constructor of the Scanner class instead of System.in. We did so because the scanner must read its input from the string that we want to split.
Using String.split() Method
The split() method of the String class splits a string into an array of String objects around matches of the specified delimiter, which is interpreted as a regular expression. For example, consider a string whose values are separated by commas. We can split such a string by passing the comma (,) to the split() method as the delimiter. The method breaks the string at each comma and returns the pieces between the commas as separate String objects.
There are two variants of the split() method:
String.split(String regex)
It splits the string around matches of the specified regular expression regex. The delimiter can be a space ( ), a comma (,), or any ordinary character (like z, a, g, l, etc.). A character that has a special meaning in regular expressions, such as a dot (.), must be escaped to be used as a literal delimiter.
The method accepts a delimiting regular expression as an argument. It returns an array of String objects. It throws PatternSyntaxException if the regular expression has invalid syntax.
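Because the argument is a regular expression, special characters behave differently from ordinary ones. The following sketch illustrates this under some assumptions: the class name SplitRegexSketch, its helper methods, and the sample input www.example.com are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.regex.PatternSyntaxException;

public class SplitRegexSketch {

    // Splits on a literal dot; the dot is escaped because it is a regex metacharacter.
    static String[] splitOnLiteralDot(String input) {
        return input.split("\\.");
    }

    // Returns true if split() rejects the given regex as syntactically invalid.
    static boolean isInvalidRegex(String regex) {
        try {
            "a,b".split(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Escaped dot: splits at each literal dot.
        System.out.println(Arrays.toString(splitOnLiteralDot("www.example.com"))); // [www, example, com]

        // Unescaped dot: every character is treated as a delimiter, so no tokens remain.
        System.out.println("www.example.com".split(".").length); // 0

        // An unclosed character class is not a valid regular expression.
        System.out.println(isInvalidRegex("[")); // true
        System.out.println(isInvalidRegex(",")); // false
    }
}
```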
Let's use the split() method and split the string by a comma.
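One way to write this program is sketched below. The input string Life,is,your,creation is an assumption chosen to be consistent with the output shown, and the class and method names are illustrative.

```java
public class SplitByCommaExample {

    // Splits the given string at every comma.
    static String[] splitByComma(String str) {
        return str.split(",");
    }

    public static void main(String[] args) {
        // Assumed input, chosen to match the tokens in the output.
        String str = "Life,is,your,creation";
        for (String token : splitByComma(str)) {
            System.out.println(token);
        }
    }
}
```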
Life is your creation
In the above example, the string object is delimited by a comma. The split() method splits the string when it finds the comma as a delimiter.
Let's see another example in which we will use multiple delimiters to split the string.
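A possible version of this example is sketched below. Both the input sentence and the set of delimiters (space, comma, dot, and apostrophe) are assumptions chosen to be consistent with the output shown; the class and method names are illustrative. The delimiters are listed in a character class, and the trailing + treats a run of adjacent delimiters as a single separator.

```java
public class SplitMultipleDelimiters {

    // Splits on one or more of: space, comma, dot, apostrophe.
    // Inside a character class the dot is a literal character.
    static String[] splitOnPunctuation(String str) {
        return str.split("[ ,.']+");
    }

    public static void main(String[] args) {
        // Assumed input, chosen to match the tokens in the output.
        String str = "If you don't like something, change it.";
        for (String token : splitOnPunctuation(str)) {
            System.out.println(token);
        }
    }
}
```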
If you don t like something change it
String.split(String regex, int limit)
It allows us to split a string by a delimiter into a limited number of tokens. The method accepts two parameters: regex (a delimiting regular expression) and limit. The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. It returns an array of String objects computed by splitting the given string according to the limit parameter.
The only difference from the single-argument variant is that this variant lets us limit the number of tokens returned.
It throws PatternSyntaxException if the regular expression has invalid syntax.
The limit parameter may be positive, negative, or zero.
When the limit is positive:
Number of tokens: 2
46
-567-7388
When the limit is negative:
Number of tokens: 2
Life,is,
our,creation
When the limit is equal to 0:
Number of tokens: 2
Hello
how are you?
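In the above output, we see that each call returns two tokens. In general, a positive limit n applies the pattern at most n - 1 times, so the array has at most n elements and the last element contains the rest of the string. A negative limit applies the pattern as many times as possible and keeps trailing empty strings. A limit of zero also applies the pattern as many times as possible but discards trailing empty strings.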
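The effect of each kind of limit can be checked with a small sketch. The input a,b,c,, is a hypothetical string, chosen because its trailing commas make the difference between a negative and a zero limit visible; the class and method names are illustrative.

```java
import java.util.Arrays;

public class SplitLimitSketch {

    // Thin wrapper around String.split(String regex, int limit).
    static String[] splitWithLimit(String input, String regex, int limit) {
        return input.split(regex, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,"; // hypothetical input with two trailing commas

        // Positive limit: at most 2 elements; the last one holds the rest of the string.
        String[] positive = splitWithLimit(input, ",", 2);
        System.out.println(positive.length + " " + Arrays.toString(positive)); // 2 [a, b,c,,]

        // Negative limit: split everywhere and keep trailing empty strings.
        String[] negative = splitWithLimit(input, ",", -1);
        System.out.println(negative.length + " " + Arrays.toString(negative)); // 5 [a, b, c, , ]

        // Zero limit: split everywhere but discard trailing empty strings.
        String[] zero = splitWithLimit(input, ",", 0);
        System.out.println(zero.length + " " + Arrays.toString(zero)); // 3 [a, b, c]
    }
}
```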
Example of Pipe Delimited String
Splitting a string delimited by a pipe (|) is a little tricky because the pipe is a special character in Java regular expressions.
Let's create a string delimited by pipe and split it by pipe.
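A sketch of such a program follows. The input Life|is|your|creation is inferred from the output shown, and the class and method names are illustrative. The tokens are printed on one line, separated by spaces. The behavior noted in the comments is that of Java 8 and later.

```java
public class PipeSplitProblem {

    // Passes the pipe unescaped, so the regex engine treats it as alternation.
    static String[] splitUnescaped(String str) {
        return str.split("|");
    }

    public static void main(String[] args) {
        // Assumed input, inferred from the output.
        String str = "Life|is|your|creation";
        // On Java 8 and later, every character (including each pipe) becomes its own token.
        for (String token : splitUnescaped(str)) {
            System.out.print(token + " ");
        }
        System.out.println();
    }
}
```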
L i f e | i s | y o u r | c r e a t i o n
In the above example, we see that the output is not what the other delimiters produce. It should produce an array of the tokens Life, is, your, and creation, but it does not. Instead, it gives the result shown in the output above.
The reason is that the regular expression engine interprets the pipe as the logical OR (alternation) operator. A pipe on its own is an alternation between two empty patterns, so it matches the empty string at every position, and the string is split between every pair of characters.
To resolve this problem, we must escape the pipe character when passing it to the split() method, that is, pass "\\|" as the regular expression.
Adding a pair of backslashes (\\) before the pipe escapes it. After this change to the above program, the regex engine interprets the pipe as a literal delimiter character.
Another way to escape the pipe character is to put it inside a pair of square brackets, that is, pass "[|]" as the regular expression. In the Java regex API, a pair of square brackets defines a character class.
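Both ways of escaping can be sketched as follows, again assuming the input Life|is|your|creation; the class and method names are illustrative.

```java
public class PipeSplitFixed {

    // Escapes the pipe with a backslash (written as two backslashes in a Java string literal).
    static String[] splitWithBackslash(String str) {
        return str.split("\\|");
    }

    // Escapes the pipe by placing it inside a character class.
    static String[] splitWithCharacterClass(String str) {
        return str.split("[|]");
    }

    public static void main(String[] args) {
        String str = "Life|is|your|creation"; // assumed input
        for (String token : splitWithBackslash(str)) {
            System.out.println(token);
        }
        for (String token : splitWithCharacterClass(str)) {
            System.out.println(token);
        }
    }
}
```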
Both the above statements yield the following output:
Life is your creation
Using StringTokenizer Class
Java StringTokenizer is a legacy class defined in the java.util package. It allows us to split a string into tokens. It is retained for compatibility, but its use is discouraged in new code because the split() method of the String class does the same work. So, programmers generally prefer the split() method to the StringTokenizer class. We use the following two methods of the class:
StringTokenizer.hasMoreTokens()
The method checks whether there are more tokens available in the tokenizer's string. It returns true if at least one token is available after the current position; otherwise, it returns false. If it returns true, a subsequent call to the nextToken() method returns that token.
StringTokenizer.nextToken()
It returns the next token from the string tokenizer. It throws NoSuchElementException if there are no more tokens in the tokenizer's string.
Let's create a program that splits the string using the StringTokenizer class.
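A minimal sketch of such a program is given below. The input Welcome to Javatpoint with a space as the delimiter is an assumption consistent with the output shown, and the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class StringTokenizerExample {

    // Collects the tokens of str, using each character of delimiters as a delimiter.
    static List<String> tokenize(String str, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(str, delimiters);
        // hasMoreTokens() guards against the NoSuchElementException thrown by nextToken().
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed input and delimiter, chosen to match the output.
        for (String token : tokenize("Welcome to Javatpoint", " ")) {
            System.out.println(token);
        }
    }
}
```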
Welcome to Javatpoint
Using Scanner.useDelimiter() Method
The Java Scanner class provides the useDelimiter() method to split a string into tokens. There are two variants of the useDelimiter() method:
Scanner.useDelimiter(Pattern pattern)
The method sets the scanner's delimiting pattern to the specified pattern. It accepts a delimiting pattern as an argument. It returns the Scanner.
Scanner.useDelimiter(String pattern)
The method sets the scanner's delimiting pattern to a pattern constructed from the specified string. It accepts a string that specifies the delimiting pattern as an argument. It returns the Scanner.
Note: Both methods behave in the same way, because an invocation of useDelimiter(pattern) with a string is equivalent to invoking useDelimiter(Pattern.compile(pattern)).
In the following program, we have used the useDelimiter() method to split the string.
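A sketch of such a program is shown below. The input Do/your/work/self and the slash delimiter are assumptions chosen to be consistent with the output shown; the class and method names are illustrative. Both variants of useDelimiter() are included to show that they produce the same tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class UseDelimiterExample {

    // String variant: the scanner builds the delimiting pattern from the string.
    static List<String> tokenize(String str, String delimiterRegex) {
        try (Scanner scanner = new Scanner(str).useDelimiter(delimiterRegex)) {
            return collect(scanner);
        }
    }

    // Pattern variant: the caller supplies an already compiled pattern.
    static List<String> tokenize(String str, Pattern delimiter) {
        try (Scanner scanner = new Scanner(str).useDelimiter(delimiter)) {
            return collect(scanner);
        }
    }

    // Reads tokens until the scanner has none left.
    private static List<String> collect(Scanner scanner) {
        List<String> tokens = new ArrayList<>();
        while (scanner.hasNext()) {
            tokens.add(scanner.next());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String str = "Do/your/work/self"; // assumed input
        for (String token : tokenize(str, "/")) {
            System.out.println(token);
        }
    }
}
```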
Do your work self