Checking if Two Words Are Present in a String in Java

Text processing is a common task in software development. Whether you're building a search engine, a chatbot, or any other application that handles text, you may need to determine whether certain words appear in a string. This section shows how to check whether two specific words are present in a string using Java.

Java provides a rich set of methods for string manipulation through the String class. Before diving into the specifics of checking for two words, it's essential to understand some fundamental operations on strings.

Here are some common methods used in string manipulation:

  • length(): Returns the length of the string.
  • charAt(int index): Returns the character at the specified index.
  • substring(int beginIndex, int endIndex): Returns a new string containing the characters from beginIndex (inclusive) to endIndex (exclusive).
  • indexOf(String str): Returns the index of the first occurrence of the specified substring, or -1 if it does not occur.
  • contains(CharSequence s): Returns true if the string contains the specified sequence of characters.
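
The methods listed above can be tried out with a short sketch like the following. The class name StringBasics and the sample sentence are assumptions chosen for illustration.

```java
class StringBasics {
    public static void main(String[] args) {
        // Sample sentence, chosen for illustration.
        String text = "The quick brown fox";

        System.out.println(text.length());          // 19
        System.out.println(text.charAt(4));         // q
        System.out.println(text.substring(4, 9));   // quick (index 9 is excluded)
        System.out.println(text.indexOf("brown"));  // 10
        System.out.println(text.indexOf("cat"));    // -1, because "cat" does not occur
        System.out.println(text.contains("fox"));   // true
    }
}
```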

Checking for Words in a String

To check if two specific words are present in a string, you can use various approaches depending on the complexity of your requirements. We'll explore several methods, starting from basic to more advanced techniques.

Basic Approach Using contains()

The simplest way to check if two words are present in a string is to use the contains() method. This method checks if a sequence of characters is present in the string.

File Name: WordChecker.java
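
A minimal sketch of how such a class could look. The sample sentence and the two search words are assumptions chosen for illustration.

```java
class WordChecker {

    // Returns true only if both word1 and word2 occur somewhere in str.
    // contains() matches substrings, so "quick" is also found inside "quickly".
    static boolean areWordsPresent(String str, String word1, String word2) {
        return str.contains(word1) && str.contains(word2);
    }

    public static void main(String[] args) {
        // Sample input, chosen for illustration.
        String str = "The quick brown fox jumps over the lazy dog";
        boolean result = areWordsPresent(str, "quick", "lazy");
        System.out.println("Are both words present? " + result);
    }
}
```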

Output:

Are both words present? true

In this example, the areWordsPresent() method returns true if both word1 and word2 are found in the string str.

Case-Insensitive Search

String comparisons in Java, including contains(), are case-sensitive. To perform a case-insensitive search, you can convert both the string and the words to the same case (lowercase or uppercase) before checking.

File Name: CaseInsensitiveWordChecker.java
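
One possible sketch of the lowercase-conversion approach. The method name areWordsPresentIgnoreCase and the sample sentence are assumptions chosen for illustration.

```java
import java.util.Locale;

class CaseInsensitiveWordChecker {

    // Lowercases the text and both words before comparing.
    // Locale.ROOT keeps the conversion independent of the machine's default locale.
    static boolean areWordsPresentIgnoreCase(String str, String word1, String word2) {
        String lower = str.toLowerCase(Locale.ROOT);
        return lower.contains(word1.toLowerCase(Locale.ROOT))
                && lower.contains(word2.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Sample input with mixed case, chosen for illustration.
        String str = "The QUICK brown fox jumps over the LAZY dog";
        boolean result = areWordsPresentIgnoreCase(str, "quick", "lazy");
        System.out.println("Are both words present (case-insensitive)? " + result);
    }
}
```

Passing an explicit locale is a design choice: with the default locale, toLowerCase() can behave differently on some systems (for example, with the Turkish dotted and dotless i).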

Output:

Are both words present (case-insensitive)? true

Checking for Whole Words

The contains() method checks for a sequence of characters, which might be part of another word. To ensure you're checking for whole words, you can use regular expressions.

File Name: WholeWordChecker.java
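
A minimal sketch of whole-word matching with java.util.regex. The helper name containsWholeWord and the sample sentence are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

class WholeWordChecker {

    // Wraps the word in \b word boundaries; Pattern.quote() makes any
    // regex metacharacters inside the word match literally.
    static boolean containsWholeWord(String str, String word) {
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return pattern.matcher(str).find();
    }

    static boolean areWholeWordsPresent(String str, String word1, String word2) {
        return containsWholeWord(str, word1) && containsWholeWord(str, word2);
    }

    public static void main(String[] args) {
        // Sample input, chosen for illustration.
        String str = "The quick brown fox jumps over the lazy dog";
        boolean result = areWholeWordsPresent(str, "quick", "lazy");
        System.out.println("Are both whole words present? " + result);
    }
}
```

With this sketch, a text containing only "quickly" would not count as containing "quick", because there is no word boundary between "quick" and "ly".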

Output:

Are both whole words present? true

In this example, \\b is how the regular-expression word boundary \b is written inside a Java string literal. It ensures that word1 and word2 are matched only as whole words.

Handling Word Variations with Regular Expressions

Regular expressions can also handle variations of the words, such as different tenses or plural forms. For instance, to match either "quick" or "quickly", you can use a pattern like quick(ly)?.

File Name: WordVariationChecker.java
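
A sketch of how the variation patterns might be applied. The method name areVariationsPresent, the added \b boundaries, and the sample sentence are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

class WordVariationChecker {

    // Each argument is a regular expression describing the accepted variations.
    // The non-capturing group plus \b anchors make the whole variation match as one word.
    static boolean areVariationsPresent(String str, String word1Pattern, String word2Pattern) {
        Pattern p1 = Pattern.compile("\\b(?:" + word1Pattern + ")\\b");
        Pattern p2 = Pattern.compile("\\b(?:" + word2Pattern + ")\\b");
        return p1.matcher(str).find() && p2.matcher(str).find();
    }

    public static void main(String[] args) {
        // Sample input, chosen for illustration.
        String str = "The fox ran quickly past the lazy dog";
        String word1Pattern = "quick(ly)?";
        String word2Pattern = "laz(y|ies)?";
        boolean result = areVariationsPresent(str, word1Pattern, word2Pattern);
        System.out.println("Are both word variations present? " + result);
    }
}
```

The word boundaries matter here: without them, quick(ly)? would find "quick" inside any longer word, which makes the optional suffix redundant.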

Output:

Are both word variations present? true

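In this example, `word1Pattern` and `word2Pattern` hold regular expressions that match variations of "quick" and "lazy". The `quick(ly)?` pattern matches both "quick" and "quickly", and the `laz(y|ies)?` pattern matches "lazy" and "lazies". Because the trailing `?` makes the whole group optional, `laz(y|ies)?` also matches the bare stem "laz"; drop the `?` if that is not wanted.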

Advanced Approaches

For more complex scenarios, such as checking for words in large texts or in the presence of punctuation, additional strategies can be applied.

Using String Tokenization

String tokenization involves splitting the string into individual words and then checking if the desired words are present in the resulting array. This method can be useful when dealing with punctuation and other delimiters.

File Name: TokenizationChecker.java
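
A minimal sketch of the tokenization approach. The split expression \W+ and the sample sentence are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

class TokenizationChecker {

    // Splits on runs of non-word characters, so commas, periods and spaces
    // all act as delimiters and never stick to a token.
    // Note: \W treats only ASCII letters, digits and '_' as word characters by default.
    static boolean areWordsPresent(String str, String word1, String word2) {
        List<String> tokens = Arrays.asList(str.split("\\W+"));
        return tokens.contains(word1) && tokens.contains(word2);
    }

    public static void main(String[] args) {
        // Sample input with punctuation, chosen for illustration.
        String str = "The quick, brown fox jumps over the lazy dog.";
        boolean result = areWordsPresent(str, "quick", "lazy");
        System.out.println("Are both words present using tokenization? " + result);
    }
}
```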

Output:

Are both words present using tokenization? true

Using a Set for Faster Lookups

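If we need to check many words against the same large string, converting the string into a set of words once makes each subsequent lookup a constant-time operation on average. Building the set still requires one pass over the text, so the approach pays off mainly when lookups are repeated.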

File Name: SetChecker.java
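
One way this could be sketched with a HashSet. The helper name toWordSet, the split expression, and the sample sentence are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetChecker {

    // Builds the set once; it can then be reused for any number of lookups.
    static Set<String> toWordSet(String str) {
        return new HashSet<>(Arrays.asList(str.split("\\W+")));
    }

    static boolean areWordsPresent(String str, String word1, String word2) {
        Set<String> words = toWordSet(str);
        return words.contains(word1) && words.contains(word2);
    }

    public static void main(String[] args) {
        // Sample input, chosen for illustration.
        String str = "The quick brown fox jumps over the lazy dog";
        boolean result = areWordsPresent(str, "quick", "lazy");
        System.out.println("Are both words present using a set? " + result);
    }
}
```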

Output:

Are both words present using a set? true

Using Stream API for Modern Java

Java 8 introduced the Stream API, which can be used to process sequences of elements in a functional style. Here's how you can leverage it for checking word presence:

File Name: StreamChecker.java
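
A sketch of a stream-based version that stays within Java 8 APIs. The split expression and the sample sentence are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamChecker {

    // Collects the tokens into a set, then requires every target word to be in it.
    static boolean areWordsPresent(String str, String word1, String word2) {
        Set<String> words = Arrays.stream(str.split("\\W+"))
                .collect(Collectors.toSet());
        return Stream.of(word1, word2).allMatch(words::contains);
    }

    public static void main(String[] args) {
        // Sample input, chosen for illustration.
        String str = "The quick brown fox jumps over the lazy dog";
        boolean result = areWordsPresent(str, "quick", "lazy");
        System.out.println("Are both words present using Stream API? " + result);
    }
}
```

Using allMatch() over a stream of target words means the same method body extends naturally to more than two words.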

Output:

Are both words present using Stream API? true 

Checking for the presence of two words in a string in Java can be achieved through various methods, ranging from basic to advanced techniques. Simple approaches like using contains() can be sufficient for straightforward cases, while more complex scenarios might require regular expressions, tokenization, or leveraging libraries like Apache Commons Lang.

By understanding these different approaches, you can choose the most appropriate method based on your specific needs and constraints. Whether you are dealing with case sensitivity, whole word matching, or large-scale text processing, Java provides the tools and flexibility to handle these tasks efficiently.

Summary of Methods

Basic Approach Using contains():

  • Directly checks for the presence of substrings.
  • Suitable for simple cases.

Case-Insensitive Search:

  • Converts both the string and the words to lowercase or uppercase.
  • Ensures that case differences do not affect the search.

Checking for Whole Words with Regular Expressions:

  • Uses word boundary markers (\b) in regular expressions.
  • Ensures that only whole words are matched.

Handling Word Variations with Regular Expressions:

  • Uses more complex patterns to match variations of the words.
  • Useful for checking different tenses, plural forms, etc.

String Tokenization:

  • Splits the string into tokens and checks for the presence of words.
  • Handles punctuation and other delimiters effectively.

Using a Set for Faster Lookups:

  • Converts the string into a set of words.
  • Gives constant-time average lookups once the set is built, which pays off when many words are checked against the same text.

Using Apache Commons Lang:

  • Utilizes StringUtils methods such as containsIgnoreCase() for simplified and readable code.
  • Offers case-insensitive search with minimal code.

Using Stream API for Modern Java:

  • Leverages Java 8 Stream API for functional style processing.
  • Concise for processing sequences of elements, though not inherently faster than an equivalent loop.