Searching Algorithms
Searching algorithms are methods or procedures used to find a specific item or element within a collection of data. These algorithms are widely used in computer science and are crucial for tasks like searching for a particular record in a database, finding an element in a sorted list, or locating a file on a computer.
These are some commonly used searching algorithms:
- Linear Search: In this simple algorithm, each element in the collection is checked sequentially until the desired item is found or the entire list has been traversed. It is suitable for small or unsorted lists, but its time complexity is O(n) in the worst case.
- Binary Search: This algorithm is applicable only to sorted lists. It repeatedly compares the middle element of the list with the target element and narrows down the search range by half based on the comparison result. Binary search has a time complexity of O(log n), making it highly efficient for large sorted lists.
- Hashing: Hashing algorithms use a hash function to convert the search key into an index or address of an array (known as a hash table). This allows for constant-time retrieval of the desired item if the hash function is well-distributed and collisions are handled appropriately. Common hashing techniques include direct addressing, separate chaining, and open addressing.
- Interpolation Search: Similar to binary search, interpolation search works on sorted lists. Instead of always dividing the search range in half, interpolation search uses the value of the target element and the values of the endpoints to estimate its approximate position within the list. This estimation helps in quickly narrowing down the search space. The time complexity of interpolation search is typically O(log log n) on average if the data is uniformly distributed.
- Tree-based Searching: Various tree data structures, such as binary search trees (BST), AVL trees, or B-trees, can be used for efficient searching. These structures impose an ordering on the elements and provide fast search, insertion, and deletion operations. The time complexity of tree-based searching algorithms depends on the height of the tree and can range from O(log n) to O(n) in the worst case.
- Ternary Search: Ternary search is an algorithm that operates on sorted lists and repeatedly divides the search range into three parts instead of two, based on two splitting points. It is a divide-and-conquer approach and has a time complexity of O(log₃ n).
- Jump Search: Jump search is an algorithm for sorted lists that works by jumping ahead a fixed number of steps and then performing linear search in the reduced subarray. It is useful for large sorted arrays and has a time complexity of O(sqrt(n)), where n is the size of the array.
- Exponential Search: Exponential search is a technique for sorted arrays that combines a range-finding phase with binary search. It begins with a small range and doubles the upper bound until the target element falls within the range, then performs a binary search within that range. Exponential search is advantageous when the target element is likely to be found near the beginning of the array: it runs in O(log i) time, where i is the index of the target, which is O(log n) in the worst case.
- Fibonacci Search: Fibonacci search is a searching algorithm that uses Fibonacci numbers to divide the search space. It works on sorted arrays and has a similar approach to binary search, but instead of dividing the array into halves, it divides it into two parts using Fibonacci numbers as indices. Fibonacci search has a time complexity of O(log n).
- Interpolation Search for Trees: This algorithm is an extension of interpolation search designed for tree structures such as AVL trees or Red-Black trees. It combines interpolation search principles with tree traversal to efficiently locate elements in the tree based on their values. The time complexity depends on the tree structure and can range from O(log n) to O(n) in the worst case.
- Hash-based Searching (e.g., Bloom Filter): Hash-based searching algorithms use hash functions and data structures like Bloom filters to determine whether an element is present in a set. These algorithms give probabilistic answers: they can occasionally produce false positives (reporting an element as present when it is not), but never false negatives (an element that is present is never reported as absent). A Bloom filter lookup computes a fixed number of hashes, so its search time is constant with respect to the number of stored elements.
- String Searching Algorithms: Searching algorithms specific to string data include techniques like Knuth-Morris-Pratt (KMP) algorithm, Boyer-Moore algorithm, Rabin-Karp algorithm, and many others. These algorithms optimize the search for patterns within text or strings and are widely used in text processing, pattern matching, and string matching tasks.
Implementation of Searching Algorithms
Linear Search:
Code in C++:
Output: The target value 9 is found at index 4.
Code in C:
Output: The target value 9 is found at index 4.
Explanation:
- The linearSearch function takes three parameters: arr (the array), size (the size of the array), and target (the value to be searched).
- Inside the linearSearch function, a for loop is used to iterate through each element of the array. The loop variable i is initialized to 0, and the loop continues as long as i is less than the size of the array.
- Inside the loop, each element of the array is compared with the target value using the expression arr[i] == target. If a match is found, the function immediately returns the index i, indicating the position of the element in the array.
- If the loop finishes without finding a match, the function returns -1, indicating that the target value was not found in the array.
- In the main function, an example array myArray is created and initialized with values. The target value is set to 9.
- The size of the array is calculated by dividing the total size of the array (sizeof(myArray)) by the size of an individual element (sizeof(myArray[0])). This ensures that the size of the array is correctly passed to the linearSearch function.
- The linearSearch function is called with the array, size, and target value as arguments. The return value is stored in the result variable.
- If the result is not equal to -1, it means that the target value was found in the array. In this case, a message is printed using printf() to indicate the index at which the target value was found.
- If the result is -1, it means that the target value was not found in the array. In this case, a message is printed to indicate that the target value is not present in the array.
- The program terminates by returning 0 from the main function.
Binary Search:
Recursive Code
Output: Element is found at index 1
Explanation:
- The binarySearch function takes an array, the element to search for (x), the lower index (low), and the higher index (high) as parameters.
- It first checks if high is greater than or equal to low. If not, it means the element was not found, and it returns -1.
- If high is greater than or equal to low, it calculates the middle index mid using the formula low + (high - low) / 2.
- It then checks if the element at the middle index (array[mid]) is equal to x. If they are equal, it means the element is found, and it returns the index mid.
- If array[mid] is greater than x, it means the element lies in the left half of the array. In this case, the binarySearch function is called recursively with low unchanged and high set to mid - 1.
- If array[mid] is less than x, it means the element lies in the right half of the array. In this case, the binarySearch function is called recursively with low set to mid + 1 and high unchanged.
- If the element is not found after the recursive calls, the function returns -1.
- In the main function, an array {3, 4, 5, 6, 7, 8, 9} is defined, and the size of the array n is calculated using the formula sizeof(array) / sizeof(array[0]).
- The element x to search for is set to 4.
- The binarySearch function is called with the array, x, and the range of indices 0 to n - 1.
- The returned result is stored in the result variable.
- Finally, the result is checked. If it is -1, it means the element was not found, and "Not found" is printed. Otherwise, the index where the element was found is printed as "Element is found at index result".
Iterative
Output: Element is found at index 1
Explanation:
- The binarySearch function takes an array, the element to search for (x), the lower index (low), and the higher index (high) as parameters.
- It uses a while loop that continues as long as the low pointer is less than or equal to the high pointer.
- Inside the loop, it calculates the middle index mid using the formula low + (high - low) / 2.
- It then checks if the element at the middle index (array[mid]) is equal to x. If they are equal, it means the element is found, and it returns the index mid.
- If array[mid] is less than x, it means the element lies in the right half of the array. In this case, the low pointer is updated to mid + 1.
- If array[mid] is greater than x, it means the element lies in the left half of the array. In this case, the high pointer is updated to mid - 1.
- If the element is not found after the loop, it means the element is not present in the array, and the function returns -1.
- In the main function, an array {3, 4, 5, 6, 7, 8, 9} is defined, and the size of the array n is calculated using the formula sizeof(array) / sizeof(array[0]).
- The element x to search for is set to 4.
- The binarySearch function is called with the array, x, and the range of indices 0 to n - 1.
- The returned result is stored in the result variable.
- Finally, the result is checked. If it is -1, it means the element was not found, and "Not found" is printed. Otherwise, the index where the element was found is printed as "Element is found at index result".
Hashing
Explanation:
- The C implementation starts by including the necessary header files: stdio.h for standard input/output operations and stdlib.h for memory allocation.
- A constant TABLE_SIZE is defined to determine the size of the hash table array.
- A struct Node is defined to represent a key-value pair. It has three members: key, value, and next. The next member is a pointer to the next node in case of collisions.
- A global variable table is declared as an array of struct Node* to store the hash table.
- The hash_function function takes a key as input and returns the index in the hash table array. It uses the modulo operator % with TABLE_SIZE to calculate the index.
- The insert function takes a key and value as input. It calculates the index using the hash_function and inserts a new node containing the key-value pair into the linked list at the calculated index.
- The search function takes a key as input and returns the value associated with the key if found in the hash table. It calculates the index using the hash_function and traverses the linked list at the calculated index to search for the key.
- The main function initializes the hash table, inserts elements into it, and performs a search operation.
Code in Java:
Explanation:
- The Java implementation defines a Node class with key and value fields to represent a key-value pair.
- The HashTable class encapsulates the hash table functionality.
- The class has a constructor that takes the size of the hash table as input and initializes an ArrayList of empty lists to represent the buckets.
- The hashFunction method takes a key as input and returns the index in the hash table array. It uses the division method (key % size) to calculate the index.
- The insert method takes a key and value as input. It calculates the index using the hashFunction and inserts a new Node object into the list at the calculated index.
- The search method takes a key as input and returns the value associated with the key if found in the hash table. It calculates the index using the hashFunction and iterates over the list at the calculated index to search for the key.
- The main method creates an instance of HashTable, inserts elements into it, and performs a search operation.
Interpolation Search:
Output: Element 16 found at index 4
Explanation:
- The interpolation search algorithm assumes that the input array is sorted in ascending order.
- The algorithm maintains two variables: low, which starts at the first index of the array, and high, which starts at the last index of the array.
- The algorithm uses the target value together with arr[low] and arr[high] to estimate the position where the target element is most likely to be. It uses the projection formula:
pos = low + (((double) (high - low) / (arr[high] - arr[low])) * (value - arr[low]))
The formula places pos between low and high in proportion to where value falls between arr[low] and arr[high]. The objective is to estimate the location of the target.
- The algorithm then compares the target value with arr[pos] to determine the next step:
  - If arr[pos] is equal to the target value, the target has been found and the algorithm returns pos.
  - If arr[pos] is less than the target value, the algorithm updates low to pos + 1, since the target value is likely in the upper part of the array.
  - If arr[pos] is greater than the target value, the algorithm updates high to pos - 1, since the target value is likely in the lower part of the array.
- The algorithm repeats the estimate and comparison steps until one of the following conditions is satisfied:
  - The target element is found at position pos, in which case the algorithm returns the position.
  - The low value becomes greater than high, indicating that the target is not in the array. In this case, the algorithm returns -1 to indicate that the element was not found.
- The main function in the example demonstrates the usage of the interpolationSearch function. It initializes an array, its size, and a target value. It then calls the interpolationSearch function to search for the target value in the array.
- Finally, based on the returned index, the main function prints a corresponding message indicating whether the target element was found or not.
Code in Java:
Output: Element 16 found at index 4
Tree based searching:
Depth-First Search (DFS): DFS is a recursive search technique that explores the tree by going as deep as possible before backtracking. In the case of a binary tree, there are three common variants of DFS:
- Pre-order traversal: Visit the current node, then recursively visit the left subtree, and finally the right subtree.
- In-order traversal: Recursively visit the left subtree, visit the current node, and then recursively visit the right subtree. In a binary search tree, this traversal visits the nodes in ascending order.
- Post-order traversal: Recursively visit the left subtree, then the right subtree, and finally visit the current node.
Breadth-First Search (BFS): BFS explores the tree level by level, visiting all the nodes at the same depth before moving to the next level. It uses a queue data structure to keep track of the nodes to visit. BFS ensures that nodes closer to the root are visited before deeper nodes.
DFS:
Output: Depth-First Traversal (DFS): 1 2 4 5 3
Explanation:
- The necessary header files stdio.h and stdbool.h are included, which provide input/output functionality and support for boolean values, respectively.
- The constant MAX_NODES is defined with a value of 100, representing the maximum number of nodes in the binary tree. This value is not currently used in the code.
- The structure Node is defined to represent a node in the binary tree. It contains an integer data to store the node's value, and two pointers left and right pointing to the left and right child nodes, respectively.
- The function createNode is a utility function that takes an integer data as input and creates a new node with the given data value. It allocates memory for the new node using malloc and initializes the left and right pointers to NULL. The function returns a pointer to the newly created node.
- The function DFS is the recursive implementation of Depth-First Search. It takes a pointer to a node as input and performs the DFS traversal. The base case is when the current node is NULL, in which case the function simply returns. Otherwise, it processes the current node by printing its data value using printf. Then, it recursively calls DFS on the left subtree (node->left) and the right subtree (node->right).
- In the main function, a sample binary tree is created. The root node is created using createNode with a data value of 1. Two child nodes are created for the root node with data values 2 and 3, respectively. Further, two child nodes are created for the left child of the root node with data values 4 and 5, respectively.
- Finally, the DFS traversal is performed on the binary tree starting from the root node by calling DFS(root). The values of the nodes are printed in the depth-first order.
BFS:
Output: Breadth-First Traversal (BFS): 1 2 3 4 5
Explanation:
- The necessary header files stdio.h and stdbool.h are included, which provide input/output functionality and support for boolean values, respectively.
- The constant MAX_NODES is defined with a value of 100, representing the maximum number of nodes in the binary tree. This value is used to define the size of the queue in the Queue structure.
- The structure Node is defined to represent a node in the binary tree. It contains an integer data to store the node's value, and two pointers left and right pointing to the left and right child nodes, respectively.
- The structure Queue is defined to represent a queue that will be used for BFS. It contains an array items of Node pointers to store the nodes in the queue, and two integers front and rear to keep track of the front and rear indices of the queue.
- The function createNode is a utility function that takes an integer data as input and creates a new node with the given data value. It allocates memory for the new node using malloc and initializes the left and right pointers to NULL. The function returns a pointer to the newly created node.
- The function createQueue creates an empty queue by allocating memory for the Queue structure and initializing the front and rear indices to -1.
- The function isEmpty checks if the queue is empty by checking if the rear index is -1. If it is, the function returns true, indicating that the queue is empty.
- The function enqueue adds a node to the queue. It first checks if the queue is already full (rear index is equal to MAX_NODES - 1). If so, it prints a "Queue Overflow!" message and returns. Otherwise, it increments the rear index, adds the node to the items array at the new rear index, and updates the front index if it was -1 (indicating an empty queue).
- The function dequeue removes and returns a node from the queue. It first checks if the queue is empty by calling isEmpty. If the queue is empty, it prints a "Queue Underflow!" message and returns NULL. Otherwise, it retrieves the node from the front index, increments the front index, and checks if the front index has surpassed the rear index. If so, it sets both the front and rear indices to -1, indicating an empty queue. Finally, it returns the dequeued node.
- The function BFS is the iterative implementation of Breadth-First Search. It takes a pointer to the root node of the binary tree as input. It starts by creating a queue using createQueue and enqueues the root node. Then, it enters a loop where it dequeues a node from the queue, processes the node by printing its data value using printf, and enqueues its left and right child nodes if they exist. The loop continues until the queue becomes empty, i.e., there are no more nodes to process.
- In the main function, a sample binary tree is created. The root node is created using createNode with a data value of 1. Two child nodes are created for the root node with data values 2 and 3, respectively. Further, two child nodes are created for the left child of the root node with data values 4 and 5, respectively.
- Finally, the BFS traversal is performed on the binary tree starting from the root node by calling BFS(root). The values of the nodes are printed in the breadth-first order.
Ternary Search:
Output: Element found at index 5.
Explanation:
- In this example, we have an array arr containing some elements, and we want to search for a key value (in this case, 23). The ternarySearch function takes the array, the left and right indices of the current segment, and the key to search for.
- The function recursively divides the array into three segments: mid1 = left + (right - left) / 3 and mid2 = right - (right - left) / 3. It checks if the key is equal to arr[mid1] or arr[mid2]. If either condition is true, it returns the respective index.
- If the key is smaller than arr[mid1], the function recurs on the left segment (left to mid1 - 1). If the key is greater than arr[mid2], the function recurs on the right segment (mid2 + 1 to right). Otherwise, if the key is between arr[mid1] and arr[mid2], the function recurs on the middle segment (mid1 + 1 to mid2 - 1).
- If the function doesn't find the key in the array, it returns -1. The main function calls ternarySearch with the appropriate arguments and displays the result accordingly.
Exponential Search
Output: Element found at index 5.
Explanation:
- We have an array arr containing some elements, and we want to search for a key value (in this case, 23). The exponentialSearch function takes the array, the size of the array, and the key to search for.
- The function starts by checking if the key is present at index 0. If so, it immediately returns 0. Otherwise, it determines the range for the binary search by doubling the index i until either the end of the array is reached or arr[i] becomes greater than the key.
- After identifying the range, the function calls the binarySearch function to perform a binary search within that range. The binarySearch function implements the standard binary search algorithm and returns the index of the key if found, or -1 if not found.
- The main function calls exponentialSearch with the appropriate arguments and displays the result accordingly.
Code in Java:
Explanation:
- The exponentialSearch method takes an array arr and a target value target as input and returns the index of the target value in the array. If the target value is not found, it returns -1.
- The method starts by checking if the first element of the array (arr[0]) is equal to the target value. If it is, the method returns 0 since the target is found at the first index.
- If the target value is not found at the first index, the method proceeds with the exponential search algorithm. It initializes a variable bound to 1. This variable represents the upper bound of the search range.
- The algorithm then enters a loop that doubles the bound value until either the bound exceeds the array size (n) or the element at the bound index is greater than the target value. This loop narrows down the search range exponentially.
- After determining the bound, the method calls the binarySearch method to perform a binary search within the determined search range. It passes the array, target value, lower bound (bound / 2), and upper bound (Math.min(bound, n - 1)) to the binarySearch method.
- The binarySearch method is a recursive implementation of the binary search algorithm. It takes the array, target value, left index, and right index as parameters.
- In the binarySearch method, it checks if the left index is greater than the right index. If it is, it means the target value is not present in the search range, so the method returns -1.
- Otherwise, it calculates the middle index (mid) as the average of the left and right indices.
- It then compares the element at the mid index with the target value. If they are equal, the method returns the mid index since the target value is found.
- If the element at the mid index is greater than the target value, it recursively calls the binarySearch method with the left part of the search range (left to mid - 1).
- If the element at the mid index is smaller than the target value, it recursively calls the binarySearch method with the right part of the search range (mid + 1 to right).
- The recursion continues until the target value is found or the search range is narrowed down to the point where the left index becomes greater than the right index.
- In the main method, an example array arr is created with sorted elements. The target value is set to 10.
- The exponentialSearch method is called with the array arr and the target value. The returned index is stored in the index variable.
- Finally, the index variable is checked. If it is not -1, it means the target value was found, and the corresponding message is displayed. Otherwise, if the index is -1, it means the target value was not found, and the appropriate message is displayed.
Fibonacci Search
Code in C:
Output: Element found at index 4.
Explanation:
- The fibonacciSearch function takes three parameters: an integer array arr, the size of the array n, and the element to search for, key.
- The algorithm starts by initializing three variables: fib2, fib1, and fib, representing the (m-2)th, (m-1)th, and mth Fibonacci numbers, respectively. Initially, fib2 is set to 0, fib1 is set to 1, and fib is calculated as the sum of fib2 and fib1.
- The while loop is used to find the smallest Fibonacci number that is greater than or equal to n. It continues updating fib2, fib1, and fib until fib becomes greater than or equal to n.
- After finding the Fibonacci number, the algorithm initializes the offset variable to -1. This variable will be used to mark the eliminated range from the front.
- The main part of the algorithm is a comparison-based search using Fibonacci numbers. The while loop continues until fib becomes 1.
- Inside the loop, the algorithm checks if fib2 is a valid index. If offset + fib2 is within the array bounds (offset + fib2 < n), it assigns offset + fib2 to the variable i. Otherwise, it assigns n - 1 to i.
- If the key element is greater than the value at index i, it means the key is likely to be found in the subarray after index i. In this case, the algorithm cuts the eliminated range from the front by updating fib, fib1, fib2, and offset.
- If the key element is smaller than the value at index i, it means the key is likely to be found in the subarray before index i. In this case, the algorithm cuts the eliminated range from the end by updating fib, fib1, and fib2.
- If the key element is found at index i, the algorithm immediately returns the index.
- If the loop finishes without finding the key, the algorithm performs a final check. It compares the last element of the eliminated range (arr[offset + 1]) with the key. If they match, the algorithm returns offset + 1.
- If the key element is not found in the array, the algorithm returns -1.
- In the main function, an example usage of the fibonacciSearch function is shown. It creates an array, defines the size, and sets the key element to be searched. Then, it calls the fibonacciSearch function and prints the result.
Code in Java:
Output: Element found at index 4.
Explanation:
- The fibonacciSearch method takes two parameters: an integer array arr and the key element to be searched key.
- The algorithm starts by initializing three variables: fib2, fib1, and fib, representing the (m-2)th, (m-1)th, and mth Fibonacci numbers, respectively. Initially, fib2 is set to 0, fib1 is set to 1, and fib is calculated as the sum of fib2 and fib1.
- The while loop is used to find the smallest Fibonacci number that is greater than or equal to the length of the input array arr. It continues updating fib2, fib1, and fib until fib becomes greater than or equal to the array length.
- After finding the Fibonacci number, the algorithm initializes the offset variable to -1. This variable will be used to mark the eliminated range from the front.
- The main part of the algorithm is a comparison-based search using Fibonacci numbers. The while loop continues until fib becomes 1.
- Inside the loop, the algorithm calculates the index i to check if fib2 is a valid index. It takes the minimum value between offset + fib2 and arr.length - 1.
- If the key element is greater than the value at index i, it means the key is likely to be found in the subarray after index i. In this case, the algorithm cuts the eliminated range from the front by updating fib, fib1, fib2, and offset.
- If the key element is smaller than the value at index i, it means the key is likely to be found in the subarray before index i. In this case, the algorithm cuts the eliminated range from the end by updating fib, fib1, and fib2.
- If the key element is found at index i, the algorithm immediately returns the index.
- If the loop finishes without finding the key, the algorithm performs a final check. It compares the last element of the eliminated range (arr[offset + 1]) with the key. If they match, the algorithm returns offset + 1.
- If the key element is not found in the array, the algorithm returns -1.
- In the main method, an example usage of the fibonacciSearch method is shown. It creates an array, sets the key element to be searched, calls the fibonacciSearch method, and prints the result.
Interpolation Search for Trees:
Output: Key 30 found in the binary search tree.
Explanation:
- We start by including the necessary header files, stdio.h and stdlib.h, which provide input/output and memory allocation functions, respectively.
- Next, we define a structure called Node to represent a node in the binary search tree. The structure contains an integer key to store the node's value and two pointers left and right to point to the left and right child nodes, respectively.
- We define a function createNode that takes a key value as input and dynamically allocates memory for a new node. The function initializes the node's key and sets the left and right pointers to NULL.
- The insertNode function is used to insert a new node into the binary search tree. It takes the current root node and the key value as inputs. If the root is NULL, indicating an empty tree, the function creates a new node using the createNode function and returns it. Otherwise, it recursively traverses the tree based on the key value, comparing it with the current node's key. If the key is smaller, it moves to the left child, and if the key is larger, it moves to the right child. The function continues this process until it finds an appropriate position to insert the new node.
- The interpolationSearch function performs the interpolation search algorithm in the binary search tree. It takes the root node and the key value as inputs. The function checks if the root node is NULL or if the key value matches the current node's key. If either of these conditions is true, it returns the current node. Otherwise, it compares the key value with the current node's key and decides whether to search in the left or right subtree. It recursively calls the interpolationSearch function on the appropriate subtree until it finds the key or reaches a NULL node.
- In the main function, we create an empty binary search tree root and insert some nodes using the insertNode function.
- We define a variable searchKey with the value 30, which is the key we want to search for in the binary search tree.
- We call the interpolationSearch function with the root and searchKey as arguments. If the returned result is not NULL, it means the key is found in the tree, so we print a message indicating the key is found. Otherwise, we print a message indicating the key is not found.
String Searching Algorithms:- naïve string search: The naïve string search algorithm is the most basic and direct method. By systematically contrasting each character of the search string with each character of the text, it looks for matches. The temporal complexity of this technique is O(n*m), where n is the text length and m is the length of the search string.
- Knuth-Morris-Pratt (KMP) algorithm: The KMP algorithm improves on the naïve algorithm by using information from characters that have already matched to avoid redundant comparisons. It precomputes a "prefix function" (also called a "failure function") that records, for each position in the search string, how far the match can fall back without rescanning the text. The time complexity of the KMP algorithm is O(n+m), where n is the text length and m is the search string length.
- Boyer-Moore algorithm: The Boyer-Moore algorithm moves the search string along the text from left to right, but compares characters from right to left, starting at the end of the search string. Two heuristics, the "bad character rule" and the "good suffix rule," let it skip alignments that cannot match and move the search window by more than one position at a time. In the best case the Boyer-Moore algorithm runs in O(n/m) time, where n is the length of the text and m is the length of the search string; basic variants can degrade to O(n*m) in the worst case.
- Rabin-Karp algorithm: The Rabin-Karp algorithm searches for a pattern within a text by using hashing. It hashes the search string once, then computes the hash of each window of the text as the window slides along, typically with a rolling hash so that each update is cheap. If the hashes match, a character-by-character comparison confirms the match. The average-case time complexity of the Rabin-Karp algorithm is O(n+m), where n is the length of the text and m is the length of the search string; when many hashes collide, the worst case is O(n*m).
- Aho-Corasick algorithm: The Aho-Corasick algorithm is a multi-pattern string searching algorithm that efficiently searches for multiple search strings simultaneously. It builds a trie that stores all the search strings and adds failure links to turn the trie into a finite-state machine. The algorithm then follows this automaton over the input text in a single pass, finding all occurrences of the search strings. The Aho-Corasick algorithm has a time complexity of O(n+m+z), where n is the length of the text, m is the total length of all search strings, and z is the number of occurrences found.
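To make the difference between the first two entries concrete, here is a hedged sketch of naïve search and KMP in C. The function names naiveSearch, kmpSearch, and buildFailure, and the convention of returning the index of the first match (or -1), are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Naive search: try every alignment of pattern against text.
 * Returns the index of the first match, or -1 if there is none. */
int naiveSearch(const char *text, const char *pattern) {
    size_t n = strlen(text), m = strlen(pattern);
    if (m > n)
        return -1;
    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        while (j < m && text[i + j] == pattern[j])
            j++;
        if (j == m)
            return (int)i;
    }
    return -1;
}

/* fail[i] = length of the longest proper prefix of pattern[0..i]
 * that is also a suffix of pattern[0..i]. */
static void buildFailure(const char *pattern, size_t m, size_t *fail) {
    size_t len = 0;
    fail[0] = 0;
    for (size_t i = 1; i < m; i++) {
        while (len > 0 && pattern[i] != pattern[len])
            len = fail[len - 1];
        if (pattern[i] == pattern[len])
            len++;
        fail[i] = len;
    }
}

/* KMP search: the text index i never moves backwards; on a mismatch
 * only the pattern index j falls back via the failure table.
 * Returns the index of the first match, or -1 if there is none. */
int kmpSearch(const char *text, const char *pattern) {
    size_t n = strlen(text), m = strlen(pattern);
    if (m == 0)
        return 0;
    if (m > n)
        return -1;
    size_t *fail = malloc(m * sizeof *fail);
    if (fail == NULL)
        return -1; /* allocation failure is reported as "not found" here */
    buildFailure(pattern, m, fail);

    int result = -1;
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        while (j > 0 && text[i] != pattern[j])
            j = fail[j - 1];
        if (text[i] == pattern[j])
            j++;
        if (j == m) {
            result = (int)(i + 1 - m);
            break;
        }
    }
    free(fail);
    return result;
}
```

Both functions return the same index for the same inputs. The difference is that naiveSearch may re-read text characters after a mismatch, whereas kmpSearch reads each text character once and pays O(m) up front to build the failure table.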
Example: Boyer-Moore algorithm
Explanation:
- The code begins by including the necessary header files, stdio.h and string.h, which provide functions for input/output and string manipulation.
- The MAX_CHAR constant is defined as 256, the number of distinct values an 8-bit character can take (standard ASCII uses only the first 128 of them).
- The max function is defined, which takes two integers as input and returns the maximum of the two.
- The badCharHeuristic function is defined to preprocess the "bad character" heuristic table. This table is used to determine the shift when a mismatch occurs during the pattern matching process. The function takes the pattern string, its length, and the badChar array as input. It initializes all entries in the badChar array to -1 and then fills in the actual value of the last occurrence of each character in the pattern.
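- The search function implements the Boyer-Moore algorithm for string searching. It takes the text and pattern strings as input. As described here, it relies on the bad character heuristic only and does not use the good suffix rule.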
- Inside the search function, the lengths of the text and pattern strings are calculated using the strlen function.
- The badChar array is declared to store the bad character heuristic table.
- The badCharHeuristic function is called to preprocess the badChar table, passing the pattern, its length, and the badChar array.
- The shift variable is initialized to 0, which represents the shift of the pattern with respect to the text.
- The main search loop starts. It continues while the shift is less than or equal to the difference between the text length and the pattern length, that is, while the pattern still fits inside the text at the current shift.
- Inside the main search loop, another loop is used to compare characters of the pattern and text from right to left, starting from the end of the pattern. It checks for matches until either a mismatch is found or the beginning of the pattern is reached.
- If the entire pattern is matched (i.e., j < 0), a match is found, and the index where the pattern is found is printed using printf. The shift is then advanced to look for the next occurrence: the text character just past the current window is looked up in the badChar table, and the pattern is moved so that its last occurrence of that character lines up with it. If the window already ends at the end of the text, the shift is simply increased by 1.
- If a mismatch occurs during the comparison loop, the shift is adjusted based on the bad character heuristic. The pattern is moved forward by the difference between the mismatch position j and the index of the last occurrence of the mismatched text character in the badChar table. Because that difference can be zero or negative when the character occurs further right in the pattern, the max function takes the larger of 1 and the difference so that the pattern always moves forward.
- Finally, in the main function, an example usage is demonstrated. The text variable is set to "ABAAABCD" and the pattern variable is set to "ABC". The search function is called with these inputs to find occurrences of the pattern within the text.
- The program ends by returning 0 from the main function.
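The walkthrough above can be sketched in C as follows. This is a reconstruction from the description, not the original listing: the exact printf message and the match count returned by search are assumptions added for illustration, and only the bad character rule is implemented.

```c
#include <stdio.h>
#include <string.h>

#define MAX_CHAR 256 /* number of distinct 8-bit character values */

/* Return the larger of two integers. */
int max(int a, int b) {
    return (a > b) ? a : b;
}

/* Record, for every character value, the index of its last
 * occurrence in pattern, or -1 if it does not occur. */
void badCharHeuristic(const char *pattern, int size, int badChar[MAX_CHAR]) {
    for (int i = 0; i < MAX_CHAR; i++)
        badChar[i] = -1;
    for (int i = 0; i < size; i++)
        badChar[(unsigned char)pattern[i]] = i;
}

/* Print every index at which pattern occurs in text.
 * Returning the number of matches is an assumption of this sketch. */
int search(const char *text, const char *pattern) {
    int n = (int)strlen(text);
    int m = (int)strlen(pattern);
    int badChar[MAX_CHAR];
    int count = 0;

    if (m == 0 || m > n)
        return 0;

    badCharHeuristic(pattern, m, badChar);

    int shift = 0; /* position of the pattern relative to the text */
    while (shift <= n - m) {
        int j = m - 1;

        /* Compare right to left until a mismatch or the pattern start. */
        while (j >= 0 && pattern[j] == text[shift + j])
            j--;

        if (j < 0) {
            printf("Pattern found at index %d\n", shift);
            count++;
            /* Align the character just past the window with its last
             * occurrence in the pattern, or step by 1 at the text end. */
            shift += (shift + m < n)
                         ? m - badChar[(unsigned char)text[shift + m]]
                         : 1;
        } else {
            /* Bad character rule; max keeps the shift positive. */
            shift += max(1, j - badChar[(unsigned char)text[shift + j]]);
        }
    }
    return count;
}
```

Called as search("ABAAABCD", "ABC"), the inputs used in the walkthrough's main function, this sketch reports a single match at index 4.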