Advance Data Structures
Advanced data structures are one of the most important disciplines of data science since they are used for storing, organizing, managing data and information to make it more efficient, easier to access, and modify. They are the foundation for designing and developing efficient and effective software and algorithms. Knowing how to create and construct a decent data structure is essential for being a skilled programmer. With the rise of new information technology, working practices, its scope is likewise expanding.
The efficiency and performance of many of the algorithms directly depend upon the data that particular algorithm is using for calculations and other data operations. Data structures perform the basic operation of storing data during the run-time or execution of the program in the main memory, so memory management is also an important factor as it will directly affect the space complexity of a program or algorithm. So choosing an appropriate data structure plays an important role in program efficiency and performance.
List of Advanced Data Structures
Advanced-Data structures have grown into many manifolds. The broad categories into which advanced data structures are divided are as follows:
- Primitive types
- Composite or non-primitive type
- Abstract data types
- Linear Data Structures
- Tree types
- Hash-based structures
Primitive Data Structures
Primitive data structures are fundamental structures that are directly manipulated by machine instructions. On different computers, primitive data structures have different representations. Primitive data structures include integers, floats, characters, and pointers. In general, there are eight data types, such as:
1. Boolean Data Type: A Boolean data type is made up of a single bit of information that can only store true or false values. True or false conditions are tracked using this data type, and Boolean data types are also used to store the result of various conditions. Let's write a small program and see how it works.
2. Byte Data Type: The byte data type is an illustration of a primitive data type. It is a signed two's complement integer of 8 bits, and it stores whole numbers ranging from -128 to 127. A byte data type is useful for saving large amounts of memory. Let's write a small program and see how it works.
3. Char data type: A single character is stored in this data type, and the character must be enclosed in single quotes, such as 'E' or 'e'. You can also use ASCII values to display specific characters. Let's look at a simple example to see how it works.
4. Short data type: A short data type is larger than a byte but smaller than an integer. It saves values ranging from -32,768 to 32767. This data type default size is 2 bytes. Let's look at an example to understand the short data type better.
5. Int data type: This data type is capable of storing whole numbers ranging from -2147483648 to 2147483647. When creating variables with numeric values, int is generally the preferred data type.
6. Long data type: This data type is a two's complement 64-bit integer. A long data type's default size is 64 bits, and its value ranges from -263 to 263-1.
7. Float data type: You should use a floating-point type when you need a number with a decimal, such as 8.88 or 3.14515. This data type supports fractional numbers ranging from 3.4e038 to 3.4e+038. It is important to note that the value should end with an "f." Let's look at a specific example to understand this data type better.
8. Double data type: The double data type can store fractional numbers from 1.7e-308 to 1.7e+308. Note that you should end the value with a "d".
Non-Primitive or Composite Data Structures
Non-primitive data structures are those that are created by combining primitive data structures. It is a little complicated because it is derived from primitive data structures, and we can also say that it is a collection of the same or different data items. Some of the examples of Non-Primitive or Composite Data Structures are:
- Array: An array is a data structure of items that are stored in adjacent memory locations. The idea is to group objects of the same type. This makes calculating the position of each component quicker by simply putting an offset to a base value, i.e., the memory address of the array's first item (generally denoted by the name of the array).
- Records: A record is a simple data structure, and it is also known as structure, struct, or composite data. Rows are the names given to records in a database or spreadsheet. A record is a collection of fields, which may be of different types of data and typically in a fixed number and sequence. Fields of a record may also be referred to as members, especially in object-oriented programming; fields may also be referred to like elements, though this may confuse collection components.
- Union: A union is a value that can have multiple representations or layouts within the same memory location. It is made up of a variable that can hold such a data structure. To define such values and variables, some programming languages support special data types known as union types. In other words, a union type description specifies which of several allowed primitive types may be stored in its instances, such as "float or long integer". In contrast to a record (or structure), which can be defined to contain a float and an integer, a union only has one value at any given point in time.
Abstract Data Type
ADT is a type or class of an object for whom the behavior is based on a set of values and a set of functions. The concept of ADT only mentions what operations must be performed, not how these operations will be carried out. It does not specify how data will be stored in memory or which algorithms will be used to carry out the operations. It is called "abstract" because it provides a view that is independent of implementation. Abstraction is a process of presenting only the essentials while hiding the details.
The operator of a data type doesn't want to understand how that data type is implemented. For example, we have used primitive values such as int, float, and char data types to understand that they can operate and be performed without understanding how they are implemented. As a result, a user only needs to know what a data type can be, not how it will be implemented. Let's see some of the examples of the Abstract Data types:
List Data Structure: A list or sequence is an abstract data type that symbolizes a finite number of ordered values, where the same value may occur multiple times. A list instance is a computer illustration of the mathematical concept of a tuple or finite sequence. A stream is the (possibly) infinite analog of a list. Lists are the most basic type of container because they contain other values. If the same value appears more than once, each incidence is treated as a separate item.
All programming languages support list data types, lists, and list operations have their own syntax and semantics. A list is frequently formed by composing the items in sequence, separated by commas, semicolons, and/or spaces, within a pair of delimiters such as parentheses '(),' brackets '', braces ", or angle brackets '>'. List types may be indexed or sliced like array types in some languages, in which case the data type is more correctly described as an array.
Queue Data Structure: A queue is a collection of elements that are kept in a sequence and can be changed by adding entities at one end of the sequence and removing entities from the other end. By protocol, the end of the series at which elements are added is referred to as the back, tail, or rear of the queue. The end of the sequence at which elements are removed is referred to as the head or front of the queue, similar to how people line up to wait for goods or services.
A queue's operations make it a first-in-first-out (FIFO) data structure. The first element added to the queue in a FIFO data structure is also the first element removed. This is equitable to the necessity that when a new element is added, all previous elements must be removed before the new element can be deleted. A queue is a type of linear data structure, also known as a sequential collection. Queues are commonly used in computer programs, where they are incorporated as data structures with access procedures, abstract data structures, or as classes in object-oriented languages.
Stack Data Structure: A stack is an abstract data type that acts as a collection of elements and has two primary operations:
Push adds an element to the collection, whereas Pop removes the most recently added element that has not yet been removed.
The order in which elements are removed from a stack gives rise to the acronym LIFO (last in, first out). Furthermore, a peek operation may provide access to the top of the stack without modifying the stack. The term "stack" refers to this type of structure because it is analogous to a collection of physical items stacked on top of each other.
Linear Data Structure
If the data structure elements form a linear pattern or sequence, the data structure is linear.
- Control tables: The tables that control the program are referred to as control tables. They don't have strict guidelines and can be easily modified according to your needs.
- Image: A pictorial representation of the entire computer system which can reproduce images after they are shut.
- Matrix: A matrix is a two-dimensional data structure with the same type of elements in each dimension. A data frame is two-dimensional, with different columns containing different data types; however, all values within a column must be of the same data type, and all columns must be the same length.
- Lists: A list or sequence is an abstract data type that symbolizes a finite number of ordered values, where the same value may occur multiple times. A list instance is a computer illustration of a tuple or finite sequence; a stream is the (possibly) infinite analog of a list.
A tree is a broadly utilized abstract data type representing a hierarchical tree structure as a set of linked nodes, with a root value and subtrees of children with a parent node.
A tree data structure can be defined iteratively as a collection of nodes, with each node containing a value and a list of references to other nodes. The "root node" is the beginning of the tree, and the "children" are the reference nodes. There are no duplicate references, and none point to the root.
- Binary Tree: A binary tree can be defined as one of the trees in which only two children can be added to each parent node. The child nodes are known as the left child node and right child node. A binary tree is one of the most popular trees. When we apply various constraints and characteristics to a Binary tree, other trees such as AVL tree, BST (Binary Search Tree), RBT tree, etc., are formed. We will explain in detail these types of trees in further discussion. In other words, we can say that a generic tree whose elements have at most two children is called a binary tree. Since each element in a binary tree can have only 2 children, we typically name them the left and right children.
- Decision Trees: A decision tree is a decision-making tool that employs a tree-like model of decisions and their potential outcomes, such as possibility event outcomes, resource costs, and utility. It is one method of displaying an algorithm that consists solely of conditional control statements. Decision trees are a standard tool in machine learning. They are widely used in operations research, particularly in decision analysis, to help us identify the strategy most likely to achieve a goal.
- Other tree types include B-trees, dancing trees, fusion trees, heap, Leonardo heap, radix tree, suffix tree, FM-index, Spaghetti stack, rose tree, Fenwick tree, space portioning trees, interval trees, segment trees, cover trees, minimax tree, finger tree, parse trees, expression trees, weighted balanced tree.
Here are the following hash-based structures types:
- Hash List: A hash list is simply a set of hash values associated with groups of data items linked with each other in a file or folder system or another connective array format. Hash lists are used to analyze data in a database or other environment, access one or more of these items, determine the size of an array, or for other investigative purposes. When it comes to data security, hash lists are also very useful. Putting a hash into a list, rather than using a single hash value for an entire block, makes it easier to check incoming input over a peer-to-peer network or other connectivity model and determine whether any individual data set corresponding to a hash value on the list has been compromised. Analyzing a set of data blocks using a hash list divides the analysis and makes it easier to detect destructive hacking. This is a common application of hash lists in a hash cryptography system.
- Double Hashing: Double hashing is a computer programming technique that uses a secondary hash of the key as an offset when a collision occurs in hash tables in conjunction with open addressing to resolve hash collisions. A classic data structure on a table T is double hashing with open addressing. The double hashing technology utilizes one hash value as an index into the table and then moves forward with an interval until the intended value is found, a vacant position is reached, or the entire table has been tried to search; however, this interval is determined by a separate, individual hash function. Unlike the alternative collision-resolution methods of linear and quadratic probing, the interval is determined by the data so that values mapping to the same position have different bucket patterns; this minimizes repeated collisions and the effects of clustering.
A graph data structure is made up of a finite (and potentially mutable) set of vertices (also known as nodes or points) and a set of unordered pairs for an undirected graph or a set of ordered pairs for a directed graph. These pairs are recognized as edges (sometimes known as links or lines) in a directed graph, but they are also known as arrows or arcs in some cases. The vertices could be internal graph elements or external items represented by integer indices or references.
So depending upon the position of these nodes and vertices, there are different types of graphs. In this article, the different types of graphs that we are going to see are Null Graph, Trivial Graph, Non-directed Graph, Directed Graph, Connected Graph, Disconnected Graph, Regular Graph, Complete Graph, Cycle Graph, Cyclic Graph, Acyclic Graph, Finite Graph, Infinite Graph, Bipartite Graph, Planar Graph, Simple Graph, Multi Graph, Pseudo Graph, Euler Graph, Hamiltonian Graph.
- Complete Graph: A graph is said to be a complete graph if there is an edge between every pair of the graph for all the graph's vertices. In other words, we can say that all the vertices are connected to the rest of all the vertices of the graph. A complete graph of 'n' vertices contains exactly nC2 edges, and a complete graph of 'n' vertices is represented as Kn.
- Regular Graph: It should satisfy one primary condition to be called a regular: all graph vertices should have the same degree. By the degree of vertices, we mean the number of nodes associated with a particular vertice of the graph. If all the graph nodes have the same degree value, then the graph is called a regular graph. If all the vertices of a graph have the degree value of six, then the graph is called a 6-regular graph. If all the vertices in a graph are of degree 'k', then it is called a "k-regular graph".