Elasticsearch is a distributed search engine used for full-text search. In this section, we are going to discuss the physical architecture of Elasticsearch. In which we will see how documents are distributed across the physical or virtual machine. Along with it, we will also see how machines work together to form a cluster.
In Elasticsearch architecture, node and cluster play an important role. These are the center of Elasticsearch architecture. Each node in a cluster handles the HTTP request for a client who wants to send the request to the cluster.
Node and Cluster
Before begin, we need to know about the nodes and clusters to understand the architecture of Elasticsearch, as these are the center of Elasticsearch architecture. These are the essential part of elasticsearch. By default, each node in a cluster can handle transport traffic and HTTP requests. Node and cluster are discussed below in detail:
A node is a server and a part of the cluster that stores the data. It can be either virtual or physical. A node refers to an instance of Elasticsearch, not a machine. Therefore, any number of nodes can run on the same machine. Whenever an elasticsearch instance starts, a node starts running.
An Elasticsearch cluster is a group of Elasticsearch nodes, which are connected to each other and together stores all of your data. Each node contains a part of the cluster's data that you add to the cluster. You can use any number of clusters, but one node is usually sufficient. A cluster is automatically created when a node starts up.
Each and every node be a part of the cluster. It participates in searching and indexing of clusters, which means that a node participates in search query by searching the data stored by it. A node stores the data, which is searched by the search query. Let's understand with the help of an example -
You might have two nodes - Node A and Node B. Both nodes have some data, and that data is a match of the given search query. Here, we need to understand that a node contains the part of your data, which is searched by a search query. The node supports the following operations, such as - indexing and searching for data or manipulating existing data.
Elasticsearch stores your data in document form. Look at the below example of the data store in elasticsearch.
For example -
This data is stored in _source field inside the JSON object as you can see below:
The data is organized within the indices. Because every document within Elasticsearch, stored inside an index. An Index collects all the documents together logically and also provides a configuration option that is related to scalability and availability. So, whenever we need to search for data, execute search queries against the indices.
Elasticsearch architecture is highly scalable due to sharding, unless you are dealing with a large amount of data.