- Storage Gateway is an AWS service that connects an on-premises software appliance with cloud-based storage, providing secure integration between an organization's on-premises IT environment and the AWS storage infrastructure.
Note: Here, on-premises means that an organization keeps its IT environment on site, while the cloud is kept off site with someone else responsible for its maintenance.
- The Storage Gateway service allows you to securely store data in the AWS cloud for scalable and cost-effective storage.
- Storage Gateway is a virtual appliance installed on a hypervisor running in your data center; it replicates data to AWS, particularly to S3.
- The Storage Gateway virtual appliance is available for download as a virtual machine (VM) image, which you can install on a host in your data center.
- The VM image runs on either VMware ESXi or Microsoft Hyper-V.
- Once you have installed the storage gateway, link it to your AWS account through the activation process; you can then use the AWS Management Console to create the storage gateway option that you need.
There are three types of Storage Gateways:
- File Gateway (NFS)
- Volume Gateway (iSCSI)
- Tape Gateway (VTL)
Storage Gateway is categorized into three parts: File Gateway, Volume Gateway, and Tape Gateway. Volume Gateway is further classified into two parts: Stored Volumes and Cached Volumes.
- File Gateway uses the NFS (Network File System) protocol.
- It is used to store flat files, such as Word documents, PDF files, pictures, and videos, in S3.
- Files are written directly to S3.
- Files are stored as objects in S3 buckets and are accessed through an NFS mount point.
- Ownership, permissions, and timestamps are durably stored in S3 in the user metadata of the object associated with the file.
- Once the objects are transferred to S3, they can be managed as native S3 objects, and bucket-level features such as versioning, lifecycle management, and cross-region replication apply directly to the objects stored in your bucket.
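The mapping from a file to an object with attached metadata can be pictured with a small sketch. It does not call AWS; it only builds a description of what such an object might look like. The function name and the metadata key names (`owner-uid`, `permissions`, and so on) are assumptions made for illustration, not the names the service actually uses.

```python
import os
import stat
import tempfile


def describe_as_object(path: str, share_root: str) -> dict:
    """Build a toy description of the object a shared file could map to.

    The metadata key names are hypothetical; they only illustrate that
    ownership, permissions, and timestamps travel with the object.
    """
    info = os.stat(path)
    # The object key is the file's path relative to the shared directory.
    key = os.path.relpath(path, share_root).replace(os.sep, "/")
    return {
        "key": key,
        "size": info.st_size,
        "user_metadata": {
            "owner-uid": str(info.st_uid),
            "owner-gid": str(info.st_gid),
            "permissions": oct(stat.S_IMODE(info.st_mode)),
            "mtime-seconds": str(int(info.st_mtime)),
        },
    }


def demo() -> dict:
    """Create one small file inside a temporary share and describe it."""
    with tempfile.TemporaryDirectory() as share_root:
        os.makedirs(os.path.join(share_root, "reports"))
        path = os.path.join(share_root, "reports", "summary.pdf")
        with open(path, "wb") as handle:
            handle.write(b"%PDF-1.4 example")
        return describe_as_object(path, share_root)
```

Calling `demo()` returns a dictionary whose `key` is `reports/summary.pdf`, which mirrors how a directory path on the NFS share becomes a prefix in the bucket.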
Architecture of File Gateway
- Storage Gateway is a virtual machine running on premises.
- Storage Gateway is typically connected to AWS over the internet.
- It can also use Direct Connect, a dedicated network connection between the data center and AWS.
- It can also run inside an Amazon VPC (Virtual Private Cloud), which is a virtual data center. This means that the application server and the storage gateway do not need to be on premises: the storage gateway sits inside the VPC and sends the data to S3 from there.
- Volume Gateway is an interface that presents your applications with disk volumes using the iSCSI block protocol. iSCSI provides block-based storage, which can hold an operating system, applications, and databases such as SQL Server.
- Data written to these volumes can be asynchronously backed up as point-in-time snapshots of the volumes and stored in the cloud as EBS snapshots, where EBS (Elastic Block Store) is a virtual hard disk that is attached to an EC2 instance. In short, Volume Gateway takes your virtual hard disks and backs them up to AWS.
- Snapshots are incremental backups, so only the changes made since the last snapshot are backed up. All snapshot storage is also compressed to minimize your storage charges.
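The idea of incremental, compressed snapshots can be illustrated with a toy model. This is a sketch of the general technique, not how the service implements it: the class name `SnapshotChain`, the block-dictionary representation of a volume, and the use of `zlib` are all assumptions, and the model only handles blocks that are added or overwritten.

```python
import zlib
from typing import Dict, List


class SnapshotChain:
    """Toy model: each snapshot stores only blocks changed since the last one."""

    def __init__(self) -> None:
        self._last_seen: Dict[int, bytes] = {}
        self.snapshots: List[Dict[int, bytes]] = []

    def take_snapshot(self, volume: Dict[int, bytes]) -> int:
        """Store changed blocks (compressed) and return how many were stored."""
        changed = {
            index: zlib.compress(data)
            for index, data in volume.items()
            if self._last_seen.get(index) != data
        }
        self.snapshots.append(changed)
        self._last_seen = dict(volume)
        return len(changed)

    def restore(self) -> Dict[int, bytes]:
        """Rebuild the latest state by replaying every snapshot in order."""
        volume: Dict[int, bytes] = {}
        for snapshot in self.snapshots:
            for index, blob in snapshot.items():
                volume[index] = zlib.decompress(blob)
        return volume


chain = SnapshotChain()
volume = {0: b"boot", 1: b"data-v1"}
first = chain.take_snapshot(volume)   # both blocks are new
volume[1] = b"data-v2"
second = chain.take_snapshot(volume)  # only block 1 changed
```

The second snapshot holds a single block, yet `chain.restore()` still yields the full, current volume because earlier snapshots supply the unchanged blocks.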
Volume Gateway is of two types:
- Stored volumes keep an entire copy of the data locally and asynchronously back it up to AWS.
- Stored volumes provide low-latency access to the entire datasets of your on-premises applications, along with off-site backups.
- You can create a stored volume as a virtual storage volume that is mounted as an iSCSI device on your on-premises application services, such as data services and web services.
- Data written to your stored volume is stored on your local storage hardware, and this data is asynchronously backed up to Amazon Simple Storage Service (S3) in the form of Amazon Elastic Block Store (EBS) snapshots.
- The size of a stored volume ranges from 1 GB to 16 TB.
Architecture of Volume Gateway
- A client talks to a server, which could be an application server or a web server.
- The application server has an iSCSI connection to the Volume Gateway.
- The Volume Gateway is installed on the hypervisor.
- The volume storage, also known as a virtual hard disk, is stored on physical infrastructure; in this architecture, the virtual hard disk is 1 TB in size.
- The volume storage takes snapshots and sends them to the upload buffer.
- The upload buffer performs multiple uploads to S3, and all of these uploads are stored as EBS snapshots.
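The stored-volume data path described above (write locally first, upload later) can be sketched as a small in-memory model. The class and attribute names are hypothetical, and the `cloud_backup` dictionary merely stands in for the snapshot data held in AWS.

```python
from collections import deque


class StoredVolume:
    """Toy model: writes land on local storage first and reach the cloud later."""

    def __init__(self) -> None:
        self.local = {}               # full copy of the data, kept on site
        self.upload_buffer = deque()  # writes waiting to be sent
        self.cloud_backup = {}        # stands in for backup data held in AWS

    def write(self, block: int, data: bytes) -> None:
        """Store the block locally and queue it for a later upload."""
        self.local[block] = data
        self.upload_buffer.append((block, data))

    def read(self, block: int) -> bytes:
        """Reads are always served from the local copy."""
        return self.local[block]

    def drain_upload_buffer(self) -> int:
        """Send every queued write to the backup and return how many were sent."""
        sent = 0
        while self.upload_buffer:
            block, data = self.upload_buffer.popleft()
            self.cloud_backup[block] = data
            sent += 1
        return sent
```

Because `write` returns before anything is uploaded, the application sees local-disk latency, and the backup catches up whenever `drain_upload_buffer` runs. That separation is what "asynchronously backed up" means in this model.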
- Cached volumes keep the most recently accessed data on site, while the rest of the data is stored in AWS.
- Cached volumes let you use Amazon Simple Storage Service (S3) as your primary data storage while keeping a copy of the recently accessed data locally in your storage gateway.
- Cached volumes minimize the need to scale your on-premises storage infrastructure while still giving your applications low-latency access to frequently accessed data.
- The gateway stores the data that you write to the volume in S3 and retains only recently read data in the on-premises storage gateway.
- The size of a cached volume ranges from 1 GB to 32 TB.
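The two size ranges quoted for stored and cached volumes can be captured in a small validation helper. This is only a sketch: the function name is hypothetical, and treating GB and TB as binary units (GiB and TiB) is an assumption.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

# Limits as quoted in the text, interpreted as binary units (an assumption).
VOLUME_LIMITS = {
    "stored": (1 * GIB, 16 * TIB),
    "cached": (1 * GIB, 32 * TIB),
}


def is_valid_volume_size(mode: str, size_bytes: int) -> bool:
    """Return True if size_bytes falls within the quoted range for the mode."""
    minimum, maximum = VOLUME_LIMITS[mode]
    return minimum <= size_bytes <= maximum
```

For example, a 20 TiB volume passes the check in `"cached"` mode but fails it in `"stored"` mode.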
Architecture of Cached Gateway
- A client is connected to the application server, and the application server has an iSCSI connection to the gateway.
- The data sent by the client is stored in the cache storage and then placed in the upload buffer.
- The data from the upload buffer is transferred to the virtual disks, i.e., the volume storage, which sits in Amazon S3.
- Volume storage is block-based, whereas S3 is object-based, so block data cannot be stored in S3 as is. Therefore, snapshots, i.e., flat files, are taken, and these flat files are then stored in S3.
- The most recently read data is kept in the cache storage.
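The cached-volume behavior can be approximated with a least-recently-used cache in front of a dictionary that plays the role of S3. This is a simplified sketch under stated assumptions: the class name, the block-level granularity, and the LRU eviction policy are illustrative choices, not a description of the gateway's internals.

```python
from collections import OrderedDict


class CachedVolume:
    """Toy model: the cloud holds everything, the site keeps recent blocks."""

    def __init__(self, cache_blocks: int) -> None:
        self.cache_blocks = cache_blocks
        self.cache = OrderedDict()  # recently used blocks, oldest first
        self.upload_buffer = []     # writes not yet sent to the cloud
        self.cloud = {}             # primary copy, standing in for S3

    def flush(self) -> None:
        """Send every buffered write to the cloud copy."""
        for block, data in self.upload_buffer:
            self.cloud[block] = data
        self.upload_buffer.clear()

    def _remember(self, block: int, data: bytes) -> None:
        self.cache[block] = data
        self.cache.move_to_end(block)
        if len(self.cache) > self.cache_blocks:
            self.flush()  # never evict data that has not reached the cloud
            while len(self.cache) > self.cache_blocks:
                self.cache.popitem(last=False)

    def write(self, block: int, data: bytes) -> None:
        """Cache the block locally and queue it for upload."""
        self.upload_buffer.append((block, data))
        self._remember(block, data)

    def read(self, block: int):
        """Return (data, source), where source is 'cache' or 'cloud'."""
        if block in self.cache:
            self.cache.move_to_end(block)
            return self.cache[block], "cache"
        data = self.cloud[block]
        self._remember(block, data)
        return data, "cloud"
```

With `cache_blocks=2`, writing three blocks pushes the oldest one out of the local cache; reading it back is served from the cloud once and from the cache afterwards.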
- Tape Gateway is mainly used for taking backups.
- It uses a Virtual Tape Library (VTL) interface.
- Tape Gateway offers a durable, cost-effective solution for archiving your data in the AWS cloud.
- The VTL interface lets your existing tape-based backup application infrastructure store data on virtual tape cartridges that you create on your Tape Gateway.
- It is supported by backup applications such as NetBackup, Backup Exec, and Veeam. Instead of physical tapes, they use virtual tapes, and these virtual tapes are stored in Amazon S3.
Architecture of Tape Gateway
- Servers are connected to the backup application, which can be NetBackup, Backup Exec, Veeam, etc.
- The backup application is connected to the Storage Gateway over an iSCSI connection.
- The Tape Gateway is represented as a virtual appliance connected over iSCSI to the backup application.
- Virtual tapes are uploaded to Amazon S3.
- A lifecycle management policy can then archive the tapes to the virtual tape shelf in Amazon Glacier.
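The movement of a virtual tape between the library and the shelf can be modeled as a two-state object. The sketch below is deliberately simplified: the class names, the barcode value, and the rule that an archived tape must be retrieved before it can be read are assumptions of this model, and it does not reproduce the service's real tape statuses.

```python
from enum import Enum


class TapeLocation(Enum):
    LIBRARY = "virtual tape library (Amazon S3)"
    SHELF = "virtual tape shelf (Amazon Glacier)"


class VirtualTape:
    """Toy model of a virtual tape cartridge and where it currently lives."""

    def __init__(self, barcode: str) -> None:
        self.barcode = barcode  # hypothetical identifier
        self.location = TapeLocation.LIBRARY
        self.records = []

    def write(self, record: bytes) -> None:
        """Append a backup record; only tapes in the library accept writes."""
        if self.location is not TapeLocation.LIBRARY:
            raise RuntimeError("tape is archived and cannot be written")
        self.records.append(record)

    def read(self) -> list:
        """Return all records; the tape must be in the library to be read."""
        if self.location is not TapeLocation.LIBRARY:
            raise RuntimeError("retrieve the tape from the shelf first")
        return list(self.records)

    def archive(self) -> None:
        """Move the tape to the shelf for long-term, low-cost storage."""
        self.location = TapeLocation.SHELF

    def retrieve(self) -> None:
        """Bring the tape back from the shelf into the library."""
        self.location = TapeLocation.LIBRARY
```

A backup job would write to a tape while it sits in the library, archive it once it is full, and retrieve it only when a restore is needed.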
Important points to remember:
- File Gateway is used for object-based storage: flat files such as Word documents and PDF files are stored directly in S3.
- Volume Gateway is used for block-based storage and uses the iSCSI protocol.
- Stored volumes are a Volume Gateway mode in which the entire dataset is kept on site and backed up to S3.
- Cached volumes are a Volume Gateway mode in which the entire dataset is stored in the cloud (Amazon S3) and only the most frequently accessed data is kept on site.
- Tape Gateway is used for backup and works with popular backup applications such as NetBackup, Backup Exec, and Veeam.
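The points above can be condensed into a small decision helper. It is a study aid rather than an official selection guide; the function name and the `access` labels (`"file"`, `"block"`, `"tape"`) are made up for this sketch.

```python
def choose_gateway(access: str, keep_full_copy_on_site: bool = False) -> str:
    """Map a storage need to the gateway type summarized above.

    access: "file" (NFS shares), "block" (iSCSI volumes), or "tape" (VTL backups).
    keep_full_copy_on_site: only relevant for block storage.
    """
    if access == "file":
        return "File Gateway"
    if access == "tape":
        return "Tape Gateway"
    if access == "block":
        if keep_full_copy_on_site:
            return "Volume Gateway (stored volumes)"
        return "Volume Gateway (cached volumes)"
    raise ValueError("unknown access type: " + access)
```

For instance, `choose_gateway("block", keep_full_copy_on_site=True)` returns the stored-volumes option, matching the case where the whole dataset stays on site and S3 holds the backup.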