Tuesday, 22 July 2014







STORAGE AREA NETWORK (SAN)

A Brief Documentary by

Soumyajit Basu

Student Of

SYMBIOSIS INSTITUTE OF COMPUTER STUDIES AND RESEARCH (SICSR)

AFFILIATED UNDER

SYMBIOSIS INTERNATIONAL UNIVERSITY (SIU)



ABSTRACT
           
Data is an integral part of any technology. The central concern today is how to manage storage resources so that they can accommodate more data for future business needs, and how to provide consistent, highly available access to that data. Storage Area Networks enable new ways of moving, storing and accessing data that promote more reliable and cost-effective enterprise-wide information processing. They also help balance load across storage devices, which improves the availability of data and makes it easier to back up.
STORAGE AREA NETWORK
A Storage Area Network is a group of storage devices that communicate with each other and with the servers over a network. What sets a SAN apart from conventional storage is “Universal Storage Connectivity”: every server can reach every storage device, which makes it straightforward to share data between servers.

For example, in the above figure all of the servers are physically connected to the storage devices. If Server F needs to access the data of Server E, there is no need to copy the data, because Server F can access the devices on which Server E stored it. Server F only needs permission to access Server E’s data.

Universal Storage Connectivity has a few beneficial implications.
·        There’s no need to schedule or check data transfer between pairs of servers.

·        There’s no need to purchase and maintain extra storage to temporarily stage one server’s data onto another.
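
The Server E and Server F example can be sketched in a few lines of Python. This is only an illustrative model: the class SharedDevice and its methods write, grant and read are hypothetical names invented for this sketch, not part of any SAN product. It shows that once permission is granted, Server F reads the very same stored data, with no copy made in between.

```python
# Hypothetical model of a storage device shared over a SAN.
# SharedDevice, write, grant and read are illustrative names only.
class SharedDevice:
    """A storage device that every server on the SAN can reach."""

    def __init__(self):
        self._data = {}      # (owner, name) -> stored bytes
        self._readers = {}   # (owner, name) -> servers allowed to read

    def write(self, owner, name, data):
        self._data[(owner, name)] = data
        # The owner can always read its own data.
        self._readers.setdefault((owner, name), {owner})

    def grant(self, owner, name, server):
        self._readers[(owner, name)].add(server)

    def read(self, server, owner, name):
        if server not in self._readers[(owner, name)]:
            raise PermissionError(f"{server} may not read {owner}/{name}")
        return self._data[(owner, name)]


device = SharedDevice()
device.write("Server E", "orders.db", b"order records")

# Without permission, Server F is refused.
try:
    device.read("Server F", "Server E", "orders.db")
    denied = False
except PermissionError:
    denied = True

# With permission, Server F reads the data in place; nothing is copied.
device.grant("Server E", "orders.db", "Server F")
shared = device.read("Server F", "Server E", "orders.db")
```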

2.1   What makes Storage Area Network good?
If a SAN is to provide the I/O backbone for information services, it needs a few qualities.
·        The SAN must be highly available, since failing to retrieve data is not acceptable. A good SAN implementation has built-in mechanisms to tolerate the failures that can reasonably be anticipated.

·        The I/O performance of a SAN must be scalable: it must grow as the number of interconnected devices grows, so that the SAN can keep pace with the business.
2.2   Why connect to a Storage Area Network?
·        For providing Universal Connectivity.
·        Higher Availability of data, without compromising data consistency.

·        High performance.

·        Reduced cost of providing information services that contribute positively to overall enterprise goals.

·        Lower overall cost. When a storage cluster is formed over the network, there is no need for the temporary storage otherwise required to stage data produced by one computer and used by another.

2.3   Architecture of Storage Area Network   
In systems that include a Storage Area Network, the disks are connected to RAID controllers or to storage servers. External RAID controllers organize disks into arrays and present the volumes to the client. External RAID controllers may connect directly to the SAN, to storage servers, or to application or database servers.
In a SAN, the RAID controllers divide the data into blocks (chunks) and write those chunks across the disks of the array. This is called striping. To maintain availability, striping is combined with mirroring: a duplicate copy of the data is kept on separate disks, so that if an error occurs while reading from one disk the system can fall back to the mirrored copy.
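
Striping and mirroring can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: disks are plain Python lists, the chunk size of 4 bytes is chosen only to keep the example readable, and the function names stripe, mirror and read_back are made up for this illustration. A real RAID controller does this in hardware or firmware with much larger chunks.

```python
CHUNK_SIZE = 4  # assumed for illustration; real stripe units are far larger


def stripe(data, disk_count, chunk_size=CHUNK_SIZE):
    """Split data into chunks and deal them across the disks in turn."""
    disks = [[] for _ in range(disk_count)]
    for offset in range(0, len(data), chunk_size):
        chunk_number = offset // chunk_size
        disks[chunk_number % disk_count].append(data[offset:offset + chunk_size])
    return disks


def mirror(disks):
    """Keep a duplicate of every disk."""
    return [list(disk) for disk in disks]


def read_back(primary, mirrored, failed=()):
    """Reassemble the data, using the mirror for any failed primary disk."""
    sources = [mirrored[n] if n in failed else primary[n]
               for n in range(len(primary))]
    chunks = []
    for row in range(max(len(disk) for disk in sources)):
        for disk in sources:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


data = b"ABCDEFGHIJKLMNOPQR"
primary = stripe(data, disk_count=2)
mirrored = mirror(primary)

primary[0] = []  # simulate losing the first primary disk
recovered = read_back(primary, mirrored, failed={0})
```

Even with the first disk lost, read_back rebuilds the original bytes from the mirrored copy, which is the availability benefit described above.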

When storage is managed by RAID controllers, the volumes are usually called virtual disks, LUNs (Logical Units) or logical disks.
The SAN infrastructure includes switches, hubs, routers and bridges. Their main purpose is to route data between the storage devices and servers on the network. Switches increase the aggregate bandwidth of the network by letting several pairs of devices communicate at the same time.

2.4   What are Storage Appliance devices? What are the different kinds of Storage Area Network Appliance?  
 
A Virtual Storage Appliance, or Virtual SAN Appliance, is a software bundle that allows a storage manager to turn the unused storage capacity of a virtual server within a network into a Storage Area Network (SAN). Some storage hardware vendors are trying to bundle this software with the firmware itself. More recently, the feature is being added at the hardware level for optimized utilisation of memory.
2.4.1   In-band SAN Appliance
An in-band SAN appliance, also called symmetric virtualization, sits between the hosts and the storage. All I/O requests and their data pass through the device. Hosts perform I/O to the virtualization device and do not interact directly with the actual storage device. The virtualization device in turn interacts with the storage device on behalf of the host.
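
A minimal Python sketch of the in-band data path follows. The classes StorageDevice and InBandAppliance, and the simple round-robin mapping from virtual blocks to devices, are assumptions made for this illustration. The point is only that every read and write goes through the appliance, and the host never addresses a storage device itself.

```python
# Illustrative model only; class names and the mapping rule are assumptions.
class StorageDevice:
    def __init__(self):
        self.blocks = {}

    def write(self, address, data):
        self.blocks[address] = data

    def read(self, address):
        return self.blocks[address]


class InBandAppliance:
    """Sits in the data path between the hosts and the storage devices."""

    def __init__(self, devices):
        self.devices = devices
        self.mapping = {}        # virtual block -> (device index, physical address)
        self.requests_seen = 0   # every host request passes through here

    def write(self, virtual_block, data):
        self.requests_seen += 1
        index = virtual_block % len(self.devices)
        physical = virtual_block // len(self.devices)
        self.mapping[virtual_block] = (index, physical)
        self.devices[index].write(physical, data)

    def read(self, virtual_block):
        self.requests_seen += 1
        index, physical = self.mapping[virtual_block]
        return self.devices[index].read(physical)


# The host only ever talks to the appliance.
appliance = InBandAppliance([StorageDevice(), StorageDevice()])
appliance.write(0, b"first")
appliance.write(1, b"second")
value = appliance.read(1)
```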
2.4.2   Out-of-band SAN Appliance
An out-of-band SAN appliance, also called asymmetric virtualization, uses metadata mapping functions to access the storage device. It requires additional software on the host, which first requests the location of the actual data. An I/O request is therefore intercepted on the host before it is sent to storage. A metadata lookup is sent to the metadata server, which returns the physical location of the data to the host. The data is then retrieved through an actual I/O request to the storage.
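
The two-step flow described here, a metadata lookup followed by a direct I/O request, can be sketched as below. MetadataServer and HostAgent are hypothetical names standing in for the metadata server and the additional host software; the example is a simplified model, not a real virtualization product.

```python
# Illustrative model only; all class and method names are assumptions.
class StorageDevice:
    def __init__(self):
        self.blocks = {}

    def write(self, address, data):
        self.blocks[address] = data

    def read(self, address):
        return self.blocks[address]


class MetadataServer:
    """Knows where data lives but never carries the data itself."""

    def __init__(self):
        self.locations = {}   # name -> (device id, physical address)
        self.lookups = 0

    def register(self, name, device_id, address):
        self.locations[name] = (device_id, address)

    def locate(self, name):
        self.lookups += 1
        return self.locations[name]


class HostAgent:
    """The extra host software that intercepts each I/O request."""

    def __init__(self, metadata_server, devices):
        self.metadata_server = metadata_server
        self.devices = devices

    def read(self, name):
        # Step 1: ask the metadata server for the physical location.
        device_id, address = self.metadata_server.locate(name)
        # Step 2: issue the actual I/O request directly to the storage.
        return self.devices[device_id].read(address)


devices = {"array-1": StorageDevice(), "array-2": StorageDevice()}
devices["array-2"].write(7, b"payroll")

metadata = MetadataServer()
metadata.register("payroll.dat", "array-2", 7)

host = HostAgent(metadata, devices)
result = host.read("payroll.dat")
```

Unlike the in-band case, only the lookup touches the metadata server; the data itself travels straight between the host and the storage device.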
3.0   STORAGE AREA NETWORK ESSENTIALS
All the operations of a SAN are managed through a set of software components, so it is essential to know which software the SAN needs in order to function properly. It is also important for administrators to protect the data in order to guarantee its availability and consistency. To that end, the SAN should be maintained and backed up regularly.

3.1 Software Requirements for SAN
·        I/O drivers having unique SAN-related capabilities.

·        Storage middleware components, such as volume managers, that exploit and extend the capabilities of the SAN.

·        Application middleware components, such as cluster managers, that implement the availability and scaling functionality of the SAN.

·        Database managers, which help in load balancing.

·        Enterprise backup managers that can make consistent backups of data with minimal overhead while the data is in use by applications. This may rely on a hot (online) backup, which freezes the data file headers so that the checkpoint SCN (system change number) recorded in them stops advancing while each data file is copied to the backup destination.

·        SAN managers, which are used to manage the SAN environment.
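
The hot backup idea from the list above can be modelled roughly as follows. This is a toy simulation under stated assumptions: the classes DataFile and Database and the function hot_backup are invented for illustration and do not reproduce any real database manager. A real system would also rely on its change log to bring the copied files to a consistent state, which is not modelled here.

```python
# Toy simulation; all names are hypothetical and the model is simplified.
class DataFile:
    def __init__(self, name):
        self.name = name
        self.header_scn = 0     # checkpoint SCN recorded in the file header
        self.blocks = {}
        self.in_backup = False


class Database:
    def __init__(self, file_names):
        self.current_scn = 0
        self.files = {name: DataFile(name) for name in file_names}

    def write(self, file_name, block, value):
        self.current_scn += 1   # every change advances the SCN
        self.files[file_name].blocks[block] = value

    def checkpoint(self):
        # Headers of files in backup mode stay frozen.
        for data_file in self.files.values():
            if not data_file.in_backup:
                data_file.header_scn = self.current_scn

    def begin_backup(self):
        self.checkpoint()
        for data_file in self.files.values():
            data_file.in_backup = True

    def end_backup(self):
        for data_file in self.files.values():
            data_file.in_backup = False
        self.checkpoint()


def hot_backup(db, application_activity):
    """Copy every data file while applications keep using the database."""
    db.begin_backup()
    copies = {}
    for data_file in db.files.values():
        application_activity()  # the database stays online during the copy
        copies[data_file.name] = (data_file.header_scn, dict(data_file.blocks))
    db.end_backup()
    return copies


db = Database(["users.dbf", "orders.dbf"])
db.write("users.dbf", 0, "alice")


def activity():
    db.write("orders.dbf", 1, "new order")
    db.checkpoint()


copies = hot_backup(db, activity)
```

During the backup the SCN in the database keeps moving, but the value recorded in each copied file header stays at the point where the backup began.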

3.2 Enterprise Data Protection for SAN
·        Periodic backups of online files and databases that can be used to recover from application or human errors.

·        Electronic archives that can be removed from data centres to a secure repository.

·        Running replicas of data that can be used to recover after a disaster.

·        Transport copies of data, which can be used to move data from a place where it is used less to a place where it is used more.
3.3   Backup for SAN
It is important to realize that we are ultimately responsible for the organization’s data. Any discrepancy in the data, or loss of it, can seriously harm a growing business. Backup can be defined as the process of making separable copies of online data. A backup of a set of data objects reflects the contents of those objects as they existed at a single instant.
There are generally two kinds of backups:

LOGICAL BACKUP – A logical backup is used to recover accidentally deleted or modified data.

PHYSICAL BACKUP – A physical backup is a copy of the configuration files and the database files. It is mainly used for recovery after a media failure.
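
The difference between the two kinds of backup can be shown with Python's built-in sqlite3 module, used here only as a convenient stand-in for an enterprise database. The table name, file names and rows are made up for the example. The logical backup exports the contents as SQL statements with Connection.iterdump(), while the physical backup copies the database file itself. SQLite has no separate configuration files, so that part of a physical backup is not shown.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "live.db")

# Create a small database to back up (hypothetical sample data).
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)",
                 [("Asha",), ("Ravi",)])
conn.commit()

# Logical backup: export the data as SQL statements.
logical_dump = "\n".join(conn.iterdump())
conn.close()

# Physical backup: copy the database file byte for byte.
physical_copy = os.path.join(workdir, "live.db.bak")
shutil.copyfile(db_path, physical_copy)

# Recover from the logical backup by replaying the statements.
restored = sqlite3.connect(":memory:")
restored.executescript(logical_dump)
logical_rows = restored.execute(
    "SELECT name FROM customers ORDER BY id").fetchall()
restored.close()

# Recover from the physical backup by opening the copied file.
recovered = sqlite3.connect(physical_copy)
physical_rows = recovered.execute(
    "SELECT name FROM customers ORDER BY id").fetchall()
recovered.close()
```

The logical dump can be edited or replayed selectively, which suits recovery from a mistaken delete, while the physical copy replaces the whole file after the original media is lost.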

Backups may be:

·        Kept at the data centre.

·        Moved to alternate sites.

·        Made unalterable.

3.4   Enterprise Backup Architecture
An enterprise backup architecture is made up of the following components.
·        Backup clients. The term backup client refers both to a computer with data to back up and to the software component that reads the data from online storage and feeds it to a backup server.
·        Backup servers. A backup server is either a computer or a software component that receives data from the backup client and writes it onto the backup media.
·        Backup storage units. These typically comprise magnetic tape or optical disk drives controlled by a media server.
·        Backup media. The media onto which the data is written.
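
How these four components fit together can be sketched as a small pipeline. The class names mirror the terms in the list, but the code itself is a hypothetical illustration: the file names, the tape label and the record layout are all assumptions.

```python
# Illustrative pipeline; names and record layout are assumptions.
class BackupMedia:
    """The media onto which data is written, for example one tape."""

    def __init__(self, label):
        self.label = label
        self.records = []


class BackupStorageUnit:
    """A drive that writes to whichever media is currently loaded."""

    def __init__(self, media):
        self.media = media

    def write(self, record):
        self.media.records.append(record)


class BackupServer:
    """Receives data from clients and writes it to the backup media."""

    def __init__(self, storage_unit):
        self.storage_unit = storage_unit

    def receive(self, client_name, path, data):
        self.storage_unit.write(
            {"client": client_name, "path": path, "data": data})


class BackupClient:
    """Reads online data and feeds it to the backup server."""

    def __init__(self, name, online_files, server):
        self.name = name
        self.online_files = online_files
        self.server = server

    def run(self):
        for path, data in self.online_files.items():
            self.server.receive(self.name, path, data)
        return len(self.online_files)


tape = BackupMedia("TAPE-0001")
server = BackupServer(BackupStorageUnit(tape))
client = BackupClient("Server E",
                      {"/data/orders.db": b"orders",
                       "/data/users.db": b"users"},
                      server)
files_sent = client.run()
```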
4.0   Pros and Cons of Using SAN
The pros of using Storage Area Network are as follows.
·        Device Sharing. The ability of two or more servers to access a storage device.

·        Volume Sharing. Volume managers keep track of the volume state for a set of disks. The volume manager coordinates metadata updates between servers so that updates from two servers do not overwrite each other and so that every update is visible to all servers sharing the volume.

·        File System Sharing. The ability of two or more servers to share access to a file system on the device.

·        Database Sharing. It is common for two or more instances of a database manager running on separate servers to access the same database for availability and performance reasons.

·        File Sharing. File sharing allows files to be processed by applications running on any of the cluster’s servers.
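
The coordination of metadata updates described under Volume Sharing can be approximated with a lock. In this sketch two threads stand in for two servers and a threading.Lock stands in for the volume manager's coordination; in a real SAN the servers are separate machines and coordinate over the network. The class VolumeMetadata and its allocate method are hypothetical.

```python
import threading


class VolumeMetadata:
    """Shared volume state; updates are serialized so none is lost."""

    def __init__(self, extent_count):
        self._lock = threading.Lock()
        self._free = list(range(extent_count))
        self.owner = {}     # extent -> server that allocated it
        self.version = 0    # incremented on every metadata update

    def allocate(self, server):
        with self._lock:    # only one server updates the metadata at a time
            if not self._free:
                return None
            extent = self._free.pop()
            self.owner[extent] = server
            self.version += 1
            return extent


def server_work(metadata, server, requests):
    for _ in range(requests):
        metadata.allocate(server)


metadata = VolumeMetadata(extent_count=100)
threads = [
    threading.Thread(target=server_work, args=(metadata, name, 50))
    for name in ("Server A", "Server B")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

extents_per_server = {
    name: sum(1 for owner in metadata.owner.values() if owner == name)
    for name in ("Server A", "Server B")
}
```

Every extent ends up with exactly one owner, and both servers see the same final metadata version.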
The cons of using Storage Area Network may be listed as follows.

·        Cost. Setting up the infrastructure for a SAN is expensive, which makes it hard for smaller organizations to implement.

·        Fragmentation. Generally, a pool of storage must be carved into smaller slices, each dedicated to a server. This can often lead to wasted disk space.
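
The wasted space caused by fragmentation is easy to quantify. The figures below are invented purely for illustration: a 1500 GB pool carved into three 500 GB slices, each dedicated to one server that uses only part of it.

```python
def wasted_space(slices):
    """Return the unused capacity of each dedicated slice, in GB."""
    return {server: allocated - used
            for server, (allocated, used) in slices.items()}


# Hypothetical figures: server -> (allocated GB, used GB)
slices = {
    "Server A": (500, 120),
    "Server B": (500, 480),
    "Server C": (500, 200),
}

waste = wasted_space(slices)
total_allocated = sum(allocated for allocated, _ in slices.values())
total_wasted = sum(waste.values())
wasted_fraction = total_wasted / total_allocated
```

In this made-up case 700 GB of the 1500 GB pool sits idle, and Server B cannot borrow Server A's unused space even though it is nearly full.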