Glossary of Terms
ASCII
Stands for "American Standard Code for Information Interchange." ASCII is the universal standard for the numerical codes computers use to represent all upper and lower-case letters, numbers, and punctuation. Without ASCII, each type of computer would use a different way of representing letters and numbers, causing major chaos for computer programmers (allowing them even less sleep than they already get).
ASCII makes it possible for text to be represented the same way on a Dell Dimension in Minneapolis, Minnesota as it is on an Apple Power Mac in Paris, France. There are 128 standard ASCII codes, each of which can be represented by a 7-digit binary number (because 2^7 = 128).
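As a small illustration of the 7-digit binary codes described above, the following Python sketch prints the code for each character of a string. The function name `ascii_bits` is made up for this example.

```python
def ascii_bits(text: str) -> list[str]:
    """Return the 7-digit binary ASCII code for each character in text."""
    codes = []
    for ch in text:
        code = ord(ch)  # numerical code of the character
        if code > 127:
            # Only codes 0..127 belong to standard ASCII.
            raise ValueError(f"{ch!r} is not a standard ASCII character")
        codes.append(format(code, "07b"))  # zero-padded to 7 binary digits
    return codes


print(ascii_bits("Hi"))  # ['1001000', '1101001']
```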
Backup
A backup is a copy of one or more files created as an alternate in case the original data is lost or becomes unusable.
Cache
A cache stores recently-used information in a place where it can be accessed extremely fast.
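The idea of a cache can be sketched in Python with the standard library's `functools.lru_cache`, which keeps recently used results in memory. The function `slow_lookup` and the call counter are hypothetical stand-ins for any expensive operation.

```python
from functools import lru_cache

calls = 0  # counts how often the "slow" work is really done


@lru_cache(maxsize=128)  # keep up to 128 recently used results
def slow_lookup(key: str) -> str:
    global calls
    calls += 1
    return key.upper()  # stand-in for an expensive computation


slow_lookup("raid")  # computed and stored in the cache
slow_lookup("raid")  # answered from the cache, no recomputation
print(calls)  # 1
```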
CD-ROM
Stands for "Compact Disc Read-Only Memory." A CD-ROM is a CD that can be read by a computer with an optical drive. The "ROM" part of the term means the data on the disc is "read-only," or cannot be altered or erased.
CD-RW
Stands for "Compact Disc Re-Writable." A CD-RW is a blank CD that can be written to by a CD burner. Unlike a CD-R (CD-Recordable), a CD-RW can be written to multiple times. The data burned on a CD-RW cannot be changed, but it can be erased.
CIF
CIF (Common Intermediate Format), also known as FCIF (Full Common Intermediate Format), is a format used to standardize the horizontal and vertical resolutions in pixels of YCbCr (colour spaces) sequences in video signals, commonly used in video teleconferencing systems.
CHAP
In computing, the Challenge-Handshake Authentication Protocol (CHAP) authenticates a user or network host to an authenticating entity. CHAP is an authentication scheme used by Point-to-Point Protocol (PPP) servers to validate the identity of remote clients. It is also used for authentication in other protocols, such as iSCSI.
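A minimal sketch of the challenge-response exchange, assuming the MD5 variant of CHAP, in which the response is the MD5 hash of the one-byte identifier, the shared secret and the challenge. The function names and the secret value are made up for this example, and real implementations also handle packet framing and retries.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Client side: hash the identifier, shared secret and challenge together."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def chap_verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Authenticator side: recompute the hash and compare it with the response."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


challenge = os.urandom(16)  # random challenge sent by the authenticator
response = chap_response(1, b"shared-secret", challenge)
print(chap_verify(1, b"shared-secret", challenge, response))  # True
```

The secret itself never crosses the link; only the challenge and the hash do.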
CHANNEL BONDING
The use of multiple links combined to work as though they offered a single, higher-bandwidth link.
Clock Speed
Clock speed is the rate at which a processor can complete a processing cycle. It is typically measured in megahertz or gigahertz.
Cluster
A group of connected computers. A cluster can also refer to several machines grouped together, all performing a similar function.
COMPUTER CLUSTERING
The combination of multiple discrete computers into larger metacomputers.
CPU
Stands for "Central Processing Unit." This is like the brain of a computer. It processes everything from basic instructions to complex functions.
Data
Computer data is information processed or stored by a computer. This information may be in the form of text documents, images, audio clips, software programs, or other types of data.
Data transfer rate
The data transfer rate is commonly used to measure how fast data is transferred from one location to another.
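As a worked example, the time needed to move a file follows directly from its size and the transfer rate. The sketch below assumes the size is given in bytes and the rate in bits per second; the function name is made up for this example.

```python
def transfer_time_seconds(size_bytes: int, rate_bits_per_second: float) -> float:
    """Time to move size_bytes at the given rate (8 bits per byte)."""
    return size_bytes * 8 / rate_bits_per_second


# 1 GB (10**9 bytes) over a 1 Gbit/s link, ignoring protocol overhead
print(transfer_time_seconds(10**9, 10**9))  # 8.0
```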
DDR
Stands for "Double Data Rate." It is an advanced version of SDRAM, a type of computer memory. DDR-SDRAM, sometimes called "SDRAM II," can transfer data twice as fast as regular SDRAM chips. This is because DDR memory can send and receive signals twice per clock cycle. DDR2 RAM is an improved version of DDR memory that is faster and more efficient.
DISK PARTITIONING
Partitioning is the splitting of a single resource (usually large), such as disk space or network bandwidth, into a number of smaller, more easily utilized resources of the same type.
ENCRYPTION
Encryption is the process of transforming information using an algorithm to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information.
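The role of the key can be illustrated with a toy cipher that XORs each byte of the message with a random key of the same length. This is only a sketch of the principle, not a recommendation for protecting real data, and the function name is made up for this example.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"storage array"
key = secrets.token_bytes(len(message))  # the "special knowledge"
ciphertext = xor_bytes(message, key)     # encrypted information
recovered = xor_bytes(ciphertext, key)   # the same key reverses the process
print(recovered == message)  # True
```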
FIBRE CHANNEL
Fibre Channel, or FC, is a gigabit-speed network technology primarily used for storage networking. Fibre Channel is standardized in the T11 Technical Committee of the International Committee for Information Technology Standards (INCITS), an American National Standards Institute (ANSI)-accredited standards committee. It was first used primarily in the supercomputer field, but has become the standard connection type for storage area networks (SAN) in enterprise storage. Despite common connotations of its name, Fibre Channel signalling can run on both twisted pair copper wire and fiber-optic cables.
Format
Formatting a disk involves rewriting the directory structure, or file system, of a disk. All disks must be formatted using a supported file system in order to work with a computer.
FTP
Stands for "File Transfer Protocol." It is a common method of transferring files via the Internet from one computer to another.
GRID COMPUTING
See Computer Clustering.
GUI
Stands for "Graphical User Interface." It refers to the graphical interface of a computer that allows users to click and drag objects with a mouse instead of entering text at a command line.
Hard disk
The hard disk is a spindle of magnetic disks, called platters, that record and store information. Because the data is stored magnetically, information recorded to the hard disk remains intact after the computer has been turned off.
Hard drive
The hard drive is what stores all the data. It houses the hard disk, where all files and folders are physically located.
Host
This is a computer that acts as a server for other computers on a network. It can be a Web server, an e-mail server, an FTP server, etc.
Hub
It is a hardware device that is used to network multiple computers together. It is a central connection for all the computers in a network, which is usually Ethernet-based. Information sent to the hub can flow to any other computer on the network.
iSCSI
The "Internet SCSI" protocol allows clients (called initiators) to send SCSI commands to SCSI storage devices (targets) on remote servers. It is a popular Storage Area Network (SAN) protocol, allowing organizations to consolidate storage into data center storage arrays while providing hosts (such as database and web servers) with the illusion of locally-attached disks. Unlike Fibre Channel, which requires special-purpose cabling, iSCSI can be run over long distances using existing network infrastructure.
Java
It refers to a programming language developed by Sun Microsystems. The language derives much of its syntax from C/C++, but it is object-oriented and structured around "classes" instead of functions. Java can also be used for programming applets - small programs that can be embedded in Web sites.
JBOD
JBOD (for "just a bunch of disks") is a term, sometimes used to mean "spanning", for a computer's hard disks that have not been configured as a RAID array.
LAN
Stands for "Local Area Network". A LAN is a computer network limited to a small area such as an office building, university, or even a residential home.
LUN
Means "Logical Unit Number". LUNs are used to identify SCSI devices connected to a computer. Each device is assigned a LUN from "0" to "7", which serves as the device's unique address. LUNs can also be used for identifying virtual hard disk partitions, which are used in RAID configurations.
Mapping
Drive mapping is the way in which Microsoft Windows and OS/2 associate a local drive letter (A through Z) with a shared storage area on another computer over a network. After a drive has been mapped, a software application on a computer can read and write files from the shared storage area by accessing that drive, just as if that drive represented a local physical hard disk drive.
NIC
A Network card, Network Adapter, LAN Adapter or NIC (network interface card) is a piece of computer hardware designed to allow computers to communicate over a computer network.
NAS
Network-attached storage (NAS) is file-level computer data storage connected to a computer network providing data access to heterogeneous network clients. A NAS unit is essentially a self-contained computer connected to a network, with the sole purpose of supplying file-based data storage services to other devices on the network. The operating system and other software on the NAS unit provide the functionality of data storage, file systems, and access to files, and the management of these functionalities.
Operating System
Also known as an "OS," this is the software that communicates with computer hardware on the most basic level. Without an operating system, no software programs can run. The OS is what allocates memory, processes tasks, accesses disks and peripherals, and serves as the user interface.
Partition
A partition is a section of a hard disk. When a hard disk is formatted, one can usually choose the number of partitions. The computer will recognize each partition as a separate disk.
PCI
Stands for "Peripheral Component Interconnect." It is a parallel hardware bus designed by Intel and used in both PCs and Macs. Most add-on cards such as SCSI, Firewire, and USB controllers, use a PCI connection.
PCI Express
PCI Express (PCIe) does not use a parallel bus structure, but instead is a network of serial connections controlled by a hub on the computer's motherboard. This enables PCI Express cards to run significantly faster than previous PCI cards.
PCI-X
Means "Peripheral Component Interconnect Extended." The first version of PCI-X supported clock speeds of up to 133 MHz, which is more than twice as fast as the original PCI standard. Its successor, PCI-X 2.0, can run at speeds of 266 or 533 MHz. These speeds are fast enough to support Gigabit Ethernet cards and video capture devices.
RAID LEVELS - an introduction
RAID technology enables data to be written to multiple sets of disks simultaneously, with data redundancy allowing one or more of the individual disks to fail without losing data. RAID technology has over the years been revised, refined, extended and enhanced to the point that it may now mean different things to different people.
RAID technology is divided into a number of levels, whose different attributes define how data is stored and how data integrity is protected. Each defined RAID level specifies how sets of disks are arranged and the pattern in which data is written, read and verified for integrity. The RAID algorithm may be implemented either in hardware or in the operating system.
Hardware RAID controllers are intelligent disk controllers, usually with dedicated microelectronics performing the complex RAID algorithms. Software RAID, on the other hand, depends on the host system microprocessor to perform RAID calculations and therefore reduces the raw processing power available to run applications. The benefit of hardware RAID over software RAID is that it does not impinge on the host system CPU, leaving it free to run applications.
RAID LEVEL 0
RAID 0, also known as Striping, is not, strictly speaking, a RAID level at all, because it does not offer redundancy, and was not originally defined as such. It consists of two or more disks being written or read simultaneously (or as near to simultaneously as is possible). The more disks in the RAID 0 array, the better the possible performance and capacity for both IO and transfer rate, and there is no disk cost penalty. However, RAID 0 is inherently vulnerable because any disk failure whatsoever will result in total data loss, and the more disks in use, the greater the chances of disk failure.
RAID 0 is excellent for applications that require maximum performance so long as the data is not kept permanently on the RAID 0 array. Applications such as image editing, pre-press and digital rendering can benefit greatly from RAID 0 IO performance and capacity characteristics.
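Striping can be sketched as a simple mapping from a logical block number to a disk and a position on that disk. The example below assumes a stripe unit of exactly one block and disks of equal size; real controllers usually stripe in larger chunks, and the function name is made up for this example.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    if logical_block < 0 or num_disks < 2:
        raise ValueError("need a non-negative block and at least two disks")
    # Consecutive blocks go to consecutive disks, wrapping around.
    return logical_block % num_disks, logical_block // num_disks


# Blocks 0..5 on a three-disk array
print([raid0_locate(b, 3) for b in range(6)])
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Because neighbouring blocks land on different disks, a large sequential request keeps every disk busy at once.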
RAID LEVEL 1
RAID 1, also known as Mirroring, might be referred to as the opposite of RAID 0. Identical data is written across two or more disks, so that in the event of a disk failing, a complete copy of the data is still accessible, offering 100% redundancy. RAID 1 is inherently expensive because cost is doubled (or capacity halved) compared to a single disk. Read performance may be enhanced if the controller allows simultaneous reads from both drives, while write performance is reduced slightly because data has to be written twice, once to each disk. Where applicable, pairs of disks may be connected to different IO buses. This is known as Duplexing.
RAID 1 is typically used for applications where not only is redundancy paramount, but where the inconvenience of being forced to restore from backup needs to be avoided.
RAID LEVEL 0+1
RAID 0+1 may be described as a Mirrored Stripeset. A RAID 1 array is layered over two RAID 0 arrays to offer the benefits of both levels: performance and redundancy. With the performance of RAID 0, RAID 0+1 increases reliability as well by keeping a mirror of the striped data. It requires a minimum of four disks to implement. As multiple copies of the data are kept, the cost is double that of an equivalent capacity RAID 0 array.
RAID LEVEL 10
RAID 10 may be described as Striped Mirroring. Multiple RAID 1 arrays are grouped into a single RAID 0 array. A RAID 10 array may lose all but one drive in each of the constituent RAID 1 arrays without compromising data integrity. However, if all the drives in one individual RAID 1 array should be lost, the entire RAID 10 array will be lost in a similar fashion to a single drive loss in a RAID 0 array.
RAID 10 is popular for high random request applications such as databases because write speeds are high and data security and integrity are acceptable.
RAID LEVEL 2
RAID 2 uses something similar to parity checks implemented by splitting data at bit level and spreading it over a number of disks and a number of redundancy disks. The redundancy data is calculated using a form of Error Correction Code (ECC) known as Hamming codes. When data is subsequently read the codes are also read and checked to ensure that nothing has changed since writing, and single bit errors may be corrected automatically.
RAID 2 is no longer used because ECC was introduced as standard into individual disks. Besides, the nature of the ECC data required large numbers of disk drives and complex controllers, and other RAID levels provide better efficiency, pricing, performance and protection.
RAID LEVEL 3
RAID 3, or Striping with Dedicated Parity, writes chunks of data which are smaller than the average IO size across a number of disks, with parity data stored on a single disk dedicated to that purpose. Because the data chunks are so small, data blocks are always distributed across all the disks, so any IO requires activity on every disk. This in turn requires synchronised spindles, so that all disks can seek to the same position simultaneously, which enhances throughput in IO-intensive environments.
RAID 3 is typically used for media applications requiring sequential request performance such as image editing, digital pre-press and live data streaming.
RAID LEVEL 4
RAID 4 is similar to RAID 3 except that the striping is done at block rather than byte level. This means that blocks may be read or written by a single disk, so multiple blocks on separate disks may be read or written in parallel. This enhances sequential request performance even more, but at the expense of random request performance.
NOTE: whether a RAID controller implements RAID 3 or RAID 4 is actually arguable because the definition of block size is subjective. Indeed, some RAID controllers may be adaptive, automatically changing block size to enhance requests.
RAID LEVEL 5
RAID 5, or Striping with Distributed Parity, is similar to RAID 3 / 4 except that the parity data is written in a round robin fashion to alternate disks. Data chunks are much larger than the average IO size, and disks are able to satisfy requests independently, which enhances random read request performance. RAID 5 write request performance is not optimal because of the number of disk accesses required: old data and parity are read off separate disks; new parity is calculated; new data and parity are written to separate disks. Many controller manufacturers use write caching to enhance performance.
RAID 5 is currently one of the most commonly used RAID levels and is a reasonable choice for general file and application servers, database servers, and web, email and news servers.
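The parity used by RAID 5 is commonly a byte-wise XOR of the data blocks in a stripe. The sketch below assumes XOR parity and equal-length blocks, and it ignores how parity rotates between disks. It shows both the rebuild of a lost block and the small-write sequence described above (read old data and parity, calculate new parity, write new data and parity). All names are made up for this example.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One stripe: three data blocks plus their parity block
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(*data)

# Disk 1 fails: rebuild its block from the surviving blocks and parity
rebuilt = xor_blocks(data[0], data[2], parity)
print(rebuilt == data[1])  # True

# Small write to disk 1: only the old data and old parity need to be read
new_block = b"\xaa\xbb"
new_parity = xor_blocks(parity, data[1], new_block)
print(new_parity == xor_blocks(data[0], new_block, data[2]))  # True
```

The small write touches two disks twice each, which is why the random write performance of RAID 5 is lower than its random read performance.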
RAID LEVEL 50
RAID 50 consists of RAID 0 (Striping) over multiple RAID 5 arrays. RAID 50 gives an added performance boost over RAID 5 although it is twice as expensive (in the case where two RAID 5 sets are combined into a RAID 50). The performance is achieved at the expense of one extra disk when compared to a RAID 5 array.
RAID 50 suffers from a similar intolerance to RAID 10: the failure of two drives in one constituent RAID 5 set means the loss of the entire array.
RAID LEVEL 6
RAID 6 is an extension of RAID 5, with extra redundancy provided by writing two sets of parity data instead of one. This allows two drives to fail before an array is in a critical state, rather than one as with RAID 3, 4 or 5. RAID 6 requires an extra disk and added processing power to offer similar performance, but as disks are cheap and the fixed costs of development are paid for, RAID 6 is fast becoming the default RAID level.
RAID LEVEL 60
RAID 60 consists of RAID 0 over multiple RAID 6 arrays. RAID 60 gives an added performance boost over RAID 6 although it is twice as expensive (in the case where two RAID 6 sets are combined into a RAID 60). The performance is achieved at the expense of two extra disks when compared to a RAID 6 array.
JBOD AND NRAID
JBOD, meaning Just a Bunch of Disks, is not actually RAID. The drives are seen as stand-alone disks, each as a logical drive. JBOD does not offer any redundancy.
NOTE: some controller manufacturers, especially those of low-end arrays, use the term JBOD to refer to what is actually NRAID.
NRAID, meaning Non-RAID, is also known as Volume Spanning, and may be thought of as the opposite of partitioning, where drives are combined to become one logical drive without any block striping. NRAID does not offer any redundancy but should a drive fail only the data residing on that drive should be lost.
NOTE: some controllers claim to offer NRAID but the loss of a single disk means the loss of all data on the array.
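Volume spanning can be sketched as concatenation: the logical drive starts on the first disk and continues on the next one when the first is full. The example below assumes disk sizes are given in blocks; the function name and sizes are made up for this example.

```python
def span_locate(logical_block: int, disk_sizes: list[int]) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset) on a spanned volume."""
    if logical_block < 0:
        raise IndexError("logical block must not be negative")
    for disk, size in enumerate(disk_sizes):
        if logical_block < size:
            return disk, logical_block
        logical_block -= size  # skip past this disk and try the next one
    raise IndexError("logical block is beyond the end of the volume")


sizes = [100, 50, 200]  # three disks of different sizes, in blocks
print(span_locate(120, sizes))  # (1, 20)
print(span_locate(150, sizes))  # (2, 0)
```

Unlike striping, neighbouring blocks stay on the same disk, which is why a failed drive should only take the data residing on that drive with it.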
RAID LEVELS COMPARISON
|        | USABLE CAPACITY | SEQUENTIAL READ | SEQUENTIAL WRITE | RANDOM READ | RANDOM WRITE | AVAILABILITY |
|--------|-----------------|-----------------|------------------|-------------|--------------|--------------|
| NRAID  | High (N)        | Normal          | Normal           | Normal      | Normal       | 10k-100k Hr  |
| RAID 0 | High (N)        | Highest         | Highest          | High        | Highest      | Low          |
| RAID 1 | Low (N/2)       | High            | Medium           | Medium      | Low          | Highest      |
| RAID 2 | Moderate        | Higher          | Medium           | Low         | Low          | High         |
| RAID 3 | High (N-1)      | High            | Medium           | Medium      | Low          | High         |
| RAID 4 | High (N-1)      | Higher          | Higher           | Lower       | Lower        | High         |
| RAID 5 | High (N-1)      | High            | Medium           | High        | Low          | High         |
| RAID 6 | High (N-2)      | High            | Medium           | High        | Low          | Higher       |
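The usable capacity figures in the comparison (N, N/2, N-1, N-2) can be turned into a small calculator. The sketch below assumes N identical disks and covers only the levels for which the comparison gives a formula; the function name is made up for this example.

```python
def usable_capacity(level: str, n: int, disk_size: float) -> float:
    """Usable capacity of n identical disks at the given RAID level."""
    data_disks = {
        "NRAID": n,
        "RAID 0": n,
        "RAID 1": n / 2,  # every block is stored twice
        "RAID 3": n - 1,  # one disk's worth of parity
        "RAID 4": n - 1,
        "RAID 5": n - 1,
        "RAID 6": n - 2,  # two disks' worth of parity
    }
    if level not in data_disks:
        raise ValueError(f"no capacity formula for {level!r}")
    return data_disks[level] * disk_size


# Eight 2 TB disks
print(usable_capacity("RAID 5", 8, 2))  # 14
print(usable_capacity("RAID 6", 8, 2))  # 12
```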
REPLICATION
Replication is the process of sharing information so as to ensure consistency and to improve reliability, fault-tolerance, or accessibility. It could be data replication if the same data is stored on multiple storage devices.
Router
It is a hardware device that routes data from a local area network (LAN) to another network connection. A router can be configured to allow only authorized machines to connect to other computer systems.
SAN
A storage area network (SAN) is an architecture to attach remote computer storage devices (such as disk arrays, tape libraries and optical jukeboxes) to servers in such a way that, to the operating system, the devices appear as locally attached.
SAS
Serial Attached SCSI (SAS) is a computer bus technology primarily designed for the transfer of data to and from storage devices and drives, e.g. hard disks.
SCSI
Means "Small Computer System Interface". SCSI is a computer interface used primarily for high-speed hard drives. This is because SCSI can support faster data transfer rates than the commonly used IDE storage interface. SCSI also supports daisy-chaining devices, which means several SCSI hard drives can be connected to a single SCSI interface, with little to no decrease in performance. SCSI is increasingly being replaced by SAS.
SNAPSHOT
A snapshot is a copy of a set of files and directories as they were at a particular point in the past.
Switch
A switch is used to network multiple computers together. Switches usually have 4 to 50 Ethernet ports. These ports can connect to computers, cable or DSL modems, and other switches. Switches are more advanced than hubs and less capable than routers. Unlike hubs, switches can limit the traffic to and from each port so that each device connected to the switch has a sufficient amount of bandwidth.
Telnet
This is a program that allows you to log in to a Unix computer via a text-based interface.
TOE
TCP Offload Engine or TOE is a technology used in network interface cards to offload processing of the entire TCP/IP stack to the network controller. It is primarily used with high-speed network interfaces, such as gigabit Ethernet and 10 gigabit Ethernet, where processing overhead of the network stack becomes significant.
UPS
Means "Uninterruptible Power Supply." It is a type of power supply that uses battery backup to maintain power during unexpected power outages.
VIRTUALISATION
Virtualization is performed on a given hardware platform by host software (a control program), which creates a simulated computer environment, a virtual machine, for its guest software. The guest software, which is often itself a complete operating system, runs just as if it were installed on a stand-alone hardware platform.
STORAGE VIRTUALISATION
The process of completely abstracting logical storage from physical storage.
NETWORK VIRTUALISATION
Creation of a virtualized network addressing space within or across network subnets.
VIRTUAL MEMORY
Virtual Memory allows uniform, contiguous addressing of physically separate and non-contiguous memory and disk areas.
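The mapping from contiguous virtual addresses to scattered physical locations can be sketched with a toy page table. The page size, the table contents and the function name are assumptions made for this example; a real system does this translation in hardware and loads missing pages from disk.

```python
PAGE_SIZE = 4096  # bytes per page (assumed for this example)

# Virtual page number -> physical frame number; the frames are not contiguous
page_table = {0: 7, 1: 2, 2: 9}


def translate(virtual_address: int) -> int:
    """Translate a virtual address into a physical address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # A real system would now fetch the page from disk (a page fault).
        raise LookupError(f"page fault: virtual page {page} is not in memory")
    return page_table[page] * PAGE_SIZE + offset


print(translate(0))     # 28672
print(translate(4100))  # 8196
```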
