
Bala Sivalingam




Nodes and AMPs Organization

 

A node is the basic unit of the system, on which components such as the vprocs and the BYNET are modeled. A node is a hardware assembly containing several tightly coupled central processing units (CPUs). The BYNET is a hardware interprocessor network that links the nodes of an MPP system. It connects processors by broadcast, multicast, or point-to-point communication.

The BYNET also has high-speed logic arrays that provide bidirectional broadcast, multicast, and point-to-point communication, as well as merge functions. A multinode system has two BYNETs. This both creates a fault-tolerant environment and enhances interprocessor communication. When BYNET traffic becomes particularly heavy, the two BYNETs can handle separate (rather than redundant) traffic. The machine provides load-balancing software to optimize this process.

The total bandwidth for each network link to a processor node is 10 megabytes per second. Because there are two network links per node and because the bandwidth is linearly scalable, the total throughput available for each node is 20 megabytes per second.

For example, a 16-node 5100M system has 16 x 20 = 320 megabytes per second of bandwidth for point-to-point connections. Total available broadcast bandwidth for a system of any size is 20 megabytes per second.
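The bandwidth arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function and constant names are made up for this example, and the figures are the per-link and per-node values quoted in the text.

```python
# Figures taken from the text; names are hypothetical, for illustration only.
LINK_BANDWIDTH_MB = 10   # bandwidth of one network link to a node
LINKS_PER_NODE = 2       # one link per BYNET, two BYNETs per multinode system


def node_throughput_mb():
    """Throughput available to a single node across both links."""
    return LINK_BANDWIDTH_MB * LINKS_PER_NODE


def point_to_point_bandwidth_mb(nodes):
    """Point-to-point bandwidth scales linearly with the node count."""
    return nodes * node_throughput_mb()


def broadcast_bandwidth_mb(nodes):
    """Broadcast bandwidth is the same for any system size."""
    return node_throughput_mb()


print(point_to_point_bandwidth_mb(16))  # 320, the 16-node example
print(broadcast_bandwidth_mb(16))       # 20
```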

The BYNET software also provides a standard TCP/IP interface for communication among the SMP nodes.

2

 

          

 

 

Both the SMP and MPP machines run the set of software processes called vprocs on a node under the Parallel Database Extensions (PDE).

There are two types of vprocs: Parsing Engines (PEs) and Access Module Processes (AMPs).

 

Disk Arrays

A disk array is a matrix of independent but interconnected physical disk storage units. For the Teradata RDBMS, the disks are organized as a Redundant Array of Independent Disks (RAID), using either RAID 1 (mirroring) or RAID 5 or RAID S (parity) technology under RAID Manager. Each array typically consists of one to four ranks of disks, with up to five disks per rank. 2

RAID Manager uses drive groups. A drive group is a set of drives that have been configured into one or more logical units (LUNs). All the disks in a drive group must be of the same RAID level (1, 5, or S). A LUN is a portion of every drive in a drive group. These portions are configured to represent a single UNIX disk. Each LUN is uniquely identified and sliced into one or more UNIX slices.

Pdisks

A pdisk is a slice of a LUN that is assigned to an AMP. Each pdisk is uniquely identified and independently addressable.

Virtual Disks (vdisks)2

The groups of pdisks assigned to an AMP are collectively identified as a vdisk. Vdisks are used to control the assignment of pdisks to AMPs.
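The relationship between LUN slices, pdisks, vdisks, and AMPs can be pictured as a small data model. The sketch below is an assumption-laden illustration, not Teradata's actual layout: the class names, the `(lun_id, slice_id)` identifier, and the rule that a pdisk belongs to exactly one AMP are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pdisk:
    """A slice of a LUN. The (lun_id, slice_id) pair is a hypothetical unique ID."""
    lun_id: int
    slice_id: int


@dataclass
class Vdisk:
    """The group of pdisks assigned to one AMP."""
    amp_id: int
    pdisks: list = field(default_factory=list)


def build_vdisks(assignments):
    """Group (amp_id, Pdisk) pairs into one Vdisk per AMP.

    Assumption for this sketch: a pdisk may be assigned to only one AMP,
    so a repeated pdisk raises ValueError.
    """
    vdisks = {}
    seen = set()
    for amp_id, pdisk in assignments:
        if pdisk in seen:
            raise ValueError(f"{pdisk} is already assigned to an AMP")
        seen.add(pdisk)
        vdisks.setdefault(amp_id, Vdisk(amp_id)).pdisks.append(pdisk)
    return vdisks


vdisks = build_vdisks([
    (0, Pdisk(lun_id=0, slice_id=0)),
    (0, Pdisk(lun_id=0, slice_id=1)),
    (1, Pdisk(lun_id=1, slice_id=0)),
])
print(len(vdisks[0].pdisks))  # AMP 0 owns two pdisks
```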

 

Virtual Processors

2

The versatility of the Teradata RDBMS is based on virtual processors (vprocs) that eliminate dependency on specialized physical processors. These vprocs are a set of software processes that run on a node under the Teradata Parallel Database Extensions (PDE) and the multitasking environment of the operating system.

The maximum number of vprocs that can be supported in a single system is 16,384. The maximum number of vprocs per node can be as high as 128. Each vproc is a separate, independent copy of the processor software, isolated from other vprocs, but sharing some of the physical resources of the node, such as memory and CPUs. Multiple vprocs can run on an SMP platform or a node.
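As a quick illustration of how the two limits interact, the sketch below treats 16,384 vprocs per system and 128 vprocs per node as hard caps, which is an assumption made for this example. The helper names are hypothetical.

```python
import math

# Limits quoted in the text, treated here as hard caps (an assumption).
MAX_VPROCS_PER_SYSTEM = 16_384
MAX_VPROCS_PER_NODE = 128


def min_nodes_for(total_vprocs):
    """Smallest node count that can host the requested number of vprocs."""
    if not 0 < total_vprocs <= MAX_VPROCS_PER_SYSTEM:
        raise ValueError("vproc count is outside the supported range")
    return math.ceil(total_vprocs / MAX_VPROCS_PER_NODE)


def check_layout(vprocs_per_node):
    """Return True if a list of per-node vproc counts respects both limits."""
    return (all(0 <= n <= MAX_VPROCS_PER_NODE for n in vprocs_per_node)
            and sum(vprocs_per_node) <= MAX_VPROCS_PER_SYSTEM)


print(min_nodes_for(MAX_VPROCS_PER_SYSTEM))  # 128 nodes at 128 vprocs each
```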

The vprocs and the tasks running under them communicate using unique-address messaging, as if they were physically isolated from each other. This message communication is done using the Boardless BYNET Driver software on single node platforms and using the BYNET hardware and BYNET Driver software on multinode platforms.

 

PEs

2

The Parsing Engine is the virtual processor that communicates with the client system on one side and with the AMPs (via the BYNET or Boardless BYNET) on the other. Each PE executes the database software that manages sessions, decomposes SQL statements into parallel steps, and returns the answer rows to the requesting client.

 

AMPs

2

The Access Module Process (AMP) is the heart of the Teradata RDBMS. The Access Module Process is a virtual processor (vproc) that provides a BYNET interface and performs many database and file management tasks.

AMPs control the management of the Teradata RDBMS and also provide control over the disk subsystem, with each AMP being assigned to a virtual disk.

Each AMP controls the following set of functions:

  1. BYNET (or Boardless BYNET) interface
  2. Database manager
    1. Locking
    2. Joins
    3. Sorting
    4. Aggregation
    5. Output data conversion
    6. Disk space management
    7. Accounting
    8. Journaling
  3. File system and disk management

Each AMP is assigned a portion of the database to control. Each AMP also maintains its portion of the database tables stored on disks.

 

The AMP executes any SQL request in three steps:

1. Lock the table.

2. Execute the operation requested.

3. End the transaction.
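The three steps can be sketched as a lock, execute, release pattern. This is a deliberately simplified illustration and not the Teradata implementation: the `Amp` class, its `execute` method, and the use of one in-process `threading.Lock` per table are hypothetical stand-ins for the real lock manager.

```python
import threading


class Amp:
    """Toy AMP that serializes work on a table (hypothetical model)."""

    def __init__(self):
        self._table_locks = {}
        self._guard = threading.Lock()  # protects the lock table itself
        self.log = []                   # records the steps, for illustration

    def _lock_for(self, table):
        with self._guard:
            return self._table_locks.setdefault(table, threading.Lock())

    def execute(self, table, operation):
        lock = self._lock_for(table)
        lock.acquire()                        # step 1: lock the table
        self.log.append(("lock", table))
        try:
            result = operation()              # step 2: execute the operation
            self.log.append(("execute", table))
            return result
        finally:
            self.log.append(("end", table))   # step 3: end the transaction
            lock.release()


amp = Amp()
rows = amp.execute("employee", lambda: ["row1", "row2"])
print(rows)
print(amp.log)
```

The `finally` clause ends the transaction and releases the lock even if the operation raises, so a failed request does not leave the table locked.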