Back to Research Page

Pankaj Mehta




Indexing Structures

Teradata indexes are subtables that contain base table index column values and RowIDs which point to the rows in the base table having that value. The rows of these index subtables are shared across all VPROCs, but how this subtable partitioning takes place (globally or locally) will depend on whether this is a unique secondary index (USI) or non-unique secondary index (NUSI).

Primary Indexes

Teradata’s primary index is a direct hash. This is extremely fast for most operations. There are two levels of indexing the hash codes. The first level is locked in memory. This guarantees that the worst case for access to a single row is 2 I/Os. I/O caching gives a high probability that the second level index is in cache and some chance that the data block is in memory as well. Table storage mechanisms guarantee that as data is inserted or changed, no increased overhead is incurred (no hash chains, block extensions, ...). Background tasks insure that fragmentation is cleaned up on an as needed basis.

Advantages

  • Fast access for single row operations
  • The hashed structure is sorted by hash and optimized for use by the Hash Merge Join algorithm.
  • No preparation of the table is necessary, if the table is joined via its primary index.
  • The join algorithm reads it directly.

The value contained in these indexed columns is used by Teradata to determine the VPROC which will own the data as well as a logical storage location within the VPROC’s associated disk space. Most often the physical primary index will be the same column as the logical primary key.

Secondary Indexes

Secondary indexes are used to provide faster set selection. Teradata offers either unique or non-unique index structures and can use dynamically generated intersection of multiple indexes, both primary and secondary.
If unique secondary indexes are stored in a hash index structure, the secondary index value is used to determine the Virtual AMP and the storage location. If, non-unique secondary indexes are stored as local hash structures, the primary index value is used to determine the Virtual AMP, the secondary index value is used to determine the storage location.