Modern enterprises frequently run mission-critical databases containing upwards of several hundred gigabytes and, in some cases, several terabytes of data. These enterprises are challenged by the support and maintenance requirements of very large databases (VLDB), and must devise methods to meet those challenges.
Introduction to Partitioning
Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller, more manageable pieces called partitions. Partitioning is entirely transparent to applications.
SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individual partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of enormous database objects.
Each partition of a table or index must have the same logical attributes, such as column names, data types, and constraints, but each partition can have separate physical attributes, such as compression enabled or disabled, physical storage settings, and tablespaces.
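The split between shared logical attributes and per-partition physical attributes can be illustrated with a small in-memory model. This is only a sketch and not how any database implements partitioning: the class names, the `tablespace` and `compressed` fields, and the sample `sales` table are all hypothetical, and range partitioning on a numeric key is assumed for simplicity. It shows that the caller inserts and reads rows without naming a partition, while each partition carries its own physical settings.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Partition:
    name: str
    upper_bound: int          # exclusive upper bound on the partition key
    tablespace: str           # physical attribute, may differ per partition
    compressed: bool = False  # physical attribute, may differ per partition
    rows: list = field(default_factory=list)


class RangePartitionedTable:
    """Toy model: one logical table, several physical partitions."""

    def __init__(self, columns, key, partitions):
        self.columns = tuple(columns)  # logical attributes, shared by all partitions
        self.key = key
        self.partitions = sorted(partitions, key=lambda p: p.upper_bound)
        self._bounds = [p.upper_bound for p in self.partitions]

    def insert(self, row):
        # The caller never names a partition; routing is decided by the key value.
        if set(row) != set(self.columns):
            raise ValueError("row does not match the table's columns")
        idx = bisect_right(self._bounds, row[self.key])
        if idx == len(self.partitions):
            raise ValueError("key is beyond the highest partition bound")
        self.partitions[idx].rows.append(row)

    def select_all(self):
        # Reads also see a single table, regardless of how rows are stored.
        return [row for p in self.partitions for row in p.rows]


sales = RangePartitionedTable(
    columns=("order_id", "year", "amount"),
    key="year",
    partitions=[
        Partition("sales_2022", 2023, tablespace="ts_archive", compressed=True),
        Partition("sales_2023", 2024, tablespace="ts_current"),
    ],
)
sales.insert({"order_id": 1, "year": 2022, "amount": 50})
sales.insert({"order_id": 2, "year": 2023, "amount": 75})
```

In this model the older partition is compressed and placed in a different (hypothetical) tablespace, yet both partitions accept only rows with the same columns.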
Partitioning is helpful for many different types of applications, particularly applications that manage large volumes of data. OLTP systems often benefit from improvements in manageability and availability, while data warehousing systems benefit from performance and manageability.
Partitioning offers these advantages:
- It enables data management operations such as data loads, index creation and rebuilding, and backup/recovery at the partition level, rather than on the entire table. This results in significantly reduced times for these operations.
- It improves query performance. In many cases, the results of a query can be achieved by accessing a subset of partitions, as opposed to the entire table. For some queries, this technique (called partition pruning) can provide order-of-magnitude gains in performance.
- It significantly reduces the impact of scheduled downtime for maintenance operations.
- Partition independence for partition maintenance operations lets you perform concurrent maintenance operations on different partitions of the same table or index. You can also run concurrent SELECT and DML operations against partitions that are unaffected by maintenance operations.
- It increases the availability of mission-critical databases if critical tables and indexes are divided into partitions to reduce the maintenance windows, recovery times, and impact of failures.
- Parallel execution provides specific advantages to optimize resource utilization, and minimize execution time. Parallel execution against partitioned objects is key for scalability in a clustered environment. Parallel execution is supported for queries as well as for DML and DDL.
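The idea behind partition pruning, mentioned in the list above, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the optimizer's actual algorithm: the `prune` and `query` functions, the quarterly partitions keyed on a day-of-year column, and the sample rows are all hypothetical. Each partition records the half-open key range it covers, and a range query scans only the partitions whose range overlaps the predicate.

```python
def prune(partitions, lo, hi):
    """Return names of partitions whose key range [p_lo, p_hi) overlaps [lo, hi)."""
    return [
        name
        for name, (p_lo, p_hi, _rows) in partitions.items()
        if p_lo < hi and lo < p_hi
    ]


def query(partitions, lo, hi):
    """Scan only the surviving partitions and filter their rows by the predicate."""
    scanned = prune(partitions, lo, hi)
    rows = [
        row
        for name in scanned
        for row in partitions[name][2]
        if lo <= row["day"] < hi
    ]
    return rows, scanned


# Hypothetical table partitioned by quarter on a day-of-year key.
by_quarter = {
    "q1": (1, 91, [{"day": 15, "amount": 10}, {"day": 80, "amount": 20}]),
    "q2": (91, 182, [{"day": 100, "amount": 30}, {"day": 150, "amount": 40}]),
    "q3": (182, 274, [{"day": 200, "amount": 50}]),
    "q4": (274, 366, [{"day": 300, "amount": 60}]),
}

rows, scanned = query(by_quarter, 95, 160)
total = sum(row["amount"] for row in rows)
```

Here the predicate covers days 95 through 159, so only the `q2` partition is read and the other three are skipped. The same per-partition structure is what would let maintenance or parallel work target one partition while the others remain untouched.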
VLDB and Partitioning
A very large database has no minimum absolute size. Although a VLDB is a database like any smaller database, managing one poses specific challenges. These challenges stem from the sheer size of the system and from the cost-effectiveness of performing operations that are taken for granted on smaller databases.
Several trends are responsible for the steady growth in database size:
- For a long time, systems have been developed in isolation. Companies have started to see the benefits of combining these systems to enable cross-departmental analysis while reducing system maintenance costs. Consolidation of databases and applications is a key factor in the ongoing growth of database size.
- Many businesses face regulations that set specific requirements for storing data for a minimum amount of time. The regulations generally result in more data being stored for longer periods of time.
- Companies grow organically and through mergers and acquisitions, causing the amount of generated and processed data to increase. At the same time, the user population that relies on the database for daily activities increases.
Partitioning is a critical feature for managing very large databases. Growth is the basic challenge that partitioning addresses: it enables a “divide and conquer” technique for managing the tables and indexes in the database, especially as those tables and indexes grow. Partitioning is the feature that permits a database to scale for very large data sets while maintaining consistent performance, without unduly increasing administrative or hardware resources.
Partitioning for Every Database
The benefits of partitioning are not limited to very large databases; every database, even a small one, can benefit from partitioning. While partitioning is a necessity for the largest databases in the world, it is also beneficial for smaller databases. Even a database whose size is measured in megabytes can gain the same types of performance and manageability benefits from partitioning as the largest multi-terabyte systems.