PXF2639 - Database Concepts

Shanghai Passiontech Tech, 021-51870017, sales@51Lm.cn

     
     
     

Database Concepts


Model: PXF2639



A database management system provides the most effective way to store data in a reliable, organized form. This removes the burden of managing data from the application code and provides several key advantages that an application would otherwise not have access to.

Figure 2, “Protect, Organize, and Share Data to Mitigate Risks”, depicts many of the risks that a database can help mitigate. Database features are clustered near related risks, and many features are optional because an application is not always susceptible to every risk. The more database features an application uses, the better protected it is against nearby risks.


Figure 2. Protect, Organize, and Share Data to Mitigate Risks

 

Reliable storage is the primary motivation for using a database. Operations on a database are grouped into transactions, which will either succeed (commit) or fail (abort) as a single unit. Because transactions are persistent, data will not become lost or corrupted even after a power failure. After a crash, data is automatically recovered up to the last committed transaction.
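The all-or-nothing behavior of a transaction can be sketched with SQLite from the Python standard library. SQLite is only a stand-in here, not the database described in this document, and the `account` table, the names, and the `transfer` helper are made up for the example.

```python
import sqlite3

# SQLite stands in for the database; table and account names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds in one transaction: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # second update violates CHECK
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The second transfer credits the destination first and then fails on the debit. Because the two statements belong to one transaction, the credit is rolled back as well, so the balances still reflect only the first transfer.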

Warning

A database cannot protect against a corrupted file system or damaged media hardware. Regular backups are the only way to provide full protection against data loss.

A database can also provide powerful tools to help organize and access data. The structure of a database is specified in the database schema, which is used as the basis for navigating the data. Data is stored in tables, each of which contains a list of typed columns. Keys are defined on certain columns of a table to quickly locate rows based on the data stored in the key columns. This fast data access through indexed key fields is an important advantage of a database management system.
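As a rough illustration of a schema with typed columns and a key, the sketch below again uses SQLite from the Python standard library as a stand-in. The `part` table and the `part_by_name` index are hypothetical names, and the query plan text is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: one table with typed columns and a primary key.
conn.execute(
    """
    CREATE TABLE part (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        price REAL NOT NULL
    )
    """
)
# A key on the name column lets rows be located without scanning the table.
conn.execute("CREATE INDEX part_by_name ON part (name)")
conn.executemany(
    "INSERT INTO part (name, price) VALUES (?, ?)",
    [(f"part-{i}", i * 0.5) for i in range(1000)],
)

query = "SELECT price FROM part WHERE name = ?"
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("part-42",)).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the detail
price = conn.execute(query, ("part-42",)).fetchone()[0]
print(plan_text)
print(price)
```

The plan reported by SQLite names `part_by_name`, which shows that the lookup goes through the index rather than reading every row.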

Because data is stored in an organized format, it is possible to perform generic queries on the data. This allows existing software tools to be used with the data stored in the database, regardless of what application it was created with. Data can also be stored in a platform-independent format so that database files can be moved between different operating systems and processor architectures.

To prevent data inconsistency, it is important to manage how multiple users modify the database at the same time. A database management system provides concurrency support to ensure that transactions committed by multiple users always appear to complete sequentially. This provides shared access to the database in a safe, efficient way.

4.1. File and Memory Storage

Where data is stored controls performance and durability.

A file storage database is saved to disk continuously. Data is organized into large pages to take advantage of block device performance characteristics. The algorithms used to access data in a file storage database offer consistent overall performance, even as the size of the database grows to exceed the size of main memory.

A memory storage database is stored primarily in memory. Direct pointers are used internally so that individual operations always complete in a predictable amount of time. For this reason, the size of a memory storage database is limited by the size of main memory.

A hybrid database uses both file and memory storage to store both disk and memory tables in the same database. In this way, applications can balance requirements for durability and performance by creating some tables as memory tables. A hybrid database is created by setting the memory storage size when opening a file storage database.

Warning

Avoid creating a memory storage that is larger than main memory. Virtual memory page faults will occur frequently in a large memory storage database. Instead, use a file storage and store some or all data in disk tables. The paging algorithms used for disk tables are specifically designed to minimize paging for table data.

The storage model can be changed easily because it does not affect the way that tables are used. The only differences are in:

  • How the application connects to the database.

  • Whether tables are created on disk or in memory. For file storage, disk tables are created by default. For memory storage, memory tables are created by default.

  • Performance characteristics.

  • What data is lost when the database is closed or an unexpected failure occurs.
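The last difference in the list above can be sketched with SQLite, whose file and `:memory:` databases serve here as a loose analogy for file and memory storage. This is not the API of the product described in this document, and the `reading` table and the `count_after_reopen` helper are hypothetical.

```python
import os
import sqlite3
import tempfile


def count_after_reopen(path):
    """Insert one committed row, close the database, reopen it, count rows."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS reading (value INTEGER)")
    conn.execute("INSERT INTO reading VALUES (1)")
    conn.commit()
    conn.close()

    # Reopen: a file database still has its data, a memory database is new.
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS reading (value INTEGER)")
    count = conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
    conn.close()
    return count


with tempfile.TemporaryDirectory() as tmp:
    file_rows = count_after_reopen(os.path.join(tmp, "example.db"))
memory_rows = count_after_reopen(":memory:")
print(file_rows, memory_rows)
```

The table is used in exactly the same way in both cases. Only the connection string and the fate of the data after closing differ.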

The fundamental difference between disk and memory storage is the index algorithms used to sort and search for rows. Disk tables use B+ tree indexes, which have a shallow tree structure to minimize disk I/O as depicted in Figure 3, “B+ Tree Index Organization”. Reading a page from disk is an expensive operation and each node in the B+ tree is a page. The organization of a B+ tree ensures that even in a very large table, any key can be located with very few page read operations. The B+ tree contains a full copy of the data in each index column, so that full rows do not need to be loaded from disk when using an index.


Figure 3. B+ Tree Index Organization
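To see why a B+ tree stays shallow, the following back-of-the-envelope sketch counts how many page levels are needed for a given number of rows. It uses a simplified model in which every page holds the same number of keys, and the figure of 100 keys per page is an assumption chosen for the example, not a value from this document.

```python
def btree_levels(row_count, keys_per_page):
    """Page levels needed so that a single root page can reach every key.

    Simplified model: every page, at every level, holds keys_per_page keys.
    """
    levels = 1
    reachable = keys_per_page
    while reachable < row_count:
        reachable *= keys_per_page
        levels += 1
    return levels


levels_small = btree_levels(10_000, 100)
levels_large = btree_levels(100_000_000, 100)
print(levels_small, levels_large)
```

Under these assumptions, growing the table by a factor of 10,000 only doubles the number of pages read to locate a key.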


Memory tables use T-tree indexes, which have a deep binary tree structure as depicted in Figure 4, “T-Tree Index Organization”. Without disk I/O, traversing the nodes of the tree is an inexpensive operation. Each node contains pointers to rows in the table rather than a full copy of the index columns.


Figure 4. T-Tree Index Organization
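The depth of such a binary tree can be estimated with a small sketch. It assumes a perfectly balanced tree, which is the best case, and a hypothetical 32 row pointers per node. Neither figure comes from this document.

```python
def ttree_depth(row_count, pointers_per_node):
    """Levels in a perfectly balanced binary tree of multi-pointer nodes."""
    node_count = -(-row_count // pointers_per_node)  # ceiling division
    # A balanced binary tree with n nodes has floor(log2(n)) + 1 levels,
    # which is exactly n.bit_length() for n >= 1.
    return node_count.bit_length()


depth = ttree_depth(1_000_000, 32)
print(depth)
```

A tree this deep would be costly if every node visit were a disk read, but in memory each step is only a pointer dereference.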

4.2. Table Type

When a table is created, one of several table types can be selected. The table type affects the overhead and performance of storing and accessing data. ITTIA DB SQL™ supports the table types listed in Table 2, “Table Types”. The default table type depends on the storage model used by the database and some storage models do not support all types of tables. The default table type for disk storage is Key Heap, while the default table type for memory storage is Memory.

Table Type        Disk Storage   Memory Storage
Key Heap Table    Default        Not Supported
Clustered Table   Supported      Not Supported
Memory Table      Supported      Default

Table 2. Table Types


The table type controls how the data is represented inside the database. However, it does not affect how the data is accessed. From an application's perspective, a table is always a collection of rows and columns with zero or more indexes.

A key heap table uses a hidden 4-byte key to identify each row. Indexes on a key heap table use a copy of this key to locate rows when fetching data, as shown in Figure 5, “Key Heap Table Organization”. One unique index can be designated as the primary key.


Figure 5. Key Heap Table Organization


Tip

The overhead of a row in a key heap table is given by the following formula: 4 bytes + 2 bytes per column + 12 bytes per index + 2 bytes per index column. To get the total size of the row, add this overhead to the size of each field that is not null, both in the table and in each index, since every index contains a copy of its indexed fields.

This calculation is valid for the current implementation and is subject to change in future releases.
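The overhead formula in this tip can be written as a small function. The function name and the way indexes are passed in are choices made for this example, and the result covers only the overhead, not the field data itself.

```python
def key_heap_row_overhead(columns, index_column_counts):
    """Per-row overhead in bytes for a key heap table.

    columns: number of columns in the table.
    index_column_counts: one entry per index, giving its number of columns.
    """
    indexes = len(index_column_counts)
    index_columns = sum(index_column_counts)
    return 4 + 2 * columns + 12 * indexes + 2 * index_columns


# Example: 5 columns, one single-column index and one two-column index.
overhead = key_heap_row_overhead(5, [1, 2])
print(overhead)  # 4 + 10 + 24 + 6 = 44
```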

A clustered table, shown in Figure 6, “Clustered Table Organization”, groups rows according to the order specified by an index. This improves performance when accessing a batch of rows with the clustering index because related rows are often stored in the same page. However, other indexes on the same table do not benefit from clustering. The clustering index is specified when the table is created and cannot be changed without dropping the table.


Figure 6. Clustered Table Organization


A clustered table has the following benefits:

  1. Performance is improved for database operations that use the clustering index to fetch a single row or a set of adjacent rows. Because the table and index are combined, such operations avoid an extra index lookup.

  2. Storage overhead is generally reduced.

A clustered table has a few drawbacks:

  1. The table must have a primary key and the application must provide a value for the primary key whenever a row is inserted. Sequence generators are available for this purpose.

  2. The primary key definition cannot be changed without dropping the table.

The clustering index must meet the following conditions:

  1. The clustering index is the table's primary key.

  2. All index fields are NOT NULL.

  3. No variable-width index field exceeds 255 bytes.

  4. The total size of the index fields is at most (32 - v) bytes, where v is the number of variable-width index fields.
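The four conditions above can be checked mechanically. The sketch below is one way to do so. The `IndexField` class and the function name are hypothetical, and it assumes that the size of a variable-width field means its declared maximum size in bytes, which this document does not spell out.

```python
from dataclasses import dataclass


@dataclass
class IndexField:
    name: str
    max_bytes: int               # assumed: declared maximum size in bytes
    variable_width: bool = False
    nullable: bool = False


def clustering_index_problems(fields, is_primary_key):
    """Return a list of violated conditions; an empty list means valid."""
    problems = []
    if not is_primary_key:
        problems.append("index is not the primary key")
    for field in fields:
        if field.nullable:
            problems.append(f"{field.name} allows NULL")
        if field.variable_width and field.max_bytes > 255:
            problems.append(f"{field.name} exceeds 255 bytes")
    v = sum(1 for field in fields if field.variable_width)
    total = sum(field.max_bytes for field in fields)
    if total > 32 - v:
        problems.append(f"total size {total} exceeds {32 - v} bytes")
    return problems


good = clustering_index_problems(
    [IndexField("id", 4), IndexField("code", 16, variable_width=True)],
    is_primary_key=True,
)
bad = clustering_index_problems(
    [IndexField("label", 40, variable_width=True, nullable=True)],
    is_primary_key=False,
)
print(good)
print(bad)
```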

Tip

The overhead of a row in a clustered table is given by the following formula: 2 bytes per column + 36 bytes per non-clustered index + 2 bytes per non-clustered index column.

This calculation is valid for the current implementation and is subject to change in future releases.

The most efficient table type depends on the number of indexes created on the table and the size of the primary key. A clustered table has the most compact representation for a table with only one index or an integer primary key, while a key heap table is more compact when there are many indexes and a large primary key.
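The trade-off can be made concrete by evaluating the two overhead formulas from the tips in this section side by side. The function names and the five-column example table are assumptions, and the figures cover only per-row overhead, not the field data stored in the table or copied into indexes.

```python
def key_heap_overhead(columns, index_column_counts):
    """4 bytes + 2 per column + 12 per index + 2 per index column."""
    return (4 + 2 * columns
            + 12 * len(index_column_counts)
            + 2 * sum(index_column_counts))


def clustered_overhead(columns, secondary_index_column_counts):
    """2 per column + 36 per non-clustered index + 2 per such index column."""
    return (2 * columns
            + 36 * len(secondary_index_column_counts)
            + 2 * sum(secondary_index_column_counts))


# Five columns with only a single-column primary key.
one_index = (key_heap_overhead(5, [1]), clustered_overhead(5, []))

# The same table with three additional single-column indexes.
many_indexes = (
    key_heap_overhead(5, [1, 1, 1, 1]),
    clustered_overhead(5, [1, 1, 1]),
)
print(one_index, many_indexes)
```

With only the primary key, the clustered table carries less overhead per row (10 bytes against 28). With three extra indexes the relationship reverses (124 bytes against 70), which matches the guidance above.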


url: http://www.51lm.cn/p/templates/cn/show.php?cid=905&aid=2639

 
