Skip to content

Latest commit

 

History

History
76 lines (49 loc) · 3.37 KB

15. Indexing.md

File metadata and controls

76 lines (49 loc) · 3.37 KB

INDEXING

Indexing in database management systems is a technique used to enhance the retrieval performance of data from large datasets. It involves creating a data structure, known as an index, that provides a quick reference to the physical location of data in a database table.

In a relational database, data is stored in tables consisting of rows and columns. When a query is executed to retrieve specific data, the DBMS can use indexes to avoid scanning the entire table and instead perform a more efficient direct lookup.

Purpose of Indexing

The main purpose of indexing is to speed up data retrieval operations, particularly when dealing with large datasets, by reducing the number of disk reads needed to locate and access the desired data.

INDEX STRUCTURE

An index is like a separate data structure, typically implemented as a B-tree or a hash table. It contains key-column values from the table along with pointers to the corresponding rows' physical locations.

A search key is a unique value or combination of values used to identify records. A data reference is a pointer stored in the index that leads to the physical location of the corresponding data in the database table.

SORTING OF INDEX FILES

Clustered Index
Always sorted, as it physically arranges data rows in the table based on the indexed column's values.

Non-clustered Index
The index itself is sorted based on the indexed column(s), but the data rows in the table remain in their original order.

Hash Index
Not sorted, as it uses a hashing function to map indexed values to specific buckets.

B-tree Index
Always sorted, as it is a sorted data structure optimized for range queries and supports various data types.

INDEXING METHODS

image

DENSE AND SPARSE INDEX

image

FEATURES OF INDEX FILE

  1. Quick data retrieval with direct access paths.
  2. Efficient search optimization and filtering.
  3. Reduced disk I/O, faster query execution.
  4. Enforces data uniqueness (primary index).
  5. Supports join optimization.
  6. Compact and automated maintenance.
  7. Various index types for flexibility.
  8. Scalability and performance optimization.
  9. Enables range queries.
  10. Ensures data consistency and integrity.

DISADVANTAGES OF INDEX FILE

  1. Increased storage overhead.
  2. Slower data modification operations (insert, update, delete).
  3. Index maintenance overhead.
  4. Complexity in database design.
  5. Longer query planning times.
  6. Potential index fragmentation.
  7. Index selection overhead.
  8. Less effective for low selectivity columns.
  9. Overhead on columns with frequent data modifications.
  10. Indexing may not benefit all query patterns.

EXAMPLE

image

"Students" table contains four columns: StudentID, Name, Age, and Grade, with three rows of sample data.

image

The "Index on StudentID" table shows the index's content, with the StudentID column as the search key and the Data Reference column as the pointer to the respective rows in the "Students" table.