Postgresql - Finding unused indexes
Saturday, September 24, 2011 at 8:39PM
Matthew Cooke

Database indexes are very useful, but the down side is that updating or adding rows to tables becomes more intensive as all the indexes also need updating.

If you are using postgresql then you probably already have access to information on index usage as this information is usually collected by default - if not, you can find more information here in the postgresql docs on statistics collection.

This query (tested on PG9) gives some information on index usage and index size:

SELECT t.tablename, 
  indexname, 
  c.reltuples AS num_rows, 
  pg_size_pretty ( pg_relation_size ( t.tablename::text ) ) AS table_size, 
  pg_size_pretty ( pg_relation_size ( indexrelname::text ) ) AS index_size, 
  CASE 
    WHEN x.is_unique = 1 
    THEN 'Y' 
    ELSE 'N' 
  END AS UNIQUE, 
  idx_scan AS number_of_scans, 
  idx_tup_read AS tuples_read, 
  idx_tup_fetch AS tuples_fetched 
FROM pg_tables t 
LEFT 
OUTER JOIN pg_class c 
  ON t.tablename = c.relname 
LEFT 
OUTER JOIN ( 
      SELECT indrelid, 
        max ( CAST ( indisunique AS integer ) ) AS is_unique 
      FROM pg_index 
      GROUP BY indrelid ) x 
  ON c.oid = x.indrelid 
LEFT 
OUTER JOIN ( 
      SELECT c.relname AS ctablename, 
        ipg.relname AS indexname, 
        x.indnatts AS number_of_columns, 
        idx_scan, 
        idx_tup_read, 
        idx_tup_fetch, 
        indexrelname 
      FROM pg_index x JOIN pg_class c 
        ON c.oid = x.indrelid JOIN pg_class ipg 
        ON ipg.oid = x.indexrelid JOIN pg_stat_all_indexes psai 
        ON x.indexrelid = psai.indexrelid ) AS foo 
  ON t.tablename = foo.ctablename 
WHERE t.schemaname = 'public' 
ORDER BY number_of_scans desc; 

If you assume that all indexes that are needed have been used at least once, this next query tells you how big all the unused indexes are:

SELECT pg_size_pretty ( sum ( pg_relation_size ( indexrelname::text ) ) ::bigint ) AS index_size 
FROM pg_tables t 
LEFT 
OUTER JOIN pg_class c 
  ON t.tablename = c.relname 
LEFT 
OUTER JOIN ( 
      SELECT indrelid, 
        max ( CAST ( indisunique AS integer ) ) AS is_unique 
      FROM pg_index 
      GROUP BY indrelid ) x 
  ON c.oid = x.indrelid 
LEFT 
OUTER JOIN ( 
      SELECT c.relname AS ctablename, 
        ipg.relname AS indexname, 
        x.indnatts AS number_of_columns, 
        idx_scan, 
        idx_tup_read, 
        idx_tup_fetch, 
        indexrelname 
      FROM pg_index x JOIN pg_class c 
        ON c.oid = x.indrelid JOIN pg_class ipg 
        ON ipg.oid = x.indexrelid JOIN pg_stat_all_indexes psai 
        ON x.indexrelid = psai.indexrelid ) AS foo 
  ON t.tablename = foo.ctablename 
WHERE t.schemaname = 'public' 
  AND idx_scan =0; 

You can inverse that last query to get the total size of all the in-use indexes by changing the idx_scan=0 to idx_scan>0. If you are heavily reliant on the database fitting into physical RAM (as I usually am) then you may want to keep an eye on the index size, along with the size of your tables.

If you are looking to reduce the size of the indexes you could look at some of your biggest indexes and think about the queries that actually run against the table - do you really need to index the *whole* table for your queries? If not, you could look at changing the index to a partial index which is basically an index which only covers part of a table based on a where clause.

There are a few other things that can help get down index size , ensure the auto vacuum setting are aggressive enough, dump-restore your database periodically, delete rows in the underlying table, etc, :)

Article originally appeared on Gridfire - Matt Cooke (http://www.gridfire.com/).
See website for complete article licensing information.