Skip to content

What indexes should history tables have? #228

@rod-glover

Description

@rod-glover

What indexes should history tables have?

Fundamental questions

The fundamental question here is: How will the history tables be used? That is, how will they be queried and what indexes will make those queries more efficient?

Let's consider a prime use case: Query the history tables in a way that reflects a past state of the database.

  • There are some straightforward conditions I've formulated that define what a true past state is, meaning whether a set of history records taken together reflects a genuine historical state of the database, as opposed to a mixed collection of records from several different past states.
  • Those conditions amount to constraints on history table primary keys.

Then the question becomes, how do the queries on primary tables map over to equivalent queries on the history tables?

Let's consider some elementary ones:

  • What stations and station history records are associated with a given network?
  • What variables are associated with a given network?
  • What observations are associated with a given station, for a specified time period? Group these by variable, and order by observation time.

Existing table indexes

Let's first consider the indexes on the existing ("primary") tables, ignoring the indexes that are used to implement unique constraints (including primary keys). We have the following:

meta_network

  • none

meta_station

  • (network_id)

meta_history

  • (station_id)
  • (freq)

meta_vars

  • (network_id)

obs_raw

  • (mod_time)
  • (obs_raw_id, obs_time, vars_id, history_id)
  • (history_id)
  • (obs_raw_id)

General rule: Index each FK. This is unsurprising.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions