-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathintagg.sgml
119 lines (95 loc) · 3.72 KB
/
intagg.sgml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
<!-- doc/src/sgml/intagg.sgml -->
<sect1 id="intagg" xreflabel="intagg">
<title>intagg</title>
<indexterm zone="intagg">
<primary>intagg</primary>
</indexterm>
<para>
The <filename>intagg</filename> module provides an integer aggregator and an
enumerator. <filename>intagg</filename> is now obsolete, because there
are built-in functions that provide a superset of its capabilities.
However, the module is still provided as a compatibility wrapper around
the built-in functions.
</para>
<sect2>
<title>Functions</title>
<para>
The aggregator is an aggregate function
<function>int_array_aggregate(integer)</>
that produces an integer array
containing exactly the integers it is fed.
This is a wrapper around <function>array_agg</>,
which does the same thing for any array type.
</para>
<para>
The enumerator is a function
<function>int_array_enum(integer[])</>
that returns <type>setof integer</>. It is essentially the reverse
operation of the aggregator: given an array of integers, expand it
into a set of rows. This is a wrapper around <function>unnest</>,
which does the same thing for any array type.
</para>
</sect2>
<sect2>
<title>Sample Uses</title>
<para>
Many database systems have the notion of a one to many table. Such a table
usually sits between two indexed tables, for example:
<programlisting>
CREATE TABLE left (id INT PRIMARY KEY, ...);
CREATE TABLE right (id INT PRIMARY KEY, ...);
CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
</programlisting>
It is typically used like this:
<programlisting>
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = <replaceable>item</>;
</programlisting>
This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
</para>
<para>
Now, this methodology can be cumbersome with a very large number of
entries in the <structname>one_to_many</> table. Often,
a join like this would result in an index scan
and a fetch for each right hand entry in the table for a particular
left hand entry. If you have a very dynamic system, there is not much you
can do. However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
<programlisting>
CREATE TABLE summary AS
SELECT left, int_array_aggregate(right) AS right
FROM one_to_many
GROUP BY left;
</programlisting>
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
the array; that's why there is an array enumerator. You can do
<programlisting>
SELECT left, int_array_enum(right) FROM summary WHERE left = <replaceable>item</>;
</programlisting>
The above query using <function>int_array_enum</> produces the same results
as
<programlisting>
SELECT left, right FROM one_to_many WHERE left = <replaceable>item</>;
</programlisting>
The difference is that the query against the summary table has to get
only one row from the table, whereas the direct query against
<structname>one_to_many</> must index scan and fetch a row for each entry.
</para>
<para>
On one system, an <command>EXPLAIN</> showed a query with a cost of 8488 was
reduced to a cost of 329. The original query was a join involving the
<structname>one_to_many</> table, which was replaced by:
<programlisting>
SELECT right, count(right) FROM
( SELECT left, int_array_enum(right) AS right
FROM summary JOIN (SELECT left FROM left_table WHERE left = <replaceable>item</>) AS lefts
ON (summary.left = lefts.left)
) AS list
GROUP BY right
ORDER BY count DESC;
</programlisting>
</para>
</sect2>
</sect1>