Joining DB records to the ‘highest’ of their related (1:m) data

I often have cause to query a database table and join it with a subtable of related data (i.e. a table with which this one has a one-to-many relationship), and to retrieve only the highest (or lowest) of related records (based on some sortable field; often a date). Usually in MySQL, and often to be used in a view and so sub-queries are not an option.

The key is to

  1. Outer join with the subtable twice (well, the first join can be inner, if a subrecord result is required);
  2. add, to the second join condition, the constraint that this second-subtable’s sorting field must be greater than its first-subtable counterpart; and
  3. add a where condition that the second-subtable’s primary key be null.

Why this is so hard to remember, I do not know.

For example, for these two tables

records table
id name
1 Record One
2 Record Two
3 Record Three
subrecords table
id record_id date
1 1 2013-01-10
2 1 2013-01-09
3 1 2013-01-05
4 3 2013-01-10

the SQL is

SELECT *
FROM records r
    LEFT JOIN subrecords s1 ON (s1.record_id = r.id)
    LEFT JOIN subrecords s2 ON (s2.record_id = r.id AND s2.date > s1.date)
WHERE s2.id IS NULL
ORDER BY r.id

to give a result of

r s1 s2
id name id record_id date id record_id date
1 Record One 1 1 2013-01-10 NULL NULL NULL
2 Record Two NULL NULL NULL NULL NULL NULL
3 Record Three 4 3 2013-01-10 NULL NULL NULL

Of course the list of selected columns needs to be changed.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.