Results 1 to 21 of 21

Thread: SQL query to join multiple tables

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,502
    Blog Entries
    13
    Perhaps your initial premise is incorrect ? I'm not sure your original results were entirely accurate assuming you only want distinct grouping by datetime - otherwise ignore the following as I don't think it can be done much better than you have..! If we break down the 12 rows into 6 groups of 2 then the only 'groups' you want are the first two - correct ?

    Thus the following query would give you that.

    mysql> select t1.bID,t1.val_timestamp,t1.val_float,t2.val_int from testAND t1 inner join testAND t2 on t1.bID = t2.bID and t1.val_timestamp = t2.val_timestamp where (t1.cID = 1 and t1.val_float between 1.00 and 2.00) and t2.cID = 2 and (t2.val_int < 120 or t2.val_int > 140) ;
    +------+---------------------+-----------+---------+
    | bID | val_timestamp | val_float | val_int |
    +------+---------------------+-----------+---------+
    | 1 | 2009-12-14 00:22:33 | 1.05 | 105 |
    | 2 | 2009-12-14 00:22:33 | 1.95 | 115 |
    +------+---------------------+-----------+---------+
    2 rows in set (0.01 sec)

    grouping 1 (aID=1,2) should be selected
    grouping 2 (aID=3,4) should be selected
    grouping 3 (aID=5,6) should not be selected as val_int = 130
    grouping 4 (aID 7,8) should not be selected as val_float = 0.5
    grouping 5 (aID 9,10) should not be selected as val_int = 125
    grouping 6 (aID 11,12) should not be selected as val_float = 2.55

    and explain is better (though I did not define any indexes)

    mysql> explain select t1.bID,t1.val_timestamp,t1.val_float,t2.val_int from testAND t1 inner join testAND t2 on t1.bID = t2.bID and t1.val_timestamp = t2.val_timestamp where (t1.cID = 1 and t1.val_float between 1.00 and 2.00) and t2.cID = 2 and (t2.val_int < 120 or t2.val_int > 140) ;
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 12 | Using where |
    | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 12 | Using where |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    2 rows in set (0.00 sec)

  2. #2
    Quote Originally Posted by Bok View Post
    Perhaps your initial premise is incorrect ? I'm not sure your original results were entirely accurate assuming you only want distinct grouping by datetime - otherwise ignore the following as I don't think it can be done much better than you have..! If we break down the 12 rows into 6 groups of 2 then the only 'groups' you want are the first two - correct ?
    Sorry this is so confusing, I'm not doing a great job of explaining things. I shouldn't have put the timestamps so closely together to match. I'm not actually trying to match timestamps, every timestamp could and probably would be different.

    I just retried my original query without the GROUP BY and it gives the same results so you are right, I don't actually need those. So if you have a table where every timestamp is different such as:

    Code:
    create table testAND (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_int int, val_float float);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (1, 1, 1, '2009-12-14 00:22:33', NULL, 1.05);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (2, 1, 2, '2009-12-14 00:22:34', 105, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (3, 2, 1, '2009-12-14 00:22:35', NULL, 1.95);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (4, 2, 2, '2009-12-14 00:22:36', 115, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (5, 3, 1, '2009-12-14 00:22:37', NULL, 1.15);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (6, 3, 2, '2009-12-14 00:22:38', 130, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (7, 1, 1, '2009-12-14 00:22:44', NULL, 0.5);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (8, 1, 2, '2009-12-14 00:22:45', 150, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (9, 2, 1, '2009-12-14 00:22:46', NULL, 1.25);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (10, 2, 2, '2009-12-14 00:22:47', 125, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (11, 2, 2, '2009-12-14 00:22:55', 25, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (12, 2, 1, '2009-12-14 00:22:56', NULL, 2.55);
    Then I can run this SQL query:

    Code:
    SELECT t1.bID, t1.cID, t1.val_timestamp, t1.val_int, t1.val_float 
    FROM testand t1 INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    UNION
    SELECT t2.bID, t2.cID, t2.val_timestamp, t2.val_int, t2.val_float 
    FROM testand t1 INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    ORDER BY bID, cID;
    And I would get these results:
    Code:
    +------+------+---------------------+---------+-----------+
    | bID  | cID  | val_timestamp       | val_int | val_float |
    +------+------+---------------------+---------+-----------+
    |    1 |    1 | 2009-12-14 00:22:33 |    NULL |      1.05 |
    |    1 |    2 | 2009-12-14 00:22:34 |     105 |      NULL |
    |    1 |    2 | 2009-12-14 00:22:45 |     150 |      NULL |
    |    2 |    1 | 2009-12-14 00:22:35 |    NULL |      1.95 |
    |    2 |    1 | 2009-12-14 00:22:46 |    NULL |      1.25 |
    |    2 |    2 | 2009-12-14 00:22:36 |     115 |      NULL |
    |    2 |    2 | 2009-12-14 00:22:55 |      25 |      NULL |
    +------+------+---------------------+---------+-----------+
    7 rows in set (0.22 sec)
    Which is those values that are in the correct ranges and have at least one match in BOTH of the attributes. So with that, you are saying Bok that I can't really do much better then? Too bad MySQL doesn't support the INTERSECT statement since that seems to be exactly what I want to do, take two SQL queries, one searching for heart rate and one for temperature and intersect them so only ones that have patients with both show up.

    Thanks for all your help Bok and gopher_yarrowzoo, I really do appreciate it.

    Jeff.

  3. #3
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,502
    Blog Entries
    13
    UNION will always produces a filesort unfortunately. But perhaps this does a little better

    mysql> select * from testAND t1 where cID = 1 and t1.val_float between 1.00 and 2.00 and
    exists (select 1 from testAND t2 where t1.bID = t2.bID and t2.cID=2 and (t2.val_int < 120 or t2.val_int > 140))
    union
    select * from testAND t1 where cID = 2 and t1.val_int < 120 or t1.val_int > 140 and
    exists (select 1 from testAND t2 where t1.bID = t2.bID and t2.cID = 1 and (t2.val_float between 1.00 and 2.00)) order by bID,cID;
    +-----+------+------+---------------------+---------+-----------+
    | aID | bID | cID | val_timestamp | val_int | val_float |
    +-----+------+------+---------------------+---------+-----------+
    | 1 | 1 | 1 | 2009-12-14 00:22:33 | NULL | 1.05 |
    | 2 | 1 | 2 | 2009-12-14 00:22:33 | 105 | NULL |
    | 8 | 1 | 2 | 2009-12-14 00:22:44 | 150 | NULL |
    | 3 | 2 | 1 | 2009-12-14 00:22:33 | NULL | 1.95 |
    | 9 | 2 | 1 | 2009-12-14 00:22:44 | NULL | 1.25 |
    | 4 | 2 | 2 | 2009-12-14 00:22:33 | 115 | NULL |
    | 11 | 2 | 2 | 2009-12-14 00:22:55 | 25 | NULL |
    +-----+------+------+---------------------+---------+-----------+
    7 rows in set (0.00 sec)


    My head hurts.
    Last edited by Bok; 04-07-2010 at 12:20 PM. Reason: formatted a little better for clarity.

  4. #4
    Quote Originally Posted by Bok View Post
    UNION will always produces a filesort unfortunately. But perhaps this does a little better
    Interesting alternative and it does return the same thing. In my DB with 5 million rows, this query you just posted takes 12.641 sec and the previous one that I posted takes 12.125 sec so essentially the same speed.

    So I guess doing 2 SELECTS, 2 JOINS, and 1 UNION take slightly less time than doing 4 SELECTS, and 1 UNION.

  5. #5
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,502
    Blog Entries
    13
    I'd bet >90% is on the filesort in either case and it may well be skewed once it's in the query cache.

    12 secs is pretty decent for 5M rows anyway.

  6. #6
    Quote Originally Posted by Bok View Post
    I'd bet >90% is on the filesort in either case and it may well be skewed once it's in the query cache.

    12 secs is pretty decent for 5M rows anyway.
    I have query caching disabled but yes 12 secs is pretty decent. For fun I took out the UNION and timed the two SELECT statements separately.

    SELECT1 = 4.672 sec
    SELECT2 = 3.469 sec
    TOTAL = 8.141 sec

    Of course you can't really do that since the second SELECT by itself returns the same number of rows as the first most of which is duplicated. But it looks like the UNION part takes 4 seconds.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •