Results 1 to 21 of 21

Thread: SQL query to join multiple tables

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,501
    Blog Entries
    13
    Quote Originally Posted by Digital Parasite View Post
    That does not give the correct results. First, I'm not trying to find values between 120 and 140, but less than 120 and greater than 140, so everything except between 120 and 140.
    my bad on that one, misread it.



    The last row, bID=3 should not be there, while 1.15 is between 1.00 and 2.00 the cID=2 val_int is 130 which is not in the valid range. Since I'm looking for results only when both cID=1 AND cID=2 are true, this is returning too much information. I need to remove the cases where they are not both true.
    How can cID = 1 and 2 in the same row?

    Does that explain better?
    Nope, more confused than ever, perhaps it's the dang heat here >90F right now..!!

    The reason why I was using group by in my statement was to get rid of the multiple rows in the join statement. For every val_int it prints multiple val_floats, so using GROUP BY I was trying to reduce it to a single val_int or val_float for each row.

  2. #2
    Quote Originally Posted by Bok View Post
    my bad on that one, misread it.

    How can cID = 1 and 2 in the same row?

    Nope, more confused than ever, perhaps it's the dang heat here >90F right now..!!
    cID can't be 1 and 2 in the same row. Let me convert the problem to english words and simplify it to remove the ranges.

    bID = Patient
    cID is an Attribute
    cID = 1 for example means temperature
    cID = 2 for example means heart rate

    What I'm trying to find is: show all Patients that have a temperature of 1.95 AND a heart rate of 150, and list those temperatures and heart rates with timestamps.

    So Temperature and Heart Rate are not listed on the same row which is what makes this tricky, I need to see if they match the pattern I'm looking for and then only display the ones that match the pattern. In my example Patient #3 (bID = 3) has a temperature that matches the pattern but his heart rate does not so I don't want to list him or the details.

    Does that clarify anything? Wow, big heat, we were up to 85F on the weekend which is very out of the ordinary for the beginning of April way up north here.
    Last edited by Digital Parasite; 04-07-2010 at 09:18 AM.

  3. #3
    Free-DC Semi-retire gopher_yarrowzoo's Avatar
    Join Date
    Mar 2002
    Location
    Knoxville, TN
    Posts
    3,985
    That does make it rather tricky to say the least..
    I take it the table is in the from ID,cID,Data,.... and is such that ID,1,Temp then next row is ID,2,Heartrate ?
    See if it is that I'd be tempted to create a temp table / array
    thus
    table / array [id][timestamp] [heartrate] ,[id][timestamp] [temp], [id][timestamp] [other]
    table ID, Timestamp, Temp, Heartrate, other data .....
    filled as if ID, timestamp, cID==1 fill Temp || cID==2 fill Heartrate, other data

    query is the something like SELECT * from temp_table WHERE !(Temp>=120 && Temp<=140) && Heartrate is in range.
    then the array is simple.
    -TABLE-
    ID | timestamp | Temp | Heart Rate | Other Data
    run the array and display it 1 per row.
    Semi-retired from Free-DC...
    I have some time to help.....
    I need a new laptop,but who needs a laptop when you have a phone...
    Now to remember my old computer specs..


  4. #4
    Quote Originally Posted by gopher_yarrowzoo View Post
    That does make it rather tricky to say the least..
    I take it the table is in the from ID,cID,Data,.... and is such that ID,1,Temp then next row is ID,2,Heartrate ?
    See if it is that I'd be tempted to create a temp table / array
    thus
    table / array [id][timestamp] [heartrate] ,[id][timestamp] [temp], [id][timestamp] [other]
    table ID, Timestamp, Temp, Heartrate, other data .....
    filled as if ID, timestamp, cID==1 fill Temp || cID==2 fill Heartrate, other data
    You can see the exact table form in message #9 with the create statement on top: http://www.free-dc.org/forum/showpos...68&postcount=9

    Yes, we have PatientID,AttributeID,Integer Data,FloatingPoint Data.

    This table currently has 10+ million rows in it and growing. How much time/effort does it take to create a temp table in the fashion you suggest for a table that size? I have never used arrays before, what benefit is that over just creating a temporary table? I just did a google search for MySQL and arrays but don't really find any information.

    So you suggest that I create a TEMP table and then use INSERT INTO statements from the main table and filter into a "Temp" and "Heart Rate" column the right values after which I can use a simpler SQL statement to find the values?

    I think I still see a problem though, there can be a different number of temperatures and heart rates so for each patient they won't evenly fit into rows. So you would still have the case where you need to look at multiple rows to see if both the heart rate and temperatures were still in range?

  5. #5
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,501
    Blog Entries
    13
    Perhaps your initial premise is incorrect ? I'm not sure your original results were entirely accurate assuming you only want distinct grouping by datetime - otherwise ignore the following as I don't think it can be done much better than you have..! If we break down the 12 rows into 6 groups of 2 then the only 'groups' you want are the first two - correct ?

    Thus the following query would give you that.

    mysql> select t1.bID,t1.val_timestamp,t1.val_float,t2.val_int from testAND t1 inner join testAND t2 on t1.bID = t2.bID and t1.val_timestamp = t2.val_timestamp where (t1.cID = 1 and t1.val_float between 1.00 and 2.00) and t2.cID = 2 and (t2.val_int < 120 or t2.val_int > 140) ;
    +------+---------------------+-----------+---------+
    | bID | val_timestamp | val_float | val_int |
    +------+---------------------+-----------+---------+
    | 1 | 2009-12-14 00:22:33 | 1.05 | 105 |
    | 2 | 2009-12-14 00:22:33 | 1.95 | 115 |
    +------+---------------------+-----------+---------+
    2 rows in set (0.01 sec)

    grouping 1 (aID=1,2) should be selected
    grouping 2 (aID=3,4) should be selected
    grouping 3 (aID=5,6) should not be selected as val_int = 130
    grouping 4 (aID 7,8) should not be selected as val_float = 0.5
    grouping 5 (aID 9,10) should not be selected as val_int = 125
    grouping 6 (aID 11,12) should not be selected as val_float = 2.55

    and explain is better (though I did not define any indexes)

    mysql> explain select t1.bID,t1.val_timestamp,t1.val_float,t2.val_int from testAND t1 inner join testAND t2 on t1.bID = t2.bID and t1.val_timestamp = t2.val_timestamp where (t1.cID = 1 and t1.val_float between 1.00 and 2.00) and t2.cID = 2 and (t2.val_int < 120 or t2.val_int > 140) ;
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 12 | Using where |
    | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 12 | Using where |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    2 rows in set (0.00 sec)

  6. #6
    Quote Originally Posted by Bok View Post
    Perhaps your initial premise is incorrect ? I'm not sure your original results were entirely accurate assuming you only want distinct grouping by datetime - otherwise ignore the following as I don't think it can be done much better than you have..! If we break down the 12 rows into 6 groups of 2 then the only 'groups' you want are the first two - correct ?
    Sorry this is so confusing, I'm not doing a great job of explaining things. I shouldn't have put the timestamps so closely together to match. I'm not actually trying to match timestamps, every timestamp could and probably would be different.

    I just retried my original query without the GROUP BY and it gives the same results so you are right, I don't actually need those. So if you have a table where every timestamp is different such as:

    Code:
    create table testAND (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_int int, val_float float);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (1, 1, 1, '2009-12-14 00:22:33', NULL, 1.05);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (2, 1, 2, '2009-12-14 00:22:34', 105, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (3, 2, 1, '2009-12-14 00:22:35', NULL, 1.95);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (4, 2, 2, '2009-12-14 00:22:36', 115, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (5, 3, 1, '2009-12-14 00:22:37', NULL, 1.15);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (6, 3, 2, '2009-12-14 00:22:38', 130, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (7, 1, 1, '2009-12-14 00:22:44', NULL, 0.5);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (8, 1, 2, '2009-12-14 00:22:45', 150, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (9, 2, 1, '2009-12-14 00:22:46', NULL, 1.25);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (10, 2, 2, '2009-12-14 00:22:47', 125, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (11, 2, 2, '2009-12-14 00:22:55', 25, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (12, 2, 1, '2009-12-14 00:22:56', NULL, 2.55);
    Then I can run this SQL query:

    Code:
    SELECT t1.bID, t1.cID, t1.val_timestamp, t1.val_int, t1.val_float 
    FROM testand t1 INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    UNION
    SELECT t2.bID, t2.cID, t2.val_timestamp, t2.val_int, t2.val_float 
    FROM testand t1 INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    ORDER BY bID, cID;
    And I would get these results:
    Code:
    +------+------+---------------------+---------+-----------+
    | bID  | cID  | val_timestamp       | val_int | val_float |
    +------+------+---------------------+---------+-----------+
    |    1 |    1 | 2009-12-14 00:22:33 |    NULL |      1.05 |
    |    1 |    2 | 2009-12-14 00:22:34 |     105 |      NULL |
    |    1 |    2 | 2009-12-14 00:22:45 |     150 |      NULL |
    |    2 |    1 | 2009-12-14 00:22:35 |    NULL |      1.95 |
    |    2 |    1 | 2009-12-14 00:22:46 |    NULL |      1.25 |
    |    2 |    2 | 2009-12-14 00:22:36 |     115 |      NULL |
    |    2 |    2 | 2009-12-14 00:22:55 |      25 |      NULL |
    +------+------+---------------------+---------+-----------+
    7 rows in set (0.22 sec)
    Which is those values that are in the correct ranges and have at least one match in BOTH of the attributes. So with that, you are saying Bok that I can't really do much better then? Too bad MySQL doesn't support the INTERSECT statement since that seems to be exactly what I want to do, take two SQL queries, one searching for heart rate and one for temperature and intersect them so only ones that have patients with both show up.

    Thanks for all your help Bok and gopher_yarrowzoo, I really do appreciate it.

    Jeff.

  7. #7
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,501
    Blog Entries
    13
    UNION will always produces a filesort unfortunately. But perhaps this does a little better

    mysql> select * from testAND t1 where cID = 1 and t1.val_float between 1.00 and 2.00 and
    exists (select 1 from testAND t2 where t1.bID = t2.bID and t2.cID=2 and (t2.val_int < 120 or t2.val_int > 140))
    union
    select * from testAND t1 where cID = 2 and t1.val_int < 120 or t1.val_int > 140 and
    exists (select 1 from testAND t2 where t1.bID = t2.bID and t2.cID = 1 and (t2.val_float between 1.00 and 2.00)) order by bID,cID;
    +-----+------+------+---------------------+---------+-----------+
    | aID | bID | cID | val_timestamp | val_int | val_float |
    +-----+------+------+---------------------+---------+-----------+
    | 1 | 1 | 1 | 2009-12-14 00:22:33 | NULL | 1.05 |
    | 2 | 1 | 2 | 2009-12-14 00:22:33 | 105 | NULL |
    | 8 | 1 | 2 | 2009-12-14 00:22:44 | 150 | NULL |
    | 3 | 2 | 1 | 2009-12-14 00:22:33 | NULL | 1.95 |
    | 9 | 2 | 1 | 2009-12-14 00:22:44 | NULL | 1.25 |
    | 4 | 2 | 2 | 2009-12-14 00:22:33 | 115 | NULL |
    | 11 | 2 | 2 | 2009-12-14 00:22:55 | 25 | NULL |
    +-----+------+------+---------------------+---------+-----------+
    7 rows in set (0.00 sec)


    My head hurts.
    Last edited by Bok; 04-07-2010 at 12:20 PM. Reason: formatted a little better for clarity.

  8. #8
    Quote Originally Posted by Bok View Post
    UNION will always produces a filesort unfortunately. But perhaps this does a little better
    Interesting alternative and it does return the same thing. In my DB with 5 million rows, this query you just posted takes 12.641 sec and the previous one that I posted takes 12.125 sec so essentially the same speed.

    So I guess doing 2 SELECTS, 2 JOINS, and 1 UNION take slightly less time than doing 4 SELECTS, and 1 UNION.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •