Results 1 to 21 of 21

Thread: SQL query to join multiple tables

  1. #1

    Question SQL query to join multiple tables

    I must be searching using the wrong key words but how does one join multiple tables with SQL that share the same data type so you get them in the same column?

    To better explain let me give you an example. I have 3 tables:
    data_char, data_float, data_integer

    They all have the first 4 columns in common, but the 5th column is different. So they all have aID (int), bID (int), cID (int), val_timestamp (timestamp)

    The data_char has a val_char (varchar 255), the data_float has a val_float (float), and the data_integer has a val_int (int) column.

    The aID's in each table are auto-incrementing values and not related to each other, but the bID is how I link all the tables together.

    Is there a way to write a query (I'm using MySQL) to return a result set that has:
    Code:
    cID  val_timestamp   val_char  val_float  val_int
    for a specific bID?

    I can't find a way to get the cID and val_timestamp columns from each table to merge into a single column, they are coming out as cID cID_1 cID_2 sort of thing.

    Or what would you recommend to pull out that data from different tables?

    Thanks,
    Jeff.

  2. #2
    Free-DC Semi-retire gopher_yarrowzoo's Avatar
    Join Date
    Mar 2002
    Location
    Knoxville, TN
    Posts
    3,985
    there should be jeff let me think.... my SQL is a bit rusty.

    SELECT a.bID , a.cID, a.val_timestamp, a.data_char (name of col), b.bID , b.cID, b.val_timestamp,b.data_float, c.bID , c.cID, c.val_timestamp, c.data_int AS a=data_char, b=data_float, c=data_int WHERE a.bID == b.bID && a.bID==c.bID
    ^^^ should work else you need to use the JOIN command, maybe Bok can help ya out a bit more since he works with it more often than me.
    Semi-retired from Free-DC...
    I have some time to help.....
    I need a new laptop,but who needs a laptop when you have a phone...
    Now to remember my old computer specs..


  3. #3
    I can't seem to get that to work, the syntax is not right and I'm not sure how to fix it. Isn't the AS just for creating aliases so you can refer to something by another name but will it actually put data into the same column?

    Bok, any ideas?

    Thanks.

  4. #4
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    can you post the table creation details ?

    i.e. do

    show create table <tablename> for each one.

  5. #5
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    best guess though..

    Code:
    mysql> create table y1 (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_char varchar(255));
    Query OK, 0 rows affected (0.06 sec)
    
    mysql> create table y2 (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_float float);
    Query OK, 0 rows affected (0.01 sec)
    
    mysql> create table y3 (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_int int);
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> insert into y1 values (1,1,1,curdate(),'qqqq');
    Query OK, 1 row affected (0.00 sec)
    
    mysql> insert into y2 values (1,1,1,curdate(),5.5);
    Query OK, 1 row affected (0.00 sec)
    
    mysql> insert into y3 values (1,1,1,curdate(),9);
    Query OK, 1 row affected (0.01 sec)
    
    mysql> select a.cID,a.val_timestamp,a.val_char,b.val_float,c.val_int from y1 a join y2 b on a.bID = b.bID join y3 c on a.bID = c.bID where a.bID = 1;
    +------+---------------------+----------+-----------+---------+
    | cID  | val_timestamp       | val_char | val_float | val_int |
    +------+---------------------+----------+-----------+---------+
    |    1 | 2009-12-14 00:00:00 | qqqq     |       5.5 |       9 |
    +------+---------------------+----------+-----------+---------+
    1 row in set (0.00 sec)
    Does that help ?

  6. #6
    Quote Originally Posted by Bok View Post
    best guess though..

    Does that help ?
    So your table creation is bang on, that is essentially what is looks like. Your SQL code executes but the problem is that you are showing a.cID,a.val_timestamp,a.val_char,b.val_float,c.val_int but not b.cID or c.cID, b.val_timestamp, or c.val_timestamp.

    All of those are independent. In your example you had 1 entry in each table which works because they all have the same time stamp and cID.

    But if you did then this on top of what you already inserted:
    Code:
    insert into y1 values (NULL,1,2,'20091214001122','xxx');
    insert into y2 values (NULL,1,7,'20091214002233',1.95);
    insert into y3 values (NULL,1,4,'20091214003344',42);
    And then ran your select statement, you would just see y1's cID of '1' and '2' and y1's timestamps, but not the other timestamps, and each value is repeated several times as seen here:

    Code:
    +------+---------------------+----------+-----------+---------+
    | cID  | val_timestamp       | val_char | val_float | val_int |
    +------+---------------------+----------+-----------+---------+
    |    1 | 2009-12-14 00:00:00 | qqqq     |       5.5 |       9 |
    |    2 | 2009-12-14 00:11:22 | xxx      |       5.5 |       9 |
    |    1 | 2009-12-14 00:00:00 | qqqq     |      1.95 |       9 |
    |    2 | 2009-12-14 00:11:22 | xxx      |      1.95 |       9 |
    |    1 | 2009-12-14 00:00:00 | qqqq     |       5.5 |      42 |
    |    2 | 2009-12-14 00:11:22 | xxx      |       5.5 |      42 |
    |    1 | 2009-12-14 00:00:00 | qqqq     |      1.95 |      42 |
    |    2 | 2009-12-14 00:11:22 | xxx      |      1.95 |      42 |
    +------+---------------------+----------+-----------+---------+
    8 rows in set (0.00 sec)
    Ideally I would like to see something like this (where bID = 1):
    Code:
    +------+---------------------+----------+-----------+---------+
    | cID  | val_timestamp       | val_char | val_float | val_int |
    +------+---------------------+----------+-----------+---------+
    |    1 | 2009-12-14 00:00:00 | qqqq     |    (NULL) |  (NULL) |
    |    1 | 2009-12-14 00:00:00 | (NULL)   |       5.5 |  (NULL) |
    |    1 | 2009-12-14 00:00:00 | (NULL)   |    (NULL) |       9 |
    |    2 | 2009-12-14 00:11:22 | xxx      |    (NULL) |  (NULL) |
    |    7 | 2009-12-14 00:22:33 | (NULL)   |      1.95 |  (NULL) |
    |    4 | 2009-12-14 00:33:44 | (NULL)   |    (NULL) |      42 |
    +------+---------------------+----------+-----------+---------+
    Is there any way to get that to happen? Or if not, can I do 3 different selects one for each table and have the results populated in one result set to work with all the variables?

    Thanks.
    Jeff.

  7. #7
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    perhaps, with a little fudging

    Code:
    mysql> select cID,val_timestamp,val_char,NULL as val_float,NULL as val_int from y1 where bID = 1 union  select cID,val_timestamp,NULL,val_float,NULL from y2 where bID = 1 union select  cID,val_timestamp,NULL,NULL,val_int from y3 where bID = 1 order by cID;
    +------+---------------------+----------+-----------+---------+
    | cID  | val_timestamp       | val_char | val_float | val_int |
    +------+---------------------+----------+-----------+---------+
    |    1 | 2009-12-14 00:00:00 | qqqq     |      NULL |    NULL |
    |    1 | 2009-12-14 00:00:00 | NULL     |       5.5 |    NULL |
    |    1 | 2009-12-14 00:00:00 | NULL     |      NULL |       9 |
    |    2 | 2009-12-14 00:11:22 | xxx      |      NULL |    NULL |
    |    4 | 2009-12-14 00:33:44 | NULL     |      NULL |      42 |
    |    7 | 2009-12-14 00:22:33 | NULL     |      1.95 |    NULL |
    +------+---------------------+----------+-----------+---------+
    6 rows in set (0.00 sec)

  8. #8
    Thanks Bok, that works! You are da man!

    That would have taken me a long time to figure out and I was already banging my head hard enough. I was thinking all about joins instead of unions.

    Jeff.

  9. #9
    Bok and all you other SQL experts out there. I'm now trying to take a MySQL table and try to find all the records where a bID matches the criteria for cID = 1 *AND* CID = 2. I have figured out a way to do it but it seems quite complicated. Is there an easier way to get these same results?

    Code:
    create table testAND (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_int int, val_float float);
    insert into testAND values (NULL,1,1,'20091214002233',NULL,1.05);
    insert into testAND values (NULL,1,2,'20091214002233',105,NULL);
    insert into testAND values (NULL,2,1,'20091214002233',NULL,1.95);
    insert into testAND values (NULL,2,2,'20091214002233',115,NULL);
    insert into testAND values (NULL,3,1,'20091214002233',NULL,1.15);
    insert into testAND values (NULL,3,2,'20091214002233',130,NULL);
    insert into testAND values (NULL,1,1,'20091214002244',NULL,0.50);
    insert into testAND values (NULL,1,2,'20091214002244',150,NULL);
    insert into testAND values (NULL,2,1,'20091214002244',NULL,1.25);
    insert into testAND values (NULL,2,2,'20091214002244',125,NULL);
    insert into testAND values (NULL,2,2,'20091214002255',25,NULL);
    insert into testAND values (NULL,2,1,'20091214002255',NULL,2.55);
    If you create the table above, I'm trying to find all (cID = 1 AND (val_float >= 1.00 AND val_float <= 2.00) AND (cID = 2 AND (val_int < 120 OR val_int > 140). The way I figured to do this was use INNER JOIN to be able to match two different attributes at the same time but then I want to display the results of both sets of attributes so I do it twice, then use UNION. Here is my SQL statement:

    Code:
    SELECT t1.bID, t1.cID, t1.val_timestamp, t1.val_int, t1.val_float FROM testand t1 
    INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    GROUP BY t1.val_timestamp, t1.val_int
    UNION
    SELECT t2.bID, t2.cID, t2.val_timestamp, t2.val_int, t2.val_float FROM testand t1 
    INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    GROUP BY t2.val_timestamp, t2.val_float 
    ORDER BY bID, cID;
    This correctly (as far as I can tell) produces these results:

    Code:
    +------+------+---------------------+---------+-----------+
    | bID  | cID  | val_timestamp       | val_int | val_float |
    +------+------+---------------------+---------+-----------+
    |    1 |    1 | 2009-12-14 00:22:33 |    NULL |      1.05 |
    |    1 |    2 | 2009-12-14 00:22:33 |     105 |      NULL |
    |    1 |    2 | 2009-12-14 00:22:44 |     150 |      NULL |
    |    2 |    1 | 2009-12-14 00:22:33 |    NULL |      1.95 |
    |    2 |    1 | 2009-12-14 00:22:44 |    NULL |      1.25 |
    |    2 |    2 | 2009-12-14 00:22:33 |     115 |      NULL |
    |    2 |    2 | 2009-12-14 00:22:55 |      25 |      NULL |
    +------+------+---------------------+---------+-----------+
    7 rows in set (0.02 sec)
    Is there a faster/shorter/more efficient way to get the same results? I have a feeling I'm missing something obvious.

    Thanks.

  10. #10
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    Try the explain query on it, it's not efficient.

    Perhaps I'm not quite understanding what you are trying to do and I don't know why you are doing group by at all.

    Wouldn't this give you the correct results

    SELECT * FROM testAND WHERE (cID = 2 AND (val_int between 120 and 140)) OR (cID = 1 AND (val_float between 1.00 and 2.00)) order by cID;

    ?

  11. #11
    Quote Originally Posted by Bok View Post
    Try the explain query on it, it's not efficient.

    Perhaps I'm not quite understanding what you are trying to do and I don't know why you are doing group by at all.

    Wouldn't this give you the correct results

    SELECT * FROM testAND WHERE (cID = 2 AND (val_int between 120 and 140)) OR (cID = 1 AND (val_float between 1.00 and 2.00)) order by cID;
    That does not give the correct results. First, I'm not trying to find values between 120 and 140, but less than 120 and greater than 140, so everything except between 120 and 140.

    But if you re-write the query to be:
    SELECT * FROM testAND WHERE (cID = 2 AND (val_int < 120 OR val_int > 140)) OR (cID = 1 AND (val_float between 1.00 and 2.00)) order by bID, cID;

    That still won't give you the correct results. Because you have OR between the two statements. I only want to display the results if both are true.

    So that query returns:
    Code:
    +-----+------+------+---------------------+---------+-----------+
    | aID | bID  | cID  | val_timestamp       | val_int | val_float |
    +-----+------+------+---------------------+---------+-----------+
    |   1 |    1 |    1 | 2009-12-14 00:22:33 |    NULL |      1.05 |
    |   2 |    1 |    2 | 2009-12-14 00:22:33 |     105 |      NULL |
    |   8 |    1 |    2 | 2009-12-14 00:22:44 |     150 |      NULL |
    |   3 |    2 |    1 | 2009-12-14 00:22:33 |    NULL |      1.95 |
    |   9 |    2 |    1 | 2009-12-14 00:22:44 |    NULL |      1.25 |
    |   4 |    2 |    2 | 2009-12-14 00:22:33 |     115 |      NULL |
    |  11 |    2 |    2 | 2009-12-14 00:22:55 |      25 |      NULL |
    |   5 |    3 |    1 | 2009-12-14 00:22:33 |    NULL |      1.15 |
    +-----+------+------+---------------------+---------+-----------+
    The last row, bID=3 should not be there, while 1.15 is between 1.00 and 2.00 the cID=2 val_int is 130 which is not in the valid range. Since I'm looking for results only when both cID=1 AND cID=2 are true, this is returning too much information. I need to remove the cases where they are not both true.

    Does that explain better?

    The reason why I was using group by in my statement was to get rid of the multiple rows in the join statement. For every val_int it prints multiple val_floats, so using GROUP BY I was trying to reduce it to a single val_int or val_float for each row.

  12. #12
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    Quote Originally Posted by Digital Parasite View Post
    That does not give the correct results. First, I'm not trying to find values between 120 and 140, but less than 120 and greater than 140, so everything except between 120 and 140.
    my bad on that one, misread it.



    The last row, bID=3 should not be there, while 1.15 is between 1.00 and 2.00 the cID=2 val_int is 130 which is not in the valid range. Since I'm looking for results only when both cID=1 AND cID=2 are true, this is returning too much information. I need to remove the cases where they are not both true.
    How can cID = 1 and 2 in the same row?

    Does that explain better?
    Nope, more confused than ever, perhaps it's the dang heat here >90F right now..!!

    The reason why I was using group by in my statement was to get rid of the multiple rows in the join statement. For every val_int it prints multiple val_floats, so using GROUP BY I was trying to reduce it to a single val_int or val_float for each row.

  13. #13
    Quote Originally Posted by Bok View Post
    my bad on that one, misread it.

    How can cID = 1 and 2 in the same row?

    Nope, more confused than ever, perhaps it's the dang heat here >90F right now..!!
    cID can't be 1 and 2 in the same row. Let me convert the problem to english words and simplify it to remove the ranges.

    bID = Patient
    cID is an Attribute
    cID = 1 for example means temperature
    cID = 2 for example means heart rate

    What I'm trying to find is: show all Patients that have a temperature of 1.95 AND a heart rate of 150, and list those temperatures and heart rates with timestamps.

    So Temperature and Heart Rate are not listed on the same row which is what makes this tricky, I need to see if they match the pattern I'm looking for and then only display the ones that match the pattern. In my example Patient #3 (bID = 3) has a temperature that matches the pattern but his heart rate does not so I don't want to list him or the details.

    Does that clarify anything? Wow, big heat, we were up to 85F on the weekend which is very out of the ordinary for the beginning of April way up north here.
    Last edited by Digital Parasite; 04-07-2010 at 10:18 AM.

  14. #14
    Free-DC Semi-retire gopher_yarrowzoo's Avatar
    Join Date
    Mar 2002
    Location
    Knoxville, TN
    Posts
    3,985
    That does make it rather tricky to say the least..
    I take it the table is in the from ID,cID,Data,.... and is such that ID,1,Temp then next row is ID,2,Heartrate ?
    See if it is that I'd be tempted to create a temp table / array
    thus
    table / array [id][timestamp] [heartrate] ,[id][timestamp] [temp], [id][timestamp] [other]
    table ID, Timestamp, Temp, Heartrate, other data .....
    filled as if ID, timestamp, cID==1 fill Temp || cID==2 fill Heartrate, other data

    query is the something like SELECT * from temp_table WHERE !(Temp>=120 && Temp<=140) && Heartrate is in range.
    then the array is simple.
    -TABLE-
    ID | timestamp | Temp | Heart Rate | Other Data
    run the array and display it 1 per row.
    Semi-retired from Free-DC...
    I have some time to help.....
    I need a new laptop,but who needs a laptop when you have a phone...
    Now to remember my old computer specs..


  15. #15
    Quote Originally Posted by gopher_yarrowzoo View Post
    That does make it rather tricky to say the least..
    I take it the table is in the from ID,cID,Data,.... and is such that ID,1,Temp then next row is ID,2,Heartrate ?
    See if it is that I'd be tempted to create a temp table / array
    thus
    table / array [id][timestamp] [heartrate] ,[id][timestamp] [temp], [id][timestamp] [other]
    table ID, Timestamp, Temp, Heartrate, other data .....
    filled as if ID, timestamp, cID==1 fill Temp || cID==2 fill Heartrate, other data
    You can see the exact table form in message #9 with the create statement on top: http://www.free-dc.org/forum/showpos...68&postcount=9

    Yes, we have PatientID,AttributeID,Integer Data,FloatingPoint Data.

    This table currently has 10+ million rows in it and growing. How much time/effort does it take to create a temp table in the fashion you suggest for a table that size? I have never used arrays before, what benefit is that over just creating a temporary table? I just did a google search for MySQL and arrays but don't really find any information.

    So you suggest that I create a TEMP table and then use INSERT INTO statements from the main table and filter into a "Temp" and "Heart Rate" column the right values after which I can use a simpler SQL statement to find the values?

    I think I still see a problem though, there can be a different number of temperatures and heart rates so for each patient they won't evenly fit into rows. So you would still have the case where you need to look at multiple rows to see if both the heart rate and temperatures were still in range?

  16. #16
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    Perhaps your initial premise is incorrect ? I'm not sure your original results were entirely accurate assuming you only want distinct grouping by datetime - otherwise ignore the following as I don't think it can be done much better than you have..! If we break down the 12 rows into 6 groups of 2 then the only 'groups' you want are the first two - correct ?

    Thus the following query would give you that.

    mysql> select t1.bID,t1.val_timestamp,t1.val_float,t2.val_int from testAND t1 inner join testAND t2 on t1.bID = t2.bID and t1.val_timestamp = t2.val_timestamp where (t1.cID = 1 and t1.val_float between 1.00 and 2.00) and t2.cID = 2 and (t2.val_int < 120 or t2.val_int > 140) ;
    +------+---------------------+-----------+---------+
    | bID | val_timestamp | val_float | val_int |
    +------+---------------------+-----------+---------+
    | 1 | 2009-12-14 00:22:33 | 1.05 | 105 |
    | 2 | 2009-12-14 00:22:33 | 1.95 | 115 |
    +------+---------------------+-----------+---------+
    2 rows in set (0.01 sec)

    grouping 1 (aID=1,2) should be selected
    grouping 2 (aID=3,4) should be selected
    grouping 3 (aID=5,6) should not be selected as val_int = 130
    grouping 4 (aID 7,8) should not be selected as val_float = 0.5
    grouping 5 (aID 9,10) should not be selected as val_int = 125
    grouping 6 (aID 11,12) should not be selected as val_float = 2.55

    and explain is better (though I did not define any indexes)

    mysql> explain select t1.bID,t1.val_timestamp,t1.val_float,t2.val_int from testAND t1 inner join testAND t2 on t1.bID = t2.bID and t1.val_timestamp = t2.val_timestamp where (t1.cID = 1 and t1.val_float between 1.00 and 2.00) and t2.cID = 2 and (t2.val_int < 120 or t2.val_int > 140) ;
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 12 | Using where |
    | 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 12 | Using where |
    +----+-------------+-------+------+---------------+------+---------+------+------+-------------+
    2 rows in set (0.00 sec)

  17. #17
    Quote Originally Posted by Bok View Post
    Perhaps your initial premise is incorrect ? I'm not sure your original results were entirely accurate assuming you only want distinct grouping by datetime - otherwise ignore the following as I don't think it can be done much better than you have..! If we break down the 12 rows into 6 groups of 2 then the only 'groups' you want are the first two - correct ?
    Sorry this is so confusing, I'm not doing a great job of explaining things. I shouldn't have put the timestamps so closely together to match. I'm not actually trying to match timestamps, every timestamp could and probably would be different.

    I just retried my original query without the GROUP BY and it gives the same results so you are right, I don't actually need those. So if you have a table where every timestamp is different such as:

    Code:
    create table testAND (aID int auto_increment primary key, bID int, cID int, val_timestamp timestamp,val_int int, val_float float);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (1, 1, 1, '2009-12-14 00:22:33', NULL, 1.05);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (2, 1, 2, '2009-12-14 00:22:34', 105, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (3, 2, 1, '2009-12-14 00:22:35', NULL, 1.95);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (4, 2, 2, '2009-12-14 00:22:36', 115, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (5, 3, 1, '2009-12-14 00:22:37', NULL, 1.15);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (6, 3, 2, '2009-12-14 00:22:38', 130, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (7, 1, 1, '2009-12-14 00:22:44', NULL, 0.5);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (8, 1, 2, '2009-12-14 00:22:45', 150, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (9, 2, 1, '2009-12-14 00:22:46', NULL, 1.25);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (10, 2, 2, '2009-12-14 00:22:47', 125, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (11, 2, 2, '2009-12-14 00:22:55', 25, NULL);
    INSERT INTO `testand` (`aID`, `bID`, `cID`, `val_timestamp`, `val_int`, `val_float`) VALUES (12, 2, 1, '2009-12-14 00:22:56', NULL, 2.55);
    Then I can run this SQL query:

    Code:
    SELECT t1.bID, t1.cID, t1.val_timestamp, t1.val_int, t1.val_float 
    FROM testand t1 INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    UNION
    SELECT t2.bID, t2.cID, t2.val_timestamp, t2.val_int, t2.val_float 
    FROM testand t1 INNER JOIN testand t2 ON t1.bID = t2.bID 
    WHERE (t1.cID = 2 AND (t1.val_int < 120 OR t1.val_int > 140)) 
    AND (t2.cID = 1 AND (t2.val_float >= 1.00 AND t2.val_float <= 2.00)) 
    ORDER BY bID, cID;
    And I would get these results:
    Code:
    +------+------+---------------------+---------+-----------+
    | bID  | cID  | val_timestamp       | val_int | val_float |
    +------+------+---------------------+---------+-----------+
    |    1 |    1 | 2009-12-14 00:22:33 |    NULL |      1.05 |
    |    1 |    2 | 2009-12-14 00:22:34 |     105 |      NULL |
    |    1 |    2 | 2009-12-14 00:22:45 |     150 |      NULL |
    |    2 |    1 | 2009-12-14 00:22:35 |    NULL |      1.95 |
    |    2 |    1 | 2009-12-14 00:22:46 |    NULL |      1.25 |
    |    2 |    2 | 2009-12-14 00:22:36 |     115 |      NULL |
    |    2 |    2 | 2009-12-14 00:22:55 |      25 |      NULL |
    +------+------+---------------------+---------+-----------+
    7 rows in set (0.22 sec)
    Which is those values that are in the correct ranges and have at least one match in BOTH of the attributes. So with that, you are saying Bok that I can't really do much better then? Too bad MySQL doesn't support the INTERSECT statement since that seems to be exactly what I want to do, take two SQL queries, one searching for heart rate and one for temperature and intersect them so only ones that have patients with both show up.

    Thanks for all your help Bok and gopher_yarrowzoo, I really do appreciate it.

    Jeff.

  18. #18
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    UNION will always produces a filesort unfortunately. But perhaps this does a little better

    mysql> select * from testAND t1 where cID = 1 and t1.val_float between 1.00 and 2.00 and
    exists (select 1 from testAND t2 where t1.bID = t2.bID and t2.cID=2 and (t2.val_int < 120 or t2.val_int > 140))
    union
    select * from testAND t1 where cID = 2 and t1.val_int < 120 or t1.val_int > 140 and
    exists (select 1 from testAND t2 where t1.bID = t2.bID and t2.cID = 1 and (t2.val_float between 1.00 and 2.00)) order by bID,cID;
    +-----+------+------+---------------------+---------+-----------+
    | aID | bID | cID | val_timestamp | val_int | val_float |
    +-----+------+------+---------------------+---------+-----------+
    | 1 | 1 | 1 | 2009-12-14 00:22:33 | NULL | 1.05 |
    | 2 | 1 | 2 | 2009-12-14 00:22:33 | 105 | NULL |
    | 8 | 1 | 2 | 2009-12-14 00:22:44 | 150 | NULL |
    | 3 | 2 | 1 | 2009-12-14 00:22:33 | NULL | 1.95 |
    | 9 | 2 | 1 | 2009-12-14 00:22:44 | NULL | 1.25 |
    | 4 | 2 | 2 | 2009-12-14 00:22:33 | 115 | NULL |
    | 11 | 2 | 2 | 2009-12-14 00:22:55 | 25 | NULL |
    +-----+------+------+---------------------+---------+-----------+
    7 rows in set (0.00 sec)


    My head hurts.
    Last edited by Bok; 04-07-2010 at 01:20 PM. Reason: formatted a little better for clarity.

  19. #19
    Quote Originally Posted by Bok View Post
    UNION will always produces a filesort unfortunately. But perhaps this does a little better
    Interesting alternative and it does return the same thing. In my DB with 5 million rows, this query you just posted takes 12.641 sec and the previous one that I posted takes 12.125 sec so essentially the same speed.

    So I guess doing 2 SELECTS, 2 JOINS, and 1 UNION take slightly less time than doing 4 SELECTS, and 1 UNION.

  20. #20
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,601
    Blog Entries
    13
    I'd bet >90% is on the filesort in either case and it may well be skewed once it's in the query cache.

    12 secs is pretty decent for 5M rows anyway.

  21. #21
    Quote Originally Posted by Bok View Post
    I'd bet >90% is on the filesort in either case and it may well be skewed once it's in the query cache.

    12 secs is pretty decent for 5M rows anyway.
    I have query caching disabled but yes 12 secs is pretty decent. For fun I took out the UNION and timed the two SELECT statements separately.

    SELECT1 = 4.672 sec
    SELECT2 = 3.469 sec
    TOTAL = 8.141 sec

    Of course you can't really do that since the second SELECT by itself returns the same number of rows as the first most of which is duplicated. But it looks like the UNION part takes 4 seconds.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •