
Thread: Database Question

  1. #1

    Database Question

    This isn't really an OS question, but this might be the best fit for it. For those of you using databases for the Free-DC stats and other things (Bok, dp, any others?), I have some DB questions.

    I'm trying to set up a DB for my PhD research. It only needs a few fields to store data every minute from patient monitors. But since there might be 16 patients at a time, I calculated that after roughly a year there will be over 4 billion records in the DB.

    Since I have no experience with large DBs like this, can free DBs such as MySQL, MS SQL, or Postgres handle tables that size?

    How many records are in the various Free-DC stats tables, for example? The system running the DB is a Q6600 with 4GB RAM and XP 64-bit. It will mostly just need to do INSERTs to store the data; analysis and other processing would be done offline on a different system with data exported from it.

    Thanks in advance.

  2. #2
    Administrator Bok's Avatar
    Join Date
    Oct 2003
    Location
    Wake Forest, North Carolina, United States
    Posts
    24,471
    Blog Entries
    13
    Sure, they can handle that much data, no different from Enterprise DBs like Oracle or DB2. You might want to think about your indexing very carefully, though, perhaps with a year & period in it or something like that? Hard to judge without knowing more.

    In MySQL you could use MERGE tables, where multiple tables of the same structure appear as one. That way you can separate the physical files.
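A minimal sketch of that approach (table and column names here are made up for illustration; MERGE requires the underlying tables to use the MyISAM engine, and uniqueness isn't enforced across the underlying tables, so the merged table gets a plain KEY rather than a PRIMARY KEY):

```sql
-- Two physical tables with identical structure, e.g. one per year
CREATE TABLE readings_2008 (
    RecordID  BIGINT UNSIGNED NOT NULL,
    PatientID SMALLINT UNSIGNED NOT NULL,
    ts        DATETIME NOT NULL,
    value     FLOAT,
    PRIMARY KEY (RecordID)
) ENGINE=MyISAM;

CREATE TABLE readings_2009 LIKE readings_2008;

-- One logical table spanning both; new INSERTs go to the last table
CREATE TABLE readings (
    RecordID  BIGINT UNSIGNED NOT NULL,
    PatientID SMALLINT UNSIGNED NOT NULL,
    ts        DATETIME NOT NULL,
    value     FLOAT,
    KEY (RecordID)
) ENGINE=MERGE UNION=(readings_2008, readings_2009) INSERT_METHOD=LAST;
```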

    The most records I have in a table right now on Free-DC is in the boinc_host_week table, which holds basically snapshots of the hosts on a weekly basis. I've only been storing this info for 5 months or so and there are 115M rows there. It handles it just fine..

    Bok

  3. #3
    Free-DC Semi-retire gopher_yarrowzoo's Avatar
    Join Date
    Mar 2002
    Location
    Knoxville, TN
    Posts
    3,985
    I can confirm that some of the table data is pretty big as well; some of the updates, if memory serves me correctly, are >4Mb, so yeah, MySQL will handle it fine.
    Just need to, as Bok said, carefully index the sucker.
    Something like this might work:
    RecordID (primary & unique), PatientID (index), timestamp, fields...
    and another table with
    PatientID (primary & unique), ChristianName, Surname, other details...

    And then it's just one SELECT ... WHERE statement to pull up the whole enchilada,
    so you'd get something like

    154 - 11/19/08 12:15 - 7 - J Doe 125/72 85% etc..
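In concrete MySQL, that two-table layout might look like the sketch below. The table names, column names, and types are illustrative only; BIGINT is used for RecordID because 4 billion+ rows would overflow an unsigned INT:

```sql
CREATE TABLE patients (
    PatientID SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    ChristianName VARCHAR(50),
    Surname       VARCHAR(50),
    PRIMARY KEY (PatientID)
);

CREATE TABLE readings (
    RecordID  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    PatientID SMALLINT UNSIGNED NOT NULL,
    ts        DATETIME NOT NULL,
    value     FLOAT,
    PRIMARY KEY (RecordID),
    KEY idx_patient (PatientID)
);

-- "the whole enchilada" in one statement
SELECT r.RecordID, r.ts, r.PatientID, p.ChristianName, p.Surname, r.value
FROM readings r
JOIN patients p ON p.PatientID = r.PatientID;
```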
    Semi-retired from Free-DC...
    I have some time to help.....
    I need a new laptop,but who needs a laptop when you have a phone...
    Now to remember my old computer specs..


  4. #4
    Quote Originally Posted by gopher_yarrowzoo View Post
    I can confirm that some of the table data is pretty big as well; some of the updates, if memory serves me correctly, are >4Mb, so yeah, MySQL will handle it fine.
    Just need to, as Bok said, carefully index the sucker.
    Something like this might work:
    RecordID (primary & unique), PatientID (index), timestamp, fields...
    and another table with
    PatientID (primary & unique), ChristianName, Surname, other details...

    And then it's just one SELECT ... WHERE statement to pull up the whole enchilada,
    so you'd get something like

    154 - 11/19/08 12:15 - 7 - J Doe 125/72 85% etc..
    Sweet. Now when you say (primary & unique), isn't a primary key unique by definition, or what exactly do you mean by specifying both?

    With the first table, to index things best: if I wanted to search for, say, all records with PatientID X and FieldID Y, or all records with PatientID X and FieldID Y within a certain timestamp range, would I still just need to index PatientID, or would I need to index more?

    The FieldID is something I have but didn't list above; it tells me what the field value means, so I know that "85%", for example, is "Blood Oxygen Level".


    Thanks guys, I appreciate the help.

  5. #5
    Free-DC Semi-retire gopher_yarrowzoo's Avatar
    Join Date
    Mar 2002
    Location
    Knoxville, TN
    Posts
    3,985
    Oops, just looked again at what I wrote. I meant to say primary and auto-increment, so it would be unique.
    I would index whatever fields you would regularly search on, although just indexing those two would make searches faster. Would need to check with the guru, though. Where's Bok when ya need him?

    The SQL would just be SELECT * FROM Table_1 WHERE PatientID='<someID>' AND DateTime>='<start time>' AND DateTime<='<end time>'
    It could be a case of setting up a small DB first and testing it so you can check the expected formats, as with date/time fields you need to supply the timestamp in the format the DB expects (in MySQL, DATETIME values are 'YYYY-MM-DD HH:MM:SS').
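To answer the indexing question above concretely, a composite index covering both query shapes might look like this (assuming the readings table and column names sketched earlier in the thread; the general rule is equality columns first, the range column last):

```sql
-- One index serves both queries: equality on PatientID and FieldID,
-- optionally followed by a range scan on the timestamp.
CREATE INDEX idx_patient_field_ts ON readings (PatientID, FieldID, ts);

-- All of PatientID X with FieldID Y:
SELECT * FROM readings
WHERE PatientID = 7 AND FieldID = 3;

-- Same, restricted to a timestamp range:
SELECT * FROM readings
WHERE PatientID = 7 AND FieldID = 3
  AND ts BETWEEN '2008-11-01 00:00:00' AND '2008-11-19 23:59:59';
```

An index on PatientID alone would force the engine to scan every row for that patient and filter the rest; the composite index lets it seek directly to the matching (patient, field, time) slice.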


  6. #6
    Senior Member
    Join Date
    Feb 2004
    Location
    You can always tell a Brummie, but you can't tell him much.
    Posts
    189
    When databases start getting really big, indexes on their own are just not enough.

    Check out the fragmentation/partitioning options for the database engines you are considering (the MySQL manual pages are at http://dev.mysql.com/doc/refman/5.1/...titioning.html).
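For example, a range-partitioned version of the readings table sketched earlier might look like this in MySQL 5.1 (names are illustrative; note that MySQL requires the partitioning column to be part of every unique key, hence the composite primary key):

```sql
-- Split the table into monthly partitions by timestamp.
-- Queries with a ts range only touch the relevant partitions,
-- and old months can be dropped cheaply with ALTER TABLE ... DROP PARTITION.
CREATE TABLE readings (
    RecordID  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    PatientID SMALLINT UNSIGNED NOT NULL,
    ts        DATETIME NOT NULL,
    value     FLOAT,
    PRIMARY KEY (RecordID, ts)
)
PARTITION BY RANGE (TO_DAYS(ts)) (
    PARTITION p200811 VALUES LESS THAN (TO_DAYS('2008-12-01')),
    PARTITION p200812 VALUES LESS THAN (TO_DAYS('2009-01-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);
```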

    Petey
