You don't get faster than SCSI, and it's too expensive anyway. Go with Western Digital SATA Raptors; the newest ones are the shiz.
What types of cheap (as in not $50K) hard disk arrays are there that can handle millions of small text files (1 to 50 KB, most <5 KB)?
15K SCSI drives might not be good enough. What is the next step up in I/O speed from SCSI? Can a good caching controller with a large memory help when dealing with huge numbers of files?
The total storage size doesn't need to be huge. 18 GB might be OK, although something around the 50 GB range would be better.
thanks
Use the right tool for the right job!
There is no finer RAID controller with hard cache than 3ware. It supports SATA RAID, which is cheapest, and you would need to change your cluster sizes to match... 10,000 rpm Raptor drives are best.
From personal experience
That's right, you won't get a faster drive than a 15K rpm Ultra320 SCSI drive with a top-notch controller. However, I have used the Raptors and they certainly are the next best thing (especially on cost).
However, if you want to achieve the optimum I/O speed, the simplest way is to stripe the drives. Running in RAID 0 will allow the two drives to push the controller to its limits; this format is ideal when working with small files which are frequently accessed. With a good controller you will be able to dictate your stripe/block sizes etc. to suit your task.
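For anyone wondering what "stripe/block size" actually controls, here is a rough sketch of how a RAID 0 array could map a logical block to a drive. The function name and the block-based units are made up for illustration; a real controller does this in firmware and its exact layout may differ.

```python
def locate_block(logical_block, num_drives, stripe_blocks):
    """Return (drive index, block offset on that drive) for a RAID 0 array.

    Hypothetical helper: stripe_blocks is the stripe unit size in blocks.
    """
    stripe_index = logical_block // stripe_blocks    # which stripe unit, counted across the array
    offset_in_unit = logical_block % stripe_blocks   # position inside that stripe unit
    drive = stripe_index % num_drives                # stripe units rotate round-robin over drives
    row = stripe_index // num_drives                 # how many units precede it on the same drive
    return drive, row * stripe_blocks + offset_in_unit


# Two drives, stripe unit of 16 blocks: consecutive units alternate between drives.
for block in (0, 15, 16, 32):
    print(block, locate_block(block, num_drives=2, stripe_blocks=16))
```

One thing the mapping makes visible: a file that fits inside a single stripe unit sits on one drive, so with lots of small files the gain would come mostly from different files being served by different drives at the same time.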
How much advantage can RAID 0 give using more than 2 drives? Can I put 4 drives in RAID 0 and get double the I/O speed?
They make 36 GB SATA 10K drives, right? So maybe I get four of those with a good controller and it'll do OK. That would be well within budget.
What about Fibre Channel drives? Anybody use those? Are they just higher transfer rates? With small files I don't think higher transfer rates are what I need.
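A back-of-the-envelope sketch of why transfer rate matters so little for small files: the time for one random read is roughly seek time plus rotational latency plus transfer time. The seek time and transfer rate used below are illustrative assumptions, not measured figures for any particular drive.

```python
def random_read_ms(rpm, avg_seek_ms, file_kb, transfer_mb_s):
    """Rough time in milliseconds to fetch one small file from a cold disk."""
    rotational_ms = 0.5 * 60_000 / rpm                 # on average, wait half a revolution
    transfer_ms = file_kb / 1024 / transfer_mb_s * 1000
    return avg_seek_ms + rotational_ms + transfer_ms


# Assumed figures: 3.5 ms average seek, 5 KB file, 60 MB/s vs 120 MB/s transfer.
base = random_read_ms(rpm=15_000, avg_seek_ms=3.5, file_kb=5, transfer_mb_s=60)
faster_bus = random_read_ms(rpm=15_000, avg_seek_ms=3.5, file_kb=5, transfer_mb_s=120)
print(f"{base:.2f} ms vs {faster_bus:.2f} ms")
```

Under these assumptions, doubling the transfer rate shaves well under 1% off each read, because seek and rotation dominate. Spindle speed, seek time, and cache hits are where the time goes.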
Fibre Channel offers a high transfer rate: a 2 Gbit link carries roughly 200 MB/s in each direction (full duplex). That is well above the 40 MB/s of older Ultra Wide SCSI, I believe, though Ultra320 SCSI is rated at 320 MB/s on a shared bus. However, this stuff is aimed at SAN storage, where a lot of individual machines are accessing the arrays. One system would not be able to get anywhere near that speed while actually working with the data at the same time, other than in small bursts...
Yes, I would get the 36 GB Raptors. The config, however, may vary depending on the data you want to work with.
If you need a large single-array capacity, RAID 0 across them all will be fine. If you don't, two 2-drive RAID 0 arrays will work better and limit the damage a little (if one drive fails you only lose one array's data, i.e. 2 drives' worth, not all 4).
You will also start to place an overhead on the controller if it has to stripe across more than a couple of drives. It is only very small, but again something worth avoiding.
Ideally the perfect setup is RAID 0+1, providing speed and data protection. But you will need a big wallet for that...
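To put numbers on the three layouts, here is a small sketch comparing raw usable capacity and what a single drive failure costs with four 36 GB drives. The function and key names are made up for illustration, and the figures ignore formatting overhead.

```python
def four_drive_layouts(drive_gb=36):
    """Raw usable capacity and data lost to ONE drive failure, per layout."""
    return {
        # One stripe over all four drives: any failure takes everything with it.
        "4-drive RAID 0": {"usable_gb": 4 * drive_gb, "lost_gb": 4 * drive_gb},
        # Two independent stripes: a failure only kills the array that drive is in.
        "2 x 2-drive RAID 0": {"usable_gb": 4 * drive_gb, "lost_gb": 2 * drive_gb},
        # Two stripes mirrored: half the capacity, but one failure loses no data.
        "RAID 0+1": {"usable_gb": 2 * drive_gb, "lost_gb": 0},
    }


for name, info in four_drive_layouts().items():
    print(f"{name:20s} usable {info['usable_gb']:3d} GB, lost on one failure {info['lost_gb']:3d} GB")
```

The trade is plain once it is laid out: RAID 0+1 halves the usable space to buy protection against a single drive failure.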
I only thought of the 36 GB drives because he said he was aiming at 50 GB or so initially.
the 72's are a lot more dosh too.
Get a 4-port 3ware SATA controller, put 2 drives in RAID 0 and the other 2 drives in RAID 0, then RAID 1 both RAID 0 containers.
36 GB + 36 GB stripe = ~70 GB, mirrored to a second 36 GB + 36 GB stripe = ~70 GB
Fiber $$$
Much Slower and no hard cache (memory) and it's not 'intelligent' .
http://3ware.com/products/serial_ata9000.asp
Code:
Card Type / RAID Level   I/O Type   Sequential I/O      Increase over 8506-8
9500S-8 RAID 0           Read       409.4 MBytes/sec    +96.5%
9500S-8 RAID 0           Write      210.0 MBytes/sec    +4.2%
9500S-8 RAID 5           Read       402.8 MBytes/sec    +113.0%
9500S-8 RAID 5           Write      113.0 MBytes/sec    +23.9%