There are two big IT questions people ask about Digital Pathology: how much storage will I need, and how much bandwidth? And how much will they cost? Let's figure this out.
Storage
The main concerns about storage are capacity, cost, and backup.

Capacity + Cost
- Typical slide = 250MB
  - Digital slide = 20mm x 15mm @ 20X (0.5 micron/pixel)
  - = 40,000 x 30,000 pixels = 1.2 Gpixel
  - x 3 bytes/pixel = 3.6GB
  - / 20:1 compression (JPEG2000) = 180MB, rounded up to 250MB
- Typical lab: One year = 5TB = $5,000
  - (100 slides / day) x (200 days / year) = 20,000 slides
  - = 5TB (1TB can hold 4,000 slides at 250MB each)
  - x $1,000 / TB ($500 disk + $500 server, etc.) = $5,000
Note that the number of slides per day is the key variable.
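The arithmetic above can be sketched in a few lines of Python so you can plug in your own slide counts. This is only a rough calculator, not a sizing tool: the function and parameter names are my own, 1MB is taken as 10^6 bytes, and the 250MB planning figure is the headline number from the list above rather than the derived 180MB.

```python
def slide_size_mb(width_mm=20, height_mm=15, microns_per_pixel=0.5,
                  bytes_per_pixel=3, compression_ratio=20):
    """Compressed size of one digital slide in MB (1 MB = 10**6 bytes)."""
    width_px = width_mm * 1000 / microns_per_pixel    # 40,000 pixels
    height_px = height_mm * 1000 / microns_per_pixel  # 30,000 pixels
    raw_bytes = width_px * height_px * bytes_per_pixel
    return raw_bytes / compression_ratio / 1e6


def yearly_storage(slides_per_day=100, days_per_year=200,
                   planning_mb_per_slide=250, dollars_per_tb=1000):
    """Return (slides per year, TB per year, dollars per year)."""
    slides = slides_per_day * days_per_year
    terabytes = slides * planning_mb_per_slide / 1e6  # MB -> TB
    return slides, terabytes, terabytes * dollars_per_tb


print(slide_size_mb())   # 180.0
print(yearly_storage())  # (20000, 5.0, 5000.0)
```

Changing `slides_per_day` is the quickest way to see how the yearly total moves; for example, `yearly_storage(slides_per_day=300)` triples both the capacity and the cost.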
- Typical server: Up to 80TB = 16 years
  - 2 controllers, each with 4 racks, each with up to 12 drives (1TB each)
  - With RAID 5, capacity = 2 x 4 x 11 (12-1) drives = 88TB
  - With RAID 6, capacity = 2 x 4 x 10 (12-2) drives = 80TB
  - 80TB / 5TB per year = 16 years
This example is drawn from current HP models; Dell and others are similar. It is a bad idea to buy too much storage too soon, because the capacity of disk drives keeps going up and the cost keeps coming down. I usually recommend buying a year's worth of storage each year.
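As a rough sketch, the server capacity figures can be reproduced like this. It assumes 1TB drives (which is what the 88TB and 80TB totals imply) and one RAID group per rack, so each rack gives up one drive's worth of capacity under RAID 5 and two under RAID 6. The function and parameter names are hypothetical.

```python
def usable_tb(controllers=2, racks_per_controller=4, drives_per_rack=12,
              parity_drives=1, tb_per_drive=1):
    """Usable capacity when each rack loses `parity_drives` drives to parity."""
    data_drives = drives_per_rack - parity_drives
    return controllers * racks_per_controller * data_drives * tb_per_drive


raid5_tb = usable_tb(parity_drives=1)  # 88
raid6_tb = usable_tb(parity_drives=2)  # 80
years_of_slides = raid6_tb / 5         # 16.0 years at 5TB per year

print(raid5_tb, raid6_tb, years_of_slides)
```

Raising `tb_per_drive` shows why it rarely pays to fill the chassis on day one: the same slots hold proportionally more when the drives are bought later.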
Backup
As can be seen from the capacity and cost data above, there is no reason not to keep all needed data online as long as it is needed. Any kind of archive solution which keeps some data online and takes others offline to tape, etc. will be more complicated and more expensive, with no benefit.
There are two reasons to back up data: to protect against device failure (e.g. a disk drive fails), and to protect against catastrophic failure (e.g. the computer room floods). With RAID, systems can easily recover from device failure with no further backup provisions. (Given the relatively small incremental cost, I recommend RAID 6 for most situations.) Offsite backup to protect against catastrophe is often best accomplished with disk-to-disk copies to removable drives. Because of the typical data volumes, network backup solutions are usually not practical.
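To see why network backup struggles with these volumes, here is a back-of-the-envelope sketch. It borrows the T1 rate of 200KB/s from the bandwidth section of this article and assumes, purely for illustration, that the backup gets three T1 lines entirely to itself. The function name and the scenario are my own.

```python
def transfer_hours(gigabytes, link_kb_per_s):
    """Hours needed to push `gigabytes` through a link of `link_kb_per_s`."""
    seconds = gigabytes * 1e9 / (link_kb_per_s * 1e3)
    return seconds / 3600


daily_gb = 100 * 250 / 1000   # 100 slides/day x 250MB = 25GB per day
three_t1_kb_per_s = 3 * 200   # three T1 lines at 200KB/s each

nightly_hours = transfer_hours(daily_gb, three_t1_kb_per_s)
full_year_days = transfer_hours(5000, three_t1_kb_per_s) / 24

print(round(nightly_hours, 1))  # hours to send one day's slides offsite
print(round(full_year_days))    # days to send a full year (5TB) offsite
```

Under these assumptions one day's slides take between 11 and 12 hours to send, and a full year's worth takes more than three months, which is why copying to removable drives tends to be the simpler option.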
Bandwidth
The main concerns about bandwidth are capacity and cost.
Capacity + Cost
- Typical user = 20KB/s
  - Computer screen = 1280 x 1024 pixels = 1.3Mpixel
  - x 3 bytes / pixel = 4MB
  - / 50:1 compression (JPEG2000) = 80KB / screen (the 20:1 ratio used for storage would give 200KB)
  - / one screen every 4 seconds = 20KB/s
- Typical lab: 3 T1 lines = $18,000 / year
  - 20 concurrent users x 20KB/s = 400KB/s
  - / 200KB/s (T1 line = 1.5Mb/s = 200KB/s) = 2 T1 lines
  - + 50% (safety margin) = 3 T1 lines
  - x $500 / month = $1,500 / month = $18,000 / year
Note that the number of concurrent users is the key variable.
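The bandwidth estimate can be sketched the same way. This starts from the 80KB per screen figure given above and treats a T1 as 200KB/s, as the list does. The function names and the `headroom` parameter are hypothetical, and the result is rounded up to whole lines.

```python
import math


def per_user_kb_per_s(screen_kb=80, seconds_per_screen=4):
    """Average rate for one user who pulls a new screen every few seconds."""
    return screen_kb / seconds_per_screen


def t1_lines_needed(concurrent_users=20, user_kb_per_s=20,
                    t1_kb_per_s=200, headroom=0.5):
    """Whole T1 lines needed, including a safety margin on top of raw demand."""
    base_lines = concurrent_users * user_kb_per_s / t1_kb_per_s
    return math.ceil(base_lines * (1 + headroom))


def yearly_cost(lines, dollars_per_line_month=500):
    """Annual cost in dollars for a given number of leased lines."""
    return lines * dollars_per_line_month * 12


lines = t1_lines_needed()
print(per_user_kb_per_s())  # 20.0
print(lines)                # 3
print(yearly_cost(lines))   # 18000
```

Try `t1_lines_needed(concurrent_users=40)` to see how quickly the line count grows as more people view slides at once.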
- Additional capacity is less expensive
  - DS3 (aka T3) = 28 x T1 = 45Mb/s = over 5MB/s
  - = 125 concurrent users = $6,000 / month
Each individual situation is different, but hopefully with these examples you can compute storage and bandwidth requirements for your own circumstances and approximate the cost.
Your questions and comments are welcome!



