Wednesday, April 15, 2015

Making DICOMWeb Fast

Over the years I have heard many people complain about slow DICOM performance.  In many cases these performance issues aren't with the standard itself but rather with the vendor implementation or with how the system was implemented (e.g. network, storage array, VMs, etc).  Medical images are large and tend to stress systems in ways that most architects, engineers and IT staff haven't had to deal with before.  It really takes a top notch engineering team to build a medical imaging system that is reliable, scalable and performant.  It also takes a top notch IT team to deploy and implement such a system.

The introduction of DICOMWeb does not magically fix these performance issues - in fact it may make them worse if vendors choose to bolt DICOMWeb functionality on top of their existing architecture rather than make the proper investment in reworking the software and system architecture.  Let's take a look at some of the new DICOMWeb APIs and see what the issues might be:

The first thing to realize is that QIDO-RS was specifically designed to be a layer on top of CFIND.  The idea was that vendors had invested in their CFIND implementations and would prefer to build a simple REST wrapper around them rather than create a whole new subsystem.  It is therefore entirely possible to create a QIDO-RS to CFIND adapter which would allow older PACS archives that only support CFIND to be accessible via QIDO-RS.  In fact, I would expect such products to come on the market later this year.  Such a product would be great for interoperability, but that doesn't necessarily mean it will be performant.  Since QIDO-RS delivers no new functionality over CFIND (at least from a performance point of view), we can consider them roughly the same.
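To make the adapter idea concrete, here is a rough Python sketch of the translation step only: turning a QIDO-RS study search URL into the matching and return keys a CFIND request would carry. The function name `qido_to_cfind` and the plain dictionary it returns are illustrative assumptions, not part of any real product; a working adapter would still have to encode the keys as a DICOM dataset, send them over a DIMSE association, and emulate `limit` and `offset` itself since CFIND has no paging.

```python
from urllib.parse import parse_qs, urlsplit


def qido_to_cfind(url):
    """Translate a QIDO-RS study search URL into CFIND-style request pieces.

    Returns (identifier, options). `identifier` maps attribute names to
    matching values, where an empty string means "return this attribute".
    `options` holds hints such as paging that the adapter must handle itself.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    identifier = {"QueryRetrieveLevel": "STUDY"}
    options = {}
    for name, values in query.items():
        if name == "includefield":
            # Each requested field becomes a return key with no matching value.
            for value in values:
                for field in value.split(","):
                    identifier.setdefault(field, "")
        elif name in ("limit", "offset"):
            options[name] = int(values[-1])
        elif name == "fuzzymatching":
            options[name] = values[-1].lower() == "true"
        else:
            # Everything else is treated as an attribute matching key.
            identifier[name] = values[-1]
    return identifier, options
```

Because the translation is this thin, the adapter adds almost nothing of its own: whatever time the archive spends answering the CFIND is the time the QIDO-RS client will wait.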

This of course means that any performance issues with an archive's CFIND implementation will also likely apply to its QIDO-RS implementation.  When moving to QIDO-RS (or expanding use of CFIND), you will need to take into account normal scaling and usage issues.  If your current CFIND usage is 1 query a second today, but then it increases to 60 queries a second because more clients start accessing it via QIDO-RS or CFIND, you may have a problem.

If you are interested in how one might implement QIDO-RS, feel free to take a look at the .NET prototype I did here.

STOW-RS is fairly straightforward - send a DICOM Instance to the archive and save it.  Storing inbound DICOM is a function that most archives are good at so there shouldn't be too much of a concern about adding STOW-RS to existing systems.  There may be some systems engineering issues with scaling inbound HTTP requests but it will likely take several years before STOW-RS gets enough usage for this to be an issue.  All inbound DICOM is done via DIMSE right now and it will continue to be that way for many years to come (modalities have 10+ year life cycles and many of the existing systems will never get STOW-RS options added to them).  In the short term, STOW-RS will most likely be used for two things:
1. Adding value-add SOP Instances to an existing study, such as Key Object Selection (KO), Presentation State (PR) and Structured Report (SR) objects
2. Importing of visible light images from mobile devices

Both of these use cases are expected to produce much less data than the normal image load so they should not have a major impact on system performance.  One thing to keep in mind is that sending large quantities of data over HTTP does require a bit of systems engineering work.  There are specific aspects of HTTP and TCP that still need to be taken into account, such as HTTP chunking, TCP window size and the use of compression (gzip/deflate).  Fortunately most IT networking folks are very much at home with these issues and should be able to make the necessary system configuration changes to make it performant.
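For reference, a STOW-RS store request is an HTTP POST whose body is a multipart/related message with one part per DICOM instance. The sketch below builds such a request in Python using only the standard library; the function name `build_stow_request` is made up for illustration, and the code assembles the whole body in memory, which is fine for a handful of KO, PR or SR objects but is exactly what you would replace with streaming or chunked transfer for larger payloads.

```python
import uuid


def build_stow_request(dicom_parts, boundary=None):
    """Build headers and body for a STOW-RS POST carrying DICOM instances.

    `dicom_parts` is a list of bytes objects, each assumed to be a complete
    DICOM Part 10 file.
    """
    boundary = boundary or uuid.uuid4().hex
    marker = b"--" + boundary.encode("ascii")
    chunks = []
    for part in dicom_parts:
        # Every instance travels in its own part with its own content type.
        chunks.append(marker + b"\r\nContent-Type: application/dicom\r\n\r\n")
        chunks.append(part)
        chunks.append(b"\r\n")
    # The closing delimiter ends the multipart message.
    chunks.append(marker + b"--\r\n")
    body = b"".join(chunks)
    headers = {
        "Content-Type": (
            'multipart/related; type="application/dicom"; '
            f"boundary={boundary}"
        ),
        "Content-Length": str(len(body)),
    }
    return headers, body
```

The result would be posted to the archive's `/studies` resource (or `/studies/{StudyInstanceUID}` to add to an existing study) with whatever HTTP client you prefer.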

WADO-RS consists of several APIs, each of which has a different level of complexity when it comes to implementation and performance.  Of these APIs the two that are most important for viewers are Retrieve Metadata and Retrieve Frames.  Unfortunately I haven't had access to an implementation of either yet so what I am going to describe here may not be 100% correct.  That being said, both of these features are heavily based on the work done on the MINT project which I was very involved in.

WADO-RS Retrieve Metadata
When it comes to viewing a study, the first call the viewer will make is to Retrieve Metadata to determine what SOP Instances are actually in the study. This call is therefore extremely time critical as the viewer won't be able to start loading any actual pixel data until the Retrieve Metadata call has completed and it has analyzed it. Once the Retrieve Metadata response has been analyzed, the viewer can determine what images it needs to load and fetch them using the Retrieve Frames call.
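The dependency between the two calls can be sketched as follows. This Python example takes a Retrieve Metadata response in the DICOM JSON format and produces the Retrieve Frames URLs a viewer would fetch next. The helper names are hypothetical, and the sketch assumes every instance in the response is an image; a real viewer would first filter out non-image objects such as SR or PR and would also read attributes like Rows, Columns and the transfer syntax.

```python
import json

SERIES_UID = "0020000E"        # Series Instance UID
SOP_UID = "00080018"           # SOP Instance UID
NUMBER_OF_FRAMES = "00280008"  # Number of Frames


def first_value(instance, tag, default=None):
    """Return the first Value of a DICOM JSON attribute, or `default`."""
    values = instance.get(tag, {}).get("Value")
    return values[0] if values else default


def frame_urls(base_url, study_uid, metadata_json):
    """Yield one Retrieve Frames URL per frame described in the metadata."""
    for instance in json.loads(metadata_json):
        series_uid = first_value(instance, SERIES_UID)
        sop_uid = first_value(instance, SOP_UID)
        # Number of Frames is normally absent on single-frame images.
        frames = int(first_value(instance, NUMBER_OF_FRAMES, 1))
        for frame in range(1, frames + 1):
            yield (f"{base_url}/studies/{study_uid}/series/{series_uid}"
                   f"/instances/{sop_uid}/frames/{frame}")
```

Nothing in `frame_urls` can run until the full metadata response has arrived, which is why the latency of that first call sets the floor for time to first image.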

The best possible performance for Retrieve Metadata is when the response is already prepared and stored on disk.  In this case, the archive has very little to do other than read a file from disk and return it to the client.  Web servers are very good at just that so the HTTP handler implementation could be as simple as looking up the file for a given study (probably in a database), opening the file and then giving the file handle to the web server to stream to the client.  Ideally the result would already be compressed on disk using a standard HTTP compression algorithm such as gzip so it could be immediately streamed without having to compress on the fly (or worse - returning it to the client uncompressed).
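A minimal sketch of that handler logic in Python is shown below. It assumes a hypothetical cache layout of one `<StudyInstanceUID>.json.gz` file per study in a single directory (the database lookup is left out), and it assumes the response media type is `application/dicom+json`. It also reads the file into memory for simplicity, where a production server would hand the open file to the web server to stream.

```python
import gzip
import os


def metadata_response(cache_dir, study_uid, accept_encoding=""):
    """Return (status, headers, body) for a cached, pre-gzipped metadata file."""
    # UIDs consist of digits and dots only; reject anything else so the
    # value can never escape the cache directory.
    if not study_uid.replace(".", "").isdigit():
        return 400, {}, b""
    path = os.path.join(cache_dir, study_uid + ".json.gz")
    if not os.path.isfile(path):
        return 404, {}, b""
    with open(path, "rb") as handle:
        stored = handle.read()
    headers = {"Content-Type": "application/dicom+json"}
    if "gzip" in accept_encoding.lower():
        # Common case: pass the stored bytes straight through, no CPU spent.
        headers["Content-Encoding"] = "gzip"
        body = stored
    else:
        # Fallback for clients that did not ask for gzip.
        body = gzip.decompress(stored)
    headers["Content-Length"] = str(len(body))
    return 200, headers, body
```

The point of the design is that the expensive work (parsing instances, building the response, compressing it) happens once at write time rather than on every read.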

Preparing the Retrieve Metadata response can be a bit tricky, especially for archives with lots of data archived.  To generate the response for a study, each SOP Instance must be opened, parsed and added to a data structure that will be used to generate the actual Retrieve Metadata response.  Iterating through TB (or PB) of data to do this conversion could take months if not years depending upon the storage system and available processing power.  In this case, it might make sense to convert "on the fly" when a study is requested and, even better, to convert all of a patient's priors when a new study arrives.  A related issue is keeping the response up to date when the study changes due to SOP Instances being added, modified or deleted.  In this case, the archive will need to detect these changes and trigger a rebuild of the Retrieve Metadata response.  This may not sound that hard to do, but it gets tricky when you start dealing with all the failure scenarios that an ACID compliant SQL database takes care of for you (this assumes the Retrieve Metadata response is stored on the file system, not in the DB).
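One of those failure scenarios is a reader picking up a half-written file while a rebuild is in progress. A common way to avoid it on a file system is to write the new response to a temporary file and then rename it over the old one. The Python sketch below shows that pattern under the same hypothetical `<StudyInstanceUID>.json.gz` cache layout; it covers only the file swap, not change detection, locking between concurrent rebuilds, or keeping the file consistent with the database.

```python
import gzip
import json
import os
import tempfile


def write_metadata_cache(cache_dir, study_uid, instances):
    """Replace the cached metadata file for one study in a single step.

    `instances` is a list of DICOM JSON objects, one per SOP Instance.
    """
    payload = gzip.compress(json.dumps(instances).encode("utf-8"))
    final_path = os.path.join(cache_dir, study_uid + ".json.gz")
    # The temporary file lives in the same directory so that the rename
    # below never has to cross file systems.
    fd, temp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes reach the disk
        # Readers see either the old file or the new one, never a partial one.
        os.replace(temp_path, final_path)
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    return final_path
```

If the process dies before the rename, the previous response is still intact and only a stray `.tmp` file needs cleaning up.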

WADO-RS Retrieve Frames
WADO-RS Retrieve Frames allows a client to request the pixel data for one or more image frames from the archive.  Retrieve Frames allows the client to request that the image frames be sent in a specific transfer syntax.  Servers are expected to support conversion from one transfer syntax to another (also known as transcoding).  WADO-URI also supports transcoding but I have worked with many WADO-URI implementations that did not support transcoding at all.  Transcoding is CPU intensive and often requires licensing of expensive image compression libraries to support the various compression algorithms that DICOM supports (specifically JPEG2000 and JPEG-LS).  Transcoding is one of the bigger performance challenges when it comes to DICOM, as the client may not support (or even know) the transfer syntax the archive stored the SOP Instance in and may require the server to transcode it so it can be displayed.  It does seem that many VNA implementations are configured to archive using the JPEG2000 transfer syntax so clients should plan on supporting JPEG2000 directly if performance is important.
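The cost difference comes down to a simple decision on the server, sketched here in Python. The function name `plan_frame_response` and its parameters are hypothetical, and the fallback to Explicit VR Little Endian when the client states no preference is an assumption about the default rather than something taken from a specific implementation. The transfer syntax UIDs themselves are the standard DICOM ones.

```python
EXPLICIT_VR_LITTLE_ENDIAN = "1.2.840.10008.1.2.1"
JPEG_LS_LOSSLESS = "1.2.840.10008.1.2.4.80"
JPEG_2000_LOSSLESS = "1.2.840.10008.1.2.4.90"


def plan_frame_response(stored_ts, acceptable_ts, can_transcode_to):
    """Decide which transfer syntax to send and whether transcoding is needed.

    Returns (transfer_syntax, needs_transcode), or (None, False) when the
    request cannot be satisfied and the server should answer 406.
    """
    # Assumption: with no stated preference the client gets uncompressed data.
    wanted = list(acceptable_ts) or [EXPLICIT_VR_LITTLE_ENDIAN]
    if stored_ts in wanted:
        return stored_ts, False      # cheap path: send the bytes as stored
    for candidate in wanted:
        if candidate in can_transcode_to:
            return candidate, True   # expensive path: decode and re-encode
    return None, False
```

A client that lists the archive's stored transfer syntax among the ones it accepts always lands on the cheap path, which is the practical argument for viewers decoding JPEG2000 themselves.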

This is just a taste of the performance issues related to DICOMWeb.  WADO-RS Retrieve Metadata may very well be the most complex and performance critical API that archives will have to implement.  If you are looking to run a viewer from another vendor off of your DICOMWeb archive, make sure the archive supports WADO-RS Retrieve Metadata and does it in a performant way.  You also want to look at QIDO-RS and make sure it can scale.  The vendor's CFIND performance will be a good leading indicator of how well QIDO-RS will do.
