Wednesday, April 29, 2015

HL7 FHIR DSTU 2 ImagingObjectSelection Resource

While working on the SIIM 2015 Hackathon Grand Challenge, I ran into an issue with how to represent different views of a study using the ImagingStudy resource.  What I wanted to do was create one ImagingStudy for all images in a study and another ImagingStudy for just the key images in that study.  The idea behind key images is to show a subset of the images in the study - specifically those that are referenced in the diagnostic report.  Some DICOM studies can be quite large - hundreds or thousands of images (e.g. thin slice CT, fMRI, perfusion, etc.) - and displaying all of the images can be overwhelming for most non-radiologists.  Key images are therefore a critically important output of a radiologist's interpretation because they make imaging meaningful to most users outside of radiology.  They are also a key part of making the "multi-media report" work for the grand challenge.

The documentation for the ImagingStudy resource calls out the key image use case so my expectation is reasonable.  The data model also supports multiple ImagingStudy resources per DiagnosticReport so it is possible to have multiple views (e.g. full study, key images, qa images, etc.) - but how would a viewer determine which one to display?  There is no code that can be used to determine what type of ImagingStudy a given resource is.  Figuring out which ImagingStudy to initially display would therefore require some convoluted logic that would probably be unreliable.  My heart broke a bit as I encountered my first real world limitation of FHIR.  Fortunately we have control over the datasets we are using in the hackathon so I have the ability to "make it work" through a variety of means.  I decided I would reach out to the FHIR gurus and see what they had to say.


I quickly learned that this was the wrong place for my question and that I should check with the Imaging Integration working group.  I also learned about HL7 FHIR DSTU 2 - the next version of HL7 FHIR that is currently being worked on, which includes a new resource named ImagingObjectSelection.  It seems that the new version of ImagingStudy is a manifest of the entire study and all subsets are moved into the ImagingObjectSelection.  The ImagingObjectSelection has a title property which is a CodeableConcept of type KOStitle.  KOStitle has around 20 different terms that can be used to give meaning to a given set of images.
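
To illustrate how a viewer might use the title to decide which set of images to show first, here is a minimal Python sketch.  It assumes each ImagingObjectSelection arrives as a JSON-like dictionary with the title stored as a CodeableConcept (title.coding[].code); the exact shape may differ in the draft.  The code 113000 ("Of Interest") and 113004 ("For Teaching") come from the DICOM Key Object Selection document titles, while the helper name and the sample data are made up for illustration.

```python
def find_selection_by_title(selections, code):
    """Return the first selection whose title has a coding with the given code."""
    for selection in selections:
        codings = selection.get("title", {}).get("coding", [])
        if any(coding.get("code") == code for coding in codings):
            return selection
    return None


# Hypothetical sample data: two selections that refer to the same study.
selections = [
    {"resourceType": "ImagingObjectSelection", "id": "teaching",
     "title": {"coding": [{"code": "113004", "display": "For Teaching"}]}},
    {"resourceType": "ImagingObjectSelection", "id": "key-images",
     "title": {"coding": [{"code": "113000", "display": "Of Interest"}]}},
]

key_images = find_selection_by_title(selections, "113000")
print(key_images["id"])
```

With a coded title the viewer no longer has to guess which set of images is the key image set.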

One thing I can't figure out from the documentation is how the ImagingObjectSelection is related to the DiagnosticReport and ImagingStudy.  It is still a work in progress so perhaps it hasn't been decided yet or is just not documented.  Regardless, my faith in FHIR has been restored as the folks behind it are definitely on the right track!


Tuesday, April 28, 2015

SIIM 2015 Hackathon Grand Challenge

Yesterday I spent about eight hours building out the foundation for the SIIM 2015 Hackathon Grand Challenge.  This foundation is a simple web based, radiology focused EMR client that includes functionality such as patient search and the display of a patient's radiology reports with images.  The hope is that, with such a foundation in place, hackers will have a baseline system to start hacking from rather than starting from scratch.

The SIIM Hackathon committee had already deployed a Spark FHIR server and DCM4CHEE v4 server into AWS and loaded both with synchronized data sets.  A special thanks to Mark Kohli, Steve Langer, and Jason Hostettler for their work on the datasets and Mohannad Hussain for setting up the DCM4CHEE server.  I also contributed the ImagingStudy resources to the FHIR dataset which were generated by converting the response from a WADO-RS Retrieve Metadata call using this tool.

I wanted to keep the learning curve for the baseline system low so hackers could get started quickly.  To support this, I decided to keep the third party dependencies to a minimum.  I ended up using only jQuery and Bootstrap since most web developers are familiar with both.  I ruled out other popular (and powerful) libraries such as Meteor (my favorite), Angular and KnockoutJS as they take some time to learn and not everyone knows them.

My first goal was to create an architecture spike to prove out that it would work and also lay down the foundational pieces that I could build on top of.  I decided to begin with displaying a list of patients that I would obtain by querying the FHIR server.  This turned out to be very easy and took about 30 minutes to get up and going.  It basically involved making a query to the Patient resource filtered by the last name of the patients in our datasets and creating rows in a table from the results:

http://fhir.hackathon.siim.org/fhir/Patient?family=SIIM
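
The actual client does this with jQuery, but the same idea can be sketched in a few lines of Python.  The sketch builds the search URL and turns a search result into table rows.  It assumes the server returns a JSON bundle with the resources under entry[].resource and patient names stored as lists of strings; the bundle layout on the hackathon server may differ, and the sample bundle below is invented.

```python
from urllib.parse import urlencode

FHIR_BASE = "http://fhir.hackathon.siim.org/fhir"  # server root used in this post


def fhir_search_url(resource_type, **params):
    """Build a FHIR search URL such as <base>/Patient?family=SIIM."""
    query = urlencode(params, safe="/")
    return f"{FHIR_BASE}/{resource_type}?{query}"


def patient_rows(bundle):
    """Flatten a search result bundle into one dictionary per table row."""
    rows = []
    for entry in bundle.get("entry", []):
        patient = entry.get("resource", {})
        name = (patient.get("name") or [{}])[0]
        rows.append({
            "id": patient.get("id"),
            "family": " ".join(name.get("family", [])),
            "given": " ".join(name.get("given", [])),
        })
    return rows


# Hypothetical response, shaped according to the assumptions above.
sample_bundle = {"entry": [{"resource": {
    "resourceType": "Patient", "id": "siimjoe",
    "name": [{"family": ["SIIM"], "given": ["Joe"]}]}}]}

patient_url = fhir_search_url("Patient", family="SIIM")
rows = patient_rows(sample_bundle)
print(patient_url)
print(rows)
```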

Next up was creating a radiology centric patient view.  The first thing I needed was to display the available reports for the user.  This required querying the DiagnosticReport resource filtered by the ID for the selected Patient and creating rows in a table from the results:

http://fhir.hackathon.siim.org/fhir/DiagnosticReport?subject=Patient/siimjoe

This was up and running after another 30 minutes.  After this, I needed to display the actual report the user clicked on in the list of reports.  I hooked the click event on the table row and made a request for the associated DiagnosticReport by ID:

http://fhir.hackathon.siim.org/fhir/DiagnosticReport/2257132503242682

I grabbed the human readable form of the report from the data.text.div property.  This is normally HTML so I used the jQuery parseHTML() function to parse it into DOM nodes that I could stick directly into the DOM.  This took yet another 30 minutes.

The last step was to display the images for the report.  This required searching for ImagingStudy resources based on the accession number stored in the selected DiagnosticReport resource:

http://fhir.hackathon.siim.org/fhir/ImagingStudy?accession=2257132503242682

There may actually be multiple ImagingStudy resources for a given DiagnosticReport so some logic was required to pick the right one.  I decided to pick the one with the fewest number of referenced SOP Instances assuming that the smallest one would contain the key images (rather than the entire study).  Once I had the ImagingStudy, I decided to pick the first image in the first series and display it using cornerstone.  I haven't built a WADO-RS based ImageLoader for cornerstone yet so I decided to use cornerstoneWADOImageLoader and load images via WADO-URI.  30 minutes later, I had an image displayed!
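
The selection logic described above can be sketched roughly as follows (in Python rather than the JavaScript the client is written in).  The sketch assumes each ImagingStudy is a dictionary with uid, series[].uid and series[].instance[].uid fields holding values such as "urn:oid:1.2.3"; the WADO-URI root and the sample data are placeholders.

```python
from urllib.parse import urlencode


def instance_count(imaging_study):
    """Count the SOP Instances referenced across all series of an ImagingStudy."""
    return sum(len(series.get("instance", []))
               for series in imaging_study.get("series", []))


def pick_key_image_study(imaging_studies):
    """Pick the study with the fewest instances, assuming it holds the key images."""
    return min(imaging_studies, key=instance_count)


def strip_oid(uid):
    """Turn 'urn:oid:1.2.3' into the bare DICOM UID that WADO-URI expects."""
    prefix = "urn:oid:"
    return uid[len(prefix):] if uid.startswith(prefix) else uid


def first_image_wado_uri(wado_root, imaging_study):
    """Build a WADO-URI request for the first image in the first series."""
    series = imaging_study["series"][0]
    instance = series["instance"][0]
    query = urlencode({
        "requestType": "WADO",
        "studyUID": strip_oid(imaging_study["uid"]),
        "seriesUID": strip_oid(series["uid"]),
        "objectUID": strip_oid(instance["uid"]),
        "contentType": "application/dicom",
    })
    return f"{wado_root}?{query}"


# Hypothetical data: the full study and a key image subset of it.
full_study = {"uid": "urn:oid:1.2.3", "series": [{"uid": "urn:oid:1.2.3.1", "instance": [
    {"uid": "urn:oid:1.2.3.1.1"}, {"uid": "urn:oid:1.2.3.1.2"}, {"uid": "urn:oid:1.2.3.1.3"}]}]}
key_images = {"uid": "urn:oid:1.2.3", "series": [{"uid": "urn:oid:1.2.3.1", "instance": [
    {"uid": "urn:oid:1.2.3.1.2"}]}]}

chosen = pick_key_image_study([full_study, key_images])
image_url = first_image_wado_uri("http://example.org/wado", chosen)
print(image_url)
```

Note that "fewest instances" is only a heuristic; it picks the wrong resource whenever some other small subset (e.g. qa images) is present.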

The spike took about 2 1/2 hours to complete and not only proved the concept but provided a great foundation to work from.  I spent the rest of the day refactoring the code for readability and adding more functionality. If you are really interested in seeing how this was built, check out the commit log.

Wednesday, April 22, 2015

QIDO-RS Capability Breakdown

Recently I created a GitHub repository to track the available DICOMWeb implementations and their capabilities.  While a vendor may claim to support a given DICOMWeb API, the devil really is in the details.  These details are supposed to be described in the conformance statement, but conformance statements are usually incomplete (or wrong).  This post attempts to break down QIDO-RS at a deeper level so we can more accurately assess how complete and compliant a given implementation is:


Feature | Description
application/dicom+xml | Can return responses with Content-Type: multipart/related; type=application/dicom+xml
application/json | Can return responses with Content-Type: application/json
gzip | Can return responses with Content-Encoding: gzip. Note that this is not mentioned in the standard and is therefore not required (but I feel it should be supported and am therefore including it)
no-cache | Supports the Cache-Control: no-cache header which ensures the result is current and not cached on the client side. Note that this is not required by the standard (but I feel it should be supported and am therefore including it)
group/element | Supports specifying DICOM elements by group/element (e.g. 0020000D)
keyword | Supports specifying DICOM elements by DICOM keyword (e.g. StudyInstanceUID)
sequences | Supports specifying sequence elements (e.g. RequestAttributesSequence.RequestedProcedureID)
limit | Supports the limit option for queries (limits the number of records returned)
offset | Supports the offset option for queries (skips records in the response to support paging)
TimezoneOffsetFromUTC | Supports specification of the timezone as part of date/time queries. Note that this is not required by the standard
QIDO-RS Studies | Supports searching for studies via the /studies endpoint
QIDO-RS Studies response | /studies response includes all required attributes
StudyDate - single | Supports searching for studies by study date by single value match (e.g. exact match)
StudyDate - range | Supports searching for studies by study date range
StudyTime - single | Supports searching for studies by study time by single value match (e.g. exact match)
StudyTime - range | Supports searching for studies by study time range
AccessionNumber - single | Supports searching for studies by accession number by single value match (e.g. 1233456)
AccessionNumber - wildcard | Supports searching for studies by accession number by wildcard (e.g. 1234*)
ModalitiesInStudy - single | Supports searching for studies by modalities in study by single value match (e.g. CT)
ModalitiesInStudy - list | Supports searching for studies by modalities in study by list of values (e.g. CT,MR,US)
ReferringPhysicianName - single | Supports searching for studies by ReferringPhysicianName by single value match
ReferringPhysicianName - wildcard | Supports searching for studies by ReferringPhysicianName by wildcard
PatientName - single | Supports searching for studies by PatientName by single value match
PatientName - wildcard | Supports searching for studies by PatientName by wildcard
PatientID - single | Supports searching for studies by PatientID by single value match
PatientID - wildcard | Supports searching for studies by PatientID by wildcard
PatientID - list | Supports searching for studies by PatientID by list of values
StudyInstanceUID - single | Supports searching for studies by StudyInstanceUID by single value match
StudyInstanceUID - list | Supports searching for studies by StudyInstanceUID by list of values
StudyID - single | Supports searching for studies by StudyID by single value match
StudyID - list | Supports searching for studies by StudyID by list of values
StudyID - wildcard | Supports searching for studies by StudyID by wildcard match
QIDO-RS Study Series | Supports searching for series via the /studies/{StudyInstanceUID}/series endpoint
QIDO-RS Series | Supports searching for series via the /series endpoint
QIDO-RS Series response | Search for series response includes all required attributes
Modality - single | Supports searching for series by Modality by single value match
Modality - list | Supports searching for series by Modality by list of values
SeriesInstanceUID - single | Supports searching for series by SeriesInstanceUID by single value match
SeriesInstanceUID - list | Supports searching for series by SeriesInstanceUID by list of values
SeriesNumber - single | Supports searching for series by SeriesNumber by single value match
SeriesNumber - range | Supports searching for series by SeriesNumber by matching a range of values
SeriesNumber - list | Supports searching for series by SeriesNumber by matching a list of values
PerformedProcedureStepStartDate - single | Supports searching for series by PerformedProcedureStepStartDate by single value match
PerformedProcedureStepStartDate - range | Supports searching for series by PerformedProcedureStepStartDate by matching a range of values
PerformedProcedureStepStartTime - single | Supports searching for series by PerformedProcedureStepStartTime by single value match
PerformedProcedureStepStartTime - range | Supports searching for series by PerformedProcedureStepStartTime by matching a range of values
ScheduledProcedureStepID - single | Supports searching for series by RequestAttributesSequence.ScheduledProcedureStepID by single value match
RequestedProcedureID - single | Supports searching for series by RequestAttributesSequence.RequestedProcedureID by single value match
QIDO-RS Series Instances | Supports searching for instances in a series via the /studies/{StudyInstanceUID}/series/{SeriesInstanceUID}/instances endpoint
QIDO-RS Study Instances | Supports searching for instances in a study via the /studies/{StudyInstanceUID}/instances endpoint
QIDO-RS Instances | Supports searching for instances via the /instances endpoint
QIDO-RS Instance response | Search for instance response includes all required attributes
SOPClassUID - single | Supports searching for instances by SOPClassUID by single value match
SOPClassUID - list | Supports searching for instances by SOPClassUID by list of values
SOPInstanceUID - single | Supports searching for instances by SOPInstanceUID by single value match
SOPInstanceUID - list | Supports searching for instances by SOPInstanceUID by list of values
InstanceNumber - single | Supports searching for instances by InstanceNumber by single value match
InstanceNumber - range | Supports searching for instances by InstanceNumber by range of values
InstanceNumber - list | Supports searching for instances by InstanceNumber by list of values
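
To make a few of these features concrete, here is a small Python sketch that builds QIDO-RS request URLs exercising wildcard, range and list matching together with limit and offset.  The server root and the helper name are made up; the parameter names (limit, offset, includefield) are the ones QIDO-RS defines, and the headers correspond to the application/json, gzip and no-cache rows.

```python
from urllib.parse import urlencode

# Request headers matching the application/json, gzip and no-cache rows above.
QIDO_HEADERS = {
    "Accept": "application/json",
    "Accept-Encoding": "gzip",
    "Cache-Control": "no-cache",
}


def qido_url(base, path, matches=None, include_fields=(), limit=None, offset=None):
    """Build a QIDO-RS search URL from matching keys and paging options."""
    params = list((matches or {}).items())
    params += [("includefield", field) for field in include_fields]
    if limit is not None:
        params.append(("limit", limit))
    if offset is not None:
        params.append(("offset", offset))
    # Keep '*' (wildcard) and ',' (value list) readable instead of percent-encoding them.
    query = urlencode(params, safe="*,")
    return f"{base}{path}?{query}" if query else f"{base}{path}"


BASE = "http://example.org/dicomweb"  # hypothetical server root

studies_url = qido_url(
    BASE, "/studies",
    matches={"PatientName": "SIIM*",             # wildcard
             "StudyDate": "20150101-20150131",   # range
             "ModalitiesInStudy": "CT,MR"},      # list
    limit=25, offset=50)
print(studies_url)
```

Running the same set of URLs against each implementation is one way to fill in the table for a given vendor.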


Vendors can go beyond this list of features by adding support for additional matching keys, including additional fields in the response, fuzzy matching and relational queries.  I didn't include these features as they are not required by the standard, not required by most use cases and rarely implemented.

What are your thoughts on this list?  Should something be added or removed?

Wednesday, April 15, 2015

Making DICOMWeb Fast

Over the years I have heard many people complain about slow DICOM performance.  In many cases these performance issues aren't with the standard itself but rather with the vendor implementation or with how the system was deployed (e.g. network, storage array, VMs, etc.).  Medical images are large and tend to stress systems in ways that most architects, engineers and IT staff haven't had to deal with before.  It really takes a top notch engineering team to build a medical imaging system that is reliable, scalable and performant.  It also takes a top notch IT team to deploy and implement such a system.

The introduction of DICOMWeb does not magically fix these performance issues - in fact it may make them worse if vendors choose to bolt DICOMWeb functionality on top of their existing architecture rather than make the proper investment in reworking the software and system architecture.  Let's take a look at some of the new DICOMWeb APIs and see what the issues might be:

QIDO-RS
The first thing to realize is that QIDO-RS was specifically designed to be a layer on top of CFIND.  The idea was that vendors had invested in their CFIND implementations and would prefer to build a simple REST wrapper around it rather than create a whole new subsystem.  It is therefore entirely possible to create a QIDO-RS to CFIND adapter which would allow older PACS archives that only support CFIND to be accessible via QIDO-RS.  In fact, I would expect such products to come on the market later this year.  Such a product would be great for interoperability, but doesn't necessarily mean that it will be performant.  Since QIDO-RS delivers no new functionality over CFIND (at least from a performance point of view), we can consider them roughly the same.
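
A minimal Python sketch of the translation step in such an adapter might look like the following.  It only maps a QIDO-RS query string onto a dictionary standing in for the CFIND identifier; a real adapter would hand that to a DICOM toolkit and convert the results back.  The function name is made up, and converting every comma into a backslash is a simplification.

```python
from urllib.parse import parse_qsl

# Paging options have no CFIND equivalent, so the adapter must apply them itself.
PAGING_KEYS = {"limit", "offset"}


def qido_to_cfind(level, query_string):
    """Translate a QIDO-RS query string into (CFIND identifier, paging options)."""
    identifier = {"QueryRetrieveLevel": level}
    paging = {}
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        if key in PAGING_KEYS:
            paging[key] = int(value)
        elif key == "includefield":
            # Return keys are sent with an empty value in a CFIND identifier.
            identifier.setdefault(value, "")
        else:
            # DICOM separates multiple values with a backslash where QIDO-RS uses a comma.
            identifier[key] = value.replace(",", "\\")
    return identifier, paging


identifier, paging = qido_to_cfind(
    "STUDY",
    "PatientID=12345&StudyInstanceUID=1.2.3,1.2.4&includefield=StudyDescription&limit=10&offset=20")
print(identifier)
print(paging)
```

Because limit and offset end up outside the identifier, the adapter still has to run the full CFIND and discard results, which is one reason a wrapper like this adds interoperability but not speed.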

This of course means that any performance issues with an archive's CFIND implementation will also likely apply to its QIDO-RS implementation.  When moving to QIDO-RS (or expanding use of CFIND), you will need to take into account normal scaling and usage issues.  If your current CFIND usage is 1 query a second today, but then it increases to 60 queries a second because more clients start accessing it via QIDO-RS or CFIND - you may have a problem.

If you are interested in how one might implement QIDO-RS, feel free to take a look at the .NET prototype I did here.

STOW-RS
STOW-RS is fairly straightforward - send a DICOM Instance to the archive and save it.  Storing inbound DICOM is a function that most archives are good at so there shouldn't be too much of a concern about adding STOW-RS to existing systems.  There may be some systems engineering issues with scaling inbound HTTP requests but it will likely take several years before STOW-RS gets enough usage for this to be an issue.  All inbound DICOM is done via DIMSE right now and it will continue to be that way for many years to come (modalities have 10+ year life cycles and many of the existing systems will never get STOW-RS options added to them).  In the short term, STOW-RS will most likely be used for two things:
1. Adding value-added SOP Instances such as KO, PR and SR to an existing study
2. Importing of visible light images from mobile devices

Both of these use cases are expected to produce much less data than the normal image load so they should not have a major impact on system performance.  One thing to keep in mind is that sending large quantities of data over HTTP does require a bit of systems engineering work.  There are specific aspects of HTTP and TCP that still need to be taken into account such as HTTP chunking, TCP window size and use of compression (gzip/deflate).  Fortunately most IT networking folks are very much at home with these issues and should be able to make the necessary system configuration changes to make it performant.
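
For reference, a STOW-RS request is an HTTP POST with a multipart/related body in which each part is one DICOM file.  The Python sketch below assembles such a body in memory; the helper name is made up and the payloads are dummy bytes standing in for real DICOM files.  For large uploads a real client would stream the parts rather than build the whole body up front.

```python
import uuid


def build_stow_request(dicom_parts, boundary=None):
    """Return (headers, body) for a STOW-RS POST carrying DICOM file payloads."""
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for part in dicom_parts:
        chunks.append(f"--{boundary}\r\n".encode("ascii"))
        chunks.append(b"Content-Type: application/dicom\r\n\r\n")
        chunks.append(part)
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    body = b"".join(chunks)
    headers = {
        "Content-Type": f'multipart/related; type="application/dicom"; boundary={boundary}',
        "Content-Length": str(len(body)),
    }
    return headers, body


# Dummy payloads stand in for the bytes of two DICOM files.
headers, body = build_stow_request([b"AAAA", b"BBBB"], boundary="b")
print(headers["Content-Type"])
print(len(body))
```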

WADO-RS
WADO-RS consists of several APIs, each of which has a different level of complexity when it comes to implementation and performance.  Of these APIs the two that are most important for viewers are Retrieve Metadata and Retrieve Frames.  Unfortunately I haven't had access to an implementation of either yet so what I am going to describe here may not be 100% correct.  That being said, both of these features are heavily based on the work done on the MINT project which I was very involved in.

WADO-RS Retrieve Metadata
When it comes to viewing a study, the first call the viewer will make is to Retrieve Metadata to determine what SOP Instances are actually in the study. This call is therefore extremely time critical as the viewer won't be able to start loading any actual pixel data until the Retrieve Metadata call has completed and it has analyzed it. Once the Retrieve Metadata response has been analyzed, the viewer can determine what images it needs to load and fetch them using the Retrieve Frames call.

The best possible performance for Retrieve Metadata is when the response is already prepared and stored on disk.  In this case, the archive has very little to do other than read a file from disk and return it to the client.  Web servers are very good at just that so the HTTP handler implementation could be as simple as looking up the file for a given study (probably in a database), opening the file and then giving the file handle to the web server to stream to the client.  Ideally the result would already be compressed on disk using a standard HTTP compression algorithm such as gzip so it could be immediately streamed without having to compress on the fly (or worse - returning it to the client uncompressed).
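
Here is a rough Python sketch of that idea, assuming a JSON response and a simple one-file-per-study layout on disk (both are assumptions, as are the function names).  The response is compressed once when it is prepared and the stored bytes are returned untouched to any client that accepts gzip.

```python
import gzip
import json
import tempfile
from pathlib import Path


def store_metadata(root, study_uid, metadata):
    """Serialize a study's metadata once and keep it gzip-compressed on disk."""
    path = Path(root) / f"{study_uid}.json.gz"
    path.write_bytes(gzip.compress(json.dumps(metadata).encode("utf-8")))
    return path


def serve_metadata(root, study_uid, accept_encoding=""):
    """Return (headers, body) for a Retrieve Metadata request without parsing any DICOM."""
    compressed = (Path(root) / f"{study_uid}.json.gz").read_bytes()
    headers = {"Content-Type": "application/json"}
    if "gzip" in accept_encoding.lower():
        # Common case: hand the stored bytes to the client as they are.
        headers["Content-Encoding"] = "gzip"
        return headers, compressed
    # Fallback for clients that did not ask for gzip.
    return headers, gzip.decompress(compressed)


with tempfile.TemporaryDirectory() as root:
    sample = [{"0020000D": {"vr": "UI", "Value": ["1.2.3"]}}]  # hypothetical one-instance study
    store_metadata(root, "1.2.3", sample)
    gz_headers, gz_body = serve_metadata(root, "1.2.3", accept_encoding="gzip, deflate")
    plain_headers, plain_body = serve_metadata(root, "1.2.3")

print(gz_headers)
```

In a production server the read would be replaced by handing the file to the web server to stream, as described above.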

Preparing the Retrieve Metadata response can be a bit tricky, especially for archives with lots of data archived.  To generate the response for a study, each SOP Instance must be opened, parsed and added to a data structure that will be used to generate the actual Retrieve Metadata response.  Iterating through TB (or PB) of data to do this conversion could take months if not years depending upon the storage system and available processing power.  In this case, it might make sense to convert "on the fly" when requested, and even better for all priors when a new study arrives.  A related issue is keeping the response up to date when the study changes due to SOP Instances being added, modified or deleted.  In this case, the archive will need to detect these changes and trigger a rebuild of the Retrieve Metadata response.  This may not sound that hard to do, but it gets tricky when you start dealing with all the failure scenarios that an ACID compliant SQL database takes care of for you (this assumes the Retrieve Metadata response is stored on the file system, not in the DB).
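
The "convert on the fly and rebuild on change" approach can be sketched as a small cache.  This Python sketch is in-memory only and deliberately ignores the failure scenarios mentioned above (crashes between a change and the rebuild, concurrent updates, etc.); the class and method names are made up.

```python
class MetadataCache:
    """Build Retrieve Metadata responses on first request and rebuild after changes."""

    def __init__(self, build_response):
        self._build_response = build_response  # callable: study_uid -> response
        self._responses = {}

    def get(self, study_uid):
        """Return the prepared response, building it the first time it is requested."""
        if study_uid not in self._responses:
            self._responses[study_uid] = self._build_response(study_uid)
        return self._responses[study_uid]

    def study_changed(self, study_uid):
        """Call when instances are added, modified or deleted; the next get() rebuilds."""
        self._responses.pop(study_uid, None)


build_calls = []


def fake_build(study_uid):
    """Stand-in for opening and parsing every SOP Instance in the study."""
    build_calls.append(study_uid)
    return {"study": study_uid, "version": len(build_calls)}


cache = MetadataCache(fake_build)
first = cache.get("1.2.3")
again = cache.get("1.2.3")       # served from the cache, no rebuild
cache.study_changed("1.2.3")     # e.g. a new SOP Instance arrived
rebuilt = cache.get("1.2.3")
print(len(build_calls))
```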

WADO-RS Retrieve Frames
WADO-RS Retrieve Frames allows a client to request the pixel data for one or more image frames from the archive.  Retrieve Frames allows the client to request the image frames be sent in a specific transfer syntax.  Servers are expected to support conversion from one transfer syntax to another (also known as transcoding).  WADO-URI also supports transcoding but I have worked with many implementations of it that did not support it at all.  Transcoding is CPU intensive and often requires licensing of expensive image compression libraries to support the various compression algorithms that DICOM supports (specifically JPEG2000 and JPEG-LS).  Transcoding is one of the bigger performance challenges when it comes to DICOM as the client may not support (or know) what transfer syntax the archive stored the SOP Instance in and may require the server to transcode it so it can be displayed.  It does seem that many VNA implementations are configured to archive using the JPEG2000 transfer syntax so clients should plan on supporting JPEG2000 directly if performance is important.
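
The decision the server has to make for each request can be sketched in a few lines of Python.  The transfer syntax UIDs are the standard DICOM ones; the function name and the "first acceptable syntax wins" policy are assumptions made for illustration.

```python
EXPLICIT_VR_LITTLE_ENDIAN = "1.2.840.10008.1.2.1"
JPEG_LS_LOSSLESS = "1.2.840.10008.1.2.4.80"
JPEG_2000_LOSSLESS = "1.2.840.10008.1.2.4.90"


def choose_transfer_syntax(stored, acceptable):
    """Return (transfer_syntax, needs_transcoding) for a Retrieve Frames request.

    `stored` is how the archive holds the frames; `acceptable` is the client's
    list of transfer syntaxes in order of preference.
    """
    if not acceptable:
        raise ValueError("client must accept at least one transfer syntax")
    if stored in acceptable:
        # Cheap path: stream the stored frames without touching the pixels.
        return stored, False
    # Expensive path: decode and re-encode into the client's first choice.
    return acceptable[0], True


# A client that can decode JPEG2000 avoids transcoding on a JPEG2000 archive.
fast = choose_transfer_syntax(JPEG_2000_LOSSLESS,
                              [EXPLICIT_VR_LITTLE_ENDIAN, JPEG_2000_LOSSLESS])
slow = choose_transfer_syntax(JPEG_2000_LOSSLESS, [EXPLICIT_VR_LITTLE_ENDIAN])
print(fast, slow)
```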

Conclusion
This is just a taste of the performance issues related to DICOMWeb.  WADO-RS Retrieve Metadata may very well be the most complex and performance critical API that archives will have to implement.  If you are looking to run a viewer from another vendor off of your DICOMWeb archive, make sure the archive supports WADO-RS Retrieve Metadata and does it in a performant way.  You also want to look at QIDO-RS and make sure it can scale.  The vendor's CFIND performance will be a good leading indicator of how well QIDO-RS will do.








Sunday, April 12, 2015

The importance of WADO-RS Retrieve Metadata

One of the most exciting aspects of WADO-RS is the Retrieve Metadata call.  Retrieve Metadata enables access to all non-pixel elements for all SOP Instances in a study with a single HTTP request.  This capability doesn't exist in DIMSE services, requiring applications to either design around what is available via CFIND/QIDO-RS (a small subset of tags) or prefetch the entire study via CMOVE/WADO-RS Retrieve Study (which includes pixel data) in advance.

Designing around CFIND/QIDO-RS is a huge limitation as there are many elements not returned by CFIND that are required for viewers to properly display the right initial images to a user.  For diagnostic use cases, it is really important to display the right initial images to the radiologist as quickly as possible (less than one second from opening the study ideally).  Doing this is not easy because DICOM does not define what an initial view of a study should be.  This is entirely left up to the application designer and requires taking into account specifics of the procedure, capabilities of the acquisition modality, user preferences and overall application design.

Here are some examples of the additional attributes needed beyond what is provided by CFIND/QIDO-RS:
1. Some MRI procedures produce multiple echoes in the same series.  Most users prefer that each of these echoes be displayed as a separate stack
2. Some CT procedures produce multiple phases in the same series (arterial and venous).  Most users prefer that each of these phases be displayed as a separate stack
3. Some procedures will include multiple images in the same series and users will want them displayed independently (not stacked).  Detecting this often requires looking at specific tags
4. Key objects, presentation states and structured reports will often impact which images are initially displayed as well.  Note that in some cases these instances alone can drive the initial image display (e.g. display the Key Objects to a clinician)

In addition to this, the sort criteria for a stack can also vary requiring additional data not returned by CFIND/QIDO-RS.
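
As an example of the first case, here is a minimal Python sketch that splits instances into one stack per series and echo number and sorts each stack by instance number.  It assumes the metadata is in the DICOM JSON form (tag -> {"vr": ..., "Value": [...]}); the tags are the standard ones for SeriesInstanceUID, EchoNumbers, InstanceNumber and SOPInstanceUID, while the helper names and sample data are made up.  Real viewers use far more elaborate rules.

```python
from collections import defaultdict

SOP_INSTANCE_UID = "00080018"
ECHO_NUMBERS = "00180086"
SERIES_INSTANCE_UID = "0020000E"
INSTANCE_NUMBER = "00200013"


def first_value(instance, tag, default=None):
    """Return the first value of a tag, or the default if it is absent or empty."""
    values = instance.get(tag, {}).get("Value") or [default]
    return values[0]


def build_stacks(instances):
    """Group instances by (series, echo number) and sort each stack by instance number."""
    stacks = defaultdict(list)
    for instance in instances:
        key = (first_value(instance, SERIES_INSTANCE_UID),
               first_value(instance, ECHO_NUMBERS))
        stacks[key].append(instance)
    for stack in stacks.values():
        stack.sort(key=lambda item: first_value(item, INSTANCE_NUMBER, 0))
    return dict(stacks)


def make_instance(sop_uid, series_uid, echo, number):
    """Build a tiny hypothetical metadata entry for one SOP Instance."""
    return {
        SOP_INSTANCE_UID: {"vr": "UI", "Value": [sop_uid]},
        SERIES_INSTANCE_UID: {"vr": "UI", "Value": [series_uid]},
        ECHO_NUMBERS: {"vr": "IS", "Value": [echo]},
        INSTANCE_NUMBER: {"vr": "IS", "Value": [number]},
    }


# One MR series containing two echoes, delivered out of order.
instances = [
    make_instance("1.2.3.1.4", "1.2.3.1", 2, 4),
    make_instance("1.2.3.1.1", "1.2.3.1", 1, 1),
    make_instance("1.2.3.1.2", "1.2.3.1", 2, 2),
    make_instance("1.2.3.1.3", "1.2.3.1", 1, 3),
]

stacks = build_stacks(instances)
echo_1 = [first_value(i, SOP_INSTANCE_UID) for i in stacks[("1.2.3.1", 1)]]
echo_2 = [first_value(i, SOP_INSTANCE_UID) for i in stacks[("1.2.3.1", 2)]]
print(echo_1, echo_2)
```

None of this is possible from a CFIND/QIDO-RS response alone because attributes such as the echo number are not among the attributes returned.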

Given that CFIND/QIDO-RS do not provide enough data for a viewer to always select the initial images to display, we are forced to pull the entire study over using CMOVE or WADO-RS Retrieve Study before we can analyze it.  While one of the goals of WADO-RS Retrieve Study is rapid access, it isn't clear how fast the various implementations actually will be (I haven't had a chance to test any real implementations myself yet).  While it might be technically possible to load a large study over a 10 Gb/s network in under a second, this is not easy to do and will likely not be seen in the real world for many years to come.

While prefetching studies in advance using CMOVE or WADO-RS Retrieve Study will work, there is no way to prevent the pixel data from being sent which can limit the number of priors pulled due to limitations of the archive and network.  The pixel data is often over 100x larger than the rest of the elements in each SOP Instance so prefetching is often limited by the raw throughput of the archive software, storage subsystem or network infrastructure.
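
Some back-of-the-envelope arithmetic shows the scale of the difference.  The Python below assumes a hypothetical 1000 image CT study with 512 x 512 pixels at 16 bits each, takes the 100x ratio mentioned above at face value, reads the network figure as 10 gigabits per second, and ignores protocol overhead, disk speed and everything else that makes real systems slower than the wire.

```python
def transfer_seconds(total_bytes, link_bits_per_second):
    """Ideal transfer time: payload size divided by raw link speed."""
    return total_bytes * 8 / link_bits_per_second


images = 1000
bytes_per_image = 512 * 512 * 2          # 512 x 512 pixels, 2 bytes per pixel
pixel_bytes = images * bytes_per_image   # 524,288,000 bytes, about 500 MiB
metadata_bytes = pixel_bytes // 100      # pixel data assumed to be 100x the other elements

for label, bits_per_second in [("10 Gb/s", 10e9), ("1 Gb/s", 1e9), ("100 Mb/s", 100e6)]:
    study = transfer_seconds(pixel_bytes + metadata_bytes, bits_per_second)
    metadata = transfer_seconds(metadata_bytes, bits_per_second)
    print(f"{label}: full study {study:.2f} s, metadata only {metadata:.3f} s")
```

Even on an ideal link the full study only fits in under a second at 10 Gb/s, while the metadata alone fits in well under a second at 100 Mb/s.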

WADO-RS Retrieve Metadata therefore solves a huge problem with respect to integrating third party viewers with an image archive.  It provides rapid access to all of the non pixel data in a single HTTP request.  This provides more data than QIDO-RS (and CFIND) and is faster than WADO-RS Retrieve Study (and CMOVE).