Monday, September 14, 2015

DICOMWeb JSON Mapping Details

Last week I attended the DICOMWeb conference in Philadelphia, PA and had a chance to talk with Rob Horn about a few details of the DICOMWeb JSON mapping:

Encoding of PN attributes


The JSON example mappings in PS 3.18 show PN attribute values encoded as an object with a property 'Alphabetic' rather than a string:

        "00080090": {
            "vr": "PN",
            "Value": [
              {
                "Alphabetic": "^Bob^^Dr."
              }
            ]
        },

I was already familiar with the HL7-like encoding of person names using the ^ character to separate the different components (e.g. family name complex, given name complex, etc).  What I didn't know was that this is the alphabetic "component group" of a name and that there are two other component groups - ideographic and phonetic.  These other component groups are used to support languages where a name does not have an alphabetic version.  In P10 encoding, each of these component groups is separated by an '=' sign.  Here is an example from PS 3.5 Annex H:

  • Yamada^Tarou=山田^太郎=やまだ^たろう

When the XML mapping was defined, it was decided to give each component group a name rather than keep it as a single value separated by '=' signs.  This is described in PS 3.19 Table A.1.5-2.  DICOMWeb implementors should take note of this as PN is handled differently in the XML and JSON mappings than it is in DICOM P10 encoding.  I know of one DICOMWeb implementation that is encoding PN as a string rather than an object with the different component groups.
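To make the mapping concrete, here is a quick sketch (the function name is mine) of splitting a P10 PN value into the JSON object form, using the component group names from the standard:

```javascript
// Sketch: split a P10 PN value into the DICOM JSON object form.
// '=' separates the component groups; empty groups are simply omitted.
var COMPONENT_GROUPS = ['Alphabetic', 'Ideographic', 'Phonetic'];

function pnToJson(p10Value) {
  var result = {};
  p10Value.split('=').forEach(function (group, index) {
    if (group.length) {
      result[COMPONENT_GROUPS[index]] = group;
    }
  });
  return result;
}

// e.g. pnToJson('Yamada^Tarou=山田^太郎=やまだ^たろう') yields an object
// with Alphabetic, Ideographic and Phonetic properties
```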

Optionality of Value property


One thing the JSON example mappings in PS 3.18 do not show is that the Value property is optional.  The specific rule for this can be found in PS3.19 Table A.1.5-2:

A Value from the Value Field of the DICOM Data Element. There is one Infoset Value element for each DICOM Value or Sequence Item.
Required if the DICOM Data Element represented is not zero length and an Item, PersonName, InlineBinary or BulkData XML element is not present. Shall not be used if the VR of the enclosing Attribute is either SQ or PN.

What this means is that the Value property will not always be present and this needs to be taken into account when parsing the JSON response (and when creating a DICOM JSON document!).  I know of one DICOMWeb implementation that omits the Value property for PN attributes and two other implementations that send a zero length value as an empty string rather than omitting the Value property.
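A defensive accessor along these lines (the helper name is mine, not from the standard) avoids assuming Value is present:

```javascript
// Sketch: read an attribute's values from a parsed DICOM JSON dataset,
// tolerating both a missing attribute and a missing Value property
// (i.e. a zero length attribute).
function getValues(dataset, tag) {
  var attribute = dataset[tag];
  return (attribute && attribute.Value) ? attribute.Value : [];
}
```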

Tuesday, August 4, 2015

Parsing the DICOM Standard with Javascript in the browser

This morning I decided to enhance the DICOM Dump with Data Dictionary example from the dicomParser library to include the human friendly name for UIDs.  Doing this would require a lookup table of UIDs to names.  I didn't have such a lookup table so I would have to borrow it from somewhere else or make it myself.  I knew that the DICOM standard is published in a variety of electronic formats, some of which are intended to be easy to parse for exactly this purpose, but I had never tried parsing them before.  I figured I would give it a shot and see how it goes.  I started out by checking out David Clunie's DICOM Status page.  I noticed that there were several formats - PDF, HTML, CHTML, Word, ODT and XML.  My heart immediately sinks as I realize that I am going to have to parse XML.  Then an idea comes to me - why don't I just use the Javascript console embedded in the web browser to extract the data I want from the HTML?  I open the HTML page for PS 3.6 and use the "inspect element" feature to look at the structure.  I notice that the table I want is marked with an ID, which means I can probably build a selector to find the tbody I want.  A few tries later and I come up with the following selector:

$('#table_A-1 ~ div tbody')

Next up is to write some javascript to iterate over each tr in the tbody and write out each UID and name as a Javascript string that I can paste into my file.  A bit of trial and error later and I come up with the following:

(function () {
  var elements = document.querySelectorAll('#table_A-1 ~ div tbody tr');
  var result = "";
  for (var i = 0; i < elements.length; i++) {
    result += "'" + elements[i].childNodes[1].childNodes[1].innerText + "':'" +
      elements[i].childNodes[3].childNodes[1].innerText + "',\n";
  }
  return result;
})();

Which generates exactly what I want!  I paste the resulting string into a new file and try it out - but it's not working.  For some reason, the lookup on UID is not matching.  I look a bit closer and notice that the values in the HTML have some non-printable characters in them:

1.2.840.10008.5.1.4.1.&#8203;1.&#8203;2

I make another change to my javascript to strip out non-printable characters:

(function () {
  var elements = document.querySelectorAll('#table_A-1 ~ div tbody tr');
  var result = "";
  for (var i = 0; i < elements.length; i++) {
    result += "'" + elements[i].childNodes[1].childNodes[1].innerText.replace(/[^\x20-\x7E]+/g, '') + "':'" +
      elements[i].childNodes[3].childNodes[1].innerText.replace(/[^\x20-\x7E]+/g, '') + "',\n";
  }
  return result;
})();

And now I have the data I want!  Here is a link to the resulting javascript.  Pretty cool little hack demonstrating the power of what you can do with Javascript in a web browser.  This same strategy can be used to quickly extract data from any web page into any format you want.

Wednesday, April 29, 2015

HL7 FHIR DSTU 2 ImagingObjectSelection Resource

While working on the SIIM 2015 Hackathon Grand Challenge, I ran into an issue with how to represent different views of a study using the ImagingStudy resource.  What I wanted to do was create an ImagingStudy for all images in a study and another ImagingStudy for just the key images in that study.  The idea behind key images is to show a subset of images in the study - specifically those that are referenced in the diagnostic report.  Some DICOM studies can be quite large - hundreds or thousands of images (e.g. thin slice CT, fMRI, perfusion, etc) - and displaying all of the images can be overwhelming for most non-radiologists.  Key images are therefore a critically important output of an interpretation by a radiologist to make imaging meaningful to most users outside of radiology.  It is also a key part of making the "multi-media report" work for the grand challenge.

The documentation for the ImagingStudy resource calls out the key image use case so my expectation is reasonable.  The data model also supports multiple ImagingStudy resources per DiagnosticReport so it is possible to have multiple views (e.g. full study, key images, QA images, etc) - but how would a viewer determine which one to display?  There is no code that can be used to determine what type of ImagingStudy it is, so figuring out which ImagingStudy to initially display would require some convoluted logic that would probably be unreliable.  My heart broke a bit as I encountered my first real world limitation of FHIR.  Fortunately we have control over the datasets we are using in the hackathon so I have the ability to "make it work" through a variety of means.  I decided I would reach out to the FHIR gurus and see what they had to say.


I quickly learned that this was the wrong place for my question and that I should check with the Imaging Integration working group.  I also learned about HL7 FHIR DSTU 2 - the next version of HL7 FHIR that is currently being worked on - which includes a new resource named ImagingObjectSelection.  It seems that the new version of ImagingStudy is a manifest of the entire study and all subsets are moved into the ImagingObjectSelection.  The ImagingObjectSelection has a title property which is a CodeableConcept of type KOStitle.  KOStitle has around 20 different terms that can be used to give meaning to a given set of images.

One thing I can't figure out from the documentation is how the ImagingObjectSelection is related to the DiagnosticReport and ImagingStudy.  It is still a work in progress so perhaps it hasn't been decided yet or just hasn't been documented.  Regardless, my faith in FHIR has been restored as the folks behind it are definitely on the right track!


Tuesday, April 28, 2015

SIIM 2015 Hackathon Grand Challenge

Yesterday I spent about eight hours building out the foundation for the SIIM 2015 Hackathon Grand Challenge.  This foundation is a simple web based, radiology focused EMR client that includes functionality such as patient search and the display of a patient's radiology reports with images.  By providing such a foundation, the hope is that hackers will have a baseline system to start hacking from rather than starting from scratch.

The SIIM Hackathon committee had already deployed a Spark FHIR server and DCM4CHEE v4 server into AWS and loaded both with synchronized data sets.  A special thanks to Mark Kohli, Steve Langer, and Jason Hostettler for their work on the datasets and Mohannad Hussain for setting up the DCM4CHEE server.  I also contributed the ImagingStudy resources to the FHIR dataset which were generated by converting the response from a WADO-RS Retrieve Metadata call using this tool.

I wanted to keep the learning curve for the baseline system low so hackers could get started quickly.  To support this, I decided to keep the third party dependencies to a minimum.  I ended up using only jQuery and Bootstrap since most web developers are familiar with both.  I ruled out other popular (and powerful) libraries such as Meteor (my favorite), Angular and KnockoutJS as these take some time to learn and not everyone knows them.

My first goal was to create an architecture spike to prove out that it would work and also lay down the foundational pieces that I could build on top of.  I decided to begin with displaying a list of patients that I would obtain by querying the FHIR server.  This turned out to be very easy and took about 30 minutes to get up and going.  It basically involved making a query to the Patient resource filtered by the last name of the patients in our datasets and creating rows in a table from the results:

http://fhir.hackathon.siim.org/fhir/Patient?family=SIIM
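The search returns a FHIR Bundle, so building the table amounts to mapping Bundle entries to rows.  A rough sketch (the function name is mine; property access follows the FHIR Patient structure, where family and given names are arrays of strings; error handling omitted):

```javascript
// Sketch: convert a FHIR Patient search Bundle into the data needed
// for table rows.
function patientRows(bundle) {
  return (bundle.entry || []).map(function (entry) {
    var patient = entry.resource;
    var name = patient.name && patient.name[0];
    return {
      id: patient.id,
      // family and given are arrays of strings in a FHIR HumanName
      name: name ? name.family.join(' ') + ', ' + name.given.join(' ') : ''
    };
  });
}
```

The jQuery side is then just $.getJSON(url, function (bundle) { ... }) followed by appending a table row per result.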

Next up was creating a radiology centric patient view.  The first thing I needed was to display the available reports for the user.  This required querying the DiagnosticReport resource filtered by the ID for the selected Patient and creating rows in a table from the results:

http://fhir.hackathon.siim.org/fhir/DiagnosticReport?subject=Patient/siimjoe

This was up and running after another 30 minutes.  After this, I needed to display the actual report the user clicked on in the list of reports.  I hooked the click event on the table row and made a request for the associated DiagnosticReport by ID:

http://fhir.hackathon.siim.org/fhir/DiagnosticReport/2257132503242682

I grabbed the human readable form of the report from the data.text.div property.  This is normally HTML so I used the jQuery parseHTML() function to parse it into DOM nodes that I could stick directly into the DOM.  This took yet another 30 minutes.

The last step was to display the images for the report.  This required searching for ImagingStudy resources based on the accession number stored in the selected DiagnosticReport resource:

http://fhir.hackathon.siim.org/fhir/ImagingStudy?accession=2257132503242682

There may actually be multiple ImagingStudy resources for a given DiagnosticReport so some logic was required to pick the right one.  I decided to pick the one with the fewest referenced SOP Instances, assuming that the smallest one would contain the key images (rather than the entire study).  Once I had the ImagingStudy, I decided to pick the first image in the first series and display it using cornerstone.  I haven't built a WADO-RS based ImageLoader for cornerstone yet so I decided to use cornerstoneWADOImageLoader and load images via WADO-URI.  30 minutes later, I had an image displayed!
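That heuristic is small enough to sketch out (the function name is mine; numberOfInstances is the ImagingStudy element that holds the count of referenced SOP Instances):

```javascript
// Sketch: given the ImagingStudy resources matching an accession number,
// pick the one referencing the fewest SOP Instances - assumed here to be
// the key image selection rather than the full study.
function pickKeyImageStudy(studies) {
  return studies.reduce(function (best, study) {
    return study.numberOfInstances < best.numberOfInstances ? study : best;
  });
}
```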

The spike took about 2 1/2 hours to complete and not only proved the concept but provided a great foundation to work from.  I spent the rest of the day refactoring the code for readability and adding more functionality. If you are really interested in seeing how this was built, check out the commit log.

Wednesday, April 22, 2015

QIDO-RS Capability Breakdown

Recently I created a github repository to track the available DICOMWeb implementations and their capabilities.  While a vendor may claim to support a given DICOMWeb API, the devil really is in the details.  These details are supposed to be described in the conformance statement but these are usually incomplete (or wrong).  This post attempts to break down QIDO-RS at a deeper level so we can more accurately assess how complete and compliant a given implementation is:


Feature: Description
application/dicom+xml: Can return responses with Content-Type: multipart/related; type="application/dicom+xml"
application/json: Can return responses with Content-Type: application/json
gzip: Can return responses with Content-Encoding: gzip. Note that this is not mentioned in the standard and is therefore not required (but I feel it should be supported and am therefore including it)
no-cache: Supports the Cache-Control: no-cache header, which ensures the result is current and not cached on the client side. Note that this is not required by the standard (but I feel it should be supported and am therefore including it)
group/element: Supports specifying DICOM elements by group/element (e.g. 0020000D)
keyword: Supports specifying DICOM elements by DICOM keyword (e.g. StudyInstanceUID)
sequences: Supports specifying sequence elements (e.g. RequestAttributeSequence.RequestedProcedureID)
limit: Supports the limit option for queries (limits the number of records returned)
offset: Supports the offset option for queries (skips records in the response to support paging)
TimezoneOffsetFromUTC: Supports specification of the timezone as part of date/time queries. Note that this is not required by the standard
QIDO-RS Studies: Supports searching for studies via the /studies endpoint
QIDO-RS Studies response: /studies response includes all required attributes
StudyDate - single: Supports searching for studies by study date by single value (exact) match
StudyDate - range: Supports searching for studies by study date range
StudyTime - single: Supports searching for studies by study time by single value (exact) match
StudyTime - range: Supports searching for studies by study time range
AccessionNumber - single: Supports searching for studies by accession number by single value match (e.g. 1233456)
AccessionNumber - wildcard: Supports searching for studies by accession number by wildcard (e.g. 1234*)
ModalitiesInStudy - single: Supports searching for studies by modalities in study by single value match (e.g. CT)
ModalitiesInStudy - list: Supports searching for studies by modalities in study by list of values (e.g. CT,MR,US)
ReferringPhysicianName - single: Supports searching for studies by ReferringPhysicianName by single value match
ReferringPhysicianName - wildcard: Supports searching for studies by ReferringPhysicianName by wildcard
PatientName - single: Supports searching for studies by PatientName by single value match
PatientName - wildcard: Supports searching for studies by PatientName by wildcard
PatientID - single: Supports searching for studies by PatientID by single value match
PatientID - wildcard: Supports searching for studies by PatientID by wildcard
PatientID - list: Supports searching for studies by PatientID by list of values
StudyInstanceUID - single: Supports searching for studies by StudyInstanceUID by single value match
StudyInstanceUID - list: Supports searching for studies by StudyInstanceUID by list of values
StudyID - single: Supports searching for studies by StudyID by single value match
StudyID - list: Supports searching for studies by StudyID by list of values
StudyID - wildcard: Supports searching for studies by StudyID by wildcard match
QIDO-RS Study Series: Supports searching for series via the /studies/{StudyInstanceUID}/series endpoint
QIDO-RS Series: Supports searching for series via the /series endpoint
QIDO-RS Series response: Search for series response includes all required attributes
Modality - single: Supports searching for series by Modality by single value match
Modality - list: Supports searching for series by Modality by list of values
SeriesInstanceUID - single: Supports searching for series by SeriesInstanceUID by single value match
SeriesInstanceUID - list: Supports searching for series by SeriesInstanceUID by list of values
SeriesNumber - single: Supports searching for series by SeriesNumber by single value match
SeriesNumber - range: Supports searching for series by SeriesNumber by matching a range of values
SeriesNumber - list: Supports searching for series by SeriesNumber by matching a list of values
PerformedProcedureStepStartDate - single: Supports searching for series by PerformedProcedureStepStartDate by single value match
PerformedProcedureStepStartDate - range: Supports searching for series by PerformedProcedureStepStartDate by matching a range of values
PerformedProcedureStepStartTime - single: Supports searching for series by PerformedProcedureStepStartTime by single value match
PerformedProcedureStepStartTime - range: Supports searching for series by PerformedProcedureStepStartTime by matching a range of values
ScheduledProcedureStepID - single: Supports searching for series by RequestAttributeSequence.ScheduledProcedureStepID by single value match
RequestedProcedureID - single: Supports searching for series by RequestAttributeSequence.RequestedProcedureID by single value match
QIDO-RS Series Instances: Supports searching for instances in a series via the /studies/{StudyInstanceUID}/series/{SeriesInstanceUID}/instances endpoint
QIDO-RS Study Instances: Supports searching for instances in a study via the /studies/{StudyInstanceUID}/instances endpoint
QIDO-RS Instances: Supports searching for instances via the /instances endpoint
QIDO-RS Instance response: Search for instances response includes all required attributes
SOPClassUID - single: Supports searching for instances by SOPClassUID by single value match
SOPClassUID - list: Supports searching for instances by SOPClassUID by list of values
SOPInstanceUID - single: Supports searching for instances by SOPInstanceUID by single value match
SOPInstanceUID - list: Supports searching for instances by SOPInstanceUID by list of values
InstanceNumber - single: Supports searching for instances by InstanceNumber by single value match
InstanceNumber - range: Supports searching for instances by InstanceNumber by range of values
InstanceNumber - list: Supports searching for instances by InstanceNumber by list of values


Vendors can go beyond this list of features by adding support for additional matching keys, including additional fields in the response, fuzzy matching and relational queries.  I didn't include these features as they are not required by the standard, not required by most use cases and rarely implemented.
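To make the matching-key rows above concrete, here is a sketch of how a client might build these queries (the helper name and base URL are made up for illustration; note that encodeURIComponent leaves the * wildcard intact):

```javascript
// Sketch: build a QIDO-RS search URL from a map of matching keys.
function qidoUrl(base, resource, params) {
  var query = Object.keys(params).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');
  return base + '/' + resource + '?' + query;
}

// e.g. a wildcard PatientName match plus a StudyDate range, limited to 25 results
qidoUrl('https://server/dicomweb', 'studies', {
  PatientName: 'SMITH*',
  StudyDate: '20150101-20151231',
  limit: 25
});
```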

What are your thoughts on this list?  Should something be added or removed?

Wednesday, April 15, 2015

Making DICOMWeb Fast

Over the years I have heard many people complain about slow DICOM performance.  In many cases these performance issues aren't with the standard itself but rather with the vendor implementation or with how the system was deployed (e.g. network, storage array, VMs, etc).  Medical images are large and tend to stress systems in ways that most architects, engineers and IT staff haven't had to deal with before.  It really takes a top notch engineering team to build a medical imaging system that is reliable, scalable and performant.  It also takes a top notch IT team to deploy and implement such a system.

The introduction of DICOMWeb does not magically fix these performance issues - in fact it may make them worse if vendors choose to bolt DICOMWeb functionality on top of their existing architecture rather than make the proper investment in reworking the software and system architecture.  Let's take a look at some of the new DICOMWeb APIs and see what the issues might be:

QIDO-RS
The first thing to realize is that QIDO-RS was specifically designed to be a layer on top of CFIND.  The idea was that vendors had invested in their CFIND implementations and would prefer to build a simple REST wrapper around them rather than create a whole new subsystem.  It is therefore entirely possible to create a QIDO-RS to CFIND adapter which would allow older PACS archives that only support CFIND to be accessible via QIDO-RS.  In fact, I would expect such products to come on the market later this year.  Such a product would be great for interoperability, but that doesn't necessarily mean it will be performant.  Since QIDO-RS delivers no new functionality over CFIND (at least from a performance point of view), we can consider them roughly the same.

This of course means that any performance issues with an archive's CFIND implementation will likely apply to its QIDO-RS implementation as well.  When moving to QIDO-RS (or expanding use of CFIND), you will need to take normal scaling and usage issues into account.  If your current CFIND usage is 1 query a second today, but it increases to 60 queries a second because more clients start accessing the archive via QIDO-RS or CFIND - you may have a problem.

If you are interested in how one might implement QIDO-RS, feel free to take a look at the .NET prototype I did here.

STOW-RS
STOW-RS is fairly straightforward - send a DICOM Instance to the archive and save it.  Storing inbound DICOM is a function that most archives are good at, so there shouldn't be too much concern about adding STOW-RS to existing systems.  There may be some systems engineering issues with scaling inbound HTTP requests, but it will likely take several years before STOW-RS gets enough usage for this to be an issue.  All inbound DICOM is done via DIMSE right now and it will continue that way for many years to come (modalities have 10+ year life cycles and many existing systems will never get STOW-RS options added to them).  In the short term, STOW-RS will most likely be used for two things:
1. Adding value-add SOP Instances to an existing study such as KO, PR and SR
2. Importing visible light images from mobile devices

Both of these use cases are expected to produce much less data than a normal image load, so they should not have a major impact on system performance.  One thing to keep in mind is that sending large quantities of data over HTTP does require a bit of systems engineering work.  There are specific aspects of HTTP and TCP that still need to be taken into account, such as HTTP chunking, TCP window size and the use of compression (gzip/deflate).  Fortunately most IT networking folks are very much at home with these issues and should be able to make the necessary system configuration changes to make it performant.

WADO-RS
WADO-RS consists of several APIs, each of which has a different level of complexity when it comes to implementation and performance.  Of these APIs, the two most important for viewers are Retrieve Metadata and Retrieve Frames.  Unfortunately I haven't had access to an implementation of either yet, so what I describe here may not be 100% correct.  That being said, both of these features are heavily based on the work done on the MINT project, which I was very involved in.

WADO-RS Retrieve Metadata
When it comes to viewing a study, the first call the viewer will make is to Retrieve Metadata to determine what SOP Instances are actually in the study. This call is therefore extremely time critical as the viewer won't be able to start loading any actual pixel data until the Retrieve Metadata call has completed and it has analyzed it. Once the Retrieve Metadata response has been analyzed, the viewer can determine what images it needs to load and fetch them using the Retrieve Frames call.

The best possible performance for Retrieve Metadata is when the response is already prepared and stored on disk.  In this case, the archive has very little to do other than read a file from disk and return it to the client.  Web servers are very good at just that so the HTTP handler implementation could be as simple as looking up the file for a given study (probably in a database), opening the file and then giving the file handle to the web server to stream to the client.  Ideally the result would already be compressed on disk using a standard HTTP compression algorithm such as gzip so it could be immediately streamed without having to compress on the fly (or worse - returning it to the client uncompressed).
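As a sketch, the lookup side of such a handler could be as simple as mapping the request URL to a pre-built, pre-gzipped file on disk (the URL pattern follows WADO-RS Retrieve Metadata, but the cache layout and function name are made up):

```javascript
// Sketch: map a WADO-RS Retrieve Metadata request path to a pre-built,
// pre-gzipped response file on disk. Returns null for non-metadata paths.
function metadataCachePath(urlPath) {
  var match = urlPath.match(/^\/studies\/([0-9.]+)\/metadata$/);
  return match ? '/cache/' + match[1] + '.json.gz' : null;
}
```

A Node handler would then stream that file with Content-Encoding: gzip (e.g. fs.createReadStream(file).pipe(res)), leaving the heavy lifting to the web server.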

Preparing the Retrieve Metadata response can be a bit tricky, especially for archives with lots of data already stored.  To generate the response for a study, each SOP Instance must be opened, parsed and added to a data structure that will be used to generate the actual Retrieve Metadata response.  Iterating through TB (or PB) of data to do this conversion could take months if not years depending upon the storage system and available processing power.  In this case, it might make sense to convert "on the fly" when a study is requested, and better yet to convert all priors when a new study arrives.  A related issue is keeping the response up to date when the study changes due to SOP Instances being added, modified or deleted.  In this case, the archive will need to detect these changes and trigger a rebuild of the Retrieve Metadata response.  This may not sound that hard to do, but it gets tricky when you start dealing with all the failure scenarios that an ACID-compliant SQL database takes care of for you (this assumes the Retrieve Metadata response is stored on the file system, not in the DB).

WADO-RS Retrieve Frames
WADO-RS Retrieve Frames allows a client to request the pixel data for one or more image frames from the archive, and to request that the frames be sent in a specific transfer syntax.  Servers are expected to support conversion from one transfer syntax to another (also known as transcoding).  WADO-URI also supports transcoding, but I have worked with many implementations that did not support it at all.  Transcoding is CPU intensive and often requires licensing expensive image compression libraries to support the various compression algorithms that DICOM supports (specifically JPEG2000 and JPEG-LS).  Transcoding is one of the bigger performance challenges in DICOM as the client may not support (or know) the transfer syntax the archive stored the SOP Instance in and may require the server to transcode it so it can be displayed.  It does seem that many VNA implementations are configured to archive using the JPEG2000 transfer syntax, so clients should plan on supporting JPEG2000 directly if performance is important.
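The transfer syntax request rides in the Accept header.  A small helper (made up for illustration) shows the shape of that header:

```javascript
// Sketch: build the Accept header for a WADO-RS Retrieve Frames request,
// asking the server to return frames in the given transfer syntax.
function framesAcceptHeader(transferSyntaxUid) {
  return 'multipart/related; type="application/octet-stream"; transfer-syntax=' +
    transferSyntaxUid;
}

// e.g. request Explicit VR Little Endian (1.2.840.10008.1.2.1)
framesAcceptHeader('1.2.840.10008.1.2.1');
```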

Conclusion
This is just a taste of the performance issues related to DICOMWeb.  WADO-RS Retrieve Metadata may very well be the most complex and performance critical API that archives will have to implement.  If you are looking to run a viewer from another vendor off of your DICOMWeb archive, make sure the archive supports WADO-RS Retrieve Metadata and does so in a performant way.  You also want to look at QIDO-RS and make sure it can scale.  The vendor's CFIND performance will be a good leading indicator of how well QIDO-RS will do.

Sunday, April 12, 2015

The importance of WADO-RS Retrieve Metadata

One of the most exciting aspects of WADO-RS is the Retrieve Metadata call.  Retrieve Metadata enables access to all non-pixel elements of all SOP Instances in a study with a single HTTP request.  This capability doesn't exist in the DIMSE services, forcing applications to either design around what is available via CFIND/QIDO-RS (a small subset of tags) or prefetch the entire study in advance via CMOVE/WADO-RS Retrieve Study (which includes pixel data).

Designing around CFIND/QIDO-RS is a huge limitation as there are many elements not returned by CFIND that viewers require to properly display the right initial images to a user.  For diagnostic use cases, it is really important to display the right initial images to the radiologist as quickly as possible (ideally less than one second from opening the study).  Doing this is not easy because DICOM does not define what the initial view of a study should be.  This is entirely left up to the application designer and requires taking into account specifics of the procedure, capabilities of the acquisition modality, user preferences and overall application design.

Here are some examples of the additional attributes needed beyond what is provided by CFIND/QIDO-RS:
1. Some MRI procedures produce multiple echoes in the same series.  Most users prefer that each of these echoes be displayed as a separate stack
2. Some CT procedures produce multiple phases in the same series (arterial and venous).  Most users prefer that each of these phases be displayed as a separate stack
3. Some procedures will include multiple images in the same series and users will want them displayed independently (not stacked).  Detecting this often requires looking at specific tags
4. Key objects, presentation states and structured reports will often impact which images are initially displayed as well.  Note that in some cases these instances alone can drive the initial image display (e.g. display the Key Objects to a clinician)

In addition to this, the sort criteria for a stack can also vary requiring additional data not returned by CFIND/QIDO-RS.

Given that CFIND/QIDO-RS do not provide enough data for a viewer to always select the initial images to display, we are forced to pull the entire study using CMOVE or WADO-RS Retrieve Study before we can analyze it.  While one of the goals of WADO-RS Retrieve Study is rapid access, it isn't clear how fast the various implementations actually are (I haven't had a chance to test any real implementations myself yet).  While it might be technically possible to load a large study over a 10 Gb/s network in under a second, this is not easy to do and will likely not be seen in the real world for many years to come.

While prefetching studies in advance using CMOVE or WADO-RS Retrieve Study will work, there is no way to prevent the pixel data from being sent which can limit the number of priors pulled due to limitations of the archive and network.  The pixel data is often over 100x larger than the rest of the elements in each SOP Instance so prefetching is often limited by the raw throughput of the archive software, storage subsystem or network infrastructure.

WADO-RS Retrieve Metadata therefore solves a huge problem with respect to integrating third party viewers with an image archive.  It provides rapid access to all of the non-pixel data in a single HTTP request.  This provides more data than QIDO-RS (and CFIND) and is faster than WADO-RS Retrieve Study (and CMOVE).




Sunday, March 29, 2015

FHIR Identifiers Revisited

I want to revisit the HL7 FHIR Identifier as I have discovered a few new things.  The Identifier is a complex object that consists of several properties, all of which are optional:

<[name] xmlns="http://hl7.org/fhir">
 <!-- from Element: extension -->
 <use value="[code]"/><!-- 0..1 usual | official | temp | secondary (If known) -->
 <label value="[string]"/><!-- 0..1 Description of identifier -->
 <system value="[uri]"/><!-- 0..1 The namespace for the identifier -->
 <value value="[string]"/><!-- 0..1 The value that is unique -->
 <period><!-- 0..1 Period Time period when id is/was valid for use --></period>
 <assigner><!-- 0..1 Resource(Organization) Organization that issued id (may be just text) --></assigner>
</[name]>

Here is its definition:
"
A numeric or alphanumeric string that is associated with a single object or entity within a given system. Typically, identifiers are used to connect content in resources to external content available in other frameworks or protocols. Identifiers are associated with objects, and may be changed or retired due to human or system process and errors.
"

Here are some of the key identifiers used in Radiology informatics:

MRN or PatientID: Identifies a specific patient
Accession # or Filler Order Number: Identifies a specific radiology procedure or exam
Procedure Code: Identifies a type of procedure or exam (e.g. CT CHEST W/WO)
DICOM Study Instance UID: Identifies a group of related DICOM SOP Instances (or images), usually for a single radiology procedure
DICOM SOP Instance UID: Identifies a single DICOM SOP Instance (or image)


The most important property of the identifier is value, which actually holds the identifier itself.

The next most important property is system, which provides a scope or namespace for the identifier.  Given the different types of identifiers and systems that generate them, conflicts are sure to exist, and the system provides a way of describing the context or scope of the identifier.  For example, you could have two RIS systems - RIS A and RIS B.  Both RIS systems were implemented independently of each other and started generating accession numbers starting at 1000.  Given that both systems are using the same accession numbers to identify different radiology procedures, the system must be used to make them unique when used together.  The documentation for system is:
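To make the RIS A / RIS B scenario concrete, here is a sketch of two accession number identifiers that share the same value but remain distinguishable because each carries its own system (the OIDs shown are made up for illustration):

```javascript
// Two identifiers with the same value, scoped by different (hypothetical) systems
var accessionFromRisA = {
  use: 'usual',
  system: 'urn:oid:1.2.3.4.1',   // example OID registered for RIS A
  value: '1000'
};
var accessionFromRisB = {
  use: 'usual',
  system: 'urn:oid:1.2.3.4.2',   // example OID registered for RIS B
  value: '1000'
};

// identifiers only match when BOTH system and value match
function sameIdentifier(a, b) {
  return a.system === b.system && a.value === b.value;
}
console.log(sameIdentifier(accessionFromRisA, accessionFromRisB)); // false
```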

"
The namespace for the identifier
"

Here is more information about system from the documentation on identifier:
"
The system referred to by means of a URI defines how the identifier is defined (i.e. how the value is made unique). It might be a specific application or a recognized standard/specification for a set of identifiers or a way of making identifiers unique. The value SHALL be unique within the defined system and have a consistent meaning wherever it appears. Both system and value are always case sensitive.
FHIR defines some useful URIs directly. OIDs (urn:oid:) and UUIDs (urn:uuid:) may be registered in the HL7 OID registry and should be if the content is shared or exchanged across institutional boundaries. If the identifier itself is naturally globally unique (e.g. an OID, a UUID, or a URI with no trailing local part), then the system SHALL be "urn:ietf:rfc:3986", and the URI is in the value.
"
The documentation refers to some useful URIs for various standardized codes (e.g. SNOMED, LOINC, RadLex, HL7 v2, HL7 v3).  Given the identifiers listed above, it seems that using standardized codes for the procedure code makes a lot of sense.  This is not the case for the other types though, which would be generated by information systems such as the EMR, HIS, RIS, Modality or PACS.
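As a sketch of that distinction (the OID is made up, and the LOINC code is only illustrative), a procedure code identifier points at a standardized code system, while an accession number points at a system owned by the issuing information system:

```javascript
// A coded concept points at a standardized code system...
var procedureCode = {
  system: 'http://loinc.org',    // standardized code system URI
  value: '24627-2'               // a LOINC code, shown for illustration
};

// ...while a locally generated identifier points at a system owned
// by the information system that issued it (hypothetical OID)
var accessionNumber = {
  use: 'usual',
  system: 'urn:oid:1.2.3.4.5',
  value: '3000'
};

// rough heuristic for this sketch: locally registered systems are often OID URNs
function isOidSystem(identifier) {
  return identifier.system.indexOf('urn:oid:') === 0;
}
console.log(isOidSystem(procedureCode), isOidSystem(accessionNumber));
```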

It is interesting to note that the concept of unique identities is not new to DICOM.  DICOM adopted the ISO UID scheme for identifying types of DICOM Instances (i.e. SOP Class UID) as well as instance data (Study, Series, Instance, etc).  Equipment that generates DICOM SOP Instances (e.g. Modality, PACS, etc) is responsible for making sure the generated UIDs are unique.  This is typically done by registering a unique root UID along with an algorithm to generate the numbers following the root.  The algorithm is not standardized and I have seen many systems incorrectly implement this, or installations incorrectly configured, which resulted in UIDs that are not globally unique.

The DICOM UID root is 1.2.840.10008, the numbers following that are used to identify specific things within that root.  For example, the "CT Image Storage" SOP Class UID is 1.2.840.10008.5.1.4.1.1.2 and the UID used for Implicit Little Endian Transfer syntax is 1.2.840.10008.1.2.
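For illustration, here is a sketch of the kind of (non-standardized) algorithm such equipment might use, appending a timestamp and a process-local counter to a registered root (the root below is made up; a real implementation must guarantee uniqueness across devices, processes and restarts):

```javascript
// Sketch of UID generation under a registered root (root shown is hypothetical)
var uidCounter = 0;
function generateUid(root) {
  uidCounter++;
  // DICOM UIDs are dot-separated numeric components, at most 64 characters total
  return root + '.' + Date.now() + '.' + uidCounter;
}

console.log(generateUid('1.2.3.4'));  // e.g. 1.2.3.4.1442246400000.1
```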

You will notice the term "OID" used in the above documentation.  This is the same concept as the UIDs used in DICOM, with a different name and perhaps a different issuing authority.  OIDs seem to be the direction forward for HL7 FHIR, and each system that generates identifiers will need to register an OID with HL7 for $100.

The next property of interest is use, which can have any of the following values:

usual: The identifier recommended for display and use in real-world interactions.
official: The identifier considered to be most trusted for the identification of this item.
temp: A temporary identifier.
secondary: An identifier that was assigned in secondary use - it serves to identify the object in a relative context, but cannot be consistently assigned to the same object again in a different context.

I can imagine temp being used to identify unknown patients (e.g. unconscious patient from motor vehicle accident in the ER without any ID).  I suppose secondary could be used for alternative identifiers (e.g. RIS specific patient identifier which is different than patient identifier in the enterprise master patient index).  Official could be used for identifiers that are standardized and therefore the same for all systems in the world.  Some examples include codes (LOINC, HL7 Codes, DICOM SOP Class UIDs, etc).  Usual could be used for identifiers for specific things (patients, visits, exams, studies, images).  Given the list of radiology identifiers above, all of them should probably be "usual" uses except for the procedure code.

The label property might be useful to provide a human readable string for standardized codes.  Standardized codes are usually intended for understanding by computers and not humans.  For example, the DICOM SOP Class UID for storing CT Images is "1.2.840.10008.5.1.4.1.1.2".  Very few people in the world would automatically recognize that number as a CT Image.  For everyone else, it is far more useful to display "CT Image" to the user instead of the number.  This is where the label comes in.  For the radiology identifiers listed above, the best candidate for including a label property is the procedure code.

The last two properties are period and assigner.  The idea behind period is that you can further constrain an identifier through a time period.  I don't have any real world use cases for period so I am not sure when it might show up.  The assigner is just a reference to an Organization resource that assigns/generates the identifiers.  This is basically a REST version of the system concept.

Moving forward with the radiology report repository spike will require that I create identifiers and I should be OK with simply populating the "value" property - although a real implementation should have system populated with a valid OID.

Saturday, March 28, 2015

Happy Birthday Cornerstone!

I just realized that cornerstone had its first birthday a little over one week ago.  The project started based on this discussion on comp.protocols.dicom where folks were disappointed with the lack of open source image viewers.  While there were already two good open source viewers (Oviyam and DICOM Web Viewer), neither of them was architected such that I could use them to build my own web based medical image viewing applications.  I was already convinced that the future of medical imaging was HTML5/JS based image viewers and, being the lazy programmer that I am, I didn't want to build basic image viewing functionality over and over again (I have personally coded a ww/wc algorithm at least 15 different times in various languages).  I had a personal need for a javascript SDK that made it easy to display interactive medical images in a web browser - and that is how cornerstone began.

I must admit that starting cornerstone was not easy to do.  I was starting a new business (Lury) and it seemed a bit crazy to spend my time writing code that I would be giving away for free.  This is especially true because I had figured out a number of tricks to make client side rendering possible while the industry norm was (and still is) server side rendering.  The benefits of client side rendering are compelling enough to provide the differentiation needed to make a new startup company like Lury successful in an already competitive market, and giving this away for free was very hard to do.

On top of giving away some of these secrets, the code I wrote would be on display for everyone to see.  While I believe I write fairly good code, I am not perfect and what is "right" can sometimes be subjective.  Code reviews are actually quite common in closed source projects and it is one of the most vulnerable experiences you go through as a software developer.  What happens is you get in a room with a bunch of other developers and they look at your code with a magnifying glass and give you feedback.  Your "best effort" is on display and the bulk of the discussion is around how you could do better.  In safe environments, these meetings are productive and are highly educational.  The internet is not a safe place though and making it publicly available for everyone to see and criticize required a tremendous amount of courage.

Looking back on the past year, I can say that making cornerstone open source is the best thing I have ever done.  It has brought me tremendous joy to see others using cornerstone.  There are at least 50 projects that I know of using cornerstone today and probably many I don't know about.  Many of these projects are positively impacting patient outcomes and probably would not have been possible without cornerstone.  I have also made many new friends all over the world - some of which have given me an open invitation to stay with them whenever I might visit.

I want to say "thank you" to everyone who has supported the cornerstone project - it would not have been possible without your emails of encouragement, bug reports, bug fixes and new features.  The future is bright for cornerstone and everyone is welcome to be part of this!

Thursday, March 26, 2015

Radiology Report Repository Spike Update #2

Today I made progress on the spike and have it successfully creating a new Patient resource if it doesn't already exist.  The source code for the spike/prototype can be found here.  Here is a screen shot of the spike at work receiving the same message twice:




The first time it receives the message, it doesn't find an existing patient resource so it creates a new one.  The second time it receives the message, it finds the one it previously created so it does not create a new one.  Note that you can issue an HTTP DELETE on the resource id to remove it like this:



A few notes about this:
1) Doing everything in JavaScript is really cool.  I don't think I would be this far along if I had tried to do it all in Mirth Connect.
2) Mapping from v2.x codes to v3/FHIR wasn't exactly straightforward.  v2.x has many codes that are not in v3/FHIR.
3) I should probably bring up my own FHIR server at some point.  Not sure if I should use one of the open source versions or build a simple one myself.


Next up - creating the DiagnosticReport resource.

Wednesday, March 25, 2015

Integrating Meteor client only application with ASP.NET MVC 5

I recently had a project where I wanted to use Meteor but we were unable to utilize the meteor server for our phase 1 deliverable.  Instead we had to integrate with an existing ASP.NET MVC 5 application.  Here is how I did it:

1) Use HTTP instead of DDP for all communication to the server.  Meteor makes client/server communication really easy through publications, DDP and meteor methods.  While there are some DDP libraries that may allow your non meteor server to integrate via DDP, this was not the route we took.  Instead we decided to do all RESTful calls to our server as it had existing REST APIs that we wanted to use.

2) Tell the meteor client code to disconnect from the meteor server.  The meteor client code assumes there is a meteor server and immediately tries to establish a DDP connection to communicate with it.  Not only for normal client/server calls, but also to receive notifications of hot code pushes.  Since we don't have a meteor server in our case, we need to explicitly disconnect like so in my client/main.js:

Meteor.disconnect();

3) Build meteor for deployment.  Building meteor for deployment will generate a single JS and CSS file from all JS/CSS used by the client side in your project.  This means it will concatenate, uglify and minify everything for you.  Create a build output directory "build" as a peer to your meteor project and generate the build output:

meteor build --directory ../build

4) Copy the generated js and css files into your ASP.NET MVC project.  The generated files should be in the build/bundle/programs/web.browser directory.  You should find a scripts folder in your ASP.NET MVC project; I created a sub directory to hold my meteor code.  Note that the meteor build creates a unique filename for the js and css file (presumably based on a file hash, as it does not rename the file if the file contents don't change).  I simply renamed the js and css files from the generated name to a consistent name (e.g. main.js and main.css) so I didn't have to add/remove files to TFS all the time.  I put the files in a bundle to ensure that new versions invalidated older cached versions of the files.

5) Load the meteor js and css files from your ASP.NET CSHTML file and add the meteor runtime config.  The meteor client code depends on a global variable named __meteor_runtime_config__ which the meteor server uses to pass several properties to the client.  Since we don't have a meteor server, you need to set this up and pass it to the client from your ASP.NET MVC app.  Here is our CSHTML that we are using:

@model MeteorViewModel
@{
    ViewBag.Title = "";
    Layout = "";
    ViewBag.RootUrl = string.Format("{0}://{1}{2}", Request.Url.Scheme, Request.Url.Authority, Url.Content("~"));
}

<link rel="stylesheet" type="text/css" class="__meteor-css__" href="@Url.Content("~/Scripts/Meteor/main.css?meteor_css_resource=true")">

<script type="text/javascript">
__meteor_runtime_config__ = {
    "meteorRelease": "METEOR@1.0.3.1",
    "ROOT_URL": "@ViewBag.RootUrl" + "Scripts/Meteor/Public/",
    "ROOT_URL_PATH_PREFIX": "",
    "autoupdateVersion": "af58dd0d8e3b4c9b85ee2fa53553d2502663b530",
    "autoupdateVersionRefreshable": "a6de052a32154229bfac08baf16f2141a6b943e2",
    "autoupdateVersionCordova": "none",
    "Data": "@Model.Data"
};
</script>


@Scripts.Render("~/bundles/Meteor")


A few notes about this:
1) We need to set the ROOT_URL correctly for where we are serving the public folder from
2) I added the "Data" property to pass some data from our ASP.NET MVC app to the meteor app
3) We use ASP.NET MVC's bundle feature with the meteor code to avoid caching issues on the browser

You should be able to follow the above steps to integrate Meteor client only applications with other server side stacks.

Tuesday, March 24, 2015

Radiology Report Repository Spike Update #1

Yesterday I started working on the spike for the Radiology Report Repository.  The first thing I did was create a sample ORU message to work with.  I decided to make the most basic and simple ORU message possible and this is what I came up with:

MSH|^~\&||ABC|||201503231355||ORU^R01|1000|D|2.4|||AL|NE
PID|1|2000^^^^MR|||DUCK^DONALD||19340609|M|||1113 QUACK STREET^^DUCKBURG^CALISOTA^^^^^|||||||
PV1|1|I||EL||||||SURG||||PHY||||IN|||||||||||||||||||||ABC||ADM|||201503231330||||||||
OBR|1||3000|RAD^CARM^RAD C-ARM|||201503231354||||||||||||||||||F|||||||||
OBX|1|TX|L.RADREA2^Reason for Exam:||Report text

The key things in this message are:

Patient Name: DUCK^DONALD
Patient MRN: 2000
Accession Number: 3000
Report Date: March 23, 2015
Report Text: Report Text
Exam Code: CARM

I also need to send this message to my HL7 listener, so I downloaded and installed HL7 Inspector for this.  After installing HL7 Inspector, copy the above test ORU message into your clipboard, then right click the background of HL7 Inspector and choose "Import from Clipboard", click OK on the "Import Options" dialog and you should see the message displayed like this:




Next I need an HL7 listener.  I initially started with Mirth Connect but quickly abandoned it as I realized I would need to write quite a bit of logic and it would be very hard to develop/debug/test that logic using Mirth.  I decided to try building the prototype using Node.js and google helped me find the following code to receive HL7 messages.  I created a new project using WebStorm, pasted that code in there, started up the Node.js process, configured HL7 Inspector to send to port 6020 and sent the message to it:




Success!  Next up is parsing the HL7 message and implementing the following pseudo code logic:

parse the HL7 message
search for Patient resource given the MRN
if Patient resource does not exist
  create Patient resource javascript object
  POST Patient resource javascript object
Create DiagnosticReport javascript object with reference to Patient resource
POST DiagnosticReport javascript object

Given this pseudo code, the first step is to parse the HL7 message.  A google search points me to the L7 library.  A bit of code later and I have parsed the fields I need to create a Patient resource.
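For reference, here is roughly what that parsing amounts to with nothing but string splitting (a simplification, not the L7 library's API; a real parser must honor the encoding characters declared in MSH-2 and handle escapes and repetitions):

```javascript
// Minimal HL7 v2 field extraction by splitting on segment and field separators.
// This ignores escape sequences and repetition - fine for a spike, not production.
function parseOru(message) {
  var segments = {};
  message.split(/\r?\n|\r/).forEach(function (line) {
    segments[line.substring(0, 3)] = line.split('|');
  });
  var nameParts = segments.PID[5].split('^');
  return {
    mrn: segments.PID[2].split('^')[0],      // PID-3 (first component)
    familyName: nameParts[0],                // PID-5 family name
    givenName: nameParts[1],                 // PID-5 given name
    accessionNumber: segments.OBR[3]         // OBR-3 filler order number
  };
}

var parsed = parseOru(
  'MSH|^~\\&||ABC|||201503231355||ORU^R01|1000|D|2.4\r' +
  'PID|1|2000^^^^MR|||DUCK^DONALD||19340609|M\r' +
  'OBR|1||3000|RAD^CARM^RAD C-ARM'
);
console.log(parsed.mrn + ' ' + parsed.accessionNumber); // prints "2000 3000"
```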

Next up we need to start implementing the pseudo code.
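The conditional-create portion of the pseudo code could be fleshed out roughly like this (a sketch: the FHIR base URL is made up, and http stands in for whatever HTTP client gets used, injected here so the flow can be exercised without a server):

```javascript
// Sketch of the conditional-create flow; `http` is an injected client exposing
// get(url) and post(url, body), both returning parsed JSON
function storeReport(http, fhirBase, parsed) {
  // search for a Patient resource by MRN
  var bundle = http.get(fhirBase + '/Patient?identifier=' + parsed.mrn);
  var patient;
  if (bundle.entry && bundle.entry.length > 0) {
    patient = bundle.entry[0].resource;       // reuse the existing Patient
  } else {
    patient = {
      resourceType: 'Patient',
      identifier: [{ value: parsed.mrn }],
      name: [{ family: [parsed.familyName], given: [parsed.givenName] }]
    };
    patient = http.post(fhirBase + '/Patient', patient);
  }
  // create the DiagnosticReport referencing the Patient resource
  var report = {
    resourceType: 'DiagnosticReport',
    identifier: { value: parsed.accessionNumber },
    subject: { reference: 'Patient/' + patient.id }
  };
  return http.post(fhirBase + '/DiagnosticReport', report);
}
```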

Monday, March 23, 2015

HL7 FHIR Identifiers

Before moving on with this spike, I wanted to better understand how HL7 FHIR deals with identifiers since this is fundamental to actually finding my reports given what we have in a DICOM Study.  The DiagnosticReport resource has a property named identifier which is of type identifier. The documentation states that the identifier is:

"
The local ID assigned to the report by the order filler, usually by the Information System of the diagnostic service provider.
"

When it comes to radiology reports, the order filler number is supposed to match what is in the accession number field in the DICOM study, so this is what we need to use to actually find the DiagnosticReport for a given DICOM Study.  Let's take a look at the identifier we sent when we created the DiagnosticReport in the prior blog post:

  "identifier": {
    "use" : "official",
    "system" : "?",
    "value" : "1000"
  }

But we know that an accession number is not globally unique, so we need additional matching criteria.  The patient identifier (or MRN) is a good secondary key, so let's look at that.  The Patient resource has a property named identifier that holds zero or more values of type identifier.  Here is its documentation:

"
An identifier that applies to this person as a patient.
"

And here is the identifier we used for the Patient resource:

  "identifier": [{
    "use" : "usual",
    "label" : "MRN",
    "system" : "urn:oid:0.1.2.3.4.5.6.7",
    "value" : "654321"
  }],

In both cases the "value" is the actual identifier - but what are the "use", "label" and "system" properties for?  Looking at the documentation for the identifier type we learn the following:

Here is the documentation on the use property:

usual: The identifier recommended for display and use in real-world interactions.
official: The identifier considered to be most trusted for the identification of this item.
temp: A temporary identifier.
secondary: An identifier that was assigned in secondary use - it serves to identify the object in a relative context, but cannot be consistently assigned to the same object again in a different context.

In the above examples we used "usual" and "official" but it isn't clear when (or if) I should use one over the other.  I did a quick google search on the terms "HL7 FHIR identifier use usual official" but none of the results on the first two pages were helpful.  My search terms are very specific so I am thinking this may be an area of HL7 FHIR that is not well documented yet.  Let's analyze the identifiers from the sample DiagnosticReport resources on the spark server and see if there is any consistency.  I use the Advanced REST Client to make an open ended query for DiagnosticReports:

http://spark.furore.com/fhir/DiagnosticReport?_format=application%2fjson%2bfhir


I find two kinds:

identifier: {
  use: "official",
  system: "http://acme.com/lab/reports",
  value: "12Z986912-16258694"
}

identifier: {
  use: "official",
  system: "http://www.bmc.nl/zorgportal/identifiers/reports",
  value: "nr1239044"
}

Since both kinds used the code "official" for "use", I am led to believe that I should do the same, with one reservation - none of these DiagnosticReport examples are radiology reports.  All of the examples with identifiers on the spark server appear to be for lab results of some type.  I don't know much about lab results, so it is possible that how they are identified and referenced is different from how radiology reports are.  It is interesting that the system is a url to some other system.  Presumably this other system generated the actual identifier, but given that I am building a radiology report repository from HL7 ORU messages, the originating RIS may not have a URL like this to refer to.  The spark server does not appear to validate the system at all (you will notice I used the system "?" for the diagnostic report I created), so for this spike, it probably doesn't matter what I put in there.

Looking at the schema for the identifier, I noticed that none of the properties are required!  For this spike, I may be able to simply omit the use and system.  I try creating another DiagnosticReport with this change and the spark server accepts the HTTP POST and creates the resource!
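So the minimal resource the spark server accepted looks something like this (a sketch, reusing the sample accession number from earlier posts):

```javascript
// Minimal identifier - every property of the FHIR identifier type is optional,
// so for the spike only the value (the accession number) is populated
var diagnosticReport = {
  resourceType: 'DiagnosticReport',
  identifier: { value: '3000' }
};
console.log(JSON.stringify(diagnosticReport.identifier)); // {"value":"3000"}
```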

So now I have a bit better understanding of HL7 FHIR identifiers, but still not a full understanding.  I have some emails out to people that I am hoping can help provide more information.  For now I am going to leave this and move on with the spike.