Where to contribute components and workflows?

Hi CogStack team,
I’m posting this as a bookmark following discussions with Thomas a few weeks back.

As part of our experiments with CogStack in Australia, we have built some tools that allow CogStack to extract and process documents from Cerner Millennium, or, more precisely, from a warehoused version of the Cerner blob table. The process required to do this is moderately complex and may be of interest to others. The components involved are:

  • A Java library that implements decompression of the proprietary Cerner LZW compression. This was written based on a couple of old tools that were floating around the web.
  • A Groovy script that calls this library to decompress the contents of NiFi flowfiles
  • A workflow to extract data from the warehouse and concatenate multi-blob documents into single flowfiles prior to extraction
  • A Groovy script to perform the concatenation
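
To make the concatenation step concrete, here is a minimal sketch of joining multi-blob documents. It is written in Python for brevity rather than Groovy, and it is not the script described above. The tuple layout `(doc_id, seq_num, chunk)` is a hypothetical stand-in for whatever columns the warehouse actually exposes.

```python
from collections import defaultdict


def concatenate_blobs(rows):
    """Join multi-part blobs into one byte string per document.

    `rows` is an iterable of (doc_id, seq_num, chunk) tuples. These field
    names are hypothetical and stand in for the real warehouse columns.
    """
    parts = defaultdict(list)
    for doc_id, seq_num, chunk in rows:
        parts[doc_id].append((seq_num, chunk))

    # Rows may arrive in any order, so sort each document's chunks by
    # sequence number before joining them.
    return {
        doc_id: b"".join(chunk for _, chunk in sorted(chunks, key=lambda p: p[0]))
        for doc_id, chunks in parts.items()
    }


# Illustrative data only: document 101 is split across two rows.
rows = [
    (101, 2, b"world"),
    (102, 1, b"single"),
    (101, 1, b"hello "),
]
documents = concatenate_blobs(rows)
```

In a NiFi flow, each value of `documents` would correspond to the content of one flowfile.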

The question is where to put this stuff. Should it sit in its own repo with some instructions on how to deploy the pieces, or should it be added to the cogstack-nifi repo, and if so, where?

Also, I'd personally like to avoid relying on my custom Java library, so I'd welcome input from any compression experts who can point the way to a more standardized approach.




This sounds great.

For the workflows and Groovy scripts, you can put them in their relevant folders (/nifi/user-templates/ and /nifi/user-scripts/). Create a separate branch with these new files and then submit a pull request.

The Java library is a bit different. Since it is a requirement for the workflows, perhaps we can just check the source code from your repository (if it is public); after that I can add the repo as a submodule in the services folder.

I can only point out that NiFi already supports zip/gzip/lzma compression and some others. We can update the workflows with these alternatives for general purposes, as I believe the mentioned compression algorithms are more than enough. What were you using LZW for? Image compression?
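
One quick way to check whether a given blob is in one of these standard formats is to try each codec and see which ones decode it without error. The sketch below uses only Python's standard library and is independent of NiFi; `probe_codecs` is a hypothetical helper name, and the codec list is just the set that ships with Python, not the set NiFi supports.

```python
import bz2
import gzip
import lzma
import zlib

# Standard-library decompressors to try, keyed by a short label.
CODECS = {
    "gzip": gzip.decompress,
    "zlib": zlib.decompress,
    "bz2": bz2.decompress,
    "lzma": lzma.decompress,
}


def probe_codecs(blob):
    """Return the labels of the codecs that decode `blob` without error."""
    matches = []
    for name, decompress in CODECS.items():
        try:
            decompress(blob)
        except (OSError, EOFError, ValueError, zlib.error, lzma.LZMAError):
            # This codec rejected the data; move on to the next one.
            continue
        matches.append(name)
    return matches


# Illustrative check with data we compressed ourselves.
sample = gzip.compress(b"example document text")
detected = probe_codecs(sample)
```

An empty result for a real blob would suggest that none of these codecs applies and that a format-specific decoder is still needed.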


Cerner’s main document table stores documents as compressed blobs, using a proprietary LZW compression. I might do some more work to see if I can figure out how to get the NiFi tools to work on it.
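
For anyone unfamiliar with LZW, here is a minimal textbook version in Python. To be clear, this is not the Cerner variant and it will not decode Cerner blobs as is: real formats typically differ in details such as code width, bit packing, and dictionary resets. It represents codes as a plain list of integers purely to show the dictionary-building idea.

```python
def lzw_compress(data):
    """Textbook LZW: encode bytes as a list of integer codes."""
    # The dictionary starts with every single-byte sequence.
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # next free code
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Inverse of lzw_compress, rebuilding the same dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry being defined right now, which
            # must be the previous sequence plus its own first byte.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(output)


text = b"TOBEORNOTTOBEORTOBEORNOT"
round_trip = lzw_decompress(lzw_compress(text))
```

The decoder never needs the dictionary to be transmitted, because it reconstructs each entry one step behind the encoder.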
