Could Someone Give me Advice for Implementing CogStack in a Large Healthcare System?

Hello there,

I am involved in a project where we are planning to implement CogStack within a large healthcare system that serves multiple hospitals and clinics. Our goal is to enhance our clinical data processing capabilities, particularly for unstructured data, to support better decision-making and research.

Given the scale of our system and the diversity of data sources we deal with, I am keen to understand the best practices for implementing CogStack in such an environment.

What are the key considerations when designing the architecture for CogStack in a multi-hospital system? Are there any particular challenges or pitfalls we should be aware of, especially in terms of scalability and data integration?

How have others approached the ingestion and processing of large volumes of unstructured data across different sites? :thinking: Are there specific tools or workflows that you have found to be particularly effective in managing this?

Given the sensitive nature of healthcare data, what security measures should we prioritize when implementing CogStack? How do you ensure compliance with regulations like GDPR or HIPAA in your deployments?

What strategies have you found effective in optimizing the performance of CogStack, particularly when dealing with real-time data processing and analytics? Are there any performance bottlenecks we should anticipate and prepare for? :thinking:

Also, I have gone through this post: https://cogstack.org/enabling-mental-healthcare-research-with-cogstack-technology-minitab/ which definitely helped me out a lot.

Finally, what have been your experiences with training healthcare staff to use CogStack? Any tips on how to drive adoption across a large organization with varying levels of technical expertise?

Thank you in advance for your help and assistance. :innocent:

Hi @Elizashahh - apologies for this message being marked as spam previously.

We have deployed CogStack in large and small hospital systems, mostly in the UK NHS, but we have also assisted others in Europe, Asia and the USA to deploy it and make the best use of their data.

Data Ingestion

CogStack can be deployed onto commodity hardware, whether cloud or bare metal. In the first instance, start by isolating a small dataset to ingest and run through a CogStack-NiFi data ingestion pipeline.

Once you’ve got the hang of running / modifying NiFi pipelines, you can prepare to ingest multiple years of data at a time, running NLP models and re-ingesting the output.
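As a rough illustration of that batch ingest → annotate → re-ingest loop: the sketch below uses hypothetical stand-in functions (`run_nlp`, `ingest_batch` are placeholders, not the CogStack-NiFi API — in a real deployment the batching happens inside NiFi processors and the annotation is an HTTP call to your NLP service, e.g. a MedCAT container).

```python
from typing import Dict, List

def run_nlp(text: str) -> Dict:
    """Hypothetical stand-in for an NLP service call (e.g. a MedCAT model).

    In a real deployment this would be an HTTP request to the NLP container;
    here we just pick out capitalised tokens to keep the sketch runnable.
    """
    return {"entities": [w for w in text.split() if w.istitle()]}

def ingest_batch(records: List[Dict], batch_size: int = 2) -> List[Dict]:
    """Sketch of the pattern described above: process documents in batches,
    attach NLP output to each record, then re-ingest into the data sink."""
    annotated = []
    for i in range(0, len(records), batch_size):
        for rec in records[i:i + batch_size]:
            rec["nlp_output"] = run_nlp(rec["text"])  # annotate each document
            annotated.append(rec)                     # re-ingest into the sink
    return annotated

docs = [{"id": 1, "text": "Patient reports Asthma"},
        {"id": 2, "text": "History of Diabetes"}]
result = ingest_batch(docs)
print(result[0]["nlp_output"]["entities"])  # -> ['Patient', 'Asthma']
```

The point of the loop structure is that each batch is small enough to re-run or re-ingest independently, which is what makes iterating on pipeline changes practical before scaling up to multiple years of data.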

Hardware Requirements

At least an 8-core, 16 GB RAM machine / VM should be used to ingest data. For a full deployment, you’ll want one node / VM / machine for NiFi, two to sink data, and a further node for NLP and MedCATtrainer. Ideally, all machines in this deployment would be of 16-core / 32 GB RAM spec.

Security

Best to follow the guidance of the system provider here, i.e. DPIAs, HIPAA compliance, etc. Engage information governance teams early and keep data as close to source as possible.

Staff Training

This is very important, in particular for the fine-tuning / NLP validation steps involving MedCATtrainer.

I hope this helps.

That’s a lot of questions, and it’s probably easiest to continue over a discussion call. If you would prefer to discuss your specific deployment and use cases, feel free to email contact@cogstack.org and one of the team will reach out to arrange a call.