About the Company

The Department of Health – Agency’s priority is improving population health by strengthening the State’s health system. The Department’s five branches (Public Health Services, Health Systems, Integrated Health, the Office of Population Health, and the Office of Policy and Strategic Planning) work collaboratively toward that goal. Population health focuses on keeping healthy people in the state well, preventing those at risk from getting sick, and keeping those with chronic conditions from getting sicker. It promotes prevention, wellness, and equity in all environments, resulting in a healthy state.

The Challenge

The agency had an immediate need to implement enterprise data lake services for COVID-19. The data analytics solution would help it carry out efficient contact tracing for COVID-19 and provide public and internal reports on overall contact tracing performance. The data lake would then be extended to other data sources and use cases in the future.

The Solution

Kapstone proposed and implemented a solution on Amazon Web Services (AWS) that met all of the agency's needs. Various third-party data sources were integrated using services such as API Gateway, Lambda, Kinesis, and S3. S3 buckets were designed to accommodate various data set types and schemas, and Glue crawlers were developed to parse the data and generate a logical schema. The data was reported through dashboards in Tableau, which queried the data in S3 via Athena. For security, AWS Secrets Manager stored all endpoints, usernames, and passwords for third-party applications. End users of the data lake were given least-privilege access. Monitoring was set up with Amazon CloudWatch, whose service logs help in investigating issues.
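
As an illustration of the credential handling described above, the following Python sketch shows one way an ingestion function might read a third-party source's endpoint, username, and password from AWS Secrets Manager. It is a minimal sketch, not the implemented solution: the names SourceCredentials and load_source_credentials, and the JSON keys "endpoint", "username", and "password", are assumptions made for this example. The only AWS call used is get_secret_value, which the Secrets Manager client in boto3 provides.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceCredentials:
    """Connection details for one third-party data source (hypothetical shape)."""
    endpoint: str
    username: str
    # Keep the password out of repr() so it is not written to logs by accident.
    password: str = field(repr=False)


def load_source_credentials(secrets_client, secret_id):
    """Fetch and parse one secret.

    secrets_client is expected to behave like boto3.client("secretsmanager").
    The secret is assumed to be a JSON string holding the keys
    "endpoint", "username", and "password" (an assumption of this sketch).
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    payload = json.loads(response["SecretString"])
    return SourceCredentials(
        endpoint=payload["endpoint"],
        username=payload["username"],
        password=payload["password"],
    )
```

In a Lambda function this could be called as load_source_credentials(boto3.client("secretsmanager"), secret_id), where secret_id is whatever name the secret was stored under. Passing the client in as a parameter keeps the function easy to test without AWS access, and the function's IAM role then needs read access only to the specific secrets it uses, in line with the least-privilege approach.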

The Benefits

As a result, a highly available data lake solution was built to handle near-real-time data. Serverless services reduced management overhead. Reporting was automated to reduce the workload of business users. The architecture of serverless, independent services is robust and scalable enough to handle bulk loads of new data.