Simple and sophisticated data engineering solutions
CMD specialises in low-cost, automated and rapidly deployed solutions within the data domain. These solutions are backed by CI/CD processes, utilising the full power of DevOps to build solutions that scale to meet an organisation's evolving needs.
Large-scale data migrations and appliance re-platforming are particular areas of CMD expertise. We automate these processes through native AWS services – such as AWS Database Migration Service – in conjunction with our RunCMD templates using Ansible and Terraform. This ensures repeatable and robust patterns are used throughout the process.
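As an illustration of how such a migration might be automated (this is a minimal sketch, not CMD's actual RunCMD templates; the task names and ARNs are placeholders), the parameters for an AWS DMS replication task can be assembled programmatically and then submitted with boto3:

```python
import json

def build_dms_task_params(task_id, source_arn, target_arn, instance_arn,
                          schemas, migration_type="full-load-and-cdc"):
    """Assemble the parameters for a DMS replication task.

    The table mappings select which source schemas are replicated;
    "full-load-and-cdc" migrates the existing data, then keeps the
    target in sync via change data capture.
    """
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": str(i + 1),
                "rule-name": f"include-{schema}",
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
            for i, schema in enumerate(schemas)
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": migration_type,
        "TableMappings": json.dumps(table_mappings),
    }

# In a live migration these parameters would be passed to
# boto3.client("dms").create_replication_task(**params).
params = build_dms_task_params(
    "oracle-to-aurora", "arn:src", "arn:tgt", "arn:inst", ["SALES", "HR"]
)
```

Keeping the task definition in code like this is what makes the pattern repeatable: the same template can be re-run per schema, per environment, under CI/CD.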
With AWS Accelerators, we can migrate workloads and data from traditional on-premises sources such as Oracle and SQL Server, and also cater for appliances like Greenplum, Netezza and Teradata, drastically reducing cost and reliance on these end-of-life data processing platforms.
CMD is uniquely positioned to develop both simple and sophisticated data engineering solutions and back that up with a fully managed service, which ensures our clients’ critical data decision-making systems are monitored and always working when needed.
This is critical to competitiveness in the digital age: downtime and recovery windows are simply not acceptable. CMD's automation frameworks ensure that your solutions are always on and always up to date.
The AWS Data Lake pattern is simple, low cost and readily extends to traditional data use cases such as:
- Machine learning;
- Feeding external systems; and
- Internal reporting tools.
It’s designed to remove cost and increase the processing speed of data pipelines. Where possible, we use spot and on-demand instances in conjunction with Lambda functions to deliver real cost savings compared to traditional data lake architectures.
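To make the event-driven, pay-per-use idea concrete, here is a minimal sketch of the kind of Lambda function such an architecture relies on: it reacts to S3 object-created events and works out which landing files should be promoted into a curated zone. The bucket layout and `landing/`/`curated/` prefixes are illustrative assumptions, not a prescribed structure.

```python
def handler(event, context):
    """React to S3 object-created events in an event-driven data lake.

    Returns the landing keys and the curated-zone keys they would be
    promoted to. In production the copy itself would be performed with
    boto3 (s3.copy_object), with heavier transforms pushed out to spot
    or on-demand instances.
    """
    promotions = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Only raw landing files are promoted into the curated zone.
        if key.startswith("landing/"):
            curated_key = "curated/" + key[len("landing/"):]
            promotions.append({
                "bucket": bucket,
                "source": key,
                "destination": curated_key,
            })
    return {"promoted": promotions}
```

Because the function only runs when data actually arrives, there is no always-on cluster to pay for between loads.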
1. Consulting and Solution Design
Working with our clients to design AWS-based data solutions that meet their business needs by solving specific, tangible business challenges.
2. AWS Data Lake Foundations
Using the CMD CI/CD frameworks to build and deploy data lake foundations in a rapid, repeatable and reliable way.
3. Data Pipeline Development and Scheduling
Building and scheduling data pipelines to ingest, classify, transform and store data on a regular cadence, ensuring the client's data is always current.
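The ingest, classify, transform and store stages can be sketched as a small pipeline of composable functions. This is a toy illustration with an assumed record shape and classification rule, not a production pipeline; a scheduler such as EventBridge or Airflow would run the equivalent job on a regular cadence.

```python
from datetime import datetime, timezone

def ingest(raw_rows):
    # Ingest: parse raw CSV-like rows into records.
    return [dict(zip(("customer", "amount"), r.split(","))) for r in raw_rows]

def classify(records):
    # Classify: tag each record by transaction size.
    for rec in records:
        rec["tier"] = "large" if float(rec["amount"]) >= 1000 else "standard"
    return records

def transform(records):
    # Transform: normalise types and stamp the load time.
    loaded_at = datetime.now(timezone.utc).isoformat()
    for rec in records:
        rec["amount"] = float(rec["amount"])
        rec["loaded_at"] = loaded_at
    return records

def store(records, sink):
    # Store: append to the target sink (a list here; S3 or a
    # warehouse table in practice).
    sink.extend(records)
    return len(records)

sink = []
store(transform(classify(ingest(["acme,1500", "globex,200"]))), sink)
```

Keeping each stage as a separate function is what makes the pipeline easy to test, re-run and reschedule.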
4. Data Querying and Modelling
Writing queries and forming data models that generate business value from ingested data sets, ready to be consumed by the business for data science or data visualisation.
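One illustrative way to form such a query is a small helper that builds a partition-filtered aggregate over a lake table (the table and column names here are hypothetical). The resulting SQL could be submitted to Amazon Athena via boto3's `start_query_execution`.

```python
def build_daily_revenue_query(table, day):
    """Build an aggregate query over a date-partitioned lake table.

    Filtering on the partition column (assumed to be `dt`) keeps the
    scan, and therefore the query cost, small. String formatting is
    fine for this fixed illustration; user-supplied values would need
    proper parameterisation.
    """
    return (
        f"SELECT customer, SUM(amount) AS revenue "
        f"FROM {table} "
        f"WHERE dt = '{day}' "
        f"GROUP BY customer "
        f"ORDER BY revenue DESC"
    )

sql = build_daily_revenue_query("curated.transactions", "2024-01-31")
```

The same query shape feeds equally well into a BI dashboard or a data science notebook.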