From POC to Production: Strategies for achieving ML at scale.
The last session I attended covered the journey from proof of concept (POC) to production, and it was interesting to explore how to achieve ML at scale. The key is ensuring that two groups get the best of both worlds: data scientists and business stakeholders on one side, and cloud IT teams on the other. This is not always easy, as each group has needs it considers vital, which can create a debate over priorities, but the aim is to maximize the benefits for both. The session explored seven aspects to consider for a smooth transition from POC to production when achieving ML at scale.
Culture is an aspect to consider, as it relates closely to the organizational norms and expectations that must be met. Executive alignment also depends on these expectations, and culture shapes the type and level of security and governance. The culture needs to be understood so that decisions are made in a way that ensures expectations are embraced. Maintaining the organizational culture when going from POC to production helps to establish (or refine) repeatable ML practices and, thereafter, to celebrate successes.
The team aspect means bringing together a variety of skills and resources. Some of the key roles are:
- Data Engineers
- Data Scientists
- DevOps Engineers
- Security Engineers
- Software Engineers
The data strategy is more of a three-step plan that can be fleshed out and refined along the way. The three main steps are:
1. Labelling of Data
2. Training Algorithms
3. Create ML models
Proof of Concept
This aspect is where the organization determines the business impact. It investigates what is currently happening in the problem space, including the internal and external impact, and what is already being done to solve the problem. Following this, while exploring a more innovative solution, there needs to be a trade-off analysis covering the business value of the outcome, who will use the solution, and how.
After the proof of concept has been completed, there should be a functional, useful and reliable model. Automation then needs to be introduced to enable repeatability. This is especially helpful where there are many regular or frequent batch jobs that take a long time or need to run overnight.
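To make the idea of repeatability a little more concrete, here is a minimal sketch of my own (not something shown in the session). It assumes a tiny two-step pipeline where the function names `prepare`, `train` and `run_pipeline` are hypothetical, and the "model" is simply the mean of the cleaned data. The point is only that an automated job runs the same steps in the same order and leaves a record that can be compared between runs.

```python
import hashlib
import json


def prepare(raw):
    """Hypothetical cleaning step: drop missing values (None)."""
    return [x for x in raw if x is not None]


def train(cleaned):
    """Stand-in for model training: the 'model' is just the mean."""
    return {"mean": sum(cleaned) / len(cleaned)}


def run_pipeline(raw):
    """Run every step in a fixed order and return a record of the run."""
    cleaned = prepare(raw)
    model = train(cleaned)
    # Fingerprint the input so two runs on the same data can be compared.
    fingerprint = hashlib.sha256(json.dumps(raw).encode()).hexdigest()[:12]
    return {
        "input_fingerprint": fingerprint,
        "rows_used": len(cleaned),
        "model": model,
    }


if __name__ == "__main__":
    batch = [1.0, 2.0, None, 3.0]
    first = run_pipeline(batch)
    second = run_pipeline(batch)
    # Same input, same steps: the two run records should be identical.
    print(first == second)
```

In practice a scheduler would trigger a job like this overnight, and the run record would be stored so that any difference between runs can be traced back to a change in the data or the code.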
In terms of scale, each role listed previously needs its own responsibilities when it comes to scaling and to introducing automation at scale. As you scale, there needs to be an ML centre of excellence (CoE) to support each of the teams that branch off. The CoE should be a cross-functional group of individuals who can also double up by being part of the delivery teams.
With all the above noted, it is important to remember that organizations evolve over time and that the process should be seen as iterative instead of linear.
I hope this three-part blog series, based on my learnings from attending the AWS Innovate Online Conference, was an interesting read for you. If you missed the previous two, head to our blog page to read them.