Keep learning. Being a lifelong learner is a solid plan. I honestly believe we live in a golden age of machine learning training and talent. People are selflessly sharing world-class training materials and code on GitHub. The number of academic papers being shared is skyrocketing.[1] If you can sprinkle in a little bit of expertise as you go, you can start building teams from the ground up. That starts by helping people invest in understanding machine learning. That sets the foundation so that they can build the machine learning pipelines you need, starting from something like the TensorFlow Extended examples.[2] Working through those examples is one way to help them understand how pipelines are built and deployed. The training needed to do that is amazingly available online, and people have been super gracious with sharing that kind of knowledge.
The hard part is pairing that knowledge with deep machine learning skills. That means you are going to need to start finding the subject matter expert knowledge within your organization. When you pair people who know what your organization does at a super granular level with people who can work with models and do deployments, that is where things will take off. It will probably end up being a team effort. You have subject matter experts with deep knowledge and you have machine learning experts. The combination of those two groups in a team is how things get done in practice. You may want to bring in someone who is a real expert on building different layers and networks and refining models. They may augment your team and speed things up. Sometimes you have to bring in the right folks to jumpstart things along the way.
#1 Where does the talent come from?
Personally, I believe you can build the talent from within your organization. If you doubt that assertion, then make a mental note to really challenge that bias against internal growth and development. Throughout my career I have built a proven track record of helping grow internal talent. It works, but it requires investing time and having the right programs aligned to building toolkits. One of my proudest professional accomplishments is seeing somebody get promoted. It is an amazing thing to see people start bringing advanced methods and techniques to different parts of the organization. You can capture that feeling as well. You just need to go out and start building out the toolkits of the people that you have. Really take the time to invest in them growing and developing as teammates and as individual contributors.
We are in the golden age of learning about machine learning.[3] More training than you can possibly consume now exists online. It exists in a variety of different forms. One of my favorites is the hands-on labs that are now available online. The platform I have used the most is Coursera. People have built out well-tooled examples of how to do machine learning. Not only can you read about it, but you can get into examples and kick the tires. That is the thing that has drawn me to TensorFlow since the product launched. So many people have been so generous with their knowledge, skills, and abilities.[4] They are sharing the keys to the machine learning kingdom online in some pretty easy to access classes, lectures, and even a few certificates. I have taken over fifty courses. You can see them on my LinkedIn profile if you really want to dig into the road I traveled. That will show you which ones I invested my own time in completing. You can also just check the links in the footnotes to find a place to start.
Sometimes building internal teams is just not fast enough. It takes time to help internal talent develop world class skills in machine learning or anything for that matter. I recognize that is a long-term goal and something you have to build toward along the way. That is where you have a few options to start looking for ways to supplement talent. One of those ways is to hire contractors and have them help you kickstart your endeavor. Another way is to find the right product or company to help you get going fast. Several companies are doing that right now and some of them can be impactful for your organization.
Typically, the data sources in an organization are not well indexed with clearly mapped features and associations. Even getting off-the-shelf data sources is a real challenge. For the most part, the data sets that people use were created to be used that way. Those data sets did not occur naturally in the wild. Even making custom-tailored synthetic datasets can be a challenge for an organization that is trying to operationalize ML at scale. Using external products to manage the data, or even just accessing APIs, requires planning and sustained dedication. That means the data going to the APIs must be consistent. Constantly changing data streams are a nightmare to manage internally or externally. A lot of companies like Databricks are showing up to the party and helping make sense of complex data stores.[5]
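To make the point about consistent data a little more concrete, here is a minimal sketch of a check you could run on a record before it is sent to an external API. The field names, the EXPECTED_SCHEMA dictionary, and the schema_problems function are all hypothetical names made up for illustration. They are not part of any product mentioned here, and a real pipeline would likely lean on a dedicated data validation tool instead.

```python
from typing import Any, Dict, List

# Hypothetical schema for illustration only: field name -> expected Python type.
EXPECTED_SCHEMA: Dict[str, type] = {
    "customer_id": str,
    "order_total": float,
    "item_count": int,
}


def schema_problems(record: Dict[str, Any],
                    schema: Dict[str, type] = EXPECTED_SCHEMA) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record matches."""
    problems: List[str] = []
    # Check every expected field for presence and type.
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    # Flag fields that showed up in the stream but are not in the schema.
    for field in record:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


if __name__ == "__main__":
    drifting_record = {"customer_id": "c1", "order_total": "9.5"}
    for problem in schema_problems(drifting_record):
        print(problem)
```

A check like this will not fix a changing data stream, but it can surface the drift early, before it turns into a problem for whoever is on the other side of the API.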
That might have been a lot to consider all in one stretch of thought, but it will all come into context the first time building a team to solve an ML problem becomes a necessity. My answer to where the talent comes from involves blending great professionals together over time to create high functioning teams. That may involve hiring in key skill sets to help supplement a team or investing in training the team if enough ramp up time exists. The shorter the amount of ramp up time the greater the need to quickly bring in external talent.
#2 How do you get the talent to work together?
Now that we have talked about where the talent comes from and how to think about investing in your teams, let’s switch gears and talk about how to get the teams to work together. This is one of those things that is much easier to talk about than to manage in practice. You can think about the mantra: let your leaders lead, let your managers manage, and let your employees succeed. That works well enough when you have agile teams that self-organize and rapidly get work done. If that is where you are sitting right now, then congratulations. Appreciate what you have.
Teams are about how the different players work together. I try to think about machine learning engagements as having two key pillars. First, you need to figure out who has the deep knowledge of the product, the data, and how the data relates to the customer journey. This is either going to be obvious or hard. Sometimes the folks with the greatest institutional knowledge of the data are key SMEs who play an impactful role. They could also be buried deeper in the organization in an analyst role, or maybe they have moved to another role.
Second, find that person with deep knowledge and help them work with the machine learning expert you found. Pairing these two together is going to be the most critical linchpin of what you are doing. Most organizations do not have data structures that were architected from the start to work with machine learning. Figuring out the right places to start, what data to label, and what relates to what is really the beginning of the journey. This is one of the reasons why people with full stack machine learning skills are so important. What does that even mean? Full stack machine learning skills. I can walk into your organization and set up TensorFlow and even get the team sharing some Jupyter notebooks today. That is the easy part. Having the right feeds, having the right machine learning hardware, and having access to the right production-side infrastructure to swiftly move data without crushing or breaking things is where full stack skills are essential.
Maybe truly agile teams are supposed to be self-organizing, but that is probably not just going to happen the first time out of the gate. Finding a common or shared purpose sounds a lot easier than it really is in practice. Getting people to self-organize around that common or shared purpose probably requires some type of ground rules or spark.
Sometimes high functioning teams just embrace the challenge and work to knock down any barriers or obstacles they might face. Most teams do not have that level of dedication, persistence, or fortitude. Typically, project needs or just a general business problem bring a group together to take some type of action. Managing during those types of situations is always interesting and generally includes trying to bring people with diverse skill sets together.
That covers two types of teams you will encounter: high performing teams that are already assembled and teams that come together based on a specific business problem. Outside of those two common scenarios, the other type of talent situation you will face might very well be a solution chasing a problem. It happens now more than ever, because the market is saturated with open source projects that let people jump in and start working with complex tools. The next step in that pattern is wanting to do something with that new and exciting tooling. To that end, you may find a solution just waiting for a problem to tackle. However, it might not be the right solution or even remotely close to the course of action that should be taken.
Getting talent to work together, for me, revolves around the business problem and what the team is trying to achieve. It is hard to rally around an end goal that is nebulous or that has been pragmatically co-opted into something other than a resolution to the business problem in question.
We should probably jump in and spend a little bit of time on understanding the tooling necessary to allow the machine learning expert to work with the team in a productive way. You can probably tell by now that my preference is for using something robust like TensorFlow to dig in and start doing machine learning at scale. You could also just start out with log files and dig in with an off-the-shelf product like the ML toolkit from Splunk. That is an example of a way to open the door for the team to start using a common platform to get things done.
This is a topic that I'm super passionate about, and I'm always happy to talk to people about how to take the first steps to build up internal talent for machine learning.
Footnotes:
[1] Here is a good look at machine learning papers. The volume of publication in the space is expanding exponentially: https://www.technologyreview.com/2019/01/25/1436/we-analyzed-16625-papers-to-figure-out-where-ai-is-headed-next/
[2] https://github.com/tensorflow/tfx/tree/master/tfx/examples/chicago_taxi_pipeline
[3] Check out https://www.coursera.org/browse/data-science/machine-learning and https://www.qwiklabs.com/focuses/3391?parent=catalog
[4] Two of my favorite people to follow: https://twitter.com/DynamicWebPaige and https://twitter.com/lak_gcp
[5] I’m working daily to learn more about this company https://databricks.com/
What’s next for The Lindahl Letter?
Week 4: Have an ML strategy… revisited
Week 5: Let your ROI drive a fact-based decision-making process
Week 6: Understand the ongoing cost and success criteria as part of your ML strategy
Week 7: Plan to grow based on successful ROI
Week 8: Is the ML we need everywhere now?
Week 9: What is ML scale? The where and the when of ML usage
Week 10: Valuing ML use cases based on scale
Week 11: Model extensibility for few shot GPT-2
Week 12: Confounding within multiple ML model deployments
I’ll try to keep the what’s next list forward looking with at least five weeks of posts in planning or review. If you enjoyed reading this content, then please take a moment and share it with a friend.