Applied machine learning skills
Before we start, I’m going to acknowledge that a lot of content exists about how to get a job in the machine learning space. The scholarship on machine learning practitioners themselves is a much smaller body of work.[1] I was not really satisfied with the dearth of academic content on the subject, so I had to sit back and think deeply about it. An opportunity exists for somebody in the academic community to survey and study machine learning practitioners in detail. Maybe a book like “The Innovators” by Walter Isaacson could manifest from that effort (2015).
Right now I’m looking at version 4 of a presentation titled “What is ML Scale?” that I last updated on July 16, 2021.[2] None of that content specifically deals with applied machine learning skills. Generally, that content and all the ideas covered are adjacent to that consideration. That realization shocked me into two different sets of questions that might be pertinent to today’s topic of applied machine learning skills. For somebody to be a full stack machine learning engineer or practitioner, they would have to be able to handle two distinct sets of skills in tandem.[3] First, they would need to be able to manage models for machine learning and everything that goes into that set of skills. Second, they would need the skills to operationalize machine learning. Both of those things can be stated in a few words, and they could in some ways be labeled and dissected as MLOps and AIOps, mixed with some deep data analytics and curation in some cases. I’ll dive more into that in the next paragraphs.
Models abound these days: we have repositories of open source ones all over the place, APIs that you can consume, and homegrown solutions that are growing every day. We may have reached the point where machine learning models are generally ubiquitous and the share of absolutely new or novel ones is shrinking daily. It takes a certain set of skills to work with the data necessary for model training and to understand the elements under consideration well enough to make a finalized end product that is ready to go to production. Getting a model production ready, and being sure it is ready to go, requires a lot of planning and execution. That is one of the reasons that people have been buying pretrained models and using APIs to skip that step. It’s entirely possible that the data in question and the use case might be so specialized that building is the only path forward. A lot of use cases don’t require that, especially ones in the image/vision space of machine learning.
Operationalizing your machine learning is going to involve either connecting an API to your workflow or having the pipelines necessary to interact with your deployed model at scale. A ton of different open source MLOps tools exist to do that, and I have shared my analysis of them publicly. Being able to actively use these types of tools at scale and in production is a skillset. It is a skillset that is markedly different from the model creation and data management skills mentioned above. Running and operationalizing machine learning at scale is about the day to day management of use case execution. Are the response times from the model slowing down production? Are the accuracy and delivery of the use case within acceptable tolerances, or does action need to be taken? The efficiency of even the best models can degrade over time, a drift against your return on investment that has to be accounted for and managed.
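To make that day to day check a little more concrete, here is a minimal Python sketch of the two questions above: is the model slowing down production, and is accuracy still within tolerance? Everything in it is an assumption for illustration. The names `Tolerances` and `check_model_health` are hypothetical, the threshold values are placeholders, and a real deployment would more likely lean on whatever monitoring its MLOps tooling already provides.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass(frozen=True)
class Tolerances:
    # Hypothetical placeholder thresholds; real values depend on the use case.
    max_p95_latency_ms: float = 250.0
    min_accuracy: float = 0.90


def p95(values):
    """95th percentile; quantiles(n=20) returns 19 cut points, the last is p95."""
    return quantiles(values, n=20)[-1]


def check_model_health(latencies_ms, predictions, labels, tol=None):
    """Return a list of alert strings; an empty list means within tolerance."""
    tol = tol or Tolerances()
    alerts = []

    # Question 1: are response times slowing down production?
    latency = p95(latencies_ms)
    if latency > tol.max_p95_latency_ms:
        alerts.append(f"latency: p95 {latency:.0f} ms exceeds {tol.max_p95_latency_ms:.0f} ms")

    # Question 2: is accuracy on recently labeled traffic still acceptable?
    correct = sum(p == y for p, y in zip(predictions, labels))
    accuracy = correct / len(labels)
    if accuracy < tol.min_accuracy:
        alerts.append(f"accuracy: {accuracy:.2f} is below {tol.min_accuracy:.2f}")

    return alerts
```

Running a check like this on a schedule against a recent window of traffic, and watching whether the accuracy number trends downward from one window to the next, is one simple way to notice the kind of drift described above before it eats into the return on investment.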
Links and thoughts:
If you have not already started watching @ykilcher every week, then you are missing out. I watched and enjoyed, “[ML News] Microsoft trains 530B model | ConvMixer model fits into single tweet | DeepMind profitable.”
This week I watched the WAN show with Linus and Luke while writing this Substack. I watched/listened to this episode, “Apple is Tempting me... - WAN Show October 22, 2021”
I’m still really enjoying the @Microsoft developer AI Show with @AysSomething and @beastollnitz, “AI Show | Oct 22 | Translator now supports 100+ languages and dialects | Episode 36”
From Google Cloud Tech, “How to build secure software supply chains”
This is a really long video from Microsoft Developer, “Create: DevOps”
Footnotes:
[1] https://scholar.google.com/scholar?hl=en&as_sdt=0%2C6&as_vis=1&q=machine+learning+practitioner&btnG=
[2] You could watch that video here
What’s next for The Lindahl Letter?
Week 41: Machine learning and the metaverse
Week 42: Time crystals and machine learning
Week 43: Practical machine learning
Week 44: Machine learning salaries
Week 45: Prompt engineering and machine learning
I’ll try to keep the what’s next list forward looking with at least five weeks of posts in planning or review. If you enjoyed reading this content, then please take a moment and share it with a friend.