Machine learning security
The folks over at NVIDIA announced a very large new language model this week, “Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model.”[1] At 530 billion parameters, that is a very large generative language model indeed. All of these model releases and all of the code being shared within the machine learning space, on GitHub and via other means, create a situation where both the shared security of the space and the security of each individual project are important.
A lot more articles showed up in Google Scholar for “machine learning security” than I expected to see this week.[2] Some of that content veers off into privacy related items, and some of it covers very specific use cases where machine learning is applied to security problems. Given the islands of content on this one, it is probably best to consider both the machine learning use cases related to security and the very real and growing questions around the security of actual machine learning implementations, code, and shared projects. A lot of the practical open source MLOps projects that I track on GitHub are scanned all over the place, and people put in pull requests and communicate out the potential problems that might exist. Every once in a while you might even get to read about a temporary private fork of one of those MLOps projects where somebody is working very hard to patch some block of code to the point that it passes scanning and other security measures.[3] That is one way to really work something to resolution, but it is an interesting exercise because it is not part of driving the product features, yet it is critical to being able to use the product in enterprise settings.
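To make that dependency scanning idea a little more concrete, here is a minimal sketch of checking a single pinned dependency against the public OSV.dev vulnerability database. This is my own illustration rather than anything pulled from one of those repositories; it assumes the requests package is installed, and the package name and version are placeholders standing in for whatever an MLOps project actually pins.

```python
# Minimal sketch: query the public OSV.dev vulnerability database for one
# pinned dependency. The package name and version below are placeholders,
# not a claim about any particular MLOps project.
import requests

payload = {
    "package": {"name": "mlflow", "ecosystem": "PyPI"},
    "version": "1.20.0",
}

response = requests.post("https://api.osv.dev/v1/query", json=payload, timeout=10)
response.raise_for_status()

# OSV returns a "vulns" list only when known vulnerabilities match the pin.
vulns = response.json().get("vulns", [])
for vuln in vulns:
    print(vuln.get("id"), "-", vuln.get("summary", "no summary provided"))
print(f"{len(vulns)} known vulnerabilities found for this pin")
```

Running something like that across a full requirements file is roughly what the automated scanners do on every pull request, which is part of why those security focused pull requests keep showing up in the repositories I watch.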
Detecting statistical anomalies is something that machine learning implementations are fully capable of doing. One of the use cases that people seem to like relates to cyber security intrusion detection.[4] A lot of security work is about spotting things within the normal pattern of usage that flag as abnormal. That is one of the reasons why monitoring and traffic analysis are such a big part of security, and machine learning can play a vital part in that type of work. To that end, I spent some time reading an article from the Cisco team about security applications that utilize machine learning in practice.[5] It was a decent read about how machine learning can augment security use cases, but I was trying to dig more into how security is handled within the actual machine learning instances. Software in the AIOps and MLOps spaces exists and is changing rapidly; every day you can see the updates on GitHub. The security of those applications is where my attention was during all of my searching. You can see the security related pushes and pull requests within the GitHub repositories of the MLOps software I watch on a regular basis. My plan is to circle back with an analysis of those patterns to see if I can isolate the security related items.
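For anybody who wants to see the anomaly detection idea in code, the toy sketch below flags outliers in made up network traffic features. It is not from the Cisco article or the cited intrusion detection paper; it assumes scikit-learn and NumPy are installed, and the numbers are invented stand-ins for per connection statistics like packet counts and bytes transferred.

```python
# Toy sketch of anomaly-based intrusion detection with an Isolation Forest.
# All numbers are invented stand-ins for per-connection traffic features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Simulated "normal" traffic: two features per connection (packets, bytes).
normal_traffic = rng.normal(loc=[100, 5000], scale=[10, 500], size=(1000, 2))

# A handful of connections that sit far outside the usual pattern.
suspicious_traffic = rng.normal(loc=[400, 50000], scale=[20, 2000], size=(5, 2))

# Fit on traffic assumed to be mostly normal, then score new connections.
detector = IsolationForest(contamination=0.01, random_state=42)
detector.fit(normal_traffic)

# predict() returns 1 for inliers and -1 for flagged anomalies.
flags = detector.predict(np.vstack([normal_traffic[:5], suspicious_traffic]))
print(flags)  # the last five entries should come back as -1
```

Real intrusion detection obviously involves far more feature engineering and far messier data, but the underlying pattern of learning what normal looks like and flagging departures from it is the same.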
Links and thoughts:
Linus and Luke were back in the studio again this week and I listened to them chat about things during the course of writing this missive, “Best Buy Scalping PS5s... for SHAME - WAN Show October 15, 2021”
You can check out the Microsoft Developer AI show this week, “AI Show Live - Episode 35 - Building computer vision models using AutoML for Images”
Here is the keynote (a high production value informational video) from Google Cloud Next if you wanted to catch up on that one, “Google Cloud Next Developer Keynote”
Top 6 Tweets of the week:
Footnotes:
[2] https://scholar.google.com/scholar?q=machine+learning+security&hl=en&as_sdt=0&as_vis=1&oi=scholart
[4] https://ieeexplore.ieee.org/abstract/document/7307098
[5] https://www.cisco.com/c/en/us/products/security/machine-learning-security.html#~how-ml-works
What’s next for The Lindahl Letter?
Week 40: Applied machine learning skills
Week 41: Machine learning and the metaverse
Week 42: Time crystals and machine learning
Week 43: Practical machine learning
Week 44: Machine learning salaries
I’ll try to keep the “What’s next” list forward-looking with at least five weeks of posts in planning or review. If you enjoyed reading this content, then please take a moment and share it with a friend.