Software supply chain data fatigue and what I’ve learned from SBOM, vulnerability reports

If you are doing any vulnerability detection in your software release pipeline today, you are already familiar with the volumes of data these scanners can generate. That dataset gets significantly larger when you add things like license scanning and Software Bill of Materials (SBOM) generation, and it is compounded further with each highly automated pipeline you operate. This can quickly lead to what I refer to as Software Supply Chain Security (S3C) data fatigue, because you simply can’t do anything about many of the vulnerabilities you’ll discover. There is an actionable signal in there; it’s just hard to find in the midst of all the noise. ...

2023-01-11 · 8 min · Mark Chmarny

How I learned Dapr building tweet sentiment processing pipeline

I recently joined the Office of CTO in Azure at Microsoft and wanted to ramp up on one of the open source projects the team has built there called Dapr. Dapr describes itself as: A portable, event-driven runtime that makes it easy for developers to build resilient, microservice stateless and stateful applications that run on the cloud and edge and embraces the diversity of languages and developer frameworks. ...

2020-05-10 · 2 min · Mark Chmarny

Renaissance of custom vertical solution

We are entering a period where custom, highly optimized, vertical solutions are becoming a viable option again. This is good news for ISVs with proven domain expertise and skilled development resources. Why do I think so? We now have: (1) a plethora of feature-rich developer frameworks, message queues, scalable data stores, and even lower-level components in the OSS community, with great documentation and a large number of use-case validations; (2) a growing number of custom solution companies (more than just ISVs) with existing deep vertical/domain expertise that are also increasingly investing in hiring and training strong development teams; and (3) virtually every cloud provider offering either a raw Kubernetes service or a managed container execution platform, which (regardless of how you feel about these technologies) creates a ubiquitous surface area that can be addressed with a single solution. Yes, there are still many ways in which these custom development efforts can fail. Still, as one who started their professional career developing custom software, I’m glad to see these kinds of efforts becoming cost-effective again and increasingly representing a viable option for differentiation and real business value delivery. ...

2020-02-19 · 2 min · Mark Chmarny

Data Exchange — How to Amplify Value of Machine Data

The presentation that goes along with this post is available here. In my last post, I went over the value cycle of machine-generated data. In this post, I want to follow up with a few ideas on how to further amplify the value of that data by expanding its context beyond the walls of the owning organization, in a construct we came to know as Data Exchange, and list a few innovation opportunities in each of these areas. ...

2016-05-17 · 4 min · Mark Chmarny

HDFS has won, now de facto standard for centralized data storage

The “high-priests” of Big Data have spoken. Hadoop Distributed File System (HDFS) is now the de facto standard platform for data storage. You may have heard this “heresy” uttered before. But, for me, it wasn’t until the recent Strata conference that I began to really understand how prevalent this opinion actually is. ...

2016-04-03 · 6 min · Mark Chmarny

Gluttony of great open ML tools too hard for enterprise to use

It seems like every week we hear about yet another new open source Machine/Deep Learning library or Analytical Framework. Talking to people at Strata this week only confirmed for me that, in the midst of what can only be described as a virtual gluttony of open-source software, there is a massive number of organizations that find it increasingly hard to implement these technologies. Even the task of identifying the right solution can overwhelm many and result in a tailspin of endless use-case/feature comparison. ...

2015-09-21 · 2 min · Mark Chmarny

Federated not Balkanized - The Future of Data and Its Short-term Cloud Challenges

As a long-term Cloud storage user, I recently wanted to re-evaluate my options. New content management providers became available and I wanted to make sure I wasn’t missing out on the new shiny tech out there. As I was considering the pros and cons of each option, I realized the apparent shift in my personal attitude towards cloud data storage over the last few years. My concerns used to be solely with security. Now, while data security is still critical, I am much more interested in data access, ownership, integration, and control. ...

2015-07-21 · 3 min · Mark Chmarny

Data Opportunities, 3 Areas to Focus Innovation

About a year and a half ago, I wrote about Big Data Opportunities, focusing primarily on Leveraging Unused Data, Driving New Value From Summary Data and Harnessing Heterogeneous Data Sensors (more recently known as Internet of Things). Since that post, the data space has exploded with numerous solutions addressing many of these areas. While these solutions are mostly based on batch operations and limited to serial MapReduce jobs against frequently off-line, inadequately secured Hadoop clusters, they do allow access to previously inaccessible data. ...

2015-02-21 · 4 min · Mark Chmarny