Reputation scoring for open source contributors: what reputer measures and why

Every dependency you install, every pull request you merge, carries an implicit trust decision. You trust that the person behind the commit is who they claim to be, that their account hasn’t been compromised, and that their contribution is genuine. Most of the time, that trust is warranted. But supply chain attacks like the xz utils backdoor remind us that trust without verification is a vulnerability. ...

2026-02-21 · 5 min · Mark Chmarny

Complexity can be learned but abstractions come at a long-term cost

All complexity needs to be abstracted, right? This reductionist statement misses the nuance around the inherent cost/benefit tradeoffs, especially when you consider these over time. Don’t get me wrong, there often are good reasons for additional layers to make things simpler (growing adoption, lowering toil, removing friction, etc.). Still, these layers come at a long-term cost that often is not part of the evaluation process. ...

2021-03-30 · 2 min · Mark Chmarny

How to debug container image content

When dealing with file permissions in a non-root image or building apps that include static content (like CSS or templates), I sometimes get an error because the final image content does not match my expectations. Most of the time the errors are pretty obvious, and a simple fix and rebuild will do. Sometimes though, you want to take a look into the image and understand what the actual layout looks like in there. ...

2019-08-27 · 2 min · Mark Chmarny

Knative momentum continues…

I wrote a new post on the Google blog about the momentum behind the Knative project. The community has reached another adoption milestone, doubling the number of its contributors. Another data point underscoring the Knative momentum is month-over-month contributions, which have increased over 45% since the 0.1 release and now represent more than a dozen different companies. ...

2018-12-10 · 1 min · Mark Chmarny

Build and manage modern serverless workloads using Knative on Kubernetes

By now, Kubernetes should be the default target for your deployments. Yes, there are still use cases where Kubernetes is not the optimal choice, but these represent an increasingly smaller number of modern workloads. The main value of Kubernetes is that it abstracts much of the infrastructure management pain. The broad support amongst virtually all major Cloud Service Providers (CSPs) also means that your workloads are portable. Combined with the already vibrant ecosystem of Kubernetes-related tools, this means that the experience of the operator, the person responsible for managing Kubernetes, is now pretty smooth. ...

2018-07-24 · 4 min · Mark Chmarny

Service, not Volume - data explosion and how to amplify its value

Data is growing at an exponential pace. Based on recent numbers from IDC, the total amount of data in 2015 (4.4ZB) will grow to 44ZB in 2020. Frankly, how much is in a zettabyte is almost inconsequential. What's shocking is that all of the data generated since the beginning of time (at least the electronic part) will grow 10x in just the next four years! ...

2016-07-21 · 4 min · Mark Chmarny

HDFS has won, now de facto standard for centralized data storage

The “high-priests” of Big Data have spoken. Hadoop Distributed File System (HDFS) is now the de facto standard platform for data storage. You may have heard this “heresy” uttered before. But, for me, it wasn’t until the recent Strata conference that I began to really understand how prevalent this opinion actually is. ...

2016-04-03 · 6 min · Mark Chmarny

Data-related investments shift from tech to skills — talent new differentiator

Over the last decade, access to best-of-breed data technologies has become easier. This is due mainly to the increasing popularity of open source software (OSS). While this phenomenon holds true in other areas like operating systems, application servers, development frameworks or even monitoring tools, it is perhaps most prevalent in the area of data. ...

2016-04-03 · 4 min · Mark Chmarny

Gluttony of great open ML tools too hard for enterprise to use

Seems like every week we hear about yet another new open source Machine/Deep Learning library or Analytical Framework. Talking to people at Strata this week only confirmed for me that in the midst of what can only be described as a virtual gluttony of open-source software, there is a massive number of organizations that find it increasingly hard to implement these technologies. Even the task of identifying the right solution can overwhelm many, and result in a tailspin of endless use-case/feature comparison. ...

2015-09-21 · 2 min · Mark Chmarny