Search: [DevOps_methodology] - Useful links worth sharing

My Philosophy on Alerting by Rob Ewaschuk | GoogleDoc

Based on my observations while I was a Site Reliability Engineer at Google

GoogleDoc · alerting · monitoring · ServiceMonitoring · Linux_SysAdmin · Google · devops · DevOps_methodology · methodology

April 22, 2025 09:17:11 AM GMT+02:00 * · permalink

·

https://docs.google.com/document/d/199PqyG3UsyXlwieHaqbGiWVa8eMWi8zzAn0YfcApr8Q/edit?tab=t.0

The DevOps Phenomenon - CALMS - ACM Queue

Culture: mutual trust, willingness to learn and continuous improvement, constant flow of information, open-mindedness to changes and experimentation
Automation: deployment pipelines (CI/CD), comprehensive test automation
Lean: minimize WIP state, shorten and amplify feedback loops, look for opportunities to remove waste, fix errors as they are discovered
Measurement: monitoring, system metrics, KPIs
Sharing: sharing knowledge & practices, including successes & failures, learn from each other's experiences, proactively communicate, shadowing & pairing on tasks

devops · DevOps_methodology · Culture · pairing · REX · automation · pipeline · monitoring · ci-pipeline · Testing · KPIs · post-mortem

September 16, 2021 11:02:55 AM GMT+02:00 * · permalink

·

https://queue.acm.org/detail.cfm?id=3338532

The USE Method

The Utilization Saturation and Errors (USE) Method is a methodology for analyzing the performance of any system. It directs the construction of a checklist, which for server analysis can be used for quickly identifying resource bottlenecks or errors. It begins by posing questions, and then seeks answers, instead of beginning with given metrics (partial answers) and trying to work backwards.

The USE Method can be summarized as:
For every resource, check utilization, saturation, and errors.
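The summary above can be illustrated with a minimal Python sketch of such a checklist. This is only an illustration under stated assumptions: the `ResourceSample` type, the sample values, and both thresholds are hypothetical and do not come from the linked page.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    name: str
    utilization: float  # fraction of time the resource was busy, 0.0 to 1.0
    saturation: float   # amount of queued work, e.g. a run-queue length
    errors: int         # count of error events


# Hypothetical thresholds, chosen only for this example.
UTILIZATION_LIMIT = 0.8
SATURATION_LIMIT = 0.0


def use_checklist(samples):
    """For every resource, check errors, saturation, and utilization."""
    findings = []
    for s in samples:
        if s.errors > 0:
            findings.append((s.name, "errors", s.errors))
        if s.saturation > SATURATION_LIMIT:
            findings.append((s.name, "saturation", s.saturation))
        if s.utilization > UTILIZATION_LIMIT:
            findings.append((s.name, "utilization", s.utilization))
    return findings


# Made-up measurements for three resources.
samples = [
    ResourceSample("cpu", 0.95, 3.0, 0),
    ResourceSample("disk", 0.40, 0.0, 2),
    ResourceSample("network", 0.10, 0.0, 0),
]

findings = use_checklist(samples)
for name, metric, value in findings:
    print(f"{name}: {metric} = {value}")
```

The point of the sketch is the loop shape: start from the list of resources and ask the three questions of each one, rather than starting from whatever metrics happen to be available.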

On the same website:

Networking_And_Servers · methodology · DevOps_methodology · Resources · error_handling · Website_performances_&_latency

February 3, 2020 06:28:07 PM GMT+01:00 · permalink

·

http://www.brendangregg.com/usemethod.html

Some notes on running new software in production - Julia Evans

tl;dr:

- Start using [your software] in production in a non-critical capacity (by sending a small percentage of traffic to it, on a less critical service, etc.)
- Try to have each incident only once
- Understand what is ok to break and what isn’t
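The first point, sending a small percentage of traffic to the new software, could be sketched as below. This is an assumption-laden illustration, not the approach described in the post: the `route` function, the `canary_percent` parameter, and the request IDs are all hypothetical.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent % of requests to the new software.

    Hashing the request ID keeps the decision stable: the same ID
    always lands on the same side.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # value in 0..99
    return "new" if bucket < canary_percent else "old"


# Route 10,000 made-up request IDs with a 5% canary share.
counts = {"new": 0, "old": 0}
for i in range(10_000):
    counts[route(f"req-{i}", 5)] += 1

print(counts)
```

Starting with a low percentage limits how many requests an incident in the new software can affect, and the share can be raised as confidence grows.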

For example, with Kubernetes:

ok to break:
- any stateless control plane component can crash or be cycled out or go down for 5 minutes at any time
- kubernetes networking can break as much as it wants because we decided not to use it to start
not ok to break:
- etcd being down for an extended period (for us, if etcd goes down for 10 minutes, that’s ok)
- containers not starting or crashing on startup
- containers not having access to the resources they need
- pods being terminated unexpectedly by Kubernetes

With Envoy, the breakdown is pretty different:

resilience · DevOps_methodology · ProdOps · Prog · Kubernetes · AWS

November 14, 2018 10:32:01 AM GMT+01:00 * · permalink

·

https://jvns.ca/blog/2018/11/11/understand-the-software-you-use-in-production/

The Critical Missing Piece of DevOps… And How to Find It | AWS Cloud Enterprise Strategy Blog

Not very insightful, but I'm retaining some quotes:

But IT operations includes much more than the limited “ops” functions we typically fold into a DevOps team. I’m talking about things like ticket management, incident handling, user management and authorization, backups and recovery, network management, security operations, infrastructure procurement and cost optimization, compliance reporting, and much more. In today’s IT organization, where do these responsibilities fall?

You want DevOps teams to have a streamlined, low lead-time, lean pipeline to production. Devoting team capacity to this broader set of operational functions may slow down this pipeline. There are also efficiencies to be gained by sharing these practices across the work of all the DevOps teams.

All of this is to say that a portion of IT operations still exists independently of the DevOps teams, performing those “ops” functions that are not in “DevOps” while the DevOps teams focus on that subset of ops functions specifically related to deploying code and responding to code-related incidents.

Agile · AWS · DeploymentAutomationSystems · Management · IncidentManagement · DevOps_methodology · Prog

June 24, 2018 12:47:54 PM GMT+02:00 · permalink

·

https://aws.amazon.com/fr/blogs/enterprise-strategy/the-critical-missing-piece-of-devopsand-how-to-find-it/

NullPointerException - Blog d'Aeris

Blog · Crypto · DevOps_methodology · Mensuel · open-source

June 2, 2017 05:24:25 PM GMT+02:00 * · permalink

·

https://blog.imirhil.fr

Ansible v.s. Salt (SaltStack) v.s. StackStorm – Anthony Shaw – Medium

"Ansible is simple, which is a major strength", it "works by connecting to a server using SSH, copies the Python code over, executes it and then removes itself".
"Ansible Tower is the Enterprise version, it turns the command line Ansible into a service, with a web interface, scheduler and notification system."
"you can’t have long-running tasks."
"StackStorm is designed as a highly-configurable if-this-then-that service. It can react to events and then run a simple command or a complex workflow."
"MongoDB can be scaled using well-documented patterns." "StackStorm extensibility system is a key strength." "If StackStorm were a programming language, it would be strongly typed."
"Salt was born as a distributed remote execution system used to execute commands and query data on remote nodes."
"Ultra high-performance for large deployments." (LinkedIn uses it)

Salt · Ansible · StackStorm · DevOps_methodology · Python · Networking_And_Servers · DeploymentAutomationSystems · Prog · Enseignement

May 29, 2017 03:03:25 PM GMT+02:00 · permalink

·

https://medium.com/@anthonypjshaw/ansible-v-s-salt-saltstack-v-s-stackstorm-3d8f57149368

We take a look at Etsy's blameless postmortems, both in terms of philosophy, process and practical measures/guidance to avoid blame and better prepare for the next outage. Because failures are inevitable in complex socio-technical systems, it’s the failure handling and resolution that can be improved by learning from postmortems.

Prog · DevOps_methodology

September 2, 2015 12:34:24 AM GMT+02:00 · permalink

·

http://www.infoq.com/articles/postmortems-etsy

Lessons learned from reading post mortems

I love reading postmortems. They’re educational, but unlike most educational docs, they tell an entertaining story. I’ve spent a decent chunk of time …

Prog · DevOps_methodology · post-mortem

August 24, 2015 01:38:43 AM GMT+02:00 * · permalink

·

http://danluu.com/postmortem-lessons/

Is there such thing as a DevOps Hierarchy of Needs? | The Ape Grid

In 1943 the psychologist Abraham Maslow proposed the concept of a 'hierarchy of needs' to describe human motivation. Most often portrayed as a pyramid, with the more fundamental needs occupying the largest space in the bottom layers, his theory states that only in the fulfilment of the lower-level needs can one hope to progress to…

Prog · DevOps_methodology

August 24, 2015 01:38:05 AM GMT+02:00 · permalink

·

http://tech.spaceapegames.com/2015/08/14/is-there-such-thing-as-a-devops-hierarchy-of-needs/

You Don't Need a DevOps Team: You Need a Tools Team

Lately the job boards have been filled with ads that look something like this: Seeking Senior DevOps Engineer Must be able to debug all databases created since 1980 Be a core contributor to at least 10 open source projects Have experience with Go, Java, Python, Ruby, and C# Understand the kernel and be able to debug panics at 3AM

Prog · DevOps_methodology

March 22, 2015 08:16:23 PM GMT+01:00 * · permalink

·

http://www.snehasoft.com/blog/uncategorized/you-dont-need-a-devops-team-you-need-a-tools-team/

DevOps @ Devoxx | Le blog Netapsys

Prog · DevOps_methodology

March 18, 2015 03:17:13 PM GMT+01:00 · permalink

·

http://blog.netapsys.fr/devops-devoxx-2/

Slides+++ : Lean Configuration Management

Configuration management is an essential ingredient in creating high performance IT. But how you implement it matters. In this talk Jez will present the princi…

Prog · DevOps_methodology

February 25, 2015 04:50:24 PM GMT+01:00 · permalink

·

http://www.slideshare.net/jezhumble/lean-configuration-management

The DevOps Checklist

Form a better understanding of DevOps and your delivery ecosystem through following the DevOps Checklist by Steve Pereira - design courtesy of Aaron Legaspi and Amit Jakhu of Myplanet.

Prog · DevOps_methodology

February 22, 2015 02:21:58 PM GMT+01:00 · permalink

·

http://devopschecklist.com/

How to tell if your company is DevOps | CommitStrip - Blog relating the daily life of web agencies developers

Prog · DevOps_methodology

February 17, 2015 07:05:40 PM GMT+01:00 · permalink

·

http://www.commitstrip.com/fr/2015/02/02/is-your-company-ready-for-devops/

How to Modify ITIL to Accommodate DevOps | IT User Support Management Practice

If an IT management idea could ever be said to be “exciting”, then perhaps DevOps is a strong candidate. Its basic idea is to replace the traditional separation of “Development” and “IT” or “Operations” with a single function. This restructure has laudable goals; no more developers having to commission and wait for test and production…

Prog · DevOps_methodology

January 11, 2015 09:26:18 PM GMT+01:00 · permalink

·

https://noelbruton.wordpress.com/2014/04/04/how-to-modify-itil-to-accommodate-devops/