The Rub

Automatically Simple Since 2002

DevOpsDays MSP 2015

09 July 2015

The Conference

DevOpsDays MSP is an annual 2-day conference in Minneapolis, Minnesota. This is the second year of the conference.

DevOps is *deep breath* the study and practice of combining the classically separate professions of development and operations. (This is my definition, which probably differs from everyone else’s.)

The term ‘DevOps’ is largely credited to Patrick Debois in 2009, though the concepts and practices arguably predate the term.

The following are my personal notes from the conference, summarized in my own words, and sometimes with my own biases and omissions. This is not a transcript and it is not complete. Where possible, the original sources should be referenced.

Talks

Devops: The Missing Pieces

Katherine Daniels - talk

Great overview of devops at Etsy with some actionable advice that every company should consider.

  • DevOps History
    • Used to throw code over the wall, and it didn’t go well
    • Patrick Debois organized the first DevOpsDays conference in 2009
      • Ops and Devs should work together, communicate more
  • DevOps @ Etsy
    • “Who is in charge of DevOps?” – “We are all in charge of DevOps.”
    • No DevOps team, no DevOps engineers.
    • We’re all in charge of devops.
  • Bootcamping
    • Embedding (especially new) employees into other teams
    • 1-6 weeks, 1-3 bootcamps per person
    • Meaningful contributions with other teams
    • Empathy, inter-team understanding
  • Designated Ops
    • Every team has a designated ops engineer
    • Even non-dev teams may have an ops person
    • Designated, but not dedicated
    • Primary, secondary, advisory roles - avoids burnout, spreads load
    • Attend other teams’ meetings
      • Expose operational ideas, concerns, early
      • Monitoring, operational thinking, fewer surprises
  • Pair Opsing
    • A part of Designated Ops
    • Similar to pair programming
    • Once a week, or every couple weeks
    • Refactoring ops code, changing alerting
    • Two-way knowledge sharing
  • Intersectionality
    • The study of intersections between forms or systems of oppression, discrimination
    • Accessibility, real name policies, B corps, side effects of businesses
    • Rock stars, big egos, bad
    • Looking to build orchestras, rather than rock stars
    • When hiring, look for diversity and team players

Why you should care about DevOps in the public sector

Joshua Zimmerman - talk

Challenges in public sector technology and how DevOps can help.

  • The public sector consists of organizations owned or administered by the government
  • Adam Jacob’s Kung Fu Talk recommended
  • Why care about the public sector? Easy example: healthcare.gov
  • Lacking the agency to fix a problem makes it feel like it isn’t your problem.
  • Gov’t applications, web apps, have issues. They’re aware of it.
    • “This is what my tax dollars pay for.” common complaint
  • We should be able to improve these applications.

  • Why devops?
    • Improve state of public infrastructure
    • Doesn’t require money! It’s culture
  • Bureaucracy
    • Mark Schwartz How DevOps Can Fix Federal Government recommended
    • Devops is about bureaucracy
    • Fix bureaucracy with devops culture to improve services
    • Structure can be a barrier to devops
    • Historical baggage - commonalities were not identified early
      • i.e. central IT department may not exist
      • Lots of institutionalized silos
    • At UW Madison, ~140 IT units on campus(!)
    • Structure and politics are hard
    • John Maeda -
      • Silo approach - cooperation - working together independently
      • devops approach - collaboration - working together dependently
  • Metrics for success may differ from private sector
    • May never feel like you have succeeded
    • Time is often a bigger limitation than money
  • How can everyone help?
    • Devops conferences help - affordable and distributed
    • Devops resources online also contribute
    • Devops community needs better outreach to other tech communities
  • Your words matter. Choose them wisely. [use inclusive language]
    • Don’t assume the goal is to make money. For a public institution, that’s not important.
    • Give everyone a voice (Thanks devopsdays)
    • Your problems are not unique
  • Takeaways
    • Keep trying to understand us
    • Team building without assuming hiring and firing are options

Rolling Your Own vs SaaS: Tradeoffs & Considerations

Colleen Velo - talk

This detail-packed talk shows an overview of what particular pieces of software a large company is using, and why.

  • Bloom Health
    • Private health exchange
    • HIPAA compliant, PHI data
    • Entire infra in public cloud infrastructure (AWS)
  • Definitions used in this talk
    • SaaS - Software as a Service
    • Roll your own - write your own custom software privately
    • Open Source - Freely available software
    • Commercial (self hosted) - purchased software but administered locally
  • Considerations on DIY vs SaaS
    • Cost, support, staffing, company policies, security
    • HIPAA security
      • data must be encrypted in transit and at rest
      • principle of least privilege
  • I excluded a bunch of slides on pros, cons and use cases of each of SaaS, Commercial, Open Source, and Roll-your-own. Watch the talk for full details.

  • Bloom’s approach to SaaS
    • AWS
      • CloudTrail and CloudCheckr - great for auditing access
      • Used to use Trusted Advisor instead of CloudCheckr
      • CloudFormation to manage deployments
      • ElastiCache instead of sticky sessions
      • Planning to use RDS for MySQL
        • Currently still using MySQL master/slave
        • RDS MySQL now HIPAA compliant
      • AWS HIPAA compliant page recommended
    • Jira, Confluence, Hipchat
    • DockerHub private registry
      • building their own private registry
    • Github in the cloud
    • Monitoring
      • Stackdriver for system metrics, integrated with AWS
      • New Relic for application monitoring, JVM stack, groovy/tomcat
      • Pingdom for monitoring endpoints
      • Pagerduty for alerting
  • Bloom’s approach to commercial (self-hosted) software
    • Splunk for log aggregation
      • Splunk enterprise security
      • Great integration with 3rd party products
    • Casper Suite for Mac Provisioning
      • Security policies
      • Audit trail
    • Jira ticketing system for PHI data - self hosted
  • Bloom’s approach to “roll your own” (write your own solution)
    • SFTP file exchange due to HIPAA compliance
    • Dynamic service discovery
      • In 2013, not many options available
      • BHStore, based on Redis and publish/subscribe
      • Moving over to Consul (HashiCorp)
        • Multi data center support
  • Bloom’s approach to open source
    • Vagrant for development/testing
    • Chef solo, migrating to Salt Stack and Docker
    • Graphite for monitoring historical application metrics
    • Packer for AMIs
  • Takeaways
    • Go to meetups to discuss solutions and experiences
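
The talk did not show how BHStore works internally, so the following is only my own rough sketch of the general idea behind publish/subscribe service discovery. It is in Python, it keeps everything in memory instead of using Redis, and the names `ServiceRegistry`, `register`, `deregister`, `lookup`, and `subscribe` are all made up for illustration.

```python
from collections import defaultdict


class ServiceRegistry:
    """Hypothetical in-memory stand-in for a Redis-backed registry."""

    def __init__(self):
        self._instances = defaultdict(set)     # service name -> set of addresses
        self._subscribers = defaultdict(list)  # service name -> list of callbacks

    def subscribe(self, service, callback):
        """Ask to be called with (service, addresses) whenever the service changes."""
        self._subscribers[service].append(callback)

    def register(self, service, address):
        self._instances[service].add(address)
        self._publish(service)

    def deregister(self, service, address):
        self._instances[service].discard(address)
        self._publish(service)

    def lookup(self, service):
        """Return the currently known addresses for a service, sorted."""
        return sorted(self._instances[service])

    def _publish(self, service):
        # Notify every subscriber of the new address list.
        for callback in self._subscribers[service]:
            callback(service, self.lookup(service))


registry = ServiceRegistry()
seen = []
registry.subscribe("api", lambda service, addresses: seen.append(addresses))
registry.register("api", "10.0.0.1:8080")
registry.register("api", "10.0.0.2:8080")
registry.deregister("api", "10.0.0.1:8080")
print(registry.lookup("api"))
```

Consumers react to changes instead of polling, which is presumably the appeal of the publish/subscribe approach; a tool like Consul adds health checks and multi data center support on top.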

Helping Developers Monitor Their Own Applications

Luke Francl - talk

DevOps from a developer perspective; how to get developer buy-in. Warning, actual code is presented and discussed!

  • Swiftype - search as a service company using API or web crawler
  • Luke is a developer, admits to not knowing a lot about operations
  • –Bunch of really funny joke slides excluded–
  • Thought devops was just an ops thing..
  • Monitoring
    • Make it easy for devs to add monitoring (or it may not get done)
    • Example shown of ruby integration with nagios
      • Allows monitoring metrics and thresholds to be defined and implemented directly in code
      • See slides for particular technical details
    • With this glue, “Monitoring is addictive” for development
    • Developers are subject matter experts at their application. Give them the tools to implement their own monitoring.
    • Open sourced ruby/nagios glue at github
  • “DevOps needs developers”
    • Development used to be pretty awesome, throwing code over the wall
    • Need developer buy-in. It can’t just be about ops
    • Providing infrastructure for developers is a powerful way to convince development
    • Make it easy for developers. They’ll understand the value.
    • Moving towards having developers on call, but not there yet
  • Takeaways
    • Much of this talk was contextual and funny, which I did not attempt to reproduce here
    • Much of this talk was code, which I also did not reproduce here
    • Make monitoring directly accessible to developers
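
I did not copy the speaker's Ruby code, so this is not it. As a rough sketch of what "thresholds defined directly in code" could look like, here is a small Python example that follows the standard Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL), and 3 (UNKNOWN). The `Check` class, its fields, and the `queue_depth` metric are hypothetical names of my own.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


@dataclass
class Check:
    """A hypothetical check: a metric source plus warning/critical thresholds."""
    name: str
    metric: Callable[[], float]  # returns the current value of the metric
    warning: float
    critical: float

    def evaluate(self) -> Tuple[int, str]:
        """Return (exit_code, status_line) in the style of a Nagios plugin."""
        try:
            value = self.metric()
        except Exception as exc:  # a failing metric source maps to UNKNOWN
            return UNKNOWN, f"{self.name} UNKNOWN - {exc}"
        if value >= self.critical:
            code = CRITICAL
        elif value >= self.warning:
            code = WARNING
        else:
            code = OK
        return code, f"{self.name} {STATUS_NAMES[code]} - value={value}"


# Hypothetical example: alert when a job queue backs up.
queue_depth = Check("queue_depth", metric=lambda: 120, warning=100, critical=500)
code, line = queue_depth.evaluate()
print(line)
```

A wrapper script would print the status line and exit with `code`, which is all Nagios needs from a plugin. The point from the talk is that the developer who knows what a sane queue depth is gets to write the thresholds.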

The New New Software Development Game

Mary Poppendieck - talk

Mary is from the future. You should listen to what she says and buy her books and the books recommended here.

  • Software is eating the world.
  • Things to think about
    • lower friction
      • Friction is what makes war different in reality vs on paper.
      • Before containerization in shipping, high friction - low friction with containers - Read The Box
    • limit risk
  • Lower friction, limit risk. How?
    • Architecture - microservices
      • In the 90’s, centralize everything into few databases - monolithic architecture
      • Microservices, by contrast, decouples all the things and creates a federated architecture
        • Read Building Microservices
        • Small service - does one thing well. Independently deployable.
        • Small team - end to end responsibility. END TO END. On call, monitoring, QA, deploying, etc
        • Practices
          • No central databases
          • Extensive automation and monitoring
          • Double Mock Contract Testing
          • Smart versioning services
          • Canary releasing
        • Examples: Amazon, netflix, spotify, gilt
        • Risks..
          • Dependency hell.. how is this different than objects?
          • Is it right for your domain?
            • Yes, if you have high volume
          • Do you understand the domain?
            • Often start monolithic and move to microservices when it makes sense
            • Get boundaries right first, hard to refactor later
          • Can you maintain strict discipline?
            • Restrict interaction to hardened interfaces
            • Teams maintain situational awareness of their services, their consumers, and their providers
    • Architecture - containers
      • Pack dependent code into containers
      • Build once, run anywhere
      • Consistency
      • Isolation
      • Easy to use (esp docker)
      • Better server utilization
    • Architecture - testing
  • Dealing with monoliths
    • You don’t have to have microservices. i.e. Facebook
    • Antipatterns
      • “Smash!” - large infrequent releases - Guarantees failure
    • Best practice
      • Poke, test, fix.. small iterations over time
      • Continuous Delivery
        • Not new - 2010 idea. If you’re not doing this yet, you don’t care about stability, reliability, predictability. Least dangerous approach.
        • Must have test-driven development process
        • Tight collaboration between customer-facing and delivery people
        • Cross functional teams, including product, QA, and ops
        • Automated build, testing, DB migration, deployment
        • Incremental dev on mainline with continuous integration
        • Branches are not CD. Always deploy from trunk/master
        • Release is done by a switch. Deployment happens all the time.
        • Software is always production ready
        • Releases tied to business needs, not operational constraints
  • Organization
    • Dev and ops are different - Read Focus
      • Safety-focused goal people
        • prevent failure. i.e. doctors, nurses
        • Duty, obligation
        • Rewards - attention is for bad behavior
        • Nothing going wrong is ideal
        • Limit risk
      • Aspirational goal people
        • create gains
        • explore all the options
        • Rewarded for delivery
        • Lower friction
      • Both important! “Often in a marriage, you’ll have one of each”
    • One goal, shared responsibility
      • Who is responsible?
      • “We work together”
      • All of us are responsible.
    • Situational awareness - Read This is Lean
      • What makes a great (soccer) team - everyone on the field is aware of everything, all the time. The team with the most situational awareness will win.
      • Wayne Gretzky - skates to where the puck will be
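
The Continuous Delivery point that "release is done by a switch" is easier to see in code. This is my own minimal Python sketch, not something shown in the talk; the `FeatureSwitches` class and the `new_checkout` feature name are hypothetical, and a real system would keep the switch state in configuration or a database, not in memory.

```python
class FeatureSwitches:
    """Hypothetical in-memory switchboard separating deployment from release."""

    def __init__(self):
        self._released = set()

    def release(self, feature):
        self._released.add(feature)

    def rollback(self, feature):
        self._released.discard(feature)

    def is_released(self, feature):
        return feature in self._released


def checkout_page(switches):
    # Both code paths are already deployed; the switch decides which one users see.
    if switches.is_released("new_checkout"):
        return "new checkout flow"
    return "old checkout flow"


switches = FeatureSwitches()
print(checkout_page(switches))   # code is deployed, feature not yet released
switches.release("new_checkout")
print(checkout_page(switches))   # released by flipping the switch, no deploy
switches.rollback("new_checkout")
print(checkout_page(switches))   # rolling back is also just a switch
```

Deployment can then happen all the time from mainline, while the release itself becomes a business decision.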

Cheffing Etsy; Do too many cooks really spoil the soup?

Jon Cowie - talk

This was a very detailed and pragmatic talk about how Etsy uses Chef to deliver services.

Do too many cooks really spoil the soup?

  • What is chef?
    • Desired state config management
    • Thin server, thick client
    • Chef vocab
      • Node - a server being controlled by chef
      • Cookbook - desired state specifications, using recipes
      • Environment - a list of cookbook version constraints
      • Knife - cli for chef server
    • There is no magic pill.
    • You’re the expert, chef is just a tool, not prescriptive
  • Chef at Etsy
    • Chef server
    • ~2000 nodes
    • Almost all CentOS, but a couple Mac OS X
    • Everything from OS to “below code”
    • chef does not deploy code - deployinator
    • Single git repo
      • creates 2 sources of truth..
        • Humans talk to git, servers talk to chef server
      • 50 authors in last month
    • ~35 chef deploys per day
    • Many less-experienced users - trust but verify
  • Cookbook workflow
    • command line review tool - creates pull request, sends it to people
    • Push change to server, using internal tool called knife-spork
      • Helps multiple chefs avoid clashing, and gives visibility
    • Test change
      • Move node to unconstrained environment
      • knife node flip foo.etsy.com testing
      • Downsides..
        • no unit tests
        • holding cookbook in testing is blocking
        • testing env affects all cookbooks
      • Use chef-whitelist to solve pain points
  • Monitoring and debugging
    • knife-spork and CI job. Integrated with chat when changes are made (irc)
    • IRC handler - deals with exceptions, test failures
    • Lastrun data - shows other nodes with failures
    • Dashboard that shows when deploys happen, and overall chef status provides great visibility
  • How’s it all going?
    • Some pain points
      • Change clashes due to number of chef contributors
      • Confusion over state of changes
      • People forget things
      • Testing pains
    • 2016 million dollar workflow (improvement plans)
      • deployinator-based workflow
      • push queue
      • unit tests
      • “try” based testing - ci system lets you run jenkins tests before pushing
      • More like existing CD workflows
      • Basically, require fewer people to have to use chef
  • Read Customizing Chef (speaker’s book)

  • Lastly, a “rant” about online harassment

Lets Safely Dance

Andy Fleener - talk

Overview of safety concepts in large and complex systems.

Closing Keynote

Andrew Clay Shafer

Andrew Clay Shafer’s thoughts on devops.

I excluded a long introduction recounting history of devops

  • Obligatory Deming Quote - A bad system will beat a good person every time
  • Innovators, Imitators, Idiots. (Don’t be an idiot.)
  • inputs and outputs - Conway’s law and its impact on org structure
  • devops - optimizing performance and minimizing suffering.. globally.
  • Incentives for those wearing pagers vs those paid to ship new features.
    • If you wear a pager and are responsible when something breaks, you’ll probably prefer safety
  • The problem: local rationality (vs global)
    • The information we have changes what we see
    • Stimulus and response - the system has as big an impact as any individual
    • burnout is a feature of a system
      • people in a bad mood have better judgement and attention to detail
        • Perhaps depression isn’t a malfunction, but an adaptation (Scientific American)
        • depression is a feature
  • “I never wanted to be a programmer”
  • “Computers are pretty easy.. it just does what I say, that’s pretty awesome”
  • “I never wanted to be a sys admin”.. “I sure as hell never wanted to be a manager”
  • There’s a tightrope between Dunning-Kruger and imposter syndrome
  • Humans are hard-wired to prefer confidence to expertise (see - sales)
  • Everyone should read this book: Badass: Making Users Awesome - Kathy Sierra
  • Systems make people awesome. No-one will overcome an unhealthy system.
  • Build better systems, keep learning, keep helping each other
  • The punishment for not participating in politics is being ruled by inferiors

Ignite Talks

Ignite talks are 5 minutes each and go very fast. The following are my brief impressions.

Stop Blogging About Women in Tech

Jenna Pederson

  • 6 Actions you can take with regard to women in technology
    • Provide a community (for women)
      • i.e. geekettes
    • Provide a place to learn
      • i.e. GR8 Ladies
    • (Self) Promote
      • i.e. twitter
    • Actively Recruit
    • Mentor
    • Empower
      • i.e. HackTheGap

Effortless

Michael Lanyon - talk

  • Critical Mass experiment with web performance monitoring
  • webpagetest.org quantifies the end user experience
  • RUM - Real User Monitoring
  • Have team form a relationship with prod

Vulnerability

Larye Pohlman - talk

  • Empathy, vulnerability
  • Leadership is associated with vulnerability
  • Feel pain, show pain
  • Awkward moments shared with neighbors to make a point about sharing feelings with strangers

GameOps

Jason Clifford

DevOps and the Enterprise

Jason Walker

  • Target devops - Empathy Fairness and Contentment
  • Let time pass (after presenting new ideas)
  • Equal != Fair
  • Identify and remove complication

Sports Stats 10

Daniel Willis

Note that Daniel is 12 years old!

Putting the R in sports

  • Installing R
  • Type R to start
  • R variables
  • Example using R to calculate ERAs
  • Reading files
  • Explaining standard deviation is hard (!)
  • vectors, era plots
  • Moneyball and sabermetrics
  • Using Lahman database for historical baseball stats
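
The talk used R, and I did not capture the code. For reference, ERA is earned runs allowed per nine innings pitched, and the calculation looks roughly like this in Python. The pitcher data below is made up, and I am assuming innings are given as plain decimals, not the baseball "180.1" notation for thirds of an inning.

```python
from statistics import mean, stdev


def era(earned_runs, innings_pitched):
    """Earned run average: earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Made-up sample data: (pitcher, earned runs, innings pitched).
pitchers = [("A", 60, 180.0), ("B", 45, 150.0), ("C", 90, 200.0)]
eras = [era(er, ip) for _, er, ip in pitchers]

for (name, _, _), value in zip(pitchers, eras):
    print(f"{name}: {value:.2f}")

# Sample standard deviation: how spread out the ERAs are around their mean.
print(f"mean={mean(eras):.2f} stdev={stdev(eras):.2f}")
```

With real data, the rows would come from the pitching table of the Lahman database in place of the hand-typed list.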

You, Me & StatsD

Mark Morris

  • Who does ‘tail -f production.log’?
  • multipurpose tool to gather information from logs without using tail.
  • Simple example from a shell
  • Enter statsd. “logging for metrics”
  • github.com/etsy/statsd
  • Metrics in buckets
  • statsd examples
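
I did not copy the examples from the slides. As a sketch of what sending a metric involves, StatsD accepts plain text lines of the form `<bucket>:<value>|<type>` over UDP, with port 8125 as the default. The Python below builds and sends such lines by hand; the bucket names and the helper functions `format_metric` and `send_metric` are my own, and in practice you would more likely use an existing client library.

```python
import socket


def format_metric(bucket, value, metric_type, sample_rate=None):
    """Build one StatsD line: <bucket>:<value>|<type>[|@<sample_rate>]."""
    line = f"{bucket}:{value}|{metric_type}"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget over UDP; the sender does not wait for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical bucket names: c = counter, ms = timer, g = gauge.
    send_metric(format_metric("app.logins", 1, "c"))
    send_metric(format_metric("app.request_time", 320, "ms"))
    send_metric(format_metric("app.queue_depth", 42, "g"))
```

Because it is UDP, instrumenting the application adds very little overhead, and an unreachable metrics server should not take the application down with it.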

If you want to have an impact, Devops is not enough

Sara Cowles

  • Segway example - great tech but little impact
  • Work hard at work worth doing
  • option 1: just build it
    • bet the farm
  • option 2: build an MVP (minimum viable product)
    • take a gamble
  • option 3: test assumptions
  • Be wrong as fast as you can
  • build -> measure -> learn feedback loop
  • empathy - the closest thing to a silver bullet

ChatOps

Jason Hand

  • Email should die
    • 28% managing email
    • 20% looking for information
  • 20-25% increase in productivity by moving conversations to chat
  • i.e. trigger jenkins build from chat
  • i.e. incident management
  • benefits
    • learning
    • sharing
    • speed
    • security
    • brainstorming
    • fun
  • Private chats are an anti-pattern - use shared spaces
    • black box buckets nobody else benefits from
  • placing tools directly in the middle of conversation
  • Read ChatOps for Dummies (https://victorops.com/blog/chatops-for-dummies/)
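
To make "placing tools directly in the middle of conversation" a bit more concrete, here is my own minimal Python sketch of a chat bot that maps commands typed in a shared room to handler functions. Nothing here comes from the talk: the `ChatBot` class, the `!` prefix, and the `build` command are hypothetical, and the handler only returns a message where a real one would call the CI server.

```python
class ChatBot:
    """Hypothetical bot: maps '!command args' chat messages to handler functions."""

    def __init__(self, prefix="!"):
        self.prefix = prefix
        self.handlers = {}

    def command(self, name):
        """Decorator that registers a handler under a command name."""
        def register(func):
            self.handlers[name] = func
            return func
        return register

    def handle(self, user, message):
        """Return the reply to post in the shared room, or None to ignore the message."""
        if not message.startswith(self.prefix):
            return None
        name, _, args = message[len(self.prefix):].partition(" ")
        handler = self.handlers.get(name)
        if handler is None:
            return f"Unknown command: {name}"
        return handler(user, args.strip())


bot = ChatBot()


@bot.command("build")
def build(user, job):
    # Placeholder: a real handler would trigger the job on the CI server here.
    return f"{user} triggered build for {job}"


print(bot.handle("alice", "!build web"))
```

Since the command and the reply both land in the shared room, everyone sees who ran what and when, which is where the learning and sharing benefits come from.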

Devops in the Machine

Matt Stratton

Pete Chesbot jokes

Open Spaces

For reference, here’s the list of open spaces.

  • Wednesday
    • Session 1
      • Saltstack Best Practices
      • Empathy/Cybernetics
      • Config and app dep management
      • GameOps
      • Working with product, UX, Marketing, other non-ops non-dev folks
      • CM on Windows
      • Introverts
      • Leadership in Tech
    • Session 2
      • DevOps in Dev Environments
      • DevOps at Tiny Company
      • devops for nonprofits, orgs and low budget side projects
      • Training/Getting Buy In Socializing Holistic Thinking
      • DevOps Career Development
      • CI/CD Pipeline Tooling
      • Building self-service IAAS
    • Session 3
      • DevOps Crystal Ball
      • Conference Speaking Efforts @ your company
      • Where is SW/Test Departments in DevOps
      • More contributions back to Open Source
      • Security, DevTools & Monitoring
      • Useful bots in chatrooms
      • Blameless Post-mortems
      • Remote Teamwork
  • Thursday
    • Session 1
      • Cookie Ops
      • app “herding” organize and manage state
      • arrested devops podcast
      • monitoring/incident management
      • education + teaching comp sci
      • scaling elk
      • chatops
    • Session 2
      • How to make devops haters devops supporters
      • Empathy: tactics, challenges, stories
      • do your own devopsdays
      • code sync for puppet
      • empowering product and design leaders
      • docker orchestration
      • cross the finish line: devops marathon
      • werewolf
    • Session 3
      • sales and marketing’s place in devops
      • kanban vs devops
      • public conference post-mortem
      • cloud foundry
      • being blind / advocate for under represented groups
      • config driven monitoring/testing infrastructure
      • devsecops: doing, dreaming or what’s security doing here
      • mutable vs immutable infra