A Nod to Nothing: Code Freeze 2020

Sorry for the loose notes rather than a proper writeup, but I wanted to get things collated so I can work with them. Good conference. I ate at Al's for breakfast and Hong Kong Noodles for lunch. And I realized I can take the green line over to the U of MN for lunch anytime I'm downtown at work, which hadn't occurred to me these last six months. If I had my bicycle, it's even a short ride that way across the campus bridge. I need to get my urban on. Not as much practical knowledge at this one (for me) as at some of the past events, but that means I can focus on the few things I think have practical value rather than being all over the place.

Observability and the Glorious Future - Charity Majors (Honeycomb.io)

O'Reilly Database Reliability Engineering (November 2017: http://shop.oreilly.com/product/0636920039761.do)
How often do you deploy? How long does it take, how often do you fail, how long to recover - the basics.
Hires for communication skills (the initial tech interview is to get them talking at the in-person). "Empowered to do their jobs".
"How do I know if it breaks?" - all changes, all features
"Serverless was a harbinger. Deployless is coming."
Developers (senior+) should amplify the hidden costs.
Team happiness = customer happiness (Steve says this too)

Observability in Big Analytics - Bonnie Holub, Teradata

Focus on the destination.
VALUE
Use what's out there (KDNuggets.com, Gartner)
Predicts 2020: Analytics and BI Strategy - Gartner: https://www.gartner.com/en/documents/3978987/predicts-2020-analytics-and-business-intelligence-strate (behind a paywall. I've asked about a Gartner sub at VP before. Can potentially get one via a third party, but I don't know if there's bias/editing.)
Ditto: https://www.gartner.com/en/documents/3891788/15-insights-for-managing-data-science-teams
Also likes: https://www.mckinsey.com/business-functions/organization/our-insights/unlocking-success-in-digital-transformations
Talked about Lighthouse Projects - this is what Healthy Habits Recommender was/is.
There aren't just 5 elements of data science; there is also:
Discover >> Access >> Prepare (wrangle, about where 85% of companies are and spend their time) >> Create models >> Socialize results (dictionary as well) >> Deploy models >> Monitor models >> Analytics workflow
(Look for her slides too - there is one for the last point that was particularly good)

50 Years of Observability - Mary Poppendieck

What is the equivalent of metal fatigue in software? Operator fatigue. >> e.g. what Steve pushes: that a focus on PIs is important.
Talked planes, bridges, Three Mile Island.
She likes the Control series by Brian out on YouTube... they're deep: https://www.youtube.com/channel/UCq0imsn84ShAe9PBOFnoIrg
Observable - all critical states known from system outputs
Observable is at war with complexity.
Controllable - actuator/sensor can get back to a set state in a set time.
If it's not observable, can it be totally controlled? (no)
Fault Tolerance: replication and isolation.
Responsibility (and understanding the big picture) leads to desire for observability (and isolation/duplication). >> PLEX team at VP is a form of big picture.

What's Happening in Your Production Data and ML Systems - Don Sawyer, PhData

Most practical of the lectures.
Focus on decoupled systems: Data warehouse, ML Models.
Talked Provenance as both origin and change over time.
Timestamp everything in UTC (use Google Time API as an example to convert it during compute).
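For my own reference, a minimal sketch of the UTC rule in Python: store the stamp in UTC, convert only when a computation needs local time. The function names and the fixed -6 hour offset are mine, for illustration only. This does not call any Google API; in a real pipeline you would pass a DST-aware zone (for example a `zoneinfo.ZoneInfo`) instead of a fixed offset.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def stamp_utc() -> str:
    """Return the current time as an ISO 8601 string in UTC (what gets stored)."""
    return datetime.now(timezone.utc).isoformat()


def to_local(utc_stamp: str, zone: tzinfo) -> datetime:
    """Convert a stored stamp to the given zone at compute time."""
    stored = datetime.fromisoformat(utc_stamp)
    if stored.tzinfo is None:
        # A naive stamp is ambiguous, so refuse it rather than guess.
        raise ValueError("stored timestamp has no UTC offset")
    return stored.astimezone(zone)


if __name__ == "__main__":
    # Hypothetical fixed offset standing in for a real time-zone lookup.
    central_standard = timezone(timedelta(hours=-6), "CST")
    stamp = stamp_utc()
    print(stamp, "->", to_local(stamp, central_standard).isoformat())
```

The stored value never changes; only the view of it does, which keeps audit trails comparable across systems.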
Focus on: audit trails, data quality, repeatability, added info (pipeline).
Metadata payload. PROCESS: id/version, start/end, transformations, inputs, configurations. DATA VERSIONS: traces of issues, data change history, defect data. LINEAGE: sources, frequency of read.
Last point was a little messy (from me) but you want to trace right down to the node data touched in transit so you can hydrate anything from the last known good state.
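To make the payload less messy for myself, here is a rough sketch of how the three groups could be modeled with Python dataclasses. Every class and field name here is my own guess at a reasonable shape, not something shown in the talk.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class ProcessInfo:
    """PROCESS: what ran, when, and how it was configured."""
    process_id: str
    version: str
    started_at: str  # UTC, ISO 8601
    ended_at: str    # UTC, ISO 8601
    transformations: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    configuration: Dict[str, Any] = field(default_factory=dict)


@dataclass
class DataVersion:
    """DATA VERSIONS: change history and known defects."""
    version: str
    change_history: List[str] = field(default_factory=list)
    known_defects: List[str] = field(default_factory=list)


@dataclass
class Lineage:
    """LINEAGE: where the data came from and which nodes it touched."""
    sources: List[str] = field(default_factory=list)
    nodes_touched: List[str] = field(default_factory=list)


@dataclass
class Provenance:
    process: ProcessInfo
    data_version: DataVersion
    lineage: Lineage

    def to_payload(self) -> Dict[str, Any]:
        """Flatten to plain dicts and lists so it can travel with a record."""
        return asdict(self)
```

Keeping `nodes_touched` in the lineage is what would let you replay from the last known good state, at the cost of a bigger payload.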
NOT ALL DATA RECORDS require granular provenance. Can be expensive (so much data). Use a flexible or generic schema. Don't use S3 (slow). Storage considerations.
Storage: 1.) attach info to the record (can get big, note that Avro and Parquet are meant to do this), 2.) send a separate event message - separate provenance API, 3.) only track some. Note that for API approaches you may end up going down a rabbit hole of tracking the tracking API.
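The three storage options can be sketched side by side. This is a toy under my own assumptions: records are plain dicts, the `sink` list stands in for whatever queue or provenance API would receive events, and `is_critical` is a hypothetical rule for deciding which records are worth tracking.

```python
import json
import uuid
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def attach_inline(record: Record, provenance: Record) -> Record:
    """Option 1: the record carries its own provenance (simple, but it grows)."""
    return {**record, "_provenance": provenance}


def emit_separately(record: Record, provenance: Record, sink: List[str]) -> Record:
    """Option 2: the record keeps only an id; provenance goes out as its own event."""
    provenance_id = str(uuid.uuid4())
    sink.append(json.dumps({**provenance, "provenance_id": provenance_id}))
    return {**record, "_provenance_id": provenance_id}


def track_some(
    record: Record,
    provenance: Record,
    sink: List[str],
    is_critical: Callable[[Record], bool],
) -> Record:
    """Option 3: only records that pass the rule get provenance at all."""
    if is_critical(record):
        return emit_separately(record, provenance, sink)
    return dict(record)


if __name__ == "__main__":
    events: List[str] = []
    prov = {"process_id": "dedupe", "version": "1.2.0"}
    print(attach_inline({"member": 1}, prov))
    print(track_some({"member": 2, "pii": True}, prov, events, lambda r: r.get("pii", False)))
    print(events)
```

Option 2 keeps records small but adds a second system to keep healthy, which is the rabbit hole mentioned above.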
Alternatives: Amundsen (Lyft), Marquez (WeWork), Databook (Uber), DataHub (LinkedIn)
Look at Apache NiFi (there's a Pluralsight class)

Evolving Chaos Engineering - Casey Rosenthal, Verica

Ships, shoes, fruit (apricots), helium mining. He's a very funny guy.
LOOK FOR A VIDEO to watch with the team: https://www.youtube.com/watch?v=JfT9UxcEcOE
Principlesofchaos.org
Reversibility: blue/green, feature flags, CI/CD, agile to waterfall.
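Of those, feature flags are the easiest to show in a few lines. A minimal sketch, assuming an in-memory store; `FlagStore`, `greeting`, and the flag name are made up for illustration, and a real setup would read flags from config or a flag service.

```python
from typing import Dict


class FlagStore:
    """In-memory feature flags (hypothetical stand-in for a real flag service)."""

    def __init__(self) -> None:
        self._flags: Dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_on(self, name: str) -> bool:
        # Unknown flags default to off, so the old path is the safe default.
        return self._flags.get(name, False)


def greeting(flags: FlagStore) -> str:
    """Both code paths ship together; the flag picks one at run time."""
    if flags.is_on("new_greeting"):
        return "new path"
    return "old path"


if __name__ == "__main__":
    flags = FlagStore()
    flags.set("new_greeting", True)
    print(greeting(flags))
    flags.set("new_greeting", False)  # reverse the change without a deploy
    print(greeting(flags))
```

The point is that turning the flag off undoes the change without a new deploy, which is what makes it reversible.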
Moved responsibility away from the people who do the work (hierarchy)
Myths:
1. remove the people causing the accidents.
2. document best practices and use runbooks. (most interesting problems are unique)
3. defend against prior root causes, aka defense in depth. Root cause analysis: "at best, you are wasting your time." Was our sponsor audience issue an example? The answer was in part to restrict audience size. But the dig highlighted that the system no longer supports system-wide features after growth, the high processing cost of the feature, the inability to test with all users, etc.
4. enforce procedures
5. avoid risk
6. simplify
7. add redundancy
Do NOT eliminate complexity. Navigate it. CI, CD, CV - continuous verification (here's a link to a CV article: https://thenewstack.io/continuous-verification-the-missing-link-to-fully-automate-your-pipeline/). That's New Relic for us.
Has two books: Chaos Engineering and Learning Chaos Engineering. First book comes out June 2020.