awesome-data-leadership

A curated list of awesome and useful posts, videos, and articles on leading a data team. This includes leadership at the middle-management, Director/VP, or C-suite level, for organizations both big and small. A few relevant engineering management articles are sprinkled in.

Please contribute by opening PRs! ⚡️

Topics

Hiring

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Eli Goldberg | Hire better data scientists: A field guide for hiring managers new to data science. Part 1. | Creating better job descriptions brings in better talent. When hiring, highlight the "why you", describe opportunities instead of responsibilities, describe the key actions and background experience needed rather than technologies, and proofread! | 2020 |
| Eli Goldberg | Hire better data scientists: A field guide for hiring managers new to data science Part 2. | Create a clear interviewing process. Make time for hiring and use your shift in priorities to your advantage, don't "wing it", write your process down and engineer it to be data driven, and modify the process, not your adherence to it. | 2020 |
| Gergely Orosz | Hiring (and Retaining) a Diverse Engineering Team | Stories from six engineering leaders who succeeded in building and growing diverse teams. | 2021 |
| Reddit | “Are we being too harsh on junior candidates?” | Reddit thread discussing expectations of junior ML job candidates. | 2022 |
| Hacker News | “When did 7 interviews become normal” | An “Ask HN” forum question on the topic of over-interviewing. | 2022 |
| Farhan Thawar | VP of Engineering hiring cheatsheet | A guide for assessing a candidate for an engineering or data leadership role: provides good and bad responses to questions. | 2022 |
| Freaking Rectangle Blog | How to Freaking Find Great Developers By Having Them Read Code | When hiring for data engineering, analytics, data science, or ML engineering roles, it is better to have candidates read code instead of writing it (it can be neutral, interview-only code). | 2022 |
| Emily Thompson | Hiring Data Scientists With Intention | Gives guidance on writing a focused job description, being strategic in sourcing, and designing a structured interview process so that you can be consistent in evaluating candidates. | 2022 |
| Nate Rosidi | 15 Python Coding Interview Questions You Must Know For Data Science | Provides 15 example questions for testing basic Python data manipulation skills in interviews. | 2022 |
| Jike Chong, Ben Lorica, Yue Cathy Chang | Top Places to Work for Data Scientists: We identify U.S. organizations that will help you develop your career in data science | Looks at factors that make a data science org attractive to an IC, which also gives hiring managers some insight into how talent thinks. | 2022 |
| Randy Au | Let's talk a bit about giving interviews | Gives thoughts on planning and carrying out a technical data science interview. | 2022 |

Culture

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Emily Thompson | Growing Data Teams from Reactive to Influential | Reactive data teams lead to low impact and attrition, so instead acknowledge if your team is reactive, assess reactivity quantitatively, focus on near-term wins for cultural change, and build longer-term foundational work into the team’s capacity. | 2022 |
| Prukalpa Sankar | It’s Time for the Modern Data Culture Stack | We need a modern data culture stack: best practices, values, and cultural rituals that will help data people come together and collaborate effectively. | 2021 |
| Kuba Niechcial | How to set goals for engineers? | Provides examples of good goals for engineers and things to keep in mind (e.g. KPIs should not be personal goals). | 2021 |
| Jacob Kaplan-Moss | “Exit Interviews Are a Trap” | Rethinking the exit interview: there is very little upside (it is unlikely things will change) and potentially significant downside (bad blood, retracted references, malicious actions by the employer, etc.). | 2022 |
| Christoph Neijenhuis | How to stop shrinkage in engineering teams | The journey to stopping shrinkage in engineering teams is long and rarely straightforward, but there are practical things leaders can do to take control of the chaos, from taking steps to get out of survival mode and tackling problems around culture to involving teams in the development of a solid technical strategy. | 2022 |
| Caitlin Moorman | Proficiency v. Creativity | It is critical to find a balance between open-endedness/opportunities for creativity and standardized rigor when leading a data function. | 2020 |
| Shimin Zhang | Why a Meeting Costs More than a MacBook Pro – the Business Case for Fewer Developers in Meetings | Describes the opportunity cost of having all developers or data engineers attend meetings and describes ways to recoup it. | 2022 |
| David Waller | 10 Steps to Creating a Data-Driven Culture | Details some steps for working towards a data-driven culture, from taking care in choosing metrics to quantifying uncertainty. | 2020 |

Impact

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| McKinsey | Ten red flags signaling your analytics program will fail. | A list of red flags ranging from "the executive team doesn't have a clear vision for its analytics program" to "nobody knows the quantitative impact that analytics is providing". | 2018 |
| Erik Bernhardsson | Building a data team at a mid-stage startup: a short story | A story about a fictional company that became more data-driven and how it was done. | 2021 |
| Abinaya Sundarraj | Data Management: How to Stay on Top of Your Customer’s Mind? | Describes the virtues and challenges of achieving a customer-centric data perspective in a business. | 2022 |
| Mikkel Dengsøe | How to measure data quality: Practical guidelines for how to measure quality, engagement and productivity in a data team | Provides some thoughts on how to evaluate your data team and suggests three categories of metrics: quality, productivity, and engagement. | 2022 |
| Sarah Krasnik | Choosing a Data Catalog | Although not strictly about management, this tackles documentation, dictionaries, knowledge repos and the like, which are critically important for a data org. | 2022 |
| Chad Sanderson | The Existential Threat of Data Quality: and Why the Modern Data Stack Can't Solve It | Despite the rapidly evolving and growing data stack, poor data quality remains an enormous problem; the article breaks it down into "downstream" and "upstream" categories. | 2022 |

Strategy

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Prukalpa Sankar | Data Advantage Matrix: A New Way to Think About Data Strategy | Break down your data advantage into four categories (operational, strategic, product, and business opportunity) and then assess what stage each of these is at (e.g. basic, intermediate, advanced). | 2021 |
| Ilan Man | Creating a Data Road Map | Provides suggestions for what factors to consider when thinking about a data roadmap or data strategy (e.g. identifying the audience, setting up the scaffolding). | 2019 |

Project Management

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Erik Bernhardsson | “Why software projects take longer than you think: a statistical model” | Adding up time estimates for many subtasks isn't advised; instead, figure out which tasks have the highest uncertainty, since those tasks are going to dominate the time to completion. | 2019 |
| Erik Bernhardsson | “σ-driven project management: when is the optimal time to give up?” | Describes an abstract measure “alpha” that captures the risk of a project and, based on that risk, a statistical model that shows when one ought to give up on a project. | 2022 |
| Michael Kaminsky | Agile Analytics, Part 1: The Good Stuff | When it comes to data science and analytics, these aspects of the Scrum workflow work well: acceptance criteria, pointing, two-week chunks (sprints), and explicit prioritization. | 2018 |
| Michael Kaminsky | Agile Analytics, Part 2: The Bad Stuff | Some aspects of agile don't work so well with data teams; these include "the fortuitous finding", exploratory data analysis needs, product ownership / story-writing, and business-as-usual support. | 2018 |
| Michael Kaminsky | Agile Analytics, Part 3: The Adjustments | Suggests adjustments for agile to work well on a data team: time-bound spikes for research, built-in slack time for exploration, acceptance criteria that include “write the next story”, and peer review instead of sprint review. | 2018 |

Code Review

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Gunnar Morling | The Code Review Pyramid | There should be a hierarchy of effort in reviewing code, where more effort is spent on core concepts, how performant the code is, and documentation, with less effort on test quality (though of course tests are important) and syntax. | 2022 |
| Tim Hopper | Code Review Guidelines for Data Science Teams | In the context of a data team, describes what a code review should achieve, gives pointers for carrying out pull requests, and links to additional reading. | 2020 |

Organization Structure and Job Titles

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Rob Dearborn | Organizing and scaling an effective data team | General guidelines on what a properly structured data team should look like, with descriptions ranging from a 1-person data team to a 32+ person team. | 2022 |
| Brittany Bennett | Building Powerful Data Teams: On Investing in Junior Talent | Provides suggestions on developing junior talent: blocking off time for personal development, celebrating this blocked-off time, hiring tutors, and more. | 2021 |
| Eric Colson | "Beware the data science pin factory: The power of the full-stack data science generalist and the perils of division of labor through function" | Beware specialization in data science, as it has costs: the goal of data science is not to execute but to learn and develop profound new business capabilities. | 2019 |
| Chuong Do | "What is the most effective way to structure a data science team?" | Covers how data scientist roles should be defined (analysis vs. building), where data scientists should report (centralized vs. decentralized), where the data science function should live (engineering org vs. product org vs. independent consultancy), and what an organization should do to set up data science for success. | 2017 |
| Mikkel Dengsøe | "Data team structure: embedded or centralised?" | There are three common models of how data teams are structured, each with its own drawbacks and advantages: centralized, embedded, and hybrid. | 2022 |
| Randy Bean | Chief Data Officers Struggle To Make A Business Impact | There is widespread disparity of opinion on what defines a successful Chief Data Officer, so it makes sense that only CDOs are poised for success according to a recent Gartner report. | 2019 |
| Matthew Mayo | Data Scientist, Data Engineer & Other Data Careers, Explained | Explanations of various titles such as Data Architect, Data Engineer, Analyst, ML Engineer, and Data Scientist. | 2022 |
| Gergely Orosz | What Silicon Valley "Gets" about Software Engineers that Traditional Companies Do Not | Silicon Valley companies treat engineers as smart, autonomous adults, because that is who they hire and who can do the work they need done, while traditional companies tend to keep developers in pure execution roles. | 2021 |
| Rifat Majumder | The Data Product Manager | Describes the emerging role of "Data Product Manager" and the benefits it provides an org: better business impact, a deep understanding of customer problems, and more clarity on priorities. | 2021 |
| Benn Stancil | The technical pay gap: The culture we build is the culture we buy | Describes the current state of confusion around data titles (using the "analytics engineer" as an example), and describes how the tech industry overvalues technical skills at times. | 2022 |
| Ben Darfler | Engineering Levels at Honeycomb: Avoiding the Scope Trap | Describes a nice framework for thinking about job levels, based on scope and level of project complexity. | 2022 |

ML and AI Within an Organization

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Monica Rogati | The AI Hierarchy of Needs | Before you can fully get value out of ML/AI in an organization, it is critical to have foundational data needs met (i.e. good data collection processes, checks, and analytics). | 2017 |
| Mario Perrakis | “The “0 / 1 / Done” Strategy for Data Science” | A description of what a DS org should aspire to: 0-day handovers facilitated by great documentation and code, 1-day prototypes enabled by good tooling and good knowledge, and a clear definition of “done”. | 2022 |
| Thomas Redman | “Your Data Initiatives Can’t Just Be for Data Scientists” | Describes the role and importance of non-data experts in DS projects as collaborators, customers, and creators of the data. | 2022 |
| Natassha Selvaraj | “Why Are So Many Data Scientists Quitting Their Jobs?” | Two primary factors drive a number of new data scientists out of the profession: a mismatch between employer and employee expectations around data science work and the general difficulty of getting ML to add clear business value. | 2022 |
| Pete Warden | How Should you Protect your Machine Learning Models and IP? | Some thoughts on the importance of protecting IP in an ML org. | 2022 |
| Jeff Saltz | Managing Machine Learning Projects | Touches on difficulties of managing ML projects and how the management process differs from standard software development. | 2021 |
| Alfred Spector, Peter Norvig, Chris Wiggins, and Jeannette M. Wing | Data Science in Context: Foundations, Challenges, Opportunities | A pre-release of a book that gives a thorough accounting of the history of Data Science, a high-level understanding of its applications, and the ethical and social concerns associated with it. | 2022 |

BI and Analytics Within an Organization

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| Lenny Rachitsky | Choosing Your North Star Metric | Proposes metrics based on your type of business, recommends having a single north star metric, and advises against using revenue as your metric. | 2021 |
| Ron Berman | “The Value of Descriptive Analytics: Evidence from Online Retailers” | The authors estimate that adopting descriptive analytics is associated with a 4%–10% increase in average weekly revenues among online retailers. | 2020 |
| Roger M. Stein | "Why Managing Data Scientists Is Different" | Two challenges in managing data scientists: (1) managing a data research effort tends to be a dynamic and self-correcting process in which it is difficult to plan either a project’s timing or final outcomes, and (2) analytics is highly sensitive to time, cost, and quality tradeoffs. | 2015 |
| Eric Colson | "The Sobering Truth about the Impact of your Business Ideas" | The vast majority of business ideas fail to generate a positive impact, and this underscores the value of measuring impact, collecting data, and testing. | 2021 |
| Joe McFarren | 5 Tips for Managing a Successful Analytics Project | In the context of analytics consulting, it is important to clearly establish project scope, be in constant communication, determine a line of escalation, monitor work with tracking apps, and track finances. | 2022 |
| Erik Balodis | A Framework for Embedding Decision Intelligence into your Organization | Provides a high-level overview of how to infuse decision intelligence into an organization, along with some additional reading sources. | 2022 |
| Nelson Auner | Building an Analytics Stack in 2020 | Gives an overview of the modern analytics stack via three buckets: a data-moving tool (ETL), a data warehouse to store the data, and a BI layer to analyze the data. | 2020 |
| Mode | The Data Team’s Guide for Marketing Metrics | Good overview of the landscape of metrics used in data marketing work (as well as information on the technical side of it). | 2022 |

Management Skills

| Author | Title | One-sentence summary | Year |
| --- | --- | --- | --- |
| David Loftesness | The Engineer to Manager Transition, by Former Twitter Director of Engineering | Talks about an engineering management "event loop", where you touch base on people, projects, process, and self on a daily, weekly, and monthly basis. | 2015 |
| The Institute of Leadership & Management | "Spotlight on Leadership Styles" | Describes a set of leadership/management styles including pace-setting, democratic, laissez-faire, and more. | 2018 |
| Andy Johns | How to know when to stop: A guide to avoiding burnout and establishing balance in your life—by guest author Andy Johns | A framework for thinking through burnout, including: 1) define your personal range of tolerance, 2) pick your career progression, 3) pick your life progression. | 2022 |
| Alan Johnson | 11 Principles of Engineering Management | A brief, digestible list of management principles for new engineering managers. | 2022 |
| GitLab | Preventing burnout: A manager's toolkit | Provides 12 strategies managers can use to support their team and prevent burnout. | 2022 |
| Tanya Reilly | Being glue | Describes the importance of "glue work" (e.g. noticing when other people in the team are blocked and helping them out, reviewing design documents and noticing what's inconsistent, onboarding new people and making them productive faster, or improving processes to make customers happy). | 2019 |