awesome-data-leadership
A curated list of awesome and useful posts, videos, and articles on leading a data team. This includes leadership at the middle-management, Director/VP, or C-suite level, for organizations both big and small. A few relevant engineering management articles are sprinkled in.
Please contribute by opening PRs! ⚡️
Topics
- Hiring
- Culture
- Impact
- Strategy
- Project Management
- Code Review
- Organization Structure and Job Titles
- ML and AI Within an Organization
- BI and Analytics Within an Organization
- Management Skills
Hiring
Author | Title | One-sentence summary | Year |
---|---|---|---|
Eli Goldberg | Hire better data scientists: A field guide for hiring managers new to data science. Part 1. Creating better job descriptions brings in better talent. | When hiring, highlight the "why you", describe opportunities instead of responsibilities, describe the key actions and background experience needed rather than technologies, and proofread! | 2020 |
Eli Goldberg | Hire better data scientists: A field guide for hiring managers new to data science Part 2. Create a clear interviewing process. | Make time for hiring and use your shift in priorities to your advantage, don't "wing it", write your process down and engineer it to be data driven, and modify the process not your adherence to it. | 2020 |
Gergely Orosz | Hiring (and Retaining) a Diverse Engineering Team | Stories from six engineering leaders who succeeded in building and growing diverse teams | 2021 |
Reddit | “Are we being too harsh on junior candidates?” | Reddit thread discussing expectations of junior ML job candidates | 2022 |
Hacker News | “When did 7 interviews become normal” | An “Ask HN” forum question around the topic of over-interviewing | 2022 |
Farhan Thawar | VP of Engineering hiring cheatsheet | A guide for assessing a candidate for an engineering or data leadership role: provides good and bad responses to questions. | 2022 |
Freaking Rectangle Blog | How to Freaking Find Great Developers By Having Them Read Code | When hiring for data engineering, analytics, data science, or ML engineering roles, it is better to have candidates read code instead of writing it (it can be neutral, interview-only code). | 2022 |
Emily Thompson | Hiring Data Scientists With Intention | Gives guidance on: writing a focused job description, being strategic in sourcing, and designing a structured interview process so that you can be consistent in evaluating candidates. | 2022 |
Nate Rosidi | 15 Python Coding Interview Questions You Must Know For Data Science | Provides 15 example questions for testing basic Python data manipulation skills in interviews. | 2022 |
Jike Chong, Ben Lorica, Yue Cathy Chang | Top Places to Work for Data Scientists: We identify U.S. organizations that will help you develop your career in data science | Looks at factors that make a data science org attractive to an IC, which also gives hiring managers some insight into what talent is looking for. | 2022 |
Randy Au | Let's talk a bit about giving interviews | Gives thoughts on planning and carrying out a technical data science interview. | 2022 |
Culture
Author | Title | One-sentence summary | Year |
---|---|---|---|
Emily Thompson | Growing Data Teams from Reactive to Influential | Reactive data teams lead to low impact and attrition, so instead acknowledge if your team is reactive, assess reactivity quantitatively, focus on near-term wins for cultural change, and build longer-term foundational work into the team’s capacity | 2022 |
Prukalpa Sankar | It’s Time for the Modern Data Culture Stack | We need a modern data culture stack: best practices, values, and cultural rituals that will help data people come together and collaborate effectively. | 2021 |
Kuba Niechcial | How to set goals for engineers? | Provides some examples of good engineer personnel goals and things to keep in mind (e.g. KPIs should not be personal goals). | 2021 |
Jacob Kaplan-Moss | “Exit Interviews Are a Trap” | Rethinking the exit interview: there is very little upside (it is unlikely things will change) and potentially significant downside (bad blood, retracted references, malicious actions by the employer, etc.). | 2022 |
Christoph Neijenhuis | How to stop shrinkage in engineering teams | The journey to stopping shrinkage in engineering teams is long and rarely straightforward, but there are practical things leaders can do to take control of the chaos, from taking steps to get out of survival mode and tackling problems around culture to involving teams in the development of a solid technical strategy. | 2022 |
Caitlin Moorman | Proficiency v. Creativity | It is critical to find a balance between open-endedness/opportunities for creativity and standardized rigor when leading a data function. | 2020 |
Shimin Zhang | Why a Meeting Costs More than a MacBook Pro – the Business Case for Fewer Developers in Meetings | Describes the opportunity cost of having all developers or data engineers attending meetings and describes ways to recoup this. | 2022 |
David Waller | 10 Steps to Creating a Data-Driven Culture | Details some steps for working towards a data-driven culture, from taking care in choosing metrics to quantifying uncertainty. | 2020 |
Impact
Author | Title | One-sentence summary | Year |
---|---|---|---|
McKinsey | Ten red flags signaling your analytics program will fail. | A list of red flags ranging from "the executive team doesn't have a clear vision for its analytics program" to "nobody knows the quantitative impact that analytics is providing". | 2018 |
Erik Bernhardsson | Building a data team at a mid-stage startup: a short story | A story about a fictional company that became more data-driven and how it was done. | 2021 |
Abinaya Sundarraj | Data Management: How to Stay on Top of Your Customer’s Mind? | Describes the virtues and challenges around achieving a customer-centric, data perspective in a business. | 2022 |
Mikkel Dengsøe | How to measure data quality: Practical guidelines for how to measure quality, engagement and productivity in a data team | Provides some thoughts around how to evaluate your data team and suggests three categories of metrics: quality, productivity, and engagement. | 2022 |
Sarah Krasnik | Choosing a Data Catalog | Although not technically on management, this tackles the critical topic of documentation, dictionaries, knowledge repos and such, which are critically important for a data org. | 2022 |
Chad Sanderson | The Existential Threat of Data Quality: and Why the Modern Data Stack Can't Solve It | Despite the rapidly-evolving/growing data stack, poor data quality remains an enormous problem; the article breaks it down into "downstream" and "upstream" categories. | 2022 |
Strategy
Author | Title | One-sentence summary | Year |
---|---|---|---|
Prukalpa Sankar | Data Advantage Matrix: A New Way to Think About Data Strategy | Break down your data advantage into four categories (e.g. operational, strategic, product, and business opportunity) and then assess what stage each of these is at (e.g. basic, intermediate, advanced) | 2021 |
Ilan Man | Creating a Data Road Map | Provides suggestions for what factors to consider when thinking about a data roadmap or data strategy (e.g. identifying the audience, setting up the scaffolding, etc.). | 2019 |
Project Management
Author | Title | One-sentence summary | Year |
---|---|---|---|
Erik Bernhardsson | “Why software projects take longer than you think: a statistical model” | Adding up time estimates for many subtasks isn't advised; instead, figure out which tasks have the highest uncertainty, since those tasks are going to dominate the time to completion. | 2019 |
Erik Bernhardsson | “σ-driven project management: when is the optimal time to give up?” | The post describes an abstract measure “alpha” that captures the risk of a project and, based on that risk, a statistical model that shows when one ought to give up on a project. | 2022 |
Michael Kaminsky | Agile Analytics, Part 1: The Good Stuff | When it comes to data science and analytics, these aspects of the scrum workflow work well: acceptance criteria, pointing, two-week chunks (sprints), and explicit prioritization. | 2018 |
Michael Kaminsky | Agile Analytics, Part 2: The Bad Stuff | Some aspects of agile don't work so well with data teams; these include "the fortuitous finding", exploratory data analysis needs, product ownership / story-writing, and business-as-usual support. | 2018 |
Michael Kaminsky | Agile Analytics, Part 3: The Adjustments | Adjustments are suggested for agile to work well on a data team: time-bound spikes for research, build in slack time for exploration, acceptance criteria includes “write the next story”, peer-review instead of sprint-review. | 2018 |
Code Review
Author | Title | One-sentence summary | Year |
---|---|---|---|
Gunnar Morling | The Code Review Pyramid | There should be a hierarchy of effort in reviewing code, where more effort is spent on core concepts, how performant code is, and documentation, with less effort on test quality (though of course tests are important) and syntax. | 2022 |
Tim Hopper | Code Review Guidelines for Data Science Teams | In the context of a data team, describes what a code review should achieve, gives bullet points for carrying out pull requests, and links to additional reading. | 2020 |
Organization Structure and Job Titles
Author | Title | One-sentence summary | Year |
---|---|---|---|
Rob Dearborn | Organizing and scaling an effective data team | General guidelines on what a properly structured data team should look like, with descriptions ranging from a 1-person data team to a 32+ person team. | 2022 |
Brittany Bennett | Building Powerful Data Teams: On Investing in Junior Talent | Provides suggestions on developing junior talent: blocking off time for personal development, celebrating this blocked-off time, hiring tutors, and more. | 2021 |
Eric Colson | "Beware the data science pin factory: The power of the full-stack data science generalist and the perils of division of labor through function" | Beware specialization in data science, as it has costs: the goal of data science is not to execute, but rather to learn and develop profound new business capabilities. | 2019 |
Chuong Do | "What is the most effective way to structure a data science team?" | Covers how data scientist roles should be defined (analysis vs building), where data scientists should report (centralized vs decentralized), where the data science function should live (engineering org vs product org vs independent consultancy), and what an organization should do to set up data science for success. | 2017 |
Mikkel Dengsøe | "Data team structure: embedded or centralised?" | There are three common models of how data teams are structured, each with their drawbacks and advantages: centralized, embedded, and hybrid. | 2022 |
Randy Bean | Chief Data Officers Struggle To Make A Business Impact | There is widespread disparity of opinion on what defines a successful Chief Data Officer, so it makes sense that only CDOs are poised for success according to a recent Gartner report. | 2019 |
Matthew Mayo | Data Scientist, Data Engineer & Other Data Careers, Explained | Explanations of various titles such as Data Architect, Data Engineer, Analyst, ML Engineer, and Data Scientist | 2022 |
Gergely Orosz | What Silicon Valley "Gets" about Software Engineers that Traditional Companies Do Not | Silicon Valley companies treat engineers as smart, autonomous adults, because those are the people they hire to do the work they need done, while traditional companies tend to keep developers in pure execution roles. | 2021 |
Rifat Majumder | The Data Product Manager | Describes the emerging role of "Data Product Manager" and the benefits it provides an org: better business impact, a deep understanding of customer problems, and more clarity on priorities. | 2021 |
Benn Stancil | The technical pay gap: The culture we build is the culture we buy | Describes the current state of confusion around data titles (using the "analytics engineer" as an example), and describes how the tech industry overvalues technical skills at times. | 2022 |
Ben Darfler | Engineering Levels at Honeycomb: Avoiding the Scope Trap | Describes a nice framework for thinking about job levels, based on scope and level of project complexity. | 2022 |
ML and AI Within an Organization
Author | Title | One-sentence summary | Year |
---|---|---|---|
Monica Rogati | The AI Hierarchy of Needs | Before you can fully get value out of ML/AI in an organization, it is critical to have foundational data needs met (i.e. good data collection processes, checks, and analytics). | 2017 |
Mario Perrakis | “The “0 / 1 / Done” Strategy for Data Science” | A description for what a DS org should aspire to: 0-day handovers facilitated by great documentation and code, 1-day prototypes enabled by good tooling and good knowledge, and a clear definition of “done”. | 2022 |
Thomas Redman | “Your Data Initiatives Can’t Just Be for Data Scientists” | Describes the role and importance of non-data experts in DS projects: as collaborators, as customers, and as creators of the data. | 2022 |
Natassha Selvaraj | “Why Are So Many Data Scientists Quitting Their Jobs?” | Two primary factors drive a number of new data scientists out of the profession: a mismatch between employer and employee expectations around data science work, and the general difficulty of getting ML to add clear business value. | 2022 |
Pete Warden | How Should you Protect your Machine Learning Models and IP? | Some thoughts on the importance of protecting IP in an ML org. | 2022 |
Jeff Saltz | Managing Machine Learning Projects | Touches on difficulties of managing ML projects and how the management process differs from standard software development. | 2021 |
Alfred Spector, Peter Norvig, Chris Wiggins, and Jeannette M. Wing | Data Science in Context: Foundations, Challenges, Opportunities | A pre-release of a book that gives a thorough accounting of the history of Data Science, a high-level understanding of its applications, and the ethical and social concerns associated with it. | 2022 |
BI and Analytics Within an Organization
Author | Title | One-sentence summary | Year |
---|---|---|---|
Lenny Rachitsky | Choosing Your North Star Metric | Proposes metrics based on your type of business, recommends having a single north star metric, and advises against using revenue as that metric. | 2021 |
Ron Berman | “The Value of Descriptive Analytics: Evidence from Online Retailers” | The authors estimate that adopting descriptive analytics is associated with an increase of 4%–10% in average weekly revenues among online retailers. | 2020 |
Roger M. Stein | "Why Managing Data Scientists Is Different" | Two challenges in managing data scientists: (1) managing a data research effort tends to be a dynamic and self-correcting process in which it is difficult to plan either a project’s timing or final outcomes, and (2) analytics is highly sensitive to time, cost, and quality tradeoffs. | 2015 |
Eric Colson | "The Sobering Truth about the Impact of your Business Ideas" | The vast majority of business ideas fail to generate a positive impact, and this underscores the value of measuring impact, collecting data, and testing. | 2021 |
Joe McFarren | 5 Tips for Managing a Successful Analytics Project | In the context of analytics consulting it is important to: clearly establish project scope, be in constant communication, determine a line of escalation, monitor work with tracking apps, and track finances. | 2022 |
Erik Balodis | A Framework for Embedding Decision Intelligence into your Organization | Provides a high-level overview of how to infuse decision-intelligence into an organization, along with some additional reading sources. | 2022 |
Nelson Auner | Building an Analytics Stack in 2020 | Gives an overview of the modern analytics stack via three buckets: a data-moving tool (ETL), a data warehouse to store the data, and a BI layer to analyze the data. | 2020 |
Mode | The Data Team’s Guide for Marketing Metrics | Good overview of the landscape of metrics used in data marketing work (as well as information on the technical side of it). | 2022 |
Management Skills
Author | Title | One-sentence summary | Year |
---|---|---|---|
David Loftesness | The Engineer to Manager Transition, by Former Twitter Director of Engineering | Talks about an engineering management "event loop", where you touch base on people, projects, process, and self on a daily, weekly, and monthly basis. | 2015 |
The Institute of Leadership & Management | "Spotlight on Leadership Styles" | Describes a set of leadership/management styles including pace-setting, democratic, laissez-faire, and more. | 2018 |
Andy Johns | How to know when to stop: A guide to avoiding burnout and establishing balance in your life—by guest author Andy Johns | A framework for thinking through burnout, including: 1) Define your personal range of tolerance, 2) Pick your career progression, 3) Pick your life progression. | 2022 |
Alan Johnson | 11 Principles of Engineering Management | A brief, digestible list of management principles for new engineering managers. | 2022 |
GitLab | Preventing burnout: A manager's toolkit | Provides 12 strategies managers can utilize to support their team and prevent burnout | 2022 |
Tanya Reilly | Being glue | Describes the importance of "glue work" (e.g. noticing when other people in the team are blocked and helping them out, reviewing design documents and noticing what's inconsistent, onboarding the new people and making them productive faster, or improving processes to make customers happy). | 2019 |