microsoft/ApplicationInsights-dotnet

Consolidating Application Insights Repos

TimothyMothra opened this issue · 22 comments

Currently

We are shipping 17 packages across 4 repos:

Challenges

  • This creates additional work to synchronize changes across each repo.
    Typically work begins in this (Base) repo and requires a signed build to test those changes in another repo.
    This slows down the dev cycle.
  • We take this cost again whenever we do a release. Releases must occur in stages (Base > Web > Logging > Core) and involves a lot of waiting for automation to complete.
    A full release typically costs us a week. In the last 2 years, we've never released one repo without also releasing all others.
  • This creates confusion for customers. Most of our customers use "application insights" and don't know which repo to open issues.
  • This creates confusion for contributors. It takes some effort to determine the correct repo to contribute for a specific feature.

Proposal

Over the next month, we will merge all 4 repos into this one.

AspNetCore did this exercise last year and CoreClr is doing this now.

Timeframe

I'll be starting this work in October and I expect this can be completed within a month, finishing in early to mid November.

Next Steps

  • All PRs will be tagged with this issue for tracking.
  • I'll start opening prs to reorganize files before the migration to help prevent conflicts.
  • I'll migrate one repo at a time and confirm that builds are successful to prevent blocking any contributor. We want to preserve the entire commit history.
  • After the migration, I'll lock the source repo.
  • I expect there will several follow up PRs as I attempt to consolidate code.
  • I don't know the correct mechanism to migrate issues but I will make every attempt to do so.

Detailed Plan

In Scope

  • 4 Repos (Base, Web, AspNetCore, Logging)
  • 4 PR Build definitions
  • 4 Signing Definitions
  • 4 Release Pipelines

End Goal

  • Move all code to Base SDK repo, and preserve commit history
  • Shared build definitions for PRs, Signing, and Releases

Plan

Prep work

  1. DONE Reorganize individual repos. This is to ensure that there are no conflicting directories or file names and to guarantee no conflicts when migrating source. This will require some minor changes to existing build definitions.

Migrate source

  1. DONE Migrate a repo as-is to Base. Don't merge dependencies right away. The goal is to migrate the source as quickly as possible so devs can continue working.
  2. DONE Clone existing build definitions and point to new directories in Base SDK. The goal is to get builds working again as quickly as possible so devs can continue working. These can be validated against a feature branch before merging.
  3. DONE Once all builds are working, need to lock the original repo to prevent new PRs. Issues can be migrated anytime after this.
  4. DONE Safe to start the next repo.

Merge source

  1. All projects should be merged into a single directory. This can be done as follow up PRs.
    UPDATE This is blocked because each project has unique .props files that need to be consolidated.
  2. The newer single directory source will require consolidated build definitions. This can be validated against a feature branch before merging.
    UPDATE Final todo: The two linux builds need to be merged.

Merge dependencies

  1. Right now each repo has unique project dependencies (*.props, *.targets). These need to be consolidated to ensure that future code maintenance can be kept reasonable. Once the newer build infra is in place, this will be simple and low risk.
    UPDATE This is in progress, but proving to be very time consuming. Each project has unique and conflicting dependencies that need to be resolved.

Todo

  • DONE CONTRIBUTING guide

Stretch Goals

  • DONE https://aka.ms/deprecateIconUrl
  • Fix code coverage tool
  • DONE http://aka.ms/fxcopanalyzers
  • fix Perf/XDT.Tests project
  • DONE Update MSTest in all projects
  • Remove duplicate ApplicationInsightsTypes project (one per reop)
  • DONE Can we replace AddXmlLanguage?
  • DONE Make build results Public
  • Convert build definitions to YML

Risks

Potential Impact due to moving source

Mitigation: Only one repo will be migrated at a time and to completion to reduce the scope of disruptions.

Potential Impact to Releases

Migrating source could impact our ability to publish a new release.
Mitigation: Existing build definitions will not be changed, in the event that we need to rapidly rollback and release immediately.

Potential Impact to Devs

While source is being migrated, I'll need devs to pause contributions.
Mitigation: To keep this impact as low as possible, I plan to migrate source as-is without any reorganization. Keep using existing build infrastructure, just from a single repo.

Existing PRs may not get merged

Mitigation: I'm going to work with the team to get as many existing PRs merged as possible. Unfortunately, any inactive PRs may not get merged before the migration.

Can we have a detailed plan for:

  • how we run tests? today it takes ~15 minutes to run full CI for base SDK and ~1h to run tests for Web. Will it take 1.5+ hours to run all tests after making a small isolated change in ASP.NET Core SDK

  • test stability: tests are flaky. Getting all tests to pass everywhere for everything seems to have much lower probability

  • how we develop: when I introduce new feature in base - I'll need to update all SDKs and tests immediately. How can I split work between SDKs to avoid introducing HUGE PRs

Also, can we list other options to simplify the release process?

  • central repo and submodules
  • better automation scripts that work across repos
  1. We desperately need to improve our test stability. Dmitry and I have a task assigned to work on that this semester.

  2. The unit tests in Web are actually very fast. only a few minutes.
    It's the E2E tests that are slow because they need to provision new docker instances.
    I would propose that we split the E2E tests into a separate build definition so we can run them independently.

    2.a Currently, we have to open/close a PR to rerun all builds. VSTS was working on a feature to run individual builds separately. This was working in production earlier this year, but they switched it off.

  3. Maybe we disagree on this, but I feel like having extra tests to update is a good thing (code coverage). This isn't different from how we work today, except that it's split across several PRs so when reviewing a PR in Web we lose the context of what feature change necessitated this test updates. I've also seen cases where test fails in Web require re-work in Base, again taxing our overall dev cycle.

Regarding alternatives

  • I've been talking about this for over a year and I haven't heard any suggestions that would resolve all the issues we're currently experiencing. Cijo talked about his custom *.sln that he uses, but what he didn't mention is that he has to hack the *.csproj files every time he changes branches in ANY repo. I think this is an unreasonable ask.

we split the E2E tests into a separate build definition so we can run them independently.

when will we run them? Before release? Then release becomes even harder

This isn't different from how we work today, except that it's split across several PRs so when reviewing a PR in Web we lose the context of what feature change necessitated this test updates.

This is not about tests, this is about feature work. E.g. W3C was feature in 3 repos with a lot of changes in each.
After base was changed, tests in web/aspnetcore were doomed to fail.
I.e. how could I split feature work between PRs? Disable tests?

Cijo talked about his custom *.sln that he uses, but what he didn't mention is that he has to hack the *.csproj files every time he changes branches in ANY repo. I think this is an unreasonable ask.

Can we do something to prevent need of hack? Did we explore this option enough? I was proposing this for a while:

  1. central repo with submodules (they are bad and ugly so as monorepo). Central sln, build and release scripts: why hack csproj?

OR

  1. similar: central repo with build/release tools. Tools check out each repo separately, run tests, update versions, build sign nugets

So why we are thinking that monorepo is the only way to go?

when will we run them? Before release? Then release becomes even harder

I meant that a PR can run multiple builds separately. We already do this for windows vs linux tests, so it would be adding one more for E2E. This gives us quick unit test results while waiting for the longer E2E to complete. Being able to run more tests in parallel would reduce the overall time to complete a PR.

central repo

This doesn't sound simpler. I'm reading about submodules and it sounds like it makes PRs more complicated not less.
Central Repo with submodules doesn't solve our release pains because we're still waiting for the automation of one repo to complete in entirety before we can start the next.
This also doesn't solve the confusion I've seen customers experience when creating issues, contributing features, or looking for docs that are split across our several repos.


I want to hear other suggestions but I'm not convinced that other solutions have as big of an impact in improving our processes.

  • how we run tests? today it takes ~15 minutes to run full CI for base SDK and ~1h to run tests for Web. Will it take 1.5+ hours to run all tests after making a small isolated change in ASP.NET Core SDK

CI should run in parallel versus sequentially.

  • test stability: tests are flaky. Getting all tests to pass everywhere for everything seems to have much lower probability

This should be fixed.

  • how we develop: when I introduce new feature in base - I'll need to update all SDKs and tests immediately. How can I split work between SDKs to avoid introducing HUGE PRs

I can show you how to split PRs, we can take one concrete example.

automation of one repo to complete in entirety before we can start the next.

Why? We do the automation and we can do it as we want

This also doesn't solve the confusion I've seen customers experience when creating issues

I never heard that it is a big issue - we can easily transfer issues between repos and fix the confusion. What I heard that we want to improve release process.

I can show you how to split PRs, we can take one concrete example.

@reyang I know pretty well how to split PRs in general case, but for cross- SDK feature work, it might be extremely hard

I can show you how to split PRs, we can take one concrete example.

@reyang I know pretty well how to split PRs in general case, but for cross- SDK feature work, it might be extremely hard

Pick one example and we can work on it.

@lmolkova This is what I will do, it is a very common practice:

  1. implement the new path in this repo, have a flag to control it, write test cases with flag on.
  2. implement the new path in dependencies (aspnetcore and dotnet-server), guarded by the same flag, write test cases with flag on.
  3. a single change to flip the flag and update the documents
  4. remove legacy stuff in smaller PRs.
  5. remove the flag.

@reyang thanks. this makes a lot of sense, it may also complicate development, esp when each PR is 1+ hours test run.

That's why I want to see we weighted pros and cons of different approaches and thought about consequences before we are picking one solution.

Can we talk to someone in ASP.NET Core and learn how they've done it and issues they've faced?
Can we try to do some analysis before we jump into merge?

@reyang thanks. this makes a lot of sense, it may also complicate development, esp when each PR is 1+ hours test run.

1+ hours run doesn't make sense, and we should fix it.

That's why I want to see we weighted pros and cons of different approaches and thought about consequences before we are picking one solution.

Agree.

Can we talk to someone in ASP.NET Core and learn how they've done it and issues they've faced?

Not sure.

Can we try to do some analysis before we jump into merge?

Agree.

Can we talk to someone in ASP.NET Core and learn how they've done it and issues they've faced?

Not sure.

Let me see if I can help with that.

Being a welcoming to the community contribution for me outweighs the complexity of build systems. I had numerous conversations with people who fixed bugs in base SDK or asp.net core and I'm sure having a single solution for everything with cross dependencies as well as possibility to break unrelated codepath or being affected by random test failure in a differnt project would be a significant road block for them.

I have some doubts of efficiency from the technical perspective as well if we want to discuss those.

Potential (actual) Risk: all the issues opened on the old repos are now archived, cannot be modified, and have not been migrated. Here is an example: https://github.com/Microsoft/ApplicationInsights-dotnet-server/issues/176

@mxa0079, We are aware of this issue and are not ignoring it.
Github does not provide any mechanism to bulk transfer issues from one repo to another.
My team and I have manually transferred what we thought were the top active issues.
We're still investigating how to bulk transfer the remaining issues.

Thank you for opening a new issue and referencing the former.

@TimothyMothra are you going to go through all the packages on nuget and update their source links? for example: https://www.nuget.org/packages/Microsoft.ApplicationInsights.WorkerService still points to the "old" repos?

@gardnerjr I made this change already in the nuspec. That link you provided, the previous stable (2.8.2) was released from the old repo. The latest beta (2.12.0-beta4) is correctly linking to this repo. :)

I don't think I can retroactively update the links for older packages because that comes from the previously deployed nuspec, but i believe all the new releases should have the correct links.
Please let me know if you find something unexpected.
Thanks John!

Would this be the right place to raise any upgrade issues with the dotnet Application Insights assemblies? I have a project originally written against version 2.3.0, which I'm now trying to update to build against version 2.12.0, and whilst it builds successfully, at runtime, it encounters a MissingMethodException:Method not found: 'Void Microsoft.ApplicationInsights.Extensibility.PerfCounterCollector.QuickPulse.QuickPulseTelemetryProcessor..ctor(Microsoft.ApplicationInsights.Extensibility.ITelemetryProcessor)'. I've tried incrementally working through the different versions available, and can work through 2.4, 2.5, 2.6.1 & 2.6.4. Any versions above that appear to require an update of System.Diagnostics.DiagnosticSource, and start throwing the MissingMethodException at runtime

@leeds55 This is the correct repo, but please open a new issue. This issue is specifically to coordinate the status of the our repo consolidation.

This issue is stale because it has been open 300 days with no activity. Remove stale label or comment or this will be closed in 7 days.