github/gh-gei

`reclaim-mannequin --csv` exits after `Target must be a member of the XXX organization` error, rather than continuing to next row

timrogers opened this issue · 9 comments

Description

When running reclaim-mannequin --csv, the intent is that one reclaim failing doesn't cause the command to exit.

Instead, a warning should be logged, and the command should continue with reclaiming other mannequins in the CSV.

This isn't currently working as expected, and at least the Target must be a member of the XXX organization error causing the entire command to exist.

Looking at the code, this seems to be because ReclaimMannequinSkipInvitation throws an exception rather than returning a ReattributeMannequinToUserResult, and our error handling code which keeps the reclaims going seems to assume that there'll be a ReattributeMannequinToUserResult with Errors attached.

Reproduction Steps

  1. Run a migration, leaving unclaimed more than one unclaimed mannequin
  2. Use generate-mannequin-csv to generate a mannequin CSV
  3. Fill out the first mannequin with a user who is not a member of the target organization
  4. Run reclaim-mannequin --csv
  5. The command errors after processing the first row - very slowly because of lots of unnecessary retries... - and the second row is not considered.
[2023-08-21 15:48:16] [ERROR] OctoshiftCLI.OctoshiftCliException: Target must be a member of the XXX organization
   at OctoshiftCLI.Services.GithubClient.EnsureSuccessGraphQLResponse(JObject response)
   at OctoshiftCLI.Services.GithubClient.PostGraphQLAsync(String url, Object body, Dictionary`2 customHeaders)
   at OctoshiftCLI.Services.GithubApi.<>c__DisplayClass43_0.<<ReclaimMannequinSkipInvitation>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Polly.Retry.AsyncRetryEngine.ImplementationAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Func`5 onRetryAsync, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider, Boolean continueOnCapturedContext)
   at Polly.AsyncPolicy.ExecuteAsync[TResult](Func`3 action, Context context, CancellationToken cancellationToken, Boolean continueOnCapturedContext)
   at OctoshiftCLI.RetryPolicy.Retry[T](Func`1 func)
   at OctoshiftCLI.Services.GithubApi.ReclaimMannequinSkipInvitation(String orgId, String mannequinId, String targetUserId)
   at OctoshiftCLI.Services.ReclaimService.ReclaimMannequins(String[] lines, String githubTargetOrg, Boolean force, Boolean skipInvitation)
   at OctoshiftCLI.Commands.ReclaimMannequin.ReclaimMannequinCommandHandler.Handle(ReclaimMannequinCommandArgs args)
   at OctoshiftCLI.Extensions.CommandExtensions.RunHandler[TArgs,THandler](TArgs args, ServiceProvider sp, CommandBase`2 command)
   at OctoshiftCLI.Extensions.CommandExtensions.<>c__DisplayClass1_0`3.<<ConfigureCommand>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Invocation.AnonymousCommandHandler.InvokeAsync(InvocationContext )
   at System.CommandLine.Invocation.InvocationPipeline.<>c__DisplayClass4_0.<<BuildInvocationChain>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass17_0.<<UseParseErrorReporting>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass12_0.<<UseHelp>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass22_0.<<UseVersionOption>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass19_0.<<UseTypoCorrections>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<UseSuggestDirective>b__18_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass16_0.<<UseParseDirective>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c.<<RegisterWithDotnetSuggest>b__5_0>d.MoveNext()
--- End of stack trace from previous location ---
   at System.CommandLine.Builder.CommandLineBuilderExtensions.<>c__DisplayClass8_0.<<UseExceptionHandler>b__0>d.MoveNext()

@timrogers Looking at another ocurrence of this and sometimes the reclaim skips the line (correctly) when it doesn't find the user

image

Sometimes it tries to reclaim it and then the server fails the call with the error

image

It almost as if the GetUser is either returning a user (unlikely) or the EnsureSuccessGraphQLResponse is failing.

But unfortunately it only logs the request and not the response.

It would be nice if this call logged the response request id

We should always have the request ID in the verbose log for every request. Is it not there in this case?

I didn't access to the full lock, but what I have been shown it isn't there.

Maybe I was misinformed

I can confirm the request id is there. My bad

Hi, are there any updates on this case? We'd love to use --skip-invitation in our deployment but this one is a blocker.

Hi, are there any updates on this case? We'd love to use --skip-invitation in our deployment but this one is a blocker.

This isn't something that we have picked up yet, but it's helpful to have your feedback so we can prioritise.

Which error are you hitting that is causing the process to stop? I am wondering if it is just Target must be a member of the XXX organization or other errors as well.

Which error are you hitting that is causing the process to stop? I am wondering if it is just Target must be a member of the XXX organization or other errors as well.

What we get is:

[2023-11-29 09:36:49] [INFO] You are running an up-to-date version of the gei CLI [v1.5.0]
[2023-11-29 09:36:49] [INFO] GITHUB ORG: our-org
[2023-11-29 09:36:49] [INFO] CSV: mannequins_filled.csv
[2023-11-29 09:36:49] [INFO] NO PROMPT: true
[2023-11-29 09:36:49] [INFO] SKIP INVITATION: true
[2023-11-29 09:36:49] [INFO] Reclaiming Mannequins with CSV...
Error: 1-29 09:37:51] [ERROR] Target must be a member of the our-org organization

We are generating CSV mechanically using a mapping from GHES to GitHub.com users, but currently we have to workaround this issue manually (by querying the list of members in the org and filtering CSV lines) before running reclaim-mannequin. ("blocker" was a too strong wording, sorry!)

Thanks! That's helpful. It goes without saying that I do think we should fix this. I'll have a look at how we can prioritise it.

Ran into this recently too. Similar to @kmaehashi, we are using scripting to fill in the target-user column from the output of the gh gei generate-mannequin-csv command.

Ended up using something like this to:

  1. use API to look up users in org
  2. filter the CSV to only keep users that are in org in the csv
JavaScript sample
const { Octokit } = require("@octokit/rest");
const csv = require("csv-parser");
const createCsvWriter = require("csv-writer").createObjectCsvWriter;
const fs = require("fs");
const yargs = require("yargs/yargs");
const { hideBin } = require("yargs/helpers");

async function getOrganizationUsers(org, octokit) {
  const users = await octokit.paginate(octokit.orgs.listMembers, {
    org,
    per_page: 100,
  });
  return users.map((user) => user.login);
}

function filterCsvFile(filename, orgUsers) {
  return new Promise((resolve, reject) => {
    const rows = [];
    fs.createReadStream(filename)
      .pipe(csv())
      .on('data', (row) => {
        if (orgUsers.includes(row['target-user'])) {
          console.log(`Found use in org, keeping user: ${row['target-user']}`)
          rows.push(row);
        }
      })
      .on('end', () => {
        if (rows[0]) {
          const csvWriter = createCsvWriter({
            path: filename,
            header: Object.keys(rows[0]).map(id => ({id, title: id})),
          });
          csvWriter.writeRecords(rows)
            .then(() => resolve())
            .catch(err => reject(err));
        } else {
          resolve();
        }
      })
      .on('error', (err) => {
        reject(err);
      });
  });
}

async function main() {
  const argv = yargs(hideBin(process.argv))
    .option("file", {
      alias: "f",
      type: "string",
      description: "CSV file to read from",
      demandOption: true,
    })
    .option("org", {
      alias: "o",
      type: "string",
      description: "Organization to filter users from",
      demandOption: true,
    })
    .option("token", {
      alias: "t",
      type: "string",
      description: "GitHub token",
      demandOption: true,
    }).argv;

  const orgName = argv.org;
  const csvFilePath = argv.file;
  const token = argv.token;

  const octokit = new Octokit({ 
    auth: token,
    baseUrl: `https://api.github.com`
  });

  const orgUsers = await getOrganizationUsers(orgName, octokit);
  console.log(`Started filtering ${csvFilePath} for users in ${orgName}`);
  await filterCsvFile(csvFilePath, orgUsers);
  console.log(`Finished filtering ${csvFilePath} for users in ${orgName}`);
}

main();
Bash sample
#!/bin/bash

# Check if the correct number of arguments are provided
if [ "$#" -ne 2 ]; then
    echo "Usage: ./remove-mannequins.sh <org_name> <csv_file_path>"
    exit 1
fi

# Get the organization name and CSV file path from the command line arguments
ORG_NAME=$1
CSV_FILE_PATH=$2

# Get the organization members
orgMembers=$(gh api --paginate /orgs/$ORG_NAME/members | jq -r '.[].login')

# Extract the headers
head -n 1 $CSV_FILE_PATH > filtered.csv

echo "Started filtering the CSV file..."

# Filter the CSV file
for member in $orgMembers
do
  awk -F, -v member="$member" 'NR>1 && $3 == member' $CSV_FILE_PATH >> filtered.csv
done

mv filtered.csv $CSV_FILE_PATH

echo "Finished filtering the CSV file."