Flank/flank

Flakes attribute never set correctly in FullJunitReport.xml

AaronMT opened this issue · 14 comments

<testsuite name="Pixel2.arm-30-en_US-portrait" tests="462" failures="1" flakes="0" errors="0" skipped="0" time="10277.484" timestamp="2024-06-23T06:02:53" hostname="localhost">
    <testcase name="verifyButtonTest" classname="org.app.verifyButtonTestTest" time="0.222" flaky="true">
     ...
</testsuite>

Shouldn't the flakes attribute be 1 in this case? This is with v23.10.1.

Our Flank config specifies num-flaky-test-attempts: 1 with full-junit-result: true under the Flank section.

Hi @AaronMT, thanks for the report. I'm able to reproduce this issue. I'm looking into it.

Hi @AaronMT, I was able to reproduce this once, but for some reason I can't reproduce it anymore. Are you still experiencing this issue?

Yes, I still see this issue on 23.10.1. Here's an example report from our CI from yesterday.

<testcase name="verifyReaderModeControlsTest" classname="org.mozilla.fenix.ui.ReaderViewTest" time="0.275" flaky="true">
<failure>
java.security.ProviderException: Keystore operation failed at ...
</failure>
<webLink>
...
</webLink>
</testcase>

Here flaky is set to true, but the top-level testsuite has the flakes attribute set to 0:

<testsuite name="Pixel2.arm-30-en_US-portrait" tests="486" failures="0" flakes="0" errors="0" skipped="0" time="10298.759" timestamp="2024-10-14T21:30:46" hostname="localhost">
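
One quick way to confirm the mismatch in a report is to compare each testsuite's flakes attribute with the number of testcase elements marked flaky="true". The following is a minimal sketch and not part of Flank: the function name find_flake_mismatches and the sample report are made up for illustration, and it assumes the report layout shown in this thread.

```python
import xml.etree.ElementTree as ET


def find_flake_mismatches(xml_text):
    """Return (suite name, declared flakes, counted flaky testcases) per mismatching suite."""
    root = ET.fromstring(xml_text)
    mismatches = []
    # iter() also yields the root itself, so a bare <testsuite> root works too.
    for suite in root.iter("testsuite"):
        declared = int(suite.get("flakes", "0"))
        counted = sum(
            1 for case in suite.iter("testcase") if case.get("flaky") == "true"
        )
        if declared != counted:
            mismatches.append((suite.get("name"), declared, counted))
    return mismatches


# Hypothetical, trimmed-down report mirroring the shape of the excerpts above.
SAMPLE = """<testsuites>
  <testsuite name="Pixel2.arm-30-en_US-portrait" tests="2" failures="0" flakes="0">
    <testcase name="a" classname="org.example.ATest" time="0.1" flaky="true"/>
    <testcase name="b" classname="org.example.BTest" time="0.2"/>
  </testsuite>
</testsuites>"""

if __name__ == "__main__":
    for name, declared, counted in find_flake_mismatches(SAMPLE):
        print(f'{name}: flakes={declared}, but {counted} testcase(s) have flaky="true"')
```

An empty result means every suite's flakes value agrees with its flaky testcases.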

Are you able to reproduce this locally?

Can you provide a sample app and test apk?

As Flank (via Test Lab) generates these artifacts, I'm not sure what you mean by reproducing locally.

Here's a recent arm64-v8a debug build from our CI and a no-arch test APK:

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/M4IjtgpLTWuZaUg43Q0ByA/artifacts/public/build/target.arm64-v8a.apk

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/TvdzLJF3Tiij13O8gqe9dA/artifacts/public/build/target.noarch.apk

I apologise for the confusion. By reproducing locally, I meant executing Flank on your own machine to initiate Test Lab runs, rather than in CI.

Kindly provide the device specifications you have in your flank.yml.

Seems to work fine locally:

<?xml version='1.0' encoding='UTF-8' ?>
<testsuites>
  <testsuite name="Pixel2.arm-30-en_US-portrait" tests="45" failures="11" flakes="1" errors="0" skipped="0" time="575.163" timestamp="2024-10-17T20:08:37" hostname="localhost">
    <testcase name="testExperimentUnenrolledViaSecretMenu" classname="org.mozilla.fenix.experimentintegration.GenericExperimentIntegrationTest" time="50.552">
      <failure>androidx.test.espresso.NoActivityResumedException: No activities in stage RESUMED. Did you forget to launch the activity. (test.getActivity() or similar)?
	at dalvik.system.VMStack.getThreadStackTrace(Native Method)
	at java.lang.Thread.getStackTrace(Thread.java:1736)
      ......
    </testcase>
    <testcase name="settingsTest" classname="org.mozilla.fenix.screenshots.ComposeMenuScreenShotTest" time="10.278" flaky="true">
      <failure>FAILED
    </failure>
......
    </testcase>
  </testsuite>
</testsuites>

gcloud:
  results-bucket: ...
  record-video: true
  timeout: 15m
  async: false
  num-flaky-test-attempts: 1

  app: /app/path
  test: /test/path

  auto-google-login: false
  use-orchestrator: true
  environment-variables:
    clearPackageData: true
  performance-metrics: true

  test-targets:
    - notPackage org.mozilla.fenix.screenshots
    - notPackage org.mozilla.fenix.syncintegration
    - notPackage org.mozilla.fenix.experimentintegration

  device:
    - model: Pixel2.arm
      version: 30
      locale: en_US

flank:
  project: ...
  max-test-shards: 100
  num-test-runs: 1
  output-style: compact
  full-junit-result: true

Does this have anything to do with full-junit-result?

Do you experience this issue when you run Flank in your local environment with the above configuration?

Hello again. Yes, I was able to reproduce this with a local Flank run using the same configuration as above.

My run had a flaky test, and the report contained the following top-level test suite:

<testsuite name="MediumPhone.arm-34-en_US-portrait" tests="498" failures="0" flakes="0" errors="0" skipped="0" time="11272.089" timestamp="2024-10-23T09:46:12" hostname="localhost">
<testcase name="verifyCFRAfterBlockingTheCookieBanner" classname="org.mozilla.fenix.ui.CookieBannerBlockerTest" time="38.066" flaky="true">
<failure>
java.lang.AssertionError: UiSelector[CONTAINS_TEXT=Less distractions, less cookies tracking you on this site.] does not exist
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.mozilla.fenix.helpers.MatcherHelper.assertUIObjectExists(MatcherHelper.kt:100)
	at org.mozilla.fenix.ui.robots.BrowserRobot.verifyCookieBannerBlockerCFRExists(BrowserRobot.kt:833)
	at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1$4.invoke(CookieBannerBlockerTest.kt:60)
	at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1$4.invoke(CookieBannerBlockerTest.kt:58)
	at org.mozilla.fenix.ui.robots.NavigationToolbarRobot$Transition.enterURLAndEnterToBrowser(NavigationToolbarRobot.kt:226)
	at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1.invoke(CookieBannerBlockerTest.kt:58)
	at org.mozilla.fenix.ui.CookieBannerBlockerTest$verifyCFRAfterBlockingTheCookieBanner$1.invoke(CookieBannerBlockerTest.kt:45)
	at org.mozilla.fenix.helpers.AppAndSystemHelper.runWithCondition(AppAndSystemHelper.kt:686)
	at org.mozilla.fenix.ui.CookieBannerBlockerTest.verifyCFRAfterBlockingTheCookieBanner(CookieBannerBlockerTest.kt:45)
</failure>
<webLink>
https://console.firebase.google.com/project/moz-fenix/testlab/histories/bh.66b7091e15d53d45/matrices/7524624672444898348/executions/bs.a2c09c06d295e45c/testcases/2
</webLink>
</testcase>

Hi @AaronMT, after some investigation, I've found that the issue is related to the max-test-shards property. If you omit this property so that it falls back to its default value, the behavior is as expected, and flakes reflects the correct value.
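
If sharding is still needed and max-test-shards can't be dropped, one possible stopgap is to recompute flakes after the run. This is only a sketch of a post-processing step, not something Flank provides: fix_flakes is a hypothetical helper, and it assumes every flaky test is marked flaky="true" on its testcase, as in the reports above.

```python
import xml.etree.ElementTree as ET


def fix_flakes(xml_text):
    """Return the report with each testsuite's flakes attribute recomputed."""
    root = ET.fromstring(xml_text)
    # iter() includes the root, so both <testsuites> and <testsuite> roots work.
    for suite in root.iter("testsuite"):
        flaky = sum(
            1 for case in suite.iter("testcase") if case.get("flaky") == "true"
        )
        # Overwrite whatever value the report declared with the counted one.
        suite.set("flakes", str(flaky))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical minimal report with one flaky testcase and flakes="0".
    report = (
        '<testsuites><testsuite name="s" tests="1" flakes="0">'
        '<testcase name="a" flaky="true"/>'
        "</testsuite></testsuites>"
    )
    print(fix_flakes(report))
```

Serialising with ElementTree this way omits the XML declaration, so the output is best treated as a derived copy rather than a replacement for the original report.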

Thanks for investigating. Out of curiosity, are you planning a new release?