Testing Android and iOS apps on OSS CI using Nova reusable mobile workflow

Huy Do edited this page Apr 19, 2024 · 14 revisions

With the advent of new tools like ExecuTorch, it's now possible to run LLM inference locally on mobile devices using different models such as llama2. While it isn't hard to experiment with this new capability, test it out on your own devices, and see some results, it takes more effort to automate this process and make it part of the CI on various PyTorch-family repositories. To solve this challenge, the PyTorch Dev Infra team is launching a new Nova reusable mobile workflow to do the heavy lifting for you when it comes to testing your mobile apps.

With this new reusable workflow, devs can now:

  1. Utilize our mobile infrastructure built on top of AWS Device Farm. It offers a wide variety of popular Android and iOS devices from phones to tablets.
  2. Write and run tests remotely on those devices the same way you run them locally on your connected phones.
  3. Go beyond the emulator to stress test and benchmark your local LLM inference solutions on actual devices. This helps accurately answer questions such as how many tokens the solution can process per second and how much memory and power it needs.
  4. Debug hard-to-reproduce issues on devices that you don't have.
  5. Gather the results and share them with others via the familiar GitHub CI UX.

Quick Start

Let's say you are integrating a new ExecuTorch backend which improves llama2 inference performance. You have already run some prompts to confirm that the tokens per second (TPS) value is higher than what's reported in https://github.com/pytorch/executorch/tree/main/examples/models/llama2#performance. The result looks good on your phones, so the next step is to confirm the value on CI. To do that, you will need a few things:

  1. Decide on the group of devices on which you want to run the test. Taking Android as an example, you might want to run it on the recent Samsung Galaxy S2x phones. Such a group of devices has already been created in our infra under the ARN arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa.
  2. Build the app that you want to test. It would be in the .apk format for Android and .ipa format for iOS.
  3. Prepare the test to run. Two types of tests are supported at the moment:
    1. Instrumented tests on Android https://developer.android.com/training/testing/instrumented-tests
    2. XCTest on iOS https://developer.apple.com/documentation/xctest
  4. Prepare an optional zip archive of any data files you want to copy to the remote devices. This usually contains the exported models themselves.
    1. On Android, the archive will be extracted to the /sdcard/ directory.
    2. On iOS, the files will be in the application sandbox.
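
As a minimal sketch of step 4, the data archive can be built with Python's standard zipfile module. The helper name build_extra_data_archive and the output name extra-data.zip are hypothetical. The two input file names are the ones used by the Llama test spec described under Test specification. Storing the files at the root of the archive is an assumption, so adjust the layout to whatever your test spec expects.

```python
import zipfile
from pathlib import Path


def build_extra_data_archive(files, archive_path):
    """Zip the given files and return the member names of the new archive.

    Hypothetical helper for illustration only; it is not part of the workflow.
    """
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for f in map(Path, files):
            # Assumption: keep only the file name so each file sits at the archive root.
            zf.write(f, arcname=f.name)
    # Re-open the archive to list what was actually written.
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()
```

For example, build_extra_data_archive(["xnnpack_llama2.pte", "tokenizer.bin"], "extra-data.zip") would package an exported model and its tokenizer, provided both files exist in the current directory.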

Test specification

After having these items ready, the next step is to take a minute to look at the test specification, which codifies how the test is run. You could probably just use the default test spec that we provide, but knowing the steps comes in handy if you need to customize it. Here are some examples:

  1. The Android test spec for the ExecuTorch Llama app can be found at https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec.yml. It prepares the required folder /data/local/tmp/llama/ and copies the exported model xnnpack_llama2.pte together with the tokenizer tokenizer.bin there before running the test. $DEVICEFARM_DEVICE_UDID is set by AWS Device Farm to identify the target device, and the output will be available in $DEVICEFARM_LOG_DIR/instrument.log.
...
  test:
    commands:
      # By default, the following ADB command is used by Device Farm to run your Instrumentation test.
      # Please refer to Android's documentation for more options on running instrumentation tests with adb:
      # https://developer.android.com/studio/test/command-line#run-tests-with-adb
      - echo "Starting the Instrumentation test"
      - |
        adb -s $DEVICEFARM_DEVICE_UDID shell "am instrument -r -w --no-window-animation \
        $DEVICEFARM_TEST_PACKAGE_NAME/$DEVICEFARM_TEST_PACKAGE_RUNNER 2>&1 || echo \": -1\"" |
        tee $DEVICEFARM_LOG_DIR/instrument.log
...
  2. The generic iOS test spec used by the ExecuTorch iOS demo app is at https://ossci-assets.s3.amazonaws.com/default-ios-device-farm-appium-test-spec.yml. It just invokes xcodebuild test-without-building on the target device.
  test:
    commands:
      - xcodebuild test-without-building -destination id=$DEVICEFARM_DEVICE_UDID -xctestrun $DEVICEFARM_TEST_PACKAGE_PATH/*.xctestrun  -derivedDataPath $DEVICEFARM_LOG_DIR

If you have a custom test spec, you'll need to upload it somewhere the workflow can download it from.

Example workflows

Let's bring everything together and go through an actual example of https://github.com/pytorch/executorch/blob/main/.github/workflows/android.yml.

name: Android

on:
  ...

jobs:
  # Build all the demo apps 
  test-demo-android:
    name: test-demo-android
    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
    strategy:
      matrix:
        include:
          - build-tool: buck2
    with:
      runner: linux.12xlarge
      docker-image: executorch-ubuntu-22.04-clang12-android
      submodules: 'true'
      ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
      timeout: 90
      # The apps are built using Nova reusable GH action, so we set the upload-artifact parameter here to make them available as artifacts on GitHub
      upload-artifact: android-apps
      script: |
        set -eux

        ... Building the apps ...
        
        # In Nova workflow, all the files under artifacts-to-be-uploaded folder will be uploaded
        mkdir -p artifacts-to-be-uploaded
        # Copy the app and its test suite into the folder to be uploaded
        cp examples/demo-apps/android/LlamaDemo/app/build/outputs/apk/debug/*.apk artifacts-to-be-uploaded/
        cp examples/demo-apps/android/LlamaDemo/app/build/outputs/apk/androidTest/debug/*.apk artifacts-to-be-uploaded/
        # Also copy the built libraries (.a archives)
        cp cmake-out-android/lib/*.a artifacts-to-be-uploaded/

  # Upload the app and its test suite to S3 so that they can be downloaded by the test job
  upload-artifacts:
    needs: test-demo-android
    runs-on: linux.2xlarge
    steps:
      - name: Download the artifacts
        uses: actions/download-artifact@v3
        with:
          # The name here needs to match the name of the upload-artifact parameter
          name: android-apps
          path: ${{ runner.temp }}/artifacts/

      - name: Verify the artifacts
        shell: bash
        working-directory: ${{ runner.temp }}/artifacts/
        run: |
          ls -lah ./

      - name: Upload the artifacts to S3
        uses: seemethere/upload-artifact-s3@v5
        with:
          s3-bucket: gha-artifacts
          s3-prefix: |
            ${{ github.repository }}/${{ github.run_id }}/artifact
          retention-days: 14
          if-no-files-found: ignore
          path: ${{ runner.temp }}/artifacts/

  # Run the test on remote Android devices
  test-llama-app:
    needs: upload-artifacts
    permissions:
      id-token: write
      contents: read
    uses: pytorch/test-infra/.github/workflows/mobile_job.yml@main
    with:
      device-type: android
      runner: ubuntu-latest
      test-infra-ref: ''
      # This is the ARN of ExecuTorch project on AWS
      project-arn: arn:aws:devicefarm:us-west-2:308535385114:project:02a2cf0f-6d9b-45ee-ba1a-a086587469e6
      # This is the custom Android device pool that only includes Samsung Galaxy S2x
      device-pool-arn: arn:aws:devicefarm:us-west-2:308535385114:devicepool:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/e59f866a-30aa-4aa1-87b7-4510e5820dfa
      # Uploaded to S3 from the previous job, the name of the app comes from the project itself
      android-app-archive: https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifact/app-debug.apk
      android-test-archive: https://gha-artifacts.s3.amazonaws.com/${{ github.repository }}/${{ github.run_id }}/artifact/app-debug-androidTest.apk
      # The test spec can be downloaded from https://ossci-assets.s3.amazonaws.com/android-llama2-device-farm-test-spec.yml. A link to download the spec also works here.
      test-spec: arn:aws:devicefarm:us-west-2:308535385114:upload:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/abd86868-fa63-467e-a5c7-218194665a77
      # The exported llama2 model and its tokenizer can be downloaded from https://ossci-assets.s3.amazonaws.com/executorch-android-llama2-7b.zip. A link to download the archive also works here, but keep in mind that some exported models like llama2 7B are a few GB in size, so it is faster to upload the archive to AWS beforehand and reuse the existing resource if possible
      extra-data: arn:aws:devicefarm:us-west-2:308535385114:upload:02a2cf0f-6d9b-45ee-ba1a-a086587469e6/bd15825b-ddab-4e47-9fef-a9c8935778dd

pytorch/test-infra/.github/workflows/mobile_job.yml is the one doing the heavy lifting here. It can be tweaked with the following parameters:

  • device-type: either android or ios
  • project-arn: this value is fixed for each project. Please reach out to PyTorch Dev Infra if you need one. There are 2 projects available at the moment:
    • arn:aws:devicefarm:us-west-2:308535385114:project:b531574a-fb82-40ae-b687-8f0b81341ae0 for PyTorch core.
    • and arn:aws:devicefarm:us-west-2:308535385114:project:02a2cf0f-6d9b-45ee-ba1a-a086587469e6 for ExecuTorch.
  • device-pool-arn: this is the pool of remote devices to run the test on. By default, 5 random popular devices are selected for the test. Please also reach out to PyTorch Dev Infra if you need something more specific. Note that the app itself can limit which devices it can run on. For example, having IPHONEOS_DEPLOYMENT_TARGET set to 17 will exclude all devices with a lower iOS version.
  • test-spec: this is the test specification that drives the test and collects the results. adb commands for Android and xcodebuild commands for iOS can be used here.
  • extra-data: the archive with any extra data to copy over to the device before the test is run.
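
The ARNs on this page all follow the shape arn:aws:devicefarm:<region>:<account>:<type>:<id>, and the device pool ID shown here starts with the ID of the ExecuTorch project. Assuming that pattern holds in general, a small sanity check before launching a job could look like the sketch below. The names DeviceFarmArn, parse_devicefarm_arn, and pool_belongs_to_project are hypothetical and not part of the workflow.

```python
from typing import NamedTuple


class DeviceFarmArn(NamedTuple):
    region: str
    account: str
    resource_type: str
    resource_id: str


def parse_devicefarm_arn(arn: str) -> DeviceFarmArn:
    """Split an ARN shaped like arn:aws:devicefarm:<region>:<account>:<type>:<id>."""
    parts = arn.split(":", 6)
    if len(parts) != 7 or parts[:3] != ["arn", "aws", "devicefarm"]:
        raise ValueError(f"not a Device Farm ARN: {arn!r}")
    return DeviceFarmArn(*parts[3:])


def pool_belongs_to_project(pool_arn: str, project_arn: str) -> bool:
    """Check that the device pool ID is prefixed with the project ID.

    Assumption: pool IDs look like <project-id>/<pool-id>, as in the ARNs on this page.
    """
    pool = parse_devicefarm_arn(pool_arn)
    project = parse_devicefarm_arn(project_arn)
    return (
        pool.resource_type == "devicepool"
        and project.resource_type == "project"
        and pool.resource_id.split("/")[0] == project.resource_id
    )
```

With the values from this page, the Samsung Galaxy S2x pool matches the ExecuTorch project ARN and does not match the PyTorch core one.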

Some parameters are platform-specific. For Android, we have:

  • android-app-archive: the link to the Android app APK archive to run. It also accepts an existing ARN if the app has already been uploaded to AWS.
  • android-test-archive: the link to the Android instrumentation tests APK archive or an existing ARN. The test archive can be built with ./gradlew assembleAndroidTest.

For iOS, we have 2 other equivalent parameters for the app and the test suite. Note that they need to be built for a generic iOS device and not the simulator, for example xcodebuild build-for-testing -project <PATH_TO>.xcodeproj -scheme <THE_TEST_SUITE_TO_BUILD> -destination platform="iOS".

  • ios-ipa-archive: the link to the iOS app IPA archive to run, or an existing ARN if the app has already been uploaded to AWS.
  • ios-xctestrun-zip: the link to the iOS xctestrun zip archive or an existing ARN.

Voila! You now have a workflow to run the tests on remote mobile devices.
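
To make the two accepted input forms concrete, here is a small sketch that rebuilds the S3 link pattern used in the example workflow and tells a link apart from an ARN. Both helper names are hypothetical, the run ID in the usage note is made up, and the prefix checks are an illustration rather than how mobile_job.yml actually decides.

```python
def artifact_url(repository: str, run_id: str, filename: str,
                 bucket: str = "gha-artifacts") -> str:
    """Build the download link for an artifact uploaded by the upload-artifacts job.

    Mirrors the pattern in the example workflow:
    https://<bucket>.s3.amazonaws.com/<repository>/<run_id>/artifact/<filename>
    """
    return f"https://{bucket}.s3.amazonaws.com/{repository}/{run_id}/artifact/{filename}"


def input_kind(value: str) -> str:
    """Classify a workflow input as an existing Device Farm ARN or a download link.

    Assumption: ARNs start with 'arn:aws:devicefarm:' and links with 'https://'.
    """
    if value.startswith("arn:aws:devicefarm:"):
        return "arn"
    if value.startswith("https://"):
        return "link"
    raise ValueError(f"neither a Device Farm ARN nor an https link: {value!r}")
```

For instance, artifact_url("pytorch/executorch", "1234567890", "app-debug.apk") yields a link of the same form as the android-app-archive value above, and input_kind classifies it as a link.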

How to get the test results

Being a reusable GitHub workflow, it relies on the GitHub UX to bring the test results back via its console log. HUD support is minimal at the moment, but could be extended in the future. Here are some hands-on examples to illustrate how to read the console log.

Getting the token-per-second on Samsung S2x phones

Let's take the test-llama-app job as an example. The most important step of the job is Run Android tests on devices, where it runs the test on 4 different S22 devices in parallel. The test consists of 3 steps: Setup Suite, Tests Suite, and Teardown Suite. Each step returns its own .logcat file for manual investigation. The most important output is Test_spec_output.txt, from which all the lines starting with the prefix [PyTorch] are printed to the console. On the Samsung Galaxy S22 5G, the test observes a TPS of 6.74, as reported in the [PyTorch] line of the log below.

Samsung Galaxy S22 5G PASSED with stats {'total': 3, 'passed': 3, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
  Setup Suite PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
    Setup Test PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
      Saving FILE Logcat.logcat (DEVICE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00000_00000_00000_Logcat.logcat
      Saving FILE TCP_dump_log.txt (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00000_00000_00001_TCP_dump_log.txt
      Saving LOG ListArtifactType.log.json (MESSAGE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00000_00000_LOG_ListArtifactType.log.json
  Tests Suite PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
    Tests PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
      Saving FILE Test_spec_output.txt (TESTSPEC_OUTPUT) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00000_Test_spec_output.txt
        [PyTorch] junit.framework.AssertionFailedError: The observed TPS 6.7432113 is less than the expected TPS 10.0
      Saving FILE Customer_Artifacts.zip (CUSTOMER_ARTIFACT) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00001_Customer_Artifacts.zip
      Saving FILE Customer_Artifacts_Log.txt (CUSTOMER_ARTIFACT_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00002_Customer_Artifacts_Log.txt
      Saving FILE Test_spec_shell_script.sh (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00003_Test_spec_shell_script.sh
      Saving FILE Test_spec_file.yml (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00004_Test_spec_file.yml
      Saving FILE Video.mp4 (VIDEO) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00005_Video.mp4
      Saving FILE Logcat.logcat (DEVICE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00006_Logcat.logcat
      Saving FILE TCP_dump_log.txt (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_00008_TCP_dump_log.txt
      Saving LOG ListArtifactType.log.json (MESSAGE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00001_00000_LOG_ListArtifactType.log.json
  Teardown Suite PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
    Teardown Test PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
      Saving FILE Logcat.logcat (DEVICE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00002_00000_00001_Logcat.logcat
      Saving FILE TCP_dump_log.txt (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00002_00000_00003_TCP_dump_log.txt
      Saving LOG ListArtifactType.log.json (MESSAGE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8734286527/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_9a5cdbc0-561c-4005-9bc8-6003fb40d76a_00002_00002_00000_LOG_ListArtifactType.log.json

Note that all the other artifacts from AWS Device Farm, such as the screen capture, are available to devs via the S3 links.
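
If you want to pull the numbers out of a saved console log instead of reading it by eye, a minimal sketch could filter the [PyTorch] lines and extract the observed TPS. The function names are hypothetical, and the regular expression assumes the message wording shown in the log above.

```python
import re
from typing import List

PYTORCH_PREFIX = "[PyTorch]"
# Assumption: the TPS is reported as "observed TPS <number>", as in the log above.
TPS_PATTERN = re.compile(r"observed TPS (\d+(?:\.\d+)?)")


def pytorch_lines(console_log: str) -> List[str]:
    """Return the stripped console lines that start with the [PyTorch] prefix."""
    stripped = (line.strip() for line in console_log.splitlines())
    return [line for line in stripped if line.startswith(PYTORCH_PREFIX)]


def observed_tps(console_log: str) -> List[float]:
    """Extract every observed TPS value reported in the [PyTorch] lines."""
    values = []
    for line in pytorch_lines(console_log):
        match = TPS_PATTERN.search(line)
        if match:
            values.append(float(match.group(1)))
    return values
```

Running observed_tps on the log excerpt above returns a single value, 6.7432113, which is the figure quoted for the Samsung Galaxy S22 5G.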

Testing different ExecuTorch backends on iOS

Another example is testing different ExecuTorch backends on iOS. The structure of the log is the same, with the 3 steps Setup Suite, Tests Suite, and Teardown Suite and all the available test artifacts from AWS Device Farm.

Apple iPhone 11 PASSED with stats {'total': 3, 'passed': 3, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
  Setup Suite PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
    Setup Test PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
      Saving FILE Syslog.syslog (DEVICE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00000_00000_00000_Syslog.syslog
      Saving FILE TCP_dump_log.txt (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00000_00000_00001_TCP_dump_log.txt
      Saving LOG ListArtifactType.log.json (MESSAGE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00000_00000_LOG_ListArtifactType.log.json
  Tests Suite PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
    Tests PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
      Saving FILE Test_spec_output.txt (TESTSPEC_OUTPUT) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00000_Test_spec_output.txt
      Saving FILE Test_spec_shell_script.sh (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00001_Test_spec_shell_script.sh
      Saving FILE Test_spec_file.yml (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00002_Test_spec_file.yml
      Saving FILE Customer_Artifacts.zip (CUSTOMER_ARTIFACT) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00003_Customer_Artifacts.zip
      Saving FILE Customer_Artifacts_Log.txt (CUSTOMER_ARTIFACT_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00004_Customer_Artifacts_Log.txt
      Saving FILE Video.mp4 (VIDEO) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00005_Video.mp4
      Saving FILE Syslog.syslog (DEVICE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00006_Syslog.syslog
      Saving FILE TCP_dump_log.txt (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_00008_TCP_dump_log.txt
      Saving LOG ListArtifactType.log.json (MESSAGE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00001_00000_LOG_ListArtifactType.log.json
  Teardown Suite PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
    Teardown Test PASSED with stats {'total': 1, 'passed': 1, 'failed': 0, 'warned': 0, 'errored': 0, 'stopped': 0, 'skipped': 0}
      Saving FILE Webkit_Log.webkitlog (WEBKIT_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00002_00000_00000_Webkit_Log.webkitlog
      Saving FILE Syslog.syslog (DEVICE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00002_00000_00002_Syslog.syslog
      Saving FILE TCP_dump_log.txt (RAW_FILE) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00002_00000_00004_TCP_dump_log.txt
      Saving LOG ListArtifactType.log.json (MESSAGE_LOG) at https://gha-artifacts.s3.amazonaws.com/device_farm/8641760330/1/arn_aws_devicefarm_us-west-2_308535385114_artifact_02a2cf0f-6d9b-45ee-ba1a-a086587469e6_e67a66d2-4e66-4bb2-a6f1-bc8d3b24a529_00002_00002_00000_LOG_ListArtifactType.log.json

Download the Test_spec_output.txt file to get the output of the xcodebuild test-without-building command, where we can see that the tests pass for the CoreML, MPS, portable, and XNNPACK backends.

Testing started
Test suite 'All tests' started on 'PDX000194454 - DeviceFarm (715)'
Test suite '<bundle>' started on 'PDX000194454 - DeviceFarm (715)'
Test suite 'MobileNetClassifierTest' started on 'PDX000194454 - DeviceFarm (715)'
Test case 'MobileNetClassifierTest.testV3WithCoreMLBackend()' passed on 'PDX000194454 - DeviceFarm (715)' (0.931 seconds)
Test case 'MobileNetClassifierTest.testV3WithMPSBackend()' passed on 'PDX000194454 - DeviceFarm (715)' (2.860 seconds)
Test case 'MobileNetClassifierTest.testV3WithPortableBackend()' passed on 'PDX000194454 - DeviceFarm (715)' (1.226 seconds)
Test case 'MobileNetClassifierTest.testV3WithXNNPACKBackend()' passed on 'PDX000194454 - DeviceFarm (715)' (0.116 seconds)
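
The per-backend results can also be summarized programmatically. The sketch below parses the Test case lines of a downloaded Test_spec_output.txt into a dictionary. The function name is hypothetical, and the pattern is modeled on the passed lines shown above. Matching a failed status with the same layout is an assumption.

```python
import re
from typing import Dict, Tuple

# Modeled on lines such as:
# Test case 'Suite.testName()' passed on '<device>' (0.931 seconds)
TEST_CASE = re.compile(
    r"^Test case '(?P<name>[^']+)' (?P<status>passed|failed) "
    r"on '[^']*' \((?P<seconds>\d+(?:\.\d+)?) seconds\)"
)


def parse_test_cases(output: str) -> Dict[str, Tuple[str, float]]:
    """Map each test case name to its (status, duration in seconds)."""
    results = {}
    for line in output.splitlines():
        match = TEST_CASE.match(line.strip())
        if match:
            results[match["name"]] = (match["status"], float(match["seconds"]))
    return results
```

Applied to the output above, this gives four entries, one per backend test, each with the status passed and its duration.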