BobBuildTool/basement

artifact mismatch while using tools: target-toolchain: host-toolchain


- name: libs::glib-tools
  use: [tools]
  tools:
      target-toolchain: host-toolchain

Hey,

there is an issue with artifact matching when we use the tool replace feature.
The artifacts built on Jenkins don't match those built on the host machine using --sandbox.
The second part of the artifact name isn't correct.

jenkins:

curl -sSg --fail -o libs_glib-tools_x86_64-bob_compat-linux-gnu_dist_1-cfd5-94ae.tgz http://fdt-c-pcs-0004.fdtech.intern/artifact/76/9f/39e5bb610ab473c20ce2405dc07623cdb843f0104983ee64093f2c7ecdf8ea68538690cfa842-1.tgz

local:

>> x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools
   DOWNLOAD  dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/2/workspace  from http://fdt-c-pcs-0004.fdtech.intern/artifact/76/9f/39e5bb610ab473c20ce2405dc07623cdb843193205b3d955a0351162413094dcae51bf3de1f2-1.tgz .. not found
Build error: Downloading artifact failed

I've quickly built it on current basement master (fe037ed) with the following minimal recipe:

inherit: ["basement::rootrecipe"]

depends:
  - libs::pango-dev

buildScript: |
  true
packageScript: |
  true

There I get the same fingerprint as on your Jenkins (f0104983ee64093f2c7ecdf8ea68538690cfa842, the trailing part of the build-id):

[ 153] PACKAGE   libs::glib-tools - dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/1/workspace
[ 158] AUDIT     libs::glib-tools - dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/1/workspace .. ok
[ 159] UPLOAD    libs::glib-tools - dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/1/workspace  to /home/jkloetzk/src/private/mismatch/cache/3a/f1/b16448697d072325eee71e5bcd7a5d710de1f0104983ee64093f2c7ecdf8ea68538690cfa842-1.tgz .. ok

It looks like your local build is somehow modified. Do you use the same sandbox? Could you supply some recipes where it can be reproduced?

Now I tested removing libs::glib-tools from pango.yaml and adding it to rootrecipe.yaml instead.

That works...!

Shouldn't both variants produce the same artifact?

UPDATE:
Some more information.

In the case of the current variant:

curl -sSg --fail -o libs_glib-tools_x86_64-bob_compat-linux-gnu_dist_1-e25d-89b0.tgz http://fdt-c-pcs-0004.fdtech.intern/artifact/76/9f/39e5bb610ab473c20ce2405dc07623cdb843f0104983ee64093f2c7ecdf8ea68538690cfa842-1.tgz

Same PC where Jenkins is installed:

bob dev x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools/ -vv --sandbox --download forced
>> x86_64::linux/devel::sandbox/devel::bootstrap-sandbox/devel::cross-toolchain/devel::gcc-cross/devel::binutils
   FNGRPRNT  devel::binutils .. 'x86_64\nglibc 2.27\nx86_64\nlibstdc++ 2...
>> x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools
   DOWNLOAD  dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/2/workspace  from http://fdt-c-pcs-0004.fdtech.intern/artifact/76/9f/39e5bb610ab473c20ce2405dc07623cdb843193205b3d955a0351162413094dcae51bf3de1f2-1.tgz .. not found
Build error: Downloading artifact failed

My laptop with WSL2:

bob dev x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools/ -vv --sandbox --download forced
>> x86_64::linux/devel::sandbox/devel::bootstrap-sandbox/devel::cross-toolchain/devel::gcc-cross/devel::binutils
   FNGRPRNT  devel::binutils .. 'x86_64\nglibc 2.27\nx86_64\nlibstdc++ 2...
>> x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools
   DOWNLOAD  dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/2/workspace  from http://fdt-c-pcs-0004.fdtech.intern/artifact/76/9f/39e5bb610ab473c20ce2405dc07623cdb843193205b3d955a0351162413094dcae51bf3de1f2-1.tgz .. not found
Build error: Downloading artifact failed

#######
Variant with glib-tools in the rootrecipe:

WSL2:

bob dev x86_64::linux/devel::graphviz/libs::pango-dev/  -vv --sandbox --download forced
>> x86_64::linux/devel::sandbox/devel::bootstrap-sandbox/devel::cross-toolchain/devel::gcc-cross/devel::binutils
   FNGRPRNT  devel::binutils .. 'x86_64\nglibc 2.27\nx86_64\nlibstdc++ 2...
>> x86_64::linux/devel::graphviz/libs::pango-dev
   DOWNLOAD  dev/dist/libs/pango-dev/x86_64-bob_compat-linux-gnu/2/workspace  from http://fdt-c-pcs-0004.fdtech.intern/artifact/b1/b3/7cfeddf62bc2c622f935bf2573682f12bd21ae6e377d4ca6f0edb039e2e089c4c911e40d860c-1.tgz .. ok

Jenkins PC:

bob dev x86_64::linux/devel::graphviz/libs::pango-dev/  -vv --sandbox --download forced
>> x86_64::linux/devel::sandbox/devel::bootstrap-sandbox/devel::cross-toolchain/devel::gcc-cross/devel::binutils
   FNGRPRNT  devel::binutils .. 'x86_64\nglibc 2.27\nx86_64\nlibstdc++ 2...
>> x86_64::linux/devel::graphviz/libs::pango-dev
   DOWNLOAD  dev/dist/libs/pango-dev/x86_64-bob_compat-linux-gnu/2/workspace  from http://fdt-c-pcs-0004.fdtech.intern/artifact/b1/b3/7cfeddf62bc2c622f935bf2573682f12bd21ae6e377d4ca6f0edb039e2e089c4c911e40d860c-1.tgz .. ok

What other toolchain are you using in x86_64::linux? Maybe that makes a difference?

Another thing: did you set BASEMENT_HOST_COMPAT_TOOLCHAIN to 0?

Could you provide the audit.json.gz from the Jenkins build artifact (76/9f/39e5bb610ab473c20ce2405dc07623cdb843f0104983ee64093f2c7ecdf8ea68538690cfa842-1.tgz)? Also build x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools on the Jenkins PC and provide its audit.json.gz too. Maybe we can spot some difference...
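In case it helps, a small sketch for pulling the audit trail out of a downloaded artifact tarball (Python, assumptions noted in the comments); it does not hard-code the member path but searches for a file ending in audit.json.gz:

#!/usr/bin/env python3
# Sketch: extract the audit trail from a downloaded Bob artifact (.tgz).
# Assumption: the audit file is stored inside the tarball under a member
# name ending in "audit.json.gz"; the exact path is not hard-coded here.
import sys
import tarfile

def extract_audit(artifact_tgz, out_file="audit.json.gz"):
    with tarfile.open(artifact_tgz, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith("audit.json.gz"):
                with tar.extractfile(member) as src, open(out_file, "wb") as dst:
                    dst.write(src.read())
                print("extracted", member.name, "->", out_file)
                return
    sys.exit("no audit.json.gz found in " + artifact_tgz)

if __name__ == "__main__":
    extract_audit(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else "audit.json.gz")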

What other toolchain are you using in x86_64::linux? Maybe that makes a difference?

I don't think so. That is all far removed from the current problem; the current problem occurs within the scope of basement.

Another thing: did you set BASEMENT_HOST_COMPAT_TOOLCHAIN to 0?

Nope! I know that switch. 👍

Could you provide the audit.json.gz from the Jenkins build artifact (76/9f/39e5bb610ab473c20ce2405dc07623cdb843f0104983ee64093f2c7ecdf8ea68538690cfa842-1.tgz)? Also build x86_64::linux/devel::graphviz/libs::pango-dev/libs::glib-tools on the Jenkins PC and provide its audit.json.gz too. Maybe we can spot some difference...

Yep:
audit-artifact.json.gz
audit-build.json.gz

How do you compare them? My meld crashes/hangs while loading the files :/

What other toolchain are you using in x86_64::linux? Maybe that makes a difference?

I don't think so. That is all far removed from the current problem; the current problem occurs within the scope of basement.

I was asking to understand the environment a bit better, to come up with some theory about what could go wrong. So the question is: do you configure another toolchain in your x86_64::linux recipe, or do you stay with the default devel::host-compat-toolchain that is configured by basement::rootrecipe? If you use another toolchain, is it configured before or after the devel::graphviz dependency?

How do you compare them? My meld crashes/hangs while loading the files :/

Unfortunately there is no good tooling yet. Comparing them at the text level won't work. I hope I'll find the time to dig into it this evening...
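Until there is proper tooling, a rough comparison can be scripted. A minimal sketch, assuming only that the audit files contain JSON objects carrying "variant-id" and "result-hash" keys (the exact nesting is not relied on; the whole document is walked recursively):

#!/usr/bin/env python3
# Rough audit-trail diff: report variant-ids present in both files that
# resolved to different result-hashes. Assumption: records are JSON objects
# with "variant-id" and "result-hash" keys somewhere inside audit.json.gz.
import gzip
import json
import sys

def collect(node, out):
    # Recursively gather variant-id -> set of result-hashes from arbitrary JSON.
    if isinstance(node, dict):
        if "variant-id" in node and "result-hash" in node:
            out.setdefault(node["variant-id"], set()).add(node["result-hash"])
        for value in node.values():
            collect(value, out)
    elif isinstance(node, list):
        for item in node:
            collect(item, out)

def load(path):
    records = {}
    with gzip.open(path, "rt") as f:
        collect(json.load(f), records)
    return records

a, b = load(sys.argv[1]), load(sys.argv[2])
for vid in sorted(set(a) & set(b)):
    if a[vid] != b[vid]:
        print("MISMATCH", vid, sorted(a[vid]), sorted(b[vid]))

Every reported variant-id points at a step that produced different results from the same inputs in the two builds.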

I was asking to understand the environment a bit better, to come up with some theory about what could go wrong. So the question is: do you configure another toolchain in your x86_64::linux recipe, or do you stay with the default devel::host-compat-toolchain that is configured by basement::rootrecipe? If you use another toolchain, is it configured before or after the devel::graphviz dependency?

x86_64/linux.yaml

inherit: ["toolchain::x86_64"]

toolchain/x86_64.yaml

inherit: [rootrecipe]

root: !expr |
  "${BOB_HOST_PLATFORM}" == "linux"

depends:
    - name: devel::cross-toolchain-x86_64-linux-gnu
      use: [tools]
      forward: True

rootrecipe.yaml

inherit: ["basement::rootrecipe", "pipython3::rootrecipe"]

depends:
    - name: devel::cuda
      use: [tools]
      forward: True
    - name: fdt::cmake-export
      use: [tools]
      forward: True
    - if: !expr |
        "${BOB_HOST_PLATFORM}" == "msys"
      name: devel::win::graphviz
      use: [tools]
      forward: True
    - if: !expr |
        "${BOB_HOST_PLATFORM}" == "linux"
      name: devel::graphviz
      use: [tools]
      forward: True

devel/cuda.yaml:

shared: True

metaEnvironment:
    PKG_VERSION: "10.2.89"

checkoutSCM:
    scm: url
    url: ${FILE_SERVER}/cuda/cuda-${PKG_VERSION}-${BOB_HOST_PLATFORM}-${ARCH}.tgz
    digestSHA256: "$(if-then-else,$(eq,${BOB_HOST_PLATFORM},msys),5263fd45bdcbeeac41664ef4006fd30bbdad7707dd43589fd72143de1264e255,\
                   $(if-then-else,$(eq,${ARCH},x86_64),63b258e6f9a375db334b63c5642634900ef283885e137b8f571f7e7b5a584c17,4238785e559f88b97b201603ad6235a13ed714e8cd37781ab33c6950652f63cc))"
    extract: False

buildScript: |
    tar -xf $1/*.tgz

packageScript: |
     rsync -a "$1/" .
     for i in $(find . -name bin) ; do
        chmod +x $i/* -R
     done

provideTools:
    cuda:
        path: "bin"
        environment:
            CUDACXX: "nvcc"
            CUDAFLAGS: "-Xcompiler='\"'$(if-then-else,$(eq,$BOB_HOST_PLATFORM,msys),-O$(if-then-else,$(eq,${BASEMENT_OPTIMIZE},0),d,${BASEMENT_OPTIMIZE})$(if-then-else,${BASEMENT_DEBUG}, -Zi,) -MD$(if-then-else,${BASEMENT_DEBUG},d,) -W3,-O${BASEMENT_OPTIMIZE}$(if-then-else,${BASEMENT_DEBUG}, -g,) -pipe)'\"'"

cmake-export.yaml

inherit: [cmake]

metaEnvironment:
    PKG_VERSION: "v1.0.1"

checkoutSCM:
    scm: git
    url: ${GIT_SERVER}/fdt/fdt.cmake.export.git
    tag: ${PKG_VERSION}

buildScript: |
    mkdir -p install/usr
    rsync -a --delete $1/bin install/usr/

packageScript: |
    rsync -a --delete $1/install/* .

provideTools:
    cmake-export: "usr/bin"

I guess that is the full chain leading up to it. Nothing strange, is it?
I will try to reproduce it with just the basement repository.

I've quickly built it on current basement master (fe037ed) with the following minimal recipe:

inherit: ["basement::rootrecipe"]

depends:
  - libs::pango-dev

buildScript: |
  true
packageScript: |
  true

There I get the same fingerprint as on your Jenkins (f0104983ee64093f2c7ecdf8ea68538690cfa842, the trailing part of the build-id):

[ 153] PACKAGE   libs::glib-tools - dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/1/workspace
[ 158] AUDIT     libs::glib-tools - dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/1/workspace .. ok
[ 159] UPLOAD    libs::glib-tools - dev/dist/libs/glib-tools/x86_64-bob_compat-linux-gnu/1/workspace  to /home/jkloetzk/src/private/mismatch/cache/3a/f1/b16448697d072325eee71e5bcd7a5d710de1f0104983ee64093f2c7ecdf8ea68538690cfa842-1.tgz .. ok

It looks like your local build is somehow modified. Do you use the same sandbox? Could you supply some recipes where it can be reproduced?

I can't upload this recipe to the Jenkins server:

Parse error: Jobs are cyclic

I can't upload this recipe to the Jenkins server:

Parse error: Jobs are cyclic

OK, now we have two problems. 😒 I have the feeling that this is still somehow related to the original problem...

But I also tested this with a clean basement master; does this work for you?

No, I get the same error on vanilla basement with the minimal root recipe.

I also went back to the 28th of December and tried with kernel::linux-image. Still the same issue. So I guess target-toolchain: host-toolchain is the trigger of the problem.

OH: I guess the sandbox is the problem, not the target-toolchain: host-toolchain stuff. This also happens if I just have a dependency on e.g. libs::expat-dev.

I think I found a smoking gun: libs::pango depends on libs::glib-dev and libs::glib-tools. They differ only in the packageStep. Both are built with the devel::host-compat-toolchain, but the former with the target-toolchain and the latter with the renamed host-toolchain. Content-wise they are the same, but the host-toolchain has a fingerprintScript while the target-toolchain has none. Depending on which is built first, the fingerprint part of the buildStep will be different...

Currently the build logic only looks at the variant-id. Two packages with the same variant-id but different fingerprintScript are treated as the same. Right now I have no good idea how to solve the issue...
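To illustrate the gap with a toy model (made-up field names, not Bob's actual hashing code): if the variant-id digests the step inputs but not the used tools' fingerprintScript, two tool packages that differ only in that script collapse onto the same id:

#!/usr/bin/env python3
# Toy model of the described hole -- NOT Bob's real build logic.
# The "variant-id" below deliberately digests everything except the
# fingerprintScript, so two packages differing only in that script
# end up with the same id.
import hashlib

def toy_variant_id(pkg):
    digest = hashlib.sha1()
    for key in ("name", "packageScript", "toolName"):
        digest.update(pkg.get(key, "").encode())
        digest.update(b"\0")
    # pkg.get("fingerprintScript") is intentionally NOT part of the digest
    return digest.hexdigest()

via_target_toolchain = {
    "name": "libs::glib-tools",
    "packageScript": "mesonPackageBin",
    "toolName": "target-toolchain",           # plain target-toolchain, no fingerprint
}
via_renamed_host_toolchain = {
    "name": "libs::glib-tools",
    "packageScript": "mesonPackageBin",
    "toolName": "target-toolchain",           # host-toolchain mapped onto the same name
    "fingerprintScript": "gcc -dumpmachine",  # hypothetical host fingerprint
}

# Both map to the same variant-id although only one has a fingerprintScript:
print(toy_variant_id(via_target_toolchain) == toy_variant_id(via_renamed_host_toolchain))  # True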

Oh. Maybe you should give yourself and the problem a night... or two ;)

UPDATE: the "cyclic jobs" problem also persists if there is no pango recipe. I checked out b9874d09bea5de1ca5e3fb7b0d470c20ea126479, where there was no pango.

UPDATE 2: I tested it at work by replacing target-toolchain: host-toolchain with glib-tools added to the rootrecipe; that worked.

This will require a deeper change in the build logic. I'm on vacation for the next few weeks, so it will take a bit longer... ;)

I can't upload this recipe to the Jenkins server:

Parse error: Jobs are cyclic

OK, now we have two problems. 😒 I have the feeling that this is still somehow related to the original problem...

Any ideas so far? This problem blocks me a bit.

I want to set up a Jenkins server for the new recipes, including the basement and pipython layers.

Currently every user has to build all the GCC toolchains and sandboxes, which takes a lot of time.

I haven't actually looked into the Jenkins problem but I'm quite sure this is related to the original issue. It looks like a design problem for which I don't yet have an idea. I'll put it on top of my list...

OK, the "Jobs are cyclic" problem has basically been there since Bob 0.14, where the sandboxInvariant policy was introduced! It just depends on the recipes whether it's triggered or not. The fix will take some time, though...

Please test BobBuildTool/bob#439. It should solve the "jobs are cyclic" problem. The original "artifact mismatch" problem is similar but a different beast...

Currently I'm trying to get a pango artifact built and uploaded on Jenkins that also matches on the laptops,
but I have no success at all.

I made the following changes, but I still don't get a matching artifact:

  • I removed the target-toolchain: host-toolchain feature
  • I split glib-dev/glib-tgt and glib-tools into two recipes
  • I added the missing tools to the rootrecipe
diff --git a/recipes/libs/fontconfig.yaml b/recipes/libs/fontconfig.yaml
index a39924a..b75c14a 100644
--- a/recipes/libs/fontconfig.yaml
+++ b/recipes/libs/fontconfig.yaml
@@ -4,15 +4,6 @@ metaEnvironment:
     PKG_VERSION: "2.13.92"

 depends:
-    - name: devel::gettext
-      use: [tools]
-      tools:
-          target-toolchain: host-toolchain
-    - name: devel::gperf
-      use: [tools]
-      tools:
-          target-toolchain: host-toolchain
-
     - libs::freetype-dev
     - libs::expat-dev

diff --git a/recipes/libs/glib.yaml b/recipes/libs/glib.yaml
index 0894227..91bd45d 100644
--- a/recipes/libs/glib.yaml
+++ b/recipes/libs/glib.yaml
@@ -1,4 +1,4 @@
-inherit: [meson, patch]
+inherit: [meson]

 metaEnvironment:
     PKG_VERSION: "2.71.0"
@@ -32,10 +32,7 @@ multiPackage:
             # make sure glibconfig.h is copied too
             mesonPackageDev "$1" "/usr/lib/glib-2.0/***"
         provideDeps: [ "*-dev" ]
+
     tgt:
         packageScript: mesonPackageLib
         provideDeps: [ "*-tgt" ]
-    tools:
-        packageScript: mesonPackageBin
-        provideTools:
-            glib: usr/bin
diff --git a/recipes/libs/pango.yaml b/recipes/libs/pango.yaml
index 317d67e..a8a4c76 100644
--- a/recipes/libs/pango.yaml
+++ b/recipes/libs/pango.yaml
@@ -4,12 +4,8 @@ metaEnvironment:
     PKG_VERSION: "1.48.2"

 depends:
-    - libs::glib-dev
-    - name: libs::glib-tools
-      use: [tools]
-      tools:
-          target-toolchain: host-toolchain
     - libs::fontconfig-dev
+    - libs::glib-dev
     - libs::harfbuzz-dev
     - libs::cairo-dev
     - libs::fribidi-dev
@@ -29,8 +25,7 @@ checkoutSCM:
     stripComponents: 1

 buildTools: [glib]
-buildScript: |
-    mesonBuild $1
+buildScript: mesonBuild $1

 multiPackage:
     dev:
diff --git a/recipes/multimedia/gst-plugins-base.yaml b/recipes/multimedia/gst-plugins-base.yaml
index b548cab..c84bf12 100644
--- a/recipes/multimedia/gst-plugins-base.yaml
+++ b/recipes/multimedia/gst-plugins-base.yaml
@@ -5,7 +5,7 @@ metaEnvironment:

 depends:
     # Requires glib-mkenums
-    - name: libs::glib-tools
+    - name: utils::glib
       use: [tools]
       tools:
         target-toolchain: host-toolchain
diff --git a/recipes/multimedia/gstreamer.yaml b/recipes/multimedia/gstreamer.yaml
index 0a466c6..e8e1944 100644
--- a/recipes/multimedia/gstreamer.yaml
+++ b/recipes/multimedia/gstreamer.yaml
@@ -5,7 +5,7 @@ metaEnvironment:

 depends:
     # Requires glib-mkenums
-    - name: libs::glib-tools
+    - name: utils::glib
       use: [tools]
       tools:
         target-toolchain: host-toolchain
diff --git a/recipes/utils/glib.yaml b/recipes/utils/glib.yaml
new file mode 100644
index 0000000..4e83cb6
--- /dev/null
+++ b/recipes/utils/glib.yaml
@@ -0,0 +1,31 @@
+inherit: [meson]
+
+metaEnvironment:
+    PKG_VERSION: "2.71.0"
+
+depends:
+    - libs::libffi-dev
+    - libs::pcre-lib-1-dev
+    - libs::zlib-dev
+
+    - use: []
+      depends:
+        - libs::libffi-tgt
+        - libs::pcre-lib-1-tgt
+        - libs::zlib-tgt
+
+checkoutSCM:
+    scm: url
+    url: http://ftp.gnome.org/pub/gnome/sources/glib/2.71/glib-${PKG_VERSION}.tar.xz
+    digestSHA256: "926816526f6e4bba9af726970ff87be7dac0b70d5805050c6207b7bb17ea4fca"
+    stripComponents: 1
+
+buildScript: |
+    mesonBuild $1 -Dtests=false
+    pushd install/usr/lib/pkgconfig
+    sed -i 's/\${bindir}\///g' glib-2.0.pc gio-2.0.pc
+    popd
+
+packageScript: mesonPackageBin
+provideTools:
+    glib: usr/bin

audits.zip

Attaching the two audit files.

  • The "build-id" doesn't match.

I've had a look at the audit trails and at least libs::fontconfig looks fishy. For the checkout step there are two different audit records that have the same variant-id but resulted in a different checkout hash (result-hash):

From audit-laptop.json:

      {
         "artifact-id" : "55137756aea3de37121181c20148e225b5ec797d",
         "result-hash" : "f2dfbe54fc7794ebe4aa1a960c70bce8876191a2",
         "variant-id" : "0bca97f5775e8851b9fda835b9fa1a775af0a585"
      },

The other from libs_pango-dev_x86_64-bob_compat-linux-gnu_dist_9-2cf1-b3a4.json:

      {
         "artifact-id" : "3f822645dab0c104fa6d6de1932b875cd0ddd540",
         "result-hash" : "e3e5d3d2cac126065806448ab647c8535ccddf12",
         "variant-id" : "0bca97f5775e8851b9fda835b9fa1a775af0a585"
      },

Consequently the build-id is different between the two builds. This is certainly unrelated to this bug here. It could be that the autoconfReconfigure in the checkoutScript introduces some indeterminism. Actually I'm wondering why the reconfigure is necessary. Usually it is only needed if patches are applied...

Uhhh, you're brilliant! That's it! So this isn't really related to the problem here. A PR will come.

Thanks. It reminds me of the fact that Bob still lacks an audit trail analyser, though. I guess such problems could be found more or less automatically...

BTW: I have a workaround that seems to work so far.
I added the following to my rootrecipe, which is used after basement::rootrecipe.

index 33f44fd..1402318 100644
--- a/classes/rootrecipe.yaml
+++ b/classes/rootrecipe.yaml
@@ -7,6 +7,11 @@ depends:
     - name: fdt::cmake-export
       use: [tools]
       forward: True
+    - if: !expr |
+        "${BOB_HOST_PLATFORM}" == "linux"
+      name: libs::glib-tools
+      use: [tools]
+      forward: True
     - if: !expr |
         "${BOB_HOST_PLATFORM}" == "msys"
       name: devel::win::graphviz

So this will override the parts in e.g. pango.yaml:

depends:
    - libs::glib-dev
    - name: libs::glib-tools
      use: [tools]
      tools:
          target-toolchain: host-toolchain

And in the end, the artifacts between Jenkins and local match!
BUT: only tested with a quick smoke test.

Thanks. It reminds me of the fact that Bob still lacks an audit trail analyser, though. I guess such problems could be found more or less automatically...

Maybe this isn't necessary; a quick search on the web turned up https://onlinetextcompare.com/json.
This one is amazing. It formats everything correctly and shows the differences pretty well!
Also, files larger than 2 MB aren't a problem.