reading of repodata seems to be done inefficiently
lukash opened this issue · 8 comments
Forwarding an issue originally reported against rpm on behalf of @keszybz. The repository metadata are read entirely by libsolv. Original text:
I was debugging some scriptlet, and noticed the following from sudo strace dnf …:
231692 openat(AT_FDCWD, "/var/tmp/inst1/var/cache/dnf/rawhide-2d95c80a1fa0a67d/repodata/af4c732cee391ed2994260e2ab93ede6e2b7dc94c4b1b786347b56a998e5b6df-primary.xml.zck", O_RDONLY) = 9
231692 lseek(9, 0, SEEK_SET) = 0
231692 read(9, "\0ZCK1\201gc\240\244\237E\361p\341\7u\230\263\223\16C\2446i", 25) = 25
231692 read(9, "\224\7<\254\324\302L\3773\2454\241O0n\215", 16) = 16
231692 read(9, "...."..., 537063) = 537063
...
231692 read(9, "(\265/\375...", 101) = 101
231692 read(9, "(\265/\375..."..., 937) = 937
...
So it seems to be reading some binary records of variable length, one by one.
grep over the return values shows:
537063 30724 101 937 500 371 499 337 2987 553 550 216 466 576 408 666 1146 520 334 764 445 317 365 292 748 692 293 739 404 722 340 492 917 1255 303 429 371 280 546 697 975 500 1028 833 313 1206 573 840 1034 368 611 580 1491 637 1849 864 1053 1170 1024 1129 737 812 921 1126 536 551 491 1203 803 1403 705 2839 311 955 571 1102 643 826 480 784 527 249 973 686 1469 955 721 615 1642 588 504 518 355 1312 986 2140 1467 1200 292 610 1125 2199 876 1298 258 678 3475 2941 2241 821 537 689 1435 670 1370 346 540 2843 2160 880 349 501 342 638 1574 585 610 417 998 907 681 2125 582 592 551 674 1017 664 676 599 442 574 551 646 481 685 586 532 364 1089 1089 1333 514 518 1912 2042 387 579 469 1151 1190 1068 3963 401 731 1692 308 627 618 1476 577 343 361 1254 282 283 736 701 403 430 582 213 216 505 1407 765 511 1103 537 357 399 447 423 627 345 370 493 425 630 455 834 671 637 374 406 508 609 264 277 537 304 570 921 806 338 415 432 499 361 512 555 461 433 888 300 390 263 572 1016 265 421 263 464 439 291 333 1073 720 645 329 848 506 591 1140 901 538 283 396 412 720 320 299 311 890 454 476 374 520 326 395 258 421 311 245 654 761 451 639 221 404 498 303 306 837 476 425 347 351 573 338 313 565 391 323 335 326 926 214 432 451 2391 439 661 596 436 494 500 369 372 892 230 625 289 426 337 306 640 279 331 1135 363 317 334 459 592 463 351 459 307 361 338 450 446 321 506 288 356 409 259 349 445 507 388 406 222 388 310 360 232 753 515 350 486 527 676 260 725 331 492 379 632 280 336 338 364 378 511 384 367 589 255 292 409 797 506 422 1251 358 349 363 334 363 548 436 538 587 395 299 327 435 291 298 810 437 297 420 430 349 463 550 231 336 748 351 354 343 502 403 565 280 352 438 315 393 650 400 779 435 271 309 747 503 283 338 447 298 261 291 30 2 520 401 313 386 474 457 394 634 814 428 434 302 443 668 398 334 310 690 563 322 392 332 361 455 552 294 380 516 304 586 395 418 570 298 298 464 282 278 612 611 331 379 333 525 490 347 361 503 502 360 333 437 434 783 267 631 617 431 353 375 446 507 458 278 272 
498 409 680 476 261 375 344 482 305 596 370 534 268 430 310 313 779 566 275 422 367 489 283 446 674 345 420 393 453 416 271 28 7 425 413 322 769 860 516 273 1057 262 427 665 774 453 312 993 888 428 489 396 407 331 453 849 496 262 573 636 601 657 424 585 540 353 478 394 353 283 513 317 465 244 423 710 260 599 322 251 535 806 481 393 575 282 633 365 388 367 242 249 354 314…
58033 reads, about half of them <4k.
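For anyone wanting to reproduce that count, here is a minimal sketch of tallying `read()` return values from saved strace output. It assumes the line layout shown in the trace above; the helper name `summarize_reads` and the 4096-byte threshold are illustrative choices, not part of any tool mentioned in this thread.

```python
import re

# Matches strace lines such as: 231692 read(9, "...", 101) = 101
# Failed calls (e.g. "= -1 EAGAIN ...") are deliberately not matched.
READ_RE = re.compile(r'\bread\((\d+),.*\)\s*=\s*(\d+)\s*$')

def summarize_reads(lines, fd, threshold=4096):
    """Return (total, small): the number of successful read() calls on
    descriptor `fd`, and how many of them returned fewer than `threshold` bytes."""
    sizes = []
    for line in lines:
        m = READ_RE.search(line)
        if m and int(m.group(1)) == fd:
            sizes.append(int(m.group(2)))
    small = sum(1 for s in sizes if s < threshold)
    return len(sizes), small

# A few lines shaped like the trace above (string contents abbreviated).
sample = [
    '231692 lseek(9, 0, SEEK_SET) = 0',
    '231692 read(9, "\\0ZCK1...", 25) = 25',
    '231692 read(9, "...."..., 537063) = 537063',
    '231692 read(9, "(\\265/\\375...", 101) = 101',
]
total, small = summarize_reads(sample, fd=9)
print(total, small)
```

On a real trace, the lines would come from a file written with `strace -o`, and `fd` would be whatever descriptor the `openat` call returned.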
Please implement buffered reads!
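To illustrate what buffering would change, the sketch below reads the same sequence of variable-length records once directly from a raw stream and once through a buffer, counting how often the underlying read is invoked. This is a generic Python illustration of the technique, not the code of libsolv or zchunk; the `CountingRaw` class, the 64 KiB buffer size, and the repeated record lengths (borrowed from the first few values in the list above) are assumptions made for the example.

```python
import io

class CountingRaw(io.RawIOBase):
    """In-memory raw stream that counts calls to the underlying read,
    standing in for read(2) system calls on a real file descriptor."""

    def __init__(self, data):
        super().__init__()
        self._data = data
        self._pos = 0
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        self.calls += 1
        n = min(len(b), len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n

def read_records(stream, sizes):
    """Read one record per entry in `sizes`, in order."""
    return [stream.read(n) for n in sizes]

sizes = [101, 937, 500, 371, 499, 337] * 100   # hypothetical record lengths
payload = bytes(sum(sizes))

# Unbuffered: every record costs one call on the raw stream.
raw_unbuffered = CountingRaw(payload)
direct = read_records(raw_unbuffered, sizes)

# Buffered: records are served from a 64 KiB buffer that is refilled as needed.
raw_buffered = CountingRaw(payload)
buffered = read_records(io.BufferedReader(raw_buffered, buffer_size=64 * 1024), sizes)

print(raw_unbuffered.calls, raw_buffered.calls)
```

Both runs return identical records; only the number of calls reaching the raw stream differs, which is the effect the request above is asking for.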
But...but... libsolv's zchunk implementation does use buffered reads. Are you sure that you're not using the libzchunk implementation? I.e., was libsolv compiled with WITH_SYSTEM_ZCHUNK set to true?
This was with rpm-4.17.0-4.fc35.x86_64, dnf-4.12.0-1.fc35.noarch, python3-hawkey-0.67.0-2.fc35.x86_64, libsolv-0.7.22-1.fc35.x86_64, zchunk-libs-1.2.2-1.fc35.x86_64 (following rpm -qR …). So it seems to be linked to zchunk-libs, at least based on this quick check.
Personally, I really hope that the bug is in some library that zchunk-libs is using, and that we only realize this after filing it there and have to move it again. Let's see how long we can keep the game going!
I'm closing this here. Please reopen if you find out that libsolv is to blame.
Was this intended to be closed?
Yes. Why do you think it should stay open?
I don't; I'm just wondering why you said it was being closed when it is still open.
Oh, sorry. Seems I hit the wrong button. Closing now.