smuellerDD/jitterentropy-library

rng-tools + jitterentropy always hangs when the internal timer is used


On some boards with a coarse timer, the library falls back to the internal timer. But that makes rngd hang indefinitely (rngd -f -x hwrng -x rdrand). We can reproduce this issue on any board by forcing use of the internal timer as below:

diff --git a/jitterentropy-base.c b/jitterentropy-base.c
index 6dbb484..07397cb 100644
--- a/jitterentropy-base.c
+++ b/jitterentropy-base.c
@@ -1387,7 +1387,8 @@ int jent_entropy_init(void)
        if (sha3_tester())
                return EHASH;
 
-       ret = jent_time_entropy_init(0);
+       //ret = jent_time_entropy_init(0);
+       ret = 1;
 
 #ifdef JENT_CONF_ENABLE_INTERNAL_TIMER
        jent_force_internal_timer = 0;
-- 
2.17.1
# cat /proc/cpuinfo | grep processor
processor	: 0
processor	: 1
processor	: 2
processor	: 3
processor	: 4
processor	: 5
processor	: 6
processor	: 7

@smuellerDD
Yes, the hang on boards with more than one CPU is gone after applying the patch nhorman/rng-tools@4f79ad1 as mentioned in nhorman/rng-tools#123. I also look forward to your guidance on a better way than nhorman/rng-tools@4f79ad1 to address the hang on a system with more than one CPU.

BTW, on a system with only one CPU, the patch nhorman/rng-tools@4f79ad1 doesn't help. @nhorman

On Friday, 30.04.2021 at 01:00 -0700, Mingli-Yu wrote:

> Yes, the hang on boards with more than one CPU is gone after applying the patch nhorman/rng-tools@4f79ad1 as mentioned in nhorman/rng-tools#123.

Ah, good, so the internal noise source still works as intended.

> But on a system with only one CPU, the patch nhorman/rng-tools@4f79ad1 doesn't work.

Yes, that is an issue I am currently working on. Even the separate jitterentropy-rngd application sometimes suffers from it on systems with one CPU. For immediate use, I can only recommend disabling the internal timer on such boards. But that means the Jitter RNG will not work due to the coarse timer.

Thanks very much for your information! Looking forward to a solution for the Jitter RNG on boards with a coarse timer.

@smuellerDD what if you modified the library such that, instead of providing an option to simply enable a timer, you provided a method to register a software timer thread? Such a method would only be usable if jitter_entropy_init returned ECOARSETIME, and it would allow the application to create, manage, and define the timer thread's attributes (like affinity) fairly easily.

I'd be OK with that approach. The only note I would add is that we need a way to detect whether an external timer function is warranted, i.e. whether jitter_entropy_init fails with ECOARSETIME or some such.

Neil, can you please have a look at https://github.com/smuellerDD/jitterentropy-library/tree/external_threading

Effectively you need to implement the functions in struct jent_notime_thread and register them with jent_entropy_switch_notime_impl as the first call.

I have updated the internal threading code to use the same logic. Thus the internal threading implementation defined with jent_notime_thread_builtin should serve as a guide.
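
For illustration, here is a minimal sketch of such a registration, assuming the callback set mirrors the builtin pthread implementation (the exact member names and signatures should be checked against jitterentropy.h on the external_threading branch; everything below named my_* is hypothetical application code):

#include <stdlib.h>
#include <pthread.h>
#include <jitterentropy.h>

/* Allocate the per-context thread handle. */
static int my_notime_init(void **ctx)
{
        pthread_t *thread = calloc(1, sizeof(pthread_t));

        if (!thread)
                return 1;
        *ctx = thread;
        return 0;
}

static void my_notime_fini(void *ctx)
{
        free(ctx);
}

/* The application controls the thread here; attributes such as CPU
 * affinity could be set via a pthread_attr_t before creation. */
static int my_notime_start(void *ctx, void *(*start_routine)(void *),
                           void *arg)
{
        return pthread_create((pthread_t *)ctx, NULL, start_routine, arg);
}

static void my_notime_stop(void *ctx)
{
        pthread_join(*((pthread_t *)ctx), NULL);
}

static struct jent_notime_thread my_notime_thread = {
        .jent_notime_init  = my_notime_init,
        .jent_notime_fini  = my_notime_fini,
        .jent_notime_start = my_notime_start,
        .jent_notime_stop  = my_notime_stop
};

/* Must be the first call into the library, before jent_entropy_init(). */
int my_register_timer_thread(void)
{
        return jent_entropy_switch_notime_impl(&my_notime_thread);
}

With something like this in place, the library would use the application-provided thread as the soft timer whenever the hardware timer is too coarse.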

Ok, for reference I just pushed this for testing:
https://github.com/nhorman/rng-tools/tree/notime

It makes use of the new timer registration work @smuellerDD has done here. I don't have a system with a coarse enough timer to test it thoroughly, but @Mingli-Yu, if you could test it, I would appreciate it. Note that building my branch requires that libjitterentropy be updated such that -fvisibility=hidden is removed from the CFLAGS, as mentioned above.

Done, I've updated my branch to allow for the use of your referenced flag, and exported it as an option to the jitter source. I've done some testing with that, and it seems to work reasonably well. One thing I noticed, which I hadn't considered previously: the API you export seems to assume that we will use a separate timer thread per entropy context. Since rngd allocates an entropy context per CPU, that's a difficult model to work with, as it implies the creation of an equal number of timer threads, which amplifies the thread contention issue and continues to drive the hang condition. Currently I work around that by limiting rngd's thread count to 1 when using soft timers, but I wonder if it would be worth altering the model to allow for the creation of a single thread and a reference count that tracks the number of timer starts and stops, so that we can use a single thread for all soft timer functions. Thoughts?

I'm not sure why we have to force a 1:1 relationship between RNG threads and timer threads. All the timer thread is doing is simulating a free-running counter. Multiple RNG threads can use a single timer thread to do that if they simply read the timer thread's counter when they want to 'start' a timer, and again when they 'stop' it, then compute the delta. That seems like it would be much more efficient to me.
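
As a rough sketch of that shared-counter model (hypothetical application-level code, not what the library currently implements), assuming C11 atomics:

#include <stdatomic.h>
#include <stdint.h>

/* One free-running counter shared by all RNG threads. */
static _Atomic uint64_t soft_counter;
static _Atomic int soft_counter_stop;

/* The single timer thread does nothing but increment the counter. */
static void *soft_counter_thread(void *unused)
{
        (void)unused;
        while (!atomic_load_explicit(&soft_counter_stop,
                                     memory_order_relaxed))
                atomic_fetch_add_explicit(&soft_counter, 1,
                                          memory_order_relaxed);
        return NULL;
}

/* Any RNG thread "starts" a timer by sampling the counter ... */
static uint64_t soft_timer_start(void)
{
        return atomic_load_explicit(&soft_counter, memory_order_relaxed);
}

/* ... and "stops" it by sampling again and taking the delta. */
static uint64_t soft_timer_delta(uint64_t start)
{
        return atomic_load_explicit(&soft_counter, memory_order_relaxed)
               - start;
}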

I figured out the jent_notime_init trick.

I concur that on a single-core system, there's just no way this mechanism is going to work, at least not without additional cooperation. The threads will contend with each other and one will wind up blocking the other. We could perhaps finagle some yielding mechanism, but I think it's probably best to just fail the init for now.

@Mingli-Yu any test result confirmation?

@nhorman @smuellerDD
Thanks very much for your update! It works on systems with more than one CPU.

So when will https://github.com/nhorman/rng-tools/tree/notime and https://github.com/smuellerDD/jitterentropy-library/tree/external_threading be merged to the master branches? And which release version will include these changes?

I just merged my changes. I've got a few features that could benefit from a tagged release, but why can't you just build from the head of the git tree?

@nhorman
Thanks for your update! I can build from the head for testing, but it's better to have it in a release version; I can see it's included in 6.13 now, thanks very much!

@smuellerDD
BTW, it seems the change for jitterentropy-library hasn't been merged yet; waiting for the update!

Hello, Stephan,

I've got a report that rng-tools-6.13 (with the notime patchset) linked against the jitterentropy-3.0.2 (a0c51e2, tag: v3.0.2) library consumes 100% CPU on a single-core older AMD Athlon 64 CPU. I've got a process core dump, and it looks like the problem is the one discussed in this issue #37. Could you please have a look at the thread stacks and confirm whether I'm correct or whether we're seeing a separate new issue?

Stacks are:

(gdb) info threads
  Id   Target Id                        Frame 
  1    Thread 0x7f65f3481840 (LWP 7401) __libc_read (nbytes=16, buf=0x55da04aadd00 <key.lto_priv>, fd=4)
    at ../sysdeps/unix/sysv/linux/read.c:26
^^^^^ sleeps on read()

* 2    Thread 0x7f65f3480640 (LWP 7405) 0x00007f65f4105f80 in jent_get_nstime_internal (ec=0x55da0557bed0, 
    out=0x7f65f347fb98) at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:760

  3    Thread 0x7f65f2c7f640 (LWP 7406) jent_notime_sample_timer (arg=0x55da0557bed0)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:702
^^^^^ these two

(gdb) t 2
[Switching to thread 2 (Thread 0x7f65f3480640 (LWP 7405))]
#0  0x00007f65f4105f80 in jent_get_nstime_internal (ec=0x55da0557bed0, out=0x7f65f347fb98)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:760
760			while (ec->notime_timer == ec->notime_prev_timer)

(gdb) bt
#0  0x00007f65f4105f80 in jent_get_nstime_internal (ec=0x55da0557bed0, out=0x7f65f347fb98)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:760
#1  0x00007f65f410634d in jent_measure_jitter (ec=0x55da0557bed0, loop_cnt=0, ret_current_delta=0x0)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:1036
#2  0x00007f65f4106410 in jent_random_data (ec=0x55da0557bed0)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:1068
#3  0x00007f65f410649e in jent_read_entropy (ec=0x55da0557bed0, 
    data=0x7f65ec000b60 "UV\376\213˷Y/\236\346o\231\352\205\r\aױC-\220", len=16375)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:1124

(gdb) t 3
[Switching to thread 3 (Thread 0x7f65f2c7f640 (LWP 7406))]
#0  jent_notime_sample_timer (arg=0x55da0557bed0)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:702
702			if (ec->notime_interrupt)

(gdb) bt 
#0  jent_notime_sample_timer (arg=0x55da0557bed0)
    at /usr/src/debug/jitterentropy-3.0.2-1.fc34.x86_64/jitterentropy-base.c:702
#1  0x00007f65f3d26299 in start_thread (arg=0x7f65f2c7f640) at pthread_create.c:481

which are:

static inline void jent_get_nstime_internal(struct rand_data *ec, uint64_t *out)
{
        if (ec->enable_notime) {
                while (ec->notime_timer == ec->notime_prev_timer) // <<< LOOPS HERE
                        ;
        ...

static void *jent_notime_sample_timer(void *arg)
{
        struct rand_data *ec = (struct rand_data *)arg;
        ec->notime_timer = 0;

        while (1) {
                if (ec->notime_interrupt) // <<< LOOPS HERE
                        return NULL;
                ec->notime_timer++;
        }
        ...

In case the core itself is needed, please see the public bz https://bugzilla.redhat.com/show_bug.cgi?id=1974132. The core is attached there:

requested core dump for faulty jitterentropy-3.0.2-1.fc34 and rng-tools-6.13-2.fc34 (762.93 KB, application/gzip)

Admittedly, I don't have deep knowledge of the jitterentropy lib, but to me the above looks much like this issue #34.

Thank you and best regards,
Vladis

That's... odd. I agree that the issue looks similar, but I think it's subtly different. Looking at the bz, it appears this system has a single CPU and is (possibly) returning an error from jitterentropy during init indicating its timer is too coarse, and as such is starting the software timer. That much is identical to this problem, but the initial report here was on a system with multiple cores. In a multi-core situation a soft timer is helpful because the timer thread can run in parallel to the rngd thread, allowing progress to be made; in this case that's impossible (due to there being only a single core). Here jitterentropy probably has no chance of working: the hardware counter is too slow to provide a reasonable timer base, and the software timer will just interfere with normal operation. I think, unfortunately, that jitter, in this case, should just fail to initialize. Let me create a branch to test that use case for you.

https://github.com/nhorman/rng-tools/tree/onecpu

Not sure if that's the final solution (and it's untested), but it should add code that causes jitterentropy to fail if the hw counter is too coarse and there's only a single CPU to execute on. It won't fully solve the reporter's issue, but it will prevent jitter from running on a system in this pessimal case and prevent CPU saturation. They'll likely need to find another source to gather entropy from (maybe the rdrand source, assuming their CPU isn't affected by the AMD rdrand issue).

Hello, @smuellerDD, @nhorman,

Thanks a ton for the help here. I believe "the software timer will just interfere with normal operation ... jitter, in this case, should just fail to initialize" is indeed the correct approach. I still have a number of small concerns; can we please discuss them?

  1. It is just a little strange to me that version 2.2.0 of the lib was working on the reporter's old single-CPU system, since 4b062d0 was only added in 3.0.0. Do I understand correctly that the lib was actually not working properly on this system?

  2. Thanks for the onecpu patch. I will add it and build. A note, though: shouldn't it be sysconf(_SC_NPROCESSORS_ONLN) in rngd_notime_init(), since we only care about the CPUs we can schedule on?

  3. I believe this check should also be added to the jitterentropy lib itself. The lib should not do this check if an app registers its own timer thread via {init,start,stop}_func(); in that case, yes, the app should be responsible for the check. But when the default (the lib's own) timer function is used, the lib should do the check. For now rng-tools is the only user in Fedora, but other apps may later want to use the lib without providing an app-side timer thread. @smuellerDD, I would be happy to hear your thoughts about this suggestion.

Thank you.

@nefigtut in response to your points above:

  1. This is a very interesting datapoint, one that I didn't recognize before. While it doesn't confirm it, it suggests that this system has a hardware timer that is fine-grained enough not to need a software timer (which would be my expectation). That in turn suggests we are trying to initialize the software timer when we don't need to. It would be worth investigating what the return code from jent_time_entropy_init(0) in jent_entropy_init is. We attempt to start the software timer infrastructure there if that call fails, and if it used to work, it shouldn't be failing now.

  2. Yes, you're right. I borrowed that code from the init function, and it likely should be changed in both places. Please feel free to do so.

  3. No strong opinion on this in either direction. We can use the patch I have for testing purposes, and decide on the correct solution when we fully understand the problem, IMO.

The threading code has been merged into the master branch.

I have added a helper that detects the number of available CPUs. Only when 2 or more CPUs are available is the internal timer thread started.
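
A minimal sketch of such a gate, assuming POSIX sysconf(); the helper name and the error handling below are illustrative, not the library's actual symbols:

#include <unistd.h>

static long my_ncpu(void)
{
        /* _SC_NPROCESSORS_ONLN counts only the CPUs we can currently
         * be scheduled on, per the suggestion earlier in this thread. */
        long ncpu = sysconf(_SC_NPROCESSORS_ONLN);

        return (ncpu <= 0) ? 1 : ncpu;
}

static int my_start_internal_timer_thread(void)
{
        /* On a single CPU the sampling thread and the RNG thread would
         * starve each other, so refuse to start the soft timer. */
        if (my_ncpu() < 2)
                return 1;

        /* ... create and start the timer thread as before ... */
        return 0;
}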

With this change I think the discussed issues should be resolved. I am closing this issue - if there are still issues lurking, please feel free to reopen or open a new issue.

Thanks to all for support.

I do not intend to re-open this issue; I just want to thank Stephan. Thank you, Stephan, for your responses. I'll build jitterlib-v3 and newer rng-tools for Fedora and ask the people reporting issues if they are willing to test. Thank you.

@smuellerDD - Would you consider tagging a 3.0.3 release now that this issue has been fixed? It will make it easier for major distros to rebuild.

@smuellerDD I would like to request accepting one more fix first, if possible. Unfortunately, there is one more timer-related issue. I'll post the PR on Monday if the issue reporter replies quickly, or I'll create a GitHub issue if he does not (in that case I cannot suggest proper fixing code, only describe the issue). On the other hand, this is quite a corner case, so it may very well wait for 3.0.4/.5.

I am going to tag 3.1.0 shortly. All changes I wanted to have in are now in.

@smuellerDD - Ping :D