dmlc/parameter_server

questions about darlin

Opened this issue · 2 comments

Hi muli,
I have read part of the darlin code, but I can't find an implementation of the API "void run" for the server and the worker. It seems that only the scheduler's run is used?

The run log also shows:
I0303 14:07:51.796239 16734 postoffice.cc:104] Scheduler has connected 2 servers and 2 workers
I0303 14:07:51.798970 16734 darlin.h:35] Train l_1 logistic regression by block coordinate descent
I0303 14:07:51.799216 16734 postmaster.cc:9] Found 8 files
I0303 14:07:51.799244 16734 postmaster.cc:13] Assign 8 files to 2 workers
I0303 14:07:51.803696 16734 bcd.h:53] Loaded 223836 examples in 0.004 sec
I0303 14:07:53.559219 16734 bcd.h:71] Preprocessing is finished in 1.755 sec
I0303 14:07:53.559294 16734 bcd.h:73] Features with frequency <= 4 are filtered
I0303 14:07:53.559429 16734 bcd.h:98] Features are partitioned into 449 blocks
I0303 14:07:53.559465 16734 bcd.h:120] Prior feature groups: 127, 120
I0303 14:07:53.559478 16734 darlin.h:43] Maximal allowed delay: 8
     |        training        |  sparsity |      KKT filter     |   time (sec.)
iter |   objective   relative |     |w|_0 | threshold  #activet | (app:min max) total
-----+------------------------+-----------+---------------------+--------------------
   0 | 1.45414e+05  1.000e+00 |      3601 |   1.0e+20  12754852 |      0.3  0.3   1.1
   1 | 1.44370e+05  7.226e-03 |      2916 |   2.9e-01      4162 |      0.3  0.3   1.1
   2 | 1.44197e+05  1.203e-03 |      2689 |   1.9e-03      3101 |      0.1  0.1   0.7
   3 | 1.44015e+05  1.264e-03 |      2493 |   6.1e-03      2763 |      0.1  0.1   0.7
   4 | 1.43952e+05  4.340e-04 |      2382 |   1.0e-03      2529 |      0.1  0.1   0.7
   5 | 1.43933e+05  1.366e-04 |      2325 |   5.3e-04      2407 |      0.1  0.1   0.7
   6 | 1.43921e+05  7.836e-05 |      2291 |   1.1e-04      2337 |      0.1  0.1   0.7
   7 | 1.43916e+05  3.719e-05 |      2276 |   7.8e-04      2299 |      0.1  0.1   0.7
   8 | 1.43912e+05  3.138e-05 |      2261 |   1.9e-04      2282 |      0.1  0.1   0.7
   9 | 1.43908e+05  2.278e-05 |      2260 |   2.8e-05      2271 |      0.1  0.1   0.7
  10 | 1.43906e+05  1.575e-05 |      2248 |   1.5e-03      2263 |      0.1  0.1   0.7
  11 | 1.43904e+05  1.541e-05 |      2240 |   1.2e-05  12754840 |      0.3  0.3   1.1

I also cannot find any log lines from the workers or the servers in this output.

mli commented

Yeah. In this batch algorithm, the scheduler issues tasks to the workers and
servers in run(), while the other nodes only need to accept calls from the
scheduler, so they have an empty run().
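
To make the pattern concrete, here is a minimal, single-process sketch of what "the scheduler drives, the others only accept calls" could look like. It is not the actual darlin or parameter_server code: `Node`, `Task`, `process()`, and the task names are hypothetical stand-ins, and a direct function call replaces the network message that a real deployment would send.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the parameter_server classes.
struct Task {
  std::string name;
};

class Node {
 public:
  explicit Node(std::string id) : id_(std::move(id)) {}
  virtual ~Node() = default;

  // Active entry point. Empty by default: a node that only reacts to
  // incoming calls has nothing to do here.
  virtual void run() {}

  // Called when a task arrives from the scheduler (a direct call here,
  // a network message in a real system). This sketch just records it.
  virtual void process(const Task& task) { handled_.push_back(task.name); }

  const std::vector<std::string>& handled() const { return handled_; }
  const std::string& id() const { return id_; }

 private:
  std::string id_;
  std::vector<std::string> handled_;
};

// Workers and servers keep the empty run(); they only answer process().
class Worker : public Node {
 public:
  using Node::Node;
};

class Server : public Node {
 public:
  using Node::Node;
};

// Only the scheduler overrides run(): it issues every task.
class Scheduler : public Node {
 public:
  Scheduler(std::vector<Node*> nodes, int iterations)
      : Node("scheduler"), nodes_(std::move(nodes)), iterations_(iterations) {
    assert(iterations_ >= 0);
  }

  void run() override {
    // One setup task, then one update task per iteration, sent to every node.
    for (Node* n : nodes_) n->process(Task{"load_data"});
    for (int it = 0; it < iterations_; ++it) {
      for (Node* n : nodes_) n->process(Task{"update_block"});
    }
  }

 private:
  std::vector<Node*> nodes_;
  int iterations_;
};
```

Calling `run()` on a `Worker` or `Server` in this sketch does nothing, while `Scheduler::run()` is where every task originates. That split is why a search for a non-trivial `run` on the worker and server side comes up empty.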

(In reply to leecy0405's message of Tue, Mar 3, 2015, 10:47 PM.)

Thanks a lot, I understand the mechanism now. Thank you for taking the time to answer.