damo record for storing DAMOS actions applied history
honggyukim opened this issue · 26 comments
Hi SeongJae,
Sorry for throwing many draft ideas, but I'm just wondering if it's possible to keep the history of DAMOS actions applied.
The current damo schemes is very useful when operating our custom DAMOS actions, but we would like to keep the history when and how some regions are affected by the registered actions.
I don't have a strong idea how to display such data as of now, but I'm just leaving the idea for the future record.
Thanks!
Hi Honggyu,
Sorry for throwing many draft ideas
Your brilliant ideas are really helpful, please don't say that.
we would like to keep the history when and how some regions are affected by the registered actions.
Partly for this purpose, we have developed damo status and damo show --tried_regions_of. As I always say, the features are not yet stable, and therefore the interface could change, but we will support such capability anyway. I guess you are also aware of the feature, right? And if so, I guess you think those are insufficient for your case, because those show only a snapshot? For recording, I think you could use the feature repeatedly and save the output. I believe this will work if you are using a sufficiently large aggregation interval.
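A minimal sketch of the "use the feature repeatedly and save the output" approach. It runs a snapshot command at a fixed period and appends each timestamped output to a log file. The function name record_snapshots, the log file name, and the period are illustrative assumptions, not part of damo; the snapshot command is passed in by the caller.

```python
import subprocess
import time
from datetime import datetime, timezone


def record_snapshots(cmd, out_path, count, interval_sec):
    """Run `cmd` `count` times and append each timestamped output to `out_path`.

    `cmd` is an argument list for the snapshot command (hypothetical example:
    ["sudo", "damo", "status"]). Nothing here is damo-specific.
    """
    with open(out_path, "a", encoding="utf-8") as out:
        for i in range(count):
            stamp = datetime.now(timezone.utc).isoformat()
            # check=False: keep recording even if one snapshot attempt fails.
            result = subprocess.run(cmd, capture_output=True, text=True, check=False)
            out.write(f"=== snapshot {i} at {stamp} (exit {result.returncode}) ===\n")
            out.write(result.stdout)
            out.flush()
            if i + 1 < count:
                time.sleep(interval_sec)
```

For example, record_snapshots(["sudo", "damo", "status"], "damos_history.log", count=10, interval_sec=5.0) would keep ten snapshots taken five seconds apart. The period should be no shorter than the aggregation interval, since each snapshot has its own cost.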
I agree that it might not be sufficient for your case. That is, I think you might want to use a small aggregation interval and/or think the snapshot retrieving overhead is too high. If that's the case, I think we can make DAMOS apply the actions at its own time interval rather than the aggregation interval, or add yet another tracepoint for DAMOS tried regions, so that you can record entire tried regions for every trial.
Hi SeongJae,
And if so, I guess you think those are insufficient for your case, because those show only a snapshot?
Yes, I would like to keep the record and examine it when needed.
For recording, I think you could use the feature repeatedly and save the output.
I'm afraid that it affects the performance because I saw that writing commit to state looked a bit costly.
I think it would be useful to capture records only when some regions are affected by DAMOS actions, together with the detailed information of those regions.
or add yet another tracepoint for DAMOS tried regions, so that you can record entire tried regions for every trial.
That might be the one I was looking for. But it looks we should insert custom tracepoints inside DAMON kernel code.
Thanks very much for your explanation!
I'm afraid that it affects the performance
Agreed. Unless max_nr_regions is small enough and aggr_interval is large enough, the overhead could be significant. We're planning more control over the snapshot overhead for that reason, but the feature is obviously not designed for recording every finding in such a case.
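As a rough back-of-envelope illustration of why these two parameters matter: if every aggregation interval were snapshotted, the number of region records per second is bounded by max_nr_regions divided by the aggregation interval. The helper name and the example values below are assumptions for illustration only, and this is a simplified upper bound rather than a measurement.

```python
def max_region_records_per_sec(max_nr_regions, aggr_interval_us):
    """Upper bound on region records per second when every aggregation
    interval is snapshotted. This is a simplified model, not a measurement."""
    if aggr_interval_us <= 0:
        raise ValueError("aggr_interval_us must be positive")
    return max_nr_regions * 1_000_000 / aggr_interval_us


# Illustrative values: 1000 regions at most.
print(max_region_records_per_sec(1000, 100_000))    # 100 ms interval
print(max_region_records_per_sec(1000, 1_000_000))  # 1 s interval
```

With these example values the bound drops by a factor of ten when the aggregation interval grows from 100 ms to 1 s.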
That might be the one I was looking for. But it looks we should insert custom tracepoints inside DAMON kernel code.
Glad to hear that. And sure, it needs a DAMON (kernel) change. I will start working on it.
Glad to hear that. And sure, it needs a DAMON (kernel) change. I will start working on it.
Thanks very much for your support. It will be very helpful for collecting stats on how the DAMOS actions are applied, so that we can examine the final performance result.
An RFC patch for the kernel part change has been posted: https://lore.kernel.org/damon/20230827004045.49516-1-sj@kernel.org/
Thanks very much for your support. I will apply the patch and see how I can use it in damo as well.
Support of the feature in damo is still a todo item. I'm planning to add an option to damo record for specifying which scheme's tried regions to report, like damo show --tried_regions_of. At the moment, you could use perf to record it and show the results via a command like perf script.
_damon_result would need some update to deal with the traceevent format.
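A hedged sketch of the perf-based workaround. It only builds the perf record and perf script command lines. The event name damon:damos_before_apply is an assumption (the tracepoint name comes from the kernel change, and the damon: prefix is assumed to be its trace system); the output file name and helper names are made up for illustration.

```python
import shlex


def perf_record_cmd(event="damon:damos_before_apply",
                    output="damos.perf.data", duration_sec=30):
    """Build a system-wide `perf record` command for one tracepoint.

    The default event name is an assumption; check `perf list` on the
    target kernel before relying on it.
    """
    return ["perf", "record", "-a", "-e", event, "-o", output,
            "--", "sleep", str(duration_sec)]


def perf_script_cmd(input_path="damos.perf.data"):
    """Build the `perf script` command that dumps the recorded events."""
    return ["perf", "script", "-i", input_path]


print(shlex.join(perf_record_cmd()))
print(shlex.join(perf_script_cmd()))
```

The printed commands need root privileges, and DAMON with a scheme has to be running while perf records.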
Thanks. I think it'd be useful if the tracepoint is recorded with --record in schemes command as follows.
$ damo schemes --record -c config.json ...
We could also think about support running damo schemes and damo record separately as well.
it'd be useful if the tracepoint is recorded with --record in schemes command
I'm not sure if it would be a good option, since it might make the roles of schemes and record a little bit confusing. I think adding a --record option to the start command instead might make sense. That said, the record command supports --damos_* options. So, adding a --record option to the start command might not make much sense. From here, one question might follow: why is letting schemes do recording not good, while letting record do DAMOS control is ok? That's because DAMOS is a part of DAMON, while "recording" is somewhat related to DAMON-external components, including perf and damo's logic.
I agree this is somewhat confusing, and I was willing to even deprecate schemes command. I changed my mind recently, and am now thinking of keeping schemes command, with limited capabilities, only for beginners or people who have seen past demonstrations of the command.
I think the absence of documents for the commands might have confused you. Sorry for the inconvenience.
support running damo schemes and damo record separately
Good point. Nevertheless, this is already supported by damo. Executing damo record without a monitoring target argument makes it check if DAMON is running, and record its monitoring results. I think this should be clearly documented, but we don't have a good document for it yet. Sorry for the inconvenience. I'm going to write one.
Implemented a basic support of this feature via[1]. It passed only a minimal test, and the interface (option name and etc) might be changed in near future, though.
[1] 15fc065
Implemented a basic support of this feature via[1]. It passed only a minimal test, and the interface (option name and etc) might be changed in near future, though.
Sorry for the late response and thanks very much for the support.
I was willing to even deprecate schemes command.
I have a problem when running damo schemes background, it breaks the terminal. So it'd be much better if there is a way to run damo start for the equivalent to damo schemes -c action.json, but non-blocking, which means it returns immediately so that I can run other commands right away.
This is especially needed when writing an automated script, because damo schemes blocks while waiting for Ctrl-C to be pressed.
I have a problem when running damo schemes background, it breaks the terminal.
Let's say there is a script file as follows.
$ cat script.sh
#!/bin/bash -x
sudo ./damo schemes -c pageout.json &
DAMO_PID=$!
echo "damo pid: $DAMO_PID"
sleep 3 # Do something here!
sudo kill $DAMO_PID
If I run it, it terminates, and I saw that it stops the running kdamond properly.
$ ./script.sh
+ DAMO_PID=1256368
+ echo 'damo pid: 1256368'
damo pid: 1256368
+ sleep 3
+ sudo ./damo schemes -c pageout.json
Press Ctrl+C to stop
+ sudo kill 1256368
signal 15 received
However, it breaks my terminal and shows weird output as follows. I ran ps and pwd, but I don't see the characters that I typed, and the output is shown in a broken way.
PID TTY TIME CMD
1254435 pts/4 00:00:00 bash
1256463 pts/4 00:00:00 ps
$
/home/honggyu/work/damo
$
To avoid this problem, I think it'd be useful if there is a way to run damo schemes in a non-blocking way so that I can avoid running the command in background in my shell script.
Hi Honggyu,
it'd be much better if there is a way to run damo start for the equivalent to damo schemes -c action.json, but non-blocking
damo start supports the -c option. You should be able to do that with the option, e.g., damo start -c action.json. Please let me know if it doesn't work.
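For automated scripts, the non-blocking flow could be wrapped as below. This is only a sketch: damo start -c comes from the comment above, while damo stop, the sudo prefix, and the helper name damon_running are assumptions for illustration. The run parameter exists so the flow can be exercised without touching a real system.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def damon_running(config_path, damo="damo", run=subprocess.run):
    """Start DAMON with a config, yield to the caller, then stop DAMON.

    Assumes `damo start` returns right after DAMON is started, and that
    `damo stop` turns it off again.
    """
    run(["sudo", damo, "start", "-c", config_path], check=True)
    try:
        yield
    finally:
        # Stop DAMON even if the workload in the `with` body raised.
        run(["sudo", damo, "stop"], check=True)
```

A script would then use it as: with damon_running("action.json"): followed by the workload, with no background job or Ctrl-C handling involved.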
I have a problem when running damo schemes background, it breaks the terminal.
I tried your script on my test machine but the issue doesn't reproduce. I guess some more things are involved?
Hello,
I also had the same problem. I'm not 100% sure, but it seems like that problem is related to the issue described at https://askubuntu.com/questions/1459049/bash-script-launching-background-process-breaks-terminal-output-and-kills-backgr rather than to damo. How about trying to use sudo -b instead of &?
Right. The screen breaking problem is not from damo and even not related to this issue so we don't have to talk about it here.
I was willing to even deprecate schemes command.
I started to mention it because of this comment and I also think that we don't need schemes command separately.
Hi Honggyu, have you had a chance to test the feature[1] that we implemented for this issue? If so, could you please confirm if it works, or some bugs found?
[1] 15fc065
Hi Honggyu, have you had a chance to test the feature[1] that we implemented for this issue? If so, could you please confirm if it works, or some bugs found?
[1] 15fc065
Hi SeongJae, I thought that the following RFC patch was needed to test this feature.
An RFC patch for the kernel part change has been posted: https://lore.kernel.org/damon/20230827004045.49516-1-sj@kernel.org/
If not, then I need to know the command sequence for testing. Could you please give more guidance or update the document for this usage and expected output? Thanks.
Hi Honggyu, sorry for late response.
I thought that the following RFC patch was needed to test this feature.
You're right. I was thinking that you could test that from the damon/next tree or the mm tree. The pull request containing that was recently sent[1] to Linus, and merged into the mainline.
$ ../lazybox/git_helpers/find_change_from.py --subject "mm/damon/core: use nr_accesses_bp as a source of damos_before_apply tracepoint" linus/master
a72217ad596e ("mm/damon/core: use nr_accesses_bp as a source of damos_before_apply tracepoint")
I need to know the command sequence for testing.
You could use damo record with the --schemes_target_regions option. You could check the results with damo show or damo report as usual. Of course, DAMON with a scheme should be running.
[1] https://lore.kernel.org/mm-commits/20231101145447.60320c9044e7db4dba2d93e3@linux-foundation.org/
Hi SeongJae,
Thanks for your comment.
I've recorded masim with --schemes_target_regions as follows.
$ ./damo record --schemes_target_regions "./masim/masim masim/configs/stairs_30secs.cfg"
Press Ctrl+C to stop
initial phase: 87,119 accesses/msec, 5001 msecs run
phase 0: 94,896 accesses/msec, 2500 msecs run
phase 1: 91,556 accesses/msec, 2501 msecs run
phase 2: 94,948 accesses/msec, 2500 msecs run
phase 3: 93,795 accesses/msec, 2500 msecs run
phase 4: 61,970 accesses/msec, 2500 msecs run
phase 5: 92,956 accesses/msec, 2500 msecs run
phase 6: 93,795 accesses/msec, 2500 msecs run
phase 7: 92,798 accesses/msec, 2500 msecs run
phase 8: 93,480 accesses/msec, 2500 msecs run
phase 9: 93,443 accesses/msec, 2501 msecs run
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.145 MB damon.data ]
But I don't know how to see the result with the report command. I read the help message of damo report, then tried each report type as follows.
$ ./damo report raw
no monitoring result in the file
$ ./damo report nr_regions
# <percentile> <# regions>
$ ./damo report wss
# <percentile> <wss>
$ ./damo report heats --heatmap stdout
Traceback (most recent call last):
  File "/home/root/damo/./damo", line 116, in <module>
    main()
  File "/home/root/damo/./damo", line 113, in main
    subcmd.execute(args)
  File "/home/root/damo/_damo_subcmds.py", line 31, in execute
    self.module.main(args)
  File "/home/root/damo/damo_report.py", line 38, in main
    subcmd.execute(args)
  File "/home/root/damo/_damo_subcmds.py", line 31, in execute
    self.module.main(args)
  File "/home/root/damo/damo_heats.py", line 314, in main
    set_missed_args(args, records)
  File "/home/root/damo/damo_heats.py", line 200, in set_missed_args
    guide = guides[0]
IndexError: list index out of range
Could you help me out by showing the exact command sequence and the output that I can expect with the feature? Thanks.
I used the kernel version as follows.
$ uname -r
6.6.0-14651-gd2f51b3516da
Hi Honggyu,
The command you used ($ ./damo record --schemes_target_regions "./masim/masim masim/configs/stairs_30secs.cfg") wouldn't install any DAMOS scheme. Hence there are no scheme target regions and nothing to record.
I confirmed that installing a scheme using --damos_* command line arguments like below works.
$ sudo ./damo record --damos_action pageout --damos_access_rate 0% 0% --damos_age 2s max --schemes_target_regions "../masim/masim ../masim/configs/stairs_30secs.cfg"
[...]
$ sudo ./damo report raw
base_time_absolute: 55 m 13.210 s
monitoring_start: 0 ns
monitoring_end: 3.722 s
monitoring_duration: 3.722 s
target_id: 0
nr_regions: 19
# start_addr end_addr length nr_accesses age
563226a71000-56322741e000 ( 9.676 MiB) 0 20
56322741e000-5632279b9000 ( 5.605 MiB) 0 20
7fffde5a9000-7fffde5c8000 ( 124.000 KiB) 0 20
7fffde5ca000-7fffde5f4000 ( 168.000 KiB) 0 20
56322718f000-5632279b9000 ( 8.164 MiB) 0 20
56322659f000-5632271e2000 ( 12.262 MiB) 0 20
7fffde5a9000-7fffde5c8000 ( 124.000 KiB) 0 20
[...]
Could you please test again like above?
Hi SeongJae,
Sorry for the late answer and also thanks for your help as always.
I've just tested what you suggested, and found that damo record works fine.
$ ./damo record --damos_action pageout --damos_access_rate 0% 0% --damos_age 2s max --schemes_target_regions "./masim/masim ./masim/configs/stairs_30secs.cfg"
Press Ctrl+C to stop
initial phase: 90,185 accesses/msec, 5001 msecs run
...
phase 9: 93,008 accesses/msec, 2500 msecs run
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.215 MB damon.data (150 samples) ]
Then damo report raw also shows the result as follows.
$ ./damo report raw
base_time_absolute: 2 h 30 m 48.782 s
monitoring_start: 0 ns
monitoring_end: 2.829 s
monitoring_duration: 2.829 s
target_id: 0
nr_regions: 13
# start_addr end_addr length nr_accesses age
55d6fd215000-55d6fde5c000 ( 12.277 MiB) 0 20
7ffda9ecd000-7ffda9ee7000 ( 104.000 KiB) 0 20
55d6fde5c000-55d6feb50000 ( 12.953 MiB) 0 20
55d6feb50000-55d6fefc1000 ( 4.441 MiB) 0 20
7f8fea640000-7f8feab4d000 ( 5.051 MiB) 0 20
55d6fd215000-55d6fde5c000 ( 12.277 MiB) 0 20
55d6fde5c000-55d6feb50000 ( 12.953 MiB) 0 20
7f8fea642000-7f8feab4d000 ( 5.043 MiB) 0 20
55d6feb50000-55d6fefc1000 ( 4.441 MiB) 0 20
7ffda9ecd000-7ffda9f25000 ( 352.000 KiB) 0 20
55d6fd215000-55d6fdee0000 ( 12.793 MiB) 0 20
55d6fdee0000-55d6febe9000 ( 13.035 MiB) 0 20
7f8fea642000-7f8feab4d000 ( 5.043 MiB) 0 20
...
The only thing I would like to confirm is whether the list of region information above only shows the regions that fit into the DAMOS scheme rule. Thanks.
whether the list of region information above only shows the regions that fit into the DAMOS scheme rule.
It should be. If not, that's something we need to investigate. Please let us know if you find such a case.
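One way to check this mechanically is to parse the damo report raw output and flag any region that falls outside the scheme's rule. The sketch below assumes the line layout shown in the output above, and that age is counted in aggregation intervals (so 2s would correspond to 20 with a 100 ms interval). parse_regions and violations are hypothetical helper names, not part of damo.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    end: int
    nr_accesses: int
    age: int


# Matches lines such as: "55d6fd215000-55d6fde5c000 ( 12.277 MiB) 0 20"
_REGION_LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s+\(.*?\)\s+(\d+)\s+(\d+)\s*$")


def parse_regions(text):
    """Extract region records; header and metadata lines are skipped."""
    regions = []
    for line in text.splitlines():
        m = _REGION_LINE.match(line.strip())
        if m:
            regions.append(Region(int(m[1], 16), int(m[2], 16), int(m[3]), int(m[4])))
    return regions


def violations(regions, max_nr_accesses=0, min_age=20):
    """Return regions that do not fit the assumed rule:
    nr_accesses <= max_nr_accesses and age >= min_age."""
    return [r for r in regions if r.nr_accesses > max_nr_accesses or r.age < min_age]
```

Feeding the saved report text to violations(parse_regions(text)) and getting an empty list would mean every recorded region fits the rule under these assumptions.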
Thanks. I will use this feature and tell you if something looks incorrect. It looks to be working fine as far as I've tested.