/Linux-cgdcbxd

control group daemon to manage net_prio groups using DCB netlink events

Primary LanguageCGNU Lesser General Public License v2.1LGPL-2.1

Design
========

After cgroup system has taken shape, its time to have some basic tools
in user space which can enable a user to use the resource management
functionality effictively.

One of the needed functionality is rule based placement of a task.  In general,
there can be either uid or gid or exec based rules. Admin/root will
control/enforce uid/gid based rules and exec based rules can be defined by
user in a config file residing in user's home dir and fully controlled by user.

uid/gid based rules will be defined in /etc/cgrules.conf config file and
this file will be managed by root.

Basic idea is that to begin with provide facility to implement rules
based on uid and gid. So a hierarchy might look like as follows.

                        /mnt/cgroup
                        |       |
                      gid1      gid2
                     |   |
                   uid1 uid2
                  |   |
               proj1  proj2


Admin will write rules to control the resources among users. Then users
can manage their own cgroup at their own (proj1 and proj2) and control
the resources as they want.

Following are the few methods using which tasks can be placed in right
cgroups.

- Use pam_cgroup PAM plugin which will make sure users are placed in right
  cgroup at login time and any tasks launch after login, will continue to run
  in user's cgroup.

- Use command line tool "cgexec" to launch the task in right cgroup.

- Modify the program and use libcgroup provided APIs for placing a task
  in right cgroup before doing exec().

- Use "cgclassify" tool to classify a already running task.

- May be, a user space daemon can also be implemented which will listen to
  kernel events and place the task in right group based on the rules.
  This method involves few concerns.

	- Reliability of netlink socket. Messages can be dropped.
		- Change the netlink with a cgroup controller which exports the
	  	  events.

	- Delay incurred since the event took place and task was actually placed
  	  in right cgroup.

	- daemon will interfere with container's tasks which is not desired.

HOWTO
=====

Section 1:
----------
To use "cgexec" to place the task in right cgroup.

- make cgexec
- Run a task using cgexec. Following is the cgexec syntax.

 cgexec [-g <list of controllers>:<relative path to cgroup>] command [arguments]

 Note: Cgroup controllers and path are optional. If user does not know the
       right cgroup, cgexec will automatically place the task in right
       cgroup based on /etc/cgrules.conf

Example:
	cgexec -g *:test1 ls
	cgexec -g cpu,memory:test1 ls -l
	cgexec -g cpu,memory:test1 -g swap:test2 ls -l

Section 2
---------
To use "cgclassify" to place task in right cgroup.

- make cgclassify
- Pick a task's pid to be classified, and run
	cgclassify <list of pids>

Example:
--------
	cgclassify 2140 4325

	Note: After classification check out whether tasks 2140 and 4325
	are in the right cgroup or not (Based on rules in /etc/cgrules.conf)

Section 3:
----------
To use a pam plugin which will automatically place the task in right
cgroup upon login.

- Build pam_cgroup.so
	make pam_cgroup.so
- Copy pam_cgroup.so to /lib/security/
- Edit /etc/pam.d/su to make use of pam_cgroup.so session module upon
  execution of su.

example:
	Add following line at the end of /etc/pam.d/su file

session                optional        pam_cgroup.so

- Now launch a shell for a user "xyz" using su and the resulting shell
  should be running in the cgroup designated for the user as specified
  by cgrules.conf

  ex. "su test1"

Try similar things with other services like sshd.

Note: pam_cgroup.so moves the service providing process in the right cgroup
      and not the process which will be launched later. Due to parent child
      relationship, yet to be forked/execed process will launch in right
      group.

Ex. Lets say user root does "su test1". In this case process "su" is the
    one providing service (launching a shell) for user "test1". pam_cgroup.so
    will move process "su" to the user "test1"'s cgroup (Decided by the uid
    and gid of "test1"). Now once su forks/execs a shell for user test1,
    final shell is effectively running in the cgroup it should have been
    running based on /etc/cgrules.conf for user test1.


Section 4:
----------
To use cgrulesengd which will move a task to right cgroup based on
rules in /etc/cgrules.conf do following.

- build and install latest libcgroup.so
- build cgrulesengd
	make cgrulesengd
- specify some uid/gid based rules in /etc/cgrules.conf
- Mount some controllers and create an hierarchy of cgroups (matching
  your rules).
- Run cgrulesengd.
	- ./cgrulesengd
- Launch some task or login as a user to the sytem. Daemon should automatically
  place the task in right cgroup.

FAQ
===
Q. Why admin should not control "exec" based rules.
A. Unix file system provides access control only based on uid/gid. So
   even if admin starts putting tasks based on uid/gid, it can't enforce
   it. For example, consider following scenario.

  Lets say an admin creates following cgroup hierarchy.

			/container
			|	|
  		database	browser
		|    |		|     |
	      user1  user2     user1  user2

 Now admin wants to run all the database jobs under /container/database/
 and all the firefox jobs under /container/browser/. Based on which user
 launched it, jobs should go in either user1 or user2 dir.

 Now assume that database subdir has to more cpu resources as compared to
 browser subdir. Then a user, say user2, can always move his browser job
 also to /container/database/user2 dir to get more cpu resources and
 admin will not be able to control that.

 [Note: user2 will control what tasks can be added in /container/database/user2
  and will contol what further subdirs can be created under user2 dir. Root
  should not restrict the control to root only for practical purposes. Its
  something like that till /container/databse, admin controls the resources
  and below that how resources are further subdivided among various projects
  should be controlled by respective user].

In the light of above, it seems to make more sense that admin should enforce
rules only based on uid and gid. Probably later we can have a per user exec
based rules config file (~/.cgrules.conf), which can be parsed by cgrulesd
and then jobs launched by user will be placed in right cgroup based on
combination of rules in /etc/cgrules.conf and ~/cgrules.conf.