
An open framework for collecting and analyzing HPC metrics.


Open XDMoD

XDMoD (XD Metrics on Demand) is an NSF-funded open source tool designed to audit and facilitate the utilization of the XSEDE cyberinfrastructure by providing a wide range of metrics on XSEDE resources, including resource utilization, resource performance, and impact on scholarship and research. The XDMoD framework is designed to meet the following objectives:

  1. Provide the user community with a tool to manage their allocations and optimize their resource utilization,
  2. Provide operational staff with the ability to monitor and tune resource performance,
  3. Provide management with a tool to monitor utilization, user base, and performance of resources, and
  4. Provide metrics to help measure scientific impact.

While initially focused on the XSEDE program, Open XDMoD has been created to be adaptable to any HPC environment.

For more information, including information about additional Open XDMoD capabilities provided as optional modules, please visit the Open XDMoD website.

Support

Support is available by emailing ccr-xdmod-help@buffalo.edu. Please include the following in your support request. Failure to include this information may delay support. See the Open XDMoD Support page for additional information.

  • Open XDMoD version number
  • Operating system and version where Open XDMoD is installed
  • The output of xdmod-check-config
  • PHP and MySQL versions (e.g., the output from php --version, mysql --version, and the SQL command SHOW VARIABLES LIKE "%version%";)
  • A description of the problem you are experiencing
  • Detailed steps to reproduce the problem
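
As a convenience, the version details requested above could be gathered with a small shell helper along the following lines. This is only a sketch: the function names and the suggested output file name are illustrative, /etc/os-release is assumed to exist on your distribution, and the MySQL query assumes the mysql client can connect with your default credentials. The Open XDMoD version number, problem description, and reproduction steps still need to be added by hand.

```shell
# Run one command and record its output, or note that it is unavailable.
# Arguments: a section title, then the command and its arguments.
run_section() {
    title=$1
    shift
    printf '== %s ==\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(command exited with an error)\n'
    else
        printf '(%s not found on PATH)\n' "$1"
    fi
    printf '\n'
}

# Print the environment details listed in the support checklist.
collect_support_info() {
    run_section "Operating system" cat /etc/os-release
    run_section "xdmod-check-config" xdmod-check-config
    run_section "PHP version" php --version
    run_section "MySQL client version" mysql --version
    # Assumption: default credentials (e.g. from ~/.my.cnf) are sufficient.
    run_section "MySQL variables" mysql -e 'SHOW VARIABLES LIKE "%version%";'
}

# Usage (illustrative file name):
#   collect_support_info > xdmod-support-info.txt
```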

Modules for Open XDMoD

In addition to the analysis functionality provided by the main Open XDMoD package, Open XDMoD also supports extending its capabilities through modules. The XDMoD team currently supports the modules below.

Application Kernels

This module enables analysis of "application kernels", which are regression tests for the performance of an HPC system and the software that runs on it. For more information, visit the Application Kernels Module website and repository.

Job Performance (SUPReMM)

This module enables analysis of individual and aggregate job performance using hardware data from CPUs, memory, filesystems, network interfaces, and more. For more information, visit the Job Performance Module website and repository.

Open OnDemand

This module enables display and analysis of usage of Open OnDemand. For more information, visit the module website and repository.

Installation

Prebuilt packages of Open XDMoD are available as releases on GitHub. Packages for Open XDMoD modules are available as releases in their respective repositories.

See the installation instructions on the Open XDMoD website for additional information.

Contributing

Feedback is always welcome, and contributions are greatly appreciated! Before getting started, please see our contributing instructions and guidelines.

In short, the steps to take are:

  1. Fork any repositories for Open XDMoD or its modules that you wish to work on.
  2. Clone and set up the repositories on your local system (see "Developing", below).
  3. Develop your work and test it, ensuring your work follows our contributing guidelines.
  4. Push your work to your forks.
  5. Open pull requests for your work from your forks to the central repositories. The pull requests will then be reviewed by the XDMoD team.

Developing

Development on Open XDMoD and its modules can be started using either Repo or Git. If you are unsure which to start with, try Repo, as it is easy to transition from a Repo workflow to a pure Git workflow. If you don't want to install yet another tool, using Git will work just fine.

Whichever tool you choose, we recommend keeping your various repository clones inside a dedicated directory on your local system. This will make it easier to use quality assurance tools locally.

Before starting with either, however, you will want to fork any repositories you are interested in working on. Simply visit the repositories' pages on GitHub and click the Fork button. Once you have finished working on a feature or bug fix for a project, push the work to your fork and open a pull request against the main repo for that project.

Using Git

To get started on core Open XDMoD development, simply clone the Open XDMoD repository.

To work on an Open XDMoD module, one option is to clone the module repository directly inside of your Open XDMoD repository's open_xdmod/modules directory. If your module repository is named with the xdmod- prefix, remove it from the clone's directory name. Alternatively, you may clone the module repository elsewhere and create a symbolic link to it from open_xdmod/modules.

For example, to work on your fork of the SUPReMM module using a direct clone, run this command, substituting in your GitHub username and Open XDMoD repo location:

git clone git@github.com:[username]/xdmod-supremm [xdmod_repo]/open_xdmod/modules/supremm

To work on your fork of the SUPReMM module using an external clone and a symbolic link, run these commands, substituting in your GitHub username and the relevant paths:

git clone git@github.com:[username]/xdmod-supremm [supremm_repo]
ln -s [supremm_repo] [xdmod_repo]/open_xdmod/modules/supremm
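
The prefix rule above can also be expressed in shell. The sketch below derives the directory name from a repository name and prints (rather than runs) the matching clone command. The function names, the username, and the path are hypothetical placeholders.

```shell
# Derive the directory name expected under open_xdmod/modules from a
# repository name by dropping a leading "xdmod-" prefix, if present.
module_dir_name() {
    printf '%s\n' "${1#xdmod-}"
}

# Print the clone command for a fork of a module without running it.
# Arguments: GitHub username, module repository name, path to the Open XDMoD clone.
module_clone_command() {
    printf 'git clone git@github.com:%s/%s %s/open_xdmod/modules/%s\n' \
        "$1" "$2" "$3" "$(module_dir_name "$2")"
}

# Placeholder values; substitute your own username and repository path.
module_clone_command your-username xdmod-supremm /path/to/xdmod
```

Repository names without the prefix are passed through unchanged, so the same helper works for custom modules.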

Using Repo

To assist with initial setup and development across Open XDMoD and its modules, we support the use of Repo, a tool built by the Android development team to help manage multi-repository projects. It can help you get started by setting up multiple repositories at once, and it provides some convenience functions for working across repositories. We supply a Repo manifest repository that can be used to get started with Open XDMoD and first-party modules.

The steps below will get you started; see the Repo documentation for further details. At any point, standard Git commands may be used in individual project directories, as the directories are simply standard Git repositories.

To clone Open XDMoD and first-party modules, run the following commands, substituting in the branch for the version you wish to base your work on (e.g. xdmod6.5):

repo init -u git@github.com:ubccr/xdmod-repo-manifest -b [branch]
repo sync

Note that unlike git clone, Repo does not automatically create a local branch tracking the initial branch that was checked out (although you can create these branches manually, if you wish, by running repo forall -c 'git checkout [branch]').

To add forks of the various projects to all repositories at once, run this command, substituting in your GitHub username:

repo forall -c 'git remote add origin git@github.com:[username]/$REPO_PROJECT'

To check that the above command worked correctly, you can run this command:

repo forall -c 'git fetch origin'

Now that the repositories have been set up, you can use standard Git commands in each repository. Repo also provides some convenience functions for performing tasks across all repositories. For example, repo status will display the current branch and changes for all repositories. You can also use repo forall to execute any shell command in all repositories.

Custom Modules with Repo

If you are working on custom modules for Open XDMoD, you can tell Repo where to find them using one of two methods.

If you want to use a custom configuration in multiple places, you can fork or clone the XDMoD Repo manifest repository and apply modifications there. (Note that a fork, like the source repo, will be public. If you wish to keep your custom configs private, clone the main repo directly and treat the clone as an independent repo. You can then push the clone to your own private server, or you can just keep it local to your machine.) The existing entries for various Open XDMoD modules can be used as templates for your custom modules. The format of the Repo manifest file is described in greater detail in the Repo documentation. Once you have made your changes and committed them, run these commands to pull in your changes:

repo init -u [local_or_remote_path_to_manifest_repo] -b [branch_with_changes]
repo sync

If you just want to make some small changes locally, you can add local manifest files to .repo/local_manifests that will extend the main manifest file being used. More information about the manifest file format and local manifest files may be found in the Repo documentation. Once you have made the desired changes, run repo sync to pull the changes in.
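
As a rough illustration of the local-manifest approach, the sketch below writes a single local manifest file into a Repo workspace. Every value in the XML is a hypothetical placeholder: the file name, remote name, fetch URL, project name, checkout path, and revision. Mirror the entries in the main manifest for the real layout.

```shell
# Write a local manifest that adds one custom module to a Repo workspace.
# Argument: path to the workspace (the directory containing .repo).
write_local_manifest() {
    workspace=$1
    mkdir -p "$workspace/.repo/local_manifests" || return 1
    cat > "$workspace/.repo/local_manifests/custom-modules.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<manifest>
  <!-- Hypothetical remote that hosts your custom module. -->
  <remote name="custom" fetch="ssh://git@git.example.com/" />
  <!-- Hypothetical project; copy the path convention from the main manifest. -->
  <project name="xdmod-mymodule" path="mymodule" remote="custom" revision="main" />
</manifest>
EOF
}

# Usage (illustrative path):
#   write_local_manifest /path/to/workspace
#   (cd /path/to/workspace && repo sync)
```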

Installing Quality Assurance Tools

If you wish to locally install and use the quality assurance tools that will be used to check your code, you can do so by following the instructions on the Open XDMoD QA repository page.

Building

Dependencies

See Software Requirements.

Build Dependencies

NOTE: Modules for Open XDMoD may have their own build dependencies.

Steps

The examples in the steps below apply to Open XDMoD, but similar procedures may be followed to build modules for Open XDMoD as well. Simply ensure that the modules (or symbolic links to the modules) are present in open_xdmod/modules and do not have the xdmod- prefix. For example, to build the SUPReMM module (which is stored in the repository xdmod-supremm), clone it or create a symbolic link to it at open_xdmod/modules/supremm.

Source

This process has been tested on CentOS 7 (with Composer 1.10.25) and Rocky 8 (with Composer 2.4.2). Known issues are documented in the Building FAQ below. If you run into any issues not listed below on these or any other platforms, please let us know.

  1. Change directory to the root of the Open XDMoD repository.
  2. Install Composer dependencies for Open XDMoD.
    • export COMPOSER=composer-el7.json (or composer-el8.json)
      • Use el7 if you are building on CentOS 7 with PHP 5.4.
      • Use el8 if you are building on CentOS 8 (or equivalent) with PHP 7.2.
    • composer install
    • Depending on the versions of various software installed on your system, you may run into errors. If you do, see the Building FAQ below.
  3. Run the package builder script.
    • open_xdmod/build_scripts/build_package.php --module xdmod
    • To build Open XDMoD modules, substitute xdmod with the name of a module's directory within open_xdmod/modules.

The resulting tarball will be located in open_xdmod/build.
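
The source build steps above could be wrapped in a pair of shell functions, roughly as follows. This is a sketch rather than an official build script: the function names are made up for illustration, and the commands are the ones listed in the steps.

```shell
# Pick the Composer file for an Enterprise Linux major version (7 or 8).
composer_file_for_el() {
    case $1 in
        7|8) printf 'composer-el%s.json\n' "$1" ;;
        *)   printf 'unsupported EL version: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Build a source tarball; run from the root of the Open XDMoD repository.
# Arguments: EL major version, module directory name (defaults to xdmod).
build_source_tarball() {
    COMPOSER=$(composer_file_for_el "$1") || return 1
    export COMPOSER
    composer install || return 1
    open_xdmod/build_scripts/build_package.php --module "${2:-xdmod}"
}

# Usage:
#   build_source_tarball 7            # core Open XDMoD on CentOS 7
#   build_source_tarball 8 supremm    # a module on an EL8 system
```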

RPM

This process has been tested on CentOS 7. Known issues are documented in the Building FAQ below. If you run into any issues not listed below on this or any other platform, please let us know.

This procedure assumes your rpmbuild directory is ~/rpmbuild. If it is not, substitute accordingly.

  1. Change directory to the root of the Open XDMoD repository.
  2. If you have not already, create a source tarball using the steps in the Source section.
  3. Copy the source tarball to the SOURCES directory in your rpmbuild directory.
    • cp open_xdmod/build/xdmod-x.y.z.tar.gz ~/rpmbuild/SOURCES
  4. Extract the .spec file from the source tarball into the SPECS directory in your rpmbuild directory.
    • tar -xOf ~/rpmbuild/SOURCES/xdmod-x.y.z.tar.gz xdmod-x.y.z/xdmod.spec >~/rpmbuild/SPECS/xdmod.spec
  5. Run rpmbuild.
    • rpmbuild -bb ~/rpmbuild/SPECS/xdmod.spec
    • There may be warnings about files not being found or files being listed twice. These are likely benign; see the Building FAQ below.

The resulting RPM will be located in ~/rpmbuild/RPMS/noarch.
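
For repeated builds, the RPM steps above might be scripted along these lines. The function names are illustrative. The version argument stands in for x.y.z, and the optional second argument is the rpmbuild directory, which is passed to rpmbuild through its _topdir macro. With a non-default directory the RPM lands under that directory's RPMS/noarch instead.

```shell
# Copy the source tarball and extract its spec file into an rpmbuild tree.
# Run from the root of the Open XDMoD repository.
# Arguments: version (the x.y.z in xdmod-x.y.z.tar.gz), rpmbuild directory.
stage_rpm_sources() {
    version=$1
    topdir=${2:-$HOME/rpmbuild}
    tarball=xdmod-$version.tar.gz

    mkdir -p "$topdir/SOURCES" "$topdir/SPECS" || return 1
    cp "open_xdmod/build/$tarball" "$topdir/SOURCES" || return 1
    tar -xOf "$topdir/SOURCES/$tarball" "xdmod-$version/xdmod.spec" \
        > "$topdir/SPECS/xdmod.spec"
}

# Stage the sources, then build the binary RPM.
build_xdmod_rpm() {
    topdir=${2:-$HOME/rpmbuild}
    stage_rpm_sources "$1" "$topdir" || return 1
    rpmbuild --define "_topdir $topdir" -bb "$topdir/SPECS/xdmod.spec"
}

# Usage: build_xdmod_rpm x.y.z
```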

Building FAQ

Why is Composer unable to download some files?

Certain combinations of PHP and Composer do not handle redirects over HTTPS correctly. This is known to affect the version of PHP that CentOS 6 supplies combined with current stable versions of Composer (as of this writing, 1.3.2). To get things working, try one or more steps below.

  1. Update Composer to a newer version.
  2. If the above did not work or is not feasible, you can globally disable HTTPS in Composer by running composer config -g disable-tls true. While disabling HTTPS is not recommended by the Composer developers or us, all dependencies downloaded using XDMoD's config files will be checked against checksums to help protect against tampering.

Why is Composer failing to unzip Ext JS?

Older versions of Composer (< 1.3.2) had issues with the Ext JS zip file.

Upgrading to at least version 1.3.2 resolves this issue.
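
To check whether an installed Composer meets the 1.3.2 minimum, a comparison like the one below may help. It is a sketch with two assumptions: sort supports the -V (version sort) option, as GNU coreutils does, and composer --version prints a line beginning with "Composer version" followed by the version number. Both function names are illustrative.

```shell
# Succeed if version $1 is at least version $2 (relies on sort -V).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Read "composer --version" output on stdin and print just the version number.
# Assumption: the line starts with "Composer version <number>".
extract_composer_version() {
    sed -n 's/^Composer version \([0-9][0-9.]*\).*/\1/p'
}

# Usage:
#   installed=$(composer --version | extract_composer_version)
#   if version_at_least "$installed" 1.3.2; then
#       echo "Composer $installed is new enough"
#   else
#       echo "Composer $installed is older than 1.3.2; consider upgrading"
#   fi
```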

Why is rpmbuild warning about files not being found?

When building modules for Open XDMoD, rpmbuild may warn about core Open XDMoD files being missing. These warnings can be safely ignored if they are for the following files:

  • /usr/share/xdmod/configuration/linker.php
  • /usr/share/xdmod/configuration/constants.php
  • /etc/xdmod/portal_settings.ini
  • /usr/local/xdmod/etc/logrotate.d/xdmod
  • /etc/cron.d/xdmod
  • /usr/local/xdmod/etc/apache.d/xdmod.conf

If other files are not being found, then there may be a problem with the build configuration or the module's RPM spec file.
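
When a module build reports many missing files, it can be handy to check each reported path against the list above. The helper below is a sketch with illustrative function names. It takes a path as its argument and does not attempt to parse rpmbuild's warning output, whose exact format is not assumed here.

```shell
# The core Open XDMoD files whose absence can be ignored in module builds.
benign_missing_files() {
    cat <<'EOF'
/usr/share/xdmod/configuration/linker.php
/usr/share/xdmod/configuration/constants.php
/etc/xdmod/portal_settings.ini
/usr/local/xdmod/etc/logrotate.d/xdmod
/etc/cron.d/xdmod
/usr/local/xdmod/etc/apache.d/xdmod.conf
EOF
}

# Succeed if the given path exactly matches an entry in the list.
is_benign_missing_file() {
    benign_missing_files | grep -Fxq -- "$1"
}

# Usage:
#   if is_benign_missing_file /etc/cron.d/xdmod; then
#       echo "safe to ignore"
#   fi
```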

Why is rpmbuild warning about files being listed twice?

When building the core Open XDMoD package, robots.txt is included both as a code file (being inside the html directory) and as a config file (in order to prevent customizations from being lost on upgrade). This warning can be safely ignored.

If other files produce this warning, then there may be a problem with the build configuration or the module's RPM spec file.

License

Open XDMoD is released under the GNU Lesser General Public License ("LGPL") Version 3.0. See the LICENSE file for details.

Open XDMoD includes several libraries that are licensed separately. See the license page on the Open XDMoD website for details.

Non-Commercial Licenses

Some software products used by Open XDMoD are not free for commercial use. See the license page on the Open XDMoD website for details.

Reference

When referencing XDMoD, please cite the following publication:

Jeffrey T. Palmer, Steven M. Gallo, Thomas R. Furlani, Matthew D. Jones, Robert L. DeLeon, Joseph P. White, Nikolay Simakov, Abani K. Patra, Jeanette Sperhac, Thomas Yearke, Ryan Rathsam, Martins Innus, Cynthia D. Cornelius, James C. Browne, William L. Barth, Richard T. Evans, "Open XDMoD: A Tool for the Comprehensive Management of High-Performance Computing Resources", Computing in Science & Engineering, Vol 17, Issue 4, 2015, pp. 52-62. DOI:10.1109/MCSE.2015.68