/cluster-data

Migrated from code.google.com/p/googleclusterdata

Overview

This repository describes various traces from parts of the Google cluster management software and systems.

Please let us know about any issues, insights, or papers you publish using these traces by sending email to the discussion group. (And please join this group to be kept up to date with new announcements!) The more specific the details you provide, the more likely we are to be able to help you.

If you have (or write) tools that help analyze or decode the trace data, or have produced useful analyses, please share them with this community.

A trace bibliography provides BibTeX data for papers about or derived from these traces. If you publish one, please email a BibTeX entry for it so that it can be added to the bibliography. (Try to match the format used there as closely as possible.)

Cluster workload traces

These are traces of workloads running on Google compute cells.

  • ClusterData2011_2 provides data from a 12.5k-machine cell over a period of about a month in May 2011.
  • TraceVersion1 is an older, short trace that describes a 7-hour period from one cell (cluster). Deprecated. For new work, we recommend using the ClusterData2011_2 trace instead.

ETA traces

These are execution traces from ETA (Exploratory Testing Architecture), a testing framework that explores interactions between distributed, concurrently executing components.