Questions regarding the paper
KmA004 opened this issue · 5 comments
1/ I want to ask about the inputs to your code: what are they, and where exactly are they in the code?
2/ Can I change the inputs to see different results, and if so, how?
One of the datasets is TPC-H, which we include at https://github.com/hongzimao/decima-sim/tree/master/spark_env/tpch. In the code, the jobs are loaded at https://github.com/hongzimao/decima-sim/blob/master/spark_env/job_generator.py#L110 and https://github.com/hongzimao/decima-sim/blob/master/spark_env/job_generator.py#L9.
You can replace the input with your own dataset of job descriptions, as long as it provides (1) the DAG topology (i.e., the parent-child relations), (2) features for each node (e.g., number of tasks, task durations, etc.), and (3) a specification of the inter-arrival process and the distribution over different jobs. To do this, modify the job loader code linked above.
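To make the three requirements concrete, here is a minimal sketch of what a custom job description could look like. It is not the repository's data format: the names `Stage`, `JobDAG`, and `sample_arrival_times` are hypothetical, and the Poisson (exponential inter-arrival) process is only an assumed example of an arrival specification.

```python
import random
from dataclasses import dataclass


@dataclass
class Stage:
    """Features of one DAG node (hypothetical field names)."""
    num_tasks: int
    mean_task_duration: float  # seconds


@dataclass
class JobDAG:
    stages: list  # list of Stage, indexed by position
    edges: list   # (parent_index, child_index) pairs

    def topological_order(self):
        """Return stage indices so that every parent precedes its children."""
        indegree = [0] * len(self.stages)
        children = [[] for _ in self.stages]
        for parent, child in self.edges:
            indegree[child] += 1
            children[parent].append(child)
        ready = [i for i, d in enumerate(indegree) if d == 0]
        order = []
        while ready:
            node = ready.pop()
            order.append(node)
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
        if len(order) != len(self.stages):
            raise ValueError("edges contain a cycle, so this is not a DAG")
        return order


def sample_arrival_times(num_jobs, mean_interarrival, seed=0):
    """Assumed arrival model: exponential gaps between consecutive jobs."""
    rng = random.Random(seed)
    now, times = 0.0, []
    for _ in range(num_jobs):
        now += rng.expovariate(1.0 / mean_interarrival)
        times.append(now)
    return times


# Example job: stage 0 feeds stages 1 and 2, and stage 1 also feeds stage 2.
job = JobDAG(
    stages=[Stage(4, 1.5), Stage(2, 3.0), Stage(1, 0.5)],
    edges=[(0, 1), (0, 2), (1, 2)],
)
arrivals = sample_arrival_times(num_jobs=5, mean_interarrival=25.0)
```

A loader for a new dataset would need to produce equivalent information in whatever structures the existing job loader expects.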
Hope these help!
First of all, I would like to thank you for your quick response.
Yes, the above information is very helpful for me.
I also have some questions about the parameters:
1/ Which parameters define the model?
2/ Is there a brief description of the parameters?
Thanks.
I have a similar question: why are the GCN and GSN models implemented by hand instead of being built with TensorFlow modules? Is it because you want to use a specific initialization method?
This is the code for parameter initialization in the GCN: https://github.com/hongzimao/decima-sim/blob/master/gcn.py#L50. The initialization follows the standard Glorot initialization scheme (see Eqn. 16 of Xavier Glorot & Yoshua Bengio, AISTATS 2010, for details). I wrote the code from scratch because, when we developed this codebase, there was no standard graph neural network module in TensorFlow. Writing everything ourselves also gave us full control and a better understanding of the details.
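For readers unfamiliar with the scheme, the following is a minimal standard-library sketch of Glorot uniform initialization, where each weight is drawn from U[-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). It is an illustration of the general technique, not the code in gcn.py, and the function name `glorot_uniform` is chosen here for the example.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, seed=None):
    """Return a fan_in x fan_out weight matrix as a list of rows.

    Each entry is sampled uniformly from [-limit, limit], where
    limit = sqrt(6 / (fan_in + fan_out)).
    """
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


# Example: weights for a layer mapping 5 input features to 8 hidden units.
weights = glorot_uniform(fan_in=5, fan_out=8, seed=42)
```

The range shrinks as the layer gets wider, which keeps the variance of activations and gradients roughly constant across layers.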
Btw, batching was also an issue. We wanted to support batching multiple graphs of different sizes, but at the time the standard TF modules required every input in a batch to have the same size. If you are interested, take a look at https://github.com/hongzimao/decima-sim/blob/c010dd74ff4b7566bd0ac989c90a32cfbc630d84/sparse_op.py for how we concatenate multiple inputs into sparse matrices for batching.
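One common way to batch graphs of different sizes is to place their adjacency matrices along the diagonal of one larger sparse matrix, shifting each graph's node indices by the number of nodes that came before it. The sketch below shows that idea in coordinate (row, column) form using plain Python; the function `batch_graphs` and its input layout are assumptions for illustration, and sparse_op.py may differ in its details.

```python
def batch_graphs(graphs):
    """Merge several graphs into one block-diagonal sparse adjacency.

    graphs: list of (num_nodes, edges), where edges is a list of
            (src, dst) pairs using node indices local to that graph.
    Returns (rows, cols, total_nodes, offsets), where rows[k], cols[k]
    is the k-th nonzero entry of the merged adjacency matrix and
    offsets[i] is the index of graph i's first node in the batch.
    """
    rows, cols, offsets = [], [], []
    offset = 0
    for num_nodes, edges in graphs:
        offsets.append(offset)
        for src, dst in edges:
            # Shift local indices so graphs do not overlap in the batch.
            rows.append(src + offset)
            cols.append(dst + offset)
        offset += num_nodes
    return rows, cols, offset, offsets


# Two graphs of different sizes: a 2-node chain and a 3-node chain.
rows, cols, total_nodes, offsets = batch_graphs([
    (2, [(0, 1)]),
    (3, [(0, 1), (1, 2)]),
])
```

Because no edge connects nodes from different blocks, message passing over the merged matrix treats each graph independently, and the offsets let the per-graph results be split apart afterwards.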