# Introduction  
This repository contains code for investigating how effectively large language models (LLMs) can transform causal domain knowledge into a representation that better aligns with recommendations from causal data science.

## Method

The approach consists of two main tasks. 

### Experiment on Task 1: identifying whether two entities represent different values of the same causal variable

To generate the data
```bash
python src/CMR1/CMR1_data_generation.py
```

To sample the data 
```bash
python src/CMR1/CMR1_data_sampling.py
```
To run the experiment
```bash
python src/CMR1/CMR1_Experiment.py
```

To get the cosine similarity  
```bash
python src/CMR1/get_cos_sim.py
```
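
For readers unfamiliar with the metric, the snippet below is a minimal sketch of how cosine similarity between two vectors is computed. It is not taken from `get_cos_sim.py`: the function name `cosine_similarity` and the toy vectors are hypothetical, and the script in this repository may compute embeddings and similarities differently.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors standing in for the embeddings of three entities.
emb_a = [1.0, 2.0, 3.0]
emb_b = [2.0, 4.0, 6.0]   # same direction as emb_a
emb_c = [-3.0, 0.0, 1.0]  # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score close to 0, regardless of their lengths.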

### Experiment on Task 2: identifying interaction entities that simultaneously represent values of different causal variables


To generate the data
```bash
python src/CMR1/CMR1_data_generation.py
```
To sample the data 
```bash
python src/CMR2/CMR2_data_sampling.py
```
To run the experiment
```bash
python src/CMR2/CMR2_Experiment.py
```

To get the cosine similarity  
```bash
python src/CMR2/get_cos_sim.py