worldbank/REaLTabFormer
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
Jupyter Notebook · MIT
Issues
OSError: rtf_checkpoints/not-best-disc-model does not appear to have a file named config.json,
#80 opened by MonishSoundarRaj
How can we create more synthetic data?
#77 opened by limhasic
Error on mps device
#78 opened by qzhu2017
Generating Balanced Synthesized Data
#79 opened by erland-ramadhan
Documentation page not working
#68 opened by sfragkoul
Missing value imputation on regular data
#74 opened by RejwankabirHamim
missing data
#70 opened by limhasic
RuntimeError: Error(s) in loading state_dict for GPT2LMHeadModel: size mismatch for transformer.wte.weight
#69 opened by akefhabbal-qu
OSError: rtf_checkpoints/not-best-disc-model does not appear to have a file named config.json.
#66 opened by AhmadKajjan-QU
Could order of columns affect performance of synthetic data quality?
#65 opened by efstathios-chatzikyriakidis
Multi-GPU training
#56 opened by vinay-k12
Python datetime.date data type is handled as str and datatype handling in general
#64 opened by efstathios-chatzikyriakidis
Parallelization of inference/generation in both tabular and child models.
#63 opened by efstathios-chatzikyriakidis
Maximum number of columns limitation in tabular GPT-2 model?
#62 opened by efstathios-chatzikyriakidis
Early stopping with sensitivity vs validation loss metric and the effects on synthetic data quality.
#60 opened by efstathios-chatzikyriakidis
Out of memory exception on tabular model with 25k rows and 37 columns
#59 opened by efstathios-chatzikyriakidis
Possible mix-up of token columns
#47 opened by liu305
Is it possible to run REalTabFormer on AWS Inferentia and Trainium VM instances?
#58 opened by efstathios-chatzikyriakidis
Conditional generation?
#48 opened by gminorcoles
Inquiries on fitting parent and child tables
#57 opened by ThomasK1018
Bug in REaLTabFormer.sample() when relational model generates no data
#51 opened by efstathios-chatzikyriakidis
Logistic detection metric
#42 opened by zechchair
Possible Improvements for CPU inference
#49 opened by australDream
Bug in model.sample() when column contains integer values while column type is string.
#36 opened by echatzikyriakidis
Running Realtab on Macs
#43 opened by vinay-k12
See "IndexError: index out of range in self" when related_num parameter is specified in child model sampler
#46 opened by liu305
ERROR: multiprocess 0.70.15 has requirement dill>=0.3.7, but you'll have dill 0.3.6 which is incompatible.
#40 opened by echatzikyriakidis
Is it possible to do iterative training? Load the weight and retrain on new data.
#28 opened by vinay-k12
AssertionError: The target length 10 of the data doesn't include the numeric precision at 20. Increase max_len to at least 22.
#30 opened by vinay-k12
CPU OOM during tokenization - Tabular format
#23 opened by mohanvrk
Bug in model.sample() when column contains integer values while column type is string.
#31 opened by echatzikyriakidis
_validate_get_device() could be nice to be called also in model.sample() and model.predict()
#32 opened by echatzikyriakidis
Speeding the training on mixed data set - categorical data, numerical and text.
#26 opened by vinay-k12
Use different join columns (parent_join_on, child_join_on) in relational model fit method.
#27 opened by echatzikyriakidis
Unable to complete training in colab
#24 opened by vinay-k12
How to model N-M association tables?
#21 opened by echatzikyriakidis