AbsaOSS/py2k

KafkaModel and DynamicKafkaModel

felipemmelo opened this issue · 0 comments

The current usage example in the README looks like this:

from py2k.models import DynamicKafkaModel
from py2k.writer import KafkaWriter

# assuming we have a pandas DataFrame, df
serialized_df = DynamicKafkaModel(df=df, model_name='test_model').from_pandas()

writer = KafkaWriter(
    topic="topic_name",
    schema_registry_config=schema_registry_config,
    producer_config=producer_config
)

writer.write(serialized_df)

The output of DynamicKafkaModel is named serialized_df, but it is not actually serialized in the sense the library uses the term, as pointed out in #34.

Also, DynamicKafkaModel only converts pandas DataFrames, whatever their schemas, into key-value records that are ready to be serialized as Avro and dispatched to Kafka.

Given that, having Model as part of the name might be slightly misleading. Maybe something like KafkaFormatter, PandasToKafkaTransformer, etc. would be more informative to both users and contributors.
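To make the distinction concrete, here is a rough sketch of what a transformer-style name would communicate. It is not py2k's actual implementation: the class name PandasToKafkaTransformer is just one of the suggestions above, the to_records method is hypothetical, and plain dicts stand in for whatever record type the library really produces. The only real call it relies on is pandas' DataFrame.to_dict(orient="records").

```python
from typing import Any, Dict, List


class PandasToKafkaTransformer:
    """Hypothetical name taken from this issue; NOT part of py2k's API."""

    def __init__(self, df: Any) -> None:
        # `df` is assumed to be a pandas DataFrame. Only its
        # `to_dict(orient="records")` method is used below.
        self.df = df

    def to_records(self) -> List[Dict[str, Any]]:
        # One plain dict per row, keyed by column name.
        # Nothing is Avro-encoded here; this is a reshaping step only.
        return self.df.to_dict(orient="records")


# Usage, assuming pandas is installed:
#   import pandas as pd
#   df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
#   records = PandasToKafkaTransformer(df).to_records()
#   # records == [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
```

With a name like this, the variable in the README example would more naturally be called records than serialized_df, leaving the word "serialized" for the Avro encoding step that comes afterwards.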