apache/kibble-1

Refactor the configuration methods of Apache Kibble

turbaszek opened this issue · 6 comments

Hello,

Currently, Apache Kibble is configured via kibble.yaml which is generated by running the setup.py script. This file is referenced in multiple places in the codebase which is not necessary in my opinion.

I would like to propose to use for the standard Python library configparser instead of yaml.
https://docs.python.org/3.7/library/configparser.html

So instead of:

accounts:
  allowSignup: true
  verify: true
api:
  database: 2
  version: 0.1.0
elasticsearch:
  dbname: kibble
  host: elasticsearch
  port: 9200
  ssl: false
mail:
  mailhost: localhost
  mailport: 25
  sender: Kibble <noreply@kibble.kibble>

we would have:

[accounts]
allowSignup = True
verify = True

[api]
database = 2
version = 0.1.0

[elasticsearch]
dbname = kibble
host = elasticsearch
port = 9200
ss = false

[mail]
mailhost = localhost
mailport = 25
sender = Kibble <noreply@kibble.kibble>

The main advantage of using configparser is that we will be able to parse the config file once and then access any value doing something like config.get("section", "key").

Additionally, we may take some inspiration from Apache Airflow project and:

  • introduce "default config" that will be used for fallback values that were not defined by user
  • use env variables to override the config (dead useful for configuring environments) for example export KIBBLE__MAIL__MAILPORT=34 would override the config value of mail.mailport

@turbaszek I am interested in taking this up. As clarification, how would you like this to be incorporated? Do you want refactor ../setup/setup.py to use configparser instead or yaml?

@skekre98 thanks for help!

Do you want refactor ../setup/setup.py to use configparser instead or yaml?

Yes, I think this would be more python-native solution. However, I would suggest to add this setup/configuration.py file where we do:

class KibbleConfigParser(...):
    ...

conf = KibbleConfigParser()

Then we can import the conf anywhere and access the configuration information. But I'm open to any suggestion.

One more thing: I think it would be simpler to merge dbname, host and mort of elastiscearch into single connection uri field. WDYT?

@skekre98 thanks for help!

Do you want refactor ../setup/setup.py to use configparser instead or yaml?

Yes, I think this would be more python-native solution. However, I would suggest to add this setup/configuration.py file where we do:

class KibbleConfigParser(...):
    ...

conf = KibbleConfigParser()

Then we can import the conf anywhere and access the configuration information. But I'm open to any suggestion.

One more thing: I think it would be simpler to merge dbname, host and mort of elastiscearch into single connection uri field. WDYT?

Definitely +1 to merging the entire URL (I think I said the same thing earlier somewhere else), that will also make it easier to auto-configure via the command line environment, as you'll just be exporting one variable :)

Cool, I can start building up the class. Based on my current understanding we could initialize the parser with kibble.yaml in __init__() followed by another getter function:

class KibbleConfigParser(...):
    
    def __init__():
        # initialize with kibble.yaml

    def get(section, key):
        # export variable

Is my understanding along the lines of what you are looking for?

@skekre98 I think that python native ConfigParser does not support yamls out of the box. But it may be possible to construct ConfigParser from dictionary - however, I'm not sure if this is worth having additional custom logic.

I think keeping it simple as possible would be the best option. So:

  1. Use ini type of file instead of yaml: https://docs.python.org/3.7/library/configparser.html#supported-ini-file-structure
  2. And then:
config = configparser.ConfigParser()
config.read("kibble.ini")

However, I'm open to any other suggestion

@turbaszek yes, I definitely think keeping in ini format will work better. Misunderstanding on my part 😅