KingPin is the Python toolset used at Pinterest for facilitating service oriented architecture and application configuration management.
- Kazoo Utils: A wrapper for Kazoo and implements utils like ServerSet, DataWatcher and Hosts selectors.
- Thrift Utils: A wrapper for python Thrift clients which transparently handles retry and integrated with service discovery framework provided in Kazoo Utils.
- Config Utils: A system that stores configuration on S3 and uses Zookeeper as the notification system to broadcast updates to subscribers.
- ZK Update Monitor: A daemon which syncs configurations and serversets to local disk from Zookeeper and S3.
- Decider: A common utility widely used at Pinterest which controls the global values. Built on top of Config Utils.
- Managed Datastructure: A common utility wide used at Pinterest which supports easy access/modification of configurations in Map/List format in Python.
- MetaConfig Manager: A system that manages all configurations/serversets and dependencies (subscriptions), built on top of Zookeeper and S3.
Kingpin contains various python packages which serve as the basic infrastructure LEGO widely used at Pinterest.
KingPin relies on 3rd party packages like Kazoo, Boto, Zookeeper and AWS S3.
Zookeeper is used to store dynamic serversets and acts as the notification channel for config changes. S3 is used to store configuration data. On top of Zookeeper and Kazoo, we have Kazoo Utils which we have some improvements on Kazoo native APIs and does service discovery and registration. We use thrift as the RPC protocal, so we built thrift-utils on top of thrift and kazoo_utils which make people easier to write a python thrift client.
We also have our own application configuration management system built on top of Zookeeper and S3 which we call it Config Utils (we call it ConfigV2 internally). Basically people can make change to a configuration, the system will put the new content to a new versioned file in S3, and update the version number in Zookeeper. You can refer to this engineering blog which contains more information inside.
We also built some data structures like lists, hashmap and json for Pinterest Engineers to easily read/write to configurations. We also have a structure called Decider (ranging from 0 to 100) for engineers to dynamically change values in realtime, which is pretty useful for experiments control.
Both application configuration management and service discovery are watched and updated by a process running on each box called ZK Update Monitor. Our philosophy is to leverage the atomic broadcasting guarantee provided by Zookeeper to fanout the change to every box so each machine only needs to talk to local file, instead of establishing a session to Zookeeper. ZK Update Monitor is the only proxy talking to Zookeeper on each box for getting the newest serverset of configuration content. We have an engineering blog which contains the more details about it.
Another framework we provide in KingPin is called MetaConfig Manager. We found that sometimes we need to manage the dependency like what cluster needs what configuration or serverset
.
MetaConfig is basically a configuraiton which tells ZK Update Monitor how to download a particular configuration or serverset. The MetaConfig Manager stores the dependency graph in Zookeeper and
metaconfigs in S3. If a box want to add a serverset or configuration dependency, simply call MetaConfig API and the new config will be added to all subscribers
immediately.
Here the steps how to make KingPin running.
KingPin is tested under Python 2.7.3 at Pinterest scale. We highly recommend you run KingPin using Python 2.7.3.
Please follow this website to make sure pip is installed on your box:
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
In order to talk to S3, you need to prepare a boto (yaml) config file with aws keys in place.
We provide an template for boto S3 config file in kingpin/examples/example_aws_keyfile.conf
.
Replacing the aws_access_key_id
and aws_secret_access_key
with your AWS keypairs under Credentials
section to configure your aws/s3 acecss. Prepare the file in some place ready for future use.
If you don't have a Zookeeper cluster and want to try KingPin out on local machine, you can download it.
After downloaded, untar it and run:
cd zookeeper-3.4.6
cp conf/zoo_sample.cfg conf/zoo.cfg
sudo bin/zkServer.sh start
Test whether it works by trying to connect:
bin/zkCli.sh -server 127.0.0.1:2181
In case you want to stop fo Zookeeper instance:
sudo bin/zkServer.sh stop
We use a file to list a set of Zookeeper endpoints. For the local Zookeeper single node case, we provided a
file examples/local_zk_hosts
. There should only be one liner there which points to the localhost zookeeper node.
You need to refer to that file for future use.
Follow the instructions in this page for installing thrift library.
KingPin is a python package, we suppose you already have Python and Pip installed.
In order not to screw up with your exsiting Python libs, we recommend using Virtual Environments as the clean container of KingPin and its dependencies.
cd kingpin
pip install virtualenv
virtualenv env
virtualenv -p /usr/bin/python2.7 venv
source venv/bin/activate
pip install -r requirements.txt
If you get some error when installing gevent, you may need to install libevent first.' If after installing libevent you still get the header file cannot found error, you can manually install gevent by running:
CFLAGS="-I /usr/local/include -L /usr/local/lib" pip install gevent==0.13.8
Note that the INCLUDE and LIB path may vary, both are the actual places you libevent is installed.
After you get this error fixed, you may need to run pip install -r requirements.txt
again to finish the dependency installation.
sudo python setup.py install
ZK Update Monitor is running inside Supervisor. Supervisor makes ZK Update Monitor to run under the supervisord container.
We also recommend installing supervisor under Venv. To install supervisor, simply run:
pip install supervisor
We provided an example supervisor configuration under examples/ directory. The configuration has some fields to fill in like dependency name and S3 bucket name.
Once you filled in the blanks in the configuration, you can run ZK Update Monitor via Supervisor under KingPin Directory:
sudo supervisord -c examples/supervisor_zk_update_monitor.conf
You can verify if zk_update_monitor is running by:
ps aux | grep zk_update_monitor
nosetests
Here we provide 2 typical usages of KingPin: Application Configuration Management which deploys configurations to every box in real time,
and Service Discovery which uses thrift as the RPC mechanism. We assume you already follow steps in Configurations
and have Zookeeper
up and running, S3 key file set up and KingPin properly installed.
Here is an example of creating a manageddata, updating it and let ZK Update Monitor download the content.
The application configuration framework consists of 3 parts: Manageddata or Decider on top of Config Utils, ZK Update Monitor, and MetaConfig Manager.
In the following example, we walk through the process that an application subscribes a managed list, and watches the changes and download to local disk.
Decider works almost the same except the difference in APIs. We provide a script in examples/metaconfig_shell.py
which help you easily
create conifg and dependencies.
Of course, on top of the config APIs, you can easily build a fancier UI.
Before doing everything, you need to create a directory under /var/ path using sudo permission:
sudo mkdir /var/config
The name of manageddata has a Domain
and a Key
. The domain is like the group, the key is like the member. For example,
a typical managedlist used inside Pinterest is called config.manageddata.spam.blacklist
. Here "spam" is the domain,
"blacklist" is the key.
Run the following command for starting the an interactive tool for creating a manageddata configuration:
cd kingpin
python examples/metaconfig_shell.py -z examples/local_zk_hosts -a examples/example_aws_keyfile.conf -b [Your S3 Bucket to Put Config Data] -e s3.amazonaws.com
Type "2" -> "1" to create a manageddata, and give the domain and key name. Let's give the domain called "test", key called "test_config".
The created manageddata is then called config.manageddata.test.test_config
. Remember this name for future use.
The manageddata should be created. Now we need to create a dependency. A dependency is a collection of manageddata or serverset which tells the subscription of a set of serversets or configurations. ZK Update Monitor need to know the dependency of the localbox so it can subscribe and downlaod corresponding configurations.
Run the metaconfig_shell.py
again, and type "1" and create a dependency. In order to tell the difference between a dependency and a configuration internally, we require dependency names to be ended with .dep
.
Let's create a a dependency called test_dependency.dep
.
Then we should be able to add the created manageddata to the created dependency. Run metaconfig_shell.py
and type "3", type test_dependency.dep
and config.manageddata.test.test_config
respectively to add the manageddata config into the dependency.
Don't forget to start ZK Update Monitor. You need to replace the [dependency-name]
in the Zk Update Monitor command line inside supervisord configuration (which is localed in examples/supervisor_zk_update_monitor.conf
to "test_dependency.dep", also don't want to change the bucket name.
Now ZK Update Monitor is watching changes of the configuration called "config.manageddata.test.test_config". You can double check by querying the Zk Update Monitor admin flask endpoint:
curl 127.0.0.1:9898/admin/watched_metaconfigs
It should show the watched configuration list, currenly should be only "config.manageddata.test.test_config".
Now we can try change the content of the manageddata configuration. For now the modification can be done in the python shell. Inside Pinterest we use a web portal for people the change the content. Suppose the configuration will be a managedlist.
Bring up python.
python
Try add a value into the managedlist.
from kingpin.manageddata.managed_datastructures import ManagedList
managedlist = ManagedList("test", "test_config", "test config", "a config for test", ["127.0.0.1:2181"], "examples/example_aws_keyfile.conf", "some_test_bucket")
managedlist.add("test_data")
If no error happens, you can check the content of the managedlist which is /var/config/config.manageddata.test.test_config
.
There should be one item called test_data
inside.
In this example we bring up a test thrift server locally, register it to a serverset and use thrift_util to talk to the test server.
Before doing everything, you need to create a directory under /var/ path using sudo permission:
sudo mkdir /var/serverset
Let's create the serverset of the service "test_service". Running in a environment called "prod". The format of a serverset name is discovery.{service name}.{service environment}.
So the serverset name in this case will be discovery.test_service.prod
.
Run the metaconfig_shell again to create the serverset:
cd kingpin
python examples/metaconfig_shell.py -z examples/local_zk_hosts -a examples/example_aws_keyfile.conf -b [Your S3 Bucket to Put Config Data] -e s3.amazonaws.com
Type "2" -> "2" to create a serverset, and give the service name and service environment. Let's give the service name called "test_service", environment called "prod".
Suppose you have already created the dependency called "test_dependency.dep". We can add the serverset to the dependency so ZK Update Monitor can watch any change of the serverset:
python examples/metaconfig_shell.py -z examples/local_zk_hosts -a examples/example_aws_keyfile.conf -b [Your S3 Bucket to Put Config Data] -e s3.amazonaws.com
Type "3" to add the serverset discovery.test_service.prod
to test_dependency.dep
as the instruction shows.
When a service host is started, it needs to register itself to the serverset so client can know the endpoint to connect to.
We delegate the registration to another daemon called zk_register
.
Try starting zk_regsiter to register the local host to discovery.test_service.prod as port 8081:
cd kingpin
zk_register.py -p 8081 -s test_service -e prod -z examples/local_zk_hosts
The command will hang because it need to keep a session to Zookeeper to keep an ephemeral node up representing this host. To verify the updated
serverset is observed by ZK Update Monitor, go to /var/serverset/
, there should be a file called discovery.test_service.prod
and it
should have one line which is the endpoint of localhost the service will be running on.
One thing you may need to put special care is the failure case - if the service process is dead but the zk_register is still running, clients may talk to an endpoint with no service running on that port. Here are suggestions to prevent that:
- Have some extra logic checking the healthness of the server inside zk_register so it will stop registering if the healthcheck fails.
- Put the registration part inside the service process. You may need to implement the registration logic in other languages.
We provide an example thrift definition in examples/ directory called test_service.thrift
.
Run the following command to get the generated code in Python:
thrift -r --gen py examples/test_service.thrift
We provided an server startup script under examples/test_service_server.py. After the
python thrift code is generated, move the test_service_server.py
under gen-py/test_service.
Run test_service_server
and the server will be running on port 8081.
Here are an example using Thrift Util to construct a wrapper thrift client which has features like connection pool management, dynamic serverset management and retry management, etc:
from kingpin.thrift_utils.thrift_client_mixin import PooledThriftClientMixin
from kingpin.thrift_utils.base_thrift_exceptions import ThriftConnectionError
from kingpin.kazoo_utils.hosts import HostsProvider
import TestService
class TestServiceConnectionException(ThriftConnectionError):
pass
class TestServiceClient(TestService.Client, PooledThriftClientMixin):
def get_connection_exception_class(self):
return TestServiceConnectionException
testservice_client = TestServiceClient(
HostsProvider([], file_path="/var/serverset/discovery.test_service.prod"),
timeout=3000,
pool_size=10,
always_retry_on_new_host=True)
The above example is how we constrcut the thrift client class with Thrift Util features turned on. The class should inherit from your thrift service client (generated by thrift) annd PooledThriftClientMixed (make sure the generated thrift client class is the first base class).
You must implement a method called get_connection_exception_class
which returns a class which will be thrown when all retries are failed.
Then you need to initialize an client object. In the constructor, you need to pass in a HostsProvider as the first argument. Passing in the serverset file path there so the client can always read the endpoint in the file (which may change dynamically). The frist parameter of HostsProvider constructor is a list of static endpoint list, we don't recommend using that because it does't provide the benefit of dynamic serverset.
We also provide an example under examples/test_service_client.py
, move it under gen-py/test_service
and run the script, it should output "pong".
Shu Zhang [ @ShuZhang1989 ] (https://github.com/shuzhang1989)
Xiaofang Chen, Tracy Chou, Dannie Chu, Pavan Chitumalla, Steve Cohen, Jayme Cox, Michael Fu, Jiacheng Hong, Xun Liu, Yash Nelapati, Aren Sandersen, Aleksandar Veselinovic, Chris Walters, Yongsheng Wu, Shu Zhang
Copyright 2016 Pinterest, Inc.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.