# manta-muskie: The Manta WebAPI
This repository is part of the Joyent Manta project. For contribution guidelines, issues, and general documentation, visit the main Manta project page.
manta-muskie holds the source code for the Manta WebAPI, otherwise known as "the front door". It is analogous to CloudAPI for SDC. See the restdown docs for API information, but effectively this is where you go to call PUT/GET/DEL on your stuff.
API documentation is in `docs/index.md`. Some design documentation (possibly quite dated) is in `docs/internal`. Developer notes are in this README.
## Active Branches

There are currently two active branches of this repository, for the two active major versions of Manta. See the mantav2 overview document for details on major Manta versions.

- `master` - for development of mantav2, the latest version of Manta.
- `mantav1` - for development of mantav1, the long-term support maintenance version of Manta.
## Testing

muskie comes with its own set of unit tests. You typically test muskie by starting up a local instance of the server that's configured to point at the rest of your existing SDC/Manta deployment. This setup depends on several prerequisites in your development environment (not the least of which is a development environment itself; see the internal documentation for setting up a dev zone).
1. Set up a non-operator SDC account that will also have access to Manta. The manta-deployment zone includes a tool called `add-dev-user` that can be used to do this. You can find it at `/opt/smartdc/manta-deployment/tools/add-dev-user`. `add-dev-user` must be run on a zone with `sdc-ldap` available, so copy the script to your CoaL/lab headnode and run it from there. The second argument to `add-dev-user` is the quoted string of the public key from step 2 below. For example:

       add-dev-user muskie_test_user "ssh-rsa AAAAB3NzaC1..."

2. The ssh key that you use to authenticate as this account should be passwordless. It must also be stored locally in the muskie dev zone that is running the tests, with the private key in `$HOME/.ssh/id_rsa`. You can override this path by setting `MUSKIETEST_REGULAR_KEYFILE` in the environment to the location of the private key. The public key path is generated by appending `.pub` to the private key path; e.g., `$HOME/.ssh/id_rsa.pub`.

3. Some tests also require an operator account to test with. By default, the tests will use the "poseidon" account, but you must provide a valid private key for the poseidon account at `$HOME/.ssh/id_rsa_poseidon`. One way to do this is to copy the private key from a deployed CoaL/lab muskie zone in your setup. You can find a muskie zone by running `vmadm list | grep webapi` on your headnode. Optionally, you can provide a separate operator SDC account by setting `MUSKIETEST_OPERATOR_USER` and the location of its private key in `MUSKIETEST_OPERATOR_KEYFILE` in your environment.

4. Your SDC and Manta environment variables should point at the SDC and Manta instances that you're testing with. If you're running tests against a muskie dev zone, then `localhost:8080` is probably the `MANTA_URL`. If you're running against a deployed CoaL/lab muskie, running

       $ vmadm get <webapi_zone_uuid> | json -a nics | json -a nic_tag ip

   will get you the `MANTA_URL`. Either way, running

       $ vmadm lookup -j alias=cloudapi0 | json -a nics | json -a ip

   will get you the `SDC_URL`. NOTE/GOTCHA: you must remember to have set up cloudapi on your CoaL/lab with `sdcadm post-setup cloudapi`. The SDC and Manta variables should refer to the same user account, and they should both refer to the ssh key stored in `$HOME/.ssh/id_rsa.pub` mentioned above.

5. Before running the tests, you must set the `MUSKIE_SALT`, `MUSKIE_KEY`, and `MUSKIE_IV` environment variables to the same values being used for the muskie instances in your existing Manta installation. You can find these values in SAPI, using:

       sdc-sapi /services?application_uuid="$(sdc-sapi \
           /applications?name=manta | json -H application_uuid)&name=webapi" | \
           json -H -a metadata

6. You'll need to create a muskie configuration file that's appropriate for your environment. The easiest way to do this is to copy "etc/config.coal.json" in this repo into a new file "config.json". Then:

   a. Replace all instances of "coal.joyent.us" with the DNS name for your SDC or Manta install. You can replace these DNS names with IP addresses, or you can use hostnames and configure /etc/resolv.conf with both the SDC and Manta resolvers. Either way, your dev/test zone must be on both the "admin" and "manta" networks in order to communicate with both SDC and Manta components.

   b. Replace the "salt", "key", and "iv" values in the "authToken" section with the corresponding `MUSKIE_` environment variables described in step 5 above.

   c. If you would like, replace the "datacenter", "server_uuid", and "zone_uuid" fields with appropriate values from your setup. If these fields are not updated, the metric collection facility will use the defaults provided in the file, which may not represent the real values of your machine. This step is not required.
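The resulting edits from steps (a)-(c) might look like the following sketch. The "authToken" field names are from step (b); the placement of the other fields at top level is an assumption, and all values shown are placeholders:

```json
{
  "datacenter": "mydc",
  "server_uuid": "<server uuid from your setup>",
  "zone_uuid": "<dev zone uuid>",
  "authToken": {
    "salt": "<value of MUSKIE_SALT>",
    "key": "<value of MUSKIE_KEY>",
    "iv": "<value of MUSKIE_IV>"
  }
}
```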
In summary, you should make sure these environment variables are set properly:

| Environment variable | Details |
| --- | --- |
| `MANTA_URL` | points to port 8080 of the muskie instance that you're testing |
| `MANTA_USER` | refers to your non-operator user created above |
| `MANTA_KEY_ID` | refers to a passwordless ssh key in `$HOME/.ssh/id_rsa`; see also `MUSKIETEST_REGULAR_KEYFILE` |
| `MANTA_TLS_INSECURE` | usually 1 in an environment with self-signed certificates |
| `MUSKIETEST_OPERATOR_USER` | operator account for testing (optional; "poseidon" used by default) |
| `MUSKIETEST_OPERATOR_KEYFILE` | path to a passwordless ssh key for `MUSKIETEST_OPERATOR_USER` (optional; `$HOME/.ssh/id_rsa_poseidon` used by default) |
| `MUSKIETEST_REGULAR_KEYFILE` | path to a passwordless ssh key for `MANTA_USER` (optional; `$HOME/.ssh/id_rsa` used by default) |
| `SDC_URL` | points to the SDC deployment that you're using to test |
| `SDC_ACCOUNT` | same value as `MANTA_USER` |
| `SDC_KEY_ID` | same value as `MANTA_KEY_ID` |
| `SDC_TESTING` | analogous to `MANTA_TLS_INSECURE`, but for SDC |
| `MUSKIE_IV` | from values in SAPI (see above) |
| `MUSKIE_KEY` | from values in SAPI (see above) |
| `MUSKIE_SALT` | from values in SAPI (see above) |
On a test system called "emy-10.joyent.us", these may look like this:

    MANTA_URL=http://localhost:8080
    MANTA_USER=dap
    MANTA_KEY_ID=43:7b:f1:98:41:9c:37:90:18:b9:07:92:07:ac:a9:eb
    MANTA_TLS_INSECURE=1
    SDC_URL=https://cloudapi.emy-10.joyent.us
    SDC_ACCOUNT=dap
    SDC_KEY_ID=43:7b:f1:98:41:9c:37:90:18:b9:07:92:07:ac:a9:eb
    SDC_TESTING=1
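None of the later steps will work with an incomplete environment, so it can be worth failing fast. A minimal sketch of a check for the variables in the table above (the helper name `check_test_env` is ours, not part of the repository, and the optional `MUSKIETEST_*` overrides are deliberately not checked):

```shell
#!/bin/sh
# check_test_env: print any required muskie-test environment variables that
# are unset or empty; print "environment OK" when all of them are present.
check_test_env() {
    missing=0
    for v in MANTA_URL MANTA_USER MANTA_KEY_ID SDC_URL SDC_ACCOUNT \
        SDC_KEY_ID MUSKIE_SALT MUSKIE_KEY MUSKIE_IV; do
        eval "val=\${$v}"   # indirect lookup of the variable named in $v
        if [ -z "$val" ]; then
            echo "missing: $v" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ] && echo "environment OK"
}
```

Running `check_test_env` before `make test` surfaces a missing variable immediately instead of partway through the suite.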
To run the tests against changes in this repository:

1. Run `make` to build muskie. This will pull down the correct Node executable for your platform and your version of muskie and then `npm install` dependent modules.

2. Configure your user account for access control by running:

       $ make test-ac-setup

   This uses the local copy of Node (pulled down as part of the build) to run the access-control setup script. This creates roles and other server-side configuration that's used as part of the tests.

3. Start muskie using the configuration file you created above:

       $ build/node/bin/node main.js -f etc/config.json

4. Run `make test`.
## Deploying a Muskie Image

If you're changing anything about the way muskie is deployed, configured, or started, you should definitely test creating a muskie image and deploying that into your Manta. This is always a good idea anyway. To run tests against an image, your configuration will be a bit different. Your `MANTA_URL` will be the manta network IP of a muskie instance, with a port number of a muskie process inside a muskie zone (8081). Your `SDC_URL` will be the external network IP of the cloudapi0 zone. You can find both of these IPs with the commands:

    $ vmadm get <webapi_zone_uuid> | json -a nics | json -a nic_tag ip
    $ vmadm lookup -j alias=cloudapi0 | json -a nics | json -a ip

There are various documents about deploying/updating a muskie image in Manta. If you're doing this for the first time and are not sure what to do, I had success with `make buildimage`, which leaves you with an image and manifest in `./bits`. You can then import this image and follow this guide to upgrading manta components:

https://github.com/joyent/manta/blob/master/docs/operator-guide/maintenance.md#upgrading-manta-components
If you run into any problems when following this procedure against the latest version of master, please let us know. There are a couple of things to check first before reporting problems:
- If a test fails that has a name like "(fails if MANTA_USER is operator)", then check that your `MANTA_USER` is indeed not an operator.

- If a test fails with an InvalidRoleTag error, whose message may say something like 'Role tag "muskie_test_role_default" is invalid.', then check that you ran `make test-ac-setup` as described above for the user that you're using. (Note that you may see some other muskie_test_role in the message.)

- If a test fails with a message like "Error: MUSKIE_SALT required", then check that you've specified the three `MUSKIE_` environment variables described above.

- If a test fails due to authorization errors, you may have an incorrect muskie configuration. Check that the "salt", "key", and "iv" attributes in your config.json match the `MUSKIE_SALT`, `MUSKIE_KEY`, and `MUSKIE_IV` environment variables set for the user running the tests (`$MANTA_USER`).

- If the "rmdir mpuRoot" and "ls top" tests fail, MPU may not be enabled. MPU GC is not supported, so if MPU is left enabled, records may accumulate in metadata shards. For this reason, non-developers should not enable MPU. However, if you are a developer and you recently upgraded from a pre-MPU muskie version, ensure the line `"enableMPU": true` is present in your config.json file. If you are running tests against an image whose configuration is managed by SAPI, which includes any zones deployed using `manta-adm`, you will need to set this variable using sapiadm:

      $ sapiadm update $(sdc-sapi /services?name=webapi | json -Ha uuid) metadata."MPU_ENABLE"=true

  Note that ANY value (including `false`) set for `MPU_ENABLE` will write a value of `true` to the muskie configuration file. Once you have done that, check that `/opt/smartdc/muskie/etc/config.json` in each of your webapi zones has been updated to contain `"enableMPU": true`. Then restart your muskie instances so that they pick up the changes to the configuration file:

      $ manta-oneach -s webapi 'svcadm restart "*muskie-*"'
## Metrics

Muskie exposes metrics via node-artedi. See the design document for more information about the metrics that are exposed and how to access them. For development, it is probably easiest to use `curl` to scrape metrics:

    $ curl http://localhost:8881/metrics
Notably, some metadata labels are not collected due to their potential for high cardinality: specifically, remote IP address, object owner, and caller username. Metadata labels that have a large number of unique values cause memory strain on metric client processes (muskie) as well as on metric servers (Prometheus). It's important to understand what effect the addition of metrics and metadata labels can have on the entire system before adding them; this kind of issue would likely not appear in a development or staging environment.
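node-artedi serves metrics in the Prometheus text exposition format, so ordinary text tools work on a scrape. A hedged sketch of summing a counter across its label sets; the metric name `muskie_http_requests_completed` is hypothetical, purely for illustration, and on a live system you would pipe in `curl -s http://localhost:8881/metrics` instead of the sample:

```shell
# Sample scrape output in the Prometheus text exposition format.
# The metric name below is hypothetical, not an actual muskie metric.
scrape='# TYPE muskie_http_requests_completed counter
muskie_http_requests_completed{method="GET"} 10
muskie_http_requests_completed{method="PUT"} 4'

# Sum the counter across its label sets; the sample value is the last field.
printf '%s\n' "$scrape" |
    awk '/^muskie_http_requests_completed/ { sum += $NF } END { print sum }'
```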
## Notes on DNS and service discovery
Like most other components in Triton and Manta, Muskie (deployed with service name "webapi") uses Registrar to register its instances in internal DNS so that other components can find them. The general mechanism is documented in detail in the Registrar README. There are some quirks worth noting about how Muskie uses this mechanism.
First, while most components use local config-agent manifests that are checked into the component repository (e.g., `$repo_root/sapi_manifest/registrar`), Muskie still uses an application-provided SAPI manifest. See MANTA-3173 for details.
Second, Muskie registers itself with DNS domain `manta.$dns_suffix` (where `$dns_suffix` is the DNS suffix for the whole deployment). This is the same DNS name that the "loadbalancer" service uses for its instances. If you look up `manta.$dns_suffix` in a running Manta deployment, you get back the list of "loadbalancer" instances -- not any of the "webapi" (muskie) instances. That's because "loadbalancer" treats this like an ordinary service registration, with a service record at `manta.$dns_suffix` and `load_balancer` records underneath that represent individual instances of the `manta.$dns_suffix` service, but "webapi" registers `host` records underneath that domain. As the above-mentioned Registrar docs explain, `host` records are not included in DNS results when a client queries for the service DNS name; they can only be used to query for the IP address of a specific instance. The net result of all this is that you can find the IP address of a Muskie zone whose zonename you know by querying for `$zonename.manta.$dns_suffix`, but there is no way to enumerate the Muskie instances using DNS, nor is there a way to add that without changing the DNS name for webapi instances, which would be a flag day for Muppet. (This may explain why muppet is a ZooKeeper consumer rather than just a DNS client.)
## DTrace Probes

Muskie has two DTrace providers. The first, `muskie`, has the following probes:

- `client_close`: `json`. Fires if a client uploading an object or part closes before data has been streamed to mako. Also fires if the client closes the connection while the stream is in progress. The argument json object has the following format:

      {
          id: restify uuid, or x-request-id/request-id http header (string)
          method: request http method (string)
          headers: http headers specified by the client (object)
          url: http request url (string)
          bytes_sent: number of bytes streamed to mako before client close (int)
          bytes_expected: number of bytes that should have been streamed (int)
      }
- `socket_timeout`: `json`. Fires when the timeout limit is reached on a connection to a client. This timeout can be configured by setting the `SOCKET_TIMEOUT` environment variable; the default is 120 seconds. The object passed has the same fields as the `client_close` probe's argument, except for `bytes_sent` and `bytes_expected`; these parameters are only present if muskie is able to determine the last request sent on this socket.
The second provider, `muskie-throttle`, has the following probes, which will not fire if the throttle is disabled:

- `request_throttled`: `int`, `int`, `char *`, `char *` - slots occupied, queued requests, url, method. Fires when a request has been throttled.

- `request_handled`: `int`, `int`, `char *`, `char *` - slots occupied, queued requests, url, method. Fires after a request has been handled. Internally, the muskie throttle is implemented with a vasync queue. A "slot" in the above description refers to one of `concurrency` possible spaces allotted for concurrently scheduled request-handling callbacks. If all slots are occupied, incoming requests will be "queued", which indicates that they are waiting for slots to free up.

- `queue_enter`: `char *` - restify request uuid. This probe fires as a request enters the queue.

- `queue_leave`: `char *` - restify request uuid. This probe fires as a request is dequeued, before it is handled.

The purpose of the `queue_enter` and `queue_leave` probes is to make it easy to write D scripts that measure the latency impact the throttle has on individual requests.
The script `bin/throttlestat.d` is implemented as an analog to `moraystat.d` with the `queue_enter` and `queue_leave` probes. It is a good starting point for gaining insight into both how actively a muskie process is being throttled and how much stress it is under.
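For a quick look without a full script, here is a sketch of a one-liner against the `request_throttled` probe described above (run as root inside a muskie zone; the argument order follows the probe description, and the output formatting is illustrative):

    # dtrace -n 'muskie-throttle*:::request_throttled
    {
        /* arg0/arg1 are slots occupied and queued requests;
           arg2/arg3 are the url and method strings. */
        printf("slots=%d queued=%d %s %s\n",
            arg0, arg1, copyinstr(arg3), copyinstr(arg2));
    }'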
The throttle probes are provided in a separate provider to prevent coupling the throttle implementation with muskie itself. Future work may involve making the throttle a generic module that can be included in any service with minimal code modification.