Requests for Discussion

Writing down ideas for system enhancement while they are still nascent allows for important, actionable technical discussion. We capture these in Requests for Discussion, which are documents in the original spirit of the IETF Request for Comments, as expressed by RFC 3:

The content of a note may be any thought, suggestion, etc. related to the software or other aspect of the network. Notes are encouraged to be timely rather than polished. Philosophical positions without examples or other specifics, specific suggestions or implementation techniques without introductory or background explication, and explicit questions without any attempted answers are all acceptable. The minimum length for a note is one sentence.

These standards (or lack of them) are stated explicitly for two reasons. First, there is a tendency to view a written statement as ipso facto authoritative, and we hope to promote the exchange and discussion of considerably less than authoritative ideas. Second, there is a natural hesitancy to publish something unpolished, and we hope to ease this inhibition.

The philosophy of our Requests for Discussion is exactly this: timely rather than polished, with the immediate idea of promoting technical discussion. Over time, we expect that this discussion will often converge on an authoritative explanation of new functionality -- but it's entirely acceptable for an RFD to serve only as a vector of discussion. (We use the term "Requests for Discussion" in lieu of "Requests for Comments" to avoid conflation with the IETF construct -- and the more formal writing that it has come to represent.)

RFDs

state RFD
publish RFD 1 Triton Container Naming Service
publish RFD 2 Docker Logging in SDC
draft RFD 3 Triton Compute Nodes Reboot
draft RFD 4 Docker Build Implementation For Triton
publish RFD 5 Triton Change Feed Support
publish RFD 6 Improving Triton and Manta RAS Infrastructure
draft RFD 7 Datalink LLDP and State Tracking
predraft RFD 8 Datalink Fault Management Topology
publish RFD 9 sdcadm fabrics management
publish RFD 10 Sending GZ Docker Logs to Manta
draft RFD 11 IPv6 and multiple IP addresses support in Triton
draft RFD 12 Bedtime for node-smartdc
draft RFD 13 RBAC v2 for Improved Organization and Docker RBAC Support
draft RFD 14 Signed ZFS Send
draft RFD 15 Reduce/Eliminate runtime LX image customization
predraft RFD 16 Manta Metering
abandoned RFD 17 Cloud Analytics v2
publish RFD 18 Support for using labels to select networks and packages
predraft RFD 19 Interface Drift In Workflow Modules
draft RFD 20 Manta Slop-Aware Zone Scheduling
draft RFD 21 Metadata Scrubber For Triton
draft RFD 22 Improved user experience after a request has failed
publish RFD 23 Manta docs pipeline
draft RFD 24 Designation API improvements to facilitate platform update
draft RFD 25 Pluralizing CloudAPI CreateMachine et al
publish RFD 26 Network Shared Storage for Triton
publish RFD 27 Triton Container Monitor
predraft RFD 28 Improving syncing between Compute Nodes and NAPI
draft RFD 29 Nothing in Triton should rely on ur outside bootstrapping and emergencies
predraft RFD 30 Handling "lastexited" for zones when CN is rebooted or crashes
draft RFD 31 libscsi and uscsi(7I) Improvements for Firmware Upgrade
draft RFD 32 Multiple IP Addresses in NAPI
publish RFD 33 Moray client v2
publish RFD 34 Instance migration
draft RFD 35 Distributed Tracing for Triton
draft RFD 36 Mariposa
draft RFD 37 Metrics Instrumenter
publish RFD 38 Zone Physical Memory Capping
draft RFD 39 VM Attribute Cache (vminfod)
publish RFD 40 Standalone IMGAPI deployment
draft RFD 41 Improved JavaScript errors
draft RFD 42 Provide global zone pkgsrc package set
publish RFD 43 Rack Aware Network Pools
predraft RFD 44 Create VMs with Delegated Datasets
abandoned RFD 45 Tooling for code reviews and code standards
publish RFD 46 Origin images for Triton and Manta core images
publish RFD 47 Retention policy for Joyent engineering data in Manta
predraft RFD 48 Triton A&A Overhaul (AUTHAPI)
predraft RFD 49 AUTHAPI internals
predraft RFD 50 Enhanced Audit Trail for Instance Lifecycle Events
draft RFD 51 Code Review Guidance
draft RFD 52 Moray test suite rework
draft RFD 53 Improving ZFS Pool Layout Flexibility
predraft RFD 54 Remove 'autoboot' when VMs stop from within
abandoned RFD 55 LX support for Mount Namespaces
predraft RFD 56 Revamp Cloudapi
publish RFD 57 Moving to Content Addressable Docker Images
predraft RFD 58 Moving Net-Agent Forward
publish RFD 59 Update Triton to Node.js v4-LTS
draft RFD 60 Scaling the Designation API
draft RFD 61 CNAPI High Availability
predraft RFD 62 Replace Workflow API
abandoned RFD 63 Adding branding to kernel cred_t
predraft RFD 64 Hardware Inventory GRUB Menu Item
publish RFD 65 Multipart Uploads for Manta
draft RFD 66 USBA improvements for USB 3.x
draft RFD 67 Triton headnode resilience
draft RFD 68 Triton versioning
publish RFD 69 Metadata socket improvements
draft RFD 70 Joyent Repository Metadata
publish RFD 71 Manta Client-side Encryption
abandoned RFD 72 Chroot-independent Device Access
publish RFD 73 Moray client support for SRV-based service discovery
draft RFD 74 Manta fault tolerance test plan
abandoned RFD 75 Virtualizing the number of CPUs
draft RFD 76 Improving Manta Networking Setup
draft RFD 77 Hardware-backed per-zone crypto tokens
publish RFD 78 Making Moray's findobjects requests robust with regards to unindexed fields
predraft RFD 79 Reserved for Mariposa
predraft RFD 80 Reserved for Mariposa
predraft RFD 81 Reserved for Mariposa
draft RFD 82 Triton agents install and update
publish RFD 83 Triton http_proxy support
predraft RFD 84 Providing Manta access on multiple networks
publish RFD 85 Tactical improvements for Manta alarms
publish RFD 86 ContainerPilot 3
predraft RFD 87 Docker Events for Triton
publish RFD 88 DC and Hardware Management Futures
publish RFD 89 Project Tiresias
predraft RFD 90 Handling CPU Caps in Triton
predraft RFD 91 Application Metrics in SDC and Manta
predraft RFD 92 Triton Services High Availability
publish RFD 93 Modernize TLS Options
draft RFD 94 Global Zone metrics in CMON
publish RFD 95 Seamless Muppet Reconfiguration
publish RFD 96 Named thread API
draft RFD 97 Project Hookshot - Improved VLAN Handling
predraft RFD 98 Issue Prioritisation Guidelines
publish RFD 99 Client Library for Collecting Application Metrics
draft RFD 100 Improving lint and style checks in JavaScript
draft RFD 101 Models for operational escalation into engineering
publish RFD 102 Requests for Enhancement
draft RFD 103 Operationalize Resharding
draft RFD 104 Engineering Guide - General Principles
draft RFD 105 Engineering Guide - Node.js Best Practices
abandoned RFD 106 Engineering Guide - Go Best Practices
publish RFD 107 Self assigned IP's and reservations
draft RFD 108 Remove Support for the Kernel Memory Cage
predraft RFD 109 Run Operator-Script Earlier during Image Creation
predraft RFD 110 Operator-Configurable Throttles for Manta
publish RFD 111 Manta Incident Response Practice
draft RFD 112 Manta Storage Auditor
publish RFD 113 x-account image transfer and x-DC image copying
predraft RFD 114 GPGPU Instance Support in Triton
draft RFD 115 Improving Manta Data Path Availability
predraft RFD 116 Manta Bucket Exploration
predraft RFD 117 Network Traits
draft RFD 118 MAC Checksum Offload Extensions
draft RFD 119 Routing Between Fabric Networks
abandoned RFD 120 The Triton Router Object, phase 1 (intra-DC, fabric only)
predraft RFD 121 bhyve brand
draft RFD 122 Per-brand resource and property customization
predraft RFD 123 Online Manta Garbage Collection
draft RFD 124 Manta Incident Response Guide
predraft RFD 125 Online Schema Changes in Manta
draft RFD 126 Zone Configuration Conversions
predraft RFD 127 In-process Brand Hooks
draft RFD 128 VXLAN Tunneling Performance Improvements
predraft RFD 129 Manta Performance Bottleneck Investigation
predraft RFD 130 The Triton Remote Network Object
predraft RFD 131 The Triton Datacenter API (DCAPI)
publish RFD 132 Conch: Unified Rack Integration Process
publish RFD 133 Conch: Improved Device Validation
publish RFD 134 Conch: User Access Control
draft RFD 135 Conch: Job Queue and Real-Time Notifications
draft RFD 136 Conch: Orchestration
publish RFD 137 CPU Autoreplacement and ID Synthesis
predraft RFD 138 Multi-subnet Admin Networks
predraft RFD 139 Node.js test frameworks and Triton guidelines
predraft RFD 140 Conch: Datacenter Designer
predraft RFD 141 Platform Image Build v2 (PIBv2)
draft RFD 142 Use SMF logging for Manta services
draft RFD 143 Manta Scalable Garbage Collection Plan
predraft RFD 144 Conch: Datacenter Switch Automation
publish RFD 145 Lullaby 3: Improving the Triton/Manta builds
predraft RFD 146 Conch: Inventory System
publish RFD 147 Project Tiresias: USB Topology
abandoned RFD 148 Snapper: VM Snapshots
draft RFD 149 PostgreSQL Schema For Manta buckets
draft RFD 150 Operationalizing Prometheus, Thanos, and Grafana
publish RFD 151 Assessing Software Engineering Candidates
draft RFD 152 Rack Aware Networking
draft RFD 153 Incremental metadata expansion for Manta buckets
publish RFD 154 Flexible disk space for bhyve VMs
publish RFD 155 Manta Buckets API
publish RFD 156 SmartOS/Triton Boot Modernization
draft RFD 157 Notices to Operators
draft RFD 158 NAT Reform, including public IPs for fabric-attached instances.
draft RFD 159 Manta Storage Zone Capacity Limit
predraft RFD 160 CloudWatch-like Metrics for Manta
predraft RFD 161 Rust on SmartOS/illumos
predraft RFD 162 Online repair and rebalance of Manta objects
draft RFD 163 Cloud Firewall Logging
draft RFD 164 Open Source Policy
draft RFD 165 Security Updates for Triton/Manta Core Images
draft RFD 166 Improving phy Management
predraft RFD 167 Drop i386 and multiarch Package Sets
draft RFD 168 Bootstrapping a Manta Buckets deployment
draft RFD 169 Encrypted kernel crash dump
draft RFD 170 Manta Picker Component
abandoned RFD 171 A Proposal for Manta SnapLinks
predraft RFD 172 CNS Aggregation
predraft RFD 173 KBMAPI and kbmd
draft RFD 174 Improving Manta Storage Unit Cost (iSCSI)
publish RFD 175 SmartOS integration process changes
publish RFD 176 SmartOS and Triton boot from ZFS pool
predraft RFD 177 Linux Compute Node Umbrella
predraft RFD 178 Linux Platform Image
predraft RFD 179 Linux Compute Node Networking
predraft RFD 180 Linux Compute Node Containers
draft RFD 181 Improving Manta Storage Unit Cost (MinIO)
draft RFD 182 Altering system pool detection in SmartOS/Triton
predraft RFD 183 Triton Volume Replication and Backup

Contents of an RFD

The following is a way to help you think about and structure an RFD document. This includes some things that we think you might want to include. If you're unsure if you need to write an RFD, here are some occasions where it usually is appropriate:

  • Adding new endpoints to an API or creating an entirely new API
  • Adding new commands or adding new options
  • Changing the behaviour of endpoints, commands, APIs
  • Otherwise changing the implementation of a component in a significant way
  • Changing how users and operators interact with the overall system
  • Changing the way that software is developed or deployed
  • Changing the way that software is packaged or operated
  • Otherwise changing the way that software is built

This is deliberately broad; the most important common strain across RFDs is that they are technical documents describing implementation considerations of some flavor or another. Note that this does not include high-level descriptions of desired functionality; such requests should instead be phrased as Requests for Enhancement.

Each RFD starts as a simple Markdown file that uses a bit of additional metadata to describe its current state. Every RFD needs a title that serves as a simple synopsis of the document. (This title is not fixed; RFDs are numbered to allow the title to change.) In general, we recommend that any initial RFD address and/or ask the following questions:

Title

This is a simple synopsis of the document. Note, the title is not fixed. It may change as the RFD evolves.

What problem is this solving?

The goal here is to describe the problems that we are trying to address that motivate the solution. The problem should not be described in terms of the solution.

What are the principles and constraints on the design of the solution?

You should use this section to describe the first principles or other important decisions that constrain the problem. For example, a constraint on the design may be that we should be able to do an operation without downtime.

How will users interact with these features?

Here, you should consider operators, end users, and developers alike. You should consider not only how they'll verify that it's working correctly, but also how they'll verify if it's broken and what actions they should take from there.

What repositories are being changed, if known?

If it's known, a list of what git repositories are being changed as a result of this would be quite useful.

What public interfaces are changing?

What interfaces that users and operators are using and rely upon are changing? Note that when changing public interfaces we have to be extra careful to ensure that we don't break existing users and scripts.

What private interfaces are changing?

What interfaces that are private to the system are changing? Changing these interfaces may impact the system, but should not impact operators and users directly.

What is the upgrade impact?

For an existing install, what are the implications if anything is upgraded through the normal update mechanisms, e.g. platform reboot, sdcadm update, manta-adm update, etc.? Are there any special steps that need to be taken, or do certain updates need to happen together for this change?

What is the security impact?

What (untrusted) user input (including both data and code) will be used as part of the change? Which components will interact with that input? How will that input be validated and managed securely? What new operations are exposed and which privileges will they require (both system privileges and Triton privileges)? How would an attacker use the proposed facilities to escalate their privileges?

Mechanics of an RFD

To create a new RFD, you should do the following steps.

Allocate a new RFD number

RFDs are numbered starting at 1 and increase from there. When you start, you should allocate the next currently unused number. Note that if someone puts back to the repository before you, then you should just increase your number to the next available one. So, if the next RFD would be number 42, then you should make the directory rfd/0042 and place the document in the file rfd/0042/README.md. Note that while we use four digits in the directories and numbering, you do not need to use the leading zeros when referring to an RFD.

$ mkdir -p rfd/0042
$ cp prototypes/prototype.md rfd/0042/README.md
$
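The allocation step above can also be scripted. The following is a minimal sketch, not part of the repository's tooling: next_rfd is a hypothetical helper that assumes RFD directories are four-digit names under the directory it is given (rfd by default) and prints the next unused one.

```shell
# Hypothetical helper (not part of the repository): print the next unused
# four-digit RFD directory name. Assumes RFD directories are named with four
# digits (e.g. rfd/0041) under the directory given as $1 (default: rfd).
next_rfd() {
    dir="${1:-rfd}"
    max=0
    for d in "$dir"/[0-9][0-9][0-9][0-9]; do
        [ -d "$d" ] || continue        # skips the literal pattern when nothing matches
        n="${d##*/}"                   # keep only the directory name
        # Strip leading zeros so the number is never read as octal.
        while [ "${n#0}" != "$n" ]; do n="${n#0}"; done
        n="${n:-0}"
        if [ "$n" -gt "$max" ]; then max="$n"; fi
    done
    printf '%04d\n' "$((max + 1))"
}

# Possible usage, mirroring the commands above:
#   num=$(next_rfd rfd)
#   mkdir -p "rfd/$num"
#   cp prototypes/prototype.md "rfd/$num/README.md"
```

If someone else puts back before you do, rerunning the helper after updating your clone should yield the new next number.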

Write the RFD

At this point, you should write up the RFD. Any files that end in *.md will automatically be rendered into HTML and any other assets in that directory will automatically be copied into the output directory.

RFDs should have a default text width of 80 characters. Any other materials related to that RFD should be in the same directory.
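One way to spot lines that exceed the 80-character default is sketched below. The long_lines function is a hypothetical helper, not something the repository provides; some lines (long URLs, the authors metadata) may reasonably exceed the limit, so treat its output as a hint rather than an error.

```shell
# Hypothetical helper: list lines in file $1 that are wider than $2
# characters (default 80), printed as "file:line: width".
long_lines() {
    awk -v limit="${2:-80}" '
        length($0) > limit { printf "%s:%d: %d\n", FILENAME, FNR, length($0) }
    ' "$1"
}

# Possible usage:
#   long_lines rfd/0042/README.md
```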

RFD Metadata and State

At the start of every RFD document, we'd like to include a brief amount of metadata. The metadata format is based on the python-markdown2 metadata format. It'd look like:

---
authors: Han Solo <han.solo@shot.first.org>, Alexander Hamilton <ah@treasury.gov>
state: draft
---

We keep track of two pieces of metadata: the authors and the state. There may be any number of authors; each should be listed with their name and e-mail address.

An RFD can be in one of the following four states:

  1. predraft
  2. draft
  3. publish
  4. abandoned

The predraft state indicates that the work is not yet ready for discussion and that the RFD is effectively a placeholder. Documents under active discussion should be in the draft state. Once (or if) discussion has converged and the document has come to reflect reality rather than propose it, it should be updated to the publish state.

Note that just because something is in the publish state does not mean that it cannot be updated and corrected. See the "Touching up" section for more information.

Finally, if an idea is found to be non-viable (that is, deliberately never implemented) or if an RFD should otherwise be marked as one to ignore, it can be moved into the abandoned state.
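To check that an RFD declares one of these four states, something like the following sketch could be used. The rfd_state function is a hypothetical helper; it assumes the metadata block is the first pair of "---" lines in the file and that the state appears on its own "state:" line, as in the example above.

```shell
# Hypothetical helper: print the state declared in the metadata block of the
# RFD file $1, or fail if it is missing or is not one of the four states.
rfd_state() {
    # Only look at lines between the first pair of "---" markers.
    state=$(awk '
        /^---[ \t]*$/ { fences++; if (fences == 2) exit; next }
        fences == 1 && /^state:/ {
            sub(/^state:[ \t]*/, ""); sub(/[ \t]*$/, ""); print; exit
        }
    ' "$1")
    case "$state" in
        predraft|draft|publish|abandoned)
            printf '%s\n' "$state" ;;
        *)
            echo "$1: missing or unknown state: '$state'" >&2
            return 1 ;;
    esac
}

# Possible usage:
#   rfd_state rfd/0042/README.md
```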

Start the discussion

Once you have reached a point where you're happy with your thoughts and notes, then to start the discussion, you should first make sure you've pushed your changes to the repository and that the build is working.

From here, send an e-mail to the appropriate mailing list that best fits your work. The options are:

The subject of the message should be the RFD number and synopsis. For example, if you have RFD number 169 with the title Overlay Networks for Triton, then the subject would be RFD 169 Overlay Networks for Triton.

In the body, make sure to include a link to the RFD.

If an RFD is in the predraft or draft state, you should also open an issue to allow for additional opportunity for discussion of the RFD. This issue should have the synopsis that reflects its purpose (e.g. "RFD 169: Discussion") and the body should explain its intent (e.g. "This issue represents an opportunity for discussion of RFD 169 while it remains in a pre-published state."). Moreover, a discussion field should be added to the RFD metadata, with a URL that points to an issue query for the RFD number. For example:

---
authors: Chewbacca <chewie77@falcon.org>
state: draft
discussion: https://github.com/TritonDataCenter/rfd/issues?q="RFD+169"
---

When the RFD is transitioned into the publish state, the discussion issue should be closed with an explanatory note (e.g. "This RFD has been published and while additional feedback is welcome, this discussion issue is being closed."), but the discussion link should remain in the RFD metadata.
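The naming conventions described in this section (mail subject, issue synopsis, and discussion link) can be generated from the RFD number and title. This is only an illustrative sketch: rfd_announce is a hypothetical helper, and the "Subject:" and "Issue title:" labels in its output are just for readability.

```shell
# Hypothetical helper: given an RFD number (without leading zeros) and its
# title, print the mail subject, the discussion issue title, and the
# discussion line for the RFD metadata.
rfd_announce() {
    num="$1"; shift
    title="$*"
    printf 'Subject: RFD %s %s\n' "$num" "$title"
    printf 'Issue title: RFD %s: Discussion\n' "$num"
    printf 'discussion: https://github.com/TritonDataCenter/rfd/issues?q="RFD+%s"\n' "$num"
}

# Possible usage:
#   rfd_announce 169 Overlay Networks for Triton
```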

Note that discussion might happen via more than one means; if discussion is being duplicated across media, it's up to the author(s) to reflect or otherwise reconcile discussion in the RFD itself. (That is, it is the RFD that is canonical, not necessarily the discussion which may be occurring online, offline, in person, over chat, or wherever human-to-human interaction can be found.)

Finishing up

When discussion has wrapped up and the relevant feedback has been incorporated, then you should go ahead and change the state of the document to publish and push that change.

Touching up

As work progresses on a project, it may turn out that our initial ideas and theories have been disproved or other architectural issues have come up. In such cases, you should come back and update the RFD to reflect the final conclusions or, if it's a rather substantial issue, then you should consider creating a new RFD.

Contributing

Contributions are welcome; you do not have to be a Joyent employee to submit an RFD or to comment on one. The discussions for RFDs happen in the open on the various mailing lists related to Triton, Manta, and SmartOS.

To submit a new RFD, please provide a git patch or a pull request that consists of a single squashed commit, and we will incorporate it into the repository. Alternatively, feel free to send the document to the mailing list; as we discuss it, we can work together to pull it into the RFD repository.