jetstack/navigator

Cleaning up v1alpha1 API in preparation for a 'stable' cut

We want to cut a 'stable' v1alpha1 API that can be depended upon, and begin work on the v1alpha2 API.

Before we do this, in the brief remaining window where we can perform breaking API changes, we should make our 'last' API changes to v1alpha1.

I'd like to call for a review of all of our resources:

  • Pilot
type PilotSpec struct {
	Elasticsearch *PilotElasticsearchSpec `json:"elasticsearch"`
}

type PilotPhase string

const (
	// PreStart occurs before the Pilot's subprocess has been started.
	PilotPhasePreStart PilotPhase = "PreStart"
	// PostStart occurs immediately after the Pilot's subprocess has been
	// started.
	PilotPhasePostStart PilotPhase = "PostStart"
	// PreStop occurs just before the Pilot's subprocess is sent a graceful
	// termination signal. These hooks will block termination of the process.
	PilotPhasePreStop PilotPhase = "PreStop"
	// PostStop occurs after the Pilot's subprocess has stopped. These hooks
	// can be used to clean up, or to perform any other action that may be
	// needed.
	PilotPhasePostStop PilotPhase = "PostStop"
)

type PilotElasticsearchSpec struct {
}

type PilotStatus struct {
	LastCompletedPhase PilotPhase       `json:"lastCompletedPhase"`
	Conditions         []PilotCondition `json:"conditions"`
	// Contains status information specific to Elasticsearch Pilots
	Elasticsearch *ElasticsearchPilotStatus `json:"elasticsearch,omitempty"`
}

type ElasticsearchPilotStatus struct {
	// Documents is the current number of documents on this node. nil indicates
	// an unknown number of documents, whereas 0 indicates that the node is
	// empty
	Documents *int64 `json:"documents,omitempty"`
}

// PilotCondition contains condition information for a Pilot.
type PilotCondition struct {
	// Type of the condition, currently ('Ready').
	Type PilotConditionType `json:"type"`

	// Status of the condition, one of ('True', 'False', 'Unknown').
	Status ConditionStatus `json:"status"`

	// LastTransitionTime is the timestamp corresponding to the last status
	// change of this condition.
	LastTransitionTime metav1.Time `json:"lastTransitionTime"`

	// Reason is a brief machine readable explanation for the condition's last
	// transition.
	Reason string `json:"reason"`

	// Message is a human readable description of the details of the last
	// transition, complementing reason.
	Message string `json:"message"`
}

// PilotConditionType represents a Pilot condition value.
type PilotConditionType string

const (
	// PilotConditionReady represents the fact that a given Pilot is in a
	// ready state.
	PilotConditionReady PilotConditionType = "Ready"
	// PilotConditionStarted represents the fact that a given Pilot is in a
	// started state.
	PilotConditionStarted PilotConditionType = "Started"
	// PilotConditionStopped represents the fact that a given Pilot is in a
	// stopped state.
	PilotConditionStopped PilotConditionType = "Stopped"
)

// ConditionStatus represents a condition's status.
type ConditionStatus string

// These are valid condition statuses. "ConditionTrue" means a resource is in
// the condition; "ConditionFalse" means a resource is not in the condition;
// "ConditionUnknown" means kubernetes can't decide if a resource is in the
// condition or not. In the future, we could add other intermediate
// conditions, e.g. ConditionDegraded.
const (
	// ConditionTrue represents the fact that a given condition is true
	ConditionTrue ConditionStatus = "True"

	// ConditionFalse represents the fact that a given condition is false
	ConditionFalse ConditionStatus = "False"

	// ConditionUnknown represents the fact that a given condition is unknown
	ConditionUnknown ConditionStatus = "Unknown"
)
  • ElasticsearchCluster
// ElasticsearchClusterSpec describes a specification for an ElasticsearchCluster
type ElasticsearchClusterSpec struct {
	Plugins   []string                       `json:"plugins"`
	NodePools []ElasticsearchClusterNodePool `json:"nodePools"`
	Pilot     ElasticsearchPilotImage        `json:"pilot"`
	Image     ElasticsearchImage             `json:"image"`
	Sysctl    []string                       `json:"sysctl"`
}

// ElasticsearchClusterNodePool describes a node pool within an ElasticsearchCluster.
// The nodes in this pool will be configured to be of the specified roles
type ElasticsearchClusterNodePool struct {
	Name         string                                `json:"name"`
	Replicas     int64                                 `json:"replicas"`
	Roles        []ElasticsearchClusterRole            `json:"roles"`
	NodeSelector map[string]string                     `json:"nodeSelector"`
	Resources    *v1.ResourceRequirements              `json:"resources,omitempty"`
	Persistence  ElasticsearchClusterPersistenceConfig `json:"persistence,omitempty"`
	// Config is a map of configuration files to be placed in the elasticsearch
	// config directory. Environment variables may be used in these files and
	// they will be automatically expanded by the Elasticsearch process.
	Config map[string]string `json:"config"`
}

type ElasticsearchClusterRole string

const (
	ElasticsearchRoleData   ElasticsearchClusterRole = "data"
	ElasticsearchRoleMaster ElasticsearchClusterRole = "master"
	ElasticsearchRoleIngest ElasticsearchClusterRole = "ingest"
)

type ElasticsearchClusterPersistenceConfig struct {
	Enabled      bool   `json:"enabled"`
	Size         string `json:"size"`
	StorageClass string `json:"storageClass"`
}

type ImageSpec struct {
	Repository string `json:"repository"`
	Tag        string `json:"tag"`
	PullPolicy string `json:"pullPolicy"`
}

type ElasticsearchPilotImage struct {
	ImageSpec `json:",inline"`
}

type ElasticsearchImage struct {
	ImageSpec `json:",inline"`
	FsGroup   int64 `json:"fsGroup"`
}

type ElasticsearchClusterStatus struct {
	NodePools map[string]ElasticsearchClusterNodePoolStatus `json:"nodePools"`
}

type ElasticsearchClusterNodePoolStatus struct {
	ReadyReplicas int64 `json:"readyReplicas"`
}
  • CassandraCluster
type CassandraClusterSpec struct {
	Sysctl    []string                   `json:"sysctl"`
	NodePools []CassandraClusterNodePool `json:"nodePools"`
	Image     CassandraImage             `json:"image"`
	CqlPort   int32                      `json:"cqlPort"`
}

type CassandraImage struct {
	ImageSpec `json:",inline"`
}

// CassandraClusterNodePool describes a node pool within a CassandraCluster.
type CassandraClusterNodePool struct {
	Name     string `json:"name"`
	Replicas int64  `json:"replicas"`
}

type CassandraClusterStatus struct {
	NodePools map[string]CassandraClusterNodePoolStatus `json:"nodePools"`
}

type CassandraClusterNodePoolStatus struct {
	ReadyReplicas int64 `json:"readyReplicas"`
}

(taken from: https://github.com/jetstack/navigator/blob/afed2beb6a1f9638fb2b2ffe9fdad9b73704717c/pkg/apis/navigator/v1alpha1/types.go)

In CassandraCluster:

  • should we expose CqlPort? Can we manage it ourselves for now until such time that it's required to change it?
  • per our conversation yesterday, do we want to expose multiple node pool support right now? We could switch to this model in future more easily than we can switch back.
  • is ReadyReplicas used at the moment?

In ElasticsearchCluster:

  • we currently expose elasticsearchcluster.spec.nodePools.config, in which the user (usually) specifies an entire elasticsearch.yml to use. This creates a hard dependency between the NODE_MASTER, NODE_INGEST and NODE_DATA boolean env vars set by the Pilot, and the user's config in elasticsearch.yml.

I've planned for a while to let the user omit fields from this file and have the pilot automatically inject config into it (i.e. have the elasticsearchcluster.spec.nodePools.config["elasticsearch.yml"] key append to, rather than replace, the generated config).

I'm not sure where the best tradeoff lies between overly complex ElasticsearchCluster manifests and the right level of customisability.
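To make that concrete, here's a rough sketch of the append-instead-of-replace idea, assuming the pilot parses the user's elasticsearch.yml and overlays the settings it owns. The function names and the use of gopkg.in/yaml.v2 are illustrative, not what navigator does today:

// Hypothetical sketch: merge role-derived settings into the user's
// elasticsearch.yml instead of replacing the file wholesale.
import "gopkg.in/yaml.v2"

func renderElasticsearchYML(userYML string, roles []ElasticsearchClusterRole) (string, error) {
	settings := map[string]interface{}{}
	if err := yaml.Unmarshal([]byte(userYML), &settings); err != nil {
		return "", err
	}
	// The pilot owns these keys, so the user no longer has to keep
	// node.master/node.data/node.ingest in sync with the NODE_* env vars.
	settings["node.master"] = hasRole(roles, ElasticsearchRoleMaster)
	settings["node.data"] = hasRole(roles, ElasticsearchRoleData)
	settings["node.ingest"] = hasRole(roles, ElasticsearchRoleIngest)
	out, err := yaml.Marshal(settings)
	return string(out), err
}

func hasRole(roles []ElasticsearchClusterRole, want ElasticsearchClusterRole) bool {
	for _, r := range roles {
		if r == want {
			return true
		}
	}
	return false
}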

/cc @cehoffman @wallrj

In #194 (comment) @munnerz wrote:

should we expose CqlPort? Can we manage it ourselves for now until such time that it's required to change it?

Probably not. I've looked at GoCQL and its cluster discovery mechanism, and I think the CQL loadbalancing service may not be necessary and may actually confuse the discovery process, since the loadbalancer IP address isn't one of the addresses advertised by the cluster nodes. See #232

In which case, we may not need the readiness probe either 🤔
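For context, a minimal gocql sketch of why the loadbalancer may be redundant: the driver only needs a few seed addresses and discovers the rest of the ring itself. The host names are illustrative (e.g. the per-node DNS entries of a headless Service):

package main

import "github.com/gocql/gocql"

func main() {
	// Seed with individual node addresses rather than a loadbalancer VIP;
	// gocql discovers and connects to the remaining nodes on its own.
	cluster := gocql.NewCluster(
		"cass-0.cassandra.default.svc.cluster.local",
		"cass-1.cassandra.default.svc.cluster.local",
	)
	session, err := cluster.CreateSession()
	if err != nil {
		panic(err)
	}
	defer session.Close()
}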

per our conversation yesterday, do we want to expose multiple node pool support right now? We could switch to this model in future more easily than we can switch back.

I think the node pool may be the mechanism by which we support multiple racks and datacentres. See #227

And perhaps you'd add another node pool if you wanted to change the size or class of the disks attached to nodes: add a new pool with the desired disks, then remove the pool with the original disks.
See #233

is ReadyReplicas used at the moment?

Not yet. But hopefully soon. See #140

I think we should remove sysctls and document an alternative.
It seems like something that should be configured when provisioning the Kubernetes node, and then there should be a way of scheduling Elasticsearch pods onto only the nodes where those sysctls have been configured.

I'm pretty much in agreement at the moment. The only sysctl settings I think should be allowed through are those on the safe whitelist that the kubelet knows how to apply through annotations on the pod. https://kubernetes.io/docs/concepts/cluster-administration/sysctl-cluster/#setting-sysctls-for-a-pod
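If we go that route, the controller could translate spec.sysctl entries into the pod annotation described in those docs rather than applying them itself. A hedged sketch; the helper name is made up, and it assumes every entry is already on the kubelet's safe whitelist:

import (
	"strings"

	"k8s.io/api/core/v1"
)

// applySafeSysctls is a hypothetical helper that forwards whitelisted
// sysctls (e.g. "kernel.shm_rmid_forced=1") to the kubelet via the
// alpha pod annotation.
func applySafeSysctls(tmpl *v1.PodTemplateSpec, sysctls []string) {
	if len(sysctls) == 0 {
		return
	}
	if tmpl.Annotations == nil {
		tmpl.Annotations = map[string]string{}
	}
	tmpl.Annotations["security.alpha.kubernetes.io/sysctls"] = strings.Join(sysctls, ",")
}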

Before we close this, we should make sure we don't return 'null' as a field value anywhere, else clients could get confused.

I've seen it at various points in our API, and for now I think we'll need to run a manual check. Perhaps some kind of basic check that the default values for all fields produce a resource with no nulls set.
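For whoever picks this up: nil slices and maps are the usual culprit in Go, since they marshal to JSON as null while empty ones marshal as [] and {}. A quick illustration:

package main

import (
	"encoding/json"
	"fmt"
)

type status struct {
	Conditions []string          `json:"conditions"`
	NodePools  map[string]string `json:"nodePools"`
}

func main() {
	var s status
	b, _ := json.Marshal(s)
	fmt.Println(string(b)) // {"conditions":null,"nodePools":null}

	// Initialising the fields before returning the resource avoids the nulls.
	s.Conditions = []string{}
	s.NodePools = map[string]string{}
	b, _ = json.Marshal(s)
	fmt.Println(string(b)) // {"conditions":[],"nodePools":{}}
}

A defaulting function registered for each type would be a natural place to do that initialisation.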