Fresh dep ensure

This commit is contained in:
Mike Cronce
2018-11-26 13:23:56 -05:00
parent 93cb8a04d7
commit 407478ab9a
9016 changed files with 551394 additions and 279685 deletions

View File

@ -14,6 +14,7 @@ tags:
subordinate: false
series:
- xenial
- bionic
requires:
apiserver:
interface: http

View File

@ -1,6 +1,6 @@
options:
channel:
type: string
default: "1.10/stable"
default: "1.11/stable"
description: |
Snap channel to install Kubernetes snaps from

View File

@ -15,6 +15,7 @@ tags:
- conformance
series:
- xenial
- bionic
requires:
kubernetes-master:
interface: http

View File

@ -1,19 +1,19 @@
# Kubernetes-master
[Kubernetes](http://kubernetes.io/) is an open source system for managing
[Kubernetes](http://kubernetes.io/) is an open source system for managing
application containers across a cluster of hosts. The Kubernetes project was
started by Google in 2014, combining the experience of running production
started by Google in 2014, combining the experience of running production
workloads combined with best practices from the community.
The Kubernetes project defines some new terms that may be unfamiliar to users
or operators. For more information please refer to the concept guide in the
or operators. For more information please refer to the concept guide in the
[getting started guide](https://kubernetes.io/docs/home/).
This charm is an encapsulation of the Kubernetes master processes and the
This charm is an encapsulation of the Kubernetes master processes and the
operations to run on any cloud for the entire lifecycle of the cluster.
This charm is built from other charm layers using the Juju reactive framework.
The other layers focus on specific subset of operations making this layer
The other layers focus on specific subset of operations making this layer
specific to operations of Kubernetes master processes.
# Deployment
@ -23,15 +23,15 @@ charms to model a complete Kubernetes cluster. A Kubernetes cluster needs a
distributed key value store such as [Etcd](https://coreos.com/etcd/) and the
kubernetes-worker charm which delivers the Kubernetes node services. A cluster
requires a Software Defined Network (SDN) and Transport Layer Security (TLS) so
the components in a cluster communicate securely.
the components in a cluster communicate securely.
Please take a look at the [Canonical Distribution of Kubernetes](https://jujucharms.com/canonical-kubernetes/)
or the [Kubernetes core](https://jujucharms.com/kubernetes-core/) bundles for
Please take a look at the [Canonical Distribution of Kubernetes](https://jujucharms.com/canonical-kubernetes/)
or the [Kubernetes core](https://jujucharms.com/kubernetes-core/) bundles for
examples of complete models of Kubernetes clusters.
# Resources
The kubernetes-master charm takes advantage of the [Juju Resources](https://jujucharms.com/docs/2.0/developer-resources)
The kubernetes-master charm takes advantage of the [Juju Resources](https://jujucharms.com/docs/2.0/developer-resources)
feature to deliver the Kubernetes software.
In deployments on public clouds the Charm Store provides the resource to the
@ -40,9 +40,41 @@ firewall rules may not be able to contact the Charm Store. In these network
restricted environments the resource can be uploaded to the model by the Juju
operator.
#### Snap Refresh
The kubernetes resources used by this charm are snap packages. When not
specified during deployment, these resources come from the public store. By
default, the `snapd` daemon will refresh all snaps installed from the store
four (4) times per day. A charm configuration option is provided for operators
to control this refresh frequency.
>NOTE: this is a global configuration option and will affect the refresh
time for all snaps installed on a system.
Examples:
```sh
## refresh kubernetes-master snaps every tuesday
juju config kubernetes-master snapd_refresh="tue"
## refresh snaps at 11pm on the last (5th) friday of the month
juju config kubernetes-master snapd_refresh="fri5,23:00"
## delay the refresh as long as possible
juju config kubernetes-master snapd_refresh="max"
## use the system default refresh timer
juju config kubernetes-master snapd_refresh=""
```
For more information on the possible values for `snapd_refresh`, see the
*refresh.timer* section in the [system options][] documentation.
[system options]: https://forum.snapcraft.io/t/system-options/87
# Configuration
This charm supports some configuration options to set up a Kubernetes cluster
This charm supports some configuration options to set up a Kubernetes cluster
that works in your environment:
#### dns_domain
@ -61,14 +93,14 @@ Enable RBAC and Node authorisation.
# DNS for the cluster
The DNS add-on allows the pods to have a DNS names in addition to IP addresses.
The Kubernetes cluster DNS server (based off the SkyDNS library) supports
forward lookups (A records), service lookups (SRV records) and reverse IP
The Kubernetes cluster DNS server (based off the SkyDNS library) supports
forward lookups (A records), service lookups (SRV records) and reverse IP
address lookups (PTR records). More information about the DNS can be obtained
from the [Kubernetes DNS admin guide](http://kubernetes.io/docs/admin/dns/).
# Actions
The kubernetes-master charm models a few one time operations called
The kubernetes-master charm models a few one time operations called
[Juju actions](https://jujucharms.com/docs/stable/actions) that can be run by
Juju users.
@ -80,7 +112,7 @@ requires a relation to the ceph-mon charm before it can create the volume.
#### restart
This action restarts the master processes `kube-apiserver`,
This action restarts the master processes `kube-apiserver`,
`kube-controller-manager`, and `kube-scheduler` when the user needs a restart.
# More information
@ -93,7 +125,7 @@ This action restarts the master processes `kube-apiserver`,
# Contact
The kubernetes-master charm is free and open source operations created
by the containers team at Canonical.
by the containers team at Canonical.
Canonical also offers enterprise support and customization services. Please
refer to the [Kubernetes product page](https://www.ubuntu.com/cloud/kubernetes)

View File

@ -1,7 +1,7 @@
restart:
description: Restart the Kubernetes master services on demand.
create-rbd-pv:
description: Create RADOS Block Device (RDB) volume in Ceph and creates PersistentVolume.
description: Create RADOS Block Device (RDB) volume in Ceph and creates PersistentVolume. Note this is deprecated on Kubernetes >= 1.10 in favor of CSI, where PersistentVolumes are created dynamically to back PersistentVolumeClaims.
params:
name:
type: string

View File

@ -38,6 +38,14 @@ def main():
this script thinks the environment is 'sane' enough to provision volumes.
'''
# k8s >= 1.10 uses CSI and doesn't directly create persistent volumes
if get_version('kube-apiserver') >= (1, 10):
print('This action is deprecated in favor of CSI creation of persistent volumes')
print('in Kubernetes >= 1.10. Just create the PVC and a PV will be created')
print('for you.')
action_fail('Deprecated, just create PVC.')
return
# validate relationship pre-reqs before additional steps can be taken
if not validate_relation():
print('Failed ceph relationship check')
@ -89,6 +97,23 @@ def main():
check_call(cmd)
def get_version(bin_name):
"""Get the version of an installed Kubernetes binary.
:param str bin_name: Name of binary
:return: 3-tuple version (maj, min, patch)
Example::
>>> `get_version('kubelet')
(1, 6, 0)
"""
cmd = '{} --version'.format(bin_name).split()
version_string = check_output(cmd).decode('utf-8')
return tuple(int(q) for q in re.findall("[0-9]+", version_string)[:3])
def action_get_or_default(key):
''' Convenience method to manage defaults since actions dont appear to
properly support defaults '''

View File

@ -1,4 +1,52 @@
options:
audit-policy:
type: string
default: |
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
# Don't log read-only requests from the apiserver
- level: None
users: ["system:apiserver"]
verbs: ["get", "list", "watch"]
# Don't log kube-proxy watches
- level: None
users: ["system:kube-proxy"]
verbs: ["watch"]
resources:
- resources: ["endpoints", "services"]
# Don't log nodes getting their own status
- level: None
userGroups: ["system:nodes"]
verbs: ["get"]
resources:
- resources: ["nodes"]
# Don't log kube-controller-manager and kube-scheduler getting endpoints
- level: None
users: ["system:unsecured"]
namespaces: ["kube-system"]
verbs: ["get"]
resources:
- resources: ["endpoints"]
# Log everything else at the Request level.
- level: Request
omitStages:
- RequestReceived
description: |
Audit policy passed to kube-apiserver via --audit-policy-file.
For more info, please refer to the upstream documentation at
https://kubernetes.io/docs/tasks/debug-application-cluster/audit/
audit-webhook-config:
type: string
default: ""
description: |
Audit webhook config passed to kube-apiserver via --audit-webhook-config-file.
For more info, please refer to the upstream documentation at
https://kubernetes.io/docs/tasks/debug-application-cluster/audit/
addons-registry:
type: string
default: ""
description: Specify the docker registry to use when applying addons
enable-dashboard-addons:
type: boolean
default: True
@ -41,7 +89,7 @@ options:
will not be loaded.
channel:
type: string
default: "1.10/stable"
default: "1.11/stable"
description: |
Snap channel to install Kubernetes master services from
client_password:
@ -99,3 +147,18 @@ options:
default: true
description: |
If true the metrics server for Kubernetes will be deployed onto the cluster.
snapd_refresh:
default: "max"
type: string
description: |
How often snapd handles updates for installed snaps. Setting an empty
string will check 4x per day. Set to "max" to delay the refresh as long
as possible. You may also set a custom string as described in the
'refresh.timer' section here:
https://forum.snapcraft.io/t/system-options/87
default-storage:
type: string
default: "auto"
description: |
The storage class to make the default storage class. Allowed values are "auto",
"none", "ceph-xfs", "ceph-ext4". Note: Only works in Kubernetes >= 1.10

View File

@ -15,8 +15,11 @@ includes:
- 'interface:kube-dns'
- 'interface:kube-control'
- 'interface:public-address'
- 'interface:aws'
- 'interface:gcp'
- 'interface:aws-integration'
- 'interface:gcp-integration'
- 'interface:openstack-integration'
- 'interface:vsphere-integration'
- 'interface:azure-integration'
options:
basic:
packages:

View File

@ -20,6 +20,7 @@ tags:
subordinate: false
series:
- xenial
- bionic
provides:
kube-api-endpoint:
interface: http
@ -41,9 +42,15 @@ requires:
ceph-storage:
interface: ceph-admin
aws:
interface: aws
interface: aws-integration
gcp:
interface: gcp
interface: gcp-integration
openstack:
interface: openstack-integration
vsphere:
interface: vsphere-integration
azure:
interface: azure-integration
resources:
kubectl:
type: file

View File

@ -15,6 +15,7 @@
# limitations under the License.
import base64
import hashlib
import os
import re
import random
@ -27,6 +28,7 @@ import ipaddress
from charms.leadership import leader_get, leader_set
from shutil import move
from tempfile import TemporaryDirectory
from pathlib import Path
from shlex import split
@ -64,8 +66,11 @@ from charmhelpers.contrib.charmsupport import nrpe
nrpe.Check.shortname_re = '[\.A-Za-z0-9-_]+$'
gcp_creds_env_key = 'GOOGLE_APPLICATION_CREDENTIALS'
snap_resources = ['kubectl', 'kube-apiserver', 'kube-controller-manager',
'kube-scheduler', 'cdk-addons']
os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
db = unitdata.kv()
def set_upgrade_needed(forced=False):
@ -86,7 +91,6 @@ def channel_changed():
def service_cidr():
''' Return the charm's service-cidr config '''
db = unitdata.kv()
frozen_cidr = db.get('kubernetes-master.service-cidr')
return frozen_cidr or hookenv.config('service-cidr')
@ -94,7 +98,6 @@ def service_cidr():
def freeze_service_cidr():
''' Freeze the service CIDR. Once the apiserver has started, we can no
longer safely change this value. '''
db = unitdata.kv()
db.set('kubernetes-master.service-cidr', service_cidr())
@ -103,14 +106,21 @@ def check_for_upgrade_needed():
'''An upgrade charm event was triggered by Juju, react to that here.'''
hookenv.status_set('maintenance', 'Checking resources')
# migrate to new flags
if is_state('kubernetes-master.restarted-for-cloud'):
remove_state('kubernetes-master.restarted-for-cloud')
set_state('kubernetes-master.cloud.ready')
if is_state('kubernetes-master.cloud-request-sent'):
# minor change, just for consistency
remove_state('kubernetes-master.cloud-request-sent')
set_state('kubernetes-master.cloud.request-sent')
migrate_from_pre_snaps()
add_rbac_roles()
set_state('reconfigure.authentication.setup')
remove_state('authentication.setup')
changed = snap_resources_changed()
if changed == 'yes':
set_upgrade_needed()
elif changed == 'unknown':
if not db.get('snap.resources.fingerprint.initialised'):
# We are here on an upgrade from non-rolling master
# Since this upgrade might also include resource updates eg
# juju upgrade-charm kubernetes-master --resource kube-any=my.snap
@ -118,6 +128,9 @@ def check_for_upgrade_needed():
# Forcibly means we do not prompt the user to call the upgrade action.
set_upgrade_needed(forced=True)
migrate_resource_checksums()
check_resources_for_upgrade_needed()
# Set the auto storage backend to etcd2.
auto_storage_backend = leader_get('auto_storage_backend')
is_leader = is_state('leadership.is_leader')
@ -125,27 +138,56 @@ def check_for_upgrade_needed():
leader_set(auto_storage_backend='etcd2')
def snap_resources_changed():
'''
Check if the snapped resources have changed. The first time this method is
called will report "unknown".
def get_resource_checksum_db_key(resource):
''' Convert a resource name to a resource checksum database key. '''
return 'kubernetes-master.resource-checksums.' + resource
Returns: "yes" in case a snap resource file has changed,
"no" in case a snap resources are the same as last call,
"unknown" if it is the first time this method is called
'''
db = unitdata.kv()
resources = ['kubectl', 'kube-apiserver', 'kube-controller-manager',
'kube-scheduler', 'cdk-addons']
paths = [hookenv.resource_get(resource) for resource in resources]
if db.get('snap.resources.fingerprint.initialised'):
result = 'yes' if any_file_changed(paths) else 'no'
return result
else:
db.set('snap.resources.fingerprint.initialised', True)
any_file_changed(paths)
return 'unknown'
def calculate_resource_checksum(resource):
''' Calculate a checksum for a resource '''
md5 = hashlib.md5()
path = hookenv.resource_get(resource)
if path:
with open(path, 'rb') as f:
data = f.read()
md5.update(data)
return md5.hexdigest()
def migrate_resource_checksums():
''' Migrate resource checksums from the old schema to the new one '''
for resource in snap_resources:
new_key = get_resource_checksum_db_key(resource)
if not db.get(new_key):
path = hookenv.resource_get(resource)
if path:
# old key from charms.reactive.helpers.any_file_changed
old_key = 'reactive.files_changed.' + path
old_checksum = db.get(old_key)
db.set(new_key, old_checksum)
else:
# No resource is attached. Previously, this meant no checksum
# would be calculated and stored. But now we calculate it as if
# it is a 0-byte resource, so let's go ahead and do that.
zero_checksum = hashlib.md5().hexdigest()
db.set(new_key, zero_checksum)
def check_resources_for_upgrade_needed():
hookenv.status_set('maintenance', 'Checking resources')
for resource in snap_resources:
key = get_resource_checksum_db_key(resource)
old_checksum = db.get(key)
new_checksum = calculate_resource_checksum(resource)
if new_checksum != old_checksum:
set_upgrade_needed()
def calculate_and_store_resource_checksums():
for resource in snap_resources:
key = get_resource_checksum_db_key(resource)
checksum = calculate_resource_checksum(resource)
db.set(key, checksum)
def add_rbac_roles():
@ -253,7 +295,8 @@ def install_snaps():
snap.install('kube-scheduler', channel=channel)
hookenv.status_set('maintenance', 'Installing cdk-addons snap')
snap.install('cdk-addons', channel=channel)
snap_resources_changed()
calculate_and_store_resource_checksums()
db.set('snap.resources.fingerprint.initialised', True)
set_state('kubernetes-master.snaps.installed')
remove_state('kubernetes-master.components.started')
@ -393,15 +436,76 @@ def set_app_version():
hookenv.application_version_set(version.split(b' v')[-1].rstrip())
@when('kubernetes-master.snaps.installed')
@when('snap.refresh.set')
@when('leadership.is_leader')
def process_snapd_timer():
''' Set the snapd refresh timer on the leader so all cluster members
(present and future) will refresh near the same time. '''
# Get the current snapd refresh timer; we know layer-snap has set this
# when the 'snap.refresh.set' flag is present.
timer = snap.get(snapname='core', key='refresh.timer').decode('utf-8')
# The first time through, data_changed will be true. Subsequent calls
# should only update leader data if something changed.
if data_changed('master_snapd_refresh', timer):
hookenv.log('setting snapd_refresh timer to: {}'.format(timer))
leader_set({'snapd_refresh': timer})
@when('kubernetes-master.snaps.installed')
@when('snap.refresh.set')
@when('leadership.changed.snapd_refresh')
@when_not('leadership.is_leader')
def set_snapd_timer():
''' Set the snapd refresh.timer on non-leader cluster members. '''
# NB: This method should only be run when 'snap.refresh.set' is present.
# Layer-snap will always set a core refresh.timer, which may not be the
# same as our leader. Gating with 'snap.refresh.set' ensures layer-snap
# has finished and we are free to set our config to the leader's timer.
timer = leader_get('snapd_refresh')
hookenv.log('setting snapd_refresh timer to: {}'.format(timer))
snap.set_refresh_timer(timer)
@hookenv.atexit
def set_final_status():
''' Set the final status of the charm as we leave hook execution '''
try:
goal_state = hookenv.goal_state()
except NotImplementedError:
goal_state = {}
vsphere_joined = is_state('endpoint.vsphere.joined')
azure_joined = is_state('endpoint.azure.joined')
cloud_blocked = is_state('kubernetes-master.cloud.blocked')
if vsphere_joined and cloud_blocked:
hookenv.status_set('blocked',
'vSphere integration requires K8s 1.12 or greater')
return
if azure_joined and cloud_blocked:
hookenv.status_set('blocked',
'Azure integration requires K8s 1.11 or greater')
return
if is_state('kubernetes-master.cloud.pending'):
hookenv.status_set('waiting', 'Waiting for cloud integration')
return
if not is_state('kube-api-endpoint.available'):
hookenv.status_set('blocked', 'Waiting for kube-api-endpoint relation')
if 'kube-api-endpoint' in goal_state.get('relations', {}):
status = 'waiting'
else:
status = 'blocked'
hookenv.status_set(status, 'Waiting for kube-api-endpoint relation')
return
if not is_state('kube-control.connected'):
hookenv.status_set('blocked', 'Waiting for workers.')
if 'kube-control' in goal_state.get('relations', {}):
status = 'waiting'
else:
status = 'blocked'
hookenv.status_set(status, 'Waiting for workers.')
return
upgrade_needed = is_state('kubernetes-master.upgrade-needed')
@ -431,12 +535,6 @@ def set_final_status():
hookenv.status_set('waiting', 'Waiting to retry addon deployment')
return
req_sent = is_state('kubernetes-master.cloud-request-sent')
aws_ready = is_state('endpoint.aws.ready')
gcp_ready = is_state('endpoint.gcp.ready')
if req_sent and not (aws_ready or gcp_ready):
hookenv.status_set('waiting', 'waiting for cloud integration')
if addons_configured and not all_kube_system_pods_running():
hookenv.status_set('waiting', 'Waiting for kube-system pods to start')
return
@ -474,7 +572,9 @@ def master_services_down():
@when('etcd.available', 'tls_client.server.certificate.saved',
'authentication.setup')
@when('leadership.set.auto_storage_backend')
@when_not('kubernetes-master.components.started')
@when_not('kubernetes-master.components.started',
'kubernetes-master.cloud.pending',
'kubernetes-master.cloud.blocked')
def start_master(etcd):
'''Run the Kubernetes master components.'''
hookenv.status_set('maintenance',
@ -491,7 +591,7 @@ def start_master(etcd):
handle_etcd_relation(etcd)
# Add CLI options to all components
configure_apiserver(etcd.get_connection_string(), getStorageBackend())
configure_apiserver(etcd.get_connection_string())
configure_controller_manager()
configure_scheduler()
set_state('kubernetes-master.components.started')
@ -661,7 +761,8 @@ def kick_api_server(tls):
tls_client.reset_certificate_write_flag('server')
@when('kubernetes-master.components.started')
@when_any('kubernetes-master.components.started', 'ceph-storage.configured')
@when('leadership.is_leader')
def configure_cdk_addons():
''' Configure CDK addons '''
remove_state('cdk-addons.configured')
@ -669,17 +770,39 @@ def configure_cdk_addons():
gpuEnable = (get_version('kube-apiserver') >= (1, 9) and
load_gpu_plugin == "auto" and
is_state('kubernetes-master.gpu.enabled'))
registry = hookenv.config('addons-registry')
dbEnabled = str(hookenv.config('enable-dashboard-addons')).lower()
dnsEnabled = str(hookenv.config('enable-kube-dns')).lower()
metricsEnabled = str(hookenv.config('enable-metrics')).lower()
if (is_state('ceph-storage.configured') and
get_version('kube-apiserver') >= (1, 10)):
cephEnabled = "true"
else:
cephEnabled = "false"
ceph_ep = endpoint_from_flag('ceph-storage.available')
ceph = {}
default_storage = ''
if ceph_ep:
b64_ceph_key = base64.b64encode(ceph_ep.key().encode('utf-8'))
ceph['admin_key'] = b64_ceph_key.decode('ascii')
ceph['kubernetes_key'] = b64_ceph_key.decode('ascii')
ceph['mon_hosts'] = ceph_ep.mon_hosts()
default_storage = hookenv.config('default-storage')
args = [
'arch=' + arch(),
'dns-ip=' + get_deprecated_dns_ip(),
'dns-domain=' + hookenv.config('dns_domain'),
'registry=' + registry,
'enable-dashboard=' + dbEnabled,
'enable-kube-dns=' + dnsEnabled,
'enable-metrics=' + metricsEnabled,
'enable-gpu=' + str(gpuEnable).lower()
'enable-gpu=' + str(gpuEnable).lower(),
'enable-ceph=' + cephEnabled,
'ceph-admin-key=' + (ceph.get('admin_key', '')),
'ceph-kubernetes-key=' + (ceph.get('admin_key', '')),
'ceph-mon-hosts="' + (ceph.get('mon_hosts', '')) + '"',
'default-storage=' + default_storage,
]
check_call(['snap', 'set', 'cdk-addons'] + args)
if not addons_ready():
@ -754,6 +877,15 @@ def ceph_storage(ceph_admin):
configuration, and the ceph secret key file used for authentication.
This method will install the client package, and render the requisit files
in order to consume the ceph-storage relation.'''
# deprecated in 1.10 in favor of using CSI
if get_version('kube-apiserver') >= (1, 10):
# this is actually false, but by setting this flag we won't keep
# running this function for no reason. Also note that we watch this
# flag to run cdk-addons.apply.
set_state('ceph-storage.configured')
return
ceph_context = {
'mon_hosts': ceph_admin.mon_hosts(),
'fsid': ceph_admin.fsid(),
@ -888,13 +1020,14 @@ def on_config_allow_privileged_change():
remove_state('config.changed.allow-privileged')
@when('config.changed.api-extra-args')
@when_any('config.changed.api-extra-args',
'config.changed.audit-policy',
'config.changed.audit-webhook-config')
@when('kubernetes-master.components.started')
@when('leadership.set.auto_storage_backend')
@when('etcd.available')
def on_config_api_extra_args_change(etcd):
configure_apiserver(etcd.get_connection_string(),
getStorageBackend())
def reconfigure_apiserver(etcd):
configure_apiserver(etcd.get_connection_string())
@when('config.changed.controller-manager-extra-args')
@ -1105,8 +1238,6 @@ def parse_extra_args(config_key):
def configure_kubernetes_service(service, base_args, extra_args_key):
db = unitdata.kv()
prev_args_key = 'kubernetes-master.prev_args.' + service
prev_args = db.get(prev_args_key) or {}
@ -1128,7 +1259,20 @@ def configure_kubernetes_service(service, base_args, extra_args_key):
db.set(prev_args_key, args)
def configure_apiserver(etcd_connection_string, leader_etcd_version):
def remove_if_exists(path):
try:
os.remove(path)
except FileNotFoundError:
pass
def write_audit_config_file(path, contents):
with open(path, 'w') as f:
header = '# Autogenerated by kubernetes-master charm'
f.write(header + '\n' + contents)
def configure_apiserver(etcd_connection_string):
api_opts = {}
# Get the tls paths from the layer data.
@ -1166,8 +1310,9 @@ def configure_apiserver(etcd_connection_string, leader_etcd_version):
api_opts['logtostderr'] = 'true'
api_opts['insecure-bind-address'] = '127.0.0.1'
api_opts['insecure-port'] = '8080'
api_opts['storage-backend'] = leader_etcd_version
api_opts['storage-backend'] = getStorageBackend()
api_opts['basic-auth-file'] = '/root/cdk/basic_auth.csv'
api_opts['token-auth-file'] = '/root/cdk/known_tokens.csv'
api_opts['service-account-key-file'] = '/root/cdk/serviceaccount.key'
api_opts['kubelet-preferred-address-types'] = \
@ -1185,7 +1330,6 @@ def configure_apiserver(etcd_connection_string, leader_etcd_version):
api_opts['etcd-servers'] = etcd_connection_string
admission_control_pre_1_9 = [
'Initializers',
'NamespaceLifecycle',
'LimitRanger',
'ServiceAccount',
@ -1215,9 +1359,6 @@ def configure_apiserver(etcd_connection_string, leader_etcd_version):
if kube_version < (1, 6):
hookenv.log('Removing DefaultTolerationSeconds from admission-control')
admission_control_pre_1_9.remove('DefaultTolerationSeconds')
if kube_version < (1, 7):
hookenv.log('Removing Initializers from admission-control')
admission_control_pre_1_9.remove('Initializers')
if kube_version < (1, 9):
api_opts['admission-control'] = ','.join(admission_control_pre_1_9)
else:
@ -1241,6 +1382,44 @@ def configure_apiserver(etcd_connection_string, leader_etcd_version):
cloud_config_path = _cloud_config_path('kube-apiserver')
api_opts['cloud-provider'] = 'gce'
api_opts['cloud-config'] = str(cloud_config_path)
elif is_state('endpoint.openstack.ready'):
cloud_config_path = _cloud_config_path('kube-apiserver')
api_opts['cloud-provider'] = 'openstack'
api_opts['cloud-config'] = str(cloud_config_path)
elif (is_state('endpoint.vsphere.ready') and
get_version('kube-apiserver') >= (1, 12)):
cloud_config_path = _cloud_config_path('kube-apiserver')
api_opts['cloud-provider'] = 'vsphere'
api_opts['cloud-config'] = str(cloud_config_path)
elif is_state('endpoint.azure.ready'):
cloud_config_path = _cloud_config_path('kube-apiserver')
api_opts['cloud-provider'] = 'azure'
api_opts['cloud-config'] = str(cloud_config_path)
audit_root = '/root/cdk/audit'
os.makedirs(audit_root, exist_ok=True)
audit_log_path = audit_root + '/audit.log'
api_opts['audit-log-path'] = audit_log_path
api_opts['audit-log-maxsize'] = '100'
api_opts['audit-log-maxbackup'] = '9'
audit_policy_path = audit_root + '/audit-policy.yaml'
audit_policy = hookenv.config('audit-policy')
if audit_policy:
write_audit_config_file(audit_policy_path, audit_policy)
api_opts['audit-policy-file'] = audit_policy_path
else:
remove_if_exists(audit_policy_path)
audit_webhook_config_path = audit_root + '/audit-webhook-config.yaml'
audit_webhook_config = hookenv.config('audit-webhook-config')
if audit_webhook_config:
write_audit_config_file(audit_webhook_config_path,
audit_webhook_config)
api_opts['audit-webhook-config-file'] = audit_webhook_config_path
else:
remove_if_exists(audit_webhook_config_path)
configure_kubernetes_service('kube-apiserver', api_opts, 'api-extra-args')
restart_apiserver()
@ -1269,6 +1448,19 @@ def configure_controller_manager():
cloud_config_path = _cloud_config_path('kube-controller-manager')
controller_opts['cloud-provider'] = 'gce'
controller_opts['cloud-config'] = str(cloud_config_path)
elif is_state('endpoint.openstack.ready'):
cloud_config_path = _cloud_config_path('kube-controller-manager')
controller_opts['cloud-provider'] = 'openstack'
controller_opts['cloud-config'] = str(cloud_config_path)
elif (is_state('endpoint.vsphere.ready') and
get_version('kube-apiserver') >= (1, 12)):
cloud_config_path = _cloud_config_path('kube-controller-manager')
controller_opts['cloud-provider'] = 'vsphere'
controller_opts['cloud-config'] = str(cloud_config_path)
elif is_state('endpoint.azure.ready'):
cloud_config_path = _cloud_config_path('kube-controller-manager')
controller_opts['cloud-provider'] = 'azure'
controller_opts['cloud-config'] = str(cloud_config_path)
configure_kubernetes_service('kube-controller-manager', controller_opts,
'controller-manager-extra-args')
@ -1347,7 +1539,6 @@ def set_token(password, save_salt):
param: password - the password to be stored
param: save_salt - the key to store the value of the token.'''
db = unitdata.kv()
db.set(save_salt, password)
return db.get(save_salt)
@ -1496,9 +1687,29 @@ def clear_cluster_tag_sent():
@when_any('endpoint.aws.joined',
'endpoint.gcp.joined')
'endpoint.gcp.joined',
'endpoint.openstack.joined',
'endpoint.vsphere.joined',
'endpoint.azure.joined')
@when_not('kubernetes-master.cloud.ready')
def set_cloud_pending():
k8s_version = get_version('kube-apiserver')
k8s_1_11 = k8s_version >= (1, 11)
k8s_1_12 = k8s_version >= (1, 12)
vsphere_joined = is_state('endpoint.vsphere.joined')
azure_joined = is_state('endpoint.azure.joined')
if (vsphere_joined and not k8s_1_12) or (azure_joined and not k8s_1_11):
set_state('kubernetes-master.cloud.blocked')
else:
remove_state('kubernetes-master.cloud.blocked')
set_state('kubernetes-master.cloud.pending')
@when_any('endpoint.aws.joined',
'endpoint.gcp.joined',
'endpoint.azure.joined')
@when('leadership.set.cluster_tag')
@when_not('kubernetes-master.cloud-request-sent')
@when_not('kubernetes-master.cloud.request-sent')
def request_integration():
hookenv.status_set('maintenance', 'requesting cloud integration')
cluster_tag = leader_get('cluster_tag')
@ -1524,28 +1735,55 @@ def request_integration():
})
cloud.enable_object_storage_management()
cloud.enable_security_management()
elif is_state('endpoint.azure.joined'):
cloud = endpoint_from_flag('endpoint.azure.joined')
cloud.tag_instance({
'k8s-io-cluster-name': cluster_tag,
'k8s-io-role-master': 'master',
})
cloud.enable_object_storage_management()
cloud.enable_security_management()
cloud.enable_instance_inspection()
cloud.enable_network_management()
cloud.enable_dns_management()
cloud.enable_block_storage_management()
set_state('kubernetes-master.cloud-request-sent')
set_state('kubernetes-master.cloud.request-sent')
@when_none('endpoint.aws.joined',
'endpoint.gcp.joined')
@when('kubernetes-master.cloud-request-sent')
def clear_requested_integration():
remove_state('kubernetes-master.cloud-request-sent')
'endpoint.gcp.joined',
'endpoint.openstack.joined',
'endpoint.vsphere.joined',
'endpoint.azure.joined')
def clear_cloud_flags():
remove_state('kubernetes-master.cloud.pending')
remove_state('kubernetes-master.cloud.request-sent')
remove_state('kubernetes-master.cloud.blocked')
remove_state('kubernetes-master.cloud.ready')
@when_any('endpoint.aws.ready',
'endpoint.gcp.ready')
@when_not('kubernetes-master.restarted-for-cloud')
def restart_for_cloud():
'endpoint.gcp.ready',
'endpoint.openstack.ready',
'endpoint.vsphere.ready',
'endpoint.azure.ready')
@when_not('kubernetes-master.cloud.blocked',
'kubernetes-master.cloud.ready')
def cloud_ready():
if is_state('endpoint.gcp.ready'):
_write_gcp_snap_config('kube-apiserver')
_write_gcp_snap_config('kube-controller-manager')
set_state('kubernetes-master.restarted-for-cloud')
elif is_state('endpoint.openstack.ready'):
_write_openstack_snap_config('kube-apiserver')
_write_openstack_snap_config('kube-controller-manager')
elif is_state('endpoint.vsphere.ready'):
_write_vsphere_snap_config('kube-apiserver')
_write_vsphere_snap_config('kube-controller-manager')
elif is_state('endpoint.azure.ready'):
_write_azure_snap_config('kube-apiserver')
_write_azure_snap_config('kube-controller-manager')
remove_state('kubernetes-master.cloud.pending')
set_state('kubernetes-master.cloud.ready')
remove_state('kubernetes-master.components.started') # force restart
@ -1565,6 +1803,10 @@ def _daemon_env_path(component):
return _snap_common_path(component) / 'environment'
def _cdk_addons_template_path():
return Path('/snap/cdk-addons/current/templates')
def _write_gcp_snap_config(component):
# gcp requires additional credentials setup
gcp = endpoint_from_flag('endpoint.gcp.ready')
@ -1592,3 +1834,70 @@ def _write_gcp_snap_config(component):
daemon_env += '{}={}\n'.format(gcp_creds_env_key, creds_path)
daemon_env_path.parent.mkdir(parents=True, exist_ok=True)
daemon_env_path.write_text(daemon_env)
def _write_openstack_snap_config(component):
# openstack requires additional credentials setup
openstack = endpoint_from_flag('endpoint.openstack.ready')
cloud_config_path = _cloud_config_path(component)
cloud_config_path.write_text('\n'.join([
'[Global]',
'auth-url = {}'.format(openstack.auth_url),
'username = {}'.format(openstack.username),
'password = {}'.format(openstack.password),
'tenant-name = {}'.format(openstack.project_name),
'domain-name = {}'.format(openstack.user_domain_name),
]))
def _write_vsphere_snap_config(component):
# vsphere requires additional cloud config
vsphere = endpoint_from_flag('endpoint.vsphere.ready')
# NB: vsphere provider will ask kube-apiserver and -controller-manager to
# find a uuid from sysfs unless a global config value is set. Our strict
# snaps cannot read sysfs, so let's do it in the charm. An invalid uuid is
# not fatal for storage, but it will muddy the logs; try to get it right.
uuid_file = '/sys/class/dmi/id/product_uuid'
try:
with open(uuid_file, 'r') as f:
uuid = f.read().strip()
except IOError as err:
hookenv.log("Unable to read UUID from sysfs: {}".format(err))
uuid = 'UNKNOWN'
cloud_config_path = _cloud_config_path(component)
cloud_config_path.write_text('\n'.join([
'[Global]',
'insecure-flag = true',
'datacenters = "{}"'.format(vsphere.datacenter),
'vm-uuid = "VMware-{}"'.format(uuid),
'[VirtualCenter "{}"]'.format(vsphere.vsphere_ip),
'user = {}'.format(vsphere.user),
'password = {}'.format(vsphere.password),
'[Workspace]',
'server = {}'.format(vsphere.vsphere_ip),
'datacenter = "{}"'.format(vsphere.datacenter),
'default-datastore = "{}"'.format(vsphere.datastore),
'folder = "kubernetes"',
'resourcepool-path = ""',
'[Disk]',
'scsicontrollertype = "pvscsi"',
]))
def _write_azure_snap_config(component):
azure = endpoint_from_flag('endpoint.azure.ready')
cloud_config_path = _cloud_config_path(component)
cloud_config_path.write_text(json.dumps({
'useInstanceMetadata': True,
'useManagedIdentityExtension': True,
'subscriptionId': azure.subscription_id,
'resourceGroup': azure.resource_group,
'location': azure.resource_group_location,
'vnetName': azure.vnet_name,
'vnetResourceGroup': azure.vnet_resource_group,
'subnetName': azure.subnet_name,
'securityGroupName': azure.security_group_name,
}))

View File

@ -27,6 +27,38 @@ To add additional compute capacity to your Kubernetes workers, you may
join any related kubernetes-master, and enlist themselves as ready once the
deployment is complete.
## Snap Configuration
The kubernetes resources used by this charm are snap packages. When not
specified during deployment, these resources come from the public store. By
default, the `snapd` daemon will refresh all snaps installed from the store
four (4) times per day. A charm configuration option is provided for operators
to control this refresh frequency.
>NOTE: this is a global configuration option and will affect the refresh
time for all snaps installed on a system.
Examples:
```sh
## refresh kubernetes-worker snaps every tuesday
juju config kubernetes-worker snapd_refresh="tue"
## refresh snaps at 11pm on the last (5th) friday of the month
juju config kubernetes-worker snapd_refresh="fri5,23:00"
## delay the refresh as long as possible
juju config kubernetes-worker snapd_refresh="max"
## use the system default refresh timer
juju config kubernetes-worker snapd_refresh=""
```
For more information on the possible values for `snapd_refresh`, see the
*refresh.timer* section in the [system options][] documentation.
[system options]: https://forum.snapcraft.io/t/system-options/87
## Operational actions
The kubernetes-worker charm supports the following Operational Actions:
@ -89,7 +121,7 @@ service is not reachable.
Note: When debugging connection issues with NodePort services, its important
to first check the kube-proxy service on the worker units. If kube-proxy is not
running, the associated port-mapping will not be configured in the iptables
rulechains.
rulechains.
If you need to close the NodePort once a workload has been terminated, you can
follow the same steps inversely.
@ -97,4 +129,3 @@ follow the same steps inversely.
```
juju run --application kubernetes-worker close-port 30510
```

View File

@ -13,16 +13,17 @@ options:
cluster. Declare node labels in key=value format, separated by spaces.
allow-privileged:
type: string
default: "auto"
default: "true"
description: |
Allow privileged containers to run on worker nodes. Supported values are
"true", "false", and "auto". If "true", kubelet will run in privileged
mode by default. If "false", kubelet will never run in privileged mode.
If "auto", kubelet will not run in privileged mode by default, but will
switch to privileged mode if gpu hardware is detected.
switch to privileged mode if gpu hardware is detected. Pod security
policies (PSP) should be used to restrict container privileges.
channel:
type: string
default: "1.10/stable"
default: "1.11/stable"
description: |
Snap channel to install Kubernetes worker services from
require-manual-upgrade:
@ -58,6 +59,15 @@ options:
The value for this config must be a JSON array of credential objects, like this:
[{"server": "my.registry", "username": "myUser", "password": "myPass"}]
ingress-ssl-chain-completion:
type: boolean
default: false
description: |
Enable chain completion for TLS certificates used by the nginx ingress
controller. Set this to true if you would like the ingress controller
to attempt auto-retrieval of intermediate certificates. The default
(false) is recommended for all production kubernetes installations, and
any environment which does not have outbound Internet access.
nginx-image:
type: string
default: "auto"
@ -70,3 +80,29 @@ options:
description: |
Docker image to use for the default backend. Auto will select an image
based on architecture.
snapd_refresh:
default: "max"
type: string
description: |
How often snapd handles updates for installed snaps. Setting an empty
string will check 4x per day. Set to "max" to delay the refresh as long
as possible. You may also set a custom string as described in the
'refresh.timer' section here:
https://forum.snapcraft.io/t/system-options/87
kubelet-extra-config:
default: "{}"
type: string
description: |
Extra configuration to be passed to kubelet. Any values specified in this
config will be merged into a KubeletConfiguration file that is passed to
the kubelet service via the --config flag. This can be used to override
values provided by the charm.
Requires Kubernetes 1.10+.
The value for this config must be a YAML mapping that can be safely
merged with a KubeletConfiguration file. For example:
{evictionHard: {memory.available: 200Mi}}
For more information about KubeletConfiguration, see upstream docs:
https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/

View File

@ -3,6 +3,7 @@ includes:
- 'layer:basic'
- 'layer:debug'
- 'layer:snap'
- 'layer:leadership'
- 'layer:docker'
- 'layer:metrics'
- 'layer:nagios'
@ -12,8 +13,11 @@ includes:
- 'interface:kubernetes-cni'
- 'interface:kube-dns'
- 'interface:kube-control'
- 'interface:aws'
- 'interface:gcp'
- 'interface:aws-integration'
- 'interface:gcp-integration'
- 'interface:openstack-integration'
- 'interface:vsphere-integration'
- 'interface:azure-integration'
- 'interface:mount'
config:
deletes:

View File

@ -18,6 +18,7 @@ tags:
- misc
series:
- xenial
- bionic
subordinate: false
requires:
kube-api-endpoint:
@ -30,9 +31,15 @@ requires:
kube-control:
interface: kube-control
aws:
interface: aws
interface: aws-integration
gcp:
interface: gcp
interface: gcp-integration
openstack:
interface: openstack-integration
vsphere:
interface: vsphere-integration
azure:
interface: azure-integration
nfs:
interface: mount
provides:

View File

@ -14,12 +14,16 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import hashlib
import json
import os
import random
import shutil
import subprocess
import time
import yaml
from charms.leadership import leader_get, leader_set
from pathlib import Path
from shlex import split
@ -36,7 +40,7 @@ from charms.reactive import when, when_any, when_not, when_none
from charms.kubernetes.common import get_version
from charms.reactive.helpers import data_changed, any_file_changed
from charms.reactive.helpers import data_changed
from charms.templating.jinja2 import render
from charmhelpers.core import hookenv, unitdata
@ -52,6 +56,7 @@ kubeconfig_path = '/root/cdk/kubeconfig'
kubeproxyconfig_path = '/root/cdk/kubeproxyconfig'
kubeclientconfig_path = '/root/.kube/config'
gcp_creds_env_key = 'GOOGLE_APPLICATION_CREDENTIALS'
snap_resources = ['kubectl', 'kubelet', 'kube-proxy']
os.environ['PATH'] += os.pathsep + os.path.join(os.sep, 'snap', 'bin')
db = unitdata.kv()
@ -59,11 +64,21 @@ db = unitdata.kv()
@hook('upgrade-charm')
def upgrade_charm():
# migrate to new flags
if is_state('kubernetes-worker.restarted-for-cloud'):
remove_state('kubernetes-worker.restarted-for-cloud')
set_state('kubernetes-worker.cloud.ready')
if is_state('kubernetes-worker.cloud-request-sent'):
# minor change, just for consistency
remove_state('kubernetes-worker.cloud-request-sent')
set_state('kubernetes-worker.cloud.request-sent')
# Trigger removal of PPA docker installation if it was previously set.
set_state('config.changed.install_from_upstream')
hookenv.atexit(remove_state, 'config.changed.install_from_upstream')
cleanup_pre_snap_services()
migrate_resource_checksums()
check_resources_for_upgrade_needed()
# Remove the RC for nginx ingress if it exists
@ -88,12 +103,56 @@ def upgrade_charm():
set_state('kubernetes-worker.restart-needed')
def get_resource_checksum_db_key(resource):
''' Convert a resource name to a resource checksum database key. '''
return 'kubernetes-worker.resource-checksums.' + resource
def calculate_resource_checksum(resource):
''' Calculate a checksum for a resource '''
md5 = hashlib.md5()
path = hookenv.resource_get(resource)
if path:
with open(path, 'rb') as f:
data = f.read()
md5.update(data)
return md5.hexdigest()
def migrate_resource_checksums():
''' Migrate resource checksums from the old schema to the new one '''
for resource in snap_resources:
new_key = get_resource_checksum_db_key(resource)
if not db.get(new_key):
path = hookenv.resource_get(resource)
if path:
# old key from charms.reactive.helpers.any_file_changed
old_key = 'reactive.files_changed.' + path
old_checksum = db.get(old_key)
db.set(new_key, old_checksum)
else:
# No resource is attached. Previously, this meant no checksum
# would be calculated and stored. But now we calculate it as if
# it is a 0-byte resource, so let's go ahead and do that.
zero_checksum = hashlib.md5().hexdigest()
db.set(new_key, zero_checksum)
def check_resources_for_upgrade_needed():
hookenv.status_set('maintenance', 'Checking resources')
resources = ['kubectl', 'kubelet', 'kube-proxy']
paths = [hookenv.resource_get(resource) for resource in resources]
if any_file_changed(paths):
set_upgrade_needed()
for resource in snap_resources:
key = get_resource_checksum_db_key(resource)
old_checksum = db.get(key)
new_checksum = calculate_resource_checksum(resource)
if new_checksum != old_checksum:
set_upgrade_needed()
def calculate_and_store_resource_checksums():
for resource in snap_resources:
key = get_resource_checksum_db_key(resource)
checksum = calculate_resource_checksum(resource)
db.set(key, checksum)
def set_upgrade_needed():
@ -142,16 +201,8 @@ def channel_changed():
set_upgrade_needed()
@when('kubernetes-worker.snaps.upgrade-needed')
@when_not('kubernetes-worker.snaps.upgrade-specified')
def upgrade_needed_status():
msg = 'Needs manual upgrade, run the upgrade action'
hookenv.status_set('blocked', msg)
@when('kubernetes-worker.snaps.upgrade-specified')
def install_snaps():
check_resources_for_upgrade_needed()
channel = hookenv.config('channel')
hookenv.status_set('maintenance', 'Installing kubectl snap')
snap.install('kubectl', channel=channel, classic=True)
@ -159,6 +210,7 @@ def install_snaps():
snap.install('kubelet', channel=channel, classic=True)
hookenv.status_set('maintenance', 'Installing kube-proxy snap')
snap.install('kube-proxy', channel=channel, classic=True)
calculate_and_store_resource_checksums()
set_state('kubernetes-worker.snaps.installed')
set_state('kubernetes-worker.restart-needed')
remove_state('kubernetes-worker.snaps.upgrade-needed')
@ -173,7 +225,7 @@ def shutdown():
'''
try:
if os.path.isfile(kubeconfig_path):
kubectl('delete', 'node', gethostname().lower())
kubectl('delete', 'node', get_node_name())
except CalledProcessError:
hookenv.log('Failed to unregister node.')
service_stop('snap.kubelet.daemon')
@ -243,26 +295,73 @@ def set_app_version():
@when('kubernetes-worker.snaps.installed')
@when_not('kube-control.dns.available')
def notify_user_transient_status():
''' Notify to the user we are in a transient state and the application
is still converging. Potentially remotely, or we may be in a detached loop
wait state '''
@when('snap.refresh.set')
@when('leadership.is_leader')
def process_snapd_timer():
''' Set the snapd refresh timer on the leader so all cluster members
(present and future) will refresh near the same time. '''
# Get the current snapd refresh timer; we know layer-snap has set this
# when the 'snap.refresh.set' flag is present.
timer = snap.get(snapname='core', key='refresh.timer').decode('utf-8')
# During deployment the worker has to start kubelet without cluster dns
# configured. If this is the first unit online in a service pool waiting
# to self host the dns pod, and configure itself to query the dns service
# declared in the kube-system namespace
hookenv.status_set('waiting', 'Waiting for cluster DNS.')
# The first time through, data_changed will be true. Subsequent calls
# should only update leader data if something changed.
if data_changed('worker_snapd_refresh', timer):
hookenv.log('setting snapd_refresh timer to: {}'.format(timer))
leader_set({'snapd_refresh': timer})
@when('kubernetes-worker.snaps.installed',
'kube-control.dns.available')
@when_not('kubernetes-worker.snaps.upgrade-needed')
def charm_status(kube_control):
@when('kubernetes-worker.snaps.installed')
@when('snap.refresh.set')
@when('leadership.changed.snapd_refresh')
@when_not('leadership.is_leader')
def set_snapd_timer():
''' Set the snapd refresh.timer on non-leader cluster members. '''
# NB: This method should only be run when 'snap.refresh.set' is present.
# Layer-snap will always set a core refresh.timer, which may not be the
# same as our leader. Gating with 'snap.refresh.set' ensures layer-snap
# has finished and we are free to set our config to the leader's timer.
timer = leader_get('snapd_refresh')
hookenv.log('setting snapd_refresh timer to: {}'.format(timer))
snap.set_refresh_timer(timer)
@hookenv.atexit
def charm_status():
'''Update the status message with the current status of kubelet.'''
update_kubelet_status()
vsphere_joined = is_state('endpoint.vsphere.joined')
azure_joined = is_state('endpoint.azure.joined')
cloud_blocked = is_state('kubernetes-worker.cloud.blocked')
if vsphere_joined and cloud_blocked:
hookenv.status_set('blocked',
'vSphere integration requires K8s 1.12 or greater')
return
if azure_joined and cloud_blocked:
hookenv.status_set('blocked',
'Azure integration requires K8s 1.11 or greater')
return
if is_state('kubernetes-worker.cloud.pending'):
hookenv.status_set('waiting', 'Waiting for cloud integration')
return
if not is_state('kube-control.dns.available'):
# During deployment the worker has to start kubelet without cluster dns
# configured. If this is the first unit online in a service pool
# waiting to self host the dns pod, and configure itself to query the
# dns service declared in the kube-system namespace
hookenv.status_set('waiting', 'Waiting for cluster DNS.')
return
if is_state('kubernetes-worker.snaps.upgrade-specified'):
hookenv.status_set('waiting', 'Upgrade pending')
return
if is_state('kubernetes-worker.snaps.upgrade-needed'):
hookenv.status_set('blocked',
'Needs manual upgrade, run the upgrade action')
return
if is_state('kubernetes-worker.snaps.installed'):
update_kubelet_status()
return
else:
pass # will have been set by snap layer or other handler
def update_kubelet_status():
@ -347,6 +446,8 @@ def watch_for_changes(kube_api, kube_control, cni):
'kube-control.dns.available', 'kube-control.auth.available',
'cni.available', 'kubernetes-worker.restart-needed',
'worker.auth.bootstrapped')
@when_not('kubernetes-worker.cloud.pending',
'kubernetes-worker.cloud.blocked')
def start_worker(kube_api, kube_control, auth_control, cni):
''' Start kubelet using the provided API and DNS info.'''
servers = get_kube_api_servers(kube_api)
@ -403,8 +504,8 @@ def sdn_changed():
@when('kubernetes-worker.config.created')
@when_not('kubernetes-worker.ingress.available')
def render_and_launch_ingress():
''' If configuration has ingress daemon set enabled, launch the ingress load
balancer and default http backend. Otherwise attempt deletion. '''
''' If configuration has ingress daemon set enabled, launch the ingress
load balancer and default http backend. Otherwise attempt deletion. '''
config = hookenv.config()
# If ingress is enabled, launch the ingress controller
if config.get('ingress'):
@ -470,8 +571,9 @@ def apply_node_labels():
@when_any('config.changed.kubelet-extra-args',
'config.changed.proxy-extra-args')
def extra_args_changed():
'config.changed.proxy-extra-args',
'config.changed.kubelet-extra-config')
def config_changed_requires_restart():
set_state('kubernetes-worker.restart-needed')
@ -592,6 +694,20 @@ def configure_kubernetes_service(service, base_args, extra_args_key):
db.set(prev_args_key, args)
def merge_kubelet_extra_config(config, extra_config):
''' Updates config to include the contents of extra_config. This is done
recursively to allow deeply nested dictionaries to be merged.
This is destructive: it modifies the config dict that is passed in.
'''
for k, extra_config_value in extra_config.items():
if isinstance(extra_config_value, dict):
config_value = config.setdefault(k, {})
merge_kubelet_extra_config(config_value, extra_config_value)
else:
config[k] = extra_config_value
def configure_kubelet(dns, ingress_ip):
layer_options = layer.options('tls-client')
ca_cert_path = layer_options.get('ca_certificate_path')
@ -603,35 +719,93 @@ def configure_kubelet(dns, ingress_ip):
kubelet_opts['kubeconfig'] = kubeconfig_path
kubelet_opts['network-plugin'] = 'cni'
kubelet_opts['v'] = '0'
kubelet_opts['address'] = '0.0.0.0'
kubelet_opts['port'] = '10250'
kubelet_opts['cluster-domain'] = dns['domain']
kubelet_opts['anonymous-auth'] = 'false'
kubelet_opts['client-ca-file'] = ca_cert_path
kubelet_opts['tls-cert-file'] = server_cert_path
kubelet_opts['tls-private-key-file'] = server_key_path
kubelet_opts['logtostderr'] = 'true'
kubelet_opts['fail-swap-on'] = 'false'
kubelet_opts['node-ip'] = ingress_ip
if (dns['enable-kube-dns']):
kubelet_opts['cluster-dns'] = dns['sdn-ip']
# set --allow-privileged flag for kubelet
kubelet_opts['allow-privileged'] = set_privileged()
if is_state('kubernetes-worker.gpu.enabled'):
hookenv.log('Adding '
'--feature-gates=DevicePlugins=true '
'to kubelet')
kubelet_opts['feature-gates'] = 'DevicePlugins=true'
if is_state('endpoint.aws.ready'):
kubelet_opts['cloud-provider'] = 'aws'
elif is_state('endpoint.gcp.ready'):
cloud_config_path = _cloud_config_path('kubelet')
kubelet_opts['cloud-provider'] = 'gce'
kubelet_opts['cloud-config'] = str(cloud_config_path)
elif is_state('endpoint.openstack.ready'):
cloud_config_path = _cloud_config_path('kubelet')
kubelet_opts['cloud-provider'] = 'openstack'
kubelet_opts['cloud-config'] = str(cloud_config_path)
elif is_state('endpoint.vsphere.joined'):
# vsphere just needs to be joined on the worker (vs 'ready')
cloud_config_path = _cloud_config_path('kubelet')
kubelet_opts['cloud-provider'] = 'vsphere'
# NB: vsphere maps node product-id to its uuid (no config file needed).
uuid_file = '/sys/class/dmi/id/product_uuid'
with open(uuid_file, 'r') as f:
uuid = f.read().strip()
kubelet_opts['provider-id'] = 'vsphere://{}'.format(uuid)
elif is_state('endpoint.azure.ready'):
azure = endpoint_from_flag('endpoint.azure.ready')
cloud_config_path = _cloud_config_path('kubelet')
kubelet_opts['cloud-provider'] = 'azure'
kubelet_opts['cloud-config'] = str(cloud_config_path)
kubelet_opts['provider-id'] = azure.vm_id
if get_version('kubelet') >= (1, 10):
# Put together the KubeletConfiguration data
kubelet_config = {
'apiVersion': 'kubelet.config.k8s.io/v1beta1',
'kind': 'KubeletConfiguration',
'address': '0.0.0.0',
'authentication': {
'anonymous': {
'enabled': False
},
'x509': {
'clientCAFile': ca_cert_path
}
},
'clusterDomain': dns['domain'],
'failSwapOn': False,
'port': 10250,
'tlsCertFile': server_cert_path,
'tlsPrivateKeyFile': server_key_path
}
if dns['enable-kube-dns']:
kubelet_config['clusterDNS'] = [dns['sdn-ip']]
if is_state('kubernetes-worker.gpu.enabled'):
kubelet_config['featureGates'] = {
'DevicePlugins': True
}
# Add kubelet-extra-config. This needs to happen last so that it
# overrides any config provided by the charm.
kubelet_extra_config = hookenv.config('kubelet-extra-config')
kubelet_extra_config = yaml.load(kubelet_extra_config)
merge_kubelet_extra_config(kubelet_config, kubelet_extra_config)
# Render the file and configure Kubelet to use it
os.makedirs('/root/cdk/kubelet', exist_ok=True)
with open('/root/cdk/kubelet/config.yaml', 'w') as f:
f.write('# Generated by kubernetes-worker charm, do not edit\n')
yaml.dump(kubelet_config, f)
kubelet_opts['config'] = '/root/cdk/kubelet/config.yaml'
else:
# NOTE: This is for 1.9. Once we've dropped 1.9 support, we can remove
# this whole block and the parent if statement.
kubelet_opts['address'] = '0.0.0.0'
kubelet_opts['anonymous-auth'] = 'false'
kubelet_opts['client-ca-file'] = ca_cert_path
kubelet_opts['cluster-domain'] = dns['domain']
kubelet_opts['fail-swap-on'] = 'false'
kubelet_opts['port'] = '10250'
kubelet_opts['tls-cert-file'] = server_cert_path
kubelet_opts['tls-private-key-file'] = server_key_path
if dns['enable-kube-dns']:
kubelet_opts['cluster-dns'] = dns['sdn-ip']
if is_state('kubernetes-worker.gpu.enabled'):
kubelet_opts['feature-gates'] = 'DevicePlugins=true'
if get_version('kubelet') >= (1, 11):
kubelet_opts['dynamic-config-dir'] = '/root/cdk/kubelet/dynamic-config'
configure_kubernetes_service('kubelet', kubelet_opts, 'kubelet-extra-args')
@ -696,6 +870,7 @@ def create_kubeconfig(kubeconfig, server, ca, key=None, certificate=None,
@when_any('config.changed.default-backend-image',
'config.changed.ingress-ssl-chain-completion',
'config.changed.nginx-image')
@when('kubernetes-worker.config.created')
def launch_default_ingress_controller():
@ -716,13 +891,13 @@ def launch_default_ingress_controller():
context['defaultbackend_image'] == "auto"):
if context['arch'] == 's390x':
context['defaultbackend_image'] = \
"k8s.gcr.io/defaultbackend-s390x:1.4"
"k8s.gcr.io/defaultbackend-s390x:1.5"
elif context['arch'] == 'arm64':
context['defaultbackend_image'] = \
"k8s.gcr.io/defaultbackend-arm64:1.4"
"k8s.gcr.io/defaultbackend-arm64:1.5"
else:
context['defaultbackend_image'] = \
"k8s.gcr.io/defaultbackend:1.4"
"k8s.gcr.io/defaultbackend-amd64:1.5"
# Render the default http backend (404) replicationcontroller manifest
manifest = addon_path.format('default-http-backend.yaml')
@ -738,17 +913,16 @@ def launch_default_ingress_controller():
return
# Render the ingress daemon set controller manifest
context['ssl_chain_completion'] = config.get(
'ingress-ssl-chain-completion')
context['ingress_image'] = config.get('nginx-image')
if context['ingress_image'] == "" or context['ingress_image'] == "auto":
if context['arch'] == 's390x':
context['ingress_image'] = \
"docker.io/cdkbot/nginx-ingress-controller-s390x:0.9.0-beta.13"
elif context['arch'] == 'arm64':
context['ingress_image'] = \
"k8s.gcr.io/nginx-ingress-controller-arm64:0.9.0-beta.15"
else:
context['ingress_image'] = \
"k8s.gcr.io/nginx-ingress-controller:0.9.0-beta.15" # noqa
images = {'amd64': 'quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.16.1', # noqa
'arm64': 'quay.io/kubernetes-ingress-controller/nginx-ingress-controller-arm64:0.16.1', # noqa
's390x': 'quay.io/kubernetes-ingress-controller/nginx-ingress-controller-s390x:0.16.1', # noqa
'ppc64el': 'quay.io/kubernetes-ingress-controller/nginx-ingress-controller-ppc64le:0.16.1', # noqa
}
context['ingress_image'] = images.get(context['arch'], images['amd64'])
if get_version('kubelet') < (1, 9):
context['daemonset_api_version'] = 'extensions/v1beta1'
else:
@ -910,7 +1084,7 @@ def enable_gpu():
if get_version('kubelet') < (1, 9):
hookenv.status_set(
'active',
'Upgrade to snap channel >= 1.9/stable to enable GPU suppport.'
'Upgrade to snap channel >= 1.9/stable to enable GPU support.'
)
return
@ -1008,10 +1182,20 @@ def missing_kube_control():
missing.
"""
hookenv.status_set(
'blocked',
'Relate {}:kube-control kubernetes-master:kube-control'.format(
hookenv.service_name()))
try:
goal_state = hookenv.goal_state()
except NotImplementedError:
goal_state = {}
if 'kube-control' in goal_state.get('relations', {}):
hookenv.status_set(
'waiting',
'Waiting for kubernetes-master to become ready')
else:
hookenv.status_set(
'blocked',
'Relate {}:kube-control kubernetes-master:kube-control'.format(
hookenv.service_name()))
@when('docker.ready')
@ -1040,11 +1224,17 @@ def get_node_name():
if is_state('endpoint.aws.ready'):
cloud_provider = 'aws'
elif is_state('endpoint.gcp.ready'):
cloud_provider = 'gcp'
cloud_provider = 'gce'
elif is_state('endpoint.openstack.ready'):
cloud_provider = 'openstack'
elif is_state('endpoint.vsphere.ready'):
cloud_provider = 'vsphere'
elif is_state('endpoint.azure.ready'):
cloud_provider = 'azure'
if cloud_provider == 'aws':
return getfqdn()
return getfqdn().lower()
else:
return gethostname()
return gethostname().lower()
class ApplyNodeLabelFailed(Exception):
@ -1084,9 +1274,29 @@ def remove_label(label):
@when_any('endpoint.aws.joined',
'endpoint.gcp.joined')
'endpoint.gcp.joined',
'endpoint.openstack.joined',
'endpoint.vsphere.joined',
'endpoint.azure.joined')
@when_not('kubernetes-worker.cloud.ready')
def set_cloud_pending():
k8s_version = get_version('kubelet')
k8s_1_11 = k8s_version >= (1, 11)
k8s_1_12 = k8s_version >= (1, 12)
vsphere_joined = is_state('endpoint.vsphere.joined')
azure_joined = is_state('endpoint.azure.joined')
if (vsphere_joined and not k8s_1_12) or (azure_joined and not k8s_1_11):
set_state('kubernetes-worker.cloud.blocked')
else:
remove_state('kubernetes-worker.cloud.blocked')
set_state('kubernetes-worker.cloud.pending')
@when_any('endpoint.aws.joined',
'endpoint.gcp.joined',
'endpoint.azure.joined')
@when('kube-control.cluster_tag.available')
@when_not('kubernetes-worker.cloud-request-sent')
@when_not('kubernetes-worker.cloud.request-sent')
def request_integration():
hookenv.status_set('maintenance', 'requesting cloud integration')
kube_control = endpoint_from_flag('kube-control.cluster_tag.available')
@ -1109,26 +1319,47 @@ def request_integration():
'k8s-io-cluster-name': cluster_tag,
})
cloud.enable_object_storage_management()
elif is_state('endpoint.azure.joined'):
cloud = endpoint_from_flag('endpoint.azure.joined')
cloud.tag_instance({
'k8s-io-cluster-name': cluster_tag,
})
cloud.enable_object_storage_management()
cloud.enable_instance_inspection()
cloud.enable_dns_management()
set_state('kubernetes-worker.cloud-request-sent')
hookenv.status_set('waiting', 'waiting for cloud integration')
set_state('kubernetes-worker.cloud.request-sent')
hookenv.status_set('waiting', 'Waiting for cloud integration')
@when_none('endpoint.aws.joined',
'endpoint.gcp.joined')
def clear_requested_integration():
remove_state('kubernetes-worker.cloud-request-sent')
'endpoint.gcp.joined',
'endpoint.openstack.joined',
'endpoint.vsphere.joined',
'endpoint.azure.joined')
def clear_cloud_flags():
remove_state('kubernetes-worker.cloud.pending')
remove_state('kubernetes-worker.cloud.request-sent')
remove_state('kubernetes-worker.cloud.blocked')
remove_state('kubernetes-worker.cloud.ready')
@when_any('endpoint.aws.ready',
'endpoint.gcp.ready')
@when_not('kubernetes-worker.restarted-for-cloud')
def restart_for_cloud():
'endpoint.gcp.ready',
'endpoint.openstack.ready',
'endpoint.vsphere.ready',
'endpoint.azure.ready')
@when_not('kubernetes-worker.cloud.blocked',
'kubernetes-worker.cloud.ready')
def cloud_ready():
remove_state('kubernetes-worker.cloud.pending')
if is_state('endpoint.gcp.ready'):
_write_gcp_snap_config('kubelet')
set_state('kubernetes-worker.restarted-for-cloud')
set_state('kubernetes-worker.restart-needed')
elif is_state('endpoint.openstack.ready'):
_write_openstack_snap_config('kubelet')
elif is_state('endpoint.azure.ready'):
_write_azure_snap_config('kubelet')
set_state('kubernetes-worker.cloud.ready')
set_state('kubernetes-worker.restart-needed') # force restart
def _snap_common_path(component):
@ -1176,6 +1407,37 @@ def _write_gcp_snap_config(component):
daemon_env_path.write_text(daemon_env)
def _write_openstack_snap_config(component):
# openstack requires additional credentials setup
openstack = endpoint_from_flag('endpoint.openstack.ready')
cloud_config_path = _cloud_config_path(component)
cloud_config_path.write_text('\n'.join([
'[Global]',
'auth-url = {}'.format(openstack.auth_url),
'username = {}'.format(openstack.username),
'password = {}'.format(openstack.password),
'tenant-name = {}'.format(openstack.project_name),
'domain-name = {}'.format(openstack.user_domain_name),
]))
def _write_azure_snap_config(component):
azure = endpoint_from_flag('endpoint.azure.ready')
cloud_config_path = _cloud_config_path(component)
cloud_config_path.write_text(json.dumps({
'useInstanceMetadata': True,
'useManagedIdentityExtension': True,
'subscriptionId': azure.subscription_id,
'resourceGroup': azure.resource_group,
'location': azure.resource_group_location,
'vnetName': azure.vnet_name,
'vnetResourceGroup': azure.vnet_resource_group,
'subnetName': azure.subnet_name,
'securityGroupName': azure.security_group_name,
}))
def get_first_mount(mount_relation):
mount_relation_list = mount_relation.mounts()
if mount_relation_list and len(mount_relation_list) > 0:

View File

@ -176,3 +176,4 @@ spec:
- /nginx-ingress-controller
- --default-backend-service=$(POD_NAMESPACE)/default-http-backend
- --configmap=$(POD_NAMESPACE)/nginx-load-balancer-conf
- --enable-ssl-chain-completion={{ ssl_chain_completion }}