build: move e2e dependencies into e2e/go.mod

Several packages are only used while running the e2e suite. These
packages are less important to update, as they cannot influence the
final executable that is part of the Ceph-CSI container-image.

By moving these dependencies out of the main Ceph-CSI go.mod, it is
easier to identify if a reported CVE affects Ceph-CSI, or only the
testing (like most of the Kubernetes CVEs).

Signed-off-by: Niels de Vos <ndevos@ibm.com>
Authored by Niels de Vos on 2025-03-04 08:57:28 +01:00; committed by mergify[bot]
parent 15da101b1b
commit bec6090996
8047 changed files with 1407827 additions and 3453 deletions
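To make the split concrete, the change introduces a second, nested Go module under e2e/ so that test-only dependencies no longer appear in the root module's dependency graph. A minimal sketch of the idea (module path, Go version and dependency versions below are illustrative placeholders, not taken from this commit):

```
// e2e/go.mod (sketch; module path and versions are placeholders)
module github.com/ceph/ceph-csi/e2e

go 1.23

require (
	github.com/onsi/ginkgo/v2 v2.0.0
	github.com/onsi/gomega v1.0.0
	k8s.io/kubernetes v1.32.0 // e2e framework, never linked into the container image
)
```

With this layout, dependency scanning of the root go.mod only reports packages that can actually end up in the Ceph-CSI container image, which is what the CVE triage described above relies on.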


@ -0,0 +1,67 @@
rules:
# The core E2E framework is meant to be a normal Kubernetes client,
# which means that it shouldn't depend on internal code.
# The following packages are okay to use:
#
# public API
- selectorRegexp: ^k8s[.]io/(api|apimachinery|client-go|component-base|klog|pod-security-admission|utils)
allowedPrefixes: [ "" ]
# stdlib
- selectorRegexp: ^[a-z]+(/|$)
allowedPrefixes: [ "" ]
# stdlib x and proto
- selectorRegexp: ^golang.org/x|^google.golang.org/protobuf
allowedPrefixes: [ "" ]
# Ginkgo + Gomega
- selectorRegexp: ^github.com/onsi/(ginkgo|gomega)
allowedPrefixes: [ "" ]
# kube-openapi
- selectorRegexp: ^k8s.io/kube-openapi
allowedPrefixes: [ "" ]
# Public SIG Repos
- selectorRegexp: ^sigs.k8s.io/(json|yaml|structured-merge-diff)
allowedPrefixes: [ "" ]
# some of the shared test helpers (but not E2E sub-packages!)
- selectorRegexp: ^k8s[.]io/kubernetes/test/(e2e/framework/internal/|utils)
allowedPrefixes: [ "" ]
# Third party deps
- selectorRegexp: ^github.com/|^gopkg.in
allowedPrefixes: [
"gopkg.in/inf.v0",
"gopkg.in/evanphx/json-patch.v4",
"github.com/blang/semver/",
"github.com/davecgh/go-spew/spew",
"github.com/go-logr/logr",
"github.com/gogo/protobuf/proto",
"github.com/gogo/protobuf/sortkeys",
"github.com/golang/protobuf/proto",
"github.com/google/gnostic-models/openapiv2",
"github.com/google/gnostic-models/openapiv3",
"github.com/google/go-cmp/cmp",
"github.com/google/go-cmp/cmp/cmpopts",
"github.com/google/gofuzz",
"github.com/google/uuid",
"github.com/imdario/mergo",
"github.com/prometheus/client_golang/",
"github.com/prometheus/client_model/",
"github.com/prometheus/common/",
"github.com/prometheus/procfs",
"github.com/spf13/cobra",
"github.com/spf13/pflag",
"github.com/stretchr/testify/assert",
"github.com/stretchr/testify/require"
]
# Everything else isn't.
#
# In particular importing any test/e2e/framework/* package would be a
# violation (sub-packages get to use the framework, not the other way
# around).
- selectorRegexp: .

e2e/vendor/k8s.io/kubernetes/test/e2e/framework/OWNERS generated vendored Normal file

@ -0,0 +1,20 @@
# See the OWNERS docs at https://go.k8s.io/owners
approvers:
- andrewsykim
- pohly
- oomichi
- neolit123
- SataQiu
reviewers:
- sig-testing-reviewers
- andrewsykim
- pohly
- oomichi
- neolit123
- SataQiu
labels:
- area/e2e-test-framework
emeritus_approvers:
- fabriziopandini
- timothysc


@ -0,0 +1,88 @@
# Overview
The Kubernetes E2E framework simplifies writing Ginkgo test suites. Its main
use is for the following test suites in the Kubernetes repository itself:
- test/e2e: runs as client for a Kubernetes cluster. The e2e.test binary is
used for conformance testing.
- test/e2e_node: runs on the same node as a kubelet instance. Used for testing
kubelet.
- test/e2e_kubeadm: test suite for kubeadm.
Usage of the framework outside of Kubernetes is possible, but not encouraged.
Downstream users have to be prepared to deal with API changes.
# Code Organization
The core framework is the `k8s.io/kubernetes/test/e2e/framework` package. It
contains functionality that all E2E suites are expected to need:
- connecting to the apiserver
- managing per-test namespaces
- logging (`Logf`)
- failure handling (`Fail`, `Failf`)
- writing concise JUnit test results
It also contains a `TestContext` with settings that can be controlled via
command line flags. For historic reasons, this also contains settings for
individual tests or packages that are not part of the core framework.
Optional functionality is placed in sub packages like
`test/e2e/framework/pod`. The core framework does not depend on those. Sub
packages may depend on the core framework.
The advantages of splitting the code like this are:
- leaner go doc packages by grouping related functions together
- not forcing all E2E suites to import all functionality
- avoiding import cycles
# Execution Flow
When a test suite gets invoked, the top-level `Describe` calls register the
callbacks that define individual tests, but do not invoke them yet. After
that init phase, command line flags are parsed and the `Describe` callbacks are
invoked. Those then define the actual tests for the test suite. Command line
flags can be used to influence the test definitions.
Now `Context/BeforeEach/AfterEach/It` define code that will be called later
when executing a specific test. During this setup phase, `f :=
framework.NewDefaultFramework("some tests")` creates a `Framework` instance for
one or more tests. `NewDefaultFramework` initializes that instance anew for
each test with a `BeforeEach` callback. Starting with Kubernetes 1.26, that
instance gets cleaned up after all other code for a test has been invoked, so
the following code is correct:
```
f := framework.NewDefaultFramework("some tests")

ginkgo.AfterEach(func() {
    // Do something with f.ClientSet.
})

ginkgo.It("test something", func(ctx context.Context) {
    // The actual test.
})
```
Optional functionality can be injected into each test by adding a callback to
`NewFrameworkExtensions` in an init function. `NewDefaultFramework` will invoke
those callbacks as if the corresponding code had been added to each test like this:
```
f := framework.NewDefaultFramework("some tests")
optional.SomeCallback(f)
```
`SomeCallback` then can register additional `BeforeEach` or `AfterEach`
callbacks that use the test's `Framework` instance.
When a test runs, callbacks defined for it with `BeforeEach` and `AfterEach`
are called in first-in-first-out order. Since the migration to ginkgo v2 in
Kubernetes 1.25, the `AfterEach` callback is also called when there has been a
test failure. This can be used to run cleanup code for a test
reliably. However,
[`ginkgo.DeferCleanup`](https://onsi.github.io/ginkgo/#spec-cleanup-aftereach-and-defercleanup)
is often a better alternative. Its callbacks are executed in first-in-last-out
order.
`test/e2e/framework/internal/unittests/cleanup/cleanup.go` shows how these
different callbacks can be used and in which order they are going to run.
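As a minimal sketch of that ordering (not taken from the file above): `AfterEach` callbacks run in registration order, while `DeferCleanup` callbacks run in reverse, like `defer`, which makes it easy to tear down dependent resources in the right order.
```
ginkgo.It("creates dependent resources", func(ctx context.Context) {
	// Create resource A, then register its cleanup.
	ginkgo.DeferCleanup(func() {
		// Runs last (first-in-last-out): delete A after everything that
		// depends on it is gone.
	})

	// Create resource B, which depends on A, then register its cleanup.
	ginkgo.DeferCleanup(func() {
		// Runs first: delete B while A still exists.
	})

	// The actual test.
})
```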

e2e/vendor/k8s.io/kubernetes/test/e2e/framework/bugs.go generated vendored Normal file

@ -0,0 +1,108 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"errors"
"fmt"
"os"
"path/filepath"
"sort"
"strings"
"sync"
"github.com/onsi/ginkgo/v2/types"
)
var (
bugs []Bug
bugMutex sync.Mutex
)
// RecordBug stores information about a bug in the E2E suite source code that
// cannot be reported through ginkgo.Fail because it was found outside of some
// test, for example during test registration.
//
// This can be used instead of raising a panic. Then all bugs can be reported
// together instead of failing after the first one.
func RecordBug(bug Bug) {
bugMutex.Lock()
defer bugMutex.Unlock()
bugs = append(bugs, bug)
}
type Bug struct {
FileName string
LineNumber int
Message string
}
// NewBug creates a new bug with a location that is obtained by skipping a certain number
// of stack frames. Passing zero will record the source code location of the direct caller
// of NewBug.
func NewBug(message string, skip int) Bug {
location := types.NewCodeLocation(skip + 1)
return Bug{FileName: location.FileName, LineNumber: location.LineNumber, Message: message}
}
// FormatBugs produces a report that includes all bugs recorded earlier via
// RecordBug. An error is returned with the report if there have been bugs.
func FormatBugs() error {
bugMutex.Lock()
defer bugMutex.Unlock()
if len(bugs) == 0 {
return nil
}
lines := make([]string, 0, len(bugs))
wd, err := os.Getwd()
if err != nil {
return fmt.Errorf("get current directory: %v", err)
}
// Sort by file name, line number, message. For the sake of simplicity
// this uses the full file name even though the output may use a
// relative path. Usually the result should be the same because full
// paths will all have the same prefix.
sort.Slice(bugs, func(i, j int) bool {
switch strings.Compare(bugs[i].FileName, bugs[j].FileName) {
case -1:
return true
case 1:
return false
}
if bugs[i].LineNumber < bugs[j].LineNumber {
return true
}
if bugs[i].LineNumber > bugs[j].LineNumber {
return false
}
return bugs[i].Message < bugs[j].Message
})
for _, bug := range bugs {
// Use relative paths, if possible.
path := bug.FileName
if wd != "" {
if relpath, err := filepath.Rel(wd, bug.FileName); err == nil {
path = relpath
}
}
lines = append(lines, fmt.Sprintf("ERROR: %s:%d: %s\n", path, bug.LineNumber, strings.TrimSpace(bug.Message)))
}
return errors.New(strings.Join(lines, ""))
}
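As a hedged illustration of the API above (the registration helper and its argument are hypothetical, not part of the framework), a test suite can collect problems found while registering tests and report them all at once instead of panicking on the first one:
```
package suite

import (
	"k8s.io/kubernetes/test/e2e/framework"
)

// registerSpecs is a hypothetical helper that runs during test registration,
// i.e. outside of any test, where ginkgo.Fail cannot be used.
func registerSpecs(names []string) {
	for _, name := range names {
		if name == "" {
			// Remember the problem and keep going; a skip of 0 records the
			// location of this RecordBug/NewBug call site.
			framework.RecordBug(framework.NewBug("empty spec name", 0))
			continue
		}
		// ... register the spec with ginkgo.Describe / ginkgo.It ...
	}
}
```
Before the suite starts running, `framework.FormatBugs()` can then be checked; it returns a single error listing every recorded bug, sorted by file, line, and message.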


@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]


@ -0,0 +1,263 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package config simplifies the declaration of configuration options.
// Right now the implementation maps them directly to command line
// flags. When combined with test/e2e/framework/viperconfig in a test
// suite, those flags then can also be read from a config file.
//
// The command line flags all get stored in a private flag set. The
// developer of the E2E test suite decides how they are exposed. Options
// include:
// - exposing as normal flags in the actual command line:
// CopyFlags(Flags, flag.CommandLine)
// - populate via test/e2e/framework/viperconfig:
// viperconfig.ViperizeFlags("my-config.yaml", "", Flags)
// - a combination of both:
// CopyFlags(Flags, flag.CommandLine)
// viperconfig.ViperizeFlags("my-config.yaml", "", flag.CommandLine)
//
// Instead of defining flags one-by-one, test developers annotate a
// structure with tags and then call a single function. This is the
// same approach as in https://godoc.org/github.com/jessevdk/go-flags,
// but implemented so that a test suite can continue to use the normal
// "flag" package.
//
// For example, a file storage/csi.go might define:
//
// var scaling struct {
// NumNodes int `default:"1" usage:"number of nodes to run on"`
// Master string
// }
// _ = config.AddOptions(&scaling, "storage.csi.scaling")
//
// This defines the following command line flags:
//
// -storage.csi.scaling.numNodes=<int> - number of nodes to run on (default: 1)
// -storage.csi.scaling.master=<string>
//
// All fields in the structure must be exported and have one of the following
// types (same as in the `flag` package):
// - bool
// - time.Duration
// - float64
// - string
// - int
// - int64
// - uint
// - uint64
// - and/or nested or embedded structures containing those basic types.
//
// Each basic entry may have a tag with these optional keys:
//
// usage: additional explanation of the option
// default: the default value, in the same format as it would
// be given on the command line and true/false for
// a boolean
//
// The names of the final configuration options are a combination of an
// optional common prefix for all options in the structure and the
// name of the fields, concatenated with a dot. To get names that are
// consistent with the command line flags defined by `ginkgo`, the
// initial character of each field name is converted to lower case.
//
// There is currently no support for aliases, so renaming the fields
// or the common prefix will be visible to users of the test suite and
// may break scripts which use the old names.
//
// The variable will be filled with the actual values by the test
// suite before running tests. Beware that the code which registers
// Ginkgo tests cannot use those config options, because registering
// tests and options both run before the E2E test suite handles
// parameters.
package config
import (
"flag"
"fmt"
"reflect"
"strconv"
"time"
"unicode"
"unicode/utf8"
)
// Flags is the flag set that AddOptions adds to. Test authors should
// also use it instead of directly adding to the global command line.
var Flags = flag.NewFlagSet("", flag.ContinueOnError)
// CopyFlags ensures that all flags that are defined in the source flag
// set appear in the target flag set as if they had been defined there
// directly. From the flag package it inherits the behavior that there
// is a panic if the target already contains a flag from the source.
func CopyFlags(source *flag.FlagSet, target *flag.FlagSet) {
source.VisitAll(func(flag *flag.Flag) {
// We don't need to copy flag.DefValue. The original
// default (from, say, flag.String) was stored in
// the value and gets extracted by Var for the help
// message.
target.Var(flag.Value, flag.Name, flag.Usage)
})
}
// AddOptions analyzes the options value and creates the necessary
// flags to populate it.
//
// The prefix can be used to root the options deeper in the overall
// set of options, with a dot separating different levels.
//
// The function always returns true, to enable this simplified
// registration of options:
// _ = AddOptions(...)
//
// It panics when it encounters an error, like unsupported types
// or option name conflicts.
func AddOptions(options interface{}, prefix string) bool {
return AddOptionsToSet(Flags, options, prefix)
}
// AddOptionsToSet is the same as AddOptions, except that it allows choosing the flag set.
func AddOptionsToSet(flags *flag.FlagSet, options interface{}, prefix string) bool {
optionsType := reflect.TypeOf(options)
if optionsType == nil {
panic("options parameter without a type - nil?!")
}
if optionsType.Kind() != reflect.Pointer || optionsType.Elem().Kind() != reflect.Struct {
panic(fmt.Sprintf("need a pointer to a struct, got instead: %T", options))
}
addStructFields(flags, optionsType.Elem(), reflect.Indirect(reflect.ValueOf(options)), prefix)
return true
}
func addStructFields(flags *flag.FlagSet, structType reflect.Type, structValue reflect.Value, prefix string) {
for i := 0; i < structValue.NumField(); i++ {
entry := structValue.Field(i)
addr := entry.Addr()
structField := structType.Field(i)
name := structField.Name
r, n := utf8.DecodeRuneInString(name)
name = string(unicode.ToLower(r)) + name[n:]
usage := structField.Tag.Get("usage")
def := structField.Tag.Get("default")
if prefix != "" {
name = prefix + "." + name
}
if structField.PkgPath != "" {
panic(fmt.Sprintf("struct entry %q not exported", name))
}
ptr := addr.Interface()
if structField.Anonymous {
// Entries in embedded fields are treated like
// entries in the struct itself, i.e. we add
// them with the same prefix.
addStructFields(flags, structField.Type, entry, prefix)
continue
}
if structField.Type.Kind() == reflect.Struct {
// Add nested options.
addStructFields(flags, structField.Type, entry, name)
continue
}
// We could switch based on structField.Type. Doing a
// switch after getting an interface holding the
// pointer to the entry has the advantage that we
// immediately have something that we can add as flag
// variable.
//
// Perhaps generics will make this entire switch redundant someday...
switch ptr := ptr.(type) {
case *bool:
var defValue bool
parseDefault(&defValue, name, def)
flags.BoolVar(ptr, name, defValue, usage)
case *time.Duration:
var defValue time.Duration
parseDefault(&defValue, name, def)
flags.DurationVar(ptr, name, defValue, usage)
case *float64:
var defValue float64
parseDefault(&defValue, name, def)
flags.Float64Var(ptr, name, defValue, usage)
case *string:
flags.StringVar(ptr, name, def, usage)
case *int:
var defValue int
parseDefault(&defValue, name, def)
flags.IntVar(ptr, name, defValue, usage)
case *int64:
var defValue int64
parseDefault(&defValue, name, def)
flags.Int64Var(ptr, name, defValue, usage)
case *uint:
var defValue uint
parseDefault(&defValue, name, def)
flags.UintVar(ptr, name, defValue, usage)
case *uint64:
var defValue uint64
parseDefault(&defValue, name, def)
flags.Uint64Var(ptr, name, defValue, usage)
default:
panic(fmt.Sprintf("unsupported struct entry type %q: %T", name, entry.Interface()))
}
}
}
// parseDefault is necessary because "flag" wants the default in the
// actual type and cannot take a string. It would be nice to reuse the
// existing code for parsing from the "flag" package, but it isn't
// exported.
func parseDefault(value interface{}, name, def string) {
if def == "" {
return
}
checkErr := func(err error, value interface{}) {
if err != nil {
panic(fmt.Sprintf("invalid default %q for %T entry %s: %s", def, value, name, err))
}
}
switch value := value.(type) {
case *bool:
v, err := strconv.ParseBool(def)
checkErr(err, *value)
*value = v
case *time.Duration:
v, err := time.ParseDuration(def)
checkErr(err, *value)
*value = v
case *float64:
v, err := strconv.ParseFloat(def, 64)
checkErr(err, *value)
*value = v
case *int:
v, err := strconv.Atoi(def)
checkErr(err, *value)
*value = v
case *int64:
v, err := strconv.ParseInt(def, 0, 64)
checkErr(err, *value)
*value = v
case *uint:
v, err := strconv.ParseUint(def, 0, strconv.IntSize)
checkErr(err, *value)
*value = uint(v)
case *uint64:
v, err := strconv.ParseUint(def, 0, 64)
checkErr(err, *value)
*value = v
default:
panic(fmt.Sprintf("%q: setting defaults not supported for type %T", name, value))
}
}
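To tie the pieces above together: a test suite typically declares an options struct, registers it with `AddOptions`, and then copies `Flags` onto the regular command line before parsing. A minimal sketch under those assumptions (the option block and prefix are illustrative, mirroring the package documentation):
```
package suite

import (
	"flag"

	"k8s.io/kubernetes/test/e2e/framework/config"
)

// scaling is an illustrative option block; the struct tags use the "default"
// and "usage" keys that addStructFields understands.
var scaling struct {
	NumNodes int `default:"1" usage:"number of nodes to run on"`
	Master   string
}

// AddOptions always returns true, which allows registration at package scope.
var _ = config.AddOptions(&scaling, "storage.csi.scaling")

func handleFlags() {
	// Expose -storage.csi.scaling.numNodes and -storage.csi.scaling.master
	// as normal command line flags, then parse them.
	config.CopyFlags(config.Flags, flag.CommandLine)
	flag.Parse()
}
```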


@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]


@ -0,0 +1,188 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package debug
import (
"context"
"fmt"
"sort"
"time"
"github.com/onsi/ginkgo/v2"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
clientset "k8s.io/client-go/kubernetes"
restclient "k8s.io/client-go/rest"
"k8s.io/kubernetes/test/e2e/framework"
e2emetrics "k8s.io/kubernetes/test/e2e/framework/metrics"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
)
// EventsLister is a func that lists events.
type EventsLister func(opts metav1.ListOptions, ns string) (*v1.EventList, error)
// dumpEventsInNamespace dumps events in the given namespace.
func dumpEventsInNamespace(eventsLister EventsLister, namespace string) {
ginkgo.By(fmt.Sprintf("Collecting events from namespace %q.", namespace))
events, err := eventsLister(metav1.ListOptions{}, namespace)
framework.ExpectNoError(err, "failed to list events in namespace %q", namespace)
ginkgo.By(fmt.Sprintf("Found %d events.", len(events.Items)))
// Sort events by their first timestamp
sortedEvents := events.Items
if len(sortedEvents) > 1 {
sort.Sort(byFirstTimestamp(sortedEvents))
}
for _, e := range sortedEvents {
framework.Logf("At %v - event for %v: %v %v: %v", e.FirstTimestamp, e.InvolvedObject.Name, e.Source, e.Reason, e.Message)
}
// Note that we don't wait for any Cleanup to propagate, which means
// that if you delete a bunch of pods right before ending your test,
// you may or may not see the killing/deletion/Cleanup events.
}
// DumpAllNamespaceInfo dumps events, pods and nodes information in the given namespace.
func DumpAllNamespaceInfo(ctx context.Context, c clientset.Interface, namespace string) {
dumpEventsInNamespace(func(opts metav1.ListOptions, ns string) (*v1.EventList, error) {
return c.CoreV1().Events(ns).List(ctx, opts)
}, namespace)
e2epod.DumpAllPodInfoForNamespace(ctx, c, namespace, framework.TestContext.ReportDir)
// If cluster is large, then the following logs are basically useless, because:
// 1. it takes tens of minutes or hours to grab all of them
// 2. there are so many of them that working with them is mostly impossible
// So we dump them only if the cluster is relatively small.
maxNodesForDump := framework.TestContext.MaxNodesToGather
nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
if err != nil {
framework.Logf("unable to fetch node list: %v", err)
return
}
if len(nodes.Items) <= maxNodesForDump {
dumpAllNodeInfo(ctx, c, nodes)
} else {
framework.Logf("skipping dumping cluster info - cluster too large")
}
}
// byFirstTimestamp sorts a slice of events by first timestamp, using their involvedObject's name as a tie breaker.
type byFirstTimestamp []v1.Event
func (o byFirstTimestamp) Len() int { return len(o) }
func (o byFirstTimestamp) Swap(i, j int) { o[i], o[j] = o[j], o[i] }
func (o byFirstTimestamp) Less(i, j int) bool {
if o[i].FirstTimestamp.Equal(&o[j].FirstTimestamp) {
return o[i].InvolvedObject.Name < o[j].InvolvedObject.Name
}
return o[i].FirstTimestamp.Before(&o[j].FirstTimestamp)
}
func dumpAllNodeInfo(ctx context.Context, c clientset.Interface, nodes *v1.NodeList) {
names := make([]string, len(nodes.Items))
for ix := range nodes.Items {
names[ix] = nodes.Items[ix].Name
}
DumpNodeDebugInfo(ctx, c, names, framework.Logf)
}
// DumpNodeDebugInfo dumps debug information of the given nodes.
func DumpNodeDebugInfo(ctx context.Context, c clientset.Interface, nodeNames []string, logFunc func(fmt string, args ...interface{})) {
for _, n := range nodeNames {
logFunc("\nLogging node info for node %v", n)
node, err := c.CoreV1().Nodes().Get(ctx, n, metav1.GetOptions{})
if err != nil {
logFunc("Error getting node info %v", err)
}
logFunc("Node Info: %v", node)
logFunc("\nLogging kubelet events for node %v", n)
for _, e := range getNodeEvents(ctx, c, n) {
logFunc("source %v type %v message %v reason %v first ts %v last ts %v, involved obj %+v",
e.Source, e.Type, e.Message, e.Reason, e.FirstTimestamp, e.LastTimestamp, e.InvolvedObject)
}
logFunc("\nLogging pods the kubelet thinks are on node %v", n)
podList, err := getKubeletPods(ctx, c, n)
if err != nil {
logFunc("Unable to retrieve kubelet pods for node %v: %v", n, err)
continue
}
for _, p := range podList.Items {
logFunc("%s/%s started at %v (%d+%d container statuses recorded)", p.Namespace, p.Name, p.Status.StartTime, len(p.Status.InitContainerStatuses), len(p.Status.ContainerStatuses))
for _, c := range p.Status.InitContainerStatuses {
logFunc("\tInit container %v ready: %v, restart count %v",
c.Name, c.Ready, c.RestartCount)
}
for _, c := range p.Status.ContainerStatuses {
logFunc("\tContainer %v ready: %v, restart count %v",
c.Name, c.Ready, c.RestartCount)
}
}
_, err = e2emetrics.HighLatencyKubeletOperations(ctx, c, 10*time.Second, n, logFunc)
framework.ExpectNoError(err)
// TODO: Log node resource info
}
}
// getKubeletPods retrieves the list of pods on the kubelet.
func getKubeletPods(ctx context.Context, c clientset.Interface, node string) (*v1.PodList, error) {
var client restclient.Result
finished := make(chan struct{}, 1)
go func() {
// call chain tends to hang in some cases when Node is not ready. Add an artificial timeout for this call. #22165
client = c.CoreV1().RESTClient().Get().
Resource("nodes").
SubResource("proxy").
Name(fmt.Sprintf("%v:%v", node, framework.KubeletPort)).
Suffix("pods").
Do(ctx)
finished <- struct{}{}
}()
select {
case <-finished:
result := &v1.PodList{}
if err := client.Into(result); err != nil {
return &v1.PodList{}, err
}
return result, nil
case <-time.After(framework.PodGetTimeout):
return &v1.PodList{}, fmt.Errorf("Waiting up to %v for getting the list of pods", framework.PodGetTimeout)
}
}
// getNodeEvents returns kubelet events from the given node. This includes kubelet
// restart and node unhealthy events. Note that listing events like this will mess
// with latency metrics, beware of calling it during a test.
func getNodeEvents(ctx context.Context, c clientset.Interface, nodeName string) []v1.Event {
selector := fields.Set{
"involvedObject.kind": "Node",
"involvedObject.name": nodeName,
"involvedObject.namespace": metav1.NamespaceAll,
"source": "kubelet",
}.AsSelector().String()
options := metav1.ListOptions{FieldSelector: selector}
events, err := c.CoreV1().Events(metav1.NamespaceSystem).List(ctx, options)
if err != nil {
framework.Logf("Unexpected error retrieving node events %v", err)
return []v1.Event{}
}
return events.Items
}


@ -0,0 +1,288 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package debug
import (
"bytes"
"context"
"fmt"
"strconv"
"strings"
"sync"
"text/tabwriter"
"time"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
e2essh "k8s.io/kubernetes/test/e2e/framework/ssh"
)
const (
// Minimal period between polling log sizes from components
pollingPeriod = 60 * time.Second
workersNo = 5
kubeletLogsPath = "/var/log/kubelet.log"
kubeProxyLogsPath = "/var/log/kube-proxy.log"
kubeAddonsLogsPath = "/var/log/kube-addons.log"
kubeMasterAddonsLogsPath = "/var/log/kube-master-addons.log"
apiServerLogsPath = "/var/log/kube-apiserver.log"
controllersLogsPath = "/var/log/kube-controller-manager.log"
schedulerLogsPath = "/var/log/kube-scheduler.log"
)
var (
nodeLogsToCheck = []string{kubeletLogsPath, kubeProxyLogsPath}
masterLogsToCheck = []string{kubeletLogsPath, kubeAddonsLogsPath, kubeMasterAddonsLogsPath,
apiServerLogsPath, controllersLogsPath, schedulerLogsPath}
)
// TimestampedSize contains a size together with a time of measurement.
type TimestampedSize struct {
timestamp time.Time
size int
}
// LogSizeGatherer is a worker which grabs a WorkItem from the channel and does assigned work.
type LogSizeGatherer struct {
stopChannel chan bool
data *LogsSizeData
wg *sync.WaitGroup
workChannel chan WorkItem
}
// LogsSizeVerifier gathers data about log file sizes from master and node machines.
// It oversees <workersNo> workers which do the gathering.
type LogsSizeVerifier struct {
client clientset.Interface
stopChannel chan bool
// data stores LogSizeData grouped per IP and log_path
data *LogsSizeData
masterAddress string
nodeAddresses []string
wg sync.WaitGroup
workChannel chan WorkItem
workers []*LogSizeGatherer
}
// SingleLogSummary is a structure for handling average generation rate and number of probes.
type SingleLogSummary struct {
AverageGenerationRate int
NumberOfProbes int
}
// LogSizeDataTimeseries is a map of timestamped sizes.
type LogSizeDataTimeseries map[string]map[string][]TimestampedSize
// LogsSizeDataSummary is a map of log summaries.
// node -> file -> data
type LogsSizeDataSummary map[string]map[string]SingleLogSummary
// PrintHumanReadable returns a human-readable string of the log size data summary.
// TODO: make sure that we don't need locking here
func (s *LogsSizeDataSummary) PrintHumanReadable() string {
buf := &bytes.Buffer{}
w := tabwriter.NewWriter(buf, 1, 0, 1, ' ', 0)
fmt.Fprintf(w, "host\tlog_file\taverage_rate (B/s)\tnumber_of_probes\n")
for k, v := range *s {
fmt.Fprintf(w, "%v\t\t\t\n", k)
for path, data := range v {
fmt.Fprintf(w, "\t%v\t%v\t%v\n", path, data.AverageGenerationRate, data.NumberOfProbes)
}
}
w.Flush()
return buf.String()
}
// PrintJSON returns the summary of log size data in JSON format.
func (s *LogsSizeDataSummary) PrintJSON() string {
return framework.PrettyPrintJSON(*s)
}
// SummaryKind returns the kind of the log size data summary.
func (s *LogsSizeDataSummary) SummaryKind() string {
return "LogSizeSummary"
}
// LogsSizeData is a structure for handling timeseries of log size data and lock.
type LogsSizeData struct {
data LogSizeDataTimeseries
lock sync.Mutex
}
// WorkItem is a command for a worker that contains an IP of machine from which we want to
// gather data and paths to all files we're interested in.
type WorkItem struct {
ip string
paths []string
backoffMultiplier int
}
func prepareData(masterAddress string, nodeAddresses []string) *LogsSizeData {
data := make(LogSizeDataTimeseries)
ips := append(nodeAddresses, masterAddress)
for _, ip := range ips {
data[ip] = make(map[string][]TimestampedSize)
}
return &LogsSizeData{
data: data,
lock: sync.Mutex{},
}
}
func (d *LogsSizeData) addNewData(ip, path string, timestamp time.Time, size int) {
d.lock.Lock()
defer d.lock.Unlock()
d.data[ip][path] = append(
d.data[ip][path],
TimestampedSize{
timestamp: timestamp,
size: size,
},
)
}
// NewLogsVerifier creates a new LogsSizeVerifier which will stop when stopChannel is closed
func NewLogsVerifier(ctx context.Context, c clientset.Interface) *LogsSizeVerifier {
nodeAddresses, err := e2essh.NodeSSHHosts(ctx, c)
framework.ExpectNoError(err)
instanceAddress := framework.APIAddress() + ":22"
workChannel := make(chan WorkItem, len(nodeAddresses)+1)
workers := make([]*LogSizeGatherer, workersNo)
verifier := &LogsSizeVerifier{
client: c,
data: prepareData(instanceAddress, nodeAddresses),
masterAddress: instanceAddress,
nodeAddresses: nodeAddresses,
wg: sync.WaitGroup{},
workChannel: workChannel,
workers: workers,
}
verifier.wg.Add(workersNo)
for i := 0; i < workersNo; i++ {
workers[i] = &LogSizeGatherer{
data: verifier.data,
wg: &verifier.wg,
workChannel: workChannel,
}
}
return verifier
}
// GetSummary returns a summary (average generation rate and number of probes) of the data gathered by LogSizeVerifier
func (s *LogsSizeVerifier) GetSummary() *LogsSizeDataSummary {
result := make(LogsSizeDataSummary)
for k, v := range s.data.data {
result[k] = make(map[string]SingleLogSummary)
for path, data := range v {
if len(data) > 1 {
last := data[len(data)-1]
first := data[0]
rate := (last.size - first.size) / int(last.timestamp.Sub(first.timestamp)/time.Second)
result[k][path] = SingleLogSummary{
AverageGenerationRate: rate,
NumberOfProbes: len(data),
}
}
}
}
return &result
}
// Run starts log size gathering. It starts a goroutine for every worker and then blocks until stopChannel is closed
func (s *LogsSizeVerifier) Run(ctx context.Context) {
s.workChannel <- WorkItem{
ip: s.masterAddress,
paths: masterLogsToCheck,
backoffMultiplier: 1,
}
for _, node := range s.nodeAddresses {
s.workChannel <- WorkItem{
ip: node,
paths: nodeLogsToCheck,
backoffMultiplier: 1,
}
}
for _, worker := range s.workers {
go worker.Run(ctx)
}
<-s.stopChannel
s.wg.Wait()
}
// Run starts log size gathering.
func (g *LogSizeGatherer) Run(ctx context.Context) {
for g.Work(ctx) {
}
}
func (g *LogSizeGatherer) pushWorkItem(workItem WorkItem) {
select {
case <-time.After(time.Duration(workItem.backoffMultiplier) * pollingPeriod):
g.workChannel <- workItem
case <-g.stopChannel:
return
}
}
// Work does a single unit of work: tries to take out a WorkItem from the queue, ssh-es into a given machine,
// gathers data, writes it to the shared <data> map, and creates a goroutine which reinserts the work item into
// the queue with a <pollingPeriod> delay. Returns false if worker should exit.
func (g *LogSizeGatherer) Work(ctx context.Context) bool {
var workItem WorkItem
select {
case <-g.stopChannel:
g.wg.Done()
return false
case workItem = <-g.workChannel:
}
sshResult, err := e2essh.SSH(
ctx,
fmt.Sprintf("ls -l %v | awk '{print $9, $5}' | tr '\n' ' '", strings.Join(workItem.paths, " ")),
workItem.ip,
framework.TestContext.Provider,
)
if err != nil {
framework.Logf("Error while trying to SSH to %v, skipping probe. Error: %v", workItem.ip, err)
// In case of repeated error give up.
if workItem.backoffMultiplier >= 128 {
framework.Logf("Failed to ssh to a node %v multiple times in a row. Giving up.", workItem.ip)
g.wg.Done()
return false
}
workItem.backoffMultiplier *= 2
go g.pushWorkItem(workItem)
return true
}
workItem.backoffMultiplier = 1
results := strings.Split(sshResult.Stdout, " ")
now := time.Now()
for i := 0; i+1 < len(results); i = i + 2 {
path := results[i]
size, err := strconv.Atoi(results[i+1])
if err != nil {
framework.Logf("Error during conversion to int: %v, skipping data. Error: %v", results[i+1], err)
continue
}
g.data.addNewData(workItem.ip, path, now, size)
}
go g.pushWorkItem(workItem)
return true
}


@ -0,0 +1,664 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package debug
import (
"bufio"
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"math"
"regexp"
"sort"
"strconv"
"strings"
"sync"
"text/tabwriter"
"time"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
clientset "k8s.io/client-go/kubernetes"
kubeletstatsv1alpha1 "k8s.io/kubelet/pkg/apis/stats/v1alpha1"
"k8s.io/kubernetes/test/e2e/framework"
e2essh "k8s.io/kubernetes/test/e2e/framework/ssh"
)
// ResourceConstraint is a struct to hold constraints.
type ResourceConstraint struct {
CPUConstraint float64
MemoryConstraint uint64
}
// SingleContainerSummary is a struct to hold single container summary.
type SingleContainerSummary struct {
Name string
CPU float64
Mem uint64
}
// ContainerResourceUsage is a structure for gathering container resource usage.
type ContainerResourceUsage struct {
Name string
Timestamp time.Time
CPUUsageInCores float64
MemoryUsageInBytes uint64
MemoryWorkingSetInBytes uint64
MemoryRSSInBytes uint64
// The interval used to calculate CPUUsageInCores.
CPUInterval time.Duration
}
// ResourceUsagePerContainer is a map of ContainerResourceUsage.
type ResourceUsagePerContainer map[string]*ContainerResourceUsage
// ResourceUsageSummary is a struct to hold resource usage summary.
// we can't have int here, as JSON does not accept integer keys.
type ResourceUsageSummary map[string][]SingleContainerSummary
// PrintHumanReadable prints the resource usage summary in human-readable form.
func (s *ResourceUsageSummary) PrintHumanReadable() string {
buf := &bytes.Buffer{}
w := tabwriter.NewWriter(buf, 1, 0, 1, ' ', 0)
for perc, summaries := range *s {
buf.WriteString(fmt.Sprintf("%v percentile:\n", perc))
fmt.Fprintf(w, "container\tcpu(cores)\tmemory(MB)\n")
for _, summary := range summaries {
fmt.Fprintf(w, "%q\t%.3f\t%.2f\n", summary.Name, summary.CPU, float64(summary.Mem)/(1024*1024))
}
w.Flush()
}
return buf.String()
}
// PrintJSON prints resource usage summary in JSON.
func (s *ResourceUsageSummary) PrintJSON() string {
return framework.PrettyPrintJSON(*s)
}
// SummaryKind returns the kind of the ResourceUsageSummary.
func (s *ResourceUsageSummary) SummaryKind() string {
return "ResourceUsageSummary"
}
type uint64arr []uint64
func (a uint64arr) Len() int { return len(a) }
func (a uint64arr) Swap(i, j int) { a[i], a[j] = a[j], a[i] }
func (a uint64arr) Less(i, j int) bool { return a[i] < a[j] }
type usageDataPerContainer struct {
cpuData []float64
memUseData []uint64
memWorkSetData []uint64
}
func computePercentiles(timeSeries []ResourceUsagePerContainer, percentilesToCompute []int) map[int]ResourceUsagePerContainer {
if len(timeSeries) == 0 {
return make(map[int]ResourceUsagePerContainer)
}
dataMap := make(map[string]*usageDataPerContainer)
for i := range timeSeries {
for name, data := range timeSeries[i] {
if dataMap[name] == nil {
dataMap[name] = &usageDataPerContainer{
cpuData: make([]float64, 0, len(timeSeries)),
memUseData: make([]uint64, 0, len(timeSeries)),
memWorkSetData: make([]uint64, 0, len(timeSeries)),
}
}
dataMap[name].cpuData = append(dataMap[name].cpuData, data.CPUUsageInCores)
dataMap[name].memUseData = append(dataMap[name].memUseData, data.MemoryUsageInBytes)
dataMap[name].memWorkSetData = append(dataMap[name].memWorkSetData, data.MemoryWorkingSetInBytes)
}
}
for _, v := range dataMap {
sort.Float64s(v.cpuData)
sort.Sort(uint64arr(v.memUseData))
sort.Sort(uint64arr(v.memWorkSetData))
}
result := make(map[int]ResourceUsagePerContainer)
for _, perc := range percentilesToCompute {
data := make(ResourceUsagePerContainer)
for k, v := range dataMap {
percentileIndex := int(math.Ceil(float64(len(v.cpuData)*perc)/100)) - 1
data[k] = &ContainerResourceUsage{
Name: k,
CPUUsageInCores: v.cpuData[percentileIndex],
MemoryUsageInBytes: v.memUseData[percentileIndex],
MemoryWorkingSetInBytes: v.memWorkSetData[percentileIndex],
}
}
result[perc] = data
}
return result
}
func leftMergeData(left, right map[int]ResourceUsagePerContainer) map[int]ResourceUsagePerContainer {
result := make(map[int]ResourceUsagePerContainer)
for percentile, data := range left {
result[percentile] = data
if _, ok := right[percentile]; !ok {
continue
}
for k, v := range right[percentile] {
result[percentile][k] = v
}
}
return result
}
type resourceGatherWorker struct {
c clientset.Interface
nodeName string
wg *sync.WaitGroup
containerIDs []string
stopCh chan struct{}
dataSeries []ResourceUsagePerContainer
finished bool
inKubemark bool
resourceDataGatheringPeriod time.Duration
probeDuration time.Duration
printVerboseLogs bool
}
func (w *resourceGatherWorker) singleProbe(ctx context.Context) {
data := make(ResourceUsagePerContainer)
if w.inKubemark {
kubemarkData := getKubemarkMasterComponentsResourceUsage(ctx)
if kubemarkData == nil {
return
}
for k, v := range kubemarkData {
data[k] = &ContainerResourceUsage{
Name: v.Name,
MemoryWorkingSetInBytes: v.MemoryWorkingSetInBytes,
CPUUsageInCores: v.CPUUsageInCores,
}
}
} else {
nodeUsage, err := getOneTimeResourceUsageOnNode(w.c, w.nodeName, w.probeDuration, func() []string { return w.containerIDs })
if err != nil {
framework.Logf("Error while reading data from %v: %v", w.nodeName, err)
return
}
for k, v := range nodeUsage {
data[k] = v
if w.printVerboseLogs {
framework.Logf("Get container %v usage on node %v. CPUUsageInCores: %v, MemoryUsageInBytes: %v, MemoryWorkingSetInBytes: %v", k, w.nodeName, v.CPUUsageInCores, v.MemoryUsageInBytes, v.MemoryWorkingSetInBytes)
}
}
}
w.dataSeries = append(w.dataSeries, data)
}
// getOneTimeResourceUsageOnNode queries the node's /stats/summary endpoint
// and returns the resource usage of all containerNames for the past
// cpuInterval.
// The acceptable range of the interval is 2s~120s. Be warned that as the
// interval (and #containers) increases, the size of kubelet's response
// could be significant. E.g., the 60s interval stats for ~20 containers is
// ~1.5MB. Don't hammer the node with frequent, heavy requests.
//
// cadvisor records cumulative cpu usage in nanoseconds, so we need to have two
// stats points to compute the cpu usage over the interval. Assuming cadvisor
// polls every second, we'd need to get N stats points for N-second interval.
// Note that this is an approximation and may not be accurate, hence we also
// write the actual interval used for calculation (based on the timestamps of
// the stats points) in ContainerResourceUsage.CPUInterval.
//
// containerNames is a function returning a collection of container names that
// the user is interested in.
func getOneTimeResourceUsageOnNode(
c clientset.Interface,
nodeName string,
cpuInterval time.Duration,
containerNames func() []string,
) (ResourceUsagePerContainer, error) {
const (
// cadvisor records stats about every second.
cadvisorStatsPollingIntervalInSeconds float64 = 1.0
// cadvisor caches up to 2 minutes of stats (configured by kubelet).
maxNumStatsToRequest int = 120
)
numStats := int(float64(cpuInterval.Seconds()) / cadvisorStatsPollingIntervalInSeconds)
if numStats < 2 || numStats > maxNumStatsToRequest {
return nil, fmt.Errorf("numStats needs to be > 1 and < %d", maxNumStatsToRequest)
}
// Get information of all containers on the node.
summary, err := getStatsSummary(c, nodeName)
if err != nil {
return nil, err
}
f := func(name string, newStats *kubeletstatsv1alpha1.ContainerStats) *ContainerResourceUsage {
if newStats == nil || newStats.CPU == nil || newStats.Memory == nil {
return nil
}
return &ContainerResourceUsage{
Name: name,
Timestamp: newStats.StartTime.Time,
CPUUsageInCores: float64(removeUint64Ptr(newStats.CPU.UsageNanoCores)) / 1000000000,
MemoryUsageInBytes: removeUint64Ptr(newStats.Memory.UsageBytes),
MemoryWorkingSetInBytes: removeUint64Ptr(newStats.Memory.WorkingSetBytes),
MemoryRSSInBytes: removeUint64Ptr(newStats.Memory.RSSBytes),
CPUInterval: 0,
}
}
// Process container infos that are relevant to us.
containers := containerNames()
usageMap := make(ResourceUsagePerContainer, len(containers))
for _, pod := range summary.Pods {
for _, container := range pod.Containers {
isInteresting := false
for _, interestingContainerName := range containers {
if container.Name == interestingContainerName {
isInteresting = true
break
}
}
if !isInteresting {
continue
}
if usage := f(pod.PodRef.Name+"/"+container.Name, &container); usage != nil {
usageMap[pod.PodRef.Name+"/"+container.Name] = usage
}
}
}
return usageMap, nil
}
// getStatsSummary contacts kubelet for the container information.
func getStatsSummary(c clientset.Interface, nodeName string) (*kubeletstatsv1alpha1.Summary, error) {
ctx, cancel := context.WithTimeout(context.Background(), framework.SingleCallTimeout)
defer cancel()
data, err := c.CoreV1().RESTClient().Get().
Resource("nodes").
SubResource("proxy").
Name(fmt.Sprintf("%v:%v", nodeName, framework.KubeletPort)).
Suffix("stats/summary").
Do(ctx).Raw()
if err != nil {
return nil, err
}
summary := kubeletstatsv1alpha1.Summary{}
err = json.Unmarshal(data, &summary)
if err != nil {
return nil, err
}
return &summary, nil
}
func removeUint64Ptr(ptr *uint64) uint64 {
if ptr == nil {
return 0
}
return *ptr
}
func (w *resourceGatherWorker) gather(ctx context.Context, initialSleep time.Duration) {
defer utilruntime.HandleCrash()
defer w.wg.Done()
defer framework.Logf("Closing worker for %v", w.nodeName)
defer func() { w.finished = true }()
select {
case <-time.After(initialSleep):
w.singleProbe(ctx)
for {
select {
case <-time.After(w.resourceDataGatheringPeriod):
w.singleProbe(ctx)
case <-ctx.Done():
return
case <-w.stopCh:
return
}
}
case <-ctx.Done():
return
case <-w.stopCh:
return
}
}
// ContainerResourceGatherer is a struct for gathering container resource usage.
type ContainerResourceGatherer struct {
client clientset.Interface
stopCh chan struct{}
workers []resourceGatherWorker
workerWg sync.WaitGroup
containerIDs []string
options ResourceGathererOptions
}
// ResourceGathererOptions is a struct to hold options for resource gathering.
type ResourceGathererOptions struct {
InKubemark bool
Nodes NodesSet
ResourceDataGatheringPeriod time.Duration
ProbeDuration time.Duration
PrintVerboseLogs bool
}
// NodesSet is a value of nodes set.
type NodesSet int
const (
// AllNodes means all containers on all nodes.
AllNodes NodesSet = 0
// MasterNodes means all containers on Master nodes only.
MasterNodes NodesSet = 1
// MasterAndDNSNodes means all containers on Master nodes and DNS containers on other nodes.
MasterAndDNSNodes NodesSet = 2
)
// nodeHasControlPlanePods returns true if specified node has control plane pods
// (kube-scheduler and/or kube-controller-manager).
func nodeHasControlPlanePods(ctx context.Context, c clientset.Interface, nodeName string) (bool, error) {
regKubeScheduler := regexp.MustCompile("kube-scheduler-.*")
regKubeControllerManager := regexp.MustCompile("kube-controller-manager-.*")
podList, err := c.CoreV1().Pods(metav1.NamespaceSystem).List(ctx, metav1.ListOptions{
FieldSelector: fields.OneTermEqualSelector("spec.nodeName", nodeName).String(),
})
if err != nil {
return false, err
}
if len(podList.Items) < 1 {
framework.Logf("Can't find any pods in namespace %s to grab metrics from", metav1.NamespaceSystem)
}
for _, pod := range podList.Items {
if regKubeScheduler.MatchString(pod.Name) || regKubeControllerManager.MatchString(pod.Name) {
return true, nil
}
}
return false, nil
}
// NewResourceUsageGatherer returns a new ContainerResourceGatherer.
func NewResourceUsageGatherer(ctx context.Context, c clientset.Interface, options ResourceGathererOptions, pods *v1.PodList) (*ContainerResourceGatherer, error) {
g := ContainerResourceGatherer{
client: c,
stopCh: make(chan struct{}),
containerIDs: make([]string, 0),
options: options,
}
if options.InKubemark {
g.workerWg.Add(1)
g.workers = append(g.workers, resourceGatherWorker{
inKubemark: true,
stopCh: g.stopCh,
wg: &g.workerWg,
finished: false,
resourceDataGatheringPeriod: options.ResourceDataGatheringPeriod,
probeDuration: options.ProbeDuration,
printVerboseLogs: options.PrintVerboseLogs,
})
return &g, nil
}
// Tracks kube-system pods if no valid PodList is passed in.
var err error
if pods == nil {
pods, err = c.CoreV1().Pods("kube-system").List(ctx, metav1.ListOptions{})
if err != nil {
framework.Logf("Error while listing Pods: %v", err)
return nil, err
}
}
dnsNodes := make(map[string]bool)
for _, pod := range pods.Items {
if options.Nodes == MasterNodes {
isControlPlane, err := nodeHasControlPlanePods(ctx, c, pod.Spec.NodeName)
if err != nil {
return nil, err
}
if !isControlPlane {
continue
}
}
if options.Nodes == MasterAndDNSNodes {
isControlPlane, err := nodeHasControlPlanePods(ctx, c, pod.Spec.NodeName)
if err != nil {
return nil, err
}
if !isControlPlane && pod.Labels["k8s-app"] != "kube-dns" {
continue
}
}
for _, container := range pod.Status.InitContainerStatuses {
g.containerIDs = append(g.containerIDs, container.Name)
}
for _, container := range pod.Status.ContainerStatuses {
g.containerIDs = append(g.containerIDs, container.Name)
}
if options.Nodes == MasterAndDNSNodes {
dnsNodes[pod.Spec.NodeName] = true
}
}
nodeList, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
if err != nil {
framework.Logf("Error while listing Nodes: %v", err)
return nil, err
}
for _, node := range nodeList.Items {
isControlPlane, err := nodeHasControlPlanePods(ctx, c, node.Name)
if err != nil {
return nil, err
}
if options.Nodes == AllNodes || isControlPlane || dnsNodes[node.Name] {
g.workerWg.Add(1)
g.workers = append(g.workers, resourceGatherWorker{
c: c,
nodeName: node.Name,
wg: &g.workerWg,
containerIDs: g.containerIDs,
stopCh: g.stopCh,
finished: false,
inKubemark: false,
resourceDataGatheringPeriod: options.ResourceDataGatheringPeriod,
probeDuration: options.ProbeDuration,
printVerboseLogs: options.PrintVerboseLogs,
})
if options.Nodes == MasterNodes {
break
}
}
}
return &g, nil
}
// StartGatheringData starts a stat-gathering worker for each node to track,
// and blocks until StopAndSummarize is called.
func (g *ContainerResourceGatherer) StartGatheringData(ctx context.Context) {
if len(g.workers) == 0 {
return
}
delayPeriod := g.options.ResourceDataGatheringPeriod / time.Duration(len(g.workers))
delay := time.Duration(0)
for i := range g.workers {
go g.workers[i].gather(ctx, delay)
delay += delayPeriod
}
g.workerWg.Wait()
}
// StopAndSummarize stops stat gathering workers, processes the collected stats,
// generates resource summary for the passed-in percentiles, and returns the summary.
// It returns an error if the resource usage at any percentile is beyond the
// specified resource constraints.
func (g *ContainerResourceGatherer) StopAndSummarize(percentiles []int, constraints map[string]ResourceConstraint) (*ResourceUsageSummary, error) {
close(g.stopCh)
framework.Logf("Closed stop channel. Waiting for %v workers", len(g.workers))
finished := make(chan struct{}, 1)
go func() {
g.workerWg.Wait()
finished <- struct{}{}
}()
select {
case <-finished:
framework.Logf("Waitgroup finished.")
case <-time.After(2 * time.Minute):
unfinished := make([]string, 0)
for i := range g.workers {
if !g.workers[i].finished {
unfinished = append(unfinished, g.workers[i].nodeName)
}
}
framework.Logf("Timed out while waiting for waitgroup, some workers failed to finish: %v", unfinished)
}
if len(percentiles) == 0 {
framework.Logf("Warning! Empty percentile list for stopAndPrintData.")
return &ResourceUsageSummary{}, fmt.Errorf("Failed to get any resource usage data")
}
data := make(map[int]ResourceUsagePerContainer)
for i := range g.workers {
if g.workers[i].finished {
stats := computePercentiles(g.workers[i].dataSeries, percentiles)
data = leftMergeData(stats, data)
}
}
// Workers have been stopped. We need to gather data stored in them.
sortedKeys := []string{}
for name := range data[percentiles[0]] {
sortedKeys = append(sortedKeys, name)
}
sort.Strings(sortedKeys)
violatedConstraints := make([]string, 0)
summary := make(ResourceUsageSummary)
for _, perc := range percentiles {
for _, name := range sortedKeys {
usage := data[perc][name]
summary[strconv.Itoa(perc)] = append(summary[strconv.Itoa(perc)], SingleContainerSummary{
Name: name,
CPU: usage.CPUUsageInCores,
Mem: usage.MemoryWorkingSetInBytes,
})
// Verifying 99th percentile of resource usage
if perc != 99 {
continue
}
// Name has a form: <pod_name>/<container_name>
containerName := strings.Split(name, "/")[1]
constraint, ok := constraints[containerName]
if !ok {
continue
}
if usage.CPUUsageInCores > constraint.CPUConstraint {
violatedConstraints = append(
violatedConstraints,
fmt.Sprintf("Container %v is using %v/%v CPU",
name,
usage.CPUUsageInCores,
constraint.CPUConstraint,
),
)
}
if usage.MemoryWorkingSetInBytes > constraint.MemoryConstraint {
violatedConstraints = append(
violatedConstraints,
fmt.Sprintf("Container %v is using %v/%v MB of memory",
name,
float64(usage.MemoryWorkingSetInBytes)/(1024*1024),
float64(constraint.MemoryConstraint)/(1024*1024),
),
)
}
}
}
if len(violatedConstraints) > 0 {
return &summary, errors.New(strings.Join(violatedConstraints, "\n"))
}
return &summary, nil
}
// kubemarkResourceUsage is a struct for tracking the resource usage of kubemark.
type kubemarkResourceUsage struct {
Name string
MemoryWorkingSetInBytes uint64
CPUUsageInCores float64
}
func getMasterUsageByPrefix(ctx context.Context, prefix string) (string, error) {
sshResult, err := e2essh.SSH(ctx, fmt.Sprintf("ps ax -o %%cpu,rss,command | tail -n +2 | grep %v | sed 's/\\s+/ /g'", prefix), framework.APIAddress()+":22", framework.TestContext.Provider)
if err != nil {
return "", err
}
return sshResult.Stdout, nil
}
// getKubemarkMasterComponentsResourceUsage returns the resource usage of kubemark which contains multiple combinations of cpu and memory usage for each pod name.
func getKubemarkMasterComponentsResourceUsage(ctx context.Context) map[string]*kubemarkResourceUsage {
result := make(map[string]*kubemarkResourceUsage)
// Get kubernetes component resource usage
sshResult, err := getMasterUsageByPrefix(ctx, "kube")
if err != nil {
framework.Logf("Error when trying to SSH to master machine. Skipping probe. %v", err)
return nil
}
scanner := bufio.NewScanner(strings.NewReader(sshResult))
for scanner.Scan() {
var cpu float64
var mem uint64
var name string
fmt.Sscanf(strings.TrimSpace(scanner.Text()), "%f %d /usr/local/bin/kube-%s", &cpu, &mem, &name)
if name != "" {
// Gatherer expects pod_name/container_name format
fullName := name + "/" + name
result[fullName] = &kubemarkResourceUsage{Name: fullName, MemoryWorkingSetInBytes: mem * 1024, CPUUsageInCores: cpu / 100}
}
}
// Get etcd resource usage
sshResult, err = getMasterUsageByPrefix(ctx, "bin/etcd")
if err != nil {
framework.Logf("Error when trying to SSH to master machine. Skipping probe")
return nil
}
scanner = bufio.NewScanner(strings.NewReader(sshResult))
for scanner.Scan() {
var cpu float64
var mem uint64
var etcdKind string
fmt.Sscanf(strings.TrimSpace(scanner.Text()), "%f %d /bin/sh -c /usr/local/bin/etcd", &cpu, &mem)
dataDirStart := strings.Index(scanner.Text(), "--data-dir")
if dataDirStart < 0 {
continue
}
fmt.Sscanf(scanner.Text()[dataDirStart:], "--data-dir=/var/%s", &etcdKind)
if etcdKind != "" {
// Gatherer expects pod_name/container_name format
fullName := "etcd/" + etcdKind
result[fullName] = &kubemarkResourceUsage{Name: fullName, MemoryWorkingSetInBytes: mem * 1024, CPUUsageInCores: cpu / 100}
}
}
return result
}
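For context, the gatherer above is normally driven from suite setup and teardown code along these lines (a sketch only; the helper function, constraint values and percentiles are illustrative, not part of this file):
```
package suite

import (
	"context"
	"time"

	clientset "k8s.io/client-go/kubernetes"
	"k8s.io/kubernetes/test/e2e/framework"
	e2edebug "k8s.io/kubernetes/test/e2e/framework/debug"
)

// gatherResourceUsage is a hypothetical helper showing the call sequence.
func gatherResourceUsage(ctx context.Context, c clientset.Interface, runTests func()) {
	gatherer, err := e2edebug.NewResourceUsageGatherer(ctx, c, e2edebug.ResourceGathererOptions{
		Nodes:                       e2edebug.MasterNodes,
		ResourceDataGatheringPeriod: 60 * time.Second,
		ProbeDuration:               15 * time.Second,
	}, nil)
	framework.ExpectNoError(err)

	// Workers probe the kubelets in the background until StopAndSummarize
	// closes the stop channel.
	go gatherer.StartGatheringData(ctx)

	runTests()

	// Constraints are keyed by container name; the numbers are illustrative.
	constraints := map[string]e2edebug.ResourceConstraint{
		"kube-apiserver": {CPUConstraint: 1.0, MemoryConstraint: 1 << 30},
	}
	summary, err := gatherer.StopAndSummarize([]int{50, 90, 99}, constraints)
	framework.ExpectNoError(err)
	framework.Logf("%s", summary.PrintHumanReadable())
}
```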


@ -0,0 +1,352 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"context"
"errors"
"fmt"
"strings"
"time"
ginkgotypes "github.com/onsi/ginkgo/v2/types"
"github.com/onsi/gomega"
"github.com/onsi/gomega/format"
"github.com/onsi/gomega/types"
)
// MakeMatcher builds a gomega.Matcher based on a single callback function.
// That function is passed the actual value that is to be checked.
// There are three possible outcomes of the check:
// - An error is returned, which then is converted into a failure
// by Gomega.
// - A non-nil failure function is returned, which then is called
// by Gomega once a failure string is needed. This is useful
// to avoid unnecessarily preparing a failure string for intermediate
// failures in Eventually or Consistently.
// - Both function and error are nil, which means that the check
// succeeded.
func MakeMatcher[T interface{}](match func(actual T) (failure func() string, err error)) types.GomegaMatcher {
return &matcher[T]{
match: match,
}
}
type matcher[T interface{}] struct {
match func(actual T) (func() string, error)
failure func() string
}
func (m *matcher[T]) Match(actual interface{}) (success bool, err error) {
if actual, ok := actual.(T); ok {
failure, err := m.match(actual)
if err != nil {
return false, err
}
m.failure = failure
if failure != nil {
return false, nil
}
return true, nil
}
var empty T
return false, gomega.StopTrying(fmt.Sprintf("internal error: expected %T, got:\n%s", empty, format.Object(actual, 1)))
}
func (m *matcher[T]) FailureMessage(actual interface{}) string {
return m.failure()
}
func (m matcher[T]) NegatedFailureMessage(actual interface{}) string {
return m.failure()
}
var _ types.GomegaMatcher = &matcher[string]{}
// Gomega returns an interface that can be used like gomega to express
// assertions. The difference is that failed assertions are returned as an
// error:
//
// if err := Gomega().Expect(pod.Status.Phase).To(gomega.Equal(v1.Running)); err != nil {
// return fmt.Errorf("test pod not running: %w", err)
// }
//
// This error can get wrapped to provide additional context for the
// failure. The test then should use ExpectNoError to turn a non-nil error into
// a failure.
//
// When using this approach, there is no need for call offsets and extra
// descriptions for the Expect call because the call stack will be dumped when
// ExpectNoError is called and the additional description(s) can be added by
// wrapping the error.
//
// Asynchronous assertions use the framework's Poll interval and PodStart timeout
// by default.
func Gomega() GomegaInstance {
return gomegaInstance{}
}
type GomegaInstance interface {
Expect(actual interface{}) Assertion
Eventually(ctx context.Context, args ...interface{}) AsyncAssertion
Consistently(ctx context.Context, args ...interface{}) AsyncAssertion
}
type Assertion interface {
Should(matcher types.GomegaMatcher) error
ShouldNot(matcher types.GomegaMatcher) error
To(matcher types.GomegaMatcher) error
ToNot(matcher types.GomegaMatcher) error
NotTo(matcher types.GomegaMatcher) error
}
type AsyncAssertion interface {
Should(matcher types.GomegaMatcher) error
ShouldNot(matcher types.GomegaMatcher) error
WithTimeout(interval time.Duration) AsyncAssertion
WithPolling(interval time.Duration) AsyncAssertion
}
type gomegaInstance struct{}
var _ GomegaInstance = gomegaInstance{}
func (g gomegaInstance) Expect(actual interface{}) Assertion {
return assertion{actual: actual}
}
func (g gomegaInstance) Eventually(ctx context.Context, args ...interface{}) AsyncAssertion {
return newAsyncAssertion(ctx, args, false)
}
func (g gomegaInstance) Consistently(ctx context.Context, args ...interface{}) AsyncAssertion {
return newAsyncAssertion(ctx, args, true)
}
func newG() (*FailureError, gomega.Gomega) {
var failure FailureError
g := gomega.NewGomega(func(msg string, callerSkip ...int) {
failure = FailureError{
msg: msg,
}
})
return &failure, g
}
type assertion struct {
actual interface{}
}
func (a assertion) Should(matcher types.GomegaMatcher) error {
err, g := newG()
if !g.Expect(a.actual).Should(matcher) {
err.backtrace()
return *err
}
return nil
}
func (a assertion) ShouldNot(matcher types.GomegaMatcher) error {
err, g := newG()
if !g.Expect(a.actual).ShouldNot(matcher) {
err.backtrace()
return *err
}
return nil
}
func (a assertion) To(matcher types.GomegaMatcher) error {
err, g := newG()
if !g.Expect(a.actual).To(matcher) {
err.backtrace()
return *err
}
return nil
}
func (a assertion) ToNot(matcher types.GomegaMatcher) error {
err, g := newG()
if !g.Expect(a.actual).ToNot(matcher) {
err.backtrace()
return *err
}
return nil
}
func (a assertion) NotTo(matcher types.GomegaMatcher) error {
err, g := newG()
if !g.Expect(a.actual).NotTo(matcher) {
err.backtrace()
return *err
}
return nil
}
type asyncAssertion struct {
ctx context.Context
args []interface{}
timeout time.Duration
interval time.Duration
consistently bool
}
func newAsyncAssertion(ctx context.Context, args []interface{}, consistently bool) asyncAssertion {
return asyncAssertion{
ctx: ctx,
args: args,
// PodStart is used as default because waiting for a pod is the
// most common operation.
timeout: TestContext.timeouts.PodStart,
interval: TestContext.timeouts.Poll,
consistently: consistently,
}
}
func (a asyncAssertion) newAsync() (*FailureError, gomega.AsyncAssertion) {
err, g := newG()
var assertion gomega.AsyncAssertion
if a.consistently {
assertion = g.Consistently(a.ctx, a.args...)
} else {
assertion = g.Eventually(a.ctx, a.args...)
}
assertion = assertion.WithTimeout(a.timeout).WithPolling(a.interval)
return err, assertion
}
func (a asyncAssertion) Should(matcher types.GomegaMatcher) error {
err, assertion := a.newAsync()
if !assertion.Should(matcher) {
err.backtrace()
return *err
}
return nil
}
func (a asyncAssertion) ShouldNot(matcher types.GomegaMatcher) error {
err, assertion := a.newAsync()
if !assertion.ShouldNot(matcher) {
err.backtrace()
return *err
}
return nil
}
func (a asyncAssertion) WithTimeout(timeout time.Duration) AsyncAssertion {
a.timeout = timeout
return a
}
func (a asyncAssertion) WithPolling(interval time.Duration) AsyncAssertion {
a.interval = interval
return a
}
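// A minimal usage sketch (not from the upstream file): an asynchronous
// assertion that reports a failure as an error instead of aborting the test.
// The getPhase callback and the chosen timeouts are hypothetical.
//
//	getPhase := func(ctx context.Context) (string, error) {
//		// ... fetch the object and return its phase ...
//		return "Running", nil
//	}
//	if err := Gomega().Eventually(ctx, getPhase).
//		WithTimeout(2 * time.Minute).
//		WithPolling(5 * time.Second).
//		Should(gomega.Equal("Running")); err != nil {
//		return fmt.Errorf("object never became ready: %w", err)
//	}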
// FailureError is an error where the error string is meant to be passed to
// ginkgo.Fail directly, i.e. adding some prefix like "unexpected error" is not
// necessary. It is also not necessary to dump the error struct.
type FailureError struct {
msg string
fullStackTrace string
}
func (f FailureError) Error() string {
return f.msg
}
func (f FailureError) Backtrace() string {
return f.fullStackTrace
}
func (f FailureError) Is(target error) bool {
return target == ErrFailure
}
func (f *FailureError) backtrace() {
f.fullStackTrace = ginkgotypes.NewCodeLocationWithStackTrace(2).FullStackTrace
}
// ErrFailure is an empty error that can be wrapped to indicate that an error
// is a FailureError. It can also be used to test for a FailureError:
//
// return fmt.Errorf("some problem%w", ErrFailure)
// ...
// err := someOperation()
// if errors.Is(err, ErrFailure) {
// ...
// }
var ErrFailure error = FailureError{}
// ExpectNoError checks if "err" is set, and if so, fails assertion while logging the error.
//
// As in [gomega.Expect], the explain parameters can be used to provide
// additional information in case of a failure in one of these two ways:
// - A single string is used as first line of the failure message directly.
// - A string with additional parameters is passed through [fmt.Sprintf].
func ExpectNoError(err error, explain ...interface{}) {
ExpectNoErrorWithOffset(1, err, explain...)
}
// ExpectNoErrorWithOffset checks if "err" is set, and if so, fails assertion while logging the error at "offset" levels above its caller
// (for example, for call chain f -> g -> ExpectNoErrorWithOffset(1, ...) error would be logged for "f").
//
// As in [gomega.Expect], the explain parameters can be used to provide
// additional information in case of a failure in one of these two ways:
// - A single string is used as first line of the failure message directly.
// - A string with additional parameters is passed through [fmt.Sprintf].
func ExpectNoErrorWithOffset(offset int, err error, explain ...interface{}) {
if err == nil {
return
}
// Errors usually contain unexported fields. We have to use
// a formatter here which can print those.
prefix := ""
if len(explain) > 0 {
if str, ok := explain[0].(string); ok {
prefix = fmt.Sprintf(str, explain[1:]...) + ": "
} else {
prefix = fmt.Sprintf("unexpected explain arguments, need format string: %v", explain)
}
}
// This intentionally doesn't use gomega.Expect. Instead we take
// full control over what information is presented where:
// - The complete error object is logged because it may contain
// additional information that isn't included in its error
// string.
// - It is not included in the failure message because
// it might make the failure message very large and/or
// cause error aggregation to work less well: two
// failures at the same code line might not be matched in
// https://go.k8s.io/triage because the error details are too
// different.
//
// Some errors include all relevant information in the Error
// string. For those we can skip the redundant log message.
// For our own failures we only log the additional stack backtrace
// because it is not included in the failure message.
var failure FailureError
if errors.As(err, &failure) && failure.Backtrace() != "" {
log(offset+1, fmt.Sprintf("Failed inside E2E framework:\n %s", strings.ReplaceAll(failure.Backtrace(), "\n", "\n ")))
} else if !errors.Is(err, ErrFailure) {
log(offset+1, fmt.Sprintf("Unexpected error: %s\n%s", prefix, format.Object(err, 1)))
}
Fail(prefix+err.Error(), 1+offset)
}
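// A minimal usage sketch (not from the upstream file) of the two supported
// explain forms; err and pod are hypothetical values from the calling test:
//
//	framework.ExpectNoError(err, "creating test pod")
//	framework.ExpectNoError(err, "creating test pod %s in namespace %s", pod.Name, pod.Namespace)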

View File

@ -0,0 +1,97 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"bytes"
"fmt"
"sync"
)
// FlakeReport is a struct for managing the flake report.
type FlakeReport struct {
lock sync.RWMutex
Flakes []string `json:"flakes"`
FlakeCount int `json:"flakeCount"`
}
// NewFlakeReport returns a new flake report.
func NewFlakeReport() *FlakeReport {
return &FlakeReport{
Flakes: []string{},
}
}
func buildDescription(optionalDescription ...interface{}) string {
switch len(optionalDescription) {
case 0:
return ""
default:
return fmt.Sprintf(optionalDescription[0].(string), optionalDescription[1:]...)
}
}
// RecordFlakeIfError records the error (if non-nil) as a flake along with an optional description.
// This can be used as a replacement of framework.ExpectNoError() for non-critical errors that can
// be considered as 'flakes' to avoid causing failures in tests.
func (f *FlakeReport) RecordFlakeIfError(err error, optionalDescription ...interface{}) {
if err == nil {
return
}
msg := fmt.Sprintf("Unexpected error occurred: %v", err)
desc := buildDescription(optionalDescription...)
if desc != "" {
msg = fmt.Sprintf("%v (Description: %v)", msg, desc)
}
Logf("%s", msg)
f.lock.Lock()
defer f.lock.Unlock()
f.Flakes = append(f.Flakes, msg)
f.FlakeCount++
}
// GetFlakeCount returns the flake count.
func (f *FlakeReport) GetFlakeCount() int {
f.lock.RLock()
defer f.lock.RUnlock()
return f.FlakeCount
}
// PrintHumanReadable returns a human-readable string of the flake report.
func (f *FlakeReport) PrintHumanReadable() string {
f.lock.RLock()
defer f.lock.RUnlock()
buf := bytes.Buffer{}
buf.WriteString(fmt.Sprintf("FlakeCount: %v\n", f.FlakeCount))
buf.WriteString("Flakes:\n")
for _, flake := range f.Flakes {
buf.WriteString(fmt.Sprintf("%v\n", flake))
}
return buf.String()
}
// PrintJSON returns the summary of the flake report in JSON format.
func (f *FlakeReport) PrintJSON() string {
f.lock.RLock()
defer f.lock.RUnlock()
return PrettyPrintJSON(f)
}
// SummaryKind returns the kind of the flake report summary.
func (f *FlakeReport) SummaryKind() string {
return "FlakeReport"
}
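// A minimal usage sketch (not from the upstream file); restartDaemon and
// nodeName are hypothetical:
//
//	report := NewFlakeReport()
//	err := restartDaemon(ctx, nodeName)
//	report.RecordFlakeIfError(err, "restarting daemon on node %s", nodeName)
//	if report.GetFlakeCount() > 0 {
//		Logf("%s", report.PrintHumanReadable())
//	}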

View File

@ -0,0 +1,774 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package framework contains provider-independent helper code for
// building and running E2E tests with Ginkgo. The actual Ginkgo test
// suites gets assembled by combining this framework, the optional
// provider support code and specific tests via a separate .go file
// like Kubernetes' test/e2e.go.
package framework
import (
"context"
"fmt"
"math/rand"
"os"
"path"
"reflect"
"strings"
"time"
"k8s.io/apimachinery/pkg/runtime"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/util/wait"
v1svc "k8s.io/client-go/applyconfigurations/core/v1"
"k8s.io/client-go/discovery"
cacheddiscovery "k8s.io/client-go/discovery/cached/memory"
"k8s.io/client-go/dynamic"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/rest"
"k8s.io/client-go/restmapper"
scaleclient "k8s.io/client-go/scale"
admissionapi "k8s.io/pod-security-admission/api"
"github.com/onsi/ginkgo/v2"
)
const (
// DefaultNamespaceDeletionTimeout is timeout duration for waiting for a namespace deletion.
DefaultNamespaceDeletionTimeout = 5 * time.Minute
defaultServiceAccountName = "default"
)
var (
// NewFrameworkExtensions lists functions that get called by
// NewFramework after constructing a new framework and after
// calling ginkgo.BeforeEach for the framework.
//
// This can be used by extensions of the core framework to modify
// settings in the framework instance or to add additional callbacks
// with ginkgo.BeforeEach/AfterEach/DeferCleanup.
//
// When a test runs, functions will be invoked in this order:
// - BeforeEaches defined by tests before f.NewDefaultFramework
// in the order in which they were defined (first-in-first-out)
// - f.BeforeEach
// - BeforeEaches defined by tests after f.NewDefaultFramework
// - It callback
// - all AfterEaches in the order in which they were defined
// - all DeferCleanups with the order reversed (first-in-last-out)
// - f.AfterEach
//
// Because a test might skip test execution in a BeforeEach that runs
// before f.BeforeEach, AfterEach callbacks that depend on the
// framework instance must check whether it was initialized. They can
// do that by checking f.ClientSet for nil. DeferCleanup callbacks
// don't need to do this because they get defined when the test
// runs.
NewFrameworkExtensions []func(f *Framework)
)
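// A minimal registration sketch (not from the upstream file); the callback
// body is hypothetical:
//
//	func init() {
//		NewFrameworkExtensions = append(NewFrameworkExtensions, func(f *Framework) {
//			ginkgo.AfterEach(func() {
//				if f.ClientSet == nil {
//					// BeforeEach was skipped, nothing to collect.
//					return
//				}
//				// ... gather metrics, dump additional state, etc. ...
//			})
//		})
//	}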
// Framework supports common operations used by e2e tests; it will keep a client & a namespace for you.
// The eventual goal is to merge this with the integration test framework.
//
// You can configure the pod security level for your test by setting the `NamespacePodSecurityLevel`
// which will set all three of pod security admission enforce, warn and audit labels on the namespace.
// The default pod security profile is "restricted".
// Each of the labels can be overridden by using more specific NamespacePodSecurity* attributes of this
// struct.
type Framework struct {
BaseName string
// Set together with creating the ClientSet and the namespace.
// Guaranteed to be unique in the cluster even when running the same
// test multiple times in parallel.
UniqueName string
clientConfig *rest.Config
ClientSet clientset.Interface
KubemarkExternalClusterClientSet clientset.Interface
DynamicClient dynamic.Interface
ScalesGetter scaleclient.ScalesGetter
SkipNamespaceCreation bool // Whether to skip creating a namespace
SkipSecretCreation bool // Whether to skip creating secret for a test
Namespace *v1.Namespace // Every test has at least one namespace unless creation is skipped
namespacesToDelete []*v1.Namespace // Some tests have more than one.
NamespaceDeletionTimeout time.Duration
NamespacePodSecurityEnforceLevel admissionapi.Level // The pod security enforcement level for namespaces to be applied.
NamespacePodSecurityWarnLevel admissionapi.Level // The pod security warn (client logging) level for namespaces to be applied.
NamespacePodSecurityAuditLevel admissionapi.Level // The pod security audit (server logging) level for namespaces to be applied.
NamespacePodSecurityLevel admissionapi.Level // The pod security level to be used for all of enforcement, warn and audit. Can be rewritten by more specific configuration attributes.
// Flaky operation failures in an e2e test can be captured through this.
flakeReport *FlakeReport
// configuration for framework's client
Options Options
// Place where various additional data is stored during test run to be printed to ReportDir,
// or stdout if ReportDir is not set once test ends.
TestSummaries []TestDataSummary
// Timeouts contains the custom timeouts used during the test execution.
Timeouts *TimeoutContext
// DumpAllNamespaceInfo is invoked by the framework to record
// information about a namespace after a test failure.
DumpAllNamespaceInfo DumpAllNamespaceInfoAction
}
// DumpAllNamespaceInfoAction is called after each failed test for namespaces
// created for the test.
type DumpAllNamespaceInfoAction func(ctx context.Context, f *Framework, namespace string)
// TestDataSummary is an interface for managing test data.
type TestDataSummary interface {
SummaryKind() string
PrintHumanReadable() string
PrintJSON() string
}
// Options is a struct for managing test framework options.
type Options struct {
ClientQPS float32
ClientBurst int
GroupVersion *schema.GroupVersion
}
// NewFrameworkWithCustomTimeouts makes a framework with custom timeouts.
// For timeout values that are zero the normal default value continues to
// be used.
func NewFrameworkWithCustomTimeouts(baseName string, timeouts *TimeoutContext) *Framework {
f := NewDefaultFramework(baseName)
in := reflect.ValueOf(timeouts).Elem()
out := reflect.ValueOf(f.Timeouts).Elem()
for i := 0; i < in.NumField(); i++ {
value := in.Field(i)
if !value.IsZero() {
out.Field(i).Set(value)
}
}
return f
}
// NewDefaultFramework makes a new framework and sets up a BeforeEach which
// initializes the framework instance. It cleans up with a DeferCleanup,
// which runs last, so an AfterEach in the test still has a valid framework
// instance.
func NewDefaultFramework(baseName string) *Framework {
options := Options{
ClientQPS: 20,
ClientBurst: 50,
}
return NewFramework(baseName, options, nil)
}
// NewFramework creates a test framework.
func NewFramework(baseName string, options Options, client clientset.Interface) *Framework {
f := &Framework{
BaseName: baseName,
Options: options,
ClientSet: client,
Timeouts: NewTimeoutContext(),
}
// The order is important here: if the extension calls ginkgo.BeforeEach
// itself, then it can be sure that f.BeforeEach already ran when its
// own callback gets invoked.
ginkgo.BeforeEach(f.BeforeEach, AnnotatedLocation("set up framework"))
for _, extension := range NewFrameworkExtensions {
extension(f)
}
return f
}
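// A minimal usage sketch (not from the upstream file) of a test that consumes
// the framework; the spec text and body are hypothetical:
//
//	var _ = framework.Describe("Pods", func() {
//		f := framework.NewDefaultFramework("pods")
//		f.NamespacePodSecurityLevel = admissionapi.LevelBaseline
//
//		f.It("should be schedulable", func(ctx context.Context) {
//			// Use f.ClientSet and f.Namespace.Name here; both are set up
//			// by f.BeforeEach and cleaned up automatically.
//		})
//	})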
// BeforeEach gets a client and makes a namespace.
func (f *Framework) BeforeEach(ctx context.Context) {
// DeferCleanup, in contrast to AfterEach, triggers execution in
// first-in-last-out order. This ensures that the framework instance
// remains valid as long as possible.
//
// In addition, AfterEach will not be called if a test never gets here.
ginkgo.DeferCleanup(f.AfterEach, AnnotatedLocation("tear down framework"))
// Registered later and thus runs before deleting namespaces.
ginkgo.DeferCleanup(f.dumpNamespaceInfo, AnnotatedLocation("dump namespaces"))
ginkgo.By("Creating a kubernetes client")
config, err := LoadConfig()
ExpectNoError(err)
config.QPS = f.Options.ClientQPS
config.Burst = f.Options.ClientBurst
if f.Options.GroupVersion != nil {
config.GroupVersion = f.Options.GroupVersion
}
if TestContext.KubeAPIContentType != "" {
config.ContentType = TestContext.KubeAPIContentType
}
f.clientConfig = rest.CopyConfig(config)
f.ClientSet, err = clientset.NewForConfig(config)
ExpectNoError(err)
f.DynamicClient, err = dynamic.NewForConfig(config)
ExpectNoError(err)
// create scales getter, set GroupVersion and NegotiatedSerializer to default values
// as they are required when creating a REST client.
if config.GroupVersion == nil {
config.GroupVersion = &schema.GroupVersion{}
}
if config.NegotiatedSerializer == nil {
config.NegotiatedSerializer = scheme.Codecs
}
restClient, err := rest.RESTClientFor(config)
ExpectNoError(err)
discoClient, err := discovery.NewDiscoveryClientForConfig(config)
ExpectNoError(err)
cachedDiscoClient := cacheddiscovery.NewMemCacheClient(discoClient)
restMapper := restmapper.NewDeferredDiscoveryRESTMapper(cachedDiscoClient)
restMapper.Reset()
resolver := scaleclient.NewDiscoveryScaleKindResolver(cachedDiscoClient)
f.ScalesGetter = scaleclient.New(restClient, restMapper, dynamic.LegacyAPIPathResolverFunc, resolver)
TestContext.CloudConfig.Provider.FrameworkBeforeEach(f)
if !f.SkipNamespaceCreation {
ginkgo.By(fmt.Sprintf("Building a namespace api object, basename %s", f.BaseName))
namespace, err := f.CreateNamespace(ctx, f.BaseName, map[string]string{
"e2e-framework": f.BaseName,
})
ExpectNoError(err)
f.Namespace = namespace
if TestContext.VerifyServiceAccount {
ginkgo.By("Waiting for a default service account to be provisioned in namespace")
err = WaitForDefaultServiceAccountInNamespace(ctx, f.ClientSet, namespace.Name)
ExpectNoError(err)
ginkgo.By("Waiting for kube-root-ca.crt to be provisioned in namespace")
err = WaitForKubeRootCAInNamespace(ctx, f.ClientSet, namespace.Name)
ExpectNoError(err)
} else {
Logf("Skipping waiting for service account")
}
f.UniqueName = f.Namespace.GetName()
} else {
// not guaranteed to be unique, but very likely
f.UniqueName = fmt.Sprintf("%s-%08x", f.BaseName, rand.Int31())
}
f.flakeReport = NewFlakeReport()
}
func (f *Framework) dumpNamespaceInfo(ctx context.Context) {
if !ginkgo.CurrentSpecReport().Failed() {
return
}
if !TestContext.DumpLogsOnFailure {
return
}
if f.DumpAllNamespaceInfo == nil {
return
}
ginkgo.By("dump namespace information after failure", func() {
if !f.SkipNamespaceCreation {
for _, ns := range f.namespacesToDelete {
f.DumpAllNamespaceInfo(ctx, f, ns.Name)
}
}
})
}
// printSummaries prints summaries of tests.
func printSummaries(summaries []TestDataSummary, testBaseName string) {
now := time.Now()
for i := range summaries {
Logf("Printing summary: %v", summaries[i].SummaryKind())
switch TestContext.OutputPrintType {
case "hr":
if TestContext.ReportDir == "" {
Logf("%s", summaries[i].PrintHumanReadable())
} else {
// TODO: learn to extract test name and append it to the kind instead of timestamp.
filePath := path.Join(TestContext.ReportDir, summaries[i].SummaryKind()+"_"+testBaseName+"_"+now.Format(time.RFC3339)+".txt")
if err := os.WriteFile(filePath, []byte(summaries[i].PrintHumanReadable()), 0644); err != nil {
Logf("Failed to write file %v with test performance data: %v", filePath, err)
}
}
case "json":
fallthrough
default:
if TestContext.OutputPrintType != "json" {
Logf("Unknown output type: %v. Printing JSON", TestContext.OutputPrintType)
}
if TestContext.ReportDir == "" {
Logf("%v JSON\n%v", summaries[i].SummaryKind(), summaries[i].PrintJSON())
Logf("Finished")
} else {
// TODO: learn to extract test name and append it to the kind instead of timestamp.
filePath := path.Join(TestContext.ReportDir, summaries[i].SummaryKind()+"_"+testBaseName+"_"+now.Format(time.RFC3339)+".json")
Logf("Writing to %s", filePath)
if err := os.WriteFile(filePath, []byte(summaries[i].PrintJSON()), 0644); err != nil {
Logf("Failed to write file %v with test performance data: %v", filePath, err)
}
}
}
}
}
// AfterEach deletes the namespace, after reading its events.
func (f *Framework) AfterEach(ctx context.Context) {
// This should not happen. Given that ClientSet is a public field, a test must have set it to nil!
// Error out early before any API calls during cleanup.
if f.ClientSet == nil {
Failf("The framework ClientSet must not be nil at this point")
}
// DeleteNamespace at the very end in defer, to avoid any
// expectation failures preventing deleting the namespace.
defer func() {
nsDeletionErrors := map[string]error{}
// Whether to delete the namespace is determined by 3 factors: the delete-namespace flag, the delete-namespace-on-failure flag and the test result.
// If delete-namespace is set to false, the namespace is always preserved.
// If delete-namespace is true and delete-namespace-on-failure is false, the namespace is preserved if the test failed.
if TestContext.DeleteNamespace && (TestContext.DeleteNamespaceOnFailure || !ginkgo.CurrentSpecReport().Failed()) {
for _, ns := range f.namespacesToDelete {
ginkgo.By(fmt.Sprintf("Destroying namespace %q for this suite.", ns.Name))
if err := f.ClientSet.CoreV1().Namespaces().Delete(ctx, ns.Name, metav1.DeleteOptions{}); err != nil {
if !apierrors.IsNotFound(err) {
nsDeletionErrors[ns.Name] = err
// Dump namespace if we are unable to delete the namespace and the dump was not already performed.
if !ginkgo.CurrentSpecReport().Failed() && TestContext.DumpLogsOnFailure && f.DumpAllNamespaceInfo != nil {
f.DumpAllNamespaceInfo(ctx, f, ns.Name)
}
} else {
Logf("Namespace %v was already deleted", ns.Name)
}
}
}
} else {
if !TestContext.DeleteNamespace {
Logf("Found DeleteNamespace=false, skipping namespace deletion!")
} else {
Logf("Found DeleteNamespaceOnFailure=false and current test failed, skipping namespace deletion!")
}
}
// Unsetting this is relevant for a following test that uses
// the same instance because it might not reach f.BeforeEach
// when some other BeforeEach skips the test first.
f.Namespace = nil
f.clientConfig = nil
f.ClientSet = nil
f.namespacesToDelete = nil
// if we had errors deleting, report them now.
if len(nsDeletionErrors) != 0 {
messages := []string{}
for namespaceKey, namespaceErr := range nsDeletionErrors {
messages = append(messages, fmt.Sprintf("Couldn't delete ns: %q: %s (%#v)", namespaceKey, namespaceErr, namespaceErr))
}
Fail(strings.Join(messages, ","))
}
}()
TestContext.CloudConfig.Provider.FrameworkAfterEach(f)
// Report any flakes that were observed in the e2e test and reset.
if f.flakeReport != nil && f.flakeReport.GetFlakeCount() > 0 {
f.TestSummaries = append(f.TestSummaries, f.flakeReport)
f.flakeReport = nil
}
printSummaries(f.TestSummaries, f.BaseName)
}
// DeleteNamespace can be used to delete a namespace. Additionally it dumps the
// namespace information on failure, so it can be used as an alternative to the
// framework deleting the namespace itself at the end of the test.
func (f *Framework) DeleteNamespace(ctx context.Context, name string) {
defer func() {
err := f.ClientSet.CoreV1().Namespaces().Delete(ctx, name, metav1.DeleteOptions{})
if err != nil && !apierrors.IsNotFound(err) {
Logf("error deleting namespace %s: %v", name, err)
return
}
err = WaitForNamespacesDeleted(ctx, f.ClientSet, []string{name}, DefaultNamespaceDeletionTimeout)
if err != nil {
Logf("error deleting namespace %s: %v", name, err)
return
}
// remove the deleted namespace from the namespacesToDelete slice
for i, ns := range f.namespacesToDelete {
if ns == nil {
continue
}
if ns.Name == name {
f.namespacesToDelete = append(f.namespacesToDelete[:i], f.namespacesToDelete[i+1:]...)
}
}
}()
// if current test failed then we should dump namespace information
if !f.SkipNamespaceCreation && ginkgo.CurrentSpecReport().Failed() && TestContext.DumpLogsOnFailure && f.DumpAllNamespaceInfo != nil {
f.DumpAllNamespaceInfo(ctx, f, name)
}
}
// CreateNamespace creates a namespace for e2e testing.
func (f *Framework) CreateNamespace(ctx context.Context, baseName string, labels map[string]string) (*v1.Namespace, error) {
createTestingNS := TestContext.CreateTestingNS
if createTestingNS == nil {
createTestingNS = CreateTestingNS
}
if labels == nil {
labels = make(map[string]string)
} else {
labelsCopy := make(map[string]string)
for k, v := range labels {
labelsCopy[k] = v
}
labels = labelsCopy
}
labels[admissionapi.EnforceLevelLabel] = firstNonEmptyPSaLevelOrRestricted(f.NamespacePodSecurityEnforceLevel, f.NamespacePodSecurityLevel)
labels[admissionapi.WarnLevelLabel] = firstNonEmptyPSaLevelOrRestricted(f.NamespacePodSecurityWarnLevel, f.NamespacePodSecurityLevel)
labels[admissionapi.AuditLevelLabel] = firstNonEmptyPSaLevelOrRestricted(f.NamespacePodSecurityAuditLevel, f.NamespacePodSecurityLevel)
ns, err := createTestingNS(ctx, baseName, f.ClientSet, labels)
// Check ns instead of err for nil: the namespace may have been created even
// though creating the serviceAccount in it failed, and it still needs cleanup.
f.AddNamespacesToDelete(ns)
if TestContext.E2EDockerConfigFile != "" && !f.SkipSecretCreation {
// With the Secret created, the default service account (in the new namespace)
// is patched with the secret and can then be referenced by all the pods spawned by E2E process, and repository authentication should be successful.
secret, err := f.createSecretFromDockerConfig(ctx, ns.Name)
if err != nil {
return ns, fmt.Errorf("failed to create secret from docker config file: %v", err)
}
serviceAccountClient := f.ClientSet.CoreV1().ServiceAccounts(ns.Name)
serviceAccountConfig := v1svc.ServiceAccount(defaultServiceAccountName, ns.Name)
serviceAccountConfig.ImagePullSecrets = append(serviceAccountConfig.ImagePullSecrets, v1svc.LocalObjectReferenceApplyConfiguration{Name: &secret.Name})
if _, err := serviceAccountClient.Apply(ctx, serviceAccountConfig, metav1.ApplyOptions{FieldManager: "e2e-framework"}); err != nil {
// Do not dereference the Apply result in the error path, it can be nil when Apply fails.
return ns, fmt.Errorf("failed to patch imagePullSecret [%s] to service account [%s]: %v", secret.Name, defaultServiceAccountName, err)
}
}
return ns, err
}
func firstNonEmptyPSaLevelOrRestricted(levelConfig ...admissionapi.Level) string {
for _, l := range levelConfig {
if len(l) > 0 {
return string(l)
}
}
return string(admissionapi.LevelRestricted)
}
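// A minimal sketch (not from the upstream file) of how the pod security labels
// are resolved for the test namespace: the combined level applies to all three
// labels, and a more specific attribute overrides only its own label.
//
//	f.NamespacePodSecurityLevel = admissionapi.LevelBaseline
//	f.NamespacePodSecurityEnforceLevel = admissionapi.LevelPrivileged
//	// Resulting namespace labels:
//	//   pod-security.kubernetes.io/enforce: privileged
//	//   pod-security.kubernetes.io/warn:    baseline
//	//   pod-security.kubernetes.io/audit:   baseline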
// createSecretFromDockerConfig creates a secret using the private image registry credentials.
// The credentials are provided by --e2e-docker-config-file flag.
func (f *Framework) createSecretFromDockerConfig(ctx context.Context, namespace string) (*v1.Secret, error) {
contents, err := os.ReadFile(TestContext.E2EDockerConfigFile)
if err != nil {
return nil, fmt.Errorf("error reading docker config file: %v", err)
}
secretObject := &v1.Secret{
Data: map[string][]byte{v1.DockerConfigJsonKey: contents},
Type: v1.SecretTypeDockerConfigJson,
}
secretObject.GenerateName = "registry-cred"
Logf("create image pull secret %s", secretObject.Name)
secret, err := f.ClientSet.CoreV1().Secrets(namespace).Create(ctx, secretObject, metav1.CreateOptions{})
return secret, err
}
// RecordFlakeIfError records flakeness info if error happens.
// NOTE: This function is not used anywhere yet, but we are working on https://github.com/kubernetes/kubernetes/issues/66239 which requires it. Please don't remove it.
func (f *Framework) RecordFlakeIfError(err error, optionalDescription ...interface{}) {
f.flakeReport.RecordFlakeIfError(err, optionalDescription...)
}
// AddNamespacesToDelete adds one or more namespaces to be deleted when the test
// completes.
func (f *Framework) AddNamespacesToDelete(namespaces ...*v1.Namespace) {
for _, ns := range namespaces {
if ns == nil {
continue
}
f.namespacesToDelete = append(f.namespacesToDelete, ns)
}
}
// ClientConfig an externally accessible method for reading the kube client config.
func (f *Framework) ClientConfig() *rest.Config {
ret := rest.CopyConfig(f.clientConfig)
// json is least common denominator
ret.ContentType = runtime.ContentTypeJSON
ret.AcceptContentTypes = runtime.ContentTypeJSON
return ret
}
// KubeUser is a struct for managing kubernetes user info.
type KubeUser struct {
Name string `yaml:"name"`
User struct {
Username string `yaml:"username"`
Password string `yaml:"password" datapolicy:"password"`
Token string `yaml:"token" datapolicy:"token"`
} `yaml:"user"`
}
// KubeCluster is a struct for managing kubernetes cluster info.
type KubeCluster struct {
Name string `yaml:"name"`
Cluster struct {
CertificateAuthorityData string `yaml:"certificate-authority-data"`
Server string `yaml:"server"`
} `yaml:"cluster"`
}
// KubeConfig is a struct for managing kubernetes config.
type KubeConfig struct {
Contexts []struct {
Name string `yaml:"name"`
Context struct {
Cluster string `yaml:"cluster"`
User string
} `yaml:"context"`
} `yaml:"contexts"`
Clusters []KubeCluster `yaml:"clusters"`
Users []KubeUser `yaml:"users"`
}
// FindUser returns the user info for the specified user name.
func (kc *KubeConfig) FindUser(name string) *KubeUser {
for _, user := range kc.Users {
if user.Name == name {
return &user
}
}
return nil
}
// FindCluster returns the cluster info for the specified cluster name.
func (kc *KubeConfig) FindCluster(name string) *KubeCluster {
for _, cluster := range kc.Clusters {
if cluster.Name == name {
return &cluster
}
}
return nil
}
// PodStateVerification represents a verification of pod state.
// Any time you have a set of pods that you want to operate against or query,
// this struct can be used to declaratively identify those pods.
type PodStateVerification struct {
// Optional: only pods that have k=v labels will pass this filter.
Selectors map[string]string
// Required: The phases which are valid for your pod.
ValidPhases []v1.PodPhase
// Optional: only pods passing this function will pass the filter
// Verify a pod.
// As an optimization, in addition to returning the filter result (boolean),
// this function may also return an error.
// A non-nil error indicates that polling of the pods should stop.
Verify func(v1.Pod) (bool, error)
// Optional: only pods with this name will pass the filter.
PodName string
}
// ClusterVerification is a struct for a verification of cluster state.
type ClusterVerification struct {
client clientset.Interface
namespace *v1.Namespace // pointer rather than string, since ns isn't created until before each.
podState PodStateVerification
}
// NewClusterVerification creates a new cluster verification.
func (f *Framework) NewClusterVerification(namespace *v1.Namespace, filter PodStateVerification) *ClusterVerification {
return &ClusterVerification{
f.ClientSet,
namespace,
filter,
}
}
func passesPodNameFilter(pod v1.Pod, name string) bool {
return name == "" || strings.Contains(pod.Name, name)
}
func passesVerifyFilter(pod v1.Pod, verify func(p v1.Pod) (bool, error)) (bool, error) {
if verify == nil {
return true, nil
}
verified, err := verify(pod)
// If an error is returned, by definition, pod verification fails
if err != nil {
return false, err
}
return verified, nil
}
func passesPhasesFilter(pod v1.Pod, validPhases []v1.PodPhase) bool {
passesPhaseFilter := false
for _, phase := range validPhases {
if pod.Status.Phase == phase {
passesPhaseFilter = true
}
}
return passesPhaseFilter
}
// filterLabels returns the pods that match the given label selectors
// (or all pods in the namespace if no selectors are given).
func filterLabels(ctx context.Context, selectors map[string]string, cli clientset.Interface, ns string) (*v1.PodList, error) {
var err error
var selector labels.Selector
var pl *v1.PodList
// List pods based on the selectors. This might be a tiny optimization rather than filtering
// everything manually.
if len(selectors) > 0 {
selector = labels.SelectorFromSet(labels.Set(selectors))
options := metav1.ListOptions{LabelSelector: selector.String()}
pl, err = cli.CoreV1().Pods(ns).List(ctx, options)
} else {
pl, err = cli.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
}
return pl, err
}
// filter filters pods which pass a filter. It can be used to compose
// the more useful abstractions like ForEach, WaitFor, and so on, which
// can be used directly by tests.
func (p *PodStateVerification) filter(ctx context.Context, c clientset.Interface, namespace *v1.Namespace) ([]v1.Pod, error) {
if len(p.ValidPhases) == 0 || namespace == nil {
panic(fmt.Errorf("need to specify valid pod phases (%v) and a namespace (%v)", p.ValidPhases, namespace))
}
ns := namespace.Name
pl, err := filterLabels(ctx, p.Selectors, c, ns) // Build an v1.PodList to operate against.
Logf("Selector matched %v pods for %v", len(pl.Items), p.Selectors)
if len(pl.Items) == 0 || err != nil {
return pl.Items, err
}
unfilteredPods := pl.Items
filteredPods := []v1.Pod{}
ReturnPodsSoFar:
// Next: Pod must match at least one of the states that the user specified
for _, pod := range unfilteredPods {
if !(passesPhasesFilter(pod, p.ValidPhases) && passesPodNameFilter(pod, p.PodName)) {
continue
}
passesVerify, err := passesVerifyFilter(pod, p.Verify)
if err != nil {
Logf("Error detected on %v : %v !", pod.Name, err)
break ReturnPodsSoFar
}
if passesVerify {
filteredPods = append(filteredPods, pod)
}
}
return filteredPods, err
}
// WaitFor waits for some minimum number of pods to be verified, according to the PodStateVerification
// definition.
func (cl *ClusterVerification) WaitFor(ctx context.Context, atLeast int, timeout time.Duration) ([]v1.Pod, error) {
pods := []v1.Pod{}
var returnedErr error
err := wait.PollUntilContextTimeout(ctx, 1*time.Second, timeout, false, func(ctx context.Context) (bool, error) {
pods, returnedErr = cl.podState.filter(ctx, cl.client, cl.namespace)
// Failure
if returnedErr != nil {
Logf("Cutting polling short: We got an error from the pod filtering layer.")
// stop polling if the pod filtering returns an error. that should never happen.
// it indicates, for example, that the client is broken or something non-pod related.
return false, returnedErr
}
Logf("Found %v / %v", len(pods), atLeast)
// Success
if len(pods) >= atLeast {
return true, nil
}
// Keep trying...
return false, nil
})
Logf("WaitFor completed with timeout %v. Pods found = %v out of %v", timeout, len(pods), atLeast)
return pods, err
}
// WaitForOrFail is a shorthand for WaitFor that fails the test if anything goes wrong.
func (cl *ClusterVerification) WaitForOrFail(ctx context.Context, atLeast int, timeout time.Duration) {
pods, err := cl.WaitFor(ctx, atLeast, timeout)
if err != nil || len(pods) < atLeast {
Failf("Verified %v of %v pods , error : %v", len(pods), atLeast, err)
}
}
// ForEach runs a function against every verifiable pod. Be warned that this doesn't wait for "n" pods to verify,
// so it may return very quickly if you have strict pod state requirements.
//
// For example, if you require at least 5 pods to be running before your test will pass,
// it's smart to first call "clusterVerification.WaitFor(5)" before you call clusterVerification.ForEach.
func (cl *ClusterVerification) ForEach(ctx context.Context, podFunc func(v1.Pod)) error {
pods, err := cl.podState.filter(ctx, cl.client, cl.namespace)
if err == nil {
if len(pods) == 0 {
Failf("No pods matched the filter.")
}
Logf("ForEach: Found %v pods from the filter. Now looping through them.", len(pods))
for _, p := range pods {
podFunc(p)
}
} else {
Logf("ForEach: Something went wrong when filtering pods to execute against: %v", err)
}
return err
}
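// A minimal usage sketch (not from the upstream file); the selector, pod count
// and timeout are hypothetical:
//
//	cv := f.NewClusterVerification(f.Namespace, framework.PodStateVerification{
//		Selectors:   map[string]string{"app": "demo"},
//		ValidPhases: []v1.PodPhase{v1.PodRunning},
//	})
//	cv.WaitForOrFail(ctx, 3, 2*time.Minute)
//	err := cv.ForEach(ctx, func(p v1.Pod) {
//		framework.Logf("verified pod %s", p.Name)
//	})
//	framework.ExpectNoError(err)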

e2e/vendor/k8s.io/kubernetes/test/e2e/framework/get.go generated vendored Normal file
View File

@ -0,0 +1,148 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"context"
"errors"
"fmt"
"time"
"github.com/onsi/gomega"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
// GetFunc is a function which retrieves a certain object.
type GetFunc[T any] func(ctx context.Context) (T, error)
// APIGetFunc is a get function as used in client-go.
type APIGetFunc[T any] func(ctx context.Context, name string, getOptions metav1.GetOptions) (T, error)
// APIListFunc is a list function as used in client-go.
type APIListFunc[T any] func(ctx context.Context, listOptions metav1.ListOptions) (T, error)
// GetObject takes a get function like clientset.CoreV1().Pods(ns).Get
// and the parameters for it and returns a function that executes that get
// operation in a [gomega.Eventually] or [gomega.Consistently].
//
// Delays and retries are handled by [HandleRetry]. A "not found" error is
// a fatal error that causes polling to stop immediately. If that is not
// desired, then wrap the result with [RetryNotFound].
func GetObject[T any](get APIGetFunc[T], name string, getOptions metav1.GetOptions) GetFunc[T] {
return HandleRetry(func(ctx context.Context) (T, error) {
return get(ctx, name, getOptions)
})
}
// ListObjects takes a list function like clientset.CoreV1().Pods(ns).List
// and the parameters for it and returns a function that executes that list
// operation in a [gomega.Eventually] or [gomega.Consistently].
//
// Delays and retries are handled by [HandleRetry].
func ListObjects[T any](list APIListFunc[T], listOptions metav1.ListOptions) GetFunc[T] {
return HandleRetry(func(ctx context.Context) (T, error) {
return list(ctx, listOptions)
})
}
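// A minimal usage sketch (not from the upstream file); the client, namespace
// and deployment name are hypothetical:
//
//	gomega.Eventually(ctx,
//		framework.GetObject(c.AppsV1().Deployments(ns).Get, "my-deployment", metav1.GetOptions{})).
//		WithTimeout(2*time.Minute).
//		Should(gomega.HaveField("Status.AvailableReplicas", gomega.BeNumerically(">=", int32(1))))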
// HandleRetry wraps an arbitrary get function. When the wrapped function
// returns an error, ShouldRetry decides whether the call should be
// retried and, if requested, HandleRetry sleeps before doing so.
//
// This is meant to be used inside [gomega.Eventually] or [gomega.Consistently].
func HandleRetry[T any](get GetFunc[T]) GetFunc[T] {
return func(ctx context.Context) (T, error) {
t, err := get(ctx)
if err != nil {
if retry, delay := ShouldRetry(err); retry {
if delay > 0 {
// We could return
// gomega.TryAgainAfter(delay) here,
// but then we need to funnel that
// error through any other
// wrappers. Waiting directly is simpler.
ctx, cancel := context.WithTimeout(ctx, delay)
defer cancel()
<-ctx.Done()
}
return t, err
}
// Give up polling immediately.
var null T
return t, gomega.StopTrying(fmt.Sprintf("Unexpected final error while getting %T", null)).Wrap(err)
}
return t, nil
}
}
// ShouldRetry decides whether to retry an API request. Optionally returns a
// delay to retry after.
func ShouldRetry(err error) (retry bool, retryAfter time.Duration) {
// if the error sends the Retry-After header, we respect it as an explicit confirmation we should retry.
if delay, shouldRetry := apierrors.SuggestsClientDelay(err); shouldRetry {
return shouldRetry, time.Duration(delay) * time.Second
}
// these errors indicate a transient error that should be retried.
if apierrors.IsTimeout(err) ||
apierrors.IsTooManyRequests(err) ||
apierrors.IsServiceUnavailable(err) ||
errors.As(err, &transientError{}) {
return true, 0
}
return false, 0
}
// RetryNotFound wraps an arbitrary get function. When the wrapped function
// encounters a "not found" error, that error is treated as a transient problem
// and polling continues.
//
// This is meant to be used inside [gomega.Eventually] or [gomega.Consistently].
func RetryNotFound[T any](get GetFunc[T]) GetFunc[T] {
return func(ctx context.Context) (T, error) {
t, err := get(ctx)
if apierrors.IsNotFound(err) {
// If we are wrapping HandleRetry, then the error will
// be gomega.StopTrying. We need to get rid of that,
// otherwise gomega.Eventually will stop.
var stopTryingErr gomega.PollingSignalError
if errors.As(err, &stopTryingErr) {
if wrappedErr := errors.Unwrap(stopTryingErr); wrappedErr != nil {
err = wrappedErr
}
}
// Mark the error as transient in case that we get
// wrapped by HandleRetry.
err = transientError{error: err}
}
return t, err
}
}
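// A minimal usage sketch (not from the upstream file): keep polling while the
// object does not exist yet. The client, namespace and pod name are
// hypothetical.
//
//	getPod := framework.RetryNotFound(framework.GetObject(c.CoreV1().Pods(ns).Get, "busybox", metav1.GetOptions{}))
//	gomega.Eventually(ctx, getPod).Should(gomega.HaveField("Status.Phase", v1.PodRunning))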
// transientError wraps some other error and indicates that the
// wrapper error is something that may go away.
type transientError struct {
error
}
func (err transientError) Unwrap() error {
return err.error
}

View File

@ -0,0 +1,117 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package framework contains provider-independent helper code for
// building and running E2E tests with Ginkgo. The actual Ginkgo test
// suites gets assembled by combining this framework, the optional
// provider support code and specific tests via a separate .go file
// like Kubernetes' test/e2e.go.
package framework
import (
"flag"
"fmt"
"os"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
ginkgotypes "github.com/onsi/ginkgo/v2/types"
"k8s.io/klog/v2"
"k8s.io/klog/v2/textlogger"
_ "k8s.io/component-base/logs/testinit" // Ensure command line flags are registered.
)
var (
logConfig = textlogger.NewConfig(
textlogger.Output(ginkgo.GinkgoWriter),
textlogger.Backtrace(unwind),
)
ginkgoLogger = textlogger.NewLogger(logConfig)
TimeNow = time.Now // Can be stubbed out for testing.
Pid = os.Getpid() // Can be stubbed out for testing.
)
func init() {
// ktesting and testinit already registered the -v and -vmodule
// command line flags. To configure the textlogger and klog
// consistently, we need to intercept the Set call. This
// can be done by swapping out the flag.Value for the -v and
// -vmodule flags with a wrapper which calls both.
var fs flag.FlagSet
logConfig.AddFlags(&fs)
fs.VisitAll(func(loggerFlag *flag.Flag) {
klogFlag := flag.CommandLine.Lookup(loggerFlag.Name)
if klogFlag != nil {
klogFlag.Value = &valueChain{Value: loggerFlag.Value, parentValue: klogFlag.Value}
}
})
// Now install the textlogger as the klog default logger.
// Calls like klog.Info will then write to ginkgo.GinkgoWriter
// through the textlogger.
//
// However, stack unwinding is then still being done by klog and thus
// ignores ginkgo.GinkgoHelper. Tests should use framework.Logf or
// structured, contextual logging.
writer, _ := ginkgoLogger.GetSink().(textlogger.KlogBufferWriter)
opts := []klog.LoggerOption{
klog.ContextualLogger(true),
klog.WriteKlogBuffer(writer.WriteKlogBuffer),
}
klog.SetLoggerWithOptions(ginkgoLogger, opts...)
}
type valueChain struct {
flag.Value
parentValue flag.Value
}
func (v *valueChain) Set(value string) error {
if err := v.Value.Set(value); err != nil {
return err
}
if err := v.parentValue.Set(value); err != nil {
return err
}
return nil
}
func unwind(skip int) (string, int) {
location := ginkgotypes.NewCodeLocation(skip + 1)
return location.FileName, location.LineNumber
}
// log re-implements klog.Info: same header, but stack unwinding
// with support for ginkgo.GinkgoWriter and skipping stack levels.
func log(offset int, msg string) {
now := TimeNow()
file, line := unwind(offset + 1)
if file == "" {
file = "???"
line = 1
} else if slash := strings.LastIndex(file, "/"); slash >= 0 {
file = file[slash+1:]
}
_, month, day := now.Date()
hour, minute, second := now.Clock()
header := fmt.Sprintf("I%02d%02d %02d:%02d:%02d.%06d %d %s:%d]",
month, day, hour, minute, second, now.Nanosecond()/1000, Pid, file, line)
fmt.Fprintln(ginkgo.GinkgoWriter, header, msg)
}

View File

@ -0,0 +1,595 @@
/*
Copyright 2022 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"fmt"
"path"
"reflect"
"regexp"
"slices"
"strings"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/ginkgo/v2/types"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/util/sets"
utilfeature "k8s.io/apiserver/pkg/util/feature"
"k8s.io/component-base/featuregate"
)
// Feature is the name of a certain feature that the cluster under test must have.
// Such features are different from feature gates.
type Feature string
// Environment is the name for the environment in which a test can run, like
// "Linux" or "Windows".
type Environment string
// NodeFeature is the name of a feature that a node must support. To be
// removed, see
// https://github.com/kubernetes/enhancements/tree/master/keps/sig-testing/3041-node-conformance-and-features#nodefeature.
type NodeFeature string
type Valid[T comparable] struct {
items sets.Set[T]
frozen bool
}
// Add registers a new valid item name. The expected usage is
//
// var SomeFeature = framework.ValidFeatures.Add("Some")
//
// during the init phase of an E2E suite. Individual tests should not register
// their own, to avoid uncontrolled proliferation of new items. E2E suites can,
// but don't have to, enforce that by freezing the set of valid names.
func (v *Valid[T]) Add(item T) T {
if v.frozen {
RecordBug(NewBug(fmt.Sprintf(`registry %T is already frozen, "%v" must not be added anymore`, *v, item), 1))
}
if v.items == nil {
v.items = sets.New[T]()
}
if v.items.Has(item) {
RecordBug(NewBug(fmt.Sprintf(`registry %T already contains "%v", it must not be added again`, *v, item), 1))
}
v.items.Insert(item)
return item
}
func (v *Valid[T]) Freeze() {
v.frozen = true
}
// These variables contain the parameters that [WithFeature], [WithEnvironment]
// and [WithNodeFeature] accept. The framework itself has no pre-defined
// constants. Test suites and tests may define their own and then add them here
// before calling these With functions.
var (
ValidFeatures Valid[Feature]
ValidEnvironments Valid[Environment]
ValidNodeFeatures Valid[NodeFeature]
)
var errInterface = reflect.TypeOf((*error)(nil)).Elem()
// IgnoreNotFound can be used to wrap an arbitrary function in a call to
// [ginkgo.DeferCleanup]. When the wrapped function returns an error that
// `apierrors.IsNotFound` considers as "not found", the error is ignored
// instead of failing the test during cleanup. This is useful for cleanup code
// that just needs to ensure that some object does not exist anymore.
func IgnoreNotFound(in any) any {
inType := reflect.TypeOf(in)
inValue := reflect.ValueOf(in)
return reflect.MakeFunc(inType, func(args []reflect.Value) []reflect.Value {
out := inValue.Call(args)
if len(out) > 0 {
lastValue := out[len(out)-1]
last := lastValue.Interface()
if last != nil && lastValue.Type().Implements(errInterface) && apierrors.IsNotFound(last.(error)) {
out[len(out)-1] = reflect.Zero(errInterface)
}
}
return out
}).Interface()
}
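// A minimal usage sketch (not from the upstream file); the client, namespace
// and pod name are hypothetical:
//
//	ginkgo.DeferCleanup(framework.IgnoreNotFound(c.CoreV1().Pods(ns).Delete), "busybox", metav1.DeleteOptions{})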
// AnnotatedLocation can be used to provide more informative source code
// locations by passing the result as additional parameter to a
// BeforeEach/AfterEach/DeferCleanup/It/etc.
func AnnotatedLocation(annotation string) types.CodeLocation {
return AnnotatedLocationWithOffset(annotation, 1)
}
// AnnotatedLocationWithOffset skips additional call stack levels. With 0 as offset
// it is identical to [AnnotatedLocation].
func AnnotatedLocationWithOffset(annotation string, offset int) types.CodeLocation {
codeLocation := types.NewCodeLocation(offset + 1)
codeLocation.FileName = path.Base(codeLocation.FileName)
codeLocation = types.NewCustomCodeLocation(annotation + " | " + codeLocation.String())
return codeLocation
}
// SIGDescribe returns a wrapper function for ginkgo.Describe which injects
// the SIG name as annotation. The parameter should be lowercase with
// no spaces and no sig- or SIG- prefix.
func SIGDescribe(sig string) func(...interface{}) bool {
if !sigRE.MatchString(sig) || strings.HasPrefix(sig, "sig-") {
RecordBug(NewBug(fmt.Sprintf("SIG label must be lowercase, no spaces and no sig- prefix, got instead: %q", sig), 1))
}
return func(args ...interface{}) bool {
args = append([]interface{}{WithLabel("sig-" + sig)}, args...)
return registerInSuite(ginkgo.Describe, args)
}
}
var sigRE = regexp.MustCompile(`^[a-z]+(-[a-z]+)*$`)
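// A minimal usage sketch (not from the upstream file); the SIG name and
// container text are hypothetical:
//
//	var _ = framework.SIGDescribe("node")("Downward API", func() {
//		// Spec texts start with "[sig-node] Downward API ..." and the
//		// Ginkgo label "sig-node" is attached automatically.
//	})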
// ConformanceIt is a wrapper function for ginkgo.It. It adds the "[Conformance]" tag and makes static analysis easier.
func ConformanceIt(args ...interface{}) bool {
args = append(args, ginkgo.Offset(1), WithConformance())
return It(args...)
}
// It is a wrapper around [ginkgo.It] which supports framework With* labels as
// optional arguments in addition to those already supported by ginkgo itself,
// like [ginkgo.Label] and [ginkgo.Offset].
//
// Text and arguments may be mixed. The final text is a concatenation
// of the text arguments and special tags from the With functions.
func It(args ...interface{}) bool {
return registerInSuite(ginkgo.It, args)
}
// It is a shorthand for the corresponding package function.
func (f *Framework) It(args ...interface{}) bool {
return registerInSuite(ginkgo.It, args)
}
// Describe is a wrapper around [ginkgo.Describe] which supports framework
// With* labels as optional arguments in addition to those already supported by
// ginkgo itself, like [ginkgo.Label] and [ginkgo.Offset].
//
// Text and arguments may be mixed. The final text is a concatenation
// of the text arguments and special tags from the With functions.
func Describe(args ...interface{}) bool {
return registerInSuite(ginkgo.Describe, args)
}
// Describe is a shorthand for the corresponding package function.
func (f *Framework) Describe(args ...interface{}) bool {
return registerInSuite(ginkgo.Describe, args)
}
// Context is a wrapper around [ginkgo.Context] which supports framework With*
// labels as optional arguments in addition to those already supported by
// ginkgo itself, like [ginkgo.Label] and [ginkgo.Offset].
//
// Text and arguments may be mixed. The final text is a concatenation
// of the text arguments and special tags from the With functions.
func Context(args ...interface{}) bool {
return registerInSuite(ginkgo.Context, args)
}
// Context is a shorthand for the corresponding package function.
func (f *Framework) Context(args ...interface{}) bool {
return registerInSuite(ginkgo.Context, args)
}
// registerInSuite is the common implementation of all wrapper functions. It
// expects to be called through one intermediate wrapper.
func registerInSuite(ginkgoCall func(string, ...interface{}) bool, args []interface{}) bool {
var ginkgoArgs []interface{}
var offset ginkgo.Offset
var texts []string
addLabel := func(label string) {
texts = append(texts, fmt.Sprintf("[%s]", label))
ginkgoArgs = append(ginkgoArgs, ginkgo.Label(label))
}
haveEmptyStrings := false
for _, arg := range args {
switch arg := arg.(type) {
case label:
fullLabel := strings.Join(arg.parts, ":")
addLabel(fullLabel)
if arg.extraFeature != "" {
texts = append(texts, fmt.Sprintf("[%s]", arg.extraFeature))
ginkgoArgs = append(ginkgoArgs, ginkgo.Label("Feature:"+arg.extraFeature))
}
if fullLabel == "Serial" {
ginkgoArgs = append(ginkgoArgs, ginkgo.Serial)
}
case ginkgo.Offset:
offset = arg
case string:
if arg == "" {
haveEmptyStrings = true
}
texts = append(texts, arg)
default:
ginkgoArgs = append(ginkgoArgs, arg)
}
}
offset += 2 // This function and its direct caller.
// Now that we have the final offset, we can record bugs.
if haveEmptyStrings {
RecordBug(NewBug("empty strings as separators are unnecessary and need to be removed", int(offset)))
}
// Enforce that text snippets do not start or end with spaces because
// those lead to double spaces when concatenating below.
for _, text := range texts {
if strings.HasPrefix(text, " ") || strings.HasSuffix(text, " ") {
RecordBug(NewBug(fmt.Sprintf("trailing or leading spaces are unnecessary and need to be removed: %q", text), int(offset)))
}
}
ginkgoArgs = append(ginkgoArgs, offset)
text := strings.Join(texts, " ")
return ginkgoCall(text, ginkgoArgs...)
}
var (
tagRe = regexp.MustCompile(`\[.*?\]`)
deprecatedTags = sets.New("Conformance", "Flaky", "NodeConformance", "Disruptive", "Serial", "Slow")
deprecatedTagPrefixes = sets.New("Environment", "Feature", "NodeFeature", "FeatureGate")
deprecatedStability = sets.New("Alpha", "Beta")
)
// validateSpecs checks that the test specs were registered as intended.
func validateSpecs(specs types.SpecReports) {
checked := sets.New[call]()
for _, spec := range specs {
for i, text := range spec.ContainerHierarchyTexts {
c := call{
text: text,
location: spec.ContainerHierarchyLocations[i],
}
if checked.Has(c) {
// No need to check the same container more than once.
continue
}
checked.Insert(c)
validateText(c.location, text, spec.ContainerHierarchyLabels[i])
}
c := call{
text: spec.LeafNodeText,
location: spec.LeafNodeLocation,
}
if !checked.Has(c) {
validateText(spec.LeafNodeLocation, spec.LeafNodeText, spec.LeafNodeLabels)
checked.Insert(c)
}
}
}
// call acts as (mostly) unique identifier for a container node call like
// Describe or Context. It's not perfect because theoretically a line might
// have multiple calls with the same text, but that isn't a problem in
// practice.
type call struct {
text string
location types.CodeLocation
}
// validateText checks for some known tags that should not be added through the
// plain text strings anymore. Eventually, all such tags should get replaced
// with the new APIs.
func validateText(location types.CodeLocation, text string, labels []string) {
for _, tag := range tagRe.FindAllString(text, -1) {
if tag == "[]" {
recordTextBug(location, "[] in plain text is invalid")
continue
}
// Strip square brackets.
tag = tag[1 : len(tag)-1]
if slices.Contains(labels, tag) {
// Okay, was also set as label.
continue
}
if deprecatedTags.Has(tag) {
recordTextBug(location, fmt.Sprintf("[%s] in plain text is deprecated and must be added through With%s instead", tag, tag))
}
if deprecatedStability.Has(tag) {
if slices.Contains(labels, "Feature:"+tag) {
// Okay, was also set as label.
continue
}
recordTextBug(location, fmt.Sprintf("[%s] in plain text is deprecated and must be added by defining the feature gate through WithFeatureGate instead", tag))
}
if index := strings.Index(tag, ":"); index > 0 {
prefix := tag[:index]
if deprecatedTagPrefixes.Has(prefix) {
recordTextBug(location, fmt.Sprintf("[%s] in plain text is deprecated and must be added through With%s(%s) instead", tag, prefix, tag[index+1:]))
}
}
}
}
func recordTextBug(location types.CodeLocation, message string) {
RecordBug(Bug{FileName: location.FileName, LineNumber: location.LineNumber, Message: message})
}
// WithFeature specifies that a certain test or group of tests only works
// when the named feature is available. The return value must be passed as additional
// argument to [framework.It], [framework.Describe], [framework.Context].
//
// The feature must be listed in ValidFeatures.
func WithFeature(name Feature) interface{} {
return withFeature(name)
}
// WithFeature is a shorthand for the corresponding package function.
func (f *Framework) WithFeature(name Feature) interface{} {
return withFeature(name)
}
func withFeature(name Feature) interface{} {
if !ValidFeatures.items.Has(name) {
RecordBug(NewBug(fmt.Sprintf("WithFeature: unknown feature %q", name), 2))
}
return newLabel("Feature", string(name))
}
// WithFeatureGate specifies that a certain test or group of tests depends on a
// feature gate being enabled. The return value must be passed as additional
// argument to [framework.It], [framework.Describe], [framework.Context].
//
// The feature gate must be listed in
// [k8s.io/apiserver/pkg/util/feature.DefaultMutableFeatureGate]. Once a
// feature gate gets removed from there, the WithFeatureGate calls using it
// also need to be removed.
//
// [Alpha] or [Beta] gets added to the test name automatically depending
// on the current stability level of the feature. Feature:Alpha or
// Feature:Beta gets added to the Ginkgo labels because this is a special
// requirement for how the cluster needs to be configured.
//
// If the test can run in any cluster that has alpha resp. beta features and
// API groups enabled, then annotating it with just WithFeatureGate is
// sufficient. Otherwise, WithFeature has to be used to define the additional
// requirements.
func WithFeatureGate(featureGate featuregate.Feature) interface{} {
return withFeatureGate(featureGate)
}
// WithFeatureGate is a shorthand for the corresponding package function.
func (f *Framework) WithFeatureGate(featureGate featuregate.Feature) interface{} {
return withFeatureGate(featureGate)
}
func withFeatureGate(featureGate featuregate.Feature) interface{} {
spec, ok := utilfeature.DefaultMutableFeatureGate.GetAll()[featureGate]
if !ok {
RecordBug(NewBug(fmt.Sprintf("WithFeatureGate: the feature gate %q is unknown", featureGate), 2))
}
// We use mixed case (e.g. Beta instead of BETA). GA feature gates have no level string.
var level string
if spec.PreRelease != "" {
level = string(spec.PreRelease)
level = strings.ToUpper(level[0:1]) + strings.ToLower(level[1:])
}
l := newLabel("FeatureGate", string(featureGate))
l.extraFeature = level
return l
}
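// Example usage (sketch; the gate shown is only an example and must be
// registered in DefaultMutableFeatureGate):
//
//	import (
//		"context"
//
//		"k8s.io/kubernetes/pkg/features"
//		"k8s.io/kubernetes/test/e2e/framework"
//	)
//
//	var _ = framework.It("recovers from a failed volume expansion",
//		framework.WithFeatureGate(features.RecoverVolumeExpansionFailure),
//		func(ctx context.Context) {
//			// Depending on the gate's stability level, [Alpha] or [Beta] is
//			// appended to the test name and Feature:Alpha or Feature:Beta is
//			// added to the Ginkgo labels.
//		})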
// WithEnvironment specifies that a certain test or group of tests only works
// in a certain environment. The return value must be passed as additional
// argument to [framework.It], [framework.Describe], [framework.Context].
//
// The environment must be listed in ValidEnvironments.
func WithEnvironment(name Environment) interface{} {
return withEnvironment(name)
}
// WithEnvironment is a shorthand for the corresponding package function.
func (f *Framework) WithEnvironment(name Environment) interface{} {
return withEnvironment(name)
}
func withEnvironment(name Environment) interface{} {
if !ValidEnvironments.items.Has(name) {
RecordBug(NewBug(fmt.Sprintf("WithEnvironment: unknown environment %q", name), 2))
}
return newLabel("Environment", string(name))
}
// WithNodeFeature specifies that a certain test or group of tests only works
// if the node supports a certain feature. The return value must be passed as
// additional argument to [framework.It], [framework.Describe],
// [framework.Context].
//
// The node feature must be listed in ValidNodeFeatures.
func WithNodeFeature(name NodeFeature) interface{} {
return withNodeFeature(name)
}
// WithNodeFeature is a shorthand for the corresponding package function.
func (f *Framework) WithNodeFeature(name NodeFeature) interface{} {
return withNodeFeature(name)
}
func withNodeFeature(name NodeFeature) interface{} {
if !ValidNodeFeatures.items.Has(name) {
RecordBug(NewBug(fmt.Sprintf("WithNodeFeature: unknown environment %q", name), 2))
}
return newLabel("NodeFeature", string(name))
}
// WithConformance specifies that a certain test or group of tests must pass in
// all conformant Kubernetes clusters. The return value must be passed as
// additional argument to [framework.It], [framework.Describe],
// [framework.Context].
func WithConformance() interface{} {
return withConformance()
}
// WithConformance is a shorthand for the corresponding package function.
func (f *Framework) WithConformance() interface{} {
return withConformance()
}
func withConformance() interface{} {
return newLabel("Conformance")
}
// WithNodeConformance specifies that a certain test or group of tests covers
// node functionality that does not depend on runtime or Kubernetes distro
// specific behavior. The return value must be passed as additional argument to
// [framework.It], [framework.Describe], [framework.Context].
func WithNodeConformance() interface{} {
return withNodeConformance()
}
// WithNodeConformance is a shorthand for the corresponding package function.
func (f *Framework) WithNodeConformance() interface{} {
return withNodeConformance()
}
func withNodeConformance() interface{} {
return newLabel("NodeConformance")
}
// WithDisruptive specifies that a certain test or group of tests temporarily
// affects the functionality of the Kubernetes cluster. The return value must
// be passed as additional argument to [framework.It], [framework.Describe],
// [framework.Context].
func WithDisruptive() interface{} {
return withDisruptive()
}
// WithDisruptive is a shorthand for the corresponding package function.
func (f *Framework) WithDisruptive() interface{} {
return withDisruptive()
}
func withDisruptive() interface{} {
return newLabel("Disruptive")
}
// WithSerial specifies that a certain test or group of tests must not run in
// parallel with other tests. The return value must be passed as additional
// argument to [framework.It], [framework.Describe], [framework.Context].
//
// Starting with ginkgo v2, serial and parallel tests can be executed in the
// same invocation. Ginkgo itself will ensure that the serial tests run
// sequentially.
func WithSerial() interface{} {
return withSerial()
}
// WithSerial is a shorthand for the corresponding package function.
func (f *Framework) WithSerial() interface{} {
return withSerial()
}
func withSerial() interface{} {
return newLabel("Serial")
}
// WithSlow specifies that a certain test or group of tests is slow, i.e.
// takes a long time to complete. The return value must be passed as additional
// argument to [framework.It], [framework.Describe], [framework.Context].
func WithSlow() interface{} {
return withSlow()
}
// WithSlow is a shorthand for the corresponding package function.
func (f *Framework) WithSlow() interface{} {
return withSlow()
}
func withSlow() interface{} {
return newLabel("Slow")
}
// WithLabel is a wrapper around [ginkgo.Label]. Besides adding an arbitrary
// label to a test, it also injects the label in square brackets into the test
// name.
func WithLabel(label string) interface{} {
return withLabel(label)
}
// WithLabel is a shorthand for the corresponding package function.
func (f *Framework) WithLabel(label string) interface{} {
return withLabel(label)
}
func withLabel(label string) interface{} {
return newLabel(label)
}
// WithFlaky specifies that a certain test or group of tests fails randomly.
// These tests are usually filtered out and run separately from other tests.
func WithFlaky() interface{} {
return withFlaky()
}
// WithFlaky is a shorthand for the corresponding package function.
func (f *Framework) WithFlaky() interface{} {
return withFlaky()
}
func withFlaky() interface{} {
return newLabel("Flaky")
}
type label struct {
// parts get concatenated with ":" to build the full label.
parts []string
// extraFeature is an optional feature name. It gets added as [<extraFeature>]
// to the test name and as Feature:<extraFeature> to the labels.
extraFeature string
// explanation gets set for each label to help developers
// who pass a label to a ginkgo function. They need to use
// the corresponding framework function instead.
explanation string
}
func newLabel(parts ...string) label {
return label{
parts: parts,
explanation: "If you see this as part of an 'Unknown Decorator' error from Ginkgo, then you need to replace the ginkgo.It/Context/Describe call with the corresponding framework.It/Context/Describe or (if available) f.It/Context/Describe.",
}
}
// TagsEqual can be used to check whether two tags are the same.
// It's safe to compare e.g. the result of WithSlow() against the result
// of WithSerial(), the result will be false. False is also returned
// when a parameter is some completely different value.
func TagsEqual(a, b interface{}) bool {
al, ok := a.(label)
if !ok {
return false
}
bl, ok := b.(label)
if !ok {
return false
}
if al.extraFeature != bl.extraFeature {
return false
}
return slices.Equal(al.parts, bl.parts)
}
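// Example usage (sketch; suite and namespace names are placeholders): the
// decorators above are passed to the framework's Describe/It wrappers together
// with the plain text snippets:
//
//	import (
//		"context"
//
//		"k8s.io/kubernetes/test/e2e/framework"
//	)
//
//	var _ = framework.Describe("example suite", framework.WithSerial(), func() {
//		f := framework.NewDefaultFramework("example")
//
//		f.It("does something slowly", f.WithSlow(), func(ctx context.Context) {
//			// The spec text ends up containing "[Serial]" and "[Slow]", the
//			// Ginkgo labels include "Serial" and "Slow", and ginkgo.Serial is
//			// set implicitly.
//			framework.Logf("running in namespace %s", f.Namespace.Name)
//		})
//	})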

View File

@ -0,0 +1,46 @@
/*
Copyright 2022 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package junit
import (
"github.com/onsi/ginkgo/v2"
"github.com/onsi/ginkgo/v2/reporters"
"github.com/onsi/ginkgo/v2/types"
)
// WriteJUnitReport generates a JUnit file that is shorter than the one
// normally written by `ginkgo --junit-report`. This is needed because the full
// report can become too large for tools like Spyglass
// (https://github.com/kubernetes/kubernetes/issues/111510).
func WriteJUnitReport(report ginkgo.Report, filename string) error {
config := reporters.JunitReportConfig{
// Remove details for specs where we don't care.
OmitTimelinesForSpecState: types.SpecStatePassed | types.SpecStateSkipped,
// Don't write <failure message="summary">. The same text is
// also in the full text for the failure. If we were to write
// both, then tools like kettle and spyglass would concatenate
// the two strings and thus show duplicated information.
OmitFailureMessageAttr: true,
// All labels are also part of the spec texts in inline [] tags,
// so we don't need to write them separately.
OmitSpecLabels: true,
}
return reporters.GenerateJUnitReportWithConfig(report, filename, config)
}
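// Example usage (sketch; the import path and report file name are assumptions
// based on where this package and the e2e suite typically live):
//
//	import (
//		"github.com/onsi/ginkgo/v2"
//
//		"k8s.io/kubernetes/test/e2e/framework/internal/junit"
//	)
//
//	var _ = ginkgo.ReportAfterSuite("JUnit report", func(report ginkgo.Report) {
//		if err := junit.WriteJUnitReport(report, "junit_e2e.xml"); err != nil {
//			ginkgo.GinkgoLogr.Error(err, "failed to write JUnit report")
//		}
//	})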

View File

@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except of k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,195 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kubectl
import (
"bytes"
"fmt"
"io"
"net"
"net/url"
"os"
"os/exec"
"strings"
"syscall"
"time"
"k8s.io/client-go/tools/clientcmd"
uexec "k8s.io/utils/exec"
"k8s.io/kubernetes/test/e2e/framework"
)
// KubectlBuilder is used to build, customize and execute a kubectl Command.
// Add more functions to customize the builder as needed.
type KubectlBuilder struct {
cmd *exec.Cmd
timeout <-chan time.Time
}
// NewKubectlCommand returns a KubectlBuilder for running kubectl.
func NewKubectlCommand(namespace string, args ...string) *KubectlBuilder {
b := new(KubectlBuilder)
tk := NewTestKubeconfig(framework.TestContext.CertDir, framework.TestContext.Host, framework.TestContext.KubeConfig, framework.TestContext.KubeContext, framework.TestContext.KubectlPath, namespace)
b.cmd = tk.KubectlCmd(args...)
return b
}
// AppendEnv appends the given environment and returns itself.
func (b *KubectlBuilder) AppendEnv(env []string) *KubectlBuilder {
if b.cmd.Env == nil {
b.cmd.Env = os.Environ()
}
b.cmd.Env = append(b.cmd.Env, env...)
return b
}
// WithTimeout sets the given timeout and returns itself.
func (b *KubectlBuilder) WithTimeout(t <-chan time.Time) *KubectlBuilder {
b.timeout = t
return b
}
// WithStdinData sets the given data to stdin and returns itself.
func (b KubectlBuilder) WithStdinData(data string) *KubectlBuilder {
b.cmd.Stdin = strings.NewReader(data)
return &b
}
// WithStdinReader sets the given reader and returns itself.
func (b KubectlBuilder) WithStdinReader(reader io.Reader) *KubectlBuilder {
b.cmd.Stdin = reader
return &b
}
// ExecOrDie runs the kubectl executable or dies if error occurs.
func (b KubectlBuilder) ExecOrDie(namespace string) string {
str, err := b.Exec()
// In case of i/o timeout error, try talking to the apiserver again after 2s before dying.
// Note that we're still dying after retrying so that we can get visibility to triage it further.
if isTimeout(err) {
framework.Logf("Hit i/o timeout error, talking to the server 2s later to see if it's temporary.")
time.Sleep(2 * time.Second)
retryStr, retryErr := RunKubectl(namespace, "version")
framework.Logf("stdout: %q", retryStr)
framework.Logf("err: %v", retryErr)
}
framework.ExpectNoError(err)
return str
}
func isTimeout(err error) bool {
switch err := err.(type) {
case *url.Error:
if err, ok := err.Err.(net.Error); ok && err.Timeout() {
return true
}
case net.Error:
if err.Timeout() {
return true
}
}
return false
}
// Exec runs the kubectl executable.
func (b KubectlBuilder) Exec() (string, error) {
stdout, _, err := b.ExecWithFullOutput()
return stdout, err
}
// ExecWithFullOutput runs the kubectl executable, and returns the stdout and stderr.
func (b KubectlBuilder) ExecWithFullOutput() (string, string, error) {
var stdout, stderr bytes.Buffer
cmd := b.cmd
cmd.Stdout, cmd.Stderr = &stdout, &stderr
framework.Logf("Running '%s %s'", cmd.Path, strings.Join(cmd.Args[1:], " ")) // skip arg[0] as it is printed separately
if err := cmd.Start(); err != nil {
return "", "", fmt.Errorf("error starting %v:\nCommand stdout:\n%v\nstderr:\n%v\nerror:\n%v", cmd, cmd.Stdout, cmd.Stderr, err)
}
errCh := make(chan error, 1)
go func() {
errCh <- cmd.Wait()
}()
select {
case err := <-errCh:
if err != nil {
var rc = 127
if ee, ok := err.(*exec.ExitError); ok {
rc = int(ee.Sys().(syscall.WaitStatus).ExitStatus())
framework.Logf("rc: %d", rc)
}
return stdout.String(), stderr.String(), uexec.CodeExitError{
Err: fmt.Errorf("error running %v:\nCommand stdout:\n%v\nstderr:\n%v\nerror:\n%v", cmd, cmd.Stdout, cmd.Stderr, err),
Code: rc,
}
}
case <-b.timeout:
b.cmd.Process.Kill()
return "", "", fmt.Errorf("timed out waiting for command %v:\nCommand stdout:\n%v\nstderr:\n%v", cmd, cmd.Stdout, cmd.Stderr)
}
framework.Logf("stderr: %q", stderr.String())
framework.Logf("stdout: %q", stdout.String())
return stdout.String(), stderr.String(), nil
}
// RunKubectlOrDie is a convenience wrapper over kubectlBuilder
func RunKubectlOrDie(namespace string, args ...string) string {
return NewKubectlCommand(namespace, args...).ExecOrDie(namespace)
}
// RunKubectl is a convenience wrapper over kubectlBuilder
func RunKubectl(namespace string, args ...string) (string, error) {
return NewKubectlCommand(namespace, args...).Exec()
}
// RunKubectlWithFullOutput is a convenience wrapper over kubectlBuilder
// It will also return the command's stderr.
func RunKubectlWithFullOutput(namespace string, args ...string) (string, string, error) {
return NewKubectlCommand(namespace, args...).ExecWithFullOutput()
}
// RunKubectlOrDieInput is a convenience wrapper over kubectlBuilder that takes input to stdin
func RunKubectlOrDieInput(namespace string, data string, args ...string) string {
return NewKubectlCommand(namespace, args...).WithStdinData(data).ExecOrDie(namespace)
}
// RunKubectlInput is a convenience wrapper over kubectlBuilder that takes input to stdin
func RunKubectlInput(namespace string, data string, args ...string) (string, error) {
return NewKubectlCommand(namespace, args...).WithStdinData(data).Exec()
}
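// Example usage (sketch; namespace and manifest content are placeholders):
//
//	import (
//		e2ekubectl "k8s.io/kubernetes/test/e2e/framework/kubectl"
//	)
//
//	// applyManifest pipes the manifest into `kubectl apply -f -`.
//	func applyManifest(ns, manifest string) (string, error) {
//		return e2ekubectl.RunKubectlInput(ns, manifest, "apply", "-f", "-")
//	}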
// RunKubemciWithKubeconfig is a convenience wrapper over RunKubemciCmd
func RunKubemciWithKubeconfig(args ...string) (string, error) {
if framework.TestContext.KubeConfig != "" {
args = append(args, "--"+clientcmd.RecommendedConfigPathFlag+"="+framework.TestContext.KubeConfig)
}
return RunKubemciCmd(args...)
}
// RunKubemciCmd is a convenience wrapper over kubectlBuilder to run kubemci.
// It assumes that kubemci exists in PATH.
func RunKubemciCmd(args ...string) (string, error) {
// kubemci is assumed to be in PATH.
kubemci := "kubemci"
b := new(KubectlBuilder)
args = append(args, "--gcp-project="+framework.TestContext.CloudConfig.ProjectID)
b.cmd = exec.Command(kubemci, args...)
return b.Exec()
}

View File

@ -0,0 +1,206 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package kubectl
import (
"bytes"
"context"
"fmt"
"os/exec"
"path/filepath"
"strings"
"time"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/client-go/tools/clientcmd"
"k8s.io/kubernetes/test/e2e/framework"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
testutils "k8s.io/kubernetes/test/utils"
"github.com/onsi/ginkgo/v2"
)
const (
maxKubectlExecRetries = 5
)
// TestKubeconfig is a struct containing the needed attributes from TestContext and Framework(Namespace).
type TestKubeconfig struct {
CertDir string
Host string
KubeConfig string
KubeContext string
KubectlPath string
Namespace string // Every test has at least one namespace unless creation is skipped
}
// NewTestKubeconfig returns a new TestKubeconfig instance.
func NewTestKubeconfig(certdir, host, kubeconfig, kubecontext, kubectlpath, namespace string) *TestKubeconfig {
return &TestKubeconfig{
CertDir: certdir,
Host: host,
KubeConfig: kubeconfig,
KubeContext: kubecontext,
KubectlPath: kubectlpath,
Namespace: namespace,
}
}
// KubectlCmd returns the command for running the kubectl executable through the wrapper script.
func (tk *TestKubeconfig) KubectlCmd(args ...string) *exec.Cmd {
defaultArgs := []string{}
// Reference a --server option so tests can run anywhere.
if tk.Host != "" {
defaultArgs = append(defaultArgs, "--"+clientcmd.FlagAPIServer+"="+tk.Host)
}
if tk.KubeConfig != "" {
defaultArgs = append(defaultArgs, "--"+clientcmd.RecommendedConfigPathFlag+"="+tk.KubeConfig)
// Reference the KubeContext
if tk.KubeContext != "" {
defaultArgs = append(defaultArgs, "--"+clientcmd.FlagContext+"="+tk.KubeContext)
}
} else {
if tk.CertDir != "" {
defaultArgs = append(defaultArgs,
fmt.Sprintf("--certificate-authority=%s", filepath.Join(tk.CertDir, "ca.crt")),
fmt.Sprintf("--client-certificate=%s", filepath.Join(tk.CertDir, "kubecfg.crt")),
fmt.Sprintf("--client-key=%s", filepath.Join(tk.CertDir, "kubecfg.key")))
}
}
if tk.Namespace != "" {
defaultArgs = append(defaultArgs, fmt.Sprintf("--namespace=%s", tk.Namespace))
}
kubectlArgs := append(defaultArgs, args...)
// We allow users to specify a path to kubectl, so you can test either "kubectl" or "cluster/kubectl.sh"
// and so on.
cmd := exec.Command(tk.KubectlPath, kubectlArgs...)
// The caller will invoke this and wait on it.
return cmd
}
// LogFailedContainers runs `kubectl logs` on a failed containers.
func LogFailedContainers(ctx context.Context, c clientset.Interface, ns string, logFunc func(ftm string, args ...interface{})) {
podList, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
if err != nil {
logFunc("Error getting pods in namespace '%s': %v", ns, err)
return
}
logFunc("Running kubectl logs on non-ready containers in %v", ns)
for _, pod := range podList.Items {
if res, err := testutils.PodRunningReady(&pod); !res || err != nil {
kubectlLogPod(ctx, c, pod, "", framework.Logf)
}
}
}
func kubectlLogPod(ctx context.Context, c clientset.Interface, pod v1.Pod, containerNameSubstr string, logFunc func(ftm string, args ...interface{})) {
for _, container := range pod.Spec.Containers {
if strings.Contains(container.Name, containerNameSubstr) {
// Contains() matches all strings if substr is empty
logs, err := e2epod.GetPodLogs(ctx, c, pod.Namespace, pod.Name, container.Name)
if err != nil {
logs, err = e2epod.GetPreviousPodLogs(ctx, c, pod.Namespace, pod.Name, container.Name)
if err != nil {
logFunc("Failed to get logs of pod %v, container %v, err: %v", pod.Name, container.Name, err)
}
}
logFunc("Logs of %v/%v:%v on node %v", pod.Namespace, pod.Name, container.Name, pod.Spec.NodeName)
logFunc("%s : STARTLOG\n%s\nENDLOG for container %v:%v:%v", containerNameSubstr, logs, pod.Namespace, pod.Name, container.Name)
}
}
}
// WriteFileViaContainer writes a file using `kubectl exec echo <contents> > <path>` via the specified container.
// Because of the primitive technique we're using here, we only allow ASCII alphanumeric characters.
func (tk *TestKubeconfig) WriteFileViaContainer(podName, containerName string, path string, contents string) error {
ginkgo.By("writing a file in the container")
allowedCharacters := "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
for _, c := range contents {
if !strings.ContainsRune(allowedCharacters, c) {
return fmt.Errorf("Unsupported character in string to write: %v", c)
}
}
command := fmt.Sprintf("echo '%s' > '%s'; sync", contents, path)
stdout, stderr, err := tk.kubectlExecWithRetry(tk.Namespace, podName, containerName, "--", "/bin/sh", "-c", command)
if err != nil {
framework.Logf("error running kubectl exec to write file: %v\nstdout=%v\nstderr=%v)", err, string(stdout), string(stderr))
}
return err
}
// ReadFileViaContainer reads a file using kubectl exec cat <path>.
func (tk *TestKubeconfig) ReadFileViaContainer(podName, containerName string, path string) (string, error) {
ginkgo.By("reading a file in the container")
stdout, stderr, err := tk.kubectlExecWithRetry(tk.Namespace, podName, containerName, "--", "cat", path)
if err != nil {
framework.Logf("error running kubectl exec to read file: %v\nstdout=%v\nstderr=%v)", err, string(stdout), string(stderr))
}
return string(stdout), err
}
func (tk *TestKubeconfig) kubectlExecWithRetry(namespace string, podName, containerName string, args ...string) ([]byte, []byte, error) {
for numRetries := 0; numRetries < maxKubectlExecRetries; numRetries++ {
if numRetries > 0 {
framework.Logf("Retrying kubectl exec (retry count=%v/%v)", numRetries+1, maxKubectlExecRetries)
}
stdOutBytes, stdErrBytes, err := tk.kubectlExec(namespace, podName, containerName, args...)
if err != nil {
if strings.Contains(strings.ToLower(string(stdErrBytes)), "i/o timeout") {
// Retry on "i/o timeout" errors
framework.Logf("Warning: kubectl exec encountered i/o timeout.\nerr=%v\nstdout=%v\nstderr=%v)", err, string(stdOutBytes), string(stdErrBytes))
continue
}
if strings.Contains(strings.ToLower(string(stdErrBytes)), "container not found") {
// Retry on "container not found" errors
framework.Logf("Warning: kubectl exec encountered container not found.\nerr=%v\nstdout=%v\nstderr=%v)", err, string(stdOutBytes), string(stdErrBytes))
time.Sleep(2 * time.Second)
continue
}
}
return stdOutBytes, stdErrBytes, err
}
err := fmt.Errorf("Failed: kubectl exec failed %d times with \"i/o timeout\". Giving up", maxKubectlExecRetries)
return nil, nil, err
}
func (tk *TestKubeconfig) kubectlExec(namespace string, podName, containerName string, args ...string) ([]byte, []byte, error) {
var stdout, stderr bytes.Buffer
cmdArgs := []string{
"exec",
fmt.Sprintf("--namespace=%v", namespace),
podName,
fmt.Sprintf("-c=%v", containerName),
}
cmdArgs = append(cmdArgs, args...)
cmd := tk.KubectlCmd(cmdArgs...)
cmd.Stdout, cmd.Stderr = &stdout, &stderr
framework.Logf("Running '%s %s'", cmd.Path, strings.Join(cmdArgs, " "))
err := cmd.Run()
return stdout.Bytes(), stderr.Bytes(), err
}
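// Example usage (sketch; pod and container names are placeholders and f is
// assumed to be a *framework.Framework created by the test):
//
//	tk := NewTestKubeconfig(
//		framework.TestContext.CertDir,
//		framework.TestContext.Host,
//		framework.TestContext.KubeConfig,
//		framework.TestContext.KubeContext,
//		framework.TestContext.KubectlPath,
//		f.Namespace.Name,
//	)
//	framework.ExpectNoError(tk.WriteFileViaContainer("test-pod", "test-container", "/tmp/data", "abc123"))
//	out, err := tk.ReadFileViaContainer("test-pod", "test-container", "/tmp/data")
//	framework.ExpectNoError(err)
//	framework.Logf("read back %q", out)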

44
e2e/vendor/k8s.io/kubernetes/test/e2e/framework/log.go generated vendored Normal file
View File

@ -0,0 +1,44 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"fmt"
"github.com/onsi/ginkgo/v2"
)
// Logf logs the info.
//
// Use this instead of `klog.Infof` because stack unwinding automatically
// skips over helper functions which have marked themselves as helpers by
// calling [ginkgo.GinkgoHelper].
func Logf(format string, args ...interface{}) {
log(1, fmt.Sprintf(format, args...))
}
// Failf logs the fail info, including a stack trace that starts with its direct caller
// (for example, for call chain f -> g -> Failf("foo", ...) error would be logged for "g").
func Failf(format string, args ...interface{}) {
msg := fmt.Sprintf(format, args...)
skip := 1
ginkgo.Fail(msg, skip)
panic("unreachable")
}
// Fail is an alias for ginkgo.Fail.
var Fail = ginkgo.Fail
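// Example usage (sketch; the helper and its pod argument are assumed to come
// from a test suite): marking a helper with ginkgo.GinkgoHelper makes Logf and
// Failf report the helper's caller instead of the helper itself:
//
//	import (
//		"github.com/onsi/ginkgo/v2"
//
//		v1 "k8s.io/api/core/v1"
//		"k8s.io/kubernetes/test/e2e/framework"
//	)
//
//	func expectRunning(pod *v1.Pod) {
//		ginkgo.GinkgoHelper()
//		if pod.Status.Phase != v1.PodRunning {
//			framework.Failf("pod %s/%s is in phase %s, expected Running", pod.Namespace, pod.Name, pod.Status.Phase)
//		}
//		framework.Logf("pod %s/%s is running", pod.Namespace, pod.Name)
//	}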

View File

@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except of k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,14 @@
# See the OWNERS docs at https://go.k8s.io/owners
approvers:
- sig-instrumentation-approvers
emeritus_approvers:
- fabxc
- piosz
- fgrzadkowski
- kawych
- x13n
reviewers:
- sig-instrumentation-reviewers
labels:
- sig/instrumentation

View File

@ -0,0 +1,89 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"fmt"
e2eperftype "k8s.io/kubernetes/test/e2e/perftype"
)
// APICall is a struct for managing API call.
type APICall struct {
Resource string `json:"resource"`
Subresource string `json:"subresource"`
Verb string `json:"verb"`
Scope string `json:"scope"`
Latency LatencyMetric `json:"latency"`
Count int `json:"count"`
}
// APIResponsiveness is a struct for managing multiple API calls.
type APIResponsiveness struct {
APICalls []APICall `json:"apicalls"`
}
// SummaryKind returns the summary of API responsiveness.
func (a *APIResponsiveness) SummaryKind() string {
return "APIResponsiveness"
}
// PrintHumanReadable returns metrics with JSON format.
func (a *APIResponsiveness) PrintHumanReadable() string {
return PrettyPrintJSON(a)
}
// PrintJSON returns metrics of PerfData(50, 90 and 99th percentiles) with JSON format.
func (a *APIResponsiveness) PrintJSON() string {
return PrettyPrintJSON(APICallToPerfData(a))
}
func (a *APIResponsiveness) Len() int { return len(a.APICalls) }
func (a *APIResponsiveness) Swap(i, j int) {
a.APICalls[i], a.APICalls[j] = a.APICalls[j], a.APICalls[i]
}
func (a *APIResponsiveness) Less(i, j int) bool {
return a.APICalls[i].Latency.Perc99 < a.APICalls[j].Latency.Perc99
}
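// Example usage (sketch): because APIResponsiveness implements sort.Interface
// ordered by the 99th percentile, the slowest calls can be listed first with
// sort.Reverse:
//
//	import "sort"
//
//	func slowestFirst(a *APIResponsiveness) {
//		sort.Sort(sort.Reverse(a))
//	}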
// currentAPICallMetricsVersion is the current apicall performance metrics version. We should
// bump up the version each time we make incompatible change to the metrics.
const currentAPICallMetricsVersion = "v1"
// APICallToPerfData transforms APIResponsiveness to PerfData.
func APICallToPerfData(apicalls *APIResponsiveness) *e2eperftype.PerfData {
perfData := &e2eperftype.PerfData{Version: currentAPICallMetricsVersion}
for _, apicall := range apicalls.APICalls {
item := e2eperftype.DataItem{
Data: map[string]float64{
"Perc50": float64(apicall.Latency.Perc50) / 1000000, // us -> ms
"Perc90": float64(apicall.Latency.Perc90) / 1000000,
"Perc99": float64(apicall.Latency.Perc99) / 1000000,
},
Unit: "ms",
Labels: map[string]string{
"Verb": apicall.Verb,
"Resource": apicall.Resource,
"Subresource": apicall.Subresource,
"Scope": apicall.Scope,
"Count": fmt.Sprintf("%v", apicall.Count),
},
}
perfData.DataItems = append(perfData.DataItems, item)
}
return perfData
}

View File

@ -0,0 +1,42 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"k8s.io/component-base/metrics/testutil"
)
// APIServerMetrics is metrics for API server
type APIServerMetrics testutil.Metrics
// Equal returns true if all metrics are the same as the arguments.
func (m *APIServerMetrics) Equal(o APIServerMetrics) bool {
return (*testutil.Metrics)(m).Equal(testutil.Metrics(o))
}
func newAPIServerMetrics() APIServerMetrics {
result := testutil.NewMetrics()
return APIServerMetrics(result)
}
func parseAPIServerMetrics(data string) (APIServerMetrics, error) {
result := newAPIServerMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return APIServerMetrics{}, err
}
return result, nil
}

View File

@ -0,0 +1,40 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import "k8s.io/component-base/metrics/testutil"
// ClusterAutoscalerMetrics is metrics for cluster autoscaler
type ClusterAutoscalerMetrics testutil.Metrics
// Equal returns true if all metrics are the same as the arguments.
func (m *ClusterAutoscalerMetrics) Equal(o ClusterAutoscalerMetrics) bool {
return (*testutil.Metrics)(m).Equal(testutil.Metrics(o))
}
func newClusterAutoscalerMetrics() ClusterAutoscalerMetrics {
result := testutil.NewMetrics()
return ClusterAutoscalerMetrics(result)
}
func parseClusterAutoscalerMetrics(data string) (ClusterAutoscalerMetrics, error) {
result := newClusterAutoscalerMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return ClusterAutoscalerMetrics{}, err
}
return result, nil
}

View File

@ -0,0 +1,40 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import "k8s.io/component-base/metrics/testutil"
// ControllerManagerMetrics is metrics for controller manager
type ControllerManagerMetrics testutil.Metrics
// Equal returns true if all metrics are the same as the arguments.
func (m *ControllerManagerMetrics) Equal(o ControllerManagerMetrics) bool {
return (*testutil.Metrics)(m).Equal(testutil.Metrics(o))
}
func newControllerManagerMetrics() ControllerManagerMetrics {
result := testutil.NewMetrics()
return ControllerManagerMetrics(result)
}
func parseControllerManagerMetrics(data string) (ControllerManagerMetrics, error) {
result := newControllerManagerMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return ControllerManagerMetrics{}, err
}
return result, nil
}

View File

@ -0,0 +1,127 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"bytes"
"encoding/json"
"fmt"
"k8s.io/component-base/metrics/testutil"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
// Cluster Autoscaler metrics names
caFunctionMetric = "cluster_autoscaler_function_duration_seconds_bucket"
caFunctionMetricLabel = "function"
)
// ComponentCollection is metrics collection of components.
type ComponentCollection Collection
func (m *ComponentCollection) filterMetrics() {
apiServerMetrics := make(APIServerMetrics)
for _, metric := range interestingAPIServerMetrics {
apiServerMetrics[metric] = (*m).APIServerMetrics[metric]
}
controllerManagerMetrics := make(ControllerManagerMetrics)
for _, metric := range interestingControllerManagerMetrics {
controllerManagerMetrics[metric] = (*m).ControllerManagerMetrics[metric]
}
kubeletMetrics := make(map[string]KubeletMetrics)
for kubelet, grabbed := range (*m).KubeletMetrics {
kubeletMetrics[kubelet] = make(KubeletMetrics)
for _, metric := range interestingKubeletMetrics {
kubeletMetrics[kubelet][metric] = grabbed[metric]
}
}
(*m).APIServerMetrics = apiServerMetrics
(*m).ControllerManagerMetrics = controllerManagerMetrics
(*m).KubeletMetrics = kubeletMetrics
}
// PrintHumanReadable returns e2e metrics in a human-readable text format.
func (m *ComponentCollection) PrintHumanReadable() string {
buf := bytes.Buffer{}
for _, interestingMetric := range interestingAPIServerMetrics {
buf.WriteString(fmt.Sprintf("For %v:\n", interestingMetric))
for _, sample := range (*m).APIServerMetrics[interestingMetric] {
buf.WriteString(fmt.Sprintf("\t%v\n", testutil.PrintSample(sample)))
}
}
for _, interestingMetric := range interestingControllerManagerMetrics {
buf.WriteString(fmt.Sprintf("For %v:\n", interestingMetric))
for _, sample := range (*m).ControllerManagerMetrics[interestingMetric] {
buf.WriteString(fmt.Sprintf("\t%v\n", testutil.PrintSample(sample)))
}
}
for _, interestingMetric := range interestingClusterAutoscalerMetrics {
buf.WriteString(fmt.Sprintf("For %v:\n", interestingMetric))
for _, sample := range (*m).ClusterAutoscalerMetrics[interestingMetric] {
buf.WriteString(fmt.Sprintf("\t%v\n", testutil.PrintSample(sample)))
}
}
for kubelet, grabbed := range (*m).KubeletMetrics {
buf.WriteString(fmt.Sprintf("For %v:\n", kubelet))
for _, interestingMetric := range interestingKubeletMetrics {
buf.WriteString(fmt.Sprintf("\tFor %v:\n", interestingMetric))
for _, sample := range grabbed[interestingMetric] {
buf.WriteString(fmt.Sprintf("\t\t%v\n", testutil.PrintSample(sample)))
}
}
}
return buf.String()
}
// PrettyPrintJSON converts metrics to JSON format.
// TODO: This function should be replaced with framework.PrettyPrintJSON after solving
// the circular dependency between the core framework and this metrics subpackage.
func PrettyPrintJSON(metrics interface{}) string {
output := &bytes.Buffer{}
if err := json.NewEncoder(output).Encode(metrics); err != nil {
framework.Logf("Error building encoder: %v", err)
return ""
}
formatted := &bytes.Buffer{}
if err := json.Indent(formatted, output.Bytes(), "", " "); err != nil {
framework.Logf("Error indenting: %v", err)
return ""
}
return formatted.String()
}
// PrintJSON returns e2e metrics with JSON format.
func (m *ComponentCollection) PrintJSON() string {
m.filterMetrics()
return PrettyPrintJSON(m)
}
// SummaryKind returns the summary of e2e metrics.
func (m *ComponentCollection) SummaryKind() string {
return "ComponentCollection"
}
// ComputeClusterAutoscalerMetricsDelta computes the change in cluster
// autoscaler metrics.
func (m *ComponentCollection) ComputeClusterAutoscalerMetricsDelta(before Collection) {
if beforeSamples, found := before.ClusterAutoscalerMetrics[caFunctionMetric]; found {
if afterSamples, found := m.ClusterAutoscalerMetrics[caFunctionMetric]; found {
testutil.ComputeHistogramDelta(beforeSamples, afterSamples, caFunctionMetricLabel)
}
}
}

View File

@ -0,0 +1,75 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"context"
"github.com/onsi/ginkgo/v2"
"k8s.io/kubernetes/test/e2e/framework"
)
func GrabBeforeEach(ctx context.Context, f *framework.Framework) (result *Collection) {
gatherMetricsAfterTest := framework.TestContext.GatherMetricsAfterTest == "true" || framework.TestContext.GatherMetricsAfterTest == "master"
if !gatherMetricsAfterTest || !framework.TestContext.IncludeClusterAutoscalerMetrics {
return nil
}
ginkgo.By("Gathering metrics before test", func() {
grabber, err := NewMetricsGrabber(ctx, f.ClientSet, f.KubemarkExternalClusterClientSet, f.ClientConfig(), !framework.ProviderIs("kubemark"), false, false, false, framework.TestContext.IncludeClusterAutoscalerMetrics, false)
if err != nil {
framework.Logf("Failed to create MetricsGrabber (skipping ClusterAutoscaler metrics gathering before test): %v", err)
return
}
metrics, err := grabber.Grab(ctx)
if err != nil {
framework.Logf("MetricsGrabber failed to grab CA metrics before test (skipping metrics gathering): %v", err)
return
}
framework.Logf("Gathered ClusterAutoscaler metrics before test")
result = &metrics
})
return
}
func GrabAfterEach(ctx context.Context, f *framework.Framework, before *Collection) {
if framework.TestContext.GatherMetricsAfterTest == "false" {
return
}
ginkgo.By("Gathering metrics after test", func() {
// Grab apiserver, scheduler, controller-manager metrics and (optionally) nodes' kubelet metrics.
grabMetricsFromKubelets := framework.TestContext.GatherMetricsAfterTest != "master" && !framework.ProviderIs("kubemark")
grabber, err := NewMetricsGrabber(ctx, f.ClientSet, f.KubemarkExternalClusterClientSet, f.ClientConfig(), grabMetricsFromKubelets, true, true, true, framework.TestContext.IncludeClusterAutoscalerMetrics, false)
if err != nil {
framework.Logf("Failed to create MetricsGrabber (skipping metrics gathering): %v", err)
return
}
received, err := grabber.Grab(ctx)
if err != nil {
framework.Logf("MetricsGrabber failed to grab some of the metrics: %v", err)
return
}
if before == nil {
before = &Collection{}
}
(*ComponentCollection)(&received).ComputeClusterAutoscalerMetricsDelta(*before)
f.TestSummaries = append(f.TestSummaries, (*ComponentCollection)(&received))
})
}
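// Example usage (sketch; suite and namespace names are placeholders): pairing
// the two helpers so that the ClusterAutoscaler metrics delta ends up in the
// test summaries:
//
//	import (
//		"context"
//
//		"github.com/onsi/ginkgo/v2"
//
//		"k8s.io/kubernetes/test/e2e/framework"
//	)
//
//	var _ = framework.Describe("metrics delta example", func() {
//		f := framework.NewDefaultFramework("metrics-delta")
//		var before *Collection
//
//		ginkgo.BeforeEach(func(ctx context.Context) {
//			before = GrabBeforeEach(ctx, f)
//		})
//		ginkgo.AfterEach(func(ctx context.Context) {
//			GrabAfterEach(ctx, f, before)
//		})
//	})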

View File

@ -0,0 +1,58 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
var interestingAPIServerMetrics = []string{
"apiserver_request_total",
"apiserver_request_latency_seconds",
"apiserver_init_events_total",
}
var interestingControllerManagerMetrics = []string{
"garbage_collector_attempt_to_delete_queue_latency",
"garbage_collector_attempt_to_delete_work_duration",
"garbage_collector_attempt_to_orphan_queue_latency",
"garbage_collector_attempt_to_orphan_work_duration",
"garbage_collector_dirty_processing_latency_microseconds",
"garbage_collector_event_processing_latency_microseconds",
"garbage_collector_graph_changes_queue_latency",
"garbage_collector_graph_changes_work_duration",
"garbage_collector_orphan_processing_latency_microseconds",
"namespace_queue_latency",
"namespace_queue_latency_sum",
"namespace_queue_latency_count",
"namespace_retries",
"namespace_work_duration",
"namespace_work_duration_sum",
"namespace_work_duration_count",
}
var interestingKubeletMetrics = []string{
"kubelet_docker_operations_errors_total",
"kubelet_docker_operations_duration_seconds",
"kubelet_pod_start_duration_seconds",
"kubelet_pod_start_sli_duration_seconds",
"kubelet_pod_worker_duration_seconds",
"kubelet_pod_worker_start_duration_seconds",
}
var interestingClusterAutoscalerMetrics = []string{
"function_duration_seconds",
"errors_total",
"evicted_pods_total",
}

View File

@ -0,0 +1,47 @@
/*
Copyright 2024 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"fmt"
"k8s.io/component-base/metrics/testutil"
)
// KubeProxyMetrics is metrics for kube-proxy
type KubeProxyMetrics testutil.Metrics
// GetCounterMetricValue returns value for metric type counter.
func (m *KubeProxyMetrics) GetCounterMetricValue(metricName string) (float64, error) {
if len(testutil.Metrics(*m)[metricName]) == 0 {
return 0, fmt.Errorf("metric '%s' not found", metricName)
}
return float64(testutil.Metrics(*m)[metricName][0].Value), nil
}
func newKubeProxyMetricsMetrics() KubeProxyMetrics {
result := testutil.NewMetrics()
return KubeProxyMetrics(result)
}
func parseKubeProxyMetrics(data string) (KubeProxyMetrics, error) {
result := newKubeProxyMetricsMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return KubeProxyMetrics{}, err
}
return result, nil
}

View File

@ -0,0 +1,214 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"context"
"fmt"
"io"
"net/http"
"sort"
"strconv"
"strings"
"time"
"k8s.io/apimachinery/pkg/util/sets"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/component-base/metrics/testutil"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
proxyTimeout = 2 * time.Minute
// dockerOperationsLatencyKey is the key for the operation latency metrics.
// Taken from k8s.io/kubernetes/pkg/kubelet/dockershim/metrics
dockerOperationsLatencyKey = "docker_operations_duration_seconds"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
kubeletSubsystem = "kubelet"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
podWorkerDurationKey = "pod_worker_duration_seconds"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
podStartDurationKey = "pod_start_duration_seconds"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
podStartSLIDurationKey = "pod_start_sli_duration_seconds"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
cgroupManagerOperationsKey = "cgroup_manager_duration_seconds"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
podWorkerStartDurationKey = "pod_worker_start_duration_seconds"
// Taken from k8s.io/kubernetes/pkg/kubelet/metrics
plegRelistDurationKey = "pleg_relist_duration_seconds"
)
// KubeletMetrics is metrics for kubelet
type KubeletMetrics testutil.Metrics
// Equal returns true if all metrics are the same as the arguments.
func (m *KubeletMetrics) Equal(o KubeletMetrics) bool {
return (*testutil.Metrics)(m).Equal(testutil.Metrics(o))
}
// NewKubeletMetrics returns new metrics which are initialized.
func NewKubeletMetrics() KubeletMetrics {
result := testutil.NewMetrics()
return KubeletMetrics(result)
}
// GrabKubeletMetricsWithoutProxy retrieves metrics from the kubelet on the given node using a simple GET over http.
func GrabKubeletMetricsWithoutProxy(ctx context.Context, nodeName, path string) (KubeletMetrics, error) {
req, err := http.NewRequestWithContext(ctx, "GET", fmt.Sprintf("http://%s%s", nodeName, path), nil)
if err != nil {
return KubeletMetrics{}, err
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
return KubeletMetrics{}, err
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return KubeletMetrics{}, err
}
return parseKubeletMetrics(string(body))
}
func parseKubeletMetrics(data string) (KubeletMetrics, error) {
result := NewKubeletMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return KubeletMetrics{}, err
}
return result, nil
}
// KubeletLatencyMetric stores metrics scraped from the kubelet server's /metrics endpoint.
// TODO: Get some more structure around the metrics and this type
type KubeletLatencyMetric struct {
// eg: list, info, create
Operation string
// eg: sync_pods, pod_worker
Method string
// 0 <= quantile <=1, e.g. 0.95 is 95%tile, 0.5 is median.
Quantile float64
Latency time.Duration
}
// KubeletLatencyMetrics implements sort.Interface for []KubeletLatencyMetric based on
// the latency field.
type KubeletLatencyMetrics []KubeletLatencyMetric
func (a KubeletLatencyMetrics) Len() int { return len(a) }
func (a KubeletLatencyMetrics) Swap(i, j int) { a[i], a[j] = a[j], a[i] }
func (a KubeletLatencyMetrics) Less(i, j int) bool { return a[i].Latency > a[j].Latency }
// If an apiserver client is passed in, the function tries to get kubelet metrics through the metrics grabber;
// otherwise, it tries to get kubelet metrics directly from the node.
func getKubeletMetricsFromNode(ctx context.Context, c clientset.Interface, nodeName string) (KubeletMetrics, error) {
if c == nil {
return GrabKubeletMetricsWithoutProxy(ctx, nodeName, "/metrics")
}
grabber, err := NewMetricsGrabber(ctx, c, nil, nil, true, false, false, false, false, false)
if err != nil {
return KubeletMetrics{}, err
}
return grabber.GrabFromKubelet(ctx, nodeName)
}
// GetKubeletMetrics gets all metrics in kubelet subsystem from specified node and trims
// the subsystem prefix.
func GetKubeletMetrics(ctx context.Context, c clientset.Interface, nodeName string) (KubeletMetrics, error) {
ms, err := getKubeletMetricsFromNode(ctx, c, nodeName)
if err != nil {
return KubeletMetrics{}, err
}
kubeletMetrics := make(KubeletMetrics)
for name, samples := range ms {
const prefix = kubeletSubsystem + "_"
if !strings.HasPrefix(name, prefix) {
// Not a kubelet metric.
continue
}
method := strings.TrimPrefix(name, prefix)
kubeletMetrics[method] = samples
}
return kubeletMetrics, nil
}
// GetDefaultKubeletLatencyMetrics calls GetKubeletLatencyMetrics with a set of default metricNames
// identifying common latency metrics.
// Note that the KubeletMetrics passed in should not contain subsystem prefix.
func GetDefaultKubeletLatencyMetrics(ms KubeletMetrics) KubeletLatencyMetrics {
latencyMetricNames := sets.NewString(
podWorkerDurationKey,
podWorkerStartDurationKey,
podStartDurationKey,
podStartSLIDurationKey,
cgroupManagerOperationsKey,
dockerOperationsLatencyKey,
plegRelistDurationKey,
)
return GetKubeletLatencyMetrics(ms, latencyMetricNames)
}
// GetKubeletLatencyMetrics filters ms to include only those contained in the metricNames set,
// then constructs a KubeletLatencyMetrics list based on the samples associated with those metrics.
func GetKubeletLatencyMetrics(ms KubeletMetrics, filterMetricNames sets.String) KubeletLatencyMetrics {
var latencyMetrics KubeletLatencyMetrics
for name, samples := range ms {
if !filterMetricNames.Has(name) {
continue
}
for _, sample := range samples {
latency := sample.Value
operation := string(sample.Metric["operation_type"])
var quantile float64
if val, ok := sample.Metric[testutil.QuantileLabel]; ok {
var err error
if quantile, err = strconv.ParseFloat(string(val), 64); err != nil {
continue
}
}
latencyMetrics = append(latencyMetrics, KubeletLatencyMetric{
Operation: operation,
Method: name,
Quantile: quantile,
Latency: time.Duration(int64(latency)) * time.Microsecond,
})
}
}
return latencyMetrics
}
// HighLatencyKubeletOperations logs and counts the high latency metrics exported by the kubelet server via /metrics.
func HighLatencyKubeletOperations(ctx context.Context, c clientset.Interface, threshold time.Duration, nodeName string, logFunc func(fmt string, args ...interface{})) (KubeletLatencyMetrics, error) {
ms, err := GetKubeletMetrics(ctx, c, nodeName)
if err != nil {
return KubeletLatencyMetrics{}, err
}
latencyMetrics := GetDefaultKubeletLatencyMetrics(ms)
sort.Sort(latencyMetrics)
var badMetrics KubeletLatencyMetrics
logFunc("\nLatency metrics for node %v", nodeName)
for _, m := range latencyMetrics {
if m.Latency > threshold {
badMetrics = append(badMetrics, m)
framework.Logf("%+v", m)
}
}
return badMetrics, nil
}
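// Example usage (sketch; the threshold and node name are placeholders and f is
// assumed to be a *framework.Framework created by the test):
//
//	bad, err := HighLatencyKubeletOperations(ctx, f.ClientSet, 10*time.Second, nodeName, framework.Logf)
//	framework.ExpectNoError(err)
//	if len(bad) > 0 {
//		framework.Failf("%d kubelet operations exceeded the latency threshold", len(bad))
//	}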

View File

@ -0,0 +1,38 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"time"
)
// PodLatencyData encapsulates pod startup latency information.
type PodLatencyData struct {
// Name of the pod
Name string
// Node this pod was running on
Node string
// Latency information related to pod startuptime
Latency time.Duration
}
// LatencySlice is an array of PodLatencyData which encapsulates pod startup latency information.
type LatencySlice []PodLatencyData
func (a LatencySlice) Len() int { return len(a) }
func (a LatencySlice) Swap(i, j int) { a[i], a[j] = a[j], a[i] }
func (a LatencySlice) Less(i, j int) bool { return a[i].Latency < a[j].Latency }

View File

@ -0,0 +1,565 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"context"
"errors"
"fmt"
"net"
"regexp"
"strconv"
"sync"
"time"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
"k8s.io/klog/v2"
"k8s.io/kubernetes/test/e2e/framework"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
e2epodoutput "k8s.io/kubernetes/test/e2e/framework/pod/output"
)
const (
// kubeSchedulerPort is the default port for the scheduler status server.
kubeSchedulerPort = 10259
// kubeControllerManagerPort is the default port for the controller manager status server.
kubeControllerManagerPort = 10257
// snapshotControllerPort is the port for the snapshot controller
snapshotControllerPort = 9102
// kubeProxyPort is the default port for the kube-proxy status server.
kubeProxyPort = 10249
)
// MetricsGrabbingDisabledError is an error that is wrapped by the
// different MetricsGrabber.Grab* functions when metrics grabbing is
// not supported. Tests that check metrics data should then skip
// the check.
var MetricsGrabbingDisabledError = errors.New("metrics grabbing disabled")
// Collection is metrics collection of components
type Collection struct {
APIServerMetrics APIServerMetrics
APIServerMetricsSLIs APIServerMetrics
ControllerManagerMetrics ControllerManagerMetrics
SnapshotControllerMetrics SnapshotControllerMetrics
KubeletMetrics map[string]KubeletMetrics
SchedulerMetrics SchedulerMetrics
ClusterAutoscalerMetrics ClusterAutoscalerMetrics
}
// Grabber provides functions which grab metrics from components
type Grabber struct {
client clientset.Interface
externalClient clientset.Interface
config *rest.Config
grabFromAPIServer bool
grabFromControllerManager bool
grabFromKubelets bool
grabFromScheduler bool
grabFromClusterAutoscaler bool
grabFromSnapshotController bool
kubeScheduler string
waitForSchedulerReadyOnce sync.Once
kubeControllerManager string
waitForControllerManagerReadyOnce sync.Once
snapshotController string
waitForSnapshotControllerReadyOnce sync.Once
}
// NewMetricsGrabber prepares for grabbing metrics data from several different
// components. It should be called when those components are running because
// it needs to communicate with them to determine for which components
// metrics data can be retrieved.
//
// Collecting metrics data is an optional debug feature. Not all clusters will
// support it. If disabled for a component, the corresponding Grab function
// will immediately return an error derived from MetricsGrabbingDisabledError.
func NewMetricsGrabber(ctx context.Context, c clientset.Interface, ec clientset.Interface, config *rest.Config, kubelets bool, scheduler bool, controllers bool, apiServer bool, clusterAutoscaler bool, snapshotController bool) (*Grabber, error) {
kubeScheduler := ""
kubeControllerManager := ""
snapshotControllerManager := ""
regKubeScheduler := regexp.MustCompile("kube-scheduler-.*")
regKubeControllerManager := regexp.MustCompile("kube-controller-manager-.*")
regSnapshotController := regexp.MustCompile("volume-snapshot-controller.*")
if (scheduler || controllers) && config == nil {
return nil, errors.New("a rest config is required for grabbing kube-scheduler and kube-controller-manager metrics")
}
podList, err := c.CoreV1().Pods(metav1.NamespaceSystem).List(ctx, metav1.ListOptions{})
if err != nil {
return nil, err
}
if len(podList.Items) < 1 {
klog.Warningf("Can't find any pods in namespace %s to grab metrics from", metav1.NamespaceSystem)
}
for _, pod := range podList.Items {
if regKubeScheduler.MatchString(pod.Name) {
kubeScheduler = pod.Name
}
if regKubeControllerManager.MatchString(pod.Name) {
kubeControllerManager = pod.Name
}
if regSnapshotController.MatchString(pod.Name) {
snapshotControllerManager = pod.Name
}
if kubeScheduler != "" && kubeControllerManager != "" && snapshotControllerManager != "" {
break
}
}
if clusterAutoscaler && ec == nil {
klog.Warningf("Did not receive an external client interface. Grabbing metrics from ClusterAutoscaler is disabled.")
}
return &Grabber{
client: c,
externalClient: ec,
config: config,
grabFromAPIServer: apiServer,
grabFromControllerManager: checkPodDebugHandlers(ctx, c, controllers, "kube-controller-manager", kubeControllerManager),
grabFromKubelets: kubelets,
grabFromScheduler: checkPodDebugHandlers(ctx, c, scheduler, "kube-scheduler", kubeScheduler),
grabFromClusterAutoscaler: clusterAutoscaler,
grabFromSnapshotController: checkPodDebugHandlers(ctx, c, snapshotController, "snapshot-controller", snapshotControllerManager),
kubeScheduler: kubeScheduler,
kubeControllerManager: kubeControllerManager,
snapshotController: snapshotControllerManager,
}, nil
}
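// exampleGrabSchedulerMetrics is an illustrative sketch (the helper name is
// made up for this example) of typical Grabber usage: construct it once the
// control-plane components are running, and skip metric checks when grabbing
// is disabled for a component by testing for MetricsGrabbingDisabledError.
func exampleGrabSchedulerMetrics(ctx context.Context, c clientset.Interface, config *rest.Config) error {
	grabber, err := NewMetricsGrabber(ctx, c, nil /* externalClient */, config,
		false /* kubelets */, true /* scheduler */, false /* controllers */,
		false /* apiServer */, false /* clusterAutoscaler */, false /* snapshotController */)
	if err != nil {
		return err
	}
	metrics, err := grabber.GrabFromScheduler(ctx)
	if errors.Is(err, MetricsGrabbingDisabledError) {
		// Grabbing is not supported in this cluster; callers should skip the check.
		return nil
	}
	if err != nil {
		return err
	}
	_ = metrics // inspect the SchedulerMetrics here
	return nil
}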
func checkPodDebugHandlers(ctx context.Context, c clientset.Interface, requested bool, component, podName string) bool {
if !requested {
return false
}
if podName == "" {
klog.Warningf("Can't find %s pod. Grabbing metrics from %s is disabled.", component, component)
return false
}
// The debug handlers on the host where the pod runs might be disabled.
// We can check that indirectly by trying to retrieve log output.
limit := int64(1)
if _, err := c.CoreV1().Pods(metav1.NamespaceSystem).GetLogs(podName, &v1.PodLogOptions{LimitBytes: &limit}).DoRaw(ctx); err != nil {
klog.Warningf("Can't retrieve log output of %s (%q). Debug handlers might be disabled in kubelet. Grabbing metrics from %s is disabled.",
podName, err, component)
return false
}
// Metrics gathering enabled.
return true
}
// HasControlPlanePods returns true if metrics grabber was able to find control-plane pods
func (g *Grabber) HasControlPlanePods() bool {
return g.kubeScheduler != "" && g.kubeControllerManager != ""
}
// GrabFromKubelet returns metrics from kubelet
func (g *Grabber) GrabFromKubelet(ctx context.Context, nodeName string) (KubeletMetrics, error) {
nodes, err := g.client.CoreV1().Nodes().List(ctx, metav1.ListOptions{FieldSelector: fields.Set{"metadata.name": nodeName}.AsSelector().String()})
if err != nil {
return KubeletMetrics{}, err
}
if len(nodes.Items) != 1 {
return KubeletMetrics{}, fmt.Errorf("Error listing nodes with name %v, got %v", nodeName, nodes.Items)
}
kubeletPort := nodes.Items[0].Status.DaemonEndpoints.KubeletEndpoint.Port
return g.grabFromKubeletInternal(ctx, nodeName, int(kubeletPort), "metrics")
}
// GrabResourceMetricsFromKubelet returns resource metrics from the kubelet
func (g *Grabber) GrabResourceMetricsFromKubelet(ctx context.Context, nodeName string) (KubeletMetrics, error) {
nodes, err := g.client.CoreV1().Nodes().List(ctx, metav1.ListOptions{FieldSelector: fields.Set{"metadata.name": nodeName}.AsSelector().String()})
if err != nil {
return KubeletMetrics{}, err
}
if len(nodes.Items) != 1 {
return KubeletMetrics{}, fmt.Errorf("Error listing nodes with name %v, got %v", nodeName, nodes.Items)
}
kubeletPort := nodes.Items[0].Status.DaemonEndpoints.KubeletEndpoint.Port
return g.grabFromKubeletInternal(ctx, nodeName, int(kubeletPort), "metrics/resource")
}
func (g *Grabber) grabFromKubeletInternal(ctx context.Context, nodeName string, kubeletPort int, pathSuffix string) (KubeletMetrics, error) {
if kubeletPort <= 0 || kubeletPort > 65535 {
return KubeletMetrics{}, fmt.Errorf("Invalid Kubelet port %v. Skipping Kubelet's metrics gathering", kubeletPort)
}
output, err := g.getMetricsFromNode(ctx, nodeName, kubeletPort, pathSuffix)
if err != nil {
return KubeletMetrics{}, err
}
return parseKubeletMetrics(output)
}
func (g *Grabber) getMetricsFromNode(ctx context.Context, nodeName string, kubeletPort int, pathSuffix string) (string, error) {
// There's a problem with timing out during proxy. Wrapping this in a goroutine to prevent deadlock.
finished := make(chan struct{}, 1)
var err error
var rawOutput []byte
go func() {
rawOutput, err = g.client.CoreV1().RESTClient().Get().
Resource("nodes").
SubResource("proxy").
Name(fmt.Sprintf("%v:%v", nodeName, kubeletPort)).
Suffix(pathSuffix).
Do(ctx).Raw()
finished <- struct{}{}
}()
select {
case <-time.After(proxyTimeout):
return "", fmt.Errorf("Timed out when waiting for proxy to gather metrics from %v", nodeName)
case <-finished:
if err != nil {
return "", err
}
return string(rawOutput), nil
}
}
// GrabFromKubeProxy returns metrics from kube-proxy
func (g *Grabber) GrabFromKubeProxy(ctx context.Context, nodeName string) (KubeProxyMetrics, error) {
nodes, err := g.client.CoreV1().Nodes().List(ctx, metav1.ListOptions{FieldSelector: fields.Set{"metadata.name": nodeName}.AsSelector().String()})
if err != nil {
return KubeProxyMetrics{}, err
}
if len(nodes.Items) != 1 {
return KubeProxyMetrics{}, fmt.Errorf("error listing nodes with name %v, got %v", nodeName, nodes.Items)
}
output, err := g.grabFromKubeProxy(ctx, nodeName)
if err != nil {
return KubeProxyMetrics{}, err
}
return parseKubeProxyMetrics(output)
}
func (g *Grabber) grabFromKubeProxy(ctx context.Context, nodeName string) (string, error) {
hostCmdPodName := fmt.Sprintf("grab-kube-proxy-metrics-%s", framework.RandomSuffix())
hostCmdPod := e2epod.NewExecPodSpec(metav1.NamespaceSystem, hostCmdPodName, true)
nodeSelection := e2epod.NodeSelection{Name: nodeName}
e2epod.SetNodeSelection(&hostCmdPod.Spec, nodeSelection)
if _, err := g.client.CoreV1().Pods(metav1.NamespaceSystem).Create(ctx, hostCmdPod, metav1.CreateOptions{}); err != nil {
return "", fmt.Errorf("failed to create pod to fetch metrics: %w", err)
}
if err := e2epod.WaitTimeoutForPodReadyInNamespace(ctx, g.client, hostCmdPodName, metav1.NamespaceSystem, 5*time.Minute); err != nil {
return "", fmt.Errorf("error waiting for pod to be up: %w", err)
}
host := "127.0.0.1"
if framework.TestContext.ClusterIsIPv6() {
host = "::1"
}
stdout, err := e2epodoutput.RunHostCmd(metav1.NamespaceSystem, hostCmdPodName, fmt.Sprintf("curl --silent %s/metrics", net.JoinHostPort(host, strconv.Itoa(kubeProxyPort))))
_ = g.client.CoreV1().Pods(metav1.NamespaceSystem).Delete(ctx, hostCmdPodName, metav1.DeleteOptions{})
return stdout, err
}
// GrabFromScheduler returns metrics from scheduler
func (g *Grabber) GrabFromScheduler(ctx context.Context) (SchedulerMetrics, error) {
if !g.grabFromScheduler {
return SchedulerMetrics{}, fmt.Errorf("kube-scheduler: %w", MetricsGrabbingDisabledError)
}
var err error
g.waitForSchedulerReadyOnce.Do(func() {
if readyErr := e2epod.WaitTimeoutForPodReadyInNamespace(ctx, g.client, g.kubeScheduler, metav1.NamespaceSystem, 5*time.Minute); readyErr != nil {
err = fmt.Errorf("error waiting for kube-scheduler pod to be ready: %w", readyErr)
}
})
if err != nil {
return SchedulerMetrics{}, err
}
var lastMetricsFetchErr error
var output string
if metricsWaitErr := wait.PollUntilContextTimeout(ctx, time.Second, time.Minute, true, func(ctx context.Context) (bool, error) {
output, lastMetricsFetchErr = g.getSecureMetricsFromPod(ctx, g.kubeScheduler, metav1.NamespaceSystem, kubeSchedulerPort)
return lastMetricsFetchErr == nil, nil
}); metricsWaitErr != nil {
err := fmt.Errorf("error waiting for kube-scheduler pod to expose metrics: %v; %v", metricsWaitErr, lastMetricsFetchErr)
return SchedulerMetrics{}, err
}
return parseSchedulerMetrics(output)
}
// GrabFromClusterAutoscaler returns metrics from cluster autoscaler
func (g *Grabber) GrabFromClusterAutoscaler(ctx context.Context) (ClusterAutoscalerMetrics, error) {
if !g.HasControlPlanePods() && g.externalClient == nil {
return ClusterAutoscalerMetrics{}, fmt.Errorf("ClusterAutoscaler: %w", MetricsGrabbingDisabledError)
}
var client clientset.Interface
var namespace string
if g.externalClient != nil {
client = g.externalClient
namespace = "kubemark"
} else {
client = g.client
namespace = metav1.NamespaceSystem
}
output, err := g.getMetricsFromPod(ctx, client, "cluster-autoscaler", namespace, 8085)
if err != nil {
return ClusterAutoscalerMetrics{}, err
}
return parseClusterAutoscalerMetrics(output)
}
// GrabFromControllerManager returns metrics from controller manager
func (g *Grabber) GrabFromControllerManager(ctx context.Context) (ControllerManagerMetrics, error) {
if !g.grabFromControllerManager {
return ControllerManagerMetrics{}, fmt.Errorf("kube-controller-manager: %w", MetricsGrabbingDisabledError)
}
var err error
g.waitForControllerManagerReadyOnce.Do(func() {
if readyErr := e2epod.WaitTimeoutForPodReadyInNamespace(ctx, g.client, g.kubeControllerManager, metav1.NamespaceSystem, 5*time.Minute); readyErr != nil {
err = fmt.Errorf("error waiting for kube-controller-manager pod to be ready: %w", readyErr)
}
})
if err != nil {
return ControllerManagerMetrics{}, err
}
var output string
var lastMetricsFetchErr error
if metricsWaitErr := wait.PollUntilContextTimeout(ctx, time.Second, time.Minute, true, func(ctx context.Context) (bool, error) {
output, lastMetricsFetchErr = g.getSecureMetricsFromPod(ctx, g.kubeControllerManager, metav1.NamespaceSystem, kubeControllerManagerPort)
return lastMetricsFetchErr == nil, nil
}); metricsWaitErr != nil {
err := fmt.Errorf("error waiting for kube-controller-manager to expose metrics: %v; %v", metricsWaitErr, lastMetricsFetchErr)
return ControllerManagerMetrics{}, err
}
return parseControllerManagerMetrics(output)
}
// GrabFromSnapshotController returns metrics from the snapshot controller
func (g *Grabber) GrabFromSnapshotController(ctx context.Context, podName string, port int) (SnapshotControllerMetrics, error) {
if !g.grabFromSnapshotController {
return SnapshotControllerMetrics{}, fmt.Errorf("volume-snapshot-controller: %w", MetricsGrabbingDisabledError)
}
// Use overrides if provided via test config flags.
// Otherwise, use the default volume-snapshot-controller pod name and port.
if podName == "" {
podName = g.snapshotController
}
if port == 0 {
port = snapshotControllerPort
}
var err error
g.waitForSnapshotControllerReadyOnce.Do(func() {
if readyErr := e2epod.WaitTimeoutForPodReadyInNamespace(ctx, g.client, podName, metav1.NamespaceSystem, 5*time.Minute); readyErr != nil {
err = fmt.Errorf("error waiting for volume-snapshot-controller pod to be ready: %w", readyErr)
}
})
if err != nil {
return SnapshotControllerMetrics{}, err
}
var output string
var lastMetricsFetchErr error
if metricsWaitErr := wait.PollUntilContextTimeout(ctx, time.Second, time.Minute, true, func(ctx context.Context) (bool, error) {
output, lastMetricsFetchErr = g.getMetricsFromPod(ctx, g.client, podName, metav1.NamespaceSystem, port)
return lastMetricsFetchErr == nil, nil
}); metricsWaitErr != nil {
err = fmt.Errorf("error waiting for volume-snapshot-controller pod to expose metrics: %v; %v", metricsWaitErr, lastMetricsFetchErr)
return SnapshotControllerMetrics{}, err
}
return parseSnapshotControllerMetrics(output)
}
// GrabFromAPIServer returns metrics from API server
func (g *Grabber) GrabFromAPIServer(ctx context.Context) (APIServerMetrics, error) {
output, err := g.getMetricsFromAPIServer(ctx)
if err != nil {
return APIServerMetrics{}, err
}
return parseAPIServerMetrics(output)
}
// GrabMetricsSLIsFromAPIServer returns SLI metrics from the API server
func (g *Grabber) GrabMetricsSLIsFromAPIServer(ctx context.Context) (APIServerMetrics, error) {
output, err := g.getMetricsSLIsFromAPIServer(ctx)
if err != nil {
return APIServerMetrics{}, err
}
return parseAPIServerMetrics(output)
}
func (g *Grabber) getMetricsFromAPIServer(ctx context.Context) (string, error) {
rawOutput, err := g.client.CoreV1().RESTClient().Get().RequestURI("/metrics").Do(ctx).Raw()
if err != nil {
return "", err
}
return string(rawOutput), nil
}
func (g *Grabber) getMetricsSLIsFromAPIServer(ctx context.Context) (string, error) {
rawOutput, err := g.client.CoreV1().RESTClient().Get().RequestURI("/metrics/slis").Do(ctx).Raw()
if err != nil {
return "", err
}
return string(rawOutput), nil
}
// Grab returns metrics from corresponding component
func (g *Grabber) Grab(ctx context.Context) (Collection, error) {
result := Collection{}
var errs []error
if g.grabFromAPIServer {
metrics, err := g.GrabFromAPIServer(ctx)
if err != nil {
errs = append(errs, err)
} else {
result.APIServerMetrics = metrics
}
metrics, err = g.GrabMetricsSLIsFromAPIServer(ctx)
if err != nil {
errs = append(errs, err)
} else {
result.APIServerMetricsSLIs = metrics
}
}
if g.grabFromScheduler {
metrics, err := g.GrabFromScheduler(ctx)
if err != nil {
errs = append(errs, err)
} else {
result.SchedulerMetrics = metrics
}
}
if g.grabFromControllerManager {
metrics, err := g.GrabFromControllerManager(ctx)
if err != nil {
errs = append(errs, err)
} else {
result.ControllerManagerMetrics = metrics
}
}
if g.grabFromSnapshotController {
metrics, err := g.GrabFromSnapshotController(ctx, g.snapshotController, snapshotControllerPort)
if err != nil {
errs = append(errs, err)
} else {
result.SnapshotControllerMetrics = metrics
}
}
if g.grabFromClusterAutoscaler {
metrics, err := g.GrabFromClusterAutoscaler(ctx)
if err != nil {
errs = append(errs, err)
} else {
result.ClusterAutoscalerMetrics = metrics
}
}
if g.grabFromKubelets {
result.KubeletMetrics = make(map[string]KubeletMetrics)
nodes, err := g.client.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
if err != nil {
errs = append(errs, err)
} else {
for _, node := range nodes.Items {
kubeletPort := node.Status.DaemonEndpoints.KubeletEndpoint.Port
metrics, err := g.grabFromKubeletInternal(ctx, node.Name, int(kubeletPort), "metrics")
if err != nil {
errs = append(errs, err)
}
result.KubeletMetrics[node.Name] = metrics
}
}
}
if len(errs) > 0 {
return result, fmt.Errorf("Errors while grabbing metrics: %v", errs)
}
return result, nil
}
// getMetricsFromPod retrieves metrics data from an insecure port.
func (g *Grabber) getMetricsFromPod(ctx context.Context, client clientset.Interface, podName string, namespace string, port int) (string, error) {
rawOutput, err := client.CoreV1().RESTClient().Get().
Namespace(namespace).
Resource("pods").
SubResource("proxy").
Name(fmt.Sprintf("%s:%d", podName, port)).
Suffix("metrics").
Do(ctx).Raw()
if err != nil {
return "", err
}
return string(rawOutput), nil
}
// getSecureMetricsFromPod retrieves metrics from a pod that uses TLS
// and checks client credentials. Conceptually this function is
// similar to "kubectl port-forward" + "kubectl get --raw
// https://localhost:<port>/metrics". It uses the same credentials
// as kubelet.
func (g *Grabber) getSecureMetricsFromPod(ctx context.Context, podName string, namespace string, port int) (string, error) {
dialer := e2epod.NewDialer(g.client, g.config)
metricConfig := rest.CopyConfig(g.config)
addr := e2epod.Addr{
Namespace: namespace,
PodName: podName,
Port: port,
}
metricConfig.Dial = func(ctx context.Context, network, address string) (net.Conn, error) {
return dialer.DialContainerPort(ctx, addr)
}
// This should make it possible to verify the server, but while it
// got past the server name check, certificate validation
// still failed.
metricConfig.Host = addr.String()
metricConfig.ServerName = "localhost"
// Verifying the pod certificate with the same root CA
// as for the API server led to an error about "unknown root
// certificate". Disabling certificate checking on the client
// side gets around that and should be good enough for
// E2E testing.
metricConfig.Insecure = true
metricConfig.CAFile = ""
metricConfig.CAData = nil
// clientset.NewForConfig is used because
// metricClient.RESTClient() is directly usable, in contrast
// to the client constructed by rest.RESTClientFor().
metricClient, err := clientset.NewForConfig(metricConfig)
if err != nil {
return "", err
}
rawOutput, err := metricClient.RESTClient().Get().
AbsPath("metrics").
Do(ctx).Raw()
if err != nil {
return "", err
}
return string(rawOutput), nil
}

View File

@ -0,0 +1,29 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import (
"time"
)
// LatencyMetric is a struct for dashboard metrics.
type LatencyMetric struct {
Perc50 time.Duration `json:"Perc50"`
Perc90 time.Duration `json:"Perc90"`
Perc99 time.Duration `json:"Perc99"`
Perc100 time.Duration `json:"Perc100"`
}

View File

@ -0,0 +1,40 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import "k8s.io/component-base/metrics/testutil"
// SchedulerMetrics is metrics for scheduler
type SchedulerMetrics testutil.Metrics
// Equal returns true if all metrics are the same as the arguments.
func (m *SchedulerMetrics) Equal(o SchedulerMetrics) bool {
return (*testutil.Metrics)(m).Equal(testutil.Metrics(o))
}
func newSchedulerMetrics() SchedulerMetrics {
result := testutil.NewMetrics()
return SchedulerMetrics(result)
}
func parseSchedulerMetrics(data string) (SchedulerMetrics, error) {
result := newSchedulerMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return SchedulerMetrics{}, err
}
return result, nil
}
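// countSchedulerSamples is an illustrative sketch (the helper name is made up
// for this example) showing how grabbed SchedulerMetrics can be inspected:
// the underlying testutil.Metrics type maps each metric name to the samples
// scraped for it.
func countSchedulerSamples(m SchedulerMetrics) map[string]int {
	counts := map[string]int{}
	for name, samples := range testutil.Metrics(m) {
		counts[name] = len(samples)
	}
	return counts
}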

View File

@ -0,0 +1,40 @@
/*
Copyright 2021 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package metrics
import "k8s.io/component-base/metrics/testutil"
// SnapshotControllerMetrics is metrics for the snapshot controller
type SnapshotControllerMetrics testutil.Metrics
// Equal returns true if all metrics are the same as the arguments.
func (m *SnapshotControllerMetrics) Equal(o SnapshotControllerMetrics) bool {
return (*testutil.Metrics)(m).Equal(testutil.Metrics(o))
}
func newSnapshotControllerMetrics() SnapshotControllerMetrics {
result := testutil.NewMetrics()
return SnapshotControllerMetrics(result)
}
func parseSnapshotControllerMetrics(data string) (SnapshotControllerMetrics, error) {
result := newSnapshotControllerMetrics()
if err := testutil.ParseMetrics(data, (*testutil.Metrics)(&result)); err != nil {
return SnapshotControllerMetrics{}, err
}
return result, nil
}

View File

@ -0,0 +1,49 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
// NamespacedName comprises a resource name, with a mandatory namespace,
// rendered as "<namespace>/<name>". It implements NamedObject and thus can be
// used as function parameter instead of a full API object.
type NamespacedName struct {
Namespace string
Name string
}
var _ NamedObject = NamespacedName{}
// NamedObject is a subset of metav1.Object which provides read-only access
// to name and namespace of an object.
type NamedObject interface {
GetNamespace() string
GetName() string
}
// GetNamespace implements NamedObject.
func (n NamespacedName) GetNamespace() string {
return n.Namespace
}
// GetName implements NamedObject.
func (n NamespacedName) GetName() string {
return n.Name
}
// String returns the general purpose string representation
func (n NamespacedName) String() string {
return n.Namespace + "/" + n.Name
}

View File

@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except of k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more careful.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,219 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package node
import (
"context"
"encoding/json"
"fmt"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/client-go/util/retry"
"k8s.io/kubernetes/test/e2e/framework"
testutils "k8s.io/kubernetes/test/utils"
)
const (
// Minimal number of nodes for the cluster to be considered large.
largeClusterThreshold = 100
)
// WaitForAllNodesSchedulable waits up to timeout for all nodes (except up to
// TestContext.AllowedNotReadyNodes of them) to become schedulable.
func WaitForAllNodesSchedulable(ctx context.Context, c clientset.Interface, timeout time.Duration) error {
if framework.TestContext.AllowedNotReadyNodes == -1 {
return nil
}
framework.Logf("Waiting up to %v for all (but %d) nodes to be schedulable", timeout, framework.TestContext.AllowedNotReadyNodes)
return wait.PollUntilContextTimeout(
ctx,
30*time.Second,
timeout,
true,
CheckReadyForTests(ctx, c, framework.TestContext.NonblockingTaints, framework.TestContext.AllowedNotReadyNodes, largeClusterThreshold),
)
}
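// exampleGateOnSchedulableNodes is an illustrative sketch (the helper name and
// the 10 minute timeout are example values) of a typical suite setup step:
// block until enough nodes are schedulable before running specs.
func exampleGateOnSchedulableNodes(ctx context.Context, c clientset.Interface) {
	framework.ExpectNoError(WaitForAllNodesSchedulable(ctx, c, 10*time.Minute))
}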
// AddOrUpdateLabelOnNode adds the given label key and value to the given node or updates value.
func AddOrUpdateLabelOnNode(c clientset.Interface, nodeName string, labelKey, labelValue string) {
framework.ExpectNoError(testutils.AddLabelsToNode(c, nodeName, map[string]string{labelKey: labelValue}))
}
// ExpectNodeHasLabel expects that the given node has the given label pair.
func ExpectNodeHasLabel(ctx context.Context, c clientset.Interface, nodeName string, labelKey string, labelValue string) {
ginkgo.By("verifying the node has the label " + labelKey + " " + labelValue)
node, err := c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
framework.ExpectNoError(err)
gomega.Expect(node.Labels).To(gomega.HaveKeyWithValue(labelKey, labelValue))
}
// RemoveLabelOffNode is for cleaning up labels temporarily added to node,
// won't fail if target label doesn't exist or has been removed.
func RemoveLabelOffNode(c clientset.Interface, nodeName string, labelKey string) {
ginkgo.By("removing the label " + labelKey + " off the node " + nodeName)
framework.ExpectNoError(testutils.RemoveLabelOffNode(c, nodeName, []string{labelKey}))
ginkgo.By("verifying the node doesn't have the label " + labelKey)
framework.ExpectNoError(testutils.VerifyLabelsRemoved(c, nodeName, []string{labelKey}))
}
// ExpectNodeHasTaint expects that the node has the given taint.
func ExpectNodeHasTaint(ctx context.Context, c clientset.Interface, nodeName string, taint *v1.Taint) {
ginkgo.By("verifying the node has the taint " + taint.ToString())
if has, err := NodeHasTaint(ctx, c, nodeName, taint); !has {
framework.ExpectNoError(err)
framework.Failf("Failed to find taint %s on node %s", taint.ToString(), nodeName)
}
}
// NodeHasTaint returns true if the node has the given taint, else returns false.
func NodeHasTaint(ctx context.Context, c clientset.Interface, nodeName string, taint *v1.Taint) (bool, error) {
node, err := c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
if err != nil {
return false, err
}
nodeTaints := node.Spec.Taints
if len(nodeTaints) == 0 || !taintExists(nodeTaints, taint) {
return false, nil
}
return true, nil
}
// AllNodesReady checks whether all registered nodes are ready. Setting -1 on
// framework.TestContext.AllowedNotReadyNodes will bypass the post test node readiness check.
// TODO: we should change the AllNodesReady call in AfterEach to WaitForAllNodesHealthy,
// and figure out how to do it in a configurable way, as we can't expect all setups to run
// default test add-ons.
func AllNodesReady(ctx context.Context, c clientset.Interface, timeout time.Duration) error {
if err := allNodesReady(ctx, c, timeout); err != nil {
return fmt.Errorf("checking for ready nodes: %w", err)
}
return nil
}
func allNodesReady(ctx context.Context, c clientset.Interface, timeout time.Duration) error {
if framework.TestContext.AllowedNotReadyNodes == -1 {
return nil
}
framework.Logf("Waiting up to %v for all (but %d) nodes to be ready", timeout, framework.TestContext.AllowedNotReadyNodes)
var notReady []*v1.Node
err := wait.PollUntilContextTimeout(ctx, framework.Poll, timeout, true, func(ctx context.Context) (bool, error) {
notReady = nil
// It should be OK to list unschedulable Nodes here.
nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
if err != nil {
return false, err
}
for i := range nodes.Items {
node := &nodes.Items[i]
if !IsConditionSetAsExpected(node, v1.NodeReady, true) {
notReady = append(notReady, node)
}
}
// Framework allows for <TestContext.AllowedNotReadyNodes> nodes to be non-ready,
// to make it possible e.g. for incorrect deployment of some small percentage
// of nodes (which we allow in cluster validation). Some nodes that are not
// provisioned correctly at startup will never become ready (e.g. when something
// won't install correctly), so we can't expect them to be ready at any point.
return len(notReady) <= framework.TestContext.AllowedNotReadyNodes, nil
})
if err != nil && !wait.Interrupted(err) {
return err
}
if len(notReady) > framework.TestContext.AllowedNotReadyNodes {
msg := ""
for _, node := range notReady {
msg = fmt.Sprintf("%s, %s", msg, node.Name)
}
return fmt.Errorf("Not ready nodes: %#v", msg)
}
return nil
}
// taintExists checks if the given taint exists in the list of taints. It returns true if it exists, false otherwise.
func taintExists(taints []v1.Taint, taintToFind *v1.Taint) bool {
for _, taint := range taints {
if taint.MatchTaint(taintToFind) {
return true
}
}
return false
}
// IsARM64 checks whether the k8s Node has arm64 arch.
func IsARM64(node *v1.Node) bool {
arch, ok := node.Labels["kubernetes.io/arch"]
if ok {
return arch == "arm64"
}
return false
}
// AddExtendedResource adds a fake resource to the Node.
func AddExtendedResource(ctx context.Context, clientSet clientset.Interface, nodeName string, extendedResourceName v1.ResourceName, extendedResourceQuantity resource.Quantity) {
extendedResource := v1.ResourceName(extendedResourceName)
ginkgo.By("Adding a custom resource")
extendedResourceList := v1.ResourceList{
extendedResource: extendedResourceQuantity,
}
patchPayload, err := json.Marshal(v1.Node{
Status: v1.NodeStatus{
Capacity: extendedResourceList,
Allocatable: extendedResourceList,
},
})
framework.ExpectNoError(err, "Failed to marshal node JSON")
_, err = clientSet.CoreV1().Nodes().Patch(ctx, nodeName, types.StrategicMergePatchType, []byte(patchPayload), metav1.PatchOptions{}, "status")
framework.ExpectNoError(err)
}
// RemoveExtendedResource removes a fake resource from the Node.
func RemoveExtendedResource(ctx context.Context, clientSet clientset.Interface, nodeName string, extendedResourceName v1.ResourceName) {
extendedResource := v1.ResourceName(extendedResourceName)
ginkgo.By("Removing a custom resource")
err := retry.RetryOnConflict(retry.DefaultBackoff, func() error {
node, err := clientSet.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("failed to get node %s: %w", nodeName, err)
}
delete(node.Status.Capacity, extendedResource)
delete(node.Status.Allocatable, extendedResource)
_, err = clientSet.CoreV1().Nodes().UpdateStatus(ctx, node, metav1.UpdateOptions{})
return err
})
framework.ExpectNoError(err)
}
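// exampleExtendedResourceLifecycle is an illustrative sketch (the helper name,
// resource name and quantity are example values) of the usual pairing:
// AddExtendedResource with a deferred RemoveExtendedResource, so the node
// status is restored even if the test fails in between.
func exampleExtendedResourceLifecycle(ctx context.Context, c clientset.Interface, nodeName string) {
	fakeResource := v1.ResourceName("example.com/dongle")
	AddExtendedResource(ctx, c, nodeName, fakeResource, resource.MustParse("5"))
	defer RemoveExtendedResource(ctx, c, nodeName, fakeResource)
	// ... exercise scheduling behaviour that depends on the fake resource ...
}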

View File

@ -0,0 +1,94 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package node
import (
"context"
"sync"
"time"
"github.com/onsi/ginkgo/v2"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
e2essh "k8s.io/kubernetes/test/e2e/framework/ssh"
)
// NodeKiller is a utility to simulate node failures.
type NodeKiller struct {
config framework.NodeKillerConfig
client clientset.Interface
provider string
}
// NewNodeKiller creates a new NodeKiller.
func NewNodeKiller(config framework.NodeKillerConfig, client clientset.Interface, provider string) *NodeKiller {
config.NodeKillerStopCtx, config.NodeKillerStop = context.WithCancel(context.Background())
return &NodeKiller{config, client, provider}
}
// Run starts NodeKiller until stopCh is closed.
func (k *NodeKiller) Run(ctx context.Context) {
// wait.JitterUntil starts work immediately, so wait first.
time.Sleep(wait.Jitter(k.config.Interval, k.config.JitterFactor))
wait.JitterUntilWithContext(ctx, func(ctx context.Context) {
nodes := k.pickNodes(ctx)
k.kill(ctx, nodes)
}, k.config.Interval, k.config.JitterFactor, true)
}
func (k *NodeKiller) pickNodes(ctx context.Context) []v1.Node {
nodes, err := GetReadySchedulableNodes(ctx, k.client)
framework.ExpectNoError(err)
numNodes := int(k.config.FailureRatio * float64(len(nodes.Items)))
nodes, err = GetBoundedReadySchedulableNodes(ctx, k.client, numNodes)
framework.ExpectNoError(err)
return nodes.Items
}
func (k *NodeKiller) kill(ctx context.Context, nodes []v1.Node) {
wg := sync.WaitGroup{}
wg.Add(len(nodes))
for _, node := range nodes {
node := node
go func() {
defer ginkgo.GinkgoRecover()
defer wg.Done()
framework.Logf("Stopping docker and kubelet on %q to simulate failure", node.Name)
err := e2essh.IssueSSHCommand(ctx, "sudo systemctl stop docker kubelet", k.provider, &node)
if err != nil {
framework.Logf("ERROR while stopping node %q: %v", node.Name, err)
return
}
time.Sleep(k.config.SimulatedDowntime)
framework.Logf("Rebooting %q to repair the node", node.Name)
err = e2essh.IssueSSHCommand(ctx, "sudo reboot", k.provider, &node)
if err != nil {
framework.Logf("ERROR while rebooting node %q: %v", node.Name, err)
return
}
}()
}
wg.Wait()
}
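// exampleStartNodeKiller is an illustrative sketch (the helper name and all
// config values are arbitrary examples) of running a NodeKiller in the
// background; calling the returned cancel function stops the failure loop.
func exampleStartNodeKiller(client clientset.Interface, provider string) context.CancelFunc {
	config := framework.NodeKillerConfig{
		FailureRatio:      0.1,             // fraction of ready nodes to kill per round
		Interval:          1 * time.Minute, // how often to inject failures
		JitterFactor:      0.5,
		SimulatedDowntime: 2 * time.Minute, // how long a node stays down before reboot
	}
	ctx, cancel := context.WithCancel(context.Background())
	go NewNodeKiller(config, client, provider).Run(ctx)
	return cancel
}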

View File

@ -0,0 +1,837 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package node
import (
"context"
"encoding/json"
"fmt"
"net"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/conversion"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/rand"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/strategicpatch"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
clientretry "k8s.io/client-go/util/retry"
"k8s.io/kubernetes/test/e2e/framework"
netutil "k8s.io/utils/net"
)
const (
// poll is how often to Poll pods, nodes and claims.
poll = 2 * time.Second
// singleCallTimeout is how long to try single API calls (like 'get' or 'list'). Used to prevent
// transient failures from failing tests.
singleCallTimeout = 5 * time.Minute
// ssh port
sshPort = "22"
)
var (
// unreachableTaintTemplate is the taint for when a node becomes unreachable.
// Copied from pkg/controller/nodelifecycle to avoid pulling extra dependencies
unreachableTaintTemplate = &v1.Taint{
Key: v1.TaintNodeUnreachable,
Effect: v1.TaintEffectNoExecute,
}
// notReadyTaintTemplate is the taint for when a node is not ready for executing pods.
// Copied from pkg/controller/nodelifecycle to avoid pulling extra dependencies
notReadyTaintTemplate = &v1.Taint{
Key: v1.TaintNodeNotReady,
Effect: v1.TaintEffectNoExecute,
}
// updateTaintBackOff contains the maximum retries and the wait interval between two retries.
updateTaintBackOff = wait.Backoff{
Steps: 5,
Duration: 100 * time.Millisecond,
Jitter: 1.0,
}
)
// PodNode is a pod-node pair indicating which node a given pod is running on
type PodNode struct {
// Pod represents pod name
Pod string
// Node represents node name
Node string
}
// FirstAddress returns the first address of the given type found across the nodes in the list, or an empty string if none exists.
func FirstAddress(nodelist *v1.NodeList, addrType v1.NodeAddressType) string {
for _, n := range nodelist.Items {
for _, addr := range n.Status.Addresses {
if addr.Type == addrType && addr.Address != "" {
return addr.Address
}
}
}
return ""
}
func isNodeConditionSetAsExpected(node *v1.Node, conditionType v1.NodeConditionType, wantTrue, silent bool) bool {
// Check the node readiness condition (logging all).
for _, cond := range node.Status.Conditions {
// Ensure that the condition type and the status matches as desired.
if cond.Type == conditionType {
// For NodeReady condition we need to check Taints as well
if cond.Type == v1.NodeReady {
hasNodeControllerTaints := false
// For NodeReady we need to check if Taints are gone as well
taints := node.Spec.Taints
for _, taint := range taints {
if taint.MatchTaint(unreachableTaintTemplate) || taint.MatchTaint(notReadyTaintTemplate) {
hasNodeControllerTaints = true
break
}
}
if wantTrue {
if (cond.Status == v1.ConditionTrue) && !hasNodeControllerTaints {
return true
}
msg := ""
if !hasNodeControllerTaints {
msg = fmt.Sprintf("Condition %s of node %s is %v instead of %t. Reason: %v, message: %v",
conditionType, node.Name, cond.Status == v1.ConditionTrue, wantTrue, cond.Reason, cond.Message)
} else {
msg = fmt.Sprintf("Condition %s of node %s is %v, but Node is tainted by NodeController with %v. Failure",
conditionType, node.Name, cond.Status == v1.ConditionTrue, taints)
}
if !silent {
framework.Logf("%s", msg)
}
return false
}
// TODO: check if the Node is tainted once we enable NC notReady/unreachable taints by default
if cond.Status != v1.ConditionTrue {
return true
}
if !silent {
framework.Logf("Condition %s of node %s is %v instead of %t. Reason: %v, message: %v",
conditionType, node.Name, cond.Status == v1.ConditionTrue, wantTrue, cond.Reason, cond.Message)
}
return false
}
if (wantTrue && (cond.Status == v1.ConditionTrue)) || (!wantTrue && (cond.Status != v1.ConditionTrue)) {
return true
}
if !silent {
framework.Logf("Condition %s of node %s is %v instead of %t. Reason: %v, message: %v",
conditionType, node.Name, cond.Status == v1.ConditionTrue, wantTrue, cond.Reason, cond.Message)
}
return false
}
}
if !silent {
framework.Logf("Couldn't find condition %v on node %v", conditionType, node.Name)
}
return false
}
// IsConditionSetAsExpected returns true if the node's condition of the given conditionType matches wantTrue; otherwise it returns false and logs the mismatch in detail.
func IsConditionSetAsExpected(node *v1.Node, conditionType v1.NodeConditionType, wantTrue bool) bool {
return isNodeConditionSetAsExpected(node, conditionType, wantTrue, false)
}
// IsConditionSetAsExpectedSilent is like IsConditionSetAsExpected, but does not log the mismatch.
func IsConditionSetAsExpectedSilent(node *v1.Node, conditionType v1.NodeConditionType, wantTrue bool) bool {
return isNodeConditionSetAsExpected(node, conditionType, wantTrue, true)
}
// isConditionUnset returns true if conditions of the given node do not have a match to the given conditionType, otherwise false.
func isConditionUnset(node *v1.Node, conditionType v1.NodeConditionType) bool {
for _, cond := range node.Status.Conditions {
if cond.Type == conditionType {
return false
}
}
return true
}
// Filter filters nodes in NodeList in place, removing nodes that do not
// satisfy the given condition
func Filter(nodeList *v1.NodeList, fn func(node v1.Node) bool) {
var l []v1.Node
for _, node := range nodeList.Items {
if fn(node) {
l = append(l, node)
}
}
nodeList.Items = l
}
// TotalRegistered returns number of schedulable Nodes.
func TotalRegistered(ctx context.Context, c clientset.Interface) (int, error) {
nodes, err := waitListSchedulableNodes(ctx, c)
if err != nil {
framework.Logf("Failed to list nodes: %v", err)
return 0, err
}
return len(nodes.Items), nil
}
// TotalReady returns number of ready schedulable Nodes.
func TotalReady(ctx context.Context, c clientset.Interface) (int, error) {
nodes, err := waitListSchedulableNodes(ctx, c)
if err != nil {
framework.Logf("Failed to list nodes: %v", err)
return 0, err
}
// Filter out not-ready nodes.
Filter(nodes, func(node v1.Node) bool {
return IsConditionSetAsExpected(&node, v1.NodeReady, true)
})
return len(nodes.Items), nil
}
// GetSSHExternalIP returns node external IP concatenated with port 22 for ssh
// e.g. 1.2.3.4:22
func GetSSHExternalIP(node *v1.Node) (string, error) {
framework.Logf("Getting external IP address for %s", node.Name)
for _, a := range node.Status.Addresses {
if a.Type == v1.NodeExternalIP && a.Address != "" {
return net.JoinHostPort(a.Address, sshPort), nil
}
}
return "", fmt.Errorf("Couldn't get the external IP of host %s with addresses %v", node.Name, node.Status.Addresses)
}
// GetSSHInternalIP returns node internal IP concatenated with port 22 for ssh
func GetSSHInternalIP(node *v1.Node) (string, error) {
for _, address := range node.Status.Addresses {
if address.Type == v1.NodeInternalIP && address.Address != "" {
return net.JoinHostPort(address.Address, sshPort), nil
}
}
return "", fmt.Errorf("Couldn't get the internal IP of host %s with addresses %v", node.Name, node.Status.Addresses)
}
// FirstAddressByTypeAndFamily returns the first address in the node list that matches the given type and family
func FirstAddressByTypeAndFamily(nodelist *v1.NodeList, addrType v1.NodeAddressType, family v1.IPFamily) string {
for _, n := range nodelist.Items {
addresses := GetAddressesByTypeAndFamily(&n, addrType, family)
if len(addresses) > 0 {
return addresses[0]
}
}
return ""
}
// GetAddressesByTypeAndFamily returns a list of addresses of the given addressType for the given node
// and filtered by IPFamily
func GetAddressesByTypeAndFamily(node *v1.Node, addressType v1.NodeAddressType, family v1.IPFamily) (ips []string) {
for _, nodeAddress := range node.Status.Addresses {
if nodeAddress.Type != addressType {
continue
}
if nodeAddress.Address == "" {
continue
}
if family == v1.IPv6Protocol && netutil.IsIPv6String(nodeAddress.Address) {
ips = append(ips, nodeAddress.Address)
}
if family == v1.IPv4Protocol && !netutil.IsIPv6String(nodeAddress.Address) {
ips = append(ips, nodeAddress.Address)
}
}
return
}
// GetAddresses returns a list of addresses of the given addressType for the given node
func GetAddresses(node *v1.Node, addressType v1.NodeAddressType) (ips []string) {
for j := range node.Status.Addresses {
nodeAddress := &node.Status.Addresses[j]
if nodeAddress.Type == addressType && nodeAddress.Address != "" {
ips = append(ips, nodeAddress.Address)
}
}
return
}
// CollectAddresses returns a list of addresses of the given addressType for the given list of nodes
func CollectAddresses(nodes *v1.NodeList, addressType v1.NodeAddressType) []string {
ips := []string{}
for i := range nodes.Items {
ips = append(ips, GetAddresses(&nodes.Items[i], addressType)...)
}
return ips
}
// PickIP picks one public node IP
func PickIP(ctx context.Context, c clientset.Interface) (string, error) {
publicIps, err := GetPublicIps(ctx, c)
if err != nil {
return "", fmt.Errorf("get node public IPs error: %w", err)
}
if len(publicIps) == 0 {
return "", fmt.Errorf("got unexpected number (%d) of public IPs", len(publicIps))
}
ip := publicIps[0]
return ip, nil
}
// GetPublicIps returns a public IP list of nodes.
func GetPublicIps(ctx context.Context, c clientset.Interface) ([]string, error) {
nodes, err := GetReadySchedulableNodes(ctx, c)
if err != nil {
return nil, fmt.Errorf("get schedulable and ready nodes error: %w", err)
}
ips := CollectAddresses(nodes, v1.NodeExternalIP)
if len(ips) == 0 {
// If ExternalIP isn't set, assume the test programs can reach the InternalIP
ips = CollectAddresses(nodes, v1.NodeInternalIP)
}
return ips, nil
}
// GetReadySchedulableNodes addresses the common use case of getting nodes you can do work on.
// 1) Needs to be schedulable.
// 2) Needs to be ready.
// If EITHER 1 or 2 is not true, most tests will want to ignore the node entirely.
// If there are no nodes that are both ready and schedulable, this will return an error.
func GetReadySchedulableNodes(ctx context.Context, c clientset.Interface) (nodes *v1.NodeList, err error) {
nodes, err = checkWaitListSchedulableNodes(ctx, c)
if err != nil {
return nil, fmt.Errorf("listing schedulable nodes error: %w", err)
}
Filter(nodes, func(node v1.Node) bool {
return IsNodeSchedulable(&node) && isNodeUntainted(&node)
})
if len(nodes.Items) == 0 {
return nil, fmt.Errorf("there are currently no ready, schedulable nodes in the cluster")
}
return nodes, nil
}
// GetBoundedReadySchedulableNodes is like GetReadySchedulableNodes except that it returns
// at most maxNodes nodes. Use this to keep your test case from blowing up when run on a
// large cluster.
func GetBoundedReadySchedulableNodes(ctx context.Context, c clientset.Interface, maxNodes int) (nodes *v1.NodeList, err error) {
nodes, err = GetReadySchedulableNodes(ctx, c)
if err != nil {
return nil, err
}
if len(nodes.Items) > maxNodes {
shuffled := make([]v1.Node, maxNodes)
perm := rand.Perm(len(nodes.Items))
for i, j := range perm {
if j < len(shuffled) {
shuffled[j] = nodes.Items[i]
}
}
nodes.Items = shuffled
}
return nodes, nil
}
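// exampleTwoWorkerNodes is an illustrative sketch (the helper name and node
// count are example values): tests that only need a fixed number of nodes
// should use the bounded helper so they behave the same on very large clusters.
func exampleTwoWorkerNodes(ctx context.Context, c clientset.Interface) ([]v1.Node, error) {
	nodes, err := GetBoundedReadySchedulableNodes(ctx, c, 2)
	if err != nil {
		return nil, err
	}
	return nodes.Items, nil
}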
// GetRandomReadySchedulableNode gets a single randomly-selected node which is available for
// running pods on. If there are no available nodes it will return an error.
func GetRandomReadySchedulableNode(ctx context.Context, c clientset.Interface) (*v1.Node, error) {
nodes, err := GetReadySchedulableNodes(ctx, c)
if err != nil {
return nil, err
}
return &nodes.Items[rand.Intn(len(nodes.Items))], nil
}
// GetReadyNodesIncludingTainted returns all ready nodes, even those which are tainted.
// There are cases when we care about tainted nodes
// E.g. in tests related to nodes with gpu we care about nodes despite
// presence of nvidia.com/gpu=present:NoSchedule taint
func GetReadyNodesIncludingTainted(ctx context.Context, c clientset.Interface) (nodes *v1.NodeList, err error) {
nodes, err = checkWaitListSchedulableNodes(ctx, c)
if err != nil {
return nil, fmt.Errorf("listing schedulable nodes error: %w", err)
}
Filter(nodes, func(node v1.Node) bool {
return IsNodeSchedulable(&node)
})
return nodes, nil
}
// isNodeUntainted tests whether a fake pod can be scheduled on "node", given its current taints.
// TODO: need to discuss whether to return bool and error type
func isNodeUntainted(node *v1.Node) bool {
return isNodeUntaintedWithNonblocking(node, "")
}
// isNodeUntaintedWithNonblocking tests whether a fake pod can be scheduled on "node"
// but allows for taints in the list of non-blocking taints.
func isNodeUntaintedWithNonblocking(node *v1.Node, nonblockingTaints string) bool {
// Simple lookup for nonblocking taints based on comma-delimited list.
nonblockingTaintsMap := map[string]struct{}{}
for _, t := range strings.Split(nonblockingTaints, ",") {
if strings.TrimSpace(t) != "" {
nonblockingTaintsMap[strings.TrimSpace(t)] = struct{}{}
}
}
n := node
if len(nonblockingTaintsMap) > 0 {
nodeCopy := node.DeepCopy()
nodeCopy.Spec.Taints = []v1.Taint{}
for _, v := range node.Spec.Taints {
if _, isNonblockingTaint := nonblockingTaintsMap[v.Key]; !isNonblockingTaint {
nodeCopy.Spec.Taints = append(nodeCopy.Spec.Taints, v)
}
}
n = nodeCopy
}
return toleratesTaintsWithNoScheduleNoExecuteEffects(n.Spec.Taints, nil)
}
func toleratesTaintsWithNoScheduleNoExecuteEffects(taints []v1.Taint, tolerations []v1.Toleration) bool {
filteredTaints := []v1.Taint{}
for _, taint := range taints {
if taint.Effect == v1.TaintEffectNoExecute || taint.Effect == v1.TaintEffectNoSchedule {
filteredTaints = append(filteredTaints, taint)
}
}
toleratesTaint := func(taint v1.Taint) bool {
for _, toleration := range tolerations {
if toleration.ToleratesTaint(&taint) {
return true
}
}
return false
}
for _, taint := range filteredTaints {
if !toleratesTaint(taint) {
return false
}
}
return true
}
// IsNodeSchedulable returns true if:
// 1) doesn't have "unschedulable" field set
// 2) it also returns true from IsNodeReady
func IsNodeSchedulable(node *v1.Node) bool {
if node == nil {
return false
}
return !node.Spec.Unschedulable && IsNodeReady(node)
}
// IsNodeReady returns true if:
// 1) its Ready condition is set to true
// 2) it doesn't have the NetworkUnavailable condition set to true
func IsNodeReady(node *v1.Node) bool {
nodeReady := IsConditionSetAsExpected(node, v1.NodeReady, true)
networkReady := isConditionUnset(node, v1.NodeNetworkUnavailable) ||
IsConditionSetAsExpectedSilent(node, v1.NodeNetworkUnavailable, false)
return nodeReady && networkReady
}
// isNodeSchedulableWithoutTaints returns true if:
// 1) doesn't have "unschedulable" field set
// 2) it also returns true from IsNodeReady
// 3) it also returns true from isNodeUntainted
func isNodeSchedulableWithoutTaints(node *v1.Node) bool {
return IsNodeSchedulable(node) && isNodeUntainted(node)
}
// hasNonblockingTaint returns true if the node contains at least
// one taint whose key appears in the comma-delimited nonblockingTaints list.
func hasNonblockingTaint(node *v1.Node, nonblockingTaints string) bool {
if node == nil {
return false
}
// Simple lookup for nonblocking taints based on comma-delimited list.
nonblockingTaintsMap := map[string]struct{}{}
for _, t := range strings.Split(nonblockingTaints, ",") {
if strings.TrimSpace(t) != "" {
nonblockingTaintsMap[strings.TrimSpace(t)] = struct{}{}
}
}
for _, taint := range node.Spec.Taints {
if _, hasNonblockingTaint := nonblockingTaintsMap[taint.Key]; hasNonblockingTaint {
return true
}
}
return false
}
// GetNodeHeartbeatTime returns the timestamp of the last status update of the node.
func GetNodeHeartbeatTime(node *v1.Node) metav1.Time {
for _, condition := range node.Status.Conditions {
if condition.Type == v1.NodeReady {
return condition.LastHeartbeatTime
}
}
return metav1.Time{}
}
// PodNodePairs return podNode pairs for all pods in a namespace
func PodNodePairs(ctx context.Context, c clientset.Interface, ns string) ([]PodNode, error) {
var result []PodNode
podList, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
if err != nil {
return result, err
}
for _, pod := range podList.Items {
result = append(result, PodNode{
Pod: pod.Name,
Node: pod.Spec.NodeName,
})
}
return result, nil
}
// GetClusterZones returns the values of zone label collected from all nodes.
func GetClusterZones(ctx context.Context, c clientset.Interface) (sets.String, error) {
nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
if err != nil {
return nil, fmt.Errorf("Error getting nodes while attempting to list cluster zones: %w", err)
}
// collect values of zone label from all nodes
zones := sets.NewString()
for _, node := range nodes.Items {
if zone, found := node.Labels[v1.LabelFailureDomainBetaZone]; found {
zones.Insert(zone)
}
if zone, found := node.Labels[v1.LabelTopologyZone]; found {
zones.Insert(zone)
}
}
return zones, nil
}
// GetSchedulableClusterZones returns the values of zone label collected from all nodes which are schedulable.
func GetSchedulableClusterZones(ctx context.Context, c clientset.Interface) (sets.Set[string], error) {
// GetReadySchedulableNodes already filters out tainted and unschedulable nodes.
nodes, err := GetReadySchedulableNodes(ctx, c)
if err != nil {
return nil, fmt.Errorf("error getting nodes while attempting to list cluster zones: %w", err)
}
// collect values of zone label from all nodes
zones := sets.New[string]()
for _, node := range nodes.Items {
if zone, found := node.Labels[v1.LabelFailureDomainBetaZone]; found {
zones.Insert(zone)
}
if zone, found := node.Labels[v1.LabelTopologyZone]; found {
zones.Insert(zone)
}
}
return zones, nil
}
// CreatePodsPerNodeForSimpleApp creates pods w/ labels. Useful for tests which make a bunch of pods w/o any networking.
func CreatePodsPerNodeForSimpleApp(ctx context.Context, c clientset.Interface, namespace, appName string, podSpec func(n v1.Node) v1.PodSpec, maxCount int) map[string]string {
nodes, err := GetBoundedReadySchedulableNodes(ctx, c, maxCount)
// TODO use wrapper methods in expect.go after removing core e2e dependency on node
gomega.ExpectWithOffset(2, err).NotTo(gomega.HaveOccurred())
podLabels := map[string]string{
"app": appName + "-pod",
}
for i, node := range nodes.Items {
framework.Logf("%v/%v : Creating container with label app=%v-pod", i, maxCount, appName)
_, err := c.CoreV1().Pods(namespace).Create(ctx, &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf(appName+"-pod-%v", i),
Labels: podLabels,
},
Spec: podSpec(node),
}, metav1.CreateOptions{})
// TODO use wrapper methods in expect.go after removing core e2e dependency on node
gomega.ExpectWithOffset(2, err).NotTo(gomega.HaveOccurred())
}
return podLabels
}
// RemoveTaintsOffNode removes a list of taints from the given node
// It is simply a helper wrapper for RemoveTaintOffNode
func RemoveTaintsOffNode(ctx context.Context, c clientset.Interface, nodeName string, taints []v1.Taint) {
for _, taint := range taints {
RemoveTaintOffNode(ctx, c, nodeName, taint)
}
}
// RemoveTaintOffNode removes the given taint from the given node.
func RemoveTaintOffNode(ctx context.Context, c clientset.Interface, nodeName string, taint v1.Taint) {
err := removeNodeTaint(ctx, c, nodeName, nil, &taint)
// TODO use wrapper methods in expect.go after removing core e2e dependency on node
gomega.ExpectWithOffset(2, err).NotTo(gomega.HaveOccurred())
verifyThatTaintIsGone(ctx, c, nodeName, &taint)
}
// AddOrUpdateTaintOnNode adds the given taint to the given node or updates taint.
func AddOrUpdateTaintOnNode(ctx context.Context, c clientset.Interface, nodeName string, taint v1.Taint) {
// TODO use wrapper methods in expect.go after removing the dependency on this
// package from the core e2e framework.
err := addOrUpdateTaintOnNode(ctx, c, nodeName, &taint)
gomega.ExpectWithOffset(2, err).NotTo(gomega.HaveOccurred())
}
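// exampleTemporaryTaint is an illustrative sketch (the helper name, taint key
// and value are example data) of the common pairing of AddOrUpdateTaintOnNode
// with a deferred RemoveTaintOffNode so the node is restored after the test.
func exampleTemporaryTaint(ctx context.Context, c clientset.Interface, nodeName string) {
	taint := v1.Taint{
		Key:    "example.com/under-test",
		Value:  "true",
		Effect: v1.TaintEffectNoSchedule,
	}
	AddOrUpdateTaintOnNode(ctx, c, nodeName, taint)
	defer RemoveTaintOffNode(ctx, c, nodeName, taint)
	// ... run the scenario that relies on the taint ...
}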
// addOrUpdateTaintOnNode adds taints to the node. If a taint was added, it issues an API call
// to update the node; otherwise, no API call is made. Returns an error if any.
// copied from pkg/controller/controller_utils.go AddOrUpdateTaintOnNode()
func addOrUpdateTaintOnNode(ctx context.Context, c clientset.Interface, nodeName string, taints ...*v1.Taint) error {
if len(taints) == 0 {
return nil
}
firstTry := true
return clientretry.RetryOnConflict(updateTaintBackOff, func() error {
var err error
var oldNode *v1.Node
// First we try getting node from the API server cache, as it's cheaper. If it fails
// we get it from etcd to be sure to have fresh data.
if firstTry {
oldNode, err = c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{ResourceVersion: "0"})
firstTry = false
} else {
oldNode, err = c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
}
if err != nil {
return err
}
var newNode *v1.Node
oldNodeCopy := oldNode
updated := false
for _, taint := range taints {
curNewNode, ok, err := addOrUpdateTaint(oldNodeCopy, taint)
if err != nil {
return fmt.Errorf("failed to update taint of node")
}
updated = updated || ok
newNode = curNewNode
oldNodeCopy = curNewNode
}
if !updated {
return nil
}
return patchNodeTaints(ctx, c, nodeName, oldNode, newNode)
})
}
// addOrUpdateTaint tries to add a taint to the node's taint list. It returns a new copy of the updated Node and true if something was updated,
// false otherwise.
// copied from pkg/util/taints/taints.go AddOrUpdateTaint()
func addOrUpdateTaint(node *v1.Node, taint *v1.Taint) (*v1.Node, bool, error) {
newNode := node.DeepCopy()
nodeTaints := newNode.Spec.Taints
var newTaints []v1.Taint
updated := false
for i := range nodeTaints {
if taint.MatchTaint(&nodeTaints[i]) {
if semantic.DeepEqual(*taint, nodeTaints[i]) {
return newNode, false, nil
}
newTaints = append(newTaints, *taint)
updated = true
continue
}
newTaints = append(newTaints, nodeTaints[i])
}
if !updated {
newTaints = append(newTaints, *taint)
}
newNode.Spec.Taints = newTaints
return newNode, true, nil
}
// semantic can do semantic deep equality checks for core objects.
// Example: apiequality.Semantic.DeepEqual(aPod, aPodWithNonNilButEmptyMaps) == true
// copied from pkg/apis/core/helper/helpers.go Semantic
var semantic = conversion.EqualitiesOrDie(
func(a, b resource.Quantity) bool {
// Ignore formatting, only care that numeric value stayed the same.
// TODO: if we decide it's important, it should be safe to start comparing the format.
//
// Uninitialized quantities are equivalent to 0 quantities.
return a.Cmp(b) == 0
},
func(a, b metav1.MicroTime) bool {
return a.UTC() == b.UTC()
},
func(a, b metav1.Time) bool {
return a.UTC() == b.UTC()
},
func(a, b labels.Selector) bool {
return a.String() == b.String()
},
func(a, b fields.Selector) bool {
return a.String() == b.String()
},
)
// removeNodeTaint is for cleaning up taints temporarily added to node,
// won't fail if target taint doesn't exist or has been removed.
// If passed a node it'll check if there's anything to be done, if taint is not present it won't issue
// any API calls.
func removeNodeTaint(ctx context.Context, c clientset.Interface, nodeName string, node *v1.Node, taints ...*v1.Taint) error {
if len(taints) == 0 {
return nil
}
// Short circuit for limiting amount of API calls.
if node != nil {
match := false
for _, taint := range taints {
if taintExists(node.Spec.Taints, taint) {
match = true
break
}
}
if !match {
return nil
}
}
firstTry := true
return clientretry.RetryOnConflict(updateTaintBackOff, func() error {
var err error
var oldNode *v1.Node
// First we try getting node from the API server cache, as it's cheaper. If it fails
// we get it from etcd to be sure to have fresh data.
if firstTry {
oldNode, err = c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{ResourceVersion: "0"})
firstTry = false
} else {
oldNode, err = c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
}
if err != nil {
return err
}
var newNode *v1.Node
oldNodeCopy := oldNode
updated := false
for _, taint := range taints {
curNewNode, ok, err := removeTaint(oldNodeCopy, taint)
if err != nil {
return fmt.Errorf("failed to remove taint of node")
}
updated = updated || ok
newNode = curNewNode
oldNodeCopy = curNewNode
}
if !updated {
return nil
}
return patchNodeTaints(ctx, c, nodeName, oldNode, newNode)
})
}
// patchNodeTaints patches node's taints.
func patchNodeTaints(ctx context.Context, c clientset.Interface, nodeName string, oldNode *v1.Node, newNode *v1.Node) error {
oldData, err := json.Marshal(oldNode)
if err != nil {
return fmt.Errorf("failed to marshal old node %#v for node %q: %w", oldNode, nodeName, err)
}
newTaints := newNode.Spec.Taints
newNodeClone := oldNode.DeepCopy()
newNodeClone.Spec.Taints = newTaints
newData, err := json.Marshal(newNodeClone)
if err != nil {
return fmt.Errorf("failed to marshal new node %#v for node %q: %w", newNodeClone, nodeName, err)
}
patchBytes, err := strategicpatch.CreateTwoWayMergePatch(oldData, newData, v1.Node{})
if err != nil {
return fmt.Errorf("failed to create patch for node %q: %w", nodeName, err)
}
_, err = c.CoreV1().Nodes().Patch(ctx, nodeName, types.StrategicMergePatchType, patchBytes, metav1.PatchOptions{})
return err
}
// removeTaint tries to remove a taint from the node's taints list. Returns a new copy of the updated Node and true if something was updated,
// false otherwise.
func removeTaint(node *v1.Node, taint *v1.Taint) (*v1.Node, bool, error) {
newNode := node.DeepCopy()
nodeTaints := newNode.Spec.Taints
if len(nodeTaints) == 0 {
return newNode, false, nil
}
if !taintExists(nodeTaints, taint) {
return newNode, false, nil
}
newTaints, _ := deleteTaint(nodeTaints, taint)
newNode.Spec.Taints = newTaints
return newNode, true, nil
}
// deleteTaint removes all the taints that have the same key and effect to given taintToDelete.
func deleteTaint(taints []v1.Taint, taintToDelete *v1.Taint) ([]v1.Taint, bool) {
var newTaints []v1.Taint
deleted := false
for i := range taints {
if taintToDelete.MatchTaint(&taints[i]) {
deleted = true
continue
}
newTaints = append(newTaints, taints[i])
}
return newTaints, deleted
}
func verifyThatTaintIsGone(ctx context.Context, c clientset.Interface, nodeName string, taint *v1.Taint) {
ginkgo.By("verifying the node doesn't have the taint " + taint.ToString())
nodeUpdated, err := c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
// TODO use wrapper methods in expect.go after removing core e2e dependency on node
gomega.ExpectWithOffset(2, err).NotTo(gomega.HaveOccurred())
if taintExists(nodeUpdated.Spec.Taints, taint) {
framework.Fail("Failed removing taint " + taint.ToString() + " of the node " + nodeName)
}
}
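// Illustrative sketch, not part of the upstream file: one way a test in this package
// could temporarily taint a node and guarantee cleanup. The taint key is a placeholder.
func exampleTaintRoundTrip(ctx context.Context, c clientset.Interface, nodeName string) error {
	taint := v1.Taint{
		Key:    "example.com/e2e-temporary",
		Value:  "true",
		Effect: v1.TaintEffectNoSchedule,
	}
	if err := addOrUpdateTaintOnNode(ctx, c, nodeName, &taint); err != nil {
		return err
	}
	// Passing nil for the node forces removeNodeTaint to fetch a fresh copy.
	defer func() { _ = removeNodeTaint(ctx, c, nodeName, nil, &taint) }()
	// ... test logic that relies on the taint would go here ...
	return nil
}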

View File

@ -0,0 +1,43 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package node
import (
"context"
"time"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/kubernetes/test/e2e/framework"
e2ekubectl "k8s.io/kubernetes/test/e2e/framework/kubectl"
)
// WaitForSSHTunnels waits for SSH tunnels to be established, using a short-lived busybox pod as the probe.
func WaitForSSHTunnels(ctx context.Context, namespace string) {
framework.Logf("Waiting for SSH tunnels to establish")
e2ekubectl.RunKubectl(namespace, "run", "ssh-tunnel-test",
"--image=busybox",
"--restart=Never",
"--command", "--",
"echo", "Hello")
defer e2ekubectl.RunKubectl(namespace, "delete", "pod", "ssh-tunnel-test")
// allow up to a minute for new ssh tunnels to establish
wait.PollUntilContextTimeout(ctx, 5*time.Second, time.Minute, true, func(ctx context.Context) (bool, error) {
_, err := e2ekubectl.RunKubectl(namespace, "logs", "ssh-tunnel-test")
return err == nil, nil
})
}
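// Illustrative usage sketch, not part of the upstream file: tests typically call this
// right after restarting control-plane components, e.g.
//
//	WaitForSSHTunnels(ctx, f.Namespace.Name)
//
// where f is the test's *framework.Framework (an assumption for this example).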

View File

@ -0,0 +1,313 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package node
import (
"context"
"fmt"
"regexp"
"time"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
)
const sleepTime = 20 * time.Second
var requiredPerNodePods = []*regexp.Regexp{
regexp.MustCompile(".*kube-proxy.*"),
regexp.MustCompile(".*fluentd-elasticsearch.*"),
regexp.MustCompile(".*node-problem-detector.*"),
}
// WaitForReadyNodes waits up to timeout for the cluster to reach the desired size and
// for there to be no not-ready nodes in it. By cluster size we mean the number of schedulable Nodes.
func WaitForReadyNodes(ctx context.Context, c clientset.Interface, size int, timeout time.Duration) error {
_, err := CheckReady(ctx, c, size, timeout)
return err
}
// WaitForTotalHealthy checks whether all registered nodes are ready and all required Pods are running on them.
func WaitForTotalHealthy(ctx context.Context, c clientset.Interface, timeout time.Duration) error {
framework.Logf("Waiting up to %v for all nodes to be ready", timeout)
var notReady []v1.Node
var missingPodsPerNode map[string][]string
err := wait.PollUntilContextTimeout(ctx, poll, timeout, true, func(ctx context.Context) (bool, error) {
notReady = nil
// It should be OK to list unschedulable Nodes here.
nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{ResourceVersion: "0"})
if err != nil {
return false, err
}
for _, node := range nodes.Items {
if !IsConditionSetAsExpected(&node, v1.NodeReady, true) {
notReady = append(notReady, node)
}
}
pods, err := c.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{ResourceVersion: "0"})
if err != nil {
return false, err
}
systemPodsPerNode := make(map[string][]string)
for _, pod := range pods.Items {
if pod.Namespace == metav1.NamespaceSystem && pod.Status.Phase == v1.PodRunning {
if pod.Spec.NodeName != "" {
systemPodsPerNode[pod.Spec.NodeName] = append(systemPodsPerNode[pod.Spec.NodeName], pod.Name)
}
}
}
missingPodsPerNode = make(map[string][]string)
for _, node := range nodes.Items {
if isNodeSchedulableWithoutTaints(&node) {
for _, requiredPod := range requiredPerNodePods {
foundRequired := false
for _, presentPod := range systemPodsPerNode[node.Name] {
if requiredPod.MatchString(presentPod) {
foundRequired = true
break
}
}
if !foundRequired {
missingPodsPerNode[node.Name] = append(missingPodsPerNode[node.Name], requiredPod.String())
}
}
}
}
return len(notReady) == 0 && len(missingPodsPerNode) == 0, nil
})
if err != nil && !wait.Interrupted(err) {
return err
}
if len(notReady) > 0 {
return fmt.Errorf("Not ready nodes: %v", notReady)
}
if len(missingPodsPerNode) > 0 {
return fmt.Errorf("Not running system Pods: %v", missingPodsPerNode)
}
return nil
}
// WaitConditionToBe returns whether the named node's condition state matches wantTrue
// within timeout. If wantTrue is true, it will ensure the node condition status
// is ConditionTrue; if it's false, it ensures the node condition is in any state
// other than ConditionTrue (e.g. not true or unknown).
func WaitConditionToBe(ctx context.Context, c clientset.Interface, name string, conditionType v1.NodeConditionType, wantTrue bool, timeout time.Duration) bool {
framework.Logf("Waiting up to %v for node %s condition %s to be %t", timeout, name, conditionType, wantTrue)
for start := time.Now(); time.Since(start) < timeout; time.Sleep(poll) {
node, err := c.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
if err != nil {
framework.Logf("Couldn't get node %s", name)
continue
}
if IsConditionSetAsExpected(node, conditionType, wantTrue) {
return true
}
}
framework.Logf("Node %s didn't reach desired %s condition status (%t) within %v", name, conditionType, wantTrue, timeout)
return false
}
// WaitForNodeToBeNotReady returns whether node name is not ready (i.e. the
// readiness condition is anything but ready, e.g false or unknown) within
// timeout.
func WaitForNodeToBeNotReady(ctx context.Context, c clientset.Interface, name string, timeout time.Duration) bool {
return WaitConditionToBe(ctx, c, name, v1.NodeReady, false, timeout)
}
// WaitForNodeToBeReady returns whether node name is ready within timeout.
func WaitForNodeToBeReady(ctx context.Context, c clientset.Interface, name string, timeout time.Duration) bool {
return WaitConditionToBe(ctx, c, name, v1.NodeReady, true, timeout)
}
func WaitForNodeSchedulable(ctx context.Context, c clientset.Interface, name string, timeout time.Duration, wantSchedulable bool) bool {
framework.Logf("Waiting up to %v for node %s to be schedulable: %t", timeout, name, wantSchedulable)
for start := time.Now(); time.Since(start) < timeout; time.Sleep(poll) {
node, err := c.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
if err != nil {
framework.Logf("Couldn't get node %s", name)
continue
}
if IsNodeSchedulable(node) == wantSchedulable {
return true
}
}
framework.Logf("Node %s didn't reach desired schedulable status (%t) within %v", name, wantSchedulable, timeout)
return false
}
// WaitForNodeHeartbeatAfter waits up to timeout for node to send the next
// heartbeat after the given timestamp.
//
// To ensure the node status is posted by a restarted kubelet process,
// after should be retrieved by [GetNodeHeartbeatTime] while the kubelet is down.
func WaitForNodeHeartbeatAfter(ctx context.Context, c clientset.Interface, name string, after metav1.Time, timeout time.Duration) {
framework.Logf("Waiting up to %v for node %s to send a heartbeat after %v", timeout, name, after)
gomega.Eventually(ctx, func() (time.Time, error) {
node, err := c.CoreV1().Nodes().Get(ctx, name, metav1.GetOptions{})
if err != nil {
framework.Logf("Couldn't get node %s", name)
return time.Time{}, err
}
return GetNodeHeartbeatTime(node).Time, nil
}, timeout, poll).Should(gomega.BeTemporally(">", after.Time), "Node %s didn't send a heartbeat", name)
}
// CheckReady waits up to timeout for the cluster to reach the desired size and
// for there to be no not-ready nodes in it. By cluster size we mean the number of schedulable Nodes.
func CheckReady(ctx context.Context, c clientset.Interface, size int, timeout time.Duration) ([]v1.Node, error) {
for start := time.Now(); time.Since(start) < timeout; time.Sleep(sleepTime) {
nodes, err := waitListSchedulableNodes(ctx, c)
if err != nil {
framework.Logf("Failed to list nodes: %v", err)
continue
}
numNodes := len(nodes.Items)
// Filter out not-ready nodes.
Filter(nodes, func(node v1.Node) bool {
nodeReady := IsConditionSetAsExpected(&node, v1.NodeReady, true)
networkReady := isConditionUnset(&node, v1.NodeNetworkUnavailable) || IsConditionSetAsExpected(&node, v1.NodeNetworkUnavailable, false)
return nodeReady && networkReady
})
numReady := len(nodes.Items)
if numNodes == size && numReady == size {
framework.Logf("Cluster has reached the desired number of ready nodes %d", size)
return nodes.Items, nil
}
framework.Logf("Waiting for ready nodes %d, current ready %d, not ready nodes %d", size, numReady, numNodes-numReady)
}
return nil, fmt.Errorf("timeout waiting %v for number of ready nodes to be %d", timeout, size)
}
// waitListSchedulableNodes is a wrapper around listing nodes supporting retries.
func waitListSchedulableNodes(ctx context.Context, c clientset.Interface) (*v1.NodeList, error) {
var nodes *v1.NodeList
var err error
if wait.PollUntilContextTimeout(ctx, poll, singleCallTimeout, true, func(ctx context.Context) (bool, error) {
nodes, err = c.CoreV1().Nodes().List(ctx, metav1.ListOptions{FieldSelector: fields.Set{
"spec.unschedulable": "false",
}.AsSelector().String()})
if err != nil {
return false, err
}
return true, nil
}) != nil {
return nodes, err
}
return nodes, nil
}
// checkWaitListSchedulableNodes is a wrapper around listing nodes supporting retries.
func checkWaitListSchedulableNodes(ctx context.Context, c clientset.Interface) (*v1.NodeList, error) {
nodes, err := waitListSchedulableNodes(ctx, c)
if err != nil {
return nil, fmt.Errorf("error: %s. Non-retryable failure or timed out while listing nodes for e2e cluster", err)
}
return nodes, nil
}
// CheckReadyForTests returns a function which will return 'true' once the number of not-ready nodes is at or below the allowedNotReadyNodes threshold (i.e. to be used as a global gate for starting the tests).
func CheckReadyForTests(ctx context.Context, c clientset.Interface, nonblockingTaints string, allowedNotReadyNodes, largeClusterThreshold int) func(ctx context.Context) (bool, error) {
attempt := 0
return func(ctx context.Context) (bool, error) {
if allowedNotReadyNodes == -1 {
return true, nil
}
attempt++
var nodesNotReadyYet []v1.Node
opts := metav1.ListOptions{
ResourceVersion: "0",
// remove cordoned (unschedulable) nodes from our calculation, TODO refactor if node v2 API removes that semantic.
FieldSelector: fields.Set{"spec.unschedulable": "false"}.AsSelector().String(),
}
allNodes, err := c.CoreV1().Nodes().List(ctx, opts)
if err != nil {
var terminalListNodesErr error
framework.Logf("Unexpected error listing nodes: %v", err)
if attempt >= 3 {
terminalListNodesErr = err
}
return false, terminalListNodesErr
}
for _, node := range allNodes.Items {
if !readyForTests(&node, nonblockingTaints) {
nodesNotReadyYet = append(nodesNotReadyYet, node)
}
}
// Framework allows for <TestContext.AllowedNotReadyNodes> nodes to be non-ready,
// to make it possible e.g. for incorrect deployment of some small percentage
// of nodes (which we allow in cluster validation). Some nodes that are not
// provisioned correctly at startup will never become ready (e.g. when something
// won't install correctly), so we can't expect them to be ready at any point.
//
// We log the *reason* why nodes are not schedulable; typically it is the network not being available.
if len(nodesNotReadyYet) > 0 {
// In large clusters, log them only every 10th pass.
if len(nodesNotReadyYet) < largeClusterThreshold || attempt%10 == 0 {
framework.Logf("Unschedulable nodes= %v, maximum value for starting tests= %v", len(nodesNotReadyYet), allowedNotReadyNodes)
for _, node := range nodesNotReadyYet {
framework.Logf(" -> Node %s [[[ Ready=%t, Network(available)=%t, Taints=%v, NonblockingTaints=%v ]]]",
node.Name,
IsConditionSetAsExpectedSilent(&node, v1.NodeReady, true),
IsConditionSetAsExpectedSilent(&node, v1.NodeNetworkUnavailable, false),
node.Spec.Taints,
nonblockingTaints,
)
}
if len(nodesNotReadyYet) > allowedNotReadyNodes {
ready := len(allNodes.Items) - len(nodesNotReadyYet)
remaining := len(nodesNotReadyYet) - allowedNotReadyNodes
framework.Logf("==== node wait: %v out of %v nodes are ready, max notReady allowed %v. Need %v more before starting.", ready, len(allNodes.Items), allowedNotReadyNodes, remaining)
}
}
}
return len(nodesNotReadyYet) <= allowedNotReadyNodes, nil
}
}
// readyForTests determines whether or not we should continue waiting for the nodes
// to enter a testable state. By default this means it is schedulable, NodeReady, and untainted.
// Nodes with nonblocking taints are permitted to have those taints and
// also have their node.Spec.Unschedulable field ignored for the purposes of this function.
func readyForTests(node *v1.Node, nonblockingTaints string) bool {
if hasNonblockingTaint(node, nonblockingTaints) {
// If the node has one of the nonblockingTaints taints; just check that it is ready
// and don't require node.Spec.Unschedulable to be set either way.
if !IsNodeReady(node) || !isNodeUntaintedWithNonblocking(node, nonblockingTaints) {
return false
}
} else {
if !IsNodeSchedulable(node) || !isNodeUntainted(node) {
return false
}
}
return true
}
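// Illustrative sketch, not part of the upstream file: gating a suite on node readiness.
// The node count and timeouts are placeholder values.
func exampleWaitForCluster(ctx context.Context, c clientset.Interface) error {
	// First require the expected number of schedulable, ready nodes.
	if err := WaitForReadyNodes(ctx, c, 3, 5*time.Minute); err != nil {
		return err
	}
	// Then require the per-node system pods to be running as well.
	return WaitForTotalHealthy(ctx, c, 10*time.Minute)
}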

View File

@ -0,0 +1,26 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
// AppendContainerCommandGroupIfNeeded returns container command group parameter if necessary.
func AppendContainerCommandGroupIfNeeded(args []string) []string {
if TestContext.CloudConfig.Region != "" {
// TODO(wojtek-t): Get rid of it once Regional Clusters go to GA.
return append([]string{"beta"}, args...)
}
return args
}
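// Illustrative usage sketch, not part of the upstream file: callers wrap gcloud
// container arguments, e.g.
//
//	args := AppendContainerCommandGroupIfNeeded([]string{"container", "clusters", "list"})
//
// which prepends "beta" only when TestContext.CloudConfig.Region is set.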

View File

@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except of k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,267 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"fmt"
"time"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/uuid"
clientset "k8s.io/client-go/kubernetes"
imageutils "k8s.io/kubernetes/test/utils/image"
admissionapi "k8s.io/pod-security-admission/api"
)
const (
VolumeMountPathTemplate = "/mnt/volume%d"
VolumeMountPath1 = "/mnt/volume1"
)
// Config is a struct containing all arguments for creating a pod.
// SELinux testing requires passing HostIPC and HostPID as boolean arguments.
type Config struct {
NS string
PVCs []*v1.PersistentVolumeClaim
PVCsReadOnly bool
InlineVolumeSources []*v1.VolumeSource
SecurityLevel admissionapi.Level
Command string
HostIPC bool
HostPID bool
SeLinuxLabel *v1.SELinuxOptions
FsGroup *int64
NodeSelection NodeSelection
ImageID imageutils.ImageID
PodFSGroupChangePolicy *v1.PodFSGroupChangePolicy
}
// CreateUnschedulablePod with given claims based on node selector
func CreateUnschedulablePod(ctx context.Context, client clientset.Interface, namespace string, nodeSelector map[string]string, pvclaims []*v1.PersistentVolumeClaim, securityLevel admissionapi.Level, command string) (*v1.Pod, error) {
pod := MakePod(namespace, nodeSelector, pvclaims, securityLevel, command)
pod, err := client.CoreV1().Pods(namespace).Create(ctx, pod, metav1.CreateOptions{})
if err != nil {
return nil, fmt.Errorf("pod Create API error: %w", err)
}
// Waiting for pod to become Unschedulable
err = WaitForPodNameUnschedulableInNamespace(ctx, client, pod.Name, namespace)
if err != nil {
return pod, fmt.Errorf("pod %q is not Unschedulable: %w", pod.Name, err)
}
// get fresh pod info
pod, err = client.CoreV1().Pods(namespace).Get(ctx, pod.Name, metav1.GetOptions{})
if err != nil {
return pod, fmt.Errorf("pod Get API error: %w", err)
}
return pod, nil
}
// CreateClientPod defines and creates a pod with a mounted PV. Pod runs infinite loop until killed.
func CreateClientPod(ctx context.Context, c clientset.Interface, ns string, pvc *v1.PersistentVolumeClaim) (*v1.Pod, error) {
return CreatePod(ctx, c, ns, nil, []*v1.PersistentVolumeClaim{pvc}, admissionapi.LevelPrivileged, "")
}
// CreatePod with given claims based on node selector
func CreatePod(ctx context.Context, client clientset.Interface, namespace string, nodeSelector map[string]string, pvclaims []*v1.PersistentVolumeClaim, securityLevel admissionapi.Level, command string) (*v1.Pod, error) {
pod := MakePod(namespace, nodeSelector, pvclaims, securityLevel, command)
pod, err := client.CoreV1().Pods(namespace).Create(ctx, pod, metav1.CreateOptions{})
if err != nil {
return nil, fmt.Errorf("pod Create API error: %w", err)
}
// Waiting for pod to be running
err = WaitForPodNameRunningInNamespace(ctx, client, pod.Name, namespace)
if err != nil {
return pod, fmt.Errorf("pod %q is not Running: %w", pod.Name, err)
}
// get fresh pod info
pod, err = client.CoreV1().Pods(namespace).Get(ctx, pod.Name, metav1.GetOptions{})
if err != nil {
return pod, fmt.Errorf("pod Get API error: %w", err)
}
return pod, nil
}
// CreateSecPod creates security pod with given claims
func CreateSecPod(ctx context.Context, client clientset.Interface, podConfig *Config, timeout time.Duration) (*v1.Pod, error) {
return CreateSecPodWithNodeSelection(ctx, client, podConfig, timeout)
}
// CreateSecPodWithNodeSelection creates security pod with given claims
func CreateSecPodWithNodeSelection(ctx context.Context, client clientset.Interface, podConfig *Config, timeout time.Duration) (*v1.Pod, error) {
pod, err := MakeSecPod(podConfig)
if err != nil {
return nil, fmt.Errorf("Unable to create pod: %w", err)
}
pod, err = client.CoreV1().Pods(podConfig.NS).Create(ctx, pod, metav1.CreateOptions{})
if err != nil {
return nil, fmt.Errorf("pod Create API error: %w", err)
}
// Waiting for pod to be running
err = WaitTimeoutForPodRunningInNamespace(ctx, client, pod.Name, podConfig.NS, timeout)
if err != nil {
return pod, fmt.Errorf("pod %q is not Running: %w", pod.Name, err)
}
// get fresh pod info
pod, err = client.CoreV1().Pods(podConfig.NS).Get(ctx, pod.Name, metav1.GetOptions{})
if err != nil {
return pod, fmt.Errorf("pod Get API error: %w", err)
}
return pod, nil
}
// MakePod returns a pod definition based on the namespace. The pod references the given PVCs by
// name. A shell command string can be supplied to be run by the pod's container.
func MakePod(ns string, nodeSelector map[string]string, pvclaims []*v1.PersistentVolumeClaim, securityLevel admissionapi.Level, command string) *v1.Pod {
if len(command) == 0 {
command = InfiniteSleepCommand
}
podSpec := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
GenerateName: "pvc-tester-",
Namespace: ns,
},
Spec: v1.PodSpec{
Containers: []v1.Container{
{
Name: "write-pod",
Image: GetDefaultTestImage(),
Command: GenerateScriptCmd(command),
SecurityContext: GenerateContainerSecurityContext(securityLevel),
},
},
RestartPolicy: v1.RestartPolicyOnFailure,
},
}
setVolumes(&podSpec.Spec, pvclaims, nil /*inline volume sources*/, false /*PVCs readonly*/)
if nodeSelector != nil {
podSpec.Spec.NodeSelector = nodeSelector
}
if securityLevel == admissionapi.LevelRestricted {
podSpec = MustMixinRestrictedPodSecurity(podSpec)
}
return podSpec
}
// MakeSecPod returns a pod definition based on the namespace. The pod references the given PVCs by
// name. A shell command string can be supplied to be run by the pod's container.
func MakeSecPod(podConfig *Config) (*v1.Pod, error) {
if podConfig.NS == "" {
return nil, fmt.Errorf("Cannot create pod with empty namespace")
}
if len(podConfig.Command) == 0 {
podConfig.Command = InfiniteSleepCommand
}
podName := "pod-" + string(uuid.NewUUID())
if podConfig.FsGroup == nil && !NodeOSDistroIs("windows") {
podConfig.FsGroup = func(i int64) *int64 {
return &i
}(1000)
}
podSpec := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: podName,
Namespace: podConfig.NS,
},
Spec: *MakePodSpec(podConfig),
}
return podSpec, nil
}
// MakePodSpec returns a PodSpec definition
func MakePodSpec(podConfig *Config) *v1.PodSpec {
image := imageutils.BusyBox
if podConfig.ImageID != imageutils.None {
image = podConfig.ImageID
}
securityLevel := podConfig.SecurityLevel
if securityLevel == "" {
securityLevel = admissionapi.LevelBaseline
}
podSpec := &v1.PodSpec{
HostIPC: podConfig.HostIPC,
HostPID: podConfig.HostPID,
SecurityContext: GeneratePodSecurityContext(podConfig.FsGroup, podConfig.SeLinuxLabel),
Containers: []v1.Container{
{
Name: "write-pod",
Image: GetTestImage(image),
Command: GenerateScriptCmd(podConfig.Command),
SecurityContext: GenerateContainerSecurityContext(securityLevel),
},
},
RestartPolicy: v1.RestartPolicyOnFailure,
}
if podConfig.PodFSGroupChangePolicy != nil {
podSpec.SecurityContext.FSGroupChangePolicy = podConfig.PodFSGroupChangePolicy
}
setVolumes(podSpec, podConfig.PVCs, podConfig.InlineVolumeSources, podConfig.PVCsReadOnly)
SetNodeSelection(podSpec, podConfig.NodeSelection)
return podSpec
}
func setVolumes(podSpec *v1.PodSpec, pvcs []*v1.PersistentVolumeClaim, inlineVolumeSources []*v1.VolumeSource, pvcsReadOnly bool) {
var volumeMounts = make([]v1.VolumeMount, 0)
var volumeDevices = make([]v1.VolumeDevice, 0)
var volumes = make([]v1.Volume, len(pvcs)+len(inlineVolumeSources))
volumeIndex := 0
for _, pvclaim := range pvcs {
volumename := fmt.Sprintf("volume%v", volumeIndex+1)
volumeMountPath := fmt.Sprintf(VolumeMountPathTemplate, volumeIndex+1)
if pvclaim.Spec.VolumeMode != nil && *pvclaim.Spec.VolumeMode == v1.PersistentVolumeBlock {
volumeDevices = append(volumeDevices, v1.VolumeDevice{Name: volumename, DevicePath: volumeMountPath})
} else {
volumeMounts = append(volumeMounts, v1.VolumeMount{Name: volumename, MountPath: volumeMountPath})
}
volumes[volumeIndex] = v1.Volume{
Name: volumename,
VolumeSource: v1.VolumeSource{
PersistentVolumeClaim: &v1.PersistentVolumeClaimVolumeSource{
ClaimName: pvclaim.Name,
ReadOnly: pvcsReadOnly,
},
},
}
volumeIndex++
}
for _, src := range inlineVolumeSources {
volumename := fmt.Sprintf("volume%v", volumeIndex+1)
volumeMountPath := fmt.Sprintf(VolumeMountPathTemplate, volumeIndex+1)
// In-line volumes can be only filesystem, not block.
volumeMounts = append(volumeMounts, v1.VolumeMount{Name: volumename, MountPath: volumeMountPath})
volumes[volumeIndex] = v1.Volume{Name: volumename, VolumeSource: *src}
volumeIndex++
}
podSpec.Containers[0].VolumeMounts = volumeMounts
podSpec.Containers[0].VolumeDevices = volumeDevices
podSpec.Volumes = volumes
}
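// Illustrative sketch, not part of the upstream file: creating a pod that mounts an
// existing PVC with baseline pod security. Namespace, claim and timeout are placeholders
// supplied by the calling test.
func exampleCreatePodWithPVC(ctx context.Context, c clientset.Interface, ns string, claim *v1.PersistentVolumeClaim) (*v1.Pod, error) {
	cfg := &Config{
		NS:            ns,
		PVCs:          []*v1.PersistentVolumeClaim{claim},
		SecurityLevel: admissionapi.LevelBaseline,
		Command:       "sleep 3600",
	}
	return CreateSecPod(ctx, c, cfg, 5*time.Minute)
}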

View File

@ -0,0 +1,104 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"fmt"
"time"
"github.com/onsi/ginkgo/v2"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
// PodDeleteTimeout is how long to wait for a pod to be deleted.
PodDeleteTimeout = 5 * time.Minute
)
// DeletePodOrFail deletes the pod of the specified namespace and name. Resilient to the pod
// not existing.
func DeletePodOrFail(ctx context.Context, c clientset.Interface, ns, name string) {
ginkgo.By(fmt.Sprintf("Deleting pod %s in namespace %s", name, ns))
err := c.CoreV1().Pods(ns).Delete(ctx, name, metav1.DeleteOptions{})
if err != nil && apierrors.IsNotFound(err) {
return
}
expectNoError(err, "failed to delete pod %s in namespace %s", name, ns)
}
// DeletePodWithWait deletes the passed-in pod and waits for the pod to be terminated. Resilient to the pod
// not existing.
func DeletePodWithWait(ctx context.Context, c clientset.Interface, pod *v1.Pod) error {
if pod == nil {
return nil
}
return DeletePodWithWaitByName(ctx, c, pod.GetName(), pod.GetNamespace())
}
// DeletePodWithWaitByName deletes the named and namespaced pod and waits for the pod to be terminated. Resilient to the pod
// not existing.
func DeletePodWithWaitByName(ctx context.Context, c clientset.Interface, podName, podNamespace string) error {
framework.Logf("Deleting pod %q in namespace %q", podName, podNamespace)
err := c.CoreV1().Pods(podNamespace).Delete(ctx, podName, metav1.DeleteOptions{})
if err != nil {
if apierrors.IsNotFound(err) {
return nil // assume pod was already deleted
}
return fmt.Errorf("pod Delete API error: %w", err)
}
framework.Logf("Wait up to %v for pod %q to be fully deleted", PodDeleteTimeout, podName)
err = WaitForPodNotFoundInNamespace(ctx, c, podName, podNamespace, PodDeleteTimeout)
if err != nil {
return fmt.Errorf("pod %q was not deleted: %w", podName, err)
}
return nil
}
// DeletePodWithGracePeriod deletes the passed-in pod. Resilient to the pod not existing.
func DeletePodWithGracePeriod(ctx context.Context, c clientset.Interface, pod *v1.Pod, grace int64) error {
return DeletePodWithGracePeriodByName(ctx, c, pod.GetName(), pod.GetNamespace(), grace)
}
// DeletePodsWithGracePeriod deletes the passed-in pods. Resilient to the pods not existing.
func DeletePodsWithGracePeriod(ctx context.Context, c clientset.Interface, pods []v1.Pod, grace int64) error {
for _, pod := range pods {
if err := DeletePodWithGracePeriod(ctx, c, &pod, grace); err != nil {
return err
}
}
return nil
}
// DeletePodWithGracePeriodByName deletes a pod by name and namespace. Resilient to the pod not existing.
func DeletePodWithGracePeriodByName(ctx context.Context, c clientset.Interface, podName, podNamespace string, grace int64) error {
framework.Logf("Deleting pod %q in namespace %q", podName, podNamespace)
err := c.CoreV1().Pods(podNamespace).Delete(ctx, podName, *metav1.NewDeleteOptions(grace))
if err != nil {
if apierrors.IsNotFound(err) {
return nil // assume pod was already deleted
}
return fmt.Errorf("pod Delete API error: %w", err)
}
return nil
}
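// Illustrative sketch, not part of the upstream file: the common pattern of deferring
// pod deletion so cleanup happens even if the test body fails.
func exampleDeferredPodCleanup(ctx context.Context, c clientset.Interface, pod *v1.Pod) {
	defer func() {
		if err := DeletePodWithWait(ctx, c, pod); err != nil {
			framework.Logf("cleanup of pod %q failed: %v", pod.Name, err)
		}
	}()
	// ... test logic using the pod would go here ...
}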

View File

@ -0,0 +1,215 @@
/*
Copyright 2021 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"errors"
"fmt"
"io"
"net"
"net/http"
"regexp"
"strconv"
"time"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/util/httpstream"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/rest"
"k8s.io/client-go/tools/portforward"
"k8s.io/client-go/transport/spdy"
"k8s.io/klog/v2"
)
// NewTransport creates a transport which uses the port forward dialer.
// URLs must use <namespace>.<pod>:<port> as host.
func NewTransport(client kubernetes.Interface, restConfig *rest.Config) *http.Transport {
return &http.Transport{
DialContext: func(ctx context.Context, _, addr string) (net.Conn, error) {
dialer := NewDialer(client, restConfig)
a, err := ParseAddr(addr)
if err != nil {
return nil, err
}
return dialer.DialContainerPort(ctx, *a)
},
}
}
// NewDialer creates a dialer that supports connecting to container ports.
func NewDialer(client kubernetes.Interface, restConfig *rest.Config) *Dialer {
return &Dialer{
client: client,
restConfig: restConfig,
}
}
// Dialer holds the relevant parameters that are independent of a particular connection.
type Dialer struct {
client kubernetes.Interface
restConfig *rest.Config
}
// DialContainerPort connects to a certain container port in a pod.
func (d *Dialer) DialContainerPort(ctx context.Context, addr Addr) (conn net.Conn, finalErr error) {
restClient := d.client.CoreV1().RESTClient()
restConfig := d.restConfig
if restConfig.GroupVersion == nil {
restConfig.GroupVersion = &schema.GroupVersion{}
}
if restConfig.NegotiatedSerializer == nil {
restConfig.NegotiatedSerializer = scheme.Codecs
}
// The setup code around the actual portforward is from
// https://github.com/kubernetes/kubernetes/blob/c652ffbe4a29143623a1aaec39f745575f7e43ad/staging/src/k8s.io/kubectl/pkg/cmd/portforward/portforward.go
req := restClient.Post().
Resource("pods").
Namespace(addr.Namespace).
Name(addr.PodName).
SubResource("portforward")
transport, upgrader, err := spdy.RoundTripperFor(restConfig)
if err != nil {
return nil, fmt.Errorf("create round tripper: %w", err)
}
dialer := spdy.NewDialer(upgrader, &http.Client{Transport: transport}, "POST", req.URL())
streamConn, _, err := dialer.Dial(portforward.PortForwardProtocolV1Name)
if err != nil {
return nil, fmt.Errorf("dialer failed: %w", err)
}
requestID := "1"
defer func() {
if finalErr != nil {
streamConn.Close()
}
}()
// create error stream
headers := http.Header{}
headers.Set(v1.StreamType, v1.StreamTypeError)
headers.Set(v1.PortHeader, fmt.Sprintf("%d", addr.Port))
headers.Set(v1.PortForwardRequestIDHeader, requestID)
// We're not writing to this stream, just reading an error message from it.
// This happens asynchronously.
errorStream, err := streamConn.CreateStream(headers)
if err != nil {
return nil, fmt.Errorf("error creating error stream: %w", err)
}
errorStream.Close()
go func() {
message, err := io.ReadAll(errorStream)
switch {
case err != nil:
klog.ErrorS(err, "error reading from error stream")
case len(message) > 0:
klog.ErrorS(errors.New(string(message)), "an error occurred connecting to the remote port")
}
}()
// create data stream
headers.Set(v1.StreamType, v1.StreamTypeData)
dataStream, err := streamConn.CreateStream(headers)
if err != nil {
return nil, fmt.Errorf("error creating data stream: %w", err)
}
return &stream{
Stream: dataStream,
streamConn: streamConn,
}, nil
}
// Addr contains all relevant parameters for a certain port in a pod.
// The container should be running before connections are attempted,
// otherwise the connection will fail.
type Addr struct {
Namespace, PodName string
Port int
}
var _ net.Addr = Addr{}
func (a Addr) Network() string {
return "port-forwarding"
}
func (a Addr) String() string {
return fmt.Sprintf("%s.%s:%d", a.Namespace, a.PodName, a.Port)
}
// ParseAddr expects a <namespace>.<pod>:<port number> as produced
// by Addr.String.
func ParseAddr(addr string) (*Addr, error) {
parts := addrRegex.FindStringSubmatch(addr)
if parts == nil {
return nil, fmt.Errorf("%q: must match the format <namespace>.<pod>:<port number>", addr)
}
port, _ := strconv.Atoi(parts[3])
return &Addr{
Namespace: parts[1],
PodName: parts[2],
Port: port,
}, nil
}
var addrRegex = regexp.MustCompile(`^([^\.]+)\.([^:]+):(\d+)$`)
type stream struct {
addr Addr
httpstream.Stream
streamConn httpstream.Connection
}
var _ net.Conn = &stream{}
func (s *stream) Close() error {
s.Stream.Close()
s.streamConn.Close()
return nil
}
func (s *stream) LocalAddr() net.Addr {
return LocalAddr{}
}
func (s *stream) RemoteAddr() net.Addr {
return s.addr
}
func (s *stream) SetDeadline(t time.Time) error {
return nil
}
func (s *stream) SetReadDeadline(t time.Time) error {
return nil
}
func (s *stream) SetWriteDeadline(t time.Time) error {
return nil
}
type LocalAddr struct{}
var _ net.Addr = LocalAddr{}
func (l LocalAddr) Network() string { return "port-forwarding" }
func (l LocalAddr) String() string { return "apiserver" }
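// Illustrative sketch, not part of the upstream file: wiring the port-forwarding
// transport into a plain *http.Client. The namespace, pod name and port in the URL
// are placeholders.
func exampleHTTPThroughPortForward(ctx context.Context, client kubernetes.Interface, restConfig *rest.Config) (*http.Response, error) {
	httpClient := &http.Client{Transport: NewTransport(client, restConfig)}
	// The host must follow the <namespace>.<pod>:<port> format understood by ParseAddr.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://default.example-pod:8080/healthz", nil)
	if err != nil {
		return nil, err
	}
	return httpClient.Do(req)
}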

View File

@ -0,0 +1,155 @@
/*
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"bytes"
"context"
"io"
"net/url"
"strings"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/kubernetes/scheme"
restclient "k8s.io/client-go/rest"
"k8s.io/client-go/tools/remotecommand"
"k8s.io/kubernetes/test/e2e/framework"
"github.com/onsi/gomega"
)
// ExecOptions passed to ExecWithOptions
type ExecOptions struct {
Command []string
Namespace string
PodName string
ContainerName string
Stdin io.Reader
CaptureStdout bool
CaptureStderr bool
// If false, whitespace in std{err,out} will be removed.
PreserveWhitespace bool
Quiet bool
}
// ExecWithOptions executes a command in the specified container,
// returning stdout, stderr and error. `options` allows for
// additional parameters to be passed.
func ExecWithOptions(f *framework.Framework, options ExecOptions) (string, string, error) {
return ExecWithOptionsContext(context.Background(), f, options)
}
func ExecWithOptionsContext(ctx context.Context, f *framework.Framework, options ExecOptions) (string, string, error) {
if !options.Quiet {
framework.Logf("ExecWithOptions %+v", options)
}
const tty = false
framework.Logf("ExecWithOptions: Clientset creation")
req := f.ClientSet.CoreV1().RESTClient().Post().
Resource("pods").
Name(options.PodName).
Namespace(options.Namespace).
SubResource("exec")
req.VersionedParams(&v1.PodExecOptions{
Container: options.ContainerName,
Command: options.Command,
Stdin: options.Stdin != nil,
Stdout: options.CaptureStdout,
Stderr: options.CaptureStderr,
TTY: tty,
}, scheme.ParameterCodec)
var stdout, stderr bytes.Buffer
framework.Logf("ExecWithOptions: execute(POST %s)", req.URL())
err := execute(ctx, "POST", req.URL(), f.ClientConfig(), options.Stdin, &stdout, &stderr, tty)
if options.PreserveWhitespace {
return stdout.String(), stderr.String(), err
}
return strings.TrimSpace(stdout.String()), strings.TrimSpace(stderr.String()), err
}
// ExecCommandInContainerWithFullOutput executes a command in the
// specified container and returns stdout, stderr and error.
func ExecCommandInContainerWithFullOutput(f *framework.Framework, podName, containerName string, cmd ...string) (string, string, error) {
// TODO (pohly): add context support
return ExecWithOptions(f, ExecOptions{
Command: cmd,
Namespace: f.Namespace.Name,
PodName: podName,
ContainerName: containerName,
Stdin: nil,
CaptureStdout: true,
CaptureStderr: true,
PreserveWhitespace: false,
})
}
// ExecCommandInContainer executes a command in the specified container.
func ExecCommandInContainer(f *framework.Framework, podName, containerName string, cmd ...string) string {
stdout, stderr, err := ExecCommandInContainerWithFullOutput(f, podName, containerName, cmd...)
framework.Logf("Exec stderr: %q", stderr)
framework.ExpectNoError(err,
"failed to execute command in pod %v, container %v: %v",
podName, containerName, err)
return stdout
}
// ExecShellInContainer executes the specified command on the pod's container.
func ExecShellInContainer(f *framework.Framework, podName, containerName string, cmd string) string {
return ExecCommandInContainer(f, podName, containerName, "/bin/sh", "-c", cmd)
}
func execCommandInPod(ctx context.Context, f *framework.Framework, podName string, cmd ...string) string {
pod, err := NewPodClient(f).Get(ctx, podName, metav1.GetOptions{})
framework.ExpectNoError(err, "failed to get pod %v", podName)
gomega.Expect(pod.Spec.Containers).NotTo(gomega.BeEmpty())
return ExecCommandInContainer(f, podName, pod.Spec.Containers[0].Name, cmd...)
}
func execCommandInPodWithFullOutput(ctx context.Context, f *framework.Framework, podName string, cmd ...string) (string, string, error) {
pod, err := NewPodClient(f).Get(ctx, podName, metav1.GetOptions{})
framework.ExpectNoError(err, "failed to get pod %v", podName)
gomega.Expect(pod.Spec.Containers).NotTo(gomega.BeEmpty())
return ExecCommandInContainerWithFullOutput(f, podName, pod.Spec.Containers[0].Name, cmd...)
}
// ExecShellInPod executes the specified command on the pod.
func ExecShellInPod(ctx context.Context, f *framework.Framework, podName string, cmd string) string {
return execCommandInPod(ctx, f, podName, "/bin/sh", "-c", cmd)
}
// ExecShellInPodWithFullOutput executes the specified command on the Pod and returns stdout, stderr and error.
func ExecShellInPodWithFullOutput(ctx context.Context, f *framework.Framework, podName string, cmd string) (string, string, error) {
return execCommandInPodWithFullOutput(ctx, f, podName, "/bin/sh", "-c", cmd)
}
func execute(ctx context.Context, method string, url *url.URL, config *restclient.Config, stdin io.Reader, stdout, stderr io.Writer, tty bool) error {
exec, err := remotecommand.NewSPDYExecutor(config, method, url)
if err != nil {
return err
}
return exec.StreamWithContext(ctx, remotecommand.StreamOptions{
Stdin: stdin,
Stdout: stdout,
Stderr: stderr,
Tty: tty,
})
}
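// Illustrative sketch, not part of the upstream file: running a shell command inside a
// pod and asserting on its output. The pod name is a placeholder created elsewhere.
func exampleExecInPod(ctx context.Context, f *framework.Framework, podName string) {
	out := ExecShellInPod(ctx, f, podName, "cat /etc/hostname")
	gomega.Expect(out).NotTo(gomega.BeEmpty())
}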

View File

@ -0,0 +1,31 @@
/*
Copyright 2023 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
)
// Get creates a function which retrieves the pod anew each time the function
// is called. Fatal errors are detected by framework.GetObject and cause
// polling to stop.
func Get(c clientset.Interface, pod framework.NamedObject) framework.GetFunc[*v1.Pod] {
return framework.GetObject(c.CoreV1().Pods(pod.GetNamespace()).Get, pod.GetName(), metav1.GetOptions{})
}
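// Illustrative usage sketch, not part of the upstream file: Get is intended to be
// combined with gomega.Eventually, e.g.
//
//	gomega.Eventually(ctx, Get(c, pod)).WithTimeout(2*time.Minute).
//		Should(gomega.HaveField("Status.Phase", v1.PodRunning))
//
// where c is a clientset.Interface and pod names the pod to poll (both assumptions
// for this example).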

View File

@ -0,0 +1,105 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
v1 "k8s.io/api/core/v1"
)
// NodeSelection specifies where to run a pod, using a combination of fixed node name,
// node selector and/or affinity.
type NodeSelection struct {
Name string
Selector map[string]string
Affinity *v1.Affinity
}
// setNodeAffinityRequirement adds a node affinity requirement with the specified operator for nodeName to nodeSelection.
func setNodeAffinityRequirement(nodeSelection *NodeSelection, operator v1.NodeSelectorOperator, nodeName string) {
// Ensure the nested node affinity structures exist before appending the requirement.
if nodeSelection.Affinity == nil {
nodeSelection.Affinity = &v1.Affinity{}
}
if nodeSelection.Affinity.NodeAffinity == nil {
nodeSelection.Affinity.NodeAffinity = &v1.NodeAffinity{}
}
if nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil {
nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution = &v1.NodeSelector{}
}
nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms = append(nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms,
v1.NodeSelectorTerm{
MatchFields: []v1.NodeSelectorRequirement{
{Key: "metadata.name", Operator: operator, Values: []string{nodeName}},
},
})
}
// SetNodeAffinityTopologyRequirement sets node affinity to a specified topology
func SetNodeAffinityTopologyRequirement(nodeSelection *NodeSelection, topology map[string]string) {
if nodeSelection.Affinity == nil {
nodeSelection.Affinity = &v1.Affinity{}
}
if nodeSelection.Affinity.NodeAffinity == nil {
nodeSelection.Affinity.NodeAffinity = &v1.NodeAffinity{}
}
if nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution == nil {
nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution = &v1.NodeSelector{}
}
for k, v := range topology {
nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms = append(nodeSelection.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms,
v1.NodeSelectorTerm{
MatchExpressions: []v1.NodeSelectorRequirement{
{Key: k, Operator: v1.NodeSelectorOpIn, Values: []string{v}},
},
})
}
}
// SetAffinity sets affinity to nodeName to nodeSelection
func SetAffinity(nodeSelection *NodeSelection, nodeName string) {
setNodeAffinityRequirement(nodeSelection, v1.NodeSelectorOpIn, nodeName)
}
// SetAntiAffinity sets anti-affinity to nodeName to nodeSelection
func SetAntiAffinity(nodeSelection *NodeSelection, nodeName string) {
setNodeAffinityRequirement(nodeSelection, v1.NodeSelectorOpNotIn, nodeName)
}
// SetNodeAffinity modifies the given pod object with
// NodeAffinity to the given node name.
func SetNodeAffinity(podSpec *v1.PodSpec, nodeName string) {
nodeSelection := &NodeSelection{}
SetAffinity(nodeSelection, nodeName)
podSpec.Affinity = nodeSelection.Affinity
}
// SetNodeSelection modifies the given pod object with
// the specified NodeSelection
func SetNodeSelection(podSpec *v1.PodSpec, nodeSelection NodeSelection) {
podSpec.NodeSelector = nodeSelection.Selector
podSpec.Affinity = nodeSelection.Affinity
// pod.Spec.NodeName should not be set directly because
// it will bypass the scheduler, potentially causing
// kubelet to Fail the pod immediately if it's out of
// resources. Instead, we want the pod to remain
// pending in the scheduler until the node has resources
// freed up.
if nodeSelection.Name != "" {
SetNodeAffinity(podSpec, nodeSelection.Name)
}
}
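// Illustrative sketch, not part of the upstream file: pinning a pod to a node by name
// via node affinity rather than setting pod.Spec.NodeName directly.
func examplePinPodToNode(pod *v1.Pod, nodeName string) {
	SetNodeSelection(&pod.Spec, NodeSelection{Name: nodeName})
}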

View File

@ -0,0 +1,284 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package output
import (
"context"
"fmt"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
gomegatypes "github.com/onsi/gomega/types"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubectl/pkg/util/podutils"
"k8s.io/kubernetes/test/e2e/framework"
e2ekubectl "k8s.io/kubernetes/test/e2e/framework/kubectl"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
)
// DEPRECATED constants. Use the timeouts in framework.Framework instead.
const (
// Poll is how often to Poll pods, nodes and claims.
Poll = 2 * time.Second
)
// LookForStringInPodExec looks for the given string in the output of a command
// executed in the first container of specified pod.
func LookForStringInPodExec(ns, podName string, command []string, expectedString string, timeout time.Duration) (result string, err error) {
return LookForStringInPodExecToContainer(ns, podName, "", command, expectedString, timeout)
}
// LookForStringInPodExecToContainer looks for the given string in the output of a
// command executed in specified pod container, or first container if not specified.
func LookForStringInPodExecToContainer(ns, podName, containerName string, command []string, expectedString string, timeout time.Duration) (result string, err error) {
return lookForString(expectedString, timeout, func() string {
args := []string{"exec", podName, fmt.Sprintf("--namespace=%v", ns)}
if len(containerName) > 0 {
args = append(args, fmt.Sprintf("--container=%s", containerName))
}
args = append(args, "--")
args = append(args, command...)
return e2ekubectl.RunKubectlOrDie(ns, args...)
})
}
// lookForString looks for the given string in the output of fn, repeatedly calling fn until
// the timeout is reached or the string is found. Returns last log and possibly
// error if the string was not found.
func lookForString(expectedString string, timeout time.Duration, fn func() string) (result string, err error) {
for t := time.Now(); time.Since(t) < timeout; time.Sleep(Poll) {
result = fn()
if strings.Contains(result, expectedString) {
return
}
}
err = fmt.Errorf("Failed to find \"%s\", last result: \"%s\"", expectedString, result)
return
}
// RunHostCmd runs the given cmd in the context of the given pod using `kubectl exec`
// inside of a shell.
func RunHostCmd(ns, name, cmd string) (string, error) {
return e2ekubectl.RunKubectl(ns, "exec", name, "--", "/bin/sh", "-x", "-c", cmd)
}
// RunHostCmdWithFullOutput runs the given cmd in the context of the given pod using `kubectl exec`
// inside of a shell. It will also return the command's stderr.
func RunHostCmdWithFullOutput(ns, name, cmd string) (string, string, error) {
return e2ekubectl.RunKubectlWithFullOutput(ns, "exec", name, "--", "/bin/sh", "-x", "-c", cmd)
}
// RunHostCmdOrDie calls RunHostCmd and dies on error.
func RunHostCmdOrDie(ns, name, cmd string) string {
stdout, err := RunHostCmd(ns, name, cmd)
framework.Logf("stdout: %v", stdout)
framework.ExpectNoError(err)
return stdout
}
// RunHostCmdWithRetries calls RunHostCmd and retries all errors
// until it succeeds or the specified timeout expires.
// This can be used with idempotent commands to deflake transient Node issues.
func RunHostCmdWithRetries(ns, name, cmd string, interval, timeout time.Duration) (string, error) {
start := time.Now()
for {
out, err := RunHostCmd(ns, name, cmd)
if err == nil {
return out, nil
}
if elapsed := time.Since(start); elapsed > timeout {
return out, fmt.Errorf("RunHostCmd still failed after %v: %w", elapsed, err)
}
framework.Logf("Waiting %v to retry failed RunHostCmd: %v", interval, err)
time.Sleep(interval)
}
}
// LookForStringInLog looks for the given string in the log of a specific pod container
func LookForStringInLog(ns, podName, container, expectedString string, timeout time.Duration) (result string, err error) {
return lookForString(expectedString, timeout, func() string {
return e2ekubectl.RunKubectlOrDie(ns, "logs", podName, container)
})
}
// LookForStringInLogWithoutKubectl looks for the given string in the log of a specific pod container
func LookForStringInLogWithoutKubectl(ctx context.Context, client clientset.Interface, ns string, podName string, container string, expectedString string, timeout time.Duration) (result string, err error) {
return lookForString(expectedString, timeout, func() string {
podLogs, err := e2epod.GetPodLogs(ctx, client, ns, podName, container)
framework.ExpectNoError(err)
return podLogs
})
}
// CreateEmptyFileOnPod creates empty file at given path on the pod.
func CreateEmptyFileOnPod(namespace string, podName string, filePath string) error {
_, err := e2ekubectl.RunKubectl(namespace, "exec", podName, "--", "/bin/sh", "-c", fmt.Sprintf("touch %s", filePath))
return err
}
// DumpDebugInfo dumps debug info of tests.
func DumpDebugInfo(ctx context.Context, c clientset.Interface, ns string) {
sl, _ := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{LabelSelector: labels.Everything().String()})
for _, s := range sl.Items {
desc, _ := e2ekubectl.RunKubectl(ns, "describe", "po", s.Name)
framework.Logf("\nOutput of kubectl describe %v:\n%v", s.Name, desc)
l, _ := e2ekubectl.RunKubectl(ns, "logs", s.Name, "--tail=100")
framework.Logf("\nLast 100 log lines of %v:\n%v", s.Name, l)
}
}
// MatchContainerOutput creates a pod and waits for all its containers to exit with success.
// It then tests that the matcher with each expectedOutput matches the output of the specified container.
func MatchContainerOutput(
ctx context.Context,
f *framework.Framework,
pod *v1.Pod,
containerName string,
expectedOutput []string,
matcher func(string, ...interface{}) gomegatypes.GomegaMatcher) error {
return MatchMultipleContainerOutputs(ctx, f, pod, map[string][]string{containerName: expectedOutput}, matcher)
}
func MatchMultipleContainerOutputs(
ctx context.Context,
f *framework.Framework,
pod *v1.Pod,
expectedOutputs map[string][]string, // map of container name -> expected outputs
matcher func(string, ...interface{}) gomegatypes.GomegaMatcher) error {
ns := pod.ObjectMeta.Namespace
if ns == "" {
ns = f.Namespace.Name
}
podClient := e2epod.PodClientNS(f, ns)
createdPod := podClient.Create(ctx, pod)
defer func() {
ginkgo.By("delete the pod")
podClient.DeleteSync(ctx, createdPod.Name, metav1.DeleteOptions{}, e2epod.DefaultPodDeletionTimeout)
}()
// Wait for client pod to complete.
podErr := e2epod.WaitForPodSuccessInNamespaceTimeout(ctx, f.ClientSet, createdPod.Name, ns, f.Timeouts.PodStart)
// Grab its logs. Get host first.
podStatus, err := podClient.Get(ctx, createdPod.Name, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("failed to get pod status: %w", err)
}
if podErr != nil {
// Pod failed. Dump all logs from all containers to see what's wrong
_ = podutils.VisitContainers(&podStatus.Spec, podutils.AllContainers, func(c *v1.Container, containerType podutils.ContainerType) bool {
logs, err := e2epod.GetPodLogs(ctx, f.ClientSet, ns, podStatus.Name, c.Name)
if err != nil {
framework.Logf("Failed to get logs from node %q pod %q container %q: %v",
podStatus.Spec.NodeName, podStatus.Name, c.Name, err)
} else {
framework.Logf("Output of node %q pod %q container %q: %s", podStatus.Spec.NodeName, podStatus.Name, c.Name, logs)
}
return true
})
return fmt.Errorf("expected pod %q success: %v", createdPod.Name, podErr)
}
for cName, expectedOutput := range expectedOutputs {
framework.Logf("Trying to get logs from node %s pod %s container %s: %v",
podStatus.Spec.NodeName, podStatus.Name, cName, err)
// Sometimes the actual containers take a second to get started, try to get logs for 60s
logs, err := e2epod.GetPodLogs(ctx, f.ClientSet, ns, podStatus.Name, cName)
if err != nil {
framework.Logf("Failed to get logs from node %q pod %q container %q. %v",
podStatus.Spec.NodeName, podStatus.Name, cName, err)
return fmt.Errorf("failed to get logs from %s for %s: %w", podStatus.Name, cName, err)
}
for _, expected := range expectedOutput {
m := matcher(expected)
matches, err := m.Match(logs)
if err != nil {
return fmt.Errorf("expected %q in container output: %w", expected, err)
} else if !matches {
return fmt.Errorf("expected %q in container output: %s", expected, m.FailureMessage(logs))
}
}
}
return nil
}
// TestContainerOutput runs the given pod in the given namespace and waits
// for all of the containers in the podSpec to move into the 'Success' status, and tests
// the specified container log against the given expected output using a substring matcher.
func TestContainerOutput(ctx context.Context, f *framework.Framework, scenarioName string, pod *v1.Pod, containerIndex int, expectedOutput []string) {
TestContainerOutputMatcher(ctx, f, scenarioName, pod, containerIndex, expectedOutput, gomega.ContainSubstring)
}
// TestContainerOutputRegexp runs the given pod in the given namespace and waits
// for all of the containers in the podSpec to move into the 'Success' status, and tests
// the specified container log against the given expected output using a regexp matcher.
func TestContainerOutputRegexp(ctx context.Context, f *framework.Framework, scenarioName string, pod *v1.Pod, containerIndex int, expectedOutput []string) {
TestContainerOutputsRegexp(ctx, f, scenarioName, pod, map[int][]string{containerIndex: expectedOutput})
}
func TestContainerOutputsRegexp(ctx context.Context, f *framework.Framework, scenarioName string, pod *v1.Pod, expectedOutputs map[int][]string) {
TestContainerOutputsMatcher(ctx, f, scenarioName, pod, expectedOutputs, gomega.MatchRegexp)
}
// TestContainerOutputMatcher runs the given pod in the given namespace and waits
// for all of the containers in the podSpec to move into the 'Success' status, and tests
// the specified container log against the given expected output using the given matcher.
func TestContainerOutputMatcher(ctx context.Context, f *framework.Framework,
scenarioName string,
pod *v1.Pod,
containerIndex int,
expectedOutput []string,
matcher func(string, ...interface{}) gomegatypes.GomegaMatcher) {
ginkgo.By(fmt.Sprintf("Creating a pod to test %v", scenarioName))
if containerIndex < 0 || containerIndex >= len(pod.Spec.Containers) {
framework.Failf("Invalid container index: %d", containerIndex)
}
framework.ExpectNoError(MatchContainerOutput(ctx, f, pod, pod.Spec.Containers[containerIndex].Name, expectedOutput, matcher))
}
func TestContainerOutputsMatcher(ctx context.Context, f *framework.Framework,
scenarioName string,
pod *v1.Pod,
expectedOutputs map[int][]string,
matcher func(string, ...interface{}) gomegatypes.GomegaMatcher) {
ginkgo.By(fmt.Sprintf("Creating a pod to test %v", scenarioName))
expectedNameOutputs := make(map[string][]string, len(expectedOutputs))
for containerIndex, expectedOutput := range expectedOutputs {
expectedOutput := expectedOutput
if containerIndex < 0 || containerIndex >= len(pod.Spec.Containers) {
framework.Failf("Invalid container index: %d", containerIndex)
}
expectedNameOutputs[pod.Spec.Containers[containerIndex].Name] = expectedOutput
}
framework.ExpectNoError(MatchMultipleContainerOutputs(ctx, f, pod, expectedNameOutputs, matcher))
}
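// Illustrative usage sketch (not part of the upstream file): running a short-lived
// pod and asserting on its log output with TestContainerOutput. The pod name,
// image and command below are placeholders; any container that prints the
// expected text and exits successfully will do. Leaving the namespace empty
// makes MatchContainerOutput fall back to the framework's namespace.
func exampleTestContainerOutput(ctx context.Context, f *framework.Framework) {
	pod := &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "output-example"},
		Spec: v1.PodSpec{
			RestartPolicy: v1.RestartPolicyNever,
			Containers: []v1.Container{{
				Name:    "printer",
				Image:   "busybox", // placeholder image
				Command: []string{"/bin/sh", "-c", "echo hello-e2e"},
			}},
		},
	}
	// Container index 0 is the single container; every entry of the expected
	// output slice must appear as a substring of its logs.
	TestContainerOutput(ctx, f, "print a greeting", pod, 0, []string{"hello-e2e"})
}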


@ -0,0 +1,387 @@
/*
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"encoding/json"
"fmt"
"regexp"
"sync"
"time"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/strategicpatch"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/kubernetes/scheme"
v1core "k8s.io/client-go/kubernetes/typed/core/v1"
"k8s.io/kubectl/pkg/util/podutils"
"github.com/onsi/ginkgo/v2"
ginkgotypes "github.com/onsi/ginkgo/v2/types"
"github.com/onsi/gomega"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
// DefaultPodDeletionTimeout is the default timeout for deleting pod
DefaultPodDeletionTimeout = 3 * time.Minute
// the status of container event, copied from k8s.io/kubernetes/pkg/kubelet/events
killingContainer = "Killing"
// the status of container event, copied from k8s.io/kubernetes/pkg/kubelet/events
failedToCreateContainer = "Failed"
// the status of container event, copied from k8s.io/kubernetes/pkg/kubelet/events
startedContainer = "Started"
// it is copied from k8s.io/kubernetes/pkg/kubelet/sysctl
forbiddenReason = "SysctlForbidden"
// which test created this pod?
AnnotationTestOwner = "owner.test"
)
// global flags so we can enable features per-suite instead of per-client.
var (
// GlobalOwnerTracking controls if newly created PodClients should automatically annotate
// the pod with the owner test. The owner test is identified by "sourcecodepath:linenumber".
// Annotating the pods this way is useful to troubleshoot tests which do insufficient cleanup.
// Default is false to maximize backward compatibility.
// See also: WithOwnerTracking, AnnotationTestOwner
GlobalOwnerTracking bool
)
// ImagePrePullList is the set of images used in the current test suite. It should be initialized by
// the test suite, which is also responsible for pre-pulling the listed images. Currently, this is
// only used by the node e2e test.
var ImagePrePullList sets.String
// NewPodClient is a convenience method for getting a pod client interface in the framework's namespace,
// possibly applying test-suite specific transformations to the pod spec, e.g. for
// node e2e pod scheduling.
func NewPodClient(f *framework.Framework) *PodClient {
return &PodClient{
f: f,
PodInterface: f.ClientSet.CoreV1().Pods(f.Namespace.Name),
namespace: f.Namespace.Name,
ownerTracking: GlobalOwnerTracking,
}
}
// PodClientNS is a convenience method for getting a pod client interface in an alternative namespace,
// possibly applying test-suite specific transformations to the pod spec, e.g. for
// node e2e pod scheduling.
func PodClientNS(f *framework.Framework, namespace string) *PodClient {
return &PodClient{
f: f,
PodInterface: f.ClientSet.CoreV1().Pods(namespace),
namespace: namespace,
ownerTracking: GlobalOwnerTracking,
}
}
// PodClient is a struct for pod client.
type PodClient struct {
f *framework.Framework
v1core.PodInterface
namespace string
ownerTracking bool
}
// WithOwnerTracking controls the automatic addition of annotations recording the code location
// that created a pod. This is helpful when troubleshooting e2e tests (like e2e_node)
// which leak pods because of insufficient cleanup.
// Note we want a shallow clone to avoid mutating the receiver.
// The default is the value of GlobalOwnerTracking *when the client was created*.
func (c PodClient) WithOwnerTracking(value bool) *PodClient {
c.ownerTracking = value
return &c
}
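// Illustrative sketch (not part of the upstream file): opting a single client into
// owner tracking without flipping the package-level GlobalOwnerTracking flag.
func exampleOwnerTrackedClient(f *framework.Framework) *PodClient {
	return NewPodClient(f).WithOwnerTracking(true)
}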
// Create creates a new pod according to the framework specifications (don't wait for it to start).
func (c *PodClient) Create(ctx context.Context, pod *v1.Pod) *v1.Pod {
ginkgo.GinkgoHelper()
c.mungeSpec(pod)
c.setOwnerAnnotation(pod)
p, err := c.PodInterface.Create(ctx, pod, metav1.CreateOptions{})
framework.ExpectNoError(err, "Error creating Pod")
return p
}
// CreateSync creates a new pod according to the framework specifications, and waits for it to start and become running and ready.
func (c *PodClient) CreateSync(ctx context.Context, pod *v1.Pod) *v1.Pod {
ginkgo.GinkgoHelper()
p := c.Create(ctx, pod)
framework.ExpectNoError(WaitTimeoutForPodReadyInNamespace(ctx, c.f.ClientSet, p.Name, c.namespace, framework.PodStartTimeout))
// Get the newest pod after it becomes running and ready; some status fields may change after the pod is created, such as the pod IP.
p, err := c.Get(ctx, p.Name, metav1.GetOptions{})
framework.ExpectNoError(err)
return p
}
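// Illustrative sketch (not part of the upstream file): the typical create/use/delete
// flow with a PodClient. NewAgnhostPod lives elsewhere in this package; the pod
// name is arbitrary.
func examplePodClientLifecycle(ctx context.Context, f *framework.Framework) {
	client := NewPodClient(f)
	pod := NewAgnhostPod(f.Namespace.Name, "podclient-example", nil, nil, nil)
	// CreateSync blocks until the pod is running and ready.
	pod = client.CreateSync(ctx, pod)
	defer client.DeleteSync(ctx, pod.Name, metav1.DeleteOptions{}, DefaultPodDeletionTimeout)
	framework.Logf("pod %s is ready on node %s", pod.Name, pod.Spec.NodeName)
}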
// CreateBatch creates a batch of pods. All pods are created before waiting.
func (c *PodClient) CreateBatch(ctx context.Context, pods []*v1.Pod) []*v1.Pod {
ginkgo.GinkgoHelper()
ps := make([]*v1.Pod, len(pods))
var wg sync.WaitGroup
for i, pod := range pods {
wg.Add(1)
go func(i int, pod *v1.Pod) {
defer wg.Done()
defer ginkgo.GinkgoRecover()
ps[i] = c.CreateSync(ctx, pod)
}(i, pod)
}
wg.Wait()
return ps
}
// Update updates the pod object. It retries on conflict and fails the test if any other
// API error occurs. name is the pod name, updateFn is the function updating the
// pod object.
func (c *PodClient) Update(ctx context.Context, name string, updateFn func(pod *v1.Pod)) {
framework.ExpectNoError(wait.PollUntilContextTimeout(ctx, time.Millisecond*500, time.Second*30, false, func(ctx context.Context) (bool, error) {
pod, err := c.PodInterface.Get(ctx, name, metav1.GetOptions{})
if err != nil {
return false, fmt.Errorf("failed to get pod %q: %w", name, err)
}
updateFn(pod)
_, err = c.PodInterface.Update(ctx, pod, metav1.UpdateOptions{})
if err == nil {
framework.Logf("Successfully updated pod %q", name)
return true, nil
}
if apierrors.IsConflict(err) {
framework.Logf("Conflicting update to pod %q, re-get and re-update: %v", name, err)
return false, nil
}
return false, fmt.Errorf("failed to update pod %q: %w", name, err)
}))
}
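// Illustrative sketch (not part of the upstream file): using Update to mutate a live
// pod with automatic conflict retries. The label key and value are hypothetical.
func exampleUpdateAddLabel(ctx context.Context, c *PodClient, podName string) {
	c.Update(ctx, podName, func(pod *v1.Pod) {
		if pod.Labels == nil {
			pod.Labels = map[string]string{}
		}
		pod.Labels["example.e2e/touched"] = "true"
	})
}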
// AddEphemeralContainerSync adds an EphemeralContainer to a pod and waits for it to be running.
func (c *PodClient) AddEphemeralContainerSync(ctx context.Context, pod *v1.Pod, ec *v1.EphemeralContainer, timeout time.Duration) error {
podJS, err := json.Marshal(pod)
framework.ExpectNoError(err, "error creating JSON for pod %q", FormatPod(pod))
ecPod := pod.DeepCopy()
ecPod.Spec.EphemeralContainers = append(ecPod.Spec.EphemeralContainers, *ec)
ecJS, err := json.Marshal(ecPod)
framework.ExpectNoError(err, "error creating JSON for pod with ephemeral container %q", FormatPod(pod))
patch, err := strategicpatch.CreateTwoWayMergePatch(podJS, ecJS, pod)
framework.ExpectNoError(err, "error creating patch to add ephemeral container %q", FormatPod(pod))
// Clients may optimistically attempt to add an ephemeral container to determine whether the EphemeralContainers feature is enabled.
if _, err := c.Patch(ctx, pod.Name, types.StrategicMergePatchType, patch, metav1.PatchOptions{}, "ephemeralcontainers"); err != nil {
return err
}
framework.ExpectNoError(WaitForContainerRunning(ctx, c.f.ClientSet, c.namespace, pod.Name, ec.Name, timeout))
return nil
}
// FormatPod returns a string representing a pod in a consistent human readable format,
// with pod name, namespace and pod UID as part of the string.
// This code is taken from k/k/pkg/kubelet/util/format/pod.go to remove
// e2e framework -> k/k/pkg/kubelet dependency.
func FormatPod(pod *v1.Pod) string {
if pod == nil {
return "<nil>"
}
return fmt.Sprintf("%s_%s(%s)", pod.Name, pod.Namespace, pod.UID)
}
// DeleteSync deletes the pod and waits for the pod to disappear for `timeout`. If the pod doesn't
// disappear before the timeout, it will fail the test.
func (c *PodClient) DeleteSync(ctx context.Context, name string, options metav1.DeleteOptions, timeout time.Duration) {
err := c.Delete(ctx, name, options)
if err != nil && !apierrors.IsNotFound(err) {
framework.Failf("Failed to delete pod %q: %v", name, err)
}
framework.ExpectNoError(WaitForPodNotFoundInNamespace(ctx, c.f.ClientSet, name, c.namespace, timeout), "wait for pod %q to disappear", name)
}
// setOwnerAnnotation adds an annotation to help identify tests which incorrectly leak pods because of insufficient cleanup
func (c *PodClient) setOwnerAnnotation(pod *v1.Pod) {
if !c.ownerTracking {
return
}
ginkgo.GinkgoHelper()
location := ginkgotypes.NewCodeLocation(0)
if pod.Annotations == nil {
pod.Annotations = make(map[string]string)
}
pod.Annotations[AnnotationTestOwner] = fmt.Sprintf("%s:%d", location.FileName, location.LineNumber)
}
// mungeSpec apply test-suite specific transformations to the pod spec.
func (c *PodClient) mungeSpec(pod *v1.Pod) {
if !framework.TestContext.NodeE2E {
return
}
gomega.Expect(pod.Spec.NodeName).To(gomega.Or(gomega.BeZero(), gomega.Equal(framework.TestContext.NodeName)), "Test misconfigured")
pod.Spec.NodeName = framework.TestContext.NodeName
// Node e2e does not support the default DNSClusterFirst policy. Set
// the policy to DNSDefault, which is configured per node.
pod.Spec.DNSPolicy = v1.DNSDefault
// PrepullImages only works for node e2e now. For cluster e2e, image prepull is not enforced,
// so we should not munge ImagePullPolicy for cluster e2e pods.
if !framework.TestContext.PrepullImages {
return
}
// If prepull is enabled, munge the container spec to make sure the images are not pulled
// during the test.
for i := range pod.Spec.Containers {
c := &pod.Spec.Containers[i]
if c.ImagePullPolicy == v1.PullAlways {
// If the image pull policy is PullAlways, the image doesn't need to be in
// the allow list or pre-pulled, because the image is expected to be pulled
// in the test anyway.
continue
}
// If the image policy is not PullAlways, the image must be in the pre-pull list and
// pre-pulled.
gomega.Expect(ImagePrePullList.Has(c.Image)).To(gomega.BeTrueBecause("Image %q is not in the pre-pull list, consider adding it to PrePulledImages in test/e2e/common/util.go or NodePrePullImageList in test/e2e_node/image_list.go", c.Image))
// Do not pull images during the tests because the images in pre-pull list should have
// been prepulled.
c.ImagePullPolicy = v1.PullNever
}
}
// WaitForSuccess waits for pod to succeed.
// TODO(random-liu): Move pod wait function into this file
func (c *PodClient) WaitForSuccess(ctx context.Context, name string, timeout time.Duration) {
gomega.Expect(WaitForPodCondition(ctx, c.f.ClientSet, c.namespace, name, fmt.Sprintf("%s or %s", v1.PodSucceeded, v1.PodFailed), timeout,
func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodFailed:
return true, fmt.Errorf("pod %q failed with reason: %q, message: %q", name, pod.Status.Reason, pod.Status.Message)
case v1.PodSucceeded:
return true, nil
default:
return false, nil
}
},
)).To(gomega.Succeed(), "wait for pod %q to succeed", name)
}
// WaitForFinish waits for pod to finish running, regardless of success or failure.
func (c *PodClient) WaitForFinish(ctx context.Context, name string, timeout time.Duration) {
gomega.Expect(WaitForPodCondition(ctx, c.f.ClientSet, c.namespace, name, fmt.Sprintf("%s or %s", v1.PodSucceeded, v1.PodFailed), timeout,
func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodFailed:
return true, nil
case v1.PodSucceeded:
return true, nil
default:
return false, nil
}
},
)).To(gomega.Succeed(), "wait for pod %q to finish running", name)
}
// WaitForErrorEventOrSuccess waits for pod to succeed or an error event for that pod.
func (c *PodClient) WaitForErrorEventOrSuccess(ctx context.Context, pod *v1.Pod) (*v1.Event, error) {
return c.WaitForErrorEventOrSuccessWithTimeout(ctx, pod, framework.PodStartTimeout)
}
// WaitForErrorEventOrSuccessWithTimeout waits for pod to succeed or an error event for that pod for a specified time
func (c *PodClient) WaitForErrorEventOrSuccessWithTimeout(ctx context.Context, pod *v1.Pod, timeout time.Duration) (*v1.Event, error) {
var ev *v1.Event
err := wait.PollUntilContextTimeout(ctx, framework.Poll, timeout, false, func(ctx context.Context) (bool, error) {
evnts, err := c.f.ClientSet.CoreV1().Events(pod.Namespace).Search(scheme.Scheme, pod)
if err != nil {
return false, fmt.Errorf("error in listing events: %w", err)
}
for _, e := range evnts.Items {
switch e.Reason {
case killingContainer, failedToCreateContainer, forbiddenReason:
ev = &e
return true, nil
case startedContainer:
return true, nil
default:
// ignore all other errors
}
}
return false, nil
})
return ev, err
}
// MatchContainerOutput gets output of a container and match expected regexp in the output.
func (c *PodClient) MatchContainerOutput(ctx context.Context, name string, containerName string, expectedRegexp string) error {
f := c.f
output, err := GetPodLogs(ctx, f.ClientSet, f.Namespace.Name, name, containerName)
if err != nil {
return fmt.Errorf("failed to get output for container %q of pod %q", containerName, name)
}
regex, err := regexp.Compile(expectedRegexp)
if err != nil {
return fmt.Errorf("failed to compile regexp %q: %w", expectedRegexp, err)
}
if !regex.MatchString(output) {
return fmt.Errorf("failed to match regexp %q in output %q", expectedRegexp, output)
}
return nil
}
// PodIsReady returns true if the specified pod is ready. Otherwise false.
func (c *PodClient) PodIsReady(ctx context.Context, name string) bool {
pod, err := c.Get(ctx, name, metav1.GetOptions{})
framework.ExpectNoError(err)
return podutils.IsPodReady(pod)
}
// removeString returns a newly created []string that contains all items from slice
// that are not equal to s.
// This code is taken from k/k/pkg/util/slice/slice.go to remove
// e2e/framework/pod -> k/k/pkg/util/slice dependency.
func removeString(slice []string, s string) []string {
newSlice := make([]string, 0)
for _, item := range slice {
if item != s {
newSlice = append(newSlice, item)
}
}
if len(newSlice) == 0 {
// Sanitize for unit tests so we don't need to distinguish empty array
// and nil.
return nil
}
return newSlice
}
// RemoveFinalizer removes the pod's finalizer
func (c *PodClient) RemoveFinalizer(ctx context.Context, podName string, finalizerName string) {
framework.Logf("Removing pod's %q finalizer: %q", podName, finalizerName)
c.Update(ctx, podName, func(pod *v1.Pod) {
pod.ObjectMeta.Finalizers = removeString(pod.ObjectMeta.Finalizers, finalizerName)
})
}


@ -0,0 +1,397 @@
/*
Copyright 2024 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"encoding/json"
"errors"
"fmt"
"strconv"
"strings"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
utilerrors "k8s.io/apimachinery/pkg/util/errors"
kubecm "k8s.io/kubernetes/pkg/kubelet/cm"
"k8s.io/kubernetes/test/e2e/framework"
imageutils "k8s.io/kubernetes/test/utils/image"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
)
const (
CgroupCPUPeriod string = "/sys/fs/cgroup/cpu/cpu.cfs_period_us"
CgroupCPUShares string = "/sys/fs/cgroup/cpu/cpu.shares"
CgroupCPUQuota string = "/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
CgroupMemLimit string = "/sys/fs/cgroup/memory/memory.limit_in_bytes"
Cgroupv2MemLimit string = "/sys/fs/cgroup/memory.max"
Cgroupv2MemRequest string = "/sys/fs/cgroup/memory.min"
Cgroupv2CPULimit string = "/sys/fs/cgroup/cpu.max"
Cgroupv2CPURequest string = "/sys/fs/cgroup/cpu.weight"
CPUPeriod string = "100000"
MinContainerRuntimeVersion string = "1.6.9"
)
var (
podOnCgroupv2Node *bool
)
type ContainerResources struct {
CPUReq string
CPULim string
MemReq string
MemLim string
EphStorReq string
EphStorLim string
ExtendedResourceReq string
ExtendedResourceLim string
}
func (cr *ContainerResources) ResourceRequirements() *v1.ResourceRequirements {
if cr == nil {
return nil
}
var lim, req v1.ResourceList
if cr.CPULim != "" || cr.MemLim != "" || cr.EphStorLim != "" {
lim = make(v1.ResourceList)
}
if cr.CPUReq != "" || cr.MemReq != "" || cr.EphStorReq != "" {
req = make(v1.ResourceList)
}
if cr.CPULim != "" {
lim[v1.ResourceCPU] = resource.MustParse(cr.CPULim)
}
if cr.MemLim != "" {
lim[v1.ResourceMemory] = resource.MustParse(cr.MemLim)
}
if cr.EphStorLim != "" {
lim[v1.ResourceEphemeralStorage] = resource.MustParse(cr.EphStorLim)
}
if cr.CPUReq != "" {
req[v1.ResourceCPU] = resource.MustParse(cr.CPUReq)
}
if cr.MemReq != "" {
req[v1.ResourceMemory] = resource.MustParse(cr.MemReq)
}
if cr.EphStorReq != "" {
req[v1.ResourceEphemeralStorage] = resource.MustParse(cr.EphStorReq)
}
return &v1.ResourceRequirements{Limits: lim, Requests: req}
}
type ResizableContainerInfo struct {
Name string
Resources *ContainerResources
CPUPolicy *v1.ResourceResizeRestartPolicy
MemPolicy *v1.ResourceResizeRestartPolicy
RestartCount int32
}
type containerPatch struct {
Name string `json:"name"`
Resources struct {
Requests struct {
CPU string `json:"cpu,omitempty"`
Memory string `json:"memory,omitempty"`
EphStor string `json:"ephemeral-storage,omitempty"`
} `json:"requests"`
Limits struct {
CPU string `json:"cpu,omitempty"`
Memory string `json:"memory,omitempty"`
EphStor string `json:"ephemeral-storage,omitempty"`
} `json:"limits"`
} `json:"resources"`
}
type patchSpec struct {
Spec struct {
Containers []containerPatch `json:"containers"`
} `json:"spec"`
}
func getTestResourceInfo(tcInfo ResizableContainerInfo) (res v1.ResourceRequirements, resizePol []v1.ContainerResizePolicy) {
if tcInfo.Resources != nil {
res = *tcInfo.Resources.ResourceRequirements()
}
if tcInfo.CPUPolicy != nil {
cpuPol := v1.ContainerResizePolicy{ResourceName: v1.ResourceCPU, RestartPolicy: *tcInfo.CPUPolicy}
resizePol = append(resizePol, cpuPol)
}
if tcInfo.MemPolicy != nil {
memPol := v1.ContainerResizePolicy{ResourceName: v1.ResourceMemory, RestartPolicy: *tcInfo.MemPolicy}
resizePol = append(resizePol, memPol)
}
return res, resizePol
}
func InitDefaultResizePolicy(containers []ResizableContainerInfo) {
noRestart := v1.NotRequired
setDefaultPolicy := func(ci *ResizableContainerInfo) {
if ci.CPUPolicy == nil {
ci.CPUPolicy = &noRestart
}
if ci.MemPolicy == nil {
ci.MemPolicy = &noRestart
}
}
for i := range containers {
setDefaultPolicy(&containers[i])
}
}
func makeResizableContainer(tcInfo ResizableContainerInfo) v1.Container {
cmd := "grep Cpus_allowed_list /proc/self/status | cut -f2 && sleep 1d"
res, resizePol := getTestResourceInfo(tcInfo)
tc := v1.Container{
Name: tcInfo.Name,
Image: imageutils.GetE2EImage(imageutils.BusyBox),
Command: []string{"/bin/sh"},
Args: []string{"-c", cmd},
Resources: res,
ResizePolicy: resizePol,
}
return tc
}
func MakePodWithResizableContainers(ns, name, timeStamp string, tcInfo []ResizableContainerInfo) *v1.Pod {
var testContainers []v1.Container
for _, ci := range tcInfo {
tc := makeResizableContainer(ci)
testContainers = append(testContainers, tc)
}
pod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: ns,
Labels: map[string]string{
"time": timeStamp,
},
},
Spec: v1.PodSpec{
OS: &v1.PodOS{Name: v1.Linux},
Containers: testContainers,
RestartPolicy: v1.RestartPolicyOnFailure,
},
}
return pod
}
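// Illustrative sketch (not part of the upstream file): declaring one resizable
// container with initial CPU and memory requests/limits and building a pod for it.
// The container name, pod name and quantities are arbitrary.
func exampleResizablePod(ns, timeStamp string) *v1.Pod {
	containers := []ResizableContainerInfo{{
		Name:      "resize-example",
		Resources: &ContainerResources{CPUReq: "100m", CPULim: "200m", MemReq: "128Mi", MemLim: "256Mi"},
	}}
	// Fill in NotRequired resize policies where none were specified.
	InitDefaultResizePolicy(containers)
	return MakePodWithResizableContainers(ns, "resize-example-pod", timeStamp, containers)
}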
func VerifyPodResizePolicy(gotPod *v1.Pod, wantCtrs []ResizableContainerInfo) {
ginkgo.GinkgoHelper()
gomega.Expect(gotPod.Spec.Containers).To(gomega.HaveLen(len(wantCtrs)), "number of containers in pod spec should match")
for i, wantCtr := range wantCtrs {
gotCtr := &gotPod.Spec.Containers[i]
ctr := makeResizableContainer(wantCtr)
gomega.Expect(gotCtr.Name).To(gomega.Equal(ctr.Name))
gomega.Expect(gotCtr.ResizePolicy).To(gomega.Equal(ctr.ResizePolicy))
}
}
func VerifyPodResources(gotPod *v1.Pod, wantCtrs []ResizableContainerInfo) {
ginkgo.GinkgoHelper()
gomega.Expect(gotPod.Spec.Containers).To(gomega.HaveLen(len(wantCtrs)), "number of containers in pod spec should match")
for i, wantCtr := range wantCtrs {
gotCtr := &gotPod.Spec.Containers[i]
ctr := makeResizableContainer(wantCtr)
gomega.Expect(gotCtr.Name).To(gomega.Equal(ctr.Name))
gomega.Expect(gotCtr.Resources).To(gomega.Equal(ctr.Resources))
}
}
func VerifyPodStatusResources(gotPod *v1.Pod, wantCtrs []ResizableContainerInfo) error {
ginkgo.GinkgoHelper()
var errs []error
if len(gotPod.Status.ContainerStatuses) != len(wantCtrs) {
return fmt.Errorf("expectation length mismatch: got %d statuses, want %d",
len(gotPod.Status.ContainerStatuses), len(wantCtrs))
}
for i, wantCtr := range wantCtrs {
gotCtrStatus := &gotPod.Status.ContainerStatuses[i]
ctr := makeResizableContainer(wantCtr)
if gotCtrStatus.Name != ctr.Name {
errs = append(errs, fmt.Errorf("container status %d name %q != expected name %q", i, gotCtrStatus.Name, ctr.Name))
continue
}
if err := framework.Gomega().Expect(*gotCtrStatus.Resources).To(gomega.Equal(ctr.Resources)); err != nil {
errs = append(errs, fmt.Errorf("container[%s] status resources mismatch: %w", ctr.Name, err))
}
}
return utilerrors.NewAggregate(errs)
}
func VerifyPodContainersCgroupValues(ctx context.Context, f *framework.Framework, pod *v1.Pod, tcInfo []ResizableContainerInfo) error {
ginkgo.GinkgoHelper()
if podOnCgroupv2Node == nil {
value := IsPodOnCgroupv2Node(f, pod)
podOnCgroupv2Node = &value
}
cgroupMemLimit := Cgroupv2MemLimit
cgroupCPULimit := Cgroupv2CPULimit
cgroupCPURequest := Cgroupv2CPURequest
if !*podOnCgroupv2Node {
cgroupMemLimit = CgroupMemLimit
cgroupCPULimit = CgroupCPUQuota
cgroupCPURequest = CgroupCPUShares
}
var errs []error
for _, ci := range tcInfo {
if ci.Resources == nil {
continue
}
tc := makeResizableContainer(ci)
if tc.Resources.Limits != nil || tc.Resources.Requests != nil {
var expectedCPUShares int64
var expectedCPULimitString, expectedMemLimitString string
expectedMemLimitInBytes := tc.Resources.Limits.Memory().Value()
cpuRequest := tc.Resources.Requests.Cpu()
cpuLimit := tc.Resources.Limits.Cpu()
if cpuRequest.IsZero() && !cpuLimit.IsZero() {
expectedCPUShares = int64(kubecm.MilliCPUToShares(cpuLimit.MilliValue()))
} else {
expectedCPUShares = int64(kubecm.MilliCPUToShares(cpuRequest.MilliValue()))
}
cpuQuota := kubecm.MilliCPUToQuota(cpuLimit.MilliValue(), kubecm.QuotaPeriod)
if cpuLimit.IsZero() {
cpuQuota = -1
}
expectedCPULimitString = strconv.FormatInt(cpuQuota, 10)
expectedMemLimitString = strconv.FormatInt(expectedMemLimitInBytes, 10)
if *podOnCgroupv2Node {
if expectedCPULimitString == "-1" {
expectedCPULimitString = "max"
}
expectedCPULimitString = fmt.Sprintf("%s %s", expectedCPULimitString, CPUPeriod)
if expectedMemLimitString == "0" {
expectedMemLimitString = "max"
}
// convert cgroup v1 cpu.shares value to cgroup v2 cpu.weight value
// https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2#phase-1-convert-from-cgroups-v1-settings-to-v2
expectedCPUShares = int64(1 + ((expectedCPUShares-2)*9999)/262142)
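// For example, a request of 1 CPU corresponds to 1024 shares and maps to a
// cpu.weight of 1 + ((1024-2)*9999)/262142 = 39; the minimum of 2 shares
// maps to 1 and the maximum of 262144 maps to 10000.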
}
if expectedMemLimitString != "0" {
errs = append(errs, VerifyCgroupValue(f, pod, ci.Name, cgroupMemLimit, expectedMemLimitString))
}
errs = append(errs, VerifyCgroupValue(f, pod, ci.Name, cgroupCPULimit, expectedCPULimitString))
errs = append(errs, VerifyCgroupValue(f, pod, ci.Name, cgroupCPURequest, strconv.FormatInt(expectedCPUShares, 10)))
}
}
return utilerrors.NewAggregate(errs)
}
func verifyContainerRestarts(pod *v1.Pod, expectedContainers []ResizableContainerInfo) error {
ginkgo.GinkgoHelper()
expectContainerRestarts := map[string]int32{}
for _, ci := range expectedContainers {
expectContainerRestarts[ci.Name] = ci.RestartCount
}
errs := []error{}
for _, cs := range pod.Status.ContainerStatuses {
expectedRestarts := expectContainerRestarts[cs.Name]
if cs.RestartCount != expectedRestarts {
errs = append(errs, fmt.Errorf("unexpected number of restarts for container %s: got %d, want %d", cs.Name, cs.RestartCount, expectedRestarts))
}
}
return utilerrors.NewAggregate(errs)
}
func WaitForPodResizeActuation(ctx context.Context, f *framework.Framework, podClient *PodClient, pod *v1.Pod) *v1.Pod {
ginkgo.GinkgoHelper()
// Wait for resize to complete.
framework.ExpectNoError(WaitForPodCondition(ctx, f.ClientSet, pod.Namespace, pod.Name, "resize status cleared", f.Timeouts.PodStart,
func(pod *v1.Pod) (bool, error) {
if pod.Status.Resize == v1.PodResizeStatusInfeasible {
// This is a terminal resize state
return false, fmt.Errorf("resize is infeasible")
}
return pod.Status.Resize == "", nil
}), "pod should finish resizing")
resizedPod, err := framework.GetObject(podClient.Get, pod.Name, metav1.GetOptions{})(ctx)
framework.ExpectNoError(err, "failed to get resized pod")
return resizedPod
}
func ExpectPodResized(ctx context.Context, f *framework.Framework, resizedPod *v1.Pod, expectedContainers []ResizableContainerInfo) {
ginkgo.GinkgoHelper()
// Put each error on a new line for readability.
formatErrors := func(err error) error {
var agg utilerrors.Aggregate
if !errors.As(err, &agg) {
return err
}
errStrings := make([]string, len(agg.Errors()))
for i, err := range agg.Errors() {
errStrings[i] = err.Error()
}
return fmt.Errorf("[\n%s\n]", strings.Join(errStrings, ",\n"))
}
// Verify Pod Containers Cgroup Values
var errs []error
if cgroupErrs := VerifyPodContainersCgroupValues(ctx, f, resizedPod, expectedContainers); cgroupErrs != nil {
errs = append(errs, fmt.Errorf("container cgroup values don't match expected: %w", formatErrors(cgroupErrs)))
}
if resourceErrs := VerifyPodStatusResources(resizedPod, expectedContainers); resourceErrs != nil {
errs = append(errs, fmt.Errorf("container status resources don't match expected: %w", formatErrors(resourceErrs)))
}
if restartErrs := verifyContainerRestarts(resizedPod, expectedContainers); restartErrs != nil {
errs = append(errs, fmt.Errorf("container restart counts don't match expected: %w", formatErrors(restartErrs)))
}
if len(errs) > 0 {
resizedPod.ManagedFields = nil // Suppress managed fields in error output.
framework.ExpectNoError(formatErrors(utilerrors.NewAggregate(errs)),
"Verifying pod resources resize state. Pod: %s", framework.PrettyPrintJSON(resizedPod))
}
}
// ResizeContainerPatch generates a patch string to resize the pod container.
func ResizeContainerPatch(containers []ResizableContainerInfo) (string, error) {
var patch patchSpec
for _, container := range containers {
var cPatch containerPatch
cPatch.Name = container.Name
cPatch.Resources.Requests.CPU = container.Resources.CPUReq
cPatch.Resources.Requests.Memory = container.Resources.MemReq
cPatch.Resources.Limits.CPU = container.Resources.CPULim
cPatch.Resources.Limits.Memory = container.Resources.MemLim
patch.Spec.Containers = append(patch.Spec.Containers, cPatch)
}
patchBytes, err := json.Marshal(patch)
if err != nil {
return "", err
}
return string(patchBytes), nil
}
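// Illustrative sketch (not part of the upstream file): building a resize patch for a
// single container. With the values below the generated JSON is roughly
// {"spec":{"containers":[{"name":"resize-example","resources":{"requests":{...},"limits":{...}}}]}}
// and is intended to be applied to the pod as a strategic-merge patch.
func exampleResizePatch() (string, error) {
	return ResizeContainerPatch([]ResizableContainerInfo{{
		Name:      "resize-example",
		Resources: &ContainerResources{CPUReq: "200m", CPULim: "400m", MemReq: "128Mi", MemLim: "256Mi"},
	}})
}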


@ -0,0 +1,574 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/klog/v2"
"k8s.io/kubernetes/test/e2e/framework"
testutils "k8s.io/kubernetes/test/utils"
imageutils "k8s.io/kubernetes/test/utils/image"
)
// LabelLogOnPodFailure can be used to mark which Pods will have their logs logged in the case of
// a test failure. By default, if there are no Pods with this label, only the first 5 Pods will
// have their logs fetched.
const LabelLogOnPodFailure = "log-on-pod-failure"
// TODO: Move to its own subpkg.
// expectNoError checks if "err" is set, and if so, fails assertion while logging the error.
func expectNoError(err error, explain ...interface{}) {
expectNoErrorWithOffset(1, err, explain...)
}
// TODO: Move to its own subpkg.
// expectNoErrorWithOffset checks if "err" is set, and if so, fails assertion while logging the error at "offset" levels above its caller
// (for example, for call chain f -> g -> expectNoErrorWithOffset(1, ...) error would be logged for "f").
func expectNoErrorWithOffset(offset int, err error, explain ...interface{}) {
if err != nil {
framework.Logf("Unexpected error occurred: %v", err)
}
gomega.ExpectWithOffset(1+offset, err).NotTo(gomega.HaveOccurred(), explain...)
}
// PodsCreated returns a pod list matched by the given name.
func PodsCreated(ctx context.Context, c clientset.Interface, ns, name string, replicas int32) (*v1.PodList, error) {
label := labels.SelectorFromSet(labels.Set(map[string]string{"name": name}))
return PodsCreatedByLabel(ctx, c, ns, name, replicas, label)
}
// PodsCreatedByLabel returns a created pod list matched by the given label.
func PodsCreatedByLabel(ctx context.Context, c clientset.Interface, ns, name string, replicas int32, label labels.Selector) (*v1.PodList, error) {
timeout := 2 * time.Minute
for start := time.Now(); time.Since(start) < timeout; time.Sleep(5 * time.Second) {
options := metav1.ListOptions{LabelSelector: label.String()}
// List the pods, making sure we observe all the replicas.
pods, err := c.CoreV1().Pods(ns).List(ctx, options)
if err != nil {
return nil, err
}
created := []v1.Pod{}
for _, pod := range pods.Items {
if pod.DeletionTimestamp != nil {
continue
}
created = append(created, pod)
}
framework.Logf("Pod name %s: Found %d pods out of %d", name, len(created), replicas)
if int32(len(created)) == replicas {
pods.Items = created
return pods, nil
}
}
return nil, fmt.Errorf("Pod name %s: Gave up waiting %v for %d pods to come up", name, timeout, replicas)
}
// VerifyPods checks if the specified pod is responding.
func VerifyPods(ctx context.Context, c clientset.Interface, ns, name string, wantName bool, replicas int32) error {
return podRunningMaybeResponding(ctx, c, ns, name, wantName, replicas, true)
}
// VerifyPodsRunning checks if the specified pod is running.
func VerifyPodsRunning(ctx context.Context, c clientset.Interface, ns, name string, wantName bool, replicas int32) error {
return podRunningMaybeResponding(ctx, c, ns, name, wantName, replicas, false)
}
func podRunningMaybeResponding(ctx context.Context, c clientset.Interface, ns, name string, wantName bool, replicas int32, checkResponding bool) error {
pods, err := PodsCreated(ctx, c, ns, name, replicas)
if err != nil {
return err
}
e := podsRunning(ctx, c, pods)
if len(e) > 0 {
return fmt.Errorf("failed to wait for pods running: %v", e)
}
if checkResponding {
return WaitForPodsResponding(ctx, c, ns, name, wantName, podRespondingTimeout, pods)
}
return nil
}
func podsRunning(ctx context.Context, c clientset.Interface, pods *v1.PodList) []error {
// Wait for the pods to enter the running state. Waiting loops until the pods
// are running so non-running pods cause a timeout for this test.
ginkgo.By("ensuring each pod is running")
e := []error{}
errorChan := make(chan error)
for _, pod := range pods.Items {
go func(p v1.Pod) {
errorChan <- WaitForPodRunningInNamespace(ctx, c, &p)
}(pod)
}
for range pods.Items {
err := <-errorChan
if err != nil {
e = append(e, err)
}
}
return e
}
// LogPodStates logs basic info of provided pods for debugging.
func LogPodStates(pods []v1.Pod) {
// Find maximum widths for pod, node, and phase strings for column printing.
maxPodW, maxNodeW, maxPhaseW, maxGraceW := len("POD"), len("NODE"), len("PHASE"), len("GRACE")
for i := range pods {
pod := &pods[i]
if len(pod.ObjectMeta.Name) > maxPodW {
maxPodW = len(pod.ObjectMeta.Name)
}
if len(pod.Spec.NodeName) > maxNodeW {
maxNodeW = len(pod.Spec.NodeName)
}
if len(pod.Status.Phase) > maxPhaseW {
maxPhaseW = len(pod.Status.Phase)
}
}
// Increase widths by one to separate by a single space.
maxPodW++
maxNodeW++
maxPhaseW++
maxGraceW++
// Log pod info. * does space padding, - makes them left-aligned.
framework.Logf("%-[1]*[2]s %-[3]*[4]s %-[5]*[6]s %-[7]*[8]s %[9]s",
maxPodW, "POD", maxNodeW, "NODE", maxPhaseW, "PHASE", maxGraceW, "GRACE", "CONDITIONS")
for _, pod := range pods {
grace := ""
if pod.DeletionGracePeriodSeconds != nil {
grace = fmt.Sprintf("%ds", *pod.DeletionGracePeriodSeconds)
}
framework.Logf("%-[1]*[2]s %-[3]*[4]s %-[5]*[6]s %-[7]*[8]s %[9]s",
maxPodW, pod.ObjectMeta.Name, maxNodeW, pod.Spec.NodeName, maxPhaseW, pod.Status.Phase, maxGraceW, grace, pod.Status.Conditions)
}
framework.Logf("") // Final empty line helps for readability.
}
// logPodTerminationMessages logs termination messages for failing pods. It's a short snippet (much smaller than full logs), but it often shows
// why pods crashed and since it is in the API, it's fast to retrieve.
func logPodTerminationMessages(pods []v1.Pod) {
for _, pod := range pods {
for _, status := range pod.Status.InitContainerStatuses {
if status.LastTerminationState.Terminated != nil && len(status.LastTerminationState.Terminated.Message) > 0 {
framework.Logf("%s[%s].initContainer[%s]=%s", pod.Name, pod.Namespace, status.Name, status.LastTerminationState.Terminated.Message)
}
}
for _, status := range pod.Status.ContainerStatuses {
if status.LastTerminationState.Terminated != nil && len(status.LastTerminationState.Terminated.Message) > 0 {
framework.Logf("%s[%s].container[%s]=%s", pod.Name, pod.Namespace, status.Name, status.LastTerminationState.Terminated.Message)
}
}
}
}
// logPodLogs logs the container logs from pods in the given namespace. This can be helpful for debugging
// issues that do not cause the container to fail (e.g.: network connectivity issues)
// We will log the Pods that have the LabelLogOnPodFailure label. If there aren't any, we default to
// logging only the first 5 Pods. This requires the reportDir to be set, and the pods are logged into:
// {report_dir}/pods/{namespace}/{pod}/{container_name}/logs.txt
func logPodLogs(ctx context.Context, c clientset.Interface, namespace string, pods []v1.Pod, reportDir string) {
if reportDir == "" {
return
}
var logPods []v1.Pod
for _, pod := range pods {
if _, ok := pod.Labels[LabelLogOnPodFailure]; ok {
logPods = append(logPods, pod)
}
}
maxPods := len(logPods)
// There are no pods with the LabelLogOnPodFailure label, we default to the first 5 Pods.
if maxPods == 0 {
logPods = pods
maxPods = len(pods)
if maxPods > 5 {
maxPods = 5
}
}
tailLen := 42
for i := 0; i < maxPods; i++ {
pod := logPods[i]
for _, container := range pod.Spec.Containers {
logs, err := getPodLogsInternal(ctx, c, namespace, pod.Name, container.Name, false, nil, &tailLen)
if err != nil {
framework.Logf("Unable to fetch %s/%s/%s logs: %v", pod.Namespace, pod.Name, container.Name, err)
continue
}
logDir := filepath.Join(reportDir, namespace, pod.Name, container.Name)
err = os.MkdirAll(logDir, 0755)
if err != nil {
framework.Logf("Unable to create path '%s'. Err: %v", logDir, err)
continue
}
logPath := filepath.Join(logDir, "logs.txt")
err = os.WriteFile(logPath, []byte(logs), 0644)
if err != nil {
framework.Logf("Could not write the container logs in: %s. Err: %v", logPath, err)
}
}
}
}
// DumpAllPodInfoForNamespace logs all pod information for a given namespace.
func DumpAllPodInfoForNamespace(ctx context.Context, c clientset.Interface, namespace, reportDir string) {
pods, err := c.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{})
if err != nil {
framework.Logf("unable to fetch pod debug info: %v", err)
}
LogPodStates(pods.Items)
logPodTerminationMessages(pods.Items)
logPodLogs(ctx, c, namespace, pods.Items, reportDir)
}
// FilterNonRestartablePods filters out pods that will never get recreated if
// deleted after termination.
func FilterNonRestartablePods(pods []*v1.Pod) []*v1.Pod {
var results []*v1.Pod
for _, p := range pods {
if isNotRestartAlwaysMirrorPod(p) {
// Mirror pods with restart policy == Never will not get
// recreated if they are deleted after the pods have
// terminated. For now, we discount such pods.
// https://github.com/kubernetes/kubernetes/issues/34003
continue
}
results = append(results, p)
}
return results
}
func isNotRestartAlwaysMirrorPod(p *v1.Pod) bool {
// Check if the pod is a mirror pod
if _, ok := p.Annotations[v1.MirrorPodAnnotationKey]; !ok {
return false
}
return p.Spec.RestartPolicy != v1.RestartPolicyAlways
}
// NewAgnhostPod returns a pod that uses the agnhost image. The image's binary supports various subcommands
// that behave the same, no matter the underlying OS. If no args are given, it defaults to the pause subcommand.
// For more information about agnhost subcommands, see: https://github.com/kubernetes/kubernetes/tree/master/test/images/agnhost#agnhost
func NewAgnhostPod(ns, podName string, volumes []v1.Volume, mounts []v1.VolumeMount, ports []v1.ContainerPort, args ...string) *v1.Pod {
immediate := int64(0)
pod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: podName,
Namespace: ns,
},
Spec: v1.PodSpec{
Containers: []v1.Container{
NewAgnhostContainer("agnhost-container", mounts, ports, args...),
},
Volumes: volumes,
SecurityContext: &v1.PodSecurityContext{},
TerminationGracePeriodSeconds: &immediate,
},
}
return pod
}
func NewAgnhostPodFromContainers(ns, podName string, volumes []v1.Volume, containers ...v1.Container) *v1.Pod {
immediate := int64(0)
pod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: podName,
Namespace: ns,
},
Spec: v1.PodSpec{
Containers: containers[:],
Volumes: volumes,
SecurityContext: &v1.PodSecurityContext{},
TerminationGracePeriodSeconds: &immediate,
},
}
return pod
}
// NewAgnhostContainer returns the container Spec of an agnhost container.
func NewAgnhostContainer(containerName string, mounts []v1.VolumeMount, ports []v1.ContainerPort, args ...string) v1.Container {
if len(args) == 0 {
args = []string{"pause"}
}
return v1.Container{
Name: containerName,
Image: imageutils.GetE2EImage(imageutils.Agnhost),
Args: args,
VolumeMounts: mounts,
Ports: ports,
SecurityContext: &v1.SecurityContext{},
ImagePullPolicy: v1.PullIfNotPresent,
}
}
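// Illustrative sketch (not part of the upstream file): an agnhost pod running the
// netexec subcommand, a common way to get a small HTTP echo server for
// connectivity tests. The pod name and port are placeholders.
func exampleNetexecPod(ns string) *v1.Pod {
	ports := []v1.ContainerPort{{ContainerPort: 8080}}
	return NewAgnhostPod(ns, "netexec-example", nil, nil, ports, "netexec", "--http-port=8080")
}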
// NewExecPodSpec returns the pod spec of hostexec pod
func NewExecPodSpec(ns, name string, hostNetwork bool) *v1.Pod {
pod := NewAgnhostPod(ns, name, nil, nil, nil)
pod.Spec.HostNetwork = hostNetwork
return pod
}
// newExecPodSpec returns the pod spec of exec pod
func newExecPodSpec(ns, generateName string) *v1.Pod {
// GenerateName is an optional prefix, used by the server,
// to generate a unique name ONLY IF the Name field has not been provided
pod := NewAgnhostPod(ns, "", nil, nil, nil)
pod.ObjectMeta.GenerateName = generateName
return pod
}
// CreateExecPodOrFail creates an agnhost pause pod used as a vessel for kubectl exec commands.
// Pod name is uniquely generated.
func CreateExecPodOrFail(ctx context.Context, client clientset.Interface, ns, generateName string, tweak func(*v1.Pod)) *v1.Pod {
framework.Logf("Creating new exec pod")
pod := newExecPodSpec(ns, generateName)
if tweak != nil {
tweak(pod)
}
execPod, err := client.CoreV1().Pods(ns).Create(ctx, pod, metav1.CreateOptions{})
expectNoError(err, "failed to create new exec pod in namespace: %s", ns)
err = WaitForPodNameRunningInNamespace(ctx, client, execPod.Name, execPod.Namespace)
expectNoError(err, "failed to create new exec pod in namespace: %s", ns)
return execPod
}
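// Illustrative sketch (not part of the upstream file): creating a host-network exec
// pod via the tweak hook. The generate-name prefix is arbitrary.
func exampleHostNetworkExecPod(ctx context.Context, f *framework.Framework) *v1.Pod {
	return CreateExecPodOrFail(ctx, f.ClientSet, f.Namespace.Name, "hostexec-", func(p *v1.Pod) {
		p.Spec.HostNetwork = true
	})
}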
// WithWindowsHostProcess sets the Pod's Windows HostProcess option to true. When this option is set,
// HostNetwork can be enabled.
// Containers running as HostProcess will require certain usernames to be set, otherwise the Pod will
// not start: NT AUTHORITY\SYSTEM, NT AUTHORITY\Local service, NT AUTHORITY\NetworkService.
// If the given username is empty, NT AUTHORITY\SYSTEM will be used instead.
// See: https://kubernetes.io/docs/tasks/configure-pod-container/create-hostprocess-pod/
func WithWindowsHostProcess(pod *v1.Pod, username string) {
if pod.Spec.SecurityContext == nil {
pod.Spec.SecurityContext = &v1.PodSecurityContext{}
}
if pod.Spec.SecurityContext.WindowsOptions == nil {
pod.Spec.SecurityContext.WindowsOptions = &v1.WindowsSecurityContextOptions{}
}
trueVar := true
if username == "" {
username = "NT AUTHORITY\\SYSTEM"
}
pod.Spec.SecurityContext.WindowsOptions.HostProcess = &trueVar
pod.Spec.SecurityContext.WindowsOptions.RunAsUserName = &username
}
// CheckPodsRunningReady returns whether all pods whose names are listed in
// podNames in namespace ns are running and ready, using c and waiting at most
// timeout.
func CheckPodsRunningReady(ctx context.Context, c clientset.Interface, ns string, podNames []string, timeout time.Duration) bool {
return checkPodsCondition(ctx, c, ns, podNames, timeout, testutils.PodRunningReady, "running and ready")
}
// CheckPodsRunningReadyOrSucceeded returns whether all pods whose names are
// listed in podNames in namespace ns are running and ready, or succeeded, using
// c and waiting at most timeout.
func CheckPodsRunningReadyOrSucceeded(ctx context.Context, c clientset.Interface, ns string, podNames []string, timeout time.Duration) bool {
return checkPodsCondition(ctx, c, ns, podNames, timeout, testutils.PodRunningReadyOrSucceeded, "running and ready, or succeeded")
}
// checkPodsCondition returns whether all pods whose names are listed in podNames
// in namespace ns are in the condition, using c and waiting at most timeout.
func checkPodsCondition(ctx context.Context, c clientset.Interface, ns string, podNames []string, timeout time.Duration, condition podCondition, desc string) bool {
np := len(podNames)
framework.Logf("Waiting up to %v for %d pods to be %s: %s", timeout, np, desc, podNames)
type waitPodResult struct {
success bool
podName string
}
result := make(chan waitPodResult, len(podNames))
for _, podName := range podNames {
// Launch off pod readiness checkers.
go func(name string) {
err := WaitForPodCondition(ctx, c, ns, name, desc, timeout, condition)
result <- waitPodResult{err == nil, name}
}(podName)
}
// Wait for them all to finish.
success := true
for range podNames {
res := <-result
if !res.success {
framework.Logf("Pod %[1]s failed to be %[2]s.", res.podName, desc)
success = false
}
}
framework.Logf("Wanted all %d pods to be %s. Result: %t. Pods: %v", np, desc, success, podNames)
return success
}
// GetPodLogs returns the logs of the specified container (namespace/pod/container).
func GetPodLogs(ctx context.Context, c clientset.Interface, namespace, podName, containerName string) (string, error) {
return getPodLogsInternal(ctx, c, namespace, podName, containerName, false, nil, nil)
}
// GetPodLogsSince returns the logs of the specified container (namespace/pod/container) since a timestamp.
func GetPodLogsSince(ctx context.Context, c clientset.Interface, namespace, podName, containerName string, since time.Time) (string, error) {
sinceTime := metav1.NewTime(since)
return getPodLogsInternal(ctx, c, namespace, podName, containerName, false, &sinceTime, nil)
}
// GetPreviousPodLogs returns the logs of the previous instance of the
// specified container (namespace/pod/container).
func GetPreviousPodLogs(ctx context.Context, c clientset.Interface, namespace, podName, containerName string) (string, error) {
return getPodLogsInternal(ctx, c, namespace, podName, containerName, true, nil, nil)
}
// getPodLogsInternal is the shared implementation for the GetPodLogs* helpers above.
func getPodLogsInternal(ctx context.Context, c clientset.Interface, namespace, podName, containerName string, previous bool, sinceTime *metav1.Time, tailLines *int) (string, error) {
request := c.CoreV1().RESTClient().Get().
Resource("pods").
Namespace(namespace).
Name(podName).SubResource("log").
Param("container", containerName).
Param("previous", strconv.FormatBool(previous))
if sinceTime != nil {
request.Param("sinceTime", sinceTime.Format(time.RFC3339))
}
if tailLines != nil {
request.Param("tailLines", strconv.Itoa(*tailLines))
}
logs, err := request.Do(ctx).Raw()
if err != nil {
return "", err
}
if strings.Contains(string(logs), "Internal Error") {
return "", fmt.Errorf("Fetched log contains \"Internal Error\": %q", string(logs))
}
return string(logs), err
}
// GetPodsInNamespace returns the pods in the given namespace.
func GetPodsInNamespace(ctx context.Context, c clientset.Interface, ns string, ignoreLabels map[string]string) ([]*v1.Pod, error) {
pods, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
if err != nil {
return []*v1.Pod{}, err
}
ignoreSelector := labels.SelectorFromSet(ignoreLabels)
var filtered []*v1.Pod
for i := range pods.Items {
p := pods.Items[i]
if len(ignoreLabels) != 0 && ignoreSelector.Matches(labels.Set(p.Labels)) {
continue
}
filtered = append(filtered, &p)
}
return filtered, nil
}
// GetPods returns the pods matching the given labels in the given namespace.
func GetPods(ctx context.Context, c clientset.Interface, ns string, matchLabels map[string]string) ([]v1.Pod, error) {
label := labels.SelectorFromSet(matchLabels)
listOpts := metav1.ListOptions{LabelSelector: label.String()}
pods, err := c.CoreV1().Pods(ns).List(ctx, listOpts)
if err != nil {
return []v1.Pod{}, err
}
return pods.Items, nil
}
// GetPodSecretUpdateTimeout returns the timeout duration for updating pod secret.
func GetPodSecretUpdateTimeout(ctx context.Context, c clientset.Interface) time.Duration {
// With SecretManager(ConfigMapManager), we may have to wait up to full sync period +
// TTL of secret(configmap) to elapse before the Kubelet projects the update into the
// volume and the container picks it up.
// So this timeout is based on default Kubelet sync period (1 minute) + maximum TTL for
// secret(configmap) that's based on cluster size + additional time as a fudge factor.
secretTTL, err := getNodeTTLAnnotationValue(ctx, c)
if err != nil {
framework.Logf("Couldn't get node TTL annotation (using default value of 0): %v", err)
}
podLogTimeout := 240*time.Second + secretTTL
return podLogTimeout
}
// VerifyPodHasConditionWithType verifies the pod has the expected condition by type
func VerifyPodHasConditionWithType(ctx context.Context, f *framework.Framework, pod *v1.Pod, cType v1.PodConditionType) {
pod, err := f.ClientSet.CoreV1().Pods(f.Namespace.Name).Get(ctx, pod.Name, metav1.GetOptions{})
framework.ExpectNoError(err, "Failed to get the recent pod object for name: %q", pod.Name)
if condition := FindPodConditionByType(&pod.Status, cType); condition == nil {
framework.Failf("pod %q should have the condition: %q, pod status: %v", pod.Name, cType, pod.Status)
}
}
func getNodeTTLAnnotationValue(ctx context.Context, c clientset.Interface) (time.Duration, error) {
nodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
if err != nil || len(nodes.Items) == 0 {
return time.Duration(0), fmt.Errorf("Couldn't list any nodes to get TTL annotation: %w", err)
}
// Since the TTL the kubelet uses is stored in the node object, for the timeout
// purpose we take it from the first node (all of them should be the same).
node := &nodes.Items[0]
if node.Annotations == nil {
return time.Duration(0), fmt.Errorf("No annotations found on the node")
}
value, ok := node.Annotations[v1.ObjectTTLAnnotationKey]
if !ok {
return time.Duration(0), fmt.Errorf("No TTL annotation found on the node")
}
intValue, err := strconv.Atoi(value)
if err != nil {
return time.Duration(0), fmt.Errorf("Cannot convert TTL annotation from %#v to int", *node)
}
return time.Duration(intValue) * time.Second, nil
}
// FilterActivePods returns pods that have not terminated.
func FilterActivePods(pods []*v1.Pod) []*v1.Pod {
var result []*v1.Pod
for _, p := range pods {
if IsPodActive(p) {
result = append(result, p)
} else {
klog.V(4).Infof("Ignoring inactive pod %v/%v in state %v, deletion time %v",
p.Namespace, p.Name, p.Status.Phase, p.DeletionTimestamp)
}
}
return result
}
// IsPodActive returns true if the pod has neither succeeded nor failed and has no deletion timestamp.
func IsPodActive(p *v1.Pod) bool {
return v1.PodSucceeded != p.Status.Phase &&
v1.PodFailed != p.Status.Phase &&
p.DeletionTimestamp == nil
}


@ -0,0 +1,309 @@
/*
Copyright 2021 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"flag"
"fmt"
"strings"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
"k8s.io/kubernetes/test/e2e/framework"
imageutils "k8s.io/kubernetes/test/utils/image"
psaapi "k8s.io/pod-security-admission/api"
psapolicy "k8s.io/pod-security-admission/policy"
"k8s.io/utils/pointer"
)
// NodeOSDistroIs returns true if the distro is the same as `--node-os-distro`.
// The framework/pod package can't import the framework package (see #81245),
// which is the one that parses the flags. As a workaround, this method looks up
// the --node-os-distro flag again, e.g. to check whether --node-os-distro=windows is set.
// TODO: replace with `framework.NodeOSDistroIs` when #81245 is complete
func NodeOSDistroIs(distro string) bool {
var nodeOsDistro *flag.Flag = flag.Lookup("node-os-distro")
if nodeOsDistro != nil && nodeOsDistro.Value.String() == distro {
return true
}
return false
}
const InfiniteSleepCommand = "trap exit TERM; while true; do sleep 1; done"
// GenerateScriptCmd generates the corresponding command lines to execute a command.
func GenerateScriptCmd(command string) []string {
return []string{"/bin/sh", "-c", command}
}
// GetDefaultTestImage returns the default test image based on OS.
// If the node OS is windows, we currently return the Agnhost image for Windows nodes
// due to https://github.com/kubernetes-sigs/windows-testing/pull/35.
// If the node OS is linux, return the busybox image.
func GetDefaultTestImage() string {
return imageutils.GetE2EImage(GetDefaultTestImageID())
}
// GetDefaultTestImageID returns the default test image id based on OS.
// If the node OS is windows, we currently return the Agnhost image ID for Windows nodes
// due to https://github.com/kubernetes-sigs/windows-testing/pull/35.
// If the node OS is linux, return the busybox image ID.
func GetDefaultTestImageID() imageutils.ImageID {
return GetTestImageID(imageutils.BusyBox)
}
// GetTestImage returns the image name with the given input
// If the Node OS is windows, we currently return the Agnhost image for Windows nodes
// due to https://github.com/kubernetes-sigs/windows-testing/pull/35.
func GetTestImage(id imageutils.ImageID) string {
if NodeOSDistroIs("windows") {
return imageutils.GetE2EImage(imageutils.Agnhost)
}
return imageutils.GetE2EImage(id)
}
// GetTestImageID returns the image id with the given input
// If the Node OS is windows, we currently return the Agnhost image ID for Windows nodes
// due to https://github.com/kubernetes-sigs/windows-testing/pull/35.
func GetTestImageID(id imageutils.ImageID) imageutils.ImageID {
if NodeOSDistroIs("windows") {
return imageutils.Agnhost
}
return id
}
// GetDefaultNonRootUser returns the default non-root user.
// If the Node OS is windows, we return nil due to an issue with invalid permissions set on projected volumes
// https://github.com/kubernetes/kubernetes/issues/102849
func GetDefaultNonRootUser() *int64 {
if NodeOSDistroIs("windows") {
return nil
}
return pointer.Int64(DefaultNonRootUser)
}
// GeneratePodSecurityContext generates the corresponding pod security context with the given inputs
// If the Node OS is windows, currently we will ignore the inputs and return nil.
// TODO: Will modify it after windows has its own security context
func GeneratePodSecurityContext(fsGroup *int64, seLinuxOptions *v1.SELinuxOptions) *v1.PodSecurityContext {
if NodeOSDistroIs("windows") {
return nil
}
return &v1.PodSecurityContext{
FSGroup: fsGroup,
SELinuxOptions: seLinuxOptions,
}
}
// GenerateContainerSecurityContext generates the corresponding container security context with the given inputs
// If the Node OS is windows, currently we will ignore the inputs and return nil.
// TODO: Will modify it after windows has its own security context
func GenerateContainerSecurityContext(level psaapi.Level) *v1.SecurityContext {
if NodeOSDistroIs("windows") {
return nil
}
switch level {
case psaapi.LevelBaseline:
return &v1.SecurityContext{
Privileged: pointer.Bool(false),
}
case psaapi.LevelPrivileged:
return &v1.SecurityContext{
Privileged: pointer.Bool(true),
}
case psaapi.LevelRestricted:
return GetRestrictedContainerSecurityContext()
default:
ginkgo.Fail(fmt.Sprintf("unknown k8s.io/pod-security-admission/policy.Level %q", level))
panic("not reached")
}
}
// GetLinuxLabel returns the default SELinuxLabel based on OS.
// If the node OS is windows, it will return nil
func GetLinuxLabel() *v1.SELinuxOptions {
if NodeOSDistroIs("windows") {
return nil
}
return &v1.SELinuxOptions{
Level: "s0:c0,c1"}
}
// DefaultNonRootUser is the default user ID used for running restricted (non-root) containers.
const DefaultNonRootUser = 1000
// DefaultNonRootUserName is the default username in Windows used for running restricted (non-root) containers
const DefaultNonRootUserName = "ContainerUser"
// GetRestrictedPodSecurityContext returns a restricted pod security context.
// This includes setting RunAsUser for convenience, to pass the RunAsNonRoot check.
// Tests that require a specific user ID should override this.
func GetRestrictedPodSecurityContext() *v1.PodSecurityContext {
psc := &v1.PodSecurityContext{
RunAsNonRoot: pointer.Bool(true),
RunAsUser: GetDefaultNonRootUser(),
SeccompProfile: &v1.SeccompProfile{Type: v1.SeccompProfileTypeRuntimeDefault},
}
if NodeOSDistroIs("windows") {
psc.WindowsOptions = &v1.WindowsSecurityContextOptions{}
psc.WindowsOptions.RunAsUserName = pointer.String(DefaultNonRootUserName)
}
return psc
}
// GetRestrictedContainerSecurityContext returns a minimal restricted container security context.
func GetRestrictedContainerSecurityContext() *v1.SecurityContext {
return &v1.SecurityContext{
AllowPrivilegeEscalation: pointer.Bool(false),
Capabilities: &v1.Capabilities{Drop: []v1.Capability{"ALL"}},
}
}
var psaEvaluator, _ = psapolicy.NewEvaluator(psapolicy.DefaultChecks())
// MustMixinRestrictedPodSecurity makes the given pod compliant with the restricted pod security level.
// If doing so would overwrite existing non-conformant configuration, a test failure is triggered.
func MustMixinRestrictedPodSecurity(pod *v1.Pod) *v1.Pod {
err := MixinRestrictedPodSecurity(pod)
gomega.ExpectWithOffset(1, err).NotTo(gomega.HaveOccurred())
return pod
}
// MixinRestrictedPodSecurity makes the given pod compliant with the restricted pod security level.
// If doing so would overwrite existing non-conformant configuration, an error is returned.
// Note that this sets a default RunAsUser. See GetRestrictedPodSecurityContext.
// TODO(#105919): Handle PodOS for windows pods.
func MixinRestrictedPodSecurity(pod *v1.Pod) error {
if pod.Spec.SecurityContext == nil {
pod.Spec.SecurityContext = GetRestrictedPodSecurityContext()
} else {
if pod.Spec.SecurityContext.RunAsNonRoot == nil {
pod.Spec.SecurityContext.RunAsNonRoot = pointer.Bool(true)
}
if pod.Spec.SecurityContext.RunAsUser == nil {
pod.Spec.SecurityContext.RunAsUser = GetDefaultNonRootUser()
}
if pod.Spec.SecurityContext.SeccompProfile == nil {
pod.Spec.SecurityContext.SeccompProfile = &v1.SeccompProfile{Type: v1.SeccompProfileTypeRuntimeDefault}
}
if NodeOSDistroIs("windows") && pod.Spec.SecurityContext.WindowsOptions == nil {
pod.Spec.SecurityContext.WindowsOptions = &v1.WindowsSecurityContextOptions{}
pod.Spec.SecurityContext.WindowsOptions.RunAsUserName = pointer.String(DefaultNonRootUserName)
}
}
for i := range pod.Spec.Containers {
mixinRestrictedContainerSecurityContext(&pod.Spec.Containers[i])
}
for i := range pod.Spec.InitContainers {
mixinRestrictedContainerSecurityContext(&pod.Spec.InitContainers[i])
}
// Validate the resulting pod against the restricted profile.
restricted := psaapi.LevelVersion{
Level: psaapi.LevelRestricted,
Version: psaapi.LatestVersion(),
}
if agg := psapolicy.AggregateCheckResults(psaEvaluator.EvaluatePod(restricted, &pod.ObjectMeta, &pod.Spec)); !agg.Allowed {
return fmt.Errorf("failed to make pod %s restricted: %s", pod.Name, agg.ForbiddenDetail())
}
return nil
}
// mixinRestrictedContainerSecurityContext adds the required container security context options to
// be compliant with the restricted pod security level. Non-conformance checking is handled by the
// caller.
func mixinRestrictedContainerSecurityContext(container *v1.Container) {
if container.SecurityContext == nil {
container.SecurityContext = GetRestrictedContainerSecurityContext()
} else {
if container.SecurityContext.AllowPrivilegeEscalation == nil {
container.SecurityContext.AllowPrivilegeEscalation = pointer.Bool(false)
}
if container.SecurityContext.Capabilities == nil {
container.SecurityContext.Capabilities = &v1.Capabilities{}
}
if len(container.SecurityContext.Capabilities.Drop) == 0 {
container.SecurityContext.Capabilities.Drop = []v1.Capability{"ALL"}
}
}
}
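// Illustrative sketch (not part of the upstream framework API): a typical test
// builds its pod spec first and then calls MustMixinRestrictedPodSecurity to fill
// in the pod- and container-level defaults required by the restricted Pod Security
// level. The helper name below is hypothetical; the pre-set RunAsNonRoot only
// demonstrates that existing conformant fields are preserved.
func exampleRestrictPod(pod *v1.Pod) *v1.Pod {
    if pod.Spec.SecurityContext == nil {
        // Conformant values that are already present are kept as-is;
        // only missing fields get filled in by the mixin.
        pod.Spec.SecurityContext = &v1.PodSecurityContext{
            RunAsNonRoot: pointer.Bool(true),
        }
    }
    return MustMixinRestrictedPodSecurity(pod)
}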
// FindPodConditionByType loops through all pod conditions in pod status and returns the specified condition.
func FindPodConditionByType(podStatus *v1.PodStatus, conditionType v1.PodConditionType) *v1.PodCondition {
for _, cond := range podStatus.Conditions {
if cond.Type == conditionType {
return &cond
}
}
return nil
}
// FindContainerStatusInPod finds a container status by its name in the provided pod
func FindContainerStatusInPod(pod *v1.Pod, containerName string) *v1.ContainerStatus {
for _, containerStatus := range pod.Status.InitContainerStatuses {
if containerStatus.Name == containerName {
return &containerStatus
}
}
for _, containerStatus := range pod.Status.ContainerStatuses {
if containerStatus.Name == containerName {
return &containerStatus
}
}
for _, containerStatus := range pod.Status.EphemeralContainerStatuses {
if containerStatus.Name == containerName {
return &containerStatus
}
}
return nil
}
// VerifyCgroupValue verifies that the given cgroup path has the expected value in
// the specified container of the pod. It execs into the container to retrieve the
// cgroup value and compares it against the expected value.
func VerifyCgroupValue(f *framework.Framework, pod *v1.Pod, cName, cgPath, expectedCgValue string) error {
cmd := fmt.Sprintf("head -n 1 %s", cgPath)
framework.Logf("Namespace %s Pod %s Container %s - looking for cgroup value %s in path %s",
pod.Namespace, pod.Name, cName, expectedCgValue, cgPath)
cgValue, _, err := ExecCommandInContainerWithFullOutput(f, pod.Name, cName, "/bin/sh", "-c", cmd)
if err != nil {
return fmt.Errorf("failed to find expected value %q in container cgroup %q", expectedCgValue, cgPath)
}
cgValue = strings.Trim(cgValue, "\n")
if cgValue != expectedCgValue {
return fmt.Errorf("cgroup value %q not equal to expected %q", cgValue, expectedCgValue)
}
return nil
}
// IsPodOnCgroupv2Node checks whether the pod is running on a cgroup v2 node.
// TODO: Deduplicate this function with NPD cluster e2e test:
// https://github.com/kubernetes/kubernetes/blob/2049360379bcc5d6467769cef112e6e492d3d2f0/test/e2e/node/node_problem_detector.go#L369
func IsPodOnCgroupv2Node(f *framework.Framework, pod *v1.Pod) bool {
cmd := "mount -t cgroup2"
out, _, err := ExecCommandInContainerWithFullOutput(f, pod.Name, pod.Spec.Containers[0].Name, "/bin/sh", "-c", cmd)
if err != nil {
return false
}
return len(out) != 0
}
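// Illustrative sketch (not part of the upstream framework API): the two helpers
// above are typically combined so that the cgroup file being checked matches the
// cgroup version of the node. The expected value (128Mi expressed in bytes) and
// the use of the first container are hypothetical.
func exampleVerifyMemoryLimit(f *framework.Framework, pod *v1.Pod) error {
    cgPath := "/sys/fs/cgroup/memory/memory.limit_in_bytes" // cgroup v1 path
    if IsPodOnCgroupv2Node(f, pod) {
        cgPath = "/sys/fs/cgroup/memory.max" // cgroup v2 path
    }
    return VerifyCgroupValue(f, pod, pod.Spec.Containers[0].Name, cgPath, "134217728")
}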

View File

@ -0,0 +1,870 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pod
import (
"context"
"errors"
"fmt"
"reflect"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
"github.com/onsi/gomega/gcustom"
"github.com/onsi/gomega/types"
appsv1 "k8s.io/api/apps/v1"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
apitypes "k8s.io/apimachinery/pkg/types"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubectl/pkg/util/podutils"
"k8s.io/kubernetes/test/e2e/framework"
testutils "k8s.io/kubernetes/test/utils"
"k8s.io/kubernetes/test/utils/format"
)
const (
// defaultPodDeletionTimeout is the default timeout for deleting pod.
defaultPodDeletionTimeout = 3 * time.Minute
// podListTimeout is how long to wait for the pod to be listable.
podListTimeout = time.Minute
podRespondingTimeout = 15 * time.Minute
// How long pods have to become scheduled onto nodes
podScheduledBeforeTimeout = podListTimeout + (20 * time.Second)
// podStartTimeout is how long to wait for the pod to be started.
podStartTimeout = 5 * time.Minute
// singleCallTimeout is how long to try single API calls (like 'get' or 'list'). Used to prevent
// transient failures from failing tests.
singleCallTimeout = 5 * time.Minute
// Some pods can take much longer to get ready due to volume attach/detach latency.
slowPodStartTimeout = 15 * time.Minute
)
type podCondition func(pod *v1.Pod) (bool, error)
// BeRunningNoRetries verifies that a pod starts running. It's a permanent
// failure when the pod enters some other permanent phase.
func BeRunningNoRetries() types.GomegaMatcher {
return gomega.And(
// This additional matcher checks for the final error condition.
gcustom.MakeMatcher(func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodFailed, v1.PodSucceeded:
return false, gomega.StopTrying(fmt.Sprintf("Expected pod to reach phase %q, got final phase %q instead:\n%s", v1.PodRunning, pod.Status.Phase, format.Object(pod, 1)))
default:
return true, nil
}
}),
BeInPhase(v1.PodRunning),
)
}
// BeInPhase matches if pod.status.phase is the expected phase.
func BeInPhase(phase v1.PodPhase) types.GomegaMatcher {
// A simple implementation of this would be:
// return gomega.HaveField("Status.Phase", phase)
//
// But that produces a fairly generic
// Value for field 'Status.Phase' failed to satisfy matcher.
// failure message and doesn't show the pod. We can do better than
// that with a custom matcher.
return gcustom.MakeMatcher(func(pod *v1.Pod) (bool, error) {
return pod.Status.Phase == phase, nil
}).WithTemplate("Expected Pod {{.To}} be in {{format .Data}}\nGot instead:\n{{.FormattedActual}}").WithTemplateData(phase)
}
// WaitForAlmostAllPodsReady waits up to timeout for the following conditions:
// 1. At least minPods Pods in Namespace ns are Running and Ready
// 2. All Pods in Namespace ns are either Ready or Succeeded
// 3. All Pods part of a ReplicaSet or ReplicationController in Namespace ns are Ready
//
// After the timeout has elapsed, an error is returned if the number of Pods in a Pending Phase
// is greater than allowedNotReadyPods.
//
// It is generally recommended to use WaitForPodsRunningReady instead of this function
// whenever possible, because its behavior is more intuitive. Similar to WaitForPodsRunningReady,
// this function requests the list of pods on every iteration, making it useful for situations
// where the set of Pods is likely changing, such as during cluster startup.
//
// If minPods or allowedNotReadyPods are -1, this method returns immediately
// without waiting.
func WaitForAlmostAllPodsReady(ctx context.Context, c clientset.Interface, ns string, minPods, allowedNotReadyPods int, timeout time.Duration) error {
if minPods == -1 || allowedNotReadyPods == -1 {
return nil
}
// We get the new list of pods, replication controllers, and replica
// sets in every iteration because more pods come online during startup
// and we want to ensure they are also checked.
//
// This struct gets populated while polling, then gets checked, and in
// case of a timeout is included in the failure message.
type state struct {
ReplicationControllers []v1.ReplicationController
ReplicaSets []appsv1.ReplicaSet
Pods []v1.Pod
}
nOk := 0
badPods := []v1.Pod{}
otherPods := []v1.Pod{}
succeededPods := []string{}
err := framework.Gomega().Eventually(ctx, framework.HandleRetry(func(ctx context.Context) (*state, error) {
rcList, err := c.CoreV1().ReplicationControllers(ns).List(ctx, metav1.ListOptions{})
if err != nil {
return nil, fmt.Errorf("listing replication controllers in namespace %s: %w", ns, err)
}
rsList, err := c.AppsV1().ReplicaSets(ns).List(ctx, metav1.ListOptions{})
if err != nil {
return nil, fmt.Errorf("listing replication sets in namespace %s: %w", ns, err)
}
podList, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
if err != nil {
return nil, fmt.Errorf("listing pods in namespace %s: %w", ns, err)
}
return &state{
ReplicationControllers: rcList.Items,
ReplicaSets: rsList.Items,
Pods: podList.Items,
}, nil
})).WithTimeout(timeout).Should(framework.MakeMatcher(func(s *state) (func() string, error) {
replicas, replicaOk := int32(0), int32(0)
for _, rc := range s.ReplicationControllers {
replicas += *rc.Spec.Replicas
replicaOk += rc.Status.ReadyReplicas
}
for _, rs := range s.ReplicaSets {
replicas += *rs.Spec.Replicas
replicaOk += rs.Status.ReadyReplicas
}
nOk = 0
badPods = []v1.Pod{}
otherPods = []v1.Pod{}
succeededPods = []string{}
for _, pod := range s.Pods {
res, err := testutils.PodRunningReady(&pod)
switch {
case res && err == nil:
nOk++
case pod.Status.Phase == v1.PodSucceeded:
// it doesn't make sense to wait for this pod
succeededPods = append(succeededPods, pod.Name)
case pod.Status.Phase == v1.PodFailed:
// ignore failed pods that are controlled by some controller
if metav1.GetControllerOf(&pod) == nil {
badPods = append(badPods, pod)
}
default:
otherPods = append(otherPods, pod)
}
}
done := replicaOk == replicas && nOk >= minPods && (len(badPods)+len(otherPods)) == 0
if done {
return nil, nil
}
// Delayed formatting of a failure message.
return func() string {
var buffer strings.Builder
buffer.WriteString(fmt.Sprintf("Expected all pods (need at least %d) in namespace %q to be running and ready (except for %d).\n", minPods, ns, allowedNotReadyPods))
buffer.WriteString(fmt.Sprintf("%d / %d pods were running and ready.\n", nOk, len(s.Pods)))
buffer.WriteString(fmt.Sprintf("Expected %d pod replicas, %d are Running and Ready.\n", replicas, replicaOk))
if len(succeededPods) > 0 {
buffer.WriteString(fmt.Sprintf("Pods that completed successfully:\n%s", format.Object(succeededPods, 1)))
}
if len(badPods) > 0 {
buffer.WriteString(fmt.Sprintf("Pods that failed and were not controlled by some controller:\n%s", format.Object(badPods, 1)))
}
if len(otherPods) > 0 {
buffer.WriteString(fmt.Sprintf("Pods that were neither completed nor running:\n%s", format.Object(otherPods, 1)))
}
return buffer.String()
}, nil
}))
// An error might not be fatal.
if len(otherPods) <= allowedNotReadyPods {
return nil
}
return err
}
// WaitForPodsRunningReady waits up to timeout for the following conditions:
// 1. At least minPods Pods in Namespace ns are Running and Ready
// 2. No Pods in Namespace ns are Pending, and none are Failed unless they are owned by a controller
//
// An error is returned if either of these conditions are not met within the timeout.
//
// It has separate behavior from other 'wait for' pods functions in
// that it requests the list of pods on every iteration. This is useful, for
// example, in cluster startup, because the number of pods increases while
// waiting. Pods in the Succeeded state are not counted.
func WaitForPodsRunningReady(ctx context.Context, c clientset.Interface, ns string, minPods int, timeout time.Duration) error {
return framework.Gomega().Eventually(ctx, framework.HandleRetry(func(ctx context.Context) ([]v1.Pod, error) {
podList, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{})
if err != nil {
return nil, fmt.Errorf("listing pods in namespace %s: %w", ns, err)
}
return podList.Items, nil
})).WithTimeout(timeout).Should(framework.MakeMatcher(func(pods []v1.Pod) (func() string, error) {
nOk := 0
badPods := []v1.Pod{}
otherPods := []v1.Pod{}
succeededPods := []string{}
for _, pod := range pods {
res, err := testutils.PodRunningReady(&pod)
switch {
case res && err == nil:
nOk++
case pod.Status.Phase == v1.PodSucceeded:
// ignore succeeded pods
succeededPods = append(succeededPods, pod.Name)
case pod.Status.Phase == v1.PodFailed:
// ignore failed pods that are controlled by some controller
if metav1.GetControllerOf(&pod) == nil {
badPods = append(badPods, pod)
}
default:
otherPods = append(otherPods, pod)
}
}
if nOk >= minPods && len(badPods)+len(otherPods) == 0 {
return nil, nil
}
// Delayed formatting of a failure message.
return func() string {
var buffer strings.Builder
buffer.WriteString(fmt.Sprintf("Expected all pods (need at least %d) in namespace %q to be running and ready \n", minPods, ns))
buffer.WriteString(fmt.Sprintf("%d / %d pods were running and ready.\n", nOk, len(pods)))
if len(succeededPods) > 0 {
buffer.WriteString(fmt.Sprintf("Pods that completed successfully:\n%s", format.Object(succeededPods, 1)))
}
if len(badPods) > 0 {
buffer.WriteString(fmt.Sprintf("Pods that failed and were not controlled by some controller:\n%s", format.Object(badPods, 1)))
}
if len(otherPods) > 0 {
buffer.WriteString(fmt.Sprintf("Pods that were neither completed nor running:\n%s", format.Object(otherPods, 1)))
}
return buffer.String()
}, nil
}))
}
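// Illustrative sketch (not part of the upstream framework API): a typical use of
// WaitForPodsRunningReady during test setup, requiring a minimum number of running
// and ready pods in a system namespace. The namespace, pod count and timeout are
// hypothetical.
func exampleWaitForSystemPods(ctx context.Context, c clientset.Interface) error {
    return WaitForPodsRunningReady(ctx, c, "kube-system", 8, 10*time.Minute)
}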
// WaitForPodCondition waits for a pod to match the given condition.
// The condition callback may use gomega.StopTrying to abort early.
func WaitForPodCondition(ctx context.Context, c clientset.Interface, ns, podName, conditionDesc string, timeout time.Duration, condition podCondition) error {
return framework.Gomega().
Eventually(ctx, framework.RetryNotFound(framework.GetObject(c.CoreV1().Pods(ns).Get, podName, metav1.GetOptions{}))).
WithTimeout(timeout).
Should(framework.MakeMatcher(func(pod *v1.Pod) (func() string, error) {
done, err := condition(pod)
if err != nil {
return nil, err
}
if done {
return nil, nil
}
return func() string {
return fmt.Sprintf("expected pod to be %s, got instead:\n%s", conditionDesc, format.Object(pod, 1))
}, nil
}))
}
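// Illustrative sketch (not part of the upstream framework API): custom conditions
// are plain callbacks, so waiting for an arbitrary status field is a one-liner.
// The condition description and timeout are hypothetical.
func exampleWaitForPodIP(ctx context.Context, c clientset.Interface, ns, podName string) error {
    return WaitForPodCondition(ctx, c, ns, podName, "assigned a pod IP", 2*time.Minute, func(pod *v1.Pod) (bool, error) {
        return pod.Status.PodIP != "", nil
    })
}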
// Range determines how many items must exist and how many must match a certain
// condition. Values <= 0 are ignored.
// TODO (?): move to test/e2e/framework/range
type Range struct {
// MinMatching must be <= actual matching items or <= 0.
MinMatching int
// MaxMatching must be >= actual matching items or <= 0.
// To check for "no matching items", set NonMatching.
MaxMatching int
// NoneMatching indicates that no item must match.
NoneMatching bool
// AllMatching indicates that all items must match.
AllMatching bool
// MinFound must be <= existing items or <= 0.
MinFound int
}
// Min returns how many items must exist.
func (r Range) Min() int {
min := r.MinMatching
if min < r.MinFound {
min = r.MinFound
}
return min
}
// WaitForPods waits for pods in the given namespace to match the given
// condition. How many pods must exist and how many must match the condition
// is determined by the range parameter. The condition callback may use
// gomega.StopTrying(...).Now() to abort early. The condition description
// will be used with "expected pods to <description>".
func WaitForPods(ctx context.Context, c clientset.Interface, ns string, opts metav1.ListOptions, r Range, timeout time.Duration, conditionDesc string, condition func(*v1.Pod) bool) (*v1.PodList, error) {
var finalPods *v1.PodList
minPods := r.Min()
match := func(pods *v1.PodList) (func() string, error) {
finalPods = pods
if len(pods.Items) < minPods {
return func() string {
return fmt.Sprintf("expected at least %d pods, only got %d", minPods, len(pods.Items))
}, nil
}
var nonMatchingPods, matchingPods []v1.Pod
for _, pod := range pods.Items {
if condition(&pod) {
matchingPods = append(matchingPods, pod)
} else {
nonMatchingPods = append(nonMatchingPods, pod)
}
}
matching := len(pods.Items) - len(nonMatchingPods)
if matching < r.MinMatching && r.MinMatching > 0 {
return func() string {
return fmt.Sprintf("expected at least %d pods to %s, %d out of %d were not:\n%s",
r.MinMatching, conditionDesc, len(nonMatchingPods), len(pods.Items),
format.Object(nonMatchingPods, 1))
}, nil
}
if len(nonMatchingPods) > 0 && r.AllMatching {
return func() string {
return fmt.Sprintf("expected all pods to %s, %d out of %d were not:\n%s",
conditionDesc, len(nonMatchingPods), len(pods.Items),
format.Object(nonMatchingPods, 1))
}, nil
}
if matching > r.MaxMatching && r.MaxMatching > 0 {
return func() string {
return fmt.Sprintf("expected at most %d pods to %s, %d out of %d were:\n%s",
r.MaxMatching, conditionDesc, len(matchingPods), len(pods.Items),
format.Object(matchingPods, 1))
}, nil
}
if matching > 0 && r.NoneMatching {
return func() string {
return fmt.Sprintf("expected no pods to %s, %d out of %d were:\n%s",
conditionDesc, len(matchingPods), len(pods.Items),
format.Object(matchingPods, 1))
}, nil
}
return nil, nil
}
err := framework.Gomega().
Eventually(ctx, framework.ListObjects(c.CoreV1().Pods(ns).List, opts)).
WithTimeout(timeout).
Should(framework.MakeMatcher(match))
return finalPods, err
}
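// Illustrative sketch (not part of the upstream framework API): Range and the
// condition callback combine as shown below. At least three pods with the
// hypothetical label app=example must exist and all of them must be scheduled to
// a node; the timeout is hypothetical as well.
func exampleWaitForScheduledExamplePods(ctx context.Context, c clientset.Interface, ns string) (*v1.PodList, error) {
    opts := metav1.ListOptions{LabelSelector: "app=example"}
    return WaitForPods(ctx, c, ns, opts, Range{MinFound: 3, AllMatching: true}, 5*time.Minute,
        "be scheduled", func(pod *v1.Pod) bool {
            return pod.Spec.NodeName != ""
        })
}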
// RunningReady checks whether pod p's phase is running and it has a ready
// condition of status true.
func RunningReady(p *v1.Pod) bool {
return p.Status.Phase == v1.PodRunning && podutils.IsPodReady(p)
}
// WaitForPodsRunning waits up to `timeout` for exactly `num` pods in the given `ns` to be running and ready.
func WaitForPodsRunning(ctx context.Context, c clientset.Interface, ns string, num int, timeout time.Duration) error {
_, err := WaitForPods(ctx, c, ns, metav1.ListOptions{}, Range{MinMatching: num, MaxMatching: num}, timeout,
"be running and ready", func(pod *v1.Pod) bool {
ready, _ := testutils.PodRunningReady(pod)
return ready
})
return err
}
// WaitForPodsSchedulingGated waits up to `timeout` for exactly `num` pods in the given `ns` to be in the scheduling gated state.
func WaitForPodsSchedulingGated(ctx context.Context, c clientset.Interface, ns string, num int, timeout time.Duration) error {
_, err := WaitForPods(ctx, c, ns, metav1.ListOptions{}, Range{MinMatching: num, MaxMatching: num}, timeout,
"be in scheduling gated state", func(pod *v1.Pod) bool {
for _, condition := range pod.Status.Conditions {
if condition.Type == v1.PodScheduled && condition.Reason == v1.PodReasonSchedulingGated {
return true
}
}
return false
})
return err
}
// WaitForPodsWithSchedulingGates waits up to `timeout` for exactly `num` pods in the given `ns`
// to have exactly the given `schedulingGates`.
func WaitForPodsWithSchedulingGates(ctx context.Context, c clientset.Interface, ns string, num int, timeout time.Duration, schedulingGates []v1.PodSchedulingGate) error {
_, err := WaitForPods(ctx, c, ns, metav1.ListOptions{}, Range{MinMatching: num, MaxMatching: num}, timeout,
"have certain scheduling gates", func(pod *v1.Pod) bool {
return reflect.DeepEqual(pod.Spec.SchedulingGates, schedulingGates)
})
return err
}
// WaitForPodTerminatedInNamespace returns an error if it takes too long for the pod to terminate,
// if the pod Get api returns an error (IsNotFound or other), or if the pod failed (and thus did not
// terminate) with an unexpected reason. Typically called to test that the passed-in pod is fully
// terminated (reason==""), but may be called to detect if a pod did *not* terminate according to
// the supplied reason.
func WaitForPodTerminatedInNamespace(ctx context.Context, c clientset.Interface, podName, reason, namespace string) error {
return WaitForPodCondition(ctx, c, namespace, podName, fmt.Sprintf("terminated with reason %s", reason), podStartTimeout, func(pod *v1.Pod) (bool, error) {
// Only consider Failed pods. Successful pods will be deleted and detected in
// waitForPodCondition's Get call returning `IsNotFound`
if pod.Status.Phase == v1.PodFailed {
if pod.Status.Reason == reason { // short-circuit waitForPodCondition's loop
return true, nil
}
return true, fmt.Errorf("Expected pod %q in namespace %q to be terminated with reason %q, got reason: %q", podName, namespace, reason, pod.Status.Reason)
}
return false, nil
})
}
// WaitForPodTerminatingInNamespaceTimeout returns if the pod is terminating, or an error if it is not after the timeout.
func WaitForPodTerminatingInNamespaceTimeout(ctx context.Context, c clientset.Interface, podName, namespace string, timeout time.Duration) error {
return WaitForPodCondition(ctx, c, namespace, podName, "is terminating", timeout, func(pod *v1.Pod) (bool, error) {
if pod.DeletionTimestamp != nil {
return true, nil
}
return false, nil
})
}
// WaitForPodSuccessInNamespaceTimeout returns nil if the pod reached state success, or an error if it reached failure or ran too long.
func WaitForPodSuccessInNamespaceTimeout(ctx context.Context, c clientset.Interface, podName, namespace string, timeout time.Duration) error {
return WaitForPodCondition(ctx, c, namespace, podName, fmt.Sprintf("%s or %s", v1.PodSucceeded, v1.PodFailed), timeout, func(pod *v1.Pod) (bool, error) {
if pod.DeletionTimestamp == nil && pod.Spec.RestartPolicy == v1.RestartPolicyAlways {
return true, gomega.StopTrying(fmt.Sprintf("pod %q will never terminate with a succeeded state since its restart policy is Always", podName))
}
switch pod.Status.Phase {
case v1.PodSucceeded:
ginkgo.By("Saw pod success")
return true, nil
case v1.PodFailed:
return true, gomega.StopTrying(fmt.Sprintf("pod %q failed with status: \n%s", podName, format.Object(pod.Status, 1)))
default:
return false, nil
}
})
}
// WaitForPodNameUnschedulableInNamespace returns an error if it takes too long for the pod to become Pending
// and have condition Status equal to Unschedulable,
// if the pod Get api returns an error (IsNotFound or other), or if the pod failed with an unexpected reason.
// Typically called to test that the passed-in pod is Pending and Unschedulable.
func WaitForPodNameUnschedulableInNamespace(ctx context.Context, c clientset.Interface, podName, namespace string) error {
return WaitForPodCondition(ctx, c, namespace, podName, v1.PodReasonUnschedulable, podStartTimeout, func(pod *v1.Pod) (bool, error) {
// Only consider Failed pods. Successful pods will be deleted and detected in
// waitForPodCondition's Get call returning `IsNotFound`
if pod.Status.Phase == v1.PodPending {
for _, cond := range pod.Status.Conditions {
if cond.Type == v1.PodScheduled && cond.Status == v1.ConditionFalse && cond.Reason == v1.PodReasonUnschedulable {
return true, nil
}
}
}
if pod.Status.Phase == v1.PodRunning || pod.Status.Phase == v1.PodSucceeded || pod.Status.Phase == v1.PodFailed {
return true, fmt.Errorf("Expected pod %q in namespace %q to be in phase Pending, but got phase: %v", podName, namespace, pod.Status.Phase)
}
return false, nil
})
}
// WaitForPodNameRunningInNamespace waits default amount of time (PodStartTimeout) for the specified pod to become running.
// Returns an error if timeout occurs first, or pod goes in to failed state.
func WaitForPodNameRunningInNamespace(ctx context.Context, c clientset.Interface, podName, namespace string) error {
return WaitTimeoutForPodRunningInNamespace(ctx, c, podName, namespace, podStartTimeout)
}
// WaitForPodRunningInNamespaceSlow waits an extended amount of time (slowPodStartTimeout) for the specified pod to become running.
// Returns an error if timeout occurs first, or pod goes in to failed state.
func WaitForPodRunningInNamespaceSlow(ctx context.Context, c clientset.Interface, podName, namespace string) error {
return WaitTimeoutForPodRunningInNamespace(ctx, c, podName, namespace, slowPodStartTimeout)
}
// WaitTimeoutForPodRunningInNamespace waits the given timeout duration for the specified pod to become running.
// It does not need to exist yet when this function gets called and the pod is not expected to be recreated
// when it succeeds or fails.
func WaitTimeoutForPodRunningInNamespace(ctx context.Context, c clientset.Interface, podName, namespace string, timeout time.Duration) error {
return framework.Gomega().Eventually(ctx, framework.RetryNotFound(framework.GetObject(c.CoreV1().Pods(namespace).Get, podName, metav1.GetOptions{}))).
WithTimeout(timeout).
Should(BeRunningNoRetries())
}
// WaitForPodRunningInNamespace waits default amount of time (podStartTimeout) for the specified pod to become running.
// Returns an error if timeout occurs first, or pod goes in to failed state.
func WaitForPodRunningInNamespace(ctx context.Context, c clientset.Interface, pod *v1.Pod) error {
if pod.Status.Phase == v1.PodRunning {
return nil
}
return WaitTimeoutForPodRunningInNamespace(ctx, c, pod.Name, pod.Namespace, podStartTimeout)
}
// WaitTimeoutForPodNoLongerRunningInNamespace waits the given timeout duration for the specified pod to stop.
func WaitTimeoutForPodNoLongerRunningInNamespace(ctx context.Context, c clientset.Interface, podName, namespace string, timeout time.Duration) error {
return WaitForPodCondition(ctx, c, namespace, podName, "completed", timeout, func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodFailed, v1.PodSucceeded:
return true, nil
}
return false, nil
})
}
// WaitForPodNoLongerRunningInNamespace waits default amount of time (defaultPodDeletionTimeout) for the specified pod to stop running.
// Returns an error if timeout occurs first.
func WaitForPodNoLongerRunningInNamespace(ctx context.Context, c clientset.Interface, podName, namespace string) error {
return WaitTimeoutForPodNoLongerRunningInNamespace(ctx, c, podName, namespace, defaultPodDeletionTimeout)
}
// WaitTimeoutForPodReadyInNamespace waits the given timeout duration for the
// specified pod to be ready and running.
func WaitTimeoutForPodReadyInNamespace(ctx context.Context, c clientset.Interface, podName, namespace string, timeout time.Duration) error {
return WaitForPodCondition(ctx, c, namespace, podName, "running and ready", timeout, func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodFailed, v1.PodSucceeded:
return false, gomega.StopTrying(fmt.Sprintf("The phase of Pod %s is %s which is unexpected.", pod.Name, pod.Status.Phase))
case v1.PodRunning:
return podutils.IsPodReady(pod), nil
}
return false, nil
})
}
// WaitForPodNotPending returns an error if it took too long for the pod to go out of pending state.
func WaitForPodNotPending(ctx context.Context, c clientset.Interface, ns, podName string) error {
return WaitForPodCondition(ctx, c, ns, podName, "not pending", podStartTimeout, func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodPending:
return false, nil
default:
return true, nil
}
})
}
// WaitForPodSuccessInNamespace returns nil if the pod reached the Succeeded state, or an error if it reached Failed or did not finish within podStartTimeout.
func WaitForPodSuccessInNamespace(ctx context.Context, c clientset.Interface, podName string, namespace string) error {
return WaitForPodSuccessInNamespaceTimeout(ctx, c, podName, namespace, podStartTimeout)
}
// WaitForPodNotFoundInNamespace returns an error if it takes too long for the pod to fully terminate.
// Unlike `WaitForPodTerminatedInNamespace`, the pod's Phase and Reason are ignored. If the pod Get
// api returns IsNotFound then the wait stops and nil is returned. If the Get api returns an error other
// than "not found" and that error is final, that error is returned and the wait stops.
func WaitForPodNotFoundInNamespace(ctx context.Context, c clientset.Interface, podName, ns string, timeout time.Duration) error {
err := framework.Gomega().Eventually(ctx, framework.HandleRetry(func(ctx context.Context) (*v1.Pod, error) {
pod, err := c.CoreV1().Pods(ns).Get(ctx, podName, metav1.GetOptions{})
if apierrors.IsNotFound(err) {
return nil, nil
}
return pod, err
})).WithTimeout(timeout).Should(gomega.BeNil())
if err != nil {
return fmt.Errorf("expected pod to not be found: %w", err)
}
return nil
}
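// Illustrative sketch (not part of the upstream framework API): the usual pairing
// is to delete a pod and then wait until it is fully gone, tolerating the case
// where it already disappeared. The timeout is hypothetical.
func exampleDeleteAndWaitGone(ctx context.Context, c clientset.Interface, ns, podName string) error {
    if err := c.CoreV1().Pods(ns).Delete(ctx, podName, metav1.DeleteOptions{}); err != nil && !apierrors.IsNotFound(err) {
        return err
    }
    return WaitForPodNotFoundInNamespace(ctx, c, podName, ns, 2*time.Minute)
}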
// WaitForPodsResponding waits for the pods to respond.
func WaitForPodsResponding(ctx context.Context, c clientset.Interface, ns string, controllerName string, wantName bool, timeout time.Duration, pods *v1.PodList) error {
if timeout == 0 {
timeout = podRespondingTimeout
}
ginkgo.By("trying to dial each unique pod")
label := labels.SelectorFromSet(labels.Set(map[string]string{"name": controllerName}))
options := metav1.ListOptions{LabelSelector: label.String()}
type response struct {
podName string
response string
}
get := func(ctx context.Context) ([]response, error) {
currentPods, err := c.CoreV1().Pods(ns).List(ctx, options)
if err != nil {
return nil, fmt.Errorf("list pods: %w", err)
}
var responses []response
for _, pod := range pods.Items {
// Check that the replica list remains unchanged, otherwise we have problems.
if !isElementOf(pod.UID, currentPods) {
return nil, gomega.StopTrying(fmt.Sprintf("Pod with UID %s is no longer a member of the replica set. Must have been restarted for some reason.\nCurrent replica set:\n%s", pod.UID, format.Object(currentPods, 1)))
}
ctxUntil, cancel := context.WithTimeout(ctx, singleCallTimeout)
defer cancel()
body, err := c.CoreV1().RESTClient().Get().
Namespace(ns).
Resource("pods").
SubResource("proxy").
Name(string(pod.Name)).
Do(ctxUntil).
Raw()
if err != nil {
// We may encounter errors here because of a race between the pod readiness and apiserver
// proxy or because of temporary failures. The error gets wrapped for framework.HandleRetry.
// Gomega+Ginkgo will handle logging.
return nil, fmt.Errorf("controller %s: failed to Get from replica pod %s:\n%w\nPod status:\n%s",
controllerName, pod.Name,
err, format.Object(pod.Status, 1))
}
responses = append(responses, response{podName: pod.Name, response: string(body)})
}
return responses, nil
}
match := func(responses []response) (func() string, error) {
// The response checker expects the pod's name unless wantName is false, in
// which case it just checks for a non-empty response.
var unexpected []response
for _, response := range responses {
if wantName {
if response.response != response.podName {
unexpected = append(unexpected, response)
}
} else {
if len(response.response) == 0 {
unexpected = append(unexpected, response)
}
}
}
if len(unexpected) > 0 {
return func() string {
what := "some response"
if wantName {
what = "the pod's own name as response"
}
return fmt.Sprintf("Wanted %s, but the following pods replied with something else:\n%s", what, format.Object(unexpected, 1))
}, nil
}
return nil, nil
}
err := framework.Gomega().
Eventually(ctx, framework.HandleRetry(get)).
WithTimeout(timeout).
Should(framework.MakeMatcher(match))
if err != nil {
return fmt.Errorf("checking pod responses: %w", err)
}
return nil
}
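// Illustrative sketch (not part of the upstream framework API): the usual caller
// first lists the pods of a controller by its label and then checks that each pod
// answers through the apiserver proxy with its own name. The controller name
// "example-rc" is hypothetical; a zero timeout selects the default
// podRespondingTimeout.
func exampleCheckControllerPodsResponding(ctx context.Context, c clientset.Interface, ns string) error {
    label := labels.SelectorFromSet(labels.Set{"name": "example-rc"})
    pods, err := WaitForPodsWithLabel(ctx, c, ns, label)
    if err != nil {
        return err
    }
    return WaitForPodsResponding(ctx, c, ns, "example-rc", true, 0, pods)
}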
func isElementOf(podUID apitypes.UID, pods *v1.PodList) bool {
for _, pod := range pods.Items {
if pod.UID == podUID {
return true
}
}
return false
}
// WaitForNumberOfPods waits up to timeout to ensure there are exactly
// `num` pods in namespace `ns`.
// It returns the matching Pods or a timeout error.
func WaitForNumberOfPods(ctx context.Context, c clientset.Interface, ns string, num int, timeout time.Duration) (pods *v1.PodList, err error) {
return WaitForPods(ctx, c, ns, metav1.ListOptions{}, Range{MinMatching: num, MaxMatching: num}, timeout, "exist", func(pod *v1.Pod) bool {
return true
})
}
// WaitForPodsWithLabelScheduled waits for all matching pods to become scheduled and at least one
// matching pod exists. Return the list of matching pods.
func WaitForPodsWithLabelScheduled(ctx context.Context, c clientset.Interface, ns string, label labels.Selector) (pods *v1.PodList, err error) {
opts := metav1.ListOptions{LabelSelector: label.String()}
return WaitForPods(ctx, c, ns, opts, Range{MinFound: 1, AllMatching: true}, podScheduledBeforeTimeout, "be scheduled", func(pod *v1.Pod) bool {
return pod.Spec.NodeName != ""
})
}
// WaitForPodsWithLabel waits up to podListTimeout for at least one pod with the given label to exist.
func WaitForPodsWithLabel(ctx context.Context, c clientset.Interface, ns string, label labels.Selector) (*v1.PodList, error) {
opts := metav1.ListOptions{LabelSelector: label.String()}
return WaitForPods(ctx, c, ns, opts, Range{MinFound: 1}, podListTimeout, "exist", func(pod *v1.Pod) bool {
return true
})
}
// WaitForPodsWithLabelRunningReady waits for the given number of matching pods to become running and ready.
// Return the list of matching pods.
func WaitForPodsWithLabelRunningReady(ctx context.Context, c clientset.Interface, ns string, label labels.Selector, num int, timeout time.Duration) (pods *v1.PodList, err error) {
opts := metav1.ListOptions{LabelSelector: label.String()}
return WaitForPods(ctx, c, ns, opts, Range{MinFound: num, AllMatching: true}, timeout, "be running and ready", RunningReady)
}
// WaitForNRestartablePods tries to list restartable pods using ps until it finds `expect` of them,
// returning their names if it can do so before timeout.
func WaitForNRestartablePods(ctx context.Context, ps *testutils.PodStore, expect int, timeout time.Duration) ([]string, error) {
var pods []*v1.Pod
get := func(ctx context.Context) ([]*v1.Pod, error) {
return ps.List(), nil
}
match := func(allPods []*v1.Pod) (func() string, error) {
pods = FilterNonRestartablePods(allPods)
if len(pods) != expect {
return func() string {
return fmt.Sprintf("expected to find non-restartable %d pods, but found %d:\n%s", expect, len(pods), format.Object(pods, 1))
}, nil
}
return nil, nil
}
err := framework.Gomega().
Eventually(ctx, framework.HandleRetry(get)).
WithTimeout(timeout).
Should(framework.MakeMatcher(match))
if err != nil {
return nil, err
}
podNames := make([]string, len(pods))
for i, p := range pods {
podNames[i] = p.Name
}
return podNames, nil
}
// WaitForPodContainerToFail waits for the given Pod container to fail with the given reason, specifically due to
// invalid container configuration. In this case, the container will remain in a waiting state with a specific
// reason set, which should match the given reason.
func WaitForPodContainerToFail(ctx context.Context, c clientset.Interface, namespace, podName string, containerIndex int, reason string, timeout time.Duration) error {
conditionDesc := fmt.Sprintf("container %d failed with reason %s", containerIndex, reason)
return WaitForPodCondition(ctx, c, namespace, podName, conditionDesc, timeout, func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodPending:
if len(pod.Status.ContainerStatuses) == 0 {
return false, nil
}
containerStatus := pod.Status.ContainerStatuses[containerIndex]
if containerStatus.State.Waiting != nil && containerStatus.State.Waiting.Reason == reason {
return true, nil
}
return false, nil
case v1.PodFailed, v1.PodRunning, v1.PodSucceeded:
return false, fmt.Errorf("pod was expected to be pending, but it is in the state: %s", pod.Status.Phase)
}
return false, nil
})
}
// WaitForPodScheduled waits for the pod to be scheduled, i.e. the .spec.nodeName is set.
func WaitForPodScheduled(ctx context.Context, c clientset.Interface, namespace, podName string) error {
return WaitForPodCondition(ctx, c, namespace, podName, "pod is scheduled", podScheduledBeforeTimeout, func(pod *v1.Pod) (bool, error) {
return pod.Spec.NodeName != "", nil
})
}
// WaitForPodContainerStarted waits for the given Pod container to start, after a successful run of the startupProbe.
func WaitForPodContainerStarted(ctx context.Context, c clientset.Interface, namespace, podName string, containerIndex int, timeout time.Duration) error {
conditionDesc := fmt.Sprintf("container %d started", containerIndex)
return WaitForPodCondition(ctx, c, namespace, podName, conditionDesc, timeout, func(pod *v1.Pod) (bool, error) {
if containerIndex > len(pod.Status.ContainerStatuses)-1 {
return false, nil
}
containerStatus := pod.Status.ContainerStatuses[containerIndex]
return *containerStatus.Started, nil
})
}
// WaitForPodInitContainerStarted waits for the given Pod init container to start.
func WaitForPodInitContainerStarted(ctx context.Context, c clientset.Interface, namespace, podName string, initContainerIndex int, timeout time.Duration) error {
conditionDesc := fmt.Sprintf("init container %d started", initContainerIndex)
return WaitForPodCondition(ctx, c, namespace, podName, conditionDesc, timeout, func(pod *v1.Pod) (bool, error) {
if initContainerIndex > len(pod.Status.InitContainerStatuses)-1 {
return false, nil
}
initContainerStatus := pod.Status.InitContainerStatuses[initContainerIndex]
return *initContainerStatus.Started, nil
})
}
// WaitForPodFailedReason wait for pod failed reason in status, for example "SysctlForbidden".
func WaitForPodFailedReason(ctx context.Context, c clientset.Interface, pod *v1.Pod, reason string, timeout time.Duration) error {
conditionDesc := fmt.Sprintf("failed with reason %s", reason)
return WaitForPodCondition(ctx, c, pod.Namespace, pod.Name, conditionDesc, timeout, func(pod *v1.Pod) (bool, error) {
switch pod.Status.Phase {
case v1.PodSucceeded:
return true, errors.New("pod succeeded unexpectedly")
case v1.PodFailed:
if pod.Status.Reason == reason {
return true, nil
} else {
return true, fmt.Errorf("pod failed with reason %s", pod.Status.Reason)
}
}
return false, nil
})
}
// WaitForContainerRunning waits for the given Pod container to have a state of running
func WaitForContainerRunning(ctx context.Context, c clientset.Interface, namespace, podName, containerName string, timeout time.Duration) error {
conditionDesc := fmt.Sprintf("container %s running", containerName)
return WaitForPodCondition(ctx, c, namespace, podName, conditionDesc, timeout, func(pod *v1.Pod) (bool, error) {
for _, statuses := range [][]v1.ContainerStatus{pod.Status.ContainerStatuses, pod.Status.InitContainerStatuses, pod.Status.EphemeralContainerStatuses} {
for _, cs := range statuses {
if cs.Name == containerName {
return cs.State.Running != nil, nil
}
}
}
return false, nil
})
}
// WaitForContainerTerminated waits for the given Pod container to have a state of terminated
func WaitForContainerTerminated(ctx context.Context, c clientset.Interface, namespace, podName, containerName string, timeout time.Duration) error {
conditionDesc := fmt.Sprintf("container %s terminated", containerName)
return WaitForPodCondition(ctx, c, namespace, podName, conditionDesc, timeout, func(pod *v1.Pod) (bool, error) {
for _, statuses := range [][]v1.ContainerStatus{pod.Status.ContainerStatuses, pod.Status.InitContainerStatuses, pod.Status.EphemeralContainerStatuses} {
for _, cs := range statuses {
if cs.Name == containerName {
return cs.State.Terminated != nil, nil
}
}
}
return false, nil
})
}

View File

@ -0,0 +1,28 @@
/*
Copyright 2020 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
// NOTE: constants in this file are copied from pkg/cluster/ports/ports.go
const (
// KubeletPort is the default port for the kubelet server on each host machine.
// May be overridden by a flag at startup.
KubeletPort = 10250
// KubeControllerManagerPort is the default port for the controller manager status server.
// May be overridden by a flag at startup.
KubeControllerManagerPort = 10257
)

View File

@ -0,0 +1,192 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"context"
"fmt"
"os"
"sync"
v1 "k8s.io/api/core/v1"
clientset "k8s.io/client-go/kubernetes"
)
// Factory is a func which operates provider specific behavior.
type Factory func() (ProviderInterface, error)
var (
providers = make(map[string]Factory)
mutex sync.Mutex
)
// RegisterProvider is expected to be called during application init,
// typically by an init function in a provider package.
func RegisterProvider(name string, factory Factory) {
mutex.Lock()
defer mutex.Unlock()
if _, ok := providers[name]; ok {
panic(fmt.Sprintf("provider %s already registered", name))
}
providers[name] = factory
}
// GetProviders returns the names of all currently registered providers.
func GetProviders() []string {
mutex.Lock()
defer mutex.Unlock()
var providerNames []string
for name := range providers {
providerNames = append(providerNames, name)
}
return providerNames
}
func init() {
// "local" or "skeleton" can always be used.
RegisterProvider("local", func() (ProviderInterface, error) {
return NullProvider{}, nil
})
RegisterProvider("skeleton", func() (ProviderInterface, error) {
return NullProvider{}, nil
})
// The empty string used to be accepted in the past, but is not
// a valid value anymore.
}
// SetupProviderConfig validates the chosen provider and creates
// an interface instance for it.
func SetupProviderConfig(providerName string) (ProviderInterface, error) {
var err error
mutex.Lock()
defer mutex.Unlock()
factory, ok := providers[providerName]
if !ok {
return nil, fmt.Errorf("The provider %s is unknown: %w", providerName, os.ErrNotExist)
}
provider, err := factory()
return provider, err
}
// ProviderInterface contains the implementation for certain
// provider-specific functionality.
type ProviderInterface interface {
FrameworkBeforeEach(f *Framework)
FrameworkAfterEach(f *Framework)
ResizeGroup(group string, size int32) error
GetGroupNodes(group string) ([]string, error)
GroupSize(group string) (int, error)
DeleteNode(node *v1.Node) error
CreatePD(zone string) (string, error)
DeletePD(pdName string) error
CreateShare() (string, string, string, error)
DeleteShare(accountName, shareName string) error
CreatePVSource(ctx context.Context, zone, diskName string) (*v1.PersistentVolumeSource, error)
DeletePVSource(ctx context.Context, pvSource *v1.PersistentVolumeSource) error
CleanupServiceResources(ctx context.Context, c clientset.Interface, loadBalancerName, region, zone string)
EnsureLoadBalancerResourcesDeleted(ctx context.Context, ip, portRange string) error
LoadBalancerSrcRanges() []string
EnableAndDisableInternalLB() (enable, disable func(svc *v1.Service))
}
// NullProvider is the default implementation of the ProviderInterface
// which doesn't do anything.
type NullProvider struct{}
// FrameworkBeforeEach is a base implementation which does BeforeEach.
func (n NullProvider) FrameworkBeforeEach(f *Framework) {}
// FrameworkAfterEach is a base implementation which does AfterEach.
func (n NullProvider) FrameworkAfterEach(f *Framework) {}
// ResizeGroup is a base implementation which resizes group.
func (n NullProvider) ResizeGroup(string, int32) error {
return fmt.Errorf("Provider does not support InstanceGroups")
}
// GetGroupNodes is a base implementation which returns group nodes.
func (n NullProvider) GetGroupNodes(group string) ([]string, error) {
return nil, fmt.Errorf("provider does not support InstanceGroups")
}
// GroupSize returns the size of an instance group
func (n NullProvider) GroupSize(group string) (int, error) {
return -1, fmt.Errorf("provider does not support InstanceGroups")
}
// DeleteNode is a base implementation which deletes a node.
func (n NullProvider) DeleteNode(node *v1.Node) error {
return fmt.Errorf("provider does not support DeleteNode")
}
func (n NullProvider) CreateShare() (string, string, string, error) {
return "", "", "", fmt.Errorf("provider does not support volume creation")
}
func (n NullProvider) DeleteShare(accountName, shareName string) error {
return fmt.Errorf("provider does not support volume deletion")
}
// CreatePD is a base implementation which creates PD.
func (n NullProvider) CreatePD(zone string) (string, error) {
return "", fmt.Errorf("provider does not support volume creation")
}
// DeletePD is a base implementation which deletes PD.
func (n NullProvider) DeletePD(pdName string) error {
return fmt.Errorf("provider does not support volume deletion")
}
// CreatePVSource is a base implementation which creates PV source.
func (n NullProvider) CreatePVSource(ctx context.Context, zone, diskName string) (*v1.PersistentVolumeSource, error) {
return nil, fmt.Errorf("Provider not supported")
}
// DeletePVSource is a base implementation which deletes PV source.
func (n NullProvider) DeletePVSource(ctx context.Context, pvSource *v1.PersistentVolumeSource) error {
return fmt.Errorf("Provider not supported")
}
// CleanupServiceResources is a base implementation which cleans up service resources.
func (n NullProvider) CleanupServiceResources(ctx context.Context, c clientset.Interface, loadBalancerName, region, zone string) {
}
// EnsureLoadBalancerResourcesDeleted is a base implementation which ensures load balancer is deleted.
func (n NullProvider) EnsureLoadBalancerResourcesDeleted(ctx context.Context, ip, portRange string) error {
return nil
}
// LoadBalancerSrcRanges is a base implementation which returns the ranges of ips used by load balancers.
func (n NullProvider) LoadBalancerSrcRanges() []string {
return nil
}
// EnableAndDisableInternalLB is a base implementation which returns functions for enabling/disabling an internal LB.
func (n NullProvider) EnableAndDisableInternalLB() (enable, disable func(svc *v1.Service)) {
nop := func(svc *v1.Service) {}
return nop, nop
}
var _ ProviderInterface = NullProvider{}
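// Illustrative sketch (not part of the upstream framework API): a provider package
// normally embeds NullProvider so it only has to override the methods it actually
// supports, and registers itself from its own init function. The provider name
// "example" and the DeleteNode override are hypothetical.
type exampleProvider struct {
    NullProvider
}

func (p exampleProvider) DeleteNode(node *v1.Node) error {
    // A real provider would call its cloud API here.
    return fmt.Errorf("deleting node %q is not implemented in this sketch", node.Name)
}

func exampleRegister() {
    RegisterProvider("example", func() (ProviderInterface, error) {
        return exampleProvider{}, nil
    })
}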

View File

@ -0,0 +1,9 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies, therefore we need to override the restrictions from
# the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more careful.
rules:
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,913 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pv
import (
"context"
"fmt"
"strings"
"time"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/kubernetes/test/e2e/storage/utils"
"github.com/onsi/ginkgo/v2"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/types"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/pkg/volume/util"
"k8s.io/kubernetes/test/e2e/framework"
e2eskipper "k8s.io/kubernetes/test/e2e/framework/skipper"
)
const (
pdRetryTimeout = 5 * time.Minute
pdRetryPollTime = 5 * time.Second
// VolumeSelectorKey is the key for volume selector.
VolumeSelectorKey = "e2e-pv-pool"
// volumeGidAnnotationKey is the name of the annotation on the PersistentVolume
// object that specifies a supplemental GID.
// it is copied from k8s.io/kubernetes/pkg/volume/util VolumeGidAnnotationKey
volumeGidAnnotationKey = "pv.beta.kubernetes.io/gid"
)
var (
// SELinuxLabel is common selinux labels.
SELinuxLabel = &v1.SELinuxOptions{
Level: "s0:c0,c1"}
)
type pvval struct{}
// PVMap is a map of all PVs used in the multi pv-pvc tests. The key is the PV's name, which is
// guaranteed to be unique. The value is {} (empty struct) since we're only interested
// in the PV's name and if it is present. We must always Get the pv object before
// referencing any of its values, e.g. its ClaimRef.
type PVMap map[string]pvval
type pvcval struct{}
// PVCMap is a map of all PVCs used in the multi pv-pvc tests. The key is "namespace/pvc.Name". The
// value is {} (empty struct) since we're only interested in the PVC's name and if it is
// present. We must always Get the pvc object before referencing any of its values, e.g.
// its VolumeName.
// Note: It's unsafe to add keys to a map while iterating over it. Their insertion in the map is
// unpredictable and can result in the same key being iterated over again.
type PVCMap map[types.NamespacedName]pvcval
// PersistentVolumeConfig is consumed by MakePersistentVolume() to generate a PV object
// for varying storage options (NFS, ceph, etc.).
// (+optional) prebind holds a pre-bound PVC
// Example pvSource:
//
// pvSource: api.PersistentVolumeSource{
// NFS: &api.NFSVolumeSource{
// ...
// },
// }
type PersistentVolumeConfig struct {
// [Optional] NamePrefix defaults to "pv-" if unset
NamePrefix string
// [Optional] Labels contains information used to organize and categorize
// objects
Labels labels.Set
// [Optional] Annotations contains information used to organize and categorize
// objects
Annotations map[string]string
// PVSource contains the details of the underlying volume and must be set
PVSource v1.PersistentVolumeSource
// [Optional] Prebind lets you specify a PVC to bind this PV to before
// creation
Prebind *v1.PersistentVolumeClaim
// [Optional] ReclaimPolicy defaults to "Retain" if unset
ReclaimPolicy v1.PersistentVolumeReclaimPolicy
StorageClassName string
// [Optional] NodeAffinity defines constraints that limit what nodes this
// volume can be accessed from.
NodeAffinity *v1.VolumeNodeAffinity
// [Optional] VolumeMode defaults to "Filesystem" if unset
VolumeMode *v1.PersistentVolumeMode
// [Optional] AccessModes defaults to RWO if unset
AccessModes []v1.PersistentVolumeAccessMode
// [Optional] Capacity is the storage capacity in Quantity format. Defaults
// to "2Gi" if unset
Capacity string
}
// PersistentVolumeClaimConfig is consumed by MakePersistentVolumeClaim() to
// generate a PVC object.
type PersistentVolumeClaimConfig struct {
// Name of the PVC. If set, overrides NamePrefix
Name string
// NamePrefix defaults to "pvc-" if unspecified
NamePrefix string
// ClaimSize must be specified in the Quantity format. Defaults to 2Gi if
// unspecified
ClaimSize string
// AccessModes defaults to RWO if unspecified
AccessModes []v1.PersistentVolumeAccessMode
Annotations map[string]string
Selector *metav1.LabelSelector
StorageClassName *string
VolumeAttributesClassName *string
// VolumeMode defaults to nil if unspecified or specified as the empty
// string
VolumeMode *v1.PersistentVolumeMode
}
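// Illustrative sketch (not part of the upstream framework API): how the two config
// structs above are typically filled in for an NFS-backed volume that should only
// bind to the matching claim. The server address, selector label value and sizes
// are hypothetical placeholders.
func exampleNFSConfigs(ns string) (PersistentVolumeConfig, PersistentVolumeClaimConfig) {
    pvConfig := PersistentVolumeConfig{
        NamePrefix: "example-nfs-",
        Labels:     labels.Set{VolumeSelectorKey: ns},
        PVSource: v1.PersistentVolumeSource{
            NFS: &v1.NFSVolumeSource{
                Server: "127.0.0.1", // hypothetical NFS server
                Path:   "/exports",
            },
        },
        Capacity: "2Gi",
    }
    pvcConfig := PersistentVolumeClaimConfig{
        ClaimSize: "2Gi",
        Selector: &metav1.LabelSelector{
            MatchLabels: map[string]string{VolumeSelectorKey: ns},
        },
    }
    return pvConfig, pvcConfig
}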
// PVPVCCleanup cleans up a pv and pvc in a single pv/pvc test case.
// Note: delete errors are appended to []error so that we can attempt to delete both the pvc and pv.
func PVPVCCleanup(ctx context.Context, c clientset.Interface, ns string, pv *v1.PersistentVolume, pvc *v1.PersistentVolumeClaim) []error {
var errs []error
if pvc != nil {
err := DeletePersistentVolumeClaim(ctx, c, pvc.Name, ns)
if err != nil {
errs = append(errs, fmt.Errorf("failed to delete PVC %q: %w", pvc.Name, err))
}
} else {
framework.Logf("pvc is nil")
}
if pv != nil {
err := DeletePersistentVolume(ctx, c, pv.Name)
if err != nil {
errs = append(errs, fmt.Errorf("failed to delete PV %q: %w", pv.Name, err))
}
} else {
framework.Logf("pv is nil")
}
return errs
}
// PVPVCMapCleanup cleans up pvs and pvcs in multi-pv-pvc test cases. Entries found in the pv and claim maps are
// deleted as long as the Delete api call succeeds.
// Note: delete errors are appended to []error so that as many pvcs and pvs as possible are deleted.
func PVPVCMapCleanup(ctx context.Context, c clientset.Interface, ns string, pvols PVMap, claims PVCMap) []error {
var errs []error
for pvcKey := range claims {
err := DeletePersistentVolumeClaim(ctx, c, pvcKey.Name, ns)
if err != nil {
errs = append(errs, fmt.Errorf("failed to delete PVC %q: %w", pvcKey.Name, err))
} else {
delete(claims, pvcKey)
}
}
for pvKey := range pvols {
err := DeletePersistentVolume(ctx, c, pvKey)
if err != nil {
errs = append(errs, fmt.Errorf("failed to delete PV %q: %w", pvKey, err))
} else {
delete(pvols, pvKey)
}
}
return errs
}
// DeletePersistentVolume deletes the PV.
func DeletePersistentVolume(ctx context.Context, c clientset.Interface, pvName string) error {
if c != nil && len(pvName) > 0 {
framework.Logf("Deleting PersistentVolume %q", pvName)
err := c.CoreV1().PersistentVolumes().Delete(ctx, pvName, metav1.DeleteOptions{})
if err != nil && !apierrors.IsNotFound(err) {
return fmt.Errorf("PV Delete API error: %w", err)
}
}
return nil
}
// DeletePersistentVolumeClaim deletes the Claim.
func DeletePersistentVolumeClaim(ctx context.Context, c clientset.Interface, pvcName string, ns string) error {
if c != nil && len(pvcName) > 0 {
framework.Logf("Deleting PersistentVolumeClaim %q", pvcName)
err := c.CoreV1().PersistentVolumeClaims(ns).Delete(ctx, pvcName, metav1.DeleteOptions{})
if err != nil && !apierrors.IsNotFound(err) {
return fmt.Errorf("PVC Delete API error: %w", err)
}
}
return nil
}
// DeletePVCandValidatePV deletes the PVC and waits for the PV to enter its expected phase, validating that the PV
// has been reclaimed (making an assumption here about the reclaimPolicy). The caller tells this func which
// phase value to expect for the pv bound to the to-be-deleted claim.
func DeletePVCandValidatePV(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pvc *v1.PersistentVolumeClaim, pv *v1.PersistentVolume, expectPVPhase v1.PersistentVolumePhase) error {
pvname := pvc.Spec.VolumeName
framework.Logf("Deleting PVC %v to trigger reclamation of PV %v", pvc.Name, pvname)
err := DeletePersistentVolumeClaim(ctx, c, pvc.Name, ns)
if err != nil {
return err
}
// Wait for the PV's phase to return to be `expectPVPhase`
framework.Logf("Waiting for reclaim process to complete.")
err = WaitForPersistentVolumePhase(ctx, expectPVPhase, c, pv.Name, framework.Poll, timeouts.PVReclaim)
if err != nil {
return fmt.Errorf("pv %q phase did not become %v: %w", pv.Name, expectPVPhase, err)
}
// examine the pv's ClaimRef and UID and compare to expected values
pv, err = c.CoreV1().PersistentVolumes().Get(ctx, pv.Name, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("PV Get API error: %w", err)
}
cr := pv.Spec.ClaimRef
if expectPVPhase == v1.VolumeAvailable {
if cr != nil && len(cr.UID) > 0 {
return fmt.Errorf("PV is 'Available' but ClaimRef.UID is not empty")
}
} else if expectPVPhase == v1.VolumeBound {
if cr == nil {
return fmt.Errorf("PV is 'Bound' but ClaimRef is nil")
}
if len(cr.UID) == 0 {
return fmt.Errorf("PV is 'Bound' but ClaimRef.UID is empty")
}
}
framework.Logf("PV %v now in %q phase", pv.Name, expectPVPhase)
return nil
}
// DeletePVCandValidatePVGroup wraps DeletePVCandValidatePV() by calling the function in a loop over the PV map. Only bound PVs
// are deleted. Validates that the claim was deleted and the PV is in the expected Phase (Released,
// Available, Bound).
// Note: if there are more claims than pvs then some of the remaining claims may bind to the
// just-made-available pvs.
func DeletePVCandValidatePVGroup(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pvols PVMap, claims PVCMap, expectPVPhase v1.PersistentVolumePhase) error {
var boundPVs, deletedPVCs int
for pvName := range pvols {
pv, err := c.CoreV1().PersistentVolumes().Get(ctx, pvName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("PV Get API error: %w", err)
}
cr := pv.Spec.ClaimRef
// if pv is bound then delete the pvc it is bound to
if cr != nil && len(cr.Name) > 0 {
boundPVs++
// Assert bound PVC is tracked in this test. Failing this might
// indicate external PVCs interfering with the test.
pvcKey := makePvcKey(ns, cr.Name)
if _, found := claims[pvcKey]; !found {
return fmt.Errorf("internal: claims map is missing pvc %q", pvcKey)
}
// get the pvc for the delete call below
pvc, err := c.CoreV1().PersistentVolumeClaims(ns).Get(ctx, cr.Name, metav1.GetOptions{})
if err == nil {
if err = DeletePVCandValidatePV(ctx, c, timeouts, ns, pvc, pv, expectPVPhase); err != nil {
return err
}
} else if !apierrors.IsNotFound(err) {
return fmt.Errorf("PVC Get API error: %w", err)
}
// delete pvckey from map even if apierrors.IsNotFound above is true and thus the
// claim was not actually deleted here
delete(claims, pvcKey)
deletedPVCs++
}
}
if boundPVs != deletedPVCs {
return fmt.Errorf("expect number of bound PVs (%v) to equal number of deleted PVCs (%v)", boundPVs, deletedPVCs)
}
return nil
}
// createPV creates the PV resource, retrying on known quota errors. Returns an error on failure.
func createPV(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, pv *v1.PersistentVolume) (*v1.PersistentVolume, error) {
var resultPV *v1.PersistentVolume
var lastCreateErr error
err := wait.PollUntilContextTimeout(ctx, 29*time.Second, timeouts.PVCreate, true, func(ctx context.Context) (done bool, err error) {
resultPV, lastCreateErr = c.CoreV1().PersistentVolumes().Create(ctx, pv, metav1.CreateOptions{})
if lastCreateErr != nil {
// If we hit a quota problem, we are not done and should retry again. This happens to be the quota failure string for GCP.
// If quota failure strings are found for other platforms, they can be added to improve reliability when running
// many parallel test jobs in a single cloud account. This corresponds to controller-like behavior and
// to what we would recommend for general clients.
if strings.Contains(lastCreateErr.Error(), `googleapi: Error 403: Quota exceeded for quota group`) {
return false, nil
}
// if it was not a quota failure, fail immediately
return false, lastCreateErr
}
return true, nil
})
// if we have an error from creating the PV, use that instead of a timeout error
if lastCreateErr != nil {
return nil, fmt.Errorf("PV Create API error: %w", err)
}
if err != nil {
return nil, fmt.Errorf("PV Create API error: %w", err)
}
return resultPV, nil
}
// CreatePV creates the PV resource. Returns an error on failure.
func CreatePV(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, pv *v1.PersistentVolume) (*v1.PersistentVolume, error) {
return createPV(ctx, c, timeouts, pv)
}
// CreatePVC creates the PVC resource. Returns an error on failure.
func CreatePVC(ctx context.Context, c clientset.Interface, ns string, pvc *v1.PersistentVolumeClaim) (*v1.PersistentVolumeClaim, error) {
pvc, err := c.CoreV1().PersistentVolumeClaims(ns).Create(ctx, pvc, metav1.CreateOptions{})
if err != nil {
return nil, fmt.Errorf("PVC Create API error: %w", err)
}
return pvc, nil
}
// CreatePVCPV creates a PVC followed by the PV based on the passed-in PV and PVC configs and
// namespace. If the "preBind" bool is true then pre-bind the PV to the PVC
// via the PV's ClaimRef. Return the pv and pvc to reflect the created objects.
// Note: in the pre-bind case the real PVC name, which is generated, is not
// known until after the PVC is instantiated. This is why the pvc is created
// before the pv.
func CreatePVCPV(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, pvConfig PersistentVolumeConfig, pvcConfig PersistentVolumeClaimConfig, ns string, preBind bool) (*v1.PersistentVolume, *v1.PersistentVolumeClaim, error) {
// make the pvc spec
pvc := MakePersistentVolumeClaim(pvcConfig, ns)
preBindMsg := ""
if preBind {
preBindMsg = " pre-bound"
pvConfig.Prebind = pvc
}
// make the pv spec
pv := MakePersistentVolume(pvConfig)
ginkgo.By(fmt.Sprintf("Creating a PVC followed by a%s PV", preBindMsg))
pvc, err := CreatePVC(ctx, c, ns, pvc)
if err != nil {
return nil, nil, err
}
// instantiate the pv, handle pre-binding by ClaimRef if needed
if preBind {
pv.Spec.ClaimRef.Name = pvc.Name
}
pv, err = createPV(ctx, c, timeouts, pv)
if err != nil {
return nil, pvc, err
}
return pv, pvc, nil
}
// CreatePVPVC creates a PV followed by the PVC based on the passed-in PV and PVC configs and
// namespace. If the "preBind" bool is true then pre-bind the PVC to the PV
// via the PVC's VolumeName. Return the pv and pvc to reflect the created
// objects.
// Note: in the pre-bind case the real PV name, which is generated, is not
// known until after the PV is instantiated. This is why the pv is created
// before the pvc.
func CreatePVPVC(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, pvConfig PersistentVolumeConfig, pvcConfig PersistentVolumeClaimConfig, ns string, preBind bool) (*v1.PersistentVolume, *v1.PersistentVolumeClaim, error) {
preBindMsg := ""
if preBind {
preBindMsg = " pre-bound"
}
framework.Logf("Creating a PV followed by a%s PVC", preBindMsg)
// make the pv and pvc definitions
pv := MakePersistentVolume(pvConfig)
pvc := MakePersistentVolumeClaim(pvcConfig, ns)
// instantiate the pv
pv, err := createPV(ctx, c, timeouts, pv)
if err != nil {
return nil, nil, err
}
// instantiate the pvc, handle pre-binding by VolumeName if needed
if preBind {
pvc.Spec.VolumeName = pv.Name
}
pvc, err = CreatePVC(ctx, c, ns, pvc)
if err != nil {
return pv, nil, err
}
return pv, pvc, nil
}
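// exampleCreateAndBind is an illustrative sketch (not part of the upstream helpers above) showing
// the typical flow: create a PV/PVC pair with CreatePVPVC and then wait for the two objects to
// bind to each other with WaitOnPVandPVC. The configs are assumed to be filled in by the caller,
// e.g. with an NFS or hostPath PersistentVolumeSource.
func exampleCreateAndBind(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pvConfig PersistentVolumeConfig, pvcConfig PersistentVolumeClaimConfig) (*v1.PersistentVolume, *v1.PersistentVolumeClaim, error) {
	// Create the PV first, then the PVC, without pre-binding.
	pv, pvc, err := CreatePVPVC(ctx, c, timeouts, pvConfig, pvcConfig, ns, false)
	if err != nil {
		return nil, nil, err
	}
	// Block until both objects report phase Bound and reference each other.
	if err := WaitOnPVandPVC(ctx, c, timeouts, ns, pv, pvc); err != nil {
		return pv, pvc, err
	}
	return pv, pvc, nil
}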
// CreatePVsPVCs creates the desired number of PVs and PVCs and returns them in separate maps. If the
// number of PVs != the number of PVCs then the min of those two counts is the number of
// PVs expected to bind. If a Create error occurs, the returned maps may contain pv and pvc
// entries for the resources that were successfully created. In other words, when the caller
// sees an error returned, it needs to decide what to do about entries in the maps.
// Note: when the test suite deletes the namespace, orphaned pvcs and pods are deleted. However,
// orphaned pvs are not deleted and will remain after the suite completes.
func CreatePVsPVCs(ctx context.Context, numpvs, numpvcs int, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pvConfig PersistentVolumeConfig, pvcConfig PersistentVolumeClaimConfig) (PVMap, PVCMap, error) {
pvMap := make(PVMap, numpvs)
pvcMap := make(PVCMap, numpvcs)
extraPVCs := 0
extraPVs := numpvs - numpvcs
if extraPVs < 0 {
extraPVCs = -extraPVs
extraPVs = 0
}
pvsToCreate := numpvs - extraPVs // want the min(numpvs, numpvcs)
// create pvs and pvcs
for i := 0; i < pvsToCreate; i++ {
pv, pvc, err := CreatePVPVC(ctx, c, timeouts, pvConfig, pvcConfig, ns, false)
if err != nil {
return pvMap, pvcMap, err
}
pvMap[pv.Name] = pvval{}
pvcMap[makePvcKey(ns, pvc.Name)] = pvcval{}
}
// create extra pvs or pvcs as needed
for i := 0; i < extraPVs; i++ {
pv := MakePersistentVolume(pvConfig)
pv, err := createPV(ctx, c, timeouts, pv)
if err != nil {
return pvMap, pvcMap, err
}
pvMap[pv.Name] = pvval{}
}
for i := 0; i < extraPVCs; i++ {
pvc := MakePersistentVolumeClaim(pvcConfig, ns)
pvc, err := CreatePVC(ctx, c, ns, pvc)
if err != nil {
return pvMap, pvcMap, err
}
pvcMap[makePvcKey(ns, pvc.Name)] = pvcval{}
}
return pvMap, pvcMap, nil
}
// WaitOnPVandPVC waits for the pv and pvc to bind to each other.
func WaitOnPVandPVC(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pv *v1.PersistentVolume, pvc *v1.PersistentVolumeClaim) error {
// Wait for newly created PVC to bind to the PV
framework.Logf("Waiting for PV %v to bind to PVC %v", pv.Name, pvc.Name)
err := WaitForPersistentVolumeClaimPhase(ctx, v1.ClaimBound, c, ns, pvc.Name, framework.Poll, timeouts.ClaimBound)
if err != nil {
return fmt.Errorf("PVC %q did not become Bound: %w", pvc.Name, err)
}
// Wait for PersistentVolume.Status.Phase to be Bound, which it should be
// since the PVC is already bound.
err = WaitForPersistentVolumePhase(ctx, v1.VolumeBound, c, pv.Name, framework.Poll, timeouts.PVBound)
if err != nil {
return fmt.Errorf("PV %q did not become Bound: %w", pv.Name, err)
}
// Re-get the pv and pvc objects
pv, err = c.CoreV1().PersistentVolumes().Get(ctx, pv.Name, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("PV Get API error: %w", err)
}
pvc, err = c.CoreV1().PersistentVolumeClaims(ns).Get(ctx, pvc.Name, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("PVC Get API error: %w", err)
}
// The pv and pvc are both bound, but are they bound to each other?
// Check that the PersistentVolume.ClaimRef matches the PVC
if pv.Spec.ClaimRef == nil {
return fmt.Errorf("PV %q ClaimRef is nil", pv.Name)
}
if pv.Spec.ClaimRef.Name != pvc.Name {
return fmt.Errorf("PV %q ClaimRef's name (%q) should be %q", pv.Name, pv.Spec.ClaimRef.Name, pvc.Name)
}
if pvc.Spec.VolumeName != pv.Name {
return fmt.Errorf("PVC %q VolumeName (%q) should be %q", pvc.Name, pvc.Spec.VolumeName, pv.Name)
}
if pv.Spec.ClaimRef.UID != pvc.UID {
return fmt.Errorf("PV %q ClaimRef's UID (%q) should be %q", pv.Name, pv.Spec.ClaimRef.UID, pvc.UID)
}
return nil
}
// WaitAndVerifyBinds searches for bound PVs and PVCs by examining pvols for non-nil claimRefs.
// NOTE: Each iteration waits for a maximum of 3 minutes per PV and, if the PV is bound,
//
// up to 3 minutes for the PVC. When the number of PVs != number of PVCs, this can lead
// to situations where the maximum wait times are reached several times in succession,
// extending test time. Thus, it is recommended to keep the delta between PVs and PVCs
// small.
func WaitAndVerifyBinds(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pvols PVMap, claims PVCMap, testExpected bool) error {
var actualBinds int
expectedBinds := len(pvols)
if expectedBinds > len(claims) { // want the min of # pvs or #pvcs
expectedBinds = len(claims)
}
for pvName := range pvols {
err := WaitForPersistentVolumePhase(ctx, v1.VolumeBound, c, pvName, framework.Poll, timeouts.PVBound)
if err != nil && len(pvols) > len(claims) {
framework.Logf("WARN: pv %v is not bound after max wait", pvName)
framework.Logf(" This may be ok since there are more pvs than pvcs")
continue
}
if err != nil {
return fmt.Errorf("PV %q did not become Bound: %w", pvName, err)
}
pv, err := c.CoreV1().PersistentVolumes().Get(ctx, pvName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("PV Get API error: %w", err)
}
cr := pv.Spec.ClaimRef
if cr != nil && len(cr.Name) > 0 {
// Assert bound pvc is a test resource. Failing assertion could
// indicate non-test PVC interference or a bug in the test
pvcKey := makePvcKey(ns, cr.Name)
if _, found := claims[pvcKey]; !found {
return fmt.Errorf("internal: claims map is missing pvc %q", pvcKey)
}
err := WaitForPersistentVolumeClaimPhase(ctx, v1.ClaimBound, c, ns, cr.Name, framework.Poll, timeouts.ClaimBound)
if err != nil {
return fmt.Errorf("PVC %q did not become Bound: %w", cr.Name, err)
}
actualBinds++
}
}
if testExpected && actualBinds != expectedBinds {
return fmt.Errorf("expect number of bound PVs (%v) to equal number of claims (%v)", actualBinds, expectedBinds)
}
return nil
}
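// exampleMultiBindAndCleanup is an illustrative sketch (not part of the upstream helpers above)
// showing how the map-based helpers are typically combined: create several PVs and PVCs, verify
// that the expected number of binds occurred, and clean up whatever was created even when an
// error occurred along the way. The counts (3 PVs, 2 PVCs) are arbitrary example values.
func exampleMultiBindAndCleanup(ctx context.Context, c clientset.Interface, timeouts *framework.TimeoutContext, ns string, pvConfig PersistentVolumeConfig, pvcConfig PersistentVolumeClaimConfig) error {
	pvols, claims, err := CreatePVsPVCs(ctx, 3, 2, c, timeouts, ns, pvConfig, pvcConfig)
	// Always attempt cleanup; the maps contain whatever was successfully created.
	defer func() {
		for _, cleanupErr := range PVPVCMapCleanup(ctx, c, ns, pvols, claims) {
			framework.Logf("cleanup error: %v", cleanupErr)
		}
	}()
	if err != nil {
		return err
	}
	// min(#PVs, #PVCs) == 2 binds are expected here, so testExpected is set to true.
	return WaitAndVerifyBinds(ctx, c, timeouts, ns, pvols, claims, true)
}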
// makePvcKey returns the NamespacedName used as the key for the given claim in a PVCMap.
func makePvcKey(ns, name string) types.NamespacedName {
return types.NamespacedName{Namespace: ns, Name: name}
}
// MakePersistentVolume returns a PV definition based on the passed-in PersistentVolumeConfig.
// If the config's Prebind claim is not nil then the PV is defined with a ClaimRef which
// includes the PVC's namespace. If the Prebind claim is nil then the PV has no ClaimRef.
// If no reclaimPolicy is assigned, "Retain" is assumed. Specs are expected to match the test's PVC.
// Note: the passed-in claim does not have a name until it is created and thus the PV's
// ClaimRef cannot be completely filled-in in this func. Therefore, the ClaimRef's name
// is added later in CreatePVCPV.
func MakePersistentVolume(pvConfig PersistentVolumeConfig) *v1.PersistentVolume {
var claimRef *v1.ObjectReference
if len(pvConfig.AccessModes) == 0 {
pvConfig.AccessModes = append(pvConfig.AccessModes, v1.ReadWriteOnce)
}
if len(pvConfig.NamePrefix) == 0 {
pvConfig.NamePrefix = "pv-"
}
if pvConfig.ReclaimPolicy == "" {
pvConfig.ReclaimPolicy = v1.PersistentVolumeReclaimRetain
}
if len(pvConfig.Capacity) == 0 {
pvConfig.Capacity = "2Gi"
}
if pvConfig.Prebind != nil {
claimRef = &v1.ObjectReference{
Kind: "PersistentVolumeClaim",
APIVersion: "v1",
Name: pvConfig.Prebind.Name,
Namespace: pvConfig.Prebind.Namespace,
UID: pvConfig.Prebind.UID,
}
}
annotations := map[string]string{
volumeGidAnnotationKey: "777",
}
for k, v := range pvConfig.Annotations {
annotations[k] = v
}
return &v1.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
GenerateName: pvConfig.NamePrefix,
Labels: pvConfig.Labels,
Annotations: annotations,
},
Spec: v1.PersistentVolumeSpec{
PersistentVolumeReclaimPolicy: pvConfig.ReclaimPolicy,
Capacity: v1.ResourceList{
v1.ResourceStorage: resource.MustParse(pvConfig.Capacity),
},
PersistentVolumeSource: pvConfig.PVSource,
AccessModes: pvConfig.AccessModes,
ClaimRef: claimRef,
StorageClassName: pvConfig.StorageClassName,
NodeAffinity: pvConfig.NodeAffinity,
VolumeMode: pvConfig.VolumeMode,
},
}
}
// MakePersistentVolumeClaim returns a PVC API Object based on the PersistentVolumeClaimConfig.
func MakePersistentVolumeClaim(cfg PersistentVolumeClaimConfig, ns string) *v1.PersistentVolumeClaim {
if len(cfg.AccessModes) == 0 {
cfg.AccessModes = append(cfg.AccessModes, v1.ReadWriteOnce)
}
if len(cfg.ClaimSize) == 0 {
cfg.ClaimSize = "2Gi"
}
if len(cfg.NamePrefix) == 0 {
cfg.NamePrefix = "pvc-"
}
if cfg.VolumeMode != nil && *cfg.VolumeMode == "" {
framework.Logf("Warning: Making PVC: VolumeMode specified as invalid empty string, treating as nil")
cfg.VolumeMode = nil
}
return &v1.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: cfg.Name,
GenerateName: cfg.NamePrefix,
Namespace: ns,
Annotations: cfg.Annotations,
},
Spec: v1.PersistentVolumeClaimSpec{
Selector: cfg.Selector,
AccessModes: cfg.AccessModes,
Resources: v1.VolumeResourceRequirements{
Requests: v1.ResourceList{
v1.ResourceStorage: resource.MustParse(cfg.ClaimSize),
},
},
StorageClassName: cfg.StorageClassName,
VolumeAttributesClassName: cfg.VolumeAttributesClassName,
VolumeMode: cfg.VolumeMode,
},
}
}
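// examplePVCDefinition is an illustrative sketch (not part of the upstream helpers above) showing
// how MakePersistentVolumeClaim fills in defaults: only fields that differ from the defaults need
// to be set. The storage class name "standard" is a hypothetical example value.
func examplePVCDefinition(ns string) *v1.PersistentVolumeClaim {
	scName := "standard"
	cfg := PersistentVolumeClaimConfig{
		NamePrefix:       "example-pvc-",
		ClaimSize:        "1Gi",
		AccessModes:      []v1.PersistentVolumeAccessMode{v1.ReadWriteOnce},
		StorageClassName: &scName,
	}
	// AccessModes, ClaimSize and NamePrefix would have been defaulted if left empty.
	return MakePersistentVolumeClaim(cfg, ns)
}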
func createPDWithRetry(ctx context.Context, zone string) (string, error) {
var err error
var newDiskName string
for start := time.Now(); ; time.Sleep(pdRetryPollTime) {
if time.Since(start) >= pdRetryTimeout ||
ctx.Err() != nil {
return "", fmt.Errorf("timed out while trying to create PD in zone %q, last error: %w", zone, err)
}
newDiskName, err = createPD(zone)
if err != nil {
framework.Logf("Couldn't create a new PD in zone %q, sleeping 5 seconds: %v", zone, err)
continue
}
framework.Logf("Successfully created a new PD in zone %q: %q.", zone, newDiskName)
return newDiskName, nil
}
}
func CreateShare() (string, string, string, error) {
return framework.TestContext.CloudConfig.Provider.CreateShare()
}
func DeleteShare(accountName, shareName string) error {
return framework.TestContext.CloudConfig.Provider.DeleteShare(accountName, shareName)
}
// CreatePDWithRetry creates PD with retry.
func CreatePDWithRetry(ctx context.Context) (string, error) {
return createPDWithRetry(ctx, "")
}
// CreatePDWithRetryAndZone creates PD on zone with retry.
func CreatePDWithRetryAndZone(ctx context.Context, zone string) (string, error) {
return createPDWithRetry(ctx, zone)
}
// DeletePDWithRetry deletes PD with retry.
func DeletePDWithRetry(ctx context.Context, diskName string) error {
var err error
for start := time.Now(); ; time.Sleep(pdRetryPollTime) {
if time.Since(start) >= pdRetryTimeout ||
ctx.Err() != nil {
return fmt.Errorf("timed out while trying to delete PD %q, last error: %w", diskName, err)
}
err = deletePD(diskName)
if err != nil {
framework.Logf("Couldn't delete PD %q, sleeping %v: %v", diskName, pdRetryPollTime, err)
continue
}
framework.Logf("Successfully deleted PD %q.", diskName)
return nil
}
}
func createPD(zone string) (string, error) {
if zone == "" {
zone = framework.TestContext.CloudConfig.Zone
}
return framework.TestContext.CloudConfig.Provider.CreatePD(zone)
}
func deletePD(pdName string) error {
return framework.TestContext.CloudConfig.Provider.DeletePD(pdName)
}
// WaitForPVClaimBoundPhase waits until all of the given pvcs' phases are set to Bound.
func WaitForPVClaimBoundPhase(ctx context.Context, client clientset.Interface, pvclaims []*v1.PersistentVolumeClaim, timeout time.Duration) ([]*v1.PersistentVolume, error) {
persistentvolumes := make([]*v1.PersistentVolume, len(pvclaims))
for index, claim := range pvclaims {
err := WaitForPersistentVolumeClaimPhase(ctx, v1.ClaimBound, client, claim.Namespace, claim.Name, framework.Poll, timeout)
if err != nil {
return persistentvolumes, err
}
// Get new copy of the claim
claim, err = client.CoreV1().PersistentVolumeClaims(claim.Namespace).Get(ctx, claim.Name, metav1.GetOptions{})
if err != nil {
return persistentvolumes, fmt.Errorf("PVC Get API error: %w", err)
}
// Get the bound PV
persistentvolumes[index], err = client.CoreV1().PersistentVolumes().Get(ctx, claim.Spec.VolumeName, metav1.GetOptions{})
if err != nil {
return persistentvolumes, fmt.Errorf("PV Get API error: %w", err)
}
}
return persistentvolumes, nil
}
// WaitForPersistentVolumePhase waits for a PersistentVolume to be in a specific phase or until timeout occurs, whichever comes first.
func WaitForPersistentVolumePhase(ctx context.Context, phase v1.PersistentVolumePhase, c clientset.Interface, pvName string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for PersistentVolume %s to have phase %s", timeout, pvName, phase)
for start := time.Now(); time.Since(start) < timeout; time.Sleep(poll) {
pv, err := c.CoreV1().PersistentVolumes().Get(ctx, pvName, metav1.GetOptions{})
if err != nil {
framework.Logf("Get persistent volume %s in failed, ignoring for %v: %v", pvName, poll, err)
continue
}
if pv.Status.Phase == phase {
framework.Logf("PersistentVolume %s found and phase=%s (%v)", pvName, phase, time.Since(start))
return nil
}
framework.Logf("PersistentVolume %s found but phase is %s instead of %s.", pvName, pv.Status.Phase, phase)
}
return fmt.Errorf("PersistentVolume %s not in phase %s within %v", pvName, phase, timeout)
}
// WaitForPersistentVolumeClaimPhase waits for a PersistentVolumeClaim to be in a specific phase or until timeout occurs, whichever comes first.
func WaitForPersistentVolumeClaimPhase(ctx context.Context, phase v1.PersistentVolumeClaimPhase, c clientset.Interface, ns string, pvcName string, poll, timeout time.Duration) error {
return WaitForPersistentVolumeClaimsPhase(ctx, phase, c, ns, []string{pvcName}, poll, timeout, true)
}
// WaitForPersistentVolumeClaimsPhase waits for any (if matchAny is true) or all (if matchAny is false) PersistentVolumeClaims
// to be in a specific phase or until timeout occurs, whichever comes first.
func WaitForPersistentVolumeClaimsPhase(ctx context.Context, phase v1.PersistentVolumeClaimPhase, c clientset.Interface, ns string, pvcNames []string, poll, timeout time.Duration, matchAny bool) error {
if len(pvcNames) == 0 {
return fmt.Errorf("Incorrect parameter: Need at least one PVC to track. Found 0")
}
framework.Logf("Waiting up to timeout=%v for PersistentVolumeClaims %v to have phase %s", timeout, pvcNames, phase)
for start := time.Now(); time.Since(start) < timeout; time.Sleep(poll) {
phaseFoundInAllClaims := true
for _, pvcName := range pvcNames {
pvc, err := c.CoreV1().PersistentVolumeClaims(ns).Get(ctx, pvcName, metav1.GetOptions{})
if err != nil {
framework.Logf("Failed to get claim %q, retrying in %v. Error: %v", pvcName, poll, err)
phaseFoundInAllClaims = false
break
}
if pvc.Status.Phase == phase {
framework.Logf("PersistentVolumeClaim %s found and phase=%s (%v)", pvcName, phase, time.Since(start))
if matchAny {
return nil
}
} else {
framework.Logf("PersistentVolumeClaim %s found but phase is %s instead of %s.", pvcName, pvc.Status.Phase, phase)
phaseFoundInAllClaims = false
}
}
if phaseFoundInAllClaims {
return nil
}
}
return fmt.Errorf("PersistentVolumeClaims %v not all in phase %s within %v", pvcNames, phase, timeout)
}
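// exampleWaitForAnyBoundClaim is an illustrative sketch (not part of the upstream helpers above)
// showing the matchAny semantics of WaitForPersistentVolumeClaimsPhase: with matchAny=true the
// call returns as soon as one of the listed claims is Bound instead of requiring all of them.
// The 2 minute timeout is an arbitrary example value.
func exampleWaitForAnyBoundClaim(ctx context.Context, c clientset.Interface, ns string, pvcNames []string) error {
	return WaitForPersistentVolumeClaimsPhase(ctx, v1.ClaimBound, c, ns, pvcNames, framework.Poll, 2*time.Minute, true /* matchAny */)
}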
// CreatePVSource creates a PV source.
func CreatePVSource(ctx context.Context, zone string) (*v1.PersistentVolumeSource, error) {
diskName, err := CreatePDWithRetryAndZone(ctx, zone)
if err != nil {
return nil, err
}
return framework.TestContext.CloudConfig.Provider.CreatePVSource(ctx, zone, diskName)
}
// DeletePVSource deletes a PV source.
func DeletePVSource(ctx context.Context, pvSource *v1.PersistentVolumeSource) error {
return framework.TestContext.CloudConfig.Provider.DeletePVSource(ctx, pvSource)
}
// GetDefaultStorageClassName returns the name of the default StorageClass, or an error if none or more than one is found.
func GetDefaultStorageClassName(ctx context.Context, c clientset.Interface) (string, error) {
list, err := c.StorageV1().StorageClasses().List(ctx, metav1.ListOptions{})
if err != nil {
return "", fmt.Errorf("Error listing storage classes: %w", err)
}
var scName string
for _, sc := range list.Items {
if util.IsDefaultAnnotation(sc.ObjectMeta) {
if len(scName) != 0 {
return "", fmt.Errorf("Multiple default storage classes found: %q and %q", scName, sc.Name)
}
scName = sc.Name
}
}
if len(scName) == 0 {
return "", fmt.Errorf("No default storage class found")
}
framework.Logf("Default storage class: %q", scName)
return scName, nil
}
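// exampleClaimWithDefaultClass is an illustrative sketch (not part of the upstream helpers above)
// showing how GetDefaultStorageClassName is typically fed into a claim definition when a test
// wants to use whatever default StorageClass the cluster provides.
func exampleClaimWithDefaultClass(ctx context.Context, c clientset.Interface, ns string) (*v1.PersistentVolumeClaim, error) {
	scName, err := GetDefaultStorageClassName(ctx, c)
	if err != nil {
		return nil, err
	}
	cfg := PersistentVolumeClaimConfig{StorageClassName: &scName}
	return CreatePVC(ctx, c, ns, MakePersistentVolumeClaim(cfg, ns))
}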
// SkipIfNoDefaultStorageClass skips tests if no default SC can be found.
func SkipIfNoDefaultStorageClass(ctx context.Context, c clientset.Interface) {
_, err := GetDefaultStorageClassName(ctx, c)
if err != nil {
e2eskipper.Skipf("error finding default storageClass : %v", err)
}
}
// WaitForPersistentVolumeDeleted waits for a PersistentVolume to get deleted or until timeout occurs, whichever comes first.
func WaitForPersistentVolumeDeleted(ctx context.Context, c clientset.Interface, pvName string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for PersistentVolume %s to get deleted", timeout, pvName)
for start := time.Now(); time.Since(start) < timeout; time.Sleep(poll) {
pv, err := c.CoreV1().PersistentVolumes().Get(ctx, pvName, metav1.GetOptions{})
if err == nil {
framework.Logf("PersistentVolume %s found and phase=%s (%v)", pvName, pv.Status.Phase, time.Since(start))
continue
}
if apierrors.IsNotFound(err) {
framework.Logf("PersistentVolume %s was removed", pvName)
return nil
}
framework.Logf("Get persistent volume %s in failed, ignoring for %v: %v", pvName, poll, err)
}
return fmt.Errorf("PersistentVolume %s still exists within %v", pvName, timeout)
}
// WaitForPVCFinalizer waits for a finalizer to be added to a PVC in a given namespace.
func WaitForPVCFinalizer(ctx context.Context, cs clientset.Interface, name, namespace, finalizer string, poll, timeout time.Duration) error {
var (
err error
pvc *v1.PersistentVolumeClaim
)
framework.Logf("Waiting up to %v for PersistentVolumeClaim %s/%s to contain finalizer %s", timeout, namespace, name, finalizer)
if successful := utils.WaitUntil(poll, timeout, func() bool {
pvc, err = cs.CoreV1().PersistentVolumeClaims(namespace).Get(ctx, name, metav1.GetOptions{})
if err != nil {
framework.Logf("Failed to get PersistentVolumeClaim %s/%s with err: %v. Will retry in %v", name, namespace, err, timeout)
return false
}
for _, f := range pvc.Finalizers {
if f == finalizer {
return true
}
}
return false
}); successful {
return nil
}
if err == nil {
err = fmt.Errorf("finalizer %s not added to pvc %s/%s", finalizer, namespace, name)
}
return err
}
// GetDefaultFSType returns the default fsType
func GetDefaultFSType() string {
if framework.NodeOSDistroIs("windows") {
return "ntfs"
}
return "ext4"
}


@@ -0,0 +1,70 @@
/*
Copyright 2024 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package pv
import (
"context"
"fmt"
"time"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
"k8s.io/kubernetes/test/utils/format"
"k8s.io/utils/ptr"
)
// WaitForPersistentVolumeClaimModified waits the given timeout duration for the specified claim to become bound with the
// desired volume attributes class.
// Returns an error if timeout occurs first.
func WaitForPersistentVolumeClaimModified(ctx context.Context, c clientset.Interface, claim *v1.PersistentVolumeClaim, timeout time.Duration) error {
desiredClass := ptr.Deref(claim.Spec.VolumeAttributesClassName, "")
var match = func(claim *v1.PersistentVolumeClaim) bool {
for _, condition := range claim.Status.Conditions {
// conditions that indicate the claim is being modified
// or has an error when modifying the volume
if condition.Type == v1.PersistentVolumeClaimVolumeModifyVolumeError ||
condition.Type == v1.PersistentVolumeClaimVolumeModifyingVolume {
return false
}
}
// check if claim is bound with the desired volume attributes class
currentClass := ptr.Deref(claim.Status.CurrentVolumeAttributesClassName, "")
return claim.Status.Phase == v1.ClaimBound &&
desiredClass == currentClass && claim.Status.ModifyVolumeStatus == nil
}
if match(claim) {
return nil
}
return framework.Gomega().
Eventually(ctx, framework.GetObject(c.CoreV1().PersistentVolumeClaims(claim.Namespace).Get, claim.Name, metav1.GetOptions{})).
WithTimeout(timeout).
Should(framework.MakeMatcher(func(claim *v1.PersistentVolumeClaim) (func() string, error) {
if match(claim) {
return nil, nil
}
return func() string {
return fmt.Sprintf("expected claim's status to be modified with the given VolumeAttirbutesClass %s, got instead:\n%s", desiredClass, format.Object(claim, 1))
}, nil
}))
}
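// exampleWaitForVACModification is an illustrative sketch (not part of the upstream helpers above)
// showing how a test might request a new VolumeAttributesClass on an existing claim and then wait
// for the modification to complete. The class name "premium" and the 5 minute timeout are
// hypothetical example values.
func exampleWaitForVACModification(ctx context.Context, c clientset.Interface, claim *v1.PersistentVolumeClaim) error {
	claim.Spec.VolumeAttributesClassName = ptr.To("premium")
	claim, err := c.CoreV1().PersistentVolumeClaims(claim.Namespace).Update(ctx, claim, metav1.UpdateOptions{})
	if err != nil {
		return fmt.Errorf("PVC Update API error: %w", err)
	}
	// Block until the claim reports the new class as current (or fail after the timeout).
	return WaitForPersistentVolumeClaimModified(ctx, c, claim, 5*time.Minute)
}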


@@ -0,0 +1,60 @@
/*
Copyright 2015 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"fmt"
"time"
)
// ResizeGroup resizes an instance group
func ResizeGroup(group string, size int32) error {
if TestContext.ReportDir != "" {
CoreDump(TestContext.ReportDir)
defer CoreDump(TestContext.ReportDir)
}
return TestContext.CloudConfig.Provider.ResizeGroup(group, size)
}
// GetGroupNodes returns the node names for the specified instance group
func GetGroupNodes(group string) ([]string, error) {
return TestContext.CloudConfig.Provider.GetGroupNodes(group)
}
// GroupSize returns the size of an instance group
func GroupSize(group string) (int, error) {
return TestContext.CloudConfig.Provider.GroupSize(group)
}
// WaitForGroupSize waits for the node instance group to reach the desired size
func WaitForGroupSize(group string, size int32) error {
timeout := 30 * time.Minute
for start := time.Now(); time.Since(start) < timeout; time.Sleep(20 * time.Second) {
currentSize, err := GroupSize(group)
if err != nil {
Logf("Failed to get node instance group size: %v", err)
continue
}
if currentSize != int(size) {
Logf("Waiting for node instance group size %d, current size %d", size, currentSize)
continue
}
Logf("Node instance group has reached the desired size %d", size)
return nil
}
return fmt.Errorf("timeout waiting %v for node instance group size to be %d", timeout, size)
}
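// exampleResizeAndWait is an illustrative sketch (not part of the upstream helpers above) showing
// how the resize helpers are typically combined: grow an instance group by one node and block
// until the provider reports the new size. The group name is supplied by the caller.
func exampleResizeAndWait(group string) error {
	current, err := GroupSize(group)
	if err != nil {
		return fmt.Errorf("failed to get current size of group %q: %w", group, err)
	}
	target := int32(current + 1)
	if err := ResizeGroup(group, target); err != nil {
		return err
	}
	// WaitForGroupSize polls every 20 seconds for up to 30 minutes (see above).
	return WaitForGroupSize(group, target)
}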


@@ -0,0 +1,9 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies, therefore we need to override the restrictions from
# the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more careful.
rules:
- selectorRegexp: ""
allowedPrefixes: [ "" ]


@@ -0,0 +1,252 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package skipper
import (
"context"
"fmt"
"github.com/onsi/ginkgo/v2"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
utilversion "k8s.io/apimachinery/pkg/util/version"
"k8s.io/client-go/discovery"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/component-base/featuregate"
"k8s.io/kubernetes/test/e2e/framework"
e2enode "k8s.io/kubernetes/test/e2e/framework/node"
e2essh "k8s.io/kubernetes/test/e2e/framework/ssh"
)
func skipInternalf(caller int, format string, args ...interface{}) {
msg := fmt.Sprintf(format, args...)
ginkgo.Skip(msg, caller+1)
panic("unreachable")
}
// Skipf skips with information about why the test is being skipped.
// The direct caller is recorded in the callstack.
func Skipf(format string, args ...interface{}) {
skipInternalf(1, format, args...)
panic("unreachable")
}
// Skip is an alias for ginkgo.Skip.
var Skip = ginkgo.Skip
// SkipUnlessAtLeast skips if the value is less than the minValue.
func SkipUnlessAtLeast(value int, minValue int, message string) {
if value < minValue {
skipInternalf(1, "%s", message)
}
}
var featureGate featuregate.FeatureGate
// InitFeatureGates must be called in test suites that have a --feature-gates parameter.
// If not called, SkipUnlessFeatureGateEnabled will record a test failure.
func InitFeatureGates(defaults featuregate.FeatureGate, overrides map[string]bool) error {
clone := defaults.DeepCopy()
if err := clone.SetFromMap(overrides); err != nil {
return err
}
featureGate = clone
return nil
}
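// exampleInitFeatureGates is an illustrative sketch (not part of the upstream helpers above) of
// how a test suite could construct the defaults passed to InitFeatureGates. The gate name
// "MyAlphaFeature" is hypothetical; real suites pass their component's default feature gates and
// the parsed value of the --feature-gates flag as overrides.
func exampleInitFeatureGates(overrides map[string]bool) error {
	defaults := featuregate.NewFeatureGate()
	if err := defaults.Add(map[featuregate.Feature]featuregate.FeatureSpec{
		"MyAlphaFeature": {Default: false, PreRelease: featuregate.Alpha},
	}); err != nil {
		return err
	}
	// The overrides map typically comes from parsing the --feature-gates command line flag.
	return InitFeatureGates(defaults, overrides)
}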
// IsFeatureGateEnabled can be used during e2e tests to figure out if a certain feature gate is enabled.
// This function is dependent on InitFeatureGates under the hood. Therefore, the test must be called with a
// --feature-gates parameter.
func IsFeatureGateEnabled(feature featuregate.Feature) bool {
if featureGate == nil {
framework.Failf("feature gate interface is not initialized")
}
return featureGate.Enabled(feature)
}
// SkipUnlessFeatureGateEnabled skips if the feature is disabled.
//
// Beware that this only works in test suites that have a --feature-gates
// parameter and call InitFeatureGates. In test/e2e, the `Feature: XYZ` tag
// has to be used instead and invocations have to make sure that they
// only run tests that work with the given test cluster.
func SkipUnlessFeatureGateEnabled(gate featuregate.Feature) {
if featureGate == nil {
framework.Failf("Feature gate checking is not enabled, don't use SkipUnlessFeatureGateEnabled(%v). Instead use the Feature tag.", gate)
}
if !featureGate.Enabled(gate) {
skipInternalf(1, "Only supported when %v feature is enabled", gate)
}
}
// SkipUnlessNodeCountIsAtLeast skips if the number of nodes is less than the minNodeCount.
func SkipUnlessNodeCountIsAtLeast(minNodeCount int) {
if framework.TestContext.CloudConfig.NumNodes < minNodeCount {
skipInternalf(1, "Requires at least %d nodes (not %d)", minNodeCount, framework.TestContext.CloudConfig.NumNodes)
}
}
// SkipUnlessNodeCountIsAtMost skips if the number of nodes is greater than the maxNodeCount.
func SkipUnlessNodeCountIsAtMost(maxNodeCount int) {
if framework.TestContext.CloudConfig.NumNodes > maxNodeCount {
skipInternalf(1, "Requires at most %d nodes (not %d)", maxNodeCount, framework.TestContext.CloudConfig.NumNodes)
}
}
// SkipIfProviderIs skips if the provider is included in the unsupportedProviders.
func SkipIfProviderIs(unsupportedProviders ...string) {
if framework.ProviderIs(unsupportedProviders...) {
skipInternalf(1, "Not supported for providers %v (found %s)", unsupportedProviders, framework.TestContext.Provider)
}
}
// SkipUnlessProviderIs skips if the provider is not included in the supportedProviders.
func SkipUnlessProviderIs(supportedProviders ...string) {
if !framework.ProviderIs(supportedProviders...) {
skipInternalf(1, "Only supported for providers %v (not %s)", supportedProviders, framework.TestContext.Provider)
}
}
// SkipUnlessMultizone skips if the cluster does not have multizone.
func SkipUnlessMultizone(ctx context.Context, c clientset.Interface) {
zones, err := e2enode.GetClusterZones(ctx, c)
if err != nil {
skipInternalf(1, "Error listing cluster zones")
}
if zones.Len() <= 1 {
skipInternalf(1, "Requires more than one zone")
}
}
// SkipUnlessAtLeastNZones skips if the cluster does not have n multizones.
func SkipUnlessAtLeastNZones(ctx context.Context, c clientset.Interface, n int) {
zones, err := e2enode.GetClusterZones(ctx, c)
if err != nil {
skipInternalf(1, "Error listing cluster zones")
}
if zones.Len() < n {
skipInternalf(1, "Requires >= %d zones", n)
}
}
// SkipIfMultizone skips if the cluster has multizone.
func SkipIfMultizone(ctx context.Context, c clientset.Interface) {
zones, err := e2enode.GetClusterZones(ctx, c)
if err != nil {
skipInternalf(1, "Error listing cluster zones")
}
if zones.Len() > 1 {
skipInternalf(1, "Requires at most one zone")
}
}
// SkipUnlessMasterOSDistroIs skips if the master OS distro is not included in the supportedMasterOsDistros.
func SkipUnlessMasterOSDistroIs(supportedMasterOsDistros ...string) {
if !framework.MasterOSDistroIs(supportedMasterOsDistros...) {
skipInternalf(1, "Only supported for master OS distro %v (not %s)", supportedMasterOsDistros, framework.TestContext.MasterOSDistro)
}
}
// SkipUnlessNodeOSDistroIs skips if the node OS distro is not included in the supportedNodeOsDistros.
func SkipUnlessNodeOSDistroIs(supportedNodeOsDistros ...string) {
if !framework.NodeOSDistroIs(supportedNodeOsDistros...) {
skipInternalf(1, "Only supported for node OS distro %v (not %s)", supportedNodeOsDistros, framework.TestContext.NodeOSDistro)
}
}
// SkipUnlessNodeOSArchIs skips if the node OS distro is not included in the supportedNodeOsArchs.
func SkipUnlessNodeOSArchIs(supportedNodeOsArchs ...string) {
if !framework.NodeOSArchIs(supportedNodeOsArchs...) {
skipInternalf(1, "Only supported for node OS arch %v (not %s)", supportedNodeOsArchs, framework.TestContext.NodeOSArch)
}
}
// SkipIfNodeOSDistroIs skips if the node OS distro is included in the unsupportedNodeOsDistros.
func SkipIfNodeOSDistroIs(unsupportedNodeOsDistros ...string) {
if framework.NodeOSDistroIs(unsupportedNodeOsDistros...) {
skipInternalf(1, "Not supported for node OS distro %v (is %s)", unsupportedNodeOsDistros, framework.TestContext.NodeOSDistro)
}
}
// SkipUnlessServerVersionGTE skips if the server version is less than v.
func SkipUnlessServerVersionGTE(v *utilversion.Version, c discovery.ServerVersionInterface) {
gte, err := serverVersionGTE(v, c)
if err != nil {
framework.Failf("Failed to get server version: %v", err)
}
if !gte {
skipInternalf(1, "Not supported for server versions before %q", v)
}
}
// SkipUnlessSSHKeyPresent skips if no SSH key is found.
func SkipUnlessSSHKeyPresent() {
if _, err := e2essh.GetSigner(framework.TestContext.Provider); err != nil {
skipInternalf(1, "No SSH Key for provider %s: '%v'", framework.TestContext.Provider, err)
}
}
// serverVersionGTE returns true if v is greater than or equal to the server version.
func serverVersionGTE(v *utilversion.Version, c discovery.ServerVersionInterface) (bool, error) {
serverVersion, err := c.ServerVersion()
if err != nil {
return false, fmt.Errorf("Unable to get server version: %w", err)
}
sv, err := utilversion.ParseSemantic(serverVersion.GitVersion)
if err != nil {
return false, fmt.Errorf("Unable to parse server version %q: %w", serverVersion.GitVersion, err)
}
return sv.AtLeast(v), nil
}
// AppArmorDistros are distros with AppArmor support
var AppArmorDistros = []string{"gci", "ubuntu"}
// SkipIfAppArmorNotSupported skips if the AppArmor is not supported by the node OS distro.
func SkipIfAppArmorNotSupported() {
SkipUnlessNodeOSDistroIs(AppArmorDistros...)
}
// SkipUnlessComponentRunsAsPodsAndClientCanDeleteThem skips unless the component runs as pods and the client can delete them
func SkipUnlessComponentRunsAsPodsAndClientCanDeleteThem(ctx context.Context, componentName string, c clientset.Interface, ns string, labelSet labels.Set) {
// verify if the component runs as a pod
label := labels.SelectorFromSet(labelSet)
listOpts := metav1.ListOptions{LabelSelector: label.String()}
pods, err := c.CoreV1().Pods(ns).List(ctx, listOpts)
framework.Logf("SkipUnlessComponentRunsAsPodsAndClientCanDeleteThem: %v, %v", pods, err)
if err != nil {
skipInternalf(1, "Skipped because client failed to get component:%s pod err:%v", componentName, err)
}
if len(pods.Items) == 0 {
skipInternalf(1, "Skipped because component:%s is not running as pod.", componentName)
}
// verify if client can delete pod
pod := pods.Items[0]
if err := c.CoreV1().Pods(ns).Delete(ctx, pod.Name, metav1.DeleteOptions{DryRun: []string{metav1.DryRunAll}}); err != nil {
skipInternalf(1, "Skipped because client failed to delete component:%s pod, err:%v", componentName, err)
}
}
// SkipIfIPv6 skips if the cluster IP family is IPv6 and the provider is included in the unsupportedProviders.
func SkipIfIPv6(unsupportedProviders ...string) {
if framework.TestContext.ClusterIsIPv6() && framework.ProviderIs(unsupportedProviders...) {
skipInternalf(1, "Not supported for IPv6 clusters and providers %v (found %s)", unsupportedProviders, framework.TestContext.Provider)
}
}


@@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except of k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more careful.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]


@@ -0,0 +1,468 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package ssh
import (
"bytes"
"context"
"fmt"
"net"
"os"
"path/filepath"
"sync"
"time"
"github.com/onsi/gomega"
"golang.org/x/crypto/ssh"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
// SSHPort is tcp port number of SSH
SSHPort = "22"
// pollNodeInterval is how often to Poll pods.
pollNodeInterval = 2 * time.Second
// singleCallTimeout is how long to try single API calls (like 'get' or 'list'). Used to prevent
// transient failures from failing tests.
singleCallTimeout = 5 * time.Minute
// sshBastionEnvKey is the environment variable key for running SSH commands via bastion.
sshBastionEnvKey = "KUBE_SSH_BASTION"
)
// GetSigner returns an ssh.Signer for the provider ("gce", etc.) that can be
// used to SSH to their nodes.
func GetSigner(provider string) (ssh.Signer, error) {
// honor a consistent SSH key across all providers
if path := os.Getenv("KUBE_SSH_KEY_PATH"); len(path) > 0 {
return makePrivateKeySignerFromFile(path)
}
// Select the key itself to use. When implementing more providers here,
// please also add them to any SSH tests that are disabled because of signer
// support.
keyfile := ""
switch provider {
case "gce", "gke", "kubemark":
keyfile = os.Getenv("GCE_SSH_KEY")
if keyfile == "" {
keyfile = os.Getenv("GCE_SSH_PRIVATE_KEY_FILE")
}
if keyfile == "" {
keyfile = "google_compute_engine"
}
case "aws", "eks":
keyfile = os.Getenv("AWS_SSH_KEY")
if keyfile == "" {
keyfile = "kube_aws_rsa"
}
case "local", "vsphere":
keyfile = os.Getenv("LOCAL_SSH_KEY")
if keyfile == "" {
keyfile = "id_rsa"
}
case "skeleton":
keyfile = os.Getenv("KUBE_SSH_KEY")
if keyfile == "" {
keyfile = "id_rsa"
}
case "azure":
keyfile = os.Getenv("AZURE_SSH_KEY")
if keyfile == "" {
keyfile = "id_rsa"
}
default:
return nil, fmt.Errorf("GetSigner(...) not implemented for %s", provider)
}
// Respect absolute paths for keys given by user, fallback to assuming
// relative paths are in ~/.ssh
if !filepath.IsAbs(keyfile) {
keydir := filepath.Join(os.Getenv("HOME"), ".ssh")
keyfile = filepath.Join(keydir, keyfile)
}
return makePrivateKeySignerFromFile(keyfile)
}
func makePrivateKeySignerFromFile(key string) (ssh.Signer, error) {
buffer, err := os.ReadFile(key)
if err != nil {
return nil, fmt.Errorf("error reading SSH key %s: %w", key, err)
}
signer, err := ssh.ParsePrivateKey(buffer)
if err != nil {
return nil, fmt.Errorf("error parsing SSH key: %w", err)
}
return signer, err
}
// NodeSSHHosts returns SSH-able host names for all schedulable nodes.
// If it can't find any external IPs, it falls back to
// looking for internal IPs. If it can't find an internal IP for every node it
// returns an error, though it still returns all hosts that it found in that
// case.
func NodeSSHHosts(ctx context.Context, c clientset.Interface) ([]string, error) {
nodelist := waitListSchedulableNodesOrDie(ctx, c)
hosts := nodeAddresses(nodelist, v1.NodeExternalIP)
// If ExternalIPs aren't available for all nodes, try falling back to the InternalIPs.
if len(hosts) < len(nodelist.Items) {
framework.Logf("No external IP address on nodes, falling back to internal IPs")
hosts = nodeAddresses(nodelist, v1.NodeInternalIP)
}
// Error if neither External nor Internal IPs were available for all nodes.
if len(hosts) != len(nodelist.Items) {
return hosts, fmt.Errorf(
"only found %d IPs on nodes, but found %d nodes. Nodelist: %v",
len(hosts), len(nodelist.Items), nodelist)
}
lenHosts := len(hosts)
wg := &sync.WaitGroup{}
wg.Add(lenHosts)
sshHosts := make([]string, 0, lenHosts)
var sshHostsLock sync.Mutex
for _, host := range hosts {
go func(host string) {
defer wg.Done()
if canConnect(host) {
framework.Logf("Assuming SSH on host %s", host)
sshHostsLock.Lock()
sshHosts = append(sshHosts, net.JoinHostPort(host, SSHPort))
sshHostsLock.Unlock()
} else {
framework.Logf("Skipping host %s because it does not run anything on port %s", host, SSHPort)
}
}(host)
}
wg.Wait()
return sshHosts, nil
}
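// exampleRunOnAllNodes is an illustrative sketch (not part of the upstream helpers above) showing
// how NodeSSHHosts and SSH are typically combined: run a command on every SSH-able node and log
// the per-host results. The "uptime" command is a hypothetical example.
func exampleRunOnAllNodes(ctx context.Context, c clientset.Interface, provider string) error {
	hosts, err := NodeSSHHosts(ctx, c)
	if err != nil {
		return err
	}
	for _, host := range hosts {
		// NodeSSHHosts already returns host:port pairs, so they can be passed to SSH directly.
		result, err := SSH(ctx, "uptime", host, provider)
		LogResult(result)
		if err != nil {
			return fmt.Errorf("SSH to %s failed: %w", host, err)
		}
	}
	return nil
}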
// canConnect returns true if a network connection is possible to the SSHPort.
func canConnect(host string) bool {
if _, ok := os.LookupEnv(sshBastionEnvKey); ok {
return true
}
hostPort := net.JoinHostPort(host, SSHPort)
conn, err := net.DialTimeout("tcp", hostPort, 3*time.Second)
if err != nil {
framework.Logf("cannot dial %s: %v", hostPort, err)
return false
}
conn.Close()
return true
}
// Result holds the execution result of SSH command
type Result struct {
User string
Host string
Cmd string
Stdout string
Stderr string
Code int
}
// NodeExec execs the given cmd on node via SSH. Note that the nodeName is an sshable name,
// eg: the name returned by framework.GetMasterHost(). This is also not guaranteed to work across
// cloud providers since it involves ssh.
func NodeExec(ctx context.Context, nodeName, cmd, provider string) (Result, error) {
return SSH(ctx, cmd, net.JoinHostPort(nodeName, SSHPort), provider)
}
// SSH synchronously SSHs to a node running on provider and runs cmd. If there
// is no error performing the SSH, the stdout, stderr, and exit code are
// returned.
func SSH(ctx context.Context, cmd, host, provider string) (Result, error) {
result := Result{Host: host, Cmd: cmd}
// Get a signer for the provider.
signer, err := GetSigner(provider)
if err != nil {
return result, fmt.Errorf("error getting signer for provider %s: %w", provider, err)
}
// RunSSHCommand will default to Getenv("USER") if user == "", but we're
// defaulting here as well for logging clarity.
result.User = os.Getenv("KUBE_SSH_USER")
if result.User == "" {
result.User = os.Getenv("USER")
}
if bastion := os.Getenv(sshBastionEnvKey); len(bastion) > 0 {
stdout, stderr, code, err := runSSHCommandViaBastion(ctx, cmd, result.User, bastion, host, signer)
result.Stdout = stdout
result.Stderr = stderr
result.Code = code
return result, err
}
stdout, stderr, code, err := runSSHCommand(ctx, cmd, result.User, host, signer)
result.Stdout = stdout
result.Stderr = stderr
result.Code = code
return result, err
}
// runSSHCommand returns the stdout, stderr, and exit code from running cmd on
// host as a specific user, along with any SSH-level error.
func runSSHCommand(ctx context.Context, cmd, user, host string, signer ssh.Signer) (string, string, int, error) {
if user == "" {
user = os.Getenv("USER")
}
// Setup the config, dial the server, and open a session.
config := &ssh.ClientConfig{
User: user,
Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)},
HostKeyCallback: ssh.InsecureIgnoreHostKey(),
}
client, err := ssh.Dial("tcp", host, config)
if err != nil {
err = wait.PollUntilContextTimeout(ctx, 5*time.Second, 20*time.Second, false, func(ctx context.Context) (bool, error) {
fmt.Printf("error dialing %s@%s: '%v', retrying\n", user, host, err)
if client, err = ssh.Dial("tcp", host, config); err != nil {
return false, nil // retrying, error will be logged above
}
return true, nil
})
}
if err != nil {
return "", "", 0, fmt.Errorf("error getting SSH client to %s@%s: %w", user, host, err)
}
defer client.Close()
session, err := client.NewSession()
if err != nil {
return "", "", 0, fmt.Errorf("error creating session to %s@%s: %w", user, host, err)
}
defer session.Close()
// Run the command.
code := 0
var bout, berr bytes.Buffer
session.Stdout, session.Stderr = &bout, &berr
if err = session.Run(cmd); err != nil {
// Check whether the command failed to run or didn't complete.
if exiterr, ok := err.(*ssh.ExitError); ok {
// If we got an ExitError and the exit code is nonzero, we'll
// consider the SSH itself successful (just that the command run
// errored on the host).
if code = exiterr.ExitStatus(); code != 0 {
err = nil
}
} else {
// Some other kind of error happened (e.g. an IOError); consider the
// SSH unsuccessful.
err = fmt.Errorf("failed running `%s` on %s@%s: %w", cmd, user, host, err)
}
}
return bout.String(), berr.String(), code, err
}
// runSSHCommandViaBastion returns the stdout, stderr, and exit code from running cmd on
// host as specific user, along with any SSH-level error. It uses an SSH proxy to connect
// to bastion, then via that tunnel connects to the remote host. Similar to
// sshutil.RunSSHCommand but scoped to the needs of the test infrastructure.
func runSSHCommandViaBastion(ctx context.Context, cmd, user, bastion, host string, signer ssh.Signer) (string, string, int, error) {
// Setup the config, dial the server, and open a session.
config := &ssh.ClientConfig{
User: user,
Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)},
HostKeyCallback: ssh.InsecureIgnoreHostKey(),
Timeout: 150 * time.Second,
}
bastionClient, err := ssh.Dial("tcp", bastion, config)
if err != nil {
err = wait.PollUntilContextTimeout(ctx, 5*time.Second, 20*time.Second, false, func(ctx context.Context) (bool, error) {
fmt.Printf("error dialing %s@%s: '%v', retrying\n", user, bastion, err)
if bastionClient, err = ssh.Dial("tcp", bastion, config); err != nil {
return false, err
}
return true, nil
})
}
if err != nil {
return "", "", 0, fmt.Errorf("error getting SSH client to %s@%s: %w", user, bastion, err)
}
defer bastionClient.Close()
conn, err := bastionClient.Dial("tcp", host)
if err != nil {
return "", "", 0, fmt.Errorf("error dialing %s from bastion: %w", host, err)
}
defer conn.Close()
ncc, chans, reqs, err := ssh.NewClientConn(conn, host, config)
if err != nil {
return "", "", 0, fmt.Errorf("error creating forwarding connection %s from bastion: %w", host, err)
}
client := ssh.NewClient(ncc, chans, reqs)
defer client.Close()
session, err := client.NewSession()
if err != nil {
return "", "", 0, fmt.Errorf("error creating session to %s@%s from bastion: %w", user, host, err)
}
defer session.Close()
// Run the command.
code := 0
var bout, berr bytes.Buffer
session.Stdout, session.Stderr = &bout, &berr
if err = session.Run(cmd); err != nil {
// Check whether the command failed to run or didn't complete.
if exiterr, ok := err.(*ssh.ExitError); ok {
// If we got an ExitError and the exit code is nonzero, we'll
// consider the SSH itself successful (just that the command run
// errored on the host).
if code = exiterr.ExitStatus(); code != 0 {
err = nil
}
} else {
// Some other kind of error happened (e.g. an IOError); consider the
// SSH unsuccessful.
err = fmt.Errorf("failed running `%s` on %s@%s: %w", cmd, user, host, err)
}
}
return bout.String(), berr.String(), code, err
}
// LogResult records result log
func LogResult(result Result) {
remote := fmt.Sprintf("%s@%s", result.User, result.Host)
framework.Logf("ssh %s: command: %s", remote, result.Cmd)
framework.Logf("ssh %s: stdout: %q", remote, result.Stdout)
framework.Logf("ssh %s: stderr: %q", remote, result.Stderr)
framework.Logf("ssh %s: exit code: %d", remote, result.Code)
}
// IssueSSHCommandWithResult tries to execute a SSH command and returns the execution result
func IssueSSHCommandWithResult(ctx context.Context, cmd, provider string, node *v1.Node) (*Result, error) {
framework.Logf("Getting external IP address for %s", node.Name)
host := ""
for _, a := range node.Status.Addresses {
if a.Type == v1.NodeExternalIP && a.Address != "" {
host = net.JoinHostPort(a.Address, SSHPort)
break
}
}
if host == "" {
// No external IPs were found, let's try to use internal as plan B
for _, a := range node.Status.Addresses {
if a.Type == v1.NodeInternalIP && a.Address != "" {
host = net.JoinHostPort(a.Address, SSHPort)
break
}
}
}
if host == "" {
return nil, fmt.Errorf("couldn't find any IP address for node %s", node.Name)
}
framework.Logf("SSH %q on %s(%s)", cmd, node.Name, host)
result, err := SSH(ctx, cmd, host, provider)
LogResult(result)
if result.Code != 0 || err != nil {
return nil, fmt.Errorf("failed running %q: %v (exit code %d, stderr %v)",
cmd, err, result.Code, result.Stderr)
}
return &result, nil
}
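// exampleReadKubeletVersion is an illustrative sketch (not part of the upstream helpers above)
// showing how IssueSSHCommandWithResult is typically used against a single node object and how
// the captured stdout can be consumed. The "kubelet --version" command is a hypothetical example.
func exampleReadKubeletVersion(ctx context.Context, provider string, node *v1.Node) (string, error) {
	result, err := IssueSSHCommandWithResult(ctx, "kubelet --version", provider, node)
	if err != nil {
		return "", err
	}
	return result.Stdout, nil
}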
// IssueSSHCommand tries to execute a SSH command
func IssueSSHCommand(ctx context.Context, cmd, provider string, node *v1.Node) error {
_, err := IssueSSHCommandWithResult(ctx, cmd, provider, node)
if err != nil {
return err
}
return nil
}
// nodeAddresses returns the first address of the given type of each node.
func nodeAddresses(nodelist *v1.NodeList, addrType v1.NodeAddressType) []string {
hosts := []string{}
for _, n := range nodelist.Items {
for _, addr := range n.Status.Addresses {
if addr.Type == addrType && addr.Address != "" {
hosts = append(hosts, addr.Address)
break
}
}
}
return hosts
}
// waitListSchedulableNodes is a wrapper around listing nodes supporting retries.
func waitListSchedulableNodes(ctx context.Context, c clientset.Interface) (*v1.NodeList, error) {
var nodes *v1.NodeList
var err error
if wait.PollUntilContextTimeout(ctx, pollNodeInterval, singleCallTimeout, true, func(ctx context.Context) (bool, error) {
nodes, err = c.CoreV1().Nodes().List(ctx, metav1.ListOptions{FieldSelector: fields.Set{
"spec.unschedulable": "false",
}.AsSelector().String()})
if err != nil {
return false, err
}
return true, nil
}) != nil {
return nodes, err
}
return nodes, nil
}
// waitListSchedulableNodesOrDie is a wrapper around waitListSchedulableNodes that fails the test on any error.
func waitListSchedulableNodesOrDie(ctx context.Context, c clientset.Interface) *v1.NodeList {
nodes, err := waitListSchedulableNodes(ctx, c)
if err != nil {
expectNoError(err, "Non-retryable failure or timed out while listing nodes for e2e cluster.")
}
return nodes
}
// expectNoError checks if "err" is set, and if so, fails assertion while logging the error.
func expectNoError(err error, explain ...interface{}) {
expectNoErrorWithOffset(1, err, explain...)
}
// expectNoErrorWithOffset checks if "err" is set, and if so, fails assertion while logging the error at "offset" levels above its caller
// (for example, for call chain f -> g -> ExpectNoErrorWithOffset(1, ...) error would be logged for "f").
func expectNoErrorWithOffset(offset int, err error, explain ...interface{}) {
if err != nil {
framework.Logf("Unexpected error occurred: %v", err)
}
gomega.ExpectWithOffset(1+offset, err).NotTo(gomega.HaveOccurred(), explain...)
}


@@ -0,0 +1,676 @@
/*
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"context"
"crypto/rand"
"encoding/base64"
"errors"
"flag"
"fmt"
"io"
"math"
"os"
"path"
"path/filepath"
"sort"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/ginkgo/v2/reporters"
"github.com/onsi/ginkgo/v2/types"
"github.com/onsi/gomega"
gomegaformat "github.com/onsi/gomega/format"
"k8s.io/apimachinery/pkg/util/sets"
restclient "k8s.io/client-go/rest"
"k8s.io/client-go/tools/clientcmd"
cliflag "k8s.io/component-base/cli/flag"
"k8s.io/klog/v2"
"k8s.io/kubernetes/test/e2e/framework/internal/junit"
"k8s.io/kubernetes/test/utils/image"
"k8s.io/kubernetes/test/utils/kubeconfig"
)
const (
defaultHost = "https://127.0.0.1:6443"
// DefaultNumNodes is the number of nodes. If not specified, then the number of nodes is auto-detected
DefaultNumNodes = -1
)
var (
// Output is used for output when not running tests, for example in -list-tests.
// Test output should go to ginkgo.GinkgoWriter.
Output io.Writer = os.Stdout
// Exit is called when the framework detects fatal errors or when
// it is done with the execution of e.g. -list-tests.
Exit = os.Exit
// CheckForBugs determines whether the framework bails out when
// test initialization found any bugs.
CheckForBugs = true
)
// TestContextType contains test settings and global state. Due to
// historic reasons, it is a mixture of items managed by the test
// framework itself, cloud providers and individual tests.
// The goal is to move anything not required by the framework
// into the code which uses the settings.
//
// The recommendation for those settings is:
// - They are stored in their own context structure or local
// variables.
// - The standard `flag` package is used to register them.
// The flag name should follow the pattern <part1>.<part2>....<partn>
// where the prefix is unlikely to conflict with other tests or
// standard packages and each part is in lower camel case. For
// example, test/e2e/storage/csi/context.go could define
// storage.csi.numIterations.
// - framework/config can be used to simplify the registration of
// multiple options with a single function call:
//	var storageCSI struct {
//	    NumIterations int `default:"1" usage:"number of iterations"`
//	}
//	var _ = config.AddOptions(&storageCSI, "storage.csi")
// - Direct use of Viper in tests is possible, but discouraged because
// it only works in test suites which use Viper (which is not
// required) and the supported options cannot be
// discovered by a test suite user.
//
// Test suite authors can use framework/viper to make all command line
// parameters also configurable via a configuration file.
type TestContextType struct {
KubeConfig string
KubeContext string
KubeAPIContentType string
KubeletRootDir string
KubeletConfigDropinDir string
CertDir string
Host string
BearerToken string `datapolicy:"token"`
// TODO: Deprecating this over time... instead just use gobindata_util.go , see #23987.
RepoRoot string
// ListImages will list all images that are used, then quit.
ListImages bool
listTests, listLabels bool
// ListConformanceTests will list all conformance tests that are available, then quit.
ListConformanceTests bool
// Provider identifies the infrastructure provider (gce, gke, aws)
Provider string
// Tooling is the tooling in use (e.g. kops, gke). Provider is the cloud provider and might not uniquely identify the tooling.
Tooling string
// timeouts contains user-configurable timeouts for various operations.
// Individual Framework instance also have such timeouts which may be
// different from these here. To avoid confusion, this field is not
// exported. Its values can be accessed through
// NewTimeoutContext.
timeouts TimeoutContext
CloudConfig CloudConfig
KubectlPath string
OutputDir string
ReportDir string
ReportPrefix string
ReportCompleteGinkgo bool
ReportCompleteJUnit bool
Prefix string
MinStartupPods int
EtcdUpgradeStorage string
EtcdUpgradeVersion string
GCEUpgradeScript string
ContainerRuntimeEndpoint string
ContainerRuntimeProcessName string
ContainerRuntimePidFile string
// SystemdServices is a comma-separated list of systemd services the test framework
// will dump logs for.
SystemdServices string
// DumpSystemdJournal controls whether to dump the full systemd journal.
DumpSystemdJournal bool
ImageServiceEndpoint string
MasterOSDistro string
NodeOSDistro string
NodeOSArch string
VerifyServiceAccount bool
DeleteNamespace bool
DeleteNamespaceOnFailure bool
AllowedNotReadyNodes int
CleanStart bool
// If set to 'true' or 'all', the framework will start a goroutine monitoring resource usage of system add-ons.
// It will read the data every 30 seconds from all Nodes and print a summary during afterEach. If set to 'master',
// only the master Node will be monitored.
GatherKubeSystemResourceUsageData string
GatherLogsSizes bool
GatherMetricsAfterTest string
GatherSuiteMetricsAfterTest bool
MaxNodesToGather int
// If set to 'true' framework will gather ClusterAutoscaler metrics when gathering them for other components.
IncludeClusterAutoscalerMetrics bool
// Currently supported values are 'hr' for human-readable and 'json'. It's a comma separated list.
OutputPrintType string
// CreateTestingNS is responsible for creating namespace used for executing e2e tests.
// It accepts namespace base name, which will be prepended with e2e prefix, kube client
// and labels to be applied to a namespace.
CreateTestingNS CreateTestingNSFn
// If set to true test will dump data about the namespace in which test was running.
DumpLogsOnFailure bool
// Disables dumping cluster log from master and nodes after all tests.
DisableLogDump bool
// Path to the GCS artifacts directory to dump logs from nodes. Logexporter gets enabled if this is non-empty.
LogexporterGCSPath string
// Node e2e specific test context
NodeTestContextType
// The DNS Domain of the cluster.
ClusterDNSDomain string
// The configuration of NodeKiller.
NodeKiller NodeKillerConfig
// The Default IP Family of the cluster ("ipv4" or "ipv6")
IPFamily string
// NonblockingTaints is the comma-delimited string given by the user to specify taints which should not stop the test framework from running tests.
NonblockingTaints string
// ProgressReportURL is the URL which progress updates will be posted to as tests complete. If empty, no updates are sent.
ProgressReportURL string
// SriovdpConfigMapFile is the path to the ConfigMap to configure the SRIOV device plugin on this host.
SriovdpConfigMapFile string
// SpecSummaryOutput is the file to write ginkgo.SpecSummary objects to as tests complete. Useful for debugging and test introspection.
SpecSummaryOutput string
// DockerConfigFile is a file that contains credentials which can be used to pull images from certain private registries, needed for a test.
DockerConfigFile string
// E2EDockerConfigFile is a docker credentials configuration file that contains an authorization token which can be used to pull images from certain private registries provided by the users.
// For more details refer https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#log-in-to-docker-hub
E2EDockerConfigFile string
// KubeTestRepoList is the path to a yaml file used for overriding registries for test images.
KubeTestRepoList string
// SnapshotControllerPodName is the name used for identifying the snapshot controller pod.
SnapshotControllerPodName string
// SnapshotControllerHTTPPort the port used for communicating with the snapshot controller HTTP endpoint.
SnapshotControllerHTTPPort int
// RequireDevices makes it mandatory that the environment on which tests are run exposes 1+ devices through device plugins.
// With this enabled, e2e tests that require devices for their operation may fail if no devices are reported.
RequireDevices bool
// Enable volume drivers which are disabled by default. See test/e2e/storage/in_tree_volumes.go for details.
EnabledVolumeDrivers []string
}
// NodeKillerConfig describes configuration of NodeKiller -- a utility to
// simulate node failures.
//
// TODO: move this and the corresponding command line flags into
// test/e2e/framework/node.
type NodeKillerConfig struct {
// Enabled determines whether NodeKill should do anything at all.
// All other options below are ignored if Enabled = false.
Enabled bool
// FailureRatio is a percentage of all nodes that could fail simultaneously.
FailureRatio float64
// Interval is time between node failures.
Interval time.Duration
// JitterFactor is factor used to jitter node failures.
// Node will be killed between [Interval, Interval * (1.0 + JitterFactor)].
JitterFactor float64
// SimulatedDowntime is a duration between node is killed and recreated.
SimulatedDowntime time.Duration
// NodeKillerStopCtx is a context that is used to notify NodeKiller to stop killing nodes.
NodeKillerStopCtx context.Context
// NodeKillerStop is the cancel function for NodeKillerStopCtx.
NodeKillerStop func()
}
// NodeTestContextType is part of TestContextType, it is shared by all node e2e test.
type NodeTestContextType struct {
// NodeE2E indicates whether it is running node e2e.
NodeE2E bool
// Name of the node to run tests on.
NodeName string
// NodeConformance indicates whether the test is running in node conformance mode.
NodeConformance bool
// PrepullImages indicates whether node e2e framework should prepull images.
PrepullImages bool
// ImageDescription is the description of the image on which the test is running.
ImageDescription string
// RuntimeConfig is a map of API server runtime configuration values.
RuntimeConfig map[string]string
// SystemSpecName is the name of the system spec (e.g., gke) that's used in
// the node e2e test. If empty, the default one (system.DefaultSpec) is
// used. The system specs are in test/e2e_node/system/specs/.
SystemSpecName string
// RestartKubelet restarts Kubelet unit when the process is killed.
RestartKubelet bool
// ExtraEnvs is a map of environment names to values.
ExtraEnvs map[string]string
// StandaloneMode indicates whether the test is running kubelet in a standalone mode.
StandaloneMode bool
// CriProxyEnabled indicates whether to enable the CRI API proxy for failure injection.
CriProxyEnabled bool
}
// CloudConfig holds the cloud configuration for e2e test suites.
type CloudConfig struct {
APIEndpoint string
ProjectID string
Zone string // for multizone tests, arbitrarily chosen zone
Zones []string // for multizone tests, use this set of zones instead of querying the cloud provider. Must include Zone.
Region string
MultiZone bool
MultiMaster bool
Cluster string
MasterName string
NodeInstanceGroup string // comma-delimited list of groups' names
NumNodes int
ClusterIPRange string
ClusterTag string
Network string
ConfigFile string // for azure
NodeTag string
MasterTag string
Provider ProviderInterface
}
// TestContext should be used by all tests to access common context data.
var TestContext = TestContextType{
timeouts: defaultTimeouts,
}
// stringArrayValue is used with flag.Var for a comma-separated list of strings placed into a string array.
type stringArrayValue struct {
stringArray *[]string
}
func (v stringArrayValue) String() string {
if v.stringArray != nil {
return strings.Join(*v.stringArray, ",")
}
return ""
}
func (v stringArrayValue) Set(s string) error {
if len(s) == 0 {
*v.stringArray = []string{}
} else {
*v.stringArray = strings.Split(s, ",")
}
return nil
}
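// Example (sketch): with the flags.Var(&stringArrayValue{...}, "enabled-volume-drivers", ...)
// registration below, a command line such as
//
//	e2e.test --enabled-volume-drivers=gcepd,azure
//
// results in TestContext.EnabledVolumeDrivers == []string{"gcepd", "azure"}.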
// ClusterIsIPv6 returns true if the cluster is IPv6
func (tc TestContextType) ClusterIsIPv6() bool {
return tc.IPFamily == "ipv6"
}
// RegisterCommonFlags registers flags common to all e2e test suites.
// The flag set can be flag.CommandLine (if desired) or a custom
// flag set that then gets passed to viperconfig.ViperizeFlags.
//
// The other Register*Flags methods below can be used to add more
// test-specific flags. However, those settings then get added
// regardless of whether the test is actually in the test suite.
//
// For tests that have been converted to registering their
// options themselves, copy flags from test/e2e/framework/config
// as shown in HandleFlags.
func RegisterCommonFlags(flags *flag.FlagSet) {
// The default is too low for objects like pods, even when using YAML. We double the default.
flags.IntVar(&gomegaformat.MaxLength, "gomega-max-length", 8000, "Sets the maximum size for the gomega formatter (= gomega.MaxLength). Use 0 to disable truncation.")
flags.StringVar(&TestContext.GatherKubeSystemResourceUsageData, "gather-resource-usage", "false", "If set to 'true' or 'all', the framework will monitor resource usage of all system add-ons in (some) e2e tests; if set to 'master', only the master node is monitored; if set to 'none' or 'false', monitoring is turned off.")
flags.BoolVar(&TestContext.GatherLogsSizes, "gather-logs-sizes", false, "If set to true framework will be monitoring logs sizes on all machines running e2e tests.")
flags.IntVar(&TestContext.MaxNodesToGather, "max-nodes-to-gather-from", 20, "The maximum number of nodes to gather extended info from on test failure.")
flags.StringVar(&TestContext.GatherMetricsAfterTest, "gather-metrics-at-teardown", "false", "If set to 'true' framework will gather metrics from all components after each test. If set to 'master' only master component metrics would be gathered.")
flags.BoolVar(&TestContext.GatherSuiteMetricsAfterTest, "gather-suite-metrics-at-teardown", false, "If set to true framework will gather metrics from all components after the whole test suite completes.")
flags.BoolVar(&TestContext.IncludeClusterAutoscalerMetrics, "include-cluster-autoscaler", false, "If set to true, framework will include Cluster Autoscaler when gathering metrics.")
flags.StringVar(&TestContext.OutputPrintType, "output-print-type", "json", "Format in which summaries should be printed: 'hr' for human readable, 'json' for JSON ones.")
flags.BoolVar(&TestContext.DumpLogsOnFailure, "dump-logs-on-failure", true, "If set to true test will dump data about the namespace in which test was running.")
flags.BoolVar(&TestContext.DisableLogDump, "disable-log-dump", false, "If set to true, logs from master and nodes won't be gathered after test run.")
flags.StringVar(&TestContext.LogexporterGCSPath, "logexporter-gcs-path", "", "Path to the GCS artifacts directory to dump logs from nodes. Logexporter gets enabled if this is non-empty.")
flags.BoolVar(&TestContext.DeleteNamespace, "delete-namespace", true, "If true tests will delete namespace after completion. It is only designed to make debugging easier, DO NOT turn it off by default.")
flags.BoolVar(&TestContext.DeleteNamespaceOnFailure, "delete-namespace-on-failure", true, "If true, framework will delete test namespace on failure. Used only during test debugging.")
flags.IntVar(&TestContext.AllowedNotReadyNodes, "allowed-not-ready-nodes", 0, "If greater than zero, framework will allow for that many non-ready nodes when checking for all ready nodes. If -1, no waiting will be performed for ready nodes or daemonset pods.")
flags.StringVar(&TestContext.Host, "host", "", fmt.Sprintf("The host, or apiserver, to connect to. Will default to %s if this argument and --kubeconfig are not set.", defaultHost))
flags.StringVar(&TestContext.ReportPrefix, "report-prefix", "", "Optional prefix for JUnit XML reports. Default is empty, which doesn't prepend anything to the default name.")
flags.StringVar(&TestContext.ReportDir, "report-dir", "", "Path to the directory where the simplified JUnit XML reports and other tests results should be saved. Default is empty, which doesn't generate these reports. If ginkgo's -junit-report parameter is used, that parameter instead of -report-dir determines the location of a single JUnit report.")
flags.BoolVar(&TestContext.ReportCompleteGinkgo, "report-complete-ginkgo", false, "Enables writing a complete test report as Ginkgo JSON to <report dir>/ginkgo/report.json. Ignored if --report-dir is not set.")
flags.BoolVar(&TestContext.ReportCompleteJUnit, "report-complete-junit", false, "Enables writing a complete test report as JUnit XML to <report dir>/ginkgo/report.xml. Ignored if --report-dir is not set.")
flags.StringVar(&TestContext.ContainerRuntimeEndpoint, "container-runtime-endpoint", "unix:///run/containerd/containerd.sock", "The container runtime endpoint of cluster VM instances.")
flags.StringVar(&TestContext.ContainerRuntimeProcessName, "container-runtime-process-name", "containerd", "The name of the container runtime process.")
flags.StringVar(&TestContext.ContainerRuntimePidFile, "container-runtime-pid-file", "/run/containerd/containerd.pid", "The pid file of the container runtime.")
flags.StringVar(&TestContext.SystemdServices, "systemd-services", "containerd*", "The comma separated list of systemd services the framework will dump logs for.")
flags.BoolVar(&TestContext.DumpSystemdJournal, "dump-systemd-journal", false, "Whether to dump the full systemd journal.")
flags.StringVar(&TestContext.ImageServiceEndpoint, "image-service-endpoint", "", "The image service endpoint of cluster VM instances.")
flags.StringVar(&TestContext.NonblockingTaints, "non-blocking-taints", `node-role.kubernetes.io/control-plane`, "Nodes with taints in this comma-delimited list will not block the test framework from starting tests.")
flags.BoolVar(&TestContext.ListImages, "list-images", false, "If true, will show list of images used for running tests.")
flags.BoolVar(&TestContext.listLabels, "list-labels", false, "If true, will show the list of labels that can be used to select tests via -ginkgo.label-filter.")
flags.BoolVar(&TestContext.listTests, "list-tests", false, "If true, will show the full names of all tests (aka specs) that can be used to select test via -ginkgo.focus/skip.")
flags.StringVar(&TestContext.KubectlPath, "kubectl-path", "kubectl", "The kubectl binary to use. For development, you might use 'cluster/kubectl.sh' here.")
flags.StringVar(&TestContext.ProgressReportURL, "progress-report-url", "", "The URL to POST progress updates to as the suite runs to assist in aiding integrations. If empty, no messages sent.")
flags.StringVar(&TestContext.SpecSummaryOutput, "spec-dump", "", "The file to dump all ginkgo.SpecSummary to after tests run. If empty, no objects are saved/printed.")
flags.StringVar(&TestContext.DockerConfigFile, "docker-config-file", "", "A docker credential file which contains authorization token that is used to perform image pull tests from an authenticated registry. For more details regarding the content of the file refer https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#log-in-to-docker-hub")
flags.StringVar(&TestContext.E2EDockerConfigFile, "e2e-docker-config-file", "", "A docker credentials configuration file used which contains authorization token that can be used to pull images from certain private registries provided by the users. For more details refer https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/#log-in-to-docker-hub")
flags.StringVar(&TestContext.KubeTestRepoList, "kube-test-repo-list", "", "A yaml file used for overriding registries for test images. Alternatively, the KUBE_TEST_REPO_LIST env variable can be set.")
flags.StringVar(&TestContext.SnapshotControllerPodName, "snapshot-controller-pod-name", "", "The pod name to use for identifying the snapshot controller in the kube-system namespace.")
flags.IntVar(&TestContext.SnapshotControllerHTTPPort, "snapshot-controller-http-port", 0, "The port to use for snapshot controller HTTP communication.")
flags.Var(&stringArrayValue{&TestContext.EnabledVolumeDrivers}, "enabled-volume-drivers", "Comma-separated list of in-tree volume drivers to enable for testing. This is only needed for in-tree drivers disabled by default. An example is gcepd; see test/e2e/storage/in_tree_volumes.go for full details.")
}
func CreateGinkgoConfig() (types.SuiteConfig, types.ReporterConfig) {
// fetch the current config
suiteConfig, reporterConfig := ginkgo.GinkgoConfiguration()
// Randomize specs as well as suites
suiteConfig.RandomizeAllSpecs = true
// Disable skipped tests unless they are explicitly requested.
if len(suiteConfig.FocusStrings) == 0 && len(suiteConfig.SkipStrings) == 0 && suiteConfig.LabelFilter == "" {
suiteConfig.SkipStrings = []string{`\[Flaky\]|\[Feature:.+\]`}
}
return suiteConfig, reporterConfig
}
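// Example (sketch, assuming a custom e2e suite): the returned configs are
// meant to be passed to ginkgo.RunSpecs from the suite's Go test entry point.
// The TestE2E name is illustrative.
//
//	func TestE2E(t *testing.T) {
//	    gomega.RegisterFailHandler(ginkgo.Fail)
//	    suiteConfig, reporterConfig := framework.CreateGinkgoConfig()
//	    ginkgo.RunSpecs(t, "Kubernetes e2e suite", suiteConfig, reporterConfig)
//	}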
// RegisterClusterFlags registers flags specific to the cluster e2e test suite.
func RegisterClusterFlags(flags *flag.FlagSet) {
flags.BoolVar(&TestContext.VerifyServiceAccount, "e2e-verify-service-account", true, "If true tests will verify the service account before running.")
flags.StringVar(&TestContext.KubeConfig, clientcmd.RecommendedConfigPathFlag, os.Getenv(clientcmd.RecommendedConfigPathEnvVar), "Path to kubeconfig containing embedded authinfo.")
flags.StringVar(&TestContext.KubeContext, clientcmd.FlagContext, "", "kubeconfig context to use/override. If unset, will use value from 'current-context'")
flags.StringVar(&TestContext.KubeAPIContentType, "kube-api-content-type", "application/vnd.kubernetes.protobuf", "ContentType used to communicate with apiserver")
flags.StringVar(&TestContext.KubeletRootDir, "kubelet-root-dir", "/var/lib/kubelet", "The data directory of kubelet. Some tests (for example, CSI storage tests) deploy DaemonSets which need to know this value and cannot query it. Such tests only work in clusters where the data directory is the same on all nodes.")
flags.StringVar(&TestContext.KubeletRootDir, "volume-dir", "/var/lib/kubelet", "An alias for --kubelet-root-dir, kept for backwards compatibility.")
flags.StringVar(&TestContext.CertDir, "cert-dir", "", "Path to the directory containing the certs. Default is empty, which doesn't use certs.")
flags.StringVar(&TestContext.RepoRoot, "repo-root", "../../", "Root directory of kubernetes repository, for finding test files.")
// NOTE: Node E2E tests have this flag defined as well, but true by default.
// If this becomes true as well, they should be refactored into RegisterCommonFlags.
flags.BoolVar(&TestContext.PrepullImages, "prepull-images", false, "If true, prepull images so image pull failures do not cause test failures.")
flags.StringVar(&TestContext.Provider, "provider", "", "The name of the Kubernetes provider (gce, gke, local, skeleton (the fallback if not set), etc.)")
flags.StringVar(&TestContext.Tooling, "tooling", "", "The tooling in use (kops, gke, etc.)")
flags.StringVar(&TestContext.OutputDir, "e2e-output-dir", "/tmp", "Output directory for interesting/useful test data, like performance data, benchmarks, and other metrics.")
flags.StringVar(&TestContext.Prefix, "prefix", "e2e", "A prefix to be added to cloud resources created during testing.")
flags.StringVar(&TestContext.MasterOSDistro, "master-os-distro", "debian", "The OS distribution of cluster master (debian, ubuntu, gci, coreos, or custom).")
flags.StringVar(&TestContext.NodeOSDistro, "node-os-distro", "debian", "The OS distribution of cluster VM instances (debian, ubuntu, gci, coreos, windows, or custom), which determines how specific tests are implemented.")
flags.StringVar(&TestContext.NodeOSArch, "node-os-arch", "amd64", "The OS architecture of cluster VM instances (amd64, arm64, or custom).")
flags.StringVar(&TestContext.ClusterDNSDomain, "dns-domain", "cluster.local", "The DNS Domain of the cluster.")
// TODO: Flags per provider? Rename gce-project/gce-zone?
cloudConfig := &TestContext.CloudConfig
flags.StringVar(&cloudConfig.MasterName, "kube-master", "", "Name of the kubernetes master. Only required if provider is gce or gke")
flags.StringVar(&cloudConfig.APIEndpoint, "gce-api-endpoint", "", "The GCE APIEndpoint being used, if applicable")
flags.StringVar(&cloudConfig.ProjectID, "gce-project", "", "The GCE project being used, if applicable")
flags.StringVar(&cloudConfig.Zone, "gce-zone", "", "GCE zone being used, if applicable")
flags.Var(cliflag.NewStringSlice(&cloudConfig.Zones), "gce-zones", "The set of zones to use in a multi-zone test instead of querying the cloud provider.")
flags.StringVar(&cloudConfig.Region, "gce-region", "", "GCE region being used, if applicable")
flags.BoolVar(&cloudConfig.MultiZone, "gce-multizone", false, "If true, start GCE cloud provider with multizone support.")
flags.BoolVar(&cloudConfig.MultiMaster, "gce-multimaster", false, "If true, the underlying GCE/GKE cluster is assumed to be multi-master.")
flags.StringVar(&cloudConfig.Cluster, "gke-cluster", "", "GKE name of cluster being used, if applicable")
flags.StringVar(&cloudConfig.NodeInstanceGroup, "node-instance-group", "", "Name of the managed instance group for nodes. Valid only for gce, gke or aws. If there is more than one group: comma separated list of groups.")
flags.StringVar(&cloudConfig.Network, "network", "e2e", "The cloud provider network for this e2e cluster.")
flags.IntVar(&cloudConfig.NumNodes, "num-nodes", DefaultNumNodes, fmt.Sprintf("Number of nodes in the cluster. If the default value of %d is used, the number of schedulable nodes is auto-detected.", DefaultNumNodes))
flags.StringVar(&cloudConfig.ClusterIPRange, "cluster-ip-range", "10.64.0.0/14", "A CIDR notation IP range from which to assign IPs in the cluster.")
flags.StringVar(&cloudConfig.NodeTag, "node-tag", "", "Network tags used on node instances. Valid only for gce, gke")
flags.StringVar(&cloudConfig.MasterTag, "master-tag", "", "Network tags used on master instances. Valid only for gce, gke")
flags.StringVar(&cloudConfig.ClusterTag, "cluster-tag", "", "Tag used to identify resources. Only required if provider is aws.")
flags.StringVar(&cloudConfig.ConfigFile, "cloud-config-file", "", "Cloud config file. Only required if provider is azure or vsphere.")
flags.IntVar(&TestContext.MinStartupPods, "minStartupPods", 0, "The number of pods which we need to see in 'Running' state with a 'Ready' condition of true, before we try running tests. This is useful in any cluster which needs some base pod-based services running before it can be used. If set to -1, no pods are checked and tests run straight away.")
flags.DurationVar(&TestContext.timeouts.SystemPodsStartup, "system-pods-startup-timeout", TestContext.timeouts.SystemPodsStartup, "Timeout for waiting for all system pods to be running before starting tests.")
flags.DurationVar(&TestContext.timeouts.NodeSchedulable, "node-schedulable-timeout", TestContext.timeouts.NodeSchedulable, "Timeout for waiting for all nodes to be schedulable.")
flags.DurationVar(&TestContext.timeouts.SystemDaemonsetStartup, "system-daemonsets-startup-timeout", TestContext.timeouts.SystemDaemonsetStartup, "Timeout for waiting for all system daemonsets to be ready.")
flags.StringVar(&TestContext.EtcdUpgradeStorage, "etcd-upgrade-storage", "", "The storage version to upgrade to (either 'etcdv2' or 'etcdv3') if doing an etcd upgrade test.")
flags.StringVar(&TestContext.EtcdUpgradeVersion, "etcd-upgrade-version", "", "The etcd binary version to upgrade to (e.g., '3.0.14', '2.3.7') if doing an etcd upgrade test.")
flags.StringVar(&TestContext.GCEUpgradeScript, "gce-upgrade-script", "", "Script to use to upgrade a GCE cluster.")
flags.BoolVar(&TestContext.CleanStart, "clean-start", false, "If true, purge all namespaces except default and system before running tests. This serves to Cleanup test namespaces from failed/interrupted e2e runs in a long-lived cluster.")
nodeKiller := &TestContext.NodeKiller
flags.BoolVar(&nodeKiller.Enabled, "node-killer", false, "Whether NodeKiller should kill any nodes.")
flags.Float64Var(&nodeKiller.FailureRatio, "node-killer-failure-ratio", 0.01, "Percentage of nodes to be killed")
flags.DurationVar(&nodeKiller.Interval, "node-killer-interval", 1*time.Minute, "Time between node failures.")
flags.Float64Var(&nodeKiller.JitterFactor, "node-killer-jitter-factor", 60, "Factor used to jitter node failures.")
flags.DurationVar(&nodeKiller.SimulatedDowntime, "node-killer-simulated-downtime", 10*time.Minute, "A delay between node death and recreation")
}
// generateSecureToken returns a string of length tokenLen, consisting
// of random bytes encoded as base64 for use as a Bearer Token during
// communication with an APIServer
func generateSecureToken(tokenLen int) (string, error) {
// Number of random bytes needed so that the base64-encoded token has length tokenLen.
tokenSize := math.Ceil(float64(tokenLen) * 6 / 8)
rawToken := make([]byte, int(tokenSize))
if _, err := rand.Read(rawToken); err != nil {
return "", err
}
encoded := base64.RawURLEncoding.EncodeToString(rawToken)
token := encoded[:tokenLen]
return token, nil
}
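// Example (sketch): for the tokenLen of 16 used further below, tokenSize is
// ceil(16 * 6 / 8) = 12 random bytes. Each base64url character carries 6 bits,
// so 12 bytes encode to exactly 16 characters and the [:tokenLen] slice keeps
// the whole encoding.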
// AfterReadingAllFlags makes changes to the context after all flags
// have been read and prepares the process for a test run.
func AfterReadingAllFlags(t *TestContextType) {
// Reconfigure klog so that output goes to the GinkgoWriter instead
// of stderr. The advantage is that it then gets interleaved properly
// with output that goes to GinkgoWriter (By, Logf).
// These flags are not exposed via the normal command line flag set,
// therefore we have to use our own private one here.
if t.KubeTestRepoList != "" {
image.Init(t.KubeTestRepoList)
}
if t.ListImages {
for _, v := range image.GetImageConfigs() {
fmt.Println(v.GetE2EImage())
}
Exit(0)
}
// Reconfigure gomega defaults. The poll interval should be suitable
// for most tests. The timeouts are more subjective and tests may want
// to override them, but these defaults are still better for E2E than the
// ones from Gomega (1s timeout, 10ms interval).
gomega.SetDefaultEventuallyPollingInterval(t.timeouts.Poll)
gomega.SetDefaultConsistentlyPollingInterval(t.timeouts.Poll)
gomega.SetDefaultEventuallyTimeout(t.timeouts.PodStart)
gomega.SetDefaultConsistentlyDuration(t.timeouts.PodStartShort)
gomega.EnforceDefaultTimeoutsWhenUsingContexts()
// ginkgo.PreviewSpecs will expand all nodes and thus may find new bugs.
report := ginkgo.PreviewSpecs("Kubernetes e2e test statistics")
validateSpecs(report.SpecReports)
if err := FormatBugs(); CheckForBugs && err != nil {
// Refuse to do anything if the E2E suite is buggy.
fmt.Fprint(Output, "ERROR: E2E suite initialization was faulty, these errors must be fixed:")
fmt.Fprint(Output, "\n"+err.Error())
Exit(1)
}
if t.listLabels || t.listTests {
listTestInformation(report)
Exit(0)
}
// Only set a default host if one won't be supplied via kubeconfig
if len(t.Host) == 0 && len(t.KubeConfig) == 0 {
// Check if we can use the in-cluster config
if clusterConfig, err := restclient.InClusterConfig(); err == nil {
if tempFile, err := os.CreateTemp(os.TempDir(), "kubeconfig-"); err == nil {
kubeConfig := kubeconfig.CreateKubeConfig(clusterConfig)
clientcmd.WriteToFile(*kubeConfig, tempFile.Name())
t.KubeConfig = tempFile.Name()
klog.V(4).Infof("Using a temporary kubeconfig file from in-cluster config : %s", tempFile.Name())
}
}
if len(t.KubeConfig) == 0 {
klog.Warningf("Unable to find in-cluster config, using default host : %s", defaultHost)
t.Host = defaultHost
}
}
if len(t.BearerToken) == 0 {
var err error
t.BearerToken, err = generateSecureToken(16)
ExpectNoError(err, "Failed to generate bearer token")
}
// Allow 1% of nodes to be unready (statistically) - relevant for large clusters.
if t.AllowedNotReadyNodes == 0 {
t.AllowedNotReadyNodes = t.CloudConfig.NumNodes / 100
}
klog.V(4).Infof("Tolerating taints %q when considering if nodes are ready", TestContext.NonblockingTaints)
// Make sure that all test runs have a valid TestContext.CloudConfig.Provider.
// TODO: whether and how long this code is needed is getting discussed
// in https://github.com/kubernetes/kubernetes/issues/70194.
if TestContext.Provider == "" {
// Some users of the e2e.test binary pass --provider=.
// We need to support that, changing it would break those usages.
Logf("The --provider flag is not set. Continuing as if --provider=skeleton had been used.")
TestContext.Provider = "skeleton"
}
var err error
TestContext.CloudConfig.Provider, err = SetupProviderConfig(TestContext.Provider)
if err != nil {
if os.IsNotExist(errors.Unwrap(err)) {
// Provide a more helpful error message when the provider is unknown.
var providers []string
for _, name := range GetProviders() {
// The empty string is accepted, but looks odd in the output below unless we quote it.
if name == "" {
name = `""`
}
providers = append(providers, name)
}
sort.Strings(providers)
klog.Errorf("Unknown provider %q. The following providers are known: %v", TestContext.Provider, strings.Join(providers, " "))
} else {
klog.Errorf("Failed to setup provider config for %q: %v", TestContext.Provider, err)
}
Exit(1)
}
if TestContext.ReportDir != "" {
// Create the directory before running the suite. If
// --report-dir is not usable, we should report
// that as soon as possible. This will be done by each worker
// in parallel, so we will get "exists" error in most of them.
if err := os.MkdirAll(TestContext.ReportDir, 0777); err != nil && !os.IsExist(err) {
klog.Errorf("Create report dir: %v", err)
Exit(1)
}
ginkgoDir := path.Join(TestContext.ReportDir, "ginkgo")
if TestContext.ReportCompleteGinkgo || TestContext.ReportCompleteJUnit {
if err := os.MkdirAll(ginkgoDir, 0777); err != nil && !os.IsExist(err) {
klog.Errorf("Create <report-dir>/ginkgo: %v", err)
Exit(1)
}
}
if TestContext.ReportCompleteGinkgo {
ginkgo.ReportAfterSuite("Ginkgo JSON report", func(report ginkgo.Report) {
ExpectNoError(reporters.GenerateJSONReport(report, path.Join(ginkgoDir, "report.json")))
})
ginkgo.ReportAfterSuite("JUnit XML report", func(report ginkgo.Report) {
ExpectNoError(reporters.GenerateJUnitReport(report, path.Join(ginkgoDir, "report.xml")))
})
}
ginkgo.ReportAfterSuite("Kubernetes e2e JUnit report", func(report ginkgo.Report) {
// With Ginkgo v1, we used to write one file per
// parallel node. Now Ginkgo v2 automatically merges
// all results into a report for us. The 01 suffix is
// kept in case that users expect files to be called
// "junit_<prefix><number>.xml".
junitReport := path.Join(TestContext.ReportDir, "junit_"+TestContext.ReportPrefix+"01.xml")
// writeJUnitReport generates a JUnit file in the e2e
// report directory that is shorter than the one
// normally written by `ginkgo --junit-report`. This is
// needed because the full report can become too large
// for tools like Spyglass
// (https://github.com/kubernetes/kubernetes/issues/111510).
ExpectNoError(junit.WriteJUnitReport(report, junitReport))
})
}
}
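// Example (sketch, assuming a custom e2e suite): the usual initialization
// order is to register flags, parse them, and only then call
// AfterReadingAllFlags before handing control to Ginkgo. The handleFlags
// name is illustrative.
//
//	func handleFlags() {
//	    framework.RegisterCommonFlags(flag.CommandLine)
//	    framework.RegisterClusterFlags(flag.CommandLine)
//	    flag.Parse()
//	}
//
//	func TestMain(m *testing.M) {
//	    handleFlags()
//	    framework.AfterReadingAllFlags(&framework.TestContext)
//	    os.Exit(m.Run())
//	}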
func listTestInformation(report ginkgo.Report) {
indent := strings.Repeat(" ", 4)
if TestContext.listLabels {
labels := sets.New[string]()
for _, spec := range report.SpecReports {
if spec.LeafNodeType == types.NodeTypeIt {
labels.Insert(spec.Labels()...)
}
}
fmt.Fprintf(Output, "The following labels can be used with 'ginkgo run --label-filter':\n%s%s\n\n", indent, strings.Join(sets.List(labels), "\n"+indent))
}
if TestContext.listTests {
leafs := make([][]string, 0, len(report.SpecReports))
wd, _ := os.Getwd()
for _, spec := range report.SpecReports {
if spec.LeafNodeType == types.NodeTypeIt {
leafs = append(leafs, []string{fmt.Sprintf("%s:%d: ", relativePath(wd, spec.LeafNodeLocation.FileName), spec.LeafNodeLocation.LineNumber), spec.FullText()})
}
}
// Sort by test name, not the source code location, because the test
// name is more stable across code refactoring.
sort.Slice(leafs, func(i, j int) bool {
return leafs[i][1] < leafs[j][1]
})
fmt.Fprint(Output, "The following spec names can be used with 'ginkgo run --focus/skip':\n")
for _, leaf := range leafs {
fmt.Fprintf(Output, "%s%s%s\n", indent, leaf[0], leaf[1])
}
fmt.Fprint(Output, "\n")
}
}
func relativePath(wd, path string) string {
if wd == "" {
return path
}
relpath, err := filepath.Rel(wd, path)
if err != nil {
return path
}
return relpath
}

View File

@ -0,0 +1,12 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies except of k/k/pkg, therefore we need to override the
# restrictions from the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: "^k8s[.]io/kubernetes/pkg"
allowedPrefixes: []
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,193 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package testfiles provides a wrapper around various optional ways
// of retrieving additional files needed during a test run:
// - builtin bindata
// - filesystem access
//
// Because it is a self-contained package, it can be used by
// test/e2e/framework and test/e2e/manifest without creating
// a circular dependency.
package testfiles
import (
"embed"
"errors"
"fmt"
"io/fs"
"os"
"path"
"path/filepath"
"strings"
)
var filesources []FileSource
// AddFileSource registers another provider for files that may be
// needed at runtime. Should be called during initialization of a test
// binary.
func AddFileSource(filesource FileSource) {
filesources = append(filesources, filesource)
}
// FileSource implements one way of retrieving test file content. For
// example, one file source could read from the original source code
// file tree, another from bindata compiled into a test executable.
type FileSource interface {
// ReadTestFile retrieves the content of a file that gets maintained
// alongside a test's source code. Files are identified by the
// relative path inside the repository containing the tests, for
// example "cluster/gce/upgrade.sh" inside kubernetes/kubernetes.
//
// When the file is not found, a nil slice is returned. An error is
// returned for all fatal errors.
ReadTestFile(filePath string) ([]byte, error)
// DescribeFiles returns a multi-line description of which
// files are available via this source. It is meant to be
// used as part of the error message when a file cannot be
// found.
DescribeFiles() string
}
// Read tries to retrieve the desired file content from
// one of the registered file sources.
func Read(filePath string) ([]byte, error) {
if len(filesources) == 0 {
return nil, fmt.Errorf("no file sources registered (yet?), cannot retrieve test file %s", filePath)
}
for _, filesource := range filesources {
data, err := filesource.ReadTestFile(filePath)
if err != nil {
return nil, fmt.Errorf("fatal error retrieving test file %s: %w", filePath, err)
}
if data != nil {
return data, nil
}
}
// Here we try to generate an error that points test authors
// or users in the right direction for resolving the problem.
err := fmt.Sprintf("Test file %q was not found.\n", filePath)
for _, filesource := range filesources {
err += filesource.DescribeFiles()
err += "\n"
}
return nil, errors.New(err)
}
// Exists checks whether a file could be read from any of the registered
// file sources. Unexpected errors while probing a source are returned to
// the caller.
func Exists(filePath string) (bool, error) {
for _, filesource := range filesources {
data, err := filesource.ReadTestFile(filePath)
if err != nil {
return false, err
}
if data != nil {
return true, nil
}
}
return false, nil
}
// RootFileSource looks for files relative to a root directory.
type RootFileSource struct {
Root string
}
// ReadTestFile looks for the file relative to the configured
// root directory. If the path is already absolute, for example
// in a test that has its own method of determining where
// files are, then the path will be used directly.
func (r RootFileSource) ReadTestFile(filePath string) ([]byte, error) {
var fullPath string
if path.IsAbs(filePath) {
fullPath = filePath
} else {
fullPath = filepath.Join(r.Root, filePath)
}
data, err := os.ReadFile(fullPath)
if os.IsNotExist(err) {
// Not an error (yet), some other provider may have the file.
return nil, nil
}
return data, err
}
// DescribeFiles explains that it looks for files inside a certain
// root directory.
func (r RootFileSource) DescribeFiles() string {
description := fmt.Sprintf("Test files are expected in %q", r.Root)
if !path.IsAbs(r.Root) {
// The default in test_context.go is the relative path
// ../../, which doesn't really help to locate the
// actual directory. Therefore we also add the absolute
// path if necessary.
abs, err := filepath.Abs(r.Root)
if err == nil {
description += fmt.Sprintf(" = %q", abs)
}
}
description += "."
return description
}
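// Example (sketch): a test binary that keeps its manifests in the source tree
// typically registers a RootFileSource once and then reads files by
// repository-relative path. The manifest path is illustrative.
//
//	func init() {
//	    testfiles.AddFileSource(testfiles.RootFileSource{Root: framework.TestContext.RepoRoot})
//	}
//
//	func loadManifest() ([]byte, error) {
//	    return testfiles.Read("test/e2e/testing-manifests/example.yaml")
//	}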
// EmbeddedFileSource handles files embedded into the test binary via an embed.FS.
type EmbeddedFileSource struct {
EmbeddedFS embed.FS
Root string
fileList []string
}
// ReadTestFile looks for an embedded file with the given path.
func (e EmbeddedFileSource) ReadTestFile(filepath string) ([]byte, error) {
relativePath := strings.TrimPrefix(filepath, fmt.Sprintf("%s/", e.Root))
b, err := e.EmbeddedFS.ReadFile(relativePath)
if err != nil {
if errors.Is(err, fs.ErrNotExist) {
return nil, nil
}
return nil, err
}
return b, nil
}
// DescribeFiles explains that it is looking inside an embedded filesystem
func (e EmbeddedFileSource) DescribeFiles() string {
var lines []string
lines = append(lines, "The following files are embedded into the test executable:")
if len(e.fileList) == 0 {
e.populateFileList()
}
lines = append(lines, e.fileList...)
return strings.Join(lines, "\n\t")
}
func (e *EmbeddedFileSource) populateFileList() {
fs.WalkDir(e.EmbeddedFS, ".", func(path string, d fs.DirEntry, err error) error {
if !d.IsDir() {
e.fileList = append(e.fileList, filepath.Join(e.Root, path))
}
return nil
})
}
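// Example (sketch): manifests can also be compiled into the test binary with
// go:embed and registered as an EmbeddedFileSource. With the layout assumed
// below, a request for "test/e2e/testing-manifests/example.yaml" resolves to
// "testing-manifests/example.yaml" inside the embedded filesystem. Names and
// paths are illustrative.
//
//	//go:embed testing-manifests
//	var testingManifestsFS embed.FS
//
//	func init() {
//	    testfiles.AddFileSource(testfiles.EmbeddedFileSource{
//	        EmbeddedFS: testingManifestsFS,
//	        Root:       "test/e2e",
//	    })
//	}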

View File

@ -0,0 +1,131 @@
/*
Copyright 2020 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import "time"
var defaultTimeouts = TimeoutContext{
Poll: 2 * time.Second, // from the former e2e/framework/pod poll interval
PodStart: 5 * time.Minute,
PodStartShort: 2 * time.Minute,
PodStartSlow: 15 * time.Minute,
PodDelete: 5 * time.Minute,
ClaimProvision: 5 * time.Minute,
ClaimProvisionShort: 1 * time.Minute,
DataSourceProvision: 5 * time.Minute,
ClaimBound: 3 * time.Minute,
PVReclaim: 3 * time.Minute,
PVBound: 3 * time.Minute,
PVCreate: 3 * time.Minute,
PVDelete: 5 * time.Minute,
PVDeleteSlow: 20 * time.Minute,
SnapshotCreate: 5 * time.Minute,
SnapshotDelete: 5 * time.Minute,
SnapshotControllerMetrics: 5 * time.Minute,
SystemPodsStartup: 10 * time.Minute,
NodeSchedulable: 30 * time.Minute,
SystemDaemonsetStartup: 5 * time.Minute,
NodeNotReady: 3 * time.Minute,
}
// TimeoutContext contains timeout settings for several actions.
type TimeoutContext struct {
// Poll is how long to wait between API calls when waiting for some condition.
Poll time.Duration
// PodStart is how long to wait for the pod to be started.
// This value is the default for gomega.Eventually.
PodStart time.Duration
// PodStartShort is the same as `PodStart`, but shorter.
// Use it on a case-by-case basis, mostly when you are sure pod start will not be delayed.
// This value is the default for gomega.Consistently.
PodStartShort time.Duration
// PodStartSlow is the same as `PodStart`, but longer.
// Use it on a case-by-case basis, mostly when you are sure pod start will take longer than usual.
PodStartSlow time.Duration
// PodDelete is how long to wait for the pod to be deleted.
PodDelete time.Duration
// ClaimProvision is how long claims have to become dynamically provisioned.
ClaimProvision time.Duration
// DataSourceProvision is how long claims have to become dynamically provisioned from source claim.
DataSourceProvision time.Duration
// ClaimProvisionShort is the same as `ClaimProvision`, but shorter.
ClaimProvisionShort time.Duration
// ClaimBound is how long claims have to become bound.
ClaimBound time.Duration
// PVReclaim is how long PVs have to become reclaimed.
PVReclaim time.Duration
// PVBound is how long PVs have to become bound.
PVBound time.Duration
// PVCreate is how long PVs have to be created.
PVCreate time.Duration
// PVDelete is how long PVs have to become deleted.
PVDelete time.Duration
// PVDeleteSlow is the same as PVDelete, but slower.
PVDeleteSlow time.Duration
// SnapshotCreate is how long for snapshot to create snapshotContent.
SnapshotCreate time.Duration
// SnapshotDelete is how long for snapshot to delete snapshotContent.
SnapshotDelete time.Duration
// SnapshotControllerMetrics is how long to wait for snapshot controller metrics.
SnapshotControllerMetrics time.Duration
// SystemPodsStartup is how long to wait for system pods to be running.
SystemPodsStartup time.Duration
// NodeSchedulable is how long to wait for all nodes to be schedulable.
NodeSchedulable time.Duration
// SystemDaemonsetStartup is how long to wait for all system daemonsets to be ready.
SystemDaemonsetStartup time.Duration
// NodeNotReady is how long to wait for a node to be not ready.
NodeNotReady time.Duration
}
// NewTimeoutContext returns a TimeoutContext with all values set either to
// hard-coded defaults or a value that was configured when running the E2E
// suite. Should be called after command line parsing.
func NewTimeoutContext() *TimeoutContext {
// Make a copy, otherwise the caller would have the ability to modify
// the original values.
copy := TestContext.timeouts
return &copy
}
// PollInterval defines how long to wait between API server queries while
// waiting for some condition.
//
// This value is the default for gomega.Eventually and gomega.Consistently.
func PollInterval() time.Duration {
return TestContext.timeouts.Poll
}
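// Example (sketch): individual assertions can combine these timeouts with
// gomega. The getPodPhase helper is illustrative.
//
//	timeouts := framework.NewTimeoutContext()
//	gomega.Eventually(ctx, getPodPhase).
//	    WithTimeout(timeouts.PodStart).
//	    WithPolling(framework.PollInterval()).
//	    Should(gomega.Equal(v1.PodRunning))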

e2e/vendor/k8s.io/kubernetes/test/e2e/framework/util.go generated vendored Normal file
View File

@ -0,0 +1,852 @@
/*
Copyright 2014 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"math/rand"
"net/url"
"os"
"os/exec"
"path"
"strconv"
"strings"
"sync"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
discoveryv1 "k8s.io/api/discovery/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/fields"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/uuid"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/apimachinery/pkg/watch"
"k8s.io/client-go/dynamic"
clientset "k8s.io/client-go/kubernetes"
restclient "k8s.io/client-go/rest"
"k8s.io/client-go/tools/cache"
"k8s.io/client-go/tools/clientcmd"
clientcmdapi "k8s.io/client-go/tools/clientcmd/api"
watchtools "k8s.io/client-go/tools/watch"
netutils "k8s.io/utils/net"
)
const (
// TODO(justinsb): Avoid hardcoding this.
awsMasterIP = "172.20.0.9"
)
// DEPRECATED constants. Use the timeouts in framework.Framework instead.
const (
// PodListTimeout is how long to wait for the pod to be listable.
PodListTimeout = time.Minute
// PodStartTimeout is how long to wait for the pod to be started.
PodStartTimeout = 5 * time.Minute
// PodStartShortTimeout is the same as `PodStartTimeout`, but shorter.
// Use it case by case when we are sure pod start will not be delayed
// by slow docker pulls or something else.
PodStartShortTimeout = 2 * time.Minute
// PodDeleteTimeout is how long to wait for a pod to be deleted.
PodDeleteTimeout = 5 * time.Minute
// PodGetTimeout is how long to wait for a pod to be retrieved.
PodGetTimeout = 2 * time.Minute
// PodEventTimeout is how long we wait for a pod event to occur.
PodEventTimeout = 2 * time.Minute
// ServiceStartTimeout is how long to wait for a service endpoint to be resolvable.
ServiceStartTimeout = 3 * time.Minute
// Poll is how often to Poll pods, nodes and claims.
Poll = 2 * time.Second
// PollShortTimeout is the short timeout value in polling.
PollShortTimeout = 1 * time.Minute
// ServiceAccountProvisionTimeout is how long to wait for a service account to be provisioned.
// service accounts are provisioned after namespace creation
// a service account is required to support pod creation in a namespace as part of admission control
ServiceAccountProvisionTimeout = 2 * time.Minute
// SingleCallTimeout is how long to try single API calls (like 'get' or 'list'). Used to prevent
// transient failures from failing tests.
SingleCallTimeout = 5 * time.Minute
// NodeReadyInitialTimeout is how long nodes have to be "ready" when a test begins. They should already
// be "ready" before the test starts, so this is small.
NodeReadyInitialTimeout = 20 * time.Second
// PodReadyBeforeTimeout is how long pods have to be "ready" when a test begins.
PodReadyBeforeTimeout = 5 * time.Minute
// ClaimProvisionShortTimeout is the same as `ClaimProvisionTimeout` to wait for a claim to be dynamically provisioned, but shorter.
// Use it case by case when we are sure this timeout is enough.
ClaimProvisionShortTimeout = 1 * time.Minute
// ClaimProvisionTimeout is how long claims have to become dynamically provisioned.
ClaimProvisionTimeout = 5 * time.Minute
// RestartNodeReadyAgainTimeout is how long a node is allowed to become "Ready" after it is restarted before
// the test is considered failed.
RestartNodeReadyAgainTimeout = 5 * time.Minute
// RestartPodReadyAgainTimeout is how long a pod is allowed to become "running" and "ready" after a node
// restart before test is considered failed.
RestartPodReadyAgainTimeout = 5 * time.Minute
// SnapshotCreateTimeout is how long for snapshot to create snapshotContent.
SnapshotCreateTimeout = 5 * time.Minute
// SnapshotDeleteTimeout is how long for snapshot to delete snapshotContent.
SnapshotDeleteTimeout = 5 * time.Minute
// ControlPlaneLabel is a valid label for kubeadm-based clusters like kops ONLY
ControlPlaneLabel = "node-role.kubernetes.io/control-plane"
)
var (
// ProvidersWithSSH are those providers where each node is accessible with SSH
ProvidersWithSSH = []string{"gce", "gke", "aws", "local", "azure"}
)
// RunID is a unique identifier of the e2e run.
// Beware that this ID is not the same for all tests in the e2e run, because each Ginkgo node creates it separately.
var RunID = uuid.NewUUID()
// CreateTestingNSFn is a func that is responsible for creating namespace used for executing e2e tests.
type CreateTestingNSFn func(ctx context.Context, baseName string, c clientset.Interface, labels map[string]string) (*v1.Namespace, error)
// APIAddress returns an address of the API server instance, derived from TestContext.Host.
func APIAddress() string {
instanceURL, err := url.Parse(TestContext.Host)
ExpectNoError(err)
return instanceURL.Hostname()
}
// ProviderIs returns true if the provider is included in the providers. Otherwise false.
func ProviderIs(providers ...string) bool {
for _, provider := range providers {
if strings.EqualFold(provider, TestContext.Provider) {
return true
}
}
return false
}
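// Example (sketch): gating provider-specific behaviour inside a test.
//
//	if ProviderIs("gce", "gke") {
//	    // exercise the cloud-specific code path
//	}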
// MasterOSDistroIs returns true if the master OS distro is included in the supportedMasterOsDistros. Otherwise false.
func MasterOSDistroIs(supportedMasterOsDistros ...string) bool {
for _, distro := range supportedMasterOsDistros {
if strings.EqualFold(distro, TestContext.MasterOSDistro) {
return true
}
}
return false
}
// NodeOSDistroIs returns true if the node OS distro is included in the supportedNodeOsDistros. Otherwise false.
func NodeOSDistroIs(supportedNodeOsDistros ...string) bool {
for _, distro := range supportedNodeOsDistros {
if strings.EqualFold(distro, TestContext.NodeOSDistro) {
return true
}
}
return false
}
// NodeOSArchIs returns true if the node OS arch is included in the supportedNodeOsArchs. Otherwise false.
func NodeOSArchIs(supportedNodeOsArchs ...string) bool {
for _, arch := range supportedNodeOsArchs {
if strings.EqualFold(arch, TestContext.NodeOSArch) {
return true
}
}
return false
}
// DeleteNamespaces deletes all namespaces that match the given delete and skip filters.
// Filter is by simple strings.Contains; first skip filter, then delete filter.
// Returns the list of deleted namespaces or an error.
func DeleteNamespaces(ctx context.Context, c clientset.Interface, deleteFilter, skipFilter []string) ([]string, error) {
ginkgo.By("Deleting namespaces")
nsList, err := c.CoreV1().Namespaces().List(ctx, metav1.ListOptions{})
ExpectNoError(err, "Failed to get namespace list")
var deleted []string
var wg sync.WaitGroup
OUTER:
for _, item := range nsList.Items {
for _, pattern := range skipFilter {
if strings.Contains(item.Name, pattern) {
continue OUTER
}
}
if deleteFilter != nil {
var shouldDelete bool
for _, pattern := range deleteFilter {
if strings.Contains(item.Name, pattern) {
shouldDelete = true
break
}
}
if !shouldDelete {
continue OUTER
}
}
wg.Add(1)
deleted = append(deleted, item.Name)
go func(nsName string) {
defer wg.Done()
defer ginkgo.GinkgoRecover()
gomega.Expect(c.CoreV1().Namespaces().Delete(ctx, nsName, metav1.DeleteOptions{})).To(gomega.Succeed())
Logf("namespace : %v api call to delete is complete ", nsName)
}(item.Name)
}
wg.Wait()
return deleted, nil
}
// WaitForNamespacesDeleted waits for the namespaces to be deleted.
func WaitForNamespacesDeleted(ctx context.Context, c clientset.Interface, namespaces []string, timeout time.Duration) error {
ginkgo.By(fmt.Sprintf("Waiting for namespaces %+v to vanish", namespaces))
nsMap := map[string]bool{}
for _, ns := range namespaces {
nsMap[ns] = true
}
// Now POLL until all namespaces have been eradicated.
return wait.PollUntilContextTimeout(ctx, 2*time.Second, timeout, false,
func(ctx context.Context) (bool, error) {
nsList, err := c.CoreV1().Namespaces().List(ctx, metav1.ListOptions{})
if err != nil {
return false, err
}
for _, item := range nsList.Items {
if _, ok := nsMap[item.Name]; ok {
return false, nil
}
}
return true, nil
})
}
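// Example (sketch): cleaning up leftover e2e namespaces and waiting for them
// to disappear. The "e2e-" delete filter and the client variable are
// illustrative.
//
//	deleted, err := DeleteNamespaces(ctx, client, []string{"e2e-"}, nil /* skipFilter */)
//	ExpectNoError(err, "deleting leftover namespaces")
//	ExpectNoError(WaitForNamespacesDeleted(ctx, client, deleted, 5*time.Minute))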
func waitForConfigMapInNamespace(ctx context.Context, c clientset.Interface, ns, name string, timeout time.Duration) error {
fieldSelector := fields.OneTermEqualSelector("metadata.name", name).String()
ctx, cancel := watchtools.ContextWithOptionalTimeout(ctx, timeout)
defer cancel()
lw := &cache.ListWatch{
ListFunc: func(options metav1.ListOptions) (object runtime.Object, e error) {
options.FieldSelector = fieldSelector
return c.CoreV1().ConfigMaps(ns).List(ctx, options)
},
WatchFunc: func(options metav1.ListOptions) (i watch.Interface, e error) {
options.FieldSelector = fieldSelector
return c.CoreV1().ConfigMaps(ns).Watch(ctx, options)
},
}
_, err := watchtools.UntilWithSync(ctx, lw, &v1.ConfigMap{}, nil, func(event watch.Event) (bool, error) {
switch event.Type {
case watch.Deleted:
return false, apierrors.NewNotFound(schema.GroupResource{Resource: "configmaps"}, name)
case watch.Added, watch.Modified:
return true, nil
}
return false, nil
})
return err
}
func waitForServiceAccountInNamespace(ctx context.Context, c clientset.Interface, ns, serviceAccountName string, timeout time.Duration) error {
fieldSelector := fields.OneTermEqualSelector("metadata.name", serviceAccountName).String()
ctx, cancel := watchtools.ContextWithOptionalTimeout(ctx, timeout)
defer cancel()
lw := &cache.ListWatch{
ListFunc: func(options metav1.ListOptions) (object runtime.Object, e error) {
options.FieldSelector = fieldSelector
return c.CoreV1().ServiceAccounts(ns).List(ctx, options)
},
WatchFunc: func(options metav1.ListOptions) (i watch.Interface, e error) {
options.FieldSelector = fieldSelector
return c.CoreV1().ServiceAccounts(ns).Watch(ctx, options)
},
}
_, err := watchtools.UntilWithSync(ctx, lw, &v1.ServiceAccount{}, nil, func(event watch.Event) (bool, error) {
switch event.Type {
case watch.Deleted:
return false, apierrors.NewNotFound(schema.GroupResource{Resource: "serviceaccounts"}, serviceAccountName)
case watch.Added, watch.Modified:
return true, nil
}
return false, nil
})
if err != nil {
return fmt.Errorf("wait for service account %q in namespace %q: %w", serviceAccountName, ns, err)
}
return nil
}
// WaitForDefaultServiceAccountInNamespace waits for the default service account to be provisioned
// the default service account is what is associated with pods when they do not specify a service account
// as a result, pods are not able to be provisioned in a namespace until the service account is provisioned
func WaitForDefaultServiceAccountInNamespace(ctx context.Context, c clientset.Interface, namespace string) error {
return waitForServiceAccountInNamespace(ctx, c, namespace, defaultServiceAccountName, ServiceAccountProvisionTimeout)
}
// WaitForKubeRootCAInNamespace waits for the configmap kube-root-ca.crt containing the service account
// CA trust bundle to be provisioned in the specified namespace so that pods do not have to retry mounting
// the config map (which creates noise that hides other issues in the Kubelet).
func WaitForKubeRootCAInNamespace(ctx context.Context, c clientset.Interface, namespace string) error {
return waitForConfigMapInNamespace(ctx, c, namespace, "kube-root-ca.crt", ServiceAccountProvisionTimeout)
}
// CreateTestingNS should be used by every test, note that we append a common prefix to the provided test name.
// Please see NewFramework instead of using this directly.
func CreateTestingNS(ctx context.Context, baseName string, c clientset.Interface, labels map[string]string) (*v1.Namespace, error) {
if labels == nil {
labels = map[string]string{}
}
labels["e2e-run"] = string(RunID)
// We don't use ObjectMeta.GenerateName feature, as in case of API call
// failure we don't know whether the namespace was created and what is its
// name.
name := fmt.Sprintf("%v-%v", baseName, RandomSuffix())
namespaceObj := &v1.Namespace{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Namespace: "",
Labels: labels,
},
Status: v1.NamespaceStatus{},
}
// Be robust about making the namespace creation call.
var got *v1.Namespace
if err := wait.PollUntilContextTimeout(ctx, Poll, 30*time.Second, true, func(ctx context.Context) (bool, error) {
var err error
got, err = c.CoreV1().Namespaces().Create(ctx, namespaceObj, metav1.CreateOptions{})
if err != nil {
if apierrors.IsAlreadyExists(err) {
// regenerate on conflict
Logf("Namespace name %q was already taken, generate a new name and retry", namespaceObj.Name)
namespaceObj.Name = fmt.Sprintf("%v-%v", baseName, RandomSuffix())
} else {
Logf("Unexpected error while creating namespace: %v", err)
}
return false, nil
}
return true, nil
}); err != nil {
return nil, err
}
if TestContext.VerifyServiceAccount {
if err := WaitForDefaultServiceAccountInNamespace(ctx, c, got.Name); err != nil {
// Even if we fail to create the service account in the namespace,
// we have successfully created the namespace itself.
// So, return the created namespace.
return got, err
}
}
return got, nil
}
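// Illustrative sketch (not part of the upstream framework): a test that does not go
// through NewFramework could create its own namespace like this, assuming a Ginkgo
// "ctx" and a clientset "c":
//
//	ns, err := framework.CreateTestingNS(ctx, "demo", c, map[string]string{"purpose": "example"})
//	framework.ExpectNoError(err, "creating test namespace")
//	framework.Logf("running in namespace %q", ns.Name)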
// CheckTestingNSDeletedExcept checks whether all existing e2e-based namespaces are in the Terminating state
// and waits until they are finally deleted. The namespace named skip is ignored.
func CheckTestingNSDeletedExcept(ctx context.Context, c clientset.Interface, skip string) error {
// TODO: Since we don't have support for bulk resource deletion in the API,
// while deleting a namespace we are deleting all objects from that namespace
// one by one (one deletion == one API call). This basically exposes us to
// throttling - currently controller-manager has a limit of max 20 QPS.
// Once #10217 is implemented and used in namespace-controller, deleting all
// objects from a given namespace should be much faster and we will be able
// to lower this timeout.
// However, now Density test is producing ~26000 events and Load capacity test
// is producing ~35000 events, thus assuming there are no other requests it will
// take ~30 minutes to fully delete the namespace. Thus I'm setting it to 60
// minutes to avoid any timeouts here.
timeout := 60 * time.Minute
Logf("Waiting for terminating namespaces to be deleted...")
for start := time.Now(); time.Since(start) < timeout; time.Sleep(15 * time.Second) {
namespaces, err := c.CoreV1().Namespaces().List(ctx, metav1.ListOptions{})
if err != nil {
Logf("Listing namespaces failed: %v", err)
continue
}
terminating := 0
for _, ns := range namespaces.Items {
if strings.HasPrefix(ns.ObjectMeta.Name, "e2e-tests-") && ns.ObjectMeta.Name != skip {
if ns.Status.Phase == v1.NamespaceActive {
return fmt.Errorf("Namespace %s is active", ns.ObjectMeta.Name)
}
terminating++
}
}
if terminating == 0 {
return nil
}
}
return fmt.Errorf("Waiting for terminating namespaces to be deleted timed out")
}
// WaitForServiceEndpointsNum waits until the number of endpoints that implement the service equals expectNum.
// Some components use EndpointSlices, others Endpoints; we must verify that both objects meet the requirements.
func WaitForServiceEndpointsNum(ctx context.Context, c clientset.Interface, namespace, serviceName string, expectNum int, interval, timeout time.Duration) error {
return wait.PollUntilContextTimeout(ctx, interval, timeout, false, func(ctx context.Context) (bool, error) {
Logf("Waiting for amount of service:%s endpoints to be %d", serviceName, expectNum)
endpoint, err := c.CoreV1().Endpoints(namespace).Get(ctx, serviceName, metav1.GetOptions{})
if err != nil {
Logf("Unexpected error trying to get Endpoints for %s : %v", serviceName, err)
return false, nil
}
if countEndpointsNum(endpoint) != expectNum {
Logf("Unexpected number of Endpoints, got %d, expected %d", countEndpointsNum(endpoint), expectNum)
return false, nil
}
// Endpoints are single family but EndpointSlices can have dual stack addresses,
// so we verify the number of addresses that match the same family on both.
addressType := discoveryv1.AddressTypeIPv4
if isIPv6Endpoint(endpoint) {
addressType = discoveryv1.AddressTypeIPv6
}
esList, err := c.DiscoveryV1().EndpointSlices(namespace).List(ctx, metav1.ListOptions{LabelSelector: fmt.Sprintf("%s=%s", discoveryv1.LabelServiceName, serviceName)})
if err != nil {
Logf("Unexpected error trying to get EndpointSlices for %s : %v", serviceName, err)
return false, nil
}
if len(esList.Items) == 0 {
Logf("Waiting for at least 1 EndpointSlice to exist")
return false, nil
}
if countEndpointsSlicesNum(esList, addressType) != expectNum {
Logf("Unexpected number of Endpoints on Slices, got %d, expected %d", countEndpointsSlicesNum(esList, addressType), expectNum)
return false, nil
}
return true, nil
})
}
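// Illustrative sketch (hypothetical service name and counts): wait until a service
// named "demo-svc" in the test namespace has exactly 2 ready endpoints, polling every
// framework.Poll interval for up to two minutes:
//
//	err := framework.WaitForServiceEndpointsNum(ctx, c, ns.Name, "demo-svc", 2, framework.Poll, 2*time.Minute)
//	framework.ExpectNoError(err, "service demo-svc never reached 2 endpoints")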
func countEndpointsNum(e *v1.Endpoints) int {
num := 0
for _, sub := range e.Subsets {
num += len(sub.Addresses)
}
return num
}
// isIPv6Endpoint returns true if the Endpoint uses IPv6 addresses
func isIPv6Endpoint(e *v1.Endpoints) bool {
for _, sub := range e.Subsets {
for _, addr := range sub.Addresses {
if len(addr.IP) == 0 {
continue
}
// Endpoints are single family, so it is enough to check only one address
return netutils.IsIPv6String(addr.IP)
}
}
// default to IPv4 for an Endpoint without IP addresses
return false
}
func countEndpointsSlicesNum(epList *discoveryv1.EndpointSliceList, addressType discoveryv1.AddressType) int {
// EndpointSlices can contain the same address on multiple Slices
addresses := sets.Set[string]{}
for _, epSlice := range epList.Items {
if epSlice.AddressType != addressType {
continue
}
for _, ep := range epSlice.Endpoints {
if len(ep.Addresses) > 0 {
addresses.Insert(ep.Addresses[0])
}
}
}
return addresses.Len()
}
// restclientConfig returns a config that holds the information needed to build a connection to Kubernetes clusters.
func restclientConfig(kubeContext string) (*clientcmdapi.Config, error) {
Logf(">>> kubeConfig: %s", TestContext.KubeConfig)
if TestContext.KubeConfig == "" {
return nil, fmt.Errorf("KubeConfig must be specified to load client config")
}
c, err := clientcmd.LoadFromFile(TestContext.KubeConfig)
if err != nil {
return nil, fmt.Errorf("error loading KubeConfig: %v", err.Error())
}
if kubeContext != "" {
Logf(">>> kubeContext: %s", kubeContext)
c.CurrentContext = kubeContext
}
return c, nil
}
// ClientConfigGetter is a func that returns a config for a REST client.
type ClientConfigGetter func() (*restclient.Config, error)
// LoadConfig returns a config for a rest client with the UserAgent set to include the current test name.
func LoadConfig() (config *restclient.Config, err error) {
defer func() {
if err == nil && config != nil {
testDesc := ginkgo.CurrentSpecReport()
if len(testDesc.ContainerHierarchyTexts) > 0 {
testName := strings.Join(testDesc.ContainerHierarchyTexts, " ")
if len(testDesc.LeafNodeText) > 0 {
testName = testName + " " + testDesc.LeafNodeText
}
config.UserAgent = fmt.Sprintf("%s -- %s", restclient.DefaultKubernetesUserAgent(), testName)
}
}
}()
if TestContext.NodeE2E {
// This is a node e2e test, apply the node e2e configuration
return &restclient.Config{
Host: TestContext.Host,
BearerToken: TestContext.BearerToken,
TLSClientConfig: restclient.TLSClientConfig{
Insecure: true,
},
}, nil
}
c, err := restclientConfig(TestContext.KubeContext)
if err != nil {
if TestContext.KubeConfig == "" {
return restclient.InClusterConfig()
}
return nil, err
}
// If Host is not set in TestContext, set it to the CurrentContext's
// Server so the k8s API client knows what to connect to.
if TestContext.Host == "" && c.Clusters != nil {
currentContext, ok := c.Clusters[c.CurrentContext]
if ok {
TestContext.Host = currentContext.Server
}
}
return clientcmd.NewDefaultClientConfig(*c, &clientcmd.ConfigOverrides{ClusterInfo: clientcmdapi.Cluster{Server: TestContext.Host}}).ClientConfig()
}
// LoadClientset returns clientset for connecting to kubernetes clusters.
func LoadClientset() (*clientset.Clientset, error) {
config, err := LoadConfig()
if err != nil {
return nil, fmt.Errorf("error creating client: %v", err.Error())
}
return clientset.NewForConfig(config)
}
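// Illustrative sketch: outside of the Framework object, the typical flow is to load
// the clientset once and reuse it (assumes a Ginkgo "ctx"):
//
//	cs, err := framework.LoadClientset()
//	framework.ExpectNoError(err, "loading clientset")
//	nodes, err := cs.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
//	framework.ExpectNoError(err, "listing nodes")
//	framework.Logf("cluster has %d nodes", len(nodes.Items))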
// RandomSuffix provides a random sequence to append to pods, services, and rcs.
func RandomSuffix() string {
return strconv.Itoa(rand.Intn(10000))
}
// StartCmdAndStreamOutput returns stdout and stderr after starting the given cmd.
func StartCmdAndStreamOutput(cmd *exec.Cmd) (stdout, stderr io.ReadCloser, err error) {
stdout, err = cmd.StdoutPipe()
if err != nil {
return
}
stderr, err = cmd.StderrPipe()
if err != nil {
return
}
// cmd.Args contains the command itself as the 0th argument, so it's sufficient to
// print the 1st and later arguments
Logf("Asynchronously running '%s %s'", cmd.Path, strings.Join(cmd.Args[1:], " "))
err = cmd.Start()
return
}
// TryKill is a rough equivalent of ctrl+c for cleaning up processes. Intended to be run in defer.
func TryKill(cmd *exec.Cmd) {
if err := cmd.Process.Kill(); err != nil {
Logf("ERROR failed to kill command %v! The process may leak", cmd)
}
}
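// Illustrative sketch (hypothetical command): StartCmdAndStreamOutput and TryKill are
// usually paired so that a long-running helper process is cleaned up on test exit:
//
//	cmd := exec.Command("kubectl", "get", "pods", "--watch")
//	stdout, stderr, err := framework.StartCmdAndStreamOutput(cmd)
//	framework.ExpectNoError(err, "starting kubectl watch")
//	defer framework.TryKill(cmd)
//	go func() { _, _ = io.Copy(os.Stdout, stdout) }()
//	go func() { _, _ = io.Copy(os.Stderr, stderr) }()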
// EnsureLoadBalancerResourcesDeleted ensures that cloud load balancer resources that were created
// are actually cleaned up. Currently only implemented for GCE/GKE.
func EnsureLoadBalancerResourcesDeleted(ctx context.Context, ip, portRange string) error {
return TestContext.CloudConfig.Provider.EnsureLoadBalancerResourcesDeleted(ctx, ip, portRange)
}
// CoreDump SSHs to the master and all nodes and dumps their logs into dir.
// It shells out to cluster/log-dump/log-dump.sh to accomplish this.
func CoreDump(dir string) {
if TestContext.DisableLogDump {
Logf("Skipping dumping logs from cluster")
return
}
var cmd *exec.Cmd
if TestContext.LogexporterGCSPath != "" {
Logf("Dumping logs from nodes to GCS directly at path: %s", TestContext.LogexporterGCSPath)
cmd = exec.Command(path.Join(TestContext.RepoRoot, "cluster", "log-dump", "log-dump.sh"), dir, TestContext.LogexporterGCSPath)
} else {
Logf("Dumping logs locally to: %s", dir)
cmd = exec.Command(path.Join(TestContext.RepoRoot, "cluster", "log-dump", "log-dump.sh"), dir)
}
env := os.Environ()
env = append(env, fmt.Sprintf("LOG_DUMP_SYSTEMD_SERVICES=%s", parseSystemdServices(TestContext.SystemdServices)))
env = append(env, fmt.Sprintf("LOG_DUMP_SYSTEMD_JOURNAL=%v", TestContext.DumpSystemdJournal))
cmd.Env = env
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
if err := cmd.Run(); err != nil {
Logf("Error running cluster/log-dump/log-dump.sh: %v", err)
}
}
// parseSystemdServices converts services separator from comma to space.
func parseSystemdServices(services string) string {
return strings.TrimSpace(strings.Replace(services, ",", " ", -1))
}
// RunCmd runs cmd using args and returns its stdout and stderr. It also outputs
// cmd's stdout and stderr to their respective OS streams.
func RunCmd(command string, args ...string) (string, string, error) {
return RunCmdEnv(nil, command, args...)
}
// RunCmdEnv runs cmd with the provided environment and args and
// returns its stdout and stderr. It also outputs cmd's stdout and
// stderr to their respective OS streams.
func RunCmdEnv(env []string, command string, args ...string) (string, string, error) {
Logf("Running %s %v", command, args)
var bout, berr bytes.Buffer
cmd := exec.Command(command, args...)
// We also output to the OS stdout/stderr to aid in debugging in case cmd
// hangs and never returns before the test gets killed.
//
// This creates some ugly output because gcloud doesn't always provide
// newlines.
cmd.Stdout = io.MultiWriter(os.Stdout, &bout)
cmd.Stderr = io.MultiWriter(os.Stderr, &berr)
cmd.Env = env
err := cmd.Run()
stdout, stderr := bout.String(), berr.String()
if err != nil {
return "", "", fmt.Errorf("error running %s %v; got error %v, stdout %q, stderr %q",
command, args, err, stdout, stderr)
}
return stdout, stderr, nil
}
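// Illustrative sketch (hypothetical binary and flags): RunCmd is convenient for
// short-lived helper invocations where the captured output is inspected afterwards:
//
//	stdout, stderr, err := framework.RunCmd("kubectl", "version", "--client")
//	framework.ExpectNoError(err, "running kubectl version, stderr: %q", stderr)
//	framework.Logf("kubectl reports: %s", stdout)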
// GetNodeExternalIPs returns a list of external IP addresses (if any) for a node.
func GetNodeExternalIPs(node *v1.Node) (ips []string) {
for j := range node.Status.Addresses {
nodeAddress := &node.Status.Addresses[j]
if nodeAddress.Type == v1.NodeExternalIP && nodeAddress.Address != "" {
ips = append(ips, nodeAddress.Address)
}
}
return
}
// getControlPlaneAddresses returns the externalIP, internalIP and hostname fields of control plane nodes.
// If any of these is unavailable, empty slices are returned.
func getControlPlaneAddresses(ctx context.Context, c clientset.Interface) ([]string, []string, []string) {
var externalIPs, internalIPs, hostnames []string
// Populate the internal IPs.
eps, err := c.CoreV1().Endpoints(metav1.NamespaceDefault).Get(ctx, "kubernetes", metav1.GetOptions{})
if err != nil {
Failf("Failed to get kubernetes endpoints: %v", err)
}
for _, subset := range eps.Subsets {
for _, address := range subset.Addresses {
if address.IP != "" {
internalIPs = append(internalIPs, address.IP)
}
}
}
// Populate the external IP/hostname.
hostURL, err := url.Parse(TestContext.Host)
if err != nil {
Failf("Failed to parse hostname: %v", err)
}
if netutils.ParseIPSloppy(hostURL.Host) != nil {
externalIPs = append(externalIPs, hostURL.Host)
} else {
hostnames = append(hostnames, hostURL.Host)
}
return externalIPs, internalIPs, hostnames
}
// GetControlPlaneNodes returns a list of control plane nodes
func GetControlPlaneNodes(ctx context.Context, c clientset.Interface) *v1.NodeList {
allNodes, err := c.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
ExpectNoError(err, "error reading all nodes")
var cpNodes v1.NodeList
for _, node := range allNodes.Items {
// Check for the control plane label
if _, hasLabel := node.Labels[ControlPlaneLabel]; hasLabel {
cpNodes.Items = append(cpNodes.Items, node)
continue
}
// Check for the specific taint
for _, taint := range node.Spec.Taints {
// NOTE the taint key is the same as the control plane label
if taint.Key == ControlPlaneLabel && taint.Effect == v1.TaintEffectNoSchedule {
cpNodes.Items = append(cpNodes.Items, node)
continue
}
}
}
return &cpNodes
}
// GetControlPlaneAddresses returns all IP addresses on which the kubelet can reach the control plane.
// It may return internal and external IPs, even if we expect only
// e.g. internal IPs to be used (issue #56787), so that we can be
// sure to block the control plane fully during tests.
func GetControlPlaneAddresses(ctx context.Context, c clientset.Interface) []string {
externalIPs, internalIPs, _ := getControlPlaneAddresses(ctx, c)
ips := sets.NewString()
switch TestContext.Provider {
case "gce", "gke":
for _, ip := range externalIPs {
ips.Insert(ip)
}
for _, ip := range internalIPs {
ips.Insert(ip)
}
case "aws":
ips.Insert(awsMasterIP)
default:
Failf("This test is not supported for provider %s and should be disabled", TestContext.Provider)
}
return ips.List()
}
// PrettyPrintJSON converts metrics to JSON format.
func PrettyPrintJSON(metrics interface{}) string {
output := &bytes.Buffer{}
if err := json.NewEncoder(output).Encode(metrics); err != nil {
Logf("Error building encoder: %v", err)
return ""
}
formatted := &bytes.Buffer{}
if err := json.Indent(formatted, output.Bytes(), "", " "); err != nil {
Logf("Error indenting: %v", err)
return ""
}
return formatted.String()
}
// WatchEventSequenceVerifier manages a watch for a given resource; it ensures that
// events take place in a given order and retries the test on failure.
//
// ctx cancellation signal across API boundaries, e.g. a context from Ginkgo
// dc sets up a client to the API
// resourceType specifies the type of resource
// namespace selects a namespace
// resourceName the name of the given resource
// listOptions options used to find the resource, recommended to use listOptions.labelSelector
// expectedWatchEvents array of events which are expected to occur
// scenario the test itself
// retryCleanup a function to run which ensures that there are no dangling resources upon test failure
//
// This tooling relies on the test to return the events as they occur. The entire
// scenario must be run to ensure that the desired watch events arrive in order
// (allowing for interweaving of watch events).
//
// If an expected watch event is missing, we elect to clean up and run the entire
// scenario again.
//
// We try the scenario three times to allow the sequencing to fail a couple of times.
func WatchEventSequenceVerifier(ctx context.Context, dc dynamic.Interface, resourceType schema.GroupVersionResource, namespace string, resourceName string, listOptions metav1.ListOptions, expectedWatchEvents []watch.Event, scenario func(*watchtools.RetryWatcher) []watch.Event, retryCleanup func() error) {
listWatcher := &cache.ListWatch{
WatchFunc: func(listOptions metav1.ListOptions) (watch.Interface, error) {
return dc.Resource(resourceType).Namespace(namespace).Watch(ctx, listOptions)
},
}
retries := 3
retriesLoop:
for try := 1; try <= retries; try++ {
initResource, err := dc.Resource(resourceType).Namespace(namespace).List(ctx, listOptions)
ExpectNoError(err, "Failed to fetch initial resource")
resourceWatch, err := watchtools.NewRetryWatcher(initResource.GetResourceVersion(), listWatcher)
ExpectNoError(err, "Failed to create a resource watch of %v in namespace %v", resourceType.Resource, namespace)
// NOTE the test may need access to the events to see what's going on, such as a change in status
actualWatchEvents := scenario(resourceWatch)
errs := sets.NewString()
gomega.Expect(len(expectedWatchEvents)).To(gomega.BeNumerically("<=", len(actualWatchEvents)), "Did not get enough watch events")
totalValidWatchEvents := 0
foundEventIndexes := map[int]*int{}
for watchEventIndex, expectedWatchEvent := range expectedWatchEvents {
foundExpectedWatchEvent := false
actualWatchEventsLoop:
for actualWatchEventIndex, actualWatchEvent := range actualWatchEvents {
if foundEventIndexes[actualWatchEventIndex] != nil {
continue actualWatchEventsLoop
}
if actualWatchEvent.Type == expectedWatchEvent.Type {
foundExpectedWatchEvent = true
foundEventIndexes[actualWatchEventIndex] = &watchEventIndex
break actualWatchEventsLoop
}
}
if !foundExpectedWatchEvent {
errs.Insert(fmt.Sprintf("Watch event %v not found", expectedWatchEvent.Type))
} else {
// Only count events that were actually matched against an expected event.
totalValidWatchEvents++
}
}
err = retryCleanup()
ExpectNoError(err, "Error occurred when cleaning up resources")
if errs.Len() > 0 && try < retries {
fmt.Println("invariants violated:\n", strings.Join(errs.List(), "\n - "))
continue retriesLoop
}
if errs.Len() > 0 {
Failf("Unexpected error(s): %v", strings.Join(errs.List(), "\n - "))
}
gomega.Expect(expectedWatchEvents).To(gomega.HaveLen(totalValidWatchEvents), "Error: there must be an equal amount of total valid watch events (%d) and expected watch events (%d)", totalValidWatchEvents, len(expectedWatchEvents))
break retriesLoop
}
}

View File

@ -0,0 +1,9 @@
# This E2E framework sub-package is currently allowed to use arbitrary
# dependencies, therefore we need to override the restrictions from
# the parent .import-restrictions file.
#
# At some point it may become useful to also check this package's
# dependencies more carefully.
rules:
- selectorRegexp: ""
allowedPrefixes: [ "" ]

View File

@ -0,0 +1,9 @@
# See the OWNERS docs at https://go.k8s.io/owners
approvers:
- sig-storage-approvers
- jingxu97
reviewers:
- sig-storage-reviewers
emeritus_approvers:
- rootfs

View File

@ -0,0 +1,728 @@
/*
Copyright 2017 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
/*
* This test checks that various VolumeSources are working.
*
* There are two ways to test the volumes:
* 1) With a containerized server (NFS, Ceph, iSCSI, ...)
* The test creates a server pod, exporting a simple 'index.html' file.
* Then it uses an appropriate VolumeSource to import this file into a client pod
* and checks that the pod can see the file. It does so by importing the file
* into the web server root and loading the index.html from it.
*
* These tests work only when privileged containers are allowed; exporting
* various filesystems (e.g. NFS) usually needs some mounting or
* other privileged magic in the server pod.
*
* Note that the server containers are for testing purposes only and should not
* be used in production.
*
* 2) With a server outside of Kubernetes
* An appropriate server must exist somewhere outside
* the tested Kubernetes cluster. The test itself creates a new volume
* and checks that Kubernetes can use it as a volume.
*/
package volume
import (
"context"
"crypto/sha256"
"fmt"
"path/filepath"
"strconv"
"strings"
"time"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
clientexec "k8s.io/client-go/util/exec"
"k8s.io/kubernetes/test/e2e/framework"
e2ekubectl "k8s.io/kubernetes/test/e2e/framework/kubectl"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
e2epodoutput "k8s.io/kubernetes/test/e2e/framework/pod/output"
imageutils "k8s.io/kubernetes/test/utils/image"
admissionapi "k8s.io/pod-security-admission/api"
uexec "k8s.io/utils/exec"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
)
const (
// Kb is byte size of kilobyte
Kb int64 = 1000
// Mb is byte size of megabyte
Mb int64 = 1000 * Kb
// Gb is byte size of gigabyte
Gb int64 = 1000 * Mb
// Tb is byte size of terabyte
Tb int64 = 1000 * Gb
// KiB is byte size of kibibyte
KiB int64 = 1024
// MiB is byte size of mebibyte
MiB int64 = 1024 * KiB
// GiB is byte size of gibibyte
GiB int64 = 1024 * MiB
// TiB is byte size of tebibyte
TiB int64 = 1024 * GiB
// VolumeServerPodStartupTimeout is a waiting period for volume server (Ceph, ...) to initialize itself.
VolumeServerPodStartupTimeout = 3 * time.Minute
// PodCleanupTimeout is a waiting period for pod to be cleaned up and unmount its volumes so we
// don't tear down containers with NFS/Ceph server too early.
PodCleanupTimeout = 20 * time.Second
)
// SizeRange encapsulates a range of sizes specified as minimum and maximum quantity strings.
// Both values are optional.
// If a size is not set, it is assumed that there is no limitation: a very small size (e.g. 1Ki)
// may be used as Min and a considerably big size (e.g. 10Ei) as Max, which makes it possible to
// calculate the intersection of the given intervals (if it exists).
type SizeRange struct {
// Max quantity specified as a string including units. E.g "3Gi".
// If the Max size is unset, it will be assigned a default valid maximum size of 10Ei,
// which is defined in test/e2e/storage/testsuites/base.go
Max string
// Min quantity specified as a string including units. E.g "1Gi"
// If the Min size is unset, it will be assigned a default valid minimum size of 1Ki,
// which is defined in test/e2e/storage/testsuites/base.go
Min string
}
// TestConfig is a struct for configuration of one test. The test consists of:
// - server pod - runs serverImage, exports ports[]
// - client pod - does not need any special configuration
type TestConfig struct {
Namespace string
// Prefix of all pods. Typically the test name.
Prefix string
// Name of container image for the server pod.
ServerImage string
// Ports to export from the server pod. TCP only.
ServerPorts []int
// Commands to run in the container image.
ServerCmds []string
// Arguments to pass to the container image.
ServerArgs []string
// Volumes needed to be mounted to the server container from the host
// map <host (source) path> -> <container (dst.) path>
// if <host (source) path> is empty, mount a tmpfs emptydir
ServerVolumes map[string]string
// Message to wait for before starting clients
ServerReadyMessage string
// Use HostNetwork for the server
ServerHostNetwork bool
// Wait for the pod to terminate successfully
// False indicates that the pod is long running
WaitForCompletion bool
// ClientNodeSelection restricts where the client pod runs. Default is any node.
ClientNodeSelection e2epod.NodeSelection
}
// Test contains a volume to mount into a client pod and its
// expected content.
type Test struct {
Volume v1.VolumeSource
Mode v1.PersistentVolumeMode
// Name of file to read/write in FileSystem mode
File string
ExpectedContent string
}
// NewNFSServer is an NFS-specific wrapper for CreateStorageServer.
func NewNFSServer(ctx context.Context, cs clientset.Interface, namespace string, args []string) (config TestConfig, pod *v1.Pod, host string) {
return NewNFSServerWithNodeName(ctx, cs, namespace, args, "")
}
// NewNFSServerWithNodeName is an NFS-specific wrapper for CreateStorageServer that pins the
// server pod to the given node when nodeName is not empty.
func NewNFSServerWithNodeName(ctx context.Context, cs clientset.Interface, namespace string, args []string, nodeName string) (config TestConfig, pod *v1.Pod, host string) {
config = TestConfig{
Namespace: namespace,
Prefix: "nfs",
ServerImage: imageutils.GetE2EImage(imageutils.VolumeNFSServer),
ServerPorts: []int{2049},
ServerVolumes: map[string]string{"": "/exports"},
ServerReadyMessage: "NFS started",
}
if nodeName != "" {
config.ClientNodeSelection = e2epod.NodeSelection{Name: nodeName}
}
if len(args) > 0 {
config.ServerArgs = args
}
pod, host = CreateStorageServer(ctx, cs, config)
if strings.Contains(host, ":") {
host = "[" + host + "]"
}
return config, pod, host
}
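// Illustrative sketch (hypothetical names and content): a typical caller brings up
// the NFS server pod, injects a file, and then verifies that a client pod can read it:
//
//	config, _, host := volume.NewNFSServer(ctx, cs, ns.Name, []string{})
//	tests := []volume.Test{{
//		Volume:          v1.VolumeSource{NFS: &v1.NFSVolumeSource{Server: host, Path: "/", ReadOnly: false}},
//		File:            "index.html",
//		ExpectedContent: "Hello from NFS!",
//	}}
//	volume.InjectContent(ctx, f, config, nil, "", tests)
//	volume.TestVolumeClient(ctx, f, config, nil, "", tests)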
// RestartNFSServer restarts the passed-in nfs-server by issuing a `rpc.nfsd 1` command in the
// pod's (only) container. This command changes the number of nfs server threads from
// (presumably) zero back to 1, and therefore allows nfs to open connections again.
func RestartNFSServer(f *framework.Framework, serverPod *v1.Pod) {
const startcmd = "rpc.nfsd 1"
_, _, err := PodExec(f, serverPod, startcmd)
framework.ExpectNoError(err)
}
// StopNFSServer stops the passed-in nfs-server by issuing a `rpc.nfsd 0` command in the
// pod's (only) container. This command changes the number of nfs server threads to 0,
// thus closing all open nfs connections.
func StopNFSServer(f *framework.Framework, serverPod *v1.Pod) {
const stopcmd = "rpc.nfsd 0 && for i in $(seq 200); do rpcinfo -p | grep -q nfs || break; sleep 1; done"
_, _, err := PodExec(f, serverPod, stopcmd)
framework.ExpectNoError(err)
}
// CreateStorageServer is a wrapper for startVolumeServer(). A storage server config is passed in, and a pod pointer
// and ip address string are returned.
// Note: Expect() is called so no error is returned.
func CreateStorageServer(ctx context.Context, cs clientset.Interface, config TestConfig) (pod *v1.Pod, ip string) {
pod = startVolumeServer(ctx, cs, config)
gomega.Expect(pod).NotTo(gomega.BeNil(), "storage server pod should not be nil")
ip = pod.Status.PodIP
gomega.Expect(ip).NotTo(gomega.BeEmpty(), fmt.Sprintf("pod %s's IP should not be empty", pod.Name))
framework.Logf("%s server pod IP address: %s", config.Prefix, ip)
return pod, ip
}
// GetVolumeAttachmentName returns the name of the VolumeAttachment for the PV that is bound to the
// PVC with the passed-in claimName and claimNamespace. The name is a hash of the volume handle,
// the provisioner and the node name (taken from config.ClientNodeSelection, or looked up if unset).
func GetVolumeAttachmentName(ctx context.Context, cs clientset.Interface, config TestConfig, provisioner string, claimName string, claimNamespace string) string {
var nodeName string
// For provisioning tests, ClientNodeSelection is not set so we do not know the NodeName of the VolumeAttachment of the PV that is
// bound to the PVC with the passed in claimName and claimNamespace. We need this NodeName because it is used to generate the
// attachmentName that is returned, and used to look up a certain VolumeAttachment in WaitForVolumeAttachmentTerminated.
// To get the nodeName of the VolumeAttachment, we get all the VolumeAttachments, look for the VolumeAttachment with a
// PersistentVolumeName equal to the PV that is bound to the passed in PVC, and then we get the NodeName from that VolumeAttachment.
if config.ClientNodeSelection.Name == "" {
claim, _ := cs.CoreV1().PersistentVolumeClaims(claimNamespace).Get(ctx, claimName, metav1.GetOptions{})
pvName := claim.Spec.VolumeName
volumeAttachments, _ := cs.StorageV1().VolumeAttachments().List(ctx, metav1.ListOptions{})
for _, volumeAttachment := range volumeAttachments.Items {
if *volumeAttachment.Spec.Source.PersistentVolumeName == pvName {
nodeName = volumeAttachment.Spec.NodeName
break
}
}
} else {
nodeName = config.ClientNodeSelection.Name
}
handle := getVolumeHandle(ctx, cs, claimName, claimNamespace)
attachmentHash := sha256.Sum256([]byte(fmt.Sprintf("%s%s%s", handle, provisioner, nodeName)))
return fmt.Sprintf("csi-%x", attachmentHash)
}
// getVolumeHandle returns the VolumeHandle of the PV that is bound to the PVC with the passed in claimName and claimNamespace.
func getVolumeHandle(ctx context.Context, cs clientset.Interface, claimName string, claimNamespace string) string {
// re-get the claim to the latest state with bound volume
claim, err := cs.CoreV1().PersistentVolumeClaims(claimNamespace).Get(ctx, claimName, metav1.GetOptions{})
if err != nil {
framework.ExpectNoError(err, "Cannot get PVC")
return ""
}
pvName := claim.Spec.VolumeName
pv, err := cs.CoreV1().PersistentVolumes().Get(ctx, pvName, metav1.GetOptions{})
if err != nil {
framework.ExpectNoError(err, "Cannot get PV")
return ""
}
if pv.Spec.CSI == nil {
gomega.Expect(pv.Spec.CSI).NotTo(gomega.BeNil())
return ""
}
return pv.Spec.CSI.VolumeHandle
}
// WaitForVolumeAttachmentTerminated waits for the VolumeAttachment with the passed in attachmentName to be terminated.
func WaitForVolumeAttachmentTerminated(ctx context.Context, attachmentName string, cs clientset.Interface, timeout time.Duration) error {
waitErr := wait.PollUntilContextTimeout(ctx, 10*time.Second, timeout, true, func(ctx context.Context) (bool, error) {
_, err := cs.StorageV1().VolumeAttachments().Get(ctx, attachmentName, metav1.GetOptions{})
if err != nil {
// if the volumeattachment object is not found, it means it has been terminated.
if apierrors.IsNotFound(err) {
return true, nil
}
return false, err
}
return false, nil
})
if waitErr != nil {
return fmt.Errorf("error waiting volume attachment %v to terminate: %v", attachmentName, waitErr)
}
return nil
}
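// Illustrative sketch (hypothetical provisioner name): after deleting the pod that used
// a CSI volume, a test can wait for its VolumeAttachment to go away:
//
//	attachmentName := volume.GetVolumeAttachmentName(ctx, cs, config, "csi.example.com", pvc.Name, pvc.Namespace)
//	err := volume.WaitForVolumeAttachmentTerminated(ctx, attachmentName, cs, f.Timeouts.PodDelete)
//	framework.ExpectNoError(err, "VolumeAttachment %s was not removed", attachmentName)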
// startVolumeServer starts a container specified by config.ServerImage and exports all
// config.ServerPorts from it. The returned pod should be used to get the server
// IP address and create an appropriate VolumeSource.
func startVolumeServer(ctx context.Context, client clientset.Interface, config TestConfig) *v1.Pod {
podClient := client.CoreV1().Pods(config.Namespace)
portCount := len(config.ServerPorts)
serverPodPorts := make([]v1.ContainerPort, portCount)
for i := 0; i < portCount; i++ {
portName := fmt.Sprintf("%s-%d", config.Prefix, i)
serverPodPorts[i] = v1.ContainerPort{
Name: portName,
ContainerPort: int32(config.ServerPorts[i]),
Protocol: v1.ProtocolTCP,
}
}
volumeCount := len(config.ServerVolumes)
volumes := make([]v1.Volume, volumeCount)
mounts := make([]v1.VolumeMount, volumeCount)
i := 0
for src, dst := range config.ServerVolumes {
mountName := fmt.Sprintf("path%d", i)
volumes[i].Name = mountName
if src == "" {
volumes[i].VolumeSource.EmptyDir = &v1.EmptyDirVolumeSource{}
} else {
volumes[i].VolumeSource.HostPath = &v1.HostPathVolumeSource{
Path: src,
}
}
mounts[i].Name = mountName
mounts[i].ReadOnly = false
mounts[i].MountPath = dst
i++
}
serverPodName := fmt.Sprintf("%s-server", config.Prefix)
ginkgo.By(fmt.Sprint("creating ", serverPodName, " pod"))
privileged := new(bool)
*privileged = true
restartPolicy := v1.RestartPolicyAlways
if config.WaitForCompletion {
restartPolicy = v1.RestartPolicyNever
}
serverPod := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: serverPodName,
Labels: map[string]string{
"role": serverPodName,
},
},
Spec: v1.PodSpec{
HostNetwork: config.ServerHostNetwork,
Containers: []v1.Container{
{
Name: serverPodName,
Image: config.ServerImage,
SecurityContext: &v1.SecurityContext{
Privileged: privileged,
},
Command: config.ServerCmds,
Args: config.ServerArgs,
Ports: serverPodPorts,
VolumeMounts: mounts,
},
},
Volumes: volumes,
RestartPolicy: restartPolicy,
},
}
if config.ClientNodeSelection.Name != "" {
serverPod.Spec.NodeName = config.ClientNodeSelection.Name
}
var pod *v1.Pod
serverPod, err := podClient.Create(ctx, serverPod, metav1.CreateOptions{})
// ok if the server pod already exists. TODO: make this controllable by callers
if err != nil {
if apierrors.IsAlreadyExists(err) {
framework.Logf("Ignore \"already-exists\" error, re-get pod...")
ginkgo.By(fmt.Sprintf("re-getting the %q server pod", serverPodName))
serverPod, err = podClient.Get(ctx, serverPodName, metav1.GetOptions{})
framework.ExpectNoError(err, "Cannot re-get the server pod %q: %v", serverPodName, err)
pod = serverPod
} else {
framework.ExpectNoError(err, "Failed to create %q pod: %v", serverPodName, err)
}
}
if config.WaitForCompletion {
framework.ExpectNoError(e2epod.WaitForPodSuccessInNamespace(ctx, client, serverPod.Name, serverPod.Namespace))
framework.ExpectNoError(podClient.Delete(ctx, serverPod.Name, metav1.DeleteOptions{}))
} else {
framework.ExpectNoError(e2epod.WaitForPodRunningInNamespace(ctx, client, serverPod))
if pod == nil {
ginkgo.By(fmt.Sprintf("locating the %q server pod", serverPodName))
pod, err = podClient.Get(ctx, serverPodName, metav1.GetOptions{})
framework.ExpectNoError(err, "Cannot locate the server pod %q: %v", serverPodName, err)
}
}
if config.ServerReadyMessage != "" {
_, err := e2epodoutput.LookForStringInLogWithoutKubectl(ctx, client, pod.Namespace, pod.Name, serverPodName, config.ServerReadyMessage, VolumeServerPodStartupTimeout)
framework.ExpectNoError(err, "Failed to find %q in pod logs: %s", config.ServerReadyMessage, err)
}
return pod
}
// TestServerCleanup cleans up the server pod.
func TestServerCleanup(ctx context.Context, f *framework.Framework, config TestConfig) {
ginkgo.By(fmt.Sprint("cleaning the environment after ", config.Prefix))
defer ginkgo.GinkgoRecover()
if config.ServerImage == "" {
return
}
err := e2epod.DeletePodWithWaitByName(ctx, f.ClientSet, config.Prefix+"-server", config.Namespace)
framework.ExpectNoError(err, "delete pod %v in namespace %v", config.Prefix+"-server", config.Namespace)
}
func runVolumeTesterPod(ctx context.Context, client clientset.Interface, timeouts *framework.TimeoutContext, config TestConfig, podSuffix string, privileged bool, fsGroup *int64, tests []Test, slow bool) (*v1.Pod, error) {
ginkgo.By(fmt.Sprint("starting ", config.Prefix, "-", podSuffix))
var gracePeriod int64 = 1
var command string
/**
This condition fixes running storage e2e tests in an SELinux environment.
The HostPath volume plugin creates a directory within /tmp on the host machine, to be mounted as a volume.
The inject-pod writes content to the volume, and a client-pod tries to read the contents and verify it.
When SELinux is enabled on the host, the client-pod cannot read the content and fails with permission denied.
Therefore the client-pod is invoked as privileged, so that it can access the volume content even when SELinux is enabled on the host.
*/
securityLevel := admissionapi.LevelBaseline // TODO (#118184): also support LevelRestricted
if privileged || config.Prefix == "hostpathsymlink" || config.Prefix == "hostpath" {
securityLevel = admissionapi.LevelPrivileged
}
command = "while true ; do sleep 2; done "
seLinuxOptions := &v1.SELinuxOptions{Level: "s0:c0,c1"}
clientPod := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: config.Prefix + "-" + podSuffix,
Labels: map[string]string{
"role": config.Prefix + "-" + podSuffix,
},
},
Spec: v1.PodSpec{
Containers: []v1.Container{
{
Name: config.Prefix + "-" + podSuffix,
Image: e2epod.GetDefaultTestImage(),
WorkingDir: "/opt",
// An imperative and easily debuggable container which reads/writes vol contents for
// us to scan in the tests or by eye.
// We expect that /opt is empty in the minimal containers which we use in this test.
Command: e2epod.GenerateScriptCmd(command),
VolumeMounts: []v1.VolumeMount{},
},
},
TerminationGracePeriodSeconds: &gracePeriod,
SecurityContext: e2epod.GeneratePodSecurityContext(fsGroup, seLinuxOptions),
Volumes: []v1.Volume{},
},
}
e2epod.SetNodeSelection(&clientPod.Spec, config.ClientNodeSelection)
for i, test := range tests {
volumeName := fmt.Sprintf("%s-%s-%d", config.Prefix, "volume", i)
// We need to make the container privileged when SELinux is enabled on the
// host, so the test can write data to a location like /tmp. Also, due to
// the Docker bug below, it's not currently possible to map a device with
// a privileged container, so we don't go privileged for block volumes.
// https://github.com/moby/moby/issues/35991
if privileged && test.Mode == v1.PersistentVolumeBlock {
securityLevel = admissionapi.LevelBaseline
}
clientPod.Spec.Containers[0].SecurityContext = e2epod.GenerateContainerSecurityContext(securityLevel)
if test.Mode == v1.PersistentVolumeBlock {
clientPod.Spec.Containers[0].VolumeDevices = append(clientPod.Spec.Containers[0].VolumeDevices, v1.VolumeDevice{
Name: volumeName,
DevicePath: fmt.Sprintf("/opt/%d", i),
})
} else {
clientPod.Spec.Containers[0].VolumeMounts = append(clientPod.Spec.Containers[0].VolumeMounts, v1.VolumeMount{
Name: volumeName,
MountPath: fmt.Sprintf("/opt/%d", i),
})
}
clientPod.Spec.Volumes = append(clientPod.Spec.Volumes, v1.Volume{
Name: volumeName,
VolumeSource: test.Volume,
})
}
podsNamespacer := client.CoreV1().Pods(config.Namespace)
clientPod, err := podsNamespacer.Create(ctx, clientPod, metav1.CreateOptions{})
if err != nil {
return nil, err
}
if slow {
err = e2epod.WaitTimeoutForPodRunningInNamespace(ctx, client, clientPod.Name, clientPod.Namespace, timeouts.PodStartSlow)
} else {
err = e2epod.WaitTimeoutForPodRunningInNamespace(ctx, client, clientPod.Name, clientPod.Namespace, timeouts.PodStart)
}
if err != nil {
e2epod.DeletePodOrFail(ctx, client, clientPod.Namespace, clientPod.Name)
_ = e2epod.WaitForPodNotFoundInNamespace(ctx, client, clientPod.Name, clientPod.Namespace, timeouts.PodDelete)
return nil, err
}
return clientPod, nil
}
func testVolumeContent(f *framework.Framework, pod *v1.Pod, containerName string, fsGroup *int64, fsType string, tests []Test) {
ginkgo.By("Checking that text file contents are perfect.")
for i, test := range tests {
if test.Mode == v1.PersistentVolumeBlock {
// Block: check content
deviceName := fmt.Sprintf("/opt/%d", i)
commands := GenerateReadBlockCmd(deviceName, len(test.ExpectedContent))
_, err := e2epodoutput.LookForStringInPodExecToContainer(pod.Namespace, pod.Name, containerName, commands, test.ExpectedContent, time.Minute)
framework.ExpectNoError(err, "failed: finding the contents of the block device %s.", deviceName)
// Check that it's a real block device
CheckVolumeModeOfPath(f, pod, test.Mode, deviceName)
} else {
// Filesystem: check content
fileName := fmt.Sprintf("/opt/%d/%s", i, test.File)
commands := GenerateReadFileCmd(fileName)
_, err := e2epodoutput.LookForStringInPodExecToContainer(pod.Namespace, pod.Name, containerName, commands, test.ExpectedContent, time.Minute)
framework.ExpectNoError(err, "failed: finding the contents of the mounted file %s.", fileName)
// Check that a directory has been mounted
dirName := filepath.Dir(fileName)
CheckVolumeModeOfPath(f, pod, test.Mode, dirName)
if !framework.NodeOSDistroIs("windows") {
// Filesystem: check fsgroup
if fsGroup != nil {
ginkgo.By("Checking fsGroup is correct.")
_, err = e2epodoutput.LookForStringInPodExecToContainer(pod.Namespace, pod.Name, containerName, []string{"ls", "-ld", dirName}, strconv.Itoa(int(*fsGroup)), time.Minute)
framework.ExpectNoError(err, "failed: getting the right privileges in the file %v", int(*fsGroup))
}
// Filesystem: check fsType
if fsType != "" {
ginkgo.By("Checking fsType is correct.")
_, err = e2epodoutput.LookForStringInPodExecToContainer(pod.Namespace, pod.Name, containerName, []string{"grep", " " + dirName + " ", "/proc/mounts"}, fsType, time.Minute)
framework.ExpectNoError(err, "failed: getting the right fsType %s", fsType)
}
}
}
}
}
// TestVolumeClient starts a client pod using the given VolumeSource (exported by startVolumeServer())
// and checks that the pod sees the expected data, e.g. from the server pod.
// Multiple Tests can be specified to mount multiple volumes to a single
// pod.
// Timeout for dynamic provisioning (if "WaitForFirstConsumer" is set && provided PVC is not bound yet),
// pod creation, scheduling and complete pod startup (incl. volume attach & mount) is pod.podStartTimeout.
// It should be used for cases where "regular" dynamic provisioning of an empty volume is requested.
func TestVolumeClient(ctx context.Context, f *framework.Framework, config TestConfig, fsGroup *int64, fsType string, tests []Test) {
testVolumeClient(ctx, f, config, fsGroup, fsType, tests, false)
}
// TestVolumeClientSlow is the same as TestVolumeClient except for its timeout.
// Timeout for dynamic provisioning (if "WaitForFirstConsumer" is set && provided PVC is not bound yet),
// pod creation, scheduling and complete pod startup (incl. volume attach & mount) is pod.slowPodStartTimeout.
// It should be used for cases where "special" dynamic provisioning is requested, such as volume cloning
// or snapshot restore.
func TestVolumeClientSlow(ctx context.Context, f *framework.Framework, config TestConfig, fsGroup *int64, fsType string, tests []Test) {
testVolumeClient(ctx, f, config, fsGroup, fsType, tests, true)
}
func testVolumeClient(ctx context.Context, f *framework.Framework, config TestConfig, fsGroup *int64, fsType string, tests []Test, slow bool) {
timeouts := f.Timeouts
clientPod, err := runVolumeTesterPod(ctx, f.ClientSet, timeouts, config, "client", false, fsGroup, tests, slow)
if err != nil {
framework.Failf("Failed to create client pod: %v", err)
}
defer func() {
// testVolumeClient might get used more than once per test, therefore
// we have to clean up before returning.
e2epod.DeletePodOrFail(ctx, f.ClientSet, clientPod.Namespace, clientPod.Name)
framework.ExpectNoError(e2epod.WaitForPodNotFoundInNamespace(ctx, f.ClientSet, clientPod.Name, clientPod.Namespace, timeouts.PodDelete))
}()
testVolumeContent(f, clientPod, "", fsGroup, fsType, tests)
ginkgo.By("Repeating the test on an ephemeral container (if enabled)")
ec := &v1.EphemeralContainer{
EphemeralContainerCommon: v1.EphemeralContainerCommon(clientPod.Spec.Containers[0]),
}
ec.Resources = v1.ResourceRequirements{}
ec.Name = "volume-ephemeral-container"
err = e2epod.NewPodClient(f).AddEphemeralContainerSync(ctx, clientPod, ec, timeouts.PodStart)
// The API server will return NotFound for the subresource when the feature is disabled
framework.ExpectNoError(err, "failed to add ephemeral container for re-test")
testVolumeContent(f, clientPod, ec.Name, fsGroup, fsType, tests)
}
// InjectContent inserts index.html with the given content into the given volume. It does so by
// starting an auxiliary pod which writes the file there.
// The volume must be writable.
func InjectContent(ctx context.Context, f *framework.Framework, config TestConfig, fsGroup *int64, fsType string, tests []Test) {
privileged := true
timeouts := f.Timeouts
if framework.NodeOSDistroIs("windows") {
privileged = false
}
injectorPod, err := runVolumeTesterPod(ctx, f.ClientSet, timeouts, config, "injector", privileged, fsGroup, tests, false /*slow*/)
if err != nil {
framework.Failf("Failed to create injector pod: %v", err)
return
}
defer func() {
// This pod must get deleted before the function returns because the test relies on
// the volume not being in use.
e2epod.DeletePodOrFail(ctx, f.ClientSet, injectorPod.Namespace, injectorPod.Name)
framework.ExpectNoError(e2epod.WaitForPodNotFoundInNamespace(ctx, f.ClientSet, injectorPod.Name, injectorPod.Namespace, timeouts.PodDelete))
}()
ginkgo.By("Writing text file contents in the container.")
for i, test := range tests {
commands := []string{"exec", injectorPod.Name, fmt.Sprintf("--namespace=%v", injectorPod.Namespace), "--"}
if test.Mode == v1.PersistentVolumeBlock {
// Block: write content
deviceName := fmt.Sprintf("/opt/%d", i)
commands = append(commands, generateWriteBlockCmd(test.ExpectedContent, deviceName)...)
} else {
// Filesystem: write content
fileName := fmt.Sprintf("/opt/%d/%s", i, test.File)
commands = append(commands, generateWriteFileCmd(test.ExpectedContent, fileName)...)
}
out, err := e2ekubectl.RunKubectl(injectorPod.Namespace, commands...)
framework.ExpectNoError(err, "failed: writing the contents: %s", out)
}
// Check that the data has really been written in this pod.
// This tests non-persistent volume types
testVolumeContent(f, injectorPod, "", fsGroup, fsType, tests)
}
// generateWriteCmd is used by generateWriteBlockCmd and generateWriteFileCmd
func generateWriteCmd(content, path string) []string {
var commands []string
commands = []string{"/bin/sh", "-c", "echo '" + content + "' > " + path + "; sync"}
return commands
}
// GenerateReadBlockCmd generates the corresponding command lines to read from a block device with the given file path.
func GenerateReadBlockCmd(fullPath string, numberOfCharacters int) []string {
var commands []string
commands = []string{"head", "-c", strconv.Itoa(numberOfCharacters), fullPath}
return commands
}
// generateWriteBlockCmd generates the corresponding command lines to write to a block device the given content.
func generateWriteBlockCmd(content, fullPath string) []string {
return generateWriteCmd(content, fullPath)
}
// GenerateReadFileCmd generates the corresponding command lines to read from a file with the given file path.
func GenerateReadFileCmd(fullPath string) []string {
var commands []string
commands = []string{"cat", fullPath}
return commands
}
// generateWriteFileCmd generates the corresponding command lines to write a file with the given content and file path.
func generateWriteFileCmd(content, fullPath string) []string {
return generateWriteCmd(content, fullPath)
}
// CheckVolumeModeOfPath checks the mode of the volume at the given path.
func CheckVolumeModeOfPath(f *framework.Framework, pod *v1.Pod, volMode v1.PersistentVolumeMode, path string) {
if volMode == v1.PersistentVolumeBlock {
// Check if block exists
VerifyExecInPodSucceed(f, pod, fmt.Sprintf("test -b %s", path))
// Double check that it's not directory
VerifyExecInPodFail(f, pod, fmt.Sprintf("test -d %s", path), 1)
} else {
// Check if directory exists
VerifyExecInPodSucceed(f, pod, fmt.Sprintf("test -d %s", path))
// Double check that it's not block
VerifyExecInPodFail(f, pod, fmt.Sprintf("test -b %s", path), 1)
}
}
// PodExec runs e2epod.ExecCommandInContainerWithFullOutput to execute a shell cmd in the target pod.
// TODO: put this under e2epod once https://github.com/kubernetes/kubernetes/issues/81245
// is resolved. Otherwise there will be dependency issue.
func PodExec(f *framework.Framework, pod *v1.Pod, shExec string) (string, string, error) {
return e2epod.ExecCommandInContainerWithFullOutput(f, pod.Name, pod.Spec.Containers[0].Name, "/bin/sh", "-c", shExec)
}
// VerifyExecInPodSucceed verifies that a shell cmd in the target pod succeeds.
// TODO: put this under e2epod once https://github.com/kubernetes/kubernetes/issues/81245
// is resolved. Otherwise there will be dependency issue.
func VerifyExecInPodSucceed(f *framework.Framework, pod *v1.Pod, shExec string) {
stdout, stderr, err := PodExec(f, pod, shExec)
if err != nil {
if exiterr, ok := err.(uexec.CodeExitError); ok {
exitCode := exiterr.ExitStatus()
framework.ExpectNoError(err,
"%q should succeed, but failed with exit code %d and error message %q\nstdout: %s\nstderr: %s",
shExec, exitCode, exiterr, stdout, stderr)
} else {
framework.ExpectNoError(err,
"%q should succeed, but failed with error message %q\nstdout: %s\nstderr: %s",
shExec, err, stdout, stderr)
}
}
}
// VerifyExecInPodFail verifies that a shell cmd in the target pod fails with a certain exit code.
// TODO: put this under e2epod once https://github.com/kubernetes/kubernetes/issues/81245
// is resolved. Otherwise there will be dependency issue.
func VerifyExecInPodFail(f *framework.Framework, pod *v1.Pod, shExec string, exitCode int) {
stdout, stderr, err := PodExec(f, pod, shExec)
if err != nil {
if exiterr, ok := err.(clientexec.ExitError); ok {
actualExitCode := exiterr.ExitStatus()
gomega.Expect(actualExitCode).To(gomega.Equal(exitCode),
"%q should fail with exit code %d, but failed with exit code %d and error message %q\nstdout: %s\nstderr: %s",
shExec, exitCode, actualExitCode, exiterr, stdout, stderr)
} else {
framework.ExpectNoError(err,
"%q should fail with exit code %d, but failed with error message %q\nstdout: %s\nstderr: %s",
shExec, exitCode, err, stdout, stderr)
}
}
gomega.Expect(err).To(gomega.HaveOccurred(), "%q should fail with exit code %d, but exit without error", shExec, exitCode)
}
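// Illustrative sketch: the exec helpers above are handy for quick filesystem checks in
// the client pod, e.g. asserting that /opt/0 is a mounted directory and not a block
// device (paths are hypothetical):
//
//	volume.VerifyExecInPodSucceed(f, clientPod, "test -d /opt/0")
//	volume.VerifyExecInPodFail(f, clientPod, "test -b /opt/0", 1)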

View File

@ -0,0 +1,53 @@
/*
Copyright 2016 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package perftype
// TODO(random-liu): Replace this with prometheus' data model.
// The following performance data structures are generalized and well-formatted.
// They can be pretty printed in json format and be analyzed by other performance
// analyzing tools, such as Perfdash (k8s.io/contrib/perfdash).
// DataItem is the data point.
type DataItem struct {
// Data is a map from bucket to real data point (e.g. "Perc90" -> 23.5). Notice
// that all data items with the same label combination should have the same buckets.
Data map[string]float64 `json:"data"`
// Unit is the data unit. Notice that all data items with the same label combination
// should have the same unit.
Unit string `json:"unit"`
// Labels is the labels of the data item.
Labels map[string]string `json:"labels,omitempty"`
}
// PerfData contains all data items generated in current test.
type PerfData struct {
// Version is the version of the metrics. The metrics consumer could use the version
// to detect metrics version change and decide what version to support.
Version string `json:"version"`
DataItems []DataItem `json:"dataItems"`
// Labels is the labels of the dataset.
Labels map[string]string `json:"labels,omitempty"`
}
// PerfResultTag is the prefix of generated perfdata. Analyzing tools can find the perf result
// with this tag.
const PerfResultTag = "[Result:Performance]"
// PerfResultEnd is the end of generated perfdata. Analyzing tools can find the end of the perf
// result with this tag.
const PerfResultEnd = "[Finish:Performance]"
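// Illustrative sketch (not part of this package): a test can emit a PerfData blob
// between the two tags so that analyzing tools can find it in the logs; the exact
// log wiring is up to the caller:
//
//	perf := perftype.PerfData{
//		Version: "v1",
//		DataItems: []perftype.DataItem{{
//			Data:   map[string]float64{"Perc50": 12.3, "Perc90": 23.5},
//			Unit:   "ms",
//			Labels: map[string]string{"Metric": "pod_startup"},
//		}},
//	}
//	data, err := json.MarshalIndent(perf, "", "  ")
//	if err == nil {
//		fmt.Printf("%s %s\n%s\n", perftype.PerfResultTag, string(data), perftype.PerfResultEnd)
//	}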

View File

@ -0,0 +1,372 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Package podlogs enables live capturing of all events and log
// messages for some or all pods in a namespace as they get generated.
// This helps with debugging both a running test (what is currently going
// on?) and the output of a CI run (events appear in chronological
// order, and output that normally isn't available, like command
// stdout messages, becomes available).
package podlogs
import (
"bufio"
"bytes"
"context"
"fmt"
"io"
"os"
"path"
"regexp"
"strings"
"sync"
"time"
v1 "k8s.io/api/core/v1"
meta "k8s.io/apimachinery/pkg/apis/meta/v1"
clientset "k8s.io/client-go/kubernetes"
)
// LogOutput determines where output from CopyAllLogs goes.
type LogOutput struct {
// If not nil, errors will be logged here.
StatusWriter io.Writer
// If not nil, all output goes to this writer with "<pod>/<container>:" as prefix.
LogWriter io.Writer
// Base directory for one log file per container.
// The full path of each log file will be <log path prefix><pod>-<container>.log.
LogPathPrefix string
}
// Matches harmless errors from pkg/kubelet/kubelet_pods.go.
var expectedErrors = regexp.MustCompile(`container .* in pod .* is (terminated|waiting to start|not available)|the server could not find the requested resource`)
// CopyAllLogs is basically CopyPodLogs for all current or future pods in the given namespace ns.
func CopyAllLogs(ctx context.Context, cs clientset.Interface, ns string, to LogOutput) error {
return CopyPodLogs(ctx, cs, ns, "", to)
}
// CopyPodLogs follows the logs of all containers in pod with the given podName,
// including those that get created in the future, and writes each log
// line as configured in the output options. It does that until the
// context is done or until an error occurs.
//
// If podName is empty, it will follow all pods in the given namespace ns.
//
// Beware that there is currently no way to force log collection
// before removing pods, which means that there is a known race
// between "stop pod" and "collecting log entries". The alternative
// would be a blocking function which collects logs from all currently
// running pods, but that would then have the disadvantage that
// already deleted pods aren't covered.
//
// Another race occurs when a pod shuts down. Logging stops, but if
// then the pod is not removed from the apiserver quickly enough, logging
// resumes and dumps the old log again. Previously, this was allowed based
// on the assumption that it is better to log twice than miss log messages
// of pods that started and immediately terminated or when logging temporarily
// stopped.
//
// But it turned out to be rather confusing, so now a heuristic is used: if
// log output of a container was already captured, then capturing does not
// resume if the pod is marked for deletion.
func CopyPodLogs(ctx context.Context, cs clientset.Interface, ns, podName string, to LogOutput) error {
options := meta.ListOptions{}
if podName != "" {
options = meta.ListOptions{
FieldSelector: fmt.Sprintf("metadata.name=%s", podName),
}
}
watcher, err := cs.CoreV1().Pods(ns).Watch(context.TODO(), options)
if err != nil {
return fmt.Errorf("cannot create Pod event watcher: %w", err)
}
go func() {
var m sync.Mutex
// Key is pod/container name, true if currently logging it.
active := map[string]bool{}
// Key is pod/container/container-id, true if we have ever started to capture its output.
started := map[string]bool{}
check := func() {
m.Lock()
defer m.Unlock()
pods, err := cs.CoreV1().Pods(ns).List(context.TODO(), options)
if err != nil {
if to.StatusWriter != nil {
fmt.Fprintf(to.StatusWriter, "ERROR: get pod list in %s: %s\n", ns, err)
}
return
}
for _, pod := range pods.Items {
for i, c := range pod.Spec.Containers {
// sanity check, array should have entry for each container
if len(pod.Status.ContainerStatuses) <= i {
continue
}
name := pod.ObjectMeta.Name + "/" + c.Name
id := name + "/" + pod.Status.ContainerStatuses[i].ContainerID
if active[name] ||
// If we have worked on a container before and it has now terminated, then
// there cannot be any new output and we can ignore it.
(pod.Status.ContainerStatuses[i].State.Terminated != nil &&
started[id]) ||
// State.Terminated might not have been updated although the container already
// stopped running. Also check whether the pod is deleted.
(pod.DeletionTimestamp != nil && started[id]) ||
// Don't attempt to get logs for a container unless it is running or has terminated.
// Trying to get a log would just end up with an error that we would have to suppress.
(pod.Status.ContainerStatuses[i].State.Running == nil &&
pod.Status.ContainerStatuses[i].State.Terminated == nil) {
continue
}
readCloser, err := logsForPod(ctx, cs, ns, pod.ObjectMeta.Name,
&v1.PodLogOptions{
Container: c.Name,
Follow: true,
})
if err != nil {
// We do get "normal" errors here, like trying to read too early.
// We can ignore those.
if to.StatusWriter != nil &&
expectedErrors.FindStringIndex(err.Error()) == nil {
fmt.Fprintf(to.StatusWriter, "WARNING: pod log: %s: %s\n", name, err)
}
continue
}
// Determine where we write. If this fails, we intentionally return without clearing
// the active[name] flag, which prevents trying over and over again to
// create the output file.
var out io.Writer
var closer io.Closer
var prefix string
if to.LogWriter != nil {
out = to.LogWriter
nodeName := pod.Spec.NodeName
if len(nodeName) > 10 {
nodeName = nodeName[0:4] + ".." + nodeName[len(nodeName)-4:]
}
prefix = name + "@" + nodeName + ": "
} else {
var err error
filename := to.LogPathPrefix + pod.ObjectMeta.Name + "-" + c.Name + ".log"
err = os.MkdirAll(path.Dir(filename), 0755)
if err != nil {
if to.StatusWriter != nil {
fmt.Fprintf(to.StatusWriter, "ERROR: pod log: create directory for %s: %s\n", filename, err)
}
return
}
// The test suite might run the same test multiple times,
// so we have to append here.
file, err := os.OpenFile(filename, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
if err != nil {
if to.StatusWriter != nil {
fmt.Fprintf(to.StatusWriter, "ERROR: pod log: create file %s: %s\n", filename, err)
}
return
}
closer = file
out = file
}
go func() {
if closer != nil {
defer closer.Close()
}
first := true
defer func() {
m.Lock()
// If we never printed anything, then also skip the final message.
if !first {
if prefix != "" {
fmt.Fprintf(out, "%s==== end of pod log ====\n", prefix)
} else {
fmt.Fprintf(out, "==== end of pod log for container %s ====\n", name)
}
}
active[name] = false
m.Unlock()
readCloser.Close()
}()
scanner := bufio.NewScanner(readCloser)
for scanner.Scan() {
line := scanner.Text()
// Filter out the expected "end of stream" error message,
// it would just confuse developers who don't know about it.
// Same for attempts to read logs from a container that
// isn't ready (yet?!).
if !strings.HasPrefix(line, "rpc error: code = Unknown desc = Error: No such container:") &&
!strings.HasPrefix(line, "unable to retrieve container logs for ") &&
!strings.HasPrefix(line, "Unable to retrieve container logs for ") {
if first {
// Because the same log might be written to multiple times
// in different test instances, log an extra line to separate them.
// Also provides some useful extra information.
if prefix == "" {
fmt.Fprintf(out, "==== start of pod log for container %s ====\n", name)
} else {
fmt.Fprintf(out, "%s==== start of pod log ====\n", prefix)
}
first = false
}
fmt.Fprintf(out, "%s%s\n", prefix, line)
}
}
}()
active[name] = true
started[id] = true
}
}
}
// Watch events to see whether we can start logging
// and log interesting ones.
check()
for {
select {
case <-watcher.ResultChan():
check()
case <-ctx.Done():
return
}
}
}()
return nil
}
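// exampleCopyPodLogs is an illustrative sketch (not part of the upstream file) of
// how CopyPodLogs might be called: it captures the logs of all pods in one
// namespace into per-container files below a log path prefix, with status
// messages going to stderr. The "default" namespace and the /tmp/pod-logs/
// prefix are arbitrary assumptions; the clientset and context are expected to be
// prepared by the caller.
func exampleCopyPodLogs(ctx context.Context, cs clientset.Interface) error {
	return CopyPodLogs(ctx, cs, "default", "", LogOutput{
		StatusWriter:  os.Stderr,
		LogPathPrefix: "/tmp/pod-logs/",
	})
}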
// logsForPod starts reading the logs for a certain pod. If the pod has more than one
// container, opts.Container must be set. Reading stops when the context is done.
// The stream includes formatted error messages and ends with
//
// rpc error: code = Unknown desc = Error: No such container: 41a...
//
// when the pod gets deleted while streaming.
func logsForPod(ctx context.Context, cs clientset.Interface, ns, pod string, opts *v1.PodLogOptions) (io.ReadCloser, error) {
return cs.CoreV1().Pods(ns).GetLogs(pod, opts).Stream(ctx)
}
// WatchPods prints pod status events for a certain namespace or all namespaces
// when namespace name is empty. The closer can be nil if the caller doesn't want
// the file to be closed when watching stops.
func WatchPods(ctx context.Context, cs clientset.Interface, ns string, to io.Writer, toCloser io.Closer) (finalErr error) {
defer func() {
if finalErr != nil && toCloser != nil {
toCloser.Close()
}
}()
pods, err := cs.CoreV1().Pods(ns).Watch(context.Background(), meta.ListOptions{})
if err != nil {
return fmt.Errorf("cannot create Pod watcher: %w", err)
}
defer func() {
if finalErr != nil {
pods.Stop()
}
}()
events, err := cs.CoreV1().Events(ns).Watch(context.Background(), meta.ListOptions{})
if err != nil {
return fmt.Errorf("cannot create Event watcher: %w", err)
}
go func() {
defer func() {
pods.Stop()
events.Stop()
if toCloser != nil {
toCloser.Close()
}
}()
timeFormat := "15:04:05.000"
for {
select {
case e := <-pods.ResultChan():
if e.Object == nil {
continue
}
pod, ok := e.Object.(*v1.Pod)
if !ok {
continue
}
buffer := new(bytes.Buffer)
fmt.Fprintf(buffer,
"%s pod: %s: %s/%s %s: %s %s\n",
time.Now().Format(timeFormat),
e.Type,
pod.Namespace,
pod.Name,
pod.Status.Phase,
pod.Status.Reason,
pod.Status.Conditions,
)
for _, cst := range pod.Status.ContainerStatuses {
fmt.Fprintf(buffer, " %s: ", cst.Name)
if cst.State.Waiting != nil {
fmt.Fprintf(buffer, "WAITING: %s - %s",
cst.State.Waiting.Reason,
cst.State.Waiting.Message,
)
} else if cst.State.Running != nil {
fmt.Fprintf(buffer, "RUNNING")
} else if cst.State.Terminated != nil {
fmt.Fprintf(buffer, "TERMINATED: %s - %s",
cst.State.Terminated.Reason,
cst.State.Terminated.Message,
)
}
fmt.Fprintf(buffer, "\n")
}
to.Write(buffer.Bytes())
case e := <-events.ResultChan():
if e.Object == nil {
continue
}
event, ok := e.Object.(*v1.Event)
if !ok {
continue
}
to.Write([]byte(fmt.Sprintf("%s event: %s/%s %s: %s %s: %s (%v - %v)\n",
time.Now().Format(timeFormat),
event.InvolvedObject.APIVersion,
event.InvolvedObject.Kind,
event.InvolvedObject.Name,
event.Source.Component,
event.Type,
event.Message,
event.FirstTimestamp,
event.LastTimestamp,
)))
case <-ctx.Done():
to.Write([]byte(fmt.Sprintf("%s ==== stopping pod watch ====\n",
time.Now().Format(timeFormat))))
return
}
}
}()
return nil
}
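// exampleWatchPods is an illustrative sketch (not part of the upstream file):
// it streams pod status changes and events for the "default" namespace to
// standard output until the context gets cancelled. A nil closer is passed
// because os.Stdout must not be closed; the clientset is assumed to be
// configured by the caller.
func exampleWatchPods(ctx context.Context, cs clientset.Interface) error {
	return WatchPods(ctx, cs, "default", os.Stdout, nil)
}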

View File

@ -0,0 +1,738 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"bytes"
"context"
"encoding/json"
"errors"
"fmt"
"github.com/onsi/ginkgo/v2"
appsv1 "k8s.io/api/apps/v1"
v1 "k8s.io/api/core/v1"
rbacv1 "k8s.io/api/rbac/v1"
storagev1 "k8s.io/api/storage/v1"
storagev1beta1 "k8s.io/api/storage/v1beta1"
apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/tools/cache"
"k8s.io/kubernetes/test/e2e/framework"
e2etestfiles "k8s.io/kubernetes/test/e2e/framework/testfiles"
imageutils "k8s.io/kubernetes/test/utils/image"
)
// LoadFromManifests loads .yaml or .json manifest files and returns
// all items that it finds in them. It supports all items for which
// there is a factory registered in factories and .yaml files with
// multiple items separated by "---". Files are accessed via the
// "testfiles" package, which means they can come from a file system
// or be built into the binary.
//
// LoadFromManifests has some limitations:
// - aliases are not supported (i.e. use serviceAccountName instead of the deprecated serviceAccount,
// https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1)
// and silently ignored
// - the latest stable API version for each item is used, regardless of what
// is specified in the manifest files
func LoadFromManifests(files ...string) ([]interface{}, error) {
var items []interface{}
err := visitManifests(func(data []byte) error {
// Ignore any additional fields for now, just determine what we have.
var what What
if err := runtime.DecodeInto(scheme.Codecs.UniversalDecoder(), data, &what); err != nil {
return fmt.Errorf("decode TypeMeta: %w", err)
}
// Ignore empty documents.
if what.Kind == "" {
return nil
}
factory := factories[what]
if factory == nil {
return fmt.Errorf("item of type %+v not supported", what)
}
object := factory.New()
if err := runtime.DecodeInto(scheme.Codecs.UniversalDecoder(), data, object); err != nil {
return fmt.Errorf("decode %+v: %w", what, err)
}
items = append(items, object)
return nil
}, files...)
return items, err
}
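// exampleLoadFromManifests is an illustrative sketch (not part of the upstream
// file): it loads a driver manifest and logs how many API objects it contains.
// The manifest file name is a hypothetical example; reading goes through the
// "testfiles" package, so the file may be embedded in the test binary or live
// on disk.
func exampleLoadFromManifests() error {
	items, err := LoadFromManifests("storage-csi/hostpath/csi-hostpath-plugin.yaml")
	if err != nil {
		return err
	}
	framework.Logf("loaded %d items from manifest", len(items))
	return nil
}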
func visitManifests(cb func([]byte) error, files ...string) error {
for _, fileName := range files {
data, err := e2etestfiles.Read(fileName)
if err != nil {
framework.Failf("reading manifest file: %v", err)
}
// Split at the "---" separator before working on
// individual items. Only works for .yaml.
//
// We need to split ourselves because we need access
// to each original chunk of data for
// runtime.DecodeInto. kubectl has its own
// infrastructure for this, but that is a lot of code
// with many dependencies.
items := bytes.Split(data, []byte("\n---"))
for _, item := range items {
if err := cb(item); err != nil {
return fmt.Errorf("%s: %w", fileName, err)
}
}
}
return nil
}
// PatchItems modifies the given items in place such that each test
// gets its own instances, to avoid conflicts between different tests
// and between tests and normal deployments.
//
// This is done by:
// - creating namespaced items inside the test's namespace
// - changing the name of non-namespaced items like ClusterRole
//
// PatchItems has some limitations:
// - only some common items are supported, unknown ones trigger an error
// - only the latest stable API version for each item is supported
func PatchItems(f *framework.Framework, driverNamespace *v1.Namespace, items ...interface{}) error {
for _, item := range items {
// Uncomment when debugging the loading and patching of items.
// Logf("patching original content of %T:\n%s", item, PrettyPrint(item))
if err := patchItemRecursively(f, driverNamespace, item); err != nil {
return err
}
}
return nil
}
// CreateItems creates the items. Each of them must be an API object
// of a type that is registered in Factory.
//
// It returns either a cleanup function or an error, but never both.
//
// Cleaning up after a test can be triggered in two ways:
// - the test invokes the returned cleanup function,
// usually in an AfterEach
// - the test suite terminates, potentially after
// skipping the test's AfterEach (https://github.com/onsi/ginkgo/issues/222)
//
// CreateItems has the same limitations as LoadFromManifests:
// - only some common items are supported, unknown ones trigger an error
// - only the latest stable API version for each item is supported
func CreateItems(ctx context.Context, f *framework.Framework, ns *v1.Namespace, items ...interface{}) error {
var result error
for _, item := range items {
// Each factory knows which item(s) it supports, so try each one.
done := false
description := describeItem(item)
// Uncomment this line to get a full dump of the entire item.
// description = fmt.Sprintf("%s:\n%s", description, PrettyPrint(item))
framework.Logf("creating %s", description)
for _, factory := range factories {
destructor, err := factory.Create(ctx, f, ns, item)
if destructor != nil {
ginkgo.DeferCleanup(framework.IgnoreNotFound(destructor), framework.AnnotatedLocation(fmt.Sprintf("deleting %s", description)))
}
if err == nil {
done = true
break
} else if !errors.Is(err, errorItemNotSupported) {
result = err
break
}
}
if result == nil && !done {
result = fmt.Errorf("item of type %T not supported", item)
break
}
}
return result
}
// CreateFromManifests is a combination of LoadFromManifests,
// PatchItems, patching with an optional custom function,
// and CreateItems.
func CreateFromManifests(ctx context.Context, f *framework.Framework, driverNamespace *v1.Namespace, patch func(item interface{}) error, files ...string) error {
items, err := LoadFromManifests(files...)
if err != nil {
return fmt.Errorf("CreateFromManifests: %w", err)
}
if err := PatchItems(f, driverNamespace, items...); err != nil {
return err
}
if patch != nil {
for _, item := range items {
if err := patch(item); err != nil {
return err
}
}
}
return CreateItems(ctx, f, driverNamespace, items...)
}
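// exampleCreateFromManifests is an illustrative sketch (not part of the upstream
// file): it deploys hypothetical manifest files into the test's namespace and
// uses the optional patch callback to force every Deployment in them down to a
// single replica before creation.
func exampleCreateFromManifests(ctx context.Context, f *framework.Framework) error {
	return CreateFromManifests(ctx, f, f.Namespace, func(item interface{}) error {
		if deployment, ok := item.(*appsv1.Deployment); ok {
			replicas := int32(1)
			deployment.Spec.Replicas = &replicas
		}
		return nil
	}, "driver/controller.yaml", "driver/node.yaml")
}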
// What is a subset of metav1.TypeMeta which (in contrast to
// metav1.TypeMeta itself) satisfies the runtime.Object interface.
type What struct {
Kind string `json:"kind"`
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new What.
func (in *What) DeepCopy() *What {
return &What{Kind: in.Kind}
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out.
func (in *What) DeepCopyInto(out *What) {
out.Kind = in.Kind
}
// DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object.
func (in *What) DeepCopyObject() runtime.Object {
return &What{Kind: in.Kind}
}
// GetObjectKind returns the ObjectKind schema
func (in *What) GetObjectKind() schema.ObjectKind {
return nil
}
// ItemFactory provides support for creating one particular item.
// The type gets exported because other packages might want to
// extend the set of pre-defined factories.
type ItemFactory interface {
// New returns a new empty item.
New() runtime.Object
// Create is responsible for creating the item. It returns an
// error or a cleanup function for the created item.
// If the item is of an unsupported type, it must return
// an error that has errorItemNotSupported as cause.
Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, item interface{}) (func(ctx context.Context) error, error)
}
// describeItem always returns a string that describes the item,
// usually by calling out to cache.MetaNamespaceKeyFunc which
// concatenates namespace (if set) and name. If that fails, the entire
// item gets converted to a string.
func describeItem(item interface{}) string {
key, err := cache.MetaNamespaceKeyFunc(item)
if err == nil && key != "" {
return fmt.Sprintf("%T: %s", item, key)
}
return fmt.Sprintf("%T: %s", item, item)
}
// errorItemNotSupported is the error that Create methods
// must return or wrap when they don't support the given item.
var errorItemNotSupported = errors.New("not supported")
var factories = map[What]ItemFactory{
{"ClusterRole"}: &clusterRoleFactory{},
{"ClusterRoleBinding"}: &clusterRoleBindingFactory{},
{"CSIDriver"}: &csiDriverFactory{},
{"DaemonSet"}: &daemonSetFactory{},
{"ReplicaSet"}: &replicaSetFactory{},
{"Role"}: &roleFactory{},
{"RoleBinding"}: &roleBindingFactory{},
{"Secret"}: &secretFactory{},
{"Service"}: &serviceFactory{},
{"ServiceAccount"}: &serviceAccountFactory{},
{"StatefulSet"}: &statefulSetFactory{},
{"Deployment"}: &deploymentFactory{},
{"StorageClass"}: &storageClassFactory{},
{"VolumeAttributesClass"}: &volumeAttributesClassFactory{},
{"CustomResourceDefinition"}: &customResourceDefinitionFactory{},
}
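// configMapFactory is an illustrative sketch (not part of the upstream file) of
// how the factory set could be extended, following the same template as the
// factories defined later in this file. To enable it, an entry such as
// {"ConfigMap"}: &configMapFactory{} would have to be added to the factories
// map above.
type configMapFactory struct{}

func (f *configMapFactory) New() runtime.Object {
	return &v1.ConfigMap{}
}

func (*configMapFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
	item, ok := i.(*v1.ConfigMap)
	if !ok {
		return nil, errorItemNotSupported
	}
	client := f.ClientSet.CoreV1().ConfigMaps(ns.Name)
	if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
		return nil, fmt.Errorf("create ConfigMap: %w", err)
	}
	return func(ctx context.Context) error {
		return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
	}, nil
}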
// PatchName makes the name of some item unique by appending the
// generated unique name.
func PatchName(f *framework.Framework, item *string) {
if *item != "" {
*item = *item + "-" + f.UniqueName
}
}
// PatchNamespace moves the item into the test's namespace. Not
// all items can be namespaced. For those, the name also needs to be
// patched.
func PatchNamespace(f *framework.Framework, driverNamespace *v1.Namespace, item *string) {
if driverNamespace != nil {
*item = driverNamespace.GetName()
return
}
if f.Namespace != nil {
*item = f.Namespace.GetName()
}
}
func patchItemRecursively(f *framework.Framework, driverNamespace *v1.Namespace, item interface{}) error {
switch item := item.(type) {
case *rbacv1.Subject:
PatchNamespace(f, driverNamespace, &item.Namespace)
case *rbacv1.RoleRef:
// TODO: avoid hard-coding this special name. Perhaps add a Framework.PredefinedRoles
// which contains all role names that are defined cluster-wide before the test starts?
// All those names are exempt from renaming. That list could be populated by querying
// and get extended by tests.
if item.Name != "e2e-test-privileged-psp" {
PatchName(f, &item.Name)
}
case *rbacv1.ClusterRole:
PatchName(f, &item.Name)
case *rbacv1.Role:
PatchNamespace(f, driverNamespace, &item.Namespace)
// Roles are namespaced, but because for RoleRef above we don't
// know whether the referenced role is a ClusterRole or Role
// and therefore always renames, we have to do the same here.
PatchName(f, &item.Name)
case *storagev1.StorageClass:
PatchName(f, &item.Name)
case *storagev1beta1.VolumeAttributesClass:
PatchName(f, &item.Name)
case *storagev1.CSIDriver:
PatchName(f, &item.Name)
case *v1.ServiceAccount:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
case *v1.Secret:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
case *rbacv1.ClusterRoleBinding:
PatchName(f, &item.Name)
for i := range item.Subjects {
if err := patchItemRecursively(f, driverNamespace, &item.Subjects[i]); err != nil {
return fmt.Errorf("%T: %w", f, err)
}
}
if err := patchItemRecursively(f, driverNamespace, &item.RoleRef); err != nil {
return fmt.Errorf("%T: %w", f, err)
}
case *rbacv1.RoleBinding:
PatchNamespace(f, driverNamespace, &item.Namespace)
for i := range item.Subjects {
if err := patchItemRecursively(f, driverNamespace, &item.Subjects[i]); err != nil {
return fmt.Errorf("%T: %w", f, err)
}
}
if err := patchItemRecursively(f, driverNamespace, &item.RoleRef); err != nil {
return fmt.Errorf("%T: %w", f, err)
}
case *v1.Service:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
case *appsv1.StatefulSet:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
if err := patchContainerImages(item.Spec.Template.Spec.Containers); err != nil {
return err
}
if err := patchContainerImages(item.Spec.Template.Spec.InitContainers); err != nil {
return err
}
case *appsv1.Deployment:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
if err := patchContainerImages(item.Spec.Template.Spec.Containers); err != nil {
return err
}
if err := patchContainerImages(item.Spec.Template.Spec.InitContainers); err != nil {
return err
}
case *appsv1.DaemonSet:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
if err := patchContainerImages(item.Spec.Template.Spec.Containers); err != nil {
return err
}
if err := patchContainerImages(item.Spec.Template.Spec.InitContainers); err != nil {
return err
}
case *appsv1.ReplicaSet:
PatchNamespace(f, driverNamespace, &item.ObjectMeta.Namespace)
if err := patchContainerImages(item.Spec.Template.Spec.Containers); err != nil {
return err
}
if err := patchContainerImages(item.Spec.Template.Spec.InitContainers); err != nil {
return err
}
case *apiextensionsv1.CustomResourceDefinition:
// Do nothing. Patching name to all CRDs won't always be the expected behavior.
default:
return fmt.Errorf("missing support for patching item of type %T", item)
}
return nil
}
// The individual factories all follow the same template, but with
// enough differences in types and functions that copy-and-paste
// looked like the least dirty approach. Perhaps one day Go will have
// generics.
type serviceAccountFactory struct{}
func (f *serviceAccountFactory) New() runtime.Object {
return &v1.ServiceAccount{}
}
func (*serviceAccountFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*v1.ServiceAccount)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.CoreV1().ServiceAccounts(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create ServiceAccount: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type clusterRoleFactory struct{}
func (f *clusterRoleFactory) New() runtime.Object {
return &rbacv1.ClusterRole{}
}
func (*clusterRoleFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*rbacv1.ClusterRole)
if !ok {
return nil, errorItemNotSupported
}
framework.Logf("Define cluster role %v", item.GetName())
client := f.ClientSet.RbacV1().ClusterRoles()
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create ClusterRole: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type clusterRoleBindingFactory struct{}
func (f *clusterRoleBindingFactory) New() runtime.Object {
return &rbacv1.ClusterRoleBinding{}
}
func (*clusterRoleBindingFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*rbacv1.ClusterRoleBinding)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.RbacV1().ClusterRoleBindings()
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create ClusterRoleBinding: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type roleFactory struct{}
func (f *roleFactory) New() runtime.Object {
return &rbacv1.Role{}
}
func (*roleFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*rbacv1.Role)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.RbacV1().Roles(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create Role: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type roleBindingFactory struct{}
func (f *roleBindingFactory) New() runtime.Object {
return &rbacv1.RoleBinding{}
}
func (*roleBindingFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*rbacv1.RoleBinding)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.RbacV1().RoleBindings(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create RoleBinding: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type serviceFactory struct{}
func (f *serviceFactory) New() runtime.Object {
return &v1.Service{}
}
func (*serviceFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*v1.Service)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.CoreV1().Services(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create Service: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type statefulSetFactory struct{}
func (f *statefulSetFactory) New() runtime.Object {
return &appsv1.StatefulSet{}
}
func (*statefulSetFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*appsv1.StatefulSet)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.AppsV1().StatefulSets(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create StatefulSet: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type deploymentFactory struct{}
func (f *deploymentFactory) New() runtime.Object {
return &appsv1.Deployment{}
}
func (*deploymentFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*appsv1.Deployment)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.AppsV1().Deployments(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create Deployment: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type daemonSetFactory struct{}
func (f *daemonSetFactory) New() runtime.Object {
return &appsv1.DaemonSet{}
}
func (*daemonSetFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*appsv1.DaemonSet)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.AppsV1().DaemonSets(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create DaemonSet: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type replicaSetFactory struct{}
func (f *replicaSetFactory) New() runtime.Object {
return &appsv1.ReplicaSet{}
}
func (*replicaSetFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*appsv1.ReplicaSet)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.AppsV1().ReplicaSets(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create ReplicaSet: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type storageClassFactory struct{}
func (f *storageClassFactory) New() runtime.Object {
return &storagev1.StorageClass{}
}
func (*storageClassFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*storagev1.StorageClass)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.StorageV1().StorageClasses()
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create StorageClass: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type volumeAttributesClassFactory struct{}
func (f *volumeAttributesClassFactory) New() runtime.Object {
return &storagev1beta1.VolumeAttributesClass{}
}
func (*volumeAttributesClassFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*storagev1beta1.VolumeAttributesClass)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.StorageV1beta1().VolumeAttributesClasses()
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create VolumeAttributesClass: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type csiDriverFactory struct{}
func (f *csiDriverFactory) New() runtime.Object {
return &storagev1.CSIDriver{}
}
func (*csiDriverFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*storagev1.CSIDriver)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.StorageV1().CSIDrivers()
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create CSIDriver: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type secretFactory struct{}
func (f *secretFactory) New() runtime.Object {
return &v1.Secret{}
}
func (*secretFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
item, ok := i.(*v1.Secret)
if !ok {
return nil, errorItemNotSupported
}
client := f.ClientSet.CoreV1().Secrets(ns.Name)
if _, err := client.Create(ctx, item, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create Secret: %w", err)
}
return func(ctx context.Context) error {
return client.Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
type customResourceDefinitionFactory struct{}
func (f *customResourceDefinitionFactory) New() runtime.Object {
return &apiextensionsv1.CustomResourceDefinition{}
}
func (*customResourceDefinitionFactory) Create(ctx context.Context, f *framework.Framework, ns *v1.Namespace, i interface{}) (func(ctx context.Context) error, error) {
var err error
unstructCRD := &unstructured.Unstructured{}
gvr := schema.GroupVersionResource{Group: "apiextensions.k8s.io", Version: "v1", Resource: "customresourcedefinitions"}
item, ok := i.(*apiextensionsv1.CustomResourceDefinition)
if !ok {
return nil, errorItemNotSupported
}
unstructCRD.Object, err = runtime.DefaultUnstructuredConverter.ToUnstructured(i)
if err != nil {
return nil, err
}
if _, err = f.DynamicClient.Resource(gvr).Create(ctx, unstructCRD, metav1.CreateOptions{}); err != nil {
return nil, fmt.Errorf("create CustomResourceDefinition: %w", err)
}
return func(ctx context.Context) error {
return f.DynamicClient.Resource(gvr).Delete(ctx, item.GetName(), metav1.DeleteOptions{})
}, nil
}
// PrettyPrint returns a human-readable representation of an item.
func PrettyPrint(item interface{}) string {
data, err := json.MarshalIndent(item, "", " ")
if err == nil {
return string(data)
}
return fmt.Sprintf("%+v", item)
}
// patchContainerImages replaces the specified Container Registry with a custom
// one provided via the KUBE_TEST_REPO_LIST env variable
func patchContainerImages(containers []v1.Container) error {
var err error
for i, c := range containers {
containers[i].Image, err = imageutils.ReplaceRegistryInImageURL(c.Image)
if err != nil {
return err
}
}
return nil
}

View File

@ -0,0 +1,233 @@
/*
Copyright 2018 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"fmt"
"path"
"strings"
appsv1 "k8s.io/api/apps/v1"
v1 "k8s.io/api/core/v1"
storagev1 "k8s.io/api/storage/v1"
e2eframework "k8s.io/kubernetes/test/e2e/framework"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
)
// PatchCSIDeployment modifies the CSI driver deployment:
// - replaces the provisioner name
// - forces pods onto a specific host
//
// All of that is optional, see PatchCSIOptions. Just beware
// that not renaming the CSI driver deployment can be problematic:
// - when multiple tests deploy the driver, they need
// to run sequentially
// - might conflict with manual deployments
//
// This function is written so that it works for CSI driver deployments
// that follow these conventions:
// - driver and provisioner names are identical
// - the driver binary accepts a --drivername parameter
// - the paths inside the container are either fixed
//   and don't need to be patched (for example, --csi-address=/csi/csi.sock is
// okay) or are specified directly in a parameter (for example,
// --kubelet-registration-path=/var/lib/kubelet/plugins/csi-hostpath/csi.sock)
//
// Driver deployments that are different will have to do the patching
// without this function, or skip patching entirely.
func PatchCSIDeployment(f *e2eframework.Framework, o PatchCSIOptions, object interface{}) error {
rename := o.OldDriverName != "" && o.NewDriverName != "" &&
o.OldDriverName != o.NewDriverName
substKubeletRootDir := func(s string) string {
return strings.ReplaceAll(s, "/var/lib/kubelet/", e2eframework.TestContext.KubeletRootDir+"/")
}
patchVolumes := func(volumes []v1.Volume) {
if !rename {
return
}
for i := range volumes {
volume := &volumes[i]
if volume.HostPath != nil {
// Update paths like /var/lib/kubelet/plugins/<provisioner>.
p := &volume.HostPath.Path
dir, file := path.Split(*p)
if file == o.OldDriverName {
*p = path.Join(dir, o.NewDriverName)
}
// Inject non-standard kubelet path.
*p = substKubeletRootDir(*p)
}
}
}
patchContainers := func(containers []v1.Container) {
for i := range containers {
container := &containers[i]
if rename {
for e := range container.Args {
// Inject test-specific provider name into paths like this one:
// --kubelet-registration-path=/var/lib/kubelet/plugins/csi-hostpath/csi.sock
container.Args[e] = strings.Replace(container.Args[e], "/"+o.OldDriverName+"/", "/"+o.NewDriverName+"/", 1)
}
}
// Modify --kubelet-registration-path.
for e := range container.Args {
container.Args[e] = substKubeletRootDir(container.Args[e])
}
for e := range container.VolumeMounts {
container.VolumeMounts[e].MountPath = substKubeletRootDir(container.VolumeMounts[e].MountPath)
}
if len(o.Features) > 0 && len(o.Features[container.Name]) > 0 {
featuregateString := strings.Join(o.Features[container.Name], ",")
container.Args = append(container.Args, fmt.Sprintf("--feature-gates=%s", featuregateString))
}
// Overwrite the driver or provisioner name by
// appending a parameter with the right value.
switch container.Name {
case o.DriverContainerName:
container.Args = append(container.Args, o.DriverContainerArguments...)
}
}
}
patchPodSpec := func(spec *v1.PodSpec) {
patchContainers(spec.Containers)
patchVolumes(spec.Volumes)
if o.NodeName != "" {
e2epod.SetNodeSelection(spec, e2epod.NodeSelection{Name: o.NodeName})
}
}
switch object := object.(type) {
case *appsv1.ReplicaSet:
patchPodSpec(&object.Spec.Template.Spec)
case *appsv1.DaemonSet:
patchPodSpec(&object.Spec.Template.Spec)
case *appsv1.StatefulSet:
patchPodSpec(&object.Spec.Template.Spec)
case *appsv1.Deployment:
patchPodSpec(&object.Spec.Template.Spec)
case *storagev1.StorageClass:
if o.NewDriverName != "" {
// Driver name is expected to be the same
// as the provisioner name here.
object.Provisioner = o.NewDriverName
}
case *storagev1.CSIDriver:
if o.NewDriverName != "" {
object.Name = o.NewDriverName
}
if o.PodInfo != nil {
object.Spec.PodInfoOnMount = o.PodInfo
}
if o.StorageCapacity != nil {
object.Spec.StorageCapacity = o.StorageCapacity
}
if o.CanAttach != nil {
object.Spec.AttachRequired = o.CanAttach
}
if o.VolumeLifecycleModes != nil {
object.Spec.VolumeLifecycleModes = *o.VolumeLifecycleModes
}
if o.TokenRequests != nil {
object.Spec.TokenRequests = o.TokenRequests
}
if o.RequiresRepublish != nil {
object.Spec.RequiresRepublish = o.RequiresRepublish
}
if o.FSGroupPolicy != nil {
object.Spec.FSGroupPolicy = o.FSGroupPolicy
}
if o.SELinuxMount != nil {
object.Spec.SELinuxMount = o.SELinuxMount
}
}
return nil
}
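// examplePatchCallback is an illustrative sketch (not part of the upstream
// file): it builds a patch callback for CreateFromManifests that renames a
// hostpath-style driver per test and pins its pods to one node. The driver name
// "csi-hostpath" and the container name "hostpath" are hypothetical and would
// have to match the actual manifests.
func examplePatchCallback(f *e2eframework.Framework, nodeName string) func(item interface{}) error {
	o := PatchCSIOptions{
		OldDriverName:            "csi-hostpath",
		NewDriverName:            "csi-hostpath-" + f.UniqueName,
		DriverContainerName:      "hostpath",
		DriverContainerArguments: []string{"--drivername=csi-hostpath-" + f.UniqueName},
		NodeName:                 nodeName,
	}
	return func(item interface{}) error {
		return PatchCSIDeployment(f, o, item)
	}
}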
// PatchCSIOptions controls how PatchCSIDeployment patches the objects.
type PatchCSIOptions struct {
// The original driver name.
OldDriverName string
// The driver name that replaces the original name.
// Can be empty (not used at all) or equal to OldDriverName
// (then it will be added where appropriate without renaming
// in existing fields).
NewDriverName string
// The name of the container which has the CSI driver binary.
// If non-empty, DriverContainerArguments are added to argument
// list in container with that name.
DriverContainerName string
// List of arguments to add to container with
// DriverContainerName.
DriverContainerArguments []string
// The name of the container which has the provisioner binary.
// If non-empty, --provisioner with new name will be appended
// to the argument list.
ProvisionerContainerName string
// The name of the container which has the snapshotter binary.
// If non-empty, --snapshotter with new name will be appended
// to the argument list.
SnapshotterContainerName string
// If non-empty, all pods are forced to run on this node.
NodeName string
// If not nil, the value to use for the CSIDriver.Spec.PodInfo
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
PodInfo *bool
// If not nil, the value to use for the CSIDriver.Spec.CanAttach
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
CanAttach *bool
// If not nil, the value to use for the CSIDriver.Spec.StorageCapacity
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
StorageCapacity *bool
// If not nil, the value to use for the CSIDriver.Spec.VolumeLifecycleModes
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
VolumeLifecycleModes *[]storagev1.VolumeLifecycleMode
// If not nil, the value to use for the CSIDriver.Spec.TokenRequests
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
TokenRequests []storagev1.TokenRequest
// If not nil, the value to use for the CSIDriver.Spec.RequiresRepublish
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
RequiresRepublish *bool
// If not nil, the value to use for the CSIDriver.Spec.FSGroupPolicy
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
FSGroupPolicy *storagev1.FSGroupPolicy
// If not nil, the value to use for the CSIDriver.Spec.SELinuxMount
// field *if* the driver deploys a CSIDriver object. Ignored
// otherwise.
SELinuxMount *bool
// If not nil, the values will be used for setting feature arguments to
// specific sidecar.
// Feature is a map - where key is sidecar name such as:
// -- key: resizer
// -- value: []string{feature-gates}
Features map[string][]string
}

View File

@ -0,0 +1,22 @@
/*
Copyright 2017 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import "k8s.io/kubernetes/test/e2e/framework"
// SIGDescribe annotates the test with the SIG label.
var SIGDescribe = framework.SIGDescribe("storage")

View File

@ -0,0 +1,197 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"context"
"fmt"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/util/exec"
"k8s.io/kubernetes/test/e2e/framework"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
)
// Result holds the result of a remote command execution.
type Result struct {
Host string
Cmd string
Stdout string
Stderr string
Code int
}
// LogResult logs the given command result.
func LogResult(result Result) {
remote := result.Host
framework.Logf("exec %s: command: %s", remote, result.Cmd)
framework.Logf("exec %s: stdout: %q", remote, result.Stdout)
framework.Logf("exec %s: stderr: %q", remote, result.Stderr)
framework.Logf("exec %s: exit code: %d", remote, result.Code)
}
// HostExec represents the interface required to execute commands on a remote host.
type HostExec interface {
Execute(ctx context.Context, cmd string, node *v1.Node) (Result, error)
IssueCommandWithResult(ctx context.Context, cmd string, node *v1.Node) (string, error)
IssueCommand(ctx context.Context, cmd string, node *v1.Node) error
Cleanup(ctx context.Context)
}
// hostExecutor implements HostExec
type hostExecutor struct {
*framework.Framework
nodeExecPods map[string]*v1.Pod
}
// NewHostExec returns a HostExec
func NewHostExec(framework *framework.Framework) HostExec {
return &hostExecutor{
Framework: framework,
nodeExecPods: make(map[string]*v1.Pod),
}
}
// launchNodeExecPod launches a hostexec pod for local PV and waits
// until it's Running.
func (h *hostExecutor) launchNodeExecPod(ctx context.Context, node string) *v1.Pod {
f := h.Framework
cs := f.ClientSet
ns := f.Namespace
hostExecPod := e2epod.NewExecPodSpec(ns.Name, "", true)
hostExecPod.GenerateName = fmt.Sprintf("hostexec-%s-", node)
if framework.TestContext.NodeE2E {
// E2E node tests do not run a scheduler, so set the node name directly
hostExecPod.Spec.NodeName = node
} else {
// Use NodeAffinity instead of NodeName so that pods will not
// be immediately Failed by kubelet if it's out of space. Instead
// Pods will be pending in the scheduler until there is space freed
// up.
e2epod.SetNodeAffinity(&hostExecPod.Spec, node)
}
hostExecPod.Spec.Volumes = []v1.Volume{
{
// Required to enter into host mount namespace via nsenter.
Name: "rootfs",
VolumeSource: v1.VolumeSource{
HostPath: &v1.HostPathVolumeSource{
Path: "/",
},
},
},
}
hostExecPod.Spec.Containers[0].VolumeMounts = []v1.VolumeMount{
{
Name: "rootfs",
MountPath: "/rootfs",
ReadOnly: true,
},
}
hostExecPod.Spec.Containers[0].SecurityContext = &v1.SecurityContext{
Privileged: func(privileged bool) *bool {
return &privileged
}(true),
}
pod, err := cs.CoreV1().Pods(ns.Name).Create(ctx, hostExecPod, metav1.CreateOptions{})
framework.ExpectNoError(err)
err = e2epod.WaitTimeoutForPodRunningInNamespace(ctx, cs, pod.Name, pod.Namespace, f.Timeouts.PodStart)
framework.ExpectNoError(err)
return pod
}
// Execute executes the command on the given node. If there is no error
// performing the remote command execution, the stdout, stderr and exit code
// are returned.
// This works like ssh.SSH(...) utility.
func (h *hostExecutor) Execute(ctx context.Context, cmd string, node *v1.Node) (Result, error) {
result, err := h.exec(ctx, cmd, node)
if codeExitErr, ok := err.(exec.CodeExitError); ok {
// extract the exit code of remote command and silence the command
// non-zero exit code error
result.Code = codeExitErr.ExitStatus()
err = nil
}
return result, err
}
func (h *hostExecutor) exec(ctx context.Context, cmd string, node *v1.Node) (Result, error) {
result := Result{
Host: node.Name,
Cmd: cmd,
}
pod, ok := h.nodeExecPods[node.Name]
if !ok {
pod = h.launchNodeExecPod(ctx, node.Name)
if pod == nil {
return result, fmt.Errorf("failed to create hostexec pod for node %q", node)
}
h.nodeExecPods[node.Name] = pod
}
args := []string{
"nsenter",
"--mount=/rootfs/proc/1/ns/mnt",
"--",
"sh",
"-c",
cmd,
}
containerName := pod.Spec.Containers[0].Name
var err error
result.Stdout, result.Stderr, err = e2epod.ExecWithOptions(h.Framework, e2epod.ExecOptions{
Command: args,
Namespace: pod.Namespace,
PodName: pod.Name,
ContainerName: containerName,
Stdin: nil,
CaptureStdout: true,
CaptureStderr: true,
PreserveWhitespace: true,
})
return result, err
}
// IssueCommandWithResult issues a command on the given node and returns stdout as
// the result. It returns an error if executing the command fails or the command
// exits non-zero.
func (h *hostExecutor) IssueCommandWithResult(ctx context.Context, cmd string, node *v1.Node) (string, error) {
result, err := h.exec(ctx, cmd, node)
if err != nil {
LogResult(result)
}
return result.Stdout, err
}
// IssueCommand works like IssueCommandWithResult, but discards the result.
func (h *hostExecutor) IssueCommand(ctx context.Context, cmd string, node *v1.Node) error {
_, err := h.IssueCommandWithResult(ctx, cmd, node)
return err
}
// Cleanup cleans up the resources created during the test.
// Note that in most cases it is not necessary to call this because the pods are
// created in the test namespace, which is destroyed during the teardown phase.
func (h *hostExecutor) Cleanup(ctx context.Context) {
for _, pod := range h.nodeExecPods {
e2epod.DeletePodOrFail(ctx, h.Framework.ClientSet, pod.Namespace, pod.Name)
}
h.nodeExecPods = make(map[string]*v1.Pod)
}
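// exampleHostExec is an illustrative sketch (not part of the upstream file): it
// runs a command in the host mount namespace of a node and logs the kernel
// version. The node object is assumed to have been looked up by the caller.
func exampleHostExec(ctx context.Context, f *framework.Framework, node *v1.Node) {
	hostExec := NewHostExec(f)
	defer hostExec.Cleanup(ctx)
	out, err := hostExec.IssueCommandWithResult(ctx, "uname -r", node)
	framework.ExpectNoError(err)
	framework.Logf("kernel on node %s: %s", node.Name, out)
}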

View File

@ -0,0 +1,361 @@
/*
Copyright 2019 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
/*
* Various local test resource implementations.
*/
import (
"context"
"fmt"
"path/filepath"
"strings"
"github.com/onsi/ginkgo/v2"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/util/uuid"
"k8s.io/kubernetes/test/e2e/framework"
)
// LocalVolumeType represents the type of a local volume, e.g. tmpfs, directory,
// block, etc.
type LocalVolumeType string
const (
// LocalVolumeDirectory represents a simple directory used as a local volume
LocalVolumeDirectory LocalVolumeType = "dir"
// LocalVolumeDirectoryLink is like LocalVolumeDirectory but it's a symbolic link to directory
LocalVolumeDirectoryLink LocalVolumeType = "dir-link"
// LocalVolumeDirectoryBindMounted is like LocalVolumeDirectory but bind mounted
LocalVolumeDirectoryBindMounted LocalVolumeType = "dir-bindmounted"
// LocalVolumeDirectoryLinkBindMounted is like LocalVolumeDirectory but it's a symbolic link to a self bind-mounted directory.
// Note that bind mounting at a symbolic link actually mounts at the directory it
// links to.
LocalVolumeDirectoryLinkBindMounted LocalVolumeType = "dir-link-bindmounted"
// LocalVolumeTmpfs represents a temporary filesystem to be used as local volume
LocalVolumeTmpfs LocalVolumeType = "tmpfs"
// LocalVolumeBlock represents a Block device, creates a local file, and maps it as a block device
LocalVolumeBlock LocalVolumeType = "block"
// LocalVolumeBlockFS represents a filesystem backed by a block device
LocalVolumeBlockFS LocalVolumeType = "blockfs"
// LocalVolumeGCELocalSSD represents a Filesystem backed by GCE Local SSD as local volume
LocalVolumeGCELocalSSD LocalVolumeType = "gce-localssd-scsi-fs"
)
// LocalTestResource represents test resource of a local volume.
type LocalTestResource struct {
VolumeType LocalVolumeType
Node *v1.Node
// Volume path, path to filesystem or block device on the node
Path string
// If volume is backed by a loop device, we create loop device storage file
// under this directory.
loopDir string
}
// LocalTestResourceManager represents interface to create/destroy local test resources on node
type LocalTestResourceManager interface {
Create(ctx context.Context, node *v1.Node, volumeType LocalVolumeType, parameters map[string]string) *LocalTestResource
ExpandBlockDevice(ctx context.Context, ltr *LocalTestResource, mbToAdd int) error
Remove(ctx context.Context, ltr *LocalTestResource)
}
// ltrMgr implements LocalTestResourceManager
type ltrMgr struct {
prefix string
hostExec HostExec
// hostBase represents a writable directory on the host under which we
// create test directories
hostBase string
}
// NewLocalResourceManager returns an instance of LocalTestResourceManager
func NewLocalResourceManager(prefix string, hostExec HostExec, hostBase string) LocalTestResourceManager {
return &ltrMgr{
prefix: prefix,
hostExec: hostExec,
hostBase: hostBase,
}
}
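// exampleWithTmpfsVolume is an illustrative sketch (not part of the upstream
// file): it creates a tmpfs-backed local volume on a node, hands its path to
// the given callback, and tears it down again. The "/tmp" host base directory
// and the "local-volume-test" prefix are arbitrary assumptions.
func exampleWithTmpfsVolume(ctx context.Context, f *framework.Framework, node *v1.Node, use func(hostPath string)) {
	hostExec := NewHostExec(f)
	defer hostExec.Cleanup(ctx)
	mgr := NewLocalResourceManager("local-volume-test", hostExec, "/tmp")
	ltr := mgr.Create(ctx, node, LocalVolumeTmpfs, nil)
	defer mgr.Remove(ctx, ltr)
	use(ltr.Path)
}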
// getTestDir returns a test dir with a random name under the `hostBase` directory.
func (l *ltrMgr) getTestDir() string {
testDirName := fmt.Sprintf("%s-%s", l.prefix, string(uuid.NewUUID()))
return filepath.Join(l.hostBase, testDirName)
}
func (l *ltrMgr) setupLocalVolumeTmpfs(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
hostDir := l.getTestDir()
ginkgo.By(fmt.Sprintf("Creating tmpfs mount point on node %q at path %q", node.Name, hostDir))
err := l.hostExec.IssueCommand(ctx, fmt.Sprintf("mkdir -p %q && mount -t tmpfs -o size=10m tmpfs-%q %q", hostDir, hostDir, hostDir), node)
framework.ExpectNoError(err)
return &LocalTestResource{
Node: node,
Path: hostDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeTmpfs(ctx context.Context, ltr *LocalTestResource) {
ginkgo.By(fmt.Sprintf("Unmount tmpfs mount point on node %q at path %q", ltr.Node.Name, ltr.Path))
err := l.hostExec.IssueCommand(ctx, fmt.Sprintf("umount %q", ltr.Path), ltr.Node)
framework.ExpectNoError(err)
ginkgo.By("Removing the test directory")
err = l.hostExec.IssueCommand(ctx, fmt.Sprintf("rm -r %s", ltr.Path), ltr.Node)
framework.ExpectNoError(err)
}
// createAndSetupLoopDevice creates an empty file and associates a loop device with it.
func (l *ltrMgr) createAndSetupLoopDevice(ctx context.Context, dir string, node *v1.Node, size int) {
ginkgo.By(fmt.Sprintf("Creating block device on node %q using path %q", node.Name, dir))
mkdirCmd := fmt.Sprintf("mkdir -p %s", dir)
count := size / 4096
// xfs requires at least 4096 blocks
if count < 4096 {
count = 4096
}
ddCmd := fmt.Sprintf("dd if=/dev/zero of=%s/file bs=4096 count=%d", dir, count)
losetupCmd := fmt.Sprintf("losetup -f %s/file", dir)
err := l.hostExec.IssueCommand(ctx, fmt.Sprintf("%s && %s && %s", mkdirCmd, ddCmd, losetupCmd), node)
framework.ExpectNoError(err)
}
// findLoopDevice finds loop device path by its associated storage directory.
func (l *ltrMgr) findLoopDevice(ctx context.Context, dir string, node *v1.Node) string {
cmd := fmt.Sprintf("E2E_LOOP_DEV=$(losetup | grep %s/file | awk '{ print $1 }') 2>&1 > /dev/null && echo ${E2E_LOOP_DEV}", dir)
loopDevResult, err := l.hostExec.IssueCommandWithResult(ctx, cmd, node)
framework.ExpectNoError(err)
return strings.TrimSpace(loopDevResult)
}
func (l *ltrMgr) setupLocalVolumeBlock(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
loopDir := l.getTestDir()
l.createAndSetupLoopDevice(ctx, loopDir, node, 20*1024*1024)
loopDev := l.findLoopDevice(ctx, loopDir, node)
return &LocalTestResource{
Node: node,
Path: loopDev,
loopDir: loopDir,
}
}
// teardownLoopDevice tears down loop device by its associated storage directory.
func (l *ltrMgr) teardownLoopDevice(ctx context.Context, dir string, node *v1.Node) {
loopDev := l.findLoopDevice(ctx, dir, node)
ginkgo.By(fmt.Sprintf("Tear down block device %q on node %q at path %s/file", loopDev, node.Name, dir))
losetupDeleteCmd := fmt.Sprintf("losetup -d %s", loopDev)
err := l.hostExec.IssueCommand(ctx, losetupDeleteCmd, node)
framework.ExpectNoError(err)
return
}
func (l *ltrMgr) cleanupLocalVolumeBlock(ctx context.Context, ltr *LocalTestResource) {
l.teardownLoopDevice(ctx, ltr.loopDir, ltr.Node)
ginkgo.By(fmt.Sprintf("Removing the test directory %s", ltr.loopDir))
removeCmd := fmt.Sprintf("rm -r %s", ltr.loopDir)
err := l.hostExec.IssueCommand(ctx, removeCmd, ltr.Node)
framework.ExpectNoError(err)
}
func (l *ltrMgr) setupLocalVolumeBlockFS(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
ltr := l.setupLocalVolumeBlock(ctx, node, parameters)
loopDev := ltr.Path
loopDir := ltr.loopDir
// Format and mount at loopDir and give others rwx for read/write testing
cmd := fmt.Sprintf("mkfs -t ext4 %s && mount -t ext4 %s %s && chmod o+rwx %s", loopDev, loopDev, loopDir, loopDir)
err := l.hostExec.IssueCommand(ctx, cmd, node)
framework.ExpectNoError(err)
return &LocalTestResource{
Node: node,
Path: loopDir,
loopDir: loopDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeBlockFS(ctx context.Context, ltr *LocalTestResource) {
umountCmd := fmt.Sprintf("umount %s", ltr.Path)
err := l.hostExec.IssueCommand(ctx, umountCmd, ltr.Node)
framework.ExpectNoError(err)
l.cleanupLocalVolumeBlock(ctx, ltr)
}
func (l *ltrMgr) setupLocalVolumeDirectory(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
hostDir := l.getTestDir()
mkdirCmd := fmt.Sprintf("mkdir -p %s", hostDir)
err := l.hostExec.IssueCommand(ctx, mkdirCmd, node)
framework.ExpectNoError(err)
return &LocalTestResource{
Node: node,
Path: hostDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeDirectory(ctx context.Context, ltr *LocalTestResource) {
ginkgo.By("Removing the test directory")
removeCmd := fmt.Sprintf("rm -r %s", ltr.Path)
err := l.hostExec.IssueCommand(ctx, removeCmd, ltr.Node)
framework.ExpectNoError(err)
}
func (l *ltrMgr) setupLocalVolumeDirectoryLink(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
hostDir := l.getTestDir()
hostDirBackend := hostDir + "-backend"
cmd := fmt.Sprintf("mkdir %s && ln -s %s %s", hostDirBackend, hostDirBackend, hostDir)
err := l.hostExec.IssueCommand(ctx, cmd, node)
framework.ExpectNoError(err)
return &LocalTestResource{
Node: node,
Path: hostDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeDirectoryLink(ctx context.Context, ltr *LocalTestResource) {
ginkgo.By("Removing the test directory")
hostDir := ltr.Path
hostDirBackend := hostDir + "-backend"
removeCmd := fmt.Sprintf("rm -r %s && rm -r %s", hostDir, hostDirBackend)
err := l.hostExec.IssueCommand(ctx, removeCmd, ltr.Node)
framework.ExpectNoError(err)
}
func (l *ltrMgr) setupLocalVolumeDirectoryBindMounted(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
hostDir := l.getTestDir()
cmd := fmt.Sprintf("mkdir %s && mount --bind %s %s", hostDir, hostDir, hostDir)
err := l.hostExec.IssueCommand(ctx, cmd, node)
framework.ExpectNoError(err)
return &LocalTestResource{
Node: node,
Path: hostDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeDirectoryBindMounted(ctx context.Context, ltr *LocalTestResource) {
ginkgo.By("Removing the test directory")
hostDir := ltr.Path
removeCmd := fmt.Sprintf("umount %s && rm -r %s", hostDir, hostDir)
err := l.hostExec.IssueCommand(ctx, removeCmd, ltr.Node)
framework.ExpectNoError(err)
}
func (l *ltrMgr) setupLocalVolumeDirectoryLinkBindMounted(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
hostDir := l.getTestDir()
hostDirBackend := hostDir + "-backend"
cmd := fmt.Sprintf("mkdir %s && mount --bind %s %s && ln -s %s %s", hostDirBackend, hostDirBackend, hostDirBackend, hostDirBackend, hostDir)
err := l.hostExec.IssueCommand(ctx, cmd, node)
framework.ExpectNoError(err)
return &LocalTestResource{
Node: node,
Path: hostDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeDirectoryLinkBindMounted(ctx context.Context, ltr *LocalTestResource) {
ginkgo.By("Removing the test directory")
hostDir := ltr.Path
hostDirBackend := hostDir + "-backend"
removeCmd := fmt.Sprintf("rm %s && umount %s && rm -r %s", hostDir, hostDirBackend, hostDirBackend)
err := l.hostExec.IssueCommand(ctx, removeCmd, ltr.Node)
framework.ExpectNoError(err)
}
func (l *ltrMgr) setupLocalVolumeGCELocalSSD(ctx context.Context, node *v1.Node, parameters map[string]string) *LocalTestResource {
res, err := l.hostExec.IssueCommandWithResult(ctx, "ls /mnt/disks/by-uuid/google-local-ssds-scsi-fs/", node)
framework.ExpectNoError(err)
dirName := strings.Fields(res)[0]
hostDir := "/mnt/disks/by-uuid/google-local-ssds-scsi-fs/" + dirName
return &LocalTestResource{
Node: node,
Path: hostDir,
}
}
func (l *ltrMgr) cleanupLocalVolumeGCELocalSSD(ctx context.Context, ltr *LocalTestResource) {
// This filesystem is attached in cluster initialization, we clean all files to make it reusable.
removeCmd := fmt.Sprintf("find '%s' -mindepth 1 -maxdepth 1 -print0 | xargs -r -0 rm -rf", ltr.Path)
err := l.hostExec.IssueCommand(ctx, removeCmd, ltr.Node)
framework.ExpectNoError(err)
}
func (l *ltrMgr) expandLocalVolumeBlockFS(ctx context.Context, ltr *LocalTestResource, mbToAdd int) error {
ddCmd := fmt.Sprintf("dd if=/dev/zero of=%s/file conv=notrunc oflag=append bs=1M count=%d", ltr.loopDir, mbToAdd)
loopDev := l.findLoopDevice(ctx, ltr.loopDir, ltr.Node)
losetupCmd := fmt.Sprintf("losetup -c %s", loopDev)
return l.hostExec.IssueCommand(ctx, fmt.Sprintf("%s && %s", ddCmd, losetupCmd), ltr.Node)
}
func (l *ltrMgr) ExpandBlockDevice(ctx context.Context, ltr *LocalTestResource, mbtoAdd int) error {
switch ltr.VolumeType {
case LocalVolumeBlockFS:
return l.expandLocalVolumeBlockFS(ctx, ltr, mbtoAdd)
}
return fmt.Errorf("Failed to expand local test resource, unsupported volume type: %s", ltr.VolumeType)
}
func (l *ltrMgr) Create(ctx context.Context, node *v1.Node, volumeType LocalVolumeType, parameters map[string]string) *LocalTestResource {
var ltr *LocalTestResource
switch volumeType {
case LocalVolumeDirectory:
ltr = l.setupLocalVolumeDirectory(ctx, node, parameters)
case LocalVolumeDirectoryLink:
ltr = l.setupLocalVolumeDirectoryLink(ctx, node, parameters)
case LocalVolumeDirectoryBindMounted:
ltr = l.setupLocalVolumeDirectoryBindMounted(ctx, node, parameters)
case LocalVolumeDirectoryLinkBindMounted:
ltr = l.setupLocalVolumeDirectoryLinkBindMounted(ctx, node, parameters)
case LocalVolumeTmpfs:
ltr = l.setupLocalVolumeTmpfs(ctx, node, parameters)
case LocalVolumeBlock:
ltr = l.setupLocalVolumeBlock(ctx, node, parameters)
case LocalVolumeBlockFS:
ltr = l.setupLocalVolumeBlockFS(ctx, node, parameters)
case LocalVolumeGCELocalSSD:
ltr = l.setupLocalVolumeGCELocalSSD(ctx, node, parameters)
default:
framework.Failf("Failed to create local test resource on node %q, unsupported volume type: %v is specified", node.Name, volumeType)
return nil
}
if ltr == nil {
framework.Failf("Failed to create local test resource on node %q, volume type: %v, parameters: %v", node.Name, volumeType, parameters)
}
ltr.VolumeType = volumeType
return ltr
}
func (l *ltrMgr) Remove(ctx context.Context, ltr *LocalTestResource) {
switch ltr.VolumeType {
case LocalVolumeDirectory:
l.cleanupLocalVolumeDirectory(ctx, ltr)
case LocalVolumeDirectoryLink:
l.cleanupLocalVolumeDirectoryLink(ctx, ltr)
case LocalVolumeDirectoryBindMounted:
l.cleanupLocalVolumeDirectoryBindMounted(ctx, ltr)
case LocalVolumeDirectoryLinkBindMounted:
l.cleanupLocalVolumeDirectoryLinkBindMounted(ctx, ltr)
case LocalVolumeTmpfs:
l.cleanupLocalVolumeTmpfs(ctx, ltr)
case LocalVolumeBlock:
l.cleanupLocalVolumeBlock(ctx, ltr)
case LocalVolumeBlockFS:
l.cleanupLocalVolumeBlockFS(ctx, ltr)
case LocalVolumeGCELocalSSD:
l.cleanupLocalVolumeGCELocalSSD(ctx, ltr)
default:
framework.Failf("Failed to remove local test resource, unsupported volume type: %v is specified", ltr.VolumeType)
}
return
}

View File

@ -0,0 +1,176 @@
/*
Copyright 2020 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"context"
"fmt"
"io"
"os"
"path"
"regexp"
"strings"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
e2enode "k8s.io/kubernetes/test/e2e/framework/node"
e2essh "k8s.io/kubernetes/test/e2e/framework/ssh"
"k8s.io/kubernetes/test/e2e/storage/podlogs"
)
// StartPodLogs begins capturing log output and events from current
// and future pods running in the namespace of the framework. That
// ends when the returned cleanup function is called.
//
// The output goes to log files (when using --report-dir, as in the
// CI) or the output stream (otherwise).
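//
// Example (hypothetical usage from a Ginkgo spec; driverNamespace is assumed
// to be the namespace that the CSI driver under test was deployed into):
//
//	cleanup := StartPodLogs(ctx, f, driverNamespace)
//	ginkgo.DeferCleanup(cleanup)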
func StartPodLogs(ctx context.Context, f *framework.Framework, driverNamespace *v1.Namespace) func() {
ctx, cancel := context.WithCancel(ctx)
cs := f.ClientSet
ns := driverNamespace.Name
var podEventLog io.Writer = ginkgo.GinkgoWriter
var podEventLogCloser io.Closer
to := podlogs.LogOutput{
StatusWriter: ginkgo.GinkgoWriter,
}
if framework.TestContext.ReportDir == "" {
to.LogWriter = ginkgo.GinkgoWriter
} else {
test := ginkgo.CurrentSpecReport()
// Clean up each individual component text such that
// it contains only characters that are valid as file
// name.
reg := regexp.MustCompile("[^a-zA-Z0-9_-]+")
var testName []string
for _, text := range test.ContainerHierarchyTexts {
testName = append(testName, reg.ReplaceAllString(text, "_"))
if len(test.LeafNodeText) > 0 {
testName = append(testName, reg.ReplaceAllString(test.LeafNodeText, "_"))
}
}
// We end the prefix with a slash to ensure that all logs
// end up in a directory named after the current test.
//
// Each component name maps to a directory. This
// avoids cluttering the root artifact directory and
// keeps each directory name smaller (the full test
// name at one point exceeded 256 characters, which was
// too much for some filesystems).
logDir := framework.TestContext.ReportDir + "/" + strings.Join(testName, "/")
to.LogPathPrefix = logDir + "/"
err := os.MkdirAll(logDir, 0755)
framework.ExpectNoError(err, "create pod log directory")
f, err := os.Create(path.Join(logDir, "pod-event.log"))
framework.ExpectNoError(err, "create pod events log file")
podEventLog = f
podEventLogCloser = f
}
podlogs.CopyAllLogs(ctx, cs, ns, to)
// The framework doesn't know about the driver pods because of
// the separate namespace. Therefore we always capture the
// events ourselves.
podlogs.WatchPods(ctx, cs, ns, podEventLog, podEventLogCloser)
return cancel
}
// KubeletCommand performs `start`, `restart`, or `stop` on the kubelet running on the node of the target pod and waits
// for the desired state.
// Allowed kubeletOps are `KStart`, `KStop`, and `KRestart`.
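//
// Example (hypothetical): restart the kubelet on the node that runs clientPod
// and wait until that node reports Ready again:
//
//	KubeletCommand(ctx, KRestart, c, clientPod)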
func KubeletCommand(ctx context.Context, kOp KubeletOpt, c clientset.Interface, pod *v1.Pod) {
nodeIP, err := getHostAddress(ctx, c, pod)
framework.ExpectNoError(err)
nodeIP = nodeIP + ":22"
commandTemplate := "systemctl %s kubelet"
sudoPresent := isSudoPresent(ctx, nodeIP, framework.TestContext.Provider)
if sudoPresent {
commandTemplate = "sudo " + commandTemplate
}
runCmd := func(cmd string) {
command := fmt.Sprintf(commandTemplate, cmd)
framework.Logf("Attempting `%s`", command)
sshResult, err := e2essh.SSH(ctx, command, nodeIP, framework.TestContext.Provider)
framework.ExpectNoError(err, fmt.Sprintf("SSH to Node %q errored.", pod.Spec.NodeName))
e2essh.LogResult(sshResult)
gomega.Expect(sshResult.Code).To(gomega.BeZero(), "Failed to [%s] kubelet:\n%#v", cmd, sshResult)
}
if kOp == KStop || kOp == KRestart {
runCmd("stop")
}
if kOp == KStop {
return
}
if kOp == KStart && getKubeletRunning(ctx, nodeIP) {
framework.Logf("Kubelet is already running on node %q", pod.Spec.NodeName)
// Just skip. Or we cannot get a new heartbeat in time.
return
}
node, err := c.CoreV1().Nodes().Get(ctx, pod.Spec.NodeName, metav1.GetOptions{})
framework.ExpectNoError(err)
heartbeatTime := e2enode.GetNodeHeartbeatTime(node)
runCmd("start")
// Wait for next heartbeat, which must be sent by the new kubelet process.
e2enode.WaitForNodeHeartbeatAfter(ctx, c, pod.Spec.NodeName, heartbeatTime, NodeStateTimeout)
// Then wait until Node with new process becomes Ready.
if ok := e2enode.WaitForNodeToBeReady(ctx, c, pod.Spec.NodeName, NodeStateTimeout); !ok {
framework.Failf("Node %s failed to enter Ready state", pod.Spec.NodeName)
}
}
// getHostAddress gets the node for a pod and returns its first external IP
// address, falling back to the first internal IP address. Returns an error if
// the node the pod is on doesn't have any address.
func getHostAddress(ctx context.Context, client clientset.Interface, p *v1.Pod) (string, error) {
node, err := client.CoreV1().Nodes().Get(ctx, p.Spec.NodeName, metav1.GetOptions{})
if err != nil {
return "", err
}
// Try externalAddress first
for _, address := range node.Status.Addresses {
if address.Type == v1.NodeExternalIP {
if address.Address != "" {
return address.Address, nil
}
}
}
// If no externalAddress found, try internalAddress
for _, address := range node.Status.Addresses {
if address.Type == v1.NodeInternalIP {
if address.Address != "" {
return address.Address, nil
}
}
}
// If not found, return error
return "", fmt.Errorf("No address for pod %v on node %v",
p.Name, p.Spec.NodeName)
}

View File

@ -0,0 +1,149 @@
/*
Copyright 2020 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"context"
"fmt"
"time"
"github.com/onsi/ginkgo/v2"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apiserver/pkg/storage/names"
"k8s.io/client-go/dynamic"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
// SnapshotGroup is the snapshot CRD api group
SnapshotGroup = "snapshot.storage.k8s.io"
// SnapshotAPIVersion is the snapshot CRD api version
SnapshotAPIVersion = "snapshot.storage.k8s.io/v1"
)
var (
// SnapshotGVR is GroupVersionResource for volumesnapshots
SnapshotGVR = schema.GroupVersionResource{Group: SnapshotGroup, Version: "v1", Resource: "volumesnapshots"}
// SnapshotClassGVR is GroupVersionResource for volumesnapshotclasses
SnapshotClassGVR = schema.GroupVersionResource{Group: SnapshotGroup, Version: "v1", Resource: "volumesnapshotclasses"}
// SnapshotContentGVR is GroupVersionResource for volumesnapshotcontents
SnapshotContentGVR = schema.GroupVersionResource{Group: SnapshotGroup, Version: "v1", Resource: "volumesnapshotcontents"}
)
// WaitForSnapshotReady waits for a VolumeSnapshot to be ready to use or until timeout occurs, whichever comes first.
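//
// Example (hypothetical; the snapshot name and timeout are placeholders):
//
//	err := WaitForSnapshotReady(ctx, dc, f.Namespace.Name, "my-snapshot", framework.Poll, 5*time.Minute)
//	framework.ExpectNoError(err, "waiting for snapshot to become ready")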
func WaitForSnapshotReady(ctx context.Context, c dynamic.Interface, ns string, snapshotName string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for VolumeSnapshot %s to become ready", timeout, snapshotName)
if successful := WaitUntil(poll, timeout, func() bool {
snapshot, err := c.Resource(SnapshotGVR).Namespace(ns).Get(ctx, snapshotName, metav1.GetOptions{})
if err != nil {
framework.Logf("Failed to get snapshot %q, retrying in %v. Error: %v", snapshotName, poll, err)
return false
}
status := snapshot.Object["status"]
if status == nil {
framework.Logf("VolumeSnapshot %s found but is not ready.", snapshotName)
return false
}
value := status.(map[string]interface{})
if value["readyToUse"] == true {
framework.Logf("VolumeSnapshot %s found and is ready", snapshotName)
return true
}
framework.Logf("VolumeSnapshot %s found but is not ready.", snapshotName)
return false
}); successful {
return nil
}
return fmt.Errorf("VolumeSnapshot %s is not ready within %v", snapshotName, timeout)
}
// GetSnapshotContentFromSnapshot returns the VolumeSnapshotContent object bound to a
// given VolumeSnapshot
func GetSnapshotContentFromSnapshot(ctx context.Context, dc dynamic.Interface, snapshot *unstructured.Unstructured, timeout time.Duration) *unstructured.Unstructured {
defer ginkgo.GinkgoRecover()
err := WaitForSnapshotReady(ctx, dc, snapshot.GetNamespace(), snapshot.GetName(), framework.Poll, timeout)
framework.ExpectNoError(err)
vs, err := dc.Resource(SnapshotGVR).Namespace(snapshot.GetNamespace()).Get(ctx, snapshot.GetName(), metav1.GetOptions{})
snapshotStatus := vs.Object["status"].(map[string]interface{})
snapshotContentName := snapshotStatus["boundVolumeSnapshotContentName"].(string)
framework.Logf("received snapshotStatus %v", snapshotStatus)
framework.Logf("snapshotContentName %s", snapshotContentName)
framework.ExpectNoError(err)
vscontent, err := dc.Resource(SnapshotContentGVR).Get(ctx, snapshotContentName, metav1.GetOptions{})
framework.ExpectNoError(err)
return vscontent
}
// DeleteSnapshotWithoutWaiting deletes a VolumeSnapshot and returns immediately without waiting for it to be deleted
func DeleteSnapshotWithoutWaiting(ctx context.Context, dc dynamic.Interface, ns string, snapshotName string) error {
ginkgo.By("deleting the snapshot")
err := dc.Resource(SnapshotGVR).Namespace(ns).Delete(ctx, snapshotName, metav1.DeleteOptions{})
if err != nil && !apierrors.IsNotFound(err) {
return err
}
return nil
}
// DeleteAndWaitSnapshot deletes a VolumeSnapshot and waits for it to be deleted or until timeout occurs, whichever comes first
func DeleteAndWaitSnapshot(ctx context.Context, dc dynamic.Interface, ns string, snapshotName string, poll, timeout time.Duration) error {
var err error
err = DeleteSnapshotWithoutWaiting(ctx, dc, ns, snapshotName)
if err != nil {
return err
}
ginkgo.By("checking the Snapshot has been deleted")
err = WaitForNamespacedGVRDeletion(ctx, dc, SnapshotGVR, ns, snapshotName, poll, timeout)
return err
}
// GenerateSnapshotClassSpec constructs a new SnapshotClass instance spec
// with a unique name that is based on namespace + suffix.
func GenerateSnapshotClassSpec(
snapshotter string,
parameters map[string]string,
ns string,
) *unstructured.Unstructured {
snapshotClass := &unstructured.Unstructured{
Object: map[string]interface{}{
"kind": "VolumeSnapshotClass",
"apiVersion": SnapshotAPIVersion,
"metadata": map[string]interface{}{
// Name must be unique, so let's base it on namespace name and use GenerateName
"name": names.SimpleNameGenerator.GenerateName(ns),
},
"driver": snapshotter,
"parameters": parameters,
"deletionPolicy": "Delete",
},
}
return snapshotClass
}

View File

@ -0,0 +1,825 @@
/*
Copyright 2017 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"context"
"crypto/sha256"
"encoding/base64"
"fmt"
"math"
"math/rand"
"path/filepath"
"strconv"
"strings"
"time"
"github.com/onsi/ginkgo/v2"
"github.com/onsi/gomega"
v1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/client-go/dynamic"
clientset "k8s.io/client-go/kubernetes"
"k8s.io/kubernetes/test/e2e/framework"
e2epod "k8s.io/kubernetes/test/e2e/framework/pod"
e2essh "k8s.io/kubernetes/test/e2e/framework/ssh"
e2evolume "k8s.io/kubernetes/test/e2e/framework/volume"
imageutils "k8s.io/kubernetes/test/utils/image"
)
// KubeletOpt type definition
type KubeletOpt string
const (
// NodeStateTimeout defines the timeout for a node to reach the desired state
NodeStateTimeout = 1 * time.Minute
// KStart defines start value
KStart KubeletOpt = "start"
// KStop defines stop value
KStop KubeletOpt = "stop"
// KRestart defines restart value
KRestart KubeletOpt = "restart"
minValidSize string = "1Ki"
maxValidSize string = "10Ei"
)
// VerifyFSGroupInPod verifies that the passed in filePath contains the expectedFSGroup
func VerifyFSGroupInPod(f *framework.Framework, filePath, expectedFSGroup string, pod *v1.Pod) {
cmd := fmt.Sprintf("ls -l %s", filePath)
stdout, stderr, err := e2evolume.PodExec(f, pod, cmd)
framework.ExpectNoError(err)
framework.Logf("pod %s/%s exec for cmd %s, stdout: %s, stderr: %s", pod.Namespace, pod.Name, cmd, stdout, stderr)
fsGroupResult := strings.Fields(stdout)[3]
gomega.Expect(expectedFSGroup).To(gomega.Equal(fsGroupResult), "Expected fsGroup of %s, got %s", expectedFSGroup, fsGroupResult)
}
// getKubeletRunning returns whether the kubelet is running on the given node
func getKubeletRunning(ctx context.Context, nodeIP string) bool {
command := "systemctl show kubelet --property ActiveState --value"
framework.Logf("Attempting `%s`", command)
sshResult, err := e2essh.SSH(ctx, command, nodeIP, framework.TestContext.Provider)
framework.ExpectNoError(err, fmt.Sprintf("SSH to Node %q errored.", nodeIP))
e2essh.LogResult(sshResult)
gomega.Expect(sshResult.Code).To(gomega.BeZero(), "Failed to get kubelet status")
gomega.Expect(sshResult.Stdout).NotTo(gomega.BeEmpty(), "Kubelet status should not be Empty")
return strings.TrimSpace(sshResult.Stdout) == "active"
}
// TestKubeletRestartsAndRestoresMount tests that a volume mounted to a pod remains mounted after a kubelet restarts
func TestKubeletRestartsAndRestoresMount(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, volumePath string) {
byteLen := 64
seed := time.Now().UTC().UnixNano()
ginkgo.By("Writing to the volume.")
CheckWriteToPath(f, clientPod, v1.PersistentVolumeFilesystem, false, volumePath, byteLen, seed)
ginkgo.By("Restarting kubelet")
KubeletCommand(ctx, KRestart, c, clientPod)
ginkgo.By("Wait 20s for the volume to become stable")
time.Sleep(20 * time.Second)
ginkgo.By("Testing that written file is accessible.")
CheckReadFromPath(f, clientPod, v1.PersistentVolumeFilesystem, false, volumePath, byteLen, seed)
framework.Logf("Volume mount detected on pod %s and written file %s is readable post-restart.", clientPod.Name, volumePath)
}
// TestKubeletRestartsAndRestoresMap tests that a volume mapped to a pod remains mapped after a kubelet restarts
func TestKubeletRestartsAndRestoresMap(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, volumePath string) {
byteLen := 64
seed := time.Now().UTC().UnixNano()
ginkgo.By("Writing to the volume.")
CheckWriteToPath(f, clientPod, v1.PersistentVolumeBlock, false, volumePath, byteLen, seed)
ginkgo.By("Restarting kubelet")
KubeletCommand(ctx, KRestart, c, clientPod)
ginkgo.By("Wait 20s for the volume to become stable")
time.Sleep(20 * time.Second)
ginkgo.By("Testing that written pv is accessible.")
CheckReadFromPath(f, clientPod, v1.PersistentVolumeBlock, false, volumePath, byteLen, seed)
framework.Logf("Volume map detected on pod %s and written data %s is readable post-restart.", clientPod.Name, volumePath)
}
// TestVolumeUnmountsFromDeletedPodWithForceOption tests that a volume unmounts if the client pod was deleted while the kubelet was down.
// forceDelete indicates whether the pod is forcefully deleted.
// checkSubpath indicates whether the subpath should be checked.
// If secondPod is set, it is started when kubelet is down to check that the volume is usable while the old pod is being deleted and the new pod is starting.
func TestVolumeUnmountsFromDeletedPodWithForceOption(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, forceDelete bool, checkSubpath bool, secondPod *v1.Pod, volumePath string) {
nodeIP, err := getHostAddress(ctx, c, clientPod)
framework.ExpectNoError(err)
nodeIP = nodeIP + ":22"
ginkgo.By("Expecting the volume mount to be found.")
result, err := e2essh.SSH(ctx, fmt.Sprintf("mount | grep %s | grep -v volume-subpaths", clientPod.UID), nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Code).To(gomega.Equal(0), fmt.Sprintf("Expected grep exit code of 0, got %d", result.Code))
if checkSubpath {
ginkgo.By("Expecting the volume subpath mount to be found.")
result, err := e2essh.SSH(ctx, fmt.Sprintf("cat /proc/self/mountinfo | grep %s | grep volume-subpaths", clientPod.UID), nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Code).To(gomega.Equal(0), fmt.Sprintf("Expected grep exit code of 0, got %d", result.Code))
}
ginkgo.By("Writing to the volume.")
byteLen := 64
seed := time.Now().UTC().UnixNano()
CheckWriteToPath(f, clientPod, v1.PersistentVolumeFilesystem, false, volumePath, byteLen, seed)
// This ensures that the kubelet is started again after the test finishes, no matter whether the test fails or not.
ginkgo.DeferCleanup(KubeletCommand, KStart, c, clientPod)
ginkgo.By("Stopping the kubelet.")
KubeletCommand(ctx, KStop, c, clientPod)
if secondPod != nil {
ginkgo.By("Starting the second pod")
_, err = c.CoreV1().Pods(clientPod.Namespace).Create(context.TODO(), secondPod, metav1.CreateOptions{})
framework.ExpectNoError(err, "when starting the second pod")
}
ginkgo.By(fmt.Sprintf("Deleting Pod %q", clientPod.Name))
if forceDelete {
err = c.CoreV1().Pods(clientPod.Namespace).Delete(ctx, clientPod.Name, *metav1.NewDeleteOptions(0))
} else {
err = c.CoreV1().Pods(clientPod.Namespace).Delete(ctx, clientPod.Name, metav1.DeleteOptions{})
}
framework.ExpectNoError(err)
ginkgo.By("Starting the kubelet and waiting for pod to delete.")
KubeletCommand(ctx, KStart, c, clientPod)
err = e2epod.WaitForPodNotFoundInNamespace(ctx, f.ClientSet, clientPod.Name, f.Namespace.Name, f.Timeouts.PodDelete)
framework.ExpectNoError(err, "Expected pod to be not found.")
if forceDelete {
// With forceDelete, since pods are immediately deleted from API server, there is no way to be sure when volumes are torn down
// so wait some time to finish
time.Sleep(30 * time.Second)
}
if secondPod != nil {
ginkgo.By("Waiting for the second pod.")
err = e2epod.WaitForPodRunningInNamespace(ctx, c, secondPod)
framework.ExpectNoError(err, "while waiting for the second pod Running")
ginkgo.By("Getting the second pod uuid.")
secondPod, err := c.CoreV1().Pods(secondPod.Namespace).Get(context.TODO(), secondPod.Name, metav1.GetOptions{})
framework.ExpectNoError(err, "getting the second UID")
ginkgo.By("Expecting the volume mount to be found in the second pod.")
result, err := e2essh.SSH(ctx, fmt.Sprintf("mount | grep %s | grep -v volume-subpaths", secondPod.UID), nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error when checking the second pod.")
gomega.Expect(result.Code).To(gomega.Equal(0), fmt.Sprintf("Expected grep exit code of 0, got %d", result.Code))
ginkgo.By("Testing that written file is accessible in the second pod.")
CheckReadFromPath(f, secondPod, v1.PersistentVolumeFilesystem, false, volumePath, byteLen, seed)
err = c.CoreV1().Pods(secondPod.Namespace).Delete(context.TODO(), secondPod.Name, metav1.DeleteOptions{})
framework.ExpectNoError(err, "when deleting the second pod")
err = e2epod.WaitForPodNotFoundInNamespace(ctx, f.ClientSet, secondPod.Name, f.Namespace.Name, f.Timeouts.PodDelete)
framework.ExpectNoError(err, "when waiting for the second pod to disappear")
}
ginkgo.By("Expecting the volume mount not to be found.")
result, err = e2essh.SSH(ctx, fmt.Sprintf("mount | grep %s | grep -v volume-subpaths", clientPod.UID), nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Stdout).To(gomega.BeEmpty(), "Expected grep stdout to be empty (i.e. no mount found).")
framework.Logf("Volume unmounted on node %s", clientPod.Spec.NodeName)
if checkSubpath {
ginkgo.By("Expecting the volume subpath mount not to be found.")
result, err = e2essh.SSH(ctx, fmt.Sprintf("cat /proc/self/mountinfo | grep %s | grep volume-subpaths", clientPod.UID), nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Stdout).To(gomega.BeEmpty(), "Expected grep stdout to be empty (i.e. no subpath mount found).")
framework.Logf("Subpath volume unmounted on node %s", clientPod.Spec.NodeName)
}
}
// TestVolumeUnmountsFromDeletedPod tests that a volume unmounts if the client pod was deleted while the kubelet was down.
func TestVolumeUnmountsFromDeletedPod(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, volumePath string) {
TestVolumeUnmountsFromDeletedPodWithForceOption(ctx, c, f, clientPod, false, false, nil, volumePath)
}
// TestVolumeUnmountsFromForceDeletedPod tests that a volume unmounts if the client pod was forcefully deleted while the kubelet was down.
func TestVolumeUnmountsFromForceDeletedPod(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, volumePath string) {
TestVolumeUnmountsFromDeletedPodWithForceOption(ctx, c, f, clientPod, true, false, nil, volumePath)
}
// TestVolumeUnmapsFromDeletedPodWithForceOption tests that a volume unmaps if the client pod was deleted while the kubelet was down.
// forceDelete indicates whether the pod is forcefully deleted.
func TestVolumeUnmapsFromDeletedPodWithForceOption(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, forceDelete bool, devicePath string) {
nodeIP, err := getHostAddress(ctx, c, clientPod)
framework.ExpectNoError(err, "Failed to get nodeIP.")
nodeIP = nodeIP + ":22"
// Creating command to check whether path exists
podDirectoryCmd := fmt.Sprintf("ls /var/lib/kubelet/pods/%s/volumeDevices/*/ | grep '.'", clientPod.UID)
if isSudoPresent(ctx, nodeIP, framework.TestContext.Provider) {
podDirectoryCmd = fmt.Sprintf("sudo sh -c \"%s\"", podDirectoryCmd)
}
// Directories in the global directory have unpredictable names, however, device symlinks
// have the same name as pod.UID. So just find anything with pod.UID name.
globalBlockDirectoryCmd := fmt.Sprintf("find /var/lib/kubelet/plugins -name %s", clientPod.UID)
if isSudoPresent(ctx, nodeIP, framework.TestContext.Provider) {
globalBlockDirectoryCmd = fmt.Sprintf("sudo sh -c \"%s\"", globalBlockDirectoryCmd)
}
ginkgo.By("Expecting the symlinks from PodDeviceMapPath to be found.")
result, err := e2essh.SSH(ctx, podDirectoryCmd, nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Code).To(gomega.Equal(0), fmt.Sprintf("Expected grep exit code of 0, got %d", result.Code))
ginkgo.By("Expecting the symlinks from global map path to be found.")
result, err = e2essh.SSH(ctx, globalBlockDirectoryCmd, nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Code).To(gomega.Equal(0), fmt.Sprintf("Expected find exit code of 0, got %d", result.Code))
// This ensures that the kubelet is started again after the test finishes, no matter whether the test fails or not.
ginkgo.DeferCleanup(KubeletCommand, KStart, c, clientPod)
ginkgo.By("Stopping the kubelet.")
KubeletCommand(ctx, KStop, c, clientPod)
ginkgo.By(fmt.Sprintf("Deleting Pod %q", clientPod.Name))
if forceDelete {
err = c.CoreV1().Pods(clientPod.Namespace).Delete(ctx, clientPod.Name, *metav1.NewDeleteOptions(0))
} else {
err = c.CoreV1().Pods(clientPod.Namespace).Delete(ctx, clientPod.Name, metav1.DeleteOptions{})
}
framework.ExpectNoError(err, "Failed to delete pod.")
ginkgo.By("Starting the kubelet and waiting for pod to delete.")
KubeletCommand(ctx, KStart, c, clientPod)
err = e2epod.WaitForPodNotFoundInNamespace(ctx, f.ClientSet, clientPod.Name, f.Namespace.Name, f.Timeouts.PodDelete)
framework.ExpectNoError(err, "Expected pod to be not found.")
if forceDelete {
// With forceDelete, since pods are immediately deleted from API server, there is no way to be sure when volumes are torn down
// so wait some time to finish
time.Sleep(30 * time.Second)
}
ginkgo.By("Expecting the symlink from PodDeviceMapPath not to be found.")
result, err = e2essh.SSH(ctx, podDirectoryCmd, nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Stdout).To(gomega.BeEmpty(), "Expected grep stdout to be empty.")
ginkgo.By("Expecting the symlinks from global map path not to be found.")
result, err = e2essh.SSH(ctx, globalBlockDirectoryCmd, nodeIP, framework.TestContext.Provider)
e2essh.LogResult(result)
framework.ExpectNoError(err, "Encountered SSH error.")
gomega.Expect(result.Stdout).To(gomega.BeEmpty(), "Expected find stdout to be empty.")
framework.Logf("Volume unmaped on node %s", clientPod.Spec.NodeName)
}
// TestVolumeUnmapsFromDeletedPod tests that a volume unmaps if the client pod was deleted while the kubelet was down.
func TestVolumeUnmapsFromDeletedPod(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, devicePath string) {
TestVolumeUnmapsFromDeletedPodWithForceOption(ctx, c, f, clientPod, false, devicePath)
}
// TestVolumeUnmapsFromForceDeletedPod tests that a volume unmaps if the client pod was forcefully deleted while the kubelet was down.
func TestVolumeUnmapsFromForceDeletedPod(ctx context.Context, c clientset.Interface, f *framework.Framework, clientPod *v1.Pod, devicePath string) {
TestVolumeUnmapsFromDeletedPodWithForceOption(ctx, c, f, clientPod, true, devicePath)
}
// RunInPodWithVolume runs a command in a pod with given claim mounted to /mnt directory.
func RunInPodWithVolume(ctx context.Context, c clientset.Interface, t *framework.TimeoutContext, ns, claimName, command string) {
pod := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
GenerateName: "pvc-volume-tester-",
},
Spec: v1.PodSpec{
Containers: []v1.Container{
{
Name: "volume-tester",
Image: imageutils.GetE2EImage(imageutils.BusyBox),
Command: []string{"/bin/sh"},
Args: []string{"-c", command},
VolumeMounts: []v1.VolumeMount{
{
Name: "my-volume",
MountPath: "/mnt/test",
},
},
},
},
RestartPolicy: v1.RestartPolicyNever,
Volumes: []v1.Volume{
{
Name: "my-volume",
VolumeSource: v1.VolumeSource{
PersistentVolumeClaim: &v1.PersistentVolumeClaimVolumeSource{
ClaimName: claimName,
ReadOnly: false,
},
},
},
},
},
}
pod, err := c.CoreV1().Pods(ns).Create(ctx, pod, metav1.CreateOptions{})
framework.ExpectNoError(err, "Failed to create pod: %v", err)
ginkgo.DeferCleanup(e2epod.DeletePodOrFail, c, ns, pod.Name)
framework.ExpectNoError(e2epod.WaitForPodSuccessInNamespaceTimeout(ctx, c, pod.Name, pod.Namespace, t.PodStartSlow))
}
// StartExternalProvisioner create external provisioner pod
func StartExternalProvisioner(ctx context.Context, c clientset.Interface, ns string, externalPluginName string) *v1.Pod {
podClient := c.CoreV1().Pods(ns)
provisionerPod := &v1.Pod{
TypeMeta: metav1.TypeMeta{
Kind: "Pod",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
GenerateName: "external-provisioner-",
},
Spec: v1.PodSpec{
Containers: []v1.Container{
{
Name: "nfs-provisioner",
Image: imageutils.GetE2EImage(imageutils.NFSProvisioner),
SecurityContext: &v1.SecurityContext{
Capabilities: &v1.Capabilities{
Add: []v1.Capability{"DAC_READ_SEARCH"},
},
},
Args: []string{
"-provisioner=" + externalPluginName,
"-grace-period=0",
},
Ports: []v1.ContainerPort{
{Name: "nfs", ContainerPort: 2049},
{Name: "mountd", ContainerPort: 20048},
{Name: "rpcbind", ContainerPort: 111},
{Name: "rpcbind-udp", ContainerPort: 111, Protocol: v1.ProtocolUDP},
},
Env: []v1.EnvVar{
{
Name: "POD_IP",
ValueFrom: &v1.EnvVarSource{
FieldRef: &v1.ObjectFieldSelector{
FieldPath: "status.podIP",
},
},
},
},
ImagePullPolicy: v1.PullIfNotPresent,
VolumeMounts: []v1.VolumeMount{
{
Name: "export-volume",
MountPath: "/export",
},
},
},
},
Volumes: []v1.Volume{
{
Name: "export-volume",
VolumeSource: v1.VolumeSource{
EmptyDir: &v1.EmptyDirVolumeSource{},
},
},
},
},
}
provisionerPod, err := podClient.Create(ctx, provisionerPod, metav1.CreateOptions{})
framework.ExpectNoError(err, "Failed to create %s pod: %v", provisionerPod.Name, err)
framework.ExpectNoError(e2epod.WaitForPodRunningInNamespace(ctx, c, provisionerPod))
ginkgo.By("locating the provisioner pod")
pod, err := podClient.Get(ctx, provisionerPod.Name, metav1.GetOptions{})
framework.ExpectNoError(err, "Cannot locate the provisioner pod %v: %v", provisionerPod.Name, err)
return pod
}
func isSudoPresent(ctx context.Context, nodeIP string, provider string) bool {
framework.Logf("Checking if sudo command is present")
sshResult, err := e2essh.SSH(ctx, "sudo --version", nodeIP, provider)
framework.ExpectNoError(err, "SSH to %q errored.", nodeIP)
return !strings.Contains(sshResult.Stderr, "command not found")
}
// CheckReadWriteToPath checks that the path can be read and written
func CheckReadWriteToPath(f *framework.Framework, pod *v1.Pod, volMode v1.PersistentVolumeMode, path string) {
if volMode == v1.PersistentVolumeBlock {
// random -> file1
e2evolume.VerifyExecInPodSucceed(f, pod, "dd if=/dev/urandom of=/tmp/file1 bs=64 count=1")
// file1 -> dev (write to dev)
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("dd if=/tmp/file1 of=%s bs=64 count=1", path))
// dev -> file2 (read from dev)
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("dd if=%s of=/tmp/file2 bs=64 count=1", path))
// file1 == file2 (check contents)
e2evolume.VerifyExecInPodSucceed(f, pod, "diff /tmp/file1 /tmp/file2")
// Clean up temp files
e2evolume.VerifyExecInPodSucceed(f, pod, "rm -f /tmp/file1 /tmp/file2")
// Check that writing file to block volume fails
e2evolume.VerifyExecInPodFail(f, pod, fmt.Sprintf("echo 'Hello world.' > %s/file1.txt", path), 1)
} else {
// text -> file1 (write to file)
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("echo 'Hello world.' > %s/file1.txt", path))
// grep file1 (read from file and check contents)
e2evolume.VerifyExecInPodSucceed(f, pod, readFile("Hello world.", path))
// Check that writing to directory as block volume fails
e2evolume.VerifyExecInPodFail(f, pod, fmt.Sprintf("dd if=/dev/urandom of=%s bs=64 count=1", path), 1)
}
}
func readFile(content, path string) string {
if framework.NodeOSDistroIs("windows") {
return fmt.Sprintf("Select-String '%s' %s/file1.txt", content, path)
}
return fmt.Sprintf("grep 'Hello world.' %s/file1.txt", path)
}
// genBinDataFromSeed generates binary data of the given length from the given random seed
func genBinDataFromSeed(len int, seed int64) []byte {
binData := make([]byte, len)
rand.Seed(seed)
_, err := rand.Read(binData)
if err != nil {
fmt.Printf("Error: %v\n", err)
}
return binData
}
// CheckReadFromPath validates that the file can be properly read.
//
// Note: directIO does not work with (default) BusyBox Pods. A requirement for
// directIO to function correctly is to read whole sector(s) for Block-mode
// PVCs (normally a sector is 512 bytes), or memory pages for files (commonly
// 4096 bytes).
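//
// Example (hypothetical; devicePath and seed are placeholders for a block
// device path and the seed used when the data was written):
//
//	CheckReadFromPath(f, pod, v1.PersistentVolumeBlock, true, devicePath, 4096, seed)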
func CheckReadFromPath(f *framework.Framework, pod *v1.Pod, volMode v1.PersistentVolumeMode, directIO bool, path string, len int, seed int64) {
var pathForVolMode string
var iflag string
if volMode == v1.PersistentVolumeBlock {
pathForVolMode = path
} else {
pathForVolMode = filepath.Join(path, "file1.txt")
}
if directIO {
iflag = "iflag=direct"
}
sum := sha256.Sum256(genBinDataFromSeed(len, seed))
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("dd if=%s %s bs=%d count=1 | sha256sum", pathForVolMode, iflag, len))
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("dd if=%s %s bs=%d count=1 | sha256sum | grep -Fq %x", pathForVolMode, iflag, len, sum))
}
// CheckWriteToPath checks that the file can be properly written.
//
// Note: nocache does not work with (default) BusyBox Pods. To read without
// caching, enable directIO with CheckReadFromPath and check the hints about
// the len requirements.
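//
// Example (hypothetical): write 4096 seeded random bytes to a file on a
// filesystem volume mounted at /mnt/test:
//
//	seed := time.Now().UTC().UnixNano()
//	CheckWriteToPath(f, pod, v1.PersistentVolumeFilesystem, false, "/mnt/test", 4096, seed)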
func CheckWriteToPath(f *framework.Framework, pod *v1.Pod, volMode v1.PersistentVolumeMode, nocache bool, path string, len int, seed int64) {
var pathForVolMode string
var oflag string
if volMode == v1.PersistentVolumeBlock {
pathForVolMode = path
} else {
pathForVolMode = filepath.Join(path, "file1.txt")
}
if nocache {
oflag = "oflag=nocache"
}
encoded := base64.StdEncoding.EncodeToString(genBinDataFromSeed(len, seed))
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("echo %s | base64 -d | sha256sum", encoded))
e2evolume.VerifyExecInPodSucceed(f, pod, fmt.Sprintf("echo %s | base64 -d | dd of=%s %s bs=%d count=1", encoded, pathForVolMode, oflag, len))
}
// GetSectorSize returns the sector size of the device.
func GetSectorSize(f *framework.Framework, pod *v1.Pod, device string) int {
stdout, _, err := e2evolume.PodExec(f, pod, fmt.Sprintf("blockdev --getss %s", device))
framework.ExpectNoError(err, "Failed to get sector size of %s", device)
ss, err := strconv.Atoi(stdout)
framework.ExpectNoError(err, "Sector size returned by blockdev command isn't integer value.")
return ss
}
// findMountPoints returns all mount points on given node under specified directory.
func findMountPoints(ctx context.Context, hostExec HostExec, node *v1.Node, dir string) []string {
result, err := hostExec.IssueCommandWithResult(ctx, fmt.Sprintf(`find %s -type d -exec mountpoint {} \; | grep 'is a mountpoint$' || true`, dir), node)
framework.ExpectNoError(err, "Encountered HostExec error.")
var mountPoints []string
if err == nil {
for _, line := range strings.Split(result, "\n") {
if line == "" {
continue
}
mountPoints = append(mountPoints, strings.TrimSuffix(line, " is a mountpoint"))
}
}
return mountPoints
}
// FindVolumeGlobalMountPoints returns all volume global mount points on the node of given pod.
func FindVolumeGlobalMountPoints(ctx context.Context, hostExec HostExec, node *v1.Node) sets.String {
return sets.NewString(findMountPoints(ctx, hostExec, node, "/var/lib/kubelet/plugins")...)
}
// CreateDriverNamespace creates a namespace for CSI driver installation.
// The namespace is still tracked and ensured to be deleted when the test terminates.
func CreateDriverNamespace(ctx context.Context, f *framework.Framework) *v1.Namespace {
ginkgo.By(fmt.Sprintf("Building a driver namespace object, basename %s", f.Namespace.Name))
// The driver namespace will be bound to the test namespace in the prefix
namespace, err := f.CreateNamespace(ctx, f.Namespace.Name, map[string]string{
"e2e-framework": f.BaseName,
"e2e-test-namespace": f.Namespace.Name,
})
framework.ExpectNoError(err)
if framework.TestContext.VerifyServiceAccount {
ginkgo.By("Waiting for a default service account to be provisioned in namespace")
err = framework.WaitForDefaultServiceAccountInNamespace(ctx, f.ClientSet, namespace.Name)
framework.ExpectNoError(err)
} else {
framework.Logf("Skipping waiting for service account")
}
return namespace
}
// WaitForGVRDeletion waits until a non-namespaced object has been deleted
func WaitForGVRDeletion(ctx context.Context, c dynamic.Interface, gvr schema.GroupVersionResource, objectName string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for %s %s to be deleted", timeout, gvr.Resource, objectName)
if successful := WaitUntil(poll, timeout, func() bool {
_, err := c.Resource(gvr).Get(ctx, objectName, metav1.GetOptions{})
if err != nil && apierrors.IsNotFound(err) {
framework.Logf("%s %v is not found and has been deleted", gvr.Resource, objectName)
return true
} else if err != nil {
framework.Logf("Get %s returned an error: %v", objectName, err.Error())
} else {
framework.Logf("%s %v has been found and is not deleted", gvr.Resource, objectName)
}
return false
}); successful {
return nil
}
return fmt.Errorf("%s %s is not deleted within %v", gvr.Resource, objectName, timeout)
}
// EnsureGVRDeletion waits until an object as defined by the group/version/resource and name is no longer found (i.e. it has been deleted) within the given time period
func EnsureGVRDeletion(ctx context.Context, c dynamic.Interface, gvr schema.GroupVersionResource, objectName string, poll, timeout time.Duration, namespace string) error {
var resourceClient dynamic.ResourceInterface
if namespace != "" {
resourceClient = c.Resource(gvr).Namespace(namespace)
} else {
resourceClient = c.Resource(gvr)
}
err := framework.Gomega().Eventually(ctx, func(ctx context.Context) error {
_, err := resourceClient.Get(ctx, objectName, metav1.GetOptions{})
return err
}).WithTimeout(timeout).WithPolling(poll).Should(gomega.MatchError(apierrors.IsNotFound, fmt.Sprintf("failed to delete %s %s", gvr, objectName)))
return err
}
// EnsureNoGVRDeletion checks that an object as defined by the group/version/kind and name has not been deleted during the given time period
func EnsureNoGVRDeletion(ctx context.Context, c dynamic.Interface, gvr schema.GroupVersionResource, objectName string, poll, timeout time.Duration, namespace string) error {
var resourceClient dynamic.ResourceInterface
if namespace != "" {
resourceClient = c.Resource(gvr).Namespace(namespace)
} else {
resourceClient = c.Resource(gvr)
}
err := framework.Gomega().Consistently(ctx, func(ctx context.Context) error {
_, err := resourceClient.Get(ctx, objectName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("failed to get %s %s: %w", gvr.Resource, objectName, err)
}
return nil
}).WithTimeout(timeout).WithPolling(poll).Should(gomega.Succeed())
return err
}
// WaitForNamespacedGVRDeletion waits until a namespaced object has been deleted
func WaitForNamespacedGVRDeletion(ctx context.Context, c dynamic.Interface, gvr schema.GroupVersionResource, ns, objectName string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for %s %s to be deleted", timeout, gvr.Resource, objectName)
if successful := WaitUntil(poll, timeout, func() bool {
_, err := c.Resource(gvr).Namespace(ns).Get(ctx, objectName, metav1.GetOptions{})
if err != nil && apierrors.IsNotFound(err) {
framework.Logf("%s %s is not found in namespace %s and has been deleted", gvr.Resource, objectName, ns)
return true
} else if err != nil {
framework.Logf("Get %s in namespace %s returned an error: %v", objectName, ns, err.Error())
} else {
framework.Logf("%s %s has been found in namespace %s and is not deleted", gvr.Resource, objectName, ns)
}
return false
}); successful {
return nil
}
return fmt.Errorf("%s %s in namespace %s is not deleted within %v", gvr.Resource, objectName, ns, timeout)
}
// WaitUntil runs checkDone until a timeout is reached
func WaitUntil(poll, timeout time.Duration, checkDone func() bool) bool {
// TODO (pohly): replace with gomega.Eventually
for start := time.Now(); time.Since(start) < timeout; time.Sleep(poll) {
if checkDone() {
framework.Logf("WaitUntil finished successfully after %v", time.Since(start))
return true
}
}
framework.Logf("WaitUntil failed after reaching the timeout %v", timeout)
return false
}
// WaitForGVRFinalizer waits until an object from a given GVR contains the given finalizer
// If namespace is empty, assume it is a non-namespaced object
func WaitForGVRFinalizer(ctx context.Context, c dynamic.Interface, gvr schema.GroupVersionResource, objectName, objectNamespace, finalizer string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for object %s %s of resource %s to contain finalizer %s", timeout, objectNamespace, objectName, gvr.Resource, finalizer)
var (
err error
resource *unstructured.Unstructured
)
if successful := WaitUntil(poll, timeout, func() bool {
switch objectNamespace {
case "":
resource, err = c.Resource(gvr).Get(ctx, objectName, metav1.GetOptions{})
default:
resource, err = c.Resource(gvr).Namespace(objectNamespace).Get(ctx, objectName, metav1.GetOptions{})
}
if err != nil {
framework.Logf("Failed to get object %s %s with err: %v. Will retry in %v", objectNamespace, objectName, err, timeout)
return false
}
for _, f := range resource.GetFinalizers() {
if f == finalizer {
return true
}
}
return false
}); successful {
return nil
}
if err == nil {
err = fmt.Errorf("finalizer %s not added to object %s %s of resource %s", finalizer, objectNamespace, objectName, gvr)
}
return err
}
// VerifyFilePathGidInPod verifies the expected GID of the target filepath
func VerifyFilePathGidInPod(f *framework.Framework, filePath, expectedGid string, pod *v1.Pod) {
cmd := fmt.Sprintf("ls -l %s", filePath)
stdout, stderr, err := e2evolume.PodExec(f, pod, cmd)
framework.ExpectNoError(err)
framework.Logf("pod %s/%s exec for cmd %s, stdout: %s, stderr: %s", pod.Namespace, pod.Name, cmd, stdout, stderr)
ll := strings.Fields(stdout)
framework.Logf("stdout split: %v, expected gid: %v", ll, expectedGid)
gomega.Expect(ll[3]).To(gomega.Equal(expectedGid))
}
// ChangeFilePathGidInPod changes the GID of the target filepath.
func ChangeFilePathGidInPod(f *framework.Framework, filePath, targetGid string, pod *v1.Pod) {
cmd := fmt.Sprintf("chgrp %s %s", targetGid, filePath)
_, _, err := e2evolume.PodExec(f, pod, cmd)
framework.ExpectNoError(err)
VerifyFilePathGidInPod(f, filePath, targetGid, pod)
}
// DeleteStorageClass deletes the passed in StorageClass and catches errors other than "Not Found"
func DeleteStorageClass(ctx context.Context, cs clientset.Interface, className string) error {
err := cs.StorageV1().StorageClasses().Delete(ctx, className, metav1.DeleteOptions{})
if err != nil && !apierrors.IsNotFound(err) {
return err
}
return nil
}
// CreateVolumeSource creates a volume source object
func CreateVolumeSource(pvcName string, readOnly bool) *v1.VolumeSource {
return &v1.VolumeSource{
PersistentVolumeClaim: &v1.PersistentVolumeClaimVolumeSource{
ClaimName: pvcName,
ReadOnly: readOnly,
},
}
}
// TryFunc tries to execute the given function and returns an error if the function panics
func TryFunc(f func()) error {
var err error
if f == nil {
return nil
}
defer func() {
if recoverError := recover(); recoverError != nil {
err = fmt.Errorf("%v", recoverError)
}
}()
f()
return err
}
// GetSizeRangesIntersection takes two instances of storage size ranges and determines the
// intersection of the intervals (if it exists) and returns the minimum of the intersection
// to be used as the claim size for the test.
// If a value is not set, there is no minimum or maximum size limitation and a default size is used for it.
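//
// For example (hypothetical values), the intersection of the ranges
// [1Gi, 10Gi] and [5Gi, 1Ti] is [5Gi, 10Gi], so "5Gi" is returned as the
// claim size.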
func GetSizeRangesIntersection(first e2evolume.SizeRange, second e2evolume.SizeRange) (string, error) {
var firstMin, firstMax, secondMin, secondMax resource.Quantity
var err error
//if SizeRange is not set, assign a minimum or maximum size
if len(first.Min) == 0 {
first.Min = minValidSize
}
if len(first.Max) == 0 {
first.Max = maxValidSize
}
if len(second.Min) == 0 {
second.Min = minValidSize
}
if len(second.Max) == 0 {
second.Max = maxValidSize
}
if firstMin, err = resource.ParseQuantity(first.Min); err != nil {
return "", err
}
if firstMax, err = resource.ParseQuantity(first.Max); err != nil {
return "", err
}
if secondMin, err = resource.ParseQuantity(second.Min); err != nil {
return "", err
}
if secondMax, err = resource.ParseQuantity(second.Max); err != nil {
return "", err
}
interSectionStart := math.Max(float64(firstMin.Value()), float64(secondMin.Value()))
intersectionEnd := math.Min(float64(firstMax.Value()), float64(secondMax.Value()))
// the minimum of the intersection shall be returned as the claim size
var intersectionMin resource.Quantity
if intersectionEnd-interSectionStart >= 0 { //have intersection
intersectionMin = *resource.NewQuantity(int64(interSectionStart), "BinarySI") //convert value to BinarySI format. E.g. 5Gi
// return the minimum of the intersection as the claim size
return intersectionMin.String(), nil
}
return "", fmt.Errorf("intersection of size ranges %+v, %+v is null", first, second)
}

View File

@ -0,0 +1,102 @@
/*
Copyright 2024 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package utils
import (
"context"
"fmt"
"time"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apiserver/pkg/storage/names"
"k8s.io/client-go/dynamic"
"k8s.io/kubernetes/test/e2e/framework"
)
const (
// VolumeGroupSnapshotAPIGroup is the group snapshot CRD API group
VolumeGroupSnapshotAPIGroup = "groupsnapshot.storage.k8s.io"
// VolumeGroupSnapshotAPIVersion is the group snapshot api version
VolumeGroupSnapshotAPIVersion = "groupsnapshot.storage.k8s.io/v1alpha1"
)
var (
// VolumeGroupSnapshotGVR is GroupVersionResource for volumegroupsnapshots
VolumeGroupSnapshotGVR = schema.GroupVersionResource{Group: VolumeGroupSnapshotAPIGroup, Version: "v1alpha1", Resource: "volumegroupsnapshots"}
// VolumeGroupSnapshotClassGVR is GroupVersionResource for volumegroupsnapshotsclasses
VolumeGroupSnapshotClassGVR = schema.GroupVersionResource{Group: VolumeGroupSnapshotAPIGroup, Version: "v1alpha1", Resource: "volumegroupsnapshotclasses"}
)
// WaitForVolumeGroupSnapshotReady waits for a VolumeGroupSnapshot to be ready to use or until timeout occurs, whichever comes first.
func WaitForVolumeGroupSnapshotReady(ctx context.Context, c dynamic.Interface, ns string, volumeGroupSnapshotName string, poll, timeout time.Duration) error {
framework.Logf("Waiting up to %v for VolumeGroupSnapshot %s to become ready", timeout, volumeGroupSnapshotName)
if successful := WaitUntil(poll, timeout, func() bool {
volumeGroupSnapshot, err := c.Resource(VolumeGroupSnapshotGVR).Namespace(ns).Get(ctx, volumeGroupSnapshotName, metav1.GetOptions{})
if err != nil {
framework.Logf("Failed to get group snapshot %q, retrying in %v. Error: %v", volumeGroupSnapshotName, poll, err)
return false
}
status := volumeGroupSnapshot.Object["status"]
if status == nil {
framework.Logf("VolumeGroupSnapshot %s found but is not ready.", volumeGroupSnapshotName)
return false
}
value := status.(map[string]interface{})
if value["readyToUse"] == true {
framework.Logf("VolumeSnapshot %s found and is ready", volumeGroupSnapshotName)
return true
}
framework.Logf("VolumeSnapshot %s found but is not ready.", volumeGroupSnapshotName)
return false
}); successful {
return nil
}
return fmt.Errorf("VolumeSnapshot %s is not ready within %v", volumeGroupSnapshotName, timeout)
}
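// GenerateVolumeGroupSnapshotClassSpec constructs a new VolumeGroupSnapshotClass
// instance spec with a unique name that is based on namespace + suffix.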
func GenerateVolumeGroupSnapshotClassSpec(
snapshotter string,
parameters map[string]string,
ns string,
) *unstructured.Unstructured {
deletionPolicy, ok := parameters["deletionPolicy"]
if !ok {
deletionPolicy = "Delete"
}
volumeGroupSnapshotClass := &unstructured.Unstructured{
Object: map[string]interface{}{
"kind": "VolumeGroupSnapshotClass",
"apiVersion": VolumeGroupSnapshotAPIVersion,
"metadata": map[string]interface{}{
// Name must be unique, so let's base it on namespace name and use GenerateName
"name": names.SimpleNameGenerator.GenerateName(ns),
},
"driver": snapshotter,
"parameters": parameters,
"deletionPolicy": deletionPolicy,
},
}
return volumeGroupSnapshotClass
}

View File

@ -0,0 +1,22 @@
# test/e2e/testing-manifests
## Embedded Test Data
If you need to use a test fixture inside your tests and it is defined inside this directory, it needs to be added to the `//go:embed` directive in `embed.go`.
For example, if you want to include this README as a test fixture (probably a bad idea in reality!),
```
// embed.go
...
//go:embed some other files README.md
...
```
This fixture can be accessed in the e2e tests using `test/e2e/framework/testfiles.Read` like
`testfiles.Read("test/e2e/testing-manifests/README.md")`.
This is needed since [migrating to //go:embed from go-bindata][1].
[1]: https://github.com/kubernetes/kubernetes/pull/99829
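As a minimal sketch (assuming the embedded filesystem returned by `GetE2ETestingManifestsFS` in `embed.go` is registered with the `testfiles` package, and using hypothetical import aliases), reading an embedded fixture could look like:
```
import (
	e2etestfiles "k8s.io/kubernetes/test/e2e/framework/testfiles"
	e2etestingmanifests "k8s.io/kubernetes/test/e2e/testing-manifests"
)

func init() {
	// Register the embedded manifests so that testfiles.Read can find them.
	e2etestfiles.AddFileSource(e2etestingmanifests.GetE2ETestingManifestsFS())
}

// Inside a test:
//	data, err := e2etestfiles.Read("test/e2e/testing-manifests/README.md")
```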

View File

@ -0,0 +1,11 @@
# See the OWNERS docs at https://go.k8s.io/owners
approvers:
- klueska
- pohly
reviewers:
- klueska
- pohly
- bart0sh
labels:
- sig/node

View File

@ -0,0 +1,85 @@
# This YAML file deploys the csi-driver-host-path on a number of nodes such
# that it proxies all connections from kubelet (plugin registration and dynamic
# resource allocation). The actual handling of those connections then happens
# inside the e2e.test binary via test/e2e/storage/drivers/proxy. This approach
# has the advantage that no separate container image with the test driver is
# needed and that tests have full control over the driver, for example for
# error injection.
#
# The csi-driver-host-path image is used because:
# - it has the necessary proxy mode (https://github.com/kubernetes-csi/csi-driver-host-path/commit/65480fc74d550a9a5aa81e850955cc20403857b1)
# - its base image contains a shell (useful for creating files)
# - the image is already a dependency of e2e.test
kind: ReplicaSet
apiVersion: apps/v1
metadata:
name: dra-test-driver
labels:
app.kubernetes.io/instance: test-driver.dra.k8s.io
app.kubernetes.io/part-of: dra-test-driver
app.kubernetes.io/name: dra-test-driver-kubelet-plugin
app.kubernetes.io/component: kubelet-plugin
spec:
selector:
matchLabels:
app.kubernetes.io/instance: test-driver.dra.k8s.io
app.kubernetes.io/part-of: dra-test-driver
app.kubernetes.io/name: dra-test-driver-kubelet-plugin
app.kubernetes.io/component: kubelet-plugin
replicas: 1
template:
metadata:
labels:
app.kubernetes.io/instance: test-driver.dra.k8s.io
app.kubernetes.io/part-of: dra-test-driver
app.kubernetes.io/name: dra-test-driver-kubelet-plugin
app.kubernetes.io/component: kubelet-plugin
spec:
# Ensure that all pods run on distinct nodes.
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app.kubernetes.io/instance: test-driver.dra.k8s.io
topologyKey: kubernetes.io/hostname
containers:
- name: registrar
image: registry.k8s.io/sig-storage/hostpathplugin:v1.7.3
args:
- "--v=5"
- "--endpoint=/plugins_registry/dra-test-driver-reg.sock"
- "--proxy-endpoint=tcp://:9000"
volumeMounts:
- mountPath: /plugins_registry
name: registration-dir
- name: plugin
image: registry.k8s.io/sig-storage/hostpathplugin:v1.7.3
args:
- "--v=5"
- "--endpoint=/dra/dra-test-driver.sock"
- "--proxy-endpoint=tcp://:9001"
securityContext:
privileged: true
volumeMounts:
- mountPath: /dra
name: socket-dir
- mountPath: /cdi
name: cdi-dir
volumes:
- hostPath:
path: /var/lib/kubelet/plugins
type: DirectoryOrCreate
name: socket-dir
- hostPath:
path: /var/run/cdi
type: DirectoryOrCreate
name: cdi-dir
- hostPath:
path: /var/lib/kubelet/plugins_registry
type: DirectoryOrCreate
name: registration-dir

View File

@ -0,0 +1,33 @@
/*
Copyright 2021 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package testing_manifests
import (
"embed"
e2etestfiles "k8s.io/kubernetes/test/e2e/framework/testfiles"
)
//go:embed dra flexvolume guestbook kubectl sample-device-plugin gpu statefulset storage-csi
var e2eTestingManifestsFS embed.FS
func GetE2ETestingManifestsFS() e2etestfiles.EmbeddedFileSource {
return e2etestfiles.EmbeddedFileSource{
EmbeddedFS: e2eTestingManifestsFS,
Root: "test/e2e/testing-manifests",
}
}

View File

@ -0,0 +1,145 @@
#!/bin/sh
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This driver is especially designed to test a long mounting scenario
# which can cause a volume to be detached while mount is in progress.
FLEX_DUMMY_LOG=${FLEX_DUMMY_LOG:-"/tmp/flex-dummy.log"}
VALID_MNTDEVICE=foo
# attach always returns one valid mount device so a different device
# showing up in a subsequent driver call implies a bug
validateMountDeviceOrDie() {
MNTDEVICE=$1
CALL=$2
if [ "$MNTDEVICE" != "$VALID_MNTDEVICE" ]; then
log "{\"status\":\"Failure\",\"message\":\"call "${CALL}" expected device "${VALID_MNTDEVICE}", got device "${MNTDEVICE}"\"}"
exit 0
fi
}
log() {
printf "$*" >&1
}
debug() {
echo "$(date) $*" >> "${FLEX_DUMMY_LOG}"
}
attach() {
debug "attach $@"
log "{\"status\":\"Success\",\"device\":\""${VALID_MNTDEVICE}"\"}"
exit 0
}
detach() {
debug "detach $@"
# TODO issue 44737 detach is passed PV name, not mount device
log "{\"status\":\"Success\"}"
exit 0
}
waitforattach() {
debug "waitforattach $@"
MNTDEVICE=$1
validateMountDeviceOrDie "$MNTDEVICE" "waitforattach"
log "{\"status\":\"Success\",\"device\":\""${MNTDEVICE}"\"}"
exit 0
}
isattached() {
debug "isattached $@"
log "{\"status\":\"Success\",\"attached\":true}"
exit 0
}
domountdevice() {
debug "domountdevice $@"
MNTDEVICE=$2
validateMountDeviceOrDie "$MNTDEVICE" "domountdevice"
MNTPATH=$1
mkdir -p ${MNTPATH} >/dev/null 2>&1
mount -t tmpfs none ${MNTPATH} >/dev/null 2>&1
sleep 120
echo "Hello from flexvolume!" >> "${MNTPATH}/index.html"
log "{\"status\":\"Success\"}"
exit 0
}
unmountdevice() {
debug "unmountdevice $@"
MNTPATH=$1
rm "${MNTPATH}/index.html" >/dev/null 2>&1
umount ${MNTPATH} >/dev/null 2>&1
log "{\"status\":\"Success\"}"
exit 0
}
expandvolume() {
debug "expandvolume $@"
log "{\"status\":\"Success\"}"
exit 0
}
expandfs() {
debug "expandfs $@"
log "{\"status\":\"Success\"}"
exit 0
}
op=$1
if [ "$op" = "init" ]; then
debug "init $@"
log "{\"status\":\"Success\",\"capabilities\":{\"attach\":true, \"requiresFSResize\":true}}"
exit 0
fi
shift
case "$op" in
attach)
attach $*
;;
detach)
detach $*
;;
waitforattach)
waitforattach $*
;;
isattached)
isattached $*
;;
mountdevice)
domountdevice $*
;;
unmountdevice)
unmountdevice $*
;;
expandvolume)
expandvolume $*
;;
expandfs)
expandfs $*
;;
*)
log "{\"status\":\"Not supported\"}"
exit 0
esac
exit 1

View File

@ -0,0 +1,70 @@
#!/bin/sh
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This driver implements a tmpfs with a pre-populated file index.html.
FLEX_DUMMY_LOG=${FLEX_DUMMY_LOG:-"/tmp/flex-dummy.log"}
log() {
printf "$*" >&1
}
debug() {
echo "$(date) $*" >> "${FLEX_DUMMY_LOG}"
}
domount() {
debug "domount $@"
MNTPATH=$1
mkdir -p ${MNTPATH} >/dev/null 2>&1
mount -t tmpfs none ${MNTPATH} >/dev/null 2>&1
echo "Hello from flexvolume!" >> "${MNTPATH}/index.html"
log "{\"status\":\"Success\"}"
exit 0
}
unmount() {
debug "unmount $@"
MNTPATH=$1
rm ${MNTPATH}/index.html >/dev/null 2>&1
umount ${MNTPATH} >/dev/null 2>&1
log "{\"status\":\"Success\"}"
exit 0
}
op=$1
if [ "$op" = "init" ]; then
debug "init $@"
log "{\"status\":\"Success\",\"capabilities\":{\"attach\":false}}"
exit 0
fi
shift
case "$op" in
mount)
domount $*
;;
unmount)
unmount $*
;;
*)
log "{\"status\":\"Not supported\"}"
exit 0
esac
exit 1
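
This second driver is the non-attachable variant: init advertises attach:false, so kubelet only calls mount and unmount. A minimal sketch of driving it by hand, assuming it is saved as ./dummy (the file name is an assumption; mounting a tmpfs requires root):

    chmod +x ./dummy
    ./dummy init                      # -> {"status":"Success","capabilities":{"attach":false}}

    MNT=$(mktemp -d)
    sudo ./dummy mount "$MNT" '{}'    # mounts a tmpfs at $MNT and writes index.html
    cat "$MNT/index.html"             # -> Hello from flexvolume!
    sudo ./dummy unmount "$MNT"       # removes index.html and unmounts the tmpfs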

View File

@ -0,0 +1,143 @@
#!/bin/sh
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This driver implements a tmpfs with a pre-populated file index.html.
# Attach is required, but it is a no-op that always returns success.
FLEX_DUMMY_LOG=${FLEX_DUMMY_LOG:-"/tmp/flex-dummy.log"}
VALID_MNTDEVICE=foo
# attach always returns one valid mount device so a different device
# showing up in a subsequent driver call implies a bug
validateMountDeviceOrDie() {
MNTDEVICE=$1
CALL=$2
if [ "$MNTDEVICE" != "$VALID_MNTDEVICE" ]; then
log "{\"status\":\"Failure\",\"message\":\"call "${CALL}" expected device "${VALID_MNTDEVICE}", got device "${MNTDEVICE}"\"}"
exit 0
fi
}
log() {
printf "$*" >&1
}
debug() {
echo "$(date) $*" >> "${FLEX_DUMMY_LOG}"
}
attach() {
debug "attach $@"
log "{\"status\":\"Success\",\"device\":\""${VALID_MNTDEVICE}"\"}"
exit 0
}
detach() {
debug "detach $@"
# TODO issue 44737 detach is passed PV name, not mount device
log "{\"status\":\"Success\"}"
exit 0
}
waitforattach() {
debug "waitforattach $@"
MNTDEVICE=$1
validateMountDeviceOrDie "$MNTDEVICE" "waitforattach"
log "{\"status\":\"Success\",\"device\":\""${MNTDEVICE}"\"}"
exit 0
}
isattached() {
debug "isattached $@"
log "{\"status\":\"Success\",\"attached\":true}"
exit 0
}
domountdevice() {
debug "domountdevice $@"
MNTDEVICE=$2
validateMountDeviceOrDie "$MNTDEVICE" "domountdevice"
MNTPATH=$1
mkdir -p ${MNTPATH} >/dev/null 2>&1
mount -t tmpfs none ${MNTPATH} >/dev/null 2>&1
echo "Hello from flexvolume!" >> "${MNTPATH}/index.html"
log "{\"status\":\"Success\"}"
exit 0
}
unmountdevice() {
debug "unmountdevice $@"
MNTPATH=$1
rm "${MNTPATH}/index.html" >/dev/null 2>&1
umount ${MNTPATH} >/dev/null 2>&1
log "{\"status\":\"Success\"}"
exit 0
}
expandvolume() {
debug "expandvolume $@"
log "{\"status\":\"Success\"}"
exit 0
}
expandfs() {
debug "expandfs $@"
log "{\"status\":\"Success\"}"
exit 0
}
op=$1
if [ "$op" = "init" ]; then
debug "init $@"
log "{\"status\":\"Success\",\"capabilities\":{\"attach\":true, \"requiresFSResize\":true}}"
exit 0
fi
shift
case "$op" in
attach)
attach $*
;;
detach)
detach $*
;;
waitforattach)
waitforattach $*
;;
isattached)
isattached $*
;;
mountdevice)
domountdevice $*
;;
unmountdevice)
unmountdevice $*
;;
expandvolume)
expandvolume $*
;;
expandfs)
expandfs $*
;;
*)
log "{\"status\":\"Not supported\"}"
exit 0
esac
exit 1
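
This attachable variant (the same driver as the first file, without the 120-second sleep) covers the ordinary attach/mount tests. Kubelet discovers flexvolume drivers by scanning its volume plugin directory for <vendor>~<driver>/<driver> executables; a sketch of installing the script on a node, assuming kubelet's default plugin directory and illustrative vendor/driver names:

    # /usr/libexec/kubernetes/kubelet-plugins/volume/exec is kubelet's default
    # flexvolume directory (overridable with --volume-plugin-dir); the vendor
    # and driver names below are only illustrative.
    PLUGIN_DIR=/usr/libexec/kubernetes/kubelet-plugins/volume/exec/k8s~dummy-attachable
    sudo mkdir -p "$PLUGIN_DIR"
    sudo install -m 0755 dummy-attachable "$PLUGIN_DIR/dummy-attachable"
    # Pods can then reference the driver as flexVolume driver "k8s/dummy-attachable".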

View File

@ -0,0 +1,147 @@
# This DaemonSet was originally referenced from
# https://github.com/GoogleCloudPlatform/container-engine-accelerators/blob/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
# The Dockerfile and other source for this daemonset are in
# https://github.com/GoogleCloudPlatform/cos-gpu-installer
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nvidia-driver-installer
namespace: kube-system
labels:
k8s-app: nvidia-driver-installer
spec:
selector:
matchLabels:
k8s-app: nvidia-driver-installer
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
name: nvidia-driver-installer
k8s-app: nvidia-driver-installer
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: cloud.google.com/gke-accelerator
operator: Exists
tolerations:
- operator: "Exists"
hostNetwork: true
hostPID: true
volumes:
- name: dev
hostPath:
path: /dev
- name: vulkan-icd-mount
hostPath:
path: /home/kubernetes/bin/nvidia/vulkan/icd.d
- name: nvidia-install-dir-host
hostPath:
path: /home/kubernetes/bin/nvidia
- name: root-mount
hostPath:
path: /
- name: cos-tools
hostPath:
path: /var/lib/cos-tools
- name: nvidia-config
hostPath:
path: /etc/nvidia
initContainers:
- image: "ubuntu@sha256:3f85b7caad41a95462cf5b787d8a04604c8262cdcdf9a472b8c52ef83375fe15"
name: bind-mount-install-dir
securityContext:
privileged: true
command:
- nsenter
- -at
- '1'
- --
- sh
- -c
- |
if mountpoint -q /var/lib/nvidia; then
echo "The mountpoint /var/lib/nvidia exists."
else
echo "The mountpoint /var/lib/nvidia does not exist. Creating directories /home/kubernetes/bin/nvidia and /var/lib/nvidia and bind mount."
mkdir -p /var/lib/nvidia /home/kubernetes/bin/nvidia
mount --bind /home/kubernetes/bin/nvidia /var/lib/nvidia
echo "Done creating bind mounts"
fi
# The COS GPU installer image version may be dependent on the version of COS being used.
# Refer to details about the installer in https://cos.googlesource.com/cos/tools/+/refs/heads/master/src/cmd/cos_gpu_installer/
# and the COS release notes (https://cloud.google.com/container-optimized-os/docs/release-notes) to determine the COS GPU installer version for a given COS version.
# Maps to gcr.io/cos-cloud/cos-gpu-installer:v2.1.10 - suitable for COS M109 as per https://cloud.google.com/container-optimized-os/docs/release-notes
- image: "gcr.io/cos-cloud/cos-gpu-installer:v2.1.10"
name: nvidia-driver-installer
resources:
requests:
cpu: 150m
securityContext:
privileged: true
env:
- name: NVIDIA_INSTALL_DIR_HOST
value: /home/kubernetes/bin/nvidia
- name: NVIDIA_INSTALL_DIR_CONTAINER
value: /usr/local/nvidia
- name: VULKAN_ICD_DIR_HOST
value: /home/kubernetes/bin/nvidia/vulkan/icd.d
- name: VULKAN_ICD_DIR_CONTAINER
value: /etc/vulkan/icd.d
- name: ROOT_MOUNT_DIR
value: /root
- name: COS_TOOLS_DIR_HOST
value: /var/lib/cos-tools
- name: COS_TOOLS_DIR_CONTAINER
value: /build/cos-tools
volumeMounts:
- name: nvidia-install-dir-host
mountPath: /usr/local/nvidia
- name: vulkan-icd-mount
mountPath: /etc/vulkan/icd.d
- name: dev
mountPath: /dev
- name: root-mount
mountPath: /root
- name: cos-tools
mountPath: /build/cos-tools
command:
- bash
- -c
- |
echo "Checking for existing GPU driver modules"
if lsmod | grep nvidia; then
echo "GPU driver is already installed, the installed version may or may not be the driver version being tried to install, skipping installation"
exit 0
else
echo "No GPU driver module detected, installing now"
/cos-gpu-installer install
fi
- image: "gcr.io/gke-release/nvidia-partition-gpu@sha256:e226275da6c45816959fe43cde907ee9a85c6a2aa8a429418a4cadef8ecdb86a"
name: partition-gpus
env:
- name: LD_LIBRARY_PATH
value: /usr/local/nvidia/lib64
resources:
requests:
cpu: 150m
securityContext:
privileged: true
volumeMounts:
- name: nvidia-install-dir-host
mountPath: /usr/local/nvidia
- name: dev
mountPath: /dev
- name: nvidia-config
mountPath: /etc/nvidia
containers:
- image: "registry.k8s.io/pause:3.10"
name: pause
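
The DaemonSet above does all of the driver installation in init containers on every node carrying the cloud.google.com/gke-accelerator label, then parks a pause container so the pod stays Running once installation has finished. A rough way to confirm the rollout, using standard kubectl commands (the container name matches the manifest above):

    kubectl -n kube-system rollout status ds/nvidia-driver-installer
    kubectl -n kube-system get pods -l k8s-app=nvidia-driver-installer -o wide
    # Installer output ends up in the init container's logs.
    kubectl -n kube-system logs ds/nvidia-driver-installer -c nvidia-driver-installer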

View File

@ -0,0 +1,57 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: nvidia-gpu-device-plugin
namespace: kube-system
labels:
k8s-app: nvidia-gpu-device-plugin
addonmanager.kubernetes.io/mode: EnsureExists
spec:
selector:
matchLabels:
k8s-app: nvidia-gpu-device-plugin
template:
metadata:
labels:
k8s-app: nvidia-gpu-device-plugin
spec:
priorityClassName: system-node-critical
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: cloud.google.com/gke-accelerator
operator: Exists
tolerations:
- operator: "Exists"
effect: "NoExecute"
- operator: "Exists"
effect: "NoSchedule"
volumes:
- name: device-plugin
hostPath:
path: /var/lib/kubelet/device-plugins
- name: dev
hostPath:
path: /dev
containers:
- image: "registry.k8s.io/nvidia-gpu-device-plugin@sha256:4b036e8844920336fa48f36edeb7d4398f426d6a934ba022848deed2edbf09aa"
command: ["/usr/bin/nvidia-gpu-device-plugin", "-logtostderr"]
name: nvidia-gpu-device-plugin
resources:
requests:
cpu: 50m
memory: 10Mi
limits:
cpu: 50m
memory: 10Mi
securityContext:
privileged: true
volumeMounts:
- name: device-plugin
mountPath: /device-plugin
- name: dev
mountPath: /dev
updateStrategy:
type: RollingUpdate
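
Once the device plugin registers with kubelet, GPU nodes advertise an nvidia.com/gpu extended resource that workloads can request. A hedged sketch for checking this (NODE_NAME is a placeholder):

    # Show how many GPUs a node advertises.
    kubectl get node NODE_NAME -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
    # Pods consume GPUs by requesting the same resource name, e.g.
    # resources.limits."nvidia.com/gpu": 1 in a container spec.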

View File

@ -0,0 +1,28 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: agnhost-primary
spec:
replicas: 1
selector:
matchLabels:
app: agnhost
role: primary
tier: backend
template:
metadata:
labels:
app: agnhost
role: primary
tier: backend
spec:
containers:
- name: primary
image: {{.AgnhostImage}}
args: [ "guestbook", "--http-port", "6379" ]
resources:
requests:
cpu: 100m
memory: 100Mi
ports:
- containerPort: 6379
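
{{.AgnhostImage}} in the manifest above is a Go-template placeholder that the e2e framework fills in before applying the manifest. To use the file outside the framework, substitute a concrete agnhost image first; a sketch, where the local file name and the image tag are assumptions:

    AGNHOST_IMAGE=registry.k8s.io/e2e-test-images/agnhost:2.53
    sed "s|{{\.AgnhostImage}}|${AGNHOST_IMAGE}|" agnhost-primary-deployment.yaml.in \
        | kubectl apply -f -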

View File

@ -0,0 +1,16 @@
apiVersion: v1
kind: Service
metadata:
name: agnhost-primary
labels:
app: agnhost
role: primary
tier: backend
spec:
ports:
- port: 6379
targetPort: 6379
selector:
app: agnhost
role: primary
tier: backend
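
The Service above selects the primary pods by the same app/role/tier labels the Deployment puts on its pod template; if the labels drift apart, the Service simply ends up with no endpoints. A quick check:

    kubectl get endpoints agnhost-primary
    kubectl get pods -l app=agnhost,role=primary,tier=backend -o wide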

View File

@ -0,0 +1,28 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: agnhost-replica
spec:
replicas: 2
selector:
matchLabels:
app: agnhost
role: replica
tier: backend
template:
metadata:
labels:
app: agnhost
role: replica
tier: backend
spec:
containers:
- name: replica
image: {{.AgnhostImage}}
args: [ "guestbook", "--replicaof", "agnhost-primary", "--http-port", "6379" ]
resources:
requests:
cpu: 100m
memory: 100Mi
ports:
- containerPort: 6379
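
The replica Deployment starts two agnhost instances that follow the primary via --replicaof, and it can be scaled independently of the primary, for example:

    kubectl scale deployment agnhost-replica --replicas=3
    kubectl rollout status deployment/agnhost-replica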

View File

@ -0,0 +1,15 @@
apiVersion: v1
kind: Service
metadata:
name: agnhost-replica
labels:
app: agnhost
role: replica
tier: backend
spec:
ports:
- port: 6379
selector:
app: agnhost
role: replica
tier: backend
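
With all four guestbook manifests applied (and the image placeholder substituted), the backend is reachable over HTTP on port 6379, since agnhost's guestbook mode serves HTTP on the configured port. A rough smoke test via port-forward; the request paths are an assumption about the agnhost guestbook API, not taken from its source, so treat this as a sketch:

    kubectl port-forward svc/agnhost-primary 6379:6379 &
    PF_PID=$!
    sleep 2
    curl -s 'http://127.0.0.1:6379/set?key=msg&value=hello'
    curl -s 'http://127.0.0.1:6379/get?key=msg'
    kill "$PF_PID"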

Some files were not shown because too many files have changed in this diff.