Node Feature Discovery - Intel Builders


Node Feature Discovery
Application Note
December 2018
Document Number: 606833-001

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting: http://www.intel.com/design/literature.htm

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at http://www.intel.com/ or from the OEM or retailer.

Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright 2020, Intel Corporation. All rights reserved.

Contents

1.0 Introduction
2.0 Overview
3.0 How NFD Works
4.0 Feature Labels
5.0 Deployment
    5.1 Deployment as a DaemonSet
    5.2 Deployment as a Job
    5.3 Deploying Custom-Built Version
6.0 Using Labels to Schedule Pods
    6.1 nodeSelector
    6.2 nodeAffinity
7.0 Runtime Configuration
    7.1 Command Line Options
    7.2 Configuration File
8.0 Summary
Appendix A Terminology and References

Figures
Figure 1. Node Feature List
Figure 2. Node Feature Discovery in Kubernetes

Tables
Table 1. Feature Labels
Table 2. Command Line Options
Table 3. Terminology
Table 4. References

Revision History

Date          | Revision | Description
December 2018 | 001      | Initial release.

1.0 Introduction

Node Feature Discovery (NFD) is a Kubernetes* add-on that detects and advertises hardware and software capabilities of a platform that can, in turn, be used to facilitate intelligent scheduling of a workload. This document details the deployment and usage of NFD. It is written for developers and architects who want to integrate NFD into their Kubernetes deployment in order to facilitate improved workload placement based on platform capabilities.

Node Feature Discovery is part of the Enhanced Platform Awareness (EPA) suite, which represents a methodology and a set of changes in Kubernetes targeting intelligent configuration and capacity consumption of platform capabilities. Through increased awareness and orchestration of system resources and capabilities, EPA delivers improved application performance and determinism.

NFD is an open source Kubernetes community project. The software is available at:
https://github.com/kubernetes-incubator/node-feature-discovery

This document is part of the Container Experience Kit for Enhanced Platform Awareness (EPA). Container Experience Kits are collections of user guides, application notes, feature briefs, and other collateral that provide a library of best-practice documents for engineers who are developing container-based applications. Other documents in the EPA Container Experience Kit can be found at:
https://networkbuilders.intel.com/network-technologies/container-experience-kits

Note: This document does not describe how to set up a Kubernetes cluster. We recommend that you perform those steps as a prerequisite. For more setup and installation guidelines of a complete system, refer to the Deploying Kubernetes and Container Bare Metal Platform for NFV Use Cases with Intel Xeon Scalable Processors User Guide listed in Table 4.

The relevant documents include:

Document Title                                  | Document Type
Enhanced Platform Awareness in Kubernetes       | Feature Brief
Enhanced Platform Awareness in Kubernetes       | Application Note
Enabling New Features with Kubernetes for NFV   | White Paper
Enhanced Platform Awareness in Kubernetes       | Performance Benchmark Report

2.0 Overview

In a standard deployment, Kubernetes reveals very few details about the underlying platform to the user. This may be a good strategy for general data center use, but in many cases the behavior or performance of a workload may improve by leveraging platform (hardware and/or software) features. Node Feature Discovery detects these features and advertises them through a Kubernetes concept called node labels which, in turn, can be used to control workload placement in a Kubernetes cluster. NFD runs as a separate container on each individual node of the cluster, discovers capabilities of the node, and finally, publishes these as node labels using the Kubernetes API.

NFD only handles non-allocatable features, that is, unlimited capabilities that do not require any accounting and are available to all workloads. Allocatable resources that require accounting, initialization and other special handling (such as Intel QuickAssist Technology, GPUs, and FPGAs) are presented as Kubernetes Extended Resources and handled by device plugins. They are out of the scope of NFD.

NFD currently detects the following features:

- CPUID: Intel processors have a special CPUID instruction for determining the CPU features, including the model and support for instruction set extensions, such as Intel Advanced Vector Extensions (Intel AVX). Certain workloads, such as machine learning, may gain a significant performance improvement from these extensions (e.g. AVX-512). NFD advertises all CPU features obtained from the CPUID information.

- SR-IOV networking: Single Root I/O Virtualization (SR-IOV) is a technology for isolating PCI Express* resources. It allows multiple virtual environments to share a single PCI Express hardware device (physical function, PF) by offering multiple virtual functions (VF) that appear as separate PCI Express interfaces. In the case of network interface cards (NICs), SR-IOV VFs allow direct hardware access from multiple Kubernetes pods, increasing network I/O performance, and making it possible to run fast user-space packet processing workloads (for example, based on Data Plane Development Kit). NFD detects the presence of SR-IOV-enabled NICs, allowing optimized scheduling of network-intensive workloads.

- Intel RDT: Intel Resource Director Technology (Intel RDT) allows visibility and control over the usage of last-level cache (LLC) and memory bandwidth between co-running workloads. By allowing allocation and isolation of these shared resources, and thus reducing contention, RDT helps in mitigating the effects of noisy neighbors. This provides more consistent and predictable performance, which may be essential in meeting Service Level Agreements (SLA), for example. NFD detects the different RDT technologies supported by the underlying hardware platform.

- Intel Turbo Boost Technology: Intel Turbo Boost Technology accelerates processor performance for peak loads, dynamically overclocking processor cores if

they are operating within the power, current, and temperature limits of the processor. This can provide significant performance benefits for CPU-bound workloads. On the other hand, some workloads behave better when this technology has been disabled. NFD detects the state of Intel Turbo Boost Technology, allowing optimal scheduling of workloads that have a well-understood dependency on this technology.

- IOMMU: An input/output memory management unit (IOMMU), such as Intel Virtualization Technology (Intel VT) for Directed I/O (Intel VT-d) technology, allows isolation and restriction of device accesses. This enables direct hardware access in virtualized environments, highly accelerating I/O performance by removing the need for device emulation and bounce buffers. This can be crucial for I/O heavy workloads in Kubernetes deployments using hypervisor-based container runtimes, such as Kata* Containers. NFD detects if an IOMMU is supported by the host hardware platform and enabled in the kernel of the host operating system.

- SSD storage: Solid state drives (SSD) have a huge performance advantage over traditional rotational hard disks. This may be important for disk I/O intensive workloads. NFD detects the presence of non-rotational block storage on the node, making it possible to accelerate workloads requiring fast local disk access.

- NUMA topology: Non-uniform memory access (NUMA) is a memory architecture where a CPU's memory access times depend on the memory location. Access to a CPU's local memory is faster than to non-local memory (the local memory of another CPU), which can cause workloads to perform poorly if not properly designed for NUMA systems. On the other hand, some highly NUMA-aware applications may experience negligible performance penalties. NFD detects the presence of NUMA topology, making it possible to optimize scheduling of applications based on their NUMA-awareness.
- Linux* kernel: Some specific workloads may be highly dependent on the kernel version of the underlying host operating system. For example, some kernel features may be required to be able to run an application, or they may provide measurable performance benefits. NFD detects the kernel version and advertises it through multiple labels, allowing the deployment of workloads with different granularity of kernel version dependency.

- PCI: Detecting the presence of compatible PCI hardware devices is beneficial for some workloads. For example, Kubernetes device plugins need to be deployed only on nodes that have hardware that the device plugin manages. NFD detects PCI devices, allowing optimized scheduling of workloads dependent on certain PCI devices.

NFD is under active development, and in 2018 it has evolved significantly, gaining new functionality, including the discovery of multiple new features, as shown in Figure 1.

Figure 1. Node Feature List

3.0 How NFD Works

Node Feature Discovery is designed to run on every node in a Kubernetes cluster, either as a DaemonSet or a Job. The Node Feature Discovery pod discovers capabilities on each node it runs on, and then advertises those capabilities as node labels. As seen in Figure 2 below, NFD has run on each node and discovered the capabilities of the nodes (Turbo Boost, AVX, IOMMU, etc.) and then advertised those as labels, which are stored on the Kubernetes master node in the etcd data store. etcd stores configuration information for large-scale distributed systems; the name originates from the Unix* /etc folder plus d for distributed systems.

Using a node selector, an incoming pod can express its requirements for specific capabilities. Figure 2 shows that Application A needs to land on a node with SR-IOV and Turbo Boost capabilities. The Kubernetes scheduler on the master node uses the stored node labels to match the incoming pod to the most appropriate node. In Figure 2, this is Node 1. Application B has no special capability requests, therefore it can be placed on either node. In Figure 2, it has been placed on Node 2.

Figure 2. Node Feature Discovery in Kubernetes
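The matching step described above can be illustrated with a toy shell sketch. This is not NFD or scheduler code, and the label names are illustrative examples based on Figure 2; it only shows the rule that a node is eligible when every label a pod requires is present on that node:

```shell
# Toy sketch of scheduler label matching: a pod is eligible for a node
# only if every label it requires is present among the node's labels.
node_labels="nfd-network-sriov.capable=true nfd-pstate-turbo=true"
pod_requires="nfd-network-sriov.capable=true"

eligible=yes
for req in $pod_requires; do
  case " $node_labels " in
    *" $req "*) ;;        # requirement found among the node's labels
    *) eligible=no ;;     # a missing label rules the node out
  esac
done
echo "Node eligible: $eligible"
```

In the real cluster this comparison is performed by the Kubernetes scheduler using the mechanisms described in Section 6.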

4.0 Feature Labels

NFD uses labels for advertising node-level features. Kubernetes labels are key-value pairs that are attached to Kubernetes objects, such as pods or nodes, for specifying attributes of objects that may be relevant to the end user. They can also be used to organize objects into specific subsets. Labels are a part of the metadata information that is attached to each node's description. All this information is stored in etcd in the Kubernetes control plane. Node labels published by NFD encode the following information:

- A namespace
- The source of the feature
- The name of the feature, with optional attribute name or sub-feature separated by a dot
- The value or the state of the feature

An example of a label created by Node Feature Discovery:

node.alpha.kubernetes-incubator.io/nfd-network-sriov.capable = true

This indicates that the namespace is node.alpha.kubernetes-incubator.io, the source is network, the feature name is sriov.capable, and the value is true, indicating the presence of an SR-IOV capable network interface card.

In addition to the actual node feature labels, NFD advertises its own software version:

node.alpha.kubernetes-incubator.io/node-feature-discovery.version = v0.3.0

The following table describes the details of the supported feature sources and their feature labels. For brevity, the example labels omit the node.alpha.kubernetes-incubator.io/ namespace prefix.

Table 1. Feature Labels

Source  | Feature           | Attribute | Possible values | Description and example
cpuid   | <cpuid feature>   | n/a       | true            | All CPU features returned by the CPUID instruction, e.g. nfd-cpuid-AVX = true
kernel  | version           | full      | version string  | Full kernel version, e.g. nfd-kernel-version.full = 4.5.67-g123abcde
kernel  | version           | major     | number          | First component of the kernel version, e.g. nfd-kernel-version.major = 4
kernel  | version           | minor     | number          | Second component of the kernel version, e.g. nfd-kernel-version.minor = 5
kernel  | version           | revision  | number          | Third component of the kernel version, e.g. nfd-kernel-version.revision = 6
iommu   | enabled           | n/a       | true            | An IOMMU is present and enabled in the kernel, e.g. nfd-iommu-enabled = true
memory  | numa              | n/a       | true            | NUMA topology detected, e.g. nfd-memory-numa = true
network | sriov             | capable   | true            | SR-IOV capable Network Interface Card(s) present, e.g. nfd-network-sriov.capable = true
network | sriov             | configured| true            | SR-IOV Virtual Functions have been configured, e.g. nfd-network-sriov.configured = true
pci     | <device label>    | present   | true            | Presence of a PCI device is detected, e.g. nfd-pci-1200_8086.present = true
rdt     | RDTCMT            | n/a       | true            | Intel RDT Cache Monitoring Technology is supported, e.g. nfd-rdt-RDTCMT = true
rdt     | RDTMBM            | n/a       | true            | Intel RDT Memory Bandwidth Monitoring is supported, e.g. nfd-rdt-RDTMBM = true
rdt     | RDTMBA            | n/a       | true            | Intel RDT Memory Bandwidth Allocation is supported, e.g. nfd-rdt-RDTMBA = true
rdt     | RDTMON            | n/a       | true            | Intel RDT monitoring technologies are supported, e.g. nfd-rdt-RDTMON = true
rdt     | RDTL3CA           | n/a       | true            | Intel RDT L3 Cache Allocation Technology is supported, e.g. nfd-rdt-RDTL3CA = true
rdt     | RDTL2CA           | n/a       | true            | Intel RDT L2 Cache Allocation Technology is supported, e.g. nfd-rdt-RDTL2CA = true
selinux | enabled           | n/a       | true            | SELinux enforcing has been turned on in the Linux kernel, e.g. nfd-selinux-enabled = true
storage | nonrotationaldisk | n/a       | true            | Non-rotational block device(s), like an SSD, is present, e.g. nfd-storage-nonrotationaldisk = true

NOTE: The PCI device label is composed of raw PCI IDs, separated by underscores. The set of fields is configurable, valid fields being class, vendor, device, subsystem vendor and subsystem device (defaults are class and vendor). Also the set of PCI device classes that the feature source detects is configurable. By default, device classes 03(h), 0b40(h) and 12(h), i.e. GPUs, co-processors, and accelerator cards, are detected. See the Runtime Configuration section for more details about NFD configuration options.
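As a quick illustration of the label format described above, the example SR-IOV label can be decomposed into its parts with plain shell string operations. This is a sketch for understanding the encoding, not part of NFD itself:

```shell
# Split an NFD label of the form <namespace>/nfd-<source>-<feature>=<value>
label="node.alpha.kubernetes-incubator.io/nfd-network-sriov.capable=true"

key="${label%%=*}"        # everything before '='
value="${label#*=}"       # true
namespace="${key%%/*}"    # node.alpha.kubernetes-incubator.io
name="${key##*/}"         # nfd-network-sriov.capable
src="${name#nfd-}"        # network-sriov.capable
src="${src%%-*}"          # network (the feature source)

echo "$namespace / $src / $value"
```

The same labels can be used directly with kubectl label selectors; for example, `kubectl get nodes -l node.alpha.kubernetes-incubator.io/nfd-network-sriov.capable=true` lists all nodes advertising SR-IOV capability.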

5.0 Deployment

5.1 Deployment as a DaemonSet

The preferred way to deploy NFD is to run it as a Kubernetes DaemonSet. This ensures that all nodes of the cluster run NFD, and also, new nodes get automatically labeled as soon as they become schedulable in the cluster. As a DaemonSet, NFD runs in the background, re-labeling nodes every 60 seconds (by default) so that any changes in the node capabilities are detected.

In its GitHub repository, NFD provides template specs that can be used for deployment:

$ kubectl create -f https://raw.githubusercontent.com/kubernetes-incubator/node-feature-discovery/master/rbac.yaml
$ kubectl create -f https://raw.githubusercontent.com/kubernetes-incubator/node-feature-discovery/master/node-feature-discovery-daemonset.yaml.template

This deploys the latest release of NFD, with the default configuration, in the default Kubernetes namespace.

You can verify that NFD is running as expected by running:

$ kubectl get ds/node-feature-discovery

The output is similar to:

NAME                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
node-feature-discovery   5         5         5       5            5           <none>          2m

Also, you can check the labels created by NFD:

$ kubectl label node --list --all
Listing labels for Node./node2:
 beta.kubernetes.io/arch=amd64
 beta.kubernetes.io/os=linux
 kubernetes.io/hostname=node2
 node.alpha.kubernetes-incubator.io/nfd-cpuid-AESNI=true
 node.alpha.kubernetes-incubator.io/nfd-cpuid-AVX=true
 ...

5.2 Deployment as a Job

An alternative deployment mechanism is to run NFD as a one-shot Kubernetes Job. This may be useful in a static cluster where no hardware changes are expected and the number of running pods needs to be minimized. The NFD repository contains an example template and a deployment script that demonstrates this. You need to clone the repository in order to run this:

$ git clone https://github.com/kubernetes-incubator/node-feature-discovery && cd node-feature-discovery
$ ./label-nodes.sh

The script launches as many instances of NFD as there are nodes in the Ready state in the cluster. However, this approach is not guaranteed to correctly run NFD on every node in all situations. For example, if some node is tainted NoSchedule or fails to start a job for some other reason, then NFD may not run correctly. For these reasons, we recommend that you use a DaemonSet deployment.

5.3 Deploying Custom-Built Version

Sometimes it may be desirable to run a self-built version of NFD, for example to try out an unreleased version. You must have Docker* and make installed to run the build. Follow the steps below:

1. Clone the NFD source code:

   $ git clone https://github.com/kubernetes-incubator/node-feature-discovery
   $ cd node-feature-discovery

2. Next, run make to build the Docker image:

   $ make

3. Take note of the NFD image hash built in the previous step, tag it, and push the NFD image to your Docker registry, available for your Kubernetes cluster:

   $ docker tag <image hash> <docker registry>/<image name>
   $ docker push <docker registry>/<image name>

4. Edit node-feature-discovery-daemonset.yaml.template and change it to use your custom-built container image:

   image: <docker registry>/<image name>

5. Deploy NFD:

   $ kubectl apply -f node-feature-discovery-daemonset.yaml.template

6.0 Using Labels to Schedule Pods

Deploying NFD in a Kubernetes cluster labels all schedulable nodes according to the underlying platform features. These labels can be used for placing hard or soft constraints on where specific pods should be run. This section describes the two mechanisms to achieve this: nodeSelector and nodeAffinity.

6.1 nodeSelector

nodeSelector is a simple and limited mechanism to specify hard requirements on which node a pod should be run. nodeSelector contains a list of key-value pairs presenting required node labels and their values. A node must fulfill each of the requirements, that is, it must have each of the indicated label-value pairs in order for the pod to be able to be scheduled there.

The example below shows a pod specification requiring to be run on a node with SR-IOV capability:

apiVersion: v1
kind: Pod
metadata:
  name: node-selector-example
spec:
  nodeSelector:
    node.alpha.kubernetes-incubator.io/nfd-network-sriov.capable: "true"
  containers:
  - name: nginx
    image: nginx

6.2 nodeAffinity

nodeAffinity provides a much more expressive way to specify constraints on which nodes a pod should be run. It provides a range of different operators to use for matching label values (not just "equal to"), and allows the specification of both hard and soft requirements (i.e. preferences).

The example below presents a pod specification where a pod is required to be run on a host with a kernel version greater than 4.14, with a preference for Intel Turbo Boost Technology being disabled (demonstrating soft anti-affinity):

apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-example
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node.alpha.kubernetes-incubator.io/nfd-kernel-version.major
            operator: Gt
            values: ["3"]
          - key: node.alpha.kubernetes-incubator.io/nfd-kernel-version.minor
            operator: Gt
            values: ["14"]
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: node.alpha.kubernetes-incubator.io/nfd-pstate-turbo
            operator: NotIn
            values: ["true"]
  containers:
  - name: nginx
    image: nginx

7.0 Runtime Configuration

The template deployment specs provided as part of the NFD source code repository (see the Deployment section) specify a default configuration that should be usable as-is for most users. However, NFD provides ways to alter its behavior through command line options and a configuration file.

7.1 Command Line Options

NFD has multiple command line options that can be used for tasks such as altering the set of labels to be advertised. Specify the desired command line options under the Args keyword in the NFD pod specification.

Available command line options are listed in the following table.

Table 2. Command Line Options

Option                    | Description
--sources <sources>       | Comma-separated list of enabled feature sources. Can be used to limit detected features to a limited set of features. By default, all sources are enabled.
--label-whitelist <pattern> | Regular expression to filter published label names. Empty by default, that is, no whitelist filter is enabled and all labels are published.
--oneshot                 | Label once and exit after that. This is used in the one-shot Job configuration.
--sleep-interval <time>   | Interval of re-labeling. Specified using numbers and units ("s", "m", "h"), for example: "1m30s". A non-positive value implies no re-labeling (that is, infinite sleep). Does not have any effect if --oneshot is specified. Default interval is 60s.
--no-publish              | Do not publish any labels. Useful for testing.
--config <path>           | NFD configuration file to read. Can be used to specify a custom location for the configuration file.
--options <config>        | Specify configuration options from the command line, in the same format as in the configuration file (i.e. json or yaml). These options override settings read from the configuration file. Useful for quickly testing configuration options and for specifying a single configuration option.

7.2 Configuration File

Some aspects of NFD can be configured through an optional configuration file, which is located by default in /etc/kubernetes/node-feature-discovery/node-feature-discovery.conf. A custom location can be specified using the --config command line option. The configuration file must be available inside the NFD pod, and thus, Volumes and VolumeMounts are needed to make it available for NFD. The preferred method is to use a ConfigMap.

The following steps provide an example of creating and deploying a configuration map, using the example configuration from the NFD source code repository as a template.

1. Use the example configuration as a base for your customized configuration:

   $ cp node-feature-discovery.conf.example node-feature-discovery.conf
   $ vim node-feature-discovery.conf   # edit the configuration

2. Create a Kubernetes ConfigMap object from your configuration file:

   $ kubectl create configmap node-feature-discovery-config --from-file node-feature-discovery.conf

3. Configure Volumes and VolumeMounts in the NFD pod spec. Note: Only the relevant code snippets are shown below.

   containers:
   - volumeMounts:
     - name: node-feature-discovery-config
       mountPath: /etc/kubernetes/node-feature-discovery/
   volumes:
   - name: node-feature-discovery-config
     configMap:
       name: node-feature-discovery-config

4. NFD will read your custom configuration file.

You could also use other types of volumes, of course. For example, hostPath could be used for local node-specific configurations.

The example configuration in the NFD source code repository is used as the default configuration in the NFD container image. Thus, by directly editing the example configuration, you can alter the default configuration in custom-built images.

Configuration options can also be specified via the --options command line flag, in which case no mounts need to be used. This is mostly recommended for quickly testing configuration options and possibly specifying single options without the need to use a ConfigMap. For example (a snippet from an NFD DaemonSet specification):

   containers:
   - args:
     - '--options={"sources": { "pci": { "deviceClassWhitelist": ["12"] } } }'
     - '--sleep-interval=60s'

Currently, the only available configuration options are related to the PCI feature source.
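Putting the pieces above together, the sketch below shows what a customized node-feature-discovery.conf could look like. It is an illustrative assumption rather than a verbatim copy of the repository's example file: the deviceClassWhitelist key appears in the --options example above, while deviceLabelFields is an assumed key name for the configurable set of PCI ID fields; the values restate the documented defaults (device classes 03, 0b40 and 12, label fields class and vendor).

```yaml
# Hypothetical node-feature-discovery.conf restating the documented
# defaults: detect GPUs, co-processors and accelerator cards, and
# compose PCI device labels from the class and vendor fields.
sources:
  pci:
    deviceClassWhitelist: ["03", "0b40", "12"]
    deviceLabelFields: ["class", "vendor"]
```

Mounted via the ConfigMap steps above, a file with these values changes nothing relative to the defaults; editing the two lists is how the PCI feature source is tuned.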

8.0 Summary

Together with other EPA technologies, including device plugins, NFD facilitates workload optimization through resource-aware scheduling. In particular, NFD can benefit workloads that utilize modern vector data processing instructions, require SR-IOV networking, or have specific kernel requirements.

This document describes the usage and benefits of Node Feature Discovery in a Kubernetes deployment, including:

- Deployment of NFD
- Description of platform features discovered by NFD
- Using NFD labels to optimize workload placement

