Consistent Structural Relation Learning for Zero-Shot Segmentation

Peike Li 1,2, Yunchao Wei 1, Yi Yang 1
1 ReLER Lab, Australian Artificial Intelligence Institute, University of Technology Sydney
2 Baidu Research
peike.li@student.uts.edu.au, {yunchao.wei, yi.yang}@uts.edu.au

Part of this work was done while Peike Li was an intern at Baidu Research.
34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.

Abstract

Zero-shot semantic segmentation aims to recognize the semantics of pixels from unseen categories with zero training samples. Previous practice [1] proposed to train the classifiers for unseen categories using visual features generated from semantic word embeddings. However, the generator is learned only on the seen categories, while no constraint is applied to the unseen categories, leading to poor generalization ability. In this work, we propose a Consistent Structural Relation Learning (CSRL) approach that constrains the generation of unseen visual features by exploiting the structural relations between seen and unseen categories. We observe that different categories usually exhibit similar relations in both the semantic word embedding space and the visual feature space. This observation motivates us to harness the similarity of category-level relations in the semantic word embedding space to learn a better visual feature generator. Concretely, by exploring pair-wise and list-wise structures, we impose the relations of generated visual features to be consistent with their counterparts in the semantic word embedding space. In this way, the relations between seen and unseen categories are transferred to implicitly constrain the generator to produce relation-consistent unseen visual features. We conduct extensive experiments on the Pascal-VOC and Pascal-Context benchmarks. The proposed CSRL outperforms existing state-of-the-art methods by a large margin, yielding gains of 7-12% on Pascal-VOC and 2-5% on Pascal-Context.

1 Introduction

Semantic segmentation [2, 3] is a fundamental computer vision task that aims to assign a semantic label to each pixel in a given image. Although the development of FCN-based models [4, 5, 6] has significantly advanced semantic segmentation, the success of these approaches relies heavily on cost-intensive and time-consuming dense mask annotations to train the network. To relieve the human effort of annotating accurate pixel-wise masks, there is increasing interest in weakly-supervised segmentation and few-shot segmentation methods. Weakly-supervised segmentation [7, 8] targets learning segmentation models from lower-quality annotations such as image-level labels [9, 10], bounding boxes [11, 12] and scribbles [13, 14], which can be obtained more efficiently than pixel-wise masks. Meanwhile, few-shot segmentation [15, 16, 17, 18, 19] tackles semantic segmentation from a meta-learning perspective and aims to perform segmentation with only a few annotated samples. Even though significant progress has been made, these works can hardly eliminate the need for mask annotations altogether.

Most recently, Bucher et al. [1] took a step further to investigate how to effortlessly recognize never-seen categories with zero training examples, and proposed a new learning paradigm named Generalized Zero-Shot Semantic Segmentation (GZS3).

Figure 1: Illustration of CSRL. To achieve the goal of GZS3, we learn a generator to produce visual features from semantic word embeddings. Compared to the (a) node-to-node generator, the proposed (b) structural generator explores the structural relations between seen and unseen categories to constrain the generation of unseen visual features.

Specifically, during the training phase, in addition to the annotated images of seen categories, we are also provided with the semantic word embeddings of both seen and unseen labels. At test time, GZS3 aims to segment images containing pixels of all categories. As zero training examples of unseen categories are available, the key challenge of GZS3 lies in how to correctly recognize the pixels from these unseen categories. To tackle this, Bucher et al. [1] proposed a generative method that exploits semantic word embeddings to generate unseen visual features, which are further employed to learn the classifiers for conducting segmentation. However, when training the generator from the semantic space to the visual space, they take each category independently, with merely node-to-node knowledge transfer on seen categories. As shown in Figure 1a, no constraint is applied to guarantee the quality of the generated visual features of unseen categories, resulting in poor generalization ability.

Hence, we seek to harness the inter-class relationships between seen and unseen categories to learn a better generator. We observe that different categories exhibit roughly similar relations in both the semantic word embedding space and the visual feature space. Therefore, we assume the relational structure embedded in the semantic space can be conveniently transferred to constrain the generated visual features of unseen categories. To this end, we propose the Consistent Structural Relation Learning (CSRL) framework to tackle the challenging GZS3 task. Particularly, we propose a semantic-visual structural generator that integrates both feature generation and relation learning in a unified network architecture. Instead of taking each category independently, our CSRL generates the visual features of both seen and unseen categories simultaneously. We additionally introduce relational constraints at different structure granularities, including point-wise, pair-wise, and list-wise consistency, to facilitate generalization to unseen categories. In this way, the learned visual features are imposed to keep a relational structure consistent with their semantic-based counterparts, making the generator better adapt to unseen categories. Following [1], we conduct extensive experiments on two GZS3 benchmarks based on the Pascal-VOC and Pascal-Context datasets. The proposed CSRL outperforms existing state-of-the-art methods by a large margin, yielding gains of 7-12% on Pascal-VOC and 2-5% on Pascal-Context.

2 Related Work

Zero-Shot Learning ZSL [20, 21] aims to recognize unseen classes with no training examples by leveraging semantic label embeddings (e.g., word embeddings or attribute vectors) as side information. Beyond the traditional image classification task, ZSL has been applied to predict novel actions in videos [22, 23], detect unseen objects [24, 25], and, recently, to segment pixel-wise unseen categories [26, 1].
Earlier practices address ZSL by learning a projection function from the visual space to the semantic space [27, 28] or to the model weight space [29]. However, the intra-class variation in the visual space is neglected by mapping to a deterministic word embedding in the semantic space. Recently, owing to advances in deep generative models [30, 31], one can overcome the scarcity of unseen visual features by directly generating samples from semantic word embeddings. Commonly, these generative-based methods [32, 33] first train their models on seen classes and then generate visual features for unseen classes. However, the quality of the generated unseen features solely relies on the generalization ability of the generator.

Differently, in this work, we apply structural relation consistency as constraints to guide the learning process.

Generalized Zero-Shot Semantic Segmentation Semantic segmentation under the fully supervised paradigm [34, 35, 36, 37, 38, 39] and the domain adaptation scheme [40, 41, 42] has been extensively studied. To drastically reduce the cost of label annotation, previous works focus on weakly-supervised segmentation [7, 8, 43] and few-shot segmentation [17, 44]. Most recent works [1, 26] further extend zero-shot learning to the semantic segmentation task, projecting semantic word embeddings to synthetic visual features [1] or classifier weights [26]. However, the structural relations between seen and unseen classes are not well explored. In this work, instead of simple node-to-node mapping, we tackle zero-shot segmentation from a new perspective, as structural relation learning from the semantic space to the visual space.

3 Preliminaries

We denote a set of seen classes as $\mathcal{S}$ and a disjoint set of unseen classes as $\mathcal{U}$, where $\mathcal{S} \cap \mathcal{U} = \emptyset$. Let $\mathcal{D}^s = \{(x, y) \mid x \in \mathcal{X}, y \in \mathcal{Y}^s\}$ represent the set of labeled training data on seen classes, where $x$ is a pixel-wise feature embedding from the visual space $\mathcal{X} \subset \mathbb{R}^{d_v}$ and $y$ is the corresponding label in the label space $\mathcal{Y}^s$ of seen classes. Similar to the generalized zero-shot learning setting, in the task of GZS3 we aim to learn a model that takes an image as input and predicts the label of each pixel among both seen and unseen classes $\mathcal{S} \cup \mathcal{U}$. Clearly, without any side information, zero-shot learning is infeasible as there are no training samples of unseen classes. Thus, to achieve the goal of zero-shot learning, besides the training set $\mathcal{D}^s$, we are also provided with the semantic word embeddings $\{a_j \mid a_j \in \mathcal{A}\}_{j=1}^{|\mathcal{S}|+|\mathcal{U}|}$ for both seen and unseen classes, where the semantic space is $\mathcal{A} \subset \mathbb{R}^{d_w}$. The $d_w$-dimensional semantic embeddings could be word representations (e.g., word2vec [45] or GloVe embeddings [46]) or class attribute vectors [47]. To overcome the absence of unseen visual features, recent works [32, 33] adopt a generative model to produce unseen visual features. Specifically, a generator $G: \mathcal{A} \rightarrow \mathcal{X}$ is learned to generate visual features using the corresponding word embeddings as input. Another benefit of these generative-based methods is that one can achieve the goal of zero-shot learning by directly adopting an existing CNN model (e.g., DeepLab) without complex architecture modification. Concretely, the generator $G$ is learned on seen classes and then generates visual features for unseen classes. A new classifier (usually the last layer of the CNN) is retrained on real seen visual features and generated unseen visual features. At test time, the label of each pixel is predicted by selecting the category with the largest probability.

4 Methodology

As shown in Figure 2, we illustrate the details of the proposed CSRL framework. The goal of CSRL is to learn a better generator that produces visual features using semantic word embeddings as input. To achieve this goal, we introduce a semantic-visual structural generator that alternately updates the node features of each category and the inter-category relations. We further exploit the structural relation consistency between seen and unseen categories to constrain the generation of unseen visual features. These structural relations include the point-wise, pair-wise and list-wise relations between seen and unseen categories.
Generalized zero-shot semantic segmentation is then achieved by learning on real seen visual features and the generated unseen visual features.

4.1 Semantic-Visual Structural Generator

Given a set of semantic word embeddings including samples from both seen and unseen categories, we aim to generate the corresponding set of synthetic visual features while considering the relationships among categories. Such semantic-to-visual generation is achieved by a node-edge graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, called the semantic-visual structural generator in this work. The nodes $\mathcal{V} := \{v_{i,n} \mid i \in [1, |\mathcal{S}|+|\mathcal{U}|], n \in [1, N]\}$ in the graph denote the pixel-level feature embeddings, with a total of $N$ samples for category $i$. The edges $\mathcal{E} := \{e_{ij} \mid i, j \in [1, |\mathcal{S}|+|\mathcal{U}|]\}$ are constructed based on the relationships between the prototypes of categories $i$ and $j$ (a minimal sketch of this graph container is given below).
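As an illustration of the data layout only, the following is a minimal PyTorch sketch (not the authors' code) of how the node and edge tensors of such a graph could be stored. The shapes follow the definitions above; the class name, sample count, feature dimension and the random placeholder values are our own assumptions.

```python
# Minimal sketch of the node-edge graph container used by the structural
# generator. Nodes hold N feature samples per category, edges hold one scalar
# relation per category pair. Illustrative only; names are assumptions.
from typing import NamedTuple
import torch

class StructuralGraph(NamedTuple):
    nodes: torch.Tensor  # [C, N, d]  with C = |S| + |U| categories, N samples each
    edges: torch.Tensor  # [C, C]     pairwise category relations e_ij

num_seen, num_unseen = 18, 2          # e.g., a Pascal-VOC unseen-2 split
C = num_seen + num_unseen
N, d = 16, 300                        # samples per category, feature dim (assumed)

graph = StructuralGraph(
    nodes=torch.randn(C, N, d),       # placeholder node features
    edges=torch.eye(C),               # placeholder relations
)
print(graph.nodes.shape, graph.edges.shape)
```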

Figure 2: The framework of the proposed CSRL. Our CSRL incorporates feature generation and relation learning into a unified architecture. Given the semantic word embeddings, CSRL generates visual features by alternating feature and relation aggregation. The proposed CSRL is trained under supervision from point-wise consistency on seen classes, and pair-wise and list-wise consistency across seen and unseen classes.

The structural generator consists of $L$ layers, where each layer contains a feature aggregation step to update the node features and a relation aggregation step to update the edge features. We denote $v^{\ell}_{i}$ and $e^{\ell}_{ij}$ as the node feature and the edge feature of layer $\ell \in [1, L]$, respectively.

As the semantic word embedding $a_i$ is a deterministic value, we enhance the feature diversity by concatenating a random variable $z$ drawn from a Gaussian distribution. Thus, node features are initialized by the semantic word embeddings, $v^{0}_{i,n} = [a_i \oplus z_{i,n}]$, where $\oplus$ denotes the concatenation operation. Edge features $e^{0}_{ij} = a_i \cdot a_j / (\|a_i\|_2 \|a_j\|_2)$ are initialized by the cosine similarity between semantic word embeddings.

Feature Aggregation To alleviate the issue introduced by abnormal samples, especially when there are only a limited number of samples in one category, we aggregate the feature representation based on the category prototypes instead of raw samples. Specifically, the category prototype $p_i$ is defined as

$$p^{\ell-1}_i = \frac{1}{N} \sum_{n=1}^{N} v^{\ell-1}_{i,n}. \qquad (1)$$

After calculating all prototype representations $\{p_i \mid i \in [1, |\mathcal{S}|+|\mathcal{U}|]\}$, we are able to propagate the relevant knowledge from other categories based on the edge features. The node feature aggregation of the $\ell$-th layer follows

$$v^{\ell}_{i,n} = f_v\Big(\Big[v^{\ell-1}_{i,n} \oplus \sum_{j=1, j \neq i}^{|\mathcal{S}|+|\mathcal{U}|} e^{\ell-1}_{ij}\, p^{\ell-1}_j\Big]; \phi_v\Big), \qquad (2)$$

where $f_v$ is a transformation network with parameters $\phi_v$.

Relation Aggregation After aggregating the node features, the edge feature aggregation is processed based on the newly updated node features. The edge feature aggregation of the $\ell$-th layer follows

$$e^{\ell}_{ij} = f_e\big(\|p^{\ell}_i - p^{\ell}_j\|; \phi_e\big)\, e^{\ell-1}_{ij}, \qquad (3)$$

where $f_e$ is a transformation network with parameters $\phi_e$.

By alternating the feature aggregation and relation aggregation steps, we simultaneously achieve feature generation and relation learning. At layer $L$, the output nodes are the generated visual features $\hat{x}$ of both seen and unseen categories, while the edge features are the learned relations between categories.
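To make the alternating update concrete, below is a minimal PyTorch sketch of the initialization and of one feature/relation aggregation layer following Eqs. (1)-(3). It is an illustration under our own assumptions: the MLP architectures of $f_v$ and $f_e$, the hidden sizes, and the L2 distance used inside Eq. (3) are placeholders, not the authors' implementation.

```python
# Sketch of one structural-generator layer (Eqs. 1-3). Assumptions: f_v and
# f_e are small MLPs; the edge update uses the L2 distance between prototypes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuralLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # f_v: maps [node ⊕ aggregated prototype message] to the new node feature
        self.f_v = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        # f_e: maps a prototype-distance scalar to an edge multiplier in (0, 1)
        self.f_e = nn.Sequential(nn.Linear(1, dim), nn.ReLU(), nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, nodes: torch.Tensor, edges: torch.Tensor):
        # nodes: [C, N, d], edges: [C, C]
        C, N, d = nodes.shape
        # Eq. (1): category prototypes from the previous layer's nodes.
        proto = nodes.mean(dim=1)                               # [C, d]
        # Eq. (2): message from the other categories, weighted by the edges.
        mask = 1.0 - torch.eye(C, device=nodes.device)
        msg = (edges * mask) @ proto                            # [C, d]
        msg = msg.unsqueeze(1).expand(-1, N, -1)                # [C, N, d]
        new_nodes = self.f_v(torch.cat([nodes, msg], dim=-1))   # [C, N, d]
        # Eq. (3): edge update from distances between updated prototypes.
        new_proto = new_nodes.mean(dim=1)                       # [C, d]
        dist = torch.cdist(new_proto, new_proto).unsqueeze(-1)  # [C, C, 1]
        new_edges = self.f_e(dist).squeeze(-1) * edges          # [C, C]
        return new_nodes, new_edges

def init_graph(word_emb: torch.Tensor, N: int, noise_dim: int):
    # Initialization (Sec. 4.1): nodes = [a_i ⊕ z], edges = cosine similarity.
    C, d_w = word_emb.shape
    z = torch.randn(C, N, noise_dim)
    nodes = torch.cat([word_emb.unsqueeze(1).expand(-1, N, -1), z], dim=-1)
    a = F.normalize(word_emb, dim=-1)
    edges = a @ a.t()
    return nodes, edges

# Toy usage with random word embeddings (placeholders).
word_emb = torch.randn(20, 300)
nodes, edges = init_graph(word_emb, N=16, noise_dim=100)
layer = StructuralLayer(dim=nodes.shape[-1])
nodes, edges = layer(nodes, edges)
print(nodes.shape, edges.shape)
```

Stacking $L$ such layers and reading out the node features of the last layer would yield the generated visual features, with the last-layer edges serving as the learned inter-category relations.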

4.2 Consistent Structural Relation Learning

The key to generalized zero-shot segmentation is the ability to generate visual features $\hat{x} \in \hat{\mathcal{X}}$ conditioned on the semantic word embedding $a$, even without access to any image pixels of that category. In order to learn a better generator, we explore relational constraints at different structure granularities as supervision signals to train the generator $G$.

Point-wise consistency At training time, only the real visual features of seen categories are accessible. Thus, on these seen categories, we optimize the distribution divergence between real visual features and generated visual features as a supervision signal. As this divergence reflects the consistency of every single category between the real and generated visual feature distributions, we refer to it as point-wise consistency. Here, we minimize the distribution divergence on seen categories by optimizing the maximum mean discrepancy,

$$\mathcal{L}_{point} = \frac{1}{|\mathcal{S}|} \sum_{c=1}^{|\mathcal{S}|} \Big[ \mathbb{E}_{x, x' \sim \mathcal{X}^c} K(x, x') + \mathbb{E}_{\hat{x}, \hat{x}' \sim \hat{\mathcal{X}}^c} K(\hat{x}, \hat{x}') - 2\, \mathbb{E}_{x \sim \mathcal{X}^c,\, \hat{x} \sim \hat{\mathcal{X}}^c} K(x, \hat{x}) \Big], \qquad (4)$$

where $K$ is the Gaussian kernel with bandwidth parameter $\sigma$, defined as $K(x, x') = \exp(-\frac{1}{2\sigma^2}\|x - x'\|^2)$.

By optimizing the point-wise consistency on seen categories, there is no explicit constraint on the generation of unseen categories; the quality of the produced unseen features thus relies purely on the generalization ability of the generator. To enhance and constrain the visual feature generation, especially for unseen categories, we transfer the structural relations in the semantic word embedding space to the generated visual feature space. In this paper, we consider pair-wise consistency and list-wise consistency. The pair-wise relations reflect that the feature similarity between two categories, i.e., one seen category and one unseen one, should be consistent in both the semantic space and the visual space. The list-wise relations require that the relation ranking permutation order should also be consistent between the semantic space and the visual space.

Pair-wise consistency We extract the relation matrix between unseen and seen categories from the edge features in the structural generator $G$ as $M = \{e_{ij} \mid i \in [1, |\mathcal{U}|], j \in [1, |\mathcal{S}|]\} \in \mathbb{R}^{|\mathcal{U}| \times |\mathcal{S}|}$. For each unseen category, the relation values are further normalized by applying a softmax function as follows,

$$\tilde{e}_{ij} = \frac{\exp(e_{ij}/\gamma)}{\sum_{j'=1}^{|\mathcal{S}|} \exp(e_{ij'}/\gamma)}, \qquad (5)$$

where $\gamma$ is a scaling factor to soften the relation distribution. Thus, in the semantic word embedding space (i.e., at the input layer 0), we obtain the relation matrix $M^{\mathcal{A}}$. In the generated visual feature space (i.e., at the output layer $L$), the relation matrix is denoted as $M^{\hat{\mathcal{X}}}$.

To maintain the pair-wise relation consistency between the semantic space and the visual feature space, we adopt the Kullback-Leibler divergence as the learning objective. Concretely, the pair-wise consistency is defined as

$$\mathcal{L}_{pair}(M^{\mathcal{A}}, M^{\hat{\mathcal{X}}}) = \frac{1}{|\mathcal{U}|} \sum_{i=1}^{|\mathcal{U}|} D_{KL}\big[ M^{\mathcal{A}}_i \,\|\, M^{\hat{\mathcal{X}}}_i \big]. \qquad (6)$$

List-wise consistency Instead of focusing only on the relationship of a pair of categories at a time, inspired by [48, 49], we further investigate the entire ranking permutation of the relation list as complementary supervision. The core idea is that we treat the relation ranking as a distribution rather than a deterministic order, and aim to associate a probability with every rank permutation in both the semantic space and the visual space. Given one permutation $\pi$ of the relation list, where $\pi(j)$ denotes the $j$-th list index of this permutation, we calculate the probability of this ranking permutation as

$$P(\pi \mid M_i) = \prod_{j=1}^{|\mathcal{S}|} \frac{\exp(e_{i\pi(j)}/\gamma)}{\sum_{k=j}^{|\mathcal{S}|} \exp(e_{i\pi(k)}/\gamma)}, \qquad (7)$$

where $\gamma$ is a scaling factor.

We aim to keep all possible relation ranking permutations $\pi \in \mathcal{P}$ as consistent as possible between the semantic space and the visual feature space. Similar to the pair-wise consistency, the list-wise consistency is defined as

$$\mathcal{L}_{list}(M^{\mathcal{A}}, M^{\hat{\mathcal{X}}}) = \frac{1}{|\mathcal{U}|} \sum_{i=1}^{|\mathcal{U}|} D_{KL}\big[ P(\pi \in \mathcal{P} \mid M^{\mathcal{A}}_i) \,\|\, P(\pi \in \mathcal{P} \mid M^{\hat{\mathcal{X}}}_i) \big]. \qquad (8)$$
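For illustration, the following PyTorch sketch implements the point-wise MMD loss of Eq. (4) and the pair-wise KL consistency of Eqs. (5)-(6), together with a ListMLE-style surrogate for the list-wise term. Evaluating Eq. (8) exactly would require enumerating all $|\mathcal{S}|!$ permutations, so the surrogate only scores the ranking induced by the semantic relations under Eq. (7); this surrogate, as well as the $\sigma$ and $\gamma$ values, are our own assumptions rather than the paper's stated implementation.

```python
# Sketch of the CSRL consistency losses (Eqs. 4-8). Bandwidth sigma, scaling
# gamma and the list-wise surrogate are assumptions, not the paper's settings.
import torch
import torch.nn.functional as F

def gaussian_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 sigma^2)) for all pairs of rows.
    d2 = torch.cdist(x, y).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))

def point_wise_mmd(real_feats, fake_feats, sigma=1.0):
    # Eq. (4) for one seen category; average the result over seen categories.
    k_rr = gaussian_kernel(real_feats, real_feats, sigma).mean()
    k_ff = gaussian_kernel(fake_feats, fake_feats, sigma).mean()
    k_rf = gaussian_kernel(real_feats, fake_feats, sigma).mean()
    return k_rr + k_ff - 2 * k_rf

def normalize_relations(edges_us, gamma=0.1):
    # Eq. (5): softmax over seen categories for each unseen row.
    # edges_us: [|U|, |S|] relations between unseen (rows) and seen (cols).
    return F.softmax(edges_us / gamma, dim=1)

def pair_wise_loss(rel_semantic, rel_visual):
    # Eq. (6): KL(M^A_i || M^X_i), averaged over unseen categories.
    return F.kl_div(rel_visual.log(), rel_semantic, reduction="batchmean")

def list_wise_loss(edges_semantic, edges_visual, gamma=0.1):
    # ListMLE-style surrogate for Eq. (8): negative Plackett-Luce log-likelihood
    # (Eq. 7) of the semantic-space ranking under the visual-space relations.
    order = edges_semantic.argsort(dim=1, descending=True)        # target ranking
    scores = torch.gather(edges_visual, 1, order) / gamma          # [|U|, |S|]
    # log of the denominator sum_{k>=j} exp(s_k), computed right-to-left.
    cum_logsumexp = torch.logcumsumexp(scores.flip(dims=[1]), dim=1).flip(dims=[1])
    log_pl = (scores - cum_logsumexp).sum(dim=1)                   # log P(pi* | M_i)
    return -log_pl.mean()

# Toy usage with random relations (2 unseen, 18 seen categories).
edges_a = torch.randn(2, 18)   # semantic-space relations (layer 0)
edges_x = torch.randn(2, 18)   # visual-space relations (layer L)
loss = (point_wise_mmd(torch.randn(32, 256), torch.randn(32, 256))
        + pair_wise_loss(normalize_relations(edges_a), normalize_relations(edges_x))
        + list_wise_loss(edges_a, edges_x))
print(loss.item())
```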

Table 1: Generalized zero-shot semantic segmentation performance on Pascal-VOC. [Table values are not preserved in this transcription; columns: Methods, Seen mIoU, Unseen mIoU, Overall mIoU, Overall hIoU.]

4.3 Training and Inference

In this subsection, we introduce the whole procedure to achieve GZS3. During the training stage, we start by training an off-the-shelf segmentation model (e.g., DeepLabv3+) on all annotated data from seen categories. After training on seen categories, we remove the last classification layer, and the remaining network serves as a visual feature extractor to obtain the training set of seen categories $\mathcal{D}^s$. Then, we train our semantic-visual structural generator $G$ under the supervision of the consistent structural relation learning losses,

$$\mathcal{L}(\phi) = \mathcal{L}_{point} + \mathcal{L}_{pair} + \mathcal{L}_{list}. \qquad (9)$$

To maintain simplicity, we directly add these three terms. Once the generator $G$ is trained, arbitrarily many visual features can be generated from semantic word embeddings, especially for unseen categories. In this way, we build a generated unseen training set denoted as $\hat{\mathcal{D}}^u = \{(\hat{x}, y) \mid \hat{x} \in \hat{\mathcal{X}}, y \in \mathcal{Y}^u\}$. A new pixel-level classifier is trained on the combined training set, including the real seen visual features from $\mathcal{D}^s$ and the generated unseen visual features from $\hat{\mathcal{D}}^u$. In this way, the new model can be used to conduct generalized zero-shot semantic segmentation of a given image that exhibits categories from both seen and unseen classes (a minimal code sketch of this two-stage procedure is given below, after the dataset description).

5 Experiments

5.1 Experiment Settings

Datasets We conduct experiments on two datasets, Pascal-VOC [50] and Pascal-Context [51]. Pascal-VOC focuses on the object semantic segmentation scenario and contains 10,582 training and 1,449 validation images from 20 classes. Pascal-Context targets the scene parsing scenario and comprises 4,998 training and 5,105 validation images from 59 classes. Following [1], we construct zero-shot segmentation setups with different numbers of unseen classes, including 2, 4, 6, 8 and 10 unseen classes, with all the remaining ones serving as seen classes. Concretely, the unseen class set is extended in an incremental manner, i.e., the 4-unseen set contains the 2-unseen set. The unseen class splits are 2-cow/motorbike, 4-airplane/sofa, 6-cat/tv, 8-train/bottle, 10-chair/potted-plant for the Pascal-VOC dataset and 2-cow/motorbike, 4-sofa/cat, 6-boat/fence, 8-bird/tvmonitor, 10-keyboard/aeroplane for the Pascal-Context dataset.
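Returning to the training procedure of Section 4.3, the sketch below outlines the two-stage recipe: train the generator under the combined loss of Eq. (9), synthesize unseen features, then retrain a pixel-level classifier on real seen plus generated unseen features. Everything here is a placeholder of our own (a dummy MLP generator, random stand-in features, and a simple regression term standing in for the full CSRL losses sketched earlier), not the authors' released code.

```python
# Sketch of the two-stage CSRL training recipe (Sec. 4.3) on toy random data.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_w, d_v, num_seen, num_unseen, N = 300, 256, 18, 2, 16

# --- Stage 1: train the generator with L = L_point + L_pair + L_list (Eq. 9). ---
generator = nn.Sequential(nn.Linear(d_w + 100, 256), nn.ReLU(), nn.Linear(256, d_v))
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
word_emb = torch.randn(num_seen + num_unseen, d_w)     # placeholder word embeddings
seen_feats = torch.randn(num_seen, N, d_v)             # placeholder real seen features
for _ in range(100):
    z = torch.randn(num_seen + num_unseen, N, 100)
    fake = generator(torch.cat([word_emb.unsqueeze(1).expand(-1, N, -1), z], -1))
    # In CSRL this would be L_point + L_pair + L_list; a simple regression term
    # on the seen classes stands in here for brevity.
    loss_g = ((fake[:num_seen] - seen_feats) ** 2).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# --- Stage 2: retrain a pixel-level classifier on D^s plus generated D^u. ---
with torch.no_grad():
    z = torch.randn(num_unseen, N, 100)
    fake_unseen = generator(
        torch.cat([word_emb[num_seen:].unsqueeze(1).expand(-1, N, -1), z], -1))
feats = torch.cat([seen_feats.reshape(-1, d_v), fake_unseen.reshape(-1, d_v)])
labels = torch.cat([torch.arange(num_seen).repeat_interleave(N),
                    torch.arange(num_seen, num_seen + num_unseen).repeat_interleave(N)])
clf = nn.Linear(d_v, num_seen + num_unseen)
opt_c = torch.optim.Adam(clf.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()
for _ in range(100):
    loss_c = ce(clf(feats), labels)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
# At test time, per-pixel features from the frozen backbone are passed through
# `clf`, and the predicted label is the argmax over all seen and unseen classes.
```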

Table 2: Generalized zero-shot semantic segmentation results on Pascal-Context. [Table values are not preserved in this transcription; columns: Methods, Seen mIoU, Unseen mIoU, Overall mIoU, Overall hIoU.]

Evaluation Metrics In our experiments, similar to the standard semantic segmentation task, we adopt the mean intersection-over-union (mIoU) as the principal metric. Generalized zero-shot semantic segmentation focuses on the overall performance including both seen and unseen categories. To avoid the performance on seen categories dominating, we also report the harmonic mean (hIoU) of the seen mIoU and the unseen mIoU, as suggested by [52],

$$\text{hIoU} = \frac{2 \cdot \text{mIoU}_s \cdot \text{mIoU}_u}{\text{mIoU}_s + \text{mIoU}_u}. \qquad (10)$$

Implementation Details We choose DeepLabv3+ [6] with ResNet-101 [53] as our segmentation network. ImageNet [54] covers a wide range of categories, where most unseen categories are actually included. Therefore, directly adopting the publicly available ImageNet pre-trained model may break the zero-shot learning setting. To avoid supervision leakage from unseen classes, we employ the model provided by [1], which is pre-trained solely on seen categories. For the aggregation networks $f_e$ and $f_v$ in Sec. 4.1, we use the multi-layer perceptron network proposed by [55]. We implemented our method on both the PyTorch platform and the PaddlePaddle platform, achieving similar performance on both. More details of the network structure and parameter settings can be found in our supplementary materials.

5.2 Comparisons with State-of-the-art Methods

We compare our proposed CSRL with SegDeViSe [56], SPNet [26] and ZS3Net [1]. SegDeViSe regresses semantic word features from pixel-level visual features and is learned by maximizing the cosine similarity between the output and the target word embeddings. SPNet encodes images in the word embedding space and uses a semantic projection layer to produce class probabilities. ZS3Net is the current state-of-the-art method, which generates unseen visual features from word embeddings to achieve zero-shot segmentation. All these methods adopt the same segmentation network, i.e., DeepLabv3+, for a fair comparison. The key commonality shared by these methods is that they treat each category as an independent point without considering its relations to other categories. Differently, we generate the unseen visual features by exploring the structural relations between categories.

We report the performance of generalized zero-shot semantic segmentation on the Pascal-VOC dataset in Table 1 and the Pascal-Context dataset in Table 2. Results of SPNet are based on our implementation, and the results of ZS3Net and SegDeViSe are taken directly from [1]. In these two tables, first, we observe that the generative methods (i.e., ZS3Net, CSRL) significantly outperform the semantic embedding-based methods (i.e., SegDeViSe, SPNet). The semantic embedding-based methods, although performing well on seen categories, suffer a large performance drop on unseen ones. By leveraging structural relation consistency to better guide the generation of unseen visual features, our CSRL provides significant gains, particularly on the unseen classes (e.g., +10.3% for the 2-unseen split in terms of unseen mIoU).

Figure 3: Qualitative comparisons on the Pascal-VOC dataset under the unseen-2 setting (columns: input image, ground truth, ZS3Net, CSRL).

Second, our CSRL significantly outperforms the others by large margins across the various splits (+7-12% in hIoU), which well demonstrates the effectiveness of the consistent structural relation learning framework. Third, our CSRL also achieves large performance gains on the more challenging Pascal-Context benchmark, which requires dense predictions for the full images. The qualitative comparison between ZS3Net and CSRL is shown in Figure 3. We can observe that our CSRL achieves much better segmentation results and successfully recognizes the unseen objects (e.g., cow and motorbike) where ZS3Net mostly fails. More qualitative results are provided in the supplementary materials.

Table 3: Ablation study of CSRL on Pascal-VOC (unseen-2 split).

Exp  | Point | Pair | List | Seen mIoU | Unseen mIoU | Overall mIoU | Overall hIoU
I    |   X   |  -   |  -   |   73.0%   |    40.3%    |    69.8%     |    51.9%
II   |   X   |  X   |  -   |   73.4%   |    43.3%    |    70.5%     |    54.5%
III  |   X   |  -   |  X   |   73.0%   |    42.7%    |    70.1%     |    53.9%
CSRL |   X   |  X   |  X   |   73.4%   |    45.7%    |    70.7%     |    56.3%

Figure 4: Relations between unseen (cow and motorbike) and seen categories (panels: semantic space relation, visual space relation, generated visual feature relations w/o CSRL, and generated visual feature relations w/ CSRL).

5.3 Ablation Analysis

Quantitative analysis for structural relations We conduct an extensive quantitative analysis of the key components in CSRL. In Table 3, we compare the effects of the different structural relations on the unseen-2 split of Pascal-VOC. First, simply performing node-to-node generation results in an hIoU score of 51.9% (I). Second, introducing the pair-wise relation for optimization significantly enhances the hIoU, to 54.5% (II). Third, replacing the pair-wise relation with the list-wise relation still yields a notable improvement, i.e., +2.0% (III). Finally, simultaneously considering all the components leads to the best hIoU score of 56.3% (CSRL).

Qualitative analysis for inter-category relations In Figure 4, we visualize the inter-category relations between the unseen categories (i.e., cow and motorbike) and the seen categories based on different feature embeddings. The relations are normalized for better visualization: the darker the color, the stronger the relation. "Semantic Space Relation" and "Visual Space Relation" indicate cosine similarities computed from word2vec features and from CNN features with supervised training, respectively. First, we can observe that the relations between unseen and seen categories remain consistent across the different feature spaces. Second, by introducing CSRL, the relations of the generated visual features become more consistent compared to those without CSRL, leading to better discriminative ability.
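As a quick check of Eq. (10) against the ablation numbers in Table 3, the small snippet below recomputes the harmonic mean from the reported seen and unseen mIoU values; it is only a worked example of the metric, not evaluation code.

```python
# Recompute the Overall hIoU column of Table 3 from Eq. (10).
def hiou(miou_seen: float, miou_unseen: float) -> float:
    return 2 * miou_seen * miou_unseen / (miou_seen + miou_unseen)

rows = {"I": (73.0, 40.3), "II": (73.4, 43.3), "III": (73.0, 42.7), "CSRL": (73.4, 45.7)}
for name, (s, u) in rows.items():
    print(f"{name}: hIoU = {hiou(s, u):.1f}%")   # 51.9, 54.5, 53.9, 56.3
```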

6 Conclusion

In this paper, to tackle the challenging generalized zero-shot semantic segmentation task, we proposed a simple yet effective framework called Consistent Structural Relation Learning (CSRL). We propose a semantic-visual structural generator that integrates both feature generation and relation learning in a unified network architecture. We effectively explore relation consistency at multiple structure granularities to better guide the generation of unseen visual features. The proposed CSRL achieves a new state of the art on two zero-shot segmentation benchmarks, outperforming former practices by a large margin. Although CSRL achieves a large improvement for generalized zero-shot semantic segmentation, there is still a long way to go: a large performance gap remains between the seen and the unseen categories on the two benchmarks. Thus, more effective GZS3 algorithms are still required to alleviate this gap. We hope that our efforts will motivate more researchers and ease future research.

Acknowledgment

This work is partly supported by ARC DECRA DE190101315 and ARC DP200100938.

Broader Impact

Our research advances the zero-shot segmentation task, which alleviates the need for expensive human annotations when learning unseen categories. Moreover, our approach has a lower computational cost, as it only requires re-training the classification head rather than the whole network. Thus our research is more financially and environmentally friendly compared to the traditional fully-supervised learning paradigm. By utilizing the large number of available word embedding vectors, the network can be built with stronger scalability to potential unseen categories.

References

[1] Bucher, M., T.-H. Vu, M. Cord, et al. Zero-shot semantic segmentation. In Advances in Neural Information Processing Systems (NeurIPS). 2019.
[2] Noh, H., S. Hong, B. Han. Learning deconvolution network for semantic segmentation. In IEEE International Conference on Computer Vision (ICCV), pages 1520-1528. 2015.
[3] Long, J., E. Shelhamer, T. Darrell. Fully convolutional networks for semantic segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3431-3440. 2015.
[4] Liang, X., Z. Hu, H. Zhang, et al. Symbolic graph reasoning meets convolutions. In Advances in Neural Information Processing Systems (NeurIPS), pages 1853-1863. 2018.
[5] Chen, L.-C., G. Papandreou, I. Kokkinos, et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):834-848, 2017.
[6] Chen, L.-C., Y. Zhu, G. Papandreou, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation. In European Conference on Computer Vision (ECCV). 2018.
