SUPPLEMENTARY MATERIAL — AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection (NeurIPS 2020)


——SUPPLEMENTARY MATERIAL——
AOT: Appearance Optimal Transport Based Identity Swapping for Forgery Detection

Hao Zhu 1,2, Chaoyou Fu 1,3, Qianyi Wu 4, Wayne Wu 4, Chen Qian 4, Ran He 1,3†
1 NLPR & CEBSIT & CRIPAC, CASIA   2 Anhui University
3 University of Chinese Academy of Sciences   4 SenseTime Research
haozhu96@gmail.com   ianchen}@sensetime.com

Appendix

A  Implementation Details

A.1  Data Preparation

The training data consists only of target faces and reenacted faces. The target faces are extracted directly from the original FF [9] (900 videos) and DPF-1.0 [4] (10 identities). We then leverage DFL [8] and FSGAN [6] to produce the reenacted faces, using identities that do not appear among the target faces. Our quantitative experiments are conducted on the remaining videos in FF and DPF-1.0.

For each video, we first detect 106 facial landmarks, then crop the face area and resize it to 256×256 resolution. To obtain the PNCC and normals, we use 3DDFA [15] to estimate the 3D mesh of each face and render the mesh with the corresponding PNCC and normal codes to images via a neural renderer [5].

A.2  Training Strategies

We implement our model in PyTorch [7]. The model is trained for 200K iterations on two NVIDIA 1080Ti GPUs with a batch size of 16. We use the Adam optimizer for the relighting generator with β1 = 0.5, β2 = 0.999, and weight decay 0.0002, and the RMSprop optimizer for the Mix-and-Segment Discriminator (MSD), Ω, and Ψ with β = 0.9. The learning rates of both the Adam and the RMSprop optimizers are set to 0.0002. In Ltotal, we set λ1 = 120, λ2 = 1, λ3 = 90, and λ4 = 1. The full training algorithm is summarized in Algorithm 1.

A.3  Network Architectures

The full architecture is shown in Fig. S1.

Equal contribution. †Corresponding author.
34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.
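The optimizer configuration described above can be sketched in PyTorch. This is a minimal sketch: the `nn.Linear` modules are placeholders standing in for the actual networks of Fig. S1, and mapping the stated RMSprop "β = 0.9" to PyTorch's `alpha` smoothing constant is our interpretation.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the actual networks (assumption:
# the real architectures are those shown in Fig. S1).
relighting_generator = nn.Linear(256, 256)
msd = nn.Linear(256, 1)          # Mix-and-Segment Discriminator
omega = nn.Linear(256, 256)      # Ω (feature transport)
psi = nn.Linear(256, 1)          # Ψ (critic)

# Adam for the relighting generator: β1 = 0.5, β2 = 0.999, weight decay 2e-4.
opt_g = torch.optim.Adam(relighting_generator.parameters(),
                         lr=2e-4, betas=(0.5, 0.999), weight_decay=2e-4)

# RMSprop for MSD, Ω, and Ψ; smoothing constant 0.9, same learning rate.
opt_d = torch.optim.RMSprop(
    list(msd.parameters()) + list(omega.parameters()) + list(psi.parameters()),
    lr=2e-4, alpha=0.9)

# Loss weights of L_total as stated in the text.
lambdas = {"lambda1": 120, "lambda2": 1, "lambda3": 90, "lambda4": 1}
```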

Algorithm 1 Training algorithm.
Require: {Xr}N, {Xt}N
Require: Initialize Ωi, Ψi, PerceptualEncoder, Decoder, and MSD with parameters θi, ωi, α, β, γ, respectively.
 1: while not converged do
 2:   Sample mini-batch {xr}
 3:   Sample mini-batch {xt}
 4:   // Forward: Encoder
 5:   F^1_{Xr}, F^2_{Xr}, F^3_{Xr}, F^4_{Xr} ← PerceptualEncoder({xr})
 6:   F^1_{Xt}, F^2_{Xt}, F^3_{Xt}, F^4_{Xt} ← PerceptualEncoder({xt})
 7:   // Update: NOTPE
 8:   for i = 1, 2, 3 do
 9:     for j = 1, ..., ni do
10:       Sample v_r^j ∼ F^i_{Xr}
11:       Sample v_t^j ∼ F^i_{Xt}
12:       g_{ωi} ← ∇_{ωi} [ (1/m) Σ_{j=1}^{m} Ψi(Ωi(v_r^j | Xr, Xt)) − (1/m) Σ_{j=1}^{m} Ψi(v_t^j) ]
13:       ωi ← ωi + α · RMSProp(ωi, g_{ωi})
14:       ωi ← clip(ωi, −c, c)
15:     end for
16:   end for
17:   for i = 1, 2, 3 do
18:     for j = 1, ..., ni do
19:       Sample v_r^j ∼ F^i_{Xr}
20:       g_{θi} ← −∇_{θi} (1/m) Σ_{j=1}^{m} Ψi(Ωi(v_r^j | Xr, Xt))
21:       θi ← θi − α · RMSProp(θi, g_{θi})
22:     end for
23:   end for
24:   // Forward: NOTPE
25:   for i = 1, 2, 3 do
26:     F^i_Y ← Ωi(F^i_{Xr} | Xr, Xt)
27:   end for
28:   // Forward: Decoder
29:   Y_{t,t} ← Decoder(F^1_{Xt}, F^2_{Xt}, F^3_{Xt}, F^4_{Xt})
30:   Y_{r,t} ← Decoder(F^1_Y, F^2_Y, F^3_Y, F^4_{Xr})
31:   // Update: MSD
32:   Mr ← RandomMaskGenerator()
33:   Y_mix ← Mix(Y_{t,t}, Y_{r,t}, Mr)
34:   g_γ ← ∇_γ [ (1/m) Σ [Mr ⊙ MSD(Y_mix)] − (1/m) Σ [(1 − Mr) ⊙ MSD(Y_mix)] ]
35:   γ ← γ + α · Adam(γ, g_γ)
36:   // Update: Perceptual Encoder, Decoder
37:   g_β ← ∇_β (1/m) Σ Loss(Y_{t,t}, Y_{r,t}, Xt)
38:   β ← β − α · Adam(β, g_β)
39: end while
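The critic update in Algorithm 1 (gradient of the mean critic score difference, an RMSProp step, then weight clipping) follows the standard WGAN recipe. A toy NumPy sketch of one such step; the linear critic, batch size, and feature dimension are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 64, 16            # mini-batch size m and feature dimension (illustrative)
c, lr, decay = 0.01, 2e-4, 0.9

w = rng.normal(size=d)   # critic parameters ω_i, with a linear critic Ψ_i(v) = wᵀv
sq_avg = np.zeros(d)     # RMSProp running average of squared gradients

v_r = rng.normal(size=(m, d))    # transported vectors Ω_i(v_r | X_r, X_t)
v_t = rng.normal(size=(m, d))    # target vectors sampled from F^i_Xt

# g_ω = ∇_ω [ mean Ψ(v_r) − mean Ψ(v_t) ]; for a linear critic this is
# simply the difference of the batch means.
g = v_r.mean(axis=0) - v_t.mean(axis=0)

# One RMSProp ascent step on the critic, then clip to [−c, c].
sq_avg = decay * sq_avg + (1 - decay) * g ** 2
w = w + lr * g / (np.sqrt(sq_avg) + 1e-8)
w = np.clip(w, -c, c)
```

Clipping keeps the critic parameters in the WGAN constraint set after every update, which is what makes the critic (approximately) Lipschitz.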

Figure S1: The detailed pipeline of our proposed model. [Panels: Perceptual Encoder, Relighting Generator (NOTFE blocks), Face Decoder, and Mix-and-Segment Discriminator (MSD).]

B  Compared Baselines

B.1  Face Swapping Methods

DeepFaceLab. DeepFaceLab (DFL) [8] requires retraining the model for each source identity; that is, we have to train a separate DFL model for each video. Note that DFL exposes many options for tuning the results; in practice, we use the options reported in Table S1.

FSGAN. FSGAN [6] is a landmark-guided, subject-agnostic method. We use the latest models provided by the authors.

B.2  Appearance Transfer Methods

Poisson Blending. Poisson Blending is a classical image harmonization method. We use the OpenCV implementation and set the flag cv2.NORMAL_CLONE.

Deep Image Harmonization (DIH) [13].³ DIH is a deep-learning-based image harmonization method that captures both the context and the semantic patterns of images rather than relying on hand-crafted features.

³DIH: https://github.com/wasidennis/DeepHarmonization

Table S1: Options of DeepFaceLab.

Training Options
name                   choice     name                    choice
resolution             224        gan power
face type              f          true face power
models opt on gpu      True       face style power
archi                  dfhd       bg style power
ae dims                256        ct mode
e dims                 64         clipgrad
d dims                 64         pretrain
d mask dims            22         autobackup hour
masked training        True       write preview history
eyes prio              False      target iter
lr dropout             False      random flip
random warp            True       batch size

Merging Options
name                     choice
mask mode                learned
erode mask modifier      5
blur mask modifier       5
motion blur power        0
output face scale        1
color transfer mode      rct
sharpen mode             none
blursharpen amount       0
super resolution power   1
image denoise power      0
bicubic degrade power    0
color degrade power      0

Style Transfer for Headshot Portraits (STHP) [10].⁴ STHP allows users to easily produce style-transferred results. It transfers the multi-scale local statistics of a reference portrait onto another.

WCT². WCT² is a state-of-the-art photorealistic style transfer method. We use the 'cat5' unpooling option and the pretrained models.

C  Additional Experiments

C.1  Noise Analysis

Furthermore, we verify our results with photo forensics methods: noise analysis, error level analysis, level sweep, and luminance gradient. As shown in Fig. S2, our framework reduces the noise (Fig. S2 (a, b)) and preserves the appearance of the target images (Fig. S2 (c, d)).

Figure S2: Noise analysis with photo forensics algorithms. Our method can not only reduce the noise (a, b) but also better preserve appearances (c, d). [Panels: (a) Noise Analysis, (b) Error Level Analysis, (c) Level Sweep, (d) Luminance Gradient; rows: Target, DFL, Ours.]

⁴STHP: https://people.csail.mit.edu/yichangshih/portrait web/
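Of the forensics checks used for Fig. S2, error level analysis is straightforward to sketch: recompress the image as JPEG at a fixed quality and inspect the amplified per-pixel difference, since edited regions tend to recompress differently. A minimal Pillow sketch on a synthetic image; the quality setting and amplification factor are conventional choices, not values from the paper:

```python
import io

import numpy as np
from PIL import Image, ImageChops


def error_level_analysis(img: Image.Image, quality: int = 90) -> Image.Image:
    """Recompress `img` as JPEG and return the amplified difference image."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    recompressed = Image.open(buf)
    diff = ImageChops.difference(img, recompressed)
    # Amplify so compression artifacts become visible.
    return diff.point(lambda p: min(255, p * 10))


# Synthetic test image: regions with different content compress differently.
arr = np.zeros((64, 64, 3), dtype=np.uint8)
arr[:, 32:] = 255
ela = error_level_analysis(Image.fromarray(arr))
```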

Figure S3: The mixed results. [Columns: Mix, Mixed Mask, Swapped, Target, Predicted Mask.]

C.2  Results of Mix-and-Segment Discriminator

We provide more mixed results. As shown in Fig. S3, we mix the target faces and the swapped faces using the mix mask. It is difficult to tell the real patches from the fake patches.

C.3  Feature Visualization

To give intuitive results, we visualize the features at different scales, using PCA to reduce their dimensions to 3-dimensional vectors. As shown in Fig. S4, the pixel distributions in the latent space are more balanced under different lighting conditions.

Figure S4: Visualization of the features at different scales. [Columns: Image, F¹, F², F³.]
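The PCA projection behind Fig. S4 can be sketched with plain NumPy: flatten each C-channel feature map into (H·W, C) pixel-wise vectors, project onto the top-3 principal components, and rescale to [0, 1] so the result can be shown as an RGB image. The feature shape below is an illustrative assumption:

```python
import numpy as np


def pca_to_rgb(feat: np.ndarray) -> np.ndarray:
    """Project a (C, H, W) feature map to an (H, W, 3) pseudo-color image."""
    c, h, w = feat.shape
    x = feat.reshape(c, -1).T            # (H*W, C) pixel-wise feature vectors
    x = x - x.mean(axis=0)               # center before PCA
    # Principal directions = right singular vectors of the centered data.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    proj = x @ vt[:3].T                  # keep the top-3 components
    # Rescale each component to [0, 1] so it can be used as an RGB channel.
    rng_ = proj.max(axis=0) - proj.min(axis=0)
    proj = (proj - proj.min(axis=0)) / (rng_ + 1e-8)
    return proj.reshape(h, w, 3)


rgb = pca_to_rgb(np.random.default_rng(0).normal(size=(64, 16, 16)))
```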

Table S2: Inference speed comparison.

Methods       FPS
Poisson       3.891
DIH           1.247
STHP          1.686
WCT²          2.817
AOT (ours)    12.821

C.4  Speed Comparison

Furthermore, as reported in Table S2, our framework achieves the highest FPS among the compared methods, i.e., it introduces the least computational burden. All experiments were conducted on Ubuntu 16.04 with an Intel i7-7700K CPU and an NVIDIA 1060 GPU.

C.5  Forgery Detection

We report the binary detection accuracy of two video classification baselines, I3D [1] and TSN [14], on the hidden set provided by DeeperForensics-1.0 [4]. We first train the baselines on the four manipulated subsets of FF [9] produced by DeepFakes [2], Face2Face [11], FaceSwap [3], and NeuralTextures [12] (green bars). Then, we add 100 manipulated videos produced by our method to the training set. All detection accuracies improve with the addition of our data (blue bars).

Figure S5: Forgery Detection Results.
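FPS numbers like those in Table S2 are typically measured with a wall-clock loop around the inference call. A sketch; `run_inference` is a placeholder for any of the compared methods, and the warm-up and iteration counts are arbitrary choices, not from the paper:

```python
import time


def measure_fps(run_inference, n_warmup: int = 3, n_iters: int = 20) -> float:
    """Average frames-per-second of `run_inference` over `n_iters` calls."""
    for _ in range(n_warmup):            # warm-up: caches, JIT, GPU init
        run_inference()
    start = time.perf_counter()
    for _ in range(n_iters):
        run_inference()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed


# Example with a dummy "model" that takes roughly 10 ms per frame.
fps = measure_fps(lambda: time.sleep(0.01))
```

For GPU models, one would also synchronize the device before reading the clock so queued kernels are included in the timing.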

Figure S6: Comparison results with DFL. [Columns: Target, DFL, Poisson Blending, DIH, STHP, WCT², Ours.]

Figure S7: Comparison results with FSGAN. [Columns: Target, FSGAN, Poisson Blending, DIH, STHP, WCT², Ours.]

References

[1] João Carreira and Andrew Zisserman. Quo vadis, action recognition? A new model and the Kinetics dataset. In Conference on Computer Vision and Pattern Recognition, pages 4724–4733, 2017.
[2] Deepfakes. https://github.com/deepfakes/faceswap, Accessed: 2020.4.
[3] FaceSwap. https://github.com/ondyari/FaceForensics/tree/master/dataset/FaceSwapKowalski, Accessed: 2020.4.
[4] Liming Jiang, Wayne Wu, Ren Li, Chen Qian, and Chen Change Loy. DeeperForensics-1.0: A large-scale dataset for real-world face forgery detection. arXiv preprint arXiv:2001.03024, 2020.
[5] Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. Neural 3D mesh renderer. In Conference on Computer Vision and Pattern Recognition, 2018.
[6] Yuval Nirkin, Yosi Keller, and Tal Hassner. FSGAN: Subject agnostic face swapping and reenactment. In International Conference on Computer Vision, pages 7184–7193, 2019.
[7] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pages 8024–8035, 2019.
[8] Ivan Perov, Daiheng Gao, Nikolay Chervoniy, Kunlin Liu, Sugasa Marangonda, Chris Umé, Mr. Dpfks, Carl Shift Facenheim, Luis RP, Jian Jiang, Sheng Zhang, Pingyu Wu, Bo Zhou, and Weiming Zhang. DeepFaceLab: A simple, flexible and extensible face swapping framework. arXiv preprint arXiv:2005.05535, 2020.
[9] Andreas Rossler, Davide Cozzolino, Luisa Verdoliva, Christian Riess, Justus Thies, and Matthias Nießner. FaceForensics++: Learning to detect manipulated facial images. In International Conference on Computer Vision, pages 1–11, 2019.
[10] YiChang Shih, Sylvain Paris, Connelly Barnes, William T. Freeman, and Frédo Durand. Style transfer for headshot portraits. ACM Transactions on Graphics, 33(4):148, 2014.
[11] Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. Face2Face: Real-time face capture and reenactment of RGB videos. In Conference on Computer Vision and Pattern Recognition, pages 2387–2395, 2016.
[12] Justus Thies, Michael Zollhöfer, and Matthias Nießner. Deferred neural rendering: Image synthesis using neural textures. CoRR, abs/1904.12356, 2019.
[13] Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, and Ming-Hsuan Yang. Deep image harmonization. In Conference on Computer Vision and Pattern Recognition, pages 3789–3797, 2017.
[14] Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, and Luc Van Gool. Temporal segment networks: Towards good practices for deep action recognition. In European Conference on Computer Vision, pages 20–36, 2016.
[15] Xiangyu Zhu, Xiaoming Liu, Zhen Lei, and Stan Z. Li. Face alignment in full pose range: A 3D total solution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(1):78–92, 2017.

