Deepfake video from a single frame


A working example of the First Order Motion Model


Is it possible to make an entire movie from a single photograph? Or to record one person's movements and then replace that person with someone else in the video? The answers matter a great deal for cinema, photography, and computer game development. One solution is digital image processing with specialized software. Specialists in the field call this problem automatic video synthesis, or image animation.


To obtain the desired result, existing approaches combine objects extracted from the source image with motion taken from a separate "donor" video.


Today, in most areas, image animation is done with computer graphics tools. This approach requires additional knowledge about the object to be animated; usually a 3D model of it is needed (how this currently works in the film industry can be found here). Most recent solutions to the problem rely on deep learning models based on generative adversarial networks (GANs) and variational autoencoders (VAEs). These models usually use pre-trained modules to find the keypoints of objects in the image. The main drawback of this approach is that such modules can only recognize the objects they were trained on.


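How can this limitation be overcome? The paper "First Order Motion Model for Image Animation" proposes an answer: the First Order Motion Model. Once trained on a set of videos showing objects of one category (for example, faces or human bodies), it can be applied to any object of that category, without annotations or prior knowledge of the specific object.


Besides the motion between frames, the model also predicts an occlusion map. The map marks the parts of the output that cannot be obtained by warping the source image and therefore have to be inpainted.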



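The method takes two inputs: a driving video frame $D \in \mathbb{R}^{3 \times H \times W}$ and a source image $S \in \mathbb{R}^{3 \times H \times W}$. The motion module estimates the motion between $S$ and $D$.


Both $S$ and $D$ are related through an abstract reference frame $R$, which is never rendered explicitly. The module outputs a dense motion field $\hat{T}_{S \leftarrow D}$, which maps pixels of $D$ to the corresponding pixels of $S$, and an occlusion map $\hat{O}_{S \leftarrow D}$.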



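The transformation $T_{S \leftarrow D}$ maps each pixel of $D$ to the corresponding pixel of $S$, but $T_{S \leftarrow D}$ is not estimated directly. Instead, using the abstract reference frame $R$, $T_{S \leftarrow D}$ is expressed through two transformations, $T_{S \leftarrow R}$ and $T_{R \leftarrow D}$. For a given frame $X$, the model estimates $T_{X \leftarrow R}$. Given $K$ keypoints $p_1, \dots, p_K$, where $p_1, \dots, p_K$ are keypoint coordinates in the reference frame $R$, the transformation is approximated near each keypoint by its first-order Taylor expansion:


$$T_{X \leftarrow R}(p) = T_{X \leftarrow R}(p_k) + \left( \frac{d}{dp} T_{X \leftarrow R}(p) \Big|_{p = p_k} \right) (p - p_k) + o(\|p - p_k\|)$$


To estimate $T_{R \leftarrow X} = T_{X \leftarrow R}^{-1}$, it is assumed that $T_{X \leftarrow R}$ is locally bijective in the neighbourhood of each keypoint.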


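The transformation from the driving frame to the source frame is then obtained by composition:


$$T_{S \leftarrow D} = T_{S \leftarrow R} \circ T_{R \leftarrow D} = T_{S \leftarrow R} \circ T_{D \leftarrow R}^{-1}$$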
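To make the composition $T_{S \leftarrow D} = T_{S \leftarrow R} \circ T_{D \leftarrow R}^{-1}$ concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, that both transformations are exact 2D affine maps (a 2x2 matrix plus an offset). The helper names and the numbers are hypothetical and are not taken from the authors' code.

```python
def apply_affine(A, b, p):
    """Apply p -> A p + b for a 2x2 matrix A and 2-vectors b, p."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])


def invert_affine(A, b):
    """Return (A_inv, b_inv) such that p -> A_inv p + b_inv undoes p -> A p + b."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("transformation is not invertible")
    A_inv = [[A[1][1] / det, -A[0][1] / det],
             [-A[1][0] / det, A[0][0] / det]]
    # b_inv = -A_inv b
    b_inv = tuple(-c for c in apply_affine(A_inv, (0.0, 0.0), b))
    return A_inv, b_inv


def compose_affine(A1, b1, A2, b2):
    """Return the map p -> A1 (A2 p + b2) + b1 (map 2 is applied first)."""
    A = [[sum(A1[i][k] * A2[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    b = apply_affine(A1, b1, b2)
    return A, b


# Hypothetical transformations out of the reference frame R.
T_S_R = ([[1.0, 0.0], [0.0, 1.0]], (2.0, 0.0))  # R -> S: shift by 2 along x
T_D_R = ([[2.0, 0.0], [0.0, 2.0]], (0.0, 1.0))  # R -> D: scale by 2, shift along y

T_R_D = invert_affine(*T_D_R)            # D -> R
T_S_D = compose_affine(*T_S_R, *T_R_D)   # D -> S, via R

z = (4.0, 5.0)                           # a point in the driving frame D
print(apply_affine(*T_S_D, z))
```

The point `z` is first mapped back into the reference frame and then forward into the source frame, which is exactly the order of the two factors in the composition.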


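Both $T_{S \leftarrow R}(p_k)$ and $T_{D \leftarrow R}(p_k)$ are predicted by the keypoint detector. It uses the standard U-Net architecture and outputs $K$ heatmaps, one per keypoint.
The last layer uses softmax, so each heatmap can be interpreted as a keypoint detection confidence map.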
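A softmax-normalized heatmap can be turned into keypoint coordinates in several ways. The sketch below assumes one common choice, taking the expected pixel position under the heatmap (often called soft-argmax). The function names and the 3x3 example grid are hypothetical, and the code is an illustration rather than the authors' implementation.

```python
import math


def softmax2d(logits):
    """Softmax over all cells of a 2D grid given as a list of rows."""
    m = max(max(row) for row in logits)  # subtract the max for numerical stability
    exps = [[math.exp(v - m) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def expected_keypoint(heatmap):
    """Expected (x, y) position under a heatmap whose cells sum to 1."""
    x = sum(p * j for row in heatmap for j, p in enumerate(row))
    y = sum(p * i for i, row in enumerate(heatmap) for p in row)
    return x, y


# Raw network outputs for one keypoint; the peak is at column 2, row 1.
logits = [[0.0, 0.0, 0.0],
          [0.0, 0.0, 9.0],
          [0.0, 0.0, 0.0]]

heatmap = softmax2d(logits)
kp = expected_keypoint(heatmap)
print(kp)
```

Because the result is a weighted average rather than a hard argmax, it stays differentiable with respect to the logits, which is what makes this kind of read-out convenient for end-to-end training.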


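A convolutional network $P$ estimates $\hat{T}_{S \leftarrow D}$ from the local approximations of $T_{S \leftarrow D}(z)$ around the keypoints (where $z$ is a pixel location in $D$) and from the source frame $S$. Note that $\hat{T}_{S \leftarrow D}$ maps each pixel location in $D$ to its location in $S$, so its local patterns are pixel-aligned with $D$, not with $S$. To give the network inputs that are aligned with $\hat{T}_{S \leftarrow D}$, the source frame is warped with each local transformation, producing $K$ transformed images $S^1, \dots, S^K$ plus $S^0 = S$ for the background. The images $S^0, \dots, S^K$ are fed to a U-Net.
The dense motion $\hat{T}_{S \leftarrow D}(z)$ is assembled as:


$$\hat{T}_{S \leftarrow D}(z) = M_0 z + \sum_{k=1}^{K} M_k \left( T_{S \leftarrow R}(p_k) + J_k \left( z - T_{D \leftarrow R}(p_k) \right) \right)$$


where $M_k$ are the predicted masks ($M_0$ is the background mask) and $J_k$ is:


$$J_k = \left( \frac{d}{dp} T_{S \leftarrow R}(p) \Big|_{p = p_k} \right) \left( \frac{d}{dp} T_{D \leftarrow R}(p) \Big|_{p = p_k} \right)^{-1}$$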
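In the paper, the dense motion at a pixel $z$ is a mask-weighted sum: the background term keeps $z$ in place, and each keypoint $k$ contributes its local first-order prediction $T_{S \leftarrow R}(p_k) + J_k (z - T_{D \leftarrow R}(p_k))$. The sketch below evaluates that sum for a single pixel. The function names, the single keypoint, and the mask weights are hypothetical values chosen for illustration.

```python
def local_motion(z, kp_source, kp_driving, J):
    """First-order prediction near one keypoint: kp_source + J (z - kp_driving)."""
    dx, dy = z[0] - kp_driving[0], z[1] - kp_driving[1]
    return (kp_source[0] + J[0][0] * dx + J[0][1] * dy,
            kp_source[1] + J[1][0] * dx + J[1][1] * dy)


def dense_motion_at(z, masks, keypoints):
    """Blend the background (identity) motion with K local motions at pixel z.

    masks: K + 1 weights for this pixel; masks[0] belongs to the background.
    keypoints: K tuples (kp_source, kp_driving, J).
    """
    x, y = masks[0] * z[0], masks[0] * z[1]
    for m, (kp_s, kp_d, J) in zip(masks[1:], keypoints):
        lx, ly = local_motion(z, kp_s, kp_d, J)
        x += m * lx
        y += m * ly
    return x, y


identity = [[1.0, 0.0], [0.0, 1.0]]
# One keypoint: located at (10, 10) in S and at (12, 10) in D.
keypoints = [((10.0, 10.0), (12.0, 10.0), identity)]
z = (13.0, 11.0)  # pixel location in the driving frame

print(dense_motion_at(z, [1.0, 0.0], keypoints))  # pixel treated as background
print(dense_motion_at(z, [0.0, 1.0], keypoints))  # pixel fully assigned to the keypoint
```

With the background mask the pixel maps to itself; with the keypoint mask it is shifted by the same offset as the keypoint, which is how the masks decide which local motion each region follows.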




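The generator warps the source image $S$ to produce $\hat{D}$. After the down-sampling blocks, a feature map $\xi \in \mathbb{R}^{H' \times W'}$ is obtained, and $\xi$ is warped according to $\hat{T}_{S \leftarrow D}$. When parts of $S$ are occluded, warping alone is not enough to produce $\hat{D}$. The occlusion map $\hat{O}_{S \leftarrow D} \in [0, 1]^{H' \times W'}$ handles this case: it masks out the feature-map regions that are occluded in $S$ and must be inpainted. The transformed feature map is: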


$$\xi' = \hat{O}_{S \leftarrow D} \odot f_w(\xi, \hat{T}_{S \leftarrow D})$$


where $f_w(\cdot, \cdot)$ denotes the back-warping operation and $\odot$ is the Hadamard (element-wise) product.
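The masked warping step can be illustrated on a tiny feature map. The sketch below assumes nearest-neighbour sampling to keep the code short, whereas practical implementations typically sample bilinearly. The function names, the flow field, and the occlusion values are hypothetical.

```python
def back_warp(feature, flow):
    """Backward warping: output[y][x] is read from feature at flow[y][x] = (src_x, src_y).

    Nearest-neighbour sampling; source positions outside the map yield 0.0.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = flow[y][x]
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = feature[iy][ix]
    return out


def apply_occlusion(warped, occlusion):
    """Element-wise (Hadamard) product of the warped map and the occlusion map."""
    return [[o * v for o, v in zip(o_row, v_row)]
            for o_row, v_row in zip(occlusion, warped)]


xi = [[1.0, 2.0, 3.0],
      [4.0, 5.0, 6.0]]
# Every output pixel reads from the pixel one step to its left.
flow = [[(x - 1, y) for x in range(3)] for y in range(2)]
# 1.0 keeps a warped value, 0.0 marks it as occluded, values in between attenuate it.
occlusion = [[1.0, 1.0, 0.5],
             [1.0, 0.0, 1.0]]

xi_prime = apply_occlusion(back_warp(xi, flow), occlusion)
print(xi_prime)
```

Cells where the occlusion map is close to zero carry no warped information, so the later generator layers have to fill them in from context.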


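The transformed feature map $\xi'$ is then passed to the subsequent layers of the generator, which render the final image.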



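The whole system is trained end-to-end with a combination of several losses. The main one is the reconstruction loss, which is based on the perceptual loss and uses a pre-trained VGG-19 network. The reconstruction loss is: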


$$L_{rec}(\hat{D}, D) = \sum_{i=1}^{I} \left| N_i(\hat{D}) - N_i(D) \right|$$


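where $\hat{D}$ is the reconstructed driving frame, $D$ is the original driving frame, $N_i(\cdot)$ is the $i$-th channel feature extracted from a particular VGG-19 layer, and $I$ is the number of feature channels in that layer.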
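The structure of this loss, a sum over feature channels of an absolute difference, can be shown without a real network. In the sketch below the per-channel term is assumed to be the mean absolute difference, and the feature values are hand-written stand-ins for VGG-19 activations. All names are hypothetical.

```python
def channel_l1(a, b):
    """Mean absolute difference between two flattened feature channels."""
    if len(a) != len(b):
        raise ValueError("channels must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def reconstruction_loss(feats_generated, feats_target):
    """Sum of per-channel L1 distances, mirroring the sum over i = 1..I."""
    if len(feats_generated) != len(feats_target):
        raise ValueError("feature stacks must have the same number of channels")
    return sum(channel_l1(g, t) for g, t in zip(feats_generated, feats_target))


# Stand-ins for features of the generated and the real frame: I = 2 channels.
feats_d_hat = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
feats_d = [[0.0, 1.0, 2.0, 5.0], [0.0, 1.0, 1.0, 1.0]]

loss = reconstruction_loss(feats_d_hat, feats_d)
print(loss)
```

In a real training setup the two feature stacks would come from passing both frames through the same frozen network, so the loss compares the frames in feature space rather than pixel by pixel.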



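The keypoint detector is trained without any keypoint annotations, which can make its predictions unstable. An equivariance constraint counteracts this by forcing the model to predict keypoints that are consistent under known geometric transformations.


Suppose an image $X$ undergoes a known transformation $T_{X \leftarrow Y}$, for example a thin-plate spline, producing the image $Y$. The model then predicts two sets of local approximations, $T_{X \leftarrow R}$ and $T_{Y \leftarrow R}$. The standard equivariance constraint is: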


$$T_{X \leftarrow R} \equiv T_{X \leftarrow Y} \circ T_{Y \leftarrow R}$$
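As an illustration of how such a constraint can be checked numerically, the sketch below maps the keypoints detected in $Y$ through the known transformation and measures the L1 gap to the keypoints detected in $X$. It assumes an affine transformation instead of a thin-plate spline to keep the example short. The keypoint lists are made-up values rather than detector output, and all names are hypothetical.

```python
def affine(A, b, p):
    """Apply p -> A p + b for a 2x2 matrix A and 2-vectors b, p."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])


def equivariance_l1(kps_x, kps_y, A, b):
    """Mean L1 gap between keypoints in X and keypoints in Y mapped into X."""
    if len(kps_x) != len(kps_y):
        raise ValueError("both images need the same number of keypoints")
    total = 0.0
    for kp_x, kp_y in zip(kps_x, kps_y):
        mapped = affine(A, b, kp_y)  # where the Y keypoint should appear in X
        total += abs(kp_x[0] - mapped[0]) + abs(kp_x[1] - mapped[1])
    return total / len(kps_x)


# Known transformation from Y to X: rotate by 90 degrees, then shift along x.
A = [[0.0, -1.0], [1.0, 0.0]]
b = (1.0, 0.0)

kps_y = [(1.0, 0.0), (0.0, 2.0)]
consistent_kps_x = [(1.0, 1.0), (-1.0, 0.0)]
drifting_kps_x = [(1.0, 1.5), (-1.0, 0.0)]

print(equivariance_l1(consistent_kps_x, kps_y, A, b))
print(equivariance_l1(drifting_kps_x, kps_y, A, b))
```

A detector that satisfies the constraint yields a zero gap, while any drift in a keypoint shows up directly as a positive penalty.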


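Applying the chain rule to the Jacobians at each keypoint gives a second constraint (here $\mathbb{1}$ is the $2 \times 2$ identity matrix):


$$\mathbb{1} \equiv \left( \frac{d}{dp} T_{X \leftarrow R}(p) \Big|_{p = p_k} \right)^{-1} \left( \frac{d}{dp} T_{X \leftarrow Y}(p) \Big|_{p = T_{Y \leftarrow R}(p_k)} \right) \left( \frac{d}{dp} T_{Y \leftarrow R}(p) \Big|_{p = p_k} \right)$$


Both constraints are enforced with an L1 loss. In all experiments, the reconstruction loss and the 2 equivariance losses are given equal weights.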



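At test time, the goal is to animate the object in the source frame $S_1$ using the driving video $D_1, \dots, D_T$. Each frame $D_t$ is processed independently to obtain $S_t$. Rather than transferring the absolute motion, the model transfers the relative motion between $D_1$ and $D_t$ to $S_1$. In other words, the transformation $T_{D_t \leftarrow D_1}(p)$ is applied to the neighbourhood of each keypoint $p_k$.


Note that this approach relies on one important assumption: the objects in $S_1$ and $D_1$ should have similar poses.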




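The model was trained and tested on 4 datasets:


  1. VoxCeleb: a face dataset of 22496 videos extracted from YouTube;
  2. UvA-Nemo: a facial analysis dataset of 1240 videos;
  3. BAIR robot pushing: videos of a Sawyer robotic arm pushing various objects over a table, with 42880 training videos and 128 test videos;
  4. 280 TaiChi videos from YouTube.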

The results were compared with X2Face and Monkey-Net.



As the table shows, the First Order Motion Model outperforms the other approaches on every metric.


The long-awaited examples



Mgif



Fashion


Now try it yourself! It is very simple: everything is prepared here.

