The "Reading Papers for You" series. April 2020. Part 1

Hi Habr! We continue to publish reviews of scientific papers written by members of the Open Data Science community in the #article_essense channel. If you want to get them before everyone else, join the community!

Today's papers:


1. TResNet: High Performance GPU-Dedicated Architecture

Authors: Tal Ridnik, Hussam Lawen, Asaf Noy, Itamar Friedman (DAMO Academy, Alibaba Group, 2020)
GitHub project
Review author: (artgor, Habr: artgor)

TResNet-XL reaches 84.3% top-1 accuracy on ImageNet. A TResNet with GPU throughput similar to ResNet50 reaches 80.7% top-1 accuracy.

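The authors point out that many recent architectures have fewer FLOPS than ResNet50 without being faster on GPU, and name two reasons:

Heavy use of depthwise and 1x1 convolutions. They lower FLOPS, but that does not translate into GPU speed.
Multi-path designs. They produce many activation maps that must be kept for backpropagation, which limits the batch size, and they get in the way of inplace operations.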

The authors propose 3 models: TResNet-M, TResNet-L and TResNet-XL. The design refinements over ResNet50 are: SpaceToDepth stem, Anti-Alias downsampling, In-Place Activated BatchNorm, Blocks selection and SE layers.

Stem Design
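The stem block is the start of the network, where the input resolution is reduced quickly. In ResNet50 it is a 7x7 conv with stride 2 plus max pooling, which takes a 224 input down to 56. TResNet uses SpaceToDepth instead, followed by a 1x1 conv that produces the required number of channels.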
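A minimal sketch of the SpaceToDepth rearrangement, in plain Python on nested lists. The block size of 4 is an assumption chosen to match the 224 to 56 reduction mentioned above, `space_to_depth` is an illustrative name rather than the authors' code, and the 1x1 conv that follows it is left out.

```python
def space_to_depth(x, block=4):
    """Rearrange a C x H x W nested list into (C*block*block) x H/block x W/block."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    # Every (dy, dx) offset inside a block becomes its own group of channels.
    for dy in range(block):
        for dx in range(block):
            for c in range(channels):
                out.append([
                    [x[c][i + dy][j + dx] for j in range(0, width, block)]
                    for i in range(0, height, block)
                ])
    return out


# Toy 3 x 8 x 8 "image" (block=4 is an assumption, see the note above).
image = [[[r * 8 + c for c in range(8)] for r in range(8)] for _ in range(3)]
stem_in = space_to_depth(image, block=4)
print(len(stem_in), len(stem_in[0]), len(stem_in[0][0]))  # prints: 48 2 2
```

Each 4x4 spatial block is moved into the channel dimension, so the resolution drops by 4 in each direction and no input value is discarded.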

Anti-Alias Downsampling
Anti-alias downsampling replaces the network's downscaling layers. The authors report that it improves shift-equivariance.
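To see why blurring before subsampling helps with shift-equivariance, here is a small 1D sketch. The `[1, 2, 1] / 4` kernel, the edge replication and both function names are assumptions made for illustration; the network itself works on 2D feature maps.

```python
def naive_downsample(signal, stride=2):
    """Plain strided subsampling, no filtering."""
    return list(signal)[::stride]


def blur_downsample(signal, stride=2):
    """Blur with a [1, 2, 1] / 4 kernel, then subsample."""
    kernel = (0.25, 0.5, 0.25)
    # Replicate the edge samples so the output length matches the naive version.
    padded = [signal[0]] + list(signal) + [signal[-1]]
    blurred = [
        sum(k * padded[i + t] for t, k in enumerate(kernel))
        for i in range(len(signal))
    ]
    return blurred[::stride]


signal = [0, 1, 0, 1, 0, 1, 0, 1]
shifted = signal[1:] + signal[:1]  # same pattern moved by one sample

print(naive_downsample(signal), naive_downsample(shifted))
print(blur_downsample(signal), blur_downsample(shifted))
```

With naive subsampling a one-sample shift flips the output from all zeros to all ones. With the blur the interior outputs are 0.5 in both cases, and only the padded edge sample differs.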

Inplace-ABN
Inplace-ABN layers replace BatchNorm+ReLU, which reduces memory consumption. The activation is changed from ReLU to Leaky-ReLU.

Blocks Selection
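ResNet34 uses BasicBlocks built from 3x3 convs, while ResNet50 uses Bottleneck blocks built from 1x1 and 3x3 convs. Bottleneck blocks are more accurate but load the GPU more. TResNet mixes the two: the first 2 stages use BasicBlock, the last 2 use Bottleneck.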

SE Layers
SE layers are added only in the first 3 stages.

JIT Compilation
Used for modules without learnable parameters: the AA blur filter and the SpaceToDepth. This cuts their GPU cost by almost 2 times.

Global Average Pooling
An implementation through view and mean instead of PyTorch's AvgPool2d is up to 5 times faster.

Inplace Operations
Inplace operations are used wherever possible: Inplace-ABN, residual connections, SE blocks, activations and so on. This leaves room for a larger batch size.

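Training:
Resolution 224x224, 300 epochs, SGD with a 1-cycle policy. Regularization: Auto-augment, Cutout, Label-smooth and True-weight-decay. Inputs are not normalized with ImageNet statistics, only scaled to the range from 0 to 1. ResNet50 and TResNet are compared under this setup.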
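Of the regularizers listed above, label smoothing is the easiest to show in a few lines. This is a generic sketch, not the authors' training code: the smoothing factor `eps=0.1` and the function names are assumptions.

```python
import math


def smoothed_targets(label, num_classes, eps=0.1):
    """One-hot target mixed with a uniform distribution (eps is an assumed value)."""
    off = eps / num_classes
    return [1.0 - eps + off if k == label else off for k in range(num_classes)]


def cross_entropy(logits, targets):
    """Cross-entropy between softmax(logits) and a soft target distribution."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return -sum(t * (v - log_z) for v, t in zip(logits, targets))


targets = smoothed_targets(label=1, num_classes=4)
loss = cross_entropy([0.2, 2.0, -1.0, 0.5], targets)
print(targets, loss)
```

The true class keeps most of the probability mass, and the remainder is spread evenly, so the network is not pushed toward infinitely confident logits.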

Ablation Study
The ablation study looks at the contribution of SE and AA.

2. Controllable Person Image Synthesis with Attribute-Decomposed GAN

Authors: Yifang Men, Yiming Mao, Yuning Jiang, Wei-Ying Ma, Zhouhui Lian (China, 2020)
GitHub project :: Blog :: Video
Review author: (digitman, Habr: digitman)



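Training uses 2 images of the same person: a source image (I_s) and a target image (I_t). The target pose P_t is given as 18 channels (one per keypoint). The pose encoder consists of 2 downsampling layers.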

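The source image is split into 8 semantic components, and each component is encoded separately into its own style code. The per-component codes are then combined into the full style code.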

The texture encoder has two parts: a learnable encoder and a VGG whose features are combined with it.

The style code is injected into the decoder through StyleBlocks with AdaIN (as in StyleGAN). There are 8 such blocks. The output is the generated image I_g.
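A minimal sketch of what AdaIN does to a single feature channel, in plain Python. How the scale and bias are derived from the style code is not shown; `scale`, `bias`, `eps` and the function name are assumptions for illustration.

```python
import math


def adain(channel, scale, bias, eps=1e-5):
    """Normalize one feature channel, then apply a style-dependent scale and bias."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((v - mean) ** 2 for v in channel) / n
    std = math.sqrt(var + eps)
    return [scale * (v - mean) / std + bias for v in channel]


# In a real StyleBlock, scale and bias would come from the style code (assumed here).
styled = adain([1.0, 2.0, 3.0, 4.0], scale=2.0, bias=5.0)
print(styled)
```

After the operation the channel's mean and standard deviation are set by the style parameters rather than by the content features, which is what lets the style code control appearance.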

. . — ( ). — PatchGAN , .

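The loss has four parts. Adversarial loss: from the discriminators. Reconstruction loss: L1 between the generated and the target image. Perceptual loss: L2 between VGG features. Contextual loss (CX): similarity of VGG features that does not require the images to be spatially aligned.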

Dataset: DeepFashion.


3. Learning to See Through Obstructions

Authors: Yu-Lun Liu, Wei-Sheng Lai, Ming-Hsuan Yang, Yung-Yu Chuang, and Jia-Bin Huang (Taiwan, USA, 2020)
GitHub project :: Blog
Review author: (belskikh)

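The paper removes obstructions (reflections, fences, etc.) from a short sequence of frames taken with a moving camera. The key observation is that the background and the obstruction move differently, and this difference can be captured with optical flow.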

The method is a coarse-to-fine, multi-stage pipeline that alternates between estimating optical flow and reconstructing the layers.

Initial Flow Decomposition
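Optical flow (OF) is first estimated at the coarsest level. The network has two parts: a feature extractor and a flow estimator. A cost volume is built from the extracted features (as in "CNNs for optical flow using pyramid, warping, and cost volume"). The flow estimator consists of FC layers and outputs the initial flow for each layer.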
Background/Reflection Layer Reconstruction
, , . , optical flow, , . , .
Optical Flow Refinement
Optical flow is refined with PWC-Net.

Training starts with the initial flow network, using an L1 loss against flow produced by PWC-Net.

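The layer reconstruction is trained with an L1 reconstruction loss and a gradient loss (L1 between image gradients). Real sequences have no ground truth, so they are handled in an unsupervised way.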

The unsupervised objective includes an L1 term and a total variation loss.
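For reference, a small sketch of a gradient loss and a total variation loss on a 2D map stored as nested lists. It uses simple forward differences; the function names and the exact normalization are assumptions, and the paper's own definitions may differ in detail.

```python
def image_gradients(img):
    """Forward differences along x (columns) and y (rows)."""
    dx = [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]
    dy = [
        [img[i + 1][j] - img[i][j] for j in range(len(img[0]))]
        for i in range(len(img) - 1)
    ]
    return dx, dy


def total_variation(img):
    """Sum of absolute forward differences; zero for a constant map."""
    dx, dy = image_gradients(img)
    return sum(abs(v) for row in dx + dy for v in row)


def gradient_l1(pred, target):
    """Mean absolute difference between the gradients of two maps."""
    diffs = []
    for g_pred, g_tgt in zip(image_gradients(pred), image_gradients(target)):
        for row_p, row_t in zip(g_pred, g_tgt):
            diffs.extend(abs(p - t) for p, t in zip(row_p, row_t))
    return sum(diffs) / len(diffs)


flat = [[1, 1], [1, 1]]
edge = [[0, 1], [0, 1]]
print(total_variation(flat), total_variation(edge), gradient_l1(edge, flat))
```

Total variation penalizes any local change and so favors smooth maps, while the gradient loss only asks that the edges of the prediction match the edges of the target.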

4. Tracking Objects as Points

Authors: Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl (UT Austin, Intel Labs, 2020)
GitHub project
Review author: (belskikh)

The authors extend CenterNet so that it handles tracking in addition to detection.

In the earlier paper Objects as Points, the authors introduced CenterNet, an anchor-box-free object detector that represents every object by its center point.

, :

;
a class-agnostic heatmap of the previous frame's detections (rendered the same way as in CenterNet);
— ; .

— , , ( , ).
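Point-based tracking needs a step that links current detections to earlier ones. The sketch below assumes that each detection carries a predicted displacement of its center since the previous frame, and that linking is done greedily by distance in order of confidence. The data layout, the `max_dist` threshold and the function name are all hypothetical.

```python
import math


def greedy_match(detections, prev_tracks, max_dist):
    """Link detections to previous centers; unmatched detections start new tracks.

    detections: list of dicts with "center" (x, y), "displacement" (dx, dy), "score".
    prev_tracks: dict mapping track id -> (x, y) center in the previous frame.
    """
    next_id = max(prev_tracks, default=-1) + 1
    free = dict(prev_tracks)  # previous centers that are still unmatched
    result = []
    # Most confident detections choose first.
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        # Project the current center back into the previous frame.
        px = det["center"][0] - det["displacement"][0]
        py = det["center"][1] - det["displacement"][1]
        best_id, best_dist = None, max_dist
        for tid, (tx, ty) in free.items():
            dist = math.hypot(px - tx, py - ty)
            if dist < best_dist:
                best_id, best_dist = tid, dist
        if best_id is None:
            best_id = next_id  # nothing close enough: new track
            next_id += 1
        else:
            del free[best_id]
        result.append((best_id, det["center"]))
    return result


prev = {0: (10, 10), 1: (50, 50)}
dets = [
    {"center": (12, 10), "displacement": (2, 0), "score": 0.9},
    {"center": (80, 80), "displacement": (0, 0), "score": 0.8},
]
print(greedy_match(dets, prev, max_dist=5))
```

The first detection projects back exactly onto track 0 and inherits its id. The second is far from every previous center, so it opens a new track.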

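During training, the previous-frame input is augmented in three ways:

jittering the ground-truth locations,
adding false positives near the ground truth,
adding false negatives by randomly removing detections.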

— CenterNet, , , .


Results: 67.3% MOTA on the MOT17 challenge at 22 FPS and 89.4% MOTA on the KITTI tracking benchmark at 15 FPS.
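For readers unfamiliar with the metric, MOTA folds misses, false alarms and identity switches into one number. A small sketch using the standard CLEAR MOT definition; the counts in the example are made up.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Hypothetical counts, for illustration only.
print(mota(false_negatives=10, false_positives=5, id_switches=5, num_gt=100))
```

Because the errors are divided by the number of ground-truth objects, MOTA can go negative when a tracker makes more errors than there are objects.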

Since CenterNet supports monocular 3D detection, the approach extends to 3D tracking as well.

(Kalman, Optical Flow), , , .


, anchor box .

5. CookGAN: Meal Image Synthesis from Ingredients

Authors: Fangda Han, Ricardo Guerrero, Vladimir Pavlovic (USA, UK, 2020)

Review author: (digitman, Habr: digitman)


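First, two encoders are trained. The ingredient encoder is a bidirectional LSTM over word2vec embeddings, with attention over the LSTM outputs. The image encoder is a ResNet50 pretrained on ImageNet and fine-tuned on UPMC-Food-101. Both map into a shared "FoodSpace".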

, p+. , . , . , , , .

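The generator is conditioned on the code 'c' and produces images at three resolutions (64, 128, 256), each with its own discriminator. Every discriminator has two losses: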

The first is conditional, i.e. it takes 'c' into account. The second is unconditional (it only decides "real" or "fake").

$L^{cond}_i = -E_{v^+ \sim p_{d_i}}[\log D_i(v^+, c)] + E_{v^- \sim p_{d_i}}[\log D_i(v^-, c)] + E_{\tilde{v}^+ \sim p_{G_i}}[\log D_i(\tilde{v}^+, c)],\\ L^{uncond}_i = -E_{v^+ \sim p_{d_i}}[\log D_i(v^+)] + E_{v^- \sim p_{d_i}}[\log D_i(v^-)] + E_{\tilde{v}^+ \sim p_{G_i}}[\log D_i(\tilde{v}^+)].$

In addition, there is a cycle-consistency term. A real image v+ is encoded into q+, and the generated image v~+ is encoded into q~+. The term is L_ci = cos(q+, q~+).
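The cycle-consistency term compares two embeddings with a cosine. A minimal sketch in plain Python; the vectors stand in for q+ and q~+ and are made up, and how the term enters the total loss (its sign and weight) is not specified here.

```python
import math


def cosine_similarity(a, b):
    """cos(a, b) = <a, b> / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings of a real image and of the generated image.
q_real = [0.6, 0.8, 0.0]
q_fake = [0.8, 0.6, 0.0]
print(cosine_similarity(q_real, q_fake))
```

A value close to 1 means the generated image lands near the real one in the embedding space, which is the behavior the cycle-consistency term is meant to encourage.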