Bipolar morphological networks: a neuron without multiplication

Nowadays it is hard to find a problem that nobody has yet proposed solving with neural networks, and for many problems other methods are no longer even considered. In this situation it is only logical that, in search of a "silver bullet", researchers and engineers keep proposing new modifications of neural network architectures, which are supposed to bring "happiness for everyone, for free, and let no one go away offended!" In industrial problems, however, it often turns out that model accuracy depends mainly on the cleanliness, size and structure of the training set, while all that is required of the neural network model is a reasonable interface (for example, it is unpleasant when the answer should logically be a list of variable length).

Performance, that is, speed, is another matter. Here the dependence on architecture is direct and quite predictable. Not all researchers are interested in it, though. It is much more pleasant to think in centuries and epochs, mentally aiming at an age when computing power will magically be unimaginable and energy will be extracted from thin air. Still, there are enough down-to-earth people too, and for them it matters that neural networks be more compact, faster and more energy-efficient right now. This is important, for example, on mobile devices and in embedded systems, where there is no powerful graphics card or where battery life must be saved. Much has been done in this direction: low-bit integer neural networks, pruning of redundant neurons, tensor decompositions of convolutions, and much more.

Labor omnia vīcit improbus et dūrīs urgēns in rēbus egestās.

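Morphological neural networks date back to the 1990s [1, 2]. Later work covered morphological perceptrons with dendritic structure [3] and a lattice algebra approach to single-neuron computation [4]. Training methods for dendrite morphological neurons were proposed in [5], [6].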

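The classical neuron computes a weighted sum of its inputs plus a bias and passes it through a nonlinear activation function: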

$y(\mathbf{x}, \mathbf{w}) = \sigma \left (\sum_{i=1}^N w_i x_i + w_{N+1} \right)$

where $x$ is the input vector, $w$ is the weight vector, and $\sigma$ is a nonlinear activation function.
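As a reference point for what follows, the classical neuron formula can be sketched in a few lines of Python. The function name `classical_neuron` and the default choice of `tanh` for $\sigma$ are assumptions made for this illustration; the article does not fix a particular activation.

```python
import math

def classical_neuron(x, w, sigma=math.tanh):
    """Compute sigma(sum_i w_i * x_i + w_{N+1}).

    x holds N inputs; w holds N weights followed by one bias term w_{N+1}.
    The default activation (tanh) is an assumption for this sketch.
    """
    if len(w) != len(x) + 1:
        raise ValueError("w must hold N weights plus one bias term")
    # zip stops after N pairs, so the bias w[-1] is added separately.
    s = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
    return sigma(s)
```

Each input costs one multiplication and one addition here; the rest of the article is about removing that multiplication.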

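The bipolar morphological neuron (BM neuron) approximates the classical one. The approximation takes logarithms, which are defined only for positive values, so the weighted sum is first split into 4 terms according to the signs of the inputs and the weights: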

$\sum_{i=1}^N x_i w_i = \sum_{i=1}^N p_i^{00} x_i w_i - \sum_{i=1}^N p_i^{01} x_i |w_i| - \sum_{i=1}^N p_i^{10} |x_i| w_i + \sum_{i=1}^N p_i^{11} |x_i| |w_i|,$

where

$p_i^{kj} = \begin{cases} 1, & \mbox{if } (-1)^k x_i > 0 \mbox{ and } (-1)^j w_i > 0 \\ 0, & \mbox{otherwise} \end{cases}$
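The sign decomposition above can be checked numerically. The sketch below is only an illustration, and the helper names `split_by_sign` and `dot_from_split` are made up for it. It accumulates the four non-negative sums and recombines them with the signs from the formula.

```python
def split_by_sign(x, w):
    """Return the four sums S[(k, j)] of the sign decomposition.

    S[(k, j)] collects |x_i| * |w_i| over the indices where
    (-1)^k * x_i > 0 and (-1)^j * w_i > 0, so every sum is non-negative.
    """
    sums = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for xi, wi in zip(x, w):
        if xi == 0 or wi == 0:
            continue  # p_i^{kj} = 0 for all k, j: the term contributes nothing
        k = 0 if xi > 0 else 1
        j = 0 if wi > 0 else 1
        sums[(k, j)] += abs(xi) * abs(wi)
    return sums

def dot_from_split(x, w):
    """Recombine the four sums: S00 - S01 - S10 + S11 equals sum_i x_i * w_i."""
    s = split_by_sign(x, w)
    return s[(0, 0)] - s[(0, 1)] - s[(1, 0)] + s[(1, 1)]
```

For `x = [1.5, -2.0, 0.0, 3.0]` and `w = [-1.0, -0.5, 4.0, 2.0]` the recombined value matches the ordinary dot product, 5.5.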

Every term in each of these four sums is non-negative. For one such sum, denote:

$M = \max_j (x_j w_j) \\ k = \frac{\sum \limits_{i=1}^N x_i w_i}{M} -1$

$\sum_{i=1}^N x_i w_i = \exp \lbrace \ln \sum_{i=1}^N x_i w_i \rbrace = \exp \left \{ \ln M (1 + k) \right \} =(1 + k)\exp \ln M = \\ = (1 + k) \exp \left \{ \ln(\max_j (x_j w_j))\right \} = (1 + k)\exp \max_j \ln (x_j w_j) = \\ = (1 + k)\exp \max_j (\ln x_j + \ln w_j) = (1 + k)\exp \max_j (y_j + v_j) \approx \exp \max_j (y_j + v_j),$

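where $y_j = \ln x_j$ are the new inputs and $v_j = \ln w_j$ are the new weights. The approximation is accurate when $k \ll 1$. Since $0 \leq k \leq N - 1$, the best case is a sum with a single non-zero term ($k = 0$), and the worst case is a sum of equal terms ($k = N-1$). In the worst case the true sum is $N$ times larger than its approximation.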
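To get a feel for how large $k$ is in practice, the approximation can be evaluated directly. This is a minimal sketch for strictly positive inputs and weights; the function name `max_approximation` is an assumption of this example, not something from the article.

```python
import math

def max_approximation(x, w):
    """Approximate sum_i x_i * w_i by exp(max_j(ln x_j + ln w_j)).

    Works only for strictly positive x and w.
    Returns (exact, approx, k), where exact = (1 + k) * approx.
    """
    if any(v <= 0 for v in list(x) + list(w)):
        raise ValueError("inputs and weights must be strictly positive")
    y = [math.log(xj) for xj in x]  # new inputs y_j = ln x_j
    v = [math.log(wj) for wj in w]  # new weights v_j = ln w_j
    approx = math.exp(max(yj + vj for yj, vj in zip(y, v)))
    exact = sum(xj * wj for xj, wj in zip(x, w))
    k = exact / approx - 1.0
    return exact, approx, k
```

With one dominant term, for example `x = [10.0, 0.01, 0.01]` and unit weights, $k$ is about 0.002. With four equal terms $k$ reaches $N - 1 = 3$.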

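The structure of the BM neuron is shown in Fig. 1. ReLU separates the positive and negative parts of the input, and together with the positive and negative weights this gives 4 computational paths.

Fig. 1. Structure of the BM neuron.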

Thus, the BM neuron is defined as follows:

$\mbox{BM}(x, w) = \exp{\max_j (\ln \mbox{ReLU}(x_j) + v^{0}_j)} - \exp{\max_j (\ln \mbox{ReLU}(x_j) + v^{1}_j)} - \\ - \exp{\max_j (\ln \mbox{ReLU}(-x_j) + v^{0}_j)} + \exp{\max_j (\ln \mbox{ReLU}(-x_j) + v^{1}_j)},$

where

$v_j^{k} = \begin{cases} \ln |w_j|, & \mbox{if } (-1)^k w_j > 0 \\ -\infty, & \mbox{otherwise} \end{cases}$
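The BM formula translates almost literally into code. The following is a minimal pure-Python sketch, not the authors' implementation. The helper names (`log_relu`, `to_log_weights`, `exp_max`, `bm_neuron`) are made up for the illustration, and $\ln 0$ is represented by `-inf` so that empty paths contribute $\exp(-\infty) = 0$.

```python
import math

NEG_INF = float("-inf")

def log_relu(t):
    """ln ReLU(t), with ln 0 represented as -inf."""
    return math.log(t) if t > 0 else NEG_INF

def to_log_weights(w):
    """Split classical weights into (v0, v1).

    v0[j] = ln|w_j| if w_j > 0, v1[j] = ln|w_j| if w_j < 0, and -inf otherwise.
    """
    v0 = [math.log(wj) if wj > 0 else NEG_INF for wj in w]
    v1 = [math.log(-wj) if wj < 0 else NEG_INF for wj in w]
    return v0, v1

def exp_max(logs, v):
    """exp(max_j(logs_j + v_j)); math.exp(-inf) evaluates to 0.0."""
    return math.exp(max(a + b for a, b in zip(logs, v)))

def bm_neuron(x, w):
    """Evaluate BM(x, w) as the signed combination of the four paths."""
    v0, v1 = to_log_weights(w)
    pos = [log_relu(xj) for xj in x]   # ln ReLU(x_j)
    neg = [log_relu(-xj) for xj in x]  # ln ReLU(-x_j)
    return (exp_max(pos, v0) - exp_max(pos, v1)
            - exp_max(neg, v0) + exp_max(neg, v1))
```

For `x = [2.0, -3.0]` and `w = [0.5, -1.0]` every path holds at most one non-zero term, so `bm_neuron` returns the exact dot product, 4.0. For `x = [1.0, 1.0]` and `w = [1.0, 1.0]` it returns 1.0 instead of 2.0, which is the worst case $k = N - 1$. Inside each path only additions and a maximum remain.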

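BM networks are trained by incremental learning: a trained classical network is converted to a BM network layer by layer, and the network is fine-tuned after each conversion. The converted layers are either "frozen", so that only the remaining classical layers are trained (method 1), or trained together with the rest of the network (method 2).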

MNIST

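MNIST is a dataset of handwritten digits with 60,000 training images of size 28 by 28. The test set contains 10,000 images. We used 10% of the training set for validation and the rest for training. Example images are shown in Fig. 2.

Fig. 2. Example images from MNIST.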

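Layer notation:
conv(n, w_x, w_y): convolutional layer with n filters of size w_x by w_y;
fc(n): fully connected layer with n neurons;
maxpool(w_x, w_y): max-pooling layer with a window of size w_x by w_y;
dropout(p): dropout layer with probability p;
relu: activation function ${\mbox{ReLU}(x) = \max(x, 0)}$;
softmax: softmax activation function.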

Two architectures were trained on MNIST:

CNN1: conv1(30, 5, 5) - relu1 - dropout1(0.2) - fc1(10) - softmax1.

CNN2: conv1(40, 5, 5) - relu1 - maxpool1(2, 2) - conv2(40, 5, 5) - relu2 - fc1(200) - relu3 - dropout1(0.3) - fc2(10) - softmax1.

Table 1. Classification accuracy (%) on MNIST. "Converted" is the accuracy right after conversion, "fine-tuned" is the accuracy after the subsequent training.

Network	Converted layers	Method 1, converted	Method 1, fine-tuned	Method 2, converted	Method 2, fine-tuned
CNN1	none	98.72	-	98.72	-
CNN1	conv1	42.47	98.51	38.38	98.76
CNN1	conv1 - relu1 - dropout1 - fc1	26.89	-	19.86	94.00
CNN2	none	99.45	-	99.45	-
CNN2	conv1	94.90	99.41	96.57	99.42
CNN2	conv1 - relu1 - maxpool1 - conv2	21.25	98.68	36.23	99.37
CNN2	conv1 - relu1 - maxpool1 - conv2 - relu2 - fc1	10.01	74.95	17.25	99.04
CNN2	conv1 - relu1 - maxpool1 - conv2 - relu2 - fc1 - dropout1 - relu3 - fc2	12.91	-	48.73	97.86

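Right after conversion the accuracy drops sharply, but fine-tuning restores most of it. With only the convolutional layers converted, the fine-tuned networks stay close to the classical baseline: for CNN2 with both convolutional layers converted, 98.68% (method 1) and 99.37% (method 2) against 99.45%.

Converting the fully connected layers costs much more: with method 2, the fully converted CNN1 and CNN2 reach 94.00% and 97.86%. After fine-tuning, method 2 gives higher accuracy than method 1 in every row of Table 1.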

MRZ

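The second experiment uses images of characters from the machine-readable zone (MRZ) of documents (see Fig. 3). The dataset contains 280,000 images of size 21 by 17 covering the 37 MRZ characters.

Fig. 3. Example images of MRZ characters.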

CNN3: conv1(8, 3, 3) - relu1 - conv2(30, 5, 5) - relu2 - conv3(30, 5, 5) - relu3 - dropout1(0.25) - fc1(37) - softmax1.

CNN4: conv1(8, 3, 3) - relu1 - conv2(8, 5, 5) - relu2 - conv3(8, 3, 3) - relu3 - dropout1(0.25) - conv4(12, 5, 5) - relu4 - conv5(12, 3, 3) - relu5 - conv6(12, 1, 1) - relu6 - fc1(37) - softmax1.

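The results in Table 2 are similar to those on MNIST. With method 2, converting all convolutional layers costs 0.06 percentage points (99.57% against 99.63% for CNN3, 99.61% against 99.67% for CNN4), while also converting the fully connected layer lowers accuracy to 93.38% and 95.46%.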

Table 2. Classification accuracy (%) on MRZ. "Converted" is the accuracy right after conversion, "fine-tuned" is the accuracy after the subsequent training.

Network	Converted layers	Method 1, converted	Method 1, fine-tuned	Method 2, converted	Method 2, fine-tuned
CNN3	none	99.63	-	99.63	-
CNN3	conv1	97.76	99.64	83.07	99.62
CNN3	conv1 - relu1 - conv2	8.59	99.47	21.12	99.58
CNN3	conv1 - relu1 - conv2 - relu2 - conv3	3.67	98.79	36.89	99.57
CNN3	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - fc1	12.58	-	27.84	93.38
CNN4	none	99.67	-	99.67	-
CNN4	conv1	91.20	99.66	93.71	99.67
CNN4	conv1 - relu1 - conv2	6.14	99.52	73.79	99.66
CNN4	conv1 - relu1 - conv2 - relu2 - conv3	23.58	99.42	70.25	99.66
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4	29.56	99.04	77.92	99.63
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5	34.18	98.45	17.08	99.64
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5 - relu5 - conv6	5.83	98.00	90.46	99.61
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5 - relu5 - conv6 - relu6 - fc1	4.70	-	27.57	95.46

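To sum up, converting the convolutional layers of a classical network to BM layers and fine-tuning it preserves accuracy almost completely, while converting the fully connected layers lowers it noticeably. This held on both MNIST and MRZ.

What is this for? Inside a BM neuron the multiply-accumulate is replaced by addition and maximum, so the model is of interest mainly for specialized hardware, up to TPU-like devices.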

P.S. The paper from ICMV 2019:
E. Limonova, D. Matveev, D. Nikolaev and V. V. Arlazarov, “Bipolar morphological neural networks: convolution without multiplication,” ICMV 2019, Wolfgang Osten, Dmitry Nikolaev, Jianhong Zhou, Eds., Proc. SPIE vol. 11433, 11433 3J, pp. 1-8, Jan. 2020, ISSN 0277-786X, ISBN 978-15-10636-43-9, DOI: 10.1117/12.2559299.

[1] G. X. Ritter and P. Sussner, “An introduction to morphological neural networks,” Proceedings of 13th International Conference on Pattern Recognition 4, 709–717 (1996).
[2] P. Sussner and E. L. Esmi, Constructive Morphological Neural Networks: Some Theoretical Aspects and Experimental Results in Classification, 123–144, Springer Berlin Heidelberg, Berlin, Heidelberg (2009).
[3] G. X. Ritter, L. Iancu, and G. Urcid, “Morphological perceptrons with dendritic structure,” in The 12th IEEE International Conference on Fuzzy Systems, FUZZ ’03, 2, 1296–1301 (May 2003).
[4] G. X. Ritter and G. Urcid, “Lattice algebra approach to single-neuron computation,” IEEE Transactions on Neural Networks 14, 282–295 (March 2003).
[5] H. Sossa and E. Guevara, “Efficient training for dendrite morphological neural networks,” Neurocomputing 131, 132–142 (May 2014).
[6] E. Zamora and H. Sossa, “Dendrite morphological neurons trained by stochastic gradient descent,” in 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 1–8 (Dec 2016).
