Bipolar morphological networks: a neuron without multiplication

Nowadays it is hard to find a problem that nobody has yet proposed to solve with neural networks, and for many problems other methods are no longer even considered. In this situation it is only logical that researchers and engineers, in search of a "silver bullet", keep proposing new modifications of neural network architectures that are supposed to bring those who seek it "happiness for everyone, for free, and let no one go away offended". In industrial problems, however, it often turns out that the accuracy of a model depends mainly on the cleanliness, size, and structure of the training set, and that what the neural network model really needs is a reasonable interface (for example, it is inconvenient when the natural answer is a list of variable length).

Performance and speed are another matter. Here the dependence on the architecture is direct and quite predictable. Not all researchers are interested in it, though. It is far more pleasant to think in terms of centuries and epochs, mentally aiming at an age in which computing power is magically unlimited and energy is drawn from thin air. But there are enough down-to-earth people too, and for them it matters that neural networks become more compact, faster, and more energy-efficient right now. This is important, for example, on mobile devices and in embedded systems, which have no powerful graphics card or must save battery power. A lot has been done in this direction: low-bit integer neural networks, pruning of redundant neurons, tensor decompositions of convolutions, and much more.

Labor omnia vīcit improbus et dūrīs urgēns in rēbus egestās.

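Using morphological operations in neural networks is not a new idea: it dates back to the 1990s [1, 2]. Later work introduced morphological perceptrons with dendritic structure [3] and a lattice algebra approach to single-neuron computation [4], and proposed training methods for dendrite morphological neurons [5], [6].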

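Recall that a classical neuron computes

$y(\mathbf{x}, \mathbf{w}) = \sigma \left(\sum_{i=1}^N w_i x_i + w_{N+1} \right)$

where $\mathbf{x}$ is the input vector, $\mathbf{w}$ is the weight vector (with $w_{N+1}$ acting as the bias), and $\sigma$ is the activation function.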

In the bipolar morphological neuron (BM neuron), the weighted sum is first split into 4 terms according to the signs of the inputs and of the weights:

$\sum_{i=1}^N x_i w_i = \sum_{i=1}^N p_i^{00} x_i w_i - \sum_{i=1}^N p_i^{01} x_i |w_i| - \sum_{i=1}^N p_i^{10} |x_i| w_i + \sum_{i=1}^N p_i^{11} |x_i| |w_i|,$

where

$p_i^{kj} = \begin{cases} 1, & \mbox{if } (-1)^k x_i > 0 \mbox{ and } (-1)^j w_i > 0 \\ 0, & \mbox{otherwise} \end{cases}$
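The decomposition is easy to check numerically. The following minimal Python sketch (the function name `sign_split_sums` and the sample vectors are made up for illustration) computes the four sums, each of which contains only non-negative products, and recombines them with the signs from the formula.

```python
def sign_split_sums(x, w):
    """Split sum_i x_i * w_i into four sums of non-negative products.

    s00: x_i > 0, w_i > 0    s01: x_i > 0, w_i < 0
    s10: x_i < 0, w_i > 0    s11: x_i < 0, w_i < 0
    Pairs with a zero factor contribute nothing to any sum.
    """
    s00 = s01 = s10 = s11 = 0.0
    for xi, wi in zip(x, w):
        if xi > 0 and wi > 0:
            s00 += xi * wi
        elif xi > 0 and wi < 0:
            s01 += xi * abs(wi)
        elif xi < 0 and wi > 0:
            s10 += abs(xi) * wi
        elif xi < 0 and wi < 0:
            s11 += abs(xi) * abs(wi)
    return s00, s01, s10, s11


# Illustrative values, not taken from the article.
x = [1.5, -2.0, 0.5, -0.25]
w = [-1.0, 3.0, 2.0, -4.0]

s00, s01, s10, s11 = sign_split_sums(x, w)
recombined = s00 - s01 - s10 + s11
direct = sum(xi * wi for xi, wi in zip(x, w))
print(recombined, direct)
```

Because every partial sum is non-negative, a logarithm can later be applied to its factors, which is what the next step relies on.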

Within each of these terms all factors are non-negative. Consider one such sum and introduce the notation:

$M = \max_j (x_j w_j), \quad k = \frac{\sum \limits_{i=1}^N x_i w_i}{M} - 1$

Then

$\sum_{i=1}^N x_i w_i = \exp \left\{ \ln \sum_{i=1}^N x_i w_i \right\} = \exp \left\{ \ln \left( M (1 + k) \right) \right\} = (1 + k) \exp \ln M = \\ = (1 + k) \exp \left\{ \ln \max_j (x_j w_j) \right\} = (1 + k) \exp \max_j \ln (x_j w_j) = \\ = (1 + k) \exp \max_j (\ln x_j + \ln w_j) = (1 + k) \exp \max_j (y_j + v_j) \approx \exp \max_j (y_j + v_j),$

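where $y_j = \ln x_j$ are the new inputs and $v_j = \ln w_j$ are the new weights. The approximation is accurate when $k \ll 1$. In general $0 \leq k \leq N - 1$: the best case is a sum with a single non-zero term ($k = 0$), and the worst case is a sum of equal terms ($k = N-1$), where the exact sum is $N$ times larger than its approximation.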
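A small numeric experiment shows both ends of this range. The sketch below is only an illustration: the helper `exact_and_approx` and the input values are hypothetical, and all inputs and weights are assumed to be strictly positive so that the logarithms exist.

```python
import math


def exact_and_approx(x, w):
    """Return (exact sum, exp-max approximation, k) for positive x and w."""
    y = [math.log(xi) for xi in x]  # y_j = ln x_j
    v = [math.log(wi) for wi in w]  # v_j = ln w_j
    exact = sum(xi * wi for xi, wi in zip(x, w))
    # exp(max_j (y_j + v_j)) equals M = max_j (x_j * w_j) up to rounding.
    approx = math.exp(max(yj + vj for yj, vj in zip(y, v)))
    k = exact / approx - 1.0
    return exact, approx, k


# One dominant term: k is close to 0, so the approximation is good.
exact_a, approx_a, k_a = exact_and_approx([4.0, 0.01, 0.02], [2.0, 0.5, 0.1])

# N = 4 equal terms: k = N - 1 = 3, the exact sum is 4 times the approximation.
exact_b, approx_b, k_b = exact_and_approx([1.0] * 4, [2.0] * 4)

print(k_a, k_b)
```

Only additions and a maximum are applied to the pairs $(y_j, v_j)$; the logarithms and the exponent act once per input and once per output.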

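The structure of the BM neuron is shown in Fig. 1. ReLU separates the positive and negative parts of the input, which gives 4 computational paths, one for each combination of input sign and weight sign.

Fig. 1. The BM neuron.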

As a result, the BM neuron computes:

$\mbox{BM}(x, w) = \exp{\max_j (\ln \mbox{ReLU}(x_j) + v^{0}_j)} - \exp{\max_j (\ln \mbox{ReLU}(x_j) + v^{1}_j)} - \\ - \exp{\max_j (\ln \mbox{ReLU}(-x_j) + v^{0}_j)} + \exp{\max_j (\ln \mbox{ReLU}(-x_j) + v^{1}_j)},$

where

$v_j^{k} = \begin{cases} \ln |w_j|, & \mbox{if } (-1)^k w_j > 0 \\ -\infty, & \mbox{otherwise} \end{cases}$
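The formula can be transcribed almost literally. The following Python sketch is an illustration, not the authors' implementation: the names `log_relu` and `bm_neuron` are hypothetical, $\ln 0$ is taken to be $-\infty$, and the bias and the outer activation function are left out because the formula for $\mbox{BM}(x, w)$ does not include them.

```python
import math

NEG_INF = float("-inf")


def log_relu(t):
    """ln ReLU(t), with ln 0 taken as -inf."""
    return math.log(t) if t > 0 else NEG_INF


def bm_neuron(x, w):
    """Evaluate BM(x, w) with four max-plus paths."""
    # v^0 keeps the positive weights, v^1 the negative ones (as ln |w_j|).
    v0 = [math.log(abs(wj)) if wj > 0 else NEG_INF for wj in w]
    v1 = [math.log(abs(wj)) if wj < 0 else NEG_INF for wj in w]

    def path(sign, v):
        # Inside the path: only additions and a maximum.
        m = max(log_relu(sign * xj) + vj for xj, vj in zip(x, v))
        return math.exp(m)  # exp(-inf) == 0.0 for an empty sign group

    return path(+1, v0) - path(+1, v1) - path(-1, v0) + path(-1, v1)


# Illustrative values: every sign group holds exactly one term,
# so the BM output coincides with the ordinary dot product (-5.5).
x = [1.5, -2.0, 0.5, -0.25]
w = [-1.0, 3.0, 2.0, -4.0]
print(bm_neuron(x, w), sum(xi * wi for xi, wi in zip(x, w)))

# Two equal positive terms: the dot product is 2, the BM neuron returns 1.
print(bm_neuron([1.0, 1.0], [1.0, 1.0]))
```

The second call shows the worst case of the approximation for $N = 2$: the exact sum is twice the value produced by the maximum.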

Training follows the idea of incremental learning: a trained classical network is converted into a BM network layer by layer. Two variants of this procedure are compared below, method 1 and method 2.

MNIST

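MNIST is a dataset of handwritten digits with 60000 training images of size 28 × 28 and 10000 test images. 10% of the training set was used for validation and the rest for training. Example images are shown in Fig. 2.

Fig. 2. Example images from MNIST.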

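Notation for the network layers:

conv(n, w_x, w_y): a convolutional layer with n filters of size w_x × w_y;
fc(n): a fully connected layer with n neurons;
maxpool(w_x, w_y): a max-pooling layer with a window of size w_x × w_y;
dropout(p): a dropout layer with probability p;
relu: the activation function ${\mbox{ReLU}(x) = \max(x, 0)}$;
softmax: the softmax activation function.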

Two networks were trained on MNIST:

CNN1: conv1(30, 5, 5) - relu1 - dropout1(0.2) - fc1(10) - softmax1.

CNN2: conv1(40, 5, 5) - relu1 - maxpool1(2, 2) - conv2(40, 5, 5) - relu2 - fc1(200) - relu3 - dropout1(0.3) - fc2(10) - softmax1.
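To make the architecture strings concrete, here is a minimal Python sketch that traces feature-map sizes through the convolutional part of CNN1 and CNN2. It assumes stride 1 and no padding for convolutions and non-overlapping pooling windows; the article states none of these settings, and the function name `output_shape` is hypothetical.

```python
def output_shape(layers, height=28, width=28, channels=1):
    """Trace (height, width, channels) through conv and maxpool layers.

    Assumptions (not stated in the article): conv uses stride 1 and no
    padding, maxpool windows do not overlap. relu and dropout keep the
    shape and are therefore omitted from the layer list.
    """
    for kind, *args in layers:
        if kind == "conv":
            n, w_x, w_y = args
            height, width, channels = height - w_y + 1, width - w_x + 1, n
        elif kind == "maxpool":
            w_x, w_y = args
            height, width = height // w_y, width // w_x
        else:
            raise ValueError(f"unsupported layer kind: {kind}")
    return height, width, channels


cnn1 = [("conv", 30, 5, 5)]
cnn2 = [("conv", 40, 5, 5), ("maxpool", 2, 2), ("conv", 40, 5, 5)]

print(output_shape(cnn1))  # input size of fc1 in CNN1 under these assumptions
print(output_shape(cnn2))  # input size of fc1 in CNN2 under these assumptions
```

Under these assumptions the first fully connected layer receives 24 × 24 × 30 values in CNN1 and 8 × 8 × 40 values in CNN2.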

Table 1. Classification accuracy (%) on MNIST. A dash in the "Converted layers" column denotes the original network with no converted layers.

Network	Converted layers	Method 1	Method 1 + training	Method 2	Method 2 + training
CNN1	-	98.72	-	98.72	-
CNN1	conv1	42.47	98.51	38.38	98.76
CNN1	conv1 - relu1 - dropout1 - fc1	26.89	-	19.86	94.00
CNN2	-	99.45	-	99.45	-
CNN2	conv1	94.90	99.41	96.57	99.42
CNN2	conv1 - relu1 - maxpool1 - conv2	21.25	98.68	36.23	99.37
CNN2	conv1 - relu1 - maxpool1 - conv2 - relu2 - fc1	10.01	74.95	17.25	99.04
CNN2	conv1 - relu1 - maxpool1 - conv2 - relu2 - fc1 - dropout1 - relu3 - fc2	12.91	-	48.73	97.86

MRZ

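The second dataset consists of characters from the machine-readable zone (MRZ) of documents (see Fig. 3). It contains 280 000 images of size 21 × 17 covering 37 MRZ characters.

Fig. 3. Example of an MRZ.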

CNN3: conv1(8, 3, 3) - relu1 - conv2(30, 5, 5) - relu2 - conv3(30, 5, 5) - relu3 - dropout1(0.25) - fc1(37) - softmax1.

CNN4: conv1(8, 3, 3) - relu1 - conv2(8, 5, 5) - relu2 - conv3(8, 3, 3) - relu3 - dropout1(0.25) - conv4(12, 5, 5) - relu4 - conv5(12, 3, 3) - relu5 - conv6(12, 1, 1) - relu6 - fc1(37) - softmax1.

Table 2. Classification accuracy (%) on MRZ characters. A dash in the "Converted layers" column denotes the original network with no converted layers.

Network	Converted layers	Method 1	Method 1 + training	Method 2	Method 2 + training
CNN3	-	99.63	-	99.63	-
CNN3	conv1	97.76	99.64	83.07	99.62
CNN3	conv1 - relu1 - conv2	8.59	99.47	21.12	99.58
CNN3	conv1 - relu1 - conv2 - relu2 - conv3	3.67	98.79	36.89	99.57
CNN3	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - fc1	12.58	-	27.84	93.38
CNN4	-	99.67	-	99.67	-
CNN4	conv1	91.20	99.66	93.71	99.67
CNN4	conv1 - relu1 - conv2	6.14	99.52	73.79	99.66
CNN4	conv1 - relu1 - conv2 - relu2 - conv3	23.58	99.42	70.25	99.66
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4	29.56	99.04	77.92	99.63
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5	34.18	98.45	17.08	99.64
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5 - relu5 - conv6	5.83	98.00	90.46	99.61
CNN4	conv1 - relu1 - conv2 - relu2 - conv3 - relu3 - dropout1 - conv4 - relu4 - conv5 - relu5 - conv6 - relu6 - fc1	4.70	-	27.57	95.46

P.S. This work was presented at ICMV 2019:
E. Limonova, D. Matveev, D. Nikolaev and V. V. Arlazarov, “Bipolar morphological neural networks: convolution without multiplication,” ICMV 2019, Wolfgang Osten, Dmitry Nikolaev, Jianhong Zhou, Eds., SPIE, Jan. 2020, vol. 11433, 11433 3J, pp. 1-8, ISSN 0277-786X, ISBN 978-15-10636-43-9, DOI: 10.1117/12.2559299.

[1] G. X. Ritter and P. Sussner, “An introduction to morphological neural networks,” Proceedings of 13th International Conference on Pattern Recognition 4, 709–717 vol.4 (1996).
[2] P. Sussner and E. L. Esmi, Constructive Morphological Neural Networks: Some Theoretical Aspects and Experimental Results in Classification, 123–144, Springer Berlin Heidelberg, Berlin, Heidelberg (2009).
[3] G. X. Ritter, L. Iancu, and G. Urcid, “Morphological perceptrons with dendritic structure,” in The 12th IEEE International Conference on Fuzzy Systems, 2003. FUZZ ’03., 2, 1296–1301 vol.2 (May 2003).
[4] G. X. Ritter and G. Urcid, “Lattice algebra approach to single-neuron computation,” IEEE Transactions on Neural Networks 14, 282–295 (March 2003).
[5] H. Sossa and E. Guevara, “Efficient training for dendrite morphological neural networks,” Neurocomputing 131, 132–142 (May 2014).
[6] E. Zamora and H. Sossa, “Dendrite morphological neurons trained by stochastic gradient descent,” in 2016 IEEE Symposium Series on Computational Intelligence (SSCI), 1–8 (Dec 2016).
