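On November 30 and December 1, an OpenVINO hackathon was held in Nizhny Novgorod. Participants were asked to build a prototype of a product solution using the Intel OpenVINO toolkit. The organizers offered a list of sample topics to guide the choice of task, but the final decision was left to the teams. Using models not included in the product was also encouraged.

In this article, we describe how we built our product prototype, which ultimately took first place.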
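More than ten teams took part in the hackathon. It was nice that some of them came from other regions. The venue was the “Kremlin on Pochaina” complex, decorated inside with old photographs of Nizhny Novgorod! (As a reminder, Intel's main Russian office is currently located in Nizhny Novgorod.) Participants had 26 hours to write code, and at the end they had to present their solution. A separate advantage was the demo session, which made sure that everything planned was actually implemented and did not remain just an idea in a presentation. Merch, snacks, and food were all there too!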
In addition, Intel optionally provided cameras, Raspberry Pi boards, and Neural Compute Stick 2 devices.
Task selection
The prototype runs on a Raspberry Pi 3 with an Intel NCS 2.
The NCS is a device for running inference of CNN models.
One issue is audio input. An ordinary USB microphone could be connected to the RPI, but we used the Voice Bonnet from the Google AIY Voice Kit instead.
Using the Raspbian image from AIY projects, a test recording (5 seconds long) can be made with the following command:
arecord -d 5 -r 16000 test.wav
The input level can be adjusted in alsamixer: select Capture devices and set the level to 50-60%.

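The AIY Voice Kit also has a button with an RGB backlight that can be controlled from software. Searching for “Google AIY Led” leads to the documentation: https://aiyprojects.readthedocs.io/en/latest/aiy.leds.html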
There are 7 emotion classes and the button can show 8 colors, which is just enough!
The button is connected to the Voice Bonnet through GPIO. We import the required libraries (they come with the AIY projects image):
from aiy.leds import Leds, Color
from aiy.leds import RgbLeds
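We create a dict that maps each emotion to a color given as an RGB tuple, and an aiy.leds.Leds object through which the color is updated: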
led_dict = {'neutral': (255, 255, 255), 'happy': (0, 255, 0), 'sad': (0, 255, 255), 'angry': (255, 0, 0), 'fearful': (0, 0, 0), 'disgusted': (255, 0, 255), 'surprised': (255, 255, 0)}
leds = Leds()
After each new prediction, the button color is updated to match the predicted class (looked up by key).
leds.update(Leds.rgb_on(led_dict.get(classes[prediction])))
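The label-to-color lookup can be checked on a machine without the Voice Bonnet. In the sketch below, `color_for` and the blue fallback color are assumptions made for illustration; only the seven dictionary entries come from the text. A fallback is worth having because `dict.get` returns `None` for an unknown label.

```python
# Colors for the seven emotion classes, copied from the text.
LED_COLORS = {
    'neutral': (255, 255, 255),
    'happy': (0, 255, 0),
    'sad': (0, 255, 255),
    'angry': (255, 0, 0),
    'fearful': (0, 0, 0),
    'disgusted': (255, 0, 255),
    'surprised': (255, 255, 0),
}

# Assumed fallback: blue is the one on/off RGB combination the table does not use.
FALLBACK_COLOR = (0, 0, 255)


def color_for(label):
    """Return the RGB tuple for an emotion label, or the fallback if unknown."""
    return LED_COLORS.get(label, FALLBACK_COLOR)


print(color_for('happy'))    # (0, 255, 0)
print(color_for('unknown'))  # (0, 0, 255)
```

On the device, the returned tuple would be passed to `Leds.rgb_on` in the same way as in the line above.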

We use pyaudio to read the stream from the microphone and webrtcvad to detect voice activity in it.
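webrtcvad only accepts frames of 10/20/30 ms. With a sampling rate of 48 kHz, 20 ms frames, and 1 channel (mono), one chunk is 48000×20/1000×1 = 960 samples. For each chunk, webrtcvad returns True/False depending on whether speech is present in it.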
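The chunk-size arithmetic can be written as a small helper. The function name `vad_frame_size` is an assumption made for illustration; the 10/20/30 ms frame durations and 16-bit samples are what webrtcvad expects.

```python
def vad_frame_size(rate_hz, frame_ms, channels=1, sample_width_bytes=2):
    """Return (samples, bytes) for one VAD frame.

    Hypothetical helper: it only reproduces the arithmetic
    rate * duration / 1000 * channels from the text.
    """
    if frame_ms not in (10, 20, 30):
        raise ValueError("webrtcvad accepts 10, 20 or 30 ms frames")
    samples = rate_hz * frame_ms // 1000 * channels
    return samples, samples * sample_width_bytes


samples, n_bytes = vad_frame_size(48000, 20)
print(samples, n_bytes)  # 960 1920
```

The first value is what goes into `frames_per_buffer` and `stream.read`, since pyaudio counts frames rather than bytes.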
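The logic is as follows:
- Chunks that contain speech are appended to a list; for chunks without speech, a counter of empty chunks is incremented.
- If the counter of empty chunks reaches >=30 (600 ms), the size of the accumulated list is checked: if it is >250, the list is added to the queue and both the list and the counter are reset; otherwise the fragment is still too short and keeps accumulating.
- If the counter of empty chunks is still < 30 and the list grows beyond 300 chunks, the fragment is added to the queue without waiting for a pause.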
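import queue

import numpy as np
import pyaudio
import webrtcvad

process = True  # set to False from another thread to stop the capture loop
framesQueue = queue.Queue()


def to_queue(frames):
    # Join the raw 16-bit PCM chunks into a single numpy array
    d = np.frombuffer(b''.join(frames), dtype=np.int16)
    return d


def framesThreadBody():
    CHUNK = 960            # 20 ms at 48 kHz, mono
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 48000

    p = pyaudio.PyAudio()
    vad = webrtcvad.Vad()
    vad.set_mode(2)        # aggressiveness from 0 (least) to 3 (most)
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    false_counter = 0      # consecutive chunks without speech
    audio_frame = []       # chunks with speech collected so far
    while process:
        data = stream.read(CHUNK)
        if not vad.is_speech(data, RATE):
            false_counter += 1
            # A pause of 30 chunks (600 ms) after enough speech ends the fragment
            if false_counter >= 30:
                if len(audio_frame) > 250:
                    framesQueue.put(to_queue(audio_frame))
                    audio_frame = []
                    false_counter = 0
        else:
            false_counter = 0
            audio_frame.append(data)
            # Flush long utterances without waiting for a pause
            if len(audio_frame) > 300:
                framesQueue.put(to_queue(audio_frame))
                audio_frame = []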
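The accumulation rule can be tried without a microphone. The sketch below replays a list of True/False speech flags in place of real audio and VAD calls. The function `segment_flags` and its parameter names are assumptions made for illustration, while the thresholds 30, 250 and 300 come from the capture code.

```python
def segment_flags(flags, max_silence=30, min_len=250, max_len=300):
    """Group per-frame speech flags into fragments.

    flags: iterable of booleans, one per 20 ms frame (True = speech).
    Returns the lengths (in frames) of the fragments that would be queued.
    """
    segments = []
    silence = 0    # consecutive non-speech frames
    buffered = 0   # speech frames collected so far
    for is_speech in flags:
        if is_speech:
            silence = 0
            buffered += 1
            if buffered > max_len:
                # Long utterance: flush without waiting for a pause
                segments.append(buffered)
                buffered = 0
        else:
            silence += 1
            if silence >= max_silence and buffered > min_len:
                # Pause after enough speech: flush and start over
                segments.append(buffered)
                buffered = 0
                silence = 0
    return segments


print(segment_flags([True] * 260 + [False] * 30))  # [260]
print(segment_flags([True] * 100 + [False] * 40))  # []
```

Under this rule a short fragment is not discarded by a pause; it stays in the buffer and is extended by the next stretch of speech.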
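Next we searched github for pretrained models. Since inference runs through OpenVINO, each model has to be converted to its internal format, IR (Intermediate Representation). We tried 5-7 different solutions from github.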
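OpenVINO includes several components:
- Open Model Zoo, a collection of pretrained models,
- Model Optimizer, which converts models from various frameworks (Tensorflow, ONNX, etc.) into Intermediate Representation,
- Inference Engine, which runs IR models on Intel hardware, including Myriad chips and the Neural Compute Stick,
- OpenCV (with Inference Engine support)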
Each model in IR format is described by two files: .xml and .bin.
Models are converted to IR with Model Optimizer as follows:
python /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py --input_model speaker.hdf5.pb --data_type=FP16 --input_shape [1,512,1000,1]
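--data_type
Selects the data format the model works with. FP32, FP16, and INT8 are supported.
--input_shape
Specifies the dimensions of the input data. The C++ API appears to allow changing it dynamically, but here it is simply fixed at conversion time.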
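When several models have to be converted, the same command can be assembled in a script. This is only a sketch: `mo_command` is a hypothetical helper, and it uses only the script path and the three options that appear in the command above.

```python
# Path to the Model Optimizer script, as used in the command from the text.
MO_SCRIPT = "/opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py"


def mo_command(input_model, data_type="FP16", input_shape=(1, 512, 1000, 1)):
    """Build the argument list for a Model Optimizer run (hypothetical helper)."""
    shape = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return [
        "python", MO_SCRIPT,
        "--input_model", input_model,
        "--data_type=" + data_type,
        "--input_shape", shape,
    ]


cmd = mo_command("speaker.hdf5.pb")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where OpenVINO is installed.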
Next, we load the model in IR format with the DNN module of OpenCV and run a forward pass.
import cv2 as cv
emotionsNet = cv.dnn.readNet('emotions_model.bin',
'emotions_model.xml')
emotionsNet.setPreferableTarget(cv.dnn.DNN_TARGET_MYRIAD)
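The last line redirects the computation to the Neural Compute Stick. By default inference runs on the CPU, but on the Raspberry Pi that does not work, so the stick is required.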
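The rest of the logic: the audio is split into windows of a fixed size (0.4 s), each window is converted to MFCC, and the result is fed to the network: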
emotionsNet.setInput(MFCC_from_window)
result = emotionsNet.forward()
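A minimal sketch of the windowing step follows. It assumes non-overlapping windows and drops the incomplete tail; neither detail is stated in the text, and `split_into_windows` is a hypothetical name. MFCC extraction itself is left out.

```python
def split_into_windows(samples, rate_hz, window_s=0.4):
    """Split a sequence of samples into consecutive windows of window_s seconds.

    Assumptions for this sketch: windows do not overlap, and a trailing
    piece shorter than one window is discarded.
    """
    size = round(rate_hz * window_s)
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, size)]


# One second of audio at 48 kHz gives two full 0.4 s windows of 19200 samples.
windows = split_into_windows([0] * 48000, 48000)
print(len(windows), len(windows[0]))  # 2 19200
```

Each window would then be converted to MFCC and passed to `setInput`, one forward pass per window.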
Speaker recognition needs a database of previously recorded voices. A voice sample is recorded to a file with a script:
python3 voice_db/record_voice.py test.wav
Each recording is then processed with a fast fourier transform to obtain a spectrogram, which is saved as a numpy array (.npy):
import glob
import numpy as np

for file in glob.glob("voice_db/*.wav"):
    spec = get_fft_spectrum(file)  # project helper, not shown here
    np.save(file[:-4] + '.npy', spec)
See the create_base.py file for details.
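Embeddings are then computed from the saved spectrograms:
for file in glob.glob("voice_db/*.npy"):
    spec = np.load(file)
    spec = spec.astype('float32')
    # Reshape to (batch, channels, height, width) for the network input
    spec_reshaped = spec.reshape(1, 1, spec.shape[0], spec.shape[1])
    srNet.setInput(spec_reshaped)
    pred = srNet.forward()
    emb = np.squeeze(pred)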
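The embedding of a new fragment is compared with the embeddings in the database by cosine distance (the smaller the distance, the more likely the match), with a threshold of 0.3: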
dist_list = cdist(emb, enroll_embs, metric="cosine")
distances = pd.DataFrame(dist_list, columns = df.speaker)
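To make the matching step concrete, here is a pure-Python sketch of cosine distance and a nearest-speaker lookup. The function names, the toy two-dimensional embeddings, and the way the 0.3 threshold is applied (reject the match if the best distance exceeds it) are assumptions made for illustration. The distance itself follows the usual definition, 1 minus cosine similarity, which is also what `cdist(..., metric="cosine")` computes.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def identify(emb, enrolled, threshold=0.3):
    """Return the enrolled speaker closest to emb, or None if none is close enough.

    enrolled: dict mapping speaker name to a reference embedding.
    """
    best_name, best_dist = None, float("inf")
    for name, ref in enrolled.items():
        dist = cosine_distance(emb, ref)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None


enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}  # toy embeddings
print(identify([0.9, 0.1], enrolled))   # alice
print(identify([-1.0, 0.0], enrolled))  # None
```

Real embeddings have many more dimensions, but the comparison works the same way.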
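Inference was fast enough to leave room for 1-2 more models (a 7-second sample took 2.5 seconds to process). We did not have time to add them and focused on the prototype of the web application.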
Web application
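The back end provides a message channel between the front end and the Raspberry Pi, based on websocket (http over tcp protocol).
The results from the Raspberry Pi arrive packed in json. The back end is written in golang.
Users and their messages are registered in a shared hub, from which messages are forwarded to subscribers. When a client closes the connection, it is removed from the hub.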
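As an illustration of the kind of json packet that could travel over such a channel, here is a sketch in Python. The field names (`speaker`, `emotion`, `probabilities`) and both helper functions are assumptions made for this example, not the project's actual message format.

```python
import json


def build_message(speaker, emotion, probabilities):
    """Pack one prediction into a json string (hypothetical message layout)."""
    return json.dumps(
        {"speaker": speaker, "emotion": emotion, "probabilities": probabilities},
        sort_keys=True,
    )


def parse_message(text):
    """Decode a packet produced by build_message."""
    return json.loads(text)


packet = build_message("alice", "happy", {"happy": 0.8, "neutral": 0.2})
print(packet)
print(parse_message(packet)["emotion"])  # happy
```

A text payload like this can be sent as a websocket text frame by any client library, which keeps the device code independent of the language the back end is written in.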

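The front end is a web application written in JavaScript with the React library. It displays the data produced by the back end and the Raspberry Pi. Page navigation is implemented with react-router, and the data arrives over WebSocket. The Raspberry Pi sends a list of probability values to the client, which shows the most likely result.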

The hardware for the solution costs about 150$:
- Raspberry Pi 3 ~ 35$
- Google AIY Voice Bonnet ( respeaker) ~ 15$
- Intel NCS 2 ~ 100$
Repository: https://github.com/vladimirwest/OpenEMO
