30 ูููู
ุจุฑ - 1 ุฏูุณู
ุจุฑ ูู ููุฌูู ูููุบูุฑูุฏ ุนูุฏุช ูุงูุงุซูู OpenVINO . ุทููุจ ู
ู ุงูู
ุดุงุฑููู ุฅูุดุงุก ุญู ู
ูุชุฌ ูู
ูุฐุฌู ุจุงุณุชุฎุฏุงู
ู
ุฌู
ูุนุฉ ุฃุฏูุงุช Intel OpenVINO. ุงูุชุฑุญ ุงูู
ูุธู
ูู ูุงุฆู
ุฉ ุจุงูุนููุฉ ู
ู ุงูู
ูุงุถูุน ุงูุชู ูู
ูู ุฃู ุชุณุชุฑุดุฏ ุจูุง ุนูุฏ ุงุฎุชูุงุฑ ุงูู
ูู
ุฉ ุ ูููู ุงููุฑุงุฑ ุงูููุงุฆู ุจูู ู
ุน ุงููุฑู. ุจุงูุฅุถุงูุฉ ุฅูู ุฐูู ุ ุชู
ุชุดุฌูุน ุงุณุชุฎุฏุงู
ุงููู
ุงุฐุฌ ุงูุชู ูู
ูุชู
ุชุถู
ูููุง ูู ุงูู
ูุชุฌ.

ูู ุงูู
ูุงูุฉ ุณูุชุญุฏุซ ุนู ููููุฉ ุฅูุดุงุก ูู
ูุฐุฌูุง ุงูุฃููู ููู
ูุชุฌ ุ ูุงูุฐู ูุฒูุง ุจู ูู ุงูููุงูุฉ ุงูู
ุฑูุฒ ุงูุฃูู.
ุดุงุฑู ูู ูุงูุงุซูู ุฃูุซุฑ ู
ู 10 ูุฑู. ู
ู ุงูุฌู
ูู ุฃู ุจุนุถูู
ุฌุงุก ู
ู ู
ูุงุทู ุฃุฎุฑู. ุชู
ุงุฎุชูุงุฑ ู
ูุงู "ุงููุฑู
ููู ุนูู Pochain" ุ ุญูุซ ุชู
ุชุนููู ุงูุตูุฑ ุงููุฏูู
ุฉ ูููุฒูู ูููุบูุฑูุฏ ุ ููููู ู
ูุงููุง ูููุงูุงุซูู! (ุฃุฐูุฑู ุฃูู ูู ุงูููุช ุงูุญุงูู ุ ููุน ู
ูุชุจ ุฅูุชู ุงูู
ุฑูุฒู ูู ููุฌูู ูููุบูุฑูุฏ). ุชู
ู
ูุญ ุงูู
ุดุงุฑููู 26 ุณุงุนุฉ ููุชุงุจุฉ ุงูุฑู
ุฒ ุ ููู ุงูููุงูุฉ ูุงู ู
ู ุงูุถุฑูุฑู ุชูุฏูู
ูุฑุงุฑูู
. ูุงูุช ููุงู ู
ูุฒุฉ ุฅุถุงููุฉ ู
ููุตูุฉ ูุฌูุฏ ุฌูุณุฉ ุชุฌุฑูุจูุฉ ููุชุฃูุฏ ู
ู ุชูููุฐ ูู ุดูุก ุชุตูุฑู ุงูุญูููุฉ ุ ูุนุฏู
ุชุฑูู ู
ุน ุฃููุงุฑ ูู ุงูุนุฑุถ ุงูุชูุฏูู
ู. Merch ุ ุงููุฌุจุงุช ุงูุฎูููุฉ ุ ุงูุทุนุงู
ุ ูู ุดูุก ูุงู ููุงู ุฃูุถูุง!
ุจุงูุฅุถุงูุฉ ุฅูู ุฐูู ุ ูุฏู
ุช Intel ุจุดูู ุงุฎุชูุงุฑู ูุงู
ูุฑุงุช Raspberry PI ู Neural Compute Stick 2.
ุงุฎุชูุงุฑ ุงูู
ูุงู
. -, , , .
, , , . , OpenVINO, , . โ . . , OpenVINO , , :
: retail . . - โ .
, , . , , , !
:
Raspberry Pi 3 c Intel NCS 2.
NCS โ CNN , , ฬถฬถฬถฬถฬถฬถฬถ ฬถฬถ ฬถฬถฬถฬถฬถฬถฬถ .
: . USB-, RPI. โ โ. Voice Bonnet Google AIY Voice Kit, .
Raspbian AIY projects , , ( 5 ):
arecord -d 5 -r 16000 test.wav
, . , alsamixer, Capture devices 50-60%.

,
-
AIY Voice Kit , RGB-, . โGoogle AIY Ledโ : https://aiyprojects.readthedocs.io/en/latest/aiy.leds.html
, 7 , 8 , !
GPIO Voice Bonnet, ( AIY projects)
from aiy.leds import Leds, Color
from aiy.leds import RgbLeds
C dict, RGB Tuple aiy.leds.Leds, :
led_dict = {'neutral': (255, 255, 255), 'happy': (0, 255, 0), 'sad': (0, 255, 255), 'angry': (255, 0, 0), 'fearful': (0, 0, 0), 'disgusted': (255, 0, 255), 'surprised': (255, 255, 0)}
leds = Leds()
, , ( ).
leds.update(Leds.rgb_on(led_dict.get(classes[prediction])))

, !
pyaudio webrtcvad . , , .
webrtcvad โ 10/20/30, ( ) 48, 48000ร20/1000ร1()=960 . Webrtcvad True/False , .
:
- list , , , .
- >=30 (600 ), , >250, , , , , .
- < 30, 300, . ( )
def to_queue(frames):
d = np.frombuffer(b''.join(frames), dtype=np.int16)
return d
framesQueue = queue.Queue()
def framesThreadBody():
CHUNK = 960
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 48000
p = pyaudio.PyAudio()
vad = webrtcvad.Vad()
vad.set_mode(2)
stream = p.open(format=FORMAT,
channels=CHANNELS,
rate=RATE,
input=True,
frames_per_buffer=CHUNK)
false_counter = 0
audio_frame = []
while process:
data = stream.read(CHUNK)
if not vad.is_speech(data, RATE):
false_counter += 1
if false_counter >= 30:
if len(audio_frame) > 250:
framesQueue.put(to_queue(audio_frame,timestamp_start))
audio_frame = []
false_counter = 0
if vad.is_speech(data, RATE):
false_counter = 0
audio_frame.append(data)
if len(audio_frame) > 300:
framesQueue.put(to_queue(audio_frame,timestamp_start))
audio_frame = []
, github, , , . , , , OpenVINO โ IR (Intermediate Representation). 5-7 github, , โ .
:
, . OpenVINO :
- Open Model Zoo,
- Model Optimzer, (Tensorflow, ONNX e.t.c) Intermediate Representation,
- Inference Engine IR Intel, Myriad Neural Compute Stick
- OpenCV ( Inference Engine)
IR : .xml .bin.
IR Model Optimizer :
python /opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py --input_model speaker.hdf5.pb --data_type=FP16 --input_shape [1,512,1000,1]
--data_type
, . FP32, FP16, INT8. .
--input_shape
. C++ API, .
IR DNN OpenCV forward .
import cv2 as cv
emotionsNet = cv.dnn.readNet('emotions_model.bin',
'emotions_model.xml')
emotionsNet.setPreferableTarget(cv.dnn.DNN_TARGET_MYRIAD)
Neural Compute Stick, , Raspberry Pi , .
: ( 0.4), MFCC, :
emotionsNet.setInput(MFCC_from_window)
result = emotionsNet.forward()
. , - , . , โ . , . , .
, ( , , ).
.:
python3 voice_db/record_voice.py test.wav
( )
fast fourier transform, numpy array (.npy):
for file in glob.glob("voice_db/*.wav"):
spec = get_fft_spectrum(file)
np.save(file[:-4] + '.npy', spec)
create_base.py
:
for file in glob.glob("voice_db/*.npy"):
spec = np.load(file)
spec = spec.astype('float32')
spec_reshaped = spec.reshape(1, 1, spec.shape[0], spec.shape[1])
srNet.setInput(spec_reshaped)
pred = srNet.forward()
emb = np.squeeze(pred)
, , cosine distance ( , ) โ 0.3):
dist_list = cdist(emb, enroll_embs, metric="cosine")
distances = pd.DataFrame(dist_list, columns = df.speaker)
, 1-2 ( 7 2.5). -.
-
: , .
Raspberry Pi, websocket (http over tcp protocol).
, json , , . , . golang, , , .
, . , hub, ( ), ( ), , hub.

Front-end web-, JavaScript React . , , back-end Raspberry Pi. , react-router, , WebSocket. Raspberry Pi , probability . , , , , .

, , , , . , , , . โ , . , , , , .
, 150$:
- Raspberry Pi 3 ~ 35$
- Google AIY Voice Bonnet ( respeaker) ~ 15$
- Intel NCS 2 ~ 100$
:
: https://github.com/vladimirwest/OpenEMO

,
. . . , , AI .