How to remember everyone by face, or efficient face search in a large database

About myself


Hello, Habr! My name is Pavel, and I work as technical director at a company that makes IoT devices. We produce a lot of things, from smart home controllers to smart metering devices built on our patented sensor network protocol.


I also serve as general director of an IT company. In the past, I was an ACM ICPC semi-finalist (the world programming championship).


Motivation


I am writing this article because our team spent about a month looking for a way to store recognized faces in a database and search them efficiently (plus another two weeks to implement it and write tests), and I want to save you that time in your own projects. Spoiler: we did not find anything ready-made, such as a handy plug-in for an existing DBMS, and the deadlines were burning, so we wrote our own DBMS for this one task (storing a huge number of face embeddings). This article in no way claims to be an exhaustive guide, but I hope it will serve as a starting point for further study and for developing our ideas.


An embedding is a mapping from a discrete vector of categorical features to a continuous vector of a predetermined dimension.

So, where did it all begin?


- , , , / , . , , , , . "" 87%, , , . , 3 . . 2-3 . — … , , - "" ( ) 6 . .



, , : ( ) , , , , , , . 10 200 , . , 50 . 10.


, +- , .


, , , dlib, . C++, BLAS, Python CPU. .


, dlib , 0.6, . 128.
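To make the numbers above concrete, here is a minimal sketch of how two 128-dimensional dlib descriptors are usually compared: compute the Euclidean distance and treat anything below 0.6 as the same person. The function name `same_person` and the toy vectors are illustrative assumptions, not part of the original code.

```python
import math

THRESHOLD = 0.6  # distance cut-off commonly used with dlib face descriptors


def same_person(desc_a, desc_b, threshold=THRESHOLD):
    """Return True when two descriptors are closer than the threshold (L2 distance)."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptors must have the same dimension")
    return math.dist(desc_a, desc_b) < threshold


# Toy 128-dimensional vectors standing in for real descriptors (hypothetical data).
base = [0.0] * 128
close = [0.05] * 128   # distance is 0.05 * sqrt(128), roughly 0.57
far = [0.06] * 128     # distance is 0.06 * sqrt(128), roughly 0.68

close_match = same_person(base, close)
far_match = same_person(base, far)
```

Real descriptors would come from `facerec.compute_face_descriptor`; the threshold can be tightened (the listing further down uses 0.57) to trade recall for fewer false matches.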


, . , , , , , . , , k , k , , - -. , -, -.


.



. dlib .


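import glob
import os

import cv2
import dlib
from scipy.spatial import distance


def get_img_vector(img):
    # Detect faces (upsampling the image once) and describe the first one found.
    dets = detector(img, 1)
    for d in dets:
        shape = sp(img, d)
        return facerec.compute_face_descriptor(img, shape)
    return None  # no face found in the image

def prepare_database():
    # Map each file name (without extension) in _images/ to its face descriptor.
    database = {}

    for file in glob.glob("_images/*"):
        identity = os.path.splitext(os.path.basename(file))[0]
        img = cv2.imread(file, 1)

        vector = get_img_vector(img)
        if vector is not None:  # skip images where no face was detected
            database[identity] = vector

    return database

def who_is_it(img, shape, database):
    face_descriptor1 = facerec.compute_face_descriptor(img, shape)

    min_dist = float("inf")
    identity = None

    # Linear scan: compare the query descriptor with every stored one.
    for (name, db_enc) in database.items():
        dist = distance.euclidean(db_enc, face_descriptor1)

        if dist < min_dist:
            min_dist = dist
            identity = name

    print(min_dist)
    # Reject the best match if it is still too far away.
    if min_dist > 0.57:
        return None
    else:
        return str(identity)

if __name__ == "__main__":
    # Module-level names, used as globals by the functions above.
    sp = dlib.shape_predictor('weights/shape_predictor_face_landmarks.dat')
    facerec = dlib.face_recognition_model_v1('weights/dlib_face_recognition_resnet_model_v1.dat')
    detector = dlib.get_frontal_face_detector()

    database = prepare_database()
    # webcam_face_recognizer is not shown in this listing.
    webcam_face_recognizer(database)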

webcam_face_recognizer ( cv2- ) . who_is_it, . , , , , !


, 1 . (N*k), N — , k — . , , . , , - . .



, ?


— , , . — . , , , L2 .



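# Stack the stored descriptors into an (N, 128) matrix; the nearest face has the smallest distance.
vectors = np.asarray(list(database.values()))
scores = np.linalg.norm(vectors - np.asarray(face_descriptor1), axis=1)
min_el_ind = scores.argmin()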
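
As a self-contained illustration of the vectorized L2 search, here is a minimal sketch on synthetic data. The helper `nearest_identity`, the random matrix, and the `person_N` names are assumptions made for the example. Note that the nearest face is the one with the smallest distance, so the lookup uses `argmin`.

```python
import numpy as np


def nearest_identity(query, names, matrix, threshold=0.6):
    """Return (name, distance) of the closest row, or (None, distance) if too far."""
    # Broadcasting subtracts the query from every row of the (N, 128) matrix at once.
    dists = np.linalg.norm(matrix - query, axis=1)
    best = int(dists.argmin())
    best_dist = float(dists[best])
    if best_dist > threshold:
        return None, best_dist
    return names[best], best_dist


# Synthetic stand-in for a database of descriptors (hypothetical data).
rng = np.random.default_rng(0)
names = [f"person_{i}" for i in range(1000)]
matrix = rng.normal(size=(1000, 128))

# A query that is a slightly perturbed copy of row 42.
query = matrix[42] + 0.001
found_name, found_dist = nearest_identity(query, names, matrix)
```

The cost is still proportional to the number of stored faces, but the loop runs inside NumPy instead of the Python interpreter.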

, , , .


, , nmslib. HNSW k . , . :


import nmslib

index = nmslib.init(method='hnsw', space='l2', data_type=nmslib.DataType.DENSE_VECTOR)

# Add the descriptors themselves (the dict values), not (name, vector) pairs.
for idx, emb in enumerate(database.values()):
    index.addDataPoint(idx, emb)

index_params = {
    'indexThreadQty': 5,
    'skip_optimized_index': 0,
    'post': 3,
    'delaunay_type': 2,
    'M': 100,
    'efConstruction': 2000
}

index.createIndex(index_params, print_progress=True)
index.saveIndex('./db/database.bin')

HNSW .


"" . ?



, 4 . dlib , .




, . , , . .


PostgreSQL


- , , (, . ) .


:


import postgresql

def setup_db():
    db = postgresql.open('pq://user:pass@localhost:5434/db')
    db.execute("create extension if not exists cube;")
    db.execute("drop table if exists vectors")
    db.execute("create table vectors (id serial, file varchar, vec_low cube, vec_high cube);")
    db.execute("create index vectors_vec_idx on vectors (vec_low, vec_high);")

:


query = "INSERT INTO vectors (file, vec_low, vec_high) VALUES ('{}', CUBE(array[{}]), CUBE(array[{}]))".format(
            file_name,
            ','.join(str(s) for s in encodings[0][0:64]),
            ','.join(str(s) for s in encodings[0][64:128]),
        )
db.execute(query)

:


import time
import postgresql
import random

db = postgresql.open('pq://user:pass@localhost:5434/db')

for i in range(100):
    t = time.time()
    encodings = [random.random() for i in range(128)]

    threshold = 0.6
    query = "SELECT file FROM vectors WHERE sqrt(power(CUBE(array[{}]) <-> vec_low, 2) + power(CUBE(array[{}]) <-> vec_high, 2)) <= {} ".format(
        ','.join(str(s) for s in encodings[0:64]),
        ','.join(str(s) for s in encodings[64:128]),
        threshold,
    ) + \
            "ORDER BY sqrt(power(CUBE(array[{}]) <-> vec_low, 2) + power(CUBE(array[{}]) <-> vec_high, 2)) ASC LIMIT 1".format(
                ','.join(str(s) for s in encodings[0:64]),
                ','.join(str(s) for s in encodings[64:128]),
            )
    print(db.query(query))
    print('query time', time.time() - t, 'ind', i)

With 10^5 rows (4-core i5, 2.33 GHz): 0.034 s per query.
? +- ( ). , , … , .
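
The queries above store each 128-dimensional descriptor as two 64-dimensional `cube` values and combine the two partial distances as `sqrt(d_low^2 + d_high^2)`. The split is presumably needed because the `cube` extension limits the number of dimensions (100 by default). The sketch below, with a hypothetical helper `split_distance` and random vectors, checks that the combined value equals the ordinary Euclidean distance over all 128 components.

```python
import math
import random


def split_distance(a, b, cut=64):
    """Combine the L2 distances of the two halves into the full L2 distance."""
    d_low = math.dist(a[:cut], b[:cut])
    d_high = math.dist(a[cut:], b[cut:])
    # Squared distances add up across disjoint groups of coordinates.
    return math.sqrt(d_low ** 2 + d_high ** 2)


random.seed(1)
a = [random.random() for _ in range(128)]
b = [random.random() for _ in range(128)]

full = math.dist(a, b)            # distance over all 128 components
combined = split_distance(a, b)   # same value, computed from the two halves
```

This identity is why the `WHERE` and `ORDER BY` clauses can compare the combined expression directly against the 0.6 threshold.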


K-d tree


- , , " ", . , .


, , , .


K-d — k- . , ( , . .), .
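
For readers who have not met the structure before, here is a minimal textbook k-d tree with nearest-neighbor search. It is a generic sketch, not the implementation described in this article; the names `Node`, `build_kdtree`, `sq_dist`, and `nearest` and the 2-D sample points are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    point: Sequence[float]
    label: str
    axis: int                      # coordinate used to split at this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_kdtree(items, depth=0):
    """Build a tree from (point, label) pairs, cycling through the axes."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2          # the median point becomes the node
    point, label = items[mid]
    return Node(point, label, axis,
                build_kdtree(items[:mid], depth + 1),
                build_kdtree(items[mid + 1:], depth + 1))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return (squared_distance, label) of the stored point closest to query."""
    if node is None:
        return best
    d = sq_dist(node.point, query)
    if best is None or d < best[0]:
        best = (d, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the other side only if the splitting plane is closer than the best match.
    if diff * diff < best[0]:
        best = nearest(far, query, best)
    return best


points = [((2.0, 3.0), "a"), ((5.0, 4.0), "b"), ((9.0, 6.0), "c"),
          ((4.0, 7.0), "d"), ((8.0, 1.0), "e"), ((7.0, 2.0), "f")]
tree = build_kdtree(points)
best_sq, best_label = nearest(tree, (9.0, 2.0))
```

The pruning step is what lets a query skip whole subtrees: a branch is explored only when it could still contain a point closer than the current best.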


, :


.


(N) , , , (N). , !


"" , NDA.


. , .



15 . , . . .


— . , , :)


k-tree , ( ), , ( -), 4-6 . .



, . — 1,5 , , , , , , . , , .


, , , . , mail cloud. , .


If there are other options for solving this problem, I will gladly read about them in the comments.


And the moral of this fable: a crowd can bring down even a lion. Open source solutions, tested by time. Learn your algorithms, guys :)

