How lasers and sensors help spare judges' nerves

Hello, Habr!

Evaluating an athlete's performance gets harder every year. Speeds increase, programs become more complicated, and new elements and combinations appear. Compare the performances of skaters or gymnasts in London, Rio, Vancouver or Sochi with the programs that won their predecessors gold half a century ago. Even someone who does not follow sport will notice the difference.



Who are the judges? They know their field well, but they are still ordinary people who get tired, get distracted, blink and give in to emotion. The result is controversial decisions, after which fans are ready to call for the whole judging panel to be sacked.
Since people are not perfect, why not compensate for their weaknesses with the latest achievements of science and technology? Yet another dead heat at the finish line gave Eadweard Muybridge the idea that a well-timed photograph of the horses crossing the line would spare everyone the heated arguments, and their equally heated consequences, that follow when the stakes are high. The idea was quickly put into practice, and the photo finish was first used at horse races as early as the end of the 19th century. The first video replay marks its 65th anniversary this year. Since the 1970s, tennis has used the electronic line judge, a computerized system that determines where the ball landed.

Such systems work well when victory comes down to a single measurable action (crossing the finish line first, putting the ball in the goal, jumping higher than your rivals, and so on), but they are almost useless when the winner is decided by, for example, the technique with which elements are performed, their number and their order in the program. That calls for something more sophisticated than a simple instant replay. Fujitsu's answer is 3D sensor technology that scans, digitizes and evaluates athletes' movements in real time. How it works is described below.

In May 2016, Fujitsu and the Japan Gymnastics Association (JGA) signed a joint research agreement to create a judging support system based on 3D scanning and recognition technology. The JGA contributed the judges' practical knowledge, athlete data and a testing environment, while Fujitsu developed a prototype of the system using 3D sensors.

You will probably ask: why reinvent the wheel? Motion capture is a well-known technology that has long been used successfully in film and game development. Why not apply it here? The answer is simple. Dozens of body-worn sensors, each slightly smaller than a ping-pong ball, get in the way even in training, to say nothing of sending athletes onto the tatami or the field to compete for medals with them attached. It has been tried, but the technology rarely made it out of the laboratory. The collected data could of course be used to optimize training or prevent injuries, but it did nothing to make the judges' lives easier.

The development of the Internet of Things and the arrival of IoT sensors have done more for judging. Hidden in the equipment, they already help decide who performed better in some disciplines, such as archery and taekwondo. In archery, a sensor determines where the arrow hit the target; in taekwondo, sensors register strikes landing on the body protector and helmet. The idea is far from new (recall the tennis example), but as IoT develops, there are more and more opportunities to use sensors in different disciplines.
That said, IoT sensors will not produce a truly universal judging support system. First, each sport needs its own type of sensor. Second, in many cases the sensors have to be placed directly on the athletes. Third, they cannot build a 3D model of the athlete's movement in real time, so they cannot be used in sports where the movements, technique and difficulty of the elements are scored.

Gymnastics was not chosen as the starting point by accident. First, it stands out for the sheer variety of movements that athletes perform. In the long run this makes it possible to collect a large amount of data, build a highly versatile movement database from it and reuse that database in other sports.
The second reason is more prosaic. Gymnastics is a popular and well-developed sport in Japan. In addition, as the Japanese population ages (by 2035, the elderly will account for almost a third of the country's population), the government actively supports initiatives in sports and health care. As a result, Fujitsu found it relatively easy to obtain comprehensive support and expert help from the Japan Gymnastics Association, the International Gymnastics Federation and other interested organizations.

3D sensors


To get rid of the markers and sensors that had to be attached to athletes, Fujitsu decided to analyze depth images, that is, images in which each pixel stores the distance to the object at that point rather than a color. To scan human movement in three dimensions, the system uses 3D laser sensors that capture depth images, which represent the contours of the body's surface. Skeleton recognition technology is then applied to these images to determine the positions of the joints. This makes it possible to calculate the angles at the elbows, knees, spine and so on precisely, and to analyze body movement in detail from how these angles change over time. Relying on the model produced by the system, judges can determine, for example, whether the gymnast's back was straight during an element and decide on a deduction.
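As a rough illustration of the angle calculation described above, here is a minimal Python sketch that computes the angle at a joint from three 3D joint positions. The function name `joint_angle` and the sample coordinates are assumptions made for this example; the article does not describe how Fujitsu's system computes angles internally.

```python
import math

def joint_angle(a, b, c):
    """Return the angle in degrees at joint b, formed by the segments b->a and b->c.

    a, b, c are (x, y, z) joint positions, e.g. shoulder, hip and knee.
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(v1, v2))
    n1 = math.sqrt(sum(p * p for p in v1))
    n2 = math.sqrt(sum(q * q for q in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("two joints coincide, the angle is undefined")
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))

# Hypothetical joint positions in meters: shoulder, hip, knee on one vertical line.
shoulder, hip, knee = (0.0, 0.0, 1.5), (0.0, 0.0, 1.0), (0.0, 0.0, 0.5)
hip_angle = joint_angle(shoulder, hip, knee)  # a fully straight body line gives 180
```

Tracking such an angle frame by frame gives the change over time that the text mentions; a judge-facing tool could then compare it with whatever tolerance the rules allow.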

Capturing an athlete's fast movements accurately requires a high frame rate and a way of acquiring depth images that records every movement at high resolution and over long distances. For this reason, standard depth cameras were ruled out immediately. Such a camera acquires depth information quickly and at high resolution, but only at short range, no more than 5 meters, which severely limits its use at competition venues.
Laser sensors based on LIDAR (Light Detection and Ranging) fare better. They can acquire depth images of an object from up to 15 meters away, but the scanning speed and image quality depend on the configuration of the scanning system on the projection side and the optical system on the detection side. For example, in a system with a rotating polygonal mirror, after each scan line the system has to wait for the mirror to rotate into position before the next line can start, which greatly reduces speed.

Mirrors based on microelectromechanical systems (MEMS) can increase the scanning speed significantly, but even here some fine-tuning was needed. To use a scanning system based on laser sensors and MEMS mirrors in sports, the number of scan points has to be more than ten times higher than in existing LIDAR technology, which in turn means the MEMS mirror has to scan faster. Otherwise, high-resolution images cannot be obtained.
The MEMS mirror therefore had to be made smaller and paired with a lens that magnifies the scanning angle. If light is projected and detected coaxially, however, the smaller MEMS mirror, which is then also used for detection, can no longer collect all the light reflected from the target, so less light reaches the photodetector. To ensure enough detectable light, Fujitsu used an optical system with separate projection and detection units.

The figure below shows the configuration of the 3D laser sensor developed by Fujitsu Laboratories, which has a split projection/detection optical system built around a MEMS mirror.



To measure the distance to the target, the system uses the time-of-flight (ToF) method, which measures the time between emitting a laser pulse and detecting its reflection. If ΔT is the time the pulse takes to travel to the target, reflect and reach the detection unit, and c is the speed of light (approximately 300,000 km/s), the distance d to the target is given by:

d = (c × ΔT) / 2
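To get a feel for the numbers, the equation can be written as a couple of lines of Python. This is only a worked example of the formula above; the function names are made up for illustration, and the 100 ns round-trip time is an arbitrary sample value.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # exact value of c in meters per second

def tof_distance(delta_t_s):
    """Distance to the target in meters for a round-trip time delta_t_s in seconds."""
    # The pulse travels to the target and back, hence the division by two.
    return SPEED_OF_LIGHT_M_S * delta_t_s / 2

def round_trip_time(distance_m):
    """Inverse of tof_distance: round-trip time in seconds for a given distance."""
    return 2 * distance_m / SPEED_OF_LIGHT_M_S

distance = tof_distance(100e-9)  # a 100 ns round trip corresponds to roughly 15 m
```

The example also shows why the timing electronics have to be so precise: at a range of 15 meters the whole round trip takes only about a tenth of a microsecond.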

But the difficulties did not end there. First, the sensors had to be reasonably free to position: competition venues all differ, so it is not always possible to install them at a fixed, known distance from the athletes. A sensor produces a high-resolution depth image of an object at close range, but if the object moves farther away while the viewing angle stays the same, the resolution drops. To avoid this, we added viewing angle control to the system.
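The effect of distance on resolution is easy to estimate with simple geometry. The sketch below assumes a flat target facing the sensor and scan points spread evenly across the scanned width; the function names and all numbers (a 40-degree viewing angle, 200 points per line) are illustrative assumptions, not parameters of the Fujitsu sensor.

```python
import math

def point_spacing(distance_m, fov_deg, points_per_line):
    """Approximate gap in meters between neighboring scan points on the target."""
    scanned_width = 2 * distance_m * math.tan(math.radians(fov_deg) / 2)
    return scanned_width / points_per_line

def fov_for_spacing(distance_m, spacing_m, points_per_line):
    """Viewing angle in degrees that keeps the given point spacing at this distance."""
    half_width = spacing_m * points_per_line / 2
    return math.degrees(2 * math.atan(half_width / distance_m))

near = point_spacing(5.0, 40.0, 200)    # spacing at 5 m
far = point_spacing(10.0, 40.0, 200)    # same viewing angle at 10 m: twice as coarse
narrowed = fov_for_spacing(10.0, near, 200)  # narrower angle restores the 5 m spacing
```

In other words, with a fixed viewing angle the gap between points grows in proportion to distance, and narrowing the angle as the athlete moves away is one way to keep the density of points on the body roughly constant.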

It was also necessary to cut out stray light entering the system (sunlight, spotlights, camera flashes and so on). For this, a multi-segment light detection technology was developed: the scanning system synchronizes with the MEMS mirror control signals and selectively switches on only the photodetector segment that receives the most light reflected from the object, while disabling all the others, which are affected by ambient light.
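The idea of enabling a single detector segment in step with the mirror can be sketched in a few lines. This is a toy model only: it assumes the detector is split into equal segments and that the mirror angle maps linearly onto them, which the article does not state. The names `active_segment` and `enable_mask` and the angle range are hypothetical.

```python
def active_segment(mirror_angle_deg, min_angle_deg, max_angle_deg, n_segments):
    """Index of the detector segment expected to receive the reflected pulse.

    Assumption for this sketch: the scan range is divided evenly between segments.
    """
    if not min_angle_deg <= mirror_angle_deg <= max_angle_deg:
        raise ValueError("mirror angle is outside the scan range")
    fraction = (mirror_angle_deg - min_angle_deg) / (max_angle_deg - min_angle_deg)
    # The upper edge of the range belongs to the last segment.
    return min(int(fraction * n_segments), n_segments - 1)

def enable_mask(mirror_angle_deg, min_angle_deg, max_angle_deg, n_segments):
    """One flag per segment: True only for the segment that should be switched on."""
    chosen = active_segment(mirror_angle_deg, min_angle_deg, max_angle_deg, n_segments)
    return [index == chosen for index in range(n_segments)]

# Hypothetical scan range of -20..20 degrees and a four-segment detector.
mask = enable_mask(0.0, -20.0, 20.0, 4)
```

Because the disabled segments contribute nothing to the signal, whatever ambient light falls on them is simply ignored.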

Finally, several 3D laser sensor units were synchronized with one another to avoid blind spots.

So the problem of acquiring depth images of athletes' movements quickly and in high quality was solved. All that remained was to analyze them.

Skeleton Recognition Technology


Skeleton recognition technology extracts the positions of the joints of the human body from the depth images produced by the 3D sensors. In sports such as artistic and rhythmic gymnastics, figure skating and diving, the 3D joint positions and joint angles must be extremely accurate, because the score depends on them, and the score ultimately decides the winner.

The following figure shows the principle behind the technology, which delivers both high speed and high accuracy of skeleton recognition. In the preparatory stage the system is trained to locate the joints in an image and build a 3D model of the body position from them, and it continues to learn from the new data it receives during operation.



In the training stage, prediction models are built that estimate joint coordinates from depth images. To prepare the training set for machine learning, depth images were generated with computer graphics from previously captured movements with known joint coordinates.

Then, in the recognition stage, the prediction model built during training is applied to the multi-viewpoint depth images from several 3D laser sensors to obtain the three-dimensional coordinates of the joints, that is, to recognize the skeleton. These joint coordinates are then used as initial values for fitting a human body model to the point cloud that corresponds to the depth images from each sensor. This process is called fitting. To bring the surface of the human model as close as possible to the point cloud, a degree of coincidence (likelihood) is defined, and the coordinates with the maximum likelihood are searched for; these become the final three-dimensional joint coordinates.
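To make the fitting idea more concrete, here is a heavily simplified Python sketch. Instead of a full human body model it fits a single sphere of known radius (a stand-in for one joint) to a small point cloud, starting from a predicted position and searching for the center with the highest likelihood. The Gaussian likelihood, the coordinate-wise search and every name and number here are assumptions chosen for illustration; the article does not describe Fujitsu's actual model or optimizer.

```python
import math

def log_likelihood(center, points, radius, sigma=0.01):
    """Gaussian log-likelihood (up to a constant) of the points lying on the sphere."""
    total = 0.0
    for point in points:
        residual = math.dist(point, center) - radius
        total -= residual ** 2 / (2 * sigma ** 2)
    return total

def fit_center(initial, points, radius, step=0.02, min_step=1e-4):
    """Refine the predicted center by a coordinate-wise search for maximum likelihood."""
    best = list(initial)
    best_ll = log_likelihood(best, points, radius)
    while step > min_step:
        improved = False
        for axis in range(3):
            for delta in (step, -step):
                candidate = list(best)
                candidate[axis] += delta
                candidate_ll = log_likelihood(candidate, points, radius)
                if candidate_ll > best_ll:
                    best, best_ll, improved = candidate, candidate_ll, True
        if not improved:
            step /= 2  # no better neighbor at this scale, so search more finely
    return tuple(best)

def sphere_points(center, radius):
    """Six sample surface points, one at each end of the three axes (toy point cloud)."""
    points = []
    for axis in range(3):
        for sign in (1, -1):
            point = list(center)
            point[axis] += sign * radius
            points.append(tuple(point))
    return points

cloud = sphere_points((1.0, 2.0, 0.5), 0.05)          # "measured" surface points
predicted = (1.01, 1.99, 0.505)                       # initial value from the model
refined = fit_center(predicted, cloud, radius=0.05)   # moves toward (1.0, 2.0, 0.5)
```

The better the initial value, the smaller the region that has to be searched, which is why the quality of the prediction model affects both accuracy and processing time.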

Skeleton recognition by machine learning alone is usually not very accurate, because the joint positions are only estimated by the prediction model. The subsequent fitting step improves accuracy by comparing the joint positions with actually measured values, namely the point clouds from several 3D laser sensors. The accuracy of the machine-learning estimate determines the search range for fitting and therefore affects both the accuracy of the final result and the processing time. To improve the accuracy of machine-learning-based recognition, several prediction models are prepared, one for each group of body positions, such as front-facing, handstand and rear-facing, and the optimal model is selected by determining the body position before the skeleton is recognized. Compared with merging all movements into a single prediction model, this approach significantly increases recognition accuracy by limiting the movements each model has to learn.
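The model-switching step can be pictured as a simple dispatch: first decide which group of body positions the frame belongs to, then hand the depth image to the matching model. The sketch below is purely schematic. The cues used to classify the posture (head height versus hip height, and whether the athlete faces the sensor) are invented for this example, and the models are empty stubs standing in for trained predictors.

```python
def classify_posture(head_z, hip_z, facing_sensor):
    """Rough posture class; the cues are assumptions made for this sketch."""
    if head_z < hip_z:
        return "handstand"  # head below the hips: the body is inverted
    return "front" if facing_sensor else "rear"

def make_stub_model(name):
    """Placeholder for a trained prediction model of one posture group."""
    def predict(depth_image):
        # A real model would return estimated joint coordinates here.
        return {"model": name, "joints": None}
    return predict

MODELS = {name: make_stub_model(name) for name in ("front", "rear", "handstand")}

def recognize(depth_image, head_z, hip_z, facing_sensor):
    """Pick the prediction model for the detected posture and run it."""
    posture = classify_posture(head_z, hip_z, facing_sensor)
    return MODELS[posture](depth_image)

result = recognize(depth_image=None, head_z=0.4, hip_z=1.1, facing_sensor=True)
```

Each model then only has to cover a narrow family of poses, which is the reason the text gives for the gain in accuracy.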



This image shows the results of machine-learning-based skeleton recognition using several sensors at a gymnastics competition. For circles on the pommel horse, the prediction model for the front-facing position is used; for the vault, the model for the handstand position is used. These results show that switching between prediction models for different body positions allows accurate skeleton recognition even for the complex movements typical of gymnastics.

Implementation and Application


The first test demonstration of the system took place in October 2016 at the Congress of the International Gymnastics Federation, after which work began on putting the technology into practice. In October 2017, the first trial using actual competition data was carried out at the 47th World Gymnastics Championships in Montreal.

At the 2019 World Gymnastics Championships in Stuttgart, the Fujitsu system was officially approved as an auxiliary tool for assessing difficulty on four apparatus: pommel horse, rings, and men's and women's vault.
It is worth noting that the Fujitsu 3D sensor system is not limited to helping judges. There are many other potential uses.

By following gymnasts' performances, the system learns to recognize the most varied and complex movements. It should therefore soon be possible to adapt it to other sports; all that is needed is a suitable prediction model for each discipline. This will not only help judges reach decisions faster, which should benefit television coverage (fewer jury deliberations means more airtime for the athletes), but also help viewers better understand what is happening in the arena. Processed images from the scanners are excellent for visualizing individual moments of a performance, such as the execution of difficult elements or mistakes.




Athletes and coaches can use the 3D models obtained from scanning to improve technique, optimize training and prevent injuries. Such a system also opens up new possibilities for remote training and consultation, since its models convey an athlete's technique far better than ordinary video recordings. And because human movement is represented digitally, the data can also be used for research.

This scenario is especially relevant right now. Travel between cities, let alone between countries, is restricted, yet athletes still need practice and competent advice from coaches and other specialists so as not to lose form while waiting for sporting life to return to normal.

The hefty tomes of judging rules, peppered with static illustrations and lengthy explanations of how a gymnast should perform each exercise, could also be retired. The future belongs to applications: with the data and models obtained from 3D sensors, judges could get an application that combines the rules with detailed, animated images of correct technique, leaving minimal room for discrepancies or double interpretations.

Finally, there are plans to use the 3D scanning and recognition system for patient rehabilitation. It helps visualize the recovery of joint mobility and adjust treatment accordingly. Interestingly, the technology originally grew out of Fujitsu Laboratories' work on rehabilitation in medical institutions. History does indeed move in cycles.

Useful links

3D Sensing Technology for Real-Time Quantification of Athletes' Movements
ICT-based Judging Support System for Artistic Gymnastics and Intended New World Created Through 3D Sensing Technology
"A step towards the future" with the first official use of Fujitsu technology to support judging at the 2019 Artistic Gymnastics World Championships
