Semantic Digital Systems

In the previous text (Myths of Semantic Technology), it was stated, somewhat provocatively, that there is no semantics in IT semantics. Two questions, however, need to be answered separately: (1) do the data contain meaning? and (2) does the computer understand this meaning? We will leave the second question to philosophers, although the answer to it is already obvious. The answer to the first is also obvious: information systems exist to process information, that is, meaningful, semantically defined data. One must understand, though, that these data are meaningful only to the person who initially produces them, writes the program that processes them, and ultimately perceives their meaning.

Different IT systems relate differently to the content of data. Some applications are indifferent to meaning: they process data while completely ignoring its content. These include the simplest programs that work with text, sound, and images. Their algorithms pay no attention to the content of the files they process. A text editor does not care whether a business contract, a scientific article, or a schoolchild's homework is loaded into it.

The remaining IT systems are sensitive to semantics, that is, they react in some way to the content of the data. Unlike systems of the first type, such systems store data not in “flat” files but as structured arrays broken down into types and values. It is this data structure that carries the semantics. There are two ways to specify the semantics of data: (1) through the architecture of the system, for example through the structure of the database tables, and (2) through the configuration of the data itself. In other words, the semantics of the data is either rigidly determined by the structure of the application or, independently of the application, embedded in the data itself. The second way of structuring data, in which the data model is determined by the data itself, is what we call semantic.
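The difference between the two ways can be sketched in a few lines of Python. This is only an illustration under assumptions, not a reference to any standard: the `Contract` class, the `statements` list, the `describe` function, and all property names are hypothetical.

```python
from dataclasses import dataclass

# Way 1: semantics fixed by the application architecture.
# The meaning of each field lives in the class definition (hypothetical example).
@dataclass
class Contract:
    number: str
    customer: str

rigid = Contract(number="C-17", customer="Acme")

# Way 2: semantics carried by the data itself.
# Every statement is a (subject, property, value) triple, and the types and
# properties are declared by statements of exactly the same form.
statements = [
    ("Contract", "is_a", "Entity"),
    ("customer", "is_a", "Property"),
    ("c17", "is_a", "Contract"),
    ("c17", "customer", "Acme"),
]

def describe(subject, data):
    """Collect all properties of a subject without knowing its type in advance."""
    return {prop: value for (s, prop, value) in data if s == subject}

description = describe("c17", statements)
print(description)
```

In the first case, adding a new field means changing the program; in the second, it means adding one more statement to the data.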

So, a special type of IT system should be singled out: one that operates on a special semantic data format. The main distinguishing feature of semantic systems is that the data processing algorithms are defined not by the application architecture (database structure or program code) but by the data itself: data values, their types, and their logical relations are written as an array of statements in a uniform format. That is, on the one hand we have a format in which the data describe themselves and their semantics, and on the other, universal applications that process data of arbitrary semantics, provided the data conform to the format. Here it is indeed tempting to say that semantic systems understand the meaning of data, although we should speak only of the formal distinction of one meaning from another, without any understanding on the part of the computer.

It should be noted that semantic systems have not yet fully caught up with their non-semantic competitors. So far, semantic markup can capture only a static data structure: it can describe entities, properties, individuals, and the property values of individuals, establish subordination relationships between entities, and set rules for deriving new statements. That is, a modern semantic system is essentially a universal data store that supports complex search and can generate new data according to the axioms and rules contained in the data itself. The store can be either distributed (networked) or local. What the technology still lacks is a specification for describing actions, that is, a way of embedding business process models in the semantic data.
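Deriving new statements from existing ones can be illustrated with a minimal sketch. The triple layout and the names `is_a`, `subclass_of`, and `infer_types` are assumptions made for this example; for brevity the single rule is hard-coded in the function, whereas in a semantic system the rule itself would also be stored in the data.

```python
def infer_types(statements):
    """Apply one rule until nothing new appears:
    if (x is_a C) and (C subclass_of D), then (x is_a D)."""
    facts = set(statements)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(facts):
            if p1 != "is_a":
                continue
            for (c2, p2, d) in list(facts):
                if p2 == "subclass_of" and c2 == c and (x, "is_a", d) not in facts:
                    facts.add((x, "is_a", d))
                    changed = True
    return facts

# Hypothetical data: a two-level hierarchy of entities and one individual.
data = [
    ("Manager", "subclass_of", "Employee"),
    ("Employee", "subclass_of", "Person"),
    ("anna", "is_a", "Manager"),
]

# Statements that were not written down explicitly but follow from the data.
derived = infer_types(data) - set(data)
print(sorted(derived))
```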

Let us try to highlight the advantages of semantic systems over standard ones and the conditions necessary for realizing these advantages (the description is given without reference to any particular standards).

First of all, semantic systems are universal applications that are not rigidly tied to a subject area. To work with different data models, no changes need to be made to the application; it is only necessary to describe the structure of the subject area in a special language, that is, to create its ontology, and to load the ontology into the application together with the actual data. Moreover, the data structure can be freely modified at any time and extended with new concepts, relationships, and rules.

Obviously, semantic applications are generally slower than those whose data structure and algorithms are hard-coded. However, there are many business processes for which the speed of modeling and the ability to modify models freely matter more than the speed of the application.

One of the most important advantages of semantic technology is the automation of data exchange. Thanks to the universal data description format, independent applications can interact freely. To realize this feature fully, two conditions must be met: (1) applications use shared dictionaries containing entity definitions, and (2) applications support unique identification of entities, which prevents collisions. The dictionaries must themselves be in the semantic data format, and their elements must also have unique identifiers. As a result, ontologies can be used collectively and data can be exchanged freely, without any API.
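The two conditions can be illustrated with a small sketch. Everything here is hypothetical: the `dict:`, `a:`, and `b:` prefixes stand in for some scheme of unique identifiers, and `DICTIONARY` stands in for a shared dictionary; no particular standard is implied.

```python
# Shared dictionary (hypothetical): terms that both applications agree on.
DICTIONARY = {"dict:type", "dict:Person", "dict:worksFor"}

# Two independent applications. Each prefixes its own individuals with a
# unique namespace, so identifiers from different sources cannot collide.
app_a = {
    ("a:anna", "dict:type", "dict:Person"),
    ("a:anna", "dict:worksFor", "a:acme"),
}
app_b = {
    ("b:boris", "dict:type", "dict:Person"),
    ("b:boris", "dict:worksFor", "a:acme"),
}

def uses_only_shared_terms(statements, dictionary):
    """Check that every property in the statements is defined in the dictionary."""
    return all(prop in dictionary for (_, prop, _) in statements)

# Because the format and the dictionary are shared, exchange is a plain union
# of statement sets; no application-specific API is involved.
merged = app_a | app_b
employees = sorted(s for (s, p, o) in merged if p == "dict:worksFor" and o == "a:acme")
print(employees)
```

If one application used its own private property name instead of `dict:worksFor`, the check would fail and the merged data could not be queried uniformly, which is why the shared dictionary is a precondition.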

The semantic representation of data, that is, the combination of factual data and their conceptual schema in a single array, makes it possible to implement complex searches that take all kinds of conditions and dependencies into account. Moreover, the search can be carried out not only in the local ontology repository but also across a variety of applications on the network.
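A search that uses both the facts and the conceptual schema might look like the following sketch. The functions `subclasses` and `find`, the property names, and the sample data are assumptions for illustration only.

```python
def subclasses(cls, statements):
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for (s, p, o) in statements:
            if p == "subclass_of" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found

def find(cls, conditions, statements):
    """Find individuals of cls (or any subclass) that satisfy
    every (property, value) pair in conditions."""
    classes = subclasses(cls, statements)
    candidates = {s for (s, p, o) in statements if p == "is_a" and o in classes}
    return sorted(
        s for s in candidates
        if all((s, prop, value) in statements for (prop, value) in conditions)
    )

# Schema and facts live in the same set of statements (hypothetical data).
data = {
    ("Manager", "subclass_of", "Employee"),
    ("anna", "is_a", "Manager"),
    ("boris", "is_a", "Employee"),
    ("anna", "city", "Riga"),
    ("boris", "city", "Oslo"),
}

result = find("Employee", [("city", "Riga")], data)
print(result)
```

Here `anna` is found as an employee although she is recorded only as a manager: the search takes the subordination relation from the schema into account along with the factual condition.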

So, the main task of semantic technologies is to unify work with data in order to streamline the construction of symbolic models of subject areas, automate data exchange between independent applications, and refine data search. This task is solved by (1) including metadata in the data itself, (2) unifying the data format, (3) introducing unique identification of data, and (4) standardizing dictionaries and inference rules.

Continued in: Semantics and Activities