How Sports.ru wrote its own WYSIWYG editor

In mid-2018, Sports.ru started thinking about moving to a new WYSIWYG text editor for user posts. The editor has been in beta since June 2019. Along the way we solved many problems, both in designing the architecture of the whole service and in implementing the editor itself in the browser on top of the ProseMirror library, and we decided to share our experience.



Table of contents


1. Introduction
1.1. Why we needed WYSIWYG
1.2. Description of the task facing the developers
2. How to choose a tool
3. What happened
3.1. Service Architecture
3.2. What difficulties we faced
4. Beta Results


1. Introduction



1.1. Why we needed WYSIWYG


Sports.ru is a sports media outlet with an audience of 20 million users per month. What sets us apart from classic media is our community and UGC. User content (ratings, comments, chats, posts) not only complements the editorial content but also creates a platform where users interact with each other. Every month our users write almost 10 thousand posts. The best of them appear on the site's main page alongside editorial pieces and are sent to mobile applications and social networks. User content accounts for about 40% of all reads on Sports.ru pages.

We want to be the most convenient platform for sports authors, helping them create content and deliver it to an interested audience. For 10 years we used the TinyMCE editor, and eventually it became outdated: it no longer suited either the team or users accustomed to modern editors.


Fig. 1. The interface of the old editor based on TinyMCE

Blog authors regularly sent us complaints along these lines:

  • I wrote for a long time, then accidentally closed the tab and everything was gone;
  • writing long texts is very inconvenient;
  • it's annoying that to insert each image you must first upload it to an image hosting service.

The team also had its own complaints:

  • TinyMCE cannot upload images directly from a file; you can only attach links to images. Because users could not upload images to our storage, we could do nothing when those links died;
  • the text editing and formatting options were unbalanced. On the one hand, some styles were missing, for example for internal headings in the text. On the other hand, the available tools could be used in any combination. As a result, posts looked inconsistent (Sports.ru had started implementing a design system, and user posts should conform to it);
  • content is created and stored as HTML, which makes it hard to manage post styles on different clients or simply to change the layout of posts.

What follows is the story of how we built the new editor on the frontend. We will also touch on the product, design and backend sides, because without them it is hard to understand why particular frontend decisions were made.


1.2. Description of the task facing the developers


In short, our starting premise was that blogging on Sports.ru is a pain. In principle, we could have skipped building a new editor and simply added autosave and image uploads to our own storage, and most complaints from users and employees would have disappeared. But we did not want to keep supporting a tool built on old technologies; we wanted a new, modern editor that we could easily develop and scale.

Besides the inconvenient interface, one of the old editor's main technical problems was that post content was saved directly as an HTML string. Changing the appearance of a post either required backend developers or was done at runtime on the client (for example, placing ad units in the body of a post). Our task, among other things, was to separate the data from its presentation: keep the layout and interface in client code, and work with the data in server code.

We took Medium as a role model, occasionally borrowing ideas from Google Docs. Besides solving the problems already identified, we decided to add several new features that would make the editor more comfortable to use:

  • WYSIWYG, i.e. what you see is what you get;

At the same time, the editor itself should not be tied to Sports.ru specifics, because Sports.ru, although our company's flagship project, is not the only one. The company also develops the international sports media Tribuna and Betting Insider, a social network for betting enthusiasts, and has recently launched its own production studio for advertising projects. Developing an online editor is expensive enough that we want to be able to reuse the code on another site with a different layout and styles and its own set of editing and formatting tools.

We have a lot of text content, and before starting work on the new post editor we thought about how it should be stored. TinyMCE gave us no choice: content could only be stored as HTML, which, as mentioned above, did not suit the team. As a result, we designed our own format for storing text data that meets our requirements and called it structured body.

Structured body is an array of objects that reflects the structure of the content. The content is divided into elements, which are independent blocks such as a paragraph, a list or a picture. Each element stores its type and its properties. For example, the subtitle block describes a heading in the text and must contain the fields text and level: text holds the heading text, and level holds the heading level (from 1 to 4). A structured body consisting of a single second-level heading might look like this:

const structuredBody = [
    {
        type: 'subtitle',
        value: {
            text: '    ',
            level: 2,
        },
    },
];

The transition to structured body let us start separating business logic, data and presentation. Ultimately, we want the server and clients to exchange only data, with each client deciding for itself how to display that data to the end user.

Content in the structured body format is stored as JSON, and to validate it we created a JSON schema called structured body schema. The schema describes all valid elements and their properties, so we can be sure that the same set of keys and values is used wherever structured body is needed.

It also allows different teams to use the same services to process content in this format, for example a service that generates HTML from structured body for display, or an editor for creating content. This significantly reduces the cost of developing and supporting the whole set of services related to creating and displaying content.
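To make the validation idea concrete, here is a minimal sketch of checking a structured body before it is used. It is a hand-written stand-in, not the actual structured body schema: only the shape of the subtitle element comes from the description above, while the function name validateStructuredBody and the rule table are assumptions made for illustration.

```javascript
// Hypothetical rule table: one predicate per known element type.
// Only the 'subtitle' shape (text + level from 1 to 4) is taken from the article.
const elementRules = new Map([
    [
        'subtitle',
        (value) =>
            typeof value.text === 'string' &&
            Number.isInteger(value.level) &&
            value.level >= 1 &&
            value.level <= 4,
    ],
]);

// Returns a list of human-readable errors; an empty list means the body is valid.
function validateStructuredBody(body) {
    if (!Array.isArray(body)) {
        return ['structured body must be an array'];
    }
    const errors = [];
    body.forEach((element, index) => {
        const type = element && element.type;
        const rule = elementRules.get(type);
        if (!rule) {
            errors.push(`element ${index}: unknown type "${type}"`);
        } else if (!element.value || !rule(element.value)) {
            errors.push(`element ${index}: invalid value for "${type}"`);
        }
    });
    return errors;
}
```

A real setup would more likely feed the JSON schema to a schema validator; the point here is only that every consumer checks the same set of keys and values.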

The new editor was supposed to accept and produce content exclusively in the structured body format. There was a subtle point here: since posts used to be saved directly as HTML, that HTML string from the database was sent to the client for display (from here on, "client" means only the browser unless otherwise specified). We now want to store the content of all posts in structured body, but clients can only process HTML. So, alongside the move to a new editor, there is a parallel task: implementing a new way for clients to display posts for reading directly from structured body. We decided it is better to eat the elephant piece by piece: first abandon TinyMCE completely, and only then take on the logic of displaying posts for reading. Moreover, not all old posts could be converted to the new format, so those posts will always be stored only as HTML and must remain readable too.

In total: some posts (all new ones and the old ones that were successfully converted) will be stored in two formats, HTML and structured body, until the new display logic for reading is implemented, while the rest (most of the old and very old posts) will remain only in HTML.


2. How to choose a tool


We had to implement creating and editing posts on the client, taking into account the features and limitations above. As always, you can take a ready-made solution or build your own.

To begin with, we looked at which ready-made libraries for building WYSIWYG editors exist and whether they suit us. We shortlisted Slate, Draft.js and ProseMirror.

Besides storing content as a data structure, it was critical for us that the library works with Vue or plain JS, because we had already begun moving the site to a new stack based on Vue + Vuex. We also wanted to be able to extend the library with new modules (third-party or our own) when necessary.

Tab. 1. Comparison of the reviewed libraries by the most important parameters for Sports.ru


As the table shows, ProseMirror fully met our requirements, so we dropped the idea of writing our own library for editing text content and started studying ProseMirror in more detail. There is also the rather popular Quill, which is missing from our comparison only because we honestly forgot about it when selecting a tool. It meets our key requirements too, but that is how things turned out. We have already described what ProseMirror is and how to work with it in another article.


3. What happened



3.1. Service Architecture


The content editor on the client is far from the whole story. You need to place the editor in an existing project, display it somewhere on a web page, think through the interaction with the backend, support two editors simultaneously (we could not abandon the old one right away) and store content in two formats (HTML and structured body).

All these tasks can be divided into frontend, backend and integration tasks. We are primarily concerned with frontend and integration issues, although we also mention a couple of important backend aspects.

Frontend services for the editor can be divided into several levels:

  1. web page for creating and editing a post;
  2. a Vue app containing the form in which the editor lives, together with additional post fields such as the title and annotation;
  3. the WYSIWYG editor itself, built on ProseMirror and wrapped in Vue;
  4. SB2HTML, a Node.js service that generates HTML from structured body.

The process of saving a post is shown in Fig. 2. The post content in the structured body format and its metadata are sent to the backend. The backend sends the content to the SB2HTML service, receives the generated HTML in response, stores everything in the database and tells the client either that the post was saved successfully or that an error occurred.
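The HTML generation step can be pictured roughly as follows. This is only a sketch of the idea behind a service like SB2HTML, not its actual code: the paragraph element, the mapping of subtitle levels to h1..h4 tags and the function names are all assumptions.

```javascript
// Escape text so that user input cannot inject markup into the generated HTML.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical renderers, one per element type.
const htmlRenderers = new Map([
    // Assumption: a subtitle of level N is rendered as an <hN> tag.
    ['subtitle', ({ text, level }) => `<h${level}>${escapeHtml(text)}</h${level}>`],
    // Assumption: a 'paragraph' element with a plain `text` field exists.
    ['paragraph', ({ text }) => `<p>${escapeHtml(text)}</p>`],
]);

// Turn a structured body (array of elements) into one HTML string.
function structuredBodyToHtml(body) {
    return body
        .map((element) => {
            const render = htmlRenderers.get(element.type);
            if (!render) {
                throw new Error(`No HTML renderer for element type "${element.type}"`);
            }
            return render(element.value);
        })
        .join('\n');
}
```

Keeping one renderer per element type means a change in post layout touches only this table, not the stored data.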


Fig. 2. Scheme for saving a post when creating or editing in a WYSIWYG editor


3.2. What difficulties we faced


There were many difficulties; they arose constantly and often at the most unexpected moments.

As we already said, the content editor sits inside a form that lets you enter the additional data needed to create a post, such as the title and annotation. For the annotation, it must be possible to upload images from a file and via a link from the Internet. But we also want to upload images into the content from a file and by link, following the same rules. Here we faced a dilemma: on the one hand, the post content is isolated from the outer form during editing and is handled by ProseMirror tools; on the other hand, we want to follow the DRY principle and not duplicate the same code. We solved it as follows: we described image uploading as a set of methods in an object at the Vue form level and passed this object as one of the parameters to the constructor of the WYSIWYG editor.
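The shape of that solution might look like the sketch below. All names here (createImageUploader, createEditor, uploadFile, uploadByUrl, the image element with a src field) are hypothetical, and the transport is injected so that the example does not depend on any real upload API.

```javascript
// Defined once at the form level. `transport` is whatever actually performs
// the upload and resolves to an object with the stored image address.
function createImageUploader(transport) {
    return {
        uploadFile: (file) => transport({ kind: 'file', payload: file }),
        uploadByUrl: (url) => transport({ kind: 'url', payload: url }),
    };
}

// The editor receives the same uploader object through its constructor
// parameters instead of re-implementing the upload rules.
function createEditor({ imageUploader }) {
    return {
        async insertImageFromUrl(url) {
            const { src } = await imageUploader.uploadByUrl(url);
            // Hypothetical image element; the real element shape is not shown in the article.
            return { type: 'image', value: { src } };
        },
    };
}
```

The form uses the uploader for the annotation image and hands the very same object to the editor, so both code paths follow one set of upload rules.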

The entities that describe content, Node and Fragment, are defined in the ProseMirror model. However, a transaction uses only indexes to determine the range of characters it applies to (indexes are counted both from the beginning of the document and from the beginning of the parent node). Indexing is one of ProseMirror's central concepts, but when editing and formatting text it is much more convenient to think in terms of entities from the ProseMirror model. So, to work with content comfortably, we wrote our own helpers that simplify interaction with the document in transactions. After we started, the tiptap library appeared, which offers a set of similar helpers.
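To illustrate the gap between "thinking in nodes" and "thinking in indexes", here is a sketch of one such helper. It does not use the ProseMirror API; it works on a simplified plain-object tree that follows ProseMirror's documented counting rules (one token per character, one per leaf node, one each for the start and end of any other node). The names nodeSize and findRanges, and the requirement that every non-leaf node carries a content array, are assumptions of this model.

```javascript
// Size of a node in position tokens.
function nodeSize(node) {
    if (typeof node.text === 'string') return node.text.length; // text node
    if (!node.content) return 1; // leaf node, e.g. an image
    // Non-leaf node: its content plus one token for its start and one for its end.
    return 2 + node.content.reduce((sum, child) => sum + nodeSize(child), 0);
}

// Collect { from, to } ranges for every node of the given type.
// `from` is the position directly before the node, `to` directly after it.
function findRanges(doc, type) {
    const ranges = [];
    const walk = (parent, contentStart) => {
        let pos = contentStart;
        for (const child of parent.content || []) {
            if (child.type === type) {
                ranges.push({ from: pos, to: pos + nodeSize(child) });
            }
            if (child.content) walk(child, pos + 1); // step over the opening token
            pos += nodeSize(child);
        }
    };
    walk(doc, 0); // position 0 is the start of the document's content
    return ranges;
}
```

With a helper like this, calling code can ask for "all paragraphs" and receive ranges ready to be used in a transaction, instead of counting tokens by hand.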

The next problem surfaced while we were creating the schema: we already had an approved internal format for storing content, structured body, which meets our needs, while ProseMirror keeps content in its own format in the editor state. Switching to the ProseMirror format would have been difficult and impractical. So data arrives at the client via the API in one format but must be displayed in another, and the same happens in reverse when modified or newly created content has to be saved. To handle this, we implemented a converter that translates between the formats in both directions. We wrote a simple test for it: take the content of one post in the structured body format, convert it to the ProseMirror format and back, and compare the result with the original. It was quick and easy.

Later, as the document schema changed and, as a rule, grew more complex, it became clear that the slightest change could cause errors in the editor and that such a test gave very poor coverage. In the end we had to write tests for almost every combination of nodes and marks for the two small converter methods. Without these tests it is now impossible to tell whether the next schema change will break something.
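A round-trip check of this kind could be sketched as below. The converter here is a toy: the node names heading and paragraph, the paragraph element and all function names are assumptions, and marks are left out entirely. The target shape mirrors the JSON form of a ProseMirror document (type, attrs, content).

```javascript
// Text content as ProseMirror-style JSON; empty text nodes are not allowed,
// so an empty string becomes an empty content array.
const textContent = (text) => (text ? [{ type: 'text', text }] : []);
const textOf = (node) => (node.content || []).map((child) => child.text).join('');

// structured body -> editor document (hypothetical node names)
function toEditorDoc(body) {
    return {
        type: 'doc',
        content: body.map((element) => {
            switch (element.type) {
                case 'subtitle':
                    return {
                        type: 'heading',
                        attrs: { level: element.value.level },
                        content: textContent(element.value.text),
                    };
                case 'paragraph':
                    return { type: 'paragraph', content: textContent(element.value.text) };
                default:
                    throw new Error(`Cannot convert element "${element.type}"`);
            }
        }),
    };
}

// editor document -> structured body
function toStructuredBody(doc) {
    return doc.content.map((node) => {
        switch (node.type) {
            case 'heading':
                return { type: 'subtitle', value: { text: textOf(node), level: node.attrs.level } };
            case 'paragraph':
                return { type: 'paragraph', value: { text: textOf(node) } };
            default:
                throw new Error(`Cannot convert node "${node.type}"`);
        }
    });
}

// The round-trip check: converting there and back must give the original.
// Comparing via JSON.stringify assumes both sides use the same key order.
function survivesRoundTrip(body) {
    return JSON.stringify(toStructuredBody(toEditorDoc(body))) === JSON.stringify(body);
}
```

Running survivesRoundTrip over a table of small bodies, one per combination of element types, is the cheap way to grow such a test into the broader coverage described above.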

The next problem again concerns backward compatibility between old and new technologies. Our WYSIWYG editor is implemented only in browsers (desktop and, soon, mobile). Accordingly, for editing, content is delivered to the client as JSON in the structured body format, while posts are displayed for reading in browsers only from HTML. At the same time, most mobile applications have already switched to displaying user posts directly from structured body.

For mobile applications, we had to handle the case where the client cannot process some element of structured body, for example when a new element is added whose display is implemented only in a newer version of the application. Since not all users update their applications at the same time, older versions needed a plan B: instead of rendering the element from structured body, insert a ready-made HTML fragment for it. The structured body schema did not provide for HTML fragments per element, because the whole idea of the structure was to stop storing data as HTML. In the end we concluded that we need two structured body schemas, one for display and one for editing. The difference is that the structured body for editing contains only the content of the article, while for display we add some extra elements. In particular, an HTML fragment for each element is created in the SB2HTML service when a post is saved and is added only to the structured body for display. The structured body for display also marks the advertising slots in the content.
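The plan B described above can be sketched as a simple fallback rule. The mobile applications are of course not written in JavaScript; the snippet only illustrates the logic, and the field name html, the function name renderForDisplay and the renderer table are assumptions.

```javascript
// `knownRenderers` maps element types this client version understands
// to functions that build the native representation of the element.
function renderForDisplay(displayBody, knownRenderers) {
    return displayBody.flatMap((element) => {
        const render = knownRenderers.get(element.type);
        if (render) {
            return [render(element.value)]; // normal path: render from data
        }
        if (typeof element.html === 'string') {
            return [element.html]; // plan B: pre-rendered fragment (assumed `html` field)
        }
        return []; // nothing usable: skip the element
    });
}
```

An older client then shows new element types through their HTML fragments, while a newer client that registers a renderer for the same type ignores the fragment.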

When we open content for editing in a browser, we cannot, in principle, encounter an unknown element, because all posts are created and displayed in the same way. But we decided to provide for such a case in the future as well. To do this, we added a default stub element named unsupportedBlock to the ProseMirror schema. The stub appears in place of an unsupported element. We styled it as a gray rectangle with text saying that the element is not supported and cannot be edited. When a post is saved, such an element remains unchanged in structured body. The user can move it relative to other elements but cannot change its internal content. The user can, however, delete it, in which case it will of course not be saved in the final document.
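One way to keep an unknown element intact through editing is to carry the original data inside the stub, as in the sketch below. Only the name unsupportedBlock comes from the text; the attribute name original, the set of known types and the function names are assumptions.

```javascript
// Element types the editor can convert itself (assumed list).
const supportedTypes = new Set(['subtitle', 'paragraph']);

// structured body element -> editor node.
// Unknown elements are wrapped in a stub that stores the element untouched.
function elementToNode(element, convertKnown) {
    if (!supportedTypes.has(element.type)) {
        return { type: 'unsupportedBlock', attrs: { original: element } };
    }
    return convertKnown(element);
}

// editor node -> structured body element.
// The stub gives back exactly what was stored in it.
function nodeToElement(node, convertKnown) {
    if (node.type === 'unsupportedBlock') {
        return node.attrs.original;
    }
    return convertKnown(node);
}
```

Because the stub is an ordinary block node from the editor's point of view, moving or deleting it works like for any other block, while its payload stays opaque.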

All the problems described so far concerned implementing the WYSIWYG editor itself. But while it was in beta, we could not abandon the old TinyMCE editor and had to support both editors with backward compatibility between them. For example, you could create a post in the WYSIWYG editor, save it, edit it in TinyMCE, save it, reopen it in WYSIWYG, and so on, and when opening it in WYSIWYG you would see the same content as in the last TinyMCE save. To make this work, TinyMCE had to receive HTML content, which we had already learned to generate from structured body and store in the database when saving a post. And when a post is saved through TinyMCE, the content is run through the HTML2SB service on the server, so that we can save both fresh HTML and structured body.

HTML2SB does the opposite of SB2HTML: it converts content from HTML to structured body. Chronologically, this service appeared before everything else, because before the WYSIWYG editor existed, the only way to get post content in the structured body format was to parse it directly from HTML. HTML2SB became part of the backend infrastructure around the post editor, but once TinyMCE is abandoned it will no longer be needed.


4. Beta Results


The WYSIWYG editor is now available to all users as a beta and will soon become the main editor for Sports.ru posts. We already have a tool for creating and editing posts that meets most of our requirements:

  • the editor's interface has become clear, concise and modern, and writing long posts is much easier;
  • images can now be uploaded from a file or by link and are placed in our storage right away;
  • embeds from major social networks and video hosting sites can be inserted;
  • text formatting styles have been cleaned up;
  • mobile applications have already switched to displaying posts from structured body and can set their own styles for content.

Of course, the editor is not yet fully debugged, and we periodically find new bugs. The following updates are coming:

  • autosave;
  • WYSIWYG version for users with extended rights (administrators, full-time editors);
  • creating and editing posts from mobile browsers;
  • messages about parallel editing of a document by several users;
  • tips and onboarding;
  • statistical widgets for sports teams, matches and lineups.

At the time of writing, more than 13,000 posts have been published through the beta version of the editor, about 20% of all user texts on Sports.ru from June 2019 through February 2020. The share of posts created and published through the new editor is growing steadily.


Fig. 3. The proportion of user posts created and published in the new editor

The organic growth in the share of posts created and published through the new editor seems to signal that users are happy with the update, which is also confirmed by the feedback on the beta launch announcement (some of it is shown in Fig. 4). Therefore, in the coming months we plan to move post creation entirely to the new editor, so that we can focus solely on its support and development.
By the way, what functionality would you add to our WYSIWYG editor?


Fig. 4. User comments in a post with the announcement of the WYSIWYG editor update 
