This week marks the official end of my Google Summer of Code 2016 project. For the past 3 months I have been working hard on developing a rich and usable user experience for translators using the Translation Management Tool module in Drupal 8. The main goal was to create two CKEditor plugins. The first plugin would convert text parts into segments to easily segment the content into smaller bits and enable easier translation management. It’s also connected with the Translation Memory, to provide saved translation suggestions for requested segments. The second plugin is about masking HTML tags inside segments. This really helps to understand the structure better and cleanly show which opening/closing tags are missing inside a segment.

I believe the goal was clearly reached quite some time ago. After that, we started implementing more functionality around these two main “pillars”. Users can now click through segments, which are clearly shown in both the source and the translation editors, and check the area below the editor for translation suggestions. A suggestion can be used to replace the selected segment. Each segment can contain HTML tags, which are masked in a pre-processing step before the content is populated. They are displayed as widgets, so they can be clicked and dragged. If the translation is missing some tags, a warning message is displayed. When the translation looks fine, it can be marked as completed. This increases the counter of completed segments in the sticky division in the top right corner and colors the segment with a green background.

All of the above currently works fine and more extra features will be added soon!

Achievements

I started last week as always, by working on the feedback that I got from my mentors in our weekly meeting. This mainly involved some UI fixes, like writing a cleaner validation text and displaying the missing tags not just as links, but with the same styling as the tags in the editor. I’ve also added a title for these missing tags, which is displayed when the user hovers the mouse pointer over them. This helps the user understand that they are connected and that, when clicked, they are placed at the cursor position inside the editor. Also, to save space on smaller screens and to tidy up the table with the translation suggestions in the area below the editor, we now display a shorter text inside the button: “Use” instead of “Use suggestion”. Another UI feature that I added on my mentor’s suggestion was to mark the active segment in red when it has missing tags, to indicate the problem more strongly and clearly.

Then I moved to the most important thing, the code cleanup, which I think was really needed. I was already using ESLint, a pluggable JavaScript linter, as a PhpStorm plugin during the coding process, which helped me a lot with the code quality, but I felt that something more could be done to make everything run faster and cleaner.

That is why I met with a friend of mine during the weekend and we spent some time reviewing how the plugins work and figuring out what could be done better. There were three main things that needed to be done:

  • Split the bigger JavaScript functions into many smaller ones when possible
  • Check how often these functions are called and adjust their scope properly (if they are not called often, we can put them inside the scope of their parent function)
  • Check for loops for better performance

I have also added function comments based on the JavaScript API documentation for Drupal and checked everything for the correct syntax (variable names, data types, tag orders…). I am sure the code is now much easier to read for other developers in the community.

While working on this, I ran into some problems with JavaScript closures inside loops. They were related to the binding of click events for adding the missing tags in the editor. This was an intriguing challenge, but Stack Overflow was (as always) a really helpful source for finding a good working solution.
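To illustrate the kind of closure problem mentioned above, here is a minimal sketch. It does not show the module's actual code: the tag list and the function names bindBroken and bindFixed are made up for this example, and plain functions stand in for the click handlers so that it runs without a browser.

```javascript
// Hypothetical list of missing tags; in the plugin these would come from
// the validation step.
var missingTags = ['<b>', '</b>', '<a href="#">'];

// Problem: `var i` is shared by all handlers, so once the loop has finished
// every handler reads the final value of i, which is past the end of the array.
function bindBroken(tags) {
  var handlers = [];
  for (var i = 0; i < tags.length; i++) {
    handlers.push(function () {
      return tags[i];
    });
  }
  return handlers;
}

// Fix: an immediately invoked function creates a new scope per iteration,
// so each handler keeps its own copy of the tag.
function bindFixed(tags) {
  var handlers = [];
  for (var i = 0; i < tags.length; i++) {
    (function (tag) {
      handlers.push(function () {
        return tag;
      });
    })(tags[i]);
  }
  return handlers;
}

var broken = bindBroken(missingTags);
var fixed = bindFixed(missingTags);
```

Calling any handler from the broken version returns undefined, while each handler from the fixed version returns the tag it was created for.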

I also spent quite some time trying to solve the issue regarding widget dragging inside the segments, which are defined as block elements, but still didn’t find any proper solution. I also tried to contact nod_ and Wim Leers, but got no positive feedback. Any help from the community on that matter would be appreciated. :)

Finally, I would like to make a quick overview of how to test my module, list the things that are working and things that are still missing and need work.

How to test it:

  • make sure you have all the dependencies (Translation Management Tool, Paragraphs, Translation Memory)
  • install the module
  • for now, only the predefined nodes from my module are working - a simple Test for CKEditor plugins and Test for CKEditor plugins with paragraphs
  • request a (German) translation of either one of those two
  • enable plugins on translate/review page in the editor toolbar
  • test it (and possibly send me some feedback)

What works:

  • dummy nodes for testing with new text format
  • works on nodes with one or more editor pairs (paragraphs)
  • masking and unmasking the HTML tags
  • displaying the active segment in both editors
  • translation memory querying
  • displaying suggested translations from the memory
  • using suggested translations - placing the selected one in the active editor
  • validation of missing tags (globally and per segment)
  • adding of missing tag
  • set segments as completed
  • masked tag dragging

What is currently not working:

  • text segmentation (in tmgmt_memory)
  • saving segmented content properly, so that accepting translation does not save segments but initial content
  • responsiveness

Note, some of the “not-working” issues listed above are not on my side; either they are out of scope or from a related project.

Another important thing that we should be aware of: this project provides a number of puzzle pieces towards providing translators and reviewers better tools to work with TMGMT. Many other things are needed until this fully works together with real content.

My latest code can be found in my GitHub repository and in my sandbox project.

Goals for next week

For this last week, I am planning to wrap up my project completely, update the documentation if needed and create issues for my sandbox.

Conclusion

I am very thankful to my mentors Miro Dietiker and Sascha Grossenbacher for the constant supervision, guidance and support throughout this amazing experience. Our future plans involve making the project fully functional in the real world, with true translations and real usage. When we think that it is ready, it will be merged into the TMGMT module and I will actively maintain it.

I would also like to thank Google for this great opportunity and the open-source community for the support with my project and for pushing technologies forward step by step, every day. During this summer I gained many technical and psychological skills. On one side, I got to know myself better and, while working remotely, gained a lot of self-discipline and motivation. On the other side, it was a real challenge to study and learn JavaScript, together with the CKEditor API. I am sure all the skills gained will pay off someday!

We’ve come to the last phase of coding for my Google Summer of Code 2016 project, and things are currently going quite well. Developing CKEditor plugins for the Translation Management Tool module has been really fascinating so far, with many challenges to solve and new things to learn every day. I believe my work will significantly help local translators in their translation process by providing nice content segmentation and translation suggestions. The idea is to create a tool similar to Google Translate, but with some extra features, like segmenting the text and masking the HTML tags inside the segments.

Achievements

Last week I discussed the complete overview list of all the things that are already implemented and how I imagined the UI for the new translation process.

As I got quite some feedback from my mentors on the mockup that I created, I focused on this part this week. I extended it with definitions and synonyms for a selected word, which we get from the Translation Memory, and created some screenshots that can be found in my Google Drive folder. Any tips and ideas about the UI are greatly appreciated.

While working on this topic, I realised that the current scale for the quality of the translations in the memory is not ideal, since it only ranges from 0 to 5. I suggested to the maintainer of the Translation Memory project to extend the quality scale to 10. I believe this would result in a more accurate rating of a translation’s quality and would make it easier for the translator to decide which translation is better. For that, I opened this issue on drupal.org. With real quality values for the translations, we can get rid of all the hardcoded testing values. The source values are yet to be discussed and implemented properly; after that I can merge the mockup branch, which contains the new UI, into the main master branch. The code for the mockup with all the updates can be found here.

Another topic that we discussed in our previous weekly meeting was the validation area. My first idea was to put it in the sticky division in the top right corner, but after some thinking, we agreed to put it back in the area below the editor. I changed the visual appearance of the warning and proposed two solutions: one that follows the Drupal guidelines for general error messages on the page and another one that is more simplistic.

Apart from the UI, I had to fix different issues that I spotted along the testing process. I finally managed to fix the widgets issue described in this blog post. I did that by setting the tmgmt-segment element as a div element and the tmgmt-tag as an inline widget. I believe the nesting issue was caused by setting the tmgmt-segment tag to contain text with this line of code:

dtd['tmgmt-segment'] = {'#': 1};

The next problem that I’m having is with dragging the widget inside the segment. I can’t figure out why the parent segment is created and displayed around the tag when the dragging happens. I will have to dig deeper in the dtd and widgets topic.

I also spent some time on performance, solving an important issue regarding calls to the onChange event function. I am now using a debouncer to limit the invocations of this function in a given time frame. Before that, the function was called many times while the user was typing, resulting in significant resource consumption.
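For readers unfamiliar with debouncing, the idea can be sketched as follows. This is a generic helper and not the debouncer used in the module; the handler name onEditorChange and the 300 ms delay are assumptions made for this example.

```javascript
// Generic debounce helper: postpones calling `fn` until `wait` milliseconds
// have passed without another call. Only the last call in a burst runs.
function debounce(fn, wait) {
  var timerId;
  return function () {
    var context = this;
    var args = arguments;
    // Cancel the previously scheduled call, if any, and schedule a new one.
    clearTimeout(timerId);
    timerId = setTimeout(function () {
      fn.apply(context, args);
    }, wait);
  };
}

// Hypothetical handler; in the plugin this would run the expensive work
// (for example the tag validation) after the content has changed.
function onEditorChange(editorName) {
  console.log('Content changed in ' + editorName);
}

// The 300 ms delay is an assumed value.
var debouncedChange = debounce(onEditorChange, 300);
```

The debounced function would then be registered as the change handler instead of the original one, so a burst of keystrokes results in a single call once the user pauses.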

The full word mockup.
The image shows the current status of the mockup for the word selection.

Goals for next week

My workload for next week will be very intense, since I have so many issues opened on various topics.

Mockup/UI:

  • Use the shorter text for buttons (just “Use”)
  • Remove the “Use suggestion” in the table header
  • Display the active segment red if there are missing tags
  • Change the validation text and the display of the missing tags in the area below
  • Ordering of the translations should be ascending, not descending, based on quality value

Code cleanup:

  • Add/remove comments
  • Extend the documentation
  • Find bottlenecks
  • Optimise the code for better performance
  • Test with bigger nodes, which have many segments and tags

Fixes:

  • The bug with dragging widgets

Other:

  • Update the sandbox project
  • Import translations from a file
  • Support real translation jobs by applying the segmentation automatically on the source text

August is already here and the 10th week of coding for the Google Summer of Code 2016 project has come to an end. My scope for this summer is to build a functional user interface for the Translation Management Tool module, which will significantly help local translators in their translation process. As we are almost at the end of the project, I am starting to wrap things up in a more clean and suitable way.

Achievements

As written in my previous blog post, I struggled to find a solution for displaying the masked tags inside the segments as widgets. I believe this is a crucial part that still needs to be solved. I tried seeking help on Stack Overflow (where I opened a thread with my issue) and I also contacted Reinmar, the maintainer of the CKEditor module for Drupal 8, but haven’t received any response so far. I will need to spend some more time this week to solve this blocker.

Other than that, my main goal for this week was to create a nice overview of the things that are currently implemented and displayed in the area below the editor, and to think about the things that we can add. During the coding process we faced many issues, got many new (crazy) ideas and added new features that were not initially planned in the scope of my project, so I needed to write a clean checklist.

As of now, in the area below the editor we are displaying the following:

  • clicked word
  • active tag
  • active segment
  • suggestions from the translation memory
  • a button for each suggestion
  • completed segments counter
  • global counter of missing tags
  • missing tags per active segment

Other important things that are implemented:

  • display the active segment in both editors (source and translation)
  • translation memory lookup
  • replacement of the segment with the suggestion
  • context menu for marking segments as completed

Once I had this written down, I had to define the features that I consider out of scope for my project. In the last weekly meeting we got a great idea of implementing a wizard that would guide the users step by step through the translation process. By clicking on a button, it would be possible to translate each segment one by one, with potential validation warnings shown along the way. I believe this is absolutely out of scope, as it would take too much time for now and we have other priorities. Another thing that I won’t find time to implement is an external library of definitions and synonyms of words (a dictionary). This needs to be done in TMGMT and we’ll have to consider this once the initial version is complete.

After that, my next goal was to create a mockup of my ideas for how the final product should look. As a reference point, I took Google Translate, which has a really nice and useful UI, but with so many functionalities it can sometimes be confusing. My initial idea was to separate the UI into three main parts: the area inside the editor, the area below the editor and a new one, a scrollable sticky division in the top right corner.

The area inside the editor consists of the widgets for the masked tags and the segments. The tags should be draggable with a handle bar, while the segments should persist in their position. While the segments are defined as block elements, the tags are inline elements. This means that both tags and segments are nicely marked and easy to spot in the editor’s area.

For the area below the editor, I need to adapt to the Drupal 8 GUI. As everything seems to be quite squared, with straight lines and borders, my proposal would be to have a line stating which segment is being translated and a simple table, where the rows would be the translation suggestions and the columns would hold the following:

  • Quality (of the translation suggestion)
  • Source (human or machine)
  • Translation suggestion
  • Use suggestion (a button)
The mockup of the area below the editor.

For the quality, I used the HTML5 meter element. It represents a scalar measurement within a known range, or a fractional value. The attributes are really easy to set and I like the look of it.
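As a rough illustration of the quality column, the sketch below builds the markup for a meter element from a suggestion's quality value. The function name renderQualityMeter and the default maximum of 5 are assumptions made for this example, not code taken from the module.

```javascript
// Hypothetical helper: returns HTML5 <meter> markup for a quality value.
// The value is clamped to the range [0, max] so the meter never overflows.
function renderQualityMeter(quality, max) {
  var upper = max === undefined ? 5 : max; // assumed default scale of 0 to 5
  var value = Math.min(Math.max(quality, 0), upper);
  // The text between the tags is the fallback for browsers without <meter>.
  return '<meter min="0" max="' + upper + '" value="' + value + '">' +
    value + ' / ' + upper + '</meter>';
}

var meterMarkup = renderQualityMeter(3);
```

The resulting string could be placed in the Quality cell of each suggestion row.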

If you ask me, the buttons in Drupal 8 are not that user friendly, especially when used in a context like the one above. I would change the “Use suggestion” button to a button with a check icon and see how that looks.

The sticky division would hold some important information, such as the counter of the completed segments and the validation counters. I proposed to put these two things here because they are both numerical counters and are easy to spot at first sight. The background of this element would adapt to the tag validation: if all translations are valid, it would be green, otherwise I would set the background to red. We discussed this matter in our weekly meeting with my mentors. They didn’t like the solution for the validation part and would rather have it in the area below the editor. The reason for this is understandable, as the area below the editor holds the parts that the user can interact with, and the validation part that displays the missing tags is clickable, so that the user can place them in the editor at the cursor position. In the following week I will need to rethink and redefine my mockup to fit the validation part properly.

I already implemented my ideas on a separate branch called mockup, which will still be in constant development until we find a proper final solution.

Goals for next week

From this weekly meeting with my mentors I got some input on how to test and improve the performance of my code. I will work on that for a day or two, but mainly I will work on the mockup and other small fixes that we found lately. I will also implement the adding of the missing tags from the validation message, since this is not implemented yet and I needed some feedback there.

We are already at the end of July and we just passed the ninth week of coding for the Google Summer of Code 2016 program, which I’m also part of. I am working on extending the Translation Management module with new, very useful functionalities that users will love. Specifically, I am extending the CKEditor with new plugins for easier segmentation of the content and masking the HTML tags that are inside them. This will result in a much nicer user experience in translating the content by the local translators.

My current work is based on the masked HTML tags. The masking was properly defined and implemented in the previous weeks. More info about the structure can be found here. During that implementation I encountered a few issues, which I managed to solve successfully. This was thoroughly described in my latest blog post. Below is a description of the progress that I achieved this week.

Achievements

As planned, I started this week by implementing a validation of the masked tags. This consists of looping over every segment and displaying both the number of tags that are missing in the translation editor compared to the source editor and the array of those missing tags. For the purpose of testing, I removed some tags from the dummy segments that were stored in the translation memory. Both the counter and the exact missing tags are displayed in the area below the editor as soon as a missing tag is detected.

This means that we do the validation at several points:

  • when the editors are loaded
  • when we add a suggestion from the memory
  • when we detect that the content was changed with the CKEDITOR.on('change') event.
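The per-segment validation described above can be sketched roughly as follows. The data shape (an id plus two arrays of tags per segment) and the function names are assumptions made for this example; the plugin itself reads the tags from the two editors.

```javascript
// Return the tags that occur in the source segment but not in the
// translation. Duplicates count: two <b> tags in the source need two
// in the translation.
function findMissingTags(sourceTags, translationTags) {
  var remaining = translationTags.slice();
  var missing = [];
  sourceTags.forEach(function (tag) {
    var index = remaining.indexOf(tag);
    if (index === -1) {
      missing.push(tag);
    } else {
      // Consume the matched tag so it cannot match a second time.
      remaining.splice(index, 1);
    }
  });
  return missing;
}

// Loop over all segments and collect a global counter plus the list of
// missing tags for each segment.
function validateSegments(segments) {
  var result = { total: 0, perSegment: {} };
  segments.forEach(function (segment) {
    var missing = findMissingTags(segment.sourceTags, segment.translationTags);
    result.perSegment[segment.id] = missing;
    result.total += missing.length;
  });
  return result;
}

// Hypothetical test data: the closing </b> is missing in the first translation.
var report = validateSegments([
  { id: 'seg-1', sourceTags: ['<b>', '</b>'], translationTags: ['<b>'] },
  { id: 'seg-2', sourceTags: ['<i>', '</i>'], translationTags: ['<i>', '</i>'] }
]);
```

Here report.total is the global counter and report.perSegment holds the tags to display for the active segment.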

The benefit of displaying the missing tags is crucial here, so that the translator gets warned about the possibility to have an incorrect DOM structure of the translated text. Later on we might add an option for the user to click on a missing tag and drag&drop it into the editor.

My focus this week was also on converting the tags to widgets. This is needed because the DOM structure of the content that the browser renders was completely wrong - the closing tags were automatically added. As Reinmar pointed out in this thread:

You need to implement your tag as a widget. You already configured your DTD pretty well (you only forgot to set in what elements the element can exist and whether it is more like a block or inline element), so the parser will accept it and handle as an empty tag. Now you need to wrap it with a widget in order to isolate it so it does not break the editing experience. That should do the trick.

So, firstly, I managed to configure our tmgmt-segment and tmgmt-tag elements in the DTD. My idea was to create a block element - the segment, and an inline element - the tag. This worked completely fine. After I converted the tag to an inline widget, I found some issues, which I unfortunately still need to solve. The widgets are displayed and are completely draggable, which is great, but they are not in the correct place anymore. They are somehow being placed outside of the segments. I will need to investigate this issue a bit more. One solution would be to try and convert also the segment to a non-draggable widget, since I have read somewhere that nested widgets are supported in the latest version of CKEditor.
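A minimal sketch of the DTD setup described above might look as follows. It assumes the $empty, $inline and $block groups of the CKEDITOR.dtd object; the function name registerTmgmtElements is made up for this example, and the exact rules used in the module may differ. Which parent elements may contain a segment is left out here.

```javascript
// Hypothetical helper: registers the two custom elements in a CKEditor-style
// DTD object. In the browser it would be called with CKEDITOR.dtd.
function registerTmgmtElements(dtd) {
  // The tag has no content of its own, so treat it as an empty inline element.
  dtd.$empty['tmgmt-tag'] = 1;
  dtd.$inline['tmgmt-tag'] = 1;

  // The segment is a block element that may contain text ('#') and tags.
  dtd.$block['tmgmt-segment'] = 1;
  dtd['tmgmt-segment'] = { '#': 1, 'tmgmt-tag': 1 };

  return dtd;
}

// Stand-in object so that the sketch also runs outside the browser.
var fakeDtd = registerTmgmtElements({ $empty: {}, $inline: {}, $block: {} });
```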

Tags as widgets
The current issue can be seen in this screenshot.

The code of my widget progress can be found in my GitHub account. I created a new branch called widget, which is accessible here.

Goals for next week

I will spend an hour or two on trying to solve the issue described above in the next week.

My main goal will rather be to create a complete overview of what is currently implemented and displayed in the area below the editor, and think about the things that we can add.

I will also create a mockup of my ideas for how the final product should look and discuss it with my mentors. We talked about adding some kind of a wizard, which would help users in the translation process by guiding them through the steps that still need work or review. I will take that into consideration for the mockups.

Last, but not least, I will do some refactoring of a few pieces of code to get a better performance of my module.

The eighth week of my Google Summer of Code 2016 development process is over and the Translation Management module for Drupal 8 is closer to becoming a much better CAT tool than it ever was. My contribution to this project consists of developing CKEditor plugins for displaying pieces of content (segments) and the masked tags inside them. Since we already passed the first midterm evaluation period, we are slowly but steadily moving toward the end goal of having a functional product that the open source community can use.

Currently, I am working on the second plugin for masking HTML tags that are inside segments. The purpose of this plugin is to easily check the structure of the content and be warned of any missing or unclosed tags.

Achievements

Last week I found some important blockers, which prevented me from making significant progress.

I successfully managed to fix the one regarding the matching of segments containing masked tags. The idea behind the translation of segments is to get the active segment, together with the masked HTML tags inside it, and make an HTTP request to the translation memory. If one or more perfect matches are found, we display them as suggestions and the user can then easily replace the text with the chosen one. The main issue I faced here was the fact that the masking and unmasking of the tags was not yet supported by TMGMT. Together with the MD Systems team we worked on this part, and the final patch for this issue should be committed soon. My mentors, Miro Dietiker and Sascha Grossenbacher, provided constant support and guidance in finding the optimal solution.

Once that was done, I could work on top of it and define a new workflow with one additional step:

  1. Mask the tags with TMGMT.
  2. Encode the masked tag before sending it through the HTTP request to the service controller.
  3. Unmask the tag.
  4. Query the segment with the unmasked tag in the translation memory.
  5. If we have a match, mask it and return it.
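The five steps above can be sketched in a few lines. Everything here is an illustration under stated assumptions: the real masking is done by TMGMT and the lookup happens in the service controller on the server, while this sketch uses plain JavaScript, an in-memory object as a stand-in for the translation memory, and an assumed mask format that stores the original tag in a raw attribute. The regular expression is a simplification that does not handle a “>” inside attribute values.

```javascript
// Escape and unescape helpers for storing a tag inside an attribute value.
function escapeAttr(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;')
    .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
}

function unescapeAttr(text) {
  return text.replace(/&quot;/g, '"').replace(/&gt;/g, '>')
    .replace(/&lt;/g, '<').replace(/&amp;/g, '&');
}

// Replace every HTML tag with a tmgmt-tag element (assumed mask format).
function maskTags(html) {
  return html.replace(/<[^>]+>/g, function (tag) {
    return '<tmgmt-tag raw="' + escapeAttr(tag) + '" />';
  });
}

// Restore the original HTML tags from the masked form.
function unmaskTags(masked) {
  return masked.replace(/<tmgmt-tag raw="([^"]*)" \/>/g, function (match, raw) {
    return unescapeAttr(raw);
  });
}

// Stand-in for the translation memory, which stores unmasked segments.
var memory = { 'Hello <b>world</b>': 'Hallo <b>Welt</b>' };

// Stand-in for the service controller.
function lookupSuggestion(encodedSegment) {
  var masked = decodeURIComponent(encodedSegment);
  var unmasked = unmaskTags(masked);                        // step 3
  if (!Object.prototype.hasOwnProperty.call(memory, unmasked)) {
    return null;                                            // no match
  }
  return maskTags(memory[unmasked]);                        // steps 4 and 5
}

var masked = maskTags('Hello <b>world</b>');                // step 1
var encoded = encodeURIComponent(masked);                   // step 2
var suggestion = lookupSuggestion(encoded);
```

The returned suggestion is masked again, so it can replace the active segment in the editor without exposing raw HTML tags.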

Because the segments in the memory are not masked, we need to properly invoke the mask and unmask functions in the service controller. The problem here was that we cannot call alter hooks from my controller. As a quick solution, we provided them as simple external functions, which can be called in the hook_tmgmt_data_item_text_output_alter() and hook_tmgmt_data_item_text_input_alter() hooks, respectively, to mask the tags on load and unmask them on save, and also in the controller.

The second issue that I stumbled upon last week was related to the closing tags. Since we defined our custom tag, tmgmt-tag, it needed to be added to the list of empty (self-closing) elements in the CKEDITOR.dtd object. This only solved the issue partially. Yes, the source in the editor shows the HTML structure completely right, but the DOM structure that the browser renders is still wrong. I found only one thread on Stack Overflow regarding the same issue. The solution would be to implement our tmgmt-tag as a widget. According to the CKEditor API, this requires CKEditor 4.3 or above and, most importantly, the Widget plugin along with its dependencies. I tried to implement it, but it seems that the Widget plugin is not part of Drupal 8 yet.

To solve this, I will follow my two options:

  • contact the maintainers of the CKEditor for Drupal to get more info
  • include it in my module

Other than fixing last week’s issues, I created a new branch in my GitHub repository and started refactoring. Based on my prototype EditorPair, I changed some functions to prototype methods. The idea here is that the functions should not access any data on the page, just do the logic. The prototype and its methods will be moved to a new file, so that the code is accessible from both modules, segments and tags. After that, I could use some code review from my mentors.

Goals for next week

The main goal for this week is to fix the DOM structure and implement the widget for our custom tag. Once that is done, we can do some nice stuff, like capturing clicks, moving the tags around and displaying the data in their raw attribute. I also plan to do the validation of the masked tags. To begin with, a simple counter per segment will be added in both the source and the translation editor. Each validation error will be per segment and will be displayed in the area below the editor. I will also display exactly which tags are missing, so that the user gets a better idea of which tags need to be added. More about it in the follow-ups.