
Friday, April 16, 2021

Table of Contents in Blogger (for posts in Vietnamese language)



The link https://Your_URL.blogspot.com/feeds/posts/default contains the important information about your posts: title, date, label, URL and content. The fields are indexed so that they stay correlated; that is, title(n), date(n), label(n), URL(n) and content(n) all belong to the same post(n). URL(n) is the hyperlink to post(n) that is attached to title(n); content(n) holds the text of the first paragraph of post(n). Your_URL is the address of your own blog. Note that Blogspot includes only up to the 150 most recent posts in the above link.
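As a rough sketch of how these indexed fields line up, the function below pulls title, date, labels and URL out of a feed that has already been parsed as JSON. The field names (feed.entry, title.$t, published.$t, category[].term, and the link whose rel is "alternate") are assumptions about the shape the feed takes when ?alt=json is appended to the link; please check them against your own feed. The names extractPosts and sampleFeed are made up for illustration.

```javascript
// Hypothetical helper: flatten a parsed JSON feed into one record per post.
// The field names below are assumptions about the feed's shape.
function extractPosts(json) {
  var entries = (json.feed && json.feed.entry) || [];
  return entries.map(function (entry) {
    // The post's own address is taken from the link whose rel is "alternate".
    var alternate = (entry.link || []).filter(function (l) {
      return l.rel === 'alternate';
    })[0];
    return {
      title: entry.title.$t,
      date: entry.published.$t,
      labels: (entry.category || []).map(function (c) { return c.term; }),
      url: alternate ? alternate.href : ''
    };
  });
}

// Made-up sample data standing in for a real feed.
var sampleFeed = {
  feed: {
    entry: [{
      title: { $t: 'First post' },
      published: { $t: '2021-04-16T08:00:00.000+07:00' },
      category: [{ term: 'News' }],
      link: [
        { rel: 'self', href: 'https://example.com/feeds/1' },
        { rel: 'alternate', href: 'https://example.com/2021/04/first-post.html' }
      ]
    }]
  }
};

var posts = extractPosts(sampleFeed);
console.log(posts[0].title + ' -> ' + posts[0].url);
```

Each record keeps title(n), date(n), label(n) and URL(n) together, which is the correlation the table of contents relies on.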


The data in the above link can be sorted by “date” or by “label”, and alphabetically within “label” and/or “title”. Figure 1 shows a table of contents whose data is sorted by label and title.


Figure 1      A Table of Contents With Data Sorted By labels And titles

label(1)
          title(1)
          …
          title(p)

label(2)
          title(p+1)
          …
          title(p+q)

label(3)
          …

The titles under each label are sorted alphabetically. JavaScript code for creating such a table of contents is available on the Internet; one example is at the following link: https://drive.google.com/file/d/1Q6SPWgGo29rJ6z57FNuiBzP0HXhPG28p/view?usp=sharing. You need to replace “your_URL” (in “your_URL.blogspot.com”) with your own before using it. Note that this code is meant for posts in English. It does not account for posts in other languages, particularly those with special characters and/or diacritics, of which Vietnamese is one.

A few years ago, for lack of a sorting algorithm for Vietnamese text, people resorted to a workaround: the above code was used to generate an intermediate table of contents, and MS Excel or Google Sheets was then used to sort the Vietnamese posts. Nowadays, the JavaScript array sort call “.sort(Intl.Collator().compare)” can sort text in many languages, including Vietnamese. When no language code is given, Intl.Collator uses the browser's default locale; passing the code “vi”, as in Intl.Collator('vi'), requests Vietnamese ordering explicitly.

To illustrate the difference between .sort() and .sort(Intl.Collator().compare), we create Table 1, where strings in English, Vietnamese, and mixed Vietnamese & English are sorted by these two calls.

Table 1 Comparison of outputs from .sort and .sort(Intl.Collator().compare)

Row 1 (English)
Input: Paris, Sweden, Spain, Los Angeles, India, Africa, Sydney, Bangkok, Florida, 10 dollars, Uganda
.sort: 10 dollars, Africa, Bangkok, Florida, India, Los Angeles, Paris, Spain, Sweden, Sydney, Uganda
.sort(Intl.Collator().compare): 10 dollars, Africa, Bangkok, Florida, India, Los Angeles, Paris, Spain, Sweden, Sydney, Uganda

Row 2 (Vietnamese)
Input: Sài Gòn, địa danh, ăn cơm, tin buổi sáng, bưởi, lúa, đảm đang, 10 năm, Ông đại diện
.sort: 10 năm, Sài Gòn, bưởi, lúa, tin buổi sáng, Ông đại diện, ăn cơm, đảm đang, địa danh
.sort(Intl.Collator().compare): 10 năm, ăn cơm, bưởi, đảm đang, địa danh, lúa, Ông đại diện, Sài Gòn, tin buổi sáng

Row 3 (mixed Vietnamese & English)
Input: Sài Gòn, địa danh, USA, ăn cơm, English, tin buổi sáng, bưởi, lúa, đảm đang, 10 năm, Ông đại diện
.sort: 10 năm, English, Sài Gòn, USA, bưởi, lúa, tin buổi sáng, Ông đại diện, ăn cơm, đảm đang, địa danh
.sort(Intl.Collator().compare): 10 năm, ăn cơm, bưởi, đảm đang, địa danh, English, lúa, Ông đại diện, Sài Gòn, tin buổi sáng, USA
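The Vietnamese row of Table 1 can be tried out with a short script such as the one below. It is only a sketch: the variable names are made up, and the strings are normalized to NFC first (an assumption about how the text is stored) so that each accented letter is a single character in both sorts.

```javascript
// Normalize to NFC so accented letters are single code points in both sorts.
var input = ['Sài Gòn', 'địa danh', 'ăn cơm', 'tin buổi sáng', 'bưởi', 'lúa',
             'đảm đang', '10 năm', 'Ông đại diện']
  .map(function (s) { return s.normalize('NFC'); });

// Default sort compares UTF-16 code units, so accented letters land after "z".
var byCodeUnit = input.slice().sort();

// Locale-aware sort; 'vi' requests Vietnamese collation rules.
var viCollator = new Intl.Collator('vi');
var byVietnamese = input.slice().sort(viCollator.compare);

console.log(byCodeUnit.join(', '));
console.log(byVietnamese.join(', '));
```

Both sorts work on a copy made with slice(), so the input list keeps its original order and can be compared with either result.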

In what follows, we modify the code in the above link and use .sort(Intl.Collator().compare) to produce a similar table of contents for posts in Vietnamese. Our modification starts with the set of (label(n), title(n), URL(n)) that is already available in the above link. This set of data is VERY important, because it defines the relationships between label(n), title(n) and URL(n).

The data in Figure 1 can be re-arranged as shown in Figure 2.

Figure 2      Relationship between label(n), title(n), and URL(n) before sorting in Vietnamese

label(1)      title(1)        URL(1)
label(1)      title(2)        URL(2)
label(1)      title(p)        URL(p)
label(2)      title(p+1)      URL(p+1)
label(2)      title(p+2)      URL(p+2)
label(2)      title(p+q)      URL(p+q)

In this example, title(1)…title(p) have the same label(1) (thus, label(1) is repeated p times in the table); title(p+1)…title(p+q) have the same label(2) (hence label(2) is repeated q times), etc.

We will use .sort(Intl.Collator().compare) to sort label(n) first, while maintaining its relationship with title(n) and URL(n). For illustration purposes, let’s assume after sorting, label(2) will appear before label(1) and the table will be as shown in Figure 3.

Figure 3      Relationship between label(n), title(n), and URL(n) after sorting by label(n) in Vietnamese

label(2)      title(p+1)      URL(p+1)
label(2)      title(p+2)      URL(p+2)
label(2)      title(p+q)      URL(p+q)
label(1)      title(1)        URL(1)
label(1)      title(2)        URL(2)
label(1)      title(p)        URL(p)
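One way to picture Figure 3 is to keep each row as a single record, so that a label can never be separated from its title and URL while sorting. The sketch below is an illustration of that idea, not the approach used in the code discussed in this post; the rows and the names rows, viCollator and byLabel are made up.

```javascript
// Illustrative rows; labels, titles and URLs are made up.
var rows = [
  { label: 'Thơ',     title: 'Mùa thu',   url: 'https://example.com/1' },
  { label: 'Âm nhạc', title: 'Dạ khúc',   url: 'https://example.com/2' },
  { label: 'Thơ',     title: 'Ánh trăng', url: 'https://example.com/3' },
  { label: 'Âm nhạc', title: 'Bài ca',    url: 'https://example.com/4' }
];

var viCollator = new Intl.Collator('vi');

// Sort a copy by label only; each record moves as a whole,
// so title(n) and URL(n) stay attached to label(n).
var byLabel = rows.slice().sort(function (a, b) {
  return viCollator.compare(a.label, b.label);
});

byLabel.forEach(function (r) {
  console.log(r.label + ' | ' + r.title + ' | ' + r.url);
});
```

At this stage only the labels are in order; the titles inside each label still need their own sort.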

In the next step, we will use .sort(Intl.Collator().compare) to sort title(n) in each label while maintaining its relationship with URL(n). For illustration purposes, let’s assume within label(2), the sorting will result in the swapping of title(p+2) and title(p+q); the new relationship between label(2) and title(n), URL(n) is as shown in Figure 4.

Figure 4      Relationship between label(2), title(n), and URL(n) after sorting by title(n) in Vietnamese

label(2)      title(p+1)      URL(p+1)
label(2)      title(p+q)      URL(p+q)
label(2)      title(p+2)      URL(p+2)
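The two steps (labels first, then titles within each label) can also be expressed as a single comparison that only looks at the title when two labels are equal. The following is a sketch of that variation with made-up data and names (byLabelThenTitle, sample, sortedSample); it is not the code from the link above.

```javascript
var viCompare = new Intl.Collator('vi').compare;

// Compare by label; fall through to the title only when the labels are equal.
function byLabelThenTitle(a, b) {
  return viCompare(a.label, b.label) || viCompare(a.title, b.title);
}

// Made-up posts for illustration.
var sample = [
  { label: 'Du lịch', title: 'Huế',     url: 'https://example.com/hue' },
  { label: 'Ẩm thực', title: 'Phở',     url: 'https://example.com/pho' },
  { label: 'Du lịch', title: 'Đà Lạt',  url: 'https://example.com/da-lat' },
  { label: 'Ẩm thực', title: 'Bánh mì', url: 'https://example.com/banh-mi' },
  { label: 'Ẩm thực', title: 'Cơm tấm', url: 'https://example.com/com-tam' }
];

var sortedSample = sample.slice().sort(byLabelThenTitle);
sortedSample.forEach(function (r) {
  console.log(r.label + ' | ' + r.title + ' | ' + r.url);
});
```

The || operator works here because compare returns 0 for equal labels, and 0 is treated as false.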

We will discuss our sorting in more detail below.

We scan the entire set of label(n). Whenever the content of label(n) changes from one entry to the next, we know that we have just passed the point where label(i) gives way to label(i+1). This works because posts with the same label sit next to each other in the list.

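function labels_index(temp) {
          // Run-length scan: equal labels must already sit next to each other in temp.
          var label_occur = [];   // number of posts under each label
          var label_name = [];    // label names, in order of first appearance
          var j = 0;
          label_name[j] = temp[j];
          label_occur[j] = 1;
          for (var i = 0; i < temp.length - 1; i++) {
                    if (temp[i] !== temp[i + 1]) {
                              // the label changes here: start a new group
                              j = j + 1;
                              label_occur[j] = 1;
                              label_name[j] = temp[i + 1];
                    } else {
                              label_occur[j] = label_occur[j] + 1;
                    }
          }
          return [label_occur, label_name];
}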

The above function returns all label(n) and their numbers of occurrences. Since the sort function sorts an array in place, we must preserve the original labels by working on a copy.

var temp_Labels = postLabels.slice();
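A small experiment shows why this matters. A plain assignment only gives the same array a second name, whereas slice() makes a new array with the same elements. The names original, alias and copy below are made up for the demonstration.

```javascript
var original = ['b', 'c', 'a'];
var alias = original;          // same array, second name
var copy = original.slice();   // new array, same elements

copy.sort();
var afterCopySort = original.join(',');
console.log(afterCopySort);    // b,c,a  (untouched by sorting the copy)

alias.sort();
var afterAliasSort = original.join(',');
console.log(afterAliasSort);   // a,b,c  (sorted through the alias)
```

Sorting through the alias changes the original, so only the slice() copy keeps the unsorted order available.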

We now use the function labels_index to determine the order of the labels (label_name) and the number of posts under each label (label_occur) before vi-sorting (i.e. sorting in the Vietnamese language).

// labels_index returns an array: [counts, names]
var data = labels_index(temp_Labels);
var label_occur_eng = data[0];
var label_name_eng = data[1];

Now we use .sort(Intl.Collator().compare) to sort labels and save the result as postLabels.

var collator = new Intl.Collator('vi');
// sort() works in place and returns the same (now sorted) array
postLabels = temp_Labels.sort(collator.compare);
var data = labels_index(postLabels);
var label_occur_vi = data[0];
var label_name_vi = data[1];

Now we rebuild postLabels from the sorted label names and their counts. The correlation between the labels before and after sorting will then be used to align postTitle and postUrl with the vi-sorted postLabels.

var sum = 0;   // start index of the current label group
for (var i = 0; i < label_name_vi.length; i++) {
          for (var ij = 0; ij < label_occur_vi[i]; ij++) {
                    postLabels[sum + ij] = label_name_vi[i];
          }
          sum = sum + label_occur_vi[i];
}

We now compute temp_Title and temp_Url from postTitle and postUrl, based on the indices in labels before and after vi-sorting.

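// Start index of each label group before vi-sorting
var start_index = [];
var temp_Url = [];
var temp_Title = [];
var sum = 0;
for (var i = 0; i < label_occur_eng.length; i++) {
          start_index[i] = sum;
          sum = start_index[i] + label_occur_eng[i];
}
// Copy each group's titles and URLs to its new position
sum = 0;
var kk;
for (var i = 0; i < label_name_vi.length; i++) {
          // find where label i of the sorted list sat in the original order
          for (var k = 0; k < label_name_eng.length; k++) {
                    if (label_name_vi[i] === label_name_eng[k]) { kk = k; }
          }
          for (var j = 0; j < label_occur_vi[i]; j++) {
                    temp_Url[sum + j] = postUrl[start_index[kk] + j];
                    temp_Title[sum + j] = postTitle[start_index[kk] + j];
          }
          sum = sum + label_occur_vi[i];
}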
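The index bookkeeping in this step can be checked on a tiny made-up example. The sketch below computes the start index of each label group as a running sum of the group sizes, then lists, for the sorted labels, which old positions the posts are read from. The helper startIndices, the Map called groupOf and the sample labels are all hypothetical, and a Map lookup stands in for the inner search loop.

```javascript
// Hypothetical helper: start index of each group from its size (a running sum).
function startIndices(counts) {
  var starts = [];
  var sum = 0;
  for (var i = 0; i < counts.length; i++) {
    starts.push(sum);
    sum += counts[i];
  }
  return starts;
}

// Made-up groups: label names and sizes in their original order.
var namesBefore = ['Thơ', 'Âm nhạc', 'Du lịch'];
var countsBefore = [2, 3, 1];
var startsBefore = startIndices(countsBefore);   // [0, 2, 5]

// Remember where each label's group starts and how long it is.
var groupOf = new Map();
namesBefore.forEach(function (name, i) {
  groupOf.set(name, { start: startsBefore[i], count: countsBefore[i] });
});

// Label order after vi-sorting.
var namesAfter = namesBefore.slice().sort(new Intl.Collator('vi').compare);

// For each sorted label, collect the old positions its posts are read from.
var sourceOrder = [];
namesAfter.forEach(function (name) {
  var g = groupOf.get(name);
  for (var j = 0; j < g.count; j++) {
    sourceOrder.push(g.start + j);
  }
});
console.log(sourceOrder.join(','));   // 2,3,4,5,0,1
```

Reading postTitle and postUrl in the order given by sourceOrder produces the same arrangement as the loops above.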

We now redefine postTitle and postUrl. Note that postTitle has been re-arranged to match postLabels (which has been vi-sorted).

postUrl = temp_Url;
postTitle = temp_Title;

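We now compute the start index of each label in postLabels (which has been vi-sorted), then vi-sort postTitle within each label. Finally, postUrl is re-aligned by looking up each sorted title among the original titles.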

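// Start index of each label group in the vi-sorted arrays
var start_index = [];
var sum = 0;
for (var i = 0; i < label_occur_vi.length; i++) {
          start_index[i] = sum;
          sum = start_index[i] + label_occur_vi[i];
}
// vi-sort the titles inside each label group
for (var n = 0; n < label_occur_vi.length; n++) {
          var count = label_occur_vi[n];
          // copy this group's titles (slice returns a new array), then sort the copy
          var p_n = postTitle.slice(start_index[n], start_index[n] + count);
          p_n.sort(collator.compare);
          for (var j = 0; j < count; j++) {
                    postTitle[start_index[n] + j] = p_n[j];
          }
}
// Re-align the URLs: look each sorted title up in originalTitle and take
// the URL at the same position in originalUrl.
// This relies on titles being unique; with duplicates the last match wins.
var temp_Url = [];
for (var i = 0; i < postTitle.length; i++) {
          var k;
          for (var j = 0; j < originalTitle.length; j++) {
                    if (postTitle[i] === originalTitle[j]) { k = j; }
          }
          temp_Url[i] = originalUrl[k];
}
postUrl = temp_Url;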
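Matching URLs back by title works as long as no two posts share a title. If duplicate titles are a concern, one possible alternative (not what the code above does) is to sort the positions instead of the titles and then apply the same order to both arrays. The sketch below sorts one whole list rather than one label group at a time, and the data and the names titles, urls and order are made up.

```javascript
var viCollator = new Intl.Collator('vi');

// Made-up data with a duplicated title.
var titles = ['Mùa thu', 'Ánh trăng', 'Mùa thu'];
var urls = ['https://example.com/a', 'https://example.com/b', 'https://example.com/c'];

// Sort the positions 0..n-1 by the title found at each position.
var order = titles.map(function (_, i) { return i; });
order.sort(function (i, j) {
  return viCollator.compare(titles[i], titles[j]);
});

// Apply the same permutation to both arrays, so each URL follows its own title.
var sortedTitles = order.map(function (i) { return titles[i]; });
var sortedUrls = order.map(function (i) { return urls[i]; });

sortedTitles.forEach(function (t, n) {
  console.log(t + ' -> ' + sortedUrls[n]);
});
```

Because the URL is picked by position and never by title text, the two posts with the same title keep their own URLs.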

The revised table-of-contents code for posts in Vietnamese can be downloaded from https://drive.google.com/file/d/1DwQwE-oiURnGJbYZNPt0T9VKG7TD48w_/view?usp=sharing and you are free to use it.

Finally, I’d like to share my experience with you. I used Notepad++ to edit my revised code and Chrome to display the result. Pressing F12 in Chrome opens its debugger, which worked very nicely for me!