Functions of esProc/R/Python/Perl in Structured Data Process by Comparison: Chapter 13. Sorting

esProc

esProc provides a sort function directly on table sequences; you specify the sort expressions and the sort direction through parameters. Depending on the function and option used, a sort operation generates a new sequence, reorders the original sequence in place, or returns a sequence made up of the original sequence numbers of the sorted records. For example:

    =tbl.sort(col1:1,col2:-1)       //Generate a new sequence sorted by col1 in ascending order, then by col2 in descending order
    =tbl.sort@o(col1:1,col2:-1)     //Do not generate a new sequence; reorder the original table sequence in place
    =tbl.psort(col1:1,col2:-1)      //Generate a sequence of the original sequence numbers of the sorted records

Perl

Suppose @table is a 2D array (an array of array references) to be sorted by a combination of Columns 1, 2 and 5, where Columns 1 and 2 hold character values and Column 5 holds numeric values. The source code is as follows:

    #First define the comparison rule
    sub seniority {
        $a->[0] cmp $b->[0]        #Column 1, compared as strings
        or $a->[1] cmp $b->[1]     #Column 2, compared as strings
        or $a->[4] <=> $b->[4]     #Column 5, compared as numbers
    }

    #Then sort using this rule
    @ranked = sort seniority @table;

As the example shows, a sorting rule in Perl, whether simple or complex, must be written as a comparison routine, either as a separate named subroutine or as a block embedded directly in the sort statement; sort then applies that rule. A single-field sort is convenient to describe, but references to fields in nested structures make the code relatively long, and a multi-field sort is longer and more tedious still.

Python

Python lists have a built-in sort() method that sorts in ascending order by default, for example:

    a=[1,7,8,9,0,-5]
    a.sort()

To sort in descending order, you can sort first and then reverse the result, for example:

    a.sort()
    a.reverse()

Remember that a.sort().reverse() does not work and must be split into two statements, otherwise an error will occur. The reason is that a.sort() sorts the list in place and returns None rather than the list, so there is no list object on which to continue the computation. Passing reverse=True, as in a.sort(reverse=True), sorts in descending order in a single call.
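
The behaviour described above can be checked with a minimal sketch. The variable names result, original and descending are chosen here for illustration only:

```python
a = [1, 7, 8, 9, 0, -5]

# list.sort() sorts in place and returns None, so nothing can be chained onto it.
result = a.sort()
print(result)      # None
print(a)           # [-5, 0, 1, 7, 8, 9]

# Descending order in a single in-place call.
a.sort(reverse=True)
print(a)           # [9, 8, 7, 1, 0, -5]

# sorted() returns a new list and leaves its argument unchanged.
original = [1, 7, 8, 9, 0, -5]
descending = sorted(original, reverse=True)
print(descending)  # [9, 8, 7, 1, 0, -5]
print(original)    # [1, 7, 8, 9, 0, -5]
```

Because sorted() returns the new list, its result can be passed on to further operations, which is not possible with the return value of sort().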

For a 2D array (a list of lists), sort() compares the rows element by element: Column 1 first, then Column 2 when the Column 1 values are equal, and so on. When all Column 1 values are distinct, the result looks like a sort by Column 1 only.

    b=[ [100,1,2,5,"Li Si"],
        [-1,1.2,2.1,5,"Zhang San"],
        [88,9.1,"Wang Wu"],
        [9,1,3,"Yang Si"]
      ]
    b.sort()
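
The element-by-element comparison only becomes visible when Column 1 contains ties. A small sketch, using made-up rows that are not part of the example above:

```python
# Illustrative data: two rows share the same value in Column 1.
rows = [
    [9, 3, "Yang Si"],
    [9, 1, "Li Si"],
    [-1, 5, "Zhang San"],
]

# Rows are compared column by column: Column 1 first, then Column 2 on ties.
rows.sort()
print(rows)
# [[-1, 5, 'Zhang San'], [9, 1, 'Li Si'], [9, 3, 'Yang Si']]

# The same rule applies when comparing any two lists directly.
print([9, 1] < [9, 3])  # True
```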

To sort by another column, pass a key function to sort() or sorted(); sorted() returns a new list and leaves the original unchanged. For example:

    b=[ [100,1,2,5,"Li Si"],
        [-1,1.2,2.1,5,"Zhang San"],
        [88,9.1,8,100,"Wang Wu"],
        [9,1,3,7,"Yang Si"]
      ]
    c=sorted(b,key=lambda x:x[3])       #Sort by Column 4 in ascending order
    c=sorted(b,key=lambda x:-x[3])      #Negating a numeric key sorts in descending order; reverse=True has the same effect

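The sorting functions provided by Python are not especially easy to use. They sort in ascending order by default, and the reverse=True option applies to the whole key, so sorting in ascending order by one column and in descending order by another requires negating a numeric key or sorting in several passes. Sorting by a column other than Column 1 requires a key function, which is harder to write and to read than a plain column reference.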
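
The options mentioned above can be sketched as follows, reusing the list b from the earlier example. The result names (by_col4, mixed, two_pass and so on) are chosen for illustration, and operator.itemgetter is used as a standard-library alternative to the lambda:

```python
from operator import itemgetter

b = [
    [100, 1, 2, 5, "Li Si"],
    [-1, 1.2, 2.1, 5, "Zhang San"],
    [88, 9.1, 8, 100, "Wang Wu"],
    [9, 1, 3, 7, "Yang Si"],
]

# itemgetter(3) does the same job as lambda x: x[3].
by_col4 = sorted(b, key=itemgetter(3))

# Column 4 in descending order for the whole sort.
by_col4_desc = sorted(b, key=itemgetter(3), reverse=True)

# Column 4 ascending, then Column 1 descending: negate the numeric key.
mixed = sorted(b, key=lambda row: (row[3], -row[0]))

# Same ordering without negation: Python's sort is stable, so sort by the
# secondary key first and by the primary key last.
two_pass = sorted(b, key=itemgetter(0), reverse=True)
two_pass = sorted(two_pass, key=itemgetter(3))

print([row[4] for row in by_col4_desc])  # ['Wang Wu', 'Yang Si', 'Li Si', 'Zhang San']
print([row[4] for row in mixed])         # ['Li Si', 'Zhang San', 'Yang Si', 'Wang Wu']
print(mixed == two_pass)                 # True
```

The two-pass form also works for non-numeric keys such as strings, where a minus sign cannot be applied.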

R

    order(tbl[,1],decreasing=TRUE)               #order() is equivalent to psort of esProc: it returns the original sequence numbers of the sorted records
    tbl[order(tbl[,1],decreasing=TRUE),]         #Get the data frame of sorted records
    sort(tbl[,1],decreasing=TRUE)                #sort() returns the sorted values of Column 1 as a vector, not the sorted records
    order(data[,1]+data[,2],decreasing=TRUE)     #Sort by the sum of Column 1 and Column 2 in descending order
    order(data[,1],data[,2],decreasing=TRUE)     #First sort by Column 1 in descending order, and then by Column 2 in descending order
    order(data[,1],-data[,2],decreasing=TRUE)    #First sort by Column 1 in descending order, and then by Column 2 (numeric) in ascending order

Both sort() and order() accept the decreasing parameter, but sort() takes a single vector, while order() accepts several sort keys.

As the code above shows, order() by default takes a single decreasing flag that applies to all sort keys, rather than one flag per field (only method="radix" accepts a vector of flags, one per key). Therefore, to sort in ascending order by one field and in descending order by another, the usual workaround is to negate the values of one field with a minus sign. This works only for numeric fields and is harder to understand.
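
For comparison with the other languages in this chapter, the sequence-number result that order() and psort return can be reproduced in Python with a short sketch. The names col1 and order_desc are illustrative, and the positions are 0-based, unlike the 1-based positions in R:

```python
col1 = [100, -1, 88, 9]

# Positions of the elements, listed in descending order of their values.
order_desc = sorted(range(len(col1)), key=lambda i: col1[i], reverse=True)
print(order_desc)                     # [0, 2, 3, 1]

# Applying the positions yields the sorted values, as tbl[order(...),] does for rows.
print([col1[i] for i in order_desc])  # [100, 88, 9, -1]
```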

About datathinker

A technical consultant on database performance optimization, database storage expansion, and off-database computation. Personal blog: datakeywrod; website: raqsoft.