Functions of esProc/R/Python/Perl in Structured Data Process by Comparison: Chapter3.Import 2D table from Excel

Data of the regular 2D table is imported from Excel into the memory, and stored in relevant objects provided by the product.

esProc

=file(“d:/data.xls”).importxls@t(id,name,score;”sheet1″,1:10)   

You can specify the field name, sheet name, Start row and End row to be imported

Perl 

It may be a little more trouble for Perl to access to Excel. You need to execute cpanSpreadsheet::ParseExcel command in order to install the relevant class package, and then invoke its interface for access, see below for sample code:

#!perl -w 

use Spreadsheet::ParseExcel;

use Spreadsheet::ParseExcel::FmtUnicode; 

my $parser   = Spreadsheet::ParseExcel->new();

my $formatter = Spreadsheet::ParseExcel::FmtUnicode->new(Unicode_Map=>”CP936″);

my $workbook = $parser->parse(‘d:/data.xlsx’, $formatter); 

if ( !defined $workbook ) {

    die $parser->error(), “.\n”;

}

 for my $worksheet ( $workbook->worksheets() ) { 

    my ( $row_min, $row_max ) = $worksheet->row_range();

    my ( $col_min, $col_max ) = $worksheet->col_range(); 

    for my $row ( $row_min .. $row_max ) {

        for my $col ( $col_min .. $col_max ) {

            my $cell = $worksheet->get_cell( $row, $col );

            next unless $cell;

            print “Row, Col    = ($row, $col)\n”;

            print “Value       = “, $cell->value(),       “\n”;

            print “\n”;

        }

    }

}

Python

It may also be a little more trouble for Python to import the Excel. You need to install the xlrd class package, and then invoke its interface for access, see below for sample code:

#-*- coding: utf8 -*-

import xlrd 

fname = “d:/data.xlsx”

bk = xlrd.open_workbook(fname)

shxrange = range(bk.nsheets)

try:

        sh = bk.sheet_by_name(“Sheet1”)

except:

        print (“no sheet in %s named Sheet1” % fname) 

nrows = sh.nrows

ncols = sh.ncols

print (“nrows %d, ncols %d” % (nrows,ncols)) 

cell_value = sh.cell_value(1,1)

#print cell_value 

mydata = []

for i in range(1,nrows):

        row_data = sh.row_values(i)

        mydata.append(row_data)

print (mydata) 

R

The gdata package needs to be installed first, but the installation program is quite easy to fail. And in the end, it is installed successfully after the CRAN mirror has been replaced for several times. However, it is used quite easily, as shown in the following example:

library(gdata)

a<-read.xls(“d:/data.xls”,sheet=1)         # Loadinto data frame by one statement

print(a)

Conclusions:

2014-06-10_144314

Advertisements

About datathinker

a technical consultant on Database performance optimization, Database storage expansion, Off-database computation. personal blog at: datakeywrod, website: raqsoft
This entry was posted in Structured Data Process and tagged , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s