Preparing Testing Data with esProc

Preparing testing data is a critical task in software testing. High-quality testing data simulates real business cases more closely, which makes it possible to evaluate software performance in a timely and effective way and to find potential issues in software builds. Most of the time, the volume of testing data is relatively large, and the data must be randomly generated according to specific requirements. Sometimes the data items are related to one another, and some data has to be retrieved from an existing database. As a result, preparing testing data is often complex and involves a heavy workload.

esProc is a handy tool for testing data preparation.

Suppose we need to prepare testing data for employee information in text format, including employee number, name, gender, date of birth, and city and state of residence. This example shows how testing data can be prepared.

The testing data must meet the following requirements: employee numbers are generated sequentially; names and genders are randomly generated; birthdays are randomly generated, but each employee's current age must be between 18 and 55; and cities and states are randomly selected from a table in the database.

Three text files, Top100MaleNames.txt, Top100FemaleNames.txt and Top100Surnames.txt, store the 100 most common male names, female names, and surnames, respectively.

 esProc_prepare_testdata_1

The employees' cities are retrieved randomly from the CITIES table in the database:

esProc_prepare_testdata_2

Using the STATEID field in the CITIES table, we can look up the employee's state abbreviation in the STATES table:

esProc_prepare_testdata_3

Note that when generating the employee information, an employee's name is related to his or her gender. Therefore we first read the text data, combine the most common male and female names, and add a gender field to them:

esProc_prepare_testdata_4
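For readers without esProc at hand, the same idea can be sketched in Python. This is only an illustration, not the esProc code from the screenshot: the helper names, the one-name-per-line file layout, and the "M"/"F" gender codes are all assumptions.

```python
def load_names(path):
    """Read one name per line, skipping blank lines (file layout is assumed)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def build_name_table(male_names, female_names):
    """Tag each name with a gender and sort the combined rows by name."""
    rows = [{"name": n, "gender": "M"} for n in male_names]
    rows += [{"name": n, "gender": "F"} for n in female_names]
    return sorted(rows, key=lambda r: r["name"])
```

With the article's files, this would be called as `build_name_table(load_names("Top100MaleNames.txt"), load_names("Top100FemaleNames.txt"))`. Because the gender travels with the name, picking one random row yields a consistent name and gender pair.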

After sorting, C2 holds the following sorted table of names and genders:

esProc_prepare_testdata_5

Similarly, each city is related to a state abbreviation. After the data is retrieved from the database, the state abbreviation is added to the city information:

esProc_prepare_testdata_6

And A4:

esProc_prepare_testdata_7
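The lookup of state abbreviations by STATEID can be sketched in Python as follows. Only the table names and the STATEID field come from the text above; the column names NAME and ABBR, the function name, and the sample rows are assumptions made for illustration.

```python
def attach_state_abbr(cities, states):
    """Add each city's state abbreviation by looking up its STATEID."""
    # Build the lookup once so every city is resolved in constant time.
    abbr_by_id = {s["STATEID"]: s["ABBR"] for s in states}
    return [
        {"city": c["NAME"], "state": abbr_by_id[c["STATEID"]]}
        for c in cities
    ]


# Illustrative rows only; the real data would come from the database.
sample_cities = [
    {"NAME": "Austin", "STATEID": 1},
    {"NAME": "Boston", "STATEID": 2},
]
sample_states = [
    {"STATEID": 1, "ABBR": "TX"},
    {"STATEID": 2, "ABBR": "MA"},
]
city_table = attach_state_abbr(sample_cities, sample_states)
```

Doing this once before the loop means that choosing a random row later gives a city and its matching state together.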

Next, the basic settings for the generated data are sorted out, including the data structure of the employee information table and the amount of testing data to generate:

esProc_prepare_testdata_8

Here, the number in C5 defines the cache size: after every 1500 records are generated, the data is written to the text file once. This keeps the memory in use under control. B6 writes the data structure of the employee information table to the text file.
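The batching idea can be sketched in Python like this. It is a minimal illustration rather than a translation of the esProc cells: the function name, the tab delimiter, and the return value are assumptions, and only the batch size of 1500 comes from the text.

```python
import io


def write_in_batches(records, out, header, batch_size=1500):
    """Write the header, then write rows to `out` once per full batch.

    Returns the number of batch writes performed.
    """
    out.write("\t".join(header) + "\n")
    buffer = []
    writes = 0
    for rec in records:
        buffer.append("\t".join(str(v) for v in rec))
        if len(buffer) >= batch_size:
            out.write("\n".join(buffer) + "\n")
            buffer.clear()  # release the rows already written
            writes += 1
    if buffer:  # write the final partial batch
        out.write("\n".join(buffer) + "\n")
        writes += 1
    return writes


# Usage: 3 records with a batch size of 2 give two writes (2 rows, then 1 row).
sink = io.StringIO()
n = write_in_batches([(1, "a"), (2, "b"), (3, "c")], sink, ["ID", "VAL"], batch_size=2)
```

At most `batch_size` rows are held in memory at any time, which is the point of the cache setting described above.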

As the next step, we can now run a loop to generate the testing data for every employee:

esProc_prepare_testdata_9

B7 generates a random index into the names, and C7 generates one into the surnames. These are used to retrieve the employee's name and gender. As required, B11 randomly generates an age, and line 12 selects a random date in the corresponding year as the employee's birthday. Lines 13 and 14 randomly select a city and obtain the city and state for the employee. Once all required data has been generated, B15 appends it to the sorted table of employee information created in A5. A16 controls the output, writing the data to the text file after every 1500 records. After each write, A5 is cleared to avoid occupying too much memory.
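The birthday step deserves a closer look. The Python sketch below is one way to satisfy the 18 to 55 requirement and is not the esProc code itself. If "a random date in the corresponding year" is taken literally, a birthday that has not yet occurred this year leaves the completed age one lower than intended, so this sketch instead picks the date from the exact window in which a person has the chosen age. All function and parameter names are hypothetical.

```python
import random
from datetime import date, timedelta


def years_before(d, years):
    """Return the date `years` years before d (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def random_birthday(today, min_age=18, max_age=55, rng=random):
    """Pick an age, then a random birthday for which that age holds on `today`."""
    age = rng.randint(min_age, max_age)
    latest = years_before(today, age)  # turns `age` exactly today
    earliest = years_before(today, age + 1) + timedelta(days=1)  # turns age+1 tomorrow
    span = (latest - earliest).days
    return earliest + timedelta(days=rng.randint(0, span))


def age_on(birthday, today):
    """Completed years between birthday and today."""
    years = today.year - birthday.year
    if (today.month, today.day) < (birthday.month, birthday.day):
        years -= 1
    return years
```

Calling `random_birthday(date.today())` returns a date whose completed age today lies between 18 and 55; passing a seeded `random.Random` as `rng` makes the generated data reproducible.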

After all the data has been output, the text file looks like this:

esProc_prepare_testdata_10

When preparing testing data with esProc, we can run a loop to generate a large amount of random data. Within the loop, we can also easily retrieve existing data from a database or from text files, so the data is generated according to business needs without writing complex programs.

About datathinker

A technical consultant on database performance optimization, database storage expansion, and off-database computation. Personal blog: datakeywrod; website: raqsoft.