Preparing testing data is a critical task in software testing. High-quality testing data simulates real business cases more faithfully, which helps meet testing requirements by allowing software performance to be evaluated promptly and effectively and by exposing potential issues in software builds. In most cases a relatively large amount of data is needed, and it must be randomly generated according to specific requirements. Sometimes the data items are related to one another, and some data must be retrieved from an existing database. As a result, preparing testing data is often complex and involves a heavy workload.
esProc is a handy tool for testing data preparation.
Suppose we need to prepare testing data for employee information in text format, including employee number, name, gender, date of birth, and city and state of residence. Working through this example shows how testing data are prepared.
We have the following requirements for the testing data: employee numbers are generated sequentially; names and genders are randomly generated; birthdays are randomly generated, but each employee's current age must be between 18 and 55; cities and states are randomly obtained from a table in the database.
Three text files, Top100MaleNames.txt, Top100FemaleNames.txt and Top100Surnames.txt, store the 100 most common male names, female names, and surnames, respectively.
The cities of employees need to be retrieved randomly from the CITIES table in the database:
Using the STATEID field in the CITIES table, we can retrieve the state abbreviation for each employee from the STATES table:
Note that when generating employee information, an employee's name is related to his or her gender. Therefore, we first read the text data, combine the most common male and female names, and add a gender field to them:
After sorting, C2 holds the following sorted table of names and genders:
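For comparison, the same combination step can be sketched in Python. This is only an illustration of the idea, not the esProc code: the function names `build_name_table` and `read_lines` and the gender labels `"M"` and `"F"` are assumptions, and small in-memory lists stand in for the two name files.

```python
def read_lines(path):
    """Read a text file into a list of lines (one name per line is assumed)."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def build_name_table(male_names, female_names):
    """Combine two name lists into one list of (name, gender) rows.

    The "M"/"F" labels are an assumption made for this sketch.
    """
    rows = [(n.strip(), "M") for n in male_names if n.strip()]
    rows += [(n.strip(), "F") for n in female_names if n.strip()]
    return rows


# Tiny in-memory stand-ins for Top100MaleNames.txt and Top100FemaleNames.txt
names = build_name_table(["James", "John"], ["Mary", "Linda"])
```

With the real files, the call would look like `build_name_table(read_lines("Top100MaleNames.txt"), read_lines("Top100FemaleNames.txt"))`, assuming each file holds one name per line. Because each row carries its gender, a single random pick from `names` yields a consistent name and gender pair.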
Similarly, a city and its state abbreviation are related. After the data are retrieved from the database, the state abbreviation is added to the city information:
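The lookup through STATEID amounts to a join between the two tables. A minimal Python sketch using an in-memory SQLite database is given here for illustration. Only the table names CITIES and STATES and the STATEID field come from the text; the other column names (`CITYID`, `NAME`, `ABBR`) and the sample rows are assumptions.

```python
import sqlite3

# In-memory database with made-up rows; column names other than STATEID
# are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STATES (STATEID INTEGER PRIMARY KEY, ABBR TEXT);
    CREATE TABLE CITIES (CITYID INTEGER PRIMARY KEY, NAME TEXT, STATEID INTEGER);
    INSERT INTO STATES VALUES (1, 'NY'), (2, 'CA');
    INSERT INTO CITIES VALUES (1, 'New York', 1), (2, 'Los Angeles', 2), (3, 'San Diego', 2);
""")

# Attach the state abbreviation to every city through STATEID
city_rows = conn.execute("""
    SELECT c.NAME, s.ABBR
    FROM CITIES c
    JOIN STATES s ON c.STATEID = s.STATEID
    ORDER BY c.CITYID
""").fetchall()
conn.close()
```

Doing the join once, before the generation loop, means each random pick from `city_rows` already carries the matching state, so no database access is needed inside the loop.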
Next, the basic settings for data generation are prepared, including the data structure of the employee information table and the amount of testing data to generate:
Here, the number in C5 defines the buffer size: after every 1500 records are generated, the data are written to the text file once. This keeps memory usage under control. In B6, the data structure of the employee information table is written to the text file.
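The buffering idea can be illustrated with a short Python sketch: write the header first, collect records in memory, and flush them to the file each time the buffer reaches 1500 rows. The function name `write_in_batches`, the tab-separated layout, and the field names `EID` and `NAME` are assumptions made for the example.

```python
import io


def write_in_batches(records, out, header, batch_size=1500):
    """Write the header, then the records in chunks of batch_size rows.

    Returns the number of flushes performed. Tab-separated output is an
    assumption of this sketch.
    """
    out.write("\t".join(header) + "\n")
    buffer = []
    flushes = 0
    for rec in records:
        buffer.append("\t".join(str(v) for v in rec))
        if len(buffer) >= batch_size:
            out.write("\n".join(buffer) + "\n")
            buffer.clear()  # release the rows that were just written
            flushes += 1
    if buffer:  # write whatever is left after the last full batch
        out.write("\n".join(buffer) + "\n")
        flushes += 1
    return flushes


# A StringIO stands in for the output text file; field names are hypothetical
out = io.StringIO()
records = ((i, f"name{i}") for i in range(1, 3201))
flushes = write_in_batches(records, out, ["EID", "NAME"])
```

Because the records arrive from a generator and the buffer is cleared after each write, at most 1500 rows are held in memory at any time, regardless of how many records are generated in total.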
As the next step, we run a loop to generate the testing data for each employee:
B7 generates a random sequence number that refers to a name, while C7 generates one for a surname. They are used to retrieve the name and gender of the employee. According to the requirements, B11 randomly generates the age, and line 12 of the code then selects a random date in the corresponding year as the employee's birthday. Lines 13 and 14 randomly select a record to get the city and state for the employee. After all the required data are generated, B15 adds them to the sorted table of employee information created in A5. A16 controls data output and writes the data to the text file after every 1500 records. After the output, A5 is emptied to avoid occupying too much memory.
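The per-employee logic can be sketched in Python as follows. This is a hedged illustration rather than a translation of the esProc grid, and all function names and sample values are assumptions. It also swaps in a slightly different birthday technique: instead of picking an age and then a date in the matching year, it draws the birthday directly from the window of dates that gives an age of 18 to 55 on the reference date, which sidesteps the case where a birthday later in the year would leave the employee one year younger than intended.

```python
import random
from datetime import date, timedelta


def years_ago(d, years):
    """Return the date `years` years before d (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def age_on(today, birthday):
    """Age in whole years on the date `today`."""
    before_birthday = (today.month, today.day) < (birthday.month, birthday.day)
    return today.year - birthday.year - before_birthday


def random_birthday(today, min_age=18, max_age=55, rng=random):
    """Pick a uniform random birthday so that min_age <= age <= max_age."""
    latest = years_ago(today, min_age)
    earliest = years_ago(today, max_age + 1) + timedelta(days=1)
    return earliest + timedelta(days=rng.randint(0, (latest - earliest).days))


def make_employee(eid, names, surnames, cities, today, rng=random):
    """Build one record: (id, full name, gender, birthday, city, state)."""
    first, gender = rng.choice(names)   # name and gender stay consistent
    city, state = rng.choice(cities)    # city and state stay consistent
    full_name = f"{first} {rng.choice(surnames)}"
    return (eid, full_name, gender, random_birthday(today, rng=rng), city, state)


# Made-up sample data; a seeded generator keeps the run repeatable
rng = random.Random(42)
today = date(2024, 6, 15)
names = [("James", "M"), ("Mary", "F")]
surnames = ["Smith", "Jones"]
cities = [("New York", "NY"), ("Los Angeles", "CA")]
employees = [make_employee(i, names, surnames, cities, today, rng)
             for i in range(1, 1001)]
```

Picking the name together with its gender, and the city together with its state, is what keeps the related fields consistent in every generated record.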
After all the data are output, the text file looks as follows:
When preparing testing data with esProc, we can run a loop to generate a large amount of random data. Within the loop, we can easily retrieve existing database or text data, generating data that meets business needs without writing complex programs.