The Best Test Data Management Practices in an Increasingly Digital World

A quick scan of the application landscape shows that customers are more empowered, digitally savvy, and eager to have superior experiences faster. To achieve and maintain leadership in this landscape, organizations need to update applications constantly and at speed. This is why dependency on agile, DevOps, and CI/CD practices has increased tremendously, translating in turn to an exponential increase in the adoption of test data management initiatives. CI/CD pipelines benefit from the fact that any new code that is developed is automatically integrated into the main application and tested continuously. Automated tests are critical to success, and agility is lost when test data delivery does not match code development and integration velocity.

Why Test Data Management?

Industry data shows that up to 60% of development and testing time is consumed by data-related activities, with a significant portion dedicated to test data management. This helps explain why the global test data management market is expected to grow at a CAGR of 11.5% over the forecast period 2020-2025, according to the ResearchandMarkets TDM report.

Best Practices for Test Data Management

Any organization looking to make its test data management discipline stronger and capable of supporting the new-age digital delivery landscape needs to focus on the following three cornerstones.

Applicability:
The principle of shift left mandates that each phase in the SDLC has a tight feedback loop that ensures defects don’t move down the development/deployment pipeline, making it less costly to detect and rectify errors. Its success hinges to a large extent on close mapping of test data to the production environment. Replicating or cloning production data is manually intensive, and as the World Quality Report 2020-21 shows, 79% of respondents create test data manually with each run. Scripts and automation tools, when used well, can take on most of the heavy lifting and bring this manual effort down considerably. Because production-quality data is very close to reality, defect leakage is vastly reduced, ultimately translating to a significant reduction in defect triage costs at later stages of development/deployment.

However, using production-quality data at all times may not be possible, especially for applications that are only prototypes or are built from scratch. Additionally, using a complete copy of the production database is time- and effort-intensive – instead, it is worthwhile to identify relevant subsets for testing. A strategy that brings together the right mix of production-quality data and synthetic data closely aligned to production data models is the best bet. While production data maps to narrower testing outcomes in realistic environments, synthetic data is much broader and enables you to simulate environments beyond the ambit of production data. Using test data automation platforms that allocate apt dataset combinations for each test can bring further stability to testing.
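
To illustrate the synthetic side of that mix, here is a minimal sketch assuming Python and the open-source Faker library; the field names mirror a hypothetical customer table and are not taken from any specific production schema.

```python
# A minimal sketch of synthetic test data generation, assuming the Faker package.
# The schema below (customer_id, name, email, signup_date, country) is hypothetical.
from faker import Faker

fake = Faker()

def synthetic_customers(n: int) -> list[dict]:
    """Generate n synthetic customer rows shaped like a production customer table."""
    return [
        {
            "customer_id": i,
            "name": fake.name(),
            "email": fake.email(),
            "signup_date": fake.date_between(start_date="-3y", end_date="today"),
            "country": fake.country_code(),
        }
        for i in range(1, n + 1)
    ]

if __name__ == "__main__":
    for row in synthetic_customers(5):
        print(row)
```

In practice, such generators are constrained by the production data model (types, value ranges, referential rules) so that the synthetic rows stay aligned with reality.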

Tight coupling with production data is also complicated by a host of data privacy laws like GDPR, CCPA, CPPA, etc., that mandate protecting customer-sensitive information. Anonymizing or obfuscating data to remove sensitive information is a common approach to addressing this issue. Non-production environments are usually less secure, so data masking to protect PII becomes paramount.
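
As a rough illustration of masking, the sketch below hashes PII columns in a CSV extract before it is copied into a less secure environment. The column names and file paths are hypothetical, and a production-grade TDM tool would additionally preserve data formats and referential integrity.

```python
# A minimal PII-masking sketch: one-way hashing keeps masked values consistent
# across files/tables without exposing the originals. Column names are hypothetical.
import csv
import hashlib

PII_COLUMNS = {"email", "phone", "ssn"}

def mask(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]

def mask_csv(src_path: str, dst_path: str) -> None:
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            writer.writerow(
                {col: mask(val) if col in PII_COLUMNS else val for col, val in row.items()}
            )

if __name__ == "__main__":
    mask_csv("customers_prod_extract.csv", "customers_masked.csv")  # hypothetical paths
```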

Accuracy:
Accuracy is critical in today’s digital transformation-led SDLC, where app updates are launched to market faster and need to be as error-free as possible, a nearly impossible feat without accurate test data. The technology landscape is also more complex and integrated than ever before, and that complexity percolates into data model relationships and the environments in which they are used. The need is to maintain a single source of data truth. Many organizations adopt the path of creating a gold master of data and then carving out data subsets based on the needs of each application. Adopting tools that validate and update data automatically during each test run further ensures the accuracy of the master data.
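
To make the gold-master-plus-subsets idea concrete, here is a minimal sketch using pandas with small in-memory data; the table and column names are illustrative, and real implementations typically subset at the database level while preserving referential integrity.

```python
# A minimal sketch of provisioning an application-specific subset from a central
# "gold master" dataset. The policy table and its columns are illustrative only.
import pandas as pd

gold_master = pd.DataFrame({
    "policy_id": [101, 102, 103, 104],
    "region":    ["EU", "US", "EU", "APAC"],
    "premium":   [1200, 800, 950, 600],
})

def subset_for(region: str, limit: int = 1000) -> pd.DataFrame:
    """Return only the rows a given application team needs for its tests."""
    return gold_master[gold_master["region"] == region].head(limit).copy()

if __name__ == "__main__":
    print(subset_for("EU"))
```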

Accuracy also entails ensuring the relevance of data in the context of the application being tested. Decade-old data formats might be applicable in the context of an insurance application that needs historic policy data formats. However, demographic data or data related to customer purchasing behavior in a retail application context is highly dynamic. A centralized data governance structure addresses this issue, at times sunsetting data that has served its purpose to prevent any unintended usage. This also reduces the maintenance cost of archiving large amounts of test data.

Also important is a proper data governance mechanism that provides the right provisioning capability and ownership driven at a central level, thereby helping teams use a single data truth for testing. Adopting similar provisioning techniques across teams can further remove cross-team constraints and ensure accurate data is available on demand.

Availability:
The rapid adoption of digital platforms and the movement of applications into cloud environments have been driving exponential growth in user-generated data and cloud data traffic. The pandemic has accelerated this trend by moving the majority of application usage online. The ResearchandMarkets report states that for every terabyte of data growth in production, ten terabytes are used for development, testing, and other non-production use cases, thereby driving up costs. Given this magnitude of test data usage, it is essential to align data availability with the release schedules of the application so that testers don’t need to spend a lot of time tweaking data for every code release.

The other crucial aspect of ensuring data availability is managing version control of the data, which helps overcome the confusion caused by multiple, conflicting versions of local databases/datasets. A centrally managed test data team helps ensure a single data truth and provides subsets of data applicable to the various subsystems or to the needs of the application under test. The central data repository also needs to be an ever-changing, learning one, since the APIs and interfaces of the application keep evolving, driving the need to update test data consistently. After every test, the quality of the data can be evaluated and updated in the central repository, making it more accurate. This further drives reusability of data across a plethora of similar test scenarios.

The importance of choosing the right test data management tools

In DevOps and CI/CD environments, accurate test data delivered at high velocity is an additional critical dimension in ensuring continuous integration and deployment. Choosing the right test data management framework and tool suite helps automate the various stages of making data test-ready: data generation, masking, scripting, provisioning, and cloning. The World Quality Report 2020-21 indicates that the adoption of cloud and tool stacks for TDM has increased, but more maturity is needed to use them effectively.

In summary, for test data management, as with many other disciplines, there is no one-size-fits-all approach. An optimum mix of production-mapped data and synthetic data, created and housed in a centrally managed repository, is an excellent way to go. However, this approach, particularly on the synthetic data generation side, comes with its own set of challenges, including the need for strong domain and database expertise. Organizations have also been taking TDM to the next level by deploying AI and ML techniques that scan the datasets in the central repository and suggest the most relevant data for a particular application under test.

Need help? Partner with experts from Trigent to get a customized test data management solution and be a leader in the new-age digital delivery landscape.

4 Rs for Scaling Outsourced QA: The first steps towards a rewarding engagement

The expanding nature of products, the need for faster releases to market well ahead of the competition, knee-jerk or ad hoc reactions to newer revenue streams, and the ever-increasing role of customer experience across newer channels of interaction are all driving the need to scale up development and testing. With the increased adoption of DevOps, the need to scale takes on a different color altogether.

Outsourcing QA has become the norm on account of its ability to address the scalability of testing initiatives and bring a sharper focus to outcome-based engagements. The World Quality Report 2020 mentions that 34% of respondents felt QA teams lack skills, especially on the AI/ML front. This further reinforces the need to outsource in order to get the right mix of skill sets and avoid temporary skill-set gaps.

However, outsourced QA can give you speed and scale only if the rules of engagement with the partner are clear. Focusing on the 4 Rs outlined below while embarking on the outsourcing journey will help you derive maximum value.

  1. Right Partner
  2. Right Process
  3. Right Communication
  4. Right Outcome

Right Partner

The foremost step is to identify the right partner: one with a stable track record, depth in QA as well as in the domain and technology, and the right mix of skill sets across toolsets and frameworks. Further, given the blurring lines between QA and development, with testing being integrated across the SDLC, the partner also needs strengths across DevOps and CI/CD in order to make a tangible impact on the delivery cycle.

The ability of the partner to bring prebuilt accelerators to the table can go a long way in achieving cost, time, and efficiency benefits. The stability or track record of the partner translates to the ability to bring onboard the right team, one that stays committed throughout the duration of the engagement. The team’s staying power assumes special significance in longer engagements, where shifts in critical talent derail efficiency and timelines on account of the challenges involved in onboarding new talent and effecting knowledge transfer.

An often overlooked area is the partner’s integrity. During the evaluation stages, claims pertaining to industry depth as well as technical expertise abound, and partners tend to overpromise. Due care needs to be exercised to establish whether their recommendations are grounded in delivery experience. A closer look at the partner’s references and past engagements not only provides insight into these claims but also helps evaluate their ability to deliver in your context.

It’s also worthwhile to explore if the partner is open to differentiated commercial models that are more outcome driven and based on your needs rather than being fixated on the traditional T&M model.

Right Process

With the right partner on board, creating a robust process and governing mechanism assumes tremendous significance. Mapping key touchpoints on the partner side, aligning them to your team, and identifying escalation points serve as a good starting point. With agile and DevOps principles having cross-team collaboration as their cornerstone, interactions between development, QA, and business stakeholders should form a key component of the process. While cross-functional teams with Dev and QA competencies start off each sprint with a planning meeting, formulating cadence calls to assess progress and setting up code-drop or hand-off criteria between Dev and QA can prevent agile engagements from degrading into mini waterfall models.

Bringing in automated CI/CD pipelines largely obviates the need for handoffs. Processes then need to track and manage areas such as quality and release readiness, visibility across all stages of the pipeline through reporting of essential KPIs, documentation for managing version control, resource management, and capacity planning. At times, toolset disparity between various stages, with multiple teams driving parallel work streams, creates numerous information silos, leading to fragmented visibility at the product level. The right process should focus on integration aspects as well, to bridge these gaps. Each team needs to be aware of, and given visibility into, ownership at each stage of the pipeline.

Further, a sound process also brings in elements of risk mitigation and impact assessment and ensures adequate controls are built into SOP documents to handle any unforeseen event. Security is another critical area that needs to be incorporated into the process early on; more often than not, it is an afterthought in the DevOps process. The Puppet 2020 State of DevOps report mentions that integrating security fully into the software delivery process enables critical vulnerabilities to be remediated quickly – 45% of organizations with this capability can remediate vulnerabilities within a day.

Right Communication

Clear and effective communication is an integral component of QA, more so when DevOps, agile, and similar collaboration-heavy initiatives are pursued to achieve QA at scale. Effective communication at the beginning of the sprint ensures that cross-functional teams are cognizant of the expectations on each of them and have their eyes firmly fixed on the end goal of the application release. From then on, a robust feedback loop, one that aims at continuous feedback and response across all stages of the value chain, plays a vital role in maintaining the health of the DevOps pipeline.

While regular stand-up meetings have their own place in DevOps, effective communication needs to go much further, focusing on tools, insights across each stage, and collaboration. A wide range of messaging apps like Slack, email, and notification tools accelerates inter-team communication. Many of these toolkits are further integrated with RSS feeds, Google Drive, and various CI tools like Jenkins, Travis, and Bamboo, making build pushes and code-change notifications fully automated. Developers need notifications when a build fails, testers need them when a build succeeds, and Ops needs to be notified at various stages depending on the release workflow.
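
As a rough sketch of this kind of automated notification (not Jenkins’ own plugin mechanism), a post-build step could invoke a small script that posts the build status to a Slack incoming webhook, assuming the third-party requests package; the environment variable names below are hypothetical.

```python
# A minimal sketch: push the build result to a Slack incoming webhook so developers
# hear about failures and testers about successful builds. Variable names are hypothetical.
import os
import requests

def notify_build_status() -> None:
    status = os.environ.get("BUILD_STATUS", "UNKNOWN")   # e.g. exported by the CI job
    job = os.environ.get("JOB_NAME", "my-app-build")
    message = f"Build {job} finished with status: {status}"
    requests.post(os.environ["SLACK_WEBHOOK_URL"], json={"text": message}, timeout=10)

if __name__ == "__main__":
    notify_build_status()
```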

The toolkits adopted by the partner also need to extend communication to your team. At times, it makes sense for the partner to offer customer service and help desk support as an independent channel for raising your concerns. The Puppet report further mentions that companies at a high level of DevOps maturity use ticketing systems 16% more than companies at the lower end of the maturity scale. Communicating the project’s progress and evolution to all concerned stakeholders is integral, irrespective of the platforms used. Equally important is the need to categorize communication by priority and by what is most applicable to each class of users.

Documentation is an important component of communication and, in our experience, a commonly underplayed one. It is important for sharing work, knowledge transfer, continuous learning, and experimentation. Well-documented code also enables faster completion of audits. In a CI/CD-based software release methodology, code documentation plays a strong role in version control across multiple releases. Experts advocate continuous documentation as a core communication practice.

Right Outcome

Finally, it goes without saying that setting parameters for measuring the outcome, and then tracking and monitoring them, determines the partner’s success in scaling your QA initiatives. Metrics like velocity, reliability, reduced application release cycles, and the ability to ramp up/ramp down are commonly used. Further, there is also a set of metrics aimed at the efficiency of the CI/CD pipeline, such as environment provisioning time, feature deployment rate, and a series of build, integration, and deployment metrics. However, it is imperative to supplement these with others that are more aligned to customer centricity – delivering user-ready software faster, with minimal errors, at scale.

In addition to the metrics used to measure and improve the various stages of the CI/CD pipeline, we also need to track several non-negotiable improvement measures. Measures like deployment frequency, error rates at increased load, performance and load balancing, automation coverage of the delivery process, and recoverability help ascertain the efficiency of the QA scale-up.
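
As a simple illustration, the sketch below derives two of these measures, deployment frequency and failure rate, from a hypothetical list of deployment records; in practice, these figures would be pulled from the CI/CD tool's own data.

```python
# A minimal sketch computing deployment frequency and failure rate from
# illustrative deployment records (date, succeeded?). The data is made up.
from datetime import date

deployments = [
    (date(2021, 3, 1), True),
    (date(2021, 3, 3), False),
    (date(2021, 3, 5), True),
    (date(2021, 3, 8), True),
]

days = (deployments[-1][0] - deployments[0][0]).days or 1
frequency = len(deployments) / days                                   # deployments per day
failure_rate = sum(1 for _, ok in deployments if not ok) / len(deployments)

print(f"Deployment frequency: {frequency:.2f}/day, failure rate: {failure_rate:.0%}")
```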

Closely following on the heels of the earlier point, an outcome-based model that maps financials to your engagement objectives will help track outcomes to a large extent. While the traditional T&M model is governed by transactional metrics, project overlays abound in cases where the engagement scope does not align well with outcome expectations. An outcome-based model also pushes the partner to bring in innovation through AI/ML and similar new-age technology drivers – providing you access to such skill sets without the need to have them on your rolls.

If you are new to outsourcing, or working with a new partner, it may be good to start with a non-critical part of the work (for regular testing or automation), establish the process, and then scale the engagement. For organizations that already have some maturity in adopting outsourced QA functions, the steps outlined earlier form an all-inclusive checklist to maximize engagement traction and effectiveness with the outsourcing partner.

Partner with us

Trigent’s experienced and versatile Quality Assurance and Testing team is a major contributor to the successful launch, upgrade, and maintenance of quality software used by millions around the globe. Our experienced team’s responsible testing practices put process before convenience to delight stakeholders with an impressive Defect Escape Ratio (DER) of 0.2 that rivals the best in the industry.

Trigent is an early pioneer in the IT outsourcing and offshore software development business. We enable organizations to adopt digital processes and customer engagement models to achieve outstanding results and end-user experiences. We help clients achieve this through enterprise-wide digital transformation, modernization, and optimization of their IT environment. Our decades of experience, deep domain knowledge, and technology expertise deliver transformational solutions to ISVs, enterprises, and SMBs.

Contact us now.

Continuous Integration with Jenkins

Continuous Integration (CI) is a software development practice at Trigent in which members of development teams integrate their work frequently as part of the agile methodology. Integrations can happen anywhere from once to several times daily. Each integration is verified by an automated build (including tests) to detect integration errors as quickly as possible. The main aim of CI is to provide instant feedback when a defect is found in the code so that it can be rectified as soon as possible.

The agile development cycle followed in CI is Code >> Build >> Test >> Deploy, which helps deliver high-quality, bug-free code.

CI runs have two major phases:

Step one – ensures the code compiles.

Step two – ensures the code works as designed.

For best results, these two phases should be followed by running a series of automated tests that validate all levels of a product.
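
As a minimal sketch of these two phases, assume a generic project where `make build` compiles the code and `pytest` runs the automated checks; the commands are placeholders for whatever your pipeline actually uses.

```python
# A minimal CI gate sketch: phase 1 checks that the code builds, phase 2 that it
# works as designed. The build/test commands are placeholders, not a real pipeline.
import subprocess
import sys

def run(stage: str, command: list[str]) -> None:
    print(f"[CI] {stage}: {' '.join(command)}")
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"[CI] {stage} failed - stopping for instant feedback")
        sys.exit(result.returncode)

run("Phase 1 - compile", ["make", "build"])    # ensures the code compiles
run("Phase 2 - verify", ["pytest", "tests/"])  # ensures the code works as designed
print("[CI] Build passed - ready for the broader automated test suites")
```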

Advantages of CI

  • Bugs are detected immediately
  • Debugging becomes less time-consuming
  • There is no lengthy, separate integration phase, which saves time
  • Bug fixes are less expensive
  • Since code is committed frequently, roll-back is easier in case of any major issue
  • Makes the code more robust and speeds up development in an agile environment

Jenkins

Jenkins is an open-source tool for performing continuous integration, and it is great at finding issues in software early. The main aim of Jenkins is to trigger a build when an event occurs, for example, ‘build every few minutes’ or ‘build after every commit’. Jenkins also monitors test execution and sends out notifications when a build has passed or failed. It builds and tests your software continuously, monitors the execution and status of jobs, makes life easier for the team, and identifies issues at the earliest. Jenkins is a highly configurable system and also supports a wide range of plug-ins.

Jenkins Advantages

  • Builds can be configured to run periodically
  • Once a project is successfully created in Jenkins, all future builds are automatic
  • Jenkins comes with basic reporting features, i.e. keeping track of build status, last success/failure, and so forth.
  • Deploys code automatically, with no manual developer builds
  • Generates test reports
  • Notifies stakeholders of build status
  • A large number of plugins are supported by Jenkins

Automated Continuous Regression Testing

Regression testing becomes a challenge if defects are not found in the initial stages. A major concern when developing new software features is that another part of the code will be affected in unexpected ways. With a typical development process, testers often do not get time to execute a full set of regression tests until late in the release, when it is much more costly to fix and retest the product. Continuous integration pairs continuous builds with test automation to ensure that each build also assesses the quality of the codebase. Continuous regression runs daily in the background so that issues caused by new commits can be identified early.

What are the best practices for Automated Regression Testing?

How it all Began?

The period since the industrial revolution (1820) is referred to by scientists as the Anthropocene, also known as the Human Age. The significance of this period is the automation of manual tasks using machines. Automation has helped mankind become more productive, self-sufficient, and innovative.

The onset of the Anthropocene era saw machines being used for the manufacture of textiles, paper, cement, chemicals, glass, steam engines, and automobiles, and also for agriculture. Today, you hardly find a household that doesn’t use technology or machines.

Since the advent of computers in the 20th century, software has empowered humans to eliminate manual processes by introducing automation tools wherever applicable. And that effort is still ongoing.

Software development and testing

The software development lifecycle involves four important phases: requirement gathering, architecture and design, core development, and testing. Software testing validates the functional and non-functional capabilities of the software. Today, software testing is still largely done manually, by one or more people depending on the size and complexity of the software.

Automated regression testing

During the initial stages of a software product’s life cycle, it is valuable to have humans manually validate the software’s capability and usability. However, as the software matures, it becomes a challenge in terms of time and cost to validate all aspects of it manually. For a tester, executing hundreds of test cases for every major enhancement becomes monotonous and error-prone. So it is essential to automate the tasks done by the tester to improve quality and thereby save time and cost. This process is referred to as automated regression testing.

During software development, it is essential that the test team builds a regression suite from the very initial stages. This effort pays high dividends during the later stages of development and during software maintenance.

4 essential ingredients for an automated regression testing suite 

  1. A test database/environment on which the test scripts are executed
  2. Test scripts that translate the regression test cases and carry expected results
  3. A test report that compares expected and actual results
  4. A report that highlights the code covered by the regression suite

The test database holds the predefined data required for the software to run in a stable manner and for the test scripts to execute successfully. The test team has to continuously update the test database to reflect changes to the software.

Test scripts translate the hand-written test cases into re-executable scripts. There are multiple tools that help in writing test scripts; popular ones are QTP, WinRunner, Selenium, Watir, SOAtest, Testing Anywhere, etc. Selenium and Watir are open source and help in building your own framework.

Every test script should have an assert statement to validate that the expected result and the actual result match. If they match, the test is a success; otherwise, it is a failure. At the end of the regression test suite execution, the test or project manager should review the report, ascertain the quality of the software, and take appropriate action.
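
For illustration, here is a minimal re-executable test script using the Selenium Python bindings with an assert statement; the URL, expected title, and element locator are hypothetical placeholders, and a local WebDriver is assumed to be available.

```python
# A minimal regression test script sketch: open a page and assert that the
# actual result matches the expected result. URL and locators are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_login_page_title():
    driver = webdriver.Chrome()                    # assumes a local ChromeDriver
    try:
        driver.get("https://example.com/login")    # hypothetical application URL
        assert driver.title == "Login", f"Unexpected page title: {driver.title}"
        assert driver.find_element(By.ID, "username").is_displayed()
    finally:
        driver.quit()

if __name__ == "__main__":
    test_login_page_title()
    print("PASS")
```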

Validate test sufficiency:

The regression test suite should be constantly enhanced to reflect changes to the software. Before enhancing it, one needs to know whether the test scripts or test plans sufficiently cover all the modules of the software. Code coverage tools help in assessing the sufficiency of the test scripts. Some of the popular code coverage tools are:

For Java:

a)   EMMA (http://emma.sourceforge.net)

b)   Hansel (http://hansel.sourceforge.net)

c)   jCoverage (http://www.jcoverage.com)

d)   Cobertura (http://cobertura.sourceforge.net)

For .NET:

a)   OpenCover (https://github.com/sawilde/opencover)

b)   Clover.NET (http://www.cenqua.com/clover.net)

c)   NCover (http://ncover.sourceforge.net)

For PHP:

a)   PHP Test Coverage Tool

Our software testing consulting and automation testing services take all of the above factors into account. We are among the few automation testing companies that provide detailed reporting to validate testing sufficiency. Here’s a sample report.

Ensuring maximum test coverage and managing timeline – Software Quality Assurance

What is Software Quality Assurance?

“Quality Assurance” is the calling card of a software testing services provider. No matter how robust a software application is, its failure to perform at a critical moment can prove fatal for clients. History shows that enterprises often spend more on fixing bugs than on developing the software applications themselves.

Though an in-depth analysis of quality is outside the scope of this post, I will give a round-up of “Quality Assurance”, focusing on ensuring maximum test coverage as a means of achieving sustained client relationships.

One of the common challenges for a software testing team is to deliver projects within the time frame while still ensuring maximum test coverage. Software testing is not a one-size-fits-all solution: testers have to drill deeper to uncover defects that can damage the quality of an application. A ready-made solution is not available; we have to arrive at one based on the project’s circumstances and needs.

Before setting out a solution, let us explore some of the possible risks/outcomes of not meeting the time frame or achieving maximum coverage:

a) Time-frame overshoot: delayed delivery, client dissatisfaction, extra cost, extended time to market

b) Inadequate coverage: no confidence in product quality, a buggy product, user dissatisfaction, extra cost in fixing issues and re-releasing the product, etc.

“Time limit” is a management concern, but from a tester’s perspective, we run on a pretty tight schedule and have to focus primarily on ensuring maximum coverage within the prescribed time limit. We have to ensure the application performs its functions time after time. To achieve this, we can develop a few handy utilities/practices like the ones described below:

1) Develop a complete test suite, which includes:

  1. Test cases for all functionality of the application
  2. Critical functionality of the application identified and filterable
  3. Test cases prioritized as high/medium/low

[Benefits]: Allows you to “pick and choose” test cases for execution (a minimal sketch follows this list) based on:

  1. Critical module/function
  2. High Priority test cases
  3. Regression testing cases based on Change/Bug Fix in the current build
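
Here is a minimal sketch of that pick-and-choose approach, assuming pytest as the test runner; tests are tagged with priority and criticality markers so subsets can be selected at execution time. The marker names and test functions are illustrative, with simple stubs standing in for real application calls.

```python
# A minimal sketch of priority-based test selection with pytest markers.
# The stub functions stand in for real application calls so the file runs as-is.
import pytest

def process_payment(order_id: int) -> str:   # hypothetical application call
    return "SUCCESS"

def upload_avatar(filename: str) -> bool:    # hypothetical application call
    return True

@pytest.mark.critical
@pytest.mark.priority_high
def test_checkout_payment():
    assert process_payment(order_id=42) == "SUCCESS"

@pytest.mark.priority_low
def test_profile_avatar_upload():
    assert upload_avatar("avatar.png") is True
```

Running `pytest -m "critical or priority_high"` would then execute only the high-value subset; the custom markers would be registered in `pytest.ini` to avoid warnings.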

2) Optimize test scenarios:

  1. Eliminate repetitive and non-valuable tests that may yield the same results as other tests in the suite.
  2. Effective Coverage: Focus on combination testing using orthogonal array-based mathematical model tools like All Pairs, PICT, etc., and Decision Tables.

[Benefits]: Provides a minimal set of scenarios that covers all pairwise combinations of parameter values, non-repetitive and valuable in predicting results, as sketched below.
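
Here is a minimal sketch of pairwise combination testing, assuming the open-source allpairspy package (one Python implementation of the All Pairs technique mentioned above); the parameter values are illustrative.

```python
# A minimal pairwise-combination sketch: exhaustive testing of these parameters
# needs 3 x 3 x 3 = 27 scenarios, while pairwise generation covers every pair
# of values in far fewer scenarios. Parameter values are illustrative.
from itertools import product
from allpairspy import AllPairs

parameters = [
    ["Chrome", "Firefox", "Edge"],     # browser
    ["Windows", "macOS", "Linux"],     # operating system
    ["Guest", "Registered", "Admin"],  # user role
]

print("Exhaustive combinations:", len(list(product(*parameters))))

for i, pairing in enumerate(AllPairs(parameters), start=1):
    print(i, pairing)
```

Microsoft’s PICT tool produces a similar reduction from a model file on the command line.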

3) Automated Regression Suite – Automate all possible regression test cases

[Benefits]: Ensures execution and coverage of all mandatory flows without much manual intervention, even during crunch situations

4) Focused Testing: Apart from the above, when the deadline is very tight, it is preferable to focus the testing effort on critical areas such as:

  1. Areas that may be vulnerable to changes and tend to yield more bugs
  2. Areas that have yielded more bugs in the past
  3. Areas that are exposed to the end user or are user-critical

Summary:

All of the above utilities, applied together, provide a cohesive framework that guides a tester or development team to maximize test coverage and achieve greater quality within the stipulated time limit. However, success also depends on the tester’s knack for choosing the minimal set of test cases that still yields the insights needed to solve similar issues.

In the process, if the tester over-emphasizes minimizing the test cases to meet the time limit, or if the area of focus is incorrect, coverage may suffer and lead to under-testing. Hence the tester should strike a wise balance, choosing the right focus areas and the right test cases for the build and planning to execute them within the given time frame. If the tester feels that the selected/shortlisted test cases cannot be run in the given time frame, it is always advisable to ask for more time. If that is not possible, it is prudent to alert the stakeholders about which test cases have been skipped and which modules are under-tested.

Ignore “Quality Assurance” in your services road-map and you do so at your own peril.