The Effects of Bad UAT Environments

API Integration Project Nightmare

I was recently faced with an API integration project that turned out to be a nightmare. This project made me sit down and seriously think about how I assess integration projects and the kinds of questions I should ask when building requirements documents or making estimates.

The Problem with Downstream Links in UAT Environments

In this project, the software I was responsible for acted as a "fire brigade line" for data from external parties. We would pull data from Party A, pass it along, and then push it to Party B. However, neither Party A nor Party B had a UAT environment that was free of downstream links.

This meant that when we requested data from Party A, they reached out to a Party A2 to get it, even within their own UAT environment. Similarly, when we pushed data to Party B, they were not the final consumer of the data, but instead passed it along to a Party B2 for processing and confirmation. In both cases, there was no true, self-contained UAT environment.

Party A2 --|pull|-- Party A --|pull|-- ME --|push|-- Party B --|push|-- Party B2

Lack of Separate Certification

Unfortunately, due to poor technical project management, there was no consideration of certifying the interaction with Party A or Party B separately. As a result, the only testing and certification that were done required the proper functioning of UAT environments controlled by four different companies beyond my own.

This led to a repeating cycle of the following frustrations:

Party B expected specific scenarios to be run, but had to request data from Party B2 that would fulfill those scenarios.
The data from Party B2 then had to be passed by Party A to Party A2 and loaded into their systems before it could be retrieved.
Party A wanted to test My Software with multiple Party A2s, as they were technically connecting to multiple remote entities with multiple methods.
Party B's link to Party B2 went down in the middle of multiple certification attempts, halting all work.

The result of this convoluted web was multiple testing and certification attempts that simply could not be finished because the sample data was not correct. This led to days or weeks of waiting while new data was acquired and passed to the entities that had to load it for the next testing attempt.

The Correct Behavior

In an ideal scenario, certifying an external party API should only involve the interaction between our system and their system. If Party B cannot communicate successfully with Party B2, that is not our problem and should not have any relevance to our certification process. Likewise with Party A: the variety, style, and reliability of their connection to Party A2 is irrelevant to the certification of My Software.
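
One way to picture this isolation is a stand-in for Party B that checks only the contract between our system and theirs and never calls Party B2. The following is a minimal Python sketch under stated assumptions: the names `StubPartyB` and `relay`, the `id` field, and the status strings are all hypothetical and are not taken from the actual project or from any party's real API.

```python
from dataclasses import dataclass, field
from typing import Protocol


class PartyBClient(Protocol):
    """Hypothetical interface: the only part of Party B our software sees."""

    def push(self, record: dict) -> str: ...


@dataclass
class StubPartyB:
    """Stand-in for Party B that never reaches out to Party B2."""

    received: list = field(default_factory=list)

    def push(self, record: dict) -> str:
        # Remember what was pushed so a test can inspect it later.
        self.received.append(record)
        # Check only the contract between us and Party B (assumed here
        # to be "every record carries an id"), nothing downstream.
        if "id" not in record:
            return "REJECTED"
        return "ACCEPTED"


def relay(records: list, party_b: PartyBClient) -> list:
    """Push each record pulled from Party A to Party B; return the statuses."""
    return [party_b.push(record) for record in records]
```

With a stand-in like this, a call such as `relay([{"id": 1}], StubPartyB())` exercises our side of the interface whether or not Party B's link to Party B2 is up.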

But Wait, It Gets Worse!

Not only did we have to deal with "transitive API dependencies", but the ultimate consumer of the data, Party B2, was also replicating real-world data for its test cases. This meant that any case we ran through it could not be used again: it was consumed!

This made it impossible for us to have integration or regression tests for this API interface, because there was no set of parameters that we could run repeatedly and receive a consistent response.

Even during implementation, if we managed to achieve a successful result with one set of data, we could never replicate it exactly. We would have to keep trying additional sets of data that MIGHT give us the same response scenario.

The Correct Behavior

Party B should have specific test cases with specific inputs and outputs, so that those of us developing against their API can confidently test every scenario we will be certified on.

This is particularly maddening because Party B is the entity performing the certification and requiring these scenarios, yet it offers no reliable way to reproduce them.
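
As a rough illustration of what "specific inputs and outputs" could look like, here is a minimal Python sketch of a sandbox that maps fixed test references to fixed responses. Everything in it is an assumption made for illustration: the reference strings, the response fields, and the function name `sandbox_response` are hypothetical and do not describe Party B's real API.

```python
# Hypothetical scenario table: each fixed input maps to one fixed output.
SCENARIOS = {
    "TEST-APPROVED": {"status": "approved", "code": 200},
    "TEST-DECLINED": {"status": "declined", "code": 402},
    "TEST-TIMEOUT": {"status": "downstream_timeout", "code": 504},
}


def sandbox_response(reference: str) -> dict:
    """Return the canned response for a test reference.

    The lookup does not modify the table, so a scenario is never
    "consumed" and the same call can be repeated indefinitely.
    """
    try:
        # Return a copy so callers cannot alter the fixture itself.
        return dict(SCENARIOS[reference])
    except KeyError:
        raise ValueError(f"unknown test reference: {reference}") from None
```

Because the same reference always yields the same response, a scenario that passed once can be kept as a regression test and rerun before every certification attempt.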

Conclusion

This is sadly a common occurrence when you have non-technical organizations trying to solve problems with software without having proper software engineering disciplines in place. You end up with a mess of glued-together pieces in a rush to just get the work done, without any consideration for the repeated cost of these certifications in time and resources.

Much like with technical debt, someone within the company has to recognize the value of investing time in a proper UAT environment. What happens instead is that the cost is paid incrementally, in time and resources, on every client implementation.

A testing-first approach should always be taken if you or your company is going to require third parties to pass a certification in order to use your services. Without it, you are adding unnecessary friction for the very parties your business relies on to make a profit.
