Tuesday, January 31, 2017

From Requirements and Design to Specifications and Programs

People ask what the greatest benefit has been of having done both software QA and web development. They assume I release fewer bugs than other developers, or that my code is perfect on the first attempt. Maybe my unit tests just write themselves magically. True as these may be (they aren’t), the biggest benefit of having had my feet in both worlds is my ability to consider the code I write from the perspective of someone who doesn’t know how the code was written. It’s my ability to think like a user. In practice, this boils down to the difference between two software development tools: user requirements and technical specifications.

Knowing what separates a list of requirements from a list of specifications requires that we put on a few different hats. Requirements can be said to describe relationships in some environment; an environment being a browser running a website, for example. Specifications then would describe the way some program that we’ve built manipulates this environment. Specifications are going to be pretty comprehensive. Indeed, we can think of a program as the same as a list of specifications, only now all the items in the list will be executable. In this article I’ll describe the process of deriving specifications from requirements. This document borrows heavily from the paper “Deriving Specifications from Requirements: an Example” by Michael Jackson and Pamela Zave.

In order to define this process, we’re going to need to lay some groundwork and define some of its parts. I mentioned environments in the preceding paragraph. What is an environment?

I, being a web developer, typically build programs which have to function as part of some web application running in a browser on either a desktop machine or mobile device. These things - an existing web application, a web browser, some hardware to run it all - are parts of an environment. In addition to these technological components of the environment, there are a couple of other very important components: the user, and time.

Into this environment I want to build and install some new program, or as Jackson and Zave call it, a machine. The job of a machine is to act upon the environment in such a way that the states in the list of requirements all come to exist. The machine is a wish fulfiller.

This is all well and good, but how then does a programmer take a list of requirements and use it to build some machine? Let’s take a look at an example. Suppose we have a web application that already exists. Our users have asked us to install a new button in the application. The button in this case will, when pushed, cause a modal popup to appear on the page. Printed in this modal will be the number of times the button has been pushed. The modal will have a little red “X” button that, when clicked, will close the modal. Thus ends our requirements for a new button.

This is fantastic. I love buttons.

Note how every component in the requirements describes some condition in the environment, or some relationship between things in the environment. There are user-initiated events. The descriptions refer to things that will be apparent whether or not our user knows how the machine is accomplishing them.

The job of knowing how a machine will cause these relationships to exist is ours. Let’s start by listing all the apparent events that can be performed using this machine:

| Event | Event Name |
| --- | --- |
| User clicks button | ClickButton |
| App adds 1 to # of times button has been clicked | IterateClicks |
| App loads modal with iterated number of button clicks | LoadModal |
| User clicks “X” to close modal | ClickClose |
| App closes modal | CloseModal |

Nice. These are listed chronologically: each event follows the one that immediately precedes it. Notice that not every event results in one of the observable phenomena described in the requirements. The event IterateClicks, strictly speaking, doesn’t produce anything observable in the environment. Sure, it’s a necessary step for the following events to function as required; it just isn’t shared between the machine and the environment. All the rest of the events in the list produce shared phenomena. This separation of interests is a key difference between user requirements and program specifications. A user might say “I don’t care how you do it, just make sure the number in that modal is correct every time.” Aye aye, intrepid user.

Whether a phenomenon is shared between program and environment is one thing, but it doesn’t describe who or what controls that phenomenon. For example, a button click is shared because the user does it and the program receives it, but only the user can control whether the button has been clicked. Likewise, the modal popping up happens in response to the events that precede it, but the enforcement of that logic can only be controlled by the program, and not by the user. Any phenomenon can only be spontaneously controlled by either something in the environment (a user clicks a thing), or by something within the machine (a preceding event demands the next action be performed).

Let’s update our list of events with their shared status and their controlling actors:

| Event | Event Name | Shared? | Controller |
| --- | --- | --- | --- |
| User clicks button | ClickButton | Yes | User |
| App adds 1 to # of times button has been clicked | IterateClicks | No | Program |
| App loads modal | LoadModal | Yes | Program |
| User clicks “X” to close modal | ClickClose | Yes | User |
| App closes modal | CloseModal | Yes | Program |

Now we can see the benefit of thinking like a user in addition to thinking like a programmer. Good requirements explain every phenomenon a user would expect to witness in the given environment, but they wouldn’t have anything to say about program-controlled, unshared phenomena. In fact, a list of specifications is a superset of requirements. It’ll contain all the relationships between phenomena in the environment (all the stuff a user sees and does on a page), and it’ll also describe the behavior of the machine within the environment (how the new code interfaces with the environment).
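
To make the superset idea concrete, here is a minimal sketch of the event table as data. The names `EventCatalog`, `Event`, and `Controller` are my own inventions for illustration, not anything from Jackson and Zave. It assumes that "what the requirements talk about" can be approximated by filtering the specification's events down to the shared ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout: this only mirrors the event table above.
final class EventCatalog {
    enum Controller { USER, PROGRAM }

    enum Event {
        CLICK_BUTTON(true, Controller.USER),
        ITERATE_CLICKS(false, Controller.PROGRAM),
        LOAD_MODAL(true, Controller.PROGRAM),
        CLICK_CLOSE(true, Controller.USER),
        CLOSE_MODAL(true, Controller.PROGRAM);

        final boolean shared;         // visible to both machine and environment?
        final Controller controller;  // who can make it happen?

        Event(boolean shared, Controller controller) {
            this.shared = shared;
            this.controller = controller;
        }
    }

    // The specification covers every event, shared or not.
    static List<Event> specificationEvents() {
        return List.of(Event.values());
    }

    // Requirements only speak about phenomena the environment can observe.
    static List<Event> requirementEvents() {
        List<Event> visible = new ArrayList<>();
        for (Event event : Event.values()) {
            if (event.shared) {
                visible.add(event);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("Specification: " + specificationEvents());
        System.out.println("Requirements:  " + requirementEvents());
    }
}
```

In this sketch IterateClicks appears only in the specification list, which is the whole point: the user never asked for a counter, only for a correct number in the modal.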

At a high level, the task of deriving specifications from some given requirements is to list all the events kicked off from the environment, and insert all the behind-the-scenes events that the program will have to perform in order to make each next phenomenon a reality.

There are gotchas. What we’ve described is one path through this new button experience; what some might call the “happy path.” What happens when there’s a modal already loaded and the user clicks that button again? To specify the outcomes for unexpected events like this, we need to talk about the state of our environment.

A state is a collection of phenomena described in user requirements. What kind of states can the environment be in with regard to our new button-modal functionality? This can be determined by examining the environment-controlled events in our list of requirements. We have two: ClickButton and ClickClose. We can determine future state from an event by hopping down the list of all the program-controlled events that are preceded by the given environment-controlled event. For example, a user performing ClickButton causes the program to perform IterateClicks, and then LoadModal. After that, time passes until the user or environment performs some other action upon our program. This is one state of our program. Number has been iterated, modal has been loaded. There is one other state, triggered by a user’s ClickClose action: modal has been closed. These two states fulfill our simple requirements.
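
The "hopping down the list" step can be sketched in a few lines. This is only an illustration under my own assumptions: the class `Reactions` and its `Step` type are hypothetical, and the event order is the happy path from the table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups program-controlled events under the user event that triggers them.
final class Reactions {
    static final class Step {
        final String name;
        final boolean userControlled;

        Step(String name, boolean userControlled) {
            this.name = name;
            this.userControlled = userControlled;
        }
    }

    // The happy path, in chronological order.
    static final List<Step> HAPPY_PATH = List.of(
            new Step("ClickButton", true),
            new Step("IterateClicks", false),
            new Step("LoadModal", false),
            new Step("ClickClose", true),
            new Step("CloseModal", false));

    // For each user-controlled event, collect the program-controlled events
    // that follow it before the next user-controlled event.
    static Map<String, List<String>> reactionsByTrigger(List<Step> steps) {
        Map<String, List<String>> reactions = new LinkedHashMap<>();
        List<String> current = null;
        for (Step step : steps) {
            if (step.userControlled) {
                current = new ArrayList<>();
                reactions.put(step.name, current);
            } else if (current != null) {
                current.add(step.name);
            }
        }
        return reactions;
    }

    public static void main(String[] args) {
        // prints {ClickButton=[IterateClicks, LoadModal], ClickClose=[CloseModal]}
        System.out.println(reactionsByTrigger(HAPPY_PATH));
    }
}
```

Each key in the resulting map is an environment-controlled event, and the list behind it is what the program does before things settle into the next state.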

In a perfect world, we can assume that the requirements are comprehensive. The “open” button and the “close” button do what they do, and that’s all that they should do. Let’s dwell in this perfect world a while.

The requirements give us the two states of this program. Our needing to lock the “open” button while the modal is loaded need not be in the requirements, but it does need to be in our specifications. Remember that requirements are a complete description of phenomena in the environment - all states are described. Specifications, being a superset, are the states described in requirements, plus the additional means to get there. Locking the “open” button while the modal is loaded should therefore be added to our specifications in order to fulfill the requirements.

| Event | Event Name | Shared? | Controller |
| --- | --- | --- | --- |
| User clicks button | ClickButton | Yes | User |
| App adds 1 to # of times button has been clicked | IterateClicks | No | Program |
| App locks “open” button | LockButton | Yes | Program |
| App loads modal | LoadModal | Yes | Program |
| User clicks “X” to close modal | ClickClose | Yes | User |
| App unlocks “open” button | UnlockButton | Yes | Program |
| App closes modal | CloseModal | Yes | Program |
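
Here is one way the locked-button specification might look as a tiny machine. It is a sketch, not the real application: `ButtonModalMachine` is a made-up class, and I’m assuming that a click on a locked button is simply ignored and does not count toward the total.

```java
// Hypothetical machine for the button-modal example; no real UI is involved.
final class ButtonModalMachine {
    private int clickCount = 0;            // unshared: only the program sees the counter itself
    private boolean buttonLocked = false;
    private boolean modalOpen = false;

    // ClickButton: controlled by the user, received by the machine.
    void clickButton() {
        if (buttonLocked) {
            return;              // assumption: clicks while the modal is loaded are ignored
        }
        clickCount++;            // IterateClicks
        buttonLocked = true;     // LockButton
        modalOpen = true;        // LoadModal
    }

    // ClickClose: controlled by the user, received by the machine.
    void clickClose() {
        if (!modalOpen) {
            return;
        }
        buttonLocked = false;    // UnlockButton
        modalOpen = false;       // CloseModal
    }

    boolean isModalOpen() {
        return modalOpen;
    }

    // The number the modal would display.
    int displayedCount() {
        return clickCount;
    }

    public static void main(String[] args) {
        ButtonModalMachine machine = new ButtonModalMachine();
        machine.clickButton();
        machine.clickButton();   // ignored: the button is locked while the modal is open
        System.out.println(machine.displayedCount()); // prints 1
    }
}
```

Whether an ignored click should still be counted is exactly the kind of question that sends you back to the people who wrote the requirements.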

I think we’ve explored the outcome of every environment-provided event in our requirements. But we’re not done yet! In most cases the environment where we’re installing our new program is going to provide plenty of other events that might affect our state. What if the browser is minimized? What if JavaScript is disabled? What if everything needs to be controlled by a keyboard instead of a mouse or touchscreen? Outcomes branching from existing events available through the environment will need to be specified in order for our machine to diligently create the states described in our requirements. Exploring these paths often leads to necessary changes in the requirements themselves.

Think like a user. A user needs the machine that you’ve built to deliver some expected results. A user does not necessarily need to see the gears grinding in order to deliver those results. I, in my role as gear-grinder, find it helpful to remind myself every now and then that the machine is not the outcome of my efforts. The user’s environment is.

Monday, January 30, 2017

To Test a Website, You Have To Rebuild Its Interface

If you’re automating tests, then you’re a programmer. What you’re programming is an interface to the application under test.

Lots of people start software test automation the same way — write a test that runs and stands on its own and lets you know whether X thing on a website, given an input of Y, yields Z. Robot type ‘hello’ into Google, robot click button, robot verify results. It’s all right there: the input data, the paths to the site’s elements, the test assertions, all in one script, and life is good for that test case. It’s a tidy little monolith.

If your company desperately needed to know at all times that clicking the ‘Go’ button on Google.com returned search results, you’d be sitting pretty. More likely your company has tasked someone (maybe you) with authoring a mountain of test cases, organized into test suites, to verify all the features of whatever application they’re launching or maintaining. If you start like I started, you automate a handful of these cases in the same way — tiny, monolithic scripts that work — and set each aside, but it starts catching up with you.

You start wrestling with inefficiencies that “real programmers” have been ironing out for years. You’re copying and pasting code for common functions within the application under test. Your UX guy changes the CSS a little and half your scripts can’t find the login text field the following morning. You’re in maintenance hell with your little stable of scripts and you haven’t even automated the smoke test suite yet. What is a test engineer to do?

Don’t panic. What you need is a dose of abstraction.

Consider that any test script you write should be able to be executed on its own, apart from all the other scripts you've written. If your robot needs to sign in to test some protected capability, then signing in should be a component of the test script for that capability. This is important because you may end up realizing that 98.8% of your test cases require the user to be signed in, or to have uploaded a file, or to have accomplished any one of a thousand simple primary actions. Here’s the thing: when the developers of the application wrote that sign-in function, they wrote it once and they put it somewhere where other components of the application could access it. Signing in was such a common use case that to repeat it everywhere that it was needed in the code would have been folly. 

You should follow suit.

You’re a programmer, and you need to learn how to set up functions in your language of choice. It should happen in such a way that wherever your monolithic test scripts have the four or five lines that you've pasted everywhere that tell your robot to sign in with username “foo” and password “bar” they can instead have one line that sends these values to the function and gets back the access it needs:

sign_in("foo", "bar");

…and elsewhere:

sub sign_in(string username, string password) {
    return "success";
}

This is an evolution. This is your code getting mature.

This is the beginning of a design pattern called Page Objects. In a nutshell, all the capabilities that can be performed by the application you’re testing should be in functions abstracted away from your test scripts. Any part of your test scripts that returns actual pass/fail values should be kept out of those functions. And the easiest way to organize your functions is by grouping them so that capabilities accomplished on the same page (or frequently used component) in the application under test are packaged together in your functional interface. So our friend sub sign_in() up there might be in a whole other file called Login_Page.xx, saved in a place where your test scripts can find it.
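
Here is a rough sketch of that split in Java. Everything in it is hypothetical: the `Browser` interface is a stand-in I made up so the example runs without a real driver such as Selenium, the selectors are invented, and `FakeBrowser` only pretends to be an application that accepts “foo” and “bar”.

```java
import java.util.HashMap;
import java.util.Map;

final class PageObjectSketch {
    // Hypothetical stand-in for a real browser driver.
    interface Browser {
        void type(String selector, String text);
        void click(String selector);
        String textOf(String selector);
    }

    // Page object: knows how to sign in, makes no pass/fail decisions.
    static final class LoginPage {
        // Selectors live in one place; these values are invented for the sketch.
        private static final String USERNAME_FIELD = "#username";
        private static final String PASSWORD_FIELD = "#password";
        private static final String SUBMIT_BUTTON = "#sign-in";
        private static final String STATUS_BANNER = "#status";

        private final Browser browser;

        LoginPage(Browser browser) {
            this.browser = browser;
        }

        String signIn(String username, String password) {
            browser.type(USERNAME_FIELD, username);
            browser.type(PASSWORD_FIELD, password);
            browser.click(SUBMIT_BUTTON);
            return browser.textOf(STATUS_BANNER);
        }
    }

    // Fake browser so the sketch runs without a real application.
    static final class FakeBrowser implements Browser {
        private final Map<String, String> fields = new HashMap<>();
        private String status = "";

        public void type(String selector, String text) {
            fields.put(selector, text);
        }

        public void click(String selector) {
            boolean valid = "foo".equals(fields.get("#username"))
                    && "bar".equals(fields.get("#password"));
            status = valid ? "success" : "failure";
        }

        public String textOf(String selector) {
            return status;
        }
    }

    public static void main(String[] args) {
        // The test script owns the assertion; the page object owns the typing and clicking.
        String result = new LoginPage(new FakeBrowser()).signIn("foo", "bar");
        if (!"success".equals(result)) {
            throw new AssertionError("sign-in failed: " + result);
        }
        System.out.println("sign-in test passed");
    }
}
```

If the login field is renamed, only the constant in `LoginPage` changes; every test script that calls `signIn` stays as it was.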

It can seem overwhelming. How will you know which capabilities to program into your interface? The application you’re testing has so many capabilities that it seems you could write interfaces for them all year and never write a single actual test against any of them. The secret is to write the test first.

This is the beginning of a concept called Test-Driven Development, or TDD. To wade backward into the generation of our sign_in test above, you would have written the test script, that first chunk, first. You would have then noticed that sub sign_in() doesn’t exist anywhere in the interface you’re building, and so you would go build that second chunk. And you will do the same for every other function needed by your test scripts to interact with the application under test. The advantage is you’ve already got your test cases written. You know what needs to be accomplished. You can write these as test scripts and fill in the blanks in your interface as you go. At any given time your interface will have exactly as much functionality as it needs to take on what your test scripts throw at it and interact with the application you’re testing.

If any of your tests need you to sign in, you will have a function that does it. And if your UX guy changes the name of the login text field, you will only have to change your code in one place. You’re a programmer — spend less time maintaining your code and more time writing programs that will help your team deliver a quality application.

Tuesday, January 10, 2017

Parallel Web Testing With A Queue Of Robot Users

I wanted to use Selenium to test some capabilities of this web application my team was building, so I wrote a lot of tests in Java and started running them. Okay, great, if any of these capabilities being tested break in the course of development, the robots will find out about it and report back to us. Nothing new here; that’s the nature of test automation. The same old story usually continues down a path like “These tests are great, but what can we do to optimize how they run?” For example if we want our full suite of tests to go through an end-to-end run in less time, we’ll consider running tests in parallel. And so that’s what we did.
There are a thousand and twelve benefits to running your UI tests in parallel, and it feels like there are just as many pitfalls. I’d like to address one of the pitfalls I encountered, and that was dealing with our application’s session management. This web application requires its users to log in in order to access any of the good test-worthy capabilities. When a single user logs in on Computer A, a session is created for that user — this is so all the parts of the application know that this user guy is authorized to see and use all these protected capabilities, at least until the session ends or expires. But then if the same user logs in on Computer B, a new session is created for that user and that first session is ended. So back on Computer A, the user suddenly loses access to all these protected capabilities. The end result is if we wanted to test two capabilities on different machines at the same time, we were going to need to do it with two different users. Substitute “x” for “two” where “x” is the number of tests we wanted to run in parallel, and you see the kind of problem we were trying to solve.
Here’s how we did it.
The plan was to set up a queue of valid test accounts. When a test script wanted to log into our application, rather than using the same “userX” login as every other script, it would grab the first test username in line, removing it from the queue. That way when the next test script wanted to grab a username of its own, it would get the new first-in-line username, every script would have its own unique test user, and it would be all smiles and magic and butterscotch valleys. When a test script was finished it would add its username to the back of the queue so some future script could reuse it.
First, we registered an army of test accounts in our application, with usernames in a series like this:
robot1, robot2, robot3 .. robot39, robot40
Since we have 40 robots, we can run up to 40 tests at a time without running out of valid usernames to use. For us, that’s plenty.
Now, before we set this up we were passing the username as a straight up string for Selenium to plop into the “login” field of our application:
String APP_USERNAME = "userX";
and then…
What we wanted to do was populate that value with a String popped off our queue:
APP_USERNAME = RoundRobin.getAppUsername(appUsernames);
and elsewhere:
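import java.util.LinkedList;
import java.util.Queue;
public class RoundRobin {
    private static Queue<String> appUsernames = new LinkedList<>();
    // Fill the queue with robot1 through robot40 and hand it back.
    public static Queue<String> setRoundRobin() {
        for (int i = 1; i <= 40; i = i + 1) {
            String thisSuffix = Integer.toString(i);
            String thisUsername = String.format("robot%s", thisSuffix);
            appUsernames.add(thisUsername);
        }
        return appUsernames;
    }
    // Take the username at the front of the line.
    public static String getAppUsername(Queue<String> appUsernames) {
        return appUsernames.remove();
    }
}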
In RoundRobin, we’re setting up the Queue of appUsernames which will live in this class and will hold those 40 Strings, valid login values for our application. When we want to populate that Queue, before we start running all our tests, we have the setup class call setRoundRobin, which fills it up with Strings and returns the whole object so the setup class knows the deed is done. Here’s what the setup class looks like:
public class SuiteSetup {
    protected static Queue<String> appUsernames;
    @BeforeSuite(alwaysRun = true)
    public void loadRoundRobin() throws InterruptedException {
        appUsernames = RoundRobin.setRoundRobin();
    }
}
@BeforeSuite is a TestNG annotation that tells TestNG to run this method before any tests in the suite are run. In order for the test classes to know that this is living out there, expecting to be run first, we have to make sure that all our test classes extend this setup class, like this:
public class VideoInProgressIndicatorTest extends SuiteSetup {
    private static String APP_USERNAME;
    private static WebDriver driver;
    private static void startDriver() throws IOException {
        APP_USERNAME = RoundRobin.getAppUsername(appUsernames);
        // ...
    }
    private static void verifyInProgressIndicator() {
        // ...
    }
    private void teardown() {
        RoundRobin.returnAppUsername(APP_USERNAME);
    }
}
You’ll notice the ‘returnAppUsername’ method used in teardown — that’s what adds the username String to the back of the line for re-use. In RoundRobin it looks like this:
public static boolean returnAppUsername(String returningUsername) {
    return appUsernames.add(returningUsername);
}
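One caveat if your tests run as parallel threads inside a single JVM: a plain LinkedList is not thread-safe, and remove() throws NoSuchElementException when the queue is empty. Below is a minimal sketch of a variant built on java.util.concurrent. It is not what we shipped; the class name RobotPool and its method names are made up for illustration, and the timeout value is arbitrary.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical thread-safe variant of the username queue.
final class RobotPool {
    private final BlockingQueue<String> usernames = new LinkedBlockingQueue<>();

    // Registers robot1 through robot<size>.
    RobotPool(int size) {
        for (int i = 1; i <= size; i++) {
            usernames.add("robot" + i);
        }
    }

    // Waits up to the timeout for a free username instead of failing immediately.
    String checkOut(long timeout, TimeUnit unit) {
        try {
            String username = usernames.poll(timeout, unit);
            if (username == null) {
                throw new IllegalStateException("no free test account within the timeout");
            }
            return username;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a test account", e);
        }
    }

    // Puts the username at the back of the line for re-use.
    void checkIn(String username) {
        usernames.add(username);
    }

    int available() {
        return usernames.size();
    }

    public static void main(String[] args) {
        RobotPool pool = new RobotPool(40);
        String username = pool.checkOut(30, TimeUnit.SECONDS);
        try {
            System.out.println("testing as " + username);
        } finally {
            pool.checkIn(username); // hand the account back even if the test throws
        }
    }
}
```

The try/finally in main is the part worth copying: a test that fails before teardown would otherwise walk off with its username for the rest of the run.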
And there you have it. With this little trick we’ve been able to winnow down the number of pitfalls related to parallel testing to a lean one thousand and eleven.