POM or TAP design pattern for test automation using Selenium WebDriver


August 21, Monday, 2017

I wrote about this topic in Bangla in my previous blog post, mainly as a note to myself and, of course, to share the idea with my people (who speak the same language). Now we have automation training going on at work, and it seems that the same note in English would help the trainees as well. So, here's the English translation.

But before I get into the details, let me acknowledge and thank the two main architects behind the project that implemented the POM (or TAP, as we call it) design pattern for test automation at work - Josh Allen and Mike Cook. Thank you guys!! I learned a lot working with the awesome framework!

Since this post should serve as a note, I'll try to be brief and to the point, intentionally ignoring some details. But as you read through, please click on the images with code examples to enlarge for better viewing. So, let's get started.

TAP stands for Test Action Page. It was also named with Test Automation Platform in mind. As the name suggests, the framework mainly focuses on three parts:
  1. Test - defines the high-level test plan, kind of like the manager's view.
  2. Action - the brain of the test. It contains the logic and control flow of the test plan.
  3. Page - responsible for interacting with the actual web pages.
I'll take a bottom-up approach to explaining the idea, starting with the Page. Imagine we need to automate a test case that tests logging in to a system. The very first page will be the login page, where the user provides their username and password and clicks the submit button (Image-1).

Image-1: The login page

After a successful login, the user lands on the main page (Image-2), which might have many more options, links, etc. Simple, huh?


Image-2: The main page

Now, the first page has three web elements: two text input boxes and a submit button. So the 'Page' Java class that represents the login web page should hold references to these three web elements and have a method that mimics the 'Submit' button click. Fair enough? Below is a code snippet of such a page (Image-3):

Image-3: The Java class that represents the login page.

Notice in Image-3 above (in the 'defineRequired()' method, lines 32-39) how we find these three elements by their names and ensure that they are not null, which basically confirms that they are there. You may be wondering how we can get the names of these web elements. That's easy. While on the login page, press the F12 key on the keyboard and the browser will open its developer tools, showing the source code of the web page. This works in most desktop browsers. Image-4 shows a sample from my Chrome browser:

Image-4: Pressing the F12 key on the keyboard lets you view the source

Once the source pane is open, simply clicking a web element or hovering the mouse over it will show you its name or id. In our case, the name of the login text box is 'login'.
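
To make the lookup step a bit more concrete, here is a minimal sketch of the idea behind 'defineRequired()'. It is not the code from Image-3. 'FakeBrowser' is a hypothetical in-memory stand-in so that the example runs without a real browser, and the element names 'password' and 'submit' are assumptions (only 'login' comes from the page above).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the browser: maps element names to elements.
// With Selenium, the lookup would typically be driver.findElement(By.name(name)),
// which throws NoSuchElementException instead of returning null.
class FakeBrowser {
    private final Map<String, Object> elementsByName = new HashMap<>();

    void addElement(String name) {
        elementsByName.put(name, new Object());
    }

    // Returns null when nothing on the page has the given name.
    Object findByName(String name) {
        return elementsByName.get(name);
    }
}

class LoginPage {
    private final FakeBrowser browser;
    private Object loginBox;
    private Object passwordBox;
    private Object submitButton;

    LoginPage(FakeBrowser browser) {
        this.browser = browser;
    }

    // Looks up every element the page cannot work without.
    // "login" is the name found via F12; "password" and "submit" are assumed.
    void defineRequired() {
        loginBox = require("login");
        passwordBox = require("password");
        submitButton = require("submit");
    }

    // Fails fast with a readable message if an element is missing.
    private Object require(String name) {
        Object element = browser.findByName(name);
        if (element == null) {
            throw new IllegalStateException("Required element not found: " + name);
        }
        return element;
    }
}
```

Failing early like this means a renamed or missing element shows up as one clear error, rather than as a confusing failure later in the test.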

Once you have references to all of the necessary web elements of a page, you'll need to use them in the methods (of the same Java class) that interact with them. For our example, the method 'loginAs()' in Image-3 takes the username and password from the user, puts them in the text boxes and simply mimics the click of the submit button. The framework that we use at work has a 'UtilsUI' utility class with a bunch of methods that implement the functionality to interact with the UI web elements (e.g., click a button or a link, choose from a drop-down, select a radio button, etc.).
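
The shape of such a 'loginAs()' method could look roughly like the sketch below. 'FakeElement' and 'UiHelper' are hypothetical stand-ins written for this note: 'UiHelper' only imitates the kind of helper described above and is not the real 'UtilsUI' API. The element names other than 'login' are assumed.

```java
// Hypothetical stand-in for a web element. Selenium's WebElement offers
// sendKeys(...) and click() for the same two purposes.
class FakeElement {
    final String name;
    String text = "";
    int clicks = 0;

    FakeElement(String name) {
        this.name = name;
    }
}

// Hypothetical helper in the spirit of a UI utility class (not the real UtilsUI).
final class UiHelper {
    private UiHelper() {
    }

    // Puts the given value into a text box.
    static void type(FakeElement box, String value) {
        box.text = value;
    }

    // Mimics a single click on a button or link.
    static void click(FakeElement button) {
        button.clicks++;
    }
}

class LoginPage {
    final FakeElement loginBox = new FakeElement("login");
    final FakeElement passwordBox = new FakeElement("password"); // assumed name
    final FakeElement submitButton = new FakeElement("submit");  // assumed name

    // Fills in both text boxes, then mimics a click on the submit button.
    void loginAs(String username, String password) {
        UiHelper.type(loginBox, username);
        UiHelper.type(passwordBox, password);
        UiHelper.click(submitButton);
    }
}
```

Keeping the low-level interaction in one helper class means the Page methods read like the steps a user would take.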

Likewise, the Java class or the Page (code not shown in this post for brevity) that represents the main web page (after the successful login) will have a method that verifies that all the necessary web elements are there (e.g., the different links, text, etc. in Image-2) and methods to interact with them (e.g., we may have a method called 'clickUpLink()').

OK, now let's discuss the second piece of the puzzle: the Action Java class.

As I mentioned at the beginning, this is the place where all the logic and control flow will go. What does that mean? Well, at this point, we will need two objects representing the two pages: the login page and the main page. Once we are on the login page, we need to call the 'loginAs()' method, providing it with the username and password. If it's successful, we should be on the main page, and we need to verify that we actually are on the main web page, right? Image-5 below gives a sample of such an Action with if-else logic:


Image-5: Action: the brain of the test plan

Note how we first get the reference to the web driver (basically the browser) and the two page objects representing the login page and the main page (in the 'setupPages()' method, lines 72-75). In the 'execute()' method (starting at line 91; please ignore the first few lines for simplicity) we first check that we are on the right login page (line 91). If so, we call the 'loginAs()' method in the next line, providing it with the username and password. If that succeeds, we then verify (in line 93) that we are now on the main page. Depending on the result, we report 'Pass' or 'Fail', or throw a 'Fault'. Again, the Action encapsulates the control flow, or the logic, of the test case.
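
For readers who cannot see the image, the control flow described above can be sketched as follows. This is only an illustration under my own assumptions, not the code from Image-5: the two page interfaces, the 'isDisplayed()' method and 'FaultException' are hypothetical names, the pages are passed in through the constructor instead of being created in 'setupPages()', and I am assuming that 'Pass'/'Fail' map to true/false while a 'Fault' is an exception.

```java
// Hypothetical page interfaces; real Page classes would drive the browser.
interface LoginPage {
    boolean isDisplayed();

    void loginAs(String username, String password);
}

interface MainPage {
    boolean isDisplayed();
}

// A "Fault": the test could not even start, as opposed to a failed check.
class FaultException extends RuntimeException {
    FaultException(String message) {
        super(message);
    }
}

class LoginAction {
    private final LoginPage loginPage;
    private final MainPage mainPage;
    private final String username;
    private final String password;

    LoginAction(LoginPage loginPage, MainPage mainPage, String username, String password) {
        this.loginPage = loginPage;
        this.mainPage = mainPage;
        this.username = username;
        this.password = password;
    }

    // Returns true for "Pass" and false for "Fail"; throws for a "Fault".
    boolean execute() {
        if (!loginPage.isDisplayed()) {
            throw new FaultException("Not on the login page");
        }
        loginPage.loginAs(username, password);
        // The login counts as successful only if the main page shows up.
        return mainPage.isDisplayed();
    }
}
```

Because the Action only talks to the pages through their methods, the if-else logic stays readable and the browser details stay inside the Page classes.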

Now, the last but not least piece of the puzzle: the Test.

The Test will have the high-level stuff in it. Its main responsibility is to gather the input data to run the test, instantiate the necessary Action objects and then simply call the 'execute()' methods of those Action objects. For our example here, we have just one Action that takes two user inputs: username and password. Image-6 below shows the sample code of the test:


Image-6: The Test: Manager's perspective

Notice how the above code sample gathers the user input (in lines 43-44). Please focus on the idea for now and ignore how it actually does this (i.e., reading from the user input file using the TestParameter class). It then instantiates the Action with these input values in line 46. Finally, using JUnit, it asserts in line 52 that the 'execute()' method returns true.
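
The same three steps (gather input, build the Action, check the result) can be sketched in plain Java as below. This is a rough illustration, not the code from Image-6: it uses 'java.util.Properties' as a stand-in for the TestParameter class, the 'username' and 'password' keys are assumed, and the 'LoginAction' here contains dummy logic so the sketch runs without a browser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical action; a real one would drive the Page classes.
class LoginAction {
    private final String username;
    private final String password;

    LoginAction(String username, String password) {
        this.username = username;
        this.password = password;
    }

    // Dummy logic standing in for the real login check.
    boolean execute() {
        return "alice".equals(username) && "s3cret".equals(password);
    }
}

class LoginTest {
    // Reads the inputs, builds the action, and reports whether it passed.
    static boolean run(String inputFileContents) {
        Properties input = new Properties();
        try {
            input.load(new StringReader(inputFileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Step 1: gather the user input (assumed key names).
        String username = input.getProperty("username");
        String password = input.getProperty("password");

        // Step 2: instantiate the action with those values.
        LoginAction action = new LoginAction(username, password);

        // Step 3: run it; a JUnit test would wrap this call in assertTrue(...).
        return action.execute();
    }
}
```

Notice that the Test itself contains no browser code and no if-else logic. It only says what to run and with which data.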

That's it! Using the Test-Action-Page trio, we have just walked through our first test automation. Now you might be wondering: "Wait! Why do we need all this just to test a simple login? How does it help?" Good question. To answer it, consider this: for all the future test cases that need a user login followed by other steps, you will have the action and pages already implemented and ready to use. You would simply use the already implemented login action and build on top of it to work on the "other following steps" parts of your test case. Code reusability!
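
One possible way to picture this reuse is to chain actions, as in the sketch below. Whether the framework at work composes actions exactly like this is an assumption on my part; the 'Action' interface and the 'ActionSequence' class are hypothetical names made up for this note.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common shape for every action: run it, get pass or fail.
interface Action {
    boolean execute();
}

// Runs the given actions in order and stops at the first failure.
class ActionSequence implements Action {
    private final List<Action> steps = new ArrayList<>();

    // Adds one more step and returns this, so calls can be chained.
    ActionSequence then(Action step) {
        steps.add(step);
        return this;
    }

    @Override
    public boolean execute() {
        for (Action step : steps) {
            if (!step.execute()) {
                return false; // later steps make no sense after a failure
            }
        }
        return true;
    }
}
```

With something like this, a new test case could be written as 'new ActionSequence().then(loginAction).then(nextAction).execute()', where 'loginAction' is the existing login action and 'nextAction' is the only new code you need to write.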

Code reusability will tremendously speed up your implementation of other features and test cases as you keep building. Moreover, you'll be able to communicate your ideas and maintain existing code much more easily. That is how Josh and Mike, even without knowing it, helped me understand the concept a lot better by building this awesome framework we call InforTAP.

By the way, Page Object Model, or POM for short, is almost the same concept as our TAP. If you are interested in learning more about it, you could try reading this article.

Happy coding!

Thanks,
--Ishti




