Johann Mitloehner, 2022-10-13
Robotic Process Automation (RPA) allows businesses to automate tasks that are typically carried out by employees.
RPA is usually aimed primarily at repetitive and tedious tasks in order to
In order to mimic human interaction with applications RPA needs to
This type of automation is usually understood to involve software that accesses the back-end of a system through its API (application programming interface). In contrast, RPA accesses the system via the front end, usually a GUI (graphical user interface), closely mimicking the human/computer interaction.
Since not all systems provide an API, but all need to provide a front end for human users, there are situations where RPA is the only feasible approach to automation.
These are aimed at performing system testing with pre-defined test cases and expected results. The focus is usually not on interacting with multiple applications; however, sophisticated testing tools like the robot framework which will be used here allow for extracting data and processing it in more than one application.
This type of software is usually understood to control physical robots. RPA involves software robots i.e. programs that are not controlling a physical robot but perform actions by interacting with other software systems, not the physical world.
A confusing term that is hard to define, since even natural intelligence remains an elusive concept. We will use the less misleading term machine learning instead to refer to a type of decision making that sometimes seems like AI but is really just another type of software.
This usually refers to software that 'learns' from observation i.e. from data providing instances of situations and actions, e.g.
The system is usually understood to work on data from a large number of cases in an adaptive manner -- often, but not necessarily, working iteratively through the cases. It will (hopefully) adjust its behaviour towards optimal decisions, usually defined by minimising an error function.
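As a rough illustration of adjusting behaviour by minimising an error function, the following sketch fits a single weight to a handful of made-up data points by working iteratively through the cases. The data, the learning rate, and the function name fit_slope are assumptions chosen for this example only; it is not meant as a recipe for real projects.

```python
# Minimal sketch: learn w in y = w * x by repeatedly reducing the squared error.
# Data, learning rate and number of passes are made up for illustration.

def fit_slope(xs, ys, rate=0.01, epochs=200):
    """Return the weight w found by simple stochastic gradient descent."""
    w = 0.0
    for _ in range(epochs):              # several passes over all cases
        for x, y in zip(xs, ys):         # one case at a time
            error = w * x - y            # prediction minus observed value
            w -= rate * error * x        # step against the gradient of 0.5 * error**2
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]                # generated with w = 2
w = fit_slope(xs, ys)
print(round(w, 3))
```

The printed weight should be very close to 2, the value used to generate the data.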
Various approaches are used; connectionist models, especially deep learning neural nets, are currently very much in favour because of spectacular successes, particularly in image processing. However, there are a number of much simpler yet still useful approaches. Some of them will be discussed here, as it is feasible to apply them in RPA with limited resources in terms of time and coding skill.
RPA is usually aimed at back-office tasks while the automated servicing of customer requests is increasingly offered using chatbot technology. Since chatbots automate front-office tasks they are often not seen as RPA; however, using the above definition chatbots qualify as RPA, since they are software robots that automate tasks otherwise carried out by employees.
There is always a lot of hype surrounding new concepts in management and technology. The following list of RPA benefits is somewhat conservative and also a little critical; just some food for thought.
Cost savings. When robots do the work of people we can cut personnel cost. Euphemistically this is described as freeing time for more creative work.
Resilience. While the human workforce is limited there can always be more software robots, the only limit being the performance of the computing hardware. Therefore, if demand suddenly increases, the robot army is instantly ready.
Accuracy. Once a process is defined any errors left are due to design, not the software robots. Humans make mistakes, particularly in tedious tasks; (software) robots do not. Physical robots are a different matter and can make disastrous mistakes.
Compliance. An automated process can be defined so as to be fully compliant with some regulations at all times. Humans might make exceptions that can lead to trouble; robots make no exceptions, and no trouble.
Productivity. When measured in terms of input/output relation robots are hard to beat, especially software robots, which work 24/7 without any wear and tear. If needed they can be replicated at practically zero additional cost.
Employee happiness. When freed from tedious tasks people can (in theory) do more creative and fulfilling things, and that may well make them happier (at least those who still have a job).
This list is similar to the previous one but focuses on the development and deployment of software, which is often a huge and risky project. Fortunately, RPA is somewhat different from typical enterprise software projects:
No Disruption. The RPA only uses the front-end of the system, so any problems caused must have been present already in day-to-day operations by human users (and have hopefully been wiped out). Deployment of RPA is less likely to cause disruption of service, whereas API-based process automation can access functions not available to users, or in a manner not possible when using the front-end GUI, thereby causing unforeseen problems.
Scalable. RPA software robots not only work faster than humans, they also run continuously 24/7 all year long, thereby easily meeting increased demand; and since they are just software programs we can have more than one instance running on one or more computers at the same time. The only limit is the performance of the computing hardware.
Small Investment. Obviously this depends on the project. However, as we will see, at least some simple RPA projects can be cheap yet useful.
Quick ROI. A simple RPA project can start generating return on investment relatively quickly since development and deployment tend to be less problematic compared to a similar process automation project based on API programming.
These are compared to traditional automation based on API programming i.e. using the back-end of the system. Most of these problems do not seem overwhelming or unsolvable, though.
New Approach. Developers need to learn new methods, and in the case of the robot framework also a new language.
Performance. Compared to API-based automation the RPA approach will tend to be slower, maybe even so much slower that it is not feasible for a particular project.
Front-End Limitations. Remember that the front-end was not designed to be used by robots, but by humans. Problems may arise that were never faced before, because human users would know how to handle exceptional situations while robots just shamble on -- remember the paint robots in car factories that turn on each other
Citizen Developers. Business units can now develop bots using simple end-user tools and without the need for support by an IT team, or any involvement (or even knowledge) of the IT department. This can be seen as a benefit or a nightmare, depending on where you stand.
We will look at examples of RPA in the following areas:
Testing. While test suites can of course be run via the API (if one is available) there may be subtle differences to actual user interaction that are not easy to cover reliably. Using RPA and the GUI closely mimics human interaction and (hopefully) bypasses those problems.
Web Scraping. Since the Robot Framework uses XPath to access elements in HTML documents, and also allows for very simple integration of custom Python code in Robot test case files, it is easy to automatically extract content from web pages for further processing, also known as web scraping.
Customer Service. Many customer requests fall into one of very few categories and are therefore prime candidates for automation. This is an area that applies concepts from chatbots and machine learning. We will look at a simple case study using robot testing for automation and open datasets for machine learning.
The Robot Framework is available at robotframework.org. Its main purpose is automated testing; however, together with the SeleniumLibrary, which builds on the Selenium library, it can be used for general process automation in the interaction with web servers.
There are various types of testing frameworks; the Robot Framework uses keyword-driven testing: the idea is that the keywords
The approach somewhat resembles pseudo-code in algorithm design; it can be used for both manual and automated testing.
The following examples provide an introduction to the approach. The Robot Framework is written in Python, and we need to install some packages, and maybe Python itself as well.
The installation can be tricky.
The robotframework GitHub site has detailed installation instructions for various operating systems.
When you install Python from the official source python.org make sure to check the little box "Add Python to the Path".
Adding python to the PATH means that you can start python on the command line. However, we also need the web driver scripts in the PATH. If you see something like
Driver copied to: C:\Users\ramen\bin\geckodriver.exe
WARNING: Path 'C:\Users\ramen\bin' is not in the PATH environment variable.
you need to add that user directory to the PATH. Exactly how this works depends on your operating system.
For related problems:
Github Page webdrivermanager & Command Line Options:
https://github.com/MarketSquare/webdrivermanager/#command-line-options
and if that still does not work, PATH variable setting:
https://github.com/robotframework/robotframework/blob/master/INSTALL.rst#configuring-path
It is recommended to use Python in one of the popular Linux distributions where it comes with the rest of the system. Many Linux desktop components rely on Python, so usually both Python 2 and Python 3 are part of the distribution.
Linux can easily be installed alongside an already existing operating system via dual boot i.e. you choose which system to use in this session when you start up your computer. The installation will take maybe half an hour or so, but it can save a lot more time and frustration later. All you need is a USB stick and at least about 20 GB of free space on your drive. Download the installation image, put it on your stick, and boot from that. Your favorite Linux distribution web site has all the details; this author prefers Mint, but there are many others, see distrowatch.
Python comes in a number of distributions for various needs and operating systems:
The primary source is python.org. This is the standard and reference CPython implementation, and it works perfectly for our purposes; it uses the pip installer which you see in all the examples.
Another option is the Anaconda distribution which uses its own installer conda instead of pip. This comes bundled with heaps of software, including flask (but not robotframework). Work with this if you have Anaconda already installed.
Once you have Python running you should be able to use pip (or conda) to install additional Python packages, and everything should work just fine. Fingers crossed, knock on wood.
☆ Depending on your distribution/setup, you may have to use the command py instead of python to run Python scripts.
Sadly the Python developers made a decision years ago to make the new Python 3 incompatible with the older version 2. The differences are few and small, but still more than enough to cause trouble. We will continue to suffer the consequences for many years to come.
We are using Python 3 here. Depending on your distribution and operating system this may be standard; however, make sure that when you enter on the command line
python
you actually get the Python 3 interpreter prompt, not the older Python 2. Leave the interactive interpreter by entering ctrl-d (on Windows: ctrl-z followed by Enter).
You see python3 in all examples here since this gives us Python 3 on Linux. Otherwise, depending on your configuration, you might get Python 2.
Depending on your distribution and operating system you may not have a python3 command, so you have to use python instead.
☆ On Linux there are both versions available, since many system/desktop components depend on Python 2. Do not remove Python 2 from your Linux system.
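If it is unclear which interpreter a command starts, a few lines of Python can report it. This is only a small sketch; the helper name is_python3 is made up for this example. Save it in a file and run that file with the command you intend to use (python, python3, or py).

```python
import sys

def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3."""
    return version_info[0] == 3

# Show the version of the interpreter that is running this script.
print("Running Python", sys.version.split()[0])
if not is_python3():
    print("This is not Python 3; try the python3 or py command instead.")
```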
Open a terminal window and enter the following statements on the command line. This only needs to be done once for our setup.
First we make sure to have the current version of the package installer pip:
python3 -m pip install --upgrade pip
The pip module should be part of your Python distribution; otherwise you will get an error and you have to install pip: on Debian-based Linux systems enter
sudo apt install python3-pip
Now we can install Python packages:
python3 -m pip install selenium
python3 -m pip install --upgrade robotframework-seleniumlibrary
python3 -m pip install webdrivermanager
Linux: Note the message after the webdrivermanager install about .local/bin not being in our PATH environment variable! We will need to fix this.
Current pip versions should automatically switch to user install when root permission is missing. If you get errors about permissions then add --user at the end of all the install commands, such as
python3 -m pip install selenium --user
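To check whether the installs succeeded we can ask Python whether it finds the modules. The sketch below assumes the usual import names, which differ from the pip package names in two cases: robotframework is imported as robot, and robotframework-seleniumlibrary as SeleniumLibrary. The helper name missing_modules is made up for this example.

```python
import importlib.util

# Import names (not pip package names) of the packages installed above.
REQUIRED = ["selenium", "robot", "SeleniumLibrary", "webdrivermanager"]

def missing_modules(names):
    """Return those names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
print("Missing:", missing if missing else "none")
```

An empty result only says that the modules can be located; it does not prove that the browser driver is set up correctly.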
Depending on your operating system environment you need to put .local/bin on our PATH. The following refers to Linux.
pico .bashrc
export PATH=$PATH:$HOME/.local/bin
If you get weird error messages then you messed up the .bashrc file. Go back to the first terminal, start the pico editor again, and fix the problem.
Now we can configure Firefox as our web browser: On the command line enter
webdrivermanager firefox -l $HOME/.local/bin
This should download the web driver.
Linux: Look for a message that says
Symlink created: /home/username/.local/bin/geckodriver
Make sure you see that message. We need that link in .local/bin
Other operating systems: if you run into problems with PATH error messages, command not found, or geckodriver not found, try this:
webdrivermanager firefox -l AUTO
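Many of the problems above come down to a program not being found via the PATH. The following sketch uses the standard library to look up the two commands we rely on and to list the directories that are searched; the helper name find_on_path is made up for this example.

```python
import os
import shutil

def find_on_path(tool):
    """Return the full path of an executable found via PATH, or None."""
    return shutil.which(tool)

for tool in ("geckodriver", "robot"):
    location = find_on_path(tool)
    print(tool, "->", location if location else "not found on PATH")

# The directories that are searched, one per line.
print("\n".join(os.environ.get("PATH", "").split(os.pathsep)))
```

If geckodriver is reported as not found, the directory that webdrivermanager printed is probably missing from the list at the end.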
☆ This part may not be needed, only fix the script if you get an error when you try to run your robot scripts i.e. when you enter robot on the command line.
find . -name service.py
Look for the firefox version; depending on the Python version it will be something like
./.local/lib/python3.10/site-packages/selenium/webdriver/firefox/service.py
pico ./.local/lib/python3.10/site-packages/selenium/webdriver/firefox/service.py
Change this:
# Set a port for CDP
if '--connect-existing' not in self.service_args:
    self.service_args.append("--websocket-port")
    self.service_args.append("%d" % utils.free_port())
to this:
# Set a port for CDP
#if '--connect-existing' not in self.service_args:
#    self.service_args.append("--websocket-port")
#    self.service_args.append("%d" % utils.free_port())
You can do all the steps illustrated here on the command line by writing Robot Framework scripts using your text editor, such as pico; then, enter
robot myfile.rob
to run the robot test file myfile.rob; in some respects this is the best way to approach the robot automation, since you can then use shell scripts to automate the process easily. However, if you want instead to use Jupyter notebook for your Python coding then you can install it on your computer by entering the following on the command line:
python3 -m pip install notebook
When the install is finished you can start Jupyter by entering on the command line:
jupyter notebook
In order to provide well-defined test cases we will build our own little web site using the Flask framework. This will allow us to iteratively refine the web application in small steps and simultaneously keep testing as we add new features.
**Flask** is a microframework that is contained in many Python distributions. It includes its own development web server, but it is preferable (in terms of performance and other respects) to use a production-level web server; for our purpose **Gunicorn** will do the job nicely.
We can install the packages easily (if we need to):
python3 -m pip install flask
python3 -m pip install gunicorn
☆ If installing gunicorn is not an option for you, then simply add the line
app.run(port=8080)
at the end of the Python file with the Flask code; start it like any other Python script by entering the following on the command line:
python3 filename.py
in order to use the built-in webserver mode of Flask. There are some limitations; however, those should not be a problem for our purposes.
To check if we can use the package and create a minimal application we need to write some code into a Python script and start that script on the command line.
We use our favorite text editor to create a file credit.py by entering on the command line:
pico credit.py
and now use the mouse to copy and paste the following code into the file:
from flask import Flask
app = Flask(__name__)
@app.route("/")
def credit():
    return "<p>Credit App"
then ctrl-O Enter ctrl-X to save and exit.
☆ If you prefer to work in a GUI environment then use the text editor from the file browser provided by your desktop, e.g. in Linux XFCE you can find a little icon in the bottom panel that looks like an old-fashioned filing cabinet: click it to start the file browser.
Things to note about the code above:
Now we can start the web server.
Open a new terminal window, change to your project directory (unless you work in your home directory), and enter the following:
gunicorn -b localhost:8080 --reload --access-logfile - credit:app
Our web server is now running on the localhost (the computer we are working on) and accepting requests at the URL http://localhost:8080/
If you run the web server and this notebook (the one that you are reading right now) on your own computer you can just click on the link above. It should open a new tab in your browser and show the response of your web server.
In the terminal window where you started the Flask web server you should see a line of log output like this one:
127.0.0.1 - - [13/Feb/2022:13:13:03 +0100] "GET / HTTP/1.1" 200 15 "-" "Mozilla/5.0 ... Firefox/97.0"
It is always a good idea to look at the log, especially when things do not work the way we expect, e.g. try requesting the following URL:
http://localhost:8080/thisdoesnotexist
and then look at the log output.
The page displayed by the browser should be something like
Not Found -- The requested URL was not found on the server
We are now ready to begin automated testing.
The robot framework relies on test files containing the instructions for tests. Here is a minimal example:
*** Settings ***
Library    SeleniumLibrary

*** Test Cases ***
Valid Home Page
    Open Browser    http://localhost:8080    Firefox
    Page Should Contain    Credit App
    [Teardown]    Close Browser
The test file contains test cases and is written in a somewhat unusual notation that resembles natural language.
This test will try to open the root document of the web server running on this computer, using the Firefox browser.
Use your text editor to put this code into a file, such as t1.rob, and then run the robot on the command line (you may need to open a new terminal to get a command line that is not already running a job):
robot t1.rob
The robot should read the test file and execute the test steps.
Sometimes the robot seems to get stuck after a few lines of output:
press ctrl-c
Starting with option --quiet or --console none also seems to help:
robot --quiet t1.rob
Other output files such as output.xml will still be written and can be analysed later.
There will be several output files in the current directory:
Opening the browser window takes a few seconds, but once it is open any further actions will happen fast, so we will only see the page contents briefly. For this reason the screenshots are collected.
To run the tests in this notebook and make sure that the results actually correspond to the robot code displayed here we define some Jupyter cell magic.
Jupyter notebooks consist of
When we run cells containing robot code we want the robot code executed and the result (the content of the XML result file) displayed below; fortunately it is really easy to write cell magic in Jupyter, and we have created a bit of code. Create two files in the current directory:
robotmagic.py
from IPython.core.magic import register_cell_magic
import xml.etree.ElementTree as ET
import os
import xmlrep

@register_cell_magic
def robot(line, cell):
    # write the cell content to a temporary robot file
    with open('tmp.rob', 'w') as fout:
        fout.write(cell)
    # run the robot and print a short report from its XML output
    os.system('robot --quiet -o output.xml tmp.rob')
    xmlrep.report('output.xml')
xmlrep.py
import xml.etree.ElementTree as ET
import sys

def report(fn):
    for t in ET.parse(fn).getroot().iter('test'):
        # the last status element of a test holds its overall result
        s = t.findall("status")[-1]
        res = s.get("status") + ' ' + t.get("name")
        if s.get("status") == "FAIL": res = res + ' ' + s.text
        print(res)

if __name__ == '__main__': report(sys.argv[1])
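To see what this report extracts without running the robot, the same parsing logic can be tried on a small XML string. The sample below is hand-written and heavily simplified; it only assumes that each test element has a name attribute and a status child, as the code above does. The function name summarize and the failure message are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a Robot Framework output.xml;
# real files contain many more elements and attributes.
SAMPLE = """<robot>
  <suite name="Tmp">
    <test name="Valid Home Page">
      <kw name="Open Browser"><status status="PASS"/></kw>
      <status status="PASS"/>
    </test>
    <test name="Expect Fail">
      <status status="FAIL">text not found</status>
    </test>
  </suite>
</robot>"""

def summarize(xml_text):
    """Return one 'STATUS name [message]' line per test element."""
    lines = []
    for test in ET.fromstring(xml_text).iter("test"):
        status = test.findall("status")[-1]    # direct children only
        line = status.get("status") + " " + test.get("name")
        if status.get("status") == "FAIL" and status.text:
            line += " " + status.text
        lines.append(line)
    return lines

for line in summarize(SAMPLE):
    print(line)
```

Note that findall("status") only looks at direct children of the test element, so the status of the nested keyword is not mistaken for the test result.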
With these Python module files present in the current directory the following import statement achieves our own brand of robot magic:
from robotmagic import robot
Now we can use the %% notation for our robot test code in this notebook (which you are reading at the moment).
A cell marked with the %% notation will be executed just like other code cells, and the result will be displayed below the code cell.
The %% notation is only for use in Jupyter notebooks, not in robot test files.
However, the Robot Framework simply ignores any text that appears before the first recognized section header, so a leading %%robot line in a test file does no harm.
The browser name headlessfirefox avoids opening a window showing the browser; this might result in a small speedup.
%%robot
*** Settings ***
Library    SeleniumLibrary

*** Test Cases ***
Valid Home Page
    Open Browser    http://localhost:8080    headlessfirefox
    Page Should Contain    Credit App
    [Teardown]    Close Browser
Now we can use Cell/Run All from the Jupyter menu and have all cells executed and their results displayed. This is convenient, and we can be certain that the results actually correspond to the code (which is not necessarily true when we use copy and paste from the terminal).
If you want to create a notebook like the one you are reading now then you can download the files robotmagic.py and xmlrep.py and use from..import as above. The .py files must be in the same directory as the notebook file. The ideas in that code will be discussed below.
In the terminal where you started the web server you should see the robot requests in the log output. You can check the files using a GUI app, or on the command line enter
ls -lrt
This displays the files sorted by modification time.
After some testing we get a lot of robot log files and PNG screenshots. You can delete them using the following command:
rm *.log *.png
☆ This will permanently delete all files in the current directory with the extension .log and .png -- make sure
The minimal example works; now we can turn to some more useful elements of the syntax.
Use your favorite text editor to create another file, such as t2.rob, with the following content:
%%robot
*** Settings ***
Documentation    Check for text in page
Library    SeleniumLibrary

*** Variables ***
${URL}    http://localhost:8080
${BROWSER}    Firefox

*** Test Cases ***
Valid Home Page - Keywords
    Open Browser To Home Page
    Page Should Contain    Credit App
    [Teardown]    Close Browser

*** Keywords ***
Open Browser To Home Page
    Open Browser    ${URL}    ${BROWSER}
Things to note about the code above:
the Test Cases section contains one or more cases
Let's try a test that we expect to fail.
Use your text editor and put the following in a file t3.rob:
%%robot
*** Settings ***
Documentation    Check for text in page, expect fail
Library    SeleniumLibrary

*** Variables ***
${LOGIN URL}    http://localhost:8080
${BROWSER}    Firefox

*** Test Cases ***
Valid Home Page - Expect Fail
    Open Browser To Home Page
    Page Should Contain    The Credit App
    [Teardown]    Close Browser

*** Keywords ***
Open Browser To Home Page
    Open Browser    ${LOGIN URL}    ${BROWSER}
Note how we introduced a small change in the text:
Our website says "Credit App", not "The Credit App"
This test should fail.
Our minimal app is performing its Hello function, but now we add some more features.
We are slowly approaching a small application with database connection.
Incremental development and testing will go side by side.
Let's add a link to a list of clients.
Note that at this point we do not yet need an actual procedure for listing anything, just the link to it.
Use your text editor to change the content of the file credit.py to the following:
from flask import Flask
app = Flask(__name__)
@app.route("/")
def credit():
    return """<h1>Credit App</h1>
<p><a href=clients>Clients</a>"""
Remember the option --reload which we added to the gunicorn web server command.
Take a look at the terminal window that runs the gunicorn web server: you should see a line that looks like this:
... [INFO] Worker reloading: /home/.../credit.py modified
The gunicorn web server has detected the change in the source file and restarted the web app.
Our web app starts to get a little more elaborate.
We now have a few distinct elements that we can check in our robot tests.
We can use XPath expressions to find elements in the HTML and check their contents.
Use your text editor to create a file t4.rob and put the following code into that file:
%%robot
*** Settings ***
Documentation    Check shop page for header and links
Library    SeleniumLibrary

*** Variables ***
${LOGIN URL}    http://localhost:8080
${BROWSER}    Firefox

*** Test Cases ***
Valid Home Page - Headers and Links
    Check Home Page
    [Teardown]    Close Browser

*** Keywords ***
Check Home Page
    Open Browser    ${LOGIN URL}    ${BROWSER}
    Element Should Contain    //h1    Credit App
    Element Should Contain    //a    Clients
We introduced the user-defined phrase "Check Home Page" in the test case and defined it in the keywords section.
The pre-defined phrase "Element Should Contain" has two parameters:
Note that only the first section header h1 in the page is checked. To check other elements which are not the first of their type we will need to specify their XPath more elaborately.
In a similar fashion we check for the link with a given text. Note that for this test to succeed we do not need a procedure actually listing anything, only a link with the specified text.
There are various types of element locators in the Selenium library which is used in the Robot Framework; XPath can get somewhat tedious but it offers sufficient power to locate elements in more complex web content.
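The difference between the first match and all matches of a simple path can be tried out in plain Python. The sketch below uses the ElementTree module from the standard library, which supports only a small subset of XPath and needs well-formed XML, so it merely stands in for the XPath engine of the browser. The page content is made up, and the path is written as .//h1 because ElementTree expects relative paths.

```python
import xml.etree.ElementTree as ET

# Made-up page with two headers; attributes are quoted so that the
# snippet is well-formed XML, which ElementTree requires.
PAGE = """<html><body>
<h1>Credit App</h1>
<h1>Other Header</h1>
<p><a href="clients">Clients</a></p>
</body></html>"""

root = ET.fromstring(PAGE)

first_h1 = root.find(".//h1")        # first match only
all_h1 = root.findall(".//h1")       # every match
second_h1 = root.find(".//h1[2]")    # positional predicate: the second h1

print(first_h1.text)
print([h.text for h in all_h1])
print(second_h1.text)
```

A check against the first match says nothing about the second header; to reach it the path has to be more specific, for example with a positional predicate.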
Let's run the robot again: on the command line enter
robot t4.rob
The text output should show PASS.
Let us re-design our shop and add a few links. We want to
The last link is really for our convenience here and would probably not be included in a real world application of this kind.
Use your text editor to change the file credit.py to the following:
from flask import Flask, render_template
app = Flask(__name__)
@app.route("/")
def credit():
    return """<h1>Credit App</h1>
<ul>
<li><a href=clients>Clients</a></li>
<li><a href=newclient>New Client</a></li>
<li><a href=initdb>Init DB</a></li>
</ul>"""
To test for the presence of a link with the text "Clients" we need to use an XPath expression that points to this link. We could simply use indexing with square brackets such as [2] but that could lead to trouble later on when we add more links and change their order.
Instead, we delve a little deeper into XPath and use the functions contains() and text().
Use your text editor to put the following robot code into a file t5.rob:
%%robot
*** Settings ***
Documentation    Check shop page for header and links
Library    SeleniumLibrary

*** Variables ***
${LOGIN URL}    http://localhost:8080/
${BROWSER}    Firefox

*** Test Cases ***
Valid Home Page - XPath
    Check Home Page
    [Teardown]    Close Browser

*** Keywords ***
Check Home Page
    Open Browser    ${LOGIN URL}    ${BROWSER}
    Element Should Contain    //h1    Credit App
    Element Should Contain    //a    Clients
    Element Should Contain    //a[contains(text(), "Clients")]    Clients
Note that in the last line we need to supply the text twice although it is already obvious from the XPath expression.
Run the robot again to see that the XPath expressions actually target the link as intended.
robot t5.rob
The text output should show PASS.
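Why selecting by text is more robust than selecting by position can again be sketched with ElementTree from the standard library. Note the assumptions: ElementTree does not implement the XPath function contains(), so the sketch uses an exact text match in the path and an ordinary Python test for the partial match; the page is a well-formed copy of our link list with quoted attributes.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the link list from our home page.
PAGE = """<ul>
<li><a href="clients">Clients</a></li>
<li><a href="newclient">New Client</a></li>
<li><a href="initdb">Init DB</a></li>
</ul>"""

root = ET.fromstring(PAGE)

# Selecting by position depends on the order of the list items.
by_position = root.find(".//li[1]/a")

# Selecting by text keeps working when links are added or reordered.
by_text = root.find(".//a[.='Clients']")

# Partial match done in Python, since contains() is not available here.
partial = [a.get("href") for a in root.iter("a") if "Client" in (a.text or "")]

print(by_position.get("href"), by_text.get("href"), partial)
```

Both lookups find the same link in the current page, but only the text-based one would survive a change in the order of the list.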
The test works as intended. Now we can move on to actually providing procedures for the functions, such as the list of clients. For that purpose, we choose a somewhat more elaborate route by introducing a database connection to our little sample app.
To make our approach extensible and reasonably realistic we add more functions to the sample app instead of just providing toy examples. Playing in the sandbox can only get us so far.
Fortunately there is a free open-source database management system that we can easily use in our sample app: SQLite. This DBMS is widely used since it is so easy to install and apply. It should be noted that it is also lacking in several respects, such as in terms of implementing standard SQL numeric data types. We happily accept these deficiencies since they do not bother us here (much).
See the documentation for more details about SQLite3 and its use in Python 3:
The new version of our shop app contains two new routes:
Our application has now grown to a handsome size; make sure you do not miss anything when you copy and paste it into the file credit.py:
from flask import Flask, render_template, request
import sqlite3

app = Flask(__name__)

def getconn():
    return sqlite3.connect("credit.db")

@app.route("/")
def credit():
    return """<h1>Credit App</h1>
<ul>
<li><a href=clients>Clients</a></li>
<li><a href=newclient>New Client</a></li>
<li><a href=initdb>Init DB</a></li>
</ul>"""

@app.route('/initdb')
def initdb():
    conn = getconn()
    cur = conn.cursor()
    cur.execute("drop table if exists client")
    cur.execute("create table client "
        + "(id int primary key, lim int, sex int, edu int, mar int, age int)")
    conn.commit()
    conn.close()
    return "DB initialized."

@app.route('/clients')
def clients():
    conn = getconn()
    cur = conn.cursor()
    rows = cur.execute("select id, lim from client")
    html = "<h3>Clients</h3><table>\n"
    for row in rows:
        html += "<tr><td align=right> %d <td align=right> %.2f\n" % row
    # close the connection before returning; code after return never runs
    conn.close()
    return html + "</table>\n"

@app.route('/newclient')
def newclient():
    return """<h3>New Client</h3>
<form action=insertclient method=POST>
<table>
<tr><td>Client ID:<td><input type=text name=id>
<tr><td>Credit Limit:<td><input type=text name=lim>
<tr><td>Sex:<td><input type=text name=sex value=1>
<tr><td>Education:<td><input type=text name=edu value=1>
<tr><td>Marriage:<td><input type=text name=mar value=1>
<tr><td>Age:<td><input type=text name=age value=30>
</table><input type=submit value=OK></form>"""

@app.route('/insertclient', methods=['POST'])
def insertclient():
    id = request.form['id']
    lim = request.form['lim']
    sex = request.form['sex']
    edu = request.form['edu']
    mar = request.form['mar']
    age = request.form['age']
    conn = getconn()
    cur = conn.cursor()
    cur.execute('insert into client (id, lim, sex, edu, mar, age) '
        + ' values (?, ?, ?, ?, ?, ?)', (id, lim, sex, edu, mar, age))
    conn.commit()
    conn.close()
    return "Client inserted."
The SQLite3 connection module should be part of the Python3 distribution, we just need to import it.
we define a function getconn() to give a DB connection whenever we need it. SQLite stores the DB in a file in the current directory, as named in the connect() function. So, we should expect a file ending in ".db" after the initdb route is executed for the first time.
in the initdb route we use that getconn() function and get a cursor from the connection. With this cursor we can execute SQL statements.
we drop the table if it exists so we can run this code multiple times. Note that in this interface we do not end SQL statements with ";"
we create a simple table
we insert a few rows into our table. For the sake of simplicity we use integers for everything; SQLite does not worry about that at all, as we will see.
the commit() is necessary here since auto-commit is only default in interactive use. Without it all changes would be lost when the connection is closed: SQLite supports basic transaction logic.
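The effect of a missing commit() can be observed in a small experiment outside of the web app. The sketch below works on a throw-away database file in a temporary directory (the file name demo.db and the helper count_rows are made up for this example) and uses a reduced version of our table. It also shows that SQLite stores the credit limit as an integer even though we pass it as a string, as the web form does.

```python
import os
import sqlite3
import tempfile

def count_rows(path):
    """Open a fresh connection and count the rows in table client."""
    conn = sqlite3.connect(path)
    n = conn.execute("select count(*) from client").fetchone()[0]
    conn.close()
    return n

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")    # hypothetical file name

    conn = sqlite3.connect(path)
    conn.execute("create table client (id int primary key, lim int)")
    conn.commit()

    # Insert without commit: the change is lost when the connection closes.
    conn.execute("insert into client values (?, ?)", (1, "20000"))
    conn.close()
    without_commit = count_rows(path)

    # Insert with commit: the change is visible to later connections.
    conn = sqlite3.connect(path)
    conn.execute("insert into client values (?, ?)", (1, "20000"))
    conn.commit()
    conn.close()
    with_commit = count_rows(path)

    # The string "20000" ends up as an integer in the int column.
    conn = sqlite3.connect(path)
    stored = conn.execute("select id, lim, typeof(lim) from client").fetchall()
    conn.close()

print(without_commit, with_commit, stored)
```

The first count should be 0 and the second 1; the second insert can reuse the same ID precisely because the first one never reached the database file.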
With the table initialised we can list its content:
Now we can do quite a bit of testing!
The following test will
A new client should have an ID and a credit limit; we can easily generate those in Python.
We put our Python code into a file with the extension .py in the current directory.
Let's call it mytools.py; create it with your text editor and put the following code into the file:
import sqlite3
import random

def get_new_client_id():
    conn = sqlite3.connect("credit.db")
    # coalesce() yields 0 for an empty table, so the first ID is 1
    row = conn.execute("select coalesce(max(id), 0) + 1 from client").fetchone()
    conn.close()
    return "%d" % row

def get_new_client_limit():
    return "%d" % (10000 * random.randint(1,5))
Do not try to run this file directly; it will not produce any useful results. It will work with the robot after the Library definition in the robot code file.
The code above defines two functions for creating new values which are then available in our robot test files as user-defined phrases.
The underscore character translates as blank in the robot code:
get_new_client_id() becomes Get New Client Id
get_new_client_limit() becomes Get New Client Limit
Both functions return values which we can capture in the robot test file.
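The naming convention can be sketched in a few lines of Python. This only illustrates the mapping shown above; it is not the code the Robot Framework uses internally, and keyword_name is a name made up for this example:

```python
def keyword_name(function_name):
    # Replace underscores with blanks and capitalise each word.
    return function_name.replace("_", " ").title()

for fn in ("get_new_client_id", "get_new_client_limit"):
    print(fn, "becomes", keyword_name(fn))
```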
☆ Note that our simple solution is not thread-safe. SQLite DB files can be accessed by multiple processes; there is no locking for read access, but it employs database locking for write access. Two processes running at the same time can get the same value for max(id)+1 and would then create client records with identical IDs.
The option AUTOINCREMENT set in the SQL table create statements should guarantee unique primary keys.
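A minimal sketch of how AUTOINCREMENT behaves, assuming an in-memory database and a simplified client table with only two columns (not the table definition of the sample application): we leave out the id in the insert and read the generated key from the cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory DB for the demo
# AUTOINCREMENT is only allowed on an INTEGER PRIMARY KEY column.
conn.execute("create table client "
             "(id integer primary key autoincrement, lim int)")

cur = conn.execute("insert into client (lim) values (?)", (20000,))
first_id = cur.lastrowid             # key generated by SQLite
cur = conn.execute("insert into client (lim) values (?)", (30000,))
second_id = cur.lastrowid
conn.commit()
conn.close()

print(first_id, second_id)
```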
Another clean solution here would be sequences. Sadly, SQLite does not support them. However, we could create a table in the DB init part:
create table mycount(n int)
insert into mycount values(0)
And then use the following code via the Python API to get a new number:
update mycount set n = n + 1
select n from mycount
In theory, this solution should be thread-safe:
A and B should always see different values in their select results.
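The two statements only give unique numbers if no other writer can get in between them. One way to arrange this with the Python API is sketched below; it assumes the mycount table from above already exists, and the function name next_number as well as the explicit BEGIN IMMEDIATE transaction are choices made for this illustration.

```python
import sqlite3

def next_number(db_path):
    # isolation_level=None switches off the implicit transaction handling
    # of the sqlite3 module, so we can issue BEGIN and COMMIT ourselves.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock right away; a second process
        # has to wait until we commit before it can update the counter.
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("update mycount set n = n + 1")
        n = conn.execute("select n from mycount").fetchone()[0]
        conn.execute("COMMIT")
        return n
    finally:
        # Closing without COMMIT (after an error) rolls the update back.
        conn.close()
```

Each call increments the counter and returns the new value within a single transaction.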
However, with respect to later development of the sample application we do not worry about this issue here, since we will bulk import external data.
In the Settings section of the robot test file we use the keyword Library to include the code from our tools module.
Now we can do a lot of testing!
We need to call initdb first, otherwise everything else will fail, since the DB table would not yet exist.
Let's put this into a file t6.rob:
%%robot
*** Settings ***
Documentation    Check new client insert
Library    SeleniumLibrary
Library    mytools.py
*** Variables ***
${LOGIN URL}    http://localhost:8080
${BROWSER}    Firefox
${ID}    1
*** Test Cases ***
Valid Home Page
    Open Browser    ${LOGIN URL}    ${BROWSER}
    Page Should Contain Link    //a[@href="clients"]
    Page Should Contain Link    //a[@href="newclient"]
    Page Should Contain Link    //a[@href="initdb"]
Valid Init DB
    Go To    ${LOGIN URL}
    Click Link    //a[@href="initdb"]
    Page Should Contain    DB initialized.
Valid Insert
    ${LIM}=    Get New Client Limit
    Go To    ${LOGIN URL}
    Click Link    //a[@href="newclient"]
    Page Should Contain    New Client
    Input Text    //input[@name="id"]    ${ID}
    Input Text    //input[@name="lim"]    ${LIM}
    Click Element    //input[@type="submit"]
    Page Should Contain    Client inserted
    Set Global Variable    ${LIM}
Valid Listing
    Go To    ${LOGIN URL}
    Click Link    //a[@href="clients"]
    Page Should Contain    ${ID}
    Page Should Contain    ${LIM}
    [Teardown]    Close Browser
In this test file we have introduced several test cases; divide and conquer.
To access the generated credit limit in more than one test case we make the variable global. Sadly, this does not work in the Variables section as one might expect. Instead, we call Set Global Variable in the test case that generates the value.
The code above contains some other new features:
the user-defined phrase Get New Client Limit returns a value which we put into a variable LIM
We do not (yet) use our Get New Client Id
the XPath expression //a[@href="newclient"] finds the first a element with an attribute href equal to "newclient"
we follow this link by using the pre-defined phrase Click Link
the pre-defined phrase Input Text finds the form elements and enters the values
click on the submit button and check the response
now we could go straight to the client listing, but instead
Now we check for the ID and the credit limit of the new client in the listing
This test will take a little longer; we will probably be able to see the new entry briefly showing in the form fields and the listing.
Performance is not the strong point of this type of automated testing. However, it is still much faster than human testers.
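XPath expressions like the ones used in the test above can be tried out in plain Python before putting them into a robot file. The following is a minimal sketch using the ElementTree module, which supports only a subset of XPath and needs a relative path starting with "."; the page content is a made-up stand-in for our home page, not the actual HTML served by the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified version of the home page (must be well-formed XML).
PAGE = """<html><body>
<a href="clients">List clients</a>
<a href="newclient">New client</a>
<a href="initdb">Init DB</a>
</body></html>"""

root = ET.fromstring(PAGE)
# find() returns the first matching element, here the a element
# whose href attribute equals "newclient".
link = root.find(".//a[@href='newclient']")
print(link.text)
```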
Run the robot:
robot t6.rob
and observe the results; you should see PASS for all tests.
The Robot Framework can easily be used for robotic process automation. This can be made explicit by using the section header Tasks instead of Test Cases; everything else works in the same way as in tests.
In order to facilitate our report processing later we will just continue to use "Test".
We cannot use both tasks and tests in the same robot file.
When we create a task to initialise the DB and insert a few clients we do not want to go through all steps for inserting a new client again and again!
DRY: Don't Repeat Yourself.
Code duplication makes it harder to maintain code. It may well be the root of all evil (in software) ☠
We want to define a new user keyword with the required steps in one place and then use that code for repeated application, only supplying the necessary data in each call as arguments (parameters).
Put the following into a file t7.rob:
%%robot
*** Settings ***
Documentation    Init DB and insert some clients
Library    SeleniumLibrary
Library    mytools.py
*** Variables ***
${LOGIN URL}    http://localhost:8080
${BROWSER}    Firefox
*** Test Cases ***
Init DB
    Open Browser    ${LOGIN URL}    ${BROWSER}
    Click Link    //a[@href="initdb"]
    Page Should Contain    DB initialized
Insert Several Clients
    Insert Client    id=1001    lim=4000
    Insert Client    id=1002    lim=8000
    Insert Client    id=2001
    Insert Client    id=2002    lim=6000
    [Teardown]    Close Browser
*** Keywords ***
Insert Client
    [Arguments]    ${id}    ${lim}=5000
    Go To    ${LOGIN URL}
    Click Link    //a[@href="newclient"]
    Page Should Contain    New Client
    Input Text    //input[@name="id"]    ${id}
    Input Text    //input[@name="lim"]    ${lim}
    Click Element    //input[@type="submit"]
    Page Should Contain    Client inserted
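In the Keywords section we define the user keyword Insert Client with named arguments, one of them with a default value. This allows us to call the user-defined phrase Insert Client in the Test section with both arguments, or with the id only; in that case lim takes its default value of 5000. The id has no default and must always be supplied.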
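Arguments with default values in a user keyword behave much like default parameters in Python. A small sketch for comparison; the function insert_client is made up for this illustration and only returns the values that the keyword would type into the form:

```python
def insert_client(id, lim=5000):
    # Return the form fields as the keyword would enter them.
    return {"id": str(id), "lim": str(lim)}

print(insert_client(id=1001, lim=4000))   # both arguments supplied
print(insert_client(id=2001))             # lim falls back to its default
```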
Run the robot:
robot t7.rob
and observe the results; again, all tests should PASS.
The Robot Framework provides tools for generating summaries from the XML reports for each test; however, they are somewhat cumbersome, and it is easier and more flexible to go through the reports using the XML package of plain Python.
After some study of the structure of the XML files we find that the last status element in each test contains the overall status of the test; we can get that element with the expression findall("status")[-1]
Put the following code into a file xmlrep.py (identical to optional section on Automating the RPA above):
import xml.etree.ElementTree as ET
import sys

def report(fn):
    for t in ET.parse(fn).getroot().iter('test'):
        s = t.findall("status")[-1]
        res = s.get("status") + ' ' + t.get("name")
        if s.get("status") == "FAIL": res = res + ' -- ' + s.text
        print(res)

if __name__ == '__main__': report(sys.argv[1])
Here the last line is executed when the complete file is run as a script (instead of just importing the report function):
Run it on the command line to check the file output.xml in the current directory:
python3 xmlrep.py output.xml
The output should look like this:
PASS Init DB
PASS Insert Several Clients
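The same approach can be used to count results instead of printing them. The following is a minimal sketch; the XML string is a hand-written, heavily simplified stand-in for a robot report (real output.xml files contain many more elements and attributes), and the function name summarize is made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, made-up report: each test ends with its overall status element.
SAMPLE = """<robot>
  <suite name="T7">
    <test name="Init DB">
      <kw name="Open Browser"><status status="PASS"/></kw>
      <status status="PASS"/>
    </test>
    <test name="Insert Several Clients">
      <kw name="Insert Client"><status status="FAIL"/></kw>
      <status status="FAIL">Page should have contained text</status>
    </test>
  </suite>
</robot>"""

def summarize(root):
    counts = Counter()
    for t in root.iter("test"):
        # findall() only looks at direct children of the test element,
        # so the status elements nested inside kw are not considered.
        s = t.findall("status")[-1]
        counts[s.get("status")] += 1
    return counts

counts = summarize(ET.fromstring(SAMPLE))
print(dict(counts))
```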
The Robot Framework features the concept of a Test Suite for this purpose; however, with our XML reporting tools already available it is more convenient to run individual test case files and then process the reports.
To run several robot tests and direct the output to different XML files for later analysis we can add the XML output file to the robot call.
Use your text editor to create a script that runs several tests at once and then summarizes all the XML reports.
Put the following code into a file mytests.sh:
robot -o o4.xml t4.rob
robot -o o5.xml t5.rob
robot -o o6.xml t6.rob
for file in o4.xml o5.xml o6.xml
do
    echo $file
    python3 xmlrep.py $file
done
Run this script by entering the following on the command line:
bash mytests.sh
If for some reason you do not want to use Flask and only need a very simple local web server that serves just HTML pages, you do not need to write any Python code; just enter the following on the command line:
python3 -m http.server 8080 --bind 127.0.0.1
This will start a very basic web server with the current directory as root for all HTML files. Note that the port must be free at this time; if your Flask server is still running on 8080 then the above command will not work.
This notebook (the one whose HTML version you are reading now) is by default saved as a JSON file, which means that it is relatively easy to process the contents of Jupyter notebook cells with our own Python code.
Here is a little script to
import json
import sys
import os

f = open(sys.argv[1], 'r')
data = json.load(f)
f.close()
n = 1
cells = data['cells']
for x in cells:
    src = x['source']
    typ = x['cell_type']
    if typ == 'markdown' and len(src) > 1:
        if src[1].startswith('%%robot'):
            fout = open('tmp.rob', 'w')
            fout.write( '\n'.join([ s[:-1] for s in src[1:-1] ]) )
            fout.close()
            os.system('robot --quiet -o tmp' + str(n) + '.xml tmp.rob')
            n = n + 1
# the reports are numbered tmp1.xml to tmp<n-1>.xml
for i in range(1, n):
    os.system('python3 xmlrep.py tmp' + str(i) + '.xml')
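The cell selection in the script above can be tested without running any robot by putting it into a function and feeding it a small notebook structure. This is only a sketch: the function name robot_sources and the sample notebook are made up, and we assume that the robot code sits in a markdown cell between a wrapper line at the top and one at the bottom, with %%robot as the second line.

```python
def robot_sources(notebook):
    """Return the robot code of every markdown cell marked with %%robot."""
    sources = []
    for cell in notebook["cells"]:
        src = cell["source"]
        if (cell["cell_type"] == "markdown" and len(src) > 1
                and src[1].startswith("%%robot")):
            # Same slice as in the script above: drop first and last line.
            sources.append("".join(src[1:-1]))
    return sources

# Hypothetical notebook content with one robot cell and one code cell.
demo = {"cells": [
    {"cell_type": "markdown",
     "source": ["<pre>\n", "%%robot\n", "*** Test Cases ***\n", "</pre>"]},
    {"cell_type": "code", "source": ["print(1)\n"]},
]}

found = robot_sources(demo)
print(len(found))
```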
The Robot Framework can also be used for web scraping i.e. extracting data from web pages using
The Robot Framework provides functions for file access. However, it is much more convenient and versatile to define our own in plain Python.
Add the following code to the mytools.py library file:
def create_my_file(fn):
    fp = open(fn, 'w')
    fp.close()

def append_my_file(fn, txt):
    fp = open(fn, 'a')
    fp.write(txt + '\n')
    fp.close()
Now we can use these two new keywords in the following robot test file.
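Before calling the helpers from a robot file we can try them in plain Python. The sketch below repeats the two functions in a variant using the with statement, which closes the file even if an error occurs; the file location in a temporary directory and the two sample lines are assumptions made for this illustration.

```python
import os
import tempfile

def create_my_file(fn):
    # Opening in 'w' mode creates the file or empties an existing one.
    with open(fn, "w"):
        pass

def append_my_file(fn, txt):
    with open(fn, "a") as fp:
        fp.write(txt + "\n")

# Hypothetical target file in a temporary directory.
fn = os.path.join(tempfile.mkdtemp(), "list.txt")
create_my_file(fn)
append_my_file(fn, "Dr. No")
append_my_file(fn, "Goldfinger")

with open(fn) as fp:
    content = fp.read()
print(content)
```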
As a real-world practical example let's assume that we are interested in the titles of all Bond movies ever made (although the exact definition of that list is somewhat fuzzy), and after a bit of searching we find a web site that has a relatively simple structure:
After exploring the page source (right mouse button, View Page Source in Firefox) we can create our robot file:
*** Settings ***
Library    SeleniumLibrary
Library    mytools.py
*** Variables ***
${url}    https://www.pocket-lint.com/tv/news/148096-james-bond-007-best-movie-viewing-order-chronological-release
*** Test Cases ***
Start Movie List
    Open Browser    ${url}    Firefox
    Page Should Contain    James Bond movies
    Create My File    list.txt
Iterate Through Movies
    ${elements}=    Get WebElements    //h3
    FOR    ${element}    IN    @{elements}
        Append My File    list.txt    ${element.text}
    END
The robot code above
Further study the XPath syntax and examples, from sources such as
Find external web pages with moderately complex structure
Write more robot test files to check for elements and text content
Specific challenges:
Make sure not to call the robot too often; leave intervals of a minute or more between calls -- some websites apply automatic exclusion procedures when hit by too many requests from the same source.
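If the robot calls are started from a Python script, the waiting time between calls can be enforced in code. The following is one possible sketch; the class Throttle is made up for this illustration and is not part of the Robot Framework.

```python
import time

class Throttle:
    """Make sure that calls to wait() are at least min_interval seconds apart."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last = None              # time of the previous call, if any

    def wait(self):
        now = time.monotonic()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                time.sleep(remaining)  # pause until the interval has passed
        self.last = time.monotonic()
```

For the advice above we would create Throttle(60) once and call its wait() method before each robot run; the first call returns immediately, later calls pause as needed.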