Agent-based Testing


Iv4xr adopts an agent-based testing approach. It follows the so-called Belief-Desire-Intention (BDI) model of agency. A BDI agent is driven by goals: it can be given one or more goals that it then tries to accomplish (we also say ‘to solve’). A BDI agent has a different execution model than e.g. procedures or functions. The behaviour of a procedure/function is typically determined solely by its input. In contrast, a BDI agent runs in update cycles: at each cycle it senses its environment and then decides what to do (towards achieving its current goal). This makes a BDI agent very reactive, as it can immediately respond to a change that happens in a cycle, which in turn makes it suitable for controlling and testing a highly dynamic system such as a computer game or a simulator.
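To make the update-cycle model concrete, below is a minimal, self-contained sketch in Java. It is illustrative only and not the actual iv4xr API; the names MiniAgent and Environment are hypothetical. Each cycle, the agent senses its environment and then lets a tactic decide on one action to execute.

interface Environment { String observe() ; void execute(String action) ; }

class MiniAgent {
   Environment env ;
   java.util.function.Predicate<String> goal ;           // when is the goal solved?
   java.util.function.Function<String,String> tactic ;   // choose an action, given an observation

   // One update cycle: sense, then decide and act, once.
   void update() {
      String observation = env.observe() ;               // sense the environment
      if (goal.test(observation)) return ;               // goal solved; nothing left to do
      env.execute(tactic.apply(observation)) ;           // act towards the goal
   }
}

Running such an agent amounts to calling update() repeatedly until the goal is solved or a computation budget runs out, as we will see later in Fig. 6.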

Iv4xr offers the concept of tactics to program basic test automation, e.g. to solve simple goals. Complex test scenarios/tasks can be abstractly formulated as goal structures. A search can also be implemented to further automate segments of a test scenario. An example use case is shown below, for a game under development. The game is a maze-puzzle game with some hazard elements (e.g. fire). During development the developers often change the game-level layout as well as its logic. In this example the game logic involves interactables, such as in-game buttons, that can open/close one or more doors, controlling access to rooms. After such changes, developers may want to check that e.g. key scenarios remain possible. Figure 2 shows such a testing task; it involves searching for and interacting with 10 interactables. Without any aid, programming such a scenario, even for the small game level shown below, is very time consuming (there are hundreds of steps to program) and brittle. So in practice, lengthy test scenarios like this are tested by manually playing the game.

Fig. 1: A game-level under development in Unity. Unity is a game engine, and it also provides a game development IDE, shown in the screenshot above.
Fig. 2: An example scenario/task to test, involving finding and interacting with 10 interactables (buttons) to reach an item that plays a key role in the game level (marked green). This can be used to verify, e.g., that this key item remains reachable after the developer tweaks the game-level’s layout or logic.

The code snippet below shows how the testing task in Fig. 2 would be expressed in iv4xr. We use the combinator SEQ to specify goals that a test agent has to solve. They all have to be solved, and in the specified order. This works well for expressing a linear scenario such as the one in Fig. 2. Goals composed with a combinator like SEQ are called a goal structure. There are other combinators, e.g. to express priority or repetition, or a combinator that can dynamically construct and deploy a new goal. In the goal structure below we simply specify the sequence of buttons to interact with, each followed by the doors whose state is to be checked (marked yellow in the code below).

Fig. 3: An example of expressing a complex test scenario/testing task as a goal structure. SEQ is a combinator that constructs a goal structure from goals or other goal structures. Texts in blue are the goals that make up the above scenario. Texts in yellow perform a check, similar to an assert in unit testing.
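As a sketch, a goal structure in the style of Fig. 3 could look as follows. The goal-constructor entityInteracted is mentioned in the text below; checkDoorOpen and entityReached are hypothetical names, standing for the yellow checking goals and for reaching the key item of Fig. 2:

GoalStructure testScenario =
   SEQ(entityInteracted("button1"),
       checkDoorOpen("door1"),
       entityInteracted("button2"),
       checkDoorOpen("door2"),
       // ... and so on for the remaining buttons and doors ...
       entityReached("keyItem")) ;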

An important thing to note is that the test task as formulated in Fig. 3 abstracts from the complex physical navigation in the game world. For example, simply guiding the player character from button-1 in Fig. 2 to button-2 is very hard to program in terms of primitive player movements. Under the hood, the tactic attached to e.g. the goal entityInteracted(e) in Fig. 3 uses a path-planning algorithm to calculate a route, and then automatically guides the agent to e. With such automation it becomes possible to formulate testing tasks abstractly, as in Fig. 3, after which they can be given to test agents to be executed.

Tactics and goals

Fig. 4: A tactic.

Fig. 4 above shows an example of a tactic to move the agent (the player-character). g1 … g4 are guards, one guarding each action. An action is only enabled if its guard evaluates to true. The combinator ANYof randomly chooses one of the enabled actions to execute. For example, if all the guards are simply “true”, then the above tactic will move the agent randomly. Recall that we said that an agent executes in update cycles. When given a tactic T, such as the one above, each cycle will execute the tactic once (in other words, the action chosen by the tactic will execute, once). Over multiple cycles, the agent will keep executing the same tactic, until we stop it. For this reason, we also give an agent a goal, so that it can stop when the goal is achieved (or when it exhausts its computation budget).
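As a sketch, a tactic like the one in Fig. 4 could be built along these lines. The action names and the guard-attaching notation on_ are illustrative, not necessarily the exact iv4xr API:

Tactic tactic1 =
   ANYof(moveNorth.on_(g1),   // each action is only enabled when its guard holds
         moveSouth.on_(g2),
         moveEast.on_(g3),
         moveWest.on_(g4)) ;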

Asking an agent to just move randomly is of course not very effective. We can also embed a pathfinder in the above guards, with the net effect that over multiple cycles they will guide the agent towards a certain destination. We will not discuss the details here; there is documentation about this, along with other ‘solvers’, linked from one of the sections below. Below we show that more sophisticated tactics can be built by combining simpler ones. E.g. the tactic navigateTo(e) is based on a tactic similar to tactic1 above, but guided by pathfinding (let’s call it the travelTo(e) tactic). It will guide the agent to the location of some gameobject e. This of course only works if e is already known to the agent. If the agent observes the game world the way an actual user would, it will not immediately see the entire world. If it has not seen e before, it will not be able to navigate to it either. In this case, the FIRSTof combinator in the example below falls back to the explore() tactic, which drives the agent to explore the game world until it sees e.

Fig. 5: More sophisticated tactics can be built by combining simpler ones.

The combinator FIRSTof executes the first enabled tactic. For example, if the agent is under attack, or if its health becomes too low, then survivalTactic becomes enabled, and it is the one that is executed instead of the travelTo tactic.
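Put together, the tactic of Fig. 5 could be sketched as follows (the names follow the text above; this is an illustration, not the exact iv4xr API):

Tactic navigateTo(String e) {
   return FIRSTof(survivalTactic,   // takes over when the agent is in danger
                  travelTo(e),      // pathfinding-guided travel towards e, when e is known
                  explore()) ;      // fallback: explore the world until e is seen
}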

An agent does not directly execute a tactic. Instead, we give it one or more goals. To solve a goal, we couple it with a tactic. When the goal becomes current, its associated tactic is executed, over multiple cycles, until the goal is solved. A goal itself is usually expressed as a predicate, as shown in the example below:

Goal reached(e) = goal("reached e").toSolve(S -> agent position in state S is next to e).withTactic( T ) ;

This goal is solved when the agent reaches a position next to a gameobject e. To solve this, the tactic T is coupled to it. We can give it to an agent and execute it like this:

GoalStructure G = reached(e).lift() ;             // wrap the goal as a goal structure

agent.setGoal(G) ;                                // assign the goal to the agent

while(G.status().inProgress()) agent.update() ;   // run the agent's update cycles

assertTrue(G.status().success()) ;                // check that the goal was achieved

Fig. 6: Running an agent.

As we have seen from the example in Fig. 2, a complex testing task may require multiple goals to formulate, as shown in Fig. 3. We can then use goal-combinators like SEQ to specify e.g. a series of sub-goals that have to be accomplished in a certain sequence. In the example in Fig. 3 we also notice that the test scenario contains pairs (i,d), where the first element is a goal to interact with an interactable i, which subsequently is expected to flip another gameobject d to a certain state (e.g. if d is a door, we want it to become open). This can be further improved by exploiting a so-called solver that automatically searches for an i that does the job, so that we do not have to explicitly specify i in the formulation of the testing task; see the sketch below.
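To illustrate the contrast (solve below is a hypothetical solver-produced goal, not a concrete iv4xr API, and checkDoorOpen is the hypothetical checking goal from the earlier sketch):

// explicitly specifying the interactable that should open the door:
SEQ(entityInteracted("button3"), checkDoorOpen("door3"))

// with a solver: let the agent search for an interactable that opens door3:
solve("door3", d -> d is open)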

JUnit, assertions, and traces

In Fig. 6 we have seen how to run a (test) agent. We can put this code in a JUnit test method. The code in Fig. 6 has one assertion to check, namely that the given goal must be achieved. In Fig. 3 we have seen how assertions can also be checked by inserting them as subgoals. Yet another way is to inspect the agent’s state, accessible via agent.state(), inside the loop in Fig. 6. The agent stores the observations it receives from the system under test in its state. E.g., when it sees a gameobject e, it remembers e’s state until it is updated. We can therefore inspect the agent’s state to check that e, once it has been seen, always has a certain expected property. You can also use Linear Temporal Logic (LTL) assertions; these are described in the documentation linked below.
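As an illustration, the run of Fig. 6 wrapped in a JUnit test method could look as follows. Here deployTestAgent() is a hypothetical helper that constructs the test agent and connects it to the game under test; the in-loop state inspection is likewise just an illustration:

@Test
public void testKeyItemReachable() {
   var agent = deployTestAgent() ;                  // hypothetical setup helper
   GoalStructure G = reached("keyItem").lift() ;
   agent.setGoal(G) ;
   while(G.status().inProgress()) {
      agent.update() ;
      // optionally inspect agent.state() here to check extra invariants
   }
   assertTrue(G.status().success()) ;
}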

From JUnit we get the success/fail verdicts of the tests. We can additionally instruct the agent to collect data during its run, which is then saved as a trace file in CSV format. This trace can be post-processed. We can, for example, use LTL again to query properties of the collected traces. Or we can visualise them, e.g. to see the physical coverage of the tests (that is, the areas in the game world that the agent visited during the test). The documentation linked below describes how to configure the agent to produce traces.

 
Fig. 7: A heat map showing the areas an agent visited. It reveals that the area marked green has not been visited.
 

What do we get?

For many highly interactive systems such as computer games, complex behaviour often emerges from the interactions of multiple entities. Such behaviour is typically very hard to cover with unit testing. On the other hand, system-level testing (e.g. testing complex scenarios as in the example in Fig. 2 above) is also very hard to automate. Agents can contribute here, as they are well suited for system-level testing, complementing unit testing. This can greatly improve the overall test coverage and find bugs that are hard to discover by unit testing.

A short demo-video

 
 


Where to obtain it and further reading

Papers

  • Concepts behind agent-based automated testing: I. S. W. B. Prasetya, Mehdi Dastani, Rui Prada, Tanja E. J. Vos, Frank Dignum, and Fitsum Kifetew. Aplib: Tactical Agents for Testing Computer Games. In Engineering Multi-Agent Systems workshop (EMAS), 2020.
  • I. S. W. B. Prasetya, Fernando Pastor Ricós, Fitsum Meshesha Kifetew, Davide Prandi, Samira Shirzadehhajimahmood, Tanja E. J. Vos, Premysl Paska, Karel Hovorka, Raihana Ferdous, Angelo Susi, and Joseph Davidson. 2022. An agent-based approach to automated game testing: an experience report. In Proceedings of the 13th International Workshop on Automating Test Case Design, Selection and Evaluation (A-TEST 2022). Association for Computing Machinery, New York, NY, USA, 1-8. doi.org/10.1145/3548659.3561305.