API Tutorial

This is a walkthrough of the mutatest API. These are the same method calls used by the CLI, and they provide additional flexibility for customization. The code and notebook used to generate this tutorial are located under the docs/api_tutorial folder on GitHub.

# Imports used throughout the tutorial

import ast

from pathlib import Path

from mutatest import run
from mutatest import transformers
from mutatest.api import Genome, GenomeGroup, MutationException
from mutatest.filters import CoverageFilter, CategoryCodeFilter

Tutorial setup

The example/ folder has two Python files, a.py and b.py, with a test_ab.py file that would be automatically detected by pytest.

# This folder and included .py files are in docs/api_tutorial/

src_loc = Path("example")

print(*[i for i in src_loc.iterdir()
        if i.is_file()], sep="\n")

example/a.py
example/test_ab.py
example/b.py

a.py holds two functions: one adds five to an input value, and the other returns True if the first input value is greater than the second. The add operation is represented in the AST as ast.Add, a BinOp operation type, and the greater-than operation is represented by ast.Gt, a Compare operation type. If the source code is executed, it prints 10.

def open_print(fn):
    """Open a file and print its contents."""
    with open(fn) as f:
        print(f.read())

# Contents of a.py example source file
open_print(src_loc / "a.py")

"""Example A.
"""

def add_five(a):
    return a + 5

def greater_than(a, b):
    return a > b

print(add_five(5))
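
The node types named above can be checked with the standard library alone. This is a minimal sketch that parses the two expressions from a.py in isolation; it does not use mutatest, and the variable names are only for illustration.

```python
import ast

# Parse each expression on its own; mode="eval" yields an ast.Expression
# whose .body is the expression node itself.
add_node = ast.parse("a + 5", mode="eval").body
gt_node = ast.parse("a > b", mode="eval").body

# a + 5 is a BinOp whose operator is ast.Add
add_summary = (type(add_node).__name__, type(add_node.op).__name__)

# a > b is a Compare whose single operator is ast.Gt
gt_summary = (type(gt_node).__name__, type(gt_node.ops[0]).__name__)

print(add_summary)
print(gt_summary)
```

The operator is a separate child node (op or ops) of the expression node, which is what makes it possible to swap one operator for another without touching the operands.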

b.py has a single function that returns whether or not the first input is the second input. The is operator is represented by ast.Is and is a CompareIs operation type. If this source code is executed, it prints True.

# Contents of b.py example source file

open_print(src_loc / "b.py")

"""Example B.
"""

def is_match(a, b):
    return a is b

print(is_match(1, 1))
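
Python's own ast module has a single Compare node type, so one way to picture a CompareIs type is as a Compare node grouped by the operator it carries. The sketch below uses only the standard library; the helper compare_category and its grouping rules are assumptions made for illustration, not mutatest's implementation.

```python
import ast


def compare_category(source):
    """Label a comparison expression by the kind of operator it uses."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Compare):
        raise ValueError(f"not a comparison: {source!r}")
    op = node.ops[0]
    if isinstance(op, (ast.Is, ast.IsNot)):
        return "CompareIs"
    if isinstance(op, (ast.In, ast.NotIn)):
        return "CompareIn"
    return "Compare"


labels = {src: compare_category(src) for src in ("a is b", "a in b", "a > b")}
print(labels)
```

Grouping by operator family keeps replacements sensible: an identity check is only swapped with another identity check, never with an ordering comparison.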

test_ab.py is the test script for both a.py and b.py. The test_add_five function is intentionally weak so that later mutations can be demonstrated: it calls add_five(6) and passes whenever the result is greater than 10, rather than checking that the result equals 11.

# Contents of test_ab.py example test file

open_print(src_loc / "test_ab.py")

from a import add_five
from b import is_match


def test_add_five():
    assert add_five(6) > 10


def test_is_match():
    assert is_match("one", "one")

Run a clean trial and generate coverage

We can use run to perform a “clean trial” of our test commands based on the source location. This generates a .coverage file that will be used by the Genome, although a .coverage file is not required. The clean_trial method is also useful before and after mutation trials as a way to reset the __pycache__.

# The return value of clean_trial is the time to run
# this is used in reporting from the CLI

run.clean_trial(
    src_loc, test_cmds=["pytest", "--cov=example"]
)

datetime.timedelta(microseconds=411150)

Path(".coverage").exists()

True
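
The idea of a timed trial can be sketched with the standard library. This is only an approximation of the concept under stated assumptions: timed_run is a hypothetical helper, it does not clear the __pycache__, and it is not the code behind run.clean_trial.

```python
import subprocess
import sys
from datetime import datetime, timedelta


def timed_run(cmd):
    """Run a command, fail on a non-zero exit code, and return the elapsed time."""
    start = datetime.now()
    subprocess.run(cmd, check=True, capture_output=True)
    return datetime.now() - start


# A trivial stand-in command; a real trial would pass something like
# ["pytest", "--cov=example"] instead.
elapsed = timed_run([sys.executable, "-c", "pass"])
print(elapsed)
```

Returning a timedelta matches the shape of the value shown above, which makes it easy to compare a baseline run against later mutation trials.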

Genome Basics

Genomes are the basic representation of a source code file in mutatest. They can be initialized by passing in the path to a specific file, or initialized without any arguments and have the source file added later. The basic properties include the Abstract Syntax Tree (AST), the source file, the coverage file, and any category codes for filtering.

# Initialize with the source file location
# By default, the ".coverage" file is set
# for the coverage_file property

genome = Genome(src_loc / "a.py")

genome.source_file

PosixPath('example/a.py')

genome.coverage_file

PosixPath('.coverage')

# By default, no filter codes are set
# These are categories of mutations to filter

genome.filter_codes

set()

Finding mutation targets

The Genome has two additional properties related to finding mutation locations: targets and covered_targets. These are sets of LocIndex objects (defined in transformers) that represent locations in the AST that can be mutated. Covered targets are those that have lines covered by the set coverage_file property.

genome.targets

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16),
 LocIndex(ast_class='Compare', lineno=10, col_offset=11, op_type=<class '_ast.Gt'>, end_lineno=10, end_col_offset=16)}

genome.covered_targets

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)}

genome.targets - genome.covered_targets

{LocIndex(ast_class='Compare', lineno=10, col_offset=11, op_type=<class '_ast.Gt'>, end_lineno=10, end_col_offset=16)}
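
Target discovery of this kind can be approximated by walking a parsed tree and recording where operator nodes sit. The following is a rough sketch using only the standard library; the Target record and find_targets function are hypothetical stand-ins rather than the LocIndex type or the mutatest parser, and the sample source uses its own layout, so its line numbers differ from the output above.

```python
import ast
from typing import NamedTuple


class Target(NamedTuple):
    """Hypothetical stand-in for a mutation location record."""
    ast_class: str
    lineno: int
    col_offset: int
    op_type: type


SOURCE = '''def add_five(a):
    return a + 5

def greater_than(a, b):
    return a > b
'''


def find_targets(source):
    """Collect BinOp and Compare nodes as hashable location records."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp):
            found.add(Target("BinOp", node.lineno, node.col_offset, type(node.op)))
        elif isinstance(node, ast.Compare):
            found.add(Target("Compare", node.lineno, node.col_offset, type(node.ops[0])))
    return found


targets = find_targets(SOURCE)
for t in sorted(targets, key=lambda t: t.lineno):
    print(t)
```

Because the records are hashable tuples, they can be stored in sets, which is what allows set arithmetic such as subtracting covered targets from all targets.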

Accessing the AST

The ast property is the AST of the source file, and you can access its properties directly. It is used to generate the targets and covered targets. The AST parser is defined in transformers but is encapsulated in the Genome.

genome.ast

<_ast.Module at 0x7f68a4014bb0>

genome.ast.body

[<_ast.Expr at 0x7f68a4014ca0>,
 <_ast.FunctionDef at 0x7f68a4014ac0>,
 <_ast.FunctionDef at 0x7f68a4014eb0>,
 <_ast.Expr at 0x7f68a402c040>]

genome.ast.body[1].__dict__

{'name': 'add_five',
 'args': <_ast.arguments at 0x7f68a4014d30>,
 'body': [<_ast.Return at 0x7f68a4014dc0>],
 'decorator_list': [],
 'returns': None,
 'type_comment': None,
 'lineno': 5,
 'col_offset': 0,
 'end_lineno': 6,
 'end_col_offset': 16}
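
The four entries in the module body correspond to the docstring, the two function definitions, and the final print call. A short standard-library sketch makes that mapping visible; the SOURCE string is a retyped copy of a.py, so its exact blank-line layout is an assumption.

```python
import ast

SOURCE = '''"""Example A.
"""

def add_five(a):
    return a + 5

def greater_than(a, b):
    return a > b

print(add_five(5))
'''

tree = ast.parse(SOURCE)

# One entry per top-level statement, in source order
body_types = [type(node).__name__ for node in tree.body]

# FunctionDef nodes carry the function name directly
function_names = [n.name for n in tree.body if isinstance(n, ast.FunctionDef)]

# The leading Expr is the module docstring
docstring = ast.get_docstring(tree)

print(body_types)
print(function_names)
```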

Filtering mutation targets

You can set filters on a Genome for specific types of targets. For example, setting bn for BinOp will filter both targets and covered targets to only BinOp class operations.

# All available categories are listed
# in transformers.CATEGORIES

print(*[f"Category:{k}, Code: {v}"
        for k,v in transformers.CATEGORIES.items()],
      sep="\n")

Category:AugAssign, Code: aa
Category:BinOp, Code: bn
Category:BinOpBC, Code: bc
Category:BinOpBS, Code: bs
Category:BoolOp, Code: bl
Category:Compare, Code: cp
Category:CompareIn, Code: cn
Category:CompareIs, Code: cs
Category:If, Code: if
Category:Index, Code: ix
Category:NameConstant, Code: nc
Category:SliceUS, Code: su

# If you attempt to set an invalid code a ValueError is raised
# and the valid codes are listed in the message

try:
    genome.filter_codes = ("asdf",)

except ValueError as e:
    print(e)

Invalid category codes: {'asdf'}.
Valid codes: {'AugAssign': 'aa', 'BinOp': 'bn', 'BinOpBC': 'bc', 'BinOpBS': 'bs', 'BoolOp': 'bl', 'Compare': 'cp', 'CompareIn': 'cn', 'CompareIs': 'cs', 'If': 'if', 'Index': 'ix', 'NameConstant': 'nc', 'SliceUS': 'su'}

# Set the filter using an iterable of the two-letter codes

genome.filter_codes = ("bn",)

# Targets and covered targets will only show the filtered value

genome.targets

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)}

genome.covered_targets

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)}

# Reset the filter_codes to an empty set
genome.filter_codes = set()

# All target classes are now listed again

genome.targets

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16),
 LocIndex(ast_class='Compare', lineno=10, col_offset=11, op_type=<class '_ast.Gt'>, end_lineno=10, end_col_offset=16)}

Using custom filters

If you need more flexibility, the filters module defines the two classes of filter used by Genome: the CoverageFilter and the CategoryCodeFilter. These are already encapsulated by Genome and GenomeGroup but can be accessed directly.

Coverage Filter

cov_filter = CoverageFilter(coverage_file=Path(".coverage"))

# Use the filter method to filter targets based on
# a given source file.

cov_filter.filter(
    genome.targets, genome.source_file
)

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)}

# You can invert the filtering as well

cov_filter.filter(
    genome.targets, genome.source_file,
    invert=True
)

{LocIndex(ast_class='Compare', lineno=10, col_offset=11, op_type=<class '_ast.Gt'>, end_lineno=10, end_col_offset=16)}
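
Conceptually, a coverage filter keeps the targets whose line numbers appear in the set of covered lines, and the invert option keeps the complement. The sketch below illustrates that logic with plain sets; the Target record, filter_by_coverage function, and covered line numbers are all assumptions, and nothing here reads a real .coverage file.

```python
from typing import NamedTuple


class Target(NamedTuple):
    """Hypothetical location record; only lineno matters for this filter."""
    ast_class: str
    lineno: int


def filter_by_coverage(targets, covered_lines, invert=False):
    """Keep targets on covered lines, or the uncovered ones when invert=True."""
    return {t for t in targets if (t.lineno in covered_lines) != invert}


targets = {Target("BinOp", 6), Target("Compare", 10)}
covered_lines = {4, 5, 6, 12}  # assumed line numbers, not read from .coverage

covered = filter_by_coverage(targets, covered_lines)
uncovered = filter_by_coverage(targets, covered_lines, invert=True)

print(covered)
print(uncovered)
```

The two results always partition the input set, which mirrors the relationship between targets, covered_targets, and their difference shown earlier.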

Category Code Filter

# Instantiate using a set of codes
# or add them later

catcode_filter = CategoryCodeFilter(codes=("bn",))

# Valid codes provide all potential values

catcode_filter.valid_codes

dict_values(['aa', 'bn', 'bc', 'bs', 'bl', 'cp', 'cn', 'cs', 'if', 'ix', 'nc', 'su'])

# Valid categories are also available

catcode_filter.valid_categories

{'AugAssign': 'aa',
 'BinOp': 'bn',
 'BinOpBC': 'bc',
 'BinOpBS': 'bs',
 'BoolOp': 'bl',
 'Compare': 'cp',
 'CompareIn': 'cn',
 'CompareIs': 'cs',
 'If': 'if',
 'Index': 'ix',
 'NameConstant': 'nc',
 'SliceUS': 'su'}

# add more codes

catcode_filter.add_code("aa")
catcode_filter.codes

{'aa', 'bn'}

# see all valid mutations
# based on the set codes

catcode_filter.valid_mutations

{_ast.Add,
 _ast.Div,
 _ast.FloorDiv,
 _ast.Mod,
 _ast.Mult,
 _ast.Pow,
 _ast.Sub,
 'AugAssign_Add',
 'AugAssign_Div',
 'AugAssign_Mult',
 'AugAssign_Sub'}

# discard codes

catcode_filter.discard_code("aa")
catcode_filter.codes

{'bn'}

catcode_filter.valid_mutations

{_ast.Add, _ast.Div, _ast.FloorDiv, _ast.Mod, _ast.Mult, _ast.Pow, _ast.Sub}

# Filter a set of targets based on codes

catcode_filter.filter(genome.targets)

{LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)}

# Optionally, invert the filter

catcode_filter.filter(
    genome.targets, invert=True
)

{LocIndex(ast_class='Compare', lineno=10, col_offset=11, op_type=<class '_ast.Gt'>, end_lineno=10, end_col_offset=16)}
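
The same keep-or-invert pattern applies to category codes: each code maps to a set of operator types, and a target is kept when its operator belongs to one of the selected sets. This is an illustrative sketch only. The CATEGORY_TABLE below is an assumed subset (the "bn" operators match the valid_mutations output above, while the "cp" operators are a guess), and filter_by_codes is a hypothetical function.

```python
import ast

# Assumed subset of a category table: code -> (ast_class, operator types)
CATEGORY_TABLE = {
    "bn": ("BinOp", {ast.Add, ast.Sub, ast.Mult, ast.Div,
                     ast.FloorDiv, ast.Mod, ast.Pow}),
    "cp": ("Compare", {ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE}),
}


def filter_by_codes(targets, codes, invert=False):
    """Keep (ast_class, op_type) targets whose operator is in the chosen codes."""
    unknown = set(codes) - set(CATEGORY_TABLE)
    if unknown:
        raise ValueError(f"Invalid category codes: {unknown}")
    allowed = set()
    for code in codes:
        allowed |= CATEGORY_TABLE[code][1]
    return {t for t in targets if (t[1] in allowed) != invert}


targets = {("BinOp", ast.Add), ("Compare", ast.Gt)}
kept = filter_by_codes(targets, {"bn"})
dropped = filter_by_codes(targets, {"bn"}, invert=True)

print(kept)
print(dropped)
```

Validating the codes before filtering gives the same early failure as the ValueError shown for an invalid filter code.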

Changing the source file in a Genome

If you change the source file property of the Genome, all core properties except the coverage file and filters are reset. This includes targets, covered targets, and the AST.

genome.source_file = src_loc / "b.py"

genome.targets

{LocIndex(ast_class='CompareIs', lineno=6, col_offset=11, op_type=<class '_ast.Is'>, end_lineno=6, end_col_offset=17)}

genome.covered_targets

{LocIndex(ast_class='CompareIs', lineno=6, col_offset=11, op_type=<class '_ast.Is'>, end_lineno=6, end_col_offset=17)}
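
One common way to get this reset behavior is a property setter that clears cached, derived state whenever the source path changes. The class below is a hypothetical sketch of that pattern, not the Genome implementation; SourceHolder and its attribute names are invented for illustration.

```python
import ast
import tempfile
from pathlib import Path


class SourceHolder:
    """Hypothetical class: caches a parsed tree and drops it when the file changes."""

    def __init__(self, source_file, coverage_file=Path(".coverage")):
        self._source_file = Path(source_file)
        self.coverage_file = Path(coverage_file)  # kept across source changes
        self._tree = None

    @property
    def source_file(self):
        return self._source_file

    @source_file.setter
    def source_file(self, value):
        self._source_file = Path(value)
        self._tree = None  # derived state is cleared and rebuilt on next access

    @property
    def tree(self):
        if self._tree is None:
            self._tree = ast.parse(self._source_file.read_text())
        return self._tree


with tempfile.TemporaryDirectory() as tmp:
    a, b = Path(tmp) / "a.py", Path(tmp) / "b.py"
    a.write_text("x = p + q\n")
    b.write_text("y = p is q\n")

    holder = SourceHolder(a)
    first = type(holder.tree.body[0].value).__name__
    holder.source_file = b
    second = type(holder.tree.body[0].value).__name__

print(first, second, holder.coverage_file)
```

Clearing the cache lazily means that switching files is cheap, and the new tree is only parsed when something asks for it.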

Creating Mutations

Mutations are applied to specific LocIndex targets in a Genome. You must specify a valid operation, e.g., “add” can be mutated to “divide” or “subtract”, but not “is”. The Genome itself is not modified; instead, a returned Mutant object holds the information required to create a mutated version of the __pycache__ for that source file. For this example, we’ll change a.py to use a multiplication operation instead of an addition operation in the add_five function. If executed, the original code prints 10 from 5 + 5; with the mutation it prints 25, since the mutation creates 5 * 5.

# Set the Genome back to example a
# filter to only the BinOp targets

genome.source_file = src_loc / "a.py"
genome.filter_codes = ("bn",)

# there is only one BinOp target

mutation_target = list(genome.targets)[0]
mutation_target

LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)

# The mutate() method applies a mutation operation
# and returns a mutant

mutant = genome.mutate(mutation_target, ast.Mult)

# applying an invalid mutation
# raises a MutationException

try:
    genome.mutate(mutation_target, ast.IsNot)

except MutationException as e:
    print(e)

<class '_ast.IsNot'> is not a member of mutation category bn.
Valid mutations for bn: {<class '_ast.Mult'>, <class '_ast.Sub'>, <class '_ast.Add'>, <class '_ast.Pow'>, <class '_ast.FloorDiv'>, <class '_ast.Mod'>, <class '_ast.Div'>}.

# mutants have all of the properties
# needed to write mutated __pycache__

mutant

Mutant(mutant_code=<code object <module> at 0x7f68a4040b30, file "example/a.py", line 1>, src_file=PosixPath('example/a.py'), cfile=PosixPath('example/__pycache__/a.cpython-38.pyc'), loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f689cfbd310>, source_stats={'mtime': 1571346690.5703905, 'size': 118}, mode=33188, src_idx=LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16), mutation=<class '_ast.Mult'>)

# You can directly execute the mutant_code
# This result is with the mutated target being
# applied as Mult instead of Add in a.py

exec(mutant.mutant_code)
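
The effect of swapping Add for Mult can be reproduced with the standard library by rewriting the tree and compiling the result. This is a minimal sketch of the general technique using ast.NodeTransformer; SwapBinOp and run_add_five are hypothetical names, the source is a trimmed copy of add_five with its own line numbers, and this is not how the Mutant object is built internally.

```python
import ast

SOURCE = "def add_five(a):\n    return a + 5\n"


class SwapBinOp(ast.NodeTransformer):
    """Replace the operator of the BinOp found at one (lineno, col_offset)."""

    def __init__(self, lineno, col_offset, new_op):
        self.lineno = lineno
        self.col_offset = col_offset
        self.new_op = new_op

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (node.lineno, node.col_offset) == (self.lineno, self.col_offset):
            swapped = ast.BinOp(left=node.left, op=self.new_op(), right=node.right)
            return ast.copy_location(swapped, node)
        return node


def run_add_five(tree, value):
    """Compile a module tree and call its add_five function."""
    namespace = {}
    exec(compile(tree, "<sketch>", "exec"), namespace)
    return namespace["add_five"](value)


original_tree = ast.parse(SOURCE)
mutated_tree = SwapBinOp(2, 11, ast.Mult).visit(ast.parse(SOURCE))
ast.fix_missing_locations(mutated_tree)

original_result = run_add_five(original_tree, 5)
mutated_result = run_add_five(mutated_tree, 5)
print(original_result, mutated_result)
```

The original tree is parsed separately and left untouched, which parallels the point that the Genome itself is not modified when a mutant is created.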

# Mutants have a write_cache() method to apply
# the change to __pycache__

mutant.write_cache()

# Alternatively, use run to do a single trial
# and return the result

mutant_trial_result = run.create_mutation_run_trial(
    genome, mutation_target, ast.Mult, ["pytest"], max_runtime=5
)

# In this case the mutation would survive
# The test passes if the value is
# greater than 10.

mutant_trial_result.status

'SURVIVED'

# Using a different operation, such as Div
# will be a detected mutation
# since the test will fail.

mutant_trial_result = run.create_mutation_run_trial(
    genome, mutation_target, ast.Div, ["pytest"], max_runtime=5
)

mutant_trial_result.status

'DETECTED'
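
The two outcomes follow directly from the arithmetic in test_add_five, which asserts that the result for an input of 6 is greater than 10. The sketch below applies each candidate replacement operator to 6 and 5 and checks the same condition; it does not run pytest or mutatest, and trial_status is a hypothetical helper that only mirrors the assertion.

```python
import operator

# Candidate replacements for the "+" in add_five, named after the ast classes
CANDIDATES = {
    "Sub": operator.sub,
    "Mult": operator.mul,
    "Div": operator.truediv,
    "FloorDiv": operator.floordiv,
    "Mod": operator.mod,
    "Pow": operator.pow,
}


def trial_status(op):
    """Mirror test_add_five: the assertion holds when op(6, 5) > 10."""
    return "SURVIVED" if op(6, 5) > 10 else "DETECTED"


statuses = {name: trial_status(op) for name, op in CANDIDATES.items()}
for name, status in statuses.items():
    print(name, status)
```

Under this check, only the replacements that make the result larger than 10 go unnoticed, which is why an equality assertion would be a stronger test.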

GenomeGroups

The GenomeGroup is a way to interact with multiple Genomes. You can create a GenomeGroup from a folder of files, add new Genomes, and access shared properties across the Genomes. It is a MutableMapping and behaves accordingly, though it only accepts Path keys and Genome values. You can use the GenomeGroup to assign common filters, common coverage files, and to get all targets across an entire collection of Genomes.

ggrp = GenomeGroup(src_loc)

# key-value pairs in the GenomeGroup are
# the path to the source file
# and the Genome object for that file

for k,v in ggrp.items():
    print(k, v)

example/a.py <mutatest.api.Genome object at 0x7f689cfc8c10>
example/b.py <mutatest.api.Genome object at 0x7f689cfc8f70>

# targets and covered_targets produce
# GenomeGroupTarget objects that have
# attributes for the source path and
# the LocIndex for the target

for t in ggrp.targets:
    print(
        t.source_path, t.loc_idx
    )

example/b.py LocIndex(ast_class='CompareIs', lineno=6, col_offset=11, op_type=<class '_ast.Is'>, end_lineno=6, end_col_offset=17)
example/a.py LocIndex(ast_class='Compare', lineno=10, col_offset=11, op_type=<class '_ast.Gt'>, end_lineno=10, end_col_offset=16)
example/a.py LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)

# You can set a filter or
# coverage file for the entire set
# of genomes

ggrp.set_coverage(Path(".coverage"))

for t in ggrp.covered_targets:
    print(
        t.source_path, t.loc_idx
    )

example/b.py LocIndex(ast_class='CompareIs', lineno=6, col_offset=11, op_type=<class '_ast.Is'>, end_lineno=6, end_col_offset=17)
example/a.py LocIndex(ast_class='BinOp', lineno=6, col_offset=11, op_type=<class '_ast.Add'>, end_lineno=6, end_col_offset=16)

# Setting filter codes on all Genomes
# in the group

ggrp.set_filter(("cs",))
ggrp.targets

{GenomeGroupTarget(source_path=PosixPath('example/b.py'), loc_idx=LocIndex(ast_class='CompareIs', lineno=6, col_offset=11, op_type=<class '_ast.Is'>, end_lineno=6, end_col_offset=17))}

for k, v in ggrp.items():
    print(k, v.filter_codes)

example/a.py {'cs'}
example/b.py {'cs'}

# MutableMapping operations are
# available as well

ggrp.values()

dict_values([<mutatest.api.Genome object at 0x7f689cfc8c10>, <mutatest.api.Genome object at 0x7f689cfc8f70>])

ggrp.keys()

dict_keys([PosixPath('example/a.py'), PosixPath('example/b.py')])

# pop a Genome out of the Group

genome_a = ggrp.pop(Path("example/a.py"))
ggrp

{PosixPath('example/b.py'): <mutatest.api.Genome object at 0x7f689cfc8f70>}

# add a Genome to the group

ggrp.add_genome(genome_a)
ggrp

{PosixPath('example/b.py'): <mutatest.api.Genome object at 0x7f689cfc8f70>, PosixPath('example/a.py'): <mutatest.api.Genome object at 0x7f689cfc8c10>}
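
A mapping that restricts its key and value types can be built on collections.abc.MutableMapping, which supplies methods such as pop, keys, values, and items once five core methods are defined. The class below is a hypothetical sketch of that pattern; PathKeyedGroup is an invented name, and plain strings stand in for Genome objects.

```python
from collections.abc import MutableMapping
from pathlib import Path


class PathKeyedGroup(MutableMapping):
    """Hypothetical mapping that accepts only Path keys and one value type."""

    def __init__(self, value_type):
        self._value_type = value_type
        self._store = {}

    def __setitem__(self, key, value):
        if not isinstance(key, Path):
            raise TypeError(f"key must be a Path, got {type(key).__name__}")
        if not isinstance(value, self._value_type):
            raise TypeError(f"value must be {self._value_type.__name__}")
        self._store[key] = value

    def __getitem__(self, key):
        return self._store[key]

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


group = PathKeyedGroup(value_type=str)
group[Path("example/a.py")] = "genome for a"
group[Path("example/b.py")] = "genome for b"

popped = group.pop(Path("example/a.py"))  # pop is provided by MutableMapping
remaining = list(group.keys())

try:
    group["example/c.py"] = "string key"  # rejected: the key is not a Path
    rejected = False
except TypeError:
    rejected = True

print(popped, remaining, rejected)
```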

# the add_folder method provides
# more flexibility, e.g., to include
# the test_ files.

ggrp_with_tests = GenomeGroup()
ggrp_with_tests.add_folder(
    src_loc, ignore_test_files=False
)

for k, v in ggrp_with_tests.items():
    print(k, v)

example/a.py <mutatest.api.Genome object at 0x7f68a4044700>
example/test_ab.py <mutatest.api.Genome object at 0x7f689cfd7340>
example/b.py <mutatest.api.Genome object at 0x7f689cfd74f0>