Enhancing Lexical Representation of Test Coverage for Failure Clustering

Failure clustering aims to group multiple test failures by their shared root causes, helping developers comprehend and debug each root cause (i.e., the underlying fault) in isolation. Clustering failing test executions requires a distance between those executions, for which distance measures between coverage vectors are widely used. Lexical representation of coverage has been suggested as an alternative: each structural element covered by a failing execution is represented by the lexical tokens in that element. This paper investigates whether the granularity of the lexical representation affects the effectiveness of failure clustering. We evaluate varying levels of tokenisation granularity by using them to cluster coexisting real-world test failures in the Defects4J benchmark. Our results show that the traditionally adopted subtokenisation can break up larger, semantically meaningful token units, resulting in suboptimal clustering.
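To make the notion of tokenisation granularity concrete, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, that each covered element is represented only by its identifier name, and it uses Jaccard distance over token sets; the function names (`subtokens`, `represent`, `jaccard_distance`) and the example identifiers are hypothetical.

```python
import re


def subtokens(identifier):
    """Split a camelCase identifier into lowercase subtokens (illustrative rule)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def represent(covered, granularity):
    """Build a token set for one failing execution.

    covered:     names of the elements the execution covers (assumed input)
    granularity: "identifier" keeps each name whole; anything else splits
                 each name into subtokens
    """
    tokens = set()
    for name in covered:
        if granularity == "identifier":
            tokens.add(name.lower())
        else:
            tokens.update(subtokens(name))
    return tokens


def jaccard_distance(a, b):
    """Jaccard distance between two token sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Two hypothetical failing executions covering different methods.
exec_a = ["parseDate"]
exec_b = ["formatDate"]

for level in ("identifier", "subtoken"):
    d = jaccard_distance(represent(exec_a, level), represent(exec_b, level))
    print(level, round(d, 3))
```

In this toy case the two executions share no whole identifiers but share the subtoken `date`, so the distance between them depends on the chosen granularity. Any clustering algorithm that consumes these pairwise distances could therefore group the failures differently at each level.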
Publisher
IEEE
Issue Date
2021-11
Language
English
Citation

2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW), pp.232 - 238

DOI
10.1109/asew52652.2021.00052
URI
http://hdl.handle.net/10203/312218
Appears in Collection
CS-Conference Papers