Uploaded image for project: 'Qt Quality Assurance Infrastructure'
  1. Qt Quality Assurance Infrastructure
  2. QTQAINFRA-7214

API change review scripts need a rework of resetboring.py

    XMLWordPrintable

Details

    • Bug
    • Resolution: Unresolved
    • P3: Somewhat important
    • None
    • production
    • qtqa module
    • None
    • Linux/X11

    Description

      I wrote qtqa/scripts/api-review/resetboring.py before I fully understood the problem it needed to solve; only writing it and later experience with using and updating it taught me what it should really be doing. Its design should be reviewed, with drastic refactoring or even rewriting entirely from scratch as options on the table. A broader review of the whole qtqa/scripts/api-review/ scripts might also be in order by now, given that they've evolved a lot over the last nine years.

      Its basic model is, for each "hunk" of a git diff --cached (as represented to it by the dulwich package), apply a set of "boring" transformations to the new and old versions of each line; where this turns a new line into the same as some old line, it resets that line of the diff, telling git to treat the new line as if its current version were the old line. This typically causes git diff --cached to subsequently not see that line as changed; when all lines of a hunk are likewise identified as unchanged, the hunk is dropped from the diff and thus from the ensuing commit used for the API change review. This is effectively equivalent to an automated git reset -p saying "no" to each of the boring changes (or editing them out). This kinda works mostly tolerably well, but has its difficulties.

      Each "boring" transformation is represented, somewhat naïvely, by a pair of (python) functions (test, purge); each is passed a token-sequence representing a parsed form of the line; if test(tokens) is true, the querying code uses purge(tokens) in place of tokens thereafter; it applies this successively for each pair in a long sequence. However, each transformation detects the "new" form and turns it into its "old" equivalent; where there have been several past forms that could become one modern form, this leads to ambiguities (which are handled, but that gets a bit clunky). Some of the transformations that have been deemed "boring" for these purposes have added something previously absent, but in some cases there is also a matching (also boring) transformation turning something old into the same thing as those add, as an example of such an ambiguity of prior form.

      It would be better – in particular, more flexible – for each recipe in the sequence to be encoded by some object (of some simple base-class; in the initial refactoring to this form, it could just package the pair of functions) that could potentially do something a bit more complex. In any case, several of the pairs of functions have later "arguments" that are really tunnelling local values from where they're defined to when they're run as defaults for these, as the callers don't pass more than one argument; this data would be better delivered to the constructor of the object that takes the place of the pair of functions.

      The current form of the process, reducing each token-sequence to a "minimal" form (in the code's current language; it should really be considered as "base" or "canonical" form), applies transformations from what's modern to what was (in some cases) there in the past; however, the transformer objects merely need to be consistent about which way they go, they could equally go the other way. This might save some of the ambiguity issues.

      The overall structure of the Selector class, and particularly its Selector.Censor helper (which packages the recipes and does the canonicalisation of lines) could undoubtedly also benefit from a design review and some careful thought about how to do it better. Doing that would probably need me available to discuss it and explain whatever I neglected to document (or documented unclearly) and why things are the way they are (and which of those "why" reasons are valid, which just mistakes I made in the initial work).

      I'm going to retire some time this decade and someone needs to take over maintenance of these scripts; if they have to replace them to have something they can maintain, then we need to give them the time to do that. The scripts are now run by the release team, but they need someone to maintain the scripts: it's not clear which team is best placed to do that. Decisions about what should (and should not) count as "boring" ultimately come from development; but maintaining the script might be better handled by QA&QE. Doing the hand-over before I retire would be prudent.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            daniel.smith Daniel Smith
            Eddy Edward Welbourne
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Gerrit Reviews

                There are no open Gerrit changes