Loading...

XML

Word

Printable

Type: Task
Resolution: Unresolved
Priority: P3: Somewhat important
Fix Version/s: None
Affects Version/s: None
Component/s: Metrics / Test Results
Labels:
None

test_datatags table has two columns, that associate a datatag with a numeric ID:

CREATE TABLE IF NOT EXISTS test_datatags
(
    test_datatag_n  integer  PRIMARY KEY  NOT NULL  GENERATED BY DEFAULT AS IDENTITY,
    test_datatag    text  UNIQUE NOT NULL
);

Such tables help normalize the schema (avoid repeating the same string on other tables, only the numeric id) and are not supposed to grow much.

In the past we had the problem of too many completely unique datatags (for example memory locations or random numbers) that don't match between different test runs, that have blown up this frequently accessed table to many million rows.

This has mostly been resolved on dev with targetted test fixes. Only few remain.

But every 5.15 integration keeps adding thousands of new datatags, and this slows things down.

How to resolve?

Possibly break this table into one "archive" table and a "current" one? The unique values are rarely accessed and could be moved to the archival table.

So I need an offline procedure to search for datatags unique in test_runs_raw, produced only from old workitems from old tasks.
Then move those to a "datatags_archive" table.
Will need to break the foreign-key relationship test_runs_raw->test_datatags since datatags will be missing. I do not want to have a foreign-key relationship to the archive table, because the archive table would become the bottleneck then.
Change any JOIN to test_datatag_n to LEFT JOIN in the views that read test_runs_raw.

Another strategy is to add another "count" column and only keep values with count>1.

This needs a bit of thinking because the UNIQUE index covering the whole table is necessary to do the INSERT in a performant way. And that index is 1GB for now.

How can I mark it to not blow up postgres' shared_buffers?
Maybe move it to the compressed drive so that it's easier for the kernel to cache?
If I remove the UNIQUE index will the INSERTION be performant? Will the uniqueness be guaranteed by the application logic? TODO test on dev server.

- Issue Only
- Show All Reviews
- Show Open Reviews

No reviews matched the request. Check your Options in the drop-down menu of this sections header.

Assignee:: Dimitrios Apostolou

Reporter:: Dimitrios Apostolou

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Created:: 12 Jul '24 19:28

Updated:: 19 Aug '24 14:01

There are no open Gerrit changes

Details

Description

How to resolve?

Attachments

Gerrit Reviews

Activity

People

Dates

Gerrit Reviews