Reduce create memory usage #96

@tomwhite

Description

Currently, create builds in-memory arrays for all of the fixed (VCF) fields from the two input VCZs in order to merge them into a single VCZ: variant_contig, variant_position, variant_allele, variant_id, variant_quality, and variant_filter.

It would be better to load only the first three (variant_contig, variant_position, variant_allele), keep the resulting indexes (and sort order) in memory, and then process the other variant fields sequentially. This is similar to normalise, which first computes an index and then applies it to the dataset.
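A minimal sketch of the idea, assuming NumPy arrays and a plain sorted interleave of the two inputs. The function names `merge_order` and `merge_field` are hypothetical, not the project's actual code. Variants are keyed here by an integer contig index, a position, and a single allele string, and the real merge semantics (for example, how matching variants are combined) may differ.

```python
import numpy as np


def merge_order(contig_a, pos_a, allele_a, contig_b, pos_b, allele_b):
    """Compute the row order that merges inputs A and B by (contig, position, allele).

    Indexes below len(contig_a) refer to rows of A; the rest refer to rows of B,
    offset by len(contig_a). Only the three key fields are held in memory here.
    """
    contig = np.concatenate([contig_a, contig_b])
    pos = np.concatenate([pos_a, pos_b])
    allele = np.concatenate([allele_a, allele_b])
    # np.lexsort treats the LAST key as the primary one, so keys are listed in reverse.
    return np.lexsort((allele, pos, contig))


def merge_field(order, field_a, field_b):
    """Apply the precomputed order to one pair of field arrays."""
    return np.concatenate([field_a, field_b])[order]


# Hypothetical toy inputs: three variants in A, two in B.
order = merge_order(
    np.array([0, 0, 1]), np.array([10, 30, 5]), np.array(["A", "C", "G"]),
    np.array([0, 1]), np.array([20, 1]), np.array(["T", "A"]),
)

# Remaining fields are processed one at a time, reusing the same order.
merged_id = merge_field(order, np.array(["a1", "a2", "a3"]), np.array(["b1", "b2"]))
```

With this shape, only the key fields and the integer `order` array stay resident. Each remaining field (variant_id, variant_quality, variant_filter) can be read, reordered, written, and released before the next one is loaded.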

See #95 for more details and timings.
