load_gff3 miscalculates CDS #60

@mpoelchau

We've been having trouble with load_gff3 - I initially posted this on the Apollo repo (GMOD/Apollo#2662), reposting it here now. Thanks for considering this, I'd appreciate any pointers!

We are trying to use the python-apollo arrow annotations load_gff3 command to load annotations to the user-created annotations track. It is changing the CDS locations of the model, both with and without the --disable_cds_recalculation option.

Here is what a load without --disable_cds_recalculation looks like; the correct frame can be seen in the track below.

Screenshot 2023-11-28 at 2 53 30 PM

The gff3 that was used to load the annotation has 6 CDS lines; the gff3 for the uploaded annotation has 12 CDS lines (even though the view shows only one CDS segment). Apollo also won't calculate a protein or CDS sequence on the uploaded annotation.

Here is what a load with --disable_cds_recalculation looks like (command: arrow annotations load_gff3 --source https://apollo2-stage-node1-cbo.nal.usda.gov/apollo Anoplophora_glabripennis ~/Downloads/NW_019416298.gff3 --disable_cds_recalculation)
Screenshot 2023-11-28 at 2 50 01 PM

Again, the gff3 for the uploaded annotation in Apollo has 12 CDS lines instead of 6. Apollo also won't calculate a protein or CDS sequence on the uploaded annotation.
I'll note that if you run the same command multiple times, the single CDS will display in a different spot each time.

If you load the annotation by dragging it up, it loads correctly:

Screenshot 2023-11-28 at 2 56 20 PM

This is happening for many (but not all) annotations in multiple assemblies/organisms.

Some other observations:

  • We haven't observed the problem for models with a single CDS/exon segment
  • The underlying genomic sequence has lowercase nucleotides
  • I tried using load_legacy_gff3, and that calculated the CDS correctly, but I'm unable to delete features loaded with that method. The error is: Hibernate operation: could not execute statement; SQL [n/a]; ERROR: update or delete on table "feature" violates foreign key constraint "fk_8jm56covt0m7m0m191bc5jseh" on table "feature_relationship" Detail: Key (id)=(4858111) is still referenced from table "feature_relationship".; nested exception is org.postgresql.util.PSQLException: ERROR: update or delete on table "feature" violates foreign key constraint "fk_8jm56covt0m7m0m191bc5jseh" on table "feature_relationship" Detail: Key (id)=(4858111) is still referenced from table "feature_relationship".

I've attached "before" and "after" gff3s. (I used the .txt extension because GitHub wouldn't let me upload them otherwise.)
before.txt
after-nocdsrecalc.txt
after.txt
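
For anyone who wants to check the CDS line counts in the attached files outside of Apollo, something like the following could work. It is only a minimal sketch, assuming standard nine-column, tab-delimited GFF3 in which each CDS line carries a Parent attribute; the function names `cds_by_parent` and `summarize` are hypothetical helpers and are not part of python-apollo. It reports, per parent, how many CDS lines there are and how many distinct CDS intervals they cover, so duplicated CDS lines show up as a mismatch between the two numbers.

```python
from collections import defaultdict


def cds_by_parent(gff3_text):
    """Map each Parent ID to its sorted (start, end, strand, phase) CDS rows."""
    cds = defaultdict(list)
    for line in gff3_text.splitlines():
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Assumption: well-formed feature lines have exactly nine columns.
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # A CDS may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            cds[parent].append((int(cols[3]), int(cols[4]), cols[6], cols[7]))
    return {parent: sorted(rows) for parent, rows in cds.items()}


def summarize(gff3_text):
    """Return {parent: (number of CDS lines, number of distinct CDS intervals)}."""
    return {
        parent: (len(rows), len(set(rows)))
        for parent, rows in cds_by_parent(gff3_text).items()
    }
```

Running `summarize` on the contents of before.txt and of each "after" file gives counts that can be compared directly. Parent IDs may differ between the input and the Apollo export, so it is probably safer to compare the counts and the coordinate lists from `cds_by_parent` than to match on ID.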

  • Provide the javascript console log output generated from the action.
    None.

  • Provide the server log output generated from the action (typically catalina.out).
    Nothing is added to catalina.out when I add the annotations.
