ASPic Help
ASPic INPUT
To start an ASPic session just click on the red "Run ASPic" button. A submission form split into two parts (one for genomic and one for transcript sequences) will appear.
To successfully complete an ASPic submission, please make sure to fill in both sections by providing genome and transcript data.

GENOMIC SEQUENCE
There are 4 different ways to submit a genomic sequence; you can choose only one of them:
Paste sequence: Paste a genomic sequence in FASTA format. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. A sample sequence in FASTA format, corresponding to human SLC40A1 gene (Ensembl ID: ENSG00000138449), is here.
Upload sequence: Upload a genomic sequence from your computer as a plain text file in FASTA format. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. A sample sequence in FASTA format, corresponding to human SLC40A1 gene (Ensembl ID: ENSG00000138449), is here.
Gene name: Use a current ENSEMBL gene ID for the gene to be analyzed (e.g. ENSG00000138449). For human only, the HUGO gene ID or gene aliases are also allowed. Take care to select the corresponding species (e.g. Human) in the relevant window menu. You can also include in the analysis a given number of bp from either side of the gene.
Chromosomal range: Indicate a chromosome number and the chromosome coordinates. Please make sure to select the corresponding species in the above option list and to type coordinates referring to the latest Ensembl assembly. You can also include in the analysis a given number of bp from either side of the selected gene range and specify the strand (forward or reverse).
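The FASTA rules described for the paste and upload options can be checked before submitting. The following is a minimal sketch in Python, not part of ASPic itself: the function name `check_fasta` and its messages are hypothetical, and it only tests the points stated above (a single description line starting with ">", sequence data after it, and lines shorter than 80 characters).

```python
def check_fasta(text, max_line_length=80):
    """Return a list of problems found in a single-sequence FASTA string.

    Hypothetical helper: it checks only the rules stated in this help page.
    An empty list means no problem was detected.
    """
    lines = text.splitlines()
    non_blank = [line for line in lines if line.strip()]
    if not non_blank:
        return ["input is empty"]

    problems = []
    # The description line must come first and start with '>' in column one.
    if not non_blank[0].startswith(">"):
        problems.append("first line must start with '>'")

    # A genomic submission holds one sequence, so one description line.
    headers = [line for line in non_blank if line.startswith(">")]
    if len(headers) > 1:
        problems.append("more than one description line found")

    # There must be sequence data besides the description line.
    if not [line for line in non_blank if not line.startswith(">")]:
        problems.append("no sequence data found")

    # Lines are recommended to be shorter than 80 characters.
    for number, line in enumerate(lines, start=1):
        if len(line) >= max_line_length:
            problems.append(f"line {number} has {len(line)} characters")
    return problems


if __name__ == "__main__":
    example = ">sample_gene example description\nACGTACGTACGT\nTTGACCTCCTCG\n"
    print(check_fasta(example) or "no problems found")
```

The line-length check is only a recommendation in the text above, so a reported long line is a warning rather than a reason to reject the sequence.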
TRANSCRIPT SEQUENCES
There are 3 different ways to submit a collection of transcribed sequences. You can choose just one, or use more than one simultaneously.
Paste sequences: You can paste a collection of mRNA and/or EST sequences in MultiFASTA format. MultiFASTA format allows the user to join more than one FASTA sequence. Each sequence in FASTA format begins with a single-line description, followed by the nucleotide sequence. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence collection in MultiFASTA format, for Unigene ID Hs.274479, is here.
Upload sequence: You can upload a file containing a collection of mRNA and/or EST sequences in MultiFASTA format. MultiFASTA format allows the user to join more than one FASTA sequence. Each sequence in FASTA format begins with a single-line description, followed by the nucleotide sequence. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence collection in MultiFASTA format, for Unigene ID Hs.274479, is here.
Process by UNIGENE ID: You can automatically extract all mRNA sequences collected into a Unigene cluster (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene) just by providing the corresponding ID (e.g. Hs.408312).
Click here if you want to perform Unigene ID searches.
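To illustrate the MultiFASTA layout described above (several FASTA records joined one after another), here is a small Python sketch that splits such a collection into individual sequences. The function name `parse_multifasta` is hypothetical, and taking the first word after ">" as the sequence identifier is an assumption made for this example, not a rule stated by ASPic.

```python
def parse_multifasta(text):
    """Split MultiFASTA text into a dict mapping sequence ID to sequence.

    Assumption for this sketch: the ID is the first whitespace-delimited
    word that follows '>' on each description line.
    """
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:].strip()
            current_id = header.split()[0] if header else ""
            if current_id in records:
                raise ValueError(f"duplicate sequence ID: {current_id!r}")
            records[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data found before any '>' line")
        else:
            records[current_id].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


if __name__ == "__main__":
    collection = ">est1 sample EST\nACGTAC\nGTTGAC\n>mrna2 sample mRNA\nGGCCTT\n"
    for seq_id, sequence in parse_multifasta(collection).items():
        print(seq_id, len(sequence))
```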
As one single ASPic run may take several minutes you must provide an e-mail address. As soon as the run is complete you will receive an e-mail with a link to the results.
When a gene name (an Ensembl gene ID or, for human only, a HUGO ID or gene alias) is selected in the genomic input panel, job results are stored in a MySQL database. If results for your input data have already been computed, they are displayed immediately instead of running a new ASPic job. If you want to produce new results (e.g. because of an update of the ASPic code or to use an updated Unigene cluster) and run the application again, just tick the "Ignore old searches" box. To browse the results currently stored in the ASPic database, click the "DB" button in the top bar of the ASPic Home Page.
ASPIC OUTPUT
When the run is complete you will receive an e-mail with the link to the output. The output is organized in 7 sections: 1) Input Info; 2) Gene View; 3) Transcript View; 4) Alignment View; 5) Intron Table; 6) Transcript Table; 7) GTF.
Input Info. Provides a summary of the submitted input and information retrieved by the ASPic engine (e.g. chromosome number, start, end, strand, etc.). Links in this section allow you to view or download the input sequences.
Gene View. Provides a schematic graphical view of the gene structure (constitutive exons in yellow, alternative exons in green) and of the predicted introns (labeled by progressive numbers). The expand button stretches the graphical representation vertically (useful for complex gene structures).

Transcript View. Provides a graphical representation of the assembled transcripts. Only the most reliable introns are used for the assembly process, i.e. those supported by a perfectly aligning EST with canonical donor and acceptor splice sites, or supported by two or more ESTs. The Transcript View shows the overall exon-intron scheme and also reports the annotation for 5'UTR, CDS, 3'UTR and polyA site. The first transcript (Tr 1) is the longest transcript containing a CDS corresponding to the one annotated in the CCDS database (www.ncbi.nlm.nih.gov/CCDS), if available; otherwise it is the longest transcript with the longest predicted CDS.
A 3' terminal arrow marks polyadenylated transcripts, which may contain a polyA site (AAUAAA or its accepted variants).
Those transcripts containing ORFs shorter than 100 codons are labeled "Unannotated RNA".
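The intron selection rule stated in the Transcript View description can be written as a simple predicate. This is a sketch under stated assumptions, not ASPic's actual code: the `Intron` class and its fields are hypothetical, "canonical" is taken to mean a GT donor and an AG acceptor, and "perfectly aligning" is modeled as zero mismatches for a supporting EST.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Intron:
    """Hypothetical record for one predicted intron."""
    donor: str                  # first two nucleotides of the intron
    acceptor: str               # last two nucleotides of the intron
    est_mismatches: List[int]   # one mismatch count per supporting EST


def is_reliable(intron):
    """Apply the rule: (perfect EST and canonical sites) or >= 2 ESTs."""
    # Assumption: canonical splice sites are GT (donor) and AG (acceptor).
    canonical = intron.donor.upper() == "GT" and intron.acceptor.upper() == "AG"
    # Assumption: a perfectly aligning EST has zero mismatches.
    has_perfect_est = any(count == 0 for count in intron.est_mismatches)
    supported_by_two = len(intron.est_mismatches) >= 2
    return (canonical and has_perfect_est) or supported_by_two


if __name__ == "__main__":
    candidates = [
        Intron("GT", "AG", [0]),     # one perfect EST, canonical sites
        Intron("GT", "AG", [1]),     # one imperfect EST only
        Intron("AT", "AC", [2, 3]),  # non-canonical, but two ESTs
    ]
    for intron in candidates:
        print(intron.donor, intron.acceptor, is_reliable(intron))
```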
Alignment View. Shows the alignment between the genomic sequence and the transcribed sequences near the predicted intron boundaries (15 nt upstream and downstream of the intron boundaries). The splice site quality is scored according to Shapiro and Senapathy (1987). Darker blue characters in the donor and acceptor motifs conform to the canonical consensus. The accession number of each supporting transcript is also provided.
Intron 1 [537]-[3019]
DONOR SCORE: 92 ACCEPTOR SCORE: 79
GGGTAAGT TGTTTTGGCTGTAGC
GGCGCAAAGGCTTGGGTAAGTTGACCTCCTCGCTT----TTTTGGTGTTTTGGCTGTAGCTCATGGTTGACAGC
GGCGCAAAGGCTTGG  171  172  CTCATGGTTGACAGC  DN998849
GGCGCAAAGGCTTGG  536  537  CTCATGGTTGACAGC  AF013263
GGCGCAAAGGCTTGG  536  537  CTCATGGTTGACAGC  NM_013229
GGCGCAAAGGCTTGG  536  537  CTCATGGTTGACAGC  NM_001160
GGCGCAAAGGCTTGG  536  537  CTCATGGTTGACAGC  NM_181861
GGCGCAAAGGCTTGG  536  537  CTCATGGTTGACAGC  NM_181868
GGCGCAAAGGCTTGG  536  537  CTCATGGTTGACAGC  NM_181869
Intron Table. Reports the relative and absolute coordinates of each detected intron, the number of supporting ESTs, the intron length, donor and acceptor splice sites, and the alignment quality near intron boundaries (expressed in Mismatch %).
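Two of the Intron Table columns are simple to reproduce by hand. The sketch below is illustrative only: it assumes that intron coordinates are inclusive on both ends and that Mismatch % is the share of mismatching positions between equally long genomic and transcript flanks. ASPic may compute these values differently, and both function names are hypothetical.

```python
def intron_length(start, end):
    """Length of an intron, assuming inclusive start and end coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def mismatch_percent(genomic_flank, transcript_flank):
    """Percentage of mismatching positions between two aligned flanks.

    Assumption: both flanks are ungapped and have the same length.
    """
    if not genomic_flank or len(genomic_flank) != len(transcript_flank):
        raise ValueError("flanks must be non-empty and of equal length")
    pairs = zip(genomic_flank.upper(), transcript_flank.upper())
    mismatches = sum(1 for g, t in pairs if g != t)
    return 100.0 * mismatches / len(genomic_flank)


if __name__ == "__main__":
    # Coordinates taken from the "Intron 1 [537]-[3019]" example above.
    print(intron_length(537, 3019))
    # 15 nt flank from the same example, compared with itself.
    print(mismatch_percent("GGCGCAAAGGCTTGG", "GGCGCAAAGGCTTGG"))
```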
Transcript Table. Shows the general features of all alternative full-length transcripts, such as the length, the number of exons, the putative location of the CDS, the length of the putative encoded protein and the transcript variant type. The variant type column reports - for each full-length transcript - the type of splicing event (e.g. alternative 5' or 3' end, exon skipping, etc.), the affected exon or intron as well as its location in the coding and/or untranslated regions of the transcript. Splicing variants are labeled in comparison to a reference transcript given by the longest inferred transcript containing a CDS corresponding to the one annotated in the CCDS database (http://www.ncbi.nlm.nih.gov/CCDS/) for human or by the longest transcript with the longest ORF in other species. The CDS in alternative full length transcripts not containing the CCDS start and stop codons is determined as the longest ORF if longer than 100 codons.
Abbreviations used: A5E, A3E: alternative 5' or 3' ends (in brackets, the length difference of the relevant intron); skip (En): skipping of exon n; uCDS: 5'UTR and CDS; CDSu: CDS and 3'UTR; Init (En): alternative initiation at exon n; Term (En): alternative termination at exon n; New (En): new exon after exon n.
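When post-processing a downloaded Transcript Table, the abbreviations listed above can be expanded with a small lookup. This Python sketch is an illustration, not an ASPic utility: the function name `describe` is hypothetical, and it only handles a bare abbreviation or one followed by an exon in the "(En)" form, since the exact layout of the variant type column is not specified here.

```python
import re

# Descriptions taken from the abbreviation list in this help page.
DESCRIPTIONS = {
    "A5E": "alternative 5' end",
    "A3E": "alternative 3' end",
    "skip": "skipping of exon {n}",
    "uCDS": "5'UTR and CDS",
    "CDSu": "CDS and 3'UTR",
    "Init": "alternative initiation at exon {n}",
    "Term": "alternative termination at exon {n}",
    "New": "new exon after exon {n}",
}

# Matches e.g. "A5E", "skip (E3)" or "New(E12)".
LABEL_PATTERN = re.compile(r"\s*(\w+)\s*(?:\(E(\d+)\))?\s*")


def describe(label):
    """Expand one variant label such as 'skip (E3)' into plain words."""
    match = LABEL_PATTERN.fullmatch(label)
    if match is None or match.group(1) not in DESCRIPTIONS:
        raise ValueError(f"unrecognized label: {label!r}")
    keyword, exon = match.groups()
    template = DESCRIPTIONS[keyword]
    if "{n}" in template:
        if exon is None:
            raise ValueError(f"label {label!r} needs an exon, e.g. '(E3)'")
        return template.format(n=exon)
    return template


if __name__ == "__main__":
    for label in ["A5E", "skip (E3)", "New (E12)", "uCDS"]:
        print(label, "->", describe(label))
```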
GTF. Provides a full textual output of ASPic including: i) absolute intron coordinates and IDs of supporting ESTs; ii) intron assortment of assembled transcript isoforms; iii) absolute exon coordinates; iv) exon assortment of assembled transcript isoforms; v) nucleotide sequence of assembled transcript isoforms in FASTA format (header line reports IDs of polyadenylated transcript).
A link to download the GTF and the Transcript Table is also available at the top of each output view.