Getting Started with 4eVAR

1. Getting Started with 4eVAR

Welcome to 4eVAR, the proprietary cluster analysis server of 4bases SA, where clinical results from NGS data are easily and efficiently accessible through a validated and secure internal workflow.
This guide provides a step-by-step walkthrough for analyzing NGS data obtained using 4bases kits.

1.1 Sign up and Login

After browsing to https://4evar.4bases.ch/, you will see the 4eVAR main page (Figure 1). The “Login” button is in the top right corner of the page.

Figure 1: 4eVAR main page. The login button is highlighted in red.

If you already have a 4eVAR account, just type in your credentials (which are the email you used for signing up and the corresponding password). Then, press the “Login” button.

If you do not have an account yet, click on the “Sign Up” button: you will be asked to enter a valid email address and a password for your account. After filling in the Captcha field, click on the “Send request” button. You will receive a confirmation email at the address you entered: once you have validated the address, you can use your credentials to access the 4eVAR platform.
The first time you log in, you will be asked to accept the privacy policy in order to continue using 4eVAR, and to indicate whether you wish to receive communications from 4bases.

After that, you have to specify the category of your Customer, meaning the name of your affiliation (research center, hospital, department or company), or, for distributors, the category and name of your company.
This can be done by clicking on the “Profile” button in the 4eVAR menu bar (see Figure 2).

Figure 2: 4eVAR main interface, with the blue menu bar in the upper interface section.

In the Profile menu, you can enter your affiliation in the “Company” section at the bottom of the menu and the corresponding email address in the section on the right. Once you have filled in both sections, log out and log in again so that your edits are properly registered. Your user account will then be associated with this new Customer, and two new sections will appear in the 4eVAR menu bar (Analysis and Lots). If several users belong to the same Customer, make sure that they all enter exactly the same Customer name. The first user connected to a new Customer becomes the “administrator” and can create new users for the same Customer and see all the analyses run using the Customer’s lots.

Should you encounter any login issue (e.g., if you forgot your login email address and/or password), contact our customer service at support@4bases.ch.

1.2 Claim your analysis

Before creating a new analysis, you have to claim your purchased analysis token, which is the LOT number of the 4bases kit you purchased. To do so, press the “Lot” button on the 4eVAR menu bar. Once in the Lot menu, click on the “+ Deliver lot number” button. This opens a pop-up window (Figure 3) where you have to type your LOT number in the “LOT” section, exactly as written on your kit box (IMPORTANT: for PRO kits, the Lot number to use is the one written on BOX A).

Figure 3: “Deliver lot number” pop-up window.

Once you have entered the Lot number, press the “+Deliver” button; a small window will state “Lot Delivered Successfully”. Your Lot will now be listed on the page, together with the number of analyses left and the name of the kit associated with it.

Should you encounter any issues in delivering your lot number, or with the number of analyses or the analysis name, please contact us at support@4bases.ch, specifying the Lot number, the kit name and the purchase order number.

1.3 Creating an analysis

Once you have claimed your Lot(s), you can create your analysis (Figure 4). Clicking on the “Analysis” button in the 4eVAR menu bar (to the left of the “Lot” button) brings you to the analyses list. At the top of the list there is a “+” button; clicking it opens the “Create new analysis” window. First, give your analysis a name in the “Analysis” section, according to your needs. Then, select the correct lot from the “lot” drop-down menu. Once it is selected, the associated analysis is shown in the “type” section. Finally, select the sequencer (Illumina Paired-end, Nanopore, etc.) in the “sequencer” menu, and the type of “mutation” you are looking for (Germinal or Somatic) in the corresponding section. You can also add a note to the analysis in the blank “note” box at your convenience. Clicking “+Add” creates the analysis and adds it to the list on the main page.

Figure 4: Creating a new analysis on 4eVAR.

1.4 Running an analysis and downloading the results

After creating a new analysis, you will be brought to the analysis list page, where you can select an analysis by clicking on its name (a blue square will appear around it, see Figure 5A). A file upload interface will then appear on the right of the 4eVAR page. You can upload your fastq.gz file(s) (for Illumina paired-end sequencing, make sure to load both the R1 and R2 files for each sample) by dragging them onto the “Drag and drop files here…” box, or by clicking on that box and selecting the sequencing files from your computer (Figure 5A). A square icon will appear under the drag and drop area, showing the name of the file and a loading bar (Figure 5B). When a file is completely loaded, it is shown in a list layout (Figure 5C). Only 4 files can be uploaded at the same time: any additional files are put in a queue until the first 4 are completely loaded. Once all files are loaded, press the “Run” button at the bottom of the file list, and then click on “Confirm” in the pop-up window. The analysis will then start running and its status will change from “Created” to “Checking”.

IMPORTANT: Do not press the “Run” button before all the samples are completely uploaded.

Figure 5: Uploading files on the 4eVAR platform. A) Sample selection; B) Uploading sequencing files; C) List of successfully uploaded sequencing files.

Each uploaded sequencing file for each sample will be accessible once the analysis is completed. When all the samples have been analyzed, the status of the analysis will change from “Checking” to “Completed”.

Each sample will have its own folder (Figure 6), where you will find the output files of the analysis: the BAM file with its associated BAI index, the VCF file with its associated TBI index, an annotation (ANN) Excel file with the variants, and a coverage folder containing the coverage as a CSV file. All the files can be downloaded by clicking on them.

The duration of an analysis depends on the size and number of the sequencing files per sample.

Figure 6: Folders containing the results of the analysis for each sample.

Should you encounter any issues in downloading the results, contact our customer service at support@4bases.ch specifying the Lot number and the analysis name as well.

2. Germinal Analysis with 4eVAR

2.1 Setting up a Germinal analysis

If a Germinal analysis is needed, set the “Germinal” option from the Mutation drop-down menu (Figure 7).

Figure 7: mutation drop-down menu

Once the analysis is created, you have to load your sequencing files (refer to section 1.4) obtained from a 4bases germinal kit. Once all the files are loaded, you can launch the analysis by pressing the “Run” button and then clicking on “Confirm” in the pop-up window.
IMPORTANT: When loading Illumina Paired-End reads, be sure to load an even number of files per sample (2 if you sequenced on 1 lane, 4 if you used 2 lanes, and so on). Loading an odd number of files could result in a partial analysis or affect the final result.
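
The even-number rule above can be checked locally before uploading. The following is a minimal sketch, not part of 4eVAR: it assumes that your files follow the usual Illumina naming scheme (<sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz), and the function name find_unpaired is hypothetical. If your sequencer names files differently, the pattern has to be adapted.

```python
import re
from collections import defaultdict

# Assumption: standard Illumina file names, e.g. patientA_S1_L001_R1_001.fastq.gz
ILLUMINA_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz$"
)

def find_unpaired(filenames):
    """Return a list of messages for samples/lanes lacking an R1 or R2 file."""
    reads_seen = defaultdict(set)
    problems = []
    for name in filenames:
        match = ILLUMINA_NAME.match(name)
        if match is None:
            problems.append(f"{name}: name does not match the assumed pattern")
            continue
        # Group the reads (1 or 2) found for each sample and lane
        reads_seen[(match["sample"], match["lane"])].add(match["read"])
    for (sample, lane), reads in sorted(reads_seen.items()):
        for read in ("1", "2"):
            if read not in reads:
                problems.append(f"{sample} lane {lane}: missing R{read} file")
    return problems

files = [
    "patientA_S1_L001_R1_001.fastq.gz",
    "patientA_S1_L001_R2_001.fastq.gz",
    "patientB_S2_L001_R1_001.fastq.gz",  # R2 file is missing
]
print(find_unpaired(files))
```

An empty list means that every sample and lane has both an R1 and an R2 file, so the number of files per sample is even.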

2.2 Checking Germinal analysis results

After clicking on the “Run” button, the status of the analysis will change from “Created” to “Checking”, indicating that the analysis is running. When the analysis is completed, its status will change from “Checking” to “Completed” (Figure 8).

IMPORTANT: If your analysis encountered any issue, its status will switch to “Warning”. If this happens, please contact our support at support@4bases.ch for troubleshooting, indicating Lot number and analysis name as well.

The completed analyses will show a folder for each sample in the “Results” section, just below the Drag and Drop files area. You can access any folder and the files it contains by clicking on the folder name.

Figure 8: Completed analysis showing the uploaded sequencing files and the relative results.

2.3 Files present in a germinal analysis

Once the analysis is completed, you can find the following output files for each sample:

  • A BAM file (named {SAMPLE_NAME}.bam), the compressed binary version of a SAM (Sequence Alignment/Map) file. You can open this file with any alignment visualization program, such as IGV.
  • A BAI file (named {SAMPLE_NAME}.bam.bai), the index file associated with the corresponding BAM. You will need it to process or open your BAM file.
  • A compressed VCF file (named {SAMPLE_NAME}.vcf.gz), which is the Variant Call Format file listing the variants detected in a sample with respect to a specific reference. Once decompressed, it can be opened with a text viewer; it can also be loaded into a tertiary analysis platform (e.g., VarSome, Franklin).
  • The VCF index file in TBI format (named {SAMPLE_NAME}.vcf.gz.tbi).
  • An Excel file (named {SAMPLE_NAME}_ann.xlsx), containing the list of SNVs (Single Nucleotide Variants) recovered in your sample, with information on the type of mutation found and its associated clinical impact.
  • A folder named “Coverage”, containing a CSV file (Comma Separated Values, editable using Excel or similar software) with the coverage of the genes analyzed by the kit for that sample (named {SAMPLE_NAME}_coverage.csv).
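
After downloading the results, the naming scheme above can be used to verify that nothing is missing. This is a small sketch under stated assumptions: the files of one sample are saved in a local folder that mirrors the layout listed above, and the function names are hypothetical helpers, not part of 4eVAR.

```python
from pathlib import Path

def expected_germinal_outputs(sample_name):
    """Relative paths of the per-sample files listed in this section."""
    return [
        f"{sample_name}.bam",
        f"{sample_name}.bam.bai",
        f"{sample_name}.vcf.gz",
        f"{sample_name}.vcf.gz.tbi",
        f"{sample_name}_ann.xlsx",
        f"Coverage/{sample_name}_coverage.csv",
    ]

def missing_outputs(sample_dir, sample_name):
    """Return the expected files that are not present in sample_dir."""
    sample_dir = Path(sample_dir)
    return [
        rel for rel in expected_germinal_outputs(sample_name)
        if not (sample_dir / rel).is_file()
    ]
```

For example, missing_outputs("downloads/patientA", "patientA") returns an empty list when all six files were downloaded (the folder name downloads/patientA is only an illustration).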
 

If the sequenced samples were obtained using a 4bases PANEL kit, you will find 2 more files in each sample output folder (if you analyzed more than 4 samples):

  • Another Excel file (named {SAMPLE_NAME}_CNV.xlsx), containing a table with the CNVs recovered in each gene exon.
  • An image (named {SAMPLE_NAME}.svg) showing a graph of the CNV locations.
 

If the sequenced samples were obtained using a 4bases PRO kit, you will find an extra folder in the results section named “CNV” (if you analyzed more than 4 samples). This folder contains one CSV file per sample with a list of CNVs recovered in that analysis.

IMPORTANT: if no CNVs were detected for a sample, the corresponding results will not have an associated CSV file.

2.3.1 Annotation Excel

The annotation Excel file is divided into 16 columns (Figure 9):

  • CHROM: chromosome in which the mutation is located.
  • POS: base position in the chromosome where the mutation is located.
  • REF: expected base in the reference (which can be more than one in cases of indels).
  • ALT: alternative base recovered in the sample (which can be more than one in cases of indels).
  • VAF (Variant Allele Frequency): frequency of the recovered mutation expressed in percentage.
  • GT: genotype of the mutation, which can be homozygous (hom) or heterozygous (het).
  • DP: depth of the mutation, which is the number of reads aligned on the mutation site.
  • GENE: gene where the mutation was detected.
  • FEATURE ID: RefSeq accession number of a gene transcript.
  • EFFECT: the effect that the mutation has on the gene.
  • HGVS_C: Human Genome Variation Society nomenclature for the base mutation.
  • HGVS_P: Human Genome Variation Society nomenclature for the protein mutation.
  • ClinVar: Clinical impact of the specific variant.
  • ClinVarCONF: in case of conflicting interpretations of pathogenicity, it shows the possible impacts, with the number of cases observed for each in brackets.
  • VarSome Link: link to the mutation on VarSome Clinical. You need a VarSome account to be able to view the mutation in the database.
  • Franklin Link: link of the mutation on Franklin Database.

Figure 9: Annotation Excel file of a successfully completed analysis.
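
The VAF and DP columns are often the first ones used to shortlist variants. The sketch below shows one possible way to do that once the rows of the annotation file are available as dictionaries keyed by column name. Reading the .xlsx file itself would require a third-party library (for example openpyxl or pandas), which is not shown here. The thresholds are hypothetical illustrations, not recommendations from 4bases, and the VAF is assumed to be stored as a percentage, as described above.

```python
def filter_variants(rows, min_vaf=5.0, min_dp=30):
    """Keep rows whose VAF (in percent) and depth reach the given thresholds."""
    kept = []
    for row in rows:
        # Accept both 48.5 and "48.5%" (assumption about the cell format)
        vaf = float(str(row["VAF"]).rstrip("%"))
        depth = int(row["DP"])
        if vaf >= min_vaf and depth >= min_dp:
            kept.append(row)
    return kept

# Invented example rows using the column names of the annotation file
rows = [
    {"GENE": "BRCA1", "VAF": 48.5, "DP": 120, "GT": "het"},
    {"GENE": "BRCA2", "VAF": "3.2%", "DP": 200, "GT": "het"},
    {"GENE": "TP53", "VAF": 51.0, "DP": 12, "GT": "het"},
]
print([row["GENE"] for row in filter_variants(rows)])
```

With these invented rows, only the BRCA1 variant passes both thresholds: the BRCA2 variant has a VAF below 5% and the TP53 variant has fewer than 30 reads.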

2.3.2 Coverage CSV

The coverage CSV file consists of 8 columns (Figure 10):

  • chr: chromosome where the target is located.
  • Target_name: specific target of the coverage stats. It could be a gene or an exon of a gene.
  • min: minimum number of reads that cover a specific region of the target.
  • max: maximum number of reads that cover a specific region of the target.
  • mean: average number of reads that cover the target.
  • %cov0: percentage of the target not covered by any reads.
  • %cov>30: percentage of the target covered by more than 30 reads.
  • %cov>100: percentage of the target covered by more than 100 reads.

Figure 10: Coverage CSV file of a successfully completed analysis.
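
As an illustration of how the coverage file can be used, the following sketch lists the targets whose %cov>30 value falls below a chosen threshold. It assumes that the file is comma-separated, that the header uses exactly the column names listed above and that percentages are stored as plain numbers. The threshold and the sample values are invented for the example.

```python
import csv
import io

def low_coverage_targets(csv_text, min_pct_over_30=95.0):
    """Return (target, %cov>30) pairs for targets below the threshold."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pct = float(row["%cov>30"])
        if pct < min_pct_over_30:
            flagged.append((row["Target_name"], pct))
    return flagged

# Invented content in the layout described above
SAMPLE_COVERAGE = """chr,Target_name,min,max,mean,%cov0,%cov>30,%cov>100
chr17,BRCA1_ex2,45,310,180.5,0.0,100.0,92.3
chr13,BRCA2_ex11,0,150,40.2,3.5,61.0,10.4
"""
print(low_coverage_targets(SAMPLE_COVERAGE))
```

To apply this to a real file, the text can be read with Path("{SAMPLE_NAME}_coverage.csv").read_text() and passed to the function.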

2.3.3 CNV Excel and CSV

DISCLAIMER: 4 germinal samples are required to perform a CNV analysis using 4bases PRO kits, while 6 samples are required when using 4bases PANEL kits.

The CNV Excel file consists of 11 columns (Figure 11):

  • Sample: name of the sample in which the CNVs were recovered.
  • Gene: target gene where the CNV might have been recovered.
  • Chr: chromosome where the target is located.
  • Start: starting base of the target (on the chromosome).
  • End: ending base of the target (on the chromosome).
  • Length: overall length of the target.
  • Log2ratio: the log2 ratio, a metric commonly used in CNV analyses to represent the relative difference in DNA copy number between a sample and a reference.
  • Amp_Del: indicates whether a duplication (AMP) or a deletion (Del) is present in the target.
  • BP_Whole: indicates whether the deletion is partial (BP) or complete (Whole). In case of a partial deletion, the cell shows the number of base pairs deleted.
  • Ab_log2ratio: reports abnormal log2ratio values, which might indicate a deletion or a duplication (the cell contains either the same number as the Log2ratio column or is blank).
  • Ab_Seg_loc: the cell states “ALL” if the whole segment was deleted or duplicated.

Figure 11: Excel file of CNVs of a successfully completed analysis.
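
To give a feeling for the values found in the Log2ratio column, the sketch below computes the log2 ratio for a few copy-number states, assuming a diploid reference with 2 copies. The thresholds used to label a value as “AMP” or “Del” are hypothetical: this guide does not state the cut-offs that 4eVAR applies.

```python
import math

def log2_ratio(sample_copies, reference_copies=2):
    """Log2 of the copy number in the sample relative to the reference."""
    return math.log2(sample_copies / reference_copies)

def classify(log2_value, loss_threshold=-0.4, gain_threshold=0.3):
    """Label a log2 ratio as Del, AMP or normal (empty string).

    The thresholds are illustrative assumptions only.
    """
    if log2_value <= loss_threshold:
        return "Del"
    if log2_value >= gain_threshold:
        return "AMP"
    return ""

for copies in (1, 2, 3):
    value = log2_ratio(copies)
    print(copies, round(value, 3), classify(value))
```

A heterozygous deletion (1 copy instead of 2) gives a log2 ratio of -1, a normal region gives 0 and a single-copy duplication (3 copies) gives about 0.585.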

The CNV file for PRO kits is a CSV file divided into 12 columns (Figure 12):

  • start.p: protein where the Duplication/Deletion starts.
  • end.p: protein where the Duplication/Deletion ends.
  • type: type of Copy Number Variation: Deletion or Duplication.
  • nexons: number of gene exons included in the CNV.
  • start: base where the Duplication/Deletion starts (on the chromosome).
  • end: base where the Duplication/Deletion ends (on the chromosome).
  • chromosome: chromosome where the CNV is located.
  • id: ID of the CNV, in the format CHR:START-END.
  • BF: the Bayes Factor is a statistical concept used to compare the fit of two different models to a given dataset. This value indicates the likelihood that the CNV is actually present and is not an artifact created by the coverage. A higher value indicates a higher chance of a real CNV.
  • reads.expected: number of reads expected at the specific site, calculated from the coverage of your sample.
  • reads.observed: number of reads actually observed at the specific location in your sample.
  • reads.ratio: ratio of the number of sequencing reads mapped to a specific genomic region in the sample to the number in a reference or baseline sample.

Figure 12: CSV file of CNVs of a successfully completed analysis.
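
The columns above lend themselves to simple filtering. The following sketch keeps the CNVs whose Bayes Factor reaches a chosen value and splits the id column into its CHR, START and END parts. It assumes a comma-separated file with the column names listed above. The minimum BF and the sample rows are invented for illustration, and the function names are hypothetical.

```python
import csv
import io

def parse_cnv_id(cnv_id):
    """Split an ID in the format CHR:START-END into its three parts."""
    chrom, _, span = cnv_id.rpartition(":")
    start, _, end = span.partition("-")
    return chrom, int(start), int(end)

def confident_cnvs(csv_text, min_bf=10.0):
    """Return the CNVs whose BF is at least min_bf (hypothetical cut-off)."""
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        bf = float(row["BF"])
        if bf >= min_bf:
            chrom, start, end = parse_cnv_id(row["id"])
            kept.append({"chromosome": chrom, "start": start, "end": end,
                         "type": row["type"], "BF": bf})
    return kept

# Invented rows in the 12-column layout described above
SAMPLE_CNV = """start.p,end.p,type,nexons,start,end,chromosome,id,BF,reads.expected,reads.observed,reads.ratio
12,14,deletion,3,41197695,41201211,17,chr17:41197695-41201211,25.3,820,415,0.506
40,40,duplication,1,32906409,32907524,13,chr13:32906409-32907524,3.1,300,395,1.32
"""
print(confident_cnvs(SAMPLE_CNV))
```

With these invented rows, only the deletion on chromosome 17 is kept, because the duplication has a BF below the chosen cut-off.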

2.3.4 CNV image (SVG file)

This is the graphical representation of the CNV locations in a PANEL kit sample (Figure 13). The x axis shows the location of the exons on the gene, and the y axis shows the log2ratio value. Each dot in the graph is an exon of the target panel genes. The grey area indicates the expected range of the log2ratio values. A dot with an abnormal log2ratio value, which therefore falls outside the grey area, is highlighted in red and labeled with its exon name, making it easy to identify.

Figure 13: SVG image file showing the locations of the CNVs of a panel kit.

Should you need any further information concerning result files or need assistance on the results in general, contact us at support@4bases.ch specifying also kit Lot number and name of the analysis.

3. Somatic Analysis with 4eVAR

3.1 Setting up a Somatic analysis

If a Somatic analysis is needed, select the “Somatic” option from the Mutation drop-down menu (Figure 14), as explained in section 1.3 of this tutorial.

Figure 14: Mutation drop-down menu

If the Somatic analysis is a Tumor-Normal analysis (comparison between tumoral and germinal samples respectively), you will need to load both the sequencing files of the tumor samples and the sequencing files of the normal samples.

Once the analysis is created, you just have to load the fastq.gz files obtained from sequencing (as explained in section 1.4).

Furthermore, you will need to load the Excel file (Template_SomaticTN.xlsx, downloadable here), filled in with the names of the Normal samples (column “N”, germinal) and of the Tumor samples (column “T”, somatic) (Figure 15). Please note that the names in the Excel file have to match exactly the names of the fastq files (without any other detail included by the sequencer in the filename). Do not rename the Excel file.
The same germinal sample in the Excel file can be associated with different tumor samples if necessary.

Figure 15: Template_SomaticTN.xlsx
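
Because the names typed in the template must match the fastq file names, it can be useful to compare the two before uploading. The sketch below is only an illustration: it assumes that the “details included by the sequencer” are the standard Illumina suffix (_S<n>_L<lane>_R<1|2>_001.fastq.gz), and it takes the Normal/Tumor pairs as a plain list rather than reading the Excel file. The function names are hypothetical.

```python
import re

# Assumption: the sequencer adds the standard Illumina suffix to the sample name
SEQUENCER_SUFFIX = re.compile(r"_S\d+(_L\d{3})?_R[12]_001\.fastq\.gz$")

def sample_name(fastq_filename):
    """Strip the assumed sequencer suffix from a fastq file name."""
    return SEQUENCER_SUFFIX.sub("", fastq_filename)

def unmatched_template_names(pairs, fastq_filenames):
    """Return template names (columns N and T) with no matching fastq file."""
    available = {sample_name(name) for name in fastq_filenames}
    requested = {name for pair in pairs for name in pair}
    return sorted(requested - available)

fastq_files = [
    "blood01_S1_L001_R1_001.fastq.gz",
    "blood01_S1_L001_R2_001.fastq.gz",
    "tumor01_S2_L001_R1_001.fastq.gz",
    "tumor01_S2_L001_R2_001.fastq.gz",
]
# (Normal, Tumor) pairs as they would be typed in columns N and T
pairs = [("blood01", "tumor01"), ("blood01", "tumor02")]
print(unmatched_template_names(pairs, fastq_files))
```

Here the same Normal sample is paired with two Tumor samples, which the template allows; tumor02 is reported because no fastq file carries that name.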

Once all the files are loaded, launch the analysis by pressing the “Run” button and then clicking on “Confirm” in the pop-up window.

IMPORTANT: When loading Illumina Paired-End sequencing files, be sure to load an even number of files per sample (2 files per sample in case you sequenced on 1 lane, 4 files per sample when you used 2 lanes and so on). Loading an odd number of files will result in a broken analysis or affect the final result.

3.2 Checking Somatic analysis results

After clicking on the “Run” button, the status of the analysis will change from “Created” to “Checking”, indicating that your samples are being analyzed. Even when your analysis is in “Checking”, you will be able to see the samples, once they have been analyzed. When the analysis is completed, its status will change from “Checking” to “Completed”.

IMPORTANT: If your analysis encountered any issue, its status will not be “Completed”, but “Warning” instead. If this happens, contact us at support@4bases.ch to find the cause of the issue and its possible solution.

Once completed, the analysis will show a folder for each sample in the “Results” section, just below the fastq.gz files area.

Figure 16: sample folders

You can access any folder by clicking on it.

3.3 Files present in a Somatic analysis

In the folder of the Normal samples, you will find the same output files that are present in the Germinal analyses (see section 2.3). In addition, the CNV folder will contain a CSV file in the same format described for the Germinal analysis (section 2.3.3).

In the Tumor samples folders, the result files are:

  • The Tumor BAM file (named {SAMPLE_NAME}_tum.bam).
  • The BAI file associated with the BAM (named {SAMPLE_NAME}_tum.bam.bai).
  • The somatic VCF file (named {SAMPLE_NAME}.vcf), that contains the somatic SNVs found in the sample.
  • The Excel of the annotation (named {SAMPLE_NAME}_filter.xlsx). It is similar to the Germinal Excel file.
  • The coverage folder, containing the coverage CSV file.
  • The HRD folder (“HRD_analysis”) with a CSV file named “all_HRDresults.csv”, which contains the HRD values of the tumor sample.
  • The MAF (Mutation Annotation Format) folder (named {SAMPLE_NAME}_maf_output) which contains one MAF file. This file contains all the annotated somatic variants of the tumor sample.
 

Furthermore, in the results section you will find an HTML report (named {TUMOR_NAME}_{NORMAL_NAME}.html). The HTML file contains all the information present in the individual files:

  • Tumor Coverage
  • Normal Coverage
  • Tumor Somatic SNVs
  • Normal Germinal SNVs
  • CNA, Copy Number Alterations of the Tumor
  • HRD score (Homologous Recombination Deficiency).

Figure 17: Report for sample

3.3.1 Somatic Annotation Excel

Figure 18: Annotation Excel

The annotation columns are the same as the ones described for the Germinal analyses, except for one extra column:

  • AMP: The AMP (Association for Molecular Pathology) classification methodology is used for the interpretation and classification of somatic variants in cancer. The AMP classification system uses a Tier-based approach to categorize somatic variants into different levels of clinical significance: Tier IV is the least severe and the most likely to be benign, while Tier I is the most severe.

3.3.2 HRD-score CSV

The HRD (Homologous Recombination Deficiency) score is a measure used in tumor analysis to assess the extent to which a tumor has defects in its ability to repair DNA through homologous recombination. It is composed of three measures:

  • Loss of Heterozygosity (LOH): LOH measures the extent of allelic imbalance or loss of one allele at certain genomic loci.
  • Telomeric Allelic Imbalance (TAI): TAI assesses allelic imbalances near the telomeres of chromosomes, which can be indicative of genomic instability.
  • Large-Scale State Transitions (LST): LST measures the number of large-scale transitions in copy number states across the genome.
 

In the csv file, you will find 5 columns: the sample name, the values of the previous three measures and the total HRD sum (Figure 19).

Figure 19: HRD-score CSV
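
Since the total is described as the sum of the three measures, the file can be sanity-checked with a few lines of code. This is a sketch under assumptions: the header names used below (Sample, LOH, TAI, LST, HRD_sum) are hypothetical, because this guide does not list the exact column names of all_HRDresults.csv, and the values are invented.

```python
import csv
import io
import math

def check_hrd_rows(csv_text, columns=("Sample", "LOH", "TAI", "LST", "HRD_sum")):
    """Recompute LOH + TAI + LST per sample and compare it with the total.

    The column names are assumptions; pass the real ones via `columns`.
    """
    sample_col, loh_col, tai_col, lst_col, total_col = columns
    results = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        total = float(row[loh_col]) + float(row[tai_col]) + float(row[lst_col])
        matches = math.isclose(total, float(row[total_col]))
        results[row[sample_col]] = (total, matches)
    return results

# Invented example content
SAMPLE_HRD = "Sample,LOH,TAI,LST,HRD_sum\ntumor01,12,18,20,50\n"
print(check_hrd_rows(SAMPLE_HRD))
```

For each sample the function returns the recomputed sum and whether it agrees with the total reported in the file.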

3.3.3 MAF file

A somatic MAF file is a specific type of data file commonly used in cancer genomics and bioinformatics. It contains information about somatic mutations in cancer samples, including details about the genomic alterations present in tumor cells as compared to normal cells from the same individual.
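
MAF files are usually tab-delimited text files in which comment lines start with “#” and standard columns such as Hugo_Symbol (gene name) and Variant_Classification are present. Assuming the MAF file produced by 4eVAR follows this common layout (this guide does not state it explicitly), the variants per gene can be counted as sketched below. The function name and the example lines are invented.

```python
import csv
from collections import Counter

def count_variants_per_gene(maf_lines):
    """Count the rows of a MAF file per gene, skipping '#' comment lines."""
    data_lines = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    return Counter(row["Hugo_Symbol"] for row in reader)

# Invented lines in the assumed layout; a real file has many more columns
maf_lines = [
    "#version 2.4\n",
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\n",
    "TP53\tMissense_Mutation\ttumor01\n",
    "TP53\tNonsense_Mutation\ttumor01\n",
    "BRCA2\tFrame_Shift_Del\ttumor01\n",
]
print(count_variants_per_gene(maf_lines))
```

For a real file, an open file object can be passed directly, for example: with open(path_to_maf) as handle: counts = count_variants_per_gene(handle).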