IARC meeting 63, Nov 24th, 2020:
The meeting considered the inference of the variant IGHV4-4*02_a106g in the VDJbase dataset of sample P1_I42_S1. The sequence was seen in 1.27% of all unmutated rearrangements, with 268 sequences including 234 perfect matches to the inferred allele. There was abundant variation in the CDR3 regions of the aligned sequences. IGHV4-4*07 was also present in the genotype, at a lower frequency (0.73% of all unmutated sequences, 157 sequences, 135 unmutated sequences). Haplotyping data was not available. Plots of the final 3’ nucleotides of the inference were also unavailable. In light of the low sequence counts and the lack of haplotyping, the inferred sequence was affirmed as a Level 0 sequence. The final 3’ nucleotides will be considered at a later date, at which time the affirmed sequence will be noted in the IARC minutes.
IARC meeting 84, Oct 25th, 2021:
IGHV4-4*02_A106G was inferred in subject S39 (P1_I42_S1; ERR2567216). This inference was pre-assessed at IARC meeting 63 (https://www.antibodysociety.org/wordpress/wp-content/uploads/2020/12/Meeting-63-24_11_20-minutes.pdf). The inference was supported by many sequences (294) and unmutated sequences (253), a high allelic frequency (64%), a high overall frequency in the unmutated population (1.5%), and many unique CDR3s (243) in the unmutated sequence set. Haplotyping based on alleles of IGHJ6 was not possible. Haplotyping based on different variant sequences of IGHD3-10*01 (DOI: 10.3389/fimmu.2019.00987), one of which is not recorded in the IMGT database, was possible (but was not part of the OGRDB submission, as the variant IGHD allele is not present in the database used for inference). Separate analysis following IgDiscover-based inference suggested complete separation of IGHV4-4*02_A106G (IGHV4-4*02_S2599) from the other allele of IGHV4-4. Overall, the genotype also carried IGHV4-4*07, IGHV4-59*01, IGHV4-59*08, and IGHV4-61*02_A234G among this set of similar genes. The upstream regions of all these alleles have been inferred in this subject in the past (DOI: 10.3389/fimmu.2021.730105), and the upstream region of IGHV4-4*02_A106G is different from those of the other alleles of this set of genes. Based on this extensive validation, IARC now affirms the sequence at Level 1 up to and including base 319, in agreement with past practice. It is acknowledged that the allele most likely carries one additional base, typically A, at base position 320. We recognise that alleles of IGHV4-4/IGHV4-59/IGHV4-61 may reside in gene locations different from that associated with the most similar allele in the IMGT database.
Although there is nothing in the data, including the haplotype assessment, that suggests that the allele does not reside in gene IGHV4-4, it must be recognised that IARC naming does not represent a position on the precise gene location of the allele. The allele is given the name IGHV4-4*i03.
>IGHV4-4*i03
CAGGTGCAGCTGCAGGAGTCGGGCCCAGGACTGGTGAAGCCTTCGGGGACCCTGTCCCTCACCTGCGCTGTCTCTGGTGGCTCCATCAGCAGTGGTAACTGGTGGAGTTGGGTCCGCCAGCCCCCAGGGAAGGGGCTGGAGTGGATTGGGGAAATCTATCATAGTGGGAGCACCAACTACAACCCGTCCCTCAAGAGTCGAGTCACCATATCAGTAGACAAGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCCGTGTATTACTGTGCGAGAG.
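
The record above can be sanity-checked with a short script. The following is only a minimal sketch and not part of the IARC procedure: it assumes that the trailing "." stands for the unresolved base after the affirmed sequence (the minutes do not state this), and the helper names `parse_single_fasta` and `summarise` are hypothetical. The length it reports is the ungapped nucleotide count, which is not directly comparable to positions such as 319 and 320 if, as seems likely, those refer to a gapped alignment numbering.

```python
from collections import Counter

# Record copied from the minutes. The trailing "." is ASSUMED to be a
# placeholder for the unresolved 3' base; the minutes do not say so.
FASTA = """>IGHV4-4*i03
CAGGTGCAGCTGCAGGAGTCGGGCCCAGGACTGGTGAAGCCTTCGGGGACCCTGTCCCTCACCTGCGCTGTCTCTGGTGGCTCCATCAGCAGTGGTAACTGGTGGAGTTGGGTCCGCCAGCCCCCAGGGAAGGGGCTGGAGTGGATTGGGGAAATCTATCATAGTGGGAGCACCAACTACAACCCGTCCCTCAAGAGTCGAGTCACCATATCAGTAGACAAGTCCAAGAACCAGTTCTCCCTGAAGCTGAGCTCTGTGACCGCCGCGGACACGGCCGTGTATTACTGTGCGAGAG.
"""


def parse_single_fasta(text):
    """Return (name, sequence) from a FASTA string holding one record."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    return lines[0][1:], "".join(lines[1:]).upper()


def summarise(seq):
    """Count the resolved bases and note whether a trailing '.' is present."""
    has_placeholder = seq.endswith(".")
    bases = seq.rstrip(".")  # drop the assumed placeholder before counting
    unexpected = set(bases) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return {
        "ungapped_length": len(bases),
        "has_placeholder": has_placeholder,
        "composition": Counter(bases),
    }


name, seq = parse_single_fasta(FASTA)
summary = summarise(seq)
print(name, summary["ungapped_length"], summary["has_placeholder"])
```

A check of this kind only confirms that the record is well formed (a single header, nucleotides only, at most a trailing placeholder); it says nothing about the evidence supporting the inference.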