We have developed PVS (Protein Variability Server), a web-based tool that uses several variability metrics to compute the absolute site variability in multiple protein-sequence alignments (MSAs). The variability is then assigned to a user-selected reference sequence consisting of either the first sequence in the alignment or a consensus sequence. Subsequently, PVS performs tasks that are relevant for structure-function studies, such as plotting and visualizing the variability in a relevant 3D-structure. In addition, PVS implements other tasks intended to facilitate the design of epitope discovery-driven vaccines against pathogens where sequence variability largely contributes to immune evasion. Thus, PVS can return the conserved fragments in the MSA, as defined by a user-provided variability threshold, and locate them in a relevant 3D-structure. Furthermore, PVS can return a variability-masked sequence, which can be directly submitted to the RANKPEP server for the prediction of conserved T-cell epitopes. PVS is freely available at: http://imed.med.ucm.es/PVS/.
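The variability, masking, and conserved-fragment steps described above can be illustrated with a minimal Python sketch. It assumes Shannon entropy as the variability metric (the abstract does not name the metrics PVS uses), takes the first sequence as the reference, and treats gaps as an ordinary symbol. The function names, the mask character, and the toy alignment are hypothetical and are not taken from the PVS server.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def site_variability(msa):
    """Per-site variability for an MSA given as equal-length strings."""
    return [shannon_entropy(col) for col in zip(*msa)]


def mask_reference(reference, variability, threshold, mask_char="."):
    """Replace residues at sites whose variability exceeds the threshold."""
    return "".join(
        res if v <= threshold else mask_char
        for res, v in zip(reference, variability)
    )


def conserved_fragments(reference, variability, threshold, min_length=6):
    """Return runs of consecutive sites with variability <= threshold."""
    fragments, current = [], []
    for res, v in zip(reference, variability):
        if v <= threshold:
            current.append(res)
        else:
            if len(current) >= min_length:
                fragments.append("".join(current))
            current = []
    if len(current) >= min_length:
        fragments.append("".join(current))
    return fragments


# Toy alignment (hypothetical data); the first sequence is the reference.
msa = ["ACDE", "ACDF", "ACEF", "ACEG"]
variability = site_variability(msa)
masked = mask_reference(msa[0], variability, threshold=0.5)
fragments = conserved_fragments(msa[0], variability, threshold=0.5, min_length=2)
```

In this sketch, the masked string plays the role of the variability-masked sequence that would be passed on to an epitope predictor.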
Prediction of peptide binding to major histocompatibility complex (MHC) molecules is a basis for anticipating T-cell epitopes. Peptides that bind to a given MHC molecule are related by sequence similarity. Therefore, a position-specific scoring matrix (PSSM), also known as a profile, derived from a set of aligned peptides known to bind to a given MHC molecule can be used as a predictor of both peptide-MHC binding and T-cell epitopes. In this approach, the binding potential of any peptide sequence (query) to the MHC molecule is determined by its similarity to a set of known peptide-MHC binders and can be obtained by comparing the query to the PSSM. Following structural considerations of the peptide-MHC interaction, we will describe here how to derive alignments and PSSMs that are suitable for the prediction of peptide-MHC binding.
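As a rough illustration of comparing a query to a PSSM, the sketch below builds a log-odds matrix from equal-length aligned peptides and scores every window of a protein. It is a simplified sketch, not the procedure described in the chapter: the uniform background, the pseudocount, the absence of sequence weighting, and all function names and toy peptides are assumptions made for the example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Log-odds PSSM from aligned, equal-length peptides.

    Assumes a uniform background (1/20) and a flat pseudocount.
    """
    length = len(peptides[0])
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for peptide in peptides:
            counts[peptide[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def score_peptide(pssm, peptide):
    """Sum the position-specific scores of a peptide of matching length."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def rank_peptides(pssm, protein):
    """Score every window of the protein and sort by descending score.

    Residues outside the 20 standard amino acids raise KeyError here.
    """
    k = len(pssm)
    windows = [protein[i:i + k] for i in range(len(protein) - k + 1)]
    return sorted(((score_peptide(pssm, w), w) for w in windows), reverse=True)


# Hypothetical 9-mer binders, used only to exercise the functions.
binders = ["ALAKAAAAV", "ALAKAAAAL", "SLAKAAAAV"]
pssm = build_pssm(binders)
ranking = rank_peptides(pssm, "GGGGALAKAAAAVGGGG")
best_score, best_window = ranking[0]
```

The highest-scoring windows are the candidate binders; a real predictor would also need a score threshold calibrated on known binders.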
Identification of peptides that can bind to major histocompatibility complex (MHC) molecules is important for anticipation of T-cell epitopes and for the design of epitope-based vaccines. Population coverage of epitope vaccines is, however, compromised by the extreme polymorphism of MHC molecules, which is in fact the basis for their differential peptide binding. Therefore, grouping of MHC molecules into supertypes according to peptide-binding specificity is relevant for optimizing the composition of epitope-based vaccines. Despite the fact that the peptide-binding specificity of MHC molecules is linked to their specific amino acid sequences, it is unclear how amino sequence differences correlate with peptide-binding specificities. In this chapter, we detail a method for defining MHC supertypes based on the analysis and subsequent clustering of their peptide-binding repertoires.
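The idea of clustering MHC molecules by their peptide-binding repertoires can be sketched as follows. This is only an illustration under stated assumptions: repertoires are represented as sets of peptides predicted to bind each allele, the distance is a Jaccard distance on those sets, and the grouping is single-linkage with a fixed cutoff. The chapter's actual distance measure and clustering algorithm are not given in the abstract, and the allele labels and peptides below are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| for two sets of peptides."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_repertoires(repertoires, max_distance):
    """Single-linkage grouping: alleles closer than max_distance are merged."""
    alleles = sorted(repertoires)
    parent = {allele: allele for allele in alleles}

    def find(allele):
        # Follow parent links to the cluster representative (path halving).
        while parent[allele] != allele:
            parent[allele] = parent[parent[allele]]
            allele = parent[allele]
        return allele

    for i, a in enumerate(alleles):
        for b in alleles[i + 1:]:
            if jaccard_distance(repertoires[a], repertoires[b]) <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for allele in alleles:
        clusters.setdefault(find(allele), []).append(allele)
    return sorted(clusters.values())


# Hypothetical repertoires: allele label -> set of predicted binders.
repertoires = {
    "allele_A1": {"p1", "p2", "p3", "p4"},
    "allele_A2": {"p1", "p2", "p3", "p5"},
    "allele_B1": {"p6", "p7", "p8"},
    "allele_B2": {"p6", "p7", "p9"},
}
supertypes = cluster_repertoires(repertoires, max_distance=0.5)
```

Each resulting group is a candidate supertype; the cutoff controls how much repertoire overlap is required for two alleles to be grouped.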
Cytotoxic T lymphocytes (CTL) protect against viruses including HIV-1. To avoid viral escape mutants that thwart immunity, we chose 25 CTL epitopes defined in the context of natural infection with functional and/or structural constraints that maintain sequence conservation. By combining HLA binding predictions with knowledge concerning HLA allele frequencies, a metric estimating population protection coverage (PPC) was computed and epitope pools assembled. Strikingly, only a minority of immunocompetent HIV-1 infected individuals responds to pools with PPC >95%. In contrast, virus-naive individuals uniformly expand IFNgamma producing cells and mount anti-HIV-1 cytolytic activity. This disparity suggests a vaccine design paradigm shift from infected to normal subjects.
Histones are DNA-binding proteins found in the chromatin of all eukaryotic cells. They are highly conserved and can be grouped into five major classes: H1/H5, H2A, H2B, H3, and H4. Two copies of H2A, H2B, H3, and H4 bind to about 160 base pairs of DNA forming the core of the nucleosome (the repeating structure of chromatin) and H1/H5 bind to its DNA linker sequence. Overall, histones have a high arginine/lysine content that is optimal for interaction with DNA. This sequence bias can make the classification of histones difficult using standard sequence similarity approaches. Therefore, in this paper, we applied a support vector machine (SVM) to recognize and classify histones on the basis of their amino acid and dipeptide composition. On evaluation through five-fold cross-validation, the SVM-based method was able to distinguish histones from nonhistones (nuclear proteins) with an accuracy of around 98%. Similarly, we obtained an overall accuracy of >95% in discriminating the five classes of histones through the application of 1-versus-rest (1-v-r) SVM. Finally, we have applied this SVM-based method to the detection of histones from whole proteomes and found a sensitivity comparable to that achieved with hidden Markov model (HMM) profiles.
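The amino acid and dipeptide composition used as SVM input can be computed as in the sketch below, which yields a 420-dimensional vector (20 amino acid fractions followed by 400 dipeptide fractions). The ordering of features and the handling of non-standard residues are assumptions made for this example; the SVM training itself is omitted.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def composition_features(sequence):
    """Return 20 amino acid fractions followed by 400 dipeptide fractions.

    Non-standard residues are dropped before counting (an assumption), so
    dipeptides are formed from the residues that remain.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    n = len(residues)
    aa_counts = Counter(residues)
    dp_counts = Counter(a + b for a, b in zip(residues, residues[1:]))
    n_dipeptides = max(n - 1, 0)
    aa_part = [aa_counts[aa] / n if n else 0.0 for aa in AMINO_ACIDS]
    dp_part = [
        dp_counts[dp] / n_dipeptides if n_dipeptides else 0.0
        for dp in DIPEPTIDES
    ]
    return aa_part + dp_part


# Toy sequence, used only to show the layout of the feature vector.
features = composition_features("AAK")
```

Such fixed-length vectors can be fed to any SVM implementation regardless of protein length, which is what makes composition-based features convenient when sequence similarity is unreliable.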
Manoj Bhasin, Hong Zhang, Ellis L. Reinherz and Pedro A. Reche.
Prediction of methylated CpGs in DNA sequences using a support vector machine.
FEBS Lett. 2005 Jul 25; [Epub ahead of print] Abstract
DNA methylation plays a key role in the regulation of gene expression. The most common type of DNA modification consists of the methylation of cytosine in the CpG dinucleotide. At the present time, there is no method available for the prediction of DNA methylation sites. Therefore, in this study we have developed a support vector machine (SVM)-based method for the prediction of cytosine methylation in CpG dinucleotides. Initially, an SVM module was developed from human data for the prediction of human-specific methylation sites. This module achieved an MCC and AUC of 0.501 and 0.814, respectively, when evaluated using 5-fold cross-validation. The performance of this SVM-based module was better than that of classifiers built using alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics, and decision trees. Additional SVM modules were also developed based on mammalian- and vertebrate-specific methylation patterns. The SVM module based on human methylation patterns was used for genome-wide analysis of methylation sites. This analysis demonstrated that the percentage of methylated CpGs is higher in UTRs than in exonic and intronic regions of human genes. This method is available online for public use under the name Methylator at http://bio.dfci.harvard.edu/Methylator/.
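For readers unfamiliar with the two reported measures, the sketch below computes the Matthews correlation coefficient (MCC) from confusion-matrix counts and the area under the ROC curve (AUC) from classifier scores. These are the standard definitions; the function names and the toy numbers are illustrative and do not reproduce the study's data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator


def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    correct = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positive_scores) * len(negative_scores))


# Toy values, used only to exercise the functions.
mcc_perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
mcc_random = matthews_corrcoef(tp=1, tn=1, fp=1, fn=1)
auc = roc_auc([0.9, 0.8], [0.85, 0.1])
```

MCC ranges from -1 to 1 and AUC from 0 to 1, with 0 and 0.5, respectively, corresponding to random prediction.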
Prediction of peptide binding to major histocompatibility complex (MHC) molecules is a basis for anticipating T-cell epitopes, as well as epitope discovery-driven vaccine development. In the human, MHC molecules are known as human leukocyte antigens (HLAs) and are extremely polymorphic. HLA polymorphism is the basis of differential peptide binding, until now limiting the practical use of current epitope-prediction tools for vaccine development. Here, we describe a web server, PEPVAC (Promiscuous EPitope-based VACcine), optimized for the formulation of multi-epitope vaccines with broad population coverage. This optimization is accomplished through the prediction of peptides that bind to several HLA molecules with similar peptide-binding specificity (supertypes). Specifically, we offer the possibility of identifying promiscuous peptide binders to five distinct HLA class I supertypes (A2, A3, B7, A24 and B15). We estimated the phenotypic population frequency of these supertypes to be 95%, regardless of ethnicity. Targeting these supertypes for promiscuous peptide-binding predictions results in a limited number of potential epitopes without compromising the population coverage required for practical vaccine design considerations. PEPVAC can also identify conserved MHC ligands, as well as those with a C-terminus resulting from proteasomal cleavage. The combination of these features with the prediction of promiscuous HLA class I ligands further limits the number of potential epitopes. The PEPVAC server is hosted by the Dana-Farber Cancer Institute at the site http://immunax.dfci.harvard.edu/PEPVAC/.
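A phenotypic population frequency of the kind quoted above can be approximated as in the following sketch. It assumes Hardy-Weinberg proportions within each locus and independence between loci (no linkage disequilibrium). This is a simplification and not necessarily how the PEPVAC figure was derived. The function names and the allele frequencies are hypothetical.

```python
def phenotype_frequency(allele_frequencies):
    """Fraction of individuals carrying at least one covered allele at a locus.

    Under Hardy-Weinberg proportions, with f the summed frequency of the
    covered alleles, the fraction is 1 - (1 - f)**2.
    """
    covered = sum(allele_frequencies)
    if not 0.0 <= covered <= 1.0:
        raise ValueError("summed allele frequencies must lie in [0, 1]")
    return 1.0 - (1.0 - covered) ** 2


def population_coverage(frequencies_by_locus):
    """Combine loci, assuming they are independent (an assumption)."""
    uncovered = 1.0
    for frequencies in frequencies_by_locus.values():
        uncovered *= 1.0 - phenotype_frequency(frequencies)
    return 1.0 - uncovered


# Hypothetical allele frequencies for the alleles covered at two loci.
coverage = population_coverage({"HLA-A": [0.3, 0.2], "HLA-B": [0.3]})
```

Because real HLA loci are in linkage disequilibrium, haplotype frequencies give a more faithful estimate than this independence assumption.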
The EGF-like domain of smallpox growth factor (SPGF) targets human ErbB-1, inducing tyrosine phosphorylation of certain host cellular
substrates via activation of the receptor's kinase domain and thereby facilitating viral replication. Given these findings, low
molecular weight organic inhibitors of ErbB-1 kinases might function as antiviral agents against smallpox. Here we show that CI-1033
and related 4-anilinoquinazolines inhibit SPGF-induced human cellular DNA synthesis, protein tyrosine kinase activation, and c-Cbl
association with ErbB-1 and resultant internalization. Infection of monkey kidney BSC-40 and VERO-E6 cells in vitro by variola strain
Solaimen is blocked by CI-1033, primarily at the level of secondary viral spreading. In an in vivo lethal vaccinia virus pneumonia
model, CI-1033 alone promotes survival of animals, augments systemic T cell immunity and, in conjunction with a single dose of
anti-L1R intracellular mature virus particle-specific mAb, fosters virtually complete viral clearance of the lungs of infected mice
by the eighth day after infection. Collectively, these findings show that chemical inhibitors of host-signaling pathways exploited by
viral pathogens may represent potent antiviral therapies.
EPIMHC is a relational database of MHC-binding peptides and T cell epitopes that are observed in
real proteins. Currently the database contains 4867 distinct peptide sequences from various sources, including 84 tumor
associated antigens. The EPIMHC database is accessible through a web server that has been designed to facilitate research in
computational vaccinology. Importantly, peptides resulting from a query can be selected to derive specific motif-matrices.
Subsequently, these motif-matrices can be used in combination with a dynamic algorithm for predicting MHC-binding peptides from
user-provided protein queries. AVAILABILITY: The EPIMHC database server is hosted by the Dana-Farber Cancer Institute at the site
http://immunax.dfci.harvard.edu/bioinformatics/epimhc/.
Definition of MHC supertypes through clustering of MHC peptide binding repertoires
Artificial Immune Systems. Proceedings of the Third International Conference, ICARIS. 2004. LNCS 3239, pp.
189-196. Eds. G. Nicosia, V. Cutello, P. J. Bentley and J. Timmis. Springer-Verlag Berlin Heidelberg. Abstract
MHC molecules, also known in the human as human leukocyte antigens (HLA), display peptides on antigen presenting cell surfaces for subsequent T cell recognition. Identification of these antigenic peptides is especially important for developing peptide-based vaccines. Consequently, experimental and computational approaches have been developed for their identification. A major impediment to such approaches is the extreme polymorphism of HLA, which is in fact the basis for differential peptide binding. This problem can be mitigated by the observation that, despite such polymorphisms, HLA molecules bind overlapping sets of peptides and may therefore be grouped into supertypes. Here we describe a method of grouping HLA alleles into supertypes based on analysis and subsequent clustering of their peptide binding repertoires. Combining this method with the known allele and haplotype gene frequencies of HLA I molecules for five major American ethnic groups (Black, Caucasian, Hispanic, Native American, and Asian), it is now feasible to identify supertypic combinations for prediction of antigenic peptides, offering the potential to generate peptide-vaccines with a population coverage >95%, regardless of ethnicity. One combination including five distinct supertypes is available online at our PEPVAC web server (http://immunax.dfci.harvard.edu/PEPVAC/). Promiscuous peptides predicted to bind to these five supertypes represent around 5% of all possible peptide binders from a given genome.
We previously introduced an on-line resource, RANKPEP, which uses position specific scoring matrices (PSSMs) or profiles for the prediction of peptide-MHC class I (MHCI) binding as a basis for CD8 T-cell epitope identification. Here, using PSSMs that are structurally consistent with the binding mode of MHC class II (MHCII) ligands, we have extended RANKPEP to prediction of peptide-MHCII binding and anticipation of CD4 T-cell epitopes. Currently, 88 and 50 different MHCI and MHCII molecules, respectively, can be targeted for peptide binding predictions in RANKPEP. Because appropriate processing of antigenic peptides must occur prior to major histocompatibility complex (MHC) binding, cleavage site prediction methods are important adjuncts for T-cell epitope discovery. Given that the C-terminus of most MHCI-restricted epitopes results from proteasomal cleavage, we have modeled the cleavage site from known MHCI-restricted epitopes using statistical language models. The RANKPEP server now determines whether the C-terminus of any predicted MHCI ligand may result from such proteasomal cleavage. Also implemented is a variability masking function. This feature focuses prediction on conserved rather than highly variable protein segments encoded by infectious genomes, thereby offering identification of invariant T-cell epitopes to thwart mutation as an immune evasion mechanism.
During development, thymocytes carrying TCRs mediating low-affinity interactions with MHC-bound self-peptides are positively selected for export into the mature peripheral T lymphocyte pool. Thus, exogenous administration of certain altered peptide ligands (APL) with reduced TCR affinity relative to cognate Ags may provide a tool to elicit maturation of desired TCR specificities. To test this "thymic vaccination" concept, we designed APL of the viral CTL epitopes gp33-41 and vesicular stomatitis virus nucleoprotein octapeptide N52-59 relevant for the lymphocytic choriomeningitis virus-specific P14- and vesicular stomatitis virus-specific N15-TCRs, respectively, and examined their effects on thymocytes in vivo using irradiation chimeras. Injection of APL into irradiated congenic (Ly-5.1) mice, reconstituted with T cell progenitors from the bone marrow of P14 RAG2(-/-) (Ly-5.2) or N15 RAG2(-/-) (Ly-5.2) transgenic mice, resulted in positive selection of T cells expressing the relevant specificity. Moreover, the variants led to export of virus-specific T cells to lymph nodes, but without inducing T cell proliferation. These findings show that the mature T cell repertoire can be altered by in vivo peptide administration through manipulation of thymic selection.
Variola, the causative agent of smallpox, is a highly infectious double-stranded DNA virus of the orthopox genus that replicates within the cytoplasm of infected cells. For unknown reasons prominent skin manifestations, including "pox," mark the course of this systemic human disease. Here we characterized smallpox growth factor (SPGF), a protein containing an epidermal growth factor (EGF)-like domain that is conserved among orthopox viral genomes, and investigated its possible mechanistic link. We show that after recombinant expression, refolding, and purification, the EGF domain of SPGF binds exclusively to the broadly expressed cellular receptor, erb-B1 (EGF receptor), with subnanomolar affinity, stimulating the growth of primary human keratinocytes and fibroblasts. High affinity monoclonal antibodies specific for SPGF reveal in vivo immunoprotection in a murine vaccinia pneumonia model by a mechanism distinct from viral neutralization. These findings suggest that blockade of pathogenic factor actions, in general, may be advantageous to the infected host.
Zhong W, Reche PA, Lai CC, Reinhold B, Reinherz EL.
Genome-wide characterization of a viral cytotoxic T lymphocyte epitope repertoire.
J Biol Chem. 2003 Nov 14; 278(46): 45135-44. Epub 2003 Sep 05. Abstract
A genome-wide search using major histocompatibility complex (MHC) class I binding and proteasome cleavage site algorithms identified 101 influenza A PR8 virus-derived peptides as potential epitopes for CD8+ T cell recognition in the H-2b mouse. Cytokine-based flow cytometry, ELISPOT, and cytotoxic T lymphocyte assays reveal that 16 are recognized by CD8+ T cells recovered directly ex vivo from infected animals, accounting for greater than 70% of CD8+ T cells recruited to lung after primary infection. Only six of the 22 highest affinity MHC class I binding peptides comprise cytotoxic T lymphocyte epitopes. The remaining non-immunogenic peptides have equivalent MHC affinity and MHC-peptide complex half-lives, eliciting T cell responses when given in adjuvant and with T cell receptor-ligand avidity comparable with their immunogenic counterparts. As revealed by a novel high sensitivity nanospray tandem mass spectrometry methodology, failure to process those predicted epitopes may contribute significantly to the absent response. These results have important implications for rational design of CD8+ T cell vaccines.
Major histocompatibility complex class I (MHCI) and class II (MHCII) molecules display peptides on antigen-presenting cell surfaces for subsequent T-cell recognition. Within the human population, allelic variation among the classical MHCI and II gene products is the basis for differential peptide binding, thymic repertoire bias and allograft rejection. While available 3D structural analysis suggests that polymorphisms are found primarily within the peptide-binding site, a broader informatic approach pinpointing functional polymorphisms relevant for immune recognition is currently lacking. To this end, we have now analyzed known human class I (774) and class II (485) alleles at each amino acid position using a variability metric (V). Polymorphisms (V>1) have been identified in residues that contact the peptide and/or T-cell receptor (TCR). Using sequence logos to investigate TCR contact sites on HLA molecules, we have identified conserved MHCI residues distinct from those of conserved MHCII residues. In addition, specific class II (HLA-DP, -DQ, -DR) and class I (HLA-A, -B, -C) contacts for TCR binding are revealed. We discuss these findings in the context of TCR restriction and alloreactivity.
Negative selection eliminates thymocytes bearing autoreactive T cell receptors (TCR) via an apoptotic mechanism. We have cloned an inhibitor of NF-kappa B, I kappa BNS, which is rapidly expressed upon TCR-triggered but not dexamethasone- or gamma irradiation-stimulated thymocyte death. The predicted protein contains seven ankyrin repeats and is homologous to I kappa B family members. In class I and class II MHC-restricted TCR transgenic mice, transcription of I kappa BNS is stimulated by peptides that trigger negative selection but not by those inducing positive selection (i.e., survival) or nonselecting peptides. I kappa BNS blocks transcription from NF-kappa B reporters, alters NF-kappa B electrophoretic mobility shifts, and interacts with NF-kappa B proteins in thymic nuclear lysates following TCR stimulation. Retroviral transduction of I kappa BNS in fetal thymic organ culture enhances TCR-triggered cell death consistent with its function in selection.
The functional consequences of glycan structural changes associated with cellular differentiation are ill defined. Herein, we investigate the role of glycan adducts to the O-glycosylated polypeptide stalk tethering the CD8alphabeta coreceptor to the thymocyte surface. We show that immature CD4(+)CD8(+) double-positive thymocytes bind MHCI tetramers more avidly than mature CD8 single-positive thymocytes, and that this differential binding is governed by developmentally programmed O-glycan modification controlled by the ST3Gal-I sialyltransferase. ST3Gal-I induction and attendant core 1 sialic acid addition to CD8beta on mature thymocytes decreases CD8alphabeta-MHCI avidity by altering CD8alphabeta domain-domain association and/or orientation. Hence, glycans on the CD8beta stalk appear to modulate the ability of the distal binding surface of the dimeric CD8 globular head domains to clamp MHCI.