Sequence Variant Analysis Using Peptide Mapping By LC–MS/MS

Monoclonal antibodies are usually expressed in mammalian cell lines and are produced in several variants known as isoforms (1,2).
Microheterogeneity can result from posttranslational and enzymatic modifications as well as those caused by processing, alteration, storage, and incorrect translation of the target protein (1,3). Common sources of heterogeneity include Fc glycosylation, partial carboxypeptidase processing of heavy-chain (HC) C-terminal lysine residues (4), deamidation or isomerization (5), Fc methionine oxidation, hinge-region fragmentation (6), aggregation, and sequence variants. Sequence variants are protein isoforms containing unexpected amino acid sequences. They are classified as “product-related impurities.” The presence of unexpected amino acids may pose concerns regarding bioactivity, stability, and immunogenicity (1,3).






Peptide mapping by mass spectrometry (MS) is a valuable tool for characterizing sequence variants, including single amino acid substitutions in protein variants. These modifications can introduce changes in molecular mass and fragmentation patterns in MS/MS spectra between a precursor (expected) peptide and the modified variant. Consequently, the specific amino acid modification can be identified and localized. To date, most published works on sequence variant characterization involve studies of model proteins such as hemoglobin (Hb) and κ-casein in academic research (9,10,11). In most cases, sequence variants were isolated and enriched to a high level for the convenience of characterization.
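The mass and fragmentation reasoning above can be illustrated with a minimal Python sketch. It assumes standard monoisotopic residue masses and uses a hypothetical tryptic peptide, GLSWVR, that is not taken from MAb X; the function names are illustrative only. The sketch computes the mass change for a serine-to-glycine substitution and lists which singly charged b and y ions move, which is how a substitution can be localized.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
# Cysteine is listed unmodified; S-carboxymethylation is not included here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da, added once per peptide
PROTON = 1.007276   # Da, added once per charge


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def mz(mass, charge):
    """m/z of a peptide of the given neutral mass and positive charge."""
    return (mass + charge * PROTON) / charge


def b_ions(seq):
    """Singly charged b ions b1..b(n-1), built from the N-terminus."""
    ions, total = [], 0.0
    for aa in seq[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total + PROTON)
    return ions


def y_ions(seq):
    """Singly charged y ions y1..y(n-1), built from the C-terminus."""
    ions, total = [], WATER
    for aa in reversed(seq[1:]):
        total += RESIDUE_MASS[aa]
        ions.append(total + PROTON)
    return ions


def shifted_ions(expected, variant, tol=0.001):
    """Labels of b/y ions whose m/z differs between two equal-length peptides."""
    labels = []
    for i, (e, v) in enumerate(zip(b_ions(expected), b_ions(variant)), start=1):
        if abs(e - v) > tol:
            labels.append(f"b{i}")
    for i, (e, v) in enumerate(zip(y_ions(expected), y_ions(variant)), start=1):
        if abs(e - v) > tol:
            labels.append(f"y{i}")
    return labels


EXPECTED = "GLSWVR"   # hypothetical expected peptide
VARIANT = "GLGWVR"    # same peptide with Ser -> Gly at position 3

delta = peptide_mass(VARIANT) - peptide_mass(EXPECTED)
print(f"Mass change: {delta:.4f} Da")
for z in (2, 3):
    print(f"m/z change at {z}+: {mz(peptide_mass(VARIANT), z) - mz(peptide_mass(EXPECTED), z):.4f}")
print("Shifted fragment ions:", shifted_ions(EXPECTED, VARIANT))
```

In this example the ions that contain position 3 (b3 onward and y4 onward) shift by the substitution mass, whereas the ions that do not contain it are unchanged. That pattern is what pins the substitution to a single residue.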


Wade et al. published review articles about detection, characterization, and structural analysis of hemoglobin sequence variants using advanced analytical methods, including electrophoretic and chromatographic methods, MS, and DNA analysis (12,13). Ahrer et al. reviewed protein characterization, including sequence variants, using chromatographic and electrophoretic techniques (1). In general, however, detection, characterization, and understanding of sequence variants have not been widely studied (or at least not published) in the industrial field of protein therapeutics (7,14,15,16,17).

Detecting and characterizing unanticipated sequence variants, especially those present in low quantities, remains a significant challenge. At Genentech, we have developed sensitive and robust analytical methodologies to detect sequence variants at both the DNA and protein levels. Modern molecular biology methods such as quantitative polymerase chain reaction (QPCR) help rapidly identify single-nucleotide polymorphisms (SNPs) and detect DNA sequence variation (15). A novel approach combining peptide mapping using LC–MS/MS with Mascot error tolerant search (ETS) has been developed and reported for detecting low-abundance protein sequence variants with amino acid substitutions (16,17).

LC–MS/MS: Liquid chromatography with tandem mass spectrometry

MS: Mass spectrometry

MS/MS: Tandem mass spectrometry

MAb: Monoclonal antibody

HC: Heavy chain

It is vital to select a clone that is suitable for early stage process development efforts leading to clinical manufacturing. Here we describe the development of a peptide mapping method using LC–MS/MS combined with Mascot ETS to evaluate variants containing amino acid substitutions for a monoclonal antibody. Visual comparison of the wild-type and variant tryptic map UV profiles did not detect the presence of sequence variants. However, when LC–MS/MS data were submitted to Mascot ETS, a low level of the HC sequence variant S52G was identified and determined to be 0.2% at the peptide level in a lead clone.
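A percentage "at the peptide level" is commonly estimated as the variant peptide signal divided by the summed signal of the variant and wild-type peptides. The short Python sketch below illustrates that arithmetic only. It assumes extracted-ion peak areas as the signal and uses hypothetical area values; the text above does not state how the 0.2% figure was calculated, so this is not presented as the method used here.

```python
def variant_percent(variant_area, wildtype_area):
    """Relative abundance (%) of a variant peptide from two peak areas."""
    total = variant_area + wildtype_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * variant_area / total


# Hypothetical peak areas (arbitrary units), chosen only for illustration.
hypothetical_variant_area = 2.0e5
hypothetical_wildtype_area = 9.98e7

print(f"{variant_percent(hypothetical_variant_area, hypothetical_wildtype_area):.2f}%")
```

This ratio assumes that the variant and wild-type peptides ionize with similar efficiency, which may not hold for every substitution.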

Materials and Methods

CHO Cell Lines: A humanized IgG1 monoclonal antibody X (referred to as MAb X in this work) was produced using cell lines derived from Chinese hamster ovary (CHO) DUKX B11 dihydrofolate reductase–negative (DHFR–) host cell lines (18) by stable transfection of a plasmid encoding genes for DHFR and humanized antibody HC and light chain (LC). MAb production cell lines were selected with methotrexate (MTX) after transfection. We chose the top four producers for clone selection experiments in 2-L bioreactors to evaluate the top clone for production of materials for toxicology testing and phase 1–2 clinical trials.

Bioreactor Cultures and Antibody Purification: Before the bioreactor experiments, cells were maintained in shake-flask cultures using selective, MTX-containing medium. For the passage immediately preceding the bioreactor experiment, they were scaled up in one nonselective, MTX-free bioreactor passage. Fed-batch clone selection cultures were carried out for 14 days in 2-L sparged bioreactors from Applikon. They were maintained at pH 7.0 using sodium carbonate and CO2, with dissolved oxygen maintained at 30% air saturation by sparging with air and O2. Initial culture temperature was 37 °C, with a shift to 33 °C at 48 hours after inoculation. We used a proprietary serum-free medium supplemented with protein hydrolysate, adding a nutrient feed on day 3. Glucose concentration was maintained above 3 g/L by glucose addition as needed.

Upon termination of the culture, the cell culture fluid was centrifuged for 10 minutes at 7,000 rpm, then sterile-filtered through a 0.2-µm polyethersulfone filter unit from Pall. The clarified culture fluid was then loaded on a Prosep-vA high-capacity protein A column from Millipore that had been equilibrated with 25 mM Tris, 25 mM NaCl, and 5 mM EDTA at pH 7.1. The column was washed with three column volumes of equilibration buffer, followed by three column volumes of 0.4 M potassium phosphate at pH 7.0, and then three column volumes of equilibration buffer. The antibody was eluted with 0.1 M acetic acid at pH 2.9 and neutralized to pH 5.0 by the addition of 1.5 M Tris base.

Trypsin Digestion: S-carboxymethylation was conducted before digestion with trypsin: 1 mg of purified protein (e.g., 50 µL at 20 mg/mL) was mixed with 20 µL of 1 M dithiothreitol from Sigma and 950 µL denaturing buffer (pH 8.6) containing 6 M guanidine hydrochloride, 360 mM Tris, and 2 mM EDTA, followed by an hour’s incubation at 37 °C. Then 50 µL of 1 M iodoacetic acid from Sigma, freshly prepared in 1 M NaOH, was added to the sample.

After a 15-min sample incubation at room temperature in the dark, the alkylation reaction was quenched by addition of 10 µL dithiothreitol. Reduced and S-carboxymethylated samples were exchanged using PD-10 columns containing Sephadex G-25 medium from GE Healthcare Bio-Sciences AB into a pH 8.2 buffer containing 25 mM Tris and 2 mM CaCl2. MS-grade trypsin from Promega was added at a ratio of ~1:50 (w/w) enzyme to protein. Digestion proceeded for five hours at 37 °C, then the reaction was quenched by addition of an aliquot of 10% (v/v) trifluoroacetic acid (TFA).
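The reagent amounts in the digestion protocol imply approximate working concentrations, which the Python sketch below works out. It is a back-of-the-envelope calculation under stated assumptions: volumes are treated as additive, and the trypsin amount assumes that the full 1 mg of protein is recovered after buffer exchange (actual recovery is not given). The variable and function names are illustrative.

```python
def final_conc_mM(stock_M, added_uL, total_uL):
    """Final concentration (mM) after adding a stock solution to a mixture."""
    return stock_M * 1000.0 * added_uL / total_uL


protein_mg = 1.0
protein_stock_mg_per_mL = 20.0
# Volume of antibody solution that contains 1 mg of protein.
sample_uL = protein_mg * 1000.0 / protein_stock_mg_per_mL

dtt_uL, buffer_uL, iaa_uL = 20.0, 950.0, 50.0

# Reduction step: sample + dithiothreitol + denaturing buffer.
reduction_uL = sample_uL + dtt_uL + buffer_uL
dtt_mM = final_conc_mM(1.0, dtt_uL, reduction_uL)

# Alkylation step: iodoacetic acid added to the reduction mixture.
alkylation_uL = reduction_uL + iaa_uL
iaa_mM = final_conc_mM(1.0, iaa_uL, alkylation_uL)

# Trypsin at ~1:50 (w/w) enzyme to protein, assuming full protein recovery.
trypsin_ug = protein_mg * 1000.0 / 50.0

print(f"Sample volume: {sample_uL:.0f} uL")
print(f"DTT during reduction: {dtt_mM:.1f} mM in {reduction_uL:.0f} uL")
print(f"Iodoacetic acid during alkylation: {iaa_mM:.1f} mM in {alkylation_uL:.0f} uL")
print(f"Trypsin for 1 mg protein: {trypsin_ug:.0f} ug")
```

Under these assumptions the iodoacetic acid is in roughly twofold molar excess over dithiothreitol during alkylation.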

LC–MS/MS Analysis: We analyzed the protein digests using LC–MS/MS on a ThermoFinnigan LTQ instrument equipped with a standard electrospray ionization (ESI) source from Thermo Fisher Scientific. The system was interfaced with an Agilent 1200 HPLC unit with an in-line UV detector from Agilent.

In data-dependent scan experiments, the instrument was set to conduct 11 scan events including a single MS scan followed by five repeated cycles of a zoom scan, then one MS/MS scan on the five most intense ions in the MS scan. We analyzed the data from our tandem MS experiments with ThermoFinnigan Xcalibur software. Separation was carried out on a 5-µm, 300 Å, 250 × 2-mm Jupiter C18 column from Phenomenex at a flow rate of 0.25 mL/min, with mobile phases containing 0.1% TFA in H2O (solvent A) and 0.09% TFA in 90% MeCN (solvent B). Column temperature was set at 55 °C. A mixture of peptide sample corresponding to 5 µg of protein was loaded onto the column before injection.
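The count of 11 scan events follows from one survey scan plus a zoom scan and an MS/MS scan for each of the five most intense ions. The Python sketch below simply enumerates that cycle. It reflects one reading of the description above (zoom scan and MS/MS scan paired per precursor) and is not instrument method code; the function name and event labels are illustrative.

```python
def scan_cycle(top_n=5):
    """List the scan events in one data-dependent acquisition cycle."""
    events = ["MS survey scan"]
    for rank in range(1, top_n + 1):
        # Each selected precursor gets a zoom scan followed by an MS/MS scan.
        events.append(f"zoom scan, precursor rank {rank}")
        events.append(f"MS/MS scan, precursor rank {rank}")
    return events


cycle = scan_cycle()
for number, event in enumerate(cycle, start=1):
    print(number, event)
print("Total scan events:", len(cycle))
```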

Mascot Error Tolerant Search: Mascot by Matrix Science Inc. is a search engine that uses MS data to identify proteins from primary sequence databases (19,20,21). MS/MS ion search of an LC–MS/MS dataset often reveals a number of spectra that remain unmatched. They are attributable to enzyme nonspecificity, unexpected chemical and posttranslational modifications, and peptide sequences that aren’t in the database. This can be addressed using Mascot ETS, which compares MS/MS data with a theoretical fragmentation pattern for each parent and all possible variations, including chemical and posttranslational modifications and amino acid substitutions (19). That allows detection of amino acid substitutions in a peptide (22,23,24,25).
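The idea of testing "all possible variations" of a parent peptide can be illustrated at the precursor-mass level with a short Python sketch. This is a deliberately simplified illustration, not Mascot's algorithm: it considers only single amino acid substitutions, ignores other modifications, and matches on mass difference alone rather than on MS/MS fragment ions. The peptide GLSWVR, the observed mass difference, and the 0.01-Da tolerance are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def substitution_candidates(peptide, observed_delta, tol_da=0.01):
    """Single substitutions whose mass change matches an observed difference.

    Returns (position, original residue, new residue, mass change) tuples,
    with positions counted from 1 at the N-terminus.
    """
    hits = []
    for position, old in enumerate(peptide, start=1):
        for new, new_mass in RESIDUE_MASS.items():
            if new == old:
                continue
            delta = new_mass - RESIDUE_MASS[old]
            if abs(delta - observed_delta) <= tol_da:
                hits.append((position, old, new, round(delta, 4)))
    return hits


# Hypothetical case: an unmatched precursor is 30.0106 Da lighter than
# the expected peptide GLSWVR.
for hit in substitution_candidates("GLSWVR", -30.0106):
    print(hit)
```

For this hypothetical peptide the only candidate is a serine-to-glycine substitution at position 3. In a peptide with several serine or threonine residues, the mass difference alone would leave more than one candidate, which is why the fragment-ion comparison matters for localization.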

The search parameters for both standard and ETS are described below:

Carboxymethyl was selected as the fixed modification.

Variable modifications included oxidation on methionine, deamidation on asparagine and glutamine, des-C-terminal lysine, and glycosylation (G0) on asparagine.

Trypsin was the specified enzyme.

Maximum number of missed cleavage sites was set to 1.

Precursor peptide charge states were set at 2 and 3.

Mass tolerance was set to 800 ppm for both MS and MS/MS.
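For record keeping, the parameters listed above can be captured in a small Python structure, together with a check that shows what an 800-ppm tolerance means in m/z units. This is a sketch only: the class and field names are hypothetical and are not Mascot's own parameter names or file format, and the m/z values in the example are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchParameters:
    """Search settings as described in the text (field names are illustrative)."""
    enzyme: str = "Trypsin"
    max_missed_cleavages: int = 1
    fixed_modifications: tuple = ("carboxymethyl",)
    variable_modifications: tuple = (
        "oxidation on methionine",
        "deamidation on asparagine and glutamine",
        "des-C-terminal lysine",
        "glycosylation (G0) on asparagine",
    )
    precursor_charges: tuple = (2, 3)
    precursor_tol_ppm: float = 800.0   # MS tolerance
    fragment_tol_ppm: float = 800.0    # MS/MS tolerance


def within_tolerance(observed_mz, theoretical_mz, tol_ppm):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


params = SearchParameters()
# At m/z 500, an 800-ppm tolerance corresponds to a window of +/- 0.4.
print(within_tolerance(500.3, 500.0, params.precursor_tol_ppm))
print(within_tolerance(500.5, 500.0, params.precursor_tol_ppm))
```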

The acquired LC–MS/MS data were submitted to Mascot in two passes. The first pass was a standard search against the entire in-house database of ~150 sequence entries. The second pass was a manual ETS that provides a comprehensive search for a selected protein hit, covering a wide range of possible variations, including posttranslational modifications (PTMs) and amino acid substitutions. The “Mascot Search Parameters” box lists parameters for both standard and ETS.

About the Author

Corresponding author Amy H. Que is a scientist, Gayle Derfus is an engineer II, and Ashraf Amanullah is director of Oceanside pharma technical development in US biologics for Genentech, 1 Antibody Way, Oceanside, CA 92056; 1-760-231-3061, fax 1-760-231-2465. Boyan Zhang and Jennifer Zhang are scientists, and Yi Yang is a senior research associate in protein analytical chemistry for pharma technical development in US biologics at Genentech, 1 DNA Way, South San Francisco, CA 94080.


References

1.) Ahrer, K, and A. Jungbauer. 2006. Chromatographic and Electrophoretic Characterization of Protein Variants. J. Chromatogr. B 841:110-122.

2.) Harris, RJ, JJ Shire, and C. Winter. 2004. Commercial Manufacturing Scale Formulation and Analytical Characterization of Therapeutic Recombinant Antibodies. Drug Dev. Res. 61:137-154.

3.) Harris, RJ. 2005. Heterogeneity of Recombinant Antibodies: Linking Structure to Function. Dev. Biol. (Basel Karger) 122:117-127.

4.) Harris, RJ 1995. Processing of C-Terminal Lysine and Arginine Residues of Proteins Isolated from Mammalian Cell Culture. J. Chromatogr. A 705:129-134.

5.) Harris, RJ. 2001. Identification of Multiple Sources of Charge Heterogeneity in a Recombinant Antibody. J. Chromatogr. B 752:233-245.

6.) Cordoba, A. 2005. Non-Enzymatic Hinge Region Fragmentation of Antibodies in Solution. J. Chromatogr. B 818:115-121.

7.) Harris, RJ. 1993. Assessing Genetic Heterogeneity in Production Cell Lines: Detection By Peptide Mapping of a Low Level Tyr to Gln Sequence Variant in a Recombinant Antibody. Nat. Technol. 11:1293-1297.

8.) 1998. ICH Q5D: Derivation and Characterization of Cell Substrates Used for Production of Biotechnological and Biological Products. Fed. Reg. 63:50244-50249.

9.) Tanaka, K, S Takenake, and S. Tsuyama. 2006. Determination of Unique Amino Acid Substitutions in Protein Variants By Peptide Mass Mapping with FT-ICR MS. J. Am. Soc. Mass Spectrom. 17:508-513.

10.) Wade, Y, T Fujita, and A. Hayashi. 1989. Structural Analysis of Protein Variants By Mass Spectrometry: Characterization of Haemoglobin Providence Using a Grand-Scale Mass Spectrometer. Biomed. Environ. Mass Spectrom. 18:563-565.

11.) Claverol, S. 2003. Characterization of Protein Variants and Post-Translational Modifications: ESI-MSn Analyses of Intact Proteins Eluted from Polyacrylamide Gels. Mol. Cell. Proteomics 2:483-493.

12.) Wade, Y. 2002. Advanced Analytical Methods for Hemoglobin Variants. J. Chromatogr. B 781:291-301.

13.) Wade, Y. 1996. Structural Analysis of Protein Variants. Chapman, J, Ed. Methods in Molecular Biology, Vol. 16: Protein and Peptide Analysis By Mass Spectrometry. Humana Press Inc, Totowa:101-113.

14.) Wan, M. 1999. Variant Antibody Identification By Peptide Mapping. Biotechnol. Bioeng. 62:485-488.

15.) Dorai, H. 2007. Investigation of Product Microheterogeneity. BioProcess Int. 5:66-72.

16.) Yang, Y, B Zhang, V. Katta, and J Abstract. The 4th Symposium on the Practical Applications of Mass Spectrometry in the Biotechnology and Pharmaceutical Industries.

17.) Yang, Y. 2010. Detecting Low Level Sequence Variants in Recombinant Monoclonal Antibodies. mAbs 2:285-298.

18.) Urlaub, G, and LA Chasin. 1980. Isolation of Chinese Hamster Cell Mutants Deficient in Dihydrofolate Reductase Activity. Proc. Nat. Acad. Sci. USA 77:4216-4220.

19.) Error Tolerant Search 2010. Matrix Science Ltd.

20.) Homepage 2010. Matrix Science Ltd.

21.) Perkins, DN. 1999. Probability-Based Protein Identification By Searching Sequence Database Using Mass Spectrometry Data. Electrophoresis 20:3551-3567.

22.) Creasy, DM, and JJ Cottrell. 2002. Error Tolerant Searching of Uninterrupted Tandem Mass Spectrometry Data. Proteomics 2:1426-1434.

23.) Bonner, R, and B. Shushan. 1995. Error-Tolerant Protein Database Searching Using Peptide Product–Ion Spectra. Rapid Commun. Mass Spectrom. 9:1077-1080.

24.) Sunyaev, S. 2003. MultiTag: Multiple Error-Tolerant Sequence Tag Search for the Sequence-Similarity Identification of Proteins By Mass Spectrometry. Anal Chem. 75:1307-1315.

25.) Liska, AJ. 2005. Error-Tolerant EST Database Searches By Tandem Mass Spectrometry and MultiTag Software. Proteomics 5:4118-4122.

26.) den Dunnen, JT, and SE. Antonarakis. 2000. Mutation Nomenclature Extensions and Suggestions to Describe Complex Mutations: A Discussion. Hum. Mutat. 15:7-12.

27.) Lebkowski, JS. 1984. Transfected DNA Is Mutated in Monkey, Mouse, and Human Cells. Mol. Cell. Biol. 4:1951-1960.

28.) Yu, XC. 2009. Identification of Codon–Specific Serine to Asparagine Mistranslation in Recombinant Monoclonal Antibodies By High-Resolution Mass Spectrometry. Anal. Chem. 81:9282-9290.

29.) Guo, D. Mechanism of Unintended Amino Acid Sequence Changes in Recombinant Monoclonal Antibodies Expressed in Chinese Hamster Ovary (CHO) Cells. Biotechnol. Bioeng. (accepted).
