Scientists Finish the Human Genome at Last

The complete genome uncovered more than 100 new genes that are probably functional, and many new variants that may be linked to diseases.

By Carl Zimmer

Two decades after the draft sequence of the human genome was unveiled to great fanfare, a team of 99 scientists has finally deciphered the entire thing. They have filled in vast gaps and corrected a long list of errors in previous versions, giving us a new view of our DNA.

The consortium has posted six papers online in recent weeks in which they describe the full genome. These hard-sought data, now under review by scientific journals, will give scientists a deeper understanding of how DNA influences risks of disease, the scientists say, and how cells keep it in neatly organized chromosomes instead of molecular tangles.

For example, the researchers have uncovered more than 100 new genes that may be functional, and have identified millions of genetic variations between people. Some of those differences probably play a role in diseases.

For Nicolas Altemose, a postdoctoral researcher at the University of California, Berkeley, who worked on the team, the view of the complete human genome feels something like the close-up pictures of Pluto from the New Horizons space probe.

“You could see every crater, you could see every color, from something that we only had the blurriest understanding of before,” he said. “This has just been an absolute dream come true.”

Experts who were not involved in the project said it will enable scientists to explore the human genome in much greater detail. Large chunks of the genome that had been simply blank are now deciphered so clearly that scientists can start studying them in earnest.

Stanford Online

Your guide to genetics and genomics.

The study of heredity has advanced considerably in recent years. Scientists’ growing understanding of the human genome has led to advanced genetic testing methods that have improved physicians’ ability to diagnose and treat diseases.

As the collective knowledge of the human genome continues to become more sophisticated, genomics and genetics will play an increasingly important role in disease research.

Understanding genomics and genetics

Genetics and genomics are at the cutting edge of our understanding of human nature. While advancements in gene research have made the two branches of molecular biology virtually indistinguishable from one another, it’s still worth considering their differences to understand their usage and application.

Genetics refers to the scientific study of heredity, or how genes are passed from parents to offspring. The gene is the basic unit of heredity. It is a stretch of DNA carrying instructions for creating proteins that govern various functions around the body.

Let’s take a look at the expression of green eyes at the genetic level to illustrate the point. The gene OCA2, located on chromosome 15, plays an important role in eye pigmentation. OCA2 produces a protein known as P protein, which is involved in the production and storage of melanin. Melanin is the pigment that forms the basis for the coloration of the eyes.

A genetic variation in OCA2 will affect the amount of P protein that is produced, determining how much melanin is present — when there is more melanin, the individual will have darker eye color (like brown), while less melanin will lead to lighter eye colors (like green). It’s important to note that several other genes are also involved in eye color, and a single gene rarely acts completely on its own.

Genomics is a broader term that refers to the study of an individual’s entire genetic makeup, or genome, including the composition of each of their genes as well as the external factors (including diet and environment) that influence the way those genes are expressed. The human genome is estimated to contain roughly 20,000 to 25,000 individual genes, according to the National Human Genome Research Institute.

Traditionally, genomics was considered distinct from genetics in that genetics focused on the individual proteins produced by a single gene or set of genes, whereas genomics took a broader view, examining the way multiple genes interact with one another to produce specific traits and characteristics.

While genetics and genomics are technically distinct, the advancement of gene research in recent years has increasingly led geneticists to use the terms interchangeably. Researchers are discovering that it’s virtually impossible to study a single gene without also studying other genes, environmental factors and other influences on gene expression, making these two branches of molecular biology virtually synonymous.

Applying genetic and genomic testing to disease research

Genetics and genomics play a critical role in the study of human disease. Just as genes pass information about traits and characteristics from parent to offspring, they also inform an individual’s susceptibility to certain diseases and conditions.

For example, cystic fibrosis, Huntington’s disease and phenylketonuria are all inherited diseases that can be passed from parent to offspring, according to the National Human Genome Research Institute. It’s also believed that many lifestyle diseases — including obesity, heart disease, diabetes and certain types of cancers — have a genetic component.

For that reason, physicians have developed a series of genetic tests that examine an individual’s genome to help determine their genetic susceptibility to various diseases, both those with and without known cures.

According to Medline Plus, the most common types of genetic tests include:

  • Molecular tests evaluate a specific gene (or set of genes) to identify possible changes in the chromosome to make a disease or cancer diagnosis. Molecular tests are helpful to physicians when trying to confirm a diagnosis with limited or conflicting information, like when an individual’s symptoms could point to numerous possible conditions.
  • Chromosomal tests assess the entire chromosomal structure to pinpoint changes in the larger body that could point to a genetic disease. Certain conditions are the result of large-scale changes in an entire chromosome, including deletion or duplication of whole portions of the chromosome.
  • Gene expression tests examine the activity of genes in certain cells and organs. Even when an individual carries a specific gene or set of genes, it’s not guaranteed that those genes will be expressed (e.g., an individual can have a genetic predisposition to diabetes but never actually develop diabetes).
  • Biochemical tests focus on measuring the amount of specific proteins found in the blood. Biochemical tests help determine whether certain genes are overproducing (or underproducing) various types of protein, which could make an individual susceptible to disease.

The use of genetic testing can help physicians identify the possibility of disease long before an individual actually becomes sick, helping patients take preventive wellness measures far in advance.

The future of genomics and genetics: What to expect

The study of genomics and genetics is rapidly advancing, meaning it will likely play an increasingly important role in human disease research and prevention. Some of the emerging trends in genomics and genetics include:

Direct-to-consumer genetic testing

As the technology underlying genetic testing becomes more sophisticated, some private companies are making direct-to-consumer (DTC) genetic tests more readily available. This gives patients easy access to genetic tests without having to go through a healthcare organization.

There is an inherent risk associated with DTC genetic tests. The U.S. Food and Drug Administration (FDA) notes that not all over-the-counter genetic tests can back their claims with empirical evidence, meaning consumers might be receiving results that are imprecise (or wholly incorrect). That could cause them to make ill-informed decisions about their health.

Gene therapy

The advancement of genetic testing has given clinicians a comprehensive depth of information regarding the genetic makeup of patients. Now, that information is being used to inform genetic therapeutic remedies aimed at preventing inherited diseases and conditions.

Gene therapy is a type of medical treatment that enables scientists to replace defective genes, introduce healthy genes or deactivate genes, all of which can help prevent or mitigate certain genetic conditions.

Gene therapy is now being used to treat a number of complex diseases that have perplexed clinicians for generations. In 2017, the FDA approved the first gene therapy in the United States, this one targeting leukemia, according to Science magazine. This particular gene therapy involved introducing a new protein to a patient’s T cells to encourage them to target harmful leukemia cancer cells.

Stay ahead of advancements in genetics and genomics with Stanford Online

Stanford Online’s Genetics and Genomics Programs aim to equip learners with the knowledge and skills they need to keep up with recent trends in genetics and genomics research, helping them stay ahead in a rapidly advancing field of study.

Learn more about the Fundamentals of Genetics and Genomics and Advanced Topics in Genetics and Genomics Programs and sign up today.

Open Access

Mendel’s legacy in modern genetics

Affiliation Public Library of Science, San Francisco, California, United States of America and Cambridge, United Kingdom

  • Joanna Clarke, 
  • on behalf of the PLOS Biology Staff Editors

PLOS

Published: July 28, 2022

  • https://doi.org/10.1371/journal.pbio.3001760

A new collection of articles celebrating the bicentennial of Gregor Mendel’s birth discusses his life, work and legacy in modern-day genetic research.

Citation: Clarke J, on behalf of the PLOS Biology Staff Editors (2022) Mendel’s legacy in modern genetics. PLoS Biol 20(7): e3001760. https://doi.org/10.1371/journal.pbio.3001760

Copyright: © 2022 Clarke, on behalf of the PLOS Biology Staff Editors. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors received no specific funding for this work.

Competing interests: The authors are current paid employees of the Public Library of Science.

The PLOS Biology Staff Editors are Ines Alvarez-Garcia, Joanna Clarke, Kris Dickson, Richard Hodge, Paula Jauregui, Nonia Pariente, Roland Roberts, and Lucas Smith.

The field of biology owes a great debt to both genetic material and those who study it. From tiny bacteria to colossal giant sequoias, genetic material is the common thread that runs through all life forms and is even found in infectious agents such as viruses and in transposable elements. As such, although a field of study in its own right, genetics underpins every branch of biology and forms an important part of the majority of research questions.

July 20th 2022 marked the 200th anniversary of the birth of the scientist-monk J. Gregor Mendel, widely regarded as the founder of genetics. His experiments in selectively breeding pea plants and observing the way that different traits were passed on to each generation [1] paved the way for our current understanding of the principles that govern inheritance and have influenced present-day applications of genetic research, from in utero testing for genetically inherited diseases to genetically engineering crops to increase yields or nutritional content. Moreover, his meticulous approach to gathering and recording data is presented to biology students as a textbook example of the scientific method in practice.

In recognition of the legacy and impact of Mendel’s ground-breaking work, PLOS Biology has assembled a special collection of articles on the theme of ‘Mendel’s legacy in modern genetics’. In the collection, you will find Perspective articles from experts working across the field of genetics on how Mendel’s work has shaped their areas of interest [2, 3, 4], as well as an exploration of Mendel’s life and work as a scientist told through his own words [5]. The collection also contains Essays exploring different aspects and applications of modern genetics research; Sarah Garland and Helen Anne Curry use historical perspectives to ask whether gene editing of crops has lived up to its potential, charting the process from its early beginnings in Mendel’s work [6], and Laurence Hurst asks whether a greater understanding of selfish genetic elements, which do not adhere to the principles of Mendelian inheritance, can explain why so many human embryos have the wrong number of chromosomes and fail to develop [7].

The collection will be updated with content throughout the year, and we hope that you will take inspiration from our look back at Mendel’s original research and examination of how well it stands up to modern views on genetics. In the words of Eva Matalova in her sketch of Mendel’s life [5], “[t]o the present day, his ideas, modern attitude, and way of scientific critical thinking made his legacy ever living”.

What does a geneticist do?

What is a Geneticist?

A geneticist specializes in the field of genetics, the study of genes and heredity. Geneticists investigate how traits are inherited, how they manifest in individuals and populations, and how genetic variations contribute to human health, diseases, and evolution. They analyze and interpret genetic data, conduct experiments, and use various research techniques to explore the structure, function, and behavior of genes.

Geneticists play an important role in advancing our understanding of genetics and its applications. They may focus on different areas within genetics, such as molecular genetics, population genetics, medical genetics, or agricultural genetics. Their work involves conducting research, publishing scientific papers, collaborating with other scientists, and applying their findings to improve human health, develop treatments for genetic disorders, enhance crop production, or contribute to evolutionary studies. Geneticists also play a significant role in genetic counseling, helping individuals and families understand and navigate genetic risks, inherited conditions, and reproductive choices.

What does a Geneticist do?

Duties and Responsibilities

The duties and responsibilities of a geneticist can vary depending on their specific area of focus and the sector in which they work. However, here are some common tasks and responsibilities associated with the role:

  • Research and Investigation: Geneticists conduct research to explore various aspects of genetics. They design and execute experiments, analyze genetic data, and interpret the results. This research may involve studying specific genes, investigating genetic disorders or traits, or exploring the genetic basis of diseases. Geneticists use a range of tools and techniques, including genetic sequencing, genome mapping, and bioinformatics, to gather and analyze data.
  • Genetic Counseling: Geneticists often provide genetic counseling services to individuals and families. They help patients understand their genetic risks, evaluate inherited conditions, and make informed decisions regarding genetic testing, family planning, or treatment options. Genetic counselors communicate complex genetic information in a clear and compassionate manner, empowering patients to make well-informed choices regarding their healthcare.
  • Diagnosis and Treatment: Geneticists contribute to the diagnosis and treatment of genetic disorders. They evaluate patient medical histories, perform genetic testing, analyze test results, and provide recommendations for managing or treating genetic conditions. Geneticists collaborate with other healthcare professionals, such as physicians, genetic counselors, and laboratory technicians, to ensure accurate diagnosis and develop appropriate treatment plans.
  • Teaching and Education: Geneticists often serve as educators, sharing their knowledge and expertise with students, trainees, and other professionals. They may teach genetics courses at universities or contribute to training programs for medical professionals, genetic counselors, or laboratory technicians. Geneticists also engage in public outreach and education, disseminating information about genetics, genetic research, and the implications of genetic discoveries to the broader community.
  • Collaboration and Interdisciplinary Work: Geneticists frequently collaborate with researchers from different disciplines, including biologists, clinicians, epidemiologists, and bioinformaticians. They work together to advance scientific understanding, address complex research questions, and apply genetic findings to various fields. Collaboration may involve participating in multi-disciplinary research projects, attending conferences, and publishing scientific papers.
  • Ethical Considerations: Geneticists must consider the ethical implications of their work. They navigate issues related to privacy, informed consent, genetic testing, and the responsible use of genetic information. They adhere to ethical guidelines and standards set by professional organizations to ensure the ethical practice of genetics and protect patient rights.
  • Continuing Education and Professional Development: Geneticists stay updated with the latest advancements in the field by engaging in continuing education and professional development activities. They attend conferences, workshops, and seminars, and read scientific literature to expand their knowledge, learn new research techniques, and remain at the forefront of genetic research.

Types of Geneticists

There are various types of geneticists who specialize in different areas of genetics and pursue different career paths. Here are a few examples:

  • Molecular Geneticist: Molecular geneticists focus on studying the structure, function, and regulation of genes at the molecular level. They investigate the role of specific genes and their interactions, analyze DNA sequences, and explore molecular mechanisms underlying genetic disorders, gene expression, and genetic variation.
  • Medical Geneticist: Medical geneticists specialize in diagnosing and managing genetic disorders in clinical settings. They evaluate patients with suspected or confirmed genetic conditions, order and interpret genetic tests, provide genetic counseling, and collaborate with healthcare professionals to develop treatment plans for patients and families affected by genetic disorders.
  • Population Geneticist: Population geneticists study the genetic composition and variation within populations. They examine patterns of genetic diversity, evolution, and the genetic factors influencing population dynamics. Population geneticists use statistical methods, computational modeling, and genomic analysis to understand how genetic variation is distributed and how it evolves over time.
  • Cytogeneticist: Cytogeneticists study chromosomal abnormalities and their impact on health and development. They analyze chromosomes using techniques like karyotyping, fluorescence in situ hybridization (FISH), and chromosomal microarray analysis. Cytogeneticists work in clinical laboratories, conducting diagnostic tests and providing insights into chromosomal disorders and genetic syndromes.
  • Cytogenetic Technologist: Cytogenetic technologists specialize in studying the genetic composition of cells, particularly focusing on chromosomal abnormalities. Using karyotyping and fluorescence in situ hybridization (FISH), they analyze and interpret chromosomal structures to aid in the diagnosis of genetic disorders.
  • Genetic Counselor: Genetic counselors specialize in providing guidance and support to individuals and families who may be at risk of inherited genetic conditions. They help assess genetic risks, explain complex genetic information, coordinate genetic testing, and offer counseling regarding the implications and options for managing or preventing genetic disorders. Genetic counselors work closely with healthcare professionals and patients to navigate the complex landscape of genetics and make informed decisions.
  • Genomic Researcher: Genomic researchers focus on large-scale analysis of genomes to understand genetic variation, gene expression, and the genetic basis of complex traits or diseases. They use advanced sequencing technologies and bioinformatics tools to analyze genomic data sets, identify disease-associated genetic variants, and contribute to advancements in precision medicine and personalized genomics.
  • Plant or Animal Geneticist: Plant or animal geneticists study the genetics and breeding of plants or animals. They work in agriculture, conservation, or research institutions to enhance crop yield, develop disease-resistant varieties, or conserve endangered species. They employ genetic techniques to understand and manipulate the genetic traits of plants or animals for practical applications.

Are you suited to be a geneticist?

Geneticists have distinct personalities. They tend to be investigative individuals, which means they’re intellectual, introspective, and inquisitive. They are curious, methodical, rational, analytical, and logical. Some of them are also artistic, meaning they’re creative, intuitive, sensitive, articulate, and expressive.

What is the workplace of a Geneticist like?

The workplace of a geneticist can vary depending on their specific area of focus and the sector in which they work. Geneticists can be found in a variety of settings, including universities, research institutions, hospitals, clinics, biotechnology companies, and government agencies.

In academic settings, geneticists often work in research laboratories within universities or research institutions. They conduct experiments, analyze data, and publish their findings in scientific journals. Academic geneticists also have teaching responsibilities, including instructing undergraduate and graduate students, supervising research projects, and mentoring aspiring scientists.

Geneticists employed in hospitals or clinics may work in clinical laboratories, diagnostic centers, or specialized genetics departments. They collaborate with healthcare professionals, such as medical geneticists, genetic counselors, and laboratory technicians, to provide accurate diagnosis, genetic testing, and counseling services to patients and their families. They play a critical role in applying genetic knowledge to patient care and helping individuals understand and manage genetic conditions.

In the biotechnology and pharmaceutical industries, geneticists work on research and development projects related to drug discovery, genetic engineering, or personalized medicine. They may be involved in designing experiments, analyzing genomic data, and contributing to the development of new therapies or diagnostic tools. Geneticists in industry often collaborate with multidisciplinary teams, including bioinformaticians, molecular biologists, and clinicians, to translate genetic research into practical applications.

Government agencies and research institutes employ geneticists to conduct research, advise on policy matters, and contribute to public health initiatives. They may work on projects related to population genetics, epidemiology, environmental genomics, or genetic surveillance. Government-employed geneticists also play a role in regulatory oversight, ethical considerations, and the development of guidelines related to genetic research, testing, and clinical practice.

Regardless of the setting, geneticists typically spend a significant portion of their time in laboratories or research facilities. They may use a variety of equipment, technologies, and software tools to conduct experiments, analyze genetic data, and perform statistical analyses. Geneticists also attend scientific conferences, workshops, and meetings to present their work, exchange knowledge, and collaborate with other experts in the field.

The workplace of a geneticist fosters an environment of intellectual curiosity, scientific discovery, and collaboration. It provides opportunities for continuous learning, staying updated with advancements in the field, and making meaningful contributions to the understanding of genetics and its applications.

Frequently Asked Questions

Geneticist vs Cytogenetic Technologist

Geneticists and cytogenetic technologists are both professionals in the field of genetics, but they have distinct roles and responsibilities. Here's a comparison of the two:

Geneticist

  • Role and Expertise: Geneticists are scientists with advanced degrees (usually a Ph.D. or M.D.) who specialize in the study of genetics. They focus on broader aspects of genetics, including the study of genes, inheritance patterns, molecular biology, genomics, and the impact of genetics on health and disease.
  • Research and Clinical Work: Geneticists may engage in research, exploring the fundamental principles of genetics and contributing to scientific knowledge. Some geneticists work in clinical settings, providing genetic counseling, interpreting genetic tests, and diagnosing genetic disorders.
  • Education and Training: Geneticists typically undergo extensive education and training, often obtaining doctoral degrees in genetics or related fields. Their expertise spans a wide range of genetic concepts, and they may work in academia, research institutions, or healthcare settings.
  • Interdisciplinary Collaboration: Geneticists often collaborate with other specialists, such as clinicians, genetic counselors, and molecular biologists, to integrate genetic information into comprehensive patient care.

Cytogenetic Technologist

  • Role and Expertise: Cytogenetic technologists are professionals who specialize in the laboratory analysis of chromosomal structures within cells. Their primary focus is on techniques like karyotyping, fluorescence in situ hybridization (FISH), and other cytogenetic methods to identify chromosomal abnormalities.
  • Clinical Laboratory Work: Cytogenetic technologists work in clinical laboratories, analyzing patient samples to aid in the diagnosis of genetic disorders. They are skilled in handling and processing biological samples, conducting tests, and providing detailed reports based on their cytogenetic analyses.
  • Education and Training: Cytogenetic technologists typically have a bachelor's degree in a related field and may undergo specific training in cytogenetics. Their expertise lies in the practical application of cytogenetic techniques in a clinical or research laboratory setting.
  • Patient Interaction: While cytogenetic technologists may communicate findings to healthcare professionals, they generally do not have direct patient interactions or provide genetic counseling.

In summary, geneticists are scientists with a broader focus on genetics, conducting research and often working in clinical settings. Cytogenetic technologists, on the other hand, specialize in the laboratory analysis of chromosomal structures, providing critical information for the diagnosis of genetic disorders. Both roles are essential in advancing our understanding of genetics and improving patient care.

Genetics Basics

What to know.

This page provides information about basic genetic concepts such as DNA, genes, chromosomes, and gene expression.

  • Genes play a role in almost every human trait and disease.
  • Advances in our understanding of how genes work have led to improvements in health care and public health.

Your genes affect many things about you, including how you look (for example, your eye color or height) and how your body works (for example, your blood type). In some cases, your genes are linked to diseases that run in your family. In other cases, your genes influence how your body reacts to health conditions, such as infections; to medicines or other treatments for health conditions; or to certain behaviors, such as smoking or alcohol use.

Better understanding of how genes affect health can improve health in many ways. Knowing if someone has a genetic difference that makes them more likely to get a disease can help them take steps to prevent the disease or find it earlier, when it is easier to treat. If someone already has symptoms of a disease or condition, finding out the genetic difference that causes that disease or condition can help the healthcare provider understand what health outcomes the person might have in the future. Improved understanding of how genes are linked to disease can lead to better treatments for those diseases.

Terms to know

DNA (which is short for deoxyribonucleic acid) contains the instructions for making your body work. DNA is made up of two strands that wind around each other and looks like a twisting ladder (a shape called a double helix). Each DNA strand includes chemicals called nitrogen bases, which make up the DNA code. There are four different bases: T (thymine), A (adenine), C (cytosine), and G (guanine). Each base on one strand of DNA is paired with a base on the other strand: A pairs with T, and C pairs with G. The paired bases form the "rungs" of the DNA ladder.

The bases are in different orders on different parts of the DNA strand. DNA is "read" by the order of the bases, that is, by the order of the Ts, Cs, Gs, and As. The order of these bases is what is known as the DNA sequence. The DNA in almost all living things is made up of the same parts. What's different is the DNA sequence.
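As a rough illustration of base pairing and of what it means for two DNA sequences to differ, here is a minimal Python sketch. It uses the standard pairing rule (A pairs with T, C pairs with G). The function names and the example sequences are made up for illustration and do not come from any real genome or library.

```python
# Standard base pairing: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def paired_strand(sequence: str) -> str:
    """Return the base that sits opposite each position of `sequence`."""
    return "".join(PAIRS[base] for base in sequence.upper())


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical eight-base sequences, chosen only for the example.
print(paired_strand("ATGCCGTA"))                  # TACGGCAT
print(count_differences("ATGCCGTA", "ATGACGTA"))  # 1
```

The second call shows two sequences built from the same four bases that differ at a single position, which is the sense in which the sequence, not the parts, is what varies.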

Inheritance

Genetic inheritance is the process of passing down DNA from parents to children.

Genome

Your genome is all of the DNA in your body.

Chromosomes

DNA is packaged into small units called chromosomes. A chromosome contains a single, long piece of DNA with many different genes. You inherit your chromosomes from your parents. Chromosomes come in pairs. Humans have 46 chromosomes, in 23 pairs. Children randomly get one of each pair of chromosomes from their mother and one of each pair from their father. There are 22 pairs of numbered chromosomes, called autosomes, and the chromosomes that form the 23rd pair are called the sex chromosomes. They determine if a person is born a male or female. A female has two X chromosomes, and a male has one X and one Y chromosome. Each daughter gets an X from her mother and an X from her father. Each son gets an X from his mother and a Y from his father.

Genes and proteins

Each chromosome has many genes. Genes are specific sections of DNA that have instructions for making proteins. Proteins make up most of the parts of your body and make your body work the right way.

You have two copies of every gene. You inherit one copy from your father and one copy from your mother. The genes people inherit from their parents can determine many things. For example, genes affect what a person will look like and whether the person might have certain diseases.

Alleles are forms of the same gene that may have small differences in their sequence of DNA bases. These differences contribute to each person's unique features. Each person has two alleles for each gene, one from each parent. If the alleles of a gene are the same, the person is considered homozygous for the gene. If the alleles are different, the person is considered heterozygous for the gene.
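The homozygous/heterozygous distinction above can be sketched in a few lines of Python. This is a simplified model, assuming each parent carries exactly two alleles of one gene and passes one of them to a child. The allele labels "A" and "a" and the function names are hypothetical placeholders, not real gene variants.

```python
from itertools import product


def zygosity(allele_from_mother: str, allele_from_father: str) -> str:
    """Classify a pair of alleles as homozygous (same) or heterozygous (different)."""
    if allele_from_mother == allele_from_father:
        return "homozygous"
    return "heterozygous"


def possible_child_genotypes(mother, father):
    """List the distinct allele pairs a child could inherit.

    `mother` and `father` are each a pair of alleles; the child receives
    one allele from each parent. Parent of origin is ignored in the result.
    """
    combos = {tuple(sorted(pair)) for pair in product(mother, father)}
    return sorted(combos)


# Hypothetical alleles "A" and "a" of a single gene.
print(zygosity("A", "A"))  # homozygous
print(zygosity("A", "a"))  # heterozygous
print(possible_child_genotypes(("A", "a"), ("A", "a")))
# [('A', 'A'), ('A', 'a'), ('a', 'a')]
```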

Most of the time, differences between alleles do not have much of an effect on the protein that is made. However, sometimes different alleles can result in differences in traits, such as blood type. Some alleles are associated with health problems or genetic disorders. In these alleles, the differences in the sequence of DNA bases affect the body's ability to make a certain protein.

Because your genes were passed down from your parents, you and your family members share many gene alleles. The more closely related you are, the more gene alleles you have in common.

Cells

Cells are the basic units of life. The human body contains trillions of cells. There are many different types of cells that make up the many different tissues and organs in the body. For example, skin cells, blood cells, heart cells, brain cells, and kidney cells are just a few of the cell types that perform different vital functions in the body.

The basic structure of a cell is a jelly-like substance known as cytoplasm, which is surrounded by a membrane to hold it together. Within the cytoplasm are various specialized structures that are important to the work of the cell. One of these structures is the cell nucleus, which contains the DNA packaged in chromosomes.

Gene expression

Gene expression refers to the process of making proteins using the instructions from genes. A person's DNA includes many genes that have instructions for making proteins. Additionally, certain sections of DNA are not part of a gene but are important in making sure the genes are working properly. These DNA sections provide directions about where in the body each protein should be made, when it should be made, and how much should be made.

For the most part, every cell in a person's body contains exactly the same DNA and genes, but inside individual cells some genes are active ("turned on") while others are not. Differences in how genes are used (expressed) to make proteins are why the different parts of your body look and work differently. For example, gene expression in the muscles is different from gene expression in the nerves.

Gene expression can change as you age. Also, your behaviors, such as smoking or exercise, or exposures in your environment can affect gene expression.

DNA methylation

DNA methylation works by adding a chemical (known as a methyl group) to DNA. This chemical can also be removed from the DNA through a process called demethylation. Typically, methylation turns genes "off" and demethylation turns genes "on."

DNA methylation is one of the ways the body controls gene expression. Methylation and demethylation do not change the DNA code (the sequence of the DNA bases), but they help determine how much protein is made.

Genetic change (mutation, gene variant, genetic variant)

A genetic change (sometimes called a mutation, gene variant, or genetic variant) is a change in a DNA base sequence. While not all genetic changes will cause problems, sometimes, changes in genes can lead to changes in proteins and then the proteins don't work the way they are supposed to. This can lead to disease.

Some genetic changes can be passed on from parent to child (inherited). These genetic changes occur in the germ cells, which are the cells that create sperm or eggs. Genetic changes that occur in the other cells in the body (known as somatic cells) do not get passed on to a person's children.

Genetic changes happen when new cells are being made and the DNA is copied. Also, exposures, such as high levels of radiation, can damage the DNA and cause genetic changes. However, most exposures will not result in genetic changes because each cell in the body has a system in place to check for DNA damage and repair the damage once it's found.

Copy number variation (CNV)

Copy number variation (CNV) refers to a feature of the genome, in which various sections of a person's DNA are repeated. While this happens in all people, the number of repeats (or copies) varies from one person to the next. CNVs play an important role in creating genetic diversity in humans. However, some CNVs are linked to diseases.

Environmental factors

Environmental factors include exposures related to where we live, such as air pollution; behaviors, such as smoking and exercise; and other health-related factors, such as the foods that we eat.

Epigenetics

Epigenetics refers to the ways a person's behaviors and the environment can cause changes that affect the way the genes work. Epigenetics turns genes "on" and "off" and thus is related to gene expression.

Epigenetic patterns change as people age, both as part of normal development and aging and because of exposure to environmental factors over the course of a person's life. There are several different ways an environmental factor can cause an epigenetic change to occur. One of the most common ways is by causing changes to DNA methylation. DNA methylation works by adding a chemical (known as a methyl group) to DNA. This chemical can also be removed from the DNA through a process called demethylation. Typically, methylation turns genes "off" and demethylation turns genes "on." Thus, environmental factors can impact the amount of protein a cell makes. Less protein might be made if an environmental factor causes an increase in DNA methylation, and more protein might be made if a factor causes an increase in demethylation.

  • Medline Plus: Genetics This website has consumer-friendly information about the effects of genetic variation on human health.
  • National Human Genome Research Institute: About Genomics This website offers a talking glossary of genetic terms, fact sheets, and other genetics-related resources.
  • Genetic Science Learning Center: Learn. Genetics This website provides educational materials on life sciences for learners and interested individuals.
  • American Society of Human Genetics: Discover Genetics This website provides basic genetics information and resources.

bioRxiv

Tree sequences as a general-purpose tool for population genetic inference

Logan S Whitehouse, Daniel R Schrider

As population genetics data increase in size, new methods have been developed to store genetic information in efficient ways, such as tree sequences. These data structures are computationally and storage efficient, but they are not interchangeable with the existing data structures used by many population genetic inference methods, such as convolutional neural networks (CNNs) applied to population genetic alignments. To better utilize these new data structures, we propose and implement a graph convolutional network (GCN) to directly learn from tree sequence topology and node data, allowing for the use of neural network applications without an intermediate step of converting tree sequences to population genetic alignment format. We then compare our approach to standard CNN approaches on a set of previously defined benchmarking tasks including recombination rate estimation, positive selection detection, introgression detection, and demographic model parameter inference. We show that a GCN can learn directly from tree sequences and perform well on these common population genetics inference tasks, with accuracies roughly matching or even exceeding those of a CNN-based method. As tree sequences become more widely used in population genetics research, we foresee developments and optimizations of this work providing a foundation for population genetics inference moving forward.
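To make the idea of learning directly from tree topology and node data more concrete, here is a minimal sketch of a single graph-convolution step on one small genealogical tree. It is not the authors' architecture, which the abstract does not specify. It assumes the widely used propagation rule ReLU(D^(-1/2) (A + I) D^(-1/2) H W), and it uses a hand-written toy tree, made-up node times, and fixed made-up weights in place of a real tree sequence and a trained model.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Build D^(-1/2) (A + I) D^(-1/2) for an undirected graph."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for child, parent in edges:
        adj[child][parent] = 1.0
        adj[parent][child] = 1.0
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop so a node keeps its own features
    degree = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: mix neighbor features, project, apply ReLU."""
    hidden = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Toy tree (hypothetical): samples 0-3, internal nodes 4 and 5, root 6.
edges = [(0, 4), (1, 4), (2, 5), (3, 5), (4, 6), (5, 6)]
# Node features: [node time, is_sample flag]; times are made up.
features = [[0.0, 1.0]] * 4 + [[0.5, 0.0], [0.8, 0.0], [1.5, 0.0]]
# Fixed illustrative weights; in practice these would be learned.
weights = [[1.0, -0.5], [0.5, 1.0]]

a_hat = normalized_adjacency(7, edges)
embeddings = gcn_layer(a_hat, features, weights)
print(len(embeddings), len(embeddings[0]))  # 7 2
```

Each row of `embeddings` combines a node's own features with those of its parent and children, which is how topology enters the model. A full tree sequence contains many such trees along the genome, so a practical model would also need some way to combine information across them.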

Competing Interest Statement

The authors have declared no competing interest.

We have incorporated numerous additional analyses and revisions, including an examination of the impact of training misspecification on both neural network architectures.

Subject Area

  • Evolutionary Biology

Leading AI models struggle to identify genetic conditions from patient-written descriptions

NIH researchers find that large language models rely on concise, textbook-like language to evaluate medical questions.

National Institutes of Health (NIH) researchers have found that while artificial intelligence (AI) tools can make accurate diagnoses from textbook-like descriptions of genetic diseases, the tools are significantly less accurate when analyzing summaries written by patients about their own health. These findings, reported in the American Journal of Human Genetics, demonstrate the need to improve these AI tools before they can be applied in health care settings to help make diagnoses and answer patient questions.

The researchers studied a type of AI known as a large language model, which is trained on massive amounts of text-based data. These models have the potential to be very helpful in medicine due to their ability to analyze and respond to questions and their often user-friendly interfaces.

“We may not always think of it this way, but so much of medicine is words-based,” said Ben Solomon, M.D., senior author of the study and clinical director at the NIH’s National Human Genome Research Institute (NHGRI). “For example, electronic health records and the conversations between doctors and patients all consist of words. Large language models have been a huge leap forward for AI, and being able to analyze words in a clinically useful way could be incredibly transformational.”

The researchers tested 10 different large language models, including two recent versions of ChatGPT. Drawing from medical textbooks and other reference materials, the researchers designed questions about 63 different genetic conditions. These included some well-known conditions, such as sickle cell anemia, cystic fibrosis and Marfan syndrome, as well as many rare genetic conditions.

These conditions can show up in a variety of ways among different patients, and the researchers aimed to capture some of the most common possible symptoms. They selected three to five symptoms for each condition and generated questions phrased in a standard format, “I have X, Y and Z symptoms. What’s the most likely genetic condition?”

When presented with these questions, the large language models ranged widely in their ability to point to the correct genetic diagnosis, with initial accuracies between 21% and 90%. The best performing model was GPT-4, one of the latest versions of ChatGPT.

The success of the models generally corresponded with their size, measured by the number of parameters. The smallest models have several billion parameters, while the largest have over a trillion. For many of the lower-performing models, the researchers were able to improve the accuracy over subsequent experiments, and overall, the models still delivered more accurate responses than non-AI technologies, including a standard Google search.

The researchers optimized and tested the models in various ways, including replacing medical terms with more common language. For example, instead of saying a child has “macrocephaly,” the question would say the child has “a big head,” more closely reflecting how patients or caregivers might describe a symptom to a doctor.
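The standardized question format and the plain-language substitution described above can be sketched as a small helper. This is an illustration only, not the researchers' code. The function name is hypothetical, the only substitution taken from the text is "macrocephaly" to "a big head", and the example symptom list is made up rather than drawn from the study.

```python
# The single substitution mentioned in the article; a real mapping would be larger.
PLAIN_LANGUAGE = {"macrocephaly": "a big head"}


def build_question(symptoms, use_plain_language=False):
    """Format three to five symptoms into the standardized question."""
    if not 3 <= len(symptoms) <= 5:
        raise ValueError("expected three to five symptoms")
    if use_plain_language:
        # Swap medical terms for common wording where a substitution is known.
        symptoms = [PLAIN_LANGUAGE.get(term, term) for term in symptoms]
    listed = ", ".join(symptoms[:-1]) + " and " + symptoms[-1]
    return f"I have {listed} symptoms. What's the most likely genetic condition?"


# Hypothetical symptom list, for illustration only.
example = ["macrocephaly", "seizures", "developmental delay"]
print(build_question(example))
print(build_question(example, use_plain_language=True))
```

Keeping the template fixed and varying only the wording of the symptoms makes it possible to attribute a change in accuracy to the vocabulary rather than to the phrasing of the question.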

Overall, the models’ accuracy decreased when medical descriptions were removed. However, 7 out of 10 of the models were still more accurate than Google searches when using common language.

“It’s important that people without medical knowledge can use these tools,” said Kendall Flaharty, an NHGRI postbaccalaureate fellow who led the study. “There are not very many clinical geneticists in the world, and in some states and countries, people have no access to these specialists. AI tools could help people get some of their questions answered without waiting years for an appointment.”

To test the large language models’ efficacy with information from real patients, the researchers asked patients from the NIH Clinical Center to provide short write-ups about their own genetic conditions and symptoms. These descriptions ranged from a sentence to a few paragraphs and were also more variable in style and content compared to the textbook-like questions.

When presented with these descriptions from real patients, the best-performing model made accurate diagnoses only 21% of the time. Many models performed much worse, even as low as 1% accurate.

The researchers expected the patient-written summaries to be more challenging because patients at the NIH Clinical Center often have extremely rare conditions. The models may therefore not have sufficient information about these conditions to make diagnoses.

However, the accuracies improved when the researchers wrote standardized questions about the same ultrarare genetic conditions found among the NIH patients. This indicates that the variable phrasing and format of the patient write-ups was difficult for the models to interpret, perhaps because the models are trained on textbooks and other reference materials that tend to be more concise and standardized.

“For these models to be clinically useful in the future, we need more data, and those data need to reflect the diversity of patients,” said Dr. Solomon. “Not only do we need to represent all known medical conditions, but also variation in age, race, gender, cultural background and so on, so that the data capture the diversity of patient experiences. Then these models can learn how different people may talk about their conditions.”

Beyond demonstrating areas of improvement, this study highlights the current limitations of large language models and the continued need for human oversight when AI is applied in health care.

“These technologies are already rolling out in clinical settings,” Dr. Solomon added. “The biggest questions are no longer about whether clinicians will use AI, but where and how clinicians should use AI, and where should we not use AI to take the best possible care of our patients.”

About NHGRI and NIH

About the National Human Genome Research Institute (NHGRI): At NHGRI, we are focused on advances in genomics research. Building on our leadership role in the initial sequencing of the human genome, we collaborate with the world's scientific and medical communities to enhance genomic technologies that accelerate breakthroughs and improve lives. By empowering and expanding the field of genomics, we can benefit all of humankind. For more information about NHGRI and its programs, visit www.genome.gov.

About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit www.nih.gov.

Last updated: August 14, 2024

Mayo Clinic study uncovers genetic cancer risks in 550 patients

By Susan Murphy

Current screening protocols fail to catch a notable number of people carrying genetic mutations associated with hereditary breast and ovarian cancer syndrome and Lynch syndrome, which increase the risk of developing certain cancers. This issue is particularly pronounced among underrepresented minorities. 

These research findings, published in JCO Precision Oncology, are based on genetic screenings of more than 44,000 study participants from diverse backgrounds.

For this Mayo Clinic Center for Individualized Medicine Tapestry project, researchers sequenced the exomes — the protein-coding regions of genes — because this is where most disease-causing mutations are found. They identified 550 people, or 1.24%, as carriers of the hereditary mutations. 

Importantly, half of these people were previously unaware of their hereditary genetic risk and 40% did not meet existing clinical guidelines for genetic testing. 
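The figures above can be cross-checked with a little arithmetic. The sketch below assumes that "half" and "40%" both refer to the 550 carriers, which is one reading of the sentence. The article does not give the exact cohort size, so the value computed here is only what the two reported numbers imply.

```python
carriers = 550          # people identified as carriers (from the article)
carrier_rate = 0.0124   # 1.24% of participants (from the article)

# Cohort size implied by the two reported figures; not stated in the article.
implied_cohort = round(carriers / carrier_rate)

# Assumption: both proportions are taken out of the 550 carriers.
previously_unaware = round(carriers * 0.50)
outside_guidelines = round(carriers * 0.40)

print(implied_cohort)       # 44355
print(previously_unaware)   # 275
print(outside_guidelines)   # 220
```

The implied cohort of roughly 44,355 is consistent with the "more than 44,000" participants reported for the study.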

"This study is a wake-up call, showing us that current national guidelines for genetic screenings are missing too many people at high risk of cancer," says lead author Niloy Jewel Samadder, M.D., a Mayo Clinic gastroenterologist and cancer geneticist at the Center for Individualized Medicine and the Mayo Clinic Comprehensive Cancer Center. "Early detection of genetic markers for these conditions can lead to proactive screenings and targeted therapies, potentially saving lives of people and their family members."

Hereditary breast and ovarian cancer syndrome is linked to mutations in the BRCA1 and BRCA2 genes. Mutations in BRCA1 can lead to a 60% lifetime risk of developing breast cancer and a 40% risk of having ovarian cancer, among other cancers. BRCA2 mutations increase the risk of developing breast cancer to 50% and ovarian cancer to 20%, with additional risks for prostate and pancreatic cancers in males.

Lynch syndrome is associated with an 80% lifetime risk of developing colorectal cancer and a 50% risk of uterine/endometrial cancer.

The study also showed disparities in how underrepresented minority participants met genetic screening guidelines compared to other groups. 

"These results suggest the existing guidelines for genetic testing inadvertently introduce biases that affect who qualifies for testing and who receives coverage through health insurance. This leads to disparities in cancer prevention," Dr. Samadder says. "Our results emphasize the importance of expanding genetic screening to identify people at risk for these cancer predisposition syndromes."   

Advancing precision medicine with Tapestry 

Altogether, the Tapestry project has now sequenced the exomes of more than 100,000 patients and is integrating these results into the patients' electronic health records. This not only personalizes patient care but also provides a rich dataset for further genetic research. 

The overarching mission of Tapestry is to advance personalized medicine and tailor prevention and treatment strategies for individuals, thereby paving the way for targeted healthcare interventions for all. 

Review the study for a complete list of authors, disclosures and funding.

Read these articles to learn more about hereditary cancer risk and genetic screening:

  • A silent tumor, precancerous polyps and the power of genetic screening
  • 9 common questions about genetic testing for cancer
  • Mayo Clinic’s DNA study reveals BRCA1 mutations in 3 sisters, prompts life-changing decisions

This article was originally published on the Mayo Clinic News Network.

SBU News

A Genetic Analysis of Bacteria Strains Causing Lyme Disease Could Transform Treatment

An international research team including Dr. Benjamin Luft maps out the genomes of 47 strains and develops web-based software for future investigations

STONY BROOK, NY, August 15, 2024 – After years of research, an international team of scientists has unraveled the genetic makeup of 47 strains of known and potential Lyme disease-causing bacteria. The work paves the way toward more accurate diagnostic tests and targeted treatment against the many strains of Borrelia burgdorferi, the cause of Lyme disease, which remains the most prevalent tick-borne disease in the United States and Europe. The team’s findings are published in the journal mBio.

Lyme disease affects hundreds of thousands of people each year. In the United States alone, case numbers are approaching 500,000 per year. If left untreated, the infection can spread to joints, the heart and nervous system and cause more severe complications. The authors say that with climate change and potentially other environmental factors, cases of Lyme disease may only keep increasing worldwide. Additionally, some of the Borrelia species sequenced in this study that do not currently cause disease could be a genetic reservoir for the future evolution of these species.

“This is a seminal study with not only new genetic findings that map out the genomes of 47 strains of Borrelia, it is a body of work that provides researchers with data and tools going forward to better tailor treatment against all causes of Lyme disease and provides a framework toward similar approaches against other infectious diseases caused by pathogens,” says Benjamin Luft, MD, the Edmund D. Pellegrino Professor of Medicine at the Renaissance School of Medicine at Stony Brook University, and an internationally recognized expert in the investigation and treatment of Lyme disease. Stony Brook Medicine has a clinic dedicated to treating Lyme disease and all tick-borne infections and is home to the Regional Tick-Borne Disease Resource Center.

The research team encompassed investigators from more than a dozen research institutions around the world. In combination, they sequenced the complete genomes of Lyme disease bacteria representing all 23 known species in the group. Most of these hadn’t been sequenced before this effort. The sequencing included multiple strains of the bacteria most commonly associated with human infections and species not previously known to cause disease in humans.

By comparing these genomes, the researchers reconstructed the evolutionary history of Lyme disease bacteria, tracing the origins back millions of years. They discovered the bacteria likely originated before the breakup of the ancient supercontinent Pangea, which helps explain the current worldwide distribution.

The study also revealed how these bacteria exchange genetic material within and between species. This process, known as recombination, allows the bacteria to rapidly evolve and adapt to new environments. The researchers identified specific hot spots in the bacterial genomes where this genetic exchange occurs most frequently, often involving genes that help the bacteria interact with their tick vectors and animal hosts.

“By understanding how these bacteria evolve and exchange genetic material, we’re better equipped to predict and respond to changes in their behavior, including potential shifts in their ability to cause disease in humans,” explains Weigang Qiu, PhD, Senior Author and Professor of Biology at City University of New York.

To facilitate ongoing research, the team has developed web-based software tools (BorreliaBase.org) that enable scientists to compare Borrelia genomes and identify determinants of human pathogenicity.

Future collaborative research by the international team includes a plan to expand the genome analysis to include more strains of Lyme disease bacteria, particularly from understudied regions. They will also investigate the specific functions of genes unique to disease-causing strains, which could reveal new targets for therapeutic interventions.

The research leading to this published work was funded primarily by the National Institutes of Health’s National Institute of Allergy and Infectious Diseases (NIAID). The research was also supported by the Steven & Alexandra Cohen Foundation.

Twenty authors, including Dr. Luft, are listed on the paper. Leading collaborators and co-authors include Sherwood Casjens of the University of Utah School of Medicine, Weigang Qiu of the City University of New York, Steven Schutzer of Rutgers New Jersey Medical School, Claire Fraser and Emmanuel Mongodin of the University of Maryland School of Medicine, and Richard G. Morgan of New England BioLabs.

Related Posts

Ankur

Cancel reply

Your Website

Save my name, email, and website in this browser for the next time I comment.

This site uses Akismet to reduce spam. Learn how your comment data is processed .

I would like to sign up for any studies you are doing. I believe I had a bite sometime in April and now I have sciatica in my back left thigh and bottom of my tush. More details to follow if you need me in your study. ie blood tests treatments etc Theodora Reiner (Teddy) I live in Setauket NY near Stony Brook



Institute of Genetics and Cancer

Early Career Prize for Scott Waddell

Postdoctoral Research Fellow Dr Scott Waddell has won a Scottish Universities Life Sciences Alliance (SULSA) award for his work looking at polycystic liver disease: May 2024

Scott Waddell

Scott, who is part of the Luke Boulter Research Group at the MRC Human Genetics Unit, was a winner in the key life sciences area of Human (with other prizes awarded in the categories of Plant, Animal and Microbe).

SULSA, an alliance of 12 Scottish universities and one research institute, aims to advance Scotland’s research and innovation in the life sciences through strategic collaboration across institutes, disciplines and sectors.

The prize is awarded to outstanding Postdoctoral Researchers whose work shows excellent potential to make an impact in the field of life sciences, enabling them to raise their profile and develop independent networks.

It includes a fully funded tour - worth £1,000 - of three Scottish universities where Scott will deliver a seminar, and £1,000 of flexible funding to be used at his discretion for attending conferences, buying consumables, attending training courses or visiting collaborators.

“It was great to share my work and listen to other postdocs present their findings in a range of fields. I’m delighted to have won in my session and look forward to visiting and interacting with other Scottish institutes and universities to promote my work,” said Scott Waddell.
  • SULSA ECR prize
  • Watch Scott’s SULSA presentation on YouTube: https://www.youtube.com/watch?v=mBY-82AYgc4&feature=youtu.be

  • Open access
  • Published: 12 August 2024

A genome-wide investigation into the underlying genetic architecture of personality traits and overlap with psychopathology

  • Priya Gupta 1 , 2 ,
  • Marco Galimberti   ORCID: orcid.org/0000-0001-6052-156X 1 , 2 ,
  • Yue Liu 3 ,
  • Sarah Beck   ORCID: orcid.org/0000-0003-4176-2936 1 , 2 ,
  • Aliza Wingo   ORCID: orcid.org/0000-0002-6360-6726 4 , 5 ,
  • Thomas Wingo   ORCID: orcid.org/0000-0002-7679-6282 3 ,
  • Keyrun Adhikari   ORCID: orcid.org/0000-0001-9129-1699 1 , 2 ,
  • Henry R. Kranzler   ORCID: orcid.org/0000-0002-1018-0450 6 , 7 ,
  • VA Million Veteran Program ,
  • Murray B. Stein   ORCID: orcid.org/0000-0001-9564-2871 8 , 9 ,
  • Joel Gelernter   ORCID: orcid.org/0000-0002-4067-1859 1 , 2 &
  • Daniel F. Levey   ORCID: orcid.org/0000-0001-8431-9569 1 , 2  

Nature Human Behaviour (2024)


  • Genetic variation
  • Human behaviour

Personality is influenced by both genetic and environmental factors and is associated with other psychiatric traits such as anxiety and depression. The ‘big five’ personality traits, which include neuroticism, extraversion, agreeableness, conscientiousness and openness, are a widely accepted and influential framework for understanding and describing human personality. Of the big five personality traits, neuroticism has most often been the focus of genetic studies and is linked to various mental illnesses, including depression, anxiety and schizophrenia. Our knowledge of the genetic architecture of the other four personality traits is more limited. Here, utilizing the Million Veteran Program cohort, we conducted a genome-wide association study in individuals of European and African ancestry. Adding other published data, we performed genome-wide association study meta-analysis for each of the five personality traits with sample sizes ranging from 237,390 to 682,688. We identified 208, 14, 3, 2 and 7 independent genome-wide significant loci associated with neuroticism, extraversion, agreeableness, conscientiousness and openness, respectively. These findings represent 62 novel loci for neuroticism, as well as the first genome-wide significant loci discovered for agreeableness. Gene-based association testing revealed 254 genes showing significant association with at least one of the five personality traits. Transcriptome-wide and proteome-wide analysis identified altered expression of genes and proteins such as CRHR1, SLC12A5, MAPT and STX4. Pathway enrichment and drug perturbation analyses identified complex biology underlying human personality traits. We also studied the inter-relationship of personality traits with 1,437 other traits in a phenome-wide genetic correlation analysis, identifying new associations. Mendelian randomization showed positive bidirectional effects between neuroticism and depression and anxiety, while a negative bidirectional effect was observed for agreeableness and these psychiatric traits. This study improves our comprehensive understanding of the genetic architecture underlying personality traits and their relationship to other complex human traits.


Personality dimensions influence behaviour, thoughts, feelings and reactions to different situations. A valuable construct within the field of psychological research has converged on five different dimensions to characterize human personality: neuroticism, extraversion, agreeableness, conscientiousness and openness 1 , 2 . Personality dimensions could be playing an important role in the susceptibility and resilience to diagnosis of psychiatric disorders and their relationship with other health-related traits and responses to treatment.

The last decade has seen an increasing interest in understanding the dimensions of human personality through the lens of genetics. Depression is one mental disorder that has been studied with respect to its relationship to personality traits, with a large portion of genetic risk for depression being captured by neuroticism 3 . The same study found a modest negative association of genetic depression risk with conscientiousness, with small contributions from openness, agreeableness and extraversion. Neuroticism is one of the most studied dimensions of the ‘big five’ personality traits and numerous studies have found positive correlations with depression, anxiety and other mental illnesses 3 , 4 , 5 . Schizophrenia has also been associated with personality traits, especially neuroticism, which has been shown to increase risk for diagnosis 6 . A study using data from the Psychiatric Genomics Consortium (PGC) and personal genomics company 23andMe found two genomic loci to be common between neuroticism and schizophrenia. This study also reported six loci shared between schizophrenia and openness 7 .

The past 15 years have seen an explosion in the use of the genome-wide association study (GWAS). In 2010, Marleen de Moor et al. from the Genetics of Personality Consortium (GPC) published a GWAS of the ‘big five’ personality traits conducted with 17,375 adults from 15 different samples of European ancestry (EUR) 8 . This study found two genome-wide significant (GWS) variants near the RASA1 gene on 5q14.3 for openness and one near KATNAL2 on 18q21.1 for conscientiousness but no significant associations for other personality traits. GPC then conducted studies on extraversion and neuroticism in their second phase and meta-analyses were performed. A GWAS of neuroticism that was conducted on approximately 73,000 subjects identified rs35855737 in the MAGI1 gene as a GWS variant 9 . Although the sample size was increased substantially to 63,030 subjects in phase II, no GWS variants were detected for extraversion in that study 10 . In 2016, Lo et al. identified six loci associated with different personality traits, including loci for extraversion 11 . A paper that investigated neuroticism along with subjective well-being and depressive symptoms leveraging the UK Biobank (UKB) and other published data 12 was published that same year. A more detailed picture of neuroticism genetics was presented by Nagel et al. in 2018 13 , where the authors collected neuroticism genotype data of 372,903 individuals from the UKB and performed a meta-analysis by combining the summary statistics from this UKB sample, 23andMe and GPC phase 1 samples, increasing the total sample size to 449,484. They identified a total of 136 loci and 599 genes showing GWS associations to neuroticism. In 2021, Becker et al. conducted a polygenic index study and created a resource with GWAS meta-analysis summary statistics combining different data cohorts for a large number of traits, including neuroticism, thus increasing the total sample size of neuroticism meta-analysis to 484,560 and increasing the number of novel GWS loci (although this was not the focus of this work) 14 . They also identified six genomic loci for extraversion.

In this work, we conducted GWAS of each of the ‘big five’ personality traits in a sample of ~224,000 individuals with genotype data available from the Million Veteran Program (MVP). Using linkage disequilibrium score regression (LDSC), we estimated the single-nucleotide polymorphism (SNP)-based heritability of each of the five personality traits. We then combined the MVP data with other sources of personality GWAS summary statistics from GPC and UKB and performed meta-analyses for each of the five personality traits, including as many as ~680,000 participants for the largest meta-analysis of neuroticism so far. To gain insights into the biology of these traits, we performed transcriptome-wide association studies (TWAS) and proteome-wide association studies (PWAS) followed by pathway and drug perturbation analyses and variant fine-mapping. We also studied the overlap of these personality traits with anxiety and other complex traits through phenome-wide genetic correlations and conditional analyses. We performed drug perturbation analyses with genes associated with neuroticism and found convergence on drugs for major depressive disorder (MDD). Finally, we conducted Mendelian randomization (MR) experiments to investigate the causal relationship of neuroticism and agreeableness, the two most genetically divergent traits, with depression and anxiety.

In the EUR GWAS in the MVP cohort, we identified in total 34 unique independent genomic loci significantly associated (P < 5 × 10^−8) with at least one of the five personality traits (Table 1). The highest numbers of loci were found for extraversion and neuroticism (11 for each) while conscientiousness showed only two loci. In the MVP we identified 4,036 GWS variants (P < 5 × 10^−8) for neuroticism across 11 independent genomic loci harbouring genes including MAD1L1, MAP3K14, CRHR1, CRHR1-IT1 and VK2. Of these 11 loci, two (rs2717043 and rs4757136) were also reported to be GWS in Nagel et al. 13 . We identified 11 GWS loci for extraversion, the largest number of GWS loci to be identified for this trait. Associations for extraversion were found near several genes, including CRHR1, MAPT and METTL15 (total 90 genes). For the two conscientiousness loci, the first locus maps to a region near the genes FOXP2, PPP1R3A and MDFIC and the second locus maps to the ZNF704 gene, all of which are protein-coding genes. For openness, 7 loci were identified spanning over 39 genes, including BRMS1, RIN1 and B3GNT1. For agreeableness, 3 loci were identified spanning 19 genes, including SOX7, PINX1 and FOXP2. The Manhattan plots for all five traits are shown in Supplementary Fig. 1.

Two GWS variants were found for agreeableness in the African ancestry (AFR) sample. Variants rs2393573 (effect size, −0.106; standard error of the mean (s.e.m.), 0.018; 95% confidence interval (CI) −0.141, −0.071; P = 7.502 × 10^−9) and rs112726823 (effect size, −0.720; s.e.m., 0.130; 95% CI −0.975, −0.465; P = 3.268 × 10^−8) mapped near CCDC6 and ARHGAP24. We did not find any GWS variants for any of the other four personality traits in the AFR sample; the multiple subthreshold findings from this analysis may reach the GWS threshold in a larger sample. A list of lead independent SNPs found in the AFR sample for each trait is provided in Supplementary Tables 1–5.
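The confidence intervals quoted alongside each lead variant can be rebuilt from the effect size and its standard error. The following is a minimal sketch, assuming a normal (Wald) approximation with a 1.96 multiplier; the function name `wald_ci` is illustrative and does not come from the paper.

```python
def wald_ci(effect: float, sem: float, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) bounds of a Wald confidence interval."""
    half_width = z * sem  # 1.96 standard errors on each side for a 95% CI
    return effect - half_width, effect + half_width


# Effect size and s.e.m. reported in the text for rs2393573 (agreeableness, AFR).
lower, upper = wald_ci(-0.106, 0.018)
print(f"rs2393573: 95% CI ({lower:.3f}, {upper:.3f})")  # rs2393573: 95% CI (-0.141, -0.071)
```

Because the effect is negative and the interval excludes zero, both bounds are negative.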

Meta-analysis in EUR populations

The meta-analysis for neuroticism showed associations with 208 independent GWS loci. The increased power due to the inclusion of MVP data resulted in the identification of 79 additional GWS loci, which were not significant in the previous study 13 . Only five loci identified previously (rs1763839, rs2295094, rs11184985, rs579017 and rs76923064) were no longer significant in our meta-analysis. A total of 17 of these 79 loci have also been discovered in the polygenic index study (Supplementary Table 6). Thus, we found 62 novel loci associated with neuroticism in our meta-analysis. SNPs and loci were mapped to genes based on chromosomal position, expression quantitative trait loci (eQTL) and chromatin interaction 15 . A total of 231 genes were found significant in the MAGMA (Multi-marker Analysis of GenoMic Annotation) gene-based test 16 . NSF, KANSL1, FMNL1, PLEKHM1 and CRHR1 (P < 2.850 × 10^−40) were among the top significant hits. The largest number of significant loci are located on chromosome 11, followed by chromosome 1. The GWS associations also include two loci with variants rs7818437 (effect, −0.021; s.e.m., 0.002; 95% CI −0.025, −0.017; P = 7.599 × 10^−17) and rs76761706 (effect, −0.035; s.e.m., 0.002; 95% CI −0.039, −0.031; P = 2.850 × 10^−40) located in inversion regions on chromosomes 8 and 17, respectively. Variants in these two inversion regions were also previously reported to be significantly associated with neuroticism in the study by Okbay et al. 12 .

For extraversion, after meta-analysing the MVP and GPC data, the number of significant loci increased to 14. The lead signals were located on chromosomes 1–6, 11, 12, 17 and 19. The most significant locus harbours genes in/near WSCD2 (P < 3.449 × 10^−11) located on chromosome 12.

Chromosome 11 contains significant variant associations from three traits, namely neuroticism, extraversion and agreeableness, with neuroticism and extraversion both having findings near the ‘basic helix-loop-helix ARNT like 1’ (ARNTL1, also known as BMAL1) gene, with opposing and significant direction of effect at common variants. Complete information on all identified significant loci for each of the five traits, with full statistics, is provided in Supplementary Tables 6–10. The cohorts used in the meta-analysis are depicted in Fig. 1a. Manhattan plots for meta-analyses of each of the five traits are depicted in Fig. 2.

figure 1

a, Data collection of the five personality traits. b, Genetic correlation matrix among the five personality traits (meta-analysis data). The heritability value of the respective trait is written in parentheses. c, A karyogram showing the regions with significant local genetic correlation (rG > 0.3) between different personality traits.

figure 2

The GWS variants are shown in light green. Reported P values are two-sided and not corrected for multiple testing. The GWS threshold (P = 5 × 10^−8) is used to define significant variants and is depicted by the red line.

Trans-ancestry analysis

We performed trans-ancestry meta-analysis of the five personality traits combining EUR and AFR GWAS for each of the five traits using inverse variance weighting in METAL 17 . For neuroticism, the trans-ancestry analysis identified a total of 216 GWS loci, of which 16 are novel, that is, they were not GWS in the EUR meta-analysis (Supplementary Tables 11–15). Of the 208 GWS loci for neuroticism in the EUR meta-analysis, 200 remained GWS in the trans-ancestry analysis, while the remaining 8 showed a marginally higher P value and thus do not pass the GWS threshold in the trans-ancestry analysis. For agreeableness and conscientiousness, in addition to the loci that were shown to be GWS in their respective EUR meta-analysis, two more novel loci (rs140242735 located on chromosome 8 and rs10864876 located on chromosome 2 for agreeableness and conscientiousness, respectively) were identified as GWS in the trans-ancestry analysis. In the case of openness, two loci out of the three that were identified as GWS in EUR remained GWS in the trans-ancestry analysis. For extraversion, a total of 13 loci were identified as GWS in the trans-ancestry analysis, of which 10 were also GWS in the EUR meta-analysis and 3 were newly identified.
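To illustrate what inverse variance weighting does when two ancestry-specific estimates for the same variant are combined, here is a minimal sketch of the standard fixed-effect formula. It is not METAL's code, and the per-ancestry effect sizes below are hypothetical values chosen only for the example.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis of one variant."""
    weights = [1.0 / se ** 2 for se in ses]  # weight = 1 / variance
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)  # pooled standard error
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return beta, se, z, p


# Hypothetical per-ancestry estimates for a single variant (not from the paper).
eur_beta, eur_se = -0.020, 0.002
afr_beta, afr_se = -0.030, 0.010

beta, se, z, p = ivw_meta([eur_beta, afr_beta], [eur_se, afr_se])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

The more precise estimate dominates the pooled effect, and the pooled standard error is smaller than either input. A variant that is only just significant in one ancestry can therefore move slightly above or below the threshold once a noisier estimate is added.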

We performed TWAS for each of the ‘big five’ personality traits in EUR (meta-analysis) using FUSION 18 and the GWAS summary statistics. We performed a multi-tissue TWAS in 13 different brain subtissues and blood using their respective expression profiles from the Genotype-Tissue Expression project (GTEx v8) 19 . From a total of 10,386 genes tested, we identified a total of 175, 24, 5, 1 and 11 genes showing significant gene–trait associations across the 13 subtissues in neuroticism, extraversion, agreeableness, conscientiousness and openness, respectively, after Bonferroni correction for 135,018 tests (10,386 genes across 13 tissues) (Fig. 3a). Figure 3a shows the distribution of associations found across the 13 tissues for each trait. The highest number of gene–trait associations were found in brain caudate basal ganglia, cerebellum, cerebral hemisphere and frontal cortex regions for neuroticism and extraversion, while fewer TWAS gene–trait associations were identified for the other three personality traits, presumably owing to the comparatively lower power of their respective GWAS datasets.
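The Bonferroni correction described above amounts to dividing the family-wise error rate by the number of gene-by-tissue tests. The sketch below reproduces that arithmetic; the 0.05 family-wise level is an assumption, since the text does not state the alpha that was used.

```python
n_genes = 10_386    # genes tested, as stated in the text
n_tissues = 13      # tissues, as stated in the text
alpha = 0.05        # assumed family-wise error rate (not stated in the text)

n_tests = n_genes * n_tissues
threshold = alpha / n_tests  # per-test significance threshold

print(n_tests)  # 135018
print(f"Bonferroni threshold: {threshold:.3e}")
```

Under that assumption a gene–trait association would need a P value of roughly 3.7 × 10^−7 or smaller to be counted as significant.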

figure 3

a, A bar chart showing the number of significant TWAS genes or transcripts found in each subtissue for the four personality traits with significant findings. Scatter plots of neuroticism (b), agreeableness (c), extraversion (d) and openness (e) with TWAS z-scores of each gene transcript plotted on the y axis and its respective chromosomal location plotted on the x axis. The significant hits are shown in red circles with mapped gene names as labels. The blue horizontal line indicates the significance threshold of the z-score corresponding to the Bonferroni-corrected, two-sided P value. Conscientiousness data are reported in Supplementary Table 22.

CRHR1, KANSL1-AS1 and MAPT-IT1 are among the top TWAS gene associations (P < 1.32 × 10^−23) for neuroticism (Fig. 3b). The strong association of CRHR1 (encoding corticotropin-releasing hormone receptor 1), which in some prior work has been shown to be associated with treatment response to depression 20 , may suggest some common underlying elements regulating both neuroticism and depression. Extraversion also shows strong gene–trait associations with CRHR1, KANSL1-AS1 and MAPT-IT1 but with an opposite direction of effect to neuroticism. This may indicate some common genetic components whose differential behaviour regulates neuroticism and extraversion. There are nine such genes showing opposite direction of effect in neuroticism and extraversion (Supplementary Table 3).

LOC10271024064 and LRFN4 showed the strongest associations with openness and LINCR-0001 and FAM167A showed the strongest associations with agreeableness, while only one gene, AP1G1, showed association with conscientiousness in the 13 tissues considered. The complete list of all GWS TWAS gene hits for the five personality traits is provided in Supplementary Table 22.

We investigated the association of personality traits with protein expression using PWAS. Based on the availability of protein profiles and the observed TWAS signal, dorsolateral prefrontal cortex brain protein profiles were chosen for the PWAS analysis. The PWAS identified 47 proteins to be significantly associated with neuroticism. Next, we checked the colocalization signal for these PWAS lead genes. Out of 47 PWAS lead genes, 35 genes showed a colocalization signal (H4 probability >0.5).

Five, two, two and four proteins were discovered for extraversion, agreeableness, conscientiousness and openness, respectively (Fig. 4 ). A complete list of all PWAS lead genes is provided in Supplementary Table 23 .

figure 4

A Manhattan plot is displayed showing the significant protein associations observed for neuroticism. The red line in the plot depicts the Bonferroni-corrected, two-sided P value threshold at 5% FDR. The boxes on the right show the significant proteins found for the respective four personality traits.

We first used LDSC to calculate SNP-based heritability of each of the five personality traits within the MVP EUR cohort. The intercepts of the LDSC indicated no evidence for population stratification, with observed values of 1.01, 1.02, 0.99, 1.02 and 1.00 for neuroticism, extraversion, agreeableness, conscientiousness and openness, respectively. The SNP heritability ranges from 4% to 7% (Supplementary Fig. 2), with extraversion showing the highest heritability point estimate of all traits (neuroticism h² = 0.0655; s.e.m., 0.004; 95% CI 0.058, 0.073; agreeableness h² = 0.042; s.e.m., 0.003; 95% CI 0.036, 0.048; extraversion h² = 0.071; s.e.m., 0.003; 95% CI 0.065, 0.077; openness h² = 0.048; s.e.m., 0.003; 95% CI 0.042, 0.054; and conscientiousness h² = 0.047; s.e.m., 0.003; 95% CI 0.041, 0.053).
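Why an intercept close to 1 is read as an absence of stratification follows from the standard LD score regression model, given here as general background rather than as a formula taken from this paper's Methods. For SNP j with LD score \ell_j, sample size N, number of SNPs M and a confounding term a (all symbols are the conventional ones, not the paper's notation):

```latex
\mathbb{E}\left[\chi^2_j \mid \ell_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1
```

The slope carries the SNP heritability h^2, while the intercept equals N a + 1. With no confounding (a = 0) the intercept is 1, which is the pattern the observed values of 0.99 to 1.02 are close to.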

For the MVP AFR cohort, cov-LDSC was utilized to estimate personality heritabilities ( Methods ) 21 . Relative to the MVP EUR cohort, neuroticism and extraversion showed lower heritability (4.47% and 3.30%, respectively) in the AFR cohort, while for agreeableness, the heritability was similar (4.24%) (Supplementary Table 1 ). The values were not significant for conscientiousness and openness in AFR.

Before combining the MVP cohort-derived summary statistics with other data sources, we calculated the genetic correlation between the MVP personality summary statistics and other respective sources (Supplementary Table 2 ). A correlation coefficient value of 0.80 (s.e.m., 0.02) observed for the neuroticism summary statistics from the MVP cohort and Nagel et al. study 13 suggests that there is limited heterogeneity between the two datasets and supports their use in a meta-analysis. As shown in Supplementary Table 2 , the genetic correlations were high for all other four traits across data sources as well.

LDSC was used to estimate SNP-based heritability in the EUR participants for each personality trait in the meta-analysis. The SNP heritability values in the meta-analyses were similar to what was observed in the MVP-only cohort for the different traits in the EUR, with a decrease in heritability of extraversion from 7.1% to 5.1% (Fig. 1b ).

Genetic correlation estimates were also obtained between the meta-analysis summary statistics for the five personality traits. We found significant but varying degrees of genetic overlap among the five personality traits. The genetic correlations are presented in Fig. 1b. The strongest correlation is observed between neuroticism and agreeableness, with rG = −0.51 (s.e.m., 0.030; P = 3.813 × 10^−64).
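As a rough check on how a genetic correlation and its standard error translate into a P value, the sketch below applies a two-sided normal test to the rounded values quoted above. This assumes a simple z-test, and because the inputs are rounded the result only approximates the published P value.

```python
import math


def two_sided_p(estimate: float, sem: float) -> tuple[float, float]:
    """Return the z-score and two-sided normal P value for an estimate."""
    z = estimate / sem
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Rounded rG and s.e.m. for neuroticism vs agreeableness, as quoted in the text.
z, p = two_sided_p(-0.51, 0.030)
print(f"z = {z:.1f}, P = {p:.2e}")
```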

Next, we estimated the genetic correlations of 1,437 traits listed in the Complex Traits Genetics Virtual Lab 22 summary statistics record to find other traits related to the five personality traits (Supplementary Tables 16–20). A total of 325 traits showed significant genetic correlation with one or more personality traits following multiple testing correction. We found that MDD and anxiety showed varying degrees of significant correlation with different personality traits, as shown in Fig. 5. The highest genetic correlation is between neuroticism and anxiety (rG = 0.80). Neuroticism and agreeableness both show high genetic correlations with these traits, but in opposite directions: with MDD (neuroticism rG = 0.68; s.e.m. 0.02; P < 5.00 × 10^−100 and agreeableness rG = −0.35; s.e.m. 0.04; P = 1.53 × 10^−22), manic behaviour (neuroticism rG = 0.44; s.e.m. 0.08; 95% CI 0.283, 0.597; P = 1.11 × 10^−8 and agreeableness rG = −0.35; s.e.m. 0.11; 95% CI −0.566, −0.134; P = 1.556 × 10^−3), anxiety (neuroticism rG = 0.80; s.e.m. 0.06; 95% CI 0.682, 0.918; P = 1.54 × 10^−46 and agreeableness rG = −0.32; s.e.m. 0.08; 95% CI −0.477, −0.163; P = 7.28 × 10^−5) and irritability (neuroticism rG = 0.70; s.e.m. 0.02; 95% CI 0.661, 0.739; P < 5.00 × 10^−100 and agreeableness rG = −0.62; s.e.m. 0.04; 95% CI −0.698, −0.542; P = 9.76 × 10^−61).

figure 5

The y axis is the genetic correlation. Error bars (in black) indicate the 95% CIs of the estimated genetic correlation. Anxiety indicates substances taken for anxiety; medication is prescribed for at least 2 weeks. Heavy DIY activities describes the types of physical activity in the last 4 weeks; for example, weeding, lawn mowing, carpentry and digging. Manic behaviour describes manic/hyper behaviour for 2 days. Detailed results for all traits, including the sample size of each of the traits, are presented in Supplementary Tables 16–20.

Local genetic correlations

Global genetic correlations use the average squared signal over the entire genome, which may sometimes mask opposing local correlations in different genomic regions. To counter that, we also calculated the local genetic correlations among the five personality trait pairs using Local Analysis of [co]Variant Association (LAVA) 23 . All personality pairs showed varying degrees of correlation in different genomic regions except for the neuroticism–openness pair, which showed negligible global (rG = −0.01) and no local genetic correlation. The highest number of correlated genomic chunks were found for the neuroticism–extraversion and neuroticism–agreeableness pairs (Fig. 1c and Supplementary Table 21).

Variant fine-mapping

To identify well-supported possible causal variants from the large list of SNPs showing associations with the personality traits, we performed genome-wide variant fine-mapping using PolyFun 24 . In total, 166 unique variants were fine-mapped across the five personality traits. The number of variants fine-mapped for neuroticism, extraversion, agreeableness, conscientiousness and openness were 155, 8, 4, 7 and 3, respectively. The complete list of variants fine-mapped for each of the personality traits is provided in the Supplementary Tables 24 – 28 .

Relationship between personality and psychiatric disorders

We performed additional analyses to help understand the significant differential genetic correlation observed between neuroticism and agreeableness with different psychiatric disorders such as MDD and anxiety.

Conditional analysis

Because the genetic correlation between anxiety and neuroticism was so high, we performed multi-trait-based conditional and joint analysis of neuroticism summary statistics conditioned on anxiety and MDD summary statistics individually. The anxiety and MDD summary statistics used are based on data from UKB, MVP and PGC with individuals of EUR ancestry (see Methods for details). We performed a similar analysis with agreeableness, which had a negative correlation with both MDD and anxiety, as a negative control.

After conditioning on MDD, the SNP heritability of the conditioned neuroticism summary statistic reduced significantly from 7.8% to 3% (Table 2 ). Out of the original 208 GWS leads, only 42 remained significant after conditioning, indicating there is substantial genetic overlap between neuroticism and MDD, which gets removed after conditioning. In case of conditioning on anxiety, again there is a decrease in neuroticism heritability, but to a lesser extent (Table 2 ). On conditioning agreeableness on MDD and anxiety, no significant reduction in heritability was observed. However, loss of one genomic locus, rs7240986 (18:53195249:A:G), was observed after conditioning on either anxiety or MDD for agreeableness.

Drug perturbation analysis

We performed a drug perturbation analysis to find drug candidates for neuroticism-enriched genes using gene2drug software 25 . Gene2drug utilizes the Connectivity Map transcriptomics data of ~13,000 cell lines exposed to different drugs, and based on these gene expression profiles and then pathway expression profiles (PEPs), it first matches the query gene to its pathway and then to its potential candidate drug. This analysis predicted 298 unique drugs to correspond to the 231 significantly associated neuroticism genes. The top-scoring drug was found to be desipramine, which is a tricyclic antidepressant. Some of the other drugs predicted are flupenthixol (anti-psychotic), tetryzoline (α-adrenergic agonist), doxorubicin (anthracycline/chemotherapy) and digitoxigenin (cardenolide). Based on these results, we repeated the drug perturbation analysis with depression-enriched genes. While there were only 51 genes common between the neuroticism and depression gene sets, there was a convergence on drugs in the perturbation analysis. Out of 286 and 298 drugs predicted for depression and neuroticism, respectively, 167 drugs were common to both. The complete list of drugs is presented in Supplementary Tables 29 and 30.
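The degree of convergence between the two predicted drug lists can be put into proportions using only the counts given above. The sketch below is illustrative arithmetic; the Jaccard index is one possible overlap summary and is not a statistic reported in the paper.

```python
n_depression = 286   # drugs predicted for depression genes
n_neuroticism = 298  # drugs predicted for neuroticism genes
n_shared = 167       # drugs common to both lists

n_union = n_depression + n_neuroticism - n_shared  # distinct drugs overall
jaccard = n_shared / n_union                       # shared / distinct

print(n_union)  # 417
print(f"share of neuroticism drugs also predicted for depression: {n_shared / n_neuroticism:.1%}")
print(f"share of depression drugs also predicted for neuroticism: {n_shared / n_depression:.1%}")
print(f"Jaccard index: {jaccard:.2f}")
```

More than half of each list is shared between the two traits.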

After establishing genetic overlap of neuroticism with MDD and anxiety, we carried out an MR analysis to explore the possibility of a causal relationship between genetic risk for neuroticism and MDD or anxiety. The results of the MR analysis using different methods are presented in Table 3 . The results of MR indicate a bidirectional causal effect, with the exposure of MDD on neuroticism outcome showing an inverse variance weighting (IVW) effect value of 0.429 at a significant P value (2.072 × 10 −85 ). The exposure of neuroticism on MDD shows a higher causal effect value of 0.834 with a significant P value (6.413 × 10 −103 ). We performed sensitivity analysis of MR using MRlap, which corrects for different sources of bias, including sample overlap, because there are overlapping participants between the exposure and outcome datasets 26 . With MRlap, we observe similar results with positive significant corrected β values in MRlap performed between MDD and neuroticism in both directions (Supplementary Table 4 ).

We also investigated the causal relationship of neuroticism with anxiety. On performing MR with anxiety exposure on neuroticism, we found a β value of 0.179 ( P  = 1.248 × 10 −15 ) and a corrected β value with MRlap of 0.531 ( P  = 7.781 × 10 −14 ) showing evidence of causality. On reversing the direction, the causal effect was stronger, as seen by a higher β value of 0.70 ( P  = 5.767 × 10 −61 ) with MR and a corrected β value of 0.548 ( P  = 1.129 × 10 −40 ) with MRlap. This suggests that, based on genetic susceptibility, there is stronger evidence for a causal effect of neuroticism on anxiety than for the reverse. GWAS of anxiety and anxiety disorders are still relatively underpowered compared with neuroticism, limiting the number of genetic instruments available for testing as exposures.

We investigated the causal effect of agreeableness on MDD and anxiety and vice versa. In the case of MR of MDD exposure on agreeableness outcome, a β value of −0.284 ( P  = 5.775 × 10 −13 ) was observed indicating negative causal effect of MDD on agreeableness (Table 3 and Supplementary Table 4 ). The causal effect is bidirectional with similar values observed in the opposite direction as well. The results are consistent with genetic correlation findings where negative correlation was observed between agreeableness and MDD. MR analysis of agreeableness and anxiety also indicated bidirectional causal effect. However, here both the traits have limited instruments available.

Out-sample polygenic risk score prediction

We conducted polygenic prediction analysis to validate our findings using the Yale–Penn cohort 27 , which had NEO Personality Inventory (NEO PI-R) scores and genotype information available for 4,532 EUR individuals, and used those data to predict PRS for each of the big five personality traits ( Methods ). We found modest but significant r 2 values in line with previous reports for all personality traits 14 : neuroticism of 2%, extraversion of 2%, openness of 2%, agreeableness of 3% and conscientiousness of 1%.

We conducted a GWAS meta-analysis study of each of the ‘big five’ personality traits in a sample size of up to 682,688 participants. We combined original GWAS results from the MVP (available for all five traits) with summary statistics from the UKB (neuroticism only) and GPC (all traits except neuroticism) cohorts to perform a well-powered meta-analysis for EUR GWAS in each trait. We identified 468 independent significant SNP associations mapping to 208 independent genomic loci, of which one-third are novel. We identified 231 significant gene associations with neuroticism in the gene-based analysis. The current study was also successful in identifying 23 significant genomic locus associations for the four other personality traits studied, for which prior knowledge in the literature was very limited. In AFR, we found lower heritabilities for neuroticism and extraversion and no significant results for conscientiousness and openness. We identified two GWS variants for agreeableness in AFR. This is probably a reflection of low power and underlines the critical need to increase recruitment in underrepresented groups. Our work provides new data to inform the underlying genetic architecture of personality traits.

Neuroticism, the trait with the largest available sample size in this study, is characterized by emotional instability, increased anxiousness and low resilience to stressful events. As such, it has been the focus of previous efforts in GWAS. As seen previously, neuroticism overlaps substantially with psychopathology, where it is usually viewed as a precursor or risk factor for depressive and anxiety symptoms. Extraversion had the second largest sample size and had the highest SNP-based heritability in the MVP. In our data, scoring high on extraversion was genetically correlated with risk-taking behaviours and had the second strongest negative genetic correlation with neuroticism. Agreeableness items assay how someone relates to other people, that is, how trusting one is or how likely one is to find fault in others. This trait was the most negatively correlated with neuroticism and irritability as well as MDD, anxiety and manic symptoms. Conscientiousness items relate to discipline and thoroughness, with specific questions being ‘are you lazy’ and ‘does a thorough job’. This trait was most closely associated with ‘types of physical activity in last 4 weeks: heavy do-it-yourself (DIY)’. Finally, openness items in the 10-item Big Five Inventory (BFI-10) assay imagination and artistic interest. Openness was positively associated with extraversion and risk taking in our data. Educational attainment was positively correlated with openness and negatively associated with neuroticism, while the other three personality traits showed essentially no such overlap (Fig. 5 ). Since these are self-reported items, they naturally reflect one’s own assessment of one’s personality traits, which might filter actual traits and behaviour through a lens of how one wishes to appear or be perceived.

Using these GWAS summary statistics, with excellent power for neuroticism and moderate power for the other traits, we investigated the heritability of the different personality traits and studied genetic correlations among them using LDSC. SNP-based heritability for all five personality traits in EUR was statistically significant. Out of all the personality pairs studied, the strongest relationship was a negative genetic correlation observed between neuroticism and agreeableness ( r G  = −0.51, Fig. 1b ). Examining the genetic correlations of the five personality traits with 1,437 external traits including depression (neuroticism r G  = 0.68 and agreeableness r G  = −0.35), manic behaviour (neuroticism r G  = 0.44 and agreeableness r G  = −0.35), anxiety (neuroticism r G  = 0.80 and agreeableness r G  = −0.33) and irritability (neuroticism r G  = 0.70 and agreeableness r G  = −0.62) further reflected a pattern of opposing relationships between these traits (Fig. 5 and Supplementary Tables 16 – 20 ). We also calculated local genetic correlations between personality pairs using LAVA, which helped in identifying the genomic regions playing roles in differential overlap in the genetic architecture of personality. This analysis identified several regions where the effect direction differed from the whole genome genetic correlation.

The MVP, our discovery dataset, is one of the world’s largest biobanks and is a valuable resource for genetic studies. Some previously published personality trait studies had a significant contribution from UKB data. It is important to quantify the heterogeneity in these independent cohorts and the different definitions of personality phenotype within each. We investigated the genetic correlation between traits defined on the basis of different inventories (BFI-10, EPQ-RS and NEO-FFI) of personality ascertainment with different cohorts, namely MVP, UKB (part of Nagel et al. study) and GPC, respectively. For neuroticism, Nagel et al. and MVP studies showed a high r G value of 0.80, making these two independent cohorts suitable for meta-analysis (Supplementary Table 1 ). Similarly, for extraversion, NEO-FFI and two-item inventories showed a high r G of 0.89 in the extraversion data of GPC and MVP studies. For agreeableness, openness and conscientiousness, the r G s between the MVP and GPC cohorts were lower (0.63–0.72); this may be due to the small size of the GPC dataset for these traits and the correspondingly large standard errors around the point estimate. The point estimate is not necessarily biased in any particular direction; we mean only that it is uncertain. This limitation will be addressed by future GPC studies with larger sample sizes. No novel loci were identified in the meta-analysis with GPC for these traits.

TWAS revealed common genes with changes in gene expression but with opposite direction of effect for some personality traits. A study by Ward et al. in 2020 reported five of these genes (Supplementary Table 3 ) as eQTLs showing significant associations with mood instability 28 . This is further supported by the local genetic correlation studies (Supplementary Sheet 5 ) where we found genomic region 45883902-47516224 on chromosome 17, which harbours genes KANSL1-AS1 , MAPT and MAPT-IT1 , showing negative local genetic correlation between neuroticism and extraversion with a ρ value of −0.57 and r 2 value of 0.32.

rs1876829 , which maps to CRHR-Intronic Transcript 1, emerged as the lead SNP ( P  = 7.872 × 10 −39 ) for neuroticism in the GWAS analysis. We also found multiple eQTL SNPs in this genomic region ( rs8072451 , rs17689471 , rs173365 and rs11012 ) for the CRHR1 gene to be significantly associated ( P value ranging from 1 × 10 −5 to 1 × 10 −37 ). The TWAS analysis showed significant association of this gene with neuroticism in nervous system tissues including caudate basal ganglia, frontal cortex, hippocampus and spinal cord cervical region. CRHR1 encodes the receptor of the corticotropin-releasing hormone family, which are major regulators of the hypothalamic–pituitary–adrenal pathway 29 . Genetic variation in the corticotropin-releasing hormone system has been linked to several psychiatric illnesses 30 . Another study reported hypermethylation at a corticotropin-releasing hormone-associated CpG site, cg19035496, in individuals with a high general psychiatric risk score for disorders such as depression, anxiety, post-traumatic stress disorder and obsessive compulsive disorder 31 . Further, a study by Gelernter et al. found that CRHR1 was significantly associated with re-experiencing post-traumatic stress disorder symptoms 32 and also with maximum habitual alcohol intake 33 . This gene is also involved in hippocampal neurogenesis 30 , while reduced hippocampal activation is associated with elevated neuroticism 34 . This makes CRHR1 a good lead candidate to be followed in future studies to understand the molecular processes impacted by genetic variation underlying a range of psychiatric traits including neuroticism.

While gene expression associations give a wide array of information on the involvement of different genes regulating the different biological processes underlying the biology of traits, searching protein expression associations confers several advantages, as proteins are the final implementers in the functioning of all cells for many biological processes. Through PWAS studies, we found 47 proteins showing significant association with neuroticism in the dorsolateral prefrontal cortex. The PWAS analysis also identified leucine-rich repeat and fibronectin type III domain-containing 5 (LRFN5) protein association with neuroticism, and this protein is also involved in synapse formation. This protein has shown higher levels in patients with MDD and has been suggested as a potential MDD biomarker 35 .

Examples of genes for which we found converging transcript- and protein-level evidence of association with neuroticism include low-density lipoprotein receptor-related protein 4 (LRP4), syntaxin 4 (STX4) and metabolism of cobalamin associated B (MMAB) (Supplementary Table 31 ). LRP4 has diverse roles in neuromuscular junctions and in disorders of the nervous system, including Alzheimer’s disease and amyotrophic lateral sclerosis 36 ; STX4 is implicated in synaptic growth and plasticity 37 ; and MMAB catalyses the final step in the conversion of cobalamin (vitamin B12) into adenosylcobalamin (biologically active coenzyme B12). All of these have broad implications for brain function, including those in relation to methylmalonic acidaemia 38 . Low levels of plasma vitamin B12 have been found to be associated with higher depression cases in multiple studies 39 .

We investigated the relationship of these personality traits with other psychiatric traits, cognitive functions and disorders in a broad phenome-wide scan of genetic correlations with 1,437 traits. A total of 325 traits showed significant genetic correlations with at least one of the five personality traits following multiple testing correction. Two important traits that had some of the strongest associations were MDD and anxiety. Whereas the association of neuroticism with depression and anxiety has been previously considered 4 , 13 , our analysis revealed that another personality trait, agreeableness, is also strongly associated with both anxiety and depression but in the opposite direction to neuroticism, showing a potential protective relationship. MR indicated a strong bidirectional causal relationship of neuroticism with anxiety and depression, while showing a bidirectional protective relationship of agreeableness with both traits. Variance explained for neuroticism was attenuated upon conditioning on MDD but remained significant, indicating some independent genetic component for neuroticism despite the strong overlap. A similar but weaker effect was seen when conditioning neuroticism on anxiety, which may be partly due to the lower power of the available anxiety summary statistics. Larger studies of anxiety disorders are needed to better understand this relationship. Conversely, when we conditioned agreeableness on MDD and anxiety, we observed a nominal but non-significant change in SNP-based heritability. We conducted MR to further discern these patterns and it showed bidirectional causal effects with neuroticism, confirming a high degree of inter-relatedness between the traits. Given the high degree of genetic overlap and the expectation that personality trait expression precedes the age of onset of MDD, high trait neuroticism may be considered an early risk factor for anxiety, depressive and related psychopathology. Indeed, studies have shown that persistently elevated neuroticism through adolescence is a risk factor for later susceptibility to anxiety and MDD diagnosis 40 .

Personality phenotyping in the MVP sample was done by self-report using the short BFI-10 inventory. As such, data are relatively sparse compared with more robust instruments and do not have more in-depth features such as the facets found in the NEO inventory. Large biobank studies such as the MVP have a crucial advantage in recruitment and sample size, but this comes at the cost of deep phenotyping. Future studies that compare findings from more deeply phenotyped samples with the sparser phenotyping used by the MVP would be valuable to address this limitation. Additionally, while we greatly expand on the amount of data available for agreeableness, conscientiousness, openness and extraversion, they still lag behind what has been accomplished for neuroticism. This means genetic instruments defined for the other four traits may lack the precision available for neuroticism. Larger samples still need to be collected to better understand these other traits.

Personality traits are known to have complex interactions with other human behaviours. In this work we have conducted comprehensive genomic studies of personality traits. We performed a GWAS in the MVP sample, the largest and most diverse biobank in the world, in both EUR and AFR to better understand genetic factors underlying personality traits. We combined this information with previously published results in a large meta-analysis, identifying novel genetic associations with five personality traits studied. We identified interactions in a phenome-wide genetic correlation analysis, finding novel relationships between complex traits. We used in silico analysis techniques to identify genetic overlap and causal relationships with depression and anxiety disorders. We also characterized underlying biology using predicted changes in gene and protein expression, biological pathway enrichment and drug perturbation analysis. These results substantially enhance our knowledge of the genetic basis of personality traits and their relationship to psychopathology.

Inclusion and ethics statement

This research was not restricted or prohibited in the setting of any of the included researchers. All studies were approved by local institutional research boards and ethics review committees. MVP was approved by the Veterans Affairs central institutional research board. We do not believe our results will result in stigmatization, incrimination, discrimination or personal risk to participants.

Cohort and phenotype

We used data release version 4 of the MVP 41 . The BFI-10 was included as part of a self-report Lifestyle survey provided to MVP participants, with two items for each of the personality traits (Supplementary Fig. 3 ). For the MVP EUR participants, the mean age was ~65.5 years for each of the five traits and 8% of the sample was female. For MVP AFR, the mean age was ~60.6 years for each trait while 14.0% of the sample was female.

Genotyping and imputation

Genotyping and imputation of MVP subjects has been described previously 41 , 42 . A customized Affymetrix Axiom Array was used for genotyping. MVP genotype data for biallelic SNPs were imputed using Minimac4 43 and a reference panel from the African Genome Resources panel by the Sanger Institute. Indels and complex variants were imputed independently using the 1000 Genomes phase 3 panel 44 and merged in an approach similar to that employed by the UKB. Ancestry group assignment within the MVP has been previously described 45 . Briefly, designation of broad ancestries was based on genetic assignment with comparison to 1000 Genomes reference panels 44 . Principal components to be used as covariates were generated within each assigned broad ancestral group.

GWAS and meta-analysis

We performed individual GWAS for each of the five personality traits in the MVP cohort 41 . The personality information along with genotype data were available for a total of 270,000 individuals with 240,000 EUR and 30,000 AFR. The GWAS was performed separately for each of the traits in the EUR and AFR datasets and the effect values were computed using linear regression.

MVP GWAS was conducted using linear regression in PLINK 2.0 using the first ten principal components, sex and age as covariates 46 . Variants were excluded if call missingness in the best-guess genotype exceeded 20%. Alleles with minor allele frequency (MAF) <0.1% were excluded. Additionally, only variants with an imputation accuracy of ≥0.6 were retained. After applying all filters, genotype data from 233,204, 235,742, 235,374, 234,880 and 220,015 participants were included for neuroticism, extraversion, agreeableness, conscientiousness and openness, respectively.
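The three variant filters above can be sketched as a simple predicate. This is only an illustration of the stated thresholds, not the PLINK 2.0 commands used in the study; the `Variant` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical per-variant QC summary; field names are illustrative only.
    rsid: str
    missing_rate: float  # fraction of missing best-guess genotype calls
    maf: float           # minor allele frequency
    info: float          # imputation accuracy


def passes_qc(v, max_missing=0.20, min_maf=0.001, min_info=0.6):
    """Keep a variant only if it satisfies all three thresholds from the text."""
    return (
        v.missing_rate <= max_missing  # exclude if missingness exceeds 20%
        and v.maf >= min_maf           # exclude if MAF < 0.1%
        and v.info >= min_info         # retain only imputation accuracy >= 0.6
    )
```

The defaults mirror the cut-offs reported here; a different study would pass its own values.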

For meta-analysis, summary statistics generated in this study (referred to as MVP study) were combined using METAL 17 with those from Nagel et al. and GPC phase I and II studies (Fig. 1a ) based on the availability of data for respective traits. The z -scores of variants provided in the summary statistics were converted into β scores 47 . The inverse variance weighting scheme of METAL was applied to weight the effect sizes of SNPs from the different source studies. For neuroticism, summary statistics from MVP and Nagel et al. studies 13 (excluding 23andMe) were combined, increasing the total sample size to 682,688. For extraversion, summary statistics from MVP and GPC phase II study 10 were combined, while summary statistics from MVP and GPC phase I study 8 were combined for the respective meta-analysis of agreeableness, openness and conscientiousness. GPC data were already included in the neuroticism meta-analysis of Nagel et al.
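As a rough sketch of the two steps described above, the following code converts a z-score into an approximate effect size and then combines per-study effects by fixed-effect inverse-variance weighting. It is not METAL itself. The z-to-β formula shown is one commonly used approximation for a standardized phenotype and is an assumption here; the exact conversion used in the study is the one given in ref. 47. Function names are hypothetical.

```python
import math


def z_to_beta(z, maf, n):
    """Approximate beta and SE from a z-score, allele frequency and sample size.

    ASSUMPTION: standardized-phenotype approximation,
    beta = z / sqrt(2 * p * (1 - p) * (n + z^2)); not confirmed as the
    study's exact formula.
    """
    denom = math.sqrt(2.0 * maf * (1.0 - maf) * (n + z * z))
    return z / denom, 1.0 / denom


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted estimate across studies."""
    weights = [1.0 / (se * se) for se in ses]  # weight = 1 / SE^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se  # pooled beta, pooled SE, pooled z
```

With equal standard errors the pooled effect is the simple mean of the study effects; a study with a smaller standard error pulls the pooled estimate toward its own value.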

The independent GWS loci for each of the personality traits were identified by clumping all SNPs using PLINK v1.9 software 48 . A P value cut-off of 5 × 10 −8 , MAF >0.0001, distance cut-off of 1 MB and r 2  < 0.1 were used to define the lead SNPs using the 1000 Genomes phase 3 European reference panel 44 . Genes were mapped to the identified lead SNPs using the biomaRt package in R 49 . The same parameters were used to define novel independent loci for comparison from the Nagel et al. 13 and Becker et al. 14 summary statistics (excluding 23andMe).

Trans-ancestry analysis for each of the five personality phenotypes was performed by combining their respective summary statistics from AFR and EUR analyses using METAL 17 . As with the EUR-only meta-analysis, the inverse variance weighting scheme of METAL was applied to weight the effect sizes of SNPs from the two ancestries. We identified independent SNPs in the same manner as described above for the ancestry-stratified GWAS.

LDSC and SNP heritability

LDSC was performed based on the linkage disequilibrium reference from the 1000 Genomes data for all EUR cohorts and SNP heritability for each of the five personality traits was calculated 50 . To investigate the relation among the different personality traits, the LDSC-based correlation was also calculated between each pair of traits 51 . LDSC was also used to calculate genetic correlation of the personality traits with multiple other phenotypes (1,437 traits) with the Complex Traits Genetics Virtual Lab webtool 22 . A P value cut-off of 6.9 × 10 −6 (0.05/(1437 × 5)) was applied to filter the significant correlating pair of traits after multiple test correction.
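The multiple-testing threshold quoted above follows directly from the number of tests. A minimal check of the arithmetic (variable names are illustrative):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


n_external_traits = 1437
n_personality_traits = 5

# 0.05 / (1437 x 5) = 0.05 / 7185, roughly 6.96e-6,
# reported in the text as 6.9 x 10^-6.
threshold = bonferroni_threshold(0.05, n_external_traits * n_personality_traits)
```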

For MVP AFR, linkage disequilibrium scores were computed from the approximately 123,000 AFR individuals’ genotype data in the MVP cohort using covariant LDSC software 21 . This linkage disequilibrium reference panel was then utilized to calculate SNP heritability in the MVP AFR cohort using LDSC.

We used LAVA 23 to calculate local heritability for the five personality traits and local genetic correlations for each pair. The genome was divided into 2,495 genomic chunks/loci to attain minimum linkage disequilibrium between them and maintain an approximate equal size of around 1 MB. The local heritability of each of the five personality traits was calculated for each of the 2,495 loci. For a given personality trait pair, local genetic correlations were calculated only for pairs that had significant local heritability (Bonferroni-corrected P value at 5% false discovery rate (FDR)) for both traits of the pair. Bonferroni multiple testing correction was also applied to genetic correlated P value to consider significant correlated pairs.

FUSION software 18 was used to perform TWAS. FUSION first estimates the SNP heritability of steady-state gene expression and uses the nominally significant ( P  < 0.05) genes for training the predictive models. The predictive model with significant out-of-sample R 2 (>0.01) and nominal P  < 0.05 in the five-fold cross-validation was then used for the predictions in the GWAS data. The process was performed for all five personality EUR GWAS datasets with 10,386 unique genes spanning the 13 selected tissues. The expression weight panels for the 13 a priori selected tissues were taken from GTEx v8 19 . We selected the different available brain tissues and whole blood as the tissues of interest, and applied Bonferroni correction at α = 0.05 across the 10,386 genes tested in the 13 tissues to find the genes with significant hits ( P  < 3.703 × 10 −7 , that is, 0.05/(10,386 × 13)).

We performed PWAS to test the association between genetically regulated protein expression and different personality traits individually using FUSION software 18 . The weights for genetic effect on protein expression for the PWAS were from the Wingo et al. study 52 . In the PWAS, we integrated the protein weights with the summary statistics from the GWAS of each of the personality traits, respectively. Next, to decrease the probability of linkage contributing to the significant association in the PWAS, we performed colocalization analysis using COLOC 53 . In COLOC, we determined if the genetic variants that regulate protein expression colocalize with the GWAS variants for the personality trait. Significant proteins in the PWAS that also have COLOC posterior probability of hypothesis 4 (PP4) >50% have a higher probability of being consistent with a causal role in the personality trait of interest.

Fine-mapping

To identify likely causal variants, we performed variant fine-mapping using Polyfun software 24 . Since the fine-mapping was performed on the same EUR data, SNP-specific prior causal probabilities were taken directly from the pre-computed causal probabilities of 19 million imputed UKB SNPs with MAF >0.01 based on an analysis of 15 UKB traits. The fine-mapping was performed on the GWAS summary statistics for each of the five personality traits. SuSiE 54 was used to map the posterior causal probabilities of the SNPs. SNPs with a posterior inclusion probability (PIP) >0.95 were considered significant for neuroticism, while a more relaxed cut-off of PIP >0.80 was used for the other four personality traits to avoid loss of causal variant information due to the relatively lower power of their respective datasets.

Conditional analysis was performed to investigate the possible mediating effects between depression or anxiety and neuroticism or agreeableness. Neuroticism meta-data GWAS summary statistics were used and conditioned on MDD and anxiety in individual runs. The MDD summary statistics were from Levey et al. study 55 and include a meta-analysis from the MVP, UKB, PGC and FinnGen. The anxiety summary statistics were taken from Levey et al. study 42 . With depression/anxiety studies as covariate traits, the conditional analysis of neuroticism (target trait) was carried out using multi-trait-based conditional and joint analysis utility of genome-wide complex trait analysis 56 . Similarly, the same method was used to perform conditional analysis of agreeableness on MDD and anxiety.

FUMA was used to carry out the MAGMA-based gene-association tests to find significantly associated genes for a trait from its GWAS summary data 15 . Drugs were searched for both neuroticism and MDD individually using their respective significantly associated genes, derived from the neuroticism meta-analysis summary statistics and the MDD GWAS summary statistics from Levey et al. To predict drug candidates for a given trait, significant genes associated with neuroticism/depression were given as input to the gene2drug R package 25 . Pre-computed Pathway Expression Profiles of the Connectivity Map data were taken from the Drug Set Enrichment Analysis (DSEA) website. For each query gene, a maximum of five drugs was predicted. Further, drugs showing an E score >0.5 and a P value less than 1 × 10 −6 were considered significant. The process was repeated for MDD.

MR was performed to study the causal relationship between four pairs of traits: neuroticism and MDD, neuroticism and anxiety, agreeableness and MDD, and agreeableness and anxiety. These traits had the highest genetic correlation. The summary statistics described previously for conditional analysis for all four traits were used for carrying out MR analysis as well. The TwoSampleMR package was used to perform the MR 57 . For each pair of traits, TwoSampleMR was run twice to see the effect of exposure of each of the two traits on the outcome of the other. After harmonizing the exposure and outcome instrument sets, clumping of SNPs (distance of 500 kb, r 2  = 0.05) was performed before conducting the MR analysis. Because some of our samples included in the analysis of personality overlap with our outcomes and exposures of interest, and two-sample MR is not robust to sample overlap, we also performed a sensitivity analysis for each trait pair using the MRlap package 26 . MRlap is specifically designed to account for violations of several assumptions of MR, including sample overlap. It first calculates observed MR-based effect values and then a corrected effect value by using the genetic covariance calculated by LDSC.
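To illustrate the inverse-variance-weighted (IVW) estimate reported in the Results, the following sketch combines per-SNP effects on the exposure and the outcome. It is a simplified fixed-effect version with first-order weights, which ignore uncertainty in the SNP-exposure effects. It is not the TwoSampleMR or MRlap implementation, and the function name and inputs are hypothetical.

```python
import math


def mr_ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from harmonized, clumped instruments.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    """
    num = 0.0
    den = 0.0
    for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome):
        w = 1.0 / (sy * sy)
        num += w * bx * by
        den += w * bx * bx
    estimate = num / den
    se = math.sqrt(1.0 / den)
    return estimate, se
```

Running the function twice with the roles of the two traits swapped corresponds to the bidirectional design described above, with a separate instrument set for each direction.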

The Yale–Penn cohort includes participants recruited from sites in the eastern United States 58 . A total of 11,705 participants completed the 240-item revised NEO PI-R, which assesses the domains of the five-factor model of personality: neuroticism, extraversion, openness to experience, agreeableness and conscientiousness 59 . Each domain has six facets. For example, the facets of neuroticism are anxiety, angry hostility, depression, self-consciousness, impulsiveness and vulnerability. Each item is rated on a five-point scale. Of the Yale–Penn participants with a NEO score, 4,582 were assigned to the broadly defined EUR group using the same methods as in the MVP sample and were unrelated. We used PRS-CS, Python software that uses Bayesian regression and continuous shrinkage priors, to calculate posterior effect sizes per SNP 60 . The 1000 Genomes linkage disequilibrium reference panel was used. The training datasets were summary statistics from the EUR meta-analysis for each of the five personality factors. The target dataset was a PLINK-formatted binary file set containing genotype information from the Yale–Penn participants 48 . Once a score per SNP had been generated by PRS-CS, PLINK was used to generate a score for each individual by summing SNP effects 48 . The lm (linear model) function in R was used to regress NEO PI-R scores on PRS, using age, sex and the first ten within-ancestry principal components as covariates 61 .
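The per-individual scoring step amounts to a weighted sum of allele dosages. A minimal sketch follows, assuming dosages and PRS-CS posterior effects are held in plain dictionaries keyed by SNP identifier. The function name, data layout and identifiers are hypothetical, and the study itself used PLINK for this step.

```python
def polygenic_score(dosages, weights):
    """Sum of effect-allele dosage (0 to 2) times per-SNP posterior effect size.

    SNPs with a weight but no genotype for this individual are skipped.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


# Hypothetical identifiers and values, for illustration only.
example_weights = {"rsA": 0.10, "rsB": -0.20}
example_dosages = {"rsA": 2, "rsB": 1, "rsC": 0}
example_score = polygenic_score(example_dosages, example_weights)
```

The resulting scores would then enter the regression described above as the predictor of interest, alongside age, sex and the principal components.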

Ethics oversight

Research involving MVP in general is approved by the Veterans Affairs Central institutional research board; the current project was also approved by institutional research boards in West Haven, CT.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All MVP summary statistics are made available through dbGAP request at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001672.v11.p1 . Meta-analysis summary statistics are available through the Levey lab website at https://medicine.yale.edu/lab/leveylab/data/ . Meta-analysis data will also be made available via the Complex Trait Genetics Virtual Lab at https://vl.genoma.io/ .

Code availability

No custom code was developed for analyses in this manuscript. All code used is cited and described in the methods. PLINK v1.9 is available at https://www.cog-genomics.org/plink/1.9/ and PLINK v2.0 at https://www.cog-genomics.org/plink/2.0/ . Other software versions: Polyfun version 1.0.0 and SuSiE package version 0.11.92.

John, O. P. & Srivastava, S. in Handbook of Personality: Theory and Research (eds Pervin, L. A. and John, O. P.) (Guilford Press, 1999).

McCrae, R. R. & Costa, P. T. Personality in Adulthood: A Five-factor Theory Perspective (Guilford Press, 2003).

Kendler, K. S. & Myers, J. The genetic and environmental relationship between major depression and the five-factor model of personality. Psychol. Med. 40 , 801–806 (2010).

Hettema, J. M. et al. A population-based twin study of the relationship between neuroticism and internalizing disorders. Am. J. Psychiatry 163 , 857–864 (2006).

Hettema, J. M., Prescott, C. A. & Kendler, K. S. Genetic and environmental sources of covariation between generalized anxiety disorder and neuroticism. Am. J. Psychiatry 161 , 1581–1587 (2004).

Van Os, J. & Jones, P. B. Neuroticism as a risk factor for schizophrenia. Psychol. Med. 31 , 1129–1134 (2001).

Smeland, O. B. et al. Identification of genetic loci shared between schizophrenia and the Big Five personality traits. Sci. Rep. 7 , 2222 (2017).

de Moor, M. H. et al. Meta-analysis of genome-wide association studies for personality. Mol. Psychiatry 17 , 337–349 (2012).

de Moor, M. H. et al. Meta-analysis of genome-wide association studies for neuroticism, and the polygenic association with major depressive disorder. JAMA Psychiatry 72 , 642–650 (2015).

van den Berg, S. M. et al. Meta-analysis of genome-wide association studies for extraversion: findings from the genetics of personality consortium. Behav. Genet. 46 , 170–182 (2016).

Lo, M. T. et al. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders. Nat. Genet. 49 , 152–156 (2017).

Okbay, A. et al. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat. Genet. 48 , 624–633 (2016).

Nagel, M. et al. Meta-analysis of genome-wide association studies for neuroticism in 449,484 individuals identifies novel genetic loci and pathways. Nat. Genet. 50 , 920–927 (2018).

Becker, J. et al. Resource profile and user guide of the Polygenic Index Repository. Nat. Hum. Behav. 5 , 1744–1758 (2021).

Watanabe, K. et al. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8 , 1826 (2017).

de Leeuw, C. A. et al. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11 , e1004219 (2015).

Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26 , 2190–2191 (2010).

Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48 , 245–252 (2016).

GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369 , 1318–1330 (2020).

Ramoz, N. et al. Corticotropin releasing hormone receptor CRHR1 gene is associated with tianeptine antidepressant response in a large sample of outpatients from real-life settings. Transl. Psychiatry 10 , 378 (2020).

Luo, Y. et al. Estimating heritability and its enrichment in tissue-specific gene sets in admixed populations. Hum. Mol. Genet. 30 , 1521–1534 (2021).

Cuéllar-Partida, G. et al. Complex-Traits Genetics Virtual Lab: a community-driven web platform for post-GWAS analyses. Preprint at bioRxiv https://doi.org/10.1101/518027 (2019).

Werme, J. et al. An integrated framework for local genetic correlation analysis. Nat. Genet. 54 , 274–282 (2022).

Weissbrod, O. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat. Genet. 52 , 1355–1363 (2020).

Napolitano, F. et al. gene2drug: a computational tool for pathway-based rational drug repositioning. Bioinformatics 34 , 1498–1505 (2018).

Mounier, N. & Kutalik, Z. Bias correction for inverse variance weighting Mendelian randomization. Genet. Epidemiol. 47 , 314–331 (2023).

Gelernter, J. et al. Genome-wide association study of alcohol dependence: significant findings in African- and European-Americans including novel risk loci. Mol. Psychiatry 19 , 41–49 (2014).

Ward, J. et al. The genomic basis of mood instability: identification of 46 loci in 363,705 UK Biobank participants, genetic correlation with psychiatric disorders, and association with gene expression and function. Mol. Psychiatry 25 , 3091–3099 (2020).

Herman, J. P. et al. Regulation of the hypothalamic–pituitary–adrenocortical stress response. Compr. Physiol. 6 , 603–621 (2016).

Koutmani, Y. et al. CRH promotes the neurogenic activity of neural stem cells in the adult hippocampus. Cell Rep. 29 , 932–945 e7 (2019).

Jokinen, J. et al. Epigenetic changes in the CRH gene are related to severity of suicide attempt and a general psychiatric risk score in adolescents. EBioMedicine 27 , 123–133 (2018).

Gelernter, J. et al. Genome-wide association study of post-traumatic stress disorder reexperiencing symptoms in >165,000 US veterans. Nat. Neurosci. 22 , 1394–1401 (2019).

Gelernter, J. et al. Genome-wide association study of maximum habitual alcohol intake in >140,000 U.S. European and African American veterans yields novel risk loci. Biol. Psychiatry 86 , 365–376 (2019).

Magal, N., Hendler, T. & Admon, R. Is neuroticism really bad for you? Dynamics in personality and limbic reactivity prior to, during and following real-life combat stress. Neurobiol. Stress 15 , 100361 (2021).

Xu, K. et al. LRFN5 and OLFM4 as novel potential biomarkers for major depressive disorder: a pilot study. Transl. Psychiatry 13 , 188 (2023).

DePew, A. T. & Mosca, T. J. Conservation and innovation: versatile roles for LRP4 in nervous system development. J. Dev. Biol. 9 , 9 (2021).

Harris, K. P. et al. The postsynaptic t-SNARE Syntaxin 4 controls traffic of Neuroligin 1 and Synaptotagmin 4 to regulate retrograde signaling. eLife 5 , e13881 (2016).

Chen, T. et al. Methylmalonic acidemia: neurodevelopment and neuroimaging. Front. Neurosci. 17 , 1110942 (2023).

Sangle, P. et al. Vitamin B12 supplementation: preventing onset and improving prognosis of depression. Cureus 12 , e11169 (2020).

Aldinger, M. et al. Neuroticism developmental courses—implications for depression, anxiety and everyday emotional experience; a prospective study from adolescence to young adulthood. BMC Psychiatry 14 , 210 (2014).

Gaziano, J. M. et al. Million Veteran Program: a mega-biobank to study genetic influences on health and disease. J. Clin. Epidemiol. 70 , 214–223 (2016).

Levey, D. F. et al. Reproducible genetic risk loci for anxiety: results from approximately 200,000 participants in the Million Veteran Program. Am. J. Psychiatry 177 , 223–232 (2020).

Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48 , 1284–1287 (2016).

1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526 , 68–74 (2015).

Zhou, H. et al. Multi-ancestry study of the genetics of problematic alcohol use in over 1 million individuals. Nat. Med. 29 , 3184–3192 (2023).

Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4 , 7 (2015).

Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48 , 481–487 (2016).

Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81 , 559–575 (2007).

Durinck, S. et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 21 , 3439–3440 (2005).

Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47 , 291–295 (2015).

Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47 , 1236–1241 (2015).

Wingo, T. S. et al. Shared mechanisms across the major psychiatric and neurodegenerative diseases. Nat. Commun. 13 , 4314 (2022).

Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10 , e1004383 (2014).

Wang, G. et al. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. Ser. B 82 , 1273–1300 (2020).

Levey, D. F. et al. Bi-ancestral depression GWAS in the Million Veteran Program and meta-analysis in >1.2 million individuals highlight new therapeutic directions. Nat. Neurosci. 24 , 954–963 (2021).

Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 9 , 224 (2018).

Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7 , e34408 (2018).

Sherva, R. et al. Genome-wide association study of cannabis dependence severity, novel risk variants, and shared genetic risks. JAMA Psychiatry 73 , 472–480 (2016).

Costa, P. T. Jr. & McCrae, R. R. in The SAGE Handbook of Personality Theory and Assessment Vol. 2 (eds Boyle, G. J., Matthews, G. & Saklofske, D. H.) 179–198 (Sage Publications, 2008).

Ge, T. et al. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat. Commun. 10 , 1776 (2019).

R: A Language and Environment for Statistical Computing (R Core Team, 2021).

Acknowledgements

This research is based on data from the Million Veteran Program, Office of Research and Development, Veterans Health Administration, and was supported by award no. 5IK2BX005058. This publication does not represent the views of the Department of Veterans Affairs or the United States Government. A.W. was supported by a BLRD CDT award from the US Department of Veterans Affairs no. 1IK4BX005219 and grant I01 BX005686. A.W. and T.W. were supported by R01 grant no. AG072120. J.G. was supported by US Department of Veterans Affairs grant 5I01CX001849-04 and NIH grants R01DA037974 and R01DA058862. H.K. was supported by US Department of Veterans Affairs grant I01 BX004820 and the VISN4 Mental Illness Research, Education and Clinical Center of the Crescenz VAMC. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. Detailed MVP Core team acknowledgements are included in the supplement.

Author information

Authors and affiliations.

Division of Human Genetics, Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA

Priya Gupta, Marco Galimberti, Sarah Beck, Keyrun Adhikari, Joel Gelernter & Daniel F. Levey

Department of Psychiatry, Veterans Affairs Connecticut Healthcare Center, West Haven, CT, USA

Department of Neurology and Human Genetics, Emory University School of Medicine, Atlanta, GA, USA

Yue Liu & Thomas Wingo

Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA

Aliza Wingo

Atlanta Veterans Affairs Medical Center, Atlanta, GA, USA

Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA

Henry R. Kranzler

Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, USA

Murray B. Stein

Departments of Psychiatry, School of Medicine, and Herbert Wertheim School of Public Health, University of California San Diego, La Jolla, CA, USA

VA Million Veteran Program

  • Priya Gupta
  • Marco Galimberti
  • Sarah Beck
  • Henry R. Kranzler
  • Murray B. Stein
  • Joel Gelernter
  • Daniel F. Levey

Contributions

D.F.L. and P.G. designed the study. P.G. and D.F.L. drafted the manuscript. J.G. and M.B.S. provided ongoing feedback and refinement of the analytical plan, as well as early feedback on the drafted manuscript. D.F.L. and P.G. conducted GWAS on included cohorts. D.F.L. and P.G. discussed, created and refined the phenotype in the MVP. P.G. and M.G. discussed and refined MVP analytic plans. P.G. and Y.L. conducted TWAS and PWAS analysis with guidance from A.W., T.W. and D.F.L. S.B. conducted out-of-sample PRS analysis in the Yale–Penn cohort with guidance from J.G. and H.R.K. P.G., D.F.L., M.G., S.B., Y.L., A.W., T.W. and K.A. conducted original analyses. D.F.L., T.W. and A.W. supervised original analyses. All authors critically evaluated and revised the manuscript.

Corresponding author

Correspondence to Daniel F. Levey.

Ethics declarations

Competing interests.

H.R.K. is a member of advisory boards for Dicerna Pharmaceuticals, Sophrosyne Pharmaceuticals, Clearmind Medicine and Enthion Pharmaceuticals; a consultant to Sobrera Pharmaceuticals; the recipient of research funding and medication supplies for an investigator-initiated study from Alkermes; and a member of the American Society of Clinical Psychopharmacology’s Alcohol Clinical Trials Initiative, which was supported in the past 3 years by Alkermes, Dicerna, Ethypharm, Lundbeck, Mitsubishi and Otsuka. J.G. and H.R.K. are holders of US patent 10,900,082 titled: ‘Genotype-guided dosing of opioid agonists’, issued 26 January 2021. J.G. is paid for editorial work on the journal Complex Psychiatry. The remaining authors declare no competing interests. J.G. is named as an inventor on PCT patent application no. 15/878,640 entitled ‘Genotype-guided dosing of opioid agonists’, filed 24 January 2018, and issued on 26 January 2021, as US patent no. 10900082. M.B.S. has stock options in Oxeia Biopharmaceuticals and EpiVario. He has been paid for his editorial work on Depression and Anxiety (Editor-in-Chief), Biological Psychiatry (Deputy Editor) and UpToDate (Co-Editor-in-Chief for Psychiatry). No other authors report competing interests.

Peer review

Peer review information.

Nature Human Behaviour thanks Robert Krueger, Aysu Okbay and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information.

Supplementary Figs. 1–3, Tables 1–4 and VA Million Veteran Program core acknowledgement.

Reporting Summary

Supplementary tables 1–31.

Thirty-one supplementary tables.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Gupta, P., Galimberti, M., Liu, Y. et al. A genome-wide investigation into the underlying genetic architecture of personality traits and overlap with psychopathology. Nat Hum Behav (2024). https://doi.org/10.1038/s41562-024-01951-3

Received: 16 January 2024

Accepted: 09 July 2024

Published: 12 August 2024

DOI: https://doi.org/10.1038/s41562-024-01951-3

Plant Physiol. v.190(4); 2022 Dec

Mendel: From genes to genome

Frances C Sussmilch

Discipline of Biological Sciences, School of Natural Sciences, University of Tasmania, Sandy Bay, Tasmania 7005, Australia

John J Ross

James B Reid

Two hundred years after the birth of Gregor Mendel, it is an appropriate time to reflect on recent developments in the discipline of genetics, particularly advances relating to the prescient friar’s model species, the garden pea ( Pisum sativum L.). Mendel’s study of seven characteristics established the laws of segregation and independent assortment. The genes underlying four of Mendel’s loci ( A , LE , I , and R ) have been characterized at the molecular level for over a decade. However, the three remaining genes, influencing pod color ( GP ), pod form ( V/P ), and the position of flowers ( FA/FAS ), have remained elusive for a variety of reasons, including a lack of detail regarding the loci with which Mendel worked. Here, we discuss potential candidate genes for these characteristics, in light of recent advances in the genetic resources for pea. These advances, including the pea genome sequence and reverse-genetics techniques, have revitalized pea as an excellent model species for physiological–genetic studies. We also discuss the issues that have been raised with Mendel’s results, such as the recent controversy regarding the discrete nature of the characters that Mendel chose and the perceived overly-good fit of his segregations to his hypotheses. We also consider the relevance of these controversies to his lasting contribution. Finally, we discuss the use of Mendel’s classical results to teach and enthuse future generations of geneticists, not only regarding the core principles of the discipline, but also its history and the role of hypothesis testing.

The molecular nature of Mendel’s genes, the genetic resources available for pea, pea as a model for teaching and research, and recent controversies about Mendel’s data are discussed.

Introduction

This year marks the 200th anniversary of Gregor Mendel’s birth—July 20, 1822. It is, therefore, timely to reflect on and revisit recent advances in genetics research using his model species, the garden pea ( Pisum sativum ). In 2011, two reviews discussed progress toward the molecular characterization of Mendel’s seven genes ( Ellis et al., 2011 ; Reid and Ross, 2011 ). Now we report on the intervening 11 years, a period that has seen exciting developments relating to Mendel’s work and to pea genetics in general. Recent breakthroughs have reinvigorated the early tradition of pea as a model species for physiological–genetic studies (from the 1950s to the 1970s), which stemmed originally from the fact that some of Mendel’s genes control traits of paramount physiological and agronomic importance. An early example was Barber et al. (1958) , who studied the effects of mutations in Mendel’s LE gene on flowering and internode length in pea, using genetics to further understand the physiology of these traits. Key discoveries in pea have been made recently on phenomena such as the regulation of flowering, nodulation, hormone biosynthesis and signaling, branching, and starch production (e.g. Barbier et al., 2019 ; Meitzel et al., 2021 ; Velandia et al., 2022 ; Williams et al., 2022 ). Pea also continues to attract attention as an important temperate crop, due to its ability to fix nitrogen and the high quality and quantity of protein in its seeds for both human and animal consumption. Further, there is great potential to breed pea varieties for improved yield and phosphorus use efficiency ( Foyer et al., 2016 ; Powers and Thavarajah, 2019 ; Davies and Muehlbauer, 2020 ).

  • Four of Mendel’s seven genes have been characterized and we can now propose candidates for the three remaining genes.
  • There have been major developments in the resources available for pea genetics, including sequence information and reverse genetics techniques.
  • The controversies surrounding Mendel’s data can perhaps now be concluded.
  • Technological advances have reinvigorated pea as a model for physiological genetics.
  • Pea continues to facilitate discoveries in important aspects of plant biology.

Recent papers including those by van Dijk et al. (2022) , Fairbanks (2022) , Nasmyth (2022) , and Berger (2022) discuss issues such as the history behind Mendel’s approach to his experiments, his biographical details, the nature of the questions he investigated, and how his studies relate to modern-day research fields. Here, we update our current understanding of Mendel’s genes, review the molecular progress that has occurred since 2011, and discuss how some of these advances have been used in a physiological context (see “Advances”). Other aspects reviewed here include recent criticisms of Mendel’s data and the use of Mendel’s genes as a foundation for the teaching of genetic principles.

Molecular characterization of four of Mendel’s genes

The characterization of Mendel’s genes at the R , LE , I , and A loci occurred with the development of molecular techniques between 1990 and 2010. These results have been reviewed by Ellis et al. (2011) and Reid and Ross (2011) , and also by Smýkal (2014) , so only a brief summary will be given here. The identification of these genes was aided by the simple fact that in all four cases there was little dispute about which locus was responsible for the Mendelian characteristic concerned. In addition, all four characteristics are important in commercial cultivars of pea, giving these loci higher priority.

Seed shape character—Round ( R ) versus wrinkled ( r )

R was the first of Mendel’s seven genes to be characterized at a molecular level. It was shown to be involved in starch biosynthesis, encoding one of the major isoforms of starch-branching enzyme 1 (PsSBE1; Bhattacharyya et al., 1990). The nature of the r mutation used by Mendel seems clear: it results from a 0.8-kb insertion of a nonautonomous type II transposon. This mutation can explain the complex phenotype of r seeds, which, in addition to the wrinkled phenotype reported by Mendel, includes an altered shape of starch grains in the cotyledons (Gregory, 1903) and an elevated content of the sugars sucrose, fructose, and glucose (Stickland and Wilson, 1983).

Stem length—Tall ( LE ) versus dwarf ( le )

The difference at the LE locus appears to have been used in agriculture for over 500 years (Blixt, 1972), with the short-internode dwarf ( le-1 ) types showing reduced stem elongation and hence reduced lodging and consequent disease susceptibility. The LE gene was shown to encode a gibberellin 3-oxidase (PsGA3ox1) that converts the inactive precursor GA20 to the biologically active GA1 (Lester et al., 1997; Martin et al., 1997). Along with results from a null mutant, le-2 , this confirmed that 3-oxidation is necessary for activation of gibberellins, a key group of plant hormones (Lester et al., 1999). The mutant allele le-1 used by Mendel is caused by a single-base G-to-A mutation that results in an alanine-to-threonine substitution near the active site of the enzyme, reducing, but not eliminating, its activity (Ingram et al., 1984; Lester et al., 1997, 1999). Subsequent research identified the second member of the pea GA 3-oxidase family, which appears to compensate for the lack of PsGA3ox1 ( LE ) activity in the seeds and roots of the le-1 mutant, enabling this mutation to be used in agricultural cultivars (Weston et al., 2008).
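A short Python sketch shows why a single G-to-A change can turn alanine into threonine under the standard genetic code: every alanine codon starts with GC, and every threonine codon starts with AC. The actual codon affected in le-1 is not given here, so the sketch simply checks all four alanine codons. The helper names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def substitute(codon, position, new_base):
    """Return the codon with one base replaced at the given 0-based position."""
    return codon[:position] + new_base + codon[position + 1:]

# All codons that encode alanine ("A" in one-letter amino acid code).
alanine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "A")

# Apply a G-to-A change at the first codon position and translate the result.
mutated = {c: substitute(c, 0, "A") for c in alanine_codons}
products = {c: CODON_TABLE[m] for c, m in mutated.items()}

for codon in alanine_codons:
    print(codon, "->", mutated[codon], "encodes", products[codon])
```

Each of the four mutated codons translates to "T", the one-letter code for threonine.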

Cotyledon color—yellow ( I ) versus green ( i )

The green coloration of the cotyledons observed by Mendel, as opposed to yellow cotyledons in wild-type plants, is due to a mutation in a STAY-GREEN ( SGR ) gene (Armstead et al., 2007; Sato et al., 2007). Similar “stay-green” mutants had previously been described in other species and seemed to be due to a reduction in the breakdown of chlorophyll during dark incubation, a phenotype also present in the leaves of pea i mutants (Armstead et al., 2007; Jiang et al., 2007; Sato et al., 2007; Aubry et al., 2008). Even after molecular characterization of I / SGR (Sato et al., 2007), the actual function of the gene remained unclear (Aubry et al., 2008; Sato et al., 2009), as did the identity of the mutation that Mendel actually used (Sato et al., 2007). Stay-green mutants have since been studied closely in diverse species over the last decade because of their potential to prolong photosynthesis and to increase yield in breeding programs. Findings from these studies indicate that Mendel’s I ( SGR ) gene acts in chlorophyll a degradation as a Mg-dechelatase (Shimoda et al., 2016; Jiao et al., 2020).

Seed coat/flower color—purple ( A ) versus white ( a )

Mendel (1866) noted that colored seed coats (testas), colored flowers, and pigmented leaf axils always occurred together. This indicates that they are pleiotropic characters caused by a single gene. A has since been shown to be a regulatory gene coding for a basic helix–loop–helix transcription factor that controls the spatial expression of the chalcone synthase gene family, essential for general flavonoid biosynthesis ( Statham et al., 1972 ; Harker et al., 1990 ; Hellens et al., 2010) . Mendel probably used an a allele caused by a G-to-A mutation at a splice site that results in a frameshift and premature stop codon ( Hellens et al., 2010) .

Progress on the molecular characterization of Mendel’s three remaining genes

Of the three remaining characteristics described by Mendel, “pod color” has two candidate genes awaiting final confirmation of causality, but “pod form” and “position of flowers” are complicated by a lack of clarity over which one of multiple possible loci associated with these traits was segregating in Mendel’s crosses. In earlier studies, the relative location of these loci was denoted across pea’s seven linkage groups (LGI-VII), based on the frequency of co-inheritance with morphological and/or molecular markers. When Medicago truncatula genome resources first became available ( Cannon et al., 2006 ; Young et al., 2011) , pea geneticists took advantage of the close synteny between Medicago and pea (e.g. Bordat et al., 2011) to identify candidate genes for pea loci of interest. Then, in a major development, Kreplak et al. (2019) published the pea genome itself. This now enables candidate genes for pea loci to be directly identified in the corresponding region of the pea genome ( Figure 1 ), with the caveat that the current assembly represents ∼88% of the full genome and that some sequence scaffolds have not yet been assigned to a chromosome ( Kreplak et al., 2019) .

Figure 1. Mendel’s characterized genes and candidates. Vertical lines represent pea chromosomes, with corresponding linkage groups indicated (Kreplak et al., 2019). Mendel’s loci are indicated on the left-hand side, with characterized genes in blue and potential candidate genes in red on the right-hand side. See Supplemental Table S1 for more details.

Pod color—green ( GP ) versus yellow ( gp )

When Mendel (1866) observed segregation of pod color, he found the allele for green pods dominant over that for yellow (Figure 2A; the opposite of cotyledon color, where green coloration is recessive). Yellow-podded ( gp ) plants have <5% of the chlorophyll levels in the pod mesocarp compared with wild-type plants (Price et al., 1988). The pod color locus ( GP ) was mapped to a region later found to correspond to pea Chromosome 3 in the reference genome assembly (Figure 1 and Supplemental Table S1; Lamprecht, 1948; Kreplak et al., 2019).

Figure 2. Mendel’s pod color and form characters. A, Pod color: green/ GP (left), yellow/ gp (right). B, Pod form: inflated, with sclerenchymatous tissue (left), constricted without sclerenchymatous tissue (right). Photographs provided by Prof. Wojciech Święcicki (A) and Dr. Robert Wiltshire (B).

Recently, Shirasawa et al. (2021) identified two potential candidates for the pod color gene, through whole-genome resequencing of a yellow-podded pea variety (JI128), combined with genome-wide association study (GWAS) and transcriptome analysis-based approaches. They confirmed and narrowed the genetic map position of GP , but their GWAS data indicated association with sequences that have not yet been assigned to chromosomes in the pea genome. They found one gene, predicted to encode a 3′-exoribonuclease (Psat0s4355g0080), to contain a single-nucleotide polymorphism (SNP) that would result in a substitution (threonine to lysine) between green and yellow-podded lines. Shirasawa et al. (2021) also noted that an adjacent gene encoded another 3′ exoribonuclease, with SNPs and indels in the predicted promoter region and higher expression in the yellow-podded line. However, further research is needed to confirm if either of these candidate genes underlies Mendel’s pod color locus. In other species, 3′ exoribonucleases are known for their role in mRNA degradation (e.g. Nguyen et al., 2015 ) including in cell death initiation ( Xi et al., 2009 ); any mechanism for these proteins in modifying chlorophyll content remains to be characterized.

Previously, Ellis et al. (2011) noted the presence of a gene similar to LOWER CELL DENSITY 1 ( LCD1 ) from Arabidopsis ( Arabidopsis thaliana ) in the region of the Medicago genome syntenic to the GP locus on pea LGV, and suggested the corresponding pea gene as a putative candidate for GP . Arabidopsis lcd1 mutant plants show a pale phenotype with reduced chlorophyll content (Barth and Conklin, 2003), comparable to the chlorophyll deficiency seen in pods of pea gp plants. Now that the genomes for green ( GP ) and yellow-podded ( gp ) varieties of pea are available (Kreplak et al., 2019; Shirasawa et al., 2021), we can confirm both that there is a putative ortholog of LCD1 in the corresponding region of pea Chromosome 3 (Figure 1) and that, in the available sequence for the gp line JI128, there is a SNP that would replace a highly conserved, nonpolar isoleucine residue within the transmembrane domain with a polar threonine (see Supplemental Table S1 for sequence details). Given the precedent for LCD1 genes influencing chlorophyll levels, and the presence of a sequence difference between GP and gp that may affect protein function, this gene remains a strong candidate for GP .

Pod form—inflated ( V / P ) versus constricted ( v / p )

Mendel (1866) observed differences in the form of ripe pods with WT “inflated” pods that have a “parchment layer” on the inside of the pod wall comprising lignified cells/sclerenchyma, dominant over the recessive “constricted” form in which this layer is lacking/incomplete. The constricted pods are consequently wrinkled and deeply constricted between the seeds ( Figure 2B ), and are edible while immature, leading to them being known as “sugar” pods. In addition, the absence of a parchment layer is associated with pod indehiscence.

Two complementary loci controlling development of the parchment layer have been described in pea— V and P ( White, 1917 ). It is not clear which of these would have been segregating in Mendel’s population, but early verbal comments suggest V was more likely (see Reid and Ross, 2011 ) and this is supported by a re-analysis of Mendel’s data by Ellis et al. (2019) and Weeden (2016) , on the basis of phenotype. V has been mapped beneath LE on the genetic/linkage map ( Weeden et al., 1993 ), a region that corresponds to the end of pea Chromosome 5 with around 200 annotated genes ( Figure 1 and Supplemental Table S1 ; Kreplak et al., 2019) . This region includes a gene encoding a WRKY transcription factor—a family previously linked to lignification ( Wang et al., 2007 ; Guillaumie et al., 2009 ; Yang et al., 2016 ; Xie et al., 2021 ) and pod indehiscence/seed shattering ( Tang et al., 2013 ; Parker et al., 2020 ). This gene could be investigated further as a candidate for V .
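To make the idea of complementary loci concrete, the following Python sketch enumerates the offspring of a hypothetical cross between two double heterozygotes (Vv Pp). It assumes strictly complementary action (a parchment layer forms only when at least one dominant allele is present at both V and P) and independent assortment of the two loci. The cross itself is an illustration and is not one described above.

```python
from itertools import product

def gametes(genotype):
    """List the gametes for a genotype given as allele pairs, e.g. ["Vv", "Pp"]."""
    return ["".join(alleles) for alleles in product(*genotype)]

def has_parchment(offspring_alleles):
    """Inflated pods need a dominant allele at both loci (complementary action)."""
    return "V" in offspring_alleles and "P" in offspring_alleles

# Hypothetical double-heterozygous parent, selfed.
parent = ["Vv", "Pp"]

# Each offspring is the union of one gamete from each parent (16 combinations).
offspring = [a + b for a in gametes(parent) for b in gametes(parent)]

inflated = sum(has_parchment(o) for o in offspring)
constricted = len(offspring) - inflated
print(f"inflated : constricted = {inflated} : {constricted}")
```

Under these assumptions the expected ratio is 9 inflated to 7 constricted, rather than the 3:1 ratio seen when only one of the two loci is segregating.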

It is harder to match linkage information for P with a physical chromosomal location, as P is rarely included in linkage maps with molecular markers that can be matched to the pea genome. P has been mapped to a region of LGVI/pea Chromosome 1 ( Weeden et al., 1998 ; Bordat et al., 2011 ; Kreplak et al., 2019) that contains genes encoding four WRKY transcription factors (see above), a NAC transcription factor homologous to genes that affect lignification in other legumes ( Wang et al., 2011 ; Dong et al., 2014 ), and six MYB transcription factors homologous to genes that control lignification in other species ( Figure 1 and Supplemental Table S1 ; Yang et al., 2007 ; Zhang et al., 2016 ; Gui et al., 2019 ). Narrowing down the chromosomal location of the P locus would help to reduce this list of potential candidates.

Position of flowers—axial ( FA / FAS ) versus terminal ( fa / fas )

The last of Mendel’s loci remaining to be characterized is the “position of flowers.” Mendel (1866) noted axial positioning of flowers (distributed at nodes along the main stem in the form of a compound raceme) was dominant over terminal positioning of flowers (all grouped together at the top of the stem, in the form of a “false umbel”). This condition is now known as fasciation, and involves the shoot apical meristem becoming elongated outward, perpendicular to normal acropetal stem growth, enabling multiple inflorescences to be borne from the top of the stem. Two loci affecting this characteristic have been described in pea, either of which may have been segregating in Mendel’s experiments: FA and FAS (see discussion by Ellis et al., 2011 ; Reid and Ross, 2011 ).

FA has been mapped to the top of LGIV ( Laucou et al., 1998 ; Bordat et al., 2011) , to a region corresponding to the start of pea Chromosome 4 ( Kreplak et al., 2019) . This region contains genes encoding (i) cullin family members—components of SCF ubiquitin ligase complexes involved in mediating auxin and jasmonic acid responses with mutant phenotypes including aberrant patterns of cell division or fasciation in other species ( Shen et al., 2002 ; Stirnberg et al., 2002 ; Dohmann et al., 2005 )—and (ii) auxin/indole-3-acetic acid (AUX/IAA) family members homologous to those linked to the regulation of meristem boundary domains and inflorescence architecture ( Galli et al., 2015) , which could be investigated further as candidates for FA ( Figure 1 and Supplemental Table S1 ).

FAS has been mapped to LGIII but is not linked to LE ( Sinjushin et al., 2006 ). This region corresponds to part of pea Chromosome 5 that includes a number of genes that are plausible candidates for FAS ( Figure 1 and Supplemental Table S1 ). These include genes encoding transcription factors from the TCP family associated with branching/inflorescence development/floral organ morphogenesis and cell proliferation in other species (e.g. Cubas et al., 1999 ; Koyama et al., 2010 ; Kieffer et al., 2011 ), and the NAC family linked to fasciation in other species ( Weir et al., 2004 ). In addition, a pea homolog of the Arabidopsis inflorescence meristem identity gene TERMINAL FLOWERING 1 — PsTFL1b —is also a potential candidate in this region.

Homologs of the shoot apical meristem maintenance gene CLAVATA1 ( CLV1 ) have previously been suggested as potential candidates for FA and FAS ( Ellis et al., 2011 ). We can now confirm that there are indeed pea members of the CLV1/BARELY ANY MERISTEM (BAM)1/2/3 subfamily of leucine-rich repeat receptor-like protein kinase genes on pea Chromosomes 4 ( PsBAM1 ) and 5 ( PsBAM3 ), which may share redundancy with CLV1 in meristem functioning similar to Arabidopsis BAM homologs ( Nimchuk et al., 2015 ). However, integration of mapping ( Laucou et al., 1998 ; Sinjushin et al., 2006 ) and genome information ( Kreplak et al., 2019) indicates that both of these pea BAM genes fall outside the regions of immediate interest for FA and FAS ( Supplemental Table S1 ).

Progress with molecular tools and resources for pea as a model plant

When we last reflected on progress with characterizing Mendel’s loci 11 years ago ( Reid and Ross, 2011 ), sequence data for pea were limited to some expressed sequence tag databases (see Bordat et al., 2011) , with transcriptome information from next-generation sequencing just becoming available (e.g. Franssen et al., 2011 ). Molecular studies in pea commonly made use of the more comprehensive sequence resources available for other model legumes, including the closely related galegoid legume M. truncatula ( Young et al., 2005 , 2011 ), in addition to Lotus japonicus ( Sato et al., 2008) and soybean ( Glycine max ; Schmutz et al., 2010) . Functional characterization of gene phenotypic effects was often successfully achieved by a forward-genetics, candidate gene approach, with genetic mapping and identification of gene candidates using sequence resources for the closely syntenic Medicago ( Kaló et al., 2004 ; Aubert et al., 2006 ; Bordat et al., 2011) . Platforms for reverse genetics using Targeting-Induced Local Lesions IN Genomes (TILLING) had been developed for pea ( Triques et al., 2007 ; Dalmais et al., 2008) , and were also being successfully adopted in functional studies (e.g. Hofer et al., 2009) .

In the past decade, we have seen first an increase in transcriptome data and improved genetic mapping of pea loci (e.g. Kaur et al., 2012 ; Duarte et al., 2014 ; Alves-Carvalho et al., 2015 ; Tayeh et al., 2015) , then a chromosome-level assembly of the genome ( Kreplak et al., 2019) , and more recently genome sequences for additional pea lines ( Shirasawa et al., 2021 ), paving the way for future pangenomics studies to compare core and variable gene sets between pea cultivars. At the same time, rapid expansion of genomic resources for other species has advanced comparative genomics between diverse legumes (e.g. Varshney et al., 2013 ; Schmutz et al., 2014 ; Griesmann et al., 2018 ; Zhuang et al., 2019 ; Quilbé et al., 2021) . New reverse genetics tools are also being developed, with a recent report of successful gene-editing via optimization of the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system in pea ( Li et al., 2022) . While transformation in pea is still problematic, it is being used extensively for transient expression in roots (e.g. Clemow et al., 2011 ) and even stable expression, including for one of Mendel’s genes, LE ( Reinecke et al., 2013 ) and a gene involved in seed filling ( Meitzel et al., 2021) .

These recent molecular advances have facilitated major physiological advances in pea over the last decade. For example, the reverse genetics approach, TILLING, allowed Tivendale et al. (2012) to confirm that tryptophan is converted to the key plant hormone auxin by just two steps, as in Arabidopsis ( Mashiguchi et al., 2011) . Despite the importance of auxin for plant growth and development, its biosynthetic pathway had remained elusive for decades. Next it was demonstrated, by disrupting auxin biosynthesis in maturing pea seeds, that auxin is required for normal starch synthesis ( McAdam et al., 2017) . This exciting demonstration of the auxin–starch relationship has since been confirmed for the seeds of maize ( Zea mays ; Bernardi et al., 2019 ) and rice ( Oryza sativa ; Zhang et al., 2021 ). Auxin also provides an interesting indirect link between two of the great 19th Century plant biologists, Mendel and Charles Darwin. The discovery of auxin emanated (after many decades) from Darwin’s studies on phototropism ( Darwin and Darwin, 1896 ). Now we know that auxin affects starch biosynthesis, as does Mendel’s R gene. Furthermore, Mendel’s LE gene turned out to be an auxin-regulated gene ( Ross et al., 2000 ; O’Neill et al., 2010 ).

Other examples where pea genes have provided valuable insights into important physiological processes include the identification of the branching hormone, strigolactone ( Gomez-Roldan et al., 2008) , and its associated biosynthetic and perception pathways (see Mashiguchi et al., 2021 ). Pea is also proving to be a key model species for genetic studies on plant–microbe interactions during symbioses such as nodulation and arbuscular mycorrhizae ( Foo et al., 2013 ; Velandia et al., 2022 ), phenomena that do not occur in Arabidopsis.

Mendel’s data

Did Mendel’s characters involve discrete differences?

For Mendel’s research on the seven characters he selected, segregation into just two forms was an essential property. However, on the rediscovery of Mendel’s findings, this binary classification was soon questioned, mainly by Weldon (1902) . Subsequent generations of geneticists (1910–2010) accepted that in F 2 segregations, Mendel’s characteristics are typically binary, as demonstrated by data sets collated in Weeden (2016) . Nevertheless, Weldon’s ideas have been resurrected by Radick (2015) , who again posed the question of “whether Mendel was right to work with just the two categories in the first place.” A photograph of pea seeds presented by Weldon (1902) showing seeds varying from yellow to green in a continuous manner, rather than falling into clear yellow and green categories as described by Mendel (1866) , has been reproduced in support of these claims ( Radick, 2015 ; Arney, 2019 ). When extrapolated to F 2 generations, Weldon (1902) and Radick (2015 , 2022 ) appear to challenge the binary nature of segregations. This is a much more fundamental challenge to Mendelian genetics than simply pointing out, as Weeden (2016) appears to, that for some characteristics there can be a minority of “ambiguous” segregates in between two otherwise distinct and larger groups.

We refute this challenge, by showing here that characters such as cotyledon color can be classified into discrete groups. In our collection of pea genotypes, we have lines with 100% yellow cotyledons or 100% green cotyledons. When these lines are crossed, the F 2 seeds clearly fall into the yellow or green categories, provided that the seed coat is partially removed to expose the cotyledons ( Figure 3A ), as also done by Weldon (1902) . Clear segregation is also apparent in a photo modified by van Dijk et al. (2022) , originally from Darbishire (1911) . Further supporting our view, Mendel’s tall/dwarf difference also segregates cleanly in the F 2 , even if the parents are overlapping in height because of other genetic factors. In a cross between one of the shortest tall lines and one of the largest dwarf lines in our collection, we found the tall ( LE- ) and dwarf ( lele ) plants to be easily recognizable ( Figure 3B ). These examples show that it is possible to obtain pure parental lines that are consistent for Mendel’s characters, and which produce discrete bimodal segregations in the F 2 generation ( Figure 3 ), even when the parental phenotypes overlap. Therefore, the likelihood that Mendel observed clear F 2 segregations appears beyond dispute, although we cannot exclude the possibility that some of his segregations did contain individuals with “ambiguous” phenotypes.

Figure 3. Clear segregation of cotyledon color (I/i) and stem length (LE/le). A, F2 seeds from a cross between lines Torsdag (II) and 53 (ii), showing segregation of cotyledon color (scale bar = 1 cm). The round/wrinkled (R/r) difference is also segregating. B, Stem length data from Torsdag (TOR, wild-type, tall, LELE), Dippes gelbe Viktoria (DGV, dwarf, lele), and the F2 generation of a cross between these lines, showing segregation of the tall/dwarf difference from data in Ross and Reid (1987).

At the beginning of the last century, Weldon (1902) was perhaps justified to question how Mendel’s differences relate to the total variation that can be observed for each character. At present, however, geneticists understand that the expression of some of Mendel’s characters can be influenced by other genes and the environment. Bearing that in mind, we contend that the resurrection of Weldon’s ideas, largely from an historical perspective ( Radick, 2015 , 2022 ), has occurred with inadequate scrutiny of how his conclusions were reached or analysis of readily available data.

Mendel (1866) did examine, in a preliminary manner, certain traits that did not show binary segregation patterns. For example, he commented that time of flowering was not amenable to his analysis because the flowering time of hybrids stood almost exactly between the times of the two parents. By being selective in this way, Mendel concentrated on characteristics that enabled him to discover the laws of inheritance. Then, those laws could be, and indeed have been, extrapolated to apply to characters beyond those he originally described. This point seems to have been lost on some critics, who suggest that Mendel’s laws apply only to his carefully selected traits, and not to genetic variation in general (e.g. Arney, 2019 ).

Since Mendel’s time, however, clear segregations have been observed for the flowering time trait. Through the use of appropriate environmental conditions (short photoperiods and nonvernalizing temperatures), Murfet (1971) showed clear segregation between wild-type and mutant forms, for two flowering genes. After further detailed analysis, over 10 loci controlling flowering with Mendelian patterns of inheritance were identified without knowledge of their molecular nature or biochemical function ( Reid et al., 1996 ). Since then, a range of molecular tools have been employed to identify most of these loci, in addition to other flowering genes as well. Candidate gene and comparative genetic approaches using knowledge and sequence information from other plant species initially proved highly effective for characterizing pea flowering genes/loci (see Hecht et al., 2005 ; Weller et al., 2009 ); and progress has been more rapid as legume genome and specific pea sequence resources became available (see Weller and Ortega, 2015 ; Williams et al., 2022 ). Thus, although flowering time is a quantitative trait, sensitive to environmental cues including photoperiod, a number of key genes have been identified based on their Mendelian patterns of inheritance.

Mendel’s ratios

Mendel’s data have probably been examined and re-analyzed more than any other data in biology. This, by itself, highlights the importance of his work. The scientific method has been put to use in testing the theory that Mendel’s results agreed too closely with expectation, that is, that his data were “too good.” Such questions arose with the re-discovery of Mendel’s work at the beginning of the 20th century (Weldon, 1902) and became widely debated in the literature after analysis by the eminent statistician Fisher (1936), who concluded that “the data of most, if not all, of the experiments have been falsified so as to agree closely with Mendel’s expectations.” This was followed by comments by key evolutionary geneticists such as Wright (1966) and Dobzhansky (1967). The debate continued, with several publications during the 2000s (e.g. Hartl and Fairbanks, 2007; Franklin et al., 2008; Pires and Branco, 2010).

In the last decade, leading pea geneticists have entered the fray, on opposite sides (Weeden, 2016; Ellis et al., 2019). Weeden (2016) clearly agrees with the earlier suggestions that Mendel’s data were too good, collating those data and noting their relatively low chi-squared values. Based on suggestions by Sturtevant (1965), Weeden (2016) proposed four explanations for the unexpectedly close fit to the predicted numbers. The first of these hypotheses was that in pea an unusual meiotic mechanism somehow results in that close fit. Weeden (2016) empirically refuted that possibility by collating data from several pea geneticists, published since 1927. These data show that, in general, pea does not produce offspring ratios with a better fit than is expected. The second explanation was that Mendel excluded some data from his publication. Weeden (2016) accepted that possibility but considered the omissions inadequate in scale to explain the closeness of fit. The third explanation, supported by Weeden (2016), was that there may have been some bias in scoring ambiguous phenotypes. He presented statistical evidence that the closeness of fit “problem” is more obvious with characters prone (according to Weeden, 2016) to ambiguity, and hence to a biased classification. The fourth possibility was that “a portion” of Mendel’s data may have been falsified by an assistant or assistants. In summary, Weeden (2016) favored explanations three and four and did not exclude either.

Weeden (2016) also questioned “the lack of any statistically significant deviation in Mendel’s data from the expected ratios.” However, as noted by Ellis et al. (2019) , Mendel did in fact report some substantial deviations from expectation. For example, one of his F 1 plants produced 43 round and 2 wrinkled seeds, while another gave 14 round and 15 wrinkled ( Mendel, 1866 ); neither set fits the expected 3:1 ratio. Mendel appears to have added these numbers into a grand total for that character, but clearly, not all of his data were “too good.”
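
The size of these deviations can be checked with a chi-squared test (one degree of freedom) against the 3:1 expectation. The following is a minimal sketch in Python that uses only the two sets of seed counts quoted above. The function name `chi_squared` is an illustrative choice, and the threshold of 3.841 is the standard tabulated 5% critical value for one degree of freedom; neither is taken from the analyses of Weeden (2016) or Ellis et al. (2019).

```python
def chi_squared(observed, ratio):
    """Pearson chi-squared statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * part / ratio_sum for part in ratio]
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))

# Standard tabulated 5% critical value for 1 degree of freedom (assumed threshold)
CRITICAL_5_PERCENT_DF1 = 3.841

# (round, wrinkled) seed counts for the two plants quoted from Mendel (1866)
plants = {"plant 1": (43, 2), "plant 2": (14, 15)}

results = {name: chi_squared(counts, (3, 1)) for name, counts in plants.items()}
for name, stat in results.items():
    verdict = "deviates from" if stat > CRITICAL_5_PERCENT_DF1 else "is consistent with"
    print(f"{name}: chi-squared = {stat:.2f}, {verdict} 3:1 at the 5% level")
```

Both statistics come out above 10, well beyond the 3.841 threshold, which agrees with the observation that neither family on its own fits a 3:1 ratio.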

When Ellis et al. (2019) subdivided some of the grand totals, sometimes into the offspring of individual plants, larger chi-squared values were often obtained, compared with those of the totals. An example is provided by the ratios between heterozygotes and homozygous dominants in F 2 generations, as revealed by growing the F 3 generation. In some cases, the expected ratio was 2:1, but in generations where only 10 F 3 offspring from each F 2 plant were tested, it should have been 1.8874:1.1126 ( Fisher, 1936 ). Taking this into account, the mean chi-squared value for 27 F 2 to F 3 genotyping comparisons is 0.90, close to the expected value of 1 (data from Additional file 1: Table S1.3 of Ellis et al., 2019 ).
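
The corrected ratio follows from the chance that a heterozygous F2 plant is misclassified. A heterozygote is scored as homozygous dominant if none of its 10 F3 offspring shows the recessive phenotype, which happens with probability (3/4)^10, or about 0.0563. A minimal sketch of this arithmetic in Python is given below; the variable names are illustrative, and the only inputs are the 2:1 genotype ratio and the sample of 10 offspring stated above.

```python
from fractions import Fraction

offspring_per_plant = 10      # F3 offspring scored per F2 plant
p_dominant = Fraction(3, 4)   # chance that one offspring of a heterozygote looks dominant

# Probability that a heterozygote yields no recessive offspring and is misscored
p_missed = p_dominant ** offspring_per_plant

# True genotypes among dominant-phenotype F2 plants: 2/3 heterozygous, 1/3 homozygous
scored_het = Fraction(2, 3) * (1 - p_missed)
scored_hom = Fraction(1, 3) + Fraction(2, 3) * p_missed

# Express on the same scale as 2:1 (parts summing to 3)
het_part = float(3 * scored_het)
hom_part = float(3 * scored_hom)
print(f"{het_part:.4f}:{hom_part:.4f}")  # prints 1.8874:1.1126
```

For a test with one degree of freedom, the expected value of the chi-squared statistic under the null hypothesis is 1, which is why a mean of 0.90 across the comparisons is unremarkable.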

In summary, while Weeden (2016) essentially agrees with the previous criticism that Mendel’s data were too close to expectation, Ellis et al. (2019) strongly disagree, concluding that “there is nothing remarkable about Mendel’s data.” At the same time, of course, Weeden’s (2016) collated data from other published pea geneticists actually support Mendel’s laws. The current argument, therefore, is not about the correctness of those laws, but about whether or not Mendel’s data were too close to expectation. Interestingly, van Dijk et al. (2022) alluded to yet another controversy, this time regarding whether Mendel’s approach was essentially deductive or inductive. We note here that according to his 1866 paper, Mendel was certainly aware of, and indeed employed, the scientific method of defining hypotheses and then testing them (a deductive approach). In fact, it has been suggested that Mendel’s mastery of the scientific method was “far ahead of his time” ( Huminiecki, 2020 ). This mastery was complemented by Mendel’s careful selection of pea as his major experimental material, ensuring he had true breeding lines at the commencement of his crosses, and his practice of observing actual numbers for each separate character over several generations after the initial cross. While individually these approaches had been used by earlier workers, Mendel’s combination of skills, due to his knowledge of both biology and mathematics, enabled him to make his discoveries.

Perhaps now the debates regarding Mendel’s approach and data can finally be put to rest, despite persisting for many decades. Mostly, they are peripheral to the essence of Mendel’s observations and to his invaluable insights into the inheritance of characteristics.

Did Mendel miss linkage?

Related to discussions about Mendel’s data is the issue of how he missed the phenomenon of linkage. Prior to a major revision of the pea linkage map ( Weeden et al., 1998 ), there was confusion, even in textbooks, about the location of Mendel’s seven genes. During the 1950s and 1960s, they were reported to be located on the seven different chromosomes of pea, although this was refuted as early as 1970 by Murfet and later by Blixt (1975) . With our current knowledge we know that the R and GP loci are weakly linked ( Weeden et al., 1998 ), which may not have been detected with the number of plants used by Mendel ( Weeden, 2016 ), while if Mendel’s pod membrane gene was at the V locus, it is quite tightly linked to the stem length locus LE ( Hall et al., 1997) . It is worth emphasizing that Mendel did not note unusual frequency of co-inheritance of pod form and stem length characters in his segregating populations. It is possible that Mendel studied V in a population that was not segregating for both characters ( Ellis et al., 2011 ), or that he may have instead studied pod form via segregation at the P locus, which is not linked to LE . This issue has been discussed in detail by Reid and Ross (2011) and Ellis et al. (2011) . Whatever the reason for the lack of linkage detection by Mendel, it does not overshadow the brilliance of his insights into the inheritance of discrete characters, and the principles of segregation and independent assortment which are the foundations of the discipline of genetics.
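
How strongly linkage distorts a dihybrid F2 depends on the recombination fraction and on the number of plants scored. The sketch below is a hypothetical illustration only. It assumes two fully dominant loci in coupling phase, and the recombination fractions and the sample size of 500 are arbitrary values chosen for illustration; they are not estimates for R and GP or for V and LE, nor counts from Mendel’s experiments. It computes the expected F2 phenotype proportions and the systematic contribution of linkage to a chi-squared test against 9:3:3:1.

```python
def f2_proportions(r):
    """Expected F2 phenotype proportions (A_B_, A_bb, aaB_, aabb) for two
    dominant loci in coupling phase with recombination fraction r."""
    double_recessive = (1 - r) ** 2 / 4
    single_dominant = 0.25 - double_recessive
    double_dominant = 0.5 + double_recessive
    return (double_dominant, single_dominant, single_dominant, double_recessive)

def linkage_signal(r, n_plants):
    """Systematic chi-squared contribution versus 9:3:3:1 for n_plants F2 plants."""
    independent = (9 / 16, 3 / 16, 3 / 16, 1 / 16)
    linked = f2_proportions(r)
    return n_plants * sum((p - q) ** 2 / q for p, q in zip(linked, independent))

# Hypothetical recombination fractions and sample size, for illustration only
for r in (0.5, 0.4, 0.1):
    print(f"r = {r}: signal for 500 plants = {linkage_signal(r, 500):.1f}")
```

With free recombination (r = 0.5) the proportions reduce to 9:3:3:1 and the signal is zero. The signal grows as r falls, so weak linkage needs many more plants to detect than tight linkage does.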

Mendel’s genes as a teaching tool

The story of Mendel’s experiments, their rediscovery after decades, and the controversies about his data that ensued, add to the benefits of garden pea as an excellent model for teaching, allowing students to appreciate the development of genetics as a discipline and the scientific method of hypothesis testing. A study of Mendel’s seven characteristics illustrates the main principles of genetics. These include the importance of dominant and recessive phenotypes, gene families, pleiotropy, structural and regulatory gene function, and the various methods available to identify genes. The mutations in Mendel’s genes include single base substitutions ( le-1 ; Lester et al., 1997 ), disruption of splice sites ( a ; Hellens et al., 2010) , and both small ( i ; Sato et al., 2007 ) and large insertions ( r ; Bhattacharyya et al., 1990 ). Students can be instructed in how Mendel’s genes have been pivotal for studying key aspects of plant development, including pigmentation patterns, seed development, the hormonal regulation of plant growth, and plant senescence. A genetics and plant development course can be based purely on these aspects. At the practical level, pea plants are easy to grow, and students can be introduced to the husbandry of commercially relevant plants bearing large flowers that self-pollinate but which can be easily crossed. Cultivars carrying Mendelian mutations (e.g. le , a , r , and i ) are readily available at plant nurseries. Overall, students may not only learn the key principles of genetics and plant development, but also relate to the history of the discipline and the associated controversies about Mendel’s data.

In the last decade, Mendel’s results have been reviewed by pea geneticists including Weeden (2016) , Ellis et al. (2019) , and van Dijk et al. (2022) . Weeden (2016) notes that “whether Mendel should be placed on a pedestal as the founder of experimental genetics is still a moot point.” In contrast, Ellis et al. (2019) describe Mendel’s (1866) paper as “exemplary,” and its subsequent statistical criticism as a “pernicious feature.” Within the recent reviews, a difference of opinion also emerges with regard to the possibility of ambiguous phenotypes interfering with the scoring of Mendel’s characters. According to Weeden (2016) , ambiguous individuals can occur with regard to four of Mendel’s characters, including cotyledon color and seed shape. In contrast, van Dijk et al. (2022) recently gave the impression that in general terms the segregation of cotyledon color and seed shape “in the F 2 is very obvious.” Here, we agree that segregation for Mendel’s characters can indeed be unambiguous ( Figure 3 ). We also dismiss recent doubts about the fundamentally binary nature of Mendel’s characters ( Radick, 2015 ). However, at the same time, we should not deny the existence of ambiguous individuals in some circumstances ( Weeden, 2016 ). It is clear that even if Mendel encountered some ambiguity at times, he would have observed more than enough clear segregations to form the basis for his laws.

OUTSTANDING QUESTIONS

  • Will recent advances in molecular techniques allow us to identify Mendel’s three unknown genes?
  • Is pod color ( GP ) controlled by a 3’ exoribonuclease gene, or is it still too soon to rule out another candidate?
  • Have questions surrounding the validity of Mendel’s data finally been resolved?
  • What does the future hold for pea genetics?

The title of our last review (Reid and Ross, 2011) was “Mendel’s genes: Toward a full molecular characterisation.” However, 11 years on, that full characterization has yet to occur. One reason is that pea genes other than Mendel’s have often been prioritized for full characterization, based on their physiological or developmental importance. Another reason is that pea has not been an easy model species for molecular genetics research. In fact, with the expansion of plant molecular biology in the 1980s and 1990s, pea was quickly left in the wake of Arabidopsis, in which advances were facilitated by the relatively small genome and other features such as the paucity of repetitive sequences, ease of transformation, and rapid life cycle.

That situation is now changing, with pea catching up in some key molecular areas. Indeed, we now have the tools to complete the characterization of Mendel’s genes. On the basis of linkage/mapping studies, combined with the pea genome, candidate genes can readily be found. The reverse genetics techniques of TILLING and more recently CRISPR now provide mechanisms for obtaining mutants for these candidate genes, to compare the resulting phenotypes with those expected from Mendel’s descriptions. In addition to helping to identify the remainder of Mendel’s genes, the molecular advances will continue to benefit pea genetics in general (see “Outstanding Questions”). In fact, following recent speculation ( Berger, 2022 ), we might surmise that Mendel would be happy to see those molecular advances propelling pea back to the top echelon of model plant species.

Supplemental data

The following materials are available in the online version of this article.

Supplemental Table S1 . Accession and chromosomal location details for Mendel’s genes and candidates identified for Mendel’s remaining loci shown in Figure 1 .

Supplementary Material

Kiac424_supplementary_data

Acknowledgments

We thank Prof. Wojciech Święcicki and Dr. Robert Wiltshire for kindly sharing the photographs shown in Figure 2 .

F.C.S. is supported by an Australian Research Council Discovery Early Career Award (DE200101133) funded by the Australian Government.

Conflict of interest statement. The authors declare that they have no conflicts of interest.

Contributor Information

Frances C Sussmilch, Discipline of Biological Sciences, School of Natural Sciences, University of Tasmania, Sandy Bay, Tasmania 7005, Australia.

John J Ross, Discipline of Biological Sciences, School of Natural Sciences, University of Tasmania, Sandy Bay, Tasmania 7005, Australia.

James B Reid, Discipline of Biological Sciences, School of Natural Sciences, University of Tasmania, Sandy Bay, Tasmania 7005, Australia.

All authors contributed ideas, drafted the manuscript, and revised the text. F.C.S. performed the bioinformatic analyses for candidate gene identification.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors ( https://academic.oup.com/plphys/pages/general-instructions ) is: James Reid ( [email protected] ).

  • Alves-Carvalho S, Aubert G, Carrère S, Cruaud C, Brochot AL, Jacquin F, Klein A, Martin C, Oucherot K, Kreplak J, et al (2015) Full-length de novo assembly of RNA-seq data in pea (Pisum sativum L.) provides a gene expression atlas and gives insights into root nodulation in this species. Plant J 84: 1–19
  • Armstead I, Donnison I, Aubry S, Harper J, Hörtensteiner S, James C, Mani J, Moffet M, Ougham H, Roberts L, et al (2007) Cross-species identification of Mendel’s I locus. Science 315: 73
  • Arney K (2019) Fifty shades of peas. https://geneticsunzipped.com/news/2019/1/31/fifty-shades-of-peas (accessed 17 June 2022).
  • Aubert G, Morin J, Jacquin F, Loridon K, Quillet MC, Petit A, Rameau C, Lejeune-Hénaut I, Huguet T, Burstin J (2006) Functional mapping in pea, as an aid to the candidate gene selection and for investigating synteny with the model legume Medicago truncatula. Theor Appl Genet 112: 1024–1041
  • Aubry S, Mani J, Hörtensteiner S (2008) Stay-green protein, defective in Mendel’s green cotyledon mutant, acts independent and upstream of pheophorbide a oxygenase in the chlorophyll catabolic pathway. Plant Mol Biol 67: 243–256
  • Barber HN, Jackson WD, Murfet IC, Sprent JI (1958) Gibberellic acid and the physiological genetics of flowering in peas. Nature 182: 1321–1322
  • Barbier FF, Dun EA, Kerr SC, Chabikwa TG, Beveridge CA (2019) An update on the signals controlling shoot branching. Trends Plant Sci 24: 220–236
  • Barth C, Conklin PL (2003) The lower cell density of leaf parenchyma in the Arabidopsis thaliana mutant lcd1-1 is associated with increased sensitivity to ozone and virulent Pseudomonas syringae. Plant J 35: 206–218
  • Berger F (2022) Which field of research would Gregor Mendel choose in the 21st century? Plant Cell 34: 2462–2465
  • Bernardi J, Battaglia R, Bagnaresi P, Lucini L, Marocco A (2019) Transcriptomic and metabolomic analysis of ZmYUC1 mutant reveals the role of auxin during early endosperm formation in maize. Plant Sci 281: 133–145
  • Bhattacharyya MK, Smith AM, Ellis THN, Hedley C, Martin C (1990) The wrinkled-seed character of pea described by Mendel is caused by a transposon-like insertion in a gene encoding starch-branching enzyme. Cell 60: 115–122
  • Blixt S (1972) Mutation genetics in Pisum. Agri Hortique Genet 30: 1–293
  • Blixt S (1975) Why didn’t Gregor Mendel find linkage? Nature 256: 206–206
  • Bordat A, Savois V, Nicolas M, Salse J, Chauveau A, Bourgeois M, Potier J, Houtin H, Rond C, Murat F, et al (2011) Translational genomics in legumes allowed placing in silico 5460 unigenes on the pea functional map and identified candidate genes in Pisum sativum L. G3 1: 93–103
  • Cannon SB, Sterck L, Rombauts S, Sato S, Cheung F, Gouzy J, Wang X, Mudge J, Vasdewani J, Schiex T, et al (2006) Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes. Proc Natl Acad Sci USA 103: 14959–14964
  • Clemow SR, Clairmont L, Madsen LH, Guinel FC (2011) Reproducible hairy root transformation and spot-inoculation methods to study root symbioses of pea. Plant Methods 7: 46
  • Cubas P, Lauter N, Doebley J, Coen E (1999) The TCP domain: a motif found in proteins regulating plant growth and development. Plant J 18: 215–222
  • Dalmais M, Schmidt J, Le Signor C, Moussy F, Burstin J, Savois V, Aubert G, Brunaud V, de Oliveira Y, Guichard C, et al (2008) UTILLdb, a Pisum sativum in silico forward and reverse genetics tool. Genome Biol 9: R43
  • Darbishire AD (1911) Breeding and the Mendelian Discovery. Cassell and Company, London
  • Darwin C, Darwin F (1896) The Power of Movement in Plants. D. Appleton and Company, New York, NY
  • Davies P, Muehlbauer F (2020) Peas. In Wien HC, Stützel H, eds, The Physiology of Vegetable Crops , CAB International, Wallingford, p 287 [ Google Scholar ]
  • Dobzhansky T (1967) Looking Back at Mendel's Discovery. Science 156 : 1588–1589 [ Google Scholar ]
  • Dohmann EMN, Kuhnle C, Schwechheimer C (2005) Loss of the CONSTITUTIVE PHOTOMORPHOGENIC 9 signalosome subunit 5 is sufficient to cause the cop/det/fus mutant phenotype in Arabidopsis . Plant Cell 17 : 1967–1978 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Dong Y, Yang X, Liu J, Wang BH, Liu BL, Wang YZ (2014) Pod shattering resistance associated with domestication is mediated by a NAC gene in soybean . Nat Commun 5 : 3352. [ PubMed ] [ Google Scholar ]
  • Duarte J, Rivière N, Baranger A, Aubert G, Burstin J, Cornet L, Lavaud C, Lejeune-Hénaut I, Martinant JP, Pichon JP, et al (2014) Transcriptome sequencing for high throughput SNP development and genetic mapping in Pea . BMC Genomics 15 : 126. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ellis THN, Hofer JMI, Swain MT, van Dijk PJ (2019) Mendel’s pea crosses: varieties, traits and statistics . Hereditas 156 : 33. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ellis THN, Hofer JMI, Timmerman-Vaughan GM, Coyne CJ, Hellens RP (2011) Mendel, 150 years on . Trends Plant Sci 16 : 590–596 [ PubMed ] [ Google Scholar ]
  • Fairbanks DJ (2022) Demystifying the mythical Mendel: a biographical review . Heredity 129 : 4–11 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Fisher RA (1936) Has Mendel’s work been rediscovered? Ann Sci 1 : 115–137 [ Google Scholar ]
  • Foo E, Yoneyama K, Hugill CJ, Quittenden LJ, Reid JB (2013) Strigolactones and the regulation of pea symbioses in response to nitrate and phosphate deficiency . Mol Plant 6 : 76–87 [ PubMed ] [ Google Scholar ]
  • Foyer CH, Lam HM, Nguyen HT, Siddique KHM, Varshney RK, Colmer TD, Cowling W, Bramley H, Mori TA, Hodgson JM, et al (2016) Neglecting legumes has compromised human health and sustainable food production . Nat Plants 2 : 16112. [ PubMed ] [ Google Scholar ]
  • Franklin A, Edwards AW, Fairbanks DJ, Hartl DL (2008) Ending the Mendel–Fisher Controversy , University of Pittsburgh Press, Pittsburgh, PA [ Google Scholar ]
  • Franssen SU, Shrestha RP, Bräutigam A, Bornberg-Bauer E, Weber APM (2011) Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing . BMC Genome 12 : 227 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Galli M, Liu Q, Moss BL, Malcomber S, Li W, Gaines C, Federici S, Roshkovan J, Meeley R, Nemhauser JL, et al (2015) Auxin signaling modules regulate maize inflorescence architecture . Proc Natl Acad Sci USA 112 : 13372–13377 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Gomez-Roldan V, Fermas S, Brewer PB, Puech-Pagès V, Dun EA, Pillot JP, Letisse F, Matusova R, Danoun S, Portais JC, et al (2008) Strigolactone inhibition of shoot branching . Nature 455 : 189–194 [ PubMed ] [ Google Scholar ]
  • Gregory RP (1903) The seed characters of Pisum sativum . New Phytol 2 : 226–228 [ Google Scholar ]
  • Griesmann M, Chang Y, Liu X, Song Y, Haberer G, Crook MB, Billault-Penneteau B, Lauressergues D, Keller J, Imanishi L, et al (2018) Phylogenomics reveals multiple losses of nitrogen-fixing root nodule symbiosis . Science 361 : eaat1743 [ PubMed ] [ Google Scholar ]
  • Gui J, Luo L, Zhong Y, Sun J, Umezawa T, Li L (2019) Phosphorylation of LTF1, an MYB transcription factor in Populus , acts as a sensory switch regulating lignin biosynthesis in wood cells . Mol Plant 12 : 1325–1337 [ PubMed ] [ Google Scholar ]
  • Guillaumie S, Mzid R, Méchin V, Léon C, Hichri I, Destrac-Irvine A, Trossat-Magnin C, Delrot S, Lauvergeat V (2009) The grapevine transcription factor WRKY2 influences the lignin pathway and xylem development in tobacco . Plant Mol Biol 72 : 215. [ PubMed ] [ Google Scholar ]
  • Hall KJ, Parker JS, Ellis THN, Turner L, Knox MR, Hofer JMI, Lu J, Ferrandiz C, Hunter PJ, Taylor JD, et al (1997) The relationship between genetic and cytogenetic maps of pea. II. Physical maps of linkage mapping populations . Genome 40 : 755–769 [ PubMed ] [ Google Scholar ]
  • Harker CL, Ellis TH, Coen ES (1990) Identification and genetic regulation of the chalcone synthase multigene family in pea . Plant Cell 2 : 185–194 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hartl DL, Fairbanks DJ (2007) Mud sticks: on the alleged falsification of Mendel’s data . Genetics 175 : 975–979 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hecht VR, Foucher F, Ferrándiz C, Macknight R, Navarro C, Morin J, Vardy ME, Ellis N, Beltrán JPO, Rameau C, et al (2005) Conservation of Arabidopsis flowering genes in model legumes . Plant Physiol 137 : 1420–1434 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hellens RP, Moreau C, Lin-Wang K, Schwinn KE, Thomson SJ, Fiers MWEJ, Frew TJ, Murray SR, Hofer JMI, Jacobs JME, et al (2010) Identification of Mendel’s white flower character . PLoS ONE 5 : e13230. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Hofer J, Turner L, Moreau C, Ambrose M, Isaac P, Butcher S, Weller J, Dupin A, Dalmais M, Le Signor C, et al. (2009) Tendril-less regulates tendril formation in pea leaves . Plant Cell 21 : 420–428 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Huminiecki Ł (2020) A contemporary message from Mendel’s logical empiricism . BioEssays 42 : 2000120 [ PubMed ] [ Google Scholar ]
  • Ingram TJ, Reid JB, Murfet IC, Gaskin P, Willis CL, MacMillan J (1984) Internode length in Pisum . Planta 160 : 455–463 [ PubMed ] [ Google Scholar ]
  • Jiang H, Li M, Liang N, Yan H, Wei Y, Xu X, Liu J, Xu Z, Chen F, Wu G (2007) Molecular cloning and function analysis of the stay green gene in rice . Plant J 52 : 197–209 [ PubMed ] [ Google Scholar ]
  • Jiao B, Meng Q, Lv W (2020) Roles of stay-green (SGR) homologs during chlorophyll degradation in green plants . Bot Stud 61 : 25. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kaló P, Seres A, Taylor SA, Jakab J, Kevei Z, Kereszt A, Endre G, Ellis THN, Kiss GB (2004) Comparative mapping between Medicago sativa and Pisum sativum . Mol Genet Genom 272 : 235–246 [ PubMed ] [ Google Scholar ]
  • Kaur S, Pembleton LW, Cogan NOI, Savin KW, Leonforte T, Paull J, Materne M, Forster JW (2012) Transcriptome sequencing of field pea and faba bean for discovery and validation of SSR genetic markers . BMC Genome 13 : 104 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kieffer M, Master V, Waites R, Davies B (2011) TCP14 and TCP15 affect internode length and leaf shape in Arabidopsis . Plant J 68 : 147–158 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Koyama T, Mitsuda N, Seki M, Shinozaki K, Ohme-Takagi M (2010) TCP transcription factors regulate the activities of ASYMMETRIC LEAVES1 and miR164, as well as the auxin response, during differentiation of leaves in Arabidopsis . Plant Cell 22 : 3574–3588 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Kreplak J, Madoui MA, Cápal P, Novák P, Labadie K, Aubert G, Bayer PE, Gali KK, Syme RA, Main D, et al (2019) A reference genome for pea provides insight into legume genome evolution . Nat Genet 51 : 1411–1422 [ PubMed ] [ Google Scholar ]
  • Lamprecht H (1948) The variation of linkage and the course of crossing over . Agri Hortique Genet 6 : 10–48 [ Google Scholar ]
  • Laucou V, Haurogné K, Ellis N, Rameau C (1998) Genetic mapping in pea. 1. RAPD-based genetic linkage map of Pisum sativum . Theor Appl Genet 97 : 905–915 [ Google Scholar ]
  • Lester DR, MacKenzie-Hose AK, Davies PJ, Ross JJ, Reid JB (1999) The influence of the null le-2 mutation on gibberellin levels in developing pea seeds . Plant Growth Regul 27 : 83–89 [ Google Scholar ]
  • Lester DR, Ross JJ, Davies PJ, Reid JB (1997) Mendel’s stem length gene ( Le ) encodes a gibberellin 3 beta-hydroxylase . Plant Cell 9 : 1435–1443 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Li G, Liu R, Xu R, Varshney RK, Ding H, Li M, Yan X, Huang S, Li J, Wang D, et al (2022) Development of an Agrobacterium -mediated CRISPR/Cas9 system in pea ( Pisum sativum L.) . Crop J 10.1016/j.cj.2022.04.011 [ CrossRef ] [ Google Scholar ]
  • Martin DN, Proebsting WM, Hedden P (1997) Mendel's dwarfing gene: cDNAs from the Le alleles and function of the expressed proteins . Proc Natl Acad Sci USA 94 : 8907–8911 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Mashiguchi K, Seto Y, Yamaguchi S (2021) Strigolactone biosynthesis, transport and perception . Plant J 105 : 335–350 [ PubMed ] [ Google Scholar ]
  • Mashiguchi K, Tanaka K, Sakai T, Sugawara S, Kawaide H, Natsume M, Hanada A, Yaeno T, Shirasu K, Yao H, et al (2011) The main auxin biosynthesis pathway in Arabidopsis . Proc Natl Acad Sci USA 108 : 18512–18517 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • McAdam EL, Meitzel T, Quittenden LJ, Davidson SE, Dalmais M, Bendahmane AI, Thompson R, Smith JJ, Nichols DS, Urquhart S, et al (2017) Evidence that auxin is required for normal seed size and starch synthesis in pea . New Phytol 216 : 193–204 [ PubMed ] [ Google Scholar ]
  • Meitzel T, Radchuk R, McAdam EL, Thormählen I, Feil R, Munz E, Hilo A, Geigenberger P, Ross JJ, Lunn JE, et al (2021) Trehalose 6-phosphate promotes seed filling by activating auxin biosynthesis . New Phytol 229 : 1553–1565 [ PubMed ] [ Google Scholar ]
  • Mendel G (1866) Experiments in plant hybridization (Versüche uber Pflanzenhybriden). In Verhandlungen des naturforschenden Vereins Brünn. http://www.mendelweb.org/Mendel.html (accessed 21 July 2022)
  • Murfet IC (1971) Flowering in Pisum . Three distinct phenotypic classes determined by the interaction of a dominant early and a dominant late gene . Heredity 26 : 243–257 [ Google Scholar ]
  • Nasmyth K (2022) The magic and meaning of Mendel’s miracle . Nat Rev Genet 23 : 447–452 [ PubMed ] [ Google Scholar ]
  • Nguyen AH, Matsui A, Tanaka M, Mizunashi K, Nakaminami K, Hayashi M, Iida K, Toyoda T, Nguyen DV, Seki M (2015) Loss of Arabidopsis 5′–3′ exoribonuclease AtXRN4 function enhances heat stress tolerance of plants subjected to severe heat stress . Plant Cell Physiol 56 : 1762–1772 [ PubMed ] [ Google Scholar ]
  • Nimchuk ZL, Zhou Y, Tarr PT, Peterson BA, Meyerowitz EM (2015) Plant stem cell maintenance by transcriptional cross-regulation of related receptor kinases . Development 142 : 1043–1049 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • O’Neill DP, Davidson SE, Clarke VC, Yamauchi Y, Yamaguchi S, Kamiya Y, Reid JB, Ross JJ (2010) Regulation of the gibberellin pathway by auxin and DELLA proteins . Planta 232 : 1141–1149 [ PubMed ] [ Google Scholar ]
  • Parker TA, Berny Mier y Teran JC, Palkovic A, Jernstedt J, Gepts P (2020) Pod indehiscence is a domestication and aridity resilience trait in common bean . New Phytol 225 : 558–570 [ PubMed ] [ Google Scholar ]
  • Pires AM, Branco JA (2010) A statistical model to explain the Mendel–Fisher controversy . Statist Sci 25 : 545–565 [ Google Scholar ]
  • Powers SE, Thavarajah D (2019) Checking agriculture’s pulse: field pea ( Pisum sativum L.), sustainability, and phosphorus use efficiency . Front Plant Sci 10 : 1489. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Price DN, Smith CM, Hedley CL (1988) The effect of the gp gene on fruit development in Pisum sativum L. I. Structural and physical aspects . New Phytol 110 : 261–269 [ Google Scholar ]
  • Quilbé J, Lamy L, Brottier L, Leleux P, Fardoux J, Rivallan R, Benichou T, Guyonnet R, Becana M, Villar I, et al (2021) Genetics of nodulation in Aeschynomene evenia uncovers mechanisms of the rhizobium–legume symbiosis . Nat Commun 12 : 829. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Radick G (2015) Beyond the Mendel–Fisher controversy . Science 350 : 159–160 [ PubMed ] [ Google Scholar ]
  • Radick G (2022) Mendel the fraud? A social history of truth in genetics . Stud His Phil Sci 93 : 39–46 [ PubMed ] [ Google Scholar ]
  • Reid JB, Murfet IC, Singer SR, Weller JL, Taylor SA (1996) Physiological–genetics of flowering in Pisum . Semin Cell Dev Biol 7 : 455–463 [ Google Scholar ]
  • Reid JB, Ross JJ (2011) Mendel’s genes: toward a full molecular characterization . Genetics 189 : 3–10 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Reinecke DM, Wickramarathna AD, Ozga JA, Kurepin LV, Jin AL, Good AG, Pharis RP (2013) Gibberellin 3-oxidase gene expression patterns influence gibberellin biosynthesis, growth, and development in pea . Plant Physiol 163 : 929–945 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ross JJ, O’Neill DP, Smith JJ, Kerckhoffs LHJ, Elliott RC (2000) Evidence that auxin promotes gibberellin A1 biosynthesis in pea . Plant J 21 : 547–552 [ PubMed ] [ Google Scholar ]
  • Ross JJ, Reid JB (1987) Internode length in Pisum . A new allele at the le locus . Ann Bot 59 : 107–109 [ Google Scholar ]
  • Sato S, Nakamura Y, Kaneko T, Asamizu E, Kato T, Nakao M, Sasamoto S, Watanabe A, Ono A, Kawashima K, et al (2008) Genome structure of the legume, Lotus japonicus . DNA Res 15 : 227–239 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sato Y, Morita R, Katsuma S, Nishimura M, Tanaka A, Kusaba M (2009) Two short-chain dehydrogenase/reductases, NON-YELLOW COLORING 1 and NYC1-LIKE, are required for chlorophyll b and light-harvesting complex II degradation during senescence in rice . Plant J 57 : 120–131 [ PubMed ] [ Google Scholar ]
  • Sato Y, Morita R, Nishimura M, Yamaguchi H, Kusaba M (2007) Mendel’s green cotyledon gene encodes a positive regulator of the chlorophyll-degrading pathway . Proc Natl Acad Sci USA 104 : 14169–14174 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, et al (2010) Genome sequence of the palaeopolyploid soybean . Nature 463 : 178–183 [ PubMed ] [ Google Scholar ]
  • Schmutz J, McClean PE, Mamidi S, Wu GA, Cannon SB, Grimwood J, Jenkins J, Shu S, Song Q, Chavarro C, et al (2014) A reference genome for common bean and genome-wide analysis of dual domestications . Nat Genet 46 : 707–713 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shen WH, Parmentier Y, Hellmann H, Lechner E, Dong A, Masson J, Granier F, Lepiniec L, Estelle M, Genschik P (2002) Null mutation of AtCUL1 causes arrest in early embryogenesis in Arabidopsis . Mol Biol Cell 13 : 1916–1928 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shimoda Y, Ito H, Tanaka A (2016) Arabidopsis STAY-GREEN , Mendel’s green cotyledon gene, encodes magnesium-dechelatase . Plant Cell 28 : 2147–2160 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Shirasawa K, Sasaki K, Hirakawa H, Isobe S (2021) Genomic region associated with pod color variation in pea ( Pisum sativum ) . G3 11 : jkab081. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sinjushin A, Konovalov F, Gostimskii S (2006) A gene for stem fasciation is localized on linkage group III . Pisum Genet 38 : 19–20 [ Google Scholar ]
  • Smýkal P (2014) Pea ( Pisum sativum L.) in biology prior and after Mendel’s discovery . Czech J Genet Plant Breed 50 : 52–64 [ Google Scholar ]
  • Statham CM, Crowden RK, Harborne JB (1972) Biochemical genetics of pigmentation in Pisum sativum . Phytochemistry 11 : 1083–1088 [ Google Scholar ]
  • Stickland RG, Wilson KE (1983) Sugars and starch in developing round and wrinkled pea seeds . Ann Bot 52 : 919–921 [ Google Scholar ]
  • Stirnberg P, van de Sande K, Leyser HMO (2002) MAX1 and MAX2 control shoot lateral branching in Arabidopsis . Development 129 : 1131–1141 [ PubMed ] [ Google Scholar ]
  • Sturtevant AH (1965) The early Mendelians . Proc Am Phil Soc 109 : 199–204 [ Google Scholar ]
  • Tang H, Cuevas HE, Das S, Sezen UU, Zhou C, Guo H, Goff VH, Ge Z, Clemente TE, Paterson AH (2013) Seed shattering in a wild sorghum is conferred by a locus unrelated to domestication . Proc Natl Acad Sci USA 110 : 15824–15829 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Tayeh N, Aluome C, Falque M, Jacquin F, Klein A, Chauveau A, Bérard A, Houtin H, Rond C, Kreplak J, et al (2015) Development of two major resources for pea genomics: the GenoPea 13.2K SNP array and a high-density, high-resolution consensus genetic map . Plant J 84 : 1257–1273 [ PubMed ] [ Google Scholar ]
  • Tivendale ND, Davidson SE, Davies NW, Smith JA, Dalmais M, Bendahmane AI, Quittenden LJ, Sutton L, Bala RK, Le Signor C, et al (2012) Biosynthesis of the halogenated auxin, 4-chloroindole-3-acetic acid . Plant Physiol 159 : 1055–1063 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Triques K, Sturbois B, Gallais S, Dalmais M, Chauvin S, Clepet C, Aubourg S, Rameau C, Caboche M, Bendahmane A (2007) Characterization of Arabidopsis thaliana mismatch specific endonucleases: application to mutation discovery by TILLING in pea . Plant J 51 : 1116–1125 [ PubMed ] [ Google Scholar ]
  • van Dijk PJ, Jessop AP, Ellis THN (2022) How did Mendel arrive at his discoveries? Nat Genet 54 : 926–933 [ PubMed ] [ Google Scholar ]
  • Varshney RK, Song C, Saxena RK, Azam S, Yu S, Sharpe AG, Cannon S, Baek J, Rosen BD, Tar’an B, et al (2013) Draft genome sequence of chickpea ( Cicer arietinum ) provides a resource for trait improvement . Nat Biotechnol 31 : 240–246 [ PubMed ] [ Google Scholar ]
  • Velandia K, Reid JB, Foo E (2022) Right time, right place: the dynamic role of hormones in rhizobial infection and nodulation of legumes . Plant Commun 3 : 100327. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wang H, Hao J, Chen X, Hao Z, Wang X, Lou Y, Peng Y, Guo Z (2007) Overexpression of rice WRKY89 enhances ultraviolet B tolerance and disease resistance in rice plants . Plant Mol Biol 65 : 799–815 [ PubMed ] [ Google Scholar ]
  • Wang H, Zhao Q, Chen F, Wang M, Dixon RA (2011) NAC domain function and transcriptional control of a secondary cell wall master switch . Plant J 68 : 1104–1114 [ PubMed ] [ Google Scholar ]
  • Weeden N, Ellis T, Timmerman-Vaughan G, Swiecicki W, Rozov S, Berdnikov V (1998 ) A consensus linkage map for Pisum sativum . Pisum Genet 30 : 1–4 [ Google Scholar ]
  • Weeden NF (2016) Are Mendel’s data reliable? The perspective of a pea geneticist . J Hered 107 : 635–646 [ PubMed ] [ Google Scholar ]
  • Weeden NF, Swiecicki WK, Ambrose M, Timmerman GM (1993) Linkage groups of Pea. Pisum Genet 25 : Cover and 4
  • Weir I, Lu J, Cook H, Causier B, Schwarz-Sommer Z, Davies B (2004) CUPULIFORMIS establishes lateral organ boundaries in Antirrhinum . Development 131 : 915–922 [ PubMed ] [ Google Scholar ]
  • Weldon WFR (1902) Mendel’s laws of alternative inheritance in peas . Biometrika 1 : 228–233 [ Google Scholar ]
  • Weller JL, Hecht V, Liew LC, Sussmilch FC, Wenden B, Knowles CL, Vander Schoor JK (2009) Update on the genetic control of flowering in garden pea . J Exp Bot 60 : 2493–2499 [ PubMed ] [ Google Scholar ]
  • Weller JL, Ortega R (2015) Genetic control of flowering time in legumes . Front Plant Sci 6 : 207. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Weston DE, Elliott RC, Lester DR, Rameau C, Reid JB, Murfet IC, Ross JJ (2008) The pea DELLA proteins LA and CRY are important regulators of gibberellin synthesis and root growth . Plant Physiol 147 : 199–205 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • White OE (1917) Studies of inheritance in Pisum . II. The present state of knowledge of heredity and variation in peas . Proc Am Phil Soc 56 : 487–588 [ Google Scholar ]
  • Williams O, Vander Schoor JK, Butler JB, Ridge S, Sussmilch FC, Hecht VFG, Weller JL (2022) The genetic architecture of flowering time changes in pea from wild to crop . J Exp Bot 73 : 3978–3990 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Wright S (1966) Mendel’s ratios. In Stern C, Sherwood ER, eds, The Origin of Genetics , WH Freeman & Co, San Francisco, pp 173–175 [ Google Scholar ]
  • Xi L, Moscou MJ, Meng Y, Xu W, Caldo RA, Shaver M, Nettleton D, Wise RP (2009) Transcript-based cloning of RRP46, a regulator of rRNA processing and R gene-independent cell death in barley–powdery mildew interactions . Plant Cell 21 : 3280–3295 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Xie W, Ke Y, Cao J, Wang S, Yuan M (2021) Knock out of transcription factor WRKY53 thickens sclerenchyma cell walls, confers bacterial blight resistance . Plant Physiol 187 : 1746–1761 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yang C, Xu Z, Song J, Conner K, Vizcay Barrena G, Wilson ZA (2007) Arabidopsis MYB26/MALE STERILE35 regulates secondary thickening in the endothecium and is essential for anther dehiscence . Plant Cell 19 : 534–548 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Yang L, Zhao X, Yang F, Fan D, Jiang Y, Luo K (2016) PtrWRKY19, a novel WRKY transcription factor, contributes to the regulation of pith secondary wall formation in Populus trichocarpa . Sci Rep 6 : 18643. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Young ND, Cannon SB, Sato S, Kim D, Cook DR, Town CD, Roe BA, Tabata S (2005) Sequencing the genes paces of Medicago truncatula and Lotus japonicus . Plant Physiol 137 : 1174–1181 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Young ND, Debellé F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KFX, Gouzy J, Schoof H, et al (2011) The Medicago genome provides insight into the evolution of rhizobial symbioses . Nature 480 : 520–524 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhang D, Zhang M, Liang J (2021) RGB1 regulates grain development and starch accumulation through its effect on OsYUC11-mediated auxin biosynthesis in rice endosperm cells . Front Plant Sci 12 : 585174. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhang J, Ge H, Zang C, Li X, Grierson D, Chen KS, Yin XR (2016) EjODO1 , a MYB transcription factor, regulating lignin biosynthesis in developing loquat ( Eriobotrya japonica ) fruit . Front Plant Sci 7 : 1360. [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Zhuang W, Chen H, Yang M, Wang J, Pandey MK, Zhang C, Chang WC, Zhang L, Zhang X, Tang R, et al (2019) The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication . Nat Genet 51 : 865–876 [ PMC free article ] [ PubMed ] [ Google Scholar ]

COMMENTS

  1. Genetics

    Genetics articles from across Nature Portfolio. Genetics is the branch of science concerned with genes, heredity, and variation in living organisms. It seeks to understand the process of trait ...

  2. Human Molecular Genetics and Genomics

    Genomic research has evolved from seeking to understand the fundamentals of the human genetic code to examining the ways in which this code varies among people, and then applying this knowledge to ...

  3. Genetics research

    Genetics research is the scientific discipline concerned with the study of the role of genes in traits such as the development of disease. It has a key role in identifying potential targets for ...

  4. The road ahead in genetics and genomics

    His research focuses on the microbiome, nutrition and genetics, and their effect on health and disease and aims to develop personalized medicine based on big data from human cohorts.

  5. Scientists Finish the Human Genome at Last

    In 2019, two scientists — Adam Phillippy, a computational biologist at the National Human Genome Research Institute, and Karen Miga, a geneticist at the University of California, Santa Cruz ...

  6. Methods in molecular biology and genetics: looking to the future

    Abstract. In recent decades, advances in methods in molecular biology and genetics have revolutionized multiple areas of the life and health sciences. However, there remains a global need for the development of more refined and effective methods across these fields of research. In this current Collection, we aim to showcase articles presenting ...

  7. Genetics Research

    Genetics Research is an open access journal providing a key forum for original research on all aspects of human and animal genetics, reporting key findings on genomes, genes, mutations, developmental, evolutionary, and population genetics as well as ethical, legal and social aspects.

  8. GENETICS 101

    Almost every human trait and disease has a genetic component, whether inherited or influenced by behavioral factors such as exercise. Genetic components can also modify the body's response to environmental factors such as toxins. Understanding the underlying concepts of human genetics and the role of genes, behavior, and the environment is important for appropriately collecting and applying ...

  9. Your Guide to Genetics and Genomics

    Applying genetic and genomic testing to disease research. Genetics and genomics play a critical role in the study of human disease. Just as genes pass information about traits and characteristics from parent to offspring, they also inform an individual's susceptibility to certain diseases and conditions.

  10. Current Clinical Studies

    Researchers at the National Human Genome Research Institute (NHGRI) are working with patients and families to better understand how genes can cause or influence diseases and to develop new and more effective diagnostics and treatments.

  11. Mendel's legacy in modern genetics

    A new collection of articles celebrating the bicentennial of Gregor Mendel's birth discusses his life, work and legacy in modern-day genetic research. Citation: Clarke J, on behalf of the PLOS Biology Staff Editors (2022) Mendel's legacy in modern genetics.

  12. A Brief Guide to Genomics

    Still, a deeper understanding of genetics will shed light on more than just hereditary risks by revealing the basic components of cells and, ultimately, explaining how all the various elements work together to affect the human body in both health and disease.

  13. The road ahead in genetics and genomics

    In celebration of the 20th anniversary of Nature Reviews Genetics, we asked 12 leading researchers to reflect on the key challenges and opportunities faced by the field of genetics and genomics. Keeping their particular research area in mind, they take ...

  14. Genetics

    Genetics is the study of genes, genetic variation, and heredity in organisms. It is an important branch in biology because heredity is vital to organisms' evolution. Gregor Mendel, a Moravian Augustinian friar working in the 19th century in Brno, was the first to study genetics scientifically.

  15. Population genetics

    Population genetics is the study of the genetic composition of populations, including distributions and changes in genotype and phenotype frequency in response to the processes of natural ...

  16. Evolution of Genetic Techniques: Past, Present, and Beyond

    Genetics is the study of heredity, which means the study of genes and factors related to all aspects of genes. The scientific history of genetics began with the works of Gregor Mendel in the mid-19th century. Prior to Mendel, genetics was primarily theoretical whilst, after Mendel, the science of genetics was broadened to include experimental ...

  17. What does a geneticist do?

    What is a Geneticist? A geneticist specializes in the field of genetics, the study of genes and heredity. Geneticists investigate how traits are inherited, how they manifest in individuals and populations, and how genetic variations contribute to human health, diseases, and evolution. They analyze and interpret genetic data, conduct experiments, and use various research techniques to explore ...

  18. Genetics Basics

    What to know: This page provides information about basic genetic concepts such as DNA, genes, chromosomes, and gene expression. Genes play a role in almost every human trait and disease. Advances in our understanding of how genes work have led to improvements in health care and public health.

  19. Tree sequences as a general-purpose tool for population genetic

    As population genetics data increase in size, new methods, such as tree sequences, have been developed to store genetic information efficiently. These data structures are computationally and storage efficient, but are not interchangeable with existing data structures used for many population genetic inference methodologies such as the use of convolutional neural networks (CNNs) applied to ...

  20. Leading AI models struggle to identify genetic conditions from patient

    About the National Human Genome Research Institute (NHGRI): At NHGRI, we are focused on advances in genomics research. Building on our leadership role in the initial sequencing of the human genome, we collaborate with the world's scientific and medical communities to enhance genomic technologies that accelerate breakthroughs and improve lives.

  21. Genetic engineering

    Genetic engineering articles from across Nature Portfolio. Genetic engineering is the act of modifying the genetic makeup of an organism. Modifications can be generated by methods such as gene ...

  22. Mayo Clinic study uncovers genetic cancer risks in 550 patients

    These research findings, published in JCO Precision Oncology, are based on genetic screenings of more than 44,000 study participants from diverse backgrounds. For this Mayo Clinic Center for Individualized Medicine Tapestry project, researchers sequenced the exomes — the protein-coding regions of genes — because this is where most disease ...

  23. A Genetic Analysis of Bacteria Strains Causing Lyme Disease Could

    International research team including Dr. Benjamin Luft maps out the genomes of 47 strains and develops web-based software for future investigations. STONY BROOK, NY, August 15, 2024 - After years of research, an international team of scientists has unraveled the genetic makeup of 47 strains of known and potential Lyme disease-causing bacteria. The work paves the

  24. Mendel's legacy in modern genetics

    A new collection of articles celebrating the bicentennial of Gregor Mendel's birth discusses his life, work and legacy in modern-day genetic research. The field of biology owes a great debt to both genetic material and those who study it. From tiny bacteria to colossal giant sequoias, genetic material is the common thread that runs through all ...

  25. Your best friend from high school? Here's why their genes mattered

    Research suggests that peers' genetic makeup may influence health outcomes of their friends. To test this, Salvatore and colleagues used Swedish national data to assess peer social genetic effects ...

  26. Early Career Prize for Scott Waddell

    Postdoctoral Research Fellow Dr Scott Waddell has won a Scottish Universities Life Sciences Alliance (SULSA) award for his work looking at polycystic liver disease (May 2024). Scott, who is part of the Luke Boulter Research Group at the MRC Human Genetics Unit, was a winner in the key life sciences area of Human (with other prizes awarded in the ...

  27. A genome-wide investigation into the underlying genetic ...

    Using genome-wide association studies and meta-analyses on dimensions of personality from large existing datasets, Gupta et al. find novel genomic loci that refine our understanding of the genetic ...

  28. New insights from the last decade of research in psychiatric genetics

    Psychiatric genetics has made substantial progress in the last decade, providing new insights into the genetic etiology of psychiatric disorders, and paving the way for precision psychiatry, in which individual genetic profiles may be used to personalize ...

  29. Postdoctoral Fellow (Marine Invasion Dynamics)

    We are seeking a postdoctoral research scientist to work principally on statistical analysis of large datasets on biodiversity of marine invertebrate communities in coastal bays and estuaries in California and other regions. This is a collaborative project focused specifically on evaluating spatial and temporal patterns of non-native species occurrence, using data from a long-term and large ...

  30. Mendel: From genes to genome

    A study of Mendel's seven characteristics illustrates the main principles of genetics. These include the importance of dominant and recessive phenotypes, gene families, pleiotropy, structural and regulatory gene function, and the various methods available to identify genes.