- Pre-registration nursing students
- No definition of master’s degree in nursing described in the publication
After the search, we collated and uploaded all the identified records into EndNote v.X8 (Clarivate Analytics, Philadelphia, Pennsylvania) and removed any duplicates. Two independent reviewers (MCS and SA) screened the titles and abstracts for assessment in line with the inclusion criteria. They retrieved and assessed the full texts of the selected studies while applying the inclusion criteria. Any disagreements about the eligibility of studies were resolved by discussion or, if no consensus could be reached, by involving experienced researchers (MZ-S and RP).
The first reviewer (MCS) extracted data from the selected publications. For this purpose, an extraction tool developed by the authors was used. This tool comprised the following criteria: author(s), year of publication, country, research question, design, case definition, data sources, and methodologic and data-analysis triangulation. First, we extracted and summarized information about the case study design. Second, we narratively summarized the way in which the data and methodological triangulation were described. Finally, we summarized the information on within-case or cross-case analysis. This process was performed using Microsoft Excel. One reviewer (MCS) extracted data, whereas another reviewer (SA) cross-checked the data extraction, making suggestions for additions or edits. Any disagreements between the reviewers were resolved through discussion.
A total of 149 records were identified in 2 databases. We removed 20 duplicates and screened 129 reports by title and abstract. A total of 46 reports were assessed for eligibility. Through hand searches, we identified 117 additional records. Of these, we excluded 98 reports after title and abstract screening. A total of 17 reports were assessed for eligibility. From the 2 databases and the hand search, 63 reports were assessed for eligibility. Ultimately, we included 8 articles for data extraction. No further articles were included after the reference list screening of the included studies. A PRISMA flow diagram of the study selection and inclusion process is presented in Figure 1. As shown in Tables 2 and 3, the articles included in this scoping review were published between 2010 and 2022 in Canada (n = 3), the United States (n = 2), Australia (n = 2), and Scotland (n = 1).
PRISMA flow diagram.
Characteristics of Articles Included.
Author | Contandriopoulos et al | Flinter | Hogan et al | Hungerford et al | O’Rourke | Roots and MacDonald | Schadewaldt et al | Strachan et al |
---|---|---|---|---|---|---|---|---|
Country | Canada | The United States | The United States | Australia | Canada | Canada | Australia | Scotland |
How or why research question | No information on the research question | Several how or why research questions | What and how research question | No information on the research question | Several how or why research questions | No information on the research question | What research question | What and why research questions |
Design and referenced author of methodological guidance | Six qualitative case studies (Robert K. Yin) | Multiple-case studies design (Robert K. Yin) | Multiple-case studies design (Robert E. Stake) | Case study design (Robert K. Yin) | Qualitative single-case study (Robert K. Yin; Robert E. Stake; Sharan Merriam) | Single-case study design (Robert K. Yin; Sharan Merriam) | Multiple-case studies design (Robert K. Yin; Robert E. Stake) | Multiple-case studies design |
Case definition | Team of health professionals (Small group) | Nurse practitioners (Individuals) | Primary care practices (Organization) | Community-based NP model of practice (Organization) | NP-led practice (Organization) | Primary care practices (Organization) | No information on case definition | Health board (Organization) |
Overview of Within-Method, Between/Across-Method, and Data-Analysis Triangulation.
Author | Contandriopoulos et al | Flinter | Hogan et al | Hungerford et al | O’Rourke | Roots and MacDonald | Schadewaldt et al | Strachan et al |
---|---|---|---|---|---|---|---|---|
Within-method triangulation (use of at least 2 data-collection procedures from the same design approach) ||||||||
Qualitative data-collection procedures ||||||||
Interviews | X | x | x | x | x | |||
Observations | x | x | ||||||
Public documents | x | x | x | |||||
Electronic health records | x | |||||||
Between/across-method (using both qualitative and quantitative data-collection procedures in the same study) | ||||||||
Qualitative data-collection procedures ||||||||
Interviews | x | x | x | |||||
Observations | x | x | ||||||
Public documents | x | x | ||||||
Electronic health records | x | |||||||
Quantitative data-collection procedures ||||||||
Self-assessment | x | |||||||
Service records | x | |||||||
Questionnaires | x | |||||||
Data-analysis triangulation (combination of 2 or more methods of analyzing data) | ||||||||
Triangulation of qualitative and quantitative analysis ||||||||
Qualitative methods of analysis ||||||||
Deductive | x | x | x | |||||
Inductive | x | x | ||||||
Thematic | x | x | ||||||
Content | ||||||||
Quantitative methods of analysis ||||||||
Descriptive analysis | x | x | x | |||||
Triangulation of qualitative methods of analysis ||||||||
Qualitative methods of analysis ||||||||
Deductive | x | x | x | x | ||||
Inductive | x | x | ||||||
Thematic | x | |||||||
Content | x |
The following sections describe the research question, case definition, and case study design. Case studies are most appropriate when asking “how” or “why” questions. 1 According to Yin, 1 how and why questions are explanatory and lead to the use of case studies, histories, and experiments as the preferred research methods. For example, in 1 study from Canada, the following research question was presented: “How and why did stakeholders participate in the system change process that led to the introduction of the first nurse practitioner-led Clinic in Ontario?” (p7) 19 Once the research question has been formulated, the case should be defined and, subsequently, the case study design chosen. 1 In typical case studies with mixed methods, the 2 types of data are gathered concurrently in a convergent design and the results merged to examine a case and/or compare multiple cases. 10
“How” or “why” questions were found in 4 studies. 16 , 17 , 19 , 22 Two studies additionally asked “what” questions. Three studies described an exploratory approach, and 1 study presented an explanatory approach. Of these 4 studies, 3 studies chose a qualitative approach 17 , 19 , 22 and 1 opted for mixed methods with a convergent design. 16
In the remaining studies, either the research questions were not clearly stated or no “how” or “why” questions were formulated. For example, “what” questions were found in 1 study. 21 No information was provided on exploratory, descriptive, and explanatory approaches. Schadewaldt et al 21 chose mixed methods with a convergent design.
A total of 5 studies defined the case as an organizational unit. 17 , 18 - 20 , 22 Of the 8 articles, 4 reported multiple-case studies. 16 , 17 , 22 , 23 Another 2 publications involved single-case studies. 19 , 20 Moreover, 2 publications did not state the case study design explicitly.
This section describes within-method triangulation, which involves employing at least 2 data-collection procedures within the same design approach. 6 , 7 This can also be called data source triangulation. 8 Next, we present the single data-collection procedures in detail. In 5 studies, information on within-method triangulation was found. 15 , 17 - 19 , 22 Studies describing a quantitative approach and the triangulation of 2 or more quantitative data-collection procedures could not be included in this scoping review.
Five studies used qualitative data-collection procedures. Two studies combined face-to-face interviews and public documents. 15 , 19 One study mixed in-depth interviews with observations, 18 and 1 study combined face-to-face interviews and public documents. 22 One study contained face-to-face interviews, observations, and documentation. 17 The combination of different qualitative data-collection procedures was used to present the case context in an authentic and complex way, to elicit the perspectives of the participants, and to obtain a holistic description and explanation of the cases under study.
All 5 studies used qualitative interviews as the primary data-collection procedure. 15 , 17 - 19 , 22 Face-to-face, in-depth, and semi-structured interviews were conducted. The topics covered in the interviews included processes in the introduction of new care services and experiences of barriers and facilitators to collaborative work in general practices. Two studies did not specify the type of interviews conducted and did not report sample questions. 15 , 18
In 2 studies, qualitative observations were carried out. 17 , 18 During the observations, the physical design of clinical patient rooms and office spaces was examined. 17 Hungerford et al 18 did not explain what information was collected during the observations. In neither study was the type of observation specified. Observations were generally recorded as field notes.
In 3 studies, various qualitative public documents were studied. 15 , 19 , 22 These documents included role descriptions, education curricula, governance frameworks, websites, and newspapers containing information about the implementation of the role and general practice. Only 1 study failed to specify the type of document and the data collected. 15
In 1 study, qualitative documentation was investigated. 17 This included a review of dashboards (eg, provider productivity reports or provider quality dashboards in the electronic health record) and quality performance reports (eg, practice-wide or co-management team-wide performance reports).
This section describes between/across-method triangulation, which involves employing both qualitative and quantitative data-collection procedures in the same study. 6 , 7 This procedure can also be denoted “methodologic triangulation.” 8 Subsequently, we present the individual data-collection procedures. In 3 studies, information on between/across-method triangulation was found. 16 , 20 , 21
Three studies used qualitative and quantitative data-collection procedures. One study combined face-to-face interviews, documentation, and self-assessments. 16 One study employed semi-structured interviews, direct observation, documents, and service records, 20 and another study combined face-to-face interviews, non-participant observation, documents, and questionnaires. 23
All 3 studies used qualitative interviews as the primary data-collection procedure. 16 , 20 , 23 Face-to-face and semi-structured interviews were conducted. In the interviews, data were collected on the introduction of new care services and experiences of barriers to and facilitators of collaborative work in general practices.
In 2 studies, direct and non-participant qualitative observations were conducted. 20 , 23 During the observations, the interaction between health professionals or the organization and the clinical context was observed. Observations were generally recorded as field notes.
In 2 studies, various qualitative public documents were examined. 20 , 23 These documents included role descriptions, newspapers, websites, and practice documents (eg, flyers), from which information on the implementation and description of the NP role was collected.
In 1 study, qualitative individual journals were studied. 16 These included reflective journals from NPs, who performed the role in primary health care.
Only 1 study involved quantitative service records. 20 These service records were obtained from the primary care practices and the respective health authorities. They were collected before and after the implementation of an NP role to identify changes in patients’ access to health care, the volume of patients served, and patients’ use of acute care services.
In 2 studies, quantitative questionnaires were used to gather information about the teams’ satisfaction with collaboration. 16 , 21 In 1 study, 3 validated scales were used. The scales measured experience, satisfaction, and belief in the benefits of collaboration. 21 Psychometric performance indicators of these scales were provided. However, the time points of data collection were not specified; similarly, whether the questionnaires were completed online or by hand was not mentioned. A competency self-assessment tool was used in another study. 16 The assessment comprised 70 items and included topics such as health promotion, protection, disease prevention and treatment, the NP-patient relationship, the teaching-coaching function, the professional role, managing and negotiating health care delivery systems, monitoring and ensuring the quality of health care practice, and cultural competence. Psychometric performance indicators were provided. The assessment was completed online with 2 measurement time points (pre self-assessment and post self-assessment).
This section describes data-analysis triangulation, which involves the combination of 2 or more methods of analyzing data. 6 Subsequently, we present within-case analysis and cross-case analysis.
Three studies combined qualitative and quantitative methods of analysis. 16 , 20 , 21 Two studies involved deductive and inductive qualitative analysis, and qualitative data were analyzed thematically. 20 , 21 One used deductive qualitative analysis. 16 The method of analysis was not specified in the studies. Quantitative data were analyzed using descriptive statistics in 3 studies. 16 , 20 , 23 The descriptive statistics comprised the calculation of the mean, median, and frequencies.
Two studies combined deductive and inductive qualitative analysis, 19 , 22 and 2 studies only used deductive qualitative analysis. 15 , 18 Qualitative data were analyzed thematically in 1 study, 22 and data were treated with content analysis in the other. 19 The method of analysis was not specified in the 2 studies.
In 7 studies, a within-case analysis was performed. 15 - 20 , 22 Six studies used qualitative data for the within-case analysis, and 1 study employed qualitative and quantitative data. Data were analyzed separately, consecutively, or in parallel. The themes generated from qualitative data were compared and then summarized. The individual cases were presented mostly as a narrative description. Quantitative data were integrated into the qualitative description with tables and graphs. Qualitative and quantitative data were also presented as a narrative description.
Of the multiple-case studies, 5 carried out cross-case analyses. 15 - 17 , 20 , 22 Three studies described the cross-case analysis using qualitative data. Two studies reported a combination of qualitative and quantitative data for the cross-case analysis. In each multiple-case study, the individual cases were contrasted to identify the differences and similarities between the cases. One study did not specify whether a within-case or a cross-case analysis was conducted. 23
This section describes confirmation or contradiction through qualitative and quantitative data. 1 , 4 Qualitative and quantitative data were reported separately, with little connection between them. As a result, neither confirmation nor contradiction between the 2 types of data could be clearly determined.
In 3 studies, the consistency of the results of different types of qualitative data was highlighted. 16 , 19 , 21 In particular, documentation and interviews or interviews and observations were contrasted:
Both types of data showed that NPs and general practitioners wanted to have more time in common to discuss patient cases and engage in personal exchanges. 21 In addition, the qualitative and quantitative data confirmed the individual progression of NPs from less competent to more competent. 16 One study pointed out that qualitative and quantitative data obtained similar results for the cases. 20 For example, integrating NPs improved patient access by increasing appointment availability.
Although questionnaire results indicated that NPs and general practitioners experienced high levels of collaboration and satisfaction with the collaborative relationship, the qualitative results drew a more ambivalent picture of NPs’ and general practitioners’ experiences with collaboration. 21
The studies included in this scoping review evidenced various research questions. The recommended formats (ie, how or why questions) were not applied consistently. Without such a question, a case study design should not be chosen, because the research question is the major guide for determining the research design. 2 Furthermore, case definitions and designs were applied variably. This lack of standardization is reflected in differences in the reporting of these case studies. Generally, case study research is viewed as allowing much more freedom and flexibility. 5 , 24 However, this flexibility and the lack of uniform specifications lead to confusion.
Methodologic triangulation, as described in the literature, can be somewhat confusing as it can refer to either data-collection methods or research designs. 6 , 8 For example, methodologic triangulation can allude to qualitative and quantitative methods, indicating a paradigmatic connection. Methodologic triangulation can also point to qualitative and quantitative data-collection methods, analysis, and interpretation without specific philosophical stances. 6 , 8 Regarding “data-collection methods with no philosophical stances,” we would recommend using the wording “data source triangulation” instead. Thus, the demarcation between the method and the data-collection procedures will be clearer.
Yin 1 advocated the use of multiple sources of evidence so that a case or cases can be investigated more comprehensively and accurately. Most studies included multiple data-collection procedures. Five studies employed a variety of qualitative data-collection procedures, and 3 studies used qualitative and quantitative data-collection procedures (mixed methods). In contrast, no study contained 2 or more quantitative data-collection procedures. In particular, quantitative data-collection procedures—such as validated, reliable questionnaires, scales, or assessments—were not used exhaustively. The prerequisites for using multiple data-collection procedures are availability, the knowledge and skill of the researcher, and sufficient financial resources. 1 To meet these prerequisites, research teams consisting of members with different levels of training and experience are necessary. Multidisciplinary research teams need to be aware of the strengths and weaknesses of different data sources and collection procedures. 1
When using multiple data sources and analysis methods, it is necessary to present the results in a coherent manner. Although the importance of multiple data sources and analysis has been emphasized, 1 , 5 the description of triangulation has tended to be brief. Thus, traceability of the research process is not always ensured. The sparse description of the data-analysis triangulation procedure may be due to the limited number of words in publications or the complexity involved in merging the different data sources.
Only a few concrete recommendations were found regarding how to operationalize data-analysis triangulation with qualitative data. 25 A total of 3 approaches have been proposed 25 : (1) the intuitive approach, in which researchers intuitively connect information from different data sources; (2) the procedural approach, in which each comparative or contrasting step in triangulation is documented to ensure transparency and replicability; and (3) the intersubjective approach, which necessitates a group of researchers agreeing on the steps in the triangulation process. For each case study, one of these 3 approaches needs to be selected, carefully carried out, and documented. Thus, in-depth examination of the data can take place. Farmer et al 25 concluded that most researchers take the intuitive approach; therefore, triangulation is not clearly articulated. This trend is also evident in our scoping review.
Few studies in this scoping review used a combination of qualitative and quantitative analysis. However, creating a comprehensive stand-alone picture of a case from both qualitative and quantitative methods is challenging. Findings derived from different data types may not automatically coalesce into a coherent whole. 4 O’Cathain et al 26 described 3 techniques for combining the results of qualitative and quantitative methods: (1) developing a triangulation protocol; (2) following a thread by selecting a theme from 1 component and following it across the other components; and (3) developing a mixed-methods matrix.
The triangulation protocol provides the most detailed description of how to conduct triangulation. It takes place at the interpretation stage of the research process. 26 This protocol was developed for multiple qualitative data sources but can also be applied to a combination of qualitative and quantitative data. 25 , 26 For each theme, it is possible to determine agreement, partial agreement, “silence,” or dissonance between the results of the qualitative and quantitative data. The protocol is intended to bring together the various themes from the qualitative and quantitative results and identify overarching meta-themes. 25 , 26
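To illustrate what a documented protocol might look like in practice, the following sketch arranges themes in a convergence coding matrix in R and records one judgment per theme. The themes, findings, and judgments are invented for illustration; they are not drawn from the reviewed studies.

```r
# Minimal sketch of a triangulation protocol as a convergence coding matrix.
# All themes and convergence judgments below are hypothetical examples.
protocol <- data.frame(
  theme        = c("Collaboration climate", "Role clarity", "Patient access"),
  qualitative  = c("NPs wished for more shared discussion time",
                   "Role boundaries described as blurred",
                   "Interviewees perceived easier appointments"),
  quantitative = c("High satisfaction scores on collaboration scale",
                   "No questionnaire item addressed role clarity",
                   "Appointment volume increased after NP introduction"),
  # One judgment per theme: agreement, partial agreement, silence, or dissonance
  convergence  = c("dissonance", "silence", "agreement")
)
print(protocol)

# Overarching meta-themes can then be derived from rows judged "agreement".
subset(protocol, convergence == "agreement")
```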
The “following a thread” technique is used in the analysis stage of the research process. To begin, each data source is analyzed to identify the most important themes that need further investigation. Subsequently, the research team selects 1 theme from 1 data source and follows it up in the other data source, thereby creating a thread. The individual steps of this technique are not specified. 26 , 27
A mixed-methods matrix is used at the end of the analysis. 26 All the data collected on a defined case are examined together in 1 large matrix, paying attention to cases rather than variables or themes. In a mixed-methods matrix (eg, a table), the rows represent the cases for which both qualitative and quantitative data exist. The columns show the findings for each case. This technique allows the research team to look for congruency, surprises, and paradoxes among the findings as well as patterns across multiple cases. In our review, we identified only one of these 3 approaches in the study by Roots and MacDonald. 20 These authors mentioned that a causal network analysis was performed using a matrix. However, no further details were given, and reference was made to a later publication. We could not find this publication.
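As an illustration of the technique, the R sketch below mocks up a small mixed-methods matrix; the cases, findings, and scores are invented, not taken from the included studies.

```r
# Hypothetical mixed-methods matrix: one row per case,
# one column per qualitative or quantitative finding.
mm_matrix <- data.frame(
  case              = c("Practice A", "Practice B", "Practice C"),
  interview_themes  = c("High collaboration", "Role conflict", "High collaboration"),
  observation_notes = c("Shared huddles observed", "NP works separately",
                        "Shared huddles observed"),
  survey_mean_score = c(4.2, 2.8, 4.5)  # eg, satisfaction with collaboration (1-5)
)
mm_matrix

# Reading across a row reveals congruency, surprises, or paradoxes within a case;
# reading down a column reveals patterns across cases.
```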
Because it focused on the implementation of NPs in primary health care, the setting of this scoping review was narrow. However, triangulation is essential for research in this area. This type of research was found to provide a good basis for understanding methodologic and data-analysis triangulation. Despite the lack of traceability in the description of the data and methodological triangulation, we believe that case studies are an appropriate design for exploring new nursing roles in existing health care systems. This is evidenced by the fact that case study research is widely used in many social science disciplines as well as in professional practice. 1 To strengthen this research method and increase traceability in the research process, we recommend using the reporting guideline and reporting checklist by Rodgers et al. 9 This reporting checklist needs to be complemented with items on methodologic and data-analysis triangulation. A procedural approach needs to be followed in which each comparative step of the triangulation is documented. 25 A triangulation protocol or a mixed-methods matrix can be used for this purpose. 26 If a publication is subject to a word limit, the triangulation protocol or mixed-methods matrix used should at least be identified, even if it cannot be reproduced in full. A schematic representation of methodologic and data-analysis triangulation in case studies can be found in Figure 2.
Schematic representation of methodologic and data-analysis triangulation in case studies (own work).
This study suffered from several limitations that must be acknowledged. Given the nature of scoping reviews, we did not analyze the evidence reported in the studies. However, 2 reviewers independently reviewed all the full-text reports with respect to the inclusion criteria. The focus on the primary care setting with NPs (master’s degree) was very narrow, and only a few studies qualified. Thus, possible important methodological aspects that would have contributed to answering the questions were omitted. Studies describing the triangulation of 2 or more quantitative data-collection procedures could not be included in this scoping review due to the inclusion and exclusion criteria.
Given the various processes described for methodologic and data-analysis triangulation, we can conclude that triangulation in case studies is poorly standardized. Consequently, the traceability of the research process is not always given. Triangulation is complicated by the confusion of terminology. To advance case study research in nursing, we encourage authors to reflect critically on methodologic and data-analysis triangulation and use existing tools, such as the triangulation protocol or mixed-methods matrix and the reporting guideline checklist by Rodgers et al, 9 to ensure more transparent reporting.
Acknowledgments.
The authors thank Simona Aeschlimann for her support during the screening process.
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material: Supplemental material for this article is available online.
Writing a Case Study
What is a case study?
A case study is an in-depth investigation of a person, group, organization, or event within its real-life context.
What are the different types of case studies?
Type | Purpose | Example Research Question
---|---|---
Descriptive | Allows the researcher to describe a phenomenon in detail, as it occurs naturally, within its real-life context | How has the implementation and use of the instructional coaching intervention for elementary teachers impacted students’ attitudes toward reading?
Explanatory | Allows the researcher to clarify how or why certain phenomena occur, including causal relationships | Why do differences exist when implementing the same online reading curriculum in three elementary classrooms?
Exploratory | Allows the researcher to investigate an understudied phenomenon and identify key issues or questions for further research | What are potential barriers to students’ reading success when middle school teachers implement the Ready Reader curriculum online?
Multiple Case Studies or Collective Case Study | Allows the researcher to study several cases simultaneously to investigate a more general phenomenon | How are individual school districts addressing student engagement in an online classroom?
Intrinsic | Allows the researcher to examine a case in which they have an inherent interest | How does a student’s familial background influence a teacher’s ability to provide meaningful instruction?
Instrumental | Allows the researcher to use a case to gain insight into a broader issue or phenomenon | How did a rural school district’s integration of a reward system maximize student engagement?
Note: These are the primary types of case studies. As you continue to research and learn about case studies, you will find a more robust list of types.
Who are your case study participants?
Participant type | Description
---|---
Individual | This type of study is implemented to understand an individual by developing a detailed explanation of the individual’s lived experiences or perceptions.
Group | This type of study is implemented to explore a particular group of people’s perceptions.
Organization | This type of study is implemented to explore the perspectives of people who work for or had interaction with a specific organization or company.
Event | This type of study is implemented to explore participants’ perceptions of an event.
What is triangulation?
Validity and credibility are essential parts of a case study. Therefore, the researcher should include triangulation to ensure trustworthiness while accurately reflecting what the study seeks to investigate.
How to write a case study?
When developing a case study, there are different ways you could present the information, but remember to include the five parts for your case study.
Case studies are essential to qualitative research, offering a lens through which researchers can investigate complex phenomena within their real-life contexts. This chapter explores the concept, purpose, applications, examples, and types of case studies and provides guidance on how to conduct case study research effectively.
Whereas quantitative methods look at phenomena at scale, case study research looks at a concept or phenomenon in considerable detail. While analyzing a single case can help understand one perspective regarding the object of research inquiry, analyzing multiple cases can help obtain a more holistic sense of the topic or issue. Let's provide a basic definition of a case study, then explore its characteristics and role in the qualitative research process.
A case study in qualitative research is a strategy of inquiry that involves an in-depth investigation of a phenomenon within its real-world context. It provides researchers with the opportunity to acquire an in-depth understanding of intricate details that might not be as apparent or accessible through other methods of research. The specific case or cases being studied can be a single person, group, or organization – demarcating what constitutes a relevant case worth studying depends on the researcher and their research question.
Among qualitative research methods, a case study relies on multiple sources of evidence, such as documents, artifacts, interviews, or observations, to present a complete and nuanced understanding of the phenomenon under investigation. The objective is to illuminate the readers' understanding of the phenomenon beyond its abstract statistical or theoretical explanations.
Case studies typically possess a number of distinct characteristics that set them apart from other research methods. These characteristics include a focus on holistic description and explanation, flexibility in the design and data collection methods, reliance on multiple sources of evidence, and emphasis on the context in which the phenomenon occurs.
Furthermore, case studies can often involve a longitudinal examination of the case, meaning they study the case over a period of time. These characteristics allow case studies to yield comprehensive, in-depth, and richly contextualized insights about the phenomenon of interest.
Case studies hold a unique position in the broader landscape of research methods aimed at theory development. They are instrumental when the primary research interest is to gain an intensive, detailed understanding of a phenomenon in its real-life context.
In addition, case studies can serve different purposes within research - they can be used for exploratory, descriptive, or explanatory purposes, depending on the research question and objectives. This flexibility and depth make case studies a valuable tool in the toolkit of qualitative researchers.
Remember, a well-conducted case study can offer a rich, insightful contribution to both academic and practical knowledge through theory development or theory verification, thus enhancing our understanding of complex phenomena in their real-world contexts.
Case study research aims for a more comprehensive understanding of phenomena, requiring various research methods to gather information for qualitative analysis. Ultimately, a case study can allow the researcher to gain insight into a particular object of inquiry and develop a theoretical framework relevant to the research inquiry.
Using case studies as a research strategy depends mainly on the nature of the research question and the researcher's access to the data.
Conducting case study research provides a level of detail and contextual richness that other research methods might not offer. They are beneficial when there's a need to understand complex social phenomena within their natural contexts.
Case studies can take on various roles depending on the research objectives. They can be exploratory when the research aims to discover new phenomena or define new research questions; they are descriptive when the objective is to depict a phenomenon within its context in a detailed manner; and they can be explanatory if the goal is to understand specific relationships within the studied context. Thus, the versatility of case studies allows researchers to approach their topic from different angles, offering multiple ways to uncover and interpret the data.
Case studies play a significant role in knowledge development across various disciplines. Analysis of cases provides an avenue for researchers to explore phenomena within their context based on the collected data.
This can result in the production of rich, practical insights that can be instrumental in both theory-building and practice. Case studies allow researchers to delve into the intricacies and complexities of real-life situations, uncovering insights that might otherwise remain hidden.
In qualitative research, a case study is not a one-size-fits-all approach. Depending on the nature of the research question and the specific objectives of the study, researchers might choose to use different types of case studies. These types differ in their focus, methodology, and the level of detail they provide about the phenomenon under investigation.
Understanding these types is crucial for selecting the most appropriate approach for your research project and effectively achieving your research goals. Let's briefly look at the main types of case studies.
Exploratory case studies are typically conducted to develop a theory or framework around an understudied phenomenon. They can also serve as a precursor to a larger-scale research project. Exploratory case studies are useful when a researcher wants to identify the key issues or questions which can spur more extensive study or be used to develop propositions for further research. These case studies are characterized by flexibility, allowing researchers to explore various aspects of a phenomenon as they emerge, which can also form the foundation for subsequent studies.
Descriptive case studies aim to provide a complete and accurate representation of a phenomenon or event within its context. These case studies are often based on an established theoretical framework, which guides how data is collected and analyzed. The researcher is concerned with describing the phenomenon in detail, as it occurs naturally, without trying to influence or manipulate it.
Explanatory case studies are focused on explanation - they seek to clarify how or why certain phenomena occur. Often used in complex, real-life situations, they can be particularly valuable in clarifying causal relationships among concepts and understanding the interplay between different factors within a specific context.
These three categories of case studies focus on the nature and purpose of the study. An intrinsic case study is conducted when a researcher has an inherent interest in the case itself. Instrumental case studies are employed when the case is used to provide insight into a particular issue or phenomenon. A collective case study, on the other hand, involves studying multiple cases simultaneously to investigate some general phenomena.
Each type of case study serves a different purpose and has its own strengths and challenges. The selection of the type should be guided by the research question and objectives, as well as the context and constraints of the research.
The flexibility, depth, and contextual richness offered by case studies make this approach an excellent research method for various fields of study. They enable researchers to investigate real-world phenomena within their specific contexts, capturing nuances that other research methods might miss. Across numerous fields, case studies provide valuable insights into complex issues.
Case studies provide a detailed understanding of the role and impact of information systems in different contexts. They offer a platform to explore how information systems are designed, implemented, and used and how they interact with various social, economic, and political factors. Case studies in this field often focus on examining the intricate relationship between technology, organizational processes, and user behavior, helping to uncover insights that can inform better system design and implementation.
Health research is another field where case studies are highly valuable. They offer a way to explore patient experiences, healthcare delivery processes, and the impact of various interventions in a real-world context.
Case studies can provide a deep understanding of a patient's journey, giving insights into the intricacies of disease progression, treatment effects, and the psychosocial aspects of health and illness.
Specifically within medical research, studies on asthma often employ case studies to explore the individual and environmental factors that influence asthma development, management, and outcomes. A case study can provide rich, detailed data about individual patients' experiences, from the triggers and symptoms they experience to the effectiveness of various management strategies. This can be crucial for developing patient-centered asthma care approaches.
Apart from the fields mentioned, case studies are also extensively used in business and management research, education research, and political sciences, among many others. They provide an opportunity to delve into the intricacies of real-world situations, allowing for a comprehensive understanding of various phenomena.
Case studies, with their depth and contextual focus, offer unique insights across these varied fields. They allow researchers to illuminate the complexities of real-life situations, contributing to both theory and practice.
Understanding the key elements of case study design is crucial for conducting rigorous and impactful case study research. A well-structured design guides the researcher through the process, ensuring that the study is methodologically sound and its findings are reliable and valid. The main elements of case study design include the research question, propositions, units of analysis, and the logic linking the data to the propositions.
The research question is the foundation of any research study. A good research question guides the direction of the study and informs the selection of the case, the methods of collecting data, and the analysis techniques. A well-formulated research question in case study research is typically clear, focused, and complex enough to merit further detailed examination of the relevant case(s).
Propositions, though not necessary in every case study, provide a direction by stating what we might expect to find in the data collected. They guide how data is collected and analyzed by helping researchers focus on specific aspects of the case. They are particularly important in explanatory case studies, which seek to understand the relationships among concepts within the studied phenomenon.
The unit of analysis refers to the case, or the main entity or entities that are being analyzed in the study. In case study research, the unit of analysis can be an individual, a group, an organization, a decision, an event, or even a time period. It's crucial to clearly define the unit of analysis, as it shapes the qualitative data analysis process by allowing the researcher to analyze a particular case and synthesize analysis across multiple case studies to draw conclusions.
This refers to the inferential model that allows researchers to draw conclusions from the data. The researcher needs to ensure that there is a clear link between the data, the propositions (if any), and the conclusions drawn. This argumentation is what enables the researcher to make valid and credible inferences about the phenomenon under study.
Understanding and carefully considering these elements in the design phase of a case study can significantly enhance the quality of the research. It can help ensure that the study is methodologically sound and its findings contribute meaningful insights about the case.
Conducting a case study involves several steps, from defining the research question and selecting the case to collecting and analyzing data. This section outlines these key stages, providing a practical guide on how to conduct case study research.
The first step in case study research is defining a clear, focused research question. This question should guide the entire research process, from case selection to analysis. It's crucial to ensure that the research question is suitable for a case study approach. Typically, such questions are exploratory or descriptive in nature and focus on understanding a phenomenon within its real-life context.
The selection of the case should be based on the research question and the objectives of the study. It involves choosing a unique example or a set of examples that provide rich, in-depth data about the phenomenon under investigation. After selecting the case, it's crucial to define it clearly, setting the boundaries of the case, including the time period and the specific context.
Previous research can help guide the case study design. An example of a case taken from earlier case study research can be used to define cases in a new research inquiry, and considering recently published examples can show how to select and define cases effectively.
A case study protocol outlines the procedures and general rules to be followed during the case study. This includes the data collection methods to be used, the sources of data, and the procedures for analysis. Having a detailed case study protocol ensures consistency and reliability in the study.
The protocol should also consider how to work with the people involved in the research context to grant the research team access to collecting data. As mentioned in previous sections of this guide, establishing rapport is an essential component of qualitative research as it shapes the overall potential for collecting and analyzing data.
Gathering data in case study research often involves multiple sources of evidence, including documents, archival records, interviews, observations, and physical artifacts. This allows for a comprehensive understanding of the case. The process for gathering data should be systematic and carefully documented to ensure the reliability and validity of the study.
The next step is analyzing the data. This involves organizing the data, categorizing it into themes or patterns, and interpreting these patterns to answer the research question. The analysis might also involve comparing the findings with prior research or theoretical propositions.
The final step is writing the case study report. This should provide a detailed description of the case, the data, the analysis process, and the findings. The report should be clear, organized, and carefully written to ensure that the reader can understand the case and the conclusions drawn from it.
Each of these steps is crucial in ensuring that the case study research is rigorous, reliable, and provides valuable insights about the case.
The type, depth, and quality of data in your study can significantly influence the validity and utility of the study. In case study research, data is usually collected from multiple sources to provide a comprehensive and nuanced understanding of the case. This section will outline the various methods of collecting data used in case study research and discuss considerations for ensuring the quality of the data.
Interviews are a common method of gathering data in case study research. They can provide rich, in-depth data about the perspectives, experiences, and interpretations of the individuals involved in the case. Interviews can be structured, semi-structured, or unstructured, depending on the research question and the degree of flexibility needed.
Observations involve the researcher observing the case in its natural setting, providing first-hand information about the case and its context. Observations can provide data that might not be revealed in interviews or documents, such as non-verbal cues or contextual information.
Documents and archival records provide a valuable source of data in case study research. They can include reports, letters, memos, meeting minutes, email correspondence, and various public and private documents related to the case.
These records can provide historical context, corroborate evidence from other sources, and offer insights into the case that might not be apparent from interviews or observations.
Physical artifacts refer to any physical evidence related to the case, such as tools, products, or physical environments. These artifacts can provide tangible insights into the case, complementing the data gathered from other sources.
Determining the quality of data in case study research requires careful planning and execution. It's crucial to ensure that the data is reliable, accurate, and relevant to the research question. This involves selecting appropriate methods of collecting data, properly training interviewers or observers, and systematically recording and storing the data. It also includes considering ethical issues related to collecting and handling data, such as obtaining informed consent and ensuring the privacy and confidentiality of the participants.
Analyzing case study research involves making sense of the rich, detailed data to answer the research question. This process can be challenging due to the volume and complexity of case study data. However, a systematic and rigorous approach to analysis can ensure that the findings are credible and meaningful. This section outlines the main steps and considerations in analyzing data in case study research.
The first step in the analysis is organizing the data. This involves sorting the data into manageable sections, often according to the data source or the theme. This step can also involve transcribing interviews, digitizing physical artifacts, or organizing observational data.
Once the data is organized, the next step is to categorize or code the data. This involves identifying common themes, patterns, or concepts in the data and assigning codes to relevant data segments. Coding can be done manually or with qualitative analysis software, which can greatly facilitate the entire coding process. Coding helps to reduce the data to a set of themes or categories that can be more easily analyzed.
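As a toy example of this coding step, the R sketch below assigns codes to hypothetical interview segments and tallies them into candidate themes; dedicated qualitative analysis software automates the same bookkeeping at scale.

```r
# Hypothetical coded interview segments (all text and codes invented).
segments <- data.frame(
  case    = c("Case 1", "Case 1", "Case 2", "Case 2", "Case 2"),
  segment = c("We never have time to meet",
              "The huddle helps us align",
              "I am unsure who owns follow-up",
              "Scheduling is much easier now",
              "We meet weekly to discuss patients"),
  code    = c("time pressure", "team communication", "role ambiguity",
              "patient access", "team communication")
)

# Tally codes overall and per case to surface candidate themes.
table(segments$code)
table(segments$case, segments$code)
```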
After coding the data, the researcher looks for patterns or themes in the coded data. This involves comparing and contrasting the codes and looking for relationships or patterns among them. The identified patterns and themes should help answer the research question.
Once patterns and themes have been identified, the next step is to interpret these findings. This involves explaining what the patterns or themes mean in the context of the research question and the case. This interpretation should be grounded in the data, but it can also involve drawing on theoretical concepts or prior research.
The last step in the analysis is verification. This involves checking the accuracy and consistency of the analysis process and confirming that the findings are supported by the data. This can involve re-checking the original data, checking the consistency of codes, or seeking feedback from research participants or peers.
Like any research method, case study research has its strengths and limitations. Researchers must be aware of these, as they can influence the design, conduct, and interpretation of the study.
Understanding the strengths and limitations of case study research can also guide researchers in deciding whether this approach is suitable for their research question. This section outlines some of the key strengths and limitations of case study research.
Benefits include the following:
On the other hand, researchers should consider the following limitations:
Being aware of these strengths and limitations can help researchers design and conduct case study research effectively and interpret and report the findings appropriately.
Introduction.
A case-control study is used to see if exposure is linked to a certain result (i.e., disease or condition of interest). Case-control research is always retrospective by definition since it starts with a result and then goes back to look at exposures. The investigator already knows the result of each participant when they are enrolled in their separate groups. Case-control studies are retrospective because of this, not because the investigator frequently uses previously gathered data. This article discusses statistical analysis in case-control studies.
Advantages and Disadvantages of Case-Control Studies
Participants in a case-control study are chosen for the study depending on their outcome status. As a result, some individuals have the outcome of interest (referred to as cases), while others do not (referred to as controls). After that, the investigator evaluates the exposure in both groups. Consequently, in case-control research, the outcome must occur in at least some participants. Thus, as shown in Figure 1, some enrolled participants have the outcome and others do not.
Figure 1. Example of a case-control study [1]
The cases should be defined as precisely as feasible by the investigator. A disease’s definition may be based on many criteria at times; hence, all aspects should be fully specified in the case definition.
Selection of a control
Controls that are comparable to the cases in a variety of ways should be chosen. The matching criteria are the parameters (e.g., age, sex, and hospitalization time) used to establish how controls and cases should be similar. For instance, it would be unfair to compare patients with elective intraocular surgery to a group of controls with traumatic corneal lacerations. Another key feature of a case-control study is that the exposure in both cases and controls should be measured equally.
Though controls should be similar to cases in many respects, it is possible to over-match. Over-matching might make it harder to identify enough controls. Furthermore, once a matching variable is chosen, it cannot be analyzed as a risk factor. Enrolling more than one control for each case is an effective method for increasing the power of a study. However, incorporating more than two controls per case adds little statistical value.
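To make the matching idea concrete, the R sketch below pairs each case with one control of the same sex whose age falls within ±5 years, sampling without replacement. It is a minimal illustration with invented data and hypothetical variable names, not a production matching algorithm.

```r
set.seed(42)  # reproducible sampling of the invented control pool

# Hypothetical pools of cases and potential controls.
cases    <- data.frame(id = 1:3, sex = c("F", "M", "F"), age = c(34, 51, 47))
controls <- data.frame(id = 101:120,
                       sex = sample(c("F", "M"), 20, replace = TRUE),
                       age = sample(25:65, 20, replace = TRUE))

# Greedy 1:1 matching on sex and age within +/- 5 years.
matched <- data.frame()
for (i in seq_len(nrow(cases))) {
  eligible <- subset(controls, sex == cases$sex[i] & abs(age - cases$age[i]) <= 5)
  if (nrow(eligible) > 0) {
    pick     <- eligible[1, ]
    controls <- subset(controls, id != pick$id)  # use each control only once
    matched  <- rbind(matched,
                      data.frame(case_id = cases$id[i], control_id = pick$id))
  }
}
matched
```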
Data collection
Decide on the data to be gathered after precisely identifying the cases and controls; the same data must be obtained from both groups in the same manner. If the search for primary risk variables is not conducted objectively, the study may suffer from researcher bias, especially because the outcome is already known. It is crucial to try to hide the outcome status from the person collecting risk-factor data or interviewing patients, even if this is not always practicable. Patients may be asked questions concerning historical issues (such as smoking history, diet, usage of conventional eye medications, and so on). For some people, precisely recalling all of this information may be challenging.
Furthermore, patients who get the result (cases) are more likely to recall specifics of unfavourable experiences than controls. Recall bias is a term for this phenomenon. Any effort made by the researcher to reduce this form of bias would benefit the research.
The frequency of each of the measured variables in each of the two groups is computed in the analysis. Case-control studies produce the odds ratio to measure the strength of the link between exposure and the outcome. An odds ratio is the ratio of the odds of exposure in the case group to the odds of exposure in the control group. Calculating a confidence interval for each odds ratio is critical. A confidence interval that includes 1.0 indicates that the link between the exposure and the outcome might have been found by chance alone and that the link is not statistically significant. Without a confidence interval, an odds ratio isn’t particularly useful. Computer programs are typically used to do these computations. Because no measurements are taken in a population-based sample, case-control studies cannot give any information regarding the incidence or prevalence of a disease.
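As a minimal illustration of these computations, the following R sketch calculates an odds ratio and its 95% confidence interval from an invented 2×2 table. The counts are hypothetical, and the log-based (Woolf) interval shown is only one of several possible methods.

```r
# Hypothetical 2x2 table (all counts invented for illustration).
case_exp <- 40;  case_unexp <- 60   # cases:    exposed / unexposed
ctrl_exp <- 20;  ctrl_unexp <- 80   # controls: exposed / unexposed

# Odds ratio: odds of exposure in cases / odds of exposure in controls.
or <- (case_exp / case_unexp) / (ctrl_exp / ctrl_unexp)

# 95% confidence interval on the log-odds-ratio scale (Woolf method).
se <- sqrt(1/case_exp + 1/case_unexp + 1/ctrl_exp + 1/ctrl_unexp)
ci <- exp(log(or) + c(-1, 1) * qnorm(0.975) * se)

round(c(odds_ratio = or, lower95 = ci[1], upper95 = ci[2]), 2)
# If the interval contains 1.0, the association is not statistically
# significant at the 5% level.
```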
Risk Factors and Sampling
Case-control studies can also be used to investigate risk factors for a rare disease. Cases might be obtained from hospital records. Patients who present to the hospital, on the other hand, may not be typical of the general community. The selection of an appropriate control group may provide challenges. Patients from the same hospital who do not have the result are a common source of controls. However, hospitalized patients may not always reflect the broader population; they are more likely to have health issues and access the healthcare system.
i) Risk factors related to multiple sclerosis in Kuwait
This matched case-control study in Kuwait looked at the relationship between several variables (family history, stressful life events, tobacco smoke exposure, vaccination history, and comorbidity) and multiple sclerosis (MS) risk. Cases were recruited from Ibn Sina Hospital’s neurology clinics and the Dasman Diabetes Institute’s MS clinic. Controls were chosen from among Kuwait University’s faculty and students. A generalized questionnaire was used to collect data on socio-demographic, possibly genetic, and environmental aspects from each patient and his/her pair-matched control. Descriptive statistics were produced, including means and standard deviations for quantitative variables and frequencies for qualitative variables. Variables that were substantially (p ≤ 0.15) associated with MS status in the univariable conditional logistic regression analysis were evaluated for inclusion in the final multivariable conditional logistic regression model. In this case-control study, 112 MS patients were invited to participate, and 110 (98.2%) agreed. Therefore, 110 MS patients and 110 control participants were enlisted, with each control individually matched to a case (1:1) on age (±5 years), gender, and nationality (Fig. 1). The findings revealed that having a family history of MS was significantly associated with an increased risk of developing MS. In contrast, vaccination against influenza A and B viruses provided significant protection against MS.
Figure 1. Flow chart on the enrollment of the MS cases and controls [2]
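For matched pairs like these, conditional logistic regression keeps each case tied to its matched control via a stratum identifier. Below is a minimal R sketch using the `survival` package; the data frame `ms_data` and all variable names are assumptions for illustration, not the study’s actual code or variables.

```r
library(survival)

# Assumed data layout (hypothetical): one row per participant, with
#   case         1 = MS case, 0 = matched control
#   pair_id      identifier shared by each matched case-control pair
#   family_hx    family history of MS (1/0)
#   flu_vaccine  vaccinated against influenza A and B (1/0)

# Univariable screening: variables with p <= 0.15 become candidates
# for the multivariable model.
uni <- clogit(case ~ family_hx + strata(pair_id), data = ms_data)
summary(uni)

# Final multivariable conditional logistic regression.
fit <- clogit(case ~ family_hx + flu_vaccine + strata(pair_id), data = ms_data)
summary(fit)    # coefficients on the log-odds-ratio scale
exp(coef(fit))  # matched odds ratios
```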
ii) Relation between periodontitis and COVID-19 infection
COVID-19 is linked to a heightened inflammatory response, which can be deadly. Periodontitis is characterized by systemic inflammation. In Qatar, patients with COVID-19 were identified from Hamad Medical Corporation’s (HMC) national electronic health records. Patients with COVID-19 complications (death, ICU admission, or assisted ventilation) were categorized as cases, while COVID-19 patients discharged without severe complications were categorized as controls. There was no control matching because all controls were included in the analysis. Periodontal condition was evaluated using dental radiographs from the same database. The associations between periodontitis and COVID-19 complications were investigated using logistic regression models adjusted for demographic, medical, and behavioural variables. Of the 568 participants, 258 had periodontitis. Among the 258 patients with periodontitis, 33 had COVID-19 complications, whereas only 7 of the 310 patients without periodontitis did. Table 2 shows the unadjusted and adjusted odds ratios and 95% confidence intervals for the relationship between periodontitis and COVID-19 complications. Periodontitis was shown to be substantially related to a greater risk of COVID-19 complications, such as ICU admission, the requirement for assisted ventilation, and mortality, as well as higher blood levels of markers associated with a poor COVID-19 outcome, such as D-dimer, WBC, and CRP.
Table 2. Associations between periodontal condition and COVID-19 complications [3]
iii) Menstrual, reproductive and hormonal factors and thyroid cancer
The relationships between menstrual, reproductive, and hormonal variables and thyroid cancer incidence in a population of Chinese women were investigated in this study. A 1:1 matched hospital-based case-control study was conducted in 7 counties of Zhejiang Province to investigate the correlations of diabetes mellitus and other variables with thyroid cancer. Case participants were eligible if they were diagnosed with primary thyroid cancer for the first time in a hospital between July 2015 and December 2017. The patients and controls in this research were chosen at random. At enrollment, the interviewer gathered all essential information face-to-face using a customized questionnaire. Descriptive statistics, including frequencies and percentages, were used to characterize the baseline characteristics of female participants. To investigate the connections between the variables and thyroid cancer, univariate conditional logistic regression models were used. Four multivariable conditional logistic regression models adjusted for covariates were then used to investigate the relationships between menstrual, reproductive, and hormonal variables and thyroid cancer. In all, 2937 pairs of participants took part in the case-control research. The findings revealed that a later age at first pregnancy and a longer duration of breastfeeding were substantially linked with a lower occurrence of thyroid cancer, which might shed light on the aetiology, monitoring, and prevention of thyroid cancer in Chinese women [4].
It is important to note that the term “case-control study” is commonly misused. A study that starts with a group of people exposed to something and a comparison (control) group who have not been exposed, and then follows both groups over time to see what occurs, is a cohort study, not a case-control study. Case-control studies are frequently viewed as less valuable because they are retrospective. They can, however, be a highly effective way of detecting an association between an exposure and an outcome, and they are sometimes the only ethical approach to studying such a relationship. Case-control studies can provide robust information if case definitions, control selection, and the potential for bias are carefully considered.
[1] Setia MS. Methodology series module 2: Case-control studies. Indian Journal of Dermatology. 2016;61(2):146-151. doi:10.4103/0019-5154.177773
[2] El-Muzaini H, Akhtar S, Alroughani R. A matched case-control study of risk factors associated with multiple sclerosis in Kuwait. BMC Neurology. 2020;20:64. https://doi.org/10.1186/s
[3] Marouf N, Cai W, Said KN, Daas H, Diab H, Chinta VR, Ait Hssain A, Nicolau B, Sanz M, Tamimi F. Association between periodontitis and severity of COVID-19 infection: A case-control study. Journal of Clinical Periodontology. 2021;48(4):483-491.
[4] Wang M, Gong WW, He QF, Hu RY, Yu M. Menstrual, reproductive and hormonal factors and thyroid cancer: a hospital-based case-control study in China. BMC Women's Health. 2021;21(1):1-8.
Suggested citation:
Ziller, Conrad (2024). Introduction to Statistics and Data Analysis – A Case-Based Approach. Available online at https://bookdown.org/conradziller/introstatistics
To download the R-Scripts and data used in this book, go HERE .
This short book is a complete introduction to statistics and data analysis using R and RStudio. It contains hands-on exercises with real data—mostly from social sciences. In addition, this book presents four key ingredients of statistical data analysis (univariate statistics, bivariate statistics, statistical inference, and regression analysis) as brief case studies. The motivation for this was to provide students with practical cases that help them navigate new concepts and serve as an anchor for recalling the acquired knowledge in exams or while conducting their own data analysis.
The case-study logic is intended to increase motivation for engaging with the materials. As we all know, academic teaching is not the same as before the pandemic: students are (rightfully) increasingly resistant to chalk-and-talk teaching, and we have all developed social-media habits that have considerably shortened our ability to concentrate. This poses challenges for academic teaching in general, and for complex content such as statistics and data science in particular.
This book consists of four case studies that provide a short yet comprehensive introduction to statistics and data analysis. The examples are based on real data from official statistics and publicly available surveys. While each case study follows its own logic, I advise reading them consecutively. The goal is to give readers an opportunity to learn independently and to build a solid foundation of hands-on knowledge of statistics and data analysis. Each case study contains questions that can be answered in the boxes provided, and the solutions can be viewed below the boxes (by clicking on the arrow next to the word “solution”). It is advisable to save your answers in a separate document, because box content is not saved and cannot be accessed after reloading the book page.
A working sheet with questions, answer boxes, and solutions can be downloaded together with the R-Scripts HERE. You can read this book online for free. Copies in printable format may be ordered from the author.
This book can be used for teaching by university instructors, who may use the data examples and analyses provided here as illustrations in lectures (acknowledging the source). It can be used for self-study by anyone who wants to acquire foundational knowledge of statistics and practical skills in data analysis, and the materials can also serve as a refresher on statistical foundations.
Beginners in R and RStudio are advised to install the programs via https://posit.co/download/rstudio-desktop/ and to download the materials from HERE. The scripts can then be executed while reading the book. This helps you get familiar with statistical analysis, and it is just an awesome feeling to get your own script running! (On the downside, it is completely normal and part of the process that code for statistical analysis sometimes does not work. This is what help boards across the web and, more recently, ChatGPT are for. Just google your problem and keep trying; it is, as always, 20% inspiration and 80% consistency.)
The book contains four case studies, each showcasing unique statistical and data-analysis-related techniques.
Section 2 contains material on the analysis of one variable. It presents measures of typical values (e.g., the mean) and the distribution of data.
Section 3 contains material on the analysis of the relationship between two variables, including cross tabs and correlations.
Section 4 introduces the concept of statistical inference, which refers to inferring population characteristics from a random sample. It also covers the concepts of hypothesis testing, confidence intervals, and statistical significance.
Section 5 covers how to conduct multiple regression analysis and interpret the corresponding results. Multiple regression investigates the relationship between an outcome variable (e.g., beliefs about justice) and multiple variables that represent different competing explanations for the outcome.
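As a small taste of what these sections build toward, the following R snippet touches all four ingredients at once. It uses R's built-in mtcars data rather than the book's survey datasets, so it is an illustration of the techniques, not an excerpt from the case studies.

```r
# Minimal illustration with R's built-in mtcars data (not the book's datasets).
summary(mtcars$mpg)                      # univariate: distribution of fuel economy
cor(mtcars$mpg, mtcars$wt)               # bivariate: correlation of mpg and weight
fit <- lm(mpg ~ wt + hp, data = mtcars)  # multiple regression with two predictors
summary(fit)                             # coefficients, p-values, R-squared
```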
Thank you to Paul Gies, Phillip Kemper, Jonas Verlande, Teresa Hummler, Paul Vierus, and Felix Diehl for helpful feedback on previous versions of this book. I want to thank Achim Goerres for his feedback early on and for granting me maximal freedom in revising and updating the materials of his introductory lectures on Methods and Statistics, which led to the writing of this book. Earlier versions of this book have been used in teaching courses on statistics in the Political Science undergraduate program at the University of Duisburg-Essen.
Conrad Ziller is a Senior Researcher in the Department of Political Science at the University of Duisburg-Essen. His research interests focus on the role of immigration in politics and society, immigrant integration, policy effects on citizens, and quantitative methods. He is the principal investigator of research projects funded by the German Research Foundation and the Fritz Thyssen Foundation. More information about his research can be found here: https://conradziller.com/ .
The final part of the book covers linear regression analysis, which is the natural endpoint for a course on introductory statistics. However, “ordinary” regression is where many further useful techniques come into play, most of which can be subsumed under the label “Advanced Regression Models”. You will need them when analyzing, for example, panel data in which the same respondents were interviewed multiple times, or spatially clustered data from cross-national surveys.
I will extend this introduction with case studies on advanced regression techniques soon. If you want to get notified when this material is online, please sign up with your email address here: https://forms.gle/T8Hvhq3EmcywkTdFA .
In the meantime, I have a chapter on “Multiple Regression with Non-Independent Observations: Random-Effects and Fixed-Effects” that can be downloaded via https://ssrn.com/abstract=4747607 .
If you have feedback on the usefulness of this introduction, or spot errors and misspellings, I would be most thankful for a short notification at [email protected].
Thanks much for engaging with this introduction!
The online version of this book is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License .
Discover the transformative impact of statistical science in unfolding real-world narratives, from global economics to public health victories.
The untrained eye may see only cold, lifeless digits in the intricate dance of numbers and patterns that constitute data analysis and statistics. Yet, for those who know how to listen, these numbers whisper stories about our world, our behaviors, and the delicate interplay of systems and relationships that shape our reality. Artfully unfolded through meticulous statistical analysis, these narratives can reveal startling truths and unseen correlations that challenge our understanding and broaden our horizons. Here are five case studies demonstrating the profound power of statistics to decode reality’s vast and complex tapestry.
The 2008 financial crisis is a prime real-world example of the Butterfly Effect in global markets. What started as a crisis in the housing market in the United States quickly escalated into a full-blown international banking crisis with the collapse of the investment bank Lehman Brothers on September 15, 2008.
Understanding the Ripples
A team of economists employed regression analysis to understand the impact of the Lehman Brothers collapse. The statistical models revealed how this event affected financial institutions worldwide, causing a credit crunch and a widespread economic downturn.
The Data Weaves a Story
Further analysis using time-series forecasting methods painted a detailed picture of the crisis’s spread. For instance, these models were used to predict how the initial shockwave would impact housing markets globally, consumer spending, and unemployment rates. These forecasts proved incredibly accurate, showcasing not only the domino effect of the crisis but also the predictive power of well-crafted statistical models.
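As an illustration of the general technique rather than the economists' actual models, the R sketch below fits an ARIMA model and produces forecasts with prediction intervals. It assumes the forecast package is installed and uses R's built-in AirPassengers series as a stand-in for an economic indicator.

```r
# Illustrative time-series forecasting sketch, not a crisis model.
library(forecast)                  # provides auto.arima() and forecast()

fit <- auto.arima(AirPassengers)   # fit an ARIMA model to a monthly series
fc  <- forecast(fit, h = 12)       # forecast 12 months ahead
plot(fc)                           # point forecasts with prediction intervals
```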
Implications for Future Predictions
This real-life event became a case study of the importance of understanding the deep connections within the global financial system. Banks, policymakers, and investors now use the predictive models developed from the 2008 crisis to stress-test economic systems against similar shocks. It has led to a greater appreciation of risk management and the implementation of stricter financial regulations to safeguard against future crises.
By interpreting the unfolding of the 2008 crisis through the lens of statistical science, we can appreciate the profound effect that one event in a highly interconnected system can have. The lessons learned continue to resonate, influencing financial policies and the global economic forecasting and stability approach.
In a world teeming with infectious diseases, the story of dracunculiasis, commonly known as Guinea Worm Disease, is a testament to public health tenacity and the judicious application of statistical analysis in disease eradication efforts.
Tracing the Path of the Parasite
The campaign against dracunculiasis, led by The Carter Center and supported by a consortium of international partners, utilized epidemiological data to trace and interrupt the life cycle of the Guinea worm. The statistical approach underpinning this public health victory involved meticulously collecting data on disease incidence and transmission patterns.
The Tally of Triumph
By employing geospatial statistics and logistic regression models, health workers pinpointed endemic villages and formulated strategies that targeted the disease’s transmission vectors. These statistical tools were instrumental in monitoring the progress of eradication efforts and allocating resources to areas most in need.
The Countdown to Zero
The eradication campaign’s success was measured by the continuous decline in cases, from an estimated 3.5 million in the mid-1980s to just 54 reported cases in 2019. This dramatic decrease has been documented through rigorous data collection and statistical validation, ensuring that each reported case was accounted for and dealt with accordingly.
Legacy of a Worm
The nearing eradication of Guinea Worm Disease, with no vaccine or curative treatment, is a feat that underscores the power of preventive public health strategies informed by statistical analysis. It serves as a blueprint for tackling other infectious diseases. It is a real-world example of how statistics can aid in making the invisible enemy of disease a known and conquerable foe.
The narrative of Guinea Worm eradication is not just a tale of statistical victory but also one of human resilience and commitment to public health. It is a story that will continue to inspire as the world edges closer to declaring dracunculiasis the second human disease, after smallpox, to be eradicated.
The advent of big data analytics has revolutionized marketing strategies by providing deep insights into consumer behavior. Amazon, a global leader in e-commerce, is at the forefront of leveraging statistical analysis to offer its customers a highly personalized shopping experience.
The Predictive Power of Purchase Patterns
Amazon collects vast amounts of user data, including browsing histories, purchase patterns, and product searches, and analyzes it with machine learning algorithms to predict individual customer preferences and future buying behavior. This predictive power is exemplified by Amazon's recommendation engine, which suggests products to users with uncanny accuracy, often leading to increased sales and customer satisfaction.
Beyond the Purchase: Sentiment Analysis
Amazon extends its data analysis beyond purchases by analyzing customer reviews and feedback sentiment. This analysis gives Amazon a nuanced understanding of customer sentiments towards products and services. Amazon can quickly address issues, improve product offerings, and enhance customer service by mining text for customer sentiment.
Crafting Tomorrow’s Trends Today
Amazon’s data analytics insights are not limited to personalizing the shopping experience. They are also used to anticipate and set future trends. Amazon has mastered the art of using consumer data to meet existing demands and influence and create new consumer needs. By analyzing emerging patterns, Amazon stocks products ahead of demand spikes and develops new products that align with predicted consumer trends.
Amazon’s success in utilizing statistical analysis for marketing is a testament to the power of big data in shaping the future of consumer engagement. The company’s ability to personalize the shopping experience and anticipate consumer trends has set a benchmark in the industry, illustrating the transformative impact of statistics on marketing strategies.
In the annals of environmental success stories, the recovery of the American Bald Eagle (Haliaeetus leucocephalus) from the brink of extinction stands out as a sterling example of how rigorous science, public policy, and statistics can combine to safeguard wildlife. This case study offers a narrative that encapsulates the meticulous application of data analysis in wildlife conservation, revealing a deeper truth about the interdependence of species and the human capacity for stewardship.
The Descent Towards Silence
By the mid-20th century, the American Bald Eagle, a symbol of freedom and strength, faced decimation. Pesticides like DDT, habitat loss, and illegal shooting had dramatically reduced their numbers. The alarming descent prompted an urgent call to action bolstered by the rigorous collection and analysis of ecological data.
The Statistical Lifeline
Biostatisticians and ecologists began a comprehensive monitoring program, recording eagle population numbers, nesting sites, and chick survival rates. Advanced statistical models, including logistic regression and population viability analysis (PVA), were employed to assess the eagles’ extinction risk under various scenarios and to evaluate the effectiveness of different conservation strategies.
The Ban on DDT – A Calculated Decision
A pivotal moment in the Bald Eagle’s story was the ban on DDT in 1972, a decision grounded in the statistical analysis of the pesticide’s impacts on eagle reproduction. Studies demonstrated a strong correlation between DDT and thinning eggshells, leading to reduced hatching rates. Based on this analysis, the ban’s implementation marked the turning point for the eagle’s fate.
A Soaring Recovery
Post-ban, rigorous monitoring continued, and the data collected painted a story of resilience and recovery. The statistical evidence was undeniable: eagle populations were rebounding. By the early 21st century, the Bald Eagle had made a remarkable comeback, and it was removed from the Endangered Species List in 2007.
The Legacy of a Species
The American Bald Eagle’s resurgence is more than a conservation narrative; it’s a testament to the harmony between humanity’s analytical prowess and its capacity for environmental guardianship. It shows how statistics can forecast doom and herald a new dawn for conservation. This case study epitomizes the beautiful interplay between human action, informed by truth and statistical insight, resulting in a tangible good: the return of a majestic species from the shadow of extinction.
Social media platforms, particularly Twitter, have become critical arenas for public discourse, shaping societal norms and reflecting public sentiment. This case study examines the real-world application of statistical models and algorithms to understand Twitter’s role in political polarization.
Twitter’s Data-Driven Sentiment Reflection
The aim was to analyze Twitter data to evaluate public sentiment regarding political events and understand the platform’s contribution to societal polarization.
Using natural language processing (NLP) and sentiment analysis, researchers from the Massachusetts Institute of Technology (MIT) analyzed over 10 million tweets from the period surrounding the 2020 U.S. Presidential Election. The tweets were filtered using politically relevant hashtags and keywords.
Deciphering the Digital Pulse
A sentiment index was created, categorizing tweets into positive, negative, or neutral sentiments concerning the candidates. This ‘Twitter Political Sentiment Index’ provided a temporal view of public mood swings about key campaign events and debates.
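The mechanics of such an index can be illustrated with a deliberately tiny lexicon-based scorer in base R. This toy sketch, with a four-word lexicon and made-up tweets, is not the MIT pipeline; real sentiment analysis relies on much larger lexicons or trained language models.

```r
# Toy lexicon-based sentiment scoring: count positive minus negative words.
positive <- c("win", "great", "hope", "strong")
negative <- c("fail", "bad", "fear", "weak")

score_tweet <- function(text) {
  words <- tolower(unlist(strsplit(text, "\\W+")))   # split on non-word characters
  s <- sum(words %in% positive) - sum(words %in% negative)
  if (s > 0) "positive" else if (s < 0) "negative" else "neutral"
}

tweets <- c("Great debate, strong performance",
            "Bad policy, fear for the economy")
sapply(tweets, score_tweet)   # one sentiment label per tweet
```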
The Echo Chambers of the Internet
Network analysis revealed distinct user clusters along ideological lines, illustrating the presence of echo chambers. The study examined retweet networks and highlighted how information circulated within politically homogeneous groups, reinforcing existing beliefs.
The study found that users had limited exposure to opposing political views on Twitter, which increased polarization. It also correlated significant shifts in the sentiment index with real-life events, such as policy announcements and election results.
Shaping the Future of Public Discourse
The study, published in Science, emphasizes the need for transparency in social media algorithms to mitigate echo chambers’ effects. The insights gained are being used to inform policymakers and educators about the dynamics of online discourse and to encourage the design of algorithms that promote a more balanced and open digital exchange of ideas.
The findings from MIT’s Twitter data analysis underscore the platform’s power as a real-time barometer of public sentiment and its role in shaping political discourse. The case study offers a roadmap for leveraging big data to foster a healthier democratic process in the digital age.
Drawing together these varied case studies, it becomes clear that statistics and data analysis are far from mere computation tools. They are, in fact, the instruments through which we can uncover deeper truths about our world. They can illuminate the unseen, predict the future, and help us shape it towards the common good. These narratives exemplify the pursuit of true knowledge, promoting good actions, and appreciating a beautiful world.
As we engage with the data of our daily lives, we continually decode the complexities of existence. From the markets to the microorganisms, consumer behavior to conservation efforts, and the physical to the digital world, statistics is the language in which the tales of our times are written. It is the language that reveals the integrity of systems, the harmony of nature, and the pulse of humanity. Through this science’s meticulous and ethical application, we uphold the values of truth, goodness, and beauty — ideals that remain ever-present in the quest for understanding and improving the world we share.
Curious about the untold stories behind the numbers? Dive into our blog for more riveting articles that showcase the transformative power of statistics in understanding and shaping our world. Continue your journey into the beauty of data-driven truths with us.
Q1: What is the significance of the 2008 Financial Crisis in statistics? The 2008 Financial Crisis is significant in statistics for demonstrating the Butterfly Effect in global markets, where regression analysis revealed the interconnected impact of Lehman Brothers’ collapse on the global economy.
Q2: How did statistics contribute to the eradication of Guinea Worm Disease? Through geospatial and logistic regression, statistics played a crucial role in tracking and reducing the spread of Guinea Worm Disease, contributing to the decline from 3.5 million cases to just 54 by 2019.
Q3: What role does machine learning play in Amazon’s marketing? Machine learning algorithms at Amazon analyze vast amounts of consumer data to predict customer preferences and personalize the shopping experience, driving sales and setting industry benchmarks.
Q4: How were statistics instrumental in the recovery of the American Bald Eagle? Statistical models helped assess the risk of extinction and the impact of DDT on eagle reproduction, leading to conservation strategies that aided in the eagle’s significant recovery.
Q5: What is sentiment analysis, and how was it used in studying Twitter? Sentiment analysis uses natural language processing to categorize the tone of text content. MIT used it to evaluate political sentiment on Twitter and study the platform’s role in political polarization.
Q6: How did statistical models predict the global effects of the 2008 crisis? Statistical models, including time-series forecasting, predicted how the crisis would affect housing markets, consumer spending, and unemployment, demonstrating the predictive power of statistics.
Q7: Why is the eradication of Guinea Worm Disease significant beyond public health? The near eradication, without a vaccine or cure, illustrates the power of preventive strategies and statistical analysis in public health, serving as a blueprint for combating other diseases.
Q8: In what way did statistics aid in the decision to ban DDT? Statistical analysis linked DDT to thinning eagle eggshells and poor hatching rates, leading to the ban crucial for the Bald Eagle’s recovery.
Q9: How does Amazon’s use of data analytics influence consumer behavior? By analyzing consumer data, Amazon anticipates and sets trends, meets demands, and influences new consumer needs, shaping the future of consumer engagement.
Q10: What implications does the Twitter political polarization study have? The study calls for transparency in social media algorithms to reduce echo chambers. It suggests using statistical insights to foster a balanced, open digital exchange in democratic processes.
Bring practical statistical problem solving to your course.
A wide selection of real-world scenarios with practical multistep solution paths. Complete with objectives, data, illustrations, insights and exercises. Exercise solutions available to qualified instructors only.
Title | Field | Subject | Concepts
---|---|---|---
JMP001 | Healthcare | Insurance Claims Management | Summary Statistics & Box Plot
JMP002 | Operations | Customer Care | Time Series Plots & Descriptive Statistics
JMP003 | Engineering | Manufacturing Quality | Tabulation & Summary Statistics
JMP004 | Marketing | Research Methods | Chi-Squared Test & Distribution
JMP005 | Life Sciences | Quality Improvement | Correlation & Summary Statistics
JMP006 | Marketing | Pricing | One-Sample t-Test
JMP007 | Operations | Quality Improvement | Two-Sample t-Test & Welch Test
JMP008 | General | Transforming Data | Normality & Transformation
JMP009 | Finance | Resource Management | Nonparametric & Wilcoxon Signed Rank Test
JMP010 | Social Sciences | Experiments | t-Test & Wilcoxon Rank Sums Test
JMP011 | Operations | Project Management | ANOVA & Welch Test
JMP012 | General | Games | t-Test & One-Way ANOVA
JMP013 | Social Sciences | Demographics | ANOVA & Kruskal-Wallis Test
JMP014 | General | Games of Chance | Simulation for One Proportion
JMP015 | Life Sciences | Disease | Chi-Squared Test & Relative Risk
JMP016 | Life Sciences | Vaccines | Chi-Squared Test & Fisher's Exact Test
JMP017 | Life Sciences | Oncology | Odds Ratio & Conditional Probability
JMP018 | Life Sciences | Genetics | Chi-Squared Test for Multiple Proportions
JMP019 | Marketing | Fundraising | Simple Linear Regression & Prediction Intervals
JMP020 | Marketing | Advertising | Time Series & Simple Linear Regression
JMP021 | Marketing | Strategy | Curve Fitting and Regression
JMP022 | Life Sciences | Paleontology | Simple Linear Regression & Transformation
JMP023 | Operations | Service Reliability | Multiple Linear Regression & Correlation
JMP024 | Marketing | Pricing | Multiple Linear Regression & Model Diagnostics
JMP025 | Finance | Revenue Management | Stepwise Regression & Model Diagnostics
JMP026 | Operations | Sales | Logistic Regression & Chi-Squared Test
JMP027 | History | Demography | Logistic Regression & Odds Ratio
JMP028* | Marketing | Customer Acquisition | Classification Tree & Model Validation
JMP029 | Operations | Customer Care | Process Capability & Partition Model
JMP030 | Marketing | Customer Retention | Neural Networks & Variable Importance
JMP031* | Social Sciences | Socioeconomics | Predictive Modeling & Model Comparison
JMP032 | Engineering | Product Testing | Chi-Squared Test & Relative Risk
JMP033 | Engineering | Product Testing | Chi-Squared Test & Odds Ratio
JMP034 | Engineering | Product Testing | Univariate Logistic Regression
JMP035 | Engineering | Product Testing | Multivariate Logistic Regression
JMP036 | Marketing | Customer Acquisition | Population Parameter Estimation
JMP037 | Engineering | Quality Management | Descriptive Statistics & Visualization
JMP038 | Engineering | Quality Management | Normality & Test of Standard Deviation
JMP039 | Operations | Product Management | t-Test & ANOVA
JMP040 | Engineering | Quality Improvement | Variability Gauge R&R, Variance Components
JMP041* | General | Knowledge Management | Word Cloud & Term Selection
JMP042 | Finance | Time Series Analysis | Stationarity & Differencing
JMP043 | Marketing | Research Methods | Conjoint, Part Worths, OLS, Utility
JMP044 | Marketing | Research Methods | Discrete Choice & Willingness to Pay
JMP045 | Finance | Time Series Analysis | ARIMA Models & Model Comparison
JMP046 | Life Sciences | Ecology | Nonparametric Kendall's Tau & Normality
JMP047 | Engineering | Pharmaceutical Manufacturing | Statistical Quality Control
JMP048 | Engineering | Pharmaceutical Manufacturing | Statistical Process Control
JMP049 | Engineering | Pharmaceutical Manufacturing | Design of Experiments
JMP050 | Engineering | Chemical Manufacturing | Design of Experiments
JMP051* | Engineering | Chemical Manufacturing | Functional Data Exploration (FDE)
JMP052 | Engineering | Biotech Manufacturing | Design of Experiments
JMP053 | Marketing | Demography | PCA & Clustering
JMP054 | Finance | Time Series Forecasting | Exponential Smoothing Methods
JMP055 | Engineering | Pharmaceutical Formulation | Design of Experiments, Mixture Design
JMP056 | Life Sciences | Ecology | Generalized Linear Mixed Models & Forecasting
JMP057 | Social Sciences | Research Methods | Exploratory Factor Analysis (EFA), Bartlett's Test, KMO Test
JMP058* | Social Sciences | Research Methods | Confirmatory Factor Analysis (CFA), Structural Equation Modeling (SEM)
JMP059* | Life Sciences | Biotechnology | Functional Data Analysis, Functional DOE
JMP060* | Life Sciences | Biotechnology | Nonlinear Modeling, Curve DOE
JMP061* | Finance | Research Methods | Sentiment Analysis
JMP062 | Life Sciences | Ecology | Exploratory Data Analysis, Data Visualization
* Cases marked with an asterisk require JMP Pro.
To request solutions to the exercises within the Case Studies, please complete this form and indicate which case(s) and their number you would like to request in the space provided below. Solutions are provided to qualified instructors only and all requests including academic standing will be verified before solutions are sent.
Explore claim payment amounts for medical malpractice lawsuits and identify factors that appear to influence the amount of the payment using descriptive statistics and data visualizations.
Key words: Summary statistics, frequency distribution, histogram, box plot, bar chart, Pareto plot, and pie chart
Analyze and compare baggage complaints for three different airlines using descriptive statistics and time series plots. Explore differences between the airlines, whether complaints are getting better or worse over time, and if there are other factors, such as destinations, seasonal effects or the volume of travelers that might affect baggage performance.
Key words: Time series plots, summary statistics
Explore the effectiveness of different sampling plans in detecting changes in the occurrence of manufacturing defects.
Key words: Tabulation, histogram, summary statistics, and time series plots
Use survey results from a summer movie series to answer questions regarding customer satisfaction, demographic profiles of patrons, and the use of media outlets in advertising.
Key words: Bar charts, frequency distribution, summary statistics, mosaic plot, contingency table, (cross-tabulations), and chi-squared test
Analyze patient complaint data at a medical clinic to identify the issues resulting in customer dissatisfaction and determine potential causes of decreased patient volume.
Key words: Frequency distribution, summary statistics, Pareto plot, tabulation, scatterplot, run chart, correlation
Evaluate the price quoting process of two different sales associates to determine whether there is inconsistency between them and to decide if a new, more consistent pricing process should be developed.
Key words: Histograms, summary statistics, confidence interval for the mean, one sample t-Test
Determine what effect a reengineering effort had on the incidence of behavioral problems and turnover at a treatment facility for teenagers.
Key words: Summary statistics, time series plots, normal quantile plots, two sample t-Test, unequal variance test, Welch's test
Use data from a survey of students to perform exploratory data analysis and to evaluate the performance of different approaches to a statistical analysis.
Key words: Histograms, normal quantile plots, log transformations, confidence intervals, inverse transformation
Use the DASL Fish Prices data to investigate whether there is evidence that overfishing occurred from 1970 to 1980.
Key words: Histograms, normal quantile plots, log transformations, inverse transformation, paired t-test, Wilcoxon signed rank test
Determine whether subliminal messages were effective in increasing math test scores, and if so, by how much.
Key words: Histograms, summary statistics, box plots, t-Test and pooled t-Test, normal quantile plot, Wilcoxon Rank Sums test, Cohen's d
Determine whether a software development project prioritization system was effective in speeding the time to completion for high priority jobs.
Key words: Summary statistics, histograms, normal quantile plot, ANOVA, pairwise comparison, unequal variance test, and Welch's test
Determine if a backgammon program has been upgraded by comparing the performance of a player against the computer across different time periods.
Key words: Histograms, confidence intervals, stacking data, one-way ANOVA, unequal variances test, one-sample t-Test, ANOVA table and calculations, F Distribution, F ratios
Use data from the World Factbook to explore wealth disparities between different regions of the world and identify those with the highest and lowest wealth.
Key words: Geographic mapping, histograms, log transformation, ANOVA, Welch's ANOVA, Kruskal-Wallis
Using outcomes for 10,000 flips of a coin, use descriptive statistics, confidence intervals and hypothesis tests to determine whether the coin is fair.
Key words: Bar charts, confidence intervals for proportions, hypothesis testing for proportions, likelihood ratio, simulating random data, scatterplot, fitting a regression line
Use results from an 1860s sterilization study to determine whether there is evidence that the sterilization process reduces deaths when amputations are performed.
Key words: Mosaic plots, contingency tables, Pearson and likelihood ratio tests, Fisher's exact test, two-sample proportions test, one- and two-sided tests, confidence interval for the difference, relative risk
Using data from a 1950s study, determine whether the polio vaccine was effective in a cohort study and, if it was, quantify the degree of effectiveness.
Key words: Bar charts, two-sample proportions test, relative risk, two-sided Pearson and likelihood ratio tests, Fisher's exact test, and the Gamma measure of association
Use the results of a retrospective study to determine if there is a positive association between smoking and lung cancer, and estimate the risk of lung cancer for smokers relative to non-smokers.
Key words: Mosaic plots, two-by-two contingency tables, odds ratios and confidence intervals, conditional probability, hypothesis tests for proportions (likelihood ratio, Pearson's, Fisher's Exact, two sample tests for proportions)
Use the data sets provided to explore Mendel’s Laws of Inheritance for dominant and recessive traits.
Key words: Bar charts, frequency distributions, goodness-of-fit tests, mosaic plot, hypothesis tests for proportions
Predict year-end contributions in an employee fund-raising drive.
Key words: Summary statistics, time series plots, simple linear regression, predicted values, prediction intervals
Evaluate different regression models to determine whether sales at a small retail shop are influenced by a direct mail campaign, and use the resulting models to predict sales based upon the amount of marketing.
Key words: Time series plots, simple linear regression, lagged variables, predicted values, prediction intervals
Assess the effectiveness of a cost leadership strategy in increasing market share, and assess the potential for additional gains in market share under the current strategy.
Key words: Simple linear regression, spline fitting, transformations, predicted values, prediction intervals
Analyze data on the brain and body weight of different dinosaur species to determine if a proposed statistical model performs well at describing the relationship and use the model to predict brain weight based on body weight.
Key words: Histogram and summary statistics, fitting a regression line, log transformations, residual plots, interpreting regression output and parameter estimates, inverse transformations
Determine whether wind speed and barometric pressure are related to phone call performance (percentage of dropped or failed calls) and use the resulting model to predict the percentage of bad calls based upon the weather conditions.
Key words: Histograms, summary statistics, simple linear regression, multiple regression, scatterplot, 3D-scatterplot
After determining which factors relate to the selling prices of homes located in and around a ski resort, develop a model to predict housing prices.
Key words: Scatterplot matrix, correlations, multiple regression, stepwise regression, multicollinearity, model building, model diagnostics
A bank wants to understand how customer banking habits contribute to revenues and profitability. Build a model that allows the bank to predict profitability for a given customer. The resulting model will be used to forecast bank revenues and guide the bank in future marketing campaigns.
Key words: Log transformation, stepwise regression, regression assumptions, residuals, Cook’s D, model coefficients, singularity, prediction profiler, inverse transformations
Determine whether certain conditions make it more likely that a customer order will be won or lost.
Key words: Bar charts, frequency distribution, mosaic plots, contingency table, chi-squared test, logistic regression, predicted values, confusion matrix
Use the passenger data related to the sinking of the RMS Titanic ship to explore some questions of interest about survival rates for the Titanic. For example, were there some key characteristics of the survivors? Were some passenger groups more likely to survive than others? Can we accurately predict survival?
Key words: Logistic regression, log odds and logit, odds, odds ratios, prediction profiler
A bank would like to understand the demographics and other characteristics associated with whether a customer accepts a credit card offer. Build a Classification model that will provide insight into why some bank customers accept credit card offers.
Key words: Classification trees, training & validation, confusion matrix, misclassification, leaf report, ROC curves, lift curves
The scenario relates to the handling of customer queries via an IT call center. The call center performance is well below best in class. Identify potential process changes to allow the call center to achieve best in class performance.
Key words: Interactive data visualization, graphs, distribution, tabulate, recursive partitioning, process capability, control chart, multiple regression, prediction profiler
Analyze the factors related to customer churn of a mobile phone service provider. The company would like to build a model to predict which customers are most likely to move their service to a competitor. This knowledge will be used to identify customers for targeted interventions, with the ultimate goal of reducing churn.
Key words: Neural networks, activation functions, model validation, confusion matrix, lift, prediction profiler, variable importance
Build a variety of prediction models (multiple regression, partition tree, and a neural network) to determine the one that performs the best at predicting house prices based upon various characteristics of the house and its location.
Key words: Stepwise regression, regression trees, neural networks, model validation, model comparison
Evaluate the durability of mobile phone screens in a drop test. Determine if a desired level of durability is achieved for each of two types of screens and compare performance.
Key words: Confidence Intervals, Hypothesis Tests for One and Two Population Proportions, Chi-square, Relative Risk
Evaluate the durability of mobile phone screens in a drop test at various drop heights. Determine if a desired level of durability is achieved for each of three types of screens and compare performance.
Key words: Contingency analysis, comparing proportions via difference, relative risk and odds ratio
Evaluate the durability of mobile phone screens in a drop test across various heights by building individual simple logistic regression models. Use the models to estimate the probability of a screen being damaged across any drop height.
Key words: Single variable logistic regression, inverse prediction
Evaluate the durability of mobile phone screens in a drop test across various heights by building a single multiple logistic regression model. Use the model to estimate the probability of a screen being damaged across any drop height.
Key words: Multivariate logistic regression, inverse prediction, odds ratio
Evaluate the potential improvement to the UI design of an online mortgage application process by examining the usability rating from a sample of 50 customers and comparing their performance using the new design vs. a large collection of historic data on customer’s performance with the current design.
Key words: Distribution, normality, normal quantile plot, Shapiro Wilk and Anderson Darling tests, t-Test
Evaluate the performance to specifications of a food manufacturing process using graphical analyses and numerical summarizations of the data.
Key words: Distribution, summary statistics, time series plots
Evaluate the performance to specifications of a food manufacturing process using confidence intervals and hypothesis testing.
Key words: Distribution, normality, normal quantile plot, Shapiro Wilk and Anderson Darling tests, test of mean and test of standard deviation
Analyze the results of an experiment to determine if there is statistical evidence demonstrating an improvement in a new laundry detergent formulation. Explore and describe the effect that multiple factors have on a response, and identify the conditions with the most and least impact.
Key words: Analysis of variance (ANOVA), t-Test, pairwise comparison, model diagnostics, model performance
Study the use of a Nested Variability chart to understand and analyze the different components of variance. Also explore ways to minimize variability by applying various rules of operation related to variance.
Key words: Variability gauge, nested design, component analysis of variance
This study requires the use of unstructured data analysis to understand and analyze the text related to patents filed by different companies.
Key words: Word cloud, data visualization, term selection
Understand the basic concepts of time series data analysis and explore practical ways to assess the risks and rates of return in financial index data.
Key words: Differencing, log transformation, stationarity, Augmented Dickey Fuller (ADF) test
Study the application of regression and choice modeling (also called conjoint analysis) to understand and analyze how product attributes and their levels influence preferences.
Key words: Part Worth, regression, prediction profiler
Design and analyze discrete choice experiments (also called conjoint analysis) to discover which product or service attributes are preferred by potential customers.
Key words: Discrete choice design, regression, utility and probability profiler, willingness to pay
Learn univariate time series modeling using US Gold Prices. Build AR, MA, ARMA, and ARIMA models to analyze the characteristics of the time series data and forecast.
Key words: Stationarity, AR, MA, ARMA, ARIMA, model comparison and diagnostics
Explore statistical evidence demonstrating an association between saguaro size and the number of flowers it produces.
Key words: Kendall's Tau, correlation, normality, regression
Use control charts to understand process stability and analyze the patterns of process variation.
Key words: Statistical Process Control, Control Chart, Process Capability
Use Measurement Systems Analysis (MSA) to assess the precision, consistency and bias of a measurement system.
Key words: Measurement Systems Analysis (MSA), Analysis of Variance (ANOVA)
Use Design of Experiments (DOE) to advance knowledge about the process.
Key words: Definitive Screening Design, Custom Design, Design Comparison, Prediction, Simulation and Optimization
Application of statistical methods to understand the process and enhance its performance through Design of Experiments and regression techniques.
Key words: Custom Design, Stepwise Regression, Prediction Profiler
Use Functional Data Analysis to understand the intrinsic structure of the data.
Key words: Functional Data Analysis (FDA), B Splines, Functional PCA, Generalized Regression
Use Design of Experiments (DOE) to optimize the microbial cultivation process.
Key words: Custom Design, Design Evaluation, Predictive Modeling
Use PCA and Clustering techniques to segment the demographic data.
Key words: Clustering, Principal Component Analysis, Exploratory Data Analysis
Learn various exponential smoothing techniques to build various forecasting models and compare them.
Key words: Time series forecasting, Exponential Smoothing
Use Mixture/formulation design to optimize multiple responses related to bioavailability of a drug.
Key words: Custom Design, Mixture/Formulation Design, Optimization
Apply time series forecasting and Generalized linear mixed model (GLMM) to evaluate butterfly populations being impacted by climate and land-use changes.
Key words: Time series forecasting, Generalized linear mixed model
Apply exploratory factor analysis to uncover latent factor structure in an online shopping questionnaire.
Key words: Exploratory Factor Analysis (EFA), Bartlett’s Test, KMO Test
Apply measurement and structural models to survey responses from online shoppers to build and evaluate competing models.
Key words : Confirmatory Factor Analysis (CFA), Structural Equation Modeling (SEM), Measurement and Structural Regression Models, Model Comparison
Apply functional data analysis and functional design of experiments (FDOE) for the optimization of an analytical method to allow for the accurate quantification of two biological components.
Key words: Functional Data Analysis, Functional PCA, Functional DOE
Apply nonlinear models to understand the impact of factors on cell growth.
Key words: Nonlinear Modeling, Logistic 3P, Curve DOE
Apply Sentiment analysis to quantify the emotion in unstructured text.
Key words: Word Cloud, Sentiment Analysis
Apply exploratory data analysis in the context of wildlife monitoring and nature conservation.
Key words: Summary statistics, Crosstabulation, Data visualization
In many countries it is left to the discretion of the court to accept or reject conclusions based on sampling procedures as applied to the total drug exhibit. As an alternative to this subjective approach, a statistical basis is presented using binomial and hypergeometric distributions to determine a lower limit for the proportion of units in a population which contains a drug, at a given confidence level. A method for calculating the total weight of a drug present in a population within a given confidence interval is also presented. In the event of no failures (all units sampled contain a drug), a sample size of six or seven units is generally sufficient to state that a proportion of at least 0.70 of the population contains a drug at a confidence level of at least 90%. When failures do occur in the sample, point estimation is used as the basis for selecting the appropriate sample size.
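The abstract's sample-size claim is easy to check numerically. Under the binomial model, if a proportion p of the population contains the drug, the probability that all k sampled units test positive is p^k; the R snippet below verifies the six-or-seven-unit rule for p = 0.70 and adds a finite-population (hypergeometric) variant with an assumed population of 50 units (the population size is illustrative, not from the paper).

```r
# Binomial check: probability that all k sampled units contain the drug
# when only 70% of the population does.
0.70^6   # ~0.118: just above 10%, so 6 units fall slightly short of 90% confidence
0.70^7   # ~0.082: below 10%, so an all-positive sample of 7 supports
         # p >= 0.70 at a confidence level of at least 90%

# Hypergeometric (finite population) variant with assumed numbers:
# N = 50 units, of which m = 35 (70%) contain the drug; probability that a
# sample of k = 7 drawn without replacement is all positive.
dhyper(7, m = 35, n = 15, k = 7)   # ~0.067, smaller than the binomial value,
                                   # so sampling without replacement only
                                   # strengthens the conclusion
```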
The ability to properly analyze and understand numbers has become very valuable in today's data-rich world.

Analyzing numerical data systematically means thoughtfully collecting, organizing, and studying data to discover patterns, trends, and connections that can guide important choices.

Quantitative analysis applies statistical methods and computational processes to study and make sense of data, so you can spot patterns, relationships, and changes over time, giving you the insight needed to guide decisions.
At its core, quantitative analysis builds on fundamentals of mathematics and statistics to turn raw figures into meaningful knowledge.

The process usually starts with gathering the relevant numbers and organizing them neatly. Analysts then apply statistical techniques, from descriptive statistics to predictive modeling, to extract valuable insights.
Descriptive statistics summarize the key features of a dataset, such as averages and how spread out the values are. This helps analysts understand the basics and identify outliers.

Inferential statistics let analysts generalize from a sample to a broader population. Techniques such as hypothesis testing, regression analysis, and correlation analysis help identify significant relationships.

Machine learning and predictive modeling have further enhanced quantitative work. These methods let analysts build models that forecast outcomes, recognize patterns across huge datasets, and uncover insights that basic statistics alone would miss.

Leveraging such data-based evidence supports more informed decisions and better management of resources.
The first step in any quantitative data analysis is collecting the relevant data. This involves determining what data is needed to answer the research question or business objective.
Data can come from a variety of sources such as surveys, experiments, observational studies, transactions, sensors, and more.
Once the data is obtained, it typically needs to go through a data preprocessing or data cleaning phase.
Real-world data is often messy, containing missing values, errors, inconsistencies, and outliers that can negatively impact the analysis if not handled properly. Common data cleaning tasks include handling missing values, correcting errors and inconsistencies, removing duplicate records, standardizing formats, and treating outliers.
The goal of data cleaning is to ensure that quantitative data analysis techniques can be applied accurately to high-quality data. Proper data collection and preparation lays the foundation for reliable results.
In addition to cleaning, the data may need to be structured or formatted in a way that statistical software and data analysis tools can read it properly.
For large datasets, data management principles like establishing data pipelines become important.
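To make the cleaning steps above concrete, here is a minimal sketch in R. The data frame raw and its income and age columns are hypothetical, and the thresholds are illustrative choices, not universal rules.

```r
# A minimal data-cleaning sketch on a hypothetical data frame 'raw'.
clean <- raw[!duplicated(raw), ]                  # drop exact duplicate rows
clean <- clean[!is.na(clean$income), ]            # drop rows missing a key field
clean$age[clean$age < 0 | clean$age > 120] <- NA  # flag impossible values as missing

# Cap extreme outliers at the 1st/99th percentiles (winsorizing)
q <- quantile(clean$income, c(0.01, 0.99), na.rm = TRUE)
clean$income <- pmin(pmax(clean$income, q[1]), q[2])
```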
Descriptive statistics is a crucial aspect of quantitative data analysis that involves summarizing and describing the main characteristics of a dataset.
This branch of statistics aims to provide a clear and concise representation of the data, making it easier to understand and interpret.
Descriptive statistics are typically the first step in analyzing data, as they provide a foundation for further statistical analyses and help identify patterns, trends, and potential outliers.
The most common descriptive statistics include measures of central tendency (mean, median, mode), measures of dispersion (range, variance, standard deviation, percentiles), and frequency distributions.
Descriptive statistics play a vital role in data exploration and understanding the initial characteristics of a dataset.
They provide a summary of the data, allowing researchers and analysts to identify patterns, detect potential outliers, and make informed decisions about further analyses.
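For example, a handful of base-R calls covers most of these summaries; R is used here purely as one convenient tool, applied to its built-in mtcars data.

```r
# Descriptive statistics on R's built-in mtcars data (illustrative).
mean(mtcars$mpg)        # central tendency
median(mtcars$mpg)
sd(mtcars$mpg)          # spread
range(mtcars$mpg)
quantile(mtcars$mpg)    # quartiles
summary(mtcars$mpg)     # five-number summary plus mean
hist(mtcars$mpg)        # distribution at a glance
```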
However, it’s important to note that descriptive statistics alone do not provide insights into the underlying relationships or causal mechanisms within the data.
To draw meaningful conclusions and make inferences about the population, inferential statistics and advanced analytical techniques are required.
While descriptive statistics provide a summary of data, inferential statistics allow you to make inferences and draw conclusions from that data.
Inferential statistics involve taking findings from a sample and generalizing them to a larger population. This is crucial when it is impractical or impossible to study an entire population.
The core of inferential statistics revolves around hypothesis testing . A hypothesis is a statement about a population parameter that needs to be evaluated based on sample data.
The process involves formulating a null and alternative hypothesis, calculating an appropriate test statistic, determining the p-value, and making a decision whether to reject or fail to reject the null hypothesis.
Some common inferential techniques include:
T-tests – Used to determine if the mean of a population differs significantly from a hypothesized value or if the means of two populations differ significantly.
ANOVA (Analysis of Variance) – Used to determine if the means of three or more groups are different.
Regression analysis – Used to model the relationship between a dependent variable and one or more independent variables. This allows you to understand drivers and make predictions.
Correlation analysis – Used to measure the strength and direction of the relationship between two variables.
Inferential statistics are critical for quantitative research, allowing you to test hypotheses, quantify relationships, and make data-driven decisions with confidence in the findings.
However, the validity depends on meeting the assumptions of the statistical tests and having a properly designed study with adequate sample sizes.
The interpretation of inferential statistics requires care. P-values indicate the probability of obtaining the observed data assuming the null hypothesis is true – they do not confirm or deny the hypothesis directly. Effect sizes are also crucial for assessing the practical significance beyond just statistical significance.
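To tie these ideas together, the following self-contained R example simulates two groups, runs a Welch two-sample t-test, and reports both the p-value and a Cohen's d effect size. The data are simulated, so this illustrates the procedure rather than any real study.

```r
# A small end-to-end hypothesis test on simulated data.
set.seed(42)
a <- rnorm(30, mean = 100, sd = 15)   # sample from group A
b <- rnorm(30, mean = 108, sd = 15)   # sample from group B

tt <- t.test(a, b)                    # Welch two-sample t-test (default in R)
tt$p.value                            # probability of data this extreme if the means are equal

# Effect size (Cohen's d): practical significance alongside the p-value
d <- (mean(b) - mean(a)) / sqrt((var(a) + var(b)) / 2)
d
```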
Quantitative data analysis goes beyond just describing and making inferences about data – it can also be used to build predictive models that forecast future events or behaviors.
Predictive modeling uses statistical techniques to analyze current and historical data to predict unknown future values.
Some of the key techniques used in predictive modeling include regression analysis , decision trees , neural networks, and other machine learning algorithms.
Regression analysis is used to understand the relationship between a dependent variable and one or more independent variables.
It allows you to model that relationship and make predictions. More advanced techniques like decision trees and neural networks can capture highly complex, non-linear relationships in data.
Machine learning has become an integral part of quantitative data analysis and predictive modeling. Machine learning algorithms can automatically learn and improve from experience without being explicitly programmed.
They can identify hidden insights and patterns in large, complex datasets that would be extremely difficult or impossible for humans to find manually.
Popular machine learning techniques used for predictive modeling include regression methods, decision trees, random forests, and neural networks; a small worked example follows below.
Predictive models have a wide range of applications across industries, from forecasting product demand and sales to identifying risk of customer churn to detecting fraud.
With the rise of big data , machine learning is becoming increasingly important for building accurate predictive models from large, varied data sources.
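As a minimal illustration of the predictive-modeling workflow (not any particular production system), the R sketch below trains a decision tree from the rpart package on part of the built-in iris data and evaluates it on held-out rows.

```r
# Train/test split plus a classification tree on R's built-in iris data.
library(rpart)

set.seed(1)
idx   <- sample(nrow(iris), 0.7 * nrow(iris))  # 70/30 train-test split
train <- iris[idx, ]
test  <- iris[-idx, ]

tree <- rpart(Species ~ ., data = train)       # fit a classification tree
pred <- predict(tree, test, type = "class")    # predict the unseen cases
mean(pred == test$Species)                     # hold-out accuracy
table(pred, test$Species)                      # confusion matrix
```

Evaluating on data the model never saw is the key discipline here: it guards against mistaking memorization of the training set for genuine predictive power.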
To effectively perform quantitative data analysis, having the right tools and software is essential. There are numerous options available, ranging from open-source solutions to commercial platforms.
The choice depends on factors such as the size and complexity of the data, the specific analysis techniques required, and the budget.
These include Business Intelligence (BI) and analytics platforms as well as cloud-based data analysis platforms.
Quantitative data analysis techniques find widespread applications across numerous domains and industries. Here are some notable examples:
Businesses rely heavily on quantitative methods to gain insights from customer data, sales figures, market trends, and operational metrics.
Techniques like regression analysis help model customer behavior, while clustering algorithms enable customer segmentation. Forecasting models allow businesses to predict future demand, inventory needs, and revenue projections.
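A customer-segmentation sketch along these lines might look as follows; the spend and frequency figures are entirely hypothetical, and the two-segment structure is an assumption of the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Hypothetical customer features: annual spend and purchase frequency
spend = np.concatenate([rng.normal(200, 40, 100), rng.normal(900, 120, 100)])
freq = np.concatenate([rng.normal(3, 1, 100), rng.normal(15, 3, 100)])
X = StandardScaler().fit_transform(np.column_stack([spend, freq]))

# Two assumed segments; in practice the cluster count is chosen by diagnostics
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(np.bincount(labels))  # size of each customer segment
```

Standardizing the features first matters, since k-means is distance-based and would otherwise be dominated by the larger-scaled spend variable.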
Analysis of clinical trial data, disease prevalence statistics, and patient outcomes employs quantitative methods extensively.
Hypothesis testing determines the efficacy of new drugs or treatments. Survival analysis models patient longevity. Data mining techniques identify risk factors and detect anomalies in healthcare data.
Marketing teams use quantitative data from surveys, A/B tests, and online behavior tracking to optimize campaigns. Regression models predict customer churn or likelihood to purchase.
Sentiment analysis derives insights from social media data and product reviews. Conjoint analysis determines which product features impact consumer preferences.
Quantitative finance relies on statistical models for portfolio optimization, derivative pricing, risk quantification, and trading strategy formulation. Value at Risk (VaR) models assess potential losses. Monte Carlo simulations evaluate the risk of complex financial instruments.
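A minimal Monte Carlo VaR sketch, assuming normally distributed log returns and a hypothetical single-asset portfolio (both strong simplifications of real risk models), could look like this:

```python
import numpy as np

rng = np.random.default_rng(3)
portfolio_value = 1_000_000        # hypothetical portfolio
mu, sigma = 0.0005, 0.02           # assumed daily drift and volatility
horizon_days = 10
n_sims = 100_000

# Simulate 10-day log returns and convert them to profit-and-loss figures
log_ret = rng.normal(mu * horizon_days, sigma * np.sqrt(horizon_days), n_sims)
pnl = portfolio_value * (np.exp(log_ret) - 1)

var_95 = -np.percentile(pnl, 5)    # the loss exceeded on 5% of simulated paths
print(f"10-day 95% VaR ~= {var_95:,.0f}")
```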
From political polls to consumer surveys, quantitative data analysis techniques like weighting, sampling, and survey data adjustment are critical. Researchers employ methods like factor analysis, cluster analysis, and structural equation modeling.
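As a small example of survey weighting, the post-stratification sketch below reweights respondents so that age-group shares match assumed population shares; all the proportions are hypothetical.

```python
# Post-stratification weighting sketch (all proportions hypothetical):
# reweight survey respondents so age-group shares match the population.
population_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample_share     = {"18-34": 0.50, "35-54": 0.30, "55+": 0.20}

weights = {g: population_share[g] / sample_share[g] for g in population_share}
for group, w in weights.items():
    print(f"{group}: weight {w:.2f}")   # 18-34 respondents are down-weighted to 0.60
```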
Case Study 1: Netflix's Data-Driven Recommendations
Netflix extensively uses quantitative data analysis, particularly machine learning, to drive its recommendation engine.
By mining user behavior data and combining it with metadata about movies and shows, they build predictive models to accurately forecast what a user would enjoy watching next.
Case Study 2: Moneyball and Baseball Analytics
The adoption of sabermetrics and analytics by baseball teams like the Oakland Athletics, as depicted in the movie Moneyball, revolutionized player scouting and strategy.
By quantifying player performance through new statistical metrics, teams could identify undervalued talent and gain a competitive edge.
Quantitative data analysis is a powerful toolset that allows organizations to derive valuable insights from their data to make informed decisions.
By applying the various techniques and methods discussed, such as descriptive statistics, inferential statistics, predictive modeling, and machine learning, businesses can gain a competitive edge by uncovering patterns, trends, and relationships hidden within their data.
However, it’s important to note that quantitative data analysis is not a one-time exercise. As businesses continue to generate and collect more data, the analysis process should be an ongoing, iterative cycle.
If you’re looking to further enhance your quantitative data analysis capabilities, there are several potential next steps to consider. One is strengthening governance: implementing robust data governance policies and adhering to ethical guidelines can help organizations maintain trust and accountability.
Open Access | Peer-reviewed | Research Article
Abdulrahman AT (corresponding author). Roles: Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Supervision, Writing – original draft, Writing – review & editing. Affiliation: Department of Mathematics, University of Ha’il, Ha’il, Saudi Arabia. E-mail: [email protected]
Alnagar DK. Roles: Investigation, Writing – review & editing. Affiliation: Statistics Department, University of Tabuk, Tabuk, Saudi Arabia.
Chronic kidney disease (CKD) has become more common in recent decades, putting significant strain on healthcare systems worldwide. CKD is a global health issue that can lead to severe complications such as kidney failure and death.
The purpose of this study was to investigate the actual causes of the alarming increase of kidney failure cases in Saudi Arabia using the supersaturated design analysis and edge design analysis.
A cross-sectional questionnaire was distributed to the general population in the KSA, and data were collected using Google Forms. A total of 401 responses were received. To determine the actual causes of kidney failure, the edge design and supersaturated design analysis methods were used, and both yielded statistically significant results. All variables related to the causes of kidney failure, from factor h1 to factor h18, were studied.
The supersaturated analysis method indicated that the following factors contribute to the increase in kidney failure cases: h9 (bad diet), h8 (recurrent urinary tract infection), h1 (not drinking enough fluids), h6 (lack of exercise), h14 (drinking from undesignated sources such as valleys and ravines), h18 (rheumatic diseases), h10 (smoking and alcohol consumption), h13 (direct damage to the kidneys), h2 (taking medications), h17 (excessive intake of soft drinks), h12 (infection), h5 (heart disease), h3 (diabetes), h4 (hypertension), h15 (dyes used in X-rays), and h11 (the presence of kidney stones). The edge design analysis method indicated that the following factors contribute to the increase in kidney failure cases: h8 (recurrent urinary tract infection), h6 (lack of exercise), h7 (obesity), and h11 (the presence of kidney stones).
The findings showed that the causes of kidney failure reaching statistical significance under both methods are h8 (recurrent urinary tract infection) and h11 (the presence of kidney stones).
Citation: Abdulrahman AT, Alnagar DK (2024) Accurate statistical methods to cover the aspects of the increase in the incidence of kidney failure: A survey study in Ha’il -Saudi Arabia. PLoS ONE 19(8): e0309226. https://doi.org/10.1371/journal.pone.0309226
Editor: V. Vinoth Kumar, Vellore Institute of Technology, INDIA
Received: July 24, 2023; Accepted: August 7, 2024; Published: August 28, 2024
Copyright: © 2024 Abdulrahman, Alnagar. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.
Funding: The authors received no specific funding for this work.
Competing interests: The authors declare that they have no competing interests.
The kidney is one of the most essential organs of the human body: it filters impurities, waste, and toxic substances from the blood and excretes them in the urine, and it regulates the amount of fluid, sodium, and potassium in the body. Kidney failure occurs when the kidneys can no longer effectively eliminate these waste products. When the kidneys lose the ability to filter waste and excrete liquid waste through urine, the result is a chronic or acute condition known as kidney failure. It also causes an imbalance in the body’s water, mineral salts, and other minerals, which disturbs the body’s systems and may be life-threatening if not treated immediately [1].
Chronic kidney disease (CKD) has become more common in recent decades, putting a significant strain on healthcare systems worldwide. CKD is a global health issue that can lead to severe complications such as kidney failure and death. It affects 195 million women worldwide and is currently the eighth leading cause of death in women, accounting for 600,000 deaths each year [2]. Patients with end-stage kidney disease (ESKD) have a 17-fold higher mortality rate than age- and sex-matched healthy people. The number of deaths from CKD is expected to reach 2–4 million by 2040 [3]. According to an epidemiological survey conducted in 2010, the global prevalence of CKD was 9.1%, with 697.5 million cases of CKD (all stages) reported worldwide. By contrast, the prevalence of CKD in the Kingdom of Saudi Arabia is 5.7%, posing a significant burden on its healthcare system. In recent years, the medical literature and community have widely accepted that CKD is associated with an increased risk of premature death [4].
The Executive Director General of the Prince Salman Center for Kidney Diseases and General Supervisor of the Awareness Campaign for Kidney Diseases, Dr. Khaled bin Abdulaziz Al-Saaran, revealed that the incidence of kidney failure in the Kingdom ranges from 90 to 110 cases per million people, and that the incidence in the northern region is the highest in the Kingdom, reaching 167 per million people. Some studies indicate that, globally, one out of every ten healthy people develops kidney disease. The latest statistics show that the total number of patients with chronic renal failure in Saudi Arabia has reached 21,000. According to the annual report of the Saudi Center for Organ Transplantation, 56% of patients were men and 44% were women. In our simple survey ten years ago, the number of people with kidney failure in Saudi Arabia was approximately 9,600 [5].
Compared with the latest statistics, the number of cases has increased significantly and is being monitored by the competent authorities. However, we did not find a survey study investigating the reasons for the increase in kidney failure cases worldwide, as researchers in past years focused only on treatment in the advanced stages and on urging early detection of the disease.
This study aimed to identify the reasons for the increase in kidney failure cases by conducting a survey and analyzing it with statistical models. In particular, it examines two methods, supersaturated design analysis and edge design analysis, to identify the actual reasons for the increase in kidney failure cases.
Supersaturated design analysis is a statistical approach used in experiments in which the number of factors exceeds the number of runs. It is useful when only a few factors are believed to be significant and is particularly beneficial for screening purposes. These designs are known for their run-size economy and have proven effective in identifying significant factors [6–8]. Edge design analysis refers to the study and evaluation of experimental designs that are particularly useful for screening experiments with more factors than runs. These designs help identify the most influential factors with a limited number of experiments; in addition, the analysis of edge designs often involves statistical methods to assess the robustness and efficiency of the designs [9].
The validity of the two methods, supersaturated design analysis and edge design analysis, can be assessed based on their effectiveness in identifying significant factors in an experimental setup.
2.1. Survey study.
This section contains general questions related to metadata and the causes of kidney failure. The general questions covered gender, age, region, chronic disease, and kidney failure. For the questions on the causes of kidney failure, the respondent (a specialist, the patient, or someone close to the patient) was asked for their own view of the actual cause, choosing among the candidate factors h1 to h18.
After obtaining approval from the Research Ethics Committee, the questionnaire was distributed to the target group from 01/24/2023 to 06/24/2023. The Research Ethics Committee (REC) at the University of Ha’il reviewed and approved this study on January 23, 2023 (research number H-2023-040). Verbal and written consent were obtained from all participants prior to data collection.
2.3.1. Analysis by supersaturated designs.
Contrast-method analysis with supersaturated designs was used to determine which causes of kidney failure were statistically significant. The procedure, following [10], is as follows.
Y is the response vector, and X is the design matrix obtained from the survey. The candidate factors are then ranked by the magnitude of their contrasts, and the largest contrasts identify the dominant factors.
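For intuition only, here is a minimal sketch of contrast-based screening for a supersaturated design; the ±1 coding, the toy data with h8 and h11 active, and the simple ranking rule are illustrative assumptions and do not reproduce the exact procedure of [10].

```python
import numpy as np

def contrast_ranking(X, y):
    """Rank the candidate factors of a supersaturated design by the absolute
    size of their contrasts c_j = x_j' y / n. Illustrative only: the paper's
    exact screening rule is the one given in its reference [10]."""
    n = len(y)
    contrasts = X.T @ y / n                      # one contrast per factor column
    order = np.argsort(-np.abs(contrasts))       # largest absolute contrast first
    return [(f"h{j + 1}", round(float(contrasts[j]), 3)) for j in order]

# Toy example: 12 runs, 18 candidate factors coded +/-1, with h8 and h11 active
rng = np.random.default_rng(5)
X = rng.choice([-1, 1], size=(12, 18))
y = 2.0 * X[:, 7] + 1.5 * X[:, 10] + rng.normal(0, 0.5, 12)
print(contrast_ranking(X, y)[:4])                # h8 and h11 should rank near the top
```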
Edge design analysis was used to determine the actual causes of kidney failure that reached statistical significance. The procedure described in [12] is as follows.
Based on the previous step, we determined the number of active factors w(p) from the value of Z.
More details on this method can be found in reference [ 13 ].
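Again for intuition, the sketch below screens edges by comparing each edge contrast with a robust noise threshold; the MAD-based σ estimate, the z = 2 cutoff, and the toy responses are assumptions of this example, not the precise w(p) rule of [12, 13].

```python
import numpy as np

def edge_screening(y_pairs, z=2.0):
    """Minimal edge-design screening sketch. Each row of y_pairs holds the two
    responses of one edge (a pair of runs differing in a single factor). Edges
    whose contrast exceeds a robust noise threshold are flagged as active.
    The exact w(p) rule of the paper follows its references [12, 13]."""
    contrasts = y_pairs[:, 0] - y_pairs[:, 1]
    sigma = np.median(np.abs(contrasts)) / 0.6745   # MAD-based noise estimate
    active = np.abs(contrasts) > z * sigma
    return contrasts, sigma, np.flatnonzero(active)

# Toy example: 6 edges, one of which carries a clearly active factor
y_pairs = np.array([[5.1, 4.9], [5.0, 5.2], [9.3, 4.8],
                    [5.2, 5.1], [4.7, 5.0], [5.1, 4.8]])
contrasts, sigma, active_edges = edge_screening(y_pairs)
print(active_edges)  # -> [2]: the third edge is flagged as active
```

The robust σ estimate is the design's key trick: because most edges are assumed inactive, the median absolute contrast reflects noise rather than signal.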
In this section, a model and application are selected for each method. Supersaturated design analysis was applied to the selected model to search for the reasons behind the increase in kidney failure cases; the edge design analysis method was then applied to the same selected model and design. Finally, the causes identified by both methods were compared, and those flagged by both are taken to be the actual reasons for the increase in kidney failure cases.
This section presents the results of the questionnaire answers to the general questions related to our research.
Fig 1 shows the questionnaire responses by age group: 18–25 accounted for 52%, the highest response rate, followed by 26–35 (20%) and 36–50 (19%), while respondents over 50 accounted for only 9%. Fig 2 shows that the response rate for males was equal to that for females. Fig 3 shows the response rate for each region: the northern region had the highest rate, estimated at 52%, followed by the central region at 23%, with the remaining regions as shown in the figure. Fig 4 shows whether respondents had a chronic disease or kidney failure; the large majority reported having neither.
(Figs 1–4: https://doi.org/10.1371/journal.pone.0309226.g001 to .g004)
In this section, the questionnaire data are analyzed as a supersaturated design, in which the number of influencing factors exceeds the number of responses. The analysis method described above was then applied.
(Tables 1–12: https://doi.org/10.1371/journal.pone.0309226.t001 to .t012)
In this section, a ready-made edge design consisting of six factors and 12 runs (N) is selected from a published scientific paper [14, 15]. This design was examined to ensure agreement with the questionnaire’s design, and was then analyzed using the edge design method described above. The design chosen from the scientific literature is as follows.
(Table 13: https://doi.org/10.1371/journal.pone.0309226.t013)
The following is the edge design analysis of the data in Table 13. First, Table 14 shows all six contrasts of the response y over the edges, together with their absolute values. Second, we computed the median of these contrasts to predict the number p of active factors. Third, we computed σ and w(p) and applied the decision threshold. Finally, once w(p) for some hypothesized p satisfies the stopping criterion, the method terminates and the active factors are identified. The results are shown in Table 15: w(2) = 1, indicating a single active factor, h8 (recurrent urinary tract infection).
(Tables 14–16: https://doi.org/10.1371/journal.pone.0309226.t014 to .t016)
The following is the edge design analysis of the data in Table 16. All six contrasts of the response y over the edges, together with their absolute values, are presented in Table 17. Second, we computed the median of these contrasts to predict the number p of active factors. Third, we computed σ and w(p) and applied the decision threshold. Finally, once w(p) for some hypothesized p satisfies the stopping criterion, the method terminates and the active factors are identified. The results are listed in Table 18: w(5) = 4, indicating four active factors: h6 (lack of exercise), h7 (obesity), h8 (recurrent urinary tract infection), and h11 (the presence of kidney stones).
(Tables 17–19: https://doi.org/10.1371/journal.pone.0309226.t017 to .t019)
(Supporting information: https://doi.org/10.1371/journal.pone.0309226.s001)
This research was funded by the Scientific Research Deanship at the University of Ha’il, Saudi Arabia, project number RD-21 001.