Iraqi Digital Repository

تقدير الدالة اللامعلمية للبيانات العنقودية == Nonparametric Regression Function Estimation of Clustered Data

Author name: حلا كاظم عبيد

Supervisor name: سجى محمد حسين الهاشمي

General topic: Administration and Economics

Specific topic: Statistics

Degree: Doctorate

University: University of Baghdad - Faculty Of Administration And Economics - Department Of Statistics

Language: Arabic

University location: Baghdad

First pages: 07T3569 - p.pdf

Abstract: Clustered data arise in many of the social, health, and behavioral sciences. Data of this type are characterized by correlation among their observations. Clustering can be expressed in terms of the relationship between measurements on units within the same group, and statistical models must account for this correlation at every level, because failing to do so leads to misleading results. Hence the importance of within-cluster correlation when estimating the nonparametric function for clustered data. Parametric methods are not always desirable for estimating some functions, either because the appropriate functional form is not known in advance or because of other obstacles, so nonparametric methods are used to estimate (smooth) the function. Recent research has developed the use of nonparametric regression when parametric assumptions are not met, since nonparametric regression allows the form of the regression function to be determined more flexibly by the data. Previous research on clustered data has used nonparametric and semiparametric estimation methods while ignoring the correlation within each cluster, which is the very property that distinguishes clustered data. In that work, the local kernel estimator was more efficient when within-cluster correlation was ignored (even when the correlation was of interest to the study). Other work has accounted for the correlation between observations within each cluster by using estimating equations.
Others have shown that kernel methods for clustered data behave quite differently from spline estimators: kernel methods were more efficient when within-cluster correlation was ignored, whereas spline methods achieved lower variance for a fixed smoothing parameter when within-cluster correlation was taken into account in the estimation. This thesis therefore estimates the nonparametric function for clustered data using seemingly unrelated kernel estimators and generalized least squares smoothing spline estimators, proposes robust methods, and compares these methods to identify the best estimator of the nonparametric function, taking into account the correlation structure within clusters. The clustered data considered have the same number of explanatory variables within each cluster. The thesis is divided into five chapters. Chapter I contains the introduction, the aim of the research, and the literature review. Chapter II covers the theoretical side and discusses the methods used to estimate the nonparametric function for clustered data in the presence of correlation. Chapter III covers the experimental side (simulation), applies the methods presented in Chapter II, and identifies the best method as the one with the lowest MAE or MSE. Chapter IV applies the methods to real data on the proportion of white blood cells and its effect on the proportion of blood hemoglobin per patient (cluster). Chapter V presents the main conclusions and recommendations. The simulation experiments found that, when correlation is present, the best method for estimating the nonparametric function for clustered data is the robust generalized least squares smoothing spline estimator.
All methods were also applied to real data on the proportion of white blood cells and its effect on the proportion of blood hemoglobin in patients with blood cancer (leukemia).
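
The following is a minimal illustrative sketch, not the thesis's own code or estimators. It simulates clustered data with an assumed exchangeable within-cluster correlation, fits a pooled Nadaraya-Watson kernel smoother that ignores that correlation (the "working independence" approach the abstract attributes to earlier research), and scores the fit with the MSE and MAE criteria named above. The function names, the Gaussian kernel, the random-intercept error model, and all parameter values are assumptions chosen for illustration. The seemingly unrelated kernel and generalized least squares smoothing spline estimators studied in the thesis are not implemented here.

```python
import math
import random


def simulate_clustered(n_clusters, m, rho, sigma, f, seed=0):
    """Simulate y_ij = f(x_ij) + b_i + e_ij (hypothetical model).

    b_i is a shared cluster effect and e_ij is independent noise, scaled so
    that each error has variance sigma**2 and any two errors in the same
    cluster have correlation rho.
    """
    rng = random.Random(seed)
    sd_b = sigma * math.sqrt(rho)
    sd_e = sigma * math.sqrt(1.0 - rho)
    data = []
    for i in range(n_clusters):
        b = rng.gauss(0.0, sd_b)
        for _ in range(m):
            x = rng.random()
            y = f(x) + b + rng.gauss(0.0, sd_e)
            data.append((i, x, y))
    return data


def nw_estimate(x0, data, h):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel.

    All clusters are pooled, so within-cluster correlation is ignored.
    """
    num = 0.0
    den = 0.0
    for _, x, y in data:
        w = math.exp(-0.5 * ((x - x0) / h) ** 2)
        num += w * y
        den += w
    return num / den


def mse_mae(f, data, h, grid):
    """Mean squared and mean absolute error of the fit over a grid."""
    errors = [nw_estimate(x0, data, h) - f(x0) for x0 in grid]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return mse, mae


def true_f(x):
    # Assumed test function, chosen only for illustration.
    return math.sin(2.0 * math.pi * x)


data = simulate_clustered(n_clusters=50, m=4, rho=0.5, sigma=0.3, f=true_f, seed=1)
grid = [k / 10.0 for k in range(1, 10)]
mse, mae = mse_mae(true_f, data, h=0.1, grid=grid)
print("MSE:", mse, "MAE:", mae)
```

In a simulation study of the kind the abstract describes, a loop over many seeds, correlation levels, and competing estimators would replace the single call at the end, and the method with the smallest average MSE or MAE would be preferred.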