Institute of Statistical Research and Training
http://repository.library.du.ac.bd:8080/xmlui/xmlui/handle/123456789/106
Wed, 15 Jul 2020 12:21:21 GMT
http://repository.library.du.ac.bd:8080/xmlui/xmlui/handle/123456789/349
Use of adaptive cluster sampling for identifying the land producing non hybrid crops
Chowdhury, Saiful Alam
Conventional sampling methods are very difficult to use for estimating rare and clustered characteristics; an alternative such as adaptive cluster sampling is thought to be more appropriate in such situations. Hybrid Boro cultivation in Bangladesh is both rare and clustered, so adaptive cluster sampling may be appropriate for estimating the proportion of Hybrid Boro use. The applicability of adaptive cluster sampling for this purpose is therefore examined through a simulation based on re-sampling of the Agricultural Census (2008) data. The prime objective of the study is to investigate the suitability of the adaptive cluster sampling method for estimating the proportion of land producing non-hybrid crops. Since only household (HH) level data are available (Agricultural Census, 2008), the proportion of HH producing non-hybrid crops (which carries the same information as the proportion of HH producing hybrid crops) was investigated. The specific objectives of the study were to estimate the proportion of HH cultivating Hybrid Boro using simple random sampling and adaptive cluster sampling, and to obtain and compare the bias and standard error of the estimators for each method through extensive simulation studies. The simulation study considered different small sample sizes, namely 100, 200, 300 and 500. The sample sizes were chosen arbitrarily, keeping in mind that adaptive cluster sampling is more profitable for smaller sample sizes. The Monte Carlo absolute percentage relative bias and Monte Carlo standard error of the estimators were calculated for each method and each sample size.
Compared in terms of Monte Carlo absolute percentage relative bias and Monte Carlo standard error, the major findings revealed that the estimator of the proportion of HH cultivating Hybrid Boro under adaptive cluster sampling has higher variance and lower bias than under simple random sampling for all sample sizes considered in this study. The ultimate sample size realized under adaptive cluster sampling was also recorded, and the average ultimate sample size was about 10 to 20 percent higher than the initial sample size. The most interesting finding was the strength of adaptive cluster sampling in capturing more information. In estimating the proportion of HH cultivating Hybrid Boro, the simulation revealed that adaptive cluster sampling is far better than simple random sampling at avoiding a bad sample, that is, one containing very few units with the targeted characteristic. For simple random sampling this risk is higher for divisions with a smaller true population proportion. These findings are consistent with the conclusion that simple random sampling may produce more bias than adaptive cluster sampling.
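The two comparison criteria named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (absolute percentage relative bias as |mean of estimates - true value| / true value x 100, and Monte Carlo standard error as the standard deviation of the estimates across replicates), which the abstract itself does not spell out. The population below is synthetic, not the Agricultural Census data, and only the simple random sampling estimator is shown; the function names are hypothetical.

```python
import random
import statistics


def srs_proportion(population, n, rng):
    """Estimate a proportion from a simple random sample drawn without replacement."""
    sample = rng.sample(population, n)
    return sum(sample) / n


def monte_carlo_summary(population, n, replicates=2000, seed=0):
    """Monte Carlo absolute percentage relative bias and standard error.

    Assumed (conventional) definitions, not taken from the thesis:
      bias = |mean(estimates) - true_p| / true_p * 100
      se   = sample standard deviation of the estimates
    """
    rng = random.Random(seed)
    true_p = sum(population) / len(population)
    estimates = [srs_proportion(population, n, rng) for _ in range(replicates)]
    mean_estimate = statistics.fmean(estimates)
    return {
        "true_p": true_p,
        "abs_pct_rel_bias": abs(mean_estimate - true_p) / true_p * 100,
        "mc_se": statistics.stdev(estimates),
    }


# Synthetic population (an assumption for illustration only):
# 10,000 households, 5% of which have the rare characteristic.
population = [1] * 500 + [0] * 9500

for n in (100, 200, 300, 500):  # sample sizes listed in the abstract
    print(n, monte_carlo_summary(population, n))
```

An adaptive cluster sampling estimator could be evaluated with the same summary function by swapping out `srs_proportion`, which is how the two designs would be compared on equal footing.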
This thesis was submitted in partial fulfillment of the requirements for the degree of M.Phil. in Applied Statistics.
Thu, 12 Feb 2015 00:00:00 GMT
http://repository.library.du.ac.bd:8080/xmlui/xmlui/handle/123456789/348
A multistage model for prediction of sequence of events
Chowdhury, Rafiqul Islam
This dissertation investigates the existing methods for risk prediction of a sequence of events from longitudinal studies with continuous time data and proposes a simple alternative method. These outcomes (events) can change status at different follow-ups, which may produce a large number of paths or trajectories. In addition, regressive models for multinomial and ordinal outcomes with discrete time data are proposed to obtain a joint model for risk prediction of a sequence of events. A key challenge is the simplification and generalization of the existing method for continuous time data to risk prediction for a long sequence of events at different stages. Most existing models were proposed to solve problems arising from the progression of specific disease processes. The proposed alternative multistage procedure simplifies the transition models for risk prediction of a sequence of events for continuous time data. This framework provides the estimates for each stage in the process conditionally, and the conditional estimates are linked through marginal and conditional models to obtain the joint probabilities needed for predicting disease status based on the potential risk factors. The proposed method of prediction is a new development that uses a series of events in a conditional setting, from the beginning to the endpoint. Also, a general form of the integral is developed for predicting the joint probability of a sequence of events from longitudinal studies for (i) different types of trajectories and (ii) any segment of a trajectory, along with a generalization to any number of stages, which is a new development. In follow-up or panel studies, multinomial outcomes may occur within an interval where transition times are not exactly known, or the time of the event is itself discrete.
Available models for risk prediction of multinomial outcomes with specified risk factors cover only a single response and have not been extended to prediction of a sequence of events for discrete time data at different stages. Regressive models for multinomial outcomes are proposed, and a modeling framework is then developed to predict the joint probabilities for a sequence of events. The proposed models link the marginal model and the sequence of conditional models to provide the joint model needed for predicting the probability of a trajectory based on specified covariate patterns. The marginal model uses the outcome variable at baseline, and the models at the subsequent follow-ups provide the estimates of the parameters of the conditional models. The major improvement of the proposed framework is that one needs to fit a significantly smaller number of models compared to conditional models such as Markov models. Independence of the repeated outcomes would allow the use of simpler models, and goodness-of-fit of the joint model is required to assess model performance. The proposed goodness-of-fit test for the joint model is obtained by linking marginal and conditional models. The test for independence uses marginal models for each repeated outcome. The simulation study and an application to real data demonstrate the usefulness and illustrate the performance of these tests. For ordinal outcomes from longitudinal studies, a regressive proportional odds model is proposed, together with a regressive partial proportional odds model for cases where the proportional odds assumption is violated. A framework is then developed to predict joint probabilities for a sequence of ordinal outcomes. The major improvement of the proposed model is that only one model is required for each repeated outcome, compared to the sequence of conditional models required by approaches such as Markov models. Results from these two models are compared with those from the proposed regressive multinomial logistic model.
Tests for goodness-of-fit and for independence are also presented. The proposed models provide the estimates for each stage in the process conditionally, and the joint model can be obtained for any order to predict the risk of a sequence of events. The proposed regressive partial proportional odds model and regressive multinomial model performed better than the regressive proportional odds model when the proportional odds assumption is violated. Simulation studies showed satisfactory performance of the proposed regressive models for ordinal outcomes. All the proposed models and the risk prediction framework for both continuous and discrete time data are new developments. The major improvement of the proposed model is that it reduces over-parameterization. One can easily add interaction terms among previous outcomes and predictors in the proposed framework, which may provide a better understanding of the underlying process and of the relationships between outcomes and risk factors. Using the developed framework, modeling and risk prediction for a sequence of events can be performed in many fields of study, such as epidemiology, public health, survival analysis, genetics, reliability, and environmental studies. This model would be very useful for analyzing big data. Existing software can be used for model fitting and for risk prediction of a sequence of events.
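The linking of a marginal model with a sequence of conditional models described in the abstract above corresponds, in its generic form, to the chain rule of probability. The notation below is an assumption introduced for illustration (outcomes Y_1, ..., Y_k at successive follow-ups and a covariate vector x) and is not taken from the thesis, whose regressive models may parameterize each factor differently:

```latex
P(Y_1 = y_1, \dots, Y_k = y_k \mid \mathbf{x})
  = P(Y_1 = y_1 \mid \mathbf{x})
    \prod_{j=2}^{k} P(Y_j = y_j \mid y_1, \dots, y_{j-1}, \mathbf{x})
```

The first factor is the marginal (baseline) model and each factor in the product is a conditional model in which earlier outcomes enter as predictors, so the probability of any trajectory is obtained by multiplying the fitted stage-wise probabilities.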
This thesis was submitted in total fulfillment of the requirements for the degree of Doctor of Philosophy (Ph.D.) at the Institute of Statistical Research and Training (ISRT).
Mon, 17 Dec 2018 00:00:00 GMT