You limited your search to:

  Partner: UNT Libraries
 Degree Discipline: Computer Science
Bayesian Probabilistic Reasoning Applied to Mathematical Epidemiology for Predictive Spatiotemporal Analysis of Infectious Diseases
Abstract Probabilistic reasoning under uncertainty suits well to analysis of disease dynamics. The stochastic nature of disease progression is modeled by applying the principles of Bayesian learning. Bayesian learning predicts the disease progression, including prevalence and incidence, for a geographic region and demographic composition. Public health resources, prioritized by the order of risk levels of the population, will efficiently minimize the disease spread and curtail the epidemic at the earliest. A Bayesian network representing the outbreak of influenza and pneumonia in a geographic region is ported to a newer region with different demographic composition. Upon analysis for the newer region, the corresponding prevalence of influenza and pneumonia among the different demographic subgroups is inferred for the newer region. Bayesian reasoning coupled with disease timeline is used to reverse engineer an influenza outbreak for a given geographic and demographic setting. The temporal flow of the epidemic among the different sections of the population is analyzed to identify the corresponding risk levels. In comparison to spread vaccination, prioritizing the limited vaccination resources to the higher risk groups results in relatively lower influenza prevalence. HIV incidence in Texas from 1989-2002 is analyzed using demographic based epidemic curves. Dynamic Bayesian networks are integrated with probability distributions of HIV surveillance data coupled with the census population data to estimate the proportion of HIV incidence among the different demographic subgroups. Demographic based risk analysis lends to observation of varied spectrum of HIV risk among the different demographic subgroups. A methodology using hidden Markov models is introduced that enables to investigate the impact of social behavioral interactions in the incidence and prevalence of infectious diseases. The methodology is presented in the context of simulated disease outbreak data for influenza. Probabilistic reasoning analysis enhances the understanding of disease progression in order to identify the critical points of surveillance, control and prevention. Public health resources, prioritized by the order of risk levels of the population, will efficiently minimize the disease spread and curtail the epidemic at the earliest. digital.library.unt.edu/ark:/67531/metadc5302/
Boosting for Learning From Imbalanced, Multiclass Data Sets
Access: Use of this item is restricted to the UNT Community.
In many real-world applications, it is common to have uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied with the improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers are created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant reaching above 50%. As the volume of training data increase, the gain of efficiency with the proposed method becomes more significant. digital.library.unt.edu/ark:/67531/metadc407775/
Qos Aware Service Oriented Architecture
Service-oriented architecture enables web services to operate in a loosely-coupled setting and provides an environment for dynamic discovery and use of services over a network using standards such as WSDL, SOAP, and UDDI. Web service has both functional and non-functional characteristics. This thesis work proposes to add QoS descriptions (non-functional properties) to WSDL and compose various services to form a business process. This composition of web services also considers QoS properties along with functional properties and the composed services can again be published as a new Web Service and can be part of any other composition using Composed WSDL. digital.library.unt.edu/ark:/67531/metadc500032/
Refactoring FrameNet for Efficient Relational Queries
The FrameNet database is being used in a variety of NLP research and applications such as word sense disambiguation, machine translation, information extraction and question answering. The database is currently available in XML format. The XML database though a wholesome way of distributing data in its entireness, is not practical for use unless converted to a more application friendly database. In light of this we have successfully converted the XML database to a relational MySQL™ database. This conversion reduced the amount of data storage amount to less than half. Most importantly the new database enables us to perform fast complex querying and facilitates use by applications and research. We show the steps taken to ensure relational integrity of the data during the refactoring process and a simple demo application demonstrating ease of use. digital.library.unt.edu/ark:/67531/metadc4443/
3GPP Long Term Evolution LTE Scheduling
Future generation cellular networks are expected to deliver an omnipresent broadband access network for an endlessly increasing number of subscribers. Long term Evolution (LTE) represents a significant milestone towards wireless networks known as 4G cellular networks. A key feature of LTE is the implementation of enhanced Radio Resource Management (RRM) mechanism to improve the system performance. The structure of LTE networks was simplified by diminishing the number of the nodes of the core network. Also, the design of the radio protocol architecture is quite unique. In order to achieve high data rate in LTE, 3rd Generation Partnership Project (3GPP) has selected Orthogonal Frequency Division Multiplexing (OFDM) as an appropriate scheme in terms of downlinks. However, the proper scheme for an uplink is the Single-Carrier Frequency Domain Multiple Access due to the peak-to-average-power-ratio (PAPR) constraint. LTE packet scheduling plays a primary role as part of RRM to improve the system’s data rate as well as supporting various QoS requirements of mobile services. The major function of the LTE packet scheduler is to assign Physical Resource Blocks (PRBs) to mobile User Equipment (UE). In our work, we formed a proposed packet scheduler algorithm. The proposed scheduler algorithm acts based on the number of UEs attached to the eNodeB. To evaluate the proposed scheduler algorithm, we assumed two different scenarios based on a number of UEs. When the number of UE is lower than the number of PRBs, the UEs with highest Channel Quality Indicator (CQI) will be assigned PRBs. Otherwise, the scheduler will assign PRBs based on a given proportional fairness metric. The eNodeB’s throughput is increased when the proposed algorithm was implemented. digital.library.unt.edu/ark:/67531/metadc490046/
Real-time Rendering of Burning Objects in Video Games
In recent years there has been growing interest in limitless realism in computer graphics applications. Among those, my foremost concentration falls into the complex physical simulations and modeling with diverse applications for the gaming industry. Different simulations have been virtually successful by replicating the details of physical process. As a result, some were strong enough to lure the user into believable virtual worlds that could destroy any sense of attendance. In this research, I focus on fire simulations and its deformation process towards various virtual objects. In most game engines model loading takes place at the beginning of the game or when the game is transitioning between levels. Game models are stored in large data structures. Since changing or adjusting a large data structure while the game is proceeding may adversely affect the performance of the game. Therefore, developers may choose to avoid procedural simulations to save resources and avoid interruptions on performance. I introduce a process to implement a real-time model deformation while maintaining performance. It is a challenging task to achieve high quality simulation while utilizing minimum resources to represent multiple events in timely manner. Especially in video games, this overwhelming criterion would be robust enough to sustain the engaging player's willing suspension of disbelief. I have implemented and tested my method on a relatively modest GPU using CUDA. My experiments conclude this method gives a believable visual effect while using small fraction of CPU and GPU resources. digital.library.unt.edu/ark:/67531/metadc500131/
An Integrated Architecture for Ad Hoc Grids
Extensive research has been conducted by the grid community to enable large-scale collaborations in pre-configured environments. grid collaborations can vary in scale and motivation resulting in a coarse classification of grids: national grid, project grid, enterprise grid, and volunteer grid. Despite the differences in scope and scale, all the traditional grids in practice share some common assumptions. They support mutually collaborative communities, adopt a centralized control for membership, and assume a well-defined non-changing collaboration. To support grid applications that do not confirm to these assumptions, we propose the concept of ad hoc grids. In the context of this research, we propose a novel architecture for ad hoc grids that integrates a suite of component frameworks. Specifically, our architecture combines the community management framework, security framework, abstraction framework, quality of service framework, and reputation framework. The overarching objective of our integrated architecture is to support a variety of grid applications in a self-controlled fashion with the help of a self-organizing ad hoc community. We introduce mechanisms in our architecture that successfully isolates malicious elements from the community, inherently improving the quality of grid services and extracting deterministic quality assurances from the underlying infrastructure. We also emphasize on the technology-independence of our architecture, thereby offering the requisite platform for technology interoperability. The feasibility of the proposed architecture is verified with a high-quality ad hoc grid implementation. Additionally, we have analyzed the performance and behavior of ad hoc grids with respect to several control parameters. digital.library.unt.edu/ark:/67531/metadc5300/
Resource Efficient and Scalable Routing using Intelligent Mobile Agents
Many of the contemporary routing algorithms use simple mechanisms such as flooding or broadcasting to disseminate the routing information available to them. Such routing algorithms cause significant network resource overhead due to the large number of messages generated at each host/router throughout the route update process. Many of these messages are wasteful since they do not contribute to the route discovery process. Reducing the resource overhead may allow for several algorithms to be deployed in a wide range of networks (wireless and ad-hoc) which require a simple routing protocol due to limited availability of resources (memory and bandwidth). Motivated by the need to reduce the resource overhead associated with routing algorithms a new implementation of distance vector routing algorithm using an agent-based paradigm known as Agent-based Distance Vector Routing (ADVR) has been proposed. In ADVR, the ability of route discovery and message passing shifts from the nodes to individual agents that traverse the network, co-ordinate with each other and successively update the routing tables of the nodes they visit. digital.library.unt.edu/ark:/67531/metadc4240/
Content-Based Image Retrieval by Integration of Metadata Encoded Multimedia Features in Constructing a Video Summarizer Application.
Content-based image retrieval (CBIR) is the retrieval of images from a collection by means of internal feature measures of the information content of the images. In CBIR systems, text media is usually used only to retrieve exemplar images for further searching by image feature content. This research work describes a new method for integrating multimedia text and image content features to increase the retrieval performance of the system. I am exploring the content-based features of an image extracted from a video to build a storyboard for search retrieval of images. Metadata encoded multimedia features include extracting primitive features like color, shape and text from an image. Histograms are built for all the features extracted and stored in a database. Images are searched based on comparing these histogram values of the extracted image with the stored values. These histogram values are used for extraction of keyframes from a collection of images parsed from a video file. Individual shots of images are extracted from a video clip and run through processes that extract the features and build the histogram values. A keyframe extraction algorithm is run to get the keyframes from the collection of images to build a storyboard of images. In video retrieval, speech recognition and other multimedia encoding could help improve the CBIR indexing technique and makes keyframe extraction and searching effective. Research in area of embedding sound and other multimedia could enhance effective video retrieval. digital.library.unt.edu/ark:/67531/metadc4238/
Privacy Management for Online Social Networks
One in seven people in the world use online social networking for a variety of purposes -- to keep in touch with friends and family, to share special occasions, to broadcast announcements, and more. The majority of society has been bought into this new era of communication technology, which allows everyone on the internet to share information with friends. Since social networking has rapidly become a main form of communication, holes in privacy have become apparent. It has come to the point that the whole concept of sharing information requires restructuring. No longer are online social networks simply technology available for a niche market; they are in use by all of society. Thus it is important to not forget that a sense of privacy is inherent as an evolutionary by-product of social intelligence. In any context of society, privacy needs to be a part of the system in order to help users protect themselves from others. This dissertation attempts to address the lack of privacy management in online social networks by designing models which understand the social science behind how we form social groups and share information with each other. Social relationship strength was modeled using activity patterns, vocabulary usage, and behavioral patterns. In addition, automatic configuration for default privacy settings was proposed to help prevent new users from leaking personal information. This dissertation aims to mobilize a new era of social networking that understands social aspects of human network, and uses that knowledge to honor users' privacy. digital.library.unt.edu/ark:/67531/metadc283816/
The Role of Intelligent Mobile Agents in Network Management and Routing
In this research, the application of intelligent mobile agents to the management of distributed network environments is investigated. Intelligent mobile agents are programs which can move about network systems in a deterministic manner in carrying their execution state. These agents can be considered an application of distributed artificial intelligence where the (usually small) agent code is moved to the data and executed locally. The mobile agent paradigm offers potential advantages over many conventional mechanisms which move (often large) data to the code, thereby wasting available network bandwidth. The performance of agents in network routing and knowledge acquisition has been investigated and simulated. A working mobile agent system has also been designed and implemented in JDK 1.2. digital.library.unt.edu/ark:/67531/metadc2736/
Control Mechanisms and Recovery Techniques for Real-Time Data Transmission Over the Internet.
Streaming multimedia content with UDP has become popular over distributed systems such as an Internet. This may encounter many losses due to dropped packets or late arrivals at destination since UDP can only provide best effort delivery. Even UDP doesn't have any self-recovery mechanism from congestion collapse or bursty loss to inform sender of the data to adjust future transmission rate of data like in TCP. So there is a need to incorporate various control schemes like forward error control, interleaving, and congestion control and error concealment into real-time transmission to prevent from effect of losses. Loss can be repaired by retransmission if roundtrip delay is allowed, otherwise error concealment techniques will be used based on the type and amount of loss. This paper implements the interleaving technique with packet spacing of varying interleaver block size for protecting real-time data from loss and its effect during transformation across the Internet. The packets are interleaved and maintain some time gap between two consecutive packets before being transmitted into the Internet. Thus loss of packets can be reduced from congestion and preventing loss of consecutive packets of information when a burst of several packets are lost. Several experiments have been conducted with video data for analysis of proposed model. digital.library.unt.edu/ark:/67531/metadc3268/
Multi-perspective, Multi-modal Image Registration and Fusion
Multi-modal image fusion is an active research area with many civilian and military applications. Fusion is defined as strategic combination of information collected by various sensors from different locations or different types in order to obtain a better understanding of an observed scene or situation. Fusion of multi-modal images cannot be completed unless these two modalities are spatially aligned. In this research, I consider two important problems. Multi-modal, multi-perspective image registration and decision level fusion of multi-modal images. In particular, LiDAR and visual imagery. Multi-modal image registration is a difficult task due to the different semantic interpretation of features extracted from each modality. This problem is decoupled into three sub-problems. The first step is identification and extraction of common features. The second step is the determination of corresponding points. The third step consists of determining the registration transformation parameters. Traditional registration methods use low level features such as lines and corners. Using these features require an extensive optimization search in order to determine the corresponding points. Many methods use global positioning systems (GPS), and a calibrated camera in order to obtain an initial estimate of the camera parameters. The advantages of our work over the previous works are the following. First, I used high level-features, which significantly reduce the search space for the optimization process. Second, the determination of corresponding points is modeled as an assignment problem between a small numbers of objects. On the other side, fusing LiDAR and visual images is beneficial, due to the different and rich characteristics of both modalities. LiDAR data contain 3D information, while images contain visual information. Developing a fusion technique that uses the characteristics of both modalities is very important. I establish a decision-level fusion technique using manifold models. digital.library.unt.edu/ark:/67531/metadc149562/
Hopfield Networks as an Error Correcting Technique for Speech Recognition
Access: Use of this item is restricted to the UNT Community.
I experimented with Hopfield networks in the context of a voice-based, query-answering system. Hopfield networks are used to store and retrieve patterns. I used this technique to store queries represented as natural language sentences and I evaluated the accuracy of the technique for error correction in a spoken question-answering dialog between a computer and a user. I show that the use of an auto-associative Hopfield network helps make the speech recognition system more fault tolerant. I also looked at the available encoding schemes to convert a natural language sentence into a pattern of zeroes and ones that can be stored in the Hopfield network reliably, and I suggest scalable data representations which allow storing a large number of queries. digital.library.unt.edu/ark:/67531/metadc5551/
Practical Cursive Script Recognition
This research focused on the off-line cursive script recognition application. The problem is very large and difficult and there is much room for improvement in every aspect of the problem. Many different aspects of this problem were explored in pursuit of solutions to create a more practical and usable off-line cursive script recognizer than is currently available. digital.library.unt.edu/ark:/67531/metadc277710/
Investigating the Extractive Summarization of Literary Novels
Abstract Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific reports, and others, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are witnessing however a change: an increasingly larger number of books become available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research that has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents, and outperform the traditional extractive summarization systems typically addressing news genre. digital.library.unt.edu/ark:/67531/metadc103298/
Using Reinforcement Learning in Partial Order Plan Space
Access: Use of this item is restricted to the UNT Community.
Partial order planning is an important approach that solves planning problems without completely specifying the orderings between the actions in the plan. This property provides greater flexibility in executing plans; hence making the partial order planners a preferred choice over other planning methodologies. However, in order to find partially ordered plans, partial order planners perform a search in plan space rather than in space of world states and an uninformed search in plan space leads to poor efficiency. In this thesis, I discuss applying a reinforcement learning method, called First-visit Monte Carlo method, to partial order planning in order to design agents which do not need any training data or heuristics but are still able to make informed decisions in plan space based on experience. Communicating effectively with the agent is crucial in reinforcement learning. I address how this task was accomplished in plan space and the results from an evaluation of a blocks world test bed. digital.library.unt.edu/ark:/67531/metadc5232/
Natural Language Interfaces to Databases
Natural language interfaces to databases (NLIDB) are systems that aim to bridge the gap between the languages used by humans and computers, and automatically translate natural language sentences to database queries. This thesis proposes a novel approach to NLIDB, using graph-based models. The system starts by collecting as much information as possible from existing databases and sentences, and transforms this information into a knowledge base for the system. Given a new question, the system will use this knowledge to analyze and translate the sentence into its corresponding database query statement. The graph-based NLIDB system uses English as the natural language, a relational database model, and SQL as the formal query language. In experiments performed with natural language questions ran against a large database containing information about U.S. geography, the system showed good performance compared to the state-of-the-art in the field. digital.library.unt.edu/ark:/67531/metadc5474/
Measuring Vital Signs Using Smart Phones
Smart phones today have become increasingly popular with the general public for its diverse abilities like navigation, social networking, and multimedia facilities to name a few. These phones are equipped with high end processors, high resolution cameras, built-in sensors like accelerometer, orientation-sensor, light-sensor, and much more. According to comScore survey, 25.3% of US adults use smart phones in their daily lives. Motivated by the capability of smart phones and their extensive usage, I focused on utilizing them for bio-medical applications. In this thesis, I present a new application for a smart phone to quantify the vital signs such as heart rate, respiratory rate and blood pressure with the help of its built-in sensors. Using the camera and a microphone, I have shown how the blood pressure and heart rate can be determined for a subject. People sometimes encounter minor situations like fainting or fatal accidents like car crash at unexpected times and places. It would be useful to have a device which can measure all vital signs in such an event. The second part of this thesis demonstrates a new mode of communication for next generation 9-1-1 calls. In this new architecture, the call-taker will be able to control the multimedia elements in the phone from a remote location. This would help the call-taker or first responder to have a better control over the situation. Transmission of the vital signs measured using the smart phone can be a life saver in critical situations. In today's voice oriented 9-1-1 calls, the dispatcher first collects critical information (e.g., location, call-back number) from caller, and assesses the situation. Meanwhile, the dispatchers constantly face a "60-second dilemma"; i.e., within 60 seconds, they need to make a complicated but important decision, whether to dispatch and, if so, what to dispatch. The dispatchers often feel that they lack sufficient information to make a confident dispatch decision. This remote-media-control described in this system will be able to facilitate information acquisition and decision-making in emergency situations within the 60-second response window in 9-1-1 calls using new multimedia technologies. digital.library.unt.edu/ark:/67531/metadc33139/
Higher Compression from the Burrows-Wheeler Transform with New Algorithms for the List Update Problem
Burrows-Wheeler compression is a three stage process in which the data is transformed with the Burrows-Wheeler Transform, then transformed with Move-To-Front, and finally encoded with an entropy coder. Move-To-Front, Transpose, and Frequency Count are some of the many algorithms used on the List Update problem. In 1985, Competitive Analysis first showed the superiority of Move-To-Front over Transpose and Frequency Count for the List Update problem with arbitrary data. Earlier studies due to Bitner assumed independent identically distributed data, and showed that while Move-To-Front adapts to a distribution faster, incurring less overwork, the asymptotic costs of Frequency Count and Transpose are less. The improvements to Burrows-Wheeler compression this work covers are increases in the amount, not speed, of compression. Best x of 2x-1 is a new family of algorithms created to improve on Move-To-Front's processing of the output of the Burrows-Wheeler Transform which is like piecewise independent identically distributed data. Other algorithms for both the middle stage of Burrows-Wheeler compression and the List Update problem for which overwork, asymptotic cost, and competitive ratios are also analyzed are several variations of Move One From Front and part of the randomized algorithm Timestamp. The Best x of 2x - 1 family includes Move-To-Front, the part of Timestamp of interest, and Frequency Count. Lastly, a greedy choosing scheme, Snake, switches back and forth as the amount of compression that two List Update algorithms achieves fluctuates, to increase overall compression. The Burrows-Wheeler Transform is based on sorting of contexts. The other improvements are better sorting orders, such as “aeioubcdf...” instead of standard alphabetical “abcdefghi...” on English text data, and an algorithm for computing orders for any data, and Gray code sorting instead of standard sorting. Both techniques lessen the overwork incurred by whatever List Update algorithms are used by reducing the difference between adjacent sorted contexts. digital.library.unt.edu/ark:/67531/metadc2909/
Automatic Speech Recognition Using Finite Inductive Sequences
This dissertation addresses the general problem of recognition of acoustic signals which may be derived from speech, sonar, or acoustic phenomena. The specific problem of recognizing speech is the main focus of this research. The intention is to design a recognition system for a definite number of discrete words. For this purpose specifically, eight isolated words from the T1MIT database are selected. Four medium length words "greasy," "dark," "wash," and "water" are used. In addition, four short words are considered "she," "had," "in," and "all." The recognition system addresses the following issues: filtering or preprocessing, training, and decision-making. The preprocessing phase uses linear predictive coding of order 12. Following the filtering process, a vector quantization method is used to further reduce the input data and generate a finite inductive sequence of symbols representative of each input signal. The sequences generated by the vector quantization process of the same word are factored, and a single ruling or reference template is generated and stored in a codebook. This system introduces a new modeling technique which relies heavily on the basic concept that all finite sequences are finitely inductive. This technique is used in the training stage. In order to accommodate the variabilities in speech, the training is performed casualty, and a large number of training speakers is used from eight different dialect regions. Hence, a speaker independent recognition system is realized. The matching process compares the incoming speech with each of the templates stored, and a closeness ration is computed. A ratio table is generated anH the matching word that corresponds to the smallest ratio (i.e. indicating that the ruling has removed most of the symbols) is selected. Promising results were obtained for isolated words, and the recognition rates ranged between 50% and 100%. digital.library.unt.edu/ark:/67531/metadc277749/
Performance Evaluation of MPLS on Quality of Service in Voice Over IP (VoIP) Networks
Access: Use of this item is restricted to the UNT Community.
The transmission of voice data over Internet Protocol (IP) networks is rapidly gaining acceptance in the field of networking. The major voice transmissions in the IP networks are involved in Internet telephony, which is also known as IP telephony or Voice Over IP (VoIP). VoIP is undergoing many enhancements to provide the end users with same quality as in the public switched telephone networks (PSTN). These enhancements are mostly required in quality of service (QoS) for the transmission of voice data over the IP networks. As with recent developments in the networking field, various protocols came into market to provide the QoS in IP networks - of them, multi protocol label switching (MPLS) is the most reliable and upcoming protocol for working on QoS. The problem of the thesis is to develop an IP-based virtual network, with end hosts and routers, implement MPLS on the network, and analyze its QoS for voice data transmission. digital.library.unt.edu/ark:/67531/metadc3293/
Temporal Connectionist Expert Systems Using a Temporal Backpropagation Algorithm
Representing time has been considered a general problem for artificial intelligence research for many years. More recently, the question of representing time has become increasingly important in representing human decision making process through connectionist expert systems. Because most human behaviors unfold over time, any attempt to represent expert performance, without considering its temporal nature, can often lead to incorrect results. A temporal feedforward neural network model that can be applied to a number of neural network application areas, including connectionist expert systems, has been introduced. The neural network model has a multi-layer structure, i.e. the number of layers is not limited. Also, the model has the flexibility of defining output nodes in any layer. This is especially important for connectionist expert system applications. A temporal backpropagation algorithm which supports the model has been developed. The model along with the temporal backpropagation algorithm makes it extremely practical to define any artificial neural network application. Also, an approach that can be followed to decrease the memory space used by weight matrix has been introduced. The algorithm was tested using a medical connectionist expert system to show how best we describe not only the disease but also the entire course of the disease. The system, first, was trained using a pattern that was encoded from the expert system knowledge base rules. Following then, series of experiments were carried out using the temporal model and the temporal backpropagation algorithm. The first series of experiments was done to determine if the training process worked as predicted. In the second series of experiments, the weight matrix in the trained system was defined as a function of time intervals before presenting the system with the learned patterns. The result of the two experiments indicate that both approaches produce correct results. The only difference between the two results was that compressing the weight matrix required more training epochs to produce correct results. To get a measure of the correctness of the results, an error measure which is the value of the error squared was summed over all patterns to get a total sum of squares. digital.library.unt.edu/ark:/67531/metadc278824/
Modeling the Impact and Intervention of a Sexually Transmitted Disease: Human Papilloma Virus
Many human papilloma virus (HPV) types are sexually transmitted and HPV DNA types 16, 18, 31, and 45 account for more than 75% if all cervical dysplasia. Candidate vaccines are successfully completing US Federal Drug Agency (FDA) phase III testing and several drug companies are in licensing arbitration. Once this vaccine become available it is unlikely that 100% vaccination coverage will be probable; hence, the need for vaccination strategies that will have the greatest reduction on the endemic prevalence of HPV. This thesis introduces two discrete-time models for evaluating the effect of demographic-biased vaccination strategies: one model incorporates temporal demographics (i.e., age) in population compartments; the other non-temporal demographics (i.e., race, ethnicity). Also presented is an intuitive Web-based interface that was developed to allow the user to evaluate the effects on prevalence of a demographic-biased intervention by tailoring the model parameters to specific demographics and geographical region. digital.library.unt.edu/ark:/67531/metadc5289/
An Approach Towards Self-Supervised Classification Using Cyc
Due to the long duration required to perform manual knowledge entry by human knowledge engineers it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work. digital.library.unt.edu/ark:/67531/metadc5470/
DADS - A Distributed Agent Delivery System
Mobile agents require an appropriate platform that can facilitate their migration and execution. In particular, the design and implementation of such a system must balance several factors that will ensure that its constituent agents are executed without problems. Besides the basic requirements of migration and execution, an agent system must also provide mechanisms to ensure the security and survivability of an agent when it migrates between hosts. In addition, the system should be simple enough to facilitate its widespread use across large scale networks (i.e Internet). To address these issues, this thesis discusses the design and implementation of the Distributed Agent Delivery System (DADS). The DADS provides a de-coupled design that separates agent acceptance from agent execution. Using functional modules, the DADS provides services ranging from language execution and security to fault-tolerance and compression. Modules allow the administrator(s) of hosts to declare, at run-time, the services that they want to provide. Since each administrative domain is different, the DADS provides a platform that can be adapted to exchange heterogeneous blends of agents across large scale networks. digital.library.unt.edu/ark:/67531/metadc3352/
Keywords in the mist: Automated keyword extraction for very large documents and back of the book indexing.
This research addresses the problem of automatic keyphrase extraction from large documents and back of the book indexing. The potential benefits of automating this process are far reaching, from improving information retrieval in digital libraries, to saving countless man-hours by helping professional indexers creating back of the book indexes. The dissertation introduces a new methodology to evaluate automated systems, which allows for a detailed, comparative analysis of several techniques for keyphrase extraction. We introduce and evaluate both supervised and unsupervised techniques, designed to balance the resource requirements of an automated system and the best achievable performance. Additionally, a number of novel features are proposed, including a statistical informativeness measure based on chi statistics; an encyclopedic feature that taps into the vast knowledge base of Wikipedia to establish the likelihood of a phrase referring to an informative concept; and a linguistic feature based on sophisticated semantic analysis of the text using current theories of discourse comprehension. The resulting keyphrase extraction system is shown to outperform the current state of the art in supervised keyphrase extraction by a large margin. Moreover, a fully automated back of the book indexing system based on the keyphrase extraction system was shown to lead to back of the book indexes closely resembling those created by human experts. digital.library.unt.edu/ark:/67531/metadc6118/
Symplectic Integration of Nonseparable Hamiltonian Systems
Numerical methods are usually necessary in solving Hamiltonian systems since there is often no closed-form solution. By utilizing a general property of Hamiltonians, namely the symplectic property, all of the qualities of the system may be preserved for indefinitely long integration times because all of the integral (Poincare) invariants are conserved. This allows for more reliable results and frequently leads to significantly shorter execution times as compared to conventional methods. The resonant triad Hamiltonian with one degree of freedom will be focused upon for most of the numerical tests because of its difficult nature and, moreover, analytical results exist whereby useful comparisons can be made. digital.library.unt.edu/ark:/67531/metadc278485/
Graph-Based Keyphrase Extraction Using Wikipedia
Keyphrases describe a document in a coherent and simple way, giving the prospective reader a way to quickly determine whether the document satisfies their information needs. The pervasion of huge amount of information on Web, with only a small amount of documents have keyphrases extracted, there is a definite need to discover automatic keyphrase extraction systems. Typically, a document written by human develops around one or more general concepts or sub-concepts. These concepts or sub-concepts should be structured and semantically related with each other, so that they can form the meaningful representation of a document. Considering the fact, the phrases or concepts in a document are related to each other, a new approach for keyphrase extraction is introduced that exploits the semantic relations in the document. For measuring the semantic relations between concepts or sub-concepts in the document, I present a comprehensive study aimed at using collaboratively constructed semantic resources like Wikipedia and its link structure. In particular, I introduce a graph-based keyphrase extraction system that exploits the semantic relations in the document and features such as term frequency. I evaluated the proposed system using novel measures and the results obtained compare favorably with previously published results on established benchmarks. digital.library.unt.edu/ark:/67531/metadc67939/
Multilingual Word Sense Disambiguation Using Wikipedia
Ambiguity is inherent to human language. In particular, word sense ambiguity is prevalent in all natural languages, with a large number of the words in any given language carrying more than one meaning. Word sense disambiguation is the task of automatically assigning the most appropriate meaning to a polysemous word within a given context. Generally the problem of resolving ambiguity in literature has revolved around the famous quote “you shall know the meaning of the word by the company it keeps.” In this thesis, we investigate the role of context for resolving ambiguity through three different approaches. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses and sense-tagged data are derived automatically from Wikipedia. Using Wikipedia as a source of sense-annotations provides the much needed solution for knowledge acquisition bottleneck. In order to evaluate the viability of Wikipedia based sense-annotations, we cast the task of disambiguating polysemous nouns as a monolingual classification task and experimented on lexical samples from four different languages (viz. English, German, Italian and Spanish). The experiments confirm that the Wikipedia based sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. It is a long belief that exploiting multiple languages helps in building accurate word sense disambiguation systems. Subsequently, we developed two approaches that recast the task of disambiguating polysemous nouns as a multilingual classification task. The first approach for multilingual word sense disambiguation attempts to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text. First, the various senses of a target word may be translated into different words, which constitute unique, yet highly salient signal that effectively expand the target word’s feature space. Second, the translated context words themselves embed co-occurrence information that a translation engine gathers from very large parallel corpora. The second approach for multlingual word sense disambiguation attempts to reduce the reliance on the machine translation system during training by using the multilingual knowledge available in Wikipedia through its interlingual links. Finally, the experiments on a lexical sample from four different languages confirm that the multilingual systems perform better than the monolingual system and significantly improve the disambiguation accuracy. digital.library.unt.edu/ark:/67531/metadc500036/
Performance Analysis of Wireless Networks with QoS Adaptations
The explosive demand for multimedia and fast transmission of continuous media on wireless networks means the simultaneous existence of traffic requiring different qualities of service (QoS). In this thesis, several efficient algorithms have been developed which offer several QoS to the end-user. We first look at a request TDMA/CDMA protocol for supporting wireless multimedia traffic, where CDMA is laid over TDMA. Then we look at a hybrid push-pull algorithm for wireless networks, and present a generalized performance analysis of the proposed protocol. Some of the QoS factors considered include customer retrial rates due to user impatience and system timeouts and different levels of priority and weights for mobile hosts. We have also looked at how customer impatience and system timeouts affect the QoS provided by several queuing and scheduling schemes such as FIFO, priority, weighted fair queuing, and the application of the stretch-optimal algorithm to scheduling. digital.library.unt.edu/ark:/67531/metadc4336/
Intrinsic and Extrinsic Adaptation in a Simulated Combat Environment
Genetic algorithm and artificial life techniques are applied to the development of challenging and interesting opponents in a combat-based computer game. Computer simulations are carried out against an idealized human player to gather data on the effectiveness of the computer generated opponents. digital.library.unt.edu/ark:/67531/metadc278231/
3D Reconstruction Using Lidar and Visual Images
In this research, multi-perspective image registration using LiDAR and visual images was considered. 2D-3D image registration is a difficult task because it requires the extraction of different semantic features from each modality. This problem is solved in three parts. The first step involves detection and extraction of common features from each of the data sets. The second step consists of associating the common features between two different modalities. Traditional methods use lines or orthogonal corners as common features. The third step consists of building the projection matrix. Many existing methods use global positing system (GPS) or inertial navigation system (INS) for an initial estimate of the camera pose. However, the approach discussed herein does not use GPS, INS, or any such devices for initial estimate; hence the model can be used in places like the lunar surface or Mars where GPS or INS are not available. A variation of the method is also described, which does not require strong features from both images but rather uses intensity gradients in the image. This can be useful when one image does not have strong features (such as lines) or there are too many extraneous features. digital.library.unt.edu/ark:/67531/metadc177193/
Survey of Approximation Algorithms for Set Cover Problem
In this thesis, I survey 11 approximation algorithms for unweighted set cover problem. I have also implemented the three algorithms and created a software library that stores the code I have written. The algorithms I survey are: 1. Johnson's standard greedy; 2. f-frequency greedy; 3. Goldsmidt, Hochbaum and Yu's modified greedy; 4. Halldorsson's local optimization; 5. Dur and Furer semi local optimization; 6. Asaf Levin's improvement to Dur and Furer; 7. Simple rounding; 8. Randomized rounding; 9. LP duality; 10. Primal-dual schema; and 11. Network flow technique. Most of the algorithms surveyed are refinements of standard greedy algorithm. digital.library.unt.edu/ark:/67531/metadc12118/
Performance comparison of data distribution management strategies in large-scale distributed simulation.
Data distribution management (DDM) is a High Level Architecture/Run-time Infrastructure (HLA/RTI) service that manages the distribution of state updates and interaction information in large-scale distributed simulations. The key to efficient DDM is to limit and control the volume of data exchanged during the simulation, to relay data to only those hosts requiring the data. This thesis focuses upon different DDM implementations and strategies. This thesis includes analysis of three DDM methods including the fixed grid-based, dynamic grid-based, and region-based methods. Also included is the use of multi-resolution modeling with various DDM strategies and analysis of the performance effects of aggregation/disaggregation with these strategies. Running numerous federation executions, I simulate four different scenarios on a cluster of workstations with a mini-RTI Kit framework and propose a set of benchmarks for a comparison of the DDM schemes. The goals of this work are to determine the most efficient model for applying each DDM scheme, discover the limitations of the scalability of the various DDM methods, evaluate the effects of aggregation/disaggregation on performance and resource usage, and present accepted benchmarks for use in future research. digital.library.unt.edu/ark:/67531/metadc4524/
A Minimally Supervised Word Sense Disambiguation Algorithm Using Syntactic Dependencies and Semantic Generalizations
Natural language is inherently ambiguous. For example, the word "bank" can mean a financial institution or a river shore. Finding the correct meaning of a word in a particular context is a task known as word sense disambiguation (WSD), which is essential for many natural language processing applications such as machine translation, information retrieval, and others. While most current WSD methods try to disambiguate a small number of words for which enough annotated examples are available, the method proposed in this thesis attempts to address all words in unrestricted text. The method is based on constraints imposed by syntactic dependencies and concept generalizations drawn from an external dictionary. The method was tested on standard benchmarks as used during the SENSEVAL-2 and SENSEVAL-3 WSD international evaluation exercises, and was found to be competitive. digital.library.unt.edu/ark:/67531/metadc4969/
General Purpose Programming on Modern Graphics Hardware
I start with a brief introduction to the graphics processing unit (GPU) as well as general-purpose computation on modern graphics hardware (GPGPU). Next, I explore the motivations for GPGPU programming, and the capabilities of modern GPUs (including advantages and disadvantages). Also, I give the background required for further exploring GPU programming, including the terminology used and the resources available. Finally, I include a comprehensive survey of previous and current GPGPU work, and end with a look at the future of GPU programming. digital.library.unt.edu/ark:/67531/metadc6112/
Rapid Prototyping and Design of a Fast Random Number Generator
Information in the form of online multimedia, bank accounts, or password usage for diverse applications needs some form of security. the core feature of many security systems is the generation of true random or pseudorandom numbers. Hence reliable generators of such numbers are indispensable. the fundamental hurdle is that digital computers cannot generate truly random numbers because the states and transitions of digital systems are well understood and predictable. Nothing in a digital computer happens truly randomly. Digital computers are sequential machines that perform a current state and move to the next state in a deterministic fashion. to generate any secure hash or encrypted word a random number is needed. But since computers are not random, random sequences are commonly used. Random sequences are algorithms that generate a pattern of values that appear to be random but after some time start repeating. This thesis implements a digital random number generator using MATLAB, FGPA prototyping, and custom silicon design. This random number generator is able to use a truly random CMOS source to generate the random number. Statistical benchmarks are used to test the results and to show that the design works. Thus the proposed random number generator will be useful for online encryption and security. digital.library.unt.edu/ark:/67531/metadc115036/
Rapid Prototyping and Design of a Fast Random Number Generator
Information in the form of online multimedia, bank accounts, or password usage for diverse applications needs some form of security. the core feature of many security systems is the generation of true random or pseudorandom numbers. Hence reliable generators of such numbers are indispensable. the fundamental hurdle is that digital computers cannot generate truly random numbers because the states and transitions of digital systems are well understood and predictable. Nothing in a digital computer happens truly randomly. Digital computers are sequential machines that perform a current state and move to the next state in a deterministic fashion. to generate any secure hash or encrypted word a random number is needed. But since computers are not random, random sequences are commonly used. Random sequences are algorithms that generate a pattern of values that appear to be random but after some time start repeating. This thesis implements a digital random number generator using MATLAB, FGPA prototyping, and custom silicon design. This random number generator is able to use a truly random CMOS source to generate the random number. Statistical benchmarks are used to test the results and to show that the design works. Thus the proposed random number generator will be useful for online encryption and security. digital.library.unt.edu/ark:/67531/metadc115040/
Force-Directed Graph Drawing and Aesthetics Measurement in a Non-Strict Pure Functional Programming Language
Non-strict pure functional programming often requires redesigning algorithms and data structures to work more effectively under new constraints of non-strict evaluation and immutable state. Graph drawing algorithms, while numerous and broadly studied, have no presence in the non-strict pure functional programming model. Additionally, there is currently no freely licensed standalone toolkit used to quantitatively analyze aesthetics of graph drawings. This thesis addresses two previously unexplored questions. Can a force-directed graph drawing algorithm be implemented in a non-strict functional language, such as Haskell, and still be practically usable? Can an easily extensible aesthetic measuring tool be implemented in a language such as Haskell and still be practically usable? The focus of the thesis is on implementing one of the simplest force-directed algorithms, that of Fruchterman and Reingold, and comparing its resulting aesthetics to those of a well-known C++ implementation of the same algorithm. digital.library.unt.edu/ark:/67531/metadc12125/
A Quality of Service Aware Protocol for Power Conservation in Wireless Ad Hoc and Mobile Networks
Access: Use of this item is restricted to the UNT Community.
Power consumption is an important issue for mobile computers since they rely on the short life of batteries. Conservation techniques are commonly used in hardware design of such systems but network interface is a significant consumer of power, which needs considerable research to be devoted towards designing a low-power design network protocol stack. Due to the dynamic nature of wireless networks, adaptations are necessary to achieve energy efficiency and a reasonable quality of service. This paper presents the application of energy-efficient techniques to each layer in the network protocol stack and a feedback is provided depending on the performance of this new design. And also a comparison of two existing MAC protocols is done showing a better suitability of E2MAC for higher power conservation. Multimedia applications can achieve an optimal performance if they are aware of the characteristics of the wireless link. Relying on the underlying operating system software and communication protocols to hide the anomalies of wireless channel needs efficient energy consumption methodology and fair quality of service like E2MAC. This report also focuses on some of the various concerns of energy efficiency in wireless communication and also looks into the definition of seven layers as defined by International Standards Organization. digital.library.unt.edu/ark:/67531/metadc3289/
Flexible Digital Authentication Techniques
Abstract This dissertation investigates authentication techniques in some emerging areas. Specifically, authentication schemes have been proposed that are well-suited for embedded systems, and privacy-respecting pay Web sites. With embedded systems, a person could own several devices which are capable of communication and interaction, but these devices use embedded processors whose computational capabilities are limited as compared to desktop computers. Examples of this scenario include entertainment devices or appliances owned by a consumer, multiple control and sensor systems in an automobile or airplane, and environmental controls in a building. An efficient public key cryptosystem has been devised, which provides a complete solution to an embedded system, including protocols for authentication, authenticated key exchange, encryption, and revocation. The new construction is especially suitable for the devices with constrained computing capabilities and resources. Compared with other available authentication schemes, such as X.509, identity-based encryption, etc, the new construction provides unique features such as simplicity, efficiency, forward secrecy, and an efficient re-keying mechanism. In the application scenario for a pay Web site, users may be sensitive about their privacy, and do not wish their behaviors to be tracked by Web sites. Thus, an anonymous authentication scheme is desirable in this case. That is, a user can prove his/her authenticity without revealing his/her identity. On the other hand, the Web site owner would like to prevent a bunch of users from sharing a single subscription while hiding behind user anonymity. The Web site should be able to detect these possible malicious behaviors, and exclude corrupted users from future service. This dissertation extensively discusses anonymous authentication techniques, such as group signature, direct anonymous attestation, and traceable signature. Three anonymous authentication schemes have been proposed, which include a group signature scheme with signature claiming and variable linkability, a scheme for direct anonymous attestation in trusted computing platforms with sign and verify protocols nearly seven times more efficient than the current solution, and a state-of-the-art traceable signature scheme with support for variable anonymity. These three schemes greatly advance research in the area of anonymous authentication. The authentication techniques presented in this dissertation are based on common mathematical and cryptographical foundations, sharing similar security assumptions. We call them flexible digital authentication schemes. digital.library.unt.edu/ark:/67531/metadc5277/
Bounded Dynamic Source Routing in Mobile Ad Hoc Networks
Access: Use of this item is restricted to the UNT Community.
A mobile ad hoc network (MANET) is a collection of mobile platforms or nodes that come together to form a network capable of communicating with each other, without the help of a central controller. To avail the maximum potential of a MANET, it is of great importance to devise a routing scheme, which will optimize upon the performance of a MANET, given the high rate of random mobility of the nodes. In a MANET individual nodes perform the routing functions like route discovery, route maintenance and delivery of packets from one node to the other. Existing routing protocols flood the network with broadcasts of route discovery messages, while attempting to establish a route. This characteristic is instrumental in deteriorating the performance of a MANET, as resource overhead triggered by broadcasts is directly proportional to the size of the network. Bounded-dynamic source routing (B-DSR), is proposed to curb this multitude of superfluous broadcasts, thus enabling to reserve valuable resources like bandwidth and battery power. B-DSR establishes a bounded region in the network, only within which, transmissions of route discovery messages are processed and validated for establishing a route. All route discovery messages reaching outside of this bounded region are dropped, thus preventing the network from being flooded. In addition B-DSR also guarantees loop-free routing and is robust for a rapid recovery when routes in the network change. digital.library.unt.edu/ark:/67531/metadc4264/
Implementation of Back Up Host in TCP/IP
Access: Use of this item is restricted to the UNT Community.
This problem in lieu thesis is considering a TCP client H1 making a connection to distant server S and is downloading a file. In the midst of the downloading, if H1 crashes, the TCP connection from H1 to S is lost. In the future, if H1 restarts, the TCP connection from H1 to S will be reestablished and the file will be downloaded again. This cannot happen until host H1 restarts. Now consider a situation where there is a standby host H2 for the host H1. H1 and H2 monitor the health of each other by heartbeat messages (like SCTP). If H2 detects the failure of H1, then H2 takes over. This implies that all resources assigned to H1 are now reassigned or taken over by H2. The host H1 and H2 transmit data between each other when any one of it crashed. Throughout the data transmission process, heart beat chunk is exchanged between the hosts when one of the host crashes. In particular, the IP addresses that were originally assigned to H1 are assigned to H2. In this scenario, movement of the TCP connection between H1 and S to a connection between H2 and S without disrupting the TCP connection is achieved. Distant server S is not aware of any changes going on at the clients. digital.library.unt.edu/ark:/67531/metadc3288/
Multi-Agent Architecture for Internet Information Extraction and Visualization
Access: Use of this item is restricted to the UNT Community.
The World Wide Web is one of the largest sources of information; more and more applications are being developed daily to make use of this information. This thesis presents a multi-agent architecture that deals with some of the issues related to Internet data extraction. The primary issue addresses the reliable, efficient and quick extraction of data through the use of HTTP performance monitoring agents. A second issue focuses on how to make use of available data to take decisions and alert the user when there is change in data; this is done with the help of user agents that are equipped with a Defeasible reasoning interpreter. An additional issue is the visualization of extracted data; this is done with the aid of VRML visualization agents. The cited issues are discussed using stock portfolio management as an example application. digital.library.unt.edu/ark:/67531/metadc2575/
Simulating the Spread of Infectious Diseases in Heterogeneous Populations with Diverse Interactions Characteristics
The spread of infectious diseases has been a public concern throughout human history. Historic recorded data has reported the severity of infectious disease epidemics in different ages. Ancient Greek physician Hippocrates was the first to analyze the correlation between diseases and their environment. Nowadays, health authorities are in charge of planning strategies that guarantee the welfare of citizens. The simulation of contagion scenarios contributes to the understanding of the epidemic behavior of diseases. Computational models facilitate the study of epidemics by integrating disease and population data to the simulation. The use of detailed demographic and geographic characteristics allows researchers to construct complex models that better resemble reality and the integration of these attributes permits us to understand the rules of interaction. The interaction of individuals with similar characteristics forms synthetic structures that depict clusters of interaction. The synthetic environments facilitate the study of the spread of infectious diseases in diverse scenarios. The characteristics of the population and the disease concurrently affect the local and global epidemic progression. Every cluster’ epidemic behavior constitutes the global epidemic for a clustered population. By understanding the correlation between structured populations and the spread of a disease, current dissertation research makes possible to identify risk groups of specific characteristics and devise containment strategies that facilitate health authorities to improve mitigation strategies. digital.library.unt.edu/ark:/67531/metadc407831/
Analysis of Web Services on J2EE Application Servers
The Internet became a standard way of exchanging business data between B2B and B2C applications and with this came the need for providing various services on the web instead of just static text and images. Web services are a new type of services offered via the web that aid in the creation of globally distributed applications. Web services are enhanced e-business applications that are easier to advertise and easier to discover on the Internet because of their flexibility and uniformity. In a real life scenario it is highly difficult to decide which J2EE application server to go for when deploying a enterprise web service. This thesis analyzes the various ways by which web services can be developed & deployed. Underlying protocols and crucial issues like EAI (enterprise application integration), asynchronous messaging, Registry tModel architecture etc have been considered in this research. This paper presents a report by analyzing what various J2EE application servers provide by doing a case study and by developing applications to test functionality. digital.library.unt.edu/ark:/67531/metadc5547/
GPS CaPPture: a System for GPS Trajectory Collection, Processing, and Destination Prediction
In the United States, smartphone ownership surpassed 69.5 million in February 2011 with a large portion of those users (20%) downloading applications (apps) that enhance the usability of a device by adding additional functionality. a large percentage of apps are written specifically to utilize the geographical position of a mobile device. One of the prime factors in developing location prediction models is the use of historical data to train such a model. with larger sets of training data, prediction algorithms become more accurate; however, the use of historical data can quickly become a downfall if the GPS stream is not collected or processed correctly. Inaccurate or incomplete or even improperly interpreted historical data can lead to the inability to develop accurately performing prediction algorithms. As GPS chipsets become the standard in the ever increasing number of mobile devices, the opportunity for the collection of GPS data increases remarkably. the goal of this study is to build a comprehensive system that addresses the following challenges: (1) collection of GPS data streams in a manner such that the data is highly usable and has a reduction in errors; (2) processing and reduction of the collected data in order to prepare it and make it highly usable for the creation of prediction algorithms; (3) creation of prediction/labeling algorithms at such a level that they are viable for commercial use. This study identifies the key research problems toward building the CaPPture (collection, processing, prediction) system. digital.library.unt.edu/ark:/67531/metadc115089/
Autonomic Failure Identification and Diagnosis for Building Dependable Cloud Computing Systems
The increasingly popular cloud-computing paradigm provides on-demand access to computing and storage with the appearance of unlimited resources. Users are given access to a variety of data and software utilities to manage their work. Users rent virtual resources and pay for only what they use. In spite of the many benefits that cloud computing promises, the lack of dependability in shared virtualized infrastructures is a major obstacle for its wider adoption, especially for mission-critical applications. Virtualization and multi-tenancy increase system complexity and dynamicity. They introduce new sources of failure degrading the dependability of cloud computing systems. To assure cloud dependability, in my dissertation research, I develop autonomic failure identification and diagnosis techniques that are crucial for understanding emergent, cloud-wide phenomena and self-managing resource burdens for cloud availability and productivity enhancement. We study the runtime cloud performance data collected from a cloud test-bed and by using traces from production cloud systems. We define cloud signatures including those metrics that are most relevant to failure instances. We exploit profiled cloud performance data in both time and frequency domain to identify anomalous cloud behaviors and leverage cloud metric subspace analysis to automate the diagnosis of observed failures. We implement a prototype of the anomaly identification system and conduct the experiments in an on-campus cloud computing test-bed and by using the Google datacenter traces. Our experimental results show that our proposed anomaly detection mechanism can achieve 93% detection sensitivity while keeping the false positive rate as low as 6.1% and outperform other tested anomaly detection schemes. In addition, the anomaly detector adapts itself by recursively learning from these newly verified detection results to refine future detection. digital.library.unt.edu/ark:/67531/metadc499993/
Exploring Trusted Platform Module Capabilities: A Theoretical and Experimental Study
Trusted platform modules (TPMs) are hardware modules that are bound to a computer's motherboard, that are being included in many desktops and laptops. Augmenting computers with these hardware modules adds powerful functionality in distributed settings, allowing us to reason about the security of these systems in new ways. In this dissertation, I study the functionality of TPMs from a theoretical as well as an experimental perspective. On the theoretical front, I leverage various features of TPMs to construct applications like random oracles that are impossible to implement in a standard model of computation. Apart from random oracles, I construct a new cryptographic primitive which is basically a non-interactive form of the standard cryptographic primitive of oblivious transfer. I apply this new primitive to secure mobile agent computations, where interaction between various entities is typically required to ensure security. I prove these constructions are secure using standard cryptographic techniques and assumptions. To test the practicability of these constructions and their applications, I performed an experimental study, both on an actual TPM and a software TPM simulator which has been enhanced to make it reflect timings from a real TPM. This allowed me to benchmark the performance of the applications and test the feasibility of the proposed extensions to standard TPMs. My tests also show that these constructions are practical. digital.library.unt.edu/ark:/67531/metadc6101/
FIRST PREV 1 2 3 4 NEXT LAST