Latest content added for UNT Digital Library Collection: UNT Theses and Dissertations
UNT Libraries
https://digital.library.unt.edu/explore/collections/UNTETD/browse/?fq=untl_institution:UNT&start=130&fq=str_degree_discipline:Computer+Science&display=brief

This is a custom feed for browsing the UNT Digital Library collection of UNT theses and dissertations in Computer Science.

A Programming Language For Concurrent Processing
https://digital.library.unt.edu/ark:/67531/metadc164005/

This thesis is a proposed solution to the problem of including an effective interrupt mechanism in the set of concurrent-processing primitives of a block-structured programming language or system. The proposed solution is presented in the form of a programming language definition and model. The language is called TRIPLE.

Multi-perspective, Multi-modal Image Registration and Fusion
https://digital.library.unt.edu/ark:/67531/metadc149562/

Multi-modal image fusion is an active research area with many civilian and military applications. Fusion is defined as the strategic combination of information collected by various sensors from different locations or of different types in order to obtain a better understanding of an observed scene or situation. Fusion of multi-modal images cannot be performed unless the modalities are spatially aligned. In this research, I consider two important problems: multi-modal, multi-perspective image registration and decision-level fusion of multi-modal images, in particular LiDAR and visual imagery. Multi-modal image registration is a difficult task due to the different semantic interpretations of features extracted from each modality. This problem is decoupled into three sub-problems: identification and extraction of common features, determination of corresponding points, and determination of the registration transformation parameters. Traditional registration methods use low-level features such as lines and corners, and using these features requires an extensive optimization search to determine the corresponding points. Many methods use global positioning systems (GPS) and a calibrated camera to obtain an initial estimate of the camera parameters. The advantages of our work over previous work are the following. First, I use high-level features, which significantly reduce the search space for the optimization process. Second, the determination of corresponding points is modeled as an assignment problem between a small number of objects. Fusing LiDAR and visual images, in turn, is beneficial due to the different and rich characteristics of both modalities: LiDAR data contain 3D information, while images contain visual information. Developing a fusion technique that uses the characteristics of both modalities is very important. I establish a decision-level fusion technique using manifold models.
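The correspondence step above is posed as an assignment problem between a small number of high-level objects. As an illustration only, with hypothetical feature descriptors and a plain Euclidean cost (the thesis's actual features and costs are not specified here), the optimal one-to-one matching can be computed with the Hungarian method:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical high-level feature descriptors (e.g., building footprint
# centroids and areas) extracted from each modality.
lidar_feats = np.array([[10.0, 12.0, 250.0],
                        [40.0,  8.0, 410.0],
                        [25.0, 30.0, 130.0]])
image_feats = np.array([[39.5,  8.8, 400.0],
                        [10.4, 11.5, 260.0],
                        [24.0, 31.0, 120.0]])

# Cost of matching each LiDAR feature to each image feature:
# Euclidean distance in the feature space.
cost = np.linalg.norm(lidar_feats[:, None, :] - image_feats[None, :, :], axis=2)

# Hungarian method: optimal one-to-one assignment minimizing total cost.
rows, cols = linear_sum_assignment(cost)
for r, c in zip(rows, cols):
    print(f"LiDAR feature {r} <-> image feature {c} (cost {cost[r, c]:.2f})")
```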
A Smooth-turn Mobility Model for Airborne Networks
https://digital.library.unt.edu/ark:/67531/metadc149603/

Effective routing in airborne networks (ANs) relies on suitable mobility models that capture the random movement patterns of airborne vehicles. Because airborne vehicles cannot make sharp turns as easily as ground vehicles do, the mobility models widely used for mobile ad hoc networks, such as the Random Waypoint and Random Direction models, fail. In this article, I introduce a novel airborne network mobility model, called the Smooth Turn Mobility Model, that captures the correlation of acceleration for airborne vehicles across time and spatial coordinates. Our model realistically captures the tendency of airborne vehicles to fly straight trajectories and make smooth turns with large radii, while remaining simple enough for tractable connectivity analysis and routing design.
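A minimal sketch of the smooth-turn idea, assuming illustrative distributions for leg durations and turn radii (the thesis defines its own stochastic process):

```python
import math, random

def smooth_turn_trajectory(steps=2000, dt=0.1, speed=50.0, mean_leg=20.0):
    """Sketch of a smooth-turn trajectory: the vehicle alternates between
    straight flight and constant-radius turns; radii and durations are
    drawn at random (illustrative distributions, not the thesis's)."""
    x, y, heading = 0.0, 0.0, 0.0
    turn_rate = 0.0                               # rad/s; 0 means straight
    time_left = random.expovariate(1.0 / mean_leg)
    path = [(x, y)]
    for _ in range(steps):
        if time_left <= 0:
            if turn_rate == 0.0:                  # begin a large-radius turn
                radius = random.uniform(500.0, 2000.0)
                turn_rate = random.choice([-1, 1]) * speed / radius
            else:                                 # resume straight flight
                turn_rate = 0.0
            time_left = random.expovariate(1.0 / mean_leg)
        heading += turn_rate * dt
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        time_left -= dt
        path.append((x, y))
    return path

print(smooth_turn_trajectory()[-1])  # final position of one sample run
```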
Automatic Tagging of Communication Data
https://digital.library.unt.edu/ark:/67531/metadc149611/

Globally distributed software teams are widespread throughout industry, but finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data, so a web-based tool that can automatically analyze the communication patterns among global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying the different types of group activities that occur within a global software development project. To achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project.
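For a flavor of the automated text classification being compared, here is a common baseline of the general kind such a tool might evaluate: TF-IDF features with a Naive Bayes classifier, on hypothetical team messages (not data from the study):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical chat messages from a team project, labeled by activity type.
messages = [
    "pushed the fix for the login bug, please review",
    "can we move the standup meeting to 3pm?",
    "the unit tests fail on the new branch",
    "agenda for tomorrow: sprint planning and task assignment",
]
labels = ["development", "coordination", "development", "coordination"]

# TF-IDF features + multinomial Naive Bayes: a standard text-classification
# baseline of the kind such a comparison might include.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(messages, labels)
print(clf.predict(["merge conflict in the build script"]))
```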
GPS CaPPture: a System for GPS Trajectory Collection, Processing, and Destination Prediction
https://digital.library.unt.edu/ark:/67531/metadc115089/

In the United States, smartphone ownership surpassed 69.5 million in February 2011, with a large portion of those users (20%) downloading applications (apps) that enhance the usability of a device by adding functionality. A large percentage of apps are written specifically to utilize the geographical position of a mobile device. One of the prime factors in developing location prediction models is the use of historical data to train such a model. With larger sets of training data, prediction algorithms become more accurate; however, the use of historical data can quickly become a downfall if the GPS stream is not collected or processed correctly. Inaccurate, incomplete, or improperly interpreted historical data can lead to the inability to develop accurately performing prediction algorithms. As GPS chipsets become standard in an ever-increasing number of mobile devices, the opportunity for the collection of GPS data increases remarkably. The goal of this study is to build a comprehensive system that addresses the following challenges: (1) collection of GPS data streams in a manner such that the data is highly usable and has a reduced error rate; (2) processing and reduction of the collected data in order to prepare it for the creation of prediction algorithms; (3) creation of prediction/labeling algorithms at such a level that they are viable for commercial use. This study identifies the key research problems toward building the CaPPture (collection, processing, prediction) system.

Rapid Prototyping and Design of a Fast Random Number Generator
https://digital.library.unt.edu/ark:/67531/metadc115040/

Information in the form of online multimedia, bank accounts, or password usage for diverse applications needs some form of security. The core feature of many security systems is the generation of true random or pseudorandom numbers, so reliable generators of such numbers are indispensable. The fundamental hurdle is that digital computers cannot generate truly random numbers, because the states and transitions of digital systems are well understood and predictable; nothing in a digital computer happens truly randomly. Digital computers are sequential machines that move from the current state to the next state in a deterministic fashion. To generate any secure hash or encrypted word, a random number is needed, but since computers are not random, pseudorandom sequences are commonly used: algorithms that generate a pattern of values that appear random but eventually start repeating. This thesis implements a digital random number generator using MATLAB, FPGA prototyping, and custom silicon design. The generator is able to use a truly random CMOS source to generate the random number. Statistical benchmarks are used to test the results and to show that the design works. Thus the proposed random number generator will be useful for online encryption and security.
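The repetition the abstract describes can be made concrete with a linear-feedback shift register, a classic hardware-friendly pseudorandom generator (illustrative width and taps, not the design from the thesis):

```python
def lfsr_period(seed=0b1011, taps=(3, 2), nbits=4):
    """4-bit Fibonacci LFSR with illustrative taps: XOR the tapped bits,
    shift them in, and watch the state sequence eventually repeat."""
    state = seed
    seen = {}
    step = 0
    while state not in seen:
        seen[state] = step
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1            # feedback bit from the taps
        state = ((state << 1) | fb) & ((1 << nbits) - 1)
        step += 1
    print(f"period = {step - seen[state]} "
          f"(maximum for {nbits} bits is 2**{nbits} - 1 = {2**nbits - 1})")

lfsr_period()   # these taps are maximal: the 4-bit state cycles through 15 values
```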
A Global Stochastic Modeling Framework to Simulate and Visualize Epidemics
https://digital.library.unt.edu/ark:/67531/metadc115099/

Epidemics have caused major human and monetary losses through the course of human civilization, so it is very important that epidemiologists and public health personnel be prepared to handle an impending infectious disease outbreak. Ever-changing demographics, evolving infrastructural resources of geographic regions, and emerging and re-emerging diseases compel the use of simulation to predict disease dynamics. By means of simulation, public health personnel and epidemiologists can predict the disease dynamics, the population groups at risk, and their geographic locations beforehand, so that they are prepared to respond in case of an epidemic outbreak. As a consequence of the large numbers of individuals and inter-personal interactions involved in simulating infectious disease spread in a region such as a county, sizeable amounts of data may be produced that have to be analyzed, and methods to visualize these data would help people from diverse disciplines understand and analyze the simulation. This thesis proposes a framework to simulate and visualize the spread of an infectious disease in the population of a region such as a county. Because real-world populations have a non-homogeneous demographic and spatial distribution, the framework models the spread of an infectious disease based on the population of, and the geographic distance between, census blocks, along with social behavioral parameters for demographic groups. The population is stratified into demographic groups in individual census blocks using census data. Infection spread is modeled by means of local and global contacts generated between groups of population in census blocks, with the strength and likelihood of the contacts based on the population, geographic distance, and social behavioral parameters of the groups involved. The disease dynamics are represented on a geographic map of the region using a heat map representation, where the intensity of infection is mapped to a color scale. This framework provides a tool for public health personnel and epidemiologists to run what-if analyses on disease spread in specific populations and to plan for epidemic response. By means of demographic stratification of the population and the incorporation of geographic distance and social behavioral parameters into the modeling of the outbreak, the framework accounts for non-homogeneity in the demographic mix and spatial distribution of the population. Generating contacts per population group instead of per individual lowers the computational overhead, and the heat map representation of the intensity of infection provides an intuitive way to visualize the disease dynamics.
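A hedged sketch of the contact idea: a gravity-style weight that grows with block populations and decays with distance, with a single `mixing` parameter standing in for the social behavioral parameters (the thesis's actual functional form may differ):

```python
import math

def contact_strength(pop_a, pop_b, distance_km, mixing=1.0, decay=0.05):
    """Illustrative gravity-style weight: contact strength grows with the
    populations of two census blocks and decays with distance; `mixing`
    stands in for the social behavioral parameters of the groups involved."""
    return mixing * pop_a * pop_b * math.exp(-decay * distance_km)

# Hypothetical census blocks: (population, x, y) in km coordinates.
blocks = {"A": (1200, 0.0, 0.0), "B": (800, 3.0, 4.0), "C": (300, 20.0, 5.0)}

for a in blocks:
    for b in blocks:
        if a < b:
            pa, xa, ya = blocks[a]
            pb, xb, yb = blocks[b]
            d = math.hypot(xa - xb, ya - yb)
            print(a, b, round(contact_strength(pa, pb, d), 1))
```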
Cuff-less Blood Pressure Measurement Using a Smart Phone
https://digital.library.unt.edu/ark:/67531/metadc115102/

Blood pressure is vital sign information that physicians often need as preliminary data for immediate intervention during emergency situations or for regular monitoring of people with cardiovascular diseases. Despite the availability of portable blood pressure meters on the market, they are not regularly carried by people, creating a need for an ultra-portable measurement platform or device that can be easily carried and used at all times. One such device is the smartphone, which, according to a comScore survey, is used by 26.2% of the US adult population. The mass production of these phones with built-in sensors and high computation power has created numerous possibilities for application development in different domains, including biomedical. Motivated by this capability and their extensive usage, this thesis focuses on developing a blood pressure measurement platform on smartphones. Specifically, I developed a blood pressure measurement system on a smart phone using the built-in camera and a customized external microphone. The system first obtains heart beats using the microphone and the finger pulse using the camera, and then calculates the blood pressure from the recorded data. I developed techniques for finding the best location for obtaining the data, making the system usable by all categories of people. The proposed system resulted in accuracies between 90% and 100% when compared to traditional blood pressure meters. The second part of this thesis presents a new system for remote heart beat monitoring using the smart phone: heart beats can be transmitted live by patients and monitored by physicians remotely for diagnosis. The proposed blood pressure measurement and remote monitoring systems will be able to facilitate information acquisition and decision making by 9-1-1 operators.

The Design Of A Benchmark For Geo-stream Management Systems
https://digital.library.unt.edu/ark:/67531/metadc103392/

The recent growth in sensor technology allows easier information gathering in real time, as sensors have grown smaller, more accurate, and less expensive. The resulting data is often in a geo-stream format: continuously changing input with a spatial extent. Researchers developing geo-stream management systems (GSMSs) require a benchmark for evaluation, which is currently lacking. This thesis presents GSMark, a benchmark for evaluating GSMSs. GSMark provides a data generator that creates a combination of synthetic and real geo-streaming data, a workload simulator to present the data to the GSMS as a data stream, and a set of benchmark queries that evaluate typical GSMS functionality and query performance. In particular, GSMark generates both moving points and evolving spatial regions, two fundamental data types for a broad range of geo-stream applications, along with the geo-streaming queries on this data.

Arithmetic Computations and Memory Management Using a Binary Tree Encoding of Natural Numbers
https://digital.library.unt.edu/ark:/67531/metadc103323/

Two applications of a binary tree data type based on a simple pairing function (a bijection between natural numbers and pairs of natural numbers) are explored. First, the tree is used to encode natural numbers, and algorithms that perform basic arithmetic computations are presented along with formal proofs of their correctness. Second, using this "canonical" representation as a base type, algorithms for encoding and decoding additional isomorphic data types of other mathematical constructs (sets, sequences, etc.) are also developed. An experimental application to a memory management system is constructed and explored using these isomorphic types. A practical analysis of this system's runtime complexity and space savings is provided, along with a proof-of-concept framework for both applications of the binary tree type, in the Java programming language.
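One standard pairing function is Cantor's; assuming that choice for illustration (the thesis's "simple pairing function" may differ, and its implementation is in Java rather than the Python used here), naturals can be encoded as binary trees and decoded back:

```python
import math

def pair(x, y):
    """Cantor-style pairing: a bijection N x N -> N (one standard choice)."""
    return (x + y) * (x + y + 1) // 2 + x

def unpair(z):
    w = (math.isqrt(8 * z + 1) - 1) // 2   # invert the triangle number
    t = w * (w + 1) // 2
    x = z - t
    return x, w - x

def to_tree(n):
    """Encode a natural as a binary tree: 0 is a leaf; n >= 1 splits into
    the two components of unpair(n - 1)."""
    if n == 0:
        return None
    x, y = unpair(n - 1)
    return (to_tree(x), to_tree(y))

def from_tree(t):
    if t is None:
        return 0
    return pair(from_tree(t[0]), from_tree(t[1])) + 1

for n in range(10):
    assert from_tree(to_tree(n)) == n   # round-trip check of the bijection
print(to_tree(7))
```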
Investigating the Extractive Summarization of Literary Novels
https://digital.library.unt.edu/ark:/67531/metadc103298/

Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, and scientific reports, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are witnessing a change, however: an increasingly large number of books are becoming available in electronic format, which means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. Novels, however, are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents and outperform traditional extractive summarization systems, which typically address the news genre.

Measuring Semantic Relatedness Using Salient Encyclopedic Concepts
https://digital.library.unt.edu/ark:/67531/metadc84212/

While pragmatics, through its integration of situational awareness and relevant real-world knowledge, offers a high level of analysis that is suitable for real interpretation of natural dialogue, semantics, on the other hand, represents a lower yet more tractable and affordable linguistic level of analysis using current technologies. Generally, the understanding of semantic meaning in the literature has revolved around the famous quote "You shall know a word by the company it keeps." In this thesis we investigate the role of context constituents in decoding the semantic meaning of the surrounding context; specifically, we probe the role of salient concepts, defined as content-bearing expressions that afford encyclopedic definitions, as a suitable source of semantic clues for an unambiguous interpretation of context. Furthermore, we integrate this world knowledge in building a new and robust unsupervised semantic model, and apply it to measure semantic relatedness between textual pairs, whether they are words, sentences, or paragraphs. Moreover, we explore the abstraction of semantics across languages and utilize our findings in building a novel multi-lingual semantic relatedness model exploiting information acquired from various languages. We demonstrate the effectiveness and the superiority of our mono-lingual and multi-lingual models through a comprehensive set of evaluations on specialized synthetic datasets for semantic relatedness as well as real-world applications such as paraphrase detection and short answer grading. Our work represents a novel approach to integrating world knowledge into current semantic models and a means of crossing the language boundary for a better and more robust semantic relatedness representation, thus opening the door for an improved abstraction of meaning that carries the potential of ultimately imparting understanding of natural language to machines.
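The flavor of such a model can be sketched by representing each text as weights over salient encyclopedic concepts and comparing texts by cosine similarity; the concept profiles below are hypothetical:

```python
import math

def cosine(u, v):
    """Cosine similarity between sparse concept-weight vectors (dicts)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical salient-concept profiles: each text is mapped to weights over
# encyclopedic concepts (e.g., Wikipedia articles) that its words evoke.
text_a = {"Bank": 0.9, "Loan": 0.7, "Interest_rate": 0.4}
text_b = {"Bank": 0.8, "Mortgage": 0.6, "Loan": 0.5}
text_c = {"River": 0.9, "Flood": 0.5}

print(cosine(text_a, text_b))   # high: shared financial concepts
print(cosine(text_a, text_c))   # zero: no shared concepts
```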
Toward a Data-Type-Based Real Time Geospatial Data Stream Management System
https://digital.library.unt.edu/ark:/67531/metadc68070/

The advent of sensory and communication technologies enables the generation and consumption of large volumes of streaming data. Many of these data streams are geo-referenced, yet existing spatio-temporal databases and data stream management systems are not capable of handling real-time queries on spatial extents. In this thesis, we investigate several fundamental research issues toward building a data-type-based real-time geospatial data stream management system. The thesis makes contributions in the following areas: geo-stream data models, aggregation, window-based nearest neighbor operators, and query optimization strategies. The proposed geo-stream data model is based on second-order logic and multi-typed algebra; both abstract and discrete data models are proposed and exemplified. I further propose two useful geo-stream operators, namely Region By and WNN, which abstract common aggregation and nearest neighbor queries as generalized data model constructs. Finally, I propose three query optimization algorithms based on spatial, temporal, and spatio-temporal constraints of geo-streams. I show the effectiveness of the data model through many query examples, and the effectiveness and efficiency of the algorithms are validated through extensive experiments on both synthetic and real data sets. This work establishes the fundamental building blocks of a full-fledged geo-stream database management system and has potential impact in many applications, such as hazardous weather alerting and monitoring, traffic analysis, and environmental modeling.
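A minimal sketch of what a window-based nearest-neighbor (WNN) operator might compute, assuming simple sliding-window semantics rather than the thesis's exact algebra:

```python
import math
from collections import deque

class WNN:
    """Sketch of a window-based nearest-neighbor operator: keep only stream
    points from the last `window` seconds and answer NN queries against
    that window (illustrative semantics)."""
    def __init__(self, window):
        self.window = window
        self.points = deque()           # (timestamp, x, y)

    def insert(self, t, x, y):
        self.points.append((t, x, y))
        while self.points and self.points[0][0] < t - self.window:
            self.points.popleft()       # expire points that left the window

    def nearest(self, qx, qy):
        return min(self.points,
                   key=lambda p: math.hypot(p[1] - qx, p[2] - qy),
                   default=None)

op = WNN(window=60)
op.insert(0, 0.0, 0.0)
op.insert(30, 5.0, 5.0)
op.insert(90, 1.0, 1.0)     # the t=0 point has now expired
print(op.nearest(0.0, 0.0))  # -> (90, 1.0, 1.0)
```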
A Wireless Traffic Surveillance System Using Video Analytics
https://digital.library.unt.edu/ark:/67531/metadc68005/

Video surveillance systems have been commonly used in transportation systems to support traffic monitoring, speed estimation, and incident detection. However, there are several challenges in developing and deploying such systems, including high development and maintenance costs, bandwidth bottlenecks for long-range links, and a lack of advanced analytics. In this thesis, I leverage current wireless, video camera, and analytics technologies and present a wireless traffic monitoring system. I first present an overview of the system, and then describe the site investigation and several test links with different hardware/software configurations that demonstrate the effectiveness of the system. The system development process was documented to provide guidelines for future development. Furthermore, I propose a novel speed-estimation analytics algorithm that takes into consideration roads with slope angles. I prove the correctness of the algorithm theoretically and validate its effectiveness experimentally. The experimental results on both synthetic and real datasets show that the algorithm is more accurate than the baseline algorithm 80% of the time, and on average the accuracy improvement of speed estimation is over 3.7% even for very small slope angles.
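The role of slope can be illustrated with a simplified stand-in for the algorithm (not the thesis's actual method): if video analysis yields a horizontal displacement, the distance traveled along an inclined road is longer by a factor of 1/cos(slope):

```python
import math

def speed_kmh(displacement_m, dt_s, slope_deg=0.0):
    """Illustrative slope correction: if image analysis yields the vehicle's
    horizontal (map-plane) displacement, the distance actually traveled
    along a road inclined at `slope_deg` is longer by 1/cos(slope)."""
    along_road = displacement_m / math.cos(math.radians(slope_deg))
    return along_road / dt_s * 3.6        # m/s -> km/h

print(speed_kmh(25.0, 1.0))                # flat road: 90.0 km/h
print(speed_kmh(25.0, 1.0, slope_deg=5))   # small slope: ~90.3 km/h
```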
Techniques for Improving Uniformity in Direct Mapped Caches
https://digital.library.unt.edu/ark:/67531/metadc68025/

Direct-mapped caches are an attractive option for processor designers, as they combine fast lookup times with reduced complexity and area. However, direct-mapped caches are prone to higher miss rates: there are no candidates for replacement on a cache miss, so the data residing in a cache set must be evicted to the next-level cache. Another issue that inhibits cache performance is the non-uniformity of accesses exhibited by most applications: some sets are under-utilized while others receive the majority of accesses. This implies that increasing the size of a cache may not lead to proportionally improved hit rates. Several solutions that address cache non-uniformity have been proposed in the literature over the past decade, and each proposal independently claims the benefit of reduced conflict misses. However, because the published results use different benchmarks and different experimental setups, there is no established frame of reference for comparing them. In this work we report a side-by-side comparison of these techniques. Finally, we propose an Adaptive-Partitioned cache for multi-threaded applications. This design limits inter-thread thrashing while dynamically reducing traffic to heavily accessed sets.
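The non-uniformity problem is easy to reproduce with a toy direct-mapped cache model and a strided access pattern (illustrative parameters, not the paper's experimental setup):

```python
from collections import Counter

NUM_SETS = 8          # illustrative tiny cache
BLOCK_BYTES = 64

def cache_set(addr):
    """Direct-mapped index: block address modulo the number of sets."""
    return (addr // BLOCK_BYTES) % NUM_SETS

# Hypothetical access trace: a strided loop whose addresses all collide in
# two sets, leaving the other sets under-utilized.
trace = [base + i * NUM_SETS * BLOCK_BYTES
         for base in (0, 64) for i in range(50)]

counts = Counter(cache_set(a) for a in trace)
for s in range(NUM_SETS):
    print(f"set {s}: {counts.get(s, 0):3d} accesses")
```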
Graph-Based Keyphrase Extraction Using Wikipedia
https://digital.library.unt.edu/ark:/67531/metadc67939/

Keyphrases describe a document in a coherent and simple way, giving the prospective reader a way to quickly determine whether the document satisfies their information needs. With the huge amount of information pervading the Web and only a small fraction of documents having keyphrases assigned, there is a definite need for automatic keyphrase extraction systems. Typically, a document written by a human develops around one or more general concepts or sub-concepts, and these concepts should be structured and semantically related to each other so that they can form a meaningful representation of the document. Based on the observation that the phrases or concepts in a document are related to each other, a new approach for keyphrase extraction is introduced that exploits the semantic relations in the document. For measuring the semantic relations between concepts or sub-concepts in the document, I present a comprehensive study aimed at using collaboratively constructed semantic resources, namely Wikipedia and its link structure. In particular, I introduce a graph-based keyphrase extraction system that exploits the semantic relations in the document as well as features such as term frequency. I evaluated the proposed system using novel measures, and the results obtained compare favorably with previously published results on established benchmarks.
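A minimal sketch of graph-based phrase scoring, using co-occurrence counts as a stand-in for the Wikipedia-based relatedness weights and PageRank as the centrality measure (hypothetical phrases):

```python
import itertools
import networkx as nx

# Hypothetical candidate phrases per sentence; edges link phrases that
# co-occur, standing in for Wikipedia-based relatedness weights.
sentences = [
    ["keyphrase extraction", "document", "graph"],
    ["graph", "semantic relation", "wikipedia"],
    ["wikipedia", "link structure", "semantic relation"],
]

g = nx.Graph()
for sent in sentences:
    for a, b in itertools.combinations(sent, 2):
        w = g.get_edge_data(a, b, {"weight": 0})["weight"]
        g.add_edge(a, b, weight=w + 1)

scores = nx.pagerank(g, weight="weight")   # centrality as keyphrase salience
for phrase, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{score:.3f}  {phrase}")
```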
A Framework for Analyzing and Optimizing Regional Bio-Emergency Response Plans
https://digital.library.unt.edu/ark:/67531/metadc33200/

The presence of naturally occurring and man-made public health threats necessitates the design and implementation of mitigation strategies, such that adequate response is provided in a timely manner. Since multiple variables, such as geographic properties, resource constraints, and government-mandated time frames, must be accounted for, computational methods provide the necessary tools to develop contingency response plans while respecting the underlying data and assumptions. A typical response scenario involves the placement of points of dispensing (PODs) in the affected geographic region to supply vaccines or medications to the general public. Computational tools aid in the analysis of such response plans, as well as in the strategic placement of PODs, such that feasible response scenarios can be developed. Due to the sensitivity of bio-emergency response plans, geographic information, such as POD locations, must be kept confidential. The generation of synthetic geographic regions allows for the development of emergency response plans on non-sensitive data, as well as for the study of the effects of single geographic parameters. Further, synthetic representations of geographic regions allow results to be published and evaluated by the scientific community. This dissertation presents methodology for the analysis of bio-emergency response plans, methods for plan optimization, and methodology for the generation of synthetic geographic regions.

Measuring Vital Signs Using Smart Phones
https://digital.library.unt.edu/ark:/67531/metadc33139/

Smart phones today have become increasingly popular with the general public for their diverse abilities, such as navigation, social networking, and multimedia facilities, to name a few. These phones are equipped with high-end processors, high-resolution cameras, and built-in sensors such as accelerometers, orientation sensors, and light sensors. According to a comScore survey, 25.3% of US adults use smart phones in their daily lives. Motivated by the capability of smart phones and their extensive usage, I focused on utilizing them for biomedical applications. In this thesis, I present a new application for a smart phone to quantify vital signs such as heart rate, respiratory rate, and blood pressure with the help of its built-in sensors. Using the camera and a microphone, I have shown how the blood pressure and heart rate can be determined for a subject. People sometimes encounter minor situations like fainting, or fatal accidents like car crashes, at unexpected times and places, and it would be useful to have a device that can measure all vital signs in such an event. The second part of this thesis demonstrates a new mode of communication for next-generation 9-1-1 calls: an architecture in which the call-taker is able to control the multimedia elements of the phone from a remote location, helping the call-taker or first responder maintain better control over the situation. Transmission of the vital signs measured using the smart phone can be a life saver in critical situations. In today's voice-oriented 9-1-1 calls, the dispatcher first collects critical information (e.g., location, call-back number) from the caller and assesses the situation. Meanwhile, dispatchers constantly face a "60-second dilemma": within 60 seconds, they need to make a complicated but important decision, whether to dispatch and, if so, what to dispatch, and they often feel that they lack sufficient information to make a confident decision. The remote media control described in this thesis will be able to facilitate information acquisition and decision making in emergency situations within the 60-second response window of 9-1-1 calls using new multimedia technologies.
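For a flavor of camera-based heart rate measurement, one common approach (a stand-in here, not necessarily the thesis's pipeline) is peak detection on the brightness signal of a fingertip video; the signal below is synthetic:

```python
import numpy as np
from scipy.signal import find_peaks

FS = 30.0                     # camera frame rate (frames/s)
t = np.arange(0, 10, 1 / FS)  # 10 s of video

# Synthetic stand-in for the mean red-channel brightness of a fingertip
# video: a ~1.2 Hz (72 bpm) pulse wave plus noise.
rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 1.2 * t) + 0.2 * rng.standard_normal(t.size)

# Peaks at least 0.4 s apart (i.e., below 150 bpm) count as heartbeats.
peaks, _ = find_peaks(signal, distance=int(0.4 * FS), height=0.3)
bpm = 60.0 * (len(peaks) - 1) / (t[peaks[-1]] - t[peaks[0]])
print(f"estimated heart rate: {bpm:.0f} bpm")
```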
Anchor Nodes Placement for Effective Passive Localization
https://digital.library.unt.edu/ark:/67531/metadc33132/

Wireless sensor networks are composed of sensor nodes, which can monitor an environment and observe events of interest. These networks are applied in various fields, including but not limited to environmental, industrial, and habitat monitoring. In many applications, the exact location of the sensor nodes is unknown after deployment. Localization is the process used to find a sensor node's positional coordinates, which is vital information. Localization is generally assisted by anchor nodes, which are also sensor nodes but with known locations. Anchor nodes are generally expensive and need to be optimally placed for effective localization. Passive localization is a localization technique in which the sensor nodes silently listen to global events like thunder sounds, seismic waves, lightning, etc. According to previous studies, the ideal location to place anchor nodes is on the perimeter of the sensor network. This may not be the case in passive localization, since the function of the anchor nodes here is different from that in other localization systems. I perform extensive studies on positioning anchor nodes for effective localization, running several simulations in dense and sparse networks. I show that, for effective passive localization, the optimal placement of the anchor nodes is at the center of the network, in such a way that no three anchor nodes are collinear. The greater the non-linearity, the better the localization. Localization for our network design proves best when anchor nodes are placed at right angles.
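The collinearity condition in this finding is easy to check for a candidate placement; the helper below flags any three (nearly) collinear anchors via cross products:

```python
from itertools import combinations

def has_collinear_triple(anchors, tol=1e-9):
    """True if any three anchor positions are (nearly) collinear, which the
    study found degrades passive localization."""
    for (ax, ay), (bx, by), (cx, cy) in combinations(anchors, 3):
        cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        if abs(cross) <= tol:
            return True
    return False

# A right-angled placement (good) vs. anchors on a straight line (bad).
print(has_collinear_triple([(0, 0), (1, 0), (0, 1), (1, 1)]))  # False
print(has_collinear_triple([(0, 0), (1, 0), (2, 0)]))          # True
```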
Socioscope: Human Relationship and Behavior Analysis in Mobile Social Networks
https://digital.library.unt.edu/ark:/67531/metadc30533/

The widely used mobile phone and its related technologies have opened opportunities for a complete change in how people interact and build relationships across geographic and time constraints. The convenience of instant communication by mobile phone, which broke the barriers of space and time, is evidently the key reason such technologies are so important in people's lives and daily activities. Mobile phones have become the most popular communication tools and are changing our relationships to each other in our work and lives. The impact of new technologies on people's lives in social spaces gives us the chance to rethink the possibilities of technology in social interaction. At the same time, mobile phones are changing social relations in ways that are intricate and difficult to measure with any precision. In this dissertation I propose a socioscope model for social network, relationship, and human behavior analysis based on mobile phone call detail records. Because of the diversity and complexity of human social behavior, no single technique can detect all of its features; therefore I use multiple probabilistic and statistical methods for quantifying social groups, relationships, and communication patterns, for predicting social tie strengths, and for detecting changes in human behavior and unusual consumption events. I propose a new reciprocity index to measure the level of reciprocity between users and their communication partners. The experimental results show that this approach is effective. Among other applications, this work is useful for homeland security, detection of unwanted calls (e.g., spam), telecommunication presence, and marketing. In my future work I plan to analyze and study social network dynamics and evolution.
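The dissertation defines its own reciprocity index; as a hedged illustration of the idea, here is a simple ratio of the smaller to the larger call count between two users, over hypothetical call detail records:

```python
from collections import Counter

# Hypothetical call detail records: (caller, callee) pairs.
cdr = [("a", "b"), ("a", "b"), ("b", "a"),
       ("a", "c"), ("a", "c"), ("a", "c")]

counts = Counter(cdr)

def reciprocity(u, v):
    """Illustrative reciprocity score in [0, 1]: ratio of the smaller to the
    larger call count between two users (a stand-in for the dissertation's
    own index, shown only to convey the idea)."""
    uv, vu = counts[(u, v)], counts[(v, u)]
    return min(uv, vu) / max(uv, vu) if max(uv, vu) else 0.0

print(reciprocity("a", "b"))  # 0.5: two calls one way, one call back
print(reciprocity("a", "c"))  # 0.0: entirely one-sided
```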
Rhythms of Interaction in Global Software Development Teams
https://digital.library.unt.edu/ark:/67531/metadc30476/

Researchers have speculated that global software teams have activity patterns that are dictated by workplace schedules or a client's needs. Similar patterns have been suggested for individuals enrolled in distance learning projects that require students to post feedback in response to questions or assignments. Researchers tend to accept the notion that students' temporal patterns adjust to academic or social calendars and are a result of choices made within these constraints. Although there is some evidence that culture does have an impact on communication behavior, it is not clear how each of these factors relates to work done in online groups. This study represents a new approach to studying student-group communication activities: it uses activity data from students participating in a global software development project to generate a variety of measures that capture patterns of when students work. Students' work habits are often determined by where they live and what they are working on, and students tend to work on group projects in cycles that correspond to start, middle, and end time periods. Knowledge obtained from this study should provide insight into current empirical research on global software development by defining time variables that can be used to compare the temporal patterns found in real-world teams. It should also inform studies about student team projects by helping instructors schedule group activities.

Elicitation of Protein-Protein Interactions from Biomedical Literature Using Association Rule Discovery
https://digital.library.unt.edu/ark:/67531/metadc30508/

Extracting information from a stack of data is a tedious task, and the scenario is no different in proteomics. Volumes of research papers are published about the study of various proteins in several species, their interactions with other proteins, and the identification of proteins as possible biomarkers in causing diseases. It is a challenging task for biologists to keep track of these developments manually by reading through the literature. Several tools have been developed by computational linguists to assist in the identification, extraction, and hypothesis generation of proteins and protein-protein interactions from biomedical publications and protein databases. However, these tools are confronted with the challenges of term variation, term ambiguity, access only to abstracts, and inconsistencies in the time-consuming manual curation of protein and protein-protein interaction repositories. This work attempts to attenuate these challenges by extracting protein-protein interactions in humans and eliciting possible interactions using association rule mining on full text, abstracts, and figure captions from publicly available biomedical literature databases. Two such databases are used in our study: the Directory of Open Access Journals (DOAJ) and PubMed Central (PMC). A corpus is built using articles retrieved with search terms, and a dataset of more than 38,000 protein-protein interactions from the Human Protein Reference Database (HPRD) is cross-referenced to validate discovered interacting pairs. A set of an optimal size of possible binary protein-protein interactions is generated and made available for clinical or biological validation. A significant change in the number of new associations was found by altering the thresholds for the support and confidence metrics. This study reduces the limitations biologists face in keeping pace with the discovery of protein-protein interactions by manually reading the literature and in their need to validate each and every possible interaction.
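A minimal sketch of pairwise association rule discovery over per-document protein mentions, with hypothetical data; as the abstract notes, changing the support and confidence thresholds changes how many associations survive:

```python
from itertools import combinations

# Hypothetical per-document protein mentions extracted from a corpus.
docs = [
    {"TP53", "MDM2"}, {"TP53", "MDM2", "EGFR"}, {"TP53", "BRCA1"},
    {"EGFR", "KRAS"}, {"TP53", "MDM2"},
]

def rules(min_support=0.3, min_confidence=0.6):
    """Pairwise association rules P -> Q over protein mentions; raising the
    support/confidence thresholds shrinks the candidate interaction set."""
    n = len(docs)
    proteins = set().union(*docs)
    for p, q in combinations(sorted(proteins), 2):
        both = sum(1 for d in docs if p in d and q in d)
        only_p = sum(1 for d in docs if p in d)
        if only_p and both / n >= min_support and both / only_p >= min_confidence:
            yield p, q, both / n, both / only_p

for p, q, s, c in rules():
    print(f"{p} -> {q}: support={s:.2f} confidence={c:.2f}")
```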
Design and Implementation of Large-Scale Wireless Sensor Networks for Environmental Monitoring Applications
https://digital.library.unt.edu/ark:/67531/metadc28493/

Environmental monitoring represents a major application domain for wireless sensor networks (WSNs). However, despite significant advances in recent years, there are still many challenging issues to be addressed to exploit the full potential of the emerging WSN technology. In this dissertation, we introduce the design and implementation of low-power wireless sensor networks for long-term, autonomous, and near-real-time environmental monitoring applications. We have developed an out-of-the-box solution consisting of a suite of software, protocols, and algorithms to provide reliable data collection with extremely low power consumption. Two wireless sensor networks based on the proposed solution have been deployed in remote field stations to monitor soil moisture along with other environmental parameters. As part of the ever-growing environmental monitoring cyberinfrastructure, these networks have been integrated into the Texas Environmental Observatory system for long-term operation. Environmental measurement and network performance results are presented to demonstrate the capability, reliability, and energy efficiency of the network.

Survey of Approximation Algorithms for Set Cover Problem
https://digital.library.unt.edu/ark:/67531/metadc12118/

In this thesis, I survey 11 approximation algorithms for the unweighted set cover problem. I have also implemented three of the algorithms and created a software library that stores the code I have written. The algorithms I survey are: 1. Johnson's standard greedy; 2. f-frequency greedy; 3. Goldschmidt, Hochbaum, and Yu's modified greedy; 4. Halldorsson's local optimization; 5. Duh and Furer's semi-local optimization; 6. Asaf Levin's improvement to Duh and Furer; 7. simple rounding; 8. randomized rounding; 9. LP duality; 10. primal-dual schema; and 11. the network flow technique. Most of the algorithms surveyed are refinements of the standard greedy algorithm.
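The first algorithm on the list, Johnson's standard greedy, fits in a few lines: repeatedly pick the set covering the most uncovered elements, which gives an H_n (roughly ln n) approximation:

```python
def greedy_set_cover(universe, sets):
    """Johnson's standard greedy: repeatedly choose the set that covers the
    most still-uncovered elements (an H_n ~ ln n approximation)."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("instance is not coverable")
        cover.append(best)
        uncovered -= best
    return cover

universe = range(1, 11)
sets = [{1, 2, 3, 8}, {1, 2, 3, 4, 5}, {4, 5, 7}, {5, 6, 7}, {6, 7, 8, 9, 10}]
for s in greedy_set_cover(universe, sets):
    print(sorted(s))
```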
End of Insertion Detection in Colonoscopy Videos
https://digital.library.unt.edu/ark:/67531/metadc12159/

Colorectal cancer is the second leading cause of cancer-related deaths, behind lung cancer, in the United States, and colonoscopy is the preferred screening method for the detection of diseases like colorectal cancer. In 2006, the American Society for Gastrointestinal Endoscopy (ASGE) and the American College of Gastroenterology (ACG) issued guidelines for quality colonoscopy, which suggest that on average the withdrawal phase during a screening colonoscopy should last a minimum of 6 minutes. My aim is to classify a colonoscopy video into its insertion and withdrawal phases. The problem is that existing shot detection techniques cannot be applied, because a colonoscopy is a single camera shot from start to end. An algorithm to detect the phase boundary has already been developed by the MIGLAB team; the existing method has acceptable accuracy, but its main issue is a dependency on MPEG (Moving Picture Experts Group) 1/2. I implemented exhaustive search for motion estimation to reduce the execution time and improve the accuracy, and took advantage of the C/C++ programming languages with multithreading, which yielded even better execution times. I propose a method for improving the current approach to colonoscopy video analysis, along with an extension that makes it usable for real-time video. The real-time version we implemented is capable of handling streams coming directly from the camera in the form of uncompressed bitmap frames; the existing implementation could not be applied to the real-time scenario because of its dependency on MPEG 1/2. Future directions of this research include improved motion search and GPU parallel computing techniques.
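Exhaustive-search block matching of the kind mentioned above can be sketched as follows, scoring every candidate displacement by the sum of absolute differences (illustrative block and search-range sizes):

```python
import numpy as np

def best_motion_vector(prev, curr, by, bx, block=8, search=4):
    """Exhaustive-search block matching: compare the current block with
    every candidate block within +/-`search` pixels in the previous frame
    and return the displacement minimizing the sum of absolute differences."""
    target = curr[by:by + block, bx:bx + block].astype(int)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > prev.shape[0] or x + block > prev.shape[1]:
                continue
            sad = np.abs(prev[y:y + block, x:x + block].astype(int) - target).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (64, 64), dtype=np.uint8)
curr = np.roll(prev, shift=(2, 3), axis=(0, 1))   # frame shifted down 2, right 3
print(best_motion_vector(prev, curr, 16, 16))      # (-2, -3): block came from up-left
```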
Urban Surface Characterization Using LiDAR and Aerial Imagery
https://digital.library.unt.edu/ark:/67531/metadc12196/

Many calamities in history, such as hurricanes, tornadoes, and flooding, are proof of the large-scale impact such events have on life and the economy. Computer simulation and GIS help in modeling a real-world scenario, which assists in evacuation planning, damage assessment, assistance, and reconstruction. Achieving such simulation and modeling requires accurate classification of ground objects, and one of the most significant aspects of this research is that it achieves improved classification for regions within which light detection and ranging (LiDAR) has low spatial resolution. This thesis describes a method for accurate classification of bare ground, water bodies, roads, vegetation, and structures using LiDAR data and aerial infrared imagery. The most basic step for any terrain modeling application is filtering, the classification of ground and non-ground points; we present an integrated, systematic method that makes classification of terrain and non-terrain points effective. Our filtering method uses the geometric features of the triangle meshes created from LiDAR samples and calculates a confidence for every point. Geometrically homogeneous blocks and confidences are derived from the TIN model and gridded LiDAR samples, and the results from the two representations are used in a classifier to determine whether a block belongs to the ground or not. Another important step is the detection of water bodies, which is based on the LiDAR sample density of the region. Objects like trees and bare ground are characterized by the geometric features present in the LiDAR data and the color features in the infrared imagery; these features are fed into an SVM classifier that detects bare ground in the given region, and trees are extracted using another trained SVM classifier. Once we obtain bare ground and trees, roads are extracted by removing the bare ground, and structures are identified by the properties of the non-ground segments. Experiments were conducted using LiDAR samples and infrared imagery from the city of New Orleans, and we evaluated the influence of different parameters on the classification. Water bodies were extracted successfully using density measures, and experiments showed that fusing geometric properties and confidence levels resulted in efficient classification of ground and non-ground regions. Classification of vegetation using the SVM was promising and effective with features like height variation, HSV color, and angle. It is demonstrated that our methods successfully classify a complex urban area with high-rise buildings using LiDAR data.

Force-Directed Graph Drawing and Aesthetics Measurement in a Non-Strict Pure Functional Programming Language
https://digital.library.unt.edu/ark:/67531/metadc12125/

Non-strict pure functional programming often requires redesigning algorithms and data structures to work effectively under the new constraints of non-strict evaluation and immutable state. Graph drawing algorithms, while numerous and broadly studied, have no presence in the non-strict pure functional programming model. Additionally, there is currently no freely licensed standalone toolkit for quantitatively analyzing the aesthetics of graph drawings. This thesis addresses two previously unexplored questions. Can a force-directed graph drawing algorithm be implemented in a non-strict functional language, such as Haskell, and still be practically usable? And can an easily extensible aesthetic measuring tool be implemented in such a language and still be practically usable? The focus of the thesis is on implementing one of the simplest force-directed algorithms, that of Fruchterman and Reingold, and comparing its resulting aesthetics to those of a well-known C++ implementation of the same algorithm.
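For reference, a compact rendering of the Fruchterman-Reingold scheme; the thesis's implementation is in Haskell, while all code sketches in this listing are kept in Python for consistency:

```python
import math, random

def fruchterman_reingold(nodes, edges, width=1.0, height=1.0, iters=50):
    """Compact Fruchterman-Reingold: repulsive forces k^2/d between all
    pairs, attractive forces d^2/k along edges, with displacement capped
    by a cooling temperature."""
    k = math.sqrt(width * height / len(nodes))
    pos = {v: [random.uniform(0, width), random.uniform(0, height)] for v in nodes}
    t = width / 10
    for _ in range(iters):
        disp = {v: [0.0, 0.0] for v in nodes}
        for v in nodes:                      # repulsion between every pair
            for u in nodes:
                if u == v:
                    continue
                dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = k * k / d
                disp[v][0] += dx / d * f
                disp[v][1] += dy / d * f
        for u, v in edges:                   # attraction along edges
            dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = d * d / k
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f
        for v in nodes:                      # cap displacement, then cool
            dx, dy = disp[v]
            d = max(math.hypot(dx, dy), 1e-9)
            pos[v][0] += dx / d * min(d, t)
            pos[v][1] += dy / d * min(d, t)
        t *= 0.95
    return pos

print(fruchterman_reingold("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]))
```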
Cross Language Information Retrieval for Languages with Scarce Resources
https://digital.library.unt.edu/ark:/67531/metadc12157/

Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic, yet most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources, using the Bible to build the parallel text. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy of translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all of these variables have a direct impact on the quality of the CLIR system.

Computational Epidemiology - Analyzing Exposure Risk: A Deterministic, Agent-Based Approach
https://digital.library.unt.edu/ark:/67531/metadc11017/

Many infectious diseases are spread through interactions between susceptible and infectious individuals. Keeping track of where each exposure to the disease took place, when it took place, and which individuals were involved can give public health officials important information to use in formulating their interventions. Further, knowing which individuals in the population are at the highest risk of becoming infected may prove to be a useful tool for public health officials trying to curtail the spread of the disease. Epidemiological models are needed to allow epidemiologists to study the population dynamics of the transmission of infectious agents and the potential impact of infectious disease control programs. While many agent-based computational epidemiological models exist in the literature, they focus on the spread of disease rather than on exposure risk; they are designed to simulate very large populations, representing individuals as agents and using random experiments and probabilities in an attempt to more realistically guide the course of the modeled disease outbreak. The work presented in this thesis focuses on tracking exposure risk to chickenpox in an elementary school setting, chosen due to the high level of detailed information realistically available to school administrators regarding individuals' schedules and movements. Using an agent-based approach, contacts between individuals are tracked and analyzed with respect to both individuals and locations, and the results are analyzed using a combination of tools from computer science and geographic information science.
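A hedged sketch of deterministic exposure tracking: individuals follow known schedules, and a susceptible person accrues exposure for each period spent in the same place as an infectious one (hypothetical schedules, not the thesis's school data):

```python
from collections import defaultdict

# Hypothetical school schedules: person -> location occupied in each period.
schedules = {
    "ann":  ["room1", "gym", "cafeteria"],
    "ben":  ["room1", "room2", "cafeteria"],
    "carl": ["room2", "gym", "library"],
}
infectious = {"ann"}

# Exposure count: periods spent in the same place as an infectious person.
exposure = defaultdict(int)
for period in range(3):
    occupants = defaultdict(list)
    for person, sched in schedules.items():
        occupants[sched[period]].append(person)
    for people in occupants.values():
        if any(p in infectious for p in people):
            for p in people:
                if p not in infectious:
                    exposure[p] += 1

print(dict(exposure))   # ben: room1 + cafeteria -> 2; carl: gym -> 1
```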
The implementation of the contact model is reviewed, with close attention to its data structures. The analysis of the experiments provides evidence that this contact model can be used to help epidemiologists study the spread of an infectious disease based on the contact rate of individuals.</p>Direct Online/Offline Digital Signature Schemes.2009-09-09T14:32:05-05:00https://digital.library.unt.edu/ark:/67531/metadc9717/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc9717/"><img alt="Direct Online/Offline Digital Signature Schemes." title="Direct Online/Offline Digital Signature Schemes." src="https://digital.library.unt.edu/ark:/67531/metadc9717/small/"/></a></p><p>Online/offline signature schemes are useful in many situations, and two such scenarios are considered in this dissertation: bursty server authentication and embedded device authentication. In this dissertation, new techniques for online/offline signing are introduced and applied in a variety of ways to create online/offline signature schemes, and five different online/offline signature schemes are proposed and proved secure under a variety of models and assumptions. Two of the proposed five schemes have the best offline or best online performance of any currently known technique, and are particularly well suited for the scenarios considered in this dissertation. To determine whether the proposed schemes provide the expected practical improvements, a series of experiments was conducted comparing the proposed schemes with each other and with other state-of-the-art schemes in this area, both on a desktop-class computer and under AVR Studio, a simulation platform for an 8-bit processor that is popular for embedded systems. Under AVR Studio, the proposed SGE scheme, using a typical key size for the embedded device authentication scenario, can complete the offline phase in about 24 seconds and then produce a signature (the online phase) in 15 milliseconds, the best offline performance of any known signature scheme that has been proven secure in the standard model. In the tests on a desktop-class computer, the proposed SGS scheme, which has the best online performance and is designed for the bursty server authentication scenario, generated 469,109 signatures per second, while the Schnorr scheme (the next best scheme in terms of online performance) generated only 223,548 signatures per second. The experimental results demonstrate that the SGE and SGS schemes are the most efficient techniques for embedded device authentication and bursty server authentication, respectively.</p>Graph-based Centrality Algorithms for Unsupervised Word Sense Disambiguation2009-09-09T14:31:48-05:00https://digital.library.unt.edu/ark:/67531/metadc9736/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc9736/"><img alt="Graph-based Centrality Algorithms for Unsupervised Word Sense Disambiguation" title="Graph-based Centrality Algorithms for Unsupervised Word Sense Disambiguation" src="https://digital.library.unt.edu/ark:/67531/metadc9736/small/"/></a></p><p>This thesis introduces an innovative methodology that combines traditional dictionary-based approaches to word sense disambiguation (semantic similarity measures and overlap of word glosses, both based on WordNet) with graph-based centrality methods, namely the degree of the vertices, PageRank, closeness, and betweenness. The approach is completely unsupervised, and is based on creating graphs for the words to be disambiguated.
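The flavor of this graph construction can be sketched as follows; the similarity function and sense inventory are hypothetical stand-ins for the WordNet-based measures, and the centrality used here is weighted degree, the simplest of the measures listed above:
<pre><code># Sketch of graph-based WSD: one vertex per candidate sense, edges weighted
# by a semantic similarity measure, senses ranked by weighted degree centrality.
from itertools import combinations

def similarity(sense_a, sense_b):
    # Hypothetical stand-in for a WordNet-based measure (e.g., gloss overlap).
    return len(set(sense_a[1]) & set(sense_b[1])) / 10.0

def disambiguate(words):
    # words: {word: [(sense_id, gloss_tokens), ...]}
    score = {}
    senses = [(w, s) for w, ss in words.items() for s in ss]
    for (w1, s1), (w2, s2) in combinations(senses, 2):
        if w1 == w2:
            continue                      # no edges between senses of the same word
        sim = similarity(s1, s2)
        score[(w1, s1[0])] = score.get((w1, s1[0]), 0.0) + sim
        score[(w2, s2[0])] = score.get((w2, s2[0]), 0.0) + sim
    best = {}
    for (word, sense_id), sc in score.items():
        if word not in best or sc > best[word][1]:
            best[word] = (sense_id, sc)
    return {w: sid for w, (sid, _) in best.items()}

words = {
    "bank": [("bank.n.01", ["money", "deposit"]), ("bank.n.02", ["river", "slope"])],
    "deposit": [("deposit.n.01", ["money", "bank"])],
}
print(disambiguate(words))   # picks the financial sense of "bank"
</code></pre>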
In the first stage of our experiments, we evaluate several possible combinations of the semantic similarity measures. The next stage scores individual vertices in the previously created graphs based on several graph connectivity measures. During the final stage, several voting schemes are applied to the results obtained from the different centrality algorithms. The most important contributions of this work are not only that the approach is novel and works well, but also that it has great potential for overcoming the knowledge-acquisition bottleneck that has apparently brought research in supervised WSD as an explicit application to a plateau. The type of research reported in this thesis, which does not require manually annotated data, holds promise for many new and interesting developments, and our work is one of the first steps, albeit a small one, in this direction. The complete system is built and tested on standard benchmarks, and is comparable with work done on graph-based word sense disambiguation as well as lexical chains. The evaluation indicates that the right combination of the above-mentioned metrics can be used to develop an unsupervised disambiguation engine as powerful as the state of the art in WSD.</p>Exploring Trusted Platform Module Capabilities: A Theoretical and Experimental Study2008-10-02T16:42:58-05:00https://digital.library.unt.edu/ark:/67531/metadc6101/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc6101/"><img alt="Exploring Trusted Platform Module Capabilities: A Theoretical and Experimental Study" title="Exploring Trusted Platform Module Capabilities: A Theoretical and Experimental Study" src="https://digital.library.unt.edu/ark:/67531/metadc6101/small/"/></a></p><p>Trusted platform modules (TPMs) are hardware modules, bound to a computer's motherboard, that are being included in many desktops and laptops. Augmenting computers with these hardware modules adds powerful functionality in distributed settings, allowing us to reason about the security of these systems in new ways. In this dissertation, I study the functionality of TPMs from a theoretical as well as an experimental perspective. On the theoretical front, I leverage various features of TPMs to construct applications, like random oracles, that are impossible to implement in a standard model of computation. Apart from random oracles, I construct a new cryptographic primitive that is essentially a non-interactive form of the standard cryptographic primitive of oblivious transfer. I apply this new primitive to secure mobile agent computations, where interaction between various entities is typically required to ensure security. I prove these constructions secure using standard cryptographic techniques and assumptions. To test the practicality of these constructions and their applications, I performed an experimental study, both on an actual TPM and on a software TPM simulator enhanced to reflect timings from a real TPM. This allowed me to benchmark the performance of the applications and test the feasibility of the proposed extensions to standard TPMs.
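The experimental methodology amounts to repeatedly timing each primitive and application; a generic harness of the kind used for such measurements might look like this (illustrative, not the dissertation's actual test code):
<pre><code># Generic micro-benchmark harness of the kind used to time cryptographic
# primitives; the timed operation below is a placeholder workload.
import time
import statistics

def benchmark(operation, repetitions=1000):
    """Return the median wall-clock time of one call, in milliseconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Example: time a placeholder 'offline phase' computation.
print(benchmark(lambda: sum(i * i for i in range(10000))))
</code></pre>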
My tests also show that these constructions are practical.</p>General Purpose Programming on Modern Graphics Hardware2008-10-02T16:42:25-05:00https://digital.library.unt.edu/ark:/67531/metadc6112/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc6112/"><img alt="General Purpose Programming on Modern Graphics Hardware" title="General Purpose Programming on Modern Graphics Hardware" src="https://digital.library.unt.edu/ark:/67531/metadc6112/small/"/></a></p><p>I start with a brief introduction to the graphics processing unit (GPU) as well as general-purpose computation on modern graphics hardware (GPGPU). Next, I explore the motivations for GPGPU programming and the capabilities of modern GPUs (including advantages and disadvantages). Also, I give the background required for further exploring GPU programming, including the terminology used and the resources available. Finally, I include a comprehensive survey of previous and current GPGPU work, and end with a look at the future of GPU programming.</p>Keywords in the mist: Automated keyword extraction for very large documents and back of the book indexing.2008-10-02T16:42:09-05:00https://digital.library.unt.edu/ark:/67531/metadc6118/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc6118/"><img alt="Keywords in the mist: Automated keyword extraction for very large documents and back of the book indexing." title="Keywords in the mist: Automated keyword extraction for very large documents and back of the book indexing." src="https://digital.library.unt.edu/ark:/67531/metadc6118/small/"/></a></p><p>This research addresses the problem of automatic keyphrase extraction from large documents and back-of-the-book indexing. The potential benefits of automating this process are far-reaching, from improving information retrieval in digital libraries to saving countless man-hours by helping professional indexers create back-of-the-book indexes. The dissertation introduces a new methodology to evaluate automated systems, which allows for a detailed, comparative analysis of several techniques for keyphrase extraction. We introduce and evaluate both supervised and unsupervised techniques, designed to balance the resource requirements of an automated system and the best achievable performance. Additionally, a number of novel features are proposed, including a statistical informativeness measure based on chi statistics; an encyclopedic feature that taps into the vast knowledge base of Wikipedia to establish the likelihood of a phrase referring to an informative concept; and a linguistic feature based on sophisticated semantic analysis of the text using current theories of discourse comprehension. The resulting keyphrase extraction system is shown to outperform the current state of the art in supervised keyphrase extraction by a large margin. Moreover, a fully automated back-of-the-book indexing system based on the keyphrase extraction system was shown to produce back-of-the-book indexes closely resembling those created by human experts.</p>A Netcentric Scientific Research Repository2008-05-14T21:18:32-05:00https://digital.library.unt.edu/ark:/67531/metadc5611/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5611/"><img alt="A Netcentric Scientific Research Repository" title="A Netcentric Scientific Research Repository" src="https://digital.library.unt.edu/ark:/67531/metadc5611/small/"/></a></p><p>The Internet and networks in general have become essential tools for disseminating information.
Search engines have become the predominant means of finding information on the Web and in other data repositories, including local resources. Domain scientists regularly acquire and analyze images generated by equipment such as microscopes and cameras, resulting in complex image files that need to be managed in a convenient manner. This type of integrated environment has recently been termed a netcentric scientific research repository. I developed a number of data manipulation tools that allow researchers to manage their information more effectively in a netcentric environment. The specific contributions are: (1) A unique interface for management of data including files and relational databases. A wrapper for relational databases was developed so that the data can be indexed and searched using traditional search engines. This approach allows data in databases to be searched with the same interface as other data. Furthermore, this approach makes it easier for scientists to work with their data if they are not familiar with SQL. (2) A Web services based architecture for integrating analysis operations into a repository. This technique allows the system to leverage the large number of existing tools by wrapping them with a Web service and registering the service with the repository. Metadata associated with Web services was enhanced to allow this feature to be included. In addition, an improved binary-to-text encoding scheme was developed to reduce the size overhead for sending large scientific data files via the XML messages used in Web services. (3) Integrated image analysis operations with SQL. This technique allows images to be stored and managed conveniently in a relational database. SQL supplemented with map algebra operations is used to select and perform operations on sets of images.</p>Analysis of Web Services on J2EE Application Servers2008-05-14T20:41:20-05:00https://digital.library.unt.edu/ark:/67531/metadc5547/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5547/"><img alt="Analysis of Web Services on J2EE Application Servers" title="Analysis of Web Services on J2EE Application Servers" src="https://digital.library.unt.edu/ark:/67531/metadc5547/small/"/></a></p><p>The Internet became a standard way of exchanging business data between B2B and B2C applications, and with this came the need for providing various services on the web instead of just static text and images. Web services are a new type of service offered via the web that aid in the creation of globally distributed applications. Web services are enhanced e-business applications that are easier to advertise and easier to discover on the Internet because of their flexibility and uniformity. In a real-life scenario it is highly difficult to decide which J2EE application server to choose when deploying an enterprise web service. This thesis analyzes the various ways in which web services can be developed and deployed. Underlying protocols and crucial issues such as EAI (enterprise application integration), asynchronous messaging, and Registry tModel architecture have been considered in this research.
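At the wire level, every such web service call reduces to an XML envelope posted over HTTP; a minimal hand-built SOAP 1.1 request, with hypothetical service and operation names, looks like this:
<pre><code># Minimal SOAP 1.1 request envelope, built by hand to show the underlying
# protocol; the operation, namespace, and SOAPAction value are hypothetical.
envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getQuote xmlns="urn:example:stockquote">
      <symbol>UNT</symbol>
    </getQuote>
  </soap:Body>
</soap:Envelope>"""

# HTTP headers such a POST would carry to a deployed endpoint.
headers = {
    "Content-Type": "text/xml; charset=utf-8",
    "SOAPAction": "urn:example:stockquote/getQuote",
}
print(headers)
print(envelope)
</code></pre>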
This thesis presents an analysis of what various J2EE application servers provide, through a case study and through applications developed to test functionality.</p>Hopfield Networks as an Error Correcting Technique for Speech Recognition2008-05-14T20:41:02-05:00https://digital.library.unt.edu/ark:/67531/metadc5551/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5551/"><img alt="Hopfield Networks as an Error Correcting Technique for Speech Recognition" title="Hopfield Networks as an Error Correcting Technique for Speech Recognition" src="https://digital.library.unt.edu/ark:/67531/metadc5551/small/"/></a></p><p>I experimented with Hopfield networks in the context of a voice-based, query-answering system. Hopfield networks are used to store and retrieve patterns. I used this technique to store queries represented as natural language sentences, and I evaluated the accuracy of the technique for error correction in a spoken question-answering dialog between a computer and a user. I show that the use of an auto-associative Hopfield network helps make the speech recognition system more fault tolerant. I also looked at the available encoding schemes for converting a natural language sentence into a pattern of zeroes and ones that can be stored in the Hopfield network reliably, and I suggest scalable data representations that allow storing a large number of queries.</p>A Multi-Variate Analysis of SMTP Paths and Relays to Restrict Spam and Phishing Attacks in Emails2008-05-05T15:10:22-05:00https://digital.library.unt.edu/ark:/67531/metadc5402/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5402/"><img alt="A Multi-Variate Analysis of SMTP Paths and Relays to Restrict Spam and Phishing Attacks in Emails" title="A Multi-Variate Analysis of SMTP Paths and Relays to Restrict Spam and Phishing Attacks in Emails" src="https://digital.library.unt.edu/ark:/67531/metadc5402/small/"/></a></p><p>The classifier discussed in this thesis considers the path traversed by an email (instead of its content) and the reputation of the relays, features that are inaccessible to spammers. Groups of spammers and the individual behaviors of a spammer in a given domain were analyzed to yield association patterns, which were then used to identify similar spammers. Unsolicited and phishing emails were successfully isolated from legitimate emails using the analysis results. Spammers and phishers are also categorized into serial spammers/phishers, recent spammers/phishers, prospective spammers/phishers, and suspects. Legitimate emails and trusted domains are classified into socially close (family members, friends), socially distinct (strangers, etc.), and opt-outs (resolved false positives and false negatives).
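A toy version of such path-based scoring is sketched below; the relay reputations and decision threshold are invented for illustration, whereas the thesis's classifier mines association patterns over relay paths:
<pre><code># Toy path-based email scoring: score a message by the reputation of the
# SMTP relays it traversed, rather than by its content.
relay_reputation = {          # hypothetical reputations learned from past traffic
    "mail.example.edu": 0.95,
    "smtp.isp.example.net": 0.80,
    "open-relay.example.org": 0.10,
}

def path_score(relays, default=0.5):
    """Average reputation along the path; unknown relays get a neutral default."""
    scores = [relay_reputation.get(r, default) for r in relays]
    return sum(scores) / len(scores)

path = ["open-relay.example.org", "smtp.isp.example.net"]
verdict = "spam" if path_score(path) < 0.5 else "legitimate"
print(path_score(path), verdict)   # 0.45 spam
</code></pre>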
Overall, this classifier resulted in far fewer false positives than current filters like SpamAssassin, achieving 98.65% precision, which is comparable to the precision achieved by SPF and DNSRBL blacklists.</p>A Language and Visual Interface to Specify Complex Spatial Pattern Mining2008-05-05T15:09:39-05:00https://digital.library.unt.edu/ark:/67531/metadc5408/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5408/"><img alt="A Language and Visual Interface to Specify Complex Spatial Pattern Mining" title="A Language and Visual Interface to Specify Complex Spatial Pattern Mining" src="https://digital.library.unt.edu/ark:/67531/metadc5408/small/"/></a></p><p>The emerging interest in spatial pattern mining leads to demand for a flexible spatial pattern mining language, on which an easy-to-use and easy-to-understand visual pattern language could be built. It is therefore worthwhile to define a pattern mining language, called LCSPM, that allows users to specify complex spatial patterns. I describe the proposed pattern mining language in this thesis. A visual interface that allows users to specify patterns visually was developed. Visual pattern queries are translated into the LCSPM language by a parser, and the data mining process can be triggered afterwards. The visual language is based on and goes beyond the visual language proposed in the literature. I implemented a prototype system based on the open source JUMP framework.</p>Power-benefit analysis of erasure encoding with redundant routing in sensor networks.2008-05-05T15:06:42-05:00https://digital.library.unt.edu/ark:/67531/metadc5426/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5426/"><img alt="Power-benefit analysis of erasure encoding with redundant routing in sensor networks." title="Power-benefit analysis of erasure encoding with redundant routing in sensor networks." src="https://digital.library.unt.edu/ark:/67531/metadc5426/small/"/></a></p><p>One of the problems sensor networks face is adversaries corrupting nodes along the path to the base station. One way to reduce the effect of these attacks is multipath routing. This introduces some intrusion tolerance in the network by way of redundancy, but at the cost of higher power consumption by the sensor nodes. Erasure coding can be applied to this scenario, so that the base station can receive a subset of the total data sent and still reconstruct the entire message packet at its end. This thesis uses two commonly used encodings and compares their power consumption in multipath routing against that of unencoded data. It is found that using encoding with multipath routing reduces power consumption while still enabling the user to send reasonably large data sizes. The experiments in this thesis were performed on the Tiny OS platform, with the simulations done in TOSSIM and the power measurements taken in PowerTOSSIM. They were performed on the simple radio model and the lossy radio model provided by Tiny OS. The lossy radio model was simulated with distances of 10 feet, 15 feet and 20 feet between nodes. It was found that by using erasure encoding, double or triple the data size can be sent at the same power consumption rate as unencoded data.
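The recover-from-a-subset property that makes this possible can be illustrated with a single XOR parity fragment; this is a minimal sketch, and the encodings compared in the thesis are stronger:
<pre><code># Minimal (k+1, k) erasure code: k data fragments plus one XOR parity fragment.
# The base station can reconstruct the packet if any one fragment is lost.
from functools import reduce

def encode(fragments):
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), fragments)
    return fragments + [parity]

def decode(received):
    """received: fragment list where at most one entry is None (lost)."""
    if None not in received:
        return received[:-1]
    present = [f for f in received if f is not None]
    missing = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), present)
    return [missing if f is None else f for f in received[:-1]]

frags = encode([b"sens", b"or d", b"ata!"])
frags[1] = None                       # one fragment lost on a corrupted path
print(b"".join(decode(frags)))        # b'sensor data!'
</code></pre>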
All the experiments were performed with the radio set at a normal transmit power, and later at a high transmit power.</p>Grid-based Coordinated Routing in Wireless Sensor Networks2008-05-05T15:05:53-05:00https://digital.library.unt.edu/ark:/67531/metadc5437/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5437/"><img alt="Grid-based Coordinated Routing in Wireless Sensor Networks" title="Grid-based Coordinated Routing in Wireless Sensor Networks" src="https://digital.library.unt.edu/ark:/67531/metadc5437/small/"/></a></p><p>Wireless sensor networks are battery-powered ad-hoc networks in which sensor nodes scattered over a region connect to each other and form multi-hop networks. These nodes are equipped with sensors such as temperature sensors, pressure sensors, and light sensors, and can be queried to get the corresponding values for analysis. However, since they are battery operated, care has to be taken so that these nodes use energy efficiently. One of the areas in sensor networks where an energy analysis can be done is routing. This work explores grid-based coordinated routing in wireless sensor networks and compares the energy available in the network over time for different grid sizes.</p>Mediation on XQuery Views2008-05-05T15:05:24-05:00https://digital.library.unt.edu/ark:/67531/metadc5442/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5442/"><img alt="Mediation on XQuery Views" title="Mediation on XQuery Views" src="https://digital.library.unt.edu/ark:/67531/metadc5442/small/"/></a></p><p>The major goal of information integration is to provide efficient and easy-to-use access to multiple heterogeneous data sources with a single query. At the same time, one of the current trends is to use standard technologies for implementing solutions to complex software problems. In this dissertation, I used XML and XQuery as the standard technologies and developed an extended projection algorithm to provide a solution to the information integration problem. In order to demonstrate my solution, I implemented a prototype mediation system called Omphalos based on XML-related technologies. The dissertation describes the architecture of the system, its metadata, and the process it uses to answer queries. The system uses XQuery expressions (termed metaqueries) to capture complex mappings between global schemas and data source schemas. The system then applies these metaqueries in order to rewrite a user query on a virtual global database (representing the integrated view of the heterogeneous data sources) to a query (termed an outsourced query) on the real data sources. An extended XML document projection algorithm was developed to increase the efficiency of selecting the relevant subset of data from an individual data source to answer the user query. The system applies the projection algorithm to decompose an outsourced query into atomic queries, each of which is executed on a single data source. I also developed an algorithm to generate integrating queries, which the system uses to compose the answers from the atomic queries into a single answer to the original user query. I present a proof of both the extended XML document projection algorithm and the query integration algorithm. An analysis of the efficiency of the new extended algorithm is also presented.
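Stripped to its essence, document projection keeps only the parts of a document reachable along the paths a query actually uses; the following is a minimal sketch of that idea using Python's standard ElementTree, not the dissertation's extended algorithm:
<pre><code># Sketch of XML document projection: prune an XML tree down to the
# element paths a query actually touches, before evaluating the query.
import xml.etree.ElementTree as ET

def project(element, paths, prefix=""):
    """Keep a child only if some required path passes through it."""
    for child in list(element):
        child_path = prefix + "/" + child.tag
        if any(p == child_path or p.startswith(child_path + "/") for p in paths):
            project(child, paths, child_path)
        else:
            element.remove(child)

doc = ET.fromstring(
    "<db><book><title>XQuery</title><price>30</price></book>"
    "<dvd><title>Clip</title></dvd></db>")
project(doc, ["/book/title"])
print(ET.tostring(doc).decode())   # <db><book><title>XQuery</title></book></db>
</code></pre>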
Finally, I describe a collaborative schema-matching tool that was implemented to facilitate maintaining metadata.</p>CLUE: A Cluster Evaluation Tool2008-05-05T15:05:02-05:00https://digital.library.unt.edu/ark:/67531/metadc5444/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5444/"><img alt="CLUE: A Cluster Evaluation Tool" title="CLUE: A Cluster Evaluation Tool" src="https://digital.library.unt.edu/ark:/67531/metadc5444/small/"/></a></p><p>Modern high performance computing is dependent on parallel processing systems. Most current benchmarks reveal only high-level computational throughput metrics, which may be sufficient for single processor systems but can lead to a misrepresentation of true system capability for parallel systems. A new benchmark is therefore proposed. CLUE (Cluster Evaluator) uses a cellular automata algorithm to evaluate the scalability of parallel processing machines. The benchmark also uses algorithmic variations to evaluate individual system components' impact on the overall serial fraction and efficiency. CLUE is not a replacement for other performance-centric benchmarks, but rather shows the scalability of a system and provides metrics to reveal where one can improve overall performance. CLUE is a new benchmark which enables a better comparison among different parallel systems than existing benchmarks and can diagnose where a particular parallel system can be optimized.</p>An Approach Towards Self-Supervised Classification Using Cyc2008-05-05T15:02:37-05:00https://digital.library.unt.edu/ark:/67531/metadc5470/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5470/"><img alt="An Approach Towards Self-Supervised Classification Using Cyc" title="An Approach Towards Self-Supervised Classification Using Cyc" src="https://digital.library.unt.edu/ark:/67531/metadc5470/small/"/></a></p><p>Due to the long duration required for manual knowledge entry by human knowledge engineers, it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles, the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work.</p>Natural Language Interfaces to Databases2008-05-05T15:02:09-05:00https://digital.library.unt.edu/ark:/67531/metadc5474/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5474/"><img alt="Natural Language Interfaces to Databases" title="Natural Language Interfaces to Databases" src="https://digital.library.unt.edu/ark:/67531/metadc5474/small/"/></a></p><p>Natural language interfaces to databases (NLIDB) are systems that aim to bridge the gap between the languages used by humans and computers, and automatically translate natural language sentences to database queries. This thesis proposes a novel approach to NLIDB, using graph-based models.
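To ground the task, a toy pattern-based translator over a hypothetical geography table is sketched below; it illustrates the sentence-to-SQL problem itself, not the graph-based method proposed here:
<pre><code># Toy NLIDB: map one natural-language pattern onto a SQL query.
# A graph-based system replaces this brittle pattern matching with
# knowledge collected from the database schema and example sentences.
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state (name TEXT, capital TEXT)")
conn.executemany("INSERT INTO state VALUES (?, ?)",
                 [("Texas", "Austin"), ("Ohio", "Columbus")])

def translate(question):
    m = re.match(r"what is the capital of (\w+)\??", question.lower())
    if m:
        return "SELECT capital FROM state WHERE lower(name) = ?", (m.group(1),)
    raise ValueError("unsupported question")

sql, params = translate("What is the capital of Texas?")
print(conn.execute(sql, params).fetchone())   # ('Austin',)
</code></pre>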
The graph-based system starts by collecting as much information as possible from existing databases and sentences, and transforms this information into a knowledge base for the system. Given a new question, the system uses this knowledge to analyze and translate the sentence into its corresponding database query statement. The graph-based NLIDB system uses English as the natural language, a relational database model, and SQL as the formal query language. In experiments performed with natural language questions run against a large database containing information about U.S. geography, the system showed good performance compared to the state of the art in the field.</p>Group-EDF: A New Approach and an Efficient Non-Preemptive Algorithm for Soft Real-Time Systems2008-05-05T14:52:44-05:00https://digital.library.unt.edu/ark:/67531/metadc5317/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5317/"><img alt="Group-EDF: A New Approach and an Efficient Non-Preemptive Algorithm for Soft Real-Time Systems" title="Group-EDF: A New Approach and an Efficient Non-Preemptive Algorithm for Soft Real-Time Systems" src="https://digital.library.unt.edu/ark:/67531/metadc5317/small/"/></a></p><p>Hard real-time systems in robotics, space and military missions, and control devices are specified with stringent and critical time constraints. On the other hand, soft real-time applications arising from multimedia, telecommunications, Internet web services, and games are specified with more lenient constraints. Real-time systems can also be distinguished in terms of their implementation into preemptive and non-preemptive systems. In preemptive systems, tasks are often preempted by higher priority tasks. Non-preemptive systems are gaining interest for implementing soft real-time applications on multithreaded platforms. In this dissertation, I propose a new algorithm that uses a two-level scheduling strategy for scheduling non-preemptive soft real-time tasks. Our goal is to improve the success ratios of the well-known earliest deadline first (EDF) approach when the load on the system is very high and to improve the overall performance in both underloaded and overloaded conditions. Our approach, known as group-EDF (gEDF), is based on dynamic grouping of tasks with deadlines that are very close to each other, and using a shortest job first (SJF) technique to schedule tasks within the group. I believe that grouping tasks dynamically with similar deadlines and utilizing secondary criteria, such as minimizing the total execution time, can lead to new and more efficient real-time scheduling algorithms. I present results comparing gEDF with other real-time algorithms, including EDF, best-effort, and guarantee schemes, by using randomly generated tasks with varying execution times, release times, deadlines and tolerances to missing deadlines, under varying workloads.
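The two-level strategy can be sketched compactly; the grouping window and task format below are illustrative simplifications of gEDF's dynamic deadline-closeness grouping:
<pre><code># Sketch of group-EDF (gEDF): order tasks by deadline (EDF), group tasks whose
# deadlines fall close together, then run shortest-job-first within each group.
def gedf_order(tasks, window=0.1):
    """tasks: list of (name, execution_time, deadline); window is the
    relative deadline closeness that defines a group."""
    pending = sorted(tasks, key=lambda t: t[2])          # EDF order
    schedule = []
    while pending:
        head_deadline = pending[0][2]
        group = [t for t in pending if t[2] <= head_deadline * (1 + window)]
        group.sort(key=lambda t: t[1])                   # SJF within the group
        schedule.append(group[0])
        pending.remove(group[0])
    return schedule

tasks = [("a", 5.0, 10.0), ("b", 1.0, 10.5), ("c", 2.0, 30.0)]
print([name for name, _, _ in gedf_order(tasks)])        # ['b', 'a', 'c']
</code></pre>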
Furthermore, I implemented the gEDF algorithm in the Linux kernel and evaluated gEDF for scheduling real applications.</p>Modeling Infectious Disease Spread Using Global Stochastic Field Simulation2008-05-05T14:48:45-05:00https://digital.library.unt.edu/ark:/67531/metadc5335/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5335/"><img alt="Modeling Infectious Disease Spread Using Global Stochastic Field Simulation" title="Modeling Infectious Disease Spread Using Global Stochastic Field Simulation" src="https://digital.library.unt.edu/ark:/67531/metadc5335/small/"/></a></p><p>The susceptibles-infectives-removals (SIR) model and its derivatives are the classic mathematical models for the study of infectious diseases in epidemiology. In order to model and simulate epidemics of an infectious disease, a global stochastic field simulation paradigm (GSFS) is proposed, which incorporates geographically and demographically based interactions. The interaction measure between regions is a function of population density and geographical distance, and has been extended to include demographic and migratory constraints. The progression of diseases using GSFS is analyzed, and GSFS is shown to exhibit behavior similar to the SIR model when using the geographic information systems (GIS) gravity model for interactions. The limitations of the SIR and similar models, which assume a homogeneous population with uniform mixing, are addressed by the GSFS model. The GSFS model is oriented toward heterogeneous populations, and can incorporate interactions based on geography, demography, environment and migration patterns. Diseases can be modeled at higher levels of fidelity using the GSFS model, which facilitates optimal deployment of public health resources for prevention, control and surveillance of infectious diseases.</p>Using Reinforcement Learning in Partial Order Plan Space2008-05-05T14:14:05-05:00https://digital.library.unt.edu/ark:/67531/metadc5232/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5232/"><img alt="Using Reinforcement Learning in Partial Order Plan Space" title="Using Reinforcement Learning in Partial Order Plan Space" src="https://digital.library.unt.edu/ark:/67531/metadc5232/small/"/></a></p><p>Partial order planning is an important approach that solves planning problems without completely specifying the orderings between the actions in the plan. This property provides greater flexibility in executing plans, making partial order planners a preferred choice over other planning methodologies. However, in order to find partially ordered plans, partial order planners perform a search in plan space rather than in the space of world states, and an uninformed search in plan space leads to poor efficiency. In this thesis, I discuss applying a reinforcement learning method, the first-visit Monte Carlo method, to partial order planning in order to design agents that do not need any training data or heuristics but are still able to make informed decisions in plan space based on experience. Communicating effectively with the agent is crucial in reinforcement learning.
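For reference, first-visit Monte Carlo value estimation in its generic form looks as follows; the episodes are abstract placeholders for sequences of plan-refinement decisions and rewards:
<pre><code># Generic first-visit Monte Carlo value estimation: average the return
# observed after the FIRST visit to each state within every episode.
from collections import defaultdict

def first_visit_mc(episodes, gamma=1.0):
    """episodes: list of [(state, reward), ...] sequences."""
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        first_visit_returns = {}
        for state, reward in reversed(episode):
            g = gamma * g + reward
            first_visit_returns[state] = g    # overwritten until the earliest visit wins
        for state, g_value in first_visit_returns.items():
            returns[state].append(g_value)
    return {s: sum(v) / len(v) for s, v in returns.items()}

episodes = [[("s0", 0.0), ("s1", 1.0)], [("s0", 0.0), ("s0", 1.0)]]
print(first_visit_mc(episodes))   # {'s1': 1.0, 's0': 1.0}
</code></pre>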
I address how effective communication with the agent was accomplished in plan space, and present the results of an evaluation on a blocks world test bed.</p>Towards Communicating Simple Sentence using Pictorial Representations2008-05-05T14:05:45-05:00https://digital.library.unt.edu/ark:/67531/metadc5263/<p><a href="https://digital.library.unt.edu/ark:/67531/metadc5263/"><img alt="Towards Communicating Simple Sentence using Pictorial Representations" title="Towards Communicating Simple Sentence using Pictorial Representations" src="https://digital.library.unt.edu/ark:/67531/metadc5263/small/"/></a></p><p>Language can sometimes be an impediment to communication. Whether we are talking about people who speak different languages, students who are learning a new language, or people with language disorders, the understanding of linguistic representations in a given language requires a certain amount of knowledge that not everybody has. In this thesis, we propose "translation through pictures" as a means for conveying simple pieces of information across language barriers, and describe a system that can automatically generate pictorial representations for simple sentences. Comparative experiments conducted on visual and linguistic representations of information show that a considerable amount of understanding can be achieved through pictorial descriptions, with results within a comparable range of those obtained with current machine translation techniques. Moreover, a user study conducted around the pictorial translation system reveals that users found the system to generally produce correct word/image associations, and rated the system as interactive and intelligent.</p>