You limited your search to:

  Partner: UNT Libraries
 Degree Discipline: Computer Science
 Collection: UNT Theses and Dissertations
Smartphone-based Household Travel Survey-a Literature Review, an App, and a Pilot Survey
Access: Use of this item is restricted to the UNT Community.
High precision data from household travel survey (HTS) is extremely important for the transportation research, traffic models and policy formulation. Traditional methods of data collection were imprecise because they relied on people’s memories of trip information, such as date and location, and the remainder data had to be obtained by certain supplemental tools. The traditional methods suffered from intensive labor, large time consumption, and unsatisfactory data precision. Recent research trends to employ smartphone apps to collect HTS data. In this study, there are two goals to be addressed. First, a smartphone app is developed to realize a smartphone-based method only for data collection. Second, the researcher evaluates whether this method can supply or replace the traditional tools of HTS. Based on this premise, the smartphone app, TravelSurvey, is specially developed and used for this study. TravelSurvey is currently compatible with iPhone 4 or higher and iPhone Operating System (iOS) 6 or higher, except iPhone 6 or iPhone 6 plus and iOS 8. To evaluate the feasibility, eight individuals are recruited to participate in a pilot HTS. Afterwards, seven of them are involved in a semi-structured interview. The interview is designed to collect interviewees’ feedback directly, so the interview mainly concerns the users’ experience of TravelSurvey. Generally, the feedback is positive. In this study, the pilot HTS data is successfully uploaded to the server by the participants, and the interviewees prefer this smartphone-based method. Therefore, as a new tool, the smartphone-based method feasibly supports a typical HTS for data collection. digital.library.unt.edu/ark:/67531/metadc700116/
General Purpose Computing in Gpu - a Watermarking Case Study
The purpose of this project is to explore the GPU for general purpose computing. The GPU is a massively parallel computing device that has a high-throughput, exhibits high arithmetic intensity, has a large market presence, and with the increasing computation power being added to it each year through innovations, the GPU is a perfect candidate to complement the CPU in performing computations. The GPU follows the single instruction multiple data (SIMD) model for applying operations on its data. This model allows the GPU to be very useful for assisting the CPU in performing computations on data that is highly parallel in nature. The compute unified device architecture (CUDA) is a parallel computing and programming platform for NVIDIA GPUs. The main focus of this project is to show the power, speed, and performance of a CUDA-enabled GPU for digital video watermark insertion in the H.264 video compression domain. Digital video watermarking in general is a highly computationally intensive process that is strongly dependent on the video compression format in place. The H.264/MPEG-4 AVC video compression format has high compression efficiency at the expense of having high computational complexity and leaving little room for an imperceptible watermark to be inserted. Employing a human visual model to limit distortion and degradation of visual quality introduced by the watermark is a good choice for designing a video watermarking algorithm though this does introduce more computational complexity to the algorithm. Research is being conducted into how the CPU-GPU execution of the digital watermark application can boost the speed of the applications several times compared to running the application on a standalone CPU using NVIDIA visual profiler to optimize the application. digital.library.unt.edu/ark:/67531/metadc700078/
Autonomic Failure Identification and Diagnosis for Building Dependable Cloud Computing Systems
The increasingly popular cloud-computing paradigm provides on-demand access to computing and storage with the appearance of unlimited resources. Users are given access to a variety of data and software utilities to manage their work. Users rent virtual resources and pay for only what they use. In spite of the many benefits that cloud computing promises, the lack of dependability in shared virtualized infrastructures is a major obstacle for its wider adoption, especially for mission-critical applications. Virtualization and multi-tenancy increase system complexity and dynamicity. They introduce new sources of failure degrading the dependability of cloud computing systems. To assure cloud dependability, in my dissertation research, I develop autonomic failure identification and diagnosis techniques that are crucial for understanding emergent, cloud-wide phenomena and self-managing resource burdens for cloud availability and productivity enhancement. We study the runtime cloud performance data collected from a cloud test-bed and by using traces from production cloud systems. We define cloud signatures including those metrics that are most relevant to failure instances. We exploit profiled cloud performance data in both time and frequency domain to identify anomalous cloud behaviors and leverage cloud metric subspace analysis to automate the diagnosis of observed failures. We implement a prototype of the anomaly identification system and conduct the experiments in an on-campus cloud computing test-bed and by using the Google datacenter traces. Our experimental results show that our proposed anomaly detection mechanism can achieve 93% detection sensitivity while keeping the false positive rate as low as 6.1% and outperform other tested anomaly detection schemes. In addition, the anomaly detector adapts itself by recursively learning from these newly verified detection results to refine future detection. digital.library.unt.edu/ark:/67531/metadc499993/
Ddos Defense Against Botnets in the Mobile Cloud
Mobile phone advancements and ubiquitous internet connectivity are resulting in ever expanding possibilities in the application of smart phones. Users of mobile phones are now capable of hosting server applications from their personal devices. Whether providing services individually or in an ad hoc network setting the devices are currently not configured for defending against distributed denial of service (DDoS) attacks. These attacks, often launched from a botnet, have existed in the space of personal computing for decades but recently have begun showing up on mobile devices. Research is done first into the required steps to develop a potential botnet on the Android platform. This includes testing for the amount of malicious traffic an Android phone would be capable of generating for a DDoS attack. On the other end of the spectrum is the need of mobile devices running networked applications to develop security against DDoS attacks. For this mobile, phones are setup, with web servers running Apache to simulate users running internet connected applications for either local ad hoc networks or serving to the internet. Testing is done for the viability of using commonly available modules developed for Apache and intended for servers as well as finding baseline capabilities of mobiles to handle higher traffic volumes. Given the unique challenge of the limited resources a mobile phone can dedicate to Apache when compared to a dedicated hosting server a new method was needed. A proposed defense algorithm is developed for mitigating DDoS attacks against the mobile server that takes into account the limited resources available on the mobile device. The algorithm is tested against TCP socket flooding for effectiveness and shown to perform better than the common Apache module installations on a mobile device. digital.library.unt.edu/ark:/67531/metadc500027/
Performance Engineering of Software Web Services and Distributed Software Systems
The promise of service oriented computing, and the availability of Web services promote the delivery and creation of new services based on existing services, in order to meet new demands and new markets. As Web and internet based services move into Clouds, inter-dependency of services and their complexity will increase substantially. There are standards and frameworks for specifying and composing Web Services based on functional properties. However, mechanisms to individually address non-functional properties of services and their compositions have not been well established. Furthermore, the Cloud ontology depicts service layers from a high-level, such as Application and Software, to a low-level, such as Infrastructure and Platform. Each component that resides in one layer can be useful to another layer as a service. It hints at the amount of complexity resulting from not only horizontal but also vertical integrations in building and deploying a composite service. To meet the requirements and facilitate using Web services, we first propose a WSDL extension to permit specification of non-functional or Quality of Service (QoS) properties. On top of the foundation, the QoS-aware framework is established to adapt publicly available tools for Web services, augmented by ontology management tools, along with tools for performance modeling to exemplify how the non-functional properties such as response time, throughput, or utilization of services can be addressed in the service acquisition and composition process. To facilitate Web service composition standards, in this work we extended the framework with additional qualitative information to the service descriptions using Business Process Execution Language (BPEL). Engineers can use BPEL to explore design options, and have the QoS properties analyzed for the composite service. The main issue in our research is performance evaluation in software system and engineering. We researched the Web service computation as the first half of this dissertation, and performance antipattern detection and elimination in the second part. Performance analysis of software system is complex due to large number of components and the interactions among them. Without the knowledge of experienced experts, it is difficult to diagnose performance anomalies and attempt to pinpoint the root causes of the problems. Software performance antipatterns are similar to design patterns in that they provide what to avoid and how to fix performance problems when they appear. Although the idea of applying antipatterns is promising, there are gaps in matching the symptoms and generating feedback solution for redesign. In this work, we analyze performance antipatterns to extract detectable features, influential factors, and resource involvements so that we can lay the foundation to detect their presence. We propose system abstract layering model and suggestive profiling methods for performance antipattern detection and elimination. Solutions proposed can be used during the refactoring phase, and can be included in the software development life cycle. Proposed tools and utilities are implemented and their use is demonstrated with RUBiS benchmark. digital.library.unt.edu/ark:/67531/metadc500103/
3GPP Long Term Evolution LTE Scheduling
Future generation cellular networks are expected to deliver an omnipresent broadband access network for an endlessly increasing number of subscribers. Long term Evolution (LTE) represents a significant milestone towards wireless networks known as 4G cellular networks. A key feature of LTE is the implementation of enhanced Radio Resource Management (RRM) mechanism to improve the system performance. The structure of LTE networks was simplified by diminishing the number of the nodes of the core network. Also, the design of the radio protocol architecture is quite unique. In order to achieve high data rate in LTE, 3rd Generation Partnership Project (3GPP) has selected Orthogonal Frequency Division Multiplexing (OFDM) as an appropriate scheme in terms of downlinks. However, the proper scheme for an uplink is the Single-Carrier Frequency Domain Multiple Access due to the peak-to-average-power-ratio (PAPR) constraint. LTE packet scheduling plays a primary role as part of RRM to improve the system’s data rate as well as supporting various QoS requirements of mobile services. The major function of the LTE packet scheduler is to assign Physical Resource Blocks (PRBs) to mobile User Equipment (UE). In our work, we formed a proposed packet scheduler algorithm. The proposed scheduler algorithm acts based on the number of UEs attached to the eNodeB. To evaluate the proposed scheduler algorithm, we assumed two different scenarios based on a number of UEs. When the number of UE is lower than the number of PRBs, the UEs with highest Channel Quality Indicator (CQI) will be assigned PRBs. Otherwise, the scheduler will assign PRBs based on a given proportional fairness metric. The eNodeB’s throughput is increased when the proposed algorithm was implemented. digital.library.unt.edu/ark:/67531/metadc490046/
Boosting for Learning From Imbalanced, Multiclass Data Sets
Access: Use of this item is restricted to the UNT Community.
In many real-world applications, it is common to have uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied with the improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers are created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant reaching above 50%. As the volume of training data increase, the gain of efficiency with the proposed method becomes more significant. digital.library.unt.edu/ark:/67531/metadc407775/
Design and Analysis of Novel Verifiable Voting Schemes
Free and fair elections are the basis for democracy, but conducting elections is not an easy task. Different groups of people are trying to influence the outcome of the election in their favor using the range of methods, from campaigning for a particular candidate to well-financed lobbying. Often the stakes are too high, and the methods are illegal. Two main properties of any voting scheme are the privacy of a voter’s choice and the integrity of the tally. Unfortunately, they are mutually exclusive. Integrity requires making elections transparent and auditable, but at the same time, we must preserve a voter’s privacy. It is always a trade-off between these two requirements. Current voting schemes favor privacy over auditability, and thus, they are vulnerable to voting fraud. I propose two novel voting systems that can achieve both privacy and verifiability. The first protocol is based on cryptographical primitives to ensure the integrity of the final tally and privacy of the voter. The second protocol is a simple paper-based voting scheme that achieves almost the same level of security without usage of cryptography. digital.library.unt.edu/ark:/67531/metadc407785/
Simulating the Spread of Infectious Diseases in Heterogeneous Populations with Diverse Interactions Characteristics
The spread of infectious diseases has been a public concern throughout human history. Historic recorded data has reported the severity of infectious disease epidemics in different ages. Ancient Greek physician Hippocrates was the first to analyze the correlation between diseases and their environment. Nowadays, health authorities are in charge of planning strategies that guarantee the welfare of citizens. The simulation of contagion scenarios contributes to the understanding of the epidemic behavior of diseases. Computational models facilitate the study of epidemics by integrating disease and population data to the simulation. The use of detailed demographic and geographic characteristics allows researchers to construct complex models that better resemble reality and the integration of these attributes permits us to understand the rules of interaction. The interaction of individuals with similar characteristics forms synthetic structures that depict clusters of interaction. The synthetic environments facilitate the study of the spread of infectious diseases in diverse scenarios. The characteristics of the population and the disease concurrently affect the local and global epidemic progression. Every cluster’ epidemic behavior constitutes the global epidemic for a clustered population. By understanding the correlation between structured populations and the spread of a disease, current dissertation research makes possible to identify risk groups of specific characteristics and devise containment strategies that facilitate health authorities to improve mitigation strategies. digital.library.unt.edu/ark:/67531/metadc407831/
Framework for Evaluating Dynamic Memory Allocators Including a New Equivalence Class Based Cache-conscious Allocator
Software applications’ performance is hindered by a variety of factors, but most notably by the well-known CPU-memory speed gap (often known as the memory wall). This results in the CPU sitting idle waiting for data to be brought from memory to processor caches. The addressing used by caches cause non-uniform accesses to various cache sets. The non-uniformity is due to several reasons, including how different objects are accessed by the code and how the data objects are located in memory. Memory allocators determine where dynamically created objects are placed, thus defining addresses and their mapping to cache locations. It is important to evaluate how different allocators behave with respect to the localities of the created objects. Most allocators use a single attribute, the size, of an object in making allocation decisions. Additional attributes such as the placement with respect to other objects, or specific cache area may lead to better use of cache memories. In this dissertation, we proposed and implemented a framework that allows for the development and evaluation of new memory allocation techniques. At the root of the framework is a memory tracing tool called Gleipnir, which provides very detailed information about every memory access, and relates it back to source level objects. Using the traces from Gleipnir, we extended a commonly used cache simulator for generating detailed cache statistics: per function, per data object, per cache line, and identify specific data objects that are conflicting with each other. The utility of the framework is demonstrated with a new memory allocator known as equivalence class allocator. The new allocator allows users to specify cache sets, in addition to object size, where the objects should be placed. We compare this new allocator with two well-known allocators, viz., Doug Lea and Pool allocators. digital.library.unt.edu/ark:/67531/metadc500151/
Modeling and Analysis of Next Generation 9-1-1 Emergency Medical Dispatch Protocols
Emergency Medical Dispatch Protocols are guidelines that a 9-1-1 dispatcher uses to evaluate the nature of emergency, resources to send and the nature of help provided to the 9-1-1 caller. The current Dispatch Protocols are based on voice only call. But the Next Generation 9-1-1 (NG9-1-1) architecture will allow multimedia emergency calls. In this thesis I analyze and model the Emergency Medical Dispatch Protocols for NG9-1-1 architecture. I have identified various technical aspects to improve the NG9-1-1 Dispatch Protocols. The devices (smartphone) at the caller end have advanced to a point where they can be used to send and receive video, pictures and text. There are sensors embedded in them that can be used for initial diagnosis of the injured person. There is a need to improve the human computer (smartphone) interface to take advantage of technology so that callers can easily make use of various features available to them. The dispatchers at the 9-1-1 call center can make use of these new protocols to improve the quality and the response time. They will have capability of multiple media streams to interact with the caller and the first responders.The specific contributions in this thesis include developing applications that use smartphone sensors. The CPR application uses the smartphone to help administer effective CPR even if the person is not trained. The application makes the CPR process closed loop, i.e., the person who administers the CPR as well as the 9-1-1 operator receive feedback and prompt from the application about the correctness of the CPR. The breathing application analyzes the quality of breathing of the affected person and automatically sends the information to the 9-1-1 operator. In order to improve the Human Computer Interface at the caller and the operator end, I have analyzed Fitts law and extended it so that it can be used to improve the instructions given to a caller. In emergency situations, the caller may be physically or cognitively impaired. This may happen either because the caller is the injured person, or because the caller is a close relative or friend of the injured person. Using EEG waves, I have analyzed and developed a mathematical model of a person's cognitive impairment. Finally, I have developed a mathematical model of the response time of a 9-1-1 call and analyzed the factors that can be improved to reduce the response time. In this regard, another application, I have developed, allows the 9-1-1 operator to remotely control the media features of a caller's smartphone. This is needed in case the caller is unable to operate the multimedia features of the smartphone. For example, the caller may not know how to zoom in the smartphone camera.All these building blocks come together in the development of an efficient NG9-1-1 Emergency Medical Dispatch protocols. I have provided a sample of these protocols, using the existing Emergency Dispatch Protocols used in the state of New Jersey. The new protocols will have fewer questions and more visual prompts to evaluate the nature of the emergency. digital.library.unt.edu/ark:/67531/metadc500122/
Multilingual Word Sense Disambiguation Using Wikipedia
Ambiguity is inherent to human language. In particular, word sense ambiguity is prevalent in all natural languages, with a large number of the words in any given language carrying more than one meaning. Word sense disambiguation is the task of automatically assigning the most appropriate meaning to a polysemous word within a given context. Generally the problem of resolving ambiguity in literature has revolved around the famous quote “you shall know the meaning of the word by the company it keeps.” In this thesis, we investigate the role of context for resolving ambiguity through three different approaches. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses and sense-tagged data are derived automatically from Wikipedia. Using Wikipedia as a source of sense-annotations provides the much needed solution for knowledge acquisition bottleneck. In order to evaluate the viability of Wikipedia based sense-annotations, we cast the task of disambiguating polysemous nouns as a monolingual classification task and experimented on lexical samples from four different languages (viz. English, German, Italian and Spanish). The experiments confirm that the Wikipedia based sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. It is a long belief that exploiting multiple languages helps in building accurate word sense disambiguation systems. Subsequently, we developed two approaches that recast the task of disambiguating polysemous nouns as a multilingual classification task. The first approach for multilingual word sense disambiguation attempts to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text. First, the various senses of a target word may be translated into different words, which constitute unique, yet highly salient signal that effectively expand the target word’s feature space. Second, the translated context words themselves embed co-occurrence information that a translation engine gathers from very large parallel corpora. The second approach for multlingual word sense disambiguation attempts to reduce the reliance on the machine translation system during training by using the multilingual knowledge available in Wikipedia through its interlingual links. Finally, the experiments on a lexical sample from four different languages confirm that the multilingual systems perform better than the monolingual system and significantly improve the disambiguation accuracy. digital.library.unt.edu/ark:/67531/metadc500036/
Privacy Management for Online Social Networks
One in seven people in the world use online social networking for a variety of purposes -- to keep in touch with friends and family, to share special occasions, to broadcast announcements, and more. The majority of society has been bought into this new era of communication technology, which allows everyone on the internet to share information with friends. Since social networking has rapidly become a main form of communication, holes in privacy have become apparent. It has come to the point that the whole concept of sharing information requires restructuring. No longer are online social networks simply technology available for a niche market; they are in use by all of society. Thus it is important to not forget that a sense of privacy is inherent as an evolutionary by-product of social intelligence. In any context of society, privacy needs to be a part of the system in order to help users protect themselves from others. This dissertation attempts to address the lack of privacy management in online social networks by designing models which understand the social science behind how we form social groups and share information with each other. Social relationship strength was modeled using activity patterns, vocabulary usage, and behavioral patterns. In addition, automatic configuration for default privacy settings was proposed to help prevent new users from leaking personal information. This dissertation aims to mobilize a new era of social networking that understands social aspects of human network, and uses that knowledge to honor users' privacy. digital.library.unt.edu/ark:/67531/metadc283816/
Qos Aware Service Oriented Architecture
Service-oriented architecture enables web services to operate in a loosely-coupled setting and provides an environment for dynamic discovery and use of services over a network using standards such as WSDL, SOAP, and UDDI. Web service has both functional and non-functional characteristics. This thesis work proposes to add QoS descriptions (non-functional properties) to WSDL and compose various services to form a business process. This composition of web services also considers QoS properties along with functional properties and the composed services can again be published as a new Web Service and can be part of any other composition using Composed WSDL. digital.library.unt.edu/ark:/67531/metadc500032/
Real-time Rendering of Burning Objects in Video Games
In recent years there has been growing interest in limitless realism in computer graphics applications. Among those, my foremost concentration falls into the complex physical simulations and modeling with diverse applications for the gaming industry. Different simulations have been virtually successful by replicating the details of physical process. As a result, some were strong enough to lure the user into believable virtual worlds that could destroy any sense of attendance. In this research, I focus on fire simulations and its deformation process towards various virtual objects. In most game engines model loading takes place at the beginning of the game or when the game is transitioning between levels. Game models are stored in large data structures. Since changing or adjusting a large data structure while the game is proceeding may adversely affect the performance of the game. Therefore, developers may choose to avoid procedural simulations to save resources and avoid interruptions on performance. I introduce a process to implement a real-time model deformation while maintaining performance. It is a challenging task to achieve high quality simulation while utilizing minimum resources to represent multiple events in timely manner. Especially in video games, this overwhelming criterion would be robust enough to sustain the engaging player's willing suspension of disbelief. I have implemented and tested my method on a relatively modest GPU using CUDA. My experiments conclude this method gives a believable visual effect while using small fraction of CPU and GPU resources. digital.library.unt.edu/ark:/67531/metadc500131/
Modeling Alcohol Consumption Using Blog Data
How do the content and writing style of people who drink alcohol beverages stand out from non-drinkers? How much information can we learn about a person's alcohol consumption behavior by reading text that they have authored? This thesis attempts to extend the methods deployed in authorship attribution and authorship profiling research into the domain of automatically identifying the human action of drinking alcohol beverages. I examine how a psycholinguistics dictionary (the Linguistics Inquiry and Word Count lexicon, developed by James Pennebaker), together with Kenneth Burke's concept of words as symbols of human action, and James Wertsch's concept of mediated action provide a framework for analyzing meaningful data patterns from the content of blogs written by consumers of alcohol beverages. The contributions of this thesis to the research field are twofold. First, I show that it is possible to automatically identify blog posts that have content related to the consumption of alcohol beverages. And second, I provide a framework and tools to model human behavior through text analysis of blog data. digital.library.unt.edu/ark:/67531/metadc271843/
Optimizing Non-pharmaceutical Interventions Using Multi-coaffiliation Networks
Computational modeling is of fundamental significance in mapping possible disease spread, and designing strategies for its mitigation. Conventional contact networks implement the simulation of interactions as random occurrences, presenting public health bodies with a difficult trade off between a realistic model granularity and robust design of intervention strategies. Recently, researchers have been investigating the use of agent-based models (ABMs) to embrace the complexity of real world interactions. At the same time, theoretical approaches provide epidemiologists with general optimization models in which demographics are intrinsically simplified. The emerging study of affiliation networks and co-affiliation networks provide an alternative to such trade off. Co-affiliation networks maintain the realism innate to ABMs while reducing the complexity of contact networks into distinctively smaller k-partite graphs, were each partition represent a dimension of the social model. This dissertation studies the optimization of intervention strategies for infectious diseases, mainly distributed in school systems. First, concepts of synthetic populations and affiliation networks are extended to propose a modified algorithm for the synthetic reconstruction of populations. Second, the definition of multi-coaffiliation networks is presented as the main social model in which risk is quantified and evaluated, thereby obtaining vulnerability indications for each school in the system. Finally, maximization of the mitigation coverage and minimization of the overall cost of intervention strategies are proposed and compared, based on centrality measures. digital.library.unt.edu/ark:/67531/metadc271860/
3D Reconstruction Using Lidar and Visual Images
In this research, multi-perspective image registration using LiDAR and visual images was considered. 2D-3D image registration is a difficult task because it requires the extraction of different semantic features from each modality. This problem is solved in three parts. The first step involves detection and extraction of common features from each of the data sets. The second step consists of associating the common features between two different modalities. Traditional methods use lines or orthogonal corners as common features. The third step consists of building the projection matrix. Many existing methods use global positing system (GPS) or inertial navigation system (INS) for an initial estimate of the camera pose. However, the approach discussed herein does not use GPS, INS, or any such devices for initial estimate; hence the model can be used in places like the lunar surface or Mars where GPS or INS are not available. A variation of the method is also described, which does not require strong features from both images but rather uses intensity gradients in the image. This can be useful when one image does not have strong features (such as lines) or there are too many extraneous features. digital.library.unt.edu/ark:/67531/metadc177193/
Automated Classification of Emotions Using Song Lyrics
This thesis explores the classification of emotions in song lyrics, using automatic approaches applied to a novel corpus of 100 popular songs. I use crowd sourcing via Amazon Mechanical Turk to collect line-level emotions annotations for this collection of song lyrics.  I then build classifiers that rely on textual features to automatically identify the presence of one or more of the following six Ekman emotions: anger, disgust, fear, joy, sadness and surprise. I compare different classification systems and evaluate the performance of the automatic systems against the manual annotations. I also introduce a system that uses data collected from the social network Twitter. I use the Twitter API to collect a large corpus of tweets manually labeled by their authors for one of the six emotions of interest. I then compare the classification of emotions obtained when training on data automatically collected from Twitter versus data obtained through crowd sourced annotations. digital.library.unt.edu/ark:/67531/metadc177253/
Automatic Tagging of Communication Data
Globally distributed software teams are widespread throughout industry. But finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data. Thus having a web-based tool that can automatically analyze the communication patterns among global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying different types of group activities that occur within a global software development project. In order to achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project. digital.library.unt.edu/ark:/67531/metadc149611/
Multi-perspective, Multi-modal Image Registration and Fusion
Multi-modal image fusion is an active research area with many civilian and military applications. Fusion is defined as strategic combination of information collected by various sensors from different locations or different types in order to obtain a better understanding of an observed scene or situation. Fusion of multi-modal images cannot be completed unless these two modalities are spatially aligned. In this research, I consider two important problems. Multi-modal, multi-perspective image registration and decision level fusion of multi-modal images. In particular, LiDAR and visual imagery. Multi-modal image registration is a difficult task due to the different semantic interpretation of features extracted from each modality. This problem is decoupled into three sub-problems. The first step is identification and extraction of common features. The second step is the determination of corresponding points. The third step consists of determining the registration transformation parameters. Traditional registration methods use low level features such as lines and corners. Using these features require an extensive optimization search in order to determine the corresponding points. Many methods use global positioning systems (GPS), and a calibrated camera in order to obtain an initial estimate of the camera parameters. The advantages of our work over the previous works are the following. First, I used high level-features, which significantly reduce the search space for the optimization process. Second, the determination of corresponding points is modeled as an assignment problem between a small numbers of objects. On the other side, fusing LiDAR and visual images is beneficial, due to the different and rich characteristics of both modalities. LiDAR data contain 3D information, while images contain visual information. Developing a fusion technique that uses the characteristics of both modalities is very important. I establish a decision-level fusion technique using manifold models. digital.library.unt.edu/ark:/67531/metadc149562/
A Smooth-turn Mobility Model for Airborne Networks
In this article, I introduce a novel airborne network mobility model, called the Smooth Turn Mobility Model, that captures the correlation of acceleration for airborne vehicles across time and spatial coordinates. E?ective routing in airborne networks (ANs) relies on suitable mobility models that capture the random movement pattern of airborne vehicles. As airborne vehicles cannot make sharp turns as easily as ground vehicles do, the widely used mobility models for Mobile Ad Hoc Networks such as Random Waypoint and Random Direction models fail. Our model is realistic in capturing the tendency of airborne vehicles toward making straight trajectory and smooth turns with large radius, and whereas is simple enough for tractable connectivity analysis and routing design. digital.library.unt.edu/ark:/67531/metadc149603/
Cuff-less Blood Pressure Measurement Using a Smart Phone
Access: Use of this item is restricted to the UNT Community.
Blood pressure is vital sign information that physicians often need as preliminary data for immediate intervention during emergency situations or for regular monitoring of people with cardiovascular diseases. Despite the availability of portable blood pressure meters in the market, they are not regularly carried by people, creating a need for an ultra-portable measurement platform or device that can be easily carried and used at all times. One such device is the smartphone which, according to comScore survey is used by 26.2% of the US adult population. the mass production of these phones with built-in sensors and high computation power has created numerous possibilities for application development in different domains including biomedical. Motivated by this capability and their extensive usage, this thesis focuses on developing a blood pressure measurement platform on smartphones. Specifically, I developed a blood pressure measurement system on a smart phone using the built-in camera and a customized external microphone. the system consists of first obtaining heart beats using the microphone and finger pulse with the camera, and finally calculating the blood pressure using the recorded data. I developed techniques for finding the best location for obtaining the data, making the system usable by all categories of people. the proposed system resulted in accuracies between 90-100%, when compared to traditional blood pressure meters. the second part of this thesis presents a new system for remote heart beat monitoring using the smart phone. with the proposed system, heart beats can be transferred live by patients and monitored by physicians remotely for diagnosis. the proposed blood pressure measurement and remote monitoring systems will be able to facilitate information acquisition and decision making by the 9-1-1 operators. digital.library.unt.edu/ark:/67531/metadc115102/
A Global Stochastic Modeling Framework to Simulate and Visualize Epidemics
Epidemics have caused major human and monetary losses through the course of human civilization. It is very important that epidemiologists and public health personnel are prepared to handle an impending infectious disease outbreak. the ever-changing demographics, evolving infrastructural resources of geographic regions, emerging and re-emerging diseases, compel the use of simulation to predict disease dynamics. By the means of simulation, public health personnel and epidemiologists can predict the disease dynamics, population groups at risk and their geographic locations beforehand, so that they are prepared to respond in case of an epidemic outbreak. As a consequence of the large numbers of individuals and inter-personal interactions involved in simulating infectious disease spread in a region such as a county, sizeable amounts of data may be produced that have to be analyzed. Methods to visualize this data would be effective in facilitating people from diverse disciplines understand and analyze the simulation. This thesis proposes a framework to simulate and visualize the spread of an infectious disease in a population of a region such as a county. As real-world populations have a non-homogeneous demographic and spatial distribution, this framework models the spread of an infectious disease based on population of and geographic distance between census blocks; social behavioral parameters for demographic groups. the population is stratified into demographic groups in individual census blocks using census data. Infection spread is modeled by means of local and global contacts generated between groups of population in census blocks. the strength and likelihood of the contacts are based on population, geographic distance and social behavioral parameters of the groups involved. the disease dynamics are represented on a geographic map of the region using a heat map representation, where the intensity of infection is mapped to a color scale. This framework provides a tool for public health personnel and epidemiologists to run what-if analyses on disease spread in specific populations and plan for epidemic response. By the means of demographic stratification of population and incorporation of geographic distance and social behavioral parameters into the modeling of the outbreak, this framework takes into account non-homogeneity in demographic mix and spatial distribution of the population. Generation of contacts per population group instead of individuals contributes to lowering computational overhead. Heat map representation of the intensity of infection provides an intuitive way to visualize the disease dynamics. digital.library.unt.edu/ark:/67531/metadc115099/
GPS CaPPture: a System for GPS Trajectory Collection, Processing, and Destination Prediction
In the United States, smartphone ownership surpassed 69.5 million in February 2011 with a large portion of those users (20%) downloading applications (apps) that enhance the usability of a device by adding additional functionality. a large percentage of apps are written specifically to utilize the geographical position of a mobile device. One of the prime factors in developing location prediction models is the use of historical data to train such a model. with larger sets of training data, prediction algorithms become more accurate; however, the use of historical data can quickly become a downfall if the GPS stream is not collected or processed correctly. Inaccurate or incomplete or even improperly interpreted historical data can lead to the inability to develop accurately performing prediction algorithms. As GPS chipsets become the standard in the ever increasing number of mobile devices, the opportunity for the collection of GPS data increases remarkably. the goal of this study is to build a comprehensive system that addresses the following challenges: (1) collection of GPS data streams in a manner such that the data is highly usable and has a reduction in errors; (2) processing and reduction of the collected data in order to prepare it and make it highly usable for the creation of prediction algorithms; (3) creation of prediction/labeling algorithms at such a level that they are viable for commercial use. This study identifies the key research problems toward building the CaPPture (collection, processing, prediction) system. digital.library.unt.edu/ark:/67531/metadc115089/
Rapid Prototyping and Design of a Fast Random Number Generator
Information in the form of online multimedia, bank accounts, or password usage for diverse applications needs some form of security. the core feature of many security systems is the generation of true random or pseudorandom numbers. Hence reliable generators of such numbers are indispensable. the fundamental hurdle is that digital computers cannot generate truly random numbers because the states and transitions of digital systems are well understood and predictable. Nothing in a digital computer happens truly randomly. Digital computers are sequential machines that perform a current state and move to the next state in a deterministic fashion. to generate any secure hash or encrypted word a random number is needed. But since computers are not random, random sequences are commonly used. Random sequences are algorithms that generate a pattern of values that appear to be random but after some time start repeating. This thesis implements a digital random number generator using MATLAB, FGPA prototyping, and custom silicon design. This random number generator is able to use a truly random CMOS source to generate the random number. Statistical benchmarks are used to test the results and to show that the design works. Thus the proposed random number generator will be useful for online encryption and security. digital.library.unt.edu/ark:/67531/metadc115040/
Arithmetic Computations and Memory Management Using a Binary Tree Encoding af Natural Numbers
Two applications of a binary tree data type based on a simple pairing function (a bijection between natural numbers and pairs of natural numbers) are explored. First, the tree is used to encode natural numbers, and algorithms that perform basic arithmetic computations are presented along with formal proofs of their correctness. Second, using this "canonical" representation as a base type, algorithms for encoding and decoding additional isomorphic data types of other mathematical constructs (sets, sequences, etc.) are also developed. An experimental application to a memory management system is constructed and explored using these isomorphic types. A practical analysis of this system's runtime complexity and space savings are provided, along with a proof of concept framework for both applications of the binary tree type, in the Java programming language. digital.library.unt.edu/ark:/67531/metadc103323/
The Design Of A Benchmark For Geo-stream Management Systems
The recent growth in sensor technology allows easier information gathering in real-time as sensors have grown smaller, more accurate, and less expensive. The resulting data is often in a geo-stream format continuously changing input with a spatial extent. Researchers developing geo-streaming management systems (GSMS) require a benchmark system for evaluation, which is currently lacking. This thesis presents GSMark, a benchmark for evaluating GSMSs. GSMark provides a data generator that creates a combination of synthetic and real geo-streaming data, a workload simulator to present the data to the GSMS as a data stream, and a set of benchmark queries that evaluate typical GSMS functionality and query performance. In particular, GSMark generates both moving points and evolving spatial regions, two fundamental data types for a broad range of geo-stream applications, and the geo-streaming queries on this data. digital.library.unt.edu/ark:/67531/metadc103392/
Investigating the Extractive Summarization of Literary Novels
Abstract Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific reports, and others, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are witnessing however a change: an increasingly larger number of books become available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research that has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents, and outperform the traditional extractive summarization systems typically addressing news genre. digital.library.unt.edu/ark:/67531/metadc103298/
Rapid Prototyping and Design of a Fast Random Number Generator
Information in the form of online multimedia, bank accounts, or password usage for diverse applications needs some form of security. the core feature of many security systems is the generation of true random or pseudorandom numbers. Hence reliable generators of such numbers are indispensable. the fundamental hurdle is that digital computers cannot generate truly random numbers because the states and transitions of digital systems are well understood and predictable. Nothing in a digital computer happens truly randomly. Digital computers are sequential machines that perform a current state and move to the next state in a deterministic fashion. to generate any secure hash or encrypted word a random number is needed. But since computers are not random, random sequences are commonly used. Random sequences are algorithms that generate a pattern of values that appear to be random but after some time start repeating. This thesis implements a digital random number generator using MATLAB, FGPA prototyping, and custom silicon design. This random number generator is able to use a truly random CMOS source to generate the random number. Statistical benchmarks are used to test the results and to show that the design works. Thus the proposed random number generator will be useful for online encryption and security. digital.library.unt.edu/ark:/67531/metadc115036/
Measuring Semantic Relatedness Using Salient Encyclopedic Concepts
While pragmatics, through its integration of situational awareness and real world relevant knowledge, offers a high level of analysis that is suitable for real interpretation of natural dialogue, semantics, on the other end, represents a lower yet more tractable and affordable linguistic level of analysis using current technologies. Generally, the understanding of semantic meaning in literature has revolved around the famous quote ``You shall know a word by the company it keeps''. In this thesis we investigate the role of context constituents in decoding the semantic meaning of the engulfing context; specifically we probe the role of salient concepts, defined as content-bearing expressions which afford encyclopedic definitions, as a suitable source of semantic clues to an unambiguous interpretation of context. Furthermore, we integrate this world knowledge in building a new and robust unsupervised semantic model and apply it to entail semantic relatedness between textual pairs, whether they are words, sentences or paragraphs. Moreover, we explore the abstraction of semantics across languages and utilize our findings into building a novel multi-lingual semantic relatedness model exploiting information acquired from various languages. We demonstrate the effectiveness and the superiority of our mono-lingual and multi-lingual models through a comprehensive set of evaluations on specialized synthetic datasets for semantic relatedness as well as real world applications such as paraphrase detection and short answer grading. Our work represents a novel approach to integrate world-knowledge into current semantic models and a means to cross the language boundary for a better and more robust semantic relatedness representation, thus opening the door for an improved abstraction of meaning that carries the potential of ultimately imparting understanding of natural language to machines. digital.library.unt.edu/ark:/67531/metadc84212/
Techniques for Improving Uniformity in Direct Mapped Caches
Directly mapped caches are an attractive option for processor designers as they combine fast lookup times with reduced complexity and area. However, directly-mapped caches are prone to higher miss-rates as there are no candidates for replacement on a cache miss, hence data residing in a cache set would have to be evicted to the next level cache. Another issue that inhibits cache performance is the non-uniformity of accesses exhibited by most applications: some sets are under-utilized while others receive the majority of accesses. This implies that increasing the size of caches may not lead to proportionally improved cache hit rates. Several solutions that address cache non-uniformity have been proposed in the literature. These techniques have been proposed over the past decade and each proposal independently claims the benefit of reduced conflict misses. However, because the published results use different benchmarks and different experimental setups, (there is no established frame of reference for comparing these results) it is not easy to compare them. In this work we report a side-by-side comparison of these techniques. Finally, we propose and Adaptive-Partitioned cache for multi-threaded applications. This design limits inter-thread thrashing while dynamically reducing traffic to heavily accessed sets. digital.library.unt.edu/ark:/67531/metadc68025/
Toward a Data-Type-Based Real Time Geospatial Data Stream Management System
The advent of sensory and communication technologies enables the generation and consumption of large volumes of streaming data. Many of these data streams are geo-referenced. Existing spatio-temporal databases and data stream management systems are not capable of handling real time queries on spatial extents. In this thesis, we investigated several fundamental research issues toward building a data-type-based real time geospatial data stream management system. The thesis makes contributions in the following areas: geo-stream data models, aggregation, window-based nearest neighbor operators, and query optimization strategies. The proposed geo-stream data model is based on second-order logic and multi-typed algebra. Both abstract and discrete data models are proposed and exemplified. I further propose two useful geo-stream operators, namely Region By and WNN, which abstract common aggregation and nearest neighbor queries as generalized data model constructs. Finally, I propose three query optimization algorithms based on spatial, temporal, and spatio-temporal constraints of geo-streams. I show the effectiveness of the data model through many query examples. The effectiveness and the efficiency of the algorithms are validated through extensive experiments on both synthetic and real data sets. This work established the fundamental building blocks toward a full-fledged geo-stream database management system and has potential impact in many applications such as hazard weather alerting and monitoring, traffic analysis, and environmental modeling. digital.library.unt.edu/ark:/67531/metadc68070/
A Wireless Traffic Surveillance System Using Video Analytics
Video surveillance systems have been commonly used in transportation systems to support traffic monitoring, speed estimation, and incident detection. However, there are several challenges in developing and deploying such systems, including high development and maintenance costs, bandwidth bottleneck for long range link, and lack of advanced analytics. In this thesis, I leverage current wireless, video camera, and analytics technologies, and present a wireless traffic monitoring system. I first present an overview of the system. Then I describe the site investigation and several test links with different hardware/software configurations to demonstrate the effectiveness of the system. The system development process was documented to provide guidelines for future development. Furthermore, I propose a novel speed-estimation analytics algorithm that takes into consideration roads with slope angles. I prove the correctness of the algorithm theoretically, and validate the effectiveness of the algorithm experimentally. The experimental results on both synthetic and real dataset show that the algorithm is more accurate than the baseline algorithm 80% of the time. On average the accuracy improvement of speed estimation is over 3.7% even for very small slope angles. digital.library.unt.edu/ark:/67531/metadc68005/
A Framework for Analyzing and Optimizing Regional Bio-Emergency Response Plans
The presence of naturally occurring and man-made public health threats necessitate the design and implementation of mitigation strategies, such that adequate response is provided in a timely manner. Since multiple variables, such as geographic properties, resource constraints, and government mandated time-frames must be accounted for, computational methods provide the necessary tools to develop contingency response plans while respecting underlying data and assumptions. A typical response scenario involves the placement of points of dispensing (PODs) in the affected geographic region to supply vaccines or medications to the general public. Computational tools aid in the analysis of such response plans, as well as in the strategic placement of PODs, such that feasible response scenarios can be developed. Due to the sensitivity of bio-emergency response plans, geographic information, such as POD locations, must be kept confidential. The generation of synthetic geographic regions allows for the development of emergency response plans on non-sensitive data, as well as for the study of the effects of single geographic parameters. Further, synthetic representations of geographic regions allow for results to be published and evaluated by the scientific community. This dissertation presents methodology for the analysis of bio-emergency response plans, methods for plan optimization, as well as methodology for the generation of synthetic geographic regions. digital.library.unt.edu/ark:/67531/metadc33200/
Graph-Based Keyphrase Extraction Using Wikipedia
Keyphrases describe a document in a coherent and simple way, giving the prospective reader a way to quickly determine whether the document satisfies their information needs. The pervasion of huge amount of information on Web, with only a small amount of documents have keyphrases extracted, there is a definite need to discover automatic keyphrase extraction systems. Typically, a document written by human develops around one or more general concepts or sub-concepts. These concepts or sub-concepts should be structured and semantically related with each other, so that they can form the meaningful representation of a document. Considering the fact, the phrases or concepts in a document are related to each other, a new approach for keyphrase extraction is introduced that exploits the semantic relations in the document. For measuring the semantic relations between concepts or sub-concepts in the document, I present a comprehensive study aimed at using collaboratively constructed semantic resources like Wikipedia and its link structure. In particular, I introduce a graph-based keyphrase extraction system that exploits the semantic relations in the document and features such as term frequency. I evaluated the proposed system using novel measures and the results obtained compare favorably with previously published results on established benchmarks. digital.library.unt.edu/ark:/67531/metadc67939/
Measuring Vital Signs Using Smart Phones
Smart phones today have become increasingly popular with the general public for its diverse abilities like navigation, social networking, and multimedia facilities to name a few. These phones are equipped with high end processors, high resolution cameras, built-in sensors like accelerometer, orientation-sensor, light-sensor, and much more. According to comScore survey, 25.3% of US adults use smart phones in their daily lives. Motivated by the capability of smart phones and their extensive usage, I focused on utilizing them for bio-medical applications. In this thesis, I present a new application for a smart phone to quantify the vital signs such as heart rate, respiratory rate and blood pressure with the help of its built-in sensors. Using the camera and a microphone, I have shown how the blood pressure and heart rate can be determined for a subject. People sometimes encounter minor situations like fainting or fatal accidents like car crash at unexpected times and places. It would be useful to have a device which can measure all vital signs in such an event. The second part of this thesis demonstrates a new mode of communication for next generation 9-1-1 calls. In this new architecture, the call-taker will be able to control the multimedia elements in the phone from a remote location. This would help the call-taker or first responder to have a better control over the situation. Transmission of the vital signs measured using the smart phone can be a life saver in critical situations. In today's voice oriented 9-1-1 calls, the dispatcher first collects critical information (e.g., location, call-back number) from caller, and assesses the situation. Meanwhile, the dispatchers constantly face a "60-second dilemma"; i.e., within 60 seconds, they need to make a complicated but important decision, whether to dispatch and, if so, what to dispatch. The dispatchers often feel that they lack sufficient information to make a confident dispatch decision. This remote-media-control described in this system will be able to facilitate information acquisition and decision-making in emergency situations within the 60-second response window in 9-1-1 calls using new multimedia technologies. digital.library.unt.edu/ark:/67531/metadc33139/
Anchor Nodes Placement for Effective Passive Localization
Access: Use of this item is restricted to the UNT Community.
Wireless sensor networks are composed of sensor nodes, which can monitor an environment and observe events of interest. These networks are applied in various fields including but not limited to environmental, industrial and habitat monitoring. In many applications, the exact location of the sensor nodes is unknown after deployment. Localization is a process used to find sensor node's positional coordinates, which is vital information. The localization is generally assisted by anchor nodes that are also sensor nodes but with known locations. Anchor nodes generally are expensive and need to be optimally placed for effective localization. Passive localization is one of the localization techniques where the sensor nodes silently listen to the global events like thunder sounds, seismic waves, lighting, etc. According to previous studies, the ideal location to place anchor nodes was on the perimeter of the sensor network. This may not be the case in passive localization, since the function of anchor nodes here is different than the anchor nodes used in other localization systems. I do extensive studies on positioning anchor nodes for effective localization. Several simulations are run in dense and sparse networks for proper positioning of anchor nodes. I show that, for effective passive localization, the optimal placement of the anchor nodes is at the center of the network in such a way that no three anchor nodes share linearity. The more the non-linearity, the better the localization. The localization for our network design proves better when I place anchor nodes at right angles. digital.library.unt.edu/ark:/67531/metadc33132/
Elicitation of Protein-Protein Interactions from Biomedical Literature Using Association Rule Discovery
Extracting information from a stack of data is a tedious task and the scenario is no different in proteomics. Volumes of research papers are published about study of various proteins in several species, their interactions with other proteins and identification of protein(s) as possible biomarker in causing diseases. It is a challenging task for biologists to keep track of these developments manually by reading through the literatures. Several tools have been developed by computer linguists to assist identification, extraction and hypotheses generation of proteins and protein-protein interactions from biomedical publications and protein databases. However, they are confronted with the challenges of term variation, term ambiguity, access only to abstracts and inconsistencies in time-consuming manual curation of protein and protein-protein interaction repositories. This work attempts to attenuate the challenges by extracting protein-protein interactions in humans and elicit possible interactions using associative rule mining on full text, abstracts and captions from figures available from publicly available biomedical literature databases. Two such databases are used in our study: Directory of Open Access Journals (DOAJ) and PubMed Central (PMC). A corpus is built using articles based on search terms. A dataset of more than 38,000 protein-protein interactions from the Human Protein Reference Database (HPRD) is cross-referenced to validate discovered interactive pairs. A set of an optimal size of possible binary protein-protein interactions is generated to be made available for clinician or biological validation. A significant change in the number of new associations was found by altering the thresholds for support and confidence metrics. This study narrows down the limitations for biologists in keeping pace with discovery of protein-protein interactions via manually reading the literature and their needs to validate each and every possible interaction. digital.library.unt.edu/ark:/67531/metadc30508/
Rhythms of Interaction in Global Software Development Teams
Researchers have speculated that global software teams have activity patterns that are dictated by work-place schedules or a client's need. Similar patterns have been suggested for individuals enrolled in distant learning projects that require students to post feedback in response to questions or assignments. Researchers tend to accept the notion that students' temporal patterns adjust to academic or social calendars and are a result of choices made within these constraints. Although there is some evidence that culture do have an impact on communication activity behavior, there is not a clear how each of these factors may relate to work done in online groups. This particular study represents a new approach to studying student-group communication activities and also pursues an alternative approach by using activity data from students participating in a global software development project to generate a variety of complex measures that capture patterns about when students work. Students work habits are also often determined by where they live and what they are working on. Moreover, students tend to work on group projects in cycles, which correspond to a start, middle, and end time period. Knowledge obtained from this study should provide insight into current empirical research on global software development by defining the different time variables that can also be used to compare temporal patterns found in real-world teams. It should also inform studies about student team projects by helping instructors schedule group activities. digital.library.unt.edu/ark:/67531/metadc30476/
Socioscope: Human Relationship and Behavior Analysis in Mobile Social Networks
Access: Use of this item is restricted to the UNT Community.
The widely used mobile phone, as well as its related technologies had opened opportunities for a complete change on how people interact and build relationship across geographic and time considerations. The convenience of instant communication by mobile phones that broke the barrier of space and time is evidently the key motivational point on why such technologies so important in people's life and daily activities. Mobile phones have become the most popular communication tools. Mobile phone technology is apparently changing our relationship to each other in our work and lives. The impact of new technologies on people's lives in social spaces gives us the chance to rethink the possibilities of technologies in social interaction. Accordingly, mobile phones are basically changing social relations in ways that are intricate to measure with any precision. In this dissertation I propose a socioscope model for social network, relationship and human behavior analysis based on mobile phone call detail records. Because of the diversities and complexities of human social behavior, one technique cannot detect different features of human social behaviors. Therefore I use multiple probability and statistical methods for quantifying social groups, relationships and communication patterns, for predicting social tie strengths and for detecting human behavior changes and unusual consumption events. I propose a new reciprocity index to measure the level of reciprocity between users and their communication partners. The experimental results show that this approach is effective. Among other applications, this work is useful for homeland security, detection of unwanted calls (e.g., spam), telecommunication presence, and marketing. In my future work I plan to analyze and study the social network dynamics and evolution. digital.library.unt.edu/ark:/67531/metadc30533/
Design and Implementation of Large-Scale Wireless Sensor Networks for Environmental Monitoring Applications
Environmental monitoring represents a major application domain for wireless sensor networks (WSN). However, despite significant advances in recent years, there are still many challenging issues to be addressed to exploit the full potential of the emerging WSN technology. In this dissertation, we introduce the design and implementation of low-power wireless sensor networks for long-term, autonomous, and near-real-time environmental monitoring applications. We have developed an out-of-box solution consisting of a suite of software, protocols and algorithms to provide reliable data collection with extremely low power consumption. Two wireless sensor networks based on the proposed solution have been deployed in remote field stations to monitor soil moisture along with other environmental parameters. As parts of the ever-growing environmental monitoring cyberinfrastructure, these networks have been integrated into the Texas Environmental Observatory system for long-term operation. Environmental measurement and network performance results are presented to demonstrate the capability, reliability and energy-efficiency of the network. digital.library.unt.edu/ark:/67531/metadc28493/
Force-Directed Graph Drawing and Aesthetics Measurement in a Non-Strict Pure Functional Programming Language
Non-strict pure functional programming often requires redesigning algorithms and data structures to work more effectively under new constraints of non-strict evaluation and immutable state. Graph drawing algorithms, while numerous and broadly studied, have no presence in the non-strict pure functional programming model. Additionally, there is currently no freely licensed standalone toolkit used to quantitatively analyze aesthetics of graph drawings. This thesis addresses two previously unexplored questions. Can a force-directed graph drawing algorithm be implemented in a non-strict functional language, such as Haskell, and still be practically usable? Can an easily extensible aesthetic measuring tool be implemented in a language such as Haskell and still be practically usable? The focus of the thesis is on implementing one of the simplest force-directed algorithms, that of Fruchterman and Reingold, and comparing its resulting aesthetics to those of a well-known C++ implementation of the same algorithm. digital.library.unt.edu/ark:/67531/metadc12125/
Survey of Approximation Algorithms for Set Cover Problem
In this thesis, I survey 11 approximation algorithms for unweighted set cover problem. I have also implemented the three algorithms and created a software library that stores the code I have written. The algorithms I survey are: 1. Johnson's standard greedy; 2. f-frequency greedy; 3. Goldsmidt, Hochbaum and Yu's modified greedy; 4. Halldorsson's local optimization; 5. Dur and Furer semi local optimization; 6. Asaf Levin's improvement to Dur and Furer; 7. Simple rounding; 8. Randomized rounding; 9. LP duality; 10. Primal-dual schema; and 11. Network flow technique. Most of the algorithms surveyed are refinements of standard greedy algorithm. digital.library.unt.edu/ark:/67531/metadc12118/
Urban surface characterization using LiDAR and aerial imagery.
Many calamities in history like hurricanes, tornado and flooding are proof to the large scale impact they cause to the life and economy. Computer simulation and GIS helps in modeling a real world scenario, which assists in evacuation planning, damage assessment, assistance and reconstruction. For achieving computer simulation and modeling there is a need for accurate classification of ground objects. One of the most significant aspects of this research is that it achieves improved classification for regions within which light detection and ranging (LiDAR) has low spatial resolution. This thesis describes a method for accurate classification of bare ground, water body, roads, vegetation, and structures using LiDAR data and aerial Infrared imagery. The most basic step for any terrain modeling application is filtering which is classification of ground and non-ground points. We present an integrated systematic method that makes classification of terrain and non-terrain points effective. Our filtering method uses the geometric feature of the triangle meshes created from LiDAR samples and calculate the confidence for every point. Geometric homogenous blocks and confidence are derived from TIN model and gridded LiDAR samples. The results from two representations are used in a classifier to determine if the block belongs ground or otherwise. Another important step is detection of water body, which is based on the LiDAR sample density of the region. Objects like tress and bare ground are characterized by the geometric features present in the LiDAR and the color features in the infrared imagery. These features are fed into a SVM classifier which detects bare-ground in the given region. Similarly trees are extracted using another trained SVM classifier. Once we obtain bare-grounds and trees, roads are extracted by removing the bare grounds. Structures are identified by the properties of non-ground segments. Experiments were conducted using LiDAR samples and Infrared imagery from the city of New Orleans. We evaluated the influence of different parameters to the classification. Water bodies were extracted successfully using density measures. Experiments showed that fusion of geometric properties and confidence levels resulted into efficient classification of ground and non-ground regions. Classification of vegetation using SVM was promising and effective using the features like height variation, HSV, angle etc. It is demonstrated that our methods successfully classified the region by using LiDAR data in a complex urban area with high-rise buildings. digital.library.unt.edu/ark:/67531/metadc12196/
Computational Epidemiology - Analyzing Exposure Risk: A Deterministic, Agent-Based Approach
Many infectious diseases are spread through interactions between susceptible and infectious individuals. Keeping track of where each exposure to the disease took place, when it took place, and which individuals were involved in the exposure can give public health officials important information that they may use to formulate their interventions. Further, knowing which individuals in the population are at the highest risk of becoming infected with the disease may prove to be a useful tool for public health officials trying to curtail the spread of the disease. Epidemiological models are needed to allow epidemiologists to study the population dynamics of transmission of infectious agents and the potential impact of infectious disease control programs. While many agent-based computational epidemiological models exist in the literature, they focus on the spread of disease rather than exposure risk. These models are designed to simulate very large populations, representing individuals as agents, and using random experiments and probabilities in an attempt to more realistically guide the course of the modeled disease outbreak. The work presented in this thesis focuses on tracking exposure risk to chickenpox in an elementary school setting. This setting is chosen due to the high level of detailed information realistically available to school administrators regarding individuals' schedules and movements. Using an agent-based approach, contacts between individuals are tracked and analyzed with respect to both individuals and locations. The results are then analyzed using a combination of tools from computer science and geographic information science. digital.library.unt.edu/ark:/67531/metadc11017/
End of Insertion Detection in Colonoscopy Videos
Colorectal cancer is the second leading cause of cancer-related deaths behind lung cancer in the United States. Colonoscopy is the preferred screening method for detection of diseases like Colorectal Cancer. In the year 2006, American Society for Gastrointestinal Endoscopy (ASGE) and American College of Gastroenterology (ACG) issued guidelines for quality colonoscopy. The guidelines suggest that on average the withdrawal phase during a screening colonoscopy should last a minimum of 6 minutes. My aim is to classify the colonoscopy video into insertion and withdrawal phase. The problem is that currently existing shot detection techniques cannot be applied because colonoscopy is a single camera shot from start to end. An algorithm to detect phase boundary has already been developed by the MIGLAB team. Existing method has acceptable levels of accuracy but the main issue is dependency on MPEG (Moving Pictures Expert Group) 1/2. I implemented exhaustive search for motion estimation to reduce the execution time and improve the accuracy. I took advantages of the C/C++ programming languages with multithreading which helped us get even better performances in terms of execution time. I propose a method for improving the current method of colonoscopy video analysis and also an extension for the same to make it usable for real time videos. The real time version we implemented is capable of handling streams coming directly from the camera in the form of uncompressed bitmap frames. Existing implementation could not be applied to real time scenario because of its dependency on MPEG 1/2. Future direction of this research includes improved motion search and GPU parallel computing techniques. digital.library.unt.edu/ark:/67531/metadc12159/
Cross Language Information Retrieval for Languages with Scarce Resources
Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system. digital.library.unt.edu/ark:/67531/metadc12157/
Development, Implementation, and Analysis of a Contact Model for an Infectious Disease
With a growing concern of an infectious diseases spreading in a population, epidemiology is becoming more important for the future of public health. In the past epidemiologist used existing data of an outbreak to help them determine how an infectious disease might spread in the future. Now with computational models, they able to analysis data produced by these models to help with prevention and intervention plans. This paper looks at the design, implementation, and analysis of a computational model based on the interactions of the population between individuals. The design of the working contact model looks closely at the SEIR model used as the foundation and the two timelines of a disease. The implementation of the contact model is reviewed while looking closely at data structures. The analysis of the experiments provide evidence this contact model can be used to help epidemiologist study the spread of an infectious disease based on the contact rate of individuals. digital.library.unt.edu/ark:/67531/metadc9824/
Direct Online/Offline Digital Signature Schemes.
Online/offline signature schemes are useful in many situations, and two such scenarios are considered in this dissertation: bursty server authentication and embedded device authentication. In this dissertation, new techniques for online/offline signing are introduced, those are applied in a variety of ways for creating online/offline signature schemes, and five different online/offline signature schemes that are proved secure under a variety of models and assumptions are proposed. Two of the proposed five schemes have the best offline or best online performance of any currently known technique, and are particularly well-suited for the scenarios that are considered in this dissertation. To determine if the proposed schemes provide the expected practical improvements, a series of experiments were conducted comparing the proposed schemes with each other and with other state-of-the-art schemes in this area, both on a desktop class computer, and under AVR Studio, a simulation platform for an 8-bit processor that is popular for embedded systems. Under AVR Studio, the proposed SGE scheme using a typical key size for the embedded device authentication scenario, can complete the offline phase in about 24 seconds and then produce a signature (the online phase) in 15 milliseconds, which is the best offline performance of any known signature scheme that has been proven secure in the standard model. In the tests on a desktop class computer, the proposed SGS scheme, which has the best online performance and is designed for the bursty server authentication scenario, generated 469,109 signatures per second, and the Schnorr scheme (the next best scheme in terms of online performance) generated only 223,548 signatures. The experimental results demonstrate that the SGE and SGS schemes are the most efficient techniques for embedded device authentication and bursty server authentication, respectively. digital.library.unt.edu/ark:/67531/metadc9717/
FIRST PREV 1 2 3 4 NEXT LAST