UNT Theses and Dissertations - Browse

ABOUT BROWSE FEED

Computational Methods for Vulnerability Analysis and Resource Allocation in Public Health Emergencies

Description: POD (Point of Dispensing)-based emergency response plans involving mass prophylaxis may seem feasible when considering the choice of dispensing points within a region, overall population density, and estimated traffic demands. However, the plan may fail to serve particular vulnerable sub-populations, resulting in access disparities during emergency response. Federal authorities emphasize on the need to identify sub-populations that cannot avail regular services during an emergency due to their special needs to ensure effective response. Vulnerable individuals require the targeted allocation of appropriate resources to serve their special needs. Devising schemes to address the needs of vulnerable sub-populations is essential for the effectiveness of response plans. This research focuses on data-driven computational methods to quantify and address vulnerabilities in response plans that require the allocation of targeted resources. Data-driven methods to identify and quantify vulnerabilities in response plans are developed as part of this research. Addressing vulnerabilities requires the targeted allocation of appropriate resources to PODs. The problem of resource allocation to PODs during public health emergencies is introduced and the variants of the resource allocation problem such as the spatial allocation, spatio-temporal allocation and optimal resource subset variants are formulated. Generating optimal resource allocation and scheduling solutions can be computationally hard problems. The application of metaheuristic techniques to find near-optimal solutions to the resource allocation problem in response plans is investigated. A vulnerability analysis and resource allocation framework that facilitates the demographic analysis of population data in the context of response plans, and the optimal allocation of resources with respect to the analysis are described.
Date: August 2015
Creator: Indrakanti, Saratchandra
Partner: UNT Libraries

Freeform Cursive Handwriting Recognition Using a Clustered Neural Network

Description: Optical character recognition (OCR) software has advanced greatly in recent years. Machine-printed text can be scanned and converted to searchable text with word accuracy rates around 98%. Reasonably neat hand-printed text can be recognized with about 85% word accuracy. However, cursive handwriting still remains a challenge, with state-of-the-art performance still around 75%. Algorithms based on hidden Markov models have been only moderately successful, while recurrent neural networks have delivered the best results to date. This thesis explored the feasibility of using a special type of feedforward neural network to convert freeform cursive handwriting to searchable text. The hidden nodes in this network were grouped into clusters, with each cluster being trained to recognize a unique character bigram. The network was trained on writing samples that were pre-segmented and annotated. Post-processing was facilitated in part by using the network to identify overlapping bigrams that were then linked together to form words and sentences. With dictionary assisted post-processing, the network achieved word accuracy of 66.5% on a small, proprietary corpus. The contributions in this thesis are threefold: 1) the novel clustered architecture of the feed-forward neural network, 2) the development of an expanded set of observers combining image masks, modifiers, and feature characterizations, and 3) the use of overlapping bigrams as the textual working unit to assist in context analysis and reconstruction.
Date: August 2015
Creator: Bristow, Kelly H.
Partner: UNT Libraries

Integrity Verification of Applications on RADIUM Architecture

Description: Trusted Computing capability has become ubiquitous these days, and it is being widely deployed into consumer devices as well as enterprise platforms. As the number of threats is increasing at an exponential rate, it is becoming a daunting task to secure the systems against them. In this context, the software integrity measurement at runtime with the support of trusted platforms can be a better security strategy. Trusted Computing devices like TPM secure the evidence of a breach or an attack. These devices remain tamper proof if the hardware platform is physically secured. This type of trusted security is crucial for forensic analysis in the aftermath of a breach. The advantages of trusted platforms can be further leveraged if they can be used wisely. RADIUM (Race-free on-demand Integrity Measurement Architecture) is one such architecture, which is built on the strength of TPM. RADIUM provides an asynchronous root of trust to overcome the TOC condition of DRTM. Even though the underlying architecture is trusted, attacks can still compromise applications during runtime by exploiting their vulnerabilities. I propose an application-level integrity measurement solution that fits into RADIUM, to expand the trusted computing capability to the application layer. This is based on the concept of program invariants that can be used to learn the correct behavior of an application. I used Daikon, a tool to obtain dynamic likely invariants, and developed a method of observing these properties at runtime to verify the integrity. The integrity measurement component was implemented as a Python module on top of Volatility, a virtual machine introspection tool. My approach is a first step towards integrity attestation, using hypervisor-based introspection on RADIUM and a proof of concept of application-level measurement capability.
Date: August 2015
Creator: Tarigopula, Mohan Krishna
Partner: UNT Libraries

Maintaining Web Applications Integrity Running on RADIUM

Description: Computer security attacks take place due to the presence of vulnerabilities and bugs in software applications. Bugs and vulnerabilities are the result of weak software architecture and lack of standard software development practices. Despite the fact that software companies are investing millions of dollars in the research and development of software designs security risks are still at large. In some cases software applications are found to carry vulnerabilities for many years before being identified. A recent such example is the popular Heart Bleed Bug in the Open SSL/TSL. In today’s world, where new software application are continuously being developed for a varied community of users; it’s highly unlikely to have software applications running without flaws. Attackers on computer system securities exploit these vulnerabilities and bugs and cause threat to privacy without leaving any trace. The most critical vulnerabilities are those which are related to the integrity of the software applications. Because integrity is directly linked to the credibility of software application and data it contains. Here I am giving solution of maintaining web applications integrity running on RADIUM by using daikon. Daikon generates invariants, these invariants are used to maintain the integrity of the web application and also check the correct behavior of web application at run time on RADIUM architecture in case of any attack or malware. I used data invariants and program flow invariants in my solution to maintain the integrity of web-application against such attack or malware. I check the behavior of my proposed invariants at run-time using Lib-VMI/Volatility memory introspection tool. This is a novel approach and proof of concept toward maintaining web application integrity on RADIUM.
Date: August 2015
Creator: Ur-Rehman, Wasi
Partner: UNT Libraries

Radium: Secure Policy Engine in Hypervisor

Description: The basis of today’s security systems is the trust and confidence that the system will behave as expected and are in a known good trusted state. The trust is built from hardware and software elements that generates a chain of trust that originates from a trusted known entity. Leveraging hardware, software and a mandatory access control policy technology is needed to create a trusted measurement environment. Employing a control layer (hypervisor or microkernel) with the ability to enforce a fine grained access control policy with hyper call granularity across multiple guest virtual domains can ensure that any malicious environment to be contained. In my research, I propose the use of radium's Asynchronous Root of Trust Measurement (ARTM) capability incorporated with a secure mandatory access control policy engine that would mitigate the limitations of the current hardware TPM solutions. By employing ARTM we can leverage asynchronous use of boot, launch, and use with the hypervisor proving its state and the integrity of the secure policy. My solution is using Radium (Race free on demand integrity architecture) architecture that will allow a more detailed measurement of applications at run time with greater semantic knowledge of the measured environments. Radium incorporation of a secure access control policy engine will give it the ability to limit or empower a virtual domain system. It can also enable the creation of a service oriented model of guest virtual domains that have the ability to perform certain operations such as introspecting other virtual domain systems to determine the integrity or system state and report it to a remote entity.
Date: August 2015
Creator: Shah, Tawfiq M.
Partner: UNT Libraries

Towards Resistance Detection in Health Behavior Change Dialogue Systems

Description: One of the challenges fairly common in motivational interviewing is patient resistance to health behavior change. Hence, automated dialog systems aimed at counseling patients need to be capable of detecting resistance and appropriately altering dialog. This thesis focusses primarily on the development of such a system for automatic identification of patient resistance to behavioral change. This enables the dialogue system to direct the discourse towards a more agreeable ground and helping the patient overcome the obstacles in his or her way to change. This thesis also proposes a dialogue system framework for health behavior change via natural language analysis and generation. The proposed framework facilitates automated motivational interviewing from clinical psychology and involves three broad stages: rapport building and health topic identification, assessment of the patient’s opinion about making a change, and developing a plan. Using this framework patients can be encouraged to reflect on the options available and choose the best for a healthier life.
Date: August 2015
Creator: Sarma, Bandita
Partner: UNT Libraries

Unique Channel Email System

Description: Email connects 85% of the world. This paper explores the pattern of information overload encountered by majority of email users and examine what steps key email providers are taking to combat the problem. Besides fighting spam, popular email providers offer very limited tools to reduce the amount of unwanted incoming email. Rather, there has been a trend to expand storage space and aid the organization of email. Storing email is very costly and harmful to the environment. Additionally, information overload can be detrimental to productivity. We propose a simple solution that results in drastic reduction of unwanted mail, also known as graymail.
Date: August 2015
Creator: Balakchiev, Milko
Partner: UNT Libraries

Classifying Pairwise Object Interactions: A Trajectory Analytics Approach

Description: We have a huge amount of video data from extensively available surveillance cameras and increasingly growing technology to record the motion of a moving object in the form of trajectory data. With proliferation of location-enabled devices and ongoing growth in smartphone penetration as well as advancements in exploiting image processing techniques, tracking moving objects is more flawlessly achievable. In this work, we explore some domain-independent qualitative and quantitative features in raw trajectory (spatio-temporal) data in videos captured by a fixed single wide-angle view camera sensor in outdoor areas. We study the efficacy of those features in classifying four basic high level actions by employing two supervised learning algorithms and show how each of the features affect the learning algorithms’ overall accuracy as a single factor or confounded with others.
Date: May 2015
Creator: Janmohammadi, Siamak
Partner: UNT Libraries

Distributed Frameworks Towards Building an Open Data Architecture

Description: Data is everywhere. The current Technological advancements in Digital, Social media and the ease at which the availability of different application services to interact with variety of systems are causing to generate tremendous volumes of data. Due to such varied services, Data format is now not restricted to only structure type like text but can generate unstructured content like social media data, videos and images etc. The generated Data is of no use unless been stored and analyzed to derive some Value. Traditional Database systems comes with limitations on the type of data format schema, access rates and storage sizes etc. Hadoop is an Apache open source distributed framework that support storing huge datasets of different formatted data reliably on its file system named Hadoop File System (HDFS) and to process the data stored on HDFS using MapReduce programming model. This thesis study is about building a Data Architecture using Hadoop and its related open source distributed frameworks to support a Data flow pipeline on a low commodity hardware. The Data flow components are, sourcing data, storage management on HDFS and data access layer. This study also discuss about a use case to utilize the architecture components. Sqoop, a framework to ingest the structured data from database onto Hadoop and Flume is used to ingest the semi-structured Twitter streaming json data on to HDFS for analysis. The data sourced using Sqoop and Flume have been analyzed using Hive for SQL like analytics and at a higher level of data access layer, Hadoop has been compared with an in memory computing system using Spark. Significant differences in query execution performances have been analyzed when working with Hadoop and Spark frameworks. This integration helps for ingesting huge Volumes of streaming json Variety data to derive better Value based analytics using Hive and ...
Date: May 2015
Creator: Venumuddala, Ramu Reddy
Partner: UNT Libraries

Investigation on Segmentation, Recognition and 3D Reconstruction of Objects Based on LiDAR Data Or MRI

Description: Segmentation, recognition and 3D reconstruction of objects have been cutting-edge research topics, which have many applications ranging from environmental and medical to geographical applications as well as intelligent transportation. In this dissertation, I focus on the study of segmentation, recognition and 3D reconstruction of objects using LiDAR data/MRI. Three main works are that (I). Feature extraction algorithm based on sparse LiDAR data. A novel method has been proposed for feature extraction from sparse LiDAR data. The algorithm and the related principles have been described. Also, I have tested and discussed the choices and roles of parameters. By using correlation of neighboring points directly, statistic distribution of normal vectors at each point has been effectively used to determine the category of the selected point. (II). Segmentation and 3D reconstruction of objects based on LiDAR/MRI. The proposed method includes that the 3D LiDAR data are layered, that different categories are segmented, and that 3D canopy surfaces of individual tree crowns and clusters of trees are reconstructed from LiDAR point data based on a region active contour model. The proposed method allows for delineations of 3D forest canopy naturally from the contours of raw LiDAR point clouds. The proposed model is suitable not only for a series of ideal cone shapes, but also for other kinds of 3D shapes as well as other kinds dataset such as MRI. (III). Novel algorithms for recognition of objects based on LiDAR/MRI. Aimed to the sparse LiDAR data, the feature extraction algorithm has been proposed and applied to classify the building and trees. More importantly, the novel algorithms based on level set methods have been provided and employed to recognize not only the buildings and trees, the different trees (e.g. Oak trees and Douglas firs), but also the subthalamus nuclei (STNs). By using the novel algorithms based ...
Date: May 2015
Creator: Tang, Shijun
Partner: UNT Libraries

SEM Predicting Success of Student Global Software Development Teams

Description: The extensive use of global teams to develop software has prompted researchers to investigate various factors that can enhance a team’s performance. While a significant body of research exists on global software teams, previous research has not fully explored the interrelationships and collective impact of various factors on team performance. This study explored a model that added the characteristics of a team’s culture, ability, communication frequencies, response rates, and linguistic categories to a central framework of team performance. Data was collected from two student software development projects that occurred between teams located in the United States, Panama, and Turkey. The data was obtained through online surveys and recorded postings of team activities that occurred throughout the global software development projects. Partial least squares path modeling (PLS-PM) was chosen as the analytic technique to test the model and identify the most influential factors. Individual factors associated with response rates and linguistic characteristics proved to significantly affect a team’s activity related to grade on the project, group cohesion, and the number of messages received and sent. Moreover, an examination of possible latent homogeneous segments in the model supported the existence of differences among groups based on leadership style. Teams with assigned leaders tended to have stronger relationships between linguistic characteristics and team performance factors, while teams with emergent leaders had stronger. Relationships between response rates and team performance factors. The contributions in this dissertation are three fold. 1) Novel analysis techniques using PLS-PM and clustering, 2) Use of new, quantifiable variables in analyzing team activity, 3) Identification of plausible causal indicators for team performance and analysis of the same.
Date: May 2015
Creator: Brooks, Ian Robert
Partner: UNT Libraries

Video Analytics with Spatio-Temporal Characteristics of Activities

Description: As video capturing devices become more ubiquitous from surveillance cameras to smart phones, the demand of automated video analysis is increasing as never before. One obstacle in this process is to efficiently locate where a human operator’s attention should be, and another is to determine the specific types of activities or actions without ambiguity. It is the special interest of this dissertation to locate spatial and temporal regions of interest in videos and to develop a better action representation for video-based activity analysis. This dissertation follows the scheme of “locating then recognizing” activities of interest in videos, i.e., locations of potentially interesting activities are estimated before performing in-depth analysis. Theoretical properties of regions of interest in videos are first exploited, based on which a unifying framework is proposed to locate both spatial and temporal regions of interest with the same settings of parameters. The approach estimates the distribution of motion based on 3D structure tensors, and locates regions of interest according to persistent occurrences of low probability. Two contributions are further made to better represent the actions. The first is to construct a unifying model of spatio-temporal relationships between reusable mid-level actions which bridge low-level pixels and high-level activities. Dense trajectories are clustered to construct mid-level actionlets, and the temporal relationships between actionlets are modeled as Action Graphs based on Allen interval predicates. The second is an effort for a novel and efficient representation of action graphs based on a sparse coding framework. Action graphs are first represented using Laplacian matrices and then decomposed as a linear combination of primitive dictionary items following sparse coding scheme. The optimization is eventually formulated and solved as a determinant maximization problem, and 1-nearest neighbor is used for action classification. The experiments have shown better results than existing approaches for regions-of-interest detection and action ...
Date: May 2015
Creator: Cheng, Guangchun
Partner: UNT Libraries

Smartphone-based Household Travel Survey - a Literature Review, an App, and a Pilot Survey

Description: High precision data from household travel survey (HTS) is extremely important for the transportation research, traffic models and policy formulation. Traditional methods of data collection were imprecise because they relied on people’s memories of trip information, such as date and location, and the remainder data had to be obtained by certain supplemental tools. The traditional methods suffered from intensive labor, large time consumption, and unsatisfactory data precision. Recent research trends to employ smartphone apps to collect HTS data. In this study, there are two goals to be addressed. First, a smartphone app is developed to realize a smartphone-based method only for data collection. Second, the researcher evaluates whether this method can supply or replace the traditional tools of HTS. Based on this premise, the smartphone app, TravelSurvey, is specially developed and used for this study. TravelSurvey is currently compatible with iPhone 4 or higher and iPhone Operating System (iOS) 6 or higher, except iPhone 6 or iPhone 6 plus and iOS 8. To evaluate the feasibility, eight individuals are recruited to participate in a pilot HTS. Afterwards, seven of them are involved in a semi-structured interview. The interview is designed to collect interviewees’ feedback directly, so the interview mainly concerns the users’ experience of TravelSurvey. Generally, the feedback is positive. In this study, the pilot HTS data is successfully uploaded to the server by the participants, and the interviewees prefer this smartphone-based method. Therefore, as a new tool, the smartphone-based method feasibly supports a typical HTS for data collection.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: December 2014
Creator: Wang, Qian
Partner: UNT Libraries

General Purpose Computing in Gpu - a Watermarking Case Study

Description: The purpose of this project is to explore the GPU for general purpose computing. The GPU is a massively parallel computing device that has a high-throughput, exhibits high arithmetic intensity, has a large market presence, and with the increasing computation power being added to it each year through innovations, the GPU is a perfect candidate to complement the CPU in performing computations. The GPU follows the single instruction multiple data (SIMD) model for applying operations on its data. This model allows the GPU to be very useful for assisting the CPU in performing computations on data that is highly parallel in nature. The compute unified device architecture (CUDA) is a parallel computing and programming platform for NVIDIA GPUs. The main focus of this project is to show the power, speed, and performance of a CUDA-enabled GPU for digital video watermark insertion in the H.264 video compression domain. Digital video watermarking in general is a highly computationally intensive process that is strongly dependent on the video compression format in place. The H.264/MPEG-4 AVC video compression format has high compression efficiency at the expense of having high computational complexity and leaving little room for an imperceptible watermark to be inserted. Employing a human visual model to limit distortion and degradation of visual quality introduced by the watermark is a good choice for designing a video watermarking algorithm though this does introduce more computational complexity to the algorithm. Research is being conducted into how the CPU-GPU execution of the digital watermark application can boost the speed of the applications several times compared to running the application on a standalone CPU using NVIDIA visual profiler to optimize the application.
Date: August 2014
Creator: Hanson, Anthony
Partner: UNT Libraries

Autonomic Failure Identification and Diagnosis for Building Dependable Cloud Computing Systems

Description: The increasingly popular cloud-computing paradigm provides on-demand access to computing and storage with the appearance of unlimited resources. Users are given access to a variety of data and software utilities to manage their work. Users rent virtual resources and pay for only what they use. In spite of the many benefits that cloud computing promises, the lack of dependability in shared virtualized infrastructures is a major obstacle for its wider adoption, especially for mission-critical applications. Virtualization and multi-tenancy increase system complexity and dynamicity. They introduce new sources of failure degrading the dependability of cloud computing systems. To assure cloud dependability, in my dissertation research, I develop autonomic failure identification and diagnosis techniques that are crucial for understanding emergent, cloud-wide phenomena and self-managing resource burdens for cloud availability and productivity enhancement. We study the runtime cloud performance data collected from a cloud test-bed and by using traces from production cloud systems. We define cloud signatures including those metrics that are most relevant to failure instances. We exploit profiled cloud performance data in both time and frequency domain to identify anomalous cloud behaviors and leverage cloud metric subspace analysis to automate the diagnosis of observed failures. We implement a prototype of the anomaly identification system and conduct the experiments in an on-campus cloud computing test-bed and by using the Google datacenter traces. Our experimental results show that our proposed anomaly detection mechanism can achieve 93% detection sensitivity while keeping the false positive rate as low as 6.1% and outperform other tested anomaly detection schemes. In addition, the anomaly detector adapts itself by recursively learning from these newly verified detection results to refine future detection.
Date: May 2014
Creator: Guan, Qiang
Partner: UNT Libraries

Ddos Defense Against Botnets in the Mobile Cloud

Description: Mobile phone advancements and ubiquitous internet connectivity are resulting in ever expanding possibilities in the application of smart phones. Users of mobile phones are now capable of hosting server applications from their personal devices. Whether providing services individually or in an ad hoc network setting the devices are currently not configured for defending against distributed denial of service (DDoS) attacks. These attacks, often launched from a botnet, have existed in the space of personal computing for decades but recently have begun showing up on mobile devices. Research is done first into the required steps to develop a potential botnet on the Android platform. This includes testing for the amount of malicious traffic an Android phone would be capable of generating for a DDoS attack. On the other end of the spectrum is the need of mobile devices running networked applications to develop security against DDoS attacks. For this mobile, phones are setup, with web servers running Apache to simulate users running internet connected applications for either local ad hoc networks or serving to the internet. Testing is done for the viability of using commonly available modules developed for Apache and intended for servers as well as finding baseline capabilities of mobiles to handle higher traffic volumes. Given the unique challenge of the limited resources a mobile phone can dedicate to Apache when compared to a dedicated hosting server a new method was needed. A proposed defense algorithm is developed for mitigating DDoS attacks against the mobile server that takes into account the limited resources available on the mobile device. The algorithm is tested against TCP socket flooding for effectiveness and shown to perform better than the common Apache module installations on a mobile device.
Date: May 2014
Creator: Jensen, David
Partner: UNT Libraries

Performance Engineering of Software Web Services and Distributed Software Systems

Description: The promise of service oriented computing, and the availability of Web services promote the delivery and creation of new services based on existing services, in order to meet new demands and new markets. As Web and internet based services move into Clouds, inter-dependency of services and their complexity will increase substantially. There are standards and frameworks for specifying and composing Web Services based on functional properties. However, mechanisms to individually address non-functional properties of services and their compositions have not been well established. Furthermore, the Cloud ontology depicts service layers from a high-level, such as Application and Software, to a low-level, such as Infrastructure and Platform. Each component that resides in one layer can be useful to another layer as a service. It hints at the amount of complexity resulting from not only horizontal but also vertical integrations in building and deploying a composite service. To meet the requirements and facilitate using Web services, we first propose a WSDL extension to permit specification of non-functional or Quality of Service (QoS) properties. On top of the foundation, the QoS-aware framework is established to adapt publicly available tools for Web services, augmented by ontology management tools, along with tools for performance modeling to exemplify how the non-functional properties such as response time, throughput, or utilization of services can be addressed in the service acquisition and composition process. To facilitate Web service composition standards, in this work we extended the framework with additional qualitative information to the service descriptions using Business Process Execution Language (BPEL). Engineers can use BPEL to explore design options, and have the QoS properties analyzed for the composite service. The main issue in our research is performance evaluation in software system and engineering. We researched the Web service computation as the first half of this dissertation, and performance antipattern ...
Date: May 2014
Creator: Lin, Chia-en
Partner: UNT Libraries

3GPP Long Term Evolution LTE Scheduling

Description: Future generation cellular networks are expected to deliver an omnipresent broadband access network for an endlessly increasing number of subscribers. Long term Evolution (LTE) represents a significant milestone towards wireless networks known as 4G cellular networks. A key feature of LTE is the implementation of enhanced Radio Resource Management (RRM) mechanism to improve the system performance. The structure of LTE networks was simplified by diminishing the number of the nodes of the core network. Also, the design of the radio protocol architecture is quite unique. In order to achieve high data rate in LTE, 3rd Generation Partnership Project (3GPP) has selected Orthogonal Frequency Division Multiplexing (OFDM) as an appropriate scheme in terms of downlinks. However, the proper scheme for an uplink is the Single-Carrier Frequency Domain Multiple Access due to the peak-to-average-power-ratio (PAPR) constraint. LTE packet scheduling plays a primary role as part of RRM to improve the system’s data rate as well as supporting various QoS requirements of mobile services. The major function of the LTE packet scheduler is to assign Physical Resource Blocks (PRBs) to mobile User Equipment (UE). In our work, we formed a proposed packet scheduler algorithm. The proposed scheduler algorithm acts based on the number of UEs attached to the eNodeB. To evaluate the proposed scheduler algorithm, we assumed two different scenarios based on a number of UEs. When the number of UE is lower than the number of PRBs, the UEs with highest Channel Quality Indicator (CQI) will be assigned PRBs. Otherwise, the scheduler will assign PRBs based on a given proportional fairness metric. The eNodeB’s throughput is increased when the proposed algorithm was implemented.
Date: December 2013
Creator: Alotaibi, Sultan
Partner: UNT Libraries

Boosting for Learning From Imbalanced, Multiclass Data Sets

Description: In many real-world applications, it is common to have uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied with the improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers are created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant reaching above 50%. As the volume of training data increase, the gain of efficiency with the proposed method becomes more significant.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: December 2013
Creator: Abouelenien, Mohamed
Partner: UNT Libraries

Design and Analysis of Novel Verifiable Voting Schemes

Description: Free and fair elections are the basis for democracy, but conducting elections is not an easy task. Different groups of people are trying to influence the outcome of the election in their favor using the range of methods, from campaigning for a particular candidate to well-financed lobbying. Often the stakes are too high, and the methods are illegal. Two main properties of any voting scheme are the privacy of a voter’s choice and the integrity of the tally. Unfortunately, they are mutually exclusive. Integrity requires making elections transparent and auditable, but at the same time, we must preserve a voter’s privacy. It is always a trade-off between these two requirements. Current voting schemes favor privacy over auditability, and thus, they are vulnerable to voting fraud. I propose two novel voting systems that can achieve both privacy and verifiability. The first protocol is based on cryptographical primitives to ensure the integrity of the final tally and privacy of the voter. The second protocol is a simple paper-based voting scheme that achieves almost the same level of security without usage of cryptography.
Date: December 2013
Creator: Yestekov, Yernat
Partner: UNT Libraries

Simulating the Spread of Infectious Diseases in Heterogeneous Populations with Diverse Interactions Characteristics

Description: The spread of infectious diseases has been a public concern throughout human history. Historic recorded data has reported the severity of infectious disease epidemics in different ages. Ancient Greek physician Hippocrates was the first to analyze the correlation between diseases and their environment. Nowadays, health authorities are in charge of planning strategies that guarantee the welfare of citizens. The simulation of contagion scenarios contributes to the understanding of the epidemic behavior of diseases. Computational models facilitate the study of epidemics by integrating disease and population data to the simulation. The use of detailed demographic and geographic characteristics allows researchers to construct complex models that better resemble reality and the integration of these attributes permits us to understand the rules of interaction. The interaction of individuals with similar characteristics forms synthetic structures that depict clusters of interaction. The synthetic environments facilitate the study of the spread of infectious diseases in diverse scenarios. The characteristics of the population and the disease concurrently affect the local and global epidemic progression. Every cluster’ epidemic behavior constitutes the global epidemic for a clustered population. By understanding the correlation between structured populations and the spread of a disease, current dissertation research makes possible to identify risk groups of specific characteristics and devise containment strategies that facilitate health authorities to improve mitigation strategies.
Date: December 2013
Creator: Gomez-Lopez, Iris Nelly
Partner: UNT Libraries

Framework for Evaluating Dynamic Memory Allocators Including a New Equivalence Class Based Cache-conscious Allocator

Description: Software applications’ performance is hindered by a variety of factors, but most notably by the well-known CPU-memory speed gap (often known as the memory wall). This results in the CPU sitting idle waiting for data to be brought from memory to processor caches. The addressing used by caches cause non-uniform accesses to various cache sets. The non-uniformity is due to several reasons, including how different objects are accessed by the code and how the data objects are located in memory. Memory allocators determine where dynamically created objects are placed, thus defining addresses and their mapping to cache locations. It is important to evaluate how different allocators behave with respect to the localities of the created objects. Most allocators use a single attribute, the size, of an object in making allocation decisions. Additional attributes such as the placement with respect to other objects, or specific cache area may lead to better use of cache memories. In this dissertation, we proposed and implemented a framework that allows for the development and evaluation of new memory allocation techniques. At the root of the framework is a memory tracing tool called Gleipnir, which provides very detailed information about every memory access, and relates it back to source level objects. Using the traces from Gleipnir, we extended a commonly used cache simulator for generating detailed cache statistics: per function, per data object, per cache line, and identify specific data objects that are conflicting with each other. The utility of the framework is demonstrated with a new memory allocator known as equivalence class allocator. The new allocator allows users to specify cache sets, in addition to object size, where the objects should be placed. We compare this new allocator with two well-known allocators, viz., Doug Lea and Pool allocators.
Date: August 2013
Creator: Janjusic, Tomislav
Partner: UNT Libraries

Modeling and Analysis of Next Generation 9-1-1 Emergency Medical Dispatch Protocols

Description: Emergency Medical Dispatch Protocols are guidelines that a 9-1-1 dispatcher uses to evaluate the nature of emergency, resources to send and the nature of help provided to the 9-1-1 caller. The current Dispatch Protocols are based on voice only call. But the Next Generation 9-1-1 (NG9-1-1) architecture will allow multimedia emergency calls. In this thesis I analyze and model the Emergency Medical Dispatch Protocols for NG9-1-1 architecture. I have identified various technical aspects to improve the NG9-1-1 Dispatch Protocols. The devices (smartphone) at the caller end have advanced to a point where they can be used to send and receive video, pictures and text. There are sensors embedded in them that can be used for initial diagnosis of the injured person. There is a need to improve the human computer (smartphone) interface to take advantage of technology so that callers can easily make use of various features available to them. The dispatchers at the 9-1-1 call center can make use of these new protocols to improve the quality and the response time. They will have capability of multiple media streams to interact with the caller and the first responders.The specific contributions in this thesis include developing applications that use smartphone sensors. The CPR application uses the smartphone to help administer effective CPR even if the person is not trained. The application makes the CPR process closed loop, i.e., the person who administers the CPR as well as the 9-1-1 operator receive feedback and prompt from the application about the correctness of the CPR. The breathing application analyzes the quality of breathing of the affected person and automatically sends the information to the 9-1-1 operator. In order to improve the Human Computer Interface at the caller and the operator end, I have analyzed Fitts law and extended it so that it ...
Date: August 2013
Creator: Gupta, Neeraj Kant
Partner: UNT Libraries

Multilingual Word Sense Disambiguation Using Wikipedia

Description: Ambiguity is inherent to human language. In particular, word sense ambiguity is prevalent in all natural languages, with a large number of the words in any given language carrying more than one meaning. Word sense disambiguation is the task of automatically assigning the most appropriate meaning to a polysemous word within a given context. Generally the problem of resolving ambiguity in literature has revolved around the famous quote “you shall know the meaning of the word by the company it keeps.” In this thesis, we investigate the role of context for resolving ambiguity through three different approaches. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses and sense-tagged data are derived automatically from Wikipedia. Using Wikipedia as a source of sense-annotations provides the much needed solution for knowledge acquisition bottleneck. In order to evaluate the viability of Wikipedia based sense-annotations, we cast the task of disambiguating polysemous nouns as a monolingual classification task and experimented on lexical samples from four different languages (viz. English, German, Italian and Spanish). The experiments confirm that the Wikipedia based sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. It is a long belief that exploiting multiple languages helps in building accurate word sense disambiguation systems. Subsequently, we developed two approaches that recast the task of disambiguating polysemous nouns as a multilingual classification task. The first approach for multilingual word sense disambiguation attempts to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text. First, the various senses of a target word may be translated into different words, which constitute unique, yet highly salient signal that effectively expand the target word’s feature space. Second, the translated context words themselves embed co-occurrence information ...
Date: August 2013
Creator: Dandala, Bharath
Partner: UNT Libraries