Search Results

Accurate Joint Detection from Depth Videos towards Pose Analysis
Joint detection is vital for characterizing human pose and serves as a foundation for a wide range of computer vision applications such as physical training, health care, and entertainment. This dissertation proposes two methods to detect joints in the human body for pose analysis. The first method detects joints by combining a body model with automatic feature point detection. The human body model maps the detected extreme points to the corresponding body parts of the model and detects the position of implicit joints. The dominant joints are detected after implicit joints and extreme points are located by a shortest-path-based method. The main contribution of this work is a hybrid framework to detect joints on the human body that is robust to different body shapes or proportions, pose variations, and occlusions. Another contribution of this work is the idea of using geodesic features of the human body to build a model for guiding human pose detection and estimation. The second proposed method first segments the human body into parts and then detects joints by focusing the detection algorithm on each limb. The advantage of applying body part segmentation first is that it narrows down the search area for each joint, so that the joint detection method can provide more stable and accurate results.
Adaptive Power Management for Autonomic Resource Configuration in Large-scale Computer Systems
In order to run and manage resource-intensive high-performance applications, large-scale computing and storage platforms have been evolving rapidly in various domains in both academia and industry. The energy consumed to operate and maintain these cloud computing infrastructures is a major factor influencing the overall profit and efficiency of most cloud service providers. Moreover, to mitigate the environmental damage from excessive carbon dioxide emissions, the amount of power consumed by enterprise-scale data centers should be constrained. Generally speaking, there exists a trade-off between power consumption and application performance in large-scale computing systems, and how to balance these two factors has become an important topic for researchers and engineers in the cloud and HPC communities. Therefore, minimizing power usage while satisfying Service Level Agreements has become one of the most desirable objectives in cloud computing research and implementation. Since the fundamental feature of the cloud computing platform is hosting workloads with a variety of characteristics in a consolidated and on-demand manner, it is demanding to explore the inherent relationship between power usage and machine configurations. Subsequently, with an understanding of these inherent relationships, researchers are able to develop effective power management policies to optimize productivity by balancing power usage and system performance. In this dissertation, we develop an autonomic power-aware system management framework for large-scale computer systems. We propose a series of techniques including coarse-grain power profiling, VM power modelling, power-aware resource auto-configuration, and a full-system power usage simulator. These techniques help us to understand the power consumption characteristics of various system components.
Based on these techniques, we are able to test various job scheduling strategies and develop resource management approaches to enhance the systems' power efficiency.
Advanced Power Amplifiers Design for Modern Wireless Communication
Modern wireless communication systems use spectrally efficient modulation schemes to reach high data rate transmission. These schemes generally involve signals with a high peak-to-average power ratio (PAPR). Moreover, the development of next generation wireless communication systems requires the power amplifiers to operate over a wide frequency band or multiple frequency bands to support different applications. These wide-band and multi-band solutions will lead to reductions in both the size and cost of the whole system. This dissertation presents several advanced power amplifier solutions to provide wide-band and multi-band operations with efficiency improvement at power back-offs.
Advanced Stochastic Signal Processing and Computational Methods: Theories and Applications
Compressed sensing has been proposed as a computationally efficient method to estimate finite-dimensional signals. The idea is to develop an undersampling operator that can sample large but finite-dimensional sparse signals at a rate much below the required Nyquist rate. In other words, considering the sparsity level of the signal, compressed sensing samples the signal at a rate proportional to the amount of information hidden in the signal. In this dissertation, first, we employ compressed sensing for physical layer signal processing of directional millimeter-wave communication. Second, we examine the theoretical aspects of compressed sensing through a comprehensive theoretical analysis that addresses two main unsolved problems: (1) continuous-extension compressed sensing in locally convex spaces and (2) computing the optimum subspace and its dimension using the idea of equivalent topologies based on Köthe sequences. In the first part of this thesis, we employ compressed sensing to address various problems in directional millimeter-wave communication. In particular, we focus on the stochastic characteristics of the underlying channel to characterize, detect, estimate, and track angular parameters of doubly directional millimeter-wave communication. For this purpose, we employ compressed sensing in combination with other stochastic methods such as Correlation Matrix Distance (CMD), spectral overlap, autoregressive processes, and fuzzy entropy to (1) study the (non-)stationary behavior of the channel and (2) estimate and track channel parameters. This class of applications involves finite-dimensional signals. Compressed sensing demonstrates great capability in sampling finite-dimensional signals. Nevertheless, it does not show the same performance when sampling semi-infinite and infinite-dimensional signals. The second part of the thesis consists of more theoretical work on compressed sensing directed toward applications.
In chapter 4, we leverage group Fourier theory and the stochastic nature of directional communication to introduce linear and quadratic families of displacement operators that …
Analysis and Optimization of Graphene FET based Nanoelectronic Integrated Circuits
Like cells in the human body, transistors are the basic building blocks of any electronic circuit. Silicon has been the industry's obvious choice for making transistors. Large transistors occupy a large chip area and consume a lot of power, and the number of functionalities is limited by area constraints. Thus, to make devices smaller, smarter, and faster, transistors are aggressively scaled down in each generation. Moore's law states that the transistor count in an electronic circuit doubles every 18 months. Following Moore's law, the transistor has already been scaled down to 14 nm. However, there are limitations to how much further these transistors can be scaled down. Particularly below 10 nm, silicon-based transistors hit fundamental limits such as loss of gate control, high leakage, and various other short channel effects. Thus, silicon transistors cannot be favored for future electronics applications. As a result, research has shifted to new device concepts and device materials that are alternatives to silicon. Carbon is another abundant element found on Earth, and one such carbon-based nanomaterial is graphene. Graphene, when extracted from graphite, the same material used as the lead in pencils, has tremendous potential to take future electronic devices to new heights in terms of size, cost, and efficiency. Thus, since its first experimental discovery in 2004, graphene has been a leading research area for both academia and industry. This dissertation is focused on the analysis and optimization of graphene-based circuits for future electronics. The first part of this dissertation considers graphene-based transistors for analog/radio frequency (RF) circuits. In this section, a dual gate Graphene Field Effect Transistor (GFET) is considered to build case study circuits such as a voltage controlled oscillator (VCO) and low …
Analysis and Performance of a Cyber-Human System and Protocols for Geographically Separated Collaborators
This dissertation provides an innovative mechanism that enables two geographically separated people to collaborate on a physical task, and a novel method to measure the Complexity Index (CI) and calculate the Minimal Complexity Index (MCI) of a collaboration protocol. The protocol is represented as a structure, and its information content is measured in bits to understand the complex nature of the protocol. Using the complexity metrics, one can analyze the performance of a collaborative system and a collaboration protocol. Security and privacy of the consumers are vital while seeking remote help; this dissertation also provides a novel authorization framework for dynamic access control of resources on an input-constrained appliance used for completing the physical task. Using the innovative Collaborative Appliance for REmote-help (CARE) and with the support of a remotely located expert, fifty-nine subjects with minimal or no prior mechanical knowledge were able to elevate a car for replacing a tire in an average time of six minutes and 53 seconds and with an average protocol complexity of 171.6 bits. Moreover, thirty subjects with minimal or no prior plumbing knowledge were able to change the cartridge of a faucet in an average time of ten minutes and with an average protocol complexity of 250.6 bits. Our experiments and results show that one can use the developed mechanism and methods to expand the protocols to a variety of home, vehicle, and appliance repairs and installations.
Application of Adaptive Techniques in Regression Testing for Modern Software Development
In this dissertation we investigate the applicability of different adaptive techniques to improve the effectiveness and efficiency of regression testing. Initially, we introduce the concept of regression testing. We then perform a literature review of current practices and state-of-the-art regression testing techniques. Finally, we advance regression testing techniques by performing four empirical studies in which we use different types of information (e.g., user sessions, source code, code commits) to investigate the effectiveness of each software metric on fault detection capability for different software environments. In our first empirical study, we show the effectiveness of applying user session information for test case prioritization. In our next study, we apply what we learned from the previous study and implement a collaborative filtering recommender system for test case prioritization, which uses user sessions and change history information as input parameters and returns the risk score associated with each component. Results of this study show that our recommender system improves the effectiveness of test prioritization; the performance of our approach was particularly noteworthy under time constraints. We then investigate the merits of multi-objective testing over single-objective techniques with a graph-based testing framework. Results of this study indicate that the use of the graph-based technique reduces the algorithm execution time considerably, while being just as effective as the greedy algorithms in terms of fault detection rate. Finally, we apply the knowledge from the previous studies and implement a query answering framework for regression test selection. This framework is built on a graph database and uses fault history information and test diversity in an attempt to select the most effective set of test cases in terms of fault detection capability.
Our empirical evaluation of this study with four open source programs shows that our approach can be effective and efficient by …
Application-Specific Things Architectures for IoT-Based Smart Healthcare Solutions
The human body is a complex system organized at different levels such as cells, tissues, and organs, which make up 11 important organ systems. The functional efficiency of this complex system is evaluated as health. Traditional healthcare is unable to accommodate everyone's needs due to the ever-increasing population and medical costs. With advancements in technology and medical research, traditional healthcare applications are evolving into smart healthcare solutions. Smart healthcare helps in continuously monitoring our body parameters, which helps in keeping people health-aware. It provides the ability for remote assistance, which helps in utilizing the available resources to their maximum potential. The backbone of smart healthcare solutions is the Internet of Things (IoT), which increases the computing capacity of real-world components by using cloud-based solutions. The basic elements of these IoT-based smart healthcare solutions are called "things." Things are simple sensors or actuators, which have the capacity to wirelessly connect with each other and to the internet. The research for this dissertation aims at developing architectures for these things, focusing on IoT-based smart healthcare solutions. The core of this dissertation is to contribute to the research in smart healthcare by identifying applications which can be monitored remotely. For this, application-specific thing architectures were proposed based on monitoring a specific body parameter; monitoring physical health for family and friends; and optimizing the power budget of an IoT body sensor network using human body communications. The experimental results show promising scope for improving the quality of life through needle-less and cost-effective smart healthcare solutions.
An Artificial Intelligence-Driven Model-Based Analysis of System Requirements for Exposing Off-Nominal Behaviors
With the advent of autonomous systems and deep learning systems, safety pertaining to these systems has become a major concern. The existing failure analysis techniques are not enough to thoroughly analyze the safety of these systems. Moreover, because these systems are created to operate in various conditions, they are susceptible to unknown safety issues. Hence, we need mechanisms which can take into account the complexity of operational design domains, identify safety issues other than failures, and expose unknown safety issues. Moreover, existing safety analysis approaches require a lot of effort and time for analysis and do not consider machine learning (ML) safety. To address these limitations, in this dissertation, we discuss an artificial-intelligence-driven model-based methodology that aids in identifying unknown safety issues and analyzing ML safety. Our methodology consists of 4 major tasks: 1) automated model generation, 2) automated analysis of component state transition model specification, 3) undesired states analysis, and 4) causal factor analysis. In our methodology, we identify unknown safety issues by finding undesired combinations of components' states and environmental entities' states, as well as the causes resulting in these undesired combinations. We refer to the behaviors that occur because of undesired combinations as off-nominal behaviors (ONBs). To identify undesired combinations and ONBs that aid in exposing unknown safety issues with less effort and time, we proposed various approaches for each of the tasks and performed corresponding empirical studies. We also discussed machine learning safety analysis from the perspective of machine learning engineers as well as system and software safety engineers. The results of the studies conducted as part of our research show that our proposed methodology helps in identifying unknown safety issues effectively.
Our results also show that combinatorial methods are effective in reducing effort and time for analysis of off-nominal behaviors without overlooking any …
Automated Real-time Objects Detection in Colonoscopy Videos for Quality Measurements
The effectiveness of colonoscopy depends on the quality of the inspection of the colon. There was no automated measurement method to evaluate the quality of the inspection. This thesis addresses this issue by investigating an automated post-procedure quality measurement technique and proposing a novel approach that automatically determines the percentage of stool areas in images of digitized colonoscopy video files. It involves the classification of image pixels based on their color features using a new method of planes in RGB (red, green, and blue) color space. The limitation of post-procedure quality measurement is that quality measurements are available long after the procedure was done and the patient was released. A better approach is to flag any sub-optimal inspection immediately so that the endoscopist can improve the quality in real time during the procedure. This thesis also proposes an extension to the post-procedure method to detect stool, bite-block, and blood regions in real time using color features in HSV color space. These three objects play a major role in quality measurements in colonoscopy. The proposed method partitions a very large set of positive examples of each of these objects into a number of groups. These groups are formed by taking the intersection of positive examples with a hyperplane. This hyperplane is named a 'positive plane'. 'Convex hulls' are used to model positive planes. Comparisons with traditional classifiers such as K-nearest neighbor (K-NN) and support vector machines (SVM) demonstrate the soundness of the proposed method in terms of accuracy and speed, which are critical in the targeted real-time quality measurement system.
Blockchain for AI: Smarter Contracts to Secure Artificial Intelligence Algorithms
In this dissertation, I investigate the existing smart contract problems that limit cognitive abilities. I use Taylor series expansion, polynomial equations, and fraction-based computations to overcome the limitations of calculations in smart contracts. To prove the hypothesis, I use these mathematical models to compute complex operations of naive Bayes, linear regression, decision tree, and neural network algorithms on Ethereum public test networks. The smart contracts achieve 95% prediction accuracy compared to traditional programming language models, proving the soundness of the numerical derivations. Many non-real-time applications can use our solution for trusted and secure prediction services.
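As an illustration of the series-expansion and fraction-based idea described in this abstract, the sketch below approximates the exponential function using integer arithmetic only, which is the kind of constraint a smart contract without floating-point support imposes. It is a minimal Python sketch, not the dissertation's implementation: the scale factor `SCALE`, the function name `exp_fixed`, and the number of terms are all assumptions chosen for the example.

```python
# Hypothetical fixed-point scale: a value v is stored as the integer v * SCALE.
SCALE = 10**18


def exp_fixed(x_num, x_den, terms=30):
    """Approximate e**(x_num / x_den) as an integer scaled by SCALE.

    Uses the Taylor series sum of x**n / n! with integer operations only.
    Intended for non-negative inputs; each division truncates, so the
    result carries a small rounding error.
    """
    x = x_num * SCALE // x_den      # the fraction x_num/x_den in fixed point
    term = SCALE                    # n = 0 term: x**0 / 0! = 1
    total = SCALE
    for n in range(1, terms):
        # term_n = term_(n-1) * x / n, rescaled to stay in fixed point
        term = term * x // (SCALE * n)
        total += term
    return total


e_approx = exp_fixed(1, 1)          # e**1 in fixed point
half_approx = exp_fixed(1, 2)       # e**0.5 in fixed point
```

Dividing the returned integer by `SCALE` recovers the real value for inspection off-chain; on-chain code would keep working with the scaled integers.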
Building Reliable and Cost-Effective Storage Systems for High-Performance Computing Datacenters
In this dissertation, I first incorporate declustered redundant array of independent disks (RAID) technology in the existing system by maximizing the aggregated recovery I/O and accelerating post-failure remediation. Our analytical model affirms the accelerated data recovery stage significantly improves storage reliability. Then I present a proactive data protection framework that augments storage availability and reliability. It utilizes the failure prediction methods to efficiently rescue data on drives before failures occur, which significantly reduces the storage downtime and lowers the risk of nested failures. Finally, I investigate how an active storage system enables energy-efficient computing. I explore an emerging storage device named Ethernet drive to offload data-intensive workloads from the host to drives and process the data on drives. It not only minimizes data movement and power usage, but also enhances data availability and storage scalability. In summary, my dissertation research provides intelligence at the drive, storage node, and system levels to tackle the rising reliability challenge in modern HPC datacenters. The results indicate that this novel storage paradigm cost-effectively improves storage scalability, availability, and reliability.
Capacity and Throughput Optimization in Multi-cell 3G WCDMA Networks
User modeling enables the computation of the traffic density in a cellular network, which can be used to optimize the placement of base stations and radio network controllers as well as to analyze the performance of resource management algorithms towards meeting the final goal: the calculation and maximization of network capacity and throughput for different data rate services. An analytical model is presented for approximating the user distributions in multi-cell third generation wideband code division multiple access (WCDMA) networks using 2-dimensional Gaussian distributions by determining the means and the standard deviations of the distributions for every cell. This model allows for the calculation of the inter-cell interference and the reverse-link capacity of the network. An analytical model for optimizing capacity in multi-cell WCDMA networks is presented. Capacity is optimized for different spreading factors and for perfect and imperfect power control. Numerical results show that the SIR threshold for the received signals is decreased by 0.5 to 1.5 dB due to imperfect power control. The results also show that the determined parameters of the 2-dimensional Gaussian model match well with traditional methods for modeling user distribution. A call admission control algorithm is designed that maximizes the throughput in multi-cell WCDMA networks. Numerical results are presented for different spreading factors and for several mobility scenarios. Our methods of optimizing capacity and throughput are computationally efficient, accurate, and can be implemented in large WCDMA networks.
Combinatorial-Based Testing Strategies for Mobile Application Testing
This work introduces three new coverage criteria based on combinatorial-based event and element sequences that occur in the mobile environment. The novel combinatorial-based criteria are used to reduce, prioritize, and generate test suites for mobile applications. The combinatorial-based criteria include unique coverage of events and elements with different respects to ordering. For instance, consider the coverage of a pair of events, e1 and e2. The least strict criterion, Combinatorial Coverage (CCov), counts the combination of these two events in a test case without respect to the order in which the events occur. That is, the combination (e1, e2) is the same as (e2, e1). The second criterion, Sequence-Based Combinatorial Coverage (SCov), considers the order of occurrence within a test case. Sequences (e1, ..., e2) and (e2, ..., e1) are different sequences. The third and strictest criterion is Consecutive-Sequence Combinatorial Coverage (CSCov), which counts only adjacent pairs. The sequence (e1, e2) is only counted if e1 occurs immediately before e2. The first contribution uses the novel combinatorial-based criteria for the purpose of test suite reduction. Empirical studies reveal that the criteria, when used with event sequences and sequences of size t=2, reduce the test suites by 22.8%-61.3% while the reduced test suites provide 98.8% to 100% fault finding effectiveness. Empirical studies in Android also reveal that the event sequence criteria of size t=2 reduce the test suites by 24.67%-66% while losing at most 0.39% code coverage. When the criteria are used with element sequences and sequences of size t=2, the test suites are reduced by 40% to 72.67%, losing less than 0.87% code coverage. The second contribution of this work applies the combinatorial-based criteria for test suite prioritization of mobile application test suites. The results of an empirical study show that the prioritization criteria that use element and event sequences …
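The difference between the three criteria for pairs (t=2) can be sketched in a few lines of Python. This is an illustrative reading of the definitions in the abstract, not the authors' tooling: the function names are hypothetical, and the choice to ignore pairs of an event with itself is an assumption made here for simplicity.

```python
from itertools import combinations


def ccov_pairs(test_case):
    """CCov: unordered pairs of distinct events that appear in the test case."""
    return {frozenset(pair) for pair in combinations(set(test_case), 2)}


def scov_pairs(test_case):
    """SCov: ordered pairs (a, b) where a occurs somewhere before b."""
    pairs = set()
    for i, a in enumerate(test_case):
        for b in test_case[i + 1:]:
            if a != b:  # assumption: self-pairs are not counted
                pairs.add((a, b))
    return pairs


def cscov_pairs(test_case):
    """CSCov: ordered pairs (a, b) where a occurs immediately before b."""
    return {(a, b) for a, b in zip(test_case, test_case[1:]) if a != b}


# A test case represented as a sequence of event names.
test_case = ["e1", "e3", "e2"]
ccov = ccov_pairs(test_case)
scov = scov_pairs(test_case)
cscov = cscov_pairs(test_case)
```

For this test case, the pair e1, e2 is covered under CCov and SCov, but not under CSCov, because e3 occurs between the two events. The reversed pair (e2, e1) is covered only under CCov, where order is ignored.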
Comparative Study of RSS-Based Collaborative Localization Methods in Wireless Sensor Networks
In this thesis two collaborative localization techniques are studied: multidimensional scaling (MDS) and maximum likelihood estimator (MLE). A synthesis of a new location estimation method through a serial integration of these two techniques, such that an estimate is first obtained using MDS and then MLE is employed to fine-tune the MDS solution, was the subject of this research using various simulation and experimental studies. In the simulations, important issues including the effects of sensor node density, reference node density and different deployment strategies of reference nodes were addressed. In the experimental study, the path loss model of indoor environments is developed by determining the environment-specific parameters from the experimental measurement data. Then, the empirical path loss model is employed in the analysis and simulation study of the performance of collaborative localization techniques.
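The step of determining environment-specific path loss parameters from measurements can be illustrated with a short Python sketch. The abstract does not name the model used, so the log-distance form RSS(d) = P0 - 10 n log10(d / d0) is an assumption here, as are the function names and the reference distance `d0`. The data below are synthetic values generated from the model itself, not measurements from the thesis.

```python
import math


def fit_path_loss(distances_m, rss_dbm, d0=1.0):
    """Least-squares fit of P0 (RSS at d0) and the path loss exponent n.

    Assumed model: rss = P0 - 10 * n * log10(d / d0), which is linear
    in the regressor x = -10 * log10(d / d0) with slope n and intercept P0.
    """
    xs = [-10.0 * math.log10(d / d0) for d in distances_m]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(rss_dbm) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rss_dbm))
    n_exp = sxy / sxx
    p0 = mean_y - n_exp * mean_x
    return p0, n_exp


def estimate_distance(rss, p0, n_exp, d0=1.0):
    """Invert the assumed model to turn an RSS reading into a distance."""
    return d0 * 10 ** ((p0 - rss) / (10.0 * n_exp))


# Synthetic, noise-free readings generated with P0 = -40 dBm and n = 3.
dists = [1.0, 2.0, 4.0, 8.0]
rss = [-40.0 - 10.0 * 3.0 * math.log10(d) for d in dists]
p0, n_exp = fit_path_loss(dists, rss)
d_est = estimate_distance(rss[2], p0, n_exp)
```

Distances estimated this way between pairs of nodes are the kind of input that techniques such as MDS and MLE consume; with real measurements the fit would be noisy and the recovered parameters only approximate.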
Computational Approaches for Analyzing Social Support in Online Health Communities
Online health communities (OHCs) have become a medium for patients to share their personal experiences and interact with peers on topics related to a disease, medication, side effects, and therapeutic processes. Many studies show that using OHCs regularly decreases mortality and improves patients' mental health. As a result of their benefits, OHCs are a popular place for patients, especially patients with a severe disease, to turn to and to receive emotional and informational support. The main reasons for developing OHCs are to present valid and high-quality information and to understand the mechanism of social support in changing patients' mental health. Given the purpose of OHC moderators in developing OHC applications and the purpose of patients in using OHCs, there is no facility, feature, or sub-application in OHCs to satisfy patient and moderator goals. OHCs are only equipped with a primary search engine that is a keyword-based search tool. In other words, if a patient wants to obtain information about a side effect, he/she needs to browse many threads in the hope of finding several related comments. In the same way, OHC moderators cannot browse all the information exchanged among patients to validate its accuracy. Thus, it is critical for OHCs to be equipped with computational tools, supported by several sophisticated computational models, that provide moderators and patients with the collection of messages that they need for making decisions or predictions. We present multiple computational models to alleviate the problem of OHCs in providing specific types of messages in response to specific moderator and patient needs.
Specifically, we focused on proposing computational models for the following tasks: identifying emotional support, which presents OHC moderators, psychologists, and sociologists with insightful views on the emotional states of individuals and groups, and identifying informational support, which provides patients with …
A Computational Methodology for Addressing Differentiated Access of Vulnerable Populations During Biological Emergencies
Mitigation response plans must be created to protect affected populations during biological emergencies resulting from the release of harmful biochemical substances. Medical countermeasures have been stockpiled by the federal government for such emergencies. However, it is the responsibility of local governments to maintain solid, functional plans to apply these countermeasures to the entire target population within short, mandated time frames. Further, vulnerabilities in the population may serve as barriers preventing certain individuals from participating in mitigation activities. Therefore, functional response plans must be capable of reaching vulnerable populations. Transportation vulnerability results from lack of access to transportation. Transportation vulnerable populations located too far from mitigation resources are at risk of not being able to participate in mitigation activities. Quantification of these populations requires the development of computational methods to integrate spatial demographic data and transportation resource data from disparate sources into the context of planned mitigation efforts. Research described in this dissertation focuses on quantifying transportation vulnerable populations and maximizing participation in response efforts. Algorithms developed as part of this research are integrated into a computational framework to promote a transition from research and development to deployment and use by biological emergency planners.
Computational Methods to Optimize High-Consequence Variants of the Vehicle Routing Problem for Relief Networks in Humanitarian Logistics
Optimization of relief networks in humanitarian logistics often exemplifies the need for solutions that are feasible given a hard constraint on time. For instance, the distribution of medical countermeasures immediately following a biological disaster event must be completed within a short time-frame. When these supplies are not distributed within the maximum time allowed, the severity of the disaster is quickly exacerbated. Therefore emergency response plans that fail to facilitate the transportation of these supplies in the time allowed are simply not acceptable. As a result, all optimization solutions that fail to satisfy this criterion would be deemed infeasible. This creates a conflict with the priority optimization objective in most variants of the generic vehicle routing problem (VRP). Instead of efficiently maximizing usage of vehicle resources available to construct a feasible solution, these variants ordinarily prioritize the construction of a minimum cost set of vehicle routes. Research presented in this dissertation focuses on the design and analysis of efficient computational methods for optimizing high-consequence variants of the VRP for relief networks. The conflict between prioritizing the minimization of the number of vehicles required or the minimization of total travel time is demonstrated. The optimization of the time and capacity constraints in the context of minimizing the required vehicles are independently examined. An efficient meta-heuristic algorithm based on a continuous spatial partitioning scheme is presented for constructing a minimized set of vehicle routes in practical instances of the VRP that include critically high-cost penalties. Multiple optimization priority strategies that extend this algorithm are examined and compared in a large-scale bio-emergency case study. The algorithms designed from this research are implemented and integrated into an existing computational framework that is currently used by public health officials. 
These computational tools enhance an emergency response planner's ability to derive a set of vehicle routes specifically …
Content and Temporal Analysis of Communications to Predict Task Cohesion in Software Development Global Teams
Virtual teams in industry are increasingly being used to develop software, create products, and accomplish tasks. However, analyzing those collaborations under same-time/different-place conditions is well-known to be difficult. In order to overcome some of these challenges, this research was concerned with the study of collaboration-based, content-based and temporal measures and their ability to predict cohesion within global software development projects. Messages were collected from three software development projects that involved students from two different countries. The similarities and quantities of these interactions were computed and analyzed at individual and group levels. Results of interaction-based metrics showed that the collaboration variables most related to Task Cohesion were Linguistic Style Matching and Information Exchange. The study also found that Information Exchange rate and Reply rate have a significant and positive correlation to Task Cohesion, a factor used to describe participants' engagement in the global software development process. This relation was also found at the Group level. All these results suggest that metrics based on rate can be very useful for predicting cohesion in virtual groups. Similarly, content features based on communication categories were used to improve the identification of Task Cohesion levels. This model showed mixed results, since only Work similarity and Social rate were found to be correlated with Task Cohesion. This result can be explained by how a group's cohesiveness is often associated with fairness and trust, and that these two factors are often achieved by increased social and work communications. Also, at a group-level, all models were found correlated to Task Cohesion, specifically, Similarity+Rate, which suggests that models that include social and work communication categories are also good predictors of team cohesiveness. 
Finally, temporal interaction similarity measures were calculated to assess their prediction capabilities in a global setting. Results showed a significant negative correlation between the Pacing Rate and …
A Control Theoretic Approach for Resilient Network Services
Resilient networks can provide the desired level of service despite challenges such as malicious attacks and misconfigurations. The primary goal of this dissertation is to provide uninterrupted network services in the face of attacks or failures. It applies control system theory techniques, with a focus on system identification and closed-loop feedback control, and explores the benefits of system identification techniques in designing and validating models of complex and dynamic networks. Further, this dissertation focuses on designing robust feedback control mechanisms that are both scalable and effective in real time, employing dynamic and predictive control approaches to reduce the impact of an attack on network services. The closed-loop feedback control mechanisms tackle this issue by degrading the network services gracefully to an acceptable level and then stabilizing the network in real time (less than 50 seconds). Employing these feedback mechanisms also provides the ability to automatically configure the settings so that the QoS metrics of the network are consistent with those specified in the service level agreements.
Cooperative Perception for Connected Autonomous Vehicle Edge Computing System
This dissertation first conducts a study on raw-data-level cooperative perception for enhancing the detection ability of self-driving systems for connected autonomous vehicles (CAVs). A LiDAR (Light Detection and Ranging sensor) point cloud-based 3D object detection method is deployed to enhance detection performance by expanding the effective sensing area, capturing critical information in multiple scenarios, and improving detection accuracy. In addition, a point cloud feature-based cooperative perception framework is proposed on an edge computing system for CAVs. This dissertation also uses the features' intrinsically small size to achieve real-time edge computing without running the risk of congesting the network. To distinguish small objects such as pedestrians and cyclists in 3D data, an end-to-end multi-sensor fusion model is developed to perform 3D object detection from multi-sensor data. Experiments show that by solving multiple perception tasks on camera and LiDAR data jointly, the detection model can leverage the advantages of high-resolution images and physical-world LiDAR mapping data, and leads the KITTI benchmark on 3D object detection. Finally, an application of cooperative perception is deployed on the edge to heal the live map for autonomous vehicles. Through 3D reconstruction and multi-sensor fusion detection, experiments on a real-world dataset demonstrate that a high-definition (HD) map on the edge can provide well-sensed local data to CAVs for navigation.
COVID-19 Diagnosis and Segmentation Using Machine Learning Analyses of Lung Computerized Tomography
COVID-19 is a highly contagious and virulent disease caused by the severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2). COVID-19 induces lung changes that are observable in lung computerized tomography (CT), and the percentage of diseased area on the CT correlates with the severity of the disease. Therefore, segmentation of CT images to delineate the diseased or lesioned areas is a logical first step to quantify disease severity, which will help physicians predict disease prognosis and guide early treatments to deliver more positive patient outcomes. It is crucial to develop an automated analysis of CT images to save physicians' time and effort. This dissertation proposes CoviNet, a deep three-dimensional convolutional neural network (3D-CNN) to diagnose COVID-19 in CT images, and CoviNet Enhanced, a hybrid approach with 3D-CNN and support vector machines. It also proposes CoviSegNet and CoviSegNet Enhanced, which are enhanced U-Net models to segment ground-glass opacities and consolidations observed in CT images of COVID-19 patients. We trained and tested the proposed approaches using several public datasets of CT images. The experimental results show the proposed methods are highly effective for COVID-19 detection and segmentation and exhibit better accuracy, precision, sensitivity, specificity, F1 score, Matthews correlation coefficient (MCC), Dice score, and Jaccard index in comparison with recently published studies.
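For readers unfamiliar with the two segmentation metrics named above, the following is a minimal sketch of how the Dice score and Jaccard index are commonly computed for a pair of binary masks. It is not taken from the dissertation; the function name `dice_and_jaccard` and the toy masks are illustrative assumptions, and real CT masks would be 2D or 3D arrays rather than flat lists.

```python
def dice_and_jaccard(pred, truth):
    """Return (dice, jaccard) for two equal-length binary masks.

    Hypothetical helper for illustration only: masks are flat sequences
    of 0/1 values, where 1 marks a voxel labelled as lesion.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks are empty; treat them as a perfect match (a convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


# Toy example: predicted mask versus reference mask.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 1, 1, 0]
dice, jaccard = dice_and_jaccard(predicted, reference)
print(f"Dice = {dice:.3f}, Jaccard = {jaccard:.3f}")
```

The two metrics are monotonically related (Dice = 2J / (1 + J)), so they rank segmentations identically; reporting both mainly eases comparison with studies that quote only one.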
A Data-Driven Computational Framework to Assess the Risk of Epidemics at Global Mass Gatherings
This dissertation presents a data-driven computational epidemic framework to simulate disease epidemics at global mass gatherings. The annual Muslim pilgrimage to Makkah, Saudi Arabia is used to demonstrate the simulation and analysis of various disease transmission scenarios throughout the different stages of the event, from the arrival to the departure of international participants. The proposed agent-based epidemic model efficiently captures the demographic, spatial, and temporal heterogeneity at each stage of the global event of Hajj. Experimental results indicate the substantial impact of the demographic and mobility patterns of the heterogeneous population of pilgrims on the progression of the disease spread in the different stages of Hajj. In addition, these simulations suggest that the differences in the spatial and temporal settings in each stage can significantly affect the dynamics of the disease. Finally, the epidemic simulations conducted at the different stages in this dissertation illustrate the impact of the differences between the duration of each stage in the event and the length of the infectious and latent periods. This research contributes to a better understanding of epidemic modeling in the context of global mass gatherings to predict the risk of disease pandemics caused by associated international travel. The computational modeling and disease spread simulations in global mass gatherings provide public health authorities with powerful tools to assess the implications of these events at different scales and to evaluate the efficacy of control strategies to reduce their potential impacts.
Dataflow Processing in Memory Achieves Significant Energy Efficiency
The large difference between processor cycle time and memory access time, often referred to as the memory wall, severely limits the performance of streaming applications. Some data centers have shown servers being idle three out of four clock cycles. High-performance instruction-sequenced systems are not energy efficient: the execute stage of even simple pipelined processors uses only 9% of the pipeline's total energy. A hybrid dataflow system within a memory module is shown to have 7.2 times the performance and 368 times better energy efficiency than an Intel Xeon server processor on the analyzed benchmarks. The dataflow implementation exploits the inherent parallelism and pipelining of the application to improve performance without the overhead functions of caching, instruction fetch, instruction decode, instruction scheduling, reorder buffers, and speculative execution used by high-performance out-of-order processors. Coarse-grain reconfigurable logic in an energy-efficient silicon process provides the flexibility to implement multiple algorithms in a low-energy solution. Integrating the logic within a 3D stacked memory module provides lower-latency and higher-bandwidth access to memory while operating independently from the host system processor.
Deep Learning Methods to Investigate Online Hate Speech and Counterhate Replies to Mitigate Hateful Content
Hateful content and offensive language are commonplace on social media platforms. Many surveys show that high percentages of social media users experience online harassment. Previous efforts have been made to detect and remove online hate content automatically. However, removing users' content restricts free speech. A complementary strategy that does not interfere with free speech is to counter the hate with new content that diverts the discourse away from the hate. In this dissertation, we address the lack of previous work on counterhate arguments by analyzing and detecting them. First, we study the relationships between hateful tweets and replies. Specifically, we analyze their fine-grained relationships by indicating whether the reply counters the hate, provides a justification, attacks the author of the tweet, or adds additional hate. The most obvious finding is that most replies generally agree with the hateful tweets; only 20% of them counter the hate. Second, we focus on hate directed toward individuals and detect authentic counterhate arguments from online articles. We propose a methodology that assures the authenticity of the argument and its specificity to the individual of interest. We show that finding arguments in online articles is an efficient alternative to counterhate generation approaches that may hallucinate unsupported arguments. Third, we investigate the replies to counterhate tweets beyond whether the reply agrees or disagrees with the counterhate tweet. We analyze the language of the counterhate tweet that leads to certain types of replies and predict which counterhate tweets may elicit more hate instead of stopping it. We find that counterhate tweets containing profanity elicit replies that agree with the counterhate tweet. This dissertation presents several corpora, detailed corpus analyses, and deep learning-based approaches for the three tasks mentioned above.
Deep Learning Optimization and Acceleration
The novelty of this dissertation is the optimization and acceleration of deep neural networks aimed at real-time predictions with minimal energy consumption. It consists of cross-layer optimization, output-directed dynamic quantization, and opportunistic near-data computation for deep neural network acceleration. The proposed optimization and acceleration frameworks are tested on two datasets (CIFAR-10 and CIFAR-100) using a variety of convolutional neural networks (e.g., LeNet-5, VGG-16, GoogLeNet, DenseNet, ResNet). Experimental results are promising when compared to other state-of-the-art deep neural network acceleration efforts in the literature.
Detection and Classification of Heart Sounds Using a Heart-Mobile Interface
Early detection of heart disease can save lives, caution individuals, and help determine the type of treatment to be given to patients. The first test for diagnosing heart disease is auscultation - listening to the heart sounds. The interpretation of heart sounds is subjective and requires professional skill to identify abnormalities in these sounds. A medical practitioner uses a stethoscope to perform an initial screening by listening for irregular sounds from the patient's chest. Later, echocardiography and electrocardiography tests are taken for further diagnosis. However, these tests are expensive and require specialized technicians to operate. A simple and economical alternative is vital for monitoring in homecare, rural hospitals, and urban clinics. This dissertation focuses on developing a low-cost, patient-centered device for initial screening of heart sounds that users can operate on themselves and then use to share the readings with healthcare providers. An innovative mobile health service platform is created for analyzing and classifying heart sounds. Certain properties of heart sounds have to be evaluated to identify irregularities, such as the number of heart beats and gallops, intensity, frequency, and duration. Since heart sounds are generated at low frequencies, human ears tend to miss certain sounds as the high-frequency sounds mask the lower ones. Therefore, this dissertation provides a solution that processes the heart sounds using several signal processing techniques, identifies the features in the heart sounds, and finally classifies them. This dissertation enables remote patient monitoring through the integration of advanced wireless communications and a customized low-cost stethoscope. It also permits remote management of patients' cardiac status while maximizing patient mobility. The smartphone application facilitates recording, processing, visualizing, listening to, and classifying heart sounds.
The application also generates an electronic medical …
Detection of Generalizable Clone Security Coding Bugs Using Graphs and Learning Algorithms
This research methodology isolates coding properties and identifies the probability of security vulnerabilities using machine learning and historical data. Several approaches characterize the effectiveness of detecting security-related bugs that manifest as vulnerabilities, but none utilize vulnerability patch information. The main contribution of this research is a framework that analyzes LLVM Intermediate Representation code and merges core source code representations using source code properties. This research is beneficial because it allows source programs to be transformed into a graphical form from which users can extract specific code properties related to vulnerable functions. The result, based on a performance evaluation, is an improved approach to detect, identify, and track software system vulnerabilities. The methodology uses historical function-level vulnerability information, unique feature extraction techniques, a novel code property graph, and learning algorithms to minimize the amount of end-user domain knowledge necessary to detect vulnerabilities in applications. The analysis shows approximately 99% precision and recall in detecting known vulnerabilities in the National Institute of Standards and Technology (NIST) Software Assurance Metrics and Tool Evaluation (SAMATE) project. Furthermore, 72% of the historical vulnerabilities in the OpenSSL testing environment were detected using a linear support vector classifier (SVC) model.
Detection of Temporal Events and Abnormal Images for Quality Analysis in Endoscopy Videos
Recent reports suggest that measuring objective quality is essential to the success of colonoscopy. Several quality indicators (i.e., metrics) proposed in recent studies are implemented in software systems that compute real-time quality scores for routine screening colonoscopy. Most quality metrics are derived from various temporal events that occur during the colonoscopy procedure. The location of the phase boundary between the insertion and withdrawal phases and the amount of circumferential inspection are two such important temporal events. These two temporal events can be determined by analyzing various camera motions of the colonoscope. This dissertation puts forward a novel method to estimate X, Y, and Z directional motions of the colonoscope using motion vector templates. Since abnormalities in a wireless capsule endoscopy (WCE) or colonoscopy video can be found in only a small number of frames (around 5% of total frames), it is very helpful if a computer system can decide whether a frame has any mucosal abnormalities. Also, the number of abnormal lesions detected during a procedure is used as a quality indicator. The majority of existing abnormality detection methods focus on detecting only one type of abnormality, or their overall accuracy is somewhat low when they try to detect multiple abnormalities. Most abnormalities in endoscopy images have unique textures that are clearly distinguishable from normal textures. In this dissertation, a new method is proposed that detects multiple abnormalities with higher accuracy using a multi-texture analysis technique. The multi-texture analysis method is designed by representing WCE and colonoscopy image textures as textons.
An Efficient Approach for Dengue Mitigation: A Computational Framework
Dengue mitigation is a major research area among scientists who are working towards effective management of the dengue epidemic. Effective dengue mitigation requires several important components: accurate epidemic modeling, efficient epidemic prediction, and efficient resource allocation for controlling the spread of the disease. Past studies assumed that the dengue epidemic responds homogeneously to climate conditions across regions. However, the dengue epidemic is both climate dependent and geographically dependent, so a global model is not sufficient to capture its local variations. We propose a novel method of epidemic modeling that accounts for local variation by using a micro ensemble of regressors for each region. Three regressors are used in the construction of the ensemble: support vector regression, ordinary least squares regression, and k-nearest neighbor regression. The best-performing regressors are selected into the ensemble. The proposed ensemble determines the risk of a dengue epidemic in each region in advance. The risk is then used in risk-based resource allocation. The proposed resource allocation is built on a genetic algorithm with major modifications to its main components, mutation and crossover. It converges faster than the standard genetic algorithm and also produces a better allocation.
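The per-region selection idea described above can be illustrated with a small sketch. This is not the dissertation's implementation: the function names (`fit_ols`, `fit_knn`, `build_region_ensemble`), the single input feature, the validation-error criterion, and the toy data are all assumptions, and support vector regression is left out to keep the example dependency-free.

```python
def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=2):
    """k-nearest neighbor regression: average the targets of the k closest points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def mse(model, xs, ys):
    """Mean squared error of a fitted model on held-out data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def build_region_ensemble(train, valid, keep=1):
    """Fit every candidate on one region, keep the `keep` best by validation MSE."""
    candidates = {"ols": fit_ols(*train), "knn": fit_knn(*train)}
    ranked = sorted(candidates, key=lambda name: mse(candidates[name], *valid))
    chosen = ranked[:keep]

    def predict(x):
        # The ensemble prediction is the mean of the selected regressors.
        return sum(candidates[name](x) for name in chosen) / len(chosen)

    return chosen, predict


# Toy data: region A responds linearly, region B shows a threshold effect.
chosen_a, predict_a = build_region_ensemble(
    train=([1, 2, 3, 4], [2, 4, 6, 8]), valid=([5], [10]))
chosen_b, predict_b = build_region_ensemble(
    train=([1, 2, 3, 4, 5, 6], [0, 0, 0, 10, 10, 10]), valid=([2.5, 4.5], [0, 10]))
print(chosen_a, chosen_b)
```

In this toy setting the linear region selects the least squares model and the threshold region selects the nearest neighbor model, which is the kind of local variation a single global model would miss.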
Epileptic Seizure Detection and Control in the Internet of Medical Things (IoMT) Framework
Epilepsy affects up to 1% of the world's population and approximately 2.5 million people in the United States. A considerable portion (30%) of epilepsy patients are refractory to antiepileptic drugs (AEDs), and surgery is not an effective option if the focus of the seizure is on the eloquent cortex. To overcome the problems with existing solutions, a notable portion of biomedical research is focused on developing an implantable or wearable system for automated seizure detection and control. Seizure detection algorithms based on signal rejection algorithms (SRA), deep neural networks (DNN), and neighborhood component analysis (NCA) have been proposed in the IoMT framework. The algorithms proposed in this work have been validated with both scalp and intracranial electroencephalography (EEG, icEEG), and demonstrate high classification accuracy, sensitivity, and specificity. The occurrence of seizures can be controlled by direct drug injection into the epileptogenic zone, which enhances the efficacy of the AEDs. Piezoelectric and electromagnetic micropumps have been explored for use in a drug delivery unit, as they provide accurate drug flow and reduce power consumption. The reduction in power consumption that results from the minimal circuitry employed by the drug delivery system makes it suitable for practical biomedical applications. The IoMT inclusion enables remote health activity monitoring and remote data sharing and access, which considerably advances the current healthcare modality for epilepsy.
E‐Shape Analysis
The motivation of this work is to understand E-shape analysis and how it can be applied to various classification tasks. Its distinguishing feature is that it looks not at what information is contained, but at how that information looks. This gives E-shape analysis the ability to be language independent and, to some extent, size independent. In this thesis, I present a new mechanism, called E-shape analysis for email, that characterizes an email without using content or context. I explore the applications of the email shape by carrying out a case study, botnet detection, and two possible applications: spam filtering and social-context-based fingerprinting. In the second part of this thesis, I apply E-shape analysis to human activity recognition. Using the Android platform and a T-Mobile G1 phone, I collect data from the triaxial accelerometer and use it to classify the motion behavior of a subject.
Evaluating Appropriateness of EMG and Flex Sensors for Classifying Hand Gestures
Hand and arm gestures are a great way of communicating when you don't want to be heard, quieter and often more reliable than whispering into a radio mike. In recent years, hand gesture identification has become a major active area of research due to its use in various applications. The objective of my work is to develop an integrated sensor system that will enable tactical squads and SWAT teams to communicate when there is no line of sight or when obstacles are present. The gesture set involved in this work is the standardized hand signals for close-range engagement operations used by military and SWAT teams. The gestures are broadly divided into finger movements and arm movements. The core components of the integrated sensor system are surface EMG sensors, flex sensors, and accelerometers. Surface EMG is the electrical activity produced by muscle contractions and measured by sensors directly attached to the skin. Bend sensors use a piezoresistive material to detect the bend. The sensor output is determined by both the angle between the ends of the sensor and the flex radius. Accelerometers sense dynamic acceleration and inclination in three directions simultaneously. EMG sensors are placed on the upper and lower forearm and assist in the classification of the finger and wrist movements. Bend sensors are mounted on a glove that is worn on the hand. The sensors are located over the first knuckle of each finger and can determine whether the finger is bent. An accelerometer is attached to the glove at the base of the wrist and determines the speed and direction of the arm movement. A support vector machine (SVM) classifier is used to classify the gestures.
Evaluation of Call Mobility on Network Productivity in Long Term Evolution Advanced (LTE-A) Femtocells
The demand for higher data rates for indoor and cell-edge users led to the evolution of small cells. LTE femtocells, one of the small cell categories, are low-power, low-cost mobile base stations that are deployed within the coverage area of the traditional macro base station. Cross-tier and co-tier interference occurs only when the macrocell and femtocell share the same frequency channels. Open access (OSG), closed access (CSG), and hybrid access are the three existing access-control methods that decide users' connectivity to the femtocell access point (FAP). We define a network performance function, network productivity, to measure the traffic that is carried successfully. In this dissertation, we evaluate call mobility in an LTE integrated network and determine the optimized network productivity with a variable call arrival rate in a given LTE deployment with femtocell access modes (OSG, CSG, HYBRID) for a given call blocking vector. The solution to the optimization is the maximum network productivity and the call arrival rates for all cells. In the second scenario, we evaluate call mobility in an LTE integrated network with an increasing number of femtocells and maximize network productivity with a variable femtocell distribution per macrocell and a constant call arrival rate in a uniform LTE deployment with femtocell access modes (OSG, CSG, HYBRID) for a given call blocking vector. The solution to the optimization is the maximum network productivity and the call arrival rates for all cells for the network deployment where peak productivity is identified. We analyze the effects of call mobility on network productivity by simulating low, high, and no mobility scenarios and study the impact based on offered load, handover traffic, and blocking probabilities. Finally, we evaluate and optimize the performance of the fractional frequency reuse (FFR) mechanism and study the impact of the proposed metric, weighted user satisfaction, with a sectorized FFR configuration.
Evaluation Techniques and Graph-Based Algorithms for Automatic Summarization and Keyphrase Extraction
Automatic text summarization and keyphrase extraction are two interesting areas of research which extend along natural language processing and information retrieval. They have recently become very popular because of their wide applicability. Devising generic techniques for these tasks is challenging due to several issues. Yet we have a good number of intelligent systems performing the tasks. As different systems are designed with different perspectives, evaluating their performances with a generic strategy is crucial. It has also become immensely important to evaluate the performances with minimal human effort. In our work, we focus on designing a relativized scale for evaluating different algorithms. This is our major contribution which challenges the traditional approach of working with an absolute scale. We consider the impact of some of the environment variables (length of the document, references, and system-generated outputs) on the performance. Instead of defining some rigid lengths, we show how to adjust to their variations. We prove a mathematically sound baseline that should work for all kinds of documents. We emphasize automatically determining the syntactic well-formedness of the structures (sentences). We also propose defining an equivalence class for each unit (e.g. word) instead of the exact string matching strategy. We show an evaluation approach that considers the weighted relatedness of multiple references to adjust to the degree of disagreements between the gold standards. We publish the proposed approach as a free tool so that other systems can use it. We have also accumulated a dataset (scientific articles) with a reference summary and keyphrases for each document. Our approach is applicable not only for evaluating single-document based tasks but also for evaluating multiple-document based tasks. 
We have tested our evaluation method for three intrinsic tasks (taken from DUC 2004 conference), and in all three cases, it correlates positively with ROUGE. Based on our experiments …
Event Sequence Identification and Deep Learning Classification for Anomaly Detection and Predication on High-Performance Computing Systems
High-performance computing (HPC) systems continue growing in both scale and complexity. These large-scale, heterogeneous systems generate tens of millions of log messages every day. Effective log analysis for understanding system behaviors and identifying system anomalies and failures is highly challenging. Existing log analysis approaches use line-by-line message processing. They are not effective for discovering subtle behavior patterns and their transitions, and thus may overlook some critical anomalies. In this dissertation research, I propose a system log event block detection (SLEBD) method that accurately and automatically extracts the log messages belonging to a component or system event into an event block (EB). At the event level, we can discover new event patterns, the evolution of system behavior, and the interaction among different system components. To find critical event sequences, existing sequence mining methods are mostly based on the Apriori algorithm, which is compute-intensive and runs for a long time. I develop a novel, topology-aware sequence mining (TSM) algorithm that efficiently generates sequence patterns from the extracted event block lists. I also train a long short-term memory (LSTM) model to cluster sequences before specific events. With the generated sequence patterns and the trained LSTM model, we can predict whether an event is going to occur normally or not. To accelerate such predictions, I propose a design flow that converts recurrent neural network (RNN) designs into register-transfer level (RTL) implementations that are deployed on FPGAs. Due to their high parallelism and low power, FPGAs achieve a greater speedup and better energy efficiency than CPUs and GPUs according to our experimental results.
Exploration of Visual, Acoustic, and Physiological Modalities to Complement Linguistic Representations for Sentiment Analysis
This research is concerned with the identification of sentiment in multimodal content. This is of particular interest given the increasing presence of subjective multimodal content on the web and other sources, which contains a rich and vast source of people's opinions, feelings, and experiences. Despite the need for tools that can identify opinions in the presence of diverse modalities, most current methods for sentiment analysis are designed for textual data only, and few attempts have been made to address this problem. The dissertation investigates techniques for augmenting linguistic representations with acoustic, visual, and physiological features. The potential benefits of using these modalities include linguistic disambiguation, visual grounding, and the integration of information about people's internal states. The main goal of this work is to build computational resources and tools that allow sentiment analysis to be applied to multimodal data. This thesis makes three important contributions. First, it shows that modalities such as audio, video, and physiological data can be successfully used to improve existing linguistic representations for sentiment analysis. We present a method that integrates linguistic features with features extracted from these modalities. Features are derived from verbal statements, audiovisual recordings, thermal recordings, and physiological sensor signals. The resulting multimodal sentiment analysis system is shown to significantly outperform the use of language alone. Using this system, we were able to predict the sentiment expressed in video reviews and also the sentiment experienced by viewers while exposed to emotionally loaded content. Second, the thesis provides evidence of the portability of the developed strategies to other affect recognition problems. We provide support for this by studying the deception detection problem.
Third, this thesis contributes several multimodal datasets that will enable further research in sentiment and deception detection.
Exploring Physical Unclonable Functions for Efficient Hardware Assisted Security in the IoT
Modern cities are undergoing rapid expansion. The number of connected devices in the networks in and around these cities is increasing every day and will exponentially increase in the next few years. At home, the number of connected devices is also increasing with the introduction of home automation appliances and applications. Many of these appliances are becoming smart devices which can track our daily routines. It is imperative that all these devices should be secure. When cryptographic keys used for encryption and decryption are stored on memory present on these devices, they can be retrieved by attackers or adversaries to gain control of the system. For this purpose, Physical Unclonable Functions (PUFs) were proposed to generate the keys required for encryption and decryption of the data or the communication channel, as required by the application. PUF modules take advantage of the manufacturing variations that are introduced in the Integrated Circuits (ICs) during the fabrication process. These are used to generate the cryptographic keys which reduces the use of a separate memory module to store the encryption and decryption keys. A PUF module can also be reconfigurable such that the number of input output pairs or Challenge Response Pairs (CRPs) generated can be increased exponentially. This dissertation proposes three designs of PUFs, two of which are reconfigurable to increase the robustness of the system.
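To make the challenge-response idea concrete, here is a toy software model of an arbiter-style PUF using the commonly cited additive delay model. It is only a sketch under stated assumptions and does not represent any of the three designs proposed in the dissertation: the class name `ToyArbiterPUF`, the use of a seeded random generator as a stand-in for manufacturing variation, and the Gaussian delay parameters are all hypothetical.

```python
import random


class ToyArbiterPUF:
    """Software stand-in for an arbiter-style PUF (illustrative only).

    A real PUF derives its behavior from physical manufacturing variation;
    here a per-device seed plays that role so the model is reproducible.
    """

    def __init__(self, n_bits, device_seed):
        rng = random.Random(device_seed)
        self.n_bits = n_bits
        # One delay-difference weight per stage, plus a final bias term.
        self.weights = [rng.gauss(0.0, 1.0) for _ in range(n_bits + 1)]

    def response(self, challenge):
        """Return the 1-bit response for a challenge given as a list of 0/1 bits."""
        if len(challenge) != self.n_bits:
            raise ValueError("challenge length must equal n_bits")
        # Additive delay model: feature i is the product of (1 - 2*c_j) for j >= i.
        features = [1.0] * (self.n_bits + 1)
        for i in range(self.n_bits - 1, -1, -1):
            features[i] = features[i + 1] * (1 - 2 * challenge[i])
        delay_difference = sum(w * f for w, f in zip(self.weights, features))
        return 1 if delay_difference > 0 else 0


def challenge_space(n_bits):
    """Number of distinct challenges, and hence possible CRPs, for n_bits."""
    return 2 ** n_bits


puf = ToyArbiterPUF(n_bits=8, device_seed=1234)
challenge = [1, 0, 1, 1, 0, 0, 1, 0]
print("response bit:", puf.response(challenge))
print("challenges available with 8 bits:", challenge_space(8))
```

Each extra challenge bit doubles the number of available challenge-response pairs, which is the exponential growth the abstract refers to; a key is typically assembled from the response bits of many challenges rather than stored in memory.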
Exploring Privacy in Location-based Services Using Cryptographic Protocols
Location-based services (LBS) are available on a variety of mobile platforms such as cell phones and PDAs, and an increasing number of users subscribe to and use these services. Two of the popular models of information flow in LBS are the client-server model and the peer-to-peer model. In both models, existing approaches do not always provide privacy for all parties concerned. In this work, I study the feasibility of applying cryptographic protocols to design privacy-preserving solutions for LBS from an experimental and theoretical standpoint. In the client-server model, I construct a two-phase framework for processing nearest neighbor queries using combinations of cryptographic protocols such as oblivious transfer and private information retrieval. In the peer-to-peer model, I present privacy-preserving solutions for processing group nearest neighbor queries in the semi-honest and dishonest adversarial models. I apply concepts from secure multi-party computation to realize my constructions and also leverage the capabilities of trusted computing technology, specifically TPM chips. My solution for the dishonest adversarial model is also of independent cryptographic interest. I prove my constructions secure under standard cryptographic assumptions, design experiments for testing the feasibility or practicability of my constructions, and benchmark key operations. My experiments show that the proposed constructions are practical to implement and have reasonable costs, while providing strong privacy assurances.
An Extensible Computing Architecture Design for Connected Autonomous Vehicle System
Autonomous vehicles have made milestone strides within the past decade. Advances up the autonomy ladder have come in lock-step with the advances in machine learning, namely deep-learning algorithms and huge, open training sets. And while advances in CPUs have slowed, GPUs have edged into the previous decade's TOP 500 supercomputer territory. This new class of GPUs includes novel deep-learning hardware that has essentially side-stepped Moore's law, outpacing the doubling observation by a factor of ten. While GPUs have made record progress, networks do not follow Moore's law and are restricted by several bottlenecks, from protocol-based latency lower bounds to the very laws of physics. In a way, the bottlenecks that plague modern networks gave rise to Edge computing, a key component of the Connected Autonomous Vehicle system, as the need for low latency in some domains eclipsed the need for massive processing farms. The Connected Autonomous Vehicle ecosystem is one of the most complicated environments in all of computing. Not only is the hardware scaled all the way from 16- and 32-bit microcontrollers to multi-CPU Edge nodes and multi-GPU Cloud servers, but the networking also encompasses the gamut of modern communication transports. I propose a framework for negotiating, encapsulating and transferring data between vehicles, ensuring efficient bandwidth utilization and respecting real-time privacy levels.
Extracting Dimensions of Interpersonal Interactions and Relationships
People interact with each other through natural language to express feelings, thoughts, intentions, instructions, etc. These interactions in turn form relationships. Besides names of relationships like siblings, spouse, friends, etc., a number of dimensions (e.g., cooperative vs. competitive, temporary vs. enduring, equal vs. hierarchical) can also be used to capture the underlying properties of interpersonal interactions and relationships. More fine-grained descriptors (e.g., angry, rude, nice, supportive) can also be used to indicate the reasons or social acts behind the dimension cooperative vs. competitive. The way people interact with others may also tell us about their personal traits, which in turn may be indicative of their probable future success. The works presented in the dissertation involve creating corpora with fine-grained descriptors of interactions and relationships. We also describe experiments whose results indicate that the process of identifying the dimensions can be automated.
Extracting Possessions and Their Attributes
Possession is an asymmetric semantic relation between two entities, where one entity (the possessee) belongs to the other entity (the possessor). Automatically extracting possessions is useful for identifying skills, for recommender systems, and for natural language understanding. Possessions can be found in different communication modalities including text, images, video, and audio. In this dissertation, I elaborate on the techniques I used to extract possessions. I begin with extracting possessions at the sentence level, including the type and temporal anchors. Then, I extract the duration of possession and co-possessions (if multiple possessors possess the same entity). Next, I extract possessions from an entire Wikipedia article, capturing the change of possessors over time. I extract possessions from social media, including both text and images. Finally, I also present dense annotations generating possession timelines. I present separate datasets, detailed corpus analysis, and machine learning models for each task described above.
Extracting Temporally-Anchored Spatial Knowledge
In my dissertation, I elaborate on the work that I have done to extract temporally-anchored spatial knowledge from text, including both intra- and inter-sentential knowledge. I also detail multiple approaches to infer the spatial timeline of a person from biographies and social media. I present and analyze two strategies to annotate information regarding whether a given entity is or is not located at some location, and for how long with respect to an event. Specifically, I leverage semantic roles or syntactic dependencies to generate potential spatial knowledge and then crowdsource annotations to validate the potential knowledge. The resulting annotations indicate how long entities are or are not located somewhere, and temporally anchor this spatial information. I present an in-depth corpus analysis and experiments comparing the spatial knowledge generated by manipulating roles or dependencies. In my work, I also explore research methodologies that go beyond single sentences and extract spatio-temporal information from text. Spatial timelines refer to a chronological order of locations where a target person is or is not located. I present a corpus and experiments to extract spatial timelines from Wikipedia biographies. I present my work on determining locations and the order in which they are actually visited by a person from their travel experiences. Specifically, I extract spatio-temporal graphs that capture the order (edges) of locations (nodes) visited by a person. Further, I detail my experiments that leverage both text and images to extract the spatial timeline of a person from Twitter.
Extrapolating Subjectivity Research to Other Languages
Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language represents a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions which are not available for scrutiny by an outside observer; in the field of natural language processing, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative or neutral. In this thesis, I investigate techniques for generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers and parsers, or basic resources such as electronic text, annotated corpora or lexica. This severely limits the implementation of techniques on par with those developed for English. By applying methods that make lighter use of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the techniques proposed employ a lever approach, where English often acts as the donor language (the fulcrum in a lever) and allows preliminary subjectivity research to be established in a target language with a relatively minimal amount of effort.
Finding Meaning in Context Using Graph Algorithms in Mono- and Cross-lingual Settings
Making computers automatically find the appropriate meaning of words in context is an interesting problem that has proven to be one of the most challenging tasks in natural language processing (NLP). Widespread potential applications of a possible solution to the problem could be envisaged in several NLP tasks such as text simplification, language learning, machine translation, query expansion, information retrieval and text summarization. Ambiguity of words has always been a challenge in these applications, and the traditional endeavor to solve the problem of this ambiguity, namely doing word sense disambiguation using resources like WordNet, has been fraught with debate about the feasibility of the granularity that exists in WordNet senses. The recent trend has therefore been to move away from enforcing any given lexical resource upon automated systems from which to pick potential candidate senses, and to instead encourage them to pick and choose their own resources. Given a sentence with a target ambiguous word, an alternative solution consists of picking potential candidate substitutes for the target, filtering the list of candidates to a much shorter list using various heuristics, and trying to match these system predictions against a human-generated gold standard, with a view to ensuring that the meaning of the sentence does not change after the substitutions. This solution has manifested itself in the SemEval 2007 task of lexical substitution and the more recent SemEval 2010 task of cross-lingual lexical substitution (which I helped organize), where given an English context and a target word within that context, the systems are required to provide between one and ten appropriate substitutes (in English) or translations (in Spanish) for the target word. In this dissertation, I present a comprehensive overview of state-of-the-art research and describe new experiments to tackle the tasks of lexical substitution and cross-lingual lexical substitution.
In particular …
Frameworks for Attribute-Based Access Control (ABAC) Policy Engineering
In this dissertation we propose semi-automated top-down policy engineering approaches for attribute-based access control (ABAC) development. Further, we propose a hybrid ABAC policy engineering approach to combine the benefits and address the shortcomings of both top-down and bottom-up approaches. In particular, we propose three frameworks: (i) ABAC attributes extraction, (ii) ABAC constraints extraction, and (iii) hybrid ABAC policy engineering. The attributes extraction framework comprises five modules that operate together to extract attribute values from natural language access control policies (NLACPs); map the extracted values to attribute keys; and assign each key-value pair to an appropriate entity. For the ABAC constraints extraction framework, we design a two-phase process to extract ABAC constraints from NLACPs. The process begins with the identification phase, which focuses on identifying the right boundary of constraint expressions. Next is the normalization phase, which aims at extracting the actual elements that pose a constraint. On the other hand, our hybrid ABAC policy engineering framework consists of five modules. This framework combines top-down and bottom-up policy engineering techniques to overcome the shortcomings of both approaches and to generate policies that are more intuitive and relevant to actual organization policies. With this, we believe that our work takes essential steps towards a semi-automated ABAC policy development experience.
Geostatistical Inspired Metamodeling and Optimization of Nanoscale Analog Circuits
The current trend towards miniaturization of modern consumer electronic devices significantly affects their design. The demand for efficient all-in-one appliances leads to smaller, yet more complex and powerful nanoelectronic devices. The increasing complexity in the design of such nanoscale Analog/Mixed-Signal Systems-on-Chip (AMS-SoCs) presents difficult challenges to designers. One promising design method used to mitigate the burden of this design effort is the use of metamodeling (surrogate modeling) techniques. Their use significantly reduces the time for computer simulation and design space exploration and optimization. This dissertation addresses several issues of metamodeling-based nanoelectronic AMS design exploration. A surrogate modeling technique which uses geostatistical Kriging prediction methods in creating metamodels is proposed. Kriging prediction techniques take into account the correlation effects between input parameters for performance point prediction. We propose the use of Kriging to utilize this property for the accurate modeling of process variation effects of designs in the deep nanometer region. Different Kriging methods have been explored for this work, such as simple and ordinary Kriging. We also propose another metamodeling technique, the Kriging-Bootstrapped Neural Network, that combines the accuracy and process variation awareness of Kriging with artificial neural network models for ultra-fast and accurate process-aware metamodeling design. The proposed methodologies combine Kriging metamodels with selected algorithms for ultra-fast layout optimization. The selected algorithms explored are: Gravitational Search Algorithm (GSA), Simulated Annealing Optimization (SAO), and Ant Colony Optimization (ACO). Experimental results demonstrate that the proposed Kriging metamodel-based methodologies can perform the optimizations with minimal computational burden compared to traditional (SPICE-based) design flows.
Helping Students with Upper Limb Motor Impairments Program in a Block-Based Programming Environment Using Voice
Students with upper body motor impairments, such as cerebral palsy, multiple sclerosis, ALS, etc., face challenges when learning to program in block-based programming environments, because these environments are highly dependent on the physical manipulation of a mouse or keyboard to drag and drop elements on the screen. In my dissertation, I make the block-based programming environment Blockly accessible to students with upper body motor impairments by adding speech as an alternative form of input. This voice-enabled version of Blockly will reduce the need for the use of a mouse or keyboard, making it more accessible to students with upper body motor impairments. The voice-enabled Blockly system consists of the original Blockly application, a speech recognition API, predefined voice commands, and a custom function. Three user studies have been conducted: a preliminary study, a usability study, and an A/B test. These studies revealed the need for simpler, shorter, and more intuitive commands, the need to change the target audience, the shortcomings of speech recognition systems, etc. The feedback received from each study influenced design decisions at different phases. The findings also gave me insight into the direction I would like to go in the future. This work was started and finished in 2 years.
Hybrid Approaches in Test Suite Prioritization
The rapid advancement of web and mobile application technologies has recently posed numerous challenges to the Software Engineering community, including how to cost-effectively test applications that have complex event spaces. Many software testing techniques attempt to cost-effectively improve the quality of such software. This dissertation primarily focuses on hybrid test suite prioritization. The techniques utilize two or more criteria to perform test suite prioritization, as it is often insufficient to use only a single criterion. The dissertation consists of the following contributions: (1) a weighted test suite prioritization technique that employs the distance between criteria as a weighting factor, (2) a coarse-to-fine grained test suite prioritization technique that uses a multilevel approach to increase the granularity of the criteria at each subsequent iteration, (3) the Caret-HM tool for Android user session-based testing that allows testers to record, replay, and create heat maps from user interactions with Android applications via a web browser, and (4) Android user session-based test suite prioritization techniques that utilize heuristics developed from user sessions created by Caret-HM. Each of the chapters empirically evaluates the respective techniques. The proposed techniques generally show improved or equally good performance when compared to the baselines, depending on the application under test. Further, this dissertation provides guidance to testers as it relates to the use of the proposed hybrid techniques.
Hybrid Optimization Models for Depot Location-Allocation and Real-Time Routing of Emergency Deliveries
Prompt and efficient intervention is vital in reducing casualty figures during epidemic outbreaks, disasters, sudden civil strife or terrorist attacks. This can only be achieved if there is a fit-for-purpose and location-specific emergency response plan in place, incorporating geographical, time and vehicular capacity constraints. In this research, a comprehensive emergency response model for situations of uncertainty (in locations' demand and available resources), typically obtainable in low-resource countries, is designed. It involves the development of algorithms for optimizing pre- and post-disaster activities. The studies result in the development of four models: (1) an adaptation of a machine learning clustering algorithm, for pre-positioning depots and emergency operation centers, which optimizes the placement of these depots, such that the largest geographical location is covered, and the maximum number of individuals reached, with minimal facility cost; (2) an optimization algorithm for routing relief distribution, using heterogeneous fleets of vehicles, with considerations for uncertainties in humanitarian supplies; (3) a genetic algorithm-based route improvement model; and (4) a model for integrating possible new locations into the routing network, in real-time, using emergency severity ranking, with a high priority on the most-vulnerable population. The clustering approach to solving the depot location-allocation problem yields a better time complexity, and benchmarking the routing algorithm against existing approaches yields competitive outcomes.