Search Results

Accurate Joint Detection from Depth Videos towards Pose Analysis
Joint detection is vital for characterizing human pose and serves as a foundation for a wide range of computer vision applications such as physical training, health care, entertainment. This dissertation proposed two methods to detect joints in the human body for pose analysis. The first method detects joints by combining body model and automatic feature points detection together. The human body model maps the detected extreme points to the corresponding body parts of the model and detects the position of implicit joints. The dominant joints are detected after implicit joints and extreme points are located by a shortest path based methods. The main contribution of this work is a hybrid framework to detect joints on the human body to achieve robustness to different body shapes or proportions, pose variations and occlusions. Another contribution of this work is the idea of using geodesic features of the human body to build a model for guiding the human pose detection and estimation. The second proposed method detects joints by segmenting human body into parts first and then detect joints by making the detection algorithm focusing on each limb. The advantage of applying body part segmentation first is that the body segmentation method narrows down the searching area for each joint so that the joint detection method can provide more stable and accurate results.
Adaptive Power Management for Autonomic Resource Configuration in Large-scale Computer Systems
In order to run and manage resource-intensive high-performance applications, large-scale computing and storage platforms have been evolving rapidly in various domains in both academia and industry. The energy expenditure consumed to operate and maintain these cloud computing infrastructures is a major factor to influence the overall profit and efficiency for most cloud service providers. Moreover, considering the mitigation of environmental damage from excessive carbon dioxide emission, the amount of power consumed by enterprise-scale data centers should be constrained for protection of the environment.Generally speaking, there exists a trade-off between power consumption and application performance in large-scale computing systems and how to balance these two factors has become an important topic for researchers and engineers in cloud and HPC communities. Therefore, minimizing the power usage while satisfying the Service Level Agreements have become one of the most desirable objectives in cloud computing research and implementation. Since the fundamental feature of the cloud computing platform is hosting workloads with a variety of characteristics in a consolidated and on-demand manner, it is demanding to explore the inherent relationship between power usage and machine configurations. Subsequently, with an understanding of these inherent relationships, researchers are able to develop effective power management policies to optimize productivity by balancing power usage and system performance. In this dissertation, we develop an autonomic power-aware system management framework for large-scale computer systems. We propose a series of techniques including coarse-grain power profiling, VM power modelling, power-aware resource auto-configuration and full-system power usage simulator. These techniques help us to understand the characteristics of power consumption of various system components. Based on these techniques, we are able to test various job scheduling strategies and develop resource management approaches to enhance the systems' power efficiency.
Advanced Power Amplifiers Design for Modern Wireless Communication
Modern wireless communication systems use spectrally efficient modulation schemes to reach high data rate transmission. These schemes are generally involved with signals with high peak-to-average power ratio (PAPR). Moreover, the development of next generation wireless communication systems requires the power amplifiers to operate over a wide frequency band or multiple frequency bands to support different applications. These wide-band and multi-band solutions will lead to reductions in both the size and cost of the whole system. This dissertation presents several advanced power amplifier solutions to provide wide-band and multi-band operations with efficiency improvement at power back-offs.
Analysis and Optimization of Graphene FET based Nanoelectronic Integrated Circuits
Like cell to the human body, transistors are the basic building blocks of any electronics circuits. Silicon has been the industries obvious choice for making transistors. Transistors with large size occupy large chip area, consume lots of power and the number of functionalities will be limited due to area constraints. Thus to make the devices smaller, smarter and faster, the transistors are aggressively scaled down in each generation. Moore's law states that the transistors count in any electronic circuits doubles every 18 months. Following this Moore's law, the transistor has already been scaled down to 14 nm. However there are limitations to how much further these transistors can be scaled down. Particularly below 10 nm, these silicon based transistors hit the fundamental limits like loss of gate control, high leakage and various other short channel effects. Thus it is not possible to favor the silicon transistors for future electronics applications. As a result, the research has shifted to new device concepts and device materials alternative to silicon. Carbon is the next abundant element found in the Earth and one of such carbon based nanomaterial is graphene. Graphene when extracted from Graphite, the same material used as the lid in pencil, have a tremendous potential to take future electronics devices to new heights in terms of size, cost and efficiency. Thus after its first experimental discovery of graphene in 2004, graphene has been the leading research area for both academics as well as industries. This dissertation is focused on the analysis and optimization of graphene based circuits for future electronics. The first part of this dissertation considers graphene based transistors for analog/radio frequency (RF) circuits. In this section, a dual gate Graphene Field Effect Transistor (GFET) is considered to build the case study circuits like voltage controlled oscillator (VCO) and low …
Analysis and Performance of a Cyber-Human System and Protocols for Geographically Separated Collaborators
This dissertation provides an innovative mechanism to collaborate two geographically separated people on a physical task and a novel method to measure Complexity Index (CI) and calculate Minimal Complexity Index (MCI) of a collaboration protocol. The protocol is represented as a structure, and the information content of it is measured in bits to understand the complex nature of the protocol. Using the complexity metrics, one can analyze the performance of a collaborative system and a collaboration protocol. Security and privacy of the consumers are vital while seeking remote help; this dissertation also provides a novel authorization framework for dynamic access control of resources on an input-constrained appliance used for completing the physical task. Using the innovative Collaborative Appliance for REmote-help (CARE) and with the support of a remotely located expert, fifty-nine subjects with minimal or no prior mechanical knowledge are able to elevate a car for replacing a tire in an average time of six minutes and 53 seconds and with an average protocol complexity of 171.6 bits. Moreover, thirty subjects with minimal or no prior plumbing knowledge are able to change the cartridge of a faucet in an average time of ten minutes and with an average protocol complexity of 250.6 bits. Our experiments and results show that one can use the developed mechanism and methods for expanding the protocols for a variety of home, vehicle, and appliance repairs and installations.
Application of Adaptive Techniques in Regression Testing for Modern Software Development
In this dissertation we investigate the applicability of different adaptive techniques to improve the effectiveness and efficiency of the regression testing. Initially, we introduce the concept of regression testing. We then perform a literature review of current practices and state-of-the-art regression testing techniques. Finally, we advance the regression testing techniques by performing four empirical studies in which we use different types of information (e.g. user session, source code, code commit, etc.) to investigate the effectiveness of each software metric on fault detection capability for different software environments. In our first empirical study, we show the effectiveness of applying user session information for test case prioritization. In our next study, we apply learning from the previous study, and implement a collaborative filtering recommender system for test case prioritization, which uses user sessions and change history information as input parameter, and return the risk score associated with each component. Results of this study show that our recommender system improves the effectiveness of test prioritization; the performance of our approach was particularly noteworthy when we were under time constraints. We then investigate the merits of multi-objective testing over single objective techniques with a graph-based testing framework. Results of this study indicate that the use of the graph-based technique reduces the algorithm execution time considerably, while being just as effective as the greedy algorithms in terms of fault detection rate. Finally, we apply the knowledge from the previous studies and implement a query answering framework for regression test selection. This framework is built based on a graph database and uses fault history information and test diversity in attempt to select the most effective set of test cases in term of fault detection capability. Our empirical evaluation of this study with four open source programs shows that our approach can be effective and efficient by …
Application-Specific Things Architectures for IoT-Based Smart Healthcare Solutions
Human body is a complex system organized at different levels such as cells, tissues and organs, which contributes to 11 important organ systems. The functional efficiency of this complex system is evaluated as health. Traditional healthcare is unable to accommodate everyone's need due to the ever-increasing population and medical costs. With advancements in technology and medical research, traditional healthcare applications are shaping into smart healthcare solutions. Smart healthcare helps in continuously monitoring our body parameters, which helps in keeping people health-aware. It provides the ability for remote assistance, which helps in utilizing the available resources to maximum potential. The backbone of smart healthcare solutions is Internet of Things (IoT) which increases the computing capacity of the real-world components by using cloud-based solutions. The basic elements of these IoT based smart healthcare solutions are called "things." Things are simple sensors or actuators, which have the capacity to wirelessly connect with each other and to the internet. The research for this dissertation aims in developing architectures for these things, focusing on IoT-based smart healthcare solutions. The core for this dissertation is to contribute to the research in smart healthcare by identifying applications which can be monitored remotely. For this, application-specific thing architectures were proposed based on monitoring a specific body parameter; monitoring physical health for family and friends; and optimizing the power budget of IoT body sensor network using human body communications. The experimental results show promising scope towards improving the quality of life, through needle-less and cost-effective smart healthcare solutions.
Automated Real-time Objects Detection in Colonoscopy Videos for Quality Measurements
The effectiveness of colonoscopy depends on the quality of the inspection of the colon. There was no automated measurement method to evaluate the quality of the inspection. This thesis addresses this issue by investigating an automated post-procedure quality measurement technique and proposing a novel approach automatically deciding a percentage of stool areas in images of digitized colonoscopy video files. It involves the classification of image pixels based on their color features using a new method of planes on RGB (red, green and blue) color space. The limitation of post-procedure quality measurement is that quality measurements are available long after the procedure was done and the patient was released. A better approach is to inform any sub-optimal inspection immediately so that the endoscopist can improve the quality in real-time during the procedure. This thesis also proposes an extension to post-procedure method to detect stool, bite-block, and blood regions in real-time using color features in HSV color space. These three objects play a major role in quality measurements in colonoscopy. The proposed method partitions very large positive examples of each of these objects into a number of groups. These groups are formed by taking intersection of positive examples with a hyper plane. This hyper plane is named as 'positive plane'. 'Convex hulls' are used to model positive planes. Comparisons with traditional classifiers such as K-nearest neighbor (K-NN) and support vector machines (SVM) proves the soundness of the proposed method in terms of accuracy and speed that are critical in the targeted real-time quality measurement system.
Computational Approaches for Analyzing Social Support in Online Health Communities
Online health communities (OHCs) have become a medium for patients to share their personal experiences and interact with peers on topics related to a disease, medication, side effects, and therapeutic processes. Many studies show that using OHCs regularly decreases mortality and improves patients mental health. As a result of their benefits, OHCs are a popular place for patients to refer to, especially patients with a severe disease, and to receive emotional and informational support. The main reasons for developing OHCs are to present valid and high-quality information and to understand the mechanism of social support in changing patients' mental health. Given the purpose of OHC moderators for developing OHCs applications and the purpose of patients for using OHCs, there is no facility, feature, or sub-application in OHCs to satisfy patient and moderator goals. OHCs are only equipped with a primary search engine that is a keyword-based search tool. In other words, if a patient wants to obtain information about a side-effect, he/she needs to browse many threads in the hope that he/she can find several related comments. In the same way, OHC moderators cannot browse all information which is exchanged among patients to validate their accuracy. Thus, it is critical for OHCs to be equipped with computational tools which are supported by several sophisticated computational models that provide moderators and patients with the collection of messages that they need for making decisions or predictions. We present multiple computational models to alleviate the problem of OHCs in providing specific types of messages in response to the specific moderator and patient needs. Specifically, we focused on proposing computational models for the following tasks: identifying emotional support, which presents OHCs moderators, psychologists, and sociologists with insightful views on the emotional states of individuals and groups, and identifying informational support, which provides patients with …
A Computational Methodology for Addressing Differentiated Access of Vulnerable Populations During Biological Emergencies
Mitigation response plans must be created to protect affected populations during biological emergencies resulting from the release of harmful biochemical substances. Medical countermeasures have been stockpiled by the federal government for such emergencies. However, it is the responsibility of local governments to maintain solid, functional plans to apply these countermeasures to the entire target population within short, mandated time frames. Further, vulnerabilities in the population may serve as barriers preventing certain individuals from participating in mitigation activities. Therefore, functional response plans must be capable of reaching vulnerable populations.Transportation vulnerability results from lack of access to transportation. Transportation vulnerable populations located too far from mitigation resources are at-risk of not being able to participate in mitigation activities. Quantification of these populations requires the development of computational methods to integrate spatial demographic data and transportation resource data from disparate sources into the context of planned mitigation efforts. Research described in this dissertation focuses on quantifying transportation vulnerable populations and maximizing participation in response efforts. Algorithms developed as part of this research are integrated into a computational framework to promote a transition from research and development to deployment and use by biological emergency planners.
Computational Methods to Optimize High-Consequence Variants of the Vehicle Routing Problem for Relief Networks in Humanitarian Logistics
Optimization of relief networks in humanitarian logistics often exemplifies the need for solutions that are feasible given a hard constraint on time. For instance, the distribution of medical countermeasures immediately following a biological disaster event must be completed within a short time-frame. When these supplies are not distributed within the maximum time allowed, the severity of the disaster is quickly exacerbated. Therefore emergency response plans that fail to facilitate the transportation of these supplies in the time allowed are simply not acceptable. As a result, all optimization solutions that fail to satisfy this criterion would be deemed infeasible. This creates a conflict with the priority optimization objective in most variants of the generic vehicle routing problem (VRP). Instead of efficiently maximizing usage of vehicle resources available to construct a feasible solution, these variants ordinarily prioritize the construction of a minimum cost set of vehicle routes. Research presented in this dissertation focuses on the design and analysis of efficient computational methods for optimizing high-consequence variants of the VRP for relief networks. The conflict between prioritizing the minimization of the number of vehicles required or the minimization of total travel time is demonstrated. The optimization of the time and capacity constraints in the context of minimizing the required vehicles are independently examined. An efficient meta-heuristic algorithm based on a continuous spatial partitioning scheme is presented for constructing a minimized set of vehicle routes in practical instances of the VRP that include critically high-cost penalties. Multiple optimization priority strategies that extend this algorithm are examined and compared in a large-scale bio-emergency case study. The algorithms designed from this research are implemented and integrated into an existing computational framework that is currently used by public health officials. These computational tools enhance an emergency response planner's ability to derive a set of vehicle routes specifically …
Content and Temporal Analysis of Communications to Predict Task Cohesion in Software Development Global Teams
Virtual teams in industry are increasingly being used to develop software, create products, and accomplish tasks. However, analyzing those collaborations under same-time/different-place conditions is well-known to be difficult. In order to overcome some of these challenges, this research was concerned with the study of collaboration-based, content-based and temporal measures and their ability to predict cohesion within global software development projects. Messages were collected from three software development projects that involved students from two different countries. The similarities and quantities of these interactions were computed and analyzed at individual and group levels. Results of interaction-based metrics showed that the collaboration variables most related to Task Cohesion were Linguistic Style Matching and Information Exchange. The study also found that Information Exchange rate and Reply rate have a significant and positive correlation to Task Cohesion, a factor used to describe participants' engagement in the global software development process. This relation was also found at the Group level. All these results suggest that metrics based on rate can be very useful for predicting cohesion in virtual groups. Similarly, content features based on communication categories were used to improve the identification of Task Cohesion levels. This model showed mixed results, since only Work similarity and Social rate were found to be correlated with Task Cohesion. This result can be explained by how a group's cohesiveness is often associated with fairness and trust, and that these two factors are often achieved by increased social and work communications. Also, at a group-level, all models were found correlated to Task Cohesion, specifically, Similarity+Rate, which suggests that models that include social and work communication categories are also good predictors of team cohesiveness. Finally, temporal interaction similarity measures were calculated to assess their prediction capabilities in a global setting. Results showed a significant negative correlation between the Pacing Rate and …
A Control Theoretic Approach for Resilient Network Services
Resilient networks have the ability to provide the desired level of service, despite challenges such as malicious attacks and misconfigurations. The primary goal of this dissertation is to be able to provide uninterrupted network services in the face of an attack or any failures. This dissertation attempts to apply control system theory techniques with a focus on system identification and closed-loop feedback control. It explores the benefits of system identification technique in designing and validating the model for the complex and dynamic networks. Further, this dissertation focuses on designing robust feedback control mechanisms that are both scalable and effective in real-time. It focuses on employing dynamic and predictive control approaches to reduce the impact of an attack on network services. The closed-loop feedback control mechanisms tackle this issue by degrading the network services gracefully to an acceptable level and then stabilizing the network in real-time (less than 50 seconds). Employing these feedback mechanisms also provide the ability to automatically configure the settings such that the QoS metrics of the network is consistent with those specified in the service level agreements.
A Data-Driven Computational Framework to Assess the Risk of Epidemics at Global Mass Gatherings
This dissertation presents a data-driven computational epidemic framework to simulate disease epidemics at global mass gatherings. The annual Muslim pilgrimage to Makkah, Saudi Arabia is used to demonstrate the simulation and analysis of various disease transmission scenarios throughout the different stages of the event from the arrival to the departure of international participants. The proposed agent-based epidemic model efficiently captures the demographic, spatial, and temporal heterogeneity at each stage of the global event of Hajj. Experimental results indicate the substantial impact of the demographic and mobility patterns of the heterogeneous population of pilgrims on the progression of the disease spread in the different stages of Hajj. In addition, these simulations suggest that the differences in the spatial and temporal settings in each stage can significantly affect the dynamic of the disease. Finally, the epidemic simulations conducted at the different stages in this dissertation illustrate the impact of the differences between the duration of each stage in the event and the length of the infectious and latent periods. This research contributes to a better understanding of epidemic modeling in the context of global mass gatherings to predict the risk of disease pandemics caused by associated international travel. The computational modeling and disease spread simulations in global mass gatherings provide public health authorities with powerful tools to assess the implication of these events at a different scale and to evaluate the efficacy of control strategies to reduce their potential impacts.
Dataflow Processing in Memory Achieves Significant Energy Efficiency
The large difference between processor CPU cycle time and memory access time, often referred to as the memory wall, severely limits the performance of streaming applications. Some data centers have shown servers being idle three out of four clocks. High performance instruction sequenced systems are not energy efficient. The execute stage of even simple pipeline processors only use 9% of the pipeline's total energy. A hybrid dataflow system within a memory module is shown to have 7.2 times the performance with 368 times better energy efficiency than an Intel Xeon server processor on the analyzed benchmarks. The dataflow implementation exploits the inherent parallelism and pipelining of the application to improve performance without the overhead functions of caching, instruction fetch, instruction decode, instruction scheduling, reorder buffers, and speculative execution used by high performance out-of-order processors. Coarse grain reconfigurable logic in an energy efficient silicon process provides flexibility to implement multiple algorithms in a low energy solution. Integrating the logic within a 3D stacked memory module provides lower latency and higher bandwidth access to memory while operating independently from the host system processor.
Detection and Classification of Heart Sounds Using a Heart-Mobile Interface
An early detection of heart disease can save lives, caution individuals and also help to determine the type of treatment to be given to the patients. The first test of diagnosing a heart disease is through auscultation - listening to the heart sounds. The interpretation of heart sounds is subjective and requires a professional skill to identify the abnormalities in these sounds. A medical practitioner uses a stethoscope to perform an initial screening by listening for irregular sounds from the patient's chest. Later, echocardiography and electrocardiography tests are taken for further diagnosis. However, these tests are expensive and require specialized technicians to operate. A simple and economical way is vital for monitoring in homecare or rural hospitals and urban clinics. This dissertation is focused on developing a patient-centered device for initial screening of the heart sounds that is both low cost and can be used by the users on themselves, and later share the readings with the healthcare providers. An innovative mobile health service platform is created for analyzing and classifying heart sounds. Certain properties of heart sounds have to be evaluated to identify the irregularities such as the number of heart beats and gallops, intensity, frequency, and duration. Since heart sounds are generated in low frequencies, human ears tend to miss certain sounds as the high frequency sounds mask the lower ones. Therefore, this dissertation provides a solution to process the heart sounds using several signal processing techniques, identifies the features in the heart sounds and finally classifies them. This dissertation enables remote patient monitoring through the integration of advanced wireless communications and a customized low-cost stethoscope. It also permits remote management of patients' cardiac status while maximizing patient mobility. The smartphone application facilities recording, processing, visualizing, listening, and classifying heart sounds. The application also generates an electronic medical …
Detection of Generalizable Clone Security Coding Bugs Using Graphs and Learning Algorithms
This research methodology isolates coding properties and identifies the probability of security vulnerabilities using machine learning and historical data. Several approaches characterize the effectiveness of detecting security-related bugs that manifest as vulnerabilities, but none utilize vulnerability patch information. The main contribution of this research is a framework to analyze LLVM Intermediate Representation Code and merging core source code representations using source code properties. This research is beneficial because it allows source programs to be transformed into a graphical form and users can extract specific code properties related to vulnerable functions. The result is an improved approach to detect, identify, and track software system vulnerabilities based on a performance evaluation. The methodology uses historical function level vulnerability information, unique feature extraction techniques, a novel code property graph, and learning algorithms to minimize the amount of end user domain knowledge necessary to detect vulnerabilities in applications. The analysis shows approximately 99% precision and recall to detect known vulnerabilities in the National Institute of Standards and Technology (NIST) Software Assurance Metrics and Tool Evaluation (SAMATE) project. Furthermore, 72% percent of the historical vulnerabilities in the OpenSSL testing environment were detected using a linear support vector classifier (SVC) model.
Detection of Temporal Events and Abnormal Images for Quality Analysis in Endoscopy Videos
Recent reports suggest that measuring the objective quality is very essential towards the success of colonoscopy. Several quality indicators (i.e. metrics) proposed in recent studies are implemented in software systems that compute real-time quality scores for routine screening colonoscopy. Most quality metrics are derived based on various temporal events occurred during the colonoscopy procedure. The location of the phase boundary between the insertion and the withdrawal phases and the amount of circumferential inspection are two such important temporal events. These two temporal events can be determined by analyzing various camera motions of the colonoscope. This dissertation put forward a novel method to estimate X, Y and Z directional motions of the colonoscope using motion vector templates. Since abnormalities of a WCE or a colonoscopy video can be found in a small number of frames (around 5% out of total frames), it is very helpful if a computer system can decide whether a frame has any mucosal abnormalities. Also, the number of detected abnormal lesions during a procedure is used as a quality indicator. Majority of the existing abnormal detection methods focus on detecting only one type of abnormality or the overall accuracies are somewhat low if the method tries to detect multiple abnormalities. Most abnormalities in endoscopy images have unique textures which are clearly distinguishable from normal textures. In this dissertation a new method is proposed that achieves the objective of detecting multiple abnormalities with a higher accuracy using a multi-texture analysis technique. The multi-texture analysis method is designed by representing WCE and colonoscopy image textures as textons.
An Efficient Approach for Dengue Mitigation: A Computational Framework
Dengue mitigation is a major research area among scientist who are working towards an effective management of the dengue epidemic. An effective dengue mitigation requires several other important components. These components include an accurate epidemic modeling, an efficient epidemic prediction, and an efficient resource allocation for controlling of the spread of the dengue disease. Past studies assumed homogeneous response pattern of the dengue epidemic to climate conditions throughout the regions. The dengue epidemic is climate dependent and also it is geographically dependent. A global model is not sufficient to capture the local variations of the epidemic. We propose a novel method of epidemic modeling considering local variation and that uses micro ensemble of regressors for each region. There are three regressors that are used in the construction of the ensemble. These are support vector regression, ordinary least square regression, and a k-nearest neighbor regression. The best performing regressors get selected into the ensemble. The proposed ensemble determines the risk of dengue epidemic in each region in advance. The risk is then used in risk-based resource allocation. The proposing resource allocation is built based on the genetic algorithm. The algorithm exploits the genetic algorithm with major modifications to its main components, mutation and crossover. The proposed resource allocation converges faster than the standard genetic algorithm and also produces a better allocation compared to the standard algorithm.
Enhancing Storage Dependability and Computing Energy Efficiency for Large-Scale High Performance Computing Systems
With the advent of information explosion age, larger capacity disk drives are used to store data and powerful devices are used to process big data. As the scale and complexity of computer systems increase, we expect these systems to provide dependable and energy-efficient services and computation. Although hard drives are reliable in general, they are the most commonly replaced hardware components. Disk failures cause data corruption and even data loss, which can significantly affect system performance and financial losses. In this dissertation research, I analyze different manifestations of disk failures in production data centers and explore data mining techniques combined with statistical analysis methods to discover categories of disk failures and their distinctive properties. I use similarity measures to quantify the degradation process of each failure type and derive the degradation signature. The derived degradation signatures are further leveraged to forecast when future disk failures may happen. Meanwhile, this dissertation also studies energy efficiency of high performance computers. Specifically, I characterize the power and energy consumption of Haswell processors which are used in multiple supercomputers, and analyze the power and energy consumption of Legion, a data-centric programming model and runtime system, and Legion applications. We find that power and energy efficiency can be improved significantly by optimizing the settings and runtime scheduling of processors, and Legion runtime performs well for larger-scale computation in terms of power and energy consumption.
Evaluating Appropriateness of Emg and Flex Sensors for Classifying Hand Gestures
Hand and arm gestures are a great way of communication when you don't want to be heard, quieter and often more reliable than whispering into a radio mike. In recent years hand gesture identification became a major active area of research due its use in various applications. The objective of my work is to develop an integrated sensor system, which will enable tactical squads and SWAT teams to communicate when there is absence of a Line of Sight or in the presence of any obstacles. The gesture set involved in this work is the standardized hand signals for close range engagement operations used by military and SWAT teams. The gesture sets involved in this work are broadly divided into finger movements and arm movements. The core components of the integrated sensor system are: Surface EMG sensors, Flex sensors and accelerometers. Surface EMG is the electrical activity produced by muscle contractions and measured by sensors directly attached to the skin. Bend Sensors use a piezo resistive material to detect the bend. The sensor output is determined by both the angle between the ends of the sensor as well as the flex radius. Accelerometers sense the dynamic acceleration and inclination in 3 directions simultaneously. EMG sensors are placed on the upper and lower forearm and assist in the classification of the finger and wrist movements. Bend sensors are mounted on a glove that is worn on the hand. The sensors are located over the first knuckle of each figure and can determine if the finger is bent or not. An accelerometer is attached to the glove at the base of the wrist and determines the speed and direction of the arm movement. Classification algorithm SVM is used to classify the gestures.
Evaluation of Call Mobility on Network Productivity in Long Term Evolution Advanced (LTE-A) Femtocells
The demand for higher data rates for indoor and cell-edge users led to evolution of small cells. LTE femtocells, one of the small cell categories, are low-power low-cost mobile base stations, which are deployed within the coverage area of the traditional macro base station. The cross-tier and co-tier interferences occur only when the macrocell and femtocell share the same frequency channels. Open access (OSG), closed access (CSG), and hybrid access are the three existing access-control methods that decide users' connectivity to the femtocell access point (FAP). We define a network performance function, network productivity, to measure the traffic that is carried successfully. In this dissertation, we evaluate call mobility in LTE integrated network and determine optimized network productivity with variable call arrival rate in given LTE deployment with femtocell access modes (OSG, CSG, HYBRID) for a given call blocking vector. The solution to the optimization is maximum network productivity and call arrival rates for all cells. In the second scenario, we evaluate call mobility in LTE integrated network with increasing femtocells and maximize network productivity with variable femtocells distribution per macrocell with constant call arrival rate in uniform LTE deployment with femtocell access modes (OSG, CSG, HYBRID) for a given call blocking vector. The solution to the optimization is maximum network productivity and call arrival rates for all cells for network deployment where peak productivity is identified. We analyze the effects of call mobility on network productivity by simulating low, high, and no mobility scenarios and study the impact based on offered load, handover traffic and blocking probabilities. Finally, we evaluate and optimize performance of fractional frequency reuse (FFR) mechanism and study the impact of proposed metric weighted user satisfaction with sectorized FFR configuration.
Evaluation Techniques and Graph-Based Algorithms for Automatic Summarization and Keyphrase Extraction
Automatic text summarization and keyphrase extraction are two interesting areas of research which extend along natural language processing and information retrieval. They have recently become very popular because of their wide applicability. Devising generic techniques for these tasks is challenging due to several issues. Yet we have a good number of intelligent systems performing the tasks. As different systems are designed with different perspectives, evaluating their performances with a generic strategy is crucial. It has also become immensely important to evaluate the performances with minimal human effort. In our work, we focus on designing a relativized scale for evaluating different algorithms. This is our major contribution which challenges the traditional approach of working with an absolute scale. We consider the impact of some of the environment variables (length of the document, references, and system-generated outputs) on the performance. Instead of defining some rigid lengths, we show how to adjust to their variations. We prove a mathematically sound baseline that should work for all kinds of documents. We emphasize automatically determining the syntactic well-formedness of the structures (sentences). We also propose defining an equivalence class for each unit (e.g. word) instead of the exact string matching strategy. We show an evaluation approach that considers the weighted relatedness of multiple references to adjust to the degree of disagreements between the gold standards. We publish the proposed approach as a free tool so that other systems can use it. We have also accumulated a dataset (scientific articles) with a reference summary and keyphrases for each document. Our approach is applicable not only for evaluating single-document based tasks but also for evaluating multiple-document based tasks. We have tested our evaluation method for three intrinsic tasks (taken from DUC 2004 conference), and in all three cases, it correlates positively with ROUGE. Based on our experiments …
Event Sequence Identification and Deep Learning Classification for Anomaly Detection and Predication on High-Performance Computing Systems
High-performance computing (HPC) systems continue growing in both scale and complexity. These large-scale, heterogeneous systems generate tens of millions of log messages every day. Effective log analysis for understanding system behaviors and identifying system anomalies and failures is highly challenging. Existing log analysis approaches use line-by-line message processing. They are not effective for discovering subtle behavior patterns and their transitions, and thus may overlook some critical anomalies. In this dissertation research, I propose a system log event block detection (SLEBD) method which can extract the log messages that belong to a component or system event into an event block (EB) accurately and automatically. At the event level, we can discover new event patterns, the evolution of system behavior, and the interaction among different system components. To find critical event sequences, existing sequence mining methods are mostly based on the a priori algorithm which is compute-intensive and runs for a long time. I develop a novel, topology-aware sequence mining (TSM) algorithm which is efficient to generate sequence patterns from the extracted event block lists. I also train a long short-term memory (LSTM) model to cluster sequences before specific events. With the generated sequence pattern and trained LSTM model, we can predict whether an event is going to occur normally or not. To accelerate such predictions, I propose a design flow by which we can convert recurrent neural network (RNN) designs into register-transfer level (RTL) implementations which are deployed on FPGAs. Due to its high parallelism and low power, FPGA achieves a greater speedup and better energy efficiency compared to CPU and GPU according to our experimental results.
Exploration of Visual, Acoustic, and Physiological Modalities to Complement Linguistic Representations for Sentiment Analysis
This research is concerned with the identification of sentiment in multimodal content. This is of particular interest given the increasing presence of subjective multimodal content on the web and other sources, which contains a rich and vast source of people's opinions, feelings, and experiences. Despite the need for tools that can identify opinions in the presence of diverse modalities, most of current methods for sentiment analysis are designed for textual data only, and few attempts have been made to address this problem. The dissertation investigates techniques for augmenting linguistic representations with acoustic, visual, and physiological features. The potential benefits of using these modalities include linguistic disambiguation, visual grounding, and the integration of information about people's internal states. The main goal of this work is to build computational resources and tools that allow sentiment analysis to be applied to multimodal data. This thesis makes three important contributions. First, it shows that modalities such as audio, video, and physiological data can be successfully used to improve existing linguistic representations for sentiment analysis. We present a method that integrates linguistic features with features extracted from these modalities. Features are derived from verbal statements, audiovisual recordings, thermal recordings, and physiological sensors signals. The resulting multimodal sentiment analysis system is shown to significantly outperform the use of language alone. Using this system, we were able to predict the sentiment expressed in video reviews and also the sentiment experienced by viewers while exposed to emotionally loaded content. Second, the thesis provides evidence of the portability of the developed strategies to other affect recognition problems. We provided support for this by studying the deception detection problem. Third, this thesis contributes several multimodal datasets that will enable further research in sentiment and deception detection.
Exploring Physical Unclonable Functions for Efficient Hardware Assisted Security in the IoT
Modern cities are undergoing rapid expansion. The number of connected devices in the networks in and around these cities is increasing every day and will exponentially increase in the next few years. At home, the number of connected devices is also increasing with the introduction of home automation appliances and applications. Many of these appliances are becoming smart devices which can track our daily routines. It is imperative that all these devices should be secure. When cryptographic keys used for encryption and decryption are stored on memory present on these devices, they can be retrieved by attackers or adversaries to gain control of the system. For this purpose, Physical Unclonable Functions (PUFs) were proposed to generate the keys required for encryption and decryption of the data or the communication channel, as required by the application. PUF modules take advantage of the manufacturing variations that are introduced in the Integrated Circuits (ICs) during the fabrication process. These are used to generate the cryptographic keys which reduces the use of a separate memory module to store the encryption and decryption keys. A PUF module can also be recon gurable such that the number of input output pairs or Challenge Response Pairs (CRPs) generated can be increased exponentially. This dissertation proposes three designs of PUFs, two of which are recon gurable to increase the robustness of the system.
Exploring Privacy in Location-based Services Using Cryptographic Protocols
Location-based services (LBS) are available on a variety of mobile platforms like cell phones, PDA's, etc. and an increasing number of users subscribe to and use these services. Two of the popular models of information flow in LBS are the client-server model and the peer-to-peer model, in both of which, existing approaches do not always provide privacy for all parties concerned. In this work, I study the feasibility of applying cryptographic protocols to design privacy-preserving solutions for LBS from an experimental and theoretical standpoint. In the client-server model, I construct a two-phase framework for processing nearest neighbor queries using combinations of cryptographic protocols such as oblivious transfer and private information retrieval. In the peer-to-peer model, I present privacy preserving solutions for processing group nearest neighbor queries in the semi-honest and dishonest adversarial models. I apply concepts from secure multi-party computation to realize our constructions and also leverage the capabilities of trusted computing technology, specifically TPM chips. My solution for the dishonest adversarial model is also of independent cryptographic interest. I prove my constructions secure under standard cryptographic assumptions and design experiments for testing the feasibility or practicability of our constructions and benchmark key operations. My experiments show that the proposed constructions are practical to implement and have reasonable costs, while providing strong privacy assurances.
Extracting Temporally-Anchored Spatial Knowledge
In my dissertation, I elaborate on the work that I have done to extract temporally-anchored spatial knowledge from text, including both intra- and inter-sentential knowledge. I also detail multiple approaches to infer spatial timeline of a person from biographies and social media. I present and analyze two strategies to annotate information regarding whether a given entity is or is not located at some location, and for how long with respect to an event. Specifically, I leverage semantic roles or syntactic dependencies to generate potential spatial knowledge and then crowdsource annotations to validate the potential knowledge. The resulting annotations indicate how long entities are or are not located somewhere, and temporally anchor this spatial information. I present an in-depth corpus analysis and experiments comparing the spatial knowledge generated by manipulating roles or dependencies. In my work, I also explore research methodologies that go beyond single sentences and extract spatio-temporal information from text. Spatial timelines refer to a chronological order of locations where a target person is or is not located. I present corpus and experiments to extract spatial timelines from Wikipedia biographies. I present my work on determining locations and the order in which they are actually visited by a person from their travel experiences. Specifically, I extract spatio-temporal graphs that capture the order (edges) of locations (nodes) visited by a person. Further, I detail my experiments that leverage both text and images to extract spatial timeline of a person from Twitter.
Extrapolating Subjectivity Research to Other Languages
Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human like understanding to machines. As much of our language represents a form of self expression, capturing thoughts, beliefs, evaluations, opinions, and emotions which are not available for scrutiny by an outside observer, in the field of natural language, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into either positive, negative or neutral. In this thesis, I investigate techniques of generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers, parsers, or basic resources such as electronic text, annotated corpora or lexica. This severely limits the implementation of techniques on par with those developed for English, and by applying methods that are lighter in the usage of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the techniques proposed employ a lever approach, where English often acts as the donor language (the fulcrum in a lever) and allows through a relatively minimal amount of effort to establish preliminary subjectivity research in a target language.
Finding Meaning in Context Using Graph Algorithms in Mono- and Cross-lingual Settings
Making computers automatically find the appropriate meaning of words in context is an interesting problem that has proven to be one of the most challenging tasks in natural language processing (NLP). Widespread potential applications of a possible solution to the problem could be envisaged in several NLP tasks such as text simplification, language learning, machine translation, query expansion, information retrieval and text summarization. Ambiguity of words has always been a challenge in these applications, and the traditional endeavor to solve the problem of this ambiguity, namely doing word sense disambiguation using resources like WordNet, has been fraught with debate about the feasibility of the granularity that exists in WordNet senses. The recent trend has therefore been to move away from enforcing any given lexical resource upon automated systems from which to pick potential candidate senses,and to instead encourage them to pick and choose their own resources. Given a sentence with a target ambiguous word, an alternative solution consists of picking potential candidate substitutes for the target, filtering the list of the candidates to a much shorter list using various heuristics, and trying to match these system predictions against a human generated gold standard, with a view to ensuring that the meaning of the sentence does not change after the substitutions. This solution has manifested itself in the SemEval 2007 task of lexical substitution and the more recent SemEval 2010 task of cross-lingual lexical substitution (which I helped organize), where given an English context and a target word within that context, the systems are required to provide between one and ten appropriate substitutes (in English) or translations (in Spanish) for the target word. In this dissertation, I present a comprehensive overview of state-of-the-art research and describe new experiments to tackle the tasks of lexical substitution and cross-lingual lexical substitution. In particular …
Geostatistical Inspired Metamodeling and Optimization of Nanoscale Analog Circuits
The current trend towards miniaturization of modern consumer electronic devices significantly affects their design. The demand for efficient all-in-one appliances leads to smaller, yet more complex and powerful nanoelectronic devices. The increasing complexity in the design of such nanoscale Analog/Mixed-Signal Systems-on-Chip (AMS-SoCs) presents difficult challenges to designers. One promising design method used to mitigate the burden of this design effort is the use of metamodeling (surrogate) modeling techniques. Their use significantly reduces the time for computer simulation and design space exploration and optimization. This dissertation addresses several issues of metamodeling based nanoelectronic based AMS design exploration. A surrogate modeling technique which uses geostatistical based Kriging prediction methods in creating metamodels is proposed. Kriging prediction techniques take into account the correlation effects between input parameters for performance point prediction. We propose the use of Kriging to utilize this property for the accurate modeling of process variation effects of designs in the deep nanometer region. Different Kriging methods have been explored for this work such as simple and ordinary Kriging. We also propose another metamodeling technique Kriging-Bootstrapped Neural Network that combines the accuracy and process variation awareness of Kriging with artificial neural network models for ultra-fast and accurate process aware metamodeling design. The proposed methodologies combine Kriging metamodels with selected algorithms for ultra-fast layout optimization. The selected algorithms explored are: Gravitational Search Algorithm (GSA), Simulated Annealing Optimization (SAO), and Ant Colony Optimization (ACO). Experimental results demonstrate that the proposed Kriging metamodel based methodologies can perform the optimizations with minimal computational burden compared to traditional (SPICE-based) design flows.
Hybrid Approaches in Test Suite Prioritization
The rapid advancement of web and mobile application technologies has recently posed numerous challenges to the Software Engineering community, including how to cost-effectively test applications that have complex event spaces. Many software testing techniques attempt to cost-effectively improve the quality of such software. This dissertation primarily focuses on that of hybrid test suite prioritization. The techniques utilize two or more criteria to perform test suite prioritization as it is often insufficient to use only a single criterion. The dissertation consists of the following contributions: (1) a weighted test suite prioritization technique that employs the distance between criteria as a weighting factor, (2) a coarse-to-fine grained test suite prioritization technique that uses a multilevel approach to increase the granularity of the criteria at each subsequent iteration, (3) the Caret-HM tool for Android user session-based testing that allows testers to record, replay, and create heat maps from user interactions with Android applications via a web browser, and (4) Android user session-based test suite prioritization techniques that utilize heuristics developed from user sessions created by Caret-HM. Each of the chapters empirically evaluate the respective techniques. The proposed techniques generally show improved or equally good performance when compared to the baselines, depending on an application under test. Further, this dissertation provides guidance to testers as it relates to the use of the proposed hybrid techniques.
Improving Software Quality through Syntax and Semantics Verification of Requirements Models
Software defects can frequently be traced to poorly-specified requirements. Many software teams manage their requirements using tools such as checklists and databases, which lack a formal semantic mapping to system behavior. Such a mapping can be especially helpful for safety-critical systems. Another limitation of many requirements analysis methods is that much of the analysis must still be done manually. We propose techniques that automate portions of the requirements analysis process, as well as clarify the syntax and semantics of requirements models using a variety of methods, including machine learning tools and our own tool, VeriCCM. The machine learning tools used help us identify potential model elements and verify their correctness. VeriCCM, a formalized extension of the causal component model (CCM), uses formal methods to ensure that requirements are well-formed, as well as providing the beginnings of a full formal semantics. We also explore the use of statecharts to identify potential abnormal behaviors from a given set of requirements. At each stage, we perform empirical studies to evaluate the effectiveness of our proposed approaches.
Incremental Learning with Large Datasets
This dissertation focuses on the novel learning strategy based on geometric support vector machines to address the difficulties of processing immense data set. Support vector machines find the hyper-plane that maximizes the margin between two classes, and the decision boundary is represented with a few training samples it becomes a favorable choice for incremental learning. The dissertation presents a novel method Geometric Incremental Support Vector Machines (GISVMs) to address both efficiency and accuracy issues in handling massive data sets. In GISVM, skin of convex hulls is defined and an efficient method is designed to find the best skin approximation given available examples. The set of extreme points are found by recursively searching along the direction defined by a pair of known extreme points. By identifying the skin of the convex hulls, the incremental learning will only employ a much smaller number of samples with comparable or even better accuracy. When additional samples are provided, they will be used together with the skin of the convex hull constructed from previous dataset. This results in a small number of instances used in incremental steps of the training process. Based on the experimental results with synthetic data sets, public benchmark data sets from UCI and endoscopy videos, it is evident that the GISVM achieved satisfactory classifiers that closely model the underlying data distribution. GISVM improves the performance in sensitivity in the incremental steps, significantly reduced the demand for memory space, and demonstrates the ability of recovery from temporary performance degradation.
Indoor Localization Using Magnetic Fields
Indoor localization consists of locating oneself inside new buildings. GPS does not work indoors due to multipath reflection and signal blockage. WiFi based systems assume ubiquitous availability and infrastructure based systems require expensive installations, hence making indoor localization an open problem. This dissertation consists of solving the problem of indoor localization by thoroughly exploiting the indoor ambient magnetic fields comprising mainly of disturbances termed as anomalies in the Earth’s magnetic field caused by pillars, doors and elevators in hallways which are ferromagnetic in nature. By observing uniqueness in magnetic signatures collected from different campus buildings, the work presents the identification of landmarks and guideposts from these signatures and further develops magnetic maps of buildings - all of which can be used to locate and navigate people indoors. To understand the reason behind these anomalies, first a comparison between the measured and model generated Earth’s magnetic field is made, verifying the presence of a constant field without any disturbances. Then by modeling the magnetic field behavior of different pillars such as steel reinforced concrete, solid steel, and other structures like doors and elevators, the interaction of the Earth’s field with the ferromagnetic fields is described thereby explaining the causes of the uniqueness in the signatures that comprise these disturbances. Next, by employing the dynamic time warping algorithm to account for time differences in signatures obtained from users walking at different speeds, an indoor localization application capable of classifying locations using the magnetic signatures is developed solely on the smart phone. The application required users to walk short distances of 3-6 m anywhere in hallway to be located with accuracies of 80-99%. The classification framework was further validated with over 90% accuracies using model generated magnetic signatures representing hallways with different kinds of pillars, doors and elevators. All in all, this dissertation …
The Influence of Social Network Graph Structure on Disease Dynamics in a Simulated Environment
The fight against epidemics/pandemics is one of man versus nature. Technological advances have not only improved existing methods for monitoring and controlling disease outbreaks, but have also provided new means for investigation, such as through modeling and simulation. This dissertation explores the relationship between social structure and disease dynamics. Social structures are modeled as graphs, and outbreaks are simulated based on a well-recognized standard, the susceptible-infectious-removed (SIR) paradigm. Two independent, but related, studies are presented. The first involves measuring the severity of outbreaks as social network parameters are altered. The second study investigates the efficacy of various vaccination policies based on social structure. Three disease-related centrality measures are introduced, contact, transmission, and spread centrality, which are related to previously established centrality measures degree, betweenness, and closeness, respectively. The results of experiments presented in this dissertation indicate that reducing the neighborhood size along with outside-of-neighborhood contacts diminishes the severity of disease outbreaks. Vaccination strategies can effectively reduce these parameters. Additionally, vaccination policies that target individuals with high centrality are generally shown to be slightly more effective than a random vaccination policy. These results combined with past and future studies will assist public health officials in their effort to minimize the effects of inevitable disease epidemics/pandemics.
Infusing Automatic Question Generation with Natural Language Understanding
Automatically generating questions from text for educational purposes is an active research area in natural language processing. The automatic question generation system accompanying this dissertation is MARGE, which is a recursive acronym for: MARGE automatically reads generates and evaluates. MARGE generates questions from both individual sentences and the passage as a whole, and is the first question generation system to successfully generate meaningful questions from textual units larger than a sentence. Prior work in automatic question generation from text treats a sentence as a string of constituents to be rearranged into as many questions as allowed by English grammar rules. Consequently, such systems overgenerate and create mainly trivial questions. Further, none of these systems to date has been able to automatically determine which questions are meaningful and which are trivial. This is because the research focus has been placed on NLG at the expense of NLU. In contrast, the work presented here infuses the questions generation process with natural language understanding. From the input text, MARGE creates a meaning analysis representation for each sentence in a passage via the DeconStructure algorithm presented in this work. Questions are generated from sentence meaning analysis representations using templates. The generated questions are automatically evaluated for question quality and importance via a ranking algorithm.
Joint Schemes for Physical Layer Security and Error Correction
The major challenges facing resource constraint wireless devices are error resilience, security and speed. Three joint schemes are presented in this research which could be broadly divided into error correction based and cipher based. The error correction based ciphers take advantage of the properties of LDPC codes and Nordstrom Robinson code. A cipher-based cryptosystem is also presented in this research. The complexity of this scheme is reduced compared to conventional schemes. The securities of the ciphers are analyzed against known-plaintext and chosen-plaintext attacks and are found to be secure. Randomization test was also conducted on these schemes and the results are presented. For the proof of concept, the schemes were implemented in software and hardware and these shows a reduction in hardware usage compared to conventional schemes. As a result, joint schemes for error correction and security provide security to the physical layer of wireless communication systems, a layer in the protocol stack where currently little or no security is implemented. In this physical layer security approach, the properties of powerful error correcting codes are exploited to deliver reliability to the intended parties, high security against eavesdroppers and efficiency in communication system. The notion of a highly secure and reliable physical layer has the potential to significantly change how communication system designers and users think of the physical layer since the error control codes employed in this work will have the dual roles of both reliability and security.
Layout-accurate Ultra-fast System-level Design Exploration Through Verilog-ams
This research addresses problems in designing analog and mixed-signal (AMS) systems by bridging the gap between system-level and circuit-level simulation by making simulations fast like system-level and accurate like circuit-level. The tools proposed include metamodel integrated Verilog-AMS based design exploration flows. The research involves design centering, metamodel generation flows for creating efficient behavioral models, and Verilog-AMS integration techniques for model realization. The core of the proposed solution is transistor-level and layout-level metamodeling and their incorporation in Verilog-AMS. Metamodeling is used to construct efficient and layout-accurate surrogate models for AMS system building blocks. Verilog-AMS, an AMS hardware description language, is employed to build surrogate model implementations that can be simulated with industrial standard simulators. The case-study circuits and systems include an operational amplifier (OP-AMP), a voltage-controlled oscillator (VCO), a charge-pump phase-locked loop (PLL), and a continuous-time delta-sigma modulator (DSM). The minimum and maximum error rates of the proposed OP-AMP model are 0.11 % and 2.86 %, respectively. The error rates for the PLL lock time and power estimation are 0.7 % and 3.0 %, respectively. The OP-AMP optimization using the proposed approach is ~17000× faster than the transistor-level model based approach. The optimization achieves a ~4× power reduction for the OP-AMP design. The PLL parasitic-aware optimization achieves a 10× speedup and a 147 µW power reduction. Thus the experimental results validate the effectiveness of the proposed solution.
Location Estimation and Geo-Correlated Information Trends
A tremendous amount of information is being shared every day on social media sites such as Facebook, Twitter or Google+. However, only a small portion of users provide their location information, which can be helpful in targeted advertising and many other services. Current methods in location estimation using social relationships consider social friendship as a simple binary relationship. However, social closeness between users and structure of friends have strong implications on geographic distances. In the first task, we introduce new measures to evaluate the social closeness between users and structure of friends. Then we propose models that use them for location estimation. Compared with the models which take the friend relation as a binary feature, social closeness can help identify which friend of a user is more important and friend structure can help to determine significance level of locations, thus improving the accuracy of the location estimation models. A confidence iteration method is further introduced to improve estimation accuracy and overcome the problem of scarce location information. We evaluate our methods on two different datasets, Twitter and Gowalla. The results show that our model can improve the estimation accuracy by 5% - 20% compared with state-of-the-art friend-based models. In the second task, we also propose a Local Event Discovery and Summarization (LEDS) framework to detect local events from Twitter. Many existing algorithms for event detection focus on larger-scale events and are not sensitive to smaller-scale local events. Most of the local events detected by these methods are major events like important sports, shows, or big natural disasters. In this work, we propose the LEDS framework to detect both bigger and smaller events. LEDS contains three key steps: 1) Detecting possible event related terms by monitoring abnormal distribution in different locations and times; 2) Clustering tweets based on their key terms, …
Metamodeling-based Fast Optimization of Nanoscale Ams-socs
Modern consumer electronic systems are mostly based on analog and digital circuits and are designed as analog/mixed-signal systems on chip (AMS-SoCs). the integration of analog and digital circuits on the same die makes the system cost effective. in AMS-SoCs, analog and mixed-signal portions have not traditionally received much attention due to their complexity. As the fabrication technology advances, the simulation times for AMS-SoC circuits become more complex and take significant amounts of time. the time allocated for the circuit design and optimization creates a need to reduce the simulation time. the time constraints placed on designers are imposed by the ever-shortening time to market and non-recurrent cost of the chip. This dissertation proposes the use of a novel method, called metamodeling, and intelligent optimization algorithms to reduce the design time. Metamodel-based ultra-fast design flows are proposed and investigated. Metamodel creation is a one time process and relies on fast sampling through accurate parasitic-aware simulations. One of the targets of this dissertation is to minimize the sample size while retaining the accuracy of the model. in order to achieve this goal, different statistical sampling techniques are explored and applied to various AMS-SoC circuits. Also, different metamodel functions are explored for their accuracy and application to AMS-SoCs. Several different optimization algorithms are compared for global optimization accuracy and convergence. Three different AMS circuits, ring oscillator, inductor-capacitor voltage-controlled oscillator (LC-VCO) and phase locked loop (PLL) that are present in many AMS-SoC are used in this study for design flow application. Metamodels created in this dissertation provide accuracy with an error of less than 2% from the physical layout simulations. After optimal sampling investigation, metamodel functions and optimization algorithms are ranked in terms of speed and accuracy. Experimental results show that the proposed design flow provides roughly 5,000x speedup over conventional design flows. Thus, …
Methodical Evaluation of Processing-in-Memory Alternatives
In this work, I characterized a series of potential application kernels using a set of architectural and non-architectural metrics, and performed a comparison of four different alternatives for processing-in-memory cores (PIMs): ARM cores, GPGPUs, coarse-grained reconfigurable dataflow (DF-PIM), and a domain specific architecture using SIMD PIM engine consisting of a series of multiply-accumulate circuits (MACs). For each PIM alternative I investigated how performance and energy efficiency changes with respect to a series of system parameters, such as memory bandwidth and latency, number of PIM cores, DVFS states, cache architecture, etc. In addition, I compared the PIM core choices for a subset of applications and discussed how the application characteristics correlate to the achieved performance and energy efficiency. Furthermore, I compared the PIM alternatives to a host-centric solution that uses a traditional server-class CPU core or PIM-like cores acting as host-side accelerators instead of being part of 3D-stacked memories. Such insights can expose the achievable performance limits and shortcomings of certain PIM designs and show sensitivity to a series of system parameters (available memory bandwidth, application latency and bandwidth sensitivity, etc.). In addition, identifying the common application characteristics for PIM kernels provides opportunity to identify similar types of computation patterns in other applications and allows us to create a set of applications which can then be used as benchmarks for evaluating future PIM design alternatives.
Modeling and Analysis of Intentional And Unintentional Security Vulnerabilities in a Mobile Platform
Mobile phones are one of the essential parts of modern life. Making a phone call is not the main purpose of a smart phone anymore, but merely one of many other features. Online social networking, chatting, short messaging, web browsing, navigating, and photography are some of the other features users enjoy in modern smartphones, most of which are provided by mobile apps. However, with this advancement, many security vulnerabilities have opened up in these devices. Malicious apps are a major threat for modern smartphones. According to Symantec Corp., by the middle of 2013, about 273,000 Android malware apps were identified. It is a complex issue to protect everyday users of mobile devices from the attacks of technologically competent hackers, illegitimate users, trolls, and eavesdroppers. This dissertation emphasizes the concept of intention identification. Then it looks into ways to utilize this intention identification concept to enforce security in a mobile phone platform. For instance, a battery monitoring app requiring SMS permissions indicates suspicious intention as battery monitoring usually does not need SMS permissions. Intention could be either the user's intention or the intention of an app. These intentions can be identified using their behavior or by using their source code. Regardless of the intention type, identifying it, evaluating it, and taking actions by using it to prevent any malicious intentions are the main goals of this research. The following four different security vulnerabilities are identified in this research: Malicious apps, spammers and lurkers in social networks, eavesdroppers in phone conversations, and compromised authentication. These four vulnerabilities are solved by detecting malware applications, identifying malicious users in a social network, enhancing the encryption system of a phone communication, and identifying user activities using electroencephalogram (EEG) for authentication. Each of these solutions are constructed using the idea of intention identification. Furthermore, many of …
Modeling Epidemics on Structured Populations: Effects of Socio-demographic Characteristics and Immune Response Quality
Epidemiologists engage in the study of the distribution and determinants of health-related states or events in human populations. Eventually, they will apply that study to prevent and control problems and contingencies associated with the health of the population. Due to the spread of new pathogens and the emergence of new bio-terrorism threats, it has become imperative to develop new and expand existing techniques to equip public health providers with robust tools to predict and control health-related crises. In this dissertation, I explore the effects caused in the disease dynamics by the differences in individuals’ physiology and social/behavioral characteristics. Multiple computational and mathematical models were developed to quantify the effect of those factors on spatial and temporal variations of the disease epidemics. I developed statistical methods to measure the effects caused in the outbreak dynamics by the incorporation of heterogeneous demographics and social interactions to the individuals of the population. Specifically, I studied the relationship between demographics and the physiological characteristics of an individual when preparing for an infectious disease epidemic.
Modeling Synergistic Relationships Between Words and Images
Texts and images provide alternative, yet orthogonal views of the same underlying cognitive concept. By uncovering synergistic, semantic relationships that exist between words and images, I am working to develop novel techniques that can help improve tasks in natural language processing, as well as effective models for text-to-image synthesis, image retrieval, and automatic image annotation. Specifically, in my dissertation, I will explore the interoperability of features between language and vision tasks. In the first part, I will show how it is possible to apply features generated using evidence gathered from text corpora to solve the image annotation problem in computer vision, without the use of any visual information. In the second part, I will address research in the reverse direction, and show how visual cues can be used to improve tasks in natural language processing. Importantly, I propose a novel metric to estimate the similarity of words by comparing the visual similarity of concepts invoked by these words, and show that it can be used further to advance the state-of-the-art methods that employ corpus-based and knowledge-based semantic similarity measures. Finally, I attempt to construct a joint semantic space connecting words with images, and synthesize an evaluation framework to quantify cross-modal semantic relationships that exist between arbitrary pairs of words and images. I study the effectiveness of unsupervised, corpus-based approaches to automatically derive the semantic relatedness between words and images, and perform empirical evaluations by measuring its correlation with human annotators.
Monitoring Dengue Outbreaks Using Online Data
Internet technology has affected humans' lives in many disciplines. The search engine is one of the most important Internet tools in that it allows people to search for what they want. Search queries entered in a web search engine can be used to predict dengue incidence. This vector borne disease causes severe illness and kills a large number of people every year. This dissertation utilizes the capabilities of search queries related to dengue and climate to forecast the number of dengue cases. Several machine learning techniques are applied for data analysis, including Multiple Linear Regression, Artificial Neural Networks, and the Seasonal Autoregressive Integrated Moving Average. Predictive models produced from these machine learning methods are measured for their performance to find which technique generates the best model for dengue prediction. The results of experiments presented in this dissertation indicate that search query data related to dengue and climate can be used to forecast the number of dengue cases. The performance measurement of predictive models shows that Artificial Neural Networks outperform the others. These results will help public health officials in planning to deal with the outbreaks.
A Multi-Modal Insider Threat Detection and Prevention based on Users' Behaviors
Insider threat is one of the greatest concerns for information security that could cause more significant financial losses and damages than any other attack. However, implementing an efficient detection system is a very challenging task. It has long been recognized that solutions to insider threats are mainly user-centric and several psychological and psychosocial models have been proposed. A user's psychophysiological behavior measures can provide an excellent source of information for detecting user's malicious behaviors and mitigating insider threats. In this dissertation, we propose a multi-modal framework based on the user's psychophysiological measures and computer-based behaviors to distinguish between a user's behaviors during regular activities versus malicious activities. We utilize several psychophysiological measures such as electroencephalogram (EEG), electrocardiogram (ECG), and eye movement and pupil behaviors along with the computer-based behaviors such as the mouse movement dynamics, and keystrokes dynamics to build our framework for detecting malicious insiders. We conduct human subject experiments to capture the psychophysiological measures and the computer-based behaviors for a group of participants while performing several computer-based activities in different scenarios. We analyze the behavioral measures, extract useful features, and evaluate their capability in detecting insider threats. We investigate each measure separately, then we use data fusion techniques to build two modules and a comprehensive multi-modal framework. The first module combines the synchronized EEG and ECG psychophysiological measures, and the second module combines the eye movement and pupil behaviors with the computer-based behaviors to detect the malicious insiders. The multi-modal framework utilizes all the measures and behaviors in one model to achieve better detection accuracy. Our findings demonstrate that psychophysiological measures can reveal valuable knowledge about a user's malicious intent and can be used as an effective indicator in designing insider threat monitoring and detection frameworks. Our work lays out the necessary foundation to establish a new generation …
New Frameworks for Secure Image Communication in the Internet of Things (IoT)
The continuous expansion of technology, broadband connectivity and the wide range of new devices in the IoT cause serious concerns regarding privacy and security. In addition, in the IoT a key challenge is the storage and management of massive data streams. For example, there is always the demand for acceptable size with the highest quality possible for images to meet the rapidly increasing number of multimedia applications. The effort in this dissertation contributes to the resolution of concerns related to the security and compression functions in image communications in the Internet of Thing (IoT), due to the fast of evolution of IoT. This dissertation proposes frameworks for a secure digital camera in the IoT. The objectives of this dissertation are twofold. On the one hand, the proposed framework architecture offers a double-layer of protection: encryption and watermarking that will address all issues related to security, privacy, and digital rights management (DRM) by applying a hardware architecture of the state-of-the-art image compression technique Better Portable Graphics (BPG), which achieves high compression ratio with small size. On the other hand, the proposed framework of SBPG is integrated with the Digital Camera. Thus, the proposed framework of SBPG integrated with SDC is suitable for high performance imaging in the IoT, such as Intelligent Traffic Surveillance (ITS) and Telemedicine. Due to power consumption, which has become a major concern in any portable application, a low-power design of SBPG is proposed to achieve an energy- efficient SBPG design. As the visual quality of the watermarked and compressed images improves with larger values of PSNR, the results show that the proposed SBPG substantially increases the quality of the watermarked compressed images. Higher value of PSNR also shows how robust the algorithm is to different types of attack. From the results obtained for the energy- efficient SBPG …
A New Look at Retargetable Compilers
Consumers demand new and innovative personal computing devices every 2 years when their cellular phone service contracts are renewed. Yet, a 2 year development cycle for the concurrent development of both hardware and software is nearly impossible. As more components and features are added to the devices, maintaining this 2 year cycle with current tools will become commensurately harder. This dissertation delves into the feasibility of simplifying the development of such systems by employing heterogeneous systems on a chip in conjunction with a retargetable compiler such as the hybrid computer retargetable compiler (Hy-C). An example of a simple architecture description of sufficient detail for use with a retargetable compiler like Hy-C is provided. As a software engineer with 30 years of experience, I have witnessed numerous system failures. A plethora of software development paradigms and tools have been employed to prevent software errors, but none have been completely successful. Much discussion centers on software development in the military contracting market, as that is my background. The dissertation reviews those tools, as well as some existing retargetable compilers, in an attempt to determine how those errors occurred and how a system like Hy-C could assist in reducing future software errors. In the end, the potential for a simple retargetable solution like Hy-C is shown to be very simple, yet powerful enough to provide a very capable product in a very fast-growing market.
On-Loom Fabric Defect Inspection Using Contact Image Sensors and Activation Layer Embedded Convolutional Neural Network
Malfunctions on loom machines are the main causes of faulty fabric production. An on-loom fabric inspection system is a real-time monitoring device that enables immediate defect detection for human intervention. This dissertation presented a solution for the on-loom fabric defect inspection, including the new hardware design—the configurable contact image sensor (CIS) module—for on-loom fabric scanning and the defect detection algorithms. The main contributions of this work include (1) creating a configurable CIS module adaptable to a loom width, which brings CIS unique features, such as sub-millimeter resolution, compact size, short working distance and low cost, to the fabric defect inspection system, (2) designing a two-level hardware architecture that can be efficiently deployed in a weaving factory with hundreds of looms, (3) developing a two-level inspecting scheme, with which the initial defect screening is performed on the Raspberry Pi and the intensive defect verification is processed on the cloud server, (4) introducing the novel pairwise-potential activation layer to a convolutional neural network that leads to high accuracies of defect segmentation on fabrics with fine and imbalanced structures, (5) achieving a real-time defect detection that allows a possible defect to be examined multiple times, and (6) implementing a new color segmentation technique suitable for processing multi-color fabric defects. The novel CIS-based on-loom scanning system offered real-time and high-resolution fabric images, which was able to deliver the information of single thread on a fabric. The algorithm evaluation on the fabric defect datasets showed a non-miss-detection rate on defect-free fabrics. The average precision of defect existed images reached above 90% at the pixel level. The detected defect pixels' integrity—the recall scored around 70%. Possible defect regions overestimated on ground truth images and the morphologies of fine defects similar to regular fabric pattern were the two major reasons that caused the imperfection in defect pixel …
Back to Top of Screen