Search Results

3D Reconstruction Using Lidar and Visual Images
In this research, multi-perspective image registration using LiDAR and visual images was considered. 2D-3D image registration is a difficult task because it requires the extraction of different semantic features from each modality. This problem is solved in three parts. The first step involves detection and extraction of common features from each of the data sets. The second step consists of associating the common features between two different modalities. Traditional methods use lines or orthogonal corners as common features. The third step consists of building the projection matrix. Many existing methods use global positing system (GPS) or inertial navigation system (INS) for an initial estimate of the camera pose. However, the approach discussed herein does not use GPS, INS, or any such devices for initial estimate; hence the model can be used in places like the lunar surface or Mars where GPS or INS are not available. A variation of the method is also described, which does not require strong features from both images but rather uses intensity gradients in the image. This can be useful when one image does not have strong features (such as lines) or there are too many extraneous features.
3GPP Long Term Evolution LTE Scheduling
Future generation cellular networks are expected to deliver an omnipresent broadband access network for an endlessly increasing number of subscribers. Long term Evolution (LTE) represents a significant milestone towards wireless networks known as 4G cellular networks. A key feature of LTE is the implementation of enhanced Radio Resource Management (RRM) mechanism to improve the system performance. The structure of LTE networks was simplified by diminishing the number of the nodes of the core network. Also, the design of the radio protocol architecture is quite unique. In order to achieve high data rate in LTE, 3rd Generation Partnership Project (3GPP) has selected Orthogonal Frequency Division Multiplexing (OFDM) as an appropriate scheme in terms of downlinks. However, the proper scheme for an uplink is the Single-Carrier Frequency Domain Multiple Access due to the peak-to-average-power-ratio (PAPR) constraint. LTE packet scheduling plays a primary role as part of RRM to improve the system’s data rate as well as supporting various QoS requirements of mobile services. The major function of the LTE packet scheduler is to assign Physical Resource Blocks (PRBs) to mobile User Equipment (UE). In our work, we formed a proposed packet scheduler algorithm. The proposed scheduler algorithm acts based on the number of UEs attached to the eNodeB. To evaluate the proposed scheduler algorithm, we assumed two different scenarios based on a number of UEs. When the number of UE is lower than the number of PRBs, the UEs with highest Channel Quality Indicator (CQI) will be assigned PRBs. Otherwise, the scheduler will assign PRBs based on a given proportional fairness metric. The eNodeB’s throughput is increased when the proposed algorithm was implemented.
An Adaptive Linearization Method for a Constraint Satisfaction Problem in Semiconductor Device Design Optimization
The device optimization is a very important element in semiconductor technology advancement. Its objective is to find a design point for a semiconductor device so that the optimized design goal meets all specified constraints. As in other engineering fields, a nonlinear optimizer is often used for design optimization. One major drawback of using a nonlinear optimizer is that it can only partially explore the design space and return a local optimal solution. This dissertation provides an adaptive optimization design methodology to allow the designer to explore the design space and obtain a globally optimal solution. One key element of our method is to quickly compute the set of all feasible solutions, also called the acceptability region. We described a polytope-based representation for the acceptability region and an adaptive linearization technique for device performance model approximation. These efficiency enhancements have enabled significant speed-up in estimating acceptability regions and allow acceptability regions to be estimated for a larger class of device design tasks. Our linearization technique also provides an efficient mechanism to guarantee the global accuracy of the computed acceptability region. To visualize the acceptability region, we study the orthogonal projection of high-dimensional convex polytopes and propose an output sensitive algorithm for projecting polytopes into two dimensions.
Adaptive Planning and Prediction in Agent-Supported Distributed Collaboration.
Agents that act as user assistants will become invaluable as the number of information sources continue to proliferate. Such agents can support the work of users by learning to automate time-consuming tasks and filter information to manageable levels. Although considerable advances have been made in this area, it remains a fertile area for further development. One application of agents under careful scrutiny is the automated negotiation of conflicts between different user's needs and desires. Many techniques require explicit user models in order to function. This dissertation explores a technique for dynamically constructing user models and the impact of using them to anticipate the need for negotiation. Negotiation is reduced by including an advising aspect to the agent that can use this anticipation of conflict to adjust user behavior.
Agent Extensions for Peer-to-Peer Networks.
Peer-to-Peer (P2P) networks have seen tremendous growth in development and usage in recent times. This attention has brought many developments as well as new challenges to these networks. We will show that agent extensions to P2P networks offer solutions to many problems faced by P2P networks. In this research, an attempt is made to bring together JXTA P2P infrastructure and Jinni, a Prolog based agent engine to form an agent based P2P network. On top of the JXTA, we define simple Java API providing P2P services for agent programming constructs. Jinni is deployed on this JXTA network using an automated code update mechanism. Experiments are conducted on this Jinni/JXTA platform to implement a simple agent communication and data exchange protocol.
An Algorithm for the PLA Equivalence Problem
The Programmable Logic Array (PLA) has been widely used in the design of VLSI circuits and systems because of its regularity, flexibility, and simplicity. The equivalence problem is typically to verify that the final description of a circuit is functionally equivalent to its initial description. Verifying the functional equivalence of two descriptions is equivalent to proving their logical equivalence. This problem of pure logic is essential to circuit design. The most widely used technique to solve the problem is based on Binary Decision Diagram or BDD, proposed by Bryant in 1986. Unfortunately, BDD requires too much time and space to represent moderately large circuits for equivalence testing. We design and implement a new algorithm called the Cover-Merge Algorithm for the equivalence problem based on a divide-and-conquer strategy using the concept of cover and a derivational method. We prove that the algorithm is sound and complete. Because of the NP-completeness of the problem, we emphasize simplifications to reduce the search space or to avoid redundant computations. Simplification techniques are incorporated into the algorithm as an essential part to speed up the the derivation process. Two different sets of heuristics are developed for two opposite goals: one for the proof of equivalence and the other for its disproof. Experiments on a large scale of data have shown that big speed-ups can be achieved by prioritizing the heuristics and by choosing the most favorable one at each iteration of the Algorithm. Results are compared with those for BDD on standard benchmark problems as well as on random PLAs to perform an unbiased way of testing algorithms. It has been shown that the Cover-Merge Algorithm outperforms BDD in nearly all problem instances in terms of time and space. The algorithm has demonstrated fairly stabilized and practical performances especially for big PLAs under a wide …
Algorithm Optimizations in Genomic Analysis Using Entropic Dissection
In recent years, the collection of genomic data has skyrocketed and databases of genomic data are growing at a faster rate than ever before. Although many computational methods have been developed to interpret these data, they tend to struggle to process the ever increasing file sizes that are being produced and fail to take advantage of the advances in multi-core processors by using parallel processing. In some instances, loss of accuracy has been a necessary trade off to allow faster computation of the data. This thesis discusses one such algorithm that has been developed and how changes were made to allow larger input file sizes and reduce the time required to achieve a result without sacrificing accuracy. An information entropy based algorithm was used as a basis to demonstrate these techniques. The algorithm dissects the distinctive patterns underlying genomic data efficiently requiring no a priori knowledge, and thus is applicable in a variety of biological research applications. This research describes how parallel processing and object-oriented programming techniques were used to process larger files in less time and achieve a more accurate result from the algorithm. Through object oriented techniques, the maximum allowable input file size was significantly increased from 200 mb to 2000 mb. Using parallel processing techniques allowed the program to finish processing data in less than half the time of the sequential version. The accuracy of the algorithm was improved by reducing data loss throughout the algorithm. Finally, adding user-friendly options enabled the program to use requests more effectively and further customize the logic used within the algorithm.
Algorithms for Efficient Utilization of Wireless Bandwidth and to Provide Quality-of-Service in Wireless Networks
This thesis presents algorithms to utilize the wireless bandwidth efficiently and at the same time meet the quality of service (QoS) requirements of the users. In the proposed algorithms we present an adaptive frame structure based upon the airlink frame loss probability and control the admission of call requests into the system based upon the load on the system and the QoS requirements of the incoming call requests. The performance of the proposed algorithms is studied by developing analytical formulations and simulation experiments. Finally we present an admission control algorithm which uses an adaptive delay computation algorithm to compute the queuing delay for each class of traffic and adapts the service rate and the reliability in the estimates based upon the deviation in the expected and obtained performance. We study the performance of the call admission control algorithm by simulation experiments. Simulation results for the adaptive frame structure algorithm show an improvement in the number of users in the system but there is a drop in the system throughput. In spite of the lower throughput the adaptive frame structure algorithm has fewer QoS delay violations. The adaptive call admission control algorithm adapts the call dropping probability of different classes of traffic and optimizes the system performance w.r.t the number of calls dropped and the reliability in meeting the QoS promised when the call is admitted into the system.
An Analysis of Motivational Cues in Virtual Environments.
Guiding navigation in virtual environments (VEs) is a challenging task. A key issue in the navigation of a virtual environment is to be able to strike a balance between the user's need to explore the environment freely and the designer's need to ensure that the user experiences all the important events in the VE. This thesis reports on a study aimed at comparing the effectiveness of various navigation cues that are used to motivate users towards a specific target location. The results of this study indicate some significant differences in how users responded to the various cues.
Analysis of Web Services on J2EE Application Servers
The Internet became a standard way of exchanging business data between B2B and B2C applications and with this came the need for providing various services on the web instead of just static text and images. Web services are a new type of services offered via the web that aid in the creation of globally distributed applications. Web services are enhanced e-business applications that are easier to advertise and easier to discover on the Internet because of their flexibility and uniformity. In a real life scenario it is highly difficult to decide which J2EE application server to go for when deploying a enterprise web service. This thesis analyzes the various ways by which web services can be developed & deployed. Underlying protocols and crucial issues like EAI (enterprise application integration), asynchronous messaging, Registry tModel architecture etc have been considered in this research. This paper presents a report by analyzing what various J2EE application servers provide by doing a case study and by developing applications to test functionality.
Anchor Nodes Placement for Effective Passive Localization
Wireless sensor networks are composed of sensor nodes, which can monitor an environment and observe events of interest. These networks are applied in various fields including but not limited to environmental, industrial and habitat monitoring. In many applications, the exact location of the sensor nodes is unknown after deployment. Localization is a process used to find sensor node's positional coordinates, which is vital information. The localization is generally assisted by anchor nodes that are also sensor nodes but with known locations. Anchor nodes generally are expensive and need to be optimally placed for effective localization. Passive localization is one of the localization techniques where the sensor nodes silently listen to the global events like thunder sounds, seismic waves, lighting, etc. According to previous studies, the ideal location to place anchor nodes was on the perimeter of the sensor network. This may not be the case in passive localization, since the function of anchor nodes here is different than the anchor nodes used in other localization systems. I do extensive studies on positioning anchor nodes for effective localization. Several simulations are run in dense and sparse networks for proper positioning of anchor nodes. I show that, for effective passive localization, the optimal placement of the anchor nodes is at the center of the network in such a way that no three anchor nodes share linearity. The more the non-linearity, the better the localization. The localization for our network design proves better when I place anchor nodes at right angles.
An Annotated Bibliography of Mobile Agents in Networks
The purpose of this thesis is to present a comprehensive colligation of applications of mobile agents in networks, and provide a baseline association of these systems. This work has been motivated by the fact that mobile agent systems have been deemed proficuous alternatives in system applications. Several mobile agent systems have been developed to provide scalable and cogent solutions in network-centric applications. This thesis examines some existing mobile agent systems in core networking areas, in particular, those of network and resource management, routing, and the provision of fault tolerance and security. The inherent features of these systems are discussed with respect to their specific functionalities. The applicability and efficacy of mobile agents are further considered in the specific areas mentioned above. Although an initial foray into a collation of this nature, the goal of this annotated bibliography is to provide a generic referential view of mobile agent systems in network applications.
An Approach Towards Self-Supervised Classification Using Cyc
Due to the long duration required to perform manual knowledge entry by human knowledge engineers it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work.
Arithmetic Computations and Memory Management Using a Binary Tree Encoding af Natural Numbers
Two applications of a binary tree data type based on a simple pairing function (a bijection between natural numbers and pairs of natural numbers) are explored. First, the tree is used to encode natural numbers, and algorithms that perform basic arithmetic computations are presented along with formal proofs of their correctness. Second, using this "canonical" representation as a base type, algorithms for encoding and decoding additional isomorphic data types of other mathematical constructs (sets, sequences, etc.) are also developed. An experimental application to a memory management system is constructed and explored using these isomorphic types. A practical analysis of this system's runtime complexity and space savings are provided, along with a proof of concept framework for both applications of the binary tree type, in the Java programming language.
Automated Classification of Emotions Using Song Lyrics
This thesis explores the classification of emotions in song lyrics, using automatic approaches applied to a novel corpus of 100 popular songs. I use crowd sourcing via Amazon Mechanical Turk to collect line-level emotions annotations for this collection of song lyrics. I then build classifiers that rely on textual features to automatically identify the presence of one or more of the following six Ekman emotions: anger, disgust, fear, joy, sadness and surprise. I compare different classification systems and evaluate the performance of the automatic systems against the manual annotations. I also introduce a system that uses data collected from the social network Twitter. I use the Twitter API to collect a large corpus of tweets manually labeled by their authors for one of the six emotions of interest. I then compare the classification of emotions obtained when training on data automatically collected from Twitter versus data obtained through crowd sourced annotations.
Automated Defense Against Worm Propagation.
Worms have caused significant destruction over the last few years. Network security elements such as firewalls, IDS, etc have been ineffective against worms. Some worms are so fast that a manual intervention is not possible. This brings in the need for a stronger security architecture which can automatically react to stop worm propagation. The method has to be signature independent so that it can stop new worms. In this thesis, an automated defense system (ADS) is developed to automate defense against worms and contain the worm to a level where manual intervention is possible. This is accomplished with a two level architecture with feedback at each level. The inner loop is based on control system theory and uses the properties of PID (proportional, integral and differential controller). The outer loop works at the network level and stops the worm to reach its spread saturation point. In our lab setup, we verified that with only inner loop active the worm was delayed, and with both loops active we were able to restrict the propagation to 10% of the targeted hosts. One concern for deployment of a worm containment mechanism was degradation of throughput for legitimate traffic. We found that with proper intelligent algorithm we can minimize the degradation to an acceptable level.
Automated GUI Tests Generation for Android Apps Using Q-learning
Mobile applications are growing in popularity and pose new problems in the area of software testing. In particular, mobile applications heavily depend upon user interactions and a dynamically changing environment of system events. In this thesis, we focus on user-driven events and use Q-learning, a reinforcement machine learning algorithm, to generate tests for Android applications under test (AUT). We implement a framework that automates the generation of GUI test cases by using our Q-learning approach and compare it to a uniform random (UR) implementation. A novel feature of our approach is that we generate user-driven event sequences through the GUI, without the source code or the model of the AUT. Hence, considerable amount of cost and time are saved by avoiding the need for model generation for generating the tests. Our results show that the systematic path exploration used by Q-learning results in higher average code coverage in comparison to the uniform random approach.
Automated Syndromic Surveillance using Intelligent Mobile Agents
Current syndromic surveillance systems utilize centralized databases that are neither scalable in storage space nor in computing power. Such systems are limited in the amount of syndromic data that may be collected and analyzed for the early detection of infectious disease outbreaks. However, with the increased prevalence of international travel, public health monitoring must extend beyond the borders of municipalities or states which will require the ability to store vasts amount of data and significant computing power for analyzing the data. Intelligent mobile agents may be used to create a distributed surveillance system that will utilize the hard drives and computer processing unit (CPU) power of the hosts on the agent network where the syndromic information is located. This thesis proposes the design of a mobile agent-based syndromic surveillance system and an agent decision model for outbreak detection. Simulation results indicate that mobile agents are capable of detecting an outbreak that occurs at all hosts the agent is monitoring. Further study of agent decision models is required to account for localized epidemics and variable agent movement rates.
Automated Testing of Interactive Systems
Computer systems which interact with human users to collect, update or provide information are growing more complex. Additionally, users are demanding more thorough testing of all computer systems. Because of the complexity and thoroughness required, automation of interactive systems testing is desirable, especially for functional testing. Many currently available testing tools, like program proving, are impractical for testing large systems. The solution presented here is the development of an automated test system which simulates human users. This system incorporates a high-level programming language, ATLIS. ATLIS programs are compiled and interpretively executed. Programs are selected for execution by operator command, and failures are reported to the operator's console. An audit trail of all activity is provided. This solution provides improved efficiency and effectiveness over conventional testing methods.
Automatic Removal of Complex Shadows From Indoor Videos
Shadows in indoor scenarios are usually characterized with multiple light sources that produce complex shadow patterns of a single object. Without removing shadow, the foreground object tends to be erroneously segmented. The inconsistent hue and intensity of shadows make automatic removal a challenging task. In this thesis, a dynamic thresholding and transfer learning-based method for removing shadows is proposed. The method suppresses light shadows with a dynamically computed threshold and removes dark shadows using an online learning strategy that is built upon a base classifier trained with manually annotated examples and refined with the automatically identified examples in the new videos. Experimental results demonstrate that despite variation of lighting conditions in videos our proposed method is able to adapt to the videos and remove shadows effectively. The sensitivity of shadow detection changes slightly with different confidence levels used in example selection for classifier retraining and high confidence level usually yields better performance with less retraining iterations.
Automatic Speech Recognition Using Finite Inductive Sequences
This dissertation addresses the general problem of recognition of acoustic signals which may be derived from speech, sonar, or acoustic phenomena. The specific problem of recognizing speech is the main focus of this research. The intention is to design a recognition system for a definite number of discrete words. For this purpose specifically, eight isolated words from the T1MIT database are selected. Four medium length words "greasy," "dark," "wash," and "water" are used. In addition, four short words are considered "she," "had," "in," and "all." The recognition system addresses the following issues: filtering or preprocessing, training, and decision-making. The preprocessing phase uses linear predictive coding of order 12. Following the filtering process, a vector quantization method is used to further reduce the input data and generate a finite inductive sequence of symbols representative of each input signal. The sequences generated by the vector quantization process of the same word are factored, and a single ruling or reference template is generated and stored in a codebook. This system introduces a new modeling technique which relies heavily on the basic concept that all finite sequences are finitely inductive. This technique is used in the training stage. In order to accommodate the variabilities in speech, the training is performed casualty, and a large number of training speakers is used from eight different dialect regions. Hence, a speaker independent recognition system is realized. The matching process compares the incoming speech with each of the templates stored, and a closeness ration is computed. A ratio table is generated anH the matching word that corresponds to the smallest ratio (i.e. indicating that the ruling has removed most of the symbols) is selected. Promising results were obtained for isolated words, and the recognition rates ranged between 50% and 100%.
Automatic Tagging of Communication Data
Globally distributed software teams are widespread throughout industry. But finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data. Thus having a web-based tool that can automatically analyze the communication patterns among global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying different types of group activities that occur within a global software development project. In order to achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project.
Autonomic Failure Identification and Diagnosis for Building Dependable Cloud Computing Systems
The increasingly popular cloud-computing paradigm provides on-demand access to computing and storage with the appearance of unlimited resources. Users are given access to a variety of data and software utilities to manage their work. Users rent virtual resources and pay for only what they use. In spite of the many benefits that cloud computing promises, the lack of dependability in shared virtualized infrastructures is a major obstacle for its wider adoption, especially for mission-critical applications. Virtualization and multi-tenancy increase system complexity and dynamicity. They introduce new sources of failure degrading the dependability of cloud computing systems. To assure cloud dependability, in my dissertation research, I develop autonomic failure identification and diagnosis techniques that are crucial for understanding emergent, cloud-wide phenomena and self-managing resource burdens for cloud availability and productivity enhancement. We study the runtime cloud performance data collected from a cloud test-bed and by using traces from production cloud systems. We define cloud signatures including those metrics that are most relevant to failure instances. We exploit profiled cloud performance data in both time and frequency domain to identify anomalous cloud behaviors and leverage cloud metric subspace analysis to automate the diagnosis of observed failures. We implement a prototype of the anomaly identification system and conduct the experiments in an on-campus cloud computing test-bed and by using the Google datacenter traces. Our experimental results show that our proposed anomaly detection mechanism can achieve 93% detection sensitivity while keeping the false positive rate as low as 6.1% and outperform other tested anomaly detection schemes. In addition, the anomaly detector adapts itself by recursively learning from these newly verified detection results to refine future detection.
Autonomic Zero Trust Framework for Network Protection
With the technological improvements, the number of Internet connected devices is increasing tremendously. We also observe an increase in cyberattacks since the attackers want to use all these interconnected devices for malicious intention. Even though there exist many proactive security solutions, it is not practical to run all the security solutions on them as they have limited computational resources and even battery operated. As an alternative, Zero Trust Architecture (ZTA) has become popular is because it defines boundaries and requires to monitor all events, configurations, and connections and evaluate them to enforce rejecting by default and accepting only if they are known and accepted as well as applies a continuous trust evaluation. In addition, we need to be able to respond as quickly as possible, which cannot be managed by human interaction but through autonomous computing paradigm. Therefore, in this work, we propose a framework that would implement ZTA using autonomous computing paradigm. The proposed solution, Autonomic ZTA Management Engine (AZME) framework, focusing on enforcing ZTA on network, uses a set of sensors to monitor a network, a set of user-defined policies to define which actions to be taken (through controller). We have implemented a Python prototype as a proof-of-concept that checks network packets and enforce ZTA by checking the individual source and destination based on the given policies and continuously evaluate the trust of connections. If an unaccepted connection is made, it can block the connection by creating firewall rule at runtime.
Bayesian Probabilistic Reasoning Applied to Mathematical Epidemiology for Predictive Spatiotemporal Analysis of Infectious Diseases
Abstract Probabilistic reasoning under uncertainty suits well to analysis of disease dynamics. The stochastic nature of disease progression is modeled by applying the principles of Bayesian learning. Bayesian learning predicts the disease progression, including prevalence and incidence, for a geographic region and demographic composition. Public health resources, prioritized by the order of risk levels of the population, will efficiently minimize the disease spread and curtail the epidemic at the earliest. A Bayesian network representing the outbreak of influenza and pneumonia in a geographic region is ported to a newer region with different demographic composition. Upon analysis for the newer region, the corresponding prevalence of influenza and pneumonia among the different demographic subgroups is inferred for the newer region. Bayesian reasoning coupled with disease timeline is used to reverse engineer an influenza outbreak for a given geographic and demographic setting. The temporal flow of the epidemic among the different sections of the population is analyzed to identify the corresponding risk levels. In comparison to spread vaccination, prioritizing the limited vaccination resources to the higher risk groups results in relatively lower influenza prevalence. HIV incidence in Texas from 1989-2002 is analyzed using demographic based epidemic curves. Dynamic Bayesian networks are integrated with probability distributions of HIV surveillance data coupled with the census population data to estimate the proportion of HIV incidence among the different demographic subgroups. Demographic based risk analysis lends to observation of varied spectrum of HIV risk among the different demographic subgroups. A methodology using hidden Markov models is introduced that enables to investigate the impact of social behavioral interactions in the incidence and prevalence of infectious diseases. The methodology is presented in the context of simulated disease outbreak data for influenza. Probabilistic reasoning analysis enhances the understanding of disease progression in order to identify the critical points of surveillance, …
BC Framework for CAV Edge Computing
Edge computing and CAV (Connected Autonomous Vehicle) fields can work as a team. With the short latency and high responsiveness of edge computing, it is a better fit than cloud computing in the CAV field. Moreover, containerized applications are getting rid of the annoying procedures for setting the required environment. So that deployment of applications on new machines is much more user-friendly than before. Therefore, this paper proposes a framework developed for the CAV edge computing scenario. This framework consists of various programs written in different languages. The framework uses Docker technology to containerize these applications so that the deployment could be simple and easy. This framework consists of two parts. One is for the vehicle on-board unit, which exposes data to the closest edge device and receives the output generated by the edge device. Another is for the edge device, which is responsible for collecting and processing big load of data and broadcasting output to vehicles. So the vehicle does not need to perform the heavyweight tasks that could drain up the limited power.
Biomedical Semantic Embeddings: Using Hybrid Sentences to Construct Biomedical Word Embeddings and its Applications
Word embeddings is a useful method that has shown enormous success in various NLP tasks, not only in open domain but also in biomedical domain. The biomedical domain provides various domain specific resources and tools that can be exploited to improve performance of these word embeddings. However, most of the research related to word embeddings in biomedical domain focuses on analysis of model architecture, hyper-parameters and input text. In this paper, we use SemMedDB to design new sentences called `Semantic Sentences'. Then we use these sentences in addition to biomedical text as inputs to the word embedding model. This approach aims at introducing biomedical semantic types defined by UMLS, into the vector space of word embeddings. The semantically rich word embeddings presented here rivals state of the art biomedical word embedding in both semantic similarity and relatedness metrics up to 11%. We also demonstrate how these semantic types in word embeddings can be utilized.
Boosting for Learning From Imbalanced, Multiclass Data Sets
In many real-world applications, it is common to have uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied with the improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers are created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant reaching above 50%. As the volume of training data increase, the gain of efficiency with the proposed method becomes more significant.
Bounded Dynamic Source Routing in Mobile Ad Hoc Networks
A mobile ad hoc network (MANET) is a collection of mobile platforms or nodes that come together to form a network capable of communicating with each other, without the help of a central controller. To avail the maximum potential of a MANET, it is of great importance to devise a routing scheme, which will optimize upon the performance of a MANET, given the high rate of random mobility of the nodes. In a MANET individual nodes perform the routing functions like route discovery, route maintenance and delivery of packets from one node to the other. Existing routing protocols flood the network with broadcasts of route discovery messages, while attempting to establish a route. This characteristic is instrumental in deteriorating the performance of a MANET, as resource overhead triggered by broadcasts is directly proportional to the size of the network. Bounded-dynamic source routing (B-DSR), is proposed to curb this multitude of superfluous broadcasts, thus enabling to reserve valuable resources like bandwidth and battery power. B-DSR establishes a bounded region in the network, only within which, transmissions of route discovery messages are processed and validated for establishing a route. All route discovery messages reaching outside of this bounded region are dropped, thus preventing the network from being flooded. In addition B-DSR also guarantees loop-free routing and is robust for a rapid recovery when routes in the network change.
Brain Computer Interface (BCI) Applications: Privacy Threats and Countermeasures
In recent years, brain computer interfaces (BCIs) have gained popularity in non-medical domains such as the gaming, entertainment, personal health, and marketing industries. A growing number of companies offer various inexpensive consumer grade BCIs and some of these companies have recently introduced the concept of BCI "App stores" in order to facilitate the expansion of BCI applications and provide software development kits (SDKs) for other developers to create new applications for their devices. The BCI applications access to users' unique brainwave signals, which consequently allows them to make inferences about users' thoughts and mental processes. Since there are no specific standards that govern the development of BCI applications, its users are at the risk of privacy breaches. In this work, we perform first comprehensive analysis of BCI App stores including software development kits (SDKs), application programming interfaces (APIs), and BCI applications w.r.t privacy issues. The goal is to understand the way brainwave signals are handled by BCI applications and what threats to the privacy of users exist. Our findings show that most applications have unrestricted access to users' brainwave signals and can easily extract private information about their users without them even noticing. We discuss potential privacy threats posed by current practices used in BCI App stores and then describe some countermeasures that could be used to mitigate the privacy threats. Also, develop a prototype which gives the BCI app users a choice to restrict their brain signal dynamically.
BSM Message and Video Streaming Quality Comparative Analysis Using Wave Short Message Protocol (WSMP)
Vehicular ad-hoc networks (VANETs) are used for vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. The IEEE 802.11p/WAVE (Wireless Access in Vehicular Environment) and with WAVE Short Messaging Protocol (WSMP) has been proposed as the standard protocol for designing applications for VANETs. This communication protocol must be thoroughly tested before reliable and efficient applications can be built using its protocols. In this paper, we perform on-road experiments in a variety of scenarios to evaluate the performance of the standard. We use commercial VANET devices with 802.11p/WAVE compliant chipsets for both BSM (basic safety messages) as well as video streaming applications using WSMP as a communication protocol. We show that while the standard performs well for BSM application in lightly loaded conditions, the performance becomes inferior when traffic and other performance metric increases. Furthermore, we also show that the standard is not suitable for video streaming due to the bursty nature of traffic and the bandwidth throttling, which is a major shortcoming for V2X applications.
Building an Intelligent Filtering System Using Idea Indexing
The widely used vector model maintains its popularity because of its simplicity, fast speed, and the appeal of using spatial proximity for semantic proximity. However, this model faces a disadvantage that is associated with the vagueness from keywords overlapping. Efforts have been made to improve the vector model. The research on improving document representation has been focused on four areas, namely, statistical co-occurrence of related items, forming term phrases, grouping of related words, and representing the content of documents. In this thesis, we propose the idea-indexing model to improve document representation for the filtering task in IR. The idea-indexing model matches document terms with the ideas they express and indexes the document with these ideas. This indexing scheme represents the document with its semantics instead of sets of independent terms. We show in this thesis that indexing with ideas leads to better performance.
A C Navigational System
The C Navigational System (CNS) is a proposed programming environment for the C programming language. The introduction covers the major influences of programming environments and the components of a programming environment. The system is designed to support the design, coding and maintenance phases of software development. CNS provides multiple views to both the source and documentation for a programming project. User-defined and system-defined links allow the source and documentation to be hierarchically searched. CNS also creates a history list and function interface for each function in a module. The final chapter compares CNS and several other programming environments (Microscope, Rn, Cedar, PECAN, and Marvel).
Classifying Pairwise Object Interactions: A Trajectory Analytics Approach
We have a huge amount of video data from extensively available surveillance cameras and increasingly growing technology to record the motion of a moving object in the form of trajectory data. With proliferation of location-enabled devices and ongoing growth in smartphone penetration as well as advancements in exploiting image processing techniques, tracking moving objects is more flawlessly achievable. In this work, we explore some domain-independent qualitative and quantitative features in raw trajectory (spatio-temporal) data in videos captured by a fixed single wide-angle view camera sensor in outdoor areas. We study the efficacy of those features in classifying four basic high level actions by employing two supervised learning algorithms and show how each of the features affect the learning algorithms’ overall accuracy as a single factor or confounded with others.
CLUE: A Cluster Evaluation Tool
Modern high performance computing is dependent on parallel processing systems. Most current benchmarks reveal only the high level computational throughput metrics, which may be sufficient for single processor systems, but can lead to a misrepresentation of true system capability for parallel systems. A new benchmark is therefore proposed. CLUE (Cluster Evaluator) uses a cellular automata algorithm to evaluate the scalability of parallel processing machines. The benchmark also uses algorithmic variations to evaluate individual system components' impact on the overall serial fraction and efficiency. CLUE is not a replacement for other performance-centric benchmarks, but rather shows the scalability of a system and provides metrics to reveal where one can improve overall performance. CLUE is a new benchmark which demonstrates a better comparison among different parallel systems than existing benchmarks and can diagnose where a particular parallel system can be optimized.
A Comparative Analysis of Guided vs. Query-Based Intelligent Tutoring Systems (ITS) Using a Class-Entity-Relationship-Attribute (CERA) Knowledge Base
One of the greatest problems facing researchers in the sub field of Artificial Intelligence known as Intelligent Tutoring Systems (ITS) is the selection of a knowledge base designs that will facilitate the modification of the knowledge base. The Class-Entity-Relationship-Attribute (CERA), proposed by R. P. Brazile, holds certain promise as a more generic knowledge base design framework upon which can be built robust and efficient ITS. This study has a twofold purpose. The first is to demonstrate that a CERA knowledge base can be constructed for an ITS on a subset of the domain of Cretaceous paleontology and function as the "expert module" of the ITS. The second is to test the validity of the ideas that students guided through a lesson learn more factual knowledge, while those who explore the knowledge base that underlies the lesson through query at their own pace will be able to formulate their own integrative knowledge from the knowledge gained in their explorations and spend more time on the system. This study concludes that a CERA-based system can be constructed as an effective teaching tool. However, while an ITS - treatment provides for statistically significant gains in achievement test scores, the type of treatment seems not to matter as much as time spent on task. This would seem to indicate that a query-based system which allows the user to progress at their own pace would be a better type of system for the presentation of material due to the greater amount of on-line computer time exhibited by the users.
A Comparison of Agent-Oriented Software Engineering Frameworks and Methodologies
Agent-oriented software engineering (AOSE) covers issues on developing systems with software agents. There are many techniques, mostly agent-oriented and object-oriented, ready to be chosen as building blocks to create agent-based systems. There have been several AOSE methodologies proposed intending to show engineers guidelines on how these elements are constituted in having agents achieve the overall system goals. Although these solutions are promising, most of them are designed in ad-hoc manner without truly obeying software developing life-cycle fully, as well as lacking of examinations on agent-oriented features. To address these issues, we investigated state-of-the-art techniques and AOSE methodologies. By examining them in different respects, we commented on the strength and weakness of them. Toward a formal study, a comparison framework has been set up regarding four aspects, including concepts and properties, notations and modeling techniques, process, and pragmatics. Under these criteria, we conducted the comparison in both overview and detailed level. The comparison helped us with empirical and analytical study, to inspect the issues on how an ideal agent-based system will be formed.
A Comparison of File Organization Techniques
This thesis compares the file organization techniques that are implemented on two different types of computer systems, the large-scale and the small-scale. File organizations from representative computers in each class are examined in detail: the IBM System/370 (OS/370) and the Harris 1600 Distributed Processing System with the Extended Communications Operating System (ECOS). In order to establish the basic framework for comparison, an introduction to file organizations is presented. Additionally, the functional requirements for file organizations are described by their characteristics and user demands. Concluding remarks compare file organization techniques and discuss likely future developments of file systems.
Computational Complexity of Hopfield Networks
There are three main results in this dissertation. They are PLS-completeness of discrete Hopfield network convergence with eight different restrictions, (degree 3, bipartite and degree 3, 8-neighbor mesh, dual of the knight's graph, hypercube, butterfly, cube-connected cycles and shuffle-exchange), exponential convergence behavior of discrete Hopfield network, and simulation of Turing machines by discrete Hopfield Network.
Computational Epidemiology - Analyzing Exposure Risk: A Deterministic, Agent-Based Approach
Many infectious diseases are spread through interactions between susceptible and infectious individuals. Keeping track of where each exposure to the disease took place, when it took place, and which individuals were involved in the exposure can give public health officials important information that they may use to formulate their interventions. Further, knowing which individuals in the population are at the highest risk of becoming infected with the disease may prove to be a useful tool for public health officials trying to curtail the spread of the disease. Epidemiological models are needed to allow epidemiologists to study the population dynamics of transmission of infectious agents and the potential impact of infectious disease control programs. While many agent-based computational epidemiological models exist in the literature, they focus on the spread of disease rather than exposure risk. These models are designed to simulate very large populations, representing individuals as agents, and using random experiments and probabilities in an attempt to more realistically guide the course of the modeled disease outbreak. The work presented in this thesis focuses on tracking exposure risk to chickenpox in an elementary school setting. This setting is chosen due to the high level of detailed information realistically available to school administrators regarding individuals' schedules and movements. Using an agent-based approach, contacts between individuals are tracked and analyzed with respect to both individuals and locations. The results are then analyzed using a combination of tools from computer science and geographic information science.
Computational Methods for Discovering and Analyzing Causal Relationships in Health Data
Publicly available datasets in health science are often large and observational, in contrast to experimental datasets where a small number of data are collected in controlled experiments. Variables' causal relationships in the observational dataset are yet to be determined. However, there is a significant interest in health science to discover and analyze causal relationships from health data since identified causal relationships will greatly facilitate medical professionals to prevent diseases or to mitigate the negative effects of the disease. Recent advances in Computer Science, particularly in Bayesian networks, has initiated a renewed interest for causality research. Causal relationships can be possibly discovered through learning the network structures from data. However, the number of candidate graphs grows in a more than exponential rate with the increase of variables. Exact learning for obtaining the optimal structure is thus computationally infeasible in practice. As a result, heuristic approaches are imperative to alleviate the difficulty of computations. This research provides effective and efficient learning tools for local causal discoveries and novel methods of learning causal structures with a combination of background knowledge. Specifically in the direction of constraint based structural learning, polynomial-time algorithms for constructing causal structures are designed with first-order conditional independence. Algorithms of efficiently discovering non-causal factors are developed and proved. In addition, when the background knowledge is partially known, methods of graph decomposition are provided so as to reduce the number of conditioned variables. Experiments on both synthetic data and real epidemiological data indicate the provided methods are applicable to large-scale datasets and scalable for causal analysis in health data. Followed by the research methods and experiments, this dissertation gives thoughtful discussions on the reliability of causal discoveries computational health science research, complexity, and implications in health science research.
Computational Methods for Vulnerability Analysis and Resource Allocation in Public Health Emergencies
POD (Point of Dispensing)-based emergency response plans involving mass prophylaxis may seem feasible when considering the choice of dispensing points within a region, overall population density, and estimated traffic demands. However, the plan may fail to serve particular vulnerable sub-populations, resulting in access disparities during emergency response. Federal authorities emphasize on the need to identify sub-populations that cannot avail regular services during an emergency due to their special needs to ensure effective response. Vulnerable individuals require the targeted allocation of appropriate resources to serve their special needs. Devising schemes to address the needs of vulnerable sub-populations is essential for the effectiveness of response plans. This research focuses on data-driven computational methods to quantify and address vulnerabilities in response plans that require the allocation of targeted resources. Data-driven methods to identify and quantify vulnerabilities in response plans are developed as part of this research. Addressing vulnerabilities requires the targeted allocation of appropriate resources to PODs. The problem of resource allocation to PODs during public health emergencies is introduced and the variants of the resource allocation problem such as the spatial allocation, spatio-temporal allocation and optimal resource subset variants are formulated. Generating optimal resource allocation and scheduling solutions can be computationally hard problems. The application of metaheuristic techniques to find near-optimal solutions to the resource allocation problem in response plans is investigated. A vulnerability analysis and resource allocation framework that facilitates the demographic analysis of population data in the context of response plans, and the optimal allocation of resources with respect to the analysis are described.
A Computer Algorithm for Synthetic Seismograms
Synthetic seismograms are a computer-generated aid in the search for hydrocarbons. Heretofore the solution has been done by z-transforms. This thesis presents a solution based on the method of finite differences. The resulting algorithm is fast and compact. The method is applied to three variations of the problem, all three are reduced to the same approximating equation, which is shown to be optimal, in that grid refinement does not change it. Two types of algorithms are derived from the equation. The number of obvious multiplications, additions and subtractions of each is analyzed. Critical section of each requires one multiplication, two additions and two subtractions. Four sample synthetic seismograms are shown. Implementation of the new algorithm runs twice as fast as previous computer program.
Computer Analysis of Amino Acid Chromatography
The problem with which this research was done was that of applying the IBM360 computer to the analysis of waveforms from a Beckman model 120C liquid chromatograph. Software to interpret these waveforms was written in the PLl language. For a control run, input to the computer consisted of a digital tape containing the raw results of the chromatograph run. Output consisted of several graphs and charts giving the results of the analysis. In addition, punched output was provided which gave the name of each amino acid, its elution time and color constant. These punched cards were then input to the computer as input to the experimental run, along with the raw data on the digital tape. From the known amounts of amino acids in the control run and the ratio of control to experimental peak area, the amino acids of the unknown were quantified. The resulting programs provided a complete and easy to use solution to the problem of chromatographic data analysis.
Computer Graphics Primitives and the Scan-Line Algorithm
This paper presents the scan-line algorithm which has been implemented on the Lisp Machine. The scan-line algorithm resides beneath a library of primitive software routines which draw more fundamental objects: lines, triangles and rectangles. This routine, implemented in microcode, applies the A(BC)*D approach to word boundary alignments in order to create an extremely fast, efficient, and general purpose drawing primitive. The scan-line algorithm improves on previous methodologies by limiting the number of CPU intensive instructions and by minimizing the number of words referenced. This paper will describe how to draw scan-lines and the constraints imposed upon the scan-line algorithm by the Lisp Machine's hardware and software.
Computer Realization of Human Music Cognition
This study models the human process of music cognition on the digital computer. The definition of music cognition is derived from the work in music cognition done by the researchers Carol Krumhansl and Edward Kessler, and by Mari Jones, as well as from the music theories of Heinrich Schenker. The computer implementation functions in three stages. First, it translates a musical "performance" in the form of MIDI (Musical Instrument Digital Interface) messages into LISP structures. Second, the various parameters of the performance are examined separately a la Jones's joint accent structure, quantified according to psychological findings, and adjusted to a common scale. The findings of Krumhansl and Kessler are used to evaluate the consonance of each note with respect to the key of the piece and with respect to the immediately sounding harmony. This process yields a multidimensional set of points, each of which is a cognitive evaluation of a single musical event within the context of the piece of music within which it occurred. This set of points forms a metric space in multi-dimensional Euclidean space. The third phase of the analysis maps the set of points into a topology-preserving data structure for a Schenkerian-like middleground structural analysis. This process yields a hierarchical stratification of all the musical events (notes) in a piece of music. It has been applied to several pieces of music with surprising results. In each case, the analysis obtained very closely resembles a structural analysis which would be supplied by a human theorist. The results obtained invite us to take another look at the representation of knowledge and perception from another perspective, that of a set of points in a topological space, and to ask if such a representation might not be useful in other domains. It also leads us to ask if such a …
Computerized Analysis of Radiograph Images of Embedded Objects as Applied to Bone Location and Mineral Content Measurement
This investigation dealt with locating and measuring x-ray absorption of radiographic images. The methods developed provide a fast, accurate, minicomputer control, for analysis of embedded objects. A PDP/8 computer system was interfaced with a Joyce Loebl 3CS Microdensitometer and a Leeds & Northrup Recorder. Proposed algorithms for bone location and data smoothing work on a twelve-bit minicomputer. Designs of a software control program and operational procedure are presented. The filter made wedge and limb scans monotonic from minima to maxima. It was tested for various convoluted intervals. Ability to resmooth the same data in multiple passes was tested. An interval size of fifteen works well in one pass.
Concurrent Pattern Recognition and Optical Character Recognition
The problem of interest as indicated is to develop a general purpose technique that is a combination of the structural approach, and an extension of the Finite Inductive Sequence (FI) technique. FI technology is pre-algebra, and deals with patterns for which an alphabet can be formulated.
Convexity-Preserving Scattered Data Interpolation
Surface fitting methods play an important role in many scientific fields as well as in computer aided geometric design. The problem treated here is that of constructing a smooth surface that interpolates data values associated with scattered nodes in the plane. The data is said to be convex if there exists a convex interpolant. The problem of convexity-preserving interpolation is to determine if the data is convex, and construct a convex interpolant if it exists.
Cross Language Information Retrieval for Languages with Scarce Resources
Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system.
Back to Top of Screen