Search Results

Bayesian Probabilistic Reasoning Applied to Mathematical Epidemiology for Predictive Spatiotemporal Analysis of Infectious Diseases
Probabilistic reasoning under uncertainty is well suited to the analysis of disease dynamics. The stochastic nature of disease progression is modeled by applying the principles of Bayesian learning. Bayesian learning predicts the disease progression, including prevalence and incidence, for a geographic region and demographic composition. Public health resources, prioritized in order of the population's risk levels, will efficiently minimize the spread of disease and curtail the epidemic as early as possible. A Bayesian network representing the outbreak of influenza and pneumonia in a geographic region is ported to a new region with a different demographic composition. Analysis for the new region infers the corresponding prevalence of influenza and pneumonia among its demographic subgroups. Bayesian reasoning coupled with the disease timeline is used to reverse engineer an influenza outbreak for a given geographic and demographic setting. The temporal flow of the epidemic among the different sections of the population is analyzed to identify the corresponding risk levels. Compared with spread vaccination, prioritizing limited vaccination resources for the higher-risk groups results in relatively lower influenza prevalence. HIV incidence in Texas from 1989-2002 is analyzed using demographic-based epidemic curves. Dynamic Bayesian networks are integrated with probability distributions of HIV surveillance data coupled with census population data to estimate the proportion of HIV incidence among the different demographic subgroups. Demographic-based risk analysis reveals a varied spectrum of HIV risk among the demographic subgroups. A methodology using hidden Markov models is introduced that enables investigation of the impact of social behavioral interactions on the incidence and prevalence of infectious diseases. The methodology is presented in the context of simulated disease outbreak data for influenza. Probabilistic reasoning analysis enhances the understanding of disease progression in order to identify the critical points of surveillance, …
Boosting for Learning From Imbalanced, Multiclass Data Sets
In many real-world applications, it is common to have an uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need long training times and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied by an improvement in the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers is created and the maximum accuracy improvement exceeds 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant, exceeding 50%. As the volume of training data increases, the efficiency gain of the proposed method becomes more significant.
QoS Aware Service Oriented Architecture
Service-oriented architecture enables web services to operate in a loosely coupled setting and provides an environment for dynamic discovery and use of services over a network using standards such as WSDL, SOAP, and UDDI. A web service has both functional and non-functional characteristics. This thesis proposes adding QoS descriptions (non-functional properties) to WSDL and composing various services to form a business process. The composition of web services considers QoS properties along with functional properties, and the composed services can in turn be published as a new web service and take part in any other composition using Composed WSDL.
Computer Realization of Human Music Cognition
This study models the human process of music cognition on the digital computer. The definition of music cognition is derived from the work in music cognition done by the researchers Carol Krumhansl and Edward Kessler, and by Mari Jones, as well as from the music theories of Heinrich Schenker. The computer implementation functions in three stages. First, it translates a musical "performance" in the form of MIDI (Musical Instrument Digital Interface) messages into LISP structures. Second, the various parameters of the performance are examined separately a la Jones's joint accent structure, quantified according to psychological findings, and adjusted to a common scale. The findings of Krumhansl and Kessler are used to evaluate the consonance of each note with respect to the key of the piece and with respect to the immediately sounding harmony. This process yields a multidimensional set of points, each of which is a cognitive evaluation of a single musical event within the context of the piece of music within which it occurred. This set of points forms a metric space in multi-dimensional Euclidean space. The third phase of the analysis maps the set of points into a topology-preserving data structure for a Schenkerian-like middleground structural analysis. This process yields a hierarchical stratification of all the musical events (notes) in a piece of music. It has been applied to several pieces of music with surprising results. In each case, the analysis obtained very closely resembles a structural analysis which would be supplied by a human theorist. The results obtained invite us to take another look at the representation of knowledge and perception from another perspective, that of a set of points in a topological space, and to ask if such a representation might not be useful in other domains. It also leads us to ask if such a …
3GPP Long Term Evolution LTE Scheduling
Future generation cellular networks are expected to deliver an omnipresent broadband access network for an ever-increasing number of subscribers. Long Term Evolution (LTE) represents a significant milestone towards the wireless networks known as 4G cellular networks. A key feature of LTE is the implementation of an enhanced Radio Resource Management (RRM) mechanism to improve system performance. The structure of LTE networks was simplified by reducing the number of nodes in the core network. Also, the design of the radio protocol architecture is quite unique. In order to achieve a high data rate in LTE, the 3rd Generation Partnership Project (3GPP) has selected Orthogonal Frequency Division Multiplexing (OFDM) as the appropriate scheme for the downlink. However, the proper scheme for the uplink is Single-Carrier Frequency Division Multiple Access (SC-FDMA) due to the peak-to-average-power-ratio (PAPR) constraint. LTE packet scheduling plays a primary role as part of RRM in improving the system’s data rate as well as supporting the various QoS requirements of mobile services. The major function of the LTE packet scheduler is to assign Physical Resource Blocks (PRBs) to mobile User Equipment (UE). In our work, we propose a packet scheduling algorithm that acts based on the number of UEs attached to the eNodeB. To evaluate the proposed algorithm, we assumed two different scenarios based on the number of UEs. When the number of UEs is lower than the number of PRBs, the UEs with the highest Channel Quality Indicator (CQI) are assigned PRBs. Otherwise, the scheduler assigns PRBs based on a given proportional fairness metric. The eNodeB’s throughput increased when the proposed algorithm was implemented.
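The two-scenario rule described above can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the function name `schedule_prbs`, the use of CQI as a stand-in for the achievable rate, the metric CQI divided by past average throughput, and the round-robin handling of spare PRBs are all assumptions made for illustration.

```python
def schedule_prbs(cqi, avg_throughput, num_prbs):
    """Return a dict mapping PRB index -> UE id (hypothetical sketch).

    cqi: dict of UE id -> current CQI (higher means a better channel).
    avg_throughput: dict of UE id -> past average throughput (must be > 0).
    """
    ues = list(cqi)
    if not ues or num_prbs <= 0:
        return {}
    if len(ues) < num_prbs:
        # Fewer UEs than PRBs: rank UEs by CQI alone.
        ranked = sorted(ues, key=lambda ue: cqi[ue], reverse=True)
    else:
        # Otherwise rank by an assumed proportional fairness metric:
        # current channel quality relative to past average throughput.
        ranked = sorted(ues, key=lambda ue: cqi[ue] / avg_throughput[ue],
                        reverse=True)
    # Hand out PRBs in ranked order; when PRBs outnumber UEs, wrap around
    # so the best-ranked UEs receive the extra blocks (an assumption).
    return {prb: ranked[prb % len(ranked)] for prb in range(num_prbs)}
```

With two UEs and three PRBs the UE with the higher CQI receives two blocks; with three UEs and two PRBs, a UE with a modest CQI but a low past throughput can outrank a UE with a better channel.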
Towards a Unilateral Sensing System for Detecting Person-to-Person Contacts
The contact patterns among individuals can significantly affect the progress of an infectious outbreak within a population. Gathering data about these interaction and mixing patterns is essential to assess computational modeling of infectious diseases. Various self-report approaches have been designed in different studies to collect data about contact rates and patterns. Recent advances in sensing technology provide researchers with bilateral automated data collection devices that facilitate contact gathering and overcome the disadvantages of previous approaches. In this study, a novel unilateral wearable sensing architecture is proposed that overcomes the limitations of bilateral sensing. Our unilateral wearable sensing system gathers contact data using hybrid sensor arrays embedded in a wearable shirt. A smartphone application transfers the collected sensor data to the cloud, where a deep learning model estimates the number of human contacts, and the results are stored in a cloud database. The deep learning model was developed on hand-labelled data over multiple experiments. The model has been tested and evaluated, and the results are reported in the study. Sensitivity analysis has been performed to choose the most suitable image resolution and format for the model to estimate contacts and to analyze the model's consumption of computer resources.
Real-time Rendering of Burning Objects in Video Games
In recent years there has been growing interest in limitless realism in computer graphics applications. Within this area, my main focus is on complex physical simulation and modeling with diverse applications for the gaming industry. Various simulations have successfully replicated the details of physical processes. As a result, some were strong enough to lure the user into believable virtual worlds that could destroy any sense of attendance. In this research, I focus on fire simulation and the deformation it causes in various virtual objects. In most game engines, model loading takes place at the beginning of the game or when the game is transitioning between levels. Game models are stored in large data structures. Since changing or adjusting a large data structure while the game is running may adversely affect performance, developers may choose to avoid procedural simulations to save resources and avoid interruptions. I introduce a process that implements real-time model deformation while maintaining performance. It is a challenging task to achieve high-quality simulation while using minimal resources to represent multiple events in a timely manner. Especially in video games, the result must be robust enough to sustain the engaged player's willing suspension of disbelief. I have implemented and tested my method on a relatively modest GPU using CUDA. My experiments show that this method gives a believable visual effect while using a small fraction of CPU and GPU resources.
An Integrated Architecture for Ad Hoc Grids
Extensive research has been conducted by the grid community to enable large-scale collaborations in pre-configured environments. Grid collaborations can vary in scale and motivation, resulting in a coarse classification of grids: national grid, project grid, enterprise grid, and volunteer grid. Despite the differences in scope and scale, all the traditional grids in practice share some common assumptions. They support mutually collaborative communities, adopt centralized control for membership, and assume a well-defined, non-changing collaboration. To support grid applications that do not conform to these assumptions, we propose the concept of ad hoc grids. In the context of this research, we propose a novel architecture for ad hoc grids that integrates a suite of component frameworks. Specifically, our architecture combines the community management framework, security framework, abstraction framework, quality of service framework, and reputation framework. The overarching objective of our integrated architecture is to support a variety of grid applications in a self-controlled fashion with the help of a self-organizing ad hoc community. We introduce mechanisms in our architecture that successfully isolate malicious elements from the community, inherently improving the quality of grid services and extracting deterministic quality assurances from the underlying infrastructure. We also emphasize the technology independence of our architecture, which offers the requisite platform for technology interoperability. The feasibility of the proposed architecture is verified with a high-quality ad hoc grid implementation. Additionally, we have analyzed the performance and behavior of ad hoc grids with respect to several control parameters.
Resource Efficient and Scalable Routing using Intelligent Mobile Agents
Many contemporary routing algorithms use simple mechanisms such as flooding or broadcasting to disseminate the routing information available to them. Such routing algorithms cause significant network resource overhead due to the large number of messages generated at each host/router throughout the route update process. Many of these messages are wasteful since they do not contribute to the route discovery process. Reducing the resource overhead may allow several algorithms to be deployed in a wide range of networks (wireless and ad hoc) that require a simple routing protocol due to limited availability of resources (memory and bandwidth). Motivated by the need to reduce the resource overhead associated with routing algorithms, a new implementation of the distance vector routing algorithm using an agent-based paradigm, known as Agent-based Distance Vector Routing (ADVR), has been proposed. In ADVR, the responsibility for route discovery and message passing shifts from the nodes to individual agents that traverse the network, coordinate with each other, and successively update the routing tables of the nodes they visit.
Concurrent Pattern Recognition and Optical Character Recognition
The problem of interest as indicated is to develop a general purpose technique that is a combination of the structural approach, and an extension of the Finite Inductive Sequence (FI) technique. FI technology is pre-algebra, and deals with patterns for which an alphabet can be formulated.
Independent Quadtrees
This dissertation deals with the problem of manipulating and storing an image using quadtrees. A quadtree is a tree in which each node has four ordered children or is a leaf. It can be used to represent an image via hierarchical decomposition. The image is broken into four regions. A region can be a solid color (homogeneous) or a mixture of colors (heterogeneous). If a region is heterogeneous it is broken into four subregions, and the process continues recursively until all subregions are homogeneous. The traditional quadtree suffers from dependence on the underlying grid. The grid coordinate system is implicit, and therefore fixed. The fixed coordinate system implies a rigid tree. A rigid tree cannot be translated, scaled, or rotated. Instead, a new tree must be built which is the result of one of these transformations. This dissertation introduces the independent quadtree. The independent quadtree is free of any underlying coordinate system. The tree is no longer rigid and can be easily translated, scaled, or rotated. Algorithms to perform these operations are presented. The translation and rotation algorithms take constant time. The scaling algorithm takes time linear in the number of nodes in the tree. The disadvantage of independent quadtrees is the longer generation and display time. This dissertation also introduces an alternate method of hierarchical decomposition. This new method finds the largest homogeneous block with respect to the corners of the image. This block defines the division point for the decomposition. If the size of the block is below some cutoff point, it is deemed to be too small to make the overhead worthwhile and the traditional method is used instead. This new method is compared to the traditional method on randomly generated rectangles, triangles, and circles. The new method is shown to use significantly less space for all three …
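The traditional hierarchical decomposition described above can be sketched in a few lines of Python. This shows only the conventional, grid-dependent quadtree, not the independent quadtree or the alternate decomposition the dissertation introduces. The names `build_quadtree` and `count_leaves`, the nested-list representation, and the NW, NE, SW, SE child order are assumptions chosen for illustration; the image is assumed to be square with a power-of-two side and non-list pixel values.

```python
def build_quadtree(img, x=0, y=0, size=None):
    """Decompose a square image (list of rows) into a quadtree.

    Returns the colour itself for a homogeneous region (a leaf), or a
    list of four children ordered [NW, NE, SW, SE] for a heterogeneous one.
    """
    if size is None:
        size = len(img)
    first = img[y][x]
    # A region is homogeneous when every pixel matches its first pixel.
    if all(img[r][c] == first
           for r in range(y, y + size)
           for c in range(x, x + size)):
        return first
    half = size // 2
    return [
        build_quadtree(img, x, y, half),                # NW
        build_quadtree(img, x + half, y, half),         # NE
        build_quadtree(img, x, y + half, half),         # SW
        build_quadtree(img, x + half, y + half, half),  # SE
    ]


def count_leaves(node):
    """Count homogeneous blocks (leaves) in a tree built above."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1
```

Because the child order and block positions are implied by the grid, translating or rotating the image means rebuilding the tree, which is the rigidity the abstract refers to.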
Inheritance Problems in Object-Oriented Database
This research is concerned with inheritance as used in object-oriented databases. More specifically, partial bi-directional inheritance among classes is examined. In partial inheritance, a class can inherit a proper subset of instance variables from another class. Two subclasses of the same superclass do not need to inherit the same proper subset of instance variables from their superclass. Bi-directional partial inheritance allows a class to inherit instance variables from its subclass. A prototype of an object-oriented database that supports both full and partial bi-directional inheritance among classes was developed on top of an existing relational database management system. The prototype was tested with two database applications. The first application required full and partial inheritance. The second required bi-directional inheritance. The results of this testing suggest both advantages and disadvantages of partial bi-directional inheritance. Future areas of research are also suggested.
Privacy Management for Online Social Networks
One in seven people in the world use online social networking for a variety of purposes -- to keep in touch with friends and family, to share special occasions, to broadcast announcements, and more. The majority of society has bought into this new era of communication technology, which allows everyone on the internet to share information with friends. Since social networking has rapidly become a main form of communication, holes in privacy have become apparent. It has come to the point that the whole concept of sharing information requires restructuring. No longer are online social networks simply technology available for a niche market; they are in use by all of society. Thus it is important not to forget that a sense of privacy is inherent as an evolutionary by-product of social intelligence. In any context of society, privacy needs to be a part of the system in order to help users protect themselves from others. This dissertation attempts to address the lack of privacy management in online social networks by designing models which understand the social science behind how we form social groups and share information with each other. Social relationship strength was modeled using activity patterns, vocabulary usage, and behavioral patterns. In addition, automatic configuration of default privacy settings was proposed to help prevent new users from leaking personal information. This dissertation aims to mobilize a new era of social networking that understands the social aspects of human networks and uses that knowledge to honor users' privacy.
An Efficient Hybrid Heuristic and Probabilistic Model for the Gate Matrix Layout Problem in VLSI Design
In this thesis, the gate matrix layout problem in VLSI design is considered, where the goal is to minimize the number of tracks required to lay out a given circuit, and a taxonomy of approaches to its solution is presented. An efficient hybrid heuristic is also proposed for this combinatorial optimization problem, based on a combination of a probabilistic hill-climbing technique and a greedy method. This heuristic is tested experimentally against four existing algorithms. As test cases, five benchmark problems from the literature as well as randomly generated problem instances are considered. The experimental results show that the proposed hybrid algorithm, on average, performs better than the other heuristics in terms of the required computation time and/or the quality of solution. Due to the computation-intensive nature of the problem, an exact solution within reasonable time limits is impossible. It is therefore difficult to judge the effectiveness of any heuristic in terms of the quality of solution (number of tracks required). A probabilistic model of the gate matrix layout problem that computes the expected number of tracks from the given input parameters is useful in this respect. Such a probabilistic model is proposed in this thesis, and its performance is experimentally evaluated.
Defensive Programming
This research explores the concepts of defensive programming as currently defined in the literature. These concepts are then extended and more explicitly defined. The relationship between defensive programming, as presented in this research, and current programming practices is discussed and several benefits are observed. Defensive programming appears to benefit the entire software life cycle. Four identifiable phases of the software development process are defined, and the relationship between these four phases and defensive programming is shown. In this research, defensive programming is defined as writing programs in such a way that during execution the program itself produces communication allowing the programmer and the user to observe its dynamic states accurately and critically. To accomplish this end, the use of defensive programming snapshots is presented as a software development tool.
Unique Channel Email System
Email connects 85% of the world. This paper explores the pattern of information overload encountered by the majority of email users and examines what steps key email providers are taking to combat the problem. Besides fighting spam, popular email providers offer very limited tools to reduce the amount of unwanted incoming email. Rather, there has been a trend to expand storage space and aid the organization of email. Storing email is very costly and harmful to the environment. Additionally, information overload can be detrimental to productivity. We propose a simple solution that results in a drastic reduction of unwanted mail, also known as graymail.
The Role of Intelligent Mobile Agents in Network Management and Routing
In this research, the application of intelligent mobile agents to the management of distributed network environments is investigated. Intelligent mobile agents are programs which can move about network systems in a deterministic manner while carrying their execution state. These agents can be considered an application of distributed artificial intelligence where the (usually small) agent code is moved to the data and executed locally. The mobile agent paradigm offers potential advantages over many conventional mechanisms which move (often large) data to the code, thereby wasting available network bandwidth. The performance of agents in network routing and knowledge acquisition has been investigated and simulated. A working mobile agent system has also been designed and implemented in JDK 1.2.
FORTRAN Optimizations at the Source Code Level
This paper discusses FORTRAN optimizations that the user can perform manually at the source code level to improve object code performance. It makes use of descriptive examples within the text of the paper for explanatory purposes. The paper defines key areas in writing a FORTRAN program and recommends ways to improve efficiency in these areas.
Multi-perspective, Multi-modal Image Registration and Fusion
Multi-modal image fusion is an active research area with many civilian and military applications. Fusion is defined as the strategic combination of information collected by various sensors from different locations or of different types in order to obtain a better understanding of an observed scene or situation. Fusion of multi-modal images cannot be completed unless the modalities are spatially aligned. In this research, I consider two important problems: multi-modal, multi-perspective image registration and decision-level fusion of multi-modal images, in particular LiDAR and visual imagery. Multi-modal image registration is a difficult task due to the different semantic interpretation of features extracted from each modality. This problem is decoupled into three sub-problems. The first step is identification and extraction of common features. The second step is the determination of corresponding points. The third step consists of determining the registration transformation parameters. Traditional registration methods use low-level features such as lines and corners. Using these features requires an extensive optimization search in order to determine the corresponding points. Many methods use global positioning systems (GPS) and a calibrated camera in order to obtain an initial estimate of the camera parameters. The advantages of this work over previous work are the following. First, I used high-level features, which significantly reduce the search space for the optimization process. Second, the determination of corresponding points is modeled as an assignment problem between a small number of objects. Separately, fusing LiDAR and visual images is beneficial due to the different and rich characteristics of both modalities. LiDAR data contain 3D information, while images contain visual information. Developing a fusion technique that uses the characteristics of both modalities is very important. I establish a decision-level fusion technique using manifold models.
Improving Digital Circuit Simulation: A Knowledge-Based Approach
This project focuses on a prototype system architecture which integrates features of an event-driven gate-level simulator and features of the multiple expert system architecture, HEARSAY-II. Combining artificial intelligence and simulation techniques, a knowledge-based simulator was designed and constructed to model non-standard circuit behavior. This non-standard circuit behavior is amplified by advances in integrated circuit technology. Currently available digital circuit simulators cannot simulate this behavior. Circuit designer expertise on behavioral phenomena is used in the expert system to guide the base simulator by manipulating its events to achieve the desired behavior.
Brain Computer Interface (BCI) Applications: Privacy Threats and Countermeasures
In recent years, brain computer interfaces (BCIs) have gained popularity in non-medical domains such as the gaming, entertainment, personal health, and marketing industries. A growing number of companies offer various inexpensive consumer-grade BCIs, and some of these companies have recently introduced the concept of BCI "App stores" in order to facilitate the expansion of BCI applications and provide software development kits (SDKs) for other developers to create new applications for their devices. BCI applications have access to users' unique brainwave signals, which allows them to make inferences about users' thoughts and mental processes. Since there are no specific standards that govern the development of BCI applications, their users are at risk of privacy breaches. In this work, we perform the first comprehensive analysis of BCI App stores, including software development kits (SDKs), application programming interfaces (APIs), and BCI applications, with respect to privacy issues. The goal is to understand the way brainwave signals are handled by BCI applications and what threats to the privacy of users exist. Our findings show that most applications have unrestricted access to users' brainwave signals and can easily extract private information about their users without them even noticing. We discuss potential privacy threats posed by current practices used in BCI App stores and then describe some countermeasures that could be used to mitigate the privacy threats. We also develop a prototype that gives BCI app users a choice to restrict access to their brain signals dynamically.
Detecting Component Failures and Critical Components in Safety Critical Embedded Systems using Fault Tree Analysis
Component failures can result in catastrophic behaviors in safety critical embedded systems, sometimes resulting in loss of life. Component failures can be treated as off nominal behaviors (ONBs) with respect to the components and sub systems involved in an embedded system. A lot of research is being carried out to tackle the problem of ONBs. These approaches are mainly focused on the states (i.e., desired and undesired states of a system at a given point of time to detect ONBs). In this paper, an approach is discussed to detect component failures and critical components of an embedded system. The approach is based on fault tree analysis (FTA), applied to the requirements specification of embedded systems at design time to find out the relationship between individual component failures and overall system failure. FTA helps in determining both qualitative and quantitative relationship between component failures and system failure. Analyzing the system at design time helps in detecting component failures and critical components and helps in devising strategies to mitigate component failures at design time and improve overall safety and reliability of a system.
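The quantitative side of fault tree analysis mentioned above can be illustrated with a small Python sketch. It is not the paper's method: the tuple-based tree encoding, the function names, the example component names, and the use of the Birnbaum importance measure to rank critical components are assumptions for illustration. The arithmetic also assumes that basic events fail independently and that each basic event appears only once in the tree.

```python
def failure_probability(node, basic):
    """Probability of the event at `node`.

    node: a basic-event name (str) or a tuple (gate, children) where
          gate is "AND" or "OR". basic: dict name -> failure probability.
    """
    if isinstance(node, str):
        return basic[node]
    gate, children = node
    probs = [failure_probability(child, basic) for child in children]
    if gate == "AND":
        # All inputs must fail: multiply the probabilities.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if gate == "OR":
        # At least one input fails: 1 minus the chance that none fail.
        none_fail = 1.0
        for p in probs:
            none_fail *= 1.0 - p
        return 1.0 - none_fail
    raise ValueError(f"unknown gate: {gate}")


def birnbaum_importance(tree, basic, component):
    """How much the top-event probability depends on one component:
    P(top | component failed) - P(top | component working)."""
    failed = dict(basic, **{component: 1.0})
    working = dict(basic, **{component: 0.0})
    return failure_probability(tree, failed) - failure_probability(tree, working)
```

For a hypothetical system that fails if a sensor fails or if both of two redundant pumps fail, the tree is `("OR", ["sensor", ("AND", ["pump_a", "pump_b"])])`; the sensor, as a single point of failure, receives a much higher importance than either pump.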
Hopfield Networks as an Error Correcting Technique for Speech Recognition
I experimented with Hopfield networks in the context of a voice-based, query-answering system. Hopfield networks are used to store and retrieve patterns. I used this technique to store queries represented as natural language sentences and I evaluated the accuracy of the technique for error correction in a spoken question-answering dialog between a computer and a user. I show that the use of an auto-associative Hopfield network helps make the speech recognition system more fault tolerant. I also looked at the available encoding schemes to convert a natural language sentence into a pattern of zeroes and ones that can be stored in the Hopfield network reliably, and I suggest scalable data representations which allow storing a large number of queries.
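As a rough illustration of how an auto-associative Hopfield network stores patterns and corrects errors, here is a minimal pure-Python sketch. It uses the standard Hebbian learning rule on bipolar (+1/-1) patterns with asynchronous updates; the function names `train` and `recall` are hypothetical, and the thesis's sentence-encoding schemes are not reproduced.

```python
def train(patterns):
    """Build the weight matrix from bipolar patterns with the Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state
```

A stored pattern presented with a few flipped bits tends to settle back to the original, which is the error-correcting behaviour the abstract relies on; capacity is limited, so recall degrades as more patterns are stored relative to the number of units.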
Modeling and Simulation of the Vector-Borne Dengue Disease and the Effects of Regional Variation of Temperature in the Disease Prevalence in Homogenous and Heterogeneous Human Populations
The history of mitigation programs to contain vector-borne diseases is a story of successes and failures. Due to the complex interplay among multiple factors that determine disease dynamics, the general principles for timely and specific intervention for incidence reduction or eradication of life-threatening diseases have yet to be determined. This research discusses computational methods developed to assist in the understanding of complex relationships affecting vector-borne disease dynamics. A computational framework to assist public health practitioners with exploring the dynamics of vector-borne diseases, such as malaria and dengue, in homogenous and heterogeneous populations has been conceived, designed, and implemented. The framework integrates a stochastic computational model of interactions to simulate horizontal disease transmission. The intent of the computational modeling has been the integration of stochasticity during simulation of the disease progression while reducing the number of interactions necessary to simulate a disease outbreak. While reducing the number of interactions needed to simulate disease dynamics improves computational time, the realization of interactions can remain computationally expensive. Multi-threading was used to improve performance over the original computational model, and the experimental results are reported. In addition to the contact model, biological processes specific to the corresponding pathogen-carrier vector have been modeled and integrated to increase the specificity of the vector-borne disease simulation. Last, automation for requesting, retrieving, parsing, and storing specific weather data and geospatial information from federal agencies to study the differences between homogenous and heterogeneous populations has been implemented.
Freeform Cursive Handwriting Recognition Using a Clustered Neural Network
Optical character recognition (OCR) software has advanced greatly in recent years. Machine-printed text can be scanned and converted to searchable text with word accuracy rates around 98%. Reasonably neat hand-printed text can be recognized with about 85% word accuracy. However, cursive handwriting still remains a challenge, with state-of-the-art performance still around 75%. Algorithms based on hidden Markov models have been only moderately successful, while recurrent neural networks have delivered the best results to date. This thesis explored the feasibility of using a special type of feedforward neural network to convert freeform cursive handwriting to searchable text. The hidden nodes in this network were grouped into clusters, with each cluster being trained to recognize a unique character bigram. The network was trained on writing samples that were pre-segmented and annotated. Post-processing was facilitated in part by using the network to identify overlapping bigrams that were then linked together to form words and sentences. With dictionary assisted post-processing, the network achieved word accuracy of 66.5% on a small, proprietary corpus. The contributions in this thesis are threefold: 1) the novel clustered architecture of the feed-forward neural network, 2) the development of an expanded set of observers combining image masks, modifiers, and feature characterizations, and 3) the use of overlapping bigrams as the textual working unit to assist in context analysis and reconstruction.
SEM Predicting Success of Student Global Software Development Teams
The extensive use of global teams to develop software has prompted researchers to investigate various factors that can enhance a team’s performance. While a significant body of research exists on global software teams, previous research has not fully explored the interrelationships and collective impact of various factors on team performance. This study explored a model that added the characteristics of a team’s culture, ability, communication frequencies, response rates, and linguistic categories to a central framework of team performance. Data was collected from two student software development projects that occurred between teams located in the United States, Panama, and Turkey. The data was obtained through online surveys and recorded postings of team activities that occurred throughout the global software development projects. Partial least squares path modeling (PLS-PM) was chosen as the analytic technique to test the model and identify the most influential factors. Individual factors associated with response rates and linguistic characteristics proved to significantly affect a team’s activity related to grade on the project, group cohesion, and the number of messages received and sent. Moreover, an examination of possible latent homogeneous segments in the model supported the existence of differences among groups based on leadership style. Teams with assigned leaders tended to have stronger relationships between linguistic characteristics and team performance factors, while teams with emergent leaders had stronger relationships between response rates and team performance factors. The contributions of this dissertation are threefold: 1) novel analysis techniques using PLS-PM and clustering, 2) use of new, quantifiable variables in analyzing team activity, and 3) identification of plausible causal indicators for team performance and analysis of the same.
Computerized Analysis of Radiograph Images of Embedded Objects as Applied to Bone Location and Mineral Content Measurement
This investigation dealt with locating and measuring x-ray absorption in radiographic images. The methods developed provide fast, accurate minicomputer control for the analysis of embedded objects. A PDP/8 computer system was interfaced with a Joyce Loebl 3CS Microdensitometer and a Leeds & Northrup Recorder. Proposed algorithms for bone location and data smoothing work on a twelve-bit minicomputer. Designs of a software control program and operational procedure are presented. The filter made wedge and limb scans monotonic from minima to maxima. It was tested for various convoluted intervals. Ability to resmooth the same data in multiple passes was tested. An interval size of fifteen works well in one pass.
A Graphical, Database-Querying Interface for Casual, Naive Computer Users
This research is concerned with some aspects of the retrieval of information from database systems by casual, naive computer users. A "casual user" is defined as an individual who only wishes to execute queries perhaps once or twice a month, and a "naive user" is someone who has little or no expertise in operating a computer and, more specifically for the purposes of this study, is not practiced at querying a database. The research initially focuses on a specific group of casual, naive users, namely a group of clinicians, and analyzes their characteristics as they pertain to the retrieval of information from a computer database. The characteristics thus elicited are then used to create the requirements for a database interface that would, potentially, be acceptable to this group. An interface having the desired requirements is then proposed. This interface consists, from a user's perspective, of three basic components. A graphical model gives a picture of the database structure. Windows give the ability to view different areas of the database, physically group together items that come under one logical heading and provide the user with immediate access to the data item names used by the system. Finally, a natural language query language provides a means of entering a query in a syntax (that of ordinary English) which is familiar to the user. The graphical model is a logical abstraction of the database. Unlike other database interfaces, it is not constrained by the model (relational, hierarchical, network) underlying the database management system, with the one caveat that the graphical model should not imply any connections which cannot be supported by the management system. Versions of the interface are implemented on both eight-bit and sixteen-bit microcomputers, and testing is conducted in order to validate the acceptability of the interface and to discover the …
A Design Approach for Digital Computer Peripheral Controllers, Case Study Design and Construction
The purpose of this project was to describe a novel design approach for a digital computer peripheral controller, then design and construct a case study controller. This document consists of three chapters and an appendix. Chapter II presents the design approach chosen: a variation on a design presented by Charles R. Richards in an article published in Electronics magazine. Richards' approach consists of finite state machine circuitry controlling all the functions of a controller. The variation on Richards' approach consists of considering the various logically independent processes which a controller carries out and assigning control of each process to a separate finite state machine. The appendix contains the documentation of the design and construction of the controller.
DRVBLD: a UNIX Device Driver Builder
New peripheral devices are being developed at an ever increasing rate. Before such accessories can be used in the UNIX environment (UNIX is a trademark of Bell Laboratories), they must be able to communicate with the operating system. This involves writing a device driver for each device. In order to do this, very detailed knowledge is required of both the device to be integrated and the version of UNIX to which it will be attached. The process is long, detailed and prone to subtle problems and errors. This paper presents a menu-driven utility designed to simplify and accelerate the design and implementation of UNIX device drivers by freeing developers from many of the implementation specific low-level details.
Practical Cursive Script Recognition
This research focused on the off-line cursive script recognition application. The problem is very large and difficult and there is much room for improvement in every aspect of the problem. Many different aspects of this problem were explored in pursuit of solutions to create a more practical and usable off-line cursive script recognizer than is currently available.
Automated Testing of Interactive Systems
Computer systems which interact with human users to collect, update or provide information are growing more complex. Additionally, users are demanding more thorough testing of all computer systems. Because of the complexity and thoroughness required, automation of interactive systems testing is desirable, especially for functional testing. Many currently available testing tools, like program proving, are impractical for testing large systems. The solution presented here is the development of an automated test system which simulates human users. This system incorporates a high-level programming language, ATLIS. ATLIS programs are compiled and interpretively executed. Programs are selected for execution by operator command, and failures are reported to the operator's console. An audit trail of all activity is provided. This solution provides improved efficiency and effectiveness over conventional testing methods.
Investigating the Extractive Summarization of Literary Novels
Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific reports, and others, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are, however, witnessing a change: an increasingly large number of books are becoming available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research that has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents, and outperform the traditional extractive summarization systems typically addressing the news genre.
Using Reinforcement Learning in Partial Order Plan Space
Partial order planning is an important approach that solves planning problems without completely specifying the orderings between the actions in the plan. This property provides greater flexibility in executing plans, making partial order planners a preferred choice over other planning methodologies. However, in order to find partially ordered plans, partial order planners perform a search in plan space rather than in the space of world states, and an uninformed search in plan space leads to poor efficiency. In this thesis, I discuss applying a reinforcement learning method, called the First-visit Monte Carlo method, to partial order planning in order to design agents which do not need any training data or heuristics but are still able to make informed decisions in plan space based on experience. Communicating effectively with the agent is crucial in reinforcement learning. I address how this task was accomplished in plan space and the results from an evaluation of a blocks world test bed.
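For readers unfamiliar with the method named in this abstract, the following is a minimal, generic sketch of first-visit Monte Carlo value estimation. It is not the thesis's planner: the episode format (lists of `(state, reward)` pairs), the function name `first_visit_mc`, and the discount parameter `gamma` are illustrative assumptions, and the states here are plain labels rather than partial plans.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate state values by averaging first-visit returns.

    `episodes` is a list of episodes; each episode is a list of
    (state, reward) pairs, where `reward` is received after leaving
    `state`. This data layout is an assumption for illustration.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for episode in episodes:
        # Compute the discounted return from every time step, working backwards.
        g = 0.0
        returns = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            g = reward + gamma * g
            returns[t] = g

        # Only the first occurrence of each state in an episode contributes.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state in seen:
                continue
            seen.add(state)
            returns_sum[state] += returns[t]
            returns_count[state] += 1

    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


if __name__ == "__main__":
    demo = [
        [("A", 0.0), ("B", 1.0)],
        [("A", 1.0), ("A", 0.0), ("B", 0.0)],
    ]
    print(first_visit_mc(demo))
```

The estimate needs no model of the environment and no labeled training data, only sampled episodes, which matches the abstract's description of agents that learn from experience alone.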
Natural Language Interfaces to Databases
Natural language interfaces to databases (NLIDB) are systems that aim to bridge the gap between the languages used by humans and computers, and automatically translate natural language sentences to database queries. This thesis proposes a novel approach to NLIDB, using graph-based models. The system starts by collecting as much information as possible from existing databases and sentences, and transforms this information into a knowledge base for the system. Given a new question, the system will use this knowledge to analyze and translate the sentence into its corresponding database query statement. The graph-based NLIDB system uses English as the natural language, a relational database model, and SQL as the formal query language. In experiments performed with natural language questions run against a large database containing information about U.S. geography, the system showed good performance compared to the state-of-the-art in the field.
Measuring Vital Signs Using Smart Phones
Smart phones today have become increasingly popular with the general public for their diverse abilities like navigation, social networking, and multimedia facilities, to name a few. These phones are equipped with high end processors, high resolution cameras, and built-in sensors such as an accelerometer, orientation sensor, light sensor, and much more. According to a comScore survey, 25.3% of US adults use smart phones in their daily lives. Motivated by the capability of smart phones and their extensive usage, I focused on utilizing them for bio-medical applications. In this thesis, I present a new application for a smart phone to quantify vital signs such as heart rate, respiratory rate and blood pressure with the help of its built-in sensors. Using the camera and a microphone, I have shown how the blood pressure and heart rate can be determined for a subject. People sometimes encounter minor situations like fainting or fatal accidents like a car crash at unexpected times and places. It would be useful to have a device which can measure all vital signs in such an event. The second part of this thesis demonstrates a new mode of communication for next generation 9-1-1 calls. In this new architecture, the call-taker will be able to control the multimedia elements in the phone from a remote location. This would help the call-taker or first responder to have better control over the situation. Transmission of the vital signs measured using the smart phone can be a life saver in critical situations. In today's voice oriented 9-1-1 calls, the dispatcher first collects critical information (e.g., location, call-back number) from the caller, and assesses the situation. Meanwhile, the dispatchers constantly face a "60-second dilemma"; i.e., within 60 seconds, they need to make a complicated but important decision, whether to dispatch and, if so, what to dispatch. The dispatchers often feel that …
An Adaptive Linearization Method for a Constraint Satisfaction Problem in Semiconductor Device Design Optimization
Device optimization is a very important element in semiconductor technology advancement. Its objective is to find a design point for a semiconductor device so that the optimized design goal meets all specified constraints. As in other engineering fields, a nonlinear optimizer is often used for design optimization. One major drawback of using a nonlinear optimizer is that it can only partially explore the design space and return a locally optimal solution. This dissertation provides an adaptive optimization design methodology to allow the designer to explore the design space and obtain a globally optimal solution. One key element of our method is to quickly compute the set of all feasible solutions, also called the acceptability region. We describe a polytope-based representation for the acceptability region and an adaptive linearization technique for device performance model approximation. These efficiency enhancements have enabled significant speed-up in estimating acceptability regions and allow acceptability regions to be estimated for a larger class of device design tasks. Our linearization technique also provides an efficient mechanism to guarantee the global accuracy of the computed acceptability region. To visualize the acceptability region, we study the orthogonal projection of high-dimensional convex polytopes and propose an output sensitive algorithm for projecting polytopes into two dimensions.
An Interpreter for the Basic Programming Language
In this thesis, the first chapter provides the general description of this interpreter. The second chapter contains a formal definition of the syntax of BASIC along with an introduction to the semantics. The third chapter contains the design of the data structures. The fourth chapter contains the description of algorithms along with stages for testing the interpreter and the design of debug output. The stages and actions are represented internally to the computer in tabular forms. For statement parsing, working syntax equations are established. They serve as standards for the conversion of source statements into object pseudocodes. As the statement is parsed for legal form, pseudocodes for this statement are created. For pseudocode execution, pseudocodes are represented internally to the computer in tabular forms.
Generating Machine Code for High-Level Programming Languages
The purpose of this research was to investigate the generation of machine code from a high-level programming language. The following steps were undertaken: 1) Choose a high-level programming language as the source language and a computer as the target computer. 2) Examine all stages during the compiling of a high-level programming language and all data sets involved in the compilation. 3) Discover the mechanism for generating machine code and the mechanism to generate more efficient machine code from the language. 4) Construct an algorithm for generating machine code for the target computer. The results suggest that a compiler is best implemented in a high-level programming language, and that SCANNER and PARSER should be independent of target representations, if possible.
Higher Compression from the Burrows-Wheeler Transform with New Algorithms for the List Update Problem
Burrows-Wheeler compression is a three stage process in which the data is transformed with the Burrows-Wheeler Transform, then transformed with Move-To-Front, and finally encoded with an entropy coder. Move-To-Front, Transpose, and Frequency Count are some of the many algorithms used on the List Update problem. In 1985, Competitive Analysis first showed the superiority of Move-To-Front over Transpose and Frequency Count for the List Update problem with arbitrary data. Earlier studies due to Bitner assumed independent identically distributed data, and showed that while Move-To-Front adapts to a distribution faster, incurring less overwork, the asymptotic costs of Frequency Count and Transpose are less. The improvements to Burrows-Wheeler compression this work covers are increases in the amount, not speed, of compression. Best x of 2x - 1 is a new family of algorithms created to improve on Move-To-Front's processing of the output of the Burrows-Wheeler Transform, which is like piecewise independent identically distributed data. Other algorithms, for both the middle stage of Burrows-Wheeler compression and the List Update problem, for which overwork, asymptotic cost, and competitive ratios are also analyzed, are several variations of Move One From Front and part of the randomized algorithm Timestamp. The Best x of 2x - 1 family includes Move-To-Front, the part of Timestamp of interest, and Frequency Count. Lastly, a greedy choosing scheme, Snake, switches back and forth as the amount of compression that two List Update algorithms achieve fluctuates, to increase overall compression. The Burrows-Wheeler Transform is based on sorting of contexts. The other improvements are better sorting orders, such as “aeioubcdf...” instead of standard alphabetical “abcdefghi...” on English text data, and an algorithm for computing orders for any data, and Gray code sorting instead of standard sorting.
Both techniques lessen the overwork incurred by whatever List Update algorithms are used by reducing the difference between adjacent sorted …
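The first two stages described in this abstract can be illustrated with a small sketch. This is a naive textbook version for short strings, not the thesis's algorithms: the function names, the `"\0"` sentinel, and the rotation-sorting approach are assumptions chosen for clarity, and the new list-update variants (Best x of 2x - 1, Snake) are not shown.

```python
def bwt(text, sentinel="\0"):
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column.

    Assumes `sentinel` does not occur in `text`. Suitable only for short inputs.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def move_to_front_encode(data, alphabet):
    """Replace each symbol by its current list position, then move it to the front."""
    table = list(alphabet)
    codes = []
    for ch in data:
        idx = table.index(ch)
        codes.append(idx)
        table.insert(0, table.pop(idx))
    return codes


def move_to_front_decode(codes, alphabet):
    """Invert move_to_front_encode using the same starting alphabet order."""
    table = list(alphabet)
    out = []
    for idx in codes:
        ch = table.pop(idx)
        out.append(ch)
        table.insert(0, ch)
    return "".join(out)


if __name__ == "__main__":
    transformed = bwt("banana")
    alphabet = sorted(set(transformed))
    codes = move_to_front_encode(transformed, alphabet)
    print(repr(transformed), codes)
```

Because the transform groups symbols that share a context, Move-To-Front tends to emit many small indices, which is what makes the final entropy-coding stage effective.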
Efficient Parallel Algorithms and Data Structures Related to Trees
The main contribution of this dissertation is a new paradigm, called the parentheses matching paradigm. We claim that this paradigm is well suited for designing efficient parallel algorithms for a broad class of nonnumeric problems. To demonstrate its applicability, we present three cost-optimal parallel algorithms for breadth-first traversal of general trees, sorting a special class of integers, and coloring an interval graph with the minimum number of colors.
BC Framework for CAV Edge Computing
Edge computing and the CAV (Connected Autonomous Vehicle) field complement each other. With its short latency and high responsiveness, edge computing is a better fit than cloud computing for the CAV field. Moreover, containerized applications eliminate the tedious procedures for setting up the required environment, so deploying applications on new machines is much more user-friendly than before. Therefore, this paper proposes a framework developed for the CAV edge computing scenario. This framework consists of various programs written in different languages. The framework uses Docker technology to containerize these applications so that deployment is simple and easy. This framework consists of two parts. One is for the vehicle on-board unit, which exposes data to the closest edge device and receives the output generated by the edge device. The other is for the edge device, which is responsible for collecting and processing large volumes of data and broadcasting output to vehicles. As a result, the vehicle does not need to perform the heavyweight tasks that could drain its limited power.
Video Analytics with Spatio-Temporal Characteristics of Activities
As video capturing devices become more ubiquitous, from surveillance cameras to smart phones, the demand for automated video analysis is increasing as never before. One obstacle in this process is to efficiently locate where a human operator’s attention should be, and another is to determine the specific types of activities or actions without ambiguity. It is the special interest of this dissertation to locate spatial and temporal regions of interest in videos and to develop a better action representation for video-based activity analysis. This dissertation follows the scheme of “locating then recognizing” activities of interest in videos, i.e., locations of potentially interesting activities are estimated before performing in-depth analysis. Theoretical properties of regions of interest in videos are first exploited, based on which a unifying framework is proposed to locate both spatial and temporal regions of interest with the same settings of parameters. The approach estimates the distribution of motion based on 3D structure tensors, and locates regions of interest according to persistent occurrences of low probability. Two contributions are further made to better represent the actions. The first is to construct a unifying model of spatio-temporal relationships between reusable mid-level actions which bridge low-level pixels and high-level activities. Dense trajectories are clustered to construct mid-level actionlets, and the temporal relationships between actionlets are modeled as Action Graphs based on Allen interval predicates. The second is an effort toward a novel and efficient representation of action graphs based on a sparse coding framework. Action graphs are first represented using Laplacian matrices and then decomposed as a linear combination of primitive dictionary items following a sparse coding scheme. The optimization is eventually formulated and solved as a determinant maximization problem, and 1-nearest neighbor is used for action classification.
The experiments have shown better results than existing approaches for regions-of-interest detection and action …
Automatic Speech Recognition Using Finite Inductive Sequences
This dissertation addresses the general problem of recognition of acoustic signals which may be derived from speech, sonar, or acoustic phenomena. The specific problem of recognizing speech is the main focus of this research. The intention is to design a recognition system for a definite number of discrete words. For this purpose specifically, eight isolated words from the TIMIT database are selected. Four medium length words "greasy," "dark," "wash," and "water" are used. In addition, four short words are considered: "she," "had," "in," and "all." The recognition system addresses the following issues: filtering or preprocessing, training, and decision-making. The preprocessing phase uses linear predictive coding of order 12. Following the filtering process, a vector quantization method is used to further reduce the input data and generate a finite inductive sequence of symbols representative of each input signal. The sequences generated by the vector quantization process of the same word are factored, and a single ruling or reference template is generated and stored in a codebook. This system introduces a new modeling technique which relies heavily on the basic concept that all finite sequences are finitely inductive. This technique is used in the training stage. In order to accommodate the variabilities in speech, the training is performed casualty, and a large number of training speakers is used from eight different dialect regions. Hence, a speaker independent recognition system is realized. The matching process compares the incoming speech with each of the templates stored, and a closeness ratio is computed. A ratio table is generated and the matching word that corresponds to the smallest ratio (i.e. indicating that the ruling has removed most of the symbols) is selected. Promising results were obtained for isolated words, and the recognition rates ranged between 50% and 100%.
Mobile-Based Smart Auscultation
In developing countries, acute respiratory infections (ARIs) are responsible for two million deaths per year. Most victims are children who are less than 5 years old. Pneumonia kills 5000 children per day. The statistics for cardiovascular diseases (CVDs) are even more alarming. According to a 2009 report from the World Health Organization (WHO), CVDs kill 17 million people per year. In many resource-poor parts of the world such as India and China, many people are unable to access cardiologists, pulmonologists, and other specialists. Hence, low skilled health professionals are responsible for screening people for ARIs and CVDs in these areas. For example, in the rural areas of the Philippines, there is only one doctor for every 10,000 people. By contrast, the United States has one doctor for every 500 Americans. Due to advances in technology, it is now possible to use a smartphone for audio recording, signal processing, and machine learning. In my thesis, I have developed an Android application named Smart Auscultation. Auscultation is a process in which physicians listen to heart and lung sounds to diagnose disorders. Cardiologists spend years mastering this skill. The Smart Auscultation application is capable of recording and classifying heart sounds, and can be used by public or clinical health workers. This application can detect abnormal heart sounds with up to 92-98% accuracy. In addition, the application can record, but not yet classify, lung sounds. This application will be able to help save thousands of lives by allowing anyone to identify abnormal heart and lung sounds.
Temporal Connectionist Expert Systems Using a Temporal Backpropagation Algorithm
Representing time has been considered a general problem for artificial intelligence research for many years. More recently, the question of representing time has become increasingly important in representing the human decision-making process through connectionist expert systems. Because most human behaviors unfold over time, any attempt to represent expert performance without considering its temporal nature can often lead to incorrect results. A temporal feedforward neural network model that can be applied to a number of neural network application areas, including connectionist expert systems, has been introduced. The neural network model has a multi-layer structure, i.e. the number of layers is not limited. Also, the model has the flexibility of defining output nodes in any layer. This is especially important for connectionist expert system applications. A temporal backpropagation algorithm which supports the model has been developed. The model along with the temporal backpropagation algorithm makes it extremely practical to define any artificial neural network application. Also, an approach that can be followed to decrease the memory space used by the weight matrix has been introduced. The algorithm was tested using a medical connectionist expert system to show how best we describe not only the disease but also the entire course of the disease. The system was first trained using a pattern that was encoded from the expert system knowledge base rules. Following this, a series of experiments was carried out using the temporal model and the temporal backpropagation algorithm. The first series of experiments was done to determine if the training process worked as predicted. In the second series of experiments, the weight matrix in the trained system was defined as a function of time intervals before presenting the system with the learned patterns. The results of the two experiments indicate that both approaches produce correct results.
The only difference between the two results …
The Object-Oriented Database Editor
Because of an interest in object-oriented database systems, designers have created systems to store and manipulate specific sets of abstract data types that belong to the real world environment they represent. Unfortunately, the advantage of these systems is also a disadvantage since no single object-oriented database system can be used for all applications. This paper describes an object-oriented database management system called the Object-oriented Database Editor (ODE) which overcomes this disadvantage by allowing designers to create and execute an object-oriented database that represents any type of environment and then to store it and simulate that environment. As conditions within the environment change, the designer can use ODE to alter that environment without loss of data. ODE provides a flexible environment for the user; it is efficient; and it can run on a personal computer.
Modeling the Impact and Intervention of a Sexually Transmitted Disease: Human Papilloma Virus
Many human papilloma virus (HPV) types are sexually transmitted, and HPV DNA types 16, 18, 31, and 45 account for more than 75% of all cervical dysplasia. Candidate vaccines are successfully completing US Food and Drug Administration (FDA) phase III testing and several drug companies are in licensing arbitration. Once this vaccine becomes available, 100% vaccination coverage is unlikely; hence the need for vaccination strategies that will have the greatest reduction on the endemic prevalence of HPV. This thesis introduces two discrete-time models for evaluating the effect of demographic-biased vaccination strategies: one model incorporates temporal demographics (i.e., age) in population compartments; the other non-temporal demographics (i.e., race, ethnicity). Also presented is an intuitive Web-based interface that was developed to allow the user to evaluate the effects on prevalence of a demographic-biased intervention by tailoring the model parameters to specific demographics and geographical region.
An Approach Towards Self-Supervised Classification Using Cyc
Due to the long duration required to perform manual knowledge entry by human knowledge engineers, it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles, the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work.
A Parallel Programming Language
The problem of programming a parallel processor is discussed. Previous methods of programming a parallel processor, analyzing a program for parallel paths, and special language features are discussed. Graph theory is used to define the three basic programming constructs: choice, sequence, repetition. The concept of mechanized programming is expanded to allow for total separation of control and computational sections of a program. A definition of a language is presented which provides for this separation. A method for developing the program graph is discussed. The control graph and data graph are developed separately. The two graphs illustrate control and data predecessor relationships used in determining parallel elements of a program.