Search Results

Biomedical Semantic Embeddings: Using Hybrid Sentences to Construct Biomedical Word Embeddings and its Applications
Word embeddings is a useful method that has shown enormous success in various NLP tasks, not only in open domain but also in biomedical domain. The biomedical domain provides various domain specific resources and tools that can be exploited to improve performance of these word embeddings. However, most of the research related to word embeddings in biomedical domain focuses on analysis of model architecture, hyper-parameters and input text. In this paper, we use SemMedDB to design new sentences called `Semantic Sentences'. Then we use these sentences in addition to biomedical text as inputs to the word embedding model. This approach aims at introducing biomedical semantic types defined by UMLS, into the vector space of word embeddings. The semantically rich word embeddings presented here rivals state of the art biomedical word embedding in both semantic similarity and relatedness metrics up to 11%. We also demonstrate how these semantic types in word embeddings can be utilized.
Boosting for Learning From Imbalanced, Multiclass Data Sets
In many real-world applications, it is common to have uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied with the improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers are created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant reaching above 50%. As the volume of training data increase, the gain of efficiency with the proposed method becomes more significant.
Bounded Dynamic Source Routing in Mobile Ad Hoc Networks
A mobile ad hoc network (MANET) is a collection of mobile platforms or nodes that come together to form a network capable of communicating with each other, without the help of a central controller. To avail the maximum potential of a MANET, it is of great importance to devise a routing scheme, which will optimize upon the performance of a MANET, given the high rate of random mobility of the nodes. In a MANET individual nodes perform the routing functions like route discovery, route maintenance and delivery of packets from one node to the other. Existing routing protocols flood the network with broadcasts of route discovery messages, while attempting to establish a route. This characteristic is instrumental in deteriorating the performance of a MANET, as resource overhead triggered by broadcasts is directly proportional to the size of the network. Bounded-dynamic source routing (B-DSR), is proposed to curb this multitude of superfluous broadcasts, thus enabling to reserve valuable resources like bandwidth and battery power. B-DSR establishes a bounded region in the network, only within which, transmissions of route discovery messages are processed and validated for establishing a route. All route discovery messages reaching outside of this bounded region are dropped, thus preventing the network from being flooded. In addition B-DSR also guarantees loop-free routing and is robust for a rapid recovery when routes in the network change.
Brain Computer Interface (BCI) Applications: Privacy Threats and Countermeasures
In recent years, brain computer interfaces (BCIs) have gained popularity in non-medical domains such as the gaming, entertainment, personal health, and marketing industries. A growing number of companies offer various inexpensive consumer grade BCIs and some of these companies have recently introduced the concept of BCI "App stores" in order to facilitate the expansion of BCI applications and provide software development kits (SDKs) for other developers to create new applications for their devices. The BCI applications access to users' unique brainwave signals, which consequently allows them to make inferences about users' thoughts and mental processes. Since there are no specific standards that govern the development of BCI applications, its users are at the risk of privacy breaches. In this work, we perform first comprehensive analysis of BCI App stores including software development kits (SDKs), application programming interfaces (APIs), and BCI applications w.r.t privacy issues. The goal is to understand the way brainwave signals are handled by BCI applications and what threats to the privacy of users exist. Our findings show that most applications have unrestricted access to users' brainwave signals and can easily extract private information about their users without them even noticing. We discuss potential privacy threats posed by current practices used in BCI App stores and then describe some countermeasures that could be used to mitigate the privacy threats. Also, develop a prototype which gives the BCI app users a choice to restrict their brain signal dynamically.
BSM Message and Video Streaming Quality Comparative Analysis Using Wave Short Message Protocol (WSMP)
Vehicular ad-hoc networks (VANETs) are used for vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. The IEEE 802.11p/WAVE (Wireless Access in Vehicular Environment) and with WAVE Short Messaging Protocol (WSMP) has been proposed as the standard protocol for designing applications for VANETs. This communication protocol must be thoroughly tested before reliable and efficient applications can be built using its protocols. In this paper, we perform on-road experiments in a variety of scenarios to evaluate the performance of the standard. We use commercial VANET devices with 802.11p/WAVE compliant chipsets for both BSM (basic safety messages) as well as video streaming applications using WSMP as a communication protocol. We show that while the standard performs well for BSM application in lightly loaded conditions, the performance becomes inferior when traffic and other performance metric increases. Furthermore, we also show that the standard is not suitable for video streaming due to the bursty nature of traffic and the bandwidth throttling, which is a major shortcoming for V2X applications.
Building an Intelligent Filtering System Using Idea Indexing
The widely used vector model maintains its popularity because of its simplicity, fast speed, and the appeal of using spatial proximity for semantic proximity. However, this model faces a disadvantage that is associated with the vagueness from keywords overlapping. Efforts have been made to improve the vector model. The research on improving document representation has been focused on four areas, namely, statistical co-occurrence of related items, forming term phrases, grouping of related words, and representing the content of documents. In this thesis, we propose the idea-indexing model to improve document representation for the filtering task in IR. The idea-indexing model matches document terms with the ideas they express and indexes the document with these ideas. This indexing scheme represents the document with its semantics instead of sets of independent terms. We show in this thesis that indexing with ideas leads to better performance.
A C Navigational System
The C Navigational System (CNS) is a proposed programming environment for the C programming language. The introduction covers the major influences of programming environments and the components of a programming environment. The system is designed to support the design, coding and maintenance phases of software development. CNS provides multiple views to both the source and documentation for a programming project. User-defined and system-defined links allow the source and documentation to be hierarchically searched. CNS also creates a history list and function interface for each function in a module. The final chapter compares CNS and several other programming environments (Microscope, Rn, Cedar, PECAN, and Marvel).
Case-Based Reasoning for Children Story Selection in ASP.NET
This paper describes the general architecture and function of a Case-Based Reasoning (CBR) system implemented with ASP.NET and C#. Microsoft Visual Studio .NET and XML Web Services provide a flexible, standards-based model that allows clients to access data. Web Form Pages offer a powerful programming model for Web-enabled user interface. The system provides a variety of mechanisms and services related to story retrieval and adaptation. Users may browse and search a library of text stories. More advanced CBR capabilities were also implemented, including a multi-factor distance-calculation for matching user interests with stories in the library, recommendations on optimizing search, and adaptation of stories to match user interests.
Classifying Pairwise Object Interactions: A Trajectory Analytics Approach
We have a huge amount of video data from extensively available surveillance cameras and increasingly growing technology to record the motion of a moving object in the form of trajectory data. With proliferation of location-enabled devices and ongoing growth in smartphone penetration as well as advancements in exploiting image processing techniques, tracking moving objects is more flawlessly achievable. In this work, we explore some domain-independent qualitative and quantitative features in raw trajectory (spatio-temporal) data in videos captured by a fixed single wide-angle view camera sensor in outdoor areas. We study the efficacy of those features in classifying four basic high level actions by employing two supervised learning algorithms and show how each of the features affect the learning algorithms’ overall accuracy as a single factor or confounded with others.
CLUE: A Cluster Evaluation Tool
Modern high performance computing is dependent on parallel processing systems. Most current benchmarks reveal only the high level computational throughput metrics, which may be sufficient for single processor systems, but can lead to a misrepresentation of true system capability for parallel systems. A new benchmark is therefore proposed. CLUE (Cluster Evaluator) uses a cellular automata algorithm to evaluate the scalability of parallel processing machines. The benchmark also uses algorithmic variations to evaluate individual system components' impact on the overall serial fraction and efficiency. CLUE is not a replacement for other performance-centric benchmarks, but rather shows the scalability of a system and provides metrics to reveal where one can improve overall performance. CLUE is a new benchmark which demonstrates a better comparison among different parallel systems than existing benchmarks and can diagnose where a particular parallel system can be optimized.
A Comparative Analysis of Guided vs. Query-Based Intelligent Tutoring Systems (ITS) Using a Class-Entity-Relationship-Attribute (CERA) Knowledge Base
One of the greatest problems facing researchers in the sub field of Artificial Intelligence known as Intelligent Tutoring Systems (ITS) is the selection of a knowledge base designs that will facilitate the modification of the knowledge base. The Class-Entity-Relationship-Attribute (CERA), proposed by R. P. Brazile, holds certain promise as a more generic knowledge base design framework upon which can be built robust and efficient ITS. This study has a twofold purpose. The first is to demonstrate that a CERA knowledge base can be constructed for an ITS on a subset of the domain of Cretaceous paleontology and function as the "expert module" of the ITS. The second is to test the validity of the ideas that students guided through a lesson learn more factual knowledge, while those who explore the knowledge base that underlies the lesson through query at their own pace will be able to formulate their own integrative knowledge from the knowledge gained in their explorations and spend more time on the system. This study concludes that a CERA-based system can be constructed as an effective teaching tool. However, while an ITS - treatment provides for statistically significant gains in achievement test scores, the type of treatment seems not to matter as much as time spent on task. This would seem to indicate that a query-based system which allows the user to progress at their own pace would be a better type of system for the presentation of material due to the greater amount of on-line computer time exhibited by the users.
A Comparison of Agent-Oriented Software Engineering Frameworks and Methodologies
Agent-oriented software engineering (AOSE) covers issues on developing systems with software agents. There are many techniques, mostly agent-oriented and object-oriented, ready to be chosen as building blocks to create agent-based systems. There have been several AOSE methodologies proposed intending to show engineers guidelines on how these elements are constituted in having agents achieve the overall system goals. Although these solutions are promising, most of them are designed in ad-hoc manner without truly obeying software developing life-cycle fully, as well as lacking of examinations on agent-oriented features. To address these issues, we investigated state-of-the-art techniques and AOSE methodologies. By examining them in different respects, we commented on the strength and weakness of them. Toward a formal study, a comparison framework has been set up regarding four aspects, including concepts and properties, notations and modeling techniques, process, and pragmatics. Under these criteria, we conducted the comparison in both overview and detailed level. The comparison helped us with empirical and analytical study, to inspect the issues on how an ideal agent-based system will be formed.
A Comparison of File Organization Techniques
This thesis compares the file organization techniques that are implemented on two different types of computer systems, the large-scale and the small-scale. File organizations from representative computers in each class are examined in detail: the IBM System/370 (OS/370) and the Harris 1600 Distributed Processing System with the Extended Communications Operating System (ECOS). In order to establish the basic framework for comparison, an introduction to file organizations is presented. Additionally, the functional requirements for file organizations are described by their characteristics and user demands. Concluding remarks compare file organization techniques and discuss likely future developments of file systems.
A Comparison of Meansort and Quicksort
The main purpose of this project is to compare a new sorting method- Meansort with its preceding sorting method- Quicksort. Meansort uses the mean value for each key to determine the partition of the file, but Quicksort selects at random. Experiments proved that in some ways Meansort is superior to Quicksort but is still not perfect since it always needs a mean value for each key. This project implements these two methods and determines the situations under which each of these methods outperforms the other.
Computational Complexity of Hopfield Networks
There are three main results in this dissertation. They are PLS-completeness of discrete Hopfield network convergence with eight different restrictions, (degree 3, bipartite and degree 3, 8-neighbor mesh, dual of the knight's graph, hypercube, butterfly, cube-connected cycles and shuffle-exchange), exponential convergence behavior of discrete Hopfield network, and simulation of Turing machines by discrete Hopfield Network.
Computational Epidemiology - Analyzing Exposure Risk: A Deterministic, Agent-Based Approach
Many infectious diseases are spread through interactions between susceptible and infectious individuals. Keeping track of where each exposure to the disease took place, when it took place, and which individuals were involved in the exposure can give public health officials important information that they may use to formulate their interventions. Further, knowing which individuals in the population are at the highest risk of becoming infected with the disease may prove to be a useful tool for public health officials trying to curtail the spread of the disease. Epidemiological models are needed to allow epidemiologists to study the population dynamics of transmission of infectious agents and the potential impact of infectious disease control programs. While many agent-based computational epidemiological models exist in the literature, they focus on the spread of disease rather than exposure risk. These models are designed to simulate very large populations, representing individuals as agents, and using random experiments and probabilities in an attempt to more realistically guide the course of the modeled disease outbreak. The work presented in this thesis focuses on tracking exposure risk to chickenpox in an elementary school setting. This setting is chosen due to the high level of detailed information realistically available to school administrators regarding individuals' schedules and movements. Using an agent-based approach, contacts between individuals are tracked and analyzed with respect to both individuals and locations. The results are then analyzed using a combination of tools from computer science and geographic information science.
Computational Methods for Discovering and Analyzing Causal Relationships in Health Data
Publicly available datasets in health science are often large and observational, in contrast to experimental datasets where a small number of data are collected in controlled experiments. Variables' causal relationships in the observational dataset are yet to be determined. However, there is a significant interest in health science to discover and analyze causal relationships from health data since identified causal relationships will greatly facilitate medical professionals to prevent diseases or to mitigate the negative effects of the disease. Recent advances in Computer Science, particularly in Bayesian networks, has initiated a renewed interest for causality research. Causal relationships can be possibly discovered through learning the network structures from data. However, the number of candidate graphs grows in a more than exponential rate with the increase of variables. Exact learning for obtaining the optimal structure is thus computationally infeasible in practice. As a result, heuristic approaches are imperative to alleviate the difficulty of computations. This research provides effective and efficient learning tools for local causal discoveries and novel methods of learning causal structures with a combination of background knowledge. Specifically in the direction of constraint based structural learning, polynomial-time algorithms for constructing causal structures are designed with first-order conditional independence. Algorithms of efficiently discovering non-causal factors are developed and proved. In addition, when the background knowledge is partially known, methods of graph decomposition are provided so as to reduce the number of conditioned variables. Experiments on both synthetic data and real epidemiological data indicate the provided methods are applicable to large-scale datasets and scalable for causal analysis in health data. Followed by the research methods and experiments, this dissertation gives thoughtful discussions on the reliability of causal discoveries computational health science research, complexity, and implications in health science research.
Computational Methods for Vulnerability Analysis and Resource Allocation in Public Health Emergencies
POD (Point of Dispensing)-based emergency response plans involving mass prophylaxis may seem feasible when considering the choice of dispensing points within a region, overall population density, and estimated traffic demands. However, the plan may fail to serve particular vulnerable sub-populations, resulting in access disparities during emergency response. Federal authorities emphasize on the need to identify sub-populations that cannot avail regular services during an emergency due to their special needs to ensure effective response. Vulnerable individuals require the targeted allocation of appropriate resources to serve their special needs. Devising schemes to address the needs of vulnerable sub-populations is essential for the effectiveness of response plans. This research focuses on data-driven computational methods to quantify and address vulnerabilities in response plans that require the allocation of targeted resources. Data-driven methods to identify and quantify vulnerabilities in response plans are developed as part of this research. Addressing vulnerabilities requires the targeted allocation of appropriate resources to PODs. The problem of resource allocation to PODs during public health emergencies is introduced and the variants of the resource allocation problem such as the spatial allocation, spatio-temporal allocation and optimal resource subset variants are formulated. Generating optimal resource allocation and scheduling solutions can be computationally hard problems. The application of metaheuristic techniques to find near-optimal solutions to the resource allocation problem in response plans is investigated. A vulnerability analysis and resource allocation framework that facilitates the demographic analysis of population data in the context of response plans, and the optimal allocation of resources with respect to the analysis are described.
A Computer Algorithm for Synthetic Seismograms
Synthetic seismograms are a computer-generated aid in the search for hydrocarbons. Heretofore the solution has been done by z-transforms. This thesis presents a solution based on the method of finite differences. The resulting algorithm is fast and compact. The method is applied to three variations of the problem, all three are reduced to the same approximating equation, which is shown to be optimal, in that grid refinement does not change it. Two types of algorithms are derived from the equation. The number of obvious multiplications, additions and subtractions of each is analyzed. Critical section of each requires one multiplication, two additions and two subtractions. Four sample synthetic seismograms are shown. Implementation of the new algorithm runs twice as fast as previous computer program.
Computer Analysis of Amino Acid Chromatography
The problem with which this research was done was that of applying the IBM360 computer to the analysis of waveforms from a Beckman model 120C liquid chromatograph. Software to interpret these waveforms was written in the PLl language. For a control run, input to the computer consisted of a digital tape containing the raw results of the chromatograph run. Output consisted of several graphs and charts giving the results of the analysis. In addition, punched output was provided which gave the name of each amino acid, its elution time and color constant. These punched cards were then input to the computer as input to the experimental run, along with the raw data on the digital tape. From the known amounts of amino acids in the control run and the ratio of control to experimental peak area, the amino acids of the unknown were quantified. The resulting programs provided a complete and easy to use solution to the problem of chromatographic data analysis.
Computer Graphics Primitives and the Scan-Line Algorithm
This paper presents the scan-line algorithm which has been implemented on the Lisp Machine. The scan-line algorithm resides beneath a library of primitive software routines which draw more fundamental objects: lines, triangles and rectangles. This routine, implemented in microcode, applies the A(BC)*D approach to word boundary alignments in order to create an extremely fast, efficient, and general purpose drawing primitive. The scan-line algorithm improves on previous methodologies by limiting the number of CPU intensive instructions and by minimizing the number of words referenced. This paper will describe how to draw scan-lines and the constraints imposed upon the scan-line algorithm by the Lisp Machine's hardware and software.
Computer Realization of Human Music Cognition
This study models the human process of music cognition on the digital computer. The definition of music cognition is derived from the work in music cognition done by the researchers Carol Krumhansl and Edward Kessler, and by Mari Jones, as well as from the music theories of Heinrich Schenker. The computer implementation functions in three stages. First, it translates a musical "performance" in the form of MIDI (Musical Instrument Digital Interface) messages into LISP structures. Second, the various parameters of the performance are examined separately a la Jones's joint accent structure, quantified according to psychological findings, and adjusted to a common scale. The findings of Krumhansl and Kessler are used to evaluate the consonance of each note with respect to the key of the piece and with respect to the immediately sounding harmony. This process yields a multidimensional set of points, each of which is a cognitive evaluation of a single musical event within the context of the piece of music within which it occurred. This set of points forms a metric space in multi-dimensional Euclidean space. The third phase of the analysis maps the set of points into a topology-preserving data structure for a Schenkerian-like middleground structural analysis. This process yields a hierarchical stratification of all the musical events (notes) in a piece of music. It has been applied to several pieces of music with surprising results. In each case, the analysis obtained very closely resembles a structural analysis which would be supplied by a human theorist. The results obtained invite us to take another look at the representation of knowledge and perception from another perspective, that of a set of points in a topological space, and to ask if such a representation might not be useful in other domains. It also leads us to ask if such a …
A Computer Solved Scheduling Problem
The purpose of this paper is to illustrate the use of the computer in solving complex real time scheduling problems. This problem involves the airline industry and is concerned with the local scheduling of security personnel to the gate areas for outgoing flights from one terminal at Dallas-Fort Worth airport. The purpose of this type of program is to enhance personnel efficiency and management control over a large group of people while cutting the cost of lower management.
Computerized Analysis of Radiograph Images of Embedded Objects as Applied to Bone Location and Mineral Content Measurement
This investigation dealt with locating and measuring x-ray absorption of radiographic images. The methods developed provide a fast, accurate, minicomputer control, for analysis of embedded objects. A PDP/8 computer system was interfaced with a Joyce Loebl 3CS Microdensitometer and a Leeds & Northrup Recorder. Proposed algorithms for bone location and data smoothing work on a twelve-bit minicomputer. Designs of a software control program and operational procedure are presented. The filter made wedge and limb scans monotonic from minima to maxima. It was tested for various convoluted intervals. Ability to resmooth the same data in multiple passes was tested. An interval size of fifteen works well in one pass.
Concurrent Pattern Recognition and Optical Character Recognition
The problem of interest as indicated is to develop a general purpose technique that is a combination of the structural approach, and an extension of the Finite Inductive Sequence (FI) technique. FI technology is pre-algebra, and deals with patterns for which an alphabet can be formulated.
Content-Based Image Retrieval by Integration of Metadata Encoded Multimedia Features in Constructing a Video Summarizer Application.
Content-based image retrieval (CBIR) is the retrieval of images from a collection by means of internal feature measures of the information content of the images. In CBIR systems, text media is usually used only to retrieve exemplar images for further searching by image feature content. This research work describes a new method for integrating multimedia text and image content features to increase the retrieval performance of the system. I am exploring the content-based features of an image extracted from a video to build a storyboard for search retrieval of images. Metadata encoded multimedia features include extracting primitive features like color, shape and text from an image. Histograms are built for all the features extracted and stored in a database. Images are searched based on comparing these histogram values of the extracted image with the stored values. These histogram values are used for extraction of keyframes from a collection of images parsed from a video file. Individual shots of images are extracted from a video clip and run through processes that extract the features and build the histogram values. A keyframe extraction algorithm is run to get the keyframes from the collection of images to build a storyboard of images. In video retrieval, speech recognition and other multimedia encoding could help improve the CBIR indexing technique and makes keyframe extraction and searching effective. Research in area of embedding sound and other multimedia could enhance effective video retrieval.
Control Mechanisms and Recovery Techniques for Real-Time Data Transmission Over the Internet.
Streaming multimedia content with UDP has become popular over distributed systems such as an Internet. This may encounter many losses due to dropped packets or late arrivals at destination since UDP can only provide best effort delivery. Even UDP doesn't have any self-recovery mechanism from congestion collapse or bursty loss to inform sender of the data to adjust future transmission rate of data like in TCP. So there is a need to incorporate various control schemes like forward error control, interleaving, and congestion control and error concealment into real-time transmission to prevent from effect of losses. Loss can be repaired by retransmission if roundtrip delay is allowed, otherwise error concealment techniques will be used based on the type and amount of loss. This paper implements the interleaving technique with packet spacing of varying interleaver block size for protecting real-time data from loss and its effect during transformation across the Internet. The packets are interleaved and maintain some time gap between two consecutive packets before being transmitted into the Internet. Thus loss of packets can be reduced from congestion and preventing loss of consecutive packets of information when a burst of several packets are lost. Several experiments have been conducted with video data for analysis of proposed model.
Convexity-Preserving Scattered Data Interpolation
Surface fitting methods play an important role in many scientific fields as well as in computer aided geometric design. The problem treated here is that of constructing a smooth surface that interpolates data values associated with scattered nodes in the plane. The data is said to be convex if there exists a convex interpolant. The problem of convexity-preserving interpolation is to determine if the data is convex, and construct a convex interpolant if it exists.
Cross Language Information Retrieval for Languages with Scarce Resources
Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system.
Cuff-less Blood Pressure Measurement Using a Smart Phone
Blood pressure is vital sign information that physicians often need as preliminary data for immediate intervention during emergency situations or for regular monitoring of people with cardiovascular diseases. Despite the availability of portable blood pressure meters in the market, they are not regularly carried by people, creating a need for an ultra-portable measurement platform or device that can be easily carried and used at all times. One such device is the smartphone which, according to comScore survey is used by 26.2% of the US adult population. the mass production of these phones with built-in sensors and high computation power has created numerous possibilities for application development in different domains including biomedical. Motivated by this capability and their extensive usage, this thesis focuses on developing a blood pressure measurement platform on smartphones. Specifically, I developed a blood pressure measurement system on a smart phone using the built-in camera and a customized external microphone. the system consists of first obtaining heart beats using the microphone and finger pulse with the camera, and finally calculating the blood pressure using the recorded data. I developed techniques for finding the best location for obtaining the data, making the system usable by all categories of people. the proposed system resulted in accuracies between 90-100%, when compared to traditional blood pressure meters. the second part of this thesis presents a new system for remote heart beat monitoring using the smart phone. with the proposed system, heart beats can be transferred live by patients and monitored by physicians remotely for diagnosis. the proposed blood pressure measurement and remote monitoring systems will be able to facilitate information acquisition and decision making by the 9-1-1 operators.
DADS - A Distributed Agent Delivery System
Mobile agents require an appropriate platform that can facilitate their migration and execution. In particular, the design and implementation of such a system must balance several factors that will ensure that its constituent agents are executed without problems. Besides the basic requirements of migration and execution, an agent system must also provide mechanisms to ensure the security and survivability of an agent when it migrates between hosts. In addition, the system should be simple enough to facilitate its widespread use across large scale networks (i.e Internet). To address these issues, this thesis discusses the design and implementation of the Distributed Agent Delivery System (DADS). The DADS provides a de-coupled design that separates agent acceptance from agent execution. Using functional modules, the DADS provides services ranging from language execution and security to fault-tolerance and compression. Modules allow the administrator(s) of hosts to declare, at run-time, the services that they want to provide. Since each administrative domain is different, the DADS provides a platform that can be adapted to exchange heterogeneous blends of agents across large scale networks.
Data-Driven Decision-Making Framework for Large-Scale Dynamical Systems under Uncertainty
Managing large-scale dynamical systems (e.g., transportation systems, complex information systems, and power networks, etc.) in real-time is very challenging considering their complicated system dynamics, intricate network interactions, large scale, and especially the existence of various uncertainties. To address this issue, intelligent techniques which can quickly design decision-making strategies that are robust to uncertainties are needed. This dissertation aims to conquer these challenges by exploring a data-driven decision-making framework, which leverages big-data techniques and scalable uncertainty evaluation approaches to quickly solve optimal control problems. In particular, following techniques have been developed along this direction: 1) system modeling approaches to simplify the system analysis and design procedures for multiple applications; 2) effective simulation and analytical based approaches to efficiently evaluate system performance and design control strategies under uncertainty; and 3) big-data techniques that allow some computations of control strategies to be completed offline. These techniques and tools for analysis, design and control contribute to a wide range of applications including air traffic flow management, complex information systems, and airborne networks.
The Data Structure of a KSAM Key Directory
The purpose of this project is to explore the alternate data structures for a disk file which is currently a preorder binary tree. specifically, the file is the key directory for an implementation of Keyed Sequential Access Method (KSAM) in a mini-computer operating system. A new data structure will be chosen, with the reasons for that choice given, and it will be incorporated into the existing system.
Ddos Defense Against Botnets in the Mobile Cloud
Mobile phone advancements and ubiquitous internet connectivity are resulting in ever expanding possibilities in the application of smart phones. Users of mobile phones are now capable of hosting server applications from their personal devices. Whether providing services individually or in an ad hoc network setting the devices are currently not configured for defending against distributed denial of service (DDoS) attacks. These attacks, often launched from a botnet, have existed in the space of personal computing for decades but recently have begun showing up on mobile devices. Research is done first into the required steps to develop a potential botnet on the Android platform. This includes testing for the amount of malicious traffic an Android phone would be capable of generating for a DDoS attack. On the other end of the spectrum is the need of mobile devices running networked applications to develop security against DDoS attacks. For this mobile, phones are setup, with web servers running Apache to simulate users running internet connected applications for either local ad hoc networks or serving to the internet. Testing is done for the viability of using commonly available modules developed for Apache and intended for servers as well as finding baseline capabilities of mobiles to handle higher traffic volumes. Given the unique challenge of the limited resources a mobile phone can dedicate to Apache when compared to a dedicated hosting server a new method was needed. A proposed defense algorithm is developed for mitigating DDoS attacks against the mobile server that takes into account the limited resources available on the mobile device. The algorithm is tested against TCP socket flooding for effectiveness and shown to perform better than the common Apache module installations on a mobile device.
Defensive Programming
This research explores the concepts of defensive programming as currently defined in the literature. Then these concepts are extended and more explicitly defined. The relationship between defensive programming, as presented in this research, and current programming practices is discussed and several benefits are observed. Defensive programming appears to benefit the entire software life cycle. Four identifiable phases of the software development process are defined, and the relationship between these four phases and defensive programming is shown. In this research, defensive programming is defined as writing programs in such a way that during execution the program itself produces communication allowing the programmer and the user to observe its dynamic states accurately and critically. To accomplish this end, the use of defensive programming snap shots is presented as a software development tool.
Design and Analysis of Novel Verifiable Voting Schemes
Free and fair elections are the basis for democracy, but conducting elections is not an easy task. Different groups of people are trying to influence the outcome of the election in their favor using the range of methods, from campaigning for a particular candidate to well-financed lobbying. Often the stakes are too high, and the methods are illegal. Two main properties of any voting scheme are the privacy of a voter’s choice and the integrity of the tally. Unfortunately, they are mutually exclusive. Integrity requires making elections transparent and auditable, but at the same time, we must preserve a voter’s privacy. It is always a trade-off between these two requirements. Current voting schemes favor privacy over auditability, and thus, they are vulnerable to voting fraud. I propose two novel voting systems that can achieve both privacy and verifiability. The first protocol is based on cryptographical primitives to ensure the integrity of the final tally and privacy of the voter. The second protocol is a simple paper-based voting scheme that achieves almost the same level of security without usage of cryptography.
Design and Implementation of a Parser for the DBase II Query Language
In this paper the DBase II query language of an RDBMS for personal computers is discussed. Other languages will be provided by large and sophisticated DBMS will not be discussed here. The reason for selecting the DBase II query language for discussion are as follows: 1. It is a simple language that can be learned easily [TOWN 84, DINE 84]. Within a short period, users can learn all of the facilities and manage the system very well. 2. It is a language suitable for interactive programming and execution like BASIC. 3. It provides adequate facilities for a small data base system and serves as an introductory guide for more sophisticated systems.
Design and Implementation of a PDP-8 Computer Assembler Executing on the IBM 360/50 Computer
This problem is intended to be an introduction to the design of a software system which translates PDP-8 assembly language source into it's machine-readable object code. This assembler runs on the IBM 360/50. It is assumed that the reader is familiar with the basic PDP-8 assembly language. For the description and use of this assembler the reader is referred to the PAL-III SYMBOLIC ASSEMBLER PROGRAMMING MANUAL from DEC (order number DIGITAL 8-3-5, Digital Equipment Corporation: Maynard, Massachusetts, 1965.). The Second problem of the study concerns the design of a simulator for the PDP-8 computer.
The Design and Implementation of a Prolog Parser Using Javacc
Operatorless Prolog text is LL(1) in nature and any standard LL parser generator tool can be used to parse it. However, the Prolog text that conforms to the ISO Prolog standard allows the definition of dynamic operators. Since Prolog operators can be defined at run-time, operator symbols are not present in the grammar rules of the language. Unless the parser generator allows for some flexibility in the specification of the grammar rules, it is very difficult to generate a parser for such text. In this thesis we discuss the existing parsing methods and their modified versions to parse languages with dynamic operator capabilities. Implementation details of a parser using Javacc as a parser generator tool to parse standard Prolog text is provided. The output of the parser is an “Abstract Syntax Tree” that reflects the correct precedence and associativity rules among the various operators (static and dynamic) of the language. Empirical results are provided that show that a Prolog parser that is generated by the parser generator like Javacc is comparable in efficiency to a hand-coded parser.
Design and Implementation of a Text Editor Under Music Interactive Operating System
An interactive text editor is a computer program that allows a user to create and revise a target document such as program statements, manuscript text, and numeric data through an online terminal and the computer. It allows text to be modified and corrected many orders of magnitude faster and more easily than would manual correction. The most important characteristic of the text editor is its convenience for the user. Such convenience requires a simple, mnemonic command language which is easy to use and understand.
The Design and Implementation of an Intelligent Agent-Based File System
As bandwidth constraints on LAN/WAN environments decrease, the demand for distributed services will continue to increase. In particular, the proliferation of user-level applications requiring high-capacity distributed file storage systems will demand that such services be universally available. At the same time, the advent of high-speed networks have made the deployment of application and communication solutions based upon an Intelligent Mobile Agent (IMA) framework practical. Agents have proven to present an ideal development paradigm for the creation of autonomous large-scale distributed systems, and an agent-based communication scheme would facilitate the creation of independently administered distributed file services. This thesis thus outlines an architecture for such a distributed file system based upon an IMA communication framework.
Design and Implementation of Large-Scale Wireless Sensor Networks for Environmental Monitoring Applications
Environmental monitoring represents a major application domain for wireless sensor networks (WSN). However, despite significant advances in recent years, there are still many challenging issues to be addressed to exploit the full potential of the emerging WSN technology. In this dissertation, we introduce the design and implementation of low-power wireless sensor networks for long-term, autonomous, and near-real-time environmental monitoring applications. We have developed an out-of-box solution consisting of a suite of software, protocols and algorithms to provide reliable data collection with extremely low power consumption. Two wireless sensor networks based on the proposed solution have been deployed in remote field stations to monitor soil moisture along with other environmental parameters. As parts of the ever-growing environmental monitoring cyberinfrastructure, these networks have been integrated into the Texas Environmental Observatory system for long-term operation. Environmental measurement and network performance results are presented to demonstrate the capability, reliability and energy-efficiency of the network.
A Design Approach for Digital Computer Peripheral Controllers, Case Study Design and Construction
The purpose of this project was to describe a novel design approach for a digital computer peripheral controller, then design and construct a case study controller. This document consists of three chapters and an appendix. Chapter II presents the design approach chosen; a variation to a design presented by Charles R. Richards in an article published in Electronics magazine. Richards' approach consists of a finite state machine circuitry controlling all the functions of a controller. The variation to Richards' approach consists of considering the various logically independent processes which a controller carries out and assigning control of each process to a separate finite state machine. The appendix contains the documentation of the design and construction of the controller.
The Design Of A Benchmark For Geo-stream Management Systems
The recent growth in sensor technology allows easier information gathering in real-time as sensors have grown smaller, more accurate, and less expensive. The resulting data is often in a geo-stream format continuously changing input with a spatial extent. Researchers developing geo-streaming management systems (GSMS) require a benchmark system for evaluation, which is currently lacking. This thesis presents GSMark, a benchmark for evaluating GSMSs. GSMark provides a data generator that creates a combination of synthetic and real geo-streaming data, a workload simulator to present the data to the GSMS as a data stream, and a set of benchmark queries that evaluate typical GSMS functionality and query performance. In particular, GSMark generates both moving points and evolving spatial regions, two fundamental data types for a broad range of geo-stream applications, and the geo-streaming queries on this data.
Detecting Component Failures and Critical Components in Safety Critical Embedded Systems using Fault Tree Analysis
Component failures can result in catastrophic behaviors in safety critical embedded systems, sometimes resulting in loss of life. Component failures can be treated as off nominal behaviors (ONBs) with respect to the components and sub systems involved in an embedded system. A lot of research is being carried out to tackle the problem of ONBs. These approaches are mainly focused on the states (i.e., desired and undesired states of a system at a given point of time to detect ONBs). In this paper, an approach is discussed to detect component failures and critical components of an embedded system. The approach is based on fault tree analysis (FTA), applied to the requirements specification of embedded systems at design time to find out the relationship between individual component failures and overall system failure. FTA helps in determining both qualitative and quantitative relationship between component failures and system failure. Analyzing the system at design time helps in detecting component failures and critical components and helps in devising strategies to mitigate component failures at design time and improve overall safety and reliability of a system.
Detection of Ulcerative Colitis Severity and Enhancement of Informative Frame Filtering Using Texture Analysis in Colonoscopy Videos
There are several types of disorders that affect our colon’s ability to function properly such as colorectal cancer, ulcerative colitis, diverticulitis, irritable bowel syndrome and colonic polyps. Automatic detection of these diseases would inform the endoscopist of possible sub-optimal inspection during the colonoscopy procedure as well as save time during post-procedure evaluation. But existing systems only detects few of those disorders like colonic polyps. In this dissertation, we address the automatic detection of another important disorder called ulcerative colitis. We propose a novel texture feature extraction technique to detect the severity of ulcerative colitis in block, image, and video levels. We also enhance the current informative frame filtering methods by detecting water and bubble frames using our proposed technique. Our feature extraction algorithm based on accumulation of pixel value difference provides better accuracy at faster speed than the existing methods making it highly suitable for real-time systems. We also propose a hybrid approach in which our feature method is combined with existing feature method(s) to provide even better accuracy. We extend the block and image level detection method to video level severity score calculation and shot segmentation. Also, the proposed novel feature extraction method can detect water and bubble frames in colonoscopy videos with very high accuracy in significantly less processing time even when clustering is used to reduce the training size by 10 times.
Determining Event Outcomes from Social Media
An event is something that happens at a time and location. Events include major life events such as graduating college or getting married, and also simple day-to-day activities such as commuting to work or eating lunch. Most work on event extraction detects events and the entities involved in events. For example, cooking events will usually involve a cook, some utensils and appliances, and a final product. In this work, we target the task of determining whether events result in their expected outcomes. Specifically, we target cooking and baking events, and characterize event outcomes into two categories. First, we distinguish whether something edible resulted from the event. Second, if something edible resulted, we distinguish between perfect, partial and alternative outcomes. The main contributions of this thesis are a corpus of 4,000 tweets annotated with event outcome information and experimental results showing that the task can be automated. The corpus includes tweets that have only text as well as tweets that have text and an image.
Determining Whether and When People Participate in the Events They Tweet About
This work describes an approach to determine whether people participate in the events they tweet about. Specifically, we determine whether people are participants in events with respect to the tweet timestamp. We target all events expressed by verbs in tweets, including past, present and events that may occur in future. We define event participant as people directly involved in an event regardless of whether they are the agent, recipient or play another role. We present an annotation effort, guidelines and quality analysis with 1,096 event mentions. We discuss the label distributions and event behavior in the annotated corpus. We also explain several features used and a standard supervised machine learning approach to automatically determine if and when the author is a participant of the event in the tweet. We discuss trends in the results obtained and devise important conclusions.
Developing a Test Bed for Interactive Narrative in Virtual Environments
As Virtual Environments (VE) become a more commonly used method of interaction and presentation, supporting users as they navigate and interact with scenarios presented in VE will be a significant issue. A key step in understanding the needs of users in these situations will be observing them perform representative tasks in a fully developed environment. In this paper, we describe the development of a test bed for interactive narrative in a virtual environment. The test bed was specifically developed to present multiple, simultaneous sequences of events (scenarios or narratives) and to support user navigation through these scenarios. These capabilities will support the development of multiple users testing scenarios, allowing us to study and better understand the needs of users of narrative VEs.
Development, Implementation, and Analysis of a Contact Model for an Infectious Disease
With a growing concern of an infectious diseases spreading in a population, epidemiology is becoming more important for the future of public health. In the past epidemiologist used existing data of an outbreak to help them determine how an infectious disease might spread in the future. Now with computational models, they able to analysis data produced by these models to help with prevention and intervention plans. This paper looks at the design, implementation, and analysis of a computational model based on the interactions of the population between individuals. The design of the working contact model looks closely at the SEIR model used as the foundation and the two timelines of a disease. The implementation of the contact model is reviewed while looking closely at data structures. The analysis of the experiments provide evidence this contact model can be used to help epidemiologist study the spread of an infectious disease based on the contact rate of individuals.
Back to Top of Screen