Search Results

XML-Based Agent Scripts and Inference Mechanisms
Natural language understanding has been a persistent challenge for researchers in various computer science fields, with applications ranging from user support systems to entertainment and online teaching. A long-term goal of the Artificial Intelligence field is to implement mechanisms that enable computers to emulate human dialogue. The recently developed ALICEbots, virtual agents from the A.L.I.C.E. foundation, use AIML scripts (an XML dialect) as the underlying pattern database for question answering. Their goal is to enable pattern-based, stimulus-response knowledge content to be served, received and processed over the Web, or offline, in a manner similar to HTML and XML. In this thesis, we describe a system that converts the AIML scripts to Prolog clauses and reuses them as part of a knowledge processor. The inference mechanism developed in this thesis is able to successfully match the input pattern with our clause database even if words are missing. We also emulate the pattern deduction algorithm of the original logic deduction mechanism. Our rules, compatible with Semantic Web standards, bring structure to the meaningful content of Web pages and support interactive content retrieval using natural language.
Agent Extensions for Peer-to-Peer Networks.
Peer-to-Peer (P2P) networks have seen tremendous growth in development and usage in recent times. This attention has brought many developments as well as new challenges to these networks. We show that agent extensions to P2P networks offer solutions to many problems faced by P2P networks. In this research, we bring together the JXTA P2P infrastructure and Jinni, a Prolog-based agent engine, to form an agent-based P2P network. On top of JXTA, we define a simple Java API providing P2P services for agent programming constructs. Jinni is deployed on this JXTA network using an automated code update mechanism. Experiments are conducted on this Jinni/JXTA platform to implement a simple agent communication and data exchange protocol.
An Analysis of Motivational Cues in Virtual Environments.
Guiding navigation in virtual environments (VEs) is a challenging task. A key issue in the navigation of a virtual environment is to be able to strike a balance between the user's need to explore the environment freely and the designer's need to ensure that the user experiences all the important events in the VE. This thesis reports on a study aimed at comparing the effectiveness of various navigation cues that are used to motivate users towards a specific target location. The results of this study indicate some significant differences in how users responded to the various cues.
A Comparison of Agent-Oriented Software Engineering Frameworks and Methodologies
Agent-oriented software engineering (AOSE) addresses the development of systems built from software agents. Many techniques, mostly agent-oriented and object-oriented, are available as building blocks for agent-based systems. Several AOSE methodologies have been proposed to guide engineers in combining these elements so that agents achieve the overall system goals. Although these solutions are promising, most of them were designed in an ad hoc manner, do not fully follow the software development life cycle, and lack examination of agent-oriented features. To address these issues, we investigated state-of-the-art techniques and AOSE methodologies. By examining them in different respects, we commented on their strengths and weaknesses. For a more formal study, we set up a comparison framework covering four aspects: concepts and properties, notations and modeling techniques, process, and pragmatics. Under these criteria, we conducted the comparison at both an overview and a detailed level. The comparison supported an empirical and analytical study of how an ideal agent-based system should be formed.
Embedded monitors for detecting and preventing intrusions in cryptographic and application protocols.
There are two main approaches for intrusion detection: signature-based and anomaly-based. Signature-based detection employs pattern matching to match attack signatures with observed data making it ideal for detecting known attacks. However, it cannot detect unknown attacks for which there is no signature available. Anomaly-based detection builds a profile of normal system behavior to detect known and unknown attacks as behavioral deviations. However, it has a drawback of a high false alarm rate. In this thesis, we describe our anomaly-based IDS designed for detecting intrusions in cryptographic and application-level protocols. Our system has several unique characteristics, such as the ability to monitor cryptographic protocols and application-level protocols embedded in encrypted sessions, a very lightweight monitoring process, and the ability to react to protocol misuse by modifying protocol response directly.
Improved Approximation Algorithms for Geometric Packing Problems With Experimental Evaluation
Geometric packing problems are NP-complete problems that arise in VLSI design. In this thesis, we present two novel algorithms using dynamic programming to compute exactly the maximum number of k x k squares of unit size that can be packed without overlap into a given n x m grid. The first algorithm was implemented and ran successfully on large inputs of up to 1,000,000 nodes for different values. A heuristic based on the second algorithm was also implemented. This heuristic is fast in practice, but may not always achieve optimal running times in theory. However, over a wide range of random data, this version of the algorithm produces very good solutions very quickly and runs on problems of up to 100,000,000 nodes in a grid and different ranges for the variables. This version of the algorithm is also shown to be clearly superior to the first algorithm and very efficient in practice.
Intelligent Memory Management Heuristics
Automatic memory management is crucial in implementation of runtime systems even though it induces a significant computational overhead. In this thesis I explore the use of statistical properties of the directed graph describing the set of live data to decide between garbage collection and heap expansion in a memory management algorithm combining the dynamic array represented heaps with a mark and sweep garbage collector to enhance its performance. The sampling method predicting the density and the distribution of useful data is implemented as a partial marking algorithm. The algorithm randomly marks the nodes of the directed graph representing the live data at different depths with a variable probability factor p. Using the information gathered by the partial marking algorithm in the current step and the knowledge gathered in the previous iterations, the proposed empirical formula predicts with reasonable accuracy the density of live nodes on the heap, to decide between garbage collection and heap expansion. The resulting heuristics are tested empirically and shown to improve overall execution performance significantly in the context of the Jinni Prolog compiler's runtime system.
Performance Evaluation of Data Integrity Mechanisms for Mobile Agents
With the growing popularity of e-commerce applications that use software agents, the protection of mobile agent data has become imperative. To that end, the performance of four methods that protect the data integrity of mobile agents is evaluated. The methods investigated include existing approaches known as the Partial Result Authentication Codes, Hash Chaining, and Set Authentication Code methods, and a technique of our own design, called the Modified Set Authentication Code method, which addresses the limitations of the Set Authentication Code method. The experiments were run using the DADS agent system (developed at the Network Research Laboratory at UNT), for which a Data Integrity Module was designed. The experimental results show that our Modified Set Authentication Code technique performed comparably to the Set Authentication Code method.
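The general idea behind hash chaining can be illustrated with a short sketch. This is a simplified illustration and not the protocol evaluated in the thesis: it assumes a plain SHA-256 chain with no signatures or encryption, and the names `SEED`, `link`, `append_result`, and `verify` are hypothetical. Each host's result is bound to the digest of everything collected before it, so changing or removing an earlier result breaks every later link.

```python
import hashlib

SEED = b"\x00" * 32  # hypothetical fixed starting value for the chain


def link(prev_digest, result):
    """Digest binding one host's result to everything recorded before it."""
    return hashlib.sha256(prev_digest + result).digest()


def append_result(chain, result):
    """Add a (result, digest) entry chained to the previous entry."""
    prev = chain[-1][1] if chain else SEED
    chain.append((result, link(prev, result)))


def verify(chain):
    """Recompute every link; any altered or missing entry breaks the chain."""
    prev = SEED
    for result, digest in chain:
        if link(prev, result) != digest:
            return False
        prev = digest
    return True


# Example itinerary: an agent collects one offer per visited host.
chain = []
for offer in (b"shop-A: 120", b"shop-B: 115", b"shop-C: 130"):
    append_result(chain, offer)
```

An unkeyed chain like this only detects modification by a party that cannot recompute the later digests; practical schemes add signatures or keyed codes for that reason.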
Analysis of Web Services on J2EE Application Servers
The Internet has become a standard way of exchanging business data between B2B and B2C applications, and with this came the need to provide various services on the web instead of just static text and images. Web services are a new type of service offered via the web that aid in the creation of globally distributed applications. Web services are enhanced e-business applications that are easier to advertise and easier to discover on the Internet because of their flexibility and uniformity. In a real-life scenario, it is difficult to decide which J2EE application server to choose when deploying an enterprise web service. This thesis analyzes the various ways in which web services can be developed and deployed. Underlying protocols and crucial issues such as EAI (enterprise application integration), asynchronous messaging, and the Registry tModel architecture have been considered in this research. This thesis reports on what various J2EE application servers provide, based on a case study and on applications developed to test functionality.
A general purpose semantic parser using FrameNet and WordNet®.
Syntactic parsing is one of the best understood language processing applications. Since language and grammar have been formally defined, it is easy for computers to parse the syntactic structure of natural language text. Does meaning have structure as well? If it has, how can we analyze the structure? Previous systems rely on a one-to-one correspondence between syntactic rules and semantic rules. But such systems can only be applied to limited fragments of English. In this thesis, we propose a general-purpose shallow semantic parser which utilizes a semantic network (WordNet), and a frame dataset (FrameNet). Semantic relations recognized by the parser are based on how human beings represent knowledge of the world. Parsing semantic structure allows semantic units and constituents to be accessed and processed in a more meaningful way than syntactic parsing, moving the automation of understanding natural language text to a higher level.
Hopfield Networks as an Error Correcting Technique for Speech Recognition
I experimented with Hopfield networks in the context of a voice-based, query-answering system. Hopfield networks are used to store and retrieve patterns. I used this technique to store queries represented as natural language sentences and I evaluated the accuracy of the technique for error correction in a spoken question-answering dialog between a computer and a user. I show that the use of an auto-associative Hopfield network helps make the speech recognition system more fault tolerant. I also looked at the available encoding schemes to convert a natural language sentence into a pattern of zeroes and ones that can be stored in the Hopfield network reliably, and I suggest scalable data representations which allow storing a large number of queries.
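The store-and-retrieve behaviour described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the encoding or network used in the thesis: patterns are short bipolar vectors (+1/-1 instead of zeroes and ones), weights come from the standard Hebbian rule, and the names `train` and `recall` are hypothetical.

```python
def train(patterns):
    """Hebbian weight matrix for a list of equal-length bipolar patterns."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            new = 1 if h >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# Two stored "queries" (orthogonal 8-bit patterns chosen for illustration).
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train([p1, p2])

# A copy of p1 with its first bit flipped, standing in for a recognition error.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
corrected = recall(w, noisy)
```

Here `corrected` settles back to `p1`, which is the error-correcting effect the abstract refers to. How many patterns can be stored reliably depends on the pattern length and on how similar the patterns are.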
Impact of actual interference on capacity and call admission control in a CDMA network.
An overwhelming number of models in the literature use average inter-cell interference for the calculation of capacity of a Code Division Multiple Access (CDMA) network. The advantage gained in terms of simplicity by using such models comes at the cost of rendering the exact location of a user within a cell irrelevant. We calculate the actual per-user interference and analyze the effect of user-distribution within a cell on the capacity of a CDMA network. We show that even though the capacity obtained using average interference is a good approximation to the capacity calculated using actual interference for a uniform user distribution, the deviation can be tremendously large for non-uniform user distributions. Call admission control (CAC) algorithms are responsible for efficient management of a network's resources while guaranteeing the quality of service and grade of service, i.e., accepting the maximum number of calls without affecting the quality of service of calls already present in the network. We design and implement global and local CAC algorithms, and through simulations compare their network throughput and blocking probabilities for varying mobility scenarios. We show that even though our global CAC is better at resource management, the lack of substantial gain in network throughput and exponential increase in complexity makes our optimized local CAC algorithm a much better choice for a given traffic distribution profile.
Performance comparison of data distribution management strategies in large-scale distributed simulation.
Data distribution management (DDM) is a High Level Architecture/Run-time Infrastructure (HLA/RTI) service that manages the distribution of state updates and interaction information in large-scale distributed simulations. The key to efficient DDM is to limit and control the volume of data exchanged during the simulation, to relay data to only those hosts requiring the data. This thesis focuses upon different DDM implementations and strategies. This thesis includes analysis of three DDM methods including the fixed grid-based, dynamic grid-based, and region-based methods. Also included is the use of multi-resolution modeling with various DDM strategies and analysis of the performance effects of aggregation/disaggregation with these strategies. Running numerous federation executions, I simulate four different scenarios on a cluster of workstations with a mini-RTI Kit framework and propose a set of benchmarks for a comparison of the DDM schemes. The goals of this work are to determine the most efficient model for applying each DDM scheme, discover the limitations of the scalability of the various DDM methods, evaluate the effects of aggregation/disaggregation on performance and resource usage, and present accepted benchmarks for use in future research.
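As a rough illustration of fixed grid-based matching, the sketch below maps rectangular regions onto grid cells and pairs each publisher with every subscriber that shares a cell. It is a sketch under assumptions, not the mini-RTI Kit implementation: regions are axis-aligned rectangles `(x0, y0, x1, y1)` with non-negative coordinates, and the names `cells` and `match` are hypothetical.

```python
from collections import defaultdict


def cells(region, cell_size):
    """Set of grid cells (column, row) that a rectangular region touches."""
    x0, y0, x1, y1 = region
    return {
        (cx, cy)
        for cx in range(int(x0 // cell_size), int(x1 // cell_size) + 1)
        for cy in range(int(y0 // cell_size), int(y1 // cell_size) + 1)
    }


def match(publishers, subscribers, cell_size):
    """Map each publisher to the subscribers sharing at least one cell."""
    by_cell = defaultdict(set)
    for name, region in subscribers.items():
        for c in cells(region, cell_size):
            by_cell[c].add(name)
    result = {}
    for name, region in publishers.items():
        receivers = set()
        for c in cells(region, cell_size):
            receivers |= by_cell.get(c, set())
        result[name] = receivers
    return result


routes = match(
    {"tank": (0, 0, 5, 5)},
    {"radar": (4, 4, 12, 12), "far": (50, 50, 60, 60)},
    cell_size=10,
)
```

Matching by cell avoids comparing every pair of regions, but two regions can share a cell without overlapping, so a grid scheme may deliver some data that the receiver does not need. A region-based scheme would test the rectangles for exact overlap instead.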
Modeling the Impact and Intervention of a Sexually Transmitted Disease: Human Papilloma Virus
Many human papilloma virus (HPV) types are sexually transmitted, and HPV DNA types 16, 18, 31, and 45 account for more than 75% of all cervical dysplasia. Candidate vaccines are successfully completing US Food and Drug Administration (FDA) phase III testing and several drug companies are in licensing arbitration. Once a vaccine becomes available, it is unlikely that 100% vaccination coverage will be achieved; hence the need for vaccination strategies that produce the greatest reduction in the endemic prevalence of HPV. This thesis introduces two discrete-time models for evaluating the effect of demographic-biased vaccination strategies: one model incorporates temporal demographics (i.e., age) in population compartments; the other incorporates non-temporal demographics (i.e., race, ethnicity). Also presented is an intuitive Web-based interface that was developed to allow the user to evaluate the effects on prevalence of a demographic-biased intervention by tailoring the model parameters to specific demographics and geographical region.
Analyzing Microwave Spectra Collected by the Solar Radio Burst Locator
Modern communication systems rely heavily upon microwave, radio, and other electromagnetic frequency bands as a means of providing wireless communication links. Although convenient, wireless communication is susceptible to electromagnetic interference. Solar activity causes both direct interference through electromagnetic radiation as well as indirect interference caused by charged particles interacting with Earth's magnetic field. The Solar Radio Burst Locator (SRBL) is a United States Air Force radio telescope designed to detect and locate solar microwave bursts as they occur on the Sun. By analyzing these events, the Air Force hopes to gain a better understanding of the root causes of solar interference and improve interference forecasts. This thesis presents methods of searching and analyzing events found in the previously unstudied SRBL data archive. A new web-based application aids in the searching and visualization of the data. Comparative analysis is performed amongst data collected by SRBL and several other instruments. This thesis also analyzes events across the time, intensity, and frequency domains. These analysis methods can be used to aid in the detection and understanding of solar events so as to provide improved forecasts of solar-induced electromagnetic interference.
Arithmetic Computations and Memory Management Using a Binary Tree Encoding of Natural Numbers
Two applications of a binary tree data type based on a simple pairing function (a bijection between natural numbers and pairs of natural numbers) are explored. First, the tree is used to encode natural numbers, and algorithms that perform basic arithmetic computations are presented along with formal proofs of their correctness. Second, using this "canonical" representation as a base type, algorithms for encoding and decoding additional isomorphic data types of other mathematical constructs (sets, sequences, etc.) are also developed. An experimental application to a memory management system is constructed and explored using these isomorphic types. A practical analysis of this system's runtime complexity and space savings is provided, along with a proof-of-concept framework for both applications of the binary tree type, in the Java programming language.
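To make the idea of a pairing-based tree encoding concrete, here is a small sketch. The abstract does not say which pairing function the thesis uses, so this assumes one well-known choice, pair(x, y) = 2^x (2y + 1) - 1, and represents trees as nested tuples; the names `pair`, `unpair`, `to_tree`, and `from_tree` are hypothetical, and Python is used here only for brevity.

```python
def pair(x, y):
    """Bijection from pairs of naturals to naturals: 2**x * (2*y + 1) - 1."""
    return (1 << x) * (2 * y + 1) - 1


def unpair(z):
    """Inverse of pair: strip factors of two, the odd remainder gives y."""
    z += 1
    x = 0
    while z % 2 == 0:
        z //= 2
        x += 1
    return x, (z - 1) // 2


def to_tree(n):
    """Encode a natural number as a binary tree of nested tuples."""
    if n == 0:
        return ()  # the empty tree stands for zero
    x, y = unpair(n - 1)
    return (to_tree(x), to_tree(y))


def from_tree(t):
    """Decode a tree built by to_tree back into its natural number."""
    if t == ():
        return 0
    left, right = t
    return pair(from_tree(left), from_tree(right)) + 1
```

Because both components returned by `unpair(n - 1)` are strictly smaller than `n`, the recursion in `to_tree` always terminates, and every tree corresponds to exactly one natural number.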
SEM Predicting Success of Student Global Software Development Teams
The extensive use of global teams to develop software has prompted researchers to investigate various factors that can enhance a team’s performance. While a significant body of research exists on global software teams, previous research has not fully explored the interrelationships and collective impact of various factors on team performance. This study explored a model that added the characteristics of a team’s culture, ability, communication frequencies, response rates, and linguistic categories to a central framework of team performance. Data was collected from two student software development projects that occurred between teams located in the United States, Panama, and Turkey. The data was obtained through online surveys and recorded postings of team activities that occurred throughout the global software development projects. Partial least squares path modeling (PLS-PM) was chosen as the analytic technique to test the model and identify the most influential factors. Individual factors associated with response rates and linguistic characteristics proved to significantly affect a team’s activity related to grade on the project, group cohesion, and the number of messages received and sent. Moreover, an examination of possible latent homogeneous segments in the model supported the existence of differences among groups based on leadership style. Teams with assigned leaders tended to have stronger relationships between linguistic characteristics and team performance factors, while teams with emergent leaders had stronger relationships between response rates and team performance factors. The contributions of this dissertation are threefold: (1) novel analysis techniques using PLS-PM and clustering; (2) use of new, quantifiable variables in analyzing team activity; and (3) identification of plausible causal indicators for team performance and analysis of the same.
Extracting Temporally-Anchored Knowledge from Tweets
Twitter has quickly become one of the most popular social media sites. It has 313 million monthly active users, and 500 million tweets are published daily. With the massive number of tweets, Twitter users share information about a location along with the temporal awareness. In this work, I focus on tweets where the author of the tweet exclusively mentions a location in the tweet. Natural language processing systems can leverage a wide range of information from the tweets to build applications like recommender systems that predict the location of the author. This kind of system can be used to increase the visibility of the targeted audience and can also provide recommendations for interesting places to visit, hotels to stay, restaurants to eat, targeted on-line advertising, and co-traveler matching based on the temporal information extracted from a tweet. In this work I determine if the author of the tweet is present in the mentioned location of the tweet. I also determine if the author is present in the location before tweeting, while tweeting, or after tweeting. I introduce 5 temporal tags (before the tweet but > 24 hours; before the tweet but < 24 hours; while the tweet is posted; after the tweet is posted but < 24 hours; and after the tweet is posted but > 24 hours). The major contributions of this paper are: (1) creation of a corpus of 1062 tweets containing 1200 location named entities, annotated for whether the author of a tweet is or is not located in the location he tweets about with respect to 5 temporal tags; (2) detailed corpus analysis including real annotation examples and label distributions per temporal tag; (3) detailed inter-annotator agreements, including Cohen's kappa, Krippendorff's alpha and confusion matrices per temporal tag; (4) label distributions and analysis; and (5) supervised learning experiments, along with …