UNT Libraries - Browse

ABOUT BROWSE FEED

A Multi-Variate Analysis of SMTP Paths and Relays to Restrict Spam and Phishing Attacks in Emails

Description: The classifier discussed in this thesis considers the path traversed by an email (instead of its content) and reputation of the relays, features inaccessible to spammers. Groups of spammers and individual behaviors of a spammer in a given domain were analyzed to yield association patterns, which were then used to identify similar spammers. Unsolicited and phishing emails were successfully isolated from legitimate emails, using analysis results. Spammers and phishers are also categorized into serial spammers/phishers, recent spammers/phishers, prospective spammers/phishers, and suspects. Legitimate emails and trusted domains are classified into socially close (family members, friends), socially distinct (strangers etc), and opt-outs (resolved false positives and false negatives). Overall this classifier resulted in far less false positives when compared to current filters like SpamAssassin, achieving a 98.65% precision, which is well comparable to the precisions achieved by SPF, DNSRBL blacklists.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: December 2006
Creator: Palla, Srikanth

Multilingual Word Sense Disambiguation Using Wikipedia

Description: Ambiguity is inherent to human language. In particular, word sense ambiguity is prevalent in all natural languages, with a large number of the words in any given language carrying more than one meaning. Word sense disambiguation is the task of automatically assigning the most appropriate meaning to a polysemous word within a given context. Generally the problem of resolving ambiguity in literature has revolved around the famous quote “you shall know the meaning of the word by the company it keeps.” In this thesis, we investigate the role of context for resolving ambiguity through three different approaches. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses and sense-tagged data are derived automatically from Wikipedia. Using Wikipedia as a source of sense-annotations provides the much needed solution for knowledge acquisition bottleneck. In order to evaluate the viability of Wikipedia based sense-annotations, we cast the task of disambiguating polysemous nouns as a monolingual classification task and experimented on lexical samples from four different languages (viz. English, German, Italian and Spanish). The experiments confirm that the Wikipedia based sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. It is a long belief that exploiting multiple languages helps in building accurate word sense disambiguation systems. Subsequently, we developed two approaches that recast the task of disambiguating polysemous nouns as a multilingual classification task. The first approach for multilingual word sense disambiguation attempts to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text. First, the various senses of a target word may be translated into different words, which constitute unique, yet highly salient signal that effectively expand the target word’s feature space. Second, the translated context words themselves embed co-occurrence information ...
Date: August 2013
Creator: Dandala, Bharath

The Multipath Fault-Tolerant Protocol for Routing in Packet-Switched Communication Network

Description: In order to provide improved service quality to applications, networks need to address the need for reliability of data delivery. Reliability can be improved by incorporating fault tolerance into network routing, wherein a set of multiple routes are used for routing between a given source and destination. This thesis proposes a new fault-tolerant protocol, called the Multipath Fault Tolerant Protocol for Routing (MFTPR), to improve the reliability of network routing services. The protocol is based on a multipath discovery algorithm, the Quasi-Shortest Multipath (QSMP), and is designed to work in conjunction with the routing protocol employed by the network. MFTPR improves upon the QSMP algorithm by finding more routes than QSMP, and also provides for maintenance of these routes in the event of failure of network components. In order to evaluate the resilience of a pair of paths to failure, this thesis proposes metrics that evaluate the non-disjointness of a pair of paths and measure the probability of simultaneous failure of these paths. The performance of MFTPR to find alternate routes based on these metrics is analyzed through simulation.
Date: May 2003
Creator: Krishnan, Anupama

Multiresolutional/Fractal Compression of Still and Moving Pictures

Description: The scope of the present dissertation is a deep lossy compression of still and moving grayscale pictures while maintaining their fidelity, with a specific goal of creating a working prototype of a software system for use in low bandwidth transmission of still satellite imagery and weather briefings with the best preservation of features considered important by the end user.
Date: December 1993
Creator: Kiselyov, Oleg E.

Natural Language Interfaces to Databases

Description: Natural language interfaces to databases (NLIDB) are systems that aim to bridge the gap between the languages used by humans and computers, and automatically translate natural language sentences to database queries. This thesis proposes a novel approach to NLIDB, using graph-based models. The system starts by collecting as much information as possible from existing databases and sentences, and transforms this information into a knowledge base for the system. Given a new question, the system will use this knowledge to analyze and translate the sentence into its corresponding database query statement. The graph-based NLIDB system uses English as the natural language, a relational database model, and SQL as the formal query language. In experiments performed with natural language questions ran against a large database containing information about U.S. geography, the system showed good performance compared to the state-of-the-art in the field.
Date: December 2006
Creator: Chandra, Yohan

A Netcentric Scientific Research Repository

Description: The Internet and networks in general have become essential tools for disseminating in-formation. Search engines have become the predominant means of finding information on the Web and all other data repositories, including local resources. Domain scientists regularly acquire and analyze images generated by equipment such as microscopes and cameras, resulting in complex image files that need to be managed in a convenient manner. This type of integrated environment has been recently termed a netcentric sci-entific research repository. I developed a number of data manipulation tools that allow researchers to manage their information more effectively in a netcentric environment. The specific contributions are: (1) A unique interface for management of data including files and relational databases. A wrapper for relational databases was developed so that the data can be indexed and searched using traditional search engines. This approach allows data in databases to be searched with the same interface as other data. Fur-thermore, this approach makes it easier for scientists to work with their data if they are not familiar with SQL. (2) A Web services based architecture for integrating analysis op-erations into a repository. This technique allows the system to leverage the large num-ber of existing tools by wrapping them with a Web service and registering the service with the repository. Metadata associated with Web services was enhanced to allow this feature to be included. In addition, an improved binary to text encoding scheme was de-veloped to reduce the size overhead for sending large scientific data files via XML mes-sages used in Web services. (3) Integrated image analysis operations with SQL. This technique allows for images to be stored and managed conveniently in a relational da-tabase. SQL supplemented with map algebra operations is used to select and perform operations on sets of images.
Access: This item is restricted to the UNT Community Members at a UNT Libraries Location.
Date: December 2006
Creator: Harrington, Brian

Network Security Tool for a Novice

Description: Network security is a complex field that is handled by security professionals who need certain expertise and experience to configure security systems. With the ever increasing size of the networks, managing them is going to be a daunting task. What kind of solution can be used to generate effective security configurations by both security professionals and nonprofessionals alike? In this thesis, a web tool is developed to simplify the process of configuring security systems by translating direct human language input into meaningful, working security rules. These human language inputs yield the security rules that the individual wants to implement in their network. The human language input can be as simple as, "Block Facebook to my son's PC". This tool will translate these inputs into specific security rules and install the translated rules into security equipment such as virtualized Cisco FWSM network firewall, Netfilter host-based firewall, and Snort Network Intrusion Detection. This tool is implemented and tested in both a traditional network and a cloud environment. One thousand input policies were collected from various users such as staff from UNT departments' and health science, including individuals with network security background as well as students with a non-computer science background to analyze the tool's performance. The tool is tested for its accuracy (91%) in generating a security rule. It is also tested for accuracy of the translated rule (86%) compared to a standard rule written by security professionals. Nevertheless, the network security tool built has shown promise to both experienced and inexperienced people in network security field by simplifying the provisioning process to result in accurate and effective network security rules.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: August 2016
Creator: Ganduri, Rajasekhar

Optimal Access Point Selection and Channel Assignment in IEEE 802.11 Networks

Description: Designing 802.11 wireless networks includes two major components: selection of access points (APs) in the demand areas and assignment of radio frequencies to each AP. Coverage and capacity are some key issues when placing APs in a demand area. APs need to cover all users. A user is considered covered if the power received from its corresponding AP is greater than a given threshold. Moreover, from a capacity standpoint, APs need to provide certain minimum bandwidth to users located in the coverage area. A major challenge in designing wireless networks is the frequency assignment problem. The 802.11 wireless LANs operate in the unlicensed ISM frequency, and all APs share the same frequency. As a result, as 802.11 APs become widely deployed, they start to interfere with each other and degrade network throughput. In consequence, efficient assignment of channels becomes necessary to avoid and minimize interference. In this work, an optimal AP selection was developed by balancing traffic load. An optimization problem was formulated that minimizes heavy congestion. As a result, APs in wireless LANs will have well distributed traffic loads, which maximize the throughput of the network. The channel assignment algorithm was designed by minimizing channel interference between APs. The optimization algorithm assigns channels in such a way that minimizes co-channel and adjacent channel interference resulting in higher throughput.
Date: December 2004
Creator: Park, Sangtae

Optimizing Non-pharmaceutical Interventions Using Multi-coaffiliation Networks

Description: Computational modeling is of fundamental significance in mapping possible disease spread, and designing strategies for its mitigation. Conventional contact networks implement the simulation of interactions as random occurrences, presenting public health bodies with a difficult trade off between a realistic model granularity and robust design of intervention strategies. Recently, researchers have been investigating the use of agent-based models (ABMs) to embrace the complexity of real world interactions. At the same time, theoretical approaches provide epidemiologists with general optimization models in which demographics are intrinsically simplified. The emerging study of affiliation networks and co-affiliation networks provide an alternative to such trade off. Co-affiliation networks maintain the realism innate to ABMs while reducing the complexity of contact networks into distinctively smaller k-partite graphs, were each partition represent a dimension of the social model. This dissertation studies the optimization of intervention strategies for infectious diseases, mainly distributed in school systems. First, concepts of synthetic populations and affiliation networks are extended to propose a modified algorithm for the synthetic reconstruction of populations. Second, the definition of multi-coaffiliation networks is presented as the main social model in which risk is quantified and evaluated, thereby obtaining vulnerability indications for each school in the system. Finally, maximization of the mitigation coverage and minimization of the overall cost of intervention strategies are proposed and compared, based on centrality measures.
Date: May 2013
Creator: Loza, Olivia G.

A Parallel Programming Language

Description: The problem of programming a parallel processor is discussed. Previous methods of programming a parallel processor, analyzing a program for parallel paths, and special language features are discussed. Graph theory is used to define the three basic programming constructs: choice, sequence, repetition. The concept of mechanized programming is expanded to allow for total separation of control and computational sections of a program. A definition of a language is presented which provides for this separation. A method for developing the program graph is discussed. The control graph and data graph are developed separately. The two graphs illustrate control and data predecessor relationships used in determining parallel elements of a program.
Date: May 1979
Creator: Cox, Richard D.

Performance Analysis of Wireless Networks with QoS Adaptations

Description: The explosive demand for multimedia and fast transmission of continuous media on wireless networks means the simultaneous existence of traffic requiring different qualities of service (QoS). In this thesis, several efficient algorithms have been developed which offer several QoS to the end-user. We first look at a request TDMA/CDMA protocol for supporting wireless multimedia traffic, where CDMA is laid over TDMA. Then we look at a hybrid push-pull algorithm for wireless networks, and present a generalized performance analysis of the proposed protocol. Some of the QoS factors considered include customer retrial rates due to user impatience and system timeouts and different levels of priority and weights for mobile hosts. We have also looked at how customer impatience and system timeouts affect the QoS provided by several queuing and scheduling schemes such as FIFO, priority, weighted fair queuing, and the application of the stretch-optimal algorithm to scheduling.
Date: August 2003
Creator: Dash, Trivikram

Performance comparison of data distribution management strategies in large-scale distributed simulation.

Description: Data distribution management (DDM) is a High Level Architecture/Run-time Infrastructure (HLA/RTI) service that manages the distribution of state updates and interaction information in large-scale distributed simulations. The key to efficient DDM is to limit and control the volume of data exchanged during the simulation, to relay data to only those hosts requiring the data. This thesis focuses upon different DDM implementations and strategies. This thesis includes analysis of three DDM methods including the fixed grid-based, dynamic grid-based, and region-based methods. Also included is the use of multi-resolution modeling with various DDM strategies and analysis of the performance effects of aggregation/disaggregation with these strategies. Running numerous federation executions, I simulate four different scenarios on a cluster of workstations with a mini-RTI Kit framework and propose a set of benchmarks for a comparison of the DDM schemes. The goals of this work are to determine the most efficient model for applying each DDM scheme, discover the limitations of the scalability of the various DDM methods, evaluate the effects of aggregation/disaggregation on performance and resource usage, and present accepted benchmarks for use in future research.
Date: May 2004
Creator: Dzermajko, Caron

Performance Engineering of Software Web Services and Distributed Software Systems

Description: The promise of service oriented computing, and the availability of Web services promote the delivery and creation of new services based on existing services, in order to meet new demands and new markets. As Web and internet based services move into Clouds, inter-dependency of services and their complexity will increase substantially. There are standards and frameworks for specifying and composing Web Services based on functional properties. However, mechanisms to individually address non-functional properties of services and their compositions have not been well established. Furthermore, the Cloud ontology depicts service layers from a high-level, such as Application and Software, to a low-level, such as Infrastructure and Platform. Each component that resides in one layer can be useful to another layer as a service. It hints at the amount of complexity resulting from not only horizontal but also vertical integrations in building and deploying a composite service. To meet the requirements and facilitate using Web services, we first propose a WSDL extension to permit specification of non-functional or Quality of Service (QoS) properties. On top of the foundation, the QoS-aware framework is established to adapt publicly available tools for Web services, augmented by ontology management tools, along with tools for performance modeling to exemplify how the non-functional properties such as response time, throughput, or utilization of services can be addressed in the service acquisition and composition process. To facilitate Web service composition standards, in this work we extended the framework with additional qualitative information to the service descriptions using Business Process Execution Language (BPEL). Engineers can use BPEL to explore design options, and have the QoS properties analyzed for the composite service. The main issue in our research is performance evaluation in software system and engineering. We researched the Web service computation as the first half of this dissertation, and performance antipattern ...
Date: May 2014
Creator: Lin, Chia-en

Performance Evaluation of Data Integrity Mechanisms for Mobile Agents

Description: With the growing popularity of e-commerce applications that use software agents, the protection of mobile agent data has become imperative. To that end, the performance of four methods that protect the data integrity of mobile agents is evaluated. The methods investigated include existing approaches known as the Partial Result Authentication Codes, Hash Chaining, and Set Authentication Code methods, and a technique of our own design, called the Modified Set Authentication Code method, which addresses the limitations of the Set Authentication Code method. The experiments were run using the DADS agent system (developed at the Network Research Laboratory at UNT), for which a Data Integrity Module was designed. The experimental results show that our Modified Set Authentication Code technique performed comparably to the Set Authentication Code method.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: December 2003
Creator: Gunupudi, Vandana

Planning techniques for agent based 3D animations.

Description: The design of autonomous agents capable of performing a given goal in a 3D domain continues to be a challenge for computer animated story generation systems. We present a novel prototype which consists of a 3D engine and a planner for a simple virtual world. We incorporate the 2D planner into the 3D engine to provide 3D animations. Based on the plan, the 3D world is created and the objects are positioned. Then the plan is linearized into simpler actions for object animation and rendered via the 3D engine. We use JINNI3D as the engine and WARPLAN-C as the planner for the above-mentioned prototype. The user can interact with the system using a simple natural language interface. The interface consists of a shallow parser, which is capable of identifying a set of predefined basic commands. The command given by the user is considered as the goal for the planner. The resulting plan is created and rendered in 3D. The overall system is comparable to a character based interactive story generation system except that it is limited to the predefined 3D environment.
Date: December 2005
Creator: Kandaswamy, Balasubramanian

Power-benefit analysis of erasure encoding with redundant routing in sensor networks.

Description: One of the problems sensor networks face is adversaries corrupting nodes along the path to the base station. One way to reduce the effect of these attacks is multipath routing. This introduces some intrusion-tolerance in the network by way of redundancy but at the cost of a higher power consumption by the sensor nodes. Erasure coding can be applied to this scenario in which the base station can receive a subset of the total data sent and reconstruct the entire message packet at its end. This thesis uses two commonly used encodings and compares their performance with respect to power consumed for unencoded data in multipath routing. It is found that using encoding with multipath routing reduces the power consumption and at the same time enables the user to send reasonably large data sizes. The experiments in this thesis were performed on the Tiny OS platform with the simulations done in TOSSIM and the power measurements were taken in PowerTOSSIM. They were performed on the simple radio model and the lossy radio model provided by Tiny OS. The lossy radio model was simulated with distances of 10 feet, 15 feet and 20 feet between nodes. It was found that by using erasure encoding, double or triple the data size can be sent at the same power consumption rate as unencoded data. All the experiments were performed with the radio set at a normal transmit power, and later a high transmit power.
Date: December 2006
Creator: Vishwanathan, Roopa

Practical Cursive Script Recognition

Description: This research focused on the off-line cursive script recognition application. The problem is very large and difficult and there is much room for improvement in every aspect of the problem. Many different aspects of this problem were explored in pursuit of solutions to create a more practical and usable off-line cursive script recognizer than is currently available.
Date: August 1995
Creator: Carroll, Johnny Glen, 1953-

Practical Parallel Processing

Description: The physical limitations of uniprocessors and the real-time requirements of numerous practical applications have made parallel processing an essential technology in military, industry and scientific research. In this dissertation, we investigate parallelizations of three practical applications using three parallel machine models. The algorithms are: Finitely inductive (FI) sequence processing is a pattern recognition technique used in many fields. We first propose four parallel FI algorithms on the EREW PRAM. The time complexity of the parallel factoring and following by bucket packing is O(sk^2 n/p), and they are optimal under some conditions. The parallel factoring and following by hashing requires O(sk^2 n/p) time when uniform hash functions are used and log(p) ≤ k n/p and pm ≈ n. Their speedup is proportional to the number processors used. For these results, s is the number of levels, k is the size of the antecedents and n is the length of the input sequence and p is the number of processors. We also describe algorithms for raster/vector conversion based on the scan model to handle block-like connected components of arbitrary geometrical shapes with multi-level nested dough nuts for the IES (image exploitation system). Both the parallel raster-to-vector algorithm and parallel vector-to-raster algorithm require O(log(n2)) or O(log2(n2)) time (depending on the sorting algorithms used) for images of size n2 using p = n2 processors. Not only is the DWT (discrete wavelet transforms) useful in data compression, but also has it potentials in signal processing, image processing, and graphics. Therefore, it is of great importance to investigate efficient parallelizations of the wavelet transforms. The time complexity of the parallel forward DWT on the parallel virtual machine with linear processor organization is O(((so+s1)mn)/p), where s0 and s1 are the lengths of the filters and p is the number of processors used. The time complexity of the ...
Date: August 1996
Creator: Zhang, Hua, 1954-

Privacy Management for Online Social Networks

Description: One in seven people in the world use online social networking for a variety of purposes -- to keep in touch with friends and family, to share special occasions, to broadcast announcements, and more. The majority of society has been bought into this new era of communication technology, which allows everyone on the internet to share information with friends. Since social networking has rapidly become a main form of communication, holes in privacy have become apparent. It has come to the point that the whole concept of sharing information requires restructuring. No longer are online social networks simply technology available for a niche market; they are in use by all of society. Thus it is important to not forget that a sense of privacy is inherent as an evolutionary by-product of social intelligence. In any context of society, privacy needs to be a part of the system in order to help users protect themselves from others. This dissertation attempts to address the lack of privacy management in online social networks by designing models which understand the social science behind how we form social groups and share information with each other. Social relationship strength was modeled using activity patterns, vocabulary usage, and behavioral patterns. In addition, automatic configuration for default privacy settings was proposed to help prevent new users from leaking personal information. This dissertation aims to mobilize a new era of social networking that understands social aspects of human network, and uses that knowledge to honor users' privacy.
Date: August 2013
Creator: Baatarjav, Enkh-Amgalan

Privacy Preserving EEG-based Authentication Using Perceptual Hashing

Description: The use of electroencephalogram (EEG), an electrophysiological monitoring method for recording the brain activity, for authentication has attracted the interest of researchers for over a decade. In addition to exhibiting qualities of biometric-based authentication, they are revocable, impossible to mimic, and resistant to coercion attacks. However, EEG signals carry a wealth of information about an individual and can reveal private information about the user. This brings significant privacy issues to EEG-based authentication systems as they have access to raw EEG signals. This thesis proposes a privacy-preserving EEG-based authentication system that preserves the privacy of the user by not revealing the raw EEG signals while allowing the system to authenticate the user accurately. In that, perceptual hashing is utilized and instead of raw EEG signals, their perceptually hashed values are used in the authentication process. In addition to describing the authentication process, algorithms to compute the perceptual hash are developed based on two feature extraction techniques. Experimental results show that an authentication system using perceptual hashing can achieve performance comparable to a system that has access to raw EEG signals if enough EEG channels are used in the process. This thesis also presents a security analysis to show that perceptual hashing can prevent information leakage.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: December 2016
Creator: Koppikar, Samir Dilip

Procedural content creation and technologies for 3D graphics applications and games.

Description: The recent transformation of consumer graphics (CG) cards into powerful 3D rendering processors is due in large measure to the success of game developers in delivering mass market entertainment software that feature highly immersive and captivating virtual environments. Despite this success, 3D CG application development is becoming increasingly handicapped by the inability of traditional content creation methods to keep up with the demand for content. The term content is used here to refer to any data operated on by application code that is meant for viewing, including 3D models, textures, animation sequences and maps or other data-intensive descriptions of virtual environments. Traditionally, content has been handcrafted by humans. A serious problem facing the interactive graphics software development community is how to increase the rate at which content can be produced to keep up with the increasingly rapid pace at which software for interactive applications can now be developed. Research addressing this problem centers around procedural content creation systems. By moving away from purely human content creation toward systems in which humans play a substantially less time-intensive but no less creative part in the process, procedural content creation opens new doors. From a qualitative standpoint, these types of systems will not rely less on human intervention but rather more since they will depend heavily on direction from a human in order to synthesize the desired content. This research draws heavily from the entertainment software domain but the research is broadly relevant to 3D graphics applications in general.
Access: This item is restricted to UNT Community Members. Login required if off-campus.
Date: May 2005
Creator: Roden, Timothy E.

Qos Aware Service Oriented Architecture

Description: Service-oriented architecture enables web services to operate in a loosely-coupled setting and provides an environment for dynamic discovery and use of services over a network using standards such as WSDL, SOAP, and UDDI. Web service has both functional and non-functional characteristics. This thesis work proposes to add QoS descriptions (non-functional properties) to WSDL and compose various services to form a business process. This composition of web services also considers QoS properties along with functional properties and the composed services can again be published as a new Web Service and can be part of any other composition using Composed WSDL.
Date: August 2013
Creator: Adepu, Sagarika

Quantifying Design Principles in Reusable Software Components

Description: Software reuse can occur in various places during the software development cycle. Reuse of existing source code is the most commonly practiced form of software reuse. One of the key requirements for software reuse is readability, thus the interest in the use of data abstraction, inheritance, modularity, and aspects of the visible portion of module specifications. This research analyzed the contents of software reuse libraries to answer the basic question of what makes a good reusable software component. The approach taken was to measure and analyze various software metrics as mapped to design characteristics. A related research question investigated the change in the design principles over time. This was measured by comparing sets of Ada reuse libraries categorized into two time periods. It was discovered that recently developed Ada reuse components scored better on readability than earlier developed components. A benefit of this research has been the development of a set of "design for reuse" guidelines. These guidelines address coding practices as well as design principles for an Ada implementation. C++ software reuse libraries were also analyzed to determine if design principles can be applied in a language independent fashion. This research used cyclomatic complexity metrics, software science metrics, and traditional static code metrics to measure design features. This research provides at least three original contributions. First it collects empirical data about existing reuse libraries. Second, it develops a readability measure for software libraries which can aid in comparing libraries. And third, this research developed a set of coding and design guidelines for developers of reusable software. Future research can investigate how design principles for C++ change over time. Another topic for research is the investigation of systems employing reused components to determine which libraries are more successfully used than others.
Date: December 1995
Creator: Moore, Freeman Leroy