Keywords in the mist: Automated keyword extraction for very large documents and back of the book indexing.

Date: May 2008
Creator: Csomai, Andras
Description: This research addresses the problem of automatic keyphrase extraction from large documents and back of the book indexing. The potential benefits of automating this process are far reaching, from improving information retrieval in digital libraries to saving countless man-hours by helping professional indexers create back of the book indexes. The dissertation introduces a new methodology to evaluate automated systems, which allows for a detailed, comparative analysis of several techniques for keyphrase extraction. We introduce and evaluate both supervised and unsupervised techniques, designed to balance the resource requirements of an automated system against the best achievable performance. Additionally, a number of novel features are proposed, including a statistical informativeness measure based on chi statistics; an encyclopedic feature that taps into the vast knowledge base of Wikipedia to establish the likelihood of a phrase referring to an informative concept; and a linguistic feature based on sophisticated semantic analysis of the text using current theories of discourse comprehension. The resulting keyphrase extraction system is shown to outperform the current state of the art in supervised keyphrase extraction by a large margin. Moreover, a fully automated back of the book indexing system based on the keyphrase extraction system was shown to lead to back ...
Contributing Partner: UNT Libraries
Layout-accurate Ultra-fast System-level Design Exploration Through Verilog-AMS

Date: May 2013
Creator: Zheng, Geng
Description: This research addresses problems in designing analog and mixed-signal (AMS) systems by bridging the gap between system-level and circuit-level simulation, making simulations as fast as system-level ones and as accurate as circuit-level ones. The proposed tools include metamodel-integrated, Verilog-AMS-based design exploration flows. The research involves design centering, metamodel generation flows for creating efficient behavioral models, and Verilog-AMS integration techniques for model realization. The core of the proposed solution is transistor-level and layout-level metamodeling and their incorporation in Verilog-AMS. Metamodeling is used to construct efficient and layout-accurate surrogate models for AMS system building blocks. Verilog-AMS, an AMS hardware description language, is employed to build surrogate model implementations that can be simulated with industry-standard simulators. The case-study circuits and systems include an operational amplifier (OP-AMP), a voltage-controlled oscillator (VCO), a charge-pump phase-locked loop (PLL), and a continuous-time delta-sigma modulator (DSM). The minimum and maximum error rates of the proposed OP-AMP model are 0.11% and 2.86%, respectively. The error rates for the PLL lock time and power estimation are 0.7% and 3.0%, respectively. The OP-AMP optimization using the proposed approach is ~17000× faster than the approach based on the transistor-level model. The optimization achieves a ~4× power reduction for the OP-AMP ...
Contributing Partner: UNT Libraries
Maintaining Web Applications Integrity Running on Radium

Date: August 2015
Creator: Ur-Rehman, Wasi
Description: Computer security attacks take place due to the presence of vulnerabilities and bugs in software applications. Bugs and vulnerabilities are the result of weak software architecture and a lack of standard software development practices. Although software companies invest millions of dollars in the research and development of software designs, security risks remain at large. In some cases, software applications carry vulnerabilities for many years before they are identified. A recent example is the well-known Heartbleed bug in the OpenSSL SSL/TLS library. In today’s world, where new software applications are continuously being developed for a varied community of users, it is highly unlikely that software applications run without flaws. Attackers exploit these vulnerabilities and bugs and threaten privacy without leaving any trace. The most critical vulnerabilities are those related to the integrity of software applications, because integrity is directly linked to the credibility of a software application and the data it contains. Here I present a solution for maintaining the integrity of web applications running on RADIUM by using Daikon. Daikon generates invariants, which are used to maintain the integrity of the web application and also check the ...
Contributing Partner: UNT Libraries
Measuring Semantic Relatedness Using Salient Encyclopedic Concepts

Date: August 2011
Creator: Hassan, Samer
Description: While pragmatics, through its integration of situational awareness and real-world knowledge, offers a high level of analysis that is suitable for real interpretation of natural dialogue, semantics, on the other hand, represents a lower yet more tractable and affordable linguistic level of analysis using current technologies. Generally, the understanding of semantic meaning in the literature has revolved around the famous quote "You shall know a word by the company it keeps". In this thesis we investigate the role of context constituents in decoding the semantic meaning of the surrounding context; specifically, we probe the role of salient concepts, defined as content-bearing expressions which afford encyclopedic definitions, as a suitable source of semantic clues to an unambiguous interpretation of context. Furthermore, we integrate this world knowledge in building a new and robust unsupervised semantic model and apply it to assess semantic relatedness between textual pairs, whether they are words, sentences or paragraphs. Moreover, we explore the abstraction of semantics across languages and use our findings to build a novel multi-lingual semantic relatedness model exploiting information acquired from various languages. We demonstrate the effectiveness and the superiority of our mono-lingual and multi-lingual models through a comprehensive set of evaluations on specialized ...
Contributing Partner: UNT Libraries
Measuring Vital Signs Using Smart Phones

Date: December 2010
Creator: Chandrasekaran, Vikram
Description: Smart phones have become increasingly popular with the general public for their diverse capabilities, such as navigation, social networking, and multimedia facilities, to name a few. These phones are equipped with high-end processors, high-resolution cameras, and built-in sensors such as an accelerometer, an orientation sensor, a light sensor, and much more. According to a comScore survey, 25.3% of US adults use smart phones in their daily lives. Motivated by the capability of smart phones and their extensive usage, I focused on utilizing them for bio-medical applications. In this thesis, I present a new smart phone application that quantifies vital signs such as heart rate, respiratory rate and blood pressure with the help of the phone's built-in sensors. Using the camera and a microphone, I show how blood pressure and heart rate can be determined for a subject. People sometimes encounter minor situations such as fainting, or fatal accidents such as a car crash, at unexpected times and places. It would be useful to have a device which can measure all vital signs in such an event. The second part of this thesis demonstrates a new mode of communication for next generation 9-1-1 calls. In this new architecture, the call-taker will be able to control the ...
Contributing Partner: UNT Libraries
Mediation on XQuery Views

Date: December 2006
Creator: Peng, Xiaobo
Description: The major goal of information integration is to provide efficient and easy-to-use access to multiple heterogeneous data sources with a single query. At the same time, one of the current trends is to use standard technologies for implementing solutions to complex software problems. In this dissertation, I used XML and XQuery as the standard technologies and developed an extended projection algorithm to provide a solution to the information integration problem. In order to demonstrate my solution, I implemented a prototype mediation system called Omphalos based on XML-related technologies. The dissertation describes the architecture of the system, its metadata, and the process it uses to answer queries. The system uses XQuery expressions (termed metaqueries) to capture complex mappings between global schemas and data source schemas. The system then applies these metaqueries in order to rewrite a user query on a virtual global database (representing the integrated view of the heterogeneous data sources) to a query (termed an outsourced query) on the real data sources. An extended XML document projection algorithm was developed to increase the efficiency of selecting the relevant subset of data from an individual data source to answer the user query. The system applies the projection algorithm ...
Contributing Partner: UNT Libraries
Metamodeling-based Fast Optimization of Nanoscale AMS-SoCs

Date: May 2012
Creator: Garitselov, Oleg
Description: Modern consumer electronic systems are mostly based on analog and digital circuits and are designed as analog/mixed-signal systems on chip (AMS-SoCs). The integration of analog and digital circuits on the same die makes the system cost effective. In AMS-SoCs, analog and mixed-signal portions have not traditionally received much attention due to their complexity. As fabrication technology advances, simulations of AMS-SoC circuits become more complex and take significant amounts of time. The limited time allocated for circuit design and optimization creates a need to reduce the simulation time. The time constraints placed on designers are imposed by the ever-shortening time to market and the non-recurring cost of the chip. This dissertation proposes the use of a novel method, called metamodeling, and intelligent optimization algorithms to reduce the design time. Metamodel-based ultra-fast design flows are proposed and investigated. Metamodel creation is a one-time process and relies on fast sampling through accurate parasitic-aware simulations. One of the targets of this dissertation is to minimize the sample size while retaining the accuracy of the model. In order to achieve this goal, different statistical sampling techniques are explored and applied to various AMS-SoC circuits. Also, different metamodel functions are explored for their ...
Contributing Partner: UNT Libraries
A Minimally Supervised Word Sense Disambiguation Algorithm Using Syntactic Dependencies and Semantic Generalizations

Date: December 2005
Creator: Faruque, Md. Ehsanul
Description: Natural language is inherently ambiguous. For example, the word "bank" can mean a financial institution or a river shore. Finding the correct meaning of a word in a particular context is a task known as word sense disambiguation (WSD), which is essential for many natural language processing applications such as machine translation, information retrieval, and others. While most current WSD methods try to disambiguate a small number of words for which enough annotated examples are available, the method proposed in this thesis attempts to address all words in unrestricted text. The method is based on constraints imposed by syntactic dependencies and concept generalizations drawn from an external dictionary. The method was tested on standard benchmarks as used during the SENSEVAL-2 and SENSEVAL-3 WSD international evaluation exercises, and was found to be competitive.
Contributing Partner: UNT Libraries
Mobile agent security through multi-agent cryptographic protocols.

Date: May 2004
Creator: Xu, Ke
Description: An increasingly promising and widespread topic of research in distributed computing is the mobile agent paradigm: code travelling and performing computations on remote hosts in an autonomous manner. One of the biggest challenges faced by this new paradigm is security. The issue of protecting sensitive code and data carried by a mobile agent against tampering from a malicious host is particularly hard but important. Based on secure multi-party computation, a recent research direction shows the feasibility of a software-only solution to this problem, which had been deemed impossible by some researchers previously. The best result prior to this dissertation is a single-agent protocol which requires the participation of a trusted third party. Our research employs multi-agent protocols to eliminate the trusted third party, resulting in a protocol with minimum trust assumptions. This dissertation presents one of the first formal definitions of secure mobile agent computation, in which the privacy and integrity of the agent code and data as well as the data provided by the host are all protected. We present secure protocols for mobile agent computation against static, semi-honest or malicious adversaries without relying on any third party or trusting any specific participant in the system. The security of ...
Contributing Partner: UNT Libraries
Modeling Alcohol Consumption Using Blog Data

Date: May 2013
Creator: Koh, Kok Chuan
Description: How do the content and writing style of people who drink alcoholic beverages differ from those of non-drinkers? How much information can we learn about a person's alcohol consumption behavior by reading text that they have authored? This thesis attempts to extend the methods deployed in authorship attribution and authorship profiling research into the domain of automatically identifying the human action of drinking alcoholic beverages. I examine how a psycholinguistic dictionary (the Linguistic Inquiry and Word Count lexicon, developed by James Pennebaker), together with Kenneth Burke's concept of words as symbols of human action and James Wertsch's concept of mediated action, provides a framework for analyzing meaningful data patterns in the content of blogs written by consumers of alcoholic beverages. The contributions of this thesis to the research field are twofold. First, I show that it is possible to automatically identify blog posts that have content related to the consumption of alcoholic beverages. Second, I provide a framework and tools to model human behavior through text analysis of blog data.
Contributing Partner: UNT Libraries