You limited your search to:

  Partner: UNT Libraries
 Department: Department of Computer Science and Engineering
 Decade: 2010-2019
 Collection: UNT Theses and Dissertations
3D Reconstruction Using Lidar and Visual Images
In this research, multi-perspective image registration using LiDAR and visual images was considered. 2D-3D image registration is a difficult task because it requires the extraction of different semantic features from each modality. This problem is solved in three parts. The first step involves detection and extraction of common features from each of the data sets. The second step consists of associating the common features between two different modalities. Traditional methods use lines or orthogonal corners as common features. The third step consists of building the projection matrix. Many existing methods use global positing system (GPS) or inertial navigation system (INS) for an initial estimate of the camera pose. However, the approach discussed herein does not use GPS, INS, or any such devices for initial estimate; hence the model can be used in places like the lunar surface or Mars where GPS or INS are not available. A variation of the method is also described, which does not require strong features from both images but rather uses intensity gradients in the image. This can be useful when one image does not have strong features (such as lines) or there are too many extraneous features. digital.library.unt.edu/ark:/67531/metadc177193/
Anchor Nodes Placement for Effective Passive Localization
Access: Use of this item is restricted to the UNT Community.
Wireless sensor networks are composed of sensor nodes, which can monitor an environment and observe events of interest. These networks are applied in various fields including but not limited to environmental, industrial and habitat monitoring. In many applications, the exact location of the sensor nodes is unknown after deployment. Localization is a process used to find sensor node's positional coordinates, which is vital information. The localization is generally assisted by anchor nodes that are also sensor nodes but with known locations. Anchor nodes generally are expensive and need to be optimally placed for effective localization. Passive localization is one of the localization techniques where the sensor nodes silently listen to the global events like thunder sounds, seismic waves, lighting, etc. According to previous studies, the ideal location to place anchor nodes was on the perimeter of the sensor network. This may not be the case in passive localization, since the function of anchor nodes here is different than the anchor nodes used in other localization systems. I do extensive studies on positioning anchor nodes for effective localization. Several simulations are run in dense and sparse networks for proper positioning of anchor nodes. I show that, for effective passive localization, the optimal placement of the anchor nodes is at the center of the network in such a way that no three anchor nodes share linearity. The more the non-linearity, the better the localization. The localization for our network design proves better when I place anchor nodes at right angles. digital.library.unt.edu/ark:/67531/metadc33132/
Automated Classification of Emotions Using Song Lyrics
This thesis explores the classification of emotions in song lyrics, using automatic approaches applied to a novel corpus of 100 popular songs. I use crowd sourcing via Amazon Mechanical Turk to collect line-level emotions annotations for this collection of song lyrics.  I then build classifiers that rely on textual features to automatically identify the presence of one or more of the following six Ekman emotions: anger, disgust, fear, joy, sadness and surprise. I compare different classification systems and evaluate the performance of the automatic systems against the manual annotations. I also introduce a system that uses data collected from the social network Twitter. I use the Twitter API to collect a large corpus of tweets manually labeled by their authors for one of the six emotions of interest. I then compare the classification of emotions obtained when training on data automatically collected from Twitter versus data obtained through crowd sourced annotations. digital.library.unt.edu/ark:/67531/metadc177253/
Automated Real-time Objects Detection in Colonoscopy Videos for Quality Measurements
The effectiveness of colonoscopy depends on the quality of the inspection of the colon. There was no automated measurement method to evaluate the quality of the inspection. This thesis addresses this issue by investigating an automated post-procedure quality measurement technique and proposing a novel approach automatically deciding a percentage of stool areas in images of digitized colonoscopy video files. It involves the classification of image pixels based on their color features using a new method of planes on RGB (red, green and blue) color space. The limitation of post-procedure quality measurement is that quality measurements are available long after the procedure was done and the patient was released. A better approach is to inform any sub-optimal inspection immediately so that the endoscopist can improve the quality in real-time during the procedure. This thesis also proposes an extension to post-procedure method to detect stool, bite-block, and blood regions in real-time using color features in HSV color space. These three objects play a major role in quality measurements in colonoscopy. The proposed method partitions very large positive examples of each of these objects into a number of groups. These groups are formed by taking intersection of positive examples with a hyper plane. This hyper plane is named as 'positive plane'. 'Convex hulls' are used to model positive planes. Comparisons with traditional classifiers such as K-nearest neighbor (K-NN) and support vector machines (SVM) proves the soundness of the proposed method in terms of accuracy and speed that are critical in the targeted real-time quality measurement system. digital.library.unt.edu/ark:/67531/metadc283843/
Automatic Tagging of Communication Data
Globally distributed software teams are widespread throughout industry. But finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data. Thus having a web-based tool that can automatically analyze the communication patterns among global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying different types of group activities that occur within a global software development project. In order to achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project. digital.library.unt.edu/ark:/67531/metadc149611/
Autonomic Failure Identification and Diagnosis for Building Dependable Cloud Computing Systems
The increasingly popular cloud-computing paradigm provides on-demand access to computing and storage with the appearance of unlimited resources. Users are given access to a variety of data and software utilities to manage their work. Users rent virtual resources and pay for only what they use. In spite of the many benefits that cloud computing promises, the lack of dependability in shared virtualized infrastructures is a major obstacle for its wider adoption, especially for mission-critical applications. Virtualization and multi-tenancy increase system complexity and dynamicity. They introduce new sources of failure degrading the dependability of cloud computing systems. To assure cloud dependability, in my dissertation research, I develop autonomic failure identification and diagnosis techniques that are crucial for understanding emergent, cloud-wide phenomena and self-managing resource burdens for cloud availability and productivity enhancement. We study the runtime cloud performance data collected from a cloud test-bed and by using traces from production cloud systems. We define cloud signatures including those metrics that are most relevant to failure instances. We exploit profiled cloud performance data in both time and frequency domain to identify anomalous cloud behaviors and leverage cloud metric subspace analysis to automate the diagnosis of observed failures. We implement a prototype of the anomaly identification system and conduct the experiments in an on-campus cloud computing test-bed and by using the Google datacenter traces. Our experimental results show that our proposed anomaly detection mechanism can achieve 93% detection sensitivity while keeping the false positive rate as low as 6.1% and outperform other tested anomaly detection schemes. In addition, the anomaly detector adapts itself by recursively learning from these newly verified detection results to refine future detection. digital.library.unt.edu/ark:/67531/metadc499993/
Boosting for Learning From Imbalanced, Multiclass Data Sets
Access: Use of this item is restricted to the UNT Community.
In many real-world applications, it is common to have uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied with the improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers are created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency. The reduction in computational cost is significant reaching above 50%. As the volume of training data increase, the gain of efficiency with the proposed method becomes more significant. digital.library.unt.edu/ark:/67531/metadc407775/
Cuff-less Blood Pressure Measurement Using a Smart Phone
Access: Use of this item is restricted to the UNT Community.
Blood pressure is vital sign information that physicians often need as preliminary data for immediate intervention during emergency situations or for regular monitoring of people with cardiovascular diseases. Despite the availability of portable blood pressure meters in the market, they are not regularly carried by people, creating a need for an ultra-portable measurement platform or device that can be easily carried and used at all times. One such device is the smartphone which, according to comScore survey is used by 26.2% of the US adult population. the mass production of these phones with built-in sensors and high computation power has created numerous possibilities for application development in different domains including biomedical. Motivated by this capability and their extensive usage, this thesis focuses on developing a blood pressure measurement platform on smartphones. Specifically, I developed a blood pressure measurement system on a smart phone using the built-in camera and a customized external microphone. the system consists of first obtaining heart beats using the microphone and finger pulse with the camera, and finally calculating the blood pressure using the recorded data. I developed techniques for finding the best location for obtaining the data, making the system usable by all categories of people. the proposed system resulted in accuracies between 90-100%, when compared to traditional blood pressure meters. the second part of this thesis presents a new system for remote heart beat monitoring using the smart phone. with the proposed system, heart beats can be transferred live by patients and monitored by physicians remotely for diagnosis. the proposed blood pressure measurement and remote monitoring systems will be able to facilitate information acquisition and decision making by the 9-1-1 operators. digital.library.unt.edu/ark:/67531/metadc115102/
Ddos Defense Against Botnets in the Mobile Cloud
Mobile phone advancements and ubiquitous internet connectivity are resulting in ever expanding possibilities in the application of smart phones. Users of mobile phones are now capable of hosting server applications from their personal devices. Whether providing services individually or in an ad hoc network setting the devices are currently not configured for defending against distributed denial of service (DDoS) attacks. These attacks, often launched from a botnet, have existed in the space of personal computing for decades but recently have begun showing up on mobile devices. Research is done first into the required steps to develop a potential botnet on the Android platform. This includes testing for the amount of malicious traffic an Android phone would be capable of generating for a DDoS attack. On the other end of the spectrum is the need of mobile devices running networked applications to develop security against DDoS attacks. For this mobile, phones are setup, with web servers running Apache to simulate users running internet connected applications for either local ad hoc networks or serving to the internet. Testing is done for the viability of using commonly available modules developed for Apache and intended for servers as well as finding baseline capabilities of mobiles to handle higher traffic volumes. Given the unique challenge of the limited resources a mobile phone can dedicate to Apache when compared to a dedicated hosting server a new method was needed. A proposed defense algorithm is developed for mitigating DDoS attacks against the mobile server that takes into account the limited resources available on the mobile device. The algorithm is tested against TCP socket flooding for effectiveness and shown to perform better than the common Apache module installations on a mobile device. digital.library.unt.edu/ark:/67531/metadc500027/
Design and Analysis of Novel Verifiable Voting Schemes
Free and fair elections are the basis for democracy, but conducting elections is not an easy task. Different groups of people are trying to influence the outcome of the election in their favor using the range of methods, from campaigning for a particular candidate to well-financed lobbying. Often the stakes are too high, and the methods are illegal. Two main properties of any voting scheme are the privacy of a voter’s choice and the integrity of the tally. Unfortunately, they are mutually exclusive. Integrity requires making elections transparent and auditable, but at the same time, we must preserve a voter’s privacy. It is always a trade-off between these two requirements. Current voting schemes favor privacy over auditability, and thus, they are vulnerable to voting fraud. I propose two novel voting systems that can achieve both privacy and verifiability. The first protocol is based on cryptographical primitives to ensure the integrity of the final tally and privacy of the voter. The second protocol is a simple paper-based voting scheme that achieves almost the same level of security without usage of cryptography. digital.library.unt.edu/ark:/67531/metadc407785/
Design and Implementation of Large-Scale Wireless Sensor Networks for Environmental Monitoring Applications
Environmental monitoring represents a major application domain for wireless sensor networks (WSN). However, despite significant advances in recent years, there are still many challenging issues to be addressed to exploit the full potential of the emerging WSN technology. In this dissertation, we introduce the design and implementation of low-power wireless sensor networks for long-term, autonomous, and near-real-time environmental monitoring applications. We have developed an out-of-box solution consisting of a suite of software, protocols and algorithms to provide reliable data collection with extremely low power consumption. Two wireless sensor networks based on the proposed solution have been deployed in remote field stations to monitor soil moisture along with other environmental parameters. As parts of the ever-growing environmental monitoring cyberinfrastructure, these networks have been integrated into the Texas Environmental Observatory system for long-term operation. Environmental measurement and network performance results are presented to demonstrate the capability, reliability and energy-efficiency of the network. digital.library.unt.edu/ark:/67531/metadc28493/
The Design Of A Benchmark For Geo-stream Management Systems
The recent growth in sensor technology allows easier information gathering in real-time as sensors have grown smaller, more accurate, and less expensive. The resulting data is often in a geo-stream format continuously changing input with a spatial extent. Researchers developing geo-streaming management systems (GSMS) require a benchmark system for evaluation, which is currently lacking. This thesis presents GSMark, a benchmark for evaluating GSMSs. GSMark provides a data generator that creates a combination of synthetic and real geo-streaming data, a workload simulator to present the data to the GSMS as a data stream, and a set of benchmark queries that evaluate typical GSMS functionality and query performance. In particular, GSMark generates both moving points and evolving spatial regions, two fundamental data types for a broad range of geo-stream applications, and the geo-streaming queries on this data. digital.library.unt.edu/ark:/67531/metadc103392/
Detection of Temporal Events and Abnormal Images for Quality Analysis in Endoscopy Videos
Recent reports suggest that measuring the objective quality is very essential towards the success of colonoscopy. Several quality indicators (i.e. metrics) proposed in recent studies are implemented in software systems that compute real-time quality scores for routine screening colonoscopy. Most quality metrics are derived based on various temporal events occurred during the colonoscopy procedure. The location of the phase boundary between the insertion and the withdrawal phases and the amount of circumferential inspection are two such important temporal events. These two temporal events can be determined by analyzing various camera motions of the colonoscope. This dissertation put forward a novel method to estimate X, Y and Z directional motions of the colonoscope using motion vector templates. Since abnormalities of a WCE or a colonoscopy video can be found in a small number of frames (around 5% out of total frames), it is very helpful if a computer system can decide whether a frame has any mucosal abnormalities. Also, the number of detected abnormal lesions during a procedure is used as a quality indicator. Majority of the existing abnormal detection methods focus on detecting only one type of abnormality or the overall accuracies are somewhat low if the method tries to detect multiple abnormalities. Most abnormalities in endoscopy images have unique textures which are clearly distinguishable from normal textures. In this dissertation a new method is proposed that achieves the objective of detecting multiple abnormalities with a higher accuracy using a multi-texture analysis technique. The multi-texture analysis method is designed by representing WCE and colonoscopy image textures as textons. digital.library.unt.edu/ark:/67531/metadc283849/
A Driver, Vehicle and Road Safety System Using Smartphones
Access: Use of this item is restricted to the UNT Community.
As vehicle manufacturers continue to increase their emphasis on safety with advanced driver assistance systems (ADAS), I propose a ubiquitous device that is able to analyze and advise on safety conditions. Mobile smartphones are increasing in popularity among younger generations with an estimated 64% of 25-34 year olds already using one in their daily lives. with over 10 million car accidents reported in the United States each year, car manufacturers have shifted their focus of a passive approach (airbags) to more active by adding features associated with ADAS (lane departure warnings). However, vehicles manufactured with these sensors are not economically priced while older vehicles might only have passive safety features. Given its accessibility and portability, I target a mobile smartphone as a device to compliment ADAS that can bring a driver assist to any vehicle without regards for any on-vehicle communication system requirements. I use the 3-axis accelerometer of multiple Android based smartphone to record and analyze various safety factors which can influence a driver while operating a vehicle. These influences with respect to the driver, vehicle and road are lane change maneuvers, vehicular comfort and road conditions. Each factor could potentially be hazardous to the health of the driver, neighboring public, and automobile and is therefore analyzed thoroughly achieving 85.60% and 89.89% classification accuracy for identifying road anomalies and lane changes, respectively. Effective use of this data can educate a potentially dangerous driver on how to operate a vehicle safely and efficiently. with real time analysis and auditory alerts of these factors, I hope to increase a driver's overall awareness to maximize safety. digital.library.unt.edu/ark:/67531/metadc115086/
Effective and Accelerated Informative Frame Filtering in Colonoscopy Videos Using Graphic Processing Units
Colonoscopy is an endoscopic technique that allows a physician to inspect the mucosa of the human colon. Previous methods and software solutions to detect informative frames in a colonoscopy video (a process called informative frame filtering or IFF) have been hugely ineffective in (1) covering the proper definition of an informative frame in the broadest sense and (2) striking an optimal balance between accuracy and speed of classification in both real-time and non real-time medical procedures. In my thesis, I propose a more effective method and faster software solutions for IFF which is more effective due to the introduction of a heuristic algorithm (derived from experimental analysis of typical colon features) for classification. It contributed to a 5-10% boost in various performance metrics for IFF. The software modules are faster due to the incorporation of sophisticated parallel-processing oriented coding techniques on modern microprocessors. Two IFF modules were created, one for post-procedure and the other for real-time. Code optimizations through NVIDIA CUDA for GPU processing and/or CPU multi-threading concepts embedded in two significant microprocessor design philosophies (multi-core design and many-core design) resulted a 5-fold acceleration for the post-procedure module and a 40-fold acceleration for the real-time module. Some innovative software modules, which are still in testing phase, have been recently created to exploit the power of multiple GPUs together. digital.library.unt.edu/ark:/67531/metadc31536/
Elicitation of Protein-Protein Interactions from Biomedical Literature Using Association Rule Discovery
Extracting information from a stack of data is a tedious task and the scenario is no different in proteomics. Volumes of research papers are published about study of various proteins in several species, their interactions with other proteins and identification of protein(s) as possible biomarker in causing diseases. It is a challenging task for biologists to keep track of these developments manually by reading through the literatures. Several tools have been developed by computer linguists to assist identification, extraction and hypotheses generation of proteins and protein-protein interactions from biomedical publications and protein databases. However, they are confronted with the challenges of term variation, term ambiguity, access only to abstracts and inconsistencies in time-consuming manual curation of protein and protein-protein interaction repositories. This work attempts to attenuate the challenges by extracting protein-protein interactions in humans and elicit possible interactions using associative rule mining on full text, abstracts and captions from figures available from publicly available biomedical literature databases. Two such databases are used in our study: Directory of Open Access Journals (DOAJ) and PubMed Central (PMC). A corpus is built using articles based on search terms. A dataset of more than 38,000 protein-protein interactions from the Human Protein Reference Database (HPRD) is cross-referenced to validate discovered interactive pairs. A set of an optimal size of possible binary protein-protein interactions is generated to be made available for clinician or biological validation. A significant change in the number of new associations was found by altering the thresholds for support and confidence metrics. This study narrows down the limitations for biologists in keeping pace with discovery of protein-protein interactions via manually reading the literature and their needs to validate each and every possible interaction. digital.library.unt.edu/ark:/67531/metadc30508/
Evaluating Appropriateness of Emg and Flex Sensors for Classifying Hand Gestures
Hand and arm gestures are a great way of communication when you don't want to be heard, quieter and often more reliable than whispering into a radio mike. In recent years hand gesture identification became a major active area of research due its use in various applications. The objective of my work is to develop an integrated sensor system, which will enable tactical squads and SWAT teams to communicate when there is absence of a Line of Sight or in the presence of any obstacles. The gesture set involved in this work is the standardized hand signals for close range engagement operations used by military and SWAT teams. The gesture sets involved in this work are broadly divided into finger movements and arm movements. The core components of the integrated sensor system are: Surface EMG sensors, Flex sensors and accelerometers. Surface EMG is the electrical activity produced by muscle contractions and measured by sensors directly attached to the skin. Bend Sensors use a piezo resistive material to detect the bend. The sensor output is determined by both the angle between the ends of the sensor as well as the flex radius. Accelerometers sense the dynamic acceleration and inclination in 3 directions simultaneously. EMG sensors are placed on the upper and lower forearm and assist in the classification of the finger and wrist movements. Bend sensors are mounted on a glove that is worn on the hand. The sensors are located over the first knuckle of each figure and can determine if the finger is bent or not. An accelerometer is attached to the glove at the base of the wrist and determines the speed and direction of the arm movement. Classification algorithm SVM is used to classify the gestures. digital.library.unt.edu/ark:/67531/metadc271769/
Exploring Memristor Based Analog Design in Simscape
With conventional CMOS technologies approaching their scaling limits, researchers are actively investigating alternative technologies for ever increasing computing and mobile demand. A number of different technologies are currently being studied by different research groups. In the last decade, one-dimensional (1D) carbon nanotubes (CNT), graphene, which is a two-dimensional (2D) natural occurring carbon rolled in tubular form, and zero-dimensional (0D) fullerenes have been the subject of intensive research. In 2008, HP Labs announced a ground-breaking fabrication of memristors, the fourth fundamental element postulated by Chua at the University of California, Berkeley in 1971. In the last few years, the memristor has gained a lot of attention from the research community. In-depth studies of the memristor and its analog behavior have convinced the community that it has the potential in future nano-architectures for optimization of high-density memory and neuromorphic computing architectures. The objective of this thesis is to explore memristors for analog and mixed-signal system design using Simscape. This thesis presents a memristor model in the Simscape language. Simscape has been used as it has the potential for modeling large systems. A memristor based programmable oscillator is also presented with simulation results and characterization. In addition, simulation results of different memristor models are presented which are crucial for the detailed understanding of the memristor along with its properties. digital.library.unt.edu/ark:/67531/metadc271817/
Exploring Privacy in Location-based Services Using Cryptographic Protocols
Location-based services (LBS) are available on a variety of mobile platforms like cell phones, PDA's, etc. and an increasing number of users subscribe to and use these services. Two of the popular models of information flow in LBS are the client-server model and the peer-to-peer model, in both of which, existing approaches do not always provide privacy for all parties concerned. In this work, I study the feasibility of applying cryptographic protocols to design privacy-preserving solutions for LBS from an experimental and theoretical standpoint. In the client-server model, I construct a two-phase framework for processing nearest neighbor queries using combinations of cryptographic protocols such as oblivious transfer and private information retrieval. In the peer-to-peer model, I present privacy preserving solutions for processing group nearest neighbor queries in the semi-honest and dishonest adversarial models. I apply concepts from secure multi-party computation to realize our constructions and also leverage the capabilities of trusted computing technology, specifically TPM chips. My solution for the dishonest adversarial model is also of independent cryptographic interest. I prove my constructions secure under standard cryptographic assumptions and design experiments for testing the feasibility or practicability of our constructions and benchmark key operations. My experiments show that the proposed constructions are practical to implement and have reasonable costs, while providing strong privacy assurances. digital.library.unt.edu/ark:/67531/metadc68060/
Exploring Process-Variation Tolerant Design of Nanoscale Sense Amplifier Circuits
Sense amplifiers are important circuit components of a dynamic random access memory (DRAM), which forms the main memory of digital computers. The ability of the sense amplifier to detect and amplify voltage signals to correctly interpret data in DRAM cells cannot be understated. The sense amplifier plays a significant role in the overall speed of the DRAM. Sense amplifiers require matched transistors for optimal performance. Hence, the effects of mismatch through process variations must be minimized. This thesis presents a research which leads to optimal nanoscale CMOS sense amplifiers by incorporating the effects of process variation early in the design process. The effects of process variation on the performance of a standard voltage sense amplifier, which is used in conventional DRAMs, is studied. Parametric analysis is performed through circuit simulations to investigate which parameters have the most impact on the performance of the sense amplifier. The figures-of-merit (FoMs) used to characterize the circuit are the precharge time, power dissipation, sense delay and sense margin. Statistical analysis is also performed to study the impact of process variations on each FoM. By analyzing the results from the statistical study, a method is presented to select parameter values that minimize the effects of process variation. A design flow algorithm incorporating dual oxide and dual threshold voltage based techniques is used to optimize the FoMs for the sense amplifier. Experimental results prove that the proposed approach improves precharge time by 83.9%, sense delay by 80.2% sense margin by 61.9%, and power dissipation by 13.1%. digital.library.unt.edu/ark:/67531/metadc67942/
Extrapolating Subjectivity Research to Other Languages
Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human like understanding to machines. As much of our language represents a form of self expression, capturing thoughts, beliefs, evaluations, opinions, and emotions which are not available for scrutiny by an outside observer, in the field of natural language, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into either positive, negative or neutral. In this thesis, I investigate techniques of generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers, parsers, or basic resources such as electronic text, annotated corpora or lexica. This severely limits the implementation of techniques on par with those developed for English, and by applying methods that are lighter in the usage of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the techniques proposed employ a lever approach, where English often acts as the donor language (the fulcrum in a lever) and allows through a relatively minimal amount of effort to establish preliminary subjectivity research in a target language. digital.library.unt.edu/ark:/67531/metadc271777/
Finding Meaning in Context Using Graph Algorithms in Mono- and Cross-lingual Settings
Making computers automatically find the appropriate meaning of words in context is an interesting problem that has proven to be one of the most challenging tasks in natural language processing (NLP). Widespread potential applications of a possible solution to the problem could be envisaged in several NLP tasks such as text simplification, language learning, machine translation, query expansion, information retrieval and text summarization. Ambiguity of words has always been a challenge in these applications, and the traditional endeavor to solve the problem of this ambiguity, namely doing word sense disambiguation using resources like WordNet, has been fraught with debate about the feasibility of the granularity that exists in WordNet senses. The recent trend has therefore been to move away from enforcing any given lexical resource upon automated systems from which to pick potential candidate senses,and to instead encourage them to pick and choose their own resources. Given a sentence with a target ambiguous word, an alternative solution consists of picking potential candidate substitutes for the target, filtering the list of the candidates to a much shorter list using various heuristics, and trying to match these system predictions against a human generated gold standard, with a view to ensuring that the meaning of the sentence does not change after the substitutions. This solution has manifested itself in the SemEval 2007 task of lexical substitution and the more recent SemEval 2010 task of cross-lingual lexical substitution (which I helped organize), where given an English context and a target word within that context, the systems are required to provide between one and ten appropriate substitutes (in English) or translations (in Spanish) for the target word. In this dissertation, I present a comprehensive overview of state-of-the-art research and describe new experiments to tackle the tasks of lexical substitution and cross-lingual lexical substitution. In particular I attempt to answer some research questions pertinent to the tasks, mostly focusing on completely unsupervised approaches. I present a new framework for unsupervised lexical substitution using graphs and centrality algorithms. An additional novelty in this approach is the use of directional similarity rather than the traditional, symmetric word similarity. Additionally, the thesis also explores the extension of the monolingual framework into a cross-lingual one, and examines how well this cross-lingual framework can work for the monolingual lexical substitution and cross-lingual lexical substitution tasks. A comprehensive set of comparative investigations are presented amongst supervised and unsupervised methods, several graph based methods, and the use of monolingual and multilingual information. digital.library.unt.edu/ark:/67531/metadc271899/
A Framework for Analyzing and Optimizing Regional Bio-Emergency Response Plans
The presence of naturally occurring and man-made public health threats necessitate the design and implementation of mitigation strategies, such that adequate response is provided in a timely manner. Since multiple variables, such as geographic properties, resource constraints, and government mandated time-frames must be accounted for, computational methods provide the necessary tools to develop contingency response plans while respecting underlying data and assumptions. A typical response scenario involves the placement of points of dispensing (PODs) in the affected geographic region to supply vaccines or medications to the general public. Computational tools aid in the analysis of such response plans, as well as in the strategic placement of PODs, such that feasible response scenarios can be developed. Due to the sensitivity of bio-emergency response plans, geographic information, such as POD locations, must be kept confidential. The generation of synthetic geographic regions allows for the development of emergency response plans on non-sensitive data, as well as for the study of the effects of single geographic parameters. Further, synthetic representations of geographic regions allow for results to be published and evaluated by the scientific community. This dissertation presents methodology for the analysis of bio-emergency response plans, methods for plan optimization, as well as methodology for the generation of synthetic geographic regions. digital.library.unt.edu/ark:/67531/metadc33200/
Framework for Evaluating Dynamic Memory Allocators Including a New Equivalence Class Based Cache-conscious Allocator
Software applications’ performance is hindered by a variety of factors, but most notably by the well-known CPU-memory speed gap (often known as the memory wall). This results in the CPU sitting idle waiting for data to be brought from memory to processor caches. The addressing used by caches cause non-uniform accesses to various cache sets. The non-uniformity is due to several reasons, including how different objects are accessed by the code and how the data objects are located in memory. Memory allocators determine where dynamically created objects are placed, thus defining addresses and their mapping to cache locations. It is important to evaluate how different allocators behave with respect to the localities of the created objects. Most allocators use a single attribute, the size, of an object in making allocation decisions. Additional attributes such as the placement with respect to other objects, or specific cache area may lead to better use of cache memories. In this dissertation, we proposed and implemented a framework that allows for the development and evaluation of new memory allocation techniques. At the root of the framework is a memory tracing tool called Gleipnir, which provides very detailed information about every memory access, and relates it back to source level objects. Using the traces from Gleipnir, we extended a commonly used cache simulator for generating detailed cache statistics: per function, per data object, per cache line, and identify specific data objects that are conflicting with each other. The utility of the framework is demonstrated with a new memory allocator known as equivalence class allocator. The new allocator allows users to specify cache sets, in addition to object size, where the objects should be placed. We compare this new allocator with two well-known allocators, viz., Doug Lea and Pool allocators. digital.library.unt.edu/ark:/67531/metadc500151/
Geostatistical Inspired Metamodeling and Optimization of Nanoscale Analog Circuits
The current trend towards miniaturization of modern consumer electronic devices significantly affects their design. The demand for efficient all-in-one appliances leads to smaller, yet more complex and powerful nanoelectronic devices. The increasing complexity in the design of such nanoscale Analog/Mixed-Signal Systems-on-Chip (AMS-SoCs) presents difficult challenges to designers. One promising design method used to mitigate the burden of this design effort is the use of metamodeling (surrogate) modeling techniques. Their use significantly reduces the time for computer simulation and design space exploration and optimization. This dissertation addresses several issues of metamodeling based nanoelectronic based AMS design exploration. A surrogate modeling technique which uses geostatistical based Kriging prediction methods in creating metamodels is proposed. Kriging prediction techniques take into account the correlation effects between input parameters for performance point prediction. We propose the use of Kriging to utilize this property for the accurate modeling of process variation effects of designs in the deep nanometer region. Different Kriging methods have been explored for this work such as simple and ordinary Kriging. We also propose another metamodeling technique Kriging-Bootstrapped Neural Network that combines the accuracy and process variation awareness of Kriging with artificial neural network models for ultra-fast and accurate process aware metamodeling design. The proposed methodologies combine Kriging metamodels with selected algorithms for ultra-fast layout optimization. The selected algorithms explored are: Gravitational Search Algorithm (GSA), Simulated Annealing Optimization (SAO), and Ant Colony Optimization (ACO). Experimental results demonstrate that the proposed Kriging metamodel based methodologies can perform the optimizations with minimal computational burden compared to traditional (SPICE-based) design flows. digital.library.unt.edu/ark:/67531/metadc500074/
A Global Stochastic Modeling Framework to Simulate and Visualize Epidemics
Epidemics have caused major human and monetary losses through the course of human civilization. It is very important that epidemiologists and public health personnel are prepared to handle an impending infectious disease outbreak. the ever-changing demographics, evolving infrastructural resources of geographic regions, emerging and re-emerging diseases, compel the use of simulation to predict disease dynamics. By the means of simulation, public health personnel and epidemiologists can predict the disease dynamics, population groups at risk and their geographic locations beforehand, so that they are prepared to respond in case of an epidemic outbreak. As a consequence of the large numbers of individuals and inter-personal interactions involved in simulating infectious disease spread in a region such as a county, sizeable amounts of data may be produced that have to be analyzed. Methods to visualize this data would be effective in facilitating people from diverse disciplines understand and analyze the simulation. This thesis proposes a framework to simulate and visualize the spread of an infectious disease in a population of a region such as a county. As real-world populations have a non-homogeneous demographic and spatial distribution, this framework models the spread of an infectious disease based on population of and geographic distance between census blocks; social behavioral parameters for demographic groups. the population is stratified into demographic groups in individual census blocks using census data. Infection spread is modeled by means of local and global contacts generated between groups of population in census blocks. the strength and likelihood of the contacts are based on population, geographic distance and social behavioral parameters of the groups involved. the disease dynamics are represented on a geographic map of the region using a heat map representation, where the intensity of infection is mapped to a color scale. This framework provides a tool for public health personnel and epidemiologists to run what-if analyses on disease spread in specific populations and plan for epidemic response. By the means of demographic stratification of population and incorporation of geographic distance and social behavioral parameters into the modeling of the outbreak, this framework takes into account non-homogeneity in demographic mix and spatial distribution of the population. Generation of contacts per population group instead of individuals contributes to lowering computational overhead. Heat map representation of the intensity of infection provides an intuitive way to visualize the disease dynamics. digital.library.unt.edu/ark:/67531/metadc115099/
GPS CaPPture: a System for GPS Trajectory Collection, Processing, and Destination Prediction
In the United States, smartphone ownership surpassed 69.5 million in February 2011 with a large portion of those users (20%) downloading applications (apps) that enhance the usability of a device by adding additional functionality. a large percentage of apps are written specifically to utilize the geographical position of a mobile device. One of the prime factors in developing location prediction models is the use of historical data to train such a model. with larger sets of training data, prediction algorithms become more accurate; however, the use of historical data can quickly become a downfall if the GPS stream is not collected or processed correctly. Inaccurate or incomplete or even improperly interpreted historical data can lead to the inability to develop accurately performing prediction algorithms. As GPS chipsets become the standard in the ever increasing number of mobile devices, the opportunity for the collection of GPS data increases remarkably. the goal of this study is to build a comprehensive system that addresses the following challenges: (1) collection of GPS data streams in a manner such that the data is highly usable and has a reduction in errors; (2) processing and reduction of the collected data in order to prepare it and make it highly usable for the creation of prediction algorithms; (3) creation of prediction/labeling algorithms at such a level that they are viable for commercial use. This study identifies the key research problems toward building the CaPPture (collection, processing, prediction) system. digital.library.unt.edu/ark:/67531/metadc115089/
Graph-Based Keyphrase Extraction Using Wikipedia
Keyphrases describe a document in a coherent and simple way, giving the prospective reader a way to quickly determine whether the document satisfies their information needs. The pervasion of huge amount of information on Web, with only a small amount of documents have keyphrases extracted, there is a definite need to discover automatic keyphrase extraction systems. Typically, a document written by human develops around one or more general concepts or sub-concepts. These concepts or sub-concepts should be structured and semantically related with each other, so that they can form the meaningful representation of a document. Considering the fact, the phrases or concepts in a document are related to each other, a new approach for keyphrase extraction is introduced that exploits the semantic relations in the document. For measuring the semantic relations between concepts or sub-concepts in the document, I present a comprehensive study aimed at using collaboratively constructed semantic resources like Wikipedia and its link structure. In particular, I introduce a graph-based keyphrase extraction system that exploits the semantic relations in the document and features such as term frequency. I evaluated the proposed system using novel measures and the results obtained compare favorably with previously published results on established benchmarks. digital.library.unt.edu/ark:/67531/metadc67939/
Incremental Learning with Large Datasets
This dissertation focuses on the novel learning strategy based on geometric support vector machines to address the difficulties of processing immense data set. Support vector machines find the hyper-plane that maximizes the margin between two classes, and the decision boundary is represented with a few training samples it becomes a favorable choice for incremental learning. The dissertation presents a novel method Geometric Incremental Support Vector Machines (GISVMs) to address both efficiency and accuracy issues in handling massive data sets. In GISVM, skin of convex hulls is defined and an efficient method is designed to find the best skin approximation given available examples. The set of extreme points are found by recursively searching along the direction defined by a pair of known extreme points. By identifying the skin of the convex hulls, the incremental learning will only employ a much smaller number of samples with comparable or even better accuracy. When additional samples are provided, they will be used together with the skin of the convex hull constructed from previous dataset. This results in a small number of instances used in incremental steps of the training process. Based on the experimental results with synthetic data sets, public benchmark data sets from UCI and endoscopy videos, it is evident that the GISVM achieved satisfactory classifiers that closely model the underlying data distribution. GISVM improves the performance in sensitivity in the incremental steps, significantly reduced the demand for memory space, and demonstrates the ability of recovery from temporary performance degradation. digital.library.unt.edu/ark:/67531/metadc149595/
Indoor Localization Using Magnetic Fields
Indoor localization consists of locating oneself inside new buildings. GPS does not work indoors due to multipath reflection and signal blockage. WiFi based systems assume ubiquitous availability and infrastructure based systems require expensive installations, hence making indoor localization an open problem. This dissertation consists of solving the problem of indoor localization by thoroughly exploiting the indoor ambient magnetic fields comprising mainly of disturbances termed as anomalies in the Earth’s magnetic field caused by pillars, doors and elevators in hallways which are ferromagnetic in nature. By observing uniqueness in magnetic signatures collected from different campus buildings, the work presents the identification of landmarks and guideposts from these signatures and further develops magnetic maps of buildings - all of which can be used to locate and navigate people indoors. To understand the reason behind these anomalies, first a comparison between the measured and model generated Earth’s magnetic field is made, verifying the presence of a constant field without any disturbances. Then by modeling the magnetic field behavior of different pillars such as steel reinforced concrete, solid steel, and other structures like doors and elevators, the interaction of the Earth’s field with the ferromagnetic fields is described thereby explaining the causes of the uniqueness in the signatures that comprise these disturbances. Next, by employing the dynamic time warping algorithm to account for time differences in signatures obtained from users walking at different speeds, an indoor localization application capable of classifying locations using the magnetic signatures is developed solely on the smart phone. The application required users to walk short distances of 3-6 m anywhere in hallway to be located with accuracies of 80-99%. The classification framework was further validated with over 90% accuracies using model generated magnetic signatures representing hallways with different kinds of pillars, doors and elevators. All in all, this dissertation contributes the following: 1) provides a framework for understanding the presence of ambient magnetic fields indoors and utilizing them to solve the indoor localization problem; 2) develops an application that is independent of the user and the smart phones and 3) requires no other infrastructure since it is deployed on a device that encapsulates the sensing, computing and inferring functionalities, thereby making it a novel contribution to the mobile and pervasive computing domain. digital.library.unt.edu/ark:/67531/metadc103371/
The Influence of Social Network Graph Structure on Disease Dynamics in a Simulated Environment
The fight against epidemics/pandemics is one of man versus nature. Technological advances have not only improved existing methods for monitoring and controlling disease outbreaks, but have also provided new means for investigation, such as through modeling and simulation. This dissertation explores the relationship between social structure and disease dynamics. Social structures are modeled as graphs, and outbreaks are simulated based on a well-recognized standard, the susceptible-infectious-removed (SIR) paradigm. Two independent, but related, studies are presented. The first involves measuring the severity of outbreaks as social network parameters are altered. The second study investigates the efficacy of various vaccination policies based on social structure. Three disease-related centrality measures are introduced, contact, transmission, and spread centrality, which are related to previously established centrality measures degree, betweenness, and closeness, respectively. The results of experiments presented in this dissertation indicate that reducing the neighborhood size along with outside-of-neighborhood contacts diminishes the severity of disease outbreaks. Vaccination strategies can effectively reduce these parameters. Additionally, vaccination policies that target individuals with high centrality are generally shown to be slightly more effective than a random vaccination policy. These results combined with past and future studies will assist public health officials in their effort to minimize the effects of inevitable disease epidemics/pandemics. digital.library.unt.edu/ark:/67531/metadc33173/
Investigating the Extractive Summarization of Literary Novels
Abstract Due to the vast amount of information we are faced with, summarization has become a critical necessity of everyday human life. Given that a large fraction of the electronic documents available online and elsewhere consist of short texts such as Web pages, news articles, scientific reports, and others, the focus of natural language processing techniques to date has been on the automation of methods targeting short documents. We are witnessing however a change: an increasingly larger number of books become available in electronic format. This means that the need for language processing techniques able to handle very large documents such as books is becoming increasingly important. This thesis addresses the problem of summarization of novels, which are long and complex literary narratives. While there is a significant body of research that has been carried out on the task of automatic text summarization, most of this work has been concerned with the summarization of short documents, with a particular focus on news stories. However, novels are different in both length and genre, and consequently different summarization techniques are required. This thesis attempts to close this gap by analyzing a new domain for summarization, and by building unsupervised and supervised systems that effectively take into account the properties of long documents, and outperform the traditional extractive summarization systems typically addressing news genre. digital.library.unt.edu/ark:/67531/metadc103298/
Joint Schemes for Physical Layer Security and Error Correction
The major challenges facing resource constraint wireless devices are error resilience, security and speed. Three joint schemes are presented in this research which could be broadly divided into error correction based and cipher based. The error correction based ciphers take advantage of the properties of LDPC codes and Nordstrom Robinson code. A cipher-based cryptosystem is also presented in this research. The complexity of this scheme is reduced compared to conventional schemes. The securities of the ciphers are analyzed against known-plaintext and chosen-plaintext attacks and are found to be secure. Randomization test was also conducted on these schemes and the results are presented. For the proof of concept, the schemes were implemented in software and hardware and these shows a reduction in hardware usage compared to conventional schemes. As a result, joint schemes for error correction and security provide security to the physical layer of wireless communication systems, a layer in the protocol stack where currently little or no security is implemented. In this physical layer security approach, the properties of powerful error correcting codes are exploited to deliver reliability to the intended parties, high security against eavesdroppers and efficiency in communication system. The notion of a highly secure and reliable physical layer has the potential to significantly change how communication system designers and users think of the physical layer since the error control codes employed in this work will have the dual roles of both reliability and security. digital.library.unt.edu/ark:/67531/metadc84159/
Layout-accurate Ultra-fast System-level Design Exploration Through Verilog-ams
This research addresses problems in designing analog and mixed-signal (AMS) systems by bridging the gap between system-level and circuit-level simulation by making simulations fast like system-level and accurate like circuit-level. The tools proposed include metamodel integrated Verilog-AMS based design exploration flows. The research involves design centering, metamodel generation flows for creating efficient behavioral models, and Verilog-AMS integration techniques for model realization. The core of the proposed solution is transistor-level and layout-level metamodeling and their incorporation in Verilog-AMS. Metamodeling is used to construct efficient and layout-accurate surrogate models for AMS system building blocks. Verilog-AMS, an AMS hardware description language, is employed to build surrogate model implementations that can be simulated with industrial standard simulators. The case-study circuits and systems include an operational amplifier (OP-AMP), a voltage-controlled oscillator (VCO), a charge-pump phase-locked loop (PLL), and a continuous-time delta-sigma modulator (DSM). The minimum and maximum error rates of the proposed OP-AMP model are 0.11 % and 2.86 %, respectively. The error rates for the PLL lock time and power estimation are 0.7 % and 3.0 %, respectively. The OP-AMP optimization using the proposed approach is ~17000× faster than the transistor-level model based approach. The optimization achieves a ~4× power reduction for the OP-AMP design. The PLL parasitic-aware optimization achieves a 10× speedup and a 147 µW power reduction. Thus the experimental results validate the effectiveness of the proposed solution. digital.library.unt.edu/ark:/67531/metadc271923/
Measuring Semantic Relatedness Using Salient Encyclopedic Concepts
While pragmatics, through its integration of situational awareness and real world relevant knowledge, offers a high level of analysis that is suitable for real interpretation of natural dialogue, semantics, on the other end, represents a lower yet more tractable and affordable linguistic level of analysis using current technologies. Generally, the understanding of semantic meaning in literature has revolved around the famous quote ``You shall know a word by the company it keeps''. In this thesis we investigate the role of context constituents in decoding the semantic meaning of the engulfing context; specifically we probe the role of salient concepts, defined as content-bearing expressions which afford encyclopedic definitions, as a suitable source of semantic clues to an unambiguous interpretation of context. Furthermore, we integrate this world knowledge in building a new and robust unsupervised semantic model and apply it to entail semantic relatedness between textual pairs, whether they are words, sentences or paragraphs. Moreover, we explore the abstraction of semantics across languages and utilize our findings into building a novel multi-lingual semantic relatedness model exploiting information acquired from various languages. We demonstrate the effectiveness and the superiority of our mono-lingual and multi-lingual models through a comprehensive set of evaluations on specialized synthetic datasets for semantic relatedness as well as real world applications such as paraphrase detection and short answer grading. Our work represents a novel approach to integrate world-knowledge into current semantic models and a means to cross the language boundary for a better and more robust semantic relatedness representation, thus opening the door for an improved abstraction of meaning that carries the potential of ultimately imparting understanding of natural language to machines. digital.library.unt.edu/ark:/67531/metadc84212/
Measuring Vital Signs Using Smart Phones
Smart phones today have become increasingly popular with the general public for its diverse abilities like navigation, social networking, and multimedia facilities to name a few. These phones are equipped with high end processors, high resolution cameras, built-in sensors like accelerometer, orientation-sensor, light-sensor, and much more. According to comScore survey, 25.3% of US adults use smart phones in their daily lives. Motivated by the capability of smart phones and their extensive usage, I focused on utilizing them for bio-medical applications. In this thesis, I present a new application for a smart phone to quantify the vital signs such as heart rate, respiratory rate and blood pressure with the help of its built-in sensors. Using the camera and a microphone, I have shown how the blood pressure and heart rate can be determined for a subject. People sometimes encounter minor situations like fainting or fatal accidents like car crash at unexpected times and places. It would be useful to have a device which can measure all vital signs in such an event. The second part of this thesis demonstrates a new mode of communication for next generation 9-1-1 calls. In this new architecture, the call-taker will be able to control the multimedia elements in the phone from a remote location. This would help the call-taker or first responder to have a better control over the situation. Transmission of the vital signs measured using the smart phone can be a life saver in critical situations. In today's voice oriented 9-1-1 calls, the dispatcher first collects critical information (e.g., location, call-back number) from caller, and assesses the situation. Meanwhile, the dispatchers constantly face a "60-second dilemma"; i.e., within 60 seconds, they need to make a complicated but important decision, whether to dispatch and, if so, what to dispatch. The dispatchers often feel that they lack sufficient information to make a confident dispatch decision. This remote-media-control described in this system will be able to facilitate information acquisition and decision-making in emergency situations within the 60-second response window in 9-1-1 calls using new multimedia technologies. digital.library.unt.edu/ark:/67531/metadc33139/
Metamodeling-based Fast Optimization of Nanoscale Ams-socs
Modern consumer electronic systems are mostly based on analog and digital circuits and are designed as analog/mixed-signal systems on chip (AMS-SoCs). the integration of analog and digital circuits on the same die makes the system cost effective. in AMS-SoCs, analog and mixed-signal portions have not traditionally received much attention due to their complexity. As the fabrication technology advances, the simulation times for AMS-SoC circuits become more complex and take significant amounts of time. the time allocated for the circuit design and optimization creates a need to reduce the simulation time. the time constraints placed on designers are imposed by the ever-shortening time to market and non-recurrent cost of the chip. This dissertation proposes the use of a novel method, called metamodeling, and intelligent optimization algorithms to reduce the design time. Metamodel-based ultra-fast design flows are proposed and investigated. Metamodel creation is a one time process and relies on fast sampling through accurate parasitic-aware simulations. One of the targets of this dissertation is to minimize the sample size while retaining the accuracy of the model. in order to achieve this goal, different statistical sampling techniques are explored and applied to various AMS-SoC circuits. Also, different metamodel functions are explored for their accuracy and application to AMS-SoCs. Several different optimization algorithms are compared for global optimization accuracy and convergence. Three different AMS circuits, ring oscillator, inductor-capacitor voltage-controlled oscillator (LC-VCO) and phase locked loop (PLL) that are present in many AMS-SoC are used in this study for design flow application. Metamodels created in this dissertation provide accuracy with an error of less than 2% from the physical layout simulations. After optimal sampling investigation, metamodel functions and optimization algorithms are ranked in terms of speed and accuracy. Experimental results show that the proposed design flow provides roughly 5,000x speedup over conventional design flows. Thus, this dissertation greatly advances the state-of-the-art in mixed-signal design and will assist towards making consumer electronics cheaper and affordable. digital.library.unt.edu/ark:/67531/metadc115081/
Modeling Alcohol Consumption Using Blog Data
How do the content and writing style of people who drink alcohol beverages stand out from non-drinkers? How much information can we learn about a person's alcohol consumption behavior by reading text that they have authored? This thesis attempts to extend the methods deployed in authorship attribution and authorship profiling research into the domain of automatically identifying the human action of drinking alcohol beverages. I examine how a psycholinguistics dictionary (the Linguistics Inquiry and Word Count lexicon, developed by James Pennebaker), together with Kenneth Burke's concept of words as symbols of human action, and James Wertsch's concept of mediated action provide a framework for analyzing meaningful data patterns from the content of blogs written by consumers of alcohol beverages. The contributions of this thesis to the research field are twofold. First, I show that it is possible to automatically identify blog posts that have content related to the consumption of alcohol beverages. And second, I provide a framework and tools to model human behavior through text analysis of blog data. digital.library.unt.edu/ark:/67531/metadc271843/
Modeling and Analysis of Next Generation 9-1-1 Emergency Medical Dispatch Protocols
Emergency Medical Dispatch Protocols are guidelines that a 9-1-1 dispatcher uses to evaluate the nature of emergency, resources to send and the nature of help provided to the 9-1-1 caller. The current Dispatch Protocols are based on voice only call. But the Next Generation 9-1-1 (NG9-1-1) architecture will allow multimedia emergency calls. In this thesis I analyze and model the Emergency Medical Dispatch Protocols for NG9-1-1 architecture. I have identified various technical aspects to improve the NG9-1-1 Dispatch Protocols. The devices (smartphone) at the caller end have advanced to a point where they can be used to send and receive video, pictures and text. There are sensors embedded in them that can be used for initial diagnosis of the injured person. There is a need to improve the human computer (smartphone) interface to take advantage of technology so that callers can easily make use of various features available to them. The dispatchers at the 9-1-1 call center can make use of these new protocols to improve the quality and the response time. They will have capability of multiple media streams to interact with the caller and the first responders.The specific contributions in this thesis include developing applications that use smartphone sensors. The CPR application uses the smartphone to help administer effective CPR even if the person is not trained. The application makes the CPR process closed loop, i.e., the person who administers the CPR as well as the 9-1-1 operator receive feedback and prompt from the application about the correctness of the CPR. The breathing application analyzes the quality of breathing of the affected person and automatically sends the information to the 9-1-1 operator. In order to improve the Human Computer Interface at the caller and the operator end, I have analyzed Fitts law and extended it so that it can be used to improve the instructions given to a caller. In emergency situations, the caller may be physically or cognitively impaired. This may happen either because the caller is the injured person, or because the caller is a close relative or friend of the injured person. Using EEG waves, I have analyzed and developed a mathematical model of a person's cognitive impairment. Finally, I have developed a mathematical model of the response time of a 9-1-1 call and analyzed the factors that can be improved to reduce the response time. In this regard, another application, I have developed, allows the 9-1-1 operator to remotely control the media features of a caller's smartphone. This is needed in case the caller is unable to operate the multimedia features of the smartphone. For example, the caller may not know how to zoom in the smartphone camera.All these building blocks come together in the development of an efficient NG9-1-1 Emergency Medical Dispatch protocols. I have provided a sample of these protocols, using the existing Emergency Dispatch Protocols used in the state of New Jersey. The new protocols will have fewer questions and more visual prompts to evaluate the nature of the emergency. digital.library.unt.edu/ark:/67531/metadc500122/
Modeling Synergistic Relationships Between Words and Images
Texts and images provide alternative, yet orthogonal views of the same underlying cognitive concept. By uncovering synergistic, semantic relationships that exist between words and images, I am working to develop novel techniques that can help improve tasks in natural language processing, as well as effective models for text-to-image synthesis, image retrieval, and automatic image annotation. Specifically, in my dissertation, I will explore the interoperability of features between language and vision tasks. In the first part, I will show how it is possible to apply features generated using evidence gathered from text corpora to solve the image annotation problem in computer vision, without the use of any visual information. In the second part, I will address research in the reverse direction, and show how visual cues can be used to improve tasks in natural language processing. Importantly, I propose a novel metric to estimate the similarity of words by comparing the visual similarity of concepts invoked by these words, and show that it can be used further to advance the state-of-the-art methods that employ corpus-based and knowledge-based semantic similarity measures. Finally, I attempt to construct a joint semantic space connecting words with images, and synthesize an evaluation framework to quantify cross-modal semantic relationships that exist between arbitrary pairs of words and images. I study the effectiveness of unsupervised, corpus-based approaches to automatically derive the semantic relatedness between words and images, and perform empirical evaluations by measuring its correlation with human annotators. digital.library.unt.edu/ark:/67531/metadc177223/
Monitoring Dengue Outbreaks Using Online Data
Internet technology has affected humans' lives in many disciplines. The search engine is one of the most important Internet tools in that it allows people to search for what they want. Search queries entered in a web search engine can be used to predict dengue incidence. This vector borne disease causes severe illness and kills a large number of people every year. This dissertation utilizes the capabilities of search queries related to dengue and climate to forecast the number of dengue cases. Several machine learning techniques are applied for data analysis, including Multiple Linear Regression, Artificial Neural Networks, and the Seasonal Autoregressive Integrated Moving Average. Predictive models produced from these machine learning methods are measured for their performance to find which technique generates the best model for dengue prediction. The results of experiments presented in this dissertation indicate that search query data related to dengue and climate can be used to forecast the number of dengue cases. The performance measurement of predictive models shows that Artificial Neural Networks outperform the others. These results will help public health officials in planning to deal with the outbreaks. digital.library.unt.edu/ark:/67531/metadc500167/
Multi-perspective, Multi-modal Image Registration and Fusion
Multi-modal image fusion is an active research area with many civilian and military applications. Fusion is defined as strategic combination of information collected by various sensors from different locations or different types in order to obtain a better understanding of an observed scene or situation. Fusion of multi-modal images cannot be completed unless these two modalities are spatially aligned. In this research, I consider two important problems. Multi-modal, multi-perspective image registration and decision level fusion of multi-modal images. In particular, LiDAR and visual imagery. Multi-modal image registration is a difficult task due to the different semantic interpretation of features extracted from each modality. This problem is decoupled into three sub-problems. The first step is identification and extraction of common features. The second step is the determination of corresponding points. The third step consists of determining the registration transformation parameters. Traditional registration methods use low level features such as lines and corners. Using these features require an extensive optimization search in order to determine the corresponding points. Many methods use global positioning systems (GPS), and a calibrated camera in order to obtain an initial estimate of the camera parameters. The advantages of our work over the previous works are the following. First, I used high level-features, which significantly reduce the search space for the optimization process. Second, the determination of corresponding points is modeled as an assignment problem between a small numbers of objects. On the other side, fusing LiDAR and visual images is beneficial, due to the different and rich characteristics of both modalities. LiDAR data contain 3D information, while images contain visual information. Developing a fusion technique that uses the characteristics of both modalities is very important. I establish a decision-level fusion technique using manifold models. digital.library.unt.edu/ark:/67531/metadc149562/
Multilingual Word Sense Disambiguation Using Wikipedia
Ambiguity is inherent to human language. In particular, word sense ambiguity is prevalent in all natural languages, with a large number of the words in any given language carrying more than one meaning. Word sense disambiguation is the task of automatically assigning the most appropriate meaning to a polysemous word within a given context. Generally the problem of resolving ambiguity in literature has revolved around the famous quote “you shall know the meaning of the word by the company it keeps.” In this thesis, we investigate the role of context for resolving ambiguity through three different approaches. Instead of using a predefined monolingual sense inventory such as WordNet, we use a language-independent framework where the word senses and sense-tagged data are derived automatically from Wikipedia. Using Wikipedia as a source of sense-annotations provides the much needed solution for knowledge acquisition bottleneck. In order to evaluate the viability of Wikipedia based sense-annotations, we cast the task of disambiguating polysemous nouns as a monolingual classification task and experimented on lexical samples from four different languages (viz. English, German, Italian and Spanish). The experiments confirm that the Wikipedia based sense annotations are reliable and can be used to construct accurate monolingual sense classifiers. It is a long belief that exploiting multiple languages helps in building accurate word sense disambiguation systems. Subsequently, we developed two approaches that recast the task of disambiguating polysemous nouns as a multilingual classification task. The first approach for multilingual word sense disambiguation attempts to effectively use a machine translation system to leverage two relevant multilingual aspects of the semantics of text. First, the various senses of a target word may be translated into different words, which constitute unique, yet highly salient signal that effectively expand the target word’s feature space. Second, the translated context words themselves embed co-occurrence information that a translation engine gathers from very large parallel corpora. The second approach for multlingual word sense disambiguation attempts to reduce the reliance on the machine translation system during training by using the multilingual knowledge available in Wikipedia through its interlingual links. Finally, the experiments on a lexical sample from four different languages confirm that the multilingual systems perform better than the monolingual system and significantly improve the disambiguation accuracy. digital.library.unt.edu/ark:/67531/metadc500036/
Optimizing Non-pharmaceutical Interventions Using Multi-coaffiliation Networks
Computational modeling is of fundamental significance in mapping possible disease spread, and designing strategies for its mitigation. Conventional contact networks implement the simulation of interactions as random occurrences, presenting public health bodies with a difficult trade off between a realistic model granularity and robust design of intervention strategies. Recently, researchers have been investigating the use of agent-based models (ABMs) to embrace the complexity of real world interactions. At the same time, theoretical approaches provide epidemiologists with general optimization models in which demographics are intrinsically simplified. The emerging study of affiliation networks and co-affiliation networks provide an alternative to such trade off. Co-affiliation networks maintain the realism innate to ABMs while reducing the complexity of contact networks into distinctively smaller k-partite graphs, were each partition represent a dimension of the social model. This dissertation studies the optimization of intervention strategies for infectious diseases, mainly distributed in school systems. First, concepts of synthetic populations and affiliation networks are extended to propose a modified algorithm for the synthetic reconstruction of populations. Second, the definition of multi-coaffiliation networks is presented as the main social model in which risk is quantified and evaluated, thereby obtaining vulnerability indications for each school in the system. Finally, maximization of the mitigation coverage and minimization of the overall cost of intervention strategies are proposed and compared, based on centrality measures. digital.library.unt.edu/ark:/67531/metadc271860/
Performance Engineering of Software Web Services and Distributed Software Systems
The promise of service oriented computing, and the availability of Web services promote the delivery and creation of new services based on existing services, in order to meet new demands and new markets. As Web and internet based services move into Clouds, inter-dependency of services and their complexity will increase substantially. There are standards and frameworks for specifying and composing Web Services based on functional properties. However, mechanisms to individually address non-functional properties of services and their compositions have not been well established. Furthermore, the Cloud ontology depicts service layers from a high-level, such as Application and Software, to a low-level, such as Infrastructure and Platform. Each component that resides in one layer can be useful to another layer as a service. It hints at the amount of complexity resulting from not only horizontal but also vertical integrations in building and deploying a composite service. To meet the requirements and facilitate using Web services, we first propose a WSDL extension to permit specification of non-functional or Quality of Service (QoS) properties. On top of the foundation, the QoS-aware framework is established to adapt publicly available tools for Web services, augmented by ontology management tools, along with tools for performance modeling to exemplify how the non-functional properties such as response time, throughput, or utilization of services can be addressed in the service acquisition and composition process. To facilitate Web service composition standards, in this work we extended the framework with additional qualitative information to the service descriptions using Business Process Execution Language (BPEL). Engineers can use BPEL to explore design options, and have the QoS properties analyzed for the composite service. The main issue in our research is performance evaluation in software system and engineering. We researched the Web service computation as the first half of this dissertation, and performance antipattern detection and elimination in the second part. Performance analysis of software system is complex due to large number of components and the interactions among them. Without the knowledge of experienced experts, it is difficult to diagnose performance anomalies and attempt to pinpoint the root causes of the problems. Software performance antipatterns are similar to design patterns in that they provide what to avoid and how to fix performance problems when they appear. Although the idea of applying antipatterns is promising, there are gaps in matching the symptoms and generating feedback solution for redesign. In this work, we analyze performance antipatterns to extract detectable features, influential factors, and resource involvements so that we can lay the foundation to detect their presence. We propose system abstract layering model and suggestive profiling methods for performance antipattern detection and elimination. Solutions proposed can be used during the refactoring phase, and can be included in the software development life cycle. Proposed tools and utilities are implemented and their use is demonstrated with RUBiS benchmark. digital.library.unt.edu/ark:/67531/metadc500103/
Physical-Layer Network Coding for MIMO Systems
The future wireless communication systems are required to meet the growing demands of reliability, bandwidth capacity, and mobility. However, as corruptions such as fading effects, thermal noise, are present in the channel, the occurrence of errors is unavoidable. Motivated by this, the work in this dissertation attempts to improve the system performance by way of exploiting schemes which statistically reduce the error rate, and in turn boost the system throughput. The network can be studied using a simplified model, the two-way relay channel, where two parties exchange messages via the assistance of a relay in between. In such scenarios, this dissertation performs theoretical analysis of the system, and derives closed-form and upper bound expressions of the error probability. These theoretical measurements are potentially helpful references for the practical system design. Additionally, several novel transmission methods including block relaying, permutation modulations for the physical-layer network coding, are proposed and discussed. Numerical simulation results are presented to support the validity of the conclusions. digital.library.unt.edu/ark:/67531/metadc68065/
Privacy Management for Online Social Networks
One in seven people in the world use online social networking for a variety of purposes -- to keep in touch with friends and family, to share special occasions, to broadcast announcements, and more. The majority of society has been bought into this new era of communication technology, which allows everyone on the internet to share information with friends. Since social networking has rapidly become a main form of communication, holes in privacy have become apparent. It has come to the point that the whole concept of sharing information requires restructuring. No longer are online social networks simply technology available for a niche market; they are in use by all of society. Thus it is important to not forget that a sense of privacy is inherent as an evolutionary by-product of social intelligence. In any context of society, privacy needs to be a part of the system in order to help users protect themselves from others. This dissertation attempts to address the lack of privacy management in online social networks by designing models which understand the social science behind how we form social groups and share information with each other. Social relationship strength was modeled using activity patterns, vocabulary usage, and behavioral patterns. In addition, automatic configuration for default privacy settings was proposed to help prevent new users from leaking personal information. This dissertation aims to mobilize a new era of social networking that understands social aspects of human network, and uses that knowledge to honor users' privacy. digital.library.unt.edu/ark:/67531/metadc283816/
Process-Voltage-Temperature Aware Nanoscale Circuit Optimization
Embedded systems which are targeted towards portable applications are required to have low power consumption because such portable devices are typically powered by batteries. During the memory accesses of such battery operated portable systems, including laptops, cell phones and other devices, a significant amount of power or energy is consumed which significantly affects the battery life. Therefore, efficient and leakage power saving cache designs are needed for longer operation of battery powered applications. Design engineers have limited control over many design parameters of the circuit and hence face many chal-lenges due to inherent process technology variations, particularly on static random access memory (SRAM) circuit design. As CMOS process technologies scale down deeper into the nanometer regime, the push for high performance and reliable systems becomes even more challenging. As a result, developing low-power designs while maintaining better performance of the circuit becomes a very difficult task. Furthermore, a major need for accurate analysis and optimization of various forms of total power dissipation and performance in nanoscale CMOS technologies, particularly in SRAMs, is another critical issue to be considered. This dissertation proposes power-leakage and static noise margin (SNM) analysis and methodologies to achieve optimized static random access memories (SRAMs). Alternate topologies of SRAMs, mainly a 7-transistor SRAM, are taken as a case study throughout this dissertation. The optimized cache designs are process-voltage-temperature (PVT) tolerant and consider individual cells as well as memory arrays. digital.library.unt.edu/ark:/67531/metadc67943/
Qos Aware Service Oriented Architecture
Service-oriented architecture enables web services to operate in a loosely-coupled setting and provides an environment for dynamic discovery and use of services over a network using standards such as WSDL, SOAP, and UDDI. Web service has both functional and non-functional characteristics. This thesis work proposes to add QoS descriptions (non-functional properties) to WSDL and compose various services to form a business process. This composition of web services also considers QoS properties along with functional properties and the composed services can again be published as a new Web Service and can be part of any other composition using Composed WSDL. digital.library.unt.edu/ark:/67531/metadc500032/
Rapid Prototyping and Design of a Fast Random Number Generator
Information in the form of online multimedia, bank accounts, or password usage for diverse applications needs some form of security. the core feature of many security systems is the generation of true random or pseudorandom numbers. Hence reliable generators of such numbers are indispensable. the fundamental hurdle is that digital computers cannot generate truly random numbers because the states and transitions of digital systems are well understood and predictable. Nothing in a digital computer happens truly randomly. Digital computers are sequential machines that perform a current state and move to the next state in a deterministic fashion. to generate any secure hash or encrypted word a random number is needed. But since computers are not random, random sequences are commonly used. Random sequences are algorithms that generate a pattern of values that appear to be random but after some time start repeating. This thesis implements a digital random number generator using MATLAB, FGPA prototyping, and custom silicon design. This random number generator is able to use a truly random CMOS source to generate the random number. Statistical benchmarks are used to test the results and to show that the design works. Thus the proposed random number generator will be useful for online encryption and security. digital.library.unt.edu/ark:/67531/metadc115036/
FIRST PREV 1 2 NEXT LAST