- 3D Reconstruction Using Lidar and Visual Images
- In this research, multi-perspective image registration using LiDAR and visual images was considered. 2D-3D image registration is a difficult task because it requires the extraction of different semantic features from each modality. This problem is solved in three parts. The first step involves detection and extraction of common features from each of the data sets. The second step consists of associating the common features between two different modalities. Traditional methods use lines or orthogonal corners as common features. The third step consists of building the projection matrix. Many existing methods use global positioning system (GPS) or inertial navigation system (INS) for an initial estimate of the camera pose. However, the approach discussed herein does not use GPS, INS, or any such devices for initial estimate; hence the model can be used in places like the lunar surface or Mars where GPS or INS are not available. A variation of the method is also described, which does not require strong features from both images but rather uses intensity gradients in the image. This can be useful when one image does not have strong features (such as lines) or there are too many extraneous features.
- Adaptive Planning and Prediction in Agent-Supported Distributed Collaboration.
- Agents that act as user assistants will become invaluable as the number of information sources continues to proliferate. Such agents can support the work of users by learning to automate time-consuming tasks and filter information to manageable levels. Although considerable advances have been made in this area, it remains a fertile area for further development. One application of agents under careful scrutiny is the automated negotiation of conflicts between different users' needs and desires. Many techniques require explicit user models in order to function. This dissertation explores a technique for dynamically constructing user models and the impact of using them to anticipate the need for negotiation. Negotiation is reduced by including an advising aspect to the agent that can use this anticipation of conflict to adjust user behavior.
- Anchor Nodes Placement for Effective Passive Localization
Access: Use of this item is restricted to the UNT Community.
Wireless sensor networks are composed of sensor nodes, which can monitor an environment and observe events of interest. These networks are applied in various fields including but not limited to environmental, industrial and habitat monitoring. In many applications, the exact location of the sensor nodes is unknown after deployment. Localization is a process used to find a sensor node's positional coordinates, which is vital information. The localization is generally assisted by anchor nodes that are also sensor nodes but with known locations. Anchor nodes generally are expensive and need to be optimally placed for effective localization. Passive localization is one of the localization techniques where the sensor nodes silently listen to global events like thunder sounds, seismic waves, lightning, etc. According to previous studies, the ideal location to place anchor nodes was on the perimeter of the sensor network. This may not be the case in passive localization, since the function of anchor nodes here is different than the anchor nodes used in other localization systems. I do extensive studies on positioning anchor nodes for effective localization. Several simulations are run in dense and sparse networks for proper positioning of anchor nodes. I show that, for effective passive localization, the optimal placement of the anchor nodes is at the center of the network in such a way that no three anchor nodes share linearity. The more the non-linearity, the better the localization. The localization for our network design proves better when I place anchor nodes at right angles.
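The placement condition in this abstract, that no three anchor nodes share linearity, can be checked with a short geometric test. The sketch below is only an illustration under assumptions: anchors are treated as 2D points, and the function names and tolerance are hypothetical rather than taken from the thesis.

```python
from itertools import combinations

def collinear(p, q, r, tol=1e-9):
    """Return True if three 2D points lie on one line.

    Uses the cross product of (q - p) and (r - p), which is zero
    exactly when the three points are collinear.
    """
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol

def no_three_collinear(anchors, tol=1e-9):
    """Return True if no triple of anchor positions is collinear."""
    return not any(collinear(p, q, r, tol)
                   for p, q, r in combinations(anchors, 3))

# Hypothetical anchor layouts (coordinates are illustrative only).
right_angle_layout = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
line_layout = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
```

Here `no_three_collinear(right_angle_layout)` accepts the layout, while `line_layout` is rejected because all three anchors lie on one line.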
- An Approach Towards Self-Supervised Classification Using Cyc
- Due to the long duration required to perform manual knowledge entry by human knowledge engineers, it is desirable to find methods to automatically acquire knowledge about the world by accessing online information. In this work I examine using the Cyc ontology to guide the creation of Naïve Bayes classifiers to provide knowledge about items described in Wikipedia articles. Given an initial set of Wikipedia articles, the system uses the ontology to create positive and negative training sets for the classifiers in each category. The order in which classifiers are generated and used to test articles is also guided by the ontology. The research conducted shows that a system can be created that utilizes statistical text classification methods to extract information from an ad-hoc generated information source like Wikipedia for use in a formal semantic ontology like Cyc. Benefits and limitations of the system are discussed along with future work.
- Automated Classification of Emotions Using Song Lyrics
- This thesis explores the classification of emotions in song lyrics, using automatic approaches applied to a novel corpus of 100 popular songs. I use crowdsourcing via Amazon Mechanical Turk to collect line-level emotion annotations for this collection of song lyrics. I then build classifiers that rely on textual features to automatically identify the presence of one or more of the following six Ekman emotions: anger, disgust, fear, joy, sadness and surprise. I compare different classification systems and evaluate the performance of the automatic systems against the manual annotations. I also introduce a system that uses data collected from the social network Twitter. I use the Twitter API to collect a large corpus of tweets manually labeled by their authors for one of the six emotions of interest. I then compare the classification of emotions obtained when training on data automatically collected from Twitter versus data obtained through crowdsourced annotations.
- Automated Defense Against Worm Propagation.
Access: Use of this item is restricted to the UNT Community.
Worms have caused significant destruction over the last few years. Network security elements such as firewalls, IDS, etc. have been ineffective against worms. Some worms are so fast that manual intervention is not possible. This brings in the need for a stronger security architecture which can automatically react to stop worm propagation. The method has to be signature independent so that it can stop new worms. In this thesis, an automated defense system (ADS) is developed to automate defense against worms and contain the worm to a level where manual intervention is possible. This is accomplished with a two-level architecture with feedback at each level. The inner loop is based on control system theory and uses the properties of a PID (proportional, integral and derivative) controller. The outer loop works at the network level and stops the worm from reaching its spread saturation point. In our lab setup, we verified that with only the inner loop active the worm was delayed, and with both loops active we were able to restrict the propagation to 10% of the targeted hosts. One concern for deployment of a worm containment mechanism was degradation of throughput for legitimate traffic. We found that with a proper intelligent algorithm we can minimize the degradation to an acceptable level.
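The inner loop described in this abstract relies on PID control. As a rough illustration, a generic discrete-time PID controller can be sketched as follows. The abstract does not say which signal is measured or which action the controller output drives, so the class name, gains, and example values here are all hypothetical and do not represent the thesis implementation.

```python
class PID:
    """Generic discrete-time PID controller (textbook form)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # running sum of error * dt
        self.prev_error = None   # no derivative term on the first sample

    def update(self, setpoint, measured, dt):
        """Return the control output for one sample of length dt."""
        error = setpoint - measured
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

# Hypothetical use: the "measured" value could be an observed
# connection rate and the setpoint an allowed rate (an assumption).
controller = PID(kp=2.0, ki=0.0, kd=0.0)
output = controller.update(setpoint=10.0, measured=4.0, dt=1.0)
```

With only the proportional gain set, the output is simply the gain times the error; the integral and derivative terms add memory of past error and sensitivity to its rate of change.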
- Automated Real-time Objects Detection in Colonoscopy Videos for Quality Measurements
- The effectiveness of colonoscopy depends on the quality of the inspection of the colon. There was no automated measurement method to evaluate the quality of the inspection. This thesis addresses this issue by investigating an automated post-procedure quality measurement technique and proposing a novel approach for automatically determining the percentage of stool areas in images of digitized colonoscopy video files. It involves the classification of image pixels based on their color features using a new method of planes on RGB (red, green and blue) color space. The limitation of post-procedure quality measurement is that quality measurements are available long after the procedure was done and the patient was released. A better approach is to report any sub-optimal inspection immediately so that the endoscopist can improve the quality in real-time during the procedure. This thesis also proposes an extension to the post-procedure method to detect stool, bite-block, and blood regions in real-time using color features in HSV color space. These three objects play a major role in quality measurements in colonoscopy. The proposed method partitions very large sets of positive examples of each of these objects into a number of groups. These groups are formed by taking the intersection of positive examples with a hyperplane. This hyperplane is named the 'positive plane'. 'Convex hulls' are used to model positive planes. Comparisons with traditional classifiers such as K-nearest neighbor (K-NN) and support vector machines (SVM) prove the soundness of the proposed method in terms of accuracy and speed, which are critical in the targeted real-time quality measurement system.
- Automated Syndromic Surveillance using Intelligent Mobile Agents
- Current syndromic surveillance systems utilize centralized databases that are scalable neither in storage space nor in computing power. Such systems are limited in the amount of syndromic data that may be collected and analyzed for the early detection of infectious disease outbreaks. However, with the increased prevalence of international travel, public health monitoring must extend beyond the borders of municipalities or states, which will require the ability to store vast amounts of data and significant computing power for analyzing the data. Intelligent mobile agents may be used to create a distributed surveillance system that will utilize the hard drives and central processing unit (CPU) power of the hosts on the agent network where the syndromic information is located. This thesis proposes the design of a mobile agent-based syndromic surveillance system and an agent decision model for outbreak detection. Simulation results indicate that mobile agents are capable of detecting an outbreak that occurs at all hosts the agent is monitoring. Further study of agent decision models is required to account for localized epidemics and variable agent movement rates.
- Automatic Tagging of Communication Data
- Globally distributed software teams are widespread throughout industry. But finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data. Thus having a web-based tool that can automatically analyze the communication patterns among global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying different types of group activities that occur within a global software development project. In order to achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project.
- Autonomic Failure Identification and Diagnosis for Building Dependable Cloud Computing Systems
- The increasingly popular cloud-computing paradigm provides on-demand access to computing and storage with the appearance of unlimited resources. Users are given access to a variety of data and software utilities to manage their work. Users rent virtual resources and pay for only what they use. In spite of the many benefits that cloud computing promises, the lack of dependability in shared virtualized infrastructures is a major obstacle for its wider adoption, especially for mission-critical applications. Virtualization and multi-tenancy increase system complexity and dynamicity. They introduce new sources of failure degrading the dependability of cloud computing systems. To assure cloud dependability, in my dissertation research, I develop autonomic failure identification and diagnosis techniques that are crucial for understanding emergent, cloud-wide phenomena and self-managing resource burdens for cloud availability and productivity enhancement. We study the runtime cloud performance data collected from a cloud test-bed and by using traces from production cloud systems. We define cloud signatures including those metrics that are most relevant to failure instances. We exploit profiled cloud performance data in both the time and frequency domains to identify anomalous cloud behaviors and leverage cloud metric subspace analysis to automate the diagnosis of observed failures. We implement a prototype of the anomaly identification system and conduct the experiments in an on-campus cloud computing test-bed and by using the Google datacenter traces. Our experimental results show that our proposed anomaly detection mechanism can achieve 93% detection sensitivity while keeping the false positive rate as low as 6.1% and outperform other tested anomaly detection schemes. In addition, the anomaly detector adapts itself by recursively learning from these newly verified detection results to refine future detection.
- Bayesian Probabilistic Reasoning Applied to Mathematical Epidemiology for Predictive Spatiotemporal Analysis of Infectious Diseases
- Probabilistic reasoning under uncertainty is well suited to the analysis of disease dynamics. The stochastic nature of disease progression is modeled by applying the principles of Bayesian learning. Bayesian learning predicts the disease progression, including prevalence and incidence, for a geographic region and demographic composition. Public health resources, prioritized by the order of risk levels of the population, will efficiently minimize the disease spread and curtail the epidemic as early as possible. A Bayesian network representing the outbreak of influenza and pneumonia in a geographic region is ported to a newer region with different demographic composition. Upon analysis for the newer region, the corresponding prevalence of influenza and pneumonia among the different demographic subgroups is inferred for the newer region. Bayesian reasoning coupled with disease timeline is used to reverse engineer an influenza outbreak for a given geographic and demographic setting. The temporal flow of the epidemic among the different sections of the population is analyzed to identify the corresponding risk levels. In comparison to spread vaccination, prioritizing the limited vaccination resources to the higher risk groups results in relatively lower influenza prevalence. HIV incidence in Texas from 1989-2002 is analyzed using demographic based epidemic curves. Dynamic Bayesian networks are integrated with probability distributions of HIV surveillance data coupled with the census population data to estimate the proportion of HIV incidence among the different demographic subgroups. Demographic based risk analysis leads to the observation of a varied spectrum of HIV risk among the different demographic subgroups. A methodology using hidden Markov models is introduced that enables investigation of the impact of social behavioral interactions on the incidence and prevalence of infectious diseases.
The methodology is presented in the context of simulated disease outbreak data for influenza. Probabilistic reasoning analysis enhances the understanding of disease progression in order to identify the critical points of surveillance, control and prevention.
- Boosting for Learning From Imbalanced, Multiclass Data Sets
Access: Use of this item is restricted to the UNT Community.
In many real-world applications, it is common to have an uneven number of examples among multiple classes. The data imbalance, however, usually complicates the learning process, especially for the minority classes, and results in deteriorated performance. Boosting methods were proposed to handle the imbalance problem. These methods need elongated training time and require diversity among the classifiers of the ensemble to achieve improved performance. Additionally, extending the boosting method to handle multi-class data sets is not straightforward. Examples of applications that suffer from imbalanced multi-class data can be found in face recognition, where tens of classes exist, and in capsule endoscopy, which suffers massive imbalance between the classes. This dissertation introduces RegBoost, a new boosting framework to address the imbalanced, multi-class problems. This method applies a weighted stratified sampling technique and incorporates a regularization term that accommodates multi-class data sets and automatically determines the error bound of each base classifier. The regularization parameter penalizes the classifier when it misclassifies instances that were correctly classified in the previous iteration. The parameter additionally reduces the bias towards majority classes. Experiments are conducted using 12 diverse data sets with moderate to high imbalance ratios. The results demonstrate superior performance of the proposed method compared to several state-of-the-art algorithms for imbalanced, multi-class classification problems. More importantly, the sensitivity improvement of the minority classes using RegBoost is accompanied by an improvement of the overall accuracy for all classes. With unpredictability regularization, a diverse group of classifiers is created and the maximum accuracy improvement reaches above 24%. Using stratified undersampling, RegBoost exhibits the best efficiency.
The reduction in computational cost is significant, reaching above 50%. As the volume of training data increases, the efficiency gain of the proposed method becomes more significant.
- A CAM-Based, High-Performance Classifier-Scheduler for a Video Network Processor.
- Classification and scheduling are key functionalities of a network processor. Network processors are equipped with application specific integrated circuits (ASIC), so that as IP (Internet Protocol) packets arrive, they can be processed directly without using the central processing unit. A new network processor is proposed called the video network processor (VNP) for real time broadcasting of video streams for IP television (IPTV). This thesis explores the challenge in designing a combined classification and scheduling module for a VNP. I propose and design the classifier-scheduler module which will classify and schedule data for the VNP. The proposed module discriminates between IP packets and video packets. The video packets are further processed for digital rights management (DRM). IP packets which carry regular traffic will traverse without any modification. A basic architecture of the VNP and an architecture of the classifier-scheduler module based on content addressable memory (CAM) and random access memory (RAM) have been proposed. The module has been designed and simulated in Xilinx 9.1i and built in the ISE simulator, with a throughput of 1.79 Mbps and a maximum working frequency of 111.89 MHz at a power dissipation of 33.6 mW. The code has been translated and mapped for the Spartan and Virtex families of devices.
- Capacity and Throughput Optimization in Multi-cell 3G WCDMA Networks
- User modeling enables the computation of the traffic density in a cellular network, which can be used to optimize the placement of base stations and radio network controllers as well as to analyze the performance of resource management algorithms towards meeting the final goal: the calculation and maximization of network capacity and throughput for different data rate services. An analytical model is presented for approximating the user distributions in multi-cell third generation wideband code division multiple access (WCDMA) networks using 2-dimensional Gaussian distributions by determining the means and the standard deviations of the distributions for every cell. This model allows for the calculation of the inter-cell interference and the reverse-link capacity of the network. An analytical model for optimizing capacity in multi-cell WCDMA networks is presented. Capacity is optimized for different spreading factors and for perfect and imperfect power control. Numerical results show that the SIR threshold for the received signals is decreased by 0.5 to 1.5 dB due to the imperfect power control. The results also show that the determined parameters of the 2-dimensional Gaussian model match well with traditional methods for modeling user distribution. A call admission control algorithm is designed that maximizes the throughput in multi-cell WCDMA networks. Numerical results are presented for different spreading factors and for several mobility scenarios. Our methods of optimizing capacity and throughput are computationally efficient, accurate, and can be implemented in large WCDMA networks.
- CLUE: A Cluster Evaluation Tool
- Modern high performance computing is dependent on parallel processing systems. Most current benchmarks reveal only the high level computational throughput metrics, which may be sufficient for single processor systems, but can lead to a misrepresentation of true system capability for parallel systems. A new benchmark is therefore proposed. CLUE (Cluster Evaluator) uses a cellular automata algorithm to evaluate the scalability of parallel processing machines. The benchmark also uses algorithmic variations to evaluate individual system components' impact on the overall serial fraction and efficiency. CLUE is not a replacement for other performance-centric benchmarks, but rather shows the scalability of a system and provides metrics to reveal where one can improve overall performance. CLUE is a new benchmark which demonstrates a better comparison among different parallel systems than existing benchmarks and can diagnose where a particular parallel system can be optimized.
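The abstract above refers to serial fraction and efficiency as scalability metrics but does not give formulas. One common way to compute them from a measured speedup is parallel efficiency together with the Karp-Flatt experimentally determined serial fraction. The sketch below assumes those standard definitions; whether CLUE uses exactly these formulas is an assumption, and the function names are hypothetical.

```python
def efficiency(speedup, processors):
    """Parallel efficiency: measured speedup divided by processor count."""
    return speedup / processors

def serial_fraction(speedup, processors):
    """Karp-Flatt metric: e = (1/S - 1/p) / (1 - 1/p).

    S is the measured speedup on p processors. A value near 0 means
    the run scaled almost perfectly; larger values indicate more
    serial work or overhead.
    """
    if processors < 2:
        raise ValueError("the metric is defined for 2 or more processors")
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)

# Illustrative numbers only: a speedup of 4 measured on 8 processors.
eff = efficiency(4.0, 8)
frac = serial_fraction(4.0, 8)
```

Tracking the serial fraction as the processor count grows helps separate inherent serial work (fraction stays flat) from growing parallel overhead (fraction rises).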
- CMOS Active Pixel Sensors for Digital Cameras: Current State-of-the-Art
- Image sensors play a vital role in many image sensing and capture applications. Among the various types of image sensors, complementary metal oxide semiconductor (CMOS) based active pixel sensors (APS), which are characterized by reduced pixel size, give fast readouts and reduced noise. APS are used in many applications such as mobile cameras, digital cameras, Webcams, and many consumer, commercial and scientific applications. With these developments and applications, CMOS APS designs are challenging the old and mature technology of charge-coupled device (CCD) sensors. With the continuous improvements of APS architecture and pixel designs, along with the development of nanometer CMOS fabrication technologies, APS are optimized for optical sensing. In addition, APS offer very low-power and low-voltage operation and are suitable for monolithic integration, thus allowing manufacturers to integrate more functionality on the array and build a low-cost camera-on-a-chip. In this thesis, I explore the current state-of-the-art of CMOS APS by examining various types of APS. I show design and simulation results of one of the most commonly used APS in consumer applications, i.e. photodiode based APS. I also present an approach for technology scaling of the devices in photodiode APS to present CMOS technologies. Finally, I present the most modern CMOS APS technologies by reviewing different design models. The design of the photodiode APS is implemented using commercial CAD tools.
- Comparative Study of RSS-Based Collaborative Localization Methods in Wireless Sensor Networks
- In this thesis two collaborative localization techniques are studied: multidimensional scaling (MDS) and maximum likelihood estimator (MLE). A synthesis of a new location estimation method through a serial integration of these two techniques, such that an estimate is first obtained using MDS and then MLE is employed to fine-tune the MDS solution, was the subject of this research using various simulation and experimental studies. In the simulations, important issues including the effects of sensor node density, reference node density and different deployment strategies of reference nodes were addressed. In the experimental study, the path loss model of indoor environments is developed by determining the environment-specific parameters from the experimental measurement data. Then, the empirical path loss model is employed in the analysis and simulation study of the performance of collaborative localization techniques.
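The abstract above mentions fitting an indoor path loss model with environment-specific parameters but does not name the model. A widely used choice for RSS-based ranging is the log-distance path loss model, sketched below purely as an assumption; the parameter names (`p0_dbm`, `n`, `d0_m`) and the example values are hypothetical and are not measurements from the thesis.

```python
import math

def rss_at_distance(d_m, p0_dbm, n, d0_m=1.0):
    """Log-distance model: RSS(d) = P0 - 10 * n * log10(d / d0).

    p0_dbm is the received power at reference distance d0_m and n is
    the path loss exponent fitted for the environment.
    """
    return p0_dbm - 10.0 * n * math.log10(d_m / d0_m)

def distance_from_rss(rss_dbm, p0_dbm, n, d0_m=1.0):
    """Invert the model to estimate distance from a measured RSS."""
    return d0_m * 10.0 ** ((p0_dbm - rss_dbm) / (10.0 * n))

# Illustrative values: -40 dBm at 1 m and a path loss exponent of 2.
rss = rss_at_distance(10.0, p0_dbm=-40.0, n=2.0)
estimated = distance_from_rss(rss, p0_dbm=-40.0, n=2.0)
```

Distances estimated this way between node pairs are the kind of input that collaborative methods such as MDS and MLE operate on.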
- Comparison and Evaluation of Existing Analog Circuit Simulator using Sigma-Delta Modulator
Access: Use of this item is restricted to the UNT Community.
In the world of VLSI (very large scale integration) technology, there are many different types of circuit simulators that are used to design and predict the circuit behavior before actual fabrication of the circuit. In this thesis, I compared and evaluated existing circuit simulators by considering standard benchmark circuits. The circuit simulators which I evaluated and explored are Ngspice, Tclspice, Winspice (open source) and Spectre® (commercial). I also tested standard benchmarks using these circuit simulators and compared their outputs. The simulators are evaluated using design metrics in order to quantify their performance and identify efficient circuit simulators. In addition, I designed a sigma-delta modulator and its individual components using the analog behavioral language Verilog-A. Initially, I performed simulations of individual components of the sigma-delta modulator and later of the whole system. Finally, CMOS (complementary metal-oxide semiconductor) transistor-level circuits were designed for the differential amplifier, operational amplifier and comparator of the modulator.
- Computational Epidemiology - Analyzing Exposure Risk: A Deterministic, Agent-Based Approach
- Many infectious diseases are spread through interactions between susceptible and infectious individuals. Keeping track of where each exposure to the disease took place, when it took place, and which individuals were involved in the exposure can give public health officials important information that they may use to formulate their interventions. Further, knowing which individuals in the population are at the highest risk of becoming infected with the disease may prove to be a useful tool for public health officials trying to curtail the spread of the disease. Epidemiological models are needed to allow epidemiologists to study the population dynamics of transmission of infectious agents and the potential impact of infectious disease control programs. While many agent-based computational epidemiological models exist in the literature, they focus on the spread of disease rather than exposure risk. These models are designed to simulate very large populations, representing individuals as agents, and using random experiments and probabilities in an attempt to more realistically guide the course of the modeled disease outbreak. The work presented in this thesis focuses on tracking exposure risk to chickenpox in an elementary school setting. This setting is chosen due to the high level of detailed information realistically available to school administrators regarding individuals' schedules and movements. Using an agent-based approach, contacts between individuals are tracked and analyzed with respect to both individuals and locations. The results are then analyzed using a combination of tools from computer science and geographic information science.
- A Computational Methodology for Addressing Differentiated Access of Vulnerable Populations During Biological Emergencies
- Mitigation response plans must be created to protect affected populations during biological emergencies resulting from the release of harmful biochemical substances. Medical countermeasures have been stockpiled by the federal government for such emergencies. However, it is the responsibility of local governments to maintain solid, functional plans to apply these countermeasures to the entire target population within short, mandated time frames. Further, vulnerabilities in the population may serve as barriers preventing certain individuals from participating in mitigation activities. Therefore, functional response plans must be capable of reaching vulnerable populations. Transportation vulnerability results from lack of access to transportation. Transportation vulnerable populations located too far from mitigation resources are at risk of not being able to participate in mitigation activities. Quantification of these populations requires the development of computational methods to integrate spatial demographic data and transportation resource data from disparate sources into the context of planned mitigation efforts. Research described in this dissertation focuses on quantifying transportation vulnerable populations and maximizing participation in response efforts. Algorithms developed as part of this research are integrated into a computational framework to promote a transition from research and development to deployment and use by biological emergency planners.
- Cross Language Information Retrieval for Languages with Scarce Resources
- Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system.
- Cuff-less Blood Pressure Measurement Using a Smart Phone
Access: Use of this item is restricted to the UNT Community.
Blood pressure is vital sign information that physicians often need as preliminary data for immediate intervention during emergency situations or for regular monitoring of people with cardiovascular diseases. Despite the availability of portable blood pressure meters in the market, they are not regularly carried by people, creating a need for an ultra-portable measurement platform or device that can be easily carried and used at all times. One such device is the smartphone, which, according to a comScore survey, is used by 26.2% of the US adult population. The mass production of these phones with built-in sensors and high computation power has created numerous possibilities for application development in different domains including biomedical. Motivated by this capability and their extensive usage, this thesis focuses on developing a blood pressure measurement platform on smartphones. Specifically, I developed a blood pressure measurement system on a smart phone using the built-in camera and a customized external microphone. The system consists of first obtaining heart beats using the microphone and finger pulse with the camera, and finally calculating the blood pressure using the recorded data. I developed techniques for finding the best location for obtaining the data, making the system usable by all categories of people. The proposed system resulted in accuracies between 90-100%, when compared to traditional blood pressure meters. The second part of this thesis presents a new system for remote heart beat monitoring using the smart phone. With the proposed system, heart beats can be transferred live by patients and monitored by physicians remotely for diagnosis. The proposed blood pressure measurement and remote monitoring systems will be able to facilitate information acquisition and decision making by 9-1-1 operators.
- DDoS Defense Against Botnets in the Mobile Cloud
- Mobile phone advancements and ubiquitous internet connectivity are resulting in ever-expanding possibilities in the application of smartphones. Users of mobile phones are now capable of hosting server applications from their personal devices. Whether providing services individually or in an ad hoc network setting, the devices are currently not configured for defending against distributed denial of service (DDoS) attacks. These attacks, often launched from a botnet, have existed in the space of personal computing for decades but have recently begun showing up on mobile devices. Research is done first into the steps required to develop a potential botnet on the Android platform. This includes testing the amount of malicious traffic an Android phone would be capable of generating for a DDoS attack. On the other end of the spectrum is the need for mobile devices running networked applications to develop security against DDoS attacks. For this, mobile phones are set up with web servers running Apache to simulate users running internet-connected applications, either for local ad hoc networks or for serving to the internet. Testing is done on the viability of using commonly available modules developed for Apache and intended for servers, as well as on the baseline capability of mobile phones to handle higher traffic volumes. Given the limited resources a mobile phone can dedicate to Apache compared to a dedicated hosting server, a new method was needed. A defense algorithm is developed for mitigating DDoS attacks against the mobile server that takes into account the limited resources available on the mobile device. The algorithm is tested against TCP socket flooding and shown to perform better than the common Apache module installations on a mobile device.
- Design and Analysis of Novel Verifiable Voting Schemes
- Free and fair elections are the basis for democracy, but conducting elections is not an easy task. Different groups of people try to influence the outcome of an election in their favor using a range of methods, from campaigning for a particular candidate to well-financed lobbying. Often the stakes are too high, and the methods are illegal. Two main properties of any voting scheme are the privacy of a voter’s choice and the integrity of the tally. Unfortunately, they are mutually exclusive. Integrity requires making elections transparent and auditable, but at the same time, we must preserve a voter’s privacy. It is always a trade-off between these two requirements. Current voting schemes favor privacy over auditability, and thus, they are vulnerable to voting fraud. I propose two novel voting systems that can achieve both privacy and verifiability. The first protocol is based on cryptographic primitives to ensure the integrity of the final tally and the privacy of the voter. The second protocol is a simple paper-based voting scheme that achieves almost the same level of security without the use of cryptography.
- Design and Implementation of Large-Scale Wireless Sensor Networks for Environmental Monitoring Applications
- Environmental monitoring represents a major application domain for wireless sensor networks (WSN). However, despite significant advances in recent years, there are still many challenging issues to be addressed to exploit the full potential of the emerging WSN technology. In this dissertation, we introduce the design and implementation of low-power wireless sensor networks for long-term, autonomous, and near-real-time environmental monitoring applications. We have developed an out-of-the-box solution consisting of a suite of software, protocols, and algorithms to provide reliable data collection with extremely low power consumption. Two wireless sensor networks based on the proposed solution have been deployed in remote field stations to monitor soil moisture along with other environmental parameters. As part of the ever-growing environmental monitoring cyberinfrastructure, these networks have been integrated into the Texas Environmental Observatory system for long-term operation. Environmental measurement and network performance results are presented to demonstrate the capability, reliability, and energy efficiency of the network.
- Design and Optimization of Components in a 45nm CMOS Phase Locked Loop
Access: Use of this item is restricted to the UNT Community.
A novel scheme for optimizing the individual components of a phase locked loop (PLL), which is used for stable clock generation and synchronization of signals, is considered in this work. Verilog-A is used for the high-level system design of the main components of the PLL, followed by component-wise optimization. The design of experiments (DOE) approach to optimizing the analog 45nm voltage controlled oscillator (VCO) is presented. A mixed-signal analysis using the analog and digital Verilog behavior of components is also studied. Overall, a high-level system design of a PLL, a systematic optimization of each of its components, and an analog and mixed-signal behavioral design approach have been implemented using Cadence custom IC design tools.
- The Design Of A Benchmark For Geo-stream Management Systems
- The recent growth in sensor technology allows easier information gathering in real time, as sensors have grown smaller, more accurate, and less expensive. The resulting data is often in a geo-stream format: continuously changing input with a spatial extent. Researchers developing geo-stream management systems (GSMS) require a benchmark system for evaluation, which is currently lacking. This thesis presents GSMark, a benchmark for evaluating GSMSs. GSMark provides a data generator that creates a combination of synthetic and real geo-streaming data, a workload simulator to present the data to the GSMS as a data stream, and a set of benchmark queries that evaluate typical GSMS functionality and query performance. In particular, GSMark generates both moving points and evolving spatial regions, two fundamental data types for a broad range of geo-stream applications, and the geo-streaming queries on this data.
- Detection of Temporal Events and Abnormal Images for Quality Analysis in Endoscopy Videos
- Recent reports suggest that objectively measuring quality is essential to the success of colonoscopy. Several quality indicators (i.e., metrics) proposed in recent studies are implemented in software systems that compute real-time quality scores for routine screening colonoscopy. Most quality metrics are derived from various temporal events that occur during the colonoscopy procedure. The location of the phase boundary between the insertion and withdrawal phases and the amount of circumferential inspection are two such important temporal events. These two temporal events can be determined by analyzing various camera motions of the colonoscope. This dissertation puts forward a novel method to estimate X, Y, and Z directional motions of the colonoscope using motion vector templates. Since abnormalities in a wireless capsule endoscopy (WCE) or colonoscopy video can be found in only a small number of frames (around 5% of total frames), it is very helpful if a computer system can decide whether a frame has any mucosal abnormalities. Also, the number of abnormal lesions detected during a procedure is used as a quality indicator. The majority of existing abnormality detection methods focus on detecting only one type of abnormality, or their overall accuracy is somewhat low when they try to detect multiple abnormalities. Most abnormalities in endoscopy images have unique textures that are clearly distinguishable from normal textures. In this dissertation, a new method is proposed that detects multiple abnormalities with higher accuracy using a multi-texture analysis technique. The multi-texture analysis method is designed by representing WCE and colonoscopy image textures as textons.
- Development, Implementation, and Analysis of a Contact Model for an Infectious Disease
- With growing concern about infectious diseases spreading in a population, epidemiology is becoming more important for the future of public health. In the past, epidemiologists used existing data from an outbreak to help them determine how an infectious disease might spread in the future. Now, with computational models, they are able to analyze data produced by these models to help with prevention and intervention plans. This paper looks at the design, implementation, and analysis of a computational model based on the interactions between individuals in a population. The design of the working contact model looks closely at the SEIR model used as the foundation and at the two timelines of a disease. The implementation of the contact model is reviewed with close attention to data structures. The analysis of the experiments provides evidence that this contact model can be used to help epidemiologists study the spread of an infectious disease based on the contact rate of individuals.
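The abstract above names the SEIR model and a contact rate but gives no equations. As a rough illustration only, the following is a minimal discrete-time compartmental SEIR sketch. It is not the author's individual-based contact model, and every name and parameter here (`simulate_seir`, `transmission_prob`, `incubation_days`, `infectious_days`) is a hypothetical choice made for this example.

```python
def simulate_seir(population, initial_infected, contact_rate,
                  transmission_prob, incubation_days, infectious_days, days):
    """Return a list of (S, E, I, R) tuples, one per day, starting at day 0.

    Assumptions for this sketch: homogeneous mixing, a fixed population,
    and one update per day. contact_rate is contacts per person per day.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    history = [(s, e, i, r)]
    for _ in range(days):
        # New exposures scale with contacts between susceptible and infectious people.
        new_exposed = contact_rate * transmission_prob * s * i / population
        new_exposed = min(new_exposed, s)   # cannot expose more people than remain susceptible
        new_infectious = e / incubation_days  # exposed individuals become infectious
        new_recovered = i / infectious_days   # infectious individuals recover
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    run = simulate_seir(1000, 10, contact_rate=8.0, transmission_prob=0.05,
                        incubation_days=5.0, infectious_days=7.0, days=60)
    peak_day = max(range(len(run)), key=lambda d: run[d][2])
    print("day of peak infections:", peak_day)
```

Raising or lowering `contact_rate` in a sketch like this changes how quickly the susceptible pool is depleted, which is the kind of dependence on contact rate the abstract describes.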
- Direct Online/Offline Digital Signature Schemes.
- Online/offline signature schemes are useful in many situations, and two such scenarios are considered in this dissertation: bursty server authentication and embedded device authentication. New techniques for online/offline signing are introduced and applied in a variety of ways to create online/offline signature schemes, and five different online/offline signature schemes that are proved secure under a variety of models and assumptions are proposed. Two of the five proposed schemes have the best offline or best online performance of any currently known technique, and are particularly well suited to the scenarios considered in this dissertation. To determine whether the proposed schemes provide the expected practical improvements, a series of experiments was conducted comparing the proposed schemes with each other and with other state-of-the-art schemes in this area, both on a desktop-class computer and under AVR Studio, a simulation platform for an 8-bit processor that is popular for embedded systems. Under AVR Studio, the proposed SGE scheme, using a typical key size for the embedded device authentication scenario, can complete the offline phase in about 24 seconds and then produce a signature (the online phase) in 15 milliseconds, which is the best offline performance of any known signature scheme that has been proven secure in the standard model. In the tests on a desktop-class computer, the proposed SGS scheme, which has the best online performance and is designed for the bursty server authentication scenario, generated 469,109 signatures per second, while the Schnorr scheme (the next best scheme in terms of online performance) generated only 223,548 signatures. The experimental results demonstrate that the SGE and SGS schemes are the most efficient techniques for embedded device authentication and bursty server authentication, respectively.
- A Driver, Vehicle and Road Safety System Using Smartphones
Access: Use of this item is restricted to the UNT Community.
As vehicle manufacturers continue to increase their emphasis on safety with advanced driver assistance systems (ADAS), I propose a ubiquitous device that is able to analyze and advise on safety conditions. Mobile smartphones are increasing in popularity among younger generations, with an estimated 64% of 25-34 year olds already using one in their daily lives. With over 10 million car accidents reported in the United States each year, car manufacturers have shifted their focus from a passive approach (airbags) to a more active one by adding features associated with ADAS (lane departure warnings). However, vehicles manufactured with these sensors are not economically priced, while older vehicles might have only passive safety features. Given its accessibility and portability, I target the mobile smartphone as a device to complement ADAS that can bring driver assistance to any vehicle without regard for any on-vehicle communication system requirements. I use the 3-axis accelerometer of multiple Android-based smartphones to record and analyze various safety factors that can influence a driver while operating a vehicle. These influences, with respect to the driver, vehicle, and road, are lane change maneuvers, vehicular comfort, and road conditions. Each factor could potentially be hazardous to the health of the driver, the neighboring public, and the automobile, and is therefore analyzed thoroughly, achieving 85.60% and 89.89% classification accuracy for identifying road anomalies and lane changes, respectively. Effective use of this data can educate a potentially dangerous driver on how to operate a vehicle safely and efficiently. With real-time analysis and auditory alerts of these factors, I hope to increase a driver's overall awareness to maximize safety.
- A Dual Dielectric Approach for Performance Aware Reduction of Gate Leakage in Combinational Circuits
- Design of systems in the low-end nanometer domain has introduced new dimensions in power consumption and dissipation in CMOS devices. With continued and aggressive scaling using thin SiO2 for the transistor gates, gate leakage due to gate oxide direct tunneling current has emerged as the major component of leakage in CMOS circuits. Therefore, providing a solution to the issue of gate oxide leakage has become one of the key concerns in achieving low power and high performance CMOS VLSI circuits. In this thesis, a new approach is proposed involving dual dielectrics of dual thicknesses (DKDT) for reducing both ON and OFF state gate leakage. It is claimed that the simultaneous utilization of SiON and SiO2, each with multiple thicknesses, is a better approach for gate leakage reduction than the conventional usage of a single gate dielectric (SiO2), possibly with multiple thicknesses. An algorithm is developed for DKDT assignment that minimizes the overall leakage for a circuit without compromising performance. Extensive experiments carried out on ISCAS'85 benchmarks using 45nm technology showed that the proposed approach can reduce the leakage by as much as 98% (89.5% on average) without degrading performance.
- Effective and Accelerated Informative Frame Filtering in Colonoscopy Videos Using Graphic Processing Units
- Colonoscopy is an endoscopic technique that allows a physician to inspect the mucosa of the human colon. Previous methods and software solutions to detect informative frames in a colonoscopy video (a process called informative frame filtering or IFF) have been largely ineffective in (1) covering the proper definition of an informative frame in the broadest sense and (2) striking an optimal balance between accuracy and speed of classification in both real-time and non-real-time medical procedures. In my thesis, I propose a more effective method and faster software solutions for IFF. The method is more effective due to the introduction of a heuristic classification algorithm derived from experimental analysis of typical colon features, which contributed to a 5-10% boost in various performance metrics for IFF. The software modules are faster due to the incorporation of parallel-processing-oriented coding techniques on modern microprocessors. Two IFF modules were created, one for post-procedure use and the other for real-time use. Code optimizations through NVIDIA CUDA for GPU processing and/or CPU multi-threading concepts embedded in two significant microprocessor design philosophies (multi-core design and many-core design) resulted in a 5-fold acceleration for the post-procedure module and a 40-fold acceleration for the real-time module. Some new software modules, which are still in the testing phase, have recently been created to exploit the power of multiple GPUs together.
- Elicitation of Protein-Protein Interactions from Biomedical Literature Using Association Rule Discovery
- Extracting information from a stack of data is a tedious task, and the scenario is no different in proteomics. Volumes of research papers are published about the study of various proteins in several species, their interactions with other proteins, and the identification of proteins as possible biomarkers in causing diseases. It is a challenging task for biologists to keep track of these developments manually by reading through the literature. Several tools have been developed by computational linguists to assist identification, extraction, and hypothesis generation of proteins and protein-protein interactions from biomedical publications and protein databases. However, they are confronted with the challenges of term variation, term ambiguity, access only to abstracts, and inconsistencies in time-consuming manual curation of protein and protein-protein interaction repositories. This work attempts to attenuate these challenges by extracting protein-protein interactions in humans and eliciting possible interactions using association rule mining on full text, abstracts, and figure captions from publicly available biomedical literature databases. Two such databases are used in our study: Directory of Open Access Journals (DOAJ) and PubMed Central (PMC). A corpus is built using articles based on search terms. A dataset of more than 38,000 protein-protein interactions from the Human Protein Reference Database (HPRD) is cross-referenced to validate discovered interactive pairs. A set of possible binary protein-protein interactions of an optimal size is generated and made available for clinical or biological validation. A significant change in the number of new associations was found by altering the thresholds for the support and confidence metrics. This study reduces the burden on biologists of keeping pace with the discovery of protein-protein interactions by manually reading the literature and of validating each and every possible interaction.
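The abstract above reports that the number of discovered associations changes with the support and confidence thresholds. The sketch below shows how those two metrics are conventionally computed for pairwise rules over co-occurrence data. It is an illustration under stated assumptions, not the study's pipeline: the function name `mine_pair_rules`, the representation of each document as a set of protein mentions, and the placeholder names `P1`, `P2`, `P3` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(documents, min_support, min_confidence):
    """Return sorted (lhs, rhs, support, confidence) rules for item pairs.

    documents: iterable of collections of item names (e.g. protein mentions).
    support(A, B)  = fraction of documents mentioning both A and B.
    confidence(A -> B) = documents with both A and B / documents with A.
    """
    documents = list(documents)
    n = len(documents)
    item_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        items = sorted(set(doc))          # de-duplicate mentions within a document
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # A pair yields two candidate rules, one in each direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Toy input with placeholder names; not data from the study.
    docs = [{"P1", "P2"}, {"P1", "P2", "P3"}, {"P1"}, {"P3"}]
    for rule in mine_pair_rules(docs, min_support=0.5, min_confidence=0.6):
        print(rule)
```

Loosening either threshold on the toy input admits more candidate pairs, which mirrors the sensitivity to thresholds noted in the abstract.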
- An Empirical Evaluation of Communication and Coordination Effectiveness in Autonomous Reactive Multiagent Systems
- This thesis describes experiments designed to measure the effect of collaborative communication on the task performance of a multiagent system. A discrete event simulation was developed to model a multiagent system completing a task to find and collect food resources, with the ability to substitute various communication and coordination methods. Experiments were conducted to find the effects of the various communication methods on completion of the task to find and harvest the food resources. Results show that communication decreases the time required to complete the task. However, not all communication methods fare equally well. In particular, results indicate that the communication model of the bee is a particularly effective method of agent communication and collaboration. Furthermore, results indicate that direct communication with additional information content provides better completion results. Cost-benefit models show some conflicting information, indicating that the increased performance may not offset the additional cost of achieving that performance.
- End of Insertion Detection in Colonoscopy Videos
- Colorectal cancer is the second leading cause of cancer-related deaths, behind lung cancer, in the United States. Colonoscopy is the preferred screening method for detection of diseases like colorectal cancer. In 2006, the American Society for Gastrointestinal Endoscopy (ASGE) and the American College of Gastroenterology (ACG) issued guidelines for quality colonoscopy. The guidelines suggest that on average the withdrawal phase during a screening colonoscopy should last a minimum of 6 minutes. My aim is to classify the colonoscopy video into insertion and withdrawal phases. The problem is that existing shot detection techniques cannot be applied because a colonoscopy is a single camera shot from start to end. An algorithm to detect the phase boundary has already been developed by the MIGLAB team. The existing method has acceptable levels of accuracy, but the main issue is its dependency on MPEG (Moving Picture Experts Group) 1/2. I implemented exhaustive search for motion estimation to reduce the execution time and improve the accuracy. I took advantage of the C/C++ programming languages with multithreading, which helped me achieve even better performance in terms of execution time. I propose a method for improving the current method of colonoscopy video analysis and an extension that makes it usable for real-time videos. The real-time version I implemented is capable of handling streams coming directly from the camera in the form of uncompressed bitmap frames. The existing implementation could not be applied to the real-time scenario because of its dependency on MPEG 1/2. Future directions of this research include improved motion search and GPU parallel computing techniques.
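The abstract above mentions exhaustive search for motion estimation on uncompressed frames without giving details. The following sketch shows the general technique (full-search block matching) for a single block. It is an illustration only, written in Python rather than the C/C++ the thesis used; the sum-of-absolute-differences cost, the frame representation as nested lists of grayscale values, and the names `sad` and `exhaustive_search` are assumptions of this example.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a shifted block in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def exhaustive_search(cur, ref, bx, by, size, radius):
    """Try every displacement within +/- radius and return ((dx, dy), cost).

    cur, ref: frames as lists of rows of grayscale values (same dimensions).
    (bx, by): top-left corner of the block in the current frame.
    Displacements that would read outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width and
                      0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


if __name__ == "__main__":
    # Synthetic frames: the current frame is the reference shifted by (2, 1).
    ref = [[y * 8 + x for x in range(8)] for y in range(8)]
    cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
            for x in range(8)] for y in range(8)]
    print(exhaustive_search(cur, ref, bx=2, by=2, size=3, radius=2))
```

Because every block can be searched independently, this kind of loop is a natural candidate for the multithreading and GPU parallelism the abstract mentions.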
- Energy-Aware Time Synchronization in Wireless Sensor Networks
- I present a time synchronization algorithm for wireless sensor networks that aims to conserve sensor battery power. The proposed method creates a hierarchical tree by flooding the sensor network from a designated source point. It then uses a hybrid algorithm derived from the timing-sync protocol for sensor networks (TPSN) and the reference broadcast synchronization method (RBS) to periodically synchronize sensor clocks while minimizing energy consumption. In multi-hop ad hoc networks, a depleted sensor will drop information from all other sensors that route data through it, decreasing the physical area being monitored by the network. The proposed method uses several techniques and thresholds to maintain network connectivity. A new root sensor is chosen when the current one's battery power decreases to a designated value. I implement this new synchronization technique using Matlab and show that it can provide significant power savings over both TPSN and RBS.
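For readers unfamiliar with TPSN, the sketch below shows the standard two-way (sender-receiver) timestamp exchange that TPSN uses between a child and its parent in the tree. It covers only that building block, not the hybrid algorithm or the energy thresholds described in the abstract, and it is written in Python rather than Matlab. The function name `two_way_offset` and the assumption of symmetric link delay are choices made for this example.

```python
def two_way_offset(t1, t2, t3, t4):
    """Estimate clock offset and one-way delay from a two-way message exchange.

    t1: child sends request (child clock)
    t2: parent receives request (parent clock)
    t3: parent sends reply (parent clock)
    t4: child receives reply (child clock)
    Assumes the propagation delay is the same in both directions.
    Returns (offset, delay), where offset is parent clock minus child clock.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


if __name__ == "__main__":
    # Parent clock runs 5 units ahead of the child; each hop takes 2 units.
    offset, delay = two_way_offset(t1=100, t2=107, t3=110, t4=107)
    print("offset:", offset, "delay:", delay)
    corrected_child_time = 107 + offset  # child adds the offset to its own clock
    print("corrected child time:", corrected_child_time)
```

Each such exchange costs two transmissions per child, which is one reason energy-aware schemes combine it with broadcast-based approaches such as RBS.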
- The enhancement of machine translation for low-density languages using Web-gathered parallel texts.
- The majority of the world's languages are poorly represented in informational media like radio, television, newspapers, and the Internet. Translation into and out of these languages may offer a way for speakers of these languages to interact with the wider world, but current statistical machine translation models are only effective with a large corpus of parallel texts - texts in two languages that are translations of one another - which most languages lack. This thesis describes the Babylon project which attempts to alleviate this shortage by supplementing existing parallel texts with texts gathered automatically from the Web -- specifically targeting pages that contain text in a pair of languages. Results indicate that parallel texts gathered from the Web can be effectively used as a source of training data for machine translation and can significantly improve the translation quality for text in a similar domain. However, the small quantity of high-quality low-density language parallel texts on the Web remains a significant obstacle.
- E‐Shape Analysis
- The motivation of this work is to understand E-shape analysis and how it can be applied to various classification tasks. Its key feature is that it looks not at what information is contained, but at how that information looks. This gives E-shape analysis the ability to be language independent and, to some extent, size independent. In this thesis, I present a new mechanism, called E-shape analysis for email, to characterize an email without using content or context. I explore the applications of the email shape through a case study, botnet detection, and two possible applications: spam filtering and social-context-based fingerprinting. The second part of this thesis applies E-shape analysis to the activity recognition of humans. Using the Android platform and a T-Mobile G1 phone, I collect data from the triaxial accelerometer and use it to classify the motion behavior of a subject.
- Evaluating Appropriateness of EMG and Flex Sensors for Classifying Hand Gestures
- Hand and arm gestures are a great way to communicate when you don't want to be heard: they are quieter and often more reliable than whispering into a radio microphone. In recent years, hand gesture identification has become a major active area of research due to its use in various applications. The objective of my work is to develop an integrated sensor system that will enable tactical squads and SWAT teams to communicate in the absence of a line of sight or in the presence of obstacles. The gesture set involved in this work is the standardized hand signals for close range engagement operations used by military and SWAT teams. The gestures are broadly divided into finger movements and arm movements. The core components of the integrated sensor system are surface EMG sensors, flex sensors, and accelerometers. Surface EMG is the electrical activity produced by muscle contractions and measured by sensors directly attached to the skin. Bend sensors use a piezoresistive material to detect the bend. The sensor output is determined by both the angle between the ends of the sensor and the flex radius. Accelerometers sense the dynamic acceleration and inclination in 3 directions simultaneously. EMG sensors are placed on the upper and lower forearm and assist in the classification of the finger and wrist movements. Bend sensors are mounted on a glove that is worn on the hand. The sensors are located over the first knuckle of each finger and can determine whether the finger is bent. An accelerometer is attached to the glove at the base of the wrist and determines the speed and direction of the arm movement. A support vector machine (SVM) classifier is used to classify the gestures.
- Evaluating the Scalability of SDF Single-chip Multiprocessor Architecture Using Automatically Parallelizing Code
- Advances in integrated circuit technology continue to provide more and more transistors on a chip. Computer architects are faced with the challenge of finding the best way to translate these resources into high performance. The challenge in the design of the next generation CPU (central processing unit) lies not in trying to use up the silicon area, but in finding smart ways to make use of the wealth of transistors now available. In addition, the next generation architecture should offer high throughput, scalability, modularity, and low energy consumption, rather than being suitable for only one class of applications or users or emphasizing only a faster clock rate. A program exhibits different types of parallelism: instruction level parallelism (ILP), thread level parallelism (TLP), or data level parallelism (DLP). Likewise, architectures can be designed to exploit one or more of these types of parallelism. It is generally not possible to design architectures that can take advantage of all three types of parallelism without using very complex hardware structures and complex compiler optimizations. We present the state-of-the-art SDF (scheduled dataflow) architecture, which exploits as much TLP as the application supplies. We implement an SDF single-chip multiprocessor constructed from simpler processors and execute automatically parallelized applications on the single-chip multiprocessor. SDF has many desirable features, such as high throughput, scalability, and low power consumption, which meet the requirements of next generation CPU design. Compared with superscalar, VLIW (very long instruction word), and SMT (simultaneous multithreading) architectures, the experimental results show that SDF is comparable to other architectures for applications with very little parallelism and outperforms them for applications with large amounts of parallelism.
- Exploration of Visual, Acoustic, and Physiological Modalities to Complement Linguistic Representations for Sentiment Analysis
Access: Use of this item is restricted to the UNT Community.
This research is concerned with the identification of sentiment in multimodal content. This is of particular interest given the increasing presence of subjective multimodal content on the web and other sources, which contains a rich and vast source of people's opinions, feelings, and experiences. Despite the need for tools that can identify opinions in the presence of diverse modalities, most current methods for sentiment analysis are designed for textual data only, and few attempts have been made to address this problem. The dissertation investigates techniques for augmenting linguistic representations with acoustic, visual, and physiological features. The potential benefits of using these modalities include linguistic disambiguation, visual grounding, and the integration of information about people's internal states. The main goal of this work is to build computational resources and tools that allow sentiment analysis to be applied to multimodal data. This thesis makes three important contributions. First, it shows that modalities such as audio, video, and physiological data can be successfully used to improve existing linguistic representations for sentiment analysis. We present a method that integrates linguistic features with features extracted from these modalities. Features are derived from verbal statements, audiovisual recordings, thermal recordings, and physiological sensor signals. The resulting multimodal sentiment analysis system is shown to significantly outperform the use of language alone. Using this system, we were able to predict the sentiment expressed in video reviews and also the sentiment experienced by viewers while exposed to emotionally loaded content. Second, the thesis provides evidence of the portability of the developed strategies to other affect recognition problems. We provide support for this by studying the deception detection problem. Third, this thesis contributes several multimodal datasets that will enable further research in sentiment and deception detection.
- Exploring Memristor Based Analog Design in Simscape
- With conventional CMOS technologies approaching their scaling limits, researchers are actively investigating alternative technologies to meet ever-increasing computing and mobile demand. A number of different technologies are currently being studied by different research groups. In the last decade, one-dimensional (1D) carbon nanotubes (CNT), which are graphene rolled into tubular form, two-dimensional (2D) graphene, and zero-dimensional (0D) fullerenes have been the subject of intensive research. In 2008, HP Labs announced a ground-breaking fabrication of memristors, the fourth fundamental element postulated by Chua at the University of California, Berkeley in 1971. In the last few years, the memristor has gained a lot of attention from the research community. In-depth studies of the memristor and its analog behavior have convinced the community that it has potential in future nano-architectures for the optimization of high-density memory and neuromorphic computing architectures. The objective of this thesis is to explore memristors for analog and mixed-signal system design using Simscape. This thesis presents a memristor model in the Simscape language. Simscape has been used because it has the potential for modeling large systems. A memristor-based programmable oscillator is also presented with simulation results and characterization. In addition, simulation results of different memristor models are presented, which are crucial for a detailed understanding of the memristor and its properties.
- Exploring Privacy in Location-based Services Using Cryptographic Protocols
- Location-based services (LBS) are available on a variety of mobile platforms such as cell phones and PDAs, and an increasing number of users subscribe to and use these services. Two popular models of information flow in LBS are the client-server model and the peer-to-peer model; in both, existing approaches do not always provide privacy for all parties concerned. In this work, I study the feasibility of applying cryptographic protocols to design privacy-preserving solutions for LBS from an experimental and theoretical standpoint. In the client-server model, I construct a two-phase framework for processing nearest neighbor queries using combinations of cryptographic protocols such as oblivious transfer and private information retrieval. In the peer-to-peer model, I present privacy-preserving solutions for processing group nearest neighbor queries in the semi-honest and dishonest adversarial models. I apply concepts from secure multi-party computation to realize my constructions and also leverage the capabilities of trusted computing technology, specifically TPM chips. My solution for the dishonest adversarial model is also of independent cryptographic interest. I prove my constructions secure under standard cryptographic assumptions, design experiments for testing the feasibility or practicability of my constructions, and benchmark key operations. My experiments show that the proposed constructions are practical to implement and have reasonable costs, while providing strong privacy assurances.
- Exploring Process-Variation Tolerant Design of Nanoscale Sense Amplifier Circuits
- Sense amplifiers are important circuit components of a dynamic random access memory (DRAM), which forms the main memory of digital computers. The importance of the sense amplifier's ability to detect and amplify voltage signals to correctly interpret data in DRAM cells cannot be overstated. The sense amplifier plays a significant role in the overall speed of the DRAM. Sense amplifiers require matched transistors for optimal performance. Hence, the effects of mismatch through process variations must be minimized. This thesis presents research leading to optimal nanoscale CMOS sense amplifiers by incorporating the effects of process variation early in the design process. The effects of process variation on the performance of a standard voltage sense amplifier, which is used in conventional DRAMs, are studied. Parametric analysis is performed through circuit simulations to investigate which parameters have the most impact on the performance of the sense amplifier. The figures-of-merit (FoMs) used to characterize the circuit are the precharge time, power dissipation, sense delay, and sense margin. Statistical analysis is also performed to study the impact of process variations on each FoM. By analyzing the results from the statistical study, a method is presented to select parameter values that minimize the effects of process variation. A design flow algorithm incorporating dual oxide and dual threshold voltage based techniques is used to optimize the FoMs for the sense amplifier. Experimental results show that the proposed approach improves precharge time by 83.9%, sense delay by 80.2%, sense margin by 61.9%, and power dissipation by 13.1%.
- Exploring Trusted Platform Module Capabilities: A Theoretical and Experimental Study
- Trusted platform modules (TPMs) are hardware modules bound to a computer's motherboard that are being included in many desktops and laptops. Augmenting computers with these hardware modules adds powerful functionality in distributed settings, allowing us to reason about the security of these systems in new ways. In this dissertation, I study the functionality of TPMs from a theoretical as well as an experimental perspective. On the theoretical front, I leverage various features of TPMs to construct applications like random oracles that are impossible to implement in a standard model of computation. Apart from random oracles, I construct a new cryptographic primitive that is essentially a non-interactive form of the standard cryptographic primitive of oblivious transfer. I apply this new primitive to secure mobile agent computations, where interaction between various entities is typically required to ensure security. I prove these constructions are secure using standard cryptographic techniques and assumptions. To test the practicability of these constructions and their applications, I performed an experimental study, both on an actual TPM and on a software TPM simulator that has been enhanced to reflect timings from a real TPM. This allowed me to benchmark the performance of the applications and test the feasibility of the proposed extensions to standard TPMs. My tests also show that these constructions are practical.
- Extrapolating Subjectivity Research to Other Languages
- Socrates articulated it best, "Speak, so I may see you." Indeed, language represents an invisible probe into the mind. It is the medium through which we express our deepest thoughts, our aspirations, our views, our feelings, our inner reality. From the beginning of artificial intelligence, researchers have sought to impart human-like understanding to machines. Much of our language represents a form of self-expression, capturing thoughts, beliefs, evaluations, opinions, and emotions that are not available for scrutiny by an outside observer. In the field of natural language processing, research involving these aspects has crystallized under the name of subjectivity and sentiment analysis. While subjectivity classification labels text as either subjective or objective, sentiment classification further divides subjective text into positive, negative, or neutral. In this thesis, I investigate techniques for generating tools and resources for subjectivity analysis that do not rely on an existing natural language processing infrastructure in a given language. This constraint is motivated by the fact that the vast majority of human languages are scarce from an electronic point of view: they lack basic tools such as part-of-speech taggers and parsers, or basic resources such as electronic text, annotated corpora, or lexica. This severely limits the implementation of techniques on par with those developed for English. By applying methods that make lighter use of text processing infrastructure, we are able to conduct multilingual subjectivity research in these languages as well. Since my aim is also to minimize the amount of manual work required to develop lexica or corpora in these languages, the techniques proposed employ a lever approach, where English often acts as the donor language (the fulcrum in a lever) and allows preliminary subjectivity research to be established in a target language with a relatively minimal amount of effort.
- Finding Meaning in Context Using Graph Algorithms in Mono- and Cross-lingual Settings
- Making computers automatically find the appropriate meaning of words in context is an interesting problem that has proven to be one of the most challenging tasks in natural language processing (NLP). A solution to the problem would have widespread potential applications in several NLP tasks such as text simplification, language learning, machine translation, query expansion, information retrieval, and text summarization. Ambiguity of words has always been a challenge in these applications, and the traditional endeavor to solve the problem of this ambiguity, namely doing word sense disambiguation using resources like WordNet, has been fraught with debate about the feasibility of the granularity that exists in WordNet senses. The recent trend has therefore been to move away from enforcing any given lexical resource upon automated systems from which to pick potential candidate senses, and to instead encourage them to pick and choose their own resources. Given a sentence with a target ambiguous word, an alternative solution consists of picking potential candidate substitutes for the target, filtering the list of candidates to a much shorter list using various heuristics, and trying to match these system predictions against a human-generated gold standard, with a view to ensuring that the meaning of the sentence does not change after the substitutions. This solution has manifested itself in the SemEval 2007 task of lexical substitution and the more recent SemEval 2010 task of cross-lingual lexical substitution (which I helped organize), where, given an English context and a target word within that context, the systems are required to provide between one and ten appropriate substitutes (in English) or translations (in Spanish) for the target word. In this dissertation, I present a comprehensive overview of state-of-the-art research and describe new experiments to tackle the tasks of lexical substitution and cross-lingual lexical substitution.
In particular, I attempt to answer some research questions pertinent to the tasks, mostly focusing on completely unsupervised approaches. I present a new framework for unsupervised lexical substitution using graphs and centrality algorithms. An additional novelty in this approach is the use of directional similarity rather than the traditional, symmetric word similarity. The thesis also explores the extension of the monolingual framework into a cross-lingual one, and examines how well this cross-lingual framework can work for the monolingual lexical substitution and cross-lingual lexical substitution tasks. A comprehensive set of comparative investigations is presented amongst supervised and unsupervised methods, several graph-based methods, and the use of monolingual and multilingual information.
- Flexible Digital Authentication Techniques
- This dissertation investigates authentication techniques in some emerging areas. Specifically, authentication schemes have been proposed that are well-suited for embedded systems and privacy-respecting pay Web sites. With embedded systems, a person could own several devices which are capable of communication and interaction, but these devices use embedded processors whose computational capabilities are limited compared to desktop computers. Examples of this scenario include entertainment devices or appliances owned by a consumer, multiple control and sensor systems in an automobile or airplane, and environmental controls in a building. An efficient public key cryptosystem has been devised, which provides a complete solution to an embedded system, including protocols for authentication, authenticated key exchange, encryption, and revocation. The new construction is especially suitable for devices with constrained computing capabilities and resources. Compared with other available authentication schemes, such as X.509 and identity-based encryption, the new construction provides unique features such as simplicity, efficiency, forward secrecy, and an efficient re-keying mechanism. In the application scenario for a pay Web site, users may be sensitive about their privacy, and do not wish their behaviors to be tracked by Web sites. Thus, an anonymous authentication scheme is desirable in this case. That is, a user can prove his/her authenticity without revealing his/her identity. On the other hand, the Web site owner would like to prevent a group of users from sharing a single subscription while hiding behind user anonymity. The Web site should be able to detect these possible malicious behaviors, and exclude corrupted users from future service. This dissertation extensively discusses anonymous authentication techniques, such as group signature, direct anonymous attestation, and traceable signature.
Three anonymous authentication schemes have been proposed, which include a group signature scheme with signature claiming and variable linkability, a scheme for direct anonymous attestation in trusted computing platforms with sign and verify protocols nearly seven times more efficient than the current solution, and a state-of-the-art traceable signature scheme with support for variable anonymity. These three schemes greatly advance research in the area of anonymous authentication. The authentication techniques presented in this dissertation are based on common mathematical and cryptographic foundations, sharing similar security assumptions. We call them flexible digital authentication schemes.
- Force-Directed Graph Drawing and Aesthetics Measurement in a Non-Strict Pure Functional Programming Language
- Non-strict pure functional programming often requires redesigning algorithms and data structures to work more effectively under new constraints of non-strict evaluation and immutable state. Graph drawing algorithms, while numerous and broadly studied, have no presence in the non-strict pure functional programming model. Additionally, there is currently no freely licensed standalone toolkit used to quantitatively analyze aesthetics of graph drawings. This thesis addresses two previously unexplored questions. Can a force-directed graph drawing algorithm be implemented in a non-strict functional language, such as Haskell, and still be practically usable? Can an easily extensible aesthetic measuring tool be implemented in a language such as Haskell and still be practically usable? The focus of the thesis is on implementing one of the simplest force-directed algorithms, that of Fruchterman and Reingold, and comparing its resulting aesthetics to those of a well-known C++ implementation of the same algorithm.
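For readers unfamiliar with the Fruchterman and Reingold algorithm named in the last abstract, the following is a minimal illustrative sketch in Python, not the thesis's Haskell implementation. It assumes the commonly described form of the algorithm: repulsive force k²/d between every pair of vertices, attractive force d²/k along each edge, and a linearly cooling temperature that caps per-step movement. The function names `fr_layout` and `edge_length_cv`, the parameter defaults, and the choice of edge-length variation as an aesthetics measure are all assumptions made for illustration.

```python
import math
import random


def fr_layout(nodes, edges, width=1.0, height=1.0, iterations=50, seed=0):
    """Return {node: (x, y)} after a basic force-directed layout (sketch)."""
    rng = random.Random(seed)
    # Ideal pairwise distance derived from the frame area per vertex.
    k = math.sqrt(width * height / len(nodes))
    pos = {v: (rng.uniform(0, width), rng.uniform(0, height)) for v in nodes}
    temp = width / 10.0                  # assumed initial temperature
    cooling = temp / (iterations + 1)    # linear cooling schedule

    for _ in range(iterations):
        disp = {v: (0.0, 0.0) for v in nodes}

        # Repulsion between every ordered pair of distinct vertices.
        for v in nodes:
            for u in nodes:
                if u == v:
                    continue
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = k * k / d
                disp[v] = (disp[v][0] + dx / d * f, disp[v][1] + dy / d * f)

        # Attraction along each edge pulls both endpoints together.
        for u, v in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = d * d / k
            disp[v] = (disp[v][0] - dx / d * f, disp[v][1] - dy / d * f)
            disp[u] = (disp[u][0] + dx / d * f, disp[u][1] + dy / d * f)

        # Build a fresh position map: move at most `temp`, stay in the frame.
        new_pos = {}
        for v in nodes:
            dx, dy = disp[v]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temp)
            x = min(width, max(0.0, pos[v][0] + dx / d * step))
            y = min(height, max(0.0, pos[v][1] + dy / d * step))
            new_pos[v] = (x, y)
        pos = new_pos
        temp -= cooling

    return pos


def edge_length_cv(pos, edges):
    """Coefficient of variation of edge lengths (lower means more uniform)."""
    lengths = [math.dist(pos[u], pos[v]) for u, v in edges]
    mean = sum(lengths) / len(lengths)
    var = sum((length - mean) ** 2 for length in lengths) / len(lengths)
    return math.sqrt(var) / mean


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
    layout = fr_layout(nodes, edges)
    print(layout)
    print(edge_length_cv(layout, edges))
```

Each iteration builds a new position map rather than mutating the old one, which loosely mirrors the immutable-state constraint the abstract describes. A lazy functional implementation would differ in structure.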