Structure and its impact for recognition

Publications

Authors Title
M. Bresler, T. Van Phan, D. Průša, M. Nakagawa, V. Hlaváč Recognition System for On-line Sketched Diagrams
In Guerrero, Juan E., editors, ICFHR 2014: Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition, 2014, 563 - 568. ISBN 978-1-4799-4334-0.
Abstract: We present our recent model of a diagram recognition engine. It extends our previous work which approaches the structural recognition as an optimization problem of choosing the best subset of symbol candidates. The main improvement is the integration of our own text separator into the pipeline to deal with text blocks occurring in diagrams. Second improvement is splitting the symbol candidates detection into two stages: uniform symbols detection and arrows detection. Text recognition is left for postprocessing when the diagram structure is already known. Training and testing of the engine was done on a freely available benchmark database of flowcharts. We correctly segmented and recognized 93.0% of the symbols having 55.1% of the diagrams recognized without any error. Considering correct stroke labeling, we achieved the precision of 95.7%. This result is superior to the state-of-the-art method with the precision of 92.4 %. Additionally, we demonstrate the generality of the proposed method by adapting the system to finite automata domain and evaluating it on own database of such diagrams.
Links: BibTeX, DOI
P. Černo Grammatical Inference of Lambda-Confluent Context Rewriting Systems
Technical report 2015/1/KSVI Department of Software and Computer Science Education, Faculty of Mathematics and Physics, Charles University, 2015.
Abstract: Although a lot of methods have been proposed to learn regular languages, learning more complex language classes is still a challenging task. One attractive approach is to consider non-classical formalisms that provide alternative ways of representing interesting classes of languages and give rise to new promising learning techniques. To this end we use the so-called context rewriting systems, which are defined as string-rewriting systems extended by contexts, and propose a novel learning algorithm for inferring various restricted classes of lambda-confluent context rewriting systems. We show that, under certain conditions, it is possible to identify in polynomial time (but not polynomial data) any target lambda-confluent context rewriting system with minimal width of instructions from informant. We discuss the complexity of the learning algorithm, relate the considered language classes to the Chomsky hierarchy and finally raise some open questions and further research directions.
Links: BibTeX, Fulltext
L. Krtek and F. Mráz Two-Dimensional Limited Context Restarting Automata
In S. Bensch, R. Freund, F. Otto, editors, Sixth Workshop on Non-Classical Models for Automata and Applications NCMA 2014, Kassel, Germany. Proceedings, 2014, Vol. 304, 147 - 162. ISBN 978-3-85403-304-2.
Abstract: Motivated by possible machine learning of picture languages, we introduce a new two-dimensional automaton called two-dimensional limited context restarting automaton. Our model works similarly to the two-dimensional restarting tiling automaton, yet we show that it is equally powerful as the two-dimensional sgraffito automaton.
Links: BibTeX
F. Mráz Limited Restarting Automata
In S. Bensch, R. Freund, F. Otto, editors, Sixth Workshop on Non-Classical Models for Automata and Applications NCMA 2014, Kassel, Germany. Proceedings, 2014, Vol. 304, 29 - 56. ISBN 978-3-85403-304-2.
Abstract: Restarting automata were introduced as a tool for modelling a method of checking correctness of sentences called analysis by reduction. The first ten years of research on restarting automata brought out their interesting properties and introduced several extensions to the original model. Here we concentrate on recent developments in the opposite direction – from complex to simpler models. Such development aims to find automata models which are somewhat limited in their power but have other desired properties, like e.g. simpler definition, more effective recognition or learnability. All these properties are crucial for possible applications of restarting automata.
Links: BibTeX
F. Mráz, F. Otto Ordered Restarting Automata for Picture Languages
In V. Geffert, B. Preneel, B. Rovan, J. Stuller. A Min Tjoa, editors, SOFSEM 2014: Theory and Practice of Computer Science, 2014, LNCS 8327, 431 - 442. ISBN 978-3-319-04297-8.
Abstract: We introduce a two-dimensional variant of the restarting automaton with window size three-by-three for processing rectangular pictures. In each rewrite step such an automaton can only replace the symbol in the middle position of its window by a symbol that is smaller with respect to a fixed ordering on the tape alphabet. When restricted to one-dimensional inputs (that is, words) the deterministic variant of these ordered restarting automata only accepts regular languages, while the nondeterministic one can accept some languages that are not even context-free. We then concentrate on the deterministic two-dimensional ordered restarting automaton, showing that it is quite expressive as it can simulate the deterministic sgraffito automaton, and we present some closure and non-closure properties for the class of picture languages accepted by these automata.
Links: BibTeX, DOI
F. Mráz, F. Otto, M. Plátek Free Word-Order and Restarting Automata
Fundamenta Informaticae, 2014, Vol. 133, No. 4, 399 - 419.
Abstract: In natural languages with a high degree of word-order freedom, syntactic phenomena like dependencies (subordinations) or valences do not depend on the word-order (or on the individual positions of the individual words). This means that some permutations of sentences of these languages are in some (important) sense syntactically equivalent. Here we study this phenomenon in a formal way. Various types of j-monotonicity for restarting automata can serve as parameters for the degree of word-order freedom and for the complexity of word-order in sentences (languages). Here we combine two types of parameters on computations of restarting automata: the degree of j-monotonicity, and the number of rewrites per cycle. We study these notions formally in order to obtain an adequate tool for modelling and comparing formal descriptions of (natural) languages with different degrees of word-order freedom and word-order complexity.
Links: BibTeX, DOI
I. Mrázová, Z. Petříčková Fast Sensitivity-Based Training of BP-Networks
In S. Wermter et.al., editors, Artificial Neural Networks and Machine Learning ICANN 2014. Proceedings, 2014, LNCS 8681, 507 – 514. ISBN 978-3-319-11178-0.
Abstract: Sensitivity analysis became an acknowledged tool used to study the performance of artificial neural networks. Sensitivity analysis allows to assess the influence, e.g., of each neuron or weight on the final network output. In particular various feature selection and pruning strategies are based on this capability. In this paper, we will present a new approximative sensitivity-based training algorithm yielding robust neural networks with generalization capabilities comparable to its exact analytical counterpart, yet much faster.
Links: BibTeX, DOI
I. Mrázová, P. Zvirinský Mining the Czech Insolvency Proceedings Data
In Cihan H. Dagli, editors, Proceedings of the Complex Adaptive Systems 2014 Conference - Conquering Complexity: Challenges and Opportunities, 2014, Procedia Computer Science, Vol. 36, 308 - 313.
Abstract: The Global Financial Crisis of 2008 has left behind it many victims worldwide – both among bankrupt companies and indebted people with a grim future ahead. On January 1, 2008, the government of the Czech Republic launched a new information system called Insolvency Register of the Czech Republic. Meanwhile, the Czech Insolvency Register contains publicly available data concerning more than 100 000 insolvency proceedings. Modern data mining methods quite naturally represent an appealing approach to analyze these huge amounts of open source data. In this context, especially the techniques of Bayesian networks and social network analysis seem to reveal several new socio-economic patterns present in the current Czech society.
Links: BibTeX, DOI
F. Otto, P. Černo, F. Mráz On the Classes of Languages Accepted by Limited Context Restarting Automata
RAIRO - Theoretical Informatics and Applications, 2014, Vol. 48(1), 61 – 84.
Abstract: In the literature various types of restarting automata have been studied that are based on contextual rewriting. A word w is accepted by such an automaton if, starting from the initial configuration that corresponds to input w, the word w is reduced to the empty word by a finite number of applications of these contextual rewritings. This approach is reminiscent of the notion of McNaughton families of languages. Here we put the aforementioned types of restarting automata into the context of McNaughton families of languages, relating the classes of languages accepted by these automata in particular to the class GCSL of growing context-sensitive languages and to the class CRL of Church–Rosser languages.
Links: BibTeX, DOI
F. Otto, F. Mráz Extended Two-Way Ordered Restarting Automata for Picture Languages
In Adrian Horia Dediu et.al., editors, Language and Automata Theory and Applications -- 8th International Conference, LATA 2014, LNCS 8370, 541 - 552. ISBN 978-3-319-04920-5.
Abstract: We introduce a two-dimensional variant of the deterministic restarting automaton for processing rectangular pictures. Our device has a window of size three-by-three, in a rewrite step it can only replace the symbol in the central position of its window by a symbol that is smaller with respect to a fixed ordering on the tape alphabet, and it can only perform (extended) move-right and move-down steps. This automaton is strictly more expressive than the deterministic Sgraffito automaton, but its word problem can still be solved in polynomial time, and when restricted to one-dimensional input, it only accepts the regular languages.
Links: BibTeX, DOI
F. Otto, F. Mráz Lambda-Confluence for Context Rewriting Systems
Accepted for publication, in press, published on-line since Jan 22nd 2015.
Abstract: Clearing restarting automata and limited context restarting automata are particular types of context rewriting systems. A word w is accepted by such a system if there is a sequence of rewritings that reduces the word w to the empty word lambda, where each rewrite rule is extended by certain context conditions. If each rewrite step is strictly length-reducing, as for example in the case of clearing restarting automata, then the word problem for such a system can be solved nondeterministically in quadratic time. If, in addition, the contextual rewritings happen to be lambda-confluent, that is, confluent on the congruence class of the empty word, then the word problem can be solved deterministically in linear time. Here we show that lambda-confluence is decidable in polynomial time for limited context restarting automata of type R2, but that this property is not even recursively enumerable for clearing restarting automata. The latter follows from the fact that lambda-confluence is not recursively enumerable for finite factor-erasing string-rewriting systems.
Links: BibTeX, DOI
M. Plátek, D. Pardubská, M. Lopatková On Minimalism of Analysis by Reduction by Restarting Automata
In Glyn Morrill et.al., editors, Formal Grammar - 19th International Conference, FG 2014, Proceedings., LNCS 8612, 155 - 170.
Abstract: The paper provides linguistic observations as a motivation for a formal study of an analysis by reduction. It concentrates on a study of the whole mechanism through a class of restarting automata with meta-instructions using pebbles, with delete and shift operations (DSautomata). Four types of (in)finite sets defined by these automata are considered as linguistically relevant: basic languages on word forms marked with grammatical categories, proper languages on unmarked word forms, categorial languages on grammatical categories, and sets of reductions (reduction languages). The equivalence of proper languages is considered for a weak equivalence of DS-automata, and the equivalence of reduction languages for a strong equivalence of DS-automata. The complexity of a language is naturally measured by the number of pebbles, the number of deletions, and the number of word order shifts used in a single reduction step. We have obtained unbounded hierarchies (scales) for all four types of classes of finite languages considered here, as well as for Chomsky’s classes of infinite languages. The scales make it possible to estimate relevant complexity issues of analysis by reduction for natural languages.
Links: BibTeX, DOI
D. Průša Non-recursive Trade-offs between Two-Dimensional Automata and Grammars
In H. Jürgensen et.al, editors, DCFS 2014: Proceedings of the 16th International Workshop on Descriptional Complexity of Formal Systems, 2014, LNCS 8614, 352 - 363. ISBN 978-3-319-09703-9.
Abstract: We study succinctness of descriptional systems for picture languages. Basic models of two-dimensional finite automata and generalizations of context-free grammars are considered. It is shown t hat non-recursive trade-offs between the systems are very common. The results are based on the ability of th e systems to simulate Turing machines.
Links: BibTeX, DOI
D. Průša Weight-Reducing Hennie Machines and Their Descriptional Complexity
In Adrian Horia Dediu et.al., editors, LATA 2014: Proceedings of the 8th International Conference on Language and Automata Theory and Applications, 2014, LNCS 8370, 553 - 564. ISBN 978-3-319-04920-5.
Abstract: We present a constructive variant of the Hennie machine. It is demonstrated how it can facilitate the design of finite-state machines. We focus on the d eterministic version of the model and study its descriptional complexity. The model's suc cinctness is compared with common devices that include the nondeterministic finite automa ton, two-way finite automaton and pebble automaton.
Links: BibTeX, DOI
D. Průša, F. Mráz, F. Otto Two-dimensional Sgraffito automata
RAIRO - Theoretical Informatics and Applications, 2014, Vol. 48(5), 505 - 539.
Abstract: We present a new model of a two-dimensional computing device called Sgraffito automaton. In general, the model is quite simple, which allows a clear design of computations. When restricted to one-dimensional inputs, that is, strings, the Sgraffito automaton does not exceed the power of finite-state automata. On the other hand, for two-dimensional inputs, it yields a family of picture languages with good closure properties that strictly includes the class REC of recognizable picture languages. The deterministic Sgraffito automata define a class of picture languages that includes the class of deterministic recognizable picture languages DREC, the class of picture languages that are accepted by four-way alternating automata, those that are accepted by deterministic one-marker automata, and the sudoku-deterministically recognizable picture languages, but the membership problem for the accepted languages is still decidable in polynomial time. In addition, the deterministic Sgraffito automata accept some unary picture languages that are outside of the class REC.
Links: BibTeX, URL

Last updated: Saturday, 01/31/15