The striking question concerning this finding was whether this limit behaviour resulted from the specifics of small-world graphs or was simply an artefact. This result can be applied to many well-known classes of connected graphs. Here, we illustrate it by considering four examples. In fact, our proof-theoretical approach allows for quickly obtaining the limit value of compactness for many graph classes sparing computational costs.
We present Resources2City Explorer (R2CE), a tool for representing file systems as interactive, walkable virtual cities. R2CE visualizes file systems based on concepts of spatial, 3D information processing. For this purpose, it considerably extends the range of functions of conventional file browsers. Visual elements in a city generated by R2CE represent relations of objects of the underlying file system. The paper describes the functional spectrum of R2CE and illustrates it by visualizing a sample of files. We introduce nnDDC, a largely language-independent neural network-based classifier for DDC-related topic classification, which we optimized for F-score using a wide range of linguistic features. To show that our approach is language-independent, we evaluate nnDDC using up to 40 different languages.
We derive a topic model based on nnDDC, which generates probability distributions over semantic units for any input on sense-, word- and text-level. Unlike related approaches, however, these probabilities are estimated by means of nnDDC so that each dimension of the resulting vector representation is uniquely labeled by a DDC class. BIOfid is a specialized information service currently being developed to mobilize biodiversity data dormant in printed historical and modern literature and to offer a platform for open access journals on the science of biodiversity. Our team of librarians, computer scientists and biologists produce high-quality text digitizations, develop new text-mining tools and generate detailed ontologies enabling semantic text analysis and semantic search by means of user-specific queries.
In a pilot project we focus on German publications on the distribution and ecology of vascular plants, birds, moths and butterflies extending back to the Linnaeus period about years ago. The three organism groups have been selected according to current demands of the relevant research community in Germany. The text corpus defined for this purpose comprises over volumes with more than , pages to be digitized and will be complemented by journals from other digitization projects, copyright-free and project-related literature. Abstract: In this chapter we introduce a multidimensional model of syntactic dependency trees.
Our ultimate goal is to generate fingerprints of such trees to predict the author of the underlying sentences. The chapter makes a first attempt to create such fingerprints for sentence categorization via the detour of text categorization. We show that at text level, aggregated dependency structures actually provide information about authorship.
At the same time, we show that this does not hold for topic detection. We evaluate our model using a quarter of a million sentences collected in two corpora: the first is sampled from literary texts, the second from Wikipedia articles. As a second finding of our approach, we show that quantitative models of dependency structure do not yet allow for detecting syntactic alignment in written communication.
We conclude that this is mainly due to effects of lexical alignment on syntactic alignment. We present the Stolperwege app, a web-based framework for ubiquitous modeling of historical processes. Starting from the art project Stolpersteine of Gunter Demnig, it allows for virtually connecting these stumbling blocks with information about the biographies of victims of Nazism. According to the practice of public history, the aim of Stolperwege is to deepen public knowledge of the Holocaust in the context of our everyday environment. Stolperwege uses an information model that allows for modeling social networks of agents starting from information about portions of their life.
The paper exemplifies how Stolperwege is informationally enriched by means of historical maps and 3D animations of historical buildings. In English: The paper deals with characteristics of the structural, thematic and participatory dynamics of collaboratively generated lexical networks. This is done by example of Wiktionary.
Starting from a network-theoretical model in terms of so-called multi-layer networks, we describe Wiktionary as a scale-free lexicon. Systems of this sort are characterized by the fact that their content-related dynamics is determined by the underlying dynamics of collaborating authors. This happens in a way that social structure imprints on content structure. According to this conception, the unequal distribution of the activities of authors results in a correspondingly unequal distribution of the information units documented within the lexicon.
The paper focuses on foundations for describing such systems, starting from a parameter space that requires treating Wiktionary as a problem of big data analysis.
This chapter develops a computational linguistic model for analyzing and comparing multilingual data as well as its application to a large body of standardized assessment data from higher education. The approach employs both an automatic and a manual annotation of the data on several linguistic layers including parts of speech, text structure and content. The respective analysis involves statistics of distance correlation, text categorization with respect to text types (questions and distractors) as well as languages (English and German), and network analysis as a means to assess dependencies between features.
The results indicate a correlation between correct test results of students and linguistic features of the verbal presentations of tests indicating a language influence on higher education test performance. It is also found that this influence relates to special language.
Thus, this integrative modeling approach contributes a test basis for a large-scale analysis of learning data and points to subsequent, more detailed research. This paper uses a computer simulation to assess the argument that, when a large number of manuscripts has been lost, the number of bifurcations in the true stemmas would naturally be high because the probability that siblings survive becomes very low. The paper presents Wikidition, a novel text mining tool for generating online editions of text corpora.
It explores lexical, sentential and textual relations to span multi-layer networks (linkification) that allow for browsing syntagmatic and paradigmatic relations among the constituents of its input texts. In this way, relations of text reuse can be explored together with lexical relations within the same literary memory information system. Beyond that, Wikidition contains a module for automatic lexiconisation to extract author-specific vocabularies. Based on linkification and lexiconisation, Wikidition does not only allow for traversing input corpora on different lexical, sentential and textual levels.
Rather, its readers can also study the vocabulary of authors on several levels of resolution including superlemmas, lemmas, syntactic words and wordforms. We exemplify Wikidition by a range of literary texts and evaluate it by means of the apparatus of quantitative network analysis. We introduce a new text technology, called Wikidition, which automatically generates large scale editions of corpora of natural language texts.
Wikidition combines a wide range of text mining tools for automatically linking lexical, sentential and textual units. This includes the extraction of corpus-specific lexica down to the level of syntactic words and their grammatical categories. To this end, we introduce a novel measure of text reuse and exemplify Wikidition by means of the capitularies, that is, a corpus of Medieval Latin texts.
Current semantic theory on indexical expressions claims that demonstratively used indexicals such as "this" lack a referent-determining meaning but instead rely on an accompanying demonstration act like a pointing gesture. While this view makes it possible to set up a sound logic of demonstratives, the direct-referential role assigned to pointing gestures has never been scrutinized thoroughly in semantics or pragmatics. We investigate the semantics and pragmatics of co-verbal pointing from a foundational perspective combining experiments, statistical investigation, computer simulation and theoretical modeling techniques in a novel manner.
We evaluate various referential hypotheses with a corpus of object identification games set up in experiments in which body movement tracking techniques have been extensively used to generate precise pointing measurements. Statistical investigation and computer simulations show that especially distal areas in the pointing domain falsify the semantic direct-referential hypotheses concerning pointing gestures. As an alternative, we propose that reference involving pointing rests on a default inference which we specify using the empirical data.
These results raise numerous problems for classical semantics—pragmatics interfaces: we argue for pre-semantic pragmatics in order to account for inferential reference in addition to classical post-semantic Gricean pragmatics. The analysis of longitudinal corpora of historical texts requires the integrated development of tools for automatically preprocessing these texts and for building representation models of their genre- and register-related dynamics. In this chapter we present such a joint endeavor that ranges from resource formation via preprocessing to network-based text representation and classification.
As a first test case for showing the expressiveness of these resources, we perform a tripartite classification task of authorship attribution, genre detection and a combination thereof. To this end, we introduce a novel text representation model that explores the core structure (the so-called coreness) of lexical network representations of texts.
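The notion of coreness mentioned above can be illustrated with a small sketch. It assumes that a text is turned into an undirected co-occurrence graph over adjacent tokens and that coreness means the standard k-core number of each node; both the graph construction and the function names `cooccurrence_graph` and `core_numbers` are illustrative assumptions, not the representation model of the chapter.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Link every two distinct tokens that occur within `window` positions.

    Hypothetical construction for illustration only: with window=2,
    only directly adjacent tokens are linked.
    """
    adj = defaultdict(set)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:
            if v != w:
                adj[w].add(v)
                adj[v].add(w)
    return adj


def core_numbers(adj):
    """Return the k-core number of every node.

    Standard peeling procedure: repeatedly remove a node of minimum
    remaining degree; the largest minimum degree seen so far is the
    core number of the removed node.
    """
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


if __name__ == "__main__":
    graph = cooccurrence_graph("a b c a d".split())
    print(core_numbers(graph))
```

A vector of such core numbers (or a histogram over them) is one conceivable way to feed the core structure of a lexical network into a classifier; the selection-based peeling above is quadratic and only meant for small examples.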
Our experiment shows the expressiveness of this representation format and, indirectly, of our Latin preprocessor. Computer philology pursues two goals for ancient manuscript corpora: first, making an edition, that is, roughly speaking, one text version representing the whole corpus, which contains variation induced by copy errors and other processes; and second, producing a stemma. A stemma is a graph-based visualization of the copy history with manuscripts as nodes and copy events as edges. Its root, the so-called archetype, is the supposed original text or urtext from which all subsequent copies are made.
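As a minimal sketch of the stemma as a data structure, the following stores each copy event as a mapping from a manuscript to the manuscript it was copied from. The sigla (`omega`, `alpha`, `A`, ...) and the function names are hypothetical and chosen only for illustration; the sketch also assumes a tree-shaped stemma with a single exemplar per manuscript, which real traditions with contamination need not satisfy.

```python
def archetype(copied_from):
    """Return the root of the stemma: the one node that has no exemplar."""
    exemplars = set(copied_from.values())
    roots = exemplars - set(copied_from)
    if len(roots) != 1:
        raise ValueError("expected exactly one archetype, found %d" % len(roots))
    return next(iter(roots))


def copy_path(copied_from, manuscript):
    """Return the chain of copy events from a manuscript back to the archetype."""
    path = [manuscript]
    while path[-1] in copied_from:
        path.append(copied_from[path[-1]])
    return path


def bifurcations(copied_from):
    """Count the nodes from which exactly two copies descend."""
    copies = {}
    for exemplar in copied_from.values():
        copies[exemplar] = copies.get(exemplar, 0) + 1
    return sum(1 for n in copies.values() if n == 2)


if __name__ == "__main__":
    # Hypothetical stemma: edges point from a copy to its exemplar.
    stemma = {"alpha": "omega", "beta": "omega", "A": "alpha", "B": "alpha", "C": "beta"}
    print(archetype(stemma), copy_path(stemma, "C"), bifurcations(stemma))
```

Storing only the exemplar of each manuscript keeps the root lookup and the path back to the archetype trivial; a stemma with several exemplars per manuscript would need a general directed graph instead.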
Our main contribution is to present one of the first computational approaches to automatic archetype reconstruction and to introduce the first text-based evaluation for automatically produced archetypes. We compare a philologically generated archetype with one generated by bio-informatic software. Physical misreading as opposed to interpretational misreading is an unnoticed substitution in silent reading.
Especially for legally important documents or instruction manuals, this can lead to serious consequences. We present a prototype of an automatic highlighter targeting words which can most easily be misread in a given text using a dynamic orthographic neighbour concept. We propose measures of fit of a misread token based on Natural Language Processing and detect a list of short most easily misread tokens in the English language. We design a highlighting scheme for avoidance of misreading. Text corpora are structured sets of text segments that can be annotated or interrelated.
Expanding on this, we can define a database of images as an iconographic multimodal corpus with annotated images and the relations between images as well as between images and texts. In this project we create a database containing digitized items from this collection, and extend a tool, the ImageDB in the eHumanities Desktop, to annotate and provide relations between resources. This article gives an overview of the project and provides some technical details. Furthermore we show newly implemented features, explain the challenge of creating an ontology on multimodal corpora and give a forecast for future work.
We show that the Mann-Shanks primality criterion holds for weighted extended binomial coefficients (which count the number of weighted integer compositions), not only for the ordinary binomial coefficients. We derive asymptotic formulas for central extended binomial coefficients, which are generalizations of binomial coefficients, using the distribution of the sum of independent discrete uniform random variables with the Central Limit Theorem and a local limit variant. Research in the field of Digital Humanities, also known as Humanities Computing, has seen a steady increase over the past years.
Situated at the intersection of computing science and the humanities, present efforts focus on making resources such as texts, images, musical pieces and other semiotic artifacts digitally available, searchable and analysable. To this end, computational tools enabling textual search, visual analytics, data mining, statistics and natural language processing are harnessed to support the humanities researcher. The processing of large data sets with appropriate software opens up novel and fruitful approaches to questions in the traditional humanities.
Readability classification is an important application of Natural Language Processing. It aims at judging the quality of documents and assisting writers in identifying possible problems. This paper presents a readability classifier for Bangla textbooks using information-theoretic and lexical features. Altogether, 18 features are explored and evaluated by F-score. HCI systems are often equipped with gestural interfaces drawing on a predefined set of admitted gestures.
We provide an assessment of the fitness of such gesture vocabularies in terms of their learnability and naturalness. This is done by example of rivaling gesture vocabularies of the museum information system WikiNect. In this way, we do not only provide a procedure for evaluating gesture vocabularies, but additionally contribute to design criteria to be followed by the gestures. This paper provides a theoretical assessment of gestures in the context of authoring image-related hypertexts by example of the museum information system WikiNect.
To this end, a first implementation of gestural writing based on image schemata is provided (Lakoff, Women, Fire, and Dangerous Things: What Categories Reveal about the Mind, University of Chicago Press, Chicago). Gestural writing is defined as a sort of coding in which propositions are only expressed by means of gestures. In this respect, it is shown that image schemata allow for bridging between natural language predicates and gestural manifestations.
Further, it is demonstrated that gestural writing primarily focuses on the perceptual level of image descriptions (Hollink et al.). By exploring the metaphorical potential of image schemata, it is finally illustrated how to extend the expressiveness of gestural writing in order to reach the conceptual level of image descriptions. In this context, the paper paves the way for implementing museum information systems like WikiNect as systems of kinetic hypertext authoring based on full-fledged gestural writing.
Currently, a large number of different lexica is available for English. However, substantial and freely available fullform lexica with a high number of named entities are rather rare even in the case of this lingua franca. Existing lexica are often limited in several respects as explained in Section 2. What is missing so far is a freely available substantial machine-readable lexical resource of English that contains a high number of word forms and a large collection of named entities.
In this paper, we describe a procedure to generate such a resource by example of English. This lexicon, henceforth called ColLex.EN (for Collecting Lexica for English), will be made freely available to the public. In this paper, we describe how ColLex.EN was collected from existing lexical resources and specify the statistical procedures that we developed to extend and adjust it.
No manual modifications were made to the generated word forms and lemmas. Our fully automatic procedure has the advantage that whenever new versions of the source lexica are available, a new version of ColLex.EN can be automatically generated with low effort. We derive a stochastic word length distribution model based on the concept of compound distributions and show its relationships with and implications for Wimmer et al. We provide simple generalizations of the classical Needleman-Wunsch algorithm for aligning two sequences.
First, we let both sequences be defined over arbitrary, potentially different alphabets. Secondly, we consider similarity functions between elements of both sequences with ranges in a semiring. Next, we present novel combinatorial formulas for the number of monotone alignments between two sequences for selected steps S.
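To make the counting of monotone alignments concrete, here is a minimal sketch that counts them by dynamic programming for a given step set S, where a step (i, j) consumes i elements of the first sequence and j of the second. This is the textbook recurrence, offered as an illustration; it is not one of the closed-form combinatorial formulas announced above, and the function name `count_alignments` is an assumption.

```python
def count_alignments(m, n, steps):
    """Count monotone alignments of sequences of lengths m and n.

    `steps` is a set of pairs (i, j) of non-negative integers, not both
    zero. The count satisfies
        N(a, b) = sum over (i, j) in steps of N(a - i, b - j),
    with N(0, 0) = 1.
    """
    if (0, 0) in steps:
        raise ValueError("the empty step (0, 0) is not allowed")
    table = [[0] * (n + 1) for _ in range(m + 1)]
    table[0][0] = 1
    for a in range(m + 1):
        for b in range(n + 1):
            if a == 0 and b == 0:
                continue
            table[a][b] = sum(
                table[a - i][b - j] for i, j in steps if i <= a and j <= b
            )
    return table[m][n]


if __name__ == "__main__":
    classic = {(1, 1), (1, 0), (0, 1)}       # match, deletion, insertion
    many_to_many = {(1, 1), (1, 2), (2, 1)}  # larger steps, no gaps
    print(count_alignments(3, 3, classic))
    print(count_alignments(3, 3, many_to_many))
```

With the classic step set the counts are the Delannoy numbers; replacing the step set is all that is needed to model alignments with larger steps.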
Finally, we illustrate sample applications in natural language processing that require larger steps than available in the original Needleman-Wunsch sequence alignment procedure such that our generalizations can be fruitfully adopted. We prove a simple relationship between extended binomial coefficients (natural extensions of the well-known binomial coefficients) and weighted restricted integer compositions. Moreover, we give a very useful interpretation of extended binomial coefficients as representing distributions of sums of independent discrete random variables.
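The relationship stated above can be checked numerically in the unweighted case. The sketch below assumes the usual definition of the extended binomial coefficient as the coefficient of x^j in (1 + x + ... + x^k)^n; it compares these coefficients with a brute-force count of compositions of j into n parts from {0, ..., k} and normalizes them into the distribution of a sum of n independent uniform variables. The weighted case treated in the paper is not covered, and all function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def extended_binomial_row(n, k):
    """Coefficients of (1 + x + ... + x**k)**n, lowest power first."""
    row = [1]
    for _ in range(n):
        new = [0] * (len(row) + k)
        for i, c in enumerate(row):
            for part in range(k + 1):
                new[i + part] += c
        row = new
    return row


def count_compositions(n, k, j):
    """Brute-force count of ordered n-tuples over {0..k} that sum to j."""
    return sum(1 for parts in product(range(k + 1), repeat=n) if sum(parts) == j)


def sum_distribution(n, k):
    """Distribution of the sum of n independent uniform variables on {0..k}."""
    total = (k + 1) ** n
    return [Fraction(c, total) for c in extended_binomial_row(n, k)]


if __name__ == "__main__":
    print(extended_binomial_row(4, 1))  # ordinary binomial coefficients
    print(extended_binomial_row(3, 2))  # trinomial coefficients
    print(sum_distribution(2, 2))
```

For k = 1 the rows are those of Pascal's triangle, which is the sense in which these coefficients extend the ordinary binomial coefficients.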
We apply our results, e. Based on our findings and using the central limit theorem, we also give generalized Stirling formulae for central extended binomial coefficients. We enlarge the list of known properties of extended binomial coefficients. Recently, translation scholars have made some general claims about translation properties. Some of these are source language independent while others are not. Koppel and Ordan performed empirical studies to validate both types of properties using English source texts and other texts translated into English.
Obviously, corpora of this sort, which focus on a single language, are not adequate for claiming universality of translation properties. In this paper, we validate both types of translation properties using original and translated texts from six European languages. We provide a new, theoretically motivated evaluation grid for assessing the conversational achievements of Artificial Dialog Companions (ADCs). The grid is spanned along three grounding problems.
Firstly, it is argued that symbol grounding in general has to be intrinsic. Current approaches in this context, however, are limited to a certain kind of expression that can be grounded in this way. Secondly, we identify three requirements for conversational grounding, the process leading to mutual understanding. Finally, we sketch a test case for symbol grounding in the form of the philosophical grounding problem that involves the use of modal language.
Their interest centers on studies of the relationship between social and linguistic networks and their dynamics, illustrated by empirical examples from the area of Web 2. Non-verbal signs, especially co-speech gestures, play a prominent role in human communication.
Exemplification is implemented within a unification-based grammar. There, among other things, ... This paper presents a classifier of text readability based on information-theoretic features. The classifier was developed based on a linguistic approach to readability that explores lexical, syntactic and semantic features.
For this evaluation we extracted a corpus of articles from Wikipedia together with their quality judgments. We show that information-theoretic features perform as well as their linguistic counterparts even if we explore several linguistic levels at once. Machine translation systems always struggle transliterating names and unknown words during the translation process.
It becomes more problematic when the source and the target language use different scripts for writing. To handle this problem, transliteration systems are becoming popular as additional modules of the MT systems. The transliteration system is the same as the phrase based statistical machine translation system, but it works on character level rather than on phrase level. The performance of a statistical system is directly correlated with the size of the training corpus.
In this work, names are extracted from the Wikipedia cross lingual links and from Geonames. Also names are manually transliterated and added to the data. If we consider only the candidate transliterations, the system gives Communicating face-to-face, interlocutors frequently produce multimodal meaning packages consisting of speech and accompanying gestures. We discuss a systematically annotated speech and gesture corpus consisting of 25 route-and-landmark-description dialogues, the Bielefeld Speech and Gesture Alignment corpus SaGA , collected in experimental face-to-face settings.
We first describe the primary and secondary data of the corpus and its reliability assessment. Speech-gesture interfaces have been established extending unification-based grammars. In addition, the development of a computational model of speech-gesture alignment and its implementation constitutes a research line we focus on.
This thesis bridges between two scientific fields -- linguistics and computer science -- in terms of Linguistic Networks. From the linguistic point of view we examine whether languages can be distinguished when looking at network topology of different linguistic networks. We deal with up to 17 languages and ask how far the methods of network theory reveal the peculiarities of single languages.
We present and apply network models from different levels of linguistic representation: syntactic, phonological and morphological. The network models presented here make it possible to integrate various linguistic features at once, which enables a more abstract, holistic view of the particular language. From the point of view of computer science we elaborate the instrumentarium of network theory by applying it to a new field. We study the expressiveness of different network features and their ability to characterize language structure. We evaluate the interplay of these features and their performance in the task of classifying languages genealogically.
Among others, we compare network features related to average degree, average geodesic distance, clustering, entropy-based indices, assortativity, centrality and compactness. We also propose some new indices that can serve as additional characteristics of networks. The results obtained show that network models succeed in classifying related languages and make it possible to study language structure in general. The mathematical analysis of the particular network indices brings new insights into the nature of these indices and their potential when applied to different networks.
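Three of the network features named above can be computed in a few lines. The sketch assumes an undirected, connected graph with at least two nodes, given as a dictionary from nodes to neighbour sets, and uses the standard definitions of average degree, average geodesic distance and the mean local clustering coefficient; the function names and the toy graph are illustrative and not taken from the thesis.

```python
from collections import deque
from itertools import combinations


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def average_geodesic_distance(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    coeffs = []
    for nb in adj.values():
        if len(nb) < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        coeffs.append(2 * links / (len(nb) * (len(nb) - 1)))
    return sum(coeffs) / len(coeffs)


if __name__ == "__main__":
    # Toy graph: a triangle a-b-c with a pendant node d attached to c.
    toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
    print(average_degree(toy), average_geodesic_distance(toy), average_clustering(toy))
```

Collecting such indices into one feature vector per language network is one straightforward way to set up the kind of classification experiment described here.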
In this paper, a model is presented for the automatic measurement that can systematically describe the usage and function of the phenomenon of repetition in written text. The motivating hypothesis for this study is that the more repetitive a text is, the easier it is to memorize.
Therefore, an automated measurement index can provide feedback to writers and to those who design texts that are often memorized, including songs, holy texts, theatrical plays, and advertising slogans. The potential benefits of this kind of systematic feedback are numerous, the main one being that content creators would be able to employ a standard threshold of memorizability. This study explores multiple ways of implementing and calculating repetitiveness across levels of analysis (such as paragraph level or sub-word level), genres (such as songs, holy texts, and other genres) and languages, integrating these into a model for the automatic measurement of repetitiveness.
The Avestan language and some of its idiosyncratic features are explored in order to illuminate how the proposed index is applied in the ranking of texts according to their repetitiveness. In recent work, Covington discusses the number of alignments of two strings. This definition has drawbacks as it excludes many relevant situations.
In this work, we specify the notion of an alignment so that many linguistically interesting situations are covered. To this end, we define an alignment in an abstract manner as a set of pairs and then define three properties on such sets. Secondly, we specify the numbers of possibilities of aligning two strings in each case. We present a simple and straightforward alignment algorithm for monotone many-to-many alignments in grapheme-to-phoneme conversion and related fields such as morphology, and discuss a few noteworthy extensions.
Moreover, we specify combinatorial formulas for monotone many-to-many alignments and decoding in G2P which indicate that exhaustive enumeration is generally possible, so that some limitations of our approach can easily be overcome. Finally, we present a decoding scheme, within the monotone many-to-many alignment paradigm, that relates the decoding problem to restricted integer compositions and that is, putatively, superior to alternatives suggested in the literature. We present a framework, based on Sejane and Eger, for inducing lexical semantic typologies for groups of languages.
Our framework rests on lexical semantic association networks derived from encoding, via bilingual corpora, each language in a common reference language, the tertium comparationis, so that distances between languages can easily be determined. Together with Vedic Sanskrit, Avestan represents one of the most archaic witnesses of the Indo-Iranian branch of the Indo-European languages, which makes it especially interesting for historical-comparative linguistics.
Instead, we had to rely upon transcriptional devices that were dictated by the restrictions of character encoding as provided by the computer systems used. As the problems we had to face in this respect and the solutions we could apply are typical for the development of computational work on ancient languages, it seems worthwhile to sketch them out here. A human being acquires the knowledge needed to search for information or answer questions over the course of a lifetime. A computer must be given this knowledge explicitly. Methods from logic and machine learning are used for this purpose.
Synonyms are a highly relevant information source for natural language processing. Automatic synonym extraction methods have in common that they are either applied on the surface representation of the text or on a syntactical structure derived from it. In this paper, however, we present a semantic synonym extraction approach that operates directly on semantic networks (SNs), which were derived from text by a deep syntactico-semantic analysis.
Synonymy hypotheses are extracted from the SNs by graph matching. These hypotheses are then validated by a support vector machine (SVM) employing a combined graph and string kernel. Our method was compared to several other approaches and the evaluation has shown that our results are considerably superior. There are many languages considered to be low-density languages, either because the population speaking the language is not very large, or because insufficient digitized text material is available in the language even though millions of people speak the language.
Bangla belongs to the latter group. Readability classification is an important Natural Language Processing (NLP) application that can be used to judge the quality of documents and assist writers in locating possible problems. This paper presents a readability classifier of Bangla textbook documents based on information-theoretic and lexical features.
This is done to get measurable evidence for the existence of speech-and-gesture ensembles. Thus, there is evidence for a one-way coupling — going from words to gestures — that leads to speech-and-gesture alignment and underlies the constitution of multimodal ensembles. A framework for grounding the semantics of co-verbal iconic gestures is presented. A resemblance account to iconicity is discarded in favor of an exemplification approach. It is sketched how exemplification can be captured within a unification-based grammar that provides a conceptual interface.
Gestures modeled as vector sequences are the exemplificational base. Some hypotheses that follow from the general account are pointed out and remaining challenges are discussed. We introduce WikiNect as a kinetic museum information system that allows museum visitors to give on-site feedback about exhibitions. To this end, WikiNect integrates three approaches to Human-Computer Interaction (HCI): games with a purpose, wiki-based collaborative writing and kinetic text-technologies.
Our aim is to develop kinetic technologies as a new paradigm of HCI.
They dispense with classical interfaces e. In this paper, we introduce the notion of gestural writing as a kinetic text-technology that underlies WikiNect to enable museum visitors to communicate their feedback. The basic idea is to explore sequences of gestures that share the semantic expressivity of verbally manifested speech acts.
Our task is to identify such gestures that are learnable on-site in the usage scenario of WikiNect. This is done by referring to so-called transient gestures as part of multimodal ensembles, which are candidate gestures of the desired functionality. The eHumanities Desktop is a system which allows users to upload, organize and share resources using a web interface. Furthermore resources can be processed, annotated and analyzed in various ways.
Registered users can organize themselves in groups and collaboratively work on their data. The eHumanities Desktop is platform independent and runs in a web browser. This paper presents the system focusing on its service orientation and process management. Staccato, the Segmentation Agreement Calculator According to Thomann, is a software tool for assessing the degree of agreement of multiple segmentations of some time-related data e.
The software implements an assessment procedure developed by Bruno Thomann and will be made publicly available. The article discusses the rationale of the agreement assessment procedure and points at future extensions of Staccato. TiTAN series capture the formation and structure of dialog lexica in terms of serialized graph representations. The dynamic update of TiTAN series is driven by the dialog-inherent timing of turn-taking. The model provides a link between neural, connectionist underpinnings of dialog lexica on the one hand and observable symbolic behavior on the other.
On the neural side, priming and spreading activation are modeled in terms of TiTAN networking. On the symbolic side, TiTAN series account for cognitive alignment in terms of the structural coupling of the linguistic representations of dialog partners. This structural stance allows us to apply TiTAN in machine learning of data of dialogical alignment. In previous studies, it has been shown that aligned dialogs can be distinguished from non-aligned ones by means of TiTAN-based modeling. Now, we simultaneously apply this model to two types of dialog: task-oriented, experimentally controlled dialogs on the one hand and more spontaneous, direction-giving dialogs on the other.
We ask whether it is possible to separate aligned dialogs from non-aligned ones in a type-crossing way. This hints at a structural fingerprint left by alignment in networks of linguistic items that are routinely co-activated during conversation. Currently, the area of translation studies lacks corpora by which translation scholars can validate their theoretical claims, for example, regarding the scope of the characteristics of the translation relation.
In this paper, we describe a customized resource in the area of translation studies that mainly addresses research on the properties of the translation relation. Our experimental results show that the Type-Token-Ratio TTR is not a universally valid indicator of the simplification of translation. The Naming Game NG has become a vivid research paradigm for simulation studies on language evolution and the establishment of naming conventions. Recently, NGs were used for reconstructing the creation of linguistic categories, most notably for color terms.
We recap the functional principle of NGs and the latter Categorization Games CGs and evaluate them in the light of semantic data of linguistic categorization outside the domain of colors. This comparison reveals two specifics of the CG paradigm: Firstly, the emerging categories draw basically on the predefined topology of the learning domain. Secondly, the kind of categories that can be learnt in CGs is bound to context-independent intersective categories.
This suggests that the NG and the CG focus on a special aspect of natural language categorization, which disregards context-sensitive categories used in a non-compositional manner. Ancient corpora contain various multilingual patterns. This imposes numerous problems on their manual annotation and automatic processing. We introduce a lexicon building system, called Lexicon Expander, that has an integrated language detection module, Language Detection LD Toolkit. The Lexicon Expander post-processes the output of the LD Toolkit which leads to the improvement of f-score and accuracy values.
Furthermore, the functionality of the Lexicon Expander also includes manual editing of lexical entries and automatic morphological expansion by means of a morphological grammar. Currently, some simulative accounts exist within dynamic or evolutionary frameworks that are concerned with the development of linguistic categories within a population of language users. Although these studies mostly emphasize that their models are abstract, the paradigm categorization domain is preferably that of colors.
In this paper, the authors argue that color adjectives are special predicates in both linguistic and metaphysical terms: semantically, they are intersective predicates, metaphysically, color properties can be empirically reduced onto purely physical properties. The restriction of categorization simulations to the color paradigm systematically leads to ignoring two ubiquitous features of natural language predicates, namely relativity and context-dependency.
The authors develop a three-dimensional grid of ascending complexity that is partitioned according to the semiotic triangle. They also develop a conceptual model in the form of a decision grid by means of which the complexity level of simulation models of linguistic categorization can be assessed in linguistic terms.
In this article, we test a variant of the Sapir-Whorf Hypothesis in the area of complex network theory. This is done by analyzing social ontologies as a new resource for automatic language classification. Our method is to solely explore structural features of social ontologies in order to predict family resemblances of languages used by the corresponding communities to build these ontologies. This approach is based on a reformulation of the Sapir-Whorf Hypothesis in terms of distributed cognition.
Starting from a corpus of Wikipedia-based social ontologies, we test our variant of the Sapir-Whorf Hypothesis by several experiments, and find out that we outperform the corresponding baselines. All in all, the article develops an approach to classify linguistic networks of tens of thousands of vertices by exploring a small range of mathematically well-established topological indices. Checking for readability or simplicity of texts is important for many institutional and individual users.
Formulas for approximately measuring text readability have a long tradition. Usually, they exploit surface-oriented indicators like sentence length, word length, word frequency, etc. However, in many cases, this information is not adequate to realistically approximate the cognitive difficulties a person can have to understand a text.
Therefore we use deep syntactic and semantic indicators in addition. The syntactic information is represented by a dependency tree, the semantic information by a semantic network. Both representations are automatically generated by a deep syntactico-semantic analysis. A global readability score is determined by applying a nearest neighbor algorithm on 3, ratings of test persons.
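The nearest neighbor step can be illustrated with a minimal sketch. It assumes that each rated text is reduced to a numeric indicator vector and that the score of a new text is the mean rating of its k closest rated texts; the function name, the two indicators and the sample ratings are hypothetical and are not taken from the paper.

```python
from math import sqrt


def knn_readability(query, rated_texts, k=3):
    """Return the mean rating of the k rated texts closest to `query`.

    `rated_texts` is a list of (indicator_vector, rating) pairs.
    Euclidean distance is an assumption; the paper does not name a metric here.
    """
    def dist(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(rated_texts, key=lambda item: dist(query, item[0]))[:k]
    return sum(rating for _, rating in nearest) / len(nearest)


# Hypothetical indicator vectors: (average sentence length, average word length)
rated = [
    ((10.0, 4.0), 1.0),
    ((12.0, 4.5), 2.0),
    ((30.0, 6.5), 6.0),
    ((28.0, 7.0), 7.0),
]
score = knn_readability((11.0, 4.2), rated, k=2)
```

In practice the indicators would be normalized before computing distances, since otherwise the indicator with the largest numeric range dominates.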
The evaluation showed that the deep syntactic and semantic indicators lead to promising results comparable to the best surface-based indicators. The combination of deep and shallow indicators leads to an improvement over shallow indicators alone. Finally, a graphical user interface was developed which highlights difficult passages, depending on the individual indicator values, and displays a global readability score. This article presents an approach to automatic language classification by means of linguistic networks.
Networks of 11 languages were constructed from dependency treebanks, and the topology of these networks serves as input to the classification algorithm. The results match the genealogical similarities of these languages. In addition, we test two alternative approaches to automatic language classification — one based on n-grams and the other on quantitative typological indices. All three methods show good results in identifying genealogical groups. Beyond genetic similarities, network features and feature combinations offer a new source of typological information about languages.
This information can contribute to a better understanding of the interplay of single linguistic phenomena observed in language. In this thesis we analyze the performance of social semantics in textual information retrieval. By means of collaboratively constructed knowledge derived from web-based social networks, inducing both common-sense and domain-specific knowledge as constructed by a multitude of users, we will establish an improvement in performance of selected tasks within different areas of information retrieval.
This work connects the concepts and the methods of social networks and the semantic web to support the analysis of a social semantic web that combines human intelligence with machine learning and natural language processing. In this context, social networks, as instances of the social web, are capable of delivering social network data and document collections on a tremendous scale, inducing thematic dynamics that cannot be achieved by traditional expert resources. The question of automatic conversion, annotation and processing, however, is central to the debate on the benefits of the social semantic web.
Which technologies and methods are available, adequate, and able to process this rapidly rising flood of information while also exploiting the wealth of information in this large and, more importantly, decentralized internet?
The present work investigates the performance of social semantic-induced categorization by means of different document models. We will shed light on the question of to what extent social networks and social ontologies contribute to selected areas of information retrieval, such as automatically determining term and text associations, identifying topics, text and web genre categorization, and also the domain of sentiment analysis. We will show in extensive evaluations, comparing the classical apparatus of text categorization (Vector Space Model, Latent Semantic Analysis and Support Vector Machine), that significant improvements can be obtained by considering the collaborative knowledge derived from the social web.
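As a point of reference for the Vector Space Model mentioned above, the following minimal sketch represents texts as raw term-frequency vectors and compares them by cosine similarity. The whitespace tokenization and the absence of term weighting are simplifying assumptions; this is the textbook baseline, not the socially enriched document model evaluated in the thesis.

```python
from collections import Counter
from math import sqrt


def term_vector(text):
    """Bag-of-words vector: lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


doc = term_vector("social networks and social ontologies")
same = cosine(doc, doc)
other = cosine(doc, term_vector("color terms in naming games"))
```

Texts without shared terms receive a similarity of zero in this baseline, which is the kind of sparseness that additional knowledge sources are meant to alleviate.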
Challenged by the growing societal demand for Ambient Assistive Living AAL technologies, we are dedicated to develop intelligent technical devices which are able to communicate with human persons in a truly human-like manner. The core of the project is a simulation environment which enables the development of conscious learning semiotic agents which will be able to assist human persons in their daily life. We are reporting first results and future perspectives. In the area of digital library services, the access to subject-specific metadata of scholarly publications is of utmost interest.
However, due to its loose requirements regarding metadata content, no strict standard for consistent subject indexing is specified, although such a standard is needed in the digital library domain. This contribution addresses the problem of automatic enhancement of OAI metadata by means of one of the most widely used universal classification schemes in libraries, the Dewey Decimal Classification (DDC). To be more specific, we automatically classify scientific documents according to the DDC taxonomy within three levels using a machine learning-based classifier that relies solely on OAI metadata records as the document representation.
The results show an asymmetric distribution of documents across the hierarchical structure of the DDC taxonomy and issues of data sparseness. However, the performance of the classifier shows promising results on all three levels of the DDC. The Digital Humanities or. This development increasingly affects teaching in the field of humanities computing as well.
Research in cognitive psychology shows that the connection relation is the primitive spatial relation. This paper proposes a novel spatial knowledge representation of indoor environments based on the connection relation, and demonstrates how deictic orientation relations can be acquired from a map, which is constructed purely on connection relations between extended objects. Without loss of generality, we restrict indoor environments to be constructed by a set of rectangles, each representing either a room or a corridor.
The term fiat cell is coined to represent a subjective partition along a corridor. Spatial knowledge includes rectangles, sides information of rectangles, connection relations among rectangles, and fiat cells of rectangles. Efficient algorithms are given for identifying one shortest path between two locations, transforming paths into fiat paths, and acquiring deictic orientations. In this paper, we present an approach to language detection in streams of multilingual ancient texts.
We evaluate our model by means of three experiments that show that language detection is possible even for dead languages. Finally, we present an experiment in unsupervised language detection as a tertium comparationis for our supervised classifier. The idea behind MSMTs is to provide spanning trees that minimize the costs of edge traversals in a Markovian manner, that is, in terms of the path starting with the root of the tree and ending at the vertex under consideration. In a second part, the chapter generalizes this class of spanning trees in order to allow for damped Markovian effects in the course of spanning.
These two effects, (1) the sensitivity to the contexts generated by consecutive edges and (2) the decreasing impact of more antecedent or 'weakly remembered' vertices, are well known in cognitive modeling [6, 10, 21, 23]. In this sense, the chapter can also be read as an effort to introduce a graph model to support the simulation of cognitive systems.
Note that MSMTs are not to be confused with branching Markov chains or Markov trees  as we focus on generating spanning trees from given weighted undirected networks. Delivering linguistic resources and easy-to-use methods to a broad public in the humanities is a challenging task. On the one hand users rightly demand easy to use interfaces but on the other hand want to have access to the full flexibility and power of the functions being offered.
Even though a growing number of excellent systems exist which offer convenient means to use linguistic resources and methods, they usually focus on a specific domain, as for example corpus exploration or text categorization. Architectures which address a broad scope of applications are still rare. This article introduces the eHumanities Desktop, an online system for corpus management, processing and analysis which aims at bridging the gap between powerful command line tools and intuitive user interfaces.
In dyadic communication, both interlocutors adapt to each other linguistically, that is, they align interpersonally. This is done by means of so-called two-layer time-aligned network series, that is, a time-adjusted graph model. Each constituent network of the series is updated utterance-wise. Thus, both the inherent bipartition of dyadic conversations and their gradual development are modeled.
By adapting and further developing several models of complex network theory, we show that dialog lexica evolve as a novel class of graphs that have not been considered before in the area of complex linguistic networks. Additionally, we show that our framework allows for classifying dialogs according to their alignment status.
To the best of our knowledge, this is the first approach to measuring alignment in communication that explores the similarities of graph-like cognitive representations. The volume 'Genres on the Web' has been designed for a wide audience, from the expert to the novice. It is a required book for scholars, researchers and students who want to become acquainted with the latest theoretical, empirical and computational advances in the expanding field of web genre research.
The study of web genre is an overarching and interdisciplinary novel area of research that spans from corpus linguistics, computational linguistics, NLP, and text-technology, to web mining, webometrics, social network analysis and information studies.
This book gives readers a thorough grounding in the latest research on web genres and emerging document types. The book covers a wide range of web-genre focussed subjects, such as:
- The identification of the sources of web genres
- Automatic web genre identification
- The presentation of structure-oriented models
- Empirical case studies
One of the driving forces behind genre research is the idea of a genre-sensitive information system, which incorporates genre cues complementing the current keyword-based search and retrieval applications.
To be actually usable in such real-world scenarios, ontologies usually have to encompass a large number of factual statements. However, with increasing size, it becomes very difficult to ensure their complete correctness. This is particularly true when an ontology is not hand-crafted but constructed semi-automatically, for example through text mining.
As a consequence, when inference mechanisms are applied to these ontologies, even minimal inconsistencies often lead to serious errors that are hard to trace back. This paper addresses this issue and describes a method to validate ontologies using an automatic theorem prover and MultiNet axioms.
This logic-based approach makes it possible to detect many inconsistencies that are difficult or even impossible to identify through statistical methods or by manual investigation in reasonable time. To make this approach accessible for ontology developers, a graphical user interface is provided that highlights erroneous axioms directly in the ontology for quicker fixing. There are several approaches to detect hypernymy relations from texts by text mining.
Usually these approaches are based on supervised learning and extract several patterns in a first step. Normally these approaches are only based on a surface representation or a syntactical tree structure, i. In this work, however, we present an approach that operates directly on a semantic network (SN), which is generated by a deep syntactico-semantic analysis. This algorithm is combined with a shallow approach enriched with semantic information. Current approaches of hypernymy acquisition are mostly based on syntactic or surface representations and extract hypernymy relations between surface word forms and not word readings.
In this paper we present a purely semantic approach for hypernymy extraction based on semantic networks SNs. Furthermore this paper describes how the patterns can be derived by relational statistical learning following the Minimum Description Length principle MDL. The evaluation demonstrates the usefulness of the learned patterns and also of the entire hypernymy extraction system.
Identifying duplicate texts is important in many areas like plagiarism detection, information retrieval, text summarization, and question answering. Current approaches are mostly surface-oriented or use only shallow syntactic representations and see each text only as a token list. In this work however, we describe a deep, semantically oriented method based on semantic networks which are derived by a syntactico-semantic parser. Semantically identical or similar semantic networks for each sentence of a given base text are efficiently retrieved by using a specialized semantic network index.
In order to detect many kinds of paraphrases the current base semantic network is varied by applying inferences: lexico-semantic relations, relation axioms, and meaning postulates. Some important phenomena occurring in difficult-to-detect duplicates are discussed. The deep approach profits from background knowledge, whose acquisition from corpora like Wikipedia is explained briefly.
This deep duplicate recognizer is combined with two shallow duplicate recognizers in order to guarantee high recall for texts which are not fully parsable. The evaluation shows that the combined approach preserves recall and increases precision considerably, in comparison to traditional shallow methods. For the evaluation, a standard corpus of German plagiarisms was extended by four diverse components with an emphasis on duplicates (and not just plagiarisms), e. In this work, however, we describe a deep, semantically oriented method based on semantic networks which are derived by a syntactico-semantic parser.
Semantically identical or similar semantic networks for each sentence of a given base text are efficiently retrieved by using a specialized index. In order to detect many kinds of paraphrases the semantic networks of a candidate text are varied by applying inferences: lexico-semantic relations, relation axioms, and meaning postulates. Important phenomena occurring in difficult duplicates are discussed.
The deep approach profits from background knowledge, whose acquisition from corpora is explained briefly. The deep duplicate recognizer is combined with two shallow duplicate recognizers in order to guarantee a high recall for texts which are not fully parsable. The evaluation shows that the combined approach preserves recall and increases precision considerably in comparison to traditional shallow methods.
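For contrast with the deep method, a surface-oriented recognizer that treats each text as a token list can be sketched as follows. The word-trigram shingles and the Jaccard coefficient are assumptions chosen for illustration; the abstract does not specify how its two shallow recognizers work.

```python
def shingles(text, n=3):
    """Set of word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_similarity(a, b, n=3):
    """Jaccard overlap of the word n-gram sets of two texts."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


near_copy = jaccard_similarity("the cat sat on the mat", "the cat sat on a mat")
paraphrase = jaccard_similarity("the cat sat on the mat", "a feline rested on the rug")
```

The near copy still shares a third of its shingles, whereas the paraphrase shares none, which illustrates why token-level methods miss the reworded duplicates that the semantic approach targets.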
In this paper we present a truly semantic-oriented approach for meronymy relation extraction. It operates directly on semantic networks (SNs) instead of syntactic trees or surface representations. These SNs are derived from texts (in our case, the German Wikipedia) by a deep linguistic syntactico-semantic analysis. The corresponding algorithm is combined with a shallow approach enriched with semantic information.
Through the employment of logical methods, the recall and precision of the semantic patterns pertinent to the extracted relations can be increased considerably. People communicate multimodally. Most prominently, they co-produce speech and gesture. How do they do that? Studying the interplay of both modalities has to be informed by empirically observed communication behavior. We present a corpus built of speech and gesture data gained in a controlled study. We describe (1) the setting underlying the data; (2) annotation of the data; (3) reliability evaluation methods and results; and (4) applications of the corpus in the research domain of speech and gesture alignment.
This paper introduces the Ariadne Corpus Management System. First, the underlying data model is presented which enables users to represent and process heterogeneous data sets within a single, consistent framework. Secondly, a set of automatized procedures is described that offers assistance to researchers in various data-related use cases.
Finally, an approach to easy yet powerful data retrieval is introduced in the form of a specialised querying language for multimodal data. The challenge also arises with the question of which new forms and structures emerge from the change of media. For until now, media change has essentially meant an increasing differentiation of old and new media, each with its specific functions, i. There is a substantial body of work on the extraction of relations from texts, most of which is based on pattern matching or on applying tree kernel functions to syntactic structures.
Whereas pattern application is usually more efficient, tree kernels can be superior when assessed by the F-measure. In this paper, we introduce a hybrid approach to extracting meronymy relations, which is based on both patterns and kernel functions. In a first step, meronymy relation hypotheses are extracted from a text corpus by applying patterns. In a second step these relation hypotheses are validated by using several shallow features and a graph kernel approach. In contrast to other meronymy extraction and validation methods which are based on surface or syntactic representations we use a purely semantic approach based on semantic networks.
This involves analyzing each sentence of the Wikipedia corpus by a deep syntactico-semantic parser and converting it into a semantic network. Meronymy relation hypotheses are extracted from the semantic networks by means of an automated theorem prover, which employs a set of logical axioms and patterns in the form of semantic networks. The meronymy candidates are then validated by means of a graph kernel approach based on common walks.
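The idea of a graph kernel based on common walks can be sketched as follows: two labeled graphs are combined into a product graph whose nodes are pairs of equally labeled nodes, and every walk in the product graph corresponds to a walk that both graphs share. The graph encoding, the `max_len` cutoff and the unweighted count are assumptions made for this illustration; the kernel used in the paper may weight or normalize walks differently.

```python
def common_walk_kernel(g1, g2, max_len=3):
    """Count walks of length 1..max_len common to two labeled directed graphs.

    Each graph is a pair (labels, edges): `labels` maps node ids to labels,
    `edges` is a list of (source, relation, target) triples.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2
    # Product nodes: pairs of nodes carrying the same label.
    nodes = {(u, v) for u in labels1 for v in labels2 if labels1[u] == labels2[v]}
    # Product edges: pairs of edges with the same relation between product nodes.
    succ = {p: [] for p in nodes}
    for u, r1, u2 in edges1:
        for v, r2, v2 in edges2:
            if r1 == r2 and (u, v) in nodes and (u2, v2) in nodes:
                succ[(u, v)].append((u2, v2))
    counts = {p: 1 for p in nodes}  # walks of length 0 ending at each node
    total = 0
    for _ in range(max_len):
        nxt = {p: 0 for p in nodes}
        for p, c in counts.items():
            for q in succ[p]:
                nxt[q] += c
        total += sum(nxt.values())
        counts = nxt
    return total


# Hypothetical miniature networks; the relation name "part" is illustrative only.
g_a = ({"a": "car", "b": "wheel", "c": "tire"},
       [("a", "part", "b"), ("b", "part", "c")])
g_b = ({"x": "car", "y": "wheel", "z": "rim"},
       [("x", "part", "y"), ("y", "part", "z")])
k_ab = common_walk_kernel(g_a, g_b, max_len=2)
k_aa = common_walk_kernel(g_a, g_a, max_len=2)
```

A graph shares more walks with itself than with a partially matching graph, so the raw count behaves like a similarity score that can be fed to a kernel-based classifier.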
The evaluation shows that this method achieves considerably higher accuracy, recall, and F-measure than a method using purely shallow validation. This chapter outlines the state of the art of empirical and computational webgenre research. First, it highlights why the concept of genre is profitable for a range of disciplines.
At the same time, it lists a number of recent interpretations that can inform and influence present and future genre research. Last but not least, it breaks down a series of open issues that relate to the modelling of the concept of webgenre in empirical and computational studies. This paper presents an approach of two-level categorization of web pages. In contrast to related approaches the model additionally explores and categorizes functionally and thematically demarcated segments of the hypertext types to be categorized.
By classifying these segments conclusions can be drawn about the type of the corresponding compound web document. In this chapter we develop a representation model of web document networks. Based on the notion of uncertain web document structures, the model is defined as a template which grasps nested manifestation levels of hypertext types. Further, we specify the model on the conceptual, formal and physical level and exemplify it by reconstructing competing web document models.
This paper presents the eHumanities Desktop - an online system for corpus management and analysis in support of computing in the humanities. Design issues and the overall architecture are described, as well as an outline of the applications offered by the system. In addition to the well-known linguistic alignment processes in dyadic communication — e. Communicative elements from different modalities 'routinize into' cross-modal 'super-signs', which we call multimodal ensembles.
Computational models of human communication are in need of expressive models of multimodal ensembles. In this paper, we exemplify semiotic alignment by means of empirical examples of the building of multimodal ensembles. We then propose a graph model of multimodal dialogue that is expressive enough to capture multimodal ensembles. In line with this model, we define a novel task in machine learning with the aim of training classifiers that can detect semiotic alignment in dialogue.
This model is in support of approaches which need to gain insights into realistic human-machine communication. One major reason that readability checkers are still far away from judging the understandability of texts consists in the fact that no semantic information is used. Syntactic, lexical, or morphological information can only give limited access for estimating the cognitive difficulties for a human being to comprehend a text.
In this paper however, we present a readability checker which uses semantic information in addition. This information is represented as semantic networks and is derived by a deep syntactico-semantic analysis. We investigate in which situations a semantic readability indicator can lead to superior results in comparison with ordinary surface indicators like sentence length.
Finally, we compute the weights of our semantic indicators in the readability function based on the user ratings collected in an online evaluation. There exist various approaches to construct taxonomies by text mining. Usually these approaches are based on supervised learning and extract in a first step several patterns.
Normally these approaches are only based on a surface representation or a syntactic tree structure, i. In this work we present an approach which, in addition to shallow patterns, directly operates on semantic networks which are derived by a deep linguistic syntactico-semantic analysis. Furthermore, the shallow approach heavily depends on semantic information, too. It is shown that recall and precision can be improved considerably compared with relying on shallow patterns alone. For many languages, the size of Wikipedia is an order of magnitude smaller than the English Wikipedia.
We present a method for cross-lingual alignment of template and infobox attributes in Wikipedia. The alignment is used to add and complete templates and infoboxes in one language with information derived from Wikipedia in another language. Furthermore, the alignment provides valuable information for normalization of template and attribute names and can be used to detect potential inconsistencies. This is done by exploring metadata as provided by the Open Archives Initiative OAI to derive document snippets as minimal document representations.
The reason is to reduce the effort of document processing in digital libraries. Further, we perform feature selection and extension by means of social ontologies and related web-based lexical resources. This is done to provide reliable topic-related classifications while circumventing the problem of data sparseness. Finally, we evaluate our model by means of two language-specific corpora. This paper bridges digital libraries on the one hand and computational linguistics on the other. The aim is to make accessible computational linguistic methods to provide thematic classifications in digital libraries based on closed topic models as the DDC.
This paper presents an approach using social semantics for the task of topic labelling by means of Open Topic Models. Our approach utilizes a social ontology to create an alignment of documents within a social network. Comprised category information is used to compute a topic generalization. We propose a feature-frequency-based method for measuring semantic relatedness which is needed in order to reduce the number of document features for the task of topic labelling.
This method is evaluated against multiple human judgement experiments comprising two languages and three different resources. Overall the results show that social ontologies provide a rich source of terminological knowledge. The performance of the semantic relatedness measure with correlation values of up to.
Results on the topic labelling experiment show, with an accuracy of up to. Most readability formulas calculate a global readability score by combining several indicator values in a linear combination. Typical indicators are average sentence length, average number of syllables per word, etc. Usually the parameters of the linear combination are determined by an ordinary least squares (OLS) estimation minimizing the sum of the squared residuals in comparison with human ratings for a given set of texts.
The usage of OLS leads to several drawbacks. First, the parameters are not constrained in any way and are therefore not intuitive and difficult to interpret. Second, if the number of parameters becomes large, overfitting easily occurs. Finally, OLS is quite sensitive to outliers. Therefore, an alternative method is presented which avoids these drawbacks and is based on robust regression. We consider that there are obvious relationships between research on sustainability of language and linguistic resources on the one hand and work undertaken in the Research Unit 'Text-Technological Modelling of Information' on the other.
Currently the main focus in sustainability research is concerned with archiving methods of textual resources, i. However, we believe that there are certain additional aspects of sustainability on which new light is shed by procedures, algorithms and dynamic processes undertaken in our Research Unit. In this chapter we describe so-called linguistic networks.
These are networks of linguistic units that are analyzed in connection with their embedding in the network of the language community that produced these units and their interlinking. A main focus of the chapter is on a multi-level network model, in departure from the unipartite graph models of complex network theory. This paper describes a database of 11 dependency treebanks which were unified by means of a two-dimensional graph format.
The format was evaluated with respect to storage-complexity on the one hand, and efficiency of data access on the other hand. An example of how the treebanks can be integrated within a unique interface is given by means of the DTDB interface. This article elaborates a framework for representing and classifying large complex networks by example of wiki graphs. By means of this framework we reliably measure the similarity of document, agent, and word networks by solely regarding their topology.
In doing so, the article departs from classical approaches to complex network theory which focus on topological characteristics in order to check their small world property. This does not only include characteristics that have been studied in complex network theory, but also some of those which were invented in social network analysis and hypertext theory.
We show that network classifications come within reach which go beyond the hypertext structures traditionally analyzed in web mining. The reason is that we focus on whole networks as the units to be classified, above the level of websites and their constitutive pages. As a consequence, we bridge classical approaches to text and web mining on the one hand and complex network theory on the other. Last but not least, this approach also provides a framework for quantifying the linguistic notion of intertextuality.
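To make the idea of comparing whole networks by topology alone more concrete, the following is a minimal sketch, not the framework described in the abstract. It assumes that each network is reduced to a small vector of topological indices (mean clustering coefficient, average shortest-path length, density) and that networks are compared by the Euclidean distance between these vectors. The function names, the choice of indices and the toy graphs are all illustrative assumptions.

```python
from collections import deque
from math import sqrt


def make_undirected(edges):
    """Build an adjacency dict {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Mean local clustering coefficient (nodes of degree < 2 count as 0).

    Node labels must be mutually comparable (e.g. all integers).
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbours of this node.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def topology_vector(adj):
    """Illustrative feature vector: [clustering, path length, density]."""
    n = len(adj)
    degree_sum = sum(len(nbrs) for nbrs in adj.values())
    density = degree_sum / (n * (n - 1)) if n > 1 else 0.0
    return [clustering_coefficient(adj), average_path_length(adj), density]


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy networks (hypothetical data): two rings and one complete graph.
ring6 = make_undirected([(i, (i + 1) % 6) for i in range(6)])
ring7 = make_undirected([(i, (i + 1) % 7) for i in range(7)])
clique6 = make_undirected([(i, j) for i in range(6) for j in range(i + 1, 6)])

d_ring_ring = euclidean(topology_vector(ring6), topology_vector(ring7))
d_ring_clique = euclidean(topology_vector(ring6), topology_vector(clique6))
print(d_ring_ring < d_ring_clique)
```

In this toy setting the two rings end up closer to each other than either is to the complete graph, although no node or edge labels are consulted. A real classification would presumably use many more indices and a trained classifier in place of a raw distance.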
In any real-world application scenario, natural language generation (NLG) systems have to employ grammars consisting of a tremendous number of rules. Detecting and fixing errors in such grammars is therefore a highly tedious task. In this work we present a data mining algorithm which identifies incorrect grammar rules by abductive reasoning from positive and negative training examples. More specifically, the constituency trees belonging to successful generation processes and the incomplete trees of failed ones are analyzed.
From this a quality score is derived for each grammar rule by analyzing the occurrences of the rules in the trees and by spotting the exact error locations in the incomplete trees. In prior work on automatic error detection v. The approach of Cussens et al. Zeller introduced a dynamic approach in the related area of detecting errors in computer programs. Usually, they exploit surface-oriented indicators like sentence length, word length, word frequency, etc.
This article describes an API for exploring the logical document and the logical network structure of wikis. It introduces an algorithm for the semantic preprocessing, filtering and typing of these building blocks.
Further, this article models the process of wiki generation based on a unified format of syntactic, semantic and pragmatic representations. This three-level approach to making the syntactic, semantic and pragmatic aspects of wiki-based structure formation accessible is complemented by a corresponding database model, called WikiDB, and an API operating on it. Finally, the article provides an empirical study of using the threefold representation format in conjunction with WikiDB.
This article addresses challenges in maintaining and annotating image resources in the field of iconographic research. We focus on the task of bringing together generic and extensible techniques for resource and annotation management with the highly specific demands in this area of research.
Special emphasis is put on the interrelation of images, image segments and textual contents. In addition, we describe the architecture, data model and user interface of the open annotation system used in the image database application that is part of the eHumanities Desktop.
This paper presents a simulation model of self-organizing lexical networks. Its starting point is the notion of an association game in which the impact of varying community models on the emergence of lexical networks is studied.
The paper reports on experiments whose results are in accordance with findings in the framework of the naming game. This is done by means of a multilevel network model in which the correlation of social and linguistic networks is studied.
In this paper we present a corpus representation format which unifies the representation of a wide range of dependency treebanks within a single model. This approach provides interoperability and reusability of annotated syntactic data, which in turn extends its applicability within various research contexts.
Inselmotive in der Kinder- und Jugendliteratur (Island Motifs in Children's and Young Adult Literature).
Islands have existed in various legend cycles since antiquity. On these islands there is no suffering, no hunger or thirst, no illness and no death; everything seems perfect. These places have always had a magical effect on people. Islands are regarded as self-contained counter-worlds. With regard to the island motif within children's. Anja Brzezinski.
Do you want to learn German quickly and easily without being bored by grammar?
This book will help you train your language skills while saving you time. It contains five short, easy reading texts on the topic of shopping. These short stories train your vocabulary. Reading is an effective way to learn new vocabulary in context. Important words on the topic are listed at the end of the book.
For every text there are ten easy questions with solutions. To translate words into your language, use the translation feature of your e-book reader.
Newly arrived in Berlin, a young man from Sicily is thrown headlong into an unfamiliar urban lifestyle of unkempt bachelor pads, evanescent romances and cosmopolitan encounters of the strangest kind.