Scientific Publications

Here you will find the project's main scientific output to date: data, papers and presentations in the fields of digital humanities, computational linguistics and history. Below them, we also list the events hosted by the project to foster interdisciplinary discussion on digitised newspapers.

Publications

2019

Ehrmann, Maud, Matteo Romanello, Stefan Bircher, and Simon Clematide. 2019. ‘Introducing the CLEF 2020 HIPE Shared Task: Named Entity Recognition and Linking on Historical Newspapers’. In Proceedings of the 42nd European Conference on IR Research, ECIR 2020 (to appear).

Ehrmann, Maud, Estelle Bunout, and Marten Düring. 2019. ‘Historical Newspaper User Interfaces: A Review’. In WLIC proceedings. Athens, Greece: IFLA. Related dataset: ‘Survey of Digitized Newspaper Interfaces’, available on Zenodo.

2018

Amrhein, Chantal, and Simon Clematide. 2018. ‘Supervised OCR Error Detection and Correction Using Statistical and Neural Machine Translation Methods’. Journal for Language Technology and Computational Linguistics (JLCL) 33 (1): 49–76.

Clematide, Simon, Lenz Furrer, and Martin Volk. 2018. ‘Crowdsourcing the OCR Ground Truth of a German and French Cultural Heritage Corpus’. Journal for Language Technology and Computational Linguistics (JLCL) 33 (1): 25–47.

Makarov, Peter, and Simon Clematide. 2018a. ‘Imitation Learning for Neural Morphological String Transduction’. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2877–82. arXiv:1808.10701.

Makarov, Peter, and Simon Clematide. 2018b. ‘Neural Transition-Based String Transduction for Limited-Resource Setting in Morphology’. In Proceedings of the 27th International Conference on Computational Linguistics, 83–93.

Makarov, Peter, and Simon Clematide. 2018c. ‘UZH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection’. In Proceedings of the CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection, 69–75. Brussels: Association for Computational Linguistics.

Datasets

Ströbel, Phillip, and Simon Clematide. 2019. ‘Ground truth for Neue Zürcher Zeitung black letter period’, available on Zenodo.

Ehrmann, Maud, Estelle Bunout, and Marten Düring. 2019. ‘Survey of Digitized Newspaper Interfaces’, available on Zenodo.

Shared Task

Ehrmann, Maud, Matteo Romanello, and Simon Clematide. 2020. ‘CLEF 2020 HIPE Shared Task: Named Entity Recognition and Linking on Historical Newspapers’, Evaluation Lab of CLEF 2020.

Presentations

2019

Bunout, Estelle. 2019a. ‘Grasping the Anti-Modern: How to Identify Anti-Modern Discourses on Europe in a Digitized Newspaper Collection (Using a Naïve Bayes Classifier and Topic Modelling)’. C2DH Research Seminar, October 23.

Bunout, Estelle. 2019b. ‘Can the Digitised Newspapers Enable the Reconstruction of the European Debates in Switzerland and Luxembourg, from 1918-1945?’ Presented at the Tensions of Europe Conference: Decoding Europe Technological Pasts in the Digital Age, Belval, June 27.

Clematide, Simon, and Phillip Ströbel. 2019. ‘Historical Media Monitoring with Impresso’. Demo presented at the Swiss Text Analytics Conference, Winterthur, June 19.

Düring, Marten, and Estelle Bunout. 2019. ‘Complexities in the Use, Analysis, and Representation of Historical Digital Periodicals’. Presented at the Digital Humanities 2019: Complexities, Utrecht, December 7.

Ehrmann, Maud. 2019a. ‘Beyond Keyword Search - Semantic Indexing and Exploration of Large Collections of Historical Newspapers’. Keynote presented at the DHN 2019, Copenhagen, June 3.

Ehrmann, Maud. 2019b. ‘Le Projet Impresso «Media Monitoring of the Past - Fouiller Deux Siècles de Journaux Historiques»’. Presented at the Journée d’étude annuelle du projet Numapresse, Nîmes, June 20.

Ehrmann, Maud. 2019c. ‘The Past, Present and Future of Digital Scholarship with Newspaper Collections’. Presented at the Digital Humanities 2019: Complexities, Utrecht, October 7.

Ehrmann, Maud. 2019d. ‘Stronger Multilateralism through Knowledge Heritage and Culture’. Presented at the 100 Years of Multilateralism in Geneva, June 17.

Ehrmann, Maud, Simon Clematide, and Matteo Romanello. 2019. ‘Shared Task on Named Entity Recognition and Linking on Historical Newspapers’. Presented at the CLEF, Lugano, September 9.

Ehrmann, Maud, Matteo Romanello, and Simon Clematide. n.d. ‘Named Entity Processing for Digital Humanities’. Accessed 27 August 2019.

Ströbel, Phillip. 2019. ‘Improving OCR of Black Letter in Historical Newspapers: The Unreasonable Effectiveness of HTR Models on Low-Resolution Images’. Presented at the Digital Humanities 2019: Complexities, Utrecht, December 7.

2018

Bunout, Estelle. 2018. ‘Une Recherche plus Fouillée Dans Un Corpus Imparfait? L’étude de La Question Européenne Dans La Presse Numérisée Suisse et Luxembourgeoise (1848-1945)’. Presented at the Journée d’étude - L’histoire contemporaine à l’ère numérique : sources, méthodologies, critiques, Lausanne, April 7.

Bunout, Estelle, and Marten Düring. 2018. ‘Implementing Transparency’. Presented at the Digital Hermeneutics in History: Theory and Practice Workshop, Esch-sur-Alzette.

Bunout, Estelle, and Paul Schroeder. 2018. ‘Impresso’. Presented at the C2DH/LCSB Data Visualisation Workshop, Esch-sur-Alzette, April 24.

Chambers, Sally, Steven Claeyssens, Estelle Bunout, Marten Düring, Clemens Neudecker, Jaap Verheul, and Pim Huijnen. 2018a. ‘Corpus Creation and Digitised Newspapers: Perspectives from Research and Libraries’. Presented at the DH Benelux 2018, Amsterdam, August 6.

Chambers, Sally, Steven Claeyssens, Estelle Bunout, Marten Düring, Clemens Neudecker, Jaap Verheul, and Pim Huijnen. 2018b. ‘Transparency as a Prerequisite for Digital Source Criticism of Digitized Newspapers’. Presented at the DH Benelux 2018, Amsterdam, August 6.

Ehrmann, Maud. 2018a. ‘Spotlight Presentation of Impresso Project’. Presented at the Time Machine Conference, October 31.

Romanello, Matteo. 2018. ‘Annotating Named Entities in Historical Newspapers and Scholarly Publications’. Presented at the INCEpTION Workshop, Darmstadt.

Ehrmann, Maud. 2018b. ‘Detecting Text Reuse in Newspapers Data with Passim’. Presented at the Hacking the News Workshop in conjunction with DHN 2018, Helsinki.

Ströbel, Phillip. 2018a. ‘Computerlinguistische Methoden Für Bessere Zugänglichkeit von Historischen Zeitungsberichten. Die NZZ Im Wandel Der Zeit’. Presented at the Vom DIARIUM zum DIGITARIUM, Vienna, April 25.

Ströbel, Phillip. 2018b. ‘Bridging Literature and Information Science’. Presented at the Digital Humanities Austria, Innsbruck, December 6.

Community building

Impresso talks

2019

Bergamini, Enrico, and Emmanuel Mourlon-Druol. 2019. ‘Impresso Talk #9: Talking about Europe: In La Stampa, Le Monde, Die Zeit and Der Spiegel, 1940s - 2010s’. Presented at the impresso Talks Series, Belval, October 12.

Keck, Jana, and Moritz Knabben. 2019. ‘Impresso Talk #8: Visualization for Newspapers Corpus Exploration across Time and Space’. Presented at the impresso Talks Series, Belval, October 29.

Marschall, Ralph. 2019. ‘Impresso Talk #7: Development of a New Digital Document Viewer at the Luxembourg National Library’. Presented at the impresso Talks Series, Belval, June 18.

Glaurdić, Josip, and Michal Mochtak. 2019. ‘Impresso Talk #6: Thus Spoke the People: Public Discourse in Belgrade’s Politika on the Eve of Yugoslav Wars’. Presented at the impresso Talks Series, Belval, May 6.

Lange, Milan van. 2019. ‘Impresso Talk #5: Beyond the Tantrum: Questioning Shared Intuitions in Historiography with Emotion Mining’. Presented at the impresso Talks Series, Belval, April 17.

Seul, Stephanie. 2019. ‘Impresso Talk #4: German Antisemitism and the Press during the Weimar Republic (1918-1933)’. Presented at the impresso Talks Series, Belval, March 19.

Wevers, Melvin. 2019. ‘Impresso Talk #3: A Data-Driven Approach to Exploring and Analyzing Digitized Newspapers’. Presented at the impresso Talks Series, Belval, February 27.

2018

Doucet, Antoine. 2018. ‘Impresso Talk #1: Sequential Pattern Mining for Robust Event Detection’. Presented at the impresso Talks Series, Lausanne, November 10.

Marjanen, Jani, and Simon Hengchen. 2018. ‘Impresso Talk #2: The Omnipresence of the Nation: Preliminary Remarks in Studying Nationhood through Digitized Newspapers’. Presented at the impresso Talks Series, Belval, October 30.

Impresso Community Calls

Ströbel, Phillip, and Simon Clematide. 2019. ‘Community Call #2: Topic Modeling in the Impresso Web Application’. Presented at the impresso Community Call, Zurich, May 17.

Guido, Daniele, and Estelle Bunout. 2018. ‘Community Call #1: Introducing the Impresso Interface’. Presented at the impresso Community Call, Belval, September 11 2018.

Workshops

2019

Ehrmann, Maud. 2019. ‘Digillu-Workshop: Zusammenstellung und Erschließung von Korpusdaten’. Workshop, November 25.

2018

impresso team. 2018. ‘Impresso Workshop #2 “Buttercup” Interface Co-Design’. Workshop, Belval, February 8 2018.

impresso team. 2018. ‘Impresso Workshop #3 “Edelweiss”: Interface Co-Design’. Workshop, Basel, February 23 2018.

impresso team. 2018. ‘Impresso Workshop #4 “Laurel” - Co-Design, Integrating Digitized Press in Historians’ Workflow’. Workshop, Lausanne, July 5-7 2018.

impresso team. 2018. ‘Impresso Workshop #5 “Lavender” - Co-Design, Integrating Digitized Press in Historians’ Workflow’. Workshop, Belval, July 11 2018.

Ehrmann, Maud. 2018. ‘Les magazines illustrés de la première moitié du 20e siècle à l’ère des humanités numériques – Allemagne / France en regard et acteurs en dialogue’. Paris, November 29. https://digillu.hypotheses.org/.

2017

impresso team. 2017. ‘Impresso Workshop #1 - Kick-off Workshop and 1st Consortium Meeting’. Workshop, Lausanne, October 24 2017.

Try our interface
impresso-project.ch/app/