C4 is a distributed corpus of the German language with a focus on its pluricentric character. It was created between 2004 and 2008 and is maintained by research groups from Basel, Berlin, Bozen and Vienna, representing four varieties of German.
C4 was an international initiative with a twofold purpose. First, it aimed to deliver a corpus of the German language with a focus on its pluricentric character. Second, it was an early technical experiment exploring the feasibility of a distributed corpus search system.
In this project, research groups from Vienna (Austrian Academy Corpus - AAC, now ACDH at the Austrian Academy of Sciences), Berlin (Digitales Wörterbuch der deutschen Sprache des 20. Jh. - DWDS, BBAW), Bozen (Korpus Südtirol, EURAC) and Basel (Schweizer Text Korpus, Deutsches Seminar der Universität Basel) brought together four heterogeneous corpora hosted at separate locations and made them available through a single web interface.
C4 can be seen as a predecessor of the Federated Content Search, an initiative within CLARIN that aims to provide harmonized, integrated search access to distributed language resources (primarily corpora).