
Francesco BIANCO


Francesco BIANCO is Assistant Professor of Italian linguistics at the Department of Romance Languages of Palacký University Olomouc. His main research interests range from the syntax of Old Italian to varieties of Italian (emigrants’ language, bureaucratic Italian, popular Italian, etc.). His most recent work explores one of the intersections of language and technology, aiming in particular to shed light on contemporary Italian through data science and artificial intelligence. He is the author of several papers on various topics. He has also published “Breve guida alla sintassi italiana” (Cesati, Florence, 2017) and, with Sandro Mattioli, “Bessersprecher Italienisch” (Conbook, Meerbusch, 2017). Further information can be found on his personal website.

Project at the MaCI

APID. Building an Archive of Popular Italian Documents

January 2025 – July 2025


APID is a project in Italian sociolinguistics and corpus linguistics, within the framework of the digital humanities. It aims to design and implement the first corpus entirely dedicated to popular Italian, i.e. the Italian used by people with little formal education. The documents to be included in APID mostly belong to the so-called primary forms of writing (letters, postcards, diaries, notes, etc.) and come from both unpublished archives and published works (papers, books, archives). These documents will be lightly encoded in XML-TEI so that they can later be annotated and processed in various ways (for automatic search, viewing, etc.). The data will be made available in two ways: through a lightweight web search interface and through a dedicated API for more customizable and advanced uses. This DH-oriented approach will give scholars a wider range of working possibilities, including easily integrating data from different sources (which is nowadays often difficult or even impossible) or processing them with third-party software. Furthermore, it will ensure the longest possible life for the project's main output, and it will reduce the cost and duration of the project.

The main outcome of the APID project, i.e. the corpus, is conceived as a “shell” capable of hosting a dataset that is likely to evolve and grow. It will therefore be designed as a flexible, minimalist product that is easy to develop further, improve, and reuse in other projects. The source code will be released through a public repository. During the six-month stay at MaCI, this shell is expected to be fully designed and developed, and some texts will be published as the first core of the corpus, which will later be expanded.






Funded by the French government's Programme d'Investissement Avenir and implemented by ANR France 2030


Published on 11 December 2024

Updated on 8 January 2025