We present a large Spanish-Catalan parallel corpus extracted from ten years of the paper edition of a bilingual Catalan newspaper. The resulting corpus of 7.5 M parallel sentences (around 180 M words per language) is useful for many natural language applications. We report excellent results when building a statistical machine translation system trained on this parallel corpus. The Spanish-Catalan corpus is partially available via ELDA (Evaluations and Language Resources Distribution Agency) under catalog number ELRA-W0053.
In a decentralised and distributed environment, collaboration requiring the sharing and building of applications is a complex task. For this reason, we propose LaCOLLA, a fully decentralised peer-to-peer middleware that aims to simplify the process of incorporating collaborative functionalities into any application. It provides applications with certain essential collaborative functionalities: dissemination of information, storage, presence and transparency of location, management of members and groups, and execution of tasks. A distinguishing feature of LaCOLLA is that participants provide resources for the benefit of the group. This enables collaboration activities to take place in a collective environment using only the resources provided by participants in the collaboration (self-sufficiency). In this paper we present and evaluate the architecture of LaCOLLA, its API, and key aspects of its implementation.