
The extensive, unstructured correspondence of the Portuguese Empire (1610–1833), archived in the Arquivo Histórico Ultramarino de Lisboa, poses a great challenge to traditional research methods because of its complexity and volume. This study presents the MAPE Engine, an AI-powered framework that integrates natural language processing (NLP), large language models (LLMs), embedding-based validation, and clustering techniques to extract information from approximately 180,000 historical records. The method automatically assigns concise, contextual topics to each piece of correspondence and organizes them into thematic clusters that reveal overarching categories, including colonial administration, maritime trade, religious affairs, and mobility. By combining the multilingual and contextual understanding of the LLaMA 3.2 model with clustering algorithms, the approach overcomes limitations of traditional archival processing and makes the records more accessible and easier to interpret. The MAPE Engine opens new possibilities for archival research: it enables international scholars and history enthusiasts to explore hidden patterns and connections in historical datasets through a user-friendly bilingual tool in English and Portuguese.
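To make the topic-clustering step concrete, here is a minimal sketch of how short topic labels could be grouped by similarity. It is not the MAPE Engine's implementation: the topic strings are invented for illustration, `embed` is a toy bag-of-words stand-in for the LLM embeddings mentioned above, and the greedy clustering rule and its `threshold` value are assumptions chosen for simplicity.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_topics(topics, threshold=0.3):
    """Greedy clustering: a topic joins the first cluster whose first
    member is at least `threshold` similar, else it starts a new cluster."""
    clusters = []  # list of (leader_vector, member_topics)
    for topic in topics:
        vec = embed(topic)
        for leader, members in clusters:
            if cosine(vec, leader) >= threshold:
                members.append(topic)
                break
        else:
            clusters.append((vec, [topic]))
    return [members for _, members in clusters]


# Hypothetical topic labels, as an LLM might assign to individual letters.
topics = [
    "maritime trade in sugar",
    "appointment of colonial governor",
    "maritime trade in tobacco",
    "petition to colonial governor",
]
clusters = cluster_topics(topics)
for members in clusters:
    print(members)
```

In a realistic setting the count vectors would be replaced by dense embeddings from a language model, and the greedy rule by a standard clustering algorithm; the sketch only shows the overall shape of the step.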
Project MAPE: Mapping the Atlantic Portuguese Empire is carried out in collaboration with Demival Vasques Filho (University of Luxembourg), Clodomir Santana (University of California-Davis) and Irene Vicente-Martín (University of Salamanca).
We gratefully acknowledge the support of the National Science Center of Poland, OPUS: 2022/45/B/HS3/00473.
REGISTRATION: https://docs.google.com/forms/d/e/1FAIpQLSdmxJpEoarSSWW6e-EhZOeoUYOQSnQZ0tJNedn7oX3GP2hI3w/viewform
