The project resulted in multiple outputs, some of which, due to their nature, we will continue to develop in the years to come.

DECM Historical Gazetteer

The historical gazetteer is a digital directory of historical places of Mexico and Guatemala. It was created from primary sources and research carried out by the team, together with information collected from comprehensive studies on the political, religious, and administrative units of the Viceroyalty of New Spain. The gazetteer contains thousands of place names, their alternative names and locations, and a wealth of other historical geographic information in Geographic Information Systems (GIS) format.

The research resulted in the DECM Online Gazetteer and a downloadable DECM GIS Dataset.

The process followed in creating the gazetteer is explained in the report ‘The Creation of the DECM Historical Gazetteer’.

You can also consult a summary of the development of the DECM Historical Gazetteer in this story map.

Please cite this resource as:

Murrieta-Flores, P., Jiménez-Badillo, D., Martins, B., Liceras-Garrido, R., Favila-Vázquez, M., and Bellamy, K. (2023) 
Digging into Early Colonial Mexico Historical Gazetteer. Figshare, Dataset. DOI:10.6084/m9.figshare.12301682

DECM Corpus

The DECM Corpus is a digital corpus produced from the original editions of the texts of the Relaciones Geográficas (RGs). It is available in several versions: a machine-ready version, a manually annotated training dataset for machine learning, and an automatically annotated version ready for text mining and machine learning experiments. All of these can be downloaded from the DECM Github Corpus page.

The DECM Machine Ready Corpus

This version includes text-only files (.txt) containing each of the 10 volumes originally edited by René Acuña, the 2 volumes edited by Mercedes de la Garza et al., and the Suma de Visita edited by Del Paso y Troncoso, as well as a file with the original text of the Crown mandate (Instrucción) and metadata for the collection. This version contains only the original text of each RG as transcribed by the scholars, excluding any editorial notes, commentary, or historical work. It can therefore be used directly for corpus linguistics analyses, visualisations, etc.
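As a rough illustration of the kind of corpus linguistics analysis plain-text files allow, the following Python sketch counts word frequencies. It is not part of the DECM release: the function name `word_frequencies`, the minimum word length, the sample sentence, and the file name mentioned in the comment are all assumptions made for this example.

```python
import re
from collections import Counter


def word_frequencies(text, min_length=4):
    """Count lower-cased word tokens that are at least `min_length` letters long."""
    # [^\W\d_]+ matches runs of letters, including accented ones such as "á" or "ñ".
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(token for token in tokens if len(token) >= min_length)


# Stand-in text. With the downloaded corpus one might instead read a file, e.g.
#   text = pathlib.Path("volume_01.txt").read_text(encoding="utf-8")
# (the file name above is hypothetical).
sample = "El pueblo de Cholula tiene un río; el pueblo está en tierra fría."

freqs = word_frequencies(sample)
top_word, top_count = freqs.most_common(1)[0]
print(top_word, top_count)
```

The length filter is only a crude stand-in for a stop-word list; a real analysis would probably use a curated list for sixteenth-century Spanish.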

Please cite this resource as:

Murrieta-Flores, P., Jiménez-Badillo, D., Martins, B. (2023) DECM Machine Ready Corpus. 
Figshare, Dataset. DOI:10.6084/m9.figshare.12048729

The DECM ML-Training corpus

This version contains a sample of the RGs manually annotated by multiple researchers with the software of our industry partner, Tagtog. This corpus has been used to carry out the NLP and ML experiments. The files, which contain both the texts and their annotations, are available in JSON and TSV format. The corpus is accompanied by the DECM ontology, which explains the entities and labels used. It can be used for further experimentation with Artificial Intelligence methods.

Please cite this resource as:

Murrieta-Flores, P., Liceras-Garrido, R., Favila-Vázquez, M., Jiménez-Badillo, D. (2023) DECM ML Training Corpus. 
Figshare, Dataset. DOI:10.6084/m9.figshare.12366734

The DECM Annotated Corpus

This is the version of the entire RG corpus automatically annotated using the ML models trained with the DECM Gold Standard Corpus. The files are available in JSON and TSV format, and the release also includes the DECM Ontology file. This corpus can be used for quantitative and qualitative research, as well as for advanced analyses using text mining techniques, corpus linguistics and other methods such as Geographical Text Analysis.

Please cite this resource as:

Murrieta-Flores, P., Jiménez-Badillo, D., Martins, B., Favila-Vázquez, M., Liceras-Garrido, R. (2023) DECM Annotated Corpus. 
Figshare, Dataset. DOI:10.6084/m9.figshare.12366956

The DECM Ontology and Annotation Rules

This .xls file contains two sheets. The first, called ‘Ontology’, defines the entities and labels used to annotate the corpus of the RGs. It comprises 18 entities and labels marking important social, political, territorial, religious, and economic information. The second, called ‘Annotation rules’, includes the basic rules followed by all the annotators in the project, along with examples that help in making decisions while annotating. These rules were designed to improve annotator consensus, which reached up to 98 per cent for some entities.

DECM Geographical Text Analysis Software

The GTA software, developed in two beta versions (v.1 and v.2), combines concepts from Corpus Linguistics, Natural Language Processing, and Geographic Information Systems. Our research group first developed the idea at Lancaster in the context of the Spatial Humanities project (see Murrieta-Flores et al., 2015). A detailed description of the method as implemented in the software can be found in Jiménez-Badillo et al., 2021; Murrieta-Flores et al., 2022; and Murrieta-Flores et al., forthcoming. The method involves applying Geographic Collocation Analysis. The software combines a corpus viewer, a query interface, and a keyword-in-context tool connected to a map explorer and a historical gazetteer. The tool identifies concepts and/or terms and their associations with places and their coordinates in very large corpora, allowing users to explore the corpus in different ways and download the results for further analysis in Geographic Information Systems or other tools.
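The general idea behind geographic collocation can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the DECM implementation: the function name `geographic_collocates`, the token window, the two-entry gazetteer with approximate coordinates, and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-gazetteer: place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "tenochtitlan": (19.43, -99.13),
    "cholula": (19.06, -98.30),
}


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def geographic_collocates(text, term, gazetteer, window=5):
    """Count gazetteer places found within `window` tokens of each hit of `term`.

    A place mention is counted once for every occurrence of `term`
    whose window contains it.
    """
    tokens = tokenize(text)
    term = term.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != term:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for neighbour in tokens[start:end]:
            if neighbour in gazetteer:
                counts[neighbour] += 1
    # Attach coordinates so the result could be exported to a GIS.
    return [(place, n, gazetteer[place]) for place, n in counts.most_common()]


# Made-up sample text, not taken from the corpus.
sample = (
    "Hay mucho maiz en Cholula y poco maiz cerca de Tenochtitlan. "
    "En Cholula se coge maiz dos veces al año."
)

results = geographic_collocates(sample, "maiz", GAZETTEER, window=4)
for place, count, (lat, lon) in results:
    print(place, count, lat, lon)
```

This sketch matches single-token place names exactly; handling multi-word toponyms, spelling variants, and annotated entities would require the annotated corpus and a full gazetteer.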

In the current version, the software works by uploading an annotated corpus and retrieving a historical gazetteer through an API. We are still developing the tool to open it to other users, so that everyone can work with their own tailored annotated corpora and gazetteers or bring in a geographic index from projects such as the World Historical Gazetteer. At the moment, a demo with a sample of the corpus of the Relaciones Geográficas can be explored here: https://gta.colonialatlas.com/v2

The code for the software can be found on our DECM Github page.

Please contact us if you would like more information and updates on the development of the GTA Software.

Please cite this resource as:

Alvarez-Rivera, L., Hernández-Huerfano, E., Murrieta-Flores, P., Jiménez-Badillo, D., and Martins, B. (2023) DECM Geographical Text Analysis Software. Figshare, Software. DOI:10.6084/m9.figshare.21696794

The Relaciones Geográficas de Nueva España Digital Collection

The Relaciones Geográficas de la Nueva España (1577–1585) digital collection brings together images from the original documents, transcriptions, maps and thematic information from the historical source. This site aims to encourage public interest in these documents and to facilitate the work of historians, ethnologists, archaeologists and linguists interested in the corpus. Each geographical relation is accompanied by a topographical map showing the location of the villages and the geographical features mentioned in the documents. Where an account includes a pictographic map, the map is displayed in high resolution so that all its details can be appreciated easily. In addition, an image of each folio is presented together with the corresponding transcription. Finally, bibliographical references to publications devoted to transcribing, editing or analysing each geographical relation are included.

Other resources

Pathways to understanding sixteenth-century Mesoamerica, funded by the Department of History at Lancaster University, is a spin-off project which created a series of three ESRI StoryMaps. These combine interactive texts, images and maps into online learning resources on the history, archaeology and geography of the Postclassic and Colonial periods of Central Mexico, from the 14th to the mid-16th century.

The Subaltern Recogito Dataset and Ontology were developed as part of the ‘Subaltern Recogito Project’, funded by a Pelagios Commons Resource Development Grant, to explore the annotation of a series of historical maps using Recogito. Our corpus of maps includes those produced in the sixteenth century for the Relaciones Geográficas de Nueva España across the area that is currently Mexico. The project took place in collaboration with our colleagues in the LLILAS Benson Latin American Studies and Collections at The University of Texas at Austin, the National School of Anthropology and History (ENAH), the National Autonomous University of Mexico (UNAM), the National Institute of Anthropology and History (INAH), and the University of Lisbon. We delivered an online workshop and trained participants to use Recogito for the annotation of the sixteenth-century maps of the Relaciones Geográficas.

Twenty-seven scholars from UNAM and ENAH participated, and Patricia Murrieta-Flores introduced the Spatial Humanities and the use of these technologies. From this, the project evolved into a citizen science project, where the participants met online every week to participate in ‘mappathons’, completing the annotation of a set of sixteenth-century maps now available in the Benson collection.

Please cite this resource as:

Murrieta-Flores, P., Favila-Vázquez, M., Bellamy, K., Jiménez-Badillo, D., Martins, B., López Camacho, J., McDonough, K., and Palacios, Albert A. (2019) Pelagios Commons: Subaltern Recogito Project. DOI: https://doi.org/10.18738/T8/L2SJQT, Texas Data Repository Dataverse, V2.

Publications by the project:

  • Murrieta-Flores, P., Jiménez-Badillo, D., Favila-Vázquez, M. (forthcoming) Placing New Spain through Early Modern Big Data: Developing a Geographical Text Analysis Approach to the Relaciones Geográficas de Nueva España. In Mackenzie Cooley and Huiyi Wu (eds.) Knowing the Empire in Early Modern China and Spain.
  • Favila-Vázquez, M.; Liceras-Garrido, R.; Murrieta-Flores, P.; and Bellamy, K. (2022-In press): De cosmógrafos reales a cartógrafas digitales: construyendo un diccionario geográfico digital de la Nueva España. In Armando Trujillo (ed.) Sistemas de información geográfica para arqueólogos: Repensando el espacio en contextos arqueológicos mesoamericanos. México: El Colegio Mexiquense.
  • Murrieta-Flores, P., Jiménez-Badillo, D., and Martins, B. (2022) Artificial Intelligence and Computational Approaches to Investigate the History of Early Colonial Mexico. Oxford Research Encyclopedia of Latin American History. DOI: 10.1093/acrefore/9780199366439.013.977
  • Murrieta-Flores, P., Favila-Vázquez, M., and Flores-Morán, A. (2021) Indigenous Deep Mapping: A conceptual and representational analysis of space in Mesoamerica and New Spain. In Bodenhamer, D., Corrigan, J., and Harris, T. (Eds.), Making Deep Maps: Foundations, Approaches, and Methods. London: Routledge, pp. 78-111.
  • Jiménez-Badillo, D., Murrieta-Flores, P., Martins, B., Gregory, I., Favila-Vázquez, M., Liceras-Garrido, R., & Bellamy, K. (2021). Análisis histórico-geográfico de documentos novohispanos del siglo XVI mediante técnicas de lingüística computacional y análisis espacial. En D. Jiménez-Badillo (Ed.), Métodos computacionales y técnicas digitales para analizar y divulgar el patrimonio cultural. México: Instituto Nacional de Antropología e Historia.
  • Jiménez-Badillo, D., Murrieta-Flores, P., Martins, B., Gregory, I., Favila-Vázquez, M., and Liceras-Garrido, R. (2020) Developing Geographically Oriented NLP Approaches to Sixteenth–Century Historical Documents: Digging into Early Colonial Mexico. Digital Humanities Quarterly, 14 (1).
  • Liceras-Garrido R., Favila-Vázquez, M., Bellamy, K., Murrieta-Flores P., Jiménez-Badillo, D., and Martins, B. (2019) ‘Digital Approaches to Historical Archaeology: Exploring the Geographies of 16th Century New Spain’. Open Access J Arch & Anthropol, 2(1) DOI: 10.33552/OAJAA.2019.02.000526
  • Murrieta-Flores, P., Favila-Vázquez, M., and Flores-Morán, A. (2019) ‘Spatial Humanities 3.0: QSR and Semantic Triples as New Means of Exploration of Complex Indigenous Spatial Representations in Sixteenth Century Early Colonial Mexican Maps’. International Journal of Humanities and Arts Computing 13 (1–2): 53–68. https://doi.org/10.3366/ijhac.2019.0231.
  • Murrieta-Flores, P., and Bellamy, K. (2019) Annotating 16th century Mexican historical maps with Recogito. EuropeanaTech Insight, Europeana Foundation, Issue 12 (Pelagios). https://pro.europeana.eu/page/issue-12-pelagios#annotating-16th-century-mexican-historicalmaps-with-recogito
  • Murrieta-Flores, P., and Martins, B. (2019) The Geospatial Humanities: past, present and future. International Journal of Geographical Information Science, 33 (12). DOI: 10.1080/13658816.2019.1645336
  • Monteiro J., Martins, B., Murrieta-Flores, P., and Pires, J.M. (2019) Spatial Disaggregation of Historical Census Data Leveraging Multiple Sources of Ancillary Data. International Journal of Geo-Information. 8 (327). DOI:10.3390/ijgi8080327
  • Santos, R., Murrieta-Flores, P., and Martins, B. (2017) Learning to Combine Multiple String Similarity Metrics for Effective Toponym Matching. International Journal of Digital Earth. DOI: 10.1080/17538947.2017.1371253 (Received 13 Feb 2017, Accepted 20 Aug 2017, Published online: 06 Sep 2017)
  • Santos, R., Murrieta-Flores, P., Calado, P., and Martins, B. (2017) Toponym Matching Through Deep Neural Networks. International Journal of Geographical Information Science. 32 (2), 324-348, DOI: 10.1080/13658816.2017.1390119. (Received 16 May 2017, Accepted 05 Oct 2017, Published online: 31 Oct 2017)
  • Santos, R., Murrieta-Flores, P., and Martins, B. (2017) An Automated Approach for Geocoding Tabular Itineraries. GIR’17 Proceedings of the 11th Workshop on Geographic Information Retrieval. New York: Association for Computing Machinery, Inc, 10 p. 8. DOI: 10.1145/3155902.3155908.




The Geographic Reports of New Spain

The information contained in the Relaciones Geográficas constitutes one of the most important sources of knowledge for the study of both Prehispanic (particularly the Post-classic period) and Colonial life in America. The sheer size of the source material has, in many ways, restricted access to this varied and complex wealth of information. As such, utilising computational techniques offers a unique approach to the study of these sources, considerably improving accessibility in the process.

Our corpus largely consists of the comprehensive editions of René Acuña, recently digitised and made available by the Instituto de Investigaciones Antropológicas. Acuña compiled a considerable majority of the Relaciones Geográficas del Siglo XVI, culminating in the publication of the following editions:

René Acuña’s editions do, however, omit the province of Yucatán. We are therefore using Mercedes de la Garza’s Relaciones Histórico-geográficas de la Gobernación de Yucatán to complement the work of Acuña.

In addition, our research is drawing upon the following resources to provide further background:

Paso y Troncoso, F. del (1905) Papeles de Nueva España. Segunda Serie Geográfica y Estadística. Madrid: Establecimiento Tipográfico Sucesores de Rivadeneyra.

This collection encompasses the geographic relations compiled during the 16th century from diverse dioceses and archdioceses in the region of Mexico. These were edited and transcribed by Francisco del Paso y Troncoso. Although 8 volumes were originally planned, only 6 were published in 1905.

Real Academia de la Historia (1900) Colección de Documentos Inéditos de las Posesiones de España en Ultramar. Segunda Serie. Madrid: Establecimiento Tipográfico Sucesores de Rivadeneyra.

This is a broad compilation of diverse historical sources located at the Archivo General de Indias, in Seville, Spain. The chosen volumes include a range of important documents with information discussing geographic, legislative and political matters, amongst others.

  • Volume 5: Documentos legislativos.
  • Volumes 9-10: Documentos legislativos.
  • Volumes 11 and 13: Relaciones de Yucatán.
  • Volumes 14 to 19: Índice general de los papeles del consejo de Indias.
  • Volumes 20 to 25: Gobernación espiritual y temporal de las Indias.

This corpus consists of more than 2.8 million words and thousands of pages. Digitised versions of the documents are available through Europeana, Google Books, and the Internet Archive.


Creating resources for future research


‘Pathways to understanding 16th century Mesoamerica’, funded by the Department of History at Lancaster University, is a spin-off project which created a series of three ESRI StoryMaps, combining interactive texts, images and maps in a series of online interactive learning resources on the history, archaeology and geography of the Postclassic and Colonial period of Central Mexico, beginning in the 14th through to the mid-16th century.




The first of the story maps explores the history of the Mexica people, beginning with their journey to the foundation of Tenochtitlan in 1325, which would become (alongside its neighbour city to the north, Tlatelolco) the heart of the Triple Alliance. Following this, the story map shows how the Mexica began to expand, featuring the lists of conquered settlements as recorded in the Codex Mendoza.

This leads up to the arrival of the Spanish, and the ultimate meeting of Moctezuma II and Hernán Cortés in 1519. It then proceeds to describe how Cortés, with considerable assistance from his indigenous allies, conquered Tenochtitlan.

This story map concludes with a look at the beginning of the colonial era, exploring how the Spanish began to impose their own institutions across ‘New Spain’, with varying success due to the continuing influence of indigenous institutions across Mesoamerica.



This story map explores the nature of historic place-names across what is currently Mexico, introducing the importance of place-names and language as a tool of colonisation and empire. The story map explores how this tool was used not only by the Spanish, but also by the Mexica and the Triple Alliance (not to mention other indigenous groups), as part of their systematic colonisation of conquered settlements and people.

The story map goes on to explore how indigenous place-names continued to be used, despite the processes of colonisation at the hands of both the Triple Alliance and the Spanish. In addition, it explores the meaning of Nahua toponymy in particular – demonstrating the use of suffixes such as -tepec (which means ‘inhabited place’) and showing the distribution of some of these examples.

Following this are some case studies of individual place-names, explaining their meaning and how they have been depicted in the historical record. The story map concludes by giving a brief overview of colonial naming, and how indigenous influences have continued.



The final story map discusses depictions of geographic space and place. This starts with an explanation of why this is an important discussion, with particular reference to, and problematisation of, the use of Geographic Information Systems for representing historical geographies.

Following this, the story map introduces the idea of representations of space, which may be unfamiliar to the modern reader, and explores the various types of pre-Hispanic Nahuatl documents, including those which represented geographies.

The story map gives an introduction to the state of Spanish cartography in the sixteenth century, before going on to discuss how the Spanish and Nahua traditions of depicting geography began to merge during the conquest of Mexico. There is considerable evidence of this merging of traditions throughout the historical record, which the story map explains, giving two specific examples.