Modernizing a 17th Century Italian-English Dictionary with .NET and ML.NET
dotnet demonstrates how to modernize a 17th-century Italian-English dictionary using .NET, ML.NET for custom embeddings, Azure Cosmos DB for storage, and Aspire as the orchestration layer.
Modernizing a 17th Century Italian-English Dictionary
In this session, dotnet presents a technical walkthrough of bringing John Florio’s Renaissance-era Italian-English dictionary into the 21st century. The approach pairs the historical text with modern Microsoft technologies, summarized in the sections that follow.
Key Technologies
- .NET for application development and integration.
- ML.NET to train a custom vector embeddings model, enabling intelligent retrieval and translation features for historical text.
- Azure Cosmos DB to provide scalable, cloud-native storage for dictionary entries and the embedding vectors produced by the model.
- Aspire to orchestrate and glue together the various technological components involved in the project.
Project Steps
- Digitization & Ingestion: The original dictionary text is digitized and analyzed so that its entries can be loaded into the application.
- Custom Model Training: ML.NET is used to create a tailored vector embeddings model, improving translation and lookup performance specifically for Renaissance vocabulary.
- Cloud Storage: All structured data, vectors, and dictionary entries are stored and indexed in Azure Cosmos DB, which provides cloud-native scalability and query support.
- Application Architecture: Aspire arranges how these components communicate, supporting robust API endpoints and service integration.
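To make the ingestion step more concrete, here is a minimal sketch of turning digitized lines into JSON documents of the kind a document database could store. It is written in Python purely for illustration; the session itself uses .NET, and none of this is taken from the project's code. The `Entry` shape, the "headword, definition" line format, the slug-based `id`, and the sample lines are all assumptions made up for this example.

```python
import json
import re
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Entry:
    id: str          # hypothetical document key: a slug of the headword
    headword: str    # Italian term
    definition: str  # English gloss


def parse_line(line: str) -> Optional[Entry]:
    """Split 'headword, definition' on the first comma; return None if malformed."""
    headword, sep, definition = line.strip().partition(",")
    headword, definition = headword.strip(), definition.strip()
    if not sep or not headword or not definition:
        return None
    # \W is Unicode-aware for str patterns, so accented letters are preserved.
    slug = re.sub(r"\W+", "-", headword.lower()).strip("-")
    return Entry(id=slug, headword=headword, definition=definition)


# Made-up sample lines for illustration, not quotations from the dictionary.
raw_text = """casa, a house.
libro, a book.
this line has no separator
"""

# Keep only the lines that parse, then serialize each entry as a JSON document.
entries = [e for e in map(parse_line, raw_text.splitlines()) if e is not None]
documents = [json.dumps(asdict(e), ensure_ascii=False) for e in entries]

for doc in documents:
    print(doc)
```

In a real pipeline each document would also carry its embedding vector and be written to the database rather than printed; malformed lines would more likely be logged for manual review than silently dropped.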
Modernization Outcomes
- Enhanced accessibility for translators and researchers working with early-modern texts.
- New capabilities for semantic search and linguistic analysis using recent machine learning techniques.
- Demonstration of modern software patterns (API, model training, cloud storage) applied to legacy content with Microsoft tools.
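The semantic search capability mentioned above typically comes down to comparing a query vector against stored entry vectors. The sketch below illustrates that retrieval pattern in Python under heavy assumptions: the `embed` function is a toy that hashes character trigrams, so it captures spelling overlap only, whereas a trained embeddings model (such as the one the session builds with ML.NET) is what would capture meaning. The sample glosses, the dimension, and all function names are hypothetical.

```python
import math
import zlib

DIM = 256  # arbitrary toy dimension, not a value from the session


def embed(text):
    """Toy embedding: hash character trigrams into a fixed-size unit vector."""
    vec = [0.0] * DIM
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3].encode("utf-8")
        # crc32 is deterministic across runs, unlike the built-in hash() for strings.
        vec[zlib.crc32(trigram) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity; inputs are unit vectors, so the dot product suffices."""
    return sum(x * y for x, y in zip(a, b))


def search(query, index, k=2):
    """Return the k keys whose vectors are most similar to the query vector."""
    q = embed(query)
    scored = [(cosine(q, vec), key) for key, vec in index.items()]
    return [key for _, key in sorted(scored, reverse=True)[:k]]


# Made-up headword -> gloss pairs for illustration only.
glosses = {"casa": "a house", "libro": "a book", "cavallo": "a horse"}
index = {headword: embed(gloss) for headword, gloss in glosses.items()}

print(search("house", index, k=1))
```

With a database that supports vector indexing, the brute-force loop in `search` would be replaced by a vector query against the stored embeddings; the ranking idea stays the same.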
For more technical talks like this, visit the .NET Conf 2025 playlist: https://www.youtube.com/playlist?list=PLdo4fOcmZ0oXtIlvq1tuORUtZqVG-HdCt