Open Tamil Texts for Machine Processing

Date: Saturday, January 18, 2020
Time: 9:30 am - 11:30 am (EST, local time); 8:00 pm - 10:00 pm (IST, India/Sri Lanka time)

Location: The BRIDGE Boardroom: IC 111

University of Toronto Scarborough Campus (UTSC)
Instructional Centre (IC Building), ground floor
1095 Military Trail
Toronto, Ontario M1C 1A4

Zoom link:
(The event is both virtual and in-person.)

Hosted by: The Digital Tamil Studies project at the UTSC Library.

High-quality open Tamil-language texts are difficult to source and often require substantial cleaning and processing before they can be used for digital scholarship and application development. The Digital Tamil Studies project, based at the UTSC Library, has been developing partnerships to create better-quality Tamil text data for machine processing. This work is intertwined with many other activities, such as text analysis, natural language processing, and the development of multilingual digital repositories, and with the Digital Tamil Studies community writ large. Please join us for a roundtable with Tamil computing practitioners and users discussing projects and developments in this area.


  • Current State of Open Tamil Datasets - Ravi Annaswamy (Information Architect)
  • Python Libraries for Tamil Computing - Muthu Annamalai (Software Engineer)
  • Linked and Structured Tamil Data for Machine Learning - Saatviga Sudhahar (Machine Learning Scientist)
  • Tamil Computing Needs for Libraries - Natkeeran L. Kanthan (Software Developer)

Presentations will be followed by discussions. 

Kirsta Stapelfeldt -
Natkeeran L. Kanthan -