DAAL
DOCUMENT ACCESS ACROSS LANGUAGES
- Mechanisms to enter queries in Indian languages.
- Capability to match Indian language queries against documents in English.
- Mechanisms to view such documents in Indian languages.
Solving the above-mentioned problem requires a combination of:
- Cross-Lingual Information Retrieval (CLIR) to locate English documents on the web via Indian language queries.
- Machine Translation (MT) to translate documents.
Setu
Setu is a realization of DAAL from Hindi to English.
Features of Setu:
- Users can enter search keywords in Hindi (Devanagari script).
- Search is performed using existing popular search engines.
- Documents and results of search are displayed in Hindi after an indicative translation by MaTra.
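The flow described by these features can be sketched roughly as follows. This is only an illustrative toy, not Setu's actual implementation: the two lexicons, the in-memory corpus standing in for a web search engine, and all function names (`translate_query`, `search`, `indicative_translate`, `hindi_search`) are assumptions made for the example, and the word-by-word gloss is a stand-in for the indicative translation step, not MaTra itself.

```python
# Hypothetical toy Hindi-to-English lexicon (assumption; a real system
# would use a far larger bilingual resource).
HI_EN = {
    "मधुमेह": "diabetes",
    "उपचार": "treatment",
}

# Hypothetical toy English-to-Hindi lexicon for the indicative gloss.
EN_HI = {
    "diabetes": "मधुमेह",
    "treatment": "उपचार",
}

# Stand-in for an existing search engine: a tiny in-memory English corpus.
DOCS = [
    "diabetes treatment options",
    "seasonal flu symptoms",
]


def translate_query(query_hi: str) -> list[str]:
    """Map each Hindi keyword to English, dropping words not in the lexicon."""
    return [HI_EN[w] for w in query_hi.split() if w in HI_EN]


def search(keywords: list[str], docs: list[str]) -> list[str]:
    """Return the documents that contain every English keyword."""
    return [d for d in docs if all(k in d.split() for k in keywords)]


def indicative_translate(doc_en: str) -> str:
    """Word-by-word gloss into Hindi; unknown words are left in English."""
    return " ".join(EN_HI.get(w, w) for w in doc_en.split())


def hindi_search(query_hi: str) -> list[str]:
    """Hindi query -> English keywords -> search -> results glossed in Hindi."""
    keywords = translate_query(query_hi)
    if not keywords:
        return []
    return [indicative_translate(d) for d in search(keywords, DOCS)]


if __name__ == "__main__":
    for result in hindi_search("मधुमेह उपचार"):
        print(result)
```

The three stages are kept as separate functions so that any one of them (the query translation, the search back end, or the document translation) could be swapped out independently, which mirrors the claim that the framework is general.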
Current Scope
Setu currently focuses on access to English documents through Hindi. It is fine-tuned for the health domain. The framework, however, is general and can be adapted to other Indian languages and to documents from other domains.
The project was funded by the Development Gateway Foundation.