GenerativeAI_and_LLM_Odia

Translate to Indic Languages

This app translates .json and .jsonl datasets from English to Indian languages.
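
The two input formats differ in layout: a .json file holds a single JSON document, while a .jsonl file holds one JSON value per line. The sketch below shows one way to read either into a list of records. The function name `load_records` is hypothetical and not part of this repo, and the assumption that a .json dataset is a top-level list of records is ours, not something the app guarantees.

```python
import json
from pathlib import Path


def load_records(path):
    """Read a .json or .jsonl dataset into a list of records (illustrative only)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in (".json", ".jsonl"):
        raise ValueError(f"unsupported file type: {suffix!r}")

    with path.open(encoding="utf-8") as handle:
        if suffix == ".jsonl":
            # JSON Lines: one JSON value per line; blank lines are skipped.
            return [json.loads(line) for line in handle if line.strip()]
        data = json.load(handle)

    # Assumption: a top-level list holds the records; wrap any other value.
    return data if isinstance(data, list) else [data]
```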

Getting Started With Development

  1. Ensure you meet the version requirements listed in the Versions section.

  2. Install the pre-commit library in your development environment:

  pip3 install pre-commit
  3. Prepare the environment. From the base directory of the repo, run:
  bash install.sh
  4. Create the config file (see the Config File section).
  5. Run the app. From the base directory of the repo, run:
  bash translate-to-indic-lang/run.sh
  6. Find the output. From the base directory of the repo, run:
  cd indicTrans/.work_dir/output/merged
  ls output.json

Any errors are logged as individual files under indicTrans/.work_dir/output/translated/errors.
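
After a run, a short script can confirm that the merged output exists and parses, and list any error files. This is a minimal sketch, not part of the repo: the function name `inspect_run` is hypothetical, and the paths are copied from the text above, so adjust them if your checkout uses different directory names.

```python
import json
from pathlib import Path


def inspect_run(repo_root="."):
    """Summarise the result of a translation run (illustrative only)."""
    output_root = Path(repo_root) / "indicTrans" / ".work_dir" / "output"
    merged_file = output_root / "merged" / "output.json"
    errors_dir = output_root / "translated" / "errors"

    report = {
        "output_exists": merged_file.is_file(),
        "output_is_valid_json": False,
        "error_files": [],
    }

    if report["output_exists"]:
        try:
            with merged_file.open(encoding="utf-8") as handle:
                json.load(handle)
            report["output_is_valid_json"] = True
        except json.JSONDecodeError:
            # Leave the flag as False; the file exists but does not parse.
            pass

    if errors_dir.is_dir():
        report["error_files"] = sorted(
            entry.name for entry in errors_dir.iterdir() if entry.is_file()
        )
    return report


if __name__ == "__main__":
    print(inspect_run("."))
```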

Config File

Indian Language    Language Code
Assamese           as
Hindi              hi
Marathi            mr
Tamil              ta
Bengali            bn
Odia               or
Telugu             te
Gujarati           gu
Malayalam          ml
Punjabi            pa
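
For scripts that generate or validate a config, the table above can be expressed as a lookup. This is only a sketch: the names `LANGUAGE_CODES` and `language_code` are hypothetical, and the config file's actual keys and format are not shown in this README, so nothing here assumes them.

```python
# Language names and codes copied from the table above.
LANGUAGE_CODES = {
    "Assamese": "as",
    "Hindi": "hi",
    "Marathi": "mr",
    "Tamil": "ta",
    "Bengali": "bn",
    "Odia": "or",
    "Telugu": "te",
    "Gujarati": "gu",
    "Malayalam": "ml",
    "Punjabi": "pa",
}


def language_code(name):
    """Return the code for a language name, ignoring case and outer spaces."""
    key = name.strip().title()
    if key not in LANGUAGE_CODES:
        supported = ", ".join(sorted(LANGUAGE_CODES))
        raise ValueError(f"unsupported language {name!r}; expected one of: {supported}")
    return LANGUAGE_CODES[key]
```

For example, `language_code("odia")` returns `"or"`, while an unlisted language raises a `ValueError` that names the supported options.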

Versions

Package    Version
wget       any
Python     3.9
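
These requirements can be checked before running install.sh. The sketch below is not part of the repo, and the function name `check_requirements` is hypothetical. It treats Python 3.9 as a minimum version, which is an assumption: the table does not say whether 3.9 is a minimum or an exact requirement.

```python
import shutil
import sys


def check_requirements(min_python=(3, 9)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []

    # Assumption: the listed Python version is a lower bound.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {wanted} or newer expected, found {found}")

    # Any wget version is acceptable; it only needs to be on PATH.
    if shutil.which("wget") is None:
        problems.append("wget not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```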

Developer Hints

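  1. To check how many files have been translated while the app is still running, run the following command from the base directory of the repo. It prints the number of entries in the error directory, then the number in the data directory.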
  ls indicTrans/.work_dir/output/translated/error | wc -l && ls indicTrans/.work_dir/output/translated/data | wc -l
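
The same check can be done from Python, which is handy when polling progress from a script. This is a minimal sketch with hypothetical function names (`count_entries`, `translation_progress`); the directory names are copied from the command above. Like `ls`, it skips hidden entries, and it reports 0 for a directory that does not exist yet.

```python
from pathlib import Path


def count_entries(directory):
    """Count non-hidden entries in a directory, similar to `ls | wc -l`."""
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(1 for entry in directory.iterdir() if not entry.name.startswith("."))


def translation_progress(repo_root="."):
    """Return entry counts for the error and data directories."""
    translated = Path(repo_root) / "indicTrans" / ".work_dir" / "output" / "translated"
    return {
        "error": count_entries(translated / "error"),
        "data": count_entries(translated / "data"),
    }


if __name__ == "__main__":
    print(translation_progress("."))
```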