The NLP Engineer will maintain and augment the current linguistic resources used in Lingua Custodia’s Machine Translation systems (parallel and monolingual data, bilingual terminologies, etc.). He/she will work in close collaboration with the Machine Learning team to set up and evaluate the systems for production according to a specific schedule. He/she will design and implement Machine Learning methods for data extraction, cleaning and classification. His/her activities include:
- Maintenance of Lingua Custodia’s database.
- Data extraction, processing and labeling: web crawling, text extraction, sentence segmentation, sentence alignment, data filtering, classification, data anonymisation.
- Maintenance and improvement of the automatic data extraction pipeline.
- Application and improvement of Machine Translation evaluation procedures.
- Management of outsourcing for data cleaning and Machine Translation evaluation.
- Master’s degree or PhD in Natural Language Processing or a related field.
- Good knowledge of Machine Learning methods for data analysis and processing: language modeling, classification, sequence labeling for named-entity recognition, etc.
- Proficiency in at least one scripting language: Python, Perl, etc.
- Experience in a Linux environment, including Bash scripting.
- Basics of SQL, XML.
- Experience in project management
- Knowledge of financial documents
- Good understanding of Machine Translation technologies
- Friendly startup environment
- Possibility to work remotely
- Good health insurance
Applications are expected by email: email@example.com
Please specify the position name in the subject field.