NLP

    Our works

    Reviewing legal documents using NLP techniques

    Technologies: Python, NLP, Machine Learning
    Duration: 3 months

    Our client builds financial software (named ‘CF Engine’) to model complex financial products (RMBS, ABS, CLO, etc.). The main goal of the project is to extend this software with a feature that lets users review the related legal documents based on information from the model. The developed solution needs to check whether a specific document corresponds to one of the created models. For example, if the processed document is a mortgage, then the system:

    - parses the mortgage document (from PDF, Word, or plain-text format);
    - checks whether the document contains all required information (all parties are specified and described correctly, the property is described, the interest rate is specified, all information required by law is provided, and so on);
    - if the document fits the model, extracts the important information (parties, property description, interest rates, and so on) and provides it as a summary for user review.

    The system supports different input formats and different types of documents, such as mortgages, car loans, and commercial loans. It also supports different countries of operation, that is, a different document structure for each country and different languages.
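The checking and extraction steps described above can be illustrated with a minimal Python sketch. It assumes the document has already been parsed to plain text, and it uses hypothetical field names and deliberately simple regular expressions (`MORTGAGE_FIELDS`, `review_document`) that are not taken from the project itself; a production system would need far more robust extraction.

```python
import re

# Hypothetical required fields for a "mortgage" model, with illustrative
# patterns. These names and patterns are assumptions made for this sketch.
MORTGAGE_FIELDS = {
    "lender": re.compile(r"Lender:\s*(.+)"),
    "borrower": re.compile(r"Borrower:\s*(.+)"),
    "property": re.compile(r"Property:\s*(.+)"),
    "interest_rate": re.compile(r"Interest rate:\s*([\d.]+)\s*%"),
}


def review_document(text, required_fields):
    """Check a plain-text document against a set of required fields.

    Returns a tuple (fits_model, summary, missing):
    fits_model is True only when every required field was found,
    summary maps each found field to its extracted value,
    missing lists the fields that were not found.
    """
    summary = {}
    missing = []
    for name, pattern in required_fields.items():
        match = pattern.search(text)
        if match:
            summary[name] = match.group(1).strip()
        else:
            missing.append(name)
    return not missing, summary, missing


sample = """Lender: Example Bank
Borrower: Jane Doe
Property: 12 Sample Street
Interest rate: 4.5 % per annum
"""

fits, summary, missing = review_document(sample, MORTGAGE_FIELDS)
print(fits, summary, missing)
```

Supporting another document type or country would, in this sketch, amount to supplying a different dictionary of required fields.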

    Japanese-Russian translation service

    Technologies: Travatar, Moses, EDA, KyTea, Python
    Duration: 1 year

    The goal of the project was to create a text translator from Japanese into Russian. For this purpose, we chose a statistical translation model built from a large body of parallel texts. As the corpus, we used the client's texts from a Q&A service (questions in Russian and their translations into Japanese). We tried several statistical translation systems, including Travatar and Moses, and ultimately used Travatar, since it produced higher-quality translations according to objective metrics. One of the key challenges we faced was the low quality of the translations and the lack of sentence-level alignment. To align the texts, we developed a statistical algorithm that searches for the closest sentences using a dictionary of n-grams.
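To give a rough idea of dictionary-based sentence alignment, here is a simplified Python sketch. It is not the algorithm used in the project: it swaps in a plain overlap count, assumes the sentences are already tokenized, and assumes a hypothetical dictionary format that maps each source n-gram to a set of target n-grams. The names `ngrams` and `align` are illustrative.

```python
def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def align(source_sents, target_sents, dictionary, n=1):
    """For each source sentence, return the index of the closest target sentence.

    The score of a candidate pair is the number of source n-grams that have
    at least one dictionary translation present in the target sentence.
    None is returned for a source sentence with no matching n-gram at all.
    """
    target_grams = [ngrams(sent, n) for sent in target_sents]
    result = []
    for src in source_sents:
        src_grams = ngrams(src, n)
        best_index, best_score = None, 0
        for j, grams in enumerate(target_grams):
            score = sum(1 for g in src_grams if dictionary.get(g, set()) & grams)
            if score > best_score:
                best_index, best_score = j, score
        result.append(best_index)
    return result


# Toy data: two tokenized Japanese sentences and two Russian sentences
# listed in a different order, plus a tiny hand-made unigram dictionary.
source_sents = [["猫", "です"], ["犬", "です"]]
target_sents = [["это", "собака"], ["это", "кошка"]]
dictionary = {
    ("猫",): {("кошка",)},
    ("犬",): {("собака",)},
}

alignment = align(source_sents, target_sents, dictionary)
print(alignment)
```

A realistic version would weight n-grams by how informative they are and restrict the search to nearby sentences rather than comparing every pair.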

    Chatbot content management system using ChatScript

    Technologies: Python, Google Docs API
    Duration: 3 months

    The goal of the project is to build a content management system (CMS) based on Google Spreadsheets and ChatScript. The CMS is designed and built so that it can be connected to Fieldbook and Google Sheets together with the client's chatbot app to spin up new chatbots on demand. Edits can be made in Google Sheets and then pushed to production. This gives writers a way to create content directly in a CMS organized around the ChatScript syntax.

    The system processes these images as learning samples to build a database and a corresponding identification model. Based on the built model, the system tries to identify the author of any given signature.