Analytica is seeking a remote Senior Data Scientist - NLP to support long-term federal client engagements in financial regulatory or health projects in the DC Metro area. The role will apply statistical programming, modeling, visualization techniques, data mining, and forecasting skills to analyze challenging public sector problems.
Analytica has been recognized by Inc. for three consecutive years as one of the 250 fastest-growing businesses. We offer competitive compensation with opportunities for bonuses, employer-paid health care, training and development funds, and a 401(k) match.
Responsibilities include:
Experience with machine translation and transcription of foreign language documents using Microsoft Azure translation services
Pre-processing - Demonstrate the skills and experience to collect, clean, and prepare data sets for input into a computational model using technologies such as Python, SAS, or R. Strong candidates will explain the methods they have applied using common pre-processing functions such as stop word removal, stemming, lemmatization, and tokenization.
Feature Engineering and Attribute Evaluation - Candidates must demonstrate experience with NLP feature engineering methods such as TF-IDF, word2vec, GloVe, and FastText; with identifying the key determinants for modeling that exist in the business process and within existing data sets; and with selecting evaluation protocols (model techniques).
Modeling - Candidates will have practiced skills and experience selecting modeling techniques to fit the business problem. Examples include supervised and unsupervised machine learning (ML), regression, neural networks and deep learning, and natural language processing.
Validation - Strong candidates will describe their experience with investigating, reporting, and justifying model results.
Visualization - Candidates will have experience presenting the results of their modeling activities, depicting the insights realized, and explaining the relevance of their results to the organization’s business challenges.
Qualifications:
Master's degree required, PhD preferred, in Statistics, Mathematics, Computer Science, or a similar field
Extensive experience using SAS, R, or Python to support NLP use cases such as Document Summarization, Named Entity Recognition, Sentiment Analysis, and/or Topic Modeling
At least four years of experience developing scalable, production-ready NLP solutions using scikit-learn, Keras, TensorFlow, PyTorch, or Spark NLP
Experience using transformer architectures to develop NLP models
Experience with open-source NLP packages such as Gensim, spaCy, or NLTK
Experience with BERT, GPT-J, RoBERTa, T5 or other transformers
Experience working in a cloud environment
Experience coordinating and maintaining user stories
Must be a US citizen
Must be able to obtain and maintain a Public Trust security clearance
About Analytica: Analytica is a leading consulting and information technology solutions provider to public sector organizations supporting health, civilian, and national security missions. The company is an award-winning SBA-certified 8(a) small business that has been recognized by Inc. Magazine in each of the past three years as one of the 250 fastest-growing companies in the U.S. Analytica specializes in providing software and systems engineering, information management, analytics & visualization, agile project management, and management consulting services. The company is appraised by the Software Engineering Institute (SEI) at CMMI® Maturity Level 3 and is an ISO 9001:2008-certified provider.
Analytica LLC is an Equal Employment Opportunity and Affirmative Action employer. We value diversity at all levels. All individuals, regardless of personal characteristics, are encouraged to apply. All qualified applicants will receive consideration for employment without regard to sex, pregnancy, race, religion or religious creed, color, gender, gender identity, gender expression, national origin, ancestry, physical or mental disability, medical condition, genetic information, marital status, registered domestic partner status, age, sexual orientation, military or veteran status, protected veteran status, or any other basis protected by federal, state, or local law, ordinance, or regulation, and will not be discriminated against on these bases.