Description
About RazorThink
RazorThink is an AI-driven software product company specializing in Demand Forecasting AI (DFAI) solutions. Its market-proven platform helps organizations accurately forecast product demand—including new products with limited historical data—while reducing stock-outs and overproduction. RazorThink’s solutions deliver explainable, granular forecasts aligned with each client’s unique business processes for rapid adoption and measurable impact.
Job Summary
RazorThink is looking for an early-career Data Scientist to join its Data & Analytics team and help design, build, and productionize Machine Learning, NLP, and Generative AI solutions. Under the mentorship of senior data scientists, you will work across the full model lifecycle—from data preparation and modeling to deployment and monitoring—while following best practices for quality, safety, and reliability.
Key Responsibilities
Modeling & Research
Train, evaluate, and iterate on ML/DL/NLP models including classification, regression, NER, embeddings, and LLM-based workflows
Perform error analysis, A/B testing, and model evaluation; document findings and insights
Experiment with LLM prompting, RAG architectures, and transformer-based models
Data Preparation & Feature Engineering
Explore and prepare structured, semi-structured, and unstructured data using Python and SQL
Ensure data quality, versioning, and reproducibility
Build reusable feature pipelines and prompt templates
Production & MLOps
Package models as APIs (using FastAPI or Flask) or as batch jobs
Containerize applications using Docker and follow CI/CD best practices
Implement monitoring for model accuracy, latency, and drift
Maintain experiment tracking using MLflow or Weights & Biases (W&B)
Collaboration & Communication
Collaborate with product and engineering teams to define success metrics and delivery priorities
Present analytical insights to both technical and non-technical stakeholders
Contribute to documentation, testing, and technical debt resolution
Minimum Qualifications
Bachelor’s or Master’s degree in Computer Science, Data Science, Mathematics, Statistics, Engineering, or equivalent project/internship experience
Strong proficiency in Python and SQL
Solid understanding of statistics, probability, and experimentation
Hands-on experience with scikit-learn and introductory-level experience with PyTorch or TensorFlow
Basic NLP knowledge including tokenization, embeddings, and transformer concepts
Familiarity with Git and Jupyter notebooks; clear written and verbal communication skills
Preferred Qualifications (Nice to Have)
Experience with RAG pipelines and vector databases (FAISS, Milvus, Pinecone)
Exposure to LangChain or LangGraph and LLM evaluation frameworks
Experience with data processing at scale (pandas, Polars, Spark/PySpark)
Experience with workflow orchestration tools such as Airflow or Prefect
Exposure to cloud platforms (AWS, GCP, or Azure)
Participation in hackathons, Kaggle competitions, open-source projects, or research publications
Success Measures (First 90 Days)
Deliver at least one ML/NLP or GenAI feature to staging or production
Establish baseline metrics, monitoring, and experiment logs
Resolve scoped bugs or technical debt items with proper testing and documentation
Tools & Technologies
Python, SQL, scikit-learn, PyTorch, TensorFlow, Hugging Face, LangChain, LangGraph, Vector Databases, FastAPI, Docker, Git, MLflow, Weights & Biases, Airflow, Prefect, AWS, GCP, Azure