We are looking for a hands-on Generative AI Engineer who is deeply familiar with LLMs (Zephyr, Mistral, Llama, etc.), fine-tuning large models, and optimizing AI performance for industry-specific use cases. This person should have a strong engineering mindset and be capable of designing, training, and fine-tuning models while ensuring minimal hallucinations and high accuracy. The ideal candidate should also be well-versed in dynamic prompt engineering, retrieval-augmented generation (RAG), and knowledge base development.
Key Responsibilities:
- Develop and fine-tune open-source LLMs (Mistral, Zephyr, Llama) for insurance-specific applications.
- Optimize AI models to reduce hallucinations and increase domain-specific accuracy, preferably in the insurance domain.
- Implement dynamic prompt engineering to control model responses and improve reliability.
- Integrate LLMs into a structured AI pipeline (RAG, knowledge retrieval, vector databases).
- Research, evaluate, and deploy NLP-based solutions for text classification, entity recognition, and semantic search.
- Ensure models adhere to enterprise standards of accuracy, compliance, and security (GDPR, CCPA, PII handling).
- Work with large datasets to train models efficiently and integrate them into real-world applications.
- Optimize model inference efficiency for deployment in cloud-based and on-prem environments.
- Collaborate with DevOps engineers to scale AI solutions in production.
Must-Have Skills & Experience:
- Strong experience with LLMs, NLP, and transformer-based architectures.
- Hands-on experience in fine-tuning and optimizing models (Hugging Face, PyTorch, TensorFlow, LangChain, etc.).
- Proficiency in Python and in at least one of PyTorch, TensorFlow, or JAX.
- Familiarity with vector databases (Weaviate, Pinecone, FAISS) and retrieval-augmented generation (RAG).
- Understanding of prompt engineering strategies to minimize hallucinations.
- Strong background in statistical modeling and machine learning algorithms.
- Knowledge of MLOps tools for deployment and monitoring (MLflow, Kubeflow, Ray).
We Offer:
- A dynamic and creative work environment with a team of passionate professionals.
- Opportunities for professional growth and development.
- Competitive salary and benefits package.
- Flexible working hours and the possibility for remote work.
Why work at Diffco?
- You will work on cutting-edge projects in Silicon Valley for both US and European clients, using the latest technologies, methodologies, frameworks, and approaches, with the time to learn and develop professionally.
- Our team is like a second family: you will spend your days working on interesting projects with kind people who have broad interests.
- You will learn continuously, expand your skills, and grow by demonstrating your professional level and your ability to take on more responsibility.
- We care about you, are interested in your personal professional goals and motivation, and help you maintain a good work/life balance.