**ABOUT AIVAR INNOVATIONS**
Aivar Innovations is an AI\-native services and software company and AWS Preferred Partner, backed by Bessemer Venture Partners and Sorin Investments. Founded by four former Amazon Web Services senior leaders, Aivar combines deep cloud engineering expertise with AI\-first thinking to build and deploy production\-grade solutions at enterprise scale. Aivar operates three accelerator platforms: **Convogent** (voice and agent AI automation), **Velogent** (governed agentic process automation for regulated industries), and **Kubogent** (Kubernetes\-native AIOps). Aivar raised $4\.6 million in seed funding in January 2026 and serves 100\+ customers globally across 7\+ industries, including fintech, healthcare, and technology.

**About the Role**
You'll build the data foundation powering Aivar's accelerators and the autonomous agents that run on top of them. That means designing pipelines that turn unstructured enterprise data (invoices, contracts, transactions, RFQs) into structured, high\-quality datasets that agentic AI can reason on with full lineage and auditability. This is a senior IC role: you'll own production systems end\-to\-end and raise the technical bar for the engineers around you.

**What You'll Own**
- **Pipelines.** Design and run ingestion, processing, and feature engineering pipelines handling terabytes of unstructured enterprise data across multiple sources and formats.
- **Architecture contribution.** Make calls on warehousing, lakehouse, and streaming patterns that support both real\-time agent decisions and downstream analytics.
- **Quality \& governance.** Build the validation, lineage, and audit frameworks that let regulated clients trust agent decisions in production.
- **Production reliability.** Own SLAs, monitoring, and incident response for the pipelines you ship.
- **Mentorship.** Review designs, level up junior engineers, and set the standard for engineering rigor on the team.
**Who You Are**
--------------
- 4\+ years building production data systems, with at least 2 years owning pipelines end\-to\-end
- Comfortable making independent design calls and defending them in review
- Pragmatic about quality — you know when to enforce it and when to ship and iterate
- Have shipped data infrastructure into regulated environments (financial, healthcare, or similar), or are eager to do so
**Must\-Have Expertise**
-----------------------
- **Distributed processing:** Apache Spark, Flink, or AWS Glue at production scale
- **AWS data stack:** S3, Glue, Athena, RDS, DynamoDB, MSK
- **Pipeline engineering:** ETL/ELT with error handling, retries, and monitoring; dbt for transformations
- **Quality \& governance:** Great Expectations or equivalent; lineage, metadata, and compliance frameworks (GDPR, HIPAA, or SOC 2\)
- **Languages:** Strong Python and SQL
- **Streaming:** Kafka or Redis Streams in production
**Nice to Have**
---------------
- Feature engineering on unstructured data — document classification, entity extraction, semantic tagging (spaCy, transformers, LangChain)
- Vector stores in production (Pinecone, Weaviate, pgvector)
- Infrastructure as Code (Terraform or CDK) and observability (Prometheus, Grafana, OpenTelemetry)
- Prior work supporting ML/AI or agentic systems downstream
**What Will Set You Apart**
--------------------------
- A pipeline you've personally owned in production that you can walk us through end\-to\-end
- Experience designing for compliance\-critical environments
- A point of view on the right balance between batch, streaming, and on\-demand processing for agent workloads
**DIVERSITY \& INCLUSION**
Aivar Innovations is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, gender identity, sexual orientation, religion, disability, age, marital status, caste, or any other protected characteristic. We are committed to building a diverse, inclusive, and respectful workplace for everyone.
**Required Skills**
Data Engineer, Lake architecture, Streaming data, NLP \& AI Data, Unstructured data mastery, End\-to\-end data pipelines, Pipeline Design, Distributed computing, Data Architecture, Data Quality