Grupa Insight
software house

AI integrations in practice: RAG, chatbots and LLM pipelines — how it actually works


AI Integrations in Production

Case study: BLUP-FLOCK ERP — an AI system for a poultry breeding farm using the BLUP algorithm and 5-generation genealogy analysis

AI integrations fall into two worlds. The first: a simple API call to a language model wrapped in an interface. The second: a production system that genuinely changes operational decisions and is responsible for millions in herd management.

The BLUP-FLOCK ERP project belongs firmly to the second category. The system analyses real genetic data, traces back 5 generations in the genealogy, and recommends crossings with mathematical precision — with a full justification for every decision.

How an AI deployment differs from an API call

The simplest AI integration looks like this:

  • user sends a query
  • system passes it to the model
  • model returns a response

In BLUP-FLOCK, every query is an orchestration of multiple steps:

  • fetching data from the flock
  • reconstructing the genealogical tree up to the 5th generation
  • calculating EBV (Estimated Breeding Values) using the BLUP method
  • ranking birds taking inbreeding coefficient into account
  • recommendation with full justification
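The orchestration above can be sketched as a plain function pipeline. All names, data, and the placeholder EBV value below are illustrative stand-ins, not the production code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Bird:
    bird_id: str
    sire: Optional[str]
    dam: Optional[str]
    ebv: float = 0.0        # Estimated Breeding Value, filled in later
    inbreeding: float = 0.0

def fetch_flock() -> list:
    # Step 1: fetch data from the flock (stubbed with two birds)
    return [Bird("R1", None, None), Bird("H7", None, None)]

def build_pedigree(birds, depth=5):
    # Step 2: reconstruct the genealogical tree to the given depth
    return {b.bird_id: (b.sire, b.dam) for b in birds}

def estimate_ebv(birds, pedigree):
    # Step 3: EBV calculation (a placeholder value in this sketch)
    for b in birds:
        b.ebv = 1.0
    return birds

def rank(birds):
    # Step 4: rank candidates, penalising inbreeding
    return sorted(birds, key=lambda b: b.ebv - b.inbreeding, reverse=True)

def recommend(ranked):
    # Step 5: recommendation with a human-readable justification
    best = ranked[0]
    return {"bird": best.bird_id,
            "reason": f"EBV={best.ebv:.2f}, F={best.inbreeding:.2f}"}

flock = fetch_flock()
result = recommend(rank(estimate_ebv(flock, build_pedigree(flock))))
print(result["bird"])
```

Each step is an independent function, which is what makes the pipeline testable and auditable step by step.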

What is the BLUP method?

BLUP (Best Linear Unbiased Prediction) is a statistical method for estimating the breeding value of animals, used globally in poultry, cattle, and pig breeding.

It simultaneously accounts for:

  • genealogy (going back multiple generations)
  • environmental and nutritional data
  • measured production traits (fertility, daily gain, feed conversion)

Each individual is assessed in the context of the entire family, not just its own results — which is why the method gives a far more accurate picture of genetic value than classical index methods.
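The core of BLUP is Henderson's mixed-model equations for the animal model y = Xb + Zu + e. The toy example below solves them for a three-animal pedigree (two unrelated parents and their offspring); the phenotypes and the variance ratio lambda are made up for illustration and have nothing to do with the production data:

```python
import numpy as np

y = np.array([10.0, 8.0, 9.0])       # phenotypes (e.g. daily gain)
X = np.ones((3, 1))                  # single fixed effect: the overall mean
Z = np.eye(3)                        # each animal has one record
A = np.array([[1.0, 0.0, 0.5],       # numerator relationship matrix:
              [0.0, 1.0, 0.5],       # animal 3 is the offspring
              [0.5, 0.5, 1.0]])      # of animals 1 and 2
lam = 2.0                            # sigma_e^2 / sigma_u^2

# Henderson's mixed-model equations:
# [ X'X      X'Z            ] [b]   [X'y]
# [ Z'X  Z'Z + A^-1 * lam   ] [u] = [Z'y]
top = np.hstack([X.T @ X, X.T @ Z])
bot = np.hstack([Z.T @ X, Z.T @ Z + np.linalg.inv(A) * lam])
lhs = np.vstack([top, bot])
rhs = np.concatenate([X.T @ y, Z.T @ y])

sol = np.linalg.solve(lhs, rhs)
b_hat, u_hat = sol[0], sol[1:]       # fixed effect and EBVs
# For this data the mean comes out at 9.0 and the EBVs at
# roughly [0.33, -0.33, 0.0]: animal 3 has no record advantage
# of its own, so its estimate sits between its parents'.
print(b_hat, np.round(u_hat, 2))
```

Note how the offspring's EBV is pulled toward its parents through A even though it contributed an average record: that family context is exactly what classical index methods miss.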

Pipeline architecture — how the system works

Before diving into the details — two terms that will come up throughout.

RAG (Retrieval-Augmented Generation) is an approach in which an AI model, instead of relying solely on its own knowledge, first retrieves the relevant context from external data — and only then generates a response. An LLM pipeline is the set of steps leading from raw data to that response: context retrieval, prompt construction, model call.

In a classic RAG setup, those steps look like this:

  • vector search in a document database
  • prompt construction with context
  • response generation
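Those three steps fit in a few lines. The sketch below uses toy hand-made embeddings and a stubbed model call; no real vector database or LLM is involved:

```python
import math

# Tiny "document store": embedding vector + text, both invented
docs = {
    "feeding":  ([0.9, 0.1], "Feed conversion is logged daily."),
    "breeding": ([0.1, 0.9], "Breeders are ranked by EBV."),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec):
    # Step 1: vector search - pick the most similar document
    return max(docs.values(), key=lambda d: cosine(query_vec, d[0]))[1]

def build_prompt(context, question):
    # Step 2: prompt construction with the retrieved context
    return f"Context: {context}\nQuestion: {question}"

def generate(prompt):
    # Step 3: response generation (stubbed; a real system calls an LLM)
    return "stubbed answer based on: " + prompt.splitlines()[0]

query_vec = [0.2, 0.8]   # pretend embedding of the user's question
prompt = build_prompt(retrieve(query_vec), "How are breeders ranked?")
print(generate(prompt))
```

The point of the sketch: the model only ever sees what retrieval hands it, so retrieval quality bounds answer quality.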

In BLUP-FLOCK, standard RAG was not enough. Genetic data is not a collection of text documents; it is a graph of relationships between individuals spanning 5 generations. Instead of similarity search on embeddings, we built a custom retrieval mechanism that walks this graph and evaluates prospective crossings before anything reaches the model (more on this under Challenge #1 below). Hence 8 layers instead of the standard 3.

8 layers of the AI system in BLUP-FLOCK

  1. Data ingestion: near-real-time fetching from two existing applications (.NET/Blazor) via REST API and message queues

  2. Chunking and normalisation: splitting data into phenotypic, genealogical, and environmental contexts

  3. Temporal Tables: full change history of every record in MS SQL Server — essential for BLUP calculation reproducibility and certification audits

  4. BLUP / EBV calculations: deterministic genetic calculation engine — identical result for the same input data (a certification requirement)

  5. Ranking and selection: multi-criteria optimisation of fertility, daily gain, feed conversion, and the inbreeding coefficient — all at once

  6. Explainable AI (XAI): every recommendation accompanied by a justification — which traits influenced the decision, what the predictor weights were, and the model error risk

  7. Validation and anomaly detection: automatic detection of incorrect assignments (e.g. an egg attributed to the wrong nest) — critical for Ross/Hubbard/Hy-Line certifications

  8. Feedback loop: the model adapts EBV estimates to real-world results — continuous learning on data from the specific flock
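To give a flavour of layer 7, an anomaly check can be as simple as validating that an egg's record is consistent with the hen it is assigned to. The rule, field names, and dates below are invented for this sketch; the production checks are far richer:

```python
from datetime import date

def find_misassigned(eggs, hens):
    """Flag eggs assigned to unknown hens or laid outside the
    hen's recorded laying period."""
    anomalies = []
    for egg in eggs:
        hen = hens.get(egg["hen_id"])
        if hen is None:
            anomalies.append((egg["egg_id"], "unknown hen"))
        elif not (hen["lay_start"] <= egg["laid_on"] <= hen["lay_end"]):
            anomalies.append((egg["egg_id"], "outside laying period"))
    return anomalies

hens = {"H7": {"lay_start": date(2024, 3, 1), "lay_end": date(2024, 9, 1)}}
eggs = [
    {"egg_id": "E1", "hen_id": "H7", "laid_on": date(2024, 4, 2)},   # ok
    {"egg_id": "E2", "hen_id": "H7", "laid_on": date(2024, 10, 5)},  # flagged
    {"egg_id": "E3", "hen_id": "H9", "laid_on": date(2024, 4, 2)},   # flagged
]
anomalies = find_misassigned(eggs, hens)
print(anomalies)
```

Catching these errors at ingestion, before they poison the BLUP inputs, is what makes the downstream calculations certifiable.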

Challenge #1: genetic data is not documents

Most RAG descriptions concern document-based systems — FAQs, manuals, knowledge bases. In BLUP-FLOCK, the "context" data is a relational network of 5 genetic generations, where each individual has associated phenotypic traits, reproductive results, and a position in the genealogical tree.

Standard RAG designed for document retrieval was not sufficient. We had to build a custom retrieval mechanism that:

  • reconstructs the genealogical tree to a specified generational depth
  • calculates the inbreeding coefficient of potential offspring before recommending a crossing
  • prunes genealogical branches that would exceed the permitted inbreeding threshold
  • returns a ranking weighted according to the client's breeding programme

That is why custom retrieval instead of classic vector search.
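The inbreeding step of that retrieval can be illustrated with the classic tabular method for the numerator relationship matrix: the inbreeding coefficient of a hypothetical offspring is half the additive relationship between its parents. The four-animal pedigree and the pruning threshold below are invented for this sketch:

```python
# pedigree: animal -> (sire, dam); None means unknown founder
pedigree = {
    "S": (None, None),
    "D": (None, None),
    "A": ("S", "D"),
    "B": ("S", "D"),   # A and B are full siblings
}

ids = list(pedigree)            # founders listed before their offspring
idx = {a: i for i, a in enumerate(ids)}
n = len(ids)
A = [[0.0] * n for _ in range(n)]

def rel(parent, j):
    # additive relationship between a (possibly unknown) parent and j
    return A[idx[parent]][j] if parent is not None else 0.0

# Tabular method: fill A row by row in pedigree order
for i, animal in enumerate(ids):
    sire, dam = pedigree[animal]
    A[i][i] = 1.0 + 0.5 * (A[idx[sire]][idx[dam]]
                           if sire is not None and dam is not None else 0.0)
    for j in range(i):
        A[i][j] = A[j][i] = 0.5 * (rel(sire, j) + rel(dam, j))

def offspring_inbreeding(x, y):
    # F of a hypothetical offspring = half the parents' relationship
    return 0.5 * A[idx[x]][idx[y]]

MAX_F = 0.0625                  # example pruning threshold
f = offspring_inbreeding("A", "B")
# A full-sib mating gives F = 0.25, so this branch is pruned
print(f, "pruned" if f > MAX_F else "allowed")
```

Running this check inside retrieval means a forbidden crossing is pruned before the model ever sees it as a candidate.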

Challenge #2: Explainable AI as a requirement, not a feature

In most AI projects, explainability is a "nice to have". In BLUP-FLOCK, it is a certification requirement. Pedigree breeding programmes (Ross 308, Hubbard, Hy-Line, Lohmann) require full decision reproducibility — an auditor must be able to verify why rooster X was selected as a breeder and rooster Y was eliminated.

Every system decision contains:

  • which traits had the greatest influence on the recommendation and to what degree
  • comparison with the results of traditional index methods
  • model error risk for the given set of input data
  • full calculation history stored in Temporal Tables

Without this, the system would be a "black box" that neither the breeder can trust nor the certifier can accept.
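One way the decision payload described above could be structured is a typed record that an auditor can read field by field. The field names and values here are illustrative, not the production schema:

```python
from dataclasses import dataclass

@dataclass
class Recommendation:
    bird_id: str
    decision: str                 # "select" or "cull"
    trait_contributions: dict     # trait -> share of the final score
    index_method_score: float     # classical index result, for comparison
    model_error_risk: float       # e.g. a prediction standard error
    calculation_version: str      # key into the Temporal Tables history

rec = Recommendation(
    bird_id="R-4471",
    decision="select",
    trait_contributions={"fertility": 0.45, "daily_gain": 0.35,
                         "feed_conversion": 0.20},
    index_method_score=102.3,
    model_error_risk=0.08,
    calculation_version="2024-06-01T10:00:00Z",
)
# The contributions should account for the whole recommendation
assert abs(sum(rec.trait_contributions.values()) - 1.0) < 1e-9
print(rec.decision)
```

Because `calculation_version` points into the Temporal Tables history, the exact inputs behind any past decision can be replayed on demand.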

Challenge #3: migration without downtime

A pedigree breeding farm cannot "switch off the system for a week" to move to new software. The farm in Rszew already had two operational applications handling its day-to-day processes.

We designed a hybrid integration architecture:

  • existing applications continue to run unchanged as the source of truth for operational data
  • the new ERP system fetches data near-real-time via REST API and message queues
  • the AI module operates on a synchronised, central database
  • historical data migration runs in parallel, without interrupting production

This pattern — API Gateway + message queues + central database with Temporal Tables — is our recommendation for any organisation that wants to deploy AI on existing data without operational risk.
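The heart of that sync path is an idempotent upsert driven by events. The sketch below stands in for it with an in-process queue and a dict; in reality the events arrive over REST and a message broker, and the central store is MS SQL Server. Event shapes and keys are invented:

```python
from queue import Queue

central_db = {}          # stand-in for the central database

def handle_event(event):
    # Upsert: the legacy applications remain the source of truth,
    # so the latest event for a record always wins.
    central_db[event["record_id"]] = event["payload"]

events = Queue()
events.put({"record_id": "bird:R1", "payload": {"weight_g": 2450}})
events.put({"record_id": "bird:R1", "payload": {"weight_g": 2510}})  # newer

while not events.empty():
    handle_event(events.get())

print(central_db["bird:R1"]["weight_g"])
```

Because the handler overwrites rather than appends, replaying a batch of events after an outage converges to the same state, which is what makes the migration safe to run in parallel with production.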

Technology stack

MS SQL Server 2022

  • Temporal Tables (full data history, calculation reproducibility)
  • Columnstore Indexes (analytical query performance across tens of thousands of records)

Laravel 11 / PHP 8.3

  • AI pipeline backend
  • queues (Redis/Horizon) and JWT/Sanctum authorisation

Vue.js 3 + Inertia.js

  • frontend and visualisations (D3.js / Apache ECharts)

Redis 7 + RedisJSON

  • caching of intermediate BLUP calculation results — significant speed-up for repeated operations

Kubernetes + Docker

  • horizontal scalability, AI microservice isolation

MLOps pipeline

  • model versioning, data drift monitoring, automatic retraining

Measurable outcomes

  • analysis time for pedigree review and ranking preparation reduced from weeks to minutes
  • genetic progress increased by 3–8% per year vs. classical index methods
  • number of maintained breeders reduced by 10–25% through more precise selection
  • data errors eliminated through automatic real-time anomaly validation
  • full auditability for certification schemes (IFS Food, BRCGS, GlobalG.A.P.)

5 lessons from an AI deployment

1. Data matters more than the model

In BLUP-FLOCK, the biggest challenge was not choosing an algorithm — it was data architecture: Temporal Tables, Columnstore Indexes, a schema built for genetic relationships. A poor model on good data can be fixed. A good model on bad data simply does not work.

2. LangChain is not always the answer

We use LangChain for rapid prototyping and architecture testing. In production we run a custom pipeline with full control over the logic: in complex domains like animal genetics, ready-made frameworks impose abstractions that do not fit the domain.

3. Explainable AI is non-negotiable

More and more sectors — finance, healthcare, certified animal breeding — require auditability. Build explainability from day one, not as an afterthought on top of a finished model.

4. Zero-downtime migration is possible

The API Gateway + message queues + central database pattern allows AI to be deployed incrementally on existing systems, without operational risk. No client can afford a "big bang migration".

5. KPIs must be defined before the start

At Rszew we knew exactly how much analysis time had to be cut, what breeder-reduction threshold was the target, and which certifications the system had to support. Without those KPIs there is no way to measure the success of a deployment.

Summary

AI integrations in production are systems built on data, architecture, and quality control. The greatest value comes not from the model itself, but from how it is used in a business context — and from the precision of the data it works with.

Planning an AI deployment?

We build production-grade AI systems — from RAG-based chatbots and LLM pipelines to specialised analytical modules like BLUP-FLOCK. Every deployment starts with data analysis and architecture, not with choosing a model.

Tell us about your use case
