Grupa Insight
software house

AI integrations in practice: RAG, chatbots and LLM pipelines — how it actually works


AI Integrations in Production

Case study: BLUP-FLOCK ERP — an AI system for a poultry breeding farm using the BLUP algorithm and 5-generation genealogy analysis

AI integrations fall into two worlds. The first: a simple API call to a language model wrapped in an interface. The second: a production system that genuinely changes operational decisions and is responsible for millions in herd management.

The BLUP-FLOCK ERP project belongs firmly to the second category. The system analyses real genetic data, traces back 5 generations in the genealogy, and recommends crossings with mathematical precision — with a full justification for every decision.

How an AI deployment differs from an API call

The simplest AI integration looks like this:

  • user sends a query
  • system passes it to the model
  • model returns a response

In BLUP-FLOCK, every query is an orchestration of multiple steps:

  • fetching data from the flock
  • reconstructing the genealogical tree up to the 5th generation
  • calculating EBV (Estimated Breeding Values) using the BLUP method
  • ranking birds taking inbreeding coefficient into account
  • recommendation with full justification
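The orchestration above can be sketched as a plain function pipeline. All names, data, and the placeholder EBV value below are illustrative stand-ins, not the production code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Bird:
    bird_id: str
    sire: Optional[str]
    dam: Optional[str]
    ebv: float = 0.0        # Estimated Breeding Value, filled in later
    inbreeding: float = 0.0

def fetch_flock() -> list:
    # Step 1: fetch data from the flock (stubbed with two birds)
    return [Bird("R1", None, None), Bird("H7", None, None)]

def build_pedigree(birds, depth=5):
    # Step 2: reconstruct the genealogical tree to the given depth
    return {b.bird_id: (b.sire, b.dam) for b in birds}

def estimate_ebv(birds, pedigree):
    # Step 3: EBV calculation (a placeholder value in this sketch)
    for b in birds:
        b.ebv = 1.0
    return birds

def rank(birds):
    # Step 4: rank candidates, penalising inbreeding
    return sorted(birds, key=lambda b: b.ebv - b.inbreeding, reverse=True)

def recommend(ranked):
    # Step 5: recommendation with a human-readable justification
    best = ranked[0]
    return {"bird": best.bird_id,
            "reason": f"EBV={best.ebv:.2f}, F={best.inbreeding:.2f}"}

flock = fetch_flock()
result = recommend(rank(estimate_ebv(flock, build_pedigree(flock))))
print(result["bird"])
```

Each step is an independent function, which is what makes the pipeline testable and auditable step by step.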

What is the BLUP method?

BLUP (Best Linear Unbiased Prediction) is a statistical method for estimating the breeding value of animals, used globally in poultry, cattle, and pig breeding.

It simultaneously accounts for:

  • genealogy (going back multiple generations)
  • environmental and nutritional data
  • measured production traits (fertility, daily gain, feed conversion)

Each individual is assessed in the context of the entire family, not just its own results — which is why the method gives a far more accurate picture of genetic value than classical index methods.
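The core of BLUP is Henderson's mixed-model equations for the animal model y = Xb + Zu + e. The toy example below solves them for a three-animal pedigree (two unrelated parents and their offspring); the phenotypes and the variance ratio lambda are made up for illustration and have nothing to do with the production data:

```python
import numpy as np

y = np.array([10.0, 8.0, 9.0])       # phenotypes (e.g. daily gain)
X = np.ones((3, 1))                  # single fixed effect: the overall mean
Z = np.eye(3)                        # each animal has one record
A = np.array([[1.0, 0.0, 0.5],       # numerator relationship matrix:
              [0.0, 1.0, 0.5],       # animal 3 is the offspring
              [0.5, 0.5, 1.0]])      # of animals 1 and 2
lam = 2.0                            # sigma_e^2 / sigma_u^2

# Henderson's mixed-model equations:
# [ X'X      X'Z            ] [b]   [X'y]
# [ Z'X  Z'Z + A^-1 * lam   ] [u] = [Z'y]
top = np.hstack([X.T @ X, X.T @ Z])
bot = np.hstack([Z.T @ X, Z.T @ Z + np.linalg.inv(A) * lam])
lhs = np.vstack([top, bot])
rhs = np.concatenate([X.T @ y, Z.T @ y])

sol = np.linalg.solve(lhs, rhs)
b_hat, u_hat = sol[0], sol[1:]       # fixed effect and EBVs
# For this data the mean comes out at 9.0 and the EBVs at
# roughly [0.33, -0.33, 0.0]: animal 3 has no record advantage
# of its own, so its estimate sits between its parents'.
print(b_hat, np.round(u_hat, 2))
```

Note how the offspring's EBV is pulled toward its parents through A even though it contributed an average record: that family context is exactly what classical index methods miss.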

Pipeline architecture — how the system works

Before diving into the details — two terms that will come up throughout.

RAG (Retrieval-Augmented Generation) is an approach in which an AI model, instead of relying solely on its own knowledge, first retrieves the relevant context from external data — and only then generates a response. An LLM pipeline is the set of steps leading from raw data to that response: context retrieval, prompt construction, model call.

In a classic RAG setup, those steps look like this:

  • vector search in a document database
  • prompt construction with context
  • response generation
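Those three steps fit in a few lines. The sketch below uses toy hand-made embeddings and a stubbed model call; no real vector database or LLM is involved:

```python
import math

# Tiny "document store": embedding vector + text, both invented
docs = {
    "feeding":  ([0.9, 0.1], "Feed conversion is logged daily."),
    "breeding": ([0.1, 0.9], "Breeders are ranked by EBV."),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec):
    # Step 1: vector search - pick the most similar document
    return max(docs.values(), key=lambda d: cosine(query_vec, d[0]))[1]

def build_prompt(context, question):
    # Step 2: prompt construction with the retrieved context
    return f"Context: {context}\nQuestion: {question}"

def generate(prompt):
    # Step 3: response generation (stubbed; a real system calls an LLM)
    return "stubbed answer based on: " + prompt.splitlines()[0]

query_vec = [0.2, 0.8]   # pretend embedding of the user's question
prompt = build_prompt(retrieve(query_vec), "How are breeders ranked?")
print(generate(prompt))
```

The point of the sketch: the model only ever sees what retrieval hands it, so retrieval quality bounds answer quality.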

In BLUP-FLOCK, standard RAG was not enough. Genetic data is not a collection of text documents; it is a graph of relationships between individuals spanning 5 generations. Instead of similarity search on embeddings, we built a custom retrieval mechanism that walks this graph and evaluates prospective crossings before anything reaches the model (more on this under Challenge #1 below). Hence 8 layers instead of the standard 3.

8 layers of the AI system in BLUP-FLOCK

  1. Data ingestion: near-real-time fetching from two existing applications (.NET/Blazor) via REST API and message queues

  2. Chunking and normalisation: splitting data into phenotypic, genealogical, and environmental contexts

  3. Temporal Tables: full change history of every record in MS SQL Server — essential for BLUP calculation reproducibility and certification audits

  4. BLUP / EBV calculations: deterministic genetic calculation engine — identical result for the same input data (a certification requirement)

  5. Ranking and selection: multi-criteria optimisation of fertility, daily gain, feed conversion, and the inbreeding coefficient — all at once

  6. Explainable AI (XAI): every recommendation accompanied by a justification — which traits influenced the decision, what the predictor weights were, and the model error risk

  7. Validation and anomaly detection: automatic detection of incorrect assignments (e.g. an egg attributed to the wrong nest) — critical for Ross/Hubbard/Hy-Line certifications

  8. Feedback loop: the model adapts EBV estimates to real-world results — continuous learning on data from the specific flock
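To give a flavour of layer 7, an anomaly check can be as simple as validating that an egg's record is consistent with the hen it is assigned to. The rule, field names, and dates below are invented for this sketch; the production checks are far richer:

```python
from datetime import date

def find_misassigned(eggs, hens):
    """Flag eggs assigned to unknown hens or laid outside the
    hen's recorded laying period."""
    anomalies = []
    for egg in eggs:
        hen = hens.get(egg["hen_id"])
        if hen is None:
            anomalies.append((egg["egg_id"], "unknown hen"))
        elif not (hen["lay_start"] <= egg["laid_on"] <= hen["lay_end"]):
            anomalies.append((egg["egg_id"], "outside laying period"))
    return anomalies

hens = {"H7": {"lay_start": date(2024, 3, 1), "lay_end": date(2024, 9, 1)}}
eggs = [
    {"egg_id": "E1", "hen_id": "H7", "laid_on": date(2024, 4, 2)},   # ok
    {"egg_id": "E2", "hen_id": "H7", "laid_on": date(2024, 10, 5)},  # flagged
    {"egg_id": "E3", "hen_id": "H9", "laid_on": date(2024, 4, 2)},   # flagged
]
anomalies = find_misassigned(eggs, hens)
print(anomalies)
```

Catching these errors at ingestion, before they poison the BLUP inputs, is what makes the downstream calculations certifiable.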

Challenge #1: genetic data is not documents

Most RAG descriptions concern document-based systems — FAQs, manuals, knowledge bases. In BLUP-FLOCK, the "context" data is a relational network of 5 genetic generations, where each individual has associated phenotypic traits, reproductive results, and a position in the genealogical tree.

Standard RAG designed for document retrieval was not sufficient. We had to build a custom retrieval mechanism that:

  • reconstructs the genealogical tree to a specified generational depth
  • calculates the inbreeding coefficient of potential offspring before recommending a crossing
  • prunes genealogical branches that would exceed the permitted inbreeding threshold
  • returns a ranking weighted according to the client's breeding programme

That is why custom retrieval instead of classic vector search.
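The inbreeding step of that retrieval can be illustrated with the classic tabular method for the numerator relationship matrix: the inbreeding coefficient of a hypothetical offspring is half the additive relationship between its parents. The four-animal pedigree and the pruning threshold below are invented for this sketch:

```python
# pedigree: animal -> (sire, dam); None means unknown founder
pedigree = {
    "S": (None, None),
    "D": (None, None),
    "A": ("S", "D"),
    "B": ("S", "D"),   # A and B are full siblings
}

ids = list(pedigree)            # founders listed before their offspring
idx = {a: i for i, a in enumerate(ids)}
n = len(ids)
A = [[0.0] * n for _ in range(n)]

def rel(parent, j):
    # additive relationship between a (possibly unknown) parent and j
    return A[idx[parent]][j] if parent is not None else 0.0

# Tabular method: fill A row by row in pedigree order
for i, animal in enumerate(ids):
    sire, dam = pedigree[animal]
    A[i][i] = 1.0 + 0.5 * (A[idx[sire]][idx[dam]]
                           if sire is not None and dam is not None else 0.0)
    for j in range(i):
        A[i][j] = A[j][i] = 0.5 * (rel(sire, j) + rel(dam, j))

def offspring_inbreeding(x, y):
    # F of a hypothetical offspring = half the parents' relationship
    return 0.5 * A[idx[x]][idx[y]]

MAX_F = 0.0625                  # example pruning threshold
f = offspring_inbreeding("A", "B")
# A full-sib mating gives F = 0.25, so this branch is pruned
print(f, "pruned" if f > MAX_F else "allowed")
```

Running this check inside retrieval means a forbidden crossing is pruned before the model ever sees it as a candidate.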

Challenge #2: Explainable AI as a requirement, not a feature

In most AI projects, explainability is a "nice to have". In BLUP-FLOCK, it is a certification requirement. Pedigree breeding programmes (Ross 308, Hubbard, Hy-Line, Lohmann) require full decision reproducibility — an auditor must be able to verify why rooster X was selected as a breeder and rooster Y was eliminated.

Every system decision contains:

  • which traits had the greatest influence on the recommendation and to what degree
  • comparison with the results of traditional index methods
  • model error risk for the given set of input data
  • full calculation history stored in Temporal Tables

Without this, the system would be a "black box" that neither the breeder can trust nor the certifier can accept.
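One way the decision payload described above could be structured is a typed record that an auditor can read field by field. The field names and values here are illustrative, not the production schema:

```python
from dataclasses import dataclass

@dataclass
class Recommendation:
    bird_id: str
    decision: str                 # "select" or "cull"
    trait_contributions: dict     # trait -> share of the final score
    index_method_score: float     # classical index result, for comparison
    model_error_risk: float       # e.g. a prediction standard error
    calculation_version: str      # key into the Temporal Tables history

rec = Recommendation(
    bird_id="R-4471",
    decision="select",
    trait_contributions={"fertility": 0.45, "daily_gain": 0.35,
                         "feed_conversion": 0.20},
    index_method_score=102.3,
    model_error_risk=0.08,
    calculation_version="2024-06-01T10:00:00Z",
)
# The contributions should account for the whole recommendation
assert abs(sum(rec.trait_contributions.values()) - 1.0) < 1e-9
print(rec.decision)
```

Because `calculation_version` points into the Temporal Tables history, the exact inputs behind any past decision can be replayed on demand.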

Challenge #3: migration without downtime

A pedigree breeding farm cannot "switch off the system for a week" to move to new software. The farm in Rszew already had two operational applications handling its day-to-day processes.

We designed a hybrid integration architecture:

  • existing applications continue to run unchanged as the source of truth for operational data
  • the new ERP system fetches data near-real-time via REST API and message queues
  • the AI module operates on a synchronised, central database
  • historical data migration runs in parallel, without interrupting production

This pattern — API Gateway + message queues + central database with Temporal Tables — is our recommendation for any organisation that wants to deploy AI on existing data without operational risk.
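The heart of that sync path is an idempotent upsert driven by events. The sketch below stands in for it with an in-process queue and a dict; in reality the events arrive over REST and a message broker, and the central store is MS SQL Server. Event shapes and keys are invented:

```python
from queue import Queue

central_db = {}          # stand-in for the central database

def handle_event(event):
    # Upsert: the legacy applications remain the source of truth,
    # so the latest event for a record always wins.
    central_db[event["record_id"]] = event["payload"]

events = Queue()
events.put({"record_id": "bird:R1", "payload": {"weight_g": 2450}})
events.put({"record_id": "bird:R1", "payload": {"weight_g": 2510}})  # newer

while not events.empty():
    handle_event(events.get())

print(central_db["bird:R1"]["weight_g"])
```

Because the handler overwrites rather than appends, replaying a batch of events after an outage converges to the same state, which is what makes the migration safe to run in parallel with production.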

Technology stack

MS SQL Server 2022

  • Temporal Tables (full data history, calculation reproducibility)
  • Columnstore Indexes (analytical query performance across tens of thousands of records)

Laravel 11 / PHP 8.3

  • AI pipeline backend
  • queues (Redis/Horizon) and JWT/Sanctum authorisation

Vue.js 3 + Inertia.js

  • frontend and visualisations (D3.js / Apache ECharts)

Redis 7 + RedisJSON

  • caching of intermediate BLUP calculation results — significant speed-up for repeated operations

Kubernetes + Docker

  • horizontal scalability, AI microservice isolation

MLOps pipeline

  • model versioning, data drift monitoring, automatic retraining

Measurable outcomes

  • analysis time for pedigree review and ranking preparation reduced from weeks to minutes
  • genetic progress increased by 3–8% per year vs. classical index methods
  • number of maintained breeders reduced by 10–25% through more precise selection
  • data errors eliminated through automatic real-time anomaly validation
  • full auditability for certification schemes (IFS Food, BRCGS, GlobalG.A.P.)

5 lessons from an AI deployment

1. Data matters more than the model

In BLUP-FLOCK, the biggest challenge was not choosing an algorithm — it was data architecture: Temporal Tables, Columnstore Indexes, a schema built for genetic relationships. A poor model on good data can be fixed. A good model on bad data simply does not work.

2. LangChain is not always the answer

We use LangChain for rapid prototyping and architecture testing. In production we run a custom pipeline with full control over the logic: in complex domains like animal genetics, ready-made frameworks impose abstractions that do not fit the domain.

3. Explainable AI is non-negotiable

More and more sectors — finance, healthcare, certified animal breeding — require auditability. Build explainability from day one, not as an afterthought on top of a finished model.

4. Zero-downtime migration is possible

The API Gateway + message queues + central database pattern allows AI to be deployed incrementally on existing systems, without operational risk. No client can afford a "big bang migration".

5. KPIs must be defined before the start

At Rszew we knew exactly how much analysis time had to be cut, what breeder-reduction threshold was the target, and which certifications the system had to support. Without those KPIs there is no way to measure the success of a deployment.

Summary

AI integrations in production are systems built on data, architecture, and quality control. The greatest value comes not from the model itself, but from how it is used in a business context — and from the precision of the data it works with.

Planning an AI deployment?

We build production-grade AI systems — from RAG-based chatbots and LLM pipelines to specialised analytical modules like BLUP-FLOCK. Every deployment starts with data analysis and architecture, not with choosing a model.

Tell us about your use case
