ecommerce

June 2026

Gree Distributor

Conversational AI & RAG Engine for a Legacy E-Commerce Platform

Wrapped a legacy PHP/OpenCart HVAC store in a standalone, retrieval-grounded AI sales assistant: 1,000+ catalog entities indexed, multi-unit system configuration, grounded quotes and order creation — 24/7, with a confidence-gated human fallback.

Duration

Ongoing engagement (build + continuous tuning)

Team Size

My Role

End-to-end: architecture, backend, ML/RAG pipeline, infrastructure & data engineering

Executive Summary

Our client operates a long-running, high-traffic online store built on a legacy PHP/OpenCart platform. The catalog is large and technical: hundreds of products with dozens of structured specifications each, multiple equipment families, accessories, and a body of static knowledge covering delivery, warranty, payment, and installation policies. The existing on-site chat was a manually staffed widget: every catalog question required a human operator, customers dropped off outside working hours, answers were inconsistent, and the highest-value conversations (multi-unit system configurations) were exactly the ones most likely to stall before a quote was produced.

The hard constraint was that we could not rebuild the store. The legacy platform is the system of record for products, pricing, stock, and orders, and had to remain untouched and authoritative. The AI layer had to wrap around it, not replace it.

We designed and built a standalone AI microservice — provisioned on its own dedicated server with its own deployment pipeline, observability, and scaling — that turns the legacy catalog into a conversational, retrieval-grounded assistant. Every answer is grounded in real catalog data with a calibrated confidence gate that escalates rather than hallucinates. The result is a 24/7 automated pre-sales consultant that understands the full catalog, configures multi-unit systems, generates quotes, and creates orders, with a graceful human-operator fallback.

Key Metrics

Average response time: -96% (before: 300 sec; after: 11 sec)

Lead-to-quote (commercial proposal) time: -93% (before: 30 min; after: 2 min)

Qualified leads / quotes issued: +27% (before: baseline; after: +27%)

Order placement time: -75% (before: 5 min; after: 1–1.5 min)

Off-target operator load (peak season): -48% (before: baseline; after: -48%)

Queries resolved without escalation: +160 (after: 160+)

Pre-sales availability: always-on (before: business hours, human-staffed; after: 24/7 automated)

Multi-unit configuration: fully automated (before: operator-only, manual; after: self-serve, end-to-end)
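
The percentage figures above follow from the before/after values by the usual relative-change formula. A minimal sketch for checking them is shown here; the helper name `percentChange` is illustrative and not part of the project.

```typescript
// Relative change from `before` to `after`, as a percentage rounded to the nearest integer.
// Hypothetical helper for checking the published figures; not taken from the project code.
function percentChange(before: number, after: number): number {
  if (before === 0) throw new Error("before must be non-zero");
  return Math.round(((after - before) / before) * 100);
}

const responseTime = percentChange(300, 11); // seconds
const leadToQuote = percentChange(30, 2);    // minutes
const orderBest = percentChange(5, 1);       // minutes, fastest reported case
const orderWorst = percentChange(5, 1.5);    // minutes, slowest reported case
```

With these inputs the response-time and lead-to-quote changes round to -96% and -93%. The order-placement range of 1 to 1.5 min corresponds to -80% and -70%, so the reported -75% is the midpoint of that range.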

The Challenges

Key obstacles that needed to be addressed

Cost & latency of a fully human-staffed chat

Operators answered the same catalog questions — fit for a room size, stock status, delivery cost, multi-room configuration — thousands of times over. Outside working hours customers simply waited or dropped off.

Business Impact

Expensive operator time spent on repetitive pre-sales, and lost conversions every hour the team was offline.

Inconsistent, error-prone answers

Responses varied by operator. Technical specifications and live availability were easy to get wrong when answered from memory rather than the source catalog.

Business Impact

Eroded customer trust and created quoting errors on a catalog where specs and stock change constantly.

High-value configurations stalled in the funnel

Complex multi-unit systems (one outdoor unit serving several rooms) were the most operator-dependent conversations — and the most likely to stall before a quote was ever produced.

Business Impact

The largest, most profitable orders were the ones most frequently lost to friction and slow turnaround.

A legacy core that could not be rebuilt

The PHP/OpenCart monolith is the authoritative system of record for products, pricing, stock, and orders. It had to remain untouched — the AI had to wrap around it, not replace it.

Business Impact

Any solution had to add intelligence with zero intrusion into a proven, revenue-critical platform.

Our Solutions

How we tackled the challenges and delivered results

A purpose-built inference server, provisioned from scratch

Rather than bolt an AI script onto the existing host, we provisioned a separate production server dedicated to inference, indexing, and the conversational API — isolating the AI workload from the storefront and keeping latency-sensitive operations off the legacy box.

Implementation

Fastify (Node.js) service under systemd, fronted by HTTPS, with health/liveness probes that verify the vector store, cache, and model providers on every check. Every external call (LLM, vector DB, reranker) is wrapped in a timeout plus exponential-backoff retry with jitter, so a hung upstream fails fast instead of stalling a customer. GitLab CI/CD deploys behind a post-deploy health gate, with an automated test suite (27+ tests) gating refactors.

Node.js, Fastify, systemd, HTTPS, GitLab CI/CD
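
The timeout plus exponential-backoff retry with jitter described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production wrapper: the names `RetryOptions`, `backoffDelayMs`, `withTimeout`, and `callWithRetry`, the option values, and the choice of "full jitter" are all hypothetical.

```typescript
// Sketch of a timeout + retry wrapper. All names and defaults are illustrative assumptions.
interface RetryOptions {
  attempts: number;    // total tries, including the first
  timeoutMs: number;   // per-attempt deadline
  baseDelayMs: number; // delay ceiling before the first retry
  maxDelayMs: number;  // cap on the delay ceiling
}

// "Full jitter": pick a random delay in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: RetryOptions,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}

// Reject if `task` has not settled within `ms`.
function withTimeout<T>(task: (signal: AbortSignal) => Promise<T>, ms: number): Promise<T> {
  const controller = new AbortController();
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      controller.abort(); // lets a cooperative task cancel its own I/O
      reject(new Error(`timed out after ${ms} ms`));
    }, ms);
    task(controller.signal)
      .then(resolve, reject)
      .finally(() => clearTimeout(timer));
  });
}

// Try the task up to `opts.attempts` times, sleeping a jittered delay between failures.
async function callWithRetry<T>(
  task: (signal: AbortSignal) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    try {
      return await withTimeout(task, opts.timeoutMs);
    } catch (err) {
      lastError = err;
      if (attempt < opts.attempts - 1) {
        const delay = backoffDelayMs(attempt, opts);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Randomizing the delay keeps many clients that failed at the same moment from retrying in lockstep against an upstream that is still recovering. The per-attempt deadline is what bounds the customer's wait when an upstream hangs.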

A multi-stage Retrieval-Augmented Generation pipeline

Each customer message flows through a multi-stage RAG pipeline engineered to ground every answer in real catalog data and to escalate rather than guess when confidence is low.

Implementation

Intent classification routes and short-circuits cheap cases; a fast model rewrites context-dependent messages ("and the 9k one?") into standalone catalog queries; hybrid dense+sparse retrieval against Pinecone pulls the top 50 of 1,000+ entities; Cohere neural reranking narrows 50→10 with a calibrated confidence gate that refuses thin-air answers; a frontier LLM composes the reply strictly from retrieved context with inline source attribution; Redis holds conversation memory and a TTL'd answer cache for fast follow-ups.

Pinecone, Cohere, OpenAI, Redis
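
The narrowing and confidence-gate step can be illustrated with a small sketch. It assumes the reranker returns a relevance score in [0, 1] per candidate; the thresholds, the `Reranked` and `GateResult` types, and the function name `confidenceGate` are hypothetical, and real thresholds would need calibration on labeled conversations.

```typescript
// A reranked candidate; `score` is assumed to be a relevance score in [0, 1].
interface Reranked {
  id: string;
  score: number;
}

type GateResult =
  | { kind: "answer"; context: Reranked[] }
  | { kind: "escalate"; reason: string };

// Hypothetical thresholds, not the project's calibrated values.
const MIN_TOP_SCORE = 0.5;  // best candidate must reach this, or we escalate
const MIN_KEEP_SCORE = 0.2; // weaker candidates are dropped from the context
const KEEP = 10;            // narrow the candidate list to at most 10

function confidenceGate(candidates: Reranked[]): GateResult {
  // Copy before sorting so the caller's array is not mutated.
  const ranked = [...candidates].sort((a, b) => b.score - a.score).slice(0, KEEP);
  const top = ranked[0];
  if (top === undefined || top.score < MIN_TOP_SCORE) {
    return { kind: "escalate", reason: "no sufficiently relevant catalog entry" };
  }
  // Pass only reasonably relevant entries on to the generation step.
  return { kind: "answer", context: ranked.filter((c) => c.score >= MIN_KEEP_SCORE) };
}
```

When the gate returns `escalate`, the generation step is skipped entirely, so the model never gets the chance to compose a reply from weak context.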

Bridging the legacy catalog into the vector store

A dedicated data-sync indexer continuously projects the legacy MySQL catalog — products, categories, and static knowledge pages — into the vector index, each enriched with structured retrieval metadata.

Implementation

Normalizes and cleans source content (stripping legacy CMS markup so the model reads clean prose), maps raw catalog fields into retrieval-ready metadata (price, stock, capacity, coverage area, inverter/refrigerant flags, model codes, archive status) used for filtering and ranking, and runs incrementally via content-hash change detection for near-real-time freshness, with full-rebuild capability for schema changes.

MySQL, PHP / OpenCart, Pinecone
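
Content-hash change detection, as used for the incremental runs above, can be sketched as follows. The `CatalogDoc` shape, the function names, the regex-based markup stripping, and the choice of SHA-256 are assumptions made for illustration; the actual indexer may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a catalog row after it is read from MySQL.
interface CatalogDoc {
  id: string;
  html: string; // raw description, possibly containing legacy CMS markup
}

// Naive tag stripping for illustration; a real indexer would likely use an HTML parser.
function toPlainText(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// Hex SHA-256 digest of the cleaned text.
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Returns the docs whose cleaned content differs from what was last indexed and
// records their new hashes in `seen` (id -> hash) so the next run skips them.
// Simplification: a real pipeline would store the hash only after the upsert succeeds.
function selectChanged(docs: CatalogDoc[], seen: Map<string, string>): CatalogDoc[] {
  const changed: CatalogDoc[] = [];
  for (const doc of docs) {
    const hash = contentHash(toPlainText(doc.html));
    if (seen.get(doc.id) !== hash) {
      changed.push(doc);
      seen.set(doc.id, hash);
    }
  }
  return changed;
}
```

Hashing the cleaned text rather than the raw row means a markup-only edit in the CMS does not trigger a re-embedding, while any change to the visible content does.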

Multi-unit system configurator

The standout capability: assembling multi-split systems where one outdoor unit serves several rooms — historically the hardest, most operator-dependent conversation in the funnel.

Implementation

Ingested the full technical catalog — outdoor-unit capacities, allowed indoor-unit combinations, and compatibility tables — so the assistant can take "three rooms, one outdoor unit, no ceiling mount in two of them" and return a complete, validated configuration: the right outdoor unit and correct indoor units by mounting type (wall, cassette, ducted, console, floor/ceiling), each with a real price and availability.

OpenAI, Pinecone, Node.js
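
To make the validation step concrete, here is a simplified sketch of assembling a multi-split configuration. It models compatibility as a maximum indoor-unit count and a maximum total capacity per outdoor unit, which is an assumption; real manufacturer compatibility tables list allowed combinations explicitly. All types, field names, and the cheapest-first selection rule are hypothetical.

```typescript
type Mount = "wall" | "cassette" | "ducted" | "console" | "floor/ceiling";

interface IndoorUnit { sku: string; mount: Mount; kbtu: number; price: number; inStock: boolean }
interface OutdoorUnit { sku: string; maxIndoor: number; maxTotalKbtu: number; price: number; inStock: boolean }
interface Room { name: string; minKbtu: number; excludedMounts: Mount[] }
interface Configuration {
  outdoor: OutdoorUnit;
  indoor: { room: string; unit: IndoorUnit }[];
  total: number;
}

// Cheapest in-stock indoor unit that covers the room and respects its mounting exclusions.
function pickIndoor(room: Room, units: IndoorUnit[]): IndoorUnit | undefined {
  return units
    .filter((u) => u.inStock && u.kbtu >= room.minKbtu && !room.excludedMounts.includes(u.mount))
    .sort((a, b) => a.price - b.price)[0];
}

// Returns a complete configuration, or undefined when no valid one exists
// (the caller would then escalate to an operator instead of guessing).
function configure(
  rooms: Room[],
  indoorUnits: IndoorUnit[],
  outdoorUnits: OutdoorUnit[],
): Configuration | undefined {
  const indoor: { room: string; unit: IndoorUnit }[] = [];
  for (const room of rooms) {
    const unit = pickIndoor(room, indoorUnits);
    if (!unit) return undefined;
    indoor.push({ room: room.name, unit });
  }
  const totalKbtu = indoor.reduce((sum, i) => sum + i.unit.kbtu, 0);
  // Cheapest in-stock outdoor unit that supports this many indoor units and this total load.
  const outdoor = outdoorUnits
    .filter((o) => o.inStock && o.maxIndoor >= rooms.length && o.maxTotalKbtu >= totalKbtu)
    .sort((a, b) => a.price - b.price)[0];
  if (!outdoor) return undefined;
  const total = outdoor.price + indoor.reduce((sum, i) => sum + i.unit.price, 0);
  return { outdoor, indoor, total };
}
```

Keeping this check in deterministic code rather than in the LLM prompt means a returned configuration always satisfies the encoded constraints, and the model only has to translate the customer's wording into `Room` requirements.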

Order creation & automated quote generation

The assistant turns conversations directly into transactions — assembling the cart, capturing customer details, generating formatted commercial proposals for complex multi-room projects, and logging the order without a manager ever touching it.

Implementation

For quotes, the bot forms an itemized commercial proposal via API — units, capacities, pricing, totals — and delivers it to the customer right in the chat, compressing a multi-day, manager-dependent cycle into minutes. For orders, the customer answers a couple of questions and the bot pushes a structured request into the store's order system through controlled write-backs (never modifying the core), notifying both managers and the customer. To avoid errors it deliberately does not take payment — the order is recorded as a request, with payment collected after confirmation.

Node.js, PHP / OpenCart, JivoChat
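
An itemized proposal of the kind described above could be assembled along these lines. This is a sketch only: the `QuoteLine` and `Quote` shapes, the function names, and the use of integer minor currency units are assumptions, and no currency symbol is shown because the source does not name one.

```typescript
interface QuoteLine {
  sku: string;
  description: string;
  quantity: number;
  unitPriceMinor: number; // price in minor currency units (assumed), to avoid float rounding
}

interface Quote {
  lines: (QuoteLine & { lineTotalMinor: number })[];
  totalMinor: number;
}

// Compute per-line totals and the grand total; reject malformed quantities.
function buildQuote(lines: QuoteLine[]): Quote {
  const priced = lines.map((line) => {
    if (!Number.isInteger(line.quantity) || line.quantity <= 0) {
      throw new Error(`invalid quantity for ${line.sku}`);
    }
    return { ...line, lineTotalMinor: line.quantity * line.unitPriceMinor };
  });
  const totalMinor = priced.reduce((sum, line) => sum + line.lineTotalMinor, 0);
  return { lines: priced, totalMinor };
}

// Render the quote as plain text suitable for a chat message.
function formatQuote(quote: Quote): string {
  const money = (minor: number): string => (minor / 100).toFixed(2);
  const rows = quote.lines.map(
    (l) => `${l.quantity} x ${l.description} (${l.sku}): ${money(l.lineTotalMinor)}`,
  );
  return [...rows, `Total: ${money(quote.totalMinor)}`].join("\n");
}
```

Totals are computed in code from catalog prices rather than generated by the model, which keeps the arithmetic in the proposal exact.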

All solutions successfully implemented and deployed

Results & Impact

Measurable outcomes achieved through our solutions

Availability

24/7 automated pre-sales across the full catalog

Instant, consistent, source-grounded answers with no operator in the loop for routine inquiries — at any hour.

Sales

The hardest conversation, automated end-to-end

Multi-unit system configuration — historically the most operator-intensive, highest-value flow — is now self-serve from recommendation to quote.

Conversion

From chat directly to order

Order creation and commercial-proposal generation turn conversations into transactions instead of handing off to manual entry.

Operations

Operator load reduced

Routine catalog, spec, availability, and logistics questions are deflected from the human team, freeing operators for genuinely complex cases.

Reliability

Hallucination-resistant by design

A calibrated confidence gate plus strictly grounded generation means the assistant escalates rather than fabricates a price or availability, and a total upstream outage degrades to a friendly handoff, never an HTTP 500 error.
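
The outage behavior can be pictured as a thin wrapper around the whole pipeline. The sketch below is an assumption about the shape of such a wrapper, not the project's code; `Reply`, `HANDOFF_TEXT`, and `answerOrHandoff` are hypothetical names.

```typescript
type Reply =
  | { kind: "answer"; text: string }
  | { kind: "handoff"; text: string };

// Illustrative wording; the real handoff message is not given in the source.
const HANDOFF_TEXT = "I can't answer that reliably right now, so I'm connecting you with an operator.";

// Wraps the pipeline so that any failure (timeout, provider outage, bug) becomes
// a handoff reply instead of an unhandled error that would surface as an HTTP 500.
async function answerOrHandoff(
  message: string,
  pipeline: (message: string) => Promise<string>,
): Promise<Reply> {
  try {
    return { kind: "answer", text: await pipeline(message) };
  } catch {
    return { kind: "handoff", text: HANDOFF_TEXT };
  }
}
```

A route handler that always returns the result of `answerOrHandoff` has a well-formed reply to send even when every upstream provider is down.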

Project delivered on time and exceeded expectations

Technology Stack

Tools and technologies used to build this solution

Backend

Node.js, Fastify

AI / Retrieval

Pinecone (hybrid dense + sparse), Cohere reranking, OpenAI LLMs & embeddings

Database

MySQL, Redis

Infrastructure

Dedicated Linux server, systemd, HTTPS

DevOps

GitLab CI/CD, Automated health checks, Test suite (27+ tests)

Tools

PHP / OpenCart, JivoChat widget

All technologies were carefully selected to ensure optimal performance, scalability, and maintainability

rag, ai, llm, conversational-ai, vector-search, pinecone, cohere, openai, nodejs, fastify, ecommerce, opencart, redis, hvac
