AI Engineering index
LLM systems, AI system architecture, MLOps, evaluation, deployment, and production AI infrastructure.
A crawlable index of practical notes by Yangming Li, covering LLM systems, AI system architecture, NLP, MLOps, statistical learning, experimentation, and product thinking. The dedicated AI Engineering column collects the posts closest to production AI systems.
How to evaluate retrieval quality, generation faithfulness, citations, no-answer behavior, and production reliability.
Build reusable Copilot agent cases, rubrics, schema checks, regression gates, and release decisions.
Why production AI agents need custom eval sets, trajectory checks, calibrated judges, regression tests, and business-ready metrics.
Schema-first evaluation, custom graders, release gates, human review, and monitoring for reliable Copilot Studio agents.
How connected AI tools use external data, tools, and services.
A technical guide to agent architecture, system integration, and production constraints.
Why high-accuracy churn prediction can improve renewal rate while losing revenue, and how uplift thinking fixes the target.
A practical guide to CATE, meta-learners, Qini, AUUC, decile lift, online experiments, and incremental ROI.
How product teams connect experiments, observational data, ATE, CATE, uplift, guardrails, and decisions.
Reliability diagrams, expected calibration error, confidence bins, decision thresholds, and monitoring checks.
A layered guide with retrieval metrics, a synthetic golden dataset, framework comparison, and monitoring checks.
A practical essay on moving agent development from demo-driven iteration to evaluation-driven production readiness.
Moving a Copilot agent from a demo to a reviewed, monitored, production-ready AI system.
Study notes on foundational and advanced natural language processing topics.
An ensemble learning guide with practical machine learning context.
A compact review of tensors, shapes, broadcasting, reshaping, and distributions.
Notes on transparency, fairness, privacy, robustness, and accountability.
Model management and deployment foundations for production ML work.
Delta Lake, MLflow, Unity Catalog, and data platform implementation notes.
Using Docker for reproducible data science and ML environments.
Container orchestration concepts and implementation patterns.
A subscription renewal example showing why coupon campaigns need incremental ROI, holdouts, and uplift-based targeting.
How to move from average A/B test lift to targeted incremental impact for coupons, campaigns, churn, and growth.
Power analysis, minimum detectable effect, guardrails, and sample size planning for product experiments.
When to use experiments, observational causal methods, uplift, and guardrails for product decisions.
A practical framework for defining and validating product value.
Notes on user needs, stickiness, and product experience.
Feature gates, experiments, and reliable launch measurement.
How issue tracking and delivery systems support technical execution.