Case-Study: Performance Modelling of AI Inference on GPUs May 15, 2023 How TechnoLynx modelled AI inference performance across GPU architectures, delivering two tools (topology-level performance predictor and OpenCL GPU…
Case Study - Embedded Video Coding on GPU (Under NDA) Apr 15, 2020 TechnoLynx built a CUDA-based H.264 encoder on a Jetson Nano-class embedded GPU for an automotive edge startup, targeting ≤5% CPU usage across 4+…
How to Improve GPU Performance: A Profiling-First Approach to Compute Optimization May 5, 2026 Profiling must precede GPU optimisation. Memory bandwidth fixes typically deliver 2-5x more impact than compute-bound fixes for AI workloads.
Inference Benchmarking Examples: Cost-Per-Request Comparisons That Actually Decide Jun 12, 2026 How to benchmark LLM inference serving configs on cost-per-request and p95 latency, not tokens-per-second, so the comparison maps to margin.
Latency Testing for AI Inference: A Methodology Beyond Best-Case Numbers May 13, 2026 How to design a latency-testing protocol that exposes batch, concurrency, and tail-percentile behavior under realistic AI inference load.