Praxis: An AI trading evaluation workspace

OVERVIEW

Overview

Praxis is a personal project built on top of Freqtrade, an open-source quant-trading framework used for strategy backtesting and paper trading. It is designed as a front-end interface layer that consolidates and accelerates the strategy-evaluation process.

In practice, AI can generate many strategy variants at high speed, sharply lowering the cost of generation. Understanding and judging their quality, however, still has to be done by hand, one strategy at a time: reading complex backtest data, comparing how different strategies behave in the market, and assessing whether a strategy has the potential to move to the next stage.

As the number of strategies grows, the cognitive load and time cost of evaluation grow with it.

Praxis is positioned to consolidate these scattered data outputs into one unified workspace, so strategies can be understood, compared, and validated in the shortest possible time, markedly improving decision efficiency before a strategy goes into a dry run (paper trading).

A short Praxis walkthrough: efficiently consolidating strategy code and backtest results into a single evaluation interface.

Role: Product Designer
Time: 2 weeks
Team: Solo project
Impact:
  • Evaluation cycle: ~3d to ~1.5d
  • Exploration density: ~2×
  • A single source of truth

DISCOVER

AI sent strategy generation soaring, but the evaluation that follows became the bottleneck.

This project comes from a pain point I felt firsthand while developing quant trading strategies day to day.

With AI in the loop, writing code and generating strategies became extremely efficient, spinning off large numbers of different variants in very little time.

Yet the review workflow after generation did not change at all. Every strategy variant still takes time to read, understand, and evaluate for stability, one at a time.

As the pile of strategies waiting for review grows, reading and filtering them becomes a serious bottleneck.

Core issue

The bottleneck in the process sits in reading and validating strategies, not in the earlier code generation.

The existing workflow leans heavily on several fragmented, disconnected sources:

  • backtest logs in the terminal
  • standalone JSON strategy-parameter config files
  • strategy source code scattered across different versions

In the past, all of this scattered information had to be assembled by hand just to make a single complete strategy decision.
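
As a minimal sketch of what this consolidation could look like, the Python below bundles the three kinds of artifacts into one record. It is an illustration, not Praxis's actual implementation: the `RunRecord` type, the `load_run_record` function, and the file layout are hypothetical, and it assumes the backtest summary has already been exported to a JSON file rather than read from terminal output.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RunRecord:
    """One strategy decision's worth of evidence, gathered in one place."""
    strategy_name: str
    source_code: str  # text of the strategy file
    params: dict      # contents of the JSON parameter file
    metrics: dict     # contents of the exported backtest summary (assumed JSON)


def load_run_record(strategy_file: Path, params_file: Path, metrics_file: Path) -> RunRecord:
    """Read the three scattered artifacts and bundle them into a single record."""
    return RunRecord(
        strategy_name=strategy_file.stem,
        source_code=strategy_file.read_text(encoding="utf-8"),
        params=json.loads(params_file.read_text(encoding="utf-8")),
        metrics=json.loads(metrics_file.read_text(encoding="utf-8")),
    )
```

With a record like this, a card or comparison view can be driven from one object instead of three separate lookups.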

DEFINE

Defining the core problem of the evaluation layer

Goal

Build a system dedicated to supporting strategy understanding and validation, so the huge volume of AI-generated strategies can be compared and filtered within a very short cycle.

Only by reading and stress-testing strategies for robustness in a very short time can the generative potential of AI truly be unlocked, keeping the whole strategy-development cycle agile.

Constraints and boundaries

  • The system has to run seamlessly inside the local Freqtrade environment.
  • Because LLMs cannot reliably hold long context, the mapping between strategies and their backtest results has to be persisted at the interface layer.
  • Information transparency and traceability are the foundation of the design.
  • With only a two-week timeline, the feature scope had to be tightly narrowed to an MVP (minimum viable product).

Diagram of the legacy Freqtrade workflow: tasks from Build Strategy through Backtest, Walk-forward, Dry Run, and Live Run, sitting on top of fragmented IDE, CLI, and run-log tools fed by an AI coding agent.
The legacy workflow: switching back and forth between Python code, JSON configs, and CLI terminal tools. Even with AI help, the results stay scattered and very hard to trace across versions.
Diagram of the proposed Praxis workflow: a unified front-end dashboard layered over the existing tools, bringing strategy development, testing, and monitoring into one place.
The improved direction: without changing the underlying toolchain, build a unified front-end console that brings strategy evaluation, testing, and real-time monitoring into a single view.
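
The constraint of persisting the strategy-to-result mapping outside the LLM's context can be sketched roughly as follows. This is one possible approach under stated assumptions, not the method Praxis uses: the function names, the JSON index file, and the choice of a truncated SHA-256 content hash as the strategy key are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def strategy_fingerprint(source_code: str) -> str:
    """Short content hash, so different versions of a strategy get different keys."""
    return hashlib.sha256(source_code.encode("utf-8")).hexdigest()[:12]


def link_result(index_file: Path, source_code: str, result_file: str) -> str:
    """Record that result_file was produced by this exact strategy source."""
    # Load the persisted index if it exists; otherwise start an empty one.
    if index_file.exists():
        index = json.loads(index_file.read_text(encoding="utf-8"))
    else:
        index = {}
    key = strategy_fingerprint(source_code)
    results = index.setdefault(key, [])
    if result_file not in results:  # linking the same result twice is a no-op
        results.append(result_file)
    index_file.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return key
```

Keying on the source text rather than the file name means that an edited strategy gets a new entry, so older backtest results stay attached to the version that produced them.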

DEVELOP

When filtering strategies, which pieces of information actually determine the quality of the decision?

Color system and accessibility

The interface design starts from a rigorous design-token color system: a global canvas color, trading-semantic colors, and a version-specific color for each strategy. Every level is strictly contrast-checked, so the experience stays fully consistent when switching between dark and light mode.

The Praxis color system in dark and light mode: canvas, semantic, and per-strategy colors, each labeled with its hex value and contrast ratio.
A high-contrast system of semantic and version-specific colors that stays visually consistent across dark and light modes.
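
A contrast check of this kind can be automated. The sketch below assumes the WCAG 2.x contrast-ratio formula and its 4.5:1 AA threshold for normal text. The case study does not state which standard or tooling was used, and the function names are hypothetical.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, threshold: float = 4.5) -> bool:
    """True if the pair meets the assumed AA threshold for normal-size text."""
    return contrast_ratio(foreground, background) >= threshold
```

Running every token pair through `passes_aa` in both the dark and light palettes is one way to keep the two modes in step as colors change.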

1. A centralized evaluation workspace

All the scattered strategy outputs are pulled into one workspace. The system automatically parses and consolidates the fragmented data, so the user can read and compare across versions seamlessly within a single window.

The Praxis home view: backtest runs presented as a grid of cards, each with key metrics and an equity-curve sparkline.
The home view turns each backtest run into a quickly scannable card grid, replacing the inefficient cross-referencing between logs and files.

2. Three core views focused on evaluation

Following the review funnel of quant strategies, the system is split into three core interfaces:

  • Comparison View: for comparing how different strategy versions behave, side by side.
  • Strategy Dashboard: for an overall view of every strategy currently running.
  • Robustness Analysis: for digging deep into a single strategy's resilience to risk and overfitting.

Trade-off: a fixed three-view structure was chosen on purpose, rather than a highly customizable layout that risks becoming chaotic, to keep the barrier to understanding and getting started as low as possible.

3. A professional interface that fits the quant-trading context

The interface style takes its cues from professional trading terminals like Bloomberg and TradingView:

  • a high information-density layout
  • a strict, rigorous visual hierarchy
  • monospace typography throughout for numbers, so the data stays aligned and readable

Trade-off: drawing on a visual language users already know from the industry lowers the learning cost for newcomers and, almost subconsciously, builds trust in the tool's data.

4. Deliberately excluding the FreqAI module

FreqAI is the module in the Freqtrade framework used to run machine-learning predictions.

Its machine-learning output, however, is largely a black box and hard to trace or explain. That runs directly against this evaluation system's core emphasis on traceability and interpretability, so FreqAI was cut from the MVP.

The Praxis monitor view: each strategy's entry and exit conditions surfaced as explicit, color-coded logic blocks.
The monitor view shows each strategy's entry and exit signal logic explicitly, offering the decision transparency that a black-box machine-learning model cannot.

5. Narrowing the scope of custom strategies

Some highly customized strategies embed extremely complex nested logic and parameters, well beyond the scope of a two-week MVP. So the system supports standardized Freqtrade strategy structures first, focusing on validating how smoothly the core evaluation flow runs.

Trade-off: within a limited timeline, deliver a usable, end-to-end working system first, rather than chasing full coverage of every edge case.

The Praxis backtest detail view: a standardized parameter sidebar on the left, with an equity curve and trade-list results on the right.
Standardized parameters and hyperparameters can be tuned right from the sidebar; deeply nested custom code logic still lives in the source code.

DELIVER

Freeing effort from data wrangling and turning it toward real strategy decisions.

Praxis ships as a local toolkit that slots cleanly into the existing Freqtrade development flow, automating the work of organizing strategy outputs and backtest results.

That lets a developer's focus shift from the low-value work of manually assembling data to the genuinely high-value work of deeply optimizing and judging strategies.

The Praxis robustness analysis view: an equity-curve percentile chart above a consecutive-loss-streak chart and a Monte Carlo drawdown distribution.
The robustness-analysis panel in the finished UI: an equity-curve percentile chart, the frequency of maximum consecutive losses, and a Monte Carlo drawdown distribution, so decisions rest strictly on trustworthy statistics rather than on an AI's qualitative description.
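
To make the statistics behind this panel concrete, here is a minimal sketch of a maximum-drawdown calculation, a loss-streak count, and a Monte Carlo drawdown distribution. The case study does not say how Praxis runs its simulation. Reshuffling the order of the observed per-trade returns is one common approach and is assumed here, and all function names are hypothetical.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough drop of the compounded equity curve, as a fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def max_loss_streak(returns):
    """Longest run of consecutive losing trades."""
    longest = current = 0
    for r in returns:
        current = current + 1 if r < 0 else 0
        longest = max(longest, current)
    return longest


def monte_carlo_drawdowns(returns, n_paths=1000, seed=0):
    """Shuffle the trade order n_paths times; return the sorted max drawdowns."""
    rng = random.Random(seed)  # fixed seed keeps the chart reproducible
    trades = list(returns)     # copy, so the caller's list is not reordered
    drawdowns = []
    for _ in range(n_paths):
        rng.shuffle(trades)
        drawdowns.append(max_drawdown(trades))
    return sorted(drawdowns)
```

Because the result is sorted, a percentile is a direct index lookup: for example, `dd[int(0.95 * len(dd))]` approximates the 95th-percentile drawdown for a list `dd` returned by `monte_carlo_drawdowns`.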

Quantified results

50%

Faster decisions

The combined evaluation and QA time for a single strategy dropped from about 3 days to about 1.5 days.

~2×

Strategy exploration density

In the same amount of time, the number of strategies that can be validated and filtered in parallel roughly doubled.

1

Single source of truth

A single, consistent strategy-review dashboard, ending the old mess of cross-referencing across files and logs.

The finished Praxis home view shown side by side in dark and light mode.
The final, delivered high-fidelity interface, adapting cleanly to both dark and light working modes.

REFLECTION

Reflection

Once AI compresses the cost of code generation toward zero, the real human work and core advantage shift quickly to understanding, quality-checking, judgment, and decision-making.

For this evaluation layer in an AI-assisted trading workflow, two directions are worth digging into next:

Higher-resolution time-series data handling

Improving how the underlying time-series data is parsed, to support finer-grained, tick-by-tick analysis of micro-level behavior.

Low-code tooling for building strategies

Turning the strong factors found in evaluation into tooling, with a more intuitive visual node-based editor for assembling strategies and tuning hyperparameters.

The product opportunity ahead is making the downstream work of robustness filtering as intuitive, smooth, and near-zero-cost as the upstream AI generation already is.