World Models · Reinforcement Learning
Research prototype
GitHub →
Geometry Dash World-Model Agent (DreamerV3-style)
A DreamerV3-style agent for a 60 Hz physics-driven game environment with tight failure constraints,
built with a custom Gymnasium stack, Windows↔WSL synchronization, and high-frequency logging for reproducible evaluation.
JAX
DreamerV3-style
Gymnasium
Windows↔WSL bridge
High-frequency logger
Environment
Custom Gymnasium env + reproducible evaluation harness.
Systems
Windows↔WSL bridge to sync observations/state and actions.
Debugging
High-frequency trajectories for offline analysis and sanity checks.
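As an illustration of the bridge idea above, here is a minimal sketch of a lockstep observation/action exchange using length-prefixed JSON messages. It is not the project's actual protocol: the message fields (`frame`, `y`, `dead`, `jump`), the helper names, and the choice of JSON are all hypothetical, and a local `socket.socketpair()` stands in for the real Windows-to-WSL connection.

```python
import json
import socket
import struct

# Hypothetical wire format: 4-byte big-endian length, then a UTF-8 JSON payload.
HEADER = struct.Struct("!I")


def send_msg(sock, obj):
    """Serialize obj as JSON and send it with a length prefix."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


def demo(num_frames=3):
    """Run a lockstep loop: one observation out, one action back, per frame."""
    game_side, agent_side = socket.socketpair()  # stand-in for the real link
    log = []
    try:
        for frame in range(num_frames):
            # Game side publishes the state for this frame (made-up fields).
            send_msg(game_side, {"frame": frame, "y": 0.5 * frame, "dead": False})
            obs = recv_msg(agent_side)
            # Agent side answers with an action tagged with the same frame id.
            send_msg(agent_side, {"frame": obs["frame"], "jump": obs["frame"] % 2 == 1})
            action = recv_msg(game_side)
            # The frame id lets either side detect a desynchronized step.
            if action["frame"] != frame:
                raise RuntimeError("observation/action out of sync")
            log.append((obs, action))
    finally:
        game_side.close()
        agent_side.close()
    return log


if __name__ == "__main__":
    for obs, action in demo():
        print(obs, action)
```

Tagging every message with a frame id is one simple way to keep the two processes in step, and the per-frame `(obs, action)` log doubles as a trajectory record for offline checks.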
Optimization · Topological Data Analysis
GitHub →
TopoAdamW: TDA-Guided Meta-Optimizer
A PyTorch optimizer that uses GUDHI-based TDA features to probe local loss-landscape geometry
(e.g., sharp vs. flat regions) and adapt update behavior, with stability safeguards built in.
PyTorch
GUDHI
Loss landscape
Reproducible eval
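To make the idea of a topological probe concrete, the sketch below computes 0-dimensional sublevel-set persistence of loss values sampled along a line through the current parameters, then shrinks a step size when the profile is rough. This is an assumption-laden illustration, not the TopoAdamW algorithm: it uses a small pure-Python union-find instead of GUDHI, and the `scaled_lr` rule and its constant `k` are hypothetical.

```python
def sublevel_persistence_1d(values):
    """Finite (birth, death) pairs of 0-dim sublevel-set persistence.

    Each local minimum of the sampled profile starts a component; when two
    components meet, the one with the higher minimum dies (elder rule).
    The component of the global minimum never dies and is not reported.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    parent = [None] * n  # None means the sample has not entered the filtration
    birth = {}           # component root -> value at which it was born

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if values[i] > birth[young]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    return pairs


def total_persistence(values):
    """Sum of lifetimes; zero for a profile with a single basin."""
    return sum(death - birth for birth, death in sublevel_persistence_1d(values))


def scaled_lr(base_lr, roughness, k=1.0):
    """Hypothetical rule: take smaller steps where the probe looks rougher."""
    return base_lr / (1.0 + k * roughness)


if __name__ == "__main__":
    smooth = [4.0, 1.0, 0.0, 1.0, 4.0]  # single basin
    rough = [2.0, 0.0, 3.0, 1.0, 4.0]   # second basin separated by a barrier
    for name, profile in (("smooth", smooth), ("rough", rough)):
        tp = total_persistence(profile)
        print(name, sublevel_persistence_1d(profile), scaled_lr(1e-3, tp))
```

Total persistence here measures how many competing basins the probe crosses and how deep they are, which is only one possible proxy for sharp versus flat regions. In a PyTorch setting the profile would come from evaluating the loss at a few offsets along a probe direction.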
LLM Systems
Efficient Fine-Tuning (Dream-7B, GPT-OSS-20B)
Memory-efficient fine-tuning pipelines using QLoRA (4-bit), gradient checkpointing, and DeepSpeed,
targeting both single-GPU (16 GB) and multi-GPU setups with stable, reproducible evaluation.
DeepSpeed
QLoRA
4-bit
Benchmarking
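A rough back-of-the-envelope estimate shows why 4-bit quantization matters for a 16 GB card. The sketch below uses only the nominal parameter counts implied by the model names (7B and 20B). The LoRA configuration (32 layers, four 4096×4096 projections per layer, rank 16) and the 16 bytes per trainable parameter are illustrative assumptions, not the pipeline's settings, and activations, quantization constants, and framework overhead are ignored.

```python
def weight_memory_gb(n_params, bits):
    """Memory for frozen weights stored at the given bit width, in GB (1e9 bytes)."""
    return n_params * bits / 8 / 1e9


def lora_params(d_in, d_out, rank):
    """Trainable parameters of one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def adapter_memory_gb(n_trainable, bytes_per_param=16):
    """Training state for the adapters.

    Assumed 16 bytes per parameter: bf16 weight (2) + bf16 gradient (2)
    + fp32 master copy (4) + two fp32 Adam moments (8).
    """
    return n_trainable * bytes_per_param / 1e9


if __name__ == "__main__":
    # Hypothetical adapter layout: 32 layers x 4 projections of 4096 x 4096, rank 16.
    trainable = 32 * 4 * lora_params(4096, 4096, 16)
    for name, n_params in (("7B", 7e9), ("20B", 20e9)):
        print(
            name,
            "bf16 weights:", weight_memory_gb(n_params, 16), "GB;",
            "4-bit weights:", weight_memory_gb(n_params, 4), "GB",
        )
    print("adapter training state:", round(adapter_memory_gb(trainable), 3), "GB")
```

Under these assumptions, 7B weights drop from 14 GB in bf16 to 3.5 GB in 4-bit, and 20B weights to 10 GB. That leaves headroom on a 16 GB GPU mainly for activations, which is where gradient checkpointing helps.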