I'm a computer vision learner and builder, currently focusing on scene understanding for real-world camera systems. I care about model performance, and just as much about whether a vision system works reliably in messy, practical environments.
I have hands-on experience with core vision tasks such as object detection, instance segmentation, visual perception, and scene understanding. I also follow emerging vision-language models (VLM), vision-language-action models (VLA), and end-to-end vision models as a future direction, especially how they connect perception, reasoning, and interaction. I enjoy turning research ideas into usable prototypes and tools.
- Visual AI systems for cameras and surveillance scenarios
- Object detection, segmentation, visual perception, and scene reasoning
- CNN / Transformer / vision foundation model applications
- PyTorch training & inference, OpenCV pipelines, and desktop AI tools
- Building clean demos, evaluation scripts, and practical engineering workflows
- Following VLM / VLA and end-to-end multimodal models as an emerging direction
- Windows-Face-Hello — RGB webcam face unlock experiment for Windows, combining face recognition, liveness detection, and system integration.
- Owen-Studio — My technical blog for computer vision and deep learning notes.
- VIT — Vision model experiments covering classification, detection, segmentation, and training workflows.
- lane-vehicle-counter — Lane-level vehicle detection and counting based on OpenCV.
- VLM-Workbench — A desktop prototype for exploring structured visual understanding with VLMs.
Thanks for visiting — feel free to explore my projects or reach me by email. 🚀